While the latent entropy plugin mostly doesn't derive entropy from
get_random_const() for measuring the call graph, when __latent_entropy is
applied to a constant, then it's initialized statically to output from
get_random_const(). In that case, this data is derived from a 64-bit
seed, which means a buffer of 512 bits doesn't really have that amount
of compile-time entropy.
This patch fixes that shortcoming by just buffering chunks of
/dev/urandom output and doling it out as requested.
Fixes: 38addce8b600 ("gcc-plugins: Add latent_entropy plugin")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
I'm not super familiar with this plugin or its conventions, so pointers
would be most welcome if something here looks amiss. The decision to
buffer 2k at a time is pretty arbitrary too; I haven't measured usage.
scripts/gcc-plugins/latent_entropy_plugin.c | 34 +++++++++------------
1 file changed, 15 insertions(+), 19 deletions(-)
diff --git a/scripts/gcc-plugins/latent_entropy_plugin.c b/scripts/gcc-plugins/latent_entropy_plugin.c
index 589454bce930..f238ba6726b8 100644
--- a/scripts/gcc-plugins/latent_entropy_plugin.c
+++ b/scripts/gcc-plugins/latent_entropy_plugin.c
@@ -82,29 +82,27 @@ __visible int plugin_is_GPL_compatible;
static GTY(()) tree latent_entropy_decl;
static struct plugin_info latent_entropy_plugin_info = {
- .version = "201606141920vanilla",
+ .version = "202203311920vanilla",
.help = "disable\tturn off latent entropy instrumentation\n",
};
-static unsigned HOST_WIDE_INT seed;
-/*
- * get_random_seed() (this is a GCC function) generates the seed.
- * This is a simple random generator without any cryptographic security because
- * the entropy doesn't come from here.
- */
+static unsigned HOST_WIDE_INT rnd_buf[256];
+static size_t rnd_idx = ARRAY_SIZE(rnd_buf);
+static int urandom_fd = -1;
+
static unsigned HOST_WIDE_INT get_random_const(void)
{
- unsigned int i;
- unsigned HOST_WIDE_INT ret = 0;
-
- for (i = 0; i < 8 * sizeof(ret); i++) {
- ret = (ret << 1) | (seed & 1);
- seed >>= 1;
- if (ret & 1)
- seed ^= 0xD800000000000000ULL;
+ if (urandom_fd < 0) {
+ urandom_fd = open("/dev/urandom", O_RDONLY);
+ if (urandom_fd < 0)
+ abort();
}
-
- return ret;
+ if (rnd_idx >= ARRAY_SIZE(rnd_buf)) {
+ if (read(urandom_fd, rnd_buf, sizeof(rnd_buf)) != sizeof(rnd_buf))
+ abort();
+ rnd_idx = 0;
+ }
+ return rnd_buf[rnd_idx++];
}
static tree tree_get_random_const(tree type)
@@ -537,8 +535,6 @@ static void latent_entropy_start_unit(void *gcc_data __unused,
tree type, id;
int quals;
- seed = get_random_seed(false);
-
if (in_lto_p)
return;
--
2.35.1
--
Dear E-mail User,
I am Nigel Higgins, Group Chairman of Barclays Bank Plc, London UK. I
would like to inform you that the British Overseas NGOs for
Development (commonly called Bond) has instructed our Bank, Barclays
Bank PLC through my desk to issue a Bank Draft for $1.5 Million USD
only in your name. The draft has be raised for the same amount in your
name. A copy of the draft will be sent to you for your perusal as soon
as you contact me with your current address and direct telephone
number.
The money is a humanitarian service for those less privileged around
your locality due to the Global pandemic and the crises in Asian
Countries.
Looking forward to hearing from you,
Yours sincerely,
Nigel Higgins, (Group Chairman),
Barclays Bank Plc,
Registered number: 1026167,
1 Churchill Place, London, ENG E14 5HP,
SWIFT Code: BARCGB21,
Direct Telephone: +44 770 000 8965,
www.barclays.co.uk
Print two possible reasons /sys/kernel/debug/gup_test
cannot be opened to help users of this test diagnose
failures.
Signed-off-by: Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
Cc: stable(a)vger.kernel.org # 5.15+
---
tools/testing/selftests/vm/gup_test.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/vm/gup_test.c b/tools/testing/selftests/vm/gup_test.c
index fe043f67798b0..c496bcefa7a0e 100644
--- a/tools/testing/selftests/vm/gup_test.c
+++ b/tools/testing/selftests/vm/gup_test.c
@@ -205,7 +205,9 @@ int main(int argc, char **argv)
gup_fd = open("/sys/kernel/debug/gup_test", O_RDWR);
if (gup_fd == -1) {
- perror("open");
+ perror("failed to open /sys/kernel/debug/gup_test");
+ printf("check if CONFIG_GUP_TEST is enabled in kernel config\n");
+ printf("check if debugfs is mounted at /sys/kernel/debug\n");
exit(1);
}
--
2.24.1
Hello!
This is the spectre-bhb backport for v4.14.
This comes with an A76 timer workaround. v4.14 doesn't have a compat
vdso, so doesn't need all the patches for that workaround.
In particular, it doesn't need Marc's series:
https://lore.kernel.org/linux-arm-kernel/20200715125614.3240269-1-maz@kerne…
I included the Kconfig change that restricts this to COMPAT, but not commit
0f80cad3124f ("arm64: Restrict ARM64_ERRATUM_1188873 mitigation to AArch32"),
which is an invasive performance optimisation that wasn't marked as
being for stable.
Thanks,
James
Anshuman Khandual (1):
arm64: Add Cortex-X2 CPU part definition
Arnd Bergmann (1):
arm64: arch_timer: avoid unused function warning
James Morse (19):
arm64: entry.S: Add ventry overflow sanity checks
arm64: entry: Make the trampoline cleanup optional
arm64: entry: Free up another register on kpti's tramp_exit path
arm64: entry: Move the trampoline data page before the text page
arm64: entry: Allow tramp_alias to access symbols after the 4K
boundary
arm64: entry: Don't assume tramp_vectors is the start of the vectors
arm64: entry: Move trampoline macros out of ifdef'd section
arm64: entry: Make the kpti trampoline's kpti sequence optional
arm64: entry: Allow the trampoline text to occupy multiple pages
arm64: entry: Add non-kpti __bp_harden_el1_vectors for mitigations
arm64: entry: Add vectors that have the bhb mitigation sequences
arm64: entry: Add macro for reading symbol addresses from the
trampoline
arm64: Add percpu vectors for EL1
arm64: proton-pack: Report Spectre-BHB vulnerabilities as part of
Spectre-v2
KVM: arm64: Add templates for BHB mitigation sequences
arm64: Mitigate spectre style branch history side channels
KVM: arm64: Allow SMCCC_ARCH_WORKAROUND_3 to be discovered and
migrated
arm64: add ID_AA64ISAR2_EL1 sys register
arm64: Use the clearbhb instruction in mitigations
Marc Zyngier (4):
arm64: arch_timer: Add workaround for ARM erratum 1188873
arm64: Add silicon-errata.txt entry for ARM erratum 1188873
arm64: Make ARM64_ERRATUM_1188873 depend on COMPAT
arm64: Add part number for Neoverse N1
Rob Herring (1):
arm64: Add part number for Arm Cortex-A77
Suzuki K Poulose (1):
arm64: Add Neoverse-N2, Cortex-A710 CPU part definition
Documentation/arm64/silicon-errata.txt | 1 +
arch/arm/include/asm/kvm_host.h | 6 +
arch/arm64/Kconfig | 24 ++
arch/arm64/include/asm/assembler.h | 34 +++
arch/arm64/include/asm/cpu.h | 1 +
arch/arm64/include/asm/cpucaps.h | 4 +-
arch/arm64/include/asm/cpufeature.h | 39 +++
arch/arm64/include/asm/cputype.h | 20 ++
arch/arm64/include/asm/fixmap.h | 6 +-
arch/arm64/include/asm/kvm_host.h | 5 +
arch/arm64/include/asm/kvm_mmu.h | 2 +-
arch/arm64/include/asm/mmu.h | 8 +-
arch/arm64/include/asm/sections.h | 6 +
arch/arm64/include/asm/sysreg.h | 5 +
arch/arm64/include/asm/vectors.h | 74 +++++
arch/arm64/kernel/bpi.S | 55 ++++
arch/arm64/kernel/cpu_errata.c | 395 ++++++++++++++++++++++++-
arch/arm64/kernel/cpufeature.c | 21 ++
arch/arm64/kernel/cpuinfo.c | 1 +
arch/arm64/kernel/entry.S | 198 ++++++++++---
arch/arm64/kernel/vmlinux.lds.S | 2 +-
arch/arm64/kvm/hyp/hyp-entry.S | 4 +
arch/arm64/kvm/hyp/switch.c | 9 +-
arch/arm64/mm/mmu.c | 11 +-
drivers/clocksource/arm_arch_timer.c | 15 +
include/linux/arm-smccc.h | 7 +
virt/kvm/arm/psci.c | 12 +
27 files changed, 908 insertions(+), 57 deletions(-)
create mode 100644 arch/arm64/include/asm/vectors.h
--
2.30.2
Syzbot found an issue [1] in ext4_fallocate().
The C reproducer [2] calls fallocate(), passing size 0xffeffeff000ul,
and offset 0x1000000ul, which, when added together exceed the disk size,
and trigger a BUG in ext4_ind_remove_space() [3].
According to the comment doc in ext4_ind_remove_space() the 'end' block
parameter needs to be one block after the last block to remove.
In the case when the BUG is triggered it points to the last block on
a 4GB virtual disk image. This is calculated in
ext4_ind_remove_space() in [4].
This patch adds a check that ensure the length + offest to be
within the valid range and returns -ENOSPC error code in case
it is invalid.
LINK: [1] https://syzkaller.appspot.com/bug?id=b80bd9cf348aac724a4f4dff251800106d7213…
LINK: [2] https://syzkaller.appspot.com/text?tag=ReproC&x=14ba0238700000
LINK: [3] https://elixir.bootlin.com/linux/v5.17-rc8/source/fs/ext4/indirect.c#L1244
LINK: [4] https://elixir.bootlin.com/linux/v5.17-rc8/source/fs/ext4/indirect.c#L1234
Cc: Theodore Ts'o <tytso(a)mit.edu>
Cc: Andreas Dilger <adilger.kernel(a)dilger.ca>
Cc: Ritesh Harjani <riteshh(a)linux.ibm.com>
Cc: <linux-ext4(a)vger.kernel.org>
Cc: <stable(a)vger.kernel.org>
Cc: <linux-kernel(a)vger.kernel.org>
Fixes: a4bb6b64e39a ("ext4: enable "punch hole" functionality")
Reported-by: syzbot+7a806094edd5d07ba029(a)syzkaller.appspotmail.com
Signed-off-by: Tadeusz Struk <tadeusz.struk(a)linaro.org>
--
v2: Change sbi->s_blocksize to inode->i_sb->s_blocksize in maxlength
computation.
---
fs/ext4/inode.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 01c9e4f743ba..355384007d11 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3924,7 +3924,8 @@ int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length)
struct super_block *sb = inode->i_sb;
ext4_lblk_t first_block, stop_block;
struct address_space *mapping = inode->i_mapping;
- loff_t first_block_offset, last_block_offset;
+ loff_t first_block_offset, last_block_offset, max_length;
+ struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
handle_t *handle;
unsigned int credits;
int ret = 0, ret2 = 0;
@@ -3967,6 +3968,16 @@ int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length)
offset;
}
+ /*
+ * For punch hole the length + offset needs to be at least within
+ * one block before last
+ */
+ max_length = sbi->s_bitmap_maxbytes - inode->i_sb->s_blocksize;
+ if (offset + length >= max_length) {
+ ret = -ENOSPC;
+ goto out_mutex;
+ }
+
if (offset & (sb->s_blocksize - 1) ||
(offset + length) & (sb->s_blocksize - 1)) {
/*
--
2.35.1
Good day,
We offer Loan and Commercial Project Funding services for Private
and Oversea Companies. We offer a low interest rate 1.5% annually
& five (5) Years Grace Period for Loan / Funding services .
Please tell us exactly how much Loan are you requesting ? What
type of Projects will you be Investing with our Loan ?
Yours Faithfully
Hoang Thu Dinh
Project Financing Executive`
World Funding Group Inc.
Component match callback functions need to check if expected data is
passed to them. Without this check, it can cause a NULL pointer
dereference when another driver registers a component before i915
drivers have their component master fully bind.
Fixes: 1e8d19d9b0dfc ("mei: hdcp: bind only with i915 on the same PCH")
Fixes: c2004ce99ed73 ("mei: pxp: export pavp client to me client bus")
Suggested-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Suggested-by: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Signed-off-by: Won Chung <wonchung(a)google.com>
Cc: stable(a)vger.kernel.org
---
Changes from v2:
- Correctly add "Suggested-by" tag
- Add "Cc: stable(a)vger.kernel.org"
Changes from v1:
- Add "Fixes" tag
- Send to stable(a)vger.kernel.org
drivers/misc/mei/hdcp/mei_hdcp.c | 2 +-
drivers/misc/mei/pxp/mei_pxp.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/misc/mei/hdcp/mei_hdcp.c b/drivers/misc/mei/hdcp/mei_hdcp.c
index ec2a4fce8581..843dbc2b21b1 100644
--- a/drivers/misc/mei/hdcp/mei_hdcp.c
+++ b/drivers/misc/mei/hdcp/mei_hdcp.c
@@ -784,7 +784,7 @@ static int mei_hdcp_component_match(struct device *dev, int subcomponent,
{
struct device *base = data;
- if (strcmp(dev->driver->name, "i915") ||
+ if (!base || !dev->driver || strcmp(dev->driver->name, "i915") ||
subcomponent != I915_COMPONENT_HDCP)
return 0;
diff --git a/drivers/misc/mei/pxp/mei_pxp.c b/drivers/misc/mei/pxp/mei_pxp.c
index f7380d387bab..e32a81da8af6 100644
--- a/drivers/misc/mei/pxp/mei_pxp.c
+++ b/drivers/misc/mei/pxp/mei_pxp.c
@@ -131,7 +131,7 @@ static int mei_pxp_component_match(struct device *dev, int subcomponent,
{
struct device *base = data;
- if (strcmp(dev->driver->name, "i915") ||
+ if (!base || !dev->driver || strcmp(dev->driver->name, "i915") ||
subcomponent != I915_COMPONENT_PXP)
return 0;
--
2.35.1.1021.g381101b075-goog
Hi,
We provide estimation & quantities takeoff services. We are providing 98-100 accuracy in our estimates and take-offs. Please tell us if you need any estimating services regarding your projects.
Send over the plans and mention the exact scope of work and shortly we will get back with a proposal on which our charges and turnaround time will be mentioned
You may ask for sample estimates and take-offs. Thanks.
Kind Regards
Nickolas Casas
Dreamland Estimation, LLC
Re-enable the registration of algorithms after fixes to (1) use
pre-allocated buffers in the datapath and (2) support the
CRYPTO_TFM_REQ_MAY_BACKLOG flag.
This reverts commit 8893d27ffcaf6ec6267038a177cb87bcde4dd3de.
Cc: stable(a)vger.kernel.org
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu(a)intel.com>
Reviewed-by: Marco Chiappero <marco.chiappero(a)intel.com>
---
drivers/crypto/qat/qat_4xxx/adf_drv.c | 7 -------
drivers/crypto/qat/qat_common/qat_crypto.c | 7 -------
2 files changed, 14 deletions(-)
diff --git a/drivers/crypto/qat/qat_4xxx/adf_drv.c b/drivers/crypto/qat/qat_4xxx/adf_drv.c
index fa4c350c1bf9..a6c78b9c730b 100644
--- a/drivers/crypto/qat/qat_4xxx/adf_drv.c
+++ b/drivers/crypto/qat/qat_4xxx/adf_drv.c
@@ -75,13 +75,6 @@ static int adf_crypto_dev_config(struct adf_accel_dev *accel_dev)
if (ret)
goto err;
- /* Temporarily set the number of crypto instances to zero to avoid
- * registering the crypto algorithms.
- * This will be removed when the algorithms will support the
- * CRYPTO_TFM_REQ_MAY_BACKLOG flag
- */
- instances = 0;
-
for (i = 0; i < instances; i++) {
val = i;
bank = i * 2;
diff --git a/drivers/crypto/qat/qat_common/qat_crypto.c b/drivers/crypto/qat/qat_common/qat_crypto.c
index 80d905ed102e..9341d892533a 100644
--- a/drivers/crypto/qat/qat_common/qat_crypto.c
+++ b/drivers/crypto/qat/qat_common/qat_crypto.c
@@ -161,13 +161,6 @@ int qat_crypto_dev_config(struct adf_accel_dev *accel_dev)
if (ret)
goto err;
- /* Temporarily set the number of crypto instances to zero to avoid
- * registering the crypto algorithms.
- * This will be removed when the algorithms will support the
- * CRYPTO_TFM_REQ_MAY_BACKLOG flag
- */
- instances = 0;
-
for (i = 0; i < instances; i++) {
val = i;
snprintf(key, sizeof(key), ADF_CY "%d" ADF_RING_ASYM_BANK_NUM, i);
--
2.35.1
Make the two locations where exportfs helpers check permission to lookup
a given inode idmapped mount aware by switching it to the lookup_one()
helper. This is a bugfix for the open_by_handle_at() system call which
doesn't take idmapped mounts into account currently. It's not tied to a
specific commit so we'll just Cc stable.
In addition this is required to support idmapped base layers in overlay.
The overlay filesystem uses exportfs to encode and decode file handles
for its index=on mount option and when nfs_export=on.
Cc: <stable(a)vger.kernel.org>
Cc: <linux-fsdevel(a)vger.kernel.org>
Tested-by: Giuseppe Scrivano <gscrivan(a)redhat.com>
Reviewed-by: Amir Goldstein <amir73il(a)gmail.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner(a)kernel.org>
---
/* v2 */
unchanged
/* v3 */
unchanged
---
fs/exportfs/expfs.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index 0106eba46d5a..3ef80d000e13 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -145,7 +145,7 @@ static struct dentry *reconnect_one(struct vfsmount *mnt,
if (err)
goto out_err;
dprintk("%s: found name: %s\n", __func__, nbuf);
- tmp = lookup_one_len_unlocked(nbuf, parent, strlen(nbuf));
+ tmp = lookup_one_unlocked(mnt_user_ns(mnt), nbuf, parent, strlen(nbuf));
if (IS_ERR(tmp)) {
dprintk("%s: lookup failed: %d\n", __func__, PTR_ERR(tmp));
err = PTR_ERR(tmp);
@@ -525,7 +525,8 @@ exportfs_decode_fh_raw(struct vfsmount *mnt, struct fid *fid, int fh_len,
}
inode_lock(target_dir->d_inode);
- nresult = lookup_one_len(nbuf, target_dir, strlen(nbuf));
+ nresult = lookup_one(mnt_user_ns(mnt), nbuf,
+ target_dir, strlen(nbuf));
if (!IS_ERR(nresult)) {
if (unlikely(nresult->d_inode != result->d_inode)) {
dput(nresult);
--
2.32.0
آیا نامه پیروزی خود را دریافت کرده اید که شما را از طریق آدرس ایمیل
6,500 میلیون دلاری شما راهنمایی می کند؟
1) انتقال از بانک به بانک.
2) تحویل اکسپرس ATM
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 64f93a9a27c1970fa8ee5ffc5a6ae2bda477ec5b Mon Sep 17 00:00:00 2001
From: Paul Davey <paul.davey(a)alliedtelesis.co.nz>
Date: Tue, 1 Mar 2022 21:33:00 +0530
Subject: [PATCH] bus: mhi: Fix pm_state conversion to string
On big endian architectures the mhi debugfs files which report pm state
give "Invalid State" for all states. This is caused by using
find_last_bit which takes an unsigned long* while the state is passed in
as an enum mhi_pm_state which will be of int size.
Fix by using __fls to pass the value of state instead of find_last_bit.
Also the current API expects "mhi_pm_state" enumerator as the function
argument but the function only works with bitmasks. So as Alex suggested,
let's change the argument to u32 to avoid confusion.
Fixes: a6e2e3522f29 ("bus: mhi: core: Add support for PM state transitions")
Cc: stable(a)vger.kernel.org
[mani: changed the function argument to u32]
Reviewed-by: Manivannan Sadhasivam <mani(a)kernel.org>
Reviewed-by: Hemant Kumar <hemantk(a)codeaurora.org>
Reviewed-by: Alex Elder <elder(a)linaro.org>
Signed-off-by: Paul Davey <paul.davey(a)alliedtelesis.co.nz>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
Link: https://lore.kernel.org/r/20220301160308.107452-3-manivannan.sadhasivam@lin…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/bus/mhi/core/init.c b/drivers/bus/mhi/core/init.c
index 046f407dc5d6..09394a1c29ec 100644
--- a/drivers/bus/mhi/core/init.c
+++ b/drivers/bus/mhi/core/init.c
@@ -77,12 +77,14 @@ static const char * const mhi_pm_state_str[] = {
[MHI_PM_STATE_LD_ERR_FATAL_DETECT] = "Linkdown or Error Fatal Detect",
};
-const char *to_mhi_pm_state_str(enum mhi_pm_state state)
+const char *to_mhi_pm_state_str(u32 state)
{
- unsigned long pm_state = state;
- int index = find_last_bit(&pm_state, 32);
+ int index;
- if (index >= ARRAY_SIZE(mhi_pm_state_str))
+ if (state)
+ index = __fls(state);
+
+ if (!state || index >= ARRAY_SIZE(mhi_pm_state_str))
return "Invalid State";
return mhi_pm_state_str[index];
diff --git a/drivers/bus/mhi/core/internal.h b/drivers/bus/mhi/core/internal.h
index e2e10474a9d9..3508cbbf555d 100644
--- a/drivers/bus/mhi/core/internal.h
+++ b/drivers/bus/mhi/core/internal.h
@@ -622,7 +622,7 @@ void mhi_free_bhie_table(struct mhi_controller *mhi_cntrl,
enum mhi_pm_state __must_check mhi_tryset_pm_state(
struct mhi_controller *mhi_cntrl,
enum mhi_pm_state state);
-const char *to_mhi_pm_state_str(enum mhi_pm_state state);
+const char *to_mhi_pm_state_str(u32 state);
int mhi_queue_state_transition(struct mhi_controller *mhi_cntrl,
enum dev_st_transition state);
void mhi_pm_st_worker(struct work_struct *work);
The patch below does not apply to the 5.16-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 64f93a9a27c1970fa8ee5ffc5a6ae2bda477ec5b Mon Sep 17 00:00:00 2001
From: Paul Davey <paul.davey(a)alliedtelesis.co.nz>
Date: Tue, 1 Mar 2022 21:33:00 +0530
Subject: [PATCH] bus: mhi: Fix pm_state conversion to string
On big endian architectures the mhi debugfs files which report pm state
give "Invalid State" for all states. This is caused by using
find_last_bit which takes an unsigned long* while the state is passed in
as an enum mhi_pm_state which will be of int size.
Fix by using __fls to pass the value of state instead of find_last_bit.
Also the current API expects "mhi_pm_state" enumerator as the function
argument but the function only works with bitmasks. So as Alex suggested,
let's change the argument to u32 to avoid confusion.
Fixes: a6e2e3522f29 ("bus: mhi: core: Add support for PM state transitions")
Cc: stable(a)vger.kernel.org
[mani: changed the function argument to u32]
Reviewed-by: Manivannan Sadhasivam <mani(a)kernel.org>
Reviewed-by: Hemant Kumar <hemantk(a)codeaurora.org>
Reviewed-by: Alex Elder <elder(a)linaro.org>
Signed-off-by: Paul Davey <paul.davey(a)alliedtelesis.co.nz>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
Link: https://lore.kernel.org/r/20220301160308.107452-3-manivannan.sadhasivam@lin…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/bus/mhi/core/init.c b/drivers/bus/mhi/core/init.c
index 046f407dc5d6..09394a1c29ec 100644
--- a/drivers/bus/mhi/core/init.c
+++ b/drivers/bus/mhi/core/init.c
@@ -77,12 +77,14 @@ static const char * const mhi_pm_state_str[] = {
[MHI_PM_STATE_LD_ERR_FATAL_DETECT] = "Linkdown or Error Fatal Detect",
};
-const char *to_mhi_pm_state_str(enum mhi_pm_state state)
+const char *to_mhi_pm_state_str(u32 state)
{
- unsigned long pm_state = state;
- int index = find_last_bit(&pm_state, 32);
+ int index;
- if (index >= ARRAY_SIZE(mhi_pm_state_str))
+ if (state)
+ index = __fls(state);
+
+ if (!state || index >= ARRAY_SIZE(mhi_pm_state_str))
return "Invalid State";
return mhi_pm_state_str[index];
diff --git a/drivers/bus/mhi/core/internal.h b/drivers/bus/mhi/core/internal.h
index e2e10474a9d9..3508cbbf555d 100644
--- a/drivers/bus/mhi/core/internal.h
+++ b/drivers/bus/mhi/core/internal.h
@@ -622,7 +622,7 @@ void mhi_free_bhie_table(struct mhi_controller *mhi_cntrl,
enum mhi_pm_state __must_check mhi_tryset_pm_state(
struct mhi_controller *mhi_cntrl,
enum mhi_pm_state state);
-const char *to_mhi_pm_state_str(enum mhi_pm_state state);
+const char *to_mhi_pm_state_str(u32 state);
int mhi_queue_state_transition(struct mhi_controller *mhi_cntrl,
enum dev_st_transition state);
void mhi_pm_st_worker(struct work_struct *work);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 64f93a9a27c1970fa8ee5ffc5a6ae2bda477ec5b Mon Sep 17 00:00:00 2001
From: Paul Davey <paul.davey(a)alliedtelesis.co.nz>
Date: Tue, 1 Mar 2022 21:33:00 +0530
Subject: [PATCH] bus: mhi: Fix pm_state conversion to string
On big endian architectures the mhi debugfs files which report pm state
give "Invalid State" for all states. This is caused by using
find_last_bit which takes an unsigned long* while the state is passed in
as an enum mhi_pm_state which will be of int size.
Fix by using __fls to pass the value of state instead of find_last_bit.
Also the current API expects "mhi_pm_state" enumerator as the function
argument but the function only works with bitmasks. So as Alex suggested,
let's change the argument to u32 to avoid confusion.
Fixes: a6e2e3522f29 ("bus: mhi: core: Add support for PM state transitions")
Cc: stable(a)vger.kernel.org
[mani: changed the function argument to u32]
Reviewed-by: Manivannan Sadhasivam <mani(a)kernel.org>
Reviewed-by: Hemant Kumar <hemantk(a)codeaurora.org>
Reviewed-by: Alex Elder <elder(a)linaro.org>
Signed-off-by: Paul Davey <paul.davey(a)alliedtelesis.co.nz>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
Link: https://lore.kernel.org/r/20220301160308.107452-3-manivannan.sadhasivam@lin…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/bus/mhi/core/init.c b/drivers/bus/mhi/core/init.c
index 046f407dc5d6..09394a1c29ec 100644
--- a/drivers/bus/mhi/core/init.c
+++ b/drivers/bus/mhi/core/init.c
@@ -77,12 +77,14 @@ static const char * const mhi_pm_state_str[] = {
[MHI_PM_STATE_LD_ERR_FATAL_DETECT] = "Linkdown or Error Fatal Detect",
};
-const char *to_mhi_pm_state_str(enum mhi_pm_state state)
+const char *to_mhi_pm_state_str(u32 state)
{
- unsigned long pm_state = state;
- int index = find_last_bit(&pm_state, 32);
+ int index;
- if (index >= ARRAY_SIZE(mhi_pm_state_str))
+ if (state)
+ index = __fls(state);
+
+ if (!state || index >= ARRAY_SIZE(mhi_pm_state_str))
return "Invalid State";
return mhi_pm_state_str[index];
diff --git a/drivers/bus/mhi/core/internal.h b/drivers/bus/mhi/core/internal.h
index e2e10474a9d9..3508cbbf555d 100644
--- a/drivers/bus/mhi/core/internal.h
+++ b/drivers/bus/mhi/core/internal.h
@@ -622,7 +622,7 @@ void mhi_free_bhie_table(struct mhi_controller *mhi_cntrl,
enum mhi_pm_state __must_check mhi_tryset_pm_state(
struct mhi_controller *mhi_cntrl,
enum mhi_pm_state state);
-const char *to_mhi_pm_state_str(enum mhi_pm_state state);
+const char *to_mhi_pm_state_str(u32 state);
int mhi_queue_state_transition(struct mhi_controller *mhi_cntrl,
enum dev_st_transition state);
void mhi_pm_st_worker(struct work_struct *work);
Component match callback functions need to check if expected data is
passed to them. Without this check, it can cause a NULL pointer
dereference when another driver registers a component before i915
drivers have their component master fully bind.
Fixes: 1e8d19d9b0dfc ("mei: hdcp: bind only with i915 on the same PCH")
Fixes: c2004ce99ed73 ("mei: pxp: export pavp client to me client bus")
Signed-off-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Signed-off-by: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Signed-off-by: Won Chung <wonchung(a)google.com>
---
Changes from v1:
- Add "Fixes" tag
- Send to stable(a)vger.kernel.org
drivers/misc/mei/hdcp/mei_hdcp.c | 2 +-
drivers/misc/mei/pxp/mei_pxp.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/misc/mei/hdcp/mei_hdcp.c b/drivers/misc/mei/hdcp/mei_hdcp.c
index ec2a4fce8581..843dbc2b21b1 100644
--- a/drivers/misc/mei/hdcp/mei_hdcp.c
+++ b/drivers/misc/mei/hdcp/mei_hdcp.c
@@ -784,7 +784,7 @@ static int mei_hdcp_component_match(struct device *dev, int subcomponent,
{
struct device *base = data;
- if (strcmp(dev->driver->name, "i915") ||
+ if (!base || !dev->driver || strcmp(dev->driver->name, "i915") ||
subcomponent != I915_COMPONENT_HDCP)
return 0;
diff --git a/drivers/misc/mei/pxp/mei_pxp.c b/drivers/misc/mei/pxp/mei_pxp.c
index f7380d387bab..e32a81da8af6 100644
--- a/drivers/misc/mei/pxp/mei_pxp.c
+++ b/drivers/misc/mei/pxp/mei_pxp.c
@@ -131,7 +131,7 @@ static int mei_pxp_component_match(struct device *dev, int subcomponent,
{
struct device *base = data;
- if (strcmp(dev->driver->name, "i915") ||
+ if (!base || !dev->driver || strcmp(dev->driver->name, "i915") ||
subcomponent != I915_COMPONENT_PXP)
return 0;
--
2.35.1.1021.g381101b075-goog
commit 7e0438f83dc769465ee663bb5dcf8cc154940712 upstream.
The following sequence of operations results in a refcount warning:
1. Open device /dev/tpmrm.
2. Remove module tpm_tis_spi.
3. Write a TPM command to the file descriptor opened at step 1.
------------[ cut here ]------------
WARNING: CPU: 3 PID: 1161 at lib/refcount.c:25 kobject_get+0xa0/0xa4
refcount_t: addition on 0; use-after-free.
Modules linked in: tpm_tis_spi tpm_tis_core tpm mdio_bcm_unimac brcmfmac
sha256_generic libsha256 sha256_arm hci_uart btbcm bluetooth cfg80211 vc4
brcmutil ecdh_generic ecc snd_soc_core crc32_arm_ce libaes
raspberrypi_hwmon ac97_bus snd_pcm_dmaengine bcm2711_thermal snd_pcm
snd_timer genet snd phy_generic soundcore [last unloaded: spi_bcm2835]
CPU: 3 PID: 1161 Comm: hold_open Not tainted 5.10.0ls-main-dirty #2
Hardware name: BCM2711
[<c0410c3c>] (unwind_backtrace) from [<c040b580>] (show_stack+0x10/0x14)
[<c040b580>] (show_stack) from [<c1092174>] (dump_stack+0xc4/0xd8)
[<c1092174>] (dump_stack) from [<c0445a30>] (__warn+0x104/0x108)
[<c0445a30>] (__warn) from [<c0445aa8>] (warn_slowpath_fmt+0x74/0xb8)
[<c0445aa8>] (warn_slowpath_fmt) from [<c08435d0>] (kobject_get+0xa0/0xa4)
[<c08435d0>] (kobject_get) from [<bf0a715c>] (tpm_try_get_ops+0x14/0x54 [tpm])
[<bf0a715c>] (tpm_try_get_ops [tpm]) from [<bf0a7d6c>] (tpm_common_write+0x38/0x60 [tpm])
[<bf0a7d6c>] (tpm_common_write [tpm]) from [<c05a7ac0>] (vfs_write+0xc4/0x3c0)
[<c05a7ac0>] (vfs_write) from [<c05a7ee4>] (ksys_write+0x58/0xcc)
[<c05a7ee4>] (ksys_write) from [<c04001a0>] (ret_fast_syscall+0x0/0x4c)
Exception stack(0xc226bfa8 to 0xc226bff0)
bfa0: 00000000 000105b4 00000003 beafe664 00000014 00000000
bfc0: 00000000 000105b4 000103f8 00000004 00000000 00000000 b6f9c000 beafe684
bfe0: 0000006c beafe648 0001056c b6eb6944
---[ end trace d4b8409def9b8b1f ]---
The reason for this warning is the attempt to get the chip->dev reference
in tpm_common_write() although the reference counter is already zero.
Since commit 8979b02aaf1d ("tpm: Fix reference count to main device") the
extra reference used to prevent a premature zero counter is never taken,
because the required TPM_CHIP_FLAG_TPM2 flag is never set.
Fix this by moving the TPM 2 character device handling from
tpm_chip_alloc() to tpm_add_char_device() which is called at a later point
in time when the flag has been set in case of TPM2.
Commit fdc915f7f719 ("tpm: expose spaces via a device link /dev/tpmrm<n>")
already introduced function tpm_devs_release() to release the extra
reference but did not implement the required put on chip->devs that results
in the call of this function.
Fix this by putting chip->devs in tpm_chip_unregister().
Finally move the new implementation for the TPM 2 handling into a new
function to avoid multiple checks for the TPM_CHIP_FLAG_TPM2 flag in the
good case and error cases.
Cc: stable(a)vger.kernel.org
Fixes: fdc915f7f719 ("tpm: expose spaces via a device link /dev/tpmrm<n>")
Fixes: 8979b02aaf1d ("tpm: Fix reference count to main device")
Co-developed-by: Jason Gunthorpe <jgg(a)ziepe.ca>
Signed-off-by: Jason Gunthorpe <jgg(a)ziepe.ca>
Signed-off-by: Lino Sanfilippo <LinoSanfilippo(a)gmx.de>
Tested-by: Stefan Berger <stefanb(a)linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg(a)nvidia.com>
Reviewed-by: Jarkko Sakkinen <jarkko(a)kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
---
drivers/char/tpm/tpm-chip.c | 46 +++++--------------------
drivers/char/tpm/tpm.h | 2 ++
drivers/char/tpm/tpm2-space.c | 65 +++++++++++++++++++++++++++++++++++
3 files changed, 75 insertions(+), 38 deletions(-)
diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
index 11ec5c2715a9..d91a795ad432 100644
--- a/drivers/char/tpm/tpm-chip.c
+++ b/drivers/char/tpm/tpm-chip.c
@@ -134,14 +134,6 @@ static void tpm_dev_release(struct device *dev)
kfree(chip);
}
-static void tpm_devs_release(struct device *dev)
-{
- struct tpm_chip *chip = container_of(dev, struct tpm_chip, devs);
-
- /* release the master device reference */
- put_device(&chip->dev);
-}
-
/**
* tpm_class_shutdown() - prepare the TPM device for loss of power.
* @dev: device to which the chip is associated.
@@ -205,7 +197,6 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
chip->dev_num = rc;
device_initialize(&chip->dev);
- device_initialize(&chip->devs);
chip->dev.class = tpm_class;
chip->dev.class->shutdown_pre = tpm_class_shutdown;
@@ -213,29 +204,12 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
chip->dev.parent = pdev;
chip->dev.groups = chip->groups;
- chip->devs.parent = pdev;
- chip->devs.class = tpmrm_class;
- chip->devs.release = tpm_devs_release;
- /* get extra reference on main device to hold on
- * behalf of devs. This holds the chip structure
- * while cdevs is in use. The corresponding put
- * is in the tpm_devs_release (TPM2 only)
- */
- if (chip->flags & TPM_CHIP_FLAG_TPM2)
- get_device(&chip->dev);
-
if (chip->dev_num == 0)
chip->dev.devt = MKDEV(MISC_MAJOR, TPM_MINOR);
else
chip->dev.devt = MKDEV(MAJOR(tpm_devt), chip->dev_num);
- chip->devs.devt =
- MKDEV(MAJOR(tpm_devt), chip->dev_num + TPM_NUM_DEVICES);
-
rc = dev_set_name(&chip->dev, "tpm%d", chip->dev_num);
- if (rc)
- goto out;
- rc = dev_set_name(&chip->devs, "tpmrm%d", chip->dev_num);
if (rc)
goto out;
@@ -243,9 +217,7 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
chip->flags |= TPM_CHIP_FLAG_VIRTUAL;
cdev_init(&chip->cdev, &tpm_fops);
- cdev_init(&chip->cdevs, &tpmrm_fops);
chip->cdev.owner = THIS_MODULE;
- chip->cdevs.owner = THIS_MODULE;
rc = tpm2_init_space(&chip->work_space, TPM2_SPACE_BUFFER_SIZE);
if (rc) {
@@ -257,7 +229,6 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
return chip;
out:
- put_device(&chip->devs);
put_device(&chip->dev);
return ERR_PTR(rc);
}
@@ -306,14 +277,9 @@ static int tpm_add_char_device(struct tpm_chip *chip)
}
if (chip->flags & TPM_CHIP_FLAG_TPM2) {
- rc = cdev_device_add(&chip->cdevs, &chip->devs);
- if (rc) {
- dev_err(&chip->devs,
- "unable to cdev_device_add() %s, major %d, minor %d, err=%d\n",
- dev_name(&chip->devs), MAJOR(chip->devs.devt),
- MINOR(chip->devs.devt), rc);
- return rc;
- }
+ rc = tpm_devs_add(chip);
+ if (rc)
+ goto err_del_cdev;
}
/* Make the chip available. */
@@ -321,6 +287,10 @@ static int tpm_add_char_device(struct tpm_chip *chip)
idr_replace(&dev_nums_idr, chip, chip->dev_num);
mutex_unlock(&idr_lock);
+ return 0;
+
+err_del_cdev:
+ cdev_device_del(&chip->cdev, &chip->dev);
return rc;
}
@@ -449,7 +419,7 @@ void tpm_chip_unregister(struct tpm_chip *chip)
tpm_del_legacy_sysfs(chip);
tpm_bios_log_teardown(chip);
if (chip->flags & TPM_CHIP_FLAG_TPM2)
- cdev_device_del(&chip->cdevs, &chip->devs);
+ tpm_devs_remove(chip);
tpm_del_char_device(chip);
}
EXPORT_SYMBOL_GPL(tpm_chip_unregister);
diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
index 019fe80fedd8..7436da043040 100644
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -593,4 +593,6 @@ int tpm2_prepare_space(struct tpm_chip *chip, struct tpm_space *space, u32 cc,
u8 *cmd);
int tpm2_commit_space(struct tpm_chip *chip, struct tpm_space *space,
u32 cc, u8 *buf, size_t *bufsiz);
+int tpm_devs_add(struct tpm_chip *chip);
+void tpm_devs_remove(struct tpm_chip *chip);
#endif
diff --git a/drivers/char/tpm/tpm2-space.c b/drivers/char/tpm/tpm2-space.c
index c7763a96cbaf..8e42bc096e83 100644
--- a/drivers/char/tpm/tpm2-space.c
+++ b/drivers/char/tpm/tpm2-space.c
@@ -540,3 +540,68 @@ int tpm2_commit_space(struct tpm_chip *chip, struct tpm_space *space,
return 0;
}
+
+/*
+ * Put the reference to the main device.
+ */
+static void tpm_devs_release(struct device *dev)
+{
+ struct tpm_chip *chip = container_of(dev, struct tpm_chip, devs);
+
+ /* release the master device reference */
+ put_device(&chip->dev);
+}
+
+/*
+ * Remove the device file for exposed TPM spaces and release the device
+ * reference. This may also release the reference to the master device.
+ */
+void tpm_devs_remove(struct tpm_chip *chip)
+{
+ cdev_device_del(&chip->cdevs, &chip->devs);
+ put_device(&chip->devs);
+}
+
+/*
+ * Add a device file to expose TPM spaces. Also take a reference to the
+ * main device.
+ */
+int tpm_devs_add(struct tpm_chip *chip)
+{
+ int rc;
+
+ device_initialize(&chip->devs);
+ chip->devs.parent = chip->dev.parent;
+ chip->devs.class = tpmrm_class;
+
+ /*
+ * Get extra reference on main device to hold on behalf of devs.
+ * This holds the chip structure while cdevs is in use. The
+ * corresponding put is in the tpm_devs_release.
+ */
+ get_device(&chip->dev);
+ chip->devs.release = tpm_devs_release;
+ chip->devs.devt = MKDEV(MAJOR(tpm_devt), chip->dev_num + TPM_NUM_DEVICES);
+ cdev_init(&chip->cdevs, &tpmrm_fops);
+ chip->cdevs.owner = THIS_MODULE;
+
+ rc = dev_set_name(&chip->devs, "tpmrm%d", chip->dev_num);
+ if (rc)
+ goto err_put_devs;
+
+ rc = cdev_device_add(&chip->cdevs, &chip->devs);
+ if (rc) {
+ dev_err(&chip->devs,
+ "unable to cdev_device_add() %s, major %d, minor %d, err=%d\n",
+ dev_name(&chip->devs), MAJOR(chip->devs.devt),
+ MINOR(chip->devs.devt), rc);
+ goto err_put_devs;
+ }
+
+ return 0;
+
+err_put_devs:
+ put_device(&chip->devs);
+
+ return rc;
+}
--
2.35.1
helped to Cc stable@ list...
On 2022/3/29 21:46, Like Xu wrote:
> From: Like Xu <likexu(a)tencent.com>
>
> NMI-watchdog is one of the favorite features of kernel developers,
> but it does not work in AMD guest even with vPMU enabled and worse,
> the system misrepresents this capability via /proc.
>
> This is a PMC emulation error. KVM does not pass the latest valid
> value to perf_event in time when guest NMI-watchdog is running, thus
> the perf_event corresponding to the watchdog counter will enter the
> old state at some point after the first guest NMI injection, forcing
> the hardware register PMC0 to be constantly written to 0x800000000001.
>
> Meanwhile, the running counter should accurately reflect its new value
> based on the latest coordinated pmc->counter (from vPMC's point of view)
> rather than the value written directly by the guest.
>
> Fixes: 168d918f2643 ("KVM: x86: Adjust counter sample period after a wrmsr")
> Reported-by: Dongli Cao <caodongli(a)kingsoft.com>
> Signed-off-by: Like Xu <likexu(a)tencent.com>
> ---
> arch/x86/kvm/pmu.h | 9 +++++++++
> arch/x86/kvm/svm/pmu.c | 1 +
> arch/x86/kvm/vmx/pmu_intel.c | 8 ++------
> 3 files changed, 12 insertions(+), 6 deletions(-)
Recently I also met the "NMI watchdog not working on AMD guest"
issue, I have tested this patch locally and it helps.
Reviewed-by: Yanan Wang <wangyanan55(a)huawei.com>
Tested-by: Yanan Wang <wangyanan55(a)huawei.com>
Thanks,
Yanan
> diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
> index 7a7b8d5b775e..5e7e8d163b98 100644
> --- a/arch/x86/kvm/pmu.h
> +++ b/arch/x86/kvm/pmu.h
> @@ -140,6 +140,15 @@ static inline u64 get_sample_period(struct kvm_pmc *pmc, u64 counter_value)
> return sample_period;
> }
>
> +static inline void pmc_update_sample_period(struct kvm_pmc *pmc)
> +{
> + if (!pmc->perf_event || pmc->is_paused)
> + return;
> +
> + perf_event_period(pmc->perf_event,
> + get_sample_period(pmc, pmc->counter));
> +}
> +
> void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel);
> void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 ctrl, int fixed_idx);
> void reprogram_counter(struct kvm_pmu *pmu, int pmc_idx);
> diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
> index 24eb935b6f85..b14860863c39 100644
> --- a/arch/x86/kvm/svm/pmu.c
> +++ b/arch/x86/kvm/svm/pmu.c
> @@ -257,6 +257,7 @@ static int amd_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> pmc = get_gp_pmc_amd(pmu, msr, PMU_TYPE_COUNTER);
> if (pmc) {
> pmc->counter += data - pmc_read_counter(pmc);
> + pmc_update_sample_period(pmc);
> return 0;
> }
> /* MSR_EVNTSELn */
> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> index efa172a7278e..e64046fbcdca 100644
> --- a/arch/x86/kvm/vmx/pmu_intel.c
> +++ b/arch/x86/kvm/vmx/pmu_intel.c
> @@ -431,15 +431,11 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> !(msr & MSR_PMC_FULL_WIDTH_BIT))
> data = (s64)(s32)data;
> pmc->counter += data - pmc_read_counter(pmc);
> - if (pmc->perf_event && !pmc->is_paused)
> - perf_event_period(pmc->perf_event,
> - get_sample_period(pmc, data));
> + pmc_update_sample_period(pmc);
> return 0;
> } else if ((pmc = get_fixed_pmc(pmu, msr))) {
> pmc->counter += data - pmc_read_counter(pmc);
> - if (pmc->perf_event && !pmc->is_paused)
> - perf_event_period(pmc->perf_event,
> - get_sample_period(pmc, data));
> + pmc_update_sample_period(pmc);
> return 0;
> } else if ((pmc = get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0))) {
> if (data == pmc->eventsel)
From: Lars Ellenberg <lars.ellenberg(a)linbit.com>
Scenario:
---------
bio chain generated by blk_queue_split().
Some split bio fails and propagates its error status to the "parent" bio.
But then the (last part of the) parent bio itself completes without error.
We would clobber the already recorded error status with BLK_STS_OK,
causing silent data corruption.
Reproducer:
-----------
How to trigger this in the real world within seconds:
DRBD on top of degraded parity raid,
small stripe_cache_size, large read_ahead setting.
Drop page cache (sysctl vm.drop_caches=1, fadvise "DONTNEED",
umount and mount again, "reboot").
Cause significant read ahead.
Large read ahead request is split by blk_queue_split().
Parts of the read ahead that are already in the stripe cache,
or find an available stripe cache to use, can be serviced.
Parts of the read ahead that would need "too much work",
would need to wait for a "stripe_head" to become available,
are rejected immediately.
For larger read ahead requests that are split in many pieces, it is very
likely that some "splits" will be serviced, but then the stripe cache is
exhausted/busy, and the remaining ones will be rejected.
Signed-off-by: Lars Ellenberg <lars.ellenberg(a)linbit.com>
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder(a)linbit.com>
Cc: <stable(a)vger.kernel.org> # 4.13.x
---
drivers/block/drbd/drbd_req.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
index c04394518b07..e1e58e91ee58 100644
--- a/drivers/block/drbd/drbd_req.c
+++ b/drivers/block/drbd/drbd_req.c
@@ -180,7 +180,8 @@ void start_new_tl_epoch(struct drbd_connection *connection)
void complete_master_bio(struct drbd_device *device,
struct bio_and_error *m)
{
- m->bio->bi_status = errno_to_blk_status(m->error);
+ if (unlikely(m->error))
+ m->bio->bi_status = errno_to_blk_status(m->error);
bio_endio(m->bio);
dec_ap_bio(device);
}
--
2.32.0
On Wed, 30 Mar 2022, 조준완 wrote:
> author
> Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> 2021-12-01 19:35:03 +0100
>
> committer
> Benjamin Tissoires <benjamin.tissoires(a)redhat.com> 2021-12-02 15:36:18 +0100
>
> commit
> 93020953d0fa7035fd036ad87a47ae2b7aa4ae33 (patch)
>
> tree
> f734e9962e28cedcccfbb109e510d39a18ed6d34 /drivers/hid/hid-samsung.c
>
> parent
> 720ac467204a70308bd687927ed475afb904e11b (diff)
>
> download
> linux-93020953d0fa7035fd036ad87a47ae2b7aa4ae33.tar.gz
>
> HID: check for valid USB device for many HID drivers
>
> Many HID drivers assume that the HID device assigned to them is a USB
> device as that was the only way HID devices used to be able to be
> created in Linux. However, with the additional ways that HID devices can
> be created for many different bus types, that is no longer true, so
> properly check that we have a USB device associated with the HID device
> before allowing a driver that makes this assumption to claim it.
>
>
> diff --git a/drivers/hid/hid-samsung.c b/drivers/hid/hid-samsung.c
> index 2e1c31156eca0..cf5992e970940 100644
> --- a/drivers/hid/hid-samsung.c
> +++ b/drivers/hid/hid-samsung.c
>
> @@ -152,6 +152,9 @@ static int samsung_probe(struct hid_device *hdev,
>
> int ret;
>
> unsigned int cmask = HID_CONNECT_DEFAULT;
>
>
> + if (!hid_is_usb(hdev))
>
> + return -EINVAL;
>
> +
>
> ret = hid_parse(hdev);
>
> if (ret) {
>
> hid_err(hdev, "parse failed\n");
>
>
> Bluetooth HID devices like Keyboard and mice don't work at all because
> of this commit.
Could you please elaborate a little bit more -- which Bluetooth device
(VID/PID) are you referring to please? hid-samsung as-is in mainline
should only match against these two:
static const struct hid_device_id samsung_devices[] = {
{ HID_USB_DEVICE(USB_VENDOR_ID_SAMSUNG, USB_DEVICE_ID_SAMSUNG_IR_REMOTE) },
{ HID_USB_DEVICE(USB_VENDOR_ID_SAMSUNG, USB_DEVICE_ID_SAMSUNG_WIRELESS_KBD_MOUSE) },
{ }
};
MODULE_DEVICE_TABLE(hid, samsung_devices);
which are both USB devices, not Bluetooth.
> We Samsung applies several patches to Samsung drivers for Android models
> based on Linux kernel Samsung Driver. For example, add Samsung
> accessories or define key codes for special key processing.
>
> We don't want any changes in Samsung hid driver by others.
That's not how Linux development works though.
> Soorer or later, I have a plan to contribute a modified driver by
> Samsung that is currently in use on Samsung Android.
That would be very much appreciated, thanks.
Once you do so, you can very well become maintainers of hid-samsung driver
(which would make a lot of sense for someone from Samsung to actually
participate in development of that driver), and have much better influence
on what goes in and in which form.
> Anyway, we sincerely hope you will revert this commit because Samsung
> driver is not just for USB accessories.
I still would like to understand this part, because as far as I can tell,
the in-kernel one is.
Thanks,
--
Jiri Kosina
SUSE Labs
The patch titled
Subject: mmmremap.c: avoid pointless invalidate_range_start/end on mremap(old_size=0)
has been added to the -mm tree. Its filename is
mm-avoid-pointless-invalidate_range_start-end-on-mremapold_size=0.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-avoid-pointless-invalidate_ran…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-avoid-pointless-invalidate_ran…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Paolo Bonzini <pbonzini(a)redhat.com>
Subject: mmmremap.c: avoid pointless invalidate_range_start/end on mremap(old_size=0)
If an mremap() syscall with old_size=0 ends up in move_page_tables(), it
will call invalidate_range_start()/invalidate_range_end() unnecessarily,
i.e. with an empty range.
This causes a WARN in KVM's mmu_notifier. In the past, empty ranges have
been diagnosed to be off-by-one bugs, hence the WARNing. Given the low
(so far) number of unique reports, the benefits of detecting more buggy
callers seem to outweigh the cost of having to fix cases such as this one,
where userspace is doing something silly. In this particular case, an
early return from move_page_tables() is enough to fix the issue.
Link: https://lkml.kernel.org/r/20220329173155.172439-1-pbonzini@redhat.com
Reported-by: syzbot+6bde52d89cfdf9f61425(a)syzkaller.appspotmail.com
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Sean Christopherson <seanjc(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mremap.c | 3 +++
1 file changed, 3 insertions(+)
--- a/mm/mremap.c~mm-avoid-pointless-invalidate_range_start-end-on-mremapold_size=0
+++ a/mm/mremap.c
@@ -486,6 +486,9 @@ unsigned long move_page_tables(struct vm
pmd_t *old_pmd, *new_pmd;
pud_t *old_pud, *new_pud;
+ if (!len)
+ return 0;
+
old_end = old_addr + len;
flush_cache_range(vma, old_addr, old_end);
_
Patches currently in -mm which might be from pbonzini(a)redhat.com are
mm-avoid-pointless-invalidate_range_start-end-on-mremapold_size=0.patch
commit 7e0438f83dc769465ee663bb5dcf8cc154940712 upstream.
The following sequence of operations results in a refcount warning:
1. Open device /dev/tpmrm.
2. Remove module tpm_tis_spi.
3. Write a TPM command to the file descriptor opened at step 1.
------------[ cut here ]------------
WARNING: CPU: 3 PID: 1161 at lib/refcount.c:25 kobject_get+0xa0/0xa4
refcount_t: addition on 0; use-after-free.
Modules linked in: tpm_tis_spi tpm_tis_core tpm mdio_bcm_unimac brcmfmac
sha256_generic libsha256 sha256_arm hci_uart btbcm bluetooth cfg80211 vc4
brcmutil ecdh_generic ecc snd_soc_core crc32_arm_ce libaes
raspberrypi_hwmon ac97_bus snd_pcm_dmaengine bcm2711_thermal snd_pcm
snd_timer genet snd phy_generic soundcore [last unloaded: spi_bcm2835]
CPU: 3 PID: 1161 Comm: hold_open Not tainted 5.10.0ls-main-dirty #2
Hardware name: BCM2711
[<c0410c3c>] (unwind_backtrace) from [<c040b580>] (show_stack+0x10/0x14)
[<c040b580>] (show_stack) from [<c1092174>] (dump_stack+0xc4/0xd8)
[<c1092174>] (dump_stack) from [<c0445a30>] (__warn+0x104/0x108)
[<c0445a30>] (__warn) from [<c0445aa8>] (warn_slowpath_fmt+0x74/0xb8)
[<c0445aa8>] (warn_slowpath_fmt) from [<c08435d0>] (kobject_get+0xa0/0xa4)
[<c08435d0>] (kobject_get) from [<bf0a715c>] (tpm_try_get_ops+0x14/0x54 [tpm])
[<bf0a715c>] (tpm_try_get_ops [tpm]) from [<bf0a7d6c>] (tpm_common_write+0x38/0x60 [tpm])
[<bf0a7d6c>] (tpm_common_write [tpm]) from [<c05a7ac0>] (vfs_write+0xc4/0x3c0)
[<c05a7ac0>] (vfs_write) from [<c05a7ee4>] (ksys_write+0x58/0xcc)
[<c05a7ee4>] (ksys_write) from [<c04001a0>] (ret_fast_syscall+0x0/0x4c)
Exception stack(0xc226bfa8 to 0xc226bff0)
bfa0: 00000000 000105b4 00000003 beafe664 00000014 00000000
bfc0: 00000000 000105b4 000103f8 00000004 00000000 00000000 b6f9c000 beafe684
bfe0: 0000006c beafe648 0001056c b6eb6944
---[ end trace d4b8409def9b8b1f ]---
The reason for this warning is the attempt to get the chip->dev reference
in tpm_common_write() although the reference counter is already zero.
Since commit 8979b02aaf1d ("tpm: Fix reference count to main device") the
extra reference used to prevent a premature zero counter is never taken,
because the required TPM_CHIP_FLAG_TPM2 flag is never set.
Fix this by moving the TPM 2 character device handling from
tpm_chip_alloc() to tpm_add_char_device() which is called at a later point
in time when the flag has been set in case of TPM2.
Commit fdc915f7f719 ("tpm: expose spaces via a device link /dev/tpmrm<n>")
already introduced function tpm_devs_release() to release the extra
reference but did not implement the required put on chip->devs that results
in the call of this function.
Fix this by putting chip->devs in tpm_chip_unregister().
Finally move the new implementation for the TPM 2 handling into a new
function to avoid multiple checks for the TPM_CHIP_FLAG_TPM2 flag in the
good case and error cases.
Cc: stable(a)vger.kernel.org
Fixes: fdc915f7f719 ("tpm: expose spaces via a device link /dev/tpmrm<n>")
Fixes: 8979b02aaf1d ("tpm: Fix reference count to main device")
Co-developed-by: Jason Gunthorpe <jgg(a)ziepe.ca>
Signed-off-by: Jason Gunthorpe <jgg(a)ziepe.ca>
Signed-off-by: Lino Sanfilippo <LinoSanfilippo(a)gmx.de>
Tested-by: Stefan Berger <stefanb(a)linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg(a)nvidia.com>
Reviewed-by: Jarkko Sakkinen <jarkko(a)kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
---
drivers/char/tpm/tpm-chip.c | 46 +++++--------------------
drivers/char/tpm/tpm.h | 2 ++
drivers/char/tpm/tpm2-space.c | 65 +++++++++++++++++++++++++++++++++++
3 files changed, 75 insertions(+), 38 deletions(-)
diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
index f79f87794273..2d62a3649402 100644
--- a/drivers/char/tpm/tpm-chip.c
+++ b/drivers/char/tpm/tpm-chip.c
@@ -163,14 +163,6 @@ static void tpm_dev_release(struct device *dev)
kfree(chip);
}
-static void tpm_devs_release(struct device *dev)
-{
- struct tpm_chip *chip = container_of(dev, struct tpm_chip, devs);
-
- /* release the master device reference */
- put_device(&chip->dev);
-}
-
/**
* tpm_class_shutdown() - prepare the TPM device for loss of power.
* @dev: device to which the chip is associated.
@@ -234,7 +226,6 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
chip->dev_num = rc;
device_initialize(&chip->dev);
- device_initialize(&chip->devs);
chip->dev.class = tpm_class;
chip->dev.class->shutdown_pre = tpm_class_shutdown;
@@ -242,29 +233,12 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
chip->dev.parent = pdev;
chip->dev.groups = chip->groups;
- chip->devs.parent = pdev;
- chip->devs.class = tpmrm_class;
- chip->devs.release = tpm_devs_release;
- /* get extra reference on main device to hold on
- * behalf of devs. This holds the chip structure
- * while cdevs is in use. The corresponding put
- * is in the tpm_devs_release (TPM2 only)
- */
- if (chip->flags & TPM_CHIP_FLAG_TPM2)
- get_device(&chip->dev);
-
if (chip->dev_num == 0)
chip->dev.devt = MKDEV(MISC_MAJOR, TPM_MINOR);
else
chip->dev.devt = MKDEV(MAJOR(tpm_devt), chip->dev_num);
- chip->devs.devt =
- MKDEV(MAJOR(tpm_devt), chip->dev_num + TPM_NUM_DEVICES);
-
rc = dev_set_name(&chip->dev, "tpm%d", chip->dev_num);
- if (rc)
- goto out;
- rc = dev_set_name(&chip->devs, "tpmrm%d", chip->dev_num);
if (rc)
goto out;
@@ -272,9 +246,7 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
chip->flags |= TPM_CHIP_FLAG_VIRTUAL;
cdev_init(&chip->cdev, &tpm_fops);
- cdev_init(&chip->cdevs, &tpmrm_fops);
chip->cdev.owner = THIS_MODULE;
- chip->cdevs.owner = THIS_MODULE;
rc = tpm2_init_space(&chip->work_space, TPM2_SPACE_BUFFER_SIZE);
if (rc) {
@@ -286,7 +258,6 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
return chip;
out:
- put_device(&chip->devs);
put_device(&chip->dev);
return ERR_PTR(rc);
}
@@ -335,14 +306,9 @@ static int tpm_add_char_device(struct tpm_chip *chip)
}
if (chip->flags & TPM_CHIP_FLAG_TPM2) {
- rc = cdev_device_add(&chip->cdevs, &chip->devs);
- if (rc) {
- dev_err(&chip->devs,
- "unable to cdev_device_add() %s, major %d, minor %d, err=%d\n",
- dev_name(&chip->devs), MAJOR(chip->devs.devt),
- MINOR(chip->devs.devt), rc);
- return rc;
- }
+ rc = tpm_devs_add(chip);
+ if (rc)
+ goto err_del_cdev;
}
/* Make the chip available. */
@@ -350,6 +316,10 @@ static int tpm_add_char_device(struct tpm_chip *chip)
idr_replace(&dev_nums_idr, chip, chip->dev_num);
mutex_unlock(&idr_lock);
+ return 0;
+
+err_del_cdev:
+ cdev_device_del(&chip->cdev, &chip->dev);
return rc;
}
@@ -508,7 +478,7 @@ void tpm_chip_unregister(struct tpm_chip *chip)
hwrng_unregister(&chip->hwrng);
tpm_bios_log_teardown(chip);
if (chip->flags & TPM_CHIP_FLAG_TPM2)
- cdev_device_del(&chip->cdevs, &chip->devs);
+ tpm_devs_remove(chip);
tpm_del_char_device(chip);
}
EXPORT_SYMBOL_GPL(tpm_chip_unregister);
diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
index b9a30f0b8825..7b8e279c180b 100644
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -605,6 +605,8 @@ int tpm2_prepare_space(struct tpm_chip *chip, struct tpm_space *space, u32 cc,
u8 *cmd);
int tpm2_commit_space(struct tpm_chip *chip, struct tpm_space *space,
u32 cc, u8 *buf, size_t *bufsiz);
+int tpm_devs_add(struct tpm_chip *chip);
+void tpm_devs_remove(struct tpm_chip *chip);
void tpm_bios_log_setup(struct tpm_chip *chip);
void tpm_bios_log_teardown(struct tpm_chip *chip);
diff --git a/drivers/char/tpm/tpm2-space.c b/drivers/char/tpm/tpm2-space.c
index 205d930d7ea6..a54bb904b281 100644
--- a/drivers/char/tpm/tpm2-space.c
+++ b/drivers/char/tpm/tpm2-space.c
@@ -536,3 +536,68 @@ int tpm2_commit_space(struct tpm_chip *chip, struct tpm_space *space,
return 0;
}
+
+/*
+ * Put the reference to the main device.
+ */
+static void tpm_devs_release(struct device *dev)
+{
+ struct tpm_chip *chip = container_of(dev, struct tpm_chip, devs);
+
+ /* release the master device reference */
+ put_device(&chip->dev);
+}
+
+/*
+ * Remove the device file for exposed TPM spaces and release the device
+ * reference. This may also release the reference to the master device.
+ */
+void tpm_devs_remove(struct tpm_chip *chip)
+{
+ cdev_device_del(&chip->cdevs, &chip->devs);
+ put_device(&chip->devs);
+}
+
+/*
+ * Add a device file to expose TPM spaces. Also take a reference to the
+ * main device.
+ */
+int tpm_devs_add(struct tpm_chip *chip)
+{
+ int rc;
+
+ device_initialize(&chip->devs);
+ chip->devs.parent = chip->dev.parent;
+ chip->devs.class = tpmrm_class;
+
+ /*
+ * Get extra reference on main device to hold on behalf of devs.
+ * This holds the chip structure while cdevs is in use. The
+ * corresponding put is in the tpm_devs_release.
+ */
+ get_device(&chip->dev);
+ chip->devs.release = tpm_devs_release;
+ chip->devs.devt = MKDEV(MAJOR(tpm_devt), chip->dev_num + TPM_NUM_DEVICES);
+ cdev_init(&chip->cdevs, &tpmrm_fops);
+ chip->cdevs.owner = THIS_MODULE;
+
+ rc = dev_set_name(&chip->devs, "tpmrm%d", chip->dev_num);
+ if (rc)
+ goto err_put_devs;
+
+ rc = cdev_device_add(&chip->cdevs, &chip->devs);
+ if (rc) {
+ dev_err(&chip->devs,
+ "unable to cdev_device_add() %s, major %d, minor %d, err=%d\n",
+ dev_name(&chip->devs), MAJOR(chip->devs.devt),
+ MINOR(chip->devs.devt), rc);
+ goto err_put_devs;
+ }
+
+ return 0;
+
+err_put_devs:
+ put_device(&chip->devs);
+
+ return rc;
+}
--
2.35.1
commit 7e0438f83dc769465ee663bb5dcf8cc154940712 upstream.
The following sequence of operations results in a refcount warning:
1. Open device /dev/tpmrm.
2. Remove module tpm_tis_spi.
3. Write a TPM command to the file descriptor opened at step 1.
------------[ cut here ]------------
WARNING: CPU: 3 PID: 1161 at lib/refcount.c:25 kobject_get+0xa0/0xa4
refcount_t: addition on 0; use-after-free.
Modules linked in: tpm_tis_spi tpm_tis_core tpm mdio_bcm_unimac brcmfmac
sha256_generic libsha256 sha256_arm hci_uart btbcm bluetooth cfg80211 vc4
brcmutil ecdh_generic ecc snd_soc_core crc32_arm_ce libaes
raspberrypi_hwmon ac97_bus snd_pcm_dmaengine bcm2711_thermal snd_pcm
snd_timer genet snd phy_generic soundcore [last unloaded: spi_bcm2835]
CPU: 3 PID: 1161 Comm: hold_open Not tainted 5.10.0ls-main-dirty #2
Hardware name: BCM2711
[<c0410c3c>] (unwind_backtrace) from [<c040b580>] (show_stack+0x10/0x14)
[<c040b580>] (show_stack) from [<c1092174>] (dump_stack+0xc4/0xd8)
[<c1092174>] (dump_stack) from [<c0445a30>] (__warn+0x104/0x108)
[<c0445a30>] (__warn) from [<c0445aa8>] (warn_slowpath_fmt+0x74/0xb8)
[<c0445aa8>] (warn_slowpath_fmt) from [<c08435d0>] (kobject_get+0xa0/0xa4)
[<c08435d0>] (kobject_get) from [<bf0a715c>] (tpm_try_get_ops+0x14/0x54 [tpm])
[<bf0a715c>] (tpm_try_get_ops [tpm]) from [<bf0a7d6c>] (tpm_common_write+0x38/0x60 [tpm])
[<bf0a7d6c>] (tpm_common_write [tpm]) from [<c05a7ac0>] (vfs_write+0xc4/0x3c0)
[<c05a7ac0>] (vfs_write) from [<c05a7ee4>] (ksys_write+0x58/0xcc)
[<c05a7ee4>] (ksys_write) from [<c04001a0>] (ret_fast_syscall+0x0/0x4c)
Exception stack(0xc226bfa8 to 0xc226bff0)
bfa0: 00000000 000105b4 00000003 beafe664 00000014 00000000
bfc0: 00000000 000105b4 000103f8 00000004 00000000 00000000 b6f9c000 beafe684
bfe0: 0000006c beafe648 0001056c b6eb6944
---[ end trace d4b8409def9b8b1f ]---
The reason for this warning is the attempt to get the chip->dev reference
in tpm_common_write() although the reference counter is already zero.
Since commit 8979b02aaf1d ("tpm: Fix reference count to main device") the
extra reference used to prevent a premature zero counter is never taken,
because the required TPM_CHIP_FLAG_TPM2 flag is never set.
Fix this by moving the TPM 2 character device handling from
tpm_chip_alloc() to tpm_add_char_device() which is called at a later point
in time when the flag has been set in case of TPM2.
Commit fdc915f7f719 ("tpm: expose spaces via a device link /dev/tpmrm<n>")
already introduced function tpm_devs_release() to release the extra
reference but did not implement the required put on chip->devs that results
in the call of this function.
Fix this by putting chip->devs in tpm_chip_unregister().
Finally move the new implementation for the TPM 2 handling into a new
function to avoid multiple checks for the TPM_CHIP_FLAG_TPM2 flag in the
good case and error cases.
Cc: stable(a)vger.kernel.org
Fixes: fdc915f7f719 ("tpm: expose spaces via a device link /dev/tpmrm<n>")
Fixes: 8979b02aaf1d ("tpm: Fix reference count to main device")
Co-developed-by: Jason Gunthorpe <jgg(a)ziepe.ca>
Signed-off-by: Jason Gunthorpe <jgg(a)ziepe.ca>
Signed-off-by: Lino Sanfilippo <LinoSanfilippo(a)gmx.de>
Tested-by: Stefan Berger <stefanb(a)linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg(a)nvidia.com>
Reviewed-by: Jarkko Sakkinen <jarkko(a)kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
---
drivers/char/tpm/tpm-chip.c | 46 +++++--------------------
drivers/char/tpm/tpm.h | 2 ++
drivers/char/tpm/tpm2-space.c | 65 +++++++++++++++++++++++++++++++++++
3 files changed, 75 insertions(+), 38 deletions(-)
diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
index 1838039b0333..17fbd7f7a295 100644
--- a/drivers/char/tpm/tpm-chip.c
+++ b/drivers/char/tpm/tpm-chip.c
@@ -274,14 +274,6 @@ static void tpm_dev_release(struct device *dev)
kfree(chip);
}
-static void tpm_devs_release(struct device *dev)
-{
- struct tpm_chip *chip = container_of(dev, struct tpm_chip, devs);
-
- /* release the master device reference */
- put_device(&chip->dev);
-}
-
/**
* tpm_class_shutdown() - prepare the TPM device for loss of power.
* @dev: device to which the chip is associated.
@@ -344,7 +336,6 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
chip->dev_num = rc;
device_initialize(&chip->dev);
- device_initialize(&chip->devs);
chip->dev.class = tpm_class;
chip->dev.class->shutdown_pre = tpm_class_shutdown;
@@ -352,29 +343,12 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
chip->dev.parent = pdev;
chip->dev.groups = chip->groups;
- chip->devs.parent = pdev;
- chip->devs.class = tpmrm_class;
- chip->devs.release = tpm_devs_release;
- /* get extra reference on main device to hold on
- * behalf of devs. This holds the chip structure
- * while cdevs is in use. The corresponding put
- * is in the tpm_devs_release (TPM2 only)
- */
- if (chip->flags & TPM_CHIP_FLAG_TPM2)
- get_device(&chip->dev);
-
if (chip->dev_num == 0)
chip->dev.devt = MKDEV(MISC_MAJOR, TPM_MINOR);
else
chip->dev.devt = MKDEV(MAJOR(tpm_devt), chip->dev_num);
- chip->devs.devt =
- MKDEV(MAJOR(tpm_devt), chip->dev_num + TPM_NUM_DEVICES);
-
rc = dev_set_name(&chip->dev, "tpm%d", chip->dev_num);
- if (rc)
- goto out;
- rc = dev_set_name(&chip->devs, "tpmrm%d", chip->dev_num);
if (rc)
goto out;
@@ -382,9 +356,7 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
chip->flags |= TPM_CHIP_FLAG_VIRTUAL;
cdev_init(&chip->cdev, &tpm_fops);
- cdev_init(&chip->cdevs, &tpmrm_fops);
chip->cdev.owner = THIS_MODULE;
- chip->cdevs.owner = THIS_MODULE;
rc = tpm2_init_space(&chip->work_space, TPM2_SPACE_BUFFER_SIZE);
if (rc) {
@@ -396,7 +368,6 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
return chip;
out:
- put_device(&chip->devs);
put_device(&chip->dev);
return ERR_PTR(rc);
}
@@ -445,14 +416,9 @@ static int tpm_add_char_device(struct tpm_chip *chip)
}
if (chip->flags & TPM_CHIP_FLAG_TPM2) {
- rc = cdev_device_add(&chip->cdevs, &chip->devs);
- if (rc) {
- dev_err(&chip->devs,
- "unable to cdev_device_add() %s, major %d, minor %d, err=%d\n",
- dev_name(&chip->devs), MAJOR(chip->devs.devt),
- MINOR(chip->devs.devt), rc);
- return rc;
- }
+ rc = tpm_devs_add(chip);
+ if (rc)
+ goto err_del_cdev;
}
/* Make the chip available. */
@@ -460,6 +426,10 @@ static int tpm_add_char_device(struct tpm_chip *chip)
idr_replace(&dev_nums_idr, chip, chip->dev_num);
mutex_unlock(&idr_lock);
+ return 0;
+
+err_del_cdev:
+ cdev_device_del(&chip->cdev, &chip->dev);
return rc;
}
@@ -641,7 +611,7 @@ void tpm_chip_unregister(struct tpm_chip *chip)
hwrng_unregister(&chip->hwrng);
tpm_bios_log_teardown(chip);
if (chip->flags & TPM_CHIP_FLAG_TPM2)
- cdev_device_del(&chip->cdevs, &chip->devs);
+ tpm_devs_remove(chip);
tpm_del_char_device(chip);
}
EXPORT_SYMBOL_GPL(tpm_chip_unregister);
diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
index 37f010421a36..58312062d435 100644
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -466,6 +466,8 @@ int tpm2_prepare_space(struct tpm_chip *chip, struct tpm_space *space, u8 *cmd,
size_t cmdsiz);
int tpm2_commit_space(struct tpm_chip *chip, struct tpm_space *space, void *buf,
size_t *bufsiz);
+int tpm_devs_add(struct tpm_chip *chip);
+void tpm_devs_remove(struct tpm_chip *chip);
void tpm_bios_log_setup(struct tpm_chip *chip);
void tpm_bios_log_teardown(struct tpm_chip *chip);
diff --git a/drivers/char/tpm/tpm2-space.c b/drivers/char/tpm/tpm2-space.c
index 97e916856cf3..265ec72b1d81 100644
--- a/drivers/char/tpm/tpm2-space.c
+++ b/drivers/char/tpm/tpm2-space.c
@@ -574,3 +574,68 @@ int tpm2_commit_space(struct tpm_chip *chip, struct tpm_space *space,
dev_err(&chip->dev, "%s: error %d\n", __func__, rc);
return rc;
}
+
+/*
+ * Put the reference to the main device.
+ */
+static void tpm_devs_release(struct device *dev)
+{
+ struct tpm_chip *chip = container_of(dev, struct tpm_chip, devs);
+
+ /* release the master device reference */
+ put_device(&chip->dev);
+}
+
+/*
+ * Remove the device file for exposed TPM spaces and release the device
+ * reference. This may also release the reference to the master device.
+ */
+void tpm_devs_remove(struct tpm_chip *chip)
+{
+ cdev_device_del(&chip->cdevs, &chip->devs);
+ put_device(&chip->devs);
+}
+
+/*
+ * Add a device file to expose TPM spaces. Also take a reference to the
+ * main device.
+ */
+int tpm_devs_add(struct tpm_chip *chip)
+{
+ int rc;
+
+ device_initialize(&chip->devs);
+ chip->devs.parent = chip->dev.parent;
+ chip->devs.class = tpmrm_class;
+
+ /*
+ * Get extra reference on main device to hold on behalf of devs.
+ * This holds the chip structure while cdevs is in use. The
+ * corresponding put is in the tpm_devs_release.
+ */
+ get_device(&chip->dev);
+ chip->devs.release = tpm_devs_release;
+ chip->devs.devt = MKDEV(MAJOR(tpm_devt), chip->dev_num + TPM_NUM_DEVICES);
+ cdev_init(&chip->cdevs, &tpmrm_fops);
+ chip->cdevs.owner = THIS_MODULE;
+
+ rc = dev_set_name(&chip->devs, "tpmrm%d", chip->dev_num);
+ if (rc)
+ goto err_put_devs;
+
+ rc = cdev_device_add(&chip->cdevs, &chip->devs);
+ if (rc) {
+ dev_err(&chip->devs,
+ "unable to cdev_device_add() %s, major %d, minor %d, err=%d\n",
+ dev_name(&chip->devs), MAJOR(chip->devs.devt),
+ MINOR(chip->devs.devt), rc);
+ goto err_put_devs;
+ }
+
+ return 0;
+
+err_put_devs:
+ put_device(&chip->devs);
+
+ return rc;
+}
--
2.35.1
commit 7e0438f83dc769465ee663bb5dcf8cc154940712 upstream.
The following sequence of operations results in a refcount warning:
1. Open device /dev/tpmrm.
2. Remove module tpm_tis_spi.
3. Write a TPM command to the file descriptor opened at step 1.
------------[ cut here ]------------
WARNING: CPU: 3 PID: 1161 at lib/refcount.c:25 kobject_get+0xa0/0xa4
refcount_t: addition on 0; use-after-free.
Modules linked in: tpm_tis_spi tpm_tis_core tpm mdio_bcm_unimac brcmfmac
sha256_generic libsha256 sha256_arm hci_uart btbcm bluetooth cfg80211 vc4
brcmutil ecdh_generic ecc snd_soc_core crc32_arm_ce libaes
raspberrypi_hwmon ac97_bus snd_pcm_dmaengine bcm2711_thermal snd_pcm
snd_timer genet snd phy_generic soundcore [last unloaded: spi_bcm2835]
CPU: 3 PID: 1161 Comm: hold_open Not tainted 5.10.0ls-main-dirty #2
Hardware name: BCM2711
[<c0410c3c>] (unwind_backtrace) from [<c040b580>] (show_stack+0x10/0x14)
[<c040b580>] (show_stack) from [<c1092174>] (dump_stack+0xc4/0xd8)
[<c1092174>] (dump_stack) from [<c0445a30>] (__warn+0x104/0x108)
[<c0445a30>] (__warn) from [<c0445aa8>] (warn_slowpath_fmt+0x74/0xb8)
[<c0445aa8>] (warn_slowpath_fmt) from [<c08435d0>] (kobject_get+0xa0/0xa4)
[<c08435d0>] (kobject_get) from [<bf0a715c>] (tpm_try_get_ops+0x14/0x54 [tpm])
[<bf0a715c>] (tpm_try_get_ops [tpm]) from [<bf0a7d6c>] (tpm_common_write+0x38/0x60 [tpm])
[<bf0a7d6c>] (tpm_common_write [tpm]) from [<c05a7ac0>] (vfs_write+0xc4/0x3c0)
[<c05a7ac0>] (vfs_write) from [<c05a7ee4>] (ksys_write+0x58/0xcc)
[<c05a7ee4>] (ksys_write) from [<c04001a0>] (ret_fast_syscall+0x0/0x4c)
Exception stack(0xc226bfa8 to 0xc226bff0)
bfa0: 00000000 000105b4 00000003 beafe664 00000014 00000000
bfc0: 00000000 000105b4 000103f8 00000004 00000000 00000000 b6f9c000 beafe684
bfe0: 0000006c beafe648 0001056c b6eb6944
---[ end trace d4b8409def9b8b1f ]---
The reason for this warning is the attempt to get the chip->dev reference
in tpm_common_write() although the reference counter is already zero.
Since commit 8979b02aaf1d ("tpm: Fix reference count to main device") the
extra reference used to prevent a premature zero counter is never taken,
because the required TPM_CHIP_FLAG_TPM2 flag is never set.
Fix this by moving the TPM 2 character device handling from
tpm_chip_alloc() to tpm_add_char_device() which is called at a later point
in time when the flag has been set in case of TPM2.
Commit fdc915f7f719 ("tpm: expose spaces via a device link /dev/tpmrm<n>")
already introduced function tpm_devs_release() to release the extra
reference but did not implement the required put on chip->devs that results
in the call of this function.
Fix this by putting chip->devs in tpm_chip_unregister().
Finally move the new implementation for the TPM 2 handling into a new
function to avoid multiple checks for the TPM_CHIP_FLAG_TPM2 flag in the
good case and error cases.
Cc: stable(a)vger.kernel.org
Fixes: fdc915f7f719 ("tpm: expose spaces via a device link /dev/tpmrm<n>")
Fixes: 8979b02aaf1d ("tpm: Fix reference count to main device")
Co-developed-by: Jason Gunthorpe <jgg(a)ziepe.ca>
Signed-off-by: Jason Gunthorpe <jgg(a)ziepe.ca>
Signed-off-by: Lino Sanfilippo <LinoSanfilippo(a)gmx.de>
Tested-by: Stefan Berger <stefanb(a)linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg(a)nvidia.com>
Reviewed-by: Jarkko Sakkinen <jarkko(a)kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
---
drivers/char/tpm/tpm-chip.c | 46 +++++--------------------
drivers/char/tpm/tpm.h | 2 ++
drivers/char/tpm/tpm2-space.c | 65 +++++++++++++++++++++++++++++++++++
3 files changed, 75 insertions(+), 38 deletions(-)
diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
index ddaeceb7e109..ed600473ad7e 100644
--- a/drivers/char/tpm/tpm-chip.c
+++ b/drivers/char/tpm/tpm-chip.c
@@ -274,14 +274,6 @@ static void tpm_dev_release(struct device *dev)
kfree(chip);
}
-static void tpm_devs_release(struct device *dev)
-{
- struct tpm_chip *chip = container_of(dev, struct tpm_chip, devs);
-
- /* release the master device reference */
- put_device(&chip->dev);
-}
-
/**
* tpm_class_shutdown() - prepare the TPM device for loss of power.
* @dev: device to which the chip is associated.
@@ -344,7 +336,6 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
chip->dev_num = rc;
device_initialize(&chip->dev);
- device_initialize(&chip->devs);
chip->dev.class = tpm_class;
chip->dev.class->shutdown_pre = tpm_class_shutdown;
@@ -352,29 +343,12 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
chip->dev.parent = pdev;
chip->dev.groups = chip->groups;
- chip->devs.parent = pdev;
- chip->devs.class = tpmrm_class;
- chip->devs.release = tpm_devs_release;
- /* get extra reference on main device to hold on
- * behalf of devs. This holds the chip structure
- * while cdevs is in use. The corresponding put
- * is in the tpm_devs_release (TPM2 only)
- */
- if (chip->flags & TPM_CHIP_FLAG_TPM2)
- get_device(&chip->dev);
-
if (chip->dev_num == 0)
chip->dev.devt = MKDEV(MISC_MAJOR, TPM_MINOR);
else
chip->dev.devt = MKDEV(MAJOR(tpm_devt), chip->dev_num);
- chip->devs.devt =
- MKDEV(MAJOR(tpm_devt), chip->dev_num + TPM_NUM_DEVICES);
-
rc = dev_set_name(&chip->dev, "tpm%d", chip->dev_num);
- if (rc)
- goto out;
- rc = dev_set_name(&chip->devs, "tpmrm%d", chip->dev_num);
if (rc)
goto out;
@@ -382,9 +356,7 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
chip->flags |= TPM_CHIP_FLAG_VIRTUAL;
cdev_init(&chip->cdev, &tpm_fops);
- cdev_init(&chip->cdevs, &tpmrm_fops);
chip->cdev.owner = THIS_MODULE;
- chip->cdevs.owner = THIS_MODULE;
rc = tpm2_init_space(&chip->work_space, TPM2_SPACE_BUFFER_SIZE);
if (rc) {
@@ -396,7 +368,6 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
return chip;
out:
- put_device(&chip->devs);
put_device(&chip->dev);
return ERR_PTR(rc);
}
@@ -445,14 +416,9 @@ static int tpm_add_char_device(struct tpm_chip *chip)
}
if (chip->flags & TPM_CHIP_FLAG_TPM2) {
- rc = cdev_device_add(&chip->cdevs, &chip->devs);
- if (rc) {
- dev_err(&chip->devs,
- "unable to cdev_device_add() %s, major %d, minor %d, err=%d\n",
- dev_name(&chip->devs), MAJOR(chip->devs.devt),
- MINOR(chip->devs.devt), rc);
- return rc;
- }
+ rc = tpm_devs_add(chip);
+ if (rc)
+ goto err_del_cdev;
}
/* Make the chip available. */
@@ -460,6 +426,10 @@ static int tpm_add_char_device(struct tpm_chip *chip)
idr_replace(&dev_nums_idr, chip, chip->dev_num);
mutex_unlock(&idr_lock);
+ return 0;
+
+err_del_cdev:
+ cdev_device_del(&chip->cdev, &chip->dev);
return rc;
}
@@ -641,7 +611,7 @@ void tpm_chip_unregister(struct tpm_chip *chip)
hwrng_unregister(&chip->hwrng);
tpm_bios_log_teardown(chip);
if (chip->flags & TPM_CHIP_FLAG_TPM2)
- cdev_device_del(&chip->cdevs, &chip->devs);
+ tpm_devs_remove(chip);
tpm_del_char_device(chip);
}
EXPORT_SYMBOL_GPL(tpm_chip_unregister);
diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
index 283f78211c3a..2163c6ee0d36 100644
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -234,6 +234,8 @@ int tpm2_prepare_space(struct tpm_chip *chip, struct tpm_space *space, u8 *cmd,
size_t cmdsiz);
int tpm2_commit_space(struct tpm_chip *chip, struct tpm_space *space, void *buf,
size_t *bufsiz);
+int tpm_devs_add(struct tpm_chip *chip);
+void tpm_devs_remove(struct tpm_chip *chip);
void tpm_bios_log_setup(struct tpm_chip *chip);
void tpm_bios_log_teardown(struct tpm_chip *chip);
diff --git a/drivers/char/tpm/tpm2-space.c b/drivers/char/tpm/tpm2-space.c
index 97e916856cf3..265ec72b1d81 100644
--- a/drivers/char/tpm/tpm2-space.c
+++ b/drivers/char/tpm/tpm2-space.c
@@ -574,3 +574,68 @@ int tpm2_commit_space(struct tpm_chip *chip, struct tpm_space *space,
dev_err(&chip->dev, "%s: error %d\n", __func__, rc);
return rc;
}
+
+/*
+ * Put the reference to the main device.
+ */
+static void tpm_devs_release(struct device *dev)
+{
+ struct tpm_chip *chip = container_of(dev, struct tpm_chip, devs);
+
+ /* release the master device reference */
+ put_device(&chip->dev);
+}
+
+/*
+ * Remove the device file for exposed TPM spaces and release the device
+ * reference. This may also release the reference to the master device.
+ */
+void tpm_devs_remove(struct tpm_chip *chip)
+{
+ cdev_device_del(&chip->cdevs, &chip->devs);
+ put_device(&chip->devs);
+}
+
+/*
+ * Add a device file to expose TPM spaces. Also take a reference to the
+ * main device.
+ */
+int tpm_devs_add(struct tpm_chip *chip)
+{
+ int rc;
+
+ device_initialize(&chip->devs);
+ chip->devs.parent = chip->dev.parent;
+ chip->devs.class = tpmrm_class;
+
+ /*
+ * Get extra reference on main device to hold on behalf of devs.
+ * This holds the chip structure while cdevs is in use. The
+ * corresponding put is in the tpm_devs_release.
+ */
+ get_device(&chip->dev);
+ chip->devs.release = tpm_devs_release;
+ chip->devs.devt = MKDEV(MAJOR(tpm_devt), chip->dev_num + TPM_NUM_DEVICES);
+ cdev_init(&chip->cdevs, &tpmrm_fops);
+ chip->cdevs.owner = THIS_MODULE;
+
+ rc = dev_set_name(&chip->devs, "tpmrm%d", chip->dev_num);
+ if (rc)
+ goto err_put_devs;
+
+ rc = cdev_device_add(&chip->cdevs, &chip->devs);
+ if (rc) {
+ dev_err(&chip->devs,
+ "unable to cdev_device_add() %s, major %d, minor %d, err=%d\n",
+ dev_name(&chip->devs), MAJOR(chip->devs.devt),
+ MINOR(chip->devs.devt), rc);
+ goto err_put_devs;
+ }
+
+ return 0;
+
+err_put_devs:
+ put_device(&chip->devs);
+
+ return rc;
+}
--
2.35.1
commit 7e0438f83dc769465ee663bb5dcf8cc154940712 upstream.
The following sequence of operations results in a refcount warning:
1. Open device /dev/tpmrm.
2. Remove module tpm_tis_spi.
3. Write a TPM command to the file descriptor opened at step 1.
------------[ cut here ]------------
WARNING: CPU: 3 PID: 1161 at lib/refcount.c:25 kobject_get+0xa0/0xa4
refcount_t: addition on 0; use-after-free.
Modules linked in: tpm_tis_spi tpm_tis_core tpm mdio_bcm_unimac brcmfmac
sha256_generic libsha256 sha256_arm hci_uart btbcm bluetooth cfg80211 vc4
brcmutil ecdh_generic ecc snd_soc_core crc32_arm_ce libaes
raspberrypi_hwmon ac97_bus snd_pcm_dmaengine bcm2711_thermal snd_pcm
snd_timer genet snd phy_generic soundcore [last unloaded: spi_bcm2835]
CPU: 3 PID: 1161 Comm: hold_open Not tainted 5.10.0ls-main-dirty #2
Hardware name: BCM2711
[<c0410c3c>] (unwind_backtrace) from [<c040b580>] (show_stack+0x10/0x14)
[<c040b580>] (show_stack) from [<c1092174>] (dump_stack+0xc4/0xd8)
[<c1092174>] (dump_stack) from [<c0445a30>] (__warn+0x104/0x108)
[<c0445a30>] (__warn) from [<c0445aa8>] (warn_slowpath_fmt+0x74/0xb8)
[<c0445aa8>] (warn_slowpath_fmt) from [<c08435d0>] (kobject_get+0xa0/0xa4)
[<c08435d0>] (kobject_get) from [<bf0a715c>] (tpm_try_get_ops+0x14/0x54 [tpm])
[<bf0a715c>] (tpm_try_get_ops [tpm]) from [<bf0a7d6c>] (tpm_common_write+0x38/0x60 [tpm])
[<bf0a7d6c>] (tpm_common_write [tpm]) from [<c05a7ac0>] (vfs_write+0xc4/0x3c0)
[<c05a7ac0>] (vfs_write) from [<c05a7ee4>] (ksys_write+0x58/0xcc)
[<c05a7ee4>] (ksys_write) from [<c04001a0>] (ret_fast_syscall+0x0/0x4c)
Exception stack(0xc226bfa8 to 0xc226bff0)
bfa0: 00000000 000105b4 00000003 beafe664 00000014 00000000
bfc0: 00000000 000105b4 000103f8 00000004 00000000 00000000 b6f9c000 beafe684
bfe0: 0000006c beafe648 0001056c b6eb6944
---[ end trace d4b8409def9b8b1f ]---
The reason for this warning is the attempt to get the chip->dev reference
in tpm_common_write() although the reference counter is already zero.
Since commit 8979b02aaf1d ("tpm: Fix reference count to main device") the
extra reference used to prevent a premature zero counter is never taken,
because the required TPM_CHIP_FLAG_TPM2 flag is never set.
Fix this by moving the TPM 2 character device handling from
tpm_chip_alloc() to tpm_add_char_device() which is called at a later point
in time when the flag has been set in case of TPM2.
Commit fdc915f7f719 ("tpm: expose spaces via a device link /dev/tpmrm<n>")
already introduced function tpm_devs_release() to release the extra
reference but did not implement the required put on chip->devs that results
in the call of this function.
Fix this by putting chip->devs in tpm_chip_unregister().
Finally move the new implementation for the TPM 2 handling into a new
function to avoid multiple checks for the TPM_CHIP_FLAG_TPM2 flag in the
good case and error cases.
Cc: stable(a)vger.kernel.org
Fixes: fdc915f7f719 ("tpm: expose spaces via a device link /dev/tpmrm<n>")
Fixes: 8979b02aaf1d ("tpm: Fix reference count to main device")
Co-developed-by: Jason Gunthorpe <jgg(a)ziepe.ca>
Signed-off-by: Jason Gunthorpe <jgg(a)ziepe.ca>
Signed-off-by: Lino Sanfilippo <LinoSanfilippo(a)gmx.de>
Tested-by: Stefan Berger <stefanb(a)linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg(a)nvidia.com>
Reviewed-by: Jarkko Sakkinen <jarkko(a)kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
---
drivers/char/tpm/tpm-chip.c | 46 +++++--------------------
drivers/char/tpm/tpm.h | 2 ++
drivers/char/tpm/tpm2-space.c | 65 +++++++++++++++++++++++++++++++++++
3 files changed, 75 insertions(+), 38 deletions(-)
diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
index df37e7b6a10a..65d800ecc996 100644
--- a/drivers/char/tpm/tpm-chip.c
+++ b/drivers/char/tpm/tpm-chip.c
@@ -274,14 +274,6 @@ static void tpm_dev_release(struct device *dev)
kfree(chip);
}
-static void tpm_devs_release(struct device *dev)
-{
- struct tpm_chip *chip = container_of(dev, struct tpm_chip, devs);
-
- /* release the master device reference */
- put_device(&chip->dev);
-}
-
/**
* tpm_class_shutdown() - prepare the TPM device for loss of power.
* @dev: device to which the chip is associated.
@@ -344,7 +336,6 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
chip->dev_num = rc;
device_initialize(&chip->dev);
- device_initialize(&chip->devs);
chip->dev.class = tpm_class;
chip->dev.class->shutdown_pre = tpm_class_shutdown;
@@ -352,29 +343,12 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
chip->dev.parent = pdev;
chip->dev.groups = chip->groups;
- chip->devs.parent = pdev;
- chip->devs.class = tpmrm_class;
- chip->devs.release = tpm_devs_release;
- /* get extra reference on main device to hold on
- * behalf of devs. This holds the chip structure
- * while cdevs is in use. The corresponding put
- * is in the tpm_devs_release (TPM2 only)
- */
- if (chip->flags & TPM_CHIP_FLAG_TPM2)
- get_device(&chip->dev);
-
if (chip->dev_num == 0)
chip->dev.devt = MKDEV(MISC_MAJOR, TPM_MINOR);
else
chip->dev.devt = MKDEV(MAJOR(tpm_devt), chip->dev_num);
- chip->devs.devt =
- MKDEV(MAJOR(tpm_devt), chip->dev_num + TPM_NUM_DEVICES);
-
rc = dev_set_name(&chip->dev, "tpm%d", chip->dev_num);
- if (rc)
- goto out;
- rc = dev_set_name(&chip->devs, "tpmrm%d", chip->dev_num);
if (rc)
goto out;
@@ -382,9 +356,7 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
chip->flags |= TPM_CHIP_FLAG_VIRTUAL;
cdev_init(&chip->cdev, &tpm_fops);
- cdev_init(&chip->cdevs, &tpmrm_fops);
chip->cdev.owner = THIS_MODULE;
- chip->cdevs.owner = THIS_MODULE;
rc = tpm2_init_space(&chip->work_space, TPM2_SPACE_BUFFER_SIZE);
if (rc) {
@@ -396,7 +368,6 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
return chip;
out:
- put_device(&chip->devs);
put_device(&chip->dev);
return ERR_PTR(rc);
}
@@ -445,14 +416,9 @@ static int tpm_add_char_device(struct tpm_chip *chip)
}
if (chip->flags & TPM_CHIP_FLAG_TPM2) {
- rc = cdev_device_add(&chip->cdevs, &chip->devs);
- if (rc) {
- dev_err(&chip->devs,
- "unable to cdev_device_add() %s, major %d, minor %d, err=%d\n",
- dev_name(&chip->devs), MAJOR(chip->devs.devt),
- MINOR(chip->devs.devt), rc);
- return rc;
- }
+ rc = tpm_devs_add(chip);
+ if (rc)
+ goto err_del_cdev;
}
/* Make the chip available. */
@@ -460,6 +426,10 @@ static int tpm_add_char_device(struct tpm_chip *chip)
idr_replace(&dev_nums_idr, chip, chip->dev_num);
mutex_unlock(&idr_lock);
+ return 0;
+
+err_del_cdev:
+ cdev_device_del(&chip->cdev, &chip->dev);
return rc;
}
@@ -649,7 +619,7 @@ void tpm_chip_unregister(struct tpm_chip *chip)
hwrng_unregister(&chip->hwrng);
tpm_bios_log_teardown(chip);
if (chip->flags & TPM_CHIP_FLAG_TPM2)
- cdev_device_del(&chip->cdevs, &chip->devs);
+ tpm_devs_remove(chip);
tpm_del_char_device(chip);
}
EXPORT_SYMBOL_GPL(tpm_chip_unregister);
diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
index 283f78211c3a..2163c6ee0d36 100644
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -234,6 +234,8 @@ int tpm2_prepare_space(struct tpm_chip *chip, struct tpm_space *space, u8 *cmd,
size_t cmdsiz);
int tpm2_commit_space(struct tpm_chip *chip, struct tpm_space *space, void *buf,
size_t *bufsiz);
+int tpm_devs_add(struct tpm_chip *chip);
+void tpm_devs_remove(struct tpm_chip *chip);
void tpm_bios_log_setup(struct tpm_chip *chip);
void tpm_bios_log_teardown(struct tpm_chip *chip);
diff --git a/drivers/char/tpm/tpm2-space.c b/drivers/char/tpm/tpm2-space.c
index 97e916856cf3..265ec72b1d81 100644
--- a/drivers/char/tpm/tpm2-space.c
+++ b/drivers/char/tpm/tpm2-space.c
@@ -574,3 +574,68 @@ int tpm2_commit_space(struct tpm_chip *chip, struct tpm_space *space,
dev_err(&chip->dev, "%s: error %d\n", __func__, rc);
return rc;
}
+
+/*
+ * Put the reference to the main device.
+ */
+static void tpm_devs_release(struct device *dev)
+{
+ struct tpm_chip *chip = container_of(dev, struct tpm_chip, devs);
+
+ /* release the master device reference */
+ put_device(&chip->dev);
+}
+
+/*
+ * Remove the device file for exposed TPM spaces and release the device
+ * reference. This may also release the reference to the master device.
+ */
+void tpm_devs_remove(struct tpm_chip *chip)
+{
+ cdev_device_del(&chip->cdevs, &chip->devs);
+ put_device(&chip->devs);
+}
+
+/*
+ * Add a device file to expose TPM spaces. Also take a reference to the
+ * main device.
+ */
+int tpm_devs_add(struct tpm_chip *chip)
+{
+ int rc;
+
+ device_initialize(&chip->devs);
+ chip->devs.parent = chip->dev.parent;
+ chip->devs.class = tpmrm_class;
+
+ /*
+ * Get extra reference on main device to hold on behalf of devs.
+ * This holds the chip structure while cdevs is in use. The
+ * corresponding put is in the tpm_devs_release.
+ */
+ get_device(&chip->dev);
+ chip->devs.release = tpm_devs_release;
+ chip->devs.devt = MKDEV(MAJOR(tpm_devt), chip->dev_num + TPM_NUM_DEVICES);
+ cdev_init(&chip->cdevs, &tpmrm_fops);
+ chip->cdevs.owner = THIS_MODULE;
+
+ rc = dev_set_name(&chip->devs, "tpmrm%d", chip->dev_num);
+ if (rc)
+ goto err_put_devs;
+
+ rc = cdev_device_add(&chip->cdevs, &chip->devs);
+ if (rc) {
+ dev_err(&chip->devs,
+ "unable to cdev_device_add() %s, major %d, minor %d, err=%d\n",
+ dev_name(&chip->devs), MAJOR(chip->devs.devt),
+ MINOR(chip->devs.devt), rc);
+ goto err_put_devs;
+ }
+
+ return 0;
+
+err_put_devs:
+ put_device(&chip->devs);
+
+ return rc;
+}
--
2.35.1
commit 7e0438f83dc769465ee663bb5dcf8cc154940712 upstream.
The following sequence of operations results in a refcount warning:
1. Open device /dev/tpmrm.
2. Remove module tpm_tis_spi.
3. Write a TPM command to the file descriptor opened at step 1.
------------[ cut here ]------------
WARNING: CPU: 3 PID: 1161 at lib/refcount.c:25 kobject_get+0xa0/0xa4
refcount_t: addition on 0; use-after-free.
Modules linked in: tpm_tis_spi tpm_tis_core tpm mdio_bcm_unimac brcmfmac
sha256_generic libsha256 sha256_arm hci_uart btbcm bluetooth cfg80211 vc4
brcmutil ecdh_generic ecc snd_soc_core crc32_arm_ce libaes
raspberrypi_hwmon ac97_bus snd_pcm_dmaengine bcm2711_thermal snd_pcm
snd_timer genet snd phy_generic soundcore [last unloaded: spi_bcm2835]
CPU: 3 PID: 1161 Comm: hold_open Not tainted 5.10.0ls-main-dirty #2
Hardware name: BCM2711
[<c0410c3c>] (unwind_backtrace) from [<c040b580>] (show_stack+0x10/0x14)
[<c040b580>] (show_stack) from [<c1092174>] (dump_stack+0xc4/0xd8)
[<c1092174>] (dump_stack) from [<c0445a30>] (__warn+0x104/0x108)
[<c0445a30>] (__warn) from [<c0445aa8>] (warn_slowpath_fmt+0x74/0xb8)
[<c0445aa8>] (warn_slowpath_fmt) from [<c08435d0>] (kobject_get+0xa0/0xa4)
[<c08435d0>] (kobject_get) from [<bf0a715c>] (tpm_try_get_ops+0x14/0x54 [tpm])
[<bf0a715c>] (tpm_try_get_ops [tpm]) from [<bf0a7d6c>] (tpm_common_write+0x38/0x60 [tpm])
[<bf0a7d6c>] (tpm_common_write [tpm]) from [<c05a7ac0>] (vfs_write+0xc4/0x3c0)
[<c05a7ac0>] (vfs_write) from [<c05a7ee4>] (ksys_write+0x58/0xcc)
[<c05a7ee4>] (ksys_write) from [<c04001a0>] (ret_fast_syscall+0x0/0x4c)
Exception stack(0xc226bfa8 to 0xc226bff0)
bfa0: 00000000 000105b4 00000003 beafe664 00000014 00000000
bfc0: 00000000 000105b4 000103f8 00000004 00000000 00000000 b6f9c000 beafe684
bfe0: 0000006c beafe648 0001056c b6eb6944
---[ end trace d4b8409def9b8b1f ]---
The reason for this warning is the attempt to get the chip->dev reference
in tpm_common_write() although the reference counter is already zero.
Since commit 8979b02aaf1d ("tpm: Fix reference count to main device") the
extra reference used to prevent a premature zero counter is never taken,
because the required TPM_CHIP_FLAG_TPM2 flag is never set.
Fix this by moving the TPM 2 character device handling from
tpm_chip_alloc() to tpm_add_char_device() which is called at a later point
in time when the flag has been set in case of TPM2.
Commit fdc915f7f719 ("tpm: expose spaces via a device link /dev/tpmrm<n>")
already introduced function tpm_devs_release() to release the extra
reference but did not implement the required put on chip->devs that results
in the call of this function.
Fix this by putting chip->devs in tpm_chip_unregister().
Finally move the new implementation for the TPM 2 handling into a new
function to avoid multiple checks for the TPM_CHIP_FLAG_TPM2 flag in the
good case and error cases.
Cc: stable(a)vger.kernel.org
Fixes: fdc915f7f719 ("tpm: expose spaces via a device link /dev/tpmrm<n>")
Fixes: 8979b02aaf1d ("tpm: Fix reference count to main device")
Co-developed-by: Jason Gunthorpe <jgg(a)ziepe.ca>
Signed-off-by: Jason Gunthorpe <jgg(a)ziepe.ca>
Signed-off-by: Lino Sanfilippo <LinoSanfilippo(a)gmx.de>
Tested-by: Stefan Berger <stefanb(a)linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg(a)nvidia.com>
Reviewed-by: Jarkko Sakkinen <jarkko(a)kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
---
drivers/char/tpm/tpm-chip.c | 46 +++++--------------------
drivers/char/tpm/tpm.h | 2 ++
drivers/char/tpm/tpm2-space.c | 65 +++++++++++++++++++++++++++++++++++
3 files changed, 75 insertions(+), 38 deletions(-)
diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
index df37e7b6a10a..65d800ecc996 100644
--- a/drivers/char/tpm/tpm-chip.c
+++ b/drivers/char/tpm/tpm-chip.c
@@ -274,14 +274,6 @@ static void tpm_dev_release(struct device *dev)
kfree(chip);
}
-static void tpm_devs_release(struct device *dev)
-{
- struct tpm_chip *chip = container_of(dev, struct tpm_chip, devs);
-
- /* release the master device reference */
- put_device(&chip->dev);
-}
-
/**
* tpm_class_shutdown() - prepare the TPM device for loss of power.
* @dev: device to which the chip is associated.
@@ -344,7 +336,6 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
chip->dev_num = rc;
device_initialize(&chip->dev);
- device_initialize(&chip->devs);
chip->dev.class = tpm_class;
chip->dev.class->shutdown_pre = tpm_class_shutdown;
@@ -352,29 +343,12 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
chip->dev.parent = pdev;
chip->dev.groups = chip->groups;
- chip->devs.parent = pdev;
- chip->devs.class = tpmrm_class;
- chip->devs.release = tpm_devs_release;
- /* get extra reference on main device to hold on
- * behalf of devs. This holds the chip structure
- * while cdevs is in use. The corresponding put
- * is in the tpm_devs_release (TPM2 only)
- */
- if (chip->flags & TPM_CHIP_FLAG_TPM2)
- get_device(&chip->dev);
-
if (chip->dev_num == 0)
chip->dev.devt = MKDEV(MISC_MAJOR, TPM_MINOR);
else
chip->dev.devt = MKDEV(MAJOR(tpm_devt), chip->dev_num);
- chip->devs.devt =
- MKDEV(MAJOR(tpm_devt), chip->dev_num + TPM_NUM_DEVICES);
-
rc = dev_set_name(&chip->dev, "tpm%d", chip->dev_num);
- if (rc)
- goto out;
- rc = dev_set_name(&chip->devs, "tpmrm%d", chip->dev_num);
if (rc)
goto out;
@@ -382,9 +356,7 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
chip->flags |= TPM_CHIP_FLAG_VIRTUAL;
cdev_init(&chip->cdev, &tpm_fops);
- cdev_init(&chip->cdevs, &tpmrm_fops);
chip->cdev.owner = THIS_MODULE;
- chip->cdevs.owner = THIS_MODULE;
rc = tpm2_init_space(&chip->work_space, TPM2_SPACE_BUFFER_SIZE);
if (rc) {
@@ -396,7 +368,6 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
return chip;
out:
- put_device(&chip->devs);
put_device(&chip->dev);
return ERR_PTR(rc);
}
@@ -445,14 +416,9 @@ static int tpm_add_char_device(struct tpm_chip *chip)
}
if (chip->flags & TPM_CHIP_FLAG_TPM2) {
- rc = cdev_device_add(&chip->cdevs, &chip->devs);
- if (rc) {
- dev_err(&chip->devs,
- "unable to cdev_device_add() %s, major %d, minor %d, err=%d\n",
- dev_name(&chip->devs), MAJOR(chip->devs.devt),
- MINOR(chip->devs.devt), rc);
- return rc;
- }
+ rc = tpm_devs_add(chip);
+ if (rc)
+ goto err_del_cdev;
}
/* Make the chip available. */
@@ -460,6 +426,10 @@ static int tpm_add_char_device(struct tpm_chip *chip)
idr_replace(&dev_nums_idr, chip, chip->dev_num);
mutex_unlock(&idr_lock);
+ return 0;
+
+err_del_cdev:
+ cdev_device_del(&chip->cdev, &chip->dev);
return rc;
}
@@ -649,7 +619,7 @@ void tpm_chip_unregister(struct tpm_chip *chip)
hwrng_unregister(&chip->hwrng);
tpm_bios_log_teardown(chip);
if (chip->flags & TPM_CHIP_FLAG_TPM2)
- cdev_device_del(&chip->cdevs, &chip->devs);
+ tpm_devs_remove(chip);
tpm_del_char_device(chip);
}
EXPORT_SYMBOL_GPL(tpm_chip_unregister);
diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
index 283f78211c3a..2163c6ee0d36 100644
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -234,6 +234,8 @@ int tpm2_prepare_space(struct tpm_chip *chip, struct tpm_space *space, u8 *cmd,
size_t cmdsiz);
int tpm2_commit_space(struct tpm_chip *chip, struct tpm_space *space, void *buf,
size_t *bufsiz);
+int tpm_devs_add(struct tpm_chip *chip);
+void tpm_devs_remove(struct tpm_chip *chip);
void tpm_bios_log_setup(struct tpm_chip *chip);
void tpm_bios_log_teardown(struct tpm_chip *chip);
diff --git a/drivers/char/tpm/tpm2-space.c b/drivers/char/tpm/tpm2-space.c
index 97e916856cf3..265ec72b1d81 100644
--- a/drivers/char/tpm/tpm2-space.c
+++ b/drivers/char/tpm/tpm2-space.c
@@ -574,3 +574,68 @@ int tpm2_commit_space(struct tpm_chip *chip, struct tpm_space *space,
dev_err(&chip->dev, "%s: error %d\n", __func__, rc);
return rc;
}
+
+/*
+ * Put the reference to the main device.
+ */
+static void tpm_devs_release(struct device *dev)
+{
+ struct tpm_chip *chip = container_of(dev, struct tpm_chip, devs);
+
+ /* release the master device reference */
+ put_device(&chip->dev);
+}
+
+/*
+ * Remove the device file for exposed TPM spaces and release the device
+ * reference. This may also release the reference to the master device.
+ */
+void tpm_devs_remove(struct tpm_chip *chip)
+{
+ cdev_device_del(&chip->cdevs, &chip->devs);
+ put_device(&chip->devs);
+}
+
+/*
+ * Add a device file to expose TPM spaces. Also take a reference to the
+ * main device.
+ */
+int tpm_devs_add(struct tpm_chip *chip)
+{
+ int rc;
+
+ device_initialize(&chip->devs);
+ chip->devs.parent = chip->dev.parent;
+ chip->devs.class = tpmrm_class;
+
+ /*
+ * Get extra reference on main device to hold on behalf of devs.
+ * This holds the chip structure while cdevs is in use. The
+ * corresponding put is in the tpm_devs_release.
+ */
+ get_device(&chip->dev);
+ chip->devs.release = tpm_devs_release;
+ chip->devs.devt = MKDEV(MAJOR(tpm_devt), chip->dev_num + TPM_NUM_DEVICES);
+ cdev_init(&chip->cdevs, &tpmrm_fops);
+ chip->cdevs.owner = THIS_MODULE;
+
+ rc = dev_set_name(&chip->devs, "tpmrm%d", chip->dev_num);
+ if (rc)
+ goto err_put_devs;
+
+ rc = cdev_device_add(&chip->cdevs, &chip->devs);
+ if (rc) {
+ dev_err(&chip->devs,
+ "unable to cdev_device_add() %s, major %d, minor %d, err=%d\n",
+ dev_name(&chip->devs), MAJOR(chip->devs.devt),
+ MINOR(chip->devs.devt), rc);
+ goto err_put_devs;
+ }
+
+ return 0;
+
+err_put_devs:
+ put_device(&chip->devs);
+
+ return rc;
+}
--
2.35.1
The bug is here:
dev = new_dev->dev;
The list iterator 'new_dev' will point to a bogus position containing
HEAD if the list is empty or no element is found. This case must
be checked before any use of the iterator, otherwise it will lead
to a invalid memory access.
To fix this bug, add an check. Use a new variable 'iter' as the
list iterator, while use the old variable 'new_dev' as a dedicated
pointer to point to the found element.
Cc: stable(a)vger.kernel.org
Fixes: deaa51465105a ("PM / OPP: Add debugfs support")
Signed-off-by: Xiaomeng Tong <xiam0nd.tong(a)gmail.com>
---
drivers/opp/debugfs.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/opp/debugfs.c b/drivers/opp/debugfs.c
index 596c185b5dda..a4476985e4ce 100644
--- a/drivers/opp/debugfs.c
+++ b/drivers/opp/debugfs.c
@@ -187,14 +187,19 @@ void opp_debug_register(struct opp_device *opp_dev, struct opp_table *opp_table)
static void opp_migrate_dentry(struct opp_device *opp_dev,
struct opp_table *opp_table)
{
- struct opp_device *new_dev;
+ struct opp_device *new_dev = NULL, *iter;
const struct device *dev;
struct dentry *dentry;
/* Look for next opp-dev */
- list_for_each_entry(new_dev, &opp_table->dev_list, node)
- if (new_dev != opp_dev)
+ list_for_each_entry(iter, &opp_table->dev_list, node)
+ if (iter != opp_dev) {
+ new_dev = iter;
break;
+ }
+
+ if (!new_dev)
+ return;
/* new_dev is guaranteed to be valid here */
dev = new_dev->dev;
--
2.17.1
عزيزي المستفيد،
لقد أرسلت لك هذه الرسالة قبل شهر ، لكني لم أسمع منك ، لا
أنا متأكد من أنك تلقيتها ، ولهذا أرسلتها لك مرة أخرى ،
بادئ ذي بدء ، أنا السيدة كريستالينا جورجيفا ، المدير العام و
رئيس صندوق النقد الدولي.
في الواقع ، لقد قمنا باستعراض جميع المعوقات والقضايا المحيطة
معاملتك غير المكتملة وعدم قدرتك على سداد الرسوم
رسوم التحويل المفروضة عليك مقابل خيارات
التحويلات السابقة ، قم بزيارة موقعنا للتأكيد 38
° 53′56 ″ شمالاً 77 ° 2 ′ 39 غربًا
نحن مجلس الإدارة والبنك الدولي وصندوق النقد
الدولي (IMF) في واشنطن العاصمة ، مع وزارة
وزارة الخزانة الأمريكية وبعض وكالات التحقيق الأخرى
ذات الصلة هنا في الولايات المتحدة الأمريكية. قد أمر
وحدة تحويل المدفوعات الخارجية الخاصة بنا ، مصرف United Bank of
Africa Lome Togo ، لإصدار بطاقة فيزا لك ، حيث $
1.5 مليون من أموالك لسحب أكبر من أموالك.
خلال مسار تحقيقنا ، اكتشفنا مع
استياء من أن المسؤولين الفاسدين قد أخروا دفعك
البنك الذي يحاول تحويل أموالك إلى حساباتك
نشر.
واليوم نخطرك بأن أموالك قد تم إيداعها في بطاقة
تأشيرة من UBA Bank وهي أيضًا جاهزة للتسليم. حاليا
اتصل بمدير بنك UBA ، اسمه السيد توني
Elumelu ، البريد الإلكتروني: (Ubagroup.tgo12(a)gmail.com)
لإخبارك بكيفية الحصول على بطاقة فيزا الصراف الآلي الخاصة بك.
بإخلاص،
السيدة كريستالينا جورجيفا
Please apply this to stable 5.10.y, 5.15.y
---8<---
From: Kees Cook <keescook(a)chromium.org>
Upstream commit: 50d7bd38c3aa ("stddef: Introduce struct_group() helper macro")
Kernel code has a regular need to describe groups of members within a
structure usually when they need to be copied or initialized separately
from the rest of the surrounding structure. The generally accepted design
pattern in C is to use a named sub-struct:
struct foo {
int one;
struct {
int two;
int three, four;
} thing;
int five;
};
This would allow for traditional references and sizing:
memcpy(&dst.thing, &src.thing, sizeof(dst.thing));
However, doing this would mean that referencing struct members enclosed
by such named structs would always require including the sub-struct name
in identifiers:
do_something(dst.thing.three);
This has tended to be quite inflexible, especially when such groupings
need to be added to established code which causes huge naming churn.
Three workarounds exist in the kernel for this problem, and each have
other negative properties.
To avoid the naming churn, there is a design pattern of adding macro
aliases for the named struct:
#define f_three thing.three
This ends up polluting the global namespace, and makes it difficult to
search for identifiers.
Another common work-around in kernel code avoids the pollution by avoiding
the named struct entirely, instead identifying the group's boundaries using
either a pair of empty anonymous structs of a pair of zero-element arrays:
struct foo {
int one;
struct { } start;
int two;
int three, four;
struct { } finish;
int five;
};
struct foo {
int one;
int start[0];
int two;
int three, four;
int finish[0];
int five;
};
This allows code to avoid needing to use a sub-struct named for member
references within the surrounding structure, but loses the benefits of
being able to actually use such a struct, making it rather fragile. Using
these requires open-coded calculation of sizes and offsets. The efforts
made to avoid common mistakes include lots of comments, or adding various
BUILD_BUG_ON()s. Such code is left with no way for the compiler to reason
about the boundaries (e.g. the "start" object looks like it's 0 bytes
in length), making bounds checking depend on open-coded calculations:
if (length > offsetof(struct foo, finish) -
offsetof(struct foo, start))
return -EINVAL;
memcpy(&dst.start, &src.start, offsetof(struct foo, finish) -
offsetof(struct foo, start));
However, the vast majority of places in the kernel that operate on
groups of members do so without any identification of the grouping,
relying either on comments or implicit knowledge of the struct contents,
which is even harder for the compiler to reason about, and results in
even more fragile manual sizing, usually depending on member locations
outside of the region (e.g. to copy "two" and "three", use the start of
"four" to find the size):
BUILD_BUG_ON((offsetof(struct foo, four) <
offsetof(struct foo, two)) ||
(offsetof(struct foo, four) <
offsetof(struct foo, three));
if (length > offsetof(struct foo, four) -
offsetof(struct foo, two))
return -EINVAL;
memcpy(&dst.two, &src.two, length);
In order to have a regular programmatic way to describe a struct
region that can be used for references and sizing, can be examined for
bounds checking, avoids forcing the use of intermediate identifiers,
and avoids polluting the global namespace, introduce the struct_group()
macro. This macro wraps the member declarations to create an anonymous
union of an anonymous struct (no intermediate name) and a named struct
(for references and sizing):
struct foo {
int one;
struct_group(thing,
int two;
int three, four;
);
int five;
};
if (length > sizeof(src.thing))
return -EINVAL;
memcpy(&dst.thing, &src.thing, length);
do_something(dst.three);
There are some rare cases where the resulting struct_group() needs
attributes added, so struct_group_attr() is also introduced to allow
for specifying struct attributes (e.g. __align(x) or __packed).
Additionally, there are places where such declarations would like to
have the struct be tagged, so struct_group_tagged() is added.
Given there is a need for a handful of UAPI uses too, the underlying
__struct_group() macro has been defined in UAPI so it can be used there
too.
To avoid confusing scripts/kernel-doc, hide the macro from its struct
parsing.
Co-developed-by: Keith Packard <keithp(a)keithp.com>
Signed-off-by: Keith Packard <keithp(a)keithp.com>
Acked-by: Gustavo A. R. Silva <gustavoars(a)kernel.org>
Link: https://lore.kernel.org/lkml/20210728023217.GC35706@embeddedor
Enhanced-by: Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
Link: https://lore.kernel.org/lkml/41183a98-bdb9-4ad6-7eab-5a7292a6df84@rasmusvil…
Enhanced-by: Dan Williams <dan.j.williams(a)intel.com>
Link: https://lore.kernel.org/lkml/1d9a2e6df2a9a35b2cdd50a9a68cac5991e7e5f0.camel…
Enhanced-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Link: https://lore.kernel.org/lkml/YQKa76A6XuFqgM03@phenom.ffwll.local
Acked-by: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Kees Cook <keescook(a)chromium.org>
Signed-off-by: Tadeusz Struk <tadeusz.struk(a)linaro.org>
---
include/linux/stddef.h | 48 +++++++++++++++++++++++++++++++++++++
include/uapi/linux/stddef.h | 24 +++++++++++++++++++
2 files changed, 72 insertions(+)
diff --git a/include/linux/stddef.h b/include/linux/stddef.h
index 998a4ba28eba..938216f8ab7e 100644
--- a/include/linux/stddef.h
+++ b/include/linux/stddef.h
@@ -36,4 +36,52 @@ enum {
#define offsetofend(TYPE, MEMBER) \
(offsetof(TYPE, MEMBER) + sizeof_field(TYPE, MEMBER))
+/**
+ * struct_group() - Wrap a set of declarations in a mirrored struct
+ *
+ * @NAME: The identifier name of the mirrored sub-struct
+ * @MEMBERS: The member declarations for the mirrored structs
+ *
+ * Used to create an anonymous union of two structs with identical
+ * layout and size: one anonymous and one named. The former can be
+ * used normally without sub-struct naming, and the latter can be
+ * used to reason about the start, end, and size of the group of
+ * struct members.
+ */
+#define struct_group(NAME, MEMBERS...) \
+ __struct_group(/* no tag */, NAME, /* no attrs */, MEMBERS)
+
+/**
+ * struct_group_attr() - Create a struct_group() with trailing attributes
+ *
+ * @NAME: The identifier name of the mirrored sub-struct
+ * @ATTRS: Any struct attributes to apply
+ * @MEMBERS: The member declarations for the mirrored structs
+ *
+ * Used to create an anonymous union of two structs with identical
+ * layout and size: one anonymous and one named. The former can be
+ * used normally without sub-struct naming, and the latter can be
+ * used to reason about the start, end, and size of the group of
+ * struct members. Includes structure attributes argument.
+ */
+#define struct_group_attr(NAME, ATTRS, MEMBERS...) \
+ __struct_group(/* no tag */, NAME, ATTRS, MEMBERS)
+
+/**
+ * struct_group_tagged() - Create a struct_group with a reusable tag
+ *
+ * @TAG: The tag name for the named sub-struct
+ * @NAME: The identifier name of the mirrored sub-struct
+ * @MEMBERS: The member declarations for the mirrored structs
+ *
+ * Used to create an anonymous union of two structs with identical
+ * layout and size: one anonymous and one named. The former can be
+ * used normally without sub-struct naming, and the latter can be
+ * used to reason about the start, end, and size of the group of
+ * struct members. Includes struct tag argument for the named copy,
+ * so the specified layout can be reused later.
+ */
+#define struct_group_tagged(TAG, NAME, MEMBERS...) \
+ __struct_group(TAG, NAME, /* no attrs */, MEMBERS)
+
#endif
diff --git a/include/uapi/linux/stddef.h b/include/uapi/linux/stddef.h
index ee8220f8dcf5..9f5da295ff1c 100644
--- a/include/uapi/linux/stddef.h
+++ b/include/uapi/linux/stddef.h
@@ -1,6 +1,30 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
#include <linux/compiler_types.h>
+#ifndef _UAPI_LINUX_STDDEF_H
+#define _UAPI_LINUX_STDDEF_H
#ifndef __always_inline
#define __always_inline inline
#endif
+
+/**
+ * __struct_group() - Create a mirrored named and anonyomous struct
+ *
+ * @TAG: The tag name for the named sub-struct (usually empty)
+ * @NAME: The identifier name of the mirrored sub-struct
+ * @ATTRS: Any struct attributes (usually empty)
+ * @MEMBERS: The member declarations for the mirrored structs
+ *
+ * Used to create an anonymous union of two structs with identical layout
+ * and size: one anonymous and one named. The former's members can be used
+ * normally without sub-struct naming, and the latter can be used to
+ * reason about the start, end, and size of the group of struct members.
+ * The named struct can also be explicitly tagged for layer reuse, as well
+ * as both having struct attributes appended.
+ */
+#define __struct_group(TAG, NAME, ATTRS, MEMBERS...) \
+ union { \
+ struct { MEMBERS } ATTRS; \
+ struct TAG { MEMBERS } ATTRS NAME; \
+ }
+#endif
--
2.35.1
Component match callback function needs to check if expected data is
passed to it. Without this check, it can cause a NULL pointer
dereference when another driver registers a component before i915
drivers have their component master fully bind.
Fixes: 7b882fe3e3e8b ("ALSA: hda - handle multiple i915 device instances")
Signed-off-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Signed-off-by: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Signed-off-by: Won Chung <wonchung(a)google.com>
---
- Add "Fixes" tag
- Send to stable(a)vger.kernel.org
sound/hda/hdac_i915.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/sound/hda/hdac_i915.c b/sound/hda/hdac_i915.c
index efe810af28c5..958b0975fa40 100644
--- a/sound/hda/hdac_i915.c
+++ b/sound/hda/hdac_i915.c
@@ -102,13 +102,13 @@ static int i915_component_master_match(struct device *dev, int subcomponent,
struct pci_dev *hdac_pci, *i915_pci;
struct hdac_bus *bus = data;
- if (!dev_is_pci(dev))
+ if (!dev_is_pci(dev) || !bus)
return 0;
hdac_pci = to_pci_dev(bus->dev);
i915_pci = to_pci_dev(dev);
- if (!strcmp(dev->driver->name, "i915") &&
+ if (dev->driver && !strcmp(dev->driver->name, "i915") &&
subcomponent == I915_COMPONENT_AUDIO &&
connectivity_check(i915_pci, hdac_pci))
return 1;
--
2.35.1.1021.g381101b075-goog
Component match callback functions need to check if expected data is
passed to them. Without this check, it can cause a NULL pointer
dereference when another driver registers a component before i915
drivers have their component master fully bind.
Fixes: 1e8d19d9b0dfc ("mei: hdcp: bind only with i915 on the same PCH")
Fixes: c2004ce99ed73 ("mei: pxp: export pavp client to me client bus")
Signed-off-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Signed-off-by: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Signed-off-by: Won Chung <wonchung(a)google.com>
---
Changes from v1:
- Add "Fixes" tag
- Send to stable(a)vger.kernel.org
drivers/misc/mei/hdcp/mei_hdcp.c | 2 +-
drivers/misc/mei/pxp/mei_pxp.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/misc/mei/hdcp/mei_hdcp.c b/drivers/misc/mei/hdcp/mei_hdcp.c
index ec2a4fce8581..843dbc2b21b1 100644
--- a/drivers/misc/mei/hdcp/mei_hdcp.c
+++ b/drivers/misc/mei/hdcp/mei_hdcp.c
@@ -784,7 +784,7 @@ static int mei_hdcp_component_match(struct device *dev, int subcomponent,
{
struct device *base = data;
- if (strcmp(dev->driver->name, "i915") ||
+ if (!base || !dev->driver || strcmp(dev->driver->name, "i915") ||
subcomponent != I915_COMPONENT_HDCP)
return 0;
diff --git a/drivers/misc/mei/pxp/mei_pxp.c b/drivers/misc/mei/pxp/mei_pxp.c
index f7380d387bab..e32a81da8af6 100644
--- a/drivers/misc/mei/pxp/mei_pxp.c
+++ b/drivers/misc/mei/pxp/mei_pxp.c
@@ -131,7 +131,7 @@ static int mei_pxp_component_match(struct device *dev, int subcomponent,
{
struct device *base = data;
- if (strcmp(dev->driver->name, "i915") ||
+ if (!base || !dev->driver || strcmp(dev->driver->name, "i915") ||
subcomponent != I915_COMPONENT_PXP)
return 0;
--
2.35.1.1021.g381101b075-goog
If an application uses direct open or accept, it knows in advance what
direct descriptor value it will get as it picks it itself. This allows
combined requests such as:
sqe = io_uring_get_sqe(ring);
io_uring_prep_openat_direct(sqe, ..., file_slot);
sqe->flags |= IOSQE_IO_LINK | IOSQE_CQE_SKIP_SUCCESS;
sqe = io_uring_get_sqe(ring);
io_uring_prep_read(sqe,file_slot, buf, buf_size, 0);
sqe->flags |= IOSQE_FIXED_FILE;
io_uring_submit(ring);
where we prepare both a file open and read, and only get a completion
event for the read when both have completed successfully.
Currently links are fully prepared before the head is issued, but that
fails if the dependent link needs a file assigned that isn't valid until
the head has completed.
Conversely, if the same chain is performed but the fixed file slot is
already valid, then we would be unexpectedly returning data from the
old file slot rather than the newly opened one. Make sure we're
consistent here.
Allow deferral of file setup, which makes this documented case work.
Cc: stable(a)vger.kernel.org # v5.15+
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
---
fs/io_uring.c | 43 ++++++++++++++++++++++++++++++++++---------
1 file changed, 34 insertions(+), 9 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 84433dc57914..e73e333d4705 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -915,7 +915,11 @@ struct io_kiocb {
unsigned int flags;
u64 user_data;
- u32 result;
+ /* fd before execution, if valid, result after execution */
+ union {
+ u32 result;
+ s32 fd;
+ };
u32 cflags;
struct io_ring_ctx *ctx;
@@ -7208,6 +7212,23 @@ static void io_clean_op(struct io_kiocb *req)
req->flags &= ~IO_REQ_CLEAN_FLAGS;
}
+static bool io_assign_file(struct io_kiocb *req)
+{
+ if (req->file || !io_op_defs[req->opcode].needs_file)
+ return true;
+
+ req->file = io_file_get(req->ctx, req, req->fd,
+ req->flags & REQ_F_FIXED_FILE);
+ if (req->file) {
+ req->result = 0;
+ return 0;
+ }
+
+ req_set_fail(req);
+ req->result = -EBADF;
+ return false;
+}
+
static int io_issue_sqe(struct io_kiocb *req, unsigned int issue_flags)
{
const struct cred *creds = NULL;
@@ -7218,6 +7239,8 @@ static int io_issue_sqe(struct io_kiocb *req, unsigned int issue_flags)
if (!io_op_defs[req->opcode].audit_skip)
audit_uring_entry(req->opcode);
+ if (unlikely(!io_assign_file(req)))
+ return -EBADF;
switch (req->opcode) {
case IORING_OP_NOP:
@@ -7362,10 +7385,11 @@ static struct io_wq_work *io_wq_free_work(struct io_wq_work *work)
static void io_wq_submit_work(struct io_wq_work *work)
{
struct io_kiocb *req = container_of(work, struct io_kiocb, work);
+ const struct io_op_def *def = &io_op_defs[req->opcode];
unsigned int issue_flags = IO_URING_F_UNLOCKED;
bool needs_poll = false;
struct io_kiocb *timeout;
- int ret = 0;
+ int ret = 0, err = -ECANCELED;
/* one will be dropped by ->io_free_work() after returning to io-wq */
if (!(req->flags & REQ_F_REFCOUNT))
@@ -7377,14 +7401,18 @@ static void io_wq_submit_work(struct io_wq_work *work)
if (timeout)
io_queue_linked_timeout(timeout);
+ if (!io_assign_file(req)) {
+ work->flags |= IO_WQ_WORK_CANCEL;
+ err = -EBADF;
+ }
+
/* either cancelled or io-wq is dying, so don't touch tctx->iowq */
if (work->flags & IO_WQ_WORK_CANCEL) {
- io_req_task_queue_fail(req, -ECANCELED);
+ io_req_task_queue_fail(req, err);
return;
}
if (req->flags & REQ_F_FORCE_ASYNC) {
- const struct io_op_def *def = &io_op_defs[req->opcode];
bool opcode_poll = def->pollin || def->pollout;
if (opcode_poll && file_can_poll(req->file)) {
@@ -7720,6 +7748,8 @@ static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req,
if (io_op_defs[opcode].needs_file) {
struct io_submit_state *state = &ctx->submit_state;
+ req->fd = READ_ONCE(sqe->fd);
+
/*
* Plug now if we have more than 2 IO left after this, and the
* target is potentially a read/write to block based storage.
@@ -7729,11 +7759,6 @@ static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req,
state->need_plug = false;
blk_start_plug_nr_ios(&state->plug, state->submit_nr);
}
-
- req->file = io_file_get(ctx, req, READ_ONCE(sqe->fd),
- (sqe_flags & IOSQE_FIXED_FILE));
- if (unlikely(!req->file))
- return -EBADF;
}
personality = READ_ONCE(sqe->personality);
--
2.35.1
This is a leftover from the really old days where we weren't able to
track and error early if we need a file and it wasn't assigned. Kill
the check.
Cc: stable(a)vger.kernel.org # v5.15+
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
---
fs/io_uring.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 0b89f35378fa..67244ea0af48 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4511,9 +4511,6 @@ static int io_fsync_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
{
struct io_ring_ctx *ctx = req->ctx;
- if (!req->file)
- return -EBADF;
-
if (unlikely(ctx->flags & IORING_SETUP_IOPOLL))
return -EINVAL;
if (unlikely(sqe->addr || sqe->ioprio || sqe->buf_index ||
--
2.35.1
From: Miklos Szeredi <mszeredi(a)redhat.com>
commit 0c4bcfdecb1ac0967619ee7ff44871d93c08c909 upstream.
In FOPEN_DIRECT_IO mode, fuse_file_write_iter() calls
fuse_direct_write_iter(), which normally calls fuse_direct_io(), which then
imports the write buffer with fuse_get_user_pages(), which uses
iov_iter_get_pages() to grab references to userspace pages instead of
actually copying memory.
On the filesystem device side, these pages can then either be read to
userspace (via fuse_dev_read()), or splice()d over into a pipe using
fuse_dev_splice_read() as pipe buffers with &nosteal_pipe_buf_ops.
This is wrong because after fuse_dev_do_read() unlocks the FUSE request,
the userspace filesystem can mark the request as completed, causing write()
to return. At that point, the userspace filesystem should no longer have
access to the pipe buffer.
Fix by copying pages coming from the user address space to new pipe
buffers.
Reported-by: Jann Horn <jannh(a)google.com>
Fixes: c3021629a0d8 ("fuse: support splice() reading from fuse device")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi(a)redhat.com>
Signed-off-by: Zach O'Keefe <zokeefe(a)google.com>
---
Applies against stable-v4.14 and stable-v4.19
struct fuse_args hasn't been piped through relevant functions yet, so
place user_pages flag in an empty hole in struct fuse_req instead.
fs/fuse/dev.c | 12 +++++++++++-
fs/fuse/file.c | 1 +
fs/fuse/fuse_i.h | 2 ++
3 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index db7d746633cf..1c98b5b7bead 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -991,7 +991,17 @@ static int fuse_copy_page(struct fuse_copy_state *cs, struct page **pagep,
while (count) {
if (cs->write && cs->pipebufs && page) {
- return fuse_ref_page(cs, page, offset, count);
+ /*
+ * Can't control lifetime of pipe buffers, so always
+ * copy user pages.
+ */
+ if (cs->req->user_pages) {
+ err = fuse_copy_fill(cs);
+ if (err)
+ return err;
+ } else {
+ return fuse_ref_page(cs, page, offset, count);
+ }
} else if (!cs->len) {
if (cs->move_pages && page &&
offset == 0 && count == PAGE_SIZE) {
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 5f5da2911cea..a32b2ca3de6f 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1325,6 +1325,7 @@ static int fuse_get_user_pages(struct fuse_req *req, struct iov_iter *ii,
(PAGE_SIZE - ret) & (PAGE_SIZE - 1);
}
+ req->user_pages = true;
if (write)
req->in.argpages = 1;
else
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index fac1f08dd32e..30fdede2ea64 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -312,6 +312,8 @@ struct fuse_req {
/** refcount */
refcount_t count;
+ bool user_pages;
+
/** Unique ID for the interrupt request */
u64 intr_unique;
--
2.35.1.1021.g381101b075-goog
Make the two locations where exportfs helpers check permission to lookup
a given inode idmapped mount aware by switching it to the lookup_one()
helper. This is a bugfix for the open_by_handle_at() system call which
doesn't take idmapped mounts into account currently. It's not tied to a
specific commit so we'll just Cc stable.
In addition this is required to support idmapped base layers in overlay.
The overlay filesystem uses exportfs to encode and decode file handles
for its index=on mount option and when nfs_export=on.
Cc: <stable(a)vger.kernel.org>
Cc: <linux-fsdevel(a)vger.kernel.org>
Tested-by: Giuseppe Scrivano <gscrivan(a)redhat.com>
Reviewed-by: Amir Goldstein <amir73il(a)gmail.com>
Signed-off-by: Christian Brauner (Microsoft) <brauner(a)kernel.org>
---
/* v2 */
unchanged
---
fs/exportfs/expfs.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index 0106eba46d5a..3ef80d000e13 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -145,7 +145,7 @@ static struct dentry *reconnect_one(struct vfsmount *mnt,
if (err)
goto out_err;
dprintk("%s: found name: %s\n", __func__, nbuf);
- tmp = lookup_one_len_unlocked(nbuf, parent, strlen(nbuf));
+ tmp = lookup_one_unlocked(mnt_user_ns(mnt), nbuf, parent, strlen(nbuf));
if (IS_ERR(tmp)) {
dprintk("%s: lookup failed: %d\n", __func__, PTR_ERR(tmp));
err = PTR_ERR(tmp);
@@ -525,7 +525,8 @@ exportfs_decode_fh_raw(struct vfsmount *mnt, struct fid *fid, int fh_len,
}
inode_lock(target_dir->d_inode);
- nresult = lookup_one_len(nbuf, target_dir, strlen(nbuf));
+ nresult = lookup_one(mnt_user_ns(mnt), nbuf,
+ target_dir, strlen(nbuf));
if (!IS_ERR(nresult)) {
if (unlikely(nresult->d_inode != result->d_inode)) {
dput(nresult);
--
2.32.0
The bug is here:
if (!pdev) {
The list iterator value 'pdev' will *always* be set and non-NULL
by for_each_netdev(), so it is incorrect to assume that the
iterator value will be NULL if the list is empty or no element
found (in this case, the check 'if (!pdev)' can be bypassed as
it always be false unexpectly).
To fix the bug, use a new variable 'iter' as the list iterator,
while use the original variable 'pdev' as a dedicated pointer to
point to the found element.
Cc: stable(a)vger.kernel.org
Fixes: 830662f6f032f ("RDMA/cxgb4: Add support for active and passive open connection with IPv6 address")
Signed-off-by: Xiaomeng Tong <xiam0nd.tong(a)gmail.com>
---
drivers/infiniband/hw/cxgb4/cm.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/infiniband/hw/cxgb4/cm.c b/drivers/infiniband/hw/cxgb4/cm.c
index c16017f6e8db..870d8517310b 100644
--- a/drivers/infiniband/hw/cxgb4/cm.c
+++ b/drivers/infiniband/hw/cxgb4/cm.c
@@ -2071,7 +2071,7 @@ static int import_ep(struct c4iw_ep *ep, int iptype, __u8 *peer_ip,
{
struct neighbour *n;
int err, step;
- struct net_device *pdev;
+ struct net_device *pdev = NULL, *iter;
n = dst_neigh_lookup(dst, peer_ip);
if (!n)
@@ -2083,14 +2083,14 @@ static int import_ep(struct c4iw_ep *ep, int iptype, __u8 *peer_ip,
if (iptype == 4)
pdev = ip_dev_find(&init_net, *(__be32 *)peer_ip);
else if (IS_ENABLED(CONFIG_IPV6))
- for_each_netdev(&init_net, pdev) {
+ for_each_netdev(&init_net, iter) {
if (ipv6_chk_addr(&init_net,
(struct in6_addr *)peer_ip,
- pdev, 1))
+ iter, 1)) {
+ pdev = iter;
break;
+ }
}
- else
- pdev = NULL;
if (!pdev) {
err = -ENODEV;
--
2.17.1
Bios queued into BFQ IO scheduler can be associated with a cgroup that
was already offlined. This may then cause insertion of this bfq_group
into a service tree. But this bfq_group will get freed as soon as last
bio associated with it is completed leading to use after free issues for
service tree users. Fix the problem by making sure we always operate on
online bfq_group. If the bfq_group associated with the bio is not
online, we pick the first online parent.
CC: stable(a)vger.kernel.org
Fixes: e21b7a0b9887 ("block, bfq: add full hierarchical scheduling and cgroups support")
Signed-off-by: Jan Kara <jack(a)suse.cz>
---
block/bfq-cgroup.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index 32d2c2a47480..09574af83566 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -612,10 +612,19 @@ static void bfq_link_bfqg(struct bfq_data *bfqd, struct bfq_group *bfqg)
struct bfq_group *bfq_bio_bfqg(struct bfq_data *bfqd, struct bio *bio)
{
struct blkcg_gq *blkg = bio->bi_blkg;
+ struct bfq_group *bfqg;
- if (!blkg)
- return bfqd->root_group;
- return blkg_to_bfqg(blkg);
+ while (blkg) {
+ bfqg = blkg_to_bfqg(blkg);
+ if (bfqg->online) {
+ bio_associate_blkg_from_css(bio, &blkg->blkcg->css);
+ return bfqg;
+ }
+ blkg = blkg->parent;
+ }
+ bio_associate_blkg_from_css(bio,
+ &bfqg_to_blkg(bfqd->root_group)->blkcg->css);
+ return bfqd->root_group;
}
/**
--
2.34.1
When the process is migrated to a different cgroup (or in case of
writeback just starts submitting bios associated with a different
cgroup) bfq_merge_bio() can operate with stale cgroup information in
bic. Thus the bio can be merged to a request from a different cgroup or
it can result in merging of bfqqs for different cgroups or bfqqs of
already dead cgroups and causing possible use-after-free issues. Fix the
problem by updating cgroup information in bfq_merge_bio().
CC: stable(a)vger.kernel.org
Fixes: e21b7a0b9887 ("block, bfq: add full hierarchical scheduling and cgroups support")
Signed-off-by: Jan Kara <jack(a)suse.cz>
---
block/bfq-iosched.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 89fe3f85eb3c..1fc4d4628fba 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -2457,10 +2457,17 @@ static bool bfq_bio_merge(struct request_queue *q, struct bio *bio,
spin_lock_irq(&bfqd->lock);
- if (bic)
+ if (bic) {
+ /*
+ * Make sure cgroup info is uptodate for current process before
+ * considering the merge.
+ */
+ bfq_bic_update_cgroup(bic, bio);
+
bfqd->bio_bfqq = bic_to_bfqq(bic, op_is_sync(bio->bi_opf));
- else
+ } else {
bfqd->bio_bfqq = NULL;
+ }
bfqd->bio_bic = bic;
ret = blk_mq_sched_try_merge(q, bio, nr_segs, &free);
--
2.34.1
It can happen that the parent of a bfqq changes between the moment we
decide two queues are worth to merge (and set bic->stable_merge_bfqq)
and the moment bfq_setup_merge() is called. This can happen e.g. because
the process submitted IO for a different cgroup and thus bfqq got
reparented. It can even happen that the bfqq we are merging with has
parent cgroup that is already offline and going to be destroyed in which
case the merge can lead to use-after-free issues such as:
BUG: KASAN: use-after-free in __bfq_deactivate_entity+0x9cb/0xa50
Read of size 8 at addr ffff88800693c0c0 by task runc:[2:INIT]/10544
CPU: 0 PID: 10544 Comm: runc:[2:INIT] Tainted: G E 5.15.2-0.g5fb85fd-default #1 openSUSE Tumbleweed (unreleased) f1f3b891c72369aebecd2e43e4641a6358867c70
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014
Call Trace:
<IRQ>
dump_stack_lvl+0x46/0x5a
print_address_description.constprop.0+0x1f/0x140
? __bfq_deactivate_entity+0x9cb/0xa50
kasan_report.cold+0x7f/0x11b
? __bfq_deactivate_entity+0x9cb/0xa50
__bfq_deactivate_entity+0x9cb/0xa50
? update_curr+0x32f/0x5d0
bfq_deactivate_entity+0xa0/0x1d0
bfq_del_bfqq_busy+0x28a/0x420
? resched_curr+0x116/0x1d0
? bfq_requeue_bfqq+0x70/0x70
? check_preempt_wakeup+0x52b/0xbc0
__bfq_bfqq_expire+0x1a2/0x270
bfq_bfqq_expire+0xd16/0x2160
? try_to_wake_up+0x4ee/0x1260
? bfq_end_wr_async_queues+0xe0/0xe0
? _raw_write_unlock_bh+0x60/0x60
? _raw_spin_lock_irq+0x81/0xe0
bfq_idle_slice_timer+0x109/0x280
? bfq_dispatch_request+0x4870/0x4870
__hrtimer_run_queues+0x37d/0x700
? enqueue_hrtimer+0x1b0/0x1b0
? kvm_clock_get_cycles+0xd/0x10
? ktime_get_update_offsets_now+0x6f/0x280
hrtimer_interrupt+0x2c8/0x740
Fix the problem by checking that the parent of the two bfqqs we are
merging in bfq_setup_merge() is the same.
Link: https://lore.kernel.org/linux-block/20211125172809.GC19572@quack2.suse.cz/
CC: stable(a)vger.kernel.org
Fixes: 430a67f9d616 ("block, bfq: merge bursts of newly-created queues")
Signed-off-by: Jan Kara <jack(a)suse.cz>
---
block/bfq-iosched.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 6d122c28086e..7d00b21ebe5d 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -2758,6 +2758,14 @@ bfq_setup_merge(struct bfq_queue *bfqq, struct bfq_queue *new_bfqq)
if (process_refs == 0 || new_process_refs == 0)
return NULL;
+ /*
+ * Make sure merged queues belong to the same parent. Parents could
+ * have changed since the time we decided the two queues are suitable
+ * for merging.
+ */
+ if (new_bfqq->entity.parent != bfqq->entity.parent)
+ return NULL;
+
bfq_log_bfqq(bfqq->bfqd, bfqq, "scheduling merge with queue %d",
new_bfqq->pid);
--
2.34.1
From: Tim Gardner <tim.gardner(a)canonical.com>
[ Upstream commit 37a1a2e6eeeb101285cd34e12e48a881524701aa ]
Coverity complains of a possible buffer overflow. However,
given the 'static' scope of nvidia_setup_i2c_bus() it looks
like that can't happen after examiniing the call sites.
CID 19036 (#1 of 1): Copy into fixed size buffer (STRING_OVERFLOW)
1. fixed_size_dest: You might overrun the 48-character fixed-size string
chan->adapter.name by copying name without checking the length.
2. parameter_as_source: Note: This defect has an elevated risk because the
source argument is a parameter of the current function.
89 strcpy(chan->adapter.name, name);
Fix this warning by using strscpy() which will silence the warning and
prevent any future buffer overflows should the names used to identify the
channel become much longer.
Cc: Antonino Daplas <adaplas(a)gmail.com>
Cc: linux-fbdev(a)vger.kernel.org
Cc: dri-devel(a)lists.freedesktop.org
Cc: linux-kernel(a)vger.kernel.org
Signed-off-by: Tim Gardner <tim.gardner(a)canonical.com>
Signed-off-by: Helge Deller <deller(a)gmx.de>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/video/fbdev/nvidia/nv_i2c.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/video/fbdev/nvidia/nv_i2c.c b/drivers/video/fbdev/nvidia/nv_i2c.c
index d7994a173245..0b48965a6420 100644
--- a/drivers/video/fbdev/nvidia/nv_i2c.c
+++ b/drivers/video/fbdev/nvidia/nv_i2c.c
@@ -86,7 +86,7 @@ static int nvidia_setup_i2c_bus(struct nvidia_i2c_chan *chan, const char *name,
{
int rc;
- strcpy(chan->adapter.name, name);
+ strscpy(chan->adapter.name, name, sizeof(chan->adapter.name));
chan->adapter.owner = THIS_MODULE;
chan->adapter.class = i2c_class;
chan->adapter.algo_data = &chan->algo;
--
2.34.1
From: Tim Gardner <tim.gardner(a)canonical.com>
[ Upstream commit 37a1a2e6eeeb101285cd34e12e48a881524701aa ]
Coverity complains of a possible buffer overflow. However,
given the 'static' scope of nvidia_setup_i2c_bus() it looks
like that can't happen after examiniing the call sites.
CID 19036 (#1 of 1): Copy into fixed size buffer (STRING_OVERFLOW)
1. fixed_size_dest: You might overrun the 48-character fixed-size string
chan->adapter.name by copying name without checking the length.
2. parameter_as_source: Note: This defect has an elevated risk because the
source argument is a parameter of the current function.
89 strcpy(chan->adapter.name, name);
Fix this warning by using strscpy() which will silence the warning and
prevent any future buffer overflows should the names used to identify the
channel become much longer.
Cc: Antonino Daplas <adaplas(a)gmail.com>
Cc: linux-fbdev(a)vger.kernel.org
Cc: dri-devel(a)lists.freedesktop.org
Cc: linux-kernel(a)vger.kernel.org
Signed-off-by: Tim Gardner <tim.gardner(a)canonical.com>
Signed-off-by: Helge Deller <deller(a)gmx.de>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/video/fbdev/nvidia/nv_i2c.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/video/fbdev/nvidia/nv_i2c.c b/drivers/video/fbdev/nvidia/nv_i2c.c
index d7994a173245..0b48965a6420 100644
--- a/drivers/video/fbdev/nvidia/nv_i2c.c
+++ b/drivers/video/fbdev/nvidia/nv_i2c.c
@@ -86,7 +86,7 @@ static int nvidia_setup_i2c_bus(struct nvidia_i2c_chan *chan, const char *name,
{
int rc;
- strcpy(chan->adapter.name, name);
+ strscpy(chan->adapter.name, name, sizeof(chan->adapter.name));
chan->adapter.owner = THIS_MODULE;
chan->adapter.class = i2c_class;
chan->adapter.algo_data = &chan->algo;
--
2.34.1
From: Tim Gardner <tim.gardner(a)canonical.com>
[ Upstream commit 37a1a2e6eeeb101285cd34e12e48a881524701aa ]
Coverity complains of a possible buffer overflow. However,
given the 'static' scope of nvidia_setup_i2c_bus() it looks
like that can't happen after examiniing the call sites.
CID 19036 (#1 of 1): Copy into fixed size buffer (STRING_OVERFLOW)
1. fixed_size_dest: You might overrun the 48-character fixed-size string
chan->adapter.name by copying name without checking the length.
2. parameter_as_source: Note: This defect has an elevated risk because the
source argument is a parameter of the current function.
89 strcpy(chan->adapter.name, name);
Fix this warning by using strscpy() which will silence the warning and
prevent any future buffer overflows should the names used to identify the
channel become much longer.
Cc: Antonino Daplas <adaplas(a)gmail.com>
Cc: linux-fbdev(a)vger.kernel.org
Cc: dri-devel(a)lists.freedesktop.org
Cc: linux-kernel(a)vger.kernel.org
Signed-off-by: Tim Gardner <tim.gardner(a)canonical.com>
Signed-off-by: Helge Deller <deller(a)gmx.de>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/video/fbdev/nvidia/nv_i2c.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/video/fbdev/nvidia/nv_i2c.c b/drivers/video/fbdev/nvidia/nv_i2c.c
index d7994a173245..0b48965a6420 100644
--- a/drivers/video/fbdev/nvidia/nv_i2c.c
+++ b/drivers/video/fbdev/nvidia/nv_i2c.c
@@ -86,7 +86,7 @@ static int nvidia_setup_i2c_bus(struct nvidia_i2c_chan *chan, const char *name,
{
int rc;
- strcpy(chan->adapter.name, name);
+ strscpy(chan->adapter.name, name, sizeof(chan->adapter.name));
chan->adapter.owner = THIS_MODULE;
chan->adapter.class = i2c_class;
chan->adapter.algo_data = &chan->algo;
--
2.34.1
From: Ranjani Sridharan <ranjani.sridharan(a)linux.intel.com>
[ Upstream commit 2ce0d008dcc59f9c01f43277b9f9743af7b01dad ]
The limitation to assign a link DMA channel for a BE iff the
corresponding host DMA channel is assigned to a connected FE is only
applicable if the PROCEN_FMT_QUIRK is set. So, remove it for platforms
that do not enable the quirk.
Complements: a792bfc1c2bc ("ASoC: SOF: Intel: hda-stream: limit PROCEN workaround")
Signed-off-by: Ranjani Sridharan <ranjani.sridharan(a)linux.intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart(a)linux.intel.com>
Reviewed-by: Rander Wang <rander.wang(a)intel.com>
Reviewed-by: Kai Vehmanen <kai.vehmanen(a)linux.intel.com>
Reviewed-by: Peter Ujfalusi <peter.ujfalusi(a)linux.intel.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi(a)linux.intel.com>
Link: https://lore.kernel.org/r/20220128130017.28508-1-peter.ujfalusi@linux.intel…
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
sound/soc/sof/intel/hda-dai.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/sound/soc/sof/intel/hda-dai.c b/sound/soc/sof/intel/hda-dai.c
index b3cdd10c83ae..80e3a02e629f 100644
--- a/sound/soc/sof/intel/hda-dai.c
+++ b/sound/soc/sof/intel/hda-dai.c
@@ -57,6 +57,8 @@ static struct hdac_ext_stream *
{
struct snd_soc_pcm_runtime *rtd = substream->private_data;
struct sof_intel_hda_stream *hda_stream;
+ const struct sof_intel_dsp_desc *chip;
+ struct snd_sof_dev *sdev;
struct hdac_ext_stream *res = NULL;
struct hdac_stream *stream = NULL;
@@ -75,9 +77,20 @@ static struct hdac_ext_stream *
continue;
hda_stream = hstream_to_sof_hda_stream(hstream);
+ sdev = hda_stream->sdev;
+ chip = get_chip_info(sdev->pdata);
/* check if link is available */
if (!hstream->link_locked) {
+ /*
+ * choose the first available link for platforms that do not have the
+ * PROCEN_FMT_QUIRK set.
+ */
+ if (!(chip->quirks & SOF_INTEL_PROCEN_FMT_QUIRK)) {
+ res = hstream;
+ break;
+ }
+
if (stream->opened) {
/*
* check if the stream tag matches the stream
--
2.34.1
On Wed, Mar 30, 2022 at 06:28:33AM -0400, Joshua Freedman wrote:
> I'm a mortgage broker but I can figure this out.
>
> Am I doing this:
> git clone git://
> git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-stable
>
> and then git good and bad for for 5.16.11 and 5.16.12 and putting that to a
> log file that you can use?
Yes, that is exactly right!
thanks,
greg k-h
On Wed, Mar 30, 2022 at 05:42:45AM -0400, Joshua Freedman wrote:
> I just re-verified all this:
>
> For audio and mic:
> 5.17.1 is broken. Dummy Output. 5.16.12 is also dummy output.
> 5.16.11 was the last working version. 5.16.11-76051611-generic
Great, can you use 'git bisect' to track down the exact commit that
causes problems? Odds are it's not this one, but perhaps an audio
change?
thanks,
greg k-h
Hi reviewers,
I suggest to apply the following patch to stable kernel 5.4.y and 5.10.y:
commit: 5b61343b50590fb04a3f6be2cdc4868091757262
Subject: iommu/iova: Improve 32-bit free space estimate
kernel version to apply to: 5.4.y and 5.10.y
Reason:
The patch fix a corner case of iova allocation and the allocation failure
can be found under stress tests.
Test:
The patch can be applied to Linux 5.10.109 and Linux 5.4.188 without conflicts.
Both Linux 5.10.109 and Linux 5.4.188 can be built successfully with
this patch. (ARCH=arm64 defconfig)
Thanks,
Miles
Hi Joshua,
Le mer., mars 30 2022 at 03:34:34 -0400, Joshua Freedman
<freedman.joshua(a)gmail.com> a écrit :
> Hi, I noticed my audio and mic stopped working on my yoga c930 on
> 5.16.12 and newer. 5.16.11 was ok. On 5.16.12 and above I get a
> dummy output and no mic.
>
> Anything we can do about that? Thanks.
>
> I'm not a git guy so I just looked at the changlogs and this was the
> only audio one in 5.16.12. I could be wrong in the ID though.
This commit changes the clocks on the JZ4725B SoC, there is just no way
for it to be the cause of your problem.
Cheers,
-Paul
> commit 6b0d719ffed1c21c1a2e473301fd95f78cd35c9e
> Author: Siarhei Volkau <lis8215(a)gmail.com>
> Date: Sat Feb 5 20:18:49 2022 +0300
>
> clk: jz4725b: fix mmc0 clock gating
>
> commit 2f0754f27a230fee6e6d753f07585cee03bedfe3 upstream.
>
> The mmc0 clock gate bit was mistakenly assigned to "i2s" clock.
> You can find that the same bit is assigned to "mmc0" too.
> It leads to mmc0 hang for a long time after any sound activity
> also it prevented PM_SLEEP to work properly.
> I guess it was introduced by copy-paste from jz4740 driver
> where it is really controls I2S clock gate.
>
> Fixes: 226dfa4726eb ("clk: Add Ingenic jz4725b CGU driver")
> Signed-off-by: Siarhei Volkau <lis8215(a)gmail.com>
> Tested-by: Siarhei Volkau <lis8215(a)gmail.com>
> Reviewed-by: Paul Cercueil <paul(a)crapouillou.net>
> Cc: stable(a)vger.kernel.org
> Link:
> https://lore.kernel.org/r/20220205171849.687805-2-lis8215@gmail.com
> Signed-off-by: Stephen Boyd <sboyd(a)kernel.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
>
>
> --
>
>
> Joshua Freedman
>
> President
>
> P: +19168060052
>
> 49 Sample BR RD
>
> Mechanicsburg, PA 17050-2386
>
> Fix & Flip Rehab Financing <31 Unit
>
>
>
>
> PGPKEY
>
On Wed, Mar 30, 2022 at 03:34:34AM -0400, Joshua Freedman wrote:
> Hi, I noticed my audio and mic stopped working on my yoga c930 on 5.16.12
> and newer. 5.16.11 was ok. On 5.16.12 and above I get a dummy output and
> no mic.
>
> Anything we can do about that? Thanks.
>
> I'm not a git guy so I just looked at the changlogs and this was the only
> audio one in 5.16.12. I could be wrong in the ID though.
>
> commit 6b0d719ffed1c21c1a2e473301fd95f78cd35c9e
> Author: Siarhei Volkau <lis8215(a)gmail.com>
> Date: Sat Feb 5 20:18:49 2022 +0300
>
> clk: jz4725b: fix mmc0 clock gating
>
> commit 2f0754f27a230fee6e6d753f07585cee03bedfe3 upstream.
>
> The mmc0 clock gate bit was mistakenly assigned to "i2s" clock.
> You can find that the same bit is assigned to "mmc0" too.
> It leads to mmc0 hang for a long time after any sound activity
> also it prevented PM_SLEEP to work properly.
> I guess it was introduced by copy-paste from jz4740 driver
> where it is really controls I2S clock gate.
>
> Fixes: 226dfa4726eb ("clk: Add Ingenic jz4725b CGU driver")
> Signed-off-by: Siarhei Volkau <lis8215(a)gmail.com>
> Tested-by: Siarhei Volkau <lis8215(a)gmail.com>
> Reviewed-by: Paul Cercueil <paul(a)crapouillou.net>
> Cc: stable(a)vger.kernel.org
> Link: https://lore.kernel.org/r/20220205171849.687805-2-lis8215@gmail.com
> Signed-off-by: Stephen Boyd <sboyd(a)kernel.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
>
If you revert this does your machine work properly? Is 5.17 also broken
for you due to this change, or does it work properly?
thanks,
greg k-h
On Wed, Mar 30, 2022 at 04:01:59PM +0900, 조준완 wrote:
>
>
> Dear Greg Kroah-Hartman,
<snip>
Again, this was sent in html format, which is rejected by the lists.
Please fix your email client and try again.
thanks,
greg k-h
On Wed, Mar 30, 2022 at 03:21:22PM +0900, 조준완 wrote:
>
>
>
>
> Dear Greg Kroah-Hartman,
<snip>
Please resend in non-html format so it properly makes it to the mailing
lists and I will gladly respond.
thanks,
greg k-h
The patch titled
Subject: mm: fix unexpected zeroed page mapping with zram swap
has been added to the -mm tree. Its filename is
mm-fix-unexpected-zeroed-page-mapping-with-zram-swap.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-fix-unexpected-zeroed-page-map…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-fix-unexpected-zeroed-page-map…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Minchan Kim <minchan(a)kernel.org>
Subject: mm: fix unexpected zeroed page mapping with zram swap
Two processes under CLONE_VM cloning, user process can be corrupted by
seeing zeroed page unexpectedly.
CPU A CPU B
do_swap_page do_swap_page
SWP_SYNCHRONOUS_IO path SWP_SYNCHRONOUS_IO path
swap_readpage valid data
swap_slot_free_notify
delete zram entry
swap_readpage zeroed(invalid) data
pte_lock
map the *zero data* to userspace
pte_unlock
pte_lock
if (!pte_same)
goto out_nomap;
pte_unlock
return and next refault will
read zeroed data
The swap_slot_free_notify is bogus for CLONE_VM case since it doesn't
increase the refcount of swap slot at copy_mm so it couldn't catch up
whether it's safe or not to discard data from backing device. In the
case, only the lock it could rely on to synchronize swap slot freeing is
page table lock. Thus, this patch gets rid of the swap_slot_free_notify
function. With this patch, CPU A will see correct data.
CPU A CPU B
do_swap_page do_swap_page
SWP_SYNCHRONOUS_IO path SWP_SYNCHRONOUS_IO path
swap_readpage original data
pte_lock
map the original data
swap_free
swap_range_free
bd_disk->fops->swap_slot_free_notify
swap_readpage read zeroed data
pte_unlock
pte_lock
if (!pte_same)
goto out_nomap;
pte_unlock
return
on next refault will see mapped data by CPU B
The concern of the patch would increase memory consumption since it could
keep wasted memory with compressed form in zram as well as uncompressed
form in address space. However, most of cases of zram uses no readahead
and do_swap_page is followed by swap_free so it will free the compressed
form from in zram quickly.
Link: https://lkml.kernel.org/r/YjTVVxIAsnKAXjTd@google.com
Fixes: 0bcac06f27d7 ("mm, swap: skip swapcache for swapin of synchronous device")
Reported-by: Ivan Babrou <ivan(a)cloudflare.com>
Tested-by: Ivan Babrou <ivan(a)cloudflare.com>
Signed-off-by: Minchan Kim <minchan(a)kernel.org>
Cc: Nitin Gupta <ngupta(a)vflare.org>
Cc: Sergey Senozhatsky <senozhatsky(a)chromium.org>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: <stable(a)vger.kernel.org> [4.14+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_io.c | 54 -------------------------------------------------
1 file changed, 54 deletions(-)
--- a/mm/page_io.c~mm-fix-unexpected-zeroed-page-mapping-with-zram-swap
+++ a/mm/page_io.c
@@ -51,54 +51,6 @@ void end_swap_bio_write(struct bio *bio)
bio_put(bio);
}
-static void swap_slot_free_notify(struct page *page)
-{
- struct swap_info_struct *sis;
- struct gendisk *disk;
- swp_entry_t entry;
-
- /*
- * There is no guarantee that the page is in swap cache - the software
- * suspend code (at least) uses end_swap_bio_read() against a non-
- * swapcache page. So we must check PG_swapcache before proceeding with
- * this optimization.
- */
- if (unlikely(!PageSwapCache(page)))
- return;
-
- sis = page_swap_info(page);
- if (data_race(!(sis->flags & SWP_BLKDEV)))
- return;
-
- /*
- * The swap subsystem performs lazy swap slot freeing,
- * expecting that the page will be swapped out again.
- * So we can avoid an unnecessary write if the page
- * isn't redirtied.
- * This is good for real swap storage because we can
- * reduce unnecessary I/O and enhance wear-leveling
- * if an SSD is used as the as swap device.
- * But if in-memory swap device (eg zram) is used,
- * this causes a duplicated copy between uncompressed
- * data in VM-owned memory and compressed data in
- * zram-owned memory. So let's free zram-owned memory
- * and make the VM-owned decompressed page *dirty*,
- * so the page should be swapped out somewhere again if
- * we again wish to reclaim it.
- */
- disk = sis->bdev->bd_disk;
- entry.val = page_private(page);
- if (disk->fops->swap_slot_free_notify && __swap_count(entry) == 1) {
- unsigned long offset;
-
- offset = swp_offset(entry);
-
- SetPageDirty(page);
- disk->fops->swap_slot_free_notify(sis->bdev,
- offset);
- }
-}
-
static void end_swap_bio_read(struct bio *bio)
{
struct page *page = bio_first_page_all(bio);
@@ -114,7 +66,6 @@ static void end_swap_bio_read(struct bio
}
SetPageUptodate(page);
- swap_slot_free_notify(page);
out:
unlock_page(page);
WRITE_ONCE(bio->bi_private, NULL);
@@ -394,11 +345,6 @@ int swap_readpage(struct page *page, boo
if (sis->flags & SWP_SYNCHRONOUS_IO) {
ret = bdev_read_page(sis->bdev, swap_page_sector(page), page);
if (!ret) {
- if (trylock_page(page)) {
- swap_slot_free_notify(page);
- unlock_page(page);
- }
-
count_vm_event(PSWPIN);
goto out;
}
_
Patches currently in -mm which might be from minchan(a)kernel.org are
mm-fix-unexpected-zeroed-page-mapping-with-zram-swap.patch
The previous commit fixed a memory leak on the send path in the event
that IPv6 is disabled at compile time, but how did a packet even arrive
there to begin with? It turns out we have previously allowed IPv6
endpoints even when IPv6 support is disabled at compile time. This is
awkward and inconsistent. Instead, let's just ignore all things IPv6,
the same way we do other malformed endpoints, in the case where IPv6 is
disabled.
Fixes: e7096c131e51 ("net: WireGuard secure network tunnel")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
drivers/net/wireguard/socket.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireguard/socket.c b/drivers/net/wireguard/socket.c
index 467eef0e563b..0414d7a6ce74 100644
--- a/drivers/net/wireguard/socket.c
+++ b/drivers/net/wireguard/socket.c
@@ -242,7 +242,7 @@ int wg_socket_endpoint_from_skb(struct endpoint *endpoint,
endpoint->addr4.sin_addr.s_addr = ip_hdr(skb)->saddr;
endpoint->src4.s_addr = ip_hdr(skb)->daddr;
endpoint->src_if4 = skb->skb_iif;
- } else if (skb->protocol == htons(ETH_P_IPV6)) {
+ } else if (IS_ENABLED(CONFIG_IPV6) && skb->protocol == htons(ETH_P_IPV6)) {
endpoint->addr6.sin6_family = AF_INET6;
endpoint->addr6.sin6_port = udp_hdr(skb)->source;
endpoint->addr6.sin6_addr = ipv6_hdr(skb)->saddr;
@@ -285,7 +285,7 @@ void wg_socket_set_peer_endpoint(struct wg_peer *peer,
peer->endpoint.addr4 = endpoint->addr4;
peer->endpoint.src4 = endpoint->src4;
peer->endpoint.src_if4 = endpoint->src_if4;
- } else if (endpoint->addr.sa_family == AF_INET6) {
+ } else if (IS_ENABLED(CONFIG_IPV6) && endpoint->addr.sa_family == AF_INET6) {
peer->endpoint.addr6 = endpoint->addr6;
peer->endpoint.src6 = endpoint->src6;
} else {
--
2.35.1
We make too nuanced use of ptr_ring to entirely move to the skb_array
wrappers, but we at least should avoid the naughty function pointer cast
when cleaning up skbs. Otherwise RAP/CFI will honk at us. This patch
uses the __skb_array_destroy_skb wrapper for the cleanup, rather than
directly providing kfree_skb, which is what other drivers in the same
situation do too.
Reported-by: PaX Team <pageexec(a)freemail.hu>
Fixes: 886fcee939ad ("wireguard: receive: use ring buffer for incoming handshakes")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
drivers/net/wireguard/queueing.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/wireguard/queueing.c b/drivers/net/wireguard/queueing.c
index 1de413b19e34..8084e7408c0a 100644
--- a/drivers/net/wireguard/queueing.c
+++ b/drivers/net/wireguard/queueing.c
@@ -4,6 +4,7 @@
*/
#include "queueing.h"
+#include <linux/skb_array.h>
struct multicore_worker __percpu *
wg_packet_percpu_multicore_worker_alloc(work_func_t function, void *ptr)
@@ -42,7 +43,7 @@ void wg_packet_queue_free(struct crypt_queue *queue, bool purge)
{
free_percpu(queue->worker);
WARN_ON(!purge && !__ptr_ring_empty(&queue->ring));
- ptr_ring_cleanup(&queue->ring, purge ? (void(*)(void*))kfree_skb : NULL);
+ ptr_ring_cleanup(&queue->ring, purge ? __skb_array_destroy_skb : NULL);
}
#define NEXT(skb) ((skb)->prev)
--
2.35.1
Beste begunstigde,
Je hebt een liefdadigheidsdonatie van ($ 10.000.000,00) van Mr. Mike Weirsky, een winnaar van een powerball-jackpotloterij van $ 273 miljoen. Ik doneer aan 5 willekeurige personen als je deze e-mail ontvangt, dan is je e-mail geselecteerd na een spin-ball. Ik heb vrijwillig besloten om het bedrag van $ 10 miljoen USD aan jou te doneren als een van de geselecteerde 5, om mijn winst te verifiëren
Vriendelijk antwoord op: mike.weirsky.foundation003(a)gmail.com
Voor uw claim.
In some cases it appears the invalidation of a hwpoisoned page
fails because the page is still mapped in another process. This
can cause a program to be continuously restarted and die when
it page faults on the page that was not invalidated. Avoid that
problem by unmapping the hwpoisoned page when we find it.
Another issue is that sometimes we end up oopsing in finish_fault,
if the code tries to do something with the now-NULL vmf->page.
I did not hit this error when submitting the previous patch because
there are several opportunities for alloc_set_pte to bail out before
accessing vmf->page, and that apparently happened on those systems,
and most of the time on other systems, too.
However, across several million systems that error does occur a
handful of times a day. It can be avoided by returning VM_FAULT_NOPAGE
which will cause do_read_fault to return before calling finish_fault.
Fixes: e53ac7374e64 ("mm: invalidate hwpoison page cache page in fault path")
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: stable(a)vger.kernel.org
---
mm/memory.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index be44d0b36b18..76e3af9639d9 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3918,14 +3918,18 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
return ret;
if (unlikely(PageHWPoison(vmf->page))) {
+ struct page *page = vmf->page;
vm_fault_t poisonret = VM_FAULT_HWPOISON;
if (ret & VM_FAULT_LOCKED) {
+ if (page_mapped(page))
+ unmap_mapping_pages(page_mapping(page),
+ page->index, 1, false);
/* Retry if a clean page was removed from the cache. */
- if (invalidate_inode_page(vmf->page))
- poisonret = 0;
- unlock_page(vmf->page);
+ if (invalidate_inode_page(page))
+ poisonret = VM_FAULT_NOPAGE;
+ unlock_page(page);
}
- put_page(vmf->page);
+ put_page(page);
vmf->page = NULL;
return poisonret;
}
--
2.35.1
If an mremap() syscall with old_size=0 ends up in move_page_tables(),
it will call invalidate_range_start()/invalidate_range_end() unnecessarily,
i.e. with an empty range.
This causes a WARN in KVM's mmu_notifier. In the past, empty ranges
have been diagnosed to be off-by-one bugs, hence the WARNing.
Given the low (so far) number of unique reports, the benefits of
detecting more buggy callers seem to outweigh the cost of having
to fix cases such as this one, where userspace is doing something
silly. In this particular case, an early return from move_page_tables()
is enough to fix the issue.
Reported-by: syzbot+6bde52d89cfdf9f61425(a)syzkaller.appspotmail.com
Cc: linux-mm(a)kvack.org
Cc: akpm(a)linux-foundation.org
Cc: stable(a)vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
---
mm/mremap.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/mm/mremap.c b/mm/mremap.c
index 002eec83e91e..0e175aef536e 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -486,6 +486,9 @@ unsigned long move_page_tables(struct vm_area_struct *vma,
pmd_t *old_pmd, *new_pmd;
pud_t *old_pud, *new_pud;
+ if (!len)
+ return 0;
+
old_end = old_addr + len;
flush_cache_range(vma, old_addr, old_end);
--
2.31.1
On Thu, Mar 10, 2022 at 4:54 AM Jeremy Linton <jeremy.linton(a)arm.com> wrote:
>
> GCC12 appears to be much smarter about its dependency tracking and is
> aware that the relaxed variants are just normal loads and stores and
> this is causing problems like:
>
> [ 210.074549] ------------[ cut here ]------------
> [ 210.079223] NETDEV WATCHDOG: enabcm6e4ei0 (bcmgenet): transmit queue 1 timed out
> [ 210.086717] WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:529 dev_watchdog+0x234/0x240
> [ 210.095044] Modules linked in: genet(E) nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat]
> [ 210.146561] ACPI CPPC: PCC check channel failed for ss: 0. ret=-110
> [ 210.146927] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G E 5.17.0-rc7G12+ #58
> [ 210.153226] CPPC Cpufreq:cppc_scale_freq_workfn: failed to read perf counters
> [ 210.161349] Hardware name: Raspberry Pi Foundation Raspberry Pi 4 Model B/Raspberry Pi 4 Model B, BIOS EDK2-DEV 02/08/2022
> [ 210.161353] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 210.161358] pc : dev_watchdog+0x234/0x240
> [ 210.161364] lr : dev_watchdog+0x234/0x240
> [ 210.161368] sp : ffff8000080a3a40
> [ 210.161370] x29: ffff8000080a3a40 x28: ffffcd425af87000 x27: ffff8000080a3b20
> [ 210.205150] x26: ffffcd425aa00000 x25: 0000000000000001 x24: ffffcd425af8ec08
> [ 210.212321] x23: 0000000000000100 x22: ffffcd425af87000 x21: ffff55b142688000
> [ 210.219491] x20: 0000000000000001 x19: ffff55b1426884c8 x18: ffffffffffffffff
> [ 210.226661] x17: 64656d6974203120 x16: 0000000000000001 x15: 6d736e617274203a
> [ 210.233831] x14: 2974656e65676d63 x13: ffffcd4259c300d8 x12: ffffcd425b07d5f0
> [ 210.241001] x11: 00000000ffffffff x10: ffffcd425b07d5f0 x9 : ffffcd4258bdad9c
> [ 210.248171] x8 : 00000000ffffdfff x7 : 000000000000003f x6 : 0000000000000000
> [ 210.255341] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000001000
> [ 210.262511] x2 : 0000000000001000 x1 : 0000000000000005 x0 : 0000000000000044
> [ 210.269682] Call trace:
> [ 210.272133] dev_watchdog+0x234/0x240
> [ 210.275811] call_timer_fn+0x3c/0x15c
> [ 210.279489] __run_timers.part.0+0x288/0x310
> [ 210.283777] run_timer_softirq+0x48/0x80
> [ 210.287716] __do_softirq+0x128/0x360
> [ 210.291392] __irq_exit_rcu+0x138/0x140
> [ 210.295243] irq_exit_rcu+0x1c/0x30
> [ 210.298745] el1_interrupt+0x38/0x54
> [ 210.302334] el1h_64_irq_handler+0x18/0x24
> [ 210.306445] el1h_64_irq+0x7c/0x80
> [ 210.309857] arch_cpu_idle+0x18/0x2c
> [ 210.313445] default_idle_call+0x4c/0x140
> [ 210.317470] cpuidle_idle_call+0x14c/0x1a0
> [ 210.321584] do_idle+0xb0/0x100
> [ 210.324737] cpu_startup_entry+0x30/0x8c
> [ 210.328675] secondary_start_kernel+0xe4/0x110
> [ 210.333138] __secondary_switched+0x94/0x98
>
> The assumption when these were relaxed seems to be that device memory
> would be mapped non reordering, and that other constructs
> (spinlocks/etc) would provide the barriers to assure that packet data
> and in memory rings/queues were ordered with respect to device
> register reads/writes. This itself seems a bit sketchy, but the real
> problem with GCC12 is that it is moving the actual reads/writes around
> at will as though they were independent operations when in truth they
> are not, but the compiler can't know that. When looking at the
> assembly dumps for many of these routines its possible to see very
> clean, but not strictly in program order operations occurring as the
> compiler would be free to do if these weren't actually register
> reads/write operations.
>
> Its possible to suppress the timeout with a liberal bit of dma_mb()'s
> sprinkled around but the device still seems unable to reliably
> send/receive data. A better plan is to use the safer readl/writel
> everywhere.
>
> Since this partially reverts an older commit, which notes the use of
> the relaxed variants for performance reasons. I would suggest that
> any performance problems with this commit are targeted at relaxing only
> the performance critical code paths after assuring proper barriers.
>
> Fixes: 69d2ea9c79898 ("net: bcmgenet: Use correct I/O accessors")
> Reported-by: Peter Robinson <pbrobinson(a)gmail.com>
> Signed-off-by: Jeremy Linton <jeremy.linton(a)arm.com>
This is now in Linus's tree as commit 8d3ea3d402db would be good to
get it int 5.17.x
> ---
> drivers/net/ethernet/broadcom/genet/bcmgenet.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
> index 87f1056e29ff..e907a2df299c 100644
> --- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
> +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
> @@ -76,7 +76,7 @@ static inline void bcmgenet_writel(u32 value, void __iomem *offset)
> if (IS_ENABLED(CONFIG_MIPS) && IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
> __raw_writel(value, offset);
> else
> - writel_relaxed(value, offset);
> + writel(value, offset);
> }
>
> static inline u32 bcmgenet_readl(void __iomem *offset)
> @@ -84,7 +84,7 @@ static inline u32 bcmgenet_readl(void __iomem *offset)
> if (IS_ENABLED(CONFIG_MIPS) && IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
> return __raw_readl(offset);
> else
> - return readl_relaxed(offset);
> + return readl(offset);
> }
>
> static inline void dmadesc_set_length_status(struct bcmgenet_priv *priv,
> --
> 2.35.1
>
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 61cc4534b6550997c97a03759ab46b29d44c0017 Mon Sep 17 00:00:00 2001
From: Waiman Long <longman(a)redhat.com>
Date: Sun, 2 Jan 2022 21:35:58 -0500
Subject: [PATCH] locking/lockdep: Avoid potential access of invalid memory in
lock_class
It was found that reading /proc/lockdep after a lockdep splat may
potentially cause an access to freed memory if lockdep_unregister_key()
is called after the splat but before access to /proc/lockdep [1]. This
is due to the fact that graph_lock() call in lockdep_unregister_key()
fails after the clearing of debug_locks by the splat process.
After lockdep_unregister_key() is called, the lock_name may be freed
but the corresponding lock_class structure still have a reference to
it. That invalid memory pointer will then be accessed when /proc/lockdep
is read by a user and a use-after-free (UAF) error will be reported if
KASAN is enabled.
To fix this problem, lockdep_unregister_key() is now modified to always
search for a matching key irrespective of the debug_locks state and
zap the corresponding lock class if a matching one is found.
[1] https://lore.kernel.org/lkml/77f05c15-81b6-bddd-9650-80d5f23fe330@i-love.sa…
Fixes: 8b39adbee805 ("locking/lockdep: Make lockdep_unregister_key() honor 'debug_locks' again")
Reported-by: Tetsuo Handa <penguin-kernel(a)i-love.sakura.ne.jp>
Signed-off-by: Waiman Long <longman(a)redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Reviewed-by: Bart Van Assche <bvanassche(a)acm.org>
Link: https://lkml.kernel.org/r/20220103023558.1377055-1-longman@redhat.com
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 89b3df51fd98..2e6892ec3756 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -6287,7 +6287,13 @@ void lockdep_reset_lock(struct lockdep_map *lock)
lockdep_reset_lock_reg(lock);
}
-/* Unregister a dynamically allocated key. */
+/*
+ * Unregister a dynamically allocated key.
+ *
+ * Unlike lockdep_register_key(), a search is always done to find a matching
+ * key irrespective of debug_locks to avoid potential invalid access to freed
+ * memory in lock_class entry.
+ */
void lockdep_unregister_key(struct lock_class_key *key)
{
struct hlist_head *hash_head = keyhashentry(key);
@@ -6302,10 +6308,8 @@ void lockdep_unregister_key(struct lock_class_key *key)
return;
raw_local_irq_save(flags);
- if (!graph_lock())
- goto out_irq;
+ lockdep_lock();
- pf = get_pending_free();
hlist_for_each_entry_rcu(k, hash_head, hash_entry) {
if (k == key) {
hlist_del_rcu(&k->hash_entry);
@@ -6313,11 +6317,13 @@ void lockdep_unregister_key(struct lock_class_key *key)
break;
}
}
- WARN_ON_ONCE(!found);
- __lockdep_free_key_range(pf, key, 1);
- call_rcu_zapped(pf);
- graph_unlock();
-out_irq:
+ WARN_ON_ONCE(!found && debug_locks);
+ if (found) {
+ pf = get_pending_free();
+ __lockdep_free_key_range(pf, key, 1);
+ call_rcu_zapped(pf);
+ }
+ lockdep_unlock();
raw_local_irq_restore(flags);
/* Wait until is_dynamic_key() has finished accessing k->hash_entry. */
[Public]
Hi,
Some OEM platforms containing AMD APU + AMD dGPU contain ACPI _PR3 objects that are mistakenly activating the wrong power management features.
This is fixed in mainline by the following commits that backport cleanly to 5.17.y:
commit 901e2be20dc5 ("drm/amdgpu: move PX checking into amdgpu_device_ip_early_init")
commit 85ac2021fe3ac ("drm/amdgpu: only check for _PR3 on dGPUs")
Can you please bring these to 5.17.y? They *do not* backport cleanly to earlier stable trees, and a separate backport will be submitted for those.
Thanks,
Make the two locations where exportfs helpers check permission to lookup
a given inode idmapped mount aware by switching it to the lookup_one()
helper. This is a bugfix for the open_by_handle_at() system call which
doesn't take idmapped mounts into account currently. It's not tied to a
specific commit so we'll just Cc stable.
In addition this is required to support idmapped base layers in overlay.
The overlay filesystem uses exportfs to encode and decode file handles
for its index=on mount option and when nfs_export=on.
Cc: <stable(a)vger.kernel.org>
Cc: <linux-fsdevel(a)vger.kernel.org>
Tested-by: Giuseppe Scrivano <gscrivan(a)redhat.com>
Reviewed-by: Amir Goldstein <amir73il(a)gmail.com>
Signed-off-by: Christian Brauner (Microsoft) <brauner(a)kernel.org>
---
fs/exportfs/expfs.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index 0106eba46d5a..3ef80d000e13 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -145,7 +145,7 @@ static struct dentry *reconnect_one(struct vfsmount *mnt,
if (err)
goto out_err;
dprintk("%s: found name: %s\n", __func__, nbuf);
- tmp = lookup_one_len_unlocked(nbuf, parent, strlen(nbuf));
+ tmp = lookup_one_unlocked(mnt_user_ns(mnt), nbuf, parent, strlen(nbuf));
if (IS_ERR(tmp)) {
dprintk("%s: lookup failed: %d\n", __func__, PTR_ERR(tmp));
err = PTR_ERR(tmp);
@@ -525,7 +525,8 @@ exportfs_decode_fh_raw(struct vfsmount *mnt, struct fid *fid, int fh_len,
}
inode_lock(target_dir->d_inode);
- nresult = lookup_one_len(nbuf, target_dir, strlen(nbuf));
+ nresult = lookup_one(mnt_user_ns(mnt), nbuf,
+ target_dir, strlen(nbuf));
if (!IS_ERR(nresult)) {
if (unlikely(nresult->d_inode != result->d_inode)) {
dput(nresult);
--
2.32.0
The bug is here:
return crtc;
The list iterator value 'crtc' will *always* be set and non-NULL by
list_for_each_entry(), so it is incorrect to assume that the iterator
value will be NULL if the list is empty or no element is found.
To fix the bug, return 'crtc' when found, otherwise return NULL.
Cc: stable(a)vger.kernel.org
fixes: 89c78134cc54d ("gma500: Add Poulsbo support")
Signed-off-by: Xiaomeng Tong <xiam0nd.tong(a)gmail.com>
---
drivers/gpu/drm/gma500/psb_intel_display.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/gma500/psb_intel_display.c b/drivers/gpu/drm/gma500/psb_intel_display.c
index d5f95212934e..42d1a733e124 100644
--- a/drivers/gpu/drm/gma500/psb_intel_display.c
+++ b/drivers/gpu/drm/gma500/psb_intel_display.c
@@ -535,14 +535,15 @@ void psb_intel_crtc_init(struct drm_device *dev, int pipe,
struct drm_crtc *psb_intel_get_crtc_from_pipe(struct drm_device *dev, int pipe)
{
- struct drm_crtc *crtc = NULL;
+ struct drm_crtc *crtc;
list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
struct gma_crtc *gma_crtc = to_gma_crtc(crtc);
+
if (gma_crtc->pipe == pipe)
- break;
+ return crtc;
}
- return crtc;
+ return NULL;
}
int gma_connector_clones(struct drm_device *dev, int type_mask)
--
2.17.1
The bug is here:
if (s->len != flen) {
The list iterator 's' will point to a bogus position containing
HEAD if the list is empty or no element is found. This case must
be checked before any use of the iterator, otherwise it may bpass
the 'if (s->len != flen) {' in theory iif s->len's value is flen,
or/and lead to an invalid memory access.
To fix this bug, use a new variable 'iter' as the list iterator,
while using the origin variable 's' as a dedicated pointer to
point to the found element. And if the list is empty or no element
is found, WARN_ON and return.
Cc: stable(a)vger.kernel.org
Fixes: ^1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Xiaomeng Tong <xiam0nd.tong(a)gmail.com>
---
changes since v2:
- WARN_ON and return (Sven Schnelle)
changes since v1:
- reallocate s when s == NULL (Sven Schnelle)
v1:https://lore.kernel.org/lkml/20220327064931.7775-1-xiam0nd.tong@gmail.co…v2:https://lore.kernel.org/lkml/20220328070543.24671-1-xiam0nd.tong@gmail.c…
---
drivers/s390/char/tty3270.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/s390/char/tty3270.c b/drivers/s390/char/tty3270.c
index 5c83f71c1d0e..9d0952178322 100644
--- a/drivers/s390/char/tty3270.c
+++ b/drivers/s390/char/tty3270.c
@@ -1109,9 +1109,9 @@ static void tty3270_put_character(struct tty3270 *tp, char ch)
static void
tty3270_convert_line(struct tty3270 *tp, int line_nr)
{
+ struct string *s = NULL, *n, *iter;
struct tty3270_line *line;
struct tty3270_cell *cell;
- struct string *s, *n;
unsigned char highlight;
unsigned char f_color;
char *cp;
@@ -1142,9 +1142,14 @@ tty3270_convert_line(struct tty3270 *tp, int line_nr)
/* Find the line in the list. */
i = tp->view.rows - 2 - line_nr;
- list_for_each_entry_reverse(s, &tp->lines, list)
- if (--i <= 0)
+ list_for_each_entry_reverse(iter, &tp->lines, list)
+ if (--i <= 0) {
+ s = iter;
break;
+ }
+
+ if(WARN_ON(!s))
+ return;
/*
* Check if the line needs to get reallocated.
*/
--
2.17.1
The bug is here:
if (!server ||
server->pnfs_curr_ld->id != dev->cbd_layout_type) {
The list iterator value 'server' will *always* be set and non-NULL
by list_for_each_entry_rcu, so it is incorrect to assume that the
iterator value will be NULL if the list is empty or no element is
found (In fact, it will be a bogus pointer to an invalid struct
object containing the HEAD, which is used for above check at next
outer loop). Otherwise it may bypass the check in theory (iif
server->pnfs_curr_ld->id == dev->cbd_layout_type, 'server' now is
a bogus pointer) and lead to invalid memory access passing the check.
To fix the bug, use a new variable 'iter' as the list iterator,
while use the original variable 'server' as a dedicated pointer to
point to the found element.
Cc: stable(a)vger.kernel.org
Fixes: 1be5683b03a76 ("pnfs: CB_NOTIFY_DEVICEID")
Signed-off-by: Xiaomeng Tong <xiam0nd.tong(a)gmail.com>
---
fs/nfs/callback_proc.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
index c343666d9a42..84779785dc8d 100644
--- a/fs/nfs/callback_proc.c
+++ b/fs/nfs/callback_proc.c
@@ -361,7 +361,7 @@ __be32 nfs4_callback_devicenotify(void *argp, void *resp,
uint32_t i;
__be32 res = 0;
struct nfs_client *clp = cps->clp;
- struct nfs_server *server = NULL;
+ struct nfs_server *server = NULL, *iter;
if (!clp) {
res = cpu_to_be32(NFS4ERR_OP_NOT_IN_SESSION);
@@ -374,10 +374,11 @@ __be32 nfs4_callback_devicenotify(void *argp, void *resp,
if (!server ||
server->pnfs_curr_ld->id != dev->cbd_layout_type) {
rcu_read_lock();
- list_for_each_entry_rcu(server, &clp->cl_superblocks, client_link)
- if (server->pnfs_curr_ld &&
- server->pnfs_curr_ld->id == dev->cbd_layout_type) {
+ list_for_each_entry_rcu(iter, &clp->cl_superblocks, client_link)
+ if (iter->pnfs_curr_ld &&
+ iter->pnfs_curr_ld->id == dev->cbd_layout_type) {
rcu_read_unlock();
+ server = iter;
goto found;
}
rcu_read_unlock();
--
2.17.1
The bug is here:
if (!server ||
server->pnfs_curr_ld->id != dev->cbd_layout_type) {
The list iterator value 'server' will *always* be set and non-NULL
by list_for_each_entry_rcu, so it is incorrect to assume that the
iterator value will be NULL if the list is empty or no element is
found (In fact, it will be a bogus pointer to an invalid struct
object containing the HEAD, which is used for above check at next
outer loop). Otherwise it may bypass the check in theory (if
server->pnfs_curr_ld->id == dev->cbd_layout_type, 'server' now is
a bogus pointer) and lead to invalid memory access passing the check.
Furthermore, even if we have a valid pointer, nothing pins the super
block, and so the struct nfs_server could end up getting freed while
we're using it.
Since all we want is a pointer to the struct pnfs_layoutdriver_type,
let's skip all the iteration over super blocks, and just use API to
find the layout driver directly. And to avoid use last found 'ld'
which may not exists any more, just call the API for every 'dev'.
At the same time, move the code to make the logic clearer.
Cc: stable(a)vger.kernel.org
Fixes: 1be5683b03a76 ("pnfs: CB_NOTIFY_DEVICEID")
Signed-off-by: Xiaomeng Tong <xiam0nd.tong(a)gmail.com>
---
changes since v1:
- use API to find the layout driver directly (Trond Myklebust)
- avoid use last found 'ld' (Xiaomeng Tong)
- code movement (Xiaomeng Tong)
v1:https://lore.kernel.org/lkml/20220327080230.12134-1-xiam0nd.tong@gmail.c…
---
fs/nfs/callback_proc.c | 32 ++++++++++----------------------
fs/nfs/pnfs.c | 5 +++++
fs/nfs/pnfs.h | 1 +
3 files changed, 16 insertions(+), 22 deletions(-)
diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
index c343666d9a42..579887749870 100644
--- a/fs/nfs/callback_proc.c
+++ b/fs/nfs/callback_proc.c
@@ -358,39 +358,27 @@ __be32 nfs4_callback_devicenotify(void *argp, void *resp,
struct cb_process_state *cps)
{
struct cb_devicenotifyargs *args = argp;
+ const struct pnfs_layoutdriver_type *ld;
uint32_t i;
- __be32 res = 0;
- struct nfs_client *clp = cps->clp;
- struct nfs_server *server = NULL;
- if (!clp) {
- res = cpu_to_be32(NFS4ERR_OP_NOT_IN_SESSION);
- goto out;
+ if (!cps->clp) {
+ kfree(args->devs);
+ return cpu_to_be32(NFS4ERR_OP_NOT_IN_SESSION);
}
for (i = 0; i < args->ndevs; i++) {
struct cb_devicenotifyitem *dev = &args->devs[i];
- if (!server ||
- server->pnfs_curr_ld->id != dev->cbd_layout_type) {
- rcu_read_lock();
- list_for_each_entry_rcu(server, &clp->cl_superblocks, client_link)
- if (server->pnfs_curr_ld &&
- server->pnfs_curr_ld->id == dev->cbd_layout_type) {
- rcu_read_unlock();
- goto found;
- }
- rcu_read_unlock();
- continue;
+ ld = pnfs_find_layoutdriver(dev->cbd_layout_type);
+ if (ld) {
+ nfs4_delete_deviceid(ld, cps->clp,
+ &dev->cbd_dev_id);
+ module_put(ld->owner);
}
-
- found:
- nfs4_delete_deviceid(server->pnfs_curr_ld, clp, &dev->cbd_dev_id);
}
-out:
kfree(args->devs);
- return res;
+ return 0;
}
/*
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 7c9090a28e5c..112c36977feb 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -92,6 +92,11 @@ find_pnfs_driver(u32 id)
return local;
}
+const struct pnfs_layoutdriver_type *pnfs_find_layoutdriver(u32 id)
+{
+ return find_pnfs_driver(id);
+}
+
void
unset_pnfs_layoutdriver(struct nfs_server *nfss)
{
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index f4d7548d67b2..873ea8fe945b 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -234,6 +234,7 @@ struct pnfs_devicelist {
extern int pnfs_register_layoutdriver(struct pnfs_layoutdriver_type *);
extern void pnfs_unregister_layoutdriver(struct pnfs_layoutdriver_type *);
+extern const struct pnfs_layoutdriver_type *pnfs_find_layoutdriver(u32 id);
/* nfs4proc.c */
extern size_t max_response_pages(struct nfs_server *server);
--
2.17.1
The bug is here:
mt8195_etdm_hw_params_fixup(runtime, params);
For the for_each_card_rtds(), just like list_for_each_entry(),
the list iterator 'runtime' will point to a bogus position
containing HEAD if the list is empty or no element is found.
This case must be checked before any use of the iterator,
otherwise it will lead to a invalid memory access.
To fix the bug, use a new variable 'iter' as the list iterator,
while use the original variable 'runtime' as a dedicated pointer
to point to the found element.
Cc: stable(a)vger.kernel.org
Fixes: 3d00d2c07f04f ("ASoC: mediatek: mt8195: add sof support on mt8195-mt6359-rt1019-rt5682")
Signed-off-by: Xiaomeng Tong <xiam0nd.tong(a)gmail.com>
---
.../mediatek/mt8195/mt8195-mt6359-rt1019-rt5682.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/sound/soc/mediatek/mt8195/mt8195-mt6359-rt1019-rt5682.c b/sound/soc/mediatek/mt8195/mt8195-mt6359-rt1019-rt5682.c
index 29c2d3407cc7..dc91877e4c3c 100644
--- a/sound/soc/mediatek/mt8195/mt8195-mt6359-rt1019-rt5682.c
+++ b/sound/soc/mediatek/mt8195/mt8195-mt6359-rt1019-rt5682.c
@@ -814,7 +814,7 @@ static int mt8195_dai_link_fixup(struct snd_soc_pcm_runtime *rtd,
{
struct snd_soc_card *card = rtd->card;
struct snd_soc_dai_link *sof_dai_link = NULL;
- struct snd_soc_pcm_runtime *runtime;
+ struct snd_soc_pcm_runtime *runtime = NULL, *iter;
struct snd_soc_dai *cpu_dai;
int i, j, ret = 0;
@@ -824,16 +824,17 @@ static int mt8195_dai_link_fixup(struct snd_soc_pcm_runtime *rtd,
if (strcmp(rtd->dai_link->name, conn->normal_link))
continue;
- for_each_card_rtds(card, runtime) {
- if (strcmp(runtime->dai_link->name, conn->sof_link))
+ for_each_card_rtds(card, iter) {
+ if (strcmp(iter->dai_link->name, conn->sof_link))
continue;
- for_each_rtd_cpu_dais(runtime, j, cpu_dai) {
+ for_each_rtd_cpu_dais(iter, j, cpu_dai) {
if (cpu_dai->stream_active[conn->stream_dir] > 0) {
- sof_dai_link = runtime->dai_link;
+ sof_dai_link = iter->dai_link;
break;
}
}
+ runtime = iter;
break;
}
@@ -845,7 +846,8 @@ static int mt8195_dai_link_fixup(struct snd_soc_pcm_runtime *rtd,
if (!strcmp(rtd->dai_link->name, "ETDM2_IN_BE") ||
!strcmp(rtd->dai_link->name, "ETDM1_OUT_BE")) {
- mt8195_etdm_hw_params_fixup(runtime, params);
+ if (runtime)
+ mt8195_etdm_hw_params_fixup(runtime, params);
}
return ret;
--
2.17.1
These three bugs are here:
struct gbaudio_data_connection *data;
If the list '&codec->module_list' is empty then the 'data' will
keep unchanged. However, the 'data' is not initialized and filled
with trash value. As a result, if the value is not NULL, the check
'if (!data) {' will always be false and never exit expectly.
To fix these bug, just initialize 'data' with NULL.
Cc: stable(a)vger.kernel.org
Fixes: 6dd67645f22cf ("greybus: audio: Use single codec driver registration")
Signed-off-by: Xiaomeng Tong <xiam0nd.tong(a)gmail.com>
---
drivers/staging/greybus/audio_codec.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/staging/greybus/audio_codec.c b/drivers/staging/greybus/audio_codec.c
index b589cf6b1d03..939e05af4dcf 100644
--- a/drivers/staging/greybus/audio_codec.c
+++ b/drivers/staging/greybus/audio_codec.c
@@ -397,7 +397,7 @@ static int gbcodec_hw_params(struct snd_pcm_substream *substream,
u8 sig_bits, channels;
u32 format, rate;
struct gbaudio_module_info *module;
- struct gbaudio_data_connection *data;
+ struct gbaudio_data_connection *data = NULL;
struct gb_bundle *bundle;
struct gbaudio_codec_info *codec = dev_get_drvdata(dai->dev);
struct gbaudio_stream_params *params;
@@ -498,7 +498,7 @@ static int gbcodec_prepare(struct snd_pcm_substream *substream,
{
int ret;
struct gbaudio_module_info *module;
- struct gbaudio_data_connection *data;
+ struct gbaudio_data_connection *data = NULL;
struct gb_bundle *bundle;
struct gbaudio_codec_info *codec = dev_get_drvdata(dai->dev);
struct gbaudio_stream_params *params;
@@ -562,7 +562,7 @@ static int gbcodec_prepare(struct snd_pcm_substream *substream,
static int gbcodec_mute_stream(struct snd_soc_dai *dai, int mute, int stream)
{
int ret;
- struct gbaudio_data_connection *data;
+ struct gbaudio_data_connection *data = NULL;
struct gbaudio_module_info *module;
struct gb_bundle *bundle;
struct gbaudio_codec_info *codec = dev_get_drvdata(dai->dev);
--
2.17.1
These two bug are here:
list_for_each_entry_safe_continue(w, n, list,
power_list);
list_for_each_entry_safe_continue(w, n, list,
power_list);
After the list_for_each_entry_safe_continue() exits, the list iterator
will always be a bogus pointer which point to an invalid struct objdect
containing HEAD member. The funciton poniter 'w->event' will be a
invalid value which can lead to a control-flow hijack if the 'w' can be
controlled.
The original intention was to break the outer list_for_each_entry_safe()
loop if w->event is NULL, but forgot to *break* switch statement first.
So just add a break to fix the bug.
Cc: stable(a)vger.kernel.org
Fixes: 163cac061c973 ("ASoC: Factor out DAPM sequence execution")
Signed-off-by: Xiaomeng Tong <xiam0nd.tong(a)gmail.com>
---
sound/soc/soc-dapm.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/sound/soc/soc-dapm.c b/sound/soc/soc-dapm.c
index b06c5682445c..2a5a64d21856 100644
--- a/sound/soc/soc-dapm.c
+++ b/sound/soc/soc-dapm.c
@@ -1686,9 +1686,11 @@ static void dapm_seq_run(struct snd_soc_card *card,
switch (w->id) {
case snd_soc_dapm_pre:
- if (!w->event)
+ if (!w->event) {
list_for_each_entry_safe_continue(w, n, list,
power_list);
+ break;
+ }
if (event == SND_SOC_DAPM_STREAM_START)
ret = w->event(w,
@@ -1699,9 +1701,11 @@ static void dapm_seq_run(struct snd_soc_card *card,
break;
case snd_soc_dapm_post:
- if (!w->event)
+ if (!w->event) {
list_for_each_entry_safe_continue(w, n, list,
power_list);
+ break;
+ }
if (event == SND_SOC_DAPM_STREAM_START)
ret = w->event(w,
--
2.17.1
The patch titled
Subject: mm/kmemleak: Reset tag when compare object pointer
has been added to the -mm tree. Its filename is
mm-kmemleak-reset-tag-when-compare-object-pointer.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-kmemleak-reset-tag-when-compar…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-kmemleak-reset-tag-when-compar…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Kuan-Ying Lee <Kuan-Ying.Lee(a)mediatek.com>
Subject: mm/kmemleak: Reset tag when compare object pointer
When we use HW-tag based kasan and enable vmalloc support, we hit
the following bug. It is due to comparison between tagged object
and non-tagged pointer.
We need to reset the kasan tag when we need to compare tagged object
and non-tagged pointer.
[ 7.690429][T400001] init: kmemleak: [name:kmemleak&]Scan area larger than object 0xffffffe77076f440
[ 7.691762][T400001] init: CPU: 4 PID: 1 Comm: init Tainted: G S W 5.15.25-android13-0-g5cacf919c2bc #1
[ 7.693218][T400001] init: Hardware name: MT6983(ENG) (DT)
[ 7.693983][T400001] init: Call trace:
[ 7.694508][T400001] init: dump_backtrace.cfi_jt+0x0/0x8
[ 7.695272][T400001] init: dump_stack_lvl+0xac/0x120
[ 7.695985][T400001] init: add_scan_area+0xc4/0x244
[ 7.696685][T400001] init: kmemleak_scan_area+0x40/0x9c
[ 7.697428][T400001] init: layout_and_allocate+0x1e8/0x288
[ 7.698211][T400001] init: load_module+0x2c8/0xf00
[ 7.698895][T400001] init: __se_sys_finit_module+0x190/0x1d0
[ 7.699701][T400001] init: __arm64_sys_finit_module+0x20/0x30
[ 7.700517][T400001] init: invoke_syscall+0x60/0x170
[ 7.701225][T400001] init: el0_svc_common+0xc8/0x114
[ 7.701933][T400001] init: do_el0_svc+0x28/0xa0
[ 7.702580][T400001] init: el0_svc+0x60/0xf8
[ 7.703196][T400001] init: el0t_64_sync_handler+0x88/0xec
[ 7.703964][T400001] init: el0t_64_sync+0x1b4/0x1b8
[ 7.704658][T400001] init: kmemleak: [name:kmemleak&]Object 0xf5ffffe77076b000 (size 32768):
[ 7.705824][T400001] init: kmemleak: [name:kmemleak&] comm "init", pid 1, jiffies 4294894197
[ 7.707002][T400001] init: kmemleak: [name:kmemleak&] min_count = 0
[ 7.707886][T400001] init: kmemleak: [name:kmemleak&] count = 0
[ 7.708718][T400001] init: kmemleak: [name:kmemleak&] flags = 0x1
[ 7.709574][T400001] init: kmemleak: [name:kmemleak&] checksum = 0
[ 7.710440][T400001] init: kmemleak: [name:kmemleak&] backtrace:
[ 7.711284][T400001] init: module_alloc+0x9c/0x120
[ 7.712015][T400001] init: move_module+0x34/0x19c
[ 7.712735][T400001] init: layout_and_allocate+0x1c4/0x288
[ 7.713561][T400001] init: load_module+0x2c8/0xf00
[ 7.714291][T400001] init: __se_sys_finit_module+0x190/0x1d0
[ 7.715142][T400001] init: __arm64_sys_finit_module+0x20/0x30
[ 7.716004][T400001] init: invoke_syscall+0x60/0x170
[ 7.716758][T400001] init: el0_svc_common+0xc8/0x114
[ 7.717512][T400001] init: do_el0_svc+0x28/0xa0
[ 7.718207][T400001] init: el0_svc+0x60/0xf8
[ 7.718869][T400001] init: el0t_64_sync_handler+0x88/0xec
[ 7.719683][T400001] init: el0t_64_sync+0x1b4/0x1b8
Link: https://lkml.kernel.org/r/20220318034051.30687-1-Kuan-Ying.Lee@mediatek.com
Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee(a)mediatek.com>
Reviewed-by: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Matthias Brugger <matthias.bgg(a)gmail.com>
Cc: Chinwen Chang <chinwen.chang(a)mediatek.com>
Cc: Nicholas Tang <nicholas.tang(a)mediatek.com>
Cc: Yee Lee <yee.lee(a)mediatek.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/kmemleak.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
--- a/mm/kmemleak.c~mm-kmemleak-reset-tag-when-compare-object-pointer
+++ a/mm/kmemleak.c
@@ -796,6 +796,8 @@ static void add_scan_area(unsigned long
unsigned long flags;
struct kmemleak_object *object;
struct kmemleak_scan_area *area = NULL;
+ unsigned long untagged_ptr;
+ unsigned long untagged_objp;
object = find_and_get_object(ptr, 1);
if (!object) {
@@ -804,6 +806,9 @@ static void add_scan_area(unsigned long
return;
}
+ untagged_ptr = (unsigned long)kasan_reset_tag((void *)ptr);
+ untagged_objp = (unsigned long)kasan_reset_tag((void *)object->pointer);
+
if (scan_area_cache)
area = kmem_cache_alloc(scan_area_cache, gfp_kmemleak_mask(gfp));
@@ -815,8 +820,8 @@ static void add_scan_area(unsigned long
goto out_unlock;
}
if (size == SIZE_MAX) {
- size = object->pointer + object->size - ptr;
- } else if (ptr + size > object->pointer + object->size) {
+ size = untagged_objp + object->size - untagged_ptr;
+ } else if (untagged_ptr + size > untagged_objp + object->size) {
kmemleak_warn("Scan area larger than object 0x%08lx\n", ptr);
dump_object_info(object);
kmem_cache_free(scan_area_cache, area);
_
Patches currently in -mm which might be from Kuan-Ying.Lee(a)mediatek.com are
mm-kmemleak-reset-tag-when-compare-object-pointer.patch
Hi,
commit 92833e8b5db6c209e9311ac8c6a44d3bf1856659 breaks the
build of sch_* modules in stable.
I already have:
#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 20, 0)
qdisc_destroy(cl->leaf.q);
#else
qdisc_put(cl->leaf.q);
#endif
But this makes it more tricky… or can I “just” change this
to KERNEL_VERSION(4, 19, 221) ?
Nevertheless, renaming functions isn’t something I’d expect
to happen in stable. At least add a #define or so redirecting
from the old/stable name…
bye,
//mirabilos
--
Infrastrukturexperte • tarent solutions GmbH
Am Dickobskreuz 10, D-53121 Bonn • http://www.tarent.de/
Telephon +49 228 54881-393 • Fax: +49 228 54881-235
HRB AG Bonn 5168 • USt-ID (VAT): DE122264941
Geschäftsführer: Dr. Stefan Barth, Kai Ebenrett, Boris Esser, Alexander Steeg
****************************************************
/⁀\ The UTF-8 Ribbon
╲ ╱ Campaign against Mit dem tarent-Newsletter nichts mehr verpassen:
╳ HTML eMail! Also, https://www.tarent.de/newsletter
╱ ╲ header encryption!
****************************************************
The patch titled
Subject: mm,hwpoison: unmap poisoned page before invalidation
has been added to the -mm tree. Its filename is
mmhwpoison-unmap-poisoned-page-before-invalidation.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mmhwpoison-unmap-poisoned-page-be…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mmhwpoison-unmap-poisoned-page-be…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Rik van Riel <riel(a)surriel.com>
Subject: mm,hwpoison: unmap poisoned page before invalidation
In some cases it appears the invalidation of a hwpoisoned page fails
because the page is still mapped in another process. This can cause a
program to be continuously restarted and die when it page faults on the
page that was not invalidated. Avoid that problem by unmapping the
hwpoisoned page when we find it.
Another issue is that sometimes we end up oopsing in finish_fault, if the
code tries to do something with the now-NULL vmf->page. I did not hit
this error when submitting the previous patch because there are several
opportunities for alloc_set_pte to bail out before accessing vmf->page,
and that apparently happened on those systems, and most of the time on
other systems, too.
However, across several million systems that error does occur a handful of
times a day. It can be avoided by returning VM_FAULT_NOPAGE which will
cause do_read_fault to return before calling finish_fault.
Link: https://lkml.kernel.org/r/20220325161428.5068d97e@imladris.surriel.com
Fixes: e53ac7374e64 ("mm: invalidate hwpoison page cache page in fault path")
Reviewed-by: Miaohe Lin <linmiaohe(a)huawei.com>
Tested-by: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
--- a/mm/memory.c~mmhwpoison-unmap-poisoned-page-before-invalidation
+++ a/mm/memory.c
@@ -3918,14 +3918,18 @@ static vm_fault_t __do_fault(struct vm_f
return ret;
if (unlikely(PageHWPoison(vmf->page))) {
+ struct page *page = vmf->page;
vm_fault_t poisonret = VM_FAULT_HWPOISON;
if (ret & VM_FAULT_LOCKED) {
+ if (page_mapped(page))
+ unmap_mapping_pages(page_mapping(page),
+ page->index, 1, false);
/* Retry if a clean page was removed from the cache. */
- if (invalidate_inode_page(vmf->page))
- poisonret = 0;
- unlock_page(vmf->page);
+ if (invalidate_inode_page(page))
+ poisonret = VM_FAULT_NOPAGE;
+ unlock_page(page);
}
- put_page(vmf->page);
+ put_page(page);
vmf->page = NULL;
return poisonret;
}
_
Patches currently in -mm which might be from riel(a)surriel.com are
mmhwpoison-unmap-poisoned-page-before-invalidation.patch
From: Theodore Ts'o <tytso(a)mit.edu>
[ Upstream commit cc5095747edfb054ca2068d01af20be3fcc3634f ]
[un]pin_user_pages_remote is dirtying pages without properly warning
the file system in advance. A related race was noted by Jan Kara in
2018[1]; however, more recently instead of it being a very hard-to-hit
race, it could be reliably triggered by process_vm_writev(2) which was
discovered by Syzbot[2].
This is technically a bug in mm/gup.c, but arguably ext4 is fragile in
that if some other kernel subsystem dirty pages without properly
notifying the file system using page_mkwrite(), ext4 will BUG, while
other file systems will not BUG (although data will still be lost).
So instead of crashing with a BUG, issue a warning (since there may be
potential data loss) and just mark the page as clean to avoid
unprivileged denial of service attacks until the problem can be
properly fixed. More discussion and background can be found in the
thread starting at [2].
[1] https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz
[2] https://lore.kernel.org/r/Yg0m6IjcNmfaSokM@google.com
Reported-by: syzbot+d59332e2db681cf18f0318a06e994ebbb529a8db(a)syzkaller.appspotmail.com
Reported-by: Lee Jones <lee.jones(a)linaro.org>
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Link: https://lore.kernel.org/r/YiDS9wVfq4mM2jGK@mit.edu
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
fs/ext4/inode.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 79c067f74253..e66aa8918dee 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2048,6 +2048,15 @@ static int ext4_writepage(struct page *page,
else
len = PAGE_SIZE;
+ /* Should never happen but for bugs in other kernel subsystems */
+ if (!page_has_buffers(page)) {
+ ext4_warning_inode(inode,
+ "page %lu does not have buffers attached", page->index);
+ ClearPageDirty(page);
+ unlock_page(page);
+ return 0;
+ }
+
page_bufs = page_buffers(page);
/*
* We cannot do block allocation or other extent handling in this
@@ -2608,6 +2617,22 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
wait_on_page_writeback(page);
BUG_ON(PageWriteback(page));
+ /*
+ * Should never happen but for buggy code in
+ * other subsystems that call
+ * set_page_dirty() without properly warning
+ * the file system first. See [1] for more
+ * information.
+ *
+ * [1] https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz
+ */
+ if (!page_has_buffers(page)) {
+ ext4_warning_inode(mpd->inode, "page %lu does not have buffers attached", page->index);
+ ClearPageDirty(page);
+ unlock_page(page);
+ continue;
+ }
+
if (mpd->map.m_len == 0)
mpd->first_page = page->index;
mpd->next_page = page->index + 1;
--
2.34.1
From: Theodore Ts'o <tytso(a)mit.edu>
[ Upstream commit cc5095747edfb054ca2068d01af20be3fcc3634f ]
[un]pin_user_pages_remote is dirtying pages without properly warning
the file system in advance. A related race was noted by Jan Kara in
2018[1]; however, more recently instead of it being a very hard-to-hit
race, it could be reliably triggered by process_vm_writev(2) which was
discovered by Syzbot[2].
This is technically a bug in mm/gup.c, but arguably ext4 is fragile in
that if some other kernel subsystem dirty pages without properly
notifying the file system using page_mkwrite(), ext4 will BUG, while
other file systems will not BUG (although data will still be lost).
So instead of crashing with a BUG, issue a warning (since there may be
potential data loss) and just mark the page as clean to avoid
unprivileged denial of service attacks until the problem can be
properly fixed. More discussion and background can be found in the
thread starting at [2].
[1] https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz
[2] https://lore.kernel.org/r/Yg0m6IjcNmfaSokM@google.com
Reported-by: syzbot+d59332e2db681cf18f0318a06e994ebbb529a8db(a)syzkaller.appspotmail.com
Reported-by: Lee Jones <lee.jones(a)linaro.org>
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Link: https://lore.kernel.org/r/YiDS9wVfq4mM2jGK@mit.edu
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
fs/ext4/inode.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 9c07c8674b21..4d3eefff3c84 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2147,6 +2147,15 @@ static int ext4_writepage(struct page *page,
else
len = PAGE_SIZE;
+ /* Should never happen but for bugs in other kernel subsystems */
+ if (!page_has_buffers(page)) {
+ ext4_warning_inode(inode,
+ "page %lu does not have buffers attached", page->index);
+ ClearPageDirty(page);
+ unlock_page(page);
+ return 0;
+ }
+
page_bufs = page_buffers(page);
/*
* We cannot do block allocation or other extent handling in this
@@ -2706,6 +2715,22 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
wait_on_page_writeback(page);
BUG_ON(PageWriteback(page));
+ /*
+ * Should never happen but for buggy code in
+ * other subsystems that call
+ * set_page_dirty() without properly warning
+ * the file system first. See [1] for more
+ * information.
+ *
+ * [1] https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz
+ */
+ if (!page_has_buffers(page)) {
+ ext4_warning_inode(mpd->inode, "page %lu does not have buffers attached", page->index);
+ ClearPageDirty(page);
+ unlock_page(page);
+ continue;
+ }
+
if (mpd->map.m_len == 0)
mpd->first_page = page->index;
mpd->next_page = page->index + 1;
--
2.34.1
From: Theodore Ts'o <tytso(a)mit.edu>
[ Upstream commit cc5095747edfb054ca2068d01af20be3fcc3634f ]
[un]pin_user_pages_remote is dirtying pages without properly warning
the file system in advance. A related race was noted by Jan Kara in
2018[1]; however, more recently instead of it being a very hard-to-hit
race, it could be reliably triggered by process_vm_writev(2) which was
discovered by Syzbot[2].
This is technically a bug in mm/gup.c, but arguably ext4 is fragile in
that if some other kernel subsystem dirty pages without properly
notifying the file system using page_mkwrite(), ext4 will BUG, while
other file systems will not BUG (although data will still be lost).
So instead of crashing with a BUG, issue a warning (since there may be
potential data loss) and just mark the page as clean to avoid
unprivileged denial of service attacks until the problem can be
properly fixed. More discussion and background can be found in the
thread starting at [2].
[1] https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz
[2] https://lore.kernel.org/r/Yg0m6IjcNmfaSokM@google.com
Reported-by: syzbot+d59332e2db681cf18f0318a06e994ebbb529a8db(a)syzkaller.appspotmail.com
Reported-by: Lee Jones <lee.jones(a)linaro.org>
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Link: https://lore.kernel.org/r/YiDS9wVfq4mM2jGK@mit.edu
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
fs/ext4/inode.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index dcbd8ac8d471..0d62f05f8925 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2161,6 +2161,15 @@ static int ext4_writepage(struct page *page,
else
len = PAGE_SIZE;
+ /* Should never happen but for bugs in other kernel subsystems */
+ if (!page_has_buffers(page)) {
+ ext4_warning_inode(inode,
+ "page %lu does not have buffers attached", page->index);
+ ClearPageDirty(page);
+ unlock_page(page);
+ return 0;
+ }
+
page_bufs = page_buffers(page);
/*
* We cannot do block allocation or other extent handling in this
@@ -2710,6 +2719,22 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
wait_on_page_writeback(page);
BUG_ON(PageWriteback(page));
+ /*
+ * Should never happen but for buggy code in
+ * other subsystems that call
+ * set_page_dirty() without properly warning
+ * the file system first. See [1] for more
+ * information.
+ *
+ * [1] https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz
+ */
+ if (!page_has_buffers(page)) {
+ ext4_warning_inode(mpd->inode, "page %lu does not have buffers attached", page->index);
+ ClearPageDirty(page);
+ unlock_page(page);
+ continue;
+ }
+
if (mpd->map.m_len == 0)
mpd->first_page = page->index;
mpd->next_page = page->index + 1;
--
2.34.1
From: Theodore Ts'o <tytso(a)mit.edu>
[ Upstream commit cc5095747edfb054ca2068d01af20be3fcc3634f ]
[un]pin_user_pages_remote is dirtying pages without properly warning
the file system in advance. A related race was noted by Jan Kara in
2018[1]; however, more recently instead of it being a very hard-to-hit
race, it could be reliably triggered by process_vm_writev(2) which was
discovered by Syzbot[2].
This is technically a bug in mm/gup.c, but arguably ext4 is fragile in
that if some other kernel subsystem dirty pages without properly
notifying the file system using page_mkwrite(), ext4 will BUG, while
other file systems will not BUG (although data will still be lost).
So instead of crashing with a BUG, issue a warning (since there may be
potential data loss) and just mark the page as clean to avoid
unprivileged denial of service attacks until the problem can be
properly fixed. More discussion and background can be found in the
thread starting at [2].
[1] https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz
[2] https://lore.kernel.org/r/Yg0m6IjcNmfaSokM@google.com
Reported-by: syzbot+d59332e2db681cf18f0318a06e994ebbb529a8db(a)syzkaller.appspotmail.com
Reported-by: Lee Jones <lee.jones(a)linaro.org>
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Link: https://lore.kernel.org/r/YiDS9wVfq4mM2jGK@mit.edu
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
fs/ext4/inode.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 7959aae4857e..96cf0f57ca95 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2148,6 +2148,15 @@ static int ext4_writepage(struct page *page,
else
len = PAGE_SIZE;
+ /* Should never happen but for bugs in other kernel subsystems */
+ if (!page_has_buffers(page)) {
+ ext4_warning_inode(inode,
+ "page %lu does not have buffers attached", page->index);
+ ClearPageDirty(page);
+ unlock_page(page);
+ return 0;
+ }
+
page_bufs = page_buffers(page);
/*
* We cannot do block allocation or other extent handling in this
@@ -2697,6 +2706,22 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
wait_on_page_writeback(page);
BUG_ON(PageWriteback(page));
+ /*
+ * Should never happen but for buggy code in
+ * other subsystems that call
+ * set_page_dirty() without properly warning
+ * the file system first. See [1] for more
+ * information.
+ *
+ * [1] https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz
+ */
+ if (!page_has_buffers(page)) {
+ ext4_warning_inode(mpd->inode, "page %lu does not have buffers attached", page->index);
+ ClearPageDirty(page);
+ unlock_page(page);
+ continue;
+ }
+
if (mpd->map.m_len == 0)
mpd->first_page = page->index;
mpd->next_page = page->index + 1;
--
2.34.1
I should have said 'predicate', not 'axiom'. Please review the
rigorous proof of correctness as listed on my github. Another issue
I'd like to bring up in 5.17 is the apparent lack of mathematical
proofs and overuse of the word "entropy" when we should be using the
word uniqueness - do we really want to go live with something that
isn't documented at all and just trust the word of any one individual?
My team put together mathematical proof of correctness, the onerious
of proving a defect in the implementation is up to individuals who run
the code and observe a failing - that is the point of code review.
The proposed changes to the pool means it no longer needs to be
"drained" or "accumulated", however we still may need a lock to make
sure the /dev/random device driver is ready for use prior to access.
Although I am actually not entirely convinced that this lock is needed
because it directly impacts boot time because of KSLR. We could lock
and wait for the slow long term storage to wake up and feed us the
seed - but is that really necessary? The alternate_rand() function in
the provided patch should be sufficient for the vast majority of
non-paranoid users, I don't see how a would-be adversary could
undermine this construction - sure it isn't ideally secure as a fully
loaded keypool - but it's good enough to thwart a memory corruption
exploit. A fast boot option is preferable IMHO.
Regards,
Mike
On Mon, Mar 28, 2022 at 9:16 AM Michael Brooks <m(a)sweetwater.ai> wrote:
>
> Jason,
>
> I think this is an astute observation - the current design of the pool has this defect of being emptied leading to deadlocks, and in addition, access to it undermines the predictability of the pool. This needs to be fundamentally addressed.
>
> What if access to the pool could never undermine the predictability of the pool state? If this axiom holds, then there would never be a reason to keep a global counter for how much “entropy” or uniqueness is in the pool - and that would remove the lock that handle_irq_event_percpu() is fighting over... which creates a condition where one lock is being battled over by every irq event - which is made worse as the machine has additional CPUs. This one lock is so bad, it is basically a GIL, and we need to have a serious discussion on how to remove this one lock.
>
> This may sound far fetched, but there is actually some clever cryptography that can provide a good solution here. What if the pool was not a linear structure FILO structure, but actually a jump table? What if access to this jump table was randomized, so that no two callers could take the same path through the table? What if in addition, upon read - the reader "covers their tracks" and modifies their entrypoint into the table to prevent any other caller from being able to follow the same path? This is exactly what keypoolrandom does.
>
> Now, if the output of this jump table is used as an input to cryptographic primitives such as an ideal block cipher or hash function - then even a single bit flip would cause an avalanche which dramatically affects the output - and also obscures how it was generated. Therefore, access using this method could never undermine the pool state - therefore this speciture structure never "drains" but rather becomes less and less predictable over time.
>
> This kind of jump table has been written and is called https://github.com/TheRook/KeypoolRandom. This took three years to write, has been peer reviewed and has exquisite documentation, if you take the time to read over this code and the docs - I think you'll enjoy it. This project is very easy to compile and to run - it is the software equivalent of a breakout board, and doesn't require a full kernel compile to see how it functions.
>
> Regards,
> Michael Brooks
>
>
> On Mon, Mar 28, 2022 at 4:20 AM Sasha Levin <sashal(a)kernel.org> wrote:
>>
>> From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
>>
>> [ Upstream commit 6e8ec2552c7d13991148e551e3325a624d73fac6 ]
>>
>> The current 4096-bit LFSR used for entropy collection had a few
>> desirable attributes for the context in which it was created. For
>> example, the state was huge, which meant that /dev/random would be able
>> to output quite a bit of accumulated entropy before blocking. It was
>> also, in its time, quite fast at accumulating entropy byte-by-byte,
>> which matters given the varying contexts in which mix_pool_bytes() is
>> called. And its diffusion was relatively high, which meant that changes
>> would ripple across several words of state rather quickly.
>>
>> However, it also suffers from a few security vulnerabilities. In
>> particular, inputs learned by an attacker can be undone, but moreover,
>> if the state of the pool leaks, its contents can be controlled and
>> entirely zeroed out. I've demonstrated this attack with this SMT2
>> script, <https://xn--4db.cc/5o9xO8pb>, which Boolector/CaDiCal solves in
>> a matter of seconds on a single core of my laptop, resulting in little
>> proof of concept C demonstrators such as <https://xn--4db.cc/jCkvvIaH/c>.
>>
>> For basically all recent formal models of RNGs, these attacks represent
>> a significant cryptographic flaw. But how does this manifest
>> practically? If an attacker has access to the system to such a degree
>> that he can learn the internal state of the RNG, arguably there are
>> other lower hanging vulnerabilities -- side-channel, infoleak, or
>> otherwise -- that might have higher priority. On the other hand, seed
>> files are frequently used on systems that have a hard time generating
>> much entropy on their own, and these seed files, being files, often leak
>> or are duplicated and distributed accidentally, or are even seeded over
>> the Internet intentionally, where their contents might be recorded or
>> tampered with. Seen this way, an otherwise quasi-implausible
>> vulnerability is a bit more practical than initially thought.
>>
>> Another aspect of the current mix_pool_bytes() function is that, while
>> its performance was arguably competitive for the time in which it was
>> created, it's no longer considered so. This patch improves performance
>> significantly: on a high-end CPU, an i7-11850H, it improves performance
>> of mix_pool_bytes() by 225%, and on a low-end CPU, a Cortex-A7, it
>> improves performance by 103%.
>>
>> This commit replaces the LFSR of mix_pool_bytes() with a straight-
>> forward cryptographic hash function, BLAKE2s, which is already in use
>> for pool extraction. Universal hashing with a secret seed was considered
>> too, something along the lines of <https://eprint.iacr.org/2013/338>,
>> but the requirement for a secret seed makes for a chicken & egg problem.
>> Instead we go with a formally proven scheme using a computational hash
>> function, described in sections 5.1, 6.4, and B.1.8 of
>> <https://eprint.iacr.org/2019/198>.
>>
>> BLAKE2s outputs 256 bits, which should give us an appropriate amount of
>> min-entropy accumulation, and a wide enough margin of collision
>> resistance against active attacks. mix_pool_bytes() becomes a simple
>> call to blake2s_update(), for accumulation, while the extraction step
>> becomes a blake2s_final() to generate a seed, with which we can then do
>> a HKDF-like or BLAKE2X-like expansion, the first part of which we fold
>> back as an init key for subsequent blake2s_update()s, and the rest we
>> produce to the caller. This then is provided to our CRNG like usual. In
>> that expansion step, we make opportunistic use of 32 bytes of RDRAND
>> output, just as before. We also always reseed the crng with 32 bytes,
>> unconditionally, or not at all, rather than sometimes with 16 as before,
>> as we don't win anything by limiting beyond the 16 byte threshold.
>>
>> Going for a hash function as an entropy collector is a conservative,
>> proven approach. The result of all this is a much simpler and much less
>> bespoke construction than what's there now, which not only plugs a
>> vulnerability but also improves performance considerably.
>>
>> Cc: Theodore Ts'o <tytso(a)mit.edu>
>> Cc: Dominik Brodowski <linux(a)dominikbrodowski.net>
>> Reviewed-by: Eric Biggers <ebiggers(a)google.com>
>> Reviewed-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
>> Reviewed-by: Jean-Philippe Aumasson <jeanphilippe.aumasson(a)gmail.com>
>> Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
>> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>> ---
>> drivers/char/random.c | 304 ++++++++----------------------------------
>> 1 file changed, 55 insertions(+), 249 deletions(-)
>>
>> diff --git a/drivers/char/random.c b/drivers/char/random.c
>> index 3404a91edf29..882f78829a24 100644
>> --- a/drivers/char/random.c
>> +++ b/drivers/char/random.c
>> @@ -42,61 +42,6 @@
>> */
>>
>> /*
>> - * (now, with legal B.S. out of the way.....)
>> - *
>> - * This routine gathers environmental noise from device drivers, etc.,
>> - * and returns good random numbers, suitable for cryptographic use.
>> - * Besides the obvious cryptographic uses, these numbers are also good
>> - * for seeding TCP sequence numbers, and other places where it is
>> - * desirable to have numbers which are not only random, but hard to
>> - * predict by an attacker.
>> - *
>> - * Theory of operation
>> - * ===================
>> - *
>> - * Computers are very predictable devices. Hence it is extremely hard
>> - * to produce truly random numbers on a computer --- as opposed to
>> - * pseudo-random numbers, which can easily generated by using a
>> - * algorithm. Unfortunately, it is very easy for attackers to guess
>> - * the sequence of pseudo-random number generators, and for some
>> - * applications this is not acceptable. So instead, we must try to
>> - * gather "environmental noise" from the computer's environment, which
>> - * must be hard for outside attackers to observe, and use that to
>> - * generate random numbers. In a Unix environment, this is best done
>> - * from inside the kernel.
>> - *
>> - * Sources of randomness from the environment include inter-keyboard
>> - * timings, inter-interrupt timings from some interrupts, and other
>> - * events which are both (a) non-deterministic and (b) hard for an
>> - * outside observer to measure. Randomness from these sources are
>> - * added to an "entropy pool", which is mixed using a CRC-like function.
>> - * This is not cryptographically strong, but it is adequate assuming
>> - * the randomness is not chosen maliciously, and it is fast enough that
>> - * the overhead of doing it on every interrupt is very reasonable.
>> - * As random bytes are mixed into the entropy pool, the routines keep
>> - * an *estimate* of how many bits of randomness have been stored into
>> - * the random number generator's internal state.
>> - *
>> - * When random bytes are desired, they are obtained by taking the BLAKE2s
>> - * hash of the contents of the "entropy pool". The BLAKE2s hash avoids
>> - * exposing the internal state of the entropy pool. It is believed to
>> - * be computationally infeasible to derive any useful information
>> - * about the input of BLAKE2s from its output. Even if it is possible to
>> - * analyze BLAKE2s in some clever way, as long as the amount of data
>> - * returned from the generator is less than the inherent entropy in
>> - * the pool, the output data is totally unpredictable. For this
>> - * reason, the routine decreases its internal estimate of how many
>> - * bits of "true randomness" are contained in the entropy pool as it
>> - * outputs random numbers.
>> - *
>> - * If this estimate goes to zero, the routine can still generate
>> - * random numbers; however, an attacker may (at least in theory) be
>> - * able to infer the future output of the generator from prior
>> - * outputs. This requires successful cryptanalysis of BLAKE2s, which is
>> - * not believed to be feasible, but there is a remote possibility.
>> - * Nonetheless, these numbers should be useful for the vast majority
>> - * of purposes.
>> - *
>> * Exported interfaces ---- output
>> * ===============================
>> *
>> @@ -298,23 +243,6 @@
>> *
>> * mknod /dev/random c 1 8
>> * mknod /dev/urandom c 1 9
>> - *
>> - * Acknowledgements:
>> - * =================
>> - *
>> - * Ideas for constructing this random number generator were derived
>> - * from Pretty Good Privacy's random number generator, and from private
>> - * discussions with Phil Karn. Colin Plumb provided a faster random
>> - * number generator, which speed up the mixing function of the entropy
>> - * pool, taken from PGPfone. Dale Worley has also contributed many
>> - * useful ideas and suggestions to improve this driver.
>> - *
>> - * Any flaws in the design are solely my responsibility, and should
>> - * not be attributed to the Phil, Colin, or any of authors of PGP.
>> - *
>> - * Further background information on this topic may be obtained from
>> - * RFC 1750, "Randomness Recommendations for Security", by Donald
>> - * Eastlake, Steve Crocker, and Jeff Schiller.
>> */
>>
>> #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>> @@ -358,79 +286,15 @@
>>
>> /* #define ADD_INTERRUPT_BENCH */
>>
>> -/*
>> - * If the entropy count falls under this number of bits, then we
>> - * should wake up processes which are selecting or polling on write
>> - * access to /dev/random.
>> - */
>> -static int random_write_wakeup_bits = 28 * (1 << 5);
>> -
>> -/*
>> - * Originally, we used a primitive polynomial of degree .poolwords
>> - * over GF(2). The taps for various sizes are defined below. They
>> - * were chosen to be evenly spaced except for the last tap, which is 1
>> - * to get the twisting happening as fast as possible.
>> - *
>> - * For the purposes of better mixing, we use the CRC-32 polynomial as
>> - * well to make a (modified) twisted Generalized Feedback Shift
>> - * Register. (See M. Matsumoto & Y. Kurita, 1992. Twisted GFSR
>> - * generators. ACM Transactions on Modeling and Computer Simulation
>> - * 2(3):179-194. Also see M. Matsumoto & Y. Kurita, 1994. Twisted
>> - * GFSR generators II. ACM Transactions on Modeling and Computer
>> - * Simulation 4:254-266)
>> - *
>> - * Thanks to Colin Plumb for suggesting this.
>> - *
>> - * The mixing operation is much less sensitive than the output hash,
>> - * where we use BLAKE2s. All that we want of mixing operation is that
>> - * it be a good non-cryptographic hash; i.e. it not produce collisions
>> - * when fed "random" data of the sort we expect to see. As long as
>> - * the pool state differs for different inputs, we have preserved the
>> - * input entropy and done a good job. The fact that an intelligent
>> - * attacker can construct inputs that will produce controlled
>> - * alterations to the pool's state is not important because we don't
>> - * consider such inputs to contribute any randomness. The only
>> - * property we need with respect to them is that the attacker can't
>> - * increase his/her knowledge of the pool's state. Since all
>> - * additions are reversible (knowing the final state and the input,
>> - * you can reconstruct the initial state), if an attacker has any
>> - * uncertainty about the initial state, he/she can only shuffle that
>> - * uncertainty about, but never cause any collisions (which would
>> - * decrease the uncertainty).
>> - *
>> - * Our mixing functions were analyzed by Lacharme, Roeck, Strubel, and
>> - * Videau in their paper, "The Linux Pseudorandom Number Generator
>> - * Revisited" (see: http://eprint.iacr.org/2012/251.pdf). In their
>> - * paper, they point out that we are not using a true Twisted GFSR,
>> - * since Matsumoto & Kurita used a trinomial feedback polynomial (that
>> - * is, with only three taps, instead of the six that we are using).
>> - * As a result, the resulting polynomial is neither primitive nor
>> - * irreducible, and hence does not have a maximal period over
>> - * GF(2**32). They suggest a slight change to the generator
>> - * polynomial which improves the resulting TGFSR polynomial to be
>> - * irreducible, which we have made here.
>> - */
>> enum poolinfo {
>> - POOL_WORDS = 128,
>> - POOL_WORDMASK = POOL_WORDS - 1,
>> - POOL_BYTES = POOL_WORDS * sizeof(u32),
>> - POOL_BITS = POOL_BYTES * 8,
>> + POOL_BITS = BLAKE2S_HASH_SIZE * 8,
>> POOL_BITSHIFT = ilog2(POOL_BITS),
>>
>> /* To allow fractional bits to be tracked, the entropy_count field is
>> * denominated in units of 1/8th bits. */
>> POOL_ENTROPY_SHIFT = 3,
>> #define POOL_ENTROPY_BITS() (input_pool.entropy_count >> POOL_ENTROPY_SHIFT)
>> - POOL_FRACBITS = POOL_BITS << POOL_ENTROPY_SHIFT,
>> -
>> - /* x^128 + x^104 + x^76 + x^51 +x^25 + x + 1 */
>> - POOL_TAP1 = 104,
>> - POOL_TAP2 = 76,
>> - POOL_TAP3 = 51,
>> - POOL_TAP4 = 25,
>> - POOL_TAP5 = 1,
>> -
>> - EXTRACT_SIZE = BLAKE2S_HASH_SIZE / 2
>> + POOL_FRACBITS = POOL_BITS << POOL_ENTROPY_SHIFT
>> };
>>
>> /*
>> @@ -438,6 +302,12 @@ enum poolinfo {
>> */
>> static DECLARE_WAIT_QUEUE_HEAD(random_write_wait);
>> static struct fasync_struct *fasync;
>> +/*
>> + * If the entropy count falls under this number of bits, then we
>> + * should wake up processes which are selecting or polling on write
>> + * access to /dev/random.
>> + */
>> +static int random_write_wakeup_bits = POOL_BITS * 3 / 4;
>>
>> static DEFINE_SPINLOCK(random_ready_list_lock);
>> static LIST_HEAD(random_ready_list);
>> @@ -493,73 +363,31 @@ MODULE_PARM_DESC(ratelimit_disable, "Disable random ratelimit suppression");
>> *
>> **********************************************************************/
>>
>> -static u32 input_pool_data[POOL_WORDS] __latent_entropy;
>> -
>> static struct {
>> + struct blake2s_state hash;
>> spinlock_t lock;
>> - u16 add_ptr;
>> - u16 input_rotate;
>> int entropy_count;
>> } input_pool = {
>> + .hash.h = { BLAKE2S_IV0 ^ (0x01010000 | BLAKE2S_HASH_SIZE),
>> + BLAKE2S_IV1, BLAKE2S_IV2, BLAKE2S_IV3, BLAKE2S_IV4,
>> + BLAKE2S_IV5, BLAKE2S_IV6, BLAKE2S_IV7 },
>> + .hash.outlen = BLAKE2S_HASH_SIZE,
>> .lock = __SPIN_LOCK_UNLOCKED(input_pool.lock),
>> };
>>
>> -static ssize_t extract_entropy(void *buf, size_t nbytes, int min);
>> -static ssize_t _extract_entropy(void *buf, size_t nbytes);
>> +static bool extract_entropy(void *buf, size_t nbytes, int min);
>> +static void _extract_entropy(void *buf, size_t nbytes);
>>
>> static void crng_reseed(struct crng_state *crng, bool use_input_pool);
>>
>> -static const u32 twist_table[8] = {
>> - 0x00000000, 0x3b6e20c8, 0x76dc4190, 0x4db26158,
>> - 0xedb88320, 0xd6d6a3e8, 0x9b64c2b0, 0xa00ae278 };
>> -
>> /*
>> * This function adds bytes into the entropy "pool". It does not
>> * update the entropy estimate. The caller should call
>> * credit_entropy_bits if this is appropriate.
>> - *
>> - * The pool is stirred with a primitive polynomial of the appropriate
>> - * degree, and then twisted. We twist by three bits at a time because
>> - * it's cheap to do so and helps slightly in the expected case where
>> - * the entropy is concentrated in the low-order bits.
>> */
>> static void _mix_pool_bytes(const void *in, int nbytes)
>> {
>> - unsigned long i;
>> - int input_rotate;
>> - const u8 *bytes = in;
>> - u32 w;
>> -
>> - input_rotate = input_pool.input_rotate;
>> - i = input_pool.add_ptr;
>> -
>> - /* mix one byte at a time to simplify size handling and churn faster */
>> - while (nbytes--) {
>> - w = rol32(*bytes++, input_rotate);
>> - i = (i - 1) & POOL_WORDMASK;
>> -
>> - /* XOR in the various taps */
>> - w ^= input_pool_data[i];
>> - w ^= input_pool_data[(i + POOL_TAP1) & POOL_WORDMASK];
>> - w ^= input_pool_data[(i + POOL_TAP2) & POOL_WORDMASK];
>> - w ^= input_pool_data[(i + POOL_TAP3) & POOL_WORDMASK];
>> - w ^= input_pool_data[(i + POOL_TAP4) & POOL_WORDMASK];
>> - w ^= input_pool_data[(i + POOL_TAP5) & POOL_WORDMASK];
>> -
>> - /* Mix the result back in with a twist */
>> - input_pool_data[i] = (w >> 3) ^ twist_table[w & 7];
>> -
>> - /*
>> - * Normally, we add 7 bits of rotation to the pool.
>> - * At the beginning of the pool, add an extra 7 bits
>> - * rotation, so that successive passes spread the
>> - * input bits across the pool evenly.
>> - */
>> - input_rotate = (input_rotate + (i ? 7 : 14)) & 31;
>> - }
>> -
>> - input_pool.input_rotate = input_rotate;
>> - input_pool.add_ptr = i;
>> + blake2s_update(&input_pool.hash, in, nbytes);
>> }
>>
>> static void __mix_pool_bytes(const void *in, int nbytes)
>> @@ -953,15 +781,14 @@ static int crng_slow_load(const u8 *cp, size_t len)
>> static void crng_reseed(struct crng_state *crng, bool use_input_pool)
>> {
>> unsigned long flags;
>> - int i, num;
>> + int i;
>> union {
>> u8 block[CHACHA_BLOCK_SIZE];
>> u32 key[8];
>> } buf;
>>
>> if (use_input_pool) {
>> - num = extract_entropy(&buf, 32, 16);
>> - if (num == 0)
>> + if (!extract_entropy(&buf, 32, 16))
>> return;
>> } else {
>> _extract_crng(&primary_crng, buf.block);
>> @@ -1329,74 +1156,48 @@ static size_t account(size_t nbytes, int min)
>> }
>>
>> /*
>> - * This function does the actual extraction for extract_entropy.
>> - *
>> - * Note: we assume that .poolwords is a multiple of 16 words.
>> + * This is an HKDF-like construction for using the hashed collected entropy
>> + * as a PRF key, that's then expanded block-by-block.
>> */
>> -static void extract_buf(u8 *out)
>> +static void _extract_entropy(void *buf, size_t nbytes)
>> {
>> - struct blake2s_state state __aligned(__alignof__(unsigned long));
>> - u8 hash[BLAKE2S_HASH_SIZE];
>> - unsigned long *salt;
>> unsigned long flags;
>> -
>> - blake2s_init(&state, sizeof(hash));
>> -
>> - /*
>> - * If we have an architectural hardware random number
>> - * generator, use it for BLAKE2's salt & personal fields.
>> - */
>> - for (salt = (unsigned long *)&state.h[4];
>> - salt < (unsigned long *)&state.h[8]; ++salt) {
>> - unsigned long v;
>> - if (!arch_get_random_long(&v))
>> - break;
>> - *salt ^= v;
>> + u8 seed[BLAKE2S_HASH_SIZE], next_key[BLAKE2S_HASH_SIZE];
>> + struct {
>> + unsigned long rdrand[32 / sizeof(long)];
>> + size_t counter;
>> + } block;
>> + size_t i;
>> +
>> + for (i = 0; i < ARRAY_SIZE(block.rdrand); ++i) {
>> + if (!arch_get_random_long(&block.rdrand[i]))
>> + block.rdrand[i] = random_get_entropy();
>> }
>>
>> - /* Generate a hash across the pool */
>> spin_lock_irqsave(&input_pool.lock, flags);
>> - blake2s_update(&state, (const u8 *)input_pool_data, POOL_BYTES);
>> - blake2s_final(&state, hash); /* final zeros out state */
>>
>> - /*
>> - * We mix the hash back into the pool to prevent backtracking
>> - * attacks (where the attacker knows the state of the pool
>> - * plus the current outputs, and attempts to find previous
>> - * outputs), unless the hash function can be inverted. By
>> - * mixing at least a hash worth of hash data back, we make
>> - * brute-forcing the feedback as hard as brute-forcing the
>> - * hash.
>> - */
>> - __mix_pool_bytes(hash, sizeof(hash));
>> - spin_unlock_irqrestore(&input_pool.lock, flags);
>> + /* seed = HASHPRF(last_key, entropy_input) */
>> + blake2s_final(&input_pool.hash, seed);
>>
>> - /* Note that EXTRACT_SIZE is half of hash size here, because above
>> - * we've dumped the full length back into mixer. By reducing the
>> - * amount that we emit, we retain a level of forward secrecy.
>> - */
>> - memcpy(out, hash, EXTRACT_SIZE);
>> - memzero_explicit(hash, sizeof(hash));
>> -}
>> + /* next_key = HASHPRF(seed, RDRAND || 0) */
>> + block.counter = 0;
>> + blake2s(next_key, (u8 *)&block, seed, sizeof(next_key), sizeof(block), sizeof(seed));
>> + blake2s_init_key(&input_pool.hash, BLAKE2S_HASH_SIZE, next_key, sizeof(next_key));
>>
>> -static ssize_t _extract_entropy(void *buf, size_t nbytes)
>> -{
>> - ssize_t ret = 0, i;
>> - u8 tmp[EXTRACT_SIZE];
>> + spin_unlock_irqrestore(&input_pool.lock, flags);
>> + memzero_explicit(next_key, sizeof(next_key));
>>
>> while (nbytes) {
>> - extract_buf(tmp);
>> - i = min_t(int, nbytes, EXTRACT_SIZE);
>> - memcpy(buf, tmp, i);
>> + i = min_t(size_t, nbytes, BLAKE2S_HASH_SIZE);
>> + /* output = HASHPRF(seed, RDRAND || ++counter) */
>> + ++block.counter;
>> + blake2s(buf, (u8 *)&block, seed, i, sizeof(block), sizeof(seed));
>> nbytes -= i;
>> buf += i;
>> - ret += i;
>> }
>>
>> - /* Wipe data just returned from memory */
>> - memzero_explicit(tmp, sizeof(tmp));
>> -
>> - return ret;
>> + memzero_explicit(seed, sizeof(seed));
>> + memzero_explicit(&block, sizeof(block));
>> }
>>
>> /*
>> @@ -1404,13 +1205,18 @@ static ssize_t _extract_entropy(void *buf, size_t nbytes)
>> * returns it in a buffer.
>> *
>> * The min parameter specifies the minimum amount we can pull before
>> - * failing to avoid races that defeat catastrophic reseeding.
>> + * failing to avoid races that defeat catastrophic reseeding. If we
>> + * have less than min entropy available, we return false and buf is
>> + * not filled.
>> */
>> -static ssize_t extract_entropy(void *buf, size_t nbytes, int min)
>> +static bool extract_entropy(void *buf, size_t nbytes, int min)
>> {
>> trace_extract_entropy(nbytes, POOL_ENTROPY_BITS(), _RET_IP_);
>> - nbytes = account(nbytes, min);
>> - return _extract_entropy(buf, nbytes);
>> + if (account(nbytes, min)) {
>> + _extract_entropy(buf, nbytes);
>> + return true;
>> + }
>> + return false;
>> }
>>
>> #define warn_unseeded_randomness(previous) \
>> @@ -1674,7 +1480,7 @@ static void __init init_std_data(void)
>> unsigned long rv;
>>
>> mix_pool_bytes(&now, sizeof(now));
>> - for (i = POOL_BYTES; i > 0; i -= sizeof(rv)) {
>> + for (i = BLAKE2S_BLOCK_SIZE; i > 0; i -= sizeof(rv)) {
>> if (!arch_get_random_seed_long(&rv) &&
>> !arch_get_random_long(&rv))
>> rv = random_get_entropy();
>> --
>> 2.34.1
>>
Dearest,
I sent this message to you before without responding.
please,confirm this and reply for more
instruction.(jerryesq22(a)gmail.com or wilfredesq23(a)gmail.com)
WILFRED Esq.
Telephone +228 96277913
As pointed out by this bug report [1], buffered writes are now broken on
S29GL064N. This issue comes from a rework which switched from using chip_good()
to chip_ready(), because DQ true data 0xFF is read on S29GL064N and an error
returned by chip_good(). One way to solve the issue is to revert the change
partially to use chip_ready for S29GL064N.
[1] https://lore.kernel.org/r/b687c259-6413-26c9-d4c9-b3afa69ea124@pengutronix.…
Fixes: dfeae1073583("mtd: cfi_cmdset_0002: Change write buffer to check correct value")
Signed-off-by: Tokunori Ikegami <ikegami.t(a)gmail.com>
Tested-by: Ahmad Fatoum <a.fatoum(a)pengutronix.de>
Cc: stable(a)vger.kernel.org
Tokunori Ikegami (3):
mtd: cfi_cmdset_0002: Move and rename
chip_check/chip_ready/chip_good_for_write
mtd: cfi_cmdset_0002: Use chip_ready() for write on S29GL064N
mtd: cfi_cmdset_0002: Add S29GL064N ID definition
drivers/mtd/chips/cfi_cmdset_0002.c | 93 +++++++++++++++--------------
1 file changed, 49 insertions(+), 44 deletions(-)
--
2.32.0