The patch titled
Subject: mm/vmscan: restore zone_reclaim_mode ABI
has been added to the -mm tree. Its filename is
mm-vmscan-restore-zone_reclaim_mode-abi.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-vmscan-restore-zone_reclaim_mo…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-vmscan-restore-zone_reclaim_mo…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included in linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Dave Hansen <dave.hansen(a)linux.intel.com>
Subject: mm/vmscan: restore zone_reclaim_mode ABI
I went to add a new RECLAIM_* mode for the zone_reclaim_mode sysctl.
Like a good kernel developer, I also went to update the documentation.
I noticed that the bits in the documentation didn't match the bits in
the #defines.
The VM never explicitly checks the RECLAIM_ZONE bit. The bit is,
however, implicitly checked when checking 'node_reclaim_mode == 0'.
The RECLAIM_ZONE #define was removed in a cleanup. That, by itself,
is fine.
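For reference, the implicit check looks roughly like this (a sketch
modeled on the page allocator's check; names approximate):

	/* RECLAIM_ZONE is never tested by name: setting any bit just
	 * makes node_reclaim_mode nonzero, which is all the VM looks
	 * at before attempting node reclaim. */
	if (node_reclaim_mode == 0 ||
	    !zone_allows_reclaim(preferred_zone, zone))
		continue;	/* skip node reclaim, allocate off-node */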
But when the bit (bit 0) was removed, the _other_ bit locations also
got changed. That's not OK, because the bit values are documented to
mean one specific thing. Users surely do not expect the meaning to
change from kernel to kernel.
The end result is that if someone had a script that did:
sysctl vm.zone_reclaim_mode=1
it would have gone from enabling node reclaim for clean unmapped pages to
writing out pages during node reclaim after the commit in question.
That's not great.
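To illustrate the shift (bit values taken from the #defines in the
diff below; the pre-cleanup layout is the documented ABI):

	/* Documented ABI, restored by this patch: */
	#define RECLAIM_ZONE	(1<<0)	/* basic node reclaim */
	#define RECLAIM_WRITE	(1<<1)	/* write out pages during reclaim */
	#define RECLAIM_UNMAP	(1<<2)	/* unmap pages during reclaim */

	/* After 648b5cf368e0, the same sysctl values silently meant: */
	#define RECLAIM_WRITE	(1<<0)	/* bit 0 became "writeout" */
	#define RECLAIM_UNMAP	(1<<1)	/* bit 1 became "unmap" */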
Put the bits back the way they were and add a comment so something like
this is a bit harder to do again. Update the documentation to make it
clear that the first bit is ignored.
Link: https://lkml.kernel.org/r/20210219172555.FF0CDF23@viggo.jf.intel.com
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Fixes: 648b5cf368e0 ("mm/vmscan: remove unused RECLAIM_OFF/RECLAIM_ZONE")
Reviewed-by: Ben Widawsky <ben.widawsky(a)intel.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: David Rientjes <rientjes(a)google.com>
Acked-by: Christoph Lameter <cl(a)linux.com>
Cc: Alex Shi <alex.shi(a)linux.alibaba.com>
Cc: Daniel Wagner <dwagner(a)suse.de>
Cc: "Tobin C. Harding" <tobin(a)kernel.org>
Cc: Christoph Lameter <cl(a)linux.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Huang Ying <ying.huang(a)intel.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Qian Cai <cai(a)lca.pw>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
Documentation/admin-guide/sysctl/vm.rst | 10 +++++-----
mm/vmscan.c | 9 +++++++--
2 files changed, 12 insertions(+), 7 deletions(-)
--- a/Documentation/admin-guide/sysctl/vm.rst~mm-vmscan-restore-zone_reclaim_mode-abi
+++ a/Documentation/admin-guide/sysctl/vm.rst
@@ -983,11 +983,11 @@ that benefit from having their data cach
left disabled as the caching effect is likely to be more important than
data locality.
-zone_reclaim may be enabled if it's known that the workload is partitioned
-such that each partition fits within a NUMA node and that accessing remote
-memory would cause a measurable performance reduction. The page allocator
-will then reclaim easily reusable pages (those page cache pages that are
-currently not used) before allocating off node pages.
+Consider enabling one or more zone_reclaim mode bits if it's known that the
+workload is partitioned such that each partition fits within a NUMA node
+and that accessing remote memory would cause a measurable performance
+reduction. The page allocator will take additional actions before
+allocating off node pages.
Allowing zone reclaim to write out pages stops processes that are
writing large amounts of data from dirtying pages on other nodes. Zone
--- a/mm/vmscan.c~mm-vmscan-restore-zone_reclaim_mode-abi
+++ a/mm/vmscan.c
@@ -4085,8 +4085,13 @@ module_init(kswapd_init)
*/
int node_reclaim_mode __read_mostly;
-#define RECLAIM_WRITE (1<<0) /* Writeout pages during reclaim */
-#define RECLAIM_UNMAP (1<<1) /* Unmap pages during reclaim */
+/*
+ * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
+ * ABI. New bits are OK, but existing bits can never change.
+ */
+#define RECLAIM_ZONE (1<<0) /* Run shrink_inactive_list on the zone */
+#define RECLAIM_WRITE (1<<1) /* Writeout pages during reclaim */
+#define RECLAIM_UNMAP (1<<2) /* Unmap pages during reclaim */
/*
* Priority for NODE_RECLAIM. This determines the fraction of pages
_
Patches currently in -mm which might be from dave.hansen(a)linux.intel.com are
mm-vmscan-restore-zone_reclaim_mode-abi.patch
From: David Sterba <dsterba(a)suse.cz>
There's a mistake in the 5.10.13 backport of upstream commit
2175bf57dc95 ("btrfs: fix possible free space tree corruption with
online conversion").
The enum value BTRFS_FS_FREE_SPACE_TREE_UNTRUSTED has been added to
the wrong enum set, colliding with the value of BTRFS_FS_QUOTA_ENABLE.
This could cause problems during the tree conversion, where the quotas
wouldn't be set up properly but the related code would execute anyway
because the bit was set.
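An illustrative sketch of the failure mode, with simplified and
abridged enums (the real definitions are in fs/btrfs/ctree.h):

	/* Each enum independently numbers bits for a different flag
	 * word.  Placing the new value in the wrong enum gave it the
	 * same bit number as an unrelated flag from the other set: */
	enum {				/* fs_state bit numbers */
		BTRFS_FS_STATE_ERROR,
		/* ... */
		BTRFS_FS_FREE_SPACE_TREE_UNTRUSTED,	/* wrong set */
	};

	enum {				/* fs_info->flags bit numbers */
		/* ... */
		BTRFS_FS_QUOTA_ENABLE,	/* same bit number */
	};

	/* So test_bit(BTRFS_FS_FREE_SPACE_TREE_UNTRUSTED,
	 * &fs_info->flags) actually observes the quota bit. */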
Link: https://lore.kernel.org/linux-btrfs/20210219111741.95DD.409509F4@e16-tech.c…
Reported-by: Wang Yugui <wangyugui(a)e16-tech.com>
CC: stable(a)vger.kernel.org # 5.10.13+
Signed-off-by: David Sterba <dsterba(a)suse.com>
---
fs/btrfs/ctree.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 30ea9780725f..b6884eda9ff6 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -146,9 +146,6 @@ enum {
BTRFS_FS_STATE_DEV_REPLACING,
/* The btrfs_fs_info created for self-tests */
BTRFS_FS_STATE_DUMMY_FS_INFO,
-
- /* Indicate that we can't trust the free space tree for caching yet */
- BTRFS_FS_FREE_SPACE_TREE_UNTRUSTED,
};
#define BTRFS_BACKREF_REV_MAX 256
@@ -562,6 +559,9 @@ enum {
/* Indicate that the discard workqueue can service discards. */
BTRFS_FS_DISCARD_RUNNING,
+
+ /* Indicate that we can't trust the free space tree for caching yet */
+ BTRFS_FS_FREE_SPACE_TREE_UNTRUSTED,
};
/*
--
2.29.2
From: Dave Hansen <dave.hansen(a)linux.intel.com>
I went to add a new RECLAIM_* mode for the zone_reclaim_mode
sysctl. Like a good kernel developer, I also went to update the
documentation. I noticed that the bits in the documentation didn't
match the bits in the #defines.
The VM never explicitly checks the RECLAIM_ZONE bit. The bit is,
however, implicitly checked when checking 'node_reclaim_mode == 0'.
The RECLAIM_ZONE #define was removed in a cleanup. That, by itself,
is fine.
But when the bit (bit 0) was removed, the _other_ bit locations
also got changed. That's not OK, because the bit values are
documented to mean one specific thing. Users surely do not expect
the meaning to change from kernel to kernel.
The end result is that if someone had a script that did:
sysctl vm.zone_reclaim_mode=1
it would have gone from enabling node reclaim for clean unmapped
pages to writing out pages during node reclaim after the commit in
question. That's not great.
Put the bits back the way they were and add a comment so something
like this is a bit harder to do again. Update the documentation to
make it clear that the first bit is ignored.
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Fixes: 648b5cf368e0 ("mm/vmscan: remove unused RECLAIM_OFF/RECLAIM_ZONE")
Reviewed-by: Ben Widawsky <ben.widawsky(a)intel.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: David Rientjes <rientjes(a)google.com>
Acked-by: Christoph Lameter <cl(a)linux.com>
Cc: Alex Shi <alex.shi(a)linux.alibaba.com>
Cc: Daniel Wagner <dwagner(a)suse.de>
Cc: "Tobin C. Harding" <tobin(a)kernel.org>
Cc: Christoph Lameter <cl(a)linux.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Huang Ying <ying.huang(a)intel.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Qian Cai <cai(a)lca.pw>
Cc: stable(a)vger.kernel.org
--
Changes from v2:
* Update description to indicate that bit 0 was used for clean
  unmapped page node reclaim.
---
b/Documentation/admin-guide/sysctl/vm.rst | 10 +++++-----
b/mm/vmscan.c | 9 +++++++--
2 files changed, 12 insertions(+), 7 deletions(-)
diff -puN Documentation/admin-guide/sysctl/vm.rst~mm-vmscan-restore-old-zone_reclaim_mode-abi Documentation/admin-guide/sysctl/vm.rst
--- a/Documentation/admin-guide/sysctl/vm.rst~mm-vmscan-restore-old-zone_reclaim_mode-abi 2021-02-19 09:25:26.656663105 -0800
+++ b/Documentation/admin-guide/sysctl/vm.rst 2021-02-19 09:25:26.662663105 -0800
@@ -983,11 +983,11 @@ that benefit from having their data cach
left disabled as the caching effect is likely to be more important than
data locality.
-zone_reclaim may be enabled if it's known that the workload is partitioned
-such that each partition fits within a NUMA node and that accessing remote
-memory would cause a measurable performance reduction. The page allocator
-will then reclaim easily reusable pages (those page cache pages that are
-currently not used) before allocating off node pages.
+Consider enabling one or more zone_reclaim mode bits if it's known that the
+workload is partitioned such that each partition fits within a NUMA node
+and that accessing remote memory would cause a measurable performance
+reduction. The page allocator will take additional actions before
+allocating off node pages.
Allowing zone reclaim to write out pages stops processes that are
writing large amounts of data from dirtying pages on other nodes. Zone
diff -puN mm/vmscan.c~mm-vmscan-restore-old-zone_reclaim_mode-abi mm/vmscan.c
--- a/mm/vmscan.c~mm-vmscan-restore-old-zone_reclaim_mode-abi 2021-02-19 09:25:26.658663105 -0800
+++ b/mm/vmscan.c 2021-02-19 09:25:26.665663105 -0800
@@ -4095,8 +4095,13 @@ module_init(kswapd_init)
*/
int node_reclaim_mode __read_mostly;
-#define RECLAIM_WRITE (1<<0) /* Writeout pages during reclaim */
-#define RECLAIM_UNMAP (1<<1) /* Unmap pages during reclaim */
+/*
+ * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
+ * ABI. New bits are OK, but existing bits can never change.
+ */
+#define RECLAIM_ZONE (1<<0) /* Run shrink_inactive_list on the zone */
+#define RECLAIM_WRITE (1<<1) /* Writeout pages during reclaim */
+#define RECLAIM_UNMAP (1<<2) /* Unmap pages during reclaim */
/*
* Priority for NODE_RECLAIM. This determines the fraction of pages
_
> On Fri, Feb 19, 2021 at 02:43:34PM +0800, Wen Yang wrote:
>> From: Peter Zijlstra <peterz(a)infradead.org>
>>
>> commit 19dbdcb8039cff16669a05136a29180778d16d0a upstream.
>>
>> It's clearly documented that smp function calls cannot be invoked from
>> softirq handling context. Unfortunately nothing enforces that or emits a
>> warning.
>>
>> A single function call can be invoked from softirq context only via
>> smp_call_function_single_async().
>>
>> The only legit context is task context, so add a warning to that effect.
>>
>> Reported-by: luferry <luferry(a)163.com>
>> Signed-off-by: Peter Zijlstra <peterz(a)infradead.org>
>> Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
>> Link: https://lkml.kernel.org/r/20190718160601.GP3402@hirez.programming.kicks-ass…
>> Cc: stable <stable(a)vger.kernel.org> # 4.9.x
>> Signed-off-by: Wen Yang <simon.wy(a)alibaba-inc.com>
>> ---
>> kernel/smp.c | 16 ++++++++++++++++
>> 1 file changed, 16 insertions(+)
>>
>> diff --git a/kernel/smp.c b/kernel/smp.c
>> index 399905f..f2b29c4 100644
>> --- a/kernel/smp.c
>> +++ b/kernel/smp.c
>> @@ -276,6 +276,14 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
>> WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
>> && !oops_in_progress);
>>
>> + /*
>> + * When @wait we can deadlock when we interrupt between llist_add() and
>> + * arch_send_call_function_ipi*(); when !@wait we can deadlock due to
>> + * csd_lock() on because the interrupt context uses the same csd
>> + * storage.
>> + */
>> + WARN_ON_ONCE(!in_task());
>> +
>> csd = &csd_stack;
>> if (!wait) {
>> csd = this_cpu_ptr(&csd_data);
>> @@ -401,6 +409,14 @@ void smp_call_function_many(const struct cpumask *mask,
>> WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
>> && !oops_in_progress && !early_boot_irqs_disabled);
>>
>> + /*
>> + * When @wait we can deadlock when we interrupt between llist_add() and
>> + * arch_send_call_function_ipi*(); when !@wait we can deadlock due to
>> + * csd_lock() on because the interrupt context uses the same csd
>> + * storage.
>> + */
>> + WARN_ON_ONCE(!in_task());
>> +
>> /* Try to fastpath. So, what's a CPU they want? Ignoring this one. */
>> cpu = cpumask_first_and(mask, cpu_online_mask);
>> if (cpu == this_cpu)
>> --
>> 1.8.3.1
>>
>
> Why do you want this in the 4.9.y kernel tree only, and not all others?
> What bug/problem does this fix? It seems that it will only report
> problems that other code has, not fix existing code. If anything, it's
> going to start causing machines to reboot that have "panic on warn" set,
> is that a good thing to do?
4.9, 4.14 and 4.19 all need it.
We have found that some third-party kernel modules occasionally cause
kernel panics (such as watchdog resets). Further analysis showed that
functions such as smp_call_function()/on_each_cpu() were being called
from interrupt or softirq context.
Since these usages are illegal and cannot be prevented outright, we
should add a warning to enhance the robustness of the stable kernels
and to make such problems easier to diagnose.
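For reference, a minimal sketch of the kind of (hypothetical) module
code the new warning catches -- an smp cross call issued from softirq
context, here via a tasklet:

	#include <linux/smp.h>
	#include <linux/interrupt.h>

	static void flush_func(void *info)
	{
		/* per-cpu work */
	}

	/* Tasklets run in softirq context, so in_task() is false here:
	 * the new WARN_ON_ONCE(!in_task()) fires, and without it this
	 * usage can deadlock on the shared csd storage. */
	static void bad_tasklet_fn(unsigned long data)
	{
		on_each_cpu(flush_func, NULL, 1);
	}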
thanks,
Wen
From: Joerg Roedel <jroedel(a)suse.de>
The code in the NMI handler to adjust the #VC handler IST stack is
needed in case an NMI hits when the #VC handler is still using its IST
stack.
But the check for this condition also needs to verify that the
regs->sp value is trusted, meaning it was not set by user space.
Extend the check so that regs->sp is not used when the NMI
interrupted user-space code or the SYSCALL gap.
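For clarity, the fixed check from the diff below, annotated (a sketch;
see __sev_es_ist_enter() for the real context):

	if (on_vc_stack(regs->sp) &&	/* sp may be user-chosen ... */
	    !user_mode(regs) &&		/* ... unless the NMI hit    */
	    !from_syscall_gap(regs))	/* kernel code past the gap  */
		new_ist = ALIGN_DOWN(regs->sp, 8) - sizeof(old_ist);
	else
		new_ist = old_ist - sizeof(old_ist);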
Reported-by: Andy Lutomirski <luto(a)kernel.org>
Fixes: 315562c9af3d5 ("x86/sev-es: Adjust #VC IST Stack on entering NMI handler")
Cc: stable(a)vger.kernel.org # 5.10+
Signed-off-by: Joerg Roedel <jroedel(a)suse.de>
---
arch/x86/kernel/sev-es.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/sev-es.c b/arch/x86/kernel/sev-es.c
index 84c1821819af..0df38b185d53 100644
--- a/arch/x86/kernel/sev-es.c
+++ b/arch/x86/kernel/sev-es.c
@@ -144,7 +144,9 @@ void noinstr __sev_es_ist_enter(struct pt_regs *regs)
old_ist = __this_cpu_read(cpu_tss_rw.x86_tss.ist[IST_INDEX_VC]);
/* Make room on the IST stack */
- if (on_vc_stack(regs->sp))
+ if (on_vc_stack(regs->sp) &&
+ !user_mode(regs) &&
+ !from_syscall_gap(regs))
new_ist = ALIGN_DOWN(regs->sp, 8) - sizeof(old_ist);
else
new_ist = old_ist - sizeof(old_ist);
--
2.30.0
From: Peter Zijlstra <peterz(a)infradead.org>
commit 19dbdcb8039cff16669a05136a29180778d16d0a upstream.
It's clearly documented that smp function calls cannot be invoked from
softirq handling context. Unfortunately nothing enforces that or emits a
warning.
A single function call can be invoked from softirq context only via
smp_call_function_single_async().
The only legit context is task context, so add a warning to that effect.
Reported-by: luferry <luferry(a)163.com>
Signed-off-by: Peter Zijlstra <peterz(a)infradead.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Link: https://lkml.kernel.org/r/20190718160601.GP3402@hirez.programming.kicks-ass…
Cc: stable <stable(a)vger.kernel.org> # 4.9.x
Signed-off-by: Wen Yang <simon.wy(a)alibaba-inc.com>
---
kernel/smp.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/kernel/smp.c b/kernel/smp.c
index 399905f..f2b29c4 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -276,6 +276,14 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
&& !oops_in_progress);
+ /*
+ * When @wait we can deadlock when we interrupt between llist_add() and
+ * arch_send_call_function_ipi*(); when !@wait we can deadlock due to
+ * csd_lock() on because the interrupt context uses the same csd
+ * storage.
+ */
+ WARN_ON_ONCE(!in_task());
+
csd = &csd_stack;
if (!wait) {
csd = this_cpu_ptr(&csd_data);
@@ -401,6 +409,14 @@ void smp_call_function_many(const struct cpumask *mask,
WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
&& !oops_in_progress && !early_boot_irqs_disabled);
+ /*
+ * When @wait we can deadlock when we interrupt between llist_add() and
+ * arch_send_call_function_ipi*(); when !@wait we can deadlock due to
+ * csd_lock() on because the interrupt context uses the same csd
+ * storage.
+ */
+ WARN_ON_ONCE(!in_task());
+
/* Try to fastpath. So, what's a CPU they want? Ignoring this one. */
cpu = cpumask_first_and(mask, cpu_online_mask);
if (cpu == this_cpu)
--
1.8.3.1
From: Peter Zijlstra <peterz(a)infradead.org>
commit 19dbdcb8039cff16669a05136a29180778d16d0a upstream.
It's clearly documented that smp function calls cannot be invoked from
softirq handling context. Unfortunately nothing enforces that or emits a
warning.
A single function call can be invoked from softirq context only via
smp_call_function_single_async().
The only legit context is task context, so add a warning to that effect.
Reported-by: luferry <luferry(a)163.com>
Signed-off-by: Peter Zijlstra <peterz(a)infradead.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Link: https://lkml.kernel.org/r/20190718160601.GP3402@hirez.programming.kicks-ass…
Cc: stable <stable(a)vger.kernel.org> # 4.19.x
Signed-off-by: Wen Yang <simon.wy(a)alibaba-inc.com>
---
kernel/smp.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/kernel/smp.c b/kernel/smp.c
index 084c8b3..9afcbb4 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -290,6 +290,14 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
&& !oops_in_progress);
+ /*
+ * When @wait we can deadlock when we interrupt between llist_add() and
+ * arch_send_call_function_ipi*(); when !@wait we can deadlock due to
+ * csd_lock() on because the interrupt context uses the same csd
+ * storage.
+ */
+ WARN_ON_ONCE(!in_task());
+
csd = &csd_stack;
if (!wait) {
csd = this_cpu_ptr(&csd_data);
@@ -415,6 +423,14 @@ void smp_call_function_many(const struct cpumask *mask,
WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
&& !oops_in_progress && !early_boot_irqs_disabled);
+ /*
+ * When @wait we can deadlock when we interrupt between llist_add() and
+ * arch_send_call_function_ipi*(); when !@wait we can deadlock due to
+ * csd_lock() on because the interrupt context uses the same csd
+ * storage.
+ */
+ WARN_ON_ONCE(!in_task());
+
/* Try to fastpath. So, what's a CPU they want? Ignoring this one. */
cpu = cpumask_first_and(mask, cpu_online_mask);
if (cpu == this_cpu)
--
1.8.3.1
From: Peter Zijlstra <peterz(a)infradead.org>
commit 19dbdcb8039cff16669a05136a29180778d16d0a upstream.
It's clearly documented that smp function calls cannot be invoked from
softirq handling context. Unfortunately nothing enforces that or emits a
warning.
A single function call can be invoked from softirq context only via
smp_call_function_single_async().
The only legit context is task context, so add a warning to that effect.
Reported-by: luferry <luferry(a)163.com>
Signed-off-by: Peter Zijlstra <peterz(a)infradead.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Link: https://lkml.kernel.org/r/20190718160601.GP3402@hirez.programming.kicks-ass…
Cc: stable <stable(a)vger.kernel.org> # 4.14.x
Signed-off-by: Wen Yang <simon.wy(a)alibaba-inc.com>
---
kernel/smp.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/kernel/smp.c b/kernel/smp.c
index c94dd85..7d00d3e 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -290,6 +290,14 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
&& !oops_in_progress);
+ /*
+ * When @wait we can deadlock when we interrupt between llist_add() and
+ * arch_send_call_function_ipi*(); when !@wait we can deadlock due to
+ * csd_lock() on because the interrupt context uses the same csd
+ * storage.
+ */
+ WARN_ON_ONCE(!in_task());
+
csd = &csd_stack;
if (!wait) {
csd = this_cpu_ptr(&csd_data);
@@ -415,6 +423,14 @@ void smp_call_function_many(const struct cpumask *mask,
WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
&& !oops_in_progress && !early_boot_irqs_disabled);
+ /*
+ * When @wait we can deadlock when we interrupt between llist_add() and
+ * arch_send_call_function_ipi*(); when !@wait we can deadlock due to
+ * csd_lock() on because the interrupt context uses the same csd
+ * storage.
+ */
+ WARN_ON_ONCE(!in_task());
+
/* Try to fastpath. So, what's a CPU they want? Ignoring this one. */
cpu = cpumask_first_and(mask, cpu_online_mask);
if (cpu == this_cpu)
--
1.8.3.1