From: Zou Wei zou_wei@huawei.com
[ Upstream commit 603fcfb9d4ec1cad8d66d3bb37f3613afa8a661a ]
This patch adds missing MODULE_DEVICE_TABLE definition which generates correct modalias for automatic loading of this driver when it is built as an external module.
Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Zou Wei zou_wei@huawei.com Signed-off-by: Sebastian Reichel sebastian.reichel@collabora.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/power/supply/sc27xx_fuel_gauge.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/power/supply/sc27xx_fuel_gauge.c b/drivers/power/supply/sc27xx_fuel_gauge.c index 9c627618c224..1ae8374e1ceb 100644 --- a/drivers/power/supply/sc27xx_fuel_gauge.c +++ b/drivers/power/supply/sc27xx_fuel_gauge.c @@ -1342,6 +1342,7 @@ static const struct of_device_id sc27xx_fgu_of_match[] = { { .compatible = "sprd,sc2731-fgu", }, { } }; +MODULE_DEVICE_TABLE(of, sc27xx_fgu_of_match);
static struct platform_driver sc27xx_fgu_driver = { .probe = sc27xx_fgu_probe,
From: Zou Wei zou_wei@huawei.com
[ Upstream commit 2aac79d14d76879c8e307820b31876e315b1b242 ]
This patch adds missing MODULE_DEVICE_TABLE definition which generates correct modalias for automatic loading of this driver when it is built as an external module.
Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Zou Wei zou_wei@huawei.com Signed-off-by: Sebastian Reichel sebastian.reichel@collabora.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/power/supply/sc2731_charger.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/power/supply/sc2731_charger.c b/drivers/power/supply/sc2731_charger.c index 335cb857ef30..288b79836c13 100644 --- a/drivers/power/supply/sc2731_charger.c +++ b/drivers/power/supply/sc2731_charger.c @@ -524,6 +524,7 @@ static const struct of_device_id sc2731_charger_of_match[] = { { .compatible = "sprd,sc2731-charger", }, { } }; +MODULE_DEVICE_TABLE(of, sc2731_charger_of_match);
static struct platform_driver sc2731_charger_driver = { .driver = {
From: Chao Yu yuchao0@huawei.com
[ Upstream commit cad83c968c2ebe97905f900326988ed37146c347 ]
As syzbot reported, there is an use-after-free issue during f2fs recovery:
Use-after-free write at 0xffff88823bc16040 (in kfence-#10): kmem_cache_destroy+0x1f/0x120 mm/slab_common.c:486 f2fs_recover_fsync_data+0x75b0/0x8380 fs/f2fs/recovery.c:869 f2fs_fill_super+0x9393/0xa420 fs/f2fs/super.c:3945 mount_bdev+0x26c/0x3a0 fs/super.c:1367 legacy_get_tree+0xea/0x180 fs/fs_context.c:592 vfs_get_tree+0x86/0x270 fs/super.c:1497 do_new_mount fs/namespace.c:2905 [inline] path_mount+0x196f/0x2be0 fs/namespace.c:3235 do_mount fs/namespace.c:3248 [inline] __do_sys_mount fs/namespace.c:3456 [inline] __se_sys_mount+0x2f9/0x3b0 fs/namespace.c:3433 do_syscall_64+0x3f/0xb0 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae
The root cause is multi f2fs filesystem instances can race on accessing global fsync_entry_slab pointer, result in use-after-free issue of slab cache, fixes to init/destroy this slab cache only once during module init/destroy procedure to avoid this issue.
Reported-by: syzbot+9d90dad32dd9727ed084@syzkaller.appspotmail.com Signed-off-by: Chao Yu yuchao0@huawei.com Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/f2fs/f2fs.h | 2 ++ fs/f2fs/recovery.c | 23 ++++++++++++++--------- fs/f2fs/super.c | 8 +++++++- 3 files changed, 23 insertions(+), 10 deletions(-)
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index f3fabb1edfe9..148fc1ce52d7 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -3560,6 +3560,8 @@ void f2fs_destroy_garbage_collection_cache(void); */ int f2fs_recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only); bool f2fs_space_for_roll_forward(struct f2fs_sb_info *sbi); +int __init f2fs_create_recovery_cache(void); +void f2fs_destroy_recovery_cache(void);
/* * debug.c diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c index da75d5d52f0a..4ae055f39e21 100644 --- a/fs/f2fs/recovery.c +++ b/fs/f2fs/recovery.c @@ -787,13 +787,6 @@ int f2fs_recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only) quota_enabled = f2fs_enable_quota_files(sbi, s_flags & SB_RDONLY); #endif
- fsync_entry_slab = f2fs_kmem_cache_create("f2fs_fsync_inode_entry", - sizeof(struct fsync_inode_entry)); - if (!fsync_entry_slab) { - err = -ENOMEM; - goto out; - } - INIT_LIST_HEAD(&inode_list); INIT_LIST_HEAD(&tmp_inode_list); INIT_LIST_HEAD(&dir_list); @@ -866,8 +859,6 @@ int f2fs_recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only) } }
- kmem_cache_destroy(fsync_entry_slab); -out: #ifdef CONFIG_QUOTA /* Turn quotas off */ if (quota_enabled) @@ -877,3 +868,17 @@ int f2fs_recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only)
return ret ? ret: err; } + +int __init f2fs_create_recovery_cache(void) +{ + fsync_entry_slab = f2fs_kmem_cache_create("f2fs_fsync_inode_entry", + sizeof(struct fsync_inode_entry)); + if (!fsync_entry_slab) + return -ENOMEM; + return 0; +} + +void f2fs_destroy_recovery_cache(void) +{ + kmem_cache_destroy(fsync_entry_slab); +} diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c index d852d96773a3..1f7bc4b3f6ae 100644 --- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -4187,9 +4187,12 @@ static int __init init_f2fs_fs(void) err = f2fs_create_checkpoint_caches(); if (err) goto free_segment_manager_caches; - err = f2fs_create_extent_cache(); + err = f2fs_create_recovery_cache(); if (err) goto free_checkpoint_caches; + err = f2fs_create_extent_cache(); + if (err) + goto free_recovery_cache; err = f2fs_create_garbage_collection_cache(); if (err) goto free_extent_cache; @@ -4238,6 +4241,8 @@ static int __init init_f2fs_fs(void) f2fs_destroy_garbage_collection_cache(); free_extent_cache: f2fs_destroy_extent_cache(); +free_recovery_cache: + f2fs_destroy_recovery_cache(); free_checkpoint_caches: f2fs_destroy_checkpoint_caches(); free_segment_manager_caches: @@ -4263,6 +4268,7 @@ static void __exit exit_f2fs_fs(void) f2fs_exit_sysfs(); f2fs_destroy_garbage_collection_cache(); f2fs_destroy_extent_cache(); + f2fs_destroy_recovery_cache(); f2fs_destroy_checkpoint_caches(); f2fs_destroy_segment_manager_caches(); f2fs_destroy_node_manager_caches();
From: Uwe Kleine-König u.kleine-koenig@pengutronix.de
[ Upstream commit b601a18f12383001e7a8da238de7ca1559ebc450 ]
A consumer is expected to disable a PWM before calling pwm_put(). And if they didn't there is hopefully a good reason (or the consumer needs fixing). Also if disabling an enabled PWM was the right thing to do, this should better be done in the framework instead of in each low level driver.
So drop the hardware modification from the .remove() callback.
Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Thierry Reding thierry.reding@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pwm/pwm-spear.c | 4 ---- 1 file changed, 4 deletions(-)
diff --git a/drivers/pwm/pwm-spear.c b/drivers/pwm/pwm-spear.c index f63b54aae1b4..7467e03d2fb5 100644 --- a/drivers/pwm/pwm-spear.c +++ b/drivers/pwm/pwm-spear.c @@ -229,10 +229,6 @@ static int spear_pwm_probe(struct platform_device *pdev) static int spear_pwm_remove(struct platform_device *pdev) { struct spear_pwm_chip *pc = platform_get_drvdata(pdev); - int i; - - for (i = 0; i < NUM_PWM; i++) - pwm_disable(&pc->chip.pwms[i]);
/* clk was prepared in probe, hence unprepare it here */ clk_unprepare(pc->clk);
From: Randy Dunlap rdunlap@infradead.org
[ Upstream commit 5be967d5016ac5ffb9c4d0df51b48441ee4d5ed1 ]
PCI_IOSIZE is defined in mach-loongson64/spaces.h, so change the name of the PCI_* macros in pci-ftpci100.c to use FTPCI_* so that they are more localized and won't conflict with other drivers or arches.
../drivers/pci/controller/pci-ftpci100.c:37: warning: "PCI_IOSIZE" redefined 37 | #define PCI_IOSIZE 0x00 | In file included from ../arch/mips/include/asm/addrspace.h:13, ... from ../drivers/pci/controller/pci-ftpci100.c:15: arch/mips/include/asm/mach-loongson64/spaces.h:11: note: this is the location of the previous definition 11 | #define PCI_IOSIZE SZ_16M
Suggested-by: Linus Walleij linus.walleij@linaro.org Link: https://lore.kernel.org/r/20210517234117.3660-1-rdunlap@infradead.org Reported-by: kernel test robot lkp@intel.com Signed-off-by: Randy Dunlap rdunlap@infradead.org Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Reviewed-by: Linus Walleij linus.walleij@linaro.org Cc: Jiaxun Yang jiaxun.yang@flygoat.com Cc: Linus Walleij linus.walleij@linaro.org Cc: Krzysztof Wilczyński kw@linux.com Cc: Thomas Bogendoerfer tsbogend@alpha.franken.de Cc: linux-mips@vger.kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/pci-ftpci100.c | 30 +++++++++++++-------------- 1 file changed, 15 insertions(+), 15 deletions(-)
diff --git a/drivers/pci/controller/pci-ftpci100.c b/drivers/pci/controller/pci-ftpci100.c index da3cd216da00..aefef1986201 100644 --- a/drivers/pci/controller/pci-ftpci100.c +++ b/drivers/pci/controller/pci-ftpci100.c @@ -34,12 +34,12 @@ * Special configuration registers directly in the first few words * in I/O space. */ -#define PCI_IOSIZE 0x00 -#define PCI_PROT 0x04 /* AHB protection */ -#define PCI_CTRL 0x08 /* PCI control signal */ -#define PCI_SOFTRST 0x10 /* Soft reset counter and response error enable */ -#define PCI_CONFIG 0x28 /* PCI configuration command register */ -#define PCI_DATA 0x2C +#define FTPCI_IOSIZE 0x00 +#define FTPCI_PROT 0x04 /* AHB protection */ +#define FTPCI_CTRL 0x08 /* PCI control signal */ +#define FTPCI_SOFTRST 0x10 /* Soft reset counter and response error enable */ +#define FTPCI_CONFIG 0x28 /* PCI configuration command register */ +#define FTPCI_DATA 0x2C
#define FARADAY_PCI_STATUS_CMD 0x04 /* Status and command */ #define FARADAY_PCI_PMC 0x40 /* Power management control */ @@ -195,9 +195,9 @@ static int faraday_raw_pci_read_config(struct faraday_pci *p, int bus_number, PCI_CONF_FUNCTION(PCI_FUNC(fn)) | PCI_CONF_WHERE(config) | PCI_CONF_ENABLE, - p->base + PCI_CONFIG); + p->base + FTPCI_CONFIG);
- *value = readl(p->base + PCI_DATA); + *value = readl(p->base + FTPCI_DATA);
if (size == 1) *value = (*value >> (8 * (config & 3))) & 0xFF; @@ -230,17 +230,17 @@ static int faraday_raw_pci_write_config(struct faraday_pci *p, int bus_number, PCI_CONF_FUNCTION(PCI_FUNC(fn)) | PCI_CONF_WHERE(config) | PCI_CONF_ENABLE, - p->base + PCI_CONFIG); + p->base + FTPCI_CONFIG);
switch (size) { case 4: - writel(value, p->base + PCI_DATA); + writel(value, p->base + FTPCI_DATA); break; case 2: - writew(value, p->base + PCI_DATA + (config & 3)); + writew(value, p->base + FTPCI_DATA + (config & 3)); break; case 1: - writeb(value, p->base + PCI_DATA + (config & 3)); + writeb(value, p->base + FTPCI_DATA + (config & 3)); break; default: ret = PCIBIOS_BAD_REGISTER_NUMBER; @@ -469,7 +469,7 @@ static int faraday_pci_probe(struct platform_device *pdev) if (!faraday_res_to_memcfg(io->start - win->offset, resource_size(io), &val)) { /* setup I/O space size */ - writel(val, p->base + PCI_IOSIZE); + writel(val, p->base + FTPCI_IOSIZE); } else { dev_err(dev, "illegal IO mem size\n"); return -EINVAL; @@ -477,11 +477,11 @@ static int faraday_pci_probe(struct platform_device *pdev) }
/* Setup hostbridge */ - val = readl(p->base + PCI_CTRL); + val = readl(p->base + FTPCI_CTRL); val |= PCI_COMMAND_IO; val |= PCI_COMMAND_MEMORY; val |= PCI_COMMAND_MASTER; - writel(val, p->base + PCI_CTRL); + writel(val, p->base + FTPCI_CTRL); /* Mask and clear all interrupts */ faraday_raw_pci_write_config(p, 0, 0, FARADAY_PCI_CTRL2 + 2, 2, 0xF000); if (variant->cascaded_irq) {
From: Linus Walleij linus.walleij@linaro.org
[ Upstream commit 5bcb5087c9dd3dca1ff0ebd8002c5313c9332b56 ]
Sometimes the code will crash because we haven't enabled AC or USB charging and thus not created the corresponding psy device. Fix it by checking that it is there before notifying.
Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Sebastian Reichel sebastian.reichel@collabora.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/power/supply/ab8500_charger.c | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/drivers/power/supply/ab8500_charger.c b/drivers/power/supply/ab8500_charger.c index ac77c8882d17..893185448284 100644 --- a/drivers/power/supply/ab8500_charger.c +++ b/drivers/power/supply/ab8500_charger.c @@ -413,6 +413,14 @@ static void ab8500_enable_disable_sw_fallback(struct ab8500_charger *di, static void ab8500_power_supply_changed(struct ab8500_charger *di, struct power_supply *psy) { + /* + * This happens if we get notifications or interrupts and + * the platform has been configured not to support one or + * other type of charging. + */ + if (!psy) + return; + if (di->autopower_cfg) { if (!di->usb.charger_connected && !di->ac.charger_connected && @@ -439,7 +447,15 @@ static void ab8500_charger_set_usb_connected(struct ab8500_charger *di, if (!connected) di->flags.vbus_drop_end = false;
- sysfs_notify(&di->usb_chg.psy->dev.kobj, NULL, "present"); + /* + * Sometimes the platform is configured not to support + * USB charging and no psy has been created, but we still + * will get these notifications. + */ + if (di->usb_chg.psy) { + sysfs_notify(&di->usb_chg.psy->dev.kobj, NULL, + "present"); + }
if (connected) { mutex_lock(&di->charger_attached_mutex);
From: Long Li longli@microsoft.com
[ Upstream commit 94d22763207ac6633612b8d8e0ca4fba0f7aa139 ]
On removing the device, any work item (hv_pci_devices_present() or hv_pci_eject_device()) scheduled on workqueue hbus->wq may still be running and race with hv_pci_remove().
This can happen because the host may send PCI_EJECT or PCI_BUS_RELATIONS(2) and decide to rescind the channel immediately after that.
Fix this by flushing/destroying the workqueue of hbus before doing hbus remove.
Link: https://lore.kernel.org/r/1620806800-30983-1-git-send-email-longli@linuxonhy... Signed-off-by: Long Li longli@microsoft.com Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Reviewed-by: Michael Kelley mikelley@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/pci-hyperv.c | 30 ++++++++++++++++++++++------- 1 file changed, 23 insertions(+), 7 deletions(-)
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c index 27a17a1e4a7c..c6122a1b0c46 100644 --- a/drivers/pci/controller/pci-hyperv.c +++ b/drivers/pci/controller/pci-hyperv.c @@ -444,7 +444,6 @@ enum hv_pcibus_state { hv_pcibus_probed, hv_pcibus_installed, hv_pcibus_removing, - hv_pcibus_removed, hv_pcibus_maximum };
@@ -3247,8 +3246,9 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool keep_devs) struct pci_packet teardown_packet; u8 buffer[sizeof(struct pci_message)]; } pkt; - struct hv_dr_state *dr; struct hv_pci_compl comp_pkt; + struct hv_pci_dev *hpdev, *tmp; + unsigned long flags; int ret;
/* @@ -3260,9 +3260,16 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool keep_devs)
if (!keep_devs) { /* Delete any children which might still exist. */ - dr = kzalloc(sizeof(*dr), GFP_KERNEL); - if (dr && hv_pci_start_relations_work(hbus, dr)) - kfree(dr); + spin_lock_irqsave(&hbus->device_list_lock, flags); + list_for_each_entry_safe(hpdev, tmp, &hbus->children, list_entry) { + list_del(&hpdev->list_entry); + if (hpdev->pci_slot) + pci_destroy_slot(hpdev->pci_slot); + /* For the two refs got in new_pcichild_device() */ + put_pcichild(hpdev); + put_pcichild(hpdev); + } + spin_unlock_irqrestore(&hbus->device_list_lock, flags); }
ret = hv_send_resources_released(hdev); @@ -3305,13 +3312,23 @@ static int hv_pci_remove(struct hv_device *hdev)
hbus = hv_get_drvdata(hdev); if (hbus->state == hv_pcibus_installed) { + tasklet_disable(&hdev->channel->callback_event); + hbus->state = hv_pcibus_removing; + tasklet_enable(&hdev->channel->callback_event); + destroy_workqueue(hbus->wq); + hbus->wq = NULL; + /* + * At this point, no work is running or can be scheduled + * on hbus-wq. We can't race with hv_pci_devices_present() + * or hv_pci_eject_device(), it's safe to proceed. + */ + /* Remove the bus from PCI's point of view. */ pci_lock_rescan_remove(); pci_stop_root_bus(hbus->pci_bus); hv_pci_remove_slots(hbus); pci_remove_root_bus(hbus->pci_bus); pci_unlock_rescan_remove(); - hbus->state = hv_pcibus_removed; }
ret = hv_pci_bus_exit(hdev, false); @@ -3326,7 +3343,6 @@ static int hv_pci_remove(struct hv_device *hdev) irq_domain_free_fwnode(hbus->sysdata.fwnode); put_hvpcibus(hbus); wait_for_completion(&hbus->remove_event); - destroy_workqueue(hbus->wq);
hv_put_dom_num(hbus->sysdata.domain);
From: Krzysztof Kozlowski krzk@kernel.org
[ Upstream commit 7fbf6b731bca347700e460d94b130f9d734b33e9 ]
Interrupt line can be configured on different hardware in different way, even inverted. Therefore driver should not enforce specific trigger type - edge falling - but instead rely on Devicetree to configure it.
The Maxim 17047/77693 datasheets describe the interrupt line as active low with a requirement of acknowledge from the CPU therefore the edge falling is not correct.
The interrupt line is shared between PMIC and RTC driver, so using level sensitive interrupt is here especially important to avoid races. With an edge configuration in case if first PMIC signals interrupt followed shortly after by the RTC, the interrupt might not be yet cleared/acked thus the second one would not be noticed.
Signed-off-by: Krzysztof Kozlowski krzk@kernel.org Signed-off-by: Sebastian Reichel sebastian.reichel@collabora.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/power/supply/max17042_battery.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/power/supply/max17042_battery.c b/drivers/power/supply/max17042_battery.c index 79d4b5988360..8117ecabe31c 100644 --- a/drivers/power/supply/max17042_battery.c +++ b/drivers/power/supply/max17042_battery.c @@ -1104,7 +1104,7 @@ static int max17042_probe(struct i2c_client *client, }
if (client->irq) { - unsigned int flags = IRQF_TRIGGER_FALLING | IRQF_ONESHOT; + unsigned int flags = IRQF_ONESHOT;
/* * On ACPI systems the IRQ may be handled by ACPI-event code,
From: Bixuan Cui cuibixuan@huawei.com
[ Upstream commit ed3443fb4df4e140a22f65144546c8a8e1e27f4e ]
This patch adds missing MODULE_DEVICE_TABLE definition which generates correct modalias for automatic loading of this driver when it is built as an external module.
Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Bixuan Cui cuibixuan@huawei.com Signed-off-by: Sebastian Reichel sebastian.reichel@collabora.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/power/reset/gpio-poweroff.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/power/reset/gpio-poweroff.c b/drivers/power/reset/gpio-poweroff.c index c5067eb75370..1c5af2fef142 100644 --- a/drivers/power/reset/gpio-poweroff.c +++ b/drivers/power/reset/gpio-poweroff.c @@ -90,6 +90,7 @@ static const struct of_device_id of_gpio_poweroff_match[] = { { .compatible = "gpio-poweroff", }, {}, }; +MODULE_DEVICE_TABLE(of, of_gpio_poweroff_match);
static struct platform_driver gpio_poweroff_driver = { .probe = gpio_poweroff_probe,
From: Nick Desaulniers ndesaulniers@google.com
[ Upstream commit 8b95a7d90ce8160ac5cffd5bace6e2eba01a871e ]
There's a few instructions that GAS infers operands but Clang doesn't; from what I can tell the Arm ARM doesn't say these are optional.
F5.1.257 TBB, TBH T1 Halfword variant F5.1.238 STREXD T1 variant F5.1.84 LDREXD T1 variant
Link: https://github.com/ClangBuiltLinux/linux/issues/1309
Signed-off-by: Nick Desaulniers ndesaulniers@google.com Reviewed-by: Jian Cai jiancai@google.com Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/probes/kprobes/test-thumb.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/arm/probes/kprobes/test-thumb.c b/arch/arm/probes/kprobes/test-thumb.c index 456c181a7bfe..4e11f0b760f8 100644 --- a/arch/arm/probes/kprobes/test-thumb.c +++ b/arch/arm/probes/kprobes/test-thumb.c @@ -441,21 +441,21 @@ void kprobe_thumb32_test_cases(void) "3: mvn r0, r0 \n\t" "2: nop \n\t")
- TEST_RX("tbh [pc, r",7, (9f-(1f+4))>>1,"]", + TEST_RX("tbh [pc, r",7, (9f-(1f+4))>>1,", lsl #1]", "9: \n\t" ".short (2f-1b-4)>>1 \n\t" ".short (3f-1b-4)>>1 \n\t" "3: mvn r0, r0 \n\t" "2: nop \n\t")
- TEST_RX("tbh [pc, r",12, ((9f-(1f+4))>>1)+1,"]", + TEST_RX("tbh [pc, r",12, ((9f-(1f+4))>>1)+1,", lsl #1]", "9: \n\t" ".short (2f-1b-4)>>1 \n\t" ".short (3f-1b-4)>>1 \n\t" "3: mvn r0, r0 \n\t" "2: nop \n\t")
- TEST_RRX("tbh [r",1,9f, ", r",14,1,"]", + TEST_RRX("tbh [r",1,9f, ", r",14,1,", lsl #1]", "9: \n\t" ".short (2f-1b-4)>>1 \n\t" ".short (3f-1b-4)>>1 \n\t" @@ -468,10 +468,10 @@ void kprobe_thumb32_test_cases(void)
TEST_UNSUPPORTED("strexb r0, r1, [r2]") TEST_UNSUPPORTED("strexh r0, r1, [r2]") - TEST_UNSUPPORTED("strexd r0, r1, [r2]") + TEST_UNSUPPORTED("strexd r0, r1, r2, [r2]") TEST_UNSUPPORTED("ldrexb r0, [r1]") TEST_UNSUPPORTED("ldrexh r0, [r1]") - TEST_UNSUPPORTED("ldrexd r0, [r1]") + TEST_UNSUPPORTED("ldrexd r0, r1, [r1]")
TEST_GROUP("Data-processing (shifted register) and (modified immediate)")
From: Logan Gunthorpe logang@deltatee.com
[ Upstream commit 3ec0c3ec2d92c09465534a1ff9c6f9d9506ffef6 ]
In order to use upstream_bridge_distance_warn() from a dma_map function, it must not sleep. However, pci_get_slot() takes the pci_bus_sem so it might sleep.
In order to avoid this, try to get the host bridge's device from the first element in the device list. It should be impossible for the host bridge's device to go away while references are held on child devices, so the first element should not be able to change and, thus, this should be safe.
Introduce a static function called pci_host_bridge_dev() to obtain the host bridge's root device.
Link: https://lore.kernel.org/r/20210610160609.28447-7-logang@deltatee.com Signed-off-by: Logan Gunthorpe logang@deltatee.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/p2pdma.c | 34 ++++++++++++++++++++++++++++++++-- 1 file changed, 32 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c index 196382630363..c49c13a5fedc 100644 --- a/drivers/pci/p2pdma.c +++ b/drivers/pci/p2pdma.c @@ -308,10 +308,41 @@ static const struct pci_p2pdma_whitelist_entry { {} };
+/* + * This lookup function tries to find the PCI device corresponding to a given + * host bridge. + * + * It assumes the host bridge device is the first PCI device in the + * bus->devices list and that the devfn is 00.0. These assumptions should hold + * for all the devices in the whitelist above. + * + * This function is equivalent to pci_get_slot(host->bus, 0), however it does + * not take the pci_bus_sem lock seeing __host_bridge_whitelist() must not + * sleep. + * + * For this to be safe, the caller should hold a reference to a device on the + * bridge, which should ensure the host_bridge device will not be freed + * or removed from the head of the devices list. + */ +static struct pci_dev *pci_host_bridge_dev(struct pci_host_bridge *host) +{ + struct pci_dev *root; + + root = list_first_entry_or_null(&host->bus->devices, + struct pci_dev, bus_list); + + if (!root) + return NULL; + if (root->devfn != PCI_DEVFN(0, 0)) + return NULL; + + return root; +} + static bool __host_bridge_whitelist(struct pci_host_bridge *host, bool same_host_bridge) { - struct pci_dev *root = pci_get_slot(host->bus, PCI_DEVFN(0, 0)); + struct pci_dev *root = pci_host_bridge_dev(host); const struct pci_p2pdma_whitelist_entry *entry; unsigned short vendor, device;
@@ -320,7 +351,6 @@ static bool __host_bridge_whitelist(struct pci_host_bridge *host,
vendor = root->vendor; device = root->device; - pci_dev_put(root);
for (entry = pci_p2pdma_whitelist; entry->vendor; entry++) { if (vendor != entry->vendor || device != entry->device)
From: Trond Myklebust trond.myklebust@hammerspace.com
[ Upstream commit be20037725d17935ec669044bd2b15bc40c3b5ab ]
If we're unable to immediately recover all locks because the server is unable to immediately service our reclaim calls, then we want to retry after we've finished servicing all the other asynchronous delegation returns on our queue.
Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/delegation.c | 71 +++++++++++++++++++++++++++++++++++---------- fs/nfs/delegation.h | 1 + fs/nfs/nfs4_fs.h | 1 + 3 files changed, 58 insertions(+), 15 deletions(-)
diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c index 04bf8066980c..d6ac2c4f88b6 100644 --- a/fs/nfs/delegation.c +++ b/fs/nfs/delegation.c @@ -75,6 +75,13 @@ void nfs_mark_delegation_referenced(struct nfs_delegation *delegation) set_bit(NFS_DELEGATION_REFERENCED, &delegation->flags); }
+static void nfs_mark_return_delegation(struct nfs_server *server, + struct nfs_delegation *delegation) +{ + set_bit(NFS_DELEGATION_RETURN, &delegation->flags); + set_bit(NFS4CLNT_DELEGRETURN, &server->nfs_client->cl_state); +} + static bool nfs4_is_valid_delegation(const struct nfs_delegation *delegation, fmode_t flags) @@ -293,6 +300,7 @@ nfs_start_delegation_return_locked(struct nfs_inode *nfsi) goto out; spin_lock(&delegation->lock); if (!test_and_set_bit(NFS_DELEGATION_RETURNING, &delegation->flags)) { + clear_bit(NFS_DELEGATION_RETURN_DELAYED, &delegation->flags); /* Refcount matched in nfs_end_delegation_return() */ ret = nfs_get_delegation(delegation); } @@ -314,16 +322,17 @@ nfs_start_delegation_return(struct nfs_inode *nfsi) return delegation; }
-static void -nfs_abort_delegation_return(struct nfs_delegation *delegation, - struct nfs_client *clp) +static void nfs_abort_delegation_return(struct nfs_delegation *delegation, + struct nfs_client *clp, int err) {
spin_lock(&delegation->lock); clear_bit(NFS_DELEGATION_RETURNING, &delegation->flags); - set_bit(NFS_DELEGATION_RETURN, &delegation->flags); + if (err == -EAGAIN) { + set_bit(NFS_DELEGATION_RETURN_DELAYED, &delegation->flags); + set_bit(NFS4CLNT_DELEGRETURN_DELAYED, &clp->cl_state); + } spin_unlock(&delegation->lock); - set_bit(NFS4CLNT_DELEGRETURN, &clp->cl_state); }
static struct nfs_delegation * @@ -528,7 +537,7 @@ static int nfs_end_delegation_return(struct inode *inode, struct nfs_delegation } while (err == 0);
if (err) { - nfs_abort_delegation_return(delegation, clp); + nfs_abort_delegation_return(delegation, clp, err); goto out; }
@@ -557,6 +566,7 @@ static bool nfs_delegation_need_return(struct nfs_delegation *delegation) if (ret) clear_bit(NFS_DELEGATION_RETURN_IF_CLOSED, &delegation->flags); if (test_bit(NFS_DELEGATION_RETURNING, &delegation->flags) || + test_bit(NFS_DELEGATION_RETURN_DELAYED, &delegation->flags) || test_bit(NFS_DELEGATION_REVOKED, &delegation->flags)) ret = false;
@@ -636,6 +646,38 @@ static int nfs_server_return_marked_delegations(struct nfs_server *server, return err; }
+static bool nfs_server_clear_delayed_delegations(struct nfs_server *server) +{ + struct nfs_delegation *d; + bool ret = false; + + list_for_each_entry_rcu (d, &server->delegations, super_list) { + if (!test_bit(NFS_DELEGATION_RETURN_DELAYED, &d->flags)) + continue; + nfs_mark_return_delegation(server, d); + clear_bit(NFS_DELEGATION_RETURN_DELAYED, &d->flags); + ret = true; + } + return ret; +} + +static bool nfs_client_clear_delayed_delegations(struct nfs_client *clp) +{ + struct nfs_server *server; + bool ret = false; + + if (!test_and_clear_bit(NFS4CLNT_DELEGRETURN_DELAYED, &clp->cl_state)) + goto out; + rcu_read_lock(); + list_for_each_entry_rcu (server, &clp->cl_superblocks, client_link) { + if (nfs_server_clear_delayed_delegations(server)) + ret = true; + } + rcu_read_unlock(); +out: + return ret; +} + /** * nfs_client_return_marked_delegations - return previously marked delegations * @clp: nfs_client to process @@ -648,8 +690,14 @@ static int nfs_server_return_marked_delegations(struct nfs_server *server, */ int nfs_client_return_marked_delegations(struct nfs_client *clp) { - return nfs_client_for_each_server(clp, - nfs_server_return_marked_delegations, NULL); + int err = nfs_client_for_each_server( + clp, nfs_server_return_marked_delegations, NULL); + if (err) + return err; + /* If a return was delayed, sleep to prevent hard looping */ + if (nfs_client_clear_delayed_delegations(clp)) + ssleep(1); + return 0; }
/** @@ -764,13 +812,6 @@ static void nfs_mark_return_if_closed_delegation(struct nfs_server *server, set_bit(NFS4CLNT_DELEGRETURN, &server->nfs_client->cl_state); }
-static void nfs_mark_return_delegation(struct nfs_server *server, - struct nfs_delegation *delegation) -{ - set_bit(NFS_DELEGATION_RETURN, &delegation->flags); - set_bit(NFS4CLNT_DELEGRETURN, &server->nfs_client->cl_state); -} - static bool nfs_server_mark_return_all_delegations(struct nfs_server *server) { struct nfs_delegation *delegation; diff --git a/fs/nfs/delegation.h b/fs/nfs/delegation.h index 9b00a0b7f832..26f57a99da84 100644 --- a/fs/nfs/delegation.h +++ b/fs/nfs/delegation.h @@ -36,6 +36,7 @@ enum { NFS_DELEGATION_REVOKED, NFS_DELEGATION_TEST_EXPIRED, NFS_DELEGATION_INODE_FREEING, + NFS_DELEGATION_RETURN_DELAYED, };
int nfs_inode_set_delegation(struct inode *inode, const struct cred *cred, diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h index 543d916f79ab..3e344bec3647 100644 --- a/fs/nfs/nfs4_fs.h +++ b/fs/nfs/nfs4_fs.h @@ -45,6 +45,7 @@ enum nfs4_client_state { NFS4CLNT_RECALL_RUNNING, NFS4CLNT_RECALL_ANY_LAYOUT_READ, NFS4CLNT_RECALL_ANY_LAYOUT_RW, + NFS4CLNT_DELEGRETURN_DELAYED, };
#define NFS4_RENEW_TIMEOUT 0x01
From: Lukas Wunner lukas@wunner.de
[ Upstream commit a97396c6eb13f65bea894dbe7739b2e883d40a3e ]
Downstream Port Containment (PCIe r5.0, sec. 6.2.10) disables the link upon an error and attempts to re-enable it when instructed by the DPC driver.
A slot which is both DPC- and hotplug-capable is currently powered off by pciehp once DPC is triggered (due to the link change) and powered back up on successful recovery. That's undesirable, the slot should remain powered so the hotplugged device remains bound to its driver. DPC notifies the driver of the error and of successful recovery in pcie_do_recovery() and the driver may then restore the device to working state.
Moreover, Sinan points out that turning off slot power by pciehp may foil recovery by DPC: Power off/on is a cold reset concurrently to DPC's warm reset. Sathyanarayanan reports extended delays or failure in link retraining by DPC if pciehp brings down the slot.
Fix by detecting whether a Link Down event is caused by DPC and awaiting recovery if so. On successful recovery, ignore both the Link Down and the subsequent Link Up event.
Afterwards, check whether the link is down to detect surprise-removal or another DPC event immediately after DPC recovery. Ensure that the corresponding DLLSC event is not ignored by synthesizing it and invoking irq_wake_thread() to trigger a re-run of pciehp_ist().
The IRQ threads of the hotplug and DPC drivers, pciehp_ist() and dpc_handler(), race against each other. If pciehp is faster than DPC, it will wait until DPC recovery completes.
Recovery consists of two steps: The first step (waiting for link disablement) is recognizable by pciehp through a set DPC Trigger Status bit. The second step (waiting for link retraining) is recognizable through a newly introduced PCI_DPC_RECOVERING flag.
If DPC is faster than pciehp, neither of the two flags will be set and pciehp may glean the recovery status from the new PCI_DPC_RECOVERED flag. The flag is zero if DPC didn't occur at all, hence DLLSC events are not ignored by default.
pciehp waits up to 4 seconds before assuming that DPC recovery failed and bringing down the slot. This timeout is not taken from the spec (it doesn't mandate one) but based on a report from Yicong Yang that DPC may take a bit more than 3 seconds on HiSilicon's Kunpeng platform.
The timeout is necessary because the DPC Trigger Status bit may never clear: On Root Ports which support RP Extensions for DPC, the DPC driver polls the DPC RP Busy bit for up to 1 second before giving up on DPC recovery. Without the timeout, pciehp would then wait indefinitely for DPC to complete.
This commit draws inspiration from previous attempts to synchronize DPC with pciehp:
By Sinan Kaya, August 2018: https://lore.kernel.org/linux-pci/20180818065126.77912-1-okaya@kernel.org/
By Ethan Zhao, October 2020: https://lore.kernel.org/linux-pci/20201007113158.48933-1-haifeng.zhao@intel....
By Kuppuswamy Sathyanarayanan, March 2021: https://lore.kernel.org/linux-pci/59cb30f5e5ac6d65427ceaadf1012b2ba8dbf66c.1...
Link: https://lore.kernel.org/r/0be565d97438fe2a6d57354b3aa4e8626952a00b.161985712... Reported-by: Sinan Kaya okaya@kernel.org Reported-by: Ethan Zhao haifeng.zhao@intel.com Reported-by: Kuppuswamy Sathyanarayanan sathyanarayanan.kuppuswamy@linux.intel.com Tested-by: Kuppuswamy Sathyanarayanan sathyanarayanan.kuppuswamy@linux.intel.com Tested-by: Yicong Yang yangyicong@hisilicon.com Signed-off-by: Lukas Wunner lukas@wunner.de Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Kuppuswamy Sathyanarayanan sathyanarayanan.kuppuswamy@linux.intel.com Cc: Dan Williams dan.j.williams@intel.com Cc: Ashok Raj ashok.raj@intel.com Cc: Keith Busch kbusch@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/hotplug/pciehp_hpc.c | 36 ++++++++++++++++ drivers/pci/pci.h | 4 ++ drivers/pci/pcie/dpc.c | 74 +++++++++++++++++++++++++++++--- 3 files changed, 109 insertions(+), 5 deletions(-)
diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c index fb3840e222ad..9d06939736c0 100644 --- a/drivers/pci/hotplug/pciehp_hpc.c +++ b/drivers/pci/hotplug/pciehp_hpc.c @@ -563,6 +563,32 @@ void pciehp_power_off_slot(struct controller *ctrl) PCI_EXP_SLTCTL_PWR_OFF); }
+static void pciehp_ignore_dpc_link_change(struct controller *ctrl, + struct pci_dev *pdev, int irq) +{ + /* + * Ignore link changes which occurred while waiting for DPC recovery. + * Could be several if DPC triggered multiple times consecutively. + */ + synchronize_hardirq(irq); + atomic_and(~PCI_EXP_SLTSTA_DLLSC, &ctrl->pending_events); + if (pciehp_poll_mode) + pcie_capability_write_word(pdev, PCI_EXP_SLTSTA, + PCI_EXP_SLTSTA_DLLSC); + ctrl_info(ctrl, "Slot(%s): Link Down/Up ignored (recovered by DPC)\n", + slot_name(ctrl)); + + /* + * If the link is unexpectedly down after successful recovery, + * the corresponding link change may have been ignored above. + * Synthesize it to ensure that it is acted on. + */ + down_read(&ctrl->reset_lock); + if (!pciehp_check_link_active(ctrl)) + pciehp_request(ctrl, PCI_EXP_SLTSTA_DLLSC); + up_read(&ctrl->reset_lock); +} + static irqreturn_t pciehp_isr(int irq, void *dev_id) { struct controller *ctrl = (struct controller *)dev_id; @@ -706,6 +732,16 @@ static irqreturn_t pciehp_ist(int irq, void *dev_id) PCI_EXP_SLTCTL_ATTN_IND_ON); }
+ /* + * Ignore Link Down/Up events caused by Downstream Port Containment + * if recovery from the error succeeded. + */ + if ((events & PCI_EXP_SLTSTA_DLLSC) && pci_dpc_recovered(pdev) && + ctrl->state == ON_STATE) { + events &= ~PCI_EXP_SLTSTA_DLLSC; + pciehp_ignore_dpc_link_change(ctrl, pdev, irq); + } + /* * Disable requests have higher priority than Presence Detect Changed * or Data Link Layer State Changed events. diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h index 9684b468267f..e5ae5e860431 100644 --- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -392,6 +392,8 @@ static inline bool pci_dev_is_disconnected(const struct pci_dev *dev)
/* pci_dev priv_flags */ #define PCI_DEV_ADDED 0 +#define PCI_DPC_RECOVERED 1 +#define PCI_DPC_RECOVERING 2
static inline void pci_dev_assign_added(struct pci_dev *dev, bool added) { @@ -446,10 +448,12 @@ void pci_restore_dpc_state(struct pci_dev *dev); void pci_dpc_init(struct pci_dev *pdev); void dpc_process_error(struct pci_dev *pdev); pci_ers_result_t dpc_reset_link(struct pci_dev *pdev); +bool pci_dpc_recovered(struct pci_dev *pdev); #else static inline void pci_save_dpc_state(struct pci_dev *dev) {} static inline void pci_restore_dpc_state(struct pci_dev *dev) {} static inline void pci_dpc_init(struct pci_dev *pdev) {} +static inline bool pci_dpc_recovered(struct pci_dev *pdev) { return false; } #endif
#ifdef CONFIG_PCIEPORTBUS diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c index e05aba86a317..c556e7beafe3 100644 --- a/drivers/pci/pcie/dpc.c +++ b/drivers/pci/pcie/dpc.c @@ -71,6 +71,58 @@ void pci_restore_dpc_state(struct pci_dev *dev) pci_write_config_word(dev, dev->dpc_cap + PCI_EXP_DPC_CTL, *cap); }
+static DECLARE_WAIT_QUEUE_HEAD(dpc_completed_waitqueue); + +#ifdef CONFIG_HOTPLUG_PCI_PCIE +static bool dpc_completed(struct pci_dev *pdev) +{ + u16 status; + + pci_read_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_STATUS, &status); + if ((status != 0xffff) && (status & PCI_EXP_DPC_STATUS_TRIGGER)) + return false; + + if (test_bit(PCI_DPC_RECOVERING, &pdev->priv_flags)) + return false; + + return true; +} + +/** + * pci_dpc_recovered - whether DPC triggered and has recovered successfully + * @pdev: PCI device + * + * Return true if DPC was triggered for @pdev and has recovered successfully. + * Wait for recovery if it hasn't completed yet. Called from the PCIe hotplug + * driver to recognize and ignore Link Down/Up events caused by DPC. + */ +bool pci_dpc_recovered(struct pci_dev *pdev) +{ + struct pci_host_bridge *host; + + if (!pdev->dpc_cap) + return false; + + /* + * Synchronization between hotplug and DPC is not supported + * if DPC is owned by firmware and EDR is not enabled. + */ + host = pci_find_host_bridge(pdev->bus); + if (!host->native_dpc && !IS_ENABLED(CONFIG_PCIE_EDR)) + return false; + + /* + * Need a timeout in case DPC never completes due to failure of + * dpc_wait_rp_inactive(). The spec doesn't mandate a time limit, + * but reports indicate that DPC completes within 4 seconds. + */ + wait_event_timeout(dpc_completed_waitqueue, dpc_completed(pdev), + msecs_to_jiffies(4000)); + + return test_and_clear_bit(PCI_DPC_RECOVERED, &pdev->priv_flags); +} +#endif /* CONFIG_HOTPLUG_PCI_PCIE */ + static int dpc_wait_rp_inactive(struct pci_dev *pdev) { unsigned long timeout = jiffies + HZ; @@ -91,8 +143,11 @@ static int dpc_wait_rp_inactive(struct pci_dev *pdev)
pci_ers_result_t dpc_reset_link(struct pci_dev *pdev) { + pci_ers_result_t ret; u16 cap;
+ set_bit(PCI_DPC_RECOVERING, &pdev->priv_flags); + /* * DPC disables the Link automatically in hardware, so it has * already been reset by the time we get here. @@ -106,18 +161,27 @@ pci_ers_result_t dpc_reset_link(struct pci_dev *pdev) if (!pcie_wait_for_link(pdev, false)) pci_info(pdev, "Data Link Layer Link Active not cleared in 1000 msec\n");
- if (pdev->dpc_rp_extensions && dpc_wait_rp_inactive(pdev)) - return PCI_ERS_RESULT_DISCONNECT; + if (pdev->dpc_rp_extensions && dpc_wait_rp_inactive(pdev)) { + clear_bit(PCI_DPC_RECOVERED, &pdev->priv_flags); + ret = PCI_ERS_RESULT_DISCONNECT; + goto out; + }
pci_write_config_word(pdev, cap + PCI_EXP_DPC_STATUS, PCI_EXP_DPC_STATUS_TRIGGER);
if (!pcie_wait_for_link(pdev, true)) { pci_info(pdev, "Data Link Layer Link Active not set in 1000 msec\n"); - return PCI_ERS_RESULT_DISCONNECT; + clear_bit(PCI_DPC_RECOVERED, &pdev->priv_flags); + ret = PCI_ERS_RESULT_DISCONNECT; + } else { + set_bit(PCI_DPC_RECOVERED, &pdev->priv_flags); + ret = PCI_ERS_RESULT_RECOVERED; } - - return PCI_ERS_RESULT_RECOVERED; +out: + clear_bit(PCI_DPC_RECOVERING, &pdev->priv_flags); + wake_up_all(&dpc_completed_waitqueue); + return ret; }
static void dpc_process_rp_pio_error(struct pci_dev *pdev)
From: Russell King rmk+kernel@armlinux.org.uk
[ Upstream commit 8fe55ef23387ce3c7488375b1fd539420d7654bb ]
Attempting to boot 32-bit ARM kernels under QEMU's 3.x virt models fails when we have more than 512M of RAM in the model as we run out of vmalloc space for the PCI ECAM regions. This failure will be silent when running libvirt, as the console in that situation is a PCI device.
In this configuration, the kernel maps the whole ECAM, which QEMU sets up for 256 buses, even when maybe only seven buses are in use. Each bus uses 1M of ECAM space, and ioremap() adds an additional guard page between allocations. The kernel vmap allocator will align these regions to 512K, resulting in each mapping eating 1.5M of vmalloc space. This means we need 384M of vmalloc space just to map all of these, which is very wasteful of resources.
Fix this by only mapping the ECAM for buses we are going to be using. In my setups, this is around seven buses in most guests, which is 10.5M of vmalloc space - way smaller than the 384M that would otherwise be required. This also means that the kernel can boot without forcing extra RAM into highmem with the vmalloc= argument, or decreasing the virtual RAM available to the guest.
Suggested-by: Arnd Bergmann arnd@arndb.de Link: https://lore.kernel.org/r/E1lhCAV-0002yb-50@rmk-PC.armlinux.org.uk Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/ecam.c | 54 ++++++++++++++++++++++++++++++++++------ include/linux/pci-ecam.h | 1 + 2 files changed, 47 insertions(+), 8 deletions(-)
diff --git a/drivers/pci/ecam.c b/drivers/pci/ecam.c index d2a1920bb055..1c40d2506aef 100644 --- a/drivers/pci/ecam.c +++ b/drivers/pci/ecam.c @@ -32,7 +32,7 @@ struct pci_config_window *pci_ecam_create(struct device *dev, struct pci_config_window *cfg; unsigned int bus_range, bus_range_max, bsz; struct resource *conflict; - int i, err; + int err;
if (busr->start > busr->end) return ERR_PTR(-EINVAL); @@ -50,6 +50,7 @@ struct pci_config_window *pci_ecam_create(struct device *dev, cfg->busr.start = busr->start; cfg->busr.end = busr->end; cfg->busr.flags = IORESOURCE_BUS; + cfg->bus_shift = bus_shift; bus_range = resource_size(&cfg->busr); bus_range_max = resource_size(cfgres) >> bus_shift; if (bus_range > bus_range_max) { @@ -77,13 +78,6 @@ struct pci_config_window *pci_ecam_create(struct device *dev, cfg->winp = kcalloc(bus_range, sizeof(*cfg->winp), GFP_KERNEL); if (!cfg->winp) goto err_exit_malloc; - for (i = 0; i < bus_range; i++) { - cfg->winp[i] = - pci_remap_cfgspace(cfgres->start + i * bsz, - bsz); - if (!cfg->winp[i]) - goto err_exit_iomap; - } } else { cfg->win = pci_remap_cfgspace(cfgres->start, bus_range * bsz); if (!cfg->win) @@ -129,6 +123,44 @@ void pci_ecam_free(struct pci_config_window *cfg) } EXPORT_SYMBOL_GPL(pci_ecam_free);
+static int pci_ecam_add_bus(struct pci_bus *bus) +{ + struct pci_config_window *cfg = bus->sysdata; + unsigned int bsz = 1 << cfg->bus_shift; + unsigned int busn = bus->number; + phys_addr_t start; + + if (!per_bus_mapping) + return 0; + + if (busn < cfg->busr.start || busn > cfg->busr.end) + return -EINVAL; + + busn -= cfg->busr.start; + start = cfg->res.start + busn * bsz; + + cfg->winp[busn] = pci_remap_cfgspace(start, bsz); + if (!cfg->winp[busn]) + return -ENOMEM; + + return 0; +} + +static void pci_ecam_remove_bus(struct pci_bus *bus) +{ + struct pci_config_window *cfg = bus->sysdata; + unsigned int busn = bus->number; + + if (!per_bus_mapping || busn < cfg->busr.start || busn > cfg->busr.end) + return; + + busn -= cfg->busr.start; + if (cfg->winp[busn]) { + iounmap(cfg->winp[busn]); + cfg->winp[busn] = NULL; + } +} + /* * Function to implement the pci_ops ->map_bus method */ @@ -167,6 +199,8 @@ EXPORT_SYMBOL_GPL(pci_ecam_map_bus); /* ECAM ops */ const struct pci_ecam_ops pci_generic_ecam_ops = { .pci_ops = { + .add_bus = pci_ecam_add_bus, + .remove_bus = pci_ecam_remove_bus, .map_bus = pci_ecam_map_bus, .read = pci_generic_config_read, .write = pci_generic_config_write, @@ -178,6 +212,8 @@ EXPORT_SYMBOL_GPL(pci_generic_ecam_ops); /* ECAM ops for 32-bit access only (non-compliant) */ const struct pci_ecam_ops pci_32b_ops = { .pci_ops = { + .add_bus = pci_ecam_add_bus, + .remove_bus = pci_ecam_remove_bus, .map_bus = pci_ecam_map_bus, .read = pci_generic_config_read32, .write = pci_generic_config_write32, @@ -187,6 +223,8 @@ const struct pci_ecam_ops pci_32b_ops = { /* ECAM ops for 32-bit read only (non-compliant) */ const struct pci_ecam_ops pci_32b_read_ops = { .pci_ops = { + .add_bus = pci_ecam_add_bus, + .remove_bus = pci_ecam_remove_bus, .map_bus = pci_ecam_map_bus, .read = pci_generic_config_read32, .write = pci_generic_config_write, diff --git a/include/linux/pci-ecam.h b/include/linux/pci-ecam.h index 65d3d83015c3..944da75ff25c 100644 --- a/include/linux/pci-ecam.h +++ b/include/linux/pci-ecam.h @@ -55,6 +55,7 @@ struct pci_ecam_ops { struct pci_config_window { struct resource res; struct resource busr; + unsigned int bus_shift; void *priv; const struct pci_ecam_ops *ops; union {
From: Zou Wei zou_wei@huawei.com
[ Upstream commit c08a6b31e4917034f0ed0cb457c3bb209576f542 ]
This module's remove path calls del_timer(). However, that function does not wait until the timer handler finishes. This means that the timer handler may still be running after the driver's remove function has finished, which would result in a use-after-free.
Fix by calling del_timer_sync(), which makes sure the timer handler has finished, and unable to re-schedule itself.
Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Zou Wei zou_wei@huawei.com Reviewed-by: Guenter Roeck linux@roeck-us.net Link: https://lore.kernel.org/r/1620716495-108352-1-git-send-email-zou_wei@huawei.... Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Wim Van Sebroeck wim@linux-watchdog.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/watchdog/sbc60xxwdt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/watchdog/sbc60xxwdt.c b/drivers/watchdog/sbc60xxwdt.c index a947a63fb44a..7b974802dfc7 100644 --- a/drivers/watchdog/sbc60xxwdt.c +++ b/drivers/watchdog/sbc60xxwdt.c @@ -146,7 +146,7 @@ static void wdt_startup(void) static void wdt_turnoff(void) { /* Stop the timer */ - del_timer(&timer); + del_timer_sync(&timer); inb_p(wdt_stop); pr_info("Watchdog timer is now disabled...\n"); }
From: Zou Wei zou_wei@huawei.com
[ Upstream commit 90b7c141132244e8e49a34a4c1e445cce33e07f4 ]
This module's remove path calls del_timer(). However, that function does not wait until the timer handler finishes. This means that the timer handler may still be running after the driver's remove function has finished, which would result in a use-after-free.
Fix by calling del_timer_sync(), which makes sure the timer handler has finished, and unable to re-schedule itself.
Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Zou Wei zou_wei@huawei.com Reviewed-by: Guenter Roeck linux@roeck-us.net Link: https://lore.kernel.org/r/1620716691-108460-1-git-send-email-zou_wei@huawei.... Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Wim Van Sebroeck wim@linux-watchdog.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/watchdog/sc520_wdt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/watchdog/sc520_wdt.c b/drivers/watchdog/sc520_wdt.c index e66e6b905964..ca65468f4b9c 100644 --- a/drivers/watchdog/sc520_wdt.c +++ b/drivers/watchdog/sc520_wdt.c @@ -186,7 +186,7 @@ static int wdt_startup(void) static int wdt_turnoff(void) { /* Stop the timer */ - del_timer(&timer); + del_timer_sync(&timer);
/* Stop the watchdog */ wdt_config(0);
From: Zou Wei zou_wei@huawei.com
[ Upstream commit d0212f095ab56672f6f36aabc605bda205e1e0bf ]
This driver's remove path calls del_timer(). However, that function does not wait until the timer handler finishes. This means that the timer handler may still be running after the driver's remove function has finished, which would result in a use-after-free.
Fix by calling del_timer_sync(), which makes sure the timer handler has finished, and unable to re-schedule itself.
Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Zou Wei zou_wei@huawei.com Reviewed-by: Guenter Roeck linux@roeck-us.net Acked-by: Vladimir Zapolskiy vz@mleia.com Link: https://lore.kernel.org/r/1620802676-19701-1-git-send-email-zou_wei@huawei.c... Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Wim Van Sebroeck wim@linux-watchdog.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/watchdog/lpc18xx_wdt.c | 2 +- drivers/watchdog/w83877f_wdt.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/watchdog/lpc18xx_wdt.c b/drivers/watchdog/lpc18xx_wdt.c index 78cf11c94941..60b6d74f267d 100644 --- a/drivers/watchdog/lpc18xx_wdt.c +++ b/drivers/watchdog/lpc18xx_wdt.c @@ -292,7 +292,7 @@ static int lpc18xx_wdt_remove(struct platform_device *pdev) struct lpc18xx_wdt_dev *lpc18xx_wdt = platform_get_drvdata(pdev);
dev_warn(&pdev->dev, "I quit now, hardware will probably reboot!\n"); - del_timer(&lpc18xx_wdt->timer); + del_timer_sync(&lpc18xx_wdt->timer);
return 0; } diff --git a/drivers/watchdog/w83877f_wdt.c b/drivers/watchdog/w83877f_wdt.c index 5772cc5d3780..f2650863fd02 100644 --- a/drivers/watchdog/w83877f_wdt.c +++ b/drivers/watchdog/w83877f_wdt.c @@ -166,7 +166,7 @@ static void wdt_startup(void) static void wdt_turnoff(void) { /* Stop the timer */ - del_timer(&timer); + del_timer_sync(&timer);
wdt_change(WDT_DISABLE);
From: Stefan Eichenberger eichest@gmail.com
[ Upstream commit 854478a381078ee86ae2a7908a934b1ded399130 ]
If the WDIOF_PRETIMEOUT flag is not set when registering the device the driver will not show the sysfs entries or register the default governor. By moving the registering after the decision whether pretimeout is supported this gets fixed.
Signed-off-by: Stefan Eichenberger eichest@gmail.com Reviewed-by: Guenter Roeck linux@roeck-us.net Reviewed-by: Dong Aisheng aisheng.dong@nxp.com Link: https://lore.kernel.org/r/20210519080311.142928-1-eichest@gmail.com Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Wim Van Sebroeck wim@linux-watchdog.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/watchdog/imx_sc_wdt.c | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/drivers/watchdog/imx_sc_wdt.c b/drivers/watchdog/imx_sc_wdt.c index e9ee22a7cb45..8ac021748d16 100644 --- a/drivers/watchdog/imx_sc_wdt.c +++ b/drivers/watchdog/imx_sc_wdt.c @@ -183,16 +183,12 @@ static int imx_sc_wdt_probe(struct platform_device *pdev) watchdog_stop_on_reboot(wdog); watchdog_stop_on_unregister(wdog);
- ret = devm_watchdog_register_device(dev, wdog); - if (ret) - return ret; - ret = imx_scu_irq_group_enable(SC_IRQ_GROUP_WDOG, SC_IRQ_WDOG, true); if (ret) { dev_warn(dev, "Enable irq failed, pretimeout NOT supported\n"); - return 0; + goto register_device; }
imx_sc_wdd->wdt_notifier.notifier_call = imx_sc_wdt_notify; @@ -203,7 +199,7 @@ static int imx_sc_wdt_probe(struct platform_device *pdev) false); dev_warn(dev, "Register irq notifier failed, pretimeout NOT supported\n"); - return 0; + goto register_device; }
ret = devm_add_action_or_reset(dev, imx_sc_wdt_action, @@ -213,7 +209,8 @@ static int imx_sc_wdt_probe(struct platform_device *pdev) else dev_warn(dev, "Add action failed, pretimeout NOT supported\n");
- return 0; +register_device: + return devm_watchdog_register_device(dev, wdog); }
static int __maybe_unused imx_sc_wdt_suspend(struct device *dev)
From: Jan Kiszka jan.kiszka@siemens.com
[ Upstream commit cb011044e34c293e139570ce5c01aed66a34345c ]
This was already attempted to fix via 1fccb73011ea: If the BIOS did not enable TCO SMIs, the timer definitely needs to trigger twice in order to cause a reboot. If TCO SMIs are on, as well as SMIs in general, we can continue to assume that the BIOS will perform a reboot on the first timeout.
QEMU with its ICH9 and related BIOS falls into the former category, currently taking twice the configured timeout in order to reboot the machine. For iTCO version that fall under turn_SMI_watchdog_clear_off, this is also true and was currently only addressed for v1, irrespective of the turn_SMI_watchdog_clear_off value.
Signed-off-by: Jan Kiszka jan.kiszka@siemens.com Reviewed-by: Guenter Roeck linux@roeck-us.net Link: https://lore.kernel.org/r/0b8bb307-d08b-41b5-696c-305cdac6789c@siemens.com Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Wim Van Sebroeck wim@linux-watchdog.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/watchdog/iTCO_wdt.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/watchdog/iTCO_wdt.c b/drivers/watchdog/iTCO_wdt.c index bf31d7b67a69..3f1324871cfd 100644 --- a/drivers/watchdog/iTCO_wdt.c +++ b/drivers/watchdog/iTCO_wdt.c @@ -71,6 +71,8 @@ #define TCOBASE(p) ((p)->tco_res->start) /* SMI Control and Enable Register */ #define SMI_EN(p) ((p)->smi_res->start) +#define TCO_EN (1 << 13) +#define GBL_SMI_EN (1 << 0)
#define TCO_RLD(p) (TCOBASE(p) + 0x00) /* TCO Timer Reload/Curr. Value */ #define TCOv1_TMR(p) (TCOBASE(p) + 0x01) /* TCOv1 Timer Initial Value*/ @@ -355,8 +357,12 @@ static int iTCO_wdt_set_timeout(struct watchdog_device *wd_dev, unsigned int t)
tmrval = seconds_to_ticks(p, t);
- /* For TCO v1 the timer counts down twice before rebooting */ - if (p->iTCO_version == 1) + /* + * If TCO SMIs are off, the timer counts down twice before rebooting. + * Otherwise, the BIOS generally reboots when the SMI triggers. + */ + if (p->smi_res && + (SMI_EN(p) & (TCO_EN | GBL_SMI_EN)) != (TCO_EN | GBL_SMI_EN)) tmrval /= 2;
/* from the specs: */ @@ -521,7 +527,7 @@ static int iTCO_wdt_probe(struct platform_device *pdev) * Disables TCO logic generating an SMI# */ val32 = inl(SMI_EN(p)); - val32 &= 0xffffdfff; /* Turn off SMI clearing watchdog */ + val32 &= ~TCO_EN; /* Turn off SMI clearing watchdog */ outl(val32, SMI_EN(p)); }
From: Thomas Gleixner tglx@linutronix.de
[ Upstream commit aee8c67a4faa40a8df4e79316dbfc92d123989c1 ]
When *RSTOR from user memory raises an exception, there is no way to differentiate them. That's bad because it forces the slow path even when the failure was not a fault. If the operation raised eg. #GP then going through the slow path is pointless.
Use _ASM_EXTABLE_FAULT() which stores the trap number and let the exception fixup return the negated trap number as error.
This allows to separate the fast path and let it handle faults directly and avoid the slow path for all other exceptions.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Borislav Petkov bp@suse.de Link: https://lkml.kernel.org/r/20210623121457.601480369@linutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/fpu/internal.h | 19 ++++++++++++------- 1 file changed, 12 insertions(+), 7 deletions(-)
diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h index 16bf4d4a8159..4e5af2b00d89 100644 --- a/arch/x86/include/asm/fpu/internal.h +++ b/arch/x86/include/asm/fpu/internal.h @@ -103,6 +103,7 @@ static inline void fpstate_init_fxstate(struct fxregs_state *fx) } extern void fpstate_sanitize_xstate(struct fpu *fpu);
+/* Returns 0 or the negated trap number, which results in -EFAULT for #PF */ #define user_insn(insn, output, input...) \ ({ \ int err; \ @@ -110,14 +111,14 @@ extern void fpstate_sanitize_xstate(struct fpu *fpu); might_fault(); \ \ asm volatile(ASM_STAC "\n" \ - "1:" #insn "\n\t" \ + "1: " #insn "\n" \ "2: " ASM_CLAC "\n" \ ".section .fixup,"ax"\n" \ - "3: movl $-1,%[err]\n" \ + "3: negl %%eax\n" \ " jmp 2b\n" \ ".previous\n" \ - _ASM_EXTABLE(1b, 3b) \ - : [err] "=r" (err), output \ + _ASM_EXTABLE_FAULT(1b, 3b) \ + : [err] "=a" (err), output \ : "0"(0), input); \ err; \ }) @@ -219,16 +220,20 @@ static inline void fxsave(struct fxregs_state *fx) #define XRSTOR ".byte " REX_PREFIX "0x0f,0xae,0x2f" #define XRSTORS ".byte " REX_PREFIX "0x0f,0xc7,0x1f"
+/* + * After this @err contains 0 on success or the negated trap number when + * the operation raises an exception. For faults this results in -EFAULT. + */ #define XSTATE_OP(op, st, lmask, hmask, err) \ asm volatile("1:" op "\n\t" \ "xor %[err], %[err]\n" \ "2:\n\t" \ ".pushsection .fixup,"ax"\n\t" \ - "3: movl $-2,%[err]\n\t" \ + "3: negl %%eax\n\t" \ "jmp 2b\n\t" \ ".popsection\n\t" \ - _ASM_EXTABLE(1b, 3b) \ - : [err] "=r" (err) \ + _ASM_EXTABLE_FAULT(1b, 3b) \ + : [err] "=a" (err) \ : "D" (st), "m" (*st), "a" (lmask), "d" (hmask) \ : "memory")
From: Siddharth Gupta sidgup@codeaurora.org
[ Upstream commit 930eec0be20c93a53160c74005a1485a230e6911 ]
The rproc_char_device_remove() call currently unmaps the cdev region instead of simply deleting the cdev that was added as a part of the rproc_char_device_add() call. This change fixes that behaviour, and also fixes the order in which device_del() and cdev_del() need to be called.
Signed-off-by: Siddharth Gupta sidgup@codeaurora.org Link: https://lore.kernel.org/r/1623723671-5517-4-git-send-email-sidgup@codeaurora... Signed-off-by: Bjorn Andersson bjorn.andersson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/remoteproc/remoteproc_cdev.c | 2 +- drivers/remoteproc/remoteproc_core.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/remoteproc/remoteproc_cdev.c b/drivers/remoteproc/remoteproc_cdev.c index b19ea3057bde..ff92ed25d8b0 100644 --- a/drivers/remoteproc/remoteproc_cdev.c +++ b/drivers/remoteproc/remoteproc_cdev.c @@ -111,7 +111,7 @@ int rproc_char_device_add(struct rproc *rproc)
void rproc_char_device_remove(struct rproc *rproc) { - __unregister_chrdev(MAJOR(rproc->dev.devt), rproc->index, 1, "remoteproc"); + cdev_del(&rproc->cdev); }
void __init rproc_init_cdev(void) diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c index ab150765d124..d9d2a240dd58 100644 --- a/drivers/remoteproc/remoteproc_core.c +++ b/drivers/remoteproc/remoteproc_core.c @@ -2357,7 +2357,6 @@ int rproc_del(struct rproc *rproc) mutex_unlock(&rproc->lock);
rproc_delete_debug_dir(rproc); - rproc_char_device_remove(rproc);
/* the rproc is downref'ed as soon as it's removed from the klist */ mutex_lock(&rproc_list_mutex); @@ -2368,6 +2367,7 @@ int rproc_del(struct rproc *rproc) synchronize_rcu();
device_del(&rproc->dev); + rproc_char_device_remove(rproc);
return 0; }
From: Zou Wei zou_wei@huawei.com
[ Upstream commit 7bf475a4614a9722b9b989e53184a02596cf16d1 ]
Add missing MODULE_DEVICE_TABLE definition so we generate correct modalias for automatic loading of this driver when it is built as a module.
Link: https://lore.kernel.org/r/1620792422-16535-1-git-send-email-zou_wei@huawei.c... Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Zou Wei zou_wei@huawei.com Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Vidya Sagar vidyas@nvidia.com Acked-by: Thierry Reding treding@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/pci-tegra.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/pci/controller/pci-tegra.c b/drivers/pci/controller/pci-tegra.c index 8fcabed7c6a6..1a2af963599c 100644 --- a/drivers/pci/controller/pci-tegra.c +++ b/drivers/pci/controller/pci-tegra.c @@ -2506,6 +2506,7 @@ static const struct of_device_id tegra_pcie_of_match[] = { { .compatible = "nvidia,tegra20-pcie", .data = &tegra20_pcie }, { }, }; +MODULE_DEVICE_TABLE(of, tegra_pcie_of_match);
static void *tegra_pcie_ports_seq_start(struct seq_file *s, loff_t *pos) {
From: Mike Marshall hubcap@omnibond.com
[ Upstream commit 0fdec1b3c9fbb5e856a40db5993c9eaf91c74a83 ]
Orangefs df output is whacky. Walt Ligon suggested this might fix it. It seems way more in line with reality now...
Signed-off-by: Mike Marshall hubcap@omnibond.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/orangefs/super.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/orangefs/super.c b/fs/orangefs/super.c index ee5efdc35cc1..2f2e430461b2 100644 --- a/fs/orangefs/super.c +++ b/fs/orangefs/super.c @@ -209,7 +209,7 @@ static int orangefs_statfs(struct dentry *dentry, struct kstatfs *buf) buf->f_bavail = (sector_t) new_op->downcall.resp.statfs.blocks_avail; buf->f_files = (sector_t) new_op->downcall.resp.statfs.files_total; buf->f_ffree = (sector_t) new_op->downcall.resp.statfs.files_avail; - buf->f_frsize = sb->s_blocksize; + buf->f_frsize = 0;
out_op_release: op_release(new_op);
From: Jeff Layton jlayton@kernel.org
[ Upstream commit 22d41cdcd3cfd467a4af074165357fcbea1c37f5 ]
The checks for page->mapping are odd, as set_page_dirty is an address_space operation, and I don't see where it would be called on a non-pagecache page.
The warning about the page lock also seems bogus. The comment over set_page_dirty() says that it can be called without the page lock in some rare cases. I don't think we want to warn if that's the case.
Reported-by: Matthew Wilcox willy@infradead.org Signed-off-by: Jeff Layton jlayton@kernel.org Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ceph/addr.c | 10 +--------- 1 file changed, 1 insertion(+), 9 deletions(-)
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c index c000fe338f7e..c54317c10f58 100644 --- a/fs/ceph/addr.c +++ b/fs/ceph/addr.c @@ -78,10 +78,6 @@ static int ceph_set_page_dirty(struct page *page) struct inode *inode; struct ceph_inode_info *ci; struct ceph_snap_context *snapc; - int ret; - - if (unlikely(!mapping)) - return !TestSetPageDirty(page);
if (PageDirty(page)) { dout("%p set_page_dirty %p idx %lu -- already dirty\n", @@ -127,11 +123,7 @@ static int ceph_set_page_dirty(struct page *page) page->private = (unsigned long)snapc; SetPagePrivate(page);
- ret = __set_page_dirty_nobuffers(page); - WARN_ON(!PageLocked(page)); - WARN_ON(!page->mapping); - - return ret; + return __set_page_dirty_nobuffers(page); }
/*
From: Jing Xiangfeng jingxiangfeng@huawei.com
[ Upstream commit cd8f318fbd266b127ffc93cc4c1eaf9a5196fafb ]
psb_user_framebuffer_create() misses to call drm_gem_object_put() in an error path. Add the missed function call to fix it.
Signed-off-by: Jing Xiangfeng jingxiangfeng@huawei.com Signed-off-by: Daniel Vetter daniel.vetter@ffwll.ch Link: https://patchwork.freedesktop.org/patch/msgid/20210629115956.15160-1-jingxia... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/gma500/framebuffer.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/gma500/framebuffer.c b/drivers/gpu/drm/gma500/framebuffer.c index ebe9dccf2d83..0b8648396fb2 100644 --- a/drivers/gpu/drm/gma500/framebuffer.c +++ b/drivers/gpu/drm/gma500/framebuffer.c @@ -352,6 +352,7 @@ static struct drm_framebuffer *psb_user_framebuffer_create const struct drm_mode_fb_cmd2 *cmd) { struct drm_gem_object *obj; + struct drm_framebuffer *fb;
/* * Find the GEM object and thus the gtt range object that is @@ -362,7 +363,11 @@ static struct drm_framebuffer *psb_user_framebuffer_create return ERR_PTR(-ENOENT);
/* Let the core code do all the work */ - return psb_framebuffer_create(dev, cmd, obj); + fb = psb_framebuffer_create(dev, cmd, obj); + if (IS_ERR(fb)) + drm_gem_object_put(obj); + + return fb; }
static int psbfb_probe(struct drm_fb_helper *fb_helper,
From: Trond Myklebust trond.myklebust@hammerspace.com
[ Upstream commit e97bc66377bca097e1f3349ca18ca17f202ff659 ]
If a file has already been closed, then it should not be selected to support further I/O.
Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com [Trond: Fix an invalid pointer deref reported by Colin Ian King] Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/inode.c | 4 ++++ include/linux/nfs_fs.h | 1 + 2 files changed, 5 insertions(+)
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c index ae8bc84e39fb..6b1f2bf65bee 100644 --- a/fs/nfs/inode.c +++ b/fs/nfs/inode.c @@ -1079,6 +1079,7 @@ EXPORT_SYMBOL_GPL(nfs_inode_attach_open_context); void nfs_file_set_open_context(struct file *filp, struct nfs_open_context *ctx) { filp->private_data = get_nfs_open_context(ctx); + set_bit(NFS_CONTEXT_FILE_OPEN, &ctx->flags); if (list_empty(&ctx->list)) nfs_inode_attach_open_context(ctx); } @@ -1098,6 +1099,8 @@ struct nfs_open_context *nfs_find_open_context(struct inode *inode, const struct continue; if ((pos->mode & (FMODE_READ|FMODE_WRITE)) != mode) continue; + if (!test_bit(NFS_CONTEXT_FILE_OPEN, &pos->flags)) + continue; ctx = get_nfs_open_context(pos); if (ctx) break; @@ -1113,6 +1116,7 @@ void nfs_file_clear_open_context(struct file *filp) if (ctx) { struct inode *inode = d_inode(ctx->dentry);
+ clear_bit(NFS_CONTEXT_FILE_OPEN, &ctx->flags); /* * We fatal error on write before. Try to writeback * every page again. diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h index eadaabd18dc7..be9eadd23659 100644 --- a/include/linux/nfs_fs.h +++ b/include/linux/nfs_fs.h @@ -84,6 +84,7 @@ struct nfs_open_context { #define NFS_CONTEXT_RESEND_WRITES (1) #define NFS_CONTEXT_BAD (2) #define NFS_CONTEXT_UNLOCK (3) +#define NFS_CONTEXT_FILE_OPEN (4) int error;
struct list_head list;
From: Zou Wei zou_wei@huawei.com
[ Upstream commit 4465b3a621e761d82d1a92e3fda88c5d33c804b8 ]
This patch adds missing MODULE_DEVICE_TABLE definition which generates correct modalias for automatic loading of this driver when it is built as an external module.
Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Zou Wei zou_wei@huawei.com Signed-off-by: Sebastian Reichel sebastian.reichel@collabora.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/power/reset/regulator-poweroff.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/power/reset/regulator-poweroff.c b/drivers/power/reset/regulator-poweroff.c index f697088e0ad1..20701203935f 100644 --- a/drivers/power/reset/regulator-poweroff.c +++ b/drivers/power/reset/regulator-poweroff.c @@ -64,6 +64,7 @@ static const struct of_device_id of_regulator_poweroff_match[] = { { .compatible = "regulator-poweroff", }, {}, }; +MODULE_DEVICE_TABLE(of, of_regulator_poweroff_match);
static struct platform_driver regulator_poweroff_driver = { .probe = regulator_poweroff_probe,
From: Zou Wei zou_wei@huawei.com
[ Upstream commit 073b5d5b1f9cc94a3eea25279fbafee3f4f5f097 ]
This patch adds missing MODULE_DEVICE_TABLE definition which generates correct modalias for automatic loading of this driver when it is built as an external module.
Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Zou Wei zou_wei@huawei.com Signed-off-by: Sebastian Reichel sebastian.reichel@collabora.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/power/supply/charger-manager.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/power/supply/charger-manager.c b/drivers/power/supply/charger-manager.c index 4dea8ecd70bc..331803c221da 100644 --- a/drivers/power/supply/charger-manager.c +++ b/drivers/power/supply/charger-manager.c @@ -1279,6 +1279,7 @@ static const struct of_device_id charger_manager_match[] = { }, {}, }; +MODULE_DEVICE_TABLE(of, charger_manager_match);
static struct charger_desc *of_cm_parse_desc(struct device *dev) {
From: Zou Wei zou_wei@huawei.com
[ Upstream commit dfe52db13ab8d24857a9840ec7ca75eef800c26c ]
This patch adds missing MODULE_DEVICE_TABLE definition which generates correct modalias for automatic loading of this driver when it is built as an external module.
Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Zou Wei zou_wei@huawei.com Signed-off-by: Sebastian Reichel sebastian.reichel@collabora.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/power/supply/ab8500_btemp.c | 1 + drivers/power/supply/ab8500_charger.c | 1 + drivers/power/supply/ab8500_fg.c | 1 + 3 files changed, 3 insertions(+)
diff --git a/drivers/power/supply/ab8500_btemp.c b/drivers/power/supply/ab8500_btemp.c index d20345386b1e..69f67edf83c5 100644 --- a/drivers/power/supply/ab8500_btemp.c +++ b/drivers/power/supply/ab8500_btemp.c @@ -1135,6 +1135,7 @@ static const struct of_device_id ab8500_btemp_match[] = { { .compatible = "stericsson,ab8500-btemp", }, { }, }; +MODULE_DEVICE_TABLE(of, ab8500_btemp_match);
static struct platform_driver ab8500_btemp_driver = { .probe = ab8500_btemp_probe, diff --git a/drivers/power/supply/ab8500_charger.c b/drivers/power/supply/ab8500_charger.c index 893185448284..163d6098e8e0 100644 --- a/drivers/power/supply/ab8500_charger.c +++ b/drivers/power/supply/ab8500_charger.c @@ -3667,6 +3667,7 @@ static const struct of_device_id ab8500_charger_match[] = { { .compatible = "stericsson,ab8500-charger", }, { }, }; +MODULE_DEVICE_TABLE(of, ab8500_charger_match);
static struct platform_driver ab8500_charger_driver = { .probe = ab8500_charger_probe, diff --git a/drivers/power/supply/ab8500_fg.c b/drivers/power/supply/ab8500_fg.c index 06ff42c71f24..2fe41091fa03 100644 --- a/drivers/power/supply/ab8500_fg.c +++ b/drivers/power/supply/ab8500_fg.c @@ -3218,6 +3218,7 @@ static const struct of_device_id ab8500_fg_match[] = { { .compatible = "stericsson,ab8500-fg", }, { }, }; +MODULE_DEVICE_TABLE(of, ab8500_fg_match);
static struct platform_driver ab8500_fg_driver = { .probe = ab8500_fg_probe,
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 3a06b912a5ce494d7b7300b12719c562be7b566f ]
It turns out that the "T3 MRD" DMI_BOARD_NAME value is used in a lot of different Cherry Trail x5-z8300 / x5-z8350 based Mini-PC / HDMI-stick models from Ace PC / Meegopad / MinisForum / Wintel (and likely also other vendors).
Most of the other DMI strings on these boxes unfortunately contain various generic values like "Default string" or "$(DEFAULT_STRING)", so we cannot match on them. These devices do have their chassis-type correctly set to a value of "3" (desktop) which is a pleasant surprise, so also match on that.
This should avoid the quirk accidentally also getting applied to laptops / tablets (which do actually have a battery). Although in my quite large database of Bay and Cherry Trail based devices DMIdecode dumps I don't have any laptops / tables with a board-name of "T3 MRD", so this should not be an issue.
BugLink: https://askubuntu.com/questions/1206714/how-can-a-mini-pc-be-stopped-from-be... Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sebastian Reichel sebastian.reichel@collabora.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/power/supply/axp288_fuel_gauge.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/power/supply/axp288_fuel_gauge.c b/drivers/power/supply/axp288_fuel_gauge.c index 39e16ecb7638..37af0e216bc3 100644 --- a/drivers/power/supply/axp288_fuel_gauge.c +++ b/drivers/power/supply/axp288_fuel_gauge.c @@ -723,15 +723,6 @@ static const struct dmi_system_id axp288_fuel_gauge_blacklist[] = { DMI_MATCH(DMI_PRODUCT_NAME, "MEEGOPAD T02"), }, }, - { - /* Meegopad T08 */ - .matches = { - DMI_MATCH(DMI_SYS_VENDOR, "Default string"), - DMI_MATCH(DMI_BOARD_VENDOR, "To be filled by OEM."), - DMI_MATCH(DMI_BOARD_NAME, "T3 MRD"), - DMI_MATCH(DMI_BOARD_VERSION, "V1.1"), - }, - }, { /* Mele PCG03 Mini PC */ .matches = { DMI_EXACT_MATCH(DMI_BOARD_VENDOR, "Mini PC"), @@ -745,6 +736,15 @@ static const struct dmi_system_id axp288_fuel_gauge_blacklist[] = { DMI_MATCH(DMI_PRODUCT_NAME, "Z83-4"), } }, + { + /* Various Ace PC/Meegopad/MinisForum/Wintel Mini-PCs/HDMI-sticks */ + .matches = { + DMI_MATCH(DMI_BOARD_NAME, "T3 MRD"), + DMI_MATCH(DMI_CHASSIS_TYPE, "3"), + DMI_MATCH(DMI_BIOS_VENDOR, "American Megatrends Inc."), + DMI_MATCH(DMI_BIOS_VERSION, "5.11"), + }, + }, {} };
From: Evan Quan evan.quan@amd.com
[ Upstream commit 9c26ddb1c5b6e30c6bca48b8ad9205d96efe93d0 ]
Fix TCP hang when a lightweight invalidation happens on Navi1x.
Signed-off-by: Evan Quan evan.quan@amd.com Reviewed-by: Lijo Lazar lijo.lazar@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 95 ++++++++++++++++++++++++++ 1 file changed, 95 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c index 2342c5d216f9..5c40912b51d1 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c @@ -7744,6 +7744,97 @@ static void gfx_v10_0_update_fine_grain_clock_gating(struct amdgpu_device *adev, } }
+static void gfx_v10_0_apply_medium_grain_clock_gating_workaround(struct amdgpu_device *adev) +{ + uint32_t reg_data = 0; + uint32_t reg_idx = 0; + uint32_t i; + + const uint32_t tcp_ctrl_regs[] = { + mmCGTS_SA0_WGP00_CU0_TCP_CTRL_REG, + mmCGTS_SA0_WGP00_CU1_TCP_CTRL_REG, + mmCGTS_SA0_WGP01_CU0_TCP_CTRL_REG, + mmCGTS_SA0_WGP01_CU1_TCP_CTRL_REG, + mmCGTS_SA0_WGP02_CU0_TCP_CTRL_REG, + mmCGTS_SA0_WGP02_CU1_TCP_CTRL_REG, + mmCGTS_SA0_WGP10_CU0_TCP_CTRL_REG, + mmCGTS_SA0_WGP10_CU1_TCP_CTRL_REG, + mmCGTS_SA0_WGP11_CU0_TCP_CTRL_REG, + mmCGTS_SA0_WGP11_CU1_TCP_CTRL_REG, + mmCGTS_SA0_WGP12_CU0_TCP_CTRL_REG, + mmCGTS_SA0_WGP12_CU1_TCP_CTRL_REG, + mmCGTS_SA1_WGP00_CU0_TCP_CTRL_REG, + mmCGTS_SA1_WGP00_CU1_TCP_CTRL_REG, + mmCGTS_SA1_WGP01_CU0_TCP_CTRL_REG, + mmCGTS_SA1_WGP01_CU1_TCP_CTRL_REG, + mmCGTS_SA1_WGP02_CU0_TCP_CTRL_REG, + mmCGTS_SA1_WGP02_CU1_TCP_CTRL_REG, + mmCGTS_SA1_WGP10_CU0_TCP_CTRL_REG, + mmCGTS_SA1_WGP10_CU1_TCP_CTRL_REG, + mmCGTS_SA1_WGP11_CU0_TCP_CTRL_REG, + mmCGTS_SA1_WGP11_CU1_TCP_CTRL_REG, + mmCGTS_SA1_WGP12_CU0_TCP_CTRL_REG, + mmCGTS_SA1_WGP12_CU1_TCP_CTRL_REG + }; + + const uint32_t tcp_ctrl_regs_nv12[] = { + mmCGTS_SA0_WGP00_CU0_TCP_CTRL_REG, + mmCGTS_SA0_WGP00_CU1_TCP_CTRL_REG, + mmCGTS_SA0_WGP01_CU0_TCP_CTRL_REG, + mmCGTS_SA0_WGP01_CU1_TCP_CTRL_REG, + mmCGTS_SA0_WGP02_CU0_TCP_CTRL_REG, + mmCGTS_SA0_WGP02_CU1_TCP_CTRL_REG, + mmCGTS_SA0_WGP10_CU0_TCP_CTRL_REG, + mmCGTS_SA0_WGP10_CU1_TCP_CTRL_REG, + mmCGTS_SA0_WGP11_CU0_TCP_CTRL_REG, + mmCGTS_SA0_WGP11_CU1_TCP_CTRL_REG, + mmCGTS_SA1_WGP00_CU0_TCP_CTRL_REG, + mmCGTS_SA1_WGP00_CU1_TCP_CTRL_REG, + mmCGTS_SA1_WGP01_CU0_TCP_CTRL_REG, + mmCGTS_SA1_WGP01_CU1_TCP_CTRL_REG, + mmCGTS_SA1_WGP02_CU0_TCP_CTRL_REG, + mmCGTS_SA1_WGP02_CU1_TCP_CTRL_REG, + mmCGTS_SA1_WGP10_CU0_TCP_CTRL_REG, + mmCGTS_SA1_WGP10_CU1_TCP_CTRL_REG, + mmCGTS_SA1_WGP11_CU0_TCP_CTRL_REG, + mmCGTS_SA1_WGP11_CU1_TCP_CTRL_REG, + }; + + const uint32_t sm_ctlr_regs[] = { + mmCGTS_SA0_QUAD0_SM_CTRL_REG, + mmCGTS_SA0_QUAD1_SM_CTRL_REG, + mmCGTS_SA1_QUAD0_SM_CTRL_REG, + mmCGTS_SA1_QUAD1_SM_CTRL_REG + }; + + if (adev->asic_type == CHIP_NAVI12) { + for (i = 0; i < ARRAY_SIZE(tcp_ctrl_regs_nv12); i++) { + reg_idx = adev->reg_offset[GC_HWIP][0][mmCGTS_SA0_WGP00_CU0_TCP_CTRL_REG_BASE_IDX] + + tcp_ctrl_regs_nv12[i]; + reg_data = RREG32(reg_idx); + reg_data |= CGTS_SA0_WGP00_CU0_TCP_CTRL_REG__TCPI_LS_OVERRIDE_MASK; + WREG32(reg_idx, reg_data); + } + } else { + for (i = 0; i < ARRAY_SIZE(tcp_ctrl_regs); i++) { + reg_idx = adev->reg_offset[GC_HWIP][0][mmCGTS_SA0_WGP00_CU0_TCP_CTRL_REG_BASE_IDX] + + tcp_ctrl_regs[i]; + reg_data = RREG32(reg_idx); + reg_data |= CGTS_SA0_WGP00_CU0_TCP_CTRL_REG__TCPI_LS_OVERRIDE_MASK; + WREG32(reg_idx, reg_data); + } + } + + for (i = 0; i < ARRAY_SIZE(sm_ctlr_regs); i++) { + reg_idx = adev->reg_offset[GC_HWIP][0][mmCGTS_SA0_QUAD0_SM_CTRL_REG_BASE_IDX] + + sm_ctlr_regs[i]; + reg_data = RREG32(reg_idx); + reg_data &= ~CGTS_SA0_QUAD0_SM_CTRL_REG__SM_MODE_MASK; + reg_data |= 2 << CGTS_SA0_QUAD0_SM_CTRL_REG__SM_MODE__SHIFT; + WREG32(reg_idx, reg_data); + } +} + static int gfx_v10_0_update_gfx_clock_gating(struct amdgpu_device *adev, bool enable) { @@ -7760,6 +7851,10 @@ static int gfx_v10_0_update_gfx_clock_gating(struct amdgpu_device *adev, gfx_v10_0_update_3d_clock_gating(adev, enable); /* === CGCG + CGLS === */ gfx_v10_0_update_coarse_grain_clock_gating(adev, enable); + + if ((adev->asic_type >= CHIP_NAVI10) && + (adev->asic_type <= CHIP_NAVI12)) + gfx_v10_0_apply_medium_grain_clock_gating_workaround(adev); } else { /* CGCG/CGLS should be disabled before MGCG/MGLS * === CGCG + CGLS ===
From: Philip Yang Philip.Yang@amd.com
[ Upstream commit dcdb4d904b4bd3078fe8d4d24b1658560d6078ef ]
3 cases of kobj leak, which causes memory leak:
kobj_type must have release() method to free memory from release callback. Don't need NULL default_attrs to init kobj.
sysfs files created under kobj_status should be removed with kobj_status as parent kobject.
Remove queue sysfs files when releasing queue from process MMU notifier release callback.
Signed-off-by: Philip Yang Philip.Yang@amd.com Reviewed-by: Felix Kuehling Felix.Kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdkfd/kfd_process.c | 14 ++++++-------- .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 1 + 2 files changed, 7 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c index 65803e153a22..d243e60c6eef 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c @@ -452,13 +452,9 @@ static const struct sysfs_ops procfs_stats_ops = { .show = kfd_procfs_stats_show, };
-static struct attribute *procfs_stats_attrs[] = { - NULL -}; - static struct kobj_type procfs_stats_type = { .sysfs_ops = &procfs_stats_ops, - .default_attrs = procfs_stats_attrs, + .release = kfd_procfs_kobj_release, };
int kfd_procfs_add_queue(struct queue *q) @@ -973,9 +969,11 @@ static void kfd_process_wq_release(struct work_struct *work) list_for_each_entry(pdd, &p->per_device_data, per_device_list) { sysfs_remove_file(p->kobj, &pdd->attr_vram); sysfs_remove_file(p->kobj, &pdd->attr_sdma); - sysfs_remove_file(p->kobj, &pdd->attr_evict); - if (pdd->dev->kfd2kgd->get_cu_occupancy != NULL) - sysfs_remove_file(p->kobj, &pdd->attr_cu_occupancy); + + sysfs_remove_file(pdd->kobj_stats, &pdd->attr_evict); + if (pdd->dev->kfd2kgd->get_cu_occupancy) + sysfs_remove_file(pdd->kobj_stats, + &pdd->attr_cu_occupancy); kobject_del(pdd->kobj_stats); kobject_put(pdd->kobj_stats); pdd->kobj_stats = NULL; diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c index eb1635ac8988..43c07ac2c6fc 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c @@ -153,6 +153,7 @@ void pqm_uninit(struct process_queue_manager *pqm) if (pqn->q && pqn->q->gws) amdgpu_amdkfd_remove_gws_from_process(pqm->process->kgd_process_info, pqn->q->gws); + kfd_procfs_del_queue(pqn->q); uninit_queue(pqn->q); list_del(&pqn->process_queue_list); kfree(pqn);
From: Zou Wei zou_wei@huawei.com
[ Upstream commit fde25294dfd8e36e4e30b693c27a86232864002a ]
pm_runtime_get_sync will increment pm usage counter even it failed. Forgetting to putting operation will result in reference leak here. Fix it by replacing it with pm_runtime_resume_and_get to keep usage counter balanced.
Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Zou Wei zou_wei@huawei.com Signed-off-by: Thierry Reding thierry.reding@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pwm/pwm-img.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pwm/pwm-img.c b/drivers/pwm/pwm-img.c index 6faf5b5a5584..37200850cdc6 100644 --- a/drivers/pwm/pwm-img.c +++ b/drivers/pwm/pwm-img.c @@ -156,7 +156,7 @@ static int img_pwm_enable(struct pwm_chip *chip, struct pwm_device *pwm) struct img_pwm_chip *pwm_chip = to_img_pwm_chip(chip); int ret;
- ret = pm_runtime_get_sync(chip->dev); + ret = pm_runtime_resume_and_get(chip->dev); if (ret < 0) return ret;
From: Uwe Kleine-König u.kleine-koenig@pengutronix.de
[ Upstream commit 86f7fa71cd830d18d7ebcaf719dffd5ddfe1acdd ]
A consumer is expected to disable a PWM before calling pwm_put(). And if they didn't there is hopefully a good reason (or the consumer needs fixing). Also if disabling an enabled PWM was the right thing to do, this should better be done in the framework instead of in each low level driver.
So drop the hardware modification from the .remove() callback.
Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Thierry Reding thierry.reding@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pwm/pwm-tegra.c | 13 ------------- 1 file changed, 13 deletions(-)
diff --git a/drivers/pwm/pwm-tegra.c b/drivers/pwm/pwm-tegra.c index 55bc63d5a0ae..6d8e324864fa 100644 --- a/drivers/pwm/pwm-tegra.c +++ b/drivers/pwm/pwm-tegra.c @@ -301,7 +301,6 @@ static int tegra_pwm_probe(struct platform_device *pdev) static int tegra_pwm_remove(struct platform_device *pdev) { struct tegra_pwm_chip *pc = platform_get_drvdata(pdev); - unsigned int i; int err;
if (WARN_ON(!pc)) @@ -311,18 +310,6 @@ static int tegra_pwm_remove(struct platform_device *pdev) if (err < 0) return err;
- for (i = 0; i < pc->chip.npwm; i++) { - struct pwm_device *pwm = &pc->chip.pwms[i]; - - if (!pwm_is_enabled(pwm)) - if (clk_prepare_enable(pc->clk) < 0) - continue; - - pwm_writel(pc, i, 0); - - clk_disable_unprepare(pc->clk); - } - reset_control_assert(pc->rst); clk_disable_unprepare(pc->clk);
From: Liguang Zhang zhangliguang@linux.alibaba.com
[ Upstream commit 7718629432676b5ebd9a32940782fe297a0abf8d ]
In function amba_handler_attach(), dev->res.name is initialized by amba_device_alloc. But when address_found is false, dev->res.name is assigned to null value, which leads to wrong resource name display in /proc/iomem, "<BAD>" is seen for those resources.
Signed-off-by: Liguang Zhang zhangliguang@linux.alibaba.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/acpi_amba.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/acpi/acpi_amba.c b/drivers/acpi/acpi_amba.c index 49b781a9cd97..ab8a4e0191b1 100644 --- a/drivers/acpi/acpi_amba.c +++ b/drivers/acpi/acpi_amba.c @@ -76,6 +76,7 @@ static int amba_handler_attach(struct acpi_device *adev, case IORESOURCE_MEM: if (!address_found) { dev->res = *rentry->res; + dev->res.name = dev_name(&dev->dev); address_found = true; } break;
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 9249c32ec9197e8d34fe5179c9e31668a205db04 ]
The Dell Vostro 3350 ACPI video-bus device reports spurious ACPI_VIDEO_NOTIFY_CYCLE events resulting in spurious KEY_SWITCHVIDEOMODE events being reported to userspace (and causing trouble there).
Add a quirk setting the report_key_events mask to REPORT_BRIGHTNESS_KEY_EVENTS so that the ACPI_VIDEO_NOTIFY_CYCLE events will be ignored, while still reporting brightness up/down hotkey-presses to userspace normally.
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1911763 Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/acpi_video.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/drivers/acpi/acpi_video.c b/drivers/acpi/acpi_video.c index 2ea1781290cc..1b68d116d808 100644 --- a/drivers/acpi/acpi_video.c +++ b/drivers/acpi/acpi_video.c @@ -540,6 +540,15 @@ static const struct dmi_system_id video_dmi_table[] = { DMI_MATCH(DMI_PRODUCT_NAME, "Vostro V131"), }, }, + { + .callback = video_set_report_key_events, + .driver_data = (void *)((uintptr_t)REPORT_BRIGHTNESS_KEY_EVENTS), + .ident = "Dell Vostro 3350", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), + DMI_MATCH(DMI_PRODUCT_NAME, "Vostro 3350"), + }, + }, /* * Some machines change the brightness themselves when a brightness * hotkey gets pressed, despite us telling them not to. In this case
From: Javier Martinez Canillas javierm@redhat.com
[ Upstream commit 3cf5f7ab230e2b886e493c7a8449ed50e29d2b98 ]
An IRQ handler may be called at any time after it is registered, so anything it relies on must be ready before registration.
rockchip_pcie_subsys_irq_handler() and rockchip_pcie_client_irq_handler() read registers in the PCIe controller, but we registered them before turning on clocks to the controller. If either is called before the clocks are turned on, the register reads fail and the machine hangs.
Similarly, rockchip_pcie_legacy_int_handler() uses rockchip->irq_domain, but we installed it before initializing irq_domain.
Register IRQ handlers after their data structures are initialized and clocks are enabled.
Found by enabling CONFIG_DEBUG_SHIRQ, which calls the IRQ handler when it is being unregistered. An error during the probe path might cause this unregistration and IRQ handler execution before the device or data structure init has finished.
[bhelgaas: commit log] Link: https://lore.kernel.org/r/20210608080409.1729276-1-javierm@redhat.com Reported-by: Peter Robinson pbrobinson@gmail.com Tested-by: Peter Robinson pbrobinson@gmail.com Signed-off-by: Javier Martinez Canillas javierm@redhat.com Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Acked-by: Shawn Lin shawn.lin@rock-chips.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/pcie-rockchip-host.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/pci/controller/pcie-rockchip-host.c b/drivers/pci/controller/pcie-rockchip-host.c index f1d08a1b1591..78d04ac29cd5 100644 --- a/drivers/pci/controller/pcie-rockchip-host.c +++ b/drivers/pci/controller/pcie-rockchip-host.c @@ -592,10 +592,6 @@ static int rockchip_pcie_parse_host_dt(struct rockchip_pcie *rockchip) if (err) return err;
- err = rockchip_pcie_setup_irq(rockchip); - if (err) - return err; - rockchip->vpcie12v = devm_regulator_get_optional(dev, "vpcie12v"); if (IS_ERR(rockchip->vpcie12v)) { if (PTR_ERR(rockchip->vpcie12v) != -ENODEV) @@ -973,8 +969,6 @@ static int rockchip_pcie_probe(struct platform_device *pdev) if (err) goto err_vpcie;
- rockchip_pcie_enable_interrupts(rockchip); - err = rockchip_pcie_init_irq_domain(rockchip); if (err < 0) goto err_deinit_port; @@ -992,6 +986,12 @@ static int rockchip_pcie_probe(struct platform_device *pdev) bridge->sysdata = rockchip; bridge->ops = &rockchip_pcie_ops;
+ err = rockchip_pcie_setup_irq(rockchip); + if (err) + goto err_remove_irq_domain; + + rockchip_pcie_enable_interrupts(rockchip); + err = pci_host_probe(bridge); if (err < 0) goto err_remove_irq_domain;
From: Ye Bin yebin10@huawei.com
[ Upstream commit 558d6450c7755aa005d89021204b6cdcae5e848f ]
If a writeback of the superblock fails with an I/O error, the buffer is marked not uptodate. However, this can cause a WARN_ON to trigger when we attempt to write superblock a second time. (Which might succeed this time, for cerrtain types of block devices such as iSCSI devices over a flaky network.)
Try to detect this case in flush_stashed_error_work(), and also change __ext4_handle_dirty_metadata() so we always set the uptodate flag, not just in the nojournal case.
Before this commit, this problem can be repliciated via:
1. dmsetup create dust1 --table '0 2097152 dust /dev/sdc 0 4096' 2. mount /dev/mapper/dust1 /home/test 3. dmsetup message dust1 0 addbadblock 0 10 4. cd /home/test 5. echo "XXXXXXX" > t
After a few seconds, we got following warning:
[ 80.654487] end_buffer_async_write: bh=0xffff88842f18bdd0 [ 80.656134] Buffer I/O error on dev dm-0, logical block 0, lost async page write [ 85.774450] EXT4-fs error (device dm-0): ext4_check_bdev_write_error:193: comm kworker/u16:8: Error while async write back metadata [ 91.415513] mark_buffer_dirty: bh=0xffff88842f18bdd0 [ 91.417038] ------------[ cut here ]------------ [ 91.418450] WARNING: CPU: 1 PID: 1944 at fs/buffer.c:1092 mark_buffer_dirty.cold+0x1c/0x5e [ 91.440322] Call Trace: [ 91.440652] __jbd2_journal_temp_unlink_buffer+0x135/0x220 [ 91.441354] __jbd2_journal_unfile_buffer+0x24/0x90 [ 91.441981] __jbd2_journal_refile_buffer+0x134/0x1d0 [ 91.442628] jbd2_journal_commit_transaction+0x249a/0x3240 [ 91.443336] ? put_prev_entity+0x2a/0x200 [ 91.443856] ? kjournald2+0x12e/0x510 [ 91.444324] kjournald2+0x12e/0x510 [ 91.444773] ? woken_wake_function+0x30/0x30 [ 91.445326] kthread+0x150/0x1b0 [ 91.445739] ? commit_timeout+0x20/0x20 [ 91.446258] ? kthread_flush_worker+0xb0/0xb0 [ 91.446818] ret_from_fork+0x1f/0x30 [ 91.447293] ---[ end trace 66f0b6bf3d1abade ]---
Signed-off-by: Ye Bin yebin10@huawei.com Link: https://lore.kernel.org/r/20210615090537.3423231-1-yebin10@huawei.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext4/ext4_jbd2.c | 2 +- fs/ext4/super.c | 12 ++++++++++-- 2 files changed, 11 insertions(+), 3 deletions(-)
diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c index be799040a415..b96ecba91899 100644 --- a/fs/ext4/ext4_jbd2.c +++ b/fs/ext4/ext4_jbd2.c @@ -327,6 +327,7 @@ int __ext4_handle_dirty_metadata(const char *where, unsigned int line,
set_buffer_meta(bh); set_buffer_prio(bh); + set_buffer_uptodate(bh); if (ext4_handle_valid(handle)) { err = jbd2_journal_dirty_metadata(handle, bh); /* Errors can only happen due to aborted journal or a nasty bug */ @@ -355,7 +356,6 @@ int __ext4_handle_dirty_metadata(const char *where, unsigned int line, err); } } else { - set_buffer_uptodate(bh); if (inode) mark_buffer_dirty_inode(bh, inode); else diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 0e3a847b5d27..67fb3cb34c6f 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -705,15 +705,23 @@ static void flush_stashed_error_work(struct work_struct *work) * ext4 error handling code during handling of previous errors. */ if (!sb_rdonly(sbi->s_sb) && journal) { + struct buffer_head *sbh = sbi->s_sbh; handle = jbd2_journal_start(journal, 1); if (IS_ERR(handle)) goto write_directly; - if (jbd2_journal_get_write_access(handle, sbi->s_sbh)) { + if (jbd2_journal_get_write_access(handle, sbh)) { jbd2_journal_stop(handle); goto write_directly; } ext4_update_super(sbi->s_sb); - if (jbd2_journal_dirty_metadata(handle, sbi->s_sbh)) { + if (buffer_write_io_error(sbh) || !buffer_uptodate(sbh)) { + ext4_msg(sbi->s_sb, KERN_ERR, "previous I/O error to " + "superblock detected"); + clear_buffer_write_io_error(sbh); + set_buffer_uptodate(sbh); + } + + if (jbd2_journal_dirty_metadata(handle, sbh)) { jbd2_journal_stop(handle); goto write_directly; }
From: Xie Yongji xieyongji@bytedance.com
[ Upstream commit b71ba22e7c6c6b279c66f53ee7818709774efa1f ]
The vblk->vqs should be freed before we call init_vqs() in virtblk_restore().
Signed-off-by: Xie Yongji xieyongji@bytedance.com Link: https://lore.kernel.org/r/20210517084332.280-1-xieyongji@bytedance.com Acked-by: Jason Wang jasowang@redhat.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/virtio_blk.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c index b9fa3ef5b57c..425bae618131 100644 --- a/drivers/block/virtio_blk.c +++ b/drivers/block/virtio_blk.c @@ -948,6 +948,8 @@ static int virtblk_freeze(struct virtio_device *vdev) blk_mq_quiesce_queue(vblk->disk->queue);
vdev->config->del_vqs(vdev); + kfree(vblk->vqs); + return 0; }
From: Xie Yongji xieyongji@bytedance.com
[ Upstream commit 3f2869cace829fb4b80fc53b3ddaa7f4ba9acbf1 ]
Do some cleanups in virtnet_restore() when virtnet_cpu_notif_add() failed.
Signed-off-by: Xie Yongji xieyongji@bytedance.com Link: https://lore.kernel.org/r/20210517084516.332-1-xieyongji@bytedance.com Acked-by: Jason Wang jasowang@redhat.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/virtio_net.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 0824e6999e49..9007d73863d0 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -3223,8 +3223,11 @@ static __maybe_unused int virtnet_restore(struct virtio_device *vdev) virtnet_set_queues(vi, vi->curr_queue_pairs);
err = virtnet_cpu_notif_add(vi); - if (err) + if (err) { + virtnet_freeze_down(vdev); + remove_vq_common(vi); return err; + }
return 0; }
From: Xie Yongji xieyongji@bytedance.com
[ Upstream commit d00d8da5869a2608e97cfede094dfc5e11462a46 ]
The buf->len might come from an untrusted device. This ensures the value would not exceed the size of the buffer to avoid data corruption or loss.
Signed-off-by: Xie Yongji xieyongji@bytedance.com Acked-by: Jason Wang jasowang@redhat.com Link: https://lore.kernel.org/r/20210525125622.1203-1-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/char/virtio_console.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c index 1836cc56e357..673522874cec 100644 --- a/drivers/char/virtio_console.c +++ b/drivers/char/virtio_console.c @@ -475,7 +475,7 @@ static struct port_buffer *get_inbuf(struct port *port)
buf = virtqueue_get_buf(port->in_vq, &len); if (buf) { - buf->len = len; + buf->len = min_t(size_t, len, buf->size); buf->offset = 0; port->stats.bytes_received += len; } @@ -1712,7 +1712,7 @@ static void control_work_handler(struct work_struct *work) while ((buf = virtqueue_get_buf(vq, &len))) { spin_unlock(&portdev->c_ivq_lock);
- buf->len = len; + buf->len = min_t(size_t, len, buf->size); buf->offset = 0;
handle_control_message(vq->vdev, portdev, buf);
From: "Michael S. Tsirkin" mst@redhat.com
[ Upstream commit 8d622d21d24803408b256d96463eac4574dcf067 ]
virtio_disable_cb is currently a nop for split ring with event index. This is because it used to be always called from a callback when we know device won't trigger more events until we update the index. However, now that we run with interrupts enabled a lot we also poll without a callback so that is different: disabling callbacks will help reduce the number of spurious interrupts. Further, if using event index with a packed ring, and if being called from a callback, we actually do disable interrupts which is unnecessary.
Fix both issues by tracking whenever we get a callback. If that is the case disabling interrupts with event index can be a nop. If not the case disable interrupts. Note: with a split ring there's no explicit "no interrupts" value. For now we write a fixed value so our chance of triggering an interupt is 1/ring size. It's probably better to write something related to the last used index there to reduce the chance even further. For now I'm keeping it simple.
Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/virtio/virtio_ring.c | 26 +++++++++++++++++++++++++- 1 file changed, 25 insertions(+), 1 deletion(-)
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c index 71e16b53e9c1..88f0b16b11b8 100644 --- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -113,6 +113,9 @@ struct vring_virtqueue { /* Last used index we've seen. */ u16 last_used_idx;
+ /* Hint for event idx: already triggered no need to disable. */ + bool event_triggered; + union { /* Available for split ring */ struct { @@ -739,7 +742,10 @@ static void virtqueue_disable_cb_split(struct virtqueue *_vq)
if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) { vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT; - if (!vq->event) + if (vq->event) + /* TODO: this is a hack. Figure out a cleaner value to write. */ + vring_used_event(&vq->split.vring) = 0x0; + else vq->split.vring.avail->flags = cpu_to_virtio16(_vq->vdev, vq->split.avail_flags_shadow); @@ -1605,6 +1611,7 @@ static struct virtqueue *vring_create_virtqueue_packed( vq->weak_barriers = weak_barriers; vq->broken = false; vq->last_used_idx = 0; + vq->event_triggered = false; vq->num_added = 0; vq->packed_ring = true; vq->use_dma_api = vring_use_dma_api(vdev); @@ -1919,6 +1926,12 @@ void virtqueue_disable_cb(struct virtqueue *_vq) { struct vring_virtqueue *vq = to_vvq(_vq);
+ /* If device triggered an event already it won't trigger one again: + * no need to disable. + */ + if (vq->event_triggered) + return; + if (vq->packed_ring) virtqueue_disable_cb_packed(_vq); else @@ -1942,6 +1955,9 @@ unsigned virtqueue_enable_cb_prepare(struct virtqueue *_vq) { struct vring_virtqueue *vq = to_vvq(_vq);
+ if (vq->event_triggered) + vq->event_triggered = false; + return vq->packed_ring ? virtqueue_enable_cb_prepare_packed(_vq) : virtqueue_enable_cb_prepare_split(_vq); } @@ -2005,6 +2021,9 @@ bool virtqueue_enable_cb_delayed(struct virtqueue *_vq) { struct vring_virtqueue *vq = to_vvq(_vq);
+ if (vq->event_triggered) + vq->event_triggered = false; + return vq->packed_ring ? virtqueue_enable_cb_delayed_packed(_vq) : virtqueue_enable_cb_delayed_split(_vq); } @@ -2044,6 +2063,10 @@ irqreturn_t vring_interrupt(int irq, void *_vq) if (unlikely(vq->broken)) return IRQ_HANDLED;
+ /* Just a hint for performance: so it's ok that this can be racy! */ + if (vq->event) + vq->event_triggered = true; + pr_debug("virtqueue callback for %p (%p)\n", vq, vq->vq.callback); if (vq->vq.callback) vq->vq.callback(&vq->vq); @@ -2083,6 +2106,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index, vq->weak_barriers = weak_barriers; vq->broken = false; vq->last_used_idx = 0; + vq->event_triggered = false; vq->num_added = 0; vq->use_dma_api = vring_use_dma_api(vdev); #ifdef DEBUG
On Sat, Jul 10, 2021 at 07:49:14PM -0400, Sasha Levin wrote:
From: "Michael S. Tsirkin" mst@redhat.com
[ Upstream commit 8d622d21d24803408b256d96463eac4574dcf067 ]
virtio_disable_cb is currently a nop for split ring with event index. This is because it used to be always called from a callback when we know device won't trigger more events until we update the index. However, now that we run with interrupts enabled a lot we also poll without a callback so that is different: disabling callbacks will help reduce the number of spurious interrupts. Further, if using event index with a packed ring, and if being called from a callback, we actually do disable interrupts which is unnecessary.
Fix both issues by tracking whenever we get a callback. If that is the case disabling interrupts with event index can be a nop. If not the case disable interrupts. Note: with a split ring there's no explicit "no interrupts" value. For now we write a fixed value so our chance of triggering an interupt is 1/ring size. It's probably better to write something related to the last used index there to reduce the chance even further. For now I'm keeping it simple.
Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org
I am not sure we want this in stable yet. It should in theory fix the reported interrupt storms but the reporter is on vacation.
drivers/virtio/virtio_ring.c | 26 +++++++++++++++++++++++++- 1 file changed, 25 insertions(+), 1 deletion(-)
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c index 71e16b53e9c1..88f0b16b11b8 100644 --- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -113,6 +113,9 @@ struct vring_virtqueue { /* Last used index we've seen. */ u16 last_used_idx;
- /* Hint for event idx: already triggered no need to disable. */
- bool event_triggered;
- union { /* Available for split ring */ struct {
@@ -739,7 +742,10 @@ static void virtqueue_disable_cb_split(struct virtqueue *_vq) if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) { vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
if (!vq->event)
if (vq->event)
/* TODO: this is a hack. Figure out a cleaner value to write. */
vring_used_event(&vq->split.vring) = 0x0;
else vq->split.vring.avail->flags = cpu_to_virtio16(_vq->vdev, vq->split.avail_flags_shadow);
@@ -1605,6 +1611,7 @@ static struct virtqueue *vring_create_virtqueue_packed( vq->weak_barriers = weak_barriers; vq->broken = false; vq->last_used_idx = 0;
- vq->event_triggered = false; vq->num_added = 0; vq->packed_ring = true; vq->use_dma_api = vring_use_dma_api(vdev);
@@ -1919,6 +1926,12 @@ void virtqueue_disable_cb(struct virtqueue *_vq) { struct vring_virtqueue *vq = to_vvq(_vq);
- /* If device triggered an event already it won't trigger one again:
* no need to disable.
*/
- if (vq->event_triggered)
return;
- if (vq->packed_ring) virtqueue_disable_cb_packed(_vq); else
@@ -1942,6 +1955,9 @@ unsigned virtqueue_enable_cb_prepare(struct virtqueue *_vq) { struct vring_virtqueue *vq = to_vvq(_vq);
- if (vq->event_triggered)
vq->event_triggered = false;
- return vq->packed_ring ? virtqueue_enable_cb_prepare_packed(_vq) : virtqueue_enable_cb_prepare_split(_vq);
} @@ -2005,6 +2021,9 @@ bool virtqueue_enable_cb_delayed(struct virtqueue *_vq) { struct vring_virtqueue *vq = to_vvq(_vq);
- if (vq->event_triggered)
vq->event_triggered = false;
- return vq->packed_ring ? virtqueue_enable_cb_delayed_packed(_vq) : virtqueue_enable_cb_delayed_split(_vq);
} @@ -2044,6 +2063,10 @@ irqreturn_t vring_interrupt(int irq, void *_vq) if (unlikely(vq->broken)) return IRQ_HANDLED;
- /* Just a hint for performance: so it's ok that this can be racy! */
- if (vq->event)
vq->event_triggered = true;
- pr_debug("virtqueue callback for %p (%p)\n", vq, vq->vq.callback); if (vq->vq.callback) vq->vq.callback(&vq->vq);
@@ -2083,6 +2106,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index, vq->weak_barriers = weak_barriers; vq->broken = false; vq->last_used_idx = 0;
- vq->event_triggered = false; vq->num_added = 0; vq->use_dma_api = vring_use_dma_api(vdev);
#ifdef DEBUG
2.30.2
On Sun, Jul 11, 2021 at 12:23:59AM -0400, Michael S. Tsirkin wrote:
On Sat, Jul 10, 2021 at 07:49:14PM -0400, Sasha Levin wrote:
From: "Michael S. Tsirkin" mst@redhat.com
[ Upstream commit 8d622d21d24803408b256d96463eac4574dcf067 ]
virtio_disable_cb is currently a nop for split ring with event index. This is because it used to be always called from a callback when we know device won't trigger more events until we update the index. However, now that we run with interrupts enabled a lot we also poll without a callback so that is different: disabling callbacks will help reduce the number of spurious interrupts. Further, if using event index with a packed ring, and if being called from a callback, we actually do disable interrupts which is unnecessary.
Fix both issues by tracking whenever we get a callback. If that is the case disabling interrupts with event index can be a nop. If not the case disable interrupts. Note: with a split ring there's no explicit "no interrupts" value. For now we write a fixed value so our chance of triggering an interupt is 1/ring size. It's probably better to write something related to the last used index there to reduce the chance even further. For now I'm keeping it simple.
Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org
I am not sure we want this in stable yet. It should in theory fix the reported interrupt storms but the reporter is on vacation.
Sure, I'll drop it for now. Let me know when you want it re-added. Thanks!
From: Chunguang Xu brookxu@tencent.com
[ Upstream commit d80c228d44640f0b47b57a2ca4afa26ef87e16b0 ]
On the IO submission path, blk_account_io_start() may interrupt the system interruption. When the interruption returns, the value of part->stamp may have been updated by other cores, so the time value collected before the interruption may be less than part-> stamp. So when this happens, we should do nothing to make io_ticks more accurate? For kernels less than 5.0, this may cause io_ticks to become smaller, which in turn may cause abnormal ioutil values.
Signed-off-by: Chunguang Xu brookxu@tencent.com Reviewed-by: Christoph Hellwig hch@lst.de Link: https://lore.kernel.org/r/1625521646-1069-1-git-send-email-brookxu.cn@gmail.... Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/blk-core.c b/block/blk-core.c index fc60ff208497..e34dfa13b7bc 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1255,7 +1255,7 @@ static void update_io_ticks(struct block_device *part, unsigned long now, unsigned long stamp; again: stamp = READ_ONCE(part->bd_stamp); - if (unlikely(stamp != now)) { + if (unlikely(time_after(now, stamp))) { if (likely(cmpxchg(&part->bd_stamp, stamp, now) == stamp)) __part_stat_add(part, io_ticks, end ? now - stamp : 1); }
linux-stable-mirror@lists.linaro.org