From: Geert Uytterhoeven geert+renesas@glider.be
[ Upstream commit 7ebaa41047738d46fca6376b3f1765ef69c463c5 ]
When disabling pin bias, there is no need to touch the LSI pin pull-up/down control register (PUDn), which selects between pull-up and pull-down. Just disabling the pull-up/down function through the LSI pin pull-enable register (PUENn) is sufficient.
Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Reviewed-by: Niklas Söderlund niklas.soderlund+renesas@ragnatech.se Link: https://lore.kernel.org/r/071ec644de2555da593a4531ef5d3e4d79cf997d.162506407... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pinctrl/renesas/pinctrl.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/pinctrl/renesas/pinctrl.c b/drivers/pinctrl/renesas/pinctrl.c index bb488af29862..85cb78cfcfa6 100644 --- a/drivers/pinctrl/renesas/pinctrl.c +++ b/drivers/pinctrl/renesas/pinctrl.c @@ -898,17 +898,17 @@ void rcar_pinmux_set_bias(struct sh_pfc *pfc, unsigned int pin,
if (reg->puen) { enable = sh_pfc_read(pfc, reg->puen) & ~BIT(bit); - if (bias != PIN_CONFIG_BIAS_DISABLE) + if (bias != PIN_CONFIG_BIAS_DISABLE) { enable |= BIT(bit);
- if (reg->pud) { - updown = sh_pfc_read(pfc, reg->pud) & ~BIT(bit); - if (bias == PIN_CONFIG_BIAS_PULL_UP) - updown |= BIT(bit); + if (reg->pud) { + updown = sh_pfc_read(pfc, reg->pud) & ~BIT(bit); + if (bias == PIN_CONFIG_BIAS_PULL_UP) + updown |= BIT(bit);
- sh_pfc_write(pfc, reg->pud, updown); + sh_pfc_write(pfc, reg->pud, updown); + } } - sh_pfc_write(pfc, reg->puen, enable); } else { enable = sh_pfc_read(pfc, reg->pud) & ~BIT(bit);
From: Dominique Martinet dominique.martinet@atmark-techno.com
[ Upstream commit 868c9ddc182bc6728bb380cbfb3170734f72c599 ]
This is a follow-up on 5f89468e2f06 ("swiotlb: manipulate orig_addr when tlb_addr has offset") which fixed unaligned dma mappings, making sure the following overflows are caught:
- offset of the start of the slot within the device bigger than requested address' offset, in other words if the base address given in swiotlb_tbl_map_single to create the mapping (orig_addr) was after the requested address for the sync (tlb_offset) in the same block:
|------------------------------------------| block <----------------------------> mapped part of the block ^ orig_addr ^ invalid tlb_addr for sync
- if the resulting offset was bigger than the allocation size this one could happen if the mapping was not until the end. e.g.
|------------------------------------------| block <---------------------> mapped part of the block ^ ^ orig_addr invalid tlb_addr
Both should never happen so print a warning and bail out without trying to adjust the sizes/offsets: the first one could try to sync from orig_addr to whatever is left of the requested size, but the later really has nothing to sync there...
Signed-off-by: Dominique Martinet dominique.martinet@atmark-techno.com Cc: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Reviewed-by: Bumyong Lee <bumyong.lee@samsung.com Cc: Chanho Park chanho61.park@samsung.com Cc: Christoph Hellwig hch@lst.de Signed-off-by: Konrad Rzeszutek Wilk konrad@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/dma/swiotlb.c | 20 +++++++++++++++++--- 1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index e50df8d8f87e..23f8d0b168c5 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -354,13 +354,27 @@ static void swiotlb_bounce(struct device *dev, phys_addr_t tlb_addr, size_t size size_t alloc_size = mem->slots[index].alloc_size; unsigned long pfn = PFN_DOWN(orig_addr); unsigned char *vaddr = phys_to_virt(tlb_addr); - unsigned int tlb_offset; + unsigned int tlb_offset, orig_addr_offset;
if (orig_addr == INVALID_PHYS_ADDR) return;
- tlb_offset = (tlb_addr & (IO_TLB_SIZE - 1)) - - swiotlb_align_offset(dev, orig_addr); + tlb_offset = tlb_addr & (IO_TLB_SIZE - 1); + orig_addr_offset = swiotlb_align_offset(dev, orig_addr); + if (tlb_offset < orig_addr_offset) { + dev_WARN_ONCE(dev, 1, + "Access before mapping start detected. orig offset %u, requested offset %u.\n", + orig_addr_offset, tlb_offset); + return; + } + + tlb_offset -= orig_addr_offset; + if (tlb_offset > alloc_size) { + dev_WARN_ONCE(dev, 1, + "Buffer overflow detected. Allocation size: %zu. Mapping size: %zu+%u.\n", + alloc_size, size, tlb_offset); + return; + }
orig_addr += tlb_offset; alloc_size -= tlb_offset;
From: Douglas Anderson dianders@chromium.org
[ Upstream commit 18eeef46d3593e3bc3d6a8f7a6f94ace356578ff ]
The regulator for the touchscreen could be: * A dedicated regulator just for the touchscreen. * A regulator shared with something else in the system. * An always-on regulator.
How we want the "reset" line to behave depends a bit on which of those three cases we're in. Currently the code is written with the assumption that it has a dedicated regulator, but that's not really guaranteed to be the case.
The problem we run into is that if we leave the touchscreen powered on (because someone else is requesting the regulator or it's an always-on regulator) and we assert reset then we apparently burn an extra 67 mW of power. That's not great.
Let's instead tie the control of the reset line to the true state of the regulator as reported by regulator notifiers. If we have an always-on regulator our notifier will never be called. If we have a shared regulator then our notifier will be called when the touchscreen is truly turned on or truly turned off.
Using notifiers like this nicely handles all the cases without resorting to hacks like pretending that there is no "reset" GPIO if we have an always-on regulator.
NOTE: if the regulator is on a shared line it's still possible that things could be a little off. Specifically, this case is not handled even after this patch: 1. Suspend goodix (send "sleep", goodix stops requesting regulator on) 2. Other regulator user turns off (regulator fully turns off). 3. Goodix driver gets notified and asserts reset. 4. Other regulator user turns on. 5. Goodix driver gets notified and deasserts reset. 6. Nobody resumes goodix.
With that set of steps we'll have reset deasserted but we will have lost the results of the I2C_HID_PWR_SLEEP from the suspend path. That means we might be in higher power than we could be even if the goodix driver thinks things are suspended. Presumably, however, we're still in better shape than if we were asserting "reset" the whole time. If somehow the above situation is actually affecting someone and we want to do better we can deal with it when we have a real use case.
Signed-off-by: Douglas Anderson dianders@chromium.org Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/i2c-hid/i2c-hid-of-goodix.c | 92 +++++++++++++++++++++---- 1 file changed, 79 insertions(+), 13 deletions(-)
diff --git a/drivers/hid/i2c-hid/i2c-hid-of-goodix.c b/drivers/hid/i2c-hid/i2c-hid-of-goodix.c index ee0225982a82..31a4c229fdb7 100644 --- a/drivers/hid/i2c-hid/i2c-hid-of-goodix.c +++ b/drivers/hid/i2c-hid/i2c-hid-of-goodix.c @@ -26,28 +26,29 @@ struct i2c_hid_of_goodix { struct i2chid_ops ops;
struct regulator *vdd; + struct notifier_block nb; + struct mutex regulator_mutex; struct gpio_desc *reset_gpio; const struct goodix_i2c_hid_timing_data *timings; };
-static int goodix_i2c_hid_power_up(struct i2chid_ops *ops) +static void goodix_i2c_hid_deassert_reset(struct i2c_hid_of_goodix *ihid_goodix, + bool regulator_just_turned_on) { - struct i2c_hid_of_goodix *ihid_goodix = - container_of(ops, struct i2c_hid_of_goodix, ops); - int ret; - - ret = regulator_enable(ihid_goodix->vdd); - if (ret) - return ret; - - if (ihid_goodix->timings->post_power_delay_ms) + if (regulator_just_turned_on && ihid_goodix->timings->post_power_delay_ms) msleep(ihid_goodix->timings->post_power_delay_ms);
gpiod_set_value_cansleep(ihid_goodix->reset_gpio, 0); if (ihid_goodix->timings->post_gpio_reset_delay_ms) msleep(ihid_goodix->timings->post_gpio_reset_delay_ms); +}
- return 0; +static int goodix_i2c_hid_power_up(struct i2chid_ops *ops) +{ + struct i2c_hid_of_goodix *ihid_goodix = + container_of(ops, struct i2c_hid_of_goodix, ops); + + return regulator_enable(ihid_goodix->vdd); }
static void goodix_i2c_hid_power_down(struct i2chid_ops *ops) @@ -55,20 +56,54 @@ static void goodix_i2c_hid_power_down(struct i2chid_ops *ops) struct i2c_hid_of_goodix *ihid_goodix = container_of(ops, struct i2c_hid_of_goodix, ops);
- gpiod_set_value_cansleep(ihid_goodix->reset_gpio, 1); regulator_disable(ihid_goodix->vdd); }
+static int ihid_goodix_vdd_notify(struct notifier_block *nb, + unsigned long event, + void *ignored) +{ + struct i2c_hid_of_goodix *ihid_goodix = + container_of(nb, struct i2c_hid_of_goodix, nb); + int ret = NOTIFY_OK; + + mutex_lock(&ihid_goodix->regulator_mutex); + + switch (event) { + case REGULATOR_EVENT_PRE_DISABLE: + gpiod_set_value_cansleep(ihid_goodix->reset_gpio, 1); + break; + + case REGULATOR_EVENT_ENABLE: + goodix_i2c_hid_deassert_reset(ihid_goodix, true); + break; + + case REGULATOR_EVENT_ABORT_DISABLE: + goodix_i2c_hid_deassert_reset(ihid_goodix, false); + break; + + default: + ret = NOTIFY_DONE; + break; + } + + mutex_unlock(&ihid_goodix->regulator_mutex); + + return ret; +} + static int i2c_hid_of_goodix_probe(struct i2c_client *client, const struct i2c_device_id *id) { struct i2c_hid_of_goodix *ihid_goodix; - + int ret; ihid_goodix = devm_kzalloc(&client->dev, sizeof(*ihid_goodix), GFP_KERNEL); if (!ihid_goodix) return -ENOMEM;
+ mutex_init(&ihid_goodix->regulator_mutex); + ihid_goodix->ops.power_up = goodix_i2c_hid_power_up; ihid_goodix->ops.power_down = goodix_i2c_hid_power_down;
@@ -84,6 +119,37 @@ static int i2c_hid_of_goodix_probe(struct i2c_client *client,
ihid_goodix->timings = device_get_match_data(&client->dev);
+ /* + * We need to control the "reset" line in lockstep with the regulator + * actually turning on an off instead of just when we make the request. + * This matters if the regulator is shared with another consumer. + * - If the regulator is off then we must assert reset. The reset + * line is active low and on some boards it could cause a current + * leak if left high. + * - If the regulator is on then we don't want reset asserted for very + * long. Holding the controller in reset apparently draws extra + * power. + */ + mutex_lock(&ihid_goodix->regulator_mutex); + ihid_goodix->nb.notifier_call = ihid_goodix_vdd_notify; + ret = regulator_register_notifier(ihid_goodix->vdd, &ihid_goodix->nb); + if (ret) { + mutex_unlock(&ihid_goodix->regulator_mutex); + return dev_err_probe(&client->dev, ret, + "regulator notifier request failed\n"); + } + + /* + * If someone else is holding the regulator on (or the regulator is + * an always-on one) we might never be told to deassert reset. Do it + * now. Here we'll assume that someone else might have _just + * barely_ turned the regulator on so we'll do the full + * "post_power_delay" just in case. + */ + if (ihid_goodix->reset_gpio && regulator_is_enabled(ihid_goodix->vdd)) + goodix_i2c_hid_deassert_reset(ihid_goodix, true); + mutex_unlock(&ihid_goodix->regulator_mutex); + return i2c_hid_core_probe(client, &ihid_goodix->ops, 0x0001); }
From: Jon Lin jon.lin@rock-chips.com
[ Upstream commit 0be3df186f870cbde56b223c1ad7892109c9c440 ]
Choose the correct pll
Signed-off-by: Elaine Zhang zhangqing@rock-chips.com Signed-off-by: Jon Lin jon.lin@rock-chips.com Acked-by: Stephen Boyd sboyd@kernel.org Link: https://lore.kernel.org/r/20210713094456.23288-5-jon.lin@rock-chips.com Signed-off-by: Heiko Stuebner heiko@sntech.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/rockchip/clk-rk3036.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/clk/rockchip/clk-rk3036.c b/drivers/clk/rockchip/clk-rk3036.c index 614845cc5b4a..c38ad4ec8746 100644 --- a/drivers/clk/rockchip/clk-rk3036.c +++ b/drivers/clk/rockchip/clk-rk3036.c @@ -121,6 +121,7 @@ PNAME(mux_pll_src_3plls_p) = { "apll", "dpll", "gpll" }; PNAME(mux_timer_p) = { "xin24m", "pclk_peri_src" };
PNAME(mux_pll_src_apll_dpll_gpll_usb480m_p) = { "apll", "dpll", "gpll", "usb480m" }; +PNAME(mux_pll_src_dmyapll_dpll_gpll_xin24_p) = { "dummy_apll", "dpll", "gpll", "xin24m" };
PNAME(mux_mmc_src_p) = { "apll", "dpll", "gpll", "xin24m" }; PNAME(mux_i2s_pre_p) = { "i2s_src", "i2s_frac", "ext_i2s", "xin12m" }; @@ -340,7 +341,7 @@ static struct rockchip_clk_branch rk3036_clk_branches[] __initdata = { RK2928_CLKSEL_CON(16), 8, 2, MFLAGS, 10, 5, DFLAGS, RK2928_CLKGATE_CON(10), 4, GFLAGS),
- COMPOSITE(SCLK_SFC, "sclk_sfc", mux_pll_src_apll_dpll_gpll_usb480m_p, 0, + COMPOSITE(SCLK_SFC, "sclk_sfc", mux_pll_src_dmyapll_dpll_gpll_xin24_p, 0, RK2928_CLKSEL_CON(16), 0, 2, MFLAGS, 2, 5, DFLAGS, RK2928_CLKGATE_CON(10), 5, GFLAGS),
From: Mike Christie michael.christie@oracle.com
[ Upstream commit 7b0ddc1346089b62b45e688e350c9e1c3f7a3ab2 ]
This fixes a bug found by Lv Yunlong where, because beiscsi_exec_nemb_cmd() frees memory for the be_dma_mem cmd(), we can access freed memory when beiscsi_if_clr_ip()/beiscsi_if_set_ip()'s call to beiscsi_exec_nemb_cmd() fails and we access the freed req. This fixes the issue by having the caller free the cmd's memory.
Link: https://lore.kernel.org/r/20210701190840.175120-1-michael.christie@oracle.co... Reported-by: Lv Yunlong lyl2019@mail.ustc.edu.cn Signed-off-by: Mike Christie michael.christie@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/be2iscsi/be_mgmt.c | 84 ++++++++++++++++++--------------- 1 file changed, 45 insertions(+), 39 deletions(-)
diff --git a/drivers/scsi/be2iscsi/be_mgmt.c b/drivers/scsi/be2iscsi/be_mgmt.c index 462717bbb5b7..4e899ec1477d 100644 --- a/drivers/scsi/be2iscsi/be_mgmt.c +++ b/drivers/scsi/be2iscsi/be_mgmt.c @@ -235,8 +235,7 @@ static int beiscsi_exec_nemb_cmd(struct beiscsi_hba *phba, wrb = alloc_mcc_wrb(phba, &tag); if (!wrb) { mutex_unlock(&ctrl->mbox_lock); - rc = -ENOMEM; - goto free_cmd; + return -ENOMEM; }
sge = nonembedded_sgl(wrb); @@ -269,24 +268,6 @@ static int beiscsi_exec_nemb_cmd(struct beiscsi_hba *phba, /* copy the response, if any */ if (resp_buf) memcpy(resp_buf, nonemb_cmd->va, resp_buf_len); - /** - * This is special case of NTWK_GET_IF_INFO where the size of - * response is not known. beiscsi_if_get_info checks the return - * value to free DMA buffer. - */ - if (rc == -EAGAIN) - return rc; - - /** - * If FW is busy that is driver timed out, DMA buffer is saved with - * the tag, only when the cmd completes this buffer is freed. - */ - if (rc == -EBUSY) - return rc; - -free_cmd: - dma_free_coherent(&ctrl->pdev->dev, nonemb_cmd->size, - nonemb_cmd->va, nonemb_cmd->dma); return rc; }
@@ -309,6 +290,19 @@ static int beiscsi_prep_nemb_cmd(struct beiscsi_hba *phba, return 0; }
+static void beiscsi_free_nemb_cmd(struct beiscsi_hba *phba, + struct be_dma_mem *cmd, int rc) +{ + /* + * If FW is busy the DMA buffer is saved with the tag. When the cmd + * completes this buffer is freed. + */ + if (rc == -EBUSY) + return; + + dma_free_coherent(&phba->ctrl.pdev->dev, cmd->size, cmd->va, cmd->dma); +} + static void __beiscsi_eq_delay_compl(struct beiscsi_hba *phba, unsigned int tag) { struct be_dma_mem *tag_mem; @@ -344,8 +338,16 @@ int beiscsi_modify_eq_delay(struct beiscsi_hba *phba, cpu_to_le32(set_eqd[i].delay_multiplier); }
- return beiscsi_exec_nemb_cmd(phba, &nonemb_cmd, - __beiscsi_eq_delay_compl, NULL, 0); + rc = beiscsi_exec_nemb_cmd(phba, &nonemb_cmd, __beiscsi_eq_delay_compl, + NULL, 0); + if (rc) { + /* + * Only free on failure. Async cmds are handled like -EBUSY + * where it's handled for us. + */ + beiscsi_free_nemb_cmd(phba, &nonemb_cmd, rc); + } + return rc; }
/** @@ -372,6 +374,7 @@ int beiscsi_get_initiator_name(struct beiscsi_hba *phba, char *name, bool cfg) req->hdr.version = 1; rc = beiscsi_exec_nemb_cmd(phba, &nonemb_cmd, NULL, &resp, sizeof(resp)); + beiscsi_free_nemb_cmd(phba, &nonemb_cmd, rc); if (rc) { beiscsi_log(phba, KERN_ERR, BEISCSI_LOG_CONFIG | BEISCSI_LOG_MBOX, @@ -449,7 +452,9 @@ static int beiscsi_if_mod_gw(struct beiscsi_hba *phba, req->ip_addr.ip_type = ip_type; memcpy(req->ip_addr.addr, gw, (ip_type < BEISCSI_IP_TYPE_V6) ? IP_V4_LEN : IP_V6_LEN); - return beiscsi_exec_nemb_cmd(phba, &nonemb_cmd, NULL, NULL, 0); + rt_val = beiscsi_exec_nemb_cmd(phba, &nonemb_cmd, NULL, NULL, 0); + beiscsi_free_nemb_cmd(phba, &nonemb_cmd, rt_val); + return rt_val; }
int beiscsi_if_set_gw(struct beiscsi_hba *phba, u32 ip_type, u8 *gw) @@ -499,8 +504,10 @@ int beiscsi_if_get_gw(struct beiscsi_hba *phba, u32 ip_type, req = nonemb_cmd.va; req->ip_type = ip_type;
- return beiscsi_exec_nemb_cmd(phba, &nonemb_cmd, NULL, - resp, sizeof(*resp)); + rc = beiscsi_exec_nemb_cmd(phba, &nonemb_cmd, NULL, resp, + sizeof(*resp)); + beiscsi_free_nemb_cmd(phba, &nonemb_cmd, rc); + return rc; }
static int @@ -537,6 +544,7 @@ beiscsi_if_clr_ip(struct beiscsi_hba *phba, "BG_%d : failed to clear IP: rc %d status %d\n", rc, req->ip_params.ip_record.status); } + beiscsi_free_nemb_cmd(phba, &nonemb_cmd, rc); return rc; }
@@ -581,6 +589,7 @@ beiscsi_if_set_ip(struct beiscsi_hba *phba, u8 *ip, if (req->ip_params.ip_record.status) rc = -EINVAL; } + beiscsi_free_nemb_cmd(phba, &nonemb_cmd, rc); return rc; }
@@ -608,6 +617,7 @@ int beiscsi_if_en_static(struct beiscsi_hba *phba, u32 ip_type, reldhcp->interface_hndl = phba->interface_handle; reldhcp->ip_type = ip_type; rc = beiscsi_exec_nemb_cmd(phba, &nonemb_cmd, NULL, NULL, 0); + beiscsi_free_nemb_cmd(phba, &nonemb_cmd, rc); if (rc < 0) { beiscsi_log(phba, KERN_WARNING, BEISCSI_LOG_CONFIG, "BG_%d : failed to release existing DHCP: %d\n", @@ -689,7 +699,7 @@ int beiscsi_if_en_dhcp(struct beiscsi_hba *phba, u32 ip_type) dhcpreq->interface_hndl = phba->interface_handle; dhcpreq->ip_type = ip_type; rc = beiscsi_exec_nemb_cmd(phba, &nonemb_cmd, NULL, NULL, 0); - + beiscsi_free_nemb_cmd(phba, &nonemb_cmd, rc); exit: kfree(if_info); return rc; @@ -762,11 +772,8 @@ int beiscsi_if_get_info(struct beiscsi_hba *phba, int ip_type, BEISCSI_LOG_INIT | BEISCSI_LOG_CONFIG, "BG_%d : Memory Allocation Failure\n");
- /* Free the DMA memory for the IOCTL issuing */ - dma_free_coherent(&phba->ctrl.pdev->dev, - nonemb_cmd.size, - nonemb_cmd.va, - nonemb_cmd.dma); + beiscsi_free_nemb_cmd(phba, &nonemb_cmd, + -ENOMEM); return -ENOMEM; }
@@ -781,15 +788,13 @@ int beiscsi_if_get_info(struct beiscsi_hba *phba, int ip_type, nonemb_cmd.va)->actual_resp_len; ioctl_size += sizeof(struct be_cmd_req_hdr);
- /* Free the previous allocated DMA memory */ - dma_free_coherent(&phba->ctrl.pdev->dev, nonemb_cmd.size, - nonemb_cmd.va, - nonemb_cmd.dma); - + beiscsi_free_nemb_cmd(phba, &nonemb_cmd, rc); /* Free the virtual memory */ kfree(*if_info); - } else + } else { + beiscsi_free_nemb_cmd(phba, &nonemb_cmd, rc); break; + } } while (true); return rc; } @@ -806,8 +811,9 @@ int mgmt_get_nic_conf(struct beiscsi_hba *phba, if (rc) return rc;
- return beiscsi_exec_nemb_cmd(phba, &nonemb_cmd, NULL, - nic, sizeof(*nic)); + rc = beiscsi_exec_nemb_cmd(phba, &nonemb_cmd, NULL, nic, sizeof(*nic)); + beiscsi_free_nemb_cmd(phba, &nonemb_cmd, rc); + return rc; }
static void beiscsi_boot_process_compl(struct beiscsi_hba *phba,
From: James Smart jsmart2021@gmail.com
[ Upstream commit ae463b60235e7a5decffbb0bd7209952ccda78eb ]
The NVMe support indicator in log message 6422 is displaying a field that was initialized but never set to indicate NVMe support. Remove obsolete nvme_support element from the lpfc_hba structure and change log message to display NVMe support status as reported in SLI4 Config Parameters mailbox command.
Link: https://lore.kernel.org/r/20210707184351.67872-2-jsmart2021@gmail.com Co-developed-by: Justin Tee justin.tee@broadcom.com Signed-off-by: Justin Tee justin.tee@broadcom.com Signed-off-by: James Smart jsmart2021@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/lpfc/lpfc.h | 1 - drivers/scsi/lpfc/lpfc_init.c | 3 +-- 2 files changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h index 17028861234b..dd3ddfa5f761 100644 --- a/drivers/scsi/lpfc/lpfc.h +++ b/drivers/scsi/lpfc/lpfc.h @@ -922,7 +922,6 @@ struct lpfc_hba { uint8_t wwpn[8]; uint32_t RandomData[7]; uint8_t fcp_embed_io; - uint8_t nvme_support; /* Firmware supports NVME */ uint8_t nvmet_support; /* driver supports NVMET */ #define LPFC_NVMET_MAX_PORTS 32 uint8_t mds_diags_support; diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c index e29523a1b530..4bbc621e77e9 100644 --- a/drivers/scsi/lpfc/lpfc_init.c +++ b/drivers/scsi/lpfc/lpfc_init.c @@ -12241,7 +12241,6 @@ lpfc_get_sli4_parameters(struct lpfc_hba *phba, LPFC_MBOXQ_t *mboxq) bf_get(cfg_xib, mbx_sli4_parameters), phba->cfg_enable_fc4_type); fcponly: - phba->nvme_support = 0; phba->nvmet_support = 0; phba->cfg_nvmet_mrq = 0; phba->cfg_nvme_seg_cnt = 0; @@ -12299,7 +12298,7 @@ lpfc_get_sli4_parameters(struct lpfc_hba *phba, LPFC_MBOXQ_t *mboxq) "6422 XIB %d PBDE %d: FCP %d NVME %d %d %d\n", bf_get(cfg_xib, mbx_sli4_parameters), phba->cfg_enable_pbde, - phba->fcp_embed_io, phba->nvme_support, + phba->fcp_embed_io, sli4_params->nvme, phba->cfg_nvme_embed_cmd, phba->cfg_suppress_rsp);
if ((bf_get(lpfc_sli_intf_if_type, &phba->sli4_hba.sli_intf) ==
From: James Smart jsmart2021@gmail.com
[ Upstream commit e8613084053d406c22914385a488e8b85072100c ]
There are instances when trace event logs are triggered from an interrupt context. The trace event log may attempt to alloc memory causing scheduling while atomic bug call traces.
Remove the need for the kmalloc'ed vport array when checking the log_verbose flag, which eliminates the need for any allocation.
Link: https://lore.kernel.org/r/20210707184351.67872-3-jsmart2021@gmail.com Co-developed-by: Justin Tee justin.tee@broadcom.com Signed-off-by: Justin Tee justin.tee@broadcom.com Signed-off-by: James Smart jsmart2021@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/lpfc/lpfc_init.c | 25 +++++++++++++++++-------- 1 file changed, 17 insertions(+), 8 deletions(-)
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c index 4bbc621e77e9..cb7773a9167d 100644 --- a/drivers/scsi/lpfc/lpfc_init.c +++ b/drivers/scsi/lpfc/lpfc_init.c @@ -14163,8 +14163,9 @@ void lpfc_dmp_dbg(struct lpfc_hba *phba) unsigned int temp_idx; int i; int j = 0; - unsigned long rem_nsec; - struct lpfc_vport **vports; + unsigned long rem_nsec, iflags; + bool log_verbose = false; + struct lpfc_vport *port_iterator;
/* Don't dump messages if we explicitly set log_verbose for the * physical port or any vport. @@ -14172,16 +14173,24 @@ void lpfc_dmp_dbg(struct lpfc_hba *phba) if (phba->cfg_log_verbose) return;
- vports = lpfc_create_vport_work_array(phba); - if (vports != NULL) { - for (i = 0; i <= phba->max_vpi && vports[i] != NULL; i++) { - if (vports[i]->cfg_log_verbose) { - lpfc_destroy_vport_work_array(phba, vports); + spin_lock_irqsave(&phba->port_list_lock, iflags); + list_for_each_entry(port_iterator, &phba->port_list, listentry) { + if (port_iterator->load_flag & FC_UNLOADING) + continue; + if (scsi_host_get(lpfc_shost_from_vport(port_iterator))) { + if (port_iterator->cfg_log_verbose) + log_verbose = true; + + scsi_host_put(lpfc_shost_from_vport(port_iterator)); + + if (log_verbose) { + spin_unlock_irqrestore(&phba->port_list_lock, + iflags); return; } } } - lpfc_destroy_vport_work_array(phba, vports); + spin_unlock_irqrestore(&phba->port_list_lock, iflags);
if (atomic_cmpxchg(&phba->dbg_log_dmping, 0, 1) != 0) return;
From: James Smart jsmart2021@gmail.com
[ Upstream commit 50baa1595d30412177da3b22625bffc1ce4f65d5 ]
Update comment headers for functions lpfc_vmid_cmd and lpfc_vmid_poll.
Link: https://lore.kernel.org/r/20210707184351.67872-5-jsmart2021@gmail.com Co-developed-by: Justin Tee justin.tee@broadcom.com Signed-off-by: Justin Tee justin.tee@broadcom.com Signed-off-by: James Smart jsmart2021@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/lpfc/lpfc_ct.c | 5 ++--- drivers/scsi/lpfc/lpfc_init.c | 2 +- 2 files changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/scsi/lpfc/lpfc_ct.c b/drivers/scsi/lpfc/lpfc_ct.c index 610b6dabb3b5..1acb8820a08e 100644 --- a/drivers/scsi/lpfc/lpfc_ct.c +++ b/drivers/scsi/lpfc/lpfc_ct.c @@ -3884,9 +3884,8 @@ lpfc_cmpl_ct_cmd_vmid(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb, /** * lpfc_vmid_cmd - Build and send a FDMI cmd to the specified NPort * @vport: pointer to a host virtual N_Port data structure. - * @ndlp: ndlp to send FDMI cmd to (if NULL use FDMI_DID) - * cmdcode: FDMI command to send - * mask: Mask of HBA or PORT Attributes to send + * @cmdcode: application server command code to send + * @vmid: pointer to vmid info structure * * Builds and sends a FDMI command using the CT subsystem. */ diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c index cb7773a9167d..9a84a0b06ee0 100644 --- a/drivers/scsi/lpfc/lpfc_init.c +++ b/drivers/scsi/lpfc/lpfc_init.c @@ -4845,7 +4845,7 @@ lpfc_sli4_fcf_redisc_wait_tmo(struct timer_list *t)
/** * lpfc_vmid_poll - VMID timeout detection - * @ptr: Map to lpfc_hba data structure pointer. + * @t: Timer context used to obtain the pointer to lpfc hba data structure. * * This routine is invoked when there is no I/O on by a VM for the specified * amount of time. When this situation is detected, the VMID has to be
From: James Smart jsmart2021@gmail.com
[ Upstream commit e77803bdbf0aad98d36b1d3fa082852831814edd ]
If a LOGO is received for a node that is in the NPR state, post a DEVICE_RM event to allow clean up of the logged out node.
Clearing the NLP_NPR_2B_DISC flag upon receipt of a LOGO ACC may cause skipping of processing outstanding PLOGIs triggered by parallel RSCN events. If an outstanding PLOGI is being retried and receipt of a LOGO ACC occurs, then allow the discovery state machine's PLOGI completion to clean up the node.
Link: https://lore.kernel.org/r/20210707184351.67872-6-jsmart2021@gmail.com Co-developed-by: Justin Tee justin.tee@broadcom.com Signed-off-by: Justin Tee justin.tee@broadcom.com Signed-off-by: James Smart jsmart2021@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/lpfc/lpfc_els.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c index e481f5fe29d7..b0c443a0cf92 100644 --- a/drivers/scsi/lpfc/lpfc_els.c +++ b/drivers/scsi/lpfc/lpfc_els.c @@ -4612,6 +4612,15 @@ lpfc_cmpl_els_logo_acc(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb, goto out;
if (ndlp->nlp_state == NLP_STE_NPR_NODE) { + + /* If PLOGI is being retried, PLOGI completion will cleanup the + * node. The NLP_NPR_2B_DISC flag needs to be retained to make + * progress on nodes discovered from last RSCN. + */ + if ((ndlp->nlp_flag & NLP_DELAY_TMO) && + (ndlp->nlp_last_elscmd == ELS_CMD_PLOGI)) + goto out; + /* NPort Recovery mode or node is just allocated */ if (!lpfc_nlp_not_used(ndlp)) { /* A LOGO is completing and the node is in NPR state. @@ -8948,6 +8957,9 @@ lpfc_els_unsol_buffer(struct lpfc_hba *phba, struct lpfc_sli_ring *pring, break; } lpfc_disc_state_machine(vport, ndlp, elsiocb, NLP_EVT_RCV_LOGO); + if (newnode) + lpfc_disc_state_machine(vport, ndlp, NULL, + NLP_EVT_DEVICE_RM); break; case ELS_CMD_PRLO: lpfc_debugfs_disc_trc(vport, LPFC_DISC_TRC_ELS_UNSOL,
From: James Smart jsmart2021@gmail.com
[ Upstream commit 21990d3d1861c7aa8e3e4ed98614f0c161c29b0c ]
Previous logic accidentally overrides the status variable to FAILURE when target reset status is SUCCESS.
Refactor the non-SUCCESS logic of lpfc_vmid_vport_cleanup(), which resolves the false override.
Link: https://lore.kernel.org/r/20210707184351.67872-7-jsmart2021@gmail.com Co-developed-by: Justin Tee justin.tee@broadcom.com Signed-off-by: Justin Tee justin.tee@broadcom.com Signed-off-by: James Smart jsmart2021@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/lpfc/lpfc_scsi.c | 68 +++++++++++++++++++---------------- 1 file changed, 37 insertions(+), 31 deletions(-)
diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c index 1b248c237be1..10002a13c5c6 100644 --- a/drivers/scsi/lpfc/lpfc_scsi.c +++ b/drivers/scsi/lpfc/lpfc_scsi.c @@ -6273,6 +6273,7 @@ lpfc_target_reset_handler(struct scsi_cmnd *cmnd) struct lpfc_scsi_event_header scsi_event; int status; u32 logit = LOG_FCP; + u32 dev_loss_tmo = vport->cfg_devloss_tmo; unsigned long flags; DECLARE_WAIT_QUEUE_HEAD_ONSTACK(waitq);
@@ -6314,39 +6315,44 @@ lpfc_target_reset_handler(struct scsi_cmnd *cmnd)
status = lpfc_send_taskmgmt(vport, cmnd, tgt_id, lun_id, FCP_TARGET_RESET); - if (status != SUCCESS) - logit = LOG_TRACE_EVENT; - spin_lock_irqsave(&pnode->lock, flags); - if (status != SUCCESS && - (!(pnode->upcall_flags & NLP_WAIT_FOR_LOGO)) && - !pnode->logo_waitq) { - pnode->logo_waitq = &waitq; - pnode->nlp_fcp_info &= ~NLP_FCP_2_DEVICE; - pnode->nlp_flag |= NLP_ISSUE_LOGO; - pnode->upcall_flags |= NLP_WAIT_FOR_LOGO; - spin_unlock_irqrestore(&pnode->lock, flags); - lpfc_unreg_rpi(vport, pnode); - wait_event_timeout(waitq, - (!(pnode->upcall_flags & NLP_WAIT_FOR_LOGO)), - msecs_to_jiffies(vport->cfg_devloss_tmo * - 1000)); - - if (pnode->upcall_flags & NLP_WAIT_FOR_LOGO) { - lpfc_printf_vlog(vport, KERN_ERR, LOG_TRACE_EVENT, - "0725 SCSI layer TGTRST failed & LOGO TMO " - " (%d, %llu) return x%x\n", tgt_id, - lun_id, status); - spin_lock_irqsave(&pnode->lock, flags); - pnode->upcall_flags &= ~NLP_WAIT_FOR_LOGO; + if (status != SUCCESS) { + logit = LOG_TRACE_EVENT; + + /* Issue LOGO, if no LOGO is outstanding */ + spin_lock_irqsave(&pnode->lock, flags); + if (!(pnode->upcall_flags & NLP_WAIT_FOR_LOGO) && + !pnode->logo_waitq) { + pnode->logo_waitq = &waitq; + pnode->nlp_fcp_info &= ~NLP_FCP_2_DEVICE; + pnode->nlp_flag |= NLP_ISSUE_LOGO; + pnode->upcall_flags |= NLP_WAIT_FOR_LOGO; + spin_unlock_irqrestore(&pnode->lock, flags); + lpfc_unreg_rpi(vport, pnode); + wait_event_timeout(waitq, + (!(pnode->upcall_flags & + NLP_WAIT_FOR_LOGO)), + msecs_to_jiffies(dev_loss_tmo * + 1000)); + + if (pnode->upcall_flags & NLP_WAIT_FOR_LOGO) { + lpfc_printf_vlog(vport, KERN_ERR, logit, + "0725 SCSI layer TGTRST " + "failed & LOGO TMO (%d, %llu) " + "return x%x\n", + tgt_id, lun_id, status); + spin_lock_irqsave(&pnode->lock, flags); + pnode->upcall_flags &= ~NLP_WAIT_FOR_LOGO; + } else { + spin_lock_irqsave(&pnode->lock, flags); + } + pnode->logo_waitq = NULL; + spin_unlock_irqrestore(&pnode->lock, flags); + status = SUCCESS; + } else { - spin_lock_irqsave(&pnode->lock, flags); + spin_unlock_irqrestore(&pnode->lock, flags); + status = FAILED; } - pnode->logo_waitq = NULL; - spin_unlock_irqrestore(&pnode->lock, flags); - status = SUCCESS; - } else { - status = FAILED; - spin_unlock_irqrestore(&pnode->lock, flags); }
lpfc_printf_vlog(vport, KERN_ERR, logit,
From: James Smart jsmart2021@gmail.com
[ Upstream commit 2d338eb55b14ab9d245e8b1d982adecca8c4c613 ]
RDF ELS handling for NPIV ports may result in an incorrect NDLP reference count. In the event of a persistent link down, this may lead to premature release of an NDLP structure and subsequent NULL ptr dereference panic.
Remove extraneous lpfc_nlp_put() call in NPIV port RDF processing.
Link: https://lore.kernel.org/r/20210707184351.67872-9-jsmart2021@gmail.com Co-developed-by: Justin Tee justin.tee@broadcom.com Signed-off-by: Justin Tee justin.tee@broadcom.com Signed-off-by: James Smart jsmart2021@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/lpfc/lpfc_els.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c index b0c443a0cf92..b1ca6f8e5970 100644 --- a/drivers/scsi/lpfc/lpfc_els.c +++ b/drivers/scsi/lpfc/lpfc_els.c @@ -3413,7 +3413,6 @@ lpfc_issue_els_scr(struct lpfc_vport *vport, uint8_t retry) return 1; }
- /* Keep the ndlp just in case RDF is being sent */ return 0; }
@@ -3657,11 +3656,9 @@ lpfc_issue_els_rdf(struct lpfc_vport *vport, uint8_t retry) lpfc_enqueue_node(vport, ndlp); }
- /* RDF ELS is not required on an NPIV VN_Port. */ - if (vport->port_type == LPFC_NPIV_PORT) { - lpfc_nlp_put(ndlp); + /* RDF ELS is not required on an NPIV VN_Port. */ + if (vport->port_type == LPFC_NPIV_PORT) return -EACCES; - }
elsiocb = lpfc_prep_els_iocb(vport, 1, cmdsize, retry, ndlp, ndlp->nlp_DID, ELS_CMD_RDF);
From: James Smart jsmart2021@gmail.com
[ Upstream commit cd6047e92c6a5b0a44479cf98f76aac56ddfe108 ]
The ELS job request structure, that is allocated while issuing ELS RDF/SCR request path, is not being released in an error path causing a memory leak message on driver unload.
Free the ELS job structure in the error paths.
Link: https://lore.kernel.org/r/20210707184351.67872-10-jsmart2021@gmail.com Co-developed-by: Justin Tee justin.tee@broadcom.com Signed-off-by: Justin Tee justin.tee@broadcom.com Signed-off-by: James Smart jsmart2021@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/lpfc/lpfc_els.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c index b1ca6f8e5970..3381912bf982 100644 --- a/drivers/scsi/lpfc/lpfc_els.c +++ b/drivers/scsi/lpfc/lpfc_els.c @@ -3375,6 +3375,7 @@ lpfc_issue_els_scr(struct lpfc_vport *vport, uint8_t retry) if (phba->sli_rev == LPFC_SLI_REV4) { rc = lpfc_reg_fab_ctrl_node(vport, ndlp); if (rc) { + lpfc_els_free_iocb(phba, elsiocb); lpfc_printf_vlog(vport, KERN_ERR, LOG_NODE, "0937 %s: Failed to reg fc node, rc %d\n", __func__, rc); @@ -3667,6 +3668,7 @@ lpfc_issue_els_rdf(struct lpfc_vport *vport, uint8_t retry)
if (phba->sli_rev == LPFC_SLI_REV4 && !(ndlp->nlp_flag & NLP_RPI_REGISTERED)) { + lpfc_els_free_iocb(phba, elsiocb); lpfc_printf_vlog(vport, KERN_ERR, LOG_NODE, "0939 %s: FC_NODE x%x RPI x%x flag x%x " "ste x%x type x%x Not registered\n",
From: James Smart jsmart2021@gmail.com
[ Upstream commit affbe24429410fddf4e50ca456c090ed6d8e05bf ]
In lpfc_offline_prep() an RPI is freed and nlp_rpi set to 0xFFFF before calling lpfc_unreg_rpi(). Unfortunately, lpfc_unreg_rpi() uses nlp_rpi to index the sli4_hba.rpi_ids[] array.
In lpfc_offline_prep(), unreg rpi before freeing the rpi.
Link: https://lore.kernel.org/r/20210707184351.67872-12-jsmart2021@gmail.com Co-developed-by: Justin Tee justin.tee@broadcom.com Signed-off-by: Justin Tee justin.tee@broadcom.com Signed-off-by: James Smart jsmart2021@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/lpfc/lpfc_init.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c index 9a84a0b06ee0..c4235699a123 100644 --- a/drivers/scsi/lpfc/lpfc_init.c +++ b/drivers/scsi/lpfc/lpfc_init.c @@ -3541,6 +3541,8 @@ lpfc_offline_prep(struct lpfc_hba *phba, int mbx_action) spin_lock_irq(&ndlp->lock); ndlp->nlp_flag &= ~NLP_NPR_ADISC; spin_unlock_irq(&ndlp->lock); + + lpfc_unreg_rpi(vports[i], ndlp); /* * Whenever an SLI4 port goes offline, free the * RPI. Get a new RPI when the adapter port @@ -3556,7 +3558,6 @@ lpfc_offline_prep(struct lpfc_hba *phba, int mbx_action) lpfc_sli4_free_rpi(phba, ndlp->nlp_rpi); ndlp->nlp_rpi = LPFC_RPI_ALLOC_ERROR; } - lpfc_unreg_rpi(vports[i], ndlp);
if (ndlp->nlp_type & NLP_FABRIC) { lpfc_disc_state_machine(vports[i], ndlp,
From: James Smart jsmart2021@gmail.com
[ Upstream commit a9978e3978406ef5e35870b10e677cf75a2620b6 ]
Mailbox commands sent via ioctl/bsg from user applications may be interrupted from processing by a concurrently triggered PCI function reset. The command will not generate a completion due to the reset. This results in a user application hang waiting for the mailbox command to complete.
Resolve by changing the function reset handler to detect that there was an outstanding mailbox command and simulate a mailbox completion. Add some additional debug when a mailbox command times out.
Link: https://lore.kernel.org/r/20210707184351.67872-13-jsmart2021@gmail.com Co-developed-by: Justin Tee justin.tee@broadcom.com Signed-off-by: Justin Tee justin.tee@broadcom.com Signed-off-by: James Smart jsmart2021@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/lpfc/lpfc_init.c | 11 ++++++++++- drivers/scsi/lpfc/lpfc_sli.c | 32 ++++++++++++++++++++++++++++++-- 2 files changed, 40 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c index c4235699a123..221f22a026e5 100644 --- a/drivers/scsi/lpfc/lpfc_init.c +++ b/drivers/scsi/lpfc/lpfc_init.c @@ -1852,6 +1852,7 @@ lpfc_sli4_port_sta_fn_reset(struct lpfc_hba *phba, int mbx_action, { int rc; uint32_t intr_mode; + LPFC_MBOXQ_t *mboxq;
if (bf_get(lpfc_sli_intf_if_type, &phba->sli4_hba.sli_intf) >= LPFC_SLI_INTF_IF_TYPE_2) { @@ -1871,11 +1872,19 @@ lpfc_sli4_port_sta_fn_reset(struct lpfc_hba *phba, int mbx_action, "Recovery...\n");
/* If we are no wait, the HBA has been reset and is not - * functional, thus we should clear LPFC_SLI_ACTIVE flag. + * functional, thus we should clear + * (LPFC_SLI_ACTIVE | LPFC_SLI_MBOX_ACTIVE) flags. */ if (mbx_action == LPFC_MBX_NO_WAIT) { spin_lock_irq(&phba->hbalock); phba->sli.sli_flag &= ~LPFC_SLI_ACTIVE; + if (phba->sli.mbox_active) { + mboxq = phba->sli.mbox_active; + mboxq->u.mb.mbxStatus = MBX_NOT_FINISHED; + __lpfc_mbox_cmpl_put(phba, mboxq); + phba->sli.sli_flag &= ~LPFC_SLI_MBOX_ACTIVE; + phba->sli.mbox_active = NULL; + } spin_unlock_irq(&phba->hbalock); }
diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c index f530d8fe7a8c..b6f58843c77a 100644 --- a/drivers/scsi/lpfc/lpfc_sli.c +++ b/drivers/scsi/lpfc/lpfc_sli.c @@ -8790,8 +8790,11 @@ static int lpfc_sli4_async_mbox_block(struct lpfc_hba *phba) { struct lpfc_sli *psli = &phba->sli; + LPFC_MBOXQ_t *mboxq; int rc = 0; unsigned long timeout = 0; + u32 sli_flag; + u8 cmd, subsys, opcode;
/* Mark the asynchronous mailbox command posting as blocked */ spin_lock_irq(&phba->hbalock); @@ -8809,12 +8812,37 @@ lpfc_sli4_async_mbox_block(struct lpfc_hba *phba) if (timeout) lpfc_sli4_process_missed_mbox_completions(phba);
- /* Wait for the outstnading mailbox command to complete */ + /* Wait for the outstanding mailbox command to complete */ while (phba->sli.mbox_active) { /* Check active mailbox complete status every 2ms */ msleep(2); if (time_after(jiffies, timeout)) { - /* Timeout, marked the outstanding cmd not complete */ + /* Timeout, mark the outstanding cmd not complete */ + + /* Sanity check sli.mbox_active has not completed or + * cancelled from another context during last 2ms sleep, + * so take hbalock to be sure before logging. + */ + spin_lock_irq(&phba->hbalock); + if (phba->sli.mbox_active) { + mboxq = phba->sli.mbox_active; + cmd = mboxq->u.mb.mbxCommand; + subsys = lpfc_sli_config_mbox_subsys_get(phba, + mboxq); + opcode = lpfc_sli_config_mbox_opcode_get(phba, + mboxq); + sli_flag = psli->sli_flag; + spin_unlock_irq(&phba->hbalock); + lpfc_printf_log(phba, KERN_ERR, LOG_TRACE_EVENT, + "2352 Mailbox command x%x " + "(x%x/x%x) sli_flag x%x could " + "not complete\n", + cmd, subsys, opcode, + sli_flag); + } else { + spin_unlock_irq(&phba->hbalock); + } + rc = 1; break; }
From: James Smart jsmart2021@gmail.com
[ Upstream commit ab803860882514ddbf97713b143b861b524e8476 ]
When a node moves to NPR state due to a device recovery event, the nlp_fc4_types in the node are cleared. An ADISC received for a node in the NPR state triggers an ADISC. Without fc4 types being known, the calls to register with the transport are no-op'd, thus no additional references are placed on the node by transport re-registrations. A subsequent RSCN could trigger another unregister request, which will decrement the reference counts, leading to the ref count hitting zero and the node being freed while futher discovery on the node is being attempted by the RSCN event handling.
Fix by skipping the trigger of an ADISC when in NPR state. The normal ADISC process will kick off in the regular discovery path after receiving a response from name server.
Link: https://lore.kernel.org/r/20210707184351.67872-19-jsmart2021@gmail.com Co-developed-by: Justin Tee justin.tee@broadcom.com Signed-off-by: Justin Tee justin.tee@broadcom.com Signed-off-by: James Smart jsmart2021@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/lpfc/lpfc_nportdisc.c | 34 +++++++++++++++++------------- 1 file changed, 19 insertions(+), 15 deletions(-)
diff --git a/drivers/scsi/lpfc/lpfc_nportdisc.c b/drivers/scsi/lpfc/lpfc_nportdisc.c index e12f83fb795c..6b06e5808e14 100644 --- a/drivers/scsi/lpfc/lpfc_nportdisc.c +++ b/drivers/scsi/lpfc/lpfc_nportdisc.c @@ -736,9 +736,13 @@ lpfc_rcv_padisc(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp, * is already in MAPPED or UNMAPPED state. Catch this * condition and don't set the nlp_state again because * it causes an unnecessary transport unregister/register. + * + * Nodes marked for ADISC will move MAPPED or UNMAPPED state + * after issuing ADISC */ if (ndlp->nlp_type & (NLP_FCP_TARGET | NLP_NVME_TARGET)) { - if (ndlp->nlp_state != NLP_STE_MAPPED_NODE) + if ((ndlp->nlp_state != NLP_STE_MAPPED_NODE) && + !(ndlp->nlp_flag & NLP_NPR_ADISC)) lpfc_nlp_set_state(vport, ndlp, NLP_STE_MAPPED_NODE); } @@ -2645,14 +2649,13 @@ lpfc_rcv_prli_npr_node(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp, lpfc_els_rsp_reject(vport, stat.un.lsRjtError, cmdiocb, ndlp, NULL);
if (!(ndlp->nlp_flag & NLP_DELAY_TMO)) { - if (ndlp->nlp_flag & NLP_NPR_ADISC) { - spin_lock_irq(&ndlp->lock); - ndlp->nlp_flag &= ~NLP_NPR_ADISC; - ndlp->nlp_prev_state = NLP_STE_NPR_NODE; - spin_unlock_irq(&ndlp->lock); - lpfc_nlp_set_state(vport, ndlp, NLP_STE_ADISC_ISSUE); - lpfc_issue_els_adisc(vport, ndlp, 0); - } else { + /* + * ADISC nodes will be handled in regular discovery path after + * receiving response from NS. + * + * For other nodes, Send PLOGI to trigger an implicit LOGO. + */ + if (!(ndlp->nlp_flag & NLP_NPR_ADISC)) { ndlp->nlp_prev_state = NLP_STE_NPR_NODE; lpfc_nlp_set_state(vport, ndlp, NLP_STE_PLOGI_ISSUE); lpfc_issue_els_plogi(vport, ndlp->nlp_DID, 0); @@ -2685,12 +2688,13 @@ lpfc_rcv_padisc_npr_node(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp, */ if (!(ndlp->nlp_flag & NLP_DELAY_TMO) && !(ndlp->nlp_flag & NLP_NPR_2B_DISC)) { - if (ndlp->nlp_flag & NLP_NPR_ADISC) { - ndlp->nlp_flag &= ~NLP_NPR_ADISC; - ndlp->nlp_prev_state = NLP_STE_NPR_NODE; - lpfc_nlp_set_state(vport, ndlp, NLP_STE_ADISC_ISSUE); - lpfc_issue_els_adisc(vport, ndlp, 0); - } else { + /* + * ADISC nodes will be handled in regular discovery path after + * receiving response from NS. + * + * For other nodes, Send PLOGI to trigger an implicit LOGO. + */ + if (!(ndlp->nlp_flag & NLP_NPR_ADISC)) { ndlp->nlp_prev_state = NLP_STE_NPR_NODE; lpfc_nlp_set_state(vport, ndlp, NLP_STE_PLOGI_ISSUE); lpfc_issue_els_plogi(vport, ndlp->nlp_DID, 0);
From: Yang Li yang.lee@linux.alibaba.com
[ Upstream commit 97c29755598f98c6c91f68f12bdd3f517e457890 ]
Currently the function returns NULL on error, so exact error code is lost. This patch changes return convention of the function to use ERR_PTR() on error instead.
Reported-by: Abaci Robot abaci@linux.alibaba.com Signed-off-by: Yang Li yang.lee@linux.alibaba.com Link: https://lore.kernel.org/r/1623896524-102058-1-git-send-email-yang.lee@linux.... [geert: Drop curly braces] Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/renesas/renesas-rzg2l-cpg.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/clk/renesas/renesas-rzg2l-cpg.c b/drivers/clk/renesas/renesas-rzg2l-cpg.c index e7c59af2a1d8..5fe73225ece2 100644 --- a/drivers/clk/renesas/renesas-rzg2l-cpg.c +++ b/drivers/clk/renesas/renesas-rzg2l-cpg.c @@ -182,10 +182,8 @@ rzg2l_cpg_pll_clk_register(const struct cpg_core_clk *core, return ERR_CAST(parent);
pll_clk = devm_kzalloc(dev, sizeof(*pll_clk), GFP_KERNEL); - if (!pll_clk) { - clk = ERR_PTR(-ENOMEM); - return NULL; - } + if (!pll_clk) + return ERR_PTR(-ENOMEM);
parent_name = __clk_get_name(parent); init.name = core->name;
From: Jia Yang jiayang5@huawei.com
[ Upstream commit 10d0786b39b3b91c4fbf8c2926e97ab456a4eea1 ]
This reverts commit 957fa47823dfe449c5a15a944e4e7a299a6601db.
The patch "f2fs: Fix indefinite loop in f2fs_gc()" v1 and v4 are all merged. Patch v4 is test info for patch v1. Patch v1 doesn't work and may cause that sbi->cur_victim_sec can't be resetted to NULL_SEGNO, which makes SSR unable to get segment of sbi->cur_victim_sec. So it should be reverted.
The mails record: [1] https://lore.kernel.org/linux-f2fs-devel/7288dcd4-b168-7656-d1af-7e2cafa4f72... [2] https://lore.kernel.org/linux-f2fs-devel/20190809153653.GD93481@jaegeuk-macb...
Signed-off-by: Jia Yang jiayang5@huawei.com Reviewed-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/f2fs/gc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c index 0e42ee5f7770..396b6f55ec24 100644 --- a/fs/f2fs/gc.c +++ b/fs/f2fs/gc.c @@ -1747,7 +1747,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync, round++; }
- if (gc_type == FG_GC && seg_freed) + if (gc_type == FG_GC) sbi->cur_victim_sec = NULL_SEGNO;
if (sync)
From: Mike McGowen mike.mcgowen@microchip.com
[ Upstream commit 0777a3fb98f0ea546561d04db4fd325248c39961 ]
Correct driver's ISR accessing a data structure member that has not been fully initialized during driver initialization.
The pqi queue groups can have uninitialized members when an interrupt fires. This has not resulted in any driver crashes. This was found during our own internal testing. No bugs were ever filed.
Link: https://lore.kernel.org/r/20210714182847.50360-9-don.brace@microchip.com Reviewed-by: Kevin Barnett kevin.barnett@microchip.com Reviewed-by: Scott Benesh scott.benesh@microchip.com Reviewed-by: Scott Teel scott.teel@microchip.com Signed-off-by: Mike McGowen mike.mcgowen@microchip.com Signed-off-by: Don Brace don.brace@microchip.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/smartpqi/smartpqi_init.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c index dcc0b9618a64..1bda105f7892 100644 --- a/drivers/scsi/smartpqi/smartpqi_init.c +++ b/drivers/scsi/smartpqi/smartpqi_init.c @@ -7758,11 +7758,11 @@ static int pqi_ctrl_init(struct pqi_ctrl_info *ctrl_info)
pqi_init_operational_queues(ctrl_info);
- rc = pqi_request_irqs(ctrl_info); + rc = pqi_create_queues(ctrl_info); if (rc) return rc;
- rc = pqi_create_queues(ctrl_info); + rc = pqi_request_irqs(ctrl_info); if (rc) return rc;
From: Jaegeuk Kim jaegeuk@kernel.org
[ Upstream commit 2eeb0dce728a7eac3e4dfe355d98af40d61f7a26 ]
This tries to fix priority inversion in the below condition resulting in long checkpoint delay.
f2fs_get_node_info() - nat_tree_lock -> sleep to grab journal_rwsem by contention
checkpoint - waiting for nat_tree_lock
In order to let checkpoint go, let's release nat_tree_lock, if there's a journal_rwsem contention.
Signed-off-by: Daeho Jeong daehojeong@google.com Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/f2fs/node.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c index 0be9e2d7120e..c945a9730f3c 100644 --- a/fs/f2fs/node.c +++ b/fs/f2fs/node.c @@ -552,7 +552,7 @@ int f2fs_get_node_info(struct f2fs_sb_info *sbi, nid_t nid, int i;
ni->nid = nid; - +retry: /* Check nat cache */ down_read(&nm_i->nat_tree_lock); e = __lookup_nat_cache(nm_i, nid); @@ -564,10 +564,19 @@ int f2fs_get_node_info(struct f2fs_sb_info *sbi, nid_t nid, return 0; }
- memset(&ne, 0, sizeof(struct f2fs_nat_entry)); + /* + * Check current segment summary by trying to grab journal_rwsem first. + * This sem is on the critical path on the checkpoint requiring the above + * nat_tree_lock. Therefore, we should retry, if we failed to grab here + * while not bothering checkpoint. + */ + if (!rwsem_is_locked(&sbi->cp_global_sem)) { + down_read(&curseg->journal_rwsem); + } else if (!down_read_trylock(&curseg->journal_rwsem)) { + up_read(&nm_i->nat_tree_lock); + goto retry; + }
- /* Check current segment summary */ - down_read(&curseg->journal_rwsem); i = f2fs_lookup_journal_in_cursum(journal, NAT_JOURNAL, nid, 0); if (i >= 0) { ne = nat_in_journal(journal, i);
From: Lennert Buytenhek buytenh@wantstofly.org
[ Upstream commit ee974d9625c405977ef5d9aedc476be1d0362ebf ]
For the printing of RMP_HW_ERROR / RMP_PAGE_FAULT / IO_PAGE_FAULT events, the AMD IOMMU code uses such logic:
if (pdev) dev_data = dev_iommu_priv_get(&pdev->dev);
if (dev_data && __ratelimit(&dev_data->rs)) { pci_err(pdev, ... } else { printk_ratelimit() / pr_err{,_ratelimited}(... }
This means that if we receive an event for a PCI devid which actually does have a struct pci_dev and an attached struct iommu_dev_data, but rate limiting kicks in, we'll fall back to the non-PCI branch of the test, and print the event in a different format.
Fix this by changing the logic to:
if (dev_data) { if (__ratelimit(&dev_data->rs)) { pci_err(pdev, ... } } else { pr_err_ratelimited(... }
Suggested-by: Suravee Suthikulpanit suravee.suthikulpanit@amd.com Signed-off-by: Lennert Buytenhek buytenh@wantstofly.org Reviewed-by: Suravee Suthikulpanit suravee.suthikulpanit@amd.com Link: https://lore.kernel.org/r/YPgk1dD1gPMhJXgY@wantstofly.org Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/amd/iommu.c | 28 +++++++++++++++++----------- 1 file changed, 17 insertions(+), 11 deletions(-)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 811a49a95d04..a7d6d78147b7 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -425,9 +425,11 @@ static void amd_iommu_report_rmp_hw_error(volatile u32 *event) if (pdev) dev_data = dev_iommu_priv_get(&pdev->dev);
- if (dev_data && __ratelimit(&dev_data->rs)) { - pci_err(pdev, "Event logged [RMP_HW_ERROR vmg_tag=0x%04x, spa=0x%llx, flags=0x%04x]\n", - vmg_tag, spa, flags); + if (dev_data) { + if (__ratelimit(&dev_data->rs)) { + pci_err(pdev, "Event logged [RMP_HW_ERROR vmg_tag=0x%04x, spa=0x%llx, flags=0x%04x]\n", + vmg_tag, spa, flags); + } } else { pr_err_ratelimited("Event logged [RMP_HW_ERROR device=%02x:%02x.%x, vmg_tag=0x%04x, spa=0x%llx, flags=0x%04x]\n", PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid), @@ -456,9 +458,11 @@ static void amd_iommu_report_rmp_fault(volatile u32 *event) if (pdev) dev_data = dev_iommu_priv_get(&pdev->dev);
- if (dev_data && __ratelimit(&dev_data->rs)) { - pci_err(pdev, "Event logged [RMP_PAGE_FAULT vmg_tag=0x%04x, gpa=0x%llx, flags_rmp=0x%04x, flags=0x%04x]\n", - vmg_tag, gpa, flags_rmp, flags); + if (dev_data) { + if (__ratelimit(&dev_data->rs)) { + pci_err(pdev, "Event logged [RMP_PAGE_FAULT vmg_tag=0x%04x, gpa=0x%llx, flags_rmp=0x%04x, flags=0x%04x]\n", + vmg_tag, gpa, flags_rmp, flags); + } } else { pr_err_ratelimited("Event logged [RMP_PAGE_FAULT device=%02x:%02x.%x, vmg_tag=0x%04x, gpa=0x%llx, flags_rmp=0x%04x, flags=0x%04x]\n", PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid), @@ -480,11 +484,13 @@ static void amd_iommu_report_page_fault(u16 devid, u16 domain_id, if (pdev) dev_data = dev_iommu_priv_get(&pdev->dev);
- if (dev_data && __ratelimit(&dev_data->rs)) { - pci_err(pdev, "Event logged [IO_PAGE_FAULT domain=0x%04x address=0x%llx flags=0x%04x]\n", - domain_id, address, flags); - } else if (printk_ratelimit()) { - pr_err("Event logged [IO_PAGE_FAULT device=%02x:%02x.%x domain=0x%04x address=0x%llx flags=0x%04x]\n", + if (dev_data) { + if (__ratelimit(&dev_data->rs)) { + pci_err(pdev, "Event logged [IO_PAGE_FAULT domain=0x%04x address=0x%llx flags=0x%04x]\n", + domain_id, address, flags); + } + } else { + pr_err_ratelimited("Event logged [IO_PAGE_FAULT device=%02x:%02x.%x domain=0x%04x address=0x%llx flags=0x%04x]\n", PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid), domain_id, address, flags); }
From: James Smart jsmart2021@gmail.com
[ Upstream commit df3d78c3eb4eba13b3ef9740a8c664508ee644ae ]
On the newer hardware, CQ_ID values can be larger than seen on previous generations. This exposed an issue in the driver where its definition of cq_id in the RQ Create mailbox cmd was too small, thus the cq_id was truncated, causing the command to fail.
Revise the RQ_CREATE CQ_ID field to its proper size (16 bits).
Link: https://lore.kernel.org/r/20210722221721.74388-3-jsmart2021@gmail.com Co-developed-by: Justin Tee justin.tee@broadcom.com Signed-off-by: Justin Tee justin.tee@broadcom.com Signed-off-by: James Smart jsmart2021@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/lpfc/lpfc_hw4.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/lpfc/lpfc_hw4.h b/drivers/scsi/lpfc/lpfc_hw4.h index eb8c735a243b..773a317e984f 100644 --- a/drivers/scsi/lpfc/lpfc_hw4.h +++ b/drivers/scsi/lpfc/lpfc_hw4.h @@ -1555,7 +1555,7 @@ struct rq_context { #define lpfc_rq_context_hdr_size_WORD word1 uint32_t word2; #define lpfc_rq_context_cq_id_SHIFT 16 -#define lpfc_rq_context_cq_id_MASK 0x000003FF +#define lpfc_rq_context_cq_id_MASK 0x0000FFFF #define lpfc_rq_context_cq_id_WORD word2 #define lpfc_rq_context_buf_size_SHIFT 0 #define lpfc_rq_context_buf_size_MASK 0x0000FFFF
From: Geert Uytterhoeven geert+renesas@glider.be
[ Upstream commit 91d1be9fb7d667ae136f05cc645276eb2c9fa40e ]
As R-Car H3 ES1.x (R8A77950) and R-Car ES2.0+ (R8A77951) use the same compatible value, the pin control driver relies on soc_device_match() with soc_id = "r8a7795" and the (non)matching of revision = "ES1.*" to match with and distinguish between the two SoC variants. The corresponding entries in the normal of_match_table are present only to make the optional sanity checks work.
The R-Car H3e-2G (R8A779M1) SoC is a different grading of the R-Car H3 ES3.0 (R8A77951) SoC. It uses the same compatible values for individual devices, but has an additional compatible value for the root node. When running on an R-Car H3e-2G SoC, soc_device_match() with soc_id = "r8a7795" does not return a match. Hence the pin control driver falls back to the normal of_match_table, and, as the R8A77950 entry is listed first, incorrectly uses the sub-driver for R-Car H3 ES1.x.
Fix this by moving the entry for R8A77951 before the entry for R8A77950. Simplify sh_pfc_quirk_match() to only handle R-Car H3 ES1,x, as R-Car H3 ES2.0+ can now be matched using the normal of_match_table as well.
Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Reviewed-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Reviewed-by: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com Link: https://lore.kernel.org/r/6cdc5bfa424461105779b56f455387e03560cf66.162670768... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pinctrl/renesas/core.c | 29 ++++++++++++----------------- drivers/pinctrl/renesas/sh_pfc.h | 4 ++-- 2 files changed, 14 insertions(+), 19 deletions(-)
diff --git a/drivers/pinctrl/renesas/core.c b/drivers/pinctrl/renesas/core.c index 5ccc49b387f1..f2ab02225837 100644 --- a/drivers/pinctrl/renesas/core.c +++ b/drivers/pinctrl/renesas/core.c @@ -571,17 +571,21 @@ static const struct of_device_id sh_pfc_of_table[] = { .data = &r8a7794_pinmux_info, }, #endif -/* Both r8a7795 entries must be present to make sanity checks work */ -#ifdef CONFIG_PINCTRL_PFC_R8A77950 +/* + * Both r8a7795 entries must be present to make sanity checks work, but only + * the first entry is actually used. + * R-Car H3 ES1.x is matched using soc_device_match() instead. + */ +#ifdef CONFIG_PINCTRL_PFC_R8A77951 { .compatible = "renesas,pfc-r8a7795", - .data = &r8a77950_pinmux_info, + .data = &r8a77951_pinmux_info, }, #endif -#ifdef CONFIG_PINCTRL_PFC_R8A77951 +#ifdef CONFIG_PINCTRL_PFC_R8A77950 { .compatible = "renesas,pfc-r8a7795", - .data = &r8a77951_pinmux_info, + .data = &r8a77950_pinmux_info, }, #endif #ifdef CONFIG_PINCTRL_PFC_R8A77960 @@ -1085,26 +1089,20 @@ static inline void sh_pfc_check_driver(struct platform_driver *pdrv) {} #ifdef CONFIG_OF static const void *sh_pfc_quirk_match(void) { -#if defined(CONFIG_PINCTRL_PFC_R8A77950) || \ - defined(CONFIG_PINCTRL_PFC_R8A77951) +#ifdef CONFIG_PINCTRL_PFC_R8A77950 const struct soc_device_attribute *match; static const struct soc_device_attribute quirks[] = { { .soc_id = "r8a7795", .revision = "ES1.*", .data = &r8a77950_pinmux_info, }, - { - .soc_id = "r8a7795", - .data = &r8a77951_pinmux_info, - }, - { /* sentinel */ } };
match = soc_device_match(quirks); if (match) - return match->data ?: ERR_PTR(-ENODEV); -#endif /* CONFIG_PINCTRL_PFC_R8A77950 || CONFIG_PINCTRL_PFC_R8A77951 */ + return match->data; +#endif /* CONFIG_PINCTRL_PFC_R8A77950 */
return NULL; } @@ -1119,9 +1117,6 @@ static int sh_pfc_probe(struct platform_device *pdev) #ifdef CONFIG_OF if (pdev->dev.of_node) { info = sh_pfc_quirk_match(); - if (IS_ERR(info)) - return PTR_ERR(info); - if (!info) info = of_device_get_match_data(&pdev->dev); } else diff --git a/drivers/pinctrl/renesas/sh_pfc.h b/drivers/pinctrl/renesas/sh_pfc.h index 320898861c4b..0fdafef2dc69 100644 --- a/drivers/pinctrl/renesas/sh_pfc.h +++ b/drivers/pinctrl/renesas/sh_pfc.h @@ -332,8 +332,8 @@ extern const struct sh_pfc_soc_info r8a7791_pinmux_info; extern const struct sh_pfc_soc_info r8a7792_pinmux_info; extern const struct sh_pfc_soc_info r8a7793_pinmux_info; extern const struct sh_pfc_soc_info r8a7794_pinmux_info; -extern const struct sh_pfc_soc_info r8a77950_pinmux_info __weak; -extern const struct sh_pfc_soc_info r8a77951_pinmux_info __weak; +extern const struct sh_pfc_soc_info r8a77950_pinmux_info; +extern const struct sh_pfc_soc_info r8a77951_pinmux_info; extern const struct sh_pfc_soc_info r8a77960_pinmux_info; extern const struct sh_pfc_soc_info r8a77961_pinmux_info; extern const struct sh_pfc_soc_info r8a77965_pinmux_info;
From: Chun-Jie Chen chun-jie.chen@mediatek.com
[ Upstream commit 7cc4e1bbe300c5cf610ece8eca6c6751b8bc74db ]
In fact, the en_mask is a combination of divider enable mask and pll enable bit(bit0). Before this patch, we enabled both divider mask and bit0 in prepare(), but only cleared the bit0 in unprepare(). In the future, we hope en_mask will only be used as divider enable mask. The enable register(CON0) will be set in 2 steps: first is divider mask, and then bit0 during prepare(), and vice versa. But considering backward compatibility, at this stage we allow en_mask to be a combination or a pure divider enable mask. And then we will make en_mask a pure divider enable mask in another following patch series.
Reviewed-by: Ikjoon Jang ikjn@chromium.org Signed-off-by: Weiyi Lu weiyi.lu@mediatek.com Signed-off-by: Chun-Jie Chen chun-jie.chen@mediatek.com Link: https://lore.kernel.org/r/20210726105719.15793-7-chun-jie.chen@mediatek.com Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/mediatek/clk-pll.c | 20 ++++++++++++++++---- 1 file changed, 16 insertions(+), 4 deletions(-)
diff --git a/drivers/clk/mediatek/clk-pll.c b/drivers/clk/mediatek/clk-pll.c index f440f2cd0b69..11ed5d1d1c36 100644 --- a/drivers/clk/mediatek/clk-pll.c +++ b/drivers/clk/mediatek/clk-pll.c @@ -238,6 +238,7 @@ static int mtk_pll_prepare(struct clk_hw *hw) { struct mtk_clk_pll *pll = to_mtk_clk_pll(hw); u32 r; + u32 div_en_mask;
r = readl(pll->pwr_addr) | CON0_PWR_ON; writel(r, pll->pwr_addr); @@ -247,10 +248,15 @@ static int mtk_pll_prepare(struct clk_hw *hw) writel(r, pll->pwr_addr); udelay(1);
- r = readl(pll->base_addr + REG_CON0); - r |= pll->data->en_mask; + r = readl(pll->base_addr + REG_CON0) | CON0_BASE_EN; writel(r, pll->base_addr + REG_CON0);
+ div_en_mask = pll->data->en_mask & ~CON0_BASE_EN; + if (div_en_mask) { + r = readl(pll->base_addr + REG_CON0) | div_en_mask; + writel(r, pll->base_addr + REG_CON0); + } + __mtk_pll_tuner_enable(pll);
udelay(20); @@ -268,6 +274,7 @@ static void mtk_pll_unprepare(struct clk_hw *hw) { struct mtk_clk_pll *pll = to_mtk_clk_pll(hw); u32 r; + u32 div_en_mask;
if (pll->data->flags & HAVE_RST_BAR) { r = readl(pll->base_addr + REG_CON0); @@ -277,8 +284,13 @@ static void mtk_pll_unprepare(struct clk_hw *hw)
__mtk_pll_tuner_disable(pll);
- r = readl(pll->base_addr + REG_CON0); - r &= ~CON0_BASE_EN; + div_en_mask = pll->data->en_mask & ~CON0_BASE_EN; + if (div_en_mask) { + r = readl(pll->base_addr + REG_CON0) & ~div_en_mask; + writel(r, pll->base_addr + REG_CON0); + } + + r = readl(pll->base_addr + REG_CON0) & ~CON0_BASE_EN; writel(r, pll->base_addr + REG_CON0);
r = readl(pll->pwr_addr) | CON0_ISO_EN;
From: Anirudh Rayabharam mail@anirudhrb.com
[ Upstream commit f7744fa16b96da57187dc8e5634152d3b63d72de ]
Free the unsent raw_report buffers when the device is removed.
Fixes a memory leak reported by syzbot at: https://syzkaller.appspot.com/bug?id=7b4fa7cb1a7c2d3342a2a8a6c53371c8c418ab4...
Reported-by: syzbot+47b26cd837ececfc666d@syzkaller.appspotmail.com Tested-by: syzbot+47b26cd837ececfc666d@syzkaller.appspotmail.com Signed-off-by: Anirudh Rayabharam mail@anirudhrb.com Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/usbhid/hid-core.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c index 06130dc431a0..0bf123bf2ef8 100644 --- a/drivers/hid/usbhid/hid-core.c +++ b/drivers/hid/usbhid/hid-core.c @@ -505,7 +505,7 @@ static void hid_ctrl(struct urb *urb)
if (unplug) { usbhid->ctrltail = usbhid->ctrlhead; - } else { + } else if (usbhid->ctrlhead != usbhid->ctrltail) { usbhid->ctrltail = (usbhid->ctrltail + 1) & (HID_CONTROL_FIFO_SIZE - 1);
if (usbhid->ctrlhead != usbhid->ctrltail && @@ -1223,9 +1223,20 @@ static void usbhid_stop(struct hid_device *hid) mutex_lock(&usbhid->mutex);
clear_bit(HID_STARTED, &usbhid->iofl); + spin_lock_irq(&usbhid->lock); /* Sync with error and led handlers */ set_bit(HID_DISCONNECTED, &usbhid->iofl); + while (usbhid->ctrltail != usbhid->ctrlhead) { + if (usbhid->ctrl[usbhid->ctrltail].dir == USB_DIR_OUT) { + kfree(usbhid->ctrl[usbhid->ctrltail].raw_report); + usbhid->ctrl[usbhid->ctrltail].raw_report = NULL; + } + + usbhid->ctrltail = (usbhid->ctrltail + 1) & + (HID_CONTROL_FIFO_SIZE - 1); + } spin_unlock_irq(&usbhid->lock); + usb_kill_urb(usbhid->urbin); usb_kill_urb(usbhid->urbout); usb_kill_urb(usbhid->urbctrl);
From: James Smart jsmart2021@gmail.com
[ Upstream commit 7740b615b6665e47f162e261d805f1bbbac15876 ]
The lpfc_sli4_nvmet_xri_aborted() routine takes out the abts_buf_list_lock and traverses the buffer contexts to match the xri. Upon match, it then takes the context lock before potentially removing the context from the associated buffer list. This violates the lock hierarchy used elsewhere in the driver of locking context, then the abts_buf_list_lock - thus a possible deadlock.
Resolve by: after matching, release the abts_buf_list_lock, then take the context lock, and if to be deleted from the list, retake the abts_buf_list_lock, maintaining lock hierarchy. This matches same list lock hierarchy as elsewhere in the driver
Link: https://lore.kernel.org/r/20210730163309.25809-1-jsmart2021@gmail.com Reported-by: Jia-Ju Bai baijiaju1990@gmail.com Signed-off-by: James Smart jsmart2021@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/lpfc/lpfc_nvmet.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/scsi/lpfc/lpfc_nvmet.c b/drivers/scsi/lpfc/lpfc_nvmet.c index f2d9a3580887..6e3dd0b9bcfa 100644 --- a/drivers/scsi/lpfc/lpfc_nvmet.c +++ b/drivers/scsi/lpfc/lpfc_nvmet.c @@ -1797,19 +1797,22 @@ lpfc_sli4_nvmet_xri_aborted(struct lpfc_hba *phba, if (ctxp->ctxbuf->sglq->sli4_xritag != xri) continue;
- spin_lock(&ctxp->ctxlock); + spin_unlock_irqrestore(&phba->sli4_hba.abts_nvmet_buf_list_lock, + iflag); + + spin_lock_irqsave(&ctxp->ctxlock, iflag); /* Check if we already received a free context call * and we have completed processing an abort situation. */ if (ctxp->flag & LPFC_NVME_CTX_RLS && !(ctxp->flag & LPFC_NVME_ABORT_OP)) { + spin_lock(&phba->sli4_hba.abts_nvmet_buf_list_lock); list_del_init(&ctxp->list); + spin_unlock(&phba->sli4_hba.abts_nvmet_buf_list_lock); released = true; } ctxp->flag &= ~LPFC_NVME_XBUSY; - spin_unlock(&ctxp->ctxlock); - spin_unlock_irqrestore(&phba->sli4_hba.abts_nvmet_buf_list_lock, - iflag); + spin_unlock_irqrestore(&ctxp->ctxlock, iflag);
rrq_empty = list_empty(&phba->active_rrq_list); ndlp = lpfc_findnode_did(phba->pport, ctxp->sid);
From: Nadav Amit namit@vmware.com
[ Upstream commit 3b122a5666cb7c0bb9a439fba0c9a6cf59f999c3 ]
On virtual machines, software must flush the IOTLB after each page table entry update.
The iommu_map_sg() code iterates through the given scatter-gather list and invokes iommu_map() for each element in the scatter-gather list, which calls into the vendor IOMMU driver through iommu_ops callback. As the result, a single sg mapping may lead to multiple IOTLB flushes.
Fix this by adding amd_iotlb_sync_map() callback and flushing at this point after all sg mappings we set.
This commit is followed and inspired by commit 933fcd01e97e2 ("iommu/vt-d: Add iotlb_sync_map callback").
Cc: Joerg Roedel joro@8bytes.org Cc: Will Deacon will@kernel.org Cc: Jiajun Cao caojiajun@vmware.com Cc: Robin Murphy robin.murphy@arm.com Cc: Lu Baolu baolu.lu@linux.intel.com Cc: iommu@lists.linux-foundation.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Nadav Amit namit@vmware.com Link: https://lore.kernel.org/r/20210723093209.714328-7-namit@vmware.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/amd/iommu.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index a7d6d78147b7..e0cd6e42d349 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -2028,6 +2028,16 @@ static int amd_iommu_attach_device(struct iommu_domain *dom, return ret; }
+static void amd_iommu_iotlb_sync_map(struct iommu_domain *dom, + unsigned long iova, size_t size) +{ + struct protection_domain *domain = to_pdomain(dom); + struct io_pgtable_ops *ops = &domain->iop.iop.ops; + + if (ops->map) + domain_flush_np_cache(domain, iova, size); +} + static int amd_iommu_map(struct iommu_domain *dom, unsigned long iova, phys_addr_t paddr, size_t page_size, int iommu_prot, gfp_t gfp) @@ -2046,10 +2056,8 @@ static int amd_iommu_map(struct iommu_domain *dom, unsigned long iova, if (iommu_prot & IOMMU_WRITE) prot |= IOMMU_PROT_IW;
- if (ops->map) { + if (ops->map) ret = ops->map(ops, iova, paddr, page_size, prot, gfp); - domain_flush_np_cache(domain, iova, page_size); - }
return ret; } @@ -2197,6 +2205,7 @@ const struct iommu_ops amd_iommu_ops = { .attach_dev = amd_iommu_attach_device, .detach_dev = amd_iommu_detach_device, .map = amd_iommu_map, + .iotlb_sync_map = amd_iommu_iotlb_sync_map, .unmap = amd_iommu_unmap, .iova_to_phys = amd_iommu_iova_to_phys, .probe_device = amd_iommu_probe_device,
From: Chao Yu chao@kernel.org
[ Upstream commit 2787991516468bfafafb9bf2b45a848e6b202e7c ]
[1] https://www.mail-archive.com/linux-f2fs-devel@lists.sourceforge.net/msg15126...
As [1] reported, if lower device doesn't support write barrier, in below case:
- write page #0; persist - overwrite page #0 - fsync - write data page #0 OPU into device's cache - write inode page into device's cache - issue flush
If SPO is triggered during flush command, inode page can be persisted before data page #0, so that after recovery, inode page can be recovered with new physical block address of data page #0, however there may contains dummy data in new physical block address.
Then what user will see is: after overwrite & fsync + SPO, old data in file was corrupted, if any user do care about such case, we can suggest user to use STRICT fsync mode, in this mode, we will force to use atomic write sematics to keep write order in between data/node and last node, so that it avoids potential data corruption during fsync().
Signed-off-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/f2fs/file.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 6afd4562335f..00b45876eaa1 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -301,6 +301,18 @@ static int f2fs_do_sync_file(struct file *file, loff_t start, loff_t end, f2fs_exist_written_data(sbi, ino, UPDATE_INO)) goto flush_out; goto out; + } else { + /* + * for OPU case, during fsync(), node can be persisted before + * data when lower device doesn't support write barrier, result + * in data corruption after SPO. + * So for strict fsync mode, force to use atomic write sematics + * to keep write order in between data/node and last node to + * avoid potential data corruption. + */ + if (F2FS_OPTION(sbi).fsync_mode == + FSYNC_MODE_STRICT && !atomic) + atomic = true; } go_write: /*
From: Laibin Qiu qiulaibin@huawei.com
[ Upstream commit dc675a97129c4d9d5af55a3d7f23d7e092b8e032 ]
F2FS have dirty page count control for batched sequential write in writepages, and get the value of min_seq_blocks by blocks_per_seg * segs_per_sec(segs_per_sec defaults to 1). But in some scenes we set a lager section size, Min_seq_blocks will become too large to achieve the expected effect(eg. 4thread sequential write, the number of merge requests will be reduced).
Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/f2fs/segment.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c index 15cc89eef28d..2e543c7c1bc3 100644 --- a/fs/f2fs/segment.c +++ b/fs/f2fs/segment.c @@ -5159,7 +5159,7 @@ int f2fs_build_segment_manager(struct f2fs_sb_info *sbi) sm_info->ipu_policy = 1 << F2FS_IPU_FSYNC; sm_info->min_ipu_util = DEF_MIN_IPU_UTIL; sm_info->min_fsync_blocks = DEF_MIN_FSYNC_BLOCKS; - sm_info->min_seq_blocks = sbi->blocks_per_seg * sbi->segs_per_sec; + sm_info->min_seq_blocks = sbi->blocks_per_seg; sm_info->min_hot_blocks = DEF_MIN_HOT_BLOCKS; sm_info->min_ssr_sections = reserved_sections(sbi);
From: Bart Van Assche bvanassche@acm.org
[ Upstream commit 35c7d874f5993db04ce3aa310ae088f14b801eda ]
Instead of documenting the locking requirements of the UIC code as comments, use lockdep_assert_held() such that lockdep verifies the lockdep requirements at runtime if lockdep is enabled.
Link: https://lore.kernel.org/r/20210722033439.26550-8-bvanassche@acm.org Cc: Adrian Hunter adrian.hunter@intel.com Cc: Stanley Chu stanley.chu@mediatek.com Cc: Can Guo cang@codeaurora.org Cc: Asutosh Das asutoshd@codeaurora.org Reviewed-by: Avri Altman avri.altman@wdc.com Reviewed-by: Bean Huo beanhuo@micron.com Reviewed-by: Daejun Park daejun7.park@samsung.com Signed-off-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/ufs/ufshcd.c | 16 +++++++++------- drivers/scsi/ufs/ufshcd.h | 2 +- 2 files changed, 10 insertions(+), 8 deletions(-)
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 708b3b62fc4d..3350b0cff9ef 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -2247,15 +2247,15 @@ static inline u8 ufshcd_get_upmcrs(struct ufs_hba *hba) }
/** - * ufshcd_dispatch_uic_cmd - Dispatch UIC commands to unipro layers + * ufshcd_dispatch_uic_cmd - Dispatch an UIC command to the Unipro layer * @hba: per adapter instance * @uic_cmd: UIC command - * - * Mutex must be held. */ static inline void ufshcd_dispatch_uic_cmd(struct ufs_hba *hba, struct uic_command *uic_cmd) { + lockdep_assert_held(&hba->uic_cmd_mutex); + WARN_ON(hba->active_uic_cmd);
hba->active_uic_cmd = uic_cmd; @@ -2273,11 +2273,10 @@ ufshcd_dispatch_uic_cmd(struct ufs_hba *hba, struct uic_command *uic_cmd) }
/** - * ufshcd_wait_for_uic_cmd - Wait complectioin of UIC command + * ufshcd_wait_for_uic_cmd - Wait for completion of an UIC command * @hba: per adapter instance * @uic_cmd: UIC command * - * Must be called with mutex held. * Returns 0 only if success. */ static int @@ -2286,6 +2285,8 @@ ufshcd_wait_for_uic_cmd(struct ufs_hba *hba, struct uic_command *uic_cmd) int ret; unsigned long flags;
+ lockdep_assert_held(&hba->uic_cmd_mutex); + if (wait_for_completion_timeout(&uic_cmd->done, msecs_to_jiffies(UIC_CMD_TIMEOUT))) { ret = uic_cmd->argument2 & MASK_UIC_COMMAND_RESULT; @@ -2315,14 +2316,15 @@ ufshcd_wait_for_uic_cmd(struct ufs_hba *hba, struct uic_command *uic_cmd) * @uic_cmd: UIC command * @completion: initialize the completion only if this is set to true * - * Identical to ufshcd_send_uic_cmd() expect mutex. Must be called - * with mutex held and host_lock locked. * Returns 0 only if success. */ static int __ufshcd_send_uic_cmd(struct ufs_hba *hba, struct uic_command *uic_cmd, bool completion) { + lockdep_assert_held(&hba->uic_cmd_mutex); + lockdep_assert_held(hba->host->host_lock); + if (!ufshcd_ready_for_uic_cmd(hba)) { dev_err(hba->dev, "Controller not ready to accept UIC commands\n"); diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h index 194755c9ddfe..2718f73aa755 100644 --- a/drivers/scsi/ufs/ufshcd.h +++ b/drivers/scsi/ufs/ufshcd.h @@ -683,7 +683,7 @@ struct ufs_hba_monitor { * @priv: pointer to variant specific private data * @irq: Irq number of the controller * @active_uic_cmd: handle of active UIC command - * @uic_cmd_mutex: mutex for uic command + * @uic_cmd_mutex: mutex for UIC command * @tmf_tag_set: TMF tag set. * @tmf_queue: Used to allocate TMF tags. * @pwr_done: completion for power mode change
From: Bart Van Assche bvanassche@acm.org
[ Upstream commit ac1bc2ba060f9609972fb486073ebd9eab1ef3b6 ]
Clearing a unit attention synchronously from inside the UFS error handler may trigger the following deadlock:
- ufshcd_err_handler() calls ufshcd_err_handling_unprepare() and the latter function calls ufshcd_clear_ua_wluns().
- ufshcd_clear_ua_wluns() submits a REQUEST SENSE command and that command activates the SCSI error handler.
- The SCSI error handler calls ufshcd_host_reset_and_restore().
- ufshcd_host_reset_and_restore() executes the following code: ufshcd_schedule_eh_work(hba); flush_work(&hba->eh_work);
This sequence results in a deadlock (circular wait). Fix this by requesting sense data asynchronously.
Link: https://lore.kernel.org/r/20210722033439.26550-16-bvanassche@acm.org Cc: Adrian Hunter adrian.hunter@intel.com Cc: Stanley Chu stanley.chu@mediatek.com Cc: Can Guo cang@codeaurora.org Cc: Asutosh Das asutoshd@codeaurora.org Cc: Avri Altman avri.altman@wdc.com Reviewed-by: Bean Huo beanhuo@micron.com Signed-off-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/ufs/ufshcd.c | 64 ++++++++++++++++++++------------------- 1 file changed, 33 insertions(+), 31 deletions(-)
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 3350b0cff9ef..52731dffa624 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -7911,8 +7911,39 @@ static int ufshcd_add_lus(struct ufs_hba *hba) return ret; }
+static void ufshcd_request_sense_done(struct request *rq, blk_status_t error) +{ + if (error != BLK_STS_OK) + pr_err("%s: REQUEST SENSE failed (%d)", __func__, error); + blk_put_request(rq); +} + static int -ufshcd_send_request_sense(struct ufs_hba *hba, struct scsi_device *sdp); +ufshcd_request_sense_async(struct ufs_hba *hba, struct scsi_device *sdev) +{ + /* + * From SPC-6: the REQUEST SENSE command with any allocation length + * clears the sense data. + */ + static const u8 cmd[6] = {REQUEST_SENSE, 0, 0, 0, 0, 0}; + struct scsi_request *rq; + struct request *req; + + req = blk_get_request(sdev->request_queue, REQ_OP_DRV_IN, /*flags=*/0); + if (IS_ERR(req)) + return PTR_ERR(req); + + rq = scsi_req(req); + rq->cmd_len = ARRAY_SIZE(cmd); + memcpy(rq->cmd, cmd, rq->cmd_len); + rq->retries = 3; + req->timeout = 1 * HZ; + req->rq_flags |= RQF_PM | RQF_QUIET; + + blk_execute_rq_nowait(/*bd_disk=*/NULL, req, /*at_head=*/true, + ufshcd_request_sense_done); + return 0; +}
static int ufshcd_clear_ua_wlun(struct ufs_hba *hba, u8 wlun) { @@ -7940,7 +7971,7 @@ static int ufshcd_clear_ua_wlun(struct ufs_hba *hba, u8 wlun) if (ret) goto out_err;
- ret = ufshcd_send_request_sense(hba, sdp); + ret = ufshcd_request_sense_async(hba, sdp); scsi_device_put(sdp); out_err: if (ret) @@ -8535,35 +8566,6 @@ static void ufshcd_hba_exit(struct ufs_hba *hba) } }
-static int -ufshcd_send_request_sense(struct ufs_hba *hba, struct scsi_device *sdp) -{ - unsigned char cmd[6] = {REQUEST_SENSE, - 0, - 0, - 0, - UFS_SENSE_SIZE, - 0}; - char *buffer; - int ret; - - buffer = kzalloc(UFS_SENSE_SIZE, GFP_KERNEL); - if (!buffer) { - ret = -ENOMEM; - goto out; - } - - ret = scsi_execute(sdp, cmd, DMA_FROM_DEVICE, buffer, - UFS_SENSE_SIZE, NULL, NULL, - msecs_to_jiffies(1000), 3, 0, RQF_PM, NULL); - if (ret) - pr_err("%s: failed with err %d\n", __func__, ret); - - kfree(buffer); -out: - return ret; -} - /** * ufshcd_set_dev_pwr_mode - sends START STOP UNIT command to set device * power mode
From: "Gautham R. Shenoy" ego@linux.vnet.ibm.com
[ Upstream commit 71737a6c2a8f801622d2b71567d1ec1e4c5b40b8 ]
Currently in fixup_cede0_latency() code, we perform the fixup the CEDE(0) exit latency value only if minimum advertized extended CEDE latency values are less than 10us. This was done so as to not break the expected behaviour on POWER8 platforms where the advertised latency was higher than the default 10us, which would delay the SMT folding on the core.
However, after the earlier patch "cpuidle/pseries: Fixup CEDE0 latency only for POWER10 onwards", we can be sure that the fixup of CEDE0 latency is going to happen only from POWER10 onwards. Hence unconditionally use the minimum exit latency provided by the platform.
Signed-off-by: Gautham R. Shenoy ego@linux.vnet.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/1626676399-15975-3-git-send-email-ego@linux.vnet.i... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/cpuidle/cpuidle-pseries.c | 59 ++++++++++++++++--------------- 1 file changed, 30 insertions(+), 29 deletions(-)
diff --git a/drivers/cpuidle/cpuidle-pseries.c b/drivers/cpuidle/cpuidle-pseries.c index a2b5c6f60cf0..18747e5287c8 100644 --- a/drivers/cpuidle/cpuidle-pseries.c +++ b/drivers/cpuidle/cpuidle-pseries.c @@ -346,11 +346,9 @@ static int pseries_cpuidle_driver_init(void) static void __init fixup_cede0_latency(void) { struct xcede_latency_payload *payload; - u64 min_latency_us; + u64 min_xcede_latency_us = UINT_MAX; int i;
- min_latency_us = dedicated_states[1].exit_latency; // CEDE latency - if (parse_cede_parameters()) return;
@@ -358,42 +356,45 @@ static void __init fixup_cede0_latency(void) nr_xcede_records);
payload = &xcede_latency_parameter.payload; + + /* + * The CEDE idle state maps to CEDE(0). While the hypervisor + * does not advertise CEDE(0) exit latency values, it does + * advertise the latency values of the extended CEDE states. + * We use the lowest advertised exit latency value as a proxy + * for the exit latency of CEDE(0). + */ for (i = 0; i < nr_xcede_records; i++) { struct xcede_latency_record *record = &payload->records[i]; + u8 hint = record->hint; u64 latency_tb = be64_to_cpu(record->latency_ticks); u64 latency_us = DIV_ROUND_UP_ULL(tb_to_ns(latency_tb), NSEC_PER_USEC);
- if (latency_us == 0) - pr_warn("cpuidle: xcede record %d has an unrealistic latency of 0us.\n", i); - - if (latency_us < min_latency_us) - min_latency_us = latency_us; - } - - /* - * By default, we assume that CEDE(0) has exit latency 10us, - * since there is no way for us to query from the platform. - * - * However, if the wakeup latency of an Extended CEDE state is - * smaller than 10us, then we can be sure that CEDE(0) - * requires no more than that. - * - * Perform the fix-up. - */ - if (min_latency_us < dedicated_states[1].exit_latency) { /* - * We set a minimum of 1us wakeup latency for cede0 to - * distinguish it from snooze + * We expect the exit latency of an extended CEDE + * state to be non-zero, it to since it takes at least + * a few nanoseconds to wakeup the idle CPU and + * dispatch the virtual processor into the Linux + * Guest. + * + * So we consider only non-zero value for performing + * the fixup of CEDE(0) latency. */ - u64 cede0_latency = 1; + if (latency_us == 0) { + pr_warn("cpuidle: Skipping xcede record %d [hint=%d]. Exit latency = 0us\n", + i, hint); + continue; + }
- if (min_latency_us > cede0_latency) - cede0_latency = min_latency_us - 1; + if (latency_us < min_xcede_latency_us) + min_xcede_latency_us = latency_us; + }
- dedicated_states[1].exit_latency = cede0_latency; - dedicated_states[1].target_residency = 10 * (cede0_latency); + if (min_xcede_latency_us != UINT_MAX) { + dedicated_states[1].exit_latency = min_xcede_latency_us; + dedicated_states[1].target_residency = 10 * (min_xcede_latency_us); pr_info("cpuidle: Fixed up CEDE exit latency to %llu us\n", - cede0_latency); + min_xcede_latency_us); }
}
From: Masahiro Yamada masahiroy@kernel.org
[ Upstream commit 9bef456b20581e630ef9a13555ca04fed65a859d ]
The install target should not depend on any build artifact.
The reason is explained in commit 19514fc665ff ("arm, kbuild: make "make install" not depend on vmlinux").
Change the PowerPC installation code in a similar way.
Signed-off-by: Masahiro Yamada masahiroy@kernel.org Reviewed-by: Nick Desaulniers ndesaulniers@google.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20210729141937.445051-2-masahiroy@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/boot/Makefile | 2 +- arch/powerpc/boot/install.sh | 14 ++++++++++++++ 2 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile index e312ea802aa6..b90e53e413c8 100644 --- a/arch/powerpc/boot/Makefile +++ b/arch/powerpc/boot/Makefile @@ -445,7 +445,7 @@ $(obj)/zImage.initrd: $(addprefix $(obj)/, $(initrd-y)) $(Q)rm -f $@; ln $< $@
# Only install the vmlinux -install: $(CONFIGURE) $(addprefix $(obj)/, $(image-y)) +install: sh -x $(srctree)/$(src)/install.sh "$(KERNELRELEASE)" vmlinux System.map "$(INSTALL_PATH)"
# Install the vmlinux and other built boot targets. diff --git a/arch/powerpc/boot/install.sh b/arch/powerpc/boot/install.sh index b6a256bc96ee..8d669cf1ccda 100644 --- a/arch/powerpc/boot/install.sh +++ b/arch/powerpc/boot/install.sh @@ -21,6 +21,20 @@ # Bail with error code if anything goes wrong set -e
+verify () { + if [ ! -f "$1" ]; then + echo "" 1>&2 + echo " *** Missing file: $1" 1>&2 + echo ' *** You need to run "make" before "make install".' 1>&2 + echo "" 1>&2 + exit 1 + fi +} + +# Make sure the files actually exist +verify "$2" +verify "$3" + # User may have a custom install script
if [ -x ~/bin/${INSTALLKERNEL} ]; then exec ~/bin/${INSTALLKERNEL} "$@"; fi
From: Chao Yu chao@kernel.org
[ Upstream commit 91803392c732c43b5cf440e885ea89be7f5fecef ]
During f2fs_write_checkpoint(), once we failed in f2fs_flush_nat_entries() or do_checkpoint(), metadata of filesystem such as prefree bitmap, nat/sit version bitmap won't be recovered, it may cause f2fs image to be inconsistent, let's just set CP error flag to avoid further updates until we figure out a scheme to rollback all metadatas in such condition.
Reported-by: Yangtao Li frank.li@vivo.com Signed-off-by: Yangtao Li frank.li@vivo.com Signed-off-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/f2fs/checkpoint.c | 12 +++++++++--- fs/f2fs/f2fs.h | 2 +- fs/f2fs/segment.c | 15 +++++++++++++-- 3 files changed, 23 insertions(+), 6 deletions(-)
diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c index 6c208108d69c..7f6745f4630e 100644 --- a/fs/f2fs/checkpoint.c +++ b/fs/f2fs/checkpoint.c @@ -1639,8 +1639,11 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
/* write cached NAT/SIT entries to NAT/SIT area */ err = f2fs_flush_nat_entries(sbi, cpc); - if (err) + if (err) { + f2fs_err(sbi, "f2fs_flush_nat_entries failed err:%d, stop checkpoint", err); + f2fs_bug_on(sbi, !f2fs_cp_error(sbi)); goto stop; + }
f2fs_flush_sit_entries(sbi, cpc);
@@ -1648,10 +1651,13 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc) f2fs_save_inmem_curseg(sbi);
err = do_checkpoint(sbi, cpc); - if (err) + if (err) { + f2fs_err(sbi, "do_checkpoint failed err:%d, stop checkpoint", err); + f2fs_bug_on(sbi, !f2fs_cp_error(sbi)); f2fs_release_discard_addrs(sbi); - else + } else { f2fs_clear_prefree_segments(sbi, cpc); + }
f2fs_restore_inmem_curseg(sbi); stop: diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index ee8eb33e2c25..6cdcd62ea80c 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -542,7 +542,7 @@ enum { */ };
-#define DEFAULT_RETRY_IO_COUNT 8 /* maximum retry read IO count */ +#define DEFAULT_RETRY_IO_COUNT 8 /* maximum retry read IO or flush count */
/* congestion wait timeout value, default: 20ms */ #define DEFAULT_IO_TIMEOUT (msecs_to_jiffies(20)) diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c index 2e543c7c1bc3..93d4a49eed69 100644 --- a/fs/f2fs/segment.c +++ b/fs/f2fs/segment.c @@ -776,11 +776,22 @@ int f2fs_flush_device_cache(struct f2fs_sb_info *sbi) return 0;
for (i = 1; i < sbi->s_ndevs; i++) { + int count = DEFAULT_RETRY_IO_COUNT; + if (!f2fs_test_bit(i, (char *)&sbi->dirty_device)) continue; - ret = __submit_flush_wait(sbi, FDEV(i).bdev); - if (ret) + + do { + ret = __submit_flush_wait(sbi, FDEV(i).bdev); + if (ret) + congestion_wait(BLK_RW_ASYNC, + DEFAULT_IO_TIMEOUT); + } while (ret && --count); + + if (ret) { + f2fs_stop_checkpoint(sbi, false); break; + }
spin_lock(&sbi->dev_lock); f2fs_clear_bit(i, (char *)&sbi->dirty_device);
From: Randy Dunlap rdunlap@infradead.org
[ Upstream commit ad548993a66c498267695edd8b19a682be0e3a8b ]
LOONGSON_UART_BASE depends on EARLY_PRINTK || SERIAL_8250, but when neither of these Kconfig symbols is set, the kernel build has errors:
../arch/mips/loongson2ef/common/serial.c: In function 'serial_init': ../arch/mips/loongson2ef/common/serial.c:66:25: error: 'loongson_uart_base' undeclared (first use in this function) 66 | loongson_uart_base; ../arch/mips/loongson2ef/common/serial.c:66:25: note: each undeclared identifier is reported only once for each function it appears in ../arch/mips/loongson2ef/common/serial.c:68:41: error: '_loongson_uart_base' undeclared (first use in this function) 68 | (void __iomem *)_loongson_uart_base;
Fix this by building serial.o only when one (or both) of these Kconfig symbols is enabled.
Tested with: (a) EARLY_PRINTK=y, SERIAL_8250 not set; (b) EARLY_PRINTK=y, SERIAL_8250=y; (c) EARLY_PRINTK=y, SERIAL_8250=m. (d) EARLY_PRINTK not set, SERIAL_8250=y; (e) EARLY_PRINTK not set, SERIAL_8250=m; (f) EARLY_PRINTK not set, SERIAL_8250 not set.
Signed-off-by: Randy Dunlap rdunlap@infradead.org Reported-by: kernel test robot lkp@intel.com Cc: Jiaxun Yang jiaxun.yang@flygoat.com Cc: Thomas Bogendoerfer tsbogend@alpha.franken.de Cc: linux-mips@vger.kernel.org Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/mips/loongson2ef/common/Makefile | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/mips/loongson2ef/common/Makefile b/arch/mips/loongson2ef/common/Makefile index d5ab3e543ea3..30ea8b5ca685 100644 --- a/arch/mips/loongson2ef/common/Makefile +++ b/arch/mips/loongson2ef/common/Makefile @@ -4,12 +4,14 @@ #
obj-y += setup.o init.o env.o time.o reset.o irq.o \ - bonito-irq.o mem.o machtype.o platform.o serial.o + bonito-irq.o mem.o machtype.o platform.o obj-$(CONFIG_PCI) += pci.o
# # Serial port support # +obj-$(CONFIG_LOONGSON_UART_BASE) += serial.o +obj-$(CONFIG_EARLY_PRINTK) += serial.o obj-$(CONFIG_LOONGSON_UART_BASE) += uart_base.o obj-$(CONFIG_LOONGSON_MC146818) += rtc.o
From: Rui Wang wangrui@loongson.cn
[ Upstream commit cb95ea79b3fc772c5873a7a4532ab4c14a455da2 ]
This looks like a typo and that caused atomic64 test failed.
Signed-off-by: Rui Wang wangrui@loongson.cn Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/mips/include/asm/atomic.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h index 95e1f7f3597f..a0b9e7c1e4fc 100644 --- a/arch/mips/include/asm/atomic.h +++ b/arch/mips/include/asm/atomic.h @@ -206,7 +206,7 @@ ATOMIC_OPS(atomic64, xor, s64, ^=, xor, lld, scd) * The function returns the old value of @v minus @i. */ #define ATOMIC_SIP_OP(pfx, type, op, ll, sc) \ -static __inline__ int arch_##pfx##_sub_if_positive(type i, pfx##_t * v) \ +static __inline__ type arch_##pfx##_sub_if_positive(type i, pfx##_t * v) \ { \ type temp, result; \ \
From: Chuck Lever chuck.lever@oracle.com
[ Upstream commit 1143129e4d0d27740ce680d2fb0161ad4f27aa7e ]
ib_post_send() does not disconnect the QP when it returns an immediate error. Thus, the code that posts LocalInv has to explicitly disconnect after an immediate error. This is just like the frwr_send() callers handle it.
If a disconnect isn't done here, the transport deadlocks.
Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/sunrpc/xprtrdma/frwr_ops.c | 8 ++++++++ net/sunrpc/xprtrdma/verbs.c | 2 +- net/sunrpc/xprtrdma/xprt_rdma.h | 1 + 3 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c index 229fcc9a9064..754c5dffe127 100644 --- a/net/sunrpc/xprtrdma/frwr_ops.c +++ b/net/sunrpc/xprtrdma/frwr_ops.c @@ -557,6 +557,10 @@ void frwr_unmap_sync(struct rpcrdma_xprt *r_xprt, struct rpcrdma_req *req)
/* On error, the MRs get destroyed once the QP has drained. */ trace_xprtrdma_post_linv_err(req, rc); + + /* Force a connection loss to ensure complete recovery. + */ + rpcrdma_force_disconnect(ep); }
/** @@ -653,4 +657,8 @@ void frwr_unmap_async(struct rpcrdma_xprt *r_xprt, struct rpcrdma_req *req) * retransmission. */ rpcrdma_unpin_rqst(req->rl_reply); + + /* Force a connection loss to ensure complete recovery. + */ + rpcrdma_force_disconnect(ep); } diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c index 649c23518ec0..c1797ea19418 100644 --- a/net/sunrpc/xprtrdma/verbs.c +++ b/net/sunrpc/xprtrdma/verbs.c @@ -124,7 +124,7 @@ static void rpcrdma_xprt_drain(struct rpcrdma_xprt *r_xprt) * connection is closed or lost. (The important thing is it needs * to be invoked "at least" once). */ -static void rpcrdma_force_disconnect(struct rpcrdma_ep *ep) +void rpcrdma_force_disconnect(struct rpcrdma_ep *ep) { if (atomic_add_unless(&ep->re_force_disconnect, 1, 1)) xprt_force_disconnect(ep->re_xprt); diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h index 5d231d94e944..927e20a2c04e 100644 --- a/net/sunrpc/xprtrdma/xprt_rdma.h +++ b/net/sunrpc/xprtrdma/xprt_rdma.h @@ -454,6 +454,7 @@ extern unsigned int xprt_rdma_memreg_strategy; /* * Endpoint calls - xprtrdma/verbs.c */ +void rpcrdma_force_disconnect(struct rpcrdma_ep *ep); void rpcrdma_flush_disconnect(struct rpcrdma_xprt *r_xprt, struct ib_wc *wc); int rpcrdma_xprt_connect(struct rpcrdma_xprt *r_xprt); void rpcrdma_xprt_disconnect(struct rpcrdma_xprt *r_xprt);
From: Jordan Niethe jniethe5@gmail.com
[ Upstream commit 27fd1111051dc218e5b6cb2da5dbb3f342879ff1 ]
This is the same as commit acdad8fb4a15 ("powerpc: Force inlining of mmu_has_feature to fix build failure") but for radix_enabled(). The config in the linked bugzilla causes the following build failure:
LD .tmp_vmlinux.kallsyms1 powerpc64-linux-ld: arch/powerpc/mm/pgtable.o: in function `.__ptep_set_access_flags': pgtable.c:(.text+0x17c): undefined reference to `.radix__ptep_set_access_flags' powerpc64-linux-ld: arch/powerpc/mm/pageattr.o: in function `.change_page_attr': pageattr.c:(.text+0xc0): undefined reference to `.radix__flush_tlb_kernel_range' etc.
This is due to radix_enabled() not being inlined. See extract from building with -Winline:
In file included from arch/powerpc/include/asm/lppaca.h:46, from arch/powerpc/include/asm/paca.h:17, from arch/powerpc/include/asm/current.h:13, from include/linux/thread_info.h:23, from include/asm-generic/preempt.h:5, from ./arch/powerpc/include/generated/asm/preempt.h:1, from include/linux/preempt.h:78, from include/linux/spinlock.h:51, from include/linux/mmzone.h:8, from include/linux/gfp.h:6, from arch/powerpc/mm/pgtable.c:21: arch/powerpc/include/asm/book3s/64/pgtable.h: In function '__ptep_set_access_flags': arch/powerpc/include/asm/mmu.h:327:20: error: inlining failed in call to 'radix_enabled': call is unlikely and code size would grow [-Werror=inline]
The code relies on constant folding of MMU_FTRS_POSSIBLE at buildtime and elimination of non possible parts of code at compile time. For this to work radix_enabled() must be inlined so make it __always_inline.
Reported-by: Erhard F. erhard_f@mailbox.org Suggested-by: Michael Ellerman mpe@ellerman.id.au Tested-by: Randy Dunlap rdunlap@infradead.org Signed-off-by: Jordan Niethe jniethe5@gmail.com [mpe: Trimmed error messages in change log] Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://bugzilla.kernel.org/show_bug.cgi?id=213803 Link: https://lore.kernel.org/r/20210804013724.514468-1-jniethe5@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/include/asm/mmu.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h index 27016b98ecb2..8abe8e42e045 100644 --- a/arch/powerpc/include/asm/mmu.h +++ b/arch/powerpc/include/asm/mmu.h @@ -324,7 +324,7 @@ static inline void assert_pte_locked(struct mm_struct *mm, unsigned long addr) } #endif /* !CONFIG_DEBUG_VM */
-static inline bool radix_enabled(void) +static __always_inline bool radix_enabled(void) { return mmu_has_feature(MMU_FTR_TYPE_RADIX); }
From: Cédric Le Goater clg@kaod.org
[ Upstream commit 1753081f2d445f9157550692fcc4221cd3ff0958 ]
PCI MSIs now live in an MSI domain but the underlying calls, which will EOI the interrupt in real mode, need an HW IRQ number mapped in the XICS IRQ domain. Grab it there.
Signed-off-by: Cédric Le Goater clg@kaod.org Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20210701132750.1475580-31-clg@kaod.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/kvm/book3s_hv.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 085fb8ecbf68..1ca0a4f760bc 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -5328,6 +5328,7 @@ static int kvmppc_set_passthru_irq(struct kvm *kvm, int host_irq, int guest_gsi) struct kvmppc_passthru_irqmap *pimap; struct irq_chip *chip; int i, rc = 0; + struct irq_data *host_data;
if (!kvm_irq_bypass) return 1; @@ -5392,7 +5393,14 @@ static int kvmppc_set_passthru_irq(struct kvm *kvm, int host_irq, int guest_gsi) * the KVM real mode handler. */ smp_wmb(); - irq_map->r_hwirq = desc->irq_data.hwirq; + + /* + * The 'host_irq' number is mapped in the PCI-MSI domain but + * the underlying calls, which will EOI the interrupt in real + * mode, need an HW IRQ number mapped in the XICS IRQ domain. + */ + host_data = irq_domain_get_irq_data(irq_get_default_host(), host_irq); + irq_map->r_hwirq = (unsigned int)irqd_to_hwirq(host_data);
if (i == pimap->n_mapped) pimap->n_mapped++; @@ -5400,7 +5408,7 @@ static int kvmppc_set_passthru_irq(struct kvm *kvm, int host_irq, int guest_gsi) if (xics_on_xive()) rc = kvmppc_xive_set_mapped(kvm, guest_gsi, desc); else - kvmppc_xics_set_mapped(kvm, guest_gsi, desc->irq_data.hwirq); + kvmppc_xics_set_mapped(kvm, guest_gsi, irq_map->r_hwirq); if (rc) irq_map->r_hwirq = 0;
On 9/10/21 2:14 AM, Sasha Levin wrote:
From: Cédric Le Goater clg@kaod.org
[ Upstream commit 1753081f2d445f9157550692fcc4221cd3ff0958 ]
PCI MSIs now live in an MSI domain but the underlying calls, which will EOI the interrupt in real mode, need an HW IRQ number mapped in the XICS IRQ domain. Grab it there.
Signed-off-by: Cédric Le Goater clg@kaod.org Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20210701132750.1475580-31-clg@kaod.org Signed-off-by: Sasha Levin sashal@kernel.org
Why are we backporting this patch in stable trees ?
It should be fine but to compile, we need a partial backport of commit 51be9e51a800 ("KVM: PPC: Book3S HV: XIVE: Fix mapping of passthrough interrupts") which exports irq_get_default_host().
Thanks,
C.
arch/powerpc/kvm/book3s_hv.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 085fb8ecbf68..1ca0a4f760bc 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -5328,6 +5328,7 @@ static int kvmppc_set_passthru_irq(struct kvm *kvm, int host_irq, int guest_gsi) struct kvmppc_passthru_irqmap *pimap; struct irq_chip *chip; int i, rc = 0;
- struct irq_data *host_data;
if (!kvm_irq_bypass) return 1; @@ -5392,7 +5393,14 @@ static int kvmppc_set_passthru_irq(struct kvm *kvm, int host_irq, int guest_gsi) * the KVM real mode handler. */ smp_wmb();
- irq_map->r_hwirq = desc->irq_data.hwirq;
- /*
* The 'host_irq' number is mapped in the PCI-MSI domain but
* the underlying calls, which will EOI the interrupt in real
* mode, need an HW IRQ number mapped in the XICS IRQ domain.
*/
- host_data = irq_domain_get_irq_data(irq_get_default_host(), host_irq);
- irq_map->r_hwirq = (unsigned int)irqd_to_hwirq(host_data);
if (i == pimap->n_mapped) pimap->n_mapped++; @@ -5400,7 +5408,7 @@ static int kvmppc_set_passthru_irq(struct kvm *kvm, int host_irq, int guest_gsi) if (xics_on_xive()) rc = kvmppc_xive_set_mapped(kvm, guest_gsi, desc); else
kvmppc_xics_set_mapped(kvm, guest_gsi, desc->irq_data.hwirq);
if (rc) irq_map->r_hwirq = 0;kvmppc_xics_set_mapped(kvm, guest_gsi, irq_map->r_hwirq);
On Fri, Sep 10, 2021 at 07:48:18AM +0200, Cédric Le Goater wrote:
On 9/10/21 2:14 AM, Sasha Levin wrote:
From: Cédric Le Goater clg@kaod.org
[ Upstream commit 1753081f2d445f9157550692fcc4221cd3ff0958 ]
PCI MSIs now live in an MSI domain but the underlying calls, which will EOI the interrupt in real mode, need an HW IRQ number mapped in the XICS IRQ domain. Grab it there.
Signed-off-by: Cédric Le Goater clg@kaod.org Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20210701132750.1475580-31-clg@kaod.org Signed-off-by: Sasha Levin sashal@kernel.org
Why are we backporting this patch in stable trees ?
It should be fine but to compile, we need a partial backport of commit 51be9e51a800 ("KVM: PPC: Book3S HV: XIVE: Fix mapping of passthrough interrupts") which exports irq_get_default_host().
Or, I can drop it if it makes no sense?
On 9/11/21 4:35 PM, Sasha Levin wrote:
On Fri, Sep 10, 2021 at 07:48:18AM +0200, Cédric Le Goater wrote:
On 9/10/21 2:14 AM, Sasha Levin wrote:
From: Cédric Le Goater clg@kaod.org
[ Upstream commit 1753081f2d445f9157550692fcc4221cd3ff0958 ]
PCI MSIs now live in an MSI domain but the underlying calls, which will EOI the interrupt in real mode, need an HW IRQ number mapped in the XICS IRQ domain. Grab it there.
Signed-off-by: Cédric Le Goater clg@kaod.org Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20210701132750.1475580-31-clg@kaod.org Signed-off-by: Sasha Levin sashal@kernel.org
Why are we backporting this patch in stable trees ?
It should be fine but to compile, we need a partial backport of commit 51be9e51a800 ("KVM: PPC: Book3S HV: XIVE: Fix mapping of passthrough interrupts") which exports irq_get_default_host().
Or, I can drop it if it makes no sense?
Yes I would.
It makes sense only with the full patchset, the one reworking PCI MSI support in the PPC pSeries and PowerNV platforms.
Thanks,
C.
From: Theodore Ts'o tytso@mit.edu
[ Upstream commit a20d1cebb98bba75f2e34fddc768dd8712c1bded ]
This commit applies the e2fsck/recovery.c portions of commit 1e0c8ca7c08a ("e2fsck: fix portability problems caused by unaligned accesses) from the e2fsprogs git tree.
The on-disk format for the ext4 journal can have unaigned 32-bit integers. This can happen when replaying a journal using a obsolete checksum format (which was never popularly used, since the v3 format replaced v2 while the metadata checksum feature was being stablized).
Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jbd2/recovery.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c index d47a0d96bf30..4c4209262437 100644 --- a/fs/jbd2/recovery.c +++ b/fs/jbd2/recovery.c @@ -196,7 +196,7 @@ static int jbd2_descriptor_block_csum_verify(journal_t *j, void *buf) static int count_tags(journal_t *journal, struct buffer_head *bh) { char * tagp; - journal_block_tag_t * tag; + journal_block_tag_t tag; int nr = 0, size = journal->j_blocksize; int tag_bytes = journal_tag_bytes(journal);
@@ -206,14 +206,14 @@ static int count_tags(journal_t *journal, struct buffer_head *bh) tagp = &bh->b_data[sizeof(journal_header_t)];
while ((tagp - bh->b_data + tag_bytes) <= size) { - tag = (journal_block_tag_t *) tagp; + memcpy(&tag, tagp, sizeof(tag));
nr++; tagp += tag_bytes; - if (!(tag->t_flags & cpu_to_be16(JBD2_FLAG_SAME_UUID))) + if (!(tag.t_flags & cpu_to_be16(JBD2_FLAG_SAME_UUID))) tagp += 16;
- if (tag->t_flags & cpu_to_be16(JBD2_FLAG_LAST_TAG)) + if (tag.t_flags & cpu_to_be16(JBD2_FLAG_LAST_TAG)) break; }
@@ -433,9 +433,9 @@ static int jbd2_commit_block_csum_verify(journal_t *j, void *buf) }
static int jbd2_block_tag_csum_verify(journal_t *j, journal_block_tag_t *tag, + journal_block_tag3_t *tag3, void *buf, __u32 sequence) { - journal_block_tag3_t *tag3 = (journal_block_tag3_t *)tag; __u32 csum32; __be32 seq;
@@ -496,7 +496,7 @@ static int do_one_pass(journal_t *journal, while (1) { int flags; char * tagp; - journal_block_tag_t * tag; + journal_block_tag_t tag; struct buffer_head * obh; struct buffer_head * nbh;
@@ -613,8 +613,8 @@ static int do_one_pass(journal_t *journal, <= journal->j_blocksize - descr_csum_size) { unsigned long io_block;
- tag = (journal_block_tag_t *) tagp; - flags = be16_to_cpu(tag->t_flags); + memcpy(&tag, tagp, sizeof(tag)); + flags = be16_to_cpu(tag.t_flags);
io_block = next_log_block++; wrap(journal, next_log_block); @@ -632,7 +632,7 @@ static int do_one_pass(journal_t *journal,
J_ASSERT(obh != NULL); blocknr = read_tag_block(journal, - tag); + &tag);
/* If the block has been * revoked, then we're all done @@ -647,8 +647,8 @@ static int do_one_pass(journal_t *journal,
/* Look for block corruption */ if (!jbd2_block_tag_csum_verify( - journal, tag, obh->b_data, - be32_to_cpu(tmp->h_sequence))) { + journal, &tag, (journal_block_tag3_t *)tagp, + obh->b_data, be32_to_cpu(tmp->h_sequence))) { brelse(obh); success = -EFSBADCRC; printk(KERN_ERR "JBD2: Invalid "
From: Theodore Ts'o tytso@mit.edu
[ Upstream commit 390add0cc9f4d7fda89cf3db7651717e82cf0afc ]
Remove unused variable store which was never used.
This fix is also in e2fsprogs commit 99a2294f85f0 ("e2fsck: value stored to err is never read").
Signed-off-by: Lukas Czerner lczerner@redhat.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jbd2/recovery.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c index 4c4209262437..ba979fcf1cd3 100644 --- a/fs/jbd2/recovery.c +++ b/fs/jbd2/recovery.c @@ -760,7 +760,6 @@ static int do_one_pass(journal_t *journal, */ jbd_debug(1, "JBD2: Invalid checksum ignored in transaction %u, likely stale data\n", next_commit_ID); - err = 0; brelse(bh); goto done; }
From: Ashish Mhetre amhetre@nvidia.com
[ Upstream commit 211ff31b3d33b56aa12937e898c9280d07daf0d9 ]
When two devices with same SID are getting probed concurrently through iommu_probe_device(), the iommu_domain sometimes is getting allocated more than once as call to iommu_alloc_default_domain() is not protected for concurrency. Furthermore, it leads to each device holding a different iommu_domain pointer, separate IOVA space and only one of the devices' domain is used for translations from IOMMU. This causes accesses from other device to fault or see incorrect translations. Fix this by protecting iommu_alloc_default_domain() call with group->mutex and let all devices with same SID share same iommu_domain.
Signed-off-by: Ashish Mhetre amhetre@nvidia.com Link: https://lore.kernel.org/r/1628570641-9127-2-git-send-email-amhetre@nvidia.co... Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/iommu.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 63f0af10c403..3050cc4b02bc 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -273,7 +273,9 @@ int iommu_probe_device(struct device *dev) * support default domains, so the return value is not yet * checked. */ + mutex_lock(&group->mutex); iommu_alloc_default_domain(group, dev); + mutex_unlock(&group->mutex);
if (group->default_domain) { ret = __iommu_attach_device(group->default_domain, dev);
From: Krishna Reddy vdumpa@nvidia.com
[ Upstream commit b1a1347912a742a4e1fcdc9df6302dd9dd2c3405 ]
When two devices with same SID are getting probed concurrently through iommu_probe_device(), the iommu_group sometimes is getting allocated more than once as call to arm_smmu_device_group() is not protected for concurrency. Furthermore, it leads to each device holding a different iommu_group and domain pointer, separate IOVA space and only one of the devices' domain is used for translations from IOMMU. This causes accesses from other device to fault or see incorrect translations. Fix this by protecting iommu_group allocation from concurrency in arm_smmu_device_group().
Signed-off-by: Krishna Reddy vdumpa@nvidia.com Signed-off-by: Ashish Mhetre amhetre@nvidia.com Link: https://lore.kernel.org/r/1628570641-9127-3-git-send-email-amhetre@nvidia.co... Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/arm/arm-smmu/arm-smmu.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c index f22dbeb1e510..3b4743d830b8 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c @@ -1478,6 +1478,7 @@ static struct iommu_group *arm_smmu_device_group(struct device *dev) struct iommu_group *group = NULL; int i, idx;
+ mutex_lock(&smmu->stream_map_mutex); for_each_cfg_sme(cfg, fwspec, i, idx) { if (group && smmu->s2crs[idx].group && group != smmu->s2crs[idx].group) @@ -1486,8 +1487,10 @@ static struct iommu_group *arm_smmu_device_group(struct device *dev) group = smmu->s2crs[idx].group; }
- if (group) + if (group) { + mutex_unlock(&smmu->stream_map_mutex); return iommu_group_ref_get(group); + }
if (dev_is_pci(dev)) group = pci_device_group(dev); @@ -1501,6 +1504,7 @@ static struct iommu_group *arm_smmu_device_group(struct device *dev) for_each_cfg_sme(cfg, fwspec, i, idx) smmu->s2crs[idx].group = group;
+ mutex_unlock(&smmu->stream_map_mutex); return group; }
From: Quinn Tran qutran@marvell.com
[ Upstream commit 01c97f2dd8fb4d2188c779a975031c0fe1ec061d ]
Over time, fcport->port_type became a flag field. The flags within this field were not defined properly. This caused external tools to read wrong info.
Link: https://lore.kernel.org/r/20210810043720.1137-8-njavali@marvell.com Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com Signed-off-by: Quinn Tran qutran@marvell.com Signed-off-by: Nilesh Javali njavali@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_def.h | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h index 2f67ec1df3e6..2da1d4eccbaa 100644 --- a/drivers/scsi/qla2xxx/qla_def.h +++ b/drivers/scsi/qla2xxx/qla_def.h @@ -2377,11 +2377,9 @@ struct mbx_24xx_entry { */ typedef enum { FCT_UNKNOWN, - FCT_RSCN, - FCT_SWITCH, - FCT_BROADCAST, - FCT_INITIATOR, - FCT_TARGET, + FCT_BROADCAST = 0x01, + FCT_INITIATOR = 0x02, + FCT_TARGET = 0x04, FCT_NVME_INITIATOR = 0x10, FCT_NVME_TARGET = 0x20, FCT_NVME_DISCOVERY = 0x40,
From: Quinn Tran qutran@marvell.com
[ Upstream commit 0c9a5f3e42f75dbad5966aa59db935b694ab00d1 ]
On NPIV delete, the VPort is taken off a linked list in an unsafe manner. The check for VPort refcount should be done behind lock before taking off the element.
[ 2733.016907] general protection fault: 0000 [#1] SMP NOPTI [ 2733.016908] qla2xxx [0000:22:00.1]-7088:27: VP[4] deleted. [ 2733.016912] CPU: 22 PID: 23481 Comm: qla2xxx_15_dpc Kdump: loaded Tainted: G OE KX 5.3.18-47-default #1 SLE15-SP3 [ 2733.016914] Hardware name: Dell Inc. PowerEdge R7525/0PYVT1, BIOS 2.1.4 02/17/2021 [ 2733.016929] RIP: 0010:qla2x00_abort_isp+0x90/0x850 [qla2xxx] [ 2733.016933] RSP: 0018:ffffb9cfc91efe98 EFLAGS: 00010087 [ 2733.016935] RAX: 0000000000000292 RBX: dead000000000100 RCX: 0000000000000000 [ 2733.016936] RDX: 0000000000000001 RSI: ffff944bfeb99558 RDI: ffff944bfc4b4488 [ 2733.016937] RBP: ffff944bfc4b2868 R08: 00000000000187a2 R09: 0000000000000001 [ 2733.016937] R10: ffffb9cfc91efcc8 R11: 0000000000000001 R12: ffff944bfc4b4000 [ 2733.016938] R13: ffff944bfc4b4870 R14: ffff944bfc4b4488 R15: ffff944bda895c80 [ 2733.016939] FS: 0000000000000000(0000) GS:ffff944bfeb80000(0000) knlGS:0000000000000000 [ 2733.016940] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2733.016940] CR2: 00007fc173e74458 CR3: 0000001ff57de000 CR4: 0000000000350ee0 [ 2733.016941] Call Trace: [ 2733.016951] qla2xxx_pci_error_detected+0x190/0x190 [qla2xxx] [ 2733.016957] qla2x00_do_dpc+0x560/0xa10 [qla2xxx] [ 2733.016962] kthread+0x10d/0x130 [ 2733.016963] kthread_park+0xa0/0xa0 [ 2733.016966] ret_from_fork+0x22/0x40
Link: https://lore.kernel.org/r/20210810043720.1137-9-njavali@marvell.com Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com Signed-off-by: Quinn Tran qutran@marvell.com Signed-off-by: Nilesh Javali njavali@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_init.c | 33 +++++++++++++++++--------- drivers/scsi/qla2xxx/qla_mid.c | 42 ++++++++++++++++++++------------- drivers/scsi/qla2xxx/qla_os.c | 6 ++--- 3 files changed, 50 insertions(+), 31 deletions(-)
diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c index f8f471157109..f40e233c8ce5 100644 --- a/drivers/scsi/qla2xxx/qla_init.c +++ b/drivers/scsi/qla2xxx/qla_init.c @@ -6459,13 +6459,13 @@ void qla2x00_update_fcports(scsi_qla_host_t *base_vha) { fc_port_t *fcport; - struct scsi_qla_host *vha; + struct scsi_qla_host *vha, *tvp; struct qla_hw_data *ha = base_vha->hw; unsigned long flags;
spin_lock_irqsave(&ha->vport_slock, flags); /* Go with deferred removal of rport references. */ - list_for_each_entry(vha, &base_vha->hw->vp_list, list) { + list_for_each_entry_safe(vha, tvp, &base_vha->hw->vp_list, list) { atomic_inc(&vha->vref_count); list_for_each_entry(fcport, &vha->vp_fcports, list) { if (fcport->drport && @@ -6810,7 +6810,8 @@ void qla2x00_quiesce_io(scsi_qla_host_t *vha) { struct qla_hw_data *ha = vha->hw; - struct scsi_qla_host *vp; + struct scsi_qla_host *vp, *tvp; + unsigned long flags;
ql_dbg(ql_dbg_dpc, vha, 0x401d, "Quiescing I/O - ha=%p.\n", ha); @@ -6819,8 +6820,18 @@ qla2x00_quiesce_io(scsi_qla_host_t *vha) if (atomic_read(&vha->loop_state) != LOOP_DOWN) { atomic_set(&vha->loop_state, LOOP_DOWN); qla2x00_mark_all_devices_lost(vha); - list_for_each_entry(vp, &ha->vp_list, list) + + spin_lock_irqsave(&ha->vport_slock, flags); + list_for_each_entry_safe(vp, tvp, &ha->vp_list, list) { + atomic_inc(&vp->vref_count); + spin_unlock_irqrestore(&ha->vport_slock, flags); + qla2x00_mark_all_devices_lost(vp); + + spin_lock_irqsave(&ha->vport_slock, flags); + atomic_dec(&vp->vref_count); + } + spin_unlock_irqrestore(&ha->vport_slock, flags); } else { if (!atomic_read(&vha->loop_down_timer)) atomic_set(&vha->loop_down_timer, @@ -6835,7 +6846,7 @@ void qla2x00_abort_isp_cleanup(scsi_qla_host_t *vha) { struct qla_hw_data *ha = vha->hw; - struct scsi_qla_host *vp; + struct scsi_qla_host *vp, *tvp; unsigned long flags; fc_port_t *fcport; u16 i; @@ -6903,7 +6914,7 @@ qla2x00_abort_isp_cleanup(scsi_qla_host_t *vha) qla2x00_mark_all_devices_lost(vha);
spin_lock_irqsave(&ha->vport_slock, flags); - list_for_each_entry(vp, &ha->vp_list, list) { + list_for_each_entry_safe(vp, tvp, &ha->vp_list, list) { atomic_inc(&vp->vref_count); spin_unlock_irqrestore(&ha->vport_slock, flags);
@@ -6925,7 +6936,7 @@ qla2x00_abort_isp_cleanup(scsi_qla_host_t *vha) fcport->scan_state = 0; } spin_lock_irqsave(&ha->vport_slock, flags); - list_for_each_entry(vp, &ha->vp_list, list) { + list_for_each_entry_safe(vp, tvp, &ha->vp_list, list) { atomic_inc(&vp->vref_count); spin_unlock_irqrestore(&ha->vport_slock, flags);
@@ -6969,7 +6980,7 @@ qla2x00_abort_isp(scsi_qla_host_t *vha) int rval; uint8_t status = 0; struct qla_hw_data *ha = vha->hw; - struct scsi_qla_host *vp; + struct scsi_qla_host *vp, *tvp; struct req_que *req = ha->req_q_map[0]; unsigned long flags;
@@ -7125,7 +7136,7 @@ qla2x00_abort_isp(scsi_qla_host_t *vha) ql_dbg(ql_dbg_taskm, vha, 0x8022, "%s succeeded.\n", __func__); qla2x00_configure_hba(vha); spin_lock_irqsave(&ha->vport_slock, flags); - list_for_each_entry(vp, &ha->vp_list, list) { + list_for_each_entry_safe(vp, tvp, &ha->vp_list, list) { if (vp->vp_idx) { atomic_inc(&vp->vref_count); spin_unlock_irqrestore(&ha->vport_slock, flags); @@ -8810,7 +8821,7 @@ qla82xx_restart_isp(scsi_qla_host_t *vha) { int status, rval; struct qla_hw_data *ha = vha->hw; - struct scsi_qla_host *vp; + struct scsi_qla_host *vp, *tvp; unsigned long flags;
status = qla2x00_init_rings(vha); @@ -8882,7 +8893,7 @@ qla82xx_restart_isp(scsi_qla_host_t *vha) "qla82xx_restart_isp succeeded.\n");
spin_lock_irqsave(&ha->vport_slock, flags); - list_for_each_entry(vp, &ha->vp_list, list) { + list_for_each_entry_safe(vp, tvp, &ha->vp_list, list) { if (vp->vp_idx) { atomic_inc(&vp->vref_count); spin_unlock_irqrestore(&ha->vport_slock, flags); diff --git a/drivers/scsi/qla2xxx/qla_mid.c b/drivers/scsi/qla2xxx/qla_mid.c index c7caf322f445..787e0c0446a6 100644 --- a/drivers/scsi/qla2xxx/qla_mid.c +++ b/drivers/scsi/qla2xxx/qla_mid.c @@ -65,7 +65,7 @@ qla24xx_deallocate_vp_id(scsi_qla_host_t *vha) uint16_t vp_id; struct qla_hw_data *ha = vha->hw; unsigned long flags = 0; - u8 i; + u32 i, bailout;
mutex_lock(&ha->vport_lock); /* @@ -75,21 +75,29 @@ qla24xx_deallocate_vp_id(scsi_qla_host_t *vha) * ensures no active vp_list traversal while the vport is removed * from the queue) */ - for (i = 0; i < 10; i++) { - if (wait_event_timeout(vha->vref_waitq, - !atomic_read(&vha->vref_count), HZ) > 0) + bailout = 0; + for (i = 0; i < 500; i++) { + spin_lock_irqsave(&ha->vport_slock, flags); + if (atomic_read(&vha->vref_count) == 0) { + list_del(&vha->list); + qlt_update_vp_map(vha, RESET_VP_IDX); + bailout = 1; + } + spin_unlock_irqrestore(&ha->vport_slock, flags); + + if (bailout) break; + else + msleep(20); } - - spin_lock_irqsave(&ha->vport_slock, flags); - if (atomic_read(&vha->vref_count)) { - ql_dbg(ql_dbg_vport, vha, 0xfffa, - "vha->vref_count=%u timeout\n", vha->vref_count.counter); - vha->vref_count = (atomic_t)ATOMIC_INIT(0); + if (!bailout) { + ql_log(ql_log_info, vha, 0xfffa, + "vha->vref_count=%u timeout\n", vha->vref_count.counter); + spin_lock_irqsave(&ha->vport_slock, flags); + list_del(&vha->list); + qlt_update_vp_map(vha, RESET_VP_IDX); + spin_unlock_irqrestore(&ha->vport_slock, flags); } - list_del(&vha->list); - qlt_update_vp_map(vha, RESET_VP_IDX); - spin_unlock_irqrestore(&ha->vport_slock, flags);
vp_id = vha->vp_idx; ha->num_vhosts--; @@ -257,13 +265,13 @@ qla24xx_configure_vp(scsi_qla_host_t *vha) void qla2x00_alert_all_vps(struct rsp_que *rsp, uint16_t *mb) { - scsi_qla_host_t *vha; + scsi_qla_host_t *vha, *tvp; struct qla_hw_data *ha = rsp->hw; int i = 0; unsigned long flags;
spin_lock_irqsave(&ha->vport_slock, flags); - list_for_each_entry(vha, &ha->vp_list, list) { + list_for_each_entry_safe(vha, tvp, &ha->vp_list, list) { if (vha->vp_idx) { if (test_bit(VPORT_DELETE, &vha->dpc_flags)) continue; @@ -416,7 +424,7 @@ void qla2x00_do_dpc_all_vps(scsi_qla_host_t *vha) { struct qla_hw_data *ha = vha->hw; - scsi_qla_host_t *vp; + scsi_qla_host_t *vp, *tvp; unsigned long flags = 0;
if (vha->vp_idx) @@ -430,7 +438,7 @@ qla2x00_do_dpc_all_vps(scsi_qla_host_t *vha) return;
spin_lock_irqsave(&ha->vport_slock, flags); - list_for_each_entry(vp, &ha->vp_list, list) { + list_for_each_entry_safe(vp, tvp, &ha->vp_list, list) { if (vp->vp_idx) { atomic_inc(&vp->vref_count); spin_unlock_irqrestore(&ha->vport_slock, flags); diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c index cedd558f65eb..47f15c0e4277 100644 --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -7430,7 +7430,7 @@ static void qla_pci_error_cleanup(scsi_qla_host_t *vha) struct qla_hw_data *ha = vha->hw; scsi_qla_host_t *base_vha = pci_get_drvdata(ha->pdev); struct qla_qpair *qpair = NULL; - struct scsi_qla_host *vp; + struct scsi_qla_host *vp, *tvp; fc_port_t *fcport; int i; unsigned long flags; @@ -7461,7 +7461,7 @@ static void qla_pci_error_cleanup(scsi_qla_host_t *vha) qla2x00_mark_all_devices_lost(vha);
spin_lock_irqsave(&ha->vport_slock, flags); - list_for_each_entry(vp, &ha->vp_list, list) { + list_for_each_entry_safe(vp, tvp, &ha->vp_list, list) { atomic_inc(&vp->vref_count); spin_unlock_irqrestore(&ha->vport_slock, flags); qla2x00_mark_all_devices_lost(vp); @@ -7475,7 +7475,7 @@ static void qla_pci_error_cleanup(scsi_qla_host_t *vha) fcport->flags &= ~(FCF_LOGIN_NEEDED | FCF_ASYNC_SENT);
spin_lock_irqsave(&ha->vport_slock, flags); - list_for_each_entry(vp, &ha->vp_list, list) { + list_for_each_entry_safe(vp, tvp, &ha->vp_list, list) { atomic_inc(&vp->vref_count); spin_unlock_irqrestore(&ha->vport_slock, flags); list_for_each_entry(fcport, &vp->vp_fcports, list)
From: Quinn Tran qutran@marvell.com
[ Upstream commit a57214443f0f85639a0d9bbb8bd658d82dbf0927 ]
When user creates multiple NPIVs, the switch capabilities field is checked before a vport is allowed to be created. This field is being toggled if a switch scan is in progress. This creates erroneous reject of vport create.
Link: https://lore.kernel.org/r/20210810043720.1137-10-njavali@marvell.com Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com Signed-off-by: Quinn Tran qutran@marvell.com Signed-off-by: Nilesh Javali njavali@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_init.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c index f40e233c8ce5..27d10524ec3e 100644 --- a/drivers/scsi/qla2xxx/qla_init.c +++ b/drivers/scsi/qla2xxx/qla_init.c @@ -4531,11 +4531,11 @@ qla2x00_configure_hba(scsi_qla_host_t *vha) /* initialize */ ha->min_external_loopid = SNS_FIRST_LOOP_ID; ha->operating_mode = LOOP; - ha->switch_cap = 0;
switch (topo) { case 0: ql_dbg(ql_dbg_disc, vha, 0x200b, "HBA in NL topology.\n"); + ha->switch_cap = 0; ha->current_topology = ISP_CFG_NL; strcpy(connect_type, "(Loop)"); break; @@ -4549,6 +4549,7 @@ qla2x00_configure_hba(scsi_qla_host_t *vha)
case 2: ql_dbg(ql_dbg_disc, vha, 0x200d, "HBA in N P2P topology.\n"); + ha->switch_cap = 0; ha->operating_mode = P2P; ha->current_topology = ISP_CFG_N; strcpy(connect_type, "(N_Port-to-N_Port)"); @@ -4565,6 +4566,7 @@ qla2x00_configure_hba(scsi_qla_host_t *vha) default: ql_dbg(ql_dbg_disc, vha, 0x200f, "HBA in unknown topology %x, using NL.\n", topo); + ha->switch_cap = 0; ha->current_topology = ISP_CFG_NL; strcpy(connect_type, "(Loop)"); break;
From: Amir Goldstein amir73il@gmail.com
[ Upstream commit 72db82115d2bdfbfba8b15a92d91872cfe1b40c6 ]
When a lower file has sync/noatime fileattr flags, the behavior of overlayfs post copy up is inconsistent.
Immediately after copy up, ovl inode still has the S_SYNC/S_NOATIME inode flags copied from lower inode, so vfs code still treats the ovl inode as sync/noatime. After ovl inode evict or mount cycle, the ovl inode does not have these inode flags anymore.
To fix this inconsistency, try to copy the fileattr flags on copy up if the upper fs supports the fileattr_set() method.
This gives consistent behavior post copy up regardless of inode eviction from cache.
We cannot copy up the immutable/append-only inode flags in a similar manner, because immutable/append-only inodes cannot be linked and because overlayfs will not be able to set overlay.* xattr on the upper inodes.
Those flags will be addressed by a followup patch.
Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/overlayfs/copy_up.c | 51 ++++++++++++++++++++++++++++++++++------ fs/overlayfs/inode.c | 44 ++++++++++++++++++++++++---------- fs/overlayfs/overlayfs.h | 15 +++++++++++- 3 files changed, 89 insertions(+), 21 deletions(-)
diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c index 2846b943e80c..5e7fb92f9edd 100644 --- a/fs/overlayfs/copy_up.c +++ b/fs/overlayfs/copy_up.c @@ -8,6 +8,7 @@ #include <linux/fs.h> #include <linux/slab.h> #include <linux/file.h> +#include <linux/fileattr.h> #include <linux/splice.h> #include <linux/xattr.h> #include <linux/security.h> @@ -130,6 +131,31 @@ int ovl_copy_xattr(struct super_block *sb, struct dentry *old, return error; }
+static int ovl_copy_fileattr(struct path *old, struct path *new) +{ + struct fileattr oldfa = { .flags_valid = true }; + struct fileattr newfa = { .flags_valid = true }; + int err; + + err = ovl_real_fileattr_get(old, &oldfa); + if (err) + return err; + + err = ovl_real_fileattr_get(new, &newfa); + if (err) + return err; + + BUILD_BUG_ON(OVL_COPY_FS_FLAGS_MASK & ~FS_COMMON_FL); + newfa.flags &= ~OVL_COPY_FS_FLAGS_MASK; + newfa.flags |= (oldfa.flags & OVL_COPY_FS_FLAGS_MASK); + + BUILD_BUG_ON(OVL_COPY_FSX_FLAGS_MASK & ~FS_XFLAG_COMMON); + newfa.fsx_xflags &= ~OVL_COPY_FSX_FLAGS_MASK; + newfa.fsx_xflags |= (oldfa.fsx_xflags & OVL_COPY_FSX_FLAGS_MASK); + + return ovl_real_fileattr_set(new, &newfa); +} + static int ovl_copy_up_data(struct ovl_fs *ofs, struct path *old, struct path *new, loff_t len) { @@ -493,20 +519,21 @@ static int ovl_link_up(struct ovl_copy_up_ctx *c) static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp) { struct ovl_fs *ofs = OVL_FS(c->dentry->d_sb); + struct inode *inode = d_inode(c->dentry); + struct path upperpath, datapath; int err;
+ ovl_path_upper(c->dentry, &upperpath); + if (WARN_ON(upperpath.dentry != NULL)) + return -EIO; + + upperpath.dentry = temp; + /* * Copy up data first and then xattrs. Writing data after * xattrs will remove security.capability xattr automatically. */ if (S_ISREG(c->stat.mode) && !c->metacopy) { - struct path upperpath, datapath; - - ovl_path_upper(c->dentry, &upperpath); - if (WARN_ON(upperpath.dentry != NULL)) - return -EIO; - upperpath.dentry = temp; - ovl_path_lowerdata(c->dentry, &datapath); err = ovl_copy_up_data(ofs, &datapath, &upperpath, c->stat.size); @@ -518,6 +545,16 @@ static int ovl_copy_up_inode(struct ovl_copy_up_ctx *c, struct dentry *temp) if (err) return err;
+ if (inode->i_flags & OVL_COPY_I_FLAGS_MASK) { + /* + * Copy the fileattr inode flags that are the source of already + * copied i_flags + */ + err = ovl_copy_fileattr(&c->lowerpath, &upperpath); + if (err) + return err; + } + /* * Store identifier of lower inode in upper inode xattr to * allow lookup of the copy up origin inode. diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c index 5e828a1c98a8..b288843e6b42 100644 --- a/fs/overlayfs/inode.c +++ b/fs/overlayfs/inode.c @@ -503,16 +503,14 @@ static int ovl_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo, * Introducing security_inode_fileattr_get/set() hooks would solve this issue * properly. */ -static int ovl_security_fileattr(struct dentry *dentry, struct fileattr *fa, +static int ovl_security_fileattr(struct path *realpath, struct fileattr *fa, bool set) { - struct path realpath; struct file *file; unsigned int cmd; int err;
- ovl_path_real(dentry, &realpath); - file = dentry_open(&realpath, O_RDONLY, current_cred()); + file = dentry_open(realpath, O_RDONLY, current_cred()); if (IS_ERR(file)) return PTR_ERR(file);
@@ -527,11 +525,22 @@ static int ovl_security_fileattr(struct dentry *dentry, struct fileattr *fa, return err; }
+int ovl_real_fileattr_set(struct path *realpath, struct fileattr *fa) +{ + int err; + + err = ovl_security_fileattr(realpath, fa, true); + if (err) + return err; + + return vfs_fileattr_set(&init_user_ns, realpath->dentry, fa); +} + int ovl_fileattr_set(struct user_namespace *mnt_userns, struct dentry *dentry, struct fileattr *fa) { struct inode *inode = d_inode(dentry); - struct dentry *upperdentry; + struct path upperpath; const struct cred *old_cred; int err;
@@ -541,12 +550,10 @@ int ovl_fileattr_set(struct user_namespace *mnt_userns,
err = ovl_copy_up(dentry); if (!err) { - upperdentry = ovl_dentry_upper(dentry); + ovl_path_real(dentry, &upperpath);
old_cred = ovl_override_creds(inode->i_sb); - err = ovl_security_fileattr(dentry, fa, true); - if (!err) - err = vfs_fileattr_set(&init_user_ns, upperdentry, fa); + err = ovl_real_fileattr_set(&upperpath, fa); revert_creds(old_cred); ovl_copyflags(ovl_inode_real(inode), inode); } @@ -555,17 +562,28 @@ int ovl_fileattr_set(struct user_namespace *mnt_userns, return err; }
+int ovl_real_fileattr_get(struct path *realpath, struct fileattr *fa) +{ + int err; + + err = ovl_security_fileattr(realpath, fa, false); + if (err) + return err; + + return vfs_fileattr_get(realpath->dentry, fa); +} + int ovl_fileattr_get(struct dentry *dentry, struct fileattr *fa) { struct inode *inode = d_inode(dentry); - struct dentry *realdentry = ovl_dentry_real(dentry); + struct path realpath; const struct cred *old_cred; int err;
+ ovl_path_real(dentry, &realpath); + old_cred = ovl_override_creds(inode->i_sb); - err = ovl_security_fileattr(dentry, fa, false); - if (!err) - err = vfs_fileattr_get(realdentry, fa); + err = ovl_real_fileattr_get(&realpath, fa); revert_creds(old_cred);
return err; diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h index 6ec73db4bf9e..8e94d9d0c919 100644 --- a/fs/overlayfs/overlayfs.h +++ b/fs/overlayfs/overlayfs.h @@ -518,9 +518,20 @@ static inline void ovl_copyattr(struct inode *from, struct inode *to) i_size_write(to, i_size_read(from)); }
+/* vfs inode flags copied from real to ovl inode */ +#define OVL_COPY_I_FLAGS_MASK (S_SYNC | S_NOATIME | S_APPEND | S_IMMUTABLE) + +/* + * fileattr flags copied from lower to upper inode on copy up. + * We cannot copy immutable/append-only flags, because that would prevevnt + * linking temp inode to upper dir. + */ +#define OVL_COPY_FS_FLAGS_MASK (FS_SYNC_FL | FS_NOATIME_FL) +#define OVL_COPY_FSX_FLAGS_MASK (FS_XFLAG_SYNC | FS_XFLAG_NOATIME) + static inline void ovl_copyflags(struct inode *from, struct inode *to) { - unsigned int mask = S_SYNC | S_IMMUTABLE | S_APPEND | S_NOATIME; + unsigned int mask = OVL_COPY_I_FLAGS_MASK;
inode_set_flags(to, from->i_flags & mask, mask); } @@ -548,6 +559,8 @@ struct dentry *ovl_create_temp(struct dentry *workdir, struct ovl_cattr *attr); extern const struct file_operations ovl_file_operations; int __init ovl_aio_request_cache_init(void); void ovl_aio_request_cache_destroy(void); +int ovl_real_fileattr_get(struct path *realpath, struct fileattr *fa); +int ovl_real_fileattr_set(struct path *realpath, struct fileattr *fa); int ovl_fileattr_get(struct dentry *dentry, struct fileattr *fa); int ovl_fileattr_set(struct user_namespace *mnt_userns, struct dentry *dentry, struct fileattr *fa);
On Fri, Sep 10, 2021 at 3:17 AM Sasha Levin sashal@kernel.org wrote:
From: Amir Goldstein amir73il@gmail.com
[ Upstream commit 72db82115d2bdfbfba8b15a92d91872cfe1b40c6 ]
When a lower file has sync/noatime fileattr flags, the behavior of overlayfs post copy up is inconsistent.
Immediately after copy up, ovl inode still has the S_SYNC/S_NOATIME inode flags copied from lower inode, so vfs code still treats the ovl inode as sync/noatime. After ovl inode evict or mount cycle, the ovl inode does not have these inode flags anymore.
To fix this inconsistency, try to copy the fileattr flags on copy up if the upper fs supports the fileattr_set() method.
This gives consistent behavior post copy up regardless of inode eviction from cache.
We cannot copy up the immutable/append-only inode flags in a similar manner, because immutable/append-only inodes cannot be linked and because overlayfs will not be able to set overlay.* xattr on the upper inodes.
Those flags will be addressed by a followup patch.
Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org
Sasha,
I do not recommend applying this patch to stable. The value/risk ratio is not worth it IMO.
I don't know of anyone who ever complained about not copying the NOATIME/SYNC fileattrs specifically.
This patch is more of a complimentary patch to the IMMUTABLE/ APPEND fileattr patch, which is not appropriate for stable either.
OTOH, ovl-update-5.15 has this patch that was not included in the AUTOSEL batch, even though it has a Fixes tag, CC stable and very strong hints in the subject: 52d5a0c6bd8a ("ovl: fix BUG_ON() in may_delete() when called from ovl_cleanup()")
I suppose AUTOSEL leaves these sorts of patches to Greg's scripts?
Thanks, Amir.
On Fri, Sep 10, 2021 at 08:35:41AM +0300, Amir Goldstein wrote:
On Fri, Sep 10, 2021 at 3:17 AM Sasha Levin sashal@kernel.org wrote:
From: Amir Goldstein amir73il@gmail.com
[ Upstream commit 72db82115d2bdfbfba8b15a92d91872cfe1b40c6 ]
When a lower file has sync/noatime fileattr flags, the behavior of overlayfs post copy up is inconsistent.
Immediately after copy up, ovl inode still has the S_SYNC/S_NOATIME inode flags copied from lower inode, so vfs code still treats the ovl inode as sync/noatime. After ovl inode evict or mount cycle, the ovl inode does not have these inode flags anymore.
To fix this inconsistency, try to copy the fileattr flags on copy up if the upper fs supports the fileattr_set() method.
This gives consistent behavior post copy up regardless of inode eviction from cache.
We cannot copy up the immutable/append-only inode flags in a similar manner, because immutable/append-only inodes cannot be linked and because overlayfs will not be able to set overlay.* xattr on the upper inodes.
Those flags will be addressed by a followup patch.
Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org
Sasha,
I do not recommend applying this patch to stable. The value/risk ratio is not worth it IMO.
I don't know of anyone who ever complained about not copying the NOATIME/SYNC fileattrs specifically.
This patch is more of a complimentary patch to the IMMUTABLE/ APPEND fileattr patch, which is not appropriate for stable either.
Thanks! I'll drop it.
OTOH, ovl-update-5.15 has this patch that was not included in the AUTOSEL batch, even though it has a Fixes tag, CC stable and very strong hints in the subject: 52d5a0c6bd8a ("ovl: fix BUG_ON() in may_delete() when called from ovl_cleanup()")
I suppose AUTOSEL leaves these sorts of patches to Greg's scripts?
That's correct. If it has a stable tag AUTOSEL won't look at it (why should it? :) ).
From: Chengguang Xu cgxu519@mykernel.net
[ Upstream commit b71759ef1e1730db81dab98e9dab9455e8c7f5a2 ]
It is possible that a directory tree is shared between multiple overlay instances as a lower layer. In this case when one instance executes a file residing on the lower layer, the other instance denies a truncate(2) call on this file.
This only happens for truncate(2) and not for open(2) with the O_TRUNC flag.
Fix this interference and inconsistency by removing the preliminary i_writecount check before copy-up.
This means that unlike on normal filesystems truncate(argv[0]) will now succeed. If this ever causes a regression in a real world use case this needs to be revisited.
One way to fix this properly would be to keep a correct i_writecount in the overlay inode, but that is difficult due to memory mapping code only dealing with the real file/inode.
Signed-off-by: Chengguang Xu cgxu519@mykernel.net Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/filesystems/overlayfs.rst | 3 +++ fs/overlayfs/inode.c | 6 ------ 2 files changed, 3 insertions(+), 6 deletions(-)
diff --git a/Documentation/filesystems/overlayfs.rst b/Documentation/filesystems/overlayfs.rst index 455ca86eb4fc..7da6c30ed596 100644 --- a/Documentation/filesystems/overlayfs.rst +++ b/Documentation/filesystems/overlayfs.rst @@ -427,6 +427,9 @@ b) If a file residing on a lower layer is opened for read-only and then memory mapped with MAP_SHARED, then subsequent changes to the file are not reflected in the memory mapping.
+c) If a file residing on a lower layer is being executed, then opening that +file for write or truncating the file will not be denied with ETXTBSY. + The following options allow overlayfs to act more like a standards compliant filesystem:
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c index b288843e6b42..6566294c5fd6 100644 --- a/fs/overlayfs/inode.c +++ b/fs/overlayfs/inode.c @@ -33,12 +33,6 @@ int ovl_setattr(struct user_namespace *mnt_userns, struct dentry *dentry, goto out;
if (attr->ia_valid & ATTR_SIZE) { - struct inode *realinode = d_inode(ovl_dentry_real(dentry)); - - err = -ETXTBSY; - if (atomic_read(&realinode->i_writecount) < 0) - goto out_drop_write; - /* Truncate should trigger data copy up as well */ full_copy_up = true; }
From: "David E. Box" david.e.box@linux.intel.com
[ Upstream commit 45b6f75eab6aabf9d88933830f41f532d39f38d2 ]
Substate priority levels are encoded in 4 bits in the LPM_PRI register. This value was used as an index to an array whose element size was less than 16, leading to the possibility of overflow should we read a larger than expected priority. In addition to the overflow, bad values could lead to incorrect state reporting. So rework the priority code to prevent the overflow and perform some validation of the register. Use the priority register values if they give an ordering of unique numbers between 0 and the maximum number of states. Otherwise, use a default ordering instead.
Reported-by: Evgeny Novikov novikov@ispras.ru Signed-off-by: David E. Box david.e.box@linux.intel.com Reviewed-by: Andy Shevchenko andy.shevchenko@gmail.com Link: https://lore.kernel.org/r/20210814014728.520856-1-david.e.box@linux.intel.co... Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/intel_pmc_core.c | 65 +++++++++++++++++++++------ drivers/platform/x86/intel_pmc_core.h | 2 + 2 files changed, 53 insertions(+), 14 deletions(-)
diff --git a/drivers/platform/x86/intel_pmc_core.c b/drivers/platform/x86/intel_pmc_core.c index b0e486a6bdfb..ae410a358ffe 100644 --- a/drivers/platform/x86/intel_pmc_core.c +++ b/drivers/platform/x86/intel_pmc_core.c @@ -1449,9 +1449,42 @@ static int pmc_core_pkgc_show(struct seq_file *s, void *unused) } DEFINE_SHOW_ATTRIBUTE(pmc_core_pkgc);
-static void pmc_core_get_low_power_modes(struct pmc_dev *pmcdev) +static bool pmc_core_pri_verify(u32 lpm_pri, u8 *mode_order) { - u8 lpm_priority[LPM_MAX_NUM_MODES]; + int i, j; + + if (!lpm_pri) + return false; + /* + * Each byte contains the priority level for 2 modes (7:4 and 3:0). + * In a 32 bit register this allows for describing 8 modes. Store the + * levels and look for values out of range. + */ + for (i = 0; i < 8; i++) { + int level = lpm_pri & GENMASK(3, 0); + + if (level >= LPM_MAX_NUM_MODES) + return false; + + mode_order[i] = level; + lpm_pri >>= 4; + } + + /* Check that we have unique values */ + for (i = 0; i < LPM_MAX_NUM_MODES - 1; i++) + for (j = i + 1; j < LPM_MAX_NUM_MODES; j++) + if (mode_order[i] == mode_order[j]) + return false; + + return true; +} + +static void pmc_core_get_low_power_modes(struct platform_device *pdev) +{ + struct pmc_dev *pmcdev = platform_get_drvdata(pdev); + u8 pri_order[LPM_MAX_NUM_MODES] = LPM_DEFAULT_PRI; + u8 mode_order[LPM_MAX_NUM_MODES]; + u32 lpm_pri; u32 lpm_en; int mode, i, p;
@@ -1462,24 +1495,28 @@ static void pmc_core_get_low_power_modes(struct pmc_dev *pmcdev) lpm_en = pmc_core_reg_read(pmcdev, pmcdev->map->lpm_en_offset); pmcdev->num_lpm_modes = hweight32(lpm_en);
- /* Each byte contains information for 2 modes (7:4 and 3:0) */ - for (mode = 0; mode < LPM_MAX_NUM_MODES; mode += 2) { - u8 priority = pmc_core_reg_read_byte(pmcdev, - pmcdev->map->lpm_priority_offset + (mode / 2)); - int pri0 = GENMASK(3, 0) & priority; - int pri1 = (GENMASK(7, 4) & priority) >> 4; + /* Read 32 bit LPM_PRI register */ + lpm_pri = pmc_core_reg_read(pmcdev, pmcdev->map->lpm_priority_offset);
- lpm_priority[pri0] = mode; - lpm_priority[pri1] = mode + 1; - }
/* - * Loop though all modes from lowest to highest priority, + * If lpm_pri value passes verification, then override the default + * modes here. Otherwise stick with the default. + */ + if (pmc_core_pri_verify(lpm_pri, mode_order)) + /* Get list of modes in priority order */ + for (mode = 0; mode < LPM_MAX_NUM_MODES; mode++) + pri_order[mode_order[mode]] = mode; + else + dev_warn(&pdev->dev, "Assuming a default substate order for this platform\n"); + + /* + * Loop through all modes from lowest to highest priority, * and capture all enabled modes in order */ i = 0; for (p = LPM_MAX_NUM_MODES - 1; p >= 0; p--) { - int mode = lpm_priority[p]; + int mode = pri_order[p];
if (!(BIT(mode) & lpm_en)) continue; @@ -1675,7 +1712,7 @@ static int pmc_core_probe(struct platform_device *pdev) mutex_init(&pmcdev->lock);
pmcdev->pmc_xram_read_bit = pmc_core_check_read_lock_bit(pmcdev); - pmc_core_get_low_power_modes(pmcdev); + pmc_core_get_low_power_modes(pdev); pmc_core_do_dmi_quirks(pmcdev);
if (pmcdev->map == &tgl_reg_map) diff --git a/drivers/platform/x86/intel_pmc_core.h b/drivers/platform/x86/intel_pmc_core.h index e8dae9c6c45f..b9bf3d3d6f7a 100644 --- a/drivers/platform/x86/intel_pmc_core.h +++ b/drivers/platform/x86/intel_pmc_core.h @@ -188,6 +188,8 @@ enum ppfear_regs { #define ICL_PMC_SLP_S0_RES_COUNTER_STEP 0x64
#define LPM_MAX_NUM_MODES 8 +#define LPM_DEFAULT_PRI { 7, 6, 2, 5, 4, 1, 3, 0 } + #define GET_X2_COUNTER(v) ((v) >> 1) #define LPM_STS_LATCH_MODE BIT(31)
From: Tuo Li islituo@gmail.com
[ Upstream commit 0f99792c01d1d6d35b86e850e9ccadd98d6f3e0c ]
The return value of transport_kmap_data_sg() is assigned to the variable buf:
buf = transport_kmap_data_sg(cmd);
And then it is checked:
if (!buf) {
This indicates that buf can be NULL. However, it is dereferenced in the following statements:
if (!(buf[3] & 0x80)) buf[3] |= 0x80; if (!(buf[2] & 0x80)) buf[2] |= 0x80;
To fix these possible null-pointer dereferences, dereference buf and call transport_kunmap_data_sg() only when buf is not NULL.
Link: https://lore.kernel.org/r/20210810040414.248167-1-islituo@gmail.com Reported-by: TOTE Robot oslab@tsinghua.edu.cn Reviewed-by: Bodo Stroesser bostroesser@gmail.com Signed-off-by: Tuo Li islituo@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/target/target_core_pscsi.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/target/target_core_pscsi.c b/drivers/target/target_core_pscsi.c index 2629d2ef3970..75ef52f008ff 100644 --- a/drivers/target/target_core_pscsi.c +++ b/drivers/target/target_core_pscsi.c @@ -620,17 +620,17 @@ static void pscsi_complete_cmd(struct se_cmd *cmd, u8 scsi_status, buf = transport_kmap_data_sg(cmd); if (!buf) { ; /* XXX: TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE */ - } - - if (cdb[0] == MODE_SENSE_10) { - if (!(buf[3] & 0x80)) - buf[3] |= 0x80; } else { - if (!(buf[2] & 0x80)) - buf[2] |= 0x80; - } + if (cdb[0] == MODE_SENSE_10) { + if (!(buf[3] & 0x80)) + buf[3] |= 0x80; + } else { + if (!(buf[2] & 0x80)) + buf[2] |= 0x80; + }
- transport_kunmap_data_sg(cmd); + transport_kunmap_data_sg(cmd); + } } } after_mode_sense:
From: Liu Yi L yi.l.liu@intel.com
[ Upstream commit 423d39d8518c1bba12e0889a92beeddbb1502392 ]
The helper functions should not modify the pasid entries which are still in use. Add a check against present bit.
Signed-off-by: Liu Yi L yi.l.liu@intel.com Link: https://lore.kernel.org/r/20210817042425.1784279-1-yi.l.liu@intel.com Signed-off-by: Lu Baolu baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210818134852.1847070-10-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/intel/pasid.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c index 9ec374e17469..60c07050f6a7 100644 --- a/drivers/iommu/intel/pasid.c +++ b/drivers/iommu/intel/pasid.c @@ -540,6 +540,10 @@ void intel_pasid_tear_down_entry(struct intel_iommu *iommu, struct device *dev, devtlb_invalidation_with_pasid(iommu, dev, pasid); }
+/* + * This function flushes cache for a newly setup pasid table entry. + * Caller of it should not modify the in-use pasid table entries. + */ static void pasid_flush_caches(struct intel_iommu *iommu, struct pasid_entry *pte, u32 pasid, u16 did) @@ -591,6 +595,10 @@ int intel_pasid_setup_first_level(struct intel_iommu *iommu, if (WARN_ON(!pte)) return -EINVAL;
+ /* Caller must ensure PASID entry is not in use. */ + if (pasid_pte_is_present(pte)) + return -EBUSY; + pasid_clear_entry(pte);
/* Setup the first level page table pointer: */ @@ -690,6 +698,10 @@ int intel_pasid_setup_second_level(struct intel_iommu *iommu, return -ENODEV; }
+ /* Caller must ensure PASID entry is not in use. */ + if (pasid_pte_is_present(pte)) + return -EBUSY; + pasid_clear_entry(pte); pasid_set_domain_id(pte, did); pasid_set_slptr(pte, pgd_val); @@ -729,6 +741,10 @@ int intel_pasid_setup_pass_through(struct intel_iommu *iommu, return -ENODEV; }
+ /* Caller must ensure PASID entry is not in use. */ + if (pasid_pte_is_present(pte)) + return -EBUSY; + pasid_clear_entry(pte); pasid_set_domain_id(pte, did); pasid_set_address_width(pte, iommu->agaw);
From: Alexander Aring aahringo@redhat.com
[ Upstream commit aee742c9928ab4f5f4e0b00f41fb2d2cffae179e ]
This patch will return -EINTR instead of 1 if recovery is stopped. In case of ping_members() the return value will be checked if the error is -EINTR for signaling another recovery was triggered and the whole recovery process will come to a clean end to process the next one. Returning 1 will abort the recovery process and can leave the recovery in a broken state.
It was reported with the following kernel log message attached and a gfs2 mount stopped working:
"dlm: bobvirt1: dlm_recover_members error 1"
whereas 1 was returned because of a conversion of "dlm_recovery_stopped()" to an errno was missing which this patch will introduce. While on it all other possible missing errno conversions at other places were added as they are done as in other places.
It might be worth to check the error case at this recovery level, because some of the functionality also returns -ENOBUFS and check why recovery ends in a broken state. However this will fix the issue if another recovery was triggered at some points of recovery handling.
Reported-by: Bob Peterson rpeterso@redhat.com Signed-off-by: Alexander Aring aahringo@redhat.com Signed-off-by: David Teigland teigland@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/dlm/dir.c | 4 +++- fs/dlm/member.c | 4 +++- fs/dlm/recoverd.c | 4 +++- 3 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/fs/dlm/dir.c b/fs/dlm/dir.c index 10c36ae1a8f9..45ebbe602bbf 100644 --- a/fs/dlm/dir.c +++ b/fs/dlm/dir.c @@ -85,8 +85,10 @@ int dlm_recover_directory(struct dlm_ls *ls) for (;;) { int left; error = dlm_recovery_stopped(ls); - if (error) + if (error) { + error = -EINTR; goto out_free; + }
error = dlm_rcom_names(ls, memb->nodeid, last_name, last_len); diff --git a/fs/dlm/member.c b/fs/dlm/member.c index d9e1e4170eb1..731d489aa323 100644 --- a/fs/dlm/member.c +++ b/fs/dlm/member.c @@ -443,8 +443,10 @@ static int ping_members(struct dlm_ls *ls)
list_for_each_entry(memb, &ls->ls_nodes, list) { error = dlm_recovery_stopped(ls); - if (error) + if (error) { + error = -EINTR; break; + } error = dlm_rcom_status(ls, memb->nodeid, 0); if (error) break; diff --git a/fs/dlm/recoverd.c b/fs/dlm/recoverd.c index 85e245392715..97d052cea5a9 100644 --- a/fs/dlm/recoverd.c +++ b/fs/dlm/recoverd.c @@ -125,8 +125,10 @@ static int ls_recover(struct dlm_ls *ls, struct dlm_recover *rv) dlm_recover_waiters_pre(ls);
error = dlm_recovery_stopped(ls); - if (error) + if (error) { + error = -EINTR; goto fail; + }
if (neg || dlm_no_directory(ls)) { /*
From: Meng Dong whenov@gmail.com
[ Upstream commit 3ae86d2d4704796ee658a34245cb86e68c40c5d7 ]
This patch fixes the bug 212671. Althrough the Fn lock (Fn + Esc) works on Legion 5 (R7000P), its LED light does not change with the state. This modification sets the Fn lock state to its current value on receiving the wmi event 8FC0DE0C-B4E4-43FD-B0F3-8871711C1294 to update the LED state.
Signed-off-by: Meng Dong whenov@gmail.com Link: https://lore.kernel.org/r/20210817171203.12855-1-whenov@gmail.com Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/ideapad-laptop.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/drivers/platform/x86/ideapad-laptop.c b/drivers/platform/x86/ideapad-laptop.c index 784326bd72f0..e7a1299e3776 100644 --- a/drivers/platform/x86/ideapad-laptop.c +++ b/drivers/platform/x86/ideapad-laptop.c @@ -41,6 +41,7 @@ static const char *const ideapad_wmi_fnesc_events[] = { "26CAB2E5-5CF1-46AE-AAC3-4A12B6BA50E6", /* Yoga 3 */ "56322276-8493-4CE8-A783-98C991274F5E", /* Yoga 700 */ + "8FC0DE0C-B4E4-43FD-B0F3-8871711C1294", /* Legion 5 */ }; #endif
@@ -1459,11 +1460,19 @@ static void ideapad_acpi_notify(acpi_handle handle, u32 event, void *data) static void ideapad_wmi_notify(u32 value, void *context) { struct ideapad_private *priv = context; + unsigned long result;
switch (value) { case 128: ideapad_input_report(priv, value); break; + case 208: + if (!eval_hals(priv->adev->handle, &result)) { + bool state = test_bit(HALS_FNLOCK_STATE_BIT, &result); + + exec_sals(priv->adev->handle, state ? SALS_FNLOCK_ON : SALS_FNLOCK_OFF); + } + break; default: dev_info(&priv->platform_device->dev, "Unknown WMI event: %u\n", value);
From: Evgeny Novikov novikov@ispras.ru
[ Upstream commit d0f1d5ae23803bd82647a337fa508fa8615defc5 ]
When thrustmaster_probe() handles errors of usb_submit_urb() it does not free allocated resources and fails. The patch fixes that.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Evgeny Novikov novikov@ispras.ru Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/hid-thrustmaster.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/hid/hid-thrustmaster.c b/drivers/hid/hid-thrustmaster.c index cdc7d82ae9ed..e94d3409fd10 100644 --- a/drivers/hid/hid-thrustmaster.c +++ b/drivers/hid/hid-thrustmaster.c @@ -336,11 +336,14 @@ static int thrustmaster_probe(struct hid_device *hdev, const struct hid_device_i );
ret = usb_submit_urb(tm_wheel->urb, GFP_ATOMIC); - if (ret) + if (ret) { hid_err(hdev, "Error %d while submitting the URB. I am unable to initialize this wheel...\n", ret); + goto error6; + }
return ret;
+error6: kfree(tm_wheel->change_request); error5: kfree(tm_wheel->response); error4: kfree(tm_wheel->model_request); error3: usb_free_urb(tm_wheel->urb);
From: Evgeny Novikov novikov@ispras.ru
[ Upstream commit df3a97bdbc252d3421f1c5d6d713ad6e4f36a469 ]
thrustmaster_remove() does not release memory for tm_wheel->change_request. This is fixed by the patch.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Evgeny Novikov novikov@ispras.ru Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/hid-thrustmaster.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/hid/hid-thrustmaster.c b/drivers/hid/hid-thrustmaster.c index e94d3409fd10..9cb4248f95af 100644 --- a/drivers/hid/hid-thrustmaster.c +++ b/drivers/hid/hid-thrustmaster.c @@ -253,6 +253,7 @@ static void thrustmaster_remove(struct hid_device *hdev)
usb_kill_urb(tm_wheel->urb);
+ kfree(tm_wheel->change_request); kfree(tm_wheel->response); kfree(tm_wheel->model_request); usb_free_urb(tm_wheel->urb);
From: Evgeny Novikov novikov@ispras.ru
[ Upstream commit c3800eed22d21c66912b4461a403b4448ed88d95 ]
thrustmaster_interrupts() does not free memory for send_buf when usb_interrupt_msg() fails. This is fixed by the given patch.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Evgeny Novikov novikov@ispras.ru Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/hid-thrustmaster.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/hid/hid-thrustmaster.c b/drivers/hid/hid-thrustmaster.c index 9cb4248f95af..d44550aa8805 100644 --- a/drivers/hid/hid-thrustmaster.c +++ b/drivers/hid/hid-thrustmaster.c @@ -173,6 +173,7 @@ static void thrustmaster_interrupts(struct hid_device *hdev)
if (ret) { hid_err(hdev, "setup data couldn't be sent\n"); + kfree(send_buf); return; } }
From: Ulrich Spörlein uqs@FreeBSD.org
[ Upstream commit bab94e97323baefe0afccad66e776f9c78b4f521 ]
The device string on these can differ, apparently, including typos. I've bought 2 of these in 2012 and googling shows many folks out there with that broken spelling in their dmesg.
Signed-off-by: Ulrich Spörlein uqs@FreeBSD.org Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/hid-sony.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/hid/hid-sony.c b/drivers/hid/hid-sony.c index b3722c51ec78..a2fef59063a6 100644 --- a/drivers/hid/hid-sony.c +++ b/drivers/hid/hid-sony.c @@ -2974,7 +2974,8 @@ static int sony_probe(struct hid_device *hdev, const struct hid_device_id *id) if (!strcmp(hdev->name, "FutureMax Dance Mat")) quirks |= FUTUREMAX_DANCE_MAT;
- if (!strcmp(hdev->name, "SHANWAN PS3 GamePad")) + if (!strcmp(hdev->name, "SHANWAN PS3 GamePad") || + !strcmp(hdev->name, "ShanWan PS(R) Ga`epad")) quirks |= SHANWAN_GAMEPAD;
sc = devm_kzalloc(&hdev->dev, sizeof(*sc), GFP_KERNEL);
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit 0181f6f19c6c35b24f1516d8db22f3bbce762633 ]
The ocelot switch driver used to ignore ports which do not have a phy-handle property and not probe those, but this is not quite ok since it is valid to not have a phy-handle property if there is a fixed-link.
It seems that checking for a phy-handle was a proxy for the proper check which is for the status, but that doesn't make a lot of sense, since the ocelot driver already iterates using for_each_available_child_of_node which skips the disabled ports, so I have no idea.
Anyway, a widespread pattern in device trees is for a SoC dtsi to disable by default all hardware, and let board dts files enable what is used. So let's do that and enable only the ports with a phy-handle in the pcb120 and pcb123 device tree files.
Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/mips/boot/dts/mscc/ocelot.dtsi | 11 +++++++++++ arch/mips/boot/dts/mscc/ocelot_pcb120.dts | 8 ++++++++ arch/mips/boot/dts/mscc/ocelot_pcb123.dts | 4 ++++ 3 files changed, 23 insertions(+)
diff --git a/arch/mips/boot/dts/mscc/ocelot.dtsi b/arch/mips/boot/dts/mscc/ocelot.dtsi index 535a98284dcb..e51db651af13 100644 --- a/arch/mips/boot/dts/mscc/ocelot.dtsi +++ b/arch/mips/boot/dts/mscc/ocelot.dtsi @@ -150,36 +150,47 @@ ethernet-ports {
port0: port@0 { reg = <0>; + status = "disabled"; }; port1: port@1 { reg = <1>; + status = "disabled"; }; port2: port@2 { reg = <2>; + status = "disabled"; }; port3: port@3 { reg = <3>; + status = "disabled"; }; port4: port@4 { reg = <4>; + status = "disabled"; }; port5: port@5 { reg = <5>; + status = "disabled"; }; port6: port@6 { reg = <6>; + status = "disabled"; }; port7: port@7 { reg = <7>; + status = "disabled"; }; port8: port@8 { reg = <8>; + status = "disabled"; }; port9: port@9 { reg = <9>; + status = "disabled"; }; port10: port@10 { reg = <10>; + status = "disabled"; }; }; }; diff --git a/arch/mips/boot/dts/mscc/ocelot_pcb120.dts b/arch/mips/boot/dts/mscc/ocelot_pcb120.dts index 897de5025d7f..d2dc6b3d923c 100644 --- a/arch/mips/boot/dts/mscc/ocelot_pcb120.dts +++ b/arch/mips/boot/dts/mscc/ocelot_pcb120.dts @@ -69,40 +69,48 @@ phy4: ethernet-phy@3 { };
&port0 { + status = "okay"; phy-handle = <&phy0>; };
&port1 { + status = "okay"; phy-handle = <&phy1>; };
&port2 { + status = "okay"; phy-handle = <&phy2>; };
&port3 { + status = "okay"; phy-handle = <&phy3>; };
&port4 { + status = "okay"; phy-handle = <&phy7>; phy-mode = "sgmii"; phys = <&serdes 4 SERDES1G(2)>; };
&port5 { + status = "okay"; phy-handle = <&phy4>; phy-mode = "sgmii"; phys = <&serdes 5 SERDES1G(5)>; };
&port6 { + status = "okay"; phy-handle = <&phy6>; phy-mode = "sgmii"; phys = <&serdes 6 SERDES1G(3)>; };
&port9 { + status = "okay"; phy-handle = <&phy5>; phy-mode = "sgmii"; phys = <&serdes 9 SERDES1G(4)>; diff --git a/arch/mips/boot/dts/mscc/ocelot_pcb123.dts b/arch/mips/boot/dts/mscc/ocelot_pcb123.dts index ef852f382da8..7d7e638791dd 100644 --- a/arch/mips/boot/dts/mscc/ocelot_pcb123.dts +++ b/arch/mips/boot/dts/mscc/ocelot_pcb123.dts @@ -47,17 +47,21 @@ &mdio0 { };
&port0 { + status = "okay"; phy-handle = <&phy0>; };
&port1 { + status = "okay"; phy-handle = <&phy1>; };
&port2 { + status = "okay"; phy-handle = <&phy2>; };
&port3 { + status = "okay"; phy-handle = <&phy3>; };
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit eba54cbb92d28b4f6dc1ed5f73f5187b09d82c08 ]
The ocelot driver was converted to phylink, and that expects a valid phy_interface_t. Without a phy-mode, of_get_phy_mode returns PHY_INTERFACE_MODE_NA, which is not ideal because phylink rejects that.
The ocelot driver was patched to treat PHY_INTERFACE_MODE_NA as PHY_INTERFACE_MODE_INTERNAL to work with the broken DT blobs, but we should fix the device trees and specify the phy-mode too.
Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/mips/boot/dts/mscc/ocelot_pcb120.dts | 4 ++++ arch/mips/boot/dts/mscc/ocelot_pcb123.dts | 4 ++++ 2 files changed, 8 insertions(+)
diff --git a/arch/mips/boot/dts/mscc/ocelot_pcb120.dts b/arch/mips/boot/dts/mscc/ocelot_pcb120.dts index d2dc6b3d923c..bd240690cb37 100644 --- a/arch/mips/boot/dts/mscc/ocelot_pcb120.dts +++ b/arch/mips/boot/dts/mscc/ocelot_pcb120.dts @@ -71,21 +71,25 @@ phy4: ethernet-phy@3 { &port0 { status = "okay"; phy-handle = <&phy0>; + phy-mode = "internal"; };
&port1 { status = "okay"; phy-handle = <&phy1>; + phy-mode = "internal"; };
&port2 { status = "okay"; phy-handle = <&phy2>; + phy-mode = "internal"; };
&port3 { status = "okay"; phy-handle = <&phy3>; + phy-mode = "internal"; };
&port4 { diff --git a/arch/mips/boot/dts/mscc/ocelot_pcb123.dts b/arch/mips/boot/dts/mscc/ocelot_pcb123.dts index 7d7e638791dd..0185045c7630 100644 --- a/arch/mips/boot/dts/mscc/ocelot_pcb123.dts +++ b/arch/mips/boot/dts/mscc/ocelot_pcb123.dts @@ -49,19 +49,23 @@ &mdio0 { &port0 { status = "okay"; phy-handle = <&phy0>; + phy-mode = "internal"; };
&port1 { status = "okay"; phy-handle = <&phy1>; + phy-mode = "internal"; };
&port2 { status = "okay"; phy-handle = <&phy2>; + phy-mode = "internal"; };
&port3 { status = "okay"; phy-handle = <&phy3>; + phy-mode = "internal"; };
From: Gioh Kim gi-oh.kim@ionos.com
[ Upstream commit 0d8f2cfa23f04ca01f6d4bba09933cb6310193aa ]
There are mis-match at counting inflight IO after changing the multipath policy.
For example, we started fio test with round-robin policy and then we changed the policy to min-inflight. IOs created under the RR policy is finished under the min-inflight policy and inflight counter only decreased. So the counter would be negative value. And also we started fio test with min-inflight policy and changed the policy to the round-robin. IOs created under the min-inflight policy increased the inflight IO counter but the inflight IO counter was not decreased because the policy was the round-robin when IO was finished.
So it should count IOs only if the IO is created under the min-inflight policy. It should not care the policy when the IO is finished.
This patch adds a field mp_policy in struct rtrs_clt_io_req and stores the multipath policy when an object of rtrs_clt_io_req is created. Then rtrs-clt checks the mp_policy of only struct rtrs_clt_io_req instead of the struct rtrs_clt.
Link: https://lore.kernel.org/r/20210806112112.124313-6-haris.iqbal@ionos.com Signed-off-by: Gioh Kim gi-oh.kim@ionos.com Signed-off-by: Jack Wang jinpu.wang@ionos.com Reviewed-by: Leon Romanovsky leonro@nvidia.com Signed-off-by: Md Haris Iqbal haris.iqbal@ionos.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/ulp/rtrs/rtrs-clt-stats.c | 2 +- drivers/infiniband/ulp/rtrs/rtrs-clt.c | 7 ++++--- drivers/infiniband/ulp/rtrs/rtrs-clt.h | 1 + 3 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt-stats.c b/drivers/infiniband/ulp/rtrs/rtrs-clt-stats.c index 26bbe5d6dff5..553e173975fb 100644 --- a/drivers/infiniband/ulp/rtrs/rtrs-clt-stats.c +++ b/drivers/infiniband/ulp/rtrs/rtrs-clt-stats.c @@ -180,7 +180,7 @@ void rtrs_clt_update_all_stats(struct rtrs_clt_io_req *req, int dir)
len = req->usr_len + req->data_len; rtrs_clt_update_rdma_stats(stats, len, dir); - if (sess->clt->mp_policy == MP_POLICY_MIN_INFLIGHT) + if (req->mp_policy == MP_POLICY_MIN_INFLIGHT) atomic_inc(&stats->inflight); }
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c index f2c40e50f25e..3b3bc77d02cc 100644 --- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c +++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c @@ -438,7 +438,7 @@ static void complete_rdma_req(struct rtrs_clt_io_req *req, int errno, } if (!refcount_dec_and_test(&req->ref)) return; - if (sess->clt->mp_policy == MP_POLICY_MIN_INFLIGHT) + if (req->mp_policy == MP_POLICY_MIN_INFLIGHT) atomic_dec(&sess->stats->inflight);
req->in_use = false; @@ -963,6 +963,7 @@ static void rtrs_clt_init_req(struct rtrs_clt_io_req *req, req->need_inv_comp = false; req->inv_errno = 0; refcount_set(&req->ref, 1); + req->mp_policy = sess->clt->mp_policy;
iov_iter_kvec(&iter, READ, vec, 1, usr_len); len = _copy_from_iter(req->iu->buf, usr_len, &iter); @@ -1153,7 +1154,7 @@ static int rtrs_clt_write_req(struct rtrs_clt_io_req *req) "Write request failed: error=%d path=%s [%s:%u]\n", ret, kobject_name(&sess->kobj), sess->hca_name, sess->hca_port); - if (sess->clt->mp_policy == MP_POLICY_MIN_INFLIGHT) + if (req->mp_policy == MP_POLICY_MIN_INFLIGHT) atomic_dec(&sess->stats->inflight); if (req->sg_cnt) ib_dma_unmap_sg(sess->s.dev->ib_dev, req->sglist, @@ -1259,7 +1260,7 @@ static int rtrs_clt_read_req(struct rtrs_clt_io_req *req) "Read request failed: error=%d path=%s [%s:%u]\n", ret, kobject_name(&sess->kobj), sess->hca_name, sess->hca_port); - if (sess->clt->mp_policy == MP_POLICY_MIN_INFLIGHT) + if (req->mp_policy == MP_POLICY_MIN_INFLIGHT) atomic_dec(&sess->stats->inflight); req->need_inv = false; if (req->sg_cnt) diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.h b/drivers/infiniband/ulp/rtrs/rtrs-clt.h index e276a2dfcf7c..12eaea44c1f9 100644 --- a/drivers/infiniband/ulp/rtrs/rtrs-clt.h +++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.h @@ -102,6 +102,7 @@ struct rtrs_clt_io_req { unsigned int usr_len; void *priv; bool in_use; + enum rtrs_mp_policy mp_policy; struct rtrs_clt_con *con; struct rtrs_sg_desc *desc; struct ib_sge *sge;
From: Arun Easi aeasi@marvell.com
[ Upstream commit 310e69edfbd57995868a428eeddea09a7b5d2749 ]
The following hung task call trace was seen:
[ 1230.183294] INFO: task qla2xxx_wq:523 blocked for more than 120 seconds. [ 1230.197749] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 1230.205585] qla2xxx_wq D 0 523 2 0x80004000 [ 1230.205636] Workqueue: qla2xxx_wq qlt_free_session_done [qla2xxx] [ 1230.205639] Call Trace: [ 1230.208100] __schedule+0x2c4/0x700 [ 1230.211607] schedule+0x38/0xa0 [ 1230.214769] schedule_timeout+0x246/0x2f0 [ 1230.222651] wait_for_completion+0x97/0x100 [ 1230.226921] qlt_free_session_done+0x6a0/0x6f0 [qla2xxx] [ 1230.232254] process_one_work+0x1a7/0x360
...when device side port resets were done.
Abort threads were getting out without processing due to the "deleted" flag check. The delete thread, meanwhile, could not proceed with a logout (that would have cleared out pending requests) as the logout IOCB work was not progressing. It appears like the hung qlt_free_session_done() thread is causing the ha->wq works on hold. The qlt_free_session_done() was hung waiting for nvme_fc_unregister_remoteport() + localport_delete cb to be complete, which would only happen when all I/Os are released.
Fix this by allowing abort to progress until device delete is completely done. This should make the qlt_free_session_done() proceed without hang and thus clear up the deadlock.
Link: https://lore.kernel.org/r/20210817051315.2477-5-njavali@marvell.com Signed-off-by: Arun Easi aeasi@marvell.com Signed-off-by: Nilesh Javali njavali@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_nvme.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/qla2xxx/qla_nvme.c b/drivers/scsi/qla2xxx/qla_nvme.c index 3e5c70a1d969..6e1abfc1be41 100644 --- a/drivers/scsi/qla2xxx/qla_nvme.c +++ b/drivers/scsi/qla2xxx/qla_nvme.c @@ -227,7 +227,7 @@ static void qla_nvme_abort_work(struct work_struct *work) "%s called for sp=%p, hndl=%x on fcport=%p deleted=%d\n", __func__, sp, sp->handle, fcport, fcport->deleted);
- if (!ha->flags.fw_started || fcport->deleted) + if (!ha->flags.fw_started || fcport->deleted == QLA_SESS_DELETED) goto out;
if (ha->flags.host_shutting_down) {
From: Quinn Tran qutran@marvell.com
[ Upstream commit f6e327fc09e48271c103efb3b69fc4ccda3f408b ]
Currently driver saves the personality type (FCP|NVMe) at the start of first discovery of the remote device. If the remote device personality do change over time, then qla driver needs to present that to user to decide.
Link: https://lore.kernel.org/r/20210817051315.2477-8-njavali@marvell.com Signed-off-by: Quinn Tran qutran@marvell.com Signed-off-by: Nilesh Javali njavali@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_gs.c | 1 + drivers/scsi/qla2xxx/qla_init.c | 5 +++-- 2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/qla2xxx/qla_gs.c b/drivers/scsi/qla2xxx/qla_gs.c index 5b6e04a91a18..f6ed1d62e16b 100644 --- a/drivers/scsi/qla2xxx/qla_gs.c +++ b/drivers/scsi/qla2xxx/qla_gs.c @@ -3498,6 +3498,7 @@ void qla24xx_async_gnnft_done(scsi_qla_host_t *vha, srb_t *sp) continue; fcport->scan_state = QLA_FCPORT_FOUND; fcport->last_rscn_gen = fcport->rscn_gen; + fcport->fc4_type = rp->fc4type; found = true; /* * If device was not a fabric device before. diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c index 27d10524ec3e..3ddc198b5c5c 100644 --- a/drivers/scsi/qla2xxx/qla_init.c +++ b/drivers/scsi/qla2xxx/qla_init.c @@ -1538,11 +1538,12 @@ int qla24xx_fcport_handle_login(struct scsi_qla_host *vha, fc_port_t *fcport) u16 sec;
ql_dbg(ql_dbg_disc, vha, 0x20d8, - "%s %8phC DS %d LS %d P %d fl %x confl %p rscn %d|%d login %d lid %d scan %d\n", + "%s %8phC DS %d LS %d P %d fl %x confl %p rscn %d|%d login %d lid %d scan %d fc4type %x\n", __func__, fcport->port_name, fcport->disc_state, fcport->fw_login_state, fcport->login_pause, fcport->flags, fcport->conflict, fcport->last_rscn_gen, fcport->rscn_gen, - fcport->login_gen, fcport->loop_id, fcport->scan_state); + fcport->login_gen, fcport->loop_id, fcport->scan_state, + fcport->fc4_type);
if (fcport->scan_state != QLA_FCPORT_FOUND) return 0;
From: Arun Easi aeasi@marvell.com
[ Upstream commit 2cabf10dbbe380e2ef27a69ce2059bcab7c8b419 ]
The abort callback gets called only when it gets posted to firmware. The refcounting is done properly in the callback. On internal errors, the callback is not invoked leading to a hung I/O. Fix this by having separate error code when command gets returned from firmware.
Link: https://lore.kernel.org/r/20210817051315.2477-9-njavali@marvell.com Signed-off-by: Arun Easi aeasi@marvell.com Signed-off-by: Nilesh Javali njavali@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_def.h | 3 +++ drivers/scsi/qla2xxx/qla_init.c | 6 +++--- drivers/scsi/qla2xxx/qla_mbx.c | 4 ++-- drivers/scsi/qla2xxx/qla_nvme.c | 26 +++++++++++++++++--------- 4 files changed, 25 insertions(+), 14 deletions(-)
diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h index 2da1d4eccbaa..db3daba500ee 100644 --- a/drivers/scsi/qla2xxx/qla_def.h +++ b/drivers/scsi/qla2xxx/qla_def.h @@ -5056,6 +5056,9 @@ struct secure_flash_update_block_pk { #define QLA_BUSY 0x107 #define QLA_ALREADY_REGISTERED 0x109 #define QLA_OS_TIMER_EXPIRED 0x10a +#define QLA_ERR_NO_QPAIR 0x10b +#define QLA_ERR_NOT_FOUND 0x10c +#define QLA_ERR_FROM_FW 0x10d
#define NVRAM_DELAY() udelay(10)
diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c index 3ddc198b5c5c..7a9d5e4f15db 100644 --- a/drivers/scsi/qla2xxx/qla_init.c +++ b/drivers/scsi/qla2xxx/qla_init.c @@ -158,7 +158,7 @@ int qla24xx_async_abort_cmd(srb_t *cmd_sp, bool wait) sp = qla2xxx_get_qpair_sp(cmd_sp->vha, cmd_sp->qpair, cmd_sp->fcport, GFP_ATOMIC); if (!sp) - return rval; + return QLA_MEMORY_ALLOC_FAILED;
abt_iocb = &sp->u.iocb_cmd; sp->type = SRB_ABT_CMD; @@ -191,7 +191,7 @@ int qla24xx_async_abort_cmd(srb_t *cmd_sp, bool wait) if (wait) { wait_for_completion(&abt_iocb->u.abt.comp); rval = abt_iocb->u.abt.comp_status == CS_COMPLETE ? - QLA_SUCCESS : QLA_FUNCTION_FAILED; + QLA_SUCCESS : QLA_ERR_FROM_FW; sp->free(sp); }
@@ -1903,7 +1903,7 @@ qla24xx_async_abort_command(srb_t *sp)
if (handle == req->num_outstanding_cmds) { /* Command not found. */ - return QLA_FUNCTION_FAILED; + return QLA_ERR_NOT_FOUND; } if (sp->type == SRB_FXIOCB_DCMD) return qlafx00_fx_disc(vha, &vha->hw->mr.fcport, diff --git a/drivers/scsi/qla2xxx/qla_mbx.c b/drivers/scsi/qla2xxx/qla_mbx.c index 9f3ad8aa649c..f9992e010ed4 100644 --- a/drivers/scsi/qla2xxx/qla_mbx.c +++ b/drivers/scsi/qla2xxx/qla_mbx.c @@ -3231,7 +3231,7 @@ qla24xx_abort_command(srb_t *sp) if (sp->qpair) req = sp->qpair->req; else - return QLA_FUNCTION_FAILED; + return QLA_ERR_NO_QPAIR;
if (ql2xasynctmfenable) return qla24xx_async_abort_command(sp); @@ -3244,7 +3244,7 @@ qla24xx_abort_command(srb_t *sp) spin_unlock_irqrestore(qpair->qp_lock_ptr, flags); if (handle == req->num_outstanding_cmds) { /* Command not found. */ - return QLA_FUNCTION_FAILED; + return QLA_ERR_NOT_FOUND; }
abt = dma_pool_zalloc(ha->s_dma_pool, GFP_KERNEL, &abt_dma); diff --git a/drivers/scsi/qla2xxx/qla_nvme.c b/drivers/scsi/qla2xxx/qla_nvme.c index 6e1abfc1be41..5a95de7b5757 100644 --- a/drivers/scsi/qla2xxx/qla_nvme.c +++ b/drivers/scsi/qla2xxx/qla_nvme.c @@ -221,11 +221,11 @@ static void qla_nvme_abort_work(struct work_struct *work) srb_t *sp = priv->sp; fc_port_t *fcport = sp->fcport; struct qla_hw_data *ha = fcport->vha->hw; - int rval; + int rval, abts_done_called = 1;
ql_dbg(ql_dbg_io, fcport->vha, 0xffff, - "%s called for sp=%p, hndl=%x on fcport=%p deleted=%d\n", - __func__, sp, sp->handle, fcport, fcport->deleted); + "%s called for sp=%p, hndl=%x on fcport=%p desc=%p deleted=%d\n", + __func__, sp, sp->handle, fcport, sp->u.iocb_cmd.u.nvme.desc, fcport->deleted);
if (!ha->flags.fw_started || fcport->deleted == QLA_SESS_DELETED) goto out; @@ -245,12 +245,20 @@ static void qla_nvme_abort_work(struct work_struct *work) __func__, (rval != QLA_SUCCESS) ? "Failed to abort" : "Aborted", sp, sp->handle, fcport, rval);
+ /* + * If async tmf is enabled, the abort callback is called only on + * return codes QLA_SUCCESS and QLA_ERR_FROM_FW. + */ + if (ql2xasynctmfenable && + rval != QLA_SUCCESS && rval != QLA_ERR_FROM_FW) + abts_done_called = 0; + /* * Returned before decreasing kref so that I/O requests * are waited until ABTS complete. This kref is decreased * at qla24xx_abort_sp_done function. */ - if (ql2xabts_wait_nvme && QLA_ABTS_WAIT_ENABLED(sp)) + if (abts_done_called && ql2xabts_wait_nvme && QLA_ABTS_WAIT_ENABLED(sp)) return; out: /* kref_get was done before work was schedule. */ @@ -803,14 +811,14 @@ void qla_nvme_abort_process_comp_status(struct abort_entry_24xx *abt, srb_t *ori case CS_PORT_LOGGED_OUT: /* BA_RJT was received for the ABTS */ case CS_PORT_CONFIG_CHG: - ql_dbg(ql_dbg_async + ql_dbg_mbx, vha, 0xf09d, + ql_dbg(ql_dbg_async, vha, 0xf09d, "Abort I/O IOCB completed with error, comp_status=%x\n", comp_status); break;
/* BA_RJT was received for the ABTS */ case CS_REJECT_RECEIVED: - ql_dbg(ql_dbg_async + ql_dbg_mbx, vha, 0xf09e, + ql_dbg(ql_dbg_async, vha, 0xf09e, "BA_RJT was received for the ABTS rjt_vendorUnique = %u", abt->fw.ba_rjt_vendorUnique); ql_dbg(ql_dbg_async + ql_dbg_mbx, vha, 0xf09e, @@ -819,18 +827,18 @@ void qla_nvme_abort_process_comp_status(struct abort_entry_24xx *abt, srb_t *ori break;
case CS_COMPLETE: - ql_dbg(ql_dbg_async + ql_dbg_mbx, vha, 0xf09f, + ql_dbg(ql_dbg_async + ql_dbg_verbose, vha, 0xf09f, "IOCB request is completed successfully comp_status=%x\n", comp_status); break;
case CS_IOCB_ERROR: - ql_dbg(ql_dbg_async + ql_dbg_mbx, vha, 0xf0a0, + ql_dbg(ql_dbg_async, vha, 0xf0a0, "IOCB request is failed, comp_status=%x\n", comp_status); break;
default: - ql_dbg(ql_dbg_async + ql_dbg_mbx, vha, 0xf0a1, + ql_dbg(ql_dbg_async, vha, 0xf0a1, "Invalid Abort IO IOCB Completion Status %x\n", comp_status); break;
From: Quinn Tran qutran@marvell.com
[ Upstream commit 7a8ff7d9854a1727435557184c8255bbbca60920 ]
When Target port transitions personality from one to another (NVMe <--> FCP), there could be some overlap of the two where one layer is going down while the other layer is coming up. This overlap can cause temporary I/O error. Detect those errors/transitions and recover from them. Triggers session tear down and allow relogin to re-drive the connection under the following conditions:
- NVMe command error
- On PRLO + N2N (rida format 2)
Link: https://lore.kernel.org/r/20210817051315.2477-11-njavali@marvell.com Signed-off-by: Quinn Tran qutran@marvell.com Signed-off-by: Nilesh Javali njavali@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_attr.c | 12 +++++++----- drivers/scsi/qla2xxx/qla_isr.c | 9 +++++++++ drivers/scsi/qla2xxx/qla_mbx.c | 10 ++++++++++ 3 files changed, 26 insertions(+), 5 deletions(-)
diff --git a/drivers/scsi/qla2xxx/qla_attr.c b/drivers/scsi/qla2xxx/qla_attr.c index 3aa9869f6fae..0708dbdfa2cc 100644 --- a/drivers/scsi/qla2xxx/qla_attr.c +++ b/drivers/scsi/qla2xxx/qla_attr.c @@ -2706,12 +2706,14 @@ qla2x00_terminate_rport_io(struct fc_rport *rport) * final cleanup of firmware resources (PCBs and XCBs). */ if (fcport->loop_id != FC_NO_LOOP_ID) { - if (IS_FWI2_CAPABLE(fcport->vha->hw)) - fcport->vha->hw->isp_ops->fabric_logout(fcport->vha, - fcport->loop_id, fcport->d_id.b.domain, - fcport->d_id.b.area, fcport->d_id.b.al_pa); - else + if (IS_FWI2_CAPABLE(fcport->vha->hw)) { + if (fcport->loop_id != FC_NO_LOOP_ID) + fcport->logout_on_delete = 1; + + qlt_schedule_sess_for_deletion(fcport); + } else { qla2x00_port_logout(fcport->vha, fcport); + } } }
diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c index d9fb093a60a1..6d47f4a93eff 100644 --- a/drivers/scsi/qla2xxx/qla_isr.c +++ b/drivers/scsi/qla2xxx/qla_isr.c @@ -2417,6 +2417,15 @@ static void qla24xx_nvme_iocb_entry(scsi_qla_host_t *vha, struct req_que *req, case CS_PORT_UNAVAILABLE: case CS_PORT_LOGGED_OUT: fcport->nvme_flag |= NVME_FLAG_RESETTING; + if (atomic_read(&fcport->state) == FCS_ONLINE) { + ql_dbg(ql_dbg_disc, fcport->vha, 0x3021, + "Port to be marked lost on fcport=%06x, current " + "port state= %s comp_status %x.\n", + fcport->d_id.b24, port_state_str[FCS_ONLINE], + comp_status); + + qlt_schedule_sess_for_deletion(fcport); + } fallthrough; case CS_ABORTED: case CS_PORT_BUSY: diff --git a/drivers/scsi/qla2xxx/qla_mbx.c b/drivers/scsi/qla2xxx/qla_mbx.c index f9992e010ed4..2c1b8d590162 100644 --- a/drivers/scsi/qla2xxx/qla_mbx.c +++ b/drivers/scsi/qla2xxx/qla_mbx.c @@ -4172,6 +4172,16 @@ qla24xx_report_id_acquisition(scsi_qla_host_t *vha, rptid_entry->u.f2.remote_nport_id[1]; fcport->d_id.b.al_pa = rptid_entry->u.f2.remote_nport_id[0]; + + /* + * For the case where remote port sending PRLO, FW + * sends up RIDA Format 2 as an indication of session + * loss. In other word, FW state change from PRLI + * complete back to PLOGI complete. Delete the + * session and let relogin drive the reconnect. + */ + if (atomic_read(&fcport->state) == FCS_ONLINE) + qlt_schedule_sess_for_deletion(fcport); } } }
From: Masahiro Yamada masahiroy@kernel.org
[ Upstream commit 98079418c53fff5f9e2d4087f08eaff2a9ce7714 ]
Add FORCE so that if_changed can detect the command line change. scsi_devinfo_tbl.c must be added to 'targets' too.
Link: https://lore.kernel.org/r/20210819012339.709409-1-masahiroy@kernel.org Signed-off-by: Masahiro Yamada masahiroy@kernel.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/Makefile | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile index 1748d1ec1338..77812ba3c4aa 100644 --- a/drivers/scsi/Makefile +++ b/drivers/scsi/Makefile @@ -183,7 +183,7 @@ CFLAGS_ncr53c8xx.o := $(ncr53c8xx-flags-y) $(ncr53c8xx-flags-m) zalon7xx-objs := zalon.o ncr53c8xx.o
# Files generated that shall be removed upon make clean -clean-files := 53c700_d.h 53c700_u.h scsi_devinfo_tbl.c +clean-files := 53c700_d.h 53c700_u.h
$(obj)/53c700.o: $(obj)/53c700_d.h
@@ -192,9 +192,11 @@ $(obj)/scsi_sysfs.o: $(obj)/scsi_devinfo_tbl.c quiet_cmd_bflags = GEN $@ cmd_bflags = sed -n 's/.*define *BLIST_([A-Z0-9_]*) *.*/BLIST_FLAG_NAME(\1),/p' $< > $@
-$(obj)/scsi_devinfo_tbl.c: include/scsi/scsi_devinfo.h +$(obj)/scsi_devinfo_tbl.c: include/scsi/scsi_devinfo.h FORCE $(call if_changed,bflags)
+targets += scsi_devinfo_tbl.c + # If you want to play with the firmware, uncomment # GENERATE_FIRMWARE := 1
From: Anthony Yznaga anthony.yznaga@oracle.com
[ Upstream commit ffc95d1b8edb80d2dab77f2e8a823c8d93b06419 ]
vfio_find_dma_valid is defined to return WAITED on success if it was necessary to wait. However, the loop forgets the WAITED value returned by vfio_wait() and returns 0 in a later iteration. Fix it.
Signed-off-by: Anthony Yznaga anthony.yznaga@oracle.com Reviewed-by: Steve Sistare steven.sistare@oracle.com Link: https://lore.kernel.org/r/1629736550-2388-1-git-send-email-anthony.yznaga@or... Signed-off-by: Alex Williamson alex.williamson@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/vfio/vfio_iommu_type1.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index 0b4f7c174c7a..0e9217687f5c 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -612,17 +612,17 @@ static int vfio_wait(struct vfio_iommu *iommu) static int vfio_find_dma_valid(struct vfio_iommu *iommu, dma_addr_t start, size_t size, struct vfio_dma **dma_p) { - int ret; + int ret = 0;
do { *dma_p = vfio_find_dma(iommu, start, size); if (!*dma_p) - ret = -EINVAL; + return -EINVAL; else if (!(*dma_p)->vaddr_invalid) - ret = 0; + return ret; else ret = vfio_wait(iommu); - } while (ret > 0); + } while (ret == WAITED);
return ret; }
From: Kashyap Desai kashyap.desai@broadcom.com
[ Upstream commit 0da66348c26ffde19d69ed7770514d202afde222 ]
Driver is not setting up IRQs in the resume path. As a result, hibernation path is broken and controller will not be operational after system is resumed.
Set up IRQs to handle the hibernation case.
Link: https://lore.kernel.org/r/20210818081755.1274470-1-kashyap.desai@broadcom.co... Cc: sathya.prakash@broadcom.com Cc: thenzl@redhat.com Reported-by: Marco Patalano mpatalan@redhat.com Tested-by: Marco Patalano mpatalan@redhat.com Signed-off-by: Kashyap Desai kashyap.desai@broadcom.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/mpi3mr/mpi3mr.h | 19 +++++++++++++++-- drivers/scsi/mpi3mr/mpi3mr_fw.c | 37 ++++++++++++++++++--------------- drivers/scsi/mpi3mr/mpi3mr_os.c | 13 ++++++------ 3 files changed, 44 insertions(+), 25 deletions(-)
diff --git a/drivers/scsi/mpi3mr/mpi3mr.h b/drivers/scsi/mpi3mr/mpi3mr.h index 6f5dc9e78553..9787b53a2b59 100644 --- a/drivers/scsi/mpi3mr/mpi3mr.h +++ b/drivers/scsi/mpi3mr/mpi3mr.h @@ -183,6 +183,20 @@ enum mpi3mr_iocstate { MRIOC_STATE_UNRECOVERABLE, };
+/* Init type definitions */ +enum mpi3mr_init_type { + MPI3MR_IT_INIT = 0, + MPI3MR_IT_RESET, + MPI3MR_IT_RESUME, +}; + +/* Cleanup reason definitions */ +enum mpi3mr_cleanup_reason { + MPI3MR_COMPLETE_CLEANUP = 0, + MPI3MR_REINIT_FAILURE, + MPI3MR_SUSPEND, +}; + /* Reset reason code definitions*/ enum mpi3mr_reset_reason { MPI3MR_RESET_FROM_BRINGUP = 1, @@ -855,8 +869,8 @@ struct delayed_dev_rmhs_node {
int mpi3mr_setup_resources(struct mpi3mr_ioc *mrioc); void mpi3mr_cleanup_resources(struct mpi3mr_ioc *mrioc); -int mpi3mr_init_ioc(struct mpi3mr_ioc *mrioc, u8 re_init); -void mpi3mr_cleanup_ioc(struct mpi3mr_ioc *mrioc, u8 re_init); +int mpi3mr_init_ioc(struct mpi3mr_ioc *mrioc, u8 init_type); +void mpi3mr_cleanup_ioc(struct mpi3mr_ioc *mrioc, u8 reason); int mpi3mr_issue_port_enable(struct mpi3mr_ioc *mrioc, u8 async); int mpi3mr_admin_request_post(struct mpi3mr_ioc *mrioc, void *admin_req, u16 admin_req_sz, u8 ignore_reset); @@ -872,6 +886,7 @@ void *mpi3mr_get_reply_virt_addr(struct mpi3mr_ioc *mrioc, void mpi3mr_repost_sense_buf(struct mpi3mr_ioc *mrioc, u64 sense_buf_dma);
+void mpi3mr_memset_buffers(struct mpi3mr_ioc *mrioc); void mpi3mr_os_handle_events(struct mpi3mr_ioc *mrioc, struct mpi3_event_notification_reply *event_reply); void mpi3mr_process_op_reply_desc(struct mpi3mr_ioc *mrioc, diff --git a/drivers/scsi/mpi3mr/mpi3mr_fw.c b/drivers/scsi/mpi3mr/mpi3mr_fw.c index 2dba2b0af166..4a8316c6bd41 100644 --- a/drivers/scsi/mpi3mr/mpi3mr_fw.c +++ b/drivers/scsi/mpi3mr/mpi3mr_fw.c @@ -3205,7 +3205,7 @@ int mpi3mr_setup_resources(struct mpi3mr_ioc *mrioc) /** * mpi3mr_init_ioc - Initialize the controller * @mrioc: Adapter instance reference - * @re_init: Flag to indicate is this fresh init or re-init + * @init_type: Flag to indicate is the init_type * * This the controller initialization routine, executed either * after soft reset or from pci probe callback. @@ -3218,7 +3218,7 @@ int mpi3mr_setup_resources(struct mpi3mr_ioc *mrioc) * * Return: 0 on success and non-zero on failure. */ -int mpi3mr_init_ioc(struct mpi3mr_ioc *mrioc, u8 re_init) +int mpi3mr_init_ioc(struct mpi3mr_ioc *mrioc, u8 init_type) { int retval = 0; enum mpi3mr_iocstate ioc_state; @@ -3229,7 +3229,7 @@ int mpi3mr_init_ioc(struct mpi3mr_ioc *mrioc, u8 re_init)
mrioc->irqpoll_sleep = MPI3MR_IRQ_POLL_SLEEP; mrioc->change_count = 0; - if (!re_init) { + if (init_type == MPI3MR_IT_INIT) { mrioc->cpu_count = num_online_cpus(); retval = mpi3mr_setup_resources(mrioc); if (retval) { @@ -3314,7 +3314,7 @@ int mpi3mr_init_ioc(struct mpi3mr_ioc *mrioc, u8 re_init) goto out_failed; }
- if (!re_init) { + if (init_type != MPI3MR_IT_RESET) { retval = mpi3mr_setup_isr(mrioc, 1); if (retval) { ioc_err(mrioc, "Failed to setup ISR error %d\n", @@ -3332,7 +3332,7 @@ int mpi3mr_init_ioc(struct mpi3mr_ioc *mrioc, u8 re_init) }
mpi3mr_process_factsdata(mrioc, &facts_data); - if (!re_init) { + if (init_type == MPI3MR_IT_INIT) { retval = mpi3mr_check_reset_dma_mask(mrioc); if (retval) { ioc_err(mrioc, "Resetting dma mask failed %d\n", @@ -3351,7 +3351,7 @@ int mpi3mr_init_ioc(struct mpi3mr_ioc *mrioc, u8 re_init) goto out_failed; }
- if (!re_init) { + if (init_type == MPI3MR_IT_INIT) { retval = mpi3mr_alloc_chain_bufs(mrioc); if (retval) { ioc_err(mrioc, "Failed to allocated chain buffers %d\n", @@ -3374,7 +3374,7 @@ int mpi3mr_init_ioc(struct mpi3mr_ioc *mrioc, u8 re_init) writel(mrioc->sbq_host_index, &mrioc->sysif_regs->sense_buffer_free_host_index);
- if (!re_init) { + if (init_type != MPI3MR_IT_RESET) { retval = mpi3mr_setup_isr(mrioc, 0); if (retval) { ioc_err(mrioc, "Failed to re-setup ISR, error %d\n", @@ -3390,7 +3390,7 @@ int mpi3mr_init_ioc(struct mpi3mr_ioc *mrioc, u8 re_init) goto out_failed; }
- if (re_init && + if ((init_type != MPI3MR_IT_INIT) && (mrioc->shost->nr_hw_queues > mrioc->num_op_reply_q)) { retval = -1; ioc_err(mrioc, @@ -3422,7 +3422,7 @@ int mpi3mr_init_ioc(struct mpi3mr_ioc *mrioc, u8 re_init) goto out_failed; }
- if (re_init) { + if (init_type != MPI3MR_IT_INIT) { ioc_info(mrioc, "Issuing Port Enable\n"); retval = mpi3mr_issue_port_enable(mrioc, 0); if (retval) { @@ -3434,7 +3434,10 @@ int mpi3mr_init_ioc(struct mpi3mr_ioc *mrioc, u8 re_init) return retval;
out_failed: - mpi3mr_cleanup_ioc(mrioc, re_init); + if (init_type == MPI3MR_IT_INIT) + mpi3mr_cleanup_ioc(mrioc, MPI3MR_COMPLETE_CLEANUP); + else + mpi3mr_cleanup_ioc(mrioc, MPI3MR_REINIT_FAILURE); out_nocleanup: return retval; } @@ -3495,7 +3498,7 @@ static void mpi3mr_memset_op_req_q_buffers(struct mpi3mr_ioc *mrioc, u16 qidx) * * Return: Nothing. */ -static void mpi3mr_memset_buffers(struct mpi3mr_ioc *mrioc) +void mpi3mr_memset_buffers(struct mpi3mr_ioc *mrioc) { u16 i;
@@ -3710,7 +3713,7 @@ static void mpi3mr_issue_ioc_shutdown(struct mpi3mr_ioc *mrioc) /** * mpi3mr_cleanup_ioc - Cleanup controller * @mrioc: Adapter instance reference - * @re_init: Cleanup due to a reinit or not + * @reason: Cleanup reason * * controller cleanup handler, Message unit reset or soft reset * and shutdown notification is issued to the controller and the @@ -3718,11 +3721,11 @@ static void mpi3mr_issue_ioc_shutdown(struct mpi3mr_ioc *mrioc) * * Return: Nothing. */ -void mpi3mr_cleanup_ioc(struct mpi3mr_ioc *mrioc, u8 re_init) +void mpi3mr_cleanup_ioc(struct mpi3mr_ioc *mrioc, u8 reason) { enum mpi3mr_iocstate ioc_state;
- if (!re_init) + if (reason == MPI3MR_COMPLETE_CLEANUP) mpi3mr_stop_watchdog(mrioc);
mpi3mr_ioc_disable_intr(mrioc); @@ -3737,11 +3740,11 @@ void mpi3mr_cleanup_ioc(struct mpi3mr_ioc *mrioc, u8 re_init) MPI3_SYSIF_HOST_DIAG_RESET_ACTION_SOFT_RESET, MPI3MR_RESET_FROM_MUR_FAILURE);
- if (!re_init) + if (reason != MPI3MR_REINIT_FAILURE) mpi3mr_issue_ioc_shutdown(mrioc); }
- if (!re_init) { + if (reason == MPI3MR_COMPLETE_CLEANUP) { mpi3mr_free_mem(mrioc); mpi3mr_cleanup_resources(mrioc); } @@ -3923,7 +3926,7 @@ int mpi3mr_soft_reset_handler(struct mpi3mr_ioc *mrioc, mpi3mr_flush_host_io(mrioc); mpi3mr_invalidate_devhandles(mrioc); mpi3mr_memset_buffers(mrioc); - retval = mpi3mr_init_ioc(mrioc, 1); + retval = mpi3mr_init_ioc(mrioc, MPI3MR_IT_RESET); if (retval) { pr_err(IOCNAME "reinit after soft reset failed: reason %d\n", mrioc->name, reset_reason); diff --git a/drivers/scsi/mpi3mr/mpi3mr_os.c b/drivers/scsi/mpi3mr/mpi3mr_os.c index 24ac7ddec749..b092fc52d884 100644 --- a/drivers/scsi/mpi3mr/mpi3mr_os.c +++ b/drivers/scsi/mpi3mr/mpi3mr_os.c @@ -3795,7 +3795,7 @@ mpi3mr_probe(struct pci_dev *pdev, const struct pci_device_id *id) }
mrioc->is_driver_loading = 1; - if (mpi3mr_init_ioc(mrioc, 0)) { + if (mpi3mr_init_ioc(mrioc, MPI3MR_IT_INIT)) { ioc_err(mrioc, "failure at %s:%d/%s()!\n", __FILE__, __LINE__, __func__); retval = -ENODEV; @@ -3818,7 +3818,7 @@ mpi3mr_probe(struct pci_dev *pdev, const struct pci_device_id *id) return retval;
addhost_failed: - mpi3mr_cleanup_ioc(mrioc, 0); + mpi3mr_cleanup_ioc(mrioc, MPI3MR_COMPLETE_CLEANUP); out_iocinit_failed: destroy_workqueue(mrioc->fwevt_worker_thread); out_fwevtthread_failed: @@ -3870,7 +3870,7 @@ static void mpi3mr_remove(struct pci_dev *pdev) mpi3mr_tgtdev_del_from_list(mrioc, tgtdev); mpi3mr_tgtdev_put(tgtdev); } - mpi3mr_cleanup_ioc(mrioc, 0); + mpi3mr_cleanup_ioc(mrioc, MPI3MR_COMPLETE_CLEANUP);
spin_lock(&mrioc_list_lock); list_del(&mrioc->list); @@ -3910,7 +3910,7 @@ static void mpi3mr_shutdown(struct pci_dev *pdev) spin_unlock_irqrestore(&mrioc->fwevt_lock, flags); if (wq) destroy_workqueue(wq); - mpi3mr_cleanup_ioc(mrioc, 0); + mpi3mr_cleanup_ioc(mrioc, MPI3MR_COMPLETE_CLEANUP); }
#ifdef CONFIG_PM @@ -3940,7 +3940,7 @@ static int mpi3mr_suspend(struct pci_dev *pdev, pm_message_t state) mpi3mr_cleanup_fwevt_list(mrioc); scsi_block_requests(shost); mpi3mr_stop_watchdog(mrioc); - mpi3mr_cleanup_ioc(mrioc, 1); + mpi3mr_cleanup_ioc(mrioc, MPI3MR_SUSPEND);
device_state = pci_choose_state(pdev, state); ioc_info(mrioc, "pdev=0x%p, slot=%s, entering operating state [D%d]\n", @@ -3988,7 +3988,8 @@ static int mpi3mr_resume(struct pci_dev *pdev) }
mrioc->stop_drv_processing = 0; - mpi3mr_init_ioc(mrioc, 1); + mpi3mr_memset_buffers(mrioc); + mpi3mr_init_ioc(mrioc, MPI3MR_IT_RESUME); scsi_unblock_requests(shost); mpi3mr_start_watchdog(mrioc);
From: Adrian Hunter adrian.hunter@intel.com
[ Upstream commit 9b5ac8ab4e8bf5636d1d425aee68ddf45af12057 ]
Samsung KLUFG8RHDA-B2D1 does not clear the unit attention condition if the length is zero. So go back to requesting all the sense data, as it was before patch "scsi: ufs: Request sense data asynchronously". That is simpler than creating and maintaining a quirk for affected devices.
Link: https://lore.kernel.org/r/20210824114150.2105-1-adrian.hunter@intel.com Signed-off-by: Adrian Hunter adrian.hunter@intel.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/ufs/ufshcd.c | 36 +++++++++++++++++++++++++++++------- 1 file changed, 29 insertions(+), 7 deletions(-)
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 52731dffa624..379d2ab6c557 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -7914,7 +7914,8 @@ static int ufshcd_add_lus(struct ufs_hba *hba) static void ufshcd_request_sense_done(struct request *rq, blk_status_t error) { if (error != BLK_STS_OK) - pr_err("%s: REQUEST SENSE failed (%d)", __func__, error); + pr_err("%s: REQUEST SENSE failed (%d)\n", __func__, error); + kfree(rq->end_io_data); blk_put_request(rq); }
@@ -7922,16 +7923,30 @@ static int ufshcd_request_sense_async(struct ufs_hba *hba, struct scsi_device *sdev) { /* - * From SPC-6: the REQUEST SENSE command with any allocation length - * clears the sense data. + * Some UFS devices clear unit attention condition only if the sense + * size used (UFS_SENSE_SIZE in this case) is non-zero. */ - static const u8 cmd[6] = {REQUEST_SENSE, 0, 0, 0, 0, 0}; + static const u8 cmd[6] = {REQUEST_SENSE, 0, 0, 0, UFS_SENSE_SIZE, 0}; struct scsi_request *rq; struct request *req; + char *buffer; + int ret;
- req = blk_get_request(sdev->request_queue, REQ_OP_DRV_IN, /*flags=*/0); - if (IS_ERR(req)) - return PTR_ERR(req); + buffer = kzalloc(UFS_SENSE_SIZE, GFP_KERNEL); + if (!buffer) + return -ENOMEM; + + req = blk_get_request(sdev->request_queue, REQ_OP_DRV_IN, + /*flags=*/BLK_MQ_REQ_PM); + if (IS_ERR(req)) { + ret = PTR_ERR(req); + goto out_free; + } + + ret = blk_rq_map_kern(sdev->request_queue, req, + buffer, UFS_SENSE_SIZE, GFP_NOIO); + if (ret) + goto out_put;
rq = scsi_req(req); rq->cmd_len = ARRAY_SIZE(cmd); @@ -7939,10 +7954,17 @@ ufshcd_request_sense_async(struct ufs_hba *hba, struct scsi_device *sdev) rq->retries = 3; req->timeout = 1 * HZ; req->rq_flags |= RQF_PM | RQF_QUIET; + req->end_io_data = buffer;
blk_execute_rq_nowait(/*bd_disk=*/NULL, req, /*at_head=*/true, ufshcd_request_sense_done); return 0; + +out_put: + blk_put_request(req); +out_free: + kfree(buffer); + return ret; }
static int ufshcd_clear_ua_wlun(struct ufs_hba *hba, u8 wlun)
From: Christophe Leroy christophe.leroy@csgroup.eu
[ Upstream commit 113ec9ccc8049c3772f0eab46b62c5d6654c09f7 ]
Copied from commit 89bbe4c798bc ("powerpc/64: indirect function call use bctrl rather than blrl in ret_from_kernel_thread")
blrl is not recommended to use as an indirect function call, as it may corrupt the link stack predictor.
This is not a performance critical path but this should be fixed for consistency.
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/91b1d242525307ceceec7ef6e832bfbacdd4501b.162943647... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/kernel/entry_32.S | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S index 0273a1349006..61fdd53cdd9a 100644 --- a/arch/powerpc/kernel/entry_32.S +++ b/arch/powerpc/kernel/entry_32.S @@ -161,10 +161,10 @@ ret_from_fork: ret_from_kernel_thread: REST_NVGPRS(r1) bl schedule_tail - mtlr r14 + mtctr r14 mr r3,r15 PPC440EP_ERR42 - blrl + bctrl li r3,0 b ret_from_syscall
From: Christophe Leroy christophe.leroy@csgroup.eu
[ Upstream commit f5007dbf4da729baa850b33a64dc3cc53757bdf8 ]
Use bcl 20,31,+4 instead of bl in order to preserve link stack.
See commit c974809a26a1 ("powerpc/vdso: Avoid link stack corruption in __get_datapage()") for details.
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/e9fbc285eceb720e6c0e032ef47fe8b05f669b48.162979175... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/include/asm/ppc_asm.h | 2 +- arch/powerpc/kernel/exceptions-64e.S | 6 +++--- arch/powerpc/kernel/fsl_booke_entry_mapping.S | 8 ++++---- arch/powerpc/kernel/head_44x.S | 6 +++--- arch/powerpc/kernel/head_fsl_booke.S | 6 +++--- arch/powerpc/mm/nohash/tlb_low.S | 4 ++-- 6 files changed, 16 insertions(+), 16 deletions(-)
diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h index 116c1519728a..d5d7b8f1b021 100644 --- a/arch/powerpc/include/asm/ppc_asm.h +++ b/arch/powerpc/include/asm/ppc_asm.h @@ -259,7 +259,7 @@ GLUE(.,name):
/* Be careful, this will clobber the lr register. */ #define LOAD_REG_ADDR_PIC(reg, name) \ - bl 0f; \ + bcl 20,31,$+4; \ 0: mflr reg; \ addis reg,reg,(name - 0b)@ha; \ addi reg,reg,(name - 0b)@l; diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S index 1401787b0b93..7e0943d9f9b0 100644 --- a/arch/powerpc/kernel/exceptions-64e.S +++ b/arch/powerpc/kernel/exceptions-64e.S @@ -1127,7 +1127,7 @@ found_iprot: * r3 = MAS0_TLBSEL (for the iprot array) * r4 = SPRN_TLBnCFG */ - bl invstr /* Find our address */ + bcl 20,31,$+4 /* Find our address */ invstr: mflr r6 /* Make it accessible */ mfmsr r7 rlwinm r5,r7,27,31,31 /* extract MSR[IS] */ @@ -1196,7 +1196,7 @@ skpinv: addi r6,r6,1 /* Increment */ mfmsr r6 xori r6,r6,MSR_IS mtspr SPRN_SRR1,r6 - bl 1f /* Find our address */ + bcl 20,31,$+4 /* Find our address */ 1: mflr r6 addi r6,r6,(2f - 1b) mtspr SPRN_SRR0,r6 @@ -1256,7 +1256,7 @@ skpinv: addi r6,r6,1 /* Increment */ * r4 = MAS0 w/TLBSEL & ESEL for the temp mapping */ /* Now we branch the new virtual address mapped by this entry */ - bl 1f /* Find our address */ + bcl 20,31,$+4 /* Find our address */ 1: mflr r6 addi r6,r6,(2f - 1b) tovirt(r6,r6) diff --git a/arch/powerpc/kernel/fsl_booke_entry_mapping.S b/arch/powerpc/kernel/fsl_booke_entry_mapping.S index 8bccce6544b5..dedc17fac8f8 100644 --- a/arch/powerpc/kernel/fsl_booke_entry_mapping.S +++ b/arch/powerpc/kernel/fsl_booke_entry_mapping.S @@ -1,7 +1,7 @@ /* SPDX-License-Identifier: GPL-2.0 */
/* 1. Find the index of the entry we're executing in */ - bl invstr /* Find our address */ + bcl 20,31,$+4 /* Find our address */ invstr: mflr r6 /* Make it accessible */ mfmsr r7 rlwinm r4,r7,27,31,31 /* extract MSR[IS] */ @@ -85,7 +85,7 @@ skpinv: addi r6,r6,1 /* Increment */ addi r6,r6,10 slw r6,r8,r6 /* convert to mask */
- bl 1f /* Find our address */ + bcl 20,31,$+4 /* Find our address */ 1: mflr r7
mfspr r8,SPRN_MAS3 @@ -117,7 +117,7 @@ skpinv: addi r6,r6,1 /* Increment */
xori r6,r4,1 slwi r6,r6,5 /* setup new context with other address space */ - bl 1f /* Find our address */ + bcl 20,31,$+4 /* Find our address */ 1: mflr r9 rlwimi r7,r9,0,20,31 addi r7,r7,(2f - 1b) @@ -207,7 +207,7 @@ next_tlb_setup:
lis r7,MSR_KERNEL@h ori r7,r7,MSR_KERNEL@l - bl 1f /* Find our address */ + bcl 20,31,$+4 /* Find our address */ 1: mflr r9 rlwimi r6,r9,0,20,31 addi r6,r6,(2f - 1b) diff --git a/arch/powerpc/kernel/head_44x.S b/arch/powerpc/kernel/head_44x.S index ddc978a2d381..02d2928d1e01 100644 --- a/arch/powerpc/kernel/head_44x.S +++ b/arch/powerpc/kernel/head_44x.S @@ -70,7 +70,7 @@ _ENTRY(_start); * address. * r21 will be loaded with the physical runtime address of _stext */ - bl 0f /* Get our runtime address */ + bcl 20,31,$+4 /* Get our runtime address */ 0: mflr r21 /* Make it accessible */ addis r21,r21,(_stext - 0b)@ha addi r21,r21,(_stext - 0b)@l /* Get our current runtime base */ @@ -853,7 +853,7 @@ _GLOBAL(init_cpu_state) wmmucr: mtspr SPRN_MMUCR,r3 /* Put MMUCR */ sync
- bl invstr /* Find our address */ + bcl 20,31,$+4 /* Find our address */ invstr: mflr r5 /* Make it accessible */ tlbsx r23,0,r5 /* Find entry we are in */ li r4,0 /* Start at TLB entry 0 */ @@ -1045,7 +1045,7 @@ head_start_47x: sync
/* Find the entry we are running from */ - bl 1f + bcl 20,31,$+4 1: mflr r23 tlbsx r23,0,r23 tlbre r24,r23,0 diff --git a/arch/powerpc/kernel/head_fsl_booke.S b/arch/powerpc/kernel/head_fsl_booke.S index 0f9642f36b49..dbf3b89e543c 100644 --- a/arch/powerpc/kernel/head_fsl_booke.S +++ b/arch/powerpc/kernel/head_fsl_booke.S @@ -79,7 +79,7 @@ _ENTRY(_start); mr r23,r3 mr r25,r4
- bl 0f + bcl 20,31,$+4 0: mflr r8 addis r3,r8,(is_second_reloc - 0b)@ha lwz r19,(is_second_reloc - 0b)@l(r3) @@ -1132,7 +1132,7 @@ _GLOBAL(switch_to_as1) bne 1b
/* Get the tlb entry used by the current running code */ - bl 0f + bcl 20,31,$+4 0: mflr r4 tlbsx 0,r4
@@ -1166,7 +1166,7 @@ _GLOBAL(switch_to_as1) _GLOBAL(restore_to_as0) mflr r0
- bl 0f + bcl 20,31,$+4 0: mflr r9 addi r9,r9,1f - 0b
diff --git a/arch/powerpc/mm/nohash/tlb_low.S b/arch/powerpc/mm/nohash/tlb_low.S index 4613bf8e9aae..5add4a51e51f 100644 --- a/arch/powerpc/mm/nohash/tlb_low.S +++ b/arch/powerpc/mm/nohash/tlb_low.S @@ -199,7 +199,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_476_DD2) * Touch enough instruction cache lines to ensure cache hits */ 1: mflr r9 - bl 2f + bcl 20,31,$+4 2: mflr r6 li r7,32 PPC_ICBT(0,R6,R7) /* touch next cache line */ @@ -414,7 +414,7 @@ _GLOBAL(loadcam_multi) * Set up temporary TLB entry that is the same as what we're * running from, but in AS=1. */ - bl 1f + bcl 20,31,$+4 1: mflr r6 tlbsx 0,r8 mfspr r6,SPRN_MAS1
From: Christophe Leroy christophe.leroy@csgroup.eu
[ Upstream commit 33e1402435cb9f3021439a15935ea2dc69ec1844 ]
bl;mflr is used at several places to get code position.
Use bcl 20,31,+4 instead of bl in order to preserve link stack.
See commit c974809a26a1 ("powerpc/vdso: Avoid link stack corruption in __get_datapage()") for details.
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/c6eabb4fb6c156f75d56dcbcc6f243e5ac0fba42.162979176... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/kernel/misc.S | 2 +- arch/powerpc/kernel/misc_32.S | 2 +- arch/powerpc/kernel/misc_64.S | 2 +- arch/powerpc/kernel/reloc_32.S | 2 +- arch/powerpc/kexec/relocate_32.S | 12 ++++++------ 5 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/arch/powerpc/kernel/misc.S b/arch/powerpc/kernel/misc.S index 5be96feccb55..fb7de3543c03 100644 --- a/arch/powerpc/kernel/misc.S +++ b/arch/powerpc/kernel/misc.S @@ -29,7 +29,7 @@ _GLOBAL(reloc_offset) li r3, 0 _GLOBAL(add_reloc_offset) mflr r0 - bl 1f + bcl 20,31,$+4 1: mflr r5 PPC_LL r4,(2f-1b)(r5) subf r5,r4,r5 diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S index 39ab15419592..921d4198bfaa 100644 --- a/arch/powerpc/kernel/misc_32.S +++ b/arch/powerpc/kernel/misc_32.S @@ -67,7 +67,7 @@ _GLOBAL(reloc_got2) srwi. r8,r8,2 beqlr mtctr r8 - bl 1f + bcl 20,31,$+4 1: mflr r0 lis r4,1b@ha addi r4,r4,1b@l diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S index 4b761a18a74d..d38a019b38e1 100644 --- a/arch/powerpc/kernel/misc_64.S +++ b/arch/powerpc/kernel/misc_64.S @@ -255,7 +255,7 @@ _GLOBAL(scom970_write) * Physical (hardware) cpu id should be in r3. */ _GLOBAL(kexec_wait) - bl 1f + bcl 20,31,$+4 1: mflr r5 addi r5,r5,kexec_flag-1b
diff --git a/arch/powerpc/kernel/reloc_32.S b/arch/powerpc/kernel/reloc_32.S index 10e96f3e22fe..0508c14b4c28 100644 --- a/arch/powerpc/kernel/reloc_32.S +++ b/arch/powerpc/kernel/reloc_32.S @@ -30,7 +30,7 @@ R_PPC_RELATIVE = 22 _GLOBAL(relocate)
mflr r0 /* Save our LR */ - bl 0f /* Find our current runtime address */ + bcl 20,31,$+4 /* Find our current runtime address */ 0: mflr r12 /* Make it accessible */ mtlr r0
diff --git a/arch/powerpc/kexec/relocate_32.S b/arch/powerpc/kexec/relocate_32.S index 61946c19e07c..cf6e52bdf8d8 100644 --- a/arch/powerpc/kexec/relocate_32.S +++ b/arch/powerpc/kexec/relocate_32.S @@ -93,7 +93,7 @@ wmmucr: * Invalidate all the TLB entries except the current entry * where we are running from */ - bl 0f /* Find our address */ + bcl 20,31,$+4 /* Find our address */ 0: mflr r5 /* Make it accessible */ tlbsx r23,0,r5 /* Find entry we are in */ li r4,0 /* Start at TLB entry 0 */ @@ -158,7 +158,7 @@ write_out: /* Switch to other address space in MSR */ insrwi r9, r7, 1, 26 /* Set MSR[IS] = r7 */
- bl 1f + bcl 20,31,$+4 1: mflr r8 addi r8, r8, (2f-1b) /* Find the target offset */
@@ -202,7 +202,7 @@ next_tlb: li r9,0 insrwi r9, r7, 1, 26 /* Set MSR[IS] = r7 */
- bl 1f + bcl 20,31,$+4 1: mflr r8 and r8, r8, r11 /* Get our offset within page */ addi r8, r8, (2f-1b) @@ -240,7 +240,7 @@ setup_map_47x: sync
/* Find the entry we are running from */ - bl 2f + bcl 20,31,$+4 2: mflr r23 tlbsx r23, 0, r23 tlbre r24, r23, 0 /* TLB Word 0 */ @@ -296,7 +296,7 @@ clear_utlb_entry: /* Update the msr to the new TS */ insrwi r5, r7, 1, 26
- bl 1f + bcl 20,31,$+4 1: mflr r6 addi r6, r6, (2f-1b)
@@ -355,7 +355,7 @@ write_utlb: /* Defaults to 256M */ lis r10, 0x1000
- bl 1f + bcl 20,31,$+4 1: mflr r4 addi r4, r4, (2f-1b) /* virtual address of 2f */
From: Nicholas Piggin npiggin@gmail.com
[ Upstream commit fd42b7b09c602c904452c0c3e5955ca21d8e387a ]
It is possible to create a VCPU without setting the MSR before running it, which results in a warning in kvmhv_vcpu_entry_p9() that MSR_ME is not set. This is pretty harmless because the MSR_ME bit is added to HSRR1 before HRFID to guest, and a normal qemu guest doesn't hit it.
Initialise the vcpu MSR with MSR_ME set.
Reported-by: Alexey Kardashevskiy aik@ozlabs.ru Signed-off-by: Nicholas Piggin npiggin@gmail.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20210811160134.904987-2-npiggin@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/kvm/book3s_hv.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 1ca0a4f760bc..18453aba86c4 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -2684,6 +2684,7 @@ static int kvmppc_core_vcpu_create_hv(struct kvm_vcpu *vcpu) spin_lock_init(&vcpu->arch.vpa_update_lock); spin_lock_init(&vcpu->arch.tbacct_lock); vcpu->arch.busy_preempt = TB_NIL; + vcpu->arch.shregs.msr = MSR_ME; vcpu->arch.intr_msr = MSR_SF | MSR_ME;
/*
From: Nicholas Piggin npiggin@gmail.com
[ Upstream commit 4782e0cd0d184d727ad3b0cfe20d1d44d9f98239 ]
The softpatch interrupt sets HSRR0 to the faulting instruction +4, so it should subtract 4 for the faulting instruction address in the case it is a TM softpatch interrupt (the instruction was not executed) and it was not emulated.
Signed-off-by: Nicholas Piggin npiggin@gmail.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20210811160134.904987-4-npiggin@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/kvm/book3s_hv_tm.c | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv_tm.c b/arch/powerpc/kvm/book3s_hv_tm.c index cc90b8b82329..e7c36f8bf205 100644 --- a/arch/powerpc/kvm/book3s_hv_tm.c +++ b/arch/powerpc/kvm/book3s_hv_tm.c @@ -46,6 +46,15 @@ int kvmhv_p9_tm_emulation(struct kvm_vcpu *vcpu) u64 newmsr, bescr; int ra, rs;
+ /* + * The TM softpatch interrupt sets NIP to the instruction following + * the faulting instruction, which is not executed. Rewind nip to the + * faulting instruction so it looks like a normal synchronous + * interrupt, then update nip in the places where the instruction is + * emulated. + */ + vcpu->arch.regs.nip -= 4; + /* * rfid, rfebb, and mtmsrd encode bit 31 = 0 since it's a reserved bit * in these instructions, so masking bit 31 out doesn't change these @@ -67,7 +76,7 @@ int kvmhv_p9_tm_emulation(struct kvm_vcpu *vcpu) (newmsr & MSR_TM))); newmsr = sanitize_msr(newmsr); vcpu->arch.shregs.msr = newmsr; - vcpu->arch.cfar = vcpu->arch.regs.nip - 4; + vcpu->arch.cfar = vcpu->arch.regs.nip; vcpu->arch.regs.nip = vcpu->arch.shregs.srr0; return RESUME_GUEST;
@@ -100,7 +109,7 @@ int kvmhv_p9_tm_emulation(struct kvm_vcpu *vcpu) vcpu->arch.bescr = bescr; msr = (msr & ~MSR_TS_MASK) | MSR_TS_T; vcpu->arch.shregs.msr = msr; - vcpu->arch.cfar = vcpu->arch.regs.nip - 4; + vcpu->arch.cfar = vcpu->arch.regs.nip; vcpu->arch.regs.nip = vcpu->arch.ebbrr; return RESUME_GUEST;
@@ -116,6 +125,7 @@ int kvmhv_p9_tm_emulation(struct kvm_vcpu *vcpu) newmsr = (newmsr & ~MSR_LE) | (msr & MSR_LE); newmsr = sanitize_msr(newmsr); vcpu->arch.shregs.msr = newmsr; + vcpu->arch.regs.nip += 4; return RESUME_GUEST;
/* ignore bit 31, see comment above */ @@ -152,6 +162,7 @@ int kvmhv_p9_tm_emulation(struct kvm_vcpu *vcpu) msr = (msr & ~MSR_TS_MASK) | MSR_TS_S; } vcpu->arch.shregs.msr = msr; + vcpu->arch.regs.nip += 4; return RESUME_GUEST;
/* ignore bit 31, see comment above */ @@ -189,6 +200,7 @@ int kvmhv_p9_tm_emulation(struct kvm_vcpu *vcpu) vcpu->arch.regs.ccr = (vcpu->arch.regs.ccr & 0x0fffffff) | (((msr & MSR_TS_MASK) >> MSR_TS_S_LG) << 29); vcpu->arch.shregs.msr &= ~MSR_TS_MASK; + vcpu->arch.regs.nip += 4; return RESUME_GUEST;
/* ignore bit 31, see comment above */ @@ -220,6 +232,7 @@ int kvmhv_p9_tm_emulation(struct kvm_vcpu *vcpu) vcpu->arch.regs.ccr = (vcpu->arch.regs.ccr & 0x0fffffff) | (((msr & MSR_TS_MASK) >> MSR_TS_S_LG) << 29); vcpu->arch.shregs.msr = msr | MSR_TS_S; + vcpu->arch.regs.nip += 4; return RESUME_GUEST; }
From: Nicholas Piggin npiggin@gmail.com
[ Upstream commit d82b392d9b3556b63e3f9916cf057ea847e173a9 ]
Have the TM softpatch emulation code set up the HFAC interrupt and return -1 in case an instruction was executed with HFSCR bits clear, and have the interrupt exit handler fall through to the HFAC handler. When the L0 is running a nested guest, this ensures the HFAC interrupt is correctly passed up to the L1.
The "direct guest" exit handler will turn these into PROGILL program interrupts so functionality in practice will be unchanged. But it's possible an L1 would want to handle these in a different way.
Also rearrange the FAC interrupt emulation code to match the HFAC format while here (mainly, adding the FSCR_INTR_CAUSE mask).
Signed-off-by: Nicholas Piggin npiggin@gmail.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20210811160134.904987-5-npiggin@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/include/asm/reg.h | 3 ++- arch/powerpc/kvm/book3s_hv.c | 35 ++++++++++++++++---------- arch/powerpc/kvm/book3s_hv_tm.c | 44 ++++++++++++++++++--------------- 3 files changed, 48 insertions(+), 34 deletions(-)
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h index be85cf156a1f..e9d27265253b 100644 --- a/arch/powerpc/include/asm/reg.h +++ b/arch/powerpc/include/asm/reg.h @@ -415,6 +415,7 @@ #define FSCR_TAR __MASK(FSCR_TAR_LG) #define FSCR_EBB __MASK(FSCR_EBB_LG) #define FSCR_DSCR __MASK(FSCR_DSCR_LG) +#define FSCR_INTR_CAUSE (ASM_CONST(0xFF) << 56) /* interrupt cause */ #define SPRN_HFSCR 0xbe /* HV=1 Facility Status & Control Register */ #define HFSCR_PREFIX __MASK(FSCR_PREFIX_LG) #define HFSCR_MSGP __MASK(FSCR_MSGP_LG) @@ -426,7 +427,7 @@ #define HFSCR_DSCR __MASK(FSCR_DSCR_LG) #define HFSCR_VECVSX __MASK(FSCR_VECVSX_LG) #define HFSCR_FP __MASK(FSCR_FP_LG) -#define HFSCR_INTR_CAUSE (ASM_CONST(0xFF) << 56) /* interrupt cause */ +#define HFSCR_INTR_CAUSE FSCR_INTR_CAUSE #define SPRN_TAR 0x32f /* Target Address Register */ #define SPRN_LPCR 0x13E /* LPAR Control Register */ #define LPCR_VPM0 ASM_CONST(0x8000000000000000) diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 18453aba86c4..c364eeec410f 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -1679,6 +1679,21 @@ static int kvmppc_handle_exit_hv(struct kvm_vcpu *vcpu, r = RESUME_GUEST; } break; + +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM + case BOOK3S_INTERRUPT_HV_SOFTPATCH: + /* + * This occurs for various TM-related instructions that + * we need to emulate on POWER9 DD2.2. We have already + * handled the cases where the guest was in real-suspend + * mode and was transitioning to transactional state. + */ + r = kvmhv_p9_tm_emulation(vcpu); + if (r != -1) + break; + fallthrough; /* go to facility unavailable handler */ +#endif + /* * This occurs if the guest (kernel or userspace), does something that * is prohibited by HFSCR. @@ -1697,18 +1712,6 @@ static int kvmppc_handle_exit_hv(struct kvm_vcpu *vcpu, } break;
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM - case BOOK3S_INTERRUPT_HV_SOFTPATCH: - /* - * This occurs for various TM-related instructions that - * we need to emulate on POWER9 DD2.2. We have already - * handled the cases where the guest was in real-suspend - * mode and was transitioning to transactional state. - */ - r = kvmhv_p9_tm_emulation(vcpu); - break; -#endif - case BOOK3S_INTERRUPT_HV_RM_HARD: r = RESUME_PASSTHROUGH; break; @@ -1811,9 +1814,15 @@ static int kvmppc_handle_nested_exit(struct kvm_vcpu *vcpu) * mode and was transitioning to transactional state. */ r = kvmhv_p9_tm_emulation(vcpu); - break; + if (r != -1) + break; + fallthrough; /* go to facility unavailable handler */ #endif
+ case BOOK3S_INTERRUPT_H_FAC_UNAVAIL: + r = RESUME_HOST; + break; + case BOOK3S_INTERRUPT_HV_RM_HARD: vcpu->arch.trap = 0; r = RESUME_GUEST; diff --git a/arch/powerpc/kvm/book3s_hv_tm.c b/arch/powerpc/kvm/book3s_hv_tm.c index e7c36f8bf205..866cadd70094 100644 --- a/arch/powerpc/kvm/book3s_hv_tm.c +++ b/arch/powerpc/kvm/book3s_hv_tm.c @@ -88,14 +88,15 @@ int kvmhv_p9_tm_emulation(struct kvm_vcpu *vcpu) } /* check EBB facility is available */ if (!(vcpu->arch.hfscr & HFSCR_EBB)) { - /* generate an illegal instruction interrupt */ - kvmppc_core_queue_program(vcpu, SRR1_PROGILL); - return RESUME_GUEST; + vcpu->arch.hfscr &= ~HFSCR_INTR_CAUSE; + vcpu->arch.hfscr |= (u64)FSCR_EBB_LG << 56; + vcpu->arch.trap = BOOK3S_INTERRUPT_H_FAC_UNAVAIL; + return -1; /* rerun host interrupt handler */ } if ((msr & MSR_PR) && !(vcpu->arch.fscr & FSCR_EBB)) { /* generate a facility unavailable interrupt */ - vcpu->arch.fscr = (vcpu->arch.fscr & ~(0xffull << 56)) | - ((u64)FSCR_EBB_LG << 56); + vcpu->arch.fscr &= ~FSCR_INTR_CAUSE; + vcpu->arch.fscr |= (u64)FSCR_EBB_LG << 56; kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_FAC_UNAVAIL); return RESUME_GUEST; } @@ -138,14 +139,15 @@ int kvmhv_p9_tm_emulation(struct kvm_vcpu *vcpu) } /* check for TM disabled in the HFSCR or MSR */ if (!(vcpu->arch.hfscr & HFSCR_TM)) { - /* generate an illegal instruction interrupt */ - kvmppc_core_queue_program(vcpu, SRR1_PROGILL); - return RESUME_GUEST; + vcpu->arch.hfscr &= ~HFSCR_INTR_CAUSE; + vcpu->arch.hfscr |= (u64)FSCR_TM_LG << 56; + vcpu->arch.trap = BOOK3S_INTERRUPT_H_FAC_UNAVAIL; + return -1; /* rerun host interrupt handler */ } if (!(msr & MSR_TM)) { /* generate a facility unavailable interrupt */ - vcpu->arch.fscr = (vcpu->arch.fscr & ~(0xffull << 56)) | - ((u64)FSCR_TM_LG << 56); + vcpu->arch.fscr &= ~FSCR_INTR_CAUSE; + vcpu->arch.fscr |= (u64)FSCR_TM_LG << 56; kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_FAC_UNAVAIL); return RESUME_GUEST; @@ -169,14 +171,15 @@ int kvmhv_p9_tm_emulation(struct kvm_vcpu *vcpu) case (PPC_INST_TRECLAIM & PO_XOP_OPCODE_MASK): /* check for TM disabled in the HFSCR or MSR */ if (!(vcpu->arch.hfscr & HFSCR_TM)) { - /* generate an illegal instruction interrupt */ - kvmppc_core_queue_program(vcpu, SRR1_PROGILL); - return RESUME_GUEST; + vcpu->arch.hfscr &= ~HFSCR_INTR_CAUSE; + vcpu->arch.hfscr |= (u64)FSCR_TM_LG << 56; + vcpu->arch.trap = BOOK3S_INTERRUPT_H_FAC_UNAVAIL; + return -1; /* rerun host interrupt handler */ } if (!(msr & MSR_TM)) { /* generate a facility unavailable interrupt */ - vcpu->arch.fscr = (vcpu->arch.fscr & ~(0xffull << 56)) | - ((u64)FSCR_TM_LG << 56); + vcpu->arch.fscr &= ~FSCR_INTR_CAUSE; + vcpu->arch.fscr |= (u64)FSCR_TM_LG << 56; kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_FAC_UNAVAIL); return RESUME_GUEST; @@ -208,14 +211,15 @@ int kvmhv_p9_tm_emulation(struct kvm_vcpu *vcpu) /* XXX do we need to check for PR=0 here? */ /* check for TM disabled in the HFSCR or MSR */ if (!(vcpu->arch.hfscr & HFSCR_TM)) { - /* generate an illegal instruction interrupt */ - kvmppc_core_queue_program(vcpu, SRR1_PROGILL); - return RESUME_GUEST; + vcpu->arch.hfscr &= ~HFSCR_INTR_CAUSE; + vcpu->arch.hfscr |= (u64)FSCR_TM_LG << 56; + vcpu->arch.trap = BOOK3S_INTERRUPT_H_FAC_UNAVAIL; + return -1; /* rerun host interrupt handler */ } if (!(msr & MSR_TM)) { /* generate a facility unavailable interrupt */ - vcpu->arch.fscr = (vcpu->arch.fscr & ~(0xffull << 56)) | - ((u64)FSCR_TM_LG << 56); + vcpu->arch.fscr &= ~FSCR_INTR_CAUSE; + vcpu->arch.fscr |= (u64)FSCR_TM_LG << 56; kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_FAC_UNAVAIL); return RESUME_GUEST;
From: Håkon Bugge haakon.bugge@oracle.com
[ Upstream commit 5f5a650999d5718af766fc70a120230b04235a6f ]
A MAD packet is sent as an unreliable datagram (UD). SA requests are sent as MAD packets. As such, SA requests or responses may be silently dropped.
IB Core's MAD layer has a timeout and retry mechanism, which amongst other, is used by RDMA CM. But it is not used by SA queries. The lack of retries of SA queries leads to long specified timeout, and error being returned in case of packet loss. The ULP or user-land process has to perform the retry.
Fix this by taking advantage of the MAD layer's retry mechanism.
First, a check against a zero timeout is added in rdma_resolve_route(). In send_mad(), we set the MAD layer timeout to one tenth of the specified timeout and the number of retries to 10. The special case when timeout is less than 10 is handled.
With this fix:
# ucmatose -c 1000 -S 1024 -C 1
runs stable on an Infiniband fabric. Without this fix, we see an intermittent behavior and it errors out with:
cmatose: event: RDMA_CM_EVENT_ROUTE_ERROR, error: -110
(110 is ETIMEDOUT)
Link: https://lore.kernel.org/r/1628784755-28316-1-git-send-email-haakon.bugge@ora... Signed-off-by: Håkon Bugge haakon.bugge@oracle.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/core/cma.c | 3 +++ drivers/infiniband/core/sa_query.c | 9 ++++++++- 2 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index 5d3b8b8d163d..c40791baced5 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -3132,6 +3132,9 @@ int rdma_resolve_route(struct rdma_cm_id *id, unsigned long timeout_ms) struct rdma_id_private *id_priv; int ret;
+ if (!timeout_ms) + return -EINVAL; + id_priv = container_of(id, struct rdma_id_private, id); if (!cma_comp_exch(id_priv, RDMA_CM_ADDR_RESOLVED, RDMA_CM_ROUTE_QUERY)) return -EINVAL; diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c index b61576f702b8..5a560820f6e0 100644 --- a/drivers/infiniband/core/sa_query.c +++ b/drivers/infiniband/core/sa_query.c @@ -1358,6 +1358,7 @@ static int send_mad(struct ib_sa_query *query, unsigned long timeout_ms, { unsigned long flags; int ret, id; + const int nmbr_sa_query_retries = 10;
xa_lock_irqsave(&queries, flags); ret = __xa_alloc(&queries, &id, query, xa_limit_32b, gfp_mask); @@ -1365,7 +1366,13 @@ static int send_mad(struct ib_sa_query *query, unsigned long timeout_ms, if (ret < 0) return ret;
- query->mad_buf->timeout_ms = timeout_ms; + query->mad_buf->timeout_ms = timeout_ms / nmbr_sa_query_retries; + query->mad_buf->retries = nmbr_sa_query_retries; + if (!query->mad_buf->timeout_ms) { + /* Special case, very small timeout_ms */ + query->mad_buf->timeout_ms = 1; + query->mad_buf->retries = timeout_ms; + } query->mad_buf->context[0] = query; query->id = id;
From: Baolin Wang baolin.wang@linux.alibaba.com
[ Upstream commit d538ddb97e066571e4fc58b832f40739621b42bb ]
The openat2 test suite fails on ARM64 because the definition of O_LARGEFILE is different on ARM64. Fix the problem by defining the correct O_LARGEFILE definition on ARM64.
"openat2 unexpectedly returned # 3['.../tools/testing/selftests/openat2'] with 208000 (!= 208000) not ok 102 openat2 with incompatible flags (O_PATH | O_LARGEFILE) fails with -22 (Invalid argument)"
Fixed change log to improve formatting and clarity: Shuah Khan skhan@linuxfoundation.org
Signed-off-by: Baolin Wang baolin.wang@linux.alibaba.com Reviewed-by: Aleksa Sarai cyphar@cyphar.com Acked-by: Christian Brauner christian.brauner@ubuntu.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/openat2/openat2_test.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/tools/testing/selftests/openat2/openat2_test.c b/tools/testing/selftests/openat2/openat2_test.c index d7ec1e7da0d0..1bddbe934204 100644 --- a/tools/testing/selftests/openat2/openat2_test.c +++ b/tools/testing/selftests/openat2/openat2_test.c @@ -22,7 +22,11 @@ * XXX: This is wrong on {mips, parisc, powerpc, sparc}. */ #undef O_LARGEFILE +#ifdef __aarch64__ +#define O_LARGEFILE 0x20000 +#else #define O_LARGEFILE 0x8000 +#endif
struct open_how_ext { struct open_how inner;
From: Jun Miao jun.miao@windriver.com
[ Upstream commit a051b2e56f2aa287b37ab0134a8af852f84e3f8e ]
A glibc 2.34 feature adds support for variable MINSIGSTKSZ and SIGSTKSZ. When _DYNAMIC_STACK_SIZE_SOURCE or _GNU_SOURCE are defined, MINSIGSTKSZ and SIGSTKSZ are no longer constant on Linux. glibc 2.34 flags code paths assuming MINSIGSTKSZ or SIGSTKSZ are constant. Fix these error in x86 test.
Feature description and build error:
NEWS for version 2.34 ===================== Major new features: * Add _SC_MINSIGSTKSZ and _SC_SIGSTKSZ. When _DYNAMIC_STACK_SIZE_SOURCE or _GNU_SOURCE are defined, MINSIGSTKSZ and SIGSTKSZ are no longer constant on Linux. MINSIGSTKSZ is redefined to sysconf(_SC_MINSIGSTKSZ) and SIGSTKSZ is redefined to sysconf (_SC_SIGSTKSZ). This supports dynamic sized register sets for modern architectural features like Arm SVE. =====================
If _SC_SIGSTKSZ_SOURCE or _GNU_SOURCE are defined, MINSIGSTKSZ and SIGSTKSZ are redefined as:
/* Default stack size for a signal handler: sysconf (SC_SIGSTKSZ). */ # undef SIGSTKSZ # define SIGSTKSZ sysconf (_SC_SIGSTKSZ)
/* Minimum stack size for a signal handler: SIGSTKSZ. */ # undef MINSIGSTKSZ # define MINSIGSTKSZ SIGSTKSZ
Compilation will fail if the source assumes constant MINSIGSTKSZ or SIGSTKSZ.
Build error with the GNU C Library 2.34: DEBUG: | sigreturn.c:150:13: error: variably modified 'altstack_data' at file scope | sigreturn.c:150:13: error: variably modified 'altstack_data' at file scope DEBUG: | 150 | static char altstack_data[SIGSTKSZ]; | 150 | static char altstack_data[SIGSTKSZ]; DEBUG: | | ^~~~~~~~~~~~~
DEBUG: | single_step_syscall.c:60:22: error: variably modified 'altstack_data' at file scope DEBUG: | 60 | static unsigned char altstack_data[SIGSTKSZ]; DEBUG: | | ^~~~~~~~~~~~~
Fixed commit log to improve formatting and clarity: Shuah Khan skhan@linuxfoundation.org
Link: https://sourceware.org/pipermail/libc-alpha/2021-January/121996.html Link: https://sourceware.org/pipermail/libc-alpha/2021-August/129718.html Suggested-by: Jianwei Hu jianwei.hu@windriver.com Signed-off-by: Jun Miao jun.miao@windriver.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/x86/mov_ss_trap.c | 4 ++-- tools/testing/selftests/x86/sigreturn.c | 7 +++---- tools/testing/selftests/x86/single_step_syscall.c | 4 ++-- tools/testing/selftests/x86/syscall_arg_fault.c | 7 +++---- 4 files changed, 10 insertions(+), 12 deletions(-)
diff --git a/tools/testing/selftests/x86/mov_ss_trap.c b/tools/testing/selftests/x86/mov_ss_trap.c index 6da0ac3f0135..cc3de6ff9fba 100644 --- a/tools/testing/selftests/x86/mov_ss_trap.c +++ b/tools/testing/selftests/x86/mov_ss_trap.c @@ -47,7 +47,6 @@ unsigned short ss; extern unsigned char breakpoint_insn[]; sigjmp_buf jmpbuf; -static unsigned char altstack_data[SIGSTKSZ];
static void enable_watchpoint(void) { @@ -250,13 +249,14 @@ int main() if (sigsetjmp(jmpbuf, 1) == 0) { printf("[RUN]\tMOV SS; SYSENTER\n"); stack_t stack = { - .ss_sp = altstack_data, + .ss_sp = malloc(sizeof(char) * SIGSTKSZ), .ss_size = SIGSTKSZ, }; if (sigaltstack(&stack, NULL) != 0) err(1, "sigaltstack"); sethandler(SIGSEGV, handle_and_longjmp, SA_RESETHAND | SA_ONSTACK); nr = SYS_getpid; + free(stack.ss_sp); /* Clear EBP first to make sure we segfault cleanly. */ asm volatile ("xorl %%ebp, %%ebp; mov %[ss], %%ss; SYSENTER" : "+a" (nr) : [ss] "m" (ss) : "flags", "rcx" diff --git a/tools/testing/selftests/x86/sigreturn.c b/tools/testing/selftests/x86/sigreturn.c index 57c4f67f16ef..5d7961a5f7f6 100644 --- a/tools/testing/selftests/x86/sigreturn.c +++ b/tools/testing/selftests/x86/sigreturn.c @@ -138,9 +138,6 @@ static unsigned short LDT3(int idx) return (idx << 3) | 7; }
-/* Our sigaltstack scratch space. */ -static char altstack_data[SIGSTKSZ]; - static void sethandler(int sig, void (*handler)(int, siginfo_t *, void *), int flags) { @@ -771,7 +768,8 @@ int main() setup_ldt();
stack_t stack = { - .ss_sp = altstack_data, + /* Our sigaltstack scratch space. */ + .ss_sp = malloc(sizeof(char) * SIGSTKSZ), .ss_size = SIGSTKSZ, }; if (sigaltstack(&stack, NULL) != 0) @@ -872,5 +870,6 @@ int main() total_nerrs += test_nonstrict_ss(); #endif
+ free(stack.ss_sp); return total_nerrs ? 1 : 0; } diff --git a/tools/testing/selftests/x86/single_step_syscall.c b/tools/testing/selftests/x86/single_step_syscall.c index 120ac741fe44..9a30f443e928 100644 --- a/tools/testing/selftests/x86/single_step_syscall.c +++ b/tools/testing/selftests/x86/single_step_syscall.c @@ -57,7 +57,6 @@ static void clearhandler(int sig)
static volatile sig_atomic_t sig_traps, sig_eflags; sigjmp_buf jmpbuf; -static unsigned char altstack_data[SIGSTKSZ];
#ifdef __x86_64__ # define REG_IP REG_RIP @@ -210,7 +209,7 @@ int main() unsigned long nr = SYS_getpid; printf("[RUN]\tSet TF and check SYSENTER\n"); stack_t stack = { - .ss_sp = altstack_data, + .ss_sp = malloc(sizeof(char) * SIGSTKSZ), .ss_size = SIGSTKSZ, }; if (sigaltstack(&stack, NULL) != 0) @@ -219,6 +218,7 @@ int main() SA_RESETHAND | SA_ONSTACK); sethandler(SIGILL, print_and_longjmp, SA_RESETHAND); set_eflags(get_eflags() | X86_EFLAGS_TF); + free(stack.ss_sp); /* Clear EBP first to make sure we segfault cleanly. */ asm volatile ("xorl %%ebp, %%ebp; SYSENTER" : "+a" (nr) :: "flags", "rcx" #ifdef __x86_64__ diff --git a/tools/testing/selftests/x86/syscall_arg_fault.c b/tools/testing/selftests/x86/syscall_arg_fault.c index bff474b5efc6..461fa41a4d02 100644 --- a/tools/testing/selftests/x86/syscall_arg_fault.c +++ b/tools/testing/selftests/x86/syscall_arg_fault.c @@ -17,9 +17,6 @@
#include "helpers.h"
-/* Our sigaltstack scratch space. */ -static unsigned char altstack_data[SIGSTKSZ]; - static void sethandler(int sig, void (*handler)(int, siginfo_t *, void *), int flags) { @@ -104,7 +101,8 @@ static void sigill(int sig, siginfo_t *info, void *ctx_void) int main() { stack_t stack = { - .ss_sp = altstack_data, + /* Our sigaltstack scratch space. */ + .ss_sp = malloc(sizeof(char) * SIGSTKSZ), .ss_size = SIGSTKSZ, }; if (sigaltstack(&stack, NULL) != 0) @@ -233,5 +231,6 @@ int main() set_eflags(get_eflags() & ~X86_EFLAGS_TF); #endif
+ free(stack.ss_sp); return 0; }
From: Kees Cook keescook@chromium.org
[ Upstream commit fb49d9946f96081f9a05d8f305b3f40285afe4a9 ]
In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memcpy(), memmove(), and memset(), avoid intentionally writing across neighboring fields.
Since all the size checking has already happened, use input.pointer (void *) so memcpy() doesn't get confused about how much is being written.
Avoids this false-positive warning when run-time memcpy() strict bounds checking is enabled:
memcpy: detected field-spanning write (size 4096) of single field (size 36) WARNING: CPU: 0 PID: 357 at drivers/platform/x86/dell/dell-smbios-wmi.c:74 run_smbios_call+0x110/0x1e0 [dell_smbios]
Cc: Hans de Goede hdegoede@redhat.com Cc: Mark Gross mgross@linux.intel.com Cc: Mario Limonciello mario.limonciello@dell.com Cc: "Pali Rohár" pali@kernel.org Cc: Andy Shevchenko andriy.shevchenko@linux.intel.com Cc: "Uwe Kleine-König" u.kleine-koenig@pengutronix.de Cc: Dell.Client.Kernel@dell.com Cc: platform-driver-x86@vger.kernel.org Reported-by: Andy Lavr andy.lavr@gmail.com Signed-off-by: Kees Cook keescook@chromium.org Link: https://lore.kernel.org/r/20210825160749.3891090-1-keescook@chromium.org Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/dell/dell-smbios-wmi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/platform/x86/dell/dell-smbios-wmi.c b/drivers/platform/x86/dell/dell-smbios-wmi.c index 33f823772733..01ea4bb958af 100644 --- a/drivers/platform/x86/dell/dell-smbios-wmi.c +++ b/drivers/platform/x86/dell/dell-smbios-wmi.c @@ -71,7 +71,7 @@ static int run_smbios_call(struct wmi_device *wdev) obj->integer.value); return -EIO; } - memcpy(&priv->buf->std, obj->buffer.pointer, obj->buffer.length); + memcpy(input.pointer, obj->buffer.pointer, obj->buffer.length); dev_dbg(&wdev->dev, "result: [%08x,%08x,%08x,%08x]\n", priv->buf->std.output[0], priv->buf->std.output[1], priv->buf->std.output[2], priv->buf->std.output[3]);
From: Leonardo Bras leobras.c@gmail.com
[ Upstream commit 2ca73c54ce24489518a56d816331b774044c2445 ]
enable_ddw() currently returns the address of the DMA window, which is considered invalid if has the value 0x00.
Also, it only considers valid an address returned from find_existing_ddw if it's not 0x00.
Changing this behavior makes sense, given the users of enable_ddw() only need to know if direct mapping is possible. It can also allow a DMA window starting at 0x00 to be used.
This will be helpful for using a DDW with indirect mapping, as the window address will be different than 0x00, but it will not map the whole partition.
Signed-off-by: Leonardo Bras leobras.c@gmail.com Reviewed-by: Alexey Kardashevskiy aik@ozlabs.ru Reviewed-by: Frederic Barrat fbarrat@linux.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20210817063929.38701-6-leobras.c@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/platforms/pseries/iommu.c | 36 +++++++++++++------------- 1 file changed, 18 insertions(+), 18 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c index 0c55b991f665..a189178ca8e0 100644 --- a/arch/powerpc/platforms/pseries/iommu.c +++ b/arch/powerpc/platforms/pseries/iommu.c @@ -843,25 +843,26 @@ static void remove_ddw(struct device_node *np, bool remove_prop) np, ret); }
-static u64 find_existing_ddw(struct device_node *pdn, int *window_shift) +static bool find_existing_ddw(struct device_node *pdn, u64 *dma_addr, int *window_shift) { struct direct_window *window; const struct dynamic_dma_window_prop *direct64; - u64 dma_addr = 0; + bool found = false;
spin_lock(&direct_window_list_lock); /* check if we already created a window and dupe that config if so */ list_for_each_entry(window, &direct_window_list, list) { if (window->device == pdn) { direct64 = window->prop; - dma_addr = be64_to_cpu(direct64->dma_base); + *dma_addr = be64_to_cpu(direct64->dma_base); *window_shift = be32_to_cpu(direct64->window_shift); + found = true; break; } } spin_unlock(&direct_window_list_lock);
- return dma_addr; + return found; }
static int find_existing_ddw_windows(void) @@ -1139,20 +1140,20 @@ static int iommu_get_page_shift(u32 query_page_size) * pdn: the parent pe node with the ibm,dma_window property * Future: also check if we can remap the base window for our base page size * - * returns the dma offset for use by the direct mapped DMA code. + * returns true if can map all pages (direct mapping), false otherwise.. */ -static u64 enable_ddw(struct pci_dev *dev, struct device_node *pdn) +static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn) { int len = 0, ret; int max_ram_len = order_base_2(ddw_memory_hotplug_max()); struct ddw_query_response query; struct ddw_create_response create; int page_shift; - u64 dma_addr; struct device_node *dn; u32 ddw_avail[DDW_APPLICABLE_SIZE]; struct direct_window *window; struct property *win64; + bool ddw_enabled = false; struct dynamic_dma_window_prop *ddwprop; struct failed_ddw_pdn *fpdn; bool default_win_removed = false; @@ -1164,9 +1165,10 @@ static u64 enable_ddw(struct pci_dev *dev, struct device_node *pdn)
mutex_lock(&direct_window_init_mutex);
- dma_addr = find_existing_ddw(pdn, &len); - if (dma_addr != 0) + if (find_existing_ddw(pdn, &dev->dev.archdata.dma_offset, &len)) { + ddw_enabled = true; goto out_unlock; + }
/* * If we already went through this for a previous function of @@ -1322,7 +1324,8 @@ static u64 enable_ddw(struct pci_dev *dev, struct device_node *pdn) list_add(&window->list, &direct_window_list); spin_unlock(&direct_window_list_lock);
- dma_addr = be64_to_cpu(ddwprop->dma_base); + dev->dev.archdata.dma_offset = be64_to_cpu(ddwprop->dma_base); + ddw_enabled = true; goto out_unlock;
out_free_window: @@ -1354,10 +1357,10 @@ static u64 enable_ddw(struct pci_dev *dev, struct device_node *pdn) * as RAM, then we failed to create a window to cover persistent * memory and need to set the DMA limit. */ - if (pmem_present && dma_addr && (len == max_ram_len)) - dev->dev.bus_dma_limit = dma_addr + (1ULL << len); + if (pmem_present && ddw_enabled && (len == max_ram_len)) + dev->dev.bus_dma_limit = dev->dev.archdata.dma_offset + (1ULL << len);
- return dma_addr; + return ddw_enabled; }
static void pci_dma_dev_setup_pSeriesLP(struct pci_dev *dev) @@ -1436,11 +1439,8 @@ static bool iommu_bypass_supported_pSeriesLP(struct pci_dev *pdev, u64 dma_mask) break; }
- if (pdn && PCI_DN(pdn)) { - pdev->dev.archdata.dma_offset = enable_ddw(pdev, pdn); - if (pdev->dev.archdata.dma_offset) - return true; - } + if (pdn && PCI_DN(pdn)) + return enable_ddw(pdev, pdn);
return false; }
From: Shubhrajyoti Datta shubhrajyoti.datta@xilinx.com
[ Upstream commit e7296d16ef7be11a6001be9bd89906ef55ab2405 ]
Fix a memory leak of mux.
Signed-off-by: Shubhrajyoti Datta shubhrajyoti.datta@xilinx.com Link: https://lore.kernel.org/r/20210818065929.12835-3-shubhrajyoti.datta@xilinx.c... Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/zynqmp/clk-mux-zynqmp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/clk/zynqmp/clk-mux-zynqmp.c b/drivers/clk/zynqmp/clk-mux-zynqmp.c index 157d4a960bf4..17afce594f28 100644 --- a/drivers/clk/zynqmp/clk-mux-zynqmp.c +++ b/drivers/clk/zynqmp/clk-mux-zynqmp.c @@ -159,7 +159,7 @@ struct clk_hw *zynqmp_clk_register_mux(const char *name, u32 clk_id, hw = &mux->hw; ret = clk_hw_register(NULL, hw); if (ret) { - kfree(hw); + kfree(mux); hw = ERR_PTR(ret); }
From: Rafał Miłecki rafal@milecki.pl
[ Upstream commit 6880d94f84262e721f7da6eaa41cd8fd5d87164c ]
armpll clocks (available on Cygnus and Northstar Plus) are simple clocks with no cells. Adjust binding props #clock-cells and clock-output-names to handle them.
Signed-off-by: Rafał Miłecki rafal@milecki.pl Link: https://lore.kernel.org/r/20210819052918.6753-1-zajec5@gmail.com Acked-by: Florian Fainelli f.fainelli@gmail.com Reviewed-by: Rob Herring robh@kernel.org Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../bindings/clock/brcm,iproc-clocks.yaml | 27 +++++++++++++++++-- 1 file changed, 25 insertions(+), 2 deletions(-)
diff --git a/Documentation/devicetree/bindings/clock/brcm,iproc-clocks.yaml b/Documentation/devicetree/bindings/clock/brcm,iproc-clocks.yaml index 1174c9aa9934..5ad147d265e6 100644 --- a/Documentation/devicetree/bindings/clock/brcm,iproc-clocks.yaml +++ b/Documentation/devicetree/bindings/clock/brcm,iproc-clocks.yaml @@ -61,13 +61,30 @@ properties: maxItems: 1
'#clock-cells': - const: 1 + true
clock-output-names: minItems: 1 maxItems: 45
allOf: + - if: + properties: + compatible: + contains: + enum: + - brcm,cygnus-armpll + - brcm,nsp-armpll + then: + properties: + '#clock-cells': + const: 0 + else: + properties: + '#clock-cells': + const: 1 + required: + - clock-output-names - if: properties: compatible: @@ -358,7 +375,6 @@ required: - reg - clocks - '#clock-cells' - - clock-output-names
additionalProperties: false
@@ -392,3 +408,10 @@ examples: clocks = <&osc2>; clock-output-names = "keypad", "adc/touch", "pwm"; }; + - | + arm_clk@0 { + #clock-cells = <0>; + compatible = "brcm,nsp-armpll"; + clocks = <&osc>; + reg = <0x0 0x1000>; + };
From: Paul Cercueil paul@crapouillou.net
[ Upstream commit 71f8817c28e2e1e5549138e2aef68c2fd784e162 ]
Make sure that the PLL that feeds the CPU won't be stopped while the kernel is running.
This fixes a problem on JZ4760 (and probably others) where under very specific conditions, the main PLL would be turned OFF when the kernel was shutting down, causing the shutdown process to fail.
Signed-off-by: Paul Cercueil paul@crapouillou.net Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/mips/generic/board-ingenic.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+)
diff --git a/arch/mips/generic/board-ingenic.c b/arch/mips/generic/board-ingenic.c index 0cec0bea13d6..11387a93867b 100644 --- a/arch/mips/generic/board-ingenic.c +++ b/arch/mips/generic/board-ingenic.c @@ -7,6 +7,8 @@ * Copyright (C) 2020 Paul Cercueil paul@crapouillou.net */
+#include <linux/clk.h> +#include <linux/of.h> #include <linux/of_address.h> #include <linux/of_fdt.h> #include <linux/pm.h> @@ -108,10 +110,36 @@ static const struct platform_suspend_ops ingenic_pm_ops __maybe_unused = {
static int __init ingenic_pm_init(void) { + struct device_node *cpu_node; + struct clk *cpu0_clk; + int ret; + if (boot_cpu_type() == CPU_XBURST) { if (IS_ENABLED(CONFIG_PM_SLEEP)) suspend_set_ops(&ingenic_pm_ops); _machine_halt = ingenic_halt; + + /* + * Unconditionally enable the clock for the first CPU. + * This makes sure that the PLL that feeds the CPU won't be + * stopped while the kernel is running. + */ + cpu_node = of_get_cpu_node(0, NULL); + if (!cpu_node) { + pr_err("Unable to get CPU node\n"); + } else { + cpu0_clk = of_clk_get(cpu_node, 0); + if (IS_ERR(cpu0_clk)) { + pr_err("Unable to get CPU0 clock\n"); + return PTR_ERR(cpu0_clk); + } + + ret = clk_prepare_enable(cpu0_clk); + if (ret) { + pr_err("Unable to enable CPU0 clock\n"); + return ret; + } + } }
return 0;
From: Theodore Ts'o tytso@mit.edu
[ Upstream commit 308c57ccf4318236be75dfa251c84713e694457b ]
If the underlying storage device is using thin-provisioning, it's possible for a zeroout operation to return ENOSPC.
Commit df22291ff0fd ("ext4: Retry block allocation if we have free blocks left") added logic to retry block allocation since we might get free block after we commit a transaction. But the ENOSPC from thin-provisioning will confuse ext4, and lead to an infinite loop.
Since using zeroout instead of splitting the extent node is an optimization, if it fails, we might as well fall back to splitting the extent node.
Reported-by: yangerkun yangerkun@huawei.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext4/extents.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index 92ad64b89d9b..501516cadc1b 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -3569,7 +3569,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle, split_map.m_len - ee_block); err = ext4_ext_zeroout(inode, &zero_ex1); if (err) - goto out; + goto fallback; split_map.m_len = allocated; } if (split_map.m_lblk - ee_block + split_map.m_len < @@ -3583,7 +3583,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle, ext4_ext_pblock(ex)); err = ext4_ext_zeroout(inode, &zero_ex2); if (err) - goto out; + goto fallback; }
split_map.m_len += split_map.m_lblk - ee_block; @@ -3592,6 +3592,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle, } }
+fallback: err = ext4_split_extent(handle, inode, ppath, &split_map, split_flag, flags); if (err > 0)
From: Jan Kara jack@suse.cz
[ Upstream commit bd2c38cf1726ea913024393a0d11f2e2a3f4c180 ]
If ext4 filesystem is corrupted so that quota files are linked from directory hirerarchy, bad things can happen. E.g. quota files can get corrupted or deleted. Make sure we are not grabbing quota file inodes when we expect normal inodes.
Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Theodore Ts'o tytso@mit.edu Link: https://lore.kernel.org/r/20210812133122.26360-1-jack@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext4/inode.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index d8de607849df..2c33c795c4a7 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -4603,6 +4603,7 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino, struct ext4_iloc iloc; struct ext4_inode *raw_inode; struct ext4_inode_info *ei; + struct ext4_super_block *es = EXT4_SB(sb)->s_es; struct inode *inode; journal_t *journal = EXT4_SB(sb)->s_journal; long ret; @@ -4613,9 +4614,12 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino, projid_t i_projid;
if ((!(flags & EXT4_IGET_SPECIAL) && - (ino < EXT4_FIRST_INO(sb) && ino != EXT4_ROOT_INO)) || + ((ino < EXT4_FIRST_INO(sb) && ino != EXT4_ROOT_INO) || + ino == le32_to_cpu(es->s_usr_quota_inum) || + ino == le32_to_cpu(es->s_grp_quota_inum) || + ino == le32_to_cpu(es->s_prj_quota_inum))) || (ino < EXT4_ROOT_INO) || - (ino > le32_to_cpu(EXT4_SB(sb)->s_es->s_inodes_count))) { + (ino > le32_to_cpu(es->s_inodes_count))) { if (flags & EXT4_IGET_HANDLE) return ERR_PTR(-ESTALE); __ext4_error(sb, function, line, false, EFSCORRUPTED, 0,
From: Zhang Yi yi.zhang@huawei.com
[ Upstream commit baaae979b112642a41b71c71c599d875c067d257 ]
Now that ext4_do_update_inode() return error before filling the whole inode data if we fail to set inode blocks in ext4_inode_blocks_set(). This error should never happen in theory since sb->s_maxbytes should not have allowed this, we have already init sb->s_maxbytes according to this feature in ext4_fill_super(). So even through that could only happen due to the filesystem corruption, we'd better to return after we finish updating the inode because it may left an uninitialized buffer and we could read this buffer later in "errors=continue" mode.
This patch make the updating inode data procedure atomic, call EXT4_ERROR_INODE() after we dropping i_raw_lock after something bad happened, make sure that the inode is integrated, and also drop a BUG_ON and do some small cleanups.
Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20210826130412.3921207-4-yi.zhang@huawei.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext4/inode.c | 44 ++++++++++++++++++++++++++++---------------- 1 file changed, 28 insertions(+), 16 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 2c33c795c4a7..ca032ae25765 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -4932,8 +4932,14 @@ static int ext4_inode_blocks_set(handle_t *handle, ext4_clear_inode_flag(inode, EXT4_INODE_HUGE_FILE); return 0; } + + /* + * This should never happen since sb->s_maxbytes should not have + * allowed this, sb->s_maxbytes was set according to the huge_file + * feature in ext4_fill_super(). + */ if (!ext4_has_feature_huge_file(sb)) - return -EFBIG; + return -EFSCORRUPTED;
if (i_blocks <= 0xffffffffffffULL) { /* @@ -5036,16 +5042,14 @@ static int ext4_do_update_inode(handle_t *handle,
spin_lock(&ei->i_raw_lock);
- /* For fields not tracked in the in-memory inode, - * initialise them to zero for new inodes. */ + /* + * For fields not tracked in the in-memory inode, initialise them + * to zero for new inodes. + */ if (ext4_test_inode_state(inode, EXT4_STATE_NEW)) memset(raw_inode, 0, EXT4_SB(inode->i_sb)->s_inode_size);
err = ext4_inode_blocks_set(handle, raw_inode, ei); - if (err) { - spin_unlock(&ei->i_raw_lock); - goto out_brelse; - }
raw_inode->i_mode = cpu_to_le16(inode->i_mode); i_uid = i_uid_read(inode); @@ -5054,10 +5058,11 @@ static int ext4_do_update_inode(handle_t *handle, if (!(test_opt(inode->i_sb, NO_UID32))) { raw_inode->i_uid_low = cpu_to_le16(low_16_bits(i_uid)); raw_inode->i_gid_low = cpu_to_le16(low_16_bits(i_gid)); -/* - * Fix up interoperability with old kernels. Otherwise, old inodes get - * re-used with the upper 16 bits of the uid/gid intact - */ + /* + * Fix up interoperability with old kernels. Otherwise, + * old inodes get re-used with the upper 16 bits of the + * uid/gid intact. + */ if (ei->i_dtime && list_empty(&ei->i_orphan)) { raw_inode->i_uid_high = 0; raw_inode->i_gid_high = 0; @@ -5126,8 +5131,9 @@ static int ext4_do_update_inode(handle_t *handle, } }
- BUG_ON(!ext4_has_feature_project(inode->i_sb) && - i_projid != EXT4_DEF_PROJID); + if (i_projid != EXT4_DEF_PROJID && + !ext4_has_feature_project(inode->i_sb)) + err = err ?: -EFSCORRUPTED;
if (EXT4_INODE_SIZE(inode->i_sb) > EXT4_GOOD_OLD_INODE_SIZE && EXT4_FITS_IN_INODE(raw_inode, ei, i_projid)) @@ -5135,6 +5141,11 @@ static int ext4_do_update_inode(handle_t *handle,
ext4_inode_csum_set(inode, raw_inode, ei); spin_unlock(&ei->i_raw_lock); + if (err) { + EXT4_ERROR_INODE(inode, "corrupted inode contents"); + goto out_brelse; + } + if (inode->i_sb->s_flags & SB_LAZYTIME) ext4_update_other_inodes_time(inode->i_sb, inode->i_ino, bh->b_data); @@ -5142,13 +5153,13 @@ static int ext4_do_update_inode(handle_t *handle, BUFFER_TRACE(bh, "call ext4_handle_dirty_metadata"); err = ext4_handle_dirty_metadata(handle, NULL, bh); if (err) - goto out_brelse; + goto out_error; ext4_clear_inode_state(inode, EXT4_STATE_NEW); if (set_large_file) { BUFFER_TRACE(EXT4_SB(sb)->s_sbh, "get write access"); err = ext4_journal_get_write_access(handle, EXT4_SB(sb)->s_sbh); if (err) - goto out_brelse; + goto out_error; lock_buffer(EXT4_SB(sb)->s_sbh); ext4_set_feature_large_file(sb); ext4_superblock_csum_set(sb); @@ -5158,9 +5169,10 @@ static int ext4_do_update_inode(handle_t *handle, EXT4_SB(sb)->s_sbh); } ext4_update_inode_fsync_trans(handle, inode, need_datasync); +out_error: + ext4_std_error(inode->i_sb, err); out_brelse: brelse(bh); - ext4_std_error(inode->i_sb, err); return err; }
From: Juergen Gross jgross@suse.com
[ Upstream commit 58e636039b512697554b579c2bb23774061877f5 ]
In cpu_bringup() there is a call of preempt_disable() without a paired preempt_enable(). This is not needed as interrupts are off initially. Additionally this will result in early boot messages like:
BUG: scheduling while atomic: swapper/1/0/0x00000002
Signed-off-by: Juergen Gross jgross@suse.com Link: https://lore.kernel.org/r/20210825113158.11716-1-jgross@suse.com Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/xen/smp_pv.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c index c2ac319f11a4..96afadf9878e 100644 --- a/arch/x86/xen/smp_pv.c +++ b/arch/x86/xen/smp_pv.c @@ -64,7 +64,6 @@ static void cpu_bringup(void) cr4_init(); cpu_init(); touch_softlockup_watchdog(); - preempt_disable();
/* PVH runs in ring 0 and allows us to do native syscalls. Yay! */ if (!xen_feature(XENFEAT_supervisor_mode_kernel)) {
From: Alexander Aring aahringo@redhat.com
[ Upstream commit ecd95673142ef80169a6c003b569b8a86d1e6329 ]
When dlm_release_lockspace does active shutdown on connections to other nodes, the active shutdown will wait for any exisitng passive shutdowns to be resolved. But, the sequence of operations during dlm_release_lockspace can prevent the normal resolution of passive shutdowns (processed normally by way of lockspace recovery.) This disruption of passive shutdown handling can cause the active shutdown to wait for a full timeout period, delaying the completion of dlm_release_lockspace.
To fix this, make dlm_release_lockspace resolve existing passive shutdowns (by calling dlm_clear_members earlier), before it does active shutdowns. The active shutdowns will not find any passive shutdowns to wait for, and will not be delayed.
Reported-by: Chris Mackowski cmackows@redhat.com Signed-off-by: Alexander Aring aahringo@redhat.com Signed-off-by: David Teigland teigland@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/dlm/lockspace.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c index d71aba8c3e64..e51ae83fc5b4 100644 --- a/fs/dlm/lockspace.c +++ b/fs/dlm/lockspace.c @@ -793,6 +793,7 @@ static int release_lockspace(struct dlm_ls *ls, int force)
if (ls_count == 1) { dlm_scand_stop(); + dlm_clear_members(ls); dlm_midcomms_shutdown(); }
From: Sami Tolvanen samitolvanen@google.com
[ Upstream commit 850ded46c64299f04d15e39caaba21963fb966f8 ]
With CONFIG_LTO_CLANG, we currently link modules into native code just before modpost, which means with TRIM_UNUSED_KSYMS enabled, we still look at the LLVM bitcode in the .o files when generating the list of used symbols. As the bitcode doesn't yet have calls to compiler intrinsics and llvm-nm doesn't see function references that only exist in function-level inline assembly, we currently need a whitelist for TRIM_UNUSED_KSYMS to work with LTO.
This change moves module LTO linking to happen earlier, and thus avoids the issue with LLVM bitcode and TRIM_UNUSED_KSYMS entirely, allowing us to also drop the whitelist from gen_autoksyms.sh.
Link: https://github.com/ClangBuiltLinux/linux/issues/1369 Signed-off-by: Sami Tolvanen samitolvanen@google.com Reviewed-by: Alexander Lobakin alobakin@pm.me Tested-by: Alexander Lobakin alobakin@pm.me Signed-off-by: Masahiro Yamada masahiroy@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- scripts/Makefile.build | 27 ++++++++++++++++++++++++++- scripts/Makefile.lib | 7 +++++++ scripts/Makefile.modfinal | 21 ++------------------- scripts/Makefile.modpost | 22 +++------------------- scripts/gen_autoksyms.sh | 12 ------------ 5 files changed, 38 insertions(+), 51 deletions(-)
diff --git a/scripts/Makefile.build b/scripts/Makefile.build index 02197cb8e3a7..ea549579bfb7 100644 --- a/scripts/Makefile.build +++ b/scripts/Makefile.build @@ -88,6 +88,10 @@ endif
targets-for-modules := $(patsubst %.o, %.mod, $(filter %.o, $(obj-m)))
+ifdef CONFIG_LTO_CLANG +targets-for-modules += $(patsubst %.o, %.lto.o, $(filter %.o, $(obj-m))) +endif + ifdef need-modorder targets-for-modules += $(obj)/modules.order endif @@ -271,12 +275,33 @@ $(obj)/%.o: $(src)/%.c $(recordmcount_source) $$(objtool_dep) FORCE $(call if_changed_rule,cc_o_c) $(call cmd,force_checksrc)
+ifdef CONFIG_LTO_CLANG +# Module .o files may contain LLVM bitcode, compile them into native code +# before ELF processing +quiet_cmd_cc_lto_link_modules = LTO [M] $@ +cmd_cc_lto_link_modules = \ + $(LD) $(ld_flags) -r -o $@ \ + $(shell [ -s $(@:.lto.o=.o.symversions) ] && \ + echo -T $(@:.lto.o=.o.symversions)) \ + --whole-archive $(filter-out FORCE,$^) + +ifdef CONFIG_STACK_VALIDATION +# objtool was skipped for LLVM bitcode, run it now that we have compiled +# modules into native code +cmd_cc_lto_link_modules += ; \ + $(objtree)/tools/objtool/objtool $(objtool_args) --module $@ +endif + +$(obj)/%.lto.o: $(obj)/%.o FORCE + $(call if_changed,cc_lto_link_modules) +endif + cmd_mod = { \ echo $(if $($*-objs)$($*-y)$($*-m), $(addprefix $(obj)/, $($*-objs) $($*-y) $($*-m)), $(@:.mod=.o)); \ $(undefined_syms) echo; \ } > $@
-$(obj)/%.mod: $(obj)/%.o FORCE +$(obj)/%.mod: $(obj)/%$(mod-prelink-ext).o FORCE $(call if_changed,mod)
quiet_cmd_cc_lst_c = MKLST $@ diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib index 10950559b223..af1c920a585c 100644 --- a/scripts/Makefile.lib +++ b/scripts/Makefile.lib @@ -225,6 +225,13 @@ dtc_cpp_flags = -Wp,-MMD,$(depfile).pre.tmp -nostdinc \ $(addprefix -I,$(DTC_INCLUDE)) \ -undef -D__DTS__
+ifeq ($(CONFIG_LTO_CLANG),y) +# With CONFIG_LTO_CLANG, .o files in modules might be LLVM bitcode, so we +# need to run LTO to compile them into native code (.lto.o) before further +# processing. +mod-prelink-ext := .lto +endif + # Objtool arguments are also needed for modfinal with LTO, so we define # then here to avoid duplication. objtool_args = \ diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal index 5e9b8057fb24..ff805777431c 100644 --- a/scripts/Makefile.modfinal +++ b/scripts/Makefile.modfinal @@ -9,7 +9,7 @@ __modfinal: include include/config/auto.conf include $(srctree)/scripts/Kbuild.include
-# for c_flags and objtool_args +# for c_flags and mod-prelink-ext include $(srctree)/scripts/Makefile.lib
# find all modules listed in modules.order @@ -30,23 +30,6 @@ quiet_cmd_cc_o_c = CC [M] $@
ARCH_POSTLINK := $(wildcard $(srctree)/arch/$(SRCARCH)/Makefile.postlink)
-ifdef CONFIG_LTO_CLANG -# With CONFIG_LTO_CLANG, reuse the object file we compiled for modpost to -# avoid a second slow LTO link -prelink-ext := .lto - -# ELF processing was skipped earlier because we didn't have native code, -# so let's now process the prelinked binary before we link the module. - -ifdef CONFIG_STACK_VALIDATION -cmd_ld_ko_o += \ - $(objtree)/tools/objtool/objtool $(objtool_args) \ - $(@:.ko=$(prelink-ext).o); - -endif # CONFIG_STACK_VALIDATION - -endif # CONFIG_LTO_CLANG - quiet_cmd_ld_ko_o = LD [M] $@ cmd_ld_ko_o += \ $(LD) -r $(KBUILD_LDFLAGS) \ @@ -72,7 +55,7 @@ if_changed_except = $(if $(call newer_prereqs_except,$(2))$(cmd-check), \
# Re-generate module BTFs if either module's .ko or vmlinux changed -$(modules): %.ko: %$(prelink-ext).o %.mod.o scripts/module.lds $(if $(KBUILD_BUILTIN),vmlinux) FORCE +$(modules): %.ko: %$(mod-prelink-ext).o %.mod.o scripts/module.lds $(if $(KBUILD_BUILTIN),vmlinux) FORCE +$(call if_changed_except,ld_ko_o,vmlinux) ifdef CONFIG_DEBUG_INFO_BTF_MODULES +$(if $(newer-prereqs),$(call cmd,btf_ko)) diff --git a/scripts/Makefile.modpost b/scripts/Makefile.modpost index c383ba33d837..eef56d629799 100644 --- a/scripts/Makefile.modpost +++ b/scripts/Makefile.modpost @@ -41,7 +41,7 @@ __modpost: include include/config/auto.conf include $(srctree)/scripts/Kbuild.include
-# for ld_flags +# for mod-prelink-ext include $(srctree)/scripts/Makefile.lib
MODPOST = scripts/mod/modpost \ @@ -118,22 +118,6 @@ $(input-symdump): @echo >&2 ' Modules may not have dependencies or modversions.' @echo >&2 ' You may get many unresolved symbol warnings.'
-ifdef CONFIG_LTO_CLANG -# With CONFIG_LTO_CLANG, .o files might be LLVM bitcode, so we need to run -# LTO to compile them into native code before running modpost -prelink-ext := .lto - -quiet_cmd_cc_lto_link_modules = LTO [M] $@ -cmd_cc_lto_link_modules = \ - $(LD) $(ld_flags) -r -o $@ \ - $(shell [ -s $(@:.lto.o=.o.symversions) ] && \ - echo -T $(@:.lto.o=.o.symversions)) \ - --whole-archive $^ - -%.lto.o: %.o - $(call if_changed,cc_lto_link_modules) -endif - modules := $(sort $(shell cat $(MODORDER)))
# KBUILD_MODPOST_WARN can be set to avoid error out in case of undefined symbols @@ -144,9 +128,9 @@ endif # Read out modules.order to pass in modpost. # Otherwise, allmodconfig would fail with "Argument list too long". quiet_cmd_modpost = MODPOST $@ - cmd_modpost = sed 's/.ko$$/$(prelink-ext).o/' $< | $(MODPOST) -T - + cmd_modpost = sed 's/.ko$$/$(mod-prelink-ext).o/' $< | $(MODPOST) -T -
-$(output-symdump): $(MODORDER) $(input-symdump) $(modules:.ko=$(prelink-ext).o) FORCE +$(output-symdump): $(MODORDER) $(input-symdump) $(modules:.ko=$(mod-prelink-ext).o) FORCE $(call if_changed,modpost)
targets += $(output-symdump) diff --git a/scripts/gen_autoksyms.sh b/scripts/gen_autoksyms.sh index da320151e7c3..6ed0d225c8b1 100755 --- a/scripts/gen_autoksyms.sh +++ b/scripts/gen_autoksyms.sh @@ -26,18 +26,6 @@ if [ -n "$CONFIG_MODVERSIONS" ]; then needed_symbols="$needed_symbols module_layout" fi
-# With CONFIG_LTO_CLANG, LLVM bitcode has not yet been compiled into a binary -# when the .mod files are generated, which means they don't yet contain -# references to certain symbols that will be present in the final binaries. -if [ -n "$CONFIG_LTO_CLANG" ]; then - # intrinsic functions - needed_symbols="$needed_symbols memcpy memmove memset" - # ftrace - needed_symbols="$needed_symbols _mcount" - # stack protector symbols - needed_symbols="$needed_symbols __stack_chk_fail __stack_chk_guard" -fi - ksym_wl= if [ -n "$CONFIG_UNUSED_KSYMS_WHITELIST" ]; then # Use 'eval' to expand the whitelist path and check if it is relative
From: Masahiro Yamada masahiroy@kernel.org
[ Upstream commit 55a6d00ed0c1b4d20d7966d00fda6848d776658d ]
Add FORCE so that if_changed can detect the command line change.
Signed-off-by: Masahiro Yamada masahiroy@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/entry/vdso/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile index 05c4abc2fdfd..a2dddcc189f6 100644 --- a/arch/x86/entry/vdso/Makefile +++ b/arch/x86/entry/vdso/Makefile @@ -131,7 +131,7 @@ $(obj)/%-x32.o: $(obj)/%.o FORCE targets += vdsox32.lds $(vobjx32s-y)
$(obj)/%.so: OBJCOPYFLAGS := -S --remove-section __ex_table -$(obj)/%.so: $(obj)/%.so.dbg +$(obj)/%.so: $(obj)/%.so.dbg FORCE $(call if_changed,objcopy)
$(obj)/vdsox32.so.dbg: $(obj)/vdsox32.lds $(vobjx32s) FORCE
From: Ariel Marcovitch arielmarcovitch@gmail.com
[ Upstream commit 1439ebd2ce77242400518d4e6a1e85bebcd8084f ]
It seems like the implementation of the --ignore option is broken.
In check_symbols_helper, when going through the list of files, a file is added to the list of source files to check if it matches the ignore pattern. Instead, as stated in the comment below this condition, the file should be added if it doesn't match the pattern.
This means that when providing an ignore pattern, the only files that will be checked will be the ones we want the ignore, in addition to the Kconfig files that don't match the pattern (the check in parse_kconfig_files is done right)
Signed-off-by: Ariel Marcovitch arielmarcovitch@gmail.com Signed-off-by: Masahiro Yamada masahiroy@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- scripts/checkkconfigsymbols.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/scripts/checkkconfigsymbols.py b/scripts/checkkconfigsymbols.py index 1548f9ce4682..b9b0f15e5880 100755 --- a/scripts/checkkconfigsymbols.py +++ b/scripts/checkkconfigsymbols.py @@ -329,7 +329,7 @@ def check_symbols_helper(pool, ignore): if REGEX_FILE_KCONFIG.match(gitfile): kconfig_files.append(gitfile) else: - if ignore and not re.match(ignore, gitfile): + if ignore and re.match(ignore, gitfile): continue # add source files that do not match the ignore pattern source_files.append(gitfile)
From: Tuo Li islituo@gmail.com
[ Upstream commit 6c85c2c728193d19d6a908ae9fb312d0325e65ca ]
A memory block is allocated through kmalloc(), and its return value is assigned to the pointer oinfo. However, oinfo->dqi_gqinode is not initialized but it is accessed in: iput(oinfo->dqi_gqinode);
To fix this possible uninitialized-variable access, assign NULL to oinfo->dqi_gqinode, and add ocfs2_qinfo_lock_res_init() behind the assignment in ocfs2_local_read_info(). Remove ocfs2_qinfo_lock_res_init() in ocfs2_global_read_info().
Link: https://lkml.kernel.org/r/20210804031832.57154-1-islituo@gmail.com Signed-off-by: Tuo Li islituo@gmail.com Reported-by: TOTE Robot oslab@tsinghua.edu.cn Reviewed-by: Joseph Qi joseph.qi@linux.alibaba.com Cc: Mark Fasheh mark@fasheh.com Cc: Joel Becker jlbec@evilplan.org Cc: Junxiao Bi junxiao.bi@oracle.com Cc: Changwei Ge gechangwei@live.cn Cc: Gang He ghe@suse.com Cc: Jun Piao piaojun@huawei.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ocfs2/quota_global.c | 1 - fs/ocfs2/quota_local.c | 2 ++ 2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c index eda83487c9ec..f033de733adb 100644 --- a/fs/ocfs2/quota_global.c +++ b/fs/ocfs2/quota_global.c @@ -357,7 +357,6 @@ int ocfs2_global_read_info(struct super_block *sb, int type) } oinfo->dqi_gi.dqi_sb = sb; oinfo->dqi_gi.dqi_type = type; - ocfs2_qinfo_lock_res_init(&oinfo->dqi_gqlock, oinfo); oinfo->dqi_gi.dqi_entry_size = sizeof(struct ocfs2_global_disk_dqblk); oinfo->dqi_gi.dqi_ops = &ocfs2_global_ops; oinfo->dqi_gqi_bh = NULL; diff --git a/fs/ocfs2/quota_local.c b/fs/ocfs2/quota_local.c index b1a8b046f4c2..0e4b16d4c037 100644 --- a/fs/ocfs2/quota_local.c +++ b/fs/ocfs2/quota_local.c @@ -702,6 +702,8 @@ static int ocfs2_local_read_info(struct super_block *sb, int type) info->dqi_priv = oinfo; oinfo->dqi_type = type; INIT_LIST_HEAD(&oinfo->dqi_chunk); + oinfo->dqi_gqinode = NULL; + ocfs2_qinfo_lock_res_init(&oinfo->dqi_gqlock, oinfo); oinfo->dqi_rec = NULL; oinfo->dqi_lqi_bh = NULL; oinfo->dqi_libh = NULL;
From: Gang He ghe@suse.com
[ Upstream commit 9673e0050c39b0534d0e2ca431223f52089f4959 ]
Usually, ocfs2_downconvert_lock() function always downconverts dlm lock to the expected level for satisfy dlm bast requests from the other nodes.
But there is a rare situation. When dlm lock conversion is being canceled, ocfs2_downconvert_lock() function will return -EBUSY. You need to be aware that ocfs2_cancel_convert() function is asynchronous in fsdlm implementation.
If we does not requeue this lockres entry, ocfs2 downconvert thread no longer handles this dlm lock bast request. Then, the other nodes will not get the dlm lock again, the current node's process will be blocked when acquire this dlm lock again.
Link: https://lkml.kernel.org/r/20210830044621.12544-1-ghe@suse.com Signed-off-by: Gang He ghe@suse.com Reviewed-by: Joseph Qi joseph.qi@linux.alibaba.com Cc: Mark Fasheh mark@fasheh.com Cc: Joel Becker jlbec@evilplan.org Cc: Junxiao Bi junxiao.bi@oracle.com Cc: Changwei Ge gechangwei@live.cn Cc: Gang He ghe@suse.com Cc: Jun Piao piaojun@huawei.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ocfs2/dlmglue.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c index 48fd369c29a4..f8f561850470 100644 --- a/fs/ocfs2/dlmglue.c +++ b/fs/ocfs2/dlmglue.c @@ -16,6 +16,7 @@ #include <linux/debugfs.h> #include <linux/seq_file.h> #include <linux/time.h> +#include <linux/delay.h> #include <linux/quotaops.h> #include <linux/sched/signal.h>
@@ -3912,6 +3913,17 @@ static int ocfs2_unblock_lock(struct ocfs2_super *osb, spin_unlock_irqrestore(&lockres->l_lock, flags); ret = ocfs2_downconvert_lock(osb, lockres, new_level, set_lvb, gen); + /* The dlm lock convert is being cancelled in background, + * ocfs2_cancel_convert() is asynchronous in fs/dlm, + * requeue it, try again later. + */ + if (ret == -EBUSY) { + ctl->requeue = 1; + mlog(ML_BASTS, "lockres %s, ReQ: Downconvert busy\n", + lockres->l_name); + ret = 0; + msleep(20); + }
leave: if (ret)
From: Johannes Weiner hannes@cmpxchg.org
[ Upstream commit 16e2df2a05d46c983bf310b19432c5ca4684b2bc ]
When drop_caches truncates the page cache in an inode it also includes any shadow entries for evicted pages. However, there is a preliminary check on whether the inode has pages: if it has *only* shadow entries, it will skip running truncation on the inode and leave it behind.
Fix the check to mapping_empty(), such that it runs truncation on any inode that has cache entries at all.
Link: https://lkml.kernel.org/r/20210614211904.14420-2-hannes@cmpxchg.org Signed-off-by: Johannes Weiner hannes@cmpxchg.org Reported-by: Roman Gushchin guro@fb.com Acked-by: Roman Gushchin guro@fb.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/drop_caches.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/drop_caches.c b/fs/drop_caches.c index f00fcc4a4f72..e619c31b6bd9 100644 --- a/fs/drop_caches.c +++ b/fs/drop_caches.c @@ -3,6 +3,7 @@ * Implement the manual drop-all-pagecache function */
+#include <linux/pagemap.h> #include <linux/kernel.h> #include <linux/mm.h> #include <linux/fs.h> @@ -27,7 +28,7 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused) * we need to reschedule to avoid softlockups. */ if ((inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) || - (inode->i_mapping->nrpages == 0 && !need_resched())) { + (mapping_empty(inode->i_mapping) && !need_resched())) { spin_unlock(&inode->i_lock); continue; }
From: Andrey Konovalov andreyknvl@gmail.com
[ Upstream commit 8fbad19bdcb4b9be8131536e5bb9616ab2e4eeb3 ]
Multiple KASAN tests do writes past the allocated objects or writes to freed memory. Turn these writes into reads to avoid corrupting memory. Otherwise, these tests might lead to crashes with the HW_TAGS mode, as it neither uses quarantine nor redzones.
Link: https://lkml.kernel.org/r/c3cd2a383e757e27dd9131635fc7d09a48a49cf9.162877980... Signed-off-by: Andrey Konovalov andreyknvl@gmail.com Reviewed-by: Marco Elver elver@google.com Cc: Alexander Potapenko glider@google.com Cc: Andrey Ryabinin aryabinin@virtuozzo.com Cc: Dmitry Vyukov dvyukov@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- lib/test_kasan.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/lib/test_kasan.c b/lib/test_kasan.c index 8f7b0b2f6e11..b261fe9f3110 100644 --- a/lib/test_kasan.c +++ b/lib/test_kasan.c @@ -151,7 +151,7 @@ static void kmalloc_node_oob_right(struct kunit *test) ptr = kmalloc_node(size, GFP_KERNEL, 0); KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
- KUNIT_EXPECT_KASAN_FAIL(test, ptr[size] = 0); + KUNIT_EXPECT_KASAN_FAIL(test, ptr[0] = ptr[size]); kfree(ptr); }
@@ -187,7 +187,7 @@ static void kmalloc_pagealloc_uaf(struct kunit *test) KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr); kfree(ptr);
- KUNIT_EXPECT_KASAN_FAIL(test, ptr[0] = 0); + KUNIT_EXPECT_KASAN_FAIL(test, ((volatile char *)ptr)[0]); }
static void kmalloc_pagealloc_invalid_free(struct kunit *test) @@ -221,7 +221,7 @@ static void pagealloc_oob_right(struct kunit *test) ptr = page_address(pages); KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
- KUNIT_EXPECT_KASAN_FAIL(test, ptr[size] = 0); + KUNIT_EXPECT_KASAN_FAIL(test, ptr[0] = ptr[size]); free_pages((unsigned long)ptr, order); }
@@ -236,7 +236,7 @@ static void pagealloc_uaf(struct kunit *test) KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr); free_pages((unsigned long)ptr, order);
- KUNIT_EXPECT_KASAN_FAIL(test, ptr[0] = 0); + KUNIT_EXPECT_KASAN_FAIL(test, ((volatile char *)ptr)[0]); }
static void kmalloc_large_oob_right(struct kunit *test) @@ -498,7 +498,7 @@ static void kmalloc_uaf(struct kunit *test) KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
kfree(ptr); - KUNIT_EXPECT_KASAN_FAIL(test, *(ptr + 8) = 'x'); + KUNIT_EXPECT_KASAN_FAIL(test, ((volatile char *)ptr)[8]); }
static void kmalloc_uaf_memset(struct kunit *test) @@ -537,7 +537,7 @@ static void kmalloc_uaf2(struct kunit *test) goto again; }
- KUNIT_EXPECT_KASAN_FAIL(test, ptr1[40] = 'x'); + KUNIT_EXPECT_KASAN_FAIL(test, ((volatile char *)ptr1)[40]); KUNIT_EXPECT_PTR_NE(test, ptr1, ptr2);
kfree(ptr2); @@ -684,7 +684,7 @@ static void ksize_unpoisons_memory(struct kunit *test) ptr[size] = 'x';
/* This one must. */ - KUNIT_EXPECT_KASAN_FAIL(test, ptr[real_size] = 'y'); + KUNIT_EXPECT_KASAN_FAIL(test, ((volatile char *)ptr)[real_size]);
kfree(ptr); }
From: Andrey Konovalov andreyknvl@gmail.com
[ Upstream commit 555999a009aacd90ea51a6690e8eb2a5d0427edc ]
kmalloc_oob_memset_*() tests do writes past the allocated objects. As the result, they corrupt memory, which might lead to crashes with the HW_TAGS mode, as it neither uses quarantine nor redzones.
Adjust the tests to only write memory within the aligned kmalloc objects.
Also add a comment mentioning that memset tests are designed to touch both valid and invalid memory.
Link: https://lkml.kernel.org/r/64fd457668a16e7b58d094f14a165f9d5170c5a9.162877980... Signed-off-by: Andrey Konovalov andreyknvl@gmail.com Reviewed-by: Marco Elver elver@google.com Cc: Alexander Potapenko glider@google.com Cc: Andrey Ryabinin aryabinin@virtuozzo.com Cc: Dmitry Vyukov dvyukov@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- lib/test_kasan.c | 28 +++++++++++++++++----------- 1 file changed, 17 insertions(+), 11 deletions(-)
diff --git a/lib/test_kasan.c b/lib/test_kasan.c index b261fe9f3110..b298edb325ab 100644 --- a/lib/test_kasan.c +++ b/lib/test_kasan.c @@ -412,64 +412,70 @@ static void kmalloc_uaf_16(struct kunit *test) kfree(ptr1); }
+/* + * Note: in the memset tests below, the written range touches both valid and + * invalid memory. This makes sure that the instrumentation does not only check + * the starting address but the whole range. + */ + static void kmalloc_oob_memset_2(struct kunit *test) { char *ptr; - size_t size = 8; + size_t size = 128 - KASAN_GRANULE_SIZE;
ptr = kmalloc(size, GFP_KERNEL); KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
- KUNIT_EXPECT_KASAN_FAIL(test, memset(ptr + 7 + OOB_TAG_OFF, 0, 2)); + KUNIT_EXPECT_KASAN_FAIL(test, memset(ptr + size - 1, 0, 2)); kfree(ptr); }
static void kmalloc_oob_memset_4(struct kunit *test) { char *ptr; - size_t size = 8; + size_t size = 128 - KASAN_GRANULE_SIZE;
ptr = kmalloc(size, GFP_KERNEL); KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
- KUNIT_EXPECT_KASAN_FAIL(test, memset(ptr + 5 + OOB_TAG_OFF, 0, 4)); + KUNIT_EXPECT_KASAN_FAIL(test, memset(ptr + size - 3, 0, 4)); kfree(ptr); }
- static void kmalloc_oob_memset_8(struct kunit *test) { char *ptr; - size_t size = 8; + size_t size = 128 - KASAN_GRANULE_SIZE;
ptr = kmalloc(size, GFP_KERNEL); KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
- KUNIT_EXPECT_KASAN_FAIL(test, memset(ptr + 1 + OOB_TAG_OFF, 0, 8)); + KUNIT_EXPECT_KASAN_FAIL(test, memset(ptr + size - 7, 0, 8)); kfree(ptr); }
static void kmalloc_oob_memset_16(struct kunit *test) { char *ptr; - size_t size = 16; + size_t size = 128 - KASAN_GRANULE_SIZE;
ptr = kmalloc(size, GFP_KERNEL); KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
- KUNIT_EXPECT_KASAN_FAIL(test, memset(ptr + 1 + OOB_TAG_OFF, 0, 16)); + KUNIT_EXPECT_KASAN_FAIL(test, memset(ptr + size - 15, 0, 16)); kfree(ptr); }
static void kmalloc_oob_in_memset(struct kunit *test) { char *ptr; - size_t size = 666; + size_t size = 128 - KASAN_GRANULE_SIZE;
ptr = kmalloc(size, GFP_KERNEL); KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
- KUNIT_EXPECT_KASAN_FAIL(test, memset(ptr, 0, size + 5 + OOB_TAG_OFF)); + KUNIT_EXPECT_KASAN_FAIL(test, + memset(ptr, 0, size + KASAN_GRANULE_SIZE)); kfree(ptr); }
From: Andrey Konovalov andreyknvl@gmail.com
[ Upstream commit 1b0668be62cfa394903bb368641c80533bf42d5a ]
The HW_TAGS mode doesn't check memmove for negative size. As a result, the kmalloc_memmove_invalid_size test corrupts memory, which can result in a crash.
Disable this test with HW_TAGS KASAN.
Link: https://lkml.kernel.org/r/088733a06ac21eba29aa85b6f769d2abd74f9638.162877980... Signed-off-by: Andrey Konovalov andreyknvl@gmail.com Reviewed-by: Marco Elver elver@google.com Cc: Alexander Potapenko glider@google.com Cc: Andrey Ryabinin aryabinin@virtuozzo.com Cc: Dmitry Vyukov dvyukov@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- lib/test_kasan.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/lib/test_kasan.c b/lib/test_kasan.c index b298edb325ab..c149675300bd 100644 --- a/lib/test_kasan.c +++ b/lib/test_kasan.c @@ -485,11 +485,17 @@ static void kmalloc_memmove_invalid_size(struct kunit *test) size_t size = 64; volatile size_t invalid_size = -2;
+ /* + * Hardware tag-based mode doesn't check memmove for negative size. + * As a result, this test introduces a side-effect memory corruption, + * which can result in a crash. + */ + KASAN_TEST_NEEDS_CONFIG_OFF(test, CONFIG_KASAN_HW_TAGS); + ptr = kmalloc(size, GFP_KERNEL); KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
memset((char *)ptr, 0, 64); - KUNIT_EXPECT_KASAN_FAIL(test, memmove((char *)ptr, (char *)ptr + 4, invalid_size)); kfree(ptr);
From: Andrey Konovalov andreyknvl@gmail.com
[ Upstream commit 25b12a58e848459ae2dbf2e7d318ef168bd1c5e2 ]
kmalloc_uaf_memset() writes to freed memory, which is only safe with the GENERIC mode (as it uses quarantine). For other modes, this test corrupts kernel memory, which might result in a crash.
Only enable kmalloc_uaf_memset() for the GENERIC mode.
Link: https://lkml.kernel.org/r/2e1c87b607b1292556cde3cab2764f108542b60c.162877980... Signed-off-by: Andrey Konovalov andreyknvl@gmail.com Reviewed-by: Marco Elver elver@google.com Cc: Alexander Potapenko glider@google.com Cc: Andrey Ryabinin aryabinin@virtuozzo.com Cc: Dmitry Vyukov dvyukov@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- lib/test_kasan.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/lib/test_kasan.c b/lib/test_kasan.c index c149675300bd..65adde0757a3 100644 --- a/lib/test_kasan.c +++ b/lib/test_kasan.c @@ -518,6 +518,12 @@ static void kmalloc_uaf_memset(struct kunit *test) char *ptr; size_t size = 33;
+ /* + * Only generic KASAN uses quarantine, which is required to avoid a + * kernel memory corruption this test causes. + */ + KASAN_TEST_NEEDS_CONFIG_ON(test, CONFIG_KASAN_GENERIC); + ptr = kmalloc(size, GFP_KERNEL); KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
From: Andrey Konovalov andreyknvl@gmail.com
[ Upstream commit b38fcca339dbcf680c9e43054502608fabc81508 ]
Some KASAN tests use global variables to store function returns values so that the compiler doesn't optimize away these functions.
ksize_uaf() doesn't call any functions, so it doesn't need to use kasan_int_result. Use volatile accesses instead, to be consistent with other similar tests.
Link: https://lkml.kernel.org/r/a1fc34faca4650f4a6e4dfb3f8d8d82c82eb953a.162877980... Signed-off-by: Andrey Konovalov andreyknvl@gmail.com Reviewed-by: Marco Elver elver@google.com Cc: Alexander Potapenko glider@google.com Cc: Andrey Ryabinin aryabinin@virtuozzo.com Cc: Dmitry Vyukov dvyukov@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- lib/test_kasan.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/lib/test_kasan.c b/lib/test_kasan.c index 65adde0757a3..564bee50cfa8 100644 --- a/lib/test_kasan.c +++ b/lib/test_kasan.c @@ -721,8 +721,8 @@ static void ksize_uaf(struct kunit *test) kfree(ptr);
KUNIT_EXPECT_KASAN_FAIL(test, ksize(ptr)); - KUNIT_EXPECT_KASAN_FAIL(test, kasan_int_result = *ptr); - KUNIT_EXPECT_KASAN_FAIL(test, kasan_int_result = *(ptr + size)); + KUNIT_EXPECT_KASAN_FAIL(test, ((volatile char *)ptr)[0]); + KUNIT_EXPECT_KASAN_FAIL(test, ((volatile char *)ptr)[size]); }
static void kasan_stack_oob(struct kunit *test)
From: Andrey Konovalov andreyknvl@gmail.com
[ Upstream commit 756e5a47a5ddf0caa3708f922385a92af9d330b5 ]
copy_user_test() does writes past the allocated object. As the result, it corrupts kernel memory, which might lead to crashes with the HW_TAGS mode, as it neither uses quarantine nor redzones.
(Technically, this test can't yet be enabled with the HW_TAGS mode, but this will be implemented in the future.)
Adjust the test to only write memory within the aligned kmalloc object.
Link: https://lkml.kernel.org/r/19bf3a5112ee65b7db88dc731643b657b816c5e8.162877980... Signed-off-by: Andrey Konovalov andreyknvl@gmail.com Reviewed-by: Marco Elver elver@google.com Cc: Alexander Potapenko glider@google.com Cc: Andrey Ryabinin aryabinin@virtuozzo.com Cc: Dmitry Vyukov dvyukov@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- lib/test_kasan_module.c | 18 ++++++++---------- 1 file changed, 8 insertions(+), 10 deletions(-)
diff --git a/lib/test_kasan_module.c b/lib/test_kasan_module.c index f1017f345d6c..fa73b9df0be4 100644 --- a/lib/test_kasan_module.c +++ b/lib/test_kasan_module.c @@ -15,13 +15,11 @@
#include "../mm/kasan/kasan.h"
-#define OOB_TAG_OFF (IS_ENABLED(CONFIG_KASAN_GENERIC) ? 0 : KASAN_GRANULE_SIZE) - static noinline void __init copy_user_test(void) { char *kmem; char __user *usermem; - size_t size = 10; + size_t size = 128 - KASAN_GRANULE_SIZE; int __maybe_unused unused;
kmem = kmalloc(size, GFP_KERNEL); @@ -38,25 +36,25 @@ static noinline void __init copy_user_test(void) }
pr_info("out-of-bounds in copy_from_user()\n"); - unused = copy_from_user(kmem, usermem, size + 1 + OOB_TAG_OFF); + unused = copy_from_user(kmem, usermem, size + 1);
pr_info("out-of-bounds in copy_to_user()\n"); - unused = copy_to_user(usermem, kmem, size + 1 + OOB_TAG_OFF); + unused = copy_to_user(usermem, kmem, size + 1);
pr_info("out-of-bounds in __copy_from_user()\n"); - unused = __copy_from_user(kmem, usermem, size + 1 + OOB_TAG_OFF); + unused = __copy_from_user(kmem, usermem, size + 1);
pr_info("out-of-bounds in __copy_to_user()\n"); - unused = __copy_to_user(usermem, kmem, size + 1 + OOB_TAG_OFF); + unused = __copy_to_user(usermem, kmem, size + 1);
pr_info("out-of-bounds in __copy_from_user_inatomic()\n"); - unused = __copy_from_user_inatomic(kmem, usermem, size + 1 + OOB_TAG_OFF); + unused = __copy_from_user_inatomic(kmem, usermem, size + 1);
pr_info("out-of-bounds in __copy_to_user_inatomic()\n"); - unused = __copy_to_user_inatomic(usermem, kmem, size + 1 + OOB_TAG_OFF); + unused = __copy_to_user_inatomic(usermem, kmem, size + 1);
pr_info("out-of-bounds in strncpy_from_user()\n"); - unused = strncpy_from_user(kmem, usermem, size + 1 + OOB_TAG_OFF); + unused = strncpy_from_user(kmem, usermem, size + 1);
vm_munmap((unsigned long)usermem, PAGE_SIZE); kfree(kmem);
From: Andrey Konovalov andreyknvl@gmail.com
[ Upstream commit f16de0bcdb55bf18e2533ca625f3e4b4952f254c ]
kasan_rcu_uaf() writes to freed memory via kasan_rcu_reclaim(), which is only safe with the GENERIC mode (as it uses quarantine). For other modes, this test corrupts kernel memory, which might result in a crash.
Turn the write into a read.
Link: https://lkml.kernel.org/r/b6f2c3bf712d2457c783fa59498225b66a634f62.162877980... Signed-off-by: Andrey Konovalov andreyknvl@gmail.com Reviewed-by: Marco Elver elver@google.com Cc: Alexander Potapenko glider@google.com Cc: Andrey Ryabinin aryabinin@virtuozzo.com Cc: Dmitry Vyukov dvyukov@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- lib/test_kasan_module.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/test_kasan_module.c b/lib/test_kasan_module.c index fa73b9df0be4..7ebf433edef3 100644 --- a/lib/test_kasan_module.c +++ b/lib/test_kasan_module.c @@ -71,7 +71,7 @@ static noinline void __init kasan_rcu_reclaim(struct rcu_head *rp) struct kasan_rcu_info, rcu);
kfree(fp); - fp->i = 1; + ((volatile struct kasan_rcu_info *)fp)->i; }
static noinline void __init kasan_rcu_uaf(void)
linux-stable-mirror@lists.linaro.org