From: Jia-Ju Bai baijiaju1990@gmail.com
[ Upstream commit 715ea61532e731c62392221238906704e63d75b6 ]
When krealloc() fails and new is NULL, no error return code of icc_link_destroy() is assigned. To fix this bug, ret is assigned with -ENOMEM hen new is NULL.
Reported-by: TOTE Robot oslab@tsinghua.edu.cn Signed-off-by: Jia-Ju Bai baijiaju1990@gmail.com Link: https://lore.kernel.org/r/20210306132857.17020-1-baijiaju1990@gmail.com Signed-off-by: Georgi Djakov georgi.djakov@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/interconnect/core.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c index 5ad519c9f239..8a1e70e00876 100644 --- a/drivers/interconnect/core.c +++ b/drivers/interconnect/core.c @@ -942,6 +942,8 @@ int icc_link_destroy(struct icc_node *src, struct icc_node *dst) GFP_KERNEL); if (new) src->links = new; + else + ret = -ENOMEM;
out: mutex_unlock(&icc_lock);
From: Andrew Price anprice@redhat.com
[ Upstream commit 62dd0f98a0e5668424270b47a0c2e973795faba7 ]
Interrupting mount with ^C quickly enough can cause the kthread_run() calls in gfs2's init_threads() to fail and the error path leads to a deadlock on the s_umount rwsem. The abridged chain of events is:
[mount path] get_tree_bdev() sget_fc() alloc_super() down_write_nested(&s->s_umount, SINGLE_DEPTH_NESTING); [acquired] gfs2_fill_super() gfs2_make_fs_rw() init_threads() kthread_run() ( Interrupted ) [Error path] gfs2_gl_hash_clear() flush_workqueue(glock_workqueue) wait_for_completion()
[workqueue context] glock_work_func() run_queue() do_xmote() freeze_go_sync() freeze_super() down_write(&sb->s_umount) [deadlock]
In freeze_go_sync() there is a gfs2_withdrawn() check that we can use to make sure freeze_super() is not called in the error path, so add a gfs2_withdraw_delayed() call when init_threads() fails.
Ref: https://bugzilla.kernel.org/show_bug.cgi?id=212231
Reported-by: Alexander Aring aahringo@redhat.com Signed-off-by: Andrew Price anprice@redhat.com Signed-off-by: Andreas Gruenbacher agruenba@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/gfs2/super.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c index 754ea2a137b4..34ca312457a6 100644 --- a/fs/gfs2/super.c +++ b/fs/gfs2/super.c @@ -169,8 +169,10 @@ int gfs2_make_fs_rw(struct gfs2_sbd *sdp) int error;
error = init_threads(sdp); - if (error) + if (error) { + gfs2_withdraw_delayed(sdp); return error; + }
j_gl->gl_ops->go_inval(j_gl, DIO_METADATA); if (gfs2_withdrawn(sdp)) {
From: Suzuki K Poulose suzuki.poulose@arm.com
[ Upstream commit 1d676673d665fd2162e7e466dcfbe5373bfdb73e ]
Currently we advertise the ID_AA6DFR0_EL1.TRACEVER for the guest, when the trace register accesses are trapped (CPTR_EL2.TTA == 1). So, the guest will get an undefined instruction, if trusts the ID registers and access one of the trace registers. Lets be nice to the guest and hide the feature to avoid unexpected behavior.
Even though this can be done at KVM sysreg emulation layer, we do this by removing the TRACEVER from the sanitised feature register field. This is fine as long as the ETM drivers can handle the individual trace units separately, even when there are differences among the CPUs.
Cc: Will Deacon will@kernel.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Mark Rutland mark.rutland@arm.com Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com Signed-off-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20210323120647.454211-2-suzuki.poulose@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/kernel/cpufeature.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 33b6f56dcb21..22106062cb63 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -380,7 +380,6 @@ static const struct arm64_ftr_bits ftr_id_aa64dfr0[] = { * of support. */ S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_EXACT, ID_AA64DFR0_PMUVER_SHIFT, 4, 0), - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64DFR0_TRACEVER_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64DFR0_DEBUGVER_SHIFT, 4, 0x6), ARM64_FTR_END, };
From: Suzuki K Poulose suzuki.poulose@arm.com
[ Upstream commit a354a64d91eec3e0f8ef0eed575b480fd75b999c ]
Disable guest access to the Trace Filter control registers. We do not advertise the Trace filter feature to the guest (ID_AA64DFR0_EL1: TRACE_FILT is cleared) already, but the guest can still access the TRFCR_EL1 unless we trap it.
This will also make sure that the guest cannot fiddle with the filtering controls set by a nvhe host.
Cc: Marc Zyngier maz@kernel.org Cc: Will Deacon will@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com Signed-off-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20210323120647.454211-3-suzuki.poulose@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/include/asm/kvm_arm.h | 1 + arch/arm64/kvm/debug.c | 2 ++ 2 files changed, 3 insertions(+)
diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index 4e90c2debf70..94d4025acc0b 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -278,6 +278,7 @@ #define CPTR_EL2_DEFAULT CPTR_EL2_RES1
/* Hyp Debug Configuration Register bits */ +#define MDCR_EL2_TTRF (1 << 19) #define MDCR_EL2_TPMS (1 << 14) #define MDCR_EL2_E2PB_MASK (UL(0x3)) #define MDCR_EL2_E2PB_SHIFT (UL(12)) diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c index 7a7e425616b5..dbc890511631 100644 --- a/arch/arm64/kvm/debug.c +++ b/arch/arm64/kvm/debug.c @@ -89,6 +89,7 @@ void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu) * - Debug ROM Address (MDCR_EL2_TDRA) * - OS related registers (MDCR_EL2_TDOSA) * - Statistical profiler (MDCR_EL2_TPMS/MDCR_EL2_E2PB) + * - Self-hosted Trace Filter controls (MDCR_EL2_TTRF) * * Additionally, KVM only traps guest accesses to the debug registers if * the guest is not actively using them (see the KVM_ARM64_DEBUG_DIRTY @@ -112,6 +113,7 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) vcpu->arch.mdcr_el2 = __this_cpu_read(mdcr_el2) & MDCR_EL2_HPMN_MASK; vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM | MDCR_EL2_TPMS | + MDCR_EL2_TTRF | MDCR_EL2_TPMCR | MDCR_EL2_TDRA | MDCR_EL2_TDOSA);
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 33ce7f2f95cabb5834cf0906308a5cb6103976da ]
When CONFIG_OF is disabled, building with 'make W=1' produces warnings about out of bounds array access:
drivers/gpu/drm/imx/imx-ldb.c: In function 'imx_ldb_set_clock.constprop': drivers/gpu/drm/imx/imx-ldb.c:186:8: error: array subscript -22 is below array bounds of 'struct clk *[4]' [-Werror=array-bounds]
Add an error check before the index is used, which helps with the warning, as well as any possible other error condition that may be triggered at runtime.
The warning could be fixed by adding a Kconfig depedency on CONFIG_OF, but Liu Ying points out that the driver may hit the out-of-bounds problem at runtime anyway.
Signed-off-by: Arnd Bergmann arnd@arndb.de Reviewed-by: Liu Ying victor.liu@nxp.com Signed-off-by: Philipp Zabel p.zabel@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/imx/imx-ldb.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/drivers/gpu/drm/imx/imx-ldb.c b/drivers/gpu/drm/imx/imx-ldb.c index 41e2978cb1eb..75036aaa0c63 100644 --- a/drivers/gpu/drm/imx/imx-ldb.c +++ b/drivers/gpu/drm/imx/imx-ldb.c @@ -190,6 +190,11 @@ static void imx_ldb_encoder_enable(struct drm_encoder *encoder) int dual = ldb->ldb_ctrl & LDB_SPLIT_MODE_EN; int mux = drm_of_encoder_active_port_id(imx_ldb_ch->child, encoder);
+ if (mux < 0 || mux >= ARRAY_SIZE(ldb->clk_sel)) { + dev_warn(ldb->dev, "%s: invalid mux %d\n", __func__, mux); + return; + } + drm_panel_prepare(imx_ldb_ch->panel);
if (dual) { @@ -248,6 +253,11 @@ imx_ldb_encoder_atomic_mode_set(struct drm_encoder *encoder, int mux = drm_of_encoder_active_port_id(imx_ldb_ch->child, encoder); u32 bus_format = imx_ldb_ch->bus_format;
+ if (mux < 0 || mux >= ARRAY_SIZE(ldb->clk_sel)) { + dev_warn(ldb->dev, "%s: invalid mux %d\n", __func__, mux); + return; + } + if (mode->clock > 170000) { dev_warn(ldb->dev, "%s: mode exceeds 170 MHz pixel clock\n", __func__);
From: Bob Peterson rpeterso@redhat.com
[ Upstream commit ff132c5f93c06bd4432bbab5c369e468653bdec4 ]
Before this patch, gfs2's freeze function failed to report an error when the target file system was already frozen as it should (and as generic vfs function freeze_super does. Similarly, gfs2's thaw function failed to report an error when trying to thaw a file system that is not frozen, as vfs function thaw_super does. The errors were checked, but it always returned a 0 return code.
This patch adds the missing error return codes to gfs2 freeze and thaw.
Signed-off-by: Bob Peterson rpeterso@redhat.com Signed-off-by: Andreas Gruenbacher agruenba@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/gfs2/super.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c index 34ca312457a6..223ebd6b1b8d 100644 --- a/fs/gfs2/super.c +++ b/fs/gfs2/super.c @@ -767,11 +767,13 @@ void gfs2_freeze_func(struct work_struct *work) static int gfs2_freeze(struct super_block *sb) { struct gfs2_sbd *sdp = sb->s_fs_info; - int error = 0; + int error;
mutex_lock(&sdp->sd_freeze_mutex); - if (atomic_read(&sdp->sd_freeze_state) != SFS_UNFROZEN) + if (atomic_read(&sdp->sd_freeze_state) != SFS_UNFROZEN) { + error = -EBUSY; goto out; + }
for (;;) { if (gfs2_withdrawn(sdp)) { @@ -812,10 +814,10 @@ static int gfs2_unfreeze(struct super_block *sb) struct gfs2_sbd *sdp = sb->s_fs_info;
mutex_lock(&sdp->sd_freeze_mutex); - if (atomic_read(&sdp->sd_freeze_state) != SFS_FROZEN || + if (atomic_read(&sdp->sd_freeze_state) != SFS_FROZEN || !gfs2_holder_initialized(&sdp->sd_freeze_gh)) { mutex_unlock(&sdp->sd_freeze_mutex); - return 0; + return -EINVAL; }
gfs2_freeze_unlock(&sdp->sd_freeze_gh);
From: Gulam Mohamed gulam.mohamed@oracle.com
[ Upstream commit 9e67600ed6b8565da4b85698ec659b5879a6c1c6 ]
A kernel panic was observed due to a timing issue between the sync thread and the initiator processing a login response from the target. The session reopen can be invoked both from the session sync thread when iscsid restarts and from iscsid through the error handler. Before the initiator receives the response to a login, another reopen request can be sent from the error handler/sync session. When the initial login response is subsequently processed, the connection has been closed and the socket has been released.
To fix this a new connection state, ISCSI_CONN_BOUND, is added:
- Set the connection state value to ISCSI_CONN_DOWN upon iscsi_if_ep_disconnect() and iscsi_if_stop_conn()
- Set the connection state to the newly created value ISCSI_CONN_BOUND after bind connection (transport->bind_conn())
- In iscsi_set_param(), return -ENOTCONN if the connection state is not either ISCSI_CONN_BOUND or ISCSI_CONN_UP
Link: https://lore.kernel.org/r/20210325093248.284678-1-gulam.mohamed@oracle.com Reviewed-by: Mike Christie michael.christie@oracle.com Signed-off-by: Gulam Mohamed gulam.mohamed@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com
index 91074fd97f64..f4bf62b007a0 100644
Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/scsi_transport_iscsi.c | 14 +++++++++++++- include/scsi/scsi_transport_iscsi.h | 1 + 2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c index c53c3f9fa526..f648452d8d66 100644 --- a/drivers/scsi/scsi_transport_iscsi.c +++ b/drivers/scsi/scsi_transport_iscsi.c @@ -2478,6 +2478,7 @@ static void iscsi_if_stop_conn(struct iscsi_cls_conn *conn, int flag) */ mutex_lock(&conn_mutex); conn->transport->stop_conn(conn, flag); + conn->state = ISCSI_CONN_DOWN; mutex_unlock(&conn_mutex);
} @@ -2904,6 +2905,13 @@ iscsi_set_param(struct iscsi_transport *transport, struct iscsi_uevent *ev) default: err = transport->set_param(conn, ev->u.set_param.param, data, ev->u.set_param.len); + if ((conn->state == ISCSI_CONN_BOUND) || + (conn->state == ISCSI_CONN_UP)) { + err = transport->set_param(conn, ev->u.set_param.param, + data, ev->u.set_param.len); + } else { + return -ENOTCONN; + } }
return err; @@ -2963,6 +2971,7 @@ static int iscsi_if_ep_disconnect(struct iscsi_transport *transport, mutex_lock(&conn->ep_mutex); conn->ep = NULL; mutex_unlock(&conn->ep_mutex); + conn->state = ISCSI_CONN_DOWN; }
transport->ep_disconnect(ep); @@ -3730,6 +3739,8 @@ iscsi_if_recv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, uint32_t *group) ev->r.retcode = transport->bind_conn(session, conn, ev->u.b_conn.transport_eph, ev->u.b_conn.is_leading); + if (!ev->r.retcode) + conn->state = ISCSI_CONN_BOUND; mutex_unlock(&conn_mutex);
if (ev->r.retcode || !transport->ep_connect) @@ -3969,7 +3980,8 @@ iscsi_conn_attr(local_ipaddr, ISCSI_PARAM_LOCAL_IPADDR); static const char *const connection_state_names[] = { [ISCSI_CONN_UP] = "up", [ISCSI_CONN_DOWN] = "down", - [ISCSI_CONN_FAILED] = "failed" + [ISCSI_CONN_FAILED] = "failed", + [ISCSI_CONN_BOUND] = "bound" };
static ssize_t show_conn_state(struct device *dev, diff --git a/include/scsi/scsi_transport_iscsi.h b/include/scsi/scsi_transport_iscsi.h index 8a26a2ffa952..fc5a39839b4b 100644 --- a/include/scsi/scsi_transport_iscsi.h +++ b/include/scsi/scsi_transport_iscsi.h @@ -193,6 +193,7 @@ enum iscsi_connection_state { ISCSI_CONN_UP = 0, ISCSI_CONN_DOWN, ISCSI_CONN_FAILED, + ISCSI_CONN_BOUND, };
struct iscsi_cls_conn {
From: "Steven Rostedt (VMware)" rostedt@goodmis.org
[ Upstream commit 59300b36f85f254260c81d9dd09195fa49eb0f98 ]
It is possible that on error pg->size can be zero when getting its order, which would return a -1 value. It is dangerous to pass in an order of -1 to free_pages(). Check if order is greater than or equal to zero before calling free_pages().
Link: https://lore.kernel.org/lkml/20210330093916.432697c7@gandalf.local.home/
Reported-by: Abaci Robot abaci@linux.alibaba.com Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/ftrace.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index b7e29db127fa..3ba52d4e1314 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -3231,7 +3231,8 @@ ftrace_allocate_pages(unsigned long num_to_init) pg = start_pg; while (pg) { order = get_count_order(pg->size / ENTRIES_PER_PAGE); - free_pages((unsigned long)pg->records, order); + if (order >= 0) + free_pages((unsigned long)pg->records, order); start_pg = pg->next; kfree(pg); pg = start_pg; @@ -6451,7 +6452,8 @@ void ftrace_release_mod(struct module *mod) clear_mod_from_hashes(pg);
order = get_count_order(pg->size / ENTRIES_PER_PAGE); - free_pages((unsigned long)pg->records, order); + if (order >= 0) + free_pages((unsigned long)pg->records, order); tmp_page = pg->next; kfree(pg); ftrace_number_of_pages -= 1 << order; @@ -6811,7 +6813,8 @@ void ftrace_free_mem(struct module *mod, void *start_ptr, void *end_ptr) if (!pg->index) { *last_pg = pg->next; order = get_count_order(pg->size / ENTRIES_PER_PAGE); - free_pages((unsigned long)pg->records, order); + if (order >= 0) + free_pages((unsigned long)pg->records, order); ftrace_number_of_pages -= 1 << order; ftrace_number_of_groups--; kfree(pg);
From: Stefan Raspl raspl@linux.ibm.com
[ Upstream commit 75f94ecbd0dfd2ac4e671f165f5ae864b7301422 ]
If this service is enabled and the system rebooted, Systemd's initial attempt to start this unit file may fail in case the kvm module is not loaded. Since we did not specify a delay for the retries, Systemd restarts with a minimum delay a number of times before giving up and disabling the service. Which means a subsequent kvm module load will have kvm running without monitoring. Adding a delay to fix this.
Signed-off-by: Stefan Raspl raspl@linux.ibm.com Message-Id: 20210325122949.1433271-1-raspl@linux.ibm.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/kvm/kvm_stat/kvm_stat.service | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/kvm/kvm_stat/kvm_stat.service b/tools/kvm/kvm_stat/kvm_stat.service index 71aabaffe779..8f13b843d5b4 100644 --- a/tools/kvm/kvm_stat/kvm_stat.service +++ b/tools/kvm/kvm_stat/kvm_stat.service @@ -9,6 +9,7 @@ Type=simple ExecStart=/usr/bin/kvm_stat -dtcz -s 10 -L /var/log/kvm_stat.csv ExecReload=/bin/kill -HUP $MAINPID Restart=always +RestartSec=60s SyslogIdentifier=kvm_stat SyslogLevel=debug
From: Dmitry Osipenko digetx@gmail.com
[ Upstream commit f8fb97c915954fc6de6513cdf277103b5c6df7b3 ]
RGB output doesn't allow to change parent clock rate of the display and PCLK rate is set to 0Hz in this case. The tegra_dc_commit_state() shall not set the display clock to 0Hz since this change propagates to the parent clock. The DISP clock is defined as a NODIV clock by the tegra-clk driver and all NODIV clocks use the CLK_SET_RATE_PARENT flag.
This bug stayed unnoticed because by default PLLP is used as the parent clock for the display controller and PLLP silently skips the erroneous 0Hz rate changes because it always has active child clocks that don't permit rate changes. The PLLP isn't acceptable for some devices that we want to upstream (like Samsung Galaxy Tab and ASUS TF700T) due to a display panel clock rate requirements that can't be fulfilled by using PLLP and then the bug pops up in this case since parent clock is set to 0Hz, killing the display output.
Don't touch DC clock if pclk=0 in order to fix the problem.
Signed-off-by: Dmitry Osipenko digetx@gmail.com Signed-off-by: Thierry Reding treding@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/tegra/dc.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c index 0ae3a025efe9..24362533e14c 100644 --- a/drivers/gpu/drm/tegra/dc.c +++ b/drivers/gpu/drm/tegra/dc.c @@ -1688,6 +1688,11 @@ static void tegra_dc_commit_state(struct tegra_dc *dc, dev_err(dc->dev, "failed to set clock rate to %lu Hz\n", state->pclk); + + err = clk_set_rate(dc->clk, state->pclk); + if (err < 0) + dev_err(dc->dev, "failed to set clock %pC to %lu Hz: %d\n", + dc->clk, state->pclk, err); }
DRM_DEBUG_KMS("rate: %lu, div: %u\n", clk_get_rate(dc->clk), @@ -1698,11 +1703,6 @@ static void tegra_dc_commit_state(struct tegra_dc *dc, value = SHIFT_CLK_DIVIDER(state->div) | PIXEL_CLK_DIVIDER_PCD1; tegra_dc_writel(dc, value, DC_DISP_DISP_CLOCK_CONTROL); } - - err = clk_set_rate(dc->clk, state->pclk); - if (err < 0) - dev_err(dc->dev, "failed to set clock %pC to %lu Hz: %d\n", - dc->clk, state->pclk, err); }
static void tegra_dc_stop(struct tegra_dc *dc)
From: Mikko Perttunen mperttunen@nvidia.com
[ Upstream commit a24f98176d1efae2c37d3438c57a624d530d9c33 ]
To avoid false lockdep warnings, give each client lock a different lock class, passed from the initialization site by macro.
Signed-off-by: Mikko Perttunen mperttunen@nvidia.com Signed-off-by: Thierry Reding treding@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/host1x/bus.c | 10 ++++++---- include/linux/host1x.h | 9 ++++++++- 2 files changed, 14 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/host1x/bus.c b/drivers/gpu/host1x/bus.c index 347fb962b6c9..68a766ff0e9d 100644 --- a/drivers/gpu/host1x/bus.c +++ b/drivers/gpu/host1x/bus.c @@ -705,8 +705,9 @@ void host1x_driver_unregister(struct host1x_driver *driver) EXPORT_SYMBOL(host1x_driver_unregister);
/** - * host1x_client_register() - register a host1x client + * __host1x_client_register() - register a host1x client * @client: host1x client + * @key: lock class key for the client-specific mutex * * Registers a host1x client with each host1x controller instance. Note that * each client will only match their parent host1x controller and will only be @@ -715,13 +716,14 @@ EXPORT_SYMBOL(host1x_driver_unregister); * device and call host1x_device_init(), which will in turn call each client's * &host1x_client_ops.init implementation. */ -int host1x_client_register(struct host1x_client *client) +int __host1x_client_register(struct host1x_client *client, + struct lock_class_key *key) { struct host1x *host1x; int err;
INIT_LIST_HEAD(&client->list); - mutex_init(&client->lock); + __mutex_init(&client->lock, "host1x client lock", key); client->usecount = 0;
mutex_lock(&devices_lock); @@ -742,7 +744,7 @@ int host1x_client_register(struct host1x_client *client)
return 0; } -EXPORT_SYMBOL(host1x_client_register); +EXPORT_SYMBOL(__host1x_client_register);
/** * host1x_client_unregister() - unregister a host1x client diff --git a/include/linux/host1x.h b/include/linux/host1x.h index ce59a6a6a008..9eb77c87a83b 100644 --- a/include/linux/host1x.h +++ b/include/linux/host1x.h @@ -320,7 +320,14 @@ static inline struct host1x_device *to_host1x_device(struct device *dev) int host1x_device_init(struct host1x_device *device); int host1x_device_exit(struct host1x_device *device);
-int host1x_client_register(struct host1x_client *client); +int __host1x_client_register(struct host1x_client *client, + struct lock_class_key *key); +#define host1x_client_register(class) \ + ({ \ + static struct lock_class_key __key; \ + __host1x_client_register(class, &__key); \ + }) + int host1x_client_unregister(struct host1x_client *client);
int host1x_client_suspend(struct host1x_client *client);
From: "Matthew Wilcox (Oracle)" willy@infradead.org
[ Upstream commit 3012110d71f41410932924e1d188f9eb57f1f824 ]
Splitting an order-4 entry into order-2 entries would leave the array containing pointers to 000040008000c000 instead of 000044448888cccc. This is a one-character fix, but enhance the test suite to check this case.
Reported-by: Zi Yan ziy@nvidia.com Signed-off-by: Matthew Wilcox (Oracle) willy@infradead.org Signed-off-by: Sasha Levin sashal@kernel.org --- lib/test_xarray.c | 26 ++++++++++++++------------ lib/xarray.c | 4 ++-- 2 files changed, 16 insertions(+), 14 deletions(-)
diff --git a/lib/test_xarray.c b/lib/test_xarray.c index 8294f43f4981..8b1c318189ce 100644 --- a/lib/test_xarray.c +++ b/lib/test_xarray.c @@ -1530,24 +1530,24 @@ static noinline void check_store_range(struct xarray *xa)
#ifdef CONFIG_XARRAY_MULTI static void check_split_1(struct xarray *xa, unsigned long index, - unsigned int order) + unsigned int order, unsigned int new_order) { - XA_STATE(xas, xa, index); - void *entry; - unsigned int i = 0; + XA_STATE_ORDER(xas, xa, index, new_order); + unsigned int i;
xa_store_order(xa, index, order, xa, GFP_KERNEL);
xas_split_alloc(&xas, xa, order, GFP_KERNEL); xas_lock(&xas); xas_split(&xas, xa, order); + for (i = 0; i < (1 << order); i += (1 << new_order)) + __xa_store(xa, index + i, xa_mk_index(index + i), 0); xas_unlock(&xas);
- xa_for_each(xa, index, entry) { - XA_BUG_ON(xa, entry != xa); - i++; + for (i = 0; i < (1 << order); i++) { + unsigned int val = index + (i & ~((1 << new_order) - 1)); + XA_BUG_ON(xa, xa_load(xa, index + i) != xa_mk_index(val)); } - XA_BUG_ON(xa, i != 1 << order);
xa_set_mark(xa, index, XA_MARK_0); XA_BUG_ON(xa, !xa_get_mark(xa, index, XA_MARK_0)); @@ -1557,14 +1557,16 @@ static void check_split_1(struct xarray *xa, unsigned long index,
static noinline void check_split(struct xarray *xa) { - unsigned int order; + unsigned int order, new_order;
XA_BUG_ON(xa, !xa_empty(xa));
for (order = 1; order < 2 * XA_CHUNK_SHIFT; order++) { - check_split_1(xa, 0, order); - check_split_1(xa, 1UL << order, order); - check_split_1(xa, 3UL << order, order); + for (new_order = 0; new_order < order; new_order++) { + check_split_1(xa, 0, order, new_order); + check_split_1(xa, 1UL << order, order, new_order); + check_split_1(xa, 3UL << order, order, new_order); + } } } #else diff --git a/lib/xarray.c b/lib/xarray.c index 5fa51614802a..ed775dee1074 100644 --- a/lib/xarray.c +++ b/lib/xarray.c @@ -1011,7 +1011,7 @@ void xas_split_alloc(struct xa_state *xas, void *entry, unsigned int order,
do { unsigned int i; - void *sibling; + void *sibling = NULL; struct xa_node *node;
node = kmem_cache_alloc(radix_tree_node_cachep, gfp); @@ -1021,7 +1021,7 @@ void xas_split_alloc(struct xa_state *xas, void *entry, unsigned int order, for (i = 0; i < XA_CHUNK_SIZE; i++) { if ((i & mask) == 0) { RCU_INIT_POINTER(node->slots[i], entry); - sibling = xa_mk_sibling(0); + sibling = xa_mk_sibling(i); } else { RCU_INIT_POINTER(node->slots[i], sibling); }
From: "Matthew Wilcox (Oracle)" willy@infradead.org
[ Upstream commit 7487de534dcbe143e6f41da751dd3ffcf93b00ee ]
Commit 4bba4c4bb09a added tools/include/linux/compiler_types.h which includes linux/compiler-gcc.h. Unfortunately, we had our own (empty) compiler_types.h which overrode the one added by that commit, and so we lost the definition of __must_be_array(). Removing our empty compiler_types.h fixes the problem and reduces our divergence from the rest of the tools.
Signed-off-by: Matthew Wilcox (Oracle) willy@infradead.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/radix-tree/linux/compiler_types.h | 0 1 file changed, 0 insertions(+), 0 deletions(-) delete mode 100644 tools/testing/radix-tree/linux/compiler_types.h
diff --git a/tools/testing/radix-tree/linux/compiler_types.h b/tools/testing/radix-tree/linux/compiler_types.h deleted file mode 100644 index e69de29bb2d1..000000000000
From: Yufen Yu yuyufen@huawei.com
[ Upstream commit 3edf5346e4f2ce2fa0c94651a90a8dda169565ee ]
For multiple split bios, if one of the bio is fail, the whole should return error to application. But we found there is a race between bio_integrity_verify_fn and bio complete, which return io success to application after one of the bio fail. The race as following:
split bio(READ) kworker
nvme_complete_rq blk_update_request //split error=0 bio_endio bio_integrity_endio queue_work(kintegrityd_wq, &bip->bip_work);
bio_integrity_verify_fn bio_endio //split bio __bio_chain_endio if (!parent->bi_status)
<interrupt entry> nvme_irq blk_update_request //parent error=7 req_bio_endio bio->bi_status = 7 //parent bio <interrupt exit>
parent->bi_status = 0 parent->bi_end_io() // return bi_status=0
The bio has been split as two: split and parent. When split bio completed, it depends on kworker to do endio, while bio_integrity_verify_fn have been interrupted by parent bio complete irq handler. Then, parent bio->bi_status which have been set in irq handler will overwrite by kworker.
In fact, even without the above race, we also need to conside the concurrency beteen mulitple split bio complete and update the same parent bi_status. Normally, multiple split bios will be issued to the same hctx and complete from the same irq vector. But if we have updated queue map between multiple split bios, these bios may complete on different hw queue and different irq vector. Then the concurrency update parent bi_status may cause the final status error.
Suggested-by: Keith Busch kbusch@kernel.org Signed-off-by: Yufen Yu yuyufen@huawei.com Reviewed-by: Ming Lei ming.lei@redhat.com Link: https://lore.kernel.org/r/20210331115359.1125679-1-yuyufen@huawei.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/bio.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/bio.c b/block/bio.c index 1f2cc1fbe283..3209d865828a 100644 --- a/block/bio.c +++ b/block/bio.c @@ -313,7 +313,7 @@ static struct bio *__bio_chain_endio(struct bio *bio) { struct bio *parent = bio->bi_private;
- if (!parent->bi_status) + if (bio->bi_status && !parent->bi_status) parent->bi_status = bio->bi_status; bio_put(bio); return parent;
From: "Matthew Wilcox (Oracle)" willy@infradead.org
[ Upstream commit 1bb4bd266cf39fd2fa711f2d265c558b92df1119 ]
Several test runners register individual worker threads with the RCU library, but neglect to register the main thread, which can lead to objects being freed while the main thread is in what appears to be an RCU critical section.
Reported-by: Chris von Recklinghausen crecklin@redhat.com Signed-off-by: Matthew Wilcox (Oracle) willy@infradead.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/radix-tree/idr-test.c | 2 ++ tools/testing/radix-tree/multiorder.c | 2 ++ tools/testing/radix-tree/xarray.c | 2 ++ 3 files changed, 6 insertions(+)
diff --git a/tools/testing/radix-tree/idr-test.c b/tools/testing/radix-tree/idr-test.c index 3b796dd5e577..44ceff95a9b3 100644 --- a/tools/testing/radix-tree/idr-test.c +++ b/tools/testing/radix-tree/idr-test.c @@ -577,6 +577,7 @@ void ida_tests(void)
int __weak main(void) { + rcu_register_thread(); radix_tree_init(); idr_checks(); ida_tests(); @@ -584,5 +585,6 @@ int __weak main(void) rcu_barrier(); if (nr_allocated) printf("nr_allocated = %d\n", nr_allocated); + rcu_unregister_thread(); return 0; } diff --git a/tools/testing/radix-tree/multiorder.c b/tools/testing/radix-tree/multiorder.c index 9eae0fb5a67d..e00520cc6349 100644 --- a/tools/testing/radix-tree/multiorder.c +++ b/tools/testing/radix-tree/multiorder.c @@ -224,7 +224,9 @@ void multiorder_checks(void)
int __weak main(void) { + rcu_register_thread(); radix_tree_init(); multiorder_checks(); + rcu_unregister_thread(); return 0; } diff --git a/tools/testing/radix-tree/xarray.c b/tools/testing/radix-tree/xarray.c index e61e43efe463..f20e12cbbfd4 100644 --- a/tools/testing/radix-tree/xarray.c +++ b/tools/testing/radix-tree/xarray.c @@ -25,11 +25,13 @@ void xarray_tests(void)
int __weak main(void) { + rcu_register_thread(); radix_tree_init(); xarray_tests(); radix_tree_cpu_dead(1); rcu_barrier(); if (nr_allocated) printf("nr_allocated = %d\n", nr_allocated); + rcu_unregister_thread(); return 0; }
From: "Matthew Wilcox (Oracle)" willy@infradead.org
[ Upstream commit 703586410da69eb40062e64d413ca33bd735917a ]
When run on a single CPU, this test would frequently access already-freed memory. Due to timing, this bug never showed up on multi-CPU tests.
Reported-by: Chris von Recklinghausen crecklin@redhat.com Signed-off-by: Matthew Wilcox (Oracle) willy@infradead.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/radix-tree/idr-test.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/tools/testing/radix-tree/idr-test.c b/tools/testing/radix-tree/idr-test.c index 44ceff95a9b3..4a9b451b7ba0 100644 --- a/tools/testing/radix-tree/idr-test.c +++ b/tools/testing/radix-tree/idr-test.c @@ -306,11 +306,15 @@ void idr_find_test_1(int anchor_id, int throbber_id) BUG_ON(idr_alloc(&find_idr, xa_mk_value(anchor_id), anchor_id, anchor_id + 1, GFP_KERNEL) != anchor_id);
+ rcu_read_lock(); do { int id = 0; void *entry = idr_get_next(&find_idr, &id); + rcu_read_unlock(); BUG_ON(entry != xa_mk_value(id)); + rcu_read_lock(); } while (time(NULL) < start + 11); + rcu_read_unlock();
pthread_join(throbber, NULL);
From: "Matthew Wilcox (Oracle)" willy@infradead.org
[ Upstream commit 094ffbd1d8eaa27ed426feb8530cb1456348b018 ]
The throbber could race with creation of the anchor entry and cause the IDR to have zero entries in it, which would cause the test to fail.
Signed-off-by: Matthew Wilcox (Oracle) willy@infradead.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/radix-tree/idr-test.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/radix-tree/idr-test.c b/tools/testing/radix-tree/idr-test.c index 4a9b451b7ba0..6ce7460f3c7a 100644 --- a/tools/testing/radix-tree/idr-test.c +++ b/tools/testing/radix-tree/idr-test.c @@ -301,11 +301,11 @@ void idr_find_test_1(int anchor_id, int throbber_id) pthread_t throbber; time_t start = time(NULL);
- pthread_create(&throbber, NULL, idr_throbber, &throbber_id); - BUG_ON(idr_alloc(&find_idr, xa_mk_value(anchor_id), anchor_id, anchor_id + 1, GFP_KERNEL) != anchor_id);
+ pthread_create(&throbber, NULL, idr_throbber, &throbber_id); + rcu_read_lock(); do { int id = 0;
From: Damien Le Moal damien.lemoal@wdc.com
[ Upstream commit de3510e52b0a398261271455562458003b8eea62 ]
Memory backed or zoned null block devices may generate actual request timeout errors due to the submission path being blocked on memory allocation or zone locking. Unlike fake timeouts or injected timeouts, the request submission path will call blk_mq_complete_request() or blk_mq_end_request() for these real timeout errors, causing a double completion and use after free situation as the block layer timeout handler executes blk_mq_rq_timed_out() and __blk_mq_free_request() in blk_mq_check_expired(). This problem often triggers a NULL pointer dereference such as:
BUG: kernel NULL pointer dereference, address: 0000000000000050 RIP: 0010:blk_mq_sched_mark_restart_hctx+0x5/0x20 ... Call Trace: dd_finish_request+0x56/0x80 blk_mq_free_request+0x37/0x130 null_handle_cmd+0xbf/0x250 [null_blk] ? null_queue_rq+0x67/0xd0 [null_blk] blk_mq_dispatch_rq_list+0x122/0x850 __blk_mq_do_dispatch_sched+0xbb/0x2c0 __blk_mq_sched_dispatch_requests+0x13d/0x190 blk_mq_sched_dispatch_requests+0x30/0x60 __blk_mq_run_hw_queue+0x49/0x90 process_one_work+0x26c/0x580 worker_thread+0x55/0x3c0 ? process_one_work+0x580/0x580 kthread+0x134/0x150 ? kthread_create_worker_on_cpu+0x70/0x70 ret_from_fork+0x1f/0x30
This problem very often triggers when running the full btrfs xfstests on a memory-backed zoned null block device in a VM with limited amount of memory.
Avoid this by executing blk_mq_complete_request() in null_timeout_rq() only for commands that are marked for a fake timeout completion using the fake_timeout boolean in struct null_cmd. For timeout errors injected through debugfs, the timeout handler will execute blk_mq_complete_request()i as before. This is safe as the submission path does not execute complete requests in this case.
In null_timeout_rq(), also make sure to set the command error field to BLK_STS_TIMEOUT and to propagate this error through to the request completion.
Reported-by: Johannes Thumshirn Johannes.Thumshirn@wdc.com Signed-off-by: Damien Le Moal damien.lemoal@wdc.com Tested-by: Johannes Thumshirn Johannes.Thumshirn@wdc.com Reviewed-by: Johannes Thumshirn Johannes.Thumshirn@wdc.com Link: https://lore.kernel.org/r/20210331225244.126426-1-damien.lemoal@wdc.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/null_blk/main.c | 26 +++++++++++++++++++++----- drivers/block/null_blk/null_blk.h | 1 + 2 files changed, 22 insertions(+), 5 deletions(-)
diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c index 5357c3a4a36f..4f6af7a5921e 100644 --- a/drivers/block/null_blk/main.c +++ b/drivers/block/null_blk/main.c @@ -1369,10 +1369,13 @@ static blk_status_t null_handle_cmd(struct nullb_cmd *cmd, sector_t sector, }
if (dev->zoned) - cmd->error = null_process_zoned_cmd(cmd, op, - sector, nr_sectors); + sts = null_process_zoned_cmd(cmd, op, sector, nr_sectors); else - cmd->error = null_process_cmd(cmd, op, sector, nr_sectors); + sts = null_process_cmd(cmd, op, sector, nr_sectors); + + /* Do not overwrite errors (e.g. timeout errors) */ + if (cmd->error == BLK_STS_OK) + cmd->error = sts;
out: nullb_complete_cmd(cmd); @@ -1451,8 +1454,20 @@ static bool should_requeue_request(struct request *rq)
static enum blk_eh_timer_return null_timeout_rq(struct request *rq, bool res) { + struct nullb_cmd *cmd = blk_mq_rq_to_pdu(rq); + pr_info("rq %p timed out\n", rq); - blk_mq_complete_request(rq); + + /* + * If the device is marked as blocking (i.e. memory backed or zoned + * device), the submission path may be blocked waiting for resources + * and cause real timeouts. For these real timeouts, the submission + * path will complete the request using blk_mq_complete_request(). + * Only fake timeouts need to execute blk_mq_complete_request() here. + */ + cmd->error = BLK_STS_TIMEOUT; + if (cmd->fake_timeout) + blk_mq_complete_request(rq); return BLK_EH_DONE; }
@@ -1473,6 +1488,7 @@ static blk_status_t null_queue_rq(struct blk_mq_hw_ctx *hctx, cmd->rq = bd->rq; cmd->error = BLK_STS_OK; cmd->nq = nq; + cmd->fake_timeout = should_timeout_request(bd->rq);
blk_mq_start_request(bd->rq);
@@ -1489,7 +1505,7 @@ static blk_status_t null_queue_rq(struct blk_mq_hw_ctx *hctx, return BLK_STS_OK; } } - if (should_timeout_request(bd->rq)) + if (cmd->fake_timeout) return BLK_STS_OK;
return null_handle_cmd(cmd, sector, nr_sectors, req_op(bd->rq)); diff --git a/drivers/block/null_blk/null_blk.h b/drivers/block/null_blk/null_blk.h index 83504f3cc9d6..4876d5adb12d 100644 --- a/drivers/block/null_blk/null_blk.h +++ b/drivers/block/null_blk/null_blk.h @@ -22,6 +22,7 @@ struct nullb_cmd { blk_status_t error; struct nullb_queue *nq; struct hrtimer timer; + bool fake_timeout; };
struct nullb_queue {
From: Jens Axboe axboe@kernel.dk
[ Upstream commit 4b982bd0f383db9132e892c0c5144117359a6289 ]
S_ISBLK is marked as unbounded work for async preparation, because it doesn't match S_ISREG. That is incorrect, as any read/write to a block device is also a bounded operation. Fix it up and ensure that S_ISBLK isn't marked unbounded.
Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/io_uring.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c index 5c4378694d54..d9a673976f2d 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -1546,7 +1546,7 @@ static void io_prep_async_work(struct io_kiocb *req) if (req->flags & REQ_F_ISREG) { if (def->hash_reg_file || (ctx->flags & IORING_SETUP_IOPOLL)) io_wq_hash_work(&req->work, file_inode(req->file)); - } else { + } else if (!req->file || !S_ISBLK(file_inode(req->file)->i_mode)) { if (def->unbound_nonreg_file) req->work.flags |= IO_WQ_WORK_UNBOUND; }
From: Ben Dooks ben.dooks@codethink.co.uk
[ Upstream commit 285a76bb2cf51b0c74c634f2aaccdb93e1f2a359 ]
The <asm/uaccess.h> header has a problem with put_user(a, ptr) if the 'a' is not a simple variable, such as a function. This can lead to the compiler producing code as so:
1: enable_user_access() 2: evaluate 'a' into register 'r' 3: put 'r' to 'ptr' 4: disable_user_acess()
The issue is that 'a' is now being evaluated with the user memory protections disabled. So we try and force the evaulation by assigning 'x' to __val at the start, and hoping the compiler barriers in enable_user_access() do the job of ordering step 2 before step 1.
This has shown up in a bug where 'a' sleeps and thus schedules out and loses the SR_SUM flag. This isn't sufficient to fully fix, but should reduce the window of opportunity. The first instance of this we found is in scheudle_tail() where the code does:
$ less -N kernel/sched/core.c
4263 if (current->set_child_tid) 4264 put_user(task_pid_vnr(current), current->set_child_tid);
Here, the task_pid_vnr(current) is called within the block that has enabled the user memory access. This can be made worse with KASAN which makes task_pid_vnr() a rather large call with plenty of opportunity to sleep.
Signed-off-by: Ben Dooks ben.dooks@codethink.co.uk Reported-by: syzbot+e74b94fe601ab9552d69@syzkaller.appspotmail.com Suggested-by: Arnd Bergman arnd@arndb.de
-- Changes since v1: - fixed formatting and updated the patch description with more info
Changes since v2: - fixed commenting on __put_user() (schwab@linux-m68k.org)
Change since v3: - fixed RFC in patch title. Should be ready to merge.
Signed-off-by: Palmer Dabbelt palmerdabbelt@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/include/asm/uaccess.h | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h index 824b2c9da75b..f944062c9d99 100644 --- a/arch/riscv/include/asm/uaccess.h +++ b/arch/riscv/include/asm/uaccess.h @@ -306,7 +306,9 @@ do { \ * data types like structures or arrays. * * @ptr must have pointer-to-simple-variable type, and @x must be assignable - * to the result of dereferencing @ptr. + * to the result of dereferencing @ptr. The value of @x is copied to avoid + * re-ordering where @x is evaluated inside the block that enables user-space + * access (thus bypassing user space protection if @x is a function). * * Caller must check the pointer with access_ok() before calling this * function. @@ -316,12 +318,13 @@ do { \ #define __put_user(x, ptr) \ ({ \ __typeof__(*(ptr)) __user *__gu_ptr = (ptr); \ + __typeof__(*__gu_ptr) __val = (x); \ long __pu_err = 0; \ \ __chk_user_ptr(__gu_ptr); \ \ __enable_user_access(); \ - __put_user_nocheck(x, __gu_ptr, __pu_err); \ + __put_user_nocheck(__val, __gu_ptr, __pu_err); \ __disable_user_access(); \ \ __pu_err; \
From: Zihao Yu yuzihao@ict.ac.cn
[ Upstream commit ac8d0b901f0033b783156ab2dc1a0e73ec42409b ]
In RV64, the size of each entry in excp_vect_table is 8 bytes. If the base of the table is not 8-byte aligned, loading an entry in the table will raise a misaligned exception. Although such exception will be handled by opensbi/bbl, this still causes performance degradation.
Signed-off-by: Zihao Yu yuzihao@ict.ac.cn Reviewed-by: Anup Patel anup@brainfault.org Signed-off-by: Palmer Dabbelt palmerdabbelt@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kernel/entry.S | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S index 744f3209c48d..76274a4a1d8e 100644 --- a/arch/riscv/kernel/entry.S +++ b/arch/riscv/kernel/entry.S @@ -447,6 +447,7 @@ ENDPROC(__switch_to) #endif
.section ".rodata" + .align LGREG /* Exception vector table */ ENTRY(excp_vect_table) RISCV_PTR do_trap_insn_misaligned
From: Pavel Begunkov asml.silence@gmail.com
[ Upstream commit f8b78caf21d5bc3fcfc40c18898f9d52ed1451a5 ]
If IOCB_NOWAIT is set on submission, then that needs to get propagated to REQ_NOWAIT on the block side. Otherwise we completely lose this information, and any issuer of IOCB_NOWAIT IO will potentially end up blocking on eg request allocation on the storage side.
Signed-off-by: Pavel Begunkov asml.silence@gmail.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/block_dev.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/fs/block_dev.c b/fs/block_dev.c index c33151020bcd..06b422fa67a4 100644 --- a/fs/block_dev.c +++ b/fs/block_dev.c @@ -276,6 +276,8 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter, bio.bi_opf = dio_bio_write_op(iocb); task_io_account_write(ret); } + if (iocb->ki_flags & IOCB_NOWAIT) + bio.bi_opf |= REQ_NOWAIT; if (iocb->ki_flags & IOCB_HIPRI) bio_set_polled(&bio, iocb);
@@ -429,6 +431,8 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages) bio->bi_opf = dio_bio_write_op(iocb); task_io_account_write(bio->bi_iter.bi_size); } + if (iocb->ki_flags & IOCB_NOWAIT) + bio->bi_opf |= REQ_NOWAIT;
dio->size += bio->bi_iter.bi_size; pos += bio->bi_iter.bi_size;
linux-stable-mirror@lists.linaro.org