'ac->ac_g_ex.fe_len' is a user-controlled value used in the derivation
of 'ac->ac_2order'. 'ac->ac_2order', in turn, is used to index arrays,
which makes it a potential Spectre v1 gadget. Fix this by sanitizing
the value assigned to 'ac->ac_2order'. This covers the following
accesses, found with the help of smatch:
* fs/ext4/mballoc.c:1896 ext4_mb_simple_scan_group() warn: potential
  spectre issue 'grp->bb_counters' [w] (local cap)
* fs/ext4/mballoc.c:445 mb_find_buddy() warn: potential spectre issue
  'EXT4_SB(e4b->bd_sb)->s_mb_offsets' [r] (local cap)
* fs/ext4/mballoc.c:446 mb_find_buddy() warn: potential spectre issue
  'EXT4_SB(e4b->bd_sb)->s_mb_maxs' [r] (local cap)
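For illustration only, a minimal sketch of the pattern (the table and
bound here are hypothetical stand-ins, not the ext4 code itself):
array_index_nospec(index, size) clamps 'index' to 0 when it is out of
range, without a conditional branch the CPU could speculate past.

    #include <linux/nospec.h>

    /* 'idx' is assumed to come from an untrusted (user) source. */
    static int read_entry(const int *table, size_t nr_entries, size_t idx)
    {
            if (idx >= nr_entries)
                    return -EINVAL;
            /*
             * Even if the bounds check above is speculated past, the
             * clamped index cannot load beyond table[nr_entries - 1].
             */
            idx = array_index_nospec(idx, nr_entries);
            return table[idx];
    }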
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: stable(a)vger.kernel.org
Suggested-by: Josh Poimboeuf <jpoimboe(a)redhat.com>
Signed-off-by: Jeremy Cline <jcline(a)redhat.com>
---
I broke this out of the "ext4: fix spectre v1 gadgets" patch set since
the other patches in that series could, as Josh noted, be replaced with
one fix in do_quotactl. I'll send that fix to the disk quota folks
separately.
Changes from v1:
- Sanitize ac_2order on assignment, rather than down the call chain in
ext4_mb_simple_scan_group.
fs/ext4/mballoc.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index f7ab34088162..8b24d3d42cb3 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -14,6 +14,7 @@
#include <linux/log2.h>
#include <linux/module.h>
#include <linux/slab.h>
+#include <linux/nospec.h>
#include <linux/backing-dev.h>
#include <trace/events/ext4.h>
@@ -2140,7 +2141,8 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac)
* This should tell if fe_len is exactly power of 2
*/
if ((ac->ac_g_ex.fe_len & (~(1 << (i - 1)))) == 0)
- ac->ac_2order = i - 1;
+ ac->ac_2order = array_index_nospec(i - 1,
+ sb->s_blocksize_bits + 2);
}
/* if stream allocation is enabled, use global goal */
--
2.17.1
From: Dexuan Cui <decui(a)microsoft.com>
Before setting channel->rescind in vmbus_rescind_cleanup(), we should
make sure the channel callback won't run any more; otherwise a
high-level driver like pci_hyperv, which may be waiting indefinitely
for the host VSP's response when it notices the channel has been
rescinded, can't safely give up: e.g., in
hv_pci_protocol_negotiation() -> wait_for_response(), it's unsafe to
exit wait_for_response() and proceed while the channel callback can
still write through a saved pointer to the on-stack variable
"comp_pkt" after the frame has been popped. The issue was originally
spotted by Michael Kelley <mikelley(a)microsoft.com>.
In vmbus_close_internal(), the patch also shrinks the region protected
by disabling/enabling channel->callback_event: the whole function
doesn't really need that protection.
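To illustrate why the ordering matters, here is a hedged sketch
modeled on pci_hyperv's wait_for_response() (simplified; the field
names are stand-ins rather than the exact driver code):

    /*
     * The waiter gives up once it sees channel->rescind.  If rescind
     * could be set while the channel callback was still runnable, the
     * callback might later write through a saved pointer into the
     * waiter's on-stack completion packet after this function returns.
     * Quiescing the callback first (vmbus_reset_channel_cb()) makes
     * the early exit safe.
     */
    while (!wait_for_completion_timeout(&comp_pkt.host_event,
                                        msecs_to_jiffies(1000))) {
            if (channel->rescind)
                    return -ENODEV;
    }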
Signed-off-by: Dexuan Cui <decui(a)microsoft.com>
Reviewed-by: Michael Kelley <mikelley(a)microsoft.com>
Cc: stable(a)vger.kernel.org
Cc: K. Y. Srinivasan <kys(a)microsoft.com>
Cc: Stephen Hemminger <sthemmin(a)microsoft.com>
Cc: Michael Kelley <mikelley(a)microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys(a)microsoft.com>
---
drivers/hv/channel.c | 40 +++++++++++++++++++++++----------------
drivers/hv/channel_mgmt.c | 6 ++++++
include/linux/hyperv.h | 2 ++
3 files changed, 32 insertions(+), 16 deletions(-)
diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index ba0a092ae085..c3949220b770 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -558,11 +558,8 @@ static void reset_channel_cb(void *arg)
channel->onchannel_callback = NULL;
}
-static int vmbus_close_internal(struct vmbus_channel *channel)
+void vmbus_reset_channel_cb(struct vmbus_channel *channel)
{
- struct vmbus_channel_close_channel *msg;
- int ret;
-
/*
* vmbus_on_event(), running in the per-channel tasklet, can race
* with vmbus_close_internal() in the case of SMP guest, e.g., when
@@ -572,6 +569,29 @@ static int vmbus_close_internal(struct vmbus_channel *channel)
*/
tasklet_disable(&channel->callback_event);
+ channel->sc_creation_callback = NULL;
+
+ /* Stop the callback asap */
+ if (channel->target_cpu != get_cpu()) {
+ put_cpu();
+ smp_call_function_single(channel->target_cpu, reset_channel_cb,
+ channel, true);
+ } else {
+ reset_channel_cb(channel);
+ put_cpu();
+ }
+
+ /* Re-enable tasklet for use on re-open */
+ tasklet_enable(&channel->callback_event);
+}
+
+static int vmbus_close_internal(struct vmbus_channel *channel)
+{
+ struct vmbus_channel_close_channel *msg;
+ int ret;
+
+ vmbus_reset_channel_cb(channel);
+
/*
* In case a device driver's probe() fails (e.g.,
* util_probe() -> vmbus_open() returns -ENOMEM) and the device is
@@ -585,16 +605,6 @@ static int vmbus_close_internal(struct vmbus_channel *channel)
}
channel->state = CHANNEL_OPEN_STATE;
- channel->sc_creation_callback = NULL;
- /* Stop callback and cancel the timer asap */
- if (channel->target_cpu != get_cpu()) {
- put_cpu();
- smp_call_function_single(channel->target_cpu, reset_channel_cb,
- channel, true);
- } else {
- reset_channel_cb(channel);
- put_cpu();
- }
/* Send a closing message */
@@ -639,8 +649,6 @@ static int vmbus_close_internal(struct vmbus_channel *channel)
get_order(channel->ringbuffer_pagecount * PAGE_SIZE));
out:
- /* re-enable tasklet for use on re-open */
- tasklet_enable(&channel->callback_event);
return ret;
}
diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index f3b551a50653..0f0e091c117c 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -892,6 +892,12 @@ static void vmbus_onoffer_rescind(struct vmbus_channel_message_header *hdr)
return;
}
+ /*
+ * Before setting channel->rescind in vmbus_rescind_cleanup(), we
+ * should make sure the channel callback is not running any more.
+ */
+ vmbus_reset_channel_cb(channel);
+
/*
* Now wait for offer handling to complete.
*/
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 2330f08062c7..efda23cf32c7 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1061,6 +1061,8 @@ extern int vmbus_establish_gpadl(struct vmbus_channel *channel,
extern int vmbus_teardown_gpadl(struct vmbus_channel *channel,
u32 gpadl_handle);
+void vmbus_reset_channel_cb(struct vmbus_channel *channel);
+
extern int vmbus_recvpacket(struct vmbus_channel *channel,
void *buffer,
u32 bufferlen,
--
2.17.1
On 7/30/2018 12:56 AM, gregkh(a)linuxfoundation.org wrote:
>
> The patch below does not apply to the 4.14-stable tree.
> If someone wants it applied there, or to any other stable or longterm
> tree, then please email the backport, including the original git commit
> id to <stable(a)vger.kernel.org>.
>
> thanks,
>
> greg k-h
>
> (snip)
Here's the patch revised for stable 4.14.y
-- james
------------------
From: James Smart <jsmart2021(a)gmail.com>
commit d082dc1562a2ff0947b214796f12faaa87e816a9 upstream.
The existing code to carve up the sg list expected an sg element per
page, which can be very incorrect with IOMMUs remapping multiple
memory pages to fewer bus addresses. To hit this error required a
large io payload (greater than 256k) and a system that maps on a
per-page basis. It's possible that large ios could get by fine if the
system condensed the sgl list into the first 64 elements.
This patch corrects the sg list handling by walking the sg list
element by element and dividing the transfer on sg element
boundaries. While doing so, it still tries to keep each sequence under
256k, but will exceed that cap if a single sg element is larger than
256k.
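Schematically, the new carving rule looks like this (a simplified
restatement of the loop in the patch below, with shortened names):

    u32 tlen = 0, cnt = 0;
    struct scatterlist *sg = fod->next_sg;

    /* Stop on an sg element boundary, keeping the sequence under the
     * 256k cap whenever the element sizes allow it. */
    while (tlen < remaining && cnt < max_sg_cnt &&
           tlen + sg_dma_len(sg) < NVMET_FC_MAX_SEQ_LENGTH) {
            tlen += sg_dma_len(sg);
            cnt++;
            sg = sg_next(sg);
    }
    /* A single element above the cap still goes out as one sequence. */
    if (tlen < remaining && cnt == 0) {
            tlen = min_t(u32, sg_dma_len(sg), remaining);
            cnt = 1;
            sg = sg_next(sg);
    }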
Fixes: 48fa362b6c3f ("nvmet-fc: simplify sg list handling")
Cc: <stable(a)vger.kernel.org> # 4.14
Signed-off-by: James Smart <james.smart(a)broadcom.com>
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
---
stable 4.14.y patch adjusted for deltas made by upstream commit
5e62d5c993e6889cd314d5b5de6b670152109a0e that are not in the stable tree.
---
drivers/nvme/target/fc.c | 44 +++++++++++++++++++++++++++++++++++---------
1 file changed, 35 insertions(+), 9 deletions(-)
diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
index 8e21211b904b..b7a5d1065378 100644
--- a/drivers/nvme/target/fc.c
+++ b/drivers/nvme/target/fc.c
@@ -58,8 +58,8 @@ struct nvmet_fc_ls_iod {
struct work_struct work;
} __aligned(sizeof(unsigned long long));
+/* desired maximum for a single sequence - if sg list allows it */
#define NVMET_FC_MAX_SEQ_LENGTH (256 * 1024)
-#define NVMET_FC_MAX_XFR_SGENTS (NVMET_FC_MAX_SEQ_LENGTH / PAGE_SIZE)
enum nvmet_fcp_datadir {
NVMET_FCP_NODATA,
@@ -74,6 +74,7 @@ struct nvmet_fc_fcp_iod {
struct nvme_fc_cmd_iu cmdiubuf;
struct nvme_fc_ersp_iu rspiubuf;
dma_addr_t rspdma;
+ struct scatterlist *next_sg;
struct scatterlist *data_sg;
int data_sg_cnt;
u32 total_length;
@@ -1000,8 +1001,7 @@ nvmet_fc_register_targetport(struct nvmet_fc_port_info *pinfo,
INIT_LIST_HEAD(&newrec->assoc_list);
kref_init(&newrec->ref);
ida_init(&newrec->assoc_cnt);
- newrec->max_sg_cnt = min_t(u32, NVMET_FC_MAX_XFR_SGENTS,
- template->max_sgl_segments);
+ newrec->max_sg_cnt = template->max_sgl_segments;
ret = nvmet_fc_alloc_ls_iodlist(newrec);
if (ret) {
@@ -1717,6 +1717,7 @@ nvmet_fc_alloc_tgt_pgs(struct nvmet_fc_fcp_iod *fod)
((fod->io_dir == NVMET_FCP_WRITE) ?
DMA_FROM_DEVICE : DMA_TO_DEVICE));
/* note: write from initiator perspective */
+ fod->next_sg = fod->data_sg;
return 0;
@@ -1874,24 +1875,49 @@ nvmet_fc_transfer_fcp_data(struct nvmet_fc_tgtport *tgtport,
struct nvmet_fc_fcp_iod *fod, u8 op)
{
struct nvmefc_tgt_fcp_req *fcpreq = fod->fcpreq;
+ struct scatterlist *sg = fod->next_sg;
unsigned long flags;
- u32 tlen;
+ u32 remaininglen = fod->total_length - fod->offset;
+ u32 tlen = 0;
int ret;
fcpreq->op = op;
fcpreq->offset = fod->offset;
fcpreq->timeout = NVME_FC_TGTOP_TIMEOUT_SEC;
- tlen = min_t(u32, tgtport->max_sg_cnt * PAGE_SIZE,
- (fod->total_length - fod->offset));
+ /*
+ * for next sequence:
+ * break at a sg element boundary
+ * attempt to keep sequence length capped at
+ * NVMET_FC_MAX_SEQ_LENGTH but allow sequence to
+ * be longer if a single sg element is larger
+ * than that amount. This is done to avoid creating
+ * a new sg list to use for the tgtport api.
+ */
+ fcpreq->sg = sg;
+ fcpreq->sg_cnt = 0;
+ while (tlen < remaininglen &&
+ fcpreq->sg_cnt < tgtport->max_sg_cnt &&
+ tlen + sg_dma_len(sg) < NVMET_FC_MAX_SEQ_LENGTH) {
+ fcpreq->sg_cnt++;
+ tlen += sg_dma_len(sg);
+ sg = sg_next(sg);
+ }
+ if (tlen < remaininglen && fcpreq->sg_cnt == 0) {
+ fcpreq->sg_cnt++;
+ tlen += min_t(u32, sg_dma_len(sg), remaininglen);
+ sg = sg_next(sg);
+ }
+ if (tlen < remaininglen)
+ fod->next_sg = sg;
+ else
+ fod->next_sg = NULL;
+
fcpreq->transfer_length = tlen;
fcpreq->transferred_length = 0;
fcpreq->fcp_error = 0;
fcpreq->rsplen = 0;
- fcpreq->sg = &fod->data_sg[fod->offset / PAGE_SIZE];
- fcpreq->sg_cnt = DIV_ROUND_UP(tlen, PAGE_SIZE);
-
/*
* If the last READDATA request: check if LLDD supports
* combined xfr with response.
--
2.13.1
We can't and don't need to try resuming the device from our hotplug
handlers, but hotplug events are generally something we'd like to keep
the device awake for whenever possible. So, grab a PM ref safely in our
hotplug handlers using pm_runtime_get_noresume() and mark the device as
busy once we're finished.
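The resulting pattern, schematically (a restatement of the hunks
below):

    /* Take a reference without trying to resume: safe in contexts that
     * must not block on runtime resume.  Runtime suspend waits for the
     * handler to finish, so an already-awake device stays awake. */
    pm_runtime_get_noresume(drm->dev->dev);

    /* ... handle the hotplug event ... */

    /* Restart the autosuspend timer from "now", then drop the ref. */
    pm_runtime_mark_last_busy(drm->dev->dev);
    pm_runtime_put_autosuspend(drm->dev->dev);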
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Cc: stable(a)vger.kernel.org
Cc: Lukas Wunner <lukas(a)wunner.de>
Cc: Karol Herbst <karolherbst(a)gmail.com>
---
drivers/gpu/drm/nouveau/nouveau_connector.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c b/drivers/gpu/drm/nouveau/nouveau_connector.c
index 8409c3f2c3a1..5a8e8c1ad647 100644
--- a/drivers/gpu/drm/nouveau/nouveau_connector.c
+++ b/drivers/gpu/drm/nouveau/nouveau_connector.c
@@ -1152,6 +1152,11 @@ nouveau_connector_hotplug(struct nvif_notify *notify)
const char *name = connector->name;
struct nouveau_encoder *nv_encoder;
+ /* Resuming the device here isn't possible; but the suspend PM ops
+ * will wait for us to finish our work before disabling us so this
+ * should be enough
+ */
+ pm_runtime_get_noresume(drm->dev->dev);
nv_connector->hpd_task = current;
if (rep->mask & NVIF_NOTIFY_CONN_V0_IRQ) {
@@ -1171,6 +1176,9 @@ nouveau_connector_hotplug(struct nvif_notify *notify)
}
nv_connector->hpd_task = NULL;
+
+ pm_runtime_mark_last_busy(drm->dev->dev);
+ pm_runtime_put_autosuspend(drm->dev->dev);
return NVIF_NOTIFY_KEEP;
}
--
2.17.1
It's true we can't resume the device from poll workers in
nouveau_connector_detect(). We can, however, prevent the autosuspend
timer from elapsing immediately (if it hasn't already) without risking
any sort of deadlock with the runtime suspend/resume operations. So do
that instead of entirely avoiding grabbing a power reference.
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Cc: stable(a)vger.kernel.org
Cc: Lukas Wunner <lukas(a)wunner.de>
Cc: Karol Herbst <karolherbst(a)gmail.com>
---
drivers/gpu/drm/nouveau/nouveau_connector.c | 20 +++++++++++---------
1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c b/drivers/gpu/drm/nouveau/nouveau_connector.c
index 2a45b4c2ceb0..010d6db14cba 100644
--- a/drivers/gpu/drm/nouveau/nouveau_connector.c
+++ b/drivers/gpu/drm/nouveau/nouveau_connector.c
@@ -572,12 +572,16 @@ nouveau_connector_detect(struct drm_connector *connector, bool force)
nv_connector->edid = NULL;
}
- /* Outputs are only polled while runtime active, so acquiring a
- * runtime PM ref here is unnecessary (and would deadlock upon
- * runtime suspend because it waits for polling to finish).
+ /* Outputs are only polled while runtime active, so resuming the
+ * device here is unnecessary (and would deadlock upon runtime suspend
+ * because it waits for polling to finish). We do however, want to
+ * prevent the autosuspend timer from elapsing during this operation
+ * if possible.
*/
- if (!drm_kms_helper_is_poll_worker()) {
- ret = pm_runtime_get_sync(connector->dev->dev);
+ if (drm_kms_helper_is_poll_worker()) {
+ pm_runtime_get_noresume(dev->dev);
+ } else {
+ ret = pm_runtime_get_sync(dev->dev);
if (ret < 0 && ret != -EACCES)
return conn_status;
}
@@ -655,10 +659,8 @@ nouveau_connector_detect(struct drm_connector *connector, bool force)
out:
- if (!drm_kms_helper_is_poll_worker()) {
- pm_runtime_mark_last_busy(connector->dev->dev);
- pm_runtime_put_autosuspend(connector->dev->dev);
- }
+ pm_runtime_mark_last_busy(dev->dev);
+ pm_runtime_put_autosuspend(dev->dev);
return conn_status;
}
--
2.17.1
This entirely removes the potential for deadlocking with fb_helper by
preventing it from handling hotplugs as early as possible in the
runtime suspend process. If that turns out not to be possible, because
some fb_helper action was queued up before we got a chance to disable
hotplugging, we simply return -EBUSY so that the runtime PM core
attempts autosuspending the device again once fb_helper isn't doing
anything.

This fixes one of the issues causing deadlocks on runtime
suspend/resume with nouveau on my P50.
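Schematically, the early bail-out (note that
drm_fb_helper_suspend_hotplug() and drm_fb_helper_resume_hotplug() are
helpers this series adds elsewhere; their definitions are not shown in
this excerpt):

    /* In nouveau_pmops_runtime_suspend(): returning -EBUSY from the
     * runtime-suspend callback leaves the device active, and the PM
     * core can retry autosuspend once the device goes idle again. */
    if (!drm_fb_helper_suspend_hotplug(drm_dev->fb_helper))
            return -EBUSY;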
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Cc: stable(a)vger.kernel.org
Cc: Lukas Wunner <lukas(a)wunner.de>
Cc: Karol Herbst <karolherbst(a)gmail.com>
---
drivers/gpu/drm/nouveau/nouveau_drm.c | 8 ++++++++
drivers/gpu/drm/nouveau/nouveau_fbcon.c | 1 +
2 files changed, 9 insertions(+)
diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index ee2546db09c9..d47cb5b2af98 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -836,6 +836,14 @@ nouveau_pmops_runtime_suspend(struct device *dev)
return -EBUSY;
}
+ /* There's no way for us to stop fb_helper work in reaction to
+ * hotplugs later in the RPM process. First off: we don't want to,
+ * fb_helper should be able to keep the GPU awake. Second off: it is
+ * capable of grabbing basically any lock in existence.
+ */
+ if (!drm_fb_helper_suspend_hotplug(drm_dev->fb_helper))
+ return -EBUSY;
+
nouveau_switcheroo_optimus_dsm();
ret = nouveau_do_suspend(drm_dev, true);
pci_save_state(pdev);
diff --git a/drivers/gpu/drm/nouveau/nouveau_fbcon.c b/drivers/gpu/drm/nouveau/nouveau_fbcon.c
index 85c1f10bc2b6..963ba630fd04 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fbcon.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fbcon.c
@@ -466,6 +466,7 @@ nouveau_fbcon_set_suspend_work(struct work_struct *work)
console_unlock();
if (state == FBINFO_STATE_RUNNING) {
+ drm_fb_helper_resume_hotplug(drm->dev->fb_helper);
pm_runtime_mark_last_busy(drm->dev->dev);
pm_runtime_put_sync(drm->dev->dev);
}
--
2.17.1