The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 06781a5026350cde699d2d10c9914a25c1524f45 Mon Sep 17 00:00:00 2001
From: Sascha Hauer <s.hauer(a)pengutronix.de>
Date: Tue, 14 Jun 2022 10:31:38 +0200
Subject: [PATCH] mtd: rawnand: gpmi: Fix setting busy timeout setting
The DEVICE_BUSY_TIMEOUT value is described in the Reference Manual as:
| Timeout waiting for NAND Ready/Busy or ATA IRQ. Used in WAIT_FOR_READY
| mode. This value is the number of GPMI_CLK cycles multiplied by 4096.
So instead of multiplying the value in cycles with 4096, we have to
divide it by that value. Use DIV_ROUND_UP to make sure we are on the
safe side, especially when the calculated value in cycles is smaller
than 4096 as typically the case.
This bug likely never triggered because any timeout != 0 usually will
do. In my case the busy timeout in cycles was originally calculated as
2408, which multiplied with 4096 is 0x968000. The lower 16 bits were
taken for the 16 bit wide register field, so the register value was
0x8000. With 2970bf5a32f0 ("mtd: rawnand: gpmi: fix controller timings
setting") however the value in cycles became 2384, which multiplied
with 4096 is 0x950000. The lower 16 bit are 0x0 now resulting in an
intermediate timeout when reading from NAND.
Fixes: b1206122069aa ("mtd: rawnand: gpmi: use core timings instead of an empirical derivation")
Cc: stable(a)vger.kernel.org
Signed-off-by: Sascha Hauer <s.hauer(a)pengutronix.de>
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20220614083138.3455683-1-s.hauer@pengutro…
diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
index 0b68d05846e1..889e40329956 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
@@ -890,7 +890,7 @@ static int gpmi_nfc_compute_timings(struct gpmi_nand_data *this,
hw->timing0 = BF_GPMI_TIMING0_ADDRESS_SETUP(addr_setup_cycles) |
BF_GPMI_TIMING0_DATA_HOLD(data_hold_cycles) |
BF_GPMI_TIMING0_DATA_SETUP(data_setup_cycles);
- hw->timing1 = BF_GPMI_TIMING1_BUSY_TIMEOUT(busy_timeout_cycles * 4096);
+ hw->timing1 = BF_GPMI_TIMING1_BUSY_TIMEOUT(DIV_ROUND_UP(busy_timeout_cycles, 4096));
/*
* Derive NFC ideal delay from {3}:
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From dbe97cff7dd9f0f75c524afdd55ad46be3d15295 Mon Sep 17 00:00:00 2001
From: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Date: Tue, 21 Jun 2022 22:27:26 -0400
Subject: [PATCH] xen/gntdev: Avoid blocking in unmap_grant_pages()
unmap_grant_pages() currently waits for the pages to no longer be used.
In https://github.com/QubesOS/qubes-issues/issues/7481, this lead to a
deadlock against i915: i915 was waiting for gntdev's MMU notifier to
finish, while gntdev was waiting for i915 to free its pages. I also
believe this is responsible for various deadlocks I have experienced in
the past.
Avoid these problems by making unmap_grant_pages async. This requires
making it return void, as any errors will not be available when the
function returns. Fortunately, the only use of the return value is a
WARN_ON(), which can be replaced by a WARN_ON when the error is
detected. Additionally, a failed call will not prevent further calls
from being made, but this is harmless.
Because unmap_grant_pages is now async, the grant handle will be sent to
INVALID_GRANT_HANDLE too late to prevent multiple unmaps of the same
handle. Instead, a separate bool array is allocated for this purpose.
This wastes memory, but stuffing this information in padding bytes is
too fragile. Furthermore, it is necessary to grab a reference to the
map before making the asynchronous call, and release the reference when
the call returns.
It is also necessary to guard against reentrancy in gntdev_map_put(),
and to handle the case where userspace tries to map a mapping whose
contents have not all been freed yet.
Fixes: 745282256c75 ("xen/gntdev: safely unmap grants in case they are still in use")
Cc: stable(a)vger.kernel.org
Signed-off-by: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Reviewed-by: Juergen Gross <jgross(a)suse.com>
Link: https://lore.kernel.org/r/20220622022726.2538-1-demi@invisiblethingslab.com
Signed-off-by: Juergen Gross <jgross(a)suse.com>
diff --git a/drivers/xen/gntdev-common.h b/drivers/xen/gntdev-common.h
index 20d7d059dadb..40ef379c28ab 100644
--- a/drivers/xen/gntdev-common.h
+++ b/drivers/xen/gntdev-common.h
@@ -16,6 +16,7 @@
#include <linux/mmu_notifier.h>
#include <linux/types.h>
#include <xen/interface/event_channel.h>
+#include <xen/grant_table.h>
struct gntdev_dmabuf_priv;
@@ -56,6 +57,7 @@ struct gntdev_grant_map {
struct gnttab_unmap_grant_ref *unmap_ops;
struct gnttab_map_grant_ref *kmap_ops;
struct gnttab_unmap_grant_ref *kunmap_ops;
+ bool *being_removed;
struct page **pages;
unsigned long pages_vm_start;
@@ -73,6 +75,11 @@ struct gntdev_grant_map {
/* Needed to avoid allocation in gnttab_dma_free_pages(). */
xen_pfn_t *frames;
#endif
+
+ /* Number of live grants */
+ atomic_t live_grants;
+ /* Needed to avoid allocation in __unmap_grant_pages */
+ struct gntab_unmap_queue_data unmap_data;
};
struct gntdev_grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count,
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 59ffea800079..4b56c39f766d 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -35,6 +35,7 @@
#include <linux/slab.h>
#include <linux/highmem.h>
#include <linux/refcount.h>
+#include <linux/workqueue.h>
#include <xen/xen.h>
#include <xen/grant_table.h>
@@ -60,10 +61,11 @@ module_param(limit, uint, 0644);
MODULE_PARM_DESC(limit,
"Maximum number of grants that may be mapped by one mapping request");
+/* True in PV mode, false otherwise */
static int use_ptemod;
-static int unmap_grant_pages(struct gntdev_grant_map *map,
- int offset, int pages);
+static void unmap_grant_pages(struct gntdev_grant_map *map,
+ int offset, int pages);
static struct miscdevice gntdev_miscdev;
@@ -120,6 +122,7 @@ static void gntdev_free_map(struct gntdev_grant_map *map)
kvfree(map->unmap_ops);
kvfree(map->kmap_ops);
kvfree(map->kunmap_ops);
+ kvfree(map->being_removed);
kfree(map);
}
@@ -140,10 +143,13 @@ struct gntdev_grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count,
add->unmap_ops = kvmalloc_array(count, sizeof(add->unmap_ops[0]),
GFP_KERNEL);
add->pages = kvcalloc(count, sizeof(add->pages[0]), GFP_KERNEL);
+ add->being_removed =
+ kvcalloc(count, sizeof(add->being_removed[0]), GFP_KERNEL);
if (NULL == add->grants ||
NULL == add->map_ops ||
NULL == add->unmap_ops ||
- NULL == add->pages)
+ NULL == add->pages ||
+ NULL == add->being_removed)
goto err;
if (use_ptemod) {
add->kmap_ops = kvmalloc_array(count, sizeof(add->kmap_ops[0]),
@@ -250,9 +256,36 @@ void gntdev_put_map(struct gntdev_priv *priv, struct gntdev_grant_map *map)
if (!refcount_dec_and_test(&map->users))
return;
- if (map->pages && !use_ptemod)
+ if (map->pages && !use_ptemod) {
+ /*
+ * Increment the reference count. This ensures that the
+ * subsequent call to unmap_grant_pages() will not wind up
+ * re-entering itself. It *can* wind up calling
+ * gntdev_put_map() recursively, but such calls will be with a
+ * reference count greater than 1, so they will return before
+ * this code is reached. The recursion depth is thus limited to
+ * 1. Do NOT use refcount_inc() here, as it will detect that
+ * the reference count is zero and WARN().
+ */
+ refcount_set(&map->users, 1);
+
+ /*
+ * Unmap the grants. This may or may not be asynchronous, so it
+ * is possible that the reference count is 1 on return, but it
+ * could also be greater than 1.
+ */
unmap_grant_pages(map, 0, map->count);
+ /* Check if the memory now needs to be freed */
+ if (!refcount_dec_and_test(&map->users))
+ return;
+
+ /*
+ * All pages have been returned to the hypervisor, so free the
+ * map.
+ */
+ }
+
if (map->notify.flags & UNMAP_NOTIFY_SEND_EVENT) {
notify_remote_via_evtchn(map->notify.event);
evtchn_put(map->notify.event);
@@ -283,6 +316,7 @@ static int find_grant_ptes(pte_t *pte, unsigned long addr, void *data)
int gntdev_map_grant_pages(struct gntdev_grant_map *map)
{
+ size_t alloced = 0;
int i, err = 0;
if (!use_ptemod) {
@@ -331,97 +365,116 @@ int gntdev_map_grant_pages(struct gntdev_grant_map *map)
map->count);
for (i = 0; i < map->count; i++) {
- if (map->map_ops[i].status == GNTST_okay)
+ if (map->map_ops[i].status == GNTST_okay) {
map->unmap_ops[i].handle = map->map_ops[i].handle;
- else if (!err)
+ if (!use_ptemod)
+ alloced++;
+ } else if (!err)
err = -EINVAL;
if (map->flags & GNTMAP_device_map)
map->unmap_ops[i].dev_bus_addr = map->map_ops[i].dev_bus_addr;
if (use_ptemod) {
- if (map->kmap_ops[i].status == GNTST_okay)
+ if (map->kmap_ops[i].status == GNTST_okay) {
+ if (map->map_ops[i].status == GNTST_okay)
+ alloced++;
map->kunmap_ops[i].handle = map->kmap_ops[i].handle;
- else if (!err)
+ } else if (!err)
err = -EINVAL;
}
}
+ atomic_add(alloced, &map->live_grants);
return err;
}
-static int __unmap_grant_pages(struct gntdev_grant_map *map, int offset,
- int pages)
+static void __unmap_grant_pages_done(int result,
+ struct gntab_unmap_queue_data *data)
{
- int i, err = 0;
- struct gntab_unmap_queue_data unmap_data;
-
- if (map->notify.flags & UNMAP_NOTIFY_CLEAR_BYTE) {
- int pgno = (map->notify.addr >> PAGE_SHIFT);
- if (pgno >= offset && pgno < offset + pages) {
- /* No need for kmap, pages are in lowmem */
- uint8_t *tmp = pfn_to_kaddr(page_to_pfn(map->pages[pgno]));
- tmp[map->notify.addr & (PAGE_SIZE-1)] = 0;
- map->notify.flags &= ~UNMAP_NOTIFY_CLEAR_BYTE;
- }
- }
-
- unmap_data.unmap_ops = map->unmap_ops + offset;
- unmap_data.kunmap_ops = use_ptemod ? map->kunmap_ops + offset : NULL;
- unmap_data.pages = map->pages + offset;
- unmap_data.count = pages;
-
- err = gnttab_unmap_refs_sync(&unmap_data);
- if (err)
- return err;
+ unsigned int i;
+ struct gntdev_grant_map *map = data->data;
+ unsigned int offset = data->unmap_ops - map->unmap_ops;
- for (i = 0; i < pages; i++) {
- if (map->unmap_ops[offset+i].status)
- err = -EINVAL;
+ for (i = 0; i < data->count; i++) {
+ WARN_ON(map->unmap_ops[offset+i].status);
pr_debug("unmap handle=%d st=%d\n",
map->unmap_ops[offset+i].handle,
map->unmap_ops[offset+i].status);
map->unmap_ops[offset+i].handle = INVALID_GRANT_HANDLE;
if (use_ptemod) {
- if (map->kunmap_ops[offset+i].status)
- err = -EINVAL;
+ WARN_ON(map->kunmap_ops[offset+i].status);
pr_debug("kunmap handle=%u st=%d\n",
map->kunmap_ops[offset+i].handle,
map->kunmap_ops[offset+i].status);
map->kunmap_ops[offset+i].handle = INVALID_GRANT_HANDLE;
}
}
- return err;
+ /*
+ * Decrease the live-grant counter. This must happen after the loop to
+ * prevent premature reuse of the grants by gnttab_mmap().
+ */
+ atomic_sub(data->count, &map->live_grants);
+
+ /* Release reference taken by __unmap_grant_pages */
+ gntdev_put_map(NULL, map);
+}
+
+static void __unmap_grant_pages(struct gntdev_grant_map *map, int offset,
+ int pages)
+{
+ if (map->notify.flags & UNMAP_NOTIFY_CLEAR_BYTE) {
+ int pgno = (map->notify.addr >> PAGE_SHIFT);
+
+ if (pgno >= offset && pgno < offset + pages) {
+ /* No need for kmap, pages are in lowmem */
+ uint8_t *tmp = pfn_to_kaddr(page_to_pfn(map->pages[pgno]));
+
+ tmp[map->notify.addr & (PAGE_SIZE-1)] = 0;
+ map->notify.flags &= ~UNMAP_NOTIFY_CLEAR_BYTE;
+ }
+ }
+
+ map->unmap_data.unmap_ops = map->unmap_ops + offset;
+ map->unmap_data.kunmap_ops = use_ptemod ? map->kunmap_ops + offset : NULL;
+ map->unmap_data.pages = map->pages + offset;
+ map->unmap_data.count = pages;
+ map->unmap_data.done = __unmap_grant_pages_done;
+ map->unmap_data.data = map;
+ refcount_inc(&map->users); /* to keep map alive during async call below */
+
+ gnttab_unmap_refs_async(&map->unmap_data);
}
-static int unmap_grant_pages(struct gntdev_grant_map *map, int offset,
- int pages)
+static void unmap_grant_pages(struct gntdev_grant_map *map, int offset,
+ int pages)
{
- int range, err = 0;
+ int range;
+
+ if (atomic_read(&map->live_grants) == 0)
+ return; /* Nothing to do */
pr_debug("unmap %d+%d [%d+%d]\n", map->index, map->count, offset, pages);
/* It is possible the requested range will have a "hole" where we
* already unmapped some of the grants. Only unmap valid ranges.
*/
- while (pages && !err) {
- while (pages &&
- map->unmap_ops[offset].handle == INVALID_GRANT_HANDLE) {
+ while (pages) {
+ while (pages && map->being_removed[offset]) {
offset++;
pages--;
}
range = 0;
while (range < pages) {
- if (map->unmap_ops[offset + range].handle ==
- INVALID_GRANT_HANDLE)
+ if (map->being_removed[offset + range])
break;
+ map->being_removed[offset + range] = true;
range++;
}
- err = __unmap_grant_pages(map, offset, range);
+ if (range)
+ __unmap_grant_pages(map, offset, range);
offset += range;
pages -= range;
}
-
- return err;
}
/* ------------------------------------------------------------------ */
@@ -473,7 +526,6 @@ static bool gntdev_invalidate(struct mmu_interval_notifier *mn,
struct gntdev_grant_map *map =
container_of(mn, struct gntdev_grant_map, notifier);
unsigned long mstart, mend;
- int err;
if (!mmu_notifier_range_blockable(range))
return false;
@@ -494,10 +546,9 @@ static bool gntdev_invalidate(struct mmu_interval_notifier *mn,
map->index, map->count,
map->vma->vm_start, map->vma->vm_end,
range->start, range->end, mstart, mend);
- err = unmap_grant_pages(map,
+ unmap_grant_pages(map,
(mstart - map->vma->vm_start) >> PAGE_SHIFT,
(mend - mstart) >> PAGE_SHIFT);
- WARN_ON(err);
return true;
}
@@ -985,6 +1036,10 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
goto unlock_out;
if (use_ptemod && map->vma)
goto unlock_out;
+ if (atomic_read(&map->live_grants)) {
+ err = -EAGAIN;
+ goto unlock_out;
+ }
refcount_inc(&map->users);
vma->vm_ops = &gntdev_vmops;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From dbe97cff7dd9f0f75c524afdd55ad46be3d15295 Mon Sep 17 00:00:00 2001
From: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Date: Tue, 21 Jun 2022 22:27:26 -0400
Subject: [PATCH] xen/gntdev: Avoid blocking in unmap_grant_pages()
unmap_grant_pages() currently waits for the pages to no longer be used.
In https://github.com/QubesOS/qubes-issues/issues/7481, this lead to a
deadlock against i915: i915 was waiting for gntdev's MMU notifier to
finish, while gntdev was waiting for i915 to free its pages. I also
believe this is responsible for various deadlocks I have experienced in
the past.
Avoid these problems by making unmap_grant_pages async. This requires
making it return void, as any errors will not be available when the
function returns. Fortunately, the only use of the return value is a
WARN_ON(), which can be replaced by a WARN_ON when the error is
detected. Additionally, a failed call will not prevent further calls
from being made, but this is harmless.
Because unmap_grant_pages is now async, the grant handle will be sent to
INVALID_GRANT_HANDLE too late to prevent multiple unmaps of the same
handle. Instead, a separate bool array is allocated for this purpose.
This wastes memory, but stuffing this information in padding bytes is
too fragile. Furthermore, it is necessary to grab a reference to the
map before making the asynchronous call, and release the reference when
the call returns.
It is also necessary to guard against reentrancy in gntdev_map_put(),
and to handle the case where userspace tries to map a mapping whose
contents have not all been freed yet.
Fixes: 745282256c75 ("xen/gntdev: safely unmap grants in case they are still in use")
Cc: stable(a)vger.kernel.org
Signed-off-by: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Reviewed-by: Juergen Gross <jgross(a)suse.com>
Link: https://lore.kernel.org/r/20220622022726.2538-1-demi@invisiblethingslab.com
Signed-off-by: Juergen Gross <jgross(a)suse.com>
diff --git a/drivers/xen/gntdev-common.h b/drivers/xen/gntdev-common.h
index 20d7d059dadb..40ef379c28ab 100644
--- a/drivers/xen/gntdev-common.h
+++ b/drivers/xen/gntdev-common.h
@@ -16,6 +16,7 @@
#include <linux/mmu_notifier.h>
#include <linux/types.h>
#include <xen/interface/event_channel.h>
+#include <xen/grant_table.h>
struct gntdev_dmabuf_priv;
@@ -56,6 +57,7 @@ struct gntdev_grant_map {
struct gnttab_unmap_grant_ref *unmap_ops;
struct gnttab_map_grant_ref *kmap_ops;
struct gnttab_unmap_grant_ref *kunmap_ops;
+ bool *being_removed;
struct page **pages;
unsigned long pages_vm_start;
@@ -73,6 +75,11 @@ struct gntdev_grant_map {
/* Needed to avoid allocation in gnttab_dma_free_pages(). */
xen_pfn_t *frames;
#endif
+
+ /* Number of live grants */
+ atomic_t live_grants;
+ /* Needed to avoid allocation in __unmap_grant_pages */
+ struct gntab_unmap_queue_data unmap_data;
};
struct gntdev_grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count,
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 59ffea800079..4b56c39f766d 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -35,6 +35,7 @@
#include <linux/slab.h>
#include <linux/highmem.h>
#include <linux/refcount.h>
+#include <linux/workqueue.h>
#include <xen/xen.h>
#include <xen/grant_table.h>
@@ -60,10 +61,11 @@ module_param(limit, uint, 0644);
MODULE_PARM_DESC(limit,
"Maximum number of grants that may be mapped by one mapping request");
+/* True in PV mode, false otherwise */
static int use_ptemod;
-static int unmap_grant_pages(struct gntdev_grant_map *map,
- int offset, int pages);
+static void unmap_grant_pages(struct gntdev_grant_map *map,
+ int offset, int pages);
static struct miscdevice gntdev_miscdev;
@@ -120,6 +122,7 @@ static void gntdev_free_map(struct gntdev_grant_map *map)
kvfree(map->unmap_ops);
kvfree(map->kmap_ops);
kvfree(map->kunmap_ops);
+ kvfree(map->being_removed);
kfree(map);
}
@@ -140,10 +143,13 @@ struct gntdev_grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count,
add->unmap_ops = kvmalloc_array(count, sizeof(add->unmap_ops[0]),
GFP_KERNEL);
add->pages = kvcalloc(count, sizeof(add->pages[0]), GFP_KERNEL);
+ add->being_removed =
+ kvcalloc(count, sizeof(add->being_removed[0]), GFP_KERNEL);
if (NULL == add->grants ||
NULL == add->map_ops ||
NULL == add->unmap_ops ||
- NULL == add->pages)
+ NULL == add->pages ||
+ NULL == add->being_removed)
goto err;
if (use_ptemod) {
add->kmap_ops = kvmalloc_array(count, sizeof(add->kmap_ops[0]),
@@ -250,9 +256,36 @@ void gntdev_put_map(struct gntdev_priv *priv, struct gntdev_grant_map *map)
if (!refcount_dec_and_test(&map->users))
return;
- if (map->pages && !use_ptemod)
+ if (map->pages && !use_ptemod) {
+ /*
+ * Increment the reference count. This ensures that the
+ * subsequent call to unmap_grant_pages() will not wind up
+ * re-entering itself. It *can* wind up calling
+ * gntdev_put_map() recursively, but such calls will be with a
+ * reference count greater than 1, so they will return before
+ * this code is reached. The recursion depth is thus limited to
+ * 1. Do NOT use refcount_inc() here, as it will detect that
+ * the reference count is zero and WARN().
+ */
+ refcount_set(&map->users, 1);
+
+ /*
+ * Unmap the grants. This may or may not be asynchronous, so it
+ * is possible that the reference count is 1 on return, but it
+ * could also be greater than 1.
+ */
unmap_grant_pages(map, 0, map->count);
+ /* Check if the memory now needs to be freed */
+ if (!refcount_dec_and_test(&map->users))
+ return;
+
+ /*
+ * All pages have been returned to the hypervisor, so free the
+ * map.
+ */
+ }
+
if (map->notify.flags & UNMAP_NOTIFY_SEND_EVENT) {
notify_remote_via_evtchn(map->notify.event);
evtchn_put(map->notify.event);
@@ -283,6 +316,7 @@ static int find_grant_ptes(pte_t *pte, unsigned long addr, void *data)
int gntdev_map_grant_pages(struct gntdev_grant_map *map)
{
+ size_t alloced = 0;
int i, err = 0;
if (!use_ptemod) {
@@ -331,97 +365,116 @@ int gntdev_map_grant_pages(struct gntdev_grant_map *map)
map->count);
for (i = 0; i < map->count; i++) {
- if (map->map_ops[i].status == GNTST_okay)
+ if (map->map_ops[i].status == GNTST_okay) {
map->unmap_ops[i].handle = map->map_ops[i].handle;
- else if (!err)
+ if (!use_ptemod)
+ alloced++;
+ } else if (!err)
err = -EINVAL;
if (map->flags & GNTMAP_device_map)
map->unmap_ops[i].dev_bus_addr = map->map_ops[i].dev_bus_addr;
if (use_ptemod) {
- if (map->kmap_ops[i].status == GNTST_okay)
+ if (map->kmap_ops[i].status == GNTST_okay) {
+ if (map->map_ops[i].status == GNTST_okay)
+ alloced++;
map->kunmap_ops[i].handle = map->kmap_ops[i].handle;
- else if (!err)
+ } else if (!err)
err = -EINVAL;
}
}
+ atomic_add(alloced, &map->live_grants);
return err;
}
-static int __unmap_grant_pages(struct gntdev_grant_map *map, int offset,
- int pages)
+static void __unmap_grant_pages_done(int result,
+ struct gntab_unmap_queue_data *data)
{
- int i, err = 0;
- struct gntab_unmap_queue_data unmap_data;
-
- if (map->notify.flags & UNMAP_NOTIFY_CLEAR_BYTE) {
- int pgno = (map->notify.addr >> PAGE_SHIFT);
- if (pgno >= offset && pgno < offset + pages) {
- /* No need for kmap, pages are in lowmem */
- uint8_t *tmp = pfn_to_kaddr(page_to_pfn(map->pages[pgno]));
- tmp[map->notify.addr & (PAGE_SIZE-1)] = 0;
- map->notify.flags &= ~UNMAP_NOTIFY_CLEAR_BYTE;
- }
- }
-
- unmap_data.unmap_ops = map->unmap_ops + offset;
- unmap_data.kunmap_ops = use_ptemod ? map->kunmap_ops + offset : NULL;
- unmap_data.pages = map->pages + offset;
- unmap_data.count = pages;
-
- err = gnttab_unmap_refs_sync(&unmap_data);
- if (err)
- return err;
+ unsigned int i;
+ struct gntdev_grant_map *map = data->data;
+ unsigned int offset = data->unmap_ops - map->unmap_ops;
- for (i = 0; i < pages; i++) {
- if (map->unmap_ops[offset+i].status)
- err = -EINVAL;
+ for (i = 0; i < data->count; i++) {
+ WARN_ON(map->unmap_ops[offset+i].status);
pr_debug("unmap handle=%d st=%d\n",
map->unmap_ops[offset+i].handle,
map->unmap_ops[offset+i].status);
map->unmap_ops[offset+i].handle = INVALID_GRANT_HANDLE;
if (use_ptemod) {
- if (map->kunmap_ops[offset+i].status)
- err = -EINVAL;
+ WARN_ON(map->kunmap_ops[offset+i].status);
pr_debug("kunmap handle=%u st=%d\n",
map->kunmap_ops[offset+i].handle,
map->kunmap_ops[offset+i].status);
map->kunmap_ops[offset+i].handle = INVALID_GRANT_HANDLE;
}
}
- return err;
+ /*
+ * Decrease the live-grant counter. This must happen after the loop to
+ * prevent premature reuse of the grants by gnttab_mmap().
+ */
+ atomic_sub(data->count, &map->live_grants);
+
+ /* Release reference taken by __unmap_grant_pages */
+ gntdev_put_map(NULL, map);
+}
+
+static void __unmap_grant_pages(struct gntdev_grant_map *map, int offset,
+ int pages)
+{
+ if (map->notify.flags & UNMAP_NOTIFY_CLEAR_BYTE) {
+ int pgno = (map->notify.addr >> PAGE_SHIFT);
+
+ if (pgno >= offset && pgno < offset + pages) {
+ /* No need for kmap, pages are in lowmem */
+ uint8_t *tmp = pfn_to_kaddr(page_to_pfn(map->pages[pgno]));
+
+ tmp[map->notify.addr & (PAGE_SIZE-1)] = 0;
+ map->notify.flags &= ~UNMAP_NOTIFY_CLEAR_BYTE;
+ }
+ }
+
+ map->unmap_data.unmap_ops = map->unmap_ops + offset;
+ map->unmap_data.kunmap_ops = use_ptemod ? map->kunmap_ops + offset : NULL;
+ map->unmap_data.pages = map->pages + offset;
+ map->unmap_data.count = pages;
+ map->unmap_data.done = __unmap_grant_pages_done;
+ map->unmap_data.data = map;
+ refcount_inc(&map->users); /* to keep map alive during async call below */
+
+ gnttab_unmap_refs_async(&map->unmap_data);
}
-static int unmap_grant_pages(struct gntdev_grant_map *map, int offset,
- int pages)
+static void unmap_grant_pages(struct gntdev_grant_map *map, int offset,
+ int pages)
{
- int range, err = 0;
+ int range;
+
+ if (atomic_read(&map->live_grants) == 0)
+ return; /* Nothing to do */
pr_debug("unmap %d+%d [%d+%d]\n", map->index, map->count, offset, pages);
/* It is possible the requested range will have a "hole" where we
* already unmapped some of the grants. Only unmap valid ranges.
*/
- while (pages && !err) {
- while (pages &&
- map->unmap_ops[offset].handle == INVALID_GRANT_HANDLE) {
+ while (pages) {
+ while (pages && map->being_removed[offset]) {
offset++;
pages--;
}
range = 0;
while (range < pages) {
- if (map->unmap_ops[offset + range].handle ==
- INVALID_GRANT_HANDLE)
+ if (map->being_removed[offset + range])
break;
+ map->being_removed[offset + range] = true;
range++;
}
- err = __unmap_grant_pages(map, offset, range);
+ if (range)
+ __unmap_grant_pages(map, offset, range);
offset += range;
pages -= range;
}
-
- return err;
}
/* ------------------------------------------------------------------ */
@@ -473,7 +526,6 @@ static bool gntdev_invalidate(struct mmu_interval_notifier *mn,
struct gntdev_grant_map *map =
container_of(mn, struct gntdev_grant_map, notifier);
unsigned long mstart, mend;
- int err;
if (!mmu_notifier_range_blockable(range))
return false;
@@ -494,10 +546,9 @@ static bool gntdev_invalidate(struct mmu_interval_notifier *mn,
map->index, map->count,
map->vma->vm_start, map->vma->vm_end,
range->start, range->end, mstart, mend);
- err = unmap_grant_pages(map,
+ unmap_grant_pages(map,
(mstart - map->vma->vm_start) >> PAGE_SHIFT,
(mend - mstart) >> PAGE_SHIFT);
- WARN_ON(err);
return true;
}
@@ -985,6 +1036,10 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
goto unlock_out;
if (use_ptemod && map->vma)
goto unlock_out;
+ if (atomic_read(&map->live_grants)) {
+ err = -EAGAIN;
+ goto unlock_out;
+ }
refcount_inc(&map->users);
vma->vm_ops = &gntdev_vmops;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From dbe97cff7dd9f0f75c524afdd55ad46be3d15295 Mon Sep 17 00:00:00 2001
From: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Date: Tue, 21 Jun 2022 22:27:26 -0400
Subject: [PATCH] xen/gntdev: Avoid blocking in unmap_grant_pages()
unmap_grant_pages() currently waits for the pages to no longer be used.
In https://github.com/QubesOS/qubes-issues/issues/7481, this lead to a
deadlock against i915: i915 was waiting for gntdev's MMU notifier to
finish, while gntdev was waiting for i915 to free its pages. I also
believe this is responsible for various deadlocks I have experienced in
the past.
Avoid these problems by making unmap_grant_pages async. This requires
making it return void, as any errors will not be available when the
function returns. Fortunately, the only use of the return value is a
WARN_ON(), which can be replaced by a WARN_ON when the error is
detected. Additionally, a failed call will not prevent further calls
from being made, but this is harmless.
Because unmap_grant_pages is now async, the grant handle will be sent to
INVALID_GRANT_HANDLE too late to prevent multiple unmaps of the same
handle. Instead, a separate bool array is allocated for this purpose.
This wastes memory, but stuffing this information in padding bytes is
too fragile. Furthermore, it is necessary to grab a reference to the
map before making the asynchronous call, and release the reference when
the call returns.
It is also necessary to guard against reentrancy in gntdev_map_put(),
and to handle the case where userspace tries to map a mapping whose
contents have not all been freed yet.
Fixes: 745282256c75 ("xen/gntdev: safely unmap grants in case they are still in use")
Cc: stable(a)vger.kernel.org
Signed-off-by: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Reviewed-by: Juergen Gross <jgross(a)suse.com>
Link: https://lore.kernel.org/r/20220622022726.2538-1-demi@invisiblethingslab.com
Signed-off-by: Juergen Gross <jgross(a)suse.com>
diff --git a/drivers/xen/gntdev-common.h b/drivers/xen/gntdev-common.h
index 20d7d059dadb..40ef379c28ab 100644
--- a/drivers/xen/gntdev-common.h
+++ b/drivers/xen/gntdev-common.h
@@ -16,6 +16,7 @@
#include <linux/mmu_notifier.h>
#include <linux/types.h>
#include <xen/interface/event_channel.h>
+#include <xen/grant_table.h>
struct gntdev_dmabuf_priv;
@@ -56,6 +57,7 @@ struct gntdev_grant_map {
struct gnttab_unmap_grant_ref *unmap_ops;
struct gnttab_map_grant_ref *kmap_ops;
struct gnttab_unmap_grant_ref *kunmap_ops;
+ bool *being_removed;
struct page **pages;
unsigned long pages_vm_start;
@@ -73,6 +75,11 @@ struct gntdev_grant_map {
/* Needed to avoid allocation in gnttab_dma_free_pages(). */
xen_pfn_t *frames;
#endif
+
+ /* Number of live grants */
+ atomic_t live_grants;
+ /* Needed to avoid allocation in __unmap_grant_pages */
+ struct gntab_unmap_queue_data unmap_data;
};
struct gntdev_grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count,
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 59ffea800079..4b56c39f766d 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -35,6 +35,7 @@
#include <linux/slab.h>
#include <linux/highmem.h>
#include <linux/refcount.h>
+#include <linux/workqueue.h>
#include <xen/xen.h>
#include <xen/grant_table.h>
@@ -60,10 +61,11 @@ module_param(limit, uint, 0644);
MODULE_PARM_DESC(limit,
"Maximum number of grants that may be mapped by one mapping request");
+/* True in PV mode, false otherwise */
static int use_ptemod;
-static int unmap_grant_pages(struct gntdev_grant_map *map,
- int offset, int pages);
+static void unmap_grant_pages(struct gntdev_grant_map *map,
+ int offset, int pages);
static struct miscdevice gntdev_miscdev;
@@ -120,6 +122,7 @@ static void gntdev_free_map(struct gntdev_grant_map *map)
kvfree(map->unmap_ops);
kvfree(map->kmap_ops);
kvfree(map->kunmap_ops);
+ kvfree(map->being_removed);
kfree(map);
}
@@ -140,10 +143,13 @@ struct gntdev_grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count,
add->unmap_ops = kvmalloc_array(count, sizeof(add->unmap_ops[0]),
GFP_KERNEL);
add->pages = kvcalloc(count, sizeof(add->pages[0]), GFP_KERNEL);
+ add->being_removed =
+ kvcalloc(count, sizeof(add->being_removed[0]), GFP_KERNEL);
if (NULL == add->grants ||
NULL == add->map_ops ||
NULL == add->unmap_ops ||
- NULL == add->pages)
+ NULL == add->pages ||
+ NULL == add->being_removed)
goto err;
if (use_ptemod) {
add->kmap_ops = kvmalloc_array(count, sizeof(add->kmap_ops[0]),
@@ -250,9 +256,36 @@ void gntdev_put_map(struct gntdev_priv *priv, struct gntdev_grant_map *map)
if (!refcount_dec_and_test(&map->users))
return;
- if (map->pages && !use_ptemod)
+ if (map->pages && !use_ptemod) {
+ /*
+ * Increment the reference count. This ensures that the
+ * subsequent call to unmap_grant_pages() will not wind up
+ * re-entering itself. It *can* wind up calling
+ * gntdev_put_map() recursively, but such calls will be with a
+ * reference count greater than 1, so they will return before
+ * this code is reached. The recursion depth is thus limited to
+ * 1. Do NOT use refcount_inc() here, as it will detect that
+ * the reference count is zero and WARN().
+ */
+ refcount_set(&map->users, 1);
+
+ /*
+ * Unmap the grants. This may or may not be asynchronous, so it
+ * is possible that the reference count is 1 on return, but it
+ * could also be greater than 1.
+ */
unmap_grant_pages(map, 0, map->count);
+ /* Check if the memory now needs to be freed */
+ if (!refcount_dec_and_test(&map->users))
+ return;
+
+ /*
+ * All pages have been returned to the hypervisor, so free the
+ * map.
+ */
+ }
+
if (map->notify.flags & UNMAP_NOTIFY_SEND_EVENT) {
notify_remote_via_evtchn(map->notify.event);
evtchn_put(map->notify.event);
@@ -283,6 +316,7 @@ static int find_grant_ptes(pte_t *pte, unsigned long addr, void *data)
int gntdev_map_grant_pages(struct gntdev_grant_map *map)
{
+ size_t alloced = 0;
int i, err = 0;
if (!use_ptemod) {
@@ -331,97 +365,116 @@ int gntdev_map_grant_pages(struct gntdev_grant_map *map)
map->count);
for (i = 0; i < map->count; i++) {
- if (map->map_ops[i].status == GNTST_okay)
+ if (map->map_ops[i].status == GNTST_okay) {
map->unmap_ops[i].handle = map->map_ops[i].handle;
- else if (!err)
+ if (!use_ptemod)
+ alloced++;
+ } else if (!err)
err = -EINVAL;
if (map->flags & GNTMAP_device_map)
map->unmap_ops[i].dev_bus_addr = map->map_ops[i].dev_bus_addr;
if (use_ptemod) {
- if (map->kmap_ops[i].status == GNTST_okay)
+ if (map->kmap_ops[i].status == GNTST_okay) {
+ if (map->map_ops[i].status == GNTST_okay)
+ alloced++;
map->kunmap_ops[i].handle = map->kmap_ops[i].handle;
- else if (!err)
+ } else if (!err)
err = -EINVAL;
}
}
+ atomic_add(alloced, &map->live_grants);
return err;
}
-static int __unmap_grant_pages(struct gntdev_grant_map *map, int offset,
- int pages)
+static void __unmap_grant_pages_done(int result,
+ struct gntab_unmap_queue_data *data)
{
- int i, err = 0;
- struct gntab_unmap_queue_data unmap_data;
-
- if (map->notify.flags & UNMAP_NOTIFY_CLEAR_BYTE) {
- int pgno = (map->notify.addr >> PAGE_SHIFT);
- if (pgno >= offset && pgno < offset + pages) {
- /* No need for kmap, pages are in lowmem */
- uint8_t *tmp = pfn_to_kaddr(page_to_pfn(map->pages[pgno]));
- tmp[map->notify.addr & (PAGE_SIZE-1)] = 0;
- map->notify.flags &= ~UNMAP_NOTIFY_CLEAR_BYTE;
- }
- }
-
- unmap_data.unmap_ops = map->unmap_ops + offset;
- unmap_data.kunmap_ops = use_ptemod ? map->kunmap_ops + offset : NULL;
- unmap_data.pages = map->pages + offset;
- unmap_data.count = pages;
-
- err = gnttab_unmap_refs_sync(&unmap_data);
- if (err)
- return err;
+ unsigned int i;
+ struct gntdev_grant_map *map = data->data;
+ unsigned int offset = data->unmap_ops - map->unmap_ops;
- for (i = 0; i < pages; i++) {
- if (map->unmap_ops[offset+i].status)
- err = -EINVAL;
+ for (i = 0; i < data->count; i++) {
+ WARN_ON(map->unmap_ops[offset+i].status);
pr_debug("unmap handle=%d st=%d\n",
map->unmap_ops[offset+i].handle,
map->unmap_ops[offset+i].status);
map->unmap_ops[offset+i].handle = INVALID_GRANT_HANDLE;
if (use_ptemod) {
- if (map->kunmap_ops[offset+i].status)
- err = -EINVAL;
+ WARN_ON(map->kunmap_ops[offset+i].status);
pr_debug("kunmap handle=%u st=%d\n",
map->kunmap_ops[offset+i].handle,
map->kunmap_ops[offset+i].status);
map->kunmap_ops[offset+i].handle = INVALID_GRANT_HANDLE;
}
}
- return err;
+ /*
+ * Decrease the live-grant counter. This must happen after the loop to
+ * prevent premature reuse of the grants by gnttab_mmap().
+ */
+ atomic_sub(data->count, &map->live_grants);
+
+ /* Release reference taken by __unmap_grant_pages */
+ gntdev_put_map(NULL, map);
+}
+
+static void __unmap_grant_pages(struct gntdev_grant_map *map, int offset,
+ int pages)
+{
+ if (map->notify.flags & UNMAP_NOTIFY_CLEAR_BYTE) {
+ int pgno = (map->notify.addr >> PAGE_SHIFT);
+
+ if (pgno >= offset && pgno < offset + pages) {
+ /* No need for kmap, pages are in lowmem */
+ uint8_t *tmp = pfn_to_kaddr(page_to_pfn(map->pages[pgno]));
+
+ tmp[map->notify.addr & (PAGE_SIZE-1)] = 0;
+ map->notify.flags &= ~UNMAP_NOTIFY_CLEAR_BYTE;
+ }
+ }
+
+ map->unmap_data.unmap_ops = map->unmap_ops + offset;
+ map->unmap_data.kunmap_ops = use_ptemod ? map->kunmap_ops + offset : NULL;
+ map->unmap_data.pages = map->pages + offset;
+ map->unmap_data.count = pages;
+ map->unmap_data.done = __unmap_grant_pages_done;
+ map->unmap_data.data = map;
+ refcount_inc(&map->users); /* to keep map alive during async call below */
+
+ gnttab_unmap_refs_async(&map->unmap_data);
}
-static int unmap_grant_pages(struct gntdev_grant_map *map, int offset,
- int pages)
+static void unmap_grant_pages(struct gntdev_grant_map *map, int offset,
+ int pages)
{
- int range, err = 0;
+ int range;
+
+ if (atomic_read(&map->live_grants) == 0)
+ return; /* Nothing to do */
pr_debug("unmap %d+%d [%d+%d]\n", map->index, map->count, offset, pages);
/* It is possible the requested range will have a "hole" where we
* already unmapped some of the grants. Only unmap valid ranges.
*/
- while (pages && !err) {
- while (pages &&
- map->unmap_ops[offset].handle == INVALID_GRANT_HANDLE) {
+ while (pages) {
+ while (pages && map->being_removed[offset]) {
offset++;
pages--;
}
range = 0;
while (range < pages) {
- if (map->unmap_ops[offset + range].handle ==
- INVALID_GRANT_HANDLE)
+ if (map->being_removed[offset + range])
break;
+ map->being_removed[offset + range] = true;
range++;
}
- err = __unmap_grant_pages(map, offset, range);
+ if (range)
+ __unmap_grant_pages(map, offset, range);
offset += range;
pages -= range;
}
-
- return err;
}
/* ------------------------------------------------------------------ */
@@ -473,7 +526,6 @@ static bool gntdev_invalidate(struct mmu_interval_notifier *mn,
struct gntdev_grant_map *map =
container_of(mn, struct gntdev_grant_map, notifier);
unsigned long mstart, mend;
- int err;
if (!mmu_notifier_range_blockable(range))
return false;
@@ -494,10 +546,9 @@ static bool gntdev_invalidate(struct mmu_interval_notifier *mn,
map->index, map->count,
map->vma->vm_start, map->vma->vm_end,
range->start, range->end, mstart, mend);
- err = unmap_grant_pages(map,
+ unmap_grant_pages(map,
(mstart - map->vma->vm_start) >> PAGE_SHIFT,
(mend - mstart) >> PAGE_SHIFT);
- WARN_ON(err);
return true;
}
@@ -985,6 +1036,10 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
goto unlock_out;
if (use_ptemod && map->vma)
goto unlock_out;
+ if (atomic_read(&map->live_grants)) {
+ err = -EAGAIN;
+ goto unlock_out;
+ }
refcount_inc(&map->users);
vma->vm_ops = &gntdev_vmops;
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From dbe97cff7dd9f0f75c524afdd55ad46be3d15295 Mon Sep 17 00:00:00 2001
From: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Date: Tue, 21 Jun 2022 22:27:26 -0400
Subject: [PATCH] xen/gntdev: Avoid blocking in unmap_grant_pages()
unmap_grant_pages() currently waits for the pages to no longer be used.
In https://github.com/QubesOS/qubes-issues/issues/7481, this lead to a
deadlock against i915: i915 was waiting for gntdev's MMU notifier to
finish, while gntdev was waiting for i915 to free its pages. I also
believe this is responsible for various deadlocks I have experienced in
the past.
Avoid these problems by making unmap_grant_pages async. This requires
making it return void, as any errors will not be available when the
function returns. Fortunately, the only use of the return value is a
WARN_ON(), which can be replaced by a WARN_ON when the error is
detected. Additionally, a failed call will not prevent further calls
from being made, but this is harmless.
Because unmap_grant_pages is now async, the grant handle will be sent to
INVALID_GRANT_HANDLE too late to prevent multiple unmaps of the same
handle. Instead, a separate bool array is allocated for this purpose.
This wastes memory, but stuffing this information in padding bytes is
too fragile. Furthermore, it is necessary to grab a reference to the
map before making the asynchronous call, and release the reference when
the call returns.
It is also necessary to guard against reentrancy in gntdev_map_put(),
and to handle the case where userspace tries to map a mapping whose
contents have not all been freed yet.
Fixes: 745282256c75 ("xen/gntdev: safely unmap grants in case they are still in use")
Cc: stable(a)vger.kernel.org
Signed-off-by: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Reviewed-by: Juergen Gross <jgross(a)suse.com>
Link: https://lore.kernel.org/r/20220622022726.2538-1-demi@invisiblethingslab.com
Signed-off-by: Juergen Gross <jgross(a)suse.com>
diff --git a/drivers/xen/gntdev-common.h b/drivers/xen/gntdev-common.h
index 20d7d059dadb..40ef379c28ab 100644
--- a/drivers/xen/gntdev-common.h
+++ b/drivers/xen/gntdev-common.h
@@ -16,6 +16,7 @@
#include <linux/mmu_notifier.h>
#include <linux/types.h>
#include <xen/interface/event_channel.h>
+#include <xen/grant_table.h>
struct gntdev_dmabuf_priv;
@@ -56,6 +57,7 @@ struct gntdev_grant_map {
struct gnttab_unmap_grant_ref *unmap_ops;
struct gnttab_map_grant_ref *kmap_ops;
struct gnttab_unmap_grant_ref *kunmap_ops;
+ bool *being_removed;
struct page **pages;
unsigned long pages_vm_start;
@@ -73,6 +75,11 @@ struct gntdev_grant_map {
/* Needed to avoid allocation in gnttab_dma_free_pages(). */
xen_pfn_t *frames;
#endif
+
+ /* Number of live grants */
+ atomic_t live_grants;
+ /* Needed to avoid allocation in __unmap_grant_pages */
+ struct gntab_unmap_queue_data unmap_data;
};
struct gntdev_grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count,
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 59ffea800079..4b56c39f766d 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -35,6 +35,7 @@
#include <linux/slab.h>
#include <linux/highmem.h>
#include <linux/refcount.h>
+#include <linux/workqueue.h>
#include <xen/xen.h>
#include <xen/grant_table.h>
@@ -60,10 +61,11 @@ module_param(limit, uint, 0644);
MODULE_PARM_DESC(limit,
"Maximum number of grants that may be mapped by one mapping request");
+/* True in PV mode, false otherwise */
static int use_ptemod;
-static int unmap_grant_pages(struct gntdev_grant_map *map,
- int offset, int pages);
+static void unmap_grant_pages(struct gntdev_grant_map *map,
+ int offset, int pages);
static struct miscdevice gntdev_miscdev;
@@ -120,6 +122,7 @@ static void gntdev_free_map(struct gntdev_grant_map *map)
kvfree(map->unmap_ops);
kvfree(map->kmap_ops);
kvfree(map->kunmap_ops);
+ kvfree(map->being_removed);
kfree(map);
}
@@ -140,10 +143,13 @@ struct gntdev_grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count,
add->unmap_ops = kvmalloc_array(count, sizeof(add->unmap_ops[0]),
GFP_KERNEL);
add->pages = kvcalloc(count, sizeof(add->pages[0]), GFP_KERNEL);
+ add->being_removed =
+ kvcalloc(count, sizeof(add->being_removed[0]), GFP_KERNEL);
if (NULL == add->grants ||
NULL == add->map_ops ||
NULL == add->unmap_ops ||
- NULL == add->pages)
+ NULL == add->pages ||
+ NULL == add->being_removed)
goto err;
if (use_ptemod) {
add->kmap_ops = kvmalloc_array(count, sizeof(add->kmap_ops[0]),
@@ -250,9 +256,36 @@ void gntdev_put_map(struct gntdev_priv *priv, struct gntdev_grant_map *map)
if (!refcount_dec_and_test(&map->users))
return;
- if (map->pages && !use_ptemod)
+ if (map->pages && !use_ptemod) {
+ /*
+ * Increment the reference count. This ensures that the
+ * subsequent call to unmap_grant_pages() will not wind up
+ * re-entering itself. It *can* wind up calling
+ * gntdev_put_map() recursively, but such calls will be with a
+ * reference count greater than 1, so they will return before
+ * this code is reached. The recursion depth is thus limited to
+ * 1. Do NOT use refcount_inc() here, as it will detect that
+ * the reference count is zero and WARN().
+ */
+ refcount_set(&map->users, 1);
+
+ /*
+ * Unmap the grants. This may or may not be asynchronous, so it
+ * is possible that the reference count is 1 on return, but it
+ * could also be greater than 1.
+ */
unmap_grant_pages(map, 0, map->count);
+ /* Check if the memory now needs to be freed */
+ if (!refcount_dec_and_test(&map->users))
+ return;
+
+ /*
+ * All pages have been returned to the hypervisor, so free the
+ * map.
+ */
+ }
+
if (map->notify.flags & UNMAP_NOTIFY_SEND_EVENT) {
notify_remote_via_evtchn(map->notify.event);
evtchn_put(map->notify.event);
@@ -283,6 +316,7 @@ static int find_grant_ptes(pte_t *pte, unsigned long addr, void *data)
int gntdev_map_grant_pages(struct gntdev_grant_map *map)
{
+ size_t alloced = 0;
int i, err = 0;
if (!use_ptemod) {
@@ -331,97 +365,116 @@ int gntdev_map_grant_pages(struct gntdev_grant_map *map)
map->count);
for (i = 0; i < map->count; i++) {
- if (map->map_ops[i].status == GNTST_okay)
+ if (map->map_ops[i].status == GNTST_okay) {
map->unmap_ops[i].handle = map->map_ops[i].handle;
- else if (!err)
+ if (!use_ptemod)
+ alloced++;
+ } else if (!err)
err = -EINVAL;
if (map->flags & GNTMAP_device_map)
map->unmap_ops[i].dev_bus_addr = map->map_ops[i].dev_bus_addr;
if (use_ptemod) {
- if (map->kmap_ops[i].status == GNTST_okay)
+ if (map->kmap_ops[i].status == GNTST_okay) {
+ if (map->map_ops[i].status == GNTST_okay)
+ alloced++;
map->kunmap_ops[i].handle = map->kmap_ops[i].handle;
- else if (!err)
+ } else if (!err)
err = -EINVAL;
}
}
+ atomic_add(alloced, &map->live_grants);
return err;
}
-static int __unmap_grant_pages(struct gntdev_grant_map *map, int offset,
- int pages)
+static void __unmap_grant_pages_done(int result,
+ struct gntab_unmap_queue_data *data)
{
- int i, err = 0;
- struct gntab_unmap_queue_data unmap_data;
-
- if (map->notify.flags & UNMAP_NOTIFY_CLEAR_BYTE) {
- int pgno = (map->notify.addr >> PAGE_SHIFT);
- if (pgno >= offset && pgno < offset + pages) {
- /* No need for kmap, pages are in lowmem */
- uint8_t *tmp = pfn_to_kaddr(page_to_pfn(map->pages[pgno]));
- tmp[map->notify.addr & (PAGE_SIZE-1)] = 0;
- map->notify.flags &= ~UNMAP_NOTIFY_CLEAR_BYTE;
- }
- }
-
- unmap_data.unmap_ops = map->unmap_ops + offset;
- unmap_data.kunmap_ops = use_ptemod ? map->kunmap_ops + offset : NULL;
- unmap_data.pages = map->pages + offset;
- unmap_data.count = pages;
-
- err = gnttab_unmap_refs_sync(&unmap_data);
- if (err)
- return err;
+ unsigned int i;
+ struct gntdev_grant_map *map = data->data;
+ unsigned int offset = data->unmap_ops - map->unmap_ops;
- for (i = 0; i < pages; i++) {
- if (map->unmap_ops[offset+i].status)
- err = -EINVAL;
+ for (i = 0; i < data->count; i++) {
+ WARN_ON(map->unmap_ops[offset+i].status);
pr_debug("unmap handle=%d st=%d\n",
map->unmap_ops[offset+i].handle,
map->unmap_ops[offset+i].status);
map->unmap_ops[offset+i].handle = INVALID_GRANT_HANDLE;
if (use_ptemod) {
- if (map->kunmap_ops[offset+i].status)
- err = -EINVAL;
+ WARN_ON(map->kunmap_ops[offset+i].status);
pr_debug("kunmap handle=%u st=%d\n",
map->kunmap_ops[offset+i].handle,
map->kunmap_ops[offset+i].status);
map->kunmap_ops[offset+i].handle = INVALID_GRANT_HANDLE;
}
}
- return err;
+ /*
+ * Decrease the live-grant counter. This must happen after the loop to
+ * prevent premature reuse of the grants by gnttab_mmap().
+ */
+ atomic_sub(data->count, &map->live_grants);
+
+ /* Release reference taken by __unmap_grant_pages */
+ gntdev_put_map(NULL, map);
+}
+
+static void __unmap_grant_pages(struct gntdev_grant_map *map, int offset,
+ int pages)
+{
+ if (map->notify.flags & UNMAP_NOTIFY_CLEAR_BYTE) {
+ int pgno = (map->notify.addr >> PAGE_SHIFT);
+
+ if (pgno >= offset && pgno < offset + pages) {
+ /* No need for kmap, pages are in lowmem */
+ uint8_t *tmp = pfn_to_kaddr(page_to_pfn(map->pages[pgno]));
+
+ tmp[map->notify.addr & (PAGE_SIZE-1)] = 0;
+ map->notify.flags &= ~UNMAP_NOTIFY_CLEAR_BYTE;
+ }
+ }
+
+ map->unmap_data.unmap_ops = map->unmap_ops + offset;
+ map->unmap_data.kunmap_ops = use_ptemod ? map->kunmap_ops + offset : NULL;
+ map->unmap_data.pages = map->pages + offset;
+ map->unmap_data.count = pages;
+ map->unmap_data.done = __unmap_grant_pages_done;
+ map->unmap_data.data = map;
+ refcount_inc(&map->users); /* to keep map alive during async call below */
+
+ gnttab_unmap_refs_async(&map->unmap_data);
}
-static int unmap_grant_pages(struct gntdev_grant_map *map, int offset,
- int pages)
+static void unmap_grant_pages(struct gntdev_grant_map *map, int offset,
+ int pages)
{
- int range, err = 0;
+ int range;
+
+ if (atomic_read(&map->live_grants) == 0)
+ return; /* Nothing to do */
pr_debug("unmap %d+%d [%d+%d]\n", map->index, map->count, offset, pages);
/* It is possible the requested range will have a "hole" where we
* already unmapped some of the grants. Only unmap valid ranges.
*/
- while (pages && !err) {
- while (pages &&
- map->unmap_ops[offset].handle == INVALID_GRANT_HANDLE) {
+ while (pages) {
+ while (pages && map->being_removed[offset]) {
offset++;
pages--;
}
range = 0;
while (range < pages) {
- if (map->unmap_ops[offset + range].handle ==
- INVALID_GRANT_HANDLE)
+ if (map->being_removed[offset + range])
break;
+ map->being_removed[offset + range] = true;
range++;
}
- err = __unmap_grant_pages(map, offset, range);
+ if (range)
+ __unmap_grant_pages(map, offset, range);
offset += range;
pages -= range;
}
-
- return err;
}
/* ------------------------------------------------------------------ */
@@ -473,7 +526,6 @@ static bool gntdev_invalidate(struct mmu_interval_notifier *mn,
struct gntdev_grant_map *map =
container_of(mn, struct gntdev_grant_map, notifier);
unsigned long mstart, mend;
- int err;
if (!mmu_notifier_range_blockable(range))
return false;
@@ -494,10 +546,9 @@ static bool gntdev_invalidate(struct mmu_interval_notifier *mn,
map->index, map->count,
map->vma->vm_start, map->vma->vm_end,
range->start, range->end, mstart, mend);
- err = unmap_grant_pages(map,
+ unmap_grant_pages(map,
(mstart - map->vma->vm_start) >> PAGE_SHIFT,
(mend - mstart) >> PAGE_SHIFT);
- WARN_ON(err);
return true;
}
@@ -985,6 +1036,10 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
goto unlock_out;
if (use_ptemod && map->vma)
goto unlock_out;
+ if (atomic_read(&map->live_grants)) {
+ err = -EAGAIN;
+ goto unlock_out;
+ }
refcount_inc(&map->users);
vma->vm_ops = &gntdev_vmops;
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From dbe97cff7dd9f0f75c524afdd55ad46be3d15295 Mon Sep 17 00:00:00 2001
From: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Date: Tue, 21 Jun 2022 22:27:26 -0400
Subject: [PATCH] xen/gntdev: Avoid blocking in unmap_grant_pages()
unmap_grant_pages() currently waits for the pages to no longer be used.
In https://github.com/QubesOS/qubes-issues/issues/7481, this lead to a
deadlock against i915: i915 was waiting for gntdev's MMU notifier to
finish, while gntdev was waiting for i915 to free its pages. I also
believe this is responsible for various deadlocks I have experienced in
the past.
Avoid these problems by making unmap_grant_pages async. This requires
making it return void, as any errors will not be available when the
function returns. Fortunately, the only use of the return value is a
WARN_ON(), which can be replaced by a WARN_ON when the error is
detected. Additionally, a failed call will not prevent further calls
from being made, but this is harmless.
Because unmap_grant_pages is now async, the grant handle will be sent to
INVALID_GRANT_HANDLE too late to prevent multiple unmaps of the same
handle. Instead, a separate bool array is allocated for this purpose.
This wastes memory, but stuffing this information in padding bytes is
too fragile. Furthermore, it is necessary to grab a reference to the
map before making the asynchronous call, and release the reference when
the call returns.
It is also necessary to guard against reentrancy in gntdev_map_put(),
and to handle the case where userspace tries to map a mapping whose
contents have not all been freed yet.
Fixes: 745282256c75 ("xen/gntdev: safely unmap grants in case they are still in use")
Cc: stable(a)vger.kernel.org
Signed-off-by: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Reviewed-by: Juergen Gross <jgross(a)suse.com>
Link: https://lore.kernel.org/r/20220622022726.2538-1-demi@invisiblethingslab.com
Signed-off-by: Juergen Gross <jgross(a)suse.com>
diff --git a/drivers/xen/gntdev-common.h b/drivers/xen/gntdev-common.h
index 20d7d059dadb..40ef379c28ab 100644
--- a/drivers/xen/gntdev-common.h
+++ b/drivers/xen/gntdev-common.h
@@ -16,6 +16,7 @@
#include <linux/mmu_notifier.h>
#include <linux/types.h>
#include <xen/interface/event_channel.h>
+#include <xen/grant_table.h>
struct gntdev_dmabuf_priv;
@@ -56,6 +57,7 @@ struct gntdev_grant_map {
struct gnttab_unmap_grant_ref *unmap_ops;
struct gnttab_map_grant_ref *kmap_ops;
struct gnttab_unmap_grant_ref *kunmap_ops;
+ bool *being_removed;
struct page **pages;
unsigned long pages_vm_start;
@@ -73,6 +75,11 @@ struct gntdev_grant_map {
/* Needed to avoid allocation in gnttab_dma_free_pages(). */
xen_pfn_t *frames;
#endif
+
+ /* Number of live grants */
+ atomic_t live_grants;
+ /* Needed to avoid allocation in __unmap_grant_pages */
+ struct gntab_unmap_queue_data unmap_data;
};
struct gntdev_grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count,
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 59ffea800079..4b56c39f766d 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -35,6 +35,7 @@
#include <linux/slab.h>
#include <linux/highmem.h>
#include <linux/refcount.h>
+#include <linux/workqueue.h>
#include <xen/xen.h>
#include <xen/grant_table.h>
@@ -60,10 +61,11 @@ module_param(limit, uint, 0644);
MODULE_PARM_DESC(limit,
"Maximum number of grants that may be mapped by one mapping request");
+/* True in PV mode, false otherwise */
static int use_ptemod;
-static int unmap_grant_pages(struct gntdev_grant_map *map,
- int offset, int pages);
+static void unmap_grant_pages(struct gntdev_grant_map *map,
+ int offset, int pages);
static struct miscdevice gntdev_miscdev;
@@ -120,6 +122,7 @@ static void gntdev_free_map(struct gntdev_grant_map *map)
kvfree(map->unmap_ops);
kvfree(map->kmap_ops);
kvfree(map->kunmap_ops);
+ kvfree(map->being_removed);
kfree(map);
}
@@ -140,10 +143,13 @@ struct gntdev_grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count,
add->unmap_ops = kvmalloc_array(count, sizeof(add->unmap_ops[0]),
GFP_KERNEL);
add->pages = kvcalloc(count, sizeof(add->pages[0]), GFP_KERNEL);
+ add->being_removed =
+ kvcalloc(count, sizeof(add->being_removed[0]), GFP_KERNEL);
if (NULL == add->grants ||
NULL == add->map_ops ||
NULL == add->unmap_ops ||
- NULL == add->pages)
+ NULL == add->pages ||
+ NULL == add->being_removed)
goto err;
if (use_ptemod) {
add->kmap_ops = kvmalloc_array(count, sizeof(add->kmap_ops[0]),
@@ -250,9 +256,36 @@ void gntdev_put_map(struct gntdev_priv *priv, struct gntdev_grant_map *map)
if (!refcount_dec_and_test(&map->users))
return;
- if (map->pages && !use_ptemod)
+ if (map->pages && !use_ptemod) {
+ /*
+ * Increment the reference count. This ensures that the
+ * subsequent call to unmap_grant_pages() will not wind up
+ * re-entering itself. It *can* wind up calling
+ * gntdev_put_map() recursively, but such calls will be with a
+ * reference count greater than 1, so they will return before
+ * this code is reached. The recursion depth is thus limited to
+ * 1. Do NOT use refcount_inc() here, as it will detect that
+ * the reference count is zero and WARN().
+ */
+ refcount_set(&map->users, 1);
+
+ /*
+ * Unmap the grants. This may or may not be asynchronous, so it
+ * is possible that the reference count is 1 on return, but it
+ * could also be greater than 1.
+ */
unmap_grant_pages(map, 0, map->count);
+ /* Check if the memory now needs to be freed */
+ if (!refcount_dec_and_test(&map->users))
+ return;
+
+ /*
+ * All pages have been returned to the hypervisor, so free the
+ * map.
+ */
+ }
+
if (map->notify.flags & UNMAP_NOTIFY_SEND_EVENT) {
notify_remote_via_evtchn(map->notify.event);
evtchn_put(map->notify.event);
@@ -283,6 +316,7 @@ static int find_grant_ptes(pte_t *pte, unsigned long addr, void *data)
int gntdev_map_grant_pages(struct gntdev_grant_map *map)
{
+ size_t alloced = 0;
int i, err = 0;
if (!use_ptemod) {
@@ -331,97 +365,116 @@ int gntdev_map_grant_pages(struct gntdev_grant_map *map)
map->count);
for (i = 0; i < map->count; i++) {
- if (map->map_ops[i].status == GNTST_okay)
+ if (map->map_ops[i].status == GNTST_okay) {
map->unmap_ops[i].handle = map->map_ops[i].handle;
- else if (!err)
+ if (!use_ptemod)
+ alloced++;
+ } else if (!err)
err = -EINVAL;
if (map->flags & GNTMAP_device_map)
map->unmap_ops[i].dev_bus_addr = map->map_ops[i].dev_bus_addr;
if (use_ptemod) {
- if (map->kmap_ops[i].status == GNTST_okay)
+ if (map->kmap_ops[i].status == GNTST_okay) {
+ if (map->map_ops[i].status == GNTST_okay)
+ alloced++;
map->kunmap_ops[i].handle = map->kmap_ops[i].handle;
- else if (!err)
+ } else if (!err)
err = -EINVAL;
}
}
+ atomic_add(alloced, &map->live_grants);
return err;
}
-static int __unmap_grant_pages(struct gntdev_grant_map *map, int offset,
- int pages)
+static void __unmap_grant_pages_done(int result,
+ struct gntab_unmap_queue_data *data)
{
- int i, err = 0;
- struct gntab_unmap_queue_data unmap_data;
-
- if (map->notify.flags & UNMAP_NOTIFY_CLEAR_BYTE) {
- int pgno = (map->notify.addr >> PAGE_SHIFT);
- if (pgno >= offset && pgno < offset + pages) {
- /* No need for kmap, pages are in lowmem */
- uint8_t *tmp = pfn_to_kaddr(page_to_pfn(map->pages[pgno]));
- tmp[map->notify.addr & (PAGE_SIZE-1)] = 0;
- map->notify.flags &= ~UNMAP_NOTIFY_CLEAR_BYTE;
- }
- }
-
- unmap_data.unmap_ops = map->unmap_ops + offset;
- unmap_data.kunmap_ops = use_ptemod ? map->kunmap_ops + offset : NULL;
- unmap_data.pages = map->pages + offset;
- unmap_data.count = pages;
-
- err = gnttab_unmap_refs_sync(&unmap_data);
- if (err)
- return err;
+ unsigned int i;
+ struct gntdev_grant_map *map = data->data;
+ unsigned int offset = data->unmap_ops - map->unmap_ops;
- for (i = 0; i < pages; i++) {
- if (map->unmap_ops[offset+i].status)
- err = -EINVAL;
+ for (i = 0; i < data->count; i++) {
+ WARN_ON(map->unmap_ops[offset+i].status);
pr_debug("unmap handle=%d st=%d\n",
map->unmap_ops[offset+i].handle,
map->unmap_ops[offset+i].status);
map->unmap_ops[offset+i].handle = INVALID_GRANT_HANDLE;
if (use_ptemod) {
- if (map->kunmap_ops[offset+i].status)
- err = -EINVAL;
+ WARN_ON(map->kunmap_ops[offset+i].status);
pr_debug("kunmap handle=%u st=%d\n",
map->kunmap_ops[offset+i].handle,
map->kunmap_ops[offset+i].status);
map->kunmap_ops[offset+i].handle = INVALID_GRANT_HANDLE;
}
}
- return err;
+ /*
+ * Decrease the live-grant counter. This must happen after the loop to
+ * prevent premature reuse of the grants by gnttab_mmap().
+ */
+ atomic_sub(data->count, &map->live_grants);
+
+ /* Release reference taken by __unmap_grant_pages */
+ gntdev_put_map(NULL, map);
+}
+
+static void __unmap_grant_pages(struct gntdev_grant_map *map, int offset,
+ int pages)
+{
+ if (map->notify.flags & UNMAP_NOTIFY_CLEAR_BYTE) {
+ int pgno = (map->notify.addr >> PAGE_SHIFT);
+
+ if (pgno >= offset && pgno < offset + pages) {
+ /* No need for kmap, pages are in lowmem */
+ uint8_t *tmp = pfn_to_kaddr(page_to_pfn(map->pages[pgno]));
+
+ tmp[map->notify.addr & (PAGE_SIZE-1)] = 0;
+ map->notify.flags &= ~UNMAP_NOTIFY_CLEAR_BYTE;
+ }
+ }
+
+ map->unmap_data.unmap_ops = map->unmap_ops + offset;
+ map->unmap_data.kunmap_ops = use_ptemod ? map->kunmap_ops + offset : NULL;
+ map->unmap_data.pages = map->pages + offset;
+ map->unmap_data.count = pages;
+ map->unmap_data.done = __unmap_grant_pages_done;
+ map->unmap_data.data = map;
+ refcount_inc(&map->users); /* to keep map alive during async call below */
+
+ gnttab_unmap_refs_async(&map->unmap_data);
}
-static int unmap_grant_pages(struct gntdev_grant_map *map, int offset,
- int pages)
+static void unmap_grant_pages(struct gntdev_grant_map *map, int offset,
+ int pages)
{
- int range, err = 0;
+ int range;
+
+ if (atomic_read(&map->live_grants) == 0)
+ return; /* Nothing to do */
pr_debug("unmap %d+%d [%d+%d]\n", map->index, map->count, offset, pages);
/* It is possible the requested range will have a "hole" where we
* already unmapped some of the grants. Only unmap valid ranges.
*/
- while (pages && !err) {
- while (pages &&
- map->unmap_ops[offset].handle == INVALID_GRANT_HANDLE) {
+ while (pages) {
+ while (pages && map->being_removed[offset]) {
offset++;
pages--;
}
range = 0;
while (range < pages) {
- if (map->unmap_ops[offset + range].handle ==
- INVALID_GRANT_HANDLE)
+ if (map->being_removed[offset + range])
break;
+ map->being_removed[offset + range] = true;
range++;
}
- err = __unmap_grant_pages(map, offset, range);
+ if (range)
+ __unmap_grant_pages(map, offset, range);
offset += range;
pages -= range;
}
-
- return err;
}
/* ------------------------------------------------------------------ */
@@ -473,7 +526,6 @@ static bool gntdev_invalidate(struct mmu_interval_notifier *mn,
struct gntdev_grant_map *map =
container_of(mn, struct gntdev_grant_map, notifier);
unsigned long mstart, mend;
- int err;
if (!mmu_notifier_range_blockable(range))
return false;
@@ -494,10 +546,9 @@ static bool gntdev_invalidate(struct mmu_interval_notifier *mn,
map->index, map->count,
map->vma->vm_start, map->vma->vm_end,
range->start, range->end, mstart, mend);
- err = unmap_grant_pages(map,
+ unmap_grant_pages(map,
(mstart - map->vma->vm_start) >> PAGE_SHIFT,
(mend - mstart) >> PAGE_SHIFT);
- WARN_ON(err);
return true;
}
@@ -985,6 +1036,10 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
goto unlock_out;
if (use_ptemod && map->vma)
goto unlock_out;
+ if (atomic_read(&map->live_grants)) {
+ err = -EAGAIN;
+ goto unlock_out;
+ }
refcount_inc(&map->users);
vma->vm_ops = &gntdev_vmops;
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e3a4167c880cf889f66887a152799df4d609dd21 Mon Sep 17 00:00:00 2001
From: David Sterba <dsterba(a)suse.com>
Date: Thu, 2 Jun 2022 23:57:17 +0200
Subject: [PATCH] btrfs: add error messages to all unrecognized mount options
Almost none of the errors stemming from a valid mount option but wrong
value prints a descriptive message which would help to identify why
mount failed. Like in the linked report:
$ uname -r
v4.19
$ mount -o compress=zstd /dev/sdb /mnt
mount: /mnt: wrong fs type, bad option, bad superblock on
/dev/sdb, missing codepage or helper program, or other error.
$ dmesg
...
BTRFS error (device sdb): open_ctree failed
Errors caused by memory allocation failures are left out as it's not a
user error so reporting that would be confusing.
Link: https://lore.kernel.org/linux-btrfs/9c3fec36-fc61-3a33-4977-a7e207c3fa4e@gm…
CC: stable(a)vger.kernel.org # 4.9+
Reviewed-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: Nikolay Borisov <nborisov(a)suse.com>
Reviewed-by: Anand Jain <anand.jain(a)oracle.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 1387fbe935c1..6627dd7875ee 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -763,6 +763,8 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
compress_force = false;
no_compress++;
} else {
+ btrfs_err(info, "unrecognized compression value %s",
+ args[0].from);
ret = -EINVAL;
goto out;
}
@@ -821,8 +823,11 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
case Opt_thread_pool:
ret = match_int(&args[0], &intarg);
if (ret) {
+ btrfs_err(info, "unrecognized thread_pool value %s",
+ args[0].from);
goto out;
} else if (intarg == 0) {
+ btrfs_err(info, "invalid value 0 for thread_pool");
ret = -EINVAL;
goto out;
}
@@ -883,8 +888,11 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
break;
case Opt_ratio:
ret = match_int(&args[0], &intarg);
- if (ret)
+ if (ret) {
+ btrfs_err(info, "unrecognized metadata_ratio value %s",
+ args[0].from);
goto out;
+ }
info->metadata_ratio = intarg;
btrfs_info(info, "metadata ratio %u",
info->metadata_ratio);
@@ -901,6 +909,8 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
btrfs_set_and_info(info, DISCARD_ASYNC,
"turning on async discard");
} else {
+ btrfs_err(info, "unrecognized discard mode value %s",
+ args[0].from);
ret = -EINVAL;
goto out;
}
@@ -933,6 +943,8 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
btrfs_set_and_info(info, FREE_SPACE_TREE,
"enabling free space tree");
} else {
+ btrfs_err(info, "unrecognized space_cache value %s",
+ args[0].from);
ret = -EINVAL;
goto out;
}
@@ -1014,8 +1026,12 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
break;
case Opt_check_integrity_print_mask:
ret = match_int(&args[0], &intarg);
- if (ret)
+ if (ret) {
+ btrfs_err(info,
+ "unrecognized check_integrity_print_mask value %s",
+ args[0].from);
goto out;
+ }
info->check_integrity_print_mask = intarg;
btrfs_info(info, "check_integrity_print_mask 0x%x",
info->check_integrity_print_mask);
@@ -1030,13 +1046,15 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
goto out;
#endif
case Opt_fatal_errors:
- if (strcmp(args[0].from, "panic") == 0)
+ if (strcmp(args[0].from, "panic") == 0) {
btrfs_set_opt(info->mount_opt,
PANIC_ON_FATAL_ERROR);
- else if (strcmp(args[0].from, "bug") == 0)
+ } else if (strcmp(args[0].from, "bug") == 0) {
btrfs_clear_opt(info->mount_opt,
PANIC_ON_FATAL_ERROR);
- else {
+ } else {
+ btrfs_err(info, "unrecognized fatal_errors value %s",
+ args[0].from);
ret = -EINVAL;
goto out;
}
@@ -1044,8 +1062,12 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
case Opt_commit_interval:
intarg = 0;
ret = match_int(&args[0], &intarg);
- if (ret)
+ if (ret) {
+ btrfs_err(info, "unrecognized commit_interval value %s",
+ args[0].from);
+ ret = -EINVAL;
goto out;
+ }
if (intarg == 0) {
btrfs_info(info,
"using default commit interval %us",
@@ -1059,8 +1081,11 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
break;
case Opt_rescue:
ret = parse_rescue_options(info, args[0].from);
- if (ret < 0)
+ if (ret < 0) {
+ btrfs_err(info, "unrecognized rescue value %s",
+ args[0].from);
goto out;
+ }
break;
#ifdef CONFIG_BTRFS_DEBUG
case Opt_fragment_all:
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e3a4167c880cf889f66887a152799df4d609dd21 Mon Sep 17 00:00:00 2001
From: David Sterba <dsterba(a)suse.com>
Date: Thu, 2 Jun 2022 23:57:17 +0200
Subject: [PATCH] btrfs: add error messages to all unrecognized mount options
Almost none of the errors stemming from a valid mount option but wrong
value prints a descriptive message which would help to identify why
mount failed. Like in the linked report:
$ uname -r
v4.19
$ mount -o compress=zstd /dev/sdb /mnt
mount: /mnt: wrong fs type, bad option, bad superblock on
/dev/sdb, missing codepage or helper program, or other error.
$ dmesg
...
BTRFS error (device sdb): open_ctree failed
Errors caused by memory allocation failures are left out as it's not a
user error so reporting that would be confusing.
Link: https://lore.kernel.org/linux-btrfs/9c3fec36-fc61-3a33-4977-a7e207c3fa4e@gm…
CC: stable(a)vger.kernel.org # 4.9+
Reviewed-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: Nikolay Borisov <nborisov(a)suse.com>
Reviewed-by: Anand Jain <anand.jain(a)oracle.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 1387fbe935c1..6627dd7875ee 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -763,6 +763,8 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
compress_force = false;
no_compress++;
} else {
+ btrfs_err(info, "unrecognized compression value %s",
+ args[0].from);
ret = -EINVAL;
goto out;
}
@@ -821,8 +823,11 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
case Opt_thread_pool:
ret = match_int(&args[0], &intarg);
if (ret) {
+ btrfs_err(info, "unrecognized thread_pool value %s",
+ args[0].from);
goto out;
} else if (intarg == 0) {
+ btrfs_err(info, "invalid value 0 for thread_pool");
ret = -EINVAL;
goto out;
}
@@ -883,8 +888,11 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
break;
case Opt_ratio:
ret = match_int(&args[0], &intarg);
- if (ret)
+ if (ret) {
+ btrfs_err(info, "unrecognized metadata_ratio value %s",
+ args[0].from);
goto out;
+ }
info->metadata_ratio = intarg;
btrfs_info(info, "metadata ratio %u",
info->metadata_ratio);
@@ -901,6 +909,8 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
btrfs_set_and_info(info, DISCARD_ASYNC,
"turning on async discard");
} else {
+ btrfs_err(info, "unrecognized discard mode value %s",
+ args[0].from);
ret = -EINVAL;
goto out;
}
@@ -933,6 +943,8 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
btrfs_set_and_info(info, FREE_SPACE_TREE,
"enabling free space tree");
} else {
+ btrfs_err(info, "unrecognized space_cache value %s",
+ args[0].from);
ret = -EINVAL;
goto out;
}
@@ -1014,8 +1026,12 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
break;
case Opt_check_integrity_print_mask:
ret = match_int(&args[0], &intarg);
- if (ret)
+ if (ret) {
+ btrfs_err(info,
+ "unrecognized check_integrity_print_mask value %s",
+ args[0].from);
goto out;
+ }
info->check_integrity_print_mask = intarg;
btrfs_info(info, "check_integrity_print_mask 0x%x",
info->check_integrity_print_mask);
@@ -1030,13 +1046,15 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
goto out;
#endif
case Opt_fatal_errors:
- if (strcmp(args[0].from, "panic") == 0)
+ if (strcmp(args[0].from, "panic") == 0) {
btrfs_set_opt(info->mount_opt,
PANIC_ON_FATAL_ERROR);
- else if (strcmp(args[0].from, "bug") == 0)
+ } else if (strcmp(args[0].from, "bug") == 0) {
btrfs_clear_opt(info->mount_opt,
PANIC_ON_FATAL_ERROR);
- else {
+ } else {
+ btrfs_err(info, "unrecognized fatal_errors value %s",
+ args[0].from);
ret = -EINVAL;
goto out;
}
@@ -1044,8 +1062,12 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
case Opt_commit_interval:
intarg = 0;
ret = match_int(&args[0], &intarg);
- if (ret)
+ if (ret) {
+ btrfs_err(info, "unrecognized commit_interval value %s",
+ args[0].from);
+ ret = -EINVAL;
goto out;
+ }
if (intarg == 0) {
btrfs_info(info,
"using default commit interval %us",
@@ -1059,8 +1081,11 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
break;
case Opt_rescue:
ret = parse_rescue_options(info, args[0].from);
- if (ret < 0)
+ if (ret < 0) {
+ btrfs_err(info, "unrecognized rescue value %s",
+ args[0].from);
goto out;
+ }
break;
#ifdef CONFIG_BTRFS_DEBUG
case Opt_fragment_all:
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e3a4167c880cf889f66887a152799df4d609dd21 Mon Sep 17 00:00:00 2001
From: David Sterba <dsterba(a)suse.com>
Date: Thu, 2 Jun 2022 23:57:17 +0200
Subject: [PATCH] btrfs: add error messages to all unrecognized mount options
Almost none of the errors stemming from a valid mount option but wrong
value prints a descriptive message which would help to identify why
mount failed. Like in the linked report:
$ uname -r
v4.19
$ mount -o compress=zstd /dev/sdb /mnt
mount: /mnt: wrong fs type, bad option, bad superblock on
/dev/sdb, missing codepage or helper program, or other error.
$ dmesg
...
BTRFS error (device sdb): open_ctree failed
Errors caused by memory allocation failures are left out as it's not a
user error so reporting that would be confusing.
Link: https://lore.kernel.org/linux-btrfs/9c3fec36-fc61-3a33-4977-a7e207c3fa4e@gm…
CC: stable(a)vger.kernel.org # 4.9+
Reviewed-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: Nikolay Borisov <nborisov(a)suse.com>
Reviewed-by: Anand Jain <anand.jain(a)oracle.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 1387fbe935c1..6627dd7875ee 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -763,6 +763,8 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
compress_force = false;
no_compress++;
} else {
+ btrfs_err(info, "unrecognized compression value %s",
+ args[0].from);
ret = -EINVAL;
goto out;
}
@@ -821,8 +823,11 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
case Opt_thread_pool:
ret = match_int(&args[0], &intarg);
if (ret) {
+ btrfs_err(info, "unrecognized thread_pool value %s",
+ args[0].from);
goto out;
} else if (intarg == 0) {
+ btrfs_err(info, "invalid value 0 for thread_pool");
ret = -EINVAL;
goto out;
}
@@ -883,8 +888,11 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
break;
case Opt_ratio:
ret = match_int(&args[0], &intarg);
- if (ret)
+ if (ret) {
+ btrfs_err(info, "unrecognized metadata_ratio value %s",
+ args[0].from);
goto out;
+ }
info->metadata_ratio = intarg;
btrfs_info(info, "metadata ratio %u",
info->metadata_ratio);
@@ -901,6 +909,8 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
btrfs_set_and_info(info, DISCARD_ASYNC,
"turning on async discard");
} else {
+ btrfs_err(info, "unrecognized discard mode value %s",
+ args[0].from);
ret = -EINVAL;
goto out;
}
@@ -933,6 +943,8 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
btrfs_set_and_info(info, FREE_SPACE_TREE,
"enabling free space tree");
} else {
+ btrfs_err(info, "unrecognized space_cache value %s",
+ args[0].from);
ret = -EINVAL;
goto out;
}
@@ -1014,8 +1026,12 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
break;
case Opt_check_integrity_print_mask:
ret = match_int(&args[0], &intarg);
- if (ret)
+ if (ret) {
+ btrfs_err(info,
+ "unrecognized check_integrity_print_mask value %s",
+ args[0].from);
goto out;
+ }
info->check_integrity_print_mask = intarg;
btrfs_info(info, "check_integrity_print_mask 0x%x",
info->check_integrity_print_mask);
@@ -1030,13 +1046,15 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
goto out;
#endif
case Opt_fatal_errors:
- if (strcmp(args[0].from, "panic") == 0)
+ if (strcmp(args[0].from, "panic") == 0) {
btrfs_set_opt(info->mount_opt,
PANIC_ON_FATAL_ERROR);
- else if (strcmp(args[0].from, "bug") == 0)
+ } else if (strcmp(args[0].from, "bug") == 0) {
btrfs_clear_opt(info->mount_opt,
PANIC_ON_FATAL_ERROR);
- else {
+ } else {
+ btrfs_err(info, "unrecognized fatal_errors value %s",
+ args[0].from);
ret = -EINVAL;
goto out;
}
@@ -1044,8 +1062,12 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
case Opt_commit_interval:
intarg = 0;
ret = match_int(&args[0], &intarg);
- if (ret)
+ if (ret) {
+ btrfs_err(info, "unrecognized commit_interval value %s",
+ args[0].from);
+ ret = -EINVAL;
goto out;
+ }
if (intarg == 0) {
btrfs_info(info,
"using default commit interval %us",
@@ -1059,8 +1081,11 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
break;
case Opt_rescue:
ret = parse_rescue_options(info, args[0].from);
- if (ret < 0)
+ if (ret < 0) {
+ btrfs_err(info, "unrecognized rescue value %s",
+ args[0].from);
goto out;
+ }
break;
#ifdef CONFIG_BTRFS_DEBUG
case Opt_fragment_all:
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e3a4167c880cf889f66887a152799df4d609dd21 Mon Sep 17 00:00:00 2001
From: David Sterba <dsterba(a)suse.com>
Date: Thu, 2 Jun 2022 23:57:17 +0200
Subject: [PATCH] btrfs: add error messages to all unrecognized mount options
Almost none of the errors stemming from a valid mount option but wrong
value prints a descriptive message which would help to identify why
mount failed. Like in the linked report:
$ uname -r
v4.19
$ mount -o compress=zstd /dev/sdb /mnt
mount: /mnt: wrong fs type, bad option, bad superblock on
/dev/sdb, missing codepage or helper program, or other error.
$ dmesg
...
BTRFS error (device sdb): open_ctree failed
Errors caused by memory allocation failures are left out as it's not a
user error so reporting that would be confusing.
Link: https://lore.kernel.org/linux-btrfs/9c3fec36-fc61-3a33-4977-a7e207c3fa4e@gm…
CC: stable(a)vger.kernel.org # 4.9+
Reviewed-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: Nikolay Borisov <nborisov(a)suse.com>
Reviewed-by: Anand Jain <anand.jain(a)oracle.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 1387fbe935c1..6627dd7875ee 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -763,6 +763,8 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
compress_force = false;
no_compress++;
} else {
+ btrfs_err(info, "unrecognized compression value %s",
+ args[0].from);
ret = -EINVAL;
goto out;
}
@@ -821,8 +823,11 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
case Opt_thread_pool:
ret = match_int(&args[0], &intarg);
if (ret) {
+ btrfs_err(info, "unrecognized thread_pool value %s",
+ args[0].from);
goto out;
} else if (intarg == 0) {
+ btrfs_err(info, "invalid value 0 for thread_pool");
ret = -EINVAL;
goto out;
}
@@ -883,8 +888,11 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
break;
case Opt_ratio:
ret = match_int(&args[0], &intarg);
- if (ret)
+ if (ret) {
+ btrfs_err(info, "unrecognized metadata_ratio value %s",
+ args[0].from);
goto out;
+ }
info->metadata_ratio = intarg;
btrfs_info(info, "metadata ratio %u",
info->metadata_ratio);
@@ -901,6 +909,8 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
btrfs_set_and_info(info, DISCARD_ASYNC,
"turning on async discard");
} else {
+ btrfs_err(info, "unrecognized discard mode value %s",
+ args[0].from);
ret = -EINVAL;
goto out;
}
@@ -933,6 +943,8 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
btrfs_set_and_info(info, FREE_SPACE_TREE,
"enabling free space tree");
} else {
+ btrfs_err(info, "unrecognized space_cache value %s",
+ args[0].from);
ret = -EINVAL;
goto out;
}
@@ -1014,8 +1026,12 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
break;
case Opt_check_integrity_print_mask:
ret = match_int(&args[0], &intarg);
- if (ret)
+ if (ret) {
+ btrfs_err(info,
+ "unrecognized check_integrity_print_mask value %s",
+ args[0].from);
goto out;
+ }
info->check_integrity_print_mask = intarg;
btrfs_info(info, "check_integrity_print_mask 0x%x",
info->check_integrity_print_mask);
@@ -1030,13 +1046,15 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
goto out;
#endif
case Opt_fatal_errors:
- if (strcmp(args[0].from, "panic") == 0)
+ if (strcmp(args[0].from, "panic") == 0) {
btrfs_set_opt(info->mount_opt,
PANIC_ON_FATAL_ERROR);
- else if (strcmp(args[0].from, "bug") == 0)
+ } else if (strcmp(args[0].from, "bug") == 0) {
btrfs_clear_opt(info->mount_opt,
PANIC_ON_FATAL_ERROR);
- else {
+ } else {
+ btrfs_err(info, "unrecognized fatal_errors value %s",
+ args[0].from);
ret = -EINVAL;
goto out;
}
@@ -1044,8 +1062,12 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
case Opt_commit_interval:
intarg = 0;
ret = match_int(&args[0], &intarg);
- if (ret)
+ if (ret) {
+ btrfs_err(info, "unrecognized commit_interval value %s",
+ args[0].from);
+ ret = -EINVAL;
goto out;
+ }
if (intarg == 0) {
btrfs_info(info,
"using default commit interval %us",
@@ -1059,8 +1081,11 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
break;
case Opt_rescue:
ret = parse_rescue_options(info, args[0].from);
- if (ret < 0)
+ if (ret < 0) {
+ btrfs_err(info, "unrecognized rescue value %s",
+ args[0].from);
goto out;
+ }
break;
#ifdef CONFIG_BTRFS_DEBUG
case Opt_fragment_all: