When doing an atomic modeset with ALLOW_MODESET drivers are allowed to
pull in arbitrary other resources, including CRTCs (e.g. when
reconfiguring global resources).
But in nonblocking mode userspace has then no idea this happened,
which can lead to spurious EBUSY calls, both:
- when that other CRTC is currently busy doing a page_flip the
ALLOW_MODESET commit can fail with an EBUSY
- on the other CRTC a normal atomic flip can fail with EBUSY because
of the additional commit inserted by the kernel without userspace's
knowledge
For blocking commits this isn't a problem, because everyone else will
just block until all the CRTC are reconfigured. Only thing userspace
can notice is the dropped frames without any reason for why frames got
dropped.
Consensus is that we need new uapi to handle this properly, but no one
has any idea what exactly the new uapi should look like. As a stop-gap
plug this problem by demoting nonblocking commits which might cause
issues by including CRTCs not in the original request to blocking
commits.
v2: Add comments and a WARN_ON to enforce this only when allowed - we
don't want to silently convert page flips into blocking plane updates
just because the driver is buggy.
References: https://lists.freedesktop.org/archives/dri-devel/2018-July/182281.html
Bugzilla: https://gitlab.freedesktop.org/wayland/weston/issues/24#note_9568
Cc: Daniel Stone <daniel(a)fooishbar.org>
Cc: Pekka Paalanen <pekka.paalanen(a)collabora.co.uk>
Cc: stable(a)vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com>
---
drivers/gpu/drm/drm_atomic.c | 34 +++++++++++++++++++++++++++++++---
1 file changed, 31 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index d5cefb1cb2a2..058512f14772 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -2018,15 +2018,43 @@ EXPORT_SYMBOL(drm_atomic_commit);
int drm_atomic_nonblocking_commit(struct drm_atomic_state *state)
{
struct drm_mode_config *config = &state->dev->mode_config;
- int ret;
+ unsigned requested_crtc = 0;
+ unsigned affected_crtc = 0;
+ struct drm_crtc *crtc;
+ struct drm_crtc_state *crtc_state;
+ bool nonblocking = true;
+ int ret, i;
+
+ /*
+ * For commits that allow modesets drivers can add other CRTCs to the
+ * atomic commit, e.g. when they need to reallocate global resources.
+ *
+ * But when userspace also requests a nonblocking commit then userspace
+ * cannot know that the commit affects other CRTCs, which can result in
+ * spurious EBUSY failures. Until we have better uapi plug this by
+ * demoting such commits to blocking mode.
+ */
+ for_each_new_crtc_in_state(state, crtc, crtc_state, i)
+ requested_crtc |= drm_crtc_mask(crtc);
ret = drm_atomic_check_only(state);
if (ret)
return ret;
- DRM_DEBUG_ATOMIC("committing %p nonblocking\n", state);
+ for_each_new_crtc_in_state(state, crtc, crtc_state, i)
+ affected_crtc |= drm_crtc_mask(crtc);
+
+ if (affected_crtc != requested_crtc) {
+ /* adding other CRTC is only allowed for modeset commits */
+ WARN_ON(state->allow_modeset);
+
+ DRM_DEBUG_ATOMIC("demoting %p to blocking mode to avoid EBUSY\n", state);
+ nonblocking = false;
+ } else {
+ DRM_DEBUG_ATOMIC("committing %p nonblocking\n", state);
+ }
- return config->funcs->atomic_commit(state->dev, state, true);
+ return config->funcs->atomic_commit(state->dev, state, nonblocking);
}
EXPORT_SYMBOL(drm_atomic_nonblocking_commit);
--
2.18.0
On Wed, Sep 23, 2020 at 12:31 PM Ville Syrjälä
<ville.syrjala(a)linux.intel.com> wrote:
>
> On Tue, Sep 22, 2020 at 08:18:34PM +0200, Daniel Vetter wrote:
> > When doing an atomic modeset with ALLOW_MODESET drivers are allowed to
> > pull in arbitrary other resources, including CRTCs (e.g. when
> > reconfiguring global resources).
> >
> > But in nonblocking mode userspace has then no idea this happened,
> > which can lead to spurious EBUSY calls, both:
> > - when that other CRTC is currently busy doing a page_flip the
> > ALLOW_MODESET commit can fail with an EBUSY
> > - on the other CRTC a normal atomic flip can fail with EBUSY because
> > of the additional commit inserted by the kernel without userspace's
> > knowledge
> >
> > For blocking commits this isn't a problem, because everyone else will
> > just block until all the CRTC are reconfigured. Only thing userspace
> > can notice is the dropped frames without any reason for why frames got
> > dropped.
> >
> > Consensus is that we need new uapi to handle this properly, but no one
> > has any idea what exactly the new uapi should look like. Since this
> > has been shipping for years already compositors need to deal no matter
> > what, so as a first step just try to enforce this across drivers
> > better with some checks.
> >
> > v2: Add comments and a WARN_ON to enforce this only when allowed - we
> > don't want to silently convert page flips into blocking plane updates
> > just because the driver is buggy.
> >
> > v3: Fix inverted WARN_ON (Pekka).
> >
> > v4: Drop the uapi changes, only add a WARN_ON for now to enforce some
> > rules for drivers.
> >
> > References: https://lists.freedesktop.org/archives/dri-devel/2018-July/182281.html
> > Bugzilla: https://gitlab.freedesktop.org/wayland/weston/issues/24#note_9568
> > Cc: Daniel Stone <daniel(a)fooishbar.org>
> > Cc: Pekka Paalanen <pekka.paalanen(a)collabora.co.uk>
> > Cc: Simon Ser <contact(a)emersion.fr>
> > Cc: stable(a)vger.kernel.org
> > Cc: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
> > Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com>
> > ---
> > drivers/gpu/drm/drm_atomic.c | 27 +++++++++++++++++++++++++++
> > 1 file changed, 27 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
> > index 58527f151984..ef106e7153a6 100644
> > --- a/drivers/gpu/drm/drm_atomic.c
> > +++ b/drivers/gpu/drm/drm_atomic.c
> > @@ -281,6 +281,10 @@ EXPORT_SYMBOL(__drm_atomic_state_free);
> > * needed. It will also grab the relevant CRTC lock to make sure that the state
> > * is consistent.
> > *
> > + * WARNING: Drivers may only add new CRTC states to a @state if
> > + * drm_atomic_state.allow_modeset is set, or if it's a driver-internal commit
> > + * not created by userspace through an IOCTL call.
> > + *
> > * Returns:
> > *
> > * Either the allocated state or the error code encoded into the pointer. When
> > @@ -1262,10 +1266,15 @@ int drm_atomic_check_only(struct drm_atomic_state *state)
> > struct drm_crtc_state *new_crtc_state;
> > struct drm_connector *conn;
> > struct drm_connector_state *conn_state;
> > + unsigned requested_crtc = 0;
> > + unsigned affected_crtc = 0;
> > int i, ret = 0;
> >
> > DRM_DEBUG_ATOMIC("checking %p\n", state);
> >
> > + for_each_new_crtc_in_state(state, crtc, old_crtc_state, i)
> > + requested_crtc |= drm_crtc_mask(crtc);
> > +
> > for_each_oldnew_plane_in_state(state, plane, old_plane_state, new_plane_state, i) {
> > ret = drm_atomic_plane_check(old_plane_state, new_plane_state);
> > if (ret) {
> > @@ -1313,6 +1322,24 @@ int drm_atomic_check_only(struct drm_atomic_state *state)
> > }
> > }
> >
> > + for_each_new_crtc_in_state(state, crtc, old_crtc_state, i)
>
> Inconsistent old vs. new.
Will fix, but also doesn't matter since I don't care about the state,
just that it's in there.
> > + affected_crtc |= drm_crtc_mask(crtc);
> > +
> > + /*
> > + * For commits that allow modesets drivers can add other CRTCs to the
> > + * atomic commit, e.g. when they need to reallocate global resources.
> > + * This can cause spurious EBUSY, which robs compositors of a very
> > + * effective sanity check for their drawing loop. Therefor only allow
> > + * this for modeset commits.
> > + *
> > + * FIXME: Should add affected_crtc mask to the ATOMIC IOCTL as an output
> > + * so compositors know what's going on.
> > + */
> > + if (affected_crtc != requested_crtc) {
> > + /* adding other CRTC is only allowed for modeset commits */
> > + WARN_ON(!state->allow_modeset);
> > + }
>
> I think this means pretty much all non-pageflip commits will
> have to have allow_modeset==true on i915 or else we just can't
> guarantee that we can anything (due to sagv and/or cdclk mainly).
I guess not enough machines with multiple outputs in the shards.
> Also a bit baffled that CI didn't hit this. I think it should be
> totally possible to hit this now. To avoid that I guess we'd just
> need to make intel_atomic_serialize_global_state() fail if it
> has to add any new crtcs when allow_modeset==false. Hopefully
> there aren't many other places that add crtcs to the state
> without forcing a modeset on them.
Oh we don't do that? That feels like a pretty bad bug ... Wacking
random other crtc without allow_modeset is pretty nasty.
-Daniel
>
> > +
> > return 0;
> > }
> > EXPORT_SYMBOL(drm_atomic_check_only);
> > --
> > 2.28.0
>
> --
> Ville Syrjälä
> Intel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
When doing an atomic modeset with ALLOW_MODESET drivers are allowed to
pull in arbitrary other resources, including CRTCs (e.g. when
reconfiguring global resources).
But in nonblocking mode userspace has then no idea this happened,
which can lead to spurious EBUSY calls, both:
- when that other CRTC is currently busy doing a page_flip the
ALLOW_MODESET commit can fail with an EBUSY
- on the other CRTC a normal atomic flip can fail with EBUSY because
of the additional commit inserted by the kernel without userspace's
knowledge
For blocking commits this isn't a problem, because everyone else will
just block until all the CRTC are reconfigured. Only thing userspace
can notice is the dropped frames without any reason for why frames got
dropped.
Consensus is that we need new uapi to handle this properly, but no one
has any idea what exactly the new uapi should look like. Since this
has been shipping for years already compositors need to deal no matter
what, so as a first step just try to enforce this across drivers
better with some checks.
v2: Add comments and a WARN_ON to enforce this only when allowed - we
don't want to silently convert page flips into blocking plane updates
just because the driver is buggy.
v3: Fix inverted WARN_ON (Pekka).
v4: Drop the uapi changes, only add a WARN_ON for now to enforce some
rules for drivers.
References: https://lists.freedesktop.org/archives/dri-devel/2018-July/182281.html
Bugzilla: https://gitlab.freedesktop.org/wayland/weston/issues/24#note_9568
Cc: Daniel Stone <daniel(a)fooishbar.org>
Cc: Pekka Paalanen <pekka.paalanen(a)collabora.co.uk>
Cc: Simon Ser <contact(a)emersion.fr>
Cc: stable(a)vger.kernel.org
Cc: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com>
---
drivers/gpu/drm/drm_atomic.c | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)
diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index 58527f151984..ef106e7153a6 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -281,6 +281,10 @@ EXPORT_SYMBOL(__drm_atomic_state_free);
* needed. It will also grab the relevant CRTC lock to make sure that the state
* is consistent.
*
+ * WARNING: Drivers may only add new CRTC states to a @state if
+ * drm_atomic_state.allow_modeset is set, or if it's a driver-internal commit
+ * not created by userspace through an IOCTL call.
+ *
* Returns:
*
* Either the allocated state or the error code encoded into the pointer. When
@@ -1262,10 +1266,15 @@ int drm_atomic_check_only(struct drm_atomic_state *state)
struct drm_crtc_state *new_crtc_state;
struct drm_connector *conn;
struct drm_connector_state *conn_state;
+ unsigned requested_crtc = 0;
+ unsigned affected_crtc = 0;
int i, ret = 0;
DRM_DEBUG_ATOMIC("checking %p\n", state);
+ for_each_new_crtc_in_state(state, crtc, old_crtc_state, i)
+ requested_crtc |= drm_crtc_mask(crtc);
+
for_each_oldnew_plane_in_state(state, plane, old_plane_state, new_plane_state, i) {
ret = drm_atomic_plane_check(old_plane_state, new_plane_state);
if (ret) {
@@ -1313,6 +1322,24 @@ int drm_atomic_check_only(struct drm_atomic_state *state)
}
}
+ for_each_new_crtc_in_state(state, crtc, old_crtc_state, i)
+ affected_crtc |= drm_crtc_mask(crtc);
+
+ /*
+ * For commits that allow modesets drivers can add other CRTCs to the
+ * atomic commit, e.g. when they need to reallocate global resources.
+ * This can cause spurious EBUSY, which robs compositors of a very
+ * effective sanity check for their drawing loop. Therefor only allow
+ * this for modeset commits.
+ *
+ * FIXME: Should add affected_crtc mask to the ATOMIC IOCTL as an output
+ * so compositors know what's going on.
+ */
+ if (affected_crtc != requested_crtc) {
+ /* adding other CRTC is only allowed for modeset commits */
+ WARN_ON(!state->allow_modeset);
+ }
+
return 0;
}
EXPORT_SYMBOL(drm_atomic_check_only);
--
2.28.0
The original problem was from nvme-over-tcp code, who mistakenly uses
kernel_sendpage() to send pages allocated by __get_free_pages() without
__GFP_COMP flag. Such pages don't have refcount (page_count is 0) on
tail pages, sending them by kernel_sendpage() may trigger a kernel panic
from a corrupted kernel heap, because these pages are incorrectly freed
in network stack as page_count 0 pages.
This patch introduces a helper sendpage_ok(), it returns true if the
checking page,
- is not slab page: PageSlab(page) is false.
- has page refcount: page_count(page) is not zero
All drivers who want to send page to remote end by kernel_sendpage()
may use this helper to check whether the page is OK. If the helper does
not return true, the driver should try other non sendpage method (e.g.
sock_no_sendpage()) to handle the page.
Signed-off-by: Coly Li <colyli(a)suse.de>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni(a)wdc.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Hannes Reinecke <hare(a)suse.de>
Cc: Jan Kara <jack(a)suse.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Mikhail Skorzhinskii <mskorzhinskiy(a)solarflare.com>
Cc: Philipp Reisner <philipp.reisner(a)linbit.com>
Cc: Sagi Grimberg <sagi(a)grimberg.me>
Cc: Vlastimil Babka <vbabka(a)suse.com>
Cc: stable(a)vger.kernel.org
---
include/linux/net.h | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/include/linux/net.h b/include/linux/net.h
index d48ff1180879..05db8690f67e 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -21,6 +21,7 @@
#include <linux/rcupdate.h>
#include <linux/once.h>
#include <linux/fs.h>
+#include <linux/mm.h>
#include <linux/sockptr.h>
#include <uapi/linux/net.h>
@@ -286,6 +287,21 @@ do { \
#define net_get_random_once_wait(buf, nbytes) \
get_random_once_wait((buf), (nbytes))
+/*
+ * E.g. XFS meta- & log-data is in slab pages, or bcache meta
+ * data pages, or other high order pages allocated by
+ * __get_free_pages() without __GFP_COMP, which have a page_count
+ * of 0 and/or have PageSlab() set. We cannot use send_page for
+ * those, as that does get_page(); put_page(); and would cause
+ * either a VM_BUG directly, or __page_cache_release a page that
+ * would actually still be referenced by someone, leading to some
+ * obscure delayed Oops somewhere else.
+ */
+static inline bool sendpage_ok(struct page *page)
+{
+ return !PageSlab(page) && page_count(page) >= 1;
+}
+
int kernel_sendmsg(struct socket *sock, struct msghdr *msg, struct kvec *vec,
size_t num, size_t len);
int kernel_sendmsg_locked(struct sock *sk, struct msghdr *msg,
--
2.26.2