On Wed, Dec 13, 2017 at 01:05:49PM +0000, Chris Wilson wrote:
Quoting Daniel Vetter (2017-12-13 12:49:36)
PROBE_DEFER also uses system_wq to reprobe drivers, which means when that again fails, and we try to flush the overall system_wq (to get all the delayed connectore cleanup work_struct completed), we deadlock.
Fix this by using just a single cleanup work, so that we can only flush that one and don't block on anything else. That means a free list plus locking, a standard pattern.
v2:
- Correctly free connectors only on last ref. Oops (Chris).
- use llist_head/node (Chris).
Fixes: a703c55004e1 ("drm: safely free connectors from connector_iter") Fixes: 613051dac40d ("drm: locking&new iterators for connector_list") Cc: Ben Widawsky ben@bwidawsk.net Cc: Dave Airlie airlied@gmail.com Cc: Chris Wilson chris@chris-wilson.co.uk Cc: Sean Paul seanpaul@chromium.org Cc: stable@vger.kernel.org # v4.11+: 613051dac40d ("drm: locking&new iterators for connector_list" Cc: stable@vger.kernel.org # v4.11+ Cc: Daniel Vetter daniel.vetter@intel.com Cc: Jani Nikula jani.nikula@linux.intel.com Cc: Gustavo Padovan gustavo@padovan.org Cc: David Airlie airlied@linux.ie Cc: Javier Martinez Canillas javier@dowhile0.org Cc: Shuah Khan shuahkh@osg.samsung.com Cc: Guillaume Tucker guillaume.tucker@collabora.com Cc: Mark Brown broonie@kernel.org Cc: Kevin Hilman khilman@baylibre.com Cc: Matt Hart matthew.hart@linaro.org Cc: Thierry Escande thierry.escande@collabora.co.uk Cc: Tomeu Vizoso tomeu.vizoso@collabora.com Cc: Enric Balletbo i Serra enric.balletbo@collabora.com Signed-off-by: Daniel Vetter daniel.vetter@intel.com
drivers/gpu/drm/drm_connector.c | 50 ++++++++++++++++++++++++++----------- drivers/gpu/drm/drm_crtc_internal.h | 1 + drivers/gpu/drm/drm_mode_config.c | 4 ++- include/drm/drm_connector.h | 10 +++++--- include/drm/drm_mode_config.h | 18 ++++++++++++- 5 files changed, 62 insertions(+), 21 deletions(-)
diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c index 0b7e0974e6da..3f53f127e1f2 100644 --- a/drivers/gpu/drm/drm_connector.c +++ b/drivers/gpu/drm/drm_connector.c @@ -153,14 +153,23 @@ static void drm_connector_free(struct kref *kref) connector->funcs->destroy(connector); } -static void drm_connector_free_work_fn(struct work_struct *work) +void drm_connector_free_work_fn(struct work_struct *work) {
struct drm_connector *connector =
container_of(work, struct drm_connector, free_work);
struct drm_device *dev = connector->dev;
struct drm_connector *connector, *n;
struct drm_device *dev =
container_of(work, struct drm_device, mode_config.connector_free_work);
struct drm_mode_config *config = &dev->mode_config;
unsigned long flags;
struct llist_node *freed;
drm_mode_object_unregister(dev, &connector->base);
connector->funcs->destroy(connector);
spin_lock_irqsave(&config->connector_list_lock, flags);
freed = llist_del_all(&config->connector_free_list);
spin_unlock_irqrestore(&config->connector_list_lock, flags);
My understanding is that the spinlock here is only used to guard the free_list. (It's not protecting the final refcount.) In which case it is redundant as llist_del_all/llist_add are a safe lockless combination.
That just makes the patch bigger than has to be, but it looks correct.
+__drm_connector_put_safe(struct drm_connector *conn) {
if (refcount_dec_and_test(&conn->base.refcount.refcount))
schedule_work(&conn->free_work);
struct drm_mode_config *config = &conn->dev->mode_config;
lockdep_assert_held(&config->connector_list_lock);
if (!refcount_dec_and_test(&conn->base.refcount.refcount))
return;
llist_add(&conn->free_node, &config->connector_free_list);
schedule_work(&config->connector_free_work);
(Didn't like the if (llist_add) nano-optimisation? :)
I thought that one might race, since the schedule_work is what provides the crucial barrier here. But then I was kinda too lazy to read all the llist guarantees already and just figured I'll keep the spin_lock stuck around everything.
But yeah it's all neatly lockless, now I'm tempted to redo it all. If CI spots something I'll include it in the respin for sure.
diff --git a/drivers/gpu/drm/drm_crtc_internal.h b/drivers/gpu/drm/drm_crtc_internal.h index 9ebb8841778c..af00f42ba269 100644 --- a/drivers/gpu/drm/drm_crtc_internal.h +++ b/drivers/gpu/drm/drm_crtc_internal.h @@ -142,6 +142,7 @@ int drm_mode_connector_set_obj_prop(struct drm_mode_object *obj, uint64_t value); int drm_connector_create_standard_properties(struct drm_device *dev); const char *drm_get_connector_force_name(enum drm_connector_force force); +void drm_connector_free_work_fn(struct work_struct *work); /* IOCTL */ int drm_mode_connector_property_set_ioctl(struct drm_device *dev, diff --git a/drivers/gpu/drm/drm_mode_config.c b/drivers/gpu/drm/drm_mode_config.c index 6ffe952142e6..7681269abe41 100644 --- a/drivers/gpu/drm/drm_mode_config.c +++ b/drivers/gpu/drm/drm_mode_config.c @@ -382,6 +382,8 @@ void drm_mode_config_init(struct drm_device *dev) ida_init(&dev->mode_config.connector_ida); spin_lock_init(&dev->mode_config.connector_list_lock);
INIT_WORK(&dev->mode_config.connector_free_work, drm_connector_free_work_fn);
A init_llist_head(&dev->mode_config.connector_free_list) wouldn't go amiss here. So perhaps push the connectors init into its own exported function from drm_connector.c as opposed to exposing the free_fn.
Imo it doesn't matter much how we go about drm.ko internals. But I'll stick the init_llist_head in there when applying, somehow I dind't find it (why is every kernel data type slightly different in this).
Reviewed-by: Chris Wilson chris@chris-wilson.co.uk
Thanks, Daniel