On Tue, 23 Mar 2021 at 11:32, Peter Zijlstra <peterz@infradead.org> wrote:
On Tue, Mar 23, 2021 at 10:52:41AM +0100, Marco Elver wrote:
with efs->func==__perf_event_enable. I believe it's sufficient to add
	mutex_lock(&parent_event->child_mutex);
	list_del_init(&event->child_list);
	mutex_unlock(&parent_event->child_mutex);
right before removing from context. With the version I have now (below for completeness), extended torture with the above test results in no more warnings and the test also passes.
list_for_each_entry_safe(event, next, &ctx->event_list, event_entry) {
	struct perf_event *parent_event = event->parent;

	if (!event->attr.remove_on_exec)
		continue;
	if (!is_kernel_event(event))
		perf_remove_from_owner(event);
	modified = true;
	if (parent_event) {
		/*
		 * Remove event from parent, to avoid race where the
		 * parent concurrently iterates through its children to
		 * enable, disable, or otherwise modify an event.
		 */
		mutex_lock(&parent_event->child_mutex);
		list_del_init(&event->child_list);
		mutex_unlock(&parent_event->child_mutex);
	}
^^^ this, right?
But that's something perf_event_exit_event() already does. So then you're worried about the order of things.
Correct. We somehow need to prohibit the parent from doing an event_function_call() while we potentially deactivate the context with perf_remove_from_context().
	perf_remove_from_context(event, !!event->parent * DETACH_GROUP);
	perf_event_exit_event(event, ctx, current, true);
}
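For anyone following along, the ordering in the loop above (unlink from the parent's child list under child_mutex, then leave the context) can be shown with a small userspace sketch. Everything named toy_* is a hypothetical stand-in built on pthreads, not kernel code, and the in_context flag only models "still attached to its context". It is a sketch of the ordering under those assumptions, nothing more.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel's types. */
struct toy_link {
	struct toy_link *prev, *next;
};

struct toy_event {
	struct toy_event *parent;
	pthread_mutex_t child_mutex;	/* guards 'children' when this is a parent */
	struct toy_link children;	/* list head of inherited children */
	struct toy_link child_link;	/* this event's entry in parent->children */
	bool in_context;		/* models "still attached to its context" */
};

static void toy_link_init(struct toy_link *l)
{
	l->prev = l->next = l;
}

/* Unlink and re-initialise, so a repeated unlink is harmless. */
static void toy_link_del_init(struct toy_link *l)
{
	l->prev->next = l->next;
	l->next->prev = l->prev;
	toy_link_init(l);
}

static void toy_event_init(struct toy_event *ev, struct toy_event *parent)
{
	ev->parent = parent;
	pthread_mutex_init(&ev->child_mutex, NULL);
	toy_link_init(&ev->children);
	toy_link_init(&ev->child_link);
	ev->in_context = true;
	if (parent) {
		/* Insert at the head of the parent's child list. */
		pthread_mutex_lock(&parent->child_mutex);
		ev->child_link.next = parent->children.next;
		ev->child_link.prev = &parent->children;
		parent->children.next->prev = &ev->child_link;
		parent->children.next = &ev->child_link;
		pthread_mutex_unlock(&parent->child_mutex);
	}
}

/* Parent side: walk the children under child_mutex, return how many were seen. */
static int toy_parent_visit_children(struct toy_event *parent)
{
	int visited = 0;
	struct toy_link *l;

	pthread_mutex_lock(&parent->child_mutex);
	for (l = parent->children.next; l != &parent->children; l = l->next) {
		struct toy_event *child = (struct toy_event *)
			((char *)l - offsetof(struct toy_event, child_link));

		/* Anything still on the list must still be in its context. */
		assert(child->in_context);
		visited++;
	}
	pthread_mutex_unlock(&parent->child_mutex);
	return visited;
}

/* Exec side: unlink from the parent first, only then leave the context. */
static void toy_remove_on_exec(struct toy_event *ev)
{
	if (ev->parent) {
		pthread_mutex_lock(&ev->parent->child_mutex);
		toy_link_del_init(&ev->child_link);
		pthread_mutex_unlock(&ev->parent->child_mutex);
	}
	ev->in_context = false;
}
```

Because the unlink happens under the same mutex the parent holds while iterating, the parent-side walk in this model can never reach an event that has already left its context. Swapping the two steps in toy_remove_on_exec() would open exactly that window.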
perf_event_release_kernel() first does perf_remove_from_context() and then clears the child_list, and that makes sense because if we're there, there's no external access anymore, the filedesc is gone and nobody will be iterating child_list anymore.
perf_event_exit_task_context() and perf_event_exit_event() OTOH seem to rely on ctx->task == TOMBSTONE to sabotage event_function_call() such that if anybody is iterating the child_list, it'll NOP out.
But here we have neither, and thus need to worry about the order vs child_list iteration.
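To illustrate the TOMBSTONE guard described above, here is a minimal userspace sketch of a function-call helper that turns into a NOP once the context's task pointer has been replaced by a sentinel. All tomb_* names are hypothetical and only model the idea; the real event_function_call() does considerably more than this.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; not the kernel's structures. */
struct tomb_task {
	int pid;
};

/* Sentinel meaning "the context's task is gone"; never dereferenced. */
#define TOMB_TASK_GONE ((struct tomb_task *)-1L)

struct tomb_ctx {
	struct tomb_task *task;
};

struct tomb_event {
	struct tomb_ctx *ctx;
	int calls;	/* how many times a function actually ran on this event */
};

typedef void (*tomb_func_t)(struct tomb_event *);

static void tomb_count_call(struct tomb_event *ev)
{
	ev->calls++;
}

/*
 * Run fn on the event unless its context has been tombstoned.
 * Returns true if fn ran, false if the call was dropped.
 */
static bool tomb_function_call(struct tomb_event *ev, tomb_func_t fn)
{
	if (ev->ctx->task == TOMB_TASK_GONE)
		return false;
	fn(ev);
	return true;
}
```

In this model a late caller that still holds a pointer to the event does no harm after the sentinel is set. The remove_on_exec path has no such guard, since the task stays alive across exec, which is why the ordering matters there.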
I suppose we should stick sync_child_event() in there as well.
And at that point there's very little value in still using perf_event_exit_event()... let me see if there's something to be done about that.
I don't mind dropping use of perf_event_exit_event() and open coding all of this. That would also avoid modifying perf_event_exit_event().
But I leave it to you what you think is nicest.
Thanks,
-- Marco