Mateusz Guzik mjguzik@gmail.com writes:
I'll say upfront I'm not an epoll person, just looked here out of curiosity.
Feedbacks always welcomed.
I can agree there is a bug. The event is generated before any of the threads even exist and they only poll for it, none of them consume it.
However, the commit message fails to explain why the change fixes anything and I think your patch de facto reverts e59d3c64cba6 ("epoll: eliminate unnecessary lock for zero timeout"). Looking at that diff the point was to avoid the expensive lock trip if timeout == 0 and there are no events.
As for the bug is, from my reading the ep_start_scan()/ep_done_scan() pair transiently disturbs the state checked by ep_events_available(), resulting in false-negatives. Then the locked check works because by the time you acquire it, the damage is undone.
Exactly so. I can add this into the description.
Given the commits referenced in Fixes:, I suspect the real fix would be to stop destroying that state of course.
But if that's not feasible, I would check if a sequence counter around this would do the trick -- then the racy ep_events_available(ep) upfront would become safe with smaller overhead than with your proposal for the no-event case, but with higher overhead when there is something.
My proposal is trivial to implement, I have no idea if it will get a buy-in though.
My question is whether the performance of epoll_wait() with zero timeout is really that important that we have to complicate things. If epoll_wait() with zero timeout is called repeatedly in a loop but there is no event, I'm sure there will be measurabled performance drop. But sane user would just use timeout in that case.
epoll's data is protected by a lock. Therefore I think the most straightforward solution is just taking the lock before reading the data.
Lockless is hard to get right and may cause hard-to-debug problems. So unless this performance drop somehow bothers someone, I would prefer "keep it simple, stupid".
Best regards, Nam