On Wed, May 22, 2019 at 9:14 AM Oleg Nesterov <oleg@redhat.com> wrote:
On 05/22, Deepa Dinamani wrote:
-Deepa
On May 22, 2019, at 8:05 AM, Oleg Nesterov <oleg@redhat.com> wrote:
On 05/21, Deepa Dinamani wrote:
Note that this patch returns interrupted errors (EINTR, ERESTARTNOHAND, etc) only when there is no other error. If there is a signal and an error like EINVAL, the syscalls return -EINVAL rather than the interrupted error codes.
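The precedence rule described above can be sketched in a few lines of C. This is only an illustration, not code from the patch: final_error() and its parameters are hypothetical names, and only EINTR is modelled because ERESTARTNOHAND is a kernel-internal code that userspace <errno.h> does not define. Any non-zero result is passed through unchanged; that is an assumption of the sketch.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not a kernel function: report an interrupted
 * error only when the syscall itself produced no other result.
 */
static int final_error(int err, bool signal_detected)
{
	/* A real error such as -EINVAL wins over the interruption. */
	if (err != 0)
		return err;

	/* No other error: a detected signal turns into -EINTR. */
	return signal_detected ? -EINTR : 0;
}
```

With a signal detected, final_error(-EINVAL, true) keeps -EINVAL, while final_error(0, true) becomes -EINTR.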
Ugh. I need to re-check, but at first glance I really dislike this change.
I think we can fix the problem _and_ simplify the code. Something like below. The patch is obviously incomplete; it changes only one caller of set_user_sigmask(), epoll_pwait(), to explain what I mean. restore_user_sigmask() should simply die. Although perhaps it makes sense to add another helper that does WARN_ON(test_tsk_restore_sigmask() && !signal_pending).
restore_user_sigmask() was added because of all the variants of these syscalls we added because of y2038 as noted in commit message:
signal: Add restore_user_sigmask()
Refactor the logic to restore the sigmask before the syscall returns into an api. This is useful for versions of syscalls that pass in the sigmask and expect the current->sigmask to be changed during the execution and restored after the execution of the syscall. With the advent of new y2038 syscalls in the subsequent patches, we add two more new versions of the syscalls (for pselect, ppoll and io_pgetevents) in addition to the existing native and compat versions. Adding such an api reduces the logic that would need to be replicated otherwise.
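From userspace, the contract described in this commit message (the passed-in sigmask applies only for the duration of the call and the previous mask is back in place afterwards) can be observed with pselect(). The sketch below is a minimal illustration under stated assumptions: the helper names are hypothetical, a zero timeout is used so the call returns immediately, and no SIGUSR1 is assumed to be pending.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <sys/select.h>
#include <time.h>

/* Returns 1 if SIGUSR1 is currently blocked, 0 if not, -1 on error. */
static int sigusr1_blocked(void)
{
	sigset_t cur;

	/* A NULL new set only queries the current mask. */
	if (sigprocmask(SIG_SETMASK, NULL, &cur) != 0)
		return -1;
	return sigismember(&cur, SIGUSR1);
}

/* Returns 1 if the caller's mask survives a pselect() with a temporary mask. */
static int mask_restored_after_pselect(void)
{
	sigset_t block, during;
	struct timespec zero = { 0, 0 };
	int before, after;

	/* Block SIGUSR1 in the normal mask. */
	sigemptyset(&block);
	sigaddset(&block, SIGUSR1);
	if (sigprocmask(SIG_BLOCK, &block, NULL) != 0)
		return -1;
	before = sigusr1_blocked();

	/* Temporary mask for the wait: nothing blocked. */
	sigemptyset(&during);
	if (pselect(0, NULL, NULL, NULL, &zero, &during) != 0)
		return -1;

	/* After the call the original mask should be in place again. */
	after = sigusr1_blocked();
	return before == 1 && after == 1;
}
```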
Again, I need to re-check, will continue tomorrow. But so far I am not sure this helper can actually help.
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -2318,19 +2318,19 @@ SYSCALL_DEFINE6(epoll_pwait, int, epfd, struct epoll_event __user *, events,
 		size_t, sigsetsize)
 {
 	int error;
-	sigset_t ksigmask, sigsaved;
 	/*
 	 * If the caller wants a certain signal mask to be set during the wait,
 	 * we apply it here.
 	 */
-	error = set_user_sigmask(sigmask, &ksigmask, &sigsaved, sigsetsize);
+	error = set_user_sigmask(sigmask, sigsetsize);
 	if (error)
 		return error;
error = do_epoll_wait(epfd, events, maxevents, timeout);
-	restore_user_sigmask(sigmask, &sigsaved);
+	if (error != -EINTR)
As you address all the other syscalls this condition becomes more and more complicated.
Maybe.
--- a/include/linux/sched/signal.h
+++ b/include/linux/sched/signal.h
@@ -416,7 +416,6 @@ void task_join_group_stop(struct task_struct *task);
 static inline void set_restore_sigmask(void)
 {
 	set_thread_flag(TIF_RESTORE_SIGMASK);
-	WARN_ON(!test_thread_flag(TIF_SIGPENDING));
So you always want do_signal() to be called?
Why do you think so? No. This is just to avoid the warning, because with the patch I sent set_restore_sigmask() is called "in advance".
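The "in advance" scheme can be modelled with a toy userspace structure. This is a sketch of the idea as read from the hunks quoted in this thread, not kernel code: struct toy_task and every toy_ function are hypothetical, signal masks are reduced to a plain bit field, and the assumption that the non-EINTR path restores the mask through a helper is a guess, since the quoted hunk is cut off right after the if line.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for the few task fields involved; not the kernel's task_struct. */
struct toy_task {
	unsigned long blocked;        /* current signal mask */
	unsigned long saved_sigmask;  /* mask to put back later */
	bool restore_sigmask;         /* models TIF_RESTORE_SIGMASK */
};

/* Install the temporary mask and arm the restore flag before waiting. */
static void toy_set_user_sigmask(struct toy_task *t, unsigned long newmask)
{
	t->saved_sigmask = t->blocked;
	t->blocked = newmask;
	t->restore_sigmask = true;
}

/* Put the saved mask back if the flag is still armed. */
static void toy_restore_saved_sigmask(struct toy_task *t)
{
	if (t->restore_sigmask) {
		t->restore_sigmask = false;
		t->blocked = t->saved_sigmask;
	}
}

/*
 * Syscall exit: restore immediately unless the wait was interrupted.
 * On -EINTR the flag stays armed so that the signal-delivery path
 * (not modelled here) can restore the mask afterwards.
 */
static int toy_syscall_exit(struct toy_task *t, int error)
{
	if (error != -EINTR)
		toy_restore_saved_sigmask(t);
	return error;
}
```

In this model the flag is set whether or not a signal is pending, which is why a warning tied to TIF_SIGPENDING at the point of setting the flag would fire on every call.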
You will have to check each architecture's implementation of do_signal() to check if that has any side effects.
I don't think so.
Why not?
Although this is not what the patch is solving.
Sure. But you know, after I tried to read the changelog, I am not sure I understand what exactly you are trying to fix. Could you please explain this part
The behavior before 854a6ed56839a was that the signals were dropped after the error code was decided. This resulted in lost signals but the userspace did not notice it
? I fail to understand it, sorry. It looks as if the code was already buggy before that commit and it could miss a signal or something like this, but I do not see how.
Did you read the explanation pointed to in the commit text?
https://lore.kernel.org/linux-fsdevel/20190427093319.sgicqik2oqkez3wk@dcvr/
Let me know what part you don't understand and I can explain more.
It would be better to understand the issue before we start discussing the fix.
-Deepa