From userspace, spawning a new process with posix_spawn(), for example, only lets the caller set the scheduling priority value that POSIX defines in the sched_param struct.
However, sched_setparam() and similar syscalls lead to __sched_setscheduler(), which rejects any priority value other than 0 for non-RT scheduling classes, a behavior that has existed since Linux 2.6 or earlier.
Linux translates the sched_param struct into its own internal sched_attr struct during the syscall, but the user currently has no way to manage the other values within the sched_attr struct using only POSIX functions.
The only other way to adjust niceness when using posix_spawn() is to set the value after the process has started, but this introduces the risk that the process has already exited before the syscall can set the priority.
To resolve this, allow the use of the priority value originally from the POSIX sched_param struct in order to set the niceness value instead of rejecting the priority value.
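As an illustration only, the translation can be modeled in userspace as below. The constants mirror include/linux/sched/prio.h; model_priority_fixup() is a hypothetical name, and its "fair" argument stands in for the fair_policy() tests in the patch. The "check" flag is left out of the model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Values mirrored from include/linux/sched/prio.h. */
#define MAX_RT_PRIO		100
#define NICE_WIDTH		40
#define MAX_PRIO		(MAX_RT_PRIO + NICE_WIDTH)	/* 140 */
#define DEFAULT_PRIO		(MAX_RT_PRIO + NICE_WIDTH / 2)	/* 120 */
#define PRIO_TO_NICE(prio)	((prio) - DEFAULT_PRIO)

/*
 * Hypothetical userspace model of the added check, not kernel code.
 * sched_priority is unsigned to mirror the u32 field in struct sched_attr.
 */
static int model_priority_fixup(unsigned int *sched_priority, int *sched_nice,
				bool fair)
{
	assert(sched_priority != NULL && sched_nice != NULL);

	/* Anything above the lowest-priority nice level is invalid. */
	if (*sched_priority > MAX_PRIO - 1)
		return -EINVAL;

	/* For fair policies, 100..139 is reinterpreted as nice -20..19. */
	if (fair && *sched_priority > MAX_RT_PRIO - 1) {
		*sched_nice = PRIO_TO_NICE((int)*sched_priority);
		*sched_priority = 0;
	}

	return 0;
}
```

In this model a priority of 130 for a fair task becomes nice 10 with sched_priority reset to 0, while 140 and above is rejected with -EINVAL.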
Update the sched_get_priority_*() POSIX syscalls to reflect the range of values that is now accepted.
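A rough sketch of how a caller might use this with posix_spawn() follows. It assumes a kernel with this patch applied, where sched_get_priority_min() and sched_get_priority_max() would report 100 and 139 for SCHED_NORMAL. The helper names and ASSUMED_DEFAULT_PRIO are hypothetical and not part of any existing API. On an unpatched kernel, the resulting posix_spawn() call would be expected to fail with EINVAL.

```c
#include <assert.h>
#include <sched.h>
#include <spawn.h>

/* Assumed to match the kernel's DEFAULT_PRIO (nice 0); not a POSIX constant. */
#define ASSUMED_DEFAULT_PRIO	120

/* Hypothetical helper: encode a nice value (-20..19) as a sched_priority. */
static int nice_to_sched_priority(int nice)
{
	assert(nice >= -20 && nice <= 19);
	return ASSUMED_DEFAULT_PRIO + nice;
}

/*
 * Hypothetical helper: prepare spawn attributes so that the child has its
 * scheduling parameters set as part of posix_spawn() itself.
 * Returns 0 or an errno value. On success, the caller must release the
 * attributes with posix_spawnattr_destroy().
 */
static int spawnattr_init_with_nice(posix_spawnattr_t *attr, int nice)
{
	struct sched_param param = {
		.sched_priority = nice_to_sched_priority(nice),
	};
	int err;

	err = posix_spawnattr_init(attr);
	if (err)
		return err;

	/* Ask posix_spawn() to apply the sched_param in the child. */
	err = posix_spawnattr_setflags(attr, POSIX_SPAWN_SETSCHEDPARAM);
	if (!err)
		err = posix_spawnattr_setschedparam(attr, &param);
	if (err)
		posix_spawnattr_destroy(attr);

	return err;
}
```

The prepared attributes would then be passed as the attrp argument of posix_spawn(), so the niceness is applied before the new program starts and no follow-up syscall on the child is needed.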

Cc: stable@vger.kernel.org # Apply to kernel/sched/core.c
Signed-off-by: Michael C. Pratt <mcpratt@pm.me>
---
 kernel/sched/syscalls.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c
index 24f9f90b6574..43eb283e6281 100644
--- a/kernel/sched/syscalls.c
+++ b/kernel/sched/syscalls.c
@@ -785,6 +785,19 @@ static int _sched_setscheduler(struct task_struct *p, int policy,
 		attr.sched_policy = policy;
 	}
 
+	if (attr.sched_priority > MAX_PRIO-1)
+		return -EINVAL;
+
+	/*
+	 * If priority is set for SCHED_NORMAL or SCHED_BATCH,
+	 * set the niceness instead, but only for user calls.
+	 */
+	if (check && attr.sched_priority > MAX_RT_PRIO-1 &&
+	    ((policy != SETPARAM_POLICY && fair_policy(policy)) || fair_policy(p->policy))) {
+		attr.sched_nice = PRIO_TO_NICE(attr.sched_priority);
+		attr.sched_priority = 0;
+	}
+
 	return __sched_setscheduler(p, &attr, check, true);
 }
 /**
@@ -1532,9 +1545,11 @@ SYSCALL_DEFINE1(sched_get_priority_max, int, policy)
 	case SCHED_RR:
 		ret = MAX_RT_PRIO-1;
 		break;
-	case SCHED_DEADLINE:
 	case SCHED_NORMAL:
 	case SCHED_BATCH:
+		ret = MAX_PRIO-1;
+		break;
+	case SCHED_DEADLINE:
 	case SCHED_IDLE:
 	case SCHED_EXT:
 		ret = 0;
@@ -1560,9 +1575,11 @@ SYSCALL_DEFINE1(sched_get_priority_min, int, policy)
 	case SCHED_RR:
 		ret = 1;
 		break;
-	case SCHED_DEADLINE:
 	case SCHED_NORMAL:
 	case SCHED_BATCH:
+		ret = MAX_RT_PRIO;
+		break;
+	case SCHED_DEADLINE:
 	case SCHED_IDLE:
 	case SCHED_EXT:
 		ret = 0;

base-commit: 2d5404caa8c7bb5c4e0435f94b28834ae5456623