The process_madvise() system call returns an error even after processing some of the VMAs passed in the 'struct iovec' vector list, which leaves the user unsure where to restart the advice. This also contradicts the syscall's man page[1], which states that the "return value may be less than the total number of requested bytes, if an error occurred after some iovec elements were already processed."

Consider a user who passes 10 VMAs in the 'struct iovec' vector list, of which the first 9 are processed and the last one fails. The call returns only the error caused by the failed VMA, even though the first 9 VMAs were processed, so the user cannot tell which VMA failed. Returning the number of bytes processed lets the user identify the VMA on which the call failed and then retry or skip the advice on that VMA.

[1] https://man7.org/linux/man-pages/man2/process_madvise.2.html

Fixes: ecb8ac8b1f14 ("mm/madvise: introduce process_madvise() syscall: an external memory hinting API")
Cc: <stable@vger.kernel.org> # 5.10+
Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>
---
Changes in V2:
  -- Separated the ENOMEM handling and return bytes processed, as per Minchan comments.
  -- This contains correcting return bytes processed with process_madvise().

Changes in V1:
  -- Fixed the ENOMEM handling and return bytes processed by process_madvise.
  -- https://patchwork.kernel.org/project/linux-mm/patch/1646803679-11433-1-git-s...

 mm/madvise.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 38d0f51..e97e6a9 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1433,8 +1433,7 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
 		iov_iter_advance(&iter, iovec.iov_len);
 	}

-	if (ret == 0)
-		ret = total_len - iov_iter_count(&iter);
+	ret = (total_len - iov_iter_count(&iter)) ? : ret;

 release_mm:
 	mmput(mm);

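
For illustration only, the change in the return value can be modelled in user space roughly as below. This is a sketch, not the kernel code: the function names are hypothetical, `remaining` stands in for the value of iov_iter_count(&iter), and the GNU C `a ? : b` form used in the patch is spelled out as a portable conditional.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical model of the old logic: the byte count is reported only
 * when no error occurred, so any error hides partial progress.
 */
static inline ssize_t ret_before_patch(size_t total_len, size_t remaining,
				       ssize_t ret)
{
	if (ret == 0)
		ret = (ssize_t)(total_len - remaining);
	return ret;
}

/*
 * Hypothetical model of the patched logic: a nonzero byte count takes
 * precedence, and the error is returned only when nothing was processed.
 */
static inline ssize_t ret_after_patch(size_t total_len, size_t remaining,
				      ssize_t ret)
{
	size_t done = total_len - remaining;

	/* Portable spelling of "done ? : ret" from the patch. */
	return done ? (ssize_t)done : ret;
}

/* Example: ten 4096-byte ranges were requested and the last one failed. */
static inline void return_value_example(void)
{
	assert(ret_before_patch(40960, 4096, -EINVAL) == -EINVAL);
	assert(ret_after_patch(40960, 4096, -EINVAL) == 36864);
}
```

In this model the error is still reported when the very first element fails, because the byte count is then zero.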
On Fri, Mar 11, 2022 at 08:59:05PM +0530, Charan Teja Kalla wrote:
> The process_madvise() system call returns an error even after processing some of the VMAs passed in the 'struct iovec' vector list, which leaves the user unsure where to restart the advice. This also contradicts the syscall's man page[1], which states that the "return value may be less than the total number of requested bytes, if an error occurred after some iovec elements were already processed."
>
> Consider a user who passes 10 VMAs in the 'struct iovec' vector list, of which the first 9 are processed and the last one fails. The call returns only the error caused by the failed VMA, even though the first 9 VMAs were processed, so the user cannot tell which VMA failed. Returning the number of bytes processed lets the user identify the VMA on which the call failed and then retry or skip the advice on that VMA.
>
> [1] https://man7.org/linux/man-pages/man2/process_madvise.2.html
>
> Fixes: ecb8ac8b1f14 ("mm/madvise: introduce process_madvise() syscall: an external memory hinting API")
> Cc: <stable@vger.kernel.org> # 5.10+
> Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>

Acked-by: Minchan Kim <minchan@kernel.org>

On Fri 11-03-22 20:59:05, Charan Teja Kalla wrote:
> The process_madvise() system call returns an error even after processing some of the VMAs passed in the 'struct iovec' vector list, which leaves the user unsure where to restart the advice. This also contradicts the syscall's man page[1], which states that the "return value may be less than the total number of requested bytes, if an error occurred after some iovec elements were already processed."
>
> Consider a user who passes 10 VMAs in the 'struct iovec' vector list, of which the first 9 are processed and the last one fails. The call returns only the error caused by the failed VMA, even though the first 9 VMAs were processed, so the user cannot tell which VMA failed. Returning the number of bytes processed lets the user identify the VMA on which the call failed and then retry or skip the advice on that VMA.
>
> [1] https://man7.org/linux/man-pages/man2/process_madvise.2.html
>
> Fixes: ecb8ac8b1f14 ("mm/madvise: introduce process_madvise() syscall: an external memory hinting API")
> Cc: <stable@vger.kernel.org> # 5.10+
> Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
> Changes in V2:
>   -- Separated the ENOMEM handling and return bytes processed, as per Minchan comments.
>   -- This contains correcting return bytes processed with process_madvise().
>
> Changes in V1:
>   -- Fixed the ENOMEM handling and return bytes processed by process_madvise.
>   -- https://patchwork.kernel.org/project/linux-mm/patch/1646803679-11433-1-git-s...
>
>  mm/madvise.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 38d0f51..e97e6a9 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -1433,8 +1433,7 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
>  		iov_iter_advance(&iter, iovec.iov_len);
>  	}
>
> -	if (ret == 0)
> -		ret = total_len - iov_iter_count(&iter);
> +	ret = (total_len - iov_iter_count(&iter)) ? : ret;
>
>  release_mm:
>  	mmput(mm);
> --
> 2.7.4

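
As a rough illustration of how a caller might use the returned byte count to find the failing VMA, consider the sketch below. The helper name is hypothetical, and it assumes the count always covers whole iovec elements, as the iov_iter_advance(&iter, iovec.iov_len) context line in the diff suggests.

```c
#include <stddef.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: given the number of bytes reported as processed,
 * return the index of the first iovec element that was not processed,
 * or iovcnt if every element was processed.
 */
static inline size_t first_unprocessed(const struct iovec *vec, size_t iovcnt,
				       size_t done)
{
	size_t i;

	for (i = 0; i < iovcnt; i++) {
		/* Stop at the first element not covered by the byte count. */
		if (done < vec[i].iov_len)
			break;
		done -= vec[i].iov_len;
	}
	return i;
}
```

A caller could then retry from, or skip, the element at the returned index and reissue the call for the remaining elements.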
linux-stable-mirror@lists.linaro.org