On Sun, Jul 5, 2020 at 1:58 PM Greg KH <gregkh@linuxfoundation.org> wrote:
On Sun, Jul 05, 2020 at 06:09:03AM +0200, Jan Ziak wrote:
On Sun, Jul 5, 2020 at 5:27 AM Matthew Wilcox <willy@infradead.org> wrote:
On Sun, Jul 05, 2020 at 05:18:58AM +0200, Jan Ziak wrote:
On Sun, Jul 5, 2020 at 5:12 AM Matthew Wilcox <willy@infradead.org> wrote:
You should probably take a look at io_uring. That has the level of complexity of this proposal and supports open/read/close along with many other opcodes.
Then glibc can implement readfile using io_uring and there is no need for a new single-file readfile syscall.
It could, sure. But there's also a value in having a simple interface to accomplish a simple task. Your proposed API added a very complex interface to satisfy needs that clearly aren't part of the problem space that Greg is looking to address.
I believe that we should look at the single-file readfile syscall from a performance viewpoint. If an application is expecting to read a couple of small/medium-size files per second, then neither readfile nor readfiles makes sense in terms of improving performance. The benefits start to show up only when an application is expecting to read at least a hundred files per second. The "per second" part is important; it cannot be left out. Because readfile only improves performance for many-file reads, the syscall that applications performing many-file reads actually want is the multi-file version, not the single-file version.
It also is a measurable increase over reading just a single file. Here's my really really fast AMD system doing just one call to readfile vs. one call sequence to open/read/close:
$ ./readfile_speed -l 1
Running readfile test on file /sys/devices/system/cpu/vulnerabilities/meltdown for 1 loops...
Took 3410 ns
Running open/read/close test on file /sys/devices/system/cpu/vulnerabilities/meltdown for 1 loops...
Took 3780 ns
370ns isn't all that much, yes, but it is 370ns that could have been used for something else :)
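For anyone wanting to reproduce the open/read/close half of the measurement above, here is a minimal sketch in C. It is not the readfile_speed tool itself: the helper names time_open_read_close() and elapsed_ns() are made up for illustration, the 4096-byte buffer is an arbitrary assumption, and since readfile() only exists as a proposed patch in this thread, only the open/read/close sequence is timed.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>

/* Nanoseconds elapsed between two CLOCK_MONOTONIC samples. */
static int64_t elapsed_ns(const struct timespec *start,
                          const struct timespec *end)
{
    return (int64_t)(end->tv_sec - start->tv_sec) * 1000000000LL
         + (end->tv_nsec - start->tv_nsec);
}

/*
 * Hypothetical helper (not from readfile_speed): run `loops`
 * open/read/close sequences on `path` and return the total elapsed
 * time in nanoseconds, or -1 if any call fails.
 */
static int64_t time_open_read_close(const char *path, int loops)
{
    char buf[4096];                 /* assumed large enough for small sysfs files */
    struct timespec start, end;

    assert(path != NULL && loops > 0);

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1;

    for (int i = 0; i < loops; i++) {
        int fd = open(path, O_RDONLY);
        if (fd < 0)
            return -1;
        ssize_t n = read(fd, buf, sizeof(buf));
        close(fd);
        if (n < 0)
            return -1;
    }

    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1;

    return elapsed_ns(&start, &end);
}
```

Calling time_open_read_close("/sys/devices/system/cpu/vulnerabilities/meltdown", 1) would time a single sequence. The numbers will vary by machine, and with a loop count of 1 the cost of the two clock_gettime() calls is a visible part of the result.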
I am curious as to how you amortized or accounted for the fact that readfile() first needs to open the dirfd and then close it later.
From a performance viewpoint, only code that calls readfile() multiple times within a loop makes sense:

dirfd = open();
for (...) {
    readfile(dirfd, ...);
}
close(dirfd);
Look at the overhead of a syscall these days, using something like perf, to see just how bad things have gotten on Intel-based systems (the above was an AMD system, which doesn't suffer all the syscall slowdowns, only some).
I'm going to have to now dig up my old rpi to get the stats on that thing, as well as some Intel boxes to show the problem I'm trying to help out with here. I'll post that for the next round of this patch series.
I am not sure I understand why you think that a pointer to an array of readfile_t structures is very complex. If it were very complex, it would be a deep tree or a large graph.
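To make the shape of that interface concrete, here is a rough userspace sketch in C of a flat array of request records processed in one call. Everything in it is an assumption for illustration: struct readfile_req and its fields are guesses and not the readfile_t layout from the proposal, and readfiles_emulated() is a made-up name. It still issues one open/read/close per file, so it shows only the data structure, not the syscall savings a kernel-side readfiles() would aim for.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical request record; the fields are guesses, not the proposed readfile_t. */
struct readfile_req {
    const char *name;   /* path, resolved relative to the directory fd */
    void *buf;          /* caller-supplied destination buffer */
    size_t len;         /* capacity of buf in bytes */
    ssize_t result;     /* bytes read, or -1 if open or read failed */
};

/*
 * Userspace emulation of a multi-file read: walk the flat array and do
 * one openat/read/close per entry, all relative to `dirfd`.
 * Returns the number of entries that were read successfully.
 */
static size_t readfiles_emulated(int dirfd, struct readfile_req *reqs,
                                 size_t count)
{
    size_t ok = 0;

    assert(reqs != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        reqs[i].result = -1;

        int fd = openat(dirfd, reqs[i].name, O_RDONLY);
        if (fd < 0)
            continue;               /* leave result at -1 and move on */

        reqs[i].result = read(fd, reqs[i].buf, reqs[i].len);
        close(fd);

        if (reqs[i].result >= 0)
            ok++;
    }
    return ok;
}
```

The caller opens the directory once, fills in the array, makes a single call, and then inspects each entry's result field, so a failure on one file does not abort the rest of the batch.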
Of course you can make it more complex if you want, but look at the existing tools that currently do many open/read/close sequences. The APIs there don't lend themselves very well to knowing the larger list of files ahead of time. But I could be looking at the wrong thing. What userspace programs are you thinking of that could be easily converted to use something like this?
Perhaps passing multiple filenames to tools on the command line is a valid and quite general use case where it is known ahead of time that multiple files are going to be read, such as "gcc *.o", which is commonly used to link shared libraries and executables. However, with "gcc *.o" some of the object files are likely to be cached in memory and thus unlikely to need fetching from HDD/SSD. So the valid use case where we could see a speedup (if gcc were to use the multi-file readfiles() syscall) is when the programmer or Makefile invokes "gcc *.o" after rebuilding a small subset of the object files, and the object files that did not have to be rebuilt are stored on HDD/SSD. Basically, this means the first use of a project's Makefile on a particular day.