On 2/19/20 5:27 PM, Alexei Starovoitov wrote:
On Wed, Feb 19, 2020 at 03:59:41PM -0600, Daniel Díaz wrote:
When I download a specific kernel release, how can I know what LLVM git-hash or version I need (to use BPF-selftests)?
as discussed we're going to add documentation-like file that will list required commits in tools. This will be enforced for future llvm/pahole commits.
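Such a file does not exist at this point in the thread; a minimal sketch of what it might contain, with a purely hypothetical format and version values, plus the kind of check a CI script could do against it:

```shell
# Hypothetical format for the "required tool versions" file Alexei
# describes; the names and numbers below are illustrative only.
cat > /tmp/bpf_tool_versions <<'EOF'
# tool   minimum version / commit
llvm     10.0.0
pahole   1.16
EOF

# Read back the minimum pahole version a CI script would enforce.
required_pahole=$(awk '$1 == "pahole" { print $2 }' /tmp/bpf_tool_versions)
echo "required pahole: $required_pahole"
```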
Do you think it is reasonable to require end-users to compile their own bleeding edge version of LLVM, to use BPF-selftests?
absolutely.
+ linux-kselftest@vger.kernel.org
End-users in this context are users and not necessarily developers.
If a developer wants to send a patch they must run all selftests and all of them must pass in their environment. "but I'm adding a tracing feature and don't care about networking tests failing"... is not acceptable.
This is a reasonable expectation when a developer sends bpf patches.
I do hope that some end-users of BPF-selftests will be CI-systems. That also implies that CI-system maintainers need to constantly do "latest built from sources" of LLVM git-tree to keep up. Is that a reasonable requirement when buying a CI-system in the cloud?
"buying CI-system in the cloud"? If I could buy such a system I would pay for it out of my own pocket to save maintainers' and developers' time.
We [1] are end users of kselftests and many other test suites [2]. We run all of our testing on every git-push on linux-stable-rc, mainline, and linux-next -- approximately 1 million tests per week. We have a dedicated engineering team looking after this CI infrastructure and test results, and as such, I can wholeheartedly echo Jesper's sentiment here: We would really like to help kernel maintainers and developers by automatically testing their code on real hardware, but the BPF kselftests are difficult to work with from a CI perspective. We have caught and reported [3] many [4] build [5] failures [6] in the past for libbpf/Perf, but building is just one of the pieces. We are unable to run the entire BPF kselftests because only a part of the code builds, so our testing is very limited there.
We hope that this situation can be improved and that our and everyone else's automated testing can help you guys too. For this to work out, we need some help.
It would be helpful to understand what "help" means in this context.
I don't understand what kind of help you need. Just install the latest tools.
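"Install the latest tools" in practice means building LLVM from source. A rough sketch of the check a user or CI script might do first; the required major version here is an assumption for illustration, not a documented minimum:

```shell
required_major=10   # illustrative; the real minimum moves with bpf-next

# Extract the major version from `clang --version` output,
# e.g. "clang version 10.0.0 (...)" -> 10.
clang_major() {
    echo "$1" | sed -n 's/.*clang version \([0-9][0-9]*\).*/\1/p'
}

installed=$(clang --version 2>/dev/null | head -n 1)
major=$(clang_major "$installed")

if [ "${major:-0}" -lt "$required_major" ]; then
    # Too old (or no clang at all): build from the LLVM mono repo.
    echo "clang too old; build from source, roughly:"
    echo "  git clone https://github.com/llvm/llvm-project.git"
    echo "  cmake -G Ninja -DLLVM_ENABLE_PROJECTS=clang \\"
    echo "        -DLLVM_TARGETS_TO_BUILD='BPF;X86' llvm-project/llvm"
    echo "  ninja"
fi
```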
What would be helpful is to write bpf tests such that older tests that worked on older llvm versions continue to work, with some indication of which tests require new bleeding-edge tools.
Both the latest llvm and the latest pahole are required.
It would be helpful if you can elaborate why latest tools are a requirement.
If by 'help' you mean to tweak selftests to skip tests then it's a nack. We have human-driven CI. Every developer must run selftests/bpf before emailing the patches. Daniel and I run them as well before applying. These manual runs are the only thing that keeps the bpf tree going. If selftests get to skip tests, humans will miss those errors. When I don't see '0 SKIPPED, 0 FAILED' I go and investigate. Anything but zero is a path to broken kernels.
Imagine the tests would get skipped when pahole is too old. That would mean all of the kernel features from the year 2019 would get skipped. Is there any point in running such selftests? I think the value is not just zero. The value is negative. Such selftests that run old stuff would give the false belief that they do something meaningful. "but CI can do build-only tests"... If 'helping' such CI means hurting the key developer/maintainer workflow, such CI is on its own.
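The '0 SKIPPED, 0 FAILED' rule Alexei describes is easy to automate. A small sketch, assuming the summary line looks like `Summary: 123 PASSED, 0 SKIPPED, 0 FAILED`; the exact format may differ across versions of test_progs:

```shell
# Return OK only when both the SKIPPED and FAILED counters are zero;
# anything else is a signal to go and investigate.
check_summary() {
    skipped=$(echo "$1" | sed -n 's/.*[^0-9]\([0-9][0-9]*\) SKIPPED.*/\1/p')
    failed=$(echo "$1" | sed -n 's/.*[^0-9]\([0-9][0-9]*\) FAILED.*/\1/p')
    if [ "$skipped" = "0" ] && [ "$failed" = "0" ]; then
        echo OK
    else
        echo INVESTIGATE
    fi
}

check_summary "Summary: 123 PASSED, 0 SKIPPED, 0 FAILED"    # -> OK
check_summary "Summary: 120 PASSED, 2 SKIPPED, 1 FAILED"    # -> INVESTIGATE
```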
Skipping tests will be useless. I am with you on that. However, figuring out how to maintain some level of backward compatibility to run at least older tests and warn users to upgrade would be helpful.
I suspect currently users are ignoring bpf failures because they are unable to keep up with the requirement to install newer tools to run the tests. This isn't great either.
Users that care are sharing their pain to see if they can get some help or explanation on why new tools are required every so often. I don't think everybody understands why. :)
thanks, -- Shuah
On Wed, 19 Feb 2020 17:47:23 -0700 shuah shuah@kernel.org wrote:
On 2/19/20 5:27 PM, Alexei Starovoitov wrote:
On Wed, Feb 19, 2020 at 03:59:41PM -0600, Daniel Díaz wrote:
When I download a specific kernel release, how can I know what LLVM git-hash or version I need (to use BPF-selftests)?
as discussed we're going to add documentation-like file that will list required commits in tools. This will be enforced for future llvm/pahole commits.
Do you think it is reasonable to require end-users to compile their own bleeding edge version of LLVM, to use BPF-selftests?
absolutely.
- linux-kselftest@vger.kernel.org
End-users in this context are users and not necessarily developers.
I agree. And I worry that we are making it increasingly hard for non-developer users.
If a developer wants to send a patch they must run all selftests and all of them must pass in their environment. "but I'm adding a tracing feature and don't care about networking tests failing"... is not acceptable.
This is a reasonable expectation when a developer sends bpf patches.
Sure. I have several versions of LLVM that I've compiled manually.
I do hope that some end-users of BPF-selftests will be CI-systems. That also implies that CI-system maintainers need to constantly do "latest built from sources" of LLVM git-tree to keep up. Is that a reasonable requirement when buying a CI-system in the cloud?
"buying CI-system in the cloud"? If I could buy such a system I would pay for it out of my own pocket to save maintainers' and developers' time.
And Daniel Díaz wants to offer his help below (to test it on architectures that you likely don't even have access to). That sounds like a good offer, and you don't even have to pay.
We [1] are end users of kselftests and many other test suites [2]. We run all of our testing on every git-push on linux-stable-rc, mainline, and linux-next -- approximately 1 million tests per week. We have a dedicated engineering team looking after this CI infrastructure and test results, and as such, I can wholeheartedly echo Jesper's sentiment here: We would really like to help kernel maintainers and developers by automatically testing their code on real hardware, but the BPF kselftests are difficult to work with from a CI perspective. We have caught and reported [3] many [4] build [5] failures [6] in the past for libbpf/Perf, but building is just one of the pieces. We are unable to run the entire BPF kselftests because only a part of the code builds, so our testing is very limited there.
We hope that this situation can be improved and that our and everyone else's automated testing can help you guys too. For this to work out, we need some help.
It would be helpful to understand what "help" means in this context.
I don't understand what kind of help you need. Just install the latest tools.
I admire that you want to push *everybody* forward to use the latest LLVM, but saying that "latest" means the HEAD of the LLVM development git tree is too extreme. I can support requiring the latest LLVM release.
As soon as your LLVM patches are accepted into the LLVM git tree, you will add BPF selftests that utilize them. Then, when CI systems pull the latest bpf-next, they will start failing to compile the BPF selftests, and CI stops. Now you want to force the CI-system maintainer to recompile LLVM from git. This will likely take some time, and until that happens the CI system doesn't catch anything. E.g. I really want the ARM tests that Linaro can run for us (which aren't run before you apply patches...).
What would be helpful is to write bpf tests such that older tests that worked on older llvm versions continue to work, with some indication of which tests require new bleeding-edge tools.
Both the latest llvm and the latest pahole are required.
It would be helpful if you can elaborate why latest tools are a requirement.
If by 'help' you mean to tweak selftests to skip tests then it's a nack. We have human-driven CI. Every developer must run selftests/bpf before emailing the patches. Daniel and I run them as well before applying. These manual runs are the only thing that keeps the bpf tree going. If selftests get to skip tests, humans will miss those errors. When I don't see '0 SKIPPED, 0 FAILED' I go and investigate. Anything but zero is a path to broken kernels.
Imagine the tests would get skipped when pahole is too old. That would mean all of the kernel features from the year 2019 would get skipped. Is there any point in running such selftests? I think the value is not just zero. The value is negative. Such selftests that run old stuff would give the false belief that they do something meaningful. "but CI can do build-only tests"... If 'helping' such CI means hurting the key developer/maintainer workflow, such CI is on its own.
Skipping tests will be useless. I am with you on that. However, figuring out how to maintain some level of backward compatibility to run at least older tests and warn users to upgrade would be helpful.
What I propose is that a BPF selftest that uses a new LLVM feature should return FAIL (or perhaps SKIP) when it is compiled with, say, an LLVM that is one release old. This will allow new tests to show up in CI-system reports as FAIL, and give everybody breathing room to upgrade their LLVM compiler.
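A sketch of how Jesper's proposal could work at build time; the test name, the version threshold, and the reporting format are all hypothetical:

```shell
needed_major=11   # hypothetical clang version that introduced the feature

# Instead of failing the whole build, a test that needs a newer LLVM
# would be compiled into a stub that reports FAIL, so it still shows
# up in CI reports and prompts an upgrade.
build_test() {
    test_name=$1
    clang_ver=$2
    if [ "$clang_ver" -lt "$needed_major" ]; then
        echo "FAIL $test_name (needs clang >= $needed_major)"
    else
        echo "BUILD $test_name"
    fi
}

build_test test_btf_newfeature 10   # -> FAIL ... (needs clang >= 11)
build_test test_btf_newfeature 12   # -> BUILD test_btf_newfeature
```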
I suspect currently users are ignoring bpf failures because they are unable to keep up with the requirement to install newer tools to run the tests. This isn't great either.
Yes, my worry is also that we are simply making it too difficult for non-developer users to run these tests. I specifically want to attract CI systems to run them, especially Linaro, who have a dedicated engineering team looking after their CI infrastructure and who explicitly confirm my worry in this email.
Users that care are sharing their pain to see if they can get some help or explanation on why new tools are required every so often. I don't think everybody understands why. :)
-----Original Message----- From: Jesper Dangaard Brouer
On Wed, 19 Feb 2020 17:47:23 -0700 shuah shuah@kernel.org wrote:
On 2/19/20 5:27 PM, Alexei Starovoitov wrote:
On Wed, Feb 19, 2020 at 03:59:41PM -0600, Daniel Díaz wrote:
When I download a specific kernel release, how can I know what LLVM git-hash or version I need (to use BPF-selftests)?
as discussed we're going to add documentation-like file that will list required commits in tools. This will be enforced for future llvm/pahole commits.
Do you think it is reasonable to require end-users to compile their own bleeding edge version of LLVM, to use BPF-selftests?
absolutely.
Is it just the BPF selftests that require the bleeding-edge version of LLVM, or do BPF features themselves need the latest LLVM? If the latter, then this is quite worrisome, and I fear the BPF developers are getting ahead of themselves. We don't usually have a kernel dependency on the latest compiler version (some recent security fixes are an anomaly). In fact, deprecating support for older compiler versions has been quite slow and methodical over the years.
It's quite dangerous to be baking stuff into the kernel that depends on features from compilers that haven't even made it to release yet.
I'm sorry, but I'm coming into the middle of this thread. Can you please explain what the features are in the latest LLVM that are required for BPF-selftests? -- Tim
-- Best regards, Jesper Dangaard Brouer MSc.CS, Principal Kernel Engineer at Red Hat LinkedIn: http://www.linkedin.com/in/brouer
On Thu, Feb 20, 2020 at 05:02:25PM +0000, Bird, Tim wrote:
Is it just the BPF selftests that require the bleeding-edge version of LLVM, or do BPF features themselves need the latest LLVM? If the latter, then this is quite worrisome, and I fear the BPF developers are getting ahead of themselves. We don't usually have a kernel dependency on the latest compiler version (some recent security fixes are an anomaly). In fact, deprecating support for older compiler versions has been quite slow and methodical over the years.
It's quite dangerous to be baking stuff into the kernel that depends on features from compilers that haven't even made it to release yet.
I'm sorry, but I'm coming into the middle of this thread. Can you please explain what the features are in the latest LLVM that are required for BPF-selftests?
Above is correct. bpf kernel features do depend on the latest pahole and llvm features that did not make it into a release. That has been the case for many years now and is still the case. The first commit 8 years ago relied on a compiler that could generate those instructions. For many years llvm was the only compiler that could generate them. Right now there is a GCC backend as well. New features (like new instructions) depend on the compiler.
selftests/bpf are not testing kernel's bpf features. They are testing the whole bpf ecosystem. They test llvm, pahole, libbpf, bpftool, and kernel together. Hence it's a requirement to install the latest pahole and llvm.
When I'm talking about selftests/bpf I'm talking about all the tests in that directory combined. There are several unit tests scattered across repos. The unit tests for llvm bpf backend are inside llvm repo. selftests/bpf/test_verifier and test_maps are unit tests for the verifier and for maps. They are llvm independent. They test a combination of kernel and libbpf only. But majority of the selftests/bpf are done via test_progs which are the whole ecosystem tests.
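Based on that breakdown, a CI system that cannot track LLVM HEAD could at least run the LLVM-independent binaries. A sketch, where the classification simply follows Alexei's description above:

```shell
# test_verifier and test_maps exercise only the kernel and libbpf;
# test_progs needs programs built with the latest clang and pahole.
llvm_independent() {
    case "$1" in
        test_verifier|test_maps) echo yes ;;
        *) echo no ;;
    esac
}

for t in test_verifier test_maps test_progs; do
    echo "$t: llvm-independent=$(llvm_independent "$t")"
done
```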
-----Original Message----- From: Alexei Starovoitov alexei.starovoitov@gmail.com
Alexei,
Thank you very much for this explanation. It is very helpful. I apologize for my ignorance of this, but can I ask a few questions just to check my understanding? Please forgive me if I use the wrong terminology below.
So, do the BPF developers add new instructions to the virtual machine that then have to be added to both the compiler and the executor (VM implementation)? It sounds like the compiler support and executor support are developed in concert, and that compiler-side patches are at least accepted upstream (but possibly not yet available in a compiler release). What about the Linux kernel side? Is support for a new instruction only in non-released kernels (say, in the BPF development tree), or could it potentially be included in a released kernel before the compiler with matching support is released? What would happen if a bug was found and compiler support for the instruction was delayed? I suppose this would only mean that the executor supported an instruction that never appeared in a compiled BPF program? Is that right?
Thanks, -- Tim
On Thu, Feb 20, 2020 at 05:41:51PM +0000, Bird, Tim wrote:
So - do the BPF developers add new instructions to the virtual machine, that then have to be added to both the compiler and the executor (VM implementation)?
Right. New instructions are added to the kernel and llvm at the same time. The kernel and llvm release cadences and processes are different, which complicates it for us.
It sounds like the compiler support and executor support are developed in concert, and that compiler-side patches are at least accepted upstream (but possibly not yet available in a compiler release). What about the Linux kernel side? Is support for a new instruction only in non-released kernels (say, in the BPF development tree), or could it potentially be included in a released kernel before the compiler with matching support is released? What would happen if a bug was found and compiler support for the instruction was delayed?
As with all chicken-and-egg problems the feature has to land in one of the repos first. That was one of the reasons the llvm community switched to a mono repo: to avoid clang-vs-llvm conflicts. The kernel and llvm are not going to be in a single repo, so we have to orchestrate the landing. Most of the time it's easy, because we maintain both the kernel and llvm components. But in some cases it's very difficult. For example, we've delayed landing kernel and libbpf patches by about six months because we couldn't get an agreement on how the feature had to be implemented in clang.
I suppose that this would only mean that the executor supported an instruction that never appeared in a compiled BPF program? Is that right?
The answer is yes. It is the case that the kernel supports certain bpf instructions but llvm doesn't know how to emit them. But that has nothing to do with the landing of features or the release cadence.