A common recurring mistake made when backporting patches to stable is forgetting to check for additional commits tagged with `Fixes:`. This script validates that local commits have a `commit <sha40> upstream.` line in their commit message, and whether any additional `Fixes:` shas exist in the `master` branch but were not included. It can not know about fixes yet to be discovered, or fixes sent to the mailing list but not yet in mainline.
To save time, it avoids checking all of `master`, stopping early once we've reached the commit time of the earliest backport. It takes 0.5s to validate 2 patches to linux-5.4.y when master is v5.12-rc3 and 5s to validate 27 patches to linux-4.19.y. It does not recheck dependencies of found fixes; the user is expected to run this script to a fixed point. It depnds on pygit2 python library for working with git, which can be installed via: $ pip3 install pygit2
It's expected to be run from a stable tree with commits applied. For example, consider 3cce9d44321e which is a fix for f77ac2e378be. Let's say I cherry picked f77ac2e378be into linux-5.4.y but forgot 3cce9d44321e (true story). If I ran:
$ ./scripts/stable/check_backports.py Checking 1 local commits for additional Fixes: in master Please consider backporting 3cce9d44321e as a fix for f77ac2e378be
So then I could cherry pick 3cce9d44321e as well: $ git cherry-pick -sx 3cce9d44321e $ ./scripts/stable/check_backports.py ... Exception: Missing 'commit <sha40> upstream.' line
Oops, let me fixup the commit message and retry. $ git commit --amend <fix commit message> $ ./scripts/stable/check_backports.py Checking 2 local commits for additional Fixes: in master $ echo $? 0
This allows for client side validation by the backports author, and server side validation by the stable kernel maintainers.
Signed-off-by: Nick Desaulniers ndesaulniers@google.com --- MAINTAINERS | 1 + scripts/stable/check_backports.py | 92 +++++++++++++++++++++++++++++++ 2 files changed, 93 insertions(+) create mode 100755 scripts/stable/check_backports.py
diff --git a/MAINTAINERS b/MAINTAINERS index aa84121c5611..a8639e9277c4 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -16960,6 +16960,7 @@ M: Sasha Levin sashal@kernel.org L: stable@vger.kernel.org S: Supported F: Documentation/process/stable-kernel-rules.rst +F: scripts/stable/
STAGING - ATOMISP DRIVER M: Mauro Carvalho Chehab mchehab@kernel.org diff --git a/scripts/stable/check_backports.py b/scripts/stable/check_backports.py new file mode 100755 index 000000000000..529294e247ca --- /dev/null +++ b/scripts/stable/check_backports.py @@ -0,0 +1,92 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) 2021 Google, Inc. + +import os +import re +import sys + +import pygit2 as pg + + +def get_head_branch(repo): + # Walk the branches to find which is HEAD. + for branch_name in repo.branches: + branch = repo.branches[branch_name] + if branch.is_head(): + return branch + + +def get_local_commits(repo): + head_branch = get_head_branch(repo) + # Walk the HEAD ref until we hit the first commit from the upstream. + walker = repo.walk(repo.head.target) + upstream_branch = head_branch.upstream + upstream_commit, _ = repo.resolve_refish(upstream_branch.name) + walker.hide(upstream_commit.id) + commits = [commit for commit in walker] + if not len(commits): + raise Exception("No local commits") + return commits + + +def get_upstream_shas(commits): + upstream_shas = [] + prog = re.compile('commit ([0-9a-f]{40}) upstream.') + # For each line of each commit message, record the + # "commit <sha40> upstream." line. + for commit in commits: + found_upstream_line = False + for line in commit.message.splitlines(): + result = prog.search(line) + if result: + upstream_shas.append(result.group(1)[:12]) + found_upstream_line = True + break + if not found_upstream_line: + raise Exception("Missing 'commit <sha40> upstream.' line") + return upstream_shas + + +def get_oldest_commit_time(repo, shas): + commit_times = [repo.resolve_refish(sha)[0].commit_time for sha in shas] + return sorted(commit_times)[0] + + +def get_fixes_for(shas): + shas = set(shas) + prog = re.compile("Fixes: ([0-9a-f]{12,40})") + # Walk commits in the master branch. + master_commit, master_ref = repo.resolve_refish("master") + walker = repo.walk(master_ref.target) + oldest_commit_time = get_oldest_commit_time(repo, shas) + fixes = [] + for commit in walker: + # It's not possible for a Fixes: to be committed before a fixed tag, so + # don't iterate all of git history. + if commit.commit_time < oldest_commit_time: + break + for line in reversed(commit.message.splitlines()): + result = prog.search(line) + if not result: + continue + fixes_sha = result.group(1)[:12] + if fixes_sha in shas and commit.id.hex[:12] not in shas: + fixes.append((commit.id.hex[:12], fixes_sha)) + return fixes + + +def report(fixes): + if len(fixes): + for fix, broke in fixes: + print("Please consider backporting %s as a fix for %s" % (fix, broke)) + sys.exit(1) + + +if __name__ == "__main__": + repo = pg.Repository(os.getcwd()) + commits = get_local_commits(repo) + print("Checking %d local commits for additional Fixes: in master" % (len(commits))) + upstream_shas = get_upstream_shas(commits) + fixes = get_fixes_for(upstream_shas) + report(fixes)
On Tue, Mar 16, 2021 at 02:31:33PM -0700, Nick Desaulniers wrote:
A common recurring mistake made when backporting patches to stable is forgetting to check for additional commits tagged with `Fixes:`. This script validates that local commits have a `commit <sha40> upstream.` line in their commit message, and whether any additional `Fixes:` shas exist in the `master` branch but were not included. It can not know about fixes yet to be discovered, or fixes sent to the mailing list but not yet in mainline.
To save time, it avoids checking all of `master`, stopping early once we've reached the commit time of the earliest backport. It takes 0.5s to validate 2 patches to linux-5.4.y when master is v5.12-rc3 and 5s to validate 27 patches to linux-4.19.y. It does not recheck dependencies of found fixes; the user is expected to run this script to a fixed point. It depnds on pygit2 python library for working with git, which can be installed via: $ pip3 install pygit2
It's expected to be run from a stable tree with commits applied. For example, consider 3cce9d44321e which is a fix for f77ac2e378be. Let's say I cherry picked f77ac2e378be into linux-5.4.y but forgot 3cce9d44321e (true story). If I ran:
$ ./scripts/stable/check_backports.py Checking 1 local commits for additional Fixes: in master Please consider backporting 3cce9d44321e as a fix for f77ac2e378be
While interesting, I don't use a git tree for the stable queue, so this doesn't really fit into my workflow, sorry.
And we do have other "stable tree helper" scripts in the stable-queue.git repo, perhaps that's a better place for this than the main kernel repo?
thanks,
greg k-h
On Tue, Mar 23, 2021 at 6:56 AM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Tue, Mar 16, 2021 at 02:31:33PM -0700, Nick Desaulniers wrote:
A common recurring mistake made when backporting patches to stable is forgetting to check for additional commits tagged with `Fixes:`. This script validates that local commits have a `commit <sha40> upstream.` line in their commit message, and whether any additional `Fixes:` shas exist in the `master` branch but were not included. It can not know about fixes yet to be discovered, or fixes sent to the mailing list but not yet in mainline.
To save time, it avoids checking all of `master`, stopping early once we've reached the commit time of the earliest backport. It takes 0.5s to validate 2 patches to linux-5.4.y when master is v5.12-rc3 and 5s to validate 27 patches to linux-4.19.y. It does not recheck dependencies of found fixes; the user is expected to run this script to a fixed point. It depnds on pygit2 python library for working with git, which can be installed via: $ pip3 install pygit2
It's expected to be run from a stable tree with commits applied. For example, consider 3cce9d44321e which is a fix for f77ac2e378be. Let's say I cherry picked f77ac2e378be into linux-5.4.y but forgot 3cce9d44321e (true story). If I ran:
$ ./scripts/stable/check_backports.py Checking 1 local commits for additional Fixes: in master Please consider backporting 3cce9d44321e as a fix for f77ac2e378be
While interesting, I don't use a git tree for the stable queue, so this doesn't really fit into my workflow, sorry.
Well, what is your workflow?
And we do have other "stable tree helper" scripts in the stable-queue.git repo, perhaps that's a better place for this than the main kernel repo?
Sure, here it is moved over to there. Let me know if there's a preferred way to send it.
On Tue, Mar 23, 2021 at 11:52:26AM -0700, Nick Desaulniers wrote:
On Tue, Mar 23, 2021 at 6:56 AM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Tue, Mar 16, 2021 at 02:31:33PM -0700, Nick Desaulniers wrote:
A common recurring mistake made when backporting patches to stable is forgetting to check for additional commits tagged with `Fixes:`. This script validates that local commits have a `commit <sha40> upstream.` line in their commit message, and whether any additional `Fixes:` shas exist in the `master` branch but were not included. It can not know about fixes yet to be discovered, or fixes sent to the mailing list but not yet in mainline.
To save time, it avoids checking all of `master`, stopping early once we've reached the commit time of the earliest backport. It takes 0.5s to validate 2 patches to linux-5.4.y when master is v5.12-rc3 and 5s to validate 27 patches to linux-4.19.y. It does not recheck dependencies of found fixes; the user is expected to run this script to a fixed point. It depnds on pygit2 python library for working with git, which can be installed via: $ pip3 install pygit2
It's expected to be run from a stable tree with commits applied. For example, consider 3cce9d44321e which is a fix for f77ac2e378be. Let's say I cherry picked f77ac2e378be into linux-5.4.y but forgot 3cce9d44321e (true story). If I ran:
$ ./scripts/stable/check_backports.py Checking 1 local commits for additional Fixes: in master Please consider backporting 3cce9d44321e as a fix for f77ac2e378be
While interesting, I don't use a git tree for the stable queue, so this doesn't really fit into my workflow, sorry.
Well, what is your workflow?
Look at the stable-queue.git tree. It's a set of quilt-managed patches on top of a solid base (i.e. the last released kernel version.).
The only time git gets involved is when we do a -rc release or when we do a "real" release, and then we use 'git quiltimport' on the whole stack.
Here's a script that I use (much too slow, I know), for checking this type of thing and I try to remember to run it before every cycle of -rc releases: https://github.com/gregkh/commit_tree/blob/master/find_fixes_in_queue
It's a hack, and picks up more things than is really needed, but I would rather it error on that side than the other.
thanks,
greg k-h
On Tue, Mar 23, 2021 at 12:05 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
The only time git gets involved is when we do a -rc release or when we do a "real" release, and then we use 'git quiltimport' on the whole stack.
Here's a script that I use (much too slow, I know), for checking this type of thing and I try to remember to run it before every cycle of -rc releases: https://github.com/gregkh/commit_tree/blob/master/find_fixes_in_queue
It's a hack, and picks up more things than is really needed, but I would rather it error on that side than the other.
Yes, my script is similar. Looks like yours also runs on a git tree.
I noticed that id_fixed_in runs `git grep -l --threads=3 <sha>` to find fixes; that's neat, I didn't know about `--threads=`. I tried it with ae46578b963f manually:
$ git grep -l --threads=3 ae46578b963f $
Should it have found a7889c6320b9 and 773e0c402534? Perhaps `git log --grep=<sha>` should be used instead? I thought `git grep` only greps files in the archive, not commit history?
On Tue, Mar 23, 2021 at 01:28:38PM -0700, Nick Desaulniers wrote:
On Tue, Mar 23, 2021 at 12:05 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
The only time git gets involved is when we do a -rc release or when we do a "real" release, and then we use 'git quiltimport' on the whole stack.
Here's a script that I use (much too slow, I know), for checking this type of thing and I try to remember to run it before every cycle of -rc releases: https://github.com/gregkh/commit_tree/blob/master/find_fixes_in_queue
It's a hack, and picks up more things than is really needed, but I would rather it error on that side than the other.
Yes, my script is similar. Looks like yours also runs on a git tree.
I noticed that id_fixed_in runs `git grep -l --threads=3 <sha>` to find fixes; that's neat, I didn't know about `--threads=`. I tried it with ae46578b963f manually:
$ git grep -l --threads=3 ae46578b963f $
Should it have found a7889c6320b9 and 773e0c402534? Perhaps `git log --grep=<sha>` should be used instead? I thought `git grep` only greps files in the archive, not commit history?
Yes, it does only grep the files in the archive.
But look closer at the archive that this script lives in :)
This archive is a "blown up" copy of the Linux kernel tree, with one file per commit. The name of the file is the commit id, and the content of the file is the changelog of the commit itself.
So it's a hack that I use to be able to simply search the changelogs of all commits to find out if they have a "Fixes:" tag with a specific commit id in it.
So in your example above, in the repo I run it and get:
~/linux/stable/commit_tree $ git grep -l --threads=3 ae46578b963f changes/5.2/773e0c40253443e0ce5491cb0e414b62f7cc45ed ids/5.2
Which shows me that in commit 773e0c402534 ("afs: Fix afs_xattr_get_yfs() to not try freeing an error value") in the kernel tree, it has a "Fixes:" tag that references "ae46578b963f".
It also shows me that commit ae46578b963f was contained in the 5.2 kernel release, as I use the "ids/" subdirectory here for other fast lookups (it's a tiny bit faster than 'git describe --contains').
I don't know how your script is walking through all possible commits to see if they are fixing a specific one, maybe I should look and see if it's doing it better than my "git tree/directory as a database hack" does :)
thanks,
greg k-h
On Wed, Mar 24, 2021 at 10:55:27AM +0100, Greg Kroah-Hartman wrote:
On Tue, Mar 23, 2021 at 01:28:38PM -0700, Nick Desaulniers wrote:
On Tue, Mar 23, 2021 at 12:05 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
The only time git gets involved is when we do a -rc release or when we do a "real" release, and then we use 'git quiltimport' on the whole stack.
Here's a script that I use (much too slow, I know), for checking this type of thing and I try to remember to run it before every cycle of -rc releases: https://github.com/gregkh/commit_tree/blob/master/find_fixes_in_queue
It's a hack, and picks up more things than is really needed, but I would rather it error on that side than the other.
Yes, my script is similar. Looks like yours also runs on a git tree.
I noticed that id_fixed_in runs `git grep -l --threads=3 <sha>` to find fixes; that's neat, I didn't know about `--threads=`. I tried it with ae46578b963f manually:
$ git grep -l --threads=3 ae46578b963f $
Should it have found a7889c6320b9 and 773e0c402534? Perhaps `git log --grep=<sha>` should be used instead? I thought `git grep` only greps files in the archive, not commit history?
Yes, it does only grep the files in the archive.
But look closer at the archive that this script lives in :)
This archive is a "blown up" copy of the Linux kernel tree, with one file per commit. The name of the file is the commit id, and the content of the file is the changelog of the commit itself.
So it's a hack that I use to be able to simply search the changelogs of all commits to find out if they have a "Fixes:" tag with a specific commit id in it.
So in your example above, in the repo I run it and get:
~/linux/stable/commit_tree $ git grep -l --threads=3 ae46578b963f changes/5.2/773e0c40253443e0ce5491cb0e414b62f7cc45ed ids/5.2
Which shows me that in commit 773e0c402534 ("afs: Fix afs_xattr_get_yfs() to not try freeing an error value") in the kernel tree, it has a "Fixes:" tag that references "ae46578b963f".
It also shows me that commit ae46578b963f was contained in the 5.2 kernel release, as I use the "ids/" subdirectory here for other fast lookups (it's a tiny bit faster than 'git describe --contains').
I don't know how your script is walking through all possible commits to see if they are fixing a specific one, maybe I should look and see if it's doing it better than my "git tree/directory as a database hack" does :)
FWIW,
I had a need for something similar and found `git rev-list --grep` provided fastest results. Does not provide for the "ids/" hack though...
❯ N="ae46578b963f"; git rev-list --grep="${N}" "${N}..upstream/master" | while read -r hid ; do git log -n1 "${hid}" | grep -F "${N}" | sed "s#^#${hid} #"; done a7889c6320b9200e3fe415238f546db677310fa9 Fixes: ae46578b963f ("afs: Get YFS ACLs and information through xattrs") 773e0c40253443e0ce5491cb0e414b62f7cc45ed Fixes: ae46578b963f ("afs: Get YFS ACLs and information through xattrs")
❯ N="a7889c6320b9"; git rev-list --grep="${N}" "${N}..stable/linux-5.4.y" | while read -r hid ; do git log -n1 "${hid}" | grep -F "${N}" | sed "s#^#${hid} #"; done 6712b7fcef9d1092e99733645cf52cfb3d482555 commit a7889c6320b9200e3fe415238f546db677310fa9 upstream.
❯ N="ae46578b963f"; git rev-list --grep="${N}" "${N}..stable/linux-5.4.y" | while read -r hid ; do git log -n1 "${hid}" | grep -F "${N}" | sed "s#^#${hid} #"; done 6712b7fcef9d1092e99733645cf52cfb3d482555 Fixes: ae46578b963f ("afs: Get YFS ACLs and information through xattrs") 773e0c40253443e0ce5491cb0e414b62f7cc45ed Fixes: ae46578b963f ("afs: Get YFS ACLs and information through xattrs")
thanks,
greg k-h
On Fri, Mar 26, 2021 at 03:03:35PM -0600, Tom Saeger wrote:
On Wed, Mar 24, 2021 at 10:55:27AM +0100, Greg Kroah-Hartman wrote:
On Tue, Mar 23, 2021 at 01:28:38PM -0700, Nick Desaulniers wrote:
On Tue, Mar 23, 2021 at 12:05 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
The only time git gets involved is when we do a -rc release or when we do a "real" release, and then we use 'git quiltimport' on the whole stack.
Here's a script that I use (much too slow, I know), for checking this type of thing and I try to remember to run it before every cycle of -rc releases: https://github.com/gregkh/commit_tree/blob/master/find_fixes_in_queue
It's a hack, and picks up more things than is really needed, but I would rather it error on that side than the other.
Yes, my script is similar. Looks like yours also runs on a git tree.
I noticed that id_fixed_in runs `git grep -l --threads=3 <sha>` to find fixes; that's neat, I didn't know about `--threads=`. I tried it with ae46578b963f manually:
$ git grep -l --threads=3 ae46578b963f $
Should it have found a7889c6320b9 and 773e0c402534? Perhaps `git log --grep=<sha>` should be used instead? I thought `git grep` only greps files in the archive, not commit history?
Yes, it does only grep the files in the archive.
But look closer at the archive that this script lives in :)
This archive is a "blown up" copy of the Linux kernel tree, with one file per commit. The name of the file is the commit id, and the content of the file is the changelog of the commit itself.
So it's a hack that I use to be able to simply search the changelogs of all commits to find out if they have a "Fixes:" tag with a specific commit id in it.
So in your example above, in the repo I run it and get:
~/linux/stable/commit_tree $ git grep -l --threads=3 ae46578b963f changes/5.2/773e0c40253443e0ce5491cb0e414b62f7cc45ed ids/5.2
Which shows me that in commit 773e0c402534 ("afs: Fix afs_xattr_get_yfs() to not try freeing an error value") in the kernel tree, it has a "Fixes:" tag that references "ae46578b963f".
It also shows me that commit ae46578b963f was contained in the 5.2 kernel release, as I use the "ids/" subdirectory here for other fast lookups (it's a tiny bit faster than 'git describe --contains').
I don't know how your script is walking through all possible commits to see if they are fixing a specific one, maybe I should look and see if it's doing it better than my "git tree/directory as a database hack" does :)
FWIW,
I had a need for something similar and found `git rev-list --grep` provided fastest results. Does not provide for the "ids/" hack though...
❯ N="ae46578b963f"; git rev-list --grep="${N}" "${N}..upstream/master" | while read -r hid ; do git log -n1 "${hid}" | grep -F "${N}" | sed "s#^#${hid} #"; done a7889c6320b9200e3fe415238f546db677310fa9 Fixes: ae46578b963f ("afs: Get YFS ACLs and information through xattrs") 773e0c40253443e0ce5491cb0e414b62f7cc45ed Fixes: ae46578b963f ("afs: Get YFS ACLs and information through xattrs")
❯ N="a7889c6320b9"; git rev-list --grep="${N}" "${N}..stable/linux-5.4.y" | while read -r hid ; do git log -n1 "${hid}" | grep -F "${N}" | sed "s#^#${hid} #"; done 6712b7fcef9d1092e99733645cf52cfb3d482555 commit a7889c6320b9200e3fe415238f546db677310fa9 upstream.
❯ N="ae46578b963f"; git rev-list --grep="${N}" "${N}..stable/linux-5.4.y" | while read -r hid ; do git log -n1 "${hid}" | grep -F "${N}" | sed "s#^#${hid} #"; done 6712b7fcef9d1092e99733645cf52cfb3d482555 Fixes: ae46578b963f ("afs: Get YFS ACLs and information through xattrs") 773e0c40253443e0ce5491cb0e414b62f7cc45ed Fixes: ae46578b963f ("afs: Get YFS ACLs and information through xattrs")
Ah, I did not know about 'git rev-list --grep' thanks! I'll play around with that to see if it actually is any faster than my implementation...
thanks,
greg k-h
On Tue, Mar 23, 2021 at 11:52:26AM -0700, Nick Desaulniers wrote:
On Tue, Mar 23, 2021 at 6:56 AM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Tue, Mar 16, 2021 at 02:31:33PM -0700, Nick Desaulniers wrote:
A common recurring mistake made when backporting patches to stable is forgetting to check for additional commits tagged with `Fixes:`. This script validates that local commits have a `commit <sha40> upstream.` line in their commit message, and whether any additional `Fixes:` shas exist in the `master` branch but were not included. It can not know about fixes yet to be discovered, or fixes sent to the mailing list but not yet in mainline.
To save time, it avoids checking all of `master`, stopping early once we've reached the commit time of the earliest backport. It takes 0.5s to validate 2 patches to linux-5.4.y when master is v5.12-rc3 and 5s to validate 27 patches to linux-4.19.y. It does not recheck dependencies of found fixes; the user is expected to run this script to a fixed point. It depnds on pygit2 python library for working with git, which can be installed via: $ pip3 install pygit2
It's expected to be run from a stable tree with commits applied. For example, consider 3cce9d44321e which is a fix for f77ac2e378be. Let's say I cherry picked f77ac2e378be into linux-5.4.y but forgot 3cce9d44321e (true story). If I ran:
$ ./scripts/stable/check_backports.py Checking 1 local commits for additional Fixes: in master Please consider backporting 3cce9d44321e as a fix for f77ac2e378be
While interesting, I don't use a git tree for the stable queue, so this doesn't really fit into my workflow, sorry.
Well, what is your workflow?
That's a trick question :) I don't think something like this should target our workflow, but rather should be for someone who wants to send patches over to stable@.
I also think that the formatting patch shouldn't be checking for proper formatting, but rather should just be doing it on it's own.
What I don't know is the right place to put it in... It can go into stable-queue.git, but there are very few people who are aware of it's existance, and even a smaller number who knows how it works.
linux-stable-mirror@lists.linaro.org