On Tue, Oct 10, 2017 at 10:31 AM, Julia Lawall julia.lawall@lip6.fr wrote:
On Tue, 10 Oct 2017, Levin, Alexander (Sasha Levin) wrote:
(Cc'ed Julia)
On Mon, Oct 09, 2017 at 09:33:01AM -0700, Laura Abbott wrote:
On 10/06/2017 08:10 PM, Levin, Alexander (Sasha Levin) wrote:
We are experimenting with using a neural network to aid with patch selection for stable kernel trees. There are quite a few commits that were not marked for stable, but are stable material, and we're trying to get them into their appropriate kernel trees.
Apart from the practical side, which has been covered, I'd be interested in hearing about the details of how this works if you can share them.
This work is based on Julia's work (https://soarsmu.github.io/papers/icse12-patch.pdf) to identify commits that fix bugs.
Essentially, my approach to this is to extract as much information as possible from the commit, including things such as:
- How many times a certain word appeared in the message
- Who is the author
- Code metrics
- etc
In my case, I end up with about 30,000 of these "inputs", and train a neural network based on whether a given commit was included in a stable tree or not.
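As a rough illustration of the kind of "inputs" described above, here is a minimal sketch that turns one commit into a sparse feature dictionary. It is not the code used for the stable selection work: the function name `extract_features`, the feature key prefixes, and the choice of added/removed line counts as "code metrics" are all assumptions made for the example.

```python
import re
from collections import Counter


def extract_features(message: str, author: str, diff: str) -> dict:
    """Turn one commit into a sparse {feature_name: value} dictionary.

    Hypothetical helper: the feature names and metrics are illustrative only.
    """
    features = {}

    # How many times each word appears in the commit message.
    words = re.findall(r"[a-z][a-z0-9_-]+", message.lower())
    for word, count in Counter(words).items():
        features["word:" + word] = float(count)

    # Who the author is, encoded as a single indicator feature.
    features["author:" + author.lower()] = 1.0

    # Very simple code metrics taken from a unified diff (an assumption;
    # the thread does not say which metrics were used).
    added = removed = 0
    for line in diff.splitlines():
        if line.startswith("+") and not line.startswith("+++"):
            added += 1
        elif line.startswith("-") and not line.startswith("---"):
            removed += 1
    features["metric:lines_added"] = float(added)
    features["metric:lines_removed"] = float(removed)

    return features
```

Each commit would then be paired with a label (whether it ended up in a stable tree) before being fed to whatever classifier is used; that training step is left out here because the thread does not describe it.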
This approach has a few drawbacks compared to the one Julia described in her paper:
- Not every bug fixing commit ends up in stable (some end up in -rc fixing a bug from the current merge window).
- Same as above, but for commits we miss and fail to add to stable.
- Sometimes commits get added to stable even though they don't follow the rules at all (security fixes are a simple example).
But it does seem to be effective at finding bug fixing commits that should be in stable.
At this stage we are still trying to figure out what a "bug fixing" commit really is. For example, an observation we recently made was that the code metrics actually don't have much weight in determining whether a commit should be in stable or not.
As we just started, I'm still experimenting with a few approaches, and I believe Julia is waiting for a new student to take over this, so we don't have any big insights to share just yet :)
That's a good summary of the current status. Thanks!
julia
On Wed, Nov 15, 2017 at 09:43:41AM -0800, Josh Hunt wrote:
I just started noticing the AUTOSEL tags yesterday and I think that's a great idea to tag patches, but was there any thought to also putting something in the commit message this way they're easily identifiable in the git logs? I think it would be useful if there was some metadata in the commit message which identified that it was selected through some automated system. That way if I find a regression and it identifies one of these commits I can know that maybe it was chosen incorrectly, and also would allow me to alert the owner of the selection script to better help refine its selection process. Otherwise I'd have to track back through the mailing lists to see how it landed in the stable release.
It's possible, but I didn't want to add a bunch of clutter to the commit message. Right now it's somewhat easy to track it back to automatic selection because:
1. I'm signed off on all of them, so I could chime in in the case concerns/issues arise with a patch.
2. They all have a corresponding review request email with the AUTOSEL marker.
Keep in mind that what the automatic tools are doing is only identifying whether a patch "looks like" a patch that should be in a stable tree. They do not verify that it's appropriate for any of the stable trees it ends up going to - that's still mostly manual and all fuck ups are PEBCAK.
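For anyone who wants to script the two checks above, here is a small sketch. The helper names `is_autosel_subject` and `signed_off_by` are made up for this example, and the exact subject format of the review request emails is an assumption: the code only looks for the word AUTOSEL in the subject and for a standard `Signed-off-by:` line with a given address in the commit message.

```python
import re

# Assumption: the review request subject contains the bare word "AUTOSEL".
AUTOSEL_RE = re.compile(r"\bAUTOSEL\b")

# Matches a "Signed-off-by: Name <address>" line and captures the address.
SIGNOFF_RE = re.compile(r"\s*Signed-off-by:\s*.*<([^>]+)>\s*$", re.IGNORECASE)


def is_autosel_subject(subject: str) -> bool:
    """Return True if an email subject carries the AUTOSEL marker."""
    return bool(AUTOSEL_RE.search(subject))


def signed_off_by(commit_message: str, email: str) -> bool:
    """Return True if the commit message has a Signed-off-by line for email."""
    for line in commit_message.splitlines():
        match = SIGNOFF_RE.match(line)
        if match and match.group(1).lower() == email.lower():
            return True
    return False
```

Neither check says anything about whether the patch was appropriate for a given tree; they only help trace how it got there.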
Just a thought. Also, thank you for trying to improve the stable kernels!
Thanks Josh!
On Thu, Nov 16, 2017 at 3:13 PM, alexander.levin@verizon.com wrote:
It's possible, but I didn't want to add a bunch of clutter to the commit message. Right now it's somewhat easy to track it back to automatic selection because:
1. I'm signed off on all of them, so I could chime in in the case concerns/issues arise with a patch.
2. They all have a corresponding review request email with the AUTOSEL marker.
I get the want to not clutter the commit logs. My comment was more directed to a few weeks or months after the patch has made it into a stable release. At that point I'm not sure the person who found the problem with the change would know to CC you on any correspondence, or even if the author of the change would know to contact you. Although I guess maybe they'd eventually track things down and report to stable, or Greg, or something else and it would eventually get back to you.
On Thu, Nov 16, 2017 at 03:24:46PM -0800, Josh Hunt wrote:
I get the want to not clutter the commit logs. My comment was more directed to a few weeks or months after the patch has made it into a stable release. At that point I'm not sure the person who found the problem with the change would know to CC you on any correspondence, or even if the author of the change would know to contact you. Although I guess maybe they'd eventually track things down and report to stable, or Greg, or something else and it would eventually get back to you.
If they report it to stable@vger (as they should), it is trivial for us to look in the logs to see where the patch came from. It turns out that the way Sasha formats these patches makes it obvious to me that it was an auto-selected patch, so I can easily see that even without looking in the email history.
So don't worry about us not being able to track things down, we are good at that :)
thanks,
greg k-h
linux-stable-mirror@lists.linaro.org