On 4/7/2021 2:33 PM, Jason Gunthorpe wrote:
On Mon, Mar 29, 2021 at 11:36:09AM -0700, Ira Weiny wrote:
On Mon, Mar 29, 2021 at 09:48:20AM -0400, dennis.dalessandro@cornelisnetworks.com wrote:
From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
The security code guards against a non-current mm in all cases when updating the rb tree.
That is ok for insert, but NOT ok for remove, since the insert has already guarded the node from being inserted and the remove can be called with a different mm because of a segfault or other similar "close" issues where current->mm is NULL.
Best case, we leak pages. Worst case, we delete items from an lru_list more than once:
[20945.911107] list_del corruption, ffffa0cd536bcac8->next is LIST_POISON1 (dead000000000100)
Fix by removing the guard from any functions that remove nodes from the tree, assuming the node was entered into the tree as valid, since the insert is guarded.
Does this open up a child process being able to remove nodes which the parent added?
Dennis?
I believe it does in a way. I'm not sure what we can do about it.
One thought was to check mm for NULL and, if so, remove unconditionally, because that means it's coming from the kernel killing the process or something along those lines. If it's not NULL, check against the saved mm value. Ira, do you recall discussing that during our internal review?
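A rough sketch of that check, written as a standalone userspace model and not as actual driver code. The names mm_model, handler_model, saved_mm and remove_allowed() are made up for illustration, and cur_mm stands in for current->mm:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real kernel or hfi1 types. */
struct mm_model {
	int id;
};

struct handler_model {
	struct mm_model *saved_mm;	/* mm recorded when the handler was set up */
};

/*
 * Decide whether a remove may proceed.
 * cur_mm models current->mm at the time of the call.
 */
static bool remove_allowed(const struct handler_model *h,
			   const struct mm_model *cur_mm)
{
	assert(h != NULL);

	/* NULL mm: assume the kernel is tearing the process down; always remove. */
	if (cur_mm == NULL)
		return true;

	/* Otherwise only the mm that was saved at setup may remove. */
	return cur_mm == h->saved_mm;
}
```

With this, a remove from the teardown path (NULL mm) or from the owning mm goes through, and a remove from any other non-NULL mm is refused. It does not say anything about which other corner cases exist.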
Need to do some more thinking on the right thing to do as I'm sure there are corner cases that I'm not seeing.
-Denny