> Subject: Patch "net: phylink: set the autoneg state in phylink_phy_change" has
> been added to the 5.1-stable tree
>
>
> This is a note to let you know that I've just added the patch titled
>
> net: phylink: set the autoneg state in phylink_phy_change
>
> to the 5.1-stable tree which can be found at:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git
>
> The filename of the patch is:
> net-phylink-set-the-autoneg-state-in-phylink_phy_change.patch
> and it can be found in the queue-5.1 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree, please let
> <stable(a)vger.kernel.org> know about it.
Hi all,
Sorry for the late response, but this patch should not be added to stable trees since it was already reverted in net-next.
More information can be found at: https://marc.info/?l=linux-netdev&m=156104162206869&w=2
Thanks,
Ioana
>
>
> From foo@baz Wed 19 Jun 2019 02:33:45 PM CEST
> From: Ioana Ciornei <ioana.ciornei(a)nxp.com>
> Date: Thu, 13 Jun 2019 09:37:51 +0300
> Subject: net: phylink: set the autoneg state in phylink_phy_change
>
> From: Ioana Ciornei <ioana.ciornei(a)nxp.com>
>
> [ Upstream commit ef7bfa84725d891bbdb88707ed55b2cbf94942bb ]
>
> The phy_state field of phylink should carry only valid information, especially
> when it can be passed to the .mac_config callback.
> Update the an_enabled field with the autoneg state in the phylink_phy_change
> function.
>
> Fixes: 9525ae83959b ("phylink: add phylink infrastructure")
> Signed-off-by: Ioana Ciornei <ioana.ciornei(a)nxp.com>
> Signed-off-by: David S. Miller <davem(a)davemloft.net>
> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
> ---
> drivers/net/phy/phylink.c | 1 +
> 1 file changed, 1 insertion(+)
>
> --- a/drivers/net/phy/phylink.c
> +++ b/drivers/net/phy/phylink.c
> @@ -638,6 +638,7 @@ static void phylink_phy_change(struct ph
> pl->phy_state.pause |= MLO_PAUSE_ASYM;
> pl->phy_state.interface = phydev->interface;
> pl->phy_state.link = up;
> + pl->phy_state.an_enabled = phydev->autoneg;
> mutex_unlock(&pl->state_mutex);
>
> phylink_run_resolve(pl);
>
>
> Patches currently in stable-queue which might be from ioana.ciornei(a)nxp.com
> are
>
> queue-5.1/net-phylink-set-the-autoneg-state-in-phylink_phy_change.patch
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 59ea6d06cfa9247b586a695c21f94afa7183af74 Mon Sep 17 00:00:00 2001
From: Andrea Arcangeli <aarcange(a)redhat.com>
Date: Thu, 13 Jun 2019 15:56:11 -0700
Subject: [PATCH] coredump: fix race condition between collapse_huge_page() and
core dumping
When fixing the race conditions between the coredump and the mmap_sem
holders outside the context of the process, we focused on
mmget_not_zero()/get_task_mm() callers in 04f5866e41fb70 ("coredump: fix
race condition between mmget_not_zero()/get_task_mm() and core
dumping"), but those aren't the only cases where the mmap_sem can be
taken outside of the context of the process as Michal Hocko noticed
while backporting that commit to older -stable kernels.
If mmgrab() is called in the context of the process, but then the
mm_count reference is transferred outside the context of the process,
that can also be a problem if the mmap_sem has to be taken for writing
through that mm_count reference.
khugepaged registration calls mmgrab() in the context of the process,
but the mmap_sem for writing is taken later in the context of the
khugepaged kernel thread.
collapse_huge_page() after taking the mmap_sem for writing doesn't
modify any vma, so it's not obvious that it could cause a problem for the
coredump, but it happens to modify the pmd in a way that breaks an
invariant that pmd_trans_huge_lock() relies upon. collapse_huge_page()
needs the mmap_sem for writing just to block concurrent page faults that
call pmd_trans_huge_lock().
Specifically the invariant that "!pmd_trans_huge()" cannot become a
"pmd_trans_huge()" doesn't hold while collapse_huge_page() runs.
The coredump will call __get_user_pages() without mmap_sem for reading,
which eventually can invoke a lockless page fault which will need a
functional pmd_trans_huge_lock().
So collapse_huge_page() needs to use mmget_still_valid() to check it's
not running concurrently with the coredump... as long as the coredump
can invoke page faults without holding the mmap_sem for reading.
This has "Fixes: khugepaged" to facilitate backporting, but in my view
it's more a bug in the coredump code that will eventually have to be
rewritten to stop invoking page faults without the mmap_sem for reading.
So the long term plan is still to drop all mmget_still_valid().
Link: http://lkml.kernel.org/r/20190607161558.32104-1-aarcange@redhat.com
Fixes: ba76149f47d8 ("thp: khugepaged")
Signed-off-by: Andrea Arcangeli <aarcange(a)redhat.com>
Reported-by: Michal Hocko <mhocko(a)suse.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: Oleg Nesterov <oleg(a)redhat.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Mike Rapoport <rppt(a)linux.vnet.ibm.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index a3fda9f024c3..4a7944078cc3 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -54,6 +54,10 @@ static inline void mmdrop(struct mm_struct *mm)
* followed by taking the mmap_sem for writing before modifying the
* vmas or anything the coredump pretends not to change from under it.
*
+ * It also has to be called when mmgrab() is used in the context of
+ * the process, but then the mm_count refcount is transferred outside
+ * the context of the process to run down_write() on that pinned mm.
+ *
* NOTE: find_extend_vma() called from GUP context is the only place
* that can modify the "mm" (notably the vm_start/end) under mmap_sem
* for reading and outside the context of the process, so it is also
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index a335f7c1fac4..0f7419938008 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1004,6 +1004,9 @@ static void collapse_huge_page(struct mm_struct *mm,
* handled by the anon_vma lock + PG_lock.
*/
down_write(&mm->mmap_sem);
+ result = SCAN_ANY_PROCESS;
+ if (!mmget_still_valid(mm))
+ goto out;
result = hugepage_vma_revalidate(mm, address, &vma);
if (result)
goto out;
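
The check added above follows the general pattern required of any writer that
reaches an mm through a transferred mm_count reference; as a minimal schematic
sketch (the function name and error value here are illustrative, not from the
patch):

/* Schematic only: a writer that obtained this mm via mmgrab() in the
 * process context, with the mm_count reference later handed off to
 * another thread (as khugepaged does). */
static int example_mm_writer(struct mm_struct *mm)
{
	int ret = -EINTR;	/* illustrative error choice */

	down_write(&mm->mmap_sem);
	if (!mmget_still_valid(mm))
		goto out_unlock;	/* coredump running: leave pmds/vmas alone */
	/* ... safe to modify page tables here ... */
	ret = 0;
out_unlock:
	up_write(&mm->mmap_sem);
	return ret;
}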
In the spirit of filemap_fdatawait_range() and
filemap_fdatawait_keep_errors(), introduce
filemap_fdatawait_range_keep_errors() which both takes a range upon
which to wait and does not clear errors from the address space.
Signed-off-by: Ross Zwisler <zwisler(a)google.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Cc: stable(a)vger.kernel.org
---
include/linux/fs.h | 2 ++
mm/filemap.c | 22 ++++++++++++++++++++++
2 files changed, 24 insertions(+)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f7fdfe93e25d3..79fec8a8413f4 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2712,6 +2712,8 @@ extern int filemap_flush(struct address_space *);
extern int filemap_fdatawait_keep_errors(struct address_space *mapping);
extern int filemap_fdatawait_range(struct address_space *, loff_t lstart,
loff_t lend);
+extern int filemap_fdatawait_range_keep_errors(struct address_space *mapping,
+ loff_t start_byte, loff_t end_byte);
static inline int filemap_fdatawait(struct address_space *mapping)
{
diff --git a/mm/filemap.c b/mm/filemap.c
index df2006ba0cfa5..e87252ca0835a 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -553,6 +553,28 @@ int filemap_fdatawait_range(struct address_space *mapping, loff_t start_byte,
}
EXPORT_SYMBOL(filemap_fdatawait_range);
+/**
+ * filemap_fdatawait_range_keep_errors - wait for writeback to complete
+ * @mapping: address space structure to wait for
+ * @start_byte: offset in bytes where the range starts
+ * @end_byte: offset in bytes where the range ends (inclusive)
+ *
+ * Walk the list of under-writeback pages of the given address space in the
+ * given range and wait for all of them. Unlike filemap_fdatawait_range(),
+ * this function does not clear error status of the address space.
+ *
+ * Use this function if callers don't handle errors themselves. Expected
+ * call sites are system-wide / filesystem-wide data flushers: e.g. sync(2),
+ * fsfreeze(8)
+ */
+int filemap_fdatawait_range_keep_errors(struct address_space *mapping,
+ loff_t start_byte, loff_t end_byte)
+{
+ __filemap_fdatawait_range(mapping, start_byte, end_byte);
+ return filemap_check_and_keep_errors(mapping);
+}
+EXPORT_SYMBOL(filemap_fdatawait_range_keep_errors);
+
/**
* file_fdatawait_range - wait for writeback to complete
* @file: file pointing to address space structure to wait for
--
2.22.0.410.gd8fdbe21b5-goog
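
As a usage note, a minimal sketch of a caller along the lines of the commit
message's intended call sites (the surrounding function is hypothetical, not
part of the patch):

/* Hypothetical flusher: wait on a byte range without clearing the
 * mapping's error state, so a later fsync() can still observe it. */
static void example_flush_wait(struct address_space *mapping,
			       loff_t start, loff_t end)
{
	int err = filemap_fdatawait_range_keep_errors(mapping, start, end);

	if (err)
		pr_debug("writeback error %d preserved for fsync\n", err);
}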
From: Andrea Arcangeli <aarcange(a)redhat.com>
Upstream commit 04f5866e41fb70690e28397487d8bd8eea7d712a.
The core dumping code has always run without holding the mmap_sem for
writing, even though that is the only way to ensure that the entire vma
layout will not change from under it. Using only some signal
serialization on the processes belonging to the mm is not nearly enough.
This was pointed out earlier. For example in Hugh's post from Jul 2017:
https://lkml.kernel.org/r/alpine.LSU.2.11.1707191716030.2055@eggly.anvils
"Not strictly relevant here, but a related note: I was very surprised
to discover, only quite recently, how handle_mm_fault() may be called
without down_read(mmap_sem) - when core dumping. That seems a
misguided optimization to me, which would also be nice to correct"
In particular, because growsdown and growsup can move vm_start/vm_end, the
various loops the core dump does around the vmas will not be consistent if
page faults can happen concurrently.
Pretty much all users calling mmget_not_zero()/get_task_mm() and then
taking the mmap_sem had the potential to introduce unexpected side
effects in the core dumping code.
Adding mmap_sem for writing around the ->core_dump invocation is a
viable long term fix, but it requires removing all copy user and page
faults and to replace them with get_dump_page() for all binary formats
which is not suitable as a short term fix.
For the time being this solution manually covers the places that can
confuse the core dump either by altering the vma layout or the vma flags
while it runs. Once ->core_dump runs under mmap_sem for writing the
function mmget_still_valid() can be dropped.
Allowing mmap_sem protected sections to run in parallel with the
coredump provides some minor parallelism advantage to the swapoff code
(which seems to be safe enough by never mangling any vma field and can
keep doing swapins in parallel to the core dumping) and to some other
corner case.
In order to facilitate the backporting I added "Fixes: 86039bd3b4e6";
however, the side effect of this same race condition in /proc/pid/mem
should be reproducible since before 2.6.12-rc2, so I couldn't add any
other "Fixes:" because there's no hash beyond the git genesis commit.
Because find_extend_vma() is the only location outside of the process
context that could modify the "mm" structures under mmap_sem for
reading, by adding the mmget_still_valid() check to it, all other cases
that take the mmap_sem for reading don't need the new check after
mmget_not_zero()/get_task_mm(). The expand_stack() in page fault
context also doesn't need the new check, because all tasks under core
dumping are frozen.
Link: http://lkml.kernel.org/r/20190325224949.11068-1-aarcange@redhat.com
Fixes: 86039bd3b4e6 ("userfaultfd: add new syscall to provide memory externalization")
Signed-off-by: Andrea Arcangeli <aarcange(a)redhat.com>
Reported-by: Jann Horn <jannh(a)google.com>
Suggested-by: Oleg Nesterov <oleg(a)redhat.com>
Acked-by: Peter Xu <peterx(a)redhat.com>
Reviewed-by: Mike Rapoport <rppt(a)linux.ibm.com>
Reviewed-by: Oleg Nesterov <oleg(a)redhat.com>
Reviewed-by: Jann Horn <jannh(a)google.com>
Acked-by: Jason Gunthorpe <jgg(a)mellanox.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Joel Fernandes (Google) <joel(a)joelfernandes.org>
[mhocko(a)suse.com: stable 4.4 backport
 - drop infiniband part because of missing 5f9794dc94f59
 - drop userfaultfd_event_wait_completion hunk because of
   missing 9cd75c3cd4c3d
 - handle binder_update_page_range because of missing 720c241924046
 - add check to collapse_huge_page
]
Signed-off-by: Michal Hocko <mhocko(a)suse.com>
---
Hi,
this is based on the backport I have done for our 4.4-based distribution
kernel. Please double check that I haven't missed anything before
applying to the stable tree. I have also CCed Joel for the binder part
which is not in the current upstream anymore but I believe it needs the
check as well.
Review feedback welcome.
drivers/android/binder.c | 6 ++++++
fs/proc/task_mmu.c | 18 ++++++++++++++++++
fs/userfaultfd.c | 10 ++++++++--
include/linux/mm.h | 21 +++++++++++++++++++++
mm/huge_memory.c | 2 +-
mm/mmap.c | 7 ++++++-
6 files changed, 60 insertions(+), 4 deletions(-)
diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index 260ce0e60187..1fb1cddbd19a 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -570,6 +570,12 @@ static int binder_update_page_range(struct binder_proc *proc, int allocate,
if (mm) {
down_write(&mm->mmap_sem);
+ if (!mmget_still_valid(mm)) {
+ if (allocate == 0)
+ goto free_range;
+ goto err_no_vma;
+ }
+
vma = proc->vma;
if (vma && mm != proc->vma_vm_mm) {
pr_err("%d: vma mm and task mm mismatch\n",
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 75691a20313c..ad1ccdcef74e 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -947,6 +947,24 @@ static ssize_t clear_refs_write(struct file *file, const char __user *buf,
continue;
up_read(&mm->mmap_sem);
down_write(&mm->mmap_sem);
+ /*
+ * Avoid to modify vma->vm_flags
+ * without locked ops while the
+ * coredump reads the vm_flags.
+ */
+ if (!mmget_still_valid(mm)) {
+ /*
+ * Silently return "count"
+ * like if get_task_mm()
+ * failed. FIXME: should this
+ * function have returned
+ * -ESRCH if get_task_mm()
+ * failed like if
+ * get_proc_task() fails?
+ */
+ up_write(&mm->mmap_sem);
+ goto out_mm;
+ }
for (vma = mm->mmap; vma; vma = vma->vm_next) {
vma->vm_flags &= ~VM_SOFTDIRTY;
vma_set_page_prot(vma);
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 59d58bdad7d3..4053de7b7064 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -443,6 +443,8 @@ static int userfaultfd_release(struct inode *inode, struct file *file)
* taking the mmap_sem for writing.
*/
down_write(&mm->mmap_sem);
+ if (!mmget_still_valid(mm))
+ goto skip_mm;
prev = NULL;
for (vma = mm->mmap; vma; vma = vma->vm_next) {
cond_resched();
@@ -465,6 +467,7 @@ static int userfaultfd_release(struct inode *inode, struct file *file)
vma->vm_flags = new_flags;
vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
}
+skip_mm:
up_write(&mm->mmap_sem);
/*
@@ -761,9 +764,10 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
end = start + uffdio_register.range.len;
down_write(&mm->mmap_sem);
- vma = find_vma_prev(mm, start, &prev);
-
ret = -ENOMEM;
+ if (!mmget_still_valid(mm))
+ goto out_unlock;
+ vma = find_vma_prev(mm, start, &prev);
if (!vma)
goto out_unlock;
@@ -903,6 +907,8 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
end = start + uffdio_unregister.len;
down_write(&mm->mmap_sem);
+ if (!mmget_still_valid(mm))
+ goto out_unlock;
vma = find_vma_prev(mm, start, &prev);
ret = -ENOMEM;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 251adf4d8a71..ed653ba47c46 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1098,6 +1098,27 @@ void zap_page_range(struct vm_area_struct *vma, unsigned long address,
void unmap_vmas(struct mmu_gather *tlb, struct vm_area_struct *start_vma,
unsigned long start, unsigned long end);
+/*
+ * This has to be called after a get_task_mm()/mmget_not_zero()
+ * followed by taking the mmap_sem for writing before modifying the
+ * vmas or anything the coredump pretends not to change from under it.
+ *
+ * NOTE: find_extend_vma() called from GUP context is the only place
+ * that can modify the "mm" (notably the vm_start/end) under mmap_sem
+ * for reading and outside the context of the process, so it is also
+ * the only case that holds the mmap_sem for reading that must call
+ * this function. Generally if the mmap_sem is held for reading
+ * there's no need of this check after get_task_mm()/mmget_not_zero().
+ *
+ * This function can be obsoleted and the check can be removed, after
+ * the coredump code will hold the mmap_sem for writing before
+ * invoking the ->core_dump methods.
+ */
+static inline bool mmget_still_valid(struct mm_struct *mm)
+{
+ return likely(!mm->core_state);
+}
+
/**
* mm_walk - callbacks for walk_page_range
* @pmd_entry: if set, called for each non-empty PMD (3rd-level) entry
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 465786cd6490..281fde7b62d2 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2587,7 +2587,7 @@ static void collapse_huge_page(struct mm_struct *mm,
* handled by the anon_vma lock + PG_lock.
*/
down_write(&mm->mmap_sem);
- if (unlikely(khugepaged_test_exit(mm)))
+ if (unlikely(khugepaged_test_exit(mm) || !mmget_still_valid(mm)))
goto out;
vma = find_vma(mm, address);
diff --git a/mm/mmap.c b/mm/mmap.c
index baa4c1280bff..a24e42477001 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -42,6 +42,7 @@
#include <linux/memory.h>
#include <linux/printk.h>
#include <linux/userfaultfd_k.h>
+#include <linux/mm.h>
#include <asm/uaccess.h>
#include <asm/cacheflush.h>
@@ -2398,7 +2399,8 @@ find_extend_vma(struct mm_struct *mm, unsigned long addr)
vma = find_vma_prev(mm, addr, &prev);
if (vma && (vma->vm_start <= addr))
return vma;
- if (!prev || expand_stack(prev, addr))
+ /* don't alter vm_end if the coredump is running */
+ if (!prev || !mmget_still_valid(mm) || expand_stack(prev, addr))
return NULL;
if (prev->vm_flags & VM_LOCKED)
populate_vma_page_range(prev, addr, prev->vm_end, NULL);
@@ -2424,6 +2426,9 @@ find_extend_vma(struct mm_struct *mm, unsigned long addr)
return vma;
if (!(vma->vm_flags & VM_GROWSDOWN))
return NULL;
+ /* don't alter vm_start if the coredump is running */
+ if (!mmget_still_valid(mm))
+ return NULL;
start = vma->vm_start;
if (expand_stack(vma, addr))
return NULL;
--
2.20.1
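
For orientation, the reason the check works is the coredump side of the
handshake: coredump_wait() publishes core_state under the write mmap_sem
before dumping, so any later writer observes it. Roughly (a schematic based
on the mmget_still_valid() definition above, not code from this patch):

/* Schematic: coredump setup. Once core_state is published under the
 * write mmap_sem, every writer that takes the write mmap_sem afterwards
 * sees mmget_still_valid(mm) == false and backs off. */
down_write(&mm->mmap_sem);
mm->core_state = core_state;	/* mmget_still_valid(mm) is now false */
up_write(&mm->mmap_sem);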
commit 7a30df49f63ad92318ddf1f7498d1129a77dd4bd upstream
A few new fields were added to mmu_gather to make TLB flush smarter for
huge pages by tracking which level of the page table has changed.
__tlb_reset_range() is used to reset all this page table state to
unchanged; it is called by the TLB flush code for parallel mapping changes
on the same range under a non-exclusive lock (i.e. read mmap_sem).
Before commit dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in
munmap"), the syscalls which may update PTEs in parallel (e.g.
MADV_DONTNEED, MADV_FREE) did not remove page tables. But the
aforementioned commit may do munmap() under the read mmap_sem and free
page tables. This may result in a program hang on aarch64, as reported by
Jan Stancek. The problem could be reproduced with his test program,
slightly modified below.
---8<---
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

/* Size of the mmap hint area; the value here is an assumption, the
 * original reproducer's definition was not included in the quote. */
#define DISTANT_MMAP_SIZE (64 * 1024 * 1024)

static int map_size = 4096;
static int num_iter = 500;
static long threads_total;
static void *distant_area;
void *map_write_unmap(void *ptr)
{
int *fd = ptr;
unsigned char *map_address;
int i, j = 0;
for (i = 0; i < num_iter; i++) {
map_address = mmap(distant_area, (size_t) map_size, PROT_WRITE | PROT_READ,
MAP_SHARED | MAP_ANONYMOUS, -1, 0);
if (map_address == MAP_FAILED) {
perror("mmap");
exit(1);
}
for (j = 0; j < map_size; j++)
map_address[j] = 'b';
if (munmap(map_address, map_size) == -1) {
perror("munmap");
exit(1);
}
}
return NULL;
}
void *dummy(void *ptr)
{
return NULL;
}
int main(void)
{
pthread_t thid[2];
/* hint for mmap in map_write_unmap() */
distant_area = mmap(0, DISTANT_MMAP_SIZE, PROT_WRITE | PROT_READ,
MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
munmap(distant_area, (size_t)DISTANT_MMAP_SIZE);
distant_area += DISTANT_MMAP_SIZE / 2;
while (1) {
pthread_create(&thid[0], NULL, map_write_unmap, NULL);
pthread_create(&thid[1], NULL, dummy, NULL);
pthread_join(thid[0], NULL);
pthread_join(thid[1], NULL);
}
}
---8<---
The program may bring in parallel execution like below:
t1                          t2
munmap(map_address)
  downgrade_write(&mm->mmap_sem);
  unmap_region()
    tlb_gather_mmu()
      inc_tlb_flush_pending(tlb->mm);
    free_pgtables()
      tlb->freed_tables = 1
      tlb->cleared_pmds = 1
                            pthread_exit()
                            madvise(thread_stack, 8M, MADV_DONTNEED)
                              zap_page_range()
                                tlb_gather_mmu()
                                  inc_tlb_flush_pending(tlb->mm);
  tlb_finish_mmu()
    if (mm_tlb_flush_nested(tlb->mm))
      __tlb_reset_range()
__tlb_reset_range() would reset the freed_tables and cleared_* bits, but
this causes an inconsistency for munmap(), which does free page tables.
As a result, some architectures, e.g. aarch64, may not flush the TLB
completely as expected, leaving stale TLB entries behind.
Use a fullmm flush, since it yields much better performance on aarch64,
and non-fullmm doesn't yield a significant difference on x86.
The originally proposed fix came from Jan Stancek, who did most of the
debugging of this issue; I just wrapped everything up together.
Jan's testing results:
v5.2-rc2-24-gbec7550cca10
--------------------------
mean stddev
real 37.382 2.780
user 1.420 0.078
sys 54.658 1.855
v5.2-rc2-24-gbec7550cca10 + "mm: mmu_gather: remove __tlb_reset_range() for force flush"
---------------------------------------------------------------------------------------
mean stddev
real 37.119 2.105
user 1.548 0.087
sys 55.698 1.357
[akpm(a)linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/1558322252-113575-1-git-send-email-yang.shi@linux.…
Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
Signed-off-by: Yang Shi <yang.shi(a)linux.alibaba.com>
Signed-off-by: Jan Stancek <jstancek(a)redhat.com>
Reported-by: Jan Stancek <jstancek(a)redhat.com>
Tested-by: Jan Stancek <jstancek(a)redhat.com>
Suggested-by: Will Deacon <will.deacon(a)arm.com>
Tested-by: Will Deacon <will.deacon(a)arm.com>
Acked-by: Will Deacon <will.deacon(a)arm.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Nick Piggin <npiggin(a)gmail.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)linux.ibm.com>
Cc: Nadav Amit <namit(a)vmware.com>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: <stable(a)vger.kernel.org> [4.20+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
---
mm/mmu_gather.c | 24 +++++++++++++++++++-----
1 file changed, 19 insertions(+), 5 deletions(-)
diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
index f2f03c6..a58bd0d 100644
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -93,8 +93,17 @@ void arch_tlb_finish_mmu(struct mmu_gather *tlb,
struct mmu_gather_batch *batch, *next;
if (force) {
+ /*
+ * The aarch64 yields better performance with fullmm by
+ * avoiding multiple CPUs spamming TLBI messages at the
+ * same time.
+ *
+ * On x86 non-fullmm doesn't yield significant difference
+ * against fullmm.
+ */
+ tlb->fullmm = 1;
__tlb_reset_range(tlb);
- __tlb_adjust_range(tlb, start, end - start);
+ tlb->freed_tables = 1;
}
tlb_flush_mmu(tlb);
@@ -249,10 +258,15 @@ void tlb_finish_mmu(struct mmu_gather *tlb,
{
/*
* If there are parallel threads are doing PTE changes on same range
- * under non-exclusive lock(e.g., mmap_sem read-side) but defer TLB
- * flush by batching, a thread has stable TLB entry can fail to flush
- * the TLB by observing pte_none|!pte_dirty, for example so flush TLB
- * forcefully if we detect parallel PTE batching threads.
+ * under non-exclusive lock (e.g., mmap_sem read-side) but defer TLB
+ * flush by batching, one thread may end up seeing inconsistent PTEs
+ * and result in having stale TLB entries. So flush TLB forcefully
+ * if we detect parallel PTE batching threads.
+ *
+ * However, some syscalls, e.g. munmap(), may free page tables, this
+ * needs force flush everything in the given range. Otherwise this
+ * may result in having stale TLB entries for some architectures,
+ * e.g. aarch64, that could specify flush what level TLB.
*/
bool force = mm_tlb_flush_nested(tlb->mm);
--
1.8.3.1
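
To restate the core of the fix in one place, a schematic of the force-flush
decision after this change (condensed from the two hunks above, not a literal
copy of either):

/* Schematic: if another thread did PTE batching on the same mm in
 * parallel (mm_tlb_flush_nested), flush as if tearing down the whole
 * mm, so freed page tables stay covered even after __tlb_reset_range(). */
bool force = mm_tlb_flush_nested(tlb->mm);

if (force) {
	tlb->fullmm = 1;
	__tlb_reset_range(tlb);
	tlb->freed_tables = 1;
}
tlb_flush_mmu(tlb);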
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2cc08800a6b9fcda7c7afbcf2da1a6e8808da725 Mon Sep 17 00:00:00 2001
From: Jason Gerecke <jason.gerecke(a)wacom.com>
Date: Wed, 24 Apr 2019 15:12:57 -0700
Subject: [PATCH] HID: wacom: Don't set tool type until we're in range
The serial number and tool type information that is reported by the tablet
while a pen is merely "in prox" instead of fully "in range" can be stale
and cause us to report incorrect tool information. Serial number, tool
type, and other information is only valid once the pen comes fully in range
so we should be careful to not use this information until that point.
In particular, this issue may cause the driver to incorrectly report
BTN_TOOL_RUBBER after switching from the eraser tool back to the pen.
Fixes: a48324de6d4d ("HID: wacom: Bluetooth IRQ for Intuos Pro should handle prox/range")
Cc: <stable(a)vger.kernel.org> # 4.11+
Signed-off-by: Jason Gerecke <jason.gerecke(a)wacom.com>
Reviewed-by: Aaron Armstrong Skomra <aaron.skomra(a)wacom.com>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires(a)redhat.com>
diff --git a/drivers/hid/wacom_wac.c b/drivers/hid/wacom_wac.c
index 747730d32ab6..4c1bc239207e 100644
--- a/drivers/hid/wacom_wac.c
+++ b/drivers/hid/wacom_wac.c
@@ -1236,13 +1236,13 @@ static void wacom_intuos_pro2_bt_pen(struct wacom_wac *wacom)
/* Add back in missing bits of ID for non-USI pens */
wacom->id[0] |= (wacom->serial[0] >> 32) & 0xFFFFF;
}
- wacom->tool[0] = wacom_intuos_get_tool_type(wacom_intuos_id_mangle(wacom->id[0]));
for (i = 0; i < pen_frames; i++) {
unsigned char *frame = &data[i*pen_frame_len + 1];
bool valid = frame[0] & 0x80;
bool prox = frame[0] & 0x40;
bool range = frame[0] & 0x20;
+ bool invert = frame[0] & 0x10;
if (!valid)
continue;
@@ -1251,9 +1251,24 @@ static void wacom_intuos_pro2_bt_pen(struct wacom_wac *wacom)
wacom->shared->stylus_in_proximity = false;
wacom_exit_report(wacom);
input_sync(pen_input);
+
+ wacom->tool[0] = 0;
+ wacom->id[0] = 0;
+ wacom->serial[0] = 0;
return;
}
+
if (range) {
+ if (!wacom->tool[0]) { /* first in range */
+ /* Going into range select tool */
+ if (invert)
+ wacom->tool[0] = BTN_TOOL_RUBBER;
+ else if (wacom->id[0])
+ wacom->tool[0] = wacom_intuos_get_tool_type(wacom->id[0]);
+ else
+ wacom->tool[0] = BTN_TOOL_PEN;
+ }
+
input_report_abs(pen_input, ABS_X, get_unaligned_le16(&frame[1]));
input_report_abs(pen_input, ABS_Y, get_unaligned_le16(&frame[3]));
Hi,
The patch "crypto: crypto4xx - properly set IV after de- and encrypt"
causes a compile error on kernel 4.4.
When I revert this commit it compiles for me again:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
I do not have hardware to test if it is really working.
Hauke
drivers/crypto/amcc/crypto4xx_core.c: In function
'crypto4xx_ablkcipher_done':
drivers/crypto/amcc/crypto4xx_core.c:649:21: warning: dereferencing
'void *' pointer
if (pd_uinfo->sa_va->sa_command_0.bf.save_iv == SA_SAVE_IV) {
^
drivers/crypto/amcc/crypto4xx_core.c:649:21: error: request for member
'sa_command_0' in something not a structure or union
drivers/crypto/amcc/crypto4xx_core.c:650:38: error: implicit declaration
of function 'crypto_skcipher_reqtfm' [-Werror=implicit-function-declaration]
struct crypto_skcipher *skcipher = crypto_skcipher_reqtfm(req);
^
drivers/crypto/amcc/crypto4xx_core.c:650:61: error: 'req' undeclared
(first use in this function)
struct crypto_skcipher *skcipher = crypto_skcipher_reqtfm(req);
^
drivers/crypto/amcc/crypto4xx_core.c:650:61: note: each undeclared
identifier is reported only once for each function it appears in
drivers/crypto/amcc/crypto4xx_core.c:652:3: error: implicit declaration
of function 'crypto4xx_memcpy_from_le32'
[-Werror=implicit-function-declaration]
crypto4xx_memcpy_from_le32((u32 *)req->iv,
^
drivers/crypto/amcc/crypto4xx_core.c:653:19: warning: dereferencing
'void *' pointer
pd_uinfo->sr_va->save_iv,
^
drivers/crypto/amcc/crypto4xx_core.c:653:19: error: request for member
'save_iv' in something not a structure or union
drivers/crypto/amcc/crypto4xx_core.c:654:4: error: implicit declaration
of function 'crypto_skcipher_ivsize' [-Werror=implicit-function-declaration]
crypto_skcipher_ivsize(skcipher));
^
If the devm_gpiod_get_from_of_node() call returns an ERR_PTR, it is
assigned into the array of GPIO descriptors and used later, because such
an error is not treated as critical and thus is not propagated back to
the probe function.
All later code expects that such a GPIO descriptor is either NULL or a
proper value. This might later lead to dereferencing the ERR_PTR.
Only devices of the S2MPS14 flavor are affected (others do not control
regulators with GPIOs).
Fixes: 1c984942f0a4 ("regulator: s2mps11: Pass descriptor instead of GPIO number")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzk(a)kernel.org>
---
The exact error condition was not tested because I do not have a board
with S2MPS14.
---
drivers/regulator/s2mps11.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/regulator/s2mps11.c b/drivers/regulator/s2mps11.c
index 134c62db36c5..af9bf10b4c33 100644
--- a/drivers/regulator/s2mps11.c
+++ b/drivers/regulator/s2mps11.c
@@ -824,6 +824,7 @@ static void s2mps14_pmic_dt_parse_ext_control_gpio(struct platform_device *pdev,
if (IS_ERR(gpio[reg])) {
dev_err(&pdev->dev, "Failed to get control GPIO for %d/%s\n",
reg, rdata[reg].name);
+ gpio[reg] = NULL;
continue;
}
if (gpio[reg])
--
2.7.4
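
For anyone applying the same idea elsewhere, the pattern is simply to
normalize a non-fatal ERR_PTR to NULL at the lookup site; a schematic sketch
with illustrative names (not from the patch):

/* Schematic: optional GPIO lookup where failure is non-fatal. Later
 * code checks only for NULL, so never let an ERR_PTR value escape. */
struct gpio_desc *desc = devm_gpiod_get_optional(dev, "ext-control",
						 GPIOD_OUT_HIGH);
if (IS_ERR(desc)) {
	dev_warn(dev, "optional GPIO lookup failed: %ld\n", PTR_ERR(desc));
	desc = NULL;	/* proceed without the GPIO, as the fix does */
}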