The patch titled
Subject: hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processing
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
hugetlb-dont-delete-vma_lock-in-hugetlb-madv_dontneed-processing.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processing
Date: Tue, 1 Nov 2022 18:31:00 -0700
madvise(MADV_DONTNEED) ends up calling zap_page_range() to clear page
tables associated with the address range. For hugetlb vmas,
zap_page_range will call __unmap_hugepage_range_final. However,
__unmap_hugepage_range_final assumes the passed vma is about to be removed
and deletes the vma_lock to prevent pmd sharing as the vma is on the way
out. In the case of madvise(MADV_DONTNEED) the vma remains, but the
missing vma_lock prevents pmd sharing and could potentially lead to issues
with truncation/fault races.
This issue was originally reported here [1] as a BUG triggered in
page_try_dup_anon_rmap. Prior to the introduction of the hugetlb
vma_lock, __unmap_hugepage_range_final cleared the VM_MAYSHARE flag to
prevent pmd sharing.  Subsequent faults on this vma were confused:
VM_MAYSHARE indicates a sharable vma, but since the flag had been cleared,
page_mapping was not set in new pages added to the page table.  This
resulted in pages that appeared anonymous in a VM_SHARED vma and triggered
the BUG.
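For reference, a minimal userspace sequence that exercises this path might
look like the following sketch (it assumes a 2 MB default hugepage size and
at least one free hugepage; the sizes and flags are illustrative, not part
of the fix):

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define LEN (2UL * 1024 * 1024)	/* assumes a 2 MB default hugepage size */

int main(void)
{
	/* Shared hugetlb mapping, comparable to mapping a hugetlbfs file. */
	char *p = mmap(NULL, LEN, PROT_READ | PROT_WRITE,
		       MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	memset(p, 1, LEN);		/* fault in the hugetlb page            */
	madvise(p, LEN, MADV_DONTNEED);	/* zap_page_range() on a hugetlb vma    */
	memset(p, 2, LEN);		/* fault again on the still-present vma */

	munmap(p, LEN);
	return 0;
}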
Address the issue by:
- Adding a new zap flag, ZAP_FLAG_UNMAP, to indicate an unmap call from
  unmap_vmas().  This is used to indicate the 'final' unmapping of a vma.
  When called via MADV_DONTNEED, this flag is not set and the vma_lock is
  not deleted.
- Removing the mmu notification from __unmap_hugepage_range to avoid
  duplication, and adding the notification to the other calling routine
  (unmap_hugepage_range).
- Updating the notification calls in zap_page_range() to take into account
  the possibility of multiple vmas.
[1] https://lore.kernel.org/lkml/CAO4mrfdLMXsao9RF4fUE8-Wfde8xmjsKrTNMNC9wjUb6J…
Link: https://lkml.kernel.org/r/20221102013100.455139-1-mike.kravetz@oracle.com
Fixes: 90e7e7f5ef3f ("mm: enable MADV_DONTNEED for hugetlb mappings")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reported-by: Wei Chen <harperchen1110(a)gmail.com>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Mina Almasry <almasrymina(a)google.com>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: Naoya Horiguchi <naoya.horiguchi(a)linux.dev>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/mm.h | 3 ++
mm/hugetlb.c | 45 +++++++++++++++++++++++++------------------
mm/memory.c | 21 ++++++++++++++------
3 files changed, 45 insertions(+), 24 deletions(-)
--- a/include/linux/mm.h~hugetlb-dont-delete-vma_lock-in-hugetlb-madv_dontneed-processing
+++ a/include/linux/mm.h
@@ -3475,4 +3475,7 @@ madvise_set_anon_name(struct mm_struct *
*/
#define ZAP_FLAG_DROP_MARKER ((__force zap_flags_t) BIT(0))
+/* Set in unmap_vmas() to indicate an unmap call. Only used by hugetlb */
+#define ZAP_FLAG_UNMAP ((__force zap_flags_t) BIT(1))
+
#endif /* _LINUX_MM_H */
--- a/mm/hugetlb.c~hugetlb-dont-delete-vma_lock-in-hugetlb-madv_dontneed-processing
+++ a/mm/hugetlb.c
@@ -5064,7 +5064,6 @@ static void __unmap_hugepage_range(struc
struct page *page;
struct hstate *h = hstate_vma(vma);
unsigned long sz = huge_page_size(h);
- struct mmu_notifier_range range;
unsigned long last_addr_mask;
bool force_flush = false;
@@ -5079,13 +5078,6 @@ static void __unmap_hugepage_range(struc
tlb_change_page_size(tlb, sz);
tlb_start_vma(tlb, vma);
- /*
- * If sharing possible, alert mmu notifiers of worst case.
- */
- mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, mm, start,
- end);
- adjust_range_if_pmd_sharing_possible(vma, &range.start, &range.end);
- mmu_notifier_invalidate_range_start(&range);
last_addr_mask = hugetlb_mask_last_page(h);
address = start;
for (; address < end; address += sz) {
@@ -5174,7 +5166,6 @@ static void __unmap_hugepage_range(struc
if (ref_page)
break;
}
- mmu_notifier_invalidate_range_end(&range);
tlb_end_vma(tlb, vma);
/*
@@ -5199,32 +5190,50 @@ void __unmap_hugepage_range_final(struct
unsigned long end, struct page *ref_page,
zap_flags_t zap_flags)
{
+ bool final = zap_flags & ZAP_FLAG_UNMAP;
+
hugetlb_vma_lock_write(vma);
i_mmap_lock_write(vma->vm_file->f_mapping);
__unmap_hugepage_range(tlb, vma, start, end, ref_page, zap_flags);
/*
- * Unlock and free the vma lock before releasing i_mmap_rwsem. When
- * the vma_lock is freed, this makes the vma ineligible for pmd
- * sharing. And, i_mmap_rwsem is required to set up pmd sharing.
- * This is important as page tables for this unmapped range will
- * be asynchrously deleted. If the page tables are shared, there
- * will be issues when accessed by someone else.
+ * When called via zap_page_range (MADV_DONTNEED), this is not the
+ * final unmap of the vma, and we do not want to delete the vma_lock.
*/
- __hugetlb_vma_unlock_write_free(vma);
-
- i_mmap_unlock_write(vma->vm_file->f_mapping);
+ if (final) {
+ /*
+ * Unlock and free the vma lock before releasing i_mmap_rwsem.
+ * When the vma_lock is freed, this makes the vma ineligible
+ * for pmd sharing. And, i_mmap_rwsem is required to set up
+ * pmd sharing. This is important as page tables for this
+ * unmapped range will be asynchrously deleted. If the page
+ * tables are shared, there will be issues when accessed by
+ * someone else.
+ */
+ __hugetlb_vma_unlock_write_free(vma);
+ i_mmap_unlock_write(vma->vm_file->f_mapping);
+ } else {
+ i_mmap_unlock_write(vma->vm_file->f_mapping);
+ hugetlb_vma_unlock_write(vma);
+ }
}
void unmap_hugepage_range(struct vm_area_struct *vma, unsigned long start,
unsigned long end, struct page *ref_page,
zap_flags_t zap_flags)
{
+ struct mmu_notifier_range range;
struct mmu_gather tlb;
+ mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
+ start, end);
+ adjust_range_if_pmd_sharing_possible(vma, &range.start, &range.end);
tlb_gather_mmu(&tlb, vma->vm_mm);
+
__unmap_hugepage_range(&tlb, vma, start, end, ref_page, zap_flags);
+
+ mmu_notifier_invalidate_range_end(&range);
tlb_finish_mmu(&tlb);
}
--- a/mm/memory.c~hugetlb-dont-delete-vma_lock-in-hugetlb-madv_dontneed-processing
+++ a/mm/memory.c
@@ -1720,7 +1720,7 @@ void unmap_vmas(struct mmu_gather *tlb,
{
struct mmu_notifier_range range;
struct zap_details details = {
- .zap_flags = ZAP_FLAG_DROP_MARKER,
+ .zap_flags = ZAP_FLAG_DROP_MARKER | ZAP_FLAG_UNMAP,
/* Careful - we need to zap private pages too! */
.even_cows = true,
};
@@ -1753,15 +1753,24 @@ void zap_page_range(struct vm_area_struc
MA_STATE(mas, mt, vma->vm_end, vma->vm_end);
lru_add_drain();
- mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm,
- start, start + size);
tlb_gather_mmu(&tlb, vma->vm_mm);
update_hiwater_rss(vma->vm_mm);
- mmu_notifier_invalidate_range_start(&range);
do {
- unmap_single_vma(&tlb, vma, start, range.end, NULL);
+ mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma,
+ vma->vm_mm,
+ max(start, vma->vm_start),
+ min(end, vma->vm_end));
+ if (is_vm_hugetlb_page(vma))
+ adjust_range_if_pmd_sharing_possible(vma,
+ &range.start, &range.end);
+ mmu_notifier_invalidate_range_start(&range);
+ /*
+ * unmap 'start-end' not 'range.start-range.end' as range
+ * could have been expanded for pmd sharing.
+ */
+ unmap_single_vma(&tlb, vma, start, end, NULL);
+ mmu_notifier_invalidate_range_end(&range);
} while ((vma = mas_find(&mas, end - 1)) != NULL);
- mmu_notifier_invalidate_range_end(&range);
tlb_finish_mmu(&tlb);
}
_
Patches currently in -mm which might be from mike.kravetz(a)oracle.com are
hugetlb-dont-delete-vma_lock-in-hugetlb-madv_dontneed-processing.patch
hugetlb-simplify-hugetlb-handling-in-follow_page_mask.patch
hugetlb-simplify-hugetlb-handling-in-follow_page_mask-v4.patch
hugetlb-simplify-hugetlb-handling-in-follow_page_mask-v5.patch
The commit 3c52c6bb831f (tcp/udp: Fix memory leak in
ipv6_renew_options()) fixes a memory leak reported by syzbot. This seems
to be a good candidate for the stable trees.  The patch did not apply
cleanly to the 5.15 kernel, since release_sock() calls were changed to
sockopt_release_sock() in later kernel versions.
Kuniyuki Iwashima (1):
tcp/udp: Fix memory leak in ipv6_renew_options().
net/ipv6/ipv6_sockglue.c | 7 +++++++
1 file changed, 7 insertions(+)
--
2.38.1.273.g43a17bfeac-goog
From: Lino Sanfilippo <LinoSanfilippo(a)gmx.de>
Several drivers that support setting the RS485 configuration via userspace
implement one or more of the following tasks:
- in case of an invalid RTS configuration (both RTS on send and RTS after
  send set, or both unset), fall back to enabling RTS on send and
  disabling RTS after send
- nullify the padding field of the returned serial_rs485 struct
- copy the configuration into the uart port struct
- limit RTS delays to 100 ms
Move these tasks into the serial core to make them generic and to provide
a consistent behaviour among all drivers.
[ Upstream commit 0ed12afa5655512ee418047fb3546d229df20aa1 ]
Link: https://lkml.kernel.org/r/20221017051737.51727-1-dominique.martinet@atmark-…
Signed-off-by: Lino Sanfilippo <LinoSanfilippo(a)gmx.de>
Link: https://lore.kernel.org/r/20220410104642.32195-2-LinoSanfilippo@gmx.de
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Daisuke Mizobuchi <mizo(a)atmark-techno.com>
Signed-off-by: Dominique Martinet <dominique.martinet(a)atmark-techno.com>
---
5.15 version of the 5.10 backport:
https://lkml.kernel.org/r/20221017051737.51727-1-dominique.martinet@atmark-…
(only build tested)
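As a usage reference, the behaviour added here is what userspace observes
through the RS485 ioctls: delays are clamped, the padding is zeroed and an
invalid RTS combination is replaced by RTS-on-send.  A minimal sketch (the
device path is an assumption, and the clamping only applies with this
patch in place):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/serial.h>

int main(void)
{
	struct serial_rs485 rs485;
	int fd = open("/dev/ttyS0", O_RDWR | O_NOCTTY);	/* device path is an assumption */

	if (fd < 0) {
		perror("open");
		return 1;
	}

	memset(&rs485, 0, sizeof(rs485));
	rs485.flags = SER_RS485_ENABLED | SER_RS485_RTS_ON_SEND;
	rs485.delay_rts_before_send = 1000;	/* over the limit; the core clamps it to 100 ms */

	if (ioctl(fd, TIOCSRS485, &rs485) < 0)
		perror("TIOCSRS485");

	/* Read the configuration back: delays clamped, padding zeroed. */
	if (ioctl(fd, TIOCGRS485, &rs485) == 0)
		printf("delay_rts_before_send = %u ms\n", rs485.delay_rts_before_send);

	close(fd);
	return 0;
}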
drivers/tty/serial/serial_core.c | 33 ++++++++++++++++++++++++++++++++
1 file changed, 33 insertions(+)
diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index 82ddbb92d07d..48dafd1e084b 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -42,6 +42,11 @@ static struct lock_class_key port_lock_key;
#define HIGH_BITS_OFFSET ((sizeof(long)-sizeof(int))*8)
+/*
+ * Max time with active RTS before/after data is sent.
+ */
+#define RS485_MAX_RTS_DELAY 100 /* msecs */
+
static void uart_change_speed(struct tty_struct *tty, struct uart_state *state,
struct ktermios *old_termios);
static void uart_wait_until_sent(struct tty_struct *tty, int timeout);
@@ -1299,8 +1304,36 @@ static int uart_set_rs485_config(struct uart_port *port,
if (copy_from_user(&rs485, rs485_user, sizeof(*rs485_user)))
return -EFAULT;
+ /* pick sane settings if the user hasn't */
+ if (!(rs485.flags & SER_RS485_RTS_ON_SEND) ==
+ !(rs485.flags & SER_RS485_RTS_AFTER_SEND)) {
+ dev_warn_ratelimited(port->dev,
+ "%s (%d): invalid RTS setting, using RTS_ON_SEND instead\n",
+ port->name, port->line);
+ rs485.flags |= SER_RS485_RTS_ON_SEND;
+ rs485.flags &= ~SER_RS485_RTS_AFTER_SEND;
+ }
+
+ if (rs485.delay_rts_before_send > RS485_MAX_RTS_DELAY) {
+ rs485.delay_rts_before_send = RS485_MAX_RTS_DELAY;
+ dev_warn_ratelimited(port->dev,
+ "%s (%d): RTS delay before sending clamped to %u ms\n",
+ port->name, port->line, rs485.delay_rts_before_send);
+ }
+
+ if (rs485.delay_rts_after_send > RS485_MAX_RTS_DELAY) {
+ rs485.delay_rts_after_send = RS485_MAX_RTS_DELAY;
+ dev_warn_ratelimited(port->dev,
+ "%s (%d): RTS delay after sending clamped to %u ms\n",
+ port->name, port->line, rs485.delay_rts_after_send);
+ }
+ /* Return clean padding area to userspace */
+ memset(rs485.padding, 0, sizeof(rs485.padding));
+
spin_lock_irqsave(&port->lock, flags);
ret = port->rs485_config(port, &rs485);
+ if (!ret)
+ port->rs485 = rs485;
spin_unlock_irqrestore(&port->lock, flags);
if (ret)
return ret;
--
2.35.1
commit 702de2c21eed04c67cefaaedc248ef16e5f6b293 upstream.
We are seeing an IRQ storm on the global receive IRQ line under heavy
CAN bus load conditions with both CAN channels enabled.
Conditions:
The global receive IRQ line is shared between can0 and can1; either
channel can trigger the interrupt while the other channel's Rx FIFO
interrupt is disabled (RFIE).
When a global receive IRQ occurs, we mask the interrupt in the IRQ
handler.  Clearing and unmasking of the interrupt happens in rx_poll().
There is a race condition where rx_poll() unmasks the interrupt, but the
next IRQ handler does not mask the IRQ due to the NAPIF_STATE_MISSED flag
(e.g. can0's RX FIFO interrupt is disabled while can1 is triggering RX
interrupts; the delay in rx_poll() processing results in the
NAPIF_STATE_MISSED flag being set), leading to an IRQ storm.
This patch fixes the issue by checking that the IRQ is both active and
enabled before handling the IRQ on a particular channel.
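Expressed outside the driver, the fix boils down to: only treat a
channel's Rx FIFO as pending when both its interrupt-flag bit (RFIF in the
status register) and its interrupt-enable bit (RFIE in the FIFO control
register) are set.  A standalone sketch of that gating check (the bit
positions below are illustrative stand-ins, not the real RCANFD register
layout):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative bit positions, not the actual RCANFD register layout. */
#define RFIF	(1u << 3)	/* Rx FIFO interrupt flag   (status register)  */
#define RFIE	(1u << 1)	/* Rx FIFO interrupt enable (control register) */

/* Process a channel's Rx FIFO only when its interrupt is pending AND enabled. */
static bool rx_irq_should_run(uint32_t sts, uint32_t cc)
{
	return (sts & RFIF) && (cc & RFIE);
}

int main(void)
{
	/* Pending but masked: the fixed handler skips it, avoiding the IRQ storm. */
	printf("pending, masked  -> %d\n", rx_irq_should_run(RFIF, 0));
	/* Pending and enabled: handled as before. */
	printf("pending, enabled -> %d\n", rx_irq_should_run(RFIF, RFIE));
	return 0;
}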
Fixes: dd3bd23eb438 ("can: rcar_canfd: Add Renesas R-Car CAN FD driver")
Suggested-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
Signed-off-by: Biju Das <biju.das.jz(a)bp.renesas.com>
Link: https://lore.kernel.org/all/20221025155657.1426948-2-biju.das.jz@bp.renesas…
Cc: stable(a)vger.kernel.org # 4.9.x
[mkl: adjust commit message]
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
[biju: removed gpriv from RCANFD_RFCC_RFIE macro]
Signed-off-by: Biju Das <biju.das.jz(a)bp.renesas.com>
---
Resending to 4.9 with conflicts[1] fixed
[1] https://lore.kernel.org/stable/OS0PR01MB59226F2443DFCE7C5D73778786379@OS0PR…
---
drivers/net/can/rcar/rcar_canfd.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/can/rcar/rcar_canfd.c b/drivers/net/can/rcar/rcar_canfd.c
index a127c853a4e9..694a3354554f 100644
--- a/drivers/net/can/rcar/rcar_canfd.c
+++ b/drivers/net/can/rcar/rcar_canfd.c
@@ -1079,7 +1079,7 @@ static irqreturn_t rcar_canfd_global_interrupt(int irq, void *dev_id)
struct rcar_canfd_global *gpriv = dev_id;
struct net_device *ndev;
struct rcar_canfd_channel *priv;
- u32 sts, gerfl;
+ u32 sts, cc, gerfl;
u32 ch, ridx;
/* Global error interrupts still indicate a condition specific
@@ -1097,7 +1097,9 @@ static irqreturn_t rcar_canfd_global_interrupt(int irq, void *dev_id)
/* Handle Rx interrupts */
sts = rcar_canfd_read(priv->base, RCANFD_RFSTS(ridx));
- if (likely(sts & RCANFD_RFSTS_RFIF)) {
+ cc = rcar_canfd_read(priv->base, RCANFD_RFCC(ridx));
+ if (likely(sts & RCANFD_RFSTS_RFIF &&
+ cc & RCANFD_RFCC_RFIE)) {
if (napi_schedule_prep(&priv->napi)) {
/* Disable Rx FIFO interrupts */
rcar_canfd_clear_bit(priv->base,
--
2.25.1
madvise(MADV_DONTNEED) ends up calling zap_page_range() to clear page
tables associated with the address range. For hugetlb vmas,
zap_page_range will call __unmap_hugepage_range_final. However,
__unmap_hugepage_range_final assumes the passed vma is about to be removed
and deletes the vma_lock to prevent pmd sharing as the vma is on the way
out. In the case of madvise(MADV_DONTNEED) the vma remains, but the
missing vma_lock prevents pmd sharing and could potentially lead to issues
with truncation/fault races.
This issue was originally reported here [1] as a BUG triggered in
page_try_dup_anon_rmap. Prior to the introduction of the hugetlb
vma_lock, __unmap_hugepage_range_final cleared the VM_MAYSHARE flag to
prevent pmd sharing.  Subsequent faults on this vma were confused:
VM_MAYSHARE indicates a sharable vma, but since the flag had been cleared,
page_mapping was not set in new pages added to the page table.  This
resulted in pages that appeared anonymous in a VM_SHARED vma and triggered
the BUG.
Address the issue by:
- Adding a new zap flag, ZAP_FLAG_UNMAP, to indicate an unmap call from
  unmap_vmas().  This is used to indicate the 'final' unmapping of a vma.
  When called via MADV_DONTNEED, this flag is not set and the vma_lock is
  not deleted.
- Removing the mmu notification from __unmap_hugepage_range to avoid
  duplication, and adding the notification to the other calling routine
  (unmap_hugepage_range).
- Updating the notification calls in zap_page_range() to take into account
  the possibility of multiple vmas.
[1] https://lore.kernel.org/lkml/CAO4mrfdLMXsao9RF4fUE8-Wfde8xmjsKrTNMNC9wjUb6J…
Fixes: 90e7e7f5ef3f ("mm: enable MADV_DONTNEED for hugetlb mappings")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reported-by: Wei Chen <harperchen1110(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
---
include/linux/mm.h | 3 +++
mm/hugetlb.c | 45 +++++++++++++++++++++++++++------------------
mm/memory.c | 21 +++++++++++++++------
3 files changed, 45 insertions(+), 24 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8bbcccbc5565..b19d65c36d14 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3475,4 +3475,7 @@ madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
*/
#define ZAP_FLAG_DROP_MARKER ((__force zap_flags_t) BIT(0))
+/* Set in unmap_vmas() to indicate an unmap call. Only used by hugetlb */
+#define ZAP_FLAG_UNMAP ((__force zap_flags_t) BIT(1))
+
#endif /* _LINUX_MM_H */
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 546df97c31e4..4699889f11e9 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5064,7 +5064,6 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct
struct page *page;
struct hstate *h = hstate_vma(vma);
unsigned long sz = huge_page_size(h);
- struct mmu_notifier_range range;
unsigned long last_addr_mask;
bool force_flush = false;
@@ -5079,13 +5078,6 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct
tlb_change_page_size(tlb, sz);
tlb_start_vma(tlb, vma);
- /*
- * If sharing possible, alert mmu notifiers of worst case.
- */
- mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, mm, start,
- end);
- adjust_range_if_pmd_sharing_possible(vma, &range.start, &range.end);
- mmu_notifier_invalidate_range_start(&range);
last_addr_mask = hugetlb_mask_last_page(h);
address = start;
for (; address < end; address += sz) {
@@ -5174,7 +5166,6 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct
if (ref_page)
break;
}
- mmu_notifier_invalidate_range_end(&range);
tlb_end_vma(tlb, vma);
/*
@@ -5199,32 +5190,50 @@ void __unmap_hugepage_range_final(struct mmu_gather *tlb,
unsigned long end, struct page *ref_page,
zap_flags_t zap_flags)
{
+ bool final = zap_flags & ZAP_FLAG_UNMAP;
+
hugetlb_vma_lock_write(vma);
i_mmap_lock_write(vma->vm_file->f_mapping);
__unmap_hugepage_range(tlb, vma, start, end, ref_page, zap_flags);
/*
- * Unlock and free the vma lock before releasing i_mmap_rwsem. When
- * the vma_lock is freed, this makes the vma ineligible for pmd
- * sharing. And, i_mmap_rwsem is required to set up pmd sharing.
- * This is important as page tables for this unmapped range will
- * be asynchrously deleted. If the page tables are shared, there
- * will be issues when accessed by someone else.
+ * When called via zap_page_range (MADV_DONTNEED), this is not the
+ * final unmap of the vma, and we do not want to delete the vma_lock.
*/
- __hugetlb_vma_unlock_write_free(vma);
-
- i_mmap_unlock_write(vma->vm_file->f_mapping);
+ if (final) {
+ /*
+ * Unlock and free the vma lock before releasing i_mmap_rwsem.
+ * When the vma_lock is freed, this makes the vma ineligible
+ * for pmd sharing. And, i_mmap_rwsem is required to set up
+ * pmd sharing. This is important as page tables for this
+ * unmapped range will be asynchrously deleted. If the page
+ * tables are shared, there will be issues when accessed by
+ * someone else.
+ */
+ __hugetlb_vma_unlock_write_free(vma);
+ i_mmap_unlock_write(vma->vm_file->f_mapping);
+ } else {
+ i_mmap_unlock_write(vma->vm_file->f_mapping);
+ hugetlb_vma_unlock_write(vma);
+ }
}
void unmap_hugepage_range(struct vm_area_struct *vma, unsigned long start,
unsigned long end, struct page *ref_page,
zap_flags_t zap_flags)
{
+ struct mmu_notifier_range range;
struct mmu_gather tlb;
+ mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
+ start, end);
+ adjust_range_if_pmd_sharing_possible(vma, &range.start, &range.end);
tlb_gather_mmu(&tlb, vma->vm_mm);
+
__unmap_hugepage_range(&tlb, vma, start, end, ref_page, zap_flags);
+
+ mmu_notifier_invalidate_range_end(&range);
tlb_finish_mmu(&tlb);
}
diff --git a/mm/memory.c b/mm/memory.c
index f88c351aecd4..474c43156ecf 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1720,7 +1720,7 @@ void unmap_vmas(struct mmu_gather *tlb, struct maple_tree *mt,
{
struct mmu_notifier_range range;
struct zap_details details = {
- .zap_flags = ZAP_FLAG_DROP_MARKER,
+ .zap_flags = ZAP_FLAG_DROP_MARKER | ZAP_FLAG_UNMAP,
/* Careful - we need to zap private pages too! */
.even_cows = true,
};
@@ -1753,15 +1753,24 @@ void zap_page_range(struct vm_area_struct *vma, unsigned long start,
MA_STATE(mas, mt, vma->vm_end, vma->vm_end);
lru_add_drain();
- mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm,
- start, start + size);
tlb_gather_mmu(&tlb, vma->vm_mm);
update_hiwater_rss(vma->vm_mm);
- mmu_notifier_invalidate_range_start(&range);
do {
- unmap_single_vma(&tlb, vma, start, range.end, NULL);
+ mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma,
+ vma->vm_mm,
+ max(start, vma->vm_start),
+ min(end, vma->vm_end));
+ if (is_vm_hugetlb_page(vma))
+ adjust_range_if_pmd_sharing_possible(vma,
+ &range.start, &range.end);
+ mmu_notifier_invalidate_range_start(&range);
+ /*
+ * unmap 'start-end' not 'range.start-range.end' as range
+ * could have been expanded for pmd sharing.
+ */
+ unmap_single_vma(&tlb, vma, start, end, NULL);
+ mmu_notifier_invalidate_range_end(&range);
} while ((vma = mas_find(&mas, end - 1)) != NULL);
- mmu_notifier_invalidate_range_end(&range);
tlb_finish_mmu(&tlb);
}
--
2.37.3
Patch #1 (merged in 5.12-rc3) is required to address the issue
Anders Roxell reported on the list [1]. Patch #2 (in 5.15-rc1) is
a follow up.
[1] https://lore.kernel.org/lkml/20220826120020.GB520@mutt
Anshuman Khandual (1):
arm64/kexec: Test page size support with new TGRAN range values
James Morse (1):
arm64/mm: Fix __enable_mmu() for new TGRAN range values
arch/arm64/include/asm/cpufeature.h | 9 ++++--
arch/arm64/include/asm/sysreg.h | 36 +++++++++++++++--------
arch/arm64/kernel/head.S | 6 ++--
arch/arm64/kvm/reset.c | 10 ++++---
drivers/firmware/efi/libstub/arm64-stub.c | 2 +-
5 files changed, 41 insertions(+), 22 deletions(-)
--
2.33.0