Hello Greg, Sasha,
this is backport of all the recet changes for the Aardvark PCIe controller
for 5.15.
These include
- memory leak fix in driver unbind
- more complete driver unbind
- fixes for MSI support
- add MSI-X support, which fixes support for some cards
- add ERR interrupt support (which we really missed when debugging)
I also included some small cosmetic changes - this will make it easier
to backport next fixes for the driver (there is another batch pending
on linux-pci).
Marek
Marek Behún (5):
PCI: aardvark: Make MSI irq_chip structures static driver structures
PCI: aardvark: Make msi_domain_info structure a static driver
structure
PCI: aardvark: Use dev_fwnode() instead of
of_node_to_fwnode(dev->of_node)
PCI: aardvark: Drop __maybe_unused from advk_pcie_disable_phy()
PCI: aardvark: Update comment about link going down after link-up
Pali Rohár (25):
PCI: pci-bridge-emul: Add description for class_revision field
PCI: pci-bridge-emul: Add definitions for missing capabilities
registers
PCI: aardvark: Add support for DEVCAP2, DEVCTL2, LNKCAP2 and LNKCTL2
registers on emulated bridge
PCI: aardvark: Clear all MSIs at setup
PCI: aardvark: Comment actions in driver remove method
PCI: aardvark: Disable bus mastering when unbinding driver
PCI: aardvark: Mask all interrupts when unbinding driver
PCI: aardvark: Fix memory leak in driver unbind
PCI: aardvark: Assert PERST# when unbinding driver
PCI: aardvark: Disable link training when unbinding driver
PCI: aardvark: Disable common PHY when unbinding driver
PCI: aardvark: Replace custom PCIE_CORE_INT_* macros with
PCI_INTERRUPT_*
PCI: aardvark: Rewrite IRQ code to chained IRQ handler
PCI: aardvark: Check return value of generic_handle_domain_irq() when
processing INTx IRQ
PCI: aardvark: Refactor unmasking summary MSI interrupt
PCI: aardvark: Add support for masking MSI interrupts
PCI: aardvark: Fix setting MSI address
PCI: aardvark: Enable MSI-X support
PCI: aardvark: Add support for ERR interrupt on emulated bridge
PCI: aardvark: Optimize writing PCI_EXP_RTCTL_PMEIE and
PCI_EXP_RTSTA_PME on emulated bridge
PCI: aardvark: Add support for PME interrupts
PCI: aardvark: Fix support for PME requester on emulated bridge
PCI: aardvark: Use separate INTA interrupt for emulated root bridge
PCI: aardvark: Remove irq_mask_ack() callback for INTx interrupts
PCI: aardvark: Don't mask irq when mapping
drivers/pci/controller/pci-aardvark.c | 428 +++++++++++++++++++-------
drivers/pci/pci-bridge-emul.c | 49 ++-
2 files changed, 371 insertions(+), 106 deletions(-)
--
2.35.1
Hello Greg, Sasha,
this is backport of all the recet changes for the Aardvark PCIe controller
for 5.17.
These include
- fixes for MSI support
- add MSI-X support, which fixes support for some cards
- add ERR interrupt support (which we really missed when debugging)
As in series for 5.15, I included some small cosmetic changes - this
will make it easier to backport next fixes for the driver (there is
another batch pending on linux-pci).
Marek
Marek Behún (5):
PCI: aardvark: Make MSI irq_chip structures static driver structures
PCI: aardvark: Make msi_domain_info structure a static driver
structure
PCI: aardvark: Use dev_fwnode() instead of
of_node_to_fwnode(dev->of_node)
PCI: aardvark: Drop __maybe_unused from advk_pcie_disable_phy()
PCI: aardvark: Update comment about link going down after link-up
Pali Rohár (14):
PCI: aardvark: Replace custom PCIE_CORE_INT_* macros with
PCI_INTERRUPT_*
PCI: aardvark: Rewrite IRQ code to chained IRQ handler
PCI: aardvark: Check return value of generic_handle_domain_irq() when
processing INTx IRQ
PCI: aardvark: Refactor unmasking summary MSI interrupt
PCI: aardvark: Add support for masking MSI interrupts
PCI: aardvark: Fix setting MSI address
PCI: aardvark: Enable MSI-X support
PCI: aardvark: Add support for ERR interrupt on emulated bridge
PCI: aardvark: Optimize writing PCI_EXP_RTCTL_PMEIE and
PCI_EXP_RTSTA_PME on emulated bridge
PCI: aardvark: Add support for PME interrupts
PCI: aardvark: Fix support for PME requester on emulated bridge
PCI: aardvark: Use separate INTA interrupt for emulated root bridge
PCI: aardvark: Remove irq_mask_ack() callback for INTx interrupts
PCI: aardvark: Don't mask irq when mapping
drivers/pci/controller/pci-aardvark.c | 367 +++++++++++++++++++-------
1 file changed, 266 insertions(+), 101 deletions(-)
--
2.35.1
This is backport of the patch 9f6dc6337610 ("dm: interlock pending dm_io
and dm_wait_for_bios_completion") for the kernel 4.9.
The bugs fixed by this patch can cause random crashing when reloading dm
table, so it is eligible for stable backport.
Note that the kernel 4.9 uses md->pending to count the number of
in-progress I/Os and md->pending is decremented after dm_stats_account_io,
so the race condition doesn't really exist there (except for missing
smp_rmb()).
The percpu variable md->pending_io is not needed in the stable kernels,
because md->pending counts the same value, so it is not backported.
Signed-off-by: Mikulas Patocka <mpatocka(a)redhat.com>
Reviewed-by: Mike Snitzer <snitzer(a)kernel.org>
---
drivers/md/dm.c | 2 ++
1 file changed, 2 insertions(+)
Index: linux-stable/drivers/md/dm.c
===================================================================
--- linux-stable.orig/drivers/md/dm.c 2022-04-30 19:03:08.000000000 +0200
+++ linux-stable/drivers/md/dm.c 2022-04-30 19:03:46.000000000 +0200
@@ -2027,6 +2027,8 @@ static int dm_wait_for_completion(struct
}
finish_wait(&md->wait, &wait);
+ smp_rmb(); /* paired with atomic_dec_return in end_io_acct */
+
return r;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e914d8f00391520ecc4495dd0ca0124538ab7119 Mon Sep 17 00:00:00 2001
From: Minchan Kim <minchan(a)kernel.org>
Date: Thu, 14 Apr 2022 19:13:46 -0700
Subject: [PATCH] mm: fix unexpected zeroed page mapping with zram swap
Two processes under CLONE_VM cloning, user process can be corrupted by
seeing zeroed page unexpectedly.
CPU A CPU B
do_swap_page do_swap_page
SWP_SYNCHRONOUS_IO path SWP_SYNCHRONOUS_IO path
swap_readpage valid data
swap_slot_free_notify
delete zram entry
swap_readpage zeroed(invalid) data
pte_lock
map the *zero data* to userspace
pte_unlock
pte_lock
if (!pte_same)
goto out_nomap;
pte_unlock
return and next refault will
read zeroed data
The swap_slot_free_notify is bogus for CLONE_VM case since it doesn't
increase the refcount of swap slot at copy_mm so it couldn't catch up
whether it's safe or not to discard data from backing device. In the
case, only the lock it could rely on to synchronize swap slot freeing is
page table lock. Thus, this patch gets rid of the swap_slot_free_notify
function. With this patch, CPU A will see correct data.
CPU A CPU B
do_swap_page do_swap_page
SWP_SYNCHRONOUS_IO path SWP_SYNCHRONOUS_IO path
swap_readpage original data
pte_lock
map the original data
swap_free
swap_range_free
bd_disk->fops->swap_slot_free_notify
swap_readpage read zeroed data
pte_unlock
pte_lock
if (!pte_same)
goto out_nomap;
pte_unlock
return
on next refault will see mapped data by CPU B
The concern of the patch would increase memory consumption since it
could keep wasted memory with compressed form in zram as well as
uncompressed form in address space. However, most of cases of zram uses
no readahead and do_swap_page is followed by swap_free so it will free
the compressed form from in zram quickly.
Link: https://lkml.kernel.org/r/YjTVVxIAsnKAXjTd@google.com
Fixes: 0bcac06f27d7 ("mm, swap: skip swapcache for swapin of synchronous device")
Reported-by: Ivan Babrou <ivan(a)cloudflare.com>
Tested-by: Ivan Babrou <ivan(a)cloudflare.com>
Signed-off-by: Minchan Kim <minchan(a)kernel.org>
Cc: Nitin Gupta <ngupta(a)vflare.org>
Cc: Sergey Senozhatsky <senozhatsky(a)chromium.org>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: <stable(a)vger.kernel.org> [4.14+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/page_io.c b/mm/page_io.c
index b417f000b49e..89fbf3cae30f 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -51,54 +51,6 @@ void end_swap_bio_write(struct bio *bio)
bio_put(bio);
}
-static void swap_slot_free_notify(struct page *page)
-{
- struct swap_info_struct *sis;
- struct gendisk *disk;
- swp_entry_t entry;
-
- /*
- * There is no guarantee that the page is in swap cache - the software
- * suspend code (at least) uses end_swap_bio_read() against a non-
- * swapcache page. So we must check PG_swapcache before proceeding with
- * this optimization.
- */
- if (unlikely(!PageSwapCache(page)))
- return;
-
- sis = page_swap_info(page);
- if (data_race(!(sis->flags & SWP_BLKDEV)))
- return;
-
- /*
- * The swap subsystem performs lazy swap slot freeing,
- * expecting that the page will be swapped out again.
- * So we can avoid an unnecessary write if the page
- * isn't redirtied.
- * This is good for real swap storage because we can
- * reduce unnecessary I/O and enhance wear-leveling
- * if an SSD is used as the as swap device.
- * But if in-memory swap device (eg zram) is used,
- * this causes a duplicated copy between uncompressed
- * data in VM-owned memory and compressed data in
- * zram-owned memory. So let's free zram-owned memory
- * and make the VM-owned decompressed page *dirty*,
- * so the page should be swapped out somewhere again if
- * we again wish to reclaim it.
- */
- disk = sis->bdev->bd_disk;
- entry.val = page_private(page);
- if (disk->fops->swap_slot_free_notify && __swap_count(entry) == 1) {
- unsigned long offset;
-
- offset = swp_offset(entry);
-
- SetPageDirty(page);
- disk->fops->swap_slot_free_notify(sis->bdev,
- offset);
- }
-}
-
static void end_swap_bio_read(struct bio *bio)
{
struct page *page = bio_first_page_all(bio);
@@ -114,7 +66,6 @@ static void end_swap_bio_read(struct bio *bio)
}
SetPageUptodate(page);
- swap_slot_free_notify(page);
out:
unlock_page(page);
WRITE_ONCE(bio->bi_private, NULL);
@@ -394,11 +345,6 @@ int swap_readpage(struct page *page, bool synchronous)
if (sis->flags & SWP_SYNCHRONOUS_IO) {
ret = bdev_read_page(sis->bdev, swap_page_sector(page), page);
if (!ret) {
- if (trylock_page(page)) {
- swap_slot_free_notify(page);
- unlock_page(page);
- }
-
count_vm_event(PSWPIN);
goto out;
}