commit f18139966d072dab8e4398c95ce955a9742e04f7 upstream.
Trying to start a new PIO transfer by writing value 0 in PIO_START register
when previous transfer has not yet completed (which is indicated by value 1
in PIO_START) causes an External Abort on CPU, which results in kernel
panic:
SError Interrupt on CPU0, code 0xbf000002 -- SError
Kernel panic - not syncing: Asynchronous SError Interrupt
To prevent kernel panic, it is required to reject a new PIO transfer when
previous one has not finished yet.
If previous PIO transfer is not finished yet, the kernel may issue a new
PIO request only if the previous PIO transfer timed out.
In the past the root cause of this issue was incorrectly identified (as it
often happens during link retraining or after link down event) and special
hack was implemented in Trusted Firmware to catch all SError events in EL3,
to ignore errors with code 0xbf000002 and not forwarding any other errors
to kernel and instead throw panic from EL3 Trusted Firmware handler.
Links to discussion and patches about this issue:
https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/commit/?id=3c7d…https://lore.kernel.org/linux-pci/20190316161243.29517-1-repk@triplefau.lt/https://lore.kernel.org/linux-pci/971be151d24312cc533989a64bd454b4@www.loen…https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/1541
But the real cause was the fact that during link retraining or after link
down event the PIO transfer may take longer time, up to the 1.44s until it
times out. This increased probability that a new PIO transfer would be
issued by kernel while previous one has not finished yet.
After applying this change into the kernel, it is possible to revert the
mentioned TF-A hack and SError events do not have to be caught in TF-A EL3.
Link: https://lore.kernel.org/r/20210608203655.31228-1-pali@kernel.org
Signed-off-by: Pali Rohár <pali(a)kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi(a)arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Reviewed-by: Marek Behún <kabel(a)kernel.org>
Cc: stable(a)vger.kernel.org # 7fbcb5da811b ("PCI: aardvark: Don't rely on jiffies while holding spinlock")
[pali: Backported to 4.19 version]
---
This patch is backported to 4.19 version. It depends on commit
7fbcb5da811b as presented on Cc: stable line.
---
drivers/pci/controller/pci-aardvark.c | 49 ++++++++++++++++++++++-----
1 file changed, 40 insertions(+), 9 deletions(-)
diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 524e0fb3b062..947f60ba5b75 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -382,7 +382,7 @@ static int advk_pcie_wait_pio(struct advk_pcie *pcie)
udelay(PIO_RETRY_DELAY);
}
- dev_err(dev, "config read/write timed out\n");
+ dev_err(dev, "PIO read/write transfer time out\n");
return -ETIMEDOUT;
}
@@ -395,6 +395,35 @@ static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus,
return true;
}
+static bool advk_pcie_pio_is_running(struct advk_pcie *pcie)
+{
+ struct device *dev = &pcie->pdev->dev;
+
+ /*
+ * Trying to start a new PIO transfer when previous has not completed
+ * cause External Abort on CPU which results in kernel panic:
+ *
+ * SError Interrupt on CPU0, code 0xbf000002 -- SError
+ * Kernel panic - not syncing: Asynchronous SError Interrupt
+ *
+ * Functions advk_pcie_rd_conf() and advk_pcie_wr_conf() are protected
+ * by raw_spin_lock_irqsave() at pci_lock_config() level to prevent
+ * concurrent calls at the same time. But because PIO transfer may take
+ * about 1.5s when link is down or card is disconnected, it means that
+ * advk_pcie_wait_pio() does not always have to wait for completion.
+ *
+ * Some versions of ARM Trusted Firmware handles this External Abort at
+ * EL3 level and mask it to prevent kernel panic. Relevant TF-A commit:
+ * https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/commit/?id=3c7d…
+ */
+ if (advk_readl(pcie, PIO_START)) {
+ dev_err(dev, "Previous PIO read/write transfer is still running\n");
+ return true;
+ }
+
+ return false;
+}
+
static int advk_pcie_rd_conf(struct pci_bus *bus, u32 devfn,
int where, int size, u32 *val)
{
@@ -407,9 +436,10 @@ static int advk_pcie_rd_conf(struct pci_bus *bus, u32 devfn,
return PCIBIOS_DEVICE_NOT_FOUND;
}
- /* Start PIO */
- advk_writel(pcie, 0, PIO_START);
- advk_writel(pcie, 1, PIO_ISR);
+ if (advk_pcie_pio_is_running(pcie)) {
+ *val = 0xffffffff;
+ return PCIBIOS_SET_FAILED;
+ }
/* Program the control register */
reg = advk_readl(pcie, PIO_CTRL);
@@ -428,7 +458,8 @@ static int advk_pcie_rd_conf(struct pci_bus *bus, u32 devfn,
/* Program the data strobe */
advk_writel(pcie, 0xf, PIO_WR_DATA_STRB);
- /* Start the transfer */
+ /* Clear PIO DONE ISR and start the transfer */
+ advk_writel(pcie, 1, PIO_ISR);
advk_writel(pcie, 1, PIO_START);
ret = advk_pcie_wait_pio(pcie);
@@ -462,9 +493,8 @@ static int advk_pcie_wr_conf(struct pci_bus *bus, u32 devfn,
if (where % size)
return PCIBIOS_SET_FAILED;
- /* Start PIO */
- advk_writel(pcie, 0, PIO_START);
- advk_writel(pcie, 1, PIO_ISR);
+ if (advk_pcie_pio_is_running(pcie))
+ return PCIBIOS_SET_FAILED;
/* Program the control register */
reg = advk_readl(pcie, PIO_CTRL);
@@ -491,7 +521,8 @@ static int advk_pcie_wr_conf(struct pci_bus *bus, u32 devfn,
/* Program the data strobe */
advk_writel(pcie, data_strobe, PIO_WR_DATA_STRB);
- /* Start the transfer */
+ /* Clear PIO DONE ISR and start the transfer */
+ advk_writel(pcie, 1, PIO_ISR);
advk_writel(pcie, 1, PIO_START);
ret = advk_pcie_wait_pio(pcie);
--
2.20.1
From: Remi Pommarel <repk(a)triplefau.lt>
commit 7fbcb5da811be7d47468417c7795405058abb3da upstream.
advk_pcie_wait_pio() can be called while holding a spinlock (from
pci_bus_read_config_dword()), then depends on jiffies in order to
timeout while polling on PIO state registers. In the case the PIO
transaction failed, the timeout will never happen and will also cause
the cpu to stall.
This decrements a variable and wait instead of using jiffies.
Signed-off-by: Remi Pommarel <repk(a)triplefau.lt>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi(a)arm.com>
Reviewed-by: Andrew Murray <andrew.murray(a)arm.com>
Acked-by: Thomas Petazzoni <thomas.petazzoni(a)bootlin.com>
---
This is backport to 4.14 kernel. Backport of upstream commit can be done
automatically by git cherry-pick command if merge.renamelimit variable is
set to at least 12711.
---
drivers/pci/host/pci-aardvark.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/pci/host/pci-aardvark.c b/drivers/pci/host/pci-aardvark.c
index c1db09fbbe04..5dbbf3d7de36 100644
--- a/drivers/pci/host/pci-aardvark.c
+++ b/drivers/pci/host/pci-aardvark.c
@@ -185,7 +185,8 @@
(PCIE_CONF_BUS(bus) | PCIE_CONF_DEV(PCI_SLOT(devfn)) | \
PCIE_CONF_FUNC(PCI_FUNC(devfn)) | PCIE_CONF_REG(where))
-#define PIO_TIMEOUT_MS 1
+#define PIO_RETRY_CNT 500
+#define PIO_RETRY_DELAY 2 /* 2 us*/
#define LINK_WAIT_MAX_RETRIES 10
#define LINK_WAIT_USLEEP_MIN 90000
@@ -413,17 +414,16 @@ static void advk_pcie_check_pio_status(struct advk_pcie *pcie)
static int advk_pcie_wait_pio(struct advk_pcie *pcie)
{
struct device *dev = &pcie->pdev->dev;
- unsigned long timeout;
-
- timeout = jiffies + msecs_to_jiffies(PIO_TIMEOUT_MS);
+ int i;
- while (time_before(jiffies, timeout)) {
+ for (i = 0; i < PIO_RETRY_CNT; i++) {
u32 start, isr;
start = advk_readl(pcie, PIO_START);
isr = advk_readl(pcie, PIO_ISR);
if (!start && isr)
return 0;
+ udelay(PIO_RETRY_DELAY);
}
dev_err(dev, "config read/write timed out\n");
--
2.20.1
From: Eric Biggers <ebiggers(a)google.com>
commit 77f30bfcfcf484da7208affd6a9e63406420bf91 upstream.
[Please apply to 4.9-stable.]
When initializing a no-key name, fscrypt_fname_disk_to_usr() sets the
minor_hash to 0 if the (major) hash is 0.
This doesn't make sense because 0 is a valid hash code, so we shouldn't
ignore the filesystem-provided minor_hash in that case. Fix this by
removing the special case for 'hash == 0'.
This is an old bug that appears to have originated when the encryption
code in ext4 and f2fs was moved into fs/crypto/. The original ext4 and
f2fs code passed the hash by pointer instead of by value. So
'if (hash)' actually made sense then, as it was checking whether a
pointer was NULL. But now the hashes are passed by value, and
filesystems just pass 0 for any hashes they don't have. There is no
need to handle this any differently from the hashes actually being 0.
It is difficult to reproduce this bug, as it only made a difference in
the case where a filename's 32-bit major hash happened to be 0.
However, it probably had the largest chance of causing problems on
ubifs, since ubifs uses minor_hash to do lookups of no-key names, in
addition to using it as a readdir cookie. ext4 only uses minor_hash as
a readdir cookie, and f2fs doesn't use minor_hash at all.
Fixes: 0b81d0779072 ("fs crypto: move per-file encryption from f2fs tree to fs/crypto")
Cc: <stable(a)vger.kernel.org> # v4.6+
Link: https://lore.kernel.org/r/20210527235236.2376556-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
fs/crypto/fname.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/fs/crypto/fname.c b/fs/crypto/fname.c
index e14bb7b67e9c..5136ea195934 100644
--- a/fs/crypto/fname.c
+++ b/fs/crypto/fname.c
@@ -294,12 +294,8 @@ int fscrypt_fname_disk_to_usr(struct inode *inode,
oname->name);
return 0;
}
- if (hash) {
- memcpy(buf, &hash, 4);
- memcpy(buf + 4, &minor_hash, 4);
- } else {
- memset(buf, 0, 8);
- }
+ memcpy(buf, &hash, 4);
+ memcpy(buf + 4, &minor_hash, 4);
memcpy(buf + 8, iname->name + ((iname->len - 17) & ~15), 16);
oname->name[0] = '_';
oname->len = 1 + digest_encode(buf, 24, oname->name + 1);
--
2.32.0
Commit e3d4030498c3 ("mac80211: do not accept/forward invalid EAPOL
frames") uses skb_mac_header() before eth_type_trans() is called
leading to incorrect pointer, the pointer gets written to. This issue
has appeared during backporting to 4.4, 4.9 and 4.14.
Fixes: e3d4030498c3 ("mac80211: do not accept/forward invalid EAPOL frames")
Link: https://lore.kernel.org/r/CAHQn7pKcyC_jYmGyTcPCdk9xxATwW5QPNph=bsZV8d-HPwNs…
Cc: <stable(a)vger.kernel.org> # 4.4.x
Signed-off-by: Davis Mosenkovs <davis(a)mosenkovs.lv>
---
net/mac80211/rx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
index bde924968cd2..b5848bcc09eb 100644
--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -2234,7 +2234,7 @@ ieee80211_deliver_skb(struct ieee80211_rx_data *rx)
#endif
if (skb) {
- struct ethhdr *ehdr = (void *)skb_mac_header(skb);
+ struct ethhdr *ehdr = (struct ethhdr *)skb->data;
/* deliver to local stack */
skb->protocol = eth_type_trans(skb, dev);
--
2.25.1
[ Upstream commit 35cba15a504bf4f585bb9d78f47b22b28a1a06b2 ]
Use devm_platform_get_and_ioremap_resource() to simplify
code and avoid a null-ptr-deref by checking 'res' in it.
[yyl: since devm_platform_get_and_ioremap_resource() is introduced
in linux-5.7, so just check the return value after calling
platform_get_resource()]
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
---
drivers/net/ethernet/moxa/moxart_ether.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/moxa/moxart_ether.c b/drivers/net/ethernet/moxa/moxart_ether.c
index f70bb81e1ed65..caf7051302725 100644
--- a/drivers/net/ethernet/moxa/moxart_ether.c
+++ b/drivers/net/ethernet/moxa/moxart_ether.c
@@ -481,6 +481,10 @@ static int moxart_mac_probe(struct platform_device *pdev)
priv->pdev = pdev;
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ if (!res) {
+ ret = -EINVAL;
+ goto init_fail;
+ }
ndev->base_addr = res->start;
priv->base = devm_ioremap_resource(p_dev, res);
if (IS_ERR(priv->base)) {
--
2.25.1
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 474dd1c6506411752a9b2f2233eec11f1733a099 Mon Sep 17 00:00:00 2001
From: Lu Baolu <baolu.lu(a)linux.intel.com>
Date: Mon, 12 Jul 2021 15:17:12 +0800
Subject: [PATCH] iommu/vt-d: Fix clearing real DMA device's scalable-mode
context entries
The commit 2b0140c69637e ("iommu/vt-d: Use pci_real_dma_dev() for mapping")
fixes an issue of "sub-device is removed where the context entry is cleared
for all aliases". But this commit didn't consider the PASID entry and PASID
table in VT-d scalable mode. This fix increases the coverage of scalable
mode.
Suggested-by: Sanjay Kumar <sanjay.k.kumar(a)intel.com>
Fixes: 8038bdb855331 ("iommu/vt-d: Only clear real DMA device's context entries")
Fixes: 2b0140c69637e ("iommu/vt-d: Use pci_real_dma_dev() for mapping")
Cc: stable(a)vger.kernel.org # v5.6+
Cc: Jon Derrick <jonathan.derrick(a)intel.com>
Signed-off-by: Lu Baolu <baolu.lu(a)linux.intel.com>
Link: https://lore.kernel.org/r/20210712071712.3416949-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel(a)suse.de>
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 57270290d62b..dd22fc7d5176 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4472,14 +4472,13 @@ static void __dmar_remove_one_dev_info(struct device_domain_info *info)
iommu = info->iommu;
domain = info->domain;
- if (info->dev) {
+ if (info->dev && !dev_is_real_dma_subdevice(info->dev)) {
if (dev_is_pci(info->dev) && sm_supported(iommu))
intel_pasid_tear_down_entry(iommu, info->dev,
PASID_RID2PASID, false);
iommu_disable_dev_iotlb(info);
- if (!dev_is_real_dma_subdevice(info->dev))
- domain_context_clear(info);
+ domain_context_clear(info);
intel_pasid_free_table(info->dev);
}
The patch below does not apply to the 5.13-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 5f93e776c6734cea989aeb4f2d6c97e521baa683 Mon Sep 17 00:00:00 2001
From: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Date: Tue, 29 Jun 2021 03:16:46 +0900
Subject: [PATCH] btrfs: zoned: print unusable percentage when reclaiming block
groups
When we're automatically reclaiming a zone, because its zone_unusable
value is above the reclaim threshold, we're only logging how much
percent of the zone's capacity are used, but not how much of the
capacity is unusable.
Also print the percentage of the unusable space in the block group
before we're reclaiming it.
Example:
BTRFS info (device sdg): reclaiming chunk 230686720 with 13% used 86% unusable
CC: stable(a)vger.kernel.org # 5.13
Signed-off-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index d4c8dc550889..fec7a34b27f3 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -1501,6 +1501,7 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
mutex_lock(&fs_info->reclaim_bgs_lock);
spin_lock(&fs_info->unused_bgs_lock);
while (!list_empty(&fs_info->reclaim_bgs)) {
+ u64 zone_unusable;
int ret = 0;
bg = list_first_entry(&fs_info->reclaim_bgs,
@@ -1534,13 +1535,22 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
goto next;
}
+ /*
+ * Cache the zone_unusable value before turning the block group
+ * to read only. As soon as the blog group is read only it's
+ * zone_unusable value gets moved to the block group's read-only
+ * bytes and isn't available for calculations anymore.
+ */
+ zone_unusable = bg->zone_unusable;
ret = inc_block_group_ro(bg, 0);
up_write(&space_info->groups_sem);
if (ret < 0)
goto next;
- btrfs_info(fs_info, "reclaiming chunk %llu with %llu%% used",
- bg->start, div64_u64(bg->used * 100, bg->length));
+ btrfs_info(fs_info,
+ "reclaiming chunk %llu with %llu%% used %llu%% unusable",
+ bg->start, div_u64(bg->used * 100, bg->length),
+ div64_u64(zone_unusable * 100, bg->length));
trace_btrfs_reclaim_block_group(bg);
ret = btrfs_relocate_chunk(fs_info, bg->start);
if (ret)
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 04bef83a3358946bfc98a5ecebd1b0003d83d882 Mon Sep 17 00:00:00 2001
From: Nikolay Aleksandrov <nikolay(a)nvidia.com>
Date: Sun, 11 Jul 2021 12:56:28 +0300
Subject: [PATCH] net: bridge: multicast: fix PIM hello router port marking
race
When a PIM hello packet is received on a bridge port with multicast
snooping enabled, we mark it as a router port automatically, that
includes adding that port the router port list. The multicast lock
protects that list, but it is not acquired in the PIM message case
leading to a race condition, we need to take it to fix the race.
Cc: stable(a)vger.kernel.org
Fixes: 91b02d3d133b ("bridge: mcast: add router port on PIM hello message")
Signed-off-by: Nikolay Aleksandrov <nikolay(a)nvidia.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index 53c3a9d80d9c..3bbbc6d7b7c3 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -3264,7 +3264,9 @@ static void br_multicast_pim(struct net_bridge *br,
pim_hdr_type(pimhdr) != PIM_TYPE_HELLO)
return;
+ spin_lock(&br->multicast_lock);
br_ip4_multicast_mark_router(br, port);
+ spin_unlock(&br->multicast_lock);
}
static int br_ip4_multicast_mrd_rcv(struct net_bridge *br,