The way cookie_init_hw_msi_region() allocates the iommu_dma_msi_page
structures doesn't match the way iommu_put_dma_cookie() frees them.
The former performs a single allocation of all the required structures,
while the latter tries to free them one at a time. It doesn't quite
work for the main use case (the GICv3 ITS where the range is 64kB)
when the base granule size is 4kB.
This leads to a nice slab corruption on teardown, which is easily
observable by simply creating a VF on a SRIOV-capable device, and
tearing it down immediately (no need to even make use of it).
Fix it by allocating iommu_dma_msi_page structures one at a time.
Fixes: 7c1b058c8b5a3 ("iommu/dma: Handle IOMMU API reserved regions")
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Cc: Robin Murphy <robin.murphy(a)arm.com>
Cc: Joerg Roedel <jroedel(a)suse.de>
Cc: Eric Auger <eric.auger(a)redhat.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: stable(a)vger.kernel.org
---
drivers/iommu/dma-iommu.c | 36 ++++++++++++++++++++++++------------
1 file changed, 24 insertions(+), 12 deletions(-)
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index a2e96a5fd9a7..01fa64856c12 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -171,25 +171,37 @@ static int cookie_init_hw_msi_region(struct iommu_dma_cookie *cookie,
phys_addr_t start, phys_addr_t end)
{
struct iova_domain *iovad = &cookie->iovad;
- struct iommu_dma_msi_page *msi_page;
- int i, num_pages;
+ struct iommu_dma_msi_page *msi_page, *tmp;
+ int i, num_pages, ret = 0;
+ phys_addr_t base;
- start -= iova_offset(iovad, start);
+ base = start -= iova_offset(iovad, start);
num_pages = iova_align(iovad, end - start) >> iova_shift(iovad);
- msi_page = kcalloc(num_pages, sizeof(*msi_page), GFP_KERNEL);
- if (!msi_page)
- return -ENOMEM;
-
for (i = 0; i < num_pages; i++) {
- msi_page[i].phys = start;
- msi_page[i].iova = start;
- INIT_LIST_HEAD(&msi_page[i].list);
- list_add(&msi_page[i].list, &cookie->msi_page_list);
+ msi_page = kmalloc(sizeof(*msi_page), GFP_KERNEL);
+ if (!msi_page) {
+ ret = -ENOMEM;
+ break;
+ }
+ msi_page->phys = start;
+ msi_page->iova = start;
+ INIT_LIST_HEAD(&msi_page->list);
+ list_add(&msi_page->list, &cookie->msi_page_list);
start += iovad->granule;
}
- return 0;
+ if (ret) {
+ list_for_each_entry_safe(msi_page, tmp,
+ &cookie->msi_page_list, list) {
+ if (msi_page->phys >= base && msi_page->phys < start) {
+ list_del(&msi_page->list);
+ kfree(msi_page);
+ }
+ }
+ }
+
+ return ret;
}
static int iova_reserve_pci_windows(struct pci_dev *dev,
--
2.20.1
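The allocation mismatch described in the commit message above can be illustrated outside the kernel. What follows is only a userspace sketch, not the kernel code: struct msi_page, reserve_range() and free_all() are hypothetical stand-ins, and malloc()/free() take the place of kmalloc()/kfree(). The rollback here unwinds to the list head saved on entry, which differs from the physical-address range check the patch uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct iommu_dma_msi_page. */
struct msi_page {
	unsigned long phys;
	struct msi_page *next;
};

/*
 * Add one node per granule, each from its own allocation.
 * On failure, free only the nodes added by this call.
 * Returns 0 on success, -1 on allocation failure.
 */
static int reserve_range(struct msi_page **head, unsigned long start,
			 size_t num, unsigned long granule)
{
	struct msi_page *old_head;
	size_t i;

	assert(head != NULL);
	old_head = *head;

	for (i = 0; i < num; i++) {
		struct msi_page *p = malloc(sizeof(*p));

		if (!p) {
			/* Roll back: new nodes sit in front of old_head. */
			while (*head != old_head) {
				struct msi_page *tmp = *head;

				*head = tmp->next;
				free(tmp);
			}
			return -1;
		}
		p->phys = start + i * granule;
		p->next = *head;
		*head = p;
	}
	return 0;
}

/*
 * Teardown frees node by node. This is only valid because every
 * node came from its own allocation; with one array allocation,
 * freeing element 1..n-1 individually would corrupt the heap.
 */
static size_t free_all(struct msi_page **head)
{
	size_t n = 0;

	while (*head) {
		struct msi_page *tmp = *head;

		*head = tmp->next;
		free(tmp);
		n++;
	}
	return n;
}
```

With a 64kB range and a 4kB granule, reserve_range(&head, base, 16, 4096) creates 16 nodes, and free_all() releases each of them with a matching free().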
[Posting again after correcting CC’ed e-mail id]
These patches include a few backported fixes for the 4.4 stable
tree.
I would appreciate it if you could kindly consider including them in the
next release.
Ajay
---
[Changes from v3]:
- Dropped [Patch v3 8/8] [2] as patches 1-7 are independent from patch 8
and patch 8 may require more work.
[Changes from v2]:
Merged following changes from Vlastimil's series [1]:
- Added page_ref_count() in [Patch v3 5/8]
- Added missing refcount overflow checks on x86 and s390 [Patch v3 5/8]
- Added [Patch v3 8/8]
- Removed 7aef4172c795 i.e. [Patch v2 3/8]
[1] https://lore.kernel.org/stable/20191108093814.16032-1-vbabka@suse.cz/
[2] https://lore.kernel.org/stable/1576529149-14269-9-git-send-email-akaher@vmw…
---
[PATCH v4 1/7]:
Backporting of upstream commit f958d7b528b1:
mm: make page ref count overflow check tighter and more explicit
[PATCH v4 2/7]:
Backporting of upstream commit 88b1a17dfc3e:
mm: add 'try_get_page()' helper function
[PATCH v4 3/7]:
Backporting of upstream commit a3e328556d41:
mm, gup: remove broken VM_BUG_ON_PAGE compound check for hugepages
[PATCH v4 4/7]:
Backporting of upstream commit d63206ee32b6:
mm, gup: ensure real head page is ref-counted when using hugepages
[PATCH v4 5/7]:
Backporting of upstream commit 8fde12ca79af:
mm: prevent get_user_pages() from overflowing page refcount
[PATCH v4 6/7]:
Backporting of upstream commit 7bf2d1df8082:
pipe: add pipe_buf_get() helper
[PATCH v4 7/7]:
Backporting of upstream commit 15fab63e1e57:
fs: prevent page refcount overflow in pipe_buf_get
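For readers unfamiliar with the series, the idea behind a try_get_page()-style helper can be sketched in userspace. This is a simplified model under stated assumptions, not the backported code: struct fake_page and try_get_fake_page() are hypothetical names, and the counter is stored as an unsigned int so that wrap-around is well defined in plain C. A value with the top bit set stands for a count that would read as negative in the kernel.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a page with a wrapping reference counter. */
struct fake_page {
	unsigned int refcount;
};

/*
 * Take a reference only if the current count is sane.
 * A count of zero, or one with the top bit set (negative when
 * read as a signed int), is treated as overflowed or invalid,
 * and the caller gets no reference.
 */
static bool try_get_fake_page(struct fake_page *page)
{
	assert(page != NULL);

	if (page->refcount == 0 || page->refcount > (unsigned int)INT_MAX)
		return false;

	page->refcount++;
	return true;
}
```

A caller that gets false back has to fail its own operation instead of proceeding with an unreferenced page.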
Hi Greg, Sasha,
Can you please include this perf stat fix in 4.19?
These two commits are needed:
commit eb08d006054e7e374592068919e32579988602d4
Author: Ravi Bangoria <ravi.bangoria(a)linux.ibm.com>
Date: Thu Nov 15 15:25:32 2018 +0530
perf stat: Use perf_evsel__is_clocki() for clock events
commit 57ddf09173c1e7d0511ead8924675c7198e56545
Author: Ravi Bangoria <ravi.bangoria(a)linux.ibm.com>
Date: Fri Nov 16 09:58:43 2018 +0530
perf stat: Fix shadow stats for clock events
-Tommi
From: Jason Wang <jasowang(a)redhat.com>
[ Upstream commit 2f3ab6221e4c87960347d65c7cab9bd917d1f637 ]
When link is down, writes to the device might fail with
-EIO. Userspace needs an indication when the status is resolved. As a
fix, tun_net_open() attempts to wake up writers - but that is only
effective if SOCKWQ_ASYNC_NOSPACE has been set in the past. This is
not the case for vhost_net, which only polls for EPOLLOUT after it meets
errors during sendmsg().
This patch fixes this by making sure SOCKWQ_ASYNC_NOSPACE is set when
socket is not writable or device is down to guarantee EPOLLOUT will be
raised in either tun_chr_poll() or tun_sock_write_space() after device
is up.
Cc: Hannes Frederic Sowa <hannes(a)stressinduktion.org>
Cc: Eric Dumazet <edumazet(a)google.com>
Fixes: 1bd4978a88ac2 ("tun: honor IFF_UP in tun_get_user()")
Signed-off-by: Jason Wang <jasowang(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Tommi Rantala <tommi.t.rantala(a)nokia.com>
---
drivers/net/tun.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 3086211829a7..ba34f61d70de 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1134,6 +1134,13 @@ static void tun_net_init(struct net_device *dev)
dev->max_mtu = MAX_MTU - dev->hard_header_len;
}
+static bool tun_sock_writeable(struct tun_struct *tun, struct tun_file *tfile)
+{
+ struct sock *sk = tfile->socket.sk;
+
+ return (tun->dev->flags & IFF_UP) && sock_writeable(sk);
+}
+
/* Character device part */
/* Poll */
@@ -1156,10 +1163,14 @@ static unsigned int tun_chr_poll(struct file *file, poll_table *wait)
if (!skb_array_empty(&tfile->tx_array))
mask |= POLLIN | POLLRDNORM;
- if (tun->dev->flags & IFF_UP &&
- (sock_writeable(sk) ||
- (!test_and_set_bit(SOCKWQ_ASYNC_NOSPACE, &sk->sk_socket->flags) &&
- sock_writeable(sk))))
+ /* Make sure SOCKWQ_ASYNC_NOSPACE is set if not writable to
+ * guarantee EPOLLOUT to be raised by either here or
+ * tun_sock_write_space(). Then process could get notification
+ * after it writes to a down device and meets -EIO.
+ */
+ if (tun_sock_writeable(tun, tfile) ||
+ (!test_and_set_bit(SOCKWQ_ASYNC_NOSPACE, &sk->sk_socket->flags) &&
+ tun_sock_writeable(tun, tfile)))
mask |= POLLOUT | POLLWRNORM;
if (tun->dev->reg_state != NETREG_REGISTERED)
--
2.21.1
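The "check, arm the flag, check again" logic in the tun_chr_poll() hunk above can be modelled in userspace. This is a sketch only: struct fake_tun and both helpers are hypothetical, the two bool fields stand in for IFF_UP and sock_writeable(), and a C11 atomic_bool stands in for the SOCKWQ_ASYNC_NOSPACE bit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state tun_chr_poll() looks at. */
struct fake_tun {
	bool up;             /* models tun->dev->flags & IFF_UP */
	bool has_space;      /* models sock_writeable(sk) */
	atomic_bool nospace; /* models SOCKWQ_ASYNC_NOSPACE */
};

/* Writable only when the device is up AND the socket has room. */
static bool fake_writeable(const struct fake_tun *t)
{
	return t->up && t->has_space;
}

/*
 * Returns true when the poller may report "writable".
 * If not writable, arm the flag so that a later state change
 * produces a wakeup, then re-check to close the window between
 * the first test and arming the flag.
 */
static bool fake_poll_out(struct fake_tun *t)
{
	assert(t != NULL);

	if (fake_writeable(t))
		return true;

	/* atomic_exchange() returns the previous flag value. */
	if (!atomic_exchange(&t->nospace, true) && fake_writeable(t))
		return true;

	return false;
}
```

Polling a device that is down leaves the flag armed, which is the state the commit message requires before a wakeup can be delivered once the device comes up.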
After including 3b5a39979daf ("slip: Fix memory leak in slip_open error path")
and e58c19124189 ("slip: Fix use-after-free Read in slip_open") in 4.4.y/4.9.y,
we trigger a bug, since sl->dev can be freed twice in slip_open. Strictly speaking,
we should also backport cf124db566e6 ("net: Fix inconsistent teardown and release
of private netdev state."), since it removed free_netdev from sl_free_netdev.
Fix it by removing free_netdev from slip_open.
Signed-off-by: yangerkun <yangerkun(a)huawei.com>
---
drivers/net/slip/slip.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/net/slip/slip.c b/drivers/net/slip/slip.c
index 0f8d5609ed51..d4a33baa33b6 100644
--- a/drivers/net/slip/slip.c
+++ b/drivers/net/slip/slip.c
@@ -868,7 +868,6 @@ err_free_chan:
tty->disc_data = NULL;
clear_bit(SLF_INUSE, &sl->flags);
sl_free_netdev(sl->dev);
- free_netdev(sl->dev);
err_exit:
rtnl_unlock();
--
2.23.0.rc2.8.gff66981f45
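The double free described above comes down to two callers both believing they own the release. A userspace sketch of that ownership problem follows; all names are hypothetical, and the model only counts release calls instead of really freeing memory, so the result stays observable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device that records how often it was released. */
struct fake_dev {
	int free_calls;
};

/* Models free_netdev(): counts the call instead of freeing. */
static void fake_free_netdev(struct fake_dev *d)
{
	assert(d != NULL);
	d->free_calls++;
}

/*
 * Models sl_free_netdev() as it behaves in 4.4.y/4.9.y per the
 * description above: it already releases the device itself.
 */
static void fake_sl_free_netdev(struct fake_dev *d)
{
	fake_free_netdev(d);
}

/* Error path before the fix: a second release follows. */
static void error_path_buggy(struct fake_dev *d)
{
	fake_sl_free_netdev(d);
	fake_free_netdev(d);
}

/* Error path after the fix: the helper is the only owner. */
static void error_path_fixed(struct fake_dev *d)
{
	fake_sl_free_netdev(d);
}
```

In this model the buggy path records two releases for one device and the fixed path records exactly one.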
Hi,
I still see high (up to 30%) ksoftirqd CPU use with 4.19.101+ after these 2 backports went in for 4.19.101
(had all 4 backports applied earlier to our tree):
commit f6783319737f28e4436a69611853a5a098cbe974 sched/fair: Fix insertion in rq->leaf_cfs_rq_list
commit 5d299eabea5a251fbf66e8277704b874bbba92dc sched/fair: Add tmp_alone_branch assertion
perf shows for any given ksoftirqd, with 20k-30k processes on the system with high scheduler load:
58.88% ksoftirqd/0 [kernel.kallsyms] [k] update_blocked_averages
Can we backport these 2 as well? I have confirmed that they fix this ksoftirqd behavior.
commit 039ae8bcf7a5f4476f4487e6bf816885fb3fb617 upstream
commit 31bc6aeaab1d1de8959b67edbed5c7a4b3cdbe7c upstream
The second one doesn’t apply cleanly: the last hunk needs a small change, where pelt has to be renamed to task (cfs_rq_clock_pelt was cfs_rq_clock_task in 4.19.y) in the unchanged context of the diff:
@@ -7700,10 +7720,6 @@ static void update_blocked_averages(int cpu)
for_each_leaf_cfs_rq(rq, cfs_rq) {
struct sched_entity *se;
- /* throttled entities do not contribute to load */
- if (throttled_hierarchy(cfs_rq))
- continue;
-
if (update_cfs_rq_load_avg(cfs_rq_clock_pelt(cfs_rq), cfs_rq))
update_tg_load_avg(cfs_rq, 0);
I can post that patch with that rename if required.
Thanks
Vishnu
Address the Coverity complaint below (Feb 25, 2020, 8:06 AM CET):
*** CID 1458999: Error handling issues (CHECKED_RETURN)
/drivers/usb/core/hub.c: 1869 in hub_probe()
1863
1864 if (id->driver_info & HUB_QUIRK_CHECK_PORT_AUTOSUSPEND)
1865 hub->quirk_check_port_auto_suspend = 1;
1866
1867 if (id->driver_info & HUB_QUIRK_DISABLE_AUTOSUSPEND) {
1868 hub->quirk_disable_autosuspend = 1;
>>> CID 1458999: Error handling issues (CHECKED_RETURN)
>>> Calling "usb_autopm_get_interface" without checking return value (as is done elsewhere 97 out of 111 times).
1869 usb_autopm_get_interface(intf);
1870 }
1871
1872 if (hub_configure(hub, &desc->endpoint[0].desc) >= 0)
1873 return 0;
1874
Rather than checking the return value of 'usb_autopm_get_interface()',
switch to the usb_autopm_get_interface_no_resume() API, as per:
On Tue, Feb 25, 2020 at 10:32:32AM -0500, Alan Stern wrote:
------ 8< ------
> This change (i.e. 'ret = usb_autopm_get_interface') is not necessary,
> because the resume operation cannot fail at this point (interfaces
> are always powered-up during probe). A better solution would be to
> call usb_autopm_get_interface_no_resume() instead.
------ 8< ------
Fixes: 1208f9e1d758c9 ("USB: hub: Fix the broken detection of USB3 device in SMSC hub")
Cc: Hardik Gajjar <hgajjar(a)de.adit-jv.com>
Cc: Alan Stern <stern(a)rowland.harvard.edu>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: stable(a)vger.kernel.org # v4.14+
Reported-by: scan-admin(a)coverity.com
Suggested-by: Alan Stern <stern(a)rowland.harvard.edu>
Signed-off-by: Eugeniu Rosca <erosca(a)de.adit-jv.com>
---
v3:
- Make the summary line more clear
- s/autpm/autopm/ in patch description
- Cc <stable> with v4.14+ since v4.14.x is the earliest stable kernel
which accepted commit 1208f9e1d758c9 ("USB: hub: Fix the broken
detection of USB3 device in SMSC hub")
v2:
- [Alan Stern] Use usb_autopm_get_interface_no_resume() instead of
usb_autopm_get_interface()
- Augment commit description to provide background
- Link: https://lore.kernel.org/lkml/20200225183057.31953-1-erosca@de.adit-jv.com
v1:
- Link: https://lore.kernel.org/lkml/20200225130846.20236-1-erosca@de.adit-jv.com
---
drivers/usb/core/hub.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index 1d212f82c69b..1105983b5c1c 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -1866,7 +1866,7 @@ static int hub_probe(struct usb_interface *intf, const struct usb_device_id *id)
if (id->driver_info & HUB_QUIRK_DISABLE_AUTOSUSPEND) {
hub->quirk_disable_autosuspend = 1;
- usb_autopm_get_interface(intf);
+ usb_autopm_get_interface_no_resume(intf);
}
if (hub_configure(hub, &desc->endpoint[0].desc) >= 0)
--
2.25.1
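The difference between the two calls discussed in the patch above can be modelled with a plain usage counter. This is a userspace sketch under assumptions, not the USB core implementation: struct fake_intf and both helpers are hypothetical, and the resume_fails field is an invented knob to simulate a failed resume.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an interface's PM usage counter. */
struct fake_intf {
	int usage;
	bool resume_fails; /* invented knob for this sketch */
};

/*
 * Models usb_autopm_get_interface(): take a reference and
 * resume. If the resume fails, the reference is dropped again
 * and an error is returned, so the result must be checked.
 */
static int fake_get_interface(struct fake_intf *intf)
{
	assert(intf != NULL);

	intf->usage++;
	if (intf->resume_fails) {
		intf->usage--;
		return -1;
	}
	return 0;
}

/*
 * Models usb_autopm_get_interface_no_resume(): only take the
 * reference. There is no resume step, hence nothing to fail
 * and no return value to check.
 */
static void fake_get_interface_no_resume(struct fake_intf *intf)
{
	assert(intf != NULL);
	intf->usage++;
}
```

Since the interface is already powered up during probe, only the reference matters there, which is why the variant without a return value is sufficient.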