The vfio_ap_mdev_filter_matrix function is called whenever a new adapter or
domain is assigned to the mdev. The purpose of the function is to update
the guest's AP configuration by filtering the matrix of adapters and
domains assigned to the mdev. When an adapter or domain is assigned, only
the APQNs associated with the APID of the new adapter or APQI of the new
domain are inspected. If an APQN does not reference a queue device bound to
the vfio_ap device driver, then it's APID will be filtered from the mdev's
matrix when updating the guest's AP configuration.
Inspecting only the APID of the new adapter or APQI of the new domain will
result in passing AP queues through to a guest that are not bound to the
vfio_ap device driver under certain circumstances. Consider the following:
guest's AP configuration (all also assigned to the mdev's matrix):
14.0004
14.0005
14.0006
16.0004
16.0005
16.0006
unassign domain 4
unbind queue 16.0005
assign domain 4
When domain 4 is re-assigned, since only domain 4 will be inspected, the
APQNs that will be examined will be:
14.0004
16.0004
Since both of those APQNs reference queue devices that are bound to the
vfio_ap device driver, nothing will get filtered from the mdev's matrix
when updating the guest's AP configuration. Consequently, queue 16.0005
will get passed through despite not being bound to the driver. This
violates the linux device model requirement that a guest shall only be
given access to devices bound to the device driver facilitating their
pass-through.
To resolve this problem, every adapter and domain assigned to the mdev will
be inspected when filtering the mdev's matrix.
Signed-off-by: Tony Krowiak <akrowiak(a)linux.ibm.com>
Acked-by: Halil Pasic <pasic(a)linux.ibm.com>
Fixes: 48cae940c31d ("s390/vfio-ap: refresh guest's APCB by filtering AP resources assigned to mdev")
Cc: <stable(a)vger.kernel.org>
---
drivers/s390/crypto/vfio_ap_ops.c | 57 +++++++++----------------------
1 file changed, 17 insertions(+), 40 deletions(-)
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 4db538a55192..9382b32e5bd1 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -670,8 +670,7 @@ static bool vfio_ap_mdev_filter_cdoms(struct ap_matrix_mdev *matrix_mdev)
* Return: a boolean value indicating whether the KVM guest's APCB was changed
* by the filtering or not.
*/
-static bool vfio_ap_mdev_filter_matrix(unsigned long *apm, unsigned long *aqm,
- struct ap_matrix_mdev *matrix_mdev)
+static bool vfio_ap_mdev_filter_matrix(struct ap_matrix_mdev *matrix_mdev)
{
unsigned long apid, apqi, apqn;
DECLARE_BITMAP(prev_shadow_apm, AP_DEVICES);
@@ -692,8 +691,8 @@ static bool vfio_ap_mdev_filter_matrix(unsigned long *apm, unsigned long *aqm,
bitmap_and(matrix_mdev->shadow_apcb.aqm, matrix_mdev->matrix.aqm,
(unsigned long *)matrix_dev->info.aqm, AP_DOMAINS);
- for_each_set_bit_inv(apid, apm, AP_DEVICES) {
- for_each_set_bit_inv(apqi, aqm, AP_DOMAINS) {
+ for_each_set_bit_inv(apid, matrix_mdev->matrix.apm, AP_DEVICES) {
+ for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm, AP_DOMAINS) {
/*
* If the APQN is not bound to the vfio_ap device
* driver, then we can't assign it to the guest's
@@ -958,7 +957,6 @@ static ssize_t assign_adapter_store(struct device *dev,
{
int ret;
unsigned long apid;
- DECLARE_BITMAP(apm_delta, AP_DEVICES);
struct ap_matrix_mdev *matrix_mdev = dev_get_drvdata(dev);
mutex_lock(&ap_perms_mutex);
@@ -987,11 +985,8 @@ static ssize_t assign_adapter_store(struct device *dev,
}
vfio_ap_mdev_link_adapter(matrix_mdev, apid);
- memset(apm_delta, 0, sizeof(apm_delta));
- set_bit_inv(apid, apm_delta);
- if (vfio_ap_mdev_filter_matrix(apm_delta,
- matrix_mdev->matrix.aqm, matrix_mdev))
+ if (vfio_ap_mdev_filter_matrix(matrix_mdev))
vfio_ap_mdev_update_guest_apcb(matrix_mdev);
ret = count;
@@ -1167,7 +1162,6 @@ static ssize_t assign_domain_store(struct device *dev,
{
int ret;
unsigned long apqi;
- DECLARE_BITMAP(aqm_delta, AP_DOMAINS);
struct ap_matrix_mdev *matrix_mdev = dev_get_drvdata(dev);
mutex_lock(&ap_perms_mutex);
@@ -1196,11 +1190,8 @@ static ssize_t assign_domain_store(struct device *dev,
}
vfio_ap_mdev_link_domain(matrix_mdev, apqi);
- memset(aqm_delta, 0, sizeof(aqm_delta));
- set_bit_inv(apqi, aqm_delta);
- if (vfio_ap_mdev_filter_matrix(matrix_mdev->matrix.apm, aqm_delta,
- matrix_mdev))
+ if (vfio_ap_mdev_filter_matrix(matrix_mdev))
vfio_ap_mdev_update_guest_apcb(matrix_mdev);
ret = count;
@@ -2091,9 +2082,7 @@ int vfio_ap_mdev_probe_queue(struct ap_device *apdev)
if (matrix_mdev) {
vfio_ap_mdev_link_queue(matrix_mdev, q);
- if (vfio_ap_mdev_filter_matrix(matrix_mdev->matrix.apm,
- matrix_mdev->matrix.aqm,
- matrix_mdev))
+ if (vfio_ap_mdev_filter_matrix(matrix_mdev))
vfio_ap_mdev_update_guest_apcb(matrix_mdev);
}
dev_set_drvdata(&apdev->device, q);
@@ -2443,34 +2432,22 @@ void vfio_ap_on_cfg_changed(struct ap_config_info *cur_cfg_info,
static void vfio_ap_mdev_hot_plug_cfg(struct ap_matrix_mdev *matrix_mdev)
{
- bool do_hotplug = false;
- int filter_domains = 0;
- int filter_adapters = 0;
- DECLARE_BITMAP(apm, AP_DEVICES);
- DECLARE_BITMAP(aqm, AP_DOMAINS);
+ bool filter_domains, filter_adapters, filter_cdoms, do_hotplug = false;
mutex_lock(&matrix_mdev->kvm->lock);
mutex_lock(&matrix_dev->mdevs_lock);
- filter_adapters = bitmap_and(apm, matrix_mdev->matrix.apm,
- matrix_mdev->apm_add, AP_DEVICES);
- filter_domains = bitmap_and(aqm, matrix_mdev->matrix.aqm,
- matrix_mdev->aqm_add, AP_DOMAINS);
-
- if (filter_adapters && filter_domains)
- do_hotplug |= vfio_ap_mdev_filter_matrix(apm, aqm, matrix_mdev);
- else if (filter_adapters)
- do_hotplug |=
- vfio_ap_mdev_filter_matrix(apm,
- matrix_mdev->shadow_apcb.aqm,
- matrix_mdev);
- else
- do_hotplug |=
- vfio_ap_mdev_filter_matrix(matrix_mdev->shadow_apcb.apm,
- aqm, matrix_mdev);
+ filter_adapters = bitmap_intersects(matrix_mdev->matrix.apm,
+ matrix_mdev->apm_add, AP_DEVICES);
+ filter_domains = bitmap_intersects(matrix_mdev->matrix.aqm,
+ matrix_mdev->aqm_add, AP_DOMAINS);
+ filter_cdoms = bitmap_intersects(matrix_mdev->matrix.adm,
+ matrix_mdev->adm_add, AP_DOMAINS);
+
+ if (filter_adapters || filter_domains)
+ do_hotplug = vfio_ap_mdev_filter_matrix(matrix_mdev);
- if (bitmap_intersects(matrix_mdev->matrix.adm, matrix_mdev->adm_add,
- AP_DOMAINS))
+ if (filter_cdoms)
do_hotplug |= vfio_ap_mdev_filter_cdoms(matrix_mdev);
if (do_hotplug)
--
2.43.0
The patch titled
Subject: maple_tree: do not preallocate nodes for slot stores
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
maple_tree-do-not-preallocate-nodes-for-slot-stores.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
Subject: maple_tree: do not preallocate nodes for slot stores
Date: Tue, 12 Dec 2023 11:46:40 -0800
mas_preallocate() defaults to requesting 1 node for preallocation and then
,depending on the type of store, will update the request variable. There
isn't a check for a slot store type, so slot stores are preallocating the
default 1 node. Slot stores do not require any additional nodes, so add a
check for the slot store case that will bypass node_count_gfp(). Update
the tests to reflect that slot stores do not require allocations.
User visible effects of this bug include increased memory usage from the
unneeded node that was allocated.
Link: https://lkml.kernel.org/r/20231212194640.217966-1-sidhartha.kumar@oracle.com
Fixes: 0b8bb544b1a7 ("maple_tree: update mas_preallocate() testing")
Signed-off-by: Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peng Zhang <zhangpeng.00(a)bytedance.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/maple_tree.c | 6 ++++++
tools/testing/radix-tree/maple.c | 2 +-
2 files changed, 7 insertions(+), 1 deletion(-)
--- a/lib/maple_tree.c~maple_tree-do-not-preallocate-nodes-for-slot-stores
+++ a/lib/maple_tree.c
@@ -5501,6 +5501,12 @@ int mas_preallocate(struct ma_state *mas
mas_wr_end_piv(&wr_mas);
node_size = mas_wr_new_end(&wr_mas);
+
+ /* Slot store, does not require additional nodes */
+ if ((node_size == mas->end) && ((!mt_in_rcu(mas->tree))
+ || (wr_mas.offset_end - mas->offset == 1)))
+ return 0;
+
if (node_size >= mt_slots[wr_mas.type]) {
/* Split, worst case for now. */
request = 1 + mas_mt_height(mas) * 2;
--- a/tools/testing/radix-tree/maple.c~maple_tree-do-not-preallocate-nodes-for-slot-stores
+++ a/tools/testing/radix-tree/maple.c
@@ -35538,7 +35538,7 @@ static noinline void __init check_preall
MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0);
allocated = mas_allocated(&mas);
height = mas_mt_height(&mas);
- MT_BUG_ON(mt, allocated != 1);
+ MT_BUG_ON(mt, allocated != 0);
mas_store_prealloc(&mas, ptr);
MT_BUG_ON(mt, mas_allocated(&mas) != 0);
_
Patches currently in -mm which might be from sidhartha.kumar(a)oracle.com are
maple_tree-do-not-preallocate-nodes-for-slot-stores.patch
This is the start of the stable review cycle for the 4.14.333 release.
There are 25 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 13 Dec 2023 18:19:59 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.333-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.14.333-rc1
Ido Schimmel <idosch(a)nvidia.com>
drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group
Ido Schimmel <idosch(a)nvidia.com>
psample: Require 'CAP_NET_ADMIN' when joining "packets" group
Ido Schimmel <idosch(a)nvidia.com>
genetlink: add CAP_NET_ADMIN test for multicast bind
Ido Schimmel <idosch(a)nvidia.com>
netlink: don't call ->netlink_bind with table lock held
Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
nilfs2: fix missing error check for sb_set_blocksize call
Claudio Imbrenda <imbrenda(a)linux.ibm.com>
KVM: s390/mm: Properly reset no-dat
Ronald Wahl <ronald.wahl(a)raritan.com>
serial: 8250_omap: Add earlycon support for the AM654 UART controller
Daniel Mack <daniel(a)zonque.org>
serial: sc16is7xx: address RX timeout interrupt errata
Arnd Bergmann <arnd(a)arndb.de>
ARM: PL011: Fix DMA support
Cameron Williams <cang1(a)live.co.uk>
parport: Add support for Brainboxes IX/UC/PX parallel cards
Daniel Borkmann <daniel(a)iogearbox.net>
packet: Move reference count in packet_sock to atomic_long_t
Petr Pavlu <petr.pavlu(a)suse.com>
tracing: Fix a possible race when disabling buffered events
Petr Pavlu <petr.pavlu(a)suse.com>
tracing: Fix incomplete locking when disabling buffered events
Steven Rostedt (Google) <rostedt(a)goodmis.org>
tracing: Always update snapshot buffer size
Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
nilfs2: prevent WARNING in nilfs_sufile_set_segment_usage()
Jason Zhang <jason.zhang(a)rock-chips.com>
ALSA: pcm: fix out-of-bounds in snd_pcm_state_names
Dinghao Liu <dinghao.liu(a)zju.edu.cn>
scsi: be2iscsi: Fix a memleak in beiscsi_init_wrb_handle()
Petr Pavlu <petr.pavlu(a)suse.com>
tracing: Fix a warning when allocating buffered events fails
Armin Wolf <W_Armin(a)gmx.de>
hwmon: (acpi_power_meter) Fix 4.29 MW bug
Kalesh AP <kalesh-anakkur.purayil(a)broadcom.com>
RDMA/bnxt_re: Correct module description string
Eric Dumazet <edumazet(a)google.com>
tcp: do not accept ACK of bytes we never sent
Yonglong Liu <liuyonglong(a)huawei.com>
net: hns: fix fake link up on xge port
YuanShang <YuanShang.Mao(a)amd.com>
drm/amdgpu: correct chunk_ptr to a pointer to chunk.
Alex Pakhunov <alexey.pakhunov(a)spacex.com>
tg3: Increment tx_dropped in tg3_tso_bug()
Alex Pakhunov <alexey.pakhunov(a)spacex.com>
tg3: Move the [rt]x_dropped counters to tg3_napi
-------------
Diffstat:
Makefile | 4 +-
arch/s390/mm/pgtable.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
drivers/hwmon/acpi_power_meter.c | 4 +
drivers/infiniband/hw/bnxt_re/main.c | 2 +-
drivers/net/ethernet/broadcom/tg3.c | 42 ++++++--
drivers/net/ethernet/broadcom/tg3.h | 4 +-
drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c | 29 ++++++
drivers/parport/parport_pc.c | 21 ++++
drivers/scsi/be2iscsi/be_main.c | 1 +
drivers/tty/serial/8250/8250_early.c | 1 +
drivers/tty/serial/amba-pl011.c | 112 +++++++++++-----------
drivers/tty/serial/sc16is7xx.c | 12 +++
fs/nilfs2/sufile.c | 44 +++++++--
fs/nilfs2/the_nilfs.c | 6 +-
include/net/genetlink.h | 3 +
kernel/trace/trace.c | 42 ++++----
net/core/drop_monitor.c | 4 +-
net/ipv4/tcp_input.c | 6 +-
net/netlink/af_netlink.c | 4 +-
net/netlink/genetlink.c | 35 +++++++
net/packet/af_packet.c | 16 ++--
net/packet/internal.h | 2 +-
net/psample/psample.c | 3 +-
sound/core/pcm.c | 1 +
25 files changed, 286 insertions(+), 116 deletions(-)
The quilt patch titled
Subject: mm/rmap: fix misplaced parenthesis of a likely()
has been removed from the -mm tree. Its filename was
mm-rmap-fix-misplaced-parenthesis-of-a-likely.patch
This patch was dropped because it was merged into the mm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Steven Rostedt <rostedt(a)goodmis.org>
Subject: mm/rmap: fix misplaced parenthesis of a likely()
Date: Fri, 1 Dec 2023 14:59:36 -0500
From: Steven Rostedt (Google) <rostedt(a)goodmis.org>
Running my yearly branch profiler to see where likely/unlikely annotation
may be added or removed, I discovered this:
correct incorrect % Function File Line
------- --------- - -------- ---- ----
0 457918 100 page_try_dup_anon_rmap rmap.h 264
[..]
458021 0 0 page_try_dup_anon_rmap rmap.h 265
I thought it was interesting that line 264 of rmap.h had a 100% incorrect
annotation, but the line directly below it was 100% correct. Looking at the
code:
if (likely(!is_device_private_page(page) &&
unlikely(page_needs_cow_for_dma(vma, page))))
It didn't make sense. The "likely()" was around the entire if statement
(not just the "!is_device_private_page(page)"), which also included the
"unlikely()" portion of that if condition.
If the unlikely portion is unlikely to be true, that would make the entire
if condition unlikely to be true, so it made no sense at all to say the
entire if condition is true.
What is more likely to be likely is just the first part of the if statement
before the && operation. It's likely to be a misplaced parenthesis. And
after making the if condition broken into a likely() && unlikely(), both
now appear to be correct!
Link: https://lkml.kernel.org/r/20231201145936.5ddfdb50@gandalf.local.home
Fixes:fb3d824d1a46c ("mm/rmap: split page_dup_rmap() into page_dup_file_rmap() and page_try_dup_anon_rmap()")
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/rmap.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/include/linux/rmap.h~mm-rmap-fix-misplaced-parenthesis-of-a-likely
+++ a/include/linux/rmap.h
@@ -261,8 +261,8 @@ static inline int page_try_dup_anon_rmap
* guarantee the pinned page won't be randomly replaced in the
* future on write faults.
*/
- if (likely(!is_device_private_page(page) &&
- unlikely(page_needs_cow_for_dma(vma, page))))
+ if (likely(!is_device_private_page(page)) &&
+ unlikely(page_needs_cow_for_dma(vma, page)))
return -EBUSY;
ClearPageAnonExclusive(page);
_
Patches currently in -mm which might be from rostedt(a)goodmis.org are