On Fri, Feb 04, 2022 at 11:28:19AM -0600, Tabitha Sable wrote:
> I'm happy to help with this, but I'm not familiar with the conventions for
> sending in backport diffs.
>
> I've read through the docs on kernelnewbies and found them to be both
> overwhelmingly big and also not directly relevant to this particular
> situation. I think if I simply try to follow them I'll foul things up.

Yeah, it's not the same thing at all.

> Can I simply make the changes against the appropriate git branch, build and
> test, and then email in the diff to stable(a)vger.kernel.org copying most of
> what you've put in the original email, greg?

Yes, that's exactly what we need here. Be sure to let me know what the
git id of the commit is in Linus's tree so we can properly track it (you
can put it in the changelog text somewhere.)

thanks,

greg k-h

This is the start of the stable review cycle for the 5.16.6 release.
There are 43 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.

Responses should be made by Sun, 06 Feb 2022 09:19:05 +0000.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.16.6-rc1…
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.16.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.16.6-rc1
Eric Dumazet <edumazet(a)google.com>
tcp: add missing tcp_skb_can_collapse() test in tcp_shift_skb_data()
Eric Dumazet <edumazet(a)google.com>
tcp: fix mem under-charging with zerocopy sendmsg()
Eric Dumazet <edumazet(a)google.com>
af_packet: fix data-race in packet_setsockopt / packet_setsockopt
Sasha Neftin <sasha.neftin(a)intel.com>
e1000e: Handshake with CSME starts from ADL platforms
Tianchen Ding <dtcccc(a)linux.alibaba.com>
cpuset: Fix the bug that subpart_cpus updated wrongly in update_cpumask()
He Fengqing <hefengqing(a)huawei.com>
bpf: Fix possible race in inc_misses_counter
Alex Elder <elder(a)linaro.org>
net: ipa: request IPA register values be retained
Eric Dumazet <edumazet(a)google.com>
rtnetlink: make sure to refresh master_dev/m_ops in __rtnl_newlink()
Eric Dumazet <edumazet(a)google.com>
net: sched: fix use-after-free in tc_new_tfilter()
Dan Carpenter <dan.carpenter(a)oracle.com>
fanotify: Fix stale file descriptor in copy_event_to_user()
Shyam Sundar S K <Shyam-sundar.S-k(a)amd.com>
net: amd-xgbe: Fix skb data length underflow
Raju Rangoju <Raju.Rangoju(a)amd.com>
net: amd-xgbe: ensure to reset the tx_timer_active flag
Karen Sornek <karen.sornek(a)intel.com>
i40e: Fix reset path while removing the driver
Jedrzej Jagielski <jedrzej.jagielski(a)intel.com>
i40e: Fix reset bw limit when DCB enabled with 1 TC
Georgi Valkov <gvalkov(a)abv.bg>
ipheth: fix EOVERFLOW in ipheth_rcvbulk_callback
Roi Dayan <roid(a)nvidia.com>
net/mlx5e: Avoid implicit modify hdr for decap drop rule
Maor Dickman <maord(a)nvidia.com>
net/mlx5: E-Switch, Fix uninitialized variable modact
Khalid Manaa <khalidm(a)nvidia.com>
net/mlx5e: Fix broken SKB allocation in HW-GRO
Khalid Manaa <khalidm(a)nvidia.com>
net/mlx5e: Fix wrong calculation of header index in HW_GRO
Kees Cook <keescook(a)chromium.org>
net/mlx5e: Avoid field-overflowing memcpy()
Roi Dayan <roid(a)nvidia.com>
net/mlx5: Bridge, Fix devlink deadlock on net namespace deletion
Maxim Mikityanskiy <maximmi(a)nvidia.com>
net/mlx5e: Don't treat small ceil values as unlimited in HTB offload
Dima Chumak <dchumak(a)nvidia.com>
net/mlx5: Fix offloading with ESWITCH_IPV4_TTL_MODIFY_ENABLE
Roi Dayan <roid(a)nvidia.com>
net/mlx5e: TC, Reject rules with forward and drop actions
Gal Pressman <gal(a)nvidia.com>
net/mlx5e: Fix module EEPROM query
Maher Sanalla <msanalla(a)nvidia.com>
net/mlx5: Use del_timer_sync in fw reset flow of halting poll
Maor Dickman <maord(a)nvidia.com>
net/mlx5e: Fix handling of wrong devices during bond netevent
Vlad Buslov <vladbu(a)nvidia.com>
net/mlx5: Bridge, ensure dev_name is null-terminated
Vlad Buslov <vladbu(a)nvidia.com>
net/mlx5: Bridge, take rtnl lock in init error handler
Roi Dayan <roid(a)nvidia.com>
net/mlx5e: TC, Reject rules with drop and modify hdr action
Raed Salem <raeds(a)nvidia.com>
net/mlx5e: IPsec: Fix tunnel mode crypto offload for non TCP/UDP traffic
Raed Salem <raeds(a)nvidia.com>
net/mlx5e: IPsec: Fix crypto offload for non TCP/UDP encapsulated traffic
J. Bruce Fields <bfields(a)redhat.com>
lockd: fix failure to cleanup client locks
J. Bruce Fields <bfields(a)redhat.com>
lockd: fix server crash on reboot of client holding lock
Miklos Szeredi <mszeredi(a)redhat.com>
ovl: don't fail copy up if no fileattr support on upper
Jonathan McDowell <noodles(a)earth.li>
net: phy: Fix qca8081 with speeds lower than 2.5Gb/s
John Hubbard <jhubbard(a)nvidia.com>
Revert "mm/gup: small refactoring: simplify try_grab_page()"
Eric W. Biederman <ebiederm(a)xmission.com>
cgroup-v1: Require capabilities to set release_agent
Maxime Ripard <maxime(a)cerno.tech>
drm/vc4: hdmi: Make sure the device is powered with CEC
Alex Elder <elder(a)linaro.org>
net: ipa: prevent concurrent replenish
Alex Elder <elder(a)linaro.org>
net: ipa: use a bitmap for endpoint replenish_enabled
Paolo Abeni <pabeni(a)redhat.com>
selftests: mptcp: fix ipv6 routing setup
Lukas Wunner <lukas(a)wunner.de>
PCI: pciehp: Fix infinite loop in IRQ handler upon power fault
-------------
Diffstat:
Makefile | 4 +-
drivers/gpu/drm/vc4/vc4_hdmi.c | 25 ++++++-----
drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 14 +++++-
drivers/net/ethernet/intel/e1000e/netdev.c | 6 ++-
drivers/net/ethernet/intel/i40e/i40e.h | 1 +
drivers/net/ethernet/intel/i40e/i40e_main.c | 31 ++++++++++++-
drivers/net/ethernet/mellanox/mlx5/core/en.h | 6 +--
drivers/net/ethernet/mellanox/mlx5/core/en/qos.c | 3 +-
.../net/ethernet/mellanox/mlx5/core/en/rep/bond.c | 32 ++++++-------
.../ethernet/mellanox/mlx5/core/en/rep/bridge.c | 6 ++-
drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h | 5 +++
drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c | 4 +-
.../mellanox/mlx5/core/en_accel/ipsec_rxtx.c | 13 +++++-
.../mellanox/mlx5/core/en_accel/ipsec_rxtx.h | 9 ++--
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 30 ++++++++-----
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 15 ++++++-
.../net/ethernet/mellanox/mlx5/core/esw/bridge.c | 4 ++
.../mlx5/core/esw/diag/bridge_tracepoint.h | 2 +-
drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c | 2 +-
.../ethernet/mellanox/mlx5/core/lib/fs_chains.c | 9 ++--
drivers/net/ethernet/mellanox/mlx5/core/port.c | 9 ++--
drivers/net/ipa/ipa_endpoint.c | 21 +++++++--
drivers/net/ipa/ipa_endpoint.h | 17 ++++++-
drivers/net/ipa/ipa_power.c | 52 ++++++++++++++++++++++
drivers/net/ipa/ipa_power.h | 7 +++
drivers/net/ipa/ipa_uc.c | 5 +++
drivers/net/phy/at803x.c | 26 +++++------
drivers/net/usb/ipheth.c | 6 +--
drivers/pci/hotplug/pciehp_hpc.c | 7 +--
fs/lockd/svcsubs.c | 18 ++++----
fs/notify/fanotify/fanotify_user.c | 6 +--
fs/overlayfs/copy_up.c | 12 ++++-
kernel/bpf/trampoline.c | 5 ++-
kernel/cgroup/cgroup-v1.c | 14 ++++++
kernel/cgroup/cpuset.c | 3 +-
mm/gup.c | 35 ++++++++++++---
net/core/rtnetlink.c | 6 ++-
net/ipv4/tcp.c | 7 ++-
net/ipv4/tcp_input.c | 2 +
net/packet/af_packet.c | 8 +++-
net/sched/cls_api.c | 11 +++--
tools/testing/selftests/net/mptcp/mptcp_join.sh | 5 ++-
42 files changed, 374 insertions(+), 129 deletions(-)
Currently rcu_preempt_deferred_qs_irqrestore() releases rnp->boost_mtx
before reporting the expedited quiescent state. Under heavy real-time
load, this can result in this function being preempted before the
quiescent state is reported, which can in turn prevent the expedited grace
period from completing. Tim Murray reports that the resulting expedited
grace periods can take hundreds of milliseconds and even more than one
second, when they should normally complete in less than a millisecond.
This was fine given that there were no particular response-time
constraints for synchronize_rcu_expedited(), as it was designed
for throughput rather than latency. However, some users now need
sub-100-millisecond response-time constraints.
This patch therefore follows Neeraj's suggestion (seconded by Tim and
by Uladzislau Rezki) of simply reversing the two operations.
Reported-by: Tim Murray <timmurray(a)google.com>
Reported-by: Joel Fernandes <joelaf(a)google.com>
Reported-by: Neeraj Upadhyay <quic_neeraju(a)quicinc.com>
Reviewed-by: Neeraj Upadhyay <quic_neeraju(a)quicinc.com>
Reviewed-by: Uladzislau Rezki (Sony) <urezki(a)gmail.com>
Tested-by: Tim Murray <timmurray(a)google.com>
Cc: Todd Kjos <tkjos(a)google.com>
Cc: Sandeep Patil <sspatil(a)google.com>
Cc: <stable(a)vger.kernel.org> # 5.4.x
Signed-off-by: Paul E. McKenney <paulmck(a)kernel.org>
---
kernel/rcu/tree_plugin.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 109429e70a642..02ac057ba3f83 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -556,16 +556,16 @@ rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags)
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
}
- /* Unboost if we were boosted. */
- if (IS_ENABLED(CONFIG_RCU_BOOST) && drop_boost_mutex)
- rt_mutex_futex_unlock(&rnp->boost_mtx.rtmutex);
-
/*
* If this was the last task on the expedited lists,
* then we need to report up the rcu_node hierarchy.
*/
if (!empty_exp && empty_exp_now)
rcu_report_exp_rnp(rnp, true);
+
+ /* Unboost if we were boosted. */
+ if (IS_ENABLED(CONFIG_RCU_BOOST) && drop_boost_mutex)
+ rt_mutex_futex_unlock(&rnp->boost_mtx.rtmutex);
} else {
local_irq_restore(flags);
}
--
2.31.1.189.g2e36527f23
When I rewrote the VMA dumping logic for coredumps, I changed it to
recognize ELF library mappings based on the file being executable instead
of the mapping having an ELF header. But it turns out that distros ship many ELF
libraries as non-executable, so the heuristic goes wrong...
Restore the old behavior where FILTER(ELF_HEADERS) dumps the first page of
any offset-0 readable mapping that starts with the ELF magic.
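
For illustration only, here is a minimal user-space sketch of the magic check described above, assuming the first bytes of the mapping have already been copied into a buffer. The helper name starts_with_elf_magic() is hypothetical and is not part of this patch; ELFMAG and SELFMAG are taken from the C library's <elf.h>.

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): returns true if the buffer
 * begins with the 4-byte ELF magic "\177ELF".
 */
static bool starts_with_elf_magic(const unsigned char *buf, size_t len)
{
	assert(buf != NULL);

	/* A buffer shorter than the magic cannot match. */
	if (len < SELFMAG)
		return false;

	return memcmp(buf, ELFMAG, SELFMAG) == 0;
}
```

The kernel version has to fetch those bytes with copy_from_user() and treats a failed copy the same as a mismatch; this sketch skips that step by taking an in-memory buffer.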
This fix is technically layer-breaking a bit, because it checks for
something ELF-specific in fs/coredump.c; but since we probably want to
share this between standard ELF and FDPIC ELF anyway, I guess it's fine?
And this also keeps the change small for backporting.
Cc: stable(a)vger.kernel.org
Fixes: 429a22e776a2 ("coredump: rework elf/elf_fdpic vma_dump_size() into common helper")
Reported-by: Bill Messmer <wmessmer(a)microsoft.com>
Signed-off-by: Jann Horn <jannh(a)google.com>
---
@Bill: If you happen to have a kernel tree lying around, you could give
this a try and report back whether this solves your issues?
But if not, it's also fine, I've tested myself that with this patch
applied, the first 0x1000 bytes of non-executable libraries are dumped
into the coredump according to "readelf".
fs/coredump.c | 39 ++++++++++++++++++++++++++++++++++-----
1 file changed, 34 insertions(+), 5 deletions(-)
diff --git a/fs/coredump.c b/fs/coredump.c
index 1c060c0a2d72..b73817712dd2 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -42,6 +42,7 @@
#include <linux/path.h>
#include <linux/timekeeping.h>
#include <linux/sysctl.h>
+#include <linux/elf.h>
#include <linux/uaccess.h>
#include <asm/mmu_context.h>
@@ -980,6 +981,8 @@ static bool always_dump_vma(struct vm_area_struct *vma)
return false;
}
+#define DUMP_SIZE_MAYBE_ELFHDR_PLACEHOLDER 1
+
/*
* Decide how much of @vma's contents should be included in a core dump.
*/
@@ -1039,9 +1042,20 @@ static unsigned long vma_dump_size(struct vm_area_struct *vma,
* dump the first page to aid in determining what was mapped here.
*/
if (FILTER(ELF_HEADERS) &&
- vma->vm_pgoff == 0 && (vma->vm_flags & VM_READ) &&
- (READ_ONCE(file_inode(vma->vm_file)->i_mode) & 0111) != 0)
- return PAGE_SIZE;
+ vma->vm_pgoff == 0 && (vma->vm_flags & VM_READ)) {
+ if ((READ_ONCE(file_inode(vma->vm_file)->i_mode) & 0111) != 0)
+ return PAGE_SIZE;
+
+ /*
+ * ELF libraries aren't always executable.
+ * We'll want to check whether the mapping starts with the ELF
+ * magic, but not now - we're holding the mmap lock,
+ * so copy_from_user() doesn't work here.
+ * Use a placeholder instead, and fix it up later in
+ * dump_vma_snapshot().
+ */
+ return DUMP_SIZE_MAYBE_ELFHDR_PLACEHOLDER;
+ }
#undef FILTER
@@ -1116,8 +1130,6 @@ int dump_vma_snapshot(struct coredump_params *cprm, int *vma_count,
m->end = vma->vm_end;
m->flags = vma->vm_flags;
m->dump_size = vma_dump_size(vma, cprm->mm_flags);
-
- vma_data_size += m->dump_size;
}
mmap_write_unlock(mm);
@@ -1127,6 +1139,23 @@ int dump_vma_snapshot(struct coredump_params *cprm, int *vma_count,
return -EFAULT;
}
+ for (i = 0; i < *vma_count; i++) {
+ struct core_vma_metadata *m = (*vma_meta) + i;
+
+ if (m->dump_size == DUMP_SIZE_MAYBE_ELFHDR_PLACEHOLDER) {
+ char elfmag[SELFMAG];
+
+ if (copy_from_user(elfmag, (void __user *)m->start, SELFMAG) ||
+ memcmp(elfmag, ELFMAG, SELFMAG) != 0) {
+ m->dump_size = 0;
+ } else {
+ m->dump_size = PAGE_SIZE;
+ }
+ }
+
+ vma_data_size += m->dump_size;
+ }
+
*vma_data_size_ptr = vma_data_size;
return 0;
}
base-commit: 0280e3c58f92b2fe0e8fbbdf8d386449168de4a8
--
2.35.0.rc0.227.g00780c9af4-goog
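
For readers following the comment in vma_dump_size(), the placeholder approach can be sketched in user space roughly as below. This is only an illustration under stated assumptions: struct region, map_lock_held, SKETCH_PAGE_SIZE and all function names are hypothetical stand-ins, and a plain flag models the rule that region contents must not be read while the lock is held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE           4096UL /* assumed page size for this sketch */
#define SIZE_MAYBE_ELF_PLACEHOLDER 1UL    /* sentinel: decide after unlocking */

/* Hypothetical stand-in for one mapping's metadata. */
struct region {
	const unsigned char *data;
	size_t len;
	unsigned long dump_size;
};

/* Models "the lock is held"; contents must not be read while true. */
static bool map_lock_held;

/* Reading contents is only legal once the lock has been dropped. */
static bool region_is_elf(const struct region *r)
{
	assert(!map_lock_held);
	return r->len >= 4 && memcmp(r->data, "\177ELF", 4) == 0;
}

/* Phase 1: runs under the lock, so it only records a placeholder. */
static void snapshot_regions(struct region *r, size_t n)
{
	map_lock_held = true;
	for (size_t i = 0; i < n; i++)
		r[i].dump_size = SIZE_MAYBE_ELF_PLACEHOLDER;
	map_lock_held = false;
}

/* Phase 2: lock dropped; resolve placeholders and sum the final sizes. */
static unsigned long resolve_placeholders(struct region *r, size_t n)
{
	unsigned long total = 0;

	for (size_t i = 0; i < n; i++) {
		if (r[i].dump_size == SIZE_MAYBE_ELF_PLACEHOLDER)
			r[i].dump_size = region_is_elf(&r[i]) ? SKETCH_PAGE_SIZE : 0;
		total += r[i].dump_size;
	}
	return total;
}
```

As in the patch, the running total is computed only in the second pass, because a placeholder value would otherwise be counted as a real size.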