This is the start of the stable review cycle for the 4.4.258 release. There are 35 patches in this series, all of which will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 24 Feb 2021 12:07:46 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.258-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.4.258-rc1
Lai Jiangshan laijs@linux.alibaba.com kvm: check tlbs_dirty directly
Arun Easi aeasi@marvell.com scsi: qla2xxx: Fix crash during driver load on big endian machines
Jan Beulich jbeulich@suse.com xen-blkback: fix error handling in xen_blkbk_map()
Jan Beulich jbeulich@suse.com xen-scsiback: don't "handle" error by BUG()
Jan Beulich jbeulich@suse.com xen-netback: don't "handle" error by BUG()
Jan Beulich jbeulich@suse.com xen-blkback: don't "handle" error by BUG()
Stefano Stabellini stefano.stabellini@xilinx.com xen/arm: don't ignore return errors from set_phys_to_machine
Jan Beulich jbeulich@suse.com Xen/gntdev: correct error checking in gntdev_map_grant_pages()
Jan Beulich jbeulich@suse.com Xen/gntdev: correct dev_bus_addr handling in gntdev_map_grant_pages()
Jan Beulich jbeulich@suse.com Xen/x86: also check kernel mapping in set_foreign_p2m_mapping()
Jan Beulich jbeulich@suse.com Xen/x86: don't bail early from clear_foreign_p2m_mapping()
Vasily Gorbik gor@linux.ibm.com tracing: Avoid calling cc-option -mrecord-mcount for every Makefile
Greg Thelen gthelen@google.com tracing: Fix SKIP_STACK_VALIDATION=1 build due to bad merge with -mrecord-mcount
Andi Kleen ak@linux.intel.com trace: Use -mcount-record for dynamic ftrace
Borislav Petkov bp@suse.de x86/build: Disable CET instrumentation in the kernel for 32-bit too
Stefano Garzarella sgarzare@redhat.com vsock: fix locking in vsock_shutdown()
Edwin Peer edwin.peer@broadcom.com net: watchdog: hold device global xmit lock during tx disable
Serge Semin Sergey.Semin@baikalelectronics.ru usb: dwc3: ulpi: Replace CPU-based busyloop with Protocol-based one
Felipe Balbi balbi@kernel.org usb: dwc3: ulpi: fix checkpatch warning
Randy Dunlap rdunlap@infradead.org h8300: fix PREEMPTION build, TI_PRE_COUNT undefined
Jozsef Kadlecsik kadlec@mail.kfki.hu netfilter: xt_recent: Fix attempt to update deleted entry
Roman Gushchin guro@fb.com memblock: do not start bottom-up allocations with kernel_end
Phillip Lougher phillip@squashfs.org.uk squashfs: add more sanity checks in xattr id lookup
Phillip Lougher phillip@squashfs.org.uk squashfs: add more sanity checks in inode lookup
Phillip Lougher phillip@squashfs.org.uk squashfs: add more sanity checks in id lookup
Theodore Ts'o tytso@mit.edu memcg: fix a crash in wb_workfn when a device disappears
Qian Cai cai@lca.pw include/trace/events/writeback.h: fix -Wstringop-truncation warnings
Tobin C. Harding tobin@kernel.org lib/string: Add strscpy_pad() function
Dave Wysochanski dwysocha@redhat.com SUNRPC: Handle 0 length opaque XDR object data properly
Dave Wysochanski dwysocha@redhat.com SUNRPC: Move simple_get_bytes and simple_get_netobj into private header
Johannes Berg johannes.berg@intel.com iwlwifi: mvm: guard against device removal in reprobe
Emmanuel Grumbach emmanuel.grumbach@intel.com iwlwifi: pcie: add a NULL check in iwl_pcie_txq_unmap
Cong Wang cong.wang@bytedance.com af_key: relax availability checks for skb size calculation
Steven Rostedt (VMware) rostedt@goodmis.org fgraph: Initialize tracing_graph_pause at task creation
Steven Rostedt (VMware) rostedt@goodmis.org tracing: Do not count ftrace events in top level enable output
-------------
Diffstat:
Makefile | 11 +++++-
arch/arm/xen/p2m.c | 6 ++-
arch/h8300/kernel/asm-offsets.c | 3 ++
arch/x86/Makefile | 6 +--
arch/x86/xen/p2m.c | 15 ++++----
drivers/block/xen-blkback/blkback.c | 28 ++++++++------
drivers/net/wireless/iwlwifi/mvm/ops.c | 3 +-
drivers/net/wireless/iwlwifi/pcie/tx.c | 5 +++
drivers/net/xen-netback/netback.c | 4 +-
drivers/scsi/qla2xxx/qla_tmpl.c | 9 +++--
drivers/scsi/qla2xxx/qla_tmpl.h | 2 +-
drivers/usb/dwc3/ulpi.c | 20 ++++++++--
drivers/xen/gntdev.c | 33 +++++++++++------
drivers/xen/xen-scsiback.c | 4 +-
fs/fs-writeback.c | 2 +-
fs/squashfs/export.c | 41 ++++++++++++++++----
fs/squashfs/id.c | 40 ++++++++++++++++----
fs/squashfs/squashfs_fs_sb.h | 1 +
fs/squashfs/super.c | 6 +--
fs/squashfs/xattr.h | 10 ++++-
fs/squashfs/xattr_id.c | 66 ++++++++++++++++++++++++++++-----
include/linux/backing-dev.h | 10 +++++
include/linux/ftrace.h | 4 +-
include/linux/netdevice.h | 2 +
include/linux/string.h | 4 ++
include/linux/sunrpc/xdr.h | 3 +-
include/trace/events/writeback.h | 35 +++++++++--------
include/xen/grant_table.h | 1 +
kernel/trace/ftrace.c | 2 -
kernel/trace/trace_events.c | 3 +-
lib/string.c | 47 +++++++++++++++++++----
mm/backing-dev.c | 1 +
mm/memblock.c | 49 +++---------------------
net/key/af_key.c | 6 +--
net/netfilter/xt_recent.c | 12 +++++-
net/sunrpc/auth_gss/auth_gss.c | 30 +--------------
net/sunrpc/auth_gss/auth_gss_internal.h | 45 ++++++++++++++++++++++
net/sunrpc/auth_gss/gss_krb5_mech.c | 31 +---------------
net/vmw_vsock/af_vsock.c | 8 ++--
scripts/Makefile.build | 3 ++
virt/kvm/kvm_main.c | 3 +-
41 files changed, 389 insertions(+), 225 deletions(-)
From: Steven Rostedt (VMware) rostedt@goodmis.org
commit 256cfdd6fdf70c6fcf0f7c8ddb0ebd73ce8f3bc9 upstream.
The file /sys/kernel/tracing/events/enable is used to enable all events by echoing in "1", or to disable all events by echoing in "0". To know whether all events are enabled, all are disabled, or only some of them are enabled, reading the file should show either "1" (all enabled), "0" (all disabled), or "X" (some enabled but not all of them). This works the same as the "enable" files in the individual system directories (like tracing/events/sched/enable).
But when all events are enabled, the top level "enable" file shows "X". The reason is that it is checking the "ftrace" events, which are special events that only exist for their format files. These include the formats for the function tracer events, which are enabled when the function tracer is enabled, but not via the "enable" file. The check includes these events, which will always be disabled here, so even though all real events are enabled, the top level "enable" file shows "X" instead of "1".
To fix this, have the check test the event's flags to see if it has the "IGNORE_ENABLE" flag set, and if so, skip it.
Cc: stable@vger.kernel.org Fixes: 553552ce1796c ("tracing: Combine event filter_active and enable into single flags field") Reported-by: "Yordan Karadzhov (VMware)" y.karadz@gmail.com Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_events.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -1083,7 +1083,8 @@ system_enable_read(struct file *filp, ch mutex_lock(&event_mutex); list_for_each_entry(file, &tr->events, list) { call = file->event_call; - if (!trace_event_name(call) || !call->class || !call->class->reg) + if ((call->flags & TRACE_EVENT_FL_IGNORE_ENABLE) || + !trace_event_name(call) || !call->class || !call->class->reg) continue;
if (system && strcmp(call->class->system, system->name) != 0)
From: Steven Rostedt (VMware) rostedt@goodmis.org
commit 7e0a9220467dbcfdc5bc62825724f3e52e50ab31 upstream.
On some archs, the idle task can call into cpu_suspend(). The cpu_suspend() will disable or pause function graph tracing, as there are some paths in bringing down the CPU that can have issues with their return addresses being modified. The task_struct structure has a "tracing_graph_pause" atomic counter; when it is set to something other than zero, the function graph tracer will not modify the return address.
The problem is that the tracing_graph_pause counter is initialized when the function graph tracer is enabled. This can corrupt the counter for the idle task if it is suspended in cpu_suspend() on these architectures.
  CPU 1                                   CPU 2
  -----                                   -----
  do_idle()
    cpu_suspend()
      pause_graph_tracing()
        task_struct->tracing_graph_pause++ (0 -> 1)

                                          start_graph_tracing()
                                            for_each_online_cpu(cpu) {
                                              ftrace_graph_init_idle_task(cpu)
                                                task_struct->tracing_graph_pause = 0 (1 -> 0)

      unpause_graph_tracing()
        task_struct->tracing_graph_pause-- (0 -> -1)
The counter above should have gone from 1 back to zero, re-enabling function graph tracing. Instead, it ends up at -1, which keeps it disabled.
There's no reason that the tracing_graph_pause field on the task_struct cannot be initialized at boot up.
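For context, the pause/unpause helpers in the sequence above are plain atomic operations on the task's counter. Roughly (a sketch from memory of include/linux/ftrace.h; see the tree for the exact definitions):

/* paired counter operations on current->tracing_graph_pause */
static inline void pause_graph_tracing(void)
{
	atomic_inc(&current->tracing_graph_pause);
}

static inline void unpause_graph_tracing(void)
{
	atomic_dec(&current->tracing_graph_pause);
}

An asynchronous reset of the counter to zero in between the paired increment and decrement is exactly what leaves it at -1 above.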
Cc: stable@vger.kernel.org Fixes: 380c4b1411ccd ("tracing/function-graph-tracer: append the tracing_graph_flag") Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211339 Reported-by: pierre.gondois@arm.com Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/ftrace.h | 4 +++- kernel/trace/ftrace.c | 2 -- 2 files changed, 3 insertions(+), 3 deletions(-)
--- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -747,7 +747,9 @@ typedef int (*trace_func_graph_ent_t)(st #ifdef CONFIG_FUNCTION_GRAPH_TRACER
/* for init task */ -#define INIT_FTRACE_GRAPH .ret_stack = NULL, +#define INIT_FTRACE_GRAPH \ + .ret_stack = NULL, \ + .tracing_graph_pause = ATOMIC_INIT(0),
/* * Stack of return addresses for functions --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -5708,7 +5708,6 @@ static int alloc_retstack_tasklist(struc }
if (t->ret_stack == NULL) { - atomic_set(&t->tracing_graph_pause, 0); atomic_set(&t->trace_overrun, 0); t->curr_ret_stack = -1; /* Make sure the tasks see the -1 first: */ @@ -5920,7 +5919,6 @@ static DEFINE_PER_CPU(struct ftrace_ret_ static void graph_init_task(struct task_struct *t, struct ftrace_ret_stack *ret_stack) { - atomic_set(&t->tracing_graph_pause, 0); atomic_set(&t->trace_overrun, 0); t->ftrace_timestamp = 0; /* make curr_ret_stack visible before we add the ret_stack */
From: Cong Wang cong.wang@bytedance.com
[ Upstream commit afbc293add6466f8f3f0c3d944d85f53709c170f ]
xfrm_probe_algs() probes kernel crypto modules and changes the availability of struct xfrm_algo_desc. But there is a small window where ealg->available and aalg->available can change between count_ah_combs()/count_esp_combs() and dump_ah_combs()/dump_esp_combs(); in this case we may allocate a smaller skb but later put a larger amount of data into it, triggering the panic in skb_put().
Fix this by relaxing the checks when counting the size, that is, by skipping the test of ->available. We may waste a few sizeof(struct sadb_comb) worth of memory, but that is still much better than a panic.
Reported-by: syzbot+b2bf2652983d23734c5c@syzkaller.appspotmail.com Cc: Steffen Klassert steffen.klassert@secunet.com Cc: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Cong Wang cong.wang@bytedance.com Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/key/af_key.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/key/af_key.c b/net/key/af_key.c index 76a008b1cbe5f..adc93329e6aac 100644 --- a/net/key/af_key.c +++ b/net/key/af_key.c @@ -2933,7 +2933,7 @@ static int count_ah_combs(const struct xfrm_tmpl *t) break; if (!aalg->pfkey_supported) continue; - if (aalg_tmpl_set(t, aalg) && aalg->available) + if (aalg_tmpl_set(t, aalg)) sz += sizeof(struct sadb_comb); } return sz + sizeof(struct sadb_prop); @@ -2951,7 +2951,7 @@ static int count_esp_combs(const struct xfrm_tmpl *t) if (!ealg->pfkey_supported) continue;
- if (!(ealg_tmpl_set(t, ealg) && ealg->available)) + if (!(ealg_tmpl_set(t, ealg))) continue;
for (k = 1; ; k++) { @@ -2962,7 +2962,7 @@ static int count_esp_combs(const struct xfrm_tmpl *t) if (!aalg->pfkey_supported) continue;
- if (aalg_tmpl_set(t, aalg) && aalg->available) + if (aalg_tmpl_set(t, aalg)) sz += sizeof(struct sadb_comb); } }
From: Emmanuel Grumbach emmanuel.grumbach@intel.com
[ Upstream commit 98c7d21f957b10d9c07a3a60a3a5a8f326a197e5 ]
I hit a NULL pointer exception in this function when the init flow went really bad.
Signed-off-by: Emmanuel Grumbach emmanuel.grumbach@intel.com Signed-off-by: Luca Coelho luciano.coelho@intel.com Signed-off-by: Kalle Valo kvalo@codeaurora.org Link: https://lore.kernel.org/r/iwlwifi.20210115130252.2e8da9f2c132.I0234d4b8ddaf7... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/iwlwifi/pcie/tx.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/net/wireless/iwlwifi/pcie/tx.c b/drivers/net/wireless/iwlwifi/pcie/tx.c index 8dfe6b2bc7031..cb03c2855019b 100644 --- a/drivers/net/wireless/iwlwifi/pcie/tx.c +++ b/drivers/net/wireless/iwlwifi/pcie/tx.c @@ -585,6 +585,11 @@ static void iwl_pcie_txq_unmap(struct iwl_trans *trans, int txq_id) struct iwl_txq *txq = &trans_pcie->txq[txq_id]; struct iwl_queue *q = &txq->q;
+ if (!txq) { + IWL_ERR(trans, "Trying to free a queue that wasn't allocated?\n"); + return; + } + spin_lock_bh(&txq->lock); while (q->write_ptr != q->read_ptr) { IWL_DEBUG_TX_REPLY(trans, "Q %d Free %d\n",
Hi,
Sorry for the report after the release.
On Mon, Feb 22, 2021 at 01:36:00PM +0100, Greg Kroah-Hartman wrote:
From: Emmanuel Grumbach emmanuel.grumbach@intel.com
[ Upstream commit 98c7d21f957b10d9c07a3a60a3a5a8f326a197e5 ]
I hit a NULL pointer exception in this function when the init flow went really bad.
Signed-off-by: Emmanuel Grumbach emmanuel.grumbach@intel.com Signed-off-by: Luca Coelho luciano.coelho@intel.com Signed-off-by: Kalle Valo kvalo@codeaurora.org Link: https://lore.kernel.org/r/iwlwifi.20210115130252.2e8da9f2c132.I0234d4b8ddaf7... Signed-off-by: Sasha Levin sashal@kernel.org
drivers/net/wireless/iwlwifi/pcie/tx.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/net/wireless/iwlwifi/pcie/tx.c b/drivers/net/wireless/iwlwifi/pcie/tx.c index 8dfe6b2bc7031..cb03c2855019b 100644 --- a/drivers/net/wireless/iwlwifi/pcie/tx.c +++ b/drivers/net/wireless/iwlwifi/pcie/tx.c @@ -585,6 +585,11 @@ static void iwl_pcie_txq_unmap(struct iwl_trans *trans, int txq_id) struct iwl_txq *txq = &trans_pcie->txq[txq_id]; struct iwl_queue *q = &txq->q;
+	if (!txq) {
+		IWL_ERR(trans, "Trying to free a queue that wasn't allocated?\n");
+		return;
+	}
I think that this fix is not enough. If txq is NULL, an error will still occur at "struct iwl_queue *q = &txq->q;". The following changes are required.
diff --git a/drivers/net/wireless/iwlwifi/pcie/tx.c b/drivers/net/wireless/iwlwifi/pcie/tx.c
index cb03c2855019b7..7584796131fa41 100644
--- a/drivers/net/wireless/iwlwifi/pcie/tx.c
+++ b/drivers/net/wireless/iwlwifi/pcie/tx.c
@@ -583,13 +583,15 @@ static void iwl_pcie_txq_unmap(struct iwl_trans *trans, int txq_id)
 {
 	struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans);
 	struct iwl_txq *txq = &trans_pcie->txq[txq_id];
-	struct iwl_queue *q = &txq->q;
+	struct iwl_queue *q;
 
 	if (!txq) {
 		IWL_ERR(trans, "Trying to free a queue that wasn't allocated?\n");
 		return;
 	}
 
+	q = &txq->q;
+
 	spin_lock_bh(&txq->lock);
 	while (q->write_ptr != q->read_ptr) {
 		IWL_DEBUG_TX_REPLY(trans, "Q %d Free %d\n",
spin_lock_bh(&txq->lock); while (q->write_ptr != q->read_ptr) { IWL_DEBUG_TX_REPLY(trans, "Q %d Free %d\n", -- 2.27.0
Best regards, Nobuhiro
On Thu, Feb 25, 2021 at 03:04:46PM +0900, Nobuhiro Iwamatsu wrote:
Hi,
Sorry for the report after the release.
On Mon, Feb 22, 2021 at 01:36:00PM +0100, Greg Kroah-Hartman wrote:
From: Emmanuel Grumbach emmanuel.grumbach@intel.com
[ Upstream commit 98c7d21f957b10d9c07a3a60a3a5a8f326a197e5 ]
I hit a NULL pointer exception in this function when the init flow went really bad.
Signed-off-by: Emmanuel Grumbach emmanuel.grumbach@intel.com Signed-off-by: Luca Coelho luciano.coelho@intel.com Signed-off-by: Kalle Valo kvalo@codeaurora.org Link: https://lore.kernel.org/r/iwlwifi.20210115130252.2e8da9f2c132.I0234d4b8ddaf7... Signed-off-by: Sasha Levin sashal@kernel.org
drivers/net/wireless/iwlwifi/pcie/tx.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/net/wireless/iwlwifi/pcie/tx.c b/drivers/net/wireless/iwlwifi/pcie/tx.c index 8dfe6b2bc7031..cb03c2855019b 100644 --- a/drivers/net/wireless/iwlwifi/pcie/tx.c +++ b/drivers/net/wireless/iwlwifi/pcie/tx.c @@ -585,6 +585,11 @@ static void iwl_pcie_txq_unmap(struct iwl_trans *trans, int txq_id) struct iwl_txq *txq = &trans_pcie->txq[txq_id]; struct iwl_queue *q = &txq->q;
+	if (!txq) {
+		IWL_ERR(trans, "Trying to free a queue that wasn't allocated?\n");
+		return;
+	}
I think that this fix is not enough. If txq is NULL, an error will still occur at "struct iwl_queue *q = &txq->q;". The following changes are required.
Is this a 4.4-only thing, or is this issue also in Linus's tree as well? If Linus's tree, please submit this as a normal patch so we can apply it there first.
thanks,
greg k-h
Hi,
On Thu, Feb 25, 2021 at 09:14:42AM +0100, Greg Kroah-Hartman wrote:
On Thu, Feb 25, 2021 at 03:04:46PM +0900, Nobuhiro Iwamatsu wrote:
Hi,
Sorry for the report after the release.
On Mon, Feb 22, 2021 at 01:36:00PM +0100, Greg Kroah-Hartman wrote:
From: Emmanuel Grumbach emmanuel.grumbach@intel.com
[ Upstream commit 98c7d21f957b10d9c07a3a60a3a5a8f326a197e5 ]
I hit a NULL pointer exception in this function when the init flow went really bad.
Signed-off-by: Emmanuel Grumbach emmanuel.grumbach@intel.com Signed-off-by: Luca Coelho luciano.coelho@intel.com Signed-off-by: Kalle Valo kvalo@codeaurora.org Link: https://lore.kernel.org/r/iwlwifi.20210115130252.2e8da9f2c132.I0234d4b8ddaf7... Signed-off-by: Sasha Levin sashal@kernel.org
drivers/net/wireless/iwlwifi/pcie/tx.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/net/wireless/iwlwifi/pcie/tx.c b/drivers/net/wireless/iwlwifi/pcie/tx.c index 8dfe6b2bc7031..cb03c2855019b 100644 --- a/drivers/net/wireless/iwlwifi/pcie/tx.c +++ b/drivers/net/wireless/iwlwifi/pcie/tx.c @@ -585,6 +585,11 @@ static void iwl_pcie_txq_unmap(struct iwl_trans *trans, int txq_id) struct iwl_txq *txq = &trans_pcie->txq[txq_id]; struct iwl_queue *q = &txq->q;
+	if (!txq) {
+		IWL_ERR(trans, "Trying to free a queue that wasn't allocated?\n");
+		return;
+	}
I think that this fix is not enough. If txq is NULL, an error will still occur at "struct iwl_queue *q = &txq->q;". The following changes are required.
Is this a 4.4-only thing, or is this issue also in Linus's tree as well? If Linus's tree, please submit this as a normal patch so we can apply it there first.
I did not have enough explanation.
This issue is only in the 4.4.y tree. The same patch has been applied to the other trees, but with the correct fix. The issue is not in Linus's tree; it is due to an incorrect fix in this commit.
I attached a patch for this issue.
thanks,
greg k-h
Best regards, Nobuhiro
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit 7a21b1d4a728a483f07c638ccd8610d4b4f12684 ]
If we get into a problem severe enough to attempt a reprobe, we schedule a worker to do that. However, if the problem gets more severe and the device is actually destroyed before this worker has had a chance to run, we end up using a freed device. Bump up the reference count of the device until the worker runs to avoid this situation.
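The general pattern is to pin the device with a reference for as long as the deferred work can still run. A minimal, hypothetical sketch of that pattern (the names reprobe_work/reprobe_fn/schedule_reprobe are made up here; this is not the iwlwifi code itself):

#include <linux/device.h>
#include <linux/kernel.h>
#include <linux/slab.h>
#include <linux/workqueue.h>

struct reprobe_work {
	struct device *dev;
	struct work_struct work;
};

static void reprobe_fn(struct work_struct *wk)
{
	struct reprobe_work *rw = container_of(wk, struct reprobe_work, work);

	/* rw->dev is still valid here, even if everyone else dropped it */
	put_device(rw->dev);		/* drop the reference taken below */
	kfree(rw);
}

static int schedule_reprobe(struct device *dev)
{
	struct reprobe_work *rw = kzalloc(sizeof(*rw), GFP_KERNEL);

	if (!rw)
		return -ENOMEM;

	rw->dev = get_device(dev);	/* pin the device across the async gap */
	INIT_WORK(&rw->work, reprobe_fn);
	schedule_work(&rw->work);
	return 0;
}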
Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Luca Coelho luciano.coelho@intel.com Signed-off-by: Kalle Valo kvalo@codeaurora.org Link: https://lore.kernel.org/r/iwlwifi.20210122144849.871f0892e4b2.I94819e11afd68... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/iwlwifi/mvm/ops.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/wireless/iwlwifi/mvm/ops.c b/drivers/net/wireless/iwlwifi/mvm/ops.c index 13c97f665ba88..bb81261de45fa 100644 --- a/drivers/net/wireless/iwlwifi/mvm/ops.c +++ b/drivers/net/wireless/iwlwifi/mvm/ops.c @@ -909,6 +909,7 @@ static void iwl_mvm_reprobe_wk(struct work_struct *wk) reprobe = container_of(wk, struct iwl_mvm_reprobe, work); if (device_reprobe(reprobe->dev)) dev_err(reprobe->dev, "reprobe failed!\n"); + put_device(reprobe->dev); kfree(reprobe); module_put(THIS_MODULE); } @@ -991,7 +992,7 @@ void iwl_mvm_nic_restart(struct iwl_mvm *mvm, bool fw_error) module_put(THIS_MODULE); return; } - reprobe->dev = mvm->trans->dev; + reprobe->dev = get_device(mvm->trans->dev); INIT_WORK(&reprobe->work, iwl_mvm_reprobe_wk); schedule_work(&reprobe->work); } else if (mvm->cur_ucode == IWL_UCODE_REGULAR) {
From: Dave Wysochanski dwysocha@redhat.com
[ Upstream commit ba6dfce47c4d002d96cd02a304132fca76981172 ]
Remove duplicated helper functions to parse opaque XDR objects and place inside new file net/sunrpc/auth_gss/auth_gss_internal.h. In the new file carry the license and copyright from the source file net/sunrpc/auth_gss/auth_gss.c. Finally, update the comment inside include/linux/sunrpc/xdr.h since lockd is not the only user of struct xdr_netobj.
Signed-off-by: Dave Wysochanski dwysocha@redhat.com Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/sunrpc/xdr.h | 3 +- net/sunrpc/auth_gss/auth_gss.c | 30 +----------------- net/sunrpc/auth_gss/auth_gss_internal.h | 42 +++++++++++++++++++++++++ net/sunrpc/auth_gss/gss_krb5_mech.c | 31 ++---------------- 4 files changed, 46 insertions(+), 60 deletions(-) create mode 100644 net/sunrpc/auth_gss/auth_gss_internal.h
diff --git a/include/linux/sunrpc/xdr.h b/include/linux/sunrpc/xdr.h index 70c6b92e15a7c..8def5e0a491fa 100644 --- a/include/linux/sunrpc/xdr.h +++ b/include/linux/sunrpc/xdr.h @@ -23,8 +23,7 @@ #define XDR_QUADLEN(l) (((l) + 3) >> 2)
/* - * Generic opaque `network object.' At the kernel level, this type - * is used only by lockd. + * Generic opaque `network object.' */ #define XDR_MAX_NETOBJ 1024 struct xdr_netobj { diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c index 62fca77bf3c70..7bde2976307ed 100644 --- a/net/sunrpc/auth_gss/auth_gss.c +++ b/net/sunrpc/auth_gss/auth_gss.c @@ -53,6 +53,7 @@ #include <asm/uaccess.h> #include <linux/hashtable.h>
+#include "auth_gss_internal.h" #include "../netns.h"
static const struct rpc_authops authgss_ops; @@ -147,35 +148,6 @@ gss_cred_set_ctx(struct rpc_cred *cred, struct gss_cl_ctx *ctx) clear_bit(RPCAUTH_CRED_NEW, &cred->cr_flags); }
-static const void * -simple_get_bytes(const void *p, const void *end, void *res, size_t len) -{ - const void *q = (const void *)((const char *)p + len); - if (unlikely(q > end || q < p)) - return ERR_PTR(-EFAULT); - memcpy(res, p, len); - return q; -} - -static inline const void * -simple_get_netobj(const void *p, const void *end, struct xdr_netobj *dest) -{ - const void *q; - unsigned int len; - - p = simple_get_bytes(p, end, &len, sizeof(len)); - if (IS_ERR(p)) - return p; - q = (const void *)((const char *)p + len); - if (unlikely(q > end || q < p)) - return ERR_PTR(-EFAULT); - dest->data = kmemdup(p, len, GFP_NOFS); - if (unlikely(dest->data == NULL)) - return ERR_PTR(-ENOMEM); - dest->len = len; - return q; -} - static struct gss_cl_ctx * gss_cred_get_ctx(struct rpc_cred *cred) { diff --git a/net/sunrpc/auth_gss/auth_gss_internal.h b/net/sunrpc/auth_gss/auth_gss_internal.h new file mode 100644 index 0000000000000..c5603242b54bf --- /dev/null +++ b/net/sunrpc/auth_gss/auth_gss_internal.h @@ -0,0 +1,42 @@ +// SPDX-License-Identifier: BSD-3-Clause +/* + * linux/net/sunrpc/auth_gss/auth_gss_internal.h + * + * Internal definitions for RPCSEC_GSS client authentication + * + * Copyright (c) 2000 The Regents of the University of Michigan. + * All rights reserved. + * + */ +#include <linux/err.h> +#include <linux/string.h> +#include <linux/sunrpc/xdr.h> + +static inline const void * +simple_get_bytes(const void *p, const void *end, void *res, size_t len) +{ + const void *q = (const void *)((const char *)p + len); + if (unlikely(q > end || q < p)) + return ERR_PTR(-EFAULT); + memcpy(res, p, len); + return q; +} + +static inline const void * +simple_get_netobj(const void *p, const void *end, struct xdr_netobj *dest) +{ + const void *q; + unsigned int len; + + p = simple_get_bytes(p, end, &len, sizeof(len)); + if (IS_ERR(p)) + return p; + q = (const void *)((const char *)p + len); + if (unlikely(q > end || q < p)) + return ERR_PTR(-EFAULT); + dest->data = kmemdup(p, len, GFP_NOFS); + if (unlikely(dest->data == NULL)) + return ERR_PTR(-ENOMEM); + dest->len = len; + return q; +} diff --git a/net/sunrpc/auth_gss/gss_krb5_mech.c b/net/sunrpc/auth_gss/gss_krb5_mech.c index 28db442a0034a..89e616da161fd 100644 --- a/net/sunrpc/auth_gss/gss_krb5_mech.c +++ b/net/sunrpc/auth_gss/gss_krb5_mech.c @@ -45,6 +45,8 @@ #include <linux/crypto.h> #include <linux/sunrpc/gss_krb5_enctypes.h>
+#include "auth_gss_internal.h" + #if IS_ENABLED(CONFIG_SUNRPC_DEBUG) # define RPCDBG_FACILITY RPCDBG_AUTH #endif @@ -186,35 +188,6 @@ get_gss_krb5_enctype(int etype) return NULL; }
-static const void * -simple_get_bytes(const void *p, const void *end, void *res, int len) -{ - const void *q = (const void *)((const char *)p + len); - if (unlikely(q > end || q < p)) - return ERR_PTR(-EFAULT); - memcpy(res, p, len); - return q; -} - -static const void * -simple_get_netobj(const void *p, const void *end, struct xdr_netobj *res) -{ - const void *q; - unsigned int len; - - p = simple_get_bytes(p, end, &len, sizeof(len)); - if (IS_ERR(p)) - return p; - q = (const void *)((const char *)p + len); - if (unlikely(q > end || q < p)) - return ERR_PTR(-EFAULT); - res->data = kmemdup(p, len, GFP_NOFS); - if (unlikely(res->data == NULL)) - return ERR_PTR(-ENOMEM); - res->len = len; - return q; -} - static inline const void * get_key(const void *p, const void *end, struct krb5_ctx *ctx, struct crypto_blkcipher **res)
From: Dave Wysochanski dwysocha@redhat.com
[ Upstream commit e4a7d1f7707eb44fd953a31dd59eff82009d879c ]
When handling an auth_gss downcall, it's possible to get a 0-length opaque object for the acceptor. In the case of a 0-length XDR object, make sure simple_get_netobj() fills in dest->data = NULL and does not continue to kmemdup(), which would set dest->data = ZERO_SIZE_PTR for the acceptor.
The trace event code can handle NULL but not ZERO_SIZE_PTR for a string, and so without this patch the rpcgss_context trace event will crash the kernel as follows:
[ 162.887992] BUG: kernel NULL pointer dereference, address: 0000000000000010
[ 162.898693] #PF: supervisor read access in kernel mode
[ 162.900830] #PF: error_code(0x0000) - not-present page
[ 162.902940] PGD 0 P4D 0
[ 162.904027] Oops: 0000 [#1] SMP PTI
[ 162.905493] CPU: 4 PID: 4321 Comm: rpc.gssd Kdump: loaded Not tainted 5.10.0 #133
[ 162.908548] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 162.910978] RIP: 0010:strlen+0x0/0x20
[ 162.912505] Code: 48 89 f9 74 09 48 83 c1 01 80 39 00 75 f7 31 d2 44 0f b6 04 16 44 88 04 11 48 83 c2 01 45 84 c0 75 ee c3 0f 1f 80 00 00 00 00 <80> 3f 00 74 10 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31
[ 162.920101] RSP: 0018:ffffaec900c77d90 EFLAGS: 00010202
[ 162.922263] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000fffde697
[ 162.925158] RDX: 000000000000002f RSI: 0000000000000080 RDI: 0000000000000010
[ 162.928073] RBP: 0000000000000010 R08: 0000000000000e10 R09: 0000000000000000
[ 162.930976] R10: ffff8e698a590cb8 R11: 0000000000000001 R12: 0000000000000e10
[ 162.933883] R13: 00000000fffde697 R14: 000000010034d517 R15: 0000000000070028
[ 162.936777] FS: 00007f1e1eb93700(0000) GS:ffff8e6ab7d00000(0000) knlGS:0000000000000000
[ 162.940067] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 162.942417] CR2: 0000000000000010 CR3: 0000000104eba000 CR4: 00000000000406e0
[ 162.945300] Call Trace:
[ 162.946428] trace_event_raw_event_rpcgss_context+0x84/0x140 [auth_rpcgss]
[ 162.949308] ? __kmalloc_track_caller+0x35/0x5a0
[ 162.951224] ? gss_pipe_downcall+0x3a3/0x6a0 [auth_rpcgss]
[ 162.953484] gss_pipe_downcall+0x585/0x6a0 [auth_rpcgss]
[ 162.955953] rpc_pipe_write+0x58/0x70 [sunrpc]
[ 162.957849] vfs_write+0xcb/0x2c0
[ 162.959264] ksys_write+0x68/0xe0
[ 162.960706] do_syscall_64+0x33/0x40
[ 162.962238] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 162.964346] RIP: 0033:0x7f1e1f1e57df
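For reference, the faulting address 0000000000000010 is the ZERO_SIZE_PTR sentinel that a zero-length kmalloc()/kmemdup() returns instead of NULL (quoted from memory of include/linux/slab.h; check the tree for the exact text):

#define ZERO_SIZE_PTR ((void *)16)

#define ZERO_OR_NULL_PTR(x) ((unsigned long)(x) <= \
				(unsigned long)ZERO_SIZE_PTR)

Calling strlen() on that sentinel from the rpcgss_context trace event is what faults at address 0x10 above.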
Signed-off-by: Dave Wysochanski dwysocha@redhat.com Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/sunrpc/auth_gss/auth_gss_internal.h | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/net/sunrpc/auth_gss/auth_gss_internal.h b/net/sunrpc/auth_gss/auth_gss_internal.h index c5603242b54bf..f6d9631bd9d00 100644 --- a/net/sunrpc/auth_gss/auth_gss_internal.h +++ b/net/sunrpc/auth_gss/auth_gss_internal.h @@ -34,9 +34,12 @@ simple_get_netobj(const void *p, const void *end, struct xdr_netobj *dest) q = (const void *)((const char *)p + len); if (unlikely(q > end || q < p)) return ERR_PTR(-EFAULT); - dest->data = kmemdup(p, len, GFP_NOFS); - if (unlikely(dest->data == NULL)) - return ERR_PTR(-ENOMEM); + if (len) { + dest->data = kmemdup(p, len, GFP_NOFS); + if (unlikely(dest->data == NULL)) + return ERR_PTR(-ENOMEM); + } else + dest->data = NULL; dest->len = len; return q; }
From: Tobin C. Harding tobin@kernel.org
[ Upstream commit 458a3bf82df4fe1f951d0f52b1e0c1e9d5a88a3b ]
We have a function to copy strings safely and we have a function to copy strings and zero the tail of the destination (if the source string is shorter than the destination buffer), but we do not have a function that does both at once. This means developers must write it themselves if they want this behavior. This is a chore, and also leaves us open to off-by-one errors unnecessarily.
Add a function that calls strscpy() then memset()s the tail to zero if the source string is shorter than the destination buffer.
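As a rough illustration of the difference, a hypothetical snippet (not part of the patch):

#include <linux/string.h>

static void pad_demo(void)
{
	char name[8] = "XXXXXXX";		/* buffer starts full of stale bytes */

	strscpy(name, "ab", sizeof(name));	/* copies "ab\0"; name[3..7] keep their old contents */
	strscpy_pad(name, "ab", sizeof(name));	/* copies "ab\0" and zeroes name[3..7] as well */
}

Both calls return 2 (the number of characters copied); only the padded variant clears the unused tail of the buffer.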
Acked-by: Kees Cook keescook@chromium.org Signed-off-by: Tobin C. Harding tobin@kernel.org Signed-off-by: Shuah Khan shuah@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/string.h | 4 ++++ lib/string.c | 47 +++++++++++++++++++++++++++++++++++------- 2 files changed, 44 insertions(+), 7 deletions(-)
diff --git a/include/linux/string.h b/include/linux/string.h index 870268d42ae7d..7da409760cf18 100644 --- a/include/linux/string.h +++ b/include/linux/string.h @@ -28,6 +28,10 @@ size_t strlcpy(char *, const char *, size_t); #ifndef __HAVE_ARCH_STRSCPY ssize_t strscpy(char *, const char *, size_t); #endif + +/* Wraps calls to strscpy()/memset(), no arch specific code required */ +ssize_t strscpy_pad(char *dest, const char *src, size_t count); + #ifndef __HAVE_ARCH_STRCAT extern char * strcat(char *, const char *); #endif diff --git a/lib/string.c b/lib/string.c index 7f4baad6fb193..4351ec43cd6b8 100644 --- a/lib/string.c +++ b/lib/string.c @@ -157,11 +157,9 @@ EXPORT_SYMBOL(strlcpy); * @src: Where to copy the string from * @count: Size of destination buffer * - * Copy the string, or as much of it as fits, into the dest buffer. - * The routine returns the number of characters copied (not including - * the trailing NUL) or -E2BIG if the destination buffer wasn't big enough. - * The behavior is undefined if the string buffers overlap. - * The destination buffer is always NUL terminated, unless it's zero-sized. + * Copy the string, or as much of it as fits, into the dest buffer. The + * behavior is undefined if the string buffers overlap. The destination + * buffer is always NUL terminated, unless it's zero-sized. * * Preferred to strlcpy() since the API doesn't require reading memory * from the src string beyond the specified "count" bytes, and since @@ -171,8 +169,10 @@ EXPORT_SYMBOL(strlcpy); * * Preferred to strncpy() since it always returns a valid string, and * doesn't unnecessarily force the tail of the destination buffer to be - * zeroed. If the zeroing is desired, it's likely cleaner to use strscpy() - * with an overflow test, then just memset() the tail of the dest buffer. + * zeroed. If zeroing is desired please use strscpy_pad(). + * + * Return: The number of characters copied (not including the trailing + * %NUL) or -E2BIG if the destination buffer wasn't big enough. */ ssize_t strscpy(char *dest, const char *src, size_t count) { @@ -259,6 +259,39 @@ char *stpcpy(char *__restrict__ dest, const char *__restrict__ src) } EXPORT_SYMBOL(stpcpy);
+/** + * strscpy_pad() - Copy a C-string into a sized buffer + * @dest: Where to copy the string to + * @src: Where to copy the string from + * @count: Size of destination buffer + * + * Copy the string, or as much of it as fits, into the dest buffer. The + * behavior is undefined if the string buffers overlap. The destination + * buffer is always %NUL terminated, unless it's zero-sized. + * + * If the source string is shorter than the destination buffer, zeros + * the tail of the destination buffer. + * + * For full explanation of why you may want to consider using the + * 'strscpy' functions please see the function docstring for strscpy(). + * + * Return: The number of characters copied (not including the trailing + * %NUL) or -E2BIG if the destination buffer wasn't big enough. + */ +ssize_t strscpy_pad(char *dest, const char *src, size_t count) +{ + ssize_t written; + + written = strscpy(dest, src, count); + if (written < 0 || written == count - 1) + return written; + + memset(dest + written + 1, 0, count - written - 1); + + return written; +} +EXPORT_SYMBOL(strscpy_pad); + #ifndef __HAVE_ARCH_STRCAT /** * strcat - Append one %NUL-terminated string to another
From: Qian Cai cai@lca.pw
[ Upstream commit d1a445d3b86c9341ce7a0954c23be0edb5c9bec5 ]
There are many of those warnings.
In file included from ./arch/powerpc/include/asm/paca.h:15,
                 from ./arch/powerpc/include/asm/current.h:13,
                 from ./include/linux/thread_info.h:21,
                 from ./include/asm-generic/preempt.h:5,
                 from ./arch/powerpc/include/generated/asm/preempt.h:1,
                 from ./include/linux/preempt.h:78,
                 from ./include/linux/spinlock.h:51,
                 from fs/fs-writeback.c:19:
In function 'strncpy',
    inlined from 'perf_trace_writeback_page_template' at ./include/trace/events/writeback.h:56:1:
./include/linux/string.h:260:9: warning: '__builtin_strncpy' specified bound 32 equals destination size [-Wstringop-truncation]
  return __builtin_strncpy(p, q, size);
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fix it by replacing strncpy() with the new strscpy_pad(), introduced in "lib/string: Add strscpy_pad() function", which always NUL-terminates the destination. Also, change strlcpy() to use strscpy_pad() in this file for consistency.
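A hypothetical reproducer of this warning class (not the actual trace event code), assuming a fixed 32-byte destination like __entry->name, might look like:

#include <linux/string.h>

struct demo_entry {
	char name[32];
};

static void demo_assign(struct demo_entry *e, const char *src)
{
	/* gcc -Wstringop-truncation: "specified bound 32 equals destination
	 * size" -- the copy may be left without a terminating NUL
	 */
	strncpy(e->name, src, 32);

	/* what this patch switches to: always NUL-terminated, tail zeroed */
	strscpy_pad(e->name, src, 32);
}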
Link: http://lkml.kernel.org/r/1564075099-27750-1-git-send-email-cai@lca.pw Fixes: 455b2864686d ("writeback: Initial tracing support") Fixes: 028c2dd184c0 ("writeback: Add tracing to balance_dirty_pages") Fixes: e84d0a4f8e39 ("writeback: trace event writeback_queue_io") Fixes: b48c104d2211 ("writeback: trace event bdi_dirty_ratelimit") Fixes: cc1676d917f3 ("writeback: Move requeueing when I_SYNC set to writeback_sb_inodes()") Fixes: 9fb0a7da0c52 ("writeback: add more tracepoints") Signed-off-by: Qian Cai cai@lca.pw Reviewed-by: Jan Kara jack@suse.cz Cc: Tobin C. Harding tobin@kernel.org Cc: Steven Rostedt (VMware) rostedt@goodmis.org Cc: Ingo Molnar mingo@redhat.com Cc: Tejun Heo tj@kernel.org Cc: Dave Chinner dchinner@redhat.com Cc: Fengguang Wu fengguang.wu@intel.com Cc: Jens Axboe axboe@kernel.dk Cc: Joe Perches joe@perches.com Cc: Kees Cook keescook@chromium.org Cc: Jann Horn jannh@google.com Cc: Jonathan Corbet corbet@lwn.net Cc: Nitin Gote nitin.r.gote@intel.com Cc: Rasmus Villemoes rasmus.villemoes@prevas.dk Cc: Stephen Kitt steve@sk2.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/trace/events/writeback.h | 38 +++++++++++++++++--------------- 1 file changed, 20 insertions(+), 18 deletions(-)
diff --git a/include/trace/events/writeback.h b/include/trace/events/writeback.h index 2609b1c3549e2..07a912af1dd74 100644 --- a/include/trace/events/writeback.h +++ b/include/trace/events/writeback.h @@ -65,8 +65,9 @@ TRACE_EVENT(writeback_dirty_page, ),
TP_fast_assign( - strncpy(__entry->name, - mapping ? dev_name(inode_to_bdi(mapping->host)->dev) : "(unknown)", 32); + strscpy_pad(__entry->name, + mapping ? dev_name(inode_to_bdi(mapping->host)->dev) : "(unknown)", + 32); __entry->ino = mapping ? mapping->host->i_ino : 0; __entry->index = page->index; ), @@ -95,8 +96,8 @@ DECLARE_EVENT_CLASS(writeback_dirty_inode_template, struct backing_dev_info *bdi = inode_to_bdi(inode);
/* may be called for files on pseudo FSes w/ unregistered bdi */ - strncpy(__entry->name, - bdi->dev ? dev_name(bdi->dev) : "(unknown)", 32); + strscpy_pad(__entry->name, + bdi->dev ? dev_name(bdi->dev) : "(unknown)", 32); __entry->ino = inode->i_ino; __entry->state = inode->i_state; __entry->flags = flags; @@ -205,8 +206,8 @@ DECLARE_EVENT_CLASS(writeback_write_inode_template, ),
TP_fast_assign( - strncpy(__entry->name, - dev_name(inode_to_bdi(inode)->dev), 32); + strscpy_pad(__entry->name, + dev_name(inode_to_bdi(inode)->dev), 32); __entry->ino = inode->i_ino; __entry->sync_mode = wbc->sync_mode; __trace_wbc_assign_cgroup(__get_str(cgroup), wbc); @@ -249,8 +250,9 @@ DECLARE_EVENT_CLASS(writeback_work_class, __dynamic_array(char, cgroup, __trace_wb_cgroup_size(wb)) ), TP_fast_assign( - strncpy(__entry->name, - wb->bdi->dev ? dev_name(wb->bdi->dev) : "(unknown)", 32); + strscpy_pad(__entry->name, + wb->bdi->dev ? dev_name(wb->bdi->dev) : + "(unknown)", 32); __entry->nr_pages = work->nr_pages; __entry->sb_dev = work->sb ? work->sb->s_dev : 0; __entry->sync_mode = work->sync_mode; @@ -303,7 +305,7 @@ DECLARE_EVENT_CLASS(writeback_class, __dynamic_array(char, cgroup, __trace_wb_cgroup_size(wb)) ), TP_fast_assign( - strncpy(__entry->name, dev_name(wb->bdi->dev), 32); + strscpy_pad(__entry->name, dev_name(wb->bdi->dev), 32); __trace_wb_assign_cgroup(__get_str(cgroup), wb); ), TP_printk("bdi %s: cgroup=%s", @@ -326,7 +328,7 @@ TRACE_EVENT(writeback_bdi_register, __array(char, name, 32) ), TP_fast_assign( - strncpy(__entry->name, dev_name(bdi->dev), 32); + strscpy_pad(__entry->name, dev_name(bdi->dev), 32); ), TP_printk("bdi %s", __entry->name @@ -351,7 +353,7 @@ DECLARE_EVENT_CLASS(wbc_class, ),
TP_fast_assign( - strncpy(__entry->name, dev_name(bdi->dev), 32); + strscpy_pad(__entry->name, dev_name(bdi->dev), 32); __entry->nr_to_write = wbc->nr_to_write; __entry->pages_skipped = wbc->pages_skipped; __entry->sync_mode = wbc->sync_mode; @@ -402,7 +404,7 @@ TRACE_EVENT(writeback_queue_io, __dynamic_array(char, cgroup, __trace_wb_cgroup_size(wb)) ), TP_fast_assign( - strncpy(__entry->name, dev_name(wb->bdi->dev), 32); + strncpy_pad(__entry->name, dev_name(wb->bdi->dev), 32); __entry->older = dirtied_before; __entry->age = (jiffies - dirtied_before) * 1000 / HZ; __entry->moved = moved; @@ -487,7 +489,7 @@ TRACE_EVENT(bdi_dirty_ratelimit, ),
TP_fast_assign( - strlcpy(__entry->bdi, dev_name(wb->bdi->dev), 32); + strscpy_pad(__entry->bdi, dev_name(wb->bdi->dev), 32); __entry->write_bw = KBps(wb->write_bandwidth); __entry->avg_write_bw = KBps(wb->avg_write_bandwidth); __entry->dirty_rate = KBps(dirty_rate); @@ -552,7 +554,7 @@ TRACE_EVENT(balance_dirty_pages,
TP_fast_assign( unsigned long freerun = (thresh + bg_thresh) / 2; - strlcpy(__entry->bdi, dev_name(wb->bdi->dev), 32); + strscpy_pad(__entry->bdi, dev_name(wb->bdi->dev), 32);
__entry->limit = global_wb_domain.dirty_limit; __entry->setpoint = (global_wb_domain.dirty_limit + @@ -613,8 +615,8 @@ TRACE_EVENT(writeback_sb_inodes_requeue, ),
TP_fast_assign( - strncpy(__entry->name, - dev_name(inode_to_bdi(inode)->dev), 32); + strscpy_pad(__entry->name, + dev_name(inode_to_bdi(inode)->dev), 32); __entry->ino = inode->i_ino; __entry->state = inode->i_state; __entry->dirtied_when = inode->dirtied_when; @@ -687,8 +689,8 @@ DECLARE_EVENT_CLASS(writeback_single_inode_template, ),
TP_fast_assign( - strncpy(__entry->name, - dev_name(inode_to_bdi(inode)->dev), 32); + strscpy_pad(__entry->name, + dev_name(inode_to_bdi(inode)->dev), 32); __entry->ino = inode->i_ino; __entry->state = inode->i_state; __entry->dirtied_when = inode->dirtied_when;
From: Theodore Ts'o tytso@mit.edu
[ Upstream commit 68f23b89067fdf187763e75a56087550624fdbee ]
Without memcg, there is a one-to-one mapping between the bdi and bdi_writeback structures. In this world, things are fairly straightforward; the first thing bdi_unregister() does is to shut down the bdi_writeback structure (or wb), and part of that shutdown ensures that no other work is queued against the wb, and that the wb is fully drained.
With memcg, however, there is a one-to-many relationship between the bdi and bdi_writeback structures; that is, there are multiple wb objects which can all point to a single bdi. There is a refcount which prevents the bdi object from being released (and hence, unregistered). So in theory, the bdi_unregister() *should* only get called once its refcount goes to zero (bdi_put will drop the refcount, and when it is zero, release_bdi gets called, which calls bdi_unregister).
Unfortunately, del_gendisk() in block/genhd.c never got the memo about the Brave New memcg World, and calls bdi_unregister() directly. It does this without informing the file system, the memcg code, or anything else. This causes the root wb associated with the bdi to be unregistered, but none of the memcg-specific wb's are shut down. So when one of these wb's is woken up to do delayed work, it tries to dereference its wb->bdi->dev to fetch the device name, but unfortunately bdi->dev is now NULL, thanks to the bdi_unregister() called by del_gendisk(). As a result, *boom*.
Fortunately, it looks like the rest of the writeback path is perfectly happy with bdi->dev and bdi->owner being NULL, so the simplest fix is to create a bdi_dev_name() function which can handle bdi->dev being NULL. This also allows us to bulletproof the writeback tracepoints to prevent them from dereferencing a NULL pointer and crashing the kernel if one is tracing with memcg's enabled, and an iSCSI device dies or a USB storage stick is pulled.
The most common way of triggering this will be hotremoval of a device while writeback with memcg enabled is going on. It was triggering several times a day in a heavily loaded production environment.
Google Bug Id: 145475544
Link: https://lore.kernel.org/r/20191227194829.150110-1-tytso@mit.edu Link: http://lkml.kernel.org/r/20191228005211.163952-1-tytso@mit.edu Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: Chris Mason clm@fb.com Cc: Tejun Heo tj@kernel.org Cc: Jens Axboe axboe@kernel.dk Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/fs-writeback.c | 2 +- include/linux/backing-dev.h | 10 ++++++++++ include/trace/events/writeback.h | 29 +++++++++++++---------------- mm/backing-dev.c | 1 + 4 files changed, 25 insertions(+), 17 deletions(-)
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c index 66a9c9dab8316..7f068330edb67 100644 --- a/fs/fs-writeback.c +++ b/fs/fs-writeback.c @@ -1929,7 +1929,7 @@ void wb_workfn(struct work_struct *work) struct bdi_writeback, dwork); long pages_written;
- set_worker_desc("flush-%s", dev_name(wb->bdi->dev)); + set_worker_desc("flush-%s", bdi_dev_name(wb->bdi)); current->flags |= PF_SWAPWRITE;
if (likely(!current_is_workqueue_rescuer() || diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h index 361274ce5815f..883ce03191e76 100644 --- a/include/linux/backing-dev.h +++ b/include/linux/backing-dev.h @@ -12,6 +12,7 @@ #include <linux/fs.h> #include <linux/sched.h> #include <linux/blkdev.h> +#include <linux/device.h> #include <linux/writeback.h> #include <linux/blk-cgroup.h> #include <linux/backing-dev-defs.h> @@ -518,4 +519,13 @@ static inline int bdi_rw_congested(struct backing_dev_info *bdi) (1 << WB_async_congested)); }
+extern const char *bdi_unknown_name; + +static inline const char *bdi_dev_name(struct backing_dev_info *bdi) +{ + if (!bdi || !bdi->dev) + return bdi_unknown_name; + return dev_name(bdi->dev); +} + #endif /* _LINUX_BACKING_DEV_H */ diff --git a/include/trace/events/writeback.h b/include/trace/events/writeback.h index 07a912af1dd74..d01217407d6d8 100644 --- a/include/trace/events/writeback.h +++ b/include/trace/events/writeback.h @@ -66,8 +66,8 @@ TRACE_EVENT(writeback_dirty_page,
TP_fast_assign( strscpy_pad(__entry->name, - mapping ? dev_name(inode_to_bdi(mapping->host)->dev) : "(unknown)", - 32); + bdi_dev_name(mapping ? inode_to_bdi(mapping->host) : + NULL), 32); __entry->ino = mapping ? mapping->host->i_ino : 0; __entry->index = page->index; ), @@ -96,8 +96,7 @@ DECLARE_EVENT_CLASS(writeback_dirty_inode_template, struct backing_dev_info *bdi = inode_to_bdi(inode);
/* may be called for files on pseudo FSes w/ unregistered bdi */ - strscpy_pad(__entry->name, - bdi->dev ? dev_name(bdi->dev) : "(unknown)", 32); + strscpy_pad(__entry->name, bdi_dev_name(bdi), 32); __entry->ino = inode->i_ino; __entry->state = inode->i_state; __entry->flags = flags; @@ -207,7 +206,7 @@ DECLARE_EVENT_CLASS(writeback_write_inode_template,
TP_fast_assign( strscpy_pad(__entry->name, - dev_name(inode_to_bdi(inode)->dev), 32); + bdi_dev_name(inode_to_bdi(inode)), 32); __entry->ino = inode->i_ino; __entry->sync_mode = wbc->sync_mode; __trace_wbc_assign_cgroup(__get_str(cgroup), wbc); @@ -250,9 +249,7 @@ DECLARE_EVENT_CLASS(writeback_work_class, __dynamic_array(char, cgroup, __trace_wb_cgroup_size(wb)) ), TP_fast_assign( - strscpy_pad(__entry->name, - wb->bdi->dev ? dev_name(wb->bdi->dev) : - "(unknown)", 32); + strscpy_pad(__entry->name, bdi_dev_name(wb->bdi), 32); __entry->nr_pages = work->nr_pages; __entry->sb_dev = work->sb ? work->sb->s_dev : 0; __entry->sync_mode = work->sync_mode; @@ -305,7 +302,7 @@ DECLARE_EVENT_CLASS(writeback_class, __dynamic_array(char, cgroup, __trace_wb_cgroup_size(wb)) ), TP_fast_assign( - strscpy_pad(__entry->name, dev_name(wb->bdi->dev), 32); + strscpy_pad(__entry->name, bdi_dev_name(wb->bdi), 32); __trace_wb_assign_cgroup(__get_str(cgroup), wb); ), TP_printk("bdi %s: cgroup=%s", @@ -328,7 +325,7 @@ TRACE_EVENT(writeback_bdi_register, __array(char, name, 32) ), TP_fast_assign( - strscpy_pad(__entry->name, dev_name(bdi->dev), 32); + strscpy_pad(__entry->name, bdi_dev_name(bdi), 32); ), TP_printk("bdi %s", __entry->name @@ -353,7 +350,7 @@ DECLARE_EVENT_CLASS(wbc_class, ),
TP_fast_assign( - strscpy_pad(__entry->name, dev_name(bdi->dev), 32); + strscpy_pad(__entry->name, bdi_dev_name(bdi), 32); __entry->nr_to_write = wbc->nr_to_write; __entry->pages_skipped = wbc->pages_skipped; __entry->sync_mode = wbc->sync_mode; @@ -404,7 +401,7 @@ TRACE_EVENT(writeback_queue_io, __dynamic_array(char, cgroup, __trace_wb_cgroup_size(wb)) ), TP_fast_assign( - strncpy_pad(__entry->name, dev_name(wb->bdi->dev), 32); + strscpy_pad(__entry->name, bdi_dev_name(wb->bdi), 32); __entry->older = dirtied_before; __entry->age = (jiffies - dirtied_before) * 1000 / HZ; __entry->moved = moved; @@ -489,7 +486,7 @@ TRACE_EVENT(bdi_dirty_ratelimit, ),
TP_fast_assign( - strscpy_pad(__entry->bdi, dev_name(wb->bdi->dev), 32); + strscpy_pad(__entry->bdi, bdi_dev_name(wb->bdi), 32); __entry->write_bw = KBps(wb->write_bandwidth); __entry->avg_write_bw = KBps(wb->avg_write_bandwidth); __entry->dirty_rate = KBps(dirty_rate); @@ -554,7 +551,7 @@ TRACE_EVENT(balance_dirty_pages,
TP_fast_assign( unsigned long freerun = (thresh + bg_thresh) / 2; - strscpy_pad(__entry->bdi, dev_name(wb->bdi->dev), 32); + strscpy_pad(__entry->bdi, bdi_dev_name(wb->bdi), 32);
__entry->limit = global_wb_domain.dirty_limit; __entry->setpoint = (global_wb_domain.dirty_limit + @@ -616,7 +613,7 @@ TRACE_EVENT(writeback_sb_inodes_requeue,
TP_fast_assign( strscpy_pad(__entry->name, - dev_name(inode_to_bdi(inode)->dev), 32); + bdi_dev_name(inode_to_bdi(inode)), 32); __entry->ino = inode->i_ino; __entry->state = inode->i_state; __entry->dirtied_when = inode->dirtied_when; @@ -690,7 +687,7 @@ DECLARE_EVENT_CLASS(writeback_single_inode_template,
TP_fast_assign( strscpy_pad(__entry->name, - dev_name(inode_to_bdi(inode)->dev), 32); + bdi_dev_name(inode_to_bdi(inode)), 32); __entry->ino = inode->i_ino; __entry->state = inode->i_state; __entry->dirtied_when = inode->dirtied_when; diff --git a/mm/backing-dev.c b/mm/backing-dev.c index 07e3b3b8e8469..f705c58b320b8 100644 --- a/mm/backing-dev.c +++ b/mm/backing-dev.c @@ -21,6 +21,7 @@ struct backing_dev_info noop_backing_dev_info = { EXPORT_SYMBOL_GPL(noop_backing_dev_info);
static struct class *bdi_class; +const char *bdi_unknown_name = "(unknown)";
/* * bdi_lock protects updates to bdi_list. bdi_list has RCU reader side
From: Phillip Lougher phillip@squashfs.org.uk
commit f37aa4c7366e23f91b81d00bafd6a7ab54e4a381 upstream.
Syzbot has reported a number of "slab-out-of-bounds read" and "use-after-free read" errors, which have been identified as being caused by a corrupted index value read from the inode. This could be because the metadata block is uncompressed, or because the "compression" bit has been corrupted (turning a compressed block into an uncompressed block).
This patch adds additional sanity checks to detect this, and the following corruption.
1. It checks against corruption of the ids count. This can lead either to a larger table being read, or to a smaller than expected table being read.
In the case of a too large ids count, this would often have been trapped by the existing sanity checks, but this patch introduces a more exact check, which can identify too small values.
2. It checks the contents of the index table for corruption.
Link: https://lkml.kernel.org/r/20210204130249.4495-3-phillip@squashfs.org.uk Signed-off-by: Phillip Lougher phillip@squashfs.org.uk Reported-by: syzbot+b06d57ba83f604522af2@syzkaller.appspotmail.com Reported-by: syzbot+c021ba012da41ee9807c@syzkaller.appspotmail.com Reported-by: syzbot+5024636e8b5fd19f0f19@syzkaller.appspotmail.com Reported-by: syzbot+bcbc661df46657d0fa4f@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/squashfs/id.c | 40 ++++++++++++++++++++++++++++++++-------- fs/squashfs/squashfs_fs_sb.h | 1 + fs/squashfs/super.c | 6 +++--- fs/squashfs/xattr.h | 10 +++++++++- 4 files changed, 45 insertions(+), 12 deletions(-)
--- a/fs/squashfs/id.c +++ b/fs/squashfs/id.c @@ -48,10 +48,15 @@ int squashfs_get_id(struct super_block * struct squashfs_sb_info *msblk = sb->s_fs_info; int block = SQUASHFS_ID_BLOCK(index); int offset = SQUASHFS_ID_BLOCK_OFFSET(index); - u64 start_block = le64_to_cpu(msblk->id_table[block]); + u64 start_block; __le32 disk_id; int err;
+ if (index >= msblk->ids) + return -EINVAL; + + start_block = le64_to_cpu(msblk->id_table[block]); + err = squashfs_read_metadata(sb, &disk_id, &start_block, &offset, sizeof(disk_id)); if (err < 0) @@ -69,7 +74,10 @@ __le64 *squashfs_read_id_index_table(str u64 id_table_start, u64 next_table, unsigned short no_ids) { unsigned int length = SQUASHFS_ID_BLOCK_BYTES(no_ids); + unsigned int indexes = SQUASHFS_ID_BLOCKS(no_ids); + int n; __le64 *table; + u64 start, end;
TRACE("In read_id_index_table, length %d\n", length);
@@ -80,20 +88,36 @@ __le64 *squashfs_read_id_index_table(str return ERR_PTR(-EINVAL);
/* - * length bytes should not extend into the next table - this check - * also traps instances where id_table_start is incorrectly larger - * than the next table start + * The computed size of the index table (length bytes) should exactly + * match the table start and end points */ - if (id_table_start + length > next_table) + if (length != (next_table - id_table_start)) return ERR_PTR(-EINVAL);
table = squashfs_read_table(sb, id_table_start, length); + if (IS_ERR(table)) + return table;
/* - * table[0] points to the first id lookup table metadata block, this - * should be less than id_table_start + * table[0], table[1], ... table[indexes - 1] store the locations + * of the compressed id blocks. Each entry should be less than + * the next (i.e. table[0] < table[1]), and the difference between them + * should be SQUASHFS_METADATA_SIZE or less. table[indexes - 1] + * should be less than id_table_start, and again the difference + * should be SQUASHFS_METADATA_SIZE or less */ - if (!IS_ERR(table) && le64_to_cpu(table[0]) >= id_table_start) { + for (n = 0; n < (indexes - 1); n++) { + start = le64_to_cpu(table[n]); + end = le64_to_cpu(table[n + 1]); + + if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) { + kfree(table); + return ERR_PTR(-EINVAL); + } + } + + start = le64_to_cpu(table[indexes - 1]); + if (start >= id_table_start || (id_table_start - start) > SQUASHFS_METADATA_SIZE) { kfree(table); return ERR_PTR(-EINVAL); } --- a/fs/squashfs/squashfs_fs_sb.h +++ b/fs/squashfs/squashfs_fs_sb.h @@ -77,5 +77,6 @@ struct squashfs_sb_info { unsigned int inodes; unsigned int fragments; int xattr_ids; + unsigned int ids; }; #endif --- a/fs/squashfs/super.c +++ b/fs/squashfs/super.c @@ -177,6 +177,7 @@ static int squashfs_fill_super(struct su msblk->directory_table = le64_to_cpu(sblk->directory_table_start); msblk->inodes = le32_to_cpu(sblk->inodes); msblk->fragments = le32_to_cpu(sblk->fragments); + msblk->ids = le16_to_cpu(sblk->no_ids); flags = le16_to_cpu(sblk->flags);
TRACE("Found valid superblock on %s\n", bdevname(sb->s_bdev, b)); @@ -188,7 +189,7 @@ static int squashfs_fill_super(struct su TRACE("Block size %d\n", msblk->block_size); TRACE("Number of inodes %d\n", msblk->inodes); TRACE("Number of fragments %d\n", msblk->fragments); - TRACE("Number of ids %d\n", le16_to_cpu(sblk->no_ids)); + TRACE("Number of ids %d\n", msblk->ids); TRACE("sblk->inode_table_start %llx\n", msblk->inode_table); TRACE("sblk->directory_table_start %llx\n", msblk->directory_table); TRACE("sblk->fragment_table_start %llx\n", @@ -245,8 +246,7 @@ static int squashfs_fill_super(struct su allocate_id_index_table: /* Allocate and read id index table */ msblk->id_table = squashfs_read_id_index_table(sb, - le64_to_cpu(sblk->id_table_start), next_table, - le16_to_cpu(sblk->no_ids)); + le64_to_cpu(sblk->id_table_start), next_table, msblk->ids); if (IS_ERR(msblk->id_table)) { ERROR("unable to read id index table\n"); err = PTR_ERR(msblk->id_table); --- a/fs/squashfs/xattr.h +++ b/fs/squashfs/xattr.h @@ -30,8 +30,16 @@ extern int squashfs_xattr_lookup(struct static inline __le64 *squashfs_read_xattr_id_table(struct super_block *sb, u64 start, u64 *xattr_table_start, int *xattr_ids) { + struct squashfs_xattr_id_table *id_table; + + id_table = squashfs_read_table(sb, start, sizeof(*id_table)); + if (IS_ERR(id_table)) + return (__le64 *) id_table; + + *xattr_table_start = le64_to_cpu(id_table->xattr_table_start); + kfree(id_table); + ERROR("Xattrs in filesystem, these will be ignored\n"); - *xattr_table_start = start; return ERR_PTR(-ENOTSUPP); }
From: Phillip Lougher phillip@squashfs.org.uk
commit eabac19e40c095543def79cb6ffeb3a8588aaff4 upstream.
Syzbot has reported a "slab-out-of-bounds read" error, which has been identified as being caused by a corrupted "ino_num" value read from the inode. This could be because the metadata block is uncompressed, or because the "compression" bit has been corrupted (turning a compressed block into an uncompressed block).
This patch adds additional sanity checks to detect this, and the following corruption.
1. It checks against corruption of the inodes count. This can lead either to a larger table being read, or to a smaller than expected table being read.
In the case of a too large inodes count, this would often have been trapped by the existing sanity checks, but this patch introduces a more exact check, which can identify too small values.
2. It checks the contents of the index table for corruption.
[phillip@squashfs.org.uk: fix checkpatch issue] Link: https://lkml.kernel.org/r/527909353.754618.1612769948607@webmail.123-reg.co....
Link: https://lkml.kernel.org/r/20210204130249.4495-4-phillip@squashfs.org.uk Signed-off-by: Phillip Lougher phillip@squashfs.org.uk Reported-by: syzbot+04419e3ff19d2970ea28@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/squashfs/export.c | 41 +++++++++++++++++++++++++++++++++-------- 1 file changed, 33 insertions(+), 8 deletions(-)
--- a/fs/squashfs/export.c +++ b/fs/squashfs/export.c @@ -54,12 +54,17 @@ static long long squashfs_inode_lookup(s struct squashfs_sb_info *msblk = sb->s_fs_info; int blk = SQUASHFS_LOOKUP_BLOCK(ino_num - 1); int offset = SQUASHFS_LOOKUP_BLOCK_OFFSET(ino_num - 1); - u64 start = le64_to_cpu(msblk->inode_lookup_table[blk]); + u64 start; __le64 ino; int err;
TRACE("Entered squashfs_inode_lookup, inode_number = %d\n", ino_num);
+ if (ino_num == 0 || (ino_num - 1) >= msblk->inodes) + return -EINVAL; + + start = le64_to_cpu(msblk->inode_lookup_table[blk]); + err = squashfs_read_metadata(sb, &ino, &start, &offset, sizeof(ino)); if (err < 0) return err; @@ -124,7 +129,10 @@ __le64 *squashfs_read_inode_lookup_table u64 lookup_table_start, u64 next_table, unsigned int inodes) { unsigned int length = SQUASHFS_LOOKUP_BLOCK_BYTES(inodes); + unsigned int indexes = SQUASHFS_LOOKUP_BLOCKS(inodes); + int n; __le64 *table; + u64 start, end;
TRACE("In read_inode_lookup_table, length %d\n", length);
@@ -134,20 +142,37 @@ __le64 *squashfs_read_inode_lookup_table if (inodes == 0) return ERR_PTR(-EINVAL);
- /* length bytes should not extend into the next table - this check - * also traps instances where lookup_table_start is incorrectly larger - * than the next table start + /* + * The computed size of the lookup table (length bytes) should exactly + * match the table start and end points */ - if (lookup_table_start + length > next_table) + if (length != (next_table - lookup_table_start)) return ERR_PTR(-EINVAL);
table = squashfs_read_table(sb, lookup_table_start, length); + if (IS_ERR(table)) + return table;
/* - * table[0] points to the first inode lookup table metadata block, - * this should be less than lookup_table_start + * table[0], table[1], ... table[indexes - 1] store the locations + * of the compressed inode lookup blocks. Each entry should be + * less than the next (i.e. table[0] < table[1]), and the difference + * between them should be SQUASHFS_METADATA_SIZE or less. + * table[indexes - 1] should be less than lookup_table_start, and + * again the difference should be SQUASHFS_METADATA_SIZE or less */ - if (!IS_ERR(table) && le64_to_cpu(table[0]) >= lookup_table_start) { + for (n = 0; n < (indexes - 1); n++) { + start = le64_to_cpu(table[n]); + end = le64_to_cpu(table[n + 1]); + + if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) { + kfree(table); + return ERR_PTR(-EINVAL); + } + } + + start = le64_to_cpu(table[indexes - 1]); + if (start >= lookup_table_start || (lookup_table_start - start) > SQUASHFS_METADATA_SIZE) { kfree(table); return ERR_PTR(-EINVAL); }
From: Phillip Lougher phillip@squashfs.org.uk
commit 506220d2ba21791314af569211ffd8870b8208fa upstream.
Syzbot has reported a warning where a kmalloc() attempt exceeds the maximum limit. This has been identified as corruption of the xattr_ids count when reading the xattr id lookup table.
This patch adds a number of additional sanity checks to detect this corruption and others.
1. It checks for a corrupted xattr index read from the inode. This could be because the metadata block is uncompressed, or because the "compression" bit has been corrupted (turning a compressed block into an uncompressed block). This would cause an out of bounds read.
2. It checks against corruption of the xattr_ids count. This can lead either to the above kmalloc failure or to a smaller than expected table being read.
3. It checks the contents of the index table for corruption.
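A minimal sketch of check 1 outside the kernel: map an xattr id to its slot in the index table and refuse ids at or beyond the count read from the superblock. The entries-per-block constant is an assumption here (an 8 KiB metadata block divided by a 16-byte squashfs_xattr_id entry), and the helper name is made up:

#include <stdint.h>

#define XATTR_ENTRIES_PER_BLOCK 512	/* assumed: 8192-byte block / 16-byte entry */
#define XATTR_ENTRY_SIZE 16

/* Returns -1 for a corrupted (out-of-range) index instead of reading
 * past the end of the in-memory index table. */
static int xattr_slot(unsigned int index, unsigned int xattr_ids,
		      unsigned int *block, unsigned int *offset)
{
	if (index >= xattr_ids)
		return -1;

	*block = index / XATTR_ENTRIES_PER_BLOCK;
	*offset = (index % XATTR_ENTRIES_PER_BLOCK) * XATTR_ENTRY_SIZE;
	return 0;
}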
[phillip@squashfs.org.uk: fix checkpatch issue] Link: https://lkml.kernel.org/r/270245655.754655.1612770082682@webmail.123-reg.co....
Link: https://lkml.kernel.org/r/20210204130249.4495-5-phillip@squashfs.org.uk Signed-off-by: Phillip Lougher phillip@squashfs.org.uk Reported-by: syzbot+2ccea6339d368360800d@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/squashfs/xattr_id.c | 66 ++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 57 insertions(+), 9 deletions(-)
--- a/fs/squashfs/xattr_id.c +++ b/fs/squashfs/xattr_id.c @@ -44,10 +44,15 @@ int squashfs_xattr_lookup(struct super_b struct squashfs_sb_info *msblk = sb->s_fs_info; int block = SQUASHFS_XATTR_BLOCK(index); int offset = SQUASHFS_XATTR_BLOCK_OFFSET(index); - u64 start_block = le64_to_cpu(msblk->xattr_id_table[block]); + u64 start_block; struct squashfs_xattr_id id; int err;
+ if (index >= msblk->xattr_ids) + return -EINVAL; + + start_block = le64_to_cpu(msblk->xattr_id_table[block]); + err = squashfs_read_metadata(sb, &id, &start_block, &offset, sizeof(id)); if (err < 0) @@ -63,13 +68,17 @@ int squashfs_xattr_lookup(struct super_b /* * Read uncompressed xattr id lookup table indexes from disk into memory */ -__le64 *squashfs_read_xattr_id_table(struct super_block *sb, u64 start, +__le64 *squashfs_read_xattr_id_table(struct super_block *sb, u64 table_start, u64 *xattr_table_start, int *xattr_ids) { - unsigned int len; + struct squashfs_sb_info *msblk = sb->s_fs_info; + unsigned int len, indexes; struct squashfs_xattr_id_table *id_table; + __le64 *table; + u64 start, end; + int n;
- id_table = squashfs_read_table(sb, start, sizeof(*id_table)); + id_table = squashfs_read_table(sb, table_start, sizeof(*id_table)); if (IS_ERR(id_table)) return (__le64 *) id_table;
@@ -83,13 +92,52 @@ __le64 *squashfs_read_xattr_id_table(str if (*xattr_ids == 0) return ERR_PTR(-EINVAL);
- /* xattr_table should be less than start */ - if (*xattr_table_start >= start) + len = SQUASHFS_XATTR_BLOCK_BYTES(*xattr_ids); + indexes = SQUASHFS_XATTR_BLOCKS(*xattr_ids); + + /* + * The computed size of the index table (len bytes) should exactly + * match the table start and end points + */ + start = table_start + sizeof(*id_table); + end = msblk->bytes_used; + + if (len != (end - start)) return ERR_PTR(-EINVAL);
- len = SQUASHFS_XATTR_BLOCK_BYTES(*xattr_ids); + table = squashfs_read_table(sb, start, len); + if (IS_ERR(table)) + return table; + + /* table[0], table[1], ... table[indexes - 1] store the locations + * of the compressed xattr id blocks. Each entry should be less than + * the next (i.e. table[0] < table[1]), and the difference between them + * should be SQUASHFS_METADATA_SIZE or less. table[indexes - 1] + * should be less than table_start, and again the difference + * should be SQUASHFS_METADATA_SIZE or less. + * + * Finally xattr_table_start should be less than table[0]. + */ + for (n = 0; n < (indexes - 1); n++) { + start = le64_to_cpu(table[n]); + end = le64_to_cpu(table[n + 1]); + + if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) { + kfree(table); + return ERR_PTR(-EINVAL); + } + } + + start = le64_to_cpu(table[indexes - 1]); + if (start >= table_start || (table_start - start) > SQUASHFS_METADATA_SIZE) { + kfree(table); + return ERR_PTR(-EINVAL); + }
- TRACE("In read_xattr_index_table, length %d\n", len); + if (*xattr_table_start >= le64_to_cpu(table[0])) { + kfree(table); + return ERR_PTR(-EINVAL); + }
- return squashfs_read_table(sb, start + sizeof(*id_table), len); + return table; }
From: Roman Gushchin guro@fb.com
[ Upstream commit 2dcb3964544177c51853a210b6ad400de78ef17d ]
With kaslr the kernel image is placed at a random location, so starting the bottom-up allocation at kernel_end can result in an allocation failure and a warning like this one:
hugetlb_cma: reserve 2048 MiB, up to 2048 MiB per node
------------[ cut here ]------------
memblock: bottom-up allocation failed, memory hotremove may be affected
WARNING: CPU: 0 PID: 0 at mm/memblock.c:332 memblock_find_in_range_node+0x178/0x25a
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.10.0+ #1169
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014
RIP: 0010:memblock_find_in_range_node+0x178/0x25a
Code: e9 6d ff ff ff 48 85 c0 0f 85 da 00 00 00 80 3d 9b 35 df 00 00 75 15 48 c7 c7 c0 75 59 88 c6 05 8b 35 df 00 01 e8 25 8a fa ff <0f> 0b 48 c7 44 24 20 ff ff ff ff 44 89 e6 44 89 ea 48 c7 c1 70 5c
RSP: 0000:ffffffff88803d18 EFLAGS: 00010086 ORIG_RAX: 0000000000000000
RAX: 0000000000000000 RBX: 0000000240000000 RCX: 00000000ffffdfff
RDX: 00000000ffffdfff RSI: 00000000ffffffea RDI: 0000000000000046
RBP: 0000000100000000 R08: ffffffff88922788 R09: 0000000000009ffb
R10: 00000000ffffe000 R11: 3fffffffffffffff R12: 0000000000000000
R13: 0000000000000000 R14: 0000000080000000 R15: 00000001fb42c000
FS: 0000000000000000(0000) GS:ffffffff88f71000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffa080fb401000 CR3: 00000001fa80a000 CR4: 00000000000406b0
Call Trace:
 memblock_alloc_range_nid+0x8d/0x11e
 cma_declare_contiguous_nid+0x2c4/0x38c
 hugetlb_cma_reserve+0xdc/0x128
 flush_tlb_one_kernel+0xc/0x20
 native_set_fixmap+0x82/0xd0
 flat_get_apic_id+0x5/0x10
 register_lapic_address+0x8e/0x97
 setup_arch+0x8a5/0xc3f
 start_kernel+0x66/0x547
 load_ucode_bsp+0x4c/0xcd
 secondary_startup_64_no_verify+0xb0/0xbb
random: get_random_bytes called from __warn+0xab/0x110 with crng_init=0
---[ end trace f151227d0b39be70 ]---
At the same time, the kernel image is protected with memblock_reserve(), so we can just start searching at PAGE_SIZE. In this case the bottom-up allocation has the same chance of success as a top-down allocation, so there is no reason to fall back in the case of a failure. Altogether, this simplifies the logic.
Link: https://lkml.kernel.org/r/20201217201214.3414100-2-guro@fb.com Fixes: 8fabc623238e ("powerpc: Ensure that swiotlb buffer is allocated from low memory") Signed-off-by: Roman Gushchin guro@fb.com Reviewed-by: Mike Rapoport rppt@linux.ibm.com Cc: Joonsoo Kim iamjoonsoo.kim@lge.com Cc: Michal Hocko mhocko@kernel.org Cc: Rik van Riel riel@surriel.com Cc: Wonhyuk Yang vvghjk1234@gmail.com Cc: Thiago Jung Bauermann bauerman@linux.ibm.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- mm/memblock.c | 49 ++++++------------------------------------------- 1 file changed, 6 insertions(+), 43 deletions(-)
diff --git a/mm/memblock.c b/mm/memblock.c index f8fab45bfdb75..ff51a37eb86be 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -189,14 +189,6 @@ __memblock_find_range_top_down(phys_addr_t start, phys_addr_t end, * * Find @size free area aligned to @align in the specified range and node. * - * When allocation direction is bottom-up, the @start should be greater - * than the end of the kernel image. Otherwise, it will be trimmed. The - * reason is that we want the bottom-up allocation just near the kernel - * image so it is highly likely that the allocated memory and the kernel - * will reside in the same node. - * - * If bottom-up allocation failed, will try to allocate memory top-down. - * * RETURNS: * Found address on success, 0 on failure. */ @@ -204,8 +196,6 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t size, phys_addr_t align, phys_addr_t start, phys_addr_t end, int nid, ulong flags) { - phys_addr_t kernel_end, ret; - /* pump up @end */ if (end == MEMBLOCK_ALLOC_ACCESSIBLE) end = memblock.current_limit; @@ -213,40 +203,13 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t size, /* avoid allocating the first page */ start = max_t(phys_addr_t, start, PAGE_SIZE); end = max(start, end); - kernel_end = __pa_symbol(_end); - - /* - * try bottom-up allocation only when bottom-up mode - * is set and @end is above the kernel image. - */ - if (memblock_bottom_up() && end > kernel_end) { - phys_addr_t bottom_up_start; - - /* make sure we will allocate above the kernel */ - bottom_up_start = max(start, kernel_end);
- /* ok, try bottom-up allocation first */ - ret = __memblock_find_range_bottom_up(bottom_up_start, end, - size, align, nid, flags); - if (ret) - return ret; - - /* - * we always limit bottom-up allocation above the kernel, - * but top-down allocation doesn't have the limit, so - * retrying top-down allocation may succeed when bottom-up - * allocation failed. - * - * bottom-up allocation is expected to be fail very rarely, - * so we use WARN_ONCE() here to see the stack trace if - * fail happens. - */ - WARN_ONCE(1, "memblock: bottom-up allocation failed, " - "memory hotunplug may be affected\n"); - } - - return __memblock_find_range_top_down(start, end, size, align, nid, - flags); + if (memblock_bottom_up()) + return __memblock_find_range_bottom_up(start, end, size, align, + nid, flags); + else + return __memblock_find_range_top_down(start, end, size, align, + nid, flags); }
/**
From: Jozsef Kadlecsik kadlec@mail.kfki.hu
[ Upstream commit b1bdde33b72366da20d10770ab7a49fe87b5e190 ]
When both the --reap and --update flags are specified, there's a code path in which the entry to be updated is reaped beforehand, which then leads to a kernel crash. Reap only entries which won't be updated.
Fixes kernel bugzilla #207773.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=207773 Reported-by: Reindl Harald h.reindl@thelounge.net Fixes: 0079c5aee348 ("netfilter: xt_recent: add an entry reaper") Signed-off-by: Jozsef Kadlecsik kadlec@netfilter.org Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/xt_recent.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/xt_recent.c b/net/netfilter/xt_recent.c index cd53b861a15c1..ffe673c6a2485 100644 --- a/net/netfilter/xt_recent.c +++ b/net/netfilter/xt_recent.c @@ -156,7 +156,8 @@ static void recent_entry_remove(struct recent_table *t, struct recent_entry *e) /* * Drop entries with timestamps older then 'time'. */ -static void recent_entry_reap(struct recent_table *t, unsigned long time) +static void recent_entry_reap(struct recent_table *t, unsigned long time, + struct recent_entry *working, bool update) { struct recent_entry *e;
@@ -165,6 +166,12 @@ static void recent_entry_reap(struct recent_table *t, unsigned long time) */ e = list_entry(t->lru_list.next, struct recent_entry, lru_list);
+ /* + * Do not reap the entry which is going to be updated. + */ + if (e == working && update) + return; + /* * The last time stamp is the most recent. */ @@ -307,7 +314,8 @@ recent_mt(const struct sk_buff *skb, struct xt_action_param *par)
/* info->seconds must be non-zero */ if (info->check_set & XT_RECENT_REAP) - recent_entry_reap(t, time); + recent_entry_reap(t, time, e, + info->check_set & XT_RECENT_UPDATE && ret); }
if (info->check_set & XT_RECENT_SET ||
From: Randy Dunlap rdunlap@infradead.org
[ Upstream commit ade9679c159d5bbe14fb7e59e97daf6062872e2b ]
Fix a build error for undefined 'TI_PRE_COUNT' by adding it to asm-offsets.c.
h8300-linux-ld: arch/h8300/kernel/entry.o: in function `resume_kernel': (.text+0x29a): undefined reference to `TI_PRE_COUNT'
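For background on why adding the constant to asm-offsets.c fixes this: that file is compiled to assembly, and the kbuild helpers emit "->NAME value" markers that are post-processed into include/generated/asm-offsets.h for use by entry.S. Simplified from include/linux/kbuild.h (the exact wording of the macros may differ by kernel version):

#define DEFINE(sym, val) \
	asm volatile("\n->" #sym " %0 " #val : : "i" (val))

#define OFFSET(sym, str, mem) \
	DEFINE(sym, offsetof(struct str, mem))

/*
 * So DEFINE(TI_PRE_COUNT, offsetof(struct thread_info, preempt_count))
 * ends up as a plain "#define TI_PRE_COUNT <offset>" that assembly code
 * such as the h8300 resume_kernel path can reference.
 */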
Link: https://lkml.kernel.org/r/20210212021650.22740-1-rdunlap@infradead.org Fixes: df2078b8daa7 ("h8300: Low level entry") Signed-off-by: Randy Dunlap rdunlap@infradead.org Reported-by: kernel test robot lkp@intel.com Cc: Yoshinori Sato ysato@users.sourceforge.jp Cc: Thomas Gleixner tglx@linutronix.de Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/h8300/kernel/asm-offsets.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/arch/h8300/kernel/asm-offsets.c b/arch/h8300/kernel/asm-offsets.c index dc2d16ce8a0d5..3e33a9844d99a 100644 --- a/arch/h8300/kernel/asm-offsets.c +++ b/arch/h8300/kernel/asm-offsets.c @@ -62,6 +62,9 @@ int main(void) OFFSET(TI_FLAGS, thread_info, flags); OFFSET(TI_CPU, thread_info, cpu); OFFSET(TI_PRE, thread_info, preempt_count); +#ifdef CONFIG_PREEMPTION + DEFINE(TI_PRE_COUNT, offsetof(struct thread_info, preempt_count)); +#endif
return 0; }
From: Felipe Balbi balbi@kernel.org
commit 2a499b45295206e7f3dc76edadde891c06cc4447 upstream
no functional changes.
Signed-off-by: Felipe Balbi balbi@kernel.org Signed-off-by: Sudip Mukherjee sudipm.mukherjee@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc3/ulpi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/usb/dwc3/ulpi.c +++ b/drivers/usb/dwc3/ulpi.c @@ -22,7 +22,7 @@
static int dwc3_ulpi_busyloop(struct dwc3 *dwc) { - unsigned count = 1000; + unsigned int count = 1000; u32 reg;
while (count--) {
From: Serge Semin Sergey.Semin@baikalelectronics.ru
commit fca3f138105727c3a22edda32d02f91ce1bf11c9 upstream
Originally the procedure of detecting that a ULPI transaction has finished was developed as a simple busy-loop with just a decrementing counter and no delays. That's wrong since on different systems the loop will take a different time to complete. So if the system bus and CPU are fast enough to outpace the ULPI bus and the companion PHY reaction, then we'll hit a false timeout error. Fix this by converting the busy-loop procedure to take the standard bus speed, the address value and the register access mode into account for the busy-loop delay calculation.
Here is the way the fix works. It's known that the ULPI bus is clocked with a 60MHz signal. In accordance with [1], the ULPI bus protocol is specified to spend 5 and 6 clock periods for immediate register write and read operations respectively, and 6 and 7 clock periods for the extended register writes and reads. Based on that we can easily pre-calculate the time which will be needed for the controller to perform a requested IO operation. Note we'll still preserve the attempts counter in case the DWC USB3 controller has some internal delays.
[1] UTMI+ Low Pin Interface (ULPI) Specification, Revision 1.1, October 20, 2004, pp. 30 - 36.
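Working the numbers from [1] through, the per-poll delay computed by the patch below comes out as follows (illustrative standalone C; the constant name here is local, not the driver's):

/* One ULPI clock period at 60 MHz, rounded up to whole nanoseconds. */
enum { ULPI_CLK_NS = (1000000000 + 60000000 - 1) / 60000000 };	/* 17 ns */

/*
 * Immediate register write:  5 clocks -> 5 * 17 =  85 ns
 * Immediate register read:   6 clocks -> 6 * 17 = 102 ns
 * Extended register write:   6 clocks -> 6 * 17 = 102 ns
 * Extended register read:    7 clocks -> 7 * 17 = 119 ns
 *
 * which matches the patch: 5 base periods, plus one extra period for a
 * read and one extra for an extended (addr >= ULPI_EXT_VENDOR_SPECIFIC)
 * access, passed to ndelay() on every poll of the BUSY bit.
 */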
Fixes: 88bc9d194ff6 ("usb: dwc3: add ULPI interface support") Acked-by: Heikki Krogerus heikki.krogerus@linux.intel.com Signed-off-by: Serge Semin Sergey.Semin@baikalelectronics.ru Link: https://lore.kernel.org/r/20201210085008.13264-3-Sergey.Semin@baikalelectron... Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org [sudip: adjust context] Signed-off-by: Sudip Mukherjee sudipm.mukherjee@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc3/ulpi.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-)
--- a/drivers/usb/dwc3/ulpi.c +++ b/drivers/usb/dwc3/ulpi.c @@ -10,6 +10,8 @@ * published by the Free Software Foundation. */
+#include <linux/delay.h> +#include <linux/time64.h> #include <linux/ulpi/regs.h>
#include "core.h" @@ -20,12 +22,22 @@ DWC3_GUSB2PHYACC_ADDR(ULPI_ACCESS_EXTENDED) | \ DWC3_GUSB2PHYACC_EXTEND_ADDR(a) : DWC3_GUSB2PHYACC_ADDR(a))
-static int dwc3_ulpi_busyloop(struct dwc3 *dwc) +#define DWC3_ULPI_BASE_DELAY DIV_ROUND_UP(NSEC_PER_SEC, 60000000L) + +static int dwc3_ulpi_busyloop(struct dwc3 *dwc, u8 addr, bool read) { + unsigned long ns = 5L * DWC3_ULPI_BASE_DELAY; unsigned int count = 1000; u32 reg;
+ if (addr >= ULPI_EXT_VENDOR_SPECIFIC) + ns += DWC3_ULPI_BASE_DELAY; + + if (read) + ns += DWC3_ULPI_BASE_DELAY; + while (count--) { + ndelay(ns); reg = dwc3_readl(dwc->regs, DWC3_GUSB2PHYACC(0)); if (!(reg & DWC3_GUSB2PHYACC_BUSY)) return 0; @@ -44,7 +56,7 @@ static int dwc3_ulpi_read(struct ulpi_op reg = DWC3_GUSB2PHYACC_NEWREGREQ | DWC3_ULPI_ADDR(addr); dwc3_writel(dwc->regs, DWC3_GUSB2PHYACC(0), reg);
- ret = dwc3_ulpi_busyloop(dwc); + ret = dwc3_ulpi_busyloop(dwc, addr, true); if (ret) return ret;
@@ -62,7 +74,7 @@ static int dwc3_ulpi_write(struct ulpi_o reg |= DWC3_GUSB2PHYACC_WRITE | val; dwc3_writel(dwc->regs, DWC3_GUSB2PHYACC(0), reg);
- return dwc3_ulpi_busyloop(dwc); + return dwc3_ulpi_busyloop(dwc, addr, false); }
static struct ulpi_ops dwc3_ulpi_ops = {
From: Edwin Peer edwin.peer@broadcom.com
commit 3aa6bce9af0e25b735c9c1263739a5639a336ae8 upstream.
Prevent netif_tx_disable() running concurrently with dev_watchdog() by taking the device global xmit lock. Otherwise, the recommended:
netif_carrier_off(dev); netif_tx_disable(dev);
driver shutdown sequence can happen after the watchdog has already checked carrier, resulting in possible false alarms. This is because netif_tx_lock() only sets the frozen bit without maintaining the locks on the individual queues.
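A hypothetical driver's close path using the recommended sequence would look like the sketch below; with this patch, netif_tx_disable() itself serializes against dev_watchdog() via dev->tx_global_lock, so no driver changes are needed (the function name is made up):

static int example_close(struct net_device *dev)	/* hypothetical ndo_stop */
{
	netif_carrier_off(dev);		/* tell the stack (and watchdog) the link is down */
	netif_tx_disable(dev);		/* stop all TX queues; now atomic vs. dev_watchdog() */

	/* ... driver-specific teardown: stop DMA, free rings, etc. ... */

	return 0;
}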
Fixes: c3f26a269c24 ("netdev: Fix lockdep warnings in multiqueue configurations.") Signed-off-by: Edwin Peer edwin.peer@broadcom.com Reviewed-by: Jakub Kicinski kuba@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/netdevice.h | 2 ++ 1 file changed, 2 insertions(+)
--- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -3428,6 +3428,7 @@ static inline void netif_tx_disable(stru
local_bh_disable(); cpu = smp_processor_id(); + spin_lock(&dev->tx_global_lock); for (i = 0; i < dev->num_tx_queues; i++) { struct netdev_queue *txq = netdev_get_tx_queue(dev, i);
@@ -3435,6 +3436,7 @@ static inline void netif_tx_disable(stru netif_tx_stop_queue(txq); __netif_tx_unlock(txq); } + spin_unlock(&dev->tx_global_lock); local_bh_enable(); }
From: Stefano Garzarella sgarzare@redhat.com
commit 1c5fae9c9a092574398a17facc31c533791ef232 upstream.
In vsock_shutdown() we touched some socket fields without holding the socket lock, such as 'state' and 'sk_flags'.
Also, after the introduction of multi-transport, we are accessing 'vsk->transport' in vsock_send_shutdown() without holding the lock and this call can be made while the connection is in progress, so the transport can change in the meantime.
To avoid issues, we hold the socket lock when we enter in vsock_shutdown() and release it when we leave.
Among the transports that implement the 'shutdown' callback, only hyperv_transport acquired the lock. Since the caller now holds it, we no longer take it.
Fixes: d021c344051a ("VSOCK: Introduce VM Sockets") Signed-off-by: Stefano Garzarella sgarzare@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/vmw_vsock/af_vsock.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
--- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -818,10 +818,12 @@ static int vsock_shutdown(struct socket */
sk = sock->sk; + + lock_sock(sk); if (sock->state == SS_UNCONNECTED) { err = -ENOTCONN; if (sk->sk_type == SOCK_STREAM) - return err; + goto out; } else { sock->state = SS_DISCONNECTING; err = 0; @@ -830,10 +832,8 @@ static int vsock_shutdown(struct socket /* Receive and send shutdowns are treated alike. */ mode = mode & (RCV_SHUTDOWN | SEND_SHUTDOWN); if (mode) { - lock_sock(sk); sk->sk_shutdown |= mode; sk->sk_state_change(sk); - release_sock(sk);
if (sk->sk_type == SOCK_STREAM) { sock_reset_flag(sk, SOCK_DONE); @@ -841,6 +841,8 @@ static int vsock_shutdown(struct socket } }
+out: + release_sock(sk); return err; }
From: Borislav Petkov bp@suse.de
commit 256b92af784d5043eeb7d559b6d5963dcc2ecb10 upstream.
Commit
20bf2b378729 ("x86/build: Disable CET instrumentation in the kernel")
disabled CET instrumentation, which the Ubuntu gcc9 and gcc10 compilers add by default, but did that only for 64-bit builds. It would still fail when building a 32-bit target. So disable CET for all x86 builds.
Fixes: 20bf2b378729 ("x86/build: Disable CET instrumentation in the kernel") Reported-by: AC achirvasub@gmail.com Signed-off-by: Borislav Petkov bp@suse.de Acked-by: Josh Poimboeuf jpoimboe@redhat.com Tested-by: AC achirvasub@gmail.com Link: https://lkml.kernel.org/r/YCCIgMHkzh/xT4ex@arch-chirva.localdomain Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/Makefile | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/arch/x86/Makefile +++ b/arch/x86/Makefile @@ -61,6 +61,9 @@ endif KBUILD_CFLAGS += -mno-sse -mno-mmx -mno-sse2 -mno-3dnow KBUILD_CFLAGS += $(call cc-option,-mno-avx,)
+# Intel CET isn't enabled in the kernel +KBUILD_CFLAGS += $(call cc-option,-fcf-protection=none) + ifeq ($(CONFIG_X86_32),y) BITS := 32 UTS_MACHINE := i386 @@ -137,9 +140,6 @@ else KBUILD_CFLAGS += -mno-red-zone KBUILD_CFLAGS += -mcmodel=kernel
- # Intel CET isn't enabled in the kernel - KBUILD_CFLAGS += $(call cc-option,-fcf-protection=none) - # -funit-at-a-time shrinks the kernel .text considerably # unfortunately it makes reading oopses harder. KBUILD_CFLAGS += $(call cc-option,-funit-at-a-time)
From: Andi Kleen ak@linux.intel.com
commit 96f60dfa5819a065bfdd2f2ba0df7d9cbce7f4dd upstream.
gcc 5 supports a new -mrecord-mcount option to generate the ftrace tables directly. This avoids the need to run the recordmcount tool manually.
Use this option when available.
So far this doesn't use -mnop-mcount, which also exists now.
This is needed to make ftrace work with LTO because the normal record-mcount script doesn't run over the link time output.
It should also improve build times slightly in the general case. Link: http://lkml.kernel.org/r/20171127213423.27218-12-andi@firstfloor.org
Signed-off-by: Andi Kleen ak@linux.intel.com Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- scripts/Makefile.build | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/scripts/Makefile.build +++ b/scripts/Makefile.build @@ -221,6 +221,11 @@ cmd_modversions_c = \ endif
ifdef CONFIG_FTRACE_MCOUNT_RECORD +# gcc 5 supports generating the mcount tables directly +ifneq ($(call cc-option,-mrecord-mcount,y),y) +KBUILD_CFLAGS += -mrecord-mcount +else +# else do it all manually ifdef BUILD_C_RECORDMCOUNT ifeq ("$(origin RECORDMCOUNT_WARN)", "command line") RECORDMCOUNT_FLAGS = -w @@ -250,6 +255,7 @@ cmd_record_mcount = \ $(sub_cmd_record_mcount) \ fi; endif +endif
define rule_cc_o_c $(call echo-cmd,checksrc) $(cmd_checksrc) \
From: Greg Thelen gthelen@google.com
commit ed7d40bc67b8353c677b38c6cdddcdc310c0f452 upstream.
Non gcc-5 builds with CONFIG_STACK_VALIDATION=y and SKIP_STACK_VALIDATION=1 fail. Example output: /bin/sh: init/.tmp_main.o: Permission denied
commit 96f60dfa5819 ("trace: Use -mcount-record for dynamic ftrace") added a mismatched endif. This causes cmd_objtool to get mistakenly set.
Relocate endif to balance the newly added -record-mcount check.
Link: http://lkml.kernel.org/r/20180608214746.136554-1-gthelen@google.com
Fixes: 96f60dfa5819 ("trace: Use -mcount-record for dynamic ftrace") Acked-by: Andi Kleen ak@linux.intel.com Tested-by: David Rientjes rientjes@google.com Signed-off-by: Greg Thelen gthelen@google.com Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- scripts/Makefile.build | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/scripts/Makefile.build +++ b/scripts/Makefile.build @@ -254,7 +254,7 @@ cmd_record_mcount = \ "$(CC_FLAGS_FTRACE)" ]; then \ $(sub_cmd_record_mcount) \ fi; -endif +endif # -record-mcount endif
define rule_cc_o_c
From: Vasily Gorbik gor@linux.ibm.com
commit 07d0408120216b60625c9a5b8012d1c3a907984d upstream.
Currently, if CONFIG_FTRACE_MCOUNT_RECORD is enabled, -mrecord-mcount compiler flag support is tested for every Makefile.
Top 4 cc-option usages:
    511 -mrecord-mcount
     11 -fno-stack-protector
      9 -Wno-override-init
      2 -fsched-pressure
To address that, move the cc-option test from scripts/Makefile.build to the top Makefile and export CC_USING_RECORD_MCOUNT to be used in the original place.
While doing that, also add -mrecord-mcount to CC_FLAGS_FTRACE (if gcc actually supports it).
Link: http://lkml.kernel.org/r/patch-2.thread-aa7b8d.git-de935bace15a.your-ad-here...
Acked-by: Andi Kleen ak@linux.intel.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Makefile | 7 +++++++ scripts/Makefile.build | 9 +++------ 2 files changed, 10 insertions(+), 6 deletions(-)
--- a/Makefile +++ b/Makefile @@ -760,6 +760,13 @@ ifdef CONFIG_FUNCTION_TRACER ifndef CC_FLAGS_FTRACE CC_FLAGS_FTRACE := -pg endif +ifdef CONFIG_FTRACE_MCOUNT_RECORD + # gcc 5 supports generating the mcount tables directly + ifeq ($(call cc-option-yn,-mrecord-mcount),y) + CC_FLAGS_FTRACE += -mrecord-mcount + export CC_USING_RECORD_MCOUNT := 1 + endif +endif export CC_FLAGS_FTRACE ifdef CONFIG_HAVE_FENTRY CC_USING_FENTRY := $(call cc-option, -mfentry -DCC_USING_FENTRY) --- a/scripts/Makefile.build +++ b/scripts/Makefile.build @@ -221,11 +221,8 @@ cmd_modversions_c = \ endif
ifdef CONFIG_FTRACE_MCOUNT_RECORD -# gcc 5 supports generating the mcount tables directly -ifneq ($(call cc-option,-mrecord-mcount,y),y) -KBUILD_CFLAGS += -mrecord-mcount -else -# else do it all manually +ifndef CC_USING_RECORD_MCOUNT +# compiler will not generate __mcount_loc use recordmcount or recordmcount.pl ifdef BUILD_C_RECORDMCOUNT ifeq ("$(origin RECORDMCOUNT_WARN)", "command line") RECORDMCOUNT_FLAGS = -w @@ -254,7 +251,7 @@ cmd_record_mcount = \ "$(CC_FLAGS_FTRACE)" ]; then \ $(sub_cmd_record_mcount) \ fi; -endif # -record-mcount +endif # CC_USING_RECORD_MCOUNT endif
define rule_cc_o_c
From: Jan Beulich jbeulich@suse.com
commit a35f2ef3b7376bfd0a57f7844bd7454389aae1fc upstream.
Its sibling (set_foreign_p2m_mapping()) as well as the sibling of its only caller (gnttab_map_refs()) don't clean up after themselves in case of error. Higher level callers are expected to do so. However, in order for that to really clean up any partially set up state, the operation should not terminate upon encountering an entry in unexpected state. It is particularly relevant to notice here that set_foreign_p2m_mapping() would skip setting up a p2m entry if its grant mapping failed, but it would continue to set up further p2m entries as long as their mappings succeeded.
Arguably down the road set_foreign_p2m_mapping() may want its page state related WARN_ON() also converted to an error return.
This is part of XSA-361.
Signed-off-by: Jan Beulich jbeulich@suse.com Cc: stable@vger.kernel.org Reviewed-by: Juergen Gross jgross@suse.com Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/xen/p2m.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-)
--- a/arch/x86/xen/p2m.c +++ b/arch/x86/xen/p2m.c @@ -763,17 +763,15 @@ int clear_foreign_p2m_mapping(struct gnt unsigned long mfn = __pfn_to_mfn(page_to_pfn(pages[i])); unsigned long pfn = page_to_pfn(pages[i]);
- if (mfn == INVALID_P2M_ENTRY || !(mfn & FOREIGN_FRAME_BIT)) { + if (mfn != INVALID_P2M_ENTRY && (mfn & FOREIGN_FRAME_BIT)) + set_phys_to_machine(pfn, INVALID_P2M_ENTRY); + else ret = -EINVAL; - goto out; - } - - set_phys_to_machine(pfn, INVALID_P2M_ENTRY); } if (kunmap_ops) ret = HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref, - kunmap_ops, count); -out: + kunmap_ops, count) ?: ret; + return ret; } EXPORT_SYMBOL_GPL(clear_foreign_p2m_mapping);
From: Jan Beulich jbeulich@suse.com
commit b512e1b077e5ccdbd6e225b15d934ab12453b70a upstream.
We should not set up further state if either mapping failed; paying attention to just the user mapping's status isn't enough.
Also use GNTST_okay instead of implying its value (zero).
This is part of XSA-361.
Signed-off-by: Jan Beulich jbeulich@suse.com Cc: stable@vger.kernel.org Reviewed-by: Juergen Gross jgross@suse.com Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/xen/p2m.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/arch/x86/xen/p2m.c +++ b/arch/x86/xen/p2m.c @@ -725,7 +725,8 @@ int set_foreign_p2m_mapping(struct gntta unsigned long mfn, pfn;
/* Do not add to override if the map failed. */ - if (map_ops[i].status) + if (map_ops[i].status != GNTST_okay || + (kmap_ops && kmap_ops[i].status != GNTST_okay)) continue;
if (map_ops[i].flags & GNTMAP_contains_pte) {
From: Jan Beulich jbeulich@suse.com
commit dbe5283605b3bc12ca45def09cc721a0a5c853a2 upstream.
We may not skip setting the field in the unmap structure when GNTMAP_device_map is in use - such an unmap would fail to release the respective resources (a page ref in the hypervisor). Otoh the field doesn't need setting at all when GNTMAP_device_map is not in use.
To record the value for unmapping, we also better don't use our local p2m: In particular after a subsequent change it may not have got updated for all the batch elements. Instead it can simply be taken from the respective map's results.
We can additionally avoid playing this game altogether for the kernel part of the mappings in (x86) PV mode.
This is part of XSA-361.
Signed-off-by: Jan Beulich jbeulich@suse.com Cc: stable@vger.kernel.org Reviewed-by: Stefano Stabellini sstabellini@kernel.org Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/xen/gntdev.c | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-)
--- a/drivers/xen/gntdev.c +++ b/drivers/xen/gntdev.c @@ -293,18 +293,25 @@ static int map_grant_pages(struct grant_ * to the kernel linear addresses of the struct pages. * These ptes are completely different from the user ptes dealt * with find_grant_ptes. + * Note that GNTMAP_device_map isn't needed here: The + * dev_bus_addr output field gets consumed only from ->map_ops, + * and by not requesting it when mapping we also avoid needing + * to mirror dev_bus_addr into ->unmap_ops (and holding an extra + * reference to the page in the hypervisor). */ + unsigned int flags = (map->flags & ~GNTMAP_device_map) | + GNTMAP_host_map; + for (i = 0; i < map->count; i++) { unsigned long address = (unsigned long) pfn_to_kaddr(page_to_pfn(map->pages[i])); BUG_ON(PageHighMem(map->pages[i]));
- gnttab_set_map_op(&map->kmap_ops[i], address, - map->flags | GNTMAP_host_map, + gnttab_set_map_op(&map->kmap_ops[i], address, flags, map->grants[i].ref, map->grants[i].domid); gnttab_set_unmap_op(&map->kunmap_ops[i], address, - map->flags | GNTMAP_host_map, -1); + flags, -1); } }
@@ -320,6 +327,9 @@ static int map_grant_pages(struct grant_ continue; }
+ if (map->flags & GNTMAP_device_map) + map->unmap_ops[i].dev_bus_addr = map->map_ops[i].dev_bus_addr; + map->unmap_ops[i].handle = map->map_ops[i].handle; if (use_ptemod) map->kunmap_ops[i].handle = map->kmap_ops[i].handle;
From: Jan Beulich jbeulich@suse.com
commit ebee0eab08594b2bd5db716288a4f1ae5936e9bc upstream.
Failure of the kernel part of the mapping operation should also be indicated as an error to the caller, or else it may assume the respective kernel VA is okay to access.
Furthermore gnttab_map_refs() failing still requires recording successfully mapped handles, so they can be unmapped subsequently. This in turn requires there to be a way to tell full hypercall failure from partial success - preset map_op status fields such that they won't "happen" to look as if the operation succeeded.
Also again use GNTST_okay instead of implying its value (zero).
This is part of XSA-361.
Signed-off-by: Jan Beulich jbeulich@suse.com Cc: stable@vger.kernel.org Reviewed-by: Juergen Gross jgross@suse.com Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/xen/gntdev.c | 17 +++++++++-------- include/xen/grant_table.h | 1 + 2 files changed, 10 insertions(+), 8 deletions(-)
--- a/drivers/xen/gntdev.c +++ b/drivers/xen/gntdev.c @@ -318,21 +318,22 @@ static int map_grant_pages(struct grant_ pr_debug("map %d+%d\n", map->index, map->count); err = gnttab_map_refs(map->map_ops, use_ptemod ? map->kmap_ops : NULL, map->pages, map->count); - if (err) - return err;
for (i = 0; i < map->count; i++) { - if (map->map_ops[i].status) { + if (map->map_ops[i].status == GNTST_okay) + map->unmap_ops[i].handle = map->map_ops[i].handle; + else if (!err) err = -EINVAL; - continue; - }
if (map->flags & GNTMAP_device_map) map->unmap_ops[i].dev_bus_addr = map->map_ops[i].dev_bus_addr;
- map->unmap_ops[i].handle = map->map_ops[i].handle; - if (use_ptemod) - map->kunmap_ops[i].handle = map->kmap_ops[i].handle; + if (use_ptemod) { + if (map->kmap_ops[i].status == GNTST_okay) + map->kunmap_ops[i].handle = map->kmap_ops[i].handle; + else if (!err) + err = -EINVAL; + } } return err; } --- a/include/xen/grant_table.h +++ b/include/xen/grant_table.h @@ -157,6 +157,7 @@ gnttab_set_map_op(struct gnttab_map_gran map->flags = flags; map->ref = ref; map->dom = domid; + map->status = 1; /* arbitrary positive value */ }
static inline void
From: Stefano Stabellini stefano.stabellini@xilinx.com
commit 36bf1dfb8b266e089afa9b7b984217f17027bf35 upstream.
set_phys_to_machine can fail due to lack of memory, see the kzalloc call in arch/arm/xen/p2m.c:__set_phys_to_machine_multi.
Don't ignore the potential return error in set_foreign_p2m_mapping, returning it to the caller instead.
This is part of XSA-361.
Signed-off-by: Stefano Stabellini stefano.stabellini@xilinx.com Cc: stable@vger.kernel.org Reviewed-by: Julien Grall jgrall@amazon.com Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm/xen/p2m.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/arch/arm/xen/p2m.c +++ b/arch/arm/xen/p2m.c @@ -93,8 +93,10 @@ int set_foreign_p2m_mapping(struct gntta for (i = 0; i < count; i++) { if (map_ops[i].status) continue; - set_phys_to_machine(map_ops[i].host_addr >> XEN_PAGE_SHIFT, - map_ops[i].dev_bus_addr >> XEN_PAGE_SHIFT); + if (unlikely(!set_phys_to_machine(map_ops[i].host_addr >> XEN_PAGE_SHIFT, + map_ops[i].dev_bus_addr >> XEN_PAGE_SHIFT))) { + return -ENOMEM; + } }
return 0;
From: Jan Beulich jbeulich@suse.com
commit 5a264285ed1cd32e26d9de4f3c8c6855e467fd63 upstream.
In particular -ENOMEM may come back here, from set_foreign_p2m_mapping(). Don't make problems worse, especially since the handling elsewhere (together with map's status fields now indicating whether a mapping wasn't even attempted, and hence has to be considered failed) doesn't require this odd way of dealing with errors.
This is part of XSA-362.
Signed-off-by: Jan Beulich jbeulich@suse.com Cc: stable@vger.kernel.org Reviewed-by: Juergen Gross jgross@suse.com Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/xen-blkback/blkback.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
--- a/drivers/block/xen-blkback/blkback.c +++ b/drivers/block/xen-blkback/blkback.c @@ -842,10 +842,8 @@ again: break; }
- if (segs_to_map) { + if (segs_to_map) ret = gnttab_map_refs(map, NULL, pages_to_gnt, segs_to_map); - BUG_ON(ret); - }
/* * Now swizzle the MFN in our domain with the MFN from the other domain @@ -860,7 +858,7 @@ again: pr_debug("invalid buffer -- could not remap it\n"); put_free_pages(blkif, &pages[seg_idx]->page, 1); pages[seg_idx]->handle = BLKBACK_INVALID_HANDLE; - ret |= 1; + ret |= !ret; goto next; } pages[seg_idx]->handle = map[new_map_idx].handle;
From: Jan Beulich jbeulich@suse.com
commit 3194a1746e8aabe86075fd3c5e7cf1f4632d7f16 upstream.
In particular -ENOMEM may come back here, from set_foreign_p2m_mapping(). Don't make problems worse, especially since the handling elsewhere (together with map's status fields now indicating whether a mapping wasn't even attempted, and hence has to be considered failed) doesn't require this odd way of dealing with errors.
This is part of XSA-362.
Signed-off-by: Jan Beulich jbeulich@suse.com Cc: stable@vger.kernel.org Reviewed-by: Juergen Gross jgross@suse.com Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/xen-netback/netback.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -1792,13 +1792,11 @@ int xenvif_tx_action(struct xenvif_queue return 0;
gnttab_batch_copy(queue->tx_copy_ops, nr_cops); - if (nr_mops != 0) { + if (nr_mops != 0) ret = gnttab_map_refs(queue->tx_map_ops, NULL, queue->pages_to_map, nr_mops); - BUG_ON(ret); - }
work_done = xenvif_tx_submit(queue);
From: Jan Beulich jbeulich@suse.com
commit 7c77474b2d22176d2bfb592ec74e0f2cb71352c9 upstream.
In particular -ENOMEM may come back here, from set_foreign_p2m_mapping(). Don't make problems worse, especially since the handling elsewhere (together with map's status fields now indicating whether a mapping wasn't even attempted, and hence has to be considered failed) doesn't require this odd way of dealing with errors.
This is part of XSA-362.
Signed-off-by: Jan Beulich jbeulich@suse.com Cc: stable@vger.kernel.org Reviewed-by: Juergen Gross jgross@suse.com Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/xen/xen-scsiback.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/xen/xen-scsiback.c +++ b/drivers/xen/xen-scsiback.c @@ -415,12 +415,12 @@ static int scsiback_gnttab_data_map_batc return 0;
err = gnttab_map_refs(map, NULL, pg, cnt); - BUG_ON(err); for (i = 0; i < cnt; i++) { if (unlikely(map[i].status != GNTST_okay)) { pr_err("invalid buffer -- could not remap it\n"); map[i].handle = SCSIBACK_INVALID_HANDLE; - err = -ENOMEM; + if (!err) + err = -ENOMEM; } else { get_page(pg[i]); }
From: Jan Beulich jbeulich@suse.com
commit 871997bc9e423f05c7da7c9178e62dde5df2a7f8 upstream.
The function uses a goto-based loop, which may lead to an earlier error getting discarded by a later iteration. Exit this ad-hoc loop when an error was encountered.
The out-of-memory error path additionally fails to fill a structure field looked at by xen_blkbk_unmap_prepare() before inspecting the handle which does get properly set (to BLKBACK_INVALID_HANDLE).
Since the earlier exiting from the ad-hoc loop requires the same field filling (invalidation) as that on the out-of-memory path, fold both paths. While doing so, drop the pr_alert(), as extra log messages aren't going to help the situation (the kernel will log oom conditions already anyway).
This is XSA-365.
Signed-off-by: Jan Beulich jbeulich@suse.com Reviewed-by: Juergen Gross jgross@suse.com Reviewed-by: Julien Grall julien@xen.org Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/xen-blkback/blkback.c | 22 ++++++++++++++-------- 1 file changed, 14 insertions(+), 8 deletions(-)
--- a/drivers/block/xen-blkback/blkback.c +++ b/drivers/block/xen-blkback/blkback.c @@ -825,8 +825,11 @@ again: pages[i]->page = persistent_gnt->page; pages[i]->persistent_gnt = persistent_gnt; } else { - if (get_free_page(blkif, &pages[i]->page)) - goto out_of_memory; + if (get_free_page(blkif, &pages[i]->page)) { + put_free_pages(blkif, pages_to_gnt, segs_to_map); + ret = -ENOMEM; + goto out; + } addr = vaddr(pages[i]->page); pages_to_gnt[segs_to_map] = pages[i]->page; pages[i]->persistent_gnt = NULL; @@ -910,15 +913,18 @@ next: } segs_to_map = 0; last_map = map_until; - if (map_until != num) + if (!ret && map_until != num) goto again;
- return ret; +out: + for (i = last_map; i < num; i++) { + /* Don't zap current batch's valid persistent grants. */ + if(i >= last_map + segs_to_map) + pages[i]->persistent_gnt = NULL; + pages[i]->handle = BLKBACK_INVALID_HANDLE; + }
-out_of_memory: - pr_alert("%s: out of memory\n", __func__); - put_free_pages(blkif, pages_to_gnt, segs_to_map); - return -ENOMEM; + return ret; }
static int xen_blkbk_map_seg(struct pending_req *pending_req)
From: Arun Easi aeasi@marvell.com
commit 8de309e7299a00b3045fb274f82b326f356404f0 upstream
Crash stack:
[576544.715489] Unable to handle kernel paging request for data at address 0xd00000000f970000
[576544.715497] Faulting instruction address: 0xd00000000f880f64
[576544.715503] Oops: Kernel access of bad area, sig: 11 [#1]
[576544.715506] SMP NR_CPUS=2048 NUMA pSeries
:
[576544.715703] NIP [d00000000f880f64] .qla27xx_fwdt_template_valid+0x94/0x100 [qla2xxx]
[576544.715722] LR [d00000000f7952dc] .qla24xx_load_risc_flash+0x2fc/0x590 [qla2xxx]
[576544.715726] Call Trace:
[576544.715731] [c0000004d0ffb000] [c0000006fe02c350] 0xc0000006fe02c350 (unreliable)
[576544.715750] [c0000004d0ffb080] [d00000000f7952dc] .qla24xx_load_risc_flash+0x2fc/0x590 [qla2xxx]
[576544.715770] [c0000004d0ffb170] [d00000000f7aa034] .qla81xx_load_risc+0x84/0x1a0 [qla2xxx]
[576544.715789] [c0000004d0ffb210] [d00000000f79f7c8] .qla2x00_setup_chip+0xc8/0x910 [qla2xxx]
[576544.715808] [c0000004d0ffb300] [d00000000f7a631c] .qla2x00_initialize_adapter+0x4dc/0xb00 [qla2xxx]
[576544.715826] [c0000004d0ffb3e0] [d00000000f78ce28] .qla2x00_probe_one+0xf08/0x2200 [qla2xxx]
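The underlying rule the fix applies: a field stored little-endian by the firmware must be converted before use, or it is only correct on little-endian hosts. A small user-space illustration of the same idea (le32toh() here plays the role the kernel's le32_to_cpu() plays on the __le32 template_size field in the patch below):

#include <endian.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
	/* A 4-byte length stored little-endian in a firmware header: 0x1234. */
	const uint8_t hdr[4] = { 0x34, 0x12, 0x00, 0x00 };
	uint32_t raw, size;

	memcpy(&raw, hdr, sizeof(raw));
	size = le32toh(raw);	/* 0x1234 on every host */

	/* Using 'raw' directly would read back 0x34120000 on a big-endian
	 * machine, which is how a garbage template size crept in. */
	printf("size = 0x%x\n", size);
	return 0;
}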
Link: https://lore.kernel.org/r/20201202132312.19966-8-njavali@marvell.com Fixes: f73cb695d3ec ("[SCSI] qla2xxx: Add support for ISP2071.") Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com Signed-off-by: Arun Easi aeasi@marvell.com Signed-off-by: Nilesh Javali njavali@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com [sudip: adjust context] Signed-off-by: Sudip Mukherjee sudipm.mukherjee@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/qla2xxx/qla_tmpl.c | 9 +++++---- drivers/scsi/qla2xxx/qla_tmpl.h | 2 +- 2 files changed, 6 insertions(+), 5 deletions(-)
--- a/drivers/scsi/qla2xxx/qla_tmpl.c +++ b/drivers/scsi/qla2xxx/qla_tmpl.c @@ -871,7 +871,8 @@ qla27xx_template_checksum(void *p, ulong static inline int qla27xx_verify_template_checksum(struct qla27xx_fwdt_template *tmp) { - return qla27xx_template_checksum(tmp, tmp->template_size) == 0; + return qla27xx_template_checksum(tmp, + le32_to_cpu(tmp->template_size)) == 0; }
static inline int @@ -887,7 +888,7 @@ qla27xx_execute_fwdt_template(struct scs ulong len;
if (qla27xx_fwdt_template_valid(tmp)) { - len = tmp->template_size; + len = le32_to_cpu(tmp->template_size); tmp = memcpy(vha->hw->fw_dump, tmp, len); ql27xx_edit_template(vha, tmp); qla27xx_walk_template(vha, tmp, tmp, &len); @@ -903,7 +904,7 @@ qla27xx_fwdt_calculate_dump_size(struct ulong len = 0;
if (qla27xx_fwdt_template_valid(tmp)) { - len = tmp->template_size; + len = le32_to_cpu(tmp->template_size); qla27xx_walk_template(vha, tmp, NULL, &len); }
@@ -915,7 +916,7 @@ qla27xx_fwdt_template_size(void *p) { struct qla27xx_fwdt_template *tmp = p;
- return tmp->template_size; + return le32_to_cpu(tmp->template_size); }
ulong --- a/drivers/scsi/qla2xxx/qla_tmpl.h +++ b/drivers/scsi/qla2xxx/qla_tmpl.h @@ -13,7 +13,7 @@ struct __packed qla27xx_fwdt_template { uint32_t template_type; uint32_t entry_offset; - uint32_t template_size; + __le32 template_size; uint32_t reserved_1;
uint32_t entry_count;
From: Lai Jiangshan laijs@linux.alibaba.com
commit 88bf56d04bc3564542049ec4ec168a8b60d0b48c upstream
In kvm_mmu_notifier_invalidate_range_start(), tlbs_dirty is used as:

        need_tlb_flush |= kvm->tlbs_dirty;

with need_tlb_flush's type being int and tlbs_dirty's type being long.
It means that tlbs_dirty is always used as an int and the higher 32 bits are useless. We need to check tlbs_dirty in a correct way, and this change checks it directly without propagating it to need_tlb_flush.
Note: it's _extremely_ unlikely this neglecting of higher 32 bits can cause problems in practice. It would require encountering tlbs_dirty on a 4 billion count boundary, and KVM would need to be using shadow paging or be running a nested guest.
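The truncation is easy to demonstrate in isolation (standalone C, assuming an LP64 target where long is 64-bit):

#include <stdio.h>

int main(void)
{
	long tlbs_dirty = 1L << 32;	/* non-zero, but only in the upper 32 bits */
	int need_tlb_flush = 0;

	need_tlb_flush |= tlbs_dirty;	/* converted to int: upper bits dropped */

	printf("need_tlb_flush = %d, tlbs_dirty != 0 is %d\n",
	       need_tlb_flush, tlbs_dirty != 0);	/* prints 0 and 1 */
	return 0;
}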
Cc: stable@vger.kernel.org Fixes: a4ee1ca4a36e ("KVM: MMU: delay flush all tlbs on sync_page path") Signed-off-by: Lai Jiangshan laijs@linux.alibaba.com Message-Id: 20201217154118.16497-1-jiangshanlai@gmail.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com [sudip: adjust context] Signed-off-by: Sudip Mukherjee sudipm.mukherjee@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- virt/kvm/kvm_main.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -346,9 +346,8 @@ static void kvm_mmu_notifier_invalidate_ */ kvm->mmu_notifier_count++; need_tlb_flush = kvm_unmap_hva_range(kvm, start, end); - need_tlb_flush |= kvm->tlbs_dirty; /* we've to flush the tlb before the pages can be freed */ - if (need_tlb_flush) + if (need_tlb_flush || kvm->tlbs_dirty) kvm_flush_remote_tlbs(kvm);
spin_unlock(&kvm->mmu_lock);
Hi!
This is the start of the stable review cycle for the 4.4.258 release. There are 35 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
CIP testing did not find any problems here:
https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/tree/linux-4...
Tested-by: Pavel Machek (CIP) pavel@denx.de
Best regards, Pavel
On Mon, Feb 22, 2021 at 01:35:56PM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.4.258 release. There are 35 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 24 Feb 2021 12:07:46 +0000. Anything received after that time might be too late.
Build results:
	total: 165 pass: 165 fail: 0
Qemu test results:
	total: 329 pass: 329 fail: 0
Tested-by: Guenter Roeck linux@roeck-us.net
Guenter
On Mon, 22 Feb 2021 at 18:10, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 4.4.258 release. There are 35 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 24 Feb 2021 12:07:46 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.258-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
Summary ------------------------------------------------------------------------
kernel: 4.4.258-rc1
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.4.y
git commit: 1a954f75c0ee3245a025a60f2a4cccd6722a1bb6
git describe: v4.4.256-39-g1a954f75c0ee
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-4.4.y/build/v4.4.25...
No regressions (compared to build 4.4.257-rc1)
No fixes (compared to build 4.4.257-rc1)
Ran 31761 total tests in the following environments and test suites.
Environments -------------- - arm - arm64 - i386 - juno-64k_page_size - juno-r2 - arm64 - juno-r2-compat - juno-r2-kasan - mips - qemu-arm64-kasan - qemu-x86_64-kasan - qemu_arm - qemu_arm64 - qemu_arm64-compat - qemu_i386 - qemu_x86_64 - qemu_x86_64-compat - sparc - x15 - arm - x86_64 - x86-kasan - x86_64
Test Suites ----------- * build * linux-log-parser * kselftest-android * kselftest-bpf * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-efivarfs * kselftest-filesystems * kselftest-firmware * kselftest-fpu * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-ir * kselftest-kcmp * kselftest-kexec * kselftest-kvm * kselftest-lib * kselftest-livepatch * kselftest-lkdtm * kselftest-membarrier * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-seccomp * kselftest-sigaltstack * kselftest-size * kselftest-splice * kselftest-static_keys * kselftest-sync * kselftest-sysctl * kselftest-timens * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user * kselftest-vm * kselftest-x86 * kselftest-zram * libhugetlbfs * ltp-cap_bounds-tests * ltp-commands-tests * ltp-containers-tests * ltp-controllers-tests * ltp-cpuhotplug-tests * ltp-crypto-tests * ltp-cve-tests * ltp-dio-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-open-posix-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-tracing-tests * network-basic-tests * perf * v4l2-compliance * kvm-unit-tests * fwts * ssuite
Summary ------------------------------------------------------------------------
kernel: 4.4.258-rc1
git repo: https://git.linaro.org/lkft/arm64-stable-rc.git
git branch: 4.4.258-rc1-hikey-20210222-938
git commit: 13d50bac200ebd6562e303c2847856b75c283666
git describe: 4.4.257-rc1-hikey-20210208-927
Test details: https://qa-reports.linaro.org/lkft/linaro-hikey-stable-rc-4.4-oe/build/4.4.2...
No regressions (compared to build 4.4.257-rc1-hikey-20210208-927)
No fixes (compared to build 4.4.257-rc1-hikey-20210208-927)
Ran 1897 total tests in the following environments and test suites.
Environments -------------- - hi6220-hikey - arm64
Test Suites ----------- * build * install-android-platform-tools-r2600 * libhugetlbfs * linux-log-parser * ltp-cap_bounds-tests * ltp-commands-tests * ltp-containers-tests * ltp-cpuhotplug-tests * ltp-cve-tests * ltp-dio-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * perf * spectre-meltdown-checker-test * v4l2-compliance * kselftest-android * kselftest-bpf * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-efivarfs * kselftest-filesystems * kselftest-firmware * kselftest-fpu * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-ir * kselftest-kcmp * kselftest-kexec * kselftest-kvm * kselftest-lib * kselftest-livepatch * kselftest-lkdtm * kselftest-membarrier * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-seccomp * kselftest-sigaltstack * kselftest-size * kselftest-splice * kselftest-static_keys * kselftest-sync * kselftest-sysctl * kselftest-timens * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user * kselftest-vm * kselftest-x86 * kselftest-zram
On Mon, 22 Feb 2021 13:35:56 +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.4.258 release. There are 35 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 24 Feb 2021 12:07:46 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.258-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v4.4:
    6 builds:  6 pass, 0 fail
    12 boots:  12 pass, 0 fail
    28 tests:  28 pass, 0 fail
Linux version: 4.4.258-rc1-gd947b6dcd5fc
Boards tested: tegra124-jetson-tk1, tegra20-ventana, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
On 2/22/21 5:35 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.4.258 release. There are 35 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 24 Feb 2021 12:07:46 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.258-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
linux-stable-mirror@lists.linaro.org