Hi
Side note: This fix requires
4e7aaa6b82d6 ("netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type")
to be applied first, as a dependency.
Thanks
On Sat, Jun 22, 2024 at 07:41:24PM -0400, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> netfilter: ipset: Fix suspicious rcu_dereference_protected()
>
> to the 6.9-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> netfilter-ipset-fix-suspicious-rcu_dereference_prote.patch
> and it can be found in the queue-6.9 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
>
> commit 0226dfa53edc90463c1b0d50167da948c88025ef
> Author: Jozsef Kadlecsik <kadlec(a)netfilter.org>
> Date: Mon Jun 17 11:18:15 2024 +0200
>
> netfilter: ipset: Fix suspicious rcu_dereference_protected()
>
> [ Upstream commit 8ecd06277a7664f4ef018abae3abd3451d64e7a6 ]
>
> When destroying all sets, we are either in pernet exit phase or
> are executing a "destroy all sets command" from userspace. The latter
> was taken into account in ip_set_dereference() (nfnetlink mutex is held),
> but the former was not. The patch adds the required check to
> rcu_dereference_protected() in ip_set_dereference().
>
> Fixes: 4e7aaa6b82d6 ("netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type")
> Reported-by: syzbot+b62c37cdd58103293a5a(a)syzkaller.appspotmail.com
> Reported-by: syzbot+cfbe1da5fdfc39efc293(a)syzkaller.appspotmail.com
> Reported-by: kernel test robot <oliver.sang(a)intel.com>
> Closes: https://lore.kernel.org/oe-lkp/202406141556.e0b6f17e-lkp@intel.com
> Signed-off-by: Jozsef Kadlecsik <kadlec(a)netfilter.org>
> Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c
> index c7ae4d9bf3d24..61431690cbd5f 100644
> --- a/net/netfilter/ipset/ip_set_core.c
> +++ b/net/netfilter/ipset/ip_set_core.c
> @@ -53,12 +53,13 @@ MODULE_DESCRIPTION("core IP set support");
> MODULE_ALIAS_NFNL_SUBSYS(NFNL_SUBSYS_IPSET);
>
> /* When the nfnl mutex or ip_set_ref_lock is held: */
> -#define ip_set_dereference(p) \
> - rcu_dereference_protected(p, \
> +#define ip_set_dereference(inst) \
> + rcu_dereference_protected((inst)->ip_set_list, \
> lockdep_nfnl_is_held(NFNL_SUBSYS_IPSET) || \
> - lockdep_is_held(&ip_set_ref_lock))
> + lockdep_is_held(&ip_set_ref_lock) || \
> + (inst)->is_deleted)
> #define ip_set(inst, id) \
> - ip_set_dereference((inst)->ip_set_list)[id]
> + ip_set_dereference(inst)[id]
> #define ip_set_ref_netlink(inst,id) \
> rcu_dereference_raw((inst)->ip_set_list)[id]
> #define ip_set_dereference_nfnl(p) \
> @@ -1133,7 +1134,7 @@ static int ip_set_create(struct sk_buff *skb, const struct nfnl_info *info,
> if (!list)
> goto cleanup;
> /* nfnl mutex is held, both lists are valid */
> - tmp = ip_set_dereference(inst->ip_set_list);
> + tmp = ip_set_dereference(inst);
> memcpy(list, tmp, sizeof(struct ip_set *) * inst->ip_set_max);
> rcu_assign_pointer(inst->ip_set_list, list);
> /* Make sure all current packets have passed through */
The following commit has been merged into the smp/urgent branch of tip:
Commit-ID: 932d8476399f622aa0767a4a0a9e78e5341dc0e1
Gitweb: https://git.kernel.org/tip/932d8476399f622aa0767a4a0a9e78e5341dc0e1
Author: Yuntao Wang <ytcoode(a)gmail.com>
AuthorDate: Wed, 15 May 2024 21:45:54 +08:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Mon, 17 Jun 2024 15:08:04 +02:00
cpu/hotplug: Fix dynstate assignment in __cpuhp_setup_state_cpuslocked()
Commit 4205e4786d0b ("cpu/hotplug: Provide dynamic range for prepare
stage") added a dynamic range for the prepare states, but did not handle
the assignment of the dynstate variable in __cpuhp_setup_state_cpuslocked().
This causes the corresponding startup callback not to be invoked when
calling __cpuhp_setup_state_cpuslocked() with the CPUHP_BP_PREPARE_DYN
parameter, even though it should be.
Currently, the users of __cpuhp_setup_state_cpuslocked(), for one reason or
another, have not triggered this bug.
Fixes: 4205e4786d0b ("cpu/hotplug: Provide dynamic range for prepare stage")
Signed-off-by: Yuntao Wang <ytcoode(a)gmail.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20240515134554.427071-1-ytcoode@gmail.com
---
kernel/cpu.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 563877d..74cfdb6 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2446,7 +2446,7 @@ EXPORT_SYMBOL_GPL(__cpuhp_state_add_instance);
* The caller needs to hold cpus read locked while calling this function.
* Return:
* On success:
- * Positive state number if @state is CPUHP_AP_ONLINE_DYN;
+ * Positive state number if @state is CPUHP_AP_ONLINE_DYN or CPUHP_BP_PREPARE_DYN;
* 0 for all other states
* On failure: proper (negative) error code
*/
@@ -2469,7 +2469,7 @@ int __cpuhp_setup_state_cpuslocked(enum cpuhp_state state,
ret = cpuhp_store_callbacks(state, name, startup, teardown,
multi_instance);
- dynstate = state == CPUHP_AP_ONLINE_DYN;
+ dynstate = state == CPUHP_AP_ONLINE_DYN || state == CPUHP_BP_PREPARE_DYN;
if (ret > 0 && dynstate) {
state = ret;
ret = 0;
@@ -2500,8 +2500,8 @@ int __cpuhp_setup_state_cpuslocked(enum cpuhp_state state,
out:
mutex_unlock(&cpuhp_state_mutex);
/*
- * If the requested state is CPUHP_AP_ONLINE_DYN, return the
- * dynamically allocated state in case of success.
+ * If the requested state is CPUHP_AP_ONLINE_DYN or CPUHP_BP_PREPARE_DYN,
+ * return the dynamically allocated state in case of success.
*/
if (!ret && dynstate)
return state;
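
For readers following the dynstate change above, here is a minimal userspace sketch of the control flow, not the kernel code itself. The names model_state, model_store_callbacks(), model_setup_state() and the numeric values are hypothetical; only the shape of the "ret > 0 && dynstate" handling is taken from the diff.

```c
/* Hypothetical stand-ins; the real enum cpuhp_state has many more entries. */
enum model_state {
	MODEL_BP_PREPARE_DYN = 10,
	MODEL_AP_ONLINE_DYN  = 50,
};

static int startup_calls;	/* counts how often the startup callback ran */

static int model_startup(void)
{
	startup_calls++;
	return 0;
}

/* Model of the store step: a dynamic request yields a positive state number. */
static int model_store_callbacks(int state)
{
	if (state == MODEL_BP_PREPARE_DYN || state == MODEL_AP_ONLINE_DYN)
		return state + 1;
	return 0;
}

/* with_fix selects the old (0) or the patched (1) dynstate assignment. */
static int model_setup_state(int state, int (*startup)(void), int with_fix)
{
	int ret = model_store_callbacks(state);
	int dynstate = state == MODEL_AP_ONLINE_DYN ||
		       (with_fix && state == MODEL_BP_PREPARE_DYN);

	if (ret > 0 && dynstate) {
		state = ret;
		ret = 0;
	}
	if (ret || !startup)
		goto out;	/* old code leaves here for the prepare range */

	ret = startup();
out:
	if (!ret && dynstate)
		return state;
	return ret;
}
```

In this model the unpatched path still returns a positive number for the prepare range, but the startup callback is skipped, which matches the behaviour the changelog describes.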
After the rework of "Parallel CPU bringup", the command-line parameters
"nosmp" and "maxcpus=0" are broken. These parameters set setup_max_cpus to
zero, and setup_max_cpus is passed as "max_cpus" to bringup_nonboot_cpus().
In this case, "if (!--ncpus)" in cpuhp_bringup_mask() is never true, and
the result is that all possible CPUs are brought up.
We can fix it by changing "if (!--ncpus)" to "if (!ncpus--)". But to
make the logic clearer and save some CPU cycles, it is better to check
"max_cpus" in bringup_nonboot_cpus() and return early if it is zero.
Cc: stable(a)vger.kernel.org
Fixes: 18415f33e2ac4ab382 ("cpu/hotplug: Allow "parallel" bringup up to CPUHP_BP_KICK_AP_STATE")
Fixes: 06c6796e0304234da6 ("cpu/hotplug: Fix off by one in cpuhp_bringup_mask()")
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
---
kernel/cpu.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 563877d6c28b..200974a31de8 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1859,6 +1859,9 @@ static inline bool cpuhp_bringup_cpus_parallel(unsigned int ncpus) { return fals
void __init bringup_nonboot_cpus(unsigned int max_cpus)
{
+ if (!max_cpus)
+ return;
+
/* Try parallel bringup optimization if enabled */
if (cpuhp_bringup_cpus_parallel(max_cpus))
return;
--
2.43.0
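
The wrap-around described in the changelog above can be reproduced in plain userspace C. This is only a sketch: model_bringup_mask() and model_bringup_nonboot() are hypothetical stand-ins that count CPUs instead of calling cpu_up(), and the loop over "possible" replaces the real cpumask iteration.

```c
/* Hypothetical model: returns how many CPUs the loop would bring up. */
static unsigned int model_bringup_mask(unsigned int possible, unsigned int ncpus)
{
	unsigned int cpu, up = 0;

	for (cpu = 0; cpu < possible; cpu++) {
		up++;			/* stands in for cpu_up(cpu, target) */
		if (!--ncpus)		/* 0 wraps to UINT_MAX, so not true here */
			break;
	}
	return up;
}

/* Model of bringup_nonboot_cpus() with the early return from the patch. */
static unsigned int model_bringup_nonboot(unsigned int possible,
					  unsigned int max_cpus)
{
	if (!max_cpus)
		return 0;
	return model_bringup_mask(possible, max_cpus);
}
```

Because ncpus is unsigned, decrementing it from zero is well defined and yields UINT_MAX, so the countdown cannot reach zero before the loop runs out of CPUs.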
x86_of_pci_irq_enable() returns the PCIBIOS_* code received from
pci_read_config_byte() directly, and also returns -EINVAL; these are not
compatible error types. x86_of_pci_irq_enable() is used as the
(*pcibios_enable_irq) function, which should not return PCIBIOS_* codes.
Convert the PCIBIOS_* return code from pci_read_config_byte() into
normal errno using pcibios_err_to_errno().
Fixes: 96e0a0797eba ("x86: dtb: Add support for PCI devices backed by dtb nodes")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
Cc: stable(a)vger.kernel.org
---
arch/x86/kernel/devicetree.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/devicetree.c b/arch/x86/kernel/devicetree.c
index 8e3c53b4d070..64280879c68c 100644
--- a/arch/x86/kernel/devicetree.c
+++ b/arch/x86/kernel/devicetree.c
@@ -83,7 +83,7 @@ static int x86_of_pci_irq_enable(struct pci_dev *dev)
ret = pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
if (ret)
- return ret;
+ return pcibios_err_to_errno(ret);
if (!pin)
return 0;
--
2.39.2
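
To illustrate why the two error types do not mix, here is a small userspace sketch of the kind of translation pcibios_err_to_errno() performs. The MODEL_* constants and the two mappings shown are my recollection of include/linux/pci.h and should be treated as assumptions; the function name model_pcibios_err_to_errno() is hypothetical.

```c
#include <errno.h>

/* Assumed values, written from memory of include/linux/pci.h. */
#define MODEL_PCIBIOS_SUCCESSFUL		0x00
#define MODEL_PCIBIOS_DEVICE_NOT_FOUND		0x86
#define MODEL_PCIBIOS_BAD_REGISTER_NUMBER	0x87

static int model_pcibios_err_to_errno(int err)
{
	if (err <= MODEL_PCIBIOS_SUCCESSFUL)
		return err;	/* zero, or already a negative errno */

	switch (err) {
	case MODEL_PCIBIOS_DEVICE_NOT_FOUND:
		return -ENODEV;
	case MODEL_PCIBIOS_BAD_REGISTER_NUMBER:
		return -EFAULT;
	}
	return -ERANGE;		/* any other positive code */
}
```

A caller that tests "ret < 0" would miss a raw positive PCIBIOS_* code, whereas the converted value is always zero or negative.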
From: Arnd Bergmann <arnd(a)arndb.de>
Both of these architectures require u64 function arguments to be
passed in even/odd pairs of registers or stack slots, which in case of
sync_file_range would result in a seven-argument system call that is
not currently possible. The system call is therefore incompatible with
all existing binaries.
While it would be possible to implement support for seven arguments
like on mips, it seems better to use a six-argument version, either
with the normal argument order but misaligned as on most architectures
or with the reordered sync_file_range2() calling conventions as on
arm and powerpc.
Cc: stable(a)vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
arch/csky/include/uapi/asm/unistd.h | 1 +
arch/hexagon/include/uapi/asm/unistd.h | 1 +
2 files changed, 2 insertions(+)
diff --git a/arch/csky/include/uapi/asm/unistd.h b/arch/csky/include/uapi/asm/unistd.h
index 7ff6a2466af1..e0594b6370a6 100644
--- a/arch/csky/include/uapi/asm/unistd.h
+++ b/arch/csky/include/uapi/asm/unistd.h
@@ -6,6 +6,7 @@
#define __ARCH_WANT_SYS_CLONE3
#define __ARCH_WANT_SET_GET_RLIMIT
#define __ARCH_WANT_TIME32_SYSCALLS
+#define __ARCH_WANT_SYNC_FILE_RANGE2
#include <asm-generic/unistd.h>
#define __NR_set_thread_area (__NR_arch_specific_syscall + 0)
diff --git a/arch/hexagon/include/uapi/asm/unistd.h b/arch/hexagon/include/uapi/asm/unistd.h
index 432c4db1b623..21ae22306b5d 100644
--- a/arch/hexagon/include/uapi/asm/unistd.h
+++ b/arch/hexagon/include/uapi/asm/unistd.h
@@ -36,5 +36,6 @@
#define __ARCH_WANT_SYS_VFORK
#define __ARCH_WANT_SYS_FORK
#define __ARCH_WANT_TIME32_SYSCALLS
+#define __ARCH_WANT_SYNC_FILE_RANGE2
#include <asm-generic/unistd.h>
--
2.39.2
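
The seven-versus-six argument count from the changelog above can be checked with a small slot-counting sketch. It assumes a 32-bit ABI in which every 64-bit argument must start on an even slot; model_arg_slots() and the two arrays are hypothetical helpers, and the argument orders are those of sync_file_range() and sync_file_range2().

```c
#include <stddef.h>

/*
 * Count argument slots when every 64-bit argument must start on an even
 * slot. words[i] is 1 for a 32-bit argument and 2 for a 64-bit one.
 */
static unsigned int model_arg_slots(const unsigned int *words, size_t n)
{
	unsigned int slot = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (words[i] == 2 && (slot & 1))
			slot++;		/* padding slot to reach an even index */
		slot += words[i];
	}
	return slot;
}

/* sync_file_range(fd, offset, nbytes, flags) */
static const unsigned int model_sfr[]  = { 1, 2, 2, 1 };
/* sync_file_range2(fd, flags, offset, nbytes) */
static const unsigned int model_sfr2[] = { 1, 1, 2, 2 };
```

Moving flags next to fd fills the slot that would otherwise be padding, which is why the reordered variant fits into six slots.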
The following commit has been merged into the irq/urgent branch of tip:
Commit-ID: 2d64eaeeeda5659d52da1af79d237269ba3c2d2c
Gitweb: https://git.kernel.org/tip/2d64eaeeeda5659d52da1af79d237269ba3c2d2c
Author: Huacai Chen <chenhuacai(a)loongson.cn>
AuthorDate: Sun, 23 Jun 2024 11:41:13 +08:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Sun, 23 Jun 2024 17:09:26 +02:00
irqchip/loongson-eiointc: Use early_cpu_to_node() instead of cpu_to_node()
Multi-bridge machines require that all eiointc controllers in the system
are initialized, otherwise the system does not boot.
The initialization happens on the boot CPU during early boot and relies on
cpu_to_node() for identifying the individual nodes.
That works when the number of possible CPUs is large enough, but fails
with a command line limit, e.g. "nr_cpus=$N" for kdump, when the CPUs
of the secondary nodes are not covered.
During early ACPI enumeration all CPU to node mappings are recorded up to
CONFIG_NR_CPUS. These are accessible via early_cpu_to_node() even in the
case that "nr_cpus=N" truncates the number of possible CPUs, whereas
cpu_to_node() only provides the translation for the possible CPUs.
Change the node lookup in the driver to use early_cpu_to_node() so that
even with a limitation on the number of possible CPUs all eiointc instances
are initialized.
This obviously cannot cure the case where CONFIG_NR_CPUS is too small.
[ tglx: Massaged changelog ]
Fixes: 64cc451e45e1 ("irqchip/loongson-eiointc: Fix incorrect use of acpi_get_vec_parent")
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20240623034113.1808727-1-chenhuacai@loongson.cn
---
drivers/irqchip/irq-loongson-eiointc.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/irqchip/irq-loongson-eiointc.c b/drivers/irqchip/irq-loongson-eiointc.c
index c7ddebf..b1f2080 100644
--- a/drivers/irqchip/irq-loongson-eiointc.c
+++ b/drivers/irqchip/irq-loongson-eiointc.c
@@ -15,6 +15,7 @@
#include <linux/irqchip/chained_irq.h>
#include <linux/kernel.h>
#include <linux/syscore_ops.h>
+#include <asm/numa.h>
#define EIOINTC_REG_NODEMAP 0x14a0
#define EIOINTC_REG_IPMAP 0x14c0
@@ -339,7 +340,7 @@ static int __init pch_msi_parse_madt(union acpi_subtable_headers *header,
int node;
if (cpu_has_flatmode)
- node = cpu_to_node(eiointc_priv[nr_pics - 1]->node * CORES_PER_EIO_NODE);
+ node = early_cpu_to_node(eiointc_priv[nr_pics - 1]->node * CORES_PER_EIO_NODE);
else
node = eiointc_priv[nr_pics - 1]->node;
@@ -431,7 +432,7 @@ int __init eiointc_acpi_init(struct irq_domain *parent,
goto out_free_handle;
if (cpu_has_flatmode)
- node = cpu_to_node(acpi_eiointc->node * CORES_PER_EIO_NODE);
+ node = early_cpu_to_node(acpi_eiointc->node * CORES_PER_EIO_NODE);
else
node = acpi_eiointc->node;
acpi_set_vec_parent(node, priv->eiointc_domain, pch_group);
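
The difference between the two lookups can be pictured with a toy model. Everything here is hypothetical: the table contents, MODEL_NR_CPUS, and in particular MODEL_NO_NODE, since the changelog does not say what the real cpu_to_node() returns for a CPU that is not possible.

```c
#define MODEL_NR_CPUS	8	/* stands in for CONFIG_NR_CPUS */
#define MODEL_NO_NODE	(-1)	/* assumed marker for "no usable mapping" */

/* Mappings recorded during early enumeration: two nodes, four cores each. */
static const int model_early_map[MODEL_NR_CPUS] = { 0, 0, 0, 0, 1, 1, 1, 1 };

/* Early lookup: valid for every CPU up to MODEL_NR_CPUS. */
static int model_early_cpu_to_node(int cpu)
{
	return model_early_map[cpu];
}

/* Regular lookup: only meaningful for possible CPUs (cpu < nr_cpus). */
static int model_cpu_to_node(int cpu, int nr_cpus)
{
	return cpu < nr_cpus ? model_early_map[cpu] : MODEL_NO_NODE;
}
```

With nr_cpus limited to 4 in this model, the first core of the second node (CPU 4) has no regular mapping, while the early table still resolves it to node 1.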
In the liointc hardware, there are different ISRs (i.e. Interrupt Status
Registers) for different cores. Until now we always used core#0's ISR,
which caused no problems because the interrupts are routed to core#0 by
default. If the routing is changed (which can be done by changing the
firmware configuration), interrupts are lost during CPU hotplug, so we
should set the correct ISR for each core.
Cc: <stable(a)vger.kernel.org>
Co-developed-by: Tianli Xiong <xiongtianli(a)loongson.cn>
Signed-off-by: Tianli Xiong <xiongtianli(a)loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
---
V2: Update commit messages.
drivers/irqchip/irq-loongson-liointc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/irqchip/irq-loongson-liointc.c b/drivers/irqchip/irq-loongson-liointc.c
index e4b33aed1c97..7c4fe7ab4b83 100644
--- a/drivers/irqchip/irq-loongson-liointc.c
+++ b/drivers/irqchip/irq-loongson-liointc.c
@@ -28,7 +28,7 @@
#define LIOINTC_INTC_CHIP_START 0x20
-#define LIOINTC_REG_INTC_STATUS (LIOINTC_INTC_CHIP_START + 0x20)
+#define LIOINTC_REG_INTC_STATUS(core) (LIOINTC_INTC_CHIP_START + 0x20 + (core) * 8)
#define LIOINTC_REG_INTC_EN_STATUS (LIOINTC_INTC_CHIP_START + 0x04)
#define LIOINTC_REG_INTC_ENABLE (LIOINTC_INTC_CHIP_START + 0x08)
#define LIOINTC_REG_INTC_DISABLE (LIOINTC_INTC_CHIP_START + 0x0c)
@@ -217,7 +217,7 @@ static int liointc_init(phys_addr_t addr, unsigned long size, int revision,
goto out_free_priv;
for (i = 0; i < LIOINTC_NUM_CORES; i++)
- priv->core_isr[i] = base + LIOINTC_REG_INTC_STATUS;
+ priv->core_isr[i] = base + LIOINTC_REG_INTC_STATUS(i);
for (i = 0; i < LIOINTC_NUM_PARENT; i++)
priv->handler[i].parent_int_map = parent_int_map[i];
--
2.43.0
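
As a quick check of the new macro, the per-core ISR offsets can be computed in userspace. The two LIOINTC_* definitions are copied from the diff above; MODEL_NUM_CORES and model_isr_offsets() are hypothetical and only there for illustration.

```c
#define LIOINTC_INTC_CHIP_START		0x20
#define LIOINTC_REG_INTC_STATUS(core)	(LIOINTC_INTC_CHIP_START + 0x20 + (core) * 8)
#define MODEL_NUM_CORES			4	/* assumed value for illustration */

/* Fill offs[] with the per-core ISR offset relative to the register base. */
static void model_isr_offsets(unsigned int offs[MODEL_NUM_CORES])
{
	int i;

	for (i = 0; i < MODEL_NUM_CORES; i++)
		offs[i] = LIOINTC_REG_INTC_STATUS(i);
}
```

Core 0 keeps the offset the old macro produced, so the default routing is unaffected, and each further core is 8 bytes higher. The parentheses around (core) keep the result correct when the argument is an expression.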