From: Oleg Nesterov <oleg(a)redhat.com>
sched/isolation: Prevent boot crash when the boot CPU is nohz_full
[ Upstream commit 5097cbcb38e6e0d2627c9dde1985e91d2c9f880e ]
Documentation/timers/no_hz.rst states that the "nohz_full=" mask must not
include the boot CPU, which is no longer true after:
commit 08ae95f4fd3b ("nohz_full: Allow the boot CPU to be nohz_full").
However after:
aae17ebb53cd ("workqueue: Avoid using isolated cpus' timers on queue_delayed_work")
the kernel will crash at boot time in this case; housekeeping_any_cpu()
returns an invalid CPU number until smp_init() brings the first
housekeeping CPU up.
Change housekeeping_any_cpu() to check the result of cpumask_any_and() and
return smp_processor_id() in this case.
This is just the simple and backportable workaround which fixes the
symptom, but smp_processor_id() at boot time should be safe at least for
type == HK_TYPE_TIMER, this more or less matches the tick_do_timer_boot_cpu
logic.
There is no worry about cpu_down(); tick_nohz_cpu_down() will not allow to
offline tick_do_timer_cpu (the 1st online housekeeping CPU).
[ Apply only documentation changes as commit which causes boot
crash when boot CPU is nohz_full is not backported to stable
kernels - Krishanth ]
Fixes: aae17ebb53cd ("workqueue: Avoid using isolated cpus' timers on queue_delayed_work")
Reported-by: Chris von Recklinghausen <crecklin(a)redhat.com>
Signed-off-by: Oleg Nesterov <oleg(a)redhat.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Reviewed-by: Phil Auld <pauld(a)redhat.com>
Acked-by: Frederic Weisbecker <frederic(a)kernel.org>
Link: https://lore.kernel.org/r/20240411143905.GA19288@redhat.com
Closes: https://lore.kernel.org/all/20240402105847.GA24832@redhat.com/
Cc: stable(a)vger.kernel.org # 5.4+
Signed-off-by: Krishanth Jagaduri <Krishanth.Jagaduri(a)sony.com>
---
Hi,
Before kernel 6.9, Documentation/timers/no_hz.rst states that
"nohz_full=" mask must not include the boot CPU, which is no longer
true after commit 08ae95f4fd3b ("nohz_full: Allow the boot CPU to be
nohz_full").
When trying LTS kernels between 5.4 and 6.6, we noticed we could use
boot CPU as nohz_full but the information in the document was misleading.
This was fixed upstream by commit 5097cbcb38e6 ("sched/isolation: Prevent
boot crash when the boot CPU is nohz_full").
While it fixes the document description, it also fixes issue introduced
by another commit aae17ebb53cd ("workqueue: Avoid using isolated cpus'
timers on queue_delayed_work").
It is unlikely that upstream commit as a whole will be backported to
stable kernels which does not contain the commit that introduced the
issue of boot crash when boot CPU is nohz_full.
Could we fix only the document portion in stable kernels 5.4+ that
mentions boot CPU cannot be nohz_full?
---
Changes in v2:
- Add original changelog and trailers to commit message.
- Add backport note for why only document portion is modified.
- Link to v1: https://lore.kernel.org/r/20250205-send-oss-20250129-v1-1-d404921e6d7e@sony…
---
Documentation/timers/no_hz.rst | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/Documentation/timers/no_hz.rst b/Documentation/timers/no_hz.rst
index 065db217cb04fc252bbf6a05991296e7f1d3a4c5..16bda468423e88090c0dc467ca7a5c7f3fd2bf02 100644
--- a/Documentation/timers/no_hz.rst
+++ b/Documentation/timers/no_hz.rst
@@ -129,11 +129,8 @@ adaptive-tick CPUs: At least one non-adaptive-tick CPU must remain
online to handle timekeeping tasks in order to ensure that system
calls like gettimeofday() returns accurate values on adaptive-tick CPUs.
(This is not an issue for CONFIG_NO_HZ_IDLE=y because there are no running
-user processes to observe slight drifts in clock rate.) Therefore, the
-boot CPU is prohibited from entering adaptive-ticks mode. Specifying a
-"nohz_full=" mask that includes the boot CPU will result in a boot-time
-error message, and the boot CPU will be removed from the mask. Note that
-this means that your system must have at least two CPUs in order for
+user processes to observe slight drifts in clock rate.) Note that this
+means that your system must have at least two CPUs in order for
CONFIG_NO_HZ_FULL=y to do anything for you.
Finally, adaptive-ticks CPUs must have their RCU callbacks offloaded.
---
base-commit: 219d54332a09e8d8741c1e1982f5eae56099de85
change-id: 20250129-send-oss-20250129-3c42dcf463eb
Best regards,
--
Krishanth Jagaduri <Krishanth.Jagaduri(a)sony.com>
Now for dwmac-loongson {tx,rx}_fifo_size are uninitialised, which means
zero. This means dwmac-loongson doesn't support changing MTU, so set the
correct tx_fifo_size and rx_fifo_size for it (16KB multiplied by channel
counts).
Note: the Fixes tag is not exactly right, but it is a key commit of the
dwmac-loongson series.
Cc: stable(a)vger.kernel.org
Fixes: ad72f783de06827a1f ("net: stmmac: Add multi-channel support")
Signed-off-by: Chong Qiao <qiaochong(a)loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
---
drivers/net/ethernet/stmicro/stmmac/dwmac-loongson.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-loongson.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-loongson.c
index bfe6e2d631bd..79acdf38c525 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-loongson.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-loongson.c
@@ -574,6 +574,9 @@ static int loongson_dwmac_probe(struct pci_dev *pdev, const struct pci_device_id
if (ret)
goto err_disable_device;
+ plat->tx_fifo_size = SZ_16K * plat->tx_queues_to_use;
+ plat->rx_fifo_size = SZ_16K * plat->rx_queues_to_use;
+
if (dev_of_node(&pdev->dev))
ret = loongson_dwmac_dt_config(pdev, plat, &res);
else
--
2.47.1
The '-f' parameter is there to force the kernel to emit MPTCP FASTCLOSE
by closing the connection with unread bytes in the receive queue.
The xdisconnect() helper was used to stop the connection, but it does
more than that: it will shut it down, then wait before reconnecting to
the same address. This causes the mptcp_join's "fastclose test" to fail
all the time.
This failure is due to a recent change, with commit 218cc166321f
("selftests: mptcp: avoid spurious errors on disconnect"), but that went
unnoticed because the test is currently ignored. The recent modification
only shown an existing issue: xdisconnect() doesn't need to be used
here, only the shutdown() part is needed.
Fixes: 6bf41020b72b ("selftests: mptcp: update and extend fastclose test-cases")
Cc: stable(a)vger.kernel.org
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Notes:
- The failure was not clearly visible on NIPA, because the results for
the two impacted sub-tests are currently ignored (unstable). Still,
it looks important to fix that, as this will help when the tests will
be improved not to be unstable any more.
---
tools/testing/selftests/net/mptcp/mptcp_connect.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c
index 414addef9a4514c489ecd09249143fe0ce2af649..d240d02fa443a1cd802f0e705ab36db5c22063a8 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.c
+++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c
@@ -1302,7 +1302,7 @@ int main_loop(void)
return ret;
if (cfg_truncate > 0) {
- xdisconnect(fd);
+ shutdown(fd, SHUT_WR);
} else if (--cfg_repeat > 0) {
xdisconnect(fd);
---
base-commit: 4241a702e0d0c2ca9364cfac08dbf134264962de
change-id: 20250204-net-mptcp-sft-conn-f-d1c14274ba66
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
In the "pmcmd_ioctl" function, three memory objects allocated by
kmalloc are initialized by "hcall_get_cpu_state", which are then
copied to user space. The initializer is indeed implemented in
"acrn_hypercall2" (arch/x86/include/asm/acrn.h). There is a risk of
information leakage due to uninitialized bytes.
Fixes: 3d679d5aec64 ("virt: acrn: Introduce interfaces to query C-states and P-states allowed by hypervisor")
Signed-off-by: Haoyu Li <lihaoyu499(a)gmail.com>
Cc: stable(a)vger.kernel.org
---
drivers/virt/acrn/hsm.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
index c24036c4e51e..e4e196abdaac 100644
--- a/drivers/virt/acrn/hsm.c
+++ b/drivers/virt/acrn/hsm.c
@@ -49,7 +49,7 @@ static int pmcmd_ioctl(u64 cmd, void __user *uptr)
switch (cmd & PMCMD_TYPE_MASK) {
case ACRN_PMCMD_GET_PX_CNT:
case ACRN_PMCMD_GET_CX_CNT:
- pm_info = kmalloc(sizeof(u64), GFP_KERNEL);
+ pm_info = kzalloc(sizeof(u64), GFP_KERNEL);
if (!pm_info)
return -ENOMEM;
@@ -64,7 +64,7 @@ static int pmcmd_ioctl(u64 cmd, void __user *uptr)
kfree(pm_info);
break;
case ACRN_PMCMD_GET_PX_DATA:
- px_data = kmalloc(sizeof(*px_data), GFP_KERNEL);
+ px_data = kzalloc(sizeof(*px_data), GFP_KERNEL);
if (!px_data)
return -ENOMEM;
@@ -79,7 +79,7 @@ static int pmcmd_ioctl(u64 cmd, void __user *uptr)
kfree(px_data);
break;
case ACRN_PMCMD_GET_CX_DATA:
- cx_data = kmalloc(sizeof(*cx_data), GFP_KERNEL);
+ cx_data = kzalloc(sizeof(*cx_data), GFP_KERNEL);
if (!cx_data)
return -ENOMEM;
--
2.34.1
The role switch registration and set_role() can happen in parallel as they
are invoked independent of each other. There is a possibility that a driver
might spend significant amount of time in usb_role_switch_register() API
due to the presence of time intensive operations like component_add()
which operate under common mutex. This leads to a time window after
allocating the switch and before setting the registered flag where the set
role notifications are dropped. Below timeline summarizes this behavior
Thread1 | Thread2
usb_role_switch_register() |
| |
---> allocate switch |
| |
---> component_add() | usb_role_switch_set_role()
| | |
| | --> Drop role notifications
| | since sw->registered
| | flag is not set.
| |
--->Set registered flag.|
To avoid this, cache the last role received and set it once the switch
registration is complete. Since we are now caching the roles based on
registered flag, protect this flag with the switch mutex.
Fixes: b787a3e78175 ("usb: roles: don't get/set_role() when usb_role_switch is unregistered")
cc: stable(a)vger.kernel.org
Signed-off-by: Elson Roy Serrao <quic_eserrao(a)quicinc.com>
---
drivers/usb/roles/class.c | 45 ++++++++++++++++++++++++++++++++-------
1 file changed, 37 insertions(+), 8 deletions(-)
diff --git a/drivers/usb/roles/class.c b/drivers/usb/roles/class.c
index c58a12c147f4..c0149c31c01b 100644
--- a/drivers/usb/roles/class.c
+++ b/drivers/usb/roles/class.c
@@ -26,6 +26,8 @@ struct usb_role_switch {
struct mutex lock; /* device lock*/
struct module *module; /* the module this device depends on */
enum usb_role role;
+ enum usb_role cached_role;
+ bool cached;
bool registered;
/* From descriptor */
@@ -65,6 +67,20 @@ static const struct component_ops connector_ops = {
.unbind = connector_unbind,
};
+static int __usb_role_switch_set_role(struct usb_role_switch *sw,
+ enum usb_role role)
+{
+ int ret;
+
+ ret = sw->set(sw, role);
+ if (!ret) {
+ sw->role = role;
+ kobject_uevent(&sw->dev.kobj, KOBJ_CHANGE);
+ }
+
+ return ret;
+}
+
/**
* usb_role_switch_set_role - Set USB role for a switch
* @sw: USB role switch
@@ -79,17 +95,21 @@ int usb_role_switch_set_role(struct usb_role_switch *sw, enum usb_role role)
if (IS_ERR_OR_NULL(sw))
return 0;
- if (!sw->registered)
- return -EOPNOTSUPP;
-
+ /*
+ * Since we have a valid sw struct here, role switch registration might
+ * be in progress. Hence cache the role here and send it out once
+ * registration is complete.
+ */
mutex_lock(&sw->lock);
-
- ret = sw->set(sw, role);
- if (!ret) {
- sw->role = role;
- kobject_uevent(&sw->dev.kobj, KOBJ_CHANGE);
+ if (!sw->registered) {
+ sw->cached = true;
+ sw->cached_role = role;
+ mutex_unlock(&sw->lock);
+ return 0;
}
+ ret = __usb_role_switch_set_role(sw, role);
+
mutex_unlock(&sw->lock);
return ret;
@@ -399,8 +419,14 @@ usb_role_switch_register(struct device *parent,
dev_warn(&sw->dev, "failed to add component\n");
}
+ mutex_lock(&sw->lock);
sw->registered = true;
+ if (sw->cached)
+ __usb_role_switch_set_role(sw, sw->cached_role);
+
+ mutex_unlock(&sw->lock);
+
/* TODO: Symlinks for the host port and the device controller. */
return sw;
@@ -417,7 +443,10 @@ void usb_role_switch_unregister(struct usb_role_switch *sw)
{
if (IS_ERR_OR_NULL(sw))
return;
+ mutex_lock(&sw->lock);
sw->registered = false;
+ sw->cached = false;
+ mutex_unlock(&sw->lock);
if (dev_fwnode(&sw->dev))
component_del(&sw->dev, &connector_ops);
device_unregister(&sw->dev);
--
2.17.1
When using the in-kernel pd-mapper on x1e80100, client drivers often
fail to communicate with the firmware during boot, which specifically
breaks battery and USB-C altmode notifications. This has been observed
to happen on almost every second boot (41%) but likely depends on probe
order:
pmic_glink_altmode.pmic_glink_altmode pmic_glink.altmode.0: failed to send altmode request: 0x10 (-125)
pmic_glink_altmode.pmic_glink_altmode pmic_glink.altmode.0: failed to request altmode notifications: -125
ucsi_glink.pmic_glink_ucsi pmic_glink.ucsi.0: failed to send UCSI read request: -125
qcom_battmgr.pmic_glink_power_supply pmic_glink.power-supply.0: failed to request power notifications
In the same setup audio also fails to probe albeit much more rarely:
PDR: avs/audio get domain list txn wait failed: -110
PDR: service lookup for avs/audio failed: -110
Chris Lew has provided an analysis and is working on a fix for the
ECANCELED (125) errors, but it is not yet clear whether this will also
address the audio regression.
Even if this was first observed on x1e80100 there is currently no reason
to believe that these issues are specific to that platform.
Disable the in-kernel pd-mapper for now, and make sure to backport this
to stable to prevent users and distros from migrating away from the
user-space service.
Fixes: 1ebcde047c54 ("soc: qcom: add pd-mapper implementation")
Cc: stable(a)vger.kernel.org # 6.11
Link: https://lore.kernel.org/lkml/Zqet8iInnDhnxkT9@hovoldconsulting.com/
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
It's now been over two months since I reported this regression, and even
if we seem to be making some progress on at least some of these issues I
think we need disable the pd-mapper temporarily until the fixes are in
place (e.g. to prevent distros from dropping the user-space service).
Johan
#regzbot introduced: 1ebcde047c54
drivers/soc/qcom/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
index 74b9121240f8..35ddab9338d4 100644
--- a/drivers/soc/qcom/Kconfig
+++ b/drivers/soc/qcom/Kconfig
@@ -78,6 +78,7 @@ config QCOM_PD_MAPPER
select QCOM_PDR_MSG
select AUXILIARY_BUS
depends on NET && QRTR && (ARCH_QCOM || COMPILE_TEST)
+ depends on BROKEN
default QCOM_RPROC_COMMON
help
The Protection Domain Mapper maps registered services to the domains
--
2.45.2