The following commit has been merged into the timers/urgent branch of tip:
Commit-ID: 53dac345395c0d2493cbc2f4c85fe38aef5b63f5
Gitweb: https://git.kernel.org/tip/53dac345395c0d2493cbc2f4c85fe38aef5b63f5
Author: Frederic Weisbecker <frederic(a)kernel.org>
AuthorDate: Sat, 18 Jan 2025 00:24:33 +01:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Thu, 23 Jan 2025 20:06:35 +01:00
hrtimers: Force migrate away hrtimers queued after CPUHP_AP_HRTIMERS_DYING
hrtimers are migrated away from the dying CPU to any online target at
the CPUHP_AP_HRTIMERS_DYING stage in order not to delay bandwidth timers
handling tasks involved in the CPU hotplug forward progress.
However wakeups can still be performed by the outgoing CPU after
CPUHP_AP_HRTIMERS_DYING. Those can result again in bandwidth timers being
armed. Depending on several considerations (crystal ball power management
based election, earliest timer already enqueued, timer migration enabled or
not), the target may eventually be the current CPU even if offline. If that
happens, the timer is eventually ignored.
The most notable example is RCU which had to deal with each and every of
those wake-ups by deferring them to an online CPU, along with related
workarounds:
_ e787644caf76 (rcu: Defer RCU kthreads wakeup when CPU is dying)
_ 9139f93209d1 (rcu/nocb: Fix RT throttling hrtimer armed from offline CPU)
_ f7345ccc62a4 (rcu/nocb: Fix rcuog wake-up from offline softirq)
The problem isn't confined to RCU though as the stop machine kthread
(which runs CPUHP_AP_HRTIMERS_DYING) reports its completion at the end
of its work through cpu_stop_signal_done() and performs a wake up that
eventually arms the deadline server timer:
WARNING: CPU: 94 PID: 588 at kernel/time/hrtimer.c:1086 hrtimer_start_range_ns+0x289/0x2d0
CPU: 94 UID: 0 PID: 588 Comm: migration/94 Not tainted
Stopper: multi_cpu_stop+0x0/0x120 <- stop_machine_cpuslocked+0x66/0xc0
RIP: 0010:hrtimer_start_range_ns+0x289/0x2d0
Call Trace:
<TASK>
start_dl_timer
enqueue_dl_entity
dl_server_start
enqueue_task_fair
enqueue_task
ttwu_do_activate
try_to_wake_up
complete
cpu_stopper_thread
Instead of providing yet another bandaid to work around the situation, fix
it in the hrtimers infrastructure instead: always migrate away a timer to
an online target whenever it is enqueued from an offline CPU.
This will also allow to revert all the above RCU disgraceful hacks.
Fixes: 5c0930ccaad5 ("hrtimers: Push pending hrtimers away from outgoing CPU earlier")
Reported-by: Vlad Poenaru <vlad.wing(a)gmail.com>
Reported-by: Usama Arif <usamaarif642(a)gmail.com>
Signed-off-by: Frederic Weisbecker <frederic(a)kernel.org>
Signed-off-by: Paul E. McKenney <paulmck(a)kernel.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Tested-by: Paul E. McKenney <paulmck(a)kernel.org>
Link: https://lore.kernel.org/all/20250117232433.24027-1-frederic@kernel.org
Closes: 20241213203739.1519801-1-usamaarif642(a)gmail.com
---
include/linux/hrtimer_defs.h | 1 +-
kernel/time/hrtimer.c | 103 +++++++++++++++++++++++++++-------
2 files changed, 83 insertions(+), 21 deletions(-)
diff --git a/include/linux/hrtimer_defs.h b/include/linux/hrtimer_defs.h
index c3b4b7e..84a5045 100644
--- a/include/linux/hrtimer_defs.h
+++ b/include/linux/hrtimer_defs.h
@@ -125,6 +125,7 @@ struct hrtimer_cpu_base {
ktime_t softirq_expires_next;
struct hrtimer *softirq_next_timer;
struct hrtimer_clock_base clock_base[HRTIMER_MAX_CLOCK_BASES];
+ call_single_data_t csd;
} ____cacheline_aligned;
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 4fb81f8..deb1aa3 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -58,6 +58,8 @@
#define HRTIMER_ACTIVE_SOFT (HRTIMER_ACTIVE_HARD << MASK_SHIFT)
#define HRTIMER_ACTIVE_ALL (HRTIMER_ACTIVE_SOFT | HRTIMER_ACTIVE_HARD)
+static void retrigger_next_event(void *arg);
+
/*
* The timer bases:
*
@@ -111,7 +113,8 @@ DEFINE_PER_CPU(struct hrtimer_cpu_base, hrtimer_bases) =
.clockid = CLOCK_TAI,
.get_time = &ktime_get_clocktai,
},
- }
+ },
+ .csd = CSD_INIT(retrigger_next_event, NULL)
};
static const int hrtimer_clock_to_base_table[MAX_CLOCKS] = {
@@ -124,6 +127,14 @@ static const int hrtimer_clock_to_base_table[MAX_CLOCKS] = {
[CLOCK_TAI] = HRTIMER_BASE_TAI,
};
+static inline bool hrtimer_base_is_online(struct hrtimer_cpu_base *base)
+{
+ if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
+ return true;
+ else
+ return likely(base->online);
+}
+
/*
* Functions and macros which are different for UP/SMP systems are kept in a
* single place
@@ -178,27 +189,54 @@ struct hrtimer_clock_base *lock_hrtimer_base(const struct hrtimer *timer,
}
/*
- * We do not migrate the timer when it is expiring before the next
- * event on the target cpu. When high resolution is enabled, we cannot
- * reprogram the target cpu hardware and we would cause it to fire
- * late. To keep it simple, we handle the high resolution enabled and
- * disabled case similar.
+ * Check if the elected target is suitable considering its next
+ * event and the hotplug state of the current CPU.
+ *
+ * If the elected target is remote and its next event is after the timer
+ * to queue, then a remote reprogram is necessary. However there is no
+ * guarantee the IPI handling the operation would arrive in time to meet
+ * the high resolution deadline. In this case the local CPU becomes a
+ * preferred target, unless it is offline.
+ *
+ * High and low resolution modes are handled the same way for simplicity.
*
* Called with cpu_base->lock of target cpu held.
*/
-static int
-hrtimer_check_target(struct hrtimer *timer, struct hrtimer_clock_base *new_base)
+static bool hrtimer_suitable_target(struct hrtimer *timer, struct hrtimer_clock_base *new_base,
+ struct hrtimer_cpu_base *new_cpu_base,
+ struct hrtimer_cpu_base *this_cpu_base)
{
ktime_t expires;
+ /*
+ * The local CPU clockevent can be reprogrammed. Also get_target_base()
+ * guarantees it is online.
+ */
+ if (new_cpu_base == this_cpu_base)
+ return true;
+
+ /*
+ * The offline local CPU can't be the default target if the
+ * next remote target event is after this timer. Keep the
+ * elected new base. An IPI will we issued to reprogram
+ * it as a last resort.
+ */
+ if (!hrtimer_base_is_online(this_cpu_base))
+ return true;
+
expires = ktime_sub(hrtimer_get_expires(timer), new_base->offset);
- return expires < new_base->cpu_base->expires_next;
+
+ return expires >= new_base->cpu_base->expires_next;
}
-static inline
-struct hrtimer_cpu_base *get_target_base(struct hrtimer_cpu_base *base,
- int pinned)
+static inline struct hrtimer_cpu_base *get_target_base(struct hrtimer_cpu_base *base, int pinned)
{
+ if (!hrtimer_base_is_online(base)) {
+ int cpu = cpumask_any_and(cpu_online_mask, housekeeping_cpumask(HK_TYPE_TIMER));
+
+ return &per_cpu(hrtimer_bases, cpu);
+ }
+
#if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON)
if (static_branch_likely(&timers_migration_enabled) && !pinned)
return &per_cpu(hrtimer_bases, get_nohz_timer_target());
@@ -249,8 +287,8 @@ again:
raw_spin_unlock(&base->cpu_base->lock);
raw_spin_lock(&new_base->cpu_base->lock);
- if (new_cpu_base != this_cpu_base &&
- hrtimer_check_target(timer, new_base)) {
+ if (!hrtimer_suitable_target(timer, new_base, new_cpu_base,
+ this_cpu_base)) {
raw_spin_unlock(&new_base->cpu_base->lock);
raw_spin_lock(&base->cpu_base->lock);
new_cpu_base = this_cpu_base;
@@ -259,8 +297,7 @@ again:
}
WRITE_ONCE(timer->base, new_base);
} else {
- if (new_cpu_base != this_cpu_base &&
- hrtimer_check_target(timer, new_base)) {
+ if (!hrtimer_suitable_target(timer, new_base, new_cpu_base, this_cpu_base)) {
new_cpu_base = this_cpu_base;
goto again;
}
@@ -706,8 +743,6 @@ static inline int hrtimer_is_hres_enabled(void)
return hrtimer_hres_enabled;
}
-static void retrigger_next_event(void *arg);
-
/*
* Switch to high resolution mode
*/
@@ -1195,6 +1230,7 @@ static int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
u64 delta_ns, const enum hrtimer_mode mode,
struct hrtimer_clock_base *base)
{
+ struct hrtimer_cpu_base *this_cpu_base = this_cpu_ptr(&hrtimer_bases);
struct hrtimer_clock_base *new_base;
bool force_local, first;
@@ -1206,10 +1242,16 @@ static int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
* and enforce reprogramming after it is queued no matter whether
* it is the new first expiring timer again or not.
*/
- force_local = base->cpu_base == this_cpu_ptr(&hrtimer_bases);
+ force_local = base->cpu_base == this_cpu_base;
force_local &= base->cpu_base->next_timer == timer;
/*
+ * Don't force local queuing if this enqueue happens on a unplugged
+ * CPU after hrtimer_cpu_dying() has been invoked.
+ */
+ force_local &= this_cpu_base->online;
+
+ /*
* Remove an active timer from the queue. In case it is not queued
* on the current CPU, make sure that remove_hrtimer() updates the
* remote data correctly.
@@ -1238,8 +1280,27 @@ static int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
}
first = enqueue_hrtimer(timer, new_base, mode);
- if (!force_local)
- return first;
+ if (!force_local) {
+ /*
+ * If the current CPU base is online, then the timer is
+ * never queued on a remote CPU if it would be the first
+ * expiring timer there.
+ */
+ if (hrtimer_base_is_online(this_cpu_base))
+ return first;
+
+ /*
+ * Timer was enqueued remote because the current base is
+ * already offline. If the timer is the first to expire,
+ * kick the remote CPU to reprogram the clock event.
+ */
+ if (first) {
+ struct hrtimer_cpu_base *new_cpu_base = new_base->cpu_base;
+
+ smp_call_function_single_async(new_cpu_base->cpu, &new_cpu_base->csd);
+ }
+ return 0;
+ }
/*
* Timer was forced to stay on the current CPU to avoid
Recently, a few issues linked to MPTCP have been reported by syzbot. All
the remaining ones are addressed in this series.
- Patch 1: Address "KMSAN: uninit-value in mptcp_incoming_options (2)".
A fix for v5.11.
- Patch 2: Address "WARNING in mptcp_pm_nl_set_flags (2)". A fix for
v5.18.
- Patch 3: Address "WARNING in __mptcp_clean_una (2)". A fix for v6.4,
backported up to v6.1.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Matthieu Baerts (NGI0) (1):
mptcp: pm: only set fullmesh for subflow endp
Paolo Abeni (2):
mptcp: consolidate suboption status
mptcp: handle fastopen disconnect correctly
net/mptcp/options.c | 13 +++++--------
net/mptcp/pm_netlink.c | 3 ++-
net/mptcp/protocol.c | 4 +++-
net/mptcp/protocol.h | 30 ++++++++++++++++--------------
4 files changed, 26 insertions(+), 24 deletions(-)
---
base-commit: 15a901361ec3fb1c393f91880e1cbf24ec0a88bd
change-id: 20250123-net-mptcp-syzbot-issues-ffdbbb9b85d7
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
In the US country code, to avoid including 6 GHz rules in the 5 GHz rules
list, the number of 5 GHz rules is set to a default constant value of 4
(REG_US_5G_NUM_REG_RULES). However, if there are more than 4 valid 5 GHz
rules, the current logic will bypass the legitimate 6 GHz rules.
For example, if there are 5 valid 5 GHz rules and 1 valid 6 GHz rule, the
current logic will only consider 4 of the 5 GHz rules, treating the last
valid rule as a 6 GHz rule. Consequently, the actual 6 GHz rule is never
processed, leading to the eventual disabling of 6 GHz channels.
To fix this issue, instead of hardcoding the value to 4, use a helper
function to determine the number of 6 GHz rules present in the 5 GHz rules
list and ignore only those rules.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1
Cc: stable(a)vger.kernel.org
Fixes: d889913205cf ("wifi: ath12k: driver for Qualcomm Wi-Fi 7 devices")
Signed-off-by: Aditya Kumar Singh <aditya.kumar.singh(a)oss.qualcomm.com>
---
drivers/net/wireless/ath/ath12k/wmi.c | 61 ++++++++++++++++++++++++++---------
drivers/net/wireless/ath/ath12k/wmi.h | 1 -
2 files changed, 45 insertions(+), 17 deletions(-)
diff --git a/drivers/net/wireless/ath/ath12k/wmi.c b/drivers/net/wireless/ath/ath12k/wmi.c
index a7625a3d1a3a9b1bbcf90cee30d3c707de03c439..296238c7ee0fd95045a3e0859d730e3bd73ec906 100644
--- a/drivers/net/wireless/ath/ath12k/wmi.c
+++ b/drivers/net/wireless/ath/ath12k/wmi.c
@@ -4852,6 +4852,22 @@ static struct ath12k_reg_rule
return reg_rule_ptr;
}
+static u8 ath12k_wmi_ignore_num_extra_rules(struct ath12k_wmi_reg_rule_ext_params *rule,
+ u32 num_reg_rules)
+{
+ u8 num_invalid_5ghz_rules = 0;
+ u32 count, start_freq;
+
+ for (count = 0; count < num_reg_rules; count++) {
+ start_freq = le32_get_bits(rule[count].freq_info, REG_RULE_START_FREQ);
+
+ if (start_freq >= ATH12K_MIN_6G_FREQ)
+ num_invalid_5ghz_rules++;
+ }
+
+ return num_invalid_5ghz_rules;
+}
+
static int ath12k_pull_reg_chan_list_ext_update_ev(struct ath12k_base *ab,
struct sk_buff *skb,
struct ath12k_reg_info *reg_info)
@@ -4862,6 +4878,7 @@ static int ath12k_pull_reg_chan_list_ext_update_ev(struct ath12k_base *ab,
u32 num_2g_reg_rules, num_5g_reg_rules;
u32 num_6g_reg_rules_ap[WMI_REG_CURRENT_MAX_AP_TYPE];
u32 num_6g_reg_rules_cl[WMI_REG_CURRENT_MAX_AP_TYPE][WMI_REG_MAX_CLIENT_TYPE];
+ u8 num_invalid_5ghz_ext_rules;
u32 total_reg_rules = 0;
int ret, i, j;
@@ -4955,20 +4972,6 @@ static int ath12k_pull_reg_chan_list_ext_update_ev(struct ath12k_base *ab,
memcpy(reg_info->alpha2, &ev->alpha2, REG_ALPHA2_LEN);
- /* FIXME: Currently FW includes 6G reg rule also in 5G rule
- * list for country US.
- * Having same 6G reg rule in 5G and 6G rules list causes
- * intersect check to be true, and same rules will be shown
- * multiple times in iw cmd. So added hack below to avoid
- * parsing 6G rule from 5G reg rule list, and this can be
- * removed later, after FW updates to remove 6G reg rule
- * from 5G rules list.
- */
- if (memcmp(reg_info->alpha2, "US", 2) == 0) {
- reg_info->num_5g_reg_rules = REG_US_5G_NUM_REG_RULES;
- num_5g_reg_rules = reg_info->num_5g_reg_rules;
- }
-
reg_info->dfs_region = le32_to_cpu(ev->dfs_region);
reg_info->phybitmap = le32_to_cpu(ev->phybitmap);
reg_info->num_phy = le32_to_cpu(ev->num_phy);
@@ -5071,8 +5074,29 @@ static int ath12k_pull_reg_chan_list_ext_update_ev(struct ath12k_base *ab,
}
}
+ ext_wmi_reg_rule += num_2g_reg_rules;
+
+ /* Firmware might include 6 GHz reg rule in 5 GHz rule list
+ * for few countries along with separate 6 GHz rule.
+ * Having same 6 GHz reg rule in 5 GHz and 6 GHz rules list
+ * causes intersect check to be true, and same rules will be
+ * shown multiple times in iw cmd.
+ * Hence, avoid parsing 6 GHz rule from 5 GHz reg rule list
+ */
+ num_invalid_5ghz_ext_rules = ath12k_wmi_ignore_num_extra_rules(ext_wmi_reg_rule,
+ num_5g_reg_rules);
+
+ if (num_invalid_5ghz_ext_rules) {
+ ath12k_dbg(ab, ATH12K_DBG_WMI,
+ "CC: %s 5 GHz reg rules number %d from fw, %d number of invalid 5 GHz rules",
+ reg_info->alpha2, reg_info->num_5g_reg_rules,
+ num_invalid_5ghz_ext_rules);
+
+ num_5g_reg_rules = num_5g_reg_rules - num_invalid_5ghz_ext_rules;
+ reg_info->num_5g_reg_rules = num_5g_reg_rules;
+ }
+
if (num_5g_reg_rules) {
- ext_wmi_reg_rule += num_2g_reg_rules;
reg_info->reg_rules_5g_ptr =
create_ext_reg_rules_from_wmi(num_5g_reg_rules,
ext_wmi_reg_rule);
@@ -5084,7 +5108,12 @@ static int ath12k_pull_reg_chan_list_ext_update_ev(struct ath12k_base *ab,
}
}
- ext_wmi_reg_rule += num_5g_reg_rules;
+ /* We have adjusted the number of 5 GHz reg rules above. But still those
+ * many rules needs to be adjusted in ext_wmi_reg_rule.
+ *
+ * NOTE: num_invalid_5ghz_ext_rules will be 0 for rest other cases.
+ */
+ ext_wmi_reg_rule += (num_5g_reg_rules + num_invalid_5ghz_ext_rules);
for (i = 0; i < WMI_REG_CURRENT_MAX_AP_TYPE; i++) {
reg_info->reg_rules_6g_ap_ptr[i] =
diff --git a/drivers/net/wireless/ath/ath12k/wmi.h b/drivers/net/wireless/ath/ath12k/wmi.h
index 38aed18b99f4239e40d369181549b4bc78faa862..81248253a36872f6fd93db772b5394f3ef9d3bf9 100644
--- a/drivers/net/wireless/ath/ath12k/wmi.h
+++ b/drivers/net/wireless/ath/ath12k/wmi.h
@@ -4093,7 +4093,6 @@ struct ath12k_wmi_eht_rate_set_params {
#define MAX_REG_RULES 10
#define REG_ALPHA2_LEN 2
#define MAX_6G_REG_RULES 5
-#define REG_US_5G_NUM_REG_RULES 4
enum wmi_start_event_param {
WMI_VDEV_START_RESP_EVENT = 0,
---
base-commit: c4e4e37afc8b8d178b0f00f607c30b3b81253597
change-id: 20250123-fix_6ghz_rules_handling-ff85f1efb30b
Having the exec queue snapshot inside a "GuC CT" section was always
wrong. Commit c28fd6c358db ("drm/xe/devcoredump: Improve section
headings and add tile info") tried to fix that bug, but with that also
broke the mesa tool that parses the devcoredump, hence it was reverted
in commit 70fb86a85dc9 ("drm/xe: Revert some changes that break a mesa
debug tool").
With the mesa tool also fixed, this can propagate as a fix on both
kernel and userspace side to avoid unnecessary headache for a debug
feature.
Cc: John Harrison <John.C.Harrison(a)Intel.com>
Cc: Julia Filipchuk <julia.filipchuk(a)intel.com>
Cc: José Roberto de Souza <jose.souza(a)intel.com>
Cc: stable(a)vger.kernel.org
Fixes: 70fb86a85dc9 ("drm/xe: Revert some changes that break a mesa debug tool")
Signed-off-by: Lucas De Marchi <lucas.demarchi(a)intel.com>
---
drivers/gpu/drm/xe/xe_devcoredump.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c
index 81dc7795c0651..a7946a76777e7 100644
--- a/drivers/gpu/drm/xe/xe_devcoredump.c
+++ b/drivers/gpu/drm/xe/xe_devcoredump.c
@@ -119,11 +119,7 @@ static ssize_t __xe_devcoredump_read(char *buffer, size_t count,
drm_puts(&p, "\n**** GuC CT ****\n");
xe_guc_ct_snapshot_print(ss->guc.ct, &p);
- /*
- * Don't add a new section header here because the mesa debug decoder
- * tool expects the context information to be in the 'GuC CT' section.
- */
- /* drm_puts(&p, "\n**** Contexts ****\n"); */
+ drm_puts(&p, "\n**** Contexts ****\n");
xe_guc_exec_queue_snapshot_print(ss->ge, &p);
drm_puts(&p, "\n**** Job ****\n");
--
2.48.0