The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Thanks,
Sasha
------------------ original commit in Linus's tree ------------------
>From fdd92b64d15bc4aec973caa25899afd782402e68 Mon Sep 17 00:00:00 2001
From: Jeff Layton <jlayton(a)kernel.org>
Date: Fri, 20 Aug 2021 09:29:50 -0400
Subject: [PATCH] fs: warn about impending deprecation of mandatory locks
We've had CONFIG_MANDATORY_FILE_LOCKING since 2015 and a lot of distros
have disabled it. Warn the stragglers that still use "-o mand" that
we'll be dropping support for that mount option.
Cc: stable(a)vger.kernel.org
Signed-off-by: Jeff Layton <jlayton(a)kernel.org>
---
fs/namespace.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/fs/namespace.c b/fs/namespace.c
index ab4174a3c802..2279473d0d6f 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1716,8 +1716,12 @@ static inline bool may_mount(void)
}
#ifdef CONFIG_MANDATORY_FILE_LOCKING
-static inline bool may_mandlock(void)
+static bool may_mandlock(void)
{
+ pr_warn_once("======================================================\n"
+ "WARNING: the mand mount option is being deprecated and\n"
+ " will be removed in v5.15!\n"
+ "======================================================\n");
return capable(CAP_SYS_ADMIN);
}
#else
--
2.30.2
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Thanks,
Sasha
------------------ original commit in Linus's tree ------------------
>From e0bff43220925b7e527f9d3bc9f5c624177c959e Mon Sep 17 00:00:00 2001
From: Marcin Bachry <hegel666(a)gmail.com>
Date: Wed, 21 Jul 2021 22:58:58 -0400
Subject: [PATCH] PCI: Increase D3 delay for AMD Renoir/Cezanne XHCI
The Renoir XHCI controller apparently doesn't resume reliably with the
standard D3hot-to-D0 delay. Increase it to 20ms.
[Alex: I talked to the AMD USB hardware team and the AMD Windows team and
they are not aware of any HW errata or specific issues. The HW works fine
in Windows. I was told Windows uses a rather generous default delay of
100ms for PCI state transitions.]
Link: https://lore.kernel.org/r/20210722025858.220064-1-alexander.deucher@amd.com
Signed-off-by: Marcin Bachry <hegel666(a)gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Cc: stable(a)vger.kernel.org
Cc: Mario Limonciello <mario.limonciello(a)amd.com>
Cc: Prike Liang <prike.liang(a)amd.com>
Cc: Shyam Sundar S K <shyam-sundar.s-k(a)amd.com>
---
drivers/pci/quirks.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 6d74386eadc2..ab3de1551b50 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -1900,6 +1900,7 @@ static void quirk_ryzen_xhci_d3hot(struct pci_dev *dev)
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x15e0, quirk_ryzen_xhci_d3hot);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x15e1, quirk_ryzen_xhci_d3hot);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x1639, quirk_ryzen_xhci_d3hot);
#ifdef CONFIG_X86_IO_APIC
static int dmi_disable_ioapicreroute(const struct dmi_system_id *d)
--
2.30.2
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Thanks,
Sasha
------------------ original commit in Linus's tree ------------------
>From 6c34df6f350df9579ce99d887a2b5fa14cc13b32 Mon Sep 17 00:00:00 2001
From: Pingfan Liu <kernelfans(a)gmail.com>
Date: Sat, 14 Aug 2021 11:45:38 +0800
Subject: [PATCH] tracing: Apply trace filters on all output channels
The event filters are not applied on all of the output, which results in
the flood of printk when using tp_printk. Unfolding
event_trigger_unlock_commit_regs() into trace_event_buffer_commit(), so
the filters can be applied on every output.
Link: https://lkml.kernel.org/r/20210814034538.8428-1-kernelfans@gmail.com
Cc: stable(a)vger.kernel.org
Fixes: 0daa2302968c1 ("tracing: Add tp_printk cmdline to have tracepoints go to printk()")
Signed-off-by: Pingfan Liu <kernelfans(a)gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
---
kernel/trace/trace.c | 18 +++++++++++++++---
kernel/trace/trace.h | 32 --------------------------------
2 files changed, 15 insertions(+), 35 deletions(-)
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 33899a71fdc1..a1adb29ef5c1 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2897,14 +2897,26 @@ int tracepoint_printk_sysctl(struct ctl_table *table, int write,
void trace_event_buffer_commit(struct trace_event_buffer *fbuffer)
{
+ enum event_trigger_type tt = ETT_NONE;
+ struct trace_event_file *file = fbuffer->trace_file;
+
+ if (__event_trigger_test_discard(file, fbuffer->buffer, fbuffer->event,
+ fbuffer->entry, &tt))
+ goto discard;
+
if (static_key_false(&tracepoint_printk_key.key))
output_printk(fbuffer);
if (static_branch_unlikely(&trace_event_exports_enabled))
ftrace_exports(fbuffer->event, TRACE_EXPORT_EVENT);
- event_trigger_unlock_commit_regs(fbuffer->trace_file, fbuffer->buffer,
- fbuffer->event, fbuffer->entry,
- fbuffer->trace_ctx, fbuffer->regs);
+
+ trace_buffer_unlock_commit_regs(file->tr, fbuffer->buffer,
+ fbuffer->event, fbuffer->trace_ctx, fbuffer->regs);
+
+discard:
+ if (tt)
+ event_triggers_post_call(file, tt);
+
}
EXPORT_SYMBOL_GPL(trace_event_buffer_commit);
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index a180abf76d4e..4a0e693000c6 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -1389,38 +1389,6 @@ event_trigger_unlock_commit(struct trace_event_file *file,
event_triggers_post_call(file, tt);
}
-/**
- * event_trigger_unlock_commit_regs - handle triggers and finish event commit
- * @file: The file pointer associated with the event
- * @buffer: The ring buffer that the event is being written to
- * @event: The event meta data in the ring buffer
- * @entry: The event itself
- * @trace_ctx: The tracing context flags.
- *
- * This is a helper function to handle triggers that require data
- * from the event itself. It also tests the event against filters and
- * if the event is soft disabled and should be discarded.
- *
- * Same as event_trigger_unlock_commit() but calls
- * trace_buffer_unlock_commit_regs() instead of trace_buffer_unlock_commit().
- */
-static inline void
-event_trigger_unlock_commit_regs(struct trace_event_file *file,
- struct trace_buffer *buffer,
- struct ring_buffer_event *event,
- void *entry, unsigned int trace_ctx,
- struct pt_regs *regs)
-{
- enum event_trigger_type tt = ETT_NONE;
-
- if (!__event_trigger_test_discard(file, buffer, event, entry, &tt))
- trace_buffer_unlock_commit_regs(file->tr, buffer, event,
- trace_ctx, regs);
-
- if (tt)
- event_triggers_post_call(file, tt);
-}
-
#define FILTER_PRED_INVALID ((unsigned short)-1)
#define FILTER_PRED_IS_RIGHT (1 << 15)
#define FILTER_PRED_FOLD (1 << 15)
--
2.30.2
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Thanks,
Sasha
------------------ original commit in Linus's tree ------------------
>From da94692001ea45ffa1f5e9f17ecdef7aecd90c27 Mon Sep 17 00:00:00 2001
From: Kristin Paget <kristin(a)tombom.co.uk>
Date: Sat, 14 Aug 2021 15:46:05 -0700
Subject: [PATCH] ALSA: hda/realtek: Enable 4-speaker output for Dell XPS 15
9510 laptop
The 2021-model XPS 15 appears to use the same 4-speakers-on-ALC289 audio
setup as the Precision models, so requires the same quirk to enable woofer
output. Tested on my own 9510.
Signed-off-by: Kristin Paget <kristin(a)tombom.co.uk>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/e1fc95c5-c10a-1f98-a5c2-dd6e336157e1@tombom.co.uk
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
---
sound/pci/hda/patch_realtek.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
index a065260d0d20..96f32eaa24df 100644
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -8332,6 +8332,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
SND_PCI_QUIRK(0x1028, 0x0a2e, "Dell", ALC236_FIXUP_DELL_AIO_HEADSET_MIC),
SND_PCI_QUIRK(0x1028, 0x0a30, "Dell", ALC236_FIXUP_DELL_AIO_HEADSET_MIC),
SND_PCI_QUIRK(0x1028, 0x0a58, "Dell", ALC255_FIXUP_DELL_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1028, 0x0a61, "Dell XPS 15 9510", ALC289_FIXUP_DUAL_SPK),
SND_PCI_QUIRK(0x1028, 0x164a, "Dell", ALC293_FIXUP_DELL1_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1028, 0x164b, "Dell", ALC293_FIXUP_DELL1_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x103c, 0x1586, "HP", ALC269_FIXUP_HP_MUTE_LED_MIC2),
--
2.30.2
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Thanks,
Sasha
------------------ original commit in Linus's tree ------------------
>From e0bff43220925b7e527f9d3bc9f5c624177c959e Mon Sep 17 00:00:00 2001
From: Marcin Bachry <hegel666(a)gmail.com>
Date: Wed, 21 Jul 2021 22:58:58 -0400
Subject: [PATCH] PCI: Increase D3 delay for AMD Renoir/Cezanne XHCI
The Renoir XHCI controller apparently doesn't resume reliably with the
standard D3hot-to-D0 delay. Increase it to 20ms.
[Alex: I talked to the AMD USB hardware team and the AMD Windows team and
they are not aware of any HW errata or specific issues. The HW works fine
in Windows. I was told Windows uses a rather generous default delay of
100ms for PCI state transitions.]
Link: https://lore.kernel.org/r/20210722025858.220064-1-alexander.deucher@amd.com
Signed-off-by: Marcin Bachry <hegel666(a)gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Cc: stable(a)vger.kernel.org
Cc: Mario Limonciello <mario.limonciello(a)amd.com>
Cc: Prike Liang <prike.liang(a)amd.com>
Cc: Shyam Sundar S K <shyam-sundar.s-k(a)amd.com>
---
drivers/pci/quirks.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 6d74386eadc2..ab3de1551b50 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -1900,6 +1900,7 @@ static void quirk_ryzen_xhci_d3hot(struct pci_dev *dev)
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x15e0, quirk_ryzen_xhci_d3hot);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x15e1, quirk_ryzen_xhci_d3hot);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x1639, quirk_ryzen_xhci_d3hot);
#ifdef CONFIG_X86_IO_APIC
static int dmi_disable_ioapicreroute(const struct dmi_system_id *d)
--
2.30.2
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Thanks,
Sasha
------------------ original commit in Linus's tree ------------------
>From 6c34df6f350df9579ce99d887a2b5fa14cc13b32 Mon Sep 17 00:00:00 2001
From: Pingfan Liu <kernelfans(a)gmail.com>
Date: Sat, 14 Aug 2021 11:45:38 +0800
Subject: [PATCH] tracing: Apply trace filters on all output channels
The event filters are not applied on all of the output, which results in
the flood of printk when using tp_printk. Unfolding
event_trigger_unlock_commit_regs() into trace_event_buffer_commit(), so
the filters can be applied on every output.
Link: https://lkml.kernel.org/r/20210814034538.8428-1-kernelfans@gmail.com
Cc: stable(a)vger.kernel.org
Fixes: 0daa2302968c1 ("tracing: Add tp_printk cmdline to have tracepoints go to printk()")
Signed-off-by: Pingfan Liu <kernelfans(a)gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
---
kernel/trace/trace.c | 18 +++++++++++++++---
kernel/trace/trace.h | 32 --------------------------------
2 files changed, 15 insertions(+), 35 deletions(-)
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 33899a71fdc1..a1adb29ef5c1 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2897,14 +2897,26 @@ int tracepoint_printk_sysctl(struct ctl_table *table, int write,
void trace_event_buffer_commit(struct trace_event_buffer *fbuffer)
{
+ enum event_trigger_type tt = ETT_NONE;
+ struct trace_event_file *file = fbuffer->trace_file;
+
+ if (__event_trigger_test_discard(file, fbuffer->buffer, fbuffer->event,
+ fbuffer->entry, &tt))
+ goto discard;
+
if (static_key_false(&tracepoint_printk_key.key))
output_printk(fbuffer);
if (static_branch_unlikely(&trace_event_exports_enabled))
ftrace_exports(fbuffer->event, TRACE_EXPORT_EVENT);
- event_trigger_unlock_commit_regs(fbuffer->trace_file, fbuffer->buffer,
- fbuffer->event, fbuffer->entry,
- fbuffer->trace_ctx, fbuffer->regs);
+
+ trace_buffer_unlock_commit_regs(file->tr, fbuffer->buffer,
+ fbuffer->event, fbuffer->trace_ctx, fbuffer->regs);
+
+discard:
+ if (tt)
+ event_triggers_post_call(file, tt);
+
}
EXPORT_SYMBOL_GPL(trace_event_buffer_commit);
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index a180abf76d4e..4a0e693000c6 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -1389,38 +1389,6 @@ event_trigger_unlock_commit(struct trace_event_file *file,
event_triggers_post_call(file, tt);
}
-/**
- * event_trigger_unlock_commit_regs - handle triggers and finish event commit
- * @file: The file pointer associated with the event
- * @buffer: The ring buffer that the event is being written to
- * @event: The event meta data in the ring buffer
- * @entry: The event itself
- * @trace_ctx: The tracing context flags.
- *
- * This is a helper function to handle triggers that require data
- * from the event itself. It also tests the event against filters and
- * if the event is soft disabled and should be discarded.
- *
- * Same as event_trigger_unlock_commit() but calls
- * trace_buffer_unlock_commit_regs() instead of trace_buffer_unlock_commit().
- */
-static inline void
-event_trigger_unlock_commit_regs(struct trace_event_file *file,
- struct trace_buffer *buffer,
- struct ring_buffer_event *event,
- void *entry, unsigned int trace_ctx,
- struct pt_regs *regs)
-{
- enum event_trigger_type tt = ETT_NONE;
-
- if (!__event_trigger_test_discard(file, buffer, event, entry, &tt))
- trace_buffer_unlock_commit_regs(file->tr, buffer, event,
- trace_ctx, regs);
-
- if (tt)
- event_triggers_post_call(file, tt);
-}
-
#define FILTER_PRED_INVALID ((unsigned short)-1)
#define FILTER_PRED_IS_RIGHT (1 << 15)
#define FILTER_PRED_FOLD (1 << 15)
--
2.30.2
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Thanks,
Sasha
------------------ original commit in Linus's tree ------------------
>From da94692001ea45ffa1f5e9f17ecdef7aecd90c27 Mon Sep 17 00:00:00 2001
From: Kristin Paget <kristin(a)tombom.co.uk>
Date: Sat, 14 Aug 2021 15:46:05 -0700
Subject: [PATCH] ALSA: hda/realtek: Enable 4-speaker output for Dell XPS 15
9510 laptop
The 2021-model XPS 15 appears to use the same 4-speakers-on-ALC289 audio
setup as the Precision models, so requires the same quirk to enable woofer
output. Tested on my own 9510.
Signed-off-by: Kristin Paget <kristin(a)tombom.co.uk>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/e1fc95c5-c10a-1f98-a5c2-dd6e336157e1@tombom.co.uk
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
---
sound/pci/hda/patch_realtek.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
index a065260d0d20..96f32eaa24df 100644
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -8332,6 +8332,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
SND_PCI_QUIRK(0x1028, 0x0a2e, "Dell", ALC236_FIXUP_DELL_AIO_HEADSET_MIC),
SND_PCI_QUIRK(0x1028, 0x0a30, "Dell", ALC236_FIXUP_DELL_AIO_HEADSET_MIC),
SND_PCI_QUIRK(0x1028, 0x0a58, "Dell", ALC255_FIXUP_DELL_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1028, 0x0a61, "Dell XPS 15 9510", ALC289_FIXUP_DUAL_SPK),
SND_PCI_QUIRK(0x1028, 0x164a, "Dell", ALC293_FIXUP_DELL1_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1028, 0x164b, "Dell", ALC293_FIXUP_DELL1_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x103c, 0x1586, "HP", ALC269_FIXUP_HP_MUTE_LED_MIC2),
--
2.30.2
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Thanks,
Sasha
------------------ original commit in Linus's tree ------------------
>From e0bff43220925b7e527f9d3bc9f5c624177c959e Mon Sep 17 00:00:00 2001
From: Marcin Bachry <hegel666(a)gmail.com>
Date: Wed, 21 Jul 2021 22:58:58 -0400
Subject: [PATCH] PCI: Increase D3 delay for AMD Renoir/Cezanne XHCI
The Renoir XHCI controller apparently doesn't resume reliably with the
standard D3hot-to-D0 delay. Increase it to 20ms.
[Alex: I talked to the AMD USB hardware team and the AMD Windows team and
they are not aware of any HW errata or specific issues. The HW works fine
in Windows. I was told Windows uses a rather generous default delay of
100ms for PCI state transitions.]
Link: https://lore.kernel.org/r/20210722025858.220064-1-alexander.deucher@amd.com
Signed-off-by: Marcin Bachry <hegel666(a)gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Cc: stable(a)vger.kernel.org
Cc: Mario Limonciello <mario.limonciello(a)amd.com>
Cc: Prike Liang <prike.liang(a)amd.com>
Cc: Shyam Sundar S K <shyam-sundar.s-k(a)amd.com>
---
drivers/pci/quirks.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 6d74386eadc2..ab3de1551b50 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -1900,6 +1900,7 @@ static void quirk_ryzen_xhci_d3hot(struct pci_dev *dev)
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x15e0, quirk_ryzen_xhci_d3hot);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x15e1, quirk_ryzen_xhci_d3hot);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x1639, quirk_ryzen_xhci_d3hot);
#ifdef CONFIG_X86_IO_APIC
static int dmi_disable_ioapicreroute(const struct dmi_system_id *d)
--
2.30.2
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Thanks,
Sasha
------------------ original commit in Linus's tree ------------------
>From 6c34df6f350df9579ce99d887a2b5fa14cc13b32 Mon Sep 17 00:00:00 2001
From: Pingfan Liu <kernelfans(a)gmail.com>
Date: Sat, 14 Aug 2021 11:45:38 +0800
Subject: [PATCH] tracing: Apply trace filters on all output channels
The event filters are not applied on all of the output, which results in
the flood of printk when using tp_printk. Unfolding
event_trigger_unlock_commit_regs() into trace_event_buffer_commit(), so
the filters can be applied on every output.
Link: https://lkml.kernel.org/r/20210814034538.8428-1-kernelfans@gmail.com
Cc: stable(a)vger.kernel.org
Fixes: 0daa2302968c1 ("tracing: Add tp_printk cmdline to have tracepoints go to printk()")
Signed-off-by: Pingfan Liu <kernelfans(a)gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
---
kernel/trace/trace.c | 18 +++++++++++++++---
kernel/trace/trace.h | 32 --------------------------------
2 files changed, 15 insertions(+), 35 deletions(-)
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 33899a71fdc1..a1adb29ef5c1 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2897,14 +2897,26 @@ int tracepoint_printk_sysctl(struct ctl_table *table, int write,
void trace_event_buffer_commit(struct trace_event_buffer *fbuffer)
{
+ enum event_trigger_type tt = ETT_NONE;
+ struct trace_event_file *file = fbuffer->trace_file;
+
+ if (__event_trigger_test_discard(file, fbuffer->buffer, fbuffer->event,
+ fbuffer->entry, &tt))
+ goto discard;
+
if (static_key_false(&tracepoint_printk_key.key))
output_printk(fbuffer);
if (static_branch_unlikely(&trace_event_exports_enabled))
ftrace_exports(fbuffer->event, TRACE_EXPORT_EVENT);
- event_trigger_unlock_commit_regs(fbuffer->trace_file, fbuffer->buffer,
- fbuffer->event, fbuffer->entry,
- fbuffer->trace_ctx, fbuffer->regs);
+
+ trace_buffer_unlock_commit_regs(file->tr, fbuffer->buffer,
+ fbuffer->event, fbuffer->trace_ctx, fbuffer->regs);
+
+discard:
+ if (tt)
+ event_triggers_post_call(file, tt);
+
}
EXPORT_SYMBOL_GPL(trace_event_buffer_commit);
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index a180abf76d4e..4a0e693000c6 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -1389,38 +1389,6 @@ event_trigger_unlock_commit(struct trace_event_file *file,
event_triggers_post_call(file, tt);
}
-/**
- * event_trigger_unlock_commit_regs - handle triggers and finish event commit
- * @file: The file pointer associated with the event
- * @buffer: The ring buffer that the event is being written to
- * @event: The event meta data in the ring buffer
- * @entry: The event itself
- * @trace_ctx: The tracing context flags.
- *
- * This is a helper function to handle triggers that require data
- * from the event itself. It also tests the event against filters and
- * if the event is soft disabled and should be discarded.
- *
- * Same as event_trigger_unlock_commit() but calls
- * trace_buffer_unlock_commit_regs() instead of trace_buffer_unlock_commit().
- */
-static inline void
-event_trigger_unlock_commit_regs(struct trace_event_file *file,
- struct trace_buffer *buffer,
- struct ring_buffer_event *event,
- void *entry, unsigned int trace_ctx,
- struct pt_regs *regs)
-{
- enum event_trigger_type tt = ETT_NONE;
-
- if (!__event_trigger_test_discard(file, buffer, event, entry, &tt))
- trace_buffer_unlock_commit_regs(file->tr, buffer, event,
- trace_ctx, regs);
-
- if (tt)
- event_triggers_post_call(file, tt);
-}
-
#define FILTER_PRED_INVALID ((unsigned short)-1)
#define FILTER_PRED_IS_RIGHT (1 << 15)
#define FILTER_PRED_FOLD (1 << 15)
--
2.30.2
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Thanks,
Sasha
------------------ original commit in Linus's tree ------------------
>From da94692001ea45ffa1f5e9f17ecdef7aecd90c27 Mon Sep 17 00:00:00 2001
From: Kristin Paget <kristin(a)tombom.co.uk>
Date: Sat, 14 Aug 2021 15:46:05 -0700
Subject: [PATCH] ALSA: hda/realtek: Enable 4-speaker output for Dell XPS 15
9510 laptop
The 2021-model XPS 15 appears to use the same 4-speakers-on-ALC289 audio
setup as the Precision models, so requires the same quirk to enable woofer
output. Tested on my own 9510.
Signed-off-by: Kristin Paget <kristin(a)tombom.co.uk>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/e1fc95c5-c10a-1f98-a5c2-dd6e336157e1@tombom.co.uk
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
---
sound/pci/hda/patch_realtek.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
index a065260d0d20..96f32eaa24df 100644
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -8332,6 +8332,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
SND_PCI_QUIRK(0x1028, 0x0a2e, "Dell", ALC236_FIXUP_DELL_AIO_HEADSET_MIC),
SND_PCI_QUIRK(0x1028, 0x0a30, "Dell", ALC236_FIXUP_DELL_AIO_HEADSET_MIC),
SND_PCI_QUIRK(0x1028, 0x0a58, "Dell", ALC255_FIXUP_DELL_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1028, 0x0a61, "Dell XPS 15 9510", ALC289_FIXUP_DUAL_SPK),
SND_PCI_QUIRK(0x1028, 0x164a, "Dell", ALC293_FIXUP_DELL1_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1028, 0x164b, "Dell", ALC293_FIXUP_DELL1_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x103c, 0x1586, "HP", ALC269_FIXUP_HP_MUTE_LED_MIC2),
--
2.30.2
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Thanks,
Sasha
------------------ original commit in Linus's tree ------------------
>From 6c34df6f350df9579ce99d887a2b5fa14cc13b32 Mon Sep 17 00:00:00 2001
From: Pingfan Liu <kernelfans(a)gmail.com>
Date: Sat, 14 Aug 2021 11:45:38 +0800
Subject: [PATCH] tracing: Apply trace filters on all output channels
The event filters are not applied on all of the output, which results in
the flood of printk when using tp_printk. Unfolding
event_trigger_unlock_commit_regs() into trace_event_buffer_commit(), so
the filters can be applied on every output.
Link: https://lkml.kernel.org/r/20210814034538.8428-1-kernelfans@gmail.com
Cc: stable(a)vger.kernel.org
Fixes: 0daa2302968c1 ("tracing: Add tp_printk cmdline to have tracepoints go to printk()")
Signed-off-by: Pingfan Liu <kernelfans(a)gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
---
kernel/trace/trace.c | 18 +++++++++++++++---
kernel/trace/trace.h | 32 --------------------------------
2 files changed, 15 insertions(+), 35 deletions(-)
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 33899a71fdc1..a1adb29ef5c1 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2897,14 +2897,26 @@ int tracepoint_printk_sysctl(struct ctl_table *table, int write,
void trace_event_buffer_commit(struct trace_event_buffer *fbuffer)
{
+ enum event_trigger_type tt = ETT_NONE;
+ struct trace_event_file *file = fbuffer->trace_file;
+
+ if (__event_trigger_test_discard(file, fbuffer->buffer, fbuffer->event,
+ fbuffer->entry, &tt))
+ goto discard;
+
if (static_key_false(&tracepoint_printk_key.key))
output_printk(fbuffer);
if (static_branch_unlikely(&trace_event_exports_enabled))
ftrace_exports(fbuffer->event, TRACE_EXPORT_EVENT);
- event_trigger_unlock_commit_regs(fbuffer->trace_file, fbuffer->buffer,
- fbuffer->event, fbuffer->entry,
- fbuffer->trace_ctx, fbuffer->regs);
+
+ trace_buffer_unlock_commit_regs(file->tr, fbuffer->buffer,
+ fbuffer->event, fbuffer->trace_ctx, fbuffer->regs);
+
+discard:
+ if (tt)
+ event_triggers_post_call(file, tt);
+
}
EXPORT_SYMBOL_GPL(trace_event_buffer_commit);
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index a180abf76d4e..4a0e693000c6 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -1389,38 +1389,6 @@ event_trigger_unlock_commit(struct trace_event_file *file,
event_triggers_post_call(file, tt);
}
-/**
- * event_trigger_unlock_commit_regs - handle triggers and finish event commit
- * @file: The file pointer associated with the event
- * @buffer: The ring buffer that the event is being written to
- * @event: The event meta data in the ring buffer
- * @entry: The event itself
- * @trace_ctx: The tracing context flags.
- *
- * This is a helper function to handle triggers that require data
- * from the event itself. It also tests the event against filters and
- * if the event is soft disabled and should be discarded.
- *
- * Same as event_trigger_unlock_commit() but calls
- * trace_buffer_unlock_commit_regs() instead of trace_buffer_unlock_commit().
- */
-static inline void
-event_trigger_unlock_commit_regs(struct trace_event_file *file,
- struct trace_buffer *buffer,
- struct ring_buffer_event *event,
- void *entry, unsigned int trace_ctx,
- struct pt_regs *regs)
-{
- enum event_trigger_type tt = ETT_NONE;
-
- if (!__event_trigger_test_discard(file, buffer, event, entry, &tt))
- trace_buffer_unlock_commit_regs(file->tr, buffer, event,
- trace_ctx, regs);
-
- if (tt)
- event_triggers_post_call(file, tt);
-}
-
#define FILTER_PRED_INVALID ((unsigned short)-1)
#define FILTER_PRED_IS_RIGHT (1 << 15)
#define FILTER_PRED_FOLD (1 << 15)
--
2.30.2
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Thanks,
Sasha
------------------ original commit in Linus's tree ------------------
>From da94692001ea45ffa1f5e9f17ecdef7aecd90c27 Mon Sep 17 00:00:00 2001
From: Kristin Paget <kristin(a)tombom.co.uk>
Date: Sat, 14 Aug 2021 15:46:05 -0700
Subject: [PATCH] ALSA: hda/realtek: Enable 4-speaker output for Dell XPS 15
9510 laptop
The 2021-model XPS 15 appears to use the same 4-speakers-on-ALC289 audio
setup as the Precision models, so requires the same quirk to enable woofer
output. Tested on my own 9510.
Signed-off-by: Kristin Paget <kristin(a)tombom.co.uk>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/e1fc95c5-c10a-1f98-a5c2-dd6e336157e1@tombom.co.uk
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
---
sound/pci/hda/patch_realtek.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
index a065260d0d20..96f32eaa24df 100644
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -8332,6 +8332,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
SND_PCI_QUIRK(0x1028, 0x0a2e, "Dell", ALC236_FIXUP_DELL_AIO_HEADSET_MIC),
SND_PCI_QUIRK(0x1028, 0x0a30, "Dell", ALC236_FIXUP_DELL_AIO_HEADSET_MIC),
SND_PCI_QUIRK(0x1028, 0x0a58, "Dell", ALC255_FIXUP_DELL_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1028, 0x0a61, "Dell XPS 15 9510", ALC289_FIXUP_DUAL_SPK),
SND_PCI_QUIRK(0x1028, 0x164a, "Dell", ALC293_FIXUP_DELL1_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1028, 0x164b, "Dell", ALC293_FIXUP_DELL1_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x103c, 0x1586, "HP", ALC269_FIXUP_HP_MUTE_LED_MIC2),
--
2.30.2
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Thanks,
Sasha
------------------ original commit in Linus's tree ------------------
>From c0e38eaa8d5102c138e4f16658ea762417d42a8f Mon Sep 17 00:00:00 2001
From: Srinivas Kandagatla <srinivas.kandagatla(a)linaro.org>
Date: Mon, 9 Aug 2021 09:24:27 +0100
Subject: [PATCH] slimbus: ngd: set correct device for pm
For some reason we ended up using wrong device in some places for pm_runtime calls.
Fix this so that NGG driver can do runtime pm correctly.
Fixes: 917809e2280b ("slimbus: ngd: Add qcom SLIMBus NGD driver")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla(a)linaro.org>
Link: https://lore.kernel.org/r/20210809082428.11236-4-srinivas.kandagatla@linaro…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/slimbus/qcom-ngd-ctrl.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/drivers/slimbus/qcom-ngd-ctrl.c b/drivers/slimbus/qcom-ngd-ctrl.c
index c054e83ab636..f3ee8e036372 100644
--- a/drivers/slimbus/qcom-ngd-ctrl.c
+++ b/drivers/slimbus/qcom-ngd-ctrl.c
@@ -618,7 +618,7 @@ static void qcom_slim_ngd_rx(struct qcom_slim_ngd_ctrl *ctrl, u8 *buf)
(mc == SLIM_USR_MC_GENERIC_ACK &&
mt == SLIM_MSG_MT_SRC_REFERRED_USER)) {
slim_msg_response(&ctrl->ctrl, &buf[4], buf[3], len - 4);
- pm_runtime_mark_last_busy(ctrl->dev);
+ pm_runtime_mark_last_busy(ctrl->ctrl.dev);
}
}
@@ -1257,13 +1257,14 @@ static int qcom_slim_ngd_enable(struct qcom_slim_ngd_ctrl *ctrl, bool enable)
}
/* controller state should be in sync with framework state */
complete(&ctrl->qmi.qmi_comp);
- if (!pm_runtime_enabled(ctrl->dev) ||
- !pm_runtime_suspended(ctrl->dev))
- qcom_slim_ngd_runtime_resume(ctrl->dev);
+ if (!pm_runtime_enabled(ctrl->ctrl.dev) ||
+ !pm_runtime_suspended(ctrl->ctrl.dev))
+ qcom_slim_ngd_runtime_resume(ctrl->ctrl.dev);
else
- pm_runtime_resume(ctrl->dev);
- pm_runtime_mark_last_busy(ctrl->dev);
- pm_runtime_put(ctrl->dev);
+ pm_runtime_resume(ctrl->ctrl.dev);
+
+ pm_runtime_mark_last_busy(ctrl->ctrl.dev);
+ pm_runtime_put(ctrl->ctrl.dev);
ret = slim_register_controller(&ctrl->ctrl);
if (ret) {
@@ -1389,7 +1390,7 @@ static int qcom_slim_ngd_ssr_pdr_notify(struct qcom_slim_ngd_ctrl *ctrl,
/* Make sure the last dma xfer is finished */
mutex_lock(&ctrl->tx_lock);
if (ctrl->state != QCOM_SLIM_NGD_CTRL_DOWN) {
- pm_runtime_get_noresume(ctrl->dev);
+ pm_runtime_get_noresume(ctrl->ctrl.dev);
ctrl->state = QCOM_SLIM_NGD_CTRL_DOWN;
qcom_slim_ngd_down(ctrl);
qcom_slim_ngd_exit_dma(ctrl);
--
2.30.2
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Thanks,
Sasha
------------------ original commit in Linus's tree ------------------
>From 6c34df6f350df9579ce99d887a2b5fa14cc13b32 Mon Sep 17 00:00:00 2001
From: Pingfan Liu <kernelfans(a)gmail.com>
Date: Sat, 14 Aug 2021 11:45:38 +0800
Subject: [PATCH] tracing: Apply trace filters on all output channels
The event filters are not applied on all of the output, which results in
the flood of printk when using tp_printk. Unfolding
event_trigger_unlock_commit_regs() into trace_event_buffer_commit(), so
the filters can be applied on every output.
Link: https://lkml.kernel.org/r/20210814034538.8428-1-kernelfans@gmail.com
Cc: stable(a)vger.kernel.org
Fixes: 0daa2302968c1 ("tracing: Add tp_printk cmdline to have tracepoints go to printk()")
Signed-off-by: Pingfan Liu <kernelfans(a)gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
---
kernel/trace/trace.c | 18 +++++++++++++++---
kernel/trace/trace.h | 32 --------------------------------
2 files changed, 15 insertions(+), 35 deletions(-)
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 33899a71fdc1..a1adb29ef5c1 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2897,14 +2897,26 @@ int tracepoint_printk_sysctl(struct ctl_table *table, int write,
void trace_event_buffer_commit(struct trace_event_buffer *fbuffer)
{
+ enum event_trigger_type tt = ETT_NONE;
+ struct trace_event_file *file = fbuffer->trace_file;
+
+ if (__event_trigger_test_discard(file, fbuffer->buffer, fbuffer->event,
+ fbuffer->entry, &tt))
+ goto discard;
+
if (static_key_false(&tracepoint_printk_key.key))
output_printk(fbuffer);
if (static_branch_unlikely(&trace_event_exports_enabled))
ftrace_exports(fbuffer->event, TRACE_EXPORT_EVENT);
- event_trigger_unlock_commit_regs(fbuffer->trace_file, fbuffer->buffer,
- fbuffer->event, fbuffer->entry,
- fbuffer->trace_ctx, fbuffer->regs);
+
+ trace_buffer_unlock_commit_regs(file->tr, fbuffer->buffer,
+ fbuffer->event, fbuffer->trace_ctx, fbuffer->regs);
+
+discard:
+ if (tt)
+ event_triggers_post_call(file, tt);
+
}
EXPORT_SYMBOL_GPL(trace_event_buffer_commit);
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index a180abf76d4e..4a0e693000c6 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -1389,38 +1389,6 @@ event_trigger_unlock_commit(struct trace_event_file *file,
event_triggers_post_call(file, tt);
}
-/**
- * event_trigger_unlock_commit_regs - handle triggers and finish event commit
- * @file: The file pointer associated with the event
- * @buffer: The ring buffer that the event is being written to
- * @event: The event meta data in the ring buffer
- * @entry: The event itself
- * @trace_ctx: The tracing context flags.
- *
- * This is a helper function to handle triggers that require data
- * from the event itself. It also tests the event against filters and
- * if the event is soft disabled and should be discarded.
- *
- * Same as event_trigger_unlock_commit() but calls
- * trace_buffer_unlock_commit_regs() instead of trace_buffer_unlock_commit().
- */
-static inline void
-event_trigger_unlock_commit_regs(struct trace_event_file *file,
- struct trace_buffer *buffer,
- struct ring_buffer_event *event,
- void *entry, unsigned int trace_ctx,
- struct pt_regs *regs)
-{
- enum event_trigger_type tt = ETT_NONE;
-
- if (!__event_trigger_test_discard(file, buffer, event, entry, &tt))
- trace_buffer_unlock_commit_regs(file->tr, buffer, event,
- trace_ctx, regs);
-
- if (tt)
- event_triggers_post_call(file, tt);
-}
-
#define FILTER_PRED_INVALID ((unsigned short)-1)
#define FILTER_PRED_IS_RIGHT (1 << 15)
#define FILTER_PRED_FOLD (1 << 15)
--
2.30.2
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Thanks,
Sasha
------------------ original commit in Linus's tree ------------------
>From 419dd626e357e89fc9c4e3863592c8b38cfe1571 Mon Sep 17 00:00:00 2001
From: Nicolas Saenz Julienne <nsaenz(a)kernel.org>
Date: Sat, 7 Aug 2021 13:06:36 +0200
Subject: [PATCH] mmc: sdhci-iproc: Set SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN on
BCM2711
The controller doesn't seem to pick-up on clock changes, so set the
SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN flag to query the clock frequency
directly from the clock.
Fixes: f84e411c85be ("mmc: sdhci-iproc: Add support for emmc2 of the BCM2711")
Signed-off-by: Nicolas Saenz Julienne <nsaenz(a)kernel.org>
Signed-off-by: Stefan Wahren <stefan.wahren(a)i2se.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/1628334401-6577-6-git-send-email-stefan.wahren@i2…
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
---
drivers/mmc/host/sdhci-iproc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/mmc/host/sdhci-iproc.c b/drivers/mmc/host/sdhci-iproc.c
index 032bf852397f..e7565c671998 100644
--- a/drivers/mmc/host/sdhci-iproc.c
+++ b/drivers/mmc/host/sdhci-iproc.c
@@ -295,7 +295,8 @@ static const struct sdhci_ops sdhci_iproc_bcm2711_ops = {
};
static const struct sdhci_pltfm_data sdhci_bcm2711_pltfm_data = {
- .quirks = SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12,
+ .quirks = SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12 |
+ SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
.ops = &sdhci_iproc_bcm2711_ops,
};
--
2.30.2
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Thanks,
Sasha
------------------ original commit in Linus's tree ------------------
>From c9107dd0b851777d7e134420baf13a5c5343bc16 Mon Sep 17 00:00:00 2001
From: Nicolas Saenz Julienne <nsaenz(a)kernel.org>
Date: Sat, 7 Aug 2021 13:06:35 +0200
Subject: [PATCH] mmc: sdhci-iproc: Cap min clock frequency on BCM2711
There is a known bug on BCM2711's SDHCI core integration where the
controller will hang when the difference between the core clock and the
bus clock is too great. Specifically this can be reproduced under the
following conditions:
- No SD card plugged in, polling thread is running, probing cards at
100 kHz.
- BCM2711's core clock configured at 500MHz or more.
So set 200 kHz as the minimum clock frequency available for that board.
For more information on the issue see this:
https://lore.kernel.org/linux-mmc/20210322185816.27582-1-nsaenz@kernel.org/…
Fixes: f84e411c85be ("mmc: sdhci-iproc: Add support for emmc2 of the BCM2711")
Signed-off-by: Nicolas Saenz Julienne <nsaenz(a)kernel.org>
Signed-off-by: Stefan Wahren <stefan.wahren(a)i2se.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/1628334401-6577-5-git-send-email-stefan.wahren@i2…
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
---
drivers/mmc/host/sdhci-iproc.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/drivers/mmc/host/sdhci-iproc.c b/drivers/mmc/host/sdhci-iproc.c
index cce390fe9cf3..032bf852397f 100644
--- a/drivers/mmc/host/sdhci-iproc.c
+++ b/drivers/mmc/host/sdhci-iproc.c
@@ -173,6 +173,23 @@ static unsigned int sdhci_iproc_get_max_clock(struct sdhci_host *host)
return pltfm_host->clock;
}
+/*
+ * There is a known bug on BCM2711's SDHCI core integration where the
+ * controller will hang when the difference between the core clock and the bus
+ * clock is too great. Specifically this can be reproduced under the following
+ * conditions:
+ *
+ * - No SD card plugged in, polling thread is running, probing cards at
+ * 100 kHz.
+ * - BCM2711's core clock configured at 500MHz or more
+ *
+ * So we set 200kHz as the minimum clock frequency available for that SoC.
+ */
+static unsigned int sdhci_iproc_bcm2711_get_min_clock(struct sdhci_host *host)
+{
+ return 200000;
+}
+
static const struct sdhci_ops sdhci_iproc_ops = {
.set_clock = sdhci_set_clock,
.get_max_clock = sdhci_iproc_get_max_clock,
@@ -271,6 +288,7 @@ static const struct sdhci_ops sdhci_iproc_bcm2711_ops = {
.set_clock = sdhci_set_clock,
.set_power = sdhci_set_power_and_bus_voltage,
.get_max_clock = sdhci_iproc_get_max_clock,
+ .get_min_clock = sdhci_iproc_bcm2711_get_min_clock,
.set_bus_width = sdhci_set_bus_width,
.reset = sdhci_reset,
.set_uhs_signaling = sdhci_set_uhs_signaling,
--
2.30.2
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Thanks,
Sasha
------------------ original commit in Linus's tree ------------------
>From da94692001ea45ffa1f5e9f17ecdef7aecd90c27 Mon Sep 17 00:00:00 2001
From: Kristin Paget <kristin(a)tombom.co.uk>
Date: Sat, 14 Aug 2021 15:46:05 -0700
Subject: [PATCH] ALSA: hda/realtek: Enable 4-speaker output for Dell XPS 15
9510 laptop
The 2021-model XPS 15 appears to use the same 4-speakers-on-ALC289 audio
setup as the Precision models, so requires the same quirk to enable woofer
output. Tested on my own 9510.
Signed-off-by: Kristin Paget <kristin(a)tombom.co.uk>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/e1fc95c5-c10a-1f98-a5c2-dd6e336157e1@tombom.co.uk
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
---
sound/pci/hda/patch_realtek.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
index a065260d0d20..96f32eaa24df 100644
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -8332,6 +8332,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
SND_PCI_QUIRK(0x1028, 0x0a2e, "Dell", ALC236_FIXUP_DELL_AIO_HEADSET_MIC),
SND_PCI_QUIRK(0x1028, 0x0a30, "Dell", ALC236_FIXUP_DELL_AIO_HEADSET_MIC),
SND_PCI_QUIRK(0x1028, 0x0a58, "Dell", ALC255_FIXUP_DELL_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1028, 0x0a61, "Dell XPS 15 9510", ALC289_FIXUP_DUAL_SPK),
SND_PCI_QUIRK(0x1028, 0x164a, "Dell", ALC293_FIXUP_DELL1_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1028, 0x164b, "Dell", ALC293_FIXUP_DELL1_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x103c, 0x1586, "HP", ALC269_FIXUP_HP_MUTE_LED_MIC2),
--
2.30.2
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Thanks,
Sasha
------------------ original commit in Linus's tree ------------------
>From c0e38eaa8d5102c138e4f16658ea762417d42a8f Mon Sep 17 00:00:00 2001
From: Srinivas Kandagatla <srinivas.kandagatla(a)linaro.org>
Date: Mon, 9 Aug 2021 09:24:27 +0100
Subject: [PATCH] slimbus: ngd: set correct device for pm
For some reason we ended up using wrong device in some places for pm_runtime calls.
Fix this so that NGG driver can do runtime pm correctly.
Fixes: 917809e2280b ("slimbus: ngd: Add qcom SLIMBus NGD driver")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla(a)linaro.org>
Link: https://lore.kernel.org/r/20210809082428.11236-4-srinivas.kandagatla@linaro…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/slimbus/qcom-ngd-ctrl.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/drivers/slimbus/qcom-ngd-ctrl.c b/drivers/slimbus/qcom-ngd-ctrl.c
index c054e83ab636..f3ee8e036372 100644
--- a/drivers/slimbus/qcom-ngd-ctrl.c
+++ b/drivers/slimbus/qcom-ngd-ctrl.c
@@ -618,7 +618,7 @@ static void qcom_slim_ngd_rx(struct qcom_slim_ngd_ctrl *ctrl, u8 *buf)
(mc == SLIM_USR_MC_GENERIC_ACK &&
mt == SLIM_MSG_MT_SRC_REFERRED_USER)) {
slim_msg_response(&ctrl->ctrl, &buf[4], buf[3], len - 4);
- pm_runtime_mark_last_busy(ctrl->dev);
+ pm_runtime_mark_last_busy(ctrl->ctrl.dev);
}
}
@@ -1257,13 +1257,14 @@ static int qcom_slim_ngd_enable(struct qcom_slim_ngd_ctrl *ctrl, bool enable)
}
/* controller state should be in sync with framework state */
complete(&ctrl->qmi.qmi_comp);
- if (!pm_runtime_enabled(ctrl->dev) ||
- !pm_runtime_suspended(ctrl->dev))
- qcom_slim_ngd_runtime_resume(ctrl->dev);
+ if (!pm_runtime_enabled(ctrl->ctrl.dev) ||
+ !pm_runtime_suspended(ctrl->ctrl.dev))
+ qcom_slim_ngd_runtime_resume(ctrl->ctrl.dev);
else
- pm_runtime_resume(ctrl->dev);
- pm_runtime_mark_last_busy(ctrl->dev);
- pm_runtime_put(ctrl->dev);
+ pm_runtime_resume(ctrl->ctrl.dev);
+
+ pm_runtime_mark_last_busy(ctrl->ctrl.dev);
+ pm_runtime_put(ctrl->ctrl.dev);
ret = slim_register_controller(&ctrl->ctrl);
if (ret) {
@@ -1389,7 +1390,7 @@ static int qcom_slim_ngd_ssr_pdr_notify(struct qcom_slim_ngd_ctrl *ctrl,
/* Make sure the last dma xfer is finished */
mutex_lock(&ctrl->tx_lock);
if (ctrl->state != QCOM_SLIM_NGD_CTRL_DOWN) {
- pm_runtime_get_noresume(ctrl->dev);
+ pm_runtime_get_noresume(ctrl->ctrl.dev);
ctrl->state = QCOM_SLIM_NGD_CTRL_DOWN;
qcom_slim_ngd_down(ctrl);
qcom_slim_ngd_exit_dma(ctrl);
--
2.30.2
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Thanks,
Sasha
------------------ original commit in Linus's tree ------------------
>From a30f895ad3239f45012e860d4f94c1a388b36d14 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe(a)kernel.dk>
Date: Fri, 20 Aug 2021 14:53:59 -0600
Subject: [PATCH] io_uring: fix xa_alloc_cycle() error return value check
We currently check for ret != 0 to indicate error, but '1' is a valid
return and just indicates that the allocation succeeded with a wrap.
Correct the check to be for < 0, like it was before the xarray
conversion.
Cc: stable(a)vger.kernel.org
Fixes: 61cf93700fe6 ("io_uring: Convert personality_idr to XArray")
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
---
fs/io_uring.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 979941bcd15a..a2e20a6fbfed 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -9843,10 +9843,11 @@ static int io_register_personality(struct io_ring_ctx *ctx)
ret = xa_alloc_cyclic(&ctx->personalities, &id, (void *)creds,
XA_LIMIT(0, USHRT_MAX), &ctx->pers_next, GFP_KERNEL);
- if (!ret)
- return id;
- put_cred(creds);
- return ret;
+ if (ret < 0) {
+ put_cred(creds);
+ return ret;
+ }
+ return id;
}
static int io_register_restrictions(struct io_ring_ctx *ctx, void __user *arg,
--
2.30.2
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Thanks,
Sasha
------------------ original commit in Linus's tree ------------------
>From 6c34df6f350df9579ce99d887a2b5fa14cc13b32 Mon Sep 17 00:00:00 2001
From: Pingfan Liu <kernelfans(a)gmail.com>
Date: Sat, 14 Aug 2021 11:45:38 +0800
Subject: [PATCH] tracing: Apply trace filters on all output channels
The event filters are not applied on all of the output, which results in
the flood of printk when using tp_printk. Unfolding
event_trigger_unlock_commit_regs() into trace_event_buffer_commit(), so
the filters can be applied on every output.
Link: https://lkml.kernel.org/r/20210814034538.8428-1-kernelfans@gmail.com
Cc: stable(a)vger.kernel.org
Fixes: 0daa2302968c1 ("tracing: Add tp_printk cmdline to have tracepoints go to printk()")
Signed-off-by: Pingfan Liu <kernelfans(a)gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
---
kernel/trace/trace.c | 18 +++++++++++++++---
kernel/trace/trace.h | 32 --------------------------------
2 files changed, 15 insertions(+), 35 deletions(-)
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 33899a71fdc1..a1adb29ef5c1 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2897,14 +2897,26 @@ int tracepoint_printk_sysctl(struct ctl_table *table, int write,
void trace_event_buffer_commit(struct trace_event_buffer *fbuffer)
{
+ enum event_trigger_type tt = ETT_NONE;
+ struct trace_event_file *file = fbuffer->trace_file;
+
+ if (__event_trigger_test_discard(file, fbuffer->buffer, fbuffer->event,
+ fbuffer->entry, &tt))
+ goto discard;
+
if (static_key_false(&tracepoint_printk_key.key))
output_printk(fbuffer);
if (static_branch_unlikely(&trace_event_exports_enabled))
ftrace_exports(fbuffer->event, TRACE_EXPORT_EVENT);
- event_trigger_unlock_commit_regs(fbuffer->trace_file, fbuffer->buffer,
- fbuffer->event, fbuffer->entry,
- fbuffer->trace_ctx, fbuffer->regs);
+
+ trace_buffer_unlock_commit_regs(file->tr, fbuffer->buffer,
+ fbuffer->event, fbuffer->trace_ctx, fbuffer->regs);
+
+discard:
+ if (tt)
+ event_triggers_post_call(file, tt);
+
}
EXPORT_SYMBOL_GPL(trace_event_buffer_commit);
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index a180abf76d4e..4a0e693000c6 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -1389,38 +1389,6 @@ event_trigger_unlock_commit(struct trace_event_file *file,
event_triggers_post_call(file, tt);
}
-/**
- * event_trigger_unlock_commit_regs - handle triggers and finish event commit
- * @file: The file pointer associated with the event
- * @buffer: The ring buffer that the event is being written to
- * @event: The event meta data in the ring buffer
- * @entry: The event itself
- * @trace_ctx: The tracing context flags.
- *
- * This is a helper function to handle triggers that require data
- * from the event itself. It also tests the event against filters and
- * if the event is soft disabled and should be discarded.
- *
- * Same as event_trigger_unlock_commit() but calls
- * trace_buffer_unlock_commit_regs() instead of trace_buffer_unlock_commit().
- */
-static inline void
-event_trigger_unlock_commit_regs(struct trace_event_file *file,
- struct trace_buffer *buffer,
- struct ring_buffer_event *event,
- void *entry, unsigned int trace_ctx,
- struct pt_regs *regs)
-{
- enum event_trigger_type tt = ETT_NONE;
-
- if (!__event_trigger_test_discard(file, buffer, event, entry, &tt))
- trace_buffer_unlock_commit_regs(file->tr, buffer, event,
- trace_ctx, regs);
-
- if (tt)
- event_triggers_post_call(file, tt);
-}
-
#define FILTER_PRED_INVALID ((unsigned short)-1)
#define FILTER_PRED_IS_RIGHT (1 << 15)
#define FILTER_PRED_FOLD (1 << 15)
--
2.30.2
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Thanks,
Sasha
------------------ original commit in Linus's tree ------------------
>From 21f965221e7c42609521342403e8fb91b8b3e76e Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe(a)kernel.dk>
Date: Sat, 14 Aug 2021 09:04:40 -0600
Subject: [PATCH] io_uring: only assign io_uring_enter() SQPOLL error in actual
error case
If an SQPOLL based ring is newly created and an application issues an
io_uring_enter(2) system call on it, then we can return a spurious
-EOWNERDEAD error. This happens because there's nothing to submit, and
if the caller doesn't specify any other action, the initial error
assignment of -EOWNERDEAD never gets overwritten. This causes us to
return it directly, even if it isn't valid.
Move the error assignment into the actual failure case instead.
Cc: stable(a)vger.kernel.org
Fixes: d9d05217cb69 ("io_uring: stop SQPOLL submit on creator's death")
Reported-by: Sherlock Holo sherlockya(a)gmail.com
Link: https://github.com/axboe/liburing/issues/413
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
---
fs/io_uring.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 04c6d059ea94..6a092a534d2b 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -9370,9 +9370,10 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
if (ctx->flags & IORING_SETUP_SQPOLL) {
io_cqring_overflow_flush(ctx, false);
- ret = -EOWNERDEAD;
- if (unlikely(ctx->sq_data->thread == NULL))
+ if (unlikely(ctx->sq_data->thread == NULL)) {
+ ret = -EOWNERDEAD;
goto out;
+ }
if (flags & IORING_ENTER_SQ_WAKEUP)
wake_up(&ctx->sq_data->wait);
if (flags & IORING_ENTER_SQ_WAIT) {
--
2.30.2
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Thanks,
Sasha
------------------ original commit in Linus's tree ------------------
>From c0e38eaa8d5102c138e4f16658ea762417d42a8f Mon Sep 17 00:00:00 2001
From: Srinivas Kandagatla <srinivas.kandagatla(a)linaro.org>
Date: Mon, 9 Aug 2021 09:24:27 +0100
Subject: [PATCH] slimbus: ngd: set correct device for pm
For some reason we ended up using wrong device in some places for pm_runtime calls.
Fix this so that NGG driver can do runtime pm correctly.
Fixes: 917809e2280b ("slimbus: ngd: Add qcom SLIMBus NGD driver")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla(a)linaro.org>
Link: https://lore.kernel.org/r/20210809082428.11236-4-srinivas.kandagatla@linaro…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/slimbus/qcom-ngd-ctrl.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/drivers/slimbus/qcom-ngd-ctrl.c b/drivers/slimbus/qcom-ngd-ctrl.c
index c054e83ab636..f3ee8e036372 100644
--- a/drivers/slimbus/qcom-ngd-ctrl.c
+++ b/drivers/slimbus/qcom-ngd-ctrl.c
@@ -618,7 +618,7 @@ static void qcom_slim_ngd_rx(struct qcom_slim_ngd_ctrl *ctrl, u8 *buf)
(mc == SLIM_USR_MC_GENERIC_ACK &&
mt == SLIM_MSG_MT_SRC_REFERRED_USER)) {
slim_msg_response(&ctrl->ctrl, &buf[4], buf[3], len - 4);
- pm_runtime_mark_last_busy(ctrl->dev);
+ pm_runtime_mark_last_busy(ctrl->ctrl.dev);
}
}
@@ -1257,13 +1257,14 @@ static int qcom_slim_ngd_enable(struct qcom_slim_ngd_ctrl *ctrl, bool enable)
}
/* controller state should be in sync with framework state */
complete(&ctrl->qmi.qmi_comp);
- if (!pm_runtime_enabled(ctrl->dev) ||
- !pm_runtime_suspended(ctrl->dev))
- qcom_slim_ngd_runtime_resume(ctrl->dev);
+ if (!pm_runtime_enabled(ctrl->ctrl.dev) ||
+ !pm_runtime_suspended(ctrl->ctrl.dev))
+ qcom_slim_ngd_runtime_resume(ctrl->ctrl.dev);
else
- pm_runtime_resume(ctrl->dev);
- pm_runtime_mark_last_busy(ctrl->dev);
- pm_runtime_put(ctrl->dev);
+ pm_runtime_resume(ctrl->ctrl.dev);
+
+ pm_runtime_mark_last_busy(ctrl->ctrl.dev);
+ pm_runtime_put(ctrl->ctrl.dev);
ret = slim_register_controller(&ctrl->ctrl);
if (ret) {
@@ -1389,7 +1390,7 @@ static int qcom_slim_ngd_ssr_pdr_notify(struct qcom_slim_ngd_ctrl *ctrl,
/* Make sure the last dma xfer is finished */
mutex_lock(&ctrl->tx_lock);
if (ctrl->state != QCOM_SLIM_NGD_CTRL_DOWN) {
- pm_runtime_get_noresume(ctrl->dev);
+ pm_runtime_get_noresume(ctrl->ctrl.dev);
ctrl->state = QCOM_SLIM_NGD_CTRL_DOWN;
qcom_slim_ngd_down(ctrl);
qcom_slim_ngd_exit_dma(ctrl);
--
2.30.2
From: Parav Pandit <parav(a)nvidia.com>
[ Upstream commit 60f0779862e4ab943810187752c462e85f5fa371 ]
Currently vq->broken field is read by virtqueue_is_broken() in busy
loop in one context by virtnet_send_command().
vq->broken is set to true in other process context by
virtio_break_device(). Reader and writer are accessing it without any
synchronization. This may lead to a compiler optimization which may
result to optimize reading vq->broken only once.
Hence, force reading vq->broken on each invocation of
virtqueue_is_broken() and also force writing it so that such
update is visible to the readers.
It is a theoretical fix that isn't yet encountered in the field.
Signed-off-by: Parav Pandit <parav(a)nvidia.com>
Link: https://lore.kernel.org/r/20210721142648.1525924-2-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/virtio/virtio_ring.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 6b3565feddb2..b15c24c4d91f 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -840,7 +840,7 @@ bool virtqueue_is_broken(struct virtqueue *_vq)
{
struct vring_virtqueue *vq = to_vvq(_vq);
- return vq->broken;
+ return READ_ONCE(vq->broken);
}
EXPORT_SYMBOL_GPL(virtqueue_is_broken);
@@ -854,7 +854,9 @@ void virtio_break_device(struct virtio_device *dev)
list_for_each_entry(_vq, &dev->vqs, list) {
struct vring_virtqueue *vq = to_vvq(_vq);
- vq->broken = true;
+
+ /* Pairs with READ_ONCE() in virtqueue_is_broken(). */
+ WRITE_ONCE(vq->broken, true);
}
}
EXPORT_SYMBOL_GPL(virtio_break_device);
--
2.30.2
From: Parav Pandit <parav(a)nvidia.com>
[ Upstream commit 60f0779862e4ab943810187752c462e85f5fa371 ]
Currently vq->broken field is read by virtqueue_is_broken() in busy
loop in one context by virtnet_send_command().
vq->broken is set to true in other process context by
virtio_break_device(). Reader and writer are accessing it without any
synchronization. This may lead to a compiler optimization which may
result to optimize reading vq->broken only once.
Hence, force reading vq->broken on each invocation of
virtqueue_is_broken() and also force writing it so that such
update is visible to the readers.
It is a theoretical fix that isn't yet encountered in the field.
Signed-off-by: Parav Pandit <parav(a)nvidia.com>
Link: https://lore.kernel.org/r/20210721142648.1525924-2-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/virtio/virtio_ring.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 5cad9f41c238..cf7eccfe3469 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -1150,7 +1150,7 @@ bool virtqueue_is_broken(struct virtqueue *_vq)
{
struct vring_virtqueue *vq = to_vvq(_vq);
- return vq->broken;
+ return READ_ONCE(vq->broken);
}
EXPORT_SYMBOL_GPL(virtqueue_is_broken);
@@ -1164,7 +1164,9 @@ void virtio_break_device(struct virtio_device *dev)
list_for_each_entry(_vq, &dev->vqs, list) {
struct vring_virtqueue *vq = to_vvq(_vq);
- vq->broken = true;
+
+ /* Pairs with READ_ONCE() in virtqueue_is_broken(). */
+ WRITE_ONCE(vq->broken, true);
}
}
EXPORT_SYMBOL_GPL(virtio_break_device);
--
2.30.2
From: Michał Mirosław <mirq-linux(a)rere.qmqm.pl>
[ Upstream commit 335ffab3ef864539e814b9a2903b0ae420c1c067 ]
This WARN can be triggered per-core and the stack trace is not useful.
Replace it with plain dev_err(). Fix a comment while at it.
Signed-off-by: Michał Mirosław <mirq-linux(a)rere.qmqm.pl>
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/opp/of.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/opp/of.c b/drivers/opp/of.c
index d64a13d7881b..a53123356697 100644
--- a/drivers/opp/of.c
+++ b/drivers/opp/of.c
@@ -423,8 +423,9 @@ static int _of_add_opp_table_v2(struct device *dev, struct device_node *opp_np)
}
}
- /* There should be one of more OPP defined */
- if (WARN_ON(!count)) {
+ /* There should be one or more OPPs defined */
+ if (!count) {
+ dev_err(dev, "%s: no supported OPPs", __func__);
ret = -ENOENT;
goto put_opp_table;
}
--
2.30.2
From: Michał Mirosław <mirq-linux(a)rere.qmqm.pl>
[ Upstream commit 335ffab3ef864539e814b9a2903b0ae420c1c067 ]
This WARN can be triggered per-core and the stack trace is not useful.
Replace it with plain dev_err(). Fix a comment while at it.
Signed-off-by: Michał Mirosław <mirq-linux(a)rere.qmqm.pl>
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/opp/of.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/opp/of.c b/drivers/opp/of.c
index 249738e1e0b7..603c688fe23d 100644
--- a/drivers/opp/of.c
+++ b/drivers/opp/of.c
@@ -682,8 +682,9 @@ static int _of_add_opp_table_v2(struct device *dev, struct opp_table *opp_table)
}
}
- /* There should be one of more OPP defined */
- if (WARN_ON(!count)) {
+ /* There should be one or more OPPs defined */
+ if (!count) {
+ dev_err(dev, "%s: no supported OPPs", __func__);
ret = -ENOENT;
goto remove_static_opp;
}
--
2.30.2
When locking a region, we currently clamp to a PAGE_SIZE as the minimum
lock region. While this is valid for Midgard, it is invalid for Bifrost,
where the minimum locking size is 8x larger than the 4k page size. Add a
hardware definition for the minimum lock region size (corresponding to
KBASE_LOCK_REGION_MIN_SIZE_LOG2 in kbase) and respect it.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig(a)collabora.com>
Tested-by: Chris Morgan <macromorgan(a)hotmail.com>
Cc: <stable(a)vger.kernel.org>
---
drivers/gpu/drm/panfrost/panfrost_mmu.c | 2 +-
drivers/gpu/drm/panfrost/panfrost_regs.h | 2 ++
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index 3a795273e505..dfe5f1d29763 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -66,7 +66,7 @@ static void lock_region(struct panfrost_device *pfdev, u32 as_nr,
/* The size is encoded as ceil(log2) minus(1), which may be calculated
* with fls. The size must be clamped to hardware bounds.
*/
- size = max_t(u64, size, PAGE_SIZE);
+ size = max_t(u64, size, AS_LOCK_REGION_MIN_SIZE);
region_width = fls64(size - 1) - 1;
region |= region_width;
diff --git a/drivers/gpu/drm/panfrost/panfrost_regs.h b/drivers/gpu/drm/panfrost/panfrost_regs.h
index 1940ff86e49a..6c5a11ef1ee8 100644
--- a/drivers/gpu/drm/panfrost/panfrost_regs.h
+++ b/drivers/gpu/drm/panfrost/panfrost_regs.h
@@ -316,6 +316,8 @@
#define AS_FAULTSTATUS_ACCESS_TYPE_READ (0x2 << 8)
#define AS_FAULTSTATUS_ACCESS_TYPE_WRITE (0x3 << 8)
+#define AS_LOCK_REGION_MIN_SIZE (1ULL << 15)
+
#define gpu_write(dev, reg, data) writel(data, dev->iomem + reg)
#define gpu_read(dev, reg) readl(dev->iomem + reg)
--
2.30.2
In lock_region, simplify the calculation of the region_width parameter.
This field is the size, but encoded as log2(ceil(size)) - 1.
log2(ceil(size)) may be computed directly as fls(size - 1). However, we
want to use the 64-bit versions as the amount to lock can exceed
32-bits.
This avoids undefined behaviour when locking all memory (size ~0),
caught by UBSAN.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig(a)collabora.com>
Reported-and-tested-by: Chris Morgan <macromorgan(a)hotmail.com>
Cc: <stable(a)vger.kernel.org>
---
drivers/gpu/drm/panfrost/panfrost_mmu.c | 19 +++++--------------
1 file changed, 5 insertions(+), 14 deletions(-)
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index 0da5b3100ab1..f6e02d0392f4 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -62,21 +62,12 @@ static void lock_region(struct panfrost_device *pfdev, u32 as_nr,
{
u8 region_width;
u64 region = iova & PAGE_MASK;
- /*
- * fls returns:
- * 1 .. 32
- *
- * 10 + fls(num_pages)
- * results in the range (11 .. 42)
- */
-
- size = round_up(size, PAGE_SIZE);
- region_width = 10 + fls(size >> PAGE_SHIFT);
- if ((size >> PAGE_SHIFT) != (1ul << (region_width - 11))) {
- /* not pow2, so must go up to the next pow2 */
- region_width += 1;
- }
+ /* The size is encoded as ceil(log2) minus(1), which may be calculated
+ * with fls. The size must be clamped to hardware bounds.
+ */
+ size = max_t(u64, size, PAGE_SIZE);
+ region_width = fls64(size - 1) - 1;
region |= region_width;
/* Lock the region that needs to be updated */
--
2.30.2
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 112022bdb5bc372e00e6e43cb88ee38ea67b97bd Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Tue, 22 Jun 2021 10:56:47 -0700
Subject: [PATCH] KVM: x86/mmu: Treat NX as used (not reserved) for all !TDP
shadow MMUs
Mark NX as being used for all non-nested shadow MMUs, as KVM will set the
NX bit for huge SPTEs if the iTLB mutli-hit mitigation is enabled.
Checking the mitigation itself is not sufficient as it can be toggled on
at any time and KVM doesn't reset MMU contexts when that happens. KVM
could reset the contexts, but that would require purging all SPTEs in all
MMUs, for no real benefit. And, KVM already forces EFER.NX=1 when TDP is
disabled (for WP=0, SMEP=1, NX=0), so technically NX is never reserved
for shadow MMUs.
Fixes: b8e8c8303ff2 ("kvm: mmu: ITLB_MULTIHIT mitigation")
Cc: stable(a)vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-Id: <20210622175739.3610207-3-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index b3be690d081a..444e068e6ad9 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4221,7 +4221,15 @@ static inline u64 reserved_hpa_bits(void)
void
reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu, struct kvm_mmu *context)
{
- bool uses_nx = context->nx ||
+ /*
+ * KVM uses NX when TDP is disabled to handle a variety of scenarios,
+ * notably for huge SPTEs if iTLB multi-hit mitigation is enabled and
+ * to generate correct permissions for CR0.WP=0/CR4.SMEP=1/EFER.NX=0.
+ * The iTLB multi-hit workaround can be toggled at any time, so assume
+ * NX can be used by any non-nested shadow MMU to avoid having to reset
+ * MMU contexts. Note, KVM forces EFER.NX=1 when TDP is disabled.
+ */
+ bool uses_nx = context->nx || !tdp_enabled ||
context->mmu_role.base.smep_andnot_wp;
struct rsvd_bits_validate *shadow_zero_check;
int i;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 112022bdb5bc372e00e6e43cb88ee38ea67b97bd Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Tue, 22 Jun 2021 10:56:47 -0700
Subject: [PATCH] KVM: x86/mmu: Treat NX as used (not reserved) for all !TDP
shadow MMUs
Mark NX as being used for all non-nested shadow MMUs, as KVM will set the
NX bit for huge SPTEs if the iTLB mutli-hit mitigation is enabled.
Checking the mitigation itself is not sufficient as it can be toggled on
at any time and KVM doesn't reset MMU contexts when that happens. KVM
could reset the contexts, but that would require purging all SPTEs in all
MMUs, for no real benefit. And, KVM already forces EFER.NX=1 when TDP is
disabled (for WP=0, SMEP=1, NX=0), so technically NX is never reserved
for shadow MMUs.
Fixes: b8e8c8303ff2 ("kvm: mmu: ITLB_MULTIHIT mitigation")
Cc: stable(a)vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-Id: <20210622175739.3610207-3-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index b3be690d081a..444e068e6ad9 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4221,7 +4221,15 @@ static inline u64 reserved_hpa_bits(void)
void
reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu, struct kvm_mmu *context)
{
- bool uses_nx = context->nx ||
+ /*
+ * KVM uses NX when TDP is disabled to handle a variety of scenarios,
+ * notably for huge SPTEs if iTLB multi-hit mitigation is enabled and
+ * to generate correct permissions for CR0.WP=0/CR4.SMEP=1/EFER.NX=0.
+ * The iTLB multi-hit workaround can be toggled at any time, so assume
+ * NX can be used by any non-nested shadow MMU to avoid having to reset
+ * MMU contexts. Note, KVM forces EFER.NX=1 when TDP is disabled.
+ */
+ bool uses_nx = context->nx || !tdp_enabled ||
context->mmu_role.base.smep_andnot_wp;
struct rsvd_bits_validate *shadow_zero_check;
int i;
The patch titled
Subject: hugetlb: don't pass page cache pages to restore_reserve_on_error
has been removed from the -mm tree. Its filename was
hugetlb-dont-pass-page-cache-pages-to-restore_reserve_on_error.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: hugetlb: don't pass page cache pages to restore_reserve_on_error
syzbot hit kernel BUG at fs/hugetlbfs/inode.c:532 as described in [1].
This BUG triggers if the HPageRestoreReserve flag is set on a page in
the page cache. It should never be set, as the routine
huge_add_to_page_cache explicitly clears the flag after adding a page
to the cache.
The only code other than huge page allocation which sets the flag is
restore_reserve_on_error. It will potentially set the flag in rare out
of memory conditions. syzbot was injecting errors to cause memory
allocation errors which exercised this specific path.
The code in restore_reserve_on_error is doing the right thing. However,
there are instances where pages in the page cache were being passed to
restore_reserve_on_error. This is incorrect, as once a page goes into
the cache reservation information will not be modified for the page until
it is removed from the cache. Error paths do not remove pages from the
cache, so even in the case of error, the page will remain in the cache
and no reservation adjustment is needed.
Modify routines that potentially call restore_reserve_on_error with a
page cache page to no longer do so.
Note on fixes tag:
Prior to commit 846be08578ed ("mm/hugetlb: expand restore_reserve_on_error
functionality") the routine would not process page cache pages because
the HPageRestoreReserve flag is not set on such pages. Therefore, this
issue could not be trigggered. The code added by commit 846be08578ed
("mm/hugetlb: expand restore_reserve_on_error functionality") is needed
and correct. It exposed incorrect calls to restore_reserve_on_error which
is the root cause addressed by this commit.
[1] https://lore.kernel.org/linux-mm/00000000000050776d05c9b7c7f0@google.com/
Link: https://lkml.kernel.org/r/20210818213304.37038-1-mike.kravetz@oracle.com
Fixes: 846be08578ed ("mm/hugetlb: expand restore_reserve_on_error functionality")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reported-by: <syzbot+67654e51e54455f1c585(a)syzkaller.appspotmail.com>
Cc: Mina Almasry <almasrymina(a)google.com>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Naoya Horiguchi <naoya.horiguchi(a)linux.dev>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)
--- a/mm/hugetlb.c~hugetlb-dont-pass-page-cache-pages-to-restore_reserve_on_error
+++ a/mm/hugetlb.c
@@ -2476,7 +2476,7 @@ void restore_reserve_on_error(struct hst
if (!rc) {
/*
* This indicates there is an entry in the reserve map
- * added by alloc_huge_page. We know it was added
+ * not added by alloc_huge_page. We know it was added
* before the alloc_huge_page call, otherwise
* HPageRestoreReserve would be set on the page.
* Remove the entry so that a subsequent allocation
@@ -4660,7 +4660,9 @@ retry_avoidcopy:
spin_unlock(ptl);
mmu_notifier_invalidate_range_end(&range);
out_release_all:
- restore_reserve_on_error(h, vma, haddr, new_page);
+ /* No restore in case of successful pagetable update (Break COW) */
+ if (new_page != old_page)
+ restore_reserve_on_error(h, vma, haddr, new_page);
put_page(new_page);
out_release_old:
put_page(old_page);
@@ -4776,7 +4778,7 @@ static vm_fault_t hugetlb_no_page(struct
pte_t new_pte;
spinlock_t *ptl;
unsigned long haddr = address & huge_page_mask(h);
- bool new_page = false;
+ bool new_page, new_pagecache_page = false;
/*
* Currently, we are forced to kill the process in the event the
@@ -4799,6 +4801,7 @@ static vm_fault_t hugetlb_no_page(struct
goto out;
retry:
+ new_page = false;
page = find_lock_page(mapping, idx);
if (!page) {
/* Check for page in userfault range */
@@ -4842,6 +4845,7 @@ retry:
goto retry;
goto out;
}
+ new_pagecache_page = true;
} else {
lock_page(page);
if (unlikely(anon_vma_prepare(vma))) {
@@ -4926,7 +4930,9 @@ backout:
spin_unlock(ptl);
backout_unlocked:
unlock_page(page);
- restore_reserve_on_error(h, vma, haddr, page);
+ /* restore reserve for newly allocated pages not in page cache */
+ if (new_page && !new_pagecache_page)
+ restore_reserve_on_error(h, vma, haddr, page);
put_page(page);
goto out;
}
@@ -5135,6 +5141,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_s
int ret = -ENOMEM;
struct page *page;
int writable;
+ bool new_pagecache_page = false;
if (is_continue) {
ret = -EFAULT;
@@ -5228,6 +5235,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_s
ret = huge_add_to_page_cache(page, mapping, idx);
if (ret)
goto out_release_nounlock;
+ new_pagecache_page = true;
}
ptl = huge_pte_lockptr(h, dst_mm, dst_pte);
@@ -5291,7 +5299,8 @@ out_release_unlock:
if (vm_shared || is_continue)
unlock_page(page);
out_release_nounlock:
- restore_reserve_on_error(h, dst_vma, dst_addr, page);
+ if (!new_pagecache_page)
+ restore_reserve_on_error(h, dst_vma, dst_addr, page);
put_page(page);
goto out;
}
_
Patches currently in -mm which might be from mike.kravetz(a)oracle.com are
hugetlb-simplify-prep_compound_gigantic_page-ref-count-racing-code.patch
hugetlb-drop-ref-count-earlier-after-page-allocation.patch
hugetlb-before-freeing-hugetlb-page-set-dtor-to-appropriate-value.patch
The patch titled
Subject: kfence: fix is_kfence_address() for addresses below KFENCE_POOL_SIZE
has been removed from the -mm tree. Its filename was
kfence-fix-is_kfence_address-for-addresses-below-kfence_pool_size.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Marco Elver <elver(a)google.com>
Subject: kfence: fix is_kfence_address() for addresses below KFENCE_POOL_SIZE
Originally the addr != NULL check was meant to take care of the case where
__kfence_pool == NULL (KFENCE is disabled). However, this does not work
for addresses where addr > 0 && addr < KFENCE_POOL_SIZE.
This can be the case on NULL-deref where addr > 0 && addr < PAGE_SIZE or
any other faulting access with addr < KFENCE_POOL_SIZE. While the kernel
would likely crash, the stack traces and report might be confusing due to
double faults upon KFENCE's attempt to unprotect such an address.
Fix it by just checking that __kfence_pool != NULL instead.
Link: https://lkml.kernel.org/r/20210818130300.2482437-1-elver@google.com
Fixes: 0ce20dd84089 ("mm: add Kernel Electric-Fence infrastructure")
Signed-off-by: Marco Elver <elver(a)google.com>
Reported-by: Kuan-Ying Lee <Kuan-Ying.Lee(a)mediatek.com>
Acked-by: Alexander Potapenko <glider(a)google.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: <stable(a)vger.kernel.org> [5.12+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/kfence.h | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
--- a/include/linux/kfence.h~kfence-fix-is_kfence_address-for-addresses-below-kfence_pool_size
+++ a/include/linux/kfence.h
@@ -51,10 +51,11 @@ extern atomic_t kfence_allocation_gate;
static __always_inline bool is_kfence_address(const void *addr)
{
/*
- * The non-NULL check is required in case the __kfence_pool pointer was
- * never initialized; keep it in the slow-path after the range-check.
+ * The __kfence_pool != NULL check is required to deal with the case
+ * where __kfence_pool == NULL && addr < KFENCE_POOL_SIZE. Keep it in
+ * the slow-path after the range-check!
*/
- return unlikely((unsigned long)((char *)addr - __kfence_pool) < KFENCE_POOL_SIZE && addr);
+ return unlikely((unsigned long)((char *)addr - __kfence_pool) < KFENCE_POOL_SIZE && __kfence_pool);
}
/**
_
Patches currently in -mm which might be from elver(a)google.com are
kfence-show-cpu-and-timestamp-in-alloc-free-info.patch
The patch titled
Subject: mm/hwpoison: retry with shake_page() for unhandlable pages
has been removed from the -mm tree. Its filename was
mm-hwpoison-retry-with-shake_page-for-unhandlable-pages.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Subject: mm/hwpoison: retry with shake_page() for unhandlable pages
HWPoisonHandlable() sometimes returns false for typical user pages due to
races with average memory events like transfers over LRU lists. This
causes failures in hwpoison handling.
There's retry code for such a case but does not work because the retry
loop reaches the retry limit too quickly before the page settles down to
handlable state. Let get_any_page() call shake_page() to fix it.
[naoya.horiguchi(a)nec.com: get_any_page(): return -EIO when retry limit reached]
Link: https://lkml.kernel.org/r/20210819001958.2365157-1-naoya.horiguchi@linux.dev
Link: https://lkml.kernel.org/r/20210817053703.2267588-1-naoya.horiguchi@linux.dev
Fixes: 25182f05ffed ("mm,hwpoison: fix race with hugetlb page allocation")
Signed-off-by: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Reported-by: Tony Luck <tony.luck(a)intel.com>
Reviewed-by: Yang Shi <shy828301(a)gmail.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: <stable(a)vger.kernel.org> [5.13+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory-failure.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
--- a/mm/memory-failure.c~mm-hwpoison-retry-with-shake_page-for-unhandlable-pages
+++ a/mm/memory-failure.c
@@ -1146,7 +1146,7 @@ static int __get_hwpoison_page(struct pa
* unexpected races caused by taking a page refcount.
*/
if (!HWPoisonHandlable(head))
- return 0;
+ return -EBUSY;
if (PageTransHuge(head)) {
/*
@@ -1199,9 +1199,15 @@ try_again:
}
goto out;
} else if (ret == -EBUSY) {
- /* We raced with freeing huge page to buddy, retry. */
- if (pass++ < 3)
+ /*
+ * We raced with (possibly temporary) unhandlable
+ * page, retry.
+ */
+ if (pass++ < 3) {
+ shake_page(p, 1);
goto try_again;
+ }
+ ret = -EIO;
goto out;
}
}
_
Patches currently in -mm which might be from naoya.horiguchi(a)nec.com are
mm-sparse-set-section_nid_shift-to-6.patch
The patch titled
Subject: mm: memcontrol: fix occasional OOMs due to proportional memory.low reclaim
has been removed from the -mm tree. Its filename was
mm-memcontrol-fix-occasional-ooms-due-to-proportional-memorylow-reclaim.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Johannes Weiner <hannes(a)cmpxchg.org>
Subject: mm: memcontrol: fix occasional OOMs due to proportional memory.low reclaim
We've noticed occasional OOM killing when memory.low settings are in
effect for cgroups. This is unexpected and undesirable as memory.low
is supposed to express non-OOMing memory priorities between cgroups.
The reason for this is proportional memory.low reclaim. When cgroups
are below their memory.low threshold, reclaim passes them over in the
first round, and then retries if it couldn't find pages anywhere else.
But when cgroups are slightly above their memory.low setting, page scan
force is scaled down and diminished in proportion to the overage, to
the point where it can cause reclaim to fail as well - only in that
case we currently don't retry, and instead trigger OOM.
To fix this, hook proportional reclaim into the same retry logic we
have in place for when cgroups are skipped entirely. This way if
reclaim fails and some cgroups were scanned with diminished pressure,
we'll try another full-force cycle before giving up and OOMing.
[akpm(a)linux-foundation.org: coding-style fixes]
Link: https://lkml.kernel.org/r/20210817180506.220056-1-hannes@cmpxchg.org
Fixes: 9783aa9917f8 ("mm, memcg: proportional memory.{low,min} reclaim")
Signed-off-by: Johannes Weiner <hannes(a)cmpxchg.org>
Reported-by: Leon Yang <lnyng(a)fb.com>
Reviewed-by: Rik van Riel <riel(a)surriel.com>
Reviewed-by: Shakeel Butt <shakeelb(a)google.com>
Acked-by: Roman Gushchin <guro(a)fb.com>
Acked-by: Chris Down <chris(a)chrisdown.name>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: <stable(a)vger.kernel.org> [5.4+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/memcontrol.h | 29 +++++++++++++++--------------
mm/vmscan.c | 27 +++++++++++++++++++--------
2 files changed, 34 insertions(+), 22 deletions(-)
--- a/include/linux/memcontrol.h~mm-memcontrol-fix-occasional-ooms-due-to-proportional-memorylow-reclaim
+++ a/include/linux/memcontrol.h
@@ -612,12 +612,15 @@ static inline bool mem_cgroup_disabled(v
return !cgroup_subsys_enabled(memory_cgrp_subsys);
}
-static inline unsigned long mem_cgroup_protection(struct mem_cgroup *root,
- struct mem_cgroup *memcg,
- bool in_low_reclaim)
+static inline void mem_cgroup_protection(struct mem_cgroup *root,
+ struct mem_cgroup *memcg,
+ unsigned long *min,
+ unsigned long *low)
{
+ *min = *low = 0;
+
if (mem_cgroup_disabled())
- return 0;
+ return;
/*
* There is no reclaim protection applied to a targeted reclaim.
@@ -653,13 +656,10 @@ static inline unsigned long mem_cgroup_p
*
*/
if (root == memcg)
- return 0;
-
- if (in_low_reclaim)
- return READ_ONCE(memcg->memory.emin);
+ return;
- return max(READ_ONCE(memcg->memory.emin),
- READ_ONCE(memcg->memory.elow));
+ *min = READ_ONCE(memcg->memory.emin);
+ *low = READ_ONCE(memcg->memory.elow);
}
void mem_cgroup_calculate_protection(struct mem_cgroup *root,
@@ -1147,11 +1147,12 @@ static inline void memcg_memory_event_mm
{
}
-static inline unsigned long mem_cgroup_protection(struct mem_cgroup *root,
- struct mem_cgroup *memcg,
- bool in_low_reclaim)
+static inline void mem_cgroup_protection(struct mem_cgroup *root,
+ struct mem_cgroup *memcg,
+ unsigned long *min,
+ unsigned long *low)
{
- return 0;
+ *min = *low = 0;
}
static inline void mem_cgroup_calculate_protection(struct mem_cgroup *root,
--- a/mm/vmscan.c~mm-memcontrol-fix-occasional-ooms-due-to-proportional-memorylow-reclaim
+++ a/mm/vmscan.c
@@ -100,9 +100,12 @@ struct scan_control {
unsigned int may_swap:1;
/*
- * Cgroups are not reclaimed below their configured memory.low,
- * unless we threaten to OOM. If any cgroups are skipped due to
- * memory.low and nothing was reclaimed, go back for memory.low.
+ * Cgroup memory below memory.low is protected as long as we
+ * don't threaten to OOM. If any cgroup is reclaimed at
+ * reduced force or passed over entirely due to its memory.low
+ * setting (memcg_low_skipped), and nothing is reclaimed as a
+ * result, then go back for one more cycle that reclaims the protected
+ * memory (memcg_low_reclaim) to avert OOM.
*/
unsigned int memcg_low_reclaim:1;
unsigned int memcg_low_skipped:1;
@@ -2537,15 +2540,14 @@ out:
for_each_evictable_lru(lru) {
int file = is_file_lru(lru);
unsigned long lruvec_size;
+ unsigned long low, min;
unsigned long scan;
- unsigned long protection;
lruvec_size = lruvec_lru_size(lruvec, lru, sc->reclaim_idx);
- protection = mem_cgroup_protection(sc->target_mem_cgroup,
- memcg,
- sc->memcg_low_reclaim);
+ mem_cgroup_protection(sc->target_mem_cgroup, memcg,
+ &min, &low);
- if (protection) {
+ if (min || low) {
/*
* Scale a cgroup's reclaim pressure by proportioning
* its current usage to its memory.low or memory.min
@@ -2576,6 +2578,15 @@ out:
* hard protection.
*/
unsigned long cgroup_size = mem_cgroup_size(memcg);
+ unsigned long protection;
+
+ /* memory.low scaling, make sure we retry before OOM */
+ if (!sc->memcg_low_reclaim && low > min) {
+ protection = low;
+ sc->memcg_low_skipped = 1;
+ } else {
+ protection = min;
+ }
/* Avoid TOCTOU with earlier protection check */
cgroup_size = max(cgroup_size, protection);
_
Patches currently in -mm which might be from hannes(a)cmpxchg.org are
mm-remove-irqsave-restore-locking-from-contexts-with-irqs-enabled.patch
fs-drop_caches-fix-skipping-over-shadow-cache-inodes.patch
fs-inode-count-invalidated-shadow-pages-in-pginodesteal.patch
vfs-keep-inodes-with-page-cache-off-the-inode-shrinker-lru.patch
Commit ca6bfcb2f6d9 ("libata: Enable queued TRIM for Samsung SSD 860")
limited the existing ATA_HORKAGE_NO_NCQ_TRIM quirk from "Samsung SSD 8*",
covering all Samsung 800 series SSDs, to only apply to "Samsung SSD 840*"
and "Samsung SSD 850*" series based on information from Samsung.
But there is a large number of users which is still reporting issues
with the Samsung 860 and 870 SSDs combined with Intel, ASmedia or
Marvell SATA controllers and all reporters also report these problems
going away when disabling queued trims.
Note that with AMD SATA controllers users are reporting even worse
issues and only completely disabling NCQ helps there, this will be
addressed in a separate patch.
Fixes: ca6bfcb2f6d9 ("libata: Enable queued TRIM for Samsung SSD 860")
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=203475
Cc: stable(a)vger.kernel.org
Cc: Kate Hsuan <hpa(a)redhat.com>
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
---
drivers/ata/libata-core.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 61c762961ca8..3eda3291952b 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -3950,6 +3950,10 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
ATA_HORKAGE_ZERO_AFTER_TRIM, },
{ "Samsung SSD 850*", NULL, ATA_HORKAGE_NO_NCQ_TRIM |
ATA_HORKAGE_ZERO_AFTER_TRIM, },
+ { "Samsung SSD 860*", NULL, ATA_HORKAGE_NO_NCQ_TRIM |
+ ATA_HORKAGE_ZERO_AFTER_TRIM, },
+ { "Samsung SSD 870*", NULL, ATA_HORKAGE_NO_NCQ_TRIM |
+ ATA_HORKAGE_ZERO_AFTER_TRIM, },
{ "FCCT*M500*", NULL, ATA_HORKAGE_NO_NCQ_TRIM |
ATA_HORKAGE_ZERO_AFTER_TRIM, },
--
2.31.1
Backport for kernel 5.13
(cherry picked from commit ef486bf448a057a6e2d50e40ae879f7add6585da)
Commit b5efec00b671 ("powerpc/32s: Move KUEP locking/unlocking in C")
removed the 'isync' instruction after adding/removing NX bit in user
segments. The reasoning behind this change was that when setting the
NX bit we don't mind it taking effect with delay as the kernel never
executes text from userspace, and when clearing the NX bit this is
to return to userspace and then the 'rfi' should synchronise the
context.
However, it looks like on book3s/32 having a hash page table, at least
on the G3 processor, we get an unexpected fault from userspace, then
this is followed by something wrong in the verification of MSR_PR
at end of another interrupt.
This is fixed by adding back the removed isync() following update
of NX bit in user segment registers. Only do it for cores with an
hash table, as 603 cores don't exhibit that problem and the two isync
increase ./null_syscall selftest by 6 cycles on an MPC 832x.
First problem: unexpected WARN_ON() for mysterious PROTFAULT
WARNING: CPU: 0 PID: 1660 at arch/powerpc/mm/fault.c:354 do_page_fault+0x6c/0x5b0
Modules linked in:
CPU: 0 PID: 1660 Comm: Xorg Not tainted 5.13.0-pmac-00028-gb3c15b60339a #40
NIP: c001b5c8 LR: c001b6f8 CTR: 00000000
REGS: e2d09e40 TRAP: 0700 Not tainted (5.13.0-pmac-00028-gb3c15b60339a)
MSR: 00021032 <ME,IR,DR,RI> CR: 42d04f30 XER: 20000000
GPR00: c000424c e2d09f00 c301b680 e2d09f40 0000001e 42000000 00cba028 00000000
GPR08: 08000000 48000010 c301b680 e2d09f30 22d09f30 00c1fff0 00cba000 a7b7ba4c
GPR16: 00000031 00000000 00000000 00000000 00000000 00000000 a7b7b0d0 00c5c010
GPR24: a7b7b64c a7b7d2f0 00000004 00000000 c1efa6c0 00cba02c 00000300 e2d09f40
NIP [c001b5c8] do_page_fault+0x6c/0x5b0
LR [c001b6f8] do_page_fault+0x19c/0x5b0
Call Trace:
[e2d09f00] [e2d09f04] 0xe2d09f04 (unreliable)
[e2d09f30] [c000424c] DataAccess_virt+0xd4/0xe4
--- interrupt: 300 at 0xa7a261dc
NIP: a7a261dc LR: a7a253bc CTR: 00000000
REGS: e2d09f40 TRAP: 0300 Not tainted (5.13.0-pmac-00028-gb3c15b60339a)
MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 228428e2 XER: 20000000
DAR: 00cba02c DSISR: 42000000
GPR00: a7a27448 afa6b0e0 a74c35c0 a7b7b614 0000001e a7b7b614 00cba028 00000000
GPR08: 00020fd9 00000031 00cb9ff8 a7a273b0 220028e2 00c1fff0 00cba000 a7b7ba4c
GPR16: 00000031 00000000 00000000 00000000 00000000 00000000 a7b7b0d0 00c5c010
GPR24: a7b7b64c a7b7d2f0 00000004 00000002 0000001e a7b7b614 a7b7aff4 00000030
NIP [a7a261dc] 0xa7a261dc
LR [a7a253bc] 0xa7a253bc
--- interrupt: 300
Instruction dump:
7c4a1378 810300a0 75278410 83820298 83a300a4 553b018c 551e0036 4082038c
2e1b0000 40920228 75280800 41820220 <0fe00000> 3b600000 41920214 81420594
Second problem: MSR PR is seen unset allthough the interrupt frame shows it set
kernel BUG at arch/powerpc/kernel/interrupt.c:458!
Oops: Exception in kernel mode, sig: 5 [#1]
BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
Modules linked in:
CPU: 0 PID: 1660 Comm: Xorg Tainted: G W 5.13.0-pmac-00028-gb3c15b60339a #40
NIP: c0011434 LR: c001629c CTR: 00000000
REGS: e2d09e70 TRAP: 0700 Tainted: G W (5.13.0-pmac-00028-gb3c15b60339a)
MSR: 00029032 <EE,ME,IR,DR,RI> CR: 42d09f30 XER: 00000000
GPR00: 00000000 e2d09f30 c301b680 e2d09f40 83440000 c44d0e68 e2d09e8c 00000000
GPR08: 00000002 00dc228a 00004000 e2d09f30 22d09f30 00c1fff0 afa6ceb4 00c26144
GPR16: 00c25fb8 00c26140 afa6ceb8 90000000 00c944d8 0000001c 00000000 00200000
GPR24: 00000000 000001fb afa6d1b4 00000001 00000000 a539a2a0 a530fd80 00000089
NIP [c0011434] interrupt_exit_kernel_prepare+0x10/0x70
LR [c001629c] interrupt_return+0x9c/0x144
Call Trace:
[e2d09f30] [c000424c] DataAccess_virt+0xd4/0xe4 (unreliable)
--- interrupt: 300 at 0xa09be008
NIP: a09be008 LR: a09bdfe8 CTR: a09bdfc0
REGS: e2d09f40 TRAP: 0300 Tainted: G W (5.13.0-pmac-00028-gb3c15b60339a)
MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 420028e2 XER: 20000000
DAR: a539a308 DSISR: 0a000000
GPR00: a7b90d50 afa6b2d0 a74c35c0 a0a8b690 a0a8b698 a5365d70 a4fa82a8 00000004
GPR08: 00000000 a09bdfc0 00000000 a5360000 a09bde7c 00c1fff0 afa6ceb4 00c26144
GPR16: 00c25fb8 00c26140 afa6ceb8 90000000 00c944d8 0000001c 00000000 00200000
GPR24: 00000000 000001fb afa6d1b4 00000001 00000000 a539a2a0 a530fd80 00000089
NIP [a09be008] 0xa09be008
LR [a09bdfe8] 0xa09bdfe8
--- interrupt: 300
Instruction dump:
80010024 83e1001c 7c0803a6 4bffff80 3bc00800 4bffffd0 486b42fd 4bffffcc
81430084 71480002 41820038 554a0462 <0f0a0000> 80620060 74630001 40820034
Fixes: b5efec00b671 ("powerpc/32s: Move KUEP locking/unlocking in C")
Cc: stable(a)vger.kernel.org # v5.13+
Reported-by: Stan Johnson <userm57(a)yahoo.com>
Signed-off-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/4856f5574906e2aec0522be17bf3848a22b2cd0b.16292693…
---
arch/powerpc/mm/book3s32/kuep.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/arch/powerpc/mm/book3s32/kuep.c b/arch/powerpc/mm/book3s32/kuep.c
index 8ed1b8634839..7015bd489063 100644
--- a/arch/powerpc/mm/book3s32/kuep.c
+++ b/arch/powerpc/mm/book3s32/kuep.c
@@ -4,6 +4,7 @@
#include <asm/reg.h>
#include <asm/task_size_32.h>
#include <asm/mmu.h>
+#include <asm/synch.h>
#define KUEP_UPDATE_TWO_USER_SEGMENTS(n) do { \
if (TASK_SIZE > ((n) << 28)) \
@@ -32,9 +33,27 @@ static __always_inline void kuep_update(u32 val)
void kuep_lock(void)
{
kuep_update(mfsr(0) | SR_NX);
+ /*
+ * This isync() shouldn't be necessary as the kernel is not excepted to
+ * run any instruction in userspace soon after the update of segments,
+ * but hash based cores (at least G3) seem to exhibit a random
+ * behaviour when the 'isync' is not there. 603 cores don't have this
+ * behaviour so don't do the 'isync' as it saves several CPU cycles.
+ */
+ if (mmu_has_feature(MMU_FTR_HPTE_TABLE))
+ isync(); /* Context sync required after mtsr() */
}
void kuep_unlock(void)
{
kuep_update(mfsr(0) & ~SR_NX);
+ /*
+ * This isync() shouldn't be necessary as a 'rfi' will soon be executed
+ * to return to userspace, but hash based cores (at least G3) seem to
+ * exhibit a random behaviour when the 'isync' is not there. 603 cores
+ * don't have this behaviour so don't do the 'isync' as it saves several
+ * CPU cycles.
+ */
+ if (mmu_has_feature(MMU_FTR_HPTE_TABLE))
+ isync(); /* Context sync required after mtsr() */
}
--
2.25.0
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 527f721478bce3f49b513a733bacd19d6f34b08c
Gitweb: https://git.kernel.org/tip/527f721478bce3f49b513a733bacd19d6f34b08c
Author: Babu Moger <babu.moger(a)amd.com>
AuthorDate: Fri, 20 Aug 2021 16:52:42 -05:00
Committer: Borislav Petkov <bp(a)suse.de>
CommitterDate: Sun, 22 Aug 2021 09:11:29 +02:00
x86/resctrl: Fix a maybe-uninitialized build warning treated as error
The recent commit
064855a69003 ("x86/resctrl: Fix default monitoring groups reporting")
caused a RHEL build failure with an uninitialized variable warning
treated as an error because it removed the default case snippet.
The RHEL Makefile uses '-Werror=maybe-uninitialized' to force possibly
uninitialized variable warnings to be treated as errors. This is also
reported by smatch via the 0day robot.
The error from the RHEL build is:
arch/x86/kernel/cpu/resctrl/monitor.c: In function ‘__mon_event_count’:
arch/x86/kernel/cpu/resctrl/monitor.c:261:12: error: ‘m’ may be used
uninitialized in this function [-Werror=maybe-uninitialized]
m->chunks += chunks;
^~
The upstream Makefile does not build using '-Werror=maybe-uninitialized'.
So, the problem is not seen there. Fix the problem by putting back the
default case snippet.
[ bp: note that there's nothing wrong with the code and other compilers
do not trigger this warning - this is being done just so the RHEL compiler
is happy. ]
Fixes: 064855a69003 ("x86/resctrl: Fix default monitoring groups reporting")
Reported-by: Terry Bowman <Terry.Bowman(a)amd.com>
Reported-by: kernel test robot <lkp(a)intel.com>
Signed-off-by: Babu Moger <babu.moger(a)amd.com>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Reviewed-by: Reinette Chatre <reinette.chatre(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://lkml.kernel.org/r/162949631908.23903.17090272726012848523.stgit@bmo…
---
arch/x86/kernel/cpu/resctrl/monitor.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 57e4bb6..8caf871 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -304,6 +304,12 @@ static u64 __mon_event_count(u32 rmid, struct rmid_read *rr)
case QOS_L3_MBM_LOCAL_EVENT_ID:
m = &rr->d->mbm_local[rmid];
break;
+ default:
+ /*
+ * Code would never reach here because an invalid
+ * event id would fail the __rmid_read.
+ */
+ return RMID_VAL_ERROR;
}
if (rr->first) {
From: Joerg Roedel <jroedel(a)suse.de>
Commit 79419e13e808 ("x86/boot/compressed/64: Setup IDT in startup_32
boot path") introduced an IDT into the 32 bit boot path of the
decompressor stub. But the IDT is set up before ExitBootServices() is
called and some UEFI firmwares rely on their own IDT.
Save the firmware IDT on boot and restore it before calling into EFI
functions to fix boot failures introduced by above commit.
Reported-by: Fabio Aiuto <fabioaiuto83(a)gmail.com>
Fixes: 79419e13e808 ("x86/boot/compressed/64: Setup IDT in startup_32 boot path")
Cc: stable(a)vger.kernel.org # 5.13+
Acked-by: Ard Biesheuvel <ardb(a)kernel.org>
Signed-off-by: Joerg Roedel <jroedel(a)suse.de>
---
arch/x86/boot/compressed/efi_thunk_64.S | 30 +++++++++++++++++--------
arch/x86/boot/compressed/head_64.S | 3 +++
2 files changed, 24 insertions(+), 9 deletions(-)
diff --git a/arch/x86/boot/compressed/efi_thunk_64.S b/arch/x86/boot/compressed/efi_thunk_64.S
index 95a223b3e56a..8bb92e9f4e97 100644
--- a/arch/x86/boot/compressed/efi_thunk_64.S
+++ b/arch/x86/boot/compressed/efi_thunk_64.S
@@ -5,9 +5,8 @@
* Early support for invoking 32-bit EFI services from a 64-bit kernel.
*
* Because this thunking occurs before ExitBootServices() we have to
- * restore the firmware's 32-bit GDT before we make EFI service calls,
- * since the firmware's 32-bit IDT is still currently installed and it
- * needs to be able to service interrupts.
+ * restore the firmware's 32-bit GDT and IDT before we make EFI service
+ * calls.
*
* On the plus side, we don't have to worry about mangling 64-bit
* addresses into 32-bits because we're executing with an identity
@@ -39,7 +38,7 @@ SYM_FUNC_START(__efi64_thunk)
/*
* Convert x86-64 ABI params to i386 ABI
*/
- subq $32, %rsp
+ subq $64, %rsp
movl %esi, 0x0(%rsp)
movl %edx, 0x4(%rsp)
movl %ecx, 0x8(%rsp)
@@ -49,14 +48,19 @@ SYM_FUNC_START(__efi64_thunk)
leaq 0x14(%rsp), %rbx
sgdt (%rbx)
+ addq $16, %rbx
+ sidt (%rbx)
+
/*
- * Switch to gdt with 32-bit segments. This is the firmware GDT
- * that was installed when the kernel started executing. This
- * pointer was saved at the EFI stub entry point in head_64.S.
+ * Switch to IDT and GDT with 32-bit segments. This is the firmware GDT
+ * and IDT that was installed when the kernel started executing. The
+ * pointers were saved at the EFI stub entry point in head_64.S.
*
* Pass the saved DS selector to the 32-bit code, and use far return to
* restore the saved CS selector.
*/
+ leaq efi32_boot_idt(%rip), %rax
+ lidt (%rax)
leaq efi32_boot_gdt(%rip), %rax
lgdt (%rax)
@@ -67,7 +71,7 @@ SYM_FUNC_START(__efi64_thunk)
pushq %rax
lretq
-1: addq $32, %rsp
+1: addq $64, %rsp
movq %rdi, %rax
pop %rbx
@@ -128,10 +132,13 @@ SYM_FUNC_START_LOCAL(efi_enter32)
/*
* Some firmware will return with interrupts enabled. Be sure to
- * disable them before we switch GDTs.
+ * disable them before we switch GDTs and IDTs.
*/
cli
+ lidtl (%ebx)
+ subl $16, %ebx
+
lgdtl (%ebx)
movl %cr4, %eax
@@ -166,6 +173,11 @@ SYM_DATA_START(efi32_boot_gdt)
.quad 0
SYM_DATA_END(efi32_boot_gdt)
+SYM_DATA_START(efi32_boot_idt)
+ .word 0
+ .quad 0
+SYM_DATA_END(efi32_boot_idt)
+
SYM_DATA_START(efi32_boot_cs)
.word 0
SYM_DATA_END(efi32_boot_cs)
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index a2347ded77ea..572c535cf45b 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -319,6 +319,9 @@ SYM_INNER_LABEL(efi32_pe_stub_entry, SYM_L_LOCAL)
movw %cs, rva(efi32_boot_cs)(%ebp)
movw %ds, rva(efi32_boot_ds)(%ebp)
+ /* Store firmware IDT descriptor */
+ sidtl rva(efi32_boot_idt)(%ebp)
+
/* Disable paging */
movl %cr0, %eax
btrl $X86_CR0_PG_BIT, %eax
--
2.32.0
From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org>
The selftest for ftrace checks some features by checking if the README has
text that states the feature is supported by that kernel. Unfortunately,
this check gives false positives because it many not be checked if there's
spaces in the string to check. This is due to the compare between the
required variable with the ":README" string stripped, because neither has
quotes around them.
Link: https://lkml.kernel.org/r/20210820204742.087177341@goodmis.org
Cc: "Tzvetomir Stoyanov" <tz.stoyanov(a)gmail.com>
Cc: Tom Zanussi <zanussi(a)kernel.org>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Shuah Khan <skhan(a)linuxfoundation.org>
Cc: linux-kselftest(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Fixes: 1b8eec510ba64 ("selftests/ftrace: Support ":README" suffix for requires")
Acked-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
---
tools/testing/selftests/ftrace/test.d/functions | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/ftrace/test.d/functions b/tools/testing/selftests/ftrace/test.d/functions
index f68d336b961b..000fd05e84b1 100644
--- a/tools/testing/selftests/ftrace/test.d/functions
+++ b/tools/testing/selftests/ftrace/test.d/functions
@@ -137,7 +137,7 @@ check_requires() { # Check required files and tracers
echo "Required tracer $t is not configured."
exit_unsupported
fi
- elif [ $r != $i ]; then
+ elif [ "$r" != "$i" ]; then
if ! grep -Fq "$r" README ; then
echo "Required feature pattern \"$r\" is not in README."
exit_unsupported
--
2.30.2
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 3c9a70b027b0 - kyber: make trace_block_rq call consistent with documentation
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
⚡⚡⚡ Reboot test
⚡⚡⚡ ACPI table test
⚡⚡⚡ ACPI enabled test
⚡⚡⚡ LTP
⚡⚡⚡ CIFS Connectathon
⚡⚡⚡ POSIX pjd-fstest suites
⚡⚡⚡ Loopdev Sanity
⚡⚡⚡ jvm - jcstress tests
⚡⚡⚡ Memory: fork_mem
⚡⚡⚡ Memory function: memfd_create
⚡⚡⚡ AMTU (Abstract Machine Test Utility)
⚡⚡⚡ Networking bridge: sanity
⚡⚡⚡ Ethernet drivers sanity
⚡⚡⚡ Networking socket: fuzz
⚡⚡⚡ Networking: igmp conformance test
⚡⚡⚡ Networking route: pmtu
⚡⚡⚡ Networking route_func - local
⚡⚡⚡ Networking route_func - forward
⚡⚡⚡ Networking TCP: keepalive test
⚡⚡⚡ Networking UDP: socket
⚡⚡⚡ Networking cki netfilter test
⚡⚡⚡ Networking tunnel: geneve basic test
⚡⚡⚡ Networking tunnel: gre basic
⚡⚡⚡ L2TP basic test
⚡⚡⚡ Networking tunnel: vxlan basic
⚡⚡⚡ Networking ipsec: basic netns - transport
⚡⚡⚡ Networking ipsec: basic netns - tunnel
⚡⚡⚡ Libkcapi AF_ALG test
⚡⚡⚡ pciutils: update pci ids test
⚡⚡⚡ ALSA PCM loopback test
⚡⚡⚡ ALSA Control (mixer) Userspace Element test
⚡⚡⚡ storage: SCSI VPD
⚡⚡⚡ trace: ftrace/tracer
🚧 ⚡⚡⚡ xarray-idr-radixtree-test
🚧 ⚡⚡⚡ i2c: i2cdetect sanity
🚧 ⚡⚡⚡ NFS Connectathon
🚧 ⚡⚡⚡ Firmware test suite
🚧 ⚡⚡⚡ Memory function: kaslr
🚧 ⚡⚡⚡ audit: audit testsuite test
🚧 ⚡⚡⚡ lvm cache test
🚧 ⚡⚡⚡ lvm snapper test
Host 2:
✅ Boot test
✅ Reboot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ Storage blktests
✅ storage: software RAID testing
✅ Storage: swraid mdadm raid_module test
🚧 ❌ Podman system integration test - as root
🚧 ✅ Podman system integration test - as user
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage block - filesystem fio test
🚧 ✅ Storage block - queue scheduler test
🚧 ✅ Storage nvme - tcp
🚧 💥 stress: stress-ng
Host 3:
✅ Boot test
✅ Reboot test
✅ ACPI table test
✅ ACPI enabled test
✅ LTP
✅ CIFS Connectathon
✅ POSIX pjd-fstest suites
✅ Loopdev Sanity
✅ jvm - jcstress tests
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking cki netfilter test
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
✅ trace: ftrace/tracer
🚧 ✅ xarray-idr-radixtree-test
🚧 ✅ i2c: i2cdetect sanity
🚧 ✅ NFS Connectathon
🚧 ✅ Firmware test suite
🚧 ✅ Memory function: kaslr
🚧 ✅ audit: audit testsuite test
🚧 ✅ lvm cache test
🚧 ✅ lvm snapper test
ppc64le:
Host 1:
✅ Boot test
✅ Reboot test
✅ LTP
✅ CIFS Connectathon
✅ POSIX pjd-fstest suites
✅ Loopdev Sanity
✅ jvm - jcstress tests
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking cki netfilter test
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ trace: ftrace/tracer
🚧 ✅ xarray-idr-radixtree-test
🚧 ✅ NFS Connectathon
🚧 ✅ Memory function: kaslr
🚧 ✅ audit: audit testsuite test
🚧 ✅ lvm cache test
🚧 ✅ lvm snapper test
Host 2:
✅ Boot test
✅ Reboot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ Storage blktests
✅ storage: software RAID testing
✅ Storage: swraid mdadm raid_module test
🚧 ✅ Podman system integration test - as root
🚧 ✅ Podman system integration test - as user
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage block - filesystem fio test
🚧 ✅ Storage block - queue scheduler test
🚧 ✅ Storage nvme - tcp
🚧 ✅ Storage: lvm device-mapper test - upstream
s390x:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ Reboot test
✅ selinux-policy: serge-testsuite
✅ Storage blktests
✅ Storage: swraid mdadm raid_module test
🚧 ✅ Podman system integration test - as root
🚧 ✅ Podman system integration test - as user
🚧 ✅ Storage nvme - tcp
🚧 ⚡⚡⚡ stress: stress-ng
Host 2:
✅ Boot test
✅ Reboot test
✅ LTP
✅ CIFS Connectathon
✅ POSIX pjd-fstest suites
✅ Loopdev Sanity
✅ jvm - jcstress tests
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking cki netfilter test
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ trace: ftrace/tracer
🚧 ✅ xarray-idr-radixtree-test
🚧 ✅ NFS Connectathon
🚧 ✅ Memory function: kaslr
🚧 ✅ audit: audit testsuite test
🚧 ✅ lvm cache test
🚧 ✅ lvm snapper test
x86_64:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ Reboot test
⚡⚡⚡ xfstests - ext4
⚡⚡⚡ xfstests - xfs
⚡⚡⚡ xfstests - nfsv4.2
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ power-management: cpupower/sanity test
⚡⚡⚡ Storage blktests
⚡⚡⚡ storage: software RAID testing
⚡⚡⚡ Storage: swraid mdadm raid_module test
🚧 ⚡⚡⚡ Podman system integration test - as root
🚧 ⚡⚡⚡ Podman system integration test - as user
🚧 ⚡⚡⚡ CPU: Idle Test
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ xfstests - cifsv3.11
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
🚧 ⚡⚡⚡ Storage: lvm device-mapper test - upstream
🚧 ⚡⚡⚡ stress: stress-ng
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ Reboot test
✅ ACPI table test
⚡⚡⚡ LTP
⚡⚡⚡ CIFS Connectathon
⚡⚡⚡ POSIX pjd-fstest suites
⚡⚡⚡ Loopdev Sanity
⚡⚡⚡ jvm - jcstress tests
⚡⚡⚡ Memory: fork_mem
⚡⚡⚡ Memory function: memfd_create
⚡⚡⚡ AMTU (Abstract Machine Test Utility)
⚡⚡⚡ Networking bridge: sanity
⚡⚡⚡ Ethernet drivers sanity
⚡⚡⚡ Networking socket: fuzz
⚡⚡⚡ Networking: igmp conformance test
⚡⚡⚡ Networking route: pmtu
⚡⚡⚡ Networking route_func - local
⚡⚡⚡ Networking route_func - forward
⚡⚡⚡ Networking TCP: keepalive test
⚡⚡⚡ Networking UDP: socket
⚡⚡⚡ Networking cki netfilter test
⚡⚡⚡ Networking tunnel: geneve basic test
⚡⚡⚡ Networking tunnel: gre basic
⚡⚡⚡ L2TP basic test
⚡⚡⚡ Networking tunnel: vxlan basic
⚡⚡⚡ Networking ipsec: basic netns - transport
⚡⚡⚡ Networking ipsec: basic netns - tunnel
⚡⚡⚡ Libkcapi AF_ALG test
⚡⚡⚡ pciutils: sanity smoke test
⚡⚡⚡ pciutils: update pci ids test
⚡⚡⚡ ALSA PCM loopback test
⚡⚡⚡ ALSA Control (mixer) Userspace Element test
⚡⚡⚡ storage: SCSI VPD
⚡⚡⚡ trace: ftrace/tracer
🚧 ⚡⚡⚡ xarray-idr-radixtree-test
🚧 ⚡⚡⚡ i2c: i2cdetect sanity
🚧 ⚡⚡⚡ NFS Connectathon
🚧 ⚡⚡⚡ Firmware test suite
🚧 ⚡⚡⚡ Memory function: kaslr
🚧 ⚡⚡⚡ audit: audit testsuite test
🚧 ⚡⚡⚡ lvm cache test
🚧 ⚡⚡⚡ lvm snapper test
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ Reboot test
⚡⚡⚡ xfstests - ext4
⚡⚡⚡ xfstests - xfs
⚡⚡⚡ xfstests - nfsv4.2
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ power-management: cpupower/sanity test
⚡⚡⚡ Storage blktests
⚡⚡⚡ storage: software RAID testing
⚡⚡⚡ Storage: swraid mdadm raid_module test
🚧 ⚡⚡⚡ Podman system integration test - as root
🚧 ⚡⚡⚡ Podman system integration test - as user
🚧 ⚡⚡⚡ CPU: Idle Test
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ xfstests - cifsv3.11
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
🚧 ⚡⚡⚡ Storage: lvm device-mapper test - upstream
🚧 ⚡⚡⚡ stress: stress-ng
Host 4:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ Reboot test
⚡⚡⚡ xfstests - ext4
⚡⚡⚡ xfstests - xfs
⚡⚡⚡ xfstests - nfsv4.2
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ power-management: cpupower/sanity test
⚡⚡⚡ Storage blktests
⚡⚡⚡ storage: software RAID testing
⚡⚡⚡ Storage: swraid mdadm raid_module test
🚧 ⚡⚡⚡ Podman system integration test - as root
🚧 ⚡⚡⚡ Podman system integration test - as user
🚧 ⚡⚡⚡ CPU: Idle Test
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ xfstests - cifsv3.11
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
🚧 ⚡⚡⚡ Storage: lvm device-mapper test - upstream
🚧 ⚡⚡⚡ stress: stress-ng
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org>
The selftest for ftrace checks some features by checking if the README has
text that states the feature is supported by that kernel. Unfortunately,
this check gives false positives because it many not be checked if there's
spaces in the string to check. This is due to the compare between the
required variable with the ":README" string stripped, because neither has
quotes around them.
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Shuah Khan <skhan(a)linuxfoundation.org>
Cc: linux-kselftest(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Fixes: 1b8eec510ba64 ("selftests/ftrace: Support ":README" suffix for requires")
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
---
tools/testing/selftests/ftrace/test.d/functions | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/ftrace/test.d/functions b/tools/testing/selftests/ftrace/test.d/functions
index f68d336b961b..000fd05e84b1 100644
--- a/tools/testing/selftests/ftrace/test.d/functions
+++ b/tools/testing/selftests/ftrace/test.d/functions
@@ -137,7 +137,7 @@ check_requires() { # Check required files and tracers
echo "Required tracer $t is not configured."
exit_unsupported
fi
- elif [ $r != $i ]; then
+ elif [ "$r" != "$i" ]; then
if ! grep -Fq "$r" README ; then
echo "Required feature pattern \"$r\" is not in README."
exit_unsupported
--
2.30.2
Hi
Tim Connors, a user in Debian (https://bugs.debian.org/992121)
reported that cca342d98bef ("Bluetooth: hidp: use correct wait queue
when removing ctrl_wait") from 5.11-rc1 is missing for the 5.10.y
stable series.
Actually would apply further back, but it was only tested to fix the
problem on 5.10.y
Thanks for considering,
Regards,
Salvatore
We've had CONFIG_MANDATORY_FILE_LOCKING since 2015 and a lot of distros
have disabled it. Warn the stragglers that still use "-o mand" that
we'll be dropping support for that mount option.
Cc: stable(a)vger.kernel.org
Signed-off-by: Jeff Layton <jlayton(a)kernel.org>
---
fs/namespace.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/fs/namespace.c b/fs/namespace.c
index ab4174a3c802..2279473d0d6f 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1716,8 +1716,12 @@ static inline bool may_mount(void)
}
#ifdef CONFIG_MANDATORY_FILE_LOCKING
-static inline bool may_mandlock(void)
+static bool may_mandlock(void)
{
+ pr_warn_once("======================================================\n"
+ "WARNING: the mand mount option is being deprecated and\n"
+ " will be removed in v5.15!\n"
+ "======================================================\n");
return capable(CAP_SYS_ADMIN);
}
#else
--
2.31.1
We've had CONFIG_MANDATORY_FILE_LOCKING since 2015 and a lot of distros
have disabled it. Warn the stragglers that still use "-o mand" that
we'll be dropping support for that mount option.
Cc: stable(a)vger.kernel.org
Signed-off-by: Jeff Layton <jlayton(a)kernel.org>
---
fs/namespace.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/fs/namespace.c b/fs/namespace.c
index ab4174a3c802..ffab0bb1e649 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1716,8 +1716,16 @@ static inline bool may_mount(void)
}
#ifdef CONFIG_MANDATORY_FILE_LOCKING
+static bool warned_mand;
static inline bool may_mandlock(void)
{
+ if (!warned_mand) {
+ warned_mand = true;
+ pr_warn("======================================================\n");
+ pr_warn("WARNING: the mand mount option is being deprecated and\n");
+ pr_warn(" will be removed in v5.15!\n");
+ pr_warn("======================================================\n");
+ }
return capable(CAP_SYS_ADMIN);
}
#else
--
2.31.1
From: Joerg Roedel <jroedel(a)suse.de>
Commit 79419e13e808 ("x86/boot/compressed/64: Setup IDT in startup_32
boot path") introduced an IDT into the 32 bit boot path of the
decompressor stub. But the IDT is set up before ExitBootServices() is
called and some UEFI firmwares rely on their own IDT.
Save the firmware IDT on boot and restore it before calling into EFI
functions to fix boot failures introduced by above commit.
Reported-by: Fabio Aiuto <fabioaiuto83(a)gmail.com>
Fixes: 79419e13e808 ("x86/boot/compressed/64: Setup IDT in startup_32 boot path")
Cc: stable(a)vger.kernel.org # 5.13+
Signed-off-by: Joerg Roedel <jroedel(a)suse.de>
---
arch/x86/boot/compressed/efi_thunk_64.S | 23 ++++++++++++++++++-----
arch/x86/boot/compressed/head_64.S | 3 +++
2 files changed, 21 insertions(+), 5 deletions(-)
diff --git a/arch/x86/boot/compressed/efi_thunk_64.S b/arch/x86/boot/compressed/efi_thunk_64.S
index 95a223b3e56a..99cfd5dea23c 100644
--- a/arch/x86/boot/compressed/efi_thunk_64.S
+++ b/arch/x86/boot/compressed/efi_thunk_64.S
@@ -39,7 +39,7 @@ SYM_FUNC_START(__efi64_thunk)
/*
* Convert x86-64 ABI params to i386 ABI
*/
- subq $32, %rsp
+ subq $64, %rsp
movl %esi, 0x0(%rsp)
movl %edx, 0x4(%rsp)
movl %ecx, 0x8(%rsp)
@@ -49,14 +49,19 @@ SYM_FUNC_START(__efi64_thunk)
leaq 0x14(%rsp), %rbx
sgdt (%rbx)
+ addq $16, %rbx
+ sidt (%rbx)
+
/*
- * Switch to gdt with 32-bit segments. This is the firmware GDT
- * that was installed when the kernel started executing. This
- * pointer was saved at the EFI stub entry point in head_64.S.
+ * Switch to idt and gdt with 32-bit segments. This is the firmware GDT
+ * and IDT that was installed when the kernel started executing. The
+ * pointers were saved at the EFI stub entry point in head_64.S.
*
* Pass the saved DS selector to the 32-bit code, and use far return to
* restore the saved CS selector.
*/
+ leaq efi32_boot_idt(%rip), %rax
+ lidt (%rax)
leaq efi32_boot_gdt(%rip), %rax
lgdt (%rax)
@@ -67,7 +72,7 @@ SYM_FUNC_START(__efi64_thunk)
pushq %rax
lretq
-1: addq $32, %rsp
+1: addq $64, %rsp
movq %rdi, %rax
pop %rbx
@@ -132,6 +137,9 @@ SYM_FUNC_START_LOCAL(efi_enter32)
*/
cli
+ lidtl (%ebx)
+ subl $16, %ebx
+
lgdtl (%ebx)
movl %cr4, %eax
@@ -166,6 +174,11 @@ SYM_DATA_START(efi32_boot_gdt)
.quad 0
SYM_DATA_END(efi32_boot_gdt)
+SYM_DATA_START(efi32_boot_idt)
+ .word 0
+ .quad 0
+SYM_DATA_END(efi32_boot_idt)
+
SYM_DATA_START(efi32_boot_cs)
.word 0
SYM_DATA_END(efi32_boot_cs)
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index a2347ded77ea..572c535cf45b 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -319,6 +319,9 @@ SYM_INNER_LABEL(efi32_pe_stub_entry, SYM_L_LOCAL)
movw %cs, rva(efi32_boot_cs)(%ebp)
movw %ds, rva(efi32_boot_ds)(%ebp)
+ /* Store firmware IDT descriptor */
+ sidtl rva(efi32_boot_idt)(%ebp)
+
/* Disable paging */
movl %cr0, %eax
btrl $X86_CR0_PG_BIT, %eax
--
2.32.0
Only TDs with status TD_CLEARING_CACHE will be given back after
cache is cleared with a set TR deq command.
xhci_invalidate_cached_td() failed to set the TD_CLEARING_CACHE status
for some cancelled TDs as it assumed an endpoint only needs to clear the
TD it stopped on.
This isn't always true. For example with streams enabled an endpoint may
have several stream rings, each stopping on a different TDs.
Note that if an endpoint has several stream rings, the current code
will still only clear the cache of the stream pointed to by the last
cancelled TD in the cancel list.
This patch only focus on making sure all canceled TDs are given back,
avoiding hung task after device removal.
Another fix to solve clearing the caches of all stream rings with
cancelled TDs is needed, but not as urgent.
This issue was simultanously discovered and debugged by
by Tao Wang, with a slightly different fix proposal.
Fixes: 674f8438c121 ("xhci: split handling halted endpoints into two steps")
Cc: <stable(a)vger.kernel.org> #5.12
Reported-by: Tao Wang <wat(a)codeaurora.org>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
---
drivers/usb/host/xhci-ring.c | 40 ++++++++++++++++++++++--------------
1 file changed, 25 insertions(+), 15 deletions(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index d0faa67a689d..9017986241f5 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -942,17 +942,21 @@ static int xhci_invalidate_cancelled_tds(struct xhci_virt_ep *ep)
td->urb->stream_id);
hw_deq &= ~0xf;
- if (td->cancel_status == TD_HALTED) {
- cached_td = td;
- } else if (trb_in_td(xhci, td->start_seg, td->first_trb,
- td->last_trb, hw_deq, false)) {
+ if (td->cancel_status == TD_HALTED ||
+ trb_in_td(xhci, td->start_seg, td->first_trb, td->last_trb, hw_deq, false)) {
switch (td->cancel_status) {
case TD_CLEARED: /* TD is already no-op */
case TD_CLEARING_CACHE: /* set TR deq command already queued */
break;
case TD_DIRTY: /* TD is cached, clear it */
case TD_HALTED:
- /* FIXME stream case, several stopped rings */
+ td->cancel_status = TD_CLEARING_CACHE;
+ if (cached_td)
+ /* FIXME stream case, several stopped rings */
+ xhci_dbg(xhci,
+ "Move dq past stream %u URB %p instead of stream %u URB %p\n",
+ td->urb->stream_id, td->urb,
+ cached_td->urb->stream_id, cached_td->urb);
cached_td = td;
break;
}
@@ -961,18 +965,24 @@ static int xhci_invalidate_cancelled_tds(struct xhci_virt_ep *ep)
td->cancel_status = TD_CLEARED;
}
}
- if (cached_td) {
- cached_td->cancel_status = TD_CLEARING_CACHE;
- err = xhci_move_dequeue_past_td(xhci, slot_id, ep->ep_index,
- cached_td->urb->stream_id,
- cached_td);
- /* Failed to move past cached td, try just setting it noop */
- if (err) {
- td_to_noop(xhci, ring, cached_td, false);
- cached_td->cancel_status = TD_CLEARED;
+ /* If there's no need to move the dequeue pointer then we're done */
+ if (!cached_td)
+ return 0;
+
+ err = xhci_move_dequeue_past_td(xhci, slot_id, ep->ep_index,
+ cached_td->urb->stream_id,
+ cached_td);
+ if (err) {
+ /* Failed to move past cached td, just set cached TDs to no-op */
+ list_for_each_entry_safe(td, tmp_td, &ep->cancelled_td_list, cancelled_td_list) {
+ if (td->cancel_status != TD_CLEARING_CACHE)
+ continue;
+ xhci_dbg(xhci, "Failed to clear cancelled cached URB %p, mark clear anyway\n",
+ td->urb);
+ td_to_noop(xhci, ring, td, false);
+ td->cancel_status = TD_CLEARED;
}
- cached_td = NULL;
}
return 0;
}
--
2.25.1
We can't depend on the TRB's HWO bit to determine if the TRB ring is
"full". A TRB is only available when the driver had processed it, not
when the controller consumed and relinquished the TRB's ownership to the
driver. Otherwise, the driver may overwrite unprocessed TRBs. This can
happen when many transfer events accumulate and the system is slow to
process them and/or when there are too many small requests.
If a request is in the started_list, that means there is one or more
unprocessed TRBs remained. Check this instead of the TRB's HWO bit
whether the TRB ring is full.
Cc: <stable(a)vger.kernel.org>
Fixes: c4233573f6ee ("usb: dwc3: gadget: prepare TRBs on update transfers too")
Signed-off-by: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
---
drivers/usb/dwc3/gadget.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 84fe57ef5a49..1e6ddbc986ba 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -940,19 +940,19 @@ static struct dwc3_trb *dwc3_ep_prev_trb(struct dwc3_ep *dep, u8 index)
static u32 dwc3_calc_trbs_left(struct dwc3_ep *dep)
{
- struct dwc3_trb *tmp;
u8 trbs_left;
/*
- * If enqueue & dequeue are equal than it is either full or empty.
- *
- * One way to know for sure is if the TRB right before us has HWO bit
- * set or not. If it has, then we're definitely full and can't fit any
- * more transfers in our ring.
+ * If the enqueue & dequeue are equal then the TRB ring is either full
+ * or empty. It's considered full when there are DWC3_TRB_NUM-1 of TRBs
+ * pending to be processed by the driver.
*/
if (dep->trb_enqueue == dep->trb_dequeue) {
- tmp = dwc3_ep_prev_trb(dep, dep->trb_enqueue);
- if (tmp->ctrl & DWC3_TRB_CTRL_HWO)
+ /*
+ * If there is any request remained in the started_list at
+ * this point, that means there is no TRB available.
+ */
+ if (!list_empty(&dep->started_list))
return 0;
return DWC3_TRB_NUM - 1;
base-commit: 5571ea3117ca22849072adb58074fb5a2fd12c00
--
2.28.0
Limit the number of files in io_uring fixed tables by RLIMIT_NOFILE,
that's the first and the simpliest restriction that we should impose.
Cc: stable(a)vger.kernel.org
Suggested-by: Jens Axboe <axboe(a)kernel.dk>
Signed-off-by: Pavel Begunkov <asml.silence(a)gmail.com>
---
fs/io_uring.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 30edc329d803..e6301d5d03a8 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -7730,6 +7730,8 @@ static int io_sqe_files_register(struct io_ring_ctx *ctx, void __user *arg,
return -EINVAL;
if (nr_args > IORING_MAX_FIXED_FILES)
return -EMFILE;
+ if (nr_args > rlimit(RLIMIT_NOFILE))
+ return -EMFILE;
ret = io_rsrc_node_switch_start(ctx);
if (ret)
return ret;
--
2.32.0
When a mapped level interrupt (a timer, for example) is deactivated
by the guest, the corresponding host interrupt is equally deactivated.
However, the fate of the pending state still needs to be dealt
with in SW.
This is specially true when the interrupt was in the active+pending
state in the virtual distributor at the point where the guest
was entered. On exit, the pending state is potentially stale
(the guest may have put the interrupt in a non-pending state).
If we don't do anything, the interrupt will be spuriously injected
in the guest. Although this shouldn't have any ill effect (spurious
interrupts are always possible), we can improve the emulation by
detecting the deactivation-while-pending case and resample the
interrupt.
While we're at it, move the logic into a common helper that can
be shared between the two GIC implementations.
Fixes: e40cc57bac79 ("KVM: arm/arm64: vgic: Support level-triggered mapped interrupts")
Reported-by: Raghavendra Rao Ananta <rananta(a)google.com>
Tested-by: Raghavendra Rao Ananta <rananta(a)google.com>
Reviewed-by: Oliver Upton <oupton(a)google.com>
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Cc: stable(a)vger.kernel.org
---
Notes:
* From v1:
- Moved the resampling into a separate helper, and pointed both
GIC implementations to it.
- Collected TB, RB, with thanks.
arch/arm64/kvm/vgic/vgic-v2.c | 36 +++++----------------------------
arch/arm64/kvm/vgic/vgic-v3.c | 36 +++++----------------------------
arch/arm64/kvm/vgic/vgic.c | 38 +++++++++++++++++++++++++++++++++++
arch/arm64/kvm/vgic/vgic.h | 2 ++
4 files changed, 50 insertions(+), 62 deletions(-)
diff --git a/arch/arm64/kvm/vgic/vgic-v2.c b/arch/arm64/kvm/vgic/vgic-v2.c
index 2c580204f1dc..95a18cec14a3 100644
--- a/arch/arm64/kvm/vgic/vgic-v2.c
+++ b/arch/arm64/kvm/vgic/vgic-v2.c
@@ -60,6 +60,7 @@ void vgic_v2_fold_lr_state(struct kvm_vcpu *vcpu)
u32 val = cpuif->vgic_lr[lr];
u32 cpuid, intid = val & GICH_LR_VIRTUALID;
struct vgic_irq *irq;
+ bool deactivated;
/* Extract the source vCPU id from the LR */
cpuid = val & GICH_LR_PHYSID_CPUID;
@@ -75,7 +76,8 @@ void vgic_v2_fold_lr_state(struct kvm_vcpu *vcpu)
raw_spin_lock(&irq->irq_lock);
- /* Always preserve the active bit */
+ /* Always preserve the active bit, note deactivation */
+ deactivated = irq->active && !(val & GICH_LR_ACTIVE_BIT);
irq->active = !!(val & GICH_LR_ACTIVE_BIT);
if (irq->active && vgic_irq_is_sgi(intid))
@@ -96,36 +98,8 @@ void vgic_v2_fold_lr_state(struct kvm_vcpu *vcpu)
if (irq->config == VGIC_CONFIG_LEVEL && !(val & GICH_LR_STATE))
irq->pending_latch = false;
- /*
- * Level-triggered mapped IRQs are special because we only
- * observe rising edges as input to the VGIC.
- *
- * If the guest never acked the interrupt we have to sample
- * the physical line and set the line level, because the
- * device state could have changed or we simply need to
- * process the still pending interrupt later.
- *
- * If this causes us to lower the level, we have to also clear
- * the physical active state, since we will otherwise never be
- * told when the interrupt becomes asserted again.
- *
- * Another case is when the interrupt requires a helping hand
- * on deactivation (no HW deactivation, for example).
- */
- if (vgic_irq_is_mapped_level(irq)) {
- bool resample = false;
-
- if (val & GICH_LR_PENDING_BIT) {
- irq->line_level = vgic_get_phys_line_level(irq);
- resample = !irq->line_level;
- } else if (vgic_irq_needs_resampling(irq) &&
- !(irq->active || irq->pending_latch)) {
- resample = true;
- }
-
- if (resample)
- vgic_irq_set_phys_active(irq, false);
- }
+ /* Handle resampling for mapped interrupts if required */
+ vgic_irq_handle_resampling(irq, deactivated, val & GICH_LR_PENDING_BIT);
raw_spin_unlock(&irq->irq_lock);
vgic_put_irq(vcpu->kvm, irq);
diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
index 66004f61cd83..21a6207fb2ee 100644
--- a/arch/arm64/kvm/vgic/vgic-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-v3.c
@@ -46,6 +46,7 @@ void vgic_v3_fold_lr_state(struct kvm_vcpu *vcpu)
u32 intid, cpuid;
struct vgic_irq *irq;
bool is_v2_sgi = false;
+ bool deactivated;
cpuid = val & GICH_LR_PHYSID_CPUID;
cpuid >>= GICH_LR_PHYSID_CPUID_SHIFT;
@@ -68,7 +69,8 @@ void vgic_v3_fold_lr_state(struct kvm_vcpu *vcpu)
raw_spin_lock(&irq->irq_lock);
- /* Always preserve the active bit */
+ /* Always preserve the active bit, note deactivation */
+ deactivated = irq->active && !(val & ICH_LR_ACTIVE_BIT);
irq->active = !!(val & ICH_LR_ACTIVE_BIT);
if (irq->active && is_v2_sgi)
@@ -89,36 +91,8 @@ void vgic_v3_fold_lr_state(struct kvm_vcpu *vcpu)
if (irq->config == VGIC_CONFIG_LEVEL && !(val & ICH_LR_STATE))
irq->pending_latch = false;
- /*
- * Level-triggered mapped IRQs are special because we only
- * observe rising edges as input to the VGIC.
- *
- * If the guest never acked the interrupt we have to sample
- * the physical line and set the line level, because the
- * device state could have changed or we simply need to
- * process the still pending interrupt later.
- *
- * If this causes us to lower the level, we have to also clear
- * the physical active state, since we will otherwise never be
- * told when the interrupt becomes asserted again.
- *
- * Another case is when the interrupt requires a helping hand
- * on deactivation (no HW deactivation, for example).
- */
- if (vgic_irq_is_mapped_level(irq)) {
- bool resample = false;
-
- if (val & ICH_LR_PENDING_BIT) {
- irq->line_level = vgic_get_phys_line_level(irq);
- resample = !irq->line_level;
- } else if (vgic_irq_needs_resampling(irq) &&
- !(irq->active || irq->pending_latch)) {
- resample = true;
- }
-
- if (resample)
- vgic_irq_set_phys_active(irq, false);
- }
+ /* Handle resampling for mapped interrupts if required */
+ vgic_irq_handle_resampling(irq, deactivated, val & ICH_LR_PENDING_BIT);
raw_spin_unlock(&irq->irq_lock);
vgic_put_irq(vcpu->kvm, irq);
diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
index 111bff47e471..42a6ac78fe95 100644
--- a/arch/arm64/kvm/vgic/vgic.c
+++ b/arch/arm64/kvm/vgic/vgic.c
@@ -1022,3 +1022,41 @@ bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, unsigned int vintid)
return map_is_active;
}
+
+/*
+ * Level-triggered mapped IRQs are special because we only observe rising
+ * edges as input to the VGIC.
+ *
+ * If the guest never acked the interrupt we have to sample the physical
+ * line and set the line level, because the device state could have changed
+ * or we simply need to process the still pending interrupt later.
+ *
+ * We could also have entered the guest with the interrupt active+pending.
+ * On the next exit, we need to re-evaluate the pending state, as it could
+ * otherwise result in a spurious interrupt by injecting a now potentially
+ * stale pending state.
+ *
+ * If this causes us to lower the level, we have to also clear the physical
+ * active state, since we will otherwise never be told when the interrupt
+ * becomes asserted again.
+ *
+ * Another case is when the interrupt requires a helping hand on
+ * deactivation (no HW deactivation, for example).
+ */
+void vgic_irq_handle_resampling(struct vgic_irq *irq,
+ bool lr_deactivated, bool lr_pending)
+{
+ if (vgic_irq_is_mapped_level(irq)) {
+ bool resample = false;
+
+ if (unlikely(vgic_irq_needs_resampling(irq))) {
+ resample = !(irq->active || irq->pending_latch);
+ } else if (lr_pending || (lr_deactivated && irq->line_level)) {
+ irq->line_level = vgic_get_phys_line_level(irq);
+ resample = !irq->line_level;
+ }
+
+ if (resample)
+ vgic_irq_set_phys_active(irq, false);
+ }
+}
diff --git a/arch/arm64/kvm/vgic/vgic.h b/arch/arm64/kvm/vgic/vgic.h
index dc1f3d1657ee..14a9218641f5 100644
--- a/arch/arm64/kvm/vgic/vgic.h
+++ b/arch/arm64/kvm/vgic/vgic.h
@@ -169,6 +169,8 @@ void vgic_irq_set_phys_active(struct vgic_irq *irq, bool active);
bool vgic_queue_irq_unlock(struct kvm *kvm, struct vgic_irq *irq,
unsigned long flags);
void vgic_kick_vcpus(struct kvm *kvm);
+void vgic_irq_handle_resampling(struct vgic_irq *irq,
+ bool lr_deactivated, bool lr_pending);
int vgic_check_ioaddr(struct kvm *kvm, phys_addr_t *ioaddr,
phys_addr_t addr, phys_addr_t alignment);
--
2.30.2
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: hugetlb: don't pass page cache pages to restore_reserve_on_error
syzbot hit kernel BUG at fs/hugetlbfs/inode.c:532 as described in [1].
This BUG triggers if the HPageRestoreReserve flag is set on a page in
the page cache. It should never be set, as the routine
huge_add_to_page_cache explicitly clears the flag after adding a page
to the cache.
The only code other than huge page allocation which sets the flag is
restore_reserve_on_error. It will potentially set the flag in rare out
of memory conditions. syzbot was injecting errors to cause memory
allocation errors which exercised this specific path.
The code in restore_reserve_on_error is doing the right thing. However,
there are instances where pages in the page cache were being passed to
restore_reserve_on_error. This is incorrect, as once a page goes into
the cache reservation information will not be modified for the page until
it is removed from the cache. Error paths do not remove pages from the
cache, so even in the case of error, the page will remain in the cache
and no reservation adjustment is needed.
Modify routines that potentially call restore_reserve_on_error with a
page cache page to no longer do so.
Note on fixes tag:
Prior to commit 846be08578ed ("mm/hugetlb: expand restore_reserve_on_error
functionality") the routine would not process page cache pages because
the HPageRestoreReserve flag is not set on such pages. Therefore, this
issue could not be trigggered. The code added by commit 846be08578ed
("mm/hugetlb: expand restore_reserve_on_error functionality") is needed
and correct. It exposed incorrect calls to restore_reserve_on_error which
is the root cause addressed by this commit.
[1] https://lore.kernel.org/linux-mm/00000000000050776d05c9b7c7f0@google.com/
Link: https://lkml.kernel.org/r/20210818213304.37038-1-mike.kravetz@oracle.com
Fixes: 846be08578ed ("mm/hugetlb: expand restore_reserve_on_error functionality")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reported-by: <syzbot+67654e51e54455f1c585(a)syzkaller.appspotmail.com>
Cc: Mina Almasry <almasrymina(a)google.com>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Naoya Horiguchi <naoya.horiguchi(a)linux.dev>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)
--- a/mm/hugetlb.c~hugetlb-dont-pass-page-cache-pages-to-restore_reserve_on_error
+++ a/mm/hugetlb.c
@@ -2476,7 +2476,7 @@ void restore_reserve_on_error(struct hst
if (!rc) {
/*
* This indicates there is an entry in the reserve map
- * added by alloc_huge_page. We know it was added
+ * not added by alloc_huge_page. We know it was added
* before the alloc_huge_page call, otherwise
* HPageRestoreReserve would be set on the page.
* Remove the entry so that a subsequent allocation
@@ -4660,7 +4660,9 @@ retry_avoidcopy:
spin_unlock(ptl);
mmu_notifier_invalidate_range_end(&range);
out_release_all:
- restore_reserve_on_error(h, vma, haddr, new_page);
+ /* No restore in case of successful pagetable update (Break COW) */
+ if (new_page != old_page)
+ restore_reserve_on_error(h, vma, haddr, new_page);
put_page(new_page);
out_release_old:
put_page(old_page);
@@ -4776,7 +4778,7 @@ static vm_fault_t hugetlb_no_page(struct
pte_t new_pte;
spinlock_t *ptl;
unsigned long haddr = address & huge_page_mask(h);
- bool new_page = false;
+ bool new_page, new_pagecache_page = false;
/*
* Currently, we are forced to kill the process in the event the
@@ -4799,6 +4801,7 @@ static vm_fault_t hugetlb_no_page(struct
goto out;
retry:
+ new_page = false;
page = find_lock_page(mapping, idx);
if (!page) {
/* Check for page in userfault range */
@@ -4842,6 +4845,7 @@ retry:
goto retry;
goto out;
}
+ new_pagecache_page = true;
} else {
lock_page(page);
if (unlikely(anon_vma_prepare(vma))) {
@@ -4926,7 +4930,9 @@ backout:
spin_unlock(ptl);
backout_unlocked:
unlock_page(page);
- restore_reserve_on_error(h, vma, haddr, page);
+ /* restore reserve for newly allocated pages not in page cache */
+ if (new_page && !new_pagecache_page)
+ restore_reserve_on_error(h, vma, haddr, page);
put_page(page);
goto out;
}
@@ -5135,6 +5141,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_s
int ret = -ENOMEM;
struct page *page;
int writable;
+ bool new_pagecache_page = false;
if (is_continue) {
ret = -EFAULT;
@@ -5228,6 +5235,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_s
ret = huge_add_to_page_cache(page, mapping, idx);
if (ret)
goto out_release_nounlock;
+ new_pagecache_page = true;
}
ptl = huge_pte_lockptr(h, dst_mm, dst_pte);
@@ -5291,7 +5299,8 @@ out_release_unlock:
if (vm_shared || is_continue)
unlock_page(page);
out_release_nounlock:
- restore_reserve_on_error(h, dst_vma, dst_addr, page);
+ if (!new_pagecache_page)
+ restore_reserve_on_error(h, dst_vma, dst_addr, page);
put_page(page);
goto out;
}
_
From: Marco Elver <elver(a)google.com>
Subject: kfence: fix is_kfence_address() for addresses below KFENCE_POOL_SIZE
Originally the addr != NULL check was meant to take care of the case where
__kfence_pool == NULL (KFENCE is disabled). However, this does not work
for addresses where addr > 0 && addr < KFENCE_POOL_SIZE.
This can be the case on NULL-deref where addr > 0 && addr < PAGE_SIZE or
any other faulting access with addr < KFENCE_POOL_SIZE. While the kernel
would likely crash, the stack traces and report might be confusing due to
double faults upon KFENCE's attempt to unprotect such an address.
Fix it by just checking that __kfence_pool != NULL instead.
Link: https://lkml.kernel.org/r/20210818130300.2482437-1-elver@google.com
Fixes: 0ce20dd84089 ("mm: add Kernel Electric-Fence infrastructure")
Signed-off-by: Marco Elver <elver(a)google.com>
Reported-by: Kuan-Ying Lee <Kuan-Ying.Lee(a)mediatek.com>
Acked-by: Alexander Potapenko <glider(a)google.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: <stable(a)vger.kernel.org> [5.12+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/kfence.h | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
--- a/include/linux/kfence.h~kfence-fix-is_kfence_address-for-addresses-below-kfence_pool_size
+++ a/include/linux/kfence.h
@@ -51,10 +51,11 @@ extern atomic_t kfence_allocation_gate;
static __always_inline bool is_kfence_address(const void *addr)
{
/*
- * The non-NULL check is required in case the __kfence_pool pointer was
- * never initialized; keep it in the slow-path after the range-check.
+ * The __kfence_pool != NULL check is required to deal with the case
+ * where __kfence_pool == NULL && addr < KFENCE_POOL_SIZE. Keep it in
+ * the slow-path after the range-check!
*/
- return unlikely((unsigned long)((char *)addr - __kfence_pool) < KFENCE_POOL_SIZE && addr);
+ return unlikely((unsigned long)((char *)addr - __kfence_pool) < KFENCE_POOL_SIZE && __kfence_pool);
}
/**
_
From: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Subject: mm/hwpoison: retry with shake_page() for unhandlable pages
HWPoisonHandlable() sometimes returns false for typical user pages due to
races with average memory events like transfers over LRU lists. This
causes failures in hwpoison handling.
There's retry code for such a case but does not work because the retry
loop reaches the retry limit too quickly before the page settles down to
handlable state. Let get_any_page() call shake_page() to fix it.
[naoya.horiguchi(a)nec.com: get_any_page(): return -EIO when retry limit reached]
Link: https://lkml.kernel.org/r/20210819001958.2365157-1-naoya.horiguchi@linux.dev
Link: https://lkml.kernel.org/r/20210817053703.2267588-1-naoya.horiguchi@linux.dev
Fixes: 25182f05ffed ("mm,hwpoison: fix race with hugetlb page allocation")
Signed-off-by: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Reported-by: Tony Luck <tony.luck(a)intel.com>
Reviewed-by: Yang Shi <shy828301(a)gmail.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: <stable(a)vger.kernel.org> [5.13+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory-failure.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
--- a/mm/memory-failure.c~mm-hwpoison-retry-with-shake_page-for-unhandlable-pages
+++ a/mm/memory-failure.c
@@ -1146,7 +1146,7 @@ static int __get_hwpoison_page(struct pa
* unexpected races caused by taking a page refcount.
*/
if (!HWPoisonHandlable(head))
- return 0;
+ return -EBUSY;
if (PageTransHuge(head)) {
/*
@@ -1199,9 +1199,15 @@ try_again:
}
goto out;
} else if (ret == -EBUSY) {
- /* We raced with freeing huge page to buddy, retry. */
- if (pass++ < 3)
+ /*
+ * We raced with (possibly temporary) unhandlable
+ * page, retry.
+ */
+ if (pass++ < 3) {
+ shake_page(p, 1);
goto try_again;
+ }
+ ret = -EIO;
goto out;
}
}
_
From: Johannes Weiner <hannes(a)cmpxchg.org>
Subject: mm: memcontrol: fix occasional OOMs due to proportional memory.low reclaim
We've noticed occasional OOM killing when memory.low settings are in
effect for cgroups. This is unexpected and undesirable as memory.low
is supposed to express non-OOMing memory priorities between cgroups.
The reason for this is proportional memory.low reclaim. When cgroups
are below their memory.low threshold, reclaim passes them over in the
first round, and then retries if it couldn't find pages anywhere else.
But when cgroups are slightly above their memory.low setting, page scan
force is scaled down and diminished in proportion to the overage, to
the point where it can cause reclaim to fail as well - only in that
case we currently don't retry, and instead trigger OOM.
To fix this, hook proportional reclaim into the same retry logic we
have in place for when cgroups are skipped entirely. This way if
reclaim fails and some cgroups were scanned with diminished pressure,
we'll try another full-force cycle before giving up and OOMing.
[akpm(a)linux-foundation.org: coding-style fixes]
Link: https://lkml.kernel.org/r/20210817180506.220056-1-hannes@cmpxchg.org
Fixes: 9783aa9917f8 ("mm, memcg: proportional memory.{low,min} reclaim")
Signed-off-by: Johannes Weiner <hannes(a)cmpxchg.org>
Reported-by: Leon Yang <lnyng(a)fb.com>
Reviewed-by: Rik van Riel <riel(a)surriel.com>
Reviewed-by: Shakeel Butt <shakeelb(a)google.com>
Acked-by: Roman Gushchin <guro(a)fb.com>
Acked-by: Chris Down <chris(a)chrisdown.name>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: <stable(a)vger.kernel.org> [5.4+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/memcontrol.h | 29 +++++++++++++++--------------
mm/vmscan.c | 27 +++++++++++++++++++--------
2 files changed, 34 insertions(+), 22 deletions(-)
--- a/include/linux/memcontrol.h~mm-memcontrol-fix-occasional-ooms-due-to-proportional-memorylow-reclaim
+++ a/include/linux/memcontrol.h
@@ -612,12 +612,15 @@ static inline bool mem_cgroup_disabled(v
return !cgroup_subsys_enabled(memory_cgrp_subsys);
}
-static inline unsigned long mem_cgroup_protection(struct mem_cgroup *root,
- struct mem_cgroup *memcg,
- bool in_low_reclaim)
+static inline void mem_cgroup_protection(struct mem_cgroup *root,
+ struct mem_cgroup *memcg,
+ unsigned long *min,
+ unsigned long *low)
{
+ *min = *low = 0;
+
if (mem_cgroup_disabled())
- return 0;
+ return;
/*
* There is no reclaim protection applied to a targeted reclaim.
@@ -653,13 +656,10 @@ static inline unsigned long mem_cgroup_p
*
*/
if (root == memcg)
- return 0;
-
- if (in_low_reclaim)
- return READ_ONCE(memcg->memory.emin);
+ return;
- return max(READ_ONCE(memcg->memory.emin),
- READ_ONCE(memcg->memory.elow));
+ *min = READ_ONCE(memcg->memory.emin);
+ *low = READ_ONCE(memcg->memory.elow);
}
void mem_cgroup_calculate_protection(struct mem_cgroup *root,
@@ -1147,11 +1147,12 @@ static inline void memcg_memory_event_mm
{
}
-static inline unsigned long mem_cgroup_protection(struct mem_cgroup *root,
- struct mem_cgroup *memcg,
- bool in_low_reclaim)
+static inline void mem_cgroup_protection(struct mem_cgroup *root,
+ struct mem_cgroup *memcg,
+ unsigned long *min,
+ unsigned long *low)
{
- return 0;
+ *min = *low = 0;
}
static inline void mem_cgroup_calculate_protection(struct mem_cgroup *root,
--- a/mm/vmscan.c~mm-memcontrol-fix-occasional-ooms-due-to-proportional-memorylow-reclaim
+++ a/mm/vmscan.c
@@ -100,9 +100,12 @@ struct scan_control {
unsigned int may_swap:1;
/*
- * Cgroups are not reclaimed below their configured memory.low,
- * unless we threaten to OOM. If any cgroups are skipped due to
- * memory.low and nothing was reclaimed, go back for memory.low.
+ * Cgroup memory below memory.low is protected as long as we
+ * don't threaten to OOM. If any cgroup is reclaimed at
+ * reduced force or passed over entirely due to its memory.low
+ * setting (memcg_low_skipped), and nothing is reclaimed as a
+ * result, then go back for one more cycle that reclaims the protected
+ * memory (memcg_low_reclaim) to avert OOM.
*/
unsigned int memcg_low_reclaim:1;
unsigned int memcg_low_skipped:1;
@@ -2537,15 +2540,14 @@ out:
for_each_evictable_lru(lru) {
int file = is_file_lru(lru);
unsigned long lruvec_size;
+ unsigned long low, min;
unsigned long scan;
- unsigned long protection;
lruvec_size = lruvec_lru_size(lruvec, lru, sc->reclaim_idx);
- protection = mem_cgroup_protection(sc->target_mem_cgroup,
- memcg,
- sc->memcg_low_reclaim);
+ mem_cgroup_protection(sc->target_mem_cgroup, memcg,
+ &min, &low);
- if (protection) {
+ if (min || low) {
/*
* Scale a cgroup's reclaim pressure by proportioning
* its current usage to its memory.low or memory.min
@@ -2576,6 +2578,15 @@ out:
* hard protection.
*/
unsigned long cgroup_size = mem_cgroup_size(memcg);
+ unsigned long protection;
+
+ /* memory.low scaling, make sure we retry before OOM */
+ if (!sc->memcg_low_reclaim && low > min) {
+ protection = low;
+ sc->memcg_low_skipped = 1;
+ } else {
+ protection = min;
+ }
/* Avoid TOCTOU with earlier protection check */
cgroup_size = max(cgroup_size, protection);
_
Two legacy PCI sysfs objects "legacy_io" and "legacy_mem" were updated
to use an unified address space in the commit 636b21b50152 ("PCI: Revoke
mappings like devmem"). This allows for revocations to be managed from
a single place when drivers want to take over and mmap() a /dev/mem
range.
Following the update, both of the sysfs objects should leverage the
iomem_get_mapping() function to get an appropriate address range, but
only the "legacy_io" has been correctly updated - the second attribute
seems to be using a wrong variable to pass the iomem_get_mapping()
function to.
Thus, correct the variable name used so that the "legacy_mem" sysfs
object would also correctly call the iomem_get_mapping() function.
Fixes: 636b21b50152 ("PCI: Revoke mappings like devmem")
Signed-off-by: Krzysztof Wilczyński <kw(a)linux.com>
---
drivers/pci/pci-sysfs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 5d63df7c1820..7bbf2673c7f2 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -978,7 +978,7 @@ void pci_create_legacy_files(struct pci_bus *b)
b->legacy_mem->size = 1024*1024;
b->legacy_mem->attr.mode = 0600;
b->legacy_mem->mmap = pci_mmap_legacy_mem;
- b->legacy_io->mapping = iomem_get_mapping();
+ b->legacy_mem->mapping = iomem_get_mapping();
pci_adjust_legacy_attr(b, pci_mmap_mem);
error = device_create_bin_file(&b->dev, b->legacy_mem);
if (error)
--
2.32.0
From: Pauli Virtanen <pav(a)iki.fi>
Kernel Version 5.13
commit 55981d3541812234e687062926ff199c83f79a39 upstream.
Some USB BT adapters don't satisfy the MTU requirement mentioned in
commit e848dbd364ac ("Bluetooth: btusb: Add support USB ALT 3 for WBS")
and have ALT 3 setting that produces no/garbled audio. Some adapters
with larger MTU were also reported to have problems with ALT 3.
Add a flag and check it and MTU before selecting ALT 3, falling back to
ALT 1. Enable the flag for Realtek, restoring the previous behavior for
non-Realtek devices.
Tested with USB adapters (mtu<72, no/garbled sound with ALT3, ALT1
works) BCM20702A1 0b05:17cb, CSR8510A10 0a12:0001, and (mtu>=72, ALT3
works) RTL8761BU 0bda:8771, Intel AX200 8087:0029 (after disabling
ALT6). Also got reports for (mtu>=72, ALT 3 reported to produce bad
audio) Intel 8087:0a2b.
Signed-off-by: Pauli Virtanen <pav(a)iki.fi>
Fixes: e848dbd364ac ("Bluetooth: btusb: Add support USB ALT 3 for WBS")
Tested-by: Michał Kępień <kernel(a)kempniu.pl>
Tested-by: Jonathan Lampérth <jon(a)h4n.dev>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
---
drivers/bluetooth/btusb.c | 22 ++++++++++++++--------
1 file changed, 14 insertions(+), 8 deletions(-)
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
index 488f110e17e27..2336f731dbc7e 100644
--- a/drivers/bluetooth/btusb.c
+++ b/drivers/bluetooth/btusb.c
@@ -528,6 +528,7 @@ static const struct dmi_system_id btusb_needs_reset_resume_table[] = {
#define BTUSB_HW_RESET_ACTIVE 12
#define BTUSB_TX_WAIT_VND_EVT 13
#define BTUSB_WAKEUP_DISABLE 14
+#define BTUSB_USE_ALT3_FOR_WBS 15
struct btusb_data {
struct hci_dev *hdev;
@@ -1761,16 +1762,20 @@ static void btusb_work(struct work_struct *work)
/* Bluetooth USB spec recommends alt 6 (63 bytes), but
* many adapters do not support it. Alt 1 appears to
* work for all adapters that do not have alt 6, and
- * which work with WBS at all.
+ * which work with WBS at all. Some devices prefer
+ * alt 3 (HCI payload >= 60 Bytes let air packet
+ * data satisfy 60 bytes), requiring
+ * MTU >= 3 (packets) * 25 (size) - 3 (headers) = 72
+ * see also Core spec 5, vol 4, B 2.1.1 & Table 2.1.
*/
- new_alts = btusb_find_altsetting(data, 6) ? 6 : 1;
- /* Because mSBC frames do not need to be aligned to the
- * SCO packet boundary. If support the Alt 3, use the
- * Alt 3 for HCI payload >= 60 Bytes let air packet
- * data satisfy 60 bytes.
- */
- if (new_alts == 1 && btusb_find_altsetting(data, 3))
+ if (btusb_find_altsetting(data, 6))
+ new_alts = 6;
+ else if (btusb_find_altsetting(data, 3) &&
+ hdev->sco_mtu >= 72 &&
+ test_bit(BTUSB_USE_ALT3_FOR_WBS, &data->flags))
new_alts = 3;
+ else
+ new_alts = 1;
}
if (btusb_switch_alt_setting(hdev, new_alts) < 0)
@@ -3882,6 +3887,7 @@ static int btusb_probe(struct usb_interface *intf,
* (DEVICE_REMOTE_WAKEUP)
*/
set_bit(BTUSB_WAKEUP_DISABLE, &data->flags);
+ set_bit(BTUSB_USE_ALT3_FOR_WBS, &data->flags);
}
if (!reset)
--
cgit 1.2.3-1.el7
Currently there are (at least) two problems in the way pwm_bl starts
managing the enable_gpio pin. Both occur when the backlight is initially
off and the driver finds the pin not already in output mode and, as a
result, unconditionally switches it to output-mode and asserts the signal.
Problem 1: This could cause the backlight to flicker since, at this stage
in driver initialisation, we have no idea what the PWM and regulator are
doing (an unconfigured PWM could easily "rest" at 100% duty cycle).
Problem 2: This will cause us not to correctly honour the
post_pwm_on_delay (which also risks flickers).
Fix this by moving the code to configure the GPIO output mode until after
we have examines the handover state. That allows us to initialize
enable_gpio to off if the backlight is currently off and on if the
backlight is on.
Reported-by: Marek Vasut <marex(a)denx.de>
Signed-off-by: Daniel Thompson <daniel.thompson(a)linaro.org>
Cc: stable(a)vger.kernel.org
Acked-by: Marek Vasut <marex(a)denx.de>
Tested-by: Marek Vasut <marex(a)denx.de>
---
drivers/video/backlight/pwm_bl.c | 54 +++++++++++++++++---------------
1 file changed, 28 insertions(+), 26 deletions(-)
diff --git a/drivers/video/backlight/pwm_bl.c b/drivers/video/backlight/pwm_bl.c
index e48fded3e414..8d8959a70e44 100644
--- a/drivers/video/backlight/pwm_bl.c
+++ b/drivers/video/backlight/pwm_bl.c
@@ -409,6 +409,33 @@ static bool pwm_backlight_is_linear(struct platform_pwm_backlight_data *data)
static int pwm_backlight_initial_power_state(const struct pwm_bl_data *pb)
{
struct device_node *node = pb->dev->of_node;
+ bool active = true;
+
+ /*
+ * If the enable GPIO is present, observable (either as input
+ * or output) and off then the backlight is not currently active.
+ * */
+ if (pb->enable_gpio && gpiod_get_value_cansleep(pb->enable_gpio) == 0)
+ active = false;
+
+ if (!regulator_is_enabled(pb->power_supply))
+ active = false;
+
+ if (!pwm_is_enabled(pb->pwm))
+ active = false;
+
+ /*
+ * Synchronize the enable_gpio with the observed state of the
+ * hardware.
+ */
+ if (pb->enable_gpio)
+ gpiod_direction_output(pb->enable_gpio, active);
+
+ /*
+ * Do not change pb->enabled here! pb->enabled essentially
+ * tells us if we own one of the regulator's use counts and
+ * right now we do not.
+ */
/* Not booted with device tree or no phandle link to the node */
if (!node || !node->phandle)
@@ -420,20 +447,7 @@ static int pwm_backlight_initial_power_state(const struct pwm_bl_data *pb)
* assume that another driver will enable the backlight at the
* appropriate time. Therefore, if it is disabled, keep it so.
*/
-
- /* if the enable GPIO is disabled, do not enable the backlight */
- if (pb->enable_gpio && gpiod_get_value_cansleep(pb->enable_gpio) == 0)
- return FB_BLANK_POWERDOWN;
-
- /* The regulator is disabled, do not enable the backlight */
- if (!regulator_is_enabled(pb->power_supply))
- return FB_BLANK_POWERDOWN;
-
- /* The PWM is disabled, keep it like this */
- if (!pwm_is_enabled(pb->pwm))
- return FB_BLANK_POWERDOWN;
-
- return FB_BLANK_UNBLANK;
+ return active ? FB_BLANK_UNBLANK: FB_BLANK_POWERDOWN;
}
static int pwm_backlight_probe(struct platform_device *pdev)
@@ -486,18 +500,6 @@ static int pwm_backlight_probe(struct platform_device *pdev)
goto err_alloc;
}
- /*
- * If the GPIO is not known to be already configured as output, that
- * is, if gpiod_get_direction returns either 1 or -EINVAL, change the
- * direction to output and set the GPIO as active.
- * Do not force the GPIO to active when it was already output as it
- * could cause backlight flickering or we would enable the backlight too
- * early. Leave the decision of the initial backlight state for later.
- */
- if (pb->enable_gpio &&
- gpiod_get_direction(pb->enable_gpio) != 0)
- gpiod_direction_output(pb->enable_gpio, 1);
-
pb->power_supply = devm_regulator_get(&pdev->dev, "power");
if (IS_ERR(pb->power_supply)) {
ret = PTR_ERR(pb->power_supply);
base-commit: 2734d6c1b1a089fb593ef6a23d4b70903526fe0c
--
2.30.2
Dear powerpc maintainers,
The script ./scripts/checkkconfigsymbols.py warns on invalid references to
Kconfig symbols (often, minor typos, name confusions or outdated references).
This patch series addresses all issues reported by
./scripts/checkkconfigsymbols.py in ./drivers/usb/ for Kconfig and Makefile
files. Issues in the Kconfig and Makefile files indicate some shortcomings in
the overall build definitions, and often are true actionable issues to address.
These issues can be identified and filtered by:
./scripts/checkkconfigsymbols.py | grep -E "arch/powerpc/.*(Kconfig|Makefile)" -B 1 -A 1
After applying this patch series on linux-next (next-20210817), the command
above yields just two false positives (SHELL, r13) due to tool shortcomings.
As these two patches are fixes, please consider if they are suitable for
backporting to stable.
Lukas
Lukas Bulwahn (2):
powerpc: kvm: rectify selection to PPC_DAWR
powerpc: rectify selection to ARCH_ENABLE_SPLIT_PMD_PTLOCK
arch/powerpc/kvm/Kconfig | 2 +-
arch/powerpc/platforms/Kconfig.cputype | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
--
2.26.2
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: f428e49b8cb1 - Linux 5.13.12
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ Reboot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ Storage blktests
✅ storage: software RAID testing
✅ Storage: swraid mdadm raid_module test
🚧 ✅ Podman system integration test - as root
🚧 ❌ Podman system integration test - as user
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage block - filesystem fio test
🚧 ✅ Storage block - queue scheduler test
🚧 ✅ Storage nvme - tcp
🚧 💥 stress: stress-ng
Host 2:
✅ Boot test
✅ Reboot test
✅ ACPI table test
✅ ACPI enabled test
✅ LTP
✅ CIFS Connectathon
✅ POSIX pjd-fstest suites
✅ Loopdev Sanity
✅ jvm - jcstress tests
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking cki netfilter test
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
✅ trace: ftrace/tracer
🚧 ✅ xarray-idr-radixtree-test
🚧 ✅ i2c: i2cdetect sanity
🚧 ✅ NFS Connectathon
🚧 ✅ Firmware test suite
🚧 ✅ Memory function: kaslr
🚧 ✅ audit: audit testsuite test
🚧 ✅ lvm cache test
🚧 ✅ lvm snapper test
ppc64le:
Host 1:
✅ Boot test
✅ Reboot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ Storage blktests
✅ storage: software RAID testing
✅ Storage: swraid mdadm raid_module test
🚧 ✅ Podman system integration test - as root
🚧 ✅ Podman system integration test - as user
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage block - filesystem fio test
🚧 ✅ Storage block - queue scheduler test
🚧 ✅ Storage nvme - tcp
🚧 ✅ Storage: lvm device-mapper test - upstream
Host 2:
✅ Boot test
✅ Reboot test
✅ LTP
✅ CIFS Connectathon
✅ POSIX pjd-fstest suites
✅ Loopdev Sanity
✅ jvm - jcstress tests
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking cki netfilter test
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ trace: ftrace/tracer
🚧 ✅ xarray-idr-radixtree-test
🚧 ✅ NFS Connectathon
🚧 ✅ Memory function: kaslr
🚧 ✅ audit: audit testsuite test
🚧 ✅ lvm cache test
🚧 ✅ lvm snapper test
s390x:
Host 1:
✅ Boot test
✅ Reboot test
✅ selinux-policy: serge-testsuite
✅ Storage blktests
✅ Storage: swraid mdadm raid_module test
🚧 ✅ Podman system integration test - as root
🚧 ✅ Podman system integration test - as user
🚧 ✅ Storage nvme - tcp
🚧 ❌ stress: stress-ng
Host 2:
✅ Boot test
✅ Reboot test
✅ LTP
✅ CIFS Connectathon
✅ POSIX pjd-fstest suites
✅ Loopdev Sanity
✅ jvm - jcstress tests
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking cki netfilter test
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ trace: ftrace/tracer
🚧 ✅ xarray-idr-radixtree-test
🚧 ✅ NFS Connectathon
🚧 ✅ Memory function: kaslr
🚧 ✅ audit: audit testsuite test
🚧 ✅ lvm cache test
🚧 ✅ lvm snapper test
x86_64:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ Reboot test
⚡⚡⚡ xfstests - ext4
⚡⚡⚡ xfstests - xfs
⚡⚡⚡ xfstests - nfsv4.2
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ power-management: cpupower/sanity test
⚡⚡⚡ Storage blktests
⚡⚡⚡ storage: software RAID testing
⚡⚡⚡ Storage: swraid mdadm raid_module test
🚧 ⚡⚡⚡ Podman system integration test - as root
🚧 ⚡⚡⚡ Podman system integration test - as user
🚧 ⚡⚡⚡ CPU: Idle Test
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ xfstests - cifsv3.11
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
🚧 ⚡⚡⚡ Storage: lvm device-mapper test - upstream
🚧 ⚡⚡⚡ stress: stress-ng
Host 2:
✅ Boot test
✅ Reboot test
✅ ACPI table test
✅ LTP
✅ CIFS Connectathon
✅ POSIX pjd-fstest suites
✅ Loopdev Sanity
✅ jvm - jcstress tests
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking cki netfilter test
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
✅ trace: ftrace/tracer
🚧 ✅ xarray-idr-radixtree-test
🚧 ✅ i2c: i2cdetect sanity
🚧 ✅ NFS Connectathon
🚧 ✅ Firmware test suite
🚧 ✅ Memory function: kaslr
🚧 ✅ audit: audit testsuite test
🚧 ✅ lvm cache test
🚧 ✅ lvm snapper test
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ Reboot test
⚡⚡⚡ xfstests - ext4
⚡⚡⚡ xfstests - xfs
⚡⚡⚡ xfstests - nfsv4.2
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ power-management: cpupower/sanity test
⚡⚡⚡ Storage blktests
⚡⚡⚡ storage: software RAID testing
⚡⚡⚡ Storage: swraid mdadm raid_module test
🚧 ⚡⚡⚡ Podman system integration test - as root
🚧 ⚡⚡⚡ Podman system integration test - as user
🚧 ⚡⚡⚡ CPU: Idle Test
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ xfstests - cifsv3.11
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
🚧 ⚡⚡⚡ Storage: lvm device-mapper test - upstream
🚧 ⚡⚡⚡ stress: stress-ng
Host 4:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ Reboot test
⚡⚡⚡ xfstests - ext4
⚡⚡⚡ xfstests - xfs
⚡⚡⚡ xfstests - nfsv4.2
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ power-management: cpupower/sanity test
⚡⚡⚡ Storage blktests
⚡⚡⚡ storage: software RAID testing
⚡⚡⚡ Storage: swraid mdadm raid_module test
🚧 ⚡⚡⚡ Podman system integration test - as root
🚧 ⚡⚡⚡ Podman system integration test - as user
🚧 ⚡⚡⚡ CPU: Idle Test
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ xfstests - cifsv3.11
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
🚧 ⚡⚡⚡ Storage: lvm device-mapper test - upstream
🚧 ⚡⚡⚡ stress: stress-ng
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
The patch titled
Subject: hugetlb: don't pass page cache pages to restore_reserve_on_error
has been added to the -mm tree. Its filename is
hugetlb-dont-pass-page-cache-pages-to-restore_reserve_on_error.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/hugetlb-dont-pass-page-cache-page…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/hugetlb-dont-pass-page-cache-page…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: hugetlb: don't pass page cache pages to restore_reserve_on_error
syzbot hit kernel BUG at fs/hugetlbfs/inode.c:532 as described in [1].
This BUG triggers if the HPageRestoreReserve flag is set on a page in
the page cache. It should never be set, as the routine
huge_add_to_page_cache explicitly clears the flag after adding a page
to the cache.
The only code other than huge page allocation which sets the flag is
restore_reserve_on_error. It will potentially set the flag in rare out
of memory conditions. syzbot was injecting errors to cause memory
allocation errors which exercised this specific path.
The code in restore_reserve_on_error is doing the right thing. However,
there are instances where pages in the page cache were being passed to
restore_reserve_on_error. This is incorrect, as once a page goes into
the cache reservation information will not be modified for the page until
it is removed from the cache. Error paths do not remove pages from the
cache, so even in the case of error, the page will remain in the cache
and no reservation adjustment is needed.
Modify routines that potentially call restore_reserve_on_error with a
page cache page to no longer do so.
Note on fixes tag:
Prior to commit 846be08578ed ("mm/hugetlb: expand restore_reserve_on_error
functionality") the routine would not process page cache pages because
the HPageRestoreReserve flag is not set on such pages. Therefore, this
issue could not be trigggered. The code added by commit 846be08578ed
("mm/hugetlb: expand restore_reserve_on_error functionality") is needed
and correct. It exposed incorrect calls to restore_reserve_on_error which
is the root cause addressed by this commit.
[1] https://lore.kernel.org/linux-mm/00000000000050776d05c9b7c7f0@google.com/
Link: https://lkml.kernel.org/r/20210818213304.37038-1-mike.kravetz@oracle.com
Fixes: 846be08578ed ("mm/hugetlb: expand restore_reserve_on_error functionality")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reported-by: <syzbot+67654e51e54455f1c585(a)syzkaller.appspotmail.com>
Cc: Mina Almasry <almasrymina(a)google.com>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Naoya Horiguchi <naoya.horiguchi(a)linux.dev>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)
--- a/mm/hugetlb.c~hugetlb-dont-pass-page-cache-pages-to-restore_reserve_on_error
+++ a/mm/hugetlb.c
@@ -2476,7 +2476,7 @@ void restore_reserve_on_error(struct hst
if (!rc) {
/*
* This indicates there is an entry in the reserve map
- * added by alloc_huge_page. We know it was added
+ * not added by alloc_huge_page. We know it was added
* before the alloc_huge_page call, otherwise
* HPageRestoreReserve would be set on the page.
* Remove the entry so that a subsequent allocation
@@ -4660,7 +4660,9 @@ retry_avoidcopy:
spin_unlock(ptl);
mmu_notifier_invalidate_range_end(&range);
out_release_all:
- restore_reserve_on_error(h, vma, haddr, new_page);
+ /* No restore in case of successful pagetable update (Break COW) */
+ if (new_page != old_page)
+ restore_reserve_on_error(h, vma, haddr, new_page);
put_page(new_page);
out_release_old:
put_page(old_page);
@@ -4776,7 +4778,7 @@ static vm_fault_t hugetlb_no_page(struct
pte_t new_pte;
spinlock_t *ptl;
unsigned long haddr = address & huge_page_mask(h);
- bool new_page = false;
+ bool new_page, new_pagecache_page = false;
/*
* Currently, we are forced to kill the process in the event the
@@ -4799,6 +4801,7 @@ static vm_fault_t hugetlb_no_page(struct
goto out;
retry:
+ new_page = false;
page = find_lock_page(mapping, idx);
if (!page) {
/* Check for page in userfault range */
@@ -4842,6 +4845,7 @@ retry:
goto retry;
goto out;
}
+ new_pagecache_page = true;
} else {
lock_page(page);
if (unlikely(anon_vma_prepare(vma))) {
@@ -4926,7 +4930,9 @@ backout:
spin_unlock(ptl);
backout_unlocked:
unlock_page(page);
- restore_reserve_on_error(h, vma, haddr, page);
+ /* restore reserve for newly allocated pages not in page cache */
+ if (new_page && !new_pagecache_page)
+ restore_reserve_on_error(h, vma, haddr, page);
put_page(page);
goto out;
}
@@ -5135,6 +5141,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_s
int ret = -ENOMEM;
struct page *page;
int writable;
+ bool new_pagecache_page = false;
if (is_continue) {
ret = -EFAULT;
@@ -5228,6 +5235,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_s
ret = huge_add_to_page_cache(page, mapping, idx);
if (ret)
goto out_release_nounlock;
+ new_pagecache_page = true;
}
ptl = huge_pte_lockptr(h, dst_mm, dst_pte);
@@ -5291,7 +5299,8 @@ out_release_unlock:
if (vm_shared || is_continue)
unlock_page(page);
out_release_nounlock:
- restore_reserve_on_error(h, dst_vma, dst_addr, page);
+ if (!new_pagecache_page)
+ restore_reserve_on_error(h, dst_vma, dst_addr, page);
put_page(page);
goto out;
}
_
Patches currently in -mm which might be from mike.kravetz(a)oracle.com are
hugetlb-dont-pass-page-cache-pages-to-restore_reserve_on_error.patch
hugetlb-simplify-prep_compound_gigantic_page-ref-count-racing-code.patch
hugetlb-drop-ref-count-earlier-after-page-allocation.patch
hugetlb-before-freeing-hugetlb-page-set-dtor-to-appropriate-value.patch
syzbot hit kernel BUG at fs/hugetlbfs/inode.c:532 as described in [1].
This BUG triggers if the HPageRestoreReserve flag is set on a page in
the page cache. It should never be set, as the routine
huge_add_to_page_cache explicitly clears the flag after adding a page
to the cache.
The only code other than huge page allocation which sets the flag is
restore_reserve_on_error. It will potentially set the flag in rare out
of memory conditions. syzbot was injecting errors to cause memory
allocation errors which exercised this specific path.
The code in restore_reserve_on_error is doing the right thing. However,
there are instances where pages in the page cache were being passed to
restore_reserve_on_error. This is incorrect, as once a page goes into
the cache reservation information will not be modified for the page until
it is removed from the cache. Error paths do not remove pages from the
cache, so even in the case of error, the page will remain in the cache
and no reservation adjustment is needed.
Modify routines that potentially call restore_reserve_on_error with a
page cache page to no longer do so.
Note on fixes tag:
Prior to commit 846be08578ed ("mm/hugetlb: expand restore_reserve_on_error
functionality") the routine would not process page cache pages because
the HPageRestoreReserve flag is not set on such pages. Therefore, this
issue could not be trigggered. The code added by commit 846be08578ed
("mm/hugetlb: expand restore_reserve_on_error functionality") is needed
and correct. It exposed incorrect calls to restore_reserve_on_error which
is the root cause addressed by this commit.
[1] https://lore.kernel.org/linux-mm/00000000000050776d05c9b7c7f0@google.com/
Fixes: 846be08578ed ("mm/hugetlb: expand restore_reserve_on_error functionality")
Reported-by: syzbot+67654e51e54455f1c585(a)syzkaller.appspotmail.com
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
---
mm/hugetlb.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6337697f7ee4..54c715e7e362 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2546,7 +2546,7 @@ void restore_reserve_on_error(struct hstate *h, struct vm_area_struct *vma,
if (!rc) {
/*
* This indicates there is an entry in the reserve map
- * added by alloc_huge_page. We know it was added
+ * not added by alloc_huge_page. We know it was added
* before the alloc_huge_page call, otherwise
* HPageRestoreReserve would be set on the page.
* Remove the entry so that a subsequent allocation
@@ -4753,7 +4753,9 @@ static vm_fault_t hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma,
spin_unlock(ptl);
mmu_notifier_invalidate_range_end(&range);
out_release_all:
- restore_reserve_on_error(h, vma, haddr, new_page);
+ /* No restore in case of successful pagetable update (Break COW) */
+ if (new_page != old_page)
+ restore_reserve_on_error(h, vma, haddr, new_page);
put_page(new_page);
out_release_old:
put_page(old_page);
@@ -4869,7 +4871,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
pte_t new_pte;
spinlock_t *ptl;
unsigned long haddr = address & huge_page_mask(h);
- bool new_page = false;
+ bool new_page, new_pagecache_page = false;
/*
* Currently, we are forced to kill the process in the event the
@@ -4892,6 +4894,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
goto out;
retry:
+ new_page = false;
page = find_lock_page(mapping, idx);
if (!page) {
/* Check for page in userfault range */
@@ -4935,6 +4938,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
goto retry;
goto out;
}
+ new_pagecache_page = true;
} else {
lock_page(page);
if (unlikely(anon_vma_prepare(vma))) {
@@ -5019,7 +5023,9 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
spin_unlock(ptl);
backout_unlocked:
unlock_page(page);
- restore_reserve_on_error(h, vma, haddr, page);
+ /* restore reserve for newly allocated pages not in page cache */
+ if (new_page && !new_pagecache_page)
+ restore_reserve_on_error(h, vma, haddr, page);
put_page(page);
goto out;
}
@@ -5228,6 +5234,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
int ret = -ENOMEM;
struct page *page;
int writable;
+ bool new_pagecache_page = false;
if (is_continue) {
ret = -EFAULT;
@@ -5321,6 +5328,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
ret = huge_add_to_page_cache(page, mapping, idx);
if (ret)
goto out_release_nounlock;
+ new_pagecache_page = true;
}
ptl = huge_pte_lockptr(h, dst_mm, dst_pte);
@@ -5384,7 +5392,8 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
if (vm_shared || is_continue)
unlock_page(page);
out_release_nounlock:
- restore_reserve_on_error(h, dst_vma, dst_addr, page);
+ if (!new_pagecache_page)
+ restore_reserve_on_error(h, dst_vma, dst_addr, page);
put_page(page);
goto out;
}
--
2.31.1
When a mapped level interrupt (a timer, for example) is deactivated
by the guest, the corresponding host interrupt is equally deactivated.
However, the fate of the pending state still needs to be dealt
with in SW.
This is specially true when the interrupt was in the active+pending
state in the virtual distributor at the point where the guest
was entered. On exit, the pending state is potentially stale
(the guest may have put the interrupt in a non-pending state).
If we don't do anything, the interrupt will be spuriously injected
in the guest. Although this shouldn't have any ill effect (spurious
interrupts are always possible), we can improve the emulation by
detecting the deactivation-while-pending case and resample the
interrupt.
Fixes: e40cc57bac79 ("KVM: arm/arm64: vgic: Support level-triggered mapped interrupts")
Reported-by: Raghavendra Rao Ananta <rananta(a)google.com>
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Cc: stable(a)vger.kernel.org
---
arch/arm64/kvm/vgic/vgic-v2.c | 25 ++++++++++++++++++-------
arch/arm64/kvm/vgic/vgic-v3.c | 25 ++++++++++++++++++-------
2 files changed, 36 insertions(+), 14 deletions(-)
diff --git a/arch/arm64/kvm/vgic/vgic-v2.c b/arch/arm64/kvm/vgic/vgic-v2.c
index 2c580204f1dc..3e52ea86a87f 100644
--- a/arch/arm64/kvm/vgic/vgic-v2.c
+++ b/arch/arm64/kvm/vgic/vgic-v2.c
@@ -60,6 +60,7 @@ void vgic_v2_fold_lr_state(struct kvm_vcpu *vcpu)
u32 val = cpuif->vgic_lr[lr];
u32 cpuid, intid = val & GICH_LR_VIRTUALID;
struct vgic_irq *irq;
+ bool deactivated;
/* Extract the source vCPU id from the LR */
cpuid = val & GICH_LR_PHYSID_CPUID;
@@ -75,7 +76,8 @@ void vgic_v2_fold_lr_state(struct kvm_vcpu *vcpu)
raw_spin_lock(&irq->irq_lock);
- /* Always preserve the active bit */
+ /* Always preserve the active bit, note deactivation */
+ deactivated = irq->active && !(val & GICH_LR_ACTIVE_BIT);
irq->active = !!(val & GICH_LR_ACTIVE_BIT);
if (irq->active && vgic_irq_is_sgi(intid))
@@ -105,6 +107,12 @@ void vgic_v2_fold_lr_state(struct kvm_vcpu *vcpu)
* device state could have changed or we simply need to
* process the still pending interrupt later.
*
+ * We could also have entered the guest with the interrupt
+ * active+pending. On the next exit, we need to re-evaluate
+ * the pending state, as it could otherwise result in a
+ * spurious interrupt by injecting a now potentially stale
+ * pending state.
+ *
* If this causes us to lower the level, we have to also clear
* the physical active state, since we will otherwise never be
* told when the interrupt becomes asserted again.
@@ -115,12 +123,15 @@ void vgic_v2_fold_lr_state(struct kvm_vcpu *vcpu)
if (vgic_irq_is_mapped_level(irq)) {
bool resample = false;
- if (val & GICH_LR_PENDING_BIT) {
- irq->line_level = vgic_get_phys_line_level(irq);
- resample = !irq->line_level;
- } else if (vgic_irq_needs_resampling(irq) &&
- !(irq->active || irq->pending_latch)) {
- resample = true;
+ if (unlikely(vgic_irq_needs_resampling(irq))) {
+ if (!(irq->active || irq->pending_latch))
+ resample = true;
+ } else {
+ if ((val & GICH_LR_PENDING_BIT) ||
+ (deactivated && irq->line_level)) {
+ irq->line_level = vgic_get_phys_line_level(irq);
+ resample = !irq->line_level;
+ }
}
if (resample)
diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
index 66004f61cd83..74f9aefffd5e 100644
--- a/arch/arm64/kvm/vgic/vgic-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-v3.c
@@ -46,6 +46,7 @@ void vgic_v3_fold_lr_state(struct kvm_vcpu *vcpu)
u32 intid, cpuid;
struct vgic_irq *irq;
bool is_v2_sgi = false;
+ bool deactivated;
cpuid = val & GICH_LR_PHYSID_CPUID;
cpuid >>= GICH_LR_PHYSID_CPUID_SHIFT;
@@ -68,7 +69,8 @@ void vgic_v3_fold_lr_state(struct kvm_vcpu *vcpu)
raw_spin_lock(&irq->irq_lock);
- /* Always preserve the active bit */
+ /* Always preserve the active bit, note deactivation */
+ deactivated = irq->active && !(val & ICH_LR_ACTIVE_BIT);
irq->active = !!(val & ICH_LR_ACTIVE_BIT);
if (irq->active && is_v2_sgi)
@@ -98,6 +100,12 @@ void vgic_v3_fold_lr_state(struct kvm_vcpu *vcpu)
* device state could have changed or we simply need to
* process the still pending interrupt later.
*
+ * We could also have entered the guest with the interrupt
+ * active+pending. On the next exit, we need to re-evaluate
+ * the pending state, as it could otherwise result in a
+ * spurious interrupt by injecting a now potentially stale
+ * pending state.
+ *
* If this causes us to lower the level, we have to also clear
* the physical active state, since we will otherwise never be
* told when the interrupt becomes asserted again.
@@ -108,12 +116,15 @@ void vgic_v3_fold_lr_state(struct kvm_vcpu *vcpu)
if (vgic_irq_is_mapped_level(irq)) {
bool resample = false;
- if (val & ICH_LR_PENDING_BIT) {
- irq->line_level = vgic_get_phys_line_level(irq);
- resample = !irq->line_level;
- } else if (vgic_irq_needs_resampling(irq) &&
- !(irq->active || irq->pending_latch)) {
- resample = true;
+ if (unlikely(vgic_irq_needs_resampling(irq))) {
+ if (!(irq->active || irq->pending_latch))
+ resample = true;
+ } else {
+ if ((val & ICH_LR_PENDING_BIT) ||
+ (deactivated && irq->line_level)) {
+ irq->line_level = vgic_get_phys_line_level(irq);
+ resample = !irq->line_level;
+ }
}
if (resample)
--
2.30.2
The patch titled
Subject: kfence: fix is_kfence_address() for addresses below KFENCE_POOL_SIZE
has been added to the -mm tree. Its filename is
kfence-fix-is_kfence_address-for-addresses-below-kfence_pool_size.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/kfence-fix-is_kfence_address-for-…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/kfence-fix-is_kfence_address-for-…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Marco Elver <elver(a)google.com>
Subject: kfence: fix is_kfence_address() for addresses below KFENCE_POOL_SIZE
Originally the addr != NULL check was meant to take care of the case where
__kfence_pool == NULL (KFENCE is disabled). However, this does not work
for addresses where addr > 0 && addr < KFENCE_POOL_SIZE.
This can be the case on NULL-deref where addr > 0 && addr < PAGE_SIZE or
any other faulting access with addr < KFENCE_POOL_SIZE. While the kernel
would likely crash, the stack traces and report might be confusing due to
double faults upon KFENCE's attempt to unprotect such an address.
Fix it by just checking that __kfence_pool != NULL instead.
Link: https://lkml.kernel.org/r/20210818130300.2482437-1-elver@google.com
Fixes: 0ce20dd84089 ("mm: add Kernel Electric-Fence infrastructure")
Signed-off-by: Marco Elver <elver(a)google.com>
Reported-by: Kuan-Ying Lee <Kuan-Ying.Lee(a)mediatek.com>
Acked-by: Alexander Potapenko <glider(a)google.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: <stable(a)vger.kernel.org> [5.12+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/kfence.h | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
--- a/include/linux/kfence.h~kfence-fix-is_kfence_address-for-addresses-below-kfence_pool_size
+++ a/include/linux/kfence.h
@@ -51,10 +51,11 @@ extern atomic_t kfence_allocation_gate;
static __always_inline bool is_kfence_address(const void *addr)
{
/*
- * The non-NULL check is required in case the __kfence_pool pointer was
- * never initialized; keep it in the slow-path after the range-check.
+ * The __kfence_pool != NULL check is required to deal with the case
+ * where __kfence_pool == NULL && addr < KFENCE_POOL_SIZE. Keep it in
+ * the slow-path after the range-check!
*/
- return unlikely((unsigned long)((char *)addr - __kfence_pool) < KFENCE_POOL_SIZE && addr);
+ return unlikely((unsigned long)((char *)addr - __kfence_pool) < KFENCE_POOL_SIZE && __kfence_pool);
}
/**
_
Patches currently in -mm which might be from elver(a)google.com are
kfence-fix-is_kfence_address-for-addresses-below-kfence_pool_size.patch
kfence-show-cpu-and-timestamp-in-alloc-free-info.patch
The patch titled
Subject: mm/filemap.c: remove bogus VM_BUG_ON
has been added to the -mm tree. Its filename is
mm-remove-bogus-vm_bug_on.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-remove-bogus-vm_bug_on.patch
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-remove-bogus-vm_bug_on.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: "Matthew Wilcox (Oracle)" <willy(a)infradead.org>
Subject: mm/filemap.c: remove bogus VM_BUG_ON
It is not safe to check page->index without holding the page lock. It can
be changed if the page is moved between the swap cache and the page cache
for a shmem file, for example. There is a VM_BUG_ON below which checks
page->index is correct after taking the page lock.
Link: https://lkml.kernel.org/r/20210818144932.940640-1-willy@infradead.org
Fixes: 5c211ba29deb ("mm: add and use find_lock_entries")
Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Reported-by: <syzbot+c87be4f669d920c76330(a)syzkaller.appspotmail.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/filemap.c | 1 -
1 file changed, 1 deletion(-)
--- a/mm/filemap.c~mm-remove-bogus-vm_bug_on
+++ a/mm/filemap.c
@@ -2038,7 +2038,6 @@ unsigned find_lock_entries(struct addres
if (!xa_is_value(page)) {
if (page->index < start)
goto put;
- VM_BUG_ON_PAGE(page->index != xas.xa_index, page);
if (page->index + thp_nr_pages(page) - 1 > end)
goto put;
if (!trylock_page(page))
_
Patches currently in -mm which might be from willy(a)infradead.org are
mm-remove-bogus-vm_bug_on.patch
mm-report-a-more-useful-address-for-reclaim-acquisition.patch
mm-mark-idle-page-tracking-as-broken.patch
avoid-a-warning-in-sparse-memory-support.patch
mm-move-kvmalloc-related-functions-to-slabh.patch
This is the start of the stable review cycle for the 5.4.142 release.
There are 62 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 18 Aug 2021 12:54:12 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.142-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.4.142-rc1
Saeed Mirzamohammadi <saeed.mirzamohammadi(a)oracle.com>
iommu/vt-d: Fix agaw for a supported 48 bit guest address width
Nathan Chancellor <nathan(a)kernel.org>
vmlinux.lds.h: Handle clang's module.{c,d}tor sections
Jeff Layton <jlayton(a)kernel.org>
ceph: take snap_empty_lock atomically with snaprealm refcount change
Jeff Layton <jlayton(a)kernel.org>
ceph: clean up locking annotation for ceph_get_snap_realm and __lookup_snap_realm
Jeff Layton <jlayton(a)kernel.org>
ceph: add some lockdep assertions around snaprealm handling
Sean Christopherson <seanjc(a)google.com>
KVM: VMX: Use current VMCS to query WAITPKG support for MSR emulation
Thomas Gleixner <tglx(a)linutronix.de>
PCI/MSI: Protect msi_desc::masked for multi-MSI
Thomas Gleixner <tglx(a)linutronix.de>
PCI/MSI: Use msi_mask_irq() in pci_msi_shutdown()
Thomas Gleixner <tglx(a)linutronix.de>
PCI/MSI: Correct misleading comments
Thomas Gleixner <tglx(a)linutronix.de>
PCI/MSI: Do not set invalid bits in MSI mask
Thomas Gleixner <tglx(a)linutronix.de>
PCI/MSI: Enforce MSI[X] entry updates to be visible
Thomas Gleixner <tglx(a)linutronix.de>
PCI/MSI: Enforce that MSI-X table entry is masked for update
Thomas Gleixner <tglx(a)linutronix.de>
PCI/MSI: Mask all unused MSI-X entries
Thomas Gleixner <tglx(a)linutronix.de>
PCI/MSI: Enable and mask MSI-X early
Ben Dai <ben.dai(a)unisoc.com>
genirq/timings: Prevent potential array overflow in __irq_timings_store()
Bixuan Cui <cuibixuan(a)huawei.com>
genirq/msi: Ensure deactivation on teardown
Babu Moger <Babu.Moger(a)amd.com>
x86/resctrl: Fix default monitoring groups reporting
Thomas Gleixner <tglx(a)linutronix.de>
x86/ioapic: Force affinity setup before startup
Thomas Gleixner <tglx(a)linutronix.de>
x86/msi: Force affinity setup before startup
Thomas Gleixner <tglx(a)linutronix.de>
genirq: Provide IRQCHIP_AFFINITY_PRE_STARTUP
Randy Dunlap <rdunlap(a)infradead.org>
x86/tools: Fix objdump version check again
Pu Lehui <pulehui(a)huawei.com>
powerpc/kprobes: Fix kprobe Oops happens in booke
Xie Yongji <xieyongji(a)bytedance.com>
nbd: Aovid double completion of a request
Longpeng(Mike) <longpeng2(a)huawei.com>
vsock/virtio: avoid potential deadlock when vsock device remove
Maximilian Heyne <mheyne(a)amazon.de>
xen/events: Fix race in set_evtchn_to_irq
Eric Dumazet <edumazet(a)google.com>
net: igmp: increase size of mr_ifc_count
Neal Cardwell <ncardwell(a)google.com>
tcp_bbr: fix u32 wrap bug in round logic if bbr_init() called after 2B packets
Willy Tarreau <w(a)1wt.eu>
net: linkwatch: fix failure to restore device state across suspend/resume
Yang Yingliang <yangyingliang(a)huawei.com>
net: bridge: fix memleak in br_add_if()
Vladimir Oltean <vladimir.oltean(a)nxp.com>
net: dsa: sja1105: fix broken backpressure in .port_fdb_dump
Vladimir Oltean <vladimir.oltean(a)nxp.com>
net: dsa: lantiq: fix broken backpressure in .port_fdb_dump
Vladimir Oltean <vladimir.oltean(a)nxp.com>
net: dsa: lan9303: fix broken backpressure in .port_fdb_dump
Eric Dumazet <edumazet(a)google.com>
net: igmp: fix data-race in igmp_ifc_timer_expire()
Takeshi Misawa <jeliantsurux(a)gmail.com>
net: Fix memory leak in ieee802154_raw_deliver
Ben Hutchings <ben.hutchings(a)mind.be>
net: dsa: microchip: Fix ksz_read64()
Christian Hewitt <christianshewitt(a)gmail.com>
drm/meson: fix colour distortion from HDR set during vendor u-boot
Aya Levin <ayal(a)nvidia.com>
net/mlx5: Fix return value from tracer initialization
Roi Dayan <roid(a)nvidia.com>
psample: Add a fwd declaration for skbuff
Md Fahad Iqbal Polash <md.fahad.iqbal.polash(a)intel.com>
iavf: Set RSS LUT and key in reset handle path
Hangbin Liu <liuhangbin(a)gmail.com>
net: sched: act_mirred: Reset ct info when mirror/redirect skb
Pali Rohár <pali(a)kernel.org>
ppp: Fix generating ifname when empty IFLA_IFNAME is specified
Ben Hutchings <ben.hutchings(a)mind.be>
net: phy: micrel: Fix link detection on ksz87xx switch"
Hans de Goede <hdegoede(a)redhat.com>
platform/x86: pcengines-apuv2: Add missing terminating entries to gpio-lookup tables
Florian Eckert <fe(a)dev.tdt.de>
platform/x86: pcengines-apuv2: revert wiring up simswitch GPIO as LED
DENG Qingfang <dqfext(a)gmail.com>
net: dsa: mt7530: add the missing RxUnicast MIB counter
Richard Fitzgerald <rf(a)opensource.cirrus.com>
ASoC: cs42l42: Fix LRCLK frame start edge
Yajun Deng <yajun.deng(a)linux.dev>
netfilter: nf_conntrack_bridge: Fix memory leak when error
Richard Fitzgerald <rf(a)opensource.cirrus.com>
ASoC: cs42l42: Remove duplicate control for WNF filter frequency
Richard Fitzgerald <rf(a)opensource.cirrus.com>
ASoC: cs42l42: Fix inversion of ADC Notch Switch control
Richard Fitzgerald <rf(a)opensource.cirrus.com>
ASoC: cs42l42: Don't allow SND_SOC_DAIFMT_LEFT_J
Richard Fitzgerald <rf(a)opensource.cirrus.com>
ASoC: cs42l42: Correct definition of ADC Volume control
Dongliang Mu <mudongliangabcd(a)gmail.com>
ieee802154: hwsim: fix GPF in hwsim_new_edge_nl
Dongliang Mu <mudongliangabcd(a)gmail.com>
ieee802154: hwsim: fix GPF in hwsim_set_edge_lqi
Dan Williams <dan.j.williams(a)intel.com>
libnvdimm/region: Fix label activation vs errors
Dan Williams <dan.j.williams(a)intel.com>
ACPI: NFIT: Fix support for virtual SPA ranges
Luis Henriques <lhenriques(a)suse.de>
ceph: reduce contention in ceph_check_delayed_caps()
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
i2c: dev: zero out array used for i2c reads from userspace
Takashi Iwai <tiwai(a)suse.de>
ASoC: intel: atom: Fix reference to PCM buffer address
Takashi Iwai <tiwai(a)suse.de>
ASoC: xilinx: Fix reference to PCM buffer address
Colin Ian King <colin.king(a)canonical.com>
iio: adc: Fix incorrect exit of for-loop
Chris Lesiak <chris.lesiak(a)licor.com>
iio: humidity: hdc100x: Add margin to the conversion time
Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
iio: adc: ti-ads7950: Ensure CS is deasserted after reading channels
-------------
Diffstat:
Makefile | 4 +-
arch/powerpc/kernel/kprobes.c | 3 +-
arch/x86/kernel/apic/io_apic.c | 6 +-
arch/x86/kernel/apic/msi.c | 13 ++-
arch/x86/kernel/cpu/resctrl/monitor.c | 27 +++--
arch/x86/kvm/vmx/vmx.h | 2 +-
arch/x86/tools/chkobjdump.awk | 1 +
drivers/acpi/nfit/core.c | 3 +
drivers/base/core.c | 1 +
drivers/block/nbd.c | 14 ++-
drivers/gpu/drm/meson/meson_registers.h | 5 +
drivers/gpu/drm/meson/meson_viu.c | 7 +-
drivers/i2c/i2c-dev.c | 5 +-
drivers/iio/adc/palmas_gpadc.c | 4 +-
drivers/iio/adc/ti-ads7950.c | 1 -
drivers/iio/humidity/hdc100x.c | 6 +-
drivers/iommu/intel-iommu.c | 7 +-
drivers/net/dsa/lan9303-core.c | 34 +++---
drivers/net/dsa/lantiq_gswip.c | 14 ++-
drivers/net/dsa/microchip/ksz_common.h | 8 +-
drivers/net/dsa/mt7530.c | 1 +
drivers/net/dsa/sja1105/sja1105_main.c | 4 +-
drivers/net/ethernet/intel/iavf/iavf_main.c | 13 ++-
.../ethernet/mellanox/mlx5/core/diag/fw_tracer.c | 11 +-
drivers/net/ieee802154/mac802154_hwsim.c | 6 +-
drivers/net/phy/micrel.c | 2 -
drivers/net/ppp/ppp_generic.c | 2 +-
drivers/nvdimm/namespace_devs.c | 17 ++-
drivers/pci/msi.c | 125 +++++++++++++--------
drivers/platform/x86/pcengines-apuv2.c | 5 +-
drivers/xen/events/events_base.c | 20 +++-
fs/ceph/caps.c | 17 ++-
fs/ceph/mds_client.c | 25 +++--
fs/ceph/snap.c | 54 +++++----
fs/ceph/super.h | 2 +-
include/asm-generic/vmlinux.lds.h | 1 +
include/linux/device.h | 1 +
include/linux/inetdevice.h | 2 +-
include/linux/irq.h | 2 +
include/linux/msi.h | 2 +-
include/net/psample.h | 2 +
kernel/irq/chip.c | 5 +-
kernel/irq/msi.c | 13 ++-
kernel/irq/timings.c | 5 +
net/bridge/br_if.c | 2 +
net/bridge/netfilter/nf_conntrack_bridge.c | 6 +
net/core/link_watch.c | 5 +-
net/ieee802154/socket.c | 7 +-
net/ipv4/igmp.c | 21 ++--
net/ipv4/tcp_bbr.c | 2 +-
net/sched/act_mirred.c | 3 +
net/vmw_vsock/virtio_transport.c | 7 +-
sound/soc/codecs/cs42l42.c | 39 +++----
sound/soc/intel/atom/sst-mfld-platform-pcm.c | 3 +-
sound/soc/xilinx/xlnx_formatter_pcm.c | 4 +-
55 files changed, 382 insertions(+), 219 deletions(-)
It is not safe to check page->index without holding the page lock.
It can be changed if the page is moved between the swap cache and the
page cache for a shmem file, for example. There is a VM_BUG_ON below
which checks page->index is correct after taking the page lock.
Cc: stable(a)vger.kernel.org
Fixes: 5c211ba29deb ("mm: add and use find_lock_entries")
Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
---
mm/filemap.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/mm/filemap.c b/mm/filemap.c
index d1458ecf2f51..34de0b14aaa9 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2033,17 +2033,16 @@ unsigned find_lock_entries(struct address_space *mapping, pgoff_t start,
XA_STATE(xas, &mapping->i_pages, start);
struct page *page;
rcu_read_lock();
while ((page = find_get_entry(&xas, end, XA_PRESENT))) {
if (!xa_is_value(page)) {
if (page->index < start)
goto put;
- VM_BUG_ON_PAGE(page->index != xas.xa_index, page);
if (page->index + thp_nr_pages(page) - 1 > end)
goto put;
if (!trylock_page(page))
goto put;
if (page->mapping != mapping || PageWriteback(page))
goto unlock;
VM_BUG_ON_PAGE(!thp_contains(page, xas.xa_index),
page);
--
2.30.2
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 826da771291fc25a428e871f9e7fb465e390f852 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx(a)linutronix.de>
Date: Thu, 29 Jul 2021 23:51:48 +0200
Subject: [PATCH] genirq: Provide IRQCHIP_AFFINITY_PRE_STARTUP
X86 IO/APIC and MSI interrupts (when used without interrupts remapping)
require that the affinity setup on startup is done before the interrupt is
enabled for the first time as the non-remapped operation mode cannot safely
migrate enabled interrupts from arbitrary contexts. Provide a new irq chip
flag which allows affected hardware to request this.
This has to be opt-in because there have been reports in the past that some
interrupt chips cannot handle affinity setting before startup.
Fixes: 18404756765c ("genirq: Expose default irq affinity mask (take 3)")
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Tested-by: Marc Zyngier <maz(a)kernel.org>
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.779791738@linutronix.de
diff --git a/include/linux/irq.h b/include/linux/irq.h
index 8e9a9ae471a6..c8293c817646 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -569,6 +569,7 @@ struct irq_chip {
* IRQCHIP_SUPPORTS_NMI: Chip can deliver NMIs, only for root irqchips
* IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND: Invokes __enable_irq()/__disable_irq() for wake irqs
* in the suspend path if they are in disabled state
+ * IRQCHIP_AFFINITY_PRE_STARTUP: Default affinity update before startup
*/
enum {
IRQCHIP_SET_TYPE_MASKED = (1 << 0),
@@ -581,6 +582,7 @@ enum {
IRQCHIP_SUPPORTS_LEVEL_MSI = (1 << 7),
IRQCHIP_SUPPORTS_NMI = (1 << 8),
IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND = (1 << 9),
+ IRQCHIP_AFFINITY_PRE_STARTUP = (1 << 10),
};
#include <linux/irqdesc.h>
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 7f04c7d8296e..a98bcfc4be7b 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -265,8 +265,11 @@ int irq_startup(struct irq_desc *desc, bool resend, bool force)
} else {
switch (__irq_startup_managed(desc, aff, force)) {
case IRQ_STARTUP_NORMAL:
+ if (d->chip->flags & IRQCHIP_AFFINITY_PRE_STARTUP)
+ irq_setup_affinity(desc);
ret = __irq_startup(desc);
- irq_setup_affinity(desc);
+ if (!(d->chip->flags & IRQCHIP_AFFINITY_PRE_STARTUP))
+ irq_setup_affinity(desc);
break;
case IRQ_STARTUP_MANAGED:
irq_do_set_affinity(d, aff, false);