Commit 5829f8a897e4 ("platform/x86: ideapad-laptop: Send
KEY_TOUCHPAD_TOGGLE on some models") made ideapad-laptop send
KEY_TOUCHPAD_TOGGLE when we receive an ACPI notify with VPC event bit 5 set
and the touchpad-state has not been changed by the EC itself already.
This was done under the assumption that this would be good to do to make
the touchpad-toggle hotkey work on newer models where the EC does not
toggle the touchpad on/off itself (because it is not routed through
the PS/2 controller, but uses I2C).
But it turns out that at least some models, e.g. the Yoga 7-15ITL5 the EC
triggers an ACPI notify with VPC event bit 5 set on resume, which would
now cause a spurious KEY_TOUCHPAD_TOGGLE on resume to which the desktop
environment responds by disabling the touchpad in software, breaking
the touchpad (until manually re-enabled) on resume.
It was never confirmed that sending KEY_TOUCHPAD_TOGGLE actually improves
things on new models and at least some new models like the Yoga 7-15ITL5
don't have a touchpad on/off toggle hotkey at all, while still sending
ACPI notify events with VPC event bit 5 set.
So it seems best to revert the change to send KEY_TOUCHPAD_TOGGLE when
receiving an ACPI notify events with VPC event bit 5 and the touchpad
state as reported by the EC has not changed.
Note this is not a full revert the code to cache the last EC touchpad
state is kept to avoid sending spurious KEY_TOUCHPAD_ON / _OFF events
on resume.
Fixes: 5829f8a897e4 ("platform/x86: ideapad-laptop: Send KEY_TOUCHPAD_TOGGLE on some models")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217234
Cc: stable(a)vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
---
drivers/platform/x86/ideapad-laptop.c | 23 ++++++++++-------------
1 file changed, 10 insertions(+), 13 deletions(-)
diff --git a/drivers/platform/x86/ideapad-laptop.c b/drivers/platform/x86/ideapad-laptop.c
index b5ef3452da1f..35c63cce0479 100644
--- a/drivers/platform/x86/ideapad-laptop.c
+++ b/drivers/platform/x86/ideapad-laptop.c
@@ -1170,7 +1170,6 @@ static const struct key_entry ideapad_keymap[] = {
{ KE_KEY, 65, { KEY_PROG4 } },
{ KE_KEY, 66, { KEY_TOUCHPAD_OFF } },
{ KE_KEY, 67, { KEY_TOUCHPAD_ON } },
- { KE_KEY, 68, { KEY_TOUCHPAD_TOGGLE } },
{ KE_KEY, 128, { KEY_ESC } },
/*
@@ -1526,18 +1525,16 @@ static void ideapad_sync_touchpad_state(struct ideapad_private *priv, bool send_
if (priv->features.ctrl_ps2_aux_port)
i8042_command(¶m, value ? I8042_CMD_AUX_ENABLE : I8042_CMD_AUX_DISABLE);
- if (send_events) {
- /*
- * On older models the EC controls the touchpad and toggles it
- * on/off itself, in this case we report KEY_TOUCHPAD_ON/_OFF.
- * If the EC did not toggle, report KEY_TOUCHPAD_TOGGLE.
- */
- if (value != priv->r_touchpad_val) {
- ideapad_input_report(priv, value ? 67 : 66);
- sysfs_notify(&priv->platform_device->dev.kobj, NULL, "touchpad");
- } else {
- ideapad_input_report(priv, 68);
- }
+ /*
+ * On older models the EC controls the touchpad and toggles it on/off
+ * itself, in this case we report KEY_TOUCHPAD_ON/_OFF. Some models do
+ * an acpi-notify with VPC bit 5 set on resume, so this function get
+ * called with send_events=true on every resume. Therefor if the EC did
+ * not toggle, do nothing to avoid sending spurious KEY_TOUCHPAD_TOGGLE.
+ */
+ if (send_events && value != priv->r_touchpad_val) {
+ ideapad_input_report(priv, value ? 67 : 66);
+ sysfs_notify(&priv->platform_device->dev.kobj, NULL, "touchpad");
}
priv->r_touchpad_val = value;
--
2.39.1
In the memory allocation of rp->domains in rapl_detect_domains(), there
is an additional memory of struct rapl_domain allocated to prevent the
pointer out of bounds called later.
Optimize the code here to save sizeof(struct rapl_domain) bytes of
memory.
Test in Intel NUC (i5-1135G7).
Cc: stable(a)vger.kernel.org
Signed-off-by: xiongxin <xiongxin(a)kylinos.cn>
Tested-by: xiongxin <xiongxin(a)kylinos.cn>
---
drivers/powercap/intel_rapl_common.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/powercap/intel_rapl_common.c b/drivers/powercap/intel_rapl_common.c
index 8970c7b80884..f8971282498a 100644
--- a/drivers/powercap/intel_rapl_common.c
+++ b/drivers/powercap/intel_rapl_common.c
@@ -1171,13 +1171,14 @@ static int rapl_package_register_powercap(struct rapl_package *rp)
{
struct rapl_domain *rd;
struct powercap_zone *power_zone = NULL;
- int nr_pl, ret;
+ int nr_pl, ret, i;
/* Update the domain data of the new package */
rapl_update_domain_data(rp);
/* first we register package domain as the parent zone */
- for (rd = rp->domains; rd < rp->domains + rp->nr_domains; rd++) {
+ for (i = 0; i < rp->nr_domains; i++) {
+ rd = &rp->domains[i];
if (rd->id == RAPL_DOMAIN_PACKAGE) {
nr_pl = find_nr_power_limit(rd);
pr_debug("register package domain %s\n", rp->name);
@@ -1201,8 +1202,9 @@ static int rapl_package_register_powercap(struct rapl_package *rp)
return -ENODEV;
}
/* now register domains as children of the socket/package */
- for (rd = rp->domains; rd < rp->domains + rp->nr_domains; rd++) {
+ for (i = 0; i < rp->nr_domains; i++) {
struct powercap_zone *parent = rp->power_zone;
+ rd = &rp->domains[i];
if (rd->id == RAPL_DOMAIN_PACKAGE)
continue;
@@ -1302,7 +1304,6 @@ static void rapl_detect_powerlimit(struct rapl_domain *rd)
*/
static int rapl_detect_domains(struct rapl_package *rp, int cpu)
{
- struct rapl_domain *rd;
int i;
for (i = 0; i < RAPL_DOMAIN_MAX; i++) {
@@ -1319,15 +1320,15 @@ static int rapl_detect_domains(struct rapl_package *rp, int cpu)
}
pr_debug("found %d domains on %s\n", rp->nr_domains, rp->name);
- rp->domains = kcalloc(rp->nr_domains + 1, sizeof(struct rapl_domain),
+ rp->domains = kcalloc(rp->nr_domains, sizeof(struct rapl_domain),
GFP_KERNEL);
if (!rp->domains)
return -ENOMEM;
rapl_init_domains(rp);
- for (rd = rp->domains; rd < rp->domains + rp->nr_domains; rd++)
- rapl_detect_powerlimit(rd);
+ for (i = 0; i < rp->nr_domains; i++)
+ rapl_detect_powerlimit(&rp->domains[i]);
return 0;
}
--
2.34.1
Dzień dobry,
jakiś czas temu zgłosiła się do nas firma, której strona internetowa nie pozycjonowała się wysoko w wyszukiwarce Google.
Na podstawie wykonanego przez nas audytu SEO zoptymalizowaliśmy treści na stronie pod kątem wcześniej opracowanych słów kluczowych. Nasz wewnętrzny system codziennie analizuje prawidłowe działanie witryny. Dzięki indywidualnej strategii, firma zdobywa coraz więcej Klientów.
Czy chcieliby Państwo zwiększyć liczbę osób odwiedzających stronę internetową firmy? Mógłbym przedstawić ofertę?
Pozdrawiam
Przemysław Polak
This series fixes two bugs in the RTW88 USB driver I was reported from
several people and that I also encountered myself.
The first one resulted in "timed out to flush queue 3" messages from the
driver and sometimes a complete stall of the TX queues.
The second one is specific to the RTW8821CU chipset. Here 2GHz networks
were hardly seen and impossible to connect to. This goes down to
misinterpreting the rfe_option field in the efuse.
Sascha Hauer (2):
wifi: rtw88: usb: fix priority queue to endpoint mapping
wifi: rtw88: rtw8821c: Fix rfe_option field width
drivers/net/wireless/realtek/rtw88/rtw8821c.c | 3 +-
drivers/net/wireless/realtek/rtw88/usb.c | 70 +++++++++++++------
2 files changed, 48 insertions(+), 25 deletions(-)
--
2.39.2
From: Hui Li <caelli(a)tencent.com>
We have met a hang on pty device, the reader was blocking
at epoll on master side, the writer was sleeping at wait_woken
inside n_tty_write on slave side, and the write buffer on
tty_port was full, we found that the reader and writer would
never be woken again and blocked forever.
The problem was caused by a race between reader and kworker:
n_tty_read(reader): n_tty_receive_buf_common(kworker):
copy_from_read_buf()|
|room = N_TTY_BUF_SIZE - (ldata->read_head - tail)
|room <= 0
n_tty_kick_worker() |
|ldata->no_room = true
After writing to slave device, writer wakes up kworker to flush
data on tty_port to reader, and the kworker finds that reader
has no room to store data so room <= 0 is met. At this moment,
reader consumes all the data on reader buffer and calls
n_tty_kick_worker to check ldata->no_room which is false and
reader quits reading. Then kworker sets ldata->no_room=true
and quits too.
If write buffer is not full, writer will wake kworker to flush data
again after following writes, but if write buffer is full and writer
goes to sleep, kworker will never be woken again and tty device is
blocked.
This problem can be solved with a check for read buffer size inside
n_tty_receive_buf_common, if read buffer is empty and ldata->no_room
is true, a call to n_tty_kick_worker is necessary to keep flushing
data to reader.
Cc: <stable(a)vger.kernel.org>
Fixes: 42458f41d08f ("n_tty: Ensure reader restarts worker for next reader")
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
Signed-off-by: Hui Li <caelli(a)tencent.com>
---
Patch changelogs between v1 and v2:
-add barrier inside n_tty_read and n_tty_receive_buf_common;
-comment why barrier is needed;
-access to ldata->no_room is changed with READ_ONCE and WRITE_ONCE;
Patch changelogs between v2 and v3:
-in function n_tty_receive_buf_common, add unlikely to check
ldata->no_room, eg: if (unlikely(ldata->no_room)), and READ_ONCE
is removed here to get locality;
-change comment for barrier to show the race condition to make
comment easier to understand;
Patch changelogs between v3 and v4:
-change subject from 'tty: fix a possible hang on tty device' to
'tty: fix hang on tty device with no_room set' to make subject
more obvious;
Patch changelogs between v4 and v5:
-name is changed from cael to caelli, li is added as the family
name and caelli is the fullname.
Patch changelogs between v5 and v6:
-change from and Signed-off-by, from 'caelli <juanfengpy(a)gmail.com>'
to 'caelli <caelli(a)tencent.com>', later one is my corporate address.
Patch changelogs between v6 and v7:
-change name from caelli to 'Hui Li', which is my name in chinese.
-the comment for barrier is improved, and a Fixes and Reviewed-by
tags is added.
drivers/tty/n_tty.c | 41 +++++++++++++++++++++++++++++++++++++----
1 file changed, 37 insertions(+), 4 deletions(-)
diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index c8f56c9b1a1c..8c17304fffcf 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -204,8 +204,8 @@ static void n_tty_kick_worker(struct tty_struct *tty)
struct n_tty_data *ldata = tty->disc_data;
/* Did the input worker stop? Restart it */
- if (unlikely(ldata->no_room)) {
- ldata->no_room = 0;
+ if (unlikely(READ_ONCE(ldata->no_room))) {
+ WRITE_ONCE(ldata->no_room, 0);
WARN_RATELIMIT(tty->port->itty == NULL,
"scheduling with invalid itty\n");
@@ -1698,7 +1698,7 @@ n_tty_receive_buf_common(struct tty_struct *tty, const unsigned char *cp,
if (overflow && room < 0)
ldata->read_head--;
room = overflow;
- ldata->no_room = flow && !room;
+ WRITE_ONCE(ldata->no_room, flow && !room);
} else
overflow = 0;
@@ -1729,6 +1729,27 @@ n_tty_receive_buf_common(struct tty_struct *tty, const unsigned char *cp,
} else
n_tty_check_throttle(tty);
+ if (unlikely(ldata->no_room)) {
+ /*
+ * Barrier here is to ensure to read the latest read_tail in
+ * chars_in_buffer() and to make sure that read_tail is not loaded
+ * before ldata->no_room is set, otherwise, following race may occur:
+ * n_tty_receive_buf_common()
+ * n_tty_read()
+ * if (!chars_in_buffer(tty))->false
+ * copy_from_read_buf()
+ * read_tail=commit_head
+ * n_tty_kick_worker()
+ * if (ldata->no_room)->false
+ * ldata->no_room = 1
+ * Then both kworker and reader will fail to kick n_tty_kick_worker(),
+ * smp_mb is paired with smp_mb() in n_tty_read().
+ */
+ smp_mb();
+ if (!chars_in_buffer(tty))
+ n_tty_kick_worker(tty);
+ }
+
up_read(&tty->termios_rwsem);
return rcvd;
@@ -2282,8 +2303,25 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
if (time)
timeout = time;
}
- if (old_tail != ldata->read_tail)
+ if (old_tail != ldata->read_tail) {
+ /*
+ * Make sure no_room is not read in n_tty_kick_worker()
+ * before setting ldata->read_tail in copy_from_read_buf(),
+ * otherwise, following race may occur:
+ * n_tty_read()
+ * n_tty_receive_buf_common()
+ * n_tty_kick_worker()
+ * if(ldata->no_room)->false
+ * ldata->no_room = 1
+ * if (!chars_in_buffer(tty))->false
+ * copy_from_read_buf()
+ * read_tail=commit_head
+ * Both reader and kworker will fail to kick tty_buffer_restart_work(),
+ * smp_mb is paired with smp_mb() in n_tty_receive_buf_common().
+ */
+ smp_mb();
n_tty_kick_worker(tty);
+ }
up_read(&tty->termios_rwsem);
remove_wait_queue(&tty->read_wait, &wait);
--
2.27.0
Hi stable maintainers,
This is a backport patch for commit df205b5c6328 ("KVM: arm64: Filter
out invalid core register IDs in KVM_GET_REG_LIST") to 4.14 and 4.19.
This commit was not applied to the 4.14-stable tree due to merge
conflict [1]. To backport this, commit be25bbb392fa ("KVM: arm64: Factor
out core register ID enumeration") that has no functional changes is
cherry-picked.
I'd appreciate if if you could consider backporting this to 4.14 and
4.19.
Best regards,
Takahiro
[1] https://lore.kernel.org/all/1560343489-22906-1-git-send-email-Dave.Martin@a…
Dave Martin (2):
KVM: arm64: Factor out core register ID enumeration
KVM: arm64: Filter out invalid core register IDs in KVM_GET_REG_LIST
arch/arm64/kvm/guest.c | 79 ++++++++++++++++++++++++++++++++++--------
1 file changed, 65 insertions(+), 14 deletions(-)
--
2.39.2
On Mon, Apr 03, 2023 at 09:11:43PM +0800, cuigaosheng wrote:
> On 2023/4/3 20:53, Greg KH wrote:
> > On Mon, Apr 03, 2023 at 08:17:04PM +0800, Gaosheng Cui wrote:
> > > This reverts commit c7a218cbf67fffcd99b76ae3b5e9c2e8bef17c8c.
> > >
> > > The memory of ctx is allocated by devm_kzalloc in cal_ctx_create,
> > > it should not be freed by kfree when cal_ctx_v4l2_init() fails,
> > > otherwise kfree() will cause double free, so revert this patch.
> > >
> > > Fixes: c7a218cbf67f ("media: ti: cal: fix possible memory leak in cal_ctx_create()")
> > > Signed-off-by: Gaosheng Cui <cuigaosheng1(a)huawei.com>
> > > ---
> > > drivers/media/platform/ti-vpe/cal.c | 4 +---
> > > 1 file changed, 1 insertion(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/media/platform/ti-vpe/cal.c b/drivers/media/platform/ti-vpe/cal.c
> > > index 93121c90d76a..2eef245c31a1 100644
> > > --- a/drivers/media/platform/ti-vpe/cal.c
> > > +++ b/drivers/media/platform/ti-vpe/cal.c
> > > @@ -624,10 +624,8 @@ static struct cal_ctx *cal_ctx_create(struct cal_dev *cal, int inst)
> > > ctx->cport = inst;
> > > ret = cal_ctx_v4l2_init(ctx);
> > > - if (ret) {
> > > - kfree(ctx);
> > > + if (ret)
> > > return NULL;
> > > - }
> > > return ctx;
> > > }
> > > --
> > > 2.25.1
> > >
> > Why is this not needed to be reverted in Linus's tree first?
> >
> > thanks,
> >
> > greg k-h
> > .
>
> Thanks for taking time to review this patch.
>
> The memory of ctx is allocated by kzalloc since commit 9e67f24e4d90
> ("media: ti-vpe: cal: fix ctx uninitialization"), so the fixes tag
> of patch c7a218cbf67fis not entirely accurate, mainline should merge this
> patch, but it should not be merged into 5.10, my apologies for
> notdiscovering this bug earlier. Gaosheng.
>
Great, can you please put all of this information in the changelog
explaining why this is only needed for this one branch and resend it?
thanks,
greg k-h
This reverts commit c7a218cbf67fffcd99b76ae3b5e9c2e8bef17c8c.
The memory of ctx is allocated by devm_kzalloc in cal_ctx_create,
it should not be freed by kfree when cal_ctx_v4l2_init() fails,
otherwise kfree() will cause double free, so revert this patch.
The memory of ctx is allocated by kzalloc since commit
9e67f24e4d9 ("media: ti-vpe: cal: fix ctx uninitialization"),
so the fixes tag of patch c7a218cbf67fis not entirely accurate,
mainline should merge this patch, but it should not be merged
into 5.10, so we just revert this patch for this branch.
Fixes: c7a218cbf67f ("media: ti: cal: fix possible memory leak in cal_ctx_create()")
Signed-off-by: Gaosheng Cui <cuigaosheng1(a)huawei.com>
---
v2:
- Update the commit message and explain why this is needed
for 5.10 branch, thanks!
drivers/media/platform/ti-vpe/cal.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/media/platform/ti-vpe/cal.c b/drivers/media/platform/ti-vpe/cal.c
index 93121c90d76a..2eef245c31a1 100644
--- a/drivers/media/platform/ti-vpe/cal.c
+++ b/drivers/media/platform/ti-vpe/cal.c
@@ -624,10 +624,8 @@ static struct cal_ctx *cal_ctx_create(struct cal_dev *cal, int inst)
ctx->cport = inst;
ret = cal_ctx_v4l2_init(ctx);
- if (ret) {
- kfree(ctx);
+ if (ret)
return NULL;
- }
return ctx;
}
--
2.25.1
Hi stable maintainers,
This is a backport patch for commit df205b5c6328 ("KVM: arm64: Filter
out invalid core register IDs in KVM_GET_REG_LIST") to 4.14 and 4.19.
This commit was not applied to the 4.14-stable tree due to merge
conflict [1]. To backport this, commit be25bbb392fa ("KVM: arm64: Factor
out core register ID enumeration") that has no functional changes is
cherry-picked.
I'd appreciate if if you could consider backporting this to 4.14 and
4.19.
Best regards,
Takahiro
[1] https://lore.kernel.org/all/1560343489-22906-1-git-send-email-Dave.Martin@a…
Dave Martin (2):
KVM: arm64: Factor out core register ID enumeration
KVM: arm64: Filter out invalid core register IDs in KVM_GET_REG_LIST
arch/arm64/kvm/guest.c | 79 ++++++++++++++++++++++++++++++++++--------
1 file changed, 65 insertions(+), 14 deletions(-)
--
2.39.2
From: Daniel Bristot de Oliveira <bristot(a)kernel.org>
osnoise/timerlat tracers are reporting new max latency on instances
where the tracing is off, creating inconsistencies between the max
reported values in the trace and in the tracing_max_latency. Thus
only report new tracing_max_latency on active tracing instances.
Link: https://lkml.kernel.org/r/ecd109fde4a0c24ab0f00ba1e9a144ac19a91322.16801041…
Cc: stable(a)vger.kernel.org
Fixes: dae181349f1e ("tracing/osnoise: Support a list of trace_array *tr")
Signed-off-by: Daniel Bristot de Oliveira <bristot(a)kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
kernel/trace/trace_osnoise.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
index e8116094bed8..4496975f2029 100644
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -1296,7 +1296,7 @@ static void notify_new_max_latency(u64 latency)
rcu_read_lock();
list_for_each_entry_rcu(inst, &osnoise_instances, list) {
tr = inst->tr;
- if (tr->max_latency < latency) {
+ if (tracer_tracing_is_on(tr) && tr->max_latency < latency) {
tr->max_latency = latency;
latency_fsnotify(tr);
}
--
2.39.2
From: Daniel Bristot de Oliveira <bristot(a)kernel.org>
timerlat is not reporting a new tracing_max_latency for the thread
latency. The reason is that it is not calling notify_new_max_latency()
function after the new thread latency is sampled.
Call notify_new_max_latency() after computing the thread latency.
Link: https://lkml.kernel.org/r/16e18d61d69073d0192ace07bf61e405cca96e9c.16801041…
Cc: stable(a)vger.kernel.org
Fixes: dae181349f1e ("tracing/osnoise: Support a list of trace_array *tr")
Signed-off-by: Daniel Bristot de Oliveira <bristot(a)kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
kernel/trace/trace_osnoise.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
index 9176bb7a9bb4..e8116094bed8 100644
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -1738,6 +1738,8 @@ static int timerlat_main(void *data)
trace_timerlat_sample(&s);
+ notify_new_max_latency(diff);
+
timerlat_dump_stack(time_to_us(diff));
tlat->tracing_thread = false;
--
2.39.2
From: John Keeping <john(a)metanate.com>
If the compiler decides not to inline this function then preemption
tracing will always show an IP inside the preemption disabling path and
never the function actually calling preempt_{enable,disable}.
Link: https://lore.kernel.org/linux-trace-kernel/20230327173647.1690849-1-john@me…
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: stable(a)vger.kernel.org
Fixes: f904f58263e1d ("sched/debug: Fix preempt_disable_ip recording for preempt_disable()")
Signed-off-by: John Keeping <john(a)metanate.com>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
include/linux/ftrace.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 366c730beaa3..402fc061de75 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -980,7 +980,7 @@ static inline void __ftrace_enabled_restore(int enabled)
#define CALLER_ADDR5 ((unsigned long)ftrace_return_address(5))
#define CALLER_ADDR6 ((unsigned long)ftrace_return_address(6))
-static inline unsigned long get_lock_parent_ip(void)
+static __always_inline unsigned long get_lock_parent_ip(void)
{
unsigned long addr = CALLER_ADDR0;
--
2.39.2
From: Zheng Yejian <zhengyejian1(a)huawei.com>
When user reads file 'trace_pipe', kernel keeps printing following logs
that warn at "cpu_buffer->reader_page->read > rb_page_size(reader)" in
rb_get_reader_page(). It just looks like there's an infinite loop in
tracing_read_pipe(). This problem occurs several times on arm64 platform
when testing v5.10 and below.
Call trace:
rb_get_reader_page+0x248/0x1300
rb_buffer_peek+0x34/0x160
ring_buffer_peek+0xbc/0x224
peek_next_entry+0x98/0xbc
__find_next_entry+0xc4/0x1c0
trace_find_next_entry_inc+0x30/0x94
tracing_read_pipe+0x198/0x304
vfs_read+0xb4/0x1e0
ksys_read+0x74/0x100
__arm64_sys_read+0x24/0x30
el0_svc_common.constprop.0+0x7c/0x1bc
do_el0_svc+0x2c/0x94
el0_svc+0x20/0x30
el0_sync_handler+0xb0/0xb4
el0_sync+0x160/0x180
Then I dump the vmcore and look into the problematic per_cpu ring_buffer,
I found that tail_page/commit_page/reader_page are on the same page while
reader_page->read is obviously abnormal:
tail_page == commit_page == reader_page == {
.write = 0x100d20,
.read = 0x8f9f4805, // Far greater than 0xd20, obviously abnormal!!!
.entries = 0x10004c,
.real_end = 0x0,
.page = {
.time_stamp = 0x857257416af0,
.commit = 0xd20, // This page hasn't been full filled.
// .data[0...0xd20] seems normal.
}
}
The root cause is most likely the race that reader and writer are on the
same page while reader saw an event that not fully committed by writer.
To fix this, add memory barriers to make sure the reader can see the
content of what is committed. Since commit a0fcaaed0c46 ("ring-buffer: Fix
race between reset page and reading page") has added the read barrier in
rb_get_reader_page(), here we just need to add the write barrier.
Link: https://lore.kernel.org/linux-trace-kernel/20230325021247.2923907-1-zhengye…
Cc: stable(a)vger.kernel.org
Fixes: 77ae365eca89 ("ring-buffer: make lockless")
Suggested-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
Signed-off-by: Zheng Yejian <zhengyejian1(a)huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
kernel/trace/ring_buffer.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index c6f47b6cfd5f..76a2d91eecad 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -3098,6 +3098,10 @@ rb_set_commit_to_write(struct ring_buffer_per_cpu *cpu_buffer)
if (RB_WARN_ON(cpu_buffer,
rb_is_reader_page(cpu_buffer->tail_page)))
return;
+ /*
+ * No need for a memory barrier here, as the update
+ * of the tail_page did it for this page.
+ */
local_set(&cpu_buffer->commit_page->page->commit,
rb_page_write(cpu_buffer->commit_page));
rb_inc_page(&cpu_buffer->commit_page);
@@ -3107,6 +3111,8 @@ rb_set_commit_to_write(struct ring_buffer_per_cpu *cpu_buffer)
while (rb_commit_index(cpu_buffer) !=
rb_page_write(cpu_buffer->commit_page)) {
+ /* Make sure the readers see the content of what is committed. */
+ smp_wmb();
local_set(&cpu_buffer->commit_page->page->commit,
rb_page_write(cpu_buffer->commit_page));
RB_WARN_ON(cpu_buffer,
@@ -4684,7 +4690,12 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
/*
* Make sure we see any padding after the write update
- * (see rb_reset_tail())
+ * (see rb_reset_tail()).
+ *
+ * In addition, a writer may be writing on the reader page
+ * if the page has not been fully filled, so the read barrier
+ * is also needed to make sure we see the content of what is
+ * committed by the writer (see rb_set_commit_to_write()).
*/
smp_rmb();
--
2.39.2
The Lenovo ThinkPad W530 uses a nvidia k1000m GPU. When this gets used
together with one of the older nvidia binary driver series (the latest
series does not support it), then backlight control does not work.
This is caused by commit 3dbc80a3e4c5 ("ACPI: video: Make backlight
class device registration a separate step (v2)") combined with
commit 5aa9d943e9b6 ("ACPI: video: Don't enable fallback path for
creating ACPI backlight by default").
After these changes the acpi_video# backlight device is only registered
when requested by a GPU driver calling acpi_video_register_backlight()
which the nvidia binary driver does not do.
I realize that using the nvidia binary driver is not a supported use-case
and users can workaround this by adding acpi_backlight=video on the kernel
commandline, but the ThinkPad W530 is a popular model under Linux users,
so it seems worthwhile to add a quirk for this.
I will also email Nvidia asking them to make the driver call
acpi_video_register_backlight() when an internal LCD panel is detected.
So maybe the next maintenance release of the drivers will fix this...
Fixes: 5aa9d943e9b6 ("ACPI: video: Don't enable fallback path for creating ACPI backlight by default")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
---
drivers/acpi/video_detect.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
index 295744fe7c92..e85729fc481f 100644
--- a/drivers/acpi/video_detect.c
+++ b/drivers/acpi/video_detect.c
@@ -299,6 +299,20 @@ static const struct dmi_system_id video_detect_dmi_table[] = {
},
},
+ /*
+ * Older models with nvidia GPU which need acpi_video backlight
+ * control and where the old nvidia binary driver series does not
+ * call acpi_video_register_backlight().
+ */
+ {
+ .callback = video_detect_force_video,
+ /* ThinkPad W530 */
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_VERSION, "ThinkPad W530"),
+ },
+ },
+
/*
* These models have a working acpi_video backlight control, and using
* native backlight causes a regression where backlight does not work
--
2.39.1
On the Apple iMac14,1 and iMac14,2 all-in-ones (monitors with builtin "PC")
the connection between the GPU and the panel is seen by the GPU driver as
regular DP instead of eDP, causing the GPU driver to never call
acpi_video_register_backlight().
(GPU drivers only call acpi_video_register_backlight() when an internal
panel is detected, to avoid non working acpi_video# devices getting
registered on desktops which unfortunately is a real issue.)
Fix the missing acpi_video# backlight device on these all-in-ones by
adding a acpi_backlight=video DMI quirk, so that video.ko will
immediately register the backlight device instead of waiting for
an acpi_video_register_backlight() call.
Fixes: 5aa9d943e9b6 ("ACPI: video: Don't enable fallback path for creating ACPI backlight by default")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
---
drivers/acpi/video_detect.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
index f7c218dd8742..295744fe7c92 100644
--- a/drivers/acpi/video_detect.c
+++ b/drivers/acpi/video_detect.c
@@ -276,6 +276,29 @@ static const struct dmi_system_id video_detect_dmi_table[] = {
},
},
+ /*
+ * Models which need acpi_video backlight control where the GPU drivers
+ * do not call acpi_video_register_backlight() because no internal panel
+ * is detected. Typically these are all-in-ones (monitors with builtin
+ * PC) where the panel connection shows up as regular DP instead of eDP.
+ */
+ {
+ .callback = video_detect_force_video,
+ /* Apple iMac14,1 */
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Apple Inc."),
+ DMI_MATCH(DMI_PRODUCT_NAME, "iMac14,1"),
+ },
+ },
+ {
+ .callback = video_detect_force_video,
+ /* Apple iMac14,2 */
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Apple Inc."),
+ DMI_MATCH(DMI_PRODUCT_NAME, "iMac14,2"),
+ },
+ },
+
/*
* These models have a working acpi_video backlight control, and using
* native backlight causes a regression where backlight does not work
--
2.39.1
Commit 3dbc80a3e4c5 ("ACPI: video: Make backlight class device
registration a separate step (v2)") combined with
commit 5aa9d943e9b6 ("ACPI: video: Don't enable fallback path for
creating ACPI backlight by default")
Means that the video.ko code now fully depends on the GPU driver calling
acpi_video_register_backlight() for the acpi_video# backlight class
devices to get registered.
This means that if the GPU driver does not do this, acpi_backlight=video
on the cmdline, or DMI quirks for selecting acpi_video# will not work.
This is a problem on for example Apple iMac14,1 all-in-ones where
the monitor's LCD panel shows up as a regular DP connection instead of
eDP so the GPU driver will not call acpi_video_register_backlight() [1].
Fix this by making video.ko directly register the acpi_video# devices
when these have been explicitly requested either on the cmdline or
through DMI quirks (rather then auto-detection being used).
[1] GPU drivers only call acpi_video_register_backlight() when an internal
panel is detected, to avoid non working acpi_video# devices getting
registered on desktops which unfortunately is a real issue.
Fixes: 5aa9d943e9b6 ("ACPI: video: Don't enable fallback path for creating ACPI backlight by default")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
---
drivers/acpi/acpi_video.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/drivers/acpi/acpi_video.c b/drivers/acpi/acpi_video.c
index 97b711e57bff..c7a6d0b69dab 100644
--- a/drivers/acpi/acpi_video.c
+++ b/drivers/acpi/acpi_video.c
@@ -1984,6 +1984,7 @@ static int instance;
static int acpi_video_bus_add(struct acpi_device *device)
{
struct acpi_video_bus *video;
+ bool auto_detect;
int error;
acpi_status status;
@@ -2045,10 +2046,20 @@ static int acpi_video_bus_add(struct acpi_device *device)
mutex_unlock(&video_list_lock);
/*
- * The userspace visible backlight_device gets registered separately
- * from acpi_video_register_backlight().
+ * If backlight-type auto-detection is used then a native backlight may
+ * show up later and this may change the result from video to native.
+ * Therefor normally the userspace visible /sys/class/backlight device
+ * gets registered separately by the GPU driver calling
+ * acpi_video_register_backlight() when an internal panel is detected.
+ * Register the backlight now when not using auto-detection, so that
+ * when the kernel cmdline or DMI-quirks are used the backlight will
+ * get registered even if acpi_video_register_backlight() is not called.
*/
acpi_video_run_bcl_for_osi(video);
+ if (__acpi_video_get_backlight_type(false, &auto_detect) == acpi_backlight_video &&
+ !auto_detect)
+ acpi_video_bus_register_backlight(video);
+
acpi_video_bus_add_notify_handler(video);
return 0;
--
2.39.1
Allow callers of __acpi_video_get_backlight_type() to pass a pointer
to a bool which will get set to false if the backlight-type comes from
the cmdline or a DMI quirk and set to true if auto-detection was used.
And make __acpi_video_get_backlight_type() non static so that it can
be called directly outside of video_detect.c .
While at it turn the acpi_video_get_backlight_type() and
acpi_video_backlight_use_native() wrappers into static inline functions
in include/acpi/video.h, so that we need to export one less symbol.
Fixes: 5aa9d943e9b6 ("ACPI: video: Don't enable fallback path for creating ACPI backlight by default")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
---
drivers/acpi/video_detect.c | 21 ++++++++-------------
include/acpi/video.h | 15 +++++++++++++--
2 files changed, 21 insertions(+), 15 deletions(-)
diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
index fd7cbce8076e..f7c218dd8742 100644
--- a/drivers/acpi/video_detect.c
+++ b/drivers/acpi/video_detect.c
@@ -782,7 +782,7 @@ static bool prefer_native_over_acpi_video(void)
* Determine which type of backlight interface to use on this system,
* First check cmdline, then dmi quirks, then do autodetect.
*/
-static enum acpi_backlight_type __acpi_video_get_backlight_type(bool native)
+enum acpi_backlight_type __acpi_video_get_backlight_type(bool native, bool *auto_detect)
{
static DEFINE_MUTEX(init_mutex);
static bool nvidia_wmi_ec_present;
@@ -807,6 +807,9 @@ static enum acpi_backlight_type __acpi_video_get_backlight_type(bool native)
native_available = true;
mutex_unlock(&init_mutex);
+ if (auto_detect)
+ *auto_detect = false;
+
/*
* The below heuristics / detection steps are in order of descending
* presedence. The commandline takes presedence over anything else.
@@ -818,6 +821,9 @@ static enum acpi_backlight_type __acpi_video_get_backlight_type(bool native)
if (acpi_backlight_dmi != acpi_backlight_undef)
return acpi_backlight_dmi;
+ if (auto_detect)
+ *auto_detect = true;
+
/* Special cases such as nvidia_wmi_ec and apple gmux. */
if (nvidia_wmi_ec_present)
return acpi_backlight_nvidia_wmi_ec;
@@ -837,15 +843,4 @@ static enum acpi_backlight_type __acpi_video_get_backlight_type(bool native)
/* No ACPI video/native (old hw), use vendor specific fw methods. */
return acpi_backlight_vendor;
}
-
-enum acpi_backlight_type acpi_video_get_backlight_type(void)
-{
- return __acpi_video_get_backlight_type(false);
-}
-EXPORT_SYMBOL(acpi_video_get_backlight_type);
-
-bool acpi_video_backlight_use_native(void)
-{
- return __acpi_video_get_backlight_type(true) == acpi_backlight_native;
-}
-EXPORT_SYMBOL(acpi_video_backlight_use_native);
+EXPORT_SYMBOL(__acpi_video_get_backlight_type);
diff --git a/include/acpi/video.h b/include/acpi/video.h
index 8ed9bec03e53..ff5a8da5d883 100644
--- a/include/acpi/video.h
+++ b/include/acpi/video.h
@@ -59,8 +59,6 @@ extern void acpi_video_unregister(void);
extern void acpi_video_register_backlight(void);
extern int acpi_video_get_edid(struct acpi_device *device, int type,
int device_id, void **edid);
-extern enum acpi_backlight_type acpi_video_get_backlight_type(void);
-extern bool acpi_video_backlight_use_native(void);
/*
* Note: The value returned by acpi_video_handles_brightness_key_presses()
* may change over time and should not be cached.
@@ -69,6 +67,19 @@ extern bool acpi_video_handles_brightness_key_presses(void);
extern int acpi_video_get_levels(struct acpi_device *device,
struct acpi_video_device_brightness **dev_br,
int *pmax_level);
+
+extern enum acpi_backlight_type __acpi_video_get_backlight_type(bool native,
+ bool *auto_detect);
+
+static inline enum acpi_backlight_type acpi_video_get_backlight_type(void)
+{
+ return __acpi_video_get_backlight_type(false, NULL);
+}
+
+static inline bool acpi_video_backlight_use_native(void)
+{
+ return __acpi_video_get_backlight_type(true, NULL) == acpi_backlight_native;
+}
#else
static inline void acpi_video_report_nolcd(void) { return; };
static inline int acpi_video_register(void) { return -ENODEV; }
--
2.39.1
Remove the EDO mode support from as the FMC2 controller does not
support the feature.
Signed-off-by: Christophe Kerello <christophe.kerello(a)foss.st.com>
Fixes: 2cd457f328c1 ("mtd: rawnand: stm32_fmc2: add STM32 FMC2 NAND flash controller driver")
Cc: stable(a)vger.kernel.org #v5.4+
Reviewed-by: Tudor Ambarus <tudor.ambarus(a)linaro.org>
---
Changes in v3:
- The commit message has been reworked
- Cc to stable added
- Tudor Reviewed-by added
drivers/mtd/nand/raw/stm32_fmc2_nand.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/mtd/nand/raw/stm32_fmc2_nand.c b/drivers/mtd/nand/raw/stm32_fmc2_nand.c
index 5d627048c420..3abb63d00a0b 100644
--- a/drivers/mtd/nand/raw/stm32_fmc2_nand.c
+++ b/drivers/mtd/nand/raw/stm32_fmc2_nand.c
@@ -1531,6 +1531,9 @@ static int stm32_fmc2_nfc_setup_interface(struct nand_chip *chip, int chipnr,
if (IS_ERR(sdrt))
return PTR_ERR(sdrt);
+ if (sdrt->tRC_min < 30000)
+ return -EOPNOTSUPP;
+
if (chipnr == NAND_DATA_IFACE_CHECK_ONLY)
return 0;
--
2.25.1
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 005308f7bdacf5685ed1a431244a183dbbb9e0e8
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040352-overbuilt-backshift-9c74@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
005308f7bdac ("io_uring/poll: clear single/double poll flags on poll arming")
5204aa8c43bd ("io_uring: add a helper for apoll alloc")
329061d3e2f9 ("io_uring: move poll handling into its own file")
cfd22e6b3319 ("io_uring: add opcode name to io_op_defs")
c9f06aa7de15 ("io_uring: move io_uring_task (tctx) helpers into its own file")
a4ad4f748ea9 ("io_uring: move fdinfo helpers to its own file")
e5550a1447bf ("io_uring: use io_is_uring_fops() consistently")
17437f311490 ("io_uring: move SQPOLL related handling into its own file")
59915143e89f ("io_uring: move timeout opcodes and handling into its own file")
e418bbc97bff ("io_uring: move our reference counting into a header")
36404b09aa60 ("io_uring: move msg_ring into its own file")
f9ead18c1058 ("io_uring: split network related opcodes into its own file")
e0da14def1ee ("io_uring: move statx handling to its own file")
a9c210cebe13 ("io_uring: move epoll handler to its own file")
4cf90495281b ("io_uring: add a dummy -EOPNOTSUPP prep handler")
99f15d8d6136 ("io_uring: move uring_cmd handling to its own file")
cd40cae29ef8 ("io_uring: split out open/close operations")
453b329be5ea ("io_uring: separate out file table handling code")
f4c163dd7d4b ("io_uring: split out fadvise/madvise operations")
0d5847274037 ("io_uring: split out fs related sync/fallocate functions")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 005308f7bdacf5685ed1a431244a183dbbb9e0e8 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe(a)kernel.dk>
Date: Mon, 27 Mar 2023 19:56:18 -0600
Subject: [PATCH] io_uring/poll: clear single/double poll flags on poll arming
Unless we have at least one entry queued, then don't call into
io_poll_remove_entries(). Normally this isn't possible, but if we
retry poll then we can have ->nr_entries cleared again as we're
setting it up. If this happens for a poll retry, then we'll still have
at least REQ_F_SINGLE_POLL set. io_poll_remove_entries() then thinks
it has entries to remove.
Clear REQ_F_SINGLE_POLL and REQ_F_DOUBLE_POLL unconditionally when
arming a poll request.
Fixes: c16bda37594f ("io_uring/poll: allow some retries for poll triggering spuriously")
Cc: stable(a)vger.kernel.org
Reported-by: Pengfei Xu <pengfei.xu(a)intel.com>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/io_uring/poll.c b/io_uring/poll.c
index 795facbd0e9f..55306e801081 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -726,6 +726,7 @@ int io_arm_poll_handler(struct io_kiocb *req, unsigned issue_flags)
apoll = io_req_alloc_apoll(req, issue_flags);
if (!apoll)
return IO_APOLL_ABORTED;
+ req->flags &= ~(REQ_F_SINGLE_POLL | REQ_F_DOUBLE_POLL);
req->flags |= REQ_F_POLLED;
ipt.pt._qproc = io_async_queue_proc;
When upreving llvm I realised that kexec stopped working on my test
platform. This patch fixes it.
Signed-off-by: Ricardo Ribalda <ribalda(a)chromium.org>
---
Changes in v5:
- Add warning when multiple text sections are found. Thanks Simon!
- Add Fixes tag.
- Link to v4: https://lore.kernel.org/r/20230321-kexec_clang16-v4-0-1340518f98e9@chromium…
Changes in v4:
- Add Cc: stable
- Add linker script for x86
- Add a warning when the kernel image has overlapping sections.
- Link to v3: https://lore.kernel.org/r/20230321-kexec_clang16-v3-0-5f016c8d0e87@chromium…
Changes in v3:
- Fix initial value. Thanks Ross!
- Link to v2: https://lore.kernel.org/r/20230321-kexec_clang16-v2-0-d10e5d517869@chromium…
Changes in v2:
- Fix if condition. Thanks Steven!.
- Update Philipp email. Thanks Baoquan.
- Link to v1: https://lore.kernel.org/r/20230321-kexec_clang16-v1-0-a768fc2c7c4d@chromium…
---
Ricardo Ribalda (2):
kexec: Support purgatories with .text.hot sections
x86/purgatory: Add linker script
arch/x86/purgatory/.gitignore | 2 ++
arch/x86/purgatory/Makefile | 20 +++++++++----
arch/x86/purgatory/kexec-purgatory.S | 2 +-
arch/x86/purgatory/purgatory.lds.S | 57 ++++++++++++++++++++++++++++++++++++
kernel/kexec_file.c | 14 ++++++++-
5 files changed, 87 insertions(+), 8 deletions(-)
---
base-commit: 17214b70a159c6547df9ae204a6275d983146f6b
change-id: 20230321-kexec_clang16-4510c23d129c
Best regards,
--
Ricardo Ribalda <ribalda(a)chromium.org>
Hi,
I am running kernel 6.1 on a system with a mv88e6320 and can easily
trigger a flood of "mv88e6085 30be0000.ethernet-1:00: VTU member
violation for vid 10, source port 5" messages.
When this happens, the Ethernet audio that passes through the switch
causes a loud noise in the speaker.
Backporting the following commits to 6.1 solves the problem:
4bf24ad09bc0 ("net: dsa: mv88e6xxx: read FID when handling ATU violations")
8646384d80f3 ("net: dsa: mv88e6xxx: replace ATU violation prints with
trace points")
9e3d9ae52b56 ("net: dsa: mv88e6xxx: replace VTU violation prints with
trace points")
Please apply them to 6.1-stable tree.
Thanks,
Fabio Estevam
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x fd7276189450110ed835eb0a334e62d2f1c4e3be
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040302-flakily-define-371e@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
fd7276189450 ("powerpc: Don't try to copy PPR for task with NULL pt_regs")
47e12855a91d ("powerpc: switch to ->regset_get()")
6e0b79750ce2 ("powerpc/ptrace: move register viewing functions out of ptrace.c")
7c1f8db019f8 ("powerpc/ptrace: split out TRANSACTIONAL_MEM related functions.")
60ef9dbd9d2a ("powerpc/ptrace: split out SPE related functions.")
1b20773b00b7 ("powerpc/ptrace: split out ALTIVEC related functions.")
7b99ed4e8e3a ("powerpc/ptrace: split out VSX related functions.")
963ae6b2ff1c ("powerpc/ptrace: drop PARAMETER_SAVE_AREA_OFFSET")
f1763e623c69 ("powerpc/ptrace: drop unnecessary #ifdefs CONFIG_PPC64")
b3138536c837 ("powerpc/ptrace: remove unused header includes")
da9a1c10e2c7 ("powerpc: Move ptrace into a subdirectory.")
68b34588e202 ("powerpc/64/sycall: Implement syscall entry/exit logic in C")
965dd3ad3076 ("powerpc/64/syscall: Remove non-volatile GPR save optimisation")
fddb5d430ad9 ("open: introduce openat2(2) syscall")
793b08e2efff ("powerpc/kexec: Move kexec files into a dedicated subdir.")
9f7bd9201521 ("powerpc/32: Split kexec low level code out of misc_32.S")
b020aa9d1e87 ("powerpc: cleanup hw_irq.h")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fd7276189450110ed835eb0a334e62d2f1c4e3be Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe(a)kernel.dk>
Date: Sun, 26 Mar 2023 16:15:57 -0600
Subject: [PATCH] powerpc: Don't try to copy PPR for task with NULL pt_regs
powerpc sets up PF_KTHREAD and PF_IO_WORKER with a NULL pt_regs, which
from my (arguably very short) checking is not commonly done for other
archs. This is fine, except when PF_IO_WORKER's have been created and
the task does something that causes a coredump to be generated. Then we
get this crash:
Kernel attempted to read user page (160) - exploit attempt? (uid: 1000)
BUG: Kernel NULL pointer dereference on read at 0x00000160
Faulting instruction address: 0xc0000000000c3a60
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=32 NUMA pSeries
Modules linked in: bochs drm_vram_helper drm_kms_helper xts binfmt_misc ecb ctr syscopyarea sysfillrect cbc sysimgblt drm_ttm_helper aes_generic ttm sg libaes evdev joydev virtio_balloon vmx_crypto gf128mul drm dm_mod fuse loop configfs drm_panel_orientation_quirks ip_tables x_tables autofs4 hid_generic usbhid hid xhci_pci xhci_hcd usbcore usb_common sd_mod
CPU: 1 PID: 1982 Comm: ppc-crash Not tainted 6.3.0-rc2+ #88
Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,HEAD hv:linux,kvm pSeries
NIP: c0000000000c3a60 LR: c000000000039944 CTR: c0000000000398e0
REGS: c0000000041833b0 TRAP: 0300 Not tainted (6.3.0-rc2+)
MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 88082828 XER: 200400f8
...
NIP memcpy_power7+0x200/0x7d0
LR ppr_get+0x64/0xb0
Call Trace:
ppr_get+0x40/0xb0 (unreliable)
__regset_get+0x180/0x1f0
regset_get_alloc+0x64/0x90
elf_core_dump+0xb98/0x1b60
do_coredump+0x1c34/0x24a0
get_signal+0x71c/0x1410
do_notify_resume+0x140/0x6f0
interrupt_exit_user_prepare_main+0x29c/0x320
interrupt_exit_user_prepare+0x6c/0xa0
interrupt_return_srr_user+0x8/0x138
Because ppr_get() is trying to copy from a PF_IO_WORKER with a NULL
pt_regs.
Check for a valid pt_regs in both ppc_get/ppr_set, and return an error
if not set. The actual error value doesn't seem to be important here, so
just pick -EINVAL.
Fixes: fa439810cc1b ("powerpc/ptrace: Enable support for NT_PPPC_TAR, NT_PPC_PPR, NT_PPC_DSCR")
Cc: stable(a)vger.kernel.org # v4.8+
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
[mpe: Trim oops in change log, add Fixes & Cc stable]
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://msgid.link/d9f63344-fe7c-56ae-b420-4a1a04a2ae4c@kernel.dk
diff --git a/arch/powerpc/kernel/ptrace/ptrace-view.c b/arch/powerpc/kernel/ptrace/ptrace-view.c
index 2087a785f05f..5fff0d04b23f 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-view.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-view.c
@@ -290,6 +290,9 @@ static int gpr_set(struct task_struct *target, const struct user_regset *regset,
static int ppr_get(struct task_struct *target, const struct user_regset *regset,
struct membuf to)
{
+ if (!target->thread.regs)
+ return -EINVAL;
+
return membuf_write(&to, &target->thread.regs->ppr, sizeof(u64));
}
@@ -297,6 +300,9 @@ static int ppr_set(struct task_struct *target, const struct user_regset *regset,
unsigned int pos, unsigned int count, const void *kbuf,
const void __user *ubuf)
{
+ if (!target->thread.regs)
+ return -EINVAL;
+
return user_regset_copyin(&pos, &count, &kbuf, &ubuf,
&target->thread.regs->ppr, 0, sizeof(u64));
}
I am very sorry. My gcc version is 7.5 and it does not report error.
We have a deadlock problem which can be solved by commit 4f7e7236435ca
("cgroup: Fix threadgroup_rwsem <-> cpus_read_lock() deadlock").
However, it makes lock order of cpus_read_lock and cpuset_mutex
wrong in v4.19. The call sequence is as follows:
cgroup_procs_write()
cgroup_procs_write_start()
get_online_cpus(); // cpus_read_lock()
percpu_down_write(&cgroup_threadgroup_rwsem)
cgroup_attach_task
cgroup_migrate
cgroup_migrate_execute
ss->attach (cpust_attach)
mutex_lock(&cpuset_mutex)
it seems hard to make cpus_read_lock is locked before
cgroup_threadgroup_rwsem and cpuset_mutex is locked before
cpus_read_lock unless backport the commit d74b27d63a8beb
("cgroup/cpuset: Change cpuset_rwsem and hotplug lock order")
Changes in v2:
* Add #include <linux/cpu.h> in kernel/cgroup/cgroup.c to
avoid some compile error.
* Exchange get_online_cpus() location in cpuset_attach to
keep cpu_hotplug_lock->cpuset_mutex order, although it will
be remove by ("cgroup: Fix threadgroup_rwsem <->
cpus_read_lock() deadlock")
Juri Lelli (1):
cgroup/cpuset: Change cpuset_rwsem and hotplug lock order
Tejun Heo (1):
cgroup: Fix threadgroup_rwsem <-> cpus_read_lock() deadlock
Tetsuo Handa (1):
cgroup: Add missing cpus_read_lock() to cgroup_attach_task_all()
include/linux/cpuset.h | 8 +++----
kernel/cgroup/cgroup-v1.c | 3 +++
kernel/cgroup/cgroup.c | 50 +++++++++++++++++++++++++++++++++++----
kernel/cgroup/cpuset.c | 25 ++++++++++++--------
4 files changed, 67 insertions(+), 19 deletions(-)
--
2.17.1
Kernel bug in iomap_read_inline_data() fixed by the following patch is hit
on older stables 4.19/5.4/5.10.
The patch failed to be initially backported into stable branches older
than 5.15 due to the upstream commit 7db354444ad8 ("gfs2: Cosmetic
gfs2_dinode_{in,out} cleanup").
Now it can be cleanly applied to the 4.19/5.4/5.10 stable branches.
The patch below does not apply to the 6.2-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.2.y
git checkout FETCH_HEAD
git cherry-pick -x 0482c34ec6f8557e06cd0f8e2d0e20e8ede6a22c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '16800048817970(a)kroah.com' --subject-prefix 'PATCH 6.2.y' HEAD^..
Possible dependencies:
0482c34ec6f8 ("usb: ucsi: Fix ucsi->connector race")
f87fb985452a ("usb: ucsi: Fix NULL pointer deref in ucsi_connector_change()")
924fb3ec50f5 ("Merge 6.2-rc7 into usb-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0482c34ec6f8557e06cd0f8e2d0e20e8ede6a22c Mon Sep 17 00:00:00 2001
From: Hans de Goede <hdegoede(a)redhat.com>
Date: Wed, 8 Mar 2023 16:42:43 +0100
Subject: [PATCH] usb: ucsi: Fix ucsi->connector race
ucsi_init() which runs from a workqueue sets ucsi->connector and
on an error will clear it again.
ucsi->connector gets dereferenced by ucsi_resume(), this checks for
ucsi->connector being NULL in case ucsi_init() has not finished yet;
or in case ucsi_init() has failed.
ucsi_init() setting ucsi->connector and then clearing it again on
an error creates a race where the check in ucsi_resume() may pass,
only to have ucsi->connector free-ed underneath it when ucsi_init()
hits an error.
Fix this race by making ucsi_init() store the connector array in
a local variable and only assign it to ucsi->connector on success.
Fixes: bdc62f2bae8f ("usb: typec: ucsi: Simplified registration and I/O API")
Cc: stable(a)vger.kernel.org
Reviewed-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
Link: https://lore.kernel.org/r/20230308154244.722337-3-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
index 0623861c597b..8d1baf28df55 100644
--- a/drivers/usb/typec/ucsi/ucsi.c
+++ b/drivers/usb/typec/ucsi/ucsi.c
@@ -1125,12 +1125,11 @@ static struct fwnode_handle *ucsi_find_fwnode(struct ucsi_connector *con)
return NULL;
}
-static int ucsi_register_port(struct ucsi *ucsi, int index)
+static int ucsi_register_port(struct ucsi *ucsi, struct ucsi_connector *con)
{
struct usb_power_delivery_desc desc = { ucsi->cap.pd_version};
struct usb_power_delivery_capabilities_desc pd_caps;
struct usb_power_delivery_capabilities *pd_cap;
- struct ucsi_connector *con = &ucsi->connector[index];
struct typec_capability *cap = &con->typec_cap;
enum typec_accessory *accessory = cap->accessory;
enum usb_role u_role = USB_ROLE_NONE;
@@ -1151,7 +1150,6 @@ static int ucsi_register_port(struct ucsi *ucsi, int index)
init_completion(&con->complete);
mutex_init(&con->lock);
INIT_LIST_HEAD(&con->partner_tasks);
- con->num = index + 1;
con->ucsi = ucsi;
cap->fwnode = ucsi_find_fwnode(con);
@@ -1328,7 +1326,7 @@ static int ucsi_register_port(struct ucsi *ucsi, int index)
*/
static int ucsi_init(struct ucsi *ucsi)
{
- struct ucsi_connector *con;
+ struct ucsi_connector *con, *connector;
u64 command, ntfy;
int ret;
int i;
@@ -1359,16 +1357,16 @@ static int ucsi_init(struct ucsi *ucsi)
}
/* Allocate the connectors. Released in ucsi_unregister() */
- ucsi->connector = kcalloc(ucsi->cap.num_connectors + 1,
- sizeof(*ucsi->connector), GFP_KERNEL);
- if (!ucsi->connector) {
+ connector = kcalloc(ucsi->cap.num_connectors + 1, sizeof(*connector), GFP_KERNEL);
+ if (!connector) {
ret = -ENOMEM;
goto err_reset;
}
/* Register all connectors */
for (i = 0; i < ucsi->cap.num_connectors; i++) {
- ret = ucsi_register_port(ucsi, i);
+ connector[i].num = i + 1;
+ ret = ucsi_register_port(ucsi, &connector[i]);
if (ret)
goto err_unregister;
}
@@ -1380,11 +1378,12 @@ static int ucsi_init(struct ucsi *ucsi)
if (ret < 0)
goto err_unregister;
+ ucsi->connector = connector;
ucsi->ntfy = ntfy;
return 0;
err_unregister:
- for (con = ucsi->connector; con->port; con++) {
+ for (con = connector; con->port; con++) {
ucsi_unregister_partner(con);
ucsi_unregister_altmodes(con, UCSI_RECIPIENT_CON);
ucsi_unregister_port_psy(con);
@@ -1400,10 +1399,7 @@ static int ucsi_init(struct ucsi *ucsi)
typec_unregister_port(con->port);
con->port = NULL;
}
-
- kfree(ucsi->connector);
- ucsi->connector = NULL;
-
+ kfree(connector);
err_reset:
memset(&ucsi->cap, 0, sizeof(ucsi->cap));
ucsi_reset_ppm(ucsi);
commit 2ab4f4018cb6b8010ca5002c3bdc37783b5d28c2 upstream.
When mailboxes are used as a transport it is possible to setup the SCMI
transport layer, depending on the underlying channels configuration, to use
one or two mailboxes, associated, respectively, to one or two, distinct,
shared memory areas: any other combination should be treated as invalid.
Add more strict checking of SCMI mailbox transport device node descriptors.
Fixes: fbc4d81ad285 ("firmware: arm_scmi: refactor in preparation to support per-protocol channels")
Cc: <stable(a)vger.kernel.org> # 4.19
Signed-off-by: Cristian Marussi <cristian.marussi(a)arm.com>
Link: https://lore.kernel.org/r/20230307162324.891866-1-cristian.marussi@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla(a)arm.com>
[Cristian: backported to v4.19]
Signed-off-by: Cristian Marussi <cristian.marussi(a)arm.com>
---
Backporting was trivial but required since the patched function
was moved around in a different file.
---
drivers/firmware/arm_scmi/driver.c | 37 ++++++++++++++++++++++++++++++
1 file changed, 37 insertions(+)
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
index e8cd66705ad7..5ccbbb3eb68e 100644
--- a/drivers/firmware/arm_scmi/driver.c
+++ b/drivers/firmware/arm_scmi/driver.c
@@ -705,6 +705,39 @@ static int scmi_remove(struct platform_device *pdev)
return ret;
}
+static int scmi_mailbox_chan_validate(struct device *cdev)
+{
+ int num_mb, num_sh, ret = 0;
+ struct device_node *np = cdev->of_node;
+
+ num_mb = of_count_phandle_with_args(np, "mboxes", "#mbox-cells");
+ num_sh = of_count_phandle_with_args(np, "shmem", NULL);
+ /* Bail out if mboxes and shmem descriptors are inconsistent */
+ if (num_mb <= 0 || num_sh > 2 || num_mb != num_sh) {
+ dev_warn(cdev, "Invalid channel descriptor for '%s'\n",
+ of_node_full_name(np));
+ return -EINVAL;
+ }
+
+ if (num_sh > 1) {
+ struct device_node *np_tx, *np_rx;
+
+ np_tx = of_parse_phandle(np, "shmem", 0);
+ np_rx = of_parse_phandle(np, "shmem", 1);
+ /* SCMI Tx and Rx shared mem areas have to be distinct */
+ if (!np_tx || !np_rx || np_tx == np_rx) {
+ dev_warn(cdev, "Invalid shmem descriptor for '%s'\n",
+ of_node_full_name(np));
+ ret = -EINVAL;
+ }
+
+ of_node_put(np_tx);
+ of_node_put(np_rx);
+ }
+
+ return ret;
+}
+
static inline int
scmi_mbox_chan_setup(struct scmi_info *info, struct device *dev, int prot_id)
{
@@ -720,6 +753,10 @@ scmi_mbox_chan_setup(struct scmi_info *info, struct device *dev, int prot_id)
goto idr_alloc;
}
+ ret = scmi_mailbox_chan_validate(dev);
+ if (ret)
+ return ret;
+
cinfo = devm_kzalloc(info->dev, sizeof(*cinfo), GFP_KERNEL);
if (!cinfo)
return -ENOMEM;
--
2.34.1
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 88b170088ad2c3e27086fe35769aa49f8a512564
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '16800036329137(a)kroah.com' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
88b170088ad2 ("zonefs: Fix error message in zonefs_file_dio_append()")
aa7f243f32e1 ("zonefs: Separate zone information from inode information")
34422914dc00 ("zonefs: Reduce struct zonefs_inode_info size")
46a9c526eef7 ("zonefs: Simplify IO error handling")
4008e2a0b01a ("zonefs: Reorganize code")
a608da3bd730 ("zonefs: Detect append writes at invalid locations")
db58653ce0c7 ("zonefs: Fix active zone accounting")
7dd12d65ac64 ("zonefs: fix zone report size in __zonefs_io_error()")
8745889a7fd0 ("Merge tag 'iomap-6.0-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 88b170088ad2c3e27086fe35769aa49f8a512564 Mon Sep 17 00:00:00 2001
From: Damien Le Moal <damien.lemoal(a)opensource.wdc.com>
Date: Mon, 20 Mar 2023 22:49:15 +0900
Subject: [PATCH] zonefs: Fix error message in zonefs_file_dio_append()
Since the expected write location in a sequential file is always at the
end of the file (append write), when an invalid write append location is
detected in zonefs_file_dio_append(), print the invalid written location
instead of the expected write location.
Fixes: a608da3bd730 ("zonefs: Detect append writes at invalid locations")
Cc: stable(a)vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal(a)opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani(a)oracle.com>
diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c
index a545a6d9a32e..617e4f9db42e 100644
--- a/fs/zonefs/file.c
+++ b/fs/zonefs/file.c
@@ -426,7 +426,7 @@ static ssize_t zonefs_file_dio_append(struct kiocb *iocb, struct iov_iter *from)
if (bio->bi_iter.bi_sector != wpsector) {
zonefs_warn(inode->i_sb,
"Corrupted write pointer %llu for zone at %llu\n",
- wpsector, z->z_sector);
+ bio->bi_iter.bi_sector, z->z_sector);
ret = -EIO;
}
}
From: Dan Carpenter <dan.carpenter(a)oracle.com>
commit 0709831a50d31b3caf2237e8d7fe89e15b0d919d upstream.
The code is supposed to clear the RH_A_NPS and RH_A_PSM bits, but it's
a no-op because of the & vs | typo. This bug predates git and it was
only discovered using static analysis so it must not affect too many
people in real life.
Signed-off-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Acked-by: Alan Stern <stern(a)rowland.harvard.edu>
Link: https://lore.kernel.org/r/20190817065520.GA29951@mwanda
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Nobuhiro Iwamatsu (CIP) <nobuhiro1.iwamatsu(a)toshiba.co.jp>
---
drivers/usb/host/ohci-pxa27x.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/host/ohci-pxa27x.c b/drivers/usb/host/ohci-pxa27x.c
index 3e247495973591..7679fb583e4161 100644
--- a/drivers/usb/host/ohci-pxa27x.c
+++ b/drivers/usb/host/ohci-pxa27x.c
@@ -148,7 +148,7 @@ static int pxa27x_ohci_select_pmm(struct pxa27x_ohci *pxa_ohci, int mode)
uhcrhda |= RH_A_NPS;
break;
case PMM_GLOBAL_MODE:
- uhcrhda &= ~(RH_A_NPS & RH_A_PSM);
+ uhcrhda &= ~(RH_A_NPS | RH_A_PSM);
break;
case PMM_PERPORT_MODE:
uhcrhda &= ~(RH_A_NPS);
--
2.36.1
From: Sean Christopherson <seanjc(a)google.com>
commit 98c25ead5eda5e9d41abe57839ad3e8caf19500c upstream.
Handle the switch to/from the hypervisor/software timer when a vCPU is
blocking in common x86 instead of in VMX. Even though VMX is the only
user of a hypervisor timer, the logic and all functions involved are
generic x86 (unless future CPUs do something completely different and
implement a hypervisor timer that runs regardless of mode).
Handling the switch in common x86 will allow for the elimination of the
pre/post_blocks hooks, and also lets KVM switch back to the hypervisor
timer if and only if it was in use (without additional params). Add a
comment explaining why the switch cannot be deferred to kvm_sched_out()
or kvm_vcpu_block().
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Reviewed-by: Maxim Levitsky <mlevitsk(a)redhat.com>
Message-Id: <20211208015236.1616697-8-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
[ta: Fix conflicts in vmx_pre_block and vmx_post_block as per Paolo's
suggestion. Add Reported-by and Link tags.]
Reported-by: syzbot+b6a74be92b5063a0f1ff(a)syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?id=489beb3d76ef14cc6cd18125782dc6f86051a6…
Tested-by: Tudor Ambarus <tudor.ambarus(a)linaro.org>
Signed-off-by: Tudor Ambarus <tudor.ambarus(a)linaro.org>
---
arch/x86/kvm/vmx/vmx.c | 6 ------
arch/x86/kvm/x86.c | 21 +++++++++++++++++++++
2 files changed, 21 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 9ce45554d637..c95c3675e8d5 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7597,17 +7597,11 @@ static int vmx_pre_block(struct kvm_vcpu *vcpu)
if (pi_pre_block(vcpu))
return 1;
- if (kvm_lapic_hv_timer_in_use(vcpu))
- kvm_lapic_switch_to_sw_timer(vcpu);
-
return 0;
}
static void vmx_post_block(struct kvm_vcpu *vcpu)
{
- if (kvm_x86_ops.set_hv_timer)
- kvm_lapic_switch_to_hv_timer(vcpu);
-
pi_post_block(vcpu);
}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0622256cd768..5cb4af42ba64 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10043,12 +10043,28 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
static inline int vcpu_block(struct kvm *kvm, struct kvm_vcpu *vcpu)
{
+ bool hv_timer;
+
if (!kvm_arch_vcpu_runnable(vcpu) &&
(!kvm_x86_ops.pre_block || static_call(kvm_x86_pre_block)(vcpu) == 0)) {
+ /*
+ * Switch to the software timer before halt-polling/blocking as
+ * the guest's timer may be a break event for the vCPU, and the
+ * hypervisor timer runs only when the CPU is in guest mode.
+ * Switch before halt-polling so that KVM recognizes an expired
+ * timer before blocking.
+ */
+ hv_timer = kvm_lapic_hv_timer_in_use(vcpu);
+ if (hv_timer)
+ kvm_lapic_switch_to_sw_timer(vcpu);
+
srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
kvm_vcpu_block(vcpu);
vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
+ if (hv_timer)
+ kvm_lapic_switch_to_hv_timer(vcpu);
+
if (kvm_x86_ops.post_block)
static_call(kvm_x86_post_block)(vcpu);
@@ -10287,6 +10303,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
r = -EINTR;
goto out;
}
+ /*
+ * It should be impossible for the hypervisor timer to be in
+ * use before KVM has ever run the vCPU.
+ */
+ WARN_ON_ONCE(kvm_lapic_hv_timer_in_use(vcpu));
kvm_vcpu_block(vcpu);
if (kvm_apic_accept_events(vcpu) < 0) {
r = 0;
--
2.40.0.348.gf938b09366-goog
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 9228b26194d1cc00449f12f306f53ef2e234a55b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040337-reassign-brigade-a047@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
9228b26194d1 ("KVM: arm64: PMU: Fix GET_ONE_REG for vPMC regs to return the current value")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9228b26194d1cc00449f12f306f53ef2e234a55b Mon Sep 17 00:00:00 2001
From: Reiji Watanabe <reijiw(a)google.com>
Date: Sun, 12 Mar 2023 20:32:08 -0700
Subject: [PATCH] KVM: arm64: PMU: Fix GET_ONE_REG for vPMC regs to return the
current value
Have KVM_GET_ONE_REG for vPMU counter (vPMC) registers (PMCCNTR_EL0
and PMEVCNTR<n>_EL0) return the sum of the register value in the sysreg
file and the current perf event counter value.
Values of vPMC registers are saved in sysreg files on certain occasions.
These saved values don't represent the current values of the vPMC
registers if the perf events for the vPMCs count events after the save.
The current values of those registers are the sum of the sysreg file
value and the current perf event counter value. But, when userspace
reads those registers (using KVM_GET_ONE_REG), KVM returns the sysreg
file value to userspace (not the sum value).
Fix this to return the sum value for KVM_GET_ONE_REG.
Fixes: 051ff581ce70 ("arm64: KVM: Add access handler for event counter register")
Cc: stable(a)vger.kernel.org
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Signed-off-by: Reiji Watanabe <reijiw(a)google.com>
Link: https://lore.kernel.org/r/20230313033208.1475499-1-reijiw@google.com
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 53749d3a0996..1b2c161120be 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -856,6 +856,22 @@ static bool pmu_counter_idx_valid(struct kvm_vcpu *vcpu, u64 idx)
return true;
}
+static int get_pmu_evcntr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
+ u64 *val)
+{
+ u64 idx;
+
+ if (r->CRn == 9 && r->CRm == 13 && r->Op2 == 0)
+ /* PMCCNTR_EL0 */
+ idx = ARMV8_PMU_CYCLE_IDX;
+ else
+ /* PMEVCNTRn_EL0 */
+ idx = ((r->CRm & 3) << 3) | (r->Op2 & 7);
+
+ *val = kvm_pmu_get_counter_value(vcpu, idx);
+ return 0;
+}
+
static bool access_pmu_evcntr(struct kvm_vcpu *vcpu,
struct sys_reg_params *p,
const struct sys_reg_desc *r)
@@ -1072,7 +1088,7 @@ static bool access_pmuserenr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
/* Macro to expand the PMEVCNTRn_EL0 register */
#define PMU_PMEVCNTR_EL0(n) \
{ PMU_SYS_REG(SYS_PMEVCNTRn_EL0(n)), \
- .reset = reset_pmevcntr, \
+ .reset = reset_pmevcntr, .get_user = get_pmu_evcntr, \
.access = access_pmu_evcntr, .reg = (PMEVCNTR0_EL0 + n), }
/* Macro to expand the PMEVTYPERn_EL0 register */
@@ -1982,7 +1998,8 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ PMU_SYS_REG(SYS_PMCEID1_EL0),
.access = access_pmceid, .reset = NULL },
{ PMU_SYS_REG(SYS_PMCCNTR_EL0),
- .access = access_pmu_evcntr, .reset = reset_unknown, .reg = PMCCNTR_EL0 },
+ .access = access_pmu_evcntr, .reset = reset_unknown,
+ .reg = PMCCNTR_EL0, .get_user = get_pmu_evcntr},
{ PMU_SYS_REG(SYS_PMXEVTYPER_EL0),
.access = access_pmu_evtyper, .reset = NULL },
{ PMU_SYS_REG(SYS_PMXEVCNTR_EL0),
Hi Sasha,
On Thu, Mar 30, 2023 at 1:16 PM Sasha Levin <sashal(a)kernel.org> wrote:
>
> This is a note to let you know that I've just added the patch titled
>
> drm/meson: Fix error handling when afbcd.ops->init fails
>
> to the 5.4-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> drm-meson-fix-error-handling-when-afbcd.ops-init-fai.patch
> and it can be found in the queue-5.4 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
I think this patch was added to allow backporting of another patch:
drm-meson-fix-missing-component-unbind-on-bind-error.patch
Unfortunately this breaks the build as old kernels (5.4 and older)
don't have AFBC support yet. That build failure has been:
Reported-by: kernel test robot <lkp(a)intel.com>
Link: https://lore.kernel.org/oe-kbuild-all/202303310016.tc4BlcbT-lkp@intel.com/
I think that there are few to no meson users on 5.4 or older kernels.
Can you please drop these patches from the 5.4 and 4.19 backport queue?
Thank you,
Martin
commit 50d281fc434cb8e2497f5e70a309ccca6b1a09f0 upstream.
This fixes mkfs/mount/check failures due to race with systemd-udevd
scan.
During the device scan initiated by systemd-udevd, other user space
EXCL operations such as mkfs, mount, or check may get blocked and result
in a "Device or resource busy" error. This is because the device
scan process opens the device with the EXCL flag in the kernel.
Two reports were received:
- btrfs/179 test case, where the fsck command failed with the -EBUSY
error
- LTP pwritev03 test case, where mkfs.vfs failed with
the -EBUSY error, when mkfs.vfs tried to overwrite old btrfs filesystem
on the device.
In both cases, fsck and mkfs (respectively) were racing with a
systemd-udevd device scan, and systemd-udevd won, resulting in the
-EBUSY error for fsck and mkfs.
Reproducing the problem has been difficult because there is a very
small window during which these userspace threads can race to
acquire the exclusive device open. Even on the system where the problem
was observed, the problem occurrences were anywhere between 10 to 400
iterations and chances of reproducing decreases with debug printk()s.
However, an exclusive device open is unnecessary for the scan process,
as there are no write operations on the device during scan. Furthermore,
during the mount process, the superblock is re-read in the below
function call chain:
btrfs_mount_root
btrfs_open_devices
open_fs_devices
btrfs_open_one_device
btrfs_get_bdev_and_sb
So, to fix this issue, removes the FMODE_EXCL flag from the scan
operation, and add a comment.
The case where mkfs may still write to the device and a scan is running,
the btrfs signature is not written at that time so scan will not
recognize such device.
Reported-by: Sherry Yang <sherry.yang(a)oracle.com>
Reported-by: kernel test robot <oliver.sang(a)intel.com>
Link: https://lore.kernel.org/oe-lkp/202303170839.fdf23068-oliver.sang@intel.com
CC: stable(a)vger.kernel.org # 5.4+
Signed-off-by: Anand Jain <anand.jain(a)oracle.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: Anand Jain <anand.jain(a)oracle.com>
---
The upstream commit 50d281fc434cb8e2497f5e70a309ccca6b1a09f0 can be
applied without conflict to LTS stable-5.15.y and stable-6.1.y. However,
on LTS stable-5.4.y and stable-5.15.y, a conflict fix is required since
the zoned device support commits are not present in these versions. This
patch resolves the conflicts.
fs/btrfs/volumes.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 3fa972a43b5e..c5944c61317f 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1579,8 +1579,17 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, fmode_t flags,
* later supers, using BTRFS_SUPER_MIRROR_MAX instead
*/
bytenr = btrfs_sb_offset(0);
- flags |= FMODE_EXCL;
+ /*
+ * Avoid using flag |= FMODE_EXCL here, as the systemd-udev may
+ * initiate the device scan which may race with the user's mount
+ * or mkfs command, resulting in failure.
+ * Since the device scan is solely for reading purposes, there is
+ * no need for FMODE_EXCL. Additionally, the devices are read again
+ * during the mount process. It is ok to get some inconsistent
+ * values temporarily, as the device paths of the fsid are the only
+ * required information for assembling the volume.
+ */
bdev = blkdev_get_by_path(path, flags, holder);
if (IS_ERR(bdev))
return ERR_CAST(bdev);
--
2.39.2
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x 89aba4c26fae4e459f755a18912845c348ee48f3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040339-copartner-curliness-0c67@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
89aba4c26fae ("s390/uaccess: add missing earlyclobber annotations to __clear_user()")
4e1b5a86a5ed ("s390/uaccess: add missing EX_TABLE entries to __clear_user()")
731efc9613ee ("s390: convert ".insn" encoding to instruction names")
012a224e1fa3 ("s390/uaccess: introduce bit field for OAC specifier")
b087dfab4d39 ("s390/crypto: add SIMD implementation for ChaCha20")
de5012b41e5c ("s390/ftrace: implement hotpatching")
f8c2602733c9 ("s390/ftrace: fix ftrace_update_ftrace_func implementation")
d1e18efa8fa9 ("s390/lib,uaccess: get rid of register asm")
fcc91d5d4047 ("s390/timex: get rid of register asm")
dbb8864b28d6 ("s390/uaccess: get rid of register asm")
53c1c2504b6b ("s390/pgtable: use register pair instead of register asm")
80f06306240e ("s390/vdso: reimplement getcpu vdso syscall")
87d598634521 ("s390/mm: remove set_fs / rework address space handling")
c9343637d6b2 ("s390/ftrace: assume -mhotpatch or -mrecord-mcount always available")
ab177c5d00cd ("s390/mm: remove unused clear_user_asce()")
cfef9aa69a73 ("s390/vdso: remove unused constants")
847d4287a0c6 ("Merge tag 's390-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 89aba4c26fae4e459f755a18912845c348ee48f3 Mon Sep 17 00:00:00 2001
From: Heiko Carstens <hca(a)linux.ibm.com>
Date: Thu, 23 Mar 2023 13:09:16 +0100
Subject: [PATCH] s390/uaccess: add missing earlyclobber annotations to
__clear_user()
Add missing earlyclobber annotation to size, to, and tmp2 operands of the
__clear_user() inline assembly since they are modified or written to before
the last usage of all input operands. This can lead to incorrect register
allocation for the inline assembly.
Fixes: 6c2a9e6df604 ("[S390] Use alternative user-copy operations for new hardware.")
Reported-by: Mark Rutland <mark.rutland(a)arm.com>
Link: https://lore.kernel.org/all/20230321122514.1743889-3-mark.rutland@arm.com/
Cc: stable(a)vger.kernel.org
Reviewed-by: Gerald Schaefer <gerald.schaefer(a)linux.ibm.com>
Signed-off-by: Heiko Carstens <hca(a)linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor(a)linux.ibm.com>
diff --git a/arch/s390/lib/uaccess.c b/arch/s390/lib/uaccess.c
index 720036fb1924..d44214072779 100644
--- a/arch/s390/lib/uaccess.c
+++ b/arch/s390/lib/uaccess.c
@@ -172,7 +172,7 @@ unsigned long __clear_user(void __user *to, unsigned long size)
"4: slgr %0,%0\n"
"5:\n"
EX_TABLE(0b,2b) EX_TABLE(6b,2b) EX_TABLE(3b,5b) EX_TABLE(7b,5b)
- : "+a" (size), "+a" (to), "+a" (tmp1), "=a" (tmp2)
+ : "+&a" (size), "+&a" (to), "+a" (tmp1), "=&a" (tmp2)
: "a" (empty_zero_page), [spec] "d" (spec.val)
: "cc", "memory", "0");
return size;
This reverts commit c7a218cbf67fffcd99b76ae3b5e9c2e8bef17c8c.
The memory of ctx is allocated by devm_kzalloc in cal_ctx_create,
it should not be freed by kfree when cal_ctx_v4l2_init() fails,
otherwise kfree() will cause double free, so revert this patch.
Fixes: c7a218cbf67f ("media: ti: cal: fix possible memory leak in cal_ctx_create()")
Signed-off-by: Gaosheng Cui <cuigaosheng1(a)huawei.com>
---
drivers/media/platform/ti-vpe/cal.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/media/platform/ti-vpe/cal.c b/drivers/media/platform/ti-vpe/cal.c
index 93121c90d76a..2eef245c31a1 100644
--- a/drivers/media/platform/ti-vpe/cal.c
+++ b/drivers/media/platform/ti-vpe/cal.c
@@ -624,10 +624,8 @@ static struct cal_ctx *cal_ctx_create(struct cal_dev *cal, int inst)
ctx->cport = inst;
ret = cal_ctx_v4l2_init(ctx);
- if (ret) {
- kfree(ctx);
+ if (ret)
return NULL;
- }
return ctx;
}
--
2.25.1
The original patch uses a feature in lib/vsprintf.c to handle the invalid
address when tring to print *_fw_module->man4_module_entry.name when the
*rc_fw_module is NULL.
This case is handled by check_pointer_msg() internally and turns the
invalid pointer to '(efault)' for printing but it is hiding useful
information about the circumstances. Change the print to emmit the name
of the widget and a note on which side's fw_module is missing.
Fixes: e3720f92e023 ("ASoC: SOF: avoid a NULL dereference with unsupported widgets")
Reported-by: Dan Carpenter <error27(a)gmail.com>
Link: https://lore.kernel.org/alsa-devel/4826f662-42f0-4a82-ba32-8bf5f8a03256@kil…
Signed-off-by: Peter Ujfalusi <peter.ujfalusi(a)linux.intel.com>
---
Hi Mark, Dan,
This patch clarifies the print and will not rely on vsprintf internal protection
on printing the error.
Regards,
Peter
sound/soc/sof/ipc4-topology.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/sound/soc/sof/ipc4-topology.c b/sound/soc/sof/ipc4-topology.c
index 669b99a4f76e..3a5394c3dd83 100644
--- a/sound/soc/sof/ipc4-topology.c
+++ b/sound/soc/sof/ipc4-topology.c
@@ -1806,10 +1806,12 @@ static int sof_ipc4_route_setup(struct snd_sof_dev *sdev, struct snd_sof_route *
int ret;
if (!src_fw_module || !sink_fw_module) {
- /* The NULL module will print as "(efault)" */
- dev_err(sdev->dev, "source %s or sink %s widget weren't set up properly\n",
- src_fw_module->man4_module_entry.name,
- sink_fw_module->man4_module_entry.name);
+ dev_err(sdev->dev,
+ "cannot bind %s -> %s, no firmware module for: %s%s\n",
+ src_widget->widget->name, sink_widget->widget->name,
+ src_fw_module ? "" : " source",
+ sink_fw_module ? "" : " sink");
+
return -ENODEV;
}
--
2.40.0
The command allocated to set exit latency LPM values need to be freed in
case the command is never queued. This would be the case if there is no
change in exit latency values, or device is missing.
Reported-by: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Link: https://lore.kernel.org/linux-usb/24263902-c9b3-ce29-237b-1c3d6918f4fe@alu.…
Tested-by: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Fixes: 5c2a380a5aa8 ("xhci: Allocate separate command structures for each LPM command")
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
---
drivers/usb/host/xhci.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index bdb6dd819a3b..6307bae9cddf 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -4442,6 +4442,7 @@ static int __maybe_unused xhci_change_max_exit_latency(struct xhci_hcd *xhci,
if (!virt_dev || max_exit_latency == virt_dev->current_mel) {
spin_unlock_irqrestore(&xhci->lock, flags);
+ xhci_free_command(xhci, command);
return 0;
}
--
2.25.1
Device exclusive page table entries are used to prevent CPU access to
a page whilst it is being accessed from a device. Typically this is
used to implement atomic operations when the underlying bus does not
support atomic access. When a CPU thread encounters a device exclusive
entry it locks the page and restores the original entry after calling
mmu notifiers to signal drivers that exclusive access is no longer
available.
The device exclusive entry holds a reference to the page making it
safe to access the struct page whilst the entry is present. However
the fault handling code does not hold the PTL when taking the page
lock. This means if there are multiple threads faulting concurrently
on the device exclusive entry one will remove the entry whilst others
will wait on the page lock without holding a reference.
This can lead to threads locking or waiting on a folio with a zero
refcount. Whilst mmap_lock prevents the pages getting freed via
munmap() they may still be freed by a migration. This leads to
warnings such as PAGE_FLAGS_CHECK_AT_FREE due to the page being locked
when the refcount drops to zero.
Fix this by trying to take a reference on the folio before locking
it. The code already checks the PTE under the PTL and aborts if the
entry is no longer there. It is also possible the folio has been
unmapped, freed and re-allocated allowing a reference to be taken on
an unrelated folio. This case is also detected by the PTE check and
the folio is unlocked without further changes.
Signed-off-by: Alistair Popple <apopple(a)nvidia.com>
Reviewed-by: Ralph Campbell <rcampbell(a)nvidia.com>
Reviewed-by: John Hubbard <jhubbard(a)nvidia.com>
Fixes: b756a3b5e7ea ("mm: device exclusive memory access")
Cc: stable(a)vger.kernel.org
---
Changes for v2:
- Rebased to Linus master
- Reworded commit message
- Switched to using folios (thanks Matthew!)
- Added Reviewed-by's
---
mm/memory.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/mm/memory.c b/mm/memory.c
index f456f3b5049c..01a23ad48a04 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3563,8 +3563,21 @@ static vm_fault_t remove_device_exclusive_entry(struct vm_fault *vmf)
struct vm_area_struct *vma = vmf->vma;
struct mmu_notifier_range range;
- if (!folio_lock_or_retry(folio, vma->vm_mm, vmf->flags))
+ /*
+ * We need a reference to lock the folio because we don't hold
+ * the PTL so a racing thread can remove the device-exclusive
+ * entry and unmap it. If the folio is free the entry must
+ * have been removed already. If it happens to have already
+ * been re-allocated after being freed all we do is lock and
+ * unlock it.
+ */
+ if (!folio_try_get(folio))
+ return 0;
+
+ if (!folio_lock_or_retry(folio, vma->vm_mm, vmf->flags)) {
+ folio_put(folio);
return VM_FAULT_RETRY;
+ }
mmu_notifier_range_init_owner(&range, MMU_NOTIFY_EXCLUSIVE, 0,
vma->vm_mm, vmf->address & PAGE_MASK,
(vmf->address & PAGE_MASK) + PAGE_SIZE, NULL);
@@ -3577,6 +3590,7 @@ static vm_fault_t remove_device_exclusive_entry(struct vm_fault *vmf)
pte_unmap_unlock(vmf->pte, vmf->ptl);
folio_unlock(folio);
+ folio_put(folio);
mmu_notifier_invalidate_range_end(&range);
return 0;
--
2.39.2
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 89aba4c26fae4e459f755a18912845c348ee48f3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040338-emphases-eggbeater-ad1c@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
89aba4c26fae ("s390/uaccess: add missing earlyclobber annotations to __clear_user()")
4e1b5a86a5ed ("s390/uaccess: add missing EX_TABLE entries to __clear_user()")
731efc9613ee ("s390: convert ".insn" encoding to instruction names")
012a224e1fa3 ("s390/uaccess: introduce bit field for OAC specifier")
b087dfab4d39 ("s390/crypto: add SIMD implementation for ChaCha20")
de5012b41e5c ("s390/ftrace: implement hotpatching")
f8c2602733c9 ("s390/ftrace: fix ftrace_update_ftrace_func implementation")
d1e18efa8fa9 ("s390/lib,uaccess: get rid of register asm")
fcc91d5d4047 ("s390/timex: get rid of register asm")
dbb8864b28d6 ("s390/uaccess: get rid of register asm")
53c1c2504b6b ("s390/pgtable: use register pair instead of register asm")
80f06306240e ("s390/vdso: reimplement getcpu vdso syscall")
87d598634521 ("s390/mm: remove set_fs / rework address space handling")
c9343637d6b2 ("s390/ftrace: assume -mhotpatch or -mrecord-mcount always available")
ab177c5d00cd ("s390/mm: remove unused clear_user_asce()")
cfef9aa69a73 ("s390/vdso: remove unused constants")
847d4287a0c6 ("Merge tag 's390-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 89aba4c26fae4e459f755a18912845c348ee48f3 Mon Sep 17 00:00:00 2001
From: Heiko Carstens <hca(a)linux.ibm.com>
Date: Thu, 23 Mar 2023 13:09:16 +0100
Subject: [PATCH] s390/uaccess: add missing earlyclobber annotations to
__clear_user()
Add missing earlyclobber annotation to size, to, and tmp2 operands of the
__clear_user() inline assembly since they are modified or written to before
the last usage of all input operands. This can lead to incorrect register
allocation for the inline assembly.
Fixes: 6c2a9e6df604 ("[S390] Use alternative user-copy operations for new hardware.")
Reported-by: Mark Rutland <mark.rutland(a)arm.com>
Link: https://lore.kernel.org/all/20230321122514.1743889-3-mark.rutland@arm.com/
Cc: stable(a)vger.kernel.org
Reviewed-by: Gerald Schaefer <gerald.schaefer(a)linux.ibm.com>
Signed-off-by: Heiko Carstens <hca(a)linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor(a)linux.ibm.com>
diff --git a/arch/s390/lib/uaccess.c b/arch/s390/lib/uaccess.c
index 720036fb1924..d44214072779 100644
--- a/arch/s390/lib/uaccess.c
+++ b/arch/s390/lib/uaccess.c
@@ -172,7 +172,7 @@ unsigned long __clear_user(void __user *to, unsigned long size)
"4: slgr %0,%0\n"
"5:\n"
EX_TABLE(0b,2b) EX_TABLE(6b,2b) EX_TABLE(3b,5b) EX_TABLE(7b,5b)
- : "+a" (size), "+a" (to), "+a" (tmp1), "=a" (tmp2)
+ : "+&a" (size), "+&a" (to), "+a" (tmp1), "=&a" (tmp2)
: "a" (empty_zero_page), [spec] "d" (spec.val)
: "cc", "memory", "0");
return size;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 89aba4c26fae4e459f755a18912845c348ee48f3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040337-opposing-arrange-ad10@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
89aba4c26fae ("s390/uaccess: add missing earlyclobber annotations to __clear_user()")
4e1b5a86a5ed ("s390/uaccess: add missing EX_TABLE entries to __clear_user()")
731efc9613ee ("s390: convert ".insn" encoding to instruction names")
012a224e1fa3 ("s390/uaccess: introduce bit field for OAC specifier")
b087dfab4d39 ("s390/crypto: add SIMD implementation for ChaCha20")
de5012b41e5c ("s390/ftrace: implement hotpatching")
f8c2602733c9 ("s390/ftrace: fix ftrace_update_ftrace_func implementation")
d1e18efa8fa9 ("s390/lib,uaccess: get rid of register asm")
fcc91d5d4047 ("s390/timex: get rid of register asm")
dbb8864b28d6 ("s390/uaccess: get rid of register asm")
53c1c2504b6b ("s390/pgtable: use register pair instead of register asm")
80f06306240e ("s390/vdso: reimplement getcpu vdso syscall")
87d598634521 ("s390/mm: remove set_fs / rework address space handling")
c9343637d6b2 ("s390/ftrace: assume -mhotpatch or -mrecord-mcount always available")
ab177c5d00cd ("s390/mm: remove unused clear_user_asce()")
cfef9aa69a73 ("s390/vdso: remove unused constants")
847d4287a0c6 ("Merge tag 's390-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 89aba4c26fae4e459f755a18912845c348ee48f3 Mon Sep 17 00:00:00 2001
From: Heiko Carstens <hca(a)linux.ibm.com>
Date: Thu, 23 Mar 2023 13:09:16 +0100
Subject: [PATCH] s390/uaccess: add missing earlyclobber annotations to
__clear_user()
Add missing earlyclobber annotation to size, to, and tmp2 operands of the
__clear_user() inline assembly since they are modified or written to before
the last usage of all input operands. This can lead to incorrect register
allocation for the inline assembly.
Fixes: 6c2a9e6df604 ("[S390] Use alternative user-copy operations for new hardware.")
Reported-by: Mark Rutland <mark.rutland(a)arm.com>
Link: https://lore.kernel.org/all/20230321122514.1743889-3-mark.rutland@arm.com/
Cc: stable(a)vger.kernel.org
Reviewed-by: Gerald Schaefer <gerald.schaefer(a)linux.ibm.com>
Signed-off-by: Heiko Carstens <hca(a)linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor(a)linux.ibm.com>
diff --git a/arch/s390/lib/uaccess.c b/arch/s390/lib/uaccess.c
index 720036fb1924..d44214072779 100644
--- a/arch/s390/lib/uaccess.c
+++ b/arch/s390/lib/uaccess.c
@@ -172,7 +172,7 @@ unsigned long __clear_user(void __user *to, unsigned long size)
"4: slgr %0,%0\n"
"5:\n"
EX_TABLE(0b,2b) EX_TABLE(6b,2b) EX_TABLE(3b,5b) EX_TABLE(7b,5b)
- : "+a" (size), "+a" (to), "+a" (tmp1), "=a" (tmp2)
+ : "+&a" (size), "+&a" (to), "+a" (tmp1), "=&a" (tmp2)
: "a" (empty_zero_page), [spec] "d" (spec.val)
: "cc", "memory", "0");
return size;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 89aba4c26fae4e459f755a18912845c348ee48f3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040336-reformist-kinship-d451@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
89aba4c26fae ("s390/uaccess: add missing earlyclobber annotations to __clear_user()")
4e1b5a86a5ed ("s390/uaccess: add missing EX_TABLE entries to __clear_user()")
731efc9613ee ("s390: convert ".insn" encoding to instruction names")
012a224e1fa3 ("s390/uaccess: introduce bit field for OAC specifier")
b087dfab4d39 ("s390/crypto: add SIMD implementation for ChaCha20")
de5012b41e5c ("s390/ftrace: implement hotpatching")
f8c2602733c9 ("s390/ftrace: fix ftrace_update_ftrace_func implementation")
d1e18efa8fa9 ("s390/lib,uaccess: get rid of register asm")
fcc91d5d4047 ("s390/timex: get rid of register asm")
dbb8864b28d6 ("s390/uaccess: get rid of register asm")
53c1c2504b6b ("s390/pgtable: use register pair instead of register asm")
80f06306240e ("s390/vdso: reimplement getcpu vdso syscall")
87d598634521 ("s390/mm: remove set_fs / rework address space handling")
c9343637d6b2 ("s390/ftrace: assume -mhotpatch or -mrecord-mcount always available")
ab177c5d00cd ("s390/mm: remove unused clear_user_asce()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 89aba4c26fae4e459f755a18912845c348ee48f3 Mon Sep 17 00:00:00 2001
From: Heiko Carstens <hca(a)linux.ibm.com>
Date: Thu, 23 Mar 2023 13:09:16 +0100
Subject: [PATCH] s390/uaccess: add missing earlyclobber annotations to
__clear_user()
Add missing earlyclobber annotation to size, to, and tmp2 operands of the
__clear_user() inline assembly since they are modified or written to before
the last usage of all input operands. This can lead to incorrect register
allocation for the inline assembly.
Fixes: 6c2a9e6df604 ("[S390] Use alternative user-copy operations for new hardware.")
Reported-by: Mark Rutland <mark.rutland(a)arm.com>
Link: https://lore.kernel.org/all/20230321122514.1743889-3-mark.rutland@arm.com/
Cc: stable(a)vger.kernel.org
Reviewed-by: Gerald Schaefer <gerald.schaefer(a)linux.ibm.com>
Signed-off-by: Heiko Carstens <hca(a)linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor(a)linux.ibm.com>
diff --git a/arch/s390/lib/uaccess.c b/arch/s390/lib/uaccess.c
index 720036fb1924..d44214072779 100644
--- a/arch/s390/lib/uaccess.c
+++ b/arch/s390/lib/uaccess.c
@@ -172,7 +172,7 @@ unsigned long __clear_user(void __user *to, unsigned long size)
"4: slgr %0,%0\n"
"5:\n"
EX_TABLE(0b,2b) EX_TABLE(6b,2b) EX_TABLE(3b,5b) EX_TABLE(7b,5b)
- : "+a" (size), "+a" (to), "+a" (tmp1), "=a" (tmp2)
+ : "+&a" (size), "+&a" (to), "+a" (tmp1), "=&a" (tmp2)
: "a" (empty_zero_page), [spec] "d" (spec.val)
: "cc", "memory", "0");
return size;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 89aba4c26fae4e459f755a18912845c348ee48f3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040335-unlisted-supernova-a6b9@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
89aba4c26fae ("s390/uaccess: add missing earlyclobber annotations to __clear_user()")
4e1b5a86a5ed ("s390/uaccess: add missing EX_TABLE entries to __clear_user()")
731efc9613ee ("s390: convert ".insn" encoding to instruction names")
012a224e1fa3 ("s390/uaccess: introduce bit field for OAC specifier")
b087dfab4d39 ("s390/crypto: add SIMD implementation for ChaCha20")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 89aba4c26fae4e459f755a18912845c348ee48f3 Mon Sep 17 00:00:00 2001
From: Heiko Carstens <hca(a)linux.ibm.com>
Date: Thu, 23 Mar 2023 13:09:16 +0100
Subject: [PATCH] s390/uaccess: add missing earlyclobber annotations to
__clear_user()
Add missing earlyclobber annotation to size, to, and tmp2 operands of the
__clear_user() inline assembly since they are modified or written to before
the last usage of all input operands. This can lead to incorrect register
allocation for the inline assembly.
Fixes: 6c2a9e6df604 ("[S390] Use alternative user-copy operations for new hardware.")
Reported-by: Mark Rutland <mark.rutland(a)arm.com>
Link: https://lore.kernel.org/all/20230321122514.1743889-3-mark.rutland@arm.com/
Cc: stable(a)vger.kernel.org
Reviewed-by: Gerald Schaefer <gerald.schaefer(a)linux.ibm.com>
Signed-off-by: Heiko Carstens <hca(a)linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor(a)linux.ibm.com>
diff --git a/arch/s390/lib/uaccess.c b/arch/s390/lib/uaccess.c
index 720036fb1924..d44214072779 100644
--- a/arch/s390/lib/uaccess.c
+++ b/arch/s390/lib/uaccess.c
@@ -172,7 +172,7 @@ unsigned long __clear_user(void __user *to, unsigned long size)
"4: slgr %0,%0\n"
"5:\n"
EX_TABLE(0b,2b) EX_TABLE(6b,2b) EX_TABLE(3b,5b) EX_TABLE(7b,5b)
- : "+a" (size), "+a" (to), "+a" (tmp1), "=a" (tmp2)
+ : "+&a" (size), "+&a" (to), "+a" (tmp1), "=&a" (tmp2)
: "a" (empty_zero_page), [spec] "d" (spec.val)
: "cc", "memory", "0");
return size;
The command allocated to set exit latency LPM values need to be freed in
case the command is never queued. This would be the case if there is no
change in exit latency values, or device is missing.
Fixes: 5c2a380a5aa8 ("xhci: Allocate separate command structures for each LPM command")
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
---
drivers/usb/host/xhci.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index bdb6dd819a3b..6307bae9cddf 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -4442,6 +4442,7 @@ static int __maybe_unused xhci_change_max_exit_latency(struct xhci_hcd *xhci,
if (!virt_dev || max_exit_latency == virt_dev->current_mel) {
spin_unlock_irqrestore(&xhci->lock, flags);
+ xhci_free_command(xhci, command);
return 0;
}
--
2.25.1
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 76b767d4d1cd052e455cf18e06929e8b2b70101d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040340-kinswoman-generic-1bb5@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
76b767d4d1cd ("drm/i915: Split icl_color_commit_noarm() from skl_color_commit_noarm()")
48205f42ae9b ("drm/i915: Get rid of glk_load_degamma_lut_linear()")
b1d9092240b7 ("drm/i915: Assert {pre,post}_csc_lut were assigned sensibly")
18f1b5ae7eca ("drm/i915: Introduce crtc_state->{pre,post}_csc_lut")
5ca1493e252a ("drm/i915: Make ilk_load_luts() deal with degamma")
a2b1d9ecaa75 ("drm/i915: Clean up some namespacing")
adc831bfc885 ("drm/i915: Make DRRS debugfs per-crtc/connector")
2e25c1fba714 ("drm/i915: Make the DRRS debugfs contents more consistent")
61564e6c5a4a ("drm/i915: Move DRRS debugfs next to the implementation")
296cd8ecfd30 ("drm/i915: Change glk_load_degamma_lut() calling convention")
7671fc626526 ("drm/i915: Clean up intel_color_init_hooks()")
2a40e5848a95 ("drm/i915: Simplify the intel_color_init_hooks() if ladder")
064751a6c5dc ("drm/i915: Split up intel_color_init()")
319b0869f51c ("drm/i915: Remove PLL asserts from .load_luts()")
1bed8b073420 ("drm/i915/hotplug: move hotplug storm debugfs to intel_hotplug.c")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 76b767d4d1cd052e455cf18e06929e8b2b70101d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ville=20Syrj=C3=A4l=C3=A4?= <ville.syrjala(a)linux.intel.com>
Date: Mon, 20 Mar 2023 11:54:33 +0200
Subject: [PATCH] drm/i915: Split icl_color_commit_noarm() from
skl_color_commit_noarm()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
We're going to want different behavior for skl/glk vs. icl
in .color_commit_noarm(), so split the hook into two. Arguably
we already had slightly different behaviour since
csc_enable/gamma_enable are never set on icl+, so the old
code was perhaps a bit confusing as well.
Cc: <stable(a)vger.kernel.org> #v5.19+
Cc: Manasi Navare <navaremanasi(a)google.com>
Cc: Drew Davenport <ddavenport(a)chromium.org>
Cc: Imre Deak <imre.deak(a)intel.com>
Cc: Jouni Högander <jouni.hogander(a)intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230320095438.17328-2-ville.…
Reviewed-by: Imre Deak <imre.deak(a)intel.com>
(cherry picked from commit f161eb01f50ab31f2084975b43bce54b7b671e17)
Signed-off-by: Jani Nikula <jani.nikula(a)intel.com>
diff --git a/drivers/gpu/drm/i915/display/intel_color.c b/drivers/gpu/drm/i915/display/intel_color.c
index 8d97c299e657..ce2c3819146c 100644
--- a/drivers/gpu/drm/i915/display/intel_color.c
+++ b/drivers/gpu/drm/i915/display/intel_color.c
@@ -677,6 +677,25 @@ static void skl_color_commit_arm(const struct intel_crtc_state *crtc_state)
crtc_state->csc_mode);
}
+static void icl_color_commit_arm(const struct intel_crtc_state *crtc_state)
+{
+ struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
+ struct drm_i915_private *i915 = to_i915(crtc->base.dev);
+ enum pipe pipe = crtc->pipe;
+
+ /*
+ * We don't (yet) allow userspace to control the pipe background color,
+ * so force it to black.
+ */
+ intel_de_write(i915, SKL_BOTTOM_COLOR(pipe), 0);
+
+ intel_de_write(i915, GAMMA_MODE(crtc->pipe),
+ crtc_state->gamma_mode);
+
+ intel_de_write_fw(i915, PIPE_CSC_MODE(crtc->pipe),
+ crtc_state->csc_mode);
+}
+
static struct drm_property_blob *
create_linear_lut(struct drm_i915_private *i915, int lut_size)
{
@@ -3067,7 +3086,7 @@ static const struct intel_color_funcs i9xx_color_funcs = {
static const struct intel_color_funcs icl_color_funcs = {
.color_check = icl_color_check,
.color_commit_noarm = icl_color_commit_noarm,
- .color_commit_arm = skl_color_commit_arm,
+ .color_commit_arm = icl_color_commit_arm,
.load_luts = icl_load_luts,
.read_luts = icl_read_luts,
.lut_equal = icl_lut_equal,
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 8c2e8ac8ad4be68409e806ce1cc78fc7a04539f3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040351-conflict-swan-ae8b@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
8c2e8ac8ad4b ("KVM: arm64: Check for kvm_vma_mte_allowed in the critical section")
13ec9308a857 ("KVM: arm64: Retry fault if vma_lookup() results become invalid")
b0803ba72b55 ("KVM: arm64: Convert FSC_* over to ESR_ELx_FSC_*")
382b5b87a97d ("Merge branch kvm-arm64/mte-map-shared into kvmarm-master/next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8c2e8ac8ad4be68409e806ce1cc78fc7a04539f3 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz(a)kernel.org>
Date: Thu, 16 Mar 2023 17:45:46 +0000
Subject: [PATCH] KVM: arm64: Check for kvm_vma_mte_allowed in the critical
section
On page fault, we find about the VMA that backs the page fault
early on, and quickly release the mmap_read_lock. However, using
the VMA pointer after the critical section is pretty dangerous,
as a teardown may happen in the meantime and the VMA be long gone.
Move the sampling of the MTE permission early, and NULL-ify the
VMA pointer after that, just to be on the safe side.
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20230316174546.3777507-3-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index cd819725193b..3b9d4d24c361 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1218,7 +1218,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
{
int ret = 0;
bool write_fault, writable, force_pte = false;
- bool exec_fault;
+ bool exec_fault, mte_allowed;
bool device = false;
unsigned long mmu_seq;
struct kvm *kvm = vcpu->kvm;
@@ -1309,6 +1309,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
fault_ipa &= ~(vma_pagesize - 1);
gfn = fault_ipa >> PAGE_SHIFT;
+ mte_allowed = kvm_vma_mte_allowed(vma);
+
+ /* Don't use the VMA after the unlock -- it may have vanished */
+ vma = NULL;
/*
* Read mmu_invalidate_seq so that KVM can detect if the results of
@@ -1379,7 +1383,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
if (fault_status != ESR_ELx_FSC_PERM && !device && kvm_has_mte(kvm)) {
/* Check the VMM hasn't introduced a new disallowed VMA */
- if (kvm_vma_mte_allowed(vma)) {
+ if (mte_allowed) {
sanitise_mte_tags(kvm, pfn, vma_pagesize);
} else {
ret = -EFAULT;
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 8c2e8ac8ad4be68409e806ce1cc78fc7a04539f3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040350-shrouded-scribble-3c39@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
8c2e8ac8ad4b ("KVM: arm64: Check for kvm_vma_mte_allowed in the critical section")
13ec9308a857 ("KVM: arm64: Retry fault if vma_lookup() results become invalid")
b0803ba72b55 ("KVM: arm64: Convert FSC_* over to ESR_ELx_FSC_*")
382b5b87a97d ("Merge branch kvm-arm64/mte-map-shared into kvmarm-master/next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8c2e8ac8ad4be68409e806ce1cc78fc7a04539f3 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz(a)kernel.org>
Date: Thu, 16 Mar 2023 17:45:46 +0000
Subject: [PATCH] KVM: arm64: Check for kvm_vma_mte_allowed in the critical
section
On page fault, we find about the VMA that backs the page fault
early on, and quickly release the mmap_read_lock. However, using
the VMA pointer after the critical section is pretty dangerous,
as a teardown may happen in the meantime and the VMA be long gone.
Move the sampling of the MTE permission early, and NULL-ify the
VMA pointer after that, just to be on the safe side.
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20230316174546.3777507-3-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index cd819725193b..3b9d4d24c361 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1218,7 +1218,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
{
int ret = 0;
bool write_fault, writable, force_pte = false;
- bool exec_fault;
+ bool exec_fault, mte_allowed;
bool device = false;
unsigned long mmu_seq;
struct kvm *kvm = vcpu->kvm;
@@ -1309,6 +1309,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
fault_ipa &= ~(vma_pagesize - 1);
gfn = fault_ipa >> PAGE_SHIFT;
+ mte_allowed = kvm_vma_mte_allowed(vma);
+
+ /* Don't use the VMA after the unlock -- it may have vanished */
+ vma = NULL;
/*
* Read mmu_invalidate_seq so that KVM can detect if the results of
@@ -1379,7 +1383,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
if (fault_status != ESR_ELx_FSC_PERM && !device && kvm_has_mte(kvm)) {
/* Check the VMM hasn't introduced a new disallowed VMA */
- if (kvm_vma_mte_allowed(vma)) {
+ if (mte_allowed) {
sanitise_mte_tags(kvm, pfn, vma_pagesize);
} else {
ret = -EFAULT;
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x 13ec9308a85702af7c31f3638a2720863848a7f2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040332-prevent-monorail-d14a@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
13ec9308a857 ("KVM: arm64: Retry fault if vma_lookup() results become invalid")
b0803ba72b55 ("KVM: arm64: Convert FSC_* over to ESR_ELx_FSC_*")
382b5b87a97d ("Merge branch kvm-arm64/mte-map-shared into kvmarm-master/next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 13ec9308a85702af7c31f3638a2720863848a7f2 Mon Sep 17 00:00:00 2001
From: David Matlack <dmatlack(a)google.com>
Date: Mon, 13 Mar 2023 16:54:54 -0700
Subject: [PATCH] KVM: arm64: Retry fault if vma_lookup() results become
invalid
Read mmu_invalidate_seq before dropping the mmap_lock so that KVM can
detect if the results of vma_lookup() (e.g. vma_shift) become stale
before it acquires kvm->mmu_lock. This fixes a theoretical bug where a
VMA could be changed by userspace after vma_lookup() and before KVM
reads the mmu_invalidate_seq, causing KVM to install page table entries
based on a (possibly) no-longer-valid vma_shift.
Re-order the MMU cache top-up to earlier in user_mem_abort() so that it
is not done after KVM has read mmu_invalidate_seq (i.e. so as to avoid
inducing spurious fault retries).
This bug has existed since KVM/ARM's inception. It's unlikely that any
sane userspace currently modifies VMAs in such a way as to trigger this
race. And even with directed testing I was unable to reproduce it. But a
sufficiently motivated host userspace might be able to exploit this
race.
Fixes: 94f8e6418d39 ("KVM: ARM: Handle guest faults in KVM")
Cc: stable(a)vger.kernel.org
Reported-by: Sean Christopherson <seanjc(a)google.com>
Signed-off-by: David Matlack <dmatlack(a)google.com>
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Link: https://lore.kernel.org/r/20230313235454.2964067-1-dmatlack@google.com
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 7113587222ff..f54408355d1d 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1217,6 +1217,20 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
return -EFAULT;
}
+ /*
+ * Permission faults just need to update the existing leaf entry,
+ * and so normally don't require allocations from the memcache. The
+ * only exception to this is when dirty logging is enabled at runtime
+ * and a write fault needs to collapse a block entry into a table.
+ */
+ if (fault_status != ESR_ELx_FSC_PERM ||
+ (logging_active && write_fault)) {
+ ret = kvm_mmu_topup_memory_cache(memcache,
+ kvm_mmu_cache_min_pages(kvm));
+ if (ret)
+ return ret;
+ }
+
/*
* Let's check if we will get back a huge page backed by hugetlbfs, or
* get block mapping for device MMIO region.
@@ -1269,37 +1283,17 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
fault_ipa &= ~(vma_pagesize - 1);
gfn = fault_ipa >> PAGE_SHIFT;
- mmap_read_unlock(current->mm);
-
- /*
- * Permission faults just need to update the existing leaf entry,
- * and so normally don't require allocations from the memcache. The
- * only exception to this is when dirty logging is enabled at runtime
- * and a write fault needs to collapse a block entry into a table.
- */
- if (fault_status != ESR_ELx_FSC_PERM ||
- (logging_active && write_fault)) {
- ret = kvm_mmu_topup_memory_cache(memcache,
- kvm_mmu_cache_min_pages(kvm));
- if (ret)
- return ret;
- }
- mmu_seq = vcpu->kvm->mmu_invalidate_seq;
/*
- * Ensure the read of mmu_invalidate_seq happens before we call
- * gfn_to_pfn_prot (which calls get_user_pages), so that we don't risk
- * the page we just got a reference to gets unmapped before we have a
- * chance to grab the mmu_lock, which ensure that if the page gets
- * unmapped afterwards, the call to kvm_unmap_gfn will take it away
- * from us again properly. This smp_rmb() interacts with the smp_wmb()
- * in kvm_mmu_notifier_invalidate_<page|range_end>.
+ * Read mmu_invalidate_seq so that KVM can detect if the results of
+ * vma_lookup() or __gfn_to_pfn_memslot() become stale prior to
+ * acquiring kvm->mmu_lock.
*
- * Besides, __gfn_to_pfn_memslot() instead of gfn_to_pfn_prot() is
- * used to avoid unnecessary overhead introduced to locate the memory
- * slot because it's always fixed even @gfn is adjusted for huge pages.
+ * Rely on mmap_read_unlock() for an implicit smp_rmb(), which pairs
+ * with the smp_wmb() in kvm_mmu_invalidate_end().
*/
- smp_rmb();
+ mmu_seq = vcpu->kvm->mmu_invalidate_seq;
+ mmap_read_unlock(current->mm);
pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL,
write_fault, &writable, NULL);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 13ec9308a85702af7c31f3638a2720863848a7f2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040331-gulf-component-c0f8@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
13ec9308a857 ("KVM: arm64: Retry fault if vma_lookup() results become invalid")
b0803ba72b55 ("KVM: arm64: Convert FSC_* over to ESR_ELx_FSC_*")
382b5b87a97d ("Merge branch kvm-arm64/mte-map-shared into kvmarm-master/next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 13ec9308a85702af7c31f3638a2720863848a7f2 Mon Sep 17 00:00:00 2001
From: David Matlack <dmatlack(a)google.com>
Date: Mon, 13 Mar 2023 16:54:54 -0700
Subject: [PATCH] KVM: arm64: Retry fault if vma_lookup() results become
invalid
Read mmu_invalidate_seq before dropping the mmap_lock so that KVM can
detect if the results of vma_lookup() (e.g. vma_shift) become stale
before it acquires kvm->mmu_lock. This fixes a theoretical bug where a
VMA could be changed by userspace after vma_lookup() and before KVM
reads the mmu_invalidate_seq, causing KVM to install page table entries
based on a (possibly) no-longer-valid vma_shift.
Re-order the MMU cache top-up to earlier in user_mem_abort() so that it
is not done after KVM has read mmu_invalidate_seq (i.e. so as to avoid
inducing spurious fault retries).
This bug has existed since KVM/ARM's inception. It's unlikely that any
sane userspace currently modifies VMAs in such a way as to trigger this
race. And even with directed testing I was unable to reproduce it. But a
sufficiently motivated host userspace might be able to exploit this
race.
Fixes: 94f8e6418d39 ("KVM: ARM: Handle guest faults in KVM")
Cc: stable(a)vger.kernel.org
Reported-by: Sean Christopherson <seanjc(a)google.com>
Signed-off-by: David Matlack <dmatlack(a)google.com>
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Link: https://lore.kernel.org/r/20230313235454.2964067-1-dmatlack@google.com
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 7113587222ff..f54408355d1d 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1217,6 +1217,20 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
return -EFAULT;
}
+ /*
+ * Permission faults just need to update the existing leaf entry,
+ * and so normally don't require allocations from the memcache. The
+ * only exception to this is when dirty logging is enabled at runtime
+ * and a write fault needs to collapse a block entry into a table.
+ */
+ if (fault_status != ESR_ELx_FSC_PERM ||
+ (logging_active && write_fault)) {
+ ret = kvm_mmu_topup_memory_cache(memcache,
+ kvm_mmu_cache_min_pages(kvm));
+ if (ret)
+ return ret;
+ }
+
/*
* Let's check if we will get back a huge page backed by hugetlbfs, or
* get block mapping for device MMIO region.
@@ -1269,37 +1283,17 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
fault_ipa &= ~(vma_pagesize - 1);
gfn = fault_ipa >> PAGE_SHIFT;
- mmap_read_unlock(current->mm);
-
- /*
- * Permission faults just need to update the existing leaf entry,
- * and so normally don't require allocations from the memcache. The
- * only exception to this is when dirty logging is enabled at runtime
- * and a write fault needs to collapse a block entry into a table.
- */
- if (fault_status != ESR_ELx_FSC_PERM ||
- (logging_active && write_fault)) {
- ret = kvm_mmu_topup_memory_cache(memcache,
- kvm_mmu_cache_min_pages(kvm));
- if (ret)
- return ret;
- }
- mmu_seq = vcpu->kvm->mmu_invalidate_seq;
/*
- * Ensure the read of mmu_invalidate_seq happens before we call
- * gfn_to_pfn_prot (which calls get_user_pages), so that we don't risk
- * the page we just got a reference to gets unmapped before we have a
- * chance to grab the mmu_lock, which ensure that if the page gets
- * unmapped afterwards, the call to kvm_unmap_gfn will take it away
- * from us again properly. This smp_rmb() interacts with the smp_wmb()
- * in kvm_mmu_notifier_invalidate_<page|range_end>.
+ * Read mmu_invalidate_seq so that KVM can detect if the results of
+ * vma_lookup() or __gfn_to_pfn_memslot() become stale prior to
+ * acquiring kvm->mmu_lock.
*
- * Besides, __gfn_to_pfn_memslot() instead of gfn_to_pfn_prot() is
- * used to avoid unnecessary overhead introduced to locate the memory
- * slot because it's always fixed even @gfn is adjusted for huge pages.
+ * Rely on mmap_read_unlock() for an implicit smp_rmb(), which pairs
+ * with the smp_wmb() in kvm_mmu_invalidate_end().
*/
- smp_rmb();
+ mmu_seq = vcpu->kvm->mmu_invalidate_seq;
+ mmap_read_unlock(current->mm);
pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL,
write_fault, &writable, NULL);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 13ec9308a85702af7c31f3638a2720863848a7f2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040329-elongated-decal-968b@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
13ec9308a857 ("KVM: arm64: Retry fault if vma_lookup() results become invalid")
b0803ba72b55 ("KVM: arm64: Convert FSC_* over to ESR_ELx_FSC_*")
382b5b87a97d ("Merge branch kvm-arm64/mte-map-shared into kvmarm-master/next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 13ec9308a85702af7c31f3638a2720863848a7f2 Mon Sep 17 00:00:00 2001
From: David Matlack <dmatlack(a)google.com>
Date: Mon, 13 Mar 2023 16:54:54 -0700
Subject: [PATCH] KVM: arm64: Retry fault if vma_lookup() results become
invalid
Read mmu_invalidate_seq before dropping the mmap_lock so that KVM can
detect if the results of vma_lookup() (e.g. vma_shift) become stale
before it acquires kvm->mmu_lock. This fixes a theoretical bug where a
VMA could be changed by userspace after vma_lookup() and before KVM
reads the mmu_invalidate_seq, causing KVM to install page table entries
based on a (possibly) no-longer-valid vma_shift.
Re-order the MMU cache top-up to earlier in user_mem_abort() so that it
is not done after KVM has read mmu_invalidate_seq (i.e. so as to avoid
inducing spurious fault retries).
This bug has existed since KVM/ARM's inception. It's unlikely that any
sane userspace currently modifies VMAs in such a way as to trigger this
race. And even with directed testing I was unable to reproduce it. But a
sufficiently motivated host userspace might be able to exploit this
race.
Fixes: 94f8e6418d39 ("KVM: ARM: Handle guest faults in KVM")
Cc: stable(a)vger.kernel.org
Reported-by: Sean Christopherson <seanjc(a)google.com>
Signed-off-by: David Matlack <dmatlack(a)google.com>
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Link: https://lore.kernel.org/r/20230313235454.2964067-1-dmatlack@google.com
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 7113587222ff..f54408355d1d 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1217,6 +1217,20 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
return -EFAULT;
}
+ /*
+ * Permission faults just need to update the existing leaf entry,
+ * and so normally don't require allocations from the memcache. The
+ * only exception to this is when dirty logging is enabled at runtime
+ * and a write fault needs to collapse a block entry into a table.
+ */
+ if (fault_status != ESR_ELx_FSC_PERM ||
+ (logging_active && write_fault)) {
+ ret = kvm_mmu_topup_memory_cache(memcache,
+ kvm_mmu_cache_min_pages(kvm));
+ if (ret)
+ return ret;
+ }
+
/*
* Let's check if we will get back a huge page backed by hugetlbfs, or
* get block mapping for device MMIO region.
@@ -1269,37 +1283,17 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
fault_ipa &= ~(vma_pagesize - 1);
gfn = fault_ipa >> PAGE_SHIFT;
- mmap_read_unlock(current->mm);
-
- /*
- * Permission faults just need to update the existing leaf entry,
- * and so normally don't require allocations from the memcache. The
- * only exception to this is when dirty logging is enabled at runtime
- * and a write fault needs to collapse a block entry into a table.
- */
- if (fault_status != ESR_ELx_FSC_PERM ||
- (logging_active && write_fault)) {
- ret = kvm_mmu_topup_memory_cache(memcache,
- kvm_mmu_cache_min_pages(kvm));
- if (ret)
- return ret;
- }
- mmu_seq = vcpu->kvm->mmu_invalidate_seq;
/*
- * Ensure the read of mmu_invalidate_seq happens before we call
- * gfn_to_pfn_prot (which calls get_user_pages), so that we don't risk
- * the page we just got a reference to gets unmapped before we have a
- * chance to grab the mmu_lock, which ensure that if the page gets
- * unmapped afterwards, the call to kvm_unmap_gfn will take it away
- * from us again properly. This smp_rmb() interacts with the smp_wmb()
- * in kvm_mmu_notifier_invalidate_<page|range_end>.
+ * Read mmu_invalidate_seq so that KVM can detect if the results of
+ * vma_lookup() or __gfn_to_pfn_memslot() become stale prior to
+ * acquiring kvm->mmu_lock.
*
- * Besides, __gfn_to_pfn_memslot() instead of gfn_to_pfn_prot() is
- * used to avoid unnecessary overhead introduced to locate the memory
- * slot because it's always fixed even @gfn is adjusted for huge pages.
+ * Rely on mmap_read_unlock() for an implicit smp_rmb(), which pairs
+ * with the smp_wmb() in kvm_mmu_invalidate_end().
*/
- smp_rmb();
+ mmu_seq = vcpu->kvm->mmu_invalidate_seq;
+ mmap_read_unlock(current->mm);
pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL,
write_fault, &writable, NULL);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 13ec9308a85702af7c31f3638a2720863848a7f2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040328-antihero-anthem-461d@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
13ec9308a857 ("KVM: arm64: Retry fault if vma_lookup() results become invalid")
b0803ba72b55 ("KVM: arm64: Convert FSC_* over to ESR_ELx_FSC_*")
382b5b87a97d ("Merge branch kvm-arm64/mte-map-shared into kvmarm-master/next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 13ec9308a85702af7c31f3638a2720863848a7f2 Mon Sep 17 00:00:00 2001
From: David Matlack <dmatlack(a)google.com>
Date: Mon, 13 Mar 2023 16:54:54 -0700
Subject: [PATCH] KVM: arm64: Retry fault if vma_lookup() results become
invalid
Read mmu_invalidate_seq before dropping the mmap_lock so that KVM can
detect if the results of vma_lookup() (e.g. vma_shift) become stale
before it acquires kvm->mmu_lock. This fixes a theoretical bug where a
VMA could be changed by userspace after vma_lookup() and before KVM
reads the mmu_invalidate_seq, causing KVM to install page table entries
based on a (possibly) no-longer-valid vma_shift.
Re-order the MMU cache top-up to earlier in user_mem_abort() so that it
is not done after KVM has read mmu_invalidate_seq (i.e. so as to avoid
inducing spurious fault retries).
This bug has existed since KVM/ARM's inception. It's unlikely that any
sane userspace currently modifies VMAs in such a way as to trigger this
race. And even with directed testing I was unable to reproduce it. But a
sufficiently motivated host userspace might be able to exploit this
race.
Fixes: 94f8e6418d39 ("KVM: ARM: Handle guest faults in KVM")
Cc: stable(a)vger.kernel.org
Reported-by: Sean Christopherson <seanjc(a)google.com>
Signed-off-by: David Matlack <dmatlack(a)google.com>
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Link: https://lore.kernel.org/r/20230313235454.2964067-1-dmatlack@google.com
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 7113587222ff..f54408355d1d 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1217,6 +1217,20 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
return -EFAULT;
}
+ /*
+ * Permission faults just need to update the existing leaf entry,
+ * and so normally don't require allocations from the memcache. The
+ * only exception to this is when dirty logging is enabled at runtime
+ * and a write fault needs to collapse a block entry into a table.
+ */
+ if (fault_status != ESR_ELx_FSC_PERM ||
+ (logging_active && write_fault)) {
+ ret = kvm_mmu_topup_memory_cache(memcache,
+ kvm_mmu_cache_min_pages(kvm));
+ if (ret)
+ return ret;
+ }
+
/*
* Let's check if we will get back a huge page backed by hugetlbfs, or
* get block mapping for device MMIO region.
@@ -1269,37 +1283,17 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
fault_ipa &= ~(vma_pagesize - 1);
gfn = fault_ipa >> PAGE_SHIFT;
- mmap_read_unlock(current->mm);
-
- /*
- * Permission faults just need to update the existing leaf entry,
- * and so normally don't require allocations from the memcache. The
- * only exception to this is when dirty logging is enabled at runtime
- * and a write fault needs to collapse a block entry into a table.
- */
- if (fault_status != ESR_ELx_FSC_PERM ||
- (logging_active && write_fault)) {
- ret = kvm_mmu_topup_memory_cache(memcache,
- kvm_mmu_cache_min_pages(kvm));
- if (ret)
- return ret;
- }
- mmu_seq = vcpu->kvm->mmu_invalidate_seq;
/*
- * Ensure the read of mmu_invalidate_seq happens before we call
- * gfn_to_pfn_prot (which calls get_user_pages), so that we don't risk
- * the page we just got a reference to gets unmapped before we have a
- * chance to grab the mmu_lock, which ensure that if the page gets
- * unmapped afterwards, the call to kvm_unmap_gfn will take it away
- * from us again properly. This smp_rmb() interacts with the smp_wmb()
- * in kvm_mmu_notifier_invalidate_<page|range_end>.
+ * Read mmu_invalidate_seq so that KVM can detect if the results of
+ * vma_lookup() or __gfn_to_pfn_memslot() become stale prior to
+ * acquiring kvm->mmu_lock.
*
- * Besides, __gfn_to_pfn_memslot() instead of gfn_to_pfn_prot() is
- * used to avoid unnecessary overhead introduced to locate the memory
- * slot because it's always fixed even @gfn is adjusted for huge pages.
+ * Rely on mmap_read_unlock() for an implicit smp_rmb(), which pairs
+ * with the smp_wmb() in kvm_mmu_invalidate_end().
*/
- smp_rmb();
+ mmu_seq = vcpu->kvm->mmu_invalidate_seq;
+ mmap_read_unlock(current->mm);
pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL,
write_fault, &writable, NULL);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 13ec9308a85702af7c31f3638a2720863848a7f2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040327-volatile-unit-4e22@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
13ec9308a857 ("KVM: arm64: Retry fault if vma_lookup() results become invalid")
b0803ba72b55 ("KVM: arm64: Convert FSC_* over to ESR_ELx_FSC_*")
382b5b87a97d ("Merge branch kvm-arm64/mte-map-shared into kvmarm-master/next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 13ec9308a85702af7c31f3638a2720863848a7f2 Mon Sep 17 00:00:00 2001
From: David Matlack <dmatlack(a)google.com>
Date: Mon, 13 Mar 2023 16:54:54 -0700
Subject: [PATCH] KVM: arm64: Retry fault if vma_lookup() results become
invalid
Read mmu_invalidate_seq before dropping the mmap_lock so that KVM can
detect if the results of vma_lookup() (e.g. vma_shift) become stale
before it acquires kvm->mmu_lock. This fixes a theoretical bug where a
VMA could be changed by userspace after vma_lookup() and before KVM
reads the mmu_invalidate_seq, causing KVM to install page table entries
based on a (possibly) no-longer-valid vma_shift.
Re-order the MMU cache top-up to earlier in user_mem_abort() so that it
is not done after KVM has read mmu_invalidate_seq (i.e. so as to avoid
inducing spurious fault retries).
This bug has existed since KVM/ARM's inception. It's unlikely that any
sane userspace currently modifies VMAs in such a way as to trigger this
race. And even with directed testing I was unable to reproduce it. But a
sufficiently motivated host userspace might be able to exploit this
race.
Fixes: 94f8e6418d39 ("KVM: ARM: Handle guest faults in KVM")
Cc: stable(a)vger.kernel.org
Reported-by: Sean Christopherson <seanjc(a)google.com>
Signed-off-by: David Matlack <dmatlack(a)google.com>
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Link: https://lore.kernel.org/r/20230313235454.2964067-1-dmatlack@google.com
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 7113587222ff..f54408355d1d 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1217,6 +1217,20 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
return -EFAULT;
}
+ /*
+ * Permission faults just need to update the existing leaf entry,
+ * and so normally don't require allocations from the memcache. The
+ * only exception to this is when dirty logging is enabled at runtime
+ * and a write fault needs to collapse a block entry into a table.
+ */
+ if (fault_status != ESR_ELx_FSC_PERM ||
+ (logging_active && write_fault)) {
+ ret = kvm_mmu_topup_memory_cache(memcache,
+ kvm_mmu_cache_min_pages(kvm));
+ if (ret)
+ return ret;
+ }
+
/*
* Let's check if we will get back a huge page backed by hugetlbfs, or
* get block mapping for device MMIO region.
@@ -1269,37 +1283,17 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
fault_ipa &= ~(vma_pagesize - 1);
gfn = fault_ipa >> PAGE_SHIFT;
- mmap_read_unlock(current->mm);
-
- /*
- * Permission faults just need to update the existing leaf entry,
- * and so normally don't require allocations from the memcache. The
- * only exception to this is when dirty logging is enabled at runtime
- * and a write fault needs to collapse a block entry into a table.
- */
- if (fault_status != ESR_ELx_FSC_PERM ||
- (logging_active && write_fault)) {
- ret = kvm_mmu_topup_memory_cache(memcache,
- kvm_mmu_cache_min_pages(kvm));
- if (ret)
- return ret;
- }
- mmu_seq = vcpu->kvm->mmu_invalidate_seq;
/*
- * Ensure the read of mmu_invalidate_seq happens before we call
- * gfn_to_pfn_prot (which calls get_user_pages), so that we don't risk
- * the page we just got a reference to gets unmapped before we have a
- * chance to grab the mmu_lock, which ensure that if the page gets
- * unmapped afterwards, the call to kvm_unmap_gfn will take it away
- * from us again properly. This smp_rmb() interacts with the smp_wmb()
- * in kvm_mmu_notifier_invalidate_<page|range_end>.
+ * Read mmu_invalidate_seq so that KVM can detect if the results of
+ * vma_lookup() or __gfn_to_pfn_memslot() become stale prior to
+ * acquiring kvm->mmu_lock.
*
- * Besides, __gfn_to_pfn_memslot() instead of gfn_to_pfn_prot() is
- * used to avoid unnecessary overhead introduced to locate the memory
- * slot because it's always fixed even @gfn is adjusted for huge pages.
+ * Rely on mmap_read_unlock() for an implicit smp_rmb(), which pairs
+ * with the smp_wmb() in kvm_mmu_invalidate_end().
*/
- smp_rmb();
+ mmu_seq = vcpu->kvm->mmu_invalidate_seq;
+ mmap_read_unlock(current->mm);
pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL,
write_fault, &writable, NULL);
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 13ec9308a85702af7c31f3638a2720863848a7f2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040325-tables-zigzagged-e167@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
13ec9308a857 ("KVM: arm64: Retry fault if vma_lookup() results become invalid")
b0803ba72b55 ("KVM: arm64: Convert FSC_* over to ESR_ELx_FSC_*")
382b5b87a97d ("Merge branch kvm-arm64/mte-map-shared into kvmarm-master/next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 13ec9308a85702af7c31f3638a2720863848a7f2 Mon Sep 17 00:00:00 2001
From: David Matlack <dmatlack(a)google.com>
Date: Mon, 13 Mar 2023 16:54:54 -0700
Subject: [PATCH] KVM: arm64: Retry fault if vma_lookup() results become
invalid
Read mmu_invalidate_seq before dropping the mmap_lock so that KVM can
detect if the results of vma_lookup() (e.g. vma_shift) become stale
before it acquires kvm->mmu_lock. This fixes a theoretical bug where a
VMA could be changed by userspace after vma_lookup() and before KVM
reads the mmu_invalidate_seq, causing KVM to install page table entries
based on a (possibly) no-longer-valid vma_shift.
Re-order the MMU cache top-up to earlier in user_mem_abort() so that it
is not done after KVM has read mmu_invalidate_seq (i.e. so as to avoid
inducing spurious fault retries).
This bug has existed since KVM/ARM's inception. It's unlikely that any
sane userspace currently modifies VMAs in such a way as to trigger this
race. And even with directed testing I was unable to reproduce it. But a
sufficiently motivated host userspace might be able to exploit this
race.
Fixes: 94f8e6418d39 ("KVM: ARM: Handle guest faults in KVM")
Cc: stable(a)vger.kernel.org
Reported-by: Sean Christopherson <seanjc(a)google.com>
Signed-off-by: David Matlack <dmatlack(a)google.com>
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Link: https://lore.kernel.org/r/20230313235454.2964067-1-dmatlack@google.com
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 7113587222ff..f54408355d1d 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1217,6 +1217,20 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
return -EFAULT;
}
+ /*
+ * Permission faults just need to update the existing leaf entry,
+ * and so normally don't require allocations from the memcache. The
+ * only exception to this is when dirty logging is enabled at runtime
+ * and a write fault needs to collapse a block entry into a table.
+ */
+ if (fault_status != ESR_ELx_FSC_PERM ||
+ (logging_active && write_fault)) {
+ ret = kvm_mmu_topup_memory_cache(memcache,
+ kvm_mmu_cache_min_pages(kvm));
+ if (ret)
+ return ret;
+ }
+
/*
* Let's check if we will get back a huge page backed by hugetlbfs, or
* get block mapping for device MMIO region.
@@ -1269,37 +1283,17 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
fault_ipa &= ~(vma_pagesize - 1);
gfn = fault_ipa >> PAGE_SHIFT;
- mmap_read_unlock(current->mm);
-
- /*
- * Permission faults just need to update the existing leaf entry,
- * and so normally don't require allocations from the memcache. The
- * only exception to this is when dirty logging is enabled at runtime
- * and a write fault needs to collapse a block entry into a table.
- */
- if (fault_status != ESR_ELx_FSC_PERM ||
- (logging_active && write_fault)) {
- ret = kvm_mmu_topup_memory_cache(memcache,
- kvm_mmu_cache_min_pages(kvm));
- if (ret)
- return ret;
- }
- mmu_seq = vcpu->kvm->mmu_invalidate_seq;
/*
- * Ensure the read of mmu_invalidate_seq happens before we call
- * gfn_to_pfn_prot (which calls get_user_pages), so that we don't risk
- * the page we just got a reference to gets unmapped before we have a
- * chance to grab the mmu_lock, which ensure that if the page gets
- * unmapped afterwards, the call to kvm_unmap_gfn will take it away
- * from us again properly. This smp_rmb() interacts with the smp_wmb()
- * in kvm_mmu_notifier_invalidate_<page|range_end>.
+ * Read mmu_invalidate_seq so that KVM can detect if the results of
+ * vma_lookup() or __gfn_to_pfn_memslot() become stale prior to
+ * acquiring kvm->mmu_lock.
*
- * Besides, __gfn_to_pfn_memslot() instead of gfn_to_pfn_prot() is
- * used to avoid unnecessary overhead introduced to locate the memory
- * slot because it's always fixed even @gfn is adjusted for huge pages.
+ * Rely on mmap_read_unlock() for an implicit smp_rmb(), which pairs
+ * with the smp_wmb() in kvm_mmu_invalidate_end().
*/
- smp_rmb();
+ mmu_seq = vcpu->kvm->mmu_invalidate_seq;
+ mmap_read_unlock(current->mm);
pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL,
write_fault, &writable, NULL);
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x f6da81f650fa47b61b847488f3938d43f90d093d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040314-wolf-chrome-d9ce@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
f6da81f650fa ("KVM: arm64: PMU: Don't save PMCR_EL0.{C,P} for the vCPU")
64d6820d64c0 ("KVM: arm64: PMU: Sanitise PMCR_EL0.LP on first vcpu run")
11af4c37165e ("KVM: arm64: PMU: Implement PMUv3p5 long counter support")
3d0dba5764b9 ("KVM: arm64: PMU: Move the ID_AA64DFR0_EL1.PMUver limit to VM creation")
c82d28cbf1d4 ("KVM: arm64: PMU: Distinguish between 64bit counter and 64bit overflow")
bead02204e98 ("KVM: arm64: PMU: Align chained counter implementation with architecture pseudocode")
121a8fc088f1 ("arm64/sysreg: Use feature numbering for PMU and SPE revisions")
fcf37b38ff22 ("arm64/sysreg: Add _EL1 into ID_AA64DFR0_EL1 definition names")
c0357a73fa4a ("arm64/sysreg: Align field names in ID_AA64DFR0_EL1 with architecture")
8794b4f510f7 ("Merge branch kvm-arm64/per-vcpu-host-pmu-data into kvmarm-master/next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f6da81f650fa47b61b847488f3938d43f90d093d Mon Sep 17 00:00:00 2001
From: Reiji Watanabe <reijiw(a)google.com>
Date: Sun, 12 Mar 2023 20:32:34 -0700
Subject: [PATCH] KVM: arm64: PMU: Don't save PMCR_EL0.{C,P} for the vCPU
Presently, when a guest writes 1 to PMCR_EL0.{C,P}, which is WO/RAZ,
KVM saves the register value, including these bits.
When userspace reads the register using KVM_GET_ONE_REG, KVM returns
the saved register value as it is (the saved value might have these
bits set). This could result in userspace setting these bits on the
destination during migration. Consequently, KVM may end up resetting
the vPMU counter registers (PMCCNTR_EL0 and/or PMEVCNTR<n>_EL0) to
zero on the first KVM_RUN after migration.
Fix this by not saving those bits when a guest writes 1 to those bits.
Fixes: ab9468340d2b ("arm64: KVM: Add access handler for PMCR register")
Cc: stable(a)vger.kernel.org
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Signed-off-by: Reiji Watanabe <reijiw(a)google.com>
Link: https://lore.kernel.org/r/20230313033234.1475987-1-reijiw@google.com
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 24908400e190..c243b10f3e15 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -538,7 +538,8 @@ void kvm_pmu_handle_pmcr(struct kvm_vcpu *vcpu, u64 val)
if (!kvm_pmu_is_3p5(vcpu))
val &= ~ARMV8_PMU_PMCR_LP;
- __vcpu_sys_reg(vcpu, PMCR_EL0) = val;
+ /* The reset bits don't indicate any state, and shouldn't be saved. */
+ __vcpu_sys_reg(vcpu, PMCR_EL0) = val & ~(ARMV8_PMU_PMCR_C | ARMV8_PMU_PMCR_P);
if (val & ARMV8_PMU_PMCR_E) {
kvm_pmu_enable_counter_mask(vcpu,
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x f6da81f650fa47b61b847488f3938d43f90d093d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040313-squid-purchase-3fb2@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
f6da81f650fa ("KVM: arm64: PMU: Don't save PMCR_EL0.{C,P} for the vCPU")
64d6820d64c0 ("KVM: arm64: PMU: Sanitise PMCR_EL0.LP on first vcpu run")
11af4c37165e ("KVM: arm64: PMU: Implement PMUv3p5 long counter support")
3d0dba5764b9 ("KVM: arm64: PMU: Move the ID_AA64DFR0_EL1.PMUver limit to VM creation")
c82d28cbf1d4 ("KVM: arm64: PMU: Distinguish between 64bit counter and 64bit overflow")
bead02204e98 ("KVM: arm64: PMU: Align chained counter implementation with architecture pseudocode")
121a8fc088f1 ("arm64/sysreg: Use feature numbering for PMU and SPE revisions")
fcf37b38ff22 ("arm64/sysreg: Add _EL1 into ID_AA64DFR0_EL1 definition names")
c0357a73fa4a ("arm64/sysreg: Align field names in ID_AA64DFR0_EL1 with architecture")
8794b4f510f7 ("Merge branch kvm-arm64/per-vcpu-host-pmu-data into kvmarm-master/next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f6da81f650fa47b61b847488f3938d43f90d093d Mon Sep 17 00:00:00 2001
From: Reiji Watanabe <reijiw(a)google.com>
Date: Sun, 12 Mar 2023 20:32:34 -0700
Subject: [PATCH] KVM: arm64: PMU: Don't save PMCR_EL0.{C,P} for the vCPU
Presently, when a guest writes 1 to PMCR_EL0.{C,P}, which is WO/RAZ,
KVM saves the register value, including these bits.
When userspace reads the register using KVM_GET_ONE_REG, KVM returns
the saved register value as it is (the saved value might have these
bits set). This could result in userspace setting these bits on the
destination during migration. Consequently, KVM may end up resetting
the vPMU counter registers (PMCCNTR_EL0 and/or PMEVCNTR<n>_EL0) to
zero on the first KVM_RUN after migration.
Fix this by not saving those bits when a guest writes 1 to those bits.
Fixes: ab9468340d2b ("arm64: KVM: Add access handler for PMCR register")
Cc: stable(a)vger.kernel.org
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Signed-off-by: Reiji Watanabe <reijiw(a)google.com>
Link: https://lore.kernel.org/r/20230313033234.1475987-1-reijiw@google.com
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 24908400e190..c243b10f3e15 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -538,7 +538,8 @@ void kvm_pmu_handle_pmcr(struct kvm_vcpu *vcpu, u64 val)
if (!kvm_pmu_is_3p5(vcpu))
val &= ~ARMV8_PMU_PMCR_LP;
- __vcpu_sys_reg(vcpu, PMCR_EL0) = val;
+ /* The reset bits don't indicate any state, and shouldn't be saved. */
+ __vcpu_sys_reg(vcpu, PMCR_EL0) = val & ~(ARMV8_PMU_PMCR_C | ARMV8_PMU_PMCR_P);
if (val & ARMV8_PMU_PMCR_E) {
kvm_pmu_enable_counter_mask(vcpu,
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x f6da81f650fa47b61b847488f3938d43f90d093d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040312-uphold-satisfied-aa62@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
f6da81f650fa ("KVM: arm64: PMU: Don't save PMCR_EL0.{C,P} for the vCPU")
64d6820d64c0 ("KVM: arm64: PMU: Sanitise PMCR_EL0.LP on first vcpu run")
11af4c37165e ("KVM: arm64: PMU: Implement PMUv3p5 long counter support")
3d0dba5764b9 ("KVM: arm64: PMU: Move the ID_AA64DFR0_EL1.PMUver limit to VM creation")
c82d28cbf1d4 ("KVM: arm64: PMU: Distinguish between 64bit counter and 64bit overflow")
bead02204e98 ("KVM: arm64: PMU: Align chained counter implementation with architecture pseudocode")
121a8fc088f1 ("arm64/sysreg: Use feature numbering for PMU and SPE revisions")
fcf37b38ff22 ("arm64/sysreg: Add _EL1 into ID_AA64DFR0_EL1 definition names")
c0357a73fa4a ("arm64/sysreg: Align field names in ID_AA64DFR0_EL1 with architecture")
8794b4f510f7 ("Merge branch kvm-arm64/per-vcpu-host-pmu-data into kvmarm-master/next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f6da81f650fa47b61b847488f3938d43f90d093d Mon Sep 17 00:00:00 2001
From: Reiji Watanabe <reijiw(a)google.com>
Date: Sun, 12 Mar 2023 20:32:34 -0700
Subject: [PATCH] KVM: arm64: PMU: Don't save PMCR_EL0.{C,P} for the vCPU
Presently, when a guest writes 1 to PMCR_EL0.{C,P}, which is WO/RAZ,
KVM saves the register value, including these bits.
When userspace reads the register using KVM_GET_ONE_REG, KVM returns
the saved register value as it is (the saved value might have these
bits set). This could result in userspace setting these bits on the
destination during migration. Consequently, KVM may end up resetting
the vPMU counter registers (PMCCNTR_EL0 and/or PMEVCNTR<n>_EL0) to
zero on the first KVM_RUN after migration.
Fix this by not saving those bits when a guest writes 1 to those bits.
Fixes: ab9468340d2b ("arm64: KVM: Add access handler for PMCR register")
Cc: stable(a)vger.kernel.org
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Signed-off-by: Reiji Watanabe <reijiw(a)google.com>
Link: https://lore.kernel.org/r/20230313033234.1475987-1-reijiw@google.com
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 24908400e190..c243b10f3e15 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -538,7 +538,8 @@ void kvm_pmu_handle_pmcr(struct kvm_vcpu *vcpu, u64 val)
if (!kvm_pmu_is_3p5(vcpu))
val &= ~ARMV8_PMU_PMCR_LP;
- __vcpu_sys_reg(vcpu, PMCR_EL0) = val;
+ /* The reset bits don't indicate any state, and shouldn't be saved. */
+ __vcpu_sys_reg(vcpu, PMCR_EL0) = val & ~(ARMV8_PMU_PMCR_C | ARMV8_PMU_PMCR_P);
if (val & ARMV8_PMU_PMCR_E) {
kvm_pmu_enable_counter_mask(vcpu,
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x f6da81f650fa47b61b847488f3938d43f90d093d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040311-tipper-siding-3f17@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
f6da81f650fa ("KVM: arm64: PMU: Don't save PMCR_EL0.{C,P} for the vCPU")
64d6820d64c0 ("KVM: arm64: PMU: Sanitise PMCR_EL0.LP on first vcpu run")
11af4c37165e ("KVM: arm64: PMU: Implement PMUv3p5 long counter support")
3d0dba5764b9 ("KVM: arm64: PMU: Move the ID_AA64DFR0_EL1.PMUver limit to VM creation")
c82d28cbf1d4 ("KVM: arm64: PMU: Distinguish between 64bit counter and 64bit overflow")
bead02204e98 ("KVM: arm64: PMU: Align chained counter implementation with architecture pseudocode")
121a8fc088f1 ("arm64/sysreg: Use feature numbering for PMU and SPE revisions")
fcf37b38ff22 ("arm64/sysreg: Add _EL1 into ID_AA64DFR0_EL1 definition names")
c0357a73fa4a ("arm64/sysreg: Align field names in ID_AA64DFR0_EL1 with architecture")
8794b4f510f7 ("Merge branch kvm-arm64/per-vcpu-host-pmu-data into kvmarm-master/next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f6da81f650fa47b61b847488f3938d43f90d093d Mon Sep 17 00:00:00 2001
From: Reiji Watanabe <reijiw(a)google.com>
Date: Sun, 12 Mar 2023 20:32:34 -0700
Subject: [PATCH] KVM: arm64: PMU: Don't save PMCR_EL0.{C,P} for the vCPU
Presently, when a guest writes 1 to PMCR_EL0.{C,P}, which is WO/RAZ,
KVM saves the register value, including these bits.
When userspace reads the register using KVM_GET_ONE_REG, KVM returns
the saved register value as it is (the saved value might have these
bits set). This could result in userspace setting these bits on the
destination during migration. Consequently, KVM may end up resetting
the vPMU counter registers (PMCCNTR_EL0 and/or PMEVCNTR<n>_EL0) to
zero on the first KVM_RUN after migration.
Fix this by not saving those bits when a guest writes 1 to those bits.
Fixes: ab9468340d2b ("arm64: KVM: Add access handler for PMCR register")
Cc: stable(a)vger.kernel.org
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Signed-off-by: Reiji Watanabe <reijiw(a)google.com>
Link: https://lore.kernel.org/r/20230313033234.1475987-1-reijiw@google.com
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 24908400e190..c243b10f3e15 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -538,7 +538,8 @@ void kvm_pmu_handle_pmcr(struct kvm_vcpu *vcpu, u64 val)
if (!kvm_pmu_is_3p5(vcpu))
val &= ~ARMV8_PMU_PMCR_LP;
- __vcpu_sys_reg(vcpu, PMCR_EL0) = val;
+ /* The reset bits don't indicate any state, and shouldn't be saved. */
+ __vcpu_sys_reg(vcpu, PMCR_EL0) = val & ~(ARMV8_PMU_PMCR_C | ARMV8_PMU_PMCR_P);
if (val & ARMV8_PMU_PMCR_E) {
kvm_pmu_enable_counter_mask(vcpu,
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x f6da81f650fa47b61b847488f3938d43f90d093d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040310-snowbird-bacteria-e989@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
f6da81f650fa ("KVM: arm64: PMU: Don't save PMCR_EL0.{C,P} for the vCPU")
64d6820d64c0 ("KVM: arm64: PMU: Sanitise PMCR_EL0.LP on first vcpu run")
11af4c37165e ("KVM: arm64: PMU: Implement PMUv3p5 long counter support")
3d0dba5764b9 ("KVM: arm64: PMU: Move the ID_AA64DFR0_EL1.PMUver limit to VM creation")
c82d28cbf1d4 ("KVM: arm64: PMU: Distinguish between 64bit counter and 64bit overflow")
bead02204e98 ("KVM: arm64: PMU: Align chained counter implementation with architecture pseudocode")
121a8fc088f1 ("arm64/sysreg: Use feature numbering for PMU and SPE revisions")
fcf37b38ff22 ("arm64/sysreg: Add _EL1 into ID_AA64DFR0_EL1 definition names")
c0357a73fa4a ("arm64/sysreg: Align field names in ID_AA64DFR0_EL1 with architecture")
8794b4f510f7 ("Merge branch kvm-arm64/per-vcpu-host-pmu-data into kvmarm-master/next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f6da81f650fa47b61b847488f3938d43f90d093d Mon Sep 17 00:00:00 2001
From: Reiji Watanabe <reijiw(a)google.com>
Date: Sun, 12 Mar 2023 20:32:34 -0700
Subject: [PATCH] KVM: arm64: PMU: Don't save PMCR_EL0.{C,P} for the vCPU
Presently, when a guest writes 1 to PMCR_EL0.{C,P}, which is WO/RAZ,
KVM saves the register value, including these bits.
When userspace reads the register using KVM_GET_ONE_REG, KVM returns
the saved register value as it is (the saved value might have these
bits set). This could result in userspace setting these bits on the
destination during migration. Consequently, KVM may end up resetting
the vPMU counter registers (PMCCNTR_EL0 and/or PMEVCNTR<n>_EL0) to
zero on the first KVM_RUN after migration.
Fix this by not saving those bits when a guest writes 1 to those bits.
Fixes: ab9468340d2b ("arm64: KVM: Add access handler for PMCR register")
Cc: stable(a)vger.kernel.org
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Signed-off-by: Reiji Watanabe <reijiw(a)google.com>
Link: https://lore.kernel.org/r/20230313033234.1475987-1-reijiw@google.com
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 24908400e190..c243b10f3e15 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -538,7 +538,8 @@ void kvm_pmu_handle_pmcr(struct kvm_vcpu *vcpu, u64 val)
if (!kvm_pmu_is_3p5(vcpu))
val &= ~ARMV8_PMU_PMCR_LP;
- __vcpu_sys_reg(vcpu, PMCR_EL0) = val;
+ /* The reset bits don't indicate any state, and shouldn't be saved. */
+ __vcpu_sys_reg(vcpu, PMCR_EL0) = val & ~(ARMV8_PMU_PMCR_C | ARMV8_PMU_PMCR_P);
if (val & ARMV8_PMU_PMCR_E) {
kvm_pmu_enable_counter_mask(vcpu,
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x f6da81f650fa47b61b847488f3938d43f90d093d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040309-oxidize-stapling-5dc3@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
f6da81f650fa ("KVM: arm64: PMU: Don't save PMCR_EL0.{C,P} for the vCPU")
64d6820d64c0 ("KVM: arm64: PMU: Sanitise PMCR_EL0.LP on first vcpu run")
11af4c37165e ("KVM: arm64: PMU: Implement PMUv3p5 long counter support")
3d0dba5764b9 ("KVM: arm64: PMU: Move the ID_AA64DFR0_EL1.PMUver limit to VM creation")
c82d28cbf1d4 ("KVM: arm64: PMU: Distinguish between 64bit counter and 64bit overflow")
bead02204e98 ("KVM: arm64: PMU: Align chained counter implementation with architecture pseudocode")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f6da81f650fa47b61b847488f3938d43f90d093d Mon Sep 17 00:00:00 2001
From: Reiji Watanabe <reijiw(a)google.com>
Date: Sun, 12 Mar 2023 20:32:34 -0700
Subject: [PATCH] KVM: arm64: PMU: Don't save PMCR_EL0.{C,P} for the vCPU
Presently, when a guest writes 1 to PMCR_EL0.{C,P}, which is WO/RAZ,
KVM saves the register value, including these bits.
When userspace reads the register using KVM_GET_ONE_REG, KVM returns
the saved register value as it is (the saved value might have these
bits set). This could result in userspace setting these bits on the
destination during migration. Consequently, KVM may end up resetting
the vPMU counter registers (PMCCNTR_EL0 and/or PMEVCNTR<n>_EL0) to
zero on the first KVM_RUN after migration.
Fix this by not saving those bits when a guest writes 1 to those bits.
Fixes: ab9468340d2b ("arm64: KVM: Add access handler for PMCR register")
Cc: stable(a)vger.kernel.org
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Signed-off-by: Reiji Watanabe <reijiw(a)google.com>
Link: https://lore.kernel.org/r/20230313033234.1475987-1-reijiw@google.com
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 24908400e190..c243b10f3e15 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -538,7 +538,8 @@ void kvm_pmu_handle_pmcr(struct kvm_vcpu *vcpu, u64 val)
if (!kvm_pmu_is_3p5(vcpu))
val &= ~ARMV8_PMU_PMCR_LP;
- __vcpu_sys_reg(vcpu, PMCR_EL0) = val;
+ /* The reset bits don't indicate any state, and shouldn't be saved. */
+ __vcpu_sys_reg(vcpu, PMCR_EL0) = val & ~(ARMV8_PMU_PMCR_C | ARMV8_PMU_PMCR_P);
if (val & ARMV8_PMU_PMCR_E) {
kvm_pmu_enable_counter_mask(vcpu,
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x 9228b26194d1cc00449f12f306f53ef2e234a55b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040358-smite-passover-92f2@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
9228b26194d1 ("KVM: arm64: PMU: Fix GET_ONE_REG for vPMC regs to return the current value")
0ab410a93d62 ("KVM: arm64: Narrow PMU sysreg reset values to architectural requirements")
11663111cd49 ("KVM: arm64: Hide PMU registers from userspace when not available")
7ccadf23b861 ("KVM: arm64: Add missing reset handlers for PMU emulation")
03fdfb269009 ("KVM: arm64: Don't write junk to sysregs on reset")
20589c8cc47d ("arm/arm64: KVM: Don't panic on failure to properly reset system registers")
8d404c4c2461 ("KVM: arm64: Rewrite system register accessors to read/write functions")
52f6c4f02164 ("KVM: arm64: Change 32-bit handling of VM system registers")
54ceb1bcf8d8 ("KVM: arm64: Move debug dirty flag calculation out of world switch")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9228b26194d1cc00449f12f306f53ef2e234a55b Mon Sep 17 00:00:00 2001
From: Reiji Watanabe <reijiw(a)google.com>
Date: Sun, 12 Mar 2023 20:32:08 -0700
Subject: [PATCH] KVM: arm64: PMU: Fix GET_ONE_REG for vPMC regs to return the
current value
Have KVM_GET_ONE_REG for vPMU counter (vPMC) registers (PMCCNTR_EL0
and PMEVCNTR<n>_EL0) return the sum of the register value in the sysreg
file and the current perf event counter value.
Values of vPMC registers are saved in sysreg files on certain occasions.
These saved values don't represent the current values of the vPMC
registers if the perf events for the vPMCs count events after the save.
The current values of those registers are the sum of the sysreg file
value and the current perf event counter value. But, when userspace
reads those registers (using KVM_GET_ONE_REG), KVM returns the sysreg
file value to userspace (not the sum value).
Fix this to return the sum value for KVM_GET_ONE_REG.
Fixes: 051ff581ce70 ("arm64: KVM: Add access handler for event counter register")
Cc: stable(a)vger.kernel.org
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Signed-off-by: Reiji Watanabe <reijiw(a)google.com>
Link: https://lore.kernel.org/r/20230313033208.1475499-1-reijiw@google.com
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 53749d3a0996..1b2c161120be 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -856,6 +856,22 @@ static bool pmu_counter_idx_valid(struct kvm_vcpu *vcpu, u64 idx)
return true;
}
+static int get_pmu_evcntr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
+ u64 *val)
+{
+ u64 idx;
+
+ if (r->CRn == 9 && r->CRm == 13 && r->Op2 == 0)
+ /* PMCCNTR_EL0 */
+ idx = ARMV8_PMU_CYCLE_IDX;
+ else
+ /* PMEVCNTRn_EL0 */
+ idx = ((r->CRm & 3) << 3) | (r->Op2 & 7);
+
+ *val = kvm_pmu_get_counter_value(vcpu, idx);
+ return 0;
+}
+
static bool access_pmu_evcntr(struct kvm_vcpu *vcpu,
struct sys_reg_params *p,
const struct sys_reg_desc *r)
@@ -1072,7 +1088,7 @@ static bool access_pmuserenr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
/* Macro to expand the PMEVCNTRn_EL0 register */
#define PMU_PMEVCNTR_EL0(n) \
{ PMU_SYS_REG(SYS_PMEVCNTRn_EL0(n)), \
- .reset = reset_pmevcntr, \
+ .reset = reset_pmevcntr, .get_user = get_pmu_evcntr, \
.access = access_pmu_evcntr, .reg = (PMEVCNTR0_EL0 + n), }
/* Macro to expand the PMEVTYPERn_EL0 register */
@@ -1982,7 +1998,8 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ PMU_SYS_REG(SYS_PMCEID1_EL0),
.access = access_pmceid, .reset = NULL },
{ PMU_SYS_REG(SYS_PMCCNTR_EL0),
- .access = access_pmu_evcntr, .reset = reset_unknown, .reg = PMCCNTR_EL0 },
+ .access = access_pmu_evcntr, .reset = reset_unknown,
+ .reg = PMCCNTR_EL0, .get_user = get_pmu_evcntr},
{ PMU_SYS_REG(SYS_PMXEVTYPER_EL0),
.access = access_pmu_evtyper, .reset = NULL },
{ PMU_SYS_REG(SYS_PMXEVCNTR_EL0),
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 9228b26194d1cc00449f12f306f53ef2e234a55b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040356-immerse-arming-cc39@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
9228b26194d1 ("KVM: arm64: PMU: Fix GET_ONE_REG for vPMC regs to return the current value")
0ab410a93d62 ("KVM: arm64: Narrow PMU sysreg reset values to architectural requirements")
11663111cd49 ("KVM: arm64: Hide PMU registers from userspace when not available")
7ccadf23b861 ("KVM: arm64: Add missing reset handlers for PMU emulation")
03fdfb269009 ("KVM: arm64: Don't write junk to sysregs on reset")
20589c8cc47d ("arm/arm64: KVM: Don't panic on failure to properly reset system registers")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9228b26194d1cc00449f12f306f53ef2e234a55b Mon Sep 17 00:00:00 2001
From: Reiji Watanabe <reijiw(a)google.com>
Date: Sun, 12 Mar 2023 20:32:08 -0700
Subject: [PATCH] KVM: arm64: PMU: Fix GET_ONE_REG for vPMC regs to return the
current value
Have KVM_GET_ONE_REG for vPMU counter (vPMC) registers (PMCCNTR_EL0
and PMEVCNTR<n>_EL0) return the sum of the register value in the sysreg
file and the current perf event counter value.
Values of vPMC registers are saved in sysreg files on certain occasions.
These saved values don't represent the current values of the vPMC
registers if the perf events for the vPMCs count events after the save.
The current values of those registers are the sum of the sysreg file
value and the current perf event counter value. But, when userspace
reads those registers (using KVM_GET_ONE_REG), KVM returns the sysreg
file value to userspace (not the sum value).
Fix this to return the sum value for KVM_GET_ONE_REG.
Fixes: 051ff581ce70 ("arm64: KVM: Add access handler for event counter register")
Cc: stable(a)vger.kernel.org
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Signed-off-by: Reiji Watanabe <reijiw(a)google.com>
Link: https://lore.kernel.org/r/20230313033208.1475499-1-reijiw@google.com
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 53749d3a0996..1b2c161120be 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -856,6 +856,22 @@ static bool pmu_counter_idx_valid(struct kvm_vcpu *vcpu, u64 idx)
return true;
}
+static int get_pmu_evcntr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
+ u64 *val)
+{
+ u64 idx;
+
+ if (r->CRn == 9 && r->CRm == 13 && r->Op2 == 0)
+ /* PMCCNTR_EL0 */
+ idx = ARMV8_PMU_CYCLE_IDX;
+ else
+ /* PMEVCNTRn_EL0 */
+ idx = ((r->CRm & 3) << 3) | (r->Op2 & 7);
+
+ *val = kvm_pmu_get_counter_value(vcpu, idx);
+ return 0;
+}
+
static bool access_pmu_evcntr(struct kvm_vcpu *vcpu,
struct sys_reg_params *p,
const struct sys_reg_desc *r)
@@ -1072,7 +1088,7 @@ static bool access_pmuserenr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
/* Macro to expand the PMEVCNTRn_EL0 register */
#define PMU_PMEVCNTR_EL0(n) \
{ PMU_SYS_REG(SYS_PMEVCNTRn_EL0(n)), \
- .reset = reset_pmevcntr, \
+ .reset = reset_pmevcntr, .get_user = get_pmu_evcntr, \
.access = access_pmu_evcntr, .reg = (PMEVCNTR0_EL0 + n), }
/* Macro to expand the PMEVTYPERn_EL0 register */
@@ -1982,7 +1998,8 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ PMU_SYS_REG(SYS_PMCEID1_EL0),
.access = access_pmceid, .reset = NULL },
{ PMU_SYS_REG(SYS_PMCCNTR_EL0),
- .access = access_pmu_evcntr, .reset = reset_unknown, .reg = PMCCNTR_EL0 },
+ .access = access_pmu_evcntr, .reset = reset_unknown,
+ .reg = PMCCNTR_EL0, .get_user = get_pmu_evcntr},
{ PMU_SYS_REG(SYS_PMXEVTYPER_EL0),
.access = access_pmu_evtyper, .reset = NULL },
{ PMU_SYS_REG(SYS_PMXEVCNTR_EL0),
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 9228b26194d1cc00449f12f306f53ef2e234a55b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040355-harsh-endowment-6808@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
9228b26194d1 ("KVM: arm64: PMU: Fix GET_ONE_REG for vPMC regs to return the current value")
0ab410a93d62 ("KVM: arm64: Narrow PMU sysreg reset values to architectural requirements")
11663111cd49 ("KVM: arm64: Hide PMU registers from userspace when not available")
7ccadf23b861 ("KVM: arm64: Add missing reset handlers for PMU emulation")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9228b26194d1cc00449f12f306f53ef2e234a55b Mon Sep 17 00:00:00 2001
From: Reiji Watanabe <reijiw(a)google.com>
Date: Sun, 12 Mar 2023 20:32:08 -0700
Subject: [PATCH] KVM: arm64: PMU: Fix GET_ONE_REG for vPMC regs to return the
current value
Have KVM_GET_ONE_REG for vPMU counter (vPMC) registers (PMCCNTR_EL0
and PMEVCNTR<n>_EL0) return the sum of the register value in the sysreg
file and the current perf event counter value.
Values of vPMC registers are saved in sysreg files on certain occasions.
These saved values don't represent the current values of the vPMC
registers if the perf events for the vPMCs count events after the save.
The current values of those registers are the sum of the sysreg file
value and the current perf event counter value. But, when userspace
reads those registers (using KVM_GET_ONE_REG), KVM returns the sysreg
file value to userspace (not the sum value).
Fix this to return the sum value for KVM_GET_ONE_REG.
Fixes: 051ff581ce70 ("arm64: KVM: Add access handler for event counter register")
Cc: stable(a)vger.kernel.org
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Signed-off-by: Reiji Watanabe <reijiw(a)google.com>
Link: https://lore.kernel.org/r/20230313033208.1475499-1-reijiw@google.com
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 53749d3a0996..1b2c161120be 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -856,6 +856,22 @@ static bool pmu_counter_idx_valid(struct kvm_vcpu *vcpu, u64 idx)
return true;
}
+static int get_pmu_evcntr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
+ u64 *val)
+{
+ u64 idx;
+
+ if (r->CRn == 9 && r->CRm == 13 && r->Op2 == 0)
+ /* PMCCNTR_EL0 */
+ idx = ARMV8_PMU_CYCLE_IDX;
+ else
+ /* PMEVCNTRn_EL0 */
+ idx = ((r->CRm & 3) << 3) | (r->Op2 & 7);
+
+ *val = kvm_pmu_get_counter_value(vcpu, idx);
+ return 0;
+}
+
static bool access_pmu_evcntr(struct kvm_vcpu *vcpu,
struct sys_reg_params *p,
const struct sys_reg_desc *r)
@@ -1072,7 +1088,7 @@ static bool access_pmuserenr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
/* Macro to expand the PMEVCNTRn_EL0 register */
#define PMU_PMEVCNTR_EL0(n) \
{ PMU_SYS_REG(SYS_PMEVCNTRn_EL0(n)), \
- .reset = reset_pmevcntr, \
+ .reset = reset_pmevcntr, .get_user = get_pmu_evcntr, \
.access = access_pmu_evcntr, .reg = (PMEVCNTR0_EL0 + n), }
/* Macro to expand the PMEVTYPERn_EL0 register */
@@ -1982,7 +1998,8 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ PMU_SYS_REG(SYS_PMCEID1_EL0),
.access = access_pmceid, .reset = NULL },
{ PMU_SYS_REG(SYS_PMCCNTR_EL0),
- .access = access_pmu_evcntr, .reset = reset_unknown, .reg = PMCCNTR_EL0 },
+ .access = access_pmu_evcntr, .reset = reset_unknown,
+ .reg = PMCCNTR_EL0, .get_user = get_pmu_evcntr},
{ PMU_SYS_REG(SYS_PMXEVTYPER_EL0),
.access = access_pmu_evtyper, .reset = NULL },
{ PMU_SYS_REG(SYS_PMXEVCNTR_EL0),
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 9228b26194d1cc00449f12f306f53ef2e234a55b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040354-armful-augmented-752b@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
9228b26194d1 ("KVM: arm64: PMU: Fix GET_ONE_REG for vPMC regs to return the current value")
0ab410a93d62 ("KVM: arm64: Narrow PMU sysreg reset values to architectural requirements")
11663111cd49 ("KVM: arm64: Hide PMU registers from userspace when not available")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9228b26194d1cc00449f12f306f53ef2e234a55b Mon Sep 17 00:00:00 2001
From: Reiji Watanabe <reijiw(a)google.com>
Date: Sun, 12 Mar 2023 20:32:08 -0700
Subject: [PATCH] KVM: arm64: PMU: Fix GET_ONE_REG for vPMC regs to return the
current value
Have KVM_GET_ONE_REG for vPMU counter (vPMC) registers (PMCCNTR_EL0
and PMEVCNTR<n>_EL0) return the sum of the register value in the sysreg
file and the current perf event counter value.
Values of vPMC registers are saved in sysreg files on certain occasions.
These saved values don't represent the current values of the vPMC
registers if the perf events for the vPMCs count events after the save.
The current values of those registers are the sum of the sysreg file
value and the current perf event counter value. But, when userspace
reads those registers (using KVM_GET_ONE_REG), KVM returns the sysreg
file value to userspace (not the sum value).
Fix this to return the sum value for KVM_GET_ONE_REG.
Fixes: 051ff581ce70 ("arm64: KVM: Add access handler for event counter register")
Cc: stable(a)vger.kernel.org
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Signed-off-by: Reiji Watanabe <reijiw(a)google.com>
Link: https://lore.kernel.org/r/20230313033208.1475499-1-reijiw@google.com
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 53749d3a0996..1b2c161120be 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -856,6 +856,22 @@ static bool pmu_counter_idx_valid(struct kvm_vcpu *vcpu, u64 idx)
return true;
}
+static int get_pmu_evcntr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
+ u64 *val)
+{
+ u64 idx;
+
+ if (r->CRn == 9 && r->CRm == 13 && r->Op2 == 0)
+ /* PMCCNTR_EL0 */
+ idx = ARMV8_PMU_CYCLE_IDX;
+ else
+ /* PMEVCNTRn_EL0 */
+ idx = ((r->CRm & 3) << 3) | (r->Op2 & 7);
+
+ *val = kvm_pmu_get_counter_value(vcpu, idx);
+ return 0;
+}
+
static bool access_pmu_evcntr(struct kvm_vcpu *vcpu,
struct sys_reg_params *p,
const struct sys_reg_desc *r)
@@ -1072,7 +1088,7 @@ static bool access_pmuserenr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
/* Macro to expand the PMEVCNTRn_EL0 register */
#define PMU_PMEVCNTR_EL0(n) \
{ PMU_SYS_REG(SYS_PMEVCNTRn_EL0(n)), \
- .reset = reset_pmevcntr, \
+ .reset = reset_pmevcntr, .get_user = get_pmu_evcntr, \
.access = access_pmu_evcntr, .reg = (PMEVCNTR0_EL0 + n), }
/* Macro to expand the PMEVTYPERn_EL0 register */
@@ -1982,7 +1998,8 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ PMU_SYS_REG(SYS_PMCEID1_EL0),
.access = access_pmceid, .reset = NULL },
{ PMU_SYS_REG(SYS_PMCCNTR_EL0),
- .access = access_pmu_evcntr, .reset = reset_unknown, .reg = PMCCNTR_EL0 },
+ .access = access_pmu_evcntr, .reset = reset_unknown,
+ .reg = PMCCNTR_EL0, .get_user = get_pmu_evcntr},
{ PMU_SYS_REG(SYS_PMXEVTYPER_EL0),
.access = access_pmu_evtyper, .reset = NULL },
{ PMU_SYS_REG(SYS_PMXEVCNTR_EL0),
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x 6165a16a5ad9b237bb3131cff4d3c601ccb8f9a3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040341-excitable-jaws-269d@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
6165a16a5ad9 ("NFSv4: Fix hangs when recovering open state after a server reboot")
e3c8dc761ead ("NFSv4: Check the return value of update_open_stateid()")
ace9fad43aa6 ("NFSv4: Convert struct nfs4_state to use refcount_t")
c9399f21c215 ("NFSv4: Fix OPEN / CLOSE race")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6165a16a5ad9b237bb3131cff4d3c601ccb8f9a3 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Date: Tue, 21 Mar 2023 00:17:36 -0400
Subject: [PATCH] NFSv4: Fix hangs when recovering open state after a server
reboot
When we're using a cached open stateid or a delegation in order to avoid
sending a CLAIM_PREVIOUS open RPC call to the server, we don't have a
new open stateid to present to update_open_stateid().
Instead rely on nfs4_try_open_cached(), just as if we were doing a
normal open.
Fixes: d2bfda2e7aa0 ("NFSv4: don't reprocess cached open CLAIM_PREVIOUS")
Cc: stable(a)vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker(a)Netapp.com>
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 22a93ae46cd7..5607b1e2b821 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1980,8 +1980,7 @@ _nfs4_opendata_reclaim_to_nfs4_state(struct nfs4_opendata *data)
if (!data->rpc_done) {
if (data->rpc_status)
return ERR_PTR(data->rpc_status);
- /* cached opens have already been processed */
- goto update;
+ return nfs4_try_open_cached(data);
}
ret = nfs_refresh_inode(inode, &data->f_attr);
@@ -1990,7 +1989,7 @@ _nfs4_opendata_reclaim_to_nfs4_state(struct nfs4_opendata *data)
if (data->o_res.delegation_type != 0)
nfs4_opendata_check_deleg(data, state);
-update:
+
if (!update_open_stateid(state, &data->o_res.stateid,
NULL, data->o_arg.fmode))
return ERR_PTR(-EAGAIN);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 6165a16a5ad9b237bb3131cff4d3c601ccb8f9a3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040340-bouncing-sampling-7639@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
6165a16a5ad9 ("NFSv4: Fix hangs when recovering open state after a server reboot")
e3c8dc761ead ("NFSv4: Check the return value of update_open_stateid()")
ace9fad43aa6 ("NFSv4: Convert struct nfs4_state to use refcount_t")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6165a16a5ad9b237bb3131cff4d3c601ccb8f9a3 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Date: Tue, 21 Mar 2023 00:17:36 -0400
Subject: [PATCH] NFSv4: Fix hangs when recovering open state after a server
reboot
When we're using a cached open stateid or a delegation in order to avoid
sending a CLAIM_PREVIOUS open RPC call to the server, we don't have a
new open stateid to present to update_open_stateid().
Instead rely on nfs4_try_open_cached(), just as if we were doing a
normal open.
Fixes: d2bfda2e7aa0 ("NFSv4: don't reprocess cached open CLAIM_PREVIOUS")
Cc: stable(a)vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker(a)Netapp.com>
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 22a93ae46cd7..5607b1e2b821 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1980,8 +1980,7 @@ _nfs4_opendata_reclaim_to_nfs4_state(struct nfs4_opendata *data)
if (!data->rpc_done) {
if (data->rpc_status)
return ERR_PTR(data->rpc_status);
- /* cached opens have already been processed */
- goto update;
+ return nfs4_try_open_cached(data);
}
ret = nfs_refresh_inode(inode, &data->f_attr);
@@ -1990,7 +1989,7 @@ _nfs4_opendata_reclaim_to_nfs4_state(struct nfs4_opendata *data)
if (data->o_res.delegation_type != 0)
nfs4_opendata_check_deleg(data, state);
-update:
+
if (!update_open_stateid(state, &data->o_res.stateid,
NULL, data->o_arg.fmode))
return ERR_PTR(-EAGAIN);
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x fd7276189450110ed835eb0a334e62d2f1c4e3be
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040306-emit-earthling-ac1b@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
fd7276189450 ("powerpc: Don't try to copy PPR for task with NULL pt_regs")
47e12855a91d ("powerpc: switch to ->regset_get()")
6e0b79750ce2 ("powerpc/ptrace: move register viewing functions out of ptrace.c")
7c1f8db019f8 ("powerpc/ptrace: split out TRANSACTIONAL_MEM related functions.")
60ef9dbd9d2a ("powerpc/ptrace: split out SPE related functions.")
1b20773b00b7 ("powerpc/ptrace: split out ALTIVEC related functions.")
7b99ed4e8e3a ("powerpc/ptrace: split out VSX related functions.")
963ae6b2ff1c ("powerpc/ptrace: drop PARAMETER_SAVE_AREA_OFFSET")
f1763e623c69 ("powerpc/ptrace: drop unnecessary #ifdefs CONFIG_PPC64")
b3138536c837 ("powerpc/ptrace: remove unused header includes")
da9a1c10e2c7 ("powerpc: Move ptrace into a subdirectory.")
68b34588e202 ("powerpc/64/sycall: Implement syscall entry/exit logic in C")
965dd3ad3076 ("powerpc/64/syscall: Remove non-volatile GPR save optimisation")
fddb5d430ad9 ("open: introduce openat2(2) syscall")
793b08e2efff ("powerpc/kexec: Move kexec files into a dedicated subdir.")
9f7bd9201521 ("powerpc/32: Split kexec low level code out of misc_32.S")
b020aa9d1e87 ("powerpc: cleanup hw_irq.h")
0671c5b84e9e ("MIPS: Wire up clone3 syscall")
45824fc0da6e ("Merge tag 'powerpc-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fd7276189450110ed835eb0a334e62d2f1c4e3be Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe(a)kernel.dk>
Date: Sun, 26 Mar 2023 16:15:57 -0600
Subject: [PATCH] powerpc: Don't try to copy PPR for task with NULL pt_regs
powerpc sets up PF_KTHREAD and PF_IO_WORKER with a NULL pt_regs, which
from my (arguably very short) checking is not commonly done for other
archs. This is fine, except when PF_IO_WORKER's have been created and
the task does something that causes a coredump to be generated. Then we
get this crash:
Kernel attempted to read user page (160) - exploit attempt? (uid: 1000)
BUG: Kernel NULL pointer dereference on read at 0x00000160
Faulting instruction address: 0xc0000000000c3a60
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=32 NUMA pSeries
Modules linked in: bochs drm_vram_helper drm_kms_helper xts binfmt_misc ecb ctr syscopyarea sysfillrect cbc sysimgblt drm_ttm_helper aes_generic ttm sg libaes evdev joydev virtio_balloon vmx_crypto gf128mul drm dm_mod fuse loop configfs drm_panel_orientation_quirks ip_tables x_tables autofs4 hid_generic usbhid hid xhci_pci xhci_hcd usbcore usb_common sd_mod
CPU: 1 PID: 1982 Comm: ppc-crash Not tainted 6.3.0-rc2+ #88
Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,HEAD hv:linux,kvm pSeries
NIP: c0000000000c3a60 LR: c000000000039944 CTR: c0000000000398e0
REGS: c0000000041833b0 TRAP: 0300 Not tainted (6.3.0-rc2+)
MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 88082828 XER: 200400f8
...
NIP memcpy_power7+0x200/0x7d0
LR ppr_get+0x64/0xb0
Call Trace:
ppr_get+0x40/0xb0 (unreliable)
__regset_get+0x180/0x1f0
regset_get_alloc+0x64/0x90
elf_core_dump+0xb98/0x1b60
do_coredump+0x1c34/0x24a0
get_signal+0x71c/0x1410
do_notify_resume+0x140/0x6f0
interrupt_exit_user_prepare_main+0x29c/0x320
interrupt_exit_user_prepare+0x6c/0xa0
interrupt_return_srr_user+0x8/0x138
Because ppr_get() is trying to copy from a PF_IO_WORKER with a NULL
pt_regs.
Check for a valid pt_regs in both ppc_get/ppr_set, and return an error
if not set. The actual error value doesn't seem to be important here, so
just pick -EINVAL.
Fixes: fa439810cc1b ("powerpc/ptrace: Enable support for NT_PPPC_TAR, NT_PPC_PPR, NT_PPC_DSCR")
Cc: stable(a)vger.kernel.org # v4.8+
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
[mpe: Trim oops in change log, add Fixes & Cc stable]
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://msgid.link/d9f63344-fe7c-56ae-b420-4a1a04a2ae4c@kernel.dk
diff --git a/arch/powerpc/kernel/ptrace/ptrace-view.c b/arch/powerpc/kernel/ptrace/ptrace-view.c
index 2087a785f05f..5fff0d04b23f 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-view.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-view.c
@@ -290,6 +290,9 @@ static int gpr_set(struct task_struct *target, const struct user_regset *regset,
static int ppr_get(struct task_struct *target, const struct user_regset *regset,
struct membuf to)
{
+ if (!target->thread.regs)
+ return -EINVAL;
+
return membuf_write(&to, &target->thread.regs->ppr, sizeof(u64));
}
@@ -297,6 +300,9 @@ static int ppr_set(struct task_struct *target, const struct user_regset *regset,
unsigned int pos, unsigned int count, const void *kbuf,
const void __user *ubuf)
{
+ if (!target->thread.regs)
+ return -EINVAL;
+
return user_regset_copyin(&pos, &count, &kbuf, &ubuf,
&target->thread.regs->ppr, 0, sizeof(u64));
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x fd7276189450110ed835eb0a334e62d2f1c4e3be
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040304-splashed-consult-3a39@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
fd7276189450 ("powerpc: Don't try to copy PPR for task with NULL pt_regs")
47e12855a91d ("powerpc: switch to ->regset_get()")
6e0b79750ce2 ("powerpc/ptrace: move register viewing functions out of ptrace.c")
7c1f8db019f8 ("powerpc/ptrace: split out TRANSACTIONAL_MEM related functions.")
60ef9dbd9d2a ("powerpc/ptrace: split out SPE related functions.")
1b20773b00b7 ("powerpc/ptrace: split out ALTIVEC related functions.")
7b99ed4e8e3a ("powerpc/ptrace: split out VSX related functions.")
963ae6b2ff1c ("powerpc/ptrace: drop PARAMETER_SAVE_AREA_OFFSET")
f1763e623c69 ("powerpc/ptrace: drop unnecessary #ifdefs CONFIG_PPC64")
b3138536c837 ("powerpc/ptrace: remove unused header includes")
da9a1c10e2c7 ("powerpc: Move ptrace into a subdirectory.")
68b34588e202 ("powerpc/64/sycall: Implement syscall entry/exit logic in C")
965dd3ad3076 ("powerpc/64/syscall: Remove non-volatile GPR save optimisation")
fddb5d430ad9 ("open: introduce openat2(2) syscall")
793b08e2efff ("powerpc/kexec: Move kexec files into a dedicated subdir.")
9f7bd9201521 ("powerpc/32: Split kexec low level code out of misc_32.S")
b020aa9d1e87 ("powerpc: cleanup hw_irq.h")
0671c5b84e9e ("MIPS: Wire up clone3 syscall")
45824fc0da6e ("Merge tag 'powerpc-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fd7276189450110ed835eb0a334e62d2f1c4e3be Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe(a)kernel.dk>
Date: Sun, 26 Mar 2023 16:15:57 -0600
Subject: [PATCH] powerpc: Don't try to copy PPR for task with NULL pt_regs
powerpc sets up PF_KTHREAD and PF_IO_WORKER with a NULL pt_regs, which
from my (arguably very short) checking is not commonly done for other
archs. This is fine, except when PF_IO_WORKER's have been created and
the task does something that causes a coredump to be generated. Then we
get this crash:
Kernel attempted to read user page (160) - exploit attempt? (uid: 1000)
BUG: Kernel NULL pointer dereference on read at 0x00000160
Faulting instruction address: 0xc0000000000c3a60
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=32 NUMA pSeries
Modules linked in: bochs drm_vram_helper drm_kms_helper xts binfmt_misc ecb ctr syscopyarea sysfillrect cbc sysimgblt drm_ttm_helper aes_generic ttm sg libaes evdev joydev virtio_balloon vmx_crypto gf128mul drm dm_mod fuse loop configfs drm_panel_orientation_quirks ip_tables x_tables autofs4 hid_generic usbhid hid xhci_pci xhci_hcd usbcore usb_common sd_mod
CPU: 1 PID: 1982 Comm: ppc-crash Not tainted 6.3.0-rc2+ #88
Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,HEAD hv:linux,kvm pSeries
NIP: c0000000000c3a60 LR: c000000000039944 CTR: c0000000000398e0
REGS: c0000000041833b0 TRAP: 0300 Not tainted (6.3.0-rc2+)
MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 88082828 XER: 200400f8
...
NIP memcpy_power7+0x200/0x7d0
LR ppr_get+0x64/0xb0
Call Trace:
ppr_get+0x40/0xb0 (unreliable)
__regset_get+0x180/0x1f0
regset_get_alloc+0x64/0x90
elf_core_dump+0xb98/0x1b60
do_coredump+0x1c34/0x24a0
get_signal+0x71c/0x1410
do_notify_resume+0x140/0x6f0
interrupt_exit_user_prepare_main+0x29c/0x320
interrupt_exit_user_prepare+0x6c/0xa0
interrupt_return_srr_user+0x8/0x138
Because ppr_get() is trying to copy from a PF_IO_WORKER with a NULL
pt_regs.
Check for a valid pt_regs in both ppc_get/ppr_set, and return an error
if not set. The actual error value doesn't seem to be important here, so
just pick -EINVAL.
Fixes: fa439810cc1b ("powerpc/ptrace: Enable support for NT_PPPC_TAR, NT_PPC_PPR, NT_PPC_DSCR")
Cc: stable(a)vger.kernel.org # v4.8+
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
[mpe: Trim oops in change log, add Fixes & Cc stable]
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://msgid.link/d9f63344-fe7c-56ae-b420-4a1a04a2ae4c@kernel.dk
diff --git a/arch/powerpc/kernel/ptrace/ptrace-view.c b/arch/powerpc/kernel/ptrace/ptrace-view.c
index 2087a785f05f..5fff0d04b23f 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-view.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-view.c
@@ -290,6 +290,9 @@ static int gpr_set(struct task_struct *target, const struct user_regset *regset,
static int ppr_get(struct task_struct *target, const struct user_regset *regset,
struct membuf to)
{
+ if (!target->thread.regs)
+ return -EINVAL;
+
return membuf_write(&to, &target->thread.regs->ppr, sizeof(u64));
}
@@ -297,6 +300,9 @@ static int ppr_set(struct task_struct *target, const struct user_regset *regset,
unsigned int pos, unsigned int count, const void *kbuf,
const void __user *ubuf)
{
+ if (!target->thread.regs)
+ return -EINVAL;
+
return user_regset_copyin(&pos, &count, &kbuf, &ubuf,
&target->thread.regs->ppr, 0, sizeof(u64));
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x b26cd9325be4c1fcd331b77f10acb627c560d4d7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040315-prodigal-chief-ace2@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
b26cd9325be4 ("pinctrl: amd: Disable and mask interrupts on resume")
4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe")
e81376ebbafc ("pinctrl: amd: Use irqchip template")
279ffafaf39d ("pinctrl: Added IRQF_SHARED flag for amd-pinctrl driver")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b26cd9325be4c1fcd331b77f10acb627c560d4d7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Kornel=20Dul=C4=99ba?= <korneld(a)chromium.org>
Date: Mon, 20 Mar 2023 09:32:59 +0000
Subject: [PATCH] pinctrl: amd: Disable and mask interrupts on resume
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This fixes a similar problem to the one observed in:
commit 4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe").
On some systems, during suspend/resume cycle firmware leaves
an interrupt enabled on a pin that is not used by the kernel.
This confuses the AMD pinctrl driver and causes spurious interrupts.
The driver already has logic to detect if a pin is used by the kernel.
Leverage it to re-initialize interrupt fields of a pin only if it's not
used by us.
Cc: stable(a)vger.kernel.org
Fixes: dbad75dd1f25 ("pinctrl: add AMD GPIO driver support.")
Signed-off-by: Kornel Dulęba <korneld(a)chromium.org>
Link: https://lore.kernel.org/r/20230320093259.845178-1-korneld@chromium.org
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
index 9236a132c7ba..609821b756c2 100644
--- a/drivers/pinctrl/pinctrl-amd.c
+++ b/drivers/pinctrl/pinctrl-amd.c
@@ -872,32 +872,34 @@ static const struct pinconf_ops amd_pinconf_ops = {
.pin_config_group_set = amd_pinconf_group_set,
};
-static void amd_gpio_irq_init(struct amd_gpio *gpio_dev)
+static void amd_gpio_irq_init_pin(struct amd_gpio *gpio_dev, int pin)
{
- struct pinctrl_desc *desc = gpio_dev->pctrl->desc;
+ const struct pin_desc *pd;
unsigned long flags;
u32 pin_reg, mask;
- int i;
mask = BIT(WAKE_CNTRL_OFF_S0I3) | BIT(WAKE_CNTRL_OFF_S3) |
BIT(INTERRUPT_MASK_OFF) | BIT(INTERRUPT_ENABLE_OFF) |
BIT(WAKE_CNTRL_OFF_S4);
- for (i = 0; i < desc->npins; i++) {
- int pin = desc->pins[i].number;
- const struct pin_desc *pd = pin_desc_get(gpio_dev->pctrl, pin);
-
- if (!pd)
- continue;
+ pd = pin_desc_get(gpio_dev->pctrl, pin);
+ if (!pd)
+ return;
- raw_spin_lock_irqsave(&gpio_dev->lock, flags);
+ raw_spin_lock_irqsave(&gpio_dev->lock, flags);
+ pin_reg = readl(gpio_dev->base + pin * 4);
+ pin_reg &= ~mask;
+ writel(pin_reg, gpio_dev->base + pin * 4);
+ raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
+}
- pin_reg = readl(gpio_dev->base + i * 4);
- pin_reg &= ~mask;
- writel(pin_reg, gpio_dev->base + i * 4);
+static void amd_gpio_irq_init(struct amd_gpio *gpio_dev)
+{
+ struct pinctrl_desc *desc = gpio_dev->pctrl->desc;
+ int i;
- raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
- }
+ for (i = 0; i < desc->npins; i++)
+ amd_gpio_irq_init_pin(gpio_dev, i);
}
#ifdef CONFIG_PM_SLEEP
@@ -950,8 +952,10 @@ static int amd_gpio_resume(struct device *dev)
for (i = 0; i < desc->npins; i++) {
int pin = desc->pins[i].number;
- if (!amd_gpio_should_save(gpio_dev, pin))
+ if (!amd_gpio_should_save(gpio_dev, pin)) {
+ amd_gpio_irq_init_pin(gpio_dev, pin);
continue;
+ }
raw_spin_lock_irqsave(&gpio_dev->lock, flags);
gpio_dev->saved_regs[i] |= readl(gpio_dev->base + pin * 4) & PIN_IRQ_PENDING;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x b26cd9325be4c1fcd331b77f10acb627c560d4d7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040314-starring-unwanted-c4c0@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
b26cd9325be4 ("pinctrl: amd: Disable and mask interrupts on resume")
4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe")
e81376ebbafc ("pinctrl: amd: Use irqchip template")
279ffafaf39d ("pinctrl: Added IRQF_SHARED flag for amd-pinctrl driver")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b26cd9325be4c1fcd331b77f10acb627c560d4d7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Kornel=20Dul=C4=99ba?= <korneld(a)chromium.org>
Date: Mon, 20 Mar 2023 09:32:59 +0000
Subject: [PATCH] pinctrl: amd: Disable and mask interrupts on resume
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This fixes a similar problem to the one observed in:
commit 4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe").
On some systems, during suspend/resume cycle firmware leaves
an interrupt enabled on a pin that is not used by the kernel.
This confuses the AMD pinctrl driver and causes spurious interrupts.
The driver already has logic to detect if a pin is used by the kernel.
Leverage it to re-initialize interrupt fields of a pin only if it's not
used by us.
Cc: stable(a)vger.kernel.org
Fixes: dbad75dd1f25 ("pinctrl: add AMD GPIO driver support.")
Signed-off-by: Kornel Dulęba <korneld(a)chromium.org>
Link: https://lore.kernel.org/r/20230320093259.845178-1-korneld@chromium.org
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
index 9236a132c7ba..609821b756c2 100644
--- a/drivers/pinctrl/pinctrl-amd.c
+++ b/drivers/pinctrl/pinctrl-amd.c
@@ -872,32 +872,34 @@ static const struct pinconf_ops amd_pinconf_ops = {
.pin_config_group_set = amd_pinconf_group_set,
};
-static void amd_gpio_irq_init(struct amd_gpio *gpio_dev)
+static void amd_gpio_irq_init_pin(struct amd_gpio *gpio_dev, int pin)
{
- struct pinctrl_desc *desc = gpio_dev->pctrl->desc;
+ const struct pin_desc *pd;
unsigned long flags;
u32 pin_reg, mask;
- int i;
mask = BIT(WAKE_CNTRL_OFF_S0I3) | BIT(WAKE_CNTRL_OFF_S3) |
BIT(INTERRUPT_MASK_OFF) | BIT(INTERRUPT_ENABLE_OFF) |
BIT(WAKE_CNTRL_OFF_S4);
- for (i = 0; i < desc->npins; i++) {
- int pin = desc->pins[i].number;
- const struct pin_desc *pd = pin_desc_get(gpio_dev->pctrl, pin);
-
- if (!pd)
- continue;
+ pd = pin_desc_get(gpio_dev->pctrl, pin);
+ if (!pd)
+ return;
- raw_spin_lock_irqsave(&gpio_dev->lock, flags);
+ raw_spin_lock_irqsave(&gpio_dev->lock, flags);
+ pin_reg = readl(gpio_dev->base + pin * 4);
+ pin_reg &= ~mask;
+ writel(pin_reg, gpio_dev->base + pin * 4);
+ raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
+}
- pin_reg = readl(gpio_dev->base + i * 4);
- pin_reg &= ~mask;
- writel(pin_reg, gpio_dev->base + i * 4);
+static void amd_gpio_irq_init(struct amd_gpio *gpio_dev)
+{
+ struct pinctrl_desc *desc = gpio_dev->pctrl->desc;
+ int i;
- raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
- }
+ for (i = 0; i < desc->npins; i++)
+ amd_gpio_irq_init_pin(gpio_dev, i);
}
#ifdef CONFIG_PM_SLEEP
@@ -950,8 +952,10 @@ static int amd_gpio_resume(struct device *dev)
for (i = 0; i < desc->npins; i++) {
int pin = desc->pins[i].number;
- if (!amd_gpio_should_save(gpio_dev, pin))
+ if (!amd_gpio_should_save(gpio_dev, pin)) {
+ amd_gpio_irq_init_pin(gpio_dev, pin);
continue;
+ }
raw_spin_lock_irqsave(&gpio_dev->lock, flags);
gpio_dev->saved_regs[i] |= readl(gpio_dev->base + pin * 4) & PIN_IRQ_PENDING;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x b26cd9325be4c1fcd331b77f10acb627c560d4d7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040312-paramedic-spectacle-0fcf@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
b26cd9325be4 ("pinctrl: amd: Disable and mask interrupts on resume")
4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe")
e81376ebbafc ("pinctrl: amd: Use irqchip template")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b26cd9325be4c1fcd331b77f10acb627c560d4d7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Kornel=20Dul=C4=99ba?= <korneld(a)chromium.org>
Date: Mon, 20 Mar 2023 09:32:59 +0000
Subject: [PATCH] pinctrl: amd: Disable and mask interrupts on resume
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This fixes a similar problem to the one observed in:
commit 4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe").
On some systems, during suspend/resume cycle firmware leaves
an interrupt enabled on a pin that is not used by the kernel.
This confuses the AMD pinctrl driver and causes spurious interrupts.
The driver already has logic to detect if a pin is used by the kernel.
Leverage it to re-initialize interrupt fields of a pin only if it's not
used by us.
Cc: stable(a)vger.kernel.org
Fixes: dbad75dd1f25 ("pinctrl: add AMD GPIO driver support.")
Signed-off-by: Kornel Dulęba <korneld(a)chromium.org>
Link: https://lore.kernel.org/r/20230320093259.845178-1-korneld@chromium.org
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
index 9236a132c7ba..609821b756c2 100644
--- a/drivers/pinctrl/pinctrl-amd.c
+++ b/drivers/pinctrl/pinctrl-amd.c
@@ -872,32 +872,34 @@ static const struct pinconf_ops amd_pinconf_ops = {
.pin_config_group_set = amd_pinconf_group_set,
};
-static void amd_gpio_irq_init(struct amd_gpio *gpio_dev)
+static void amd_gpio_irq_init_pin(struct amd_gpio *gpio_dev, int pin)
{
- struct pinctrl_desc *desc = gpio_dev->pctrl->desc;
+ const struct pin_desc *pd;
unsigned long flags;
u32 pin_reg, mask;
- int i;
mask = BIT(WAKE_CNTRL_OFF_S0I3) | BIT(WAKE_CNTRL_OFF_S3) |
BIT(INTERRUPT_MASK_OFF) | BIT(INTERRUPT_ENABLE_OFF) |
BIT(WAKE_CNTRL_OFF_S4);
- for (i = 0; i < desc->npins; i++) {
- int pin = desc->pins[i].number;
- const struct pin_desc *pd = pin_desc_get(gpio_dev->pctrl, pin);
-
- if (!pd)
- continue;
+ pd = pin_desc_get(gpio_dev->pctrl, pin);
+ if (!pd)
+ return;
- raw_spin_lock_irqsave(&gpio_dev->lock, flags);
+ raw_spin_lock_irqsave(&gpio_dev->lock, flags);
+ pin_reg = readl(gpio_dev->base + pin * 4);
+ pin_reg &= ~mask;
+ writel(pin_reg, gpio_dev->base + pin * 4);
+ raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
+}
- pin_reg = readl(gpio_dev->base + i * 4);
- pin_reg &= ~mask;
- writel(pin_reg, gpio_dev->base + i * 4);
+static void amd_gpio_irq_init(struct amd_gpio *gpio_dev)
+{
+ struct pinctrl_desc *desc = gpio_dev->pctrl->desc;
+ int i;
- raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
- }
+ for (i = 0; i < desc->npins; i++)
+ amd_gpio_irq_init_pin(gpio_dev, i);
}
#ifdef CONFIG_PM_SLEEP
@@ -950,8 +952,10 @@ static int amd_gpio_resume(struct device *dev)
for (i = 0; i < desc->npins; i++) {
int pin = desc->pins[i].number;
- if (!amd_gpio_should_save(gpio_dev, pin))
+ if (!amd_gpio_should_save(gpio_dev, pin)) {
+ amd_gpio_irq_init_pin(gpio_dev, pin);
continue;
+ }
raw_spin_lock_irqsave(&gpio_dev->lock, flags);
gpio_dev->saved_regs[i] |= readl(gpio_dev->base + pin * 4) & PIN_IRQ_PENDING;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x c1976bd8f23016d8706973908f2bb0ac0d852a8f
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040358-blade-preheated-5a2d@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
c1976bd8f230 ("zonefs: Always invalidate last cached page on append write")
4008e2a0b01a ("zonefs: Reorganize code")
a608da3bd730 ("zonefs: Detect append writes at invalid locations")
8745889a7fd0 ("Merge tag 'iomap-6.0-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c1976bd8f23016d8706973908f2bb0ac0d852a8f Mon Sep 17 00:00:00 2001
From: Damien Le Moal <damien.lemoal(a)opensource.wdc.com>
Date: Wed, 29 Mar 2023 13:16:01 +0900
Subject: [PATCH] zonefs: Always invalidate last cached page on append write
When a direct append write is executed, the append offset may correspond
to the last page of a sequential file inode which might have been cached
already by buffered reads, page faults with mmap-read or non-direct
readahead. To ensure that the on-disk and cached data is consistant for
such last cached page, make sure to always invalidate it in
zonefs_file_dio_append(). If the invalidation fails, return -EBUSY to
userspace to differentiate from IO errors.
This invalidation will always be a no-op when the FS block size (device
zone write granularity) is equal to the page size (e.g. 4K).
Reported-by: Hans Holmberg <Hans.Holmberg(a)wdc.com>
Fixes: 02ef12a663c7 ("zonefs: use REQ_OP_ZONE_APPEND for sync DIO")
Cc: stable(a)vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal(a)opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Tested-by: Hans Holmberg <hans.holmberg(a)wdc.com>
diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c
index 617e4f9db42e..c6ab2732955e 100644
--- a/fs/zonefs/file.c
+++ b/fs/zonefs/file.c
@@ -382,6 +382,7 @@ static ssize_t zonefs_file_dio_append(struct kiocb *iocb, struct iov_iter *from)
struct zonefs_zone *z = zonefs_inode_zone(inode);
struct block_device *bdev = inode->i_sb->s_bdev;
unsigned int max = bdev_max_zone_append_sectors(bdev);
+ pgoff_t start, end;
struct bio *bio;
ssize_t size = 0;
int nr_pages;
@@ -390,6 +391,19 @@ static ssize_t zonefs_file_dio_append(struct kiocb *iocb, struct iov_iter *from)
max = ALIGN_DOWN(max << SECTOR_SHIFT, inode->i_sb->s_blocksize);
iov_iter_truncate(from, max);
+ /*
+ * If the inode block size (zone write granularity) is smaller than the
+ * page size, we may be appending data belonging to the last page of the
+ * inode straddling inode->i_size, with that page already cached due to
+ * a buffered read or readahead. So make sure to invalidate that page.
+ * This will always be a no-op for the case where the block size is
+ * equal to the page size.
+ */
+ start = iocb->ki_pos >> PAGE_SHIFT;
+ end = (iocb->ki_pos + iov_iter_count(from) - 1) >> PAGE_SHIFT;
+ if (invalidate_inode_pages2_range(inode->i_mapping, start, end))
+ return -EBUSY;
+
nr_pages = iov_iter_npages(from, BIO_MAX_VECS);
if (!nr_pages)
return 0;
Hi, Thorsten here, the Linux kernel's regression tracker.
I noticed a regression report in bugzilla.kernel.org. As many (most?)
kernel developers don't keep an eye on it, I decided to forward it by mail.
Note, you have to use bugzilla to reach the reporter, as I sadly[1] can
not CCed them in mails like this.
Quoting from https://bugzilla.kernel.org/show_bug.cgi?id=217282 :
> Carsten Hatger 2023-03-31 14:49:06 UTC
>
> Created attachment 304068 [details]
> dmesg of 6.1.22 vanilla boot
>
> Dear all,
>
> ath11k hangs when booting v6.1.22 on a HP Pro x360 G9 w/ Qualcomm FastConnect 6900 using firmware WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.23.
>
> However, v.6.1.21 works fine.
>
> Please let me know if I can further assist in resolving this issue - testing would be fine.
>
> Yours,
> Carsten
See the ticket for more details.
[TLDR for the rest of this mail: I'm adding this report to the list of
tracked Linux kernel regressions; the text you find below is based on a
few templates paragraphs you might have encountered already in similar
form.]
BTW, let me use this mail to also add the report to the list of tracked
regressions to ensure it's doesn't fall through the cracks:
#regzbot introduced: v6.1.21..v6.1.22
https://bugzilla.kernel.org/show_bug.cgi?id=217282
#regzbot title: net: wireless: ath11k: hang on boot
#regzbot ignore-activity
This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.
Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (e.g. the buzgzilla ticket and maybe this mail as well, if
this thread sees some discussion). See page linked in footer for details.
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
[1] because bugzilla.kernel.org tells users upon registration their
"email address will never be displayed to logged out users"
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 005308f7bdacf5685ed1a431244a183dbbb9e0e8
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040353-preview-jackknife-52f7@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
005308f7bdac ("io_uring/poll: clear single/double poll flags on poll arming")
5204aa8c43bd ("io_uring: add a helper for apoll alloc")
329061d3e2f9 ("io_uring: move poll handling into its own file")
cfd22e6b3319 ("io_uring: add opcode name to io_op_defs")
c9f06aa7de15 ("io_uring: move io_uring_task (tctx) helpers into its own file")
a4ad4f748ea9 ("io_uring: move fdinfo helpers to its own file")
e5550a1447bf ("io_uring: use io_is_uring_fops() consistently")
17437f311490 ("io_uring: move SQPOLL related handling into its own file")
59915143e89f ("io_uring: move timeout opcodes and handling into its own file")
e418bbc97bff ("io_uring: move our reference counting into a header")
36404b09aa60 ("io_uring: move msg_ring into its own file")
f9ead18c1058 ("io_uring: split network related opcodes into its own file")
e0da14def1ee ("io_uring: move statx handling to its own file")
a9c210cebe13 ("io_uring: move epoll handler to its own file")
4cf90495281b ("io_uring: add a dummy -EOPNOTSUPP prep handler")
99f15d8d6136 ("io_uring: move uring_cmd handling to its own file")
cd40cae29ef8 ("io_uring: split out open/close operations")
453b329be5ea ("io_uring: separate out file table handling code")
f4c163dd7d4b ("io_uring: split out fadvise/madvise operations")
0d5847274037 ("io_uring: split out fs related sync/fallocate functions")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 005308f7bdacf5685ed1a431244a183dbbb9e0e8 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe(a)kernel.dk>
Date: Mon, 27 Mar 2023 19:56:18 -0600
Subject: [PATCH] io_uring/poll: clear single/double poll flags on poll arming
Unless we have at least one entry queued, then don't call into
io_poll_remove_entries(). Normally this isn't possible, but if we
retry poll then we can have ->nr_entries cleared again as we're
setting it up. If this happens for a poll retry, then we'll still have
at least REQ_F_SINGLE_POLL set. io_poll_remove_entries() then thinks
it has entries to remove.
Clear REQ_F_SINGLE_POLL and REQ_F_DOUBLE_POLL unconditionally when
arming a poll request.
Fixes: c16bda37594f ("io_uring/poll: allow some retries for poll triggering spuriously")
Cc: stable(a)vger.kernel.org
Reported-by: Pengfei Xu <pengfei.xu(a)intel.com>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/io_uring/poll.c b/io_uring/poll.c
index 795facbd0e9f..55306e801081 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -726,6 +726,7 @@ int io_arm_poll_handler(struct io_kiocb *req, unsigned issue_flags)
apoll = io_req_alloc_apoll(req, issue_flags);
if (!apoll)
return IO_APOLL_ABORTED;
+ req->flags &= ~(REQ_F_SINGLE_POLL | REQ_F_DOUBLE_POLL);
req->flags |= REQ_F_POLLED;
ipt.pt._qproc = io_async_queue_proc;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 77af13ba3c7f91d91c377c7e2d122849bbc17128
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040327-freehand-water-9a7b@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
77af13ba3c7f ("zonefs: Do not propagate iomap_dio_rw() ENOTBLK error to user space")
aa7f243f32e1 ("zonefs: Separate zone information from inode information")
34422914dc00 ("zonefs: Reduce struct zonefs_inode_info size")
46a9c526eef7 ("zonefs: Simplify IO error handling")
4008e2a0b01a ("zonefs: Reorganize code")
a608da3bd730 ("zonefs: Detect append writes at invalid locations")
db58653ce0c7 ("zonefs: Fix active zone accounting")
7dd12d65ac64 ("zonefs: fix zone report size in __zonefs_io_error()")
8745889a7fd0 ("Merge tag 'iomap-6.0-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 77af13ba3c7f91d91c377c7e2d122849bbc17128 Mon Sep 17 00:00:00 2001
From: Damien Le Moal <damien.lemoal(a)opensource.wdc.com>
Date: Thu, 30 Mar 2023 09:47:58 +0900
Subject: [PATCH] zonefs: Do not propagate iomap_dio_rw() ENOTBLK error to user
space
The call to invalidate_inode_pages2_range() in __iomap_dio_rw() may
fail, in which case -ENOTBLK is returned and this error code is
propagated back to user space trhough iomap_dio_rw() ->
zonefs_file_dio_write() return chain. This error code is fairly obscure
and may confuse the user. Avoid this and be consistent with the behavior
of zonefs_file_dio_append() for similar invalidate_inode_pages2_range()
errors by returning -EBUSY to user space when iomap_dio_rw() returns
-ENOTBLK.
Suggested-by: Christoph Hellwig <hch(a)infradead.org>
Fixes: 8dcc1a9d90c1 ("fs: New zonefs file system")
Cc: stable(a)vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal(a)opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Tested-by: Hans Holmberg <hans.holmberg(a)wdc.com>
diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c
index c6ab2732955e..132f01d3461f 100644
--- a/fs/zonefs/file.c
+++ b/fs/zonefs/file.c
@@ -581,11 +581,21 @@ static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from)
append = sync;
}
- if (append)
+ if (append) {
ret = zonefs_file_dio_append(iocb, from);
- else
+ } else {
+ /*
+ * iomap_dio_rw() may return ENOTBLK if there was an issue with
+ * page invalidation. Overwrite that error code with EBUSY to
+ * be consistent with zonefs_file_dio_append() return value for
+ * similar issues.
+ */
ret = iomap_dio_rw(iocb, from, &zonefs_write_iomap_ops,
&zonefs_write_dio_ops, 0, NULL, 0);
+ if (ret == -ENOTBLK)
+ ret = -EBUSY;
+ }
+
if (zonefs_zone_is_seq(z) &&
(ret > 0 || ret == -EIOCBQUEUED)) {
if (ret > 0)
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 77af13ba3c7f91d91c377c7e2d122849bbc17128
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040326-rind-claw-0bcb@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
77af13ba3c7f ("zonefs: Do not propagate iomap_dio_rw() ENOTBLK error to user space")
aa7f243f32e1 ("zonefs: Separate zone information from inode information")
34422914dc00 ("zonefs: Reduce struct zonefs_inode_info size")
46a9c526eef7 ("zonefs: Simplify IO error handling")
4008e2a0b01a ("zonefs: Reorganize code")
a608da3bd730 ("zonefs: Detect append writes at invalid locations")
db58653ce0c7 ("zonefs: Fix active zone accounting")
7dd12d65ac64 ("zonefs: fix zone report size in __zonefs_io_error()")
8745889a7fd0 ("Merge tag 'iomap-6.0-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 77af13ba3c7f91d91c377c7e2d122849bbc17128 Mon Sep 17 00:00:00 2001
From: Damien Le Moal <damien.lemoal(a)opensource.wdc.com>
Date: Thu, 30 Mar 2023 09:47:58 +0900
Subject: [PATCH] zonefs: Do not propagate iomap_dio_rw() ENOTBLK error to user
space
The call to invalidate_inode_pages2_range() in __iomap_dio_rw() may
fail, in which case -ENOTBLK is returned and this error code is
propagated back to user space trhough iomap_dio_rw() ->
zonefs_file_dio_write() return chain. This error code is fairly obscure
and may confuse the user. Avoid this and be consistent with the behavior
of zonefs_file_dio_append() for similar invalidate_inode_pages2_range()
errors by returning -EBUSY to user space when iomap_dio_rw() returns
-ENOTBLK.
Suggested-by: Christoph Hellwig <hch(a)infradead.org>
Fixes: 8dcc1a9d90c1 ("fs: New zonefs file system")
Cc: stable(a)vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal(a)opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Tested-by: Hans Holmberg <hans.holmberg(a)wdc.com>
diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c
index c6ab2732955e..132f01d3461f 100644
--- a/fs/zonefs/file.c
+++ b/fs/zonefs/file.c
@@ -581,11 +581,21 @@ static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from)
append = sync;
}
- if (append)
+ if (append) {
ret = zonefs_file_dio_append(iocb, from);
- else
+ } else {
+ /*
+ * iomap_dio_rw() may return ENOTBLK if there was an issue with
+ * page invalidation. Overwrite that error code with EBUSY to
+ * be consistent with zonefs_file_dio_append() return value for
+ * similar issues.
+ */
ret = iomap_dio_rw(iocb, from, &zonefs_write_iomap_ops,
&zonefs_write_dio_ops, 0, NULL, 0);
+ if (ret == -ENOTBLK)
+ ret = -EBUSY;
+ }
+
if (zonefs_zone_is_seq(z) &&
(ret > 0 || ret == -EIOCBQUEUED)) {
if (ret > 0)
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 2280d425ba3599bdd85c41bd0ec8ba568f00c032
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040354-ramrod-papyrus-415b@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
2280d425ba35 ("btrfs: ignore fiemap path cache when there are multiple paths for a node")
73e339e6ab74 ("btrfs: cache sharedness of the last few data extents during fiemap")
b629685803bc ("btrfs: remove roots ulist when checking data extent sharedness")
84a7949d4097 ("btrfs: move ulists to data extent sharedness check context")
61dbb952f0a5 ("btrfs: turn the backref sharedness check cache into a context object")
ceb707da9ad9 ("btrfs: directly pass the inode to btrfs_is_data_extent_shared()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2280d425ba3599bdd85c41bd0ec8ba568f00c032 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Tue, 28 Mar 2023 10:45:20 +0100
Subject: [PATCH] btrfs: ignore fiemap path cache when there are multiple paths
for a node
During fiemap, when walking backreferences to determine if a b+tree
node/leaf is shared, we may find a tree block (leaf or node) for which
two parents were added to the references ulist. This happens if we get
for example one direct ref (shared tree block ref) and one indirect ref
(non-shared tree block ref) for the tree block at the current level,
which can happen during relocation.
In that case the fiemap path cache can not be used since it's meant for
a single path, with one tree block at each possible level, so having
multiple references for a tree block at any level may result in getting
the level counter exceed BTRFS_MAX_LEVEL and eventually trigger the
warning:
WARN_ON_ONCE(level >= BTRFS_MAX_LEVEL)
at lookup_backref_shared_cache() and at store_backref_shared_cache().
This is harmless since the code ignores any level >= BTRFS_MAX_LEVEL, the
warning is there just to catch any unexpected case like the one described
above. However if a user finds this it may be scary and get reported.
So just ignore the path cache once we find a tree block for which there
are more than one reference, which is the less common case, and update
the cache with the sharedness check result for all levels below the level
for which we found multiple references.
Reported-by: Jarno Pelkonen <jarno.pelkonen(a)gmail.com>
Link: https://lore.kernel.org/linux-btrfs/CAKv8qLmDNAGJGCtsevxx_VZ_YOvvs1L83iEJkT…
Fixes: 12a824dc67a6 ("btrfs: speedup checking for extent sharedness during fiemap")
CC: stable(a)vger.kernel.org # 6.1+
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 90e40d5ceccd..e54f0884802a 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -1921,8 +1921,7 @@ int btrfs_is_data_extent_shared(struct btrfs_inode *inode, u64 bytenr,
level = -1;
ULIST_ITER_INIT(&uiter);
while (1) {
- bool is_shared;
- bool cached;
+ const unsigned long prev_ref_count = ctx->refs.nnodes;
walk_ctx.bytenr = bytenr;
ret = find_parent_nodes(&walk_ctx, &shared);
@@ -1940,21 +1939,36 @@ int btrfs_is_data_extent_shared(struct btrfs_inode *inode, u64 bytenr,
ret = 0;
/*
- * If our data extent was not directly shared (without multiple
- * reference items), than it might have a single reference item
- * with a count > 1 for the same offset, which means there are 2
- * (or more) file extent items that point to the data extent -
- * this happens when a file extent item needs to be split and
- * then one item gets moved to another leaf due to a b+tree leaf
- * split when inserting some item. In this case the file extent
- * items may be located in different leaves and therefore some
- * of the leaves may be referenced through shared subtrees while
- * others are not. Since our extent buffer cache only works for
- * a single path (by far the most common case and simpler to
- * deal with), we can not use it if we have multiple leaves
- * (which implies multiple paths).
+ * More than one extent buffer (bytenr) may have been added to
+ * the ctx->refs ulist, in which case we have to check multiple
+ * tree paths in case the first one is not shared, so we can not
+ * use the path cache which is made for a single path. Multiple
+ * extent buffers at the current level happen when:
+ *
+ * 1) level -1, the data extent: If our data extent was not
+ * directly shared (without multiple reference items), then
+ * it might have a single reference item with a count > 1 for
+ * the same offset, which means there are 2 (or more) file
+ * extent items that point to the data extent - this happens
+ * when a file extent item needs to be split and then one
+ * item gets moved to another leaf due to a b+tree leaf split
+ * when inserting some item. In this case the file extent
+ * items may be located in different leaves and therefore
+ * some of the leaves may be referenced through shared
+ * subtrees while others are not. Since our extent buffer
+ * cache only works for a single path (by far the most common
+ * case and simpler to deal with), we can not use it if we
+ * have multiple leaves (which implies multiple paths).
+ *
+ * 2) level >= 0, a tree node/leaf: We can have a mix of direct
+ * and indirect references on a b+tree node/leaf, so we have
+ * to check multiple paths, and the extent buffer (the
+ * current bytenr) may be shared or not. One example is
+ * during relocation as we may get a shared tree block ref
+ * (direct ref) and a non-shared tree block ref (indirect
+ * ref) for the same node/leaf.
*/
- if (level == -1 && ctx->refs.nnodes > 1)
+ if ((ctx->refs.nnodes - prev_ref_count) > 1)
ctx->use_path_cache = false;
if (level >= 0)
@@ -1964,18 +1978,45 @@ int btrfs_is_data_extent_shared(struct btrfs_inode *inode, u64 bytenr,
if (!node)
break;
bytenr = node->val;
- level++;
- cached = lookup_backref_shared_cache(ctx, root, bytenr, level,
- &is_shared);
- if (cached) {
- ret = (is_shared ? 1 : 0);
- break;
+ if (ctx->use_path_cache) {
+ bool is_shared;
+ bool cached;
+
+ level++;
+ cached = lookup_backref_shared_cache(ctx, root, bytenr,
+ level, &is_shared);
+ if (cached) {
+ ret = (is_shared ? 1 : 0);
+ break;
+ }
}
shared.share_count = 0;
shared.have_delayed_delete_refs = false;
cond_resched();
}
+ /*
+ * If the path cache is disabled, then it means at some tree level we
+ * got multiple parents due to a mix of direct and indirect backrefs or
+ * multiple leaves with file extent items pointing to the same data
+ * extent. We have to invalidate the cache and cache only the sharedness
+ * result for the levels where we got only one node/reference.
+ */
+ if (!ctx->use_path_cache) {
+ int i = 0;
+
+ level--;
+ if (ret >= 0 && level >= 0) {
+ bytenr = ctx->path_cache_entries[level].bytenr;
+ ctx->use_path_cache = true;
+ store_backref_shared_cache(ctx, root, bytenr, level, ret);
+ i = level + 1;
+ }
+
+ for ( ; i < BTRFS_MAX_LEVEL; i++)
+ ctx->path_cache_entries[i].bytenr = 0;
+ }
+
/*
* Cache the sharedness result for the data extent if we know our inode
* has more than 1 file extent item that refers to the data extent.
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 50d281fc434cb8e2497f5e70a309ccca6b1a09f0
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040340-happiest-next-a09c@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
50d281fc434c ("btrfs: scan device in non-exclusive mode")
12659251ca5d ("btrfs: implement log-structured superblock for ZONED mode")
5d1ab66c56fe ("btrfs: disallow space_cache in ZONED mode")
b70f509774ad ("btrfs: check and enable ZONED mode")
5b316468983d ("btrfs: get zone information of zoned block devices")
bacce86ae8a7 ("btrfs: drop unused argument step from btrfs_free_extra_devids")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 50d281fc434cb8e2497f5e70a309ccca6b1a09f0 Mon Sep 17 00:00:00 2001
From: Anand Jain <anand.jain(a)oracle.com>
Date: Thu, 23 Mar 2023 15:56:48 +0800
Subject: [PATCH] btrfs: scan device in non-exclusive mode
This fixes mkfs/mount/check failures due to race with systemd-udevd
scan.
During the device scan initiated by systemd-udevd, other user space
EXCL operations such as mkfs, mount, or check may get blocked and result
in a "Device or resource busy" error. This is because the device
scan process opens the device with the EXCL flag in the kernel.
Two reports were received:
- btrfs/179 test case, where the fsck command failed with the -EBUSY
error
- LTP pwritev03 test case, where mkfs.vfs failed with
the -EBUSY error, when mkfs.vfs tried to overwrite old btrfs filesystem
on the device.
In both cases, fsck and mkfs (respectively) were racing with a
systemd-udevd device scan, and systemd-udevd won, resulting in the
-EBUSY error for fsck and mkfs.
Reproducing the problem has been difficult because there is a very
small window during which these userspace threads can race to
acquire the exclusive device open. Even on the system where the problem
was observed, the problem occurrences were anywhere between 10 to 400
iterations and chances of reproducing decreases with debug printk()s.
However, an exclusive device open is unnecessary for the scan process,
as there are no write operations on the device during scan. Furthermore,
during the mount process, the superblock is re-read in the below
function call chain:
btrfs_mount_root
btrfs_open_devices
open_fs_devices
btrfs_open_one_device
btrfs_get_bdev_and_sb
So, to fix this issue, removes the FMODE_EXCL flag from the scan
operation, and add a comment.
The case where mkfs may still write to the device and a scan is running,
the btrfs signature is not written at that time so scan will not
recognize such device.
Reported-by: Sherry Yang <sherry.yang(a)oracle.com>
Reported-by: kernel test robot <oliver.sang(a)intel.com>
Link: https://lore.kernel.org/oe-lkp/202303170839.fdf23068-oliver.sang@intel.com
CC: stable(a)vger.kernel.org # 5.4+
Signed-off-by: Anand Jain <anand.jain(a)oracle.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 6d0124b6e79e..ac0e8fb92fc8 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1366,8 +1366,17 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, fmode_t flags,
* So, we need to add a special mount option to scan for
* later supers, using BTRFS_SUPER_MIRROR_MAX instead
*/
- flags |= FMODE_EXCL;
+ /*
+ * Avoid using flag |= FMODE_EXCL here, as the systemd-udev may
+ * initiate the device scan which may race with the user's mount
+ * or mkfs command, resulting in failure.
+ * Since the device scan is solely for reading purposes, there is
+ * no need for FMODE_EXCL. Additionally, the devices are read again
+ * during the mount process. It is ok to get some inconsistent
+ * values temporarily, as the device paths of the fsid are the only
+ * required information for assembling the volume.
+ */
bdev = blkdev_get_by_path(path, flags, holder);
if (IS_ERR(bdev))
return ERR_CAST(bdev);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 50d281fc434cb8e2497f5e70a309ccca6b1a09f0
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040338-carving-ripping-8786@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
50d281fc434c ("btrfs: scan device in non-exclusive mode")
12659251ca5d ("btrfs: implement log-structured superblock for ZONED mode")
5d1ab66c56fe ("btrfs: disallow space_cache in ZONED mode")
b70f509774ad ("btrfs: check and enable ZONED mode")
5b316468983d ("btrfs: get zone information of zoned block devices")
bacce86ae8a7 ("btrfs: drop unused argument step from btrfs_free_extra_devids")
96c2e067ed3e ("btrfs: skip devices without magic signature when mounting")
c3e1f96c37d0 ("btrfs: enumerate the type of exclusive operation in progress")
944d3f9fac61 ("btrfs: switch seed device to list api")
c4989c2fd0eb ("btrfs: simplify setting/clearing fs_info to btrfs_fs_devices")
54eed6ae8d8e ("btrfs: make close_fs_devices return void")
3712ccb7f1cc ("btrfs: factor out loop logic from btrfs_free_extra_devids")
dc0ab488d2cb ("btrfs: factor out reada loop in __reada_start_machine")
adca4d945c8d ("btrfs: qgroup: remove ASYNC_COMMIT mechanism in favor of reserve retry-after-EDQUOT")
3092c68fc58c ("btrfs: sysfs: add bdi link to the fsid directory")
998a0671961f ("btrfs: include non-missing as a qualifier for the latest_bdev")
1ed802c972c6 ("btrfs: drop useless goto in open_fs_devices")
b335eab890ed ("btrfs: make btrfs_read_disk_super return struct btrfs_disk_super")
c4a816c67c39 ("btrfs: introduce chunk allocation policy")
9a8658e33d8f ("btrfs: open code trivial helper btrfs_header_fsid")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 50d281fc434cb8e2497f5e70a309ccca6b1a09f0 Mon Sep 17 00:00:00 2001
From: Anand Jain <anand.jain(a)oracle.com>
Date: Thu, 23 Mar 2023 15:56:48 +0800
Subject: [PATCH] btrfs: scan device in non-exclusive mode
This fixes mkfs/mount/check failures due to race with systemd-udevd
scan.
During the device scan initiated by systemd-udevd, other user space
EXCL operations such as mkfs, mount, or check may get blocked and result
in a "Device or resource busy" error. This is because the device
scan process opens the device with the EXCL flag in the kernel.
Two reports were received:
- btrfs/179 test case, where the fsck command failed with the -EBUSY
error
- LTP pwritev03 test case, where mkfs.vfs failed with
the -EBUSY error, when mkfs.vfs tried to overwrite old btrfs filesystem
on the device.
In both cases, fsck and mkfs (respectively) were racing with a
systemd-udevd device scan, and systemd-udevd won, resulting in the
-EBUSY error for fsck and mkfs.
Reproducing the problem has been difficult because there is a very
small window during which these userspace threads can race to
acquire the exclusive device open. Even on the system where the problem
was observed, the problem occurrences were anywhere between 10 to 400
iterations and chances of reproducing decreases with debug printk()s.
However, an exclusive device open is unnecessary for the scan process,
as there are no write operations on the device during scan. Furthermore,
during the mount process, the superblock is re-read in the below
function call chain:
btrfs_mount_root
btrfs_open_devices
open_fs_devices
btrfs_open_one_device
btrfs_get_bdev_and_sb
So, to fix this issue, removes the FMODE_EXCL flag from the scan
operation, and add a comment.
The case where mkfs may still write to the device and a scan is running,
the btrfs signature is not written at that time so scan will not
recognize such device.
Reported-by: Sherry Yang <sherry.yang(a)oracle.com>
Reported-by: kernel test robot <oliver.sang(a)intel.com>
Link: https://lore.kernel.org/oe-lkp/202303170839.fdf23068-oliver.sang@intel.com
CC: stable(a)vger.kernel.org # 5.4+
Signed-off-by: Anand Jain <anand.jain(a)oracle.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 6d0124b6e79e..ac0e8fb92fc8 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1366,8 +1366,17 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, fmode_t flags,
* So, we need to add a special mount option to scan for
* later supers, using BTRFS_SUPER_MIRROR_MAX instead
*/
- flags |= FMODE_EXCL;
+ /*
+ * Avoid using flag |= FMODE_EXCL here, as the systemd-udev may
+ * initiate the device scan which may race with the user's mount
+ * or mkfs command, resulting in failure.
+ * Since the device scan is solely for reading purposes, there is
+ * no need for FMODE_EXCL. Additionally, the devices are read again
+ * during the mount process. It is ok to get some inconsistent
+ * values temporarily, as the device paths of the fsid are the only
+ * required information for assembling the volume.
+ */
bdev = blkdev_get_by_path(path, flags, holder);
if (IS_ERR(bdev))
return ERR_CAST(bdev);
The patch below does not apply to the 5.20-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.20.y
git checkout FETCH_HEAD
git cherry-pick -x 50d281fc434cb8e2497f5e70a309ccca6b1a09f0
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040334-reproduce-granola-78a1@gregkh' --subject-prefix 'PATCH 5.20.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 50d281fc434cb8e2497f5e70a309ccca6b1a09f0 Mon Sep 17 00:00:00 2001
From: Anand Jain <anand.jain(a)oracle.com>
Date: Thu, 23 Mar 2023 15:56:48 +0800
Subject: [PATCH] btrfs: scan device in non-exclusive mode
This fixes mkfs/mount/check failures due to race with systemd-udevd
scan.
During the device scan initiated by systemd-udevd, other user space
EXCL operations such as mkfs, mount, or check may get blocked and result
in a "Device or resource busy" error. This is because the device
scan process opens the device with the EXCL flag in the kernel.
Two reports were received:
- btrfs/179 test case, where the fsck command failed with the -EBUSY
error
- LTP pwritev03 test case, where mkfs.vfs failed with
the -EBUSY error, when mkfs.vfs tried to overwrite old btrfs filesystem
on the device.
In both cases, fsck and mkfs (respectively) were racing with a
systemd-udevd device scan, and systemd-udevd won, resulting in the
-EBUSY error for fsck and mkfs.
Reproducing the problem has been difficult because there is a very
small window during which these userspace threads can race to
acquire the exclusive device open. Even on the system where the problem
was observed, the problem occurrences were anywhere between 10 to 400
iterations and chances of reproducing decreases with debug printk()s.
However, an exclusive device open is unnecessary for the scan process,
as there are no write operations on the device during scan. Furthermore,
during the mount process, the superblock is re-read in the below
function call chain:
btrfs_mount_root
btrfs_open_devices
open_fs_devices
btrfs_open_one_device
btrfs_get_bdev_and_sb
So, to fix this issue, removes the FMODE_EXCL flag from the scan
operation, and add a comment.
The case where mkfs may still write to the device and a scan is running,
the btrfs signature is not written at that time so scan will not
recognize such device.
Reported-by: Sherry Yang <sherry.yang(a)oracle.com>
Reported-by: kernel test robot <oliver.sang(a)intel.com>
Link: https://lore.kernel.org/oe-lkp/202303170839.fdf23068-oliver.sang@intel.com
CC: stable(a)vger.kernel.org # 5.4+
Signed-off-by: Anand Jain <anand.jain(a)oracle.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 6d0124b6e79e..ac0e8fb92fc8 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1366,8 +1366,17 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, fmode_t flags,
* So, we need to add a special mount option to scan for
* later supers, using BTRFS_SUPER_MIRROR_MAX instead
*/
- flags |= FMODE_EXCL;
+ /*
+ * Avoid using flag |= FMODE_EXCL here, as the systemd-udev may
+ * initiate the device scan which may race with the user's mount
+ * or mkfs command, resulting in failure.
+ * Since the device scan is solely for reading purposes, there is
+ * no need for FMODE_EXCL. Additionally, the devices are read again
+ * during the mount process. It is ok to get some inconsistent
+ * values temporarily, as the device paths of the fsid are the only
+ * required information for assembling the volume.
+ */
bdev = blkdev_get_by_path(path, flags, holder);
if (IS_ERR(bdev))
return ERR_CAST(bdev);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 50d281fc434cb8e2497f5e70a309ccca6b1a09f0
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023040333-vanquish-wriggle-e007@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
50d281fc434c ("btrfs: scan device in non-exclusive mode")
12659251ca5d ("btrfs: implement log-structured superblock for ZONED mode")
5d1ab66c56fe ("btrfs: disallow space_cache in ZONED mode")
b70f509774ad ("btrfs: check and enable ZONED mode")
5b316468983d ("btrfs: get zone information of zoned block devices")
bacce86ae8a7 ("btrfs: drop unused argument step from btrfs_free_extra_devids")
96c2e067ed3e ("btrfs: skip devices without magic signature when mounting")
c3e1f96c37d0 ("btrfs: enumerate the type of exclusive operation in progress")
944d3f9fac61 ("btrfs: switch seed device to list api")
c4989c2fd0eb ("btrfs: simplify setting/clearing fs_info to btrfs_fs_devices")
54eed6ae8d8e ("btrfs: make close_fs_devices return void")
3712ccb7f1cc ("btrfs: factor out loop logic from btrfs_free_extra_devids")
dc0ab488d2cb ("btrfs: factor out reada loop in __reada_start_machine")
adca4d945c8d ("btrfs: qgroup: remove ASYNC_COMMIT mechanism in favor of reserve retry-after-EDQUOT")
3092c68fc58c ("btrfs: sysfs: add bdi link to the fsid directory")
998a0671961f ("btrfs: include non-missing as a qualifier for the latest_bdev")
1ed802c972c6 ("btrfs: drop useless goto in open_fs_devices")
b335eab890ed ("btrfs: make btrfs_read_disk_super return struct btrfs_disk_super")
c4a816c67c39 ("btrfs: introduce chunk allocation policy")
9a8658e33d8f ("btrfs: open code trivial helper btrfs_header_fsid")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 50d281fc434cb8e2497f5e70a309ccca6b1a09f0 Mon Sep 17 00:00:00 2001
From: Anand Jain <anand.jain(a)oracle.com>
Date: Thu, 23 Mar 2023 15:56:48 +0800
Subject: [PATCH] btrfs: scan device in non-exclusive mode
This fixes mkfs/mount/check failures due to race with systemd-udevd
scan.
During the device scan initiated by systemd-udevd, other user space
EXCL operations such as mkfs, mount, or check may get blocked and result
in a "Device or resource busy" error. This is because the device
scan process opens the device with the EXCL flag in the kernel.
Two reports were received:
- btrfs/179 test case, where the fsck command failed with the -EBUSY
error
- LTP pwritev03 test case, where mkfs.vfs failed with
the -EBUSY error, when mkfs.vfs tried to overwrite old btrfs filesystem
on the device.
In both cases, fsck and mkfs (respectively) were racing with a
systemd-udevd device scan, and systemd-udevd won, resulting in the
-EBUSY error for fsck and mkfs.
Reproducing the problem has been difficult because there is a very
small window during which these userspace threads can race to
acquire the exclusive device open. Even on the system where the problem
was observed, the problem occurrences were anywhere between 10 to 400
iterations and chances of reproducing decreases with debug printk()s.
However, an exclusive device open is unnecessary for the scan process,
as there are no write operations on the device during scan. Furthermore,
during the mount process, the superblock is re-read in the below
function call chain:
btrfs_mount_root
btrfs_open_devices
open_fs_devices
btrfs_open_one_device
btrfs_get_bdev_and_sb
So, to fix this issue, removes the FMODE_EXCL flag from the scan
operation, and add a comment.
The case where mkfs may still write to the device and a scan is running,
the btrfs signature is not written at that time so scan will not
recognize such device.
Reported-by: Sherry Yang <sherry.yang(a)oracle.com>
Reported-by: kernel test robot <oliver.sang(a)intel.com>
Link: https://lore.kernel.org/oe-lkp/202303170839.fdf23068-oliver.sang@intel.com
CC: stable(a)vger.kernel.org # 5.4+
Signed-off-by: Anand Jain <anand.jain(a)oracle.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 6d0124b6e79e..ac0e8fb92fc8 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1366,8 +1366,17 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, fmode_t flags,
* So, we need to add a special mount option to scan for
* later supers, using BTRFS_SUPER_MIRROR_MAX instead
*/
- flags |= FMODE_EXCL;
+ /*
+ * Avoid using flag |= FMODE_EXCL here, as the systemd-udev may
+ * initiate the device scan which may race with the user's mount
+ * or mkfs command, resulting in failure.
+ * Since the device scan is solely for reading purposes, there is
+ * no need for FMODE_EXCL. Additionally, the devices are read again
+ * during the mount process. It is ok to get some inconsistent
+ * values temporarily, as the device paths of the fsid are the only
+ * required information for assembling the volume.
+ */
bdev = blkdev_get_by_path(path, flags, holder);
if (IS_ERR(bdev))
return ERR_CAST(bdev);
On 4/3/23 12:10 PM, Matthew Wilcox wrote:
> On Sun, Apr 02, 2023 at 06:19:20AM +0800, Rongwei Wang wrote:
>> Without this modification, a core will wait (mostly)
>> 'swap_info_struct->lock' when completing
>> 'del_from_avail_list(p)'. Immediately, other cores
>> soon calling 'add_to_avail_list()' to add the same
>> object again when acquiring the lock that released
>> by former. It's not the desired result but exists
>> indeed. This case can be described as below:
> This feels like a very verbose way of saying
>
> "The si->lock must be held when deleting the si from the
> available list. Otherwise, another thread can re-add the
> si to the available list, which can lead to memory corruption.
> The only place we have found where this happens is in the
> swapoff path."
It looks better than mine. Sorry for my confusing description, it will
be fixed in the next version.
>
>> +++ b/mm/swapfile.c
>> @@ -2610,8 +2610,12 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
>> spin_unlock(&swap_lock);
>> goto out_dput;
>> }
>> - del_from_avail_list(p);
>> + /*
>> + * Here lock is used to protect deleting and SWP_WRITEOK clearing
>> + * can be seen concurrently.
>> + */
> This comment isn't necessary. But I would add a lockdep assert inside
> __del_from_avail_list() that p->lock is held.
Thanks. Actually, I have this line in previous test version, but delete
for saving one line of code.
I will update here as you said.
Thanks for your time.
>
>> spin_lock(&p->lock);
>> + del_from_avail_list(p);
>> if (p->prio < 0) {
>> struct swap_info_struct *si = p;
>> int nid;
>> --
>> 2.27.0
>>
>>
This is a proposal to revert commit 914eedcb9ba0ff53c33808.
I found this when writting a simple UFFDIO_API test to be the first unit
test in this set. Two things breaks with the commit:
- UFFDIO_API check was lost and missing. According to man page, the
kernel should reject ioctl(UFFDIO_API) if uffdio_api.api != 0xaa. This
check is needed if the api version will be extended in the future, or
user app won't be able to identify which is a new kernel.
- Feature flags checks were removed, which means UFFDIO_API with a
feature that does not exist will also succeed. According to the man
page, we should (and it makes sense) to reject ioctl(UFFDIO_API) if
unknown features passed in.
Link: https://lore.kernel.org/r/20220722201513.1624158-1-axelrasmussen@google.com
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: linux-stable <stable(a)vger.kernel.org>
Signed-off-by: Peter Xu <peterx(a)redhat.com>
---
fs/userfaultfd.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 8395605790f6..3b2a41c330e6 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1977,8 +1977,10 @@ static int userfaultfd_api(struct userfaultfd_ctx *ctx,
ret = -EFAULT;
if (copy_from_user(&uffdio_api, buf, sizeof(uffdio_api)))
goto out;
- /* Ignore unsupported features (userspace built against newer kernel) */
- features = uffdio_api.features & UFFD_API_FEATURES;
+ features = uffdio_api.features;
+ ret = -EINVAL;
+ if (uffdio_api.api != UFFD_API || (features & ~UFFD_API_FEATURES))
+ goto err_out;
ret = -EPERM;
if ((features & UFFD_FEATURE_EVENT_FORK) && !capable(CAP_SYS_PTRACE))
goto err_out;
--
2.39.1
Hello my beloved, good morning from here, how are you doing today? My
name is Mrs. Rabi Juanni Marcus, I have something very important that
i want to discuss with you.
If the value read from the CHDBOFF and ERDBOFF registers is outside the
range of the MHI register space then an invalid address might be computed
which later causes a kernel panic. Range check the read value to prevent
a crash due to bad data from the device.
Fixes: 6cd330ae76ff ("bus: mhi: core: Add support for ringing channel/event ring doorbells")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jeffrey Hugo <quic_jhugo(a)quicinc.com>
Reviewed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy(a)quicinc.com>
---
v2:
-CC stable
-Use ERANGE for the error code
drivers/bus/mhi/host/init.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/drivers/bus/mhi/host/init.c b/drivers/bus/mhi/host/init.c
index 3d779ee..b46a082 100644
--- a/drivers/bus/mhi/host/init.c
+++ b/drivers/bus/mhi/host/init.c
@@ -516,6 +516,12 @@ int mhi_init_mmio(struct mhi_controller *mhi_cntrl)
return -EIO;
}
+ if (val >= mhi_cntrl->reg_len - (8 * MHI_DEV_WAKE_DB)) {
+ dev_err(dev, "CHDB offset: 0x%x is out of range: 0x%zx\n",
+ val, mhi_cntrl->reg_len - (8 * MHI_DEV_WAKE_DB));
+ return -ERANGE;
+ }
+
/* Setup wake db */
mhi_cntrl->wake_db = base + val + (8 * MHI_DEV_WAKE_DB);
mhi_cntrl->wake_set = false;
@@ -532,6 +538,12 @@ int mhi_init_mmio(struct mhi_controller *mhi_cntrl)
return -EIO;
}
+ if (val >= mhi_cntrl->reg_len - (8 * mhi_cntrl->total_ev_rings)) {
+ dev_err(dev, "ERDB offset: 0x%x is out of range: 0x%zx\n",
+ val, mhi_cntrl->reg_len - (8 * mhi_cntrl->total_ev_rings));
+ return -ERANGE;
+ }
+
/* Setup event db address for each ev_ring */
mhi_event = mhi_cntrl->mhi_event;
for (i = 0; i < mhi_cntrl->total_ev_rings; i++, val += 8, mhi_event++) {
--
2.7.4
If two or more suitable entries with the same filename are found in
__uc_fw_auto_select's fw_blobs, and that filename fails to load in the
first attempt and in the retry, when __uc_fw_auto_select is called for
the third time, the coincidence of strings will cause it to clear
file_selected.path at the first hit, so it will return the second hit
over and over again, indefinitely.
Of course this doesn't occur with the pristine blob lists, but a
modified version could run into this, e.g., patching in a duplicate
entry, or (as in our case) disarming blob loading by remapping their
names to "/*(DEBLOBBED)*/", given a toolchain that unifies identical
string literals.
Of course I'm ready to carry a patchlet to avoid this problem
triggered by our (GNU Linux-libre's) intentional changes, but I
figured you might be interested in fail-safing it even in accidental
backporting circumstances. I realize it's not entirely foolproof: if
the same string appears in two entries separated by a different one,
the infinite loop might still occur. Catching that even more unlikely
situation seemed too expensive.
Link: https://www.fsfla.org/pipermail/linux-libre/2023-March/003506.html
Cc: intel-gfx(a)lists.freedesktop.org
Cc: stable(a)vger.kernel.org # 6.[12].x
Signed-off-by: Alexandre Oliva <lxoliva(a)fsfla.org>
---
drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index 9d6f571097e6..2b7564a3ed82 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -259,7 +259,10 @@ __uc_fw_auto_select(struct drm_i915_private *i915, struct intel_uc_fw *uc_fw)
uc_fw->file_selected.path = NULL;
continue;
- }
+ } else if (uc_fw->file_wanted.path == blob->path)
+ /* Avoid retrying forever when neighbor
+ entries point to the same path. */
+ continue;
uc_fw->file_selected.path = blob->path;
uc_fw->file_wanted.path = blob->path;
--
2.25.1
--
Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/
Free Software Activist GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts. Ask me about <https://stallmansupport.org>
In dual-bridges scenario, some bugs were found for irq
controllers drivers, so the patch serie is used to fix them.
V1->V2:
1. Remove all of ChangeID in patches
2. Exchange the sequence of some patches
3. Adjust code style of if...else... in patch[2]
Jianmin Lv (5):
irqchip/loongson-eiointc: Fix returned value on parsing MADT
irqchip/loongson-eiointc: Fix incorrect use of acpi_get_vec_parent
irqchip/loongson-eiointc: Fix registration of syscore_ops
irqchip/loongson-pch-pic: Fix registration of syscore_ops
irqchip/loongson-pch-pic: Fix pch_pic_acpi_init calling
drivers/irqchip/irq-loongson-eiointc.c | 32 ++++++++++++++++++--------
drivers/irqchip/irq-loongson-pch-pic.c | 6 ++++-
2 files changed, 27 insertions(+), 11 deletions(-)
--
2.31.1
Since 32ef9e5054ec, -Wa,-gdwarf-2 is no longer used in KBUILD_AFLAGS.
Instead, it includes -g, the appropriate -gdwarf-* flag, and also the
-Wa versions of both of those if building with Clang and GNU as. As a
result, debug info was being generated for the purgatory objects, even
though the intention was that it not be.
Fixes: 32ef9e5054ec ("Makefile.debug: re-enable debug info for .S files")
Signed-off-by: Alyssa Ross <hi(a)alyssa.is>
Cc: stable(a)vger.kernel.org
Acked-by: Nick Desaulniers <ndesaulniers(a)google.com>
---
v2: https://lore.kernel.org/r/20230326182120.194541-1-hi@alyssa.is
Difference from v2: replaced asflags-remove-y with every possible
debug flag with asflags-y += -g0, as suggested by Nick Desaulniers.
Additionally, I've CCed the x86 maintainers this time, since Masahiro
said he would like acks from subsystem maintainers, and
get_maintainer.pl didn't pick them the first time around.
arch/riscv/purgatory/Makefile | 7 +------
arch/x86/purgatory/Makefile | 3 +--
2 files changed, 2 insertions(+), 8 deletions(-)
diff --git a/arch/riscv/purgatory/Makefile b/arch/riscv/purgatory/Makefile
index d16bf715a586..9c1e71853ee7 100644
--- a/arch/riscv/purgatory/Makefile
+++ b/arch/riscv/purgatory/Makefile
@@ -84,12 +84,7 @@ CFLAGS_string.o += $(PURGATORY_CFLAGS)
CFLAGS_REMOVE_ctype.o += $(PURGATORY_CFLAGS_REMOVE)
CFLAGS_ctype.o += $(PURGATORY_CFLAGS)
-AFLAGS_REMOVE_entry.o += -Wa,-gdwarf-2
-AFLAGS_REMOVE_memcpy.o += -Wa,-gdwarf-2
-AFLAGS_REMOVE_memset.o += -Wa,-gdwarf-2
-AFLAGS_REMOVE_strcmp.o += -Wa,-gdwarf-2
-AFLAGS_REMOVE_strlen.o += -Wa,-gdwarf-2
-AFLAGS_REMOVE_strncmp.o += -Wa,-gdwarf-2
+asflags-y += -g0
$(obj)/purgatory.ro: $(PURGATORY_OBJS) FORCE
$(call if_changed,ld)
diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile
index 17f09dc26381..8e6c81b1c8f7 100644
--- a/arch/x86/purgatory/Makefile
+++ b/arch/x86/purgatory/Makefile
@@ -69,8 +69,7 @@ CFLAGS_sha256.o += $(PURGATORY_CFLAGS)
CFLAGS_REMOVE_string.o += $(PURGATORY_CFLAGS_REMOVE)
CFLAGS_string.o += $(PURGATORY_CFLAGS)
-AFLAGS_REMOVE_setup-x86_$(BITS).o += -Wa,-gdwarf-2
-AFLAGS_REMOVE_entry64.o += -Wa,-gdwarf-2
+asflags-y += -g0
$(obj)/purgatory.ro: $(PURGATORY_OBJS) FORCE
$(call if_changed,ld)
--
2.37.1
The patch titled
Subject: nilfs2: fix sysfs interface lifetime
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
nilfs2-fix-sysfs-interface-lifetime.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Subject: nilfs2: fix sysfs interface lifetime
Date: Fri, 31 Mar 2023 05:55:15 +0900
The current nilfs2 sysfs support has issues with the timing of creation
and deletion of sysfs entries, potentially leading to null pointer
dereferences, use-after-free, and lockdep warnings.
Some of the sysfs attributes for nilfs2 per-filesystem instance refer to
metadata file "cpfile", "sufile", or "dat", but
nilfs_sysfs_create_device_group that creates those attributes is executed
before the inodes for these metadata files are loaded, and
nilfs_sysfs_delete_device_group which deletes these sysfs entries is
called after releasing their metadata file inodes.
Therefore, access to some of these sysfs attributes may occur outside of
the lifetime of these metadata files, resulting in inode NULL pointer
dereferences or use-after-free.
In addition, the call to nilfs_sysfs_create_device_group() is made during
the locking period of the semaphore "ns_sem" of nilfs object, so the
shrinker call caused by the memory allocation for the sysfs entries, may
derive lock dependencies "ns_sem" -> (shrinker) -> "locks acquired in
nilfs_evict_inode()".
Since nilfs2 may acquire "ns_sem" deep in the call stack holding other
locks via its error handler __nilfs_error(), this causes lockdep to report
circular locking. This is a false positive and no circular locking
actually occurs as no inodes exist yet when
nilfs_sysfs_create_device_group() is called. Fortunately, the lockdep
warnings can be resolved by simply moving the call to
nilfs_sysfs_create_device_group() out of "ns_sem".
This fixes these sysfs issues by revising where the device's sysfs
interface is created/deleted and keeping its lifetime within the lifetime
of the metadata files above.
Link: https://lkml.kernel.org/r/20230330205515.6167-1-konishi.ryusuke@gmail.com
Fixes: dd70edbde262 ("nilfs2: integrate sysfs support into driver")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: syzbot+979fa7f9c0d086fdc282(a)syzkaller.appspotmail.com
Link: https://lkml.kernel.org/r/0000000000003414b505f7885f7e@google.com
Reported-by: syzbot+5b7d542076d9bddc3c6a(a)syzkaller.appspotmail.com
Link: https://lkml.kernel.org/r/0000000000006ac86605f5f44eb9@google.com
Cc: Viacheslav Dubeyko <slava(a)dubeyko.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/nilfs2/super.c | 2 ++
fs/nilfs2/the_nilfs.c | 12 +++++++-----
2 files changed, 9 insertions(+), 5 deletions(-)
--- a/fs/nilfs2/super.c~nilfs2-fix-sysfs-interface-lifetime
+++ a/fs/nilfs2/super.c
@@ -482,6 +482,7 @@ static void nilfs_put_super(struct super
up_write(&nilfs->ns_sem);
}
+ nilfs_sysfs_delete_device_group(nilfs);
iput(nilfs->ns_sufile);
iput(nilfs->ns_cpfile);
iput(nilfs->ns_dat);
@@ -1105,6 +1106,7 @@ nilfs_fill_super(struct super_block *sb,
nilfs_put_root(fsroot);
failed_unload:
+ nilfs_sysfs_delete_device_group(nilfs);
iput(nilfs->ns_sufile);
iput(nilfs->ns_cpfile);
iput(nilfs->ns_dat);
--- a/fs/nilfs2/the_nilfs.c~nilfs2-fix-sysfs-interface-lifetime
+++ a/fs/nilfs2/the_nilfs.c
@@ -87,7 +87,6 @@ void destroy_nilfs(struct the_nilfs *nil
{
might_sleep();
if (nilfs_init(nilfs)) {
- nilfs_sysfs_delete_device_group(nilfs);
brelse(nilfs->ns_sbh[0]);
brelse(nilfs->ns_sbh[1]);
}
@@ -305,6 +304,10 @@ int load_nilfs(struct the_nilfs *nilfs,
goto failed;
}
+ err = nilfs_sysfs_create_device_group(sb);
+ if (unlikely(err))
+ goto sysfs_error;
+
if (valid_fs)
goto skip_recovery;
@@ -366,6 +369,9 @@ int load_nilfs(struct the_nilfs *nilfs,
goto failed;
failed_unload:
+ nilfs_sysfs_delete_device_group(nilfs);
+
+ sysfs_error:
iput(nilfs->ns_cpfile);
iput(nilfs->ns_sufile);
iput(nilfs->ns_dat);
@@ -697,10 +703,6 @@ int init_nilfs(struct the_nilfs *nilfs,
if (err)
goto failed_sbh;
- err = nilfs_sysfs_create_device_group(sb);
- if (err)
- goto failed_sbh;
-
set_nilfs_init(nilfs);
err = 0;
out:
_
Patches currently in -mm which might be from konishi.ryusuke(a)gmail.com are
nilfs2-fix-potential-uaf-of-struct-nilfs_sc_info-in-nilfs_segctor_thread.patch
nilfs2-fix-sysfs-interface-lifetime.patch
As I attached a USB SSD into CentOS 9 Stream computer, after a short
while it swaps /dev/sdb into /dev/sdc and the I/O gets ruined.
Kind regards, Ilari Jääskeläinen.
When cleaning up peer group ids in the failure path we need to make sure
to hold on to the namespace lock. Otherwise another thread might just
turn the mount from a shared into a non-shared mount concurrently.
Reported-by: syzbot+8ac3859139c685c4f597(a)syzkaller.appspotmail.com
Link: https://lore.kernel.org/lkml/00000000000088694505f8132d77@google.com
Fixes: 2a1867219c7b ("fs: add mount_setattr()")
Cc: stable(a)vger.kernel.org # 5.12+
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
---
fs/namespace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/namespace.c b/fs/namespace.c
index bc0f15257b49..6836e937ee61 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -4183,9 +4183,9 @@ static int do_mount_setattr(struct path *path, struct mount_kattr *kattr)
unlock_mount_hash();
if (kattr->propagation) {
- namespace_unlock();
if (err)
cleanup_group_ids(mnt, NULL);
+ namespace_unlock();
}
return err;
---
base-commit: 197b6b60ae7bc51dd0814953c562833143b292aa
change-id: 20230330-vfs-mount_setattr-propagation-fix-363b7c59d7fb
I am fairly certain that CentOS 9 Stream GNOME mounted it without UUID,
though.
pe, 2023-03-31 kello 09:35 +0000, David Laight kirjoitti:
> From: Ilari Jääskeläinen
> > Sent: 31 March 2023 10:09
> >
> > I am afraid I cant do that. I already almost wiped my root
> > partition by
> > accident because of this flaw.
>
> Always mount filsystems by uuid, not device name.
>
> David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes,
> MK1 1PT, UK
> Registration No: 1397386 (Wales)
roger that. good advice. thanks.
pe, 2023-03-31 kello 09:35 +0000, David Laight kirjoitti:
> From: Ilari Jääskeläinen
> > Sent: 31 March 2023 10:09
> >
> > I am afraid I cant do that. I already almost wiped my root
> > partition by
> > accident because of this flaw.
>
> Always mount filsystems by uuid, not device name.
>
> David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes,
> MK1 1PT, UK
> Registration No: 1397386 (Wales)
Please can I trust you?.
I need your assistance to help me to invest my inheritance fund in
your country and to help me to come over to your country for the
betterment of my life and continue my education.
I will be happy to hear from you, then I will give you more details.
Best regards,
Miss Sarah Arabian
It the device is probed in non-zero ACPI D state, the module
identification is delayed until the first streamon.
The module identification has two parts: deviceID and version. To rea
the version we have to enable OTP read. This cannot be done during
streamon, becase it modifies REG_MODE_SELECT.
Since the driver has the same behaviour for all the module versions, do
not read the module version from the sensor's OTP.
Cc: stable(a)vger.kernel.org
Fixes: 0e014f1a8d54 ("media: ov8856: support device probe in non-zero ACPI D state")
Signed-off-by: Ricardo Ribalda <ribalda(a)chromium.org>
---
To: Dongchun Zhu <dongchun.zhu(a)mediatek.com>
To: Mauro Carvalho Chehab <mchehab(a)kernel.org>
To: Sakari Ailus <sakari.ailus(a)linux.intel.com>
To: Bingbu Cao <bingbu.cao(a)intel.com>
Cc: Max Staudt <mstaudt(a)chromium.org>
Cc: Jimmy Su <jimmy.su(a)intel.com>
Cc: Mauro Carvalho Chehab <mchehab+huawei(a)kernel.org>
Cc: linux-media(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
---
drivers/media/i2c/ov8856.c | 40 ----------------------------------------
1 file changed, 40 deletions(-)
diff --git a/drivers/media/i2c/ov8856.c b/drivers/media/i2c/ov8856.c
index cf8384e09413..b5c7881383ca 100644
--- a/drivers/media/i2c/ov8856.c
+++ b/drivers/media/i2c/ov8856.c
@@ -1709,46 +1709,6 @@ static int ov8856_identify_module(struct ov8856 *ov8856)
return -ENXIO;
}
- ret = ov8856_write_reg(ov8856, OV8856_REG_MODE_SELECT,
- OV8856_REG_VALUE_08BIT, OV8856_MODE_STREAMING);
- if (ret)
- return ret;
-
- ret = ov8856_write_reg(ov8856, OV8856_OTP_MODE_CTRL,
- OV8856_REG_VALUE_08BIT, OV8856_OTP_MODE_AUTO);
- if (ret) {
- dev_err(&client->dev, "failed to set otp mode");
- return ret;
- }
-
- ret = ov8856_write_reg(ov8856, OV8856_OTP_LOAD_CTRL,
- OV8856_REG_VALUE_08BIT,
- OV8856_OTP_LOAD_CTRL_ENABLE);
- if (ret) {
- dev_err(&client->dev, "failed to enable load control");
- return ret;
- }
-
- ret = ov8856_read_reg(ov8856, OV8856_MODULE_REVISION,
- OV8856_REG_VALUE_08BIT, &val);
- if (ret) {
- dev_err(&client->dev, "failed to read module revision");
- return ret;
- }
-
- dev_info(&client->dev, "OV8856 revision %x (%s) at address 0x%02x\n",
- val,
- val == OV8856_2A_MODULE ? "2A" :
- val == OV8856_1B_MODULE ? "1B" : "unknown revision",
- client->addr);
-
- ret = ov8856_write_reg(ov8856, OV8856_REG_MODE_SELECT,
- OV8856_REG_VALUE_08BIT, OV8856_MODE_STANDBY);
- if (ret) {
- dev_err(&client->dev, "failed to exit streaming mode");
- return ret;
- }
-
ov8856->identified = true;
return 0;
---
base-commit: 9fd6ba5420ba2b637d1ecc6de8613ec8b9c87e5a
change-id: 20230323-ov8856-otp-112f3cdc74b1
Best regards,
--
Ricardo Ribalda <ribalda(a)chromium.org>
From: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
When considering whether to mark one context as stopped and another as
started we need to look at whether the previous and new _contexts_ are
different and not just requests. Otherwise the software tracked context
start time was incorrectly updated to the most recent lite-restore time-
stamp, which was in some cases resulting in active time going backward,
until the context switch (typically the hearbeat pulse) would synchronise
with the hardware tracked context runtime. Easiest use case to observe
this behaviour was with a full screen clients with close to 100% engine
load.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Fixes: bb6287cb1886 ("drm/i915: Track context current active time")
Cc: <stable(a)vger.kernel.org> # v5.19+
---
drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 1bbe6708d0a7..750326434677 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -2018,6 +2018,8 @@ process_csb(struct intel_engine_cs *engine, struct i915_request **inactive)
* inspecting the queue to see if we need to resumbit.
*/
if (*prev != *execlists->active) { /* elide lite-restores */
+ struct intel_context *prev_ce = NULL, *active_ce = NULL;
+
/*
* Note the inherent discrepancy between the HW runtime,
* recorded as part of the context switch, and the CPU
@@ -2029,9 +2031,15 @@ process_csb(struct intel_engine_cs *engine, struct i915_request **inactive)
* and correct overselves later when updating from HW.
*/
if (*prev)
- lrc_runtime_stop((*prev)->context);
+ prev_ce = (*prev)->context;
if (*execlists->active)
- lrc_runtime_start((*execlists->active)->context);
+ active_ce = (*execlists->active)->context;
+ if (prev_ce != active_ce) {
+ if (prev_ce)
+ lrc_runtime_stop(prev_ce);
+ if (active_ce)
+ lrc_runtime_start(active_ce);
+ }
new_timeslice(execlists);
}
--
2.37.2
The bug was obswerved while reading code. There are not many users of
addr_mode_nbytes. Anyway, we should update the flash's current address
mode when changing the address mode, fix it. We don't care for now about
the set_4byte_addr_mode(nor, false) from spi_nor_restore(), as it is
used at driver remove and shutdown.
Fixes: d7931a215063 ("mtd: spi-nor: core: Track flash's internal address mode")
Signed-off-by: Tudor Ambarus <tudor.ambarus(a)linaro.org>
Cc: stable(a)vger.kernel.org
---
drivers/mtd/spi-nor/core.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index 0517a61975e4..4f0d90d3dad5 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -3135,6 +3135,7 @@ static int spi_nor_quad_enable(struct spi_nor *nor)
static int spi_nor_init(struct spi_nor *nor)
{
+ struct spi_nor_flash_parameter *params = nor->params;
int err;
err = spi_nor_octal_dtr_enable(nor, true);
@@ -3176,9 +3177,10 @@ static int spi_nor_init(struct spi_nor *nor)
*/
WARN_ONCE(nor->flags & SNOR_F_BROKEN_RESET,
"enabling reset hack; may not recover from unexpected reboots\n");
- err = nor->params->set_4byte_addr_mode(nor, true);
+ err = params->set_4byte_addr_mode(nor, true);
if (err && err != -ENOTSUPP)
return err;
+ params->addr_mode_nbytes = 4;
}
return 0;
--
2.40.0.348.gf938b09366-goog
From: Roberto Sassu <roberto.sassu(a)huawei.com>
Reiserfs sets a security xattr at inode creation time in two stages: first,
it calls reiserfs_security_init() to obtain the xattr from active LSMs;
then, it calls reiserfs_security_write() to actually write that xattr.
Unfortunately, it seems there is a wrong expectation that LSMs provide the
full xattr name in the form 'security.<suffix>'. However, LSMs always
provided just the suffix, causing reiserfs to not write the xattr at all
(if the suffix is shorter than the prefix), or to write an xattr with the
wrong name.
Add a temporary buffer in reiserfs_security_write(), and write to it the
full xattr name, before passing it to reiserfs_xattr_set_handle().
Since the 'security.' prefix is always prepended, remove the name length
check.
Cc: stable(a)vger.kernel.org # v2.6.x
Fixes: 57fe60df6241 ("reiserfs: add atomic addition of selinux attributes during inode creation")
Signed-off-by: Roberto Sassu <roberto.sassu(a)huawei.com>
---
fs/reiserfs/xattr_security.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/fs/reiserfs/xattr_security.c b/fs/reiserfs/xattr_security.c
index 6bffdf9a4fd..b0c354ab113 100644
--- a/fs/reiserfs/xattr_security.c
+++ b/fs/reiserfs/xattr_security.c
@@ -95,11 +95,13 @@ int reiserfs_security_write(struct reiserfs_transaction_handle *th,
struct inode *inode,
struct reiserfs_security_handle *sec)
{
+ char xattr_name[XATTR_NAME_MAX + 1];
int error;
- if (strlen(sec->name) < sizeof(XATTR_SECURITY_PREFIX))
- return -EINVAL;
- error = reiserfs_xattr_set_handle(th, inode, sec->name, sec->value,
+ snprintf(xattr_name, sizeof(xattr_name), "%s%s", XATTR_SECURITY_PREFIX,
+ sec->name);
+
+ error = reiserfs_xattr_set_handle(th, inode, xattr_name, sec->value,
sec->length, XATTR_CREATE);
if (error == -ENODATA || error == -EOPNOTSUPP)
error = 0;
--
2.25.1
While determining the initial pin assignment to be sent in the configure
message, using the DP_PIN_ASSIGN_DP_ONLY_MASK mask causes the DFP_U to
send both Pin Assignment C and E when both are supported by the DFP_U and
UFP_U. The spec (Table 5-7 DFP_U Pin Assignment Selection Mandates,
VESA DisplayPort Alt Mode Standard v2.0) indicates that the DFP_U never
selects Pin Assignment E when Pin Assignment C is offered.
Update the DP_PIN_ASSIGN_DP_ONLY_MASK conditional to intially select only
Pin Assignment C if it is available.
Fixes: 0e3bb7d6894d ("usb: typec: Add driver for DisplayPort alternate mode")
Cc: stable(a)vger.kernel.org
Signed-off-by: RD Babiera <rdbabiera(a)google.com>
---
drivers/usb/typec/altmodes/displayport.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/typec/altmodes/displayport.c b/drivers/usb/typec/altmodes/displayport.c
index 662cd043b50e..8f3e884222ad 100644
--- a/drivers/usb/typec/altmodes/displayport.c
+++ b/drivers/usb/typec/altmodes/displayport.c
@@ -112,8 +112,12 @@ static int dp_altmode_configure(struct dp_altmode *dp, u8 con)
if (dp->data.status & DP_STATUS_PREFER_MULTI_FUNC &&
pin_assign & DP_PIN_ASSIGN_MULTI_FUNC_MASK)
pin_assign &= DP_PIN_ASSIGN_MULTI_FUNC_MASK;
- else if (pin_assign & DP_PIN_ASSIGN_DP_ONLY_MASK)
+ else if (pin_assign & DP_PIN_ASSIGN_DP_ONLY_MASK) {
pin_assign &= DP_PIN_ASSIGN_DP_ONLY_MASK;
+ /* Default to pin assign C if available */
+ if (pin_assign & BIT(DP_PIN_ASSIGN_C))
+ pin_assign = BIT(DP_PIN_ASSIGN_C);
+ }
if (!pin_assign)
return -EINVAL;
base-commit: 97318d6427f62b723c89f4150f8f48126ef74961
--
2.40.0.348.gf938b09366-goog
SUBJECT: ovl: fail on invalid uid/gid mapping at copy up
COMMIT: 4f11ada10d0ad3fd53e2bd67806351de63a4f9c3
Reason for request:
This resolves CVE-2023-0386
CVE context: https://nvd.nist.gov/vuln/detail/CVE-2023-0386
SVACE reports return value of a function 'usb_alloc_urb' is dereferenced
without checking for null in 5.10 stable releases.
The problem has been fixed by the following
patch which can be cleanly applied to the 5.10 branch.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
The patch titled
Subject: mm: take a page reference when removing device exclusive entries
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-take-a-page-reference-when-removing-device-exclusive-entries.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Alistair Popple <apopple(a)nvidia.com>
Subject: mm: take a page reference when removing device exclusive entries
Date: Thu, 30 Mar 2023 12:25:19 +1100
Device exclusive page table entries are used to prevent CPU access to a
page whilst it is being accessed from a device. Typically this is used to
implement atomic operations when the underlying bus does not support
atomic access. When a CPU thread encounters a device exclusive entry it
locks the page and restores the original entry after calling mmu notifiers
to signal drivers that exclusive access is no longer available.
The device exclusive entry holds a reference to the page making it safe to
access the struct page whilst the entry is present. However the fault
handling code does not hold the PTL when taking the page lock. This means
if there are multiple threads faulting concurrently on the device
exclusive entry one will remove the entry whilst others will wait on the
page lock without holding a reference.
This can lead to threads locking or waiting on a folio with a zero
refcount. Whilst mmap_lock prevents the pages getting freed via munmap()
they may still be freed by a migration. This leads to warnings such as
PAGE_FLAGS_CHECK_AT_FREE due to the page being locked when the refcount
drops to zero.
Fix this by trying to take a reference on the folio before locking it.
The code already checks the PTE under the PTL and aborts if the entry is
no longer there. It is also possible the folio has been unmapped, freed
and re-allocated allowing a reference to be taken on an unrelated folio.
This case is also detected by the PTE check and the folio is unlocked
without further changes.
Link: https://lkml.kernel.org/r/20230330012519.804116-1-apopple@nvidia.com
Fixes: b756a3b5e7ea ("mm: device exclusive memory access")
Signed-off-by: Alistair Popple <apopple(a)nvidia.com>
Reviewed-by: Ralph Campbell <rcampbell(a)nvidia.com>
Reviewed-by: John Hubbard <jhubbard(a)nvidia.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
--- a/mm/memory.c~mm-take-a-page-reference-when-removing-device-exclusive-entries
+++ a/mm/memory.c
@@ -3563,8 +3563,21 @@ static vm_fault_t remove_device_exclusiv
struct vm_area_struct *vma = vmf->vma;
struct mmu_notifier_range range;
- if (!folio_lock_or_retry(folio, vma->vm_mm, vmf->flags))
+ /*
+ * We need a reference to lock the folio because we don't hold
+ * the PTL so a racing thread can remove the device-exclusive
+ * entry and unmap it. If the folio is free the entry must
+ * have been removed already. If it happens to have already
+ * been re-allocated after being freed all we do is lock and
+ * unlock it.
+ */
+ if (!folio_try_get(folio))
+ return 0;
+
+ if (!folio_lock_or_retry(folio, vma->vm_mm, vmf->flags)) {
+ folio_put(folio);
return VM_FAULT_RETRY;
+ }
mmu_notifier_range_init_owner(&range, MMU_NOTIFY_EXCLUSIVE, 0,
vma->vm_mm, vmf->address & PAGE_MASK,
(vmf->address & PAGE_MASK) + PAGE_SIZE, NULL);
@@ -3577,6 +3590,7 @@ static vm_fault_t remove_device_exclusiv
pte_unmap_unlock(vmf->pte, vmf->ptl);
folio_unlock(folio);
+ folio_put(folio);
mmu_notifier_invalidate_range_end(&range);
return 0;
_
Patches currently in -mm which might be from apopple(a)nvidia.com are
mm-take-a-page-reference-when-removing-device-exclusive-entries.patch
commit f1a7941243c1 ("mm: convert mm's rss stats into percpu_counter")
introduces a memory leak by missing a call to destroy_context() when a
percpu_counter fails to allocate.
Before introducing the per-cpu counter allocations, init_new_context()
was the last call that could fail in mm_init(), and thus there was no
need to ever invoke destroy_context() in the error paths. Adding the
following percpu counter allocations adds error paths after
init_new_context(), which means its associated destroy_context() needs
to be called when percpu counters fail to allocate.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Marek Szyprowski <m.szyprowski(a)samsung.com>
Cc: linux-mm(a)kvack.org
Cc: stable(a)vger.kernel.org # 6.2
---
kernel/fork.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/fork.c b/kernel/fork.c
index c0257cbee093..c983c4fe3090 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1171,6 +1171,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
fail_pcpu:
while (i > 0)
percpu_counter_destroy(&mm->rss_stat[--i]);
+ destroy_context(mm);
fail_nocontext:
mm_free_pgd(mm);
fail_nopgd:
--
2.25.1
The patch titled
Subject: mm: fix memory leak on mm_init error handling
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-fix-memory-leak-on-mm_init-error-handling.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Subject: mm: fix memory leak on mm_init error handling
Date: Thu, 30 Mar 2023 09:38:22 -0400
commit f1a7941243c1 ("mm: convert mm's rss stats into percpu_counter")
introduces a memory leak by missing a call to destroy_context() when a
percpu_counter fails to allocate.
Before introducing the per-cpu counter allocations, init_new_context() was
the last call that could fail in mm_init(), and thus there was no need to
ever invoke destroy_context() in the error paths. Adding the following
percpu counter allocations adds error paths after init_new_context(),
which means its associated destroy_context() needs to be called when
percpu counters fail to allocate.
Link: https://lkml.kernel.org/r/20230330133822.66271-1-mathieu.desnoyers@efficios…
Fixes: f1a7941243c1 ("mm: convert mm's rss stats into percpu_counter")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Acked-by: Shakeel Butt <shakeelb(a)google.com>
Cc: Marek Szyprowski <m.szyprowski(a)samsung.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/fork.c | 1 +
1 file changed, 1 insertion(+)
--- a/kernel/fork.c~mm-fix-memory-leak-on-mm_init-error-handling
+++ a/kernel/fork.c
@@ -1174,6 +1174,7 @@ static struct mm_struct *mm_init(struct
fail_pcpu:
while (i > 0)
percpu_counter_destroy(&mm->rss_stat[--i]);
+ destroy_context(mm);
fail_nocontext:
mm_free_pgd(mm);
fail_nopgd:
_
Patches currently in -mm which might be from mathieu.desnoyers(a)efficios.com are
mm-fix-memory-leak-on-mm_init-error-handling.patch
When cpumask is specified as a module parameter the value is
overwritten by the module init routine. This can easily be fixed
by checking to see if the mask has already been allocated in the
init routine.
When max_idle is specified as a module parameter a panic will occur.
The problem is that the idle_injection_cpu_mask is not allocated until
the module init routine executes. This can easily be fixed by allocating
the cpumask if it's not already allocated.
Fixes: ebf519710218 ("thermal: intel: powerclamp: Add two module parameters")
Signed-off-by: David Arcari <darcari(a)redhat.com>
Cc: "Rafael J. Wysocki" <rafael(a)kernel.org>
Cc: Daniel Lezcano <daniel.lezcano(a)linaro.org>
Cc: Amit Kucheria <amitk(a)kernel.org>
Cc: Zhang Rui <rui.zhang(a)intel.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada(a)linux.intel.com>
Cc: David Arcari <darcari(a)redhat.com>
Cc: Chen Yu <yu.c.chen(a)intel.com>
Cc: linux-kernel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
---
drivers/thermal/intel/intel_powerclamp.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/thermal/intel/intel_powerclamp.c b/drivers/thermal/intel/intel_powerclamp.c
index c7ba5680cd48..91fc7e239497 100644
--- a/drivers/thermal/intel/intel_powerclamp.c
+++ b/drivers/thermal/intel/intel_powerclamp.c
@@ -235,6 +235,12 @@ static int max_idle_set(const char *arg, const struct kernel_param *kp)
goto skip_limit_set;
}
+ if (!cpumask_available(idle_injection_cpu_mask)) {
+ ret = allocate_copy_idle_injection_mask(cpu_present_mask);
+ if (ret)
+ goto skip_limit_set;
+ }
+
if (check_invalid(idle_injection_cpu_mask, new_max_idle)) {
ret = -EINVAL;
goto skip_limit_set;
@@ -791,7 +797,8 @@ static int __init powerclamp_init(void)
return retval;
mutex_lock(&powerclamp_lock);
- retval = allocate_copy_idle_injection_mask(cpu_present_mask);
+ if (!cpumask_available(idle_injection_cpu_mask))
+ retval = allocate_copy_idle_injection_mask(cpu_present_mask);
mutex_unlock(&powerclamp_lock);
if (retval)
--
2.27.0