From: Baokun Li <libaokun1(a)huawei.com>
[ Upstream commit b4b4fda34e535756f9e774fb2d09c4537b7dfd1c ]
In the following concurrency we will access the uninitialized rs->lock:
ext4_fill_super
ext4_register_sysfs
// sysfs registered msg_ratelimit_interval_ms
// Other processes modify rs->interval to
// non-zero via msg_ratelimit_interval_ms
ext4_orphan_cleanup
ext4_msg(sb, KERN_INFO, "Errors on filesystem, "
__ext4_msg
___ratelimit(&(EXT4_SB(sb)->s_msg_ratelimit_state)
if (!rs->interval) // do nothing if interval is 0
return 1;
raw_spin_trylock_irqsave(&rs->lock, flags)
raw_spin_trylock(lock)
_raw_spin_trylock
__raw_spin_trylock
spin_acquire(&lock->dep_map, 0, 1, _RET_IP_)
lock_acquire
__lock_acquire
register_lock_class
assign_lock_key
dump_stack();
ratelimit_state_init(&sbi->s_msg_ratelimit_state, 5 * HZ, 10);
raw_spin_lock_init(&rs->lock);
// init rs->lock here
and get the following dump_stack:
=========================================================
INFO: trying to register non-static key.
The code is fine but needs lockdep annotation, or maybe
you didn't initialize this object before use?
turning off the locking correctness validator.
CPU: 12 PID: 753 Comm: mount Tainted: G E 6.7.0-rc6-next-20231222 #504
[...]
Call Trace:
dump_stack_lvl+0xc5/0x170
dump_stack+0x18/0x30
register_lock_class+0x740/0x7c0
__lock_acquire+0x69/0x13a0
lock_acquire+0x120/0x450
_raw_spin_trylock+0x98/0xd0
___ratelimit+0xf6/0x220
__ext4_msg+0x7f/0x160 [ext4]
ext4_orphan_cleanup+0x665/0x740 [ext4]
__ext4_fill_super+0x21ea/0x2b10 [ext4]
ext4_fill_super+0x14d/0x360 [ext4]
[...]
=========================================================
Normally interval is 0 until s_msg_ratelimit_state is initialized, so
___ratelimit() does nothing. But registering sysfs precedes initializing
rs->lock, so it is possible to change rs->interval to a non-zero value
via the msg_ratelimit_interval_ms interface of sysfs while rs->lock is
uninitialized, and then a call to ext4_msg triggers the problem by
accessing an uninitialized rs->lock. Therefore register sysfs after all
initializations are complete to avoid such problems.
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Link: https://lore.kernel.org/r/20240102133730.1098120-1-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
[ Resolve merge conflicts ]
Signed-off-by: Bin Lan <bin.lan.cn(a)windriver.com>
---
fs/ext4/super.c | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 987d49e18dbe..e6f08ee9895f 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5506,19 +5506,15 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
if (err)
goto failed_mount6;
- err = ext4_register_sysfs(sb);
- if (err)
- goto failed_mount7;
-
err = ext4_init_orphan_info(sb);
if (err)
- goto failed_mount8;
+ goto failed_mount7;
#ifdef CONFIG_QUOTA
/* Enable quota usage during mount. */
if (ext4_has_feature_quota(sb) && !sb_rdonly(sb)) {
err = ext4_enable_quotas(sb);
if (err)
- goto failed_mount9;
+ goto failed_mount8;
}
#endif /* CONFIG_QUOTA */
@@ -5545,7 +5541,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
ext4_msg(sb, KERN_INFO, "recovery complete");
err = ext4_mark_recovery_complete(sb, es);
if (err)
- goto failed_mount10;
+ goto failed_mount9;
}
if (test_opt(sb, DISCARD) && !bdev_max_discard_sectors(sb->s_bdev))
@@ -5562,15 +5558,17 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
atomic_set(&sbi->s_warning_count, 0);
atomic_set(&sbi->s_msg_count, 0);
+ /* Register sysfs after all initializations are complete. */
+ err = ext4_register_sysfs(sb);
+ if (err)
+ goto failed_mount9;
+
return 0;
-failed_mount10:
+failed_mount9:
ext4_quota_off_umount(sb);
-failed_mount9: __maybe_unused
+failed_mount8: __maybe_unused
ext4_release_orphan_info(sb);
-failed_mount8:
- ext4_unregister_sysfs(sb);
- kobject_put(&sbi->s_kobj);
failed_mount7:
ext4_unregister_li_request(sb);
failed_mount6:
--
2.43.0
From: Juntong Deng <juntong.deng(a)outlook.com>
commit bdcb8aa434c6d36b5c215d02a9ef07551be25a37 upstream.
In gfs2_put_super(), whether withdrawn or not, the quota should
be cleaned up by gfs2_quota_cleanup().
Otherwise, struct gfs2_sbd will be freed before gfs2_qd_dealloc (rcu
callback) has run for all gfs2_quota_data objects, resulting in
use-after-free.
Also, gfs2_destroy_threads() and gfs2_quota_cleanup() is already called
by gfs2_make_fs_ro(), so in gfs2_put_super(), after calling
gfs2_make_fs_ro(), there is no need to call them again.
Reported-by: syzbot+29c47e9e51895928698c(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=29c47e9e51895928698c
Signed-off-by: Juntong Deng <juntong.deng(a)outlook.com>
Signed-off-by: Andreas Gruenbacher <agruenba(a)redhat.com>
Signed-off-by: Guocai He <guocai.he.cn(a)windriver.com>
---
Changes in v2:
Correct the upstream commit id.
This commit is to solve the CVE-2024-52760.
Please merge this commit to linux-5.15.y.
---
fs/gfs2/super.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 268651ac9fc8..98158559893f 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -590,6 +590,8 @@ static void gfs2_put_super(struct super_block *sb)
if (!sb_rdonly(sb)) {
gfs2_make_fs_ro(sdp);
+ } else {
+ gfs2_quota_cleanup(sdp);
}
WARN_ON(gfs2_withdrawing(sdp));
--
2.34.1
From: Tomas Glozar <tglozar(a)redhat.com>
commit 76b3102148135945b013797fac9b206273f0f777 upstream.
Do the same fix as in previous commit also for timerlat-hist.
Link: https://lore.kernel.org/20241011121015.2868751-2-tglozar@redhat.com
Reported-by: Attila Fazekas <afazekas(a)redhat.com>
Signed-off-by: Tomas Glozar <tglozar(a)redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
[ Drop hunk fixing printf in timerlat_print_stats_all since that is not
in 6.6 ]
Signed-off-by: Tomas Glozar <tglozar(a)redhat.com>
---
tools/tracing/rtla/src/timerlat_hist.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/tools/tracing/rtla/src/timerlat_hist.c b/tools/tracing/rtla/src/timerlat_hist.c
index 1c8ecd4ebcbd..667f12f2d67f 100644
--- a/tools/tracing/rtla/src/timerlat_hist.c
+++ b/tools/tracing/rtla/src/timerlat_hist.c
@@ -58,9 +58,9 @@ struct timerlat_hist_cpu {
int *thread;
int *user;
- int irq_count;
- int thread_count;
- int user_count;
+ unsigned long long irq_count;
+ unsigned long long thread_count;
+ unsigned long long user_count;
unsigned long long min_irq;
unsigned long long sum_irq;
@@ -300,15 +300,15 @@ timerlat_print_summary(struct timerlat_hist_params *params,
continue;
if (!params->no_irq)
- trace_seq_printf(trace->seq, "%9d ",
+ trace_seq_printf(trace->seq, "%9llu ",
data->hist[cpu].irq_count);
if (!params->no_thread)
- trace_seq_printf(trace->seq, "%9d ",
+ trace_seq_printf(trace->seq, "%9llu ",
data->hist[cpu].thread_count);
if (params->user_hist)
- trace_seq_printf(trace->seq, "%9d ",
+ trace_seq_printf(trace->seq, "%9llu ",
data->hist[cpu].user_count);
}
trace_seq_printf(trace->seq, "\n");
--
2.47.1
The DWC Databook description for the LWR_TARGET_RW and LWR_TARGET_HW fields
in the IATU_LWR_TARGET_ADDR_OFF_INBOUND_i registers state that:
"Field size depends on log2(BAR_MASK+1) in BAR match mode."
I.e. only the upper bits are writable, and the number of writable bits is
dependent on the configured BAR_MASK.
If we do not write the BAR_MASK before writing the iATU registers, we are
relying the reset value of the BAR_MASK being larger than the requested
size of the first set_bar() call. The reset value of the BAR_MASK is SoC
dependent.
Thus, if the first set_bar() call requests a size that is larger than the
reset value of the BAR_MASK, the iATU will try to write to read-only bits,
which will cause the iATU to end up redirecting to a physical address that
is different from the address that was intended.
Thus, we should always write the iATU registers after writing the BAR_MASK.
Cc: stable(a)vger.kernel.org
Fixes: f8aed6ec624f ("PCI: dwc: designware: Add EP mode support")
Signed-off-by: Niklas Cassel <cassel(a)kernel.org>
---
.../pci/controller/dwc/pcie-designware-ep.c | 28 ++++++++++---------
1 file changed, 15 insertions(+), 13 deletions(-)
diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
index f3ac7d46a855..bad588ef69a4 100644
--- a/drivers/pci/controller/dwc/pcie-designware-ep.c
+++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
@@ -222,19 +222,10 @@ static int dw_pcie_ep_set_bar(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
if ((flags & PCI_BASE_ADDRESS_MEM_TYPE_64) && (bar & 1))
return -EINVAL;
- reg = PCI_BASE_ADDRESS_0 + (4 * bar);
-
- if (!(flags & PCI_BASE_ADDRESS_SPACE))
- type = PCIE_ATU_TYPE_MEM;
- else
- type = PCIE_ATU_TYPE_IO;
-
- ret = dw_pcie_ep_inbound_atu(ep, func_no, type, epf_bar->phys_addr, bar);
- if (ret)
- return ret;
-
if (ep->epf_bar[bar])
- return 0;
+ goto config_atu;
+
+ reg = PCI_BASE_ADDRESS_0 + (4 * bar);
dw_pcie_dbi_ro_wr_en(pci);
@@ -246,9 +237,20 @@ static int dw_pcie_ep_set_bar(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
dw_pcie_ep_writel_dbi(ep, func_no, reg + 4, 0);
}
- ep->epf_bar[bar] = epf_bar;
dw_pcie_dbi_ro_wr_dis(pci);
+config_atu:
+ if (!(flags & PCI_BASE_ADDRESS_SPACE))
+ type = PCIE_ATU_TYPE_MEM;
+ else
+ type = PCIE_ATU_TYPE_IO;
+
+ ret = dw_pcie_ep_inbound_atu(ep, func_no, type, epf_bar->phys_addr, bar);
+ if (ret)
+ return ret;
+
+ ep->epf_bar[bar] = epf_bar;
+
return 0;
}
--
2.47.0
In commit 4284c88fff0e ("PCI: designware-ep: Allow pci_epc_set_bar() update
inbound map address") set_bar() was modified to support dynamically
changing the backing physical address of a BAR that was already configured.
This means that set_bar() can be called twice, without ever calling
clear_bar() (as calling clear_bar() would clear the BAR's PCI address
assigned by the host).
This can only be done if the new BAR size/flags does not differ from the
existing BAR configuration. Add these missing checks.
If we allow set_bar() to set e.g. a new BAR size that differs from the
existing BAR size, the new address translation range will be smaller than
the BAR size already determined by the host, which would mean that a read
past the new BAR size would pass the iATU untranslated, which could allow
the host to read memory not belonging to the new struct pci_epf_bar.
While at it, add comments which clarifies the support for dynamically
changing the physical address of a BAR. (Which was also missing.)
Cc: stable(a)vger.kernel.org
Fixes: 4284c88fff0e ("PCI: designware-ep: Allow pci_epc_set_bar() update inbound map address")
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
Signed-off-by: Niklas Cassel <cassel(a)kernel.org>
---
.../pci/controller/dwc/pcie-designware-ep.c | 22 ++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
index bad588ef69a4..44a617d54b15 100644
--- a/drivers/pci/controller/dwc/pcie-designware-ep.c
+++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
@@ -222,8 +222,28 @@ static int dw_pcie_ep_set_bar(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
if ((flags & PCI_BASE_ADDRESS_MEM_TYPE_64) && (bar & 1))
return -EINVAL;
- if (ep->epf_bar[bar])
+ /*
+ * Certain EPF drivers dynamically change the physical address of a BAR
+ * (i.e. they call set_bar() twice, without ever calling clear_bar(), as
+ * calling clear_bar() would clear the BAR's PCI address assigned by the
+ * host).
+ */
+ if (ep->epf_bar[bar]) {
+ /*
+ * We can only dynamically change a BAR if the new BAR size and
+ * BAR flags do not differ from the existing configuration.
+ */
+ if (ep->epf_bar[bar]->barno != bar ||
+ ep->epf_bar[bar]->size != size ||
+ ep->epf_bar[bar]->flags != flags)
+ return -EINVAL;
+
+ /*
+ * When dynamically changing a BAR, skip writing the BAR reg, as
+ * that would clear the BAR's PCI address assigned by the host.
+ */
goto config_atu;
+ }
reg = PCI_BASE_ADDRESS_0 + (4 * bar);
--
2.47.1