Fixed a possible UAF problem in driver_override_show() in
drivers/bus/fsl-mc/fsl-mc-bus.c
This function driver_override_show() is part of DEVICE_ATTR_RW, which
includes both driver_override_show() and driver_override_store().
These functions can be executed concurrently in sysfs.
The driver_override_store() function uses driver_set_override() to
update the driver_override value, and driver_set_override() internally
locks the device (device_lock(dev)). If driver_override_show() reads
cdx_dev->driver_override without locking, it could potentially access
a freed pointer if driver_override_store() frees the string
concurrently. This could lead to printing a kernel address, which is a
security risk since DEVICE_ATTR can be read by all users.
Additionally, a similar pattern is used in drivers/amba/bus.c, as well
as many other bus drivers, where device_lock() is taken in the show
function, and it has been working without issues.
This potential bug was detected by our experimental static analysis tool,
which analyzes locking APIs and paired functions to identify data races
and atomicity violations.
Fixes: 1f86a00c1159 ("bus/fsl-mc: add support for 'driver_override' in the mc-bus")
Cc: stable(a)vger.kernel.org
Signed-off-by: Qiu-ji Chen <chenqiuji666(a)gmail.com>
---
V2:
Revised the description to reduce the confusion it caused.
---
drivers/bus/fsl-mc/fsl-mc-bus.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index 930d8a3ba722..62a9da88b4c9 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -201,8 +201,12 @@ static ssize_t driver_override_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev);
+ ssize_t len;
- return snprintf(buf, PAGE_SIZE, "%s\n", mc_dev->driver_override);
+ device_lock(dev);
+ len = snprintf(buf, PAGE_SIZE, "%s\n", mc_dev->driver_override);
+ device_unlock(dev);
+ return len;
}
static DEVICE_ATTR_RW(driver_override);
--
2.34.1
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: de31b3cd706347044e1a57d68c3a683d58e8cca4
Gitweb: https://git.kernel.org/tip/de31b3cd706347044e1a57d68c3a683d58e8cca4
Author: Xin Li (Intel) <xin(a)zytor.com>
AuthorDate: Fri, 10 Jan 2025 09:46:39 -08:00
Committer: Dave Hansen <dave.hansen(a)linux.intel.com>
CommitterDate: Tue, 14 Jan 2025 14:16:36 -08:00
x86/fred: Fix the FRED RSP0 MSR out of sync with its per-CPU cache
The FRED RSP0 MSR is only used for delivering events when running
userspace. Linux leverages this property to reduce expensive MSR
writes and optimize context switches. The kernel only writes the
MSR when about to run userspace *and* when the MSR has actually
changed since the last time userspace ran.
This optimization is implemented by maintaining a per-CPU cache of
FRED RSP0 and then checking that against the value for the top of
current task stack before running userspace.
However cpu_init_fred_exceptions() writes the MSR without updating
the per-CPU cache. This means that the kernel might return to
userspace with MSR_IA32_FRED_RSP0==0 when it needed to point to the
top of current task stack. This would induce a double fault (#DF),
which is bad.
A context switch after cpu_init_fred_exceptions() can paper over
the issue since it updates the cached value. That evidently
happens most of the time explaining how this bug got through.
Fix the bug through resynchronizing the FRED RSP0 MSR with its
per-CPU cache in cpu_init_fred_exceptions().
Fixes: fe85ee391966 ("x86/entry: Set FRED RSP0 on return to userspace instead of context switch")
Signed-off-by: Xin Li (Intel) <xin(a)zytor.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Acked-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250110174639.1250829-1-xin%40zytor.com
---
arch/x86/kernel/fred.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/fred.c b/arch/x86/kernel/fred.c
index 8d32c3f..5e2cd10 100644
--- a/arch/x86/kernel/fred.c
+++ b/arch/x86/kernel/fred.c
@@ -50,7 +50,13 @@ void cpu_init_fred_exceptions(void)
FRED_CONFIG_ENTRYPOINT(asm_fred_entrypoint_user));
wrmsrl(MSR_IA32_FRED_STKLVLS, 0);
- wrmsrl(MSR_IA32_FRED_RSP0, 0);
+
+ /*
+ * Ater a CPU offline/online cycle, the FRED RSP0 MSR should be
+ * resynchronized with its per-CPU cache.
+ */
+ wrmsrl(MSR_IA32_FRED_RSP0, __this_cpu_read(fred_rsp0));
+
wrmsrl(MSR_IA32_FRED_RSP1, 0);
wrmsrl(MSR_IA32_FRED_RSP2, 0);
wrmsrl(MSR_IA32_FRED_RSP3, 0);
The following commit has been merged into the irq/urgent branch of tip:
Commit-ID: 0d62a49ab55c99e8deb4593b8d9f923de1ab5c18
Gitweb: https://git.kernel.org/tip/0d62a49ab55c99e8deb4593b8d9f923de1ab5c18
Author: Yogesh Lal <quic_ylal(a)quicinc.com>
AuthorDate: Fri, 20 Dec 2024 15:09:07 +05:30
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Wed, 15 Jan 2025 09:42:44 +01:00
irqchip/gic-v3: Handle CPU_PM_ENTER_FAILED correctly
When a CPU attempts to enter low power mode, it disables the redistributor
and Group 1 interrupts and reinitializes the system registers upon wakeup.
If the transition into low power mode fails, then the CPU_PM framework
invokes the PM notifier callback with CPU_PM_ENTER_FAILED to allow the
drivers to undo the state changes.
The GIC V3 driver ignores CPU_PM_ENTER_FAILED, which leaves the GIC in
disabled state.
Handle CPU_PM_ENTER_FAILED in the same way as CPU_PM_EXIT to restore normal
operation.
[ tglx: Massage change log, add Fixes tag ]
Fixes: 3708d52fc6bb ("irqchip: gic-v3: Implement CPU PM notifier")
Signed-off-by: Yogesh Lal <quic_ylal(a)quicinc.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Marc Zyngier <maz(a)kernel.org>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/all/20241220093907.2747601-1-quic_ylal@quicinc.com
---
drivers/irqchip/irq-gic-v3.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 79d8cc8..76dce0a 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -1522,7 +1522,7 @@ static int gic_retrigger(struct irq_data *data)
static int gic_cpu_pm_notifier(struct notifier_block *self,
unsigned long cmd, void *v)
{
- if (cmd == CPU_PM_EXIT) {
+ if (cmd == CPU_PM_EXIT || cmd == CPU_PM_ENTER_FAILED) {
if (gic_dist_security_disabled())
gic_enable_redist(true);
gic_cpu_sys_reg_enable();
The following call-chain leads to misuse of spinlock_irq
when spinlock_irqsave was hold.
irq_set_vcpu_affinity
-> irq_get_desc_lock (spinlock_irqsave)
-> its_irq_set_vcpu_affinity
-> guard(raw_spin_lock_irq) <--- this enables interrupts
-> irq_put_desc_unlock // <--- WARN IRQs enabled
Fix the issue by using guard(raw_spinlock), since the function is
already called with irqsave and raw_spin_lock was used before the commit
b97e8a2f7130 ("irqchip/gic-v3-its: Fix potential race condition in its_vlpi_prop_update()")
introducing the guard as well.
This was discovered through the lock debugging, and the corresponding
log is as follows:
raw_local_irq_restore() called with IRQs enabled
WARNING: CPU: 38 PID: 444 at kernel/locking/irqflag-debug.c:10 warn_bogus_irq_restore+0x2c/0x38
Call trace:
warn_bogus_irq_restore+0x2c/0x38
_raw_spin_unlock_irqrestore+0x68/0x88
__irq_put_desc_unlock+0x1c/0x48
irq_set_vcpu_affinity+0x74/0xc0
its_map_vlpi+0x44/0x88
kvm_vgic_v4_set_forwarding+0x148/0x230
kvm_arch_irq_bypass_add_producer+0x20/0x28
__connect+0x98/0xb8
irq_bypass_register_consumer+0x150/0x178
kvm_irqfd+0x6dc/0x744
kvm_vm_ioctl+0xe44/0x16b0
Fixes: b97e8a2f7130 ("irqchip/gic-v3-its: Fix potential race condition in its_vlpi_prop_update()")
Signed-off-by: Tomas Krcka <krckatom(a)amazon.de>
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Cc: stable(a)vger.kernel.org
---
drivers/irqchip/irq-gic-v3-its.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 92244cfa0464..8c3ec5734f1e 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -2045,7 +2045,7 @@ static int its_irq_set_vcpu_affinity(struct irq_data *d, void *vcpu_info)
if (!is_v4(its_dev->its))
return -EINVAL;
- guard(raw_spinlock_irq)(&its_dev->event_map.vlpi_lock);
+ guard(raw_spinlock)(&its_dev->event_map.vlpi_lock);
/* Unmap request? */
if (!info)
--
2.40.1
Amazon Web Services Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
There is a data race between the functions driver_override_show() and
driver_override_store(). In the driver_override_store() function, the
assignment to ret calls driver_set_override(), which frees the old value
while writing the new value to dev. If a race occurs, it may cause a
use-after-free (UAF) error in driver_override_show().
To fix this issue, we adopt a logic similar to the driver_override_show()
function in vmbus_drv.c, protecting dev within a lock to ensure its value
remains unchanged.
This possible bug is found by an experimental static analysis tool
developed by our team. This tool analyzes the locking APIs to extract
function pairs that can be concurrently executed, and then analyzes the
instructions in the paired functions to identify possible concurrency bugs
including data races and atomicity violations.
Fixes: 48a6c7bced2a ("cdx: add device attributes")
Cc: stable(a)vger.kernel.org
Signed-off-by: Qiu-ji Chen <chenqiuji666(a)gmail.com>
---
V2:
Modified the title and description.
Removed the changes to cdx_bus_match().
---
drivers/cdx/cdx.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/cdx/cdx.c b/drivers/cdx/cdx.c
index 07371cb653d3..4af1901c9d52 100644
--- a/drivers/cdx/cdx.c
+++ b/drivers/cdx/cdx.c
@@ -470,8 +470,12 @@ static ssize_t driver_override_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct cdx_device *cdx_dev = to_cdx_device(dev);
+ ssize_t len;
- return sysfs_emit(buf, "%s\n", cdx_dev->driver_override);
+ device_lock(dev);
+ len = sysfs_emit(buf, "%s\n", cdx_dev->driver_override);
+ device_unlock(dev);
+ return len;
}
static DEVICE_ATTR_RW(driver_override);
--
2.34.1
From: Cheng Ming Lin <chengminglin(a)mxic.com.tw>
The default dummy cycle for Macronix SPI NOR flash in Octal Output
Read Mode(1-1-8) is 20.
Currently, the dummy buswidth is set according to the address bus width.
In the 1-1-8 mode, this means the dummy buswidth is 1. When converting
dummy cycles to bytes, this results in 20 x 1 / 8 = 2 bytes, causing the
host to read data 4 cycles too early.
Since the protocol data buswidth is always greater than or equal to the
address buswidth. Setting the dummy buswidth to match the data buswidth
increases the likelihood that the dummy cycle-to-byte conversion will be
divisible, preventing the host from reading data prematurely.
Fixes: 0e30f47232ab5 ("mtd: spi-nor: add support for DTR protocol")
Cc: stable(a)vger.kernel.org
Reviewd-by: Pratyush Yadav <pratyush(a)kernel.org>
Signed-off-by: Cheng Ming Lin <chengminglin(a)mxic.com.tw>
---
drivers/mtd/spi-nor/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index f9c189ed7353..c7aceaa8a43f 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -89,7 +89,7 @@ void spi_nor_spimem_setup_op(const struct spi_nor *nor,
op->addr.buswidth = spi_nor_get_protocol_addr_nbits(proto);
if (op->dummy.nbytes)
- op->dummy.buswidth = spi_nor_get_protocol_addr_nbits(proto);
+ op->dummy.buswidth = spi_nor_get_protocol_data_nbits(proto);
if (op->data.nbytes)
op->data.buswidth = spi_nor_get_protocol_data_nbits(proto);
--
2.25.1
From: Darrick J. Wong <djwong(a)kernel.org>
I received a report from the release engineering side of the house that
xfs_scrub without the -n flag (aka fix it mode) would try to fix a
broken filesystem even on a kernel that doesn't have online repair built
into it:
# xfs_scrub -dTvn /mnt/test
EXPERIMENTAL xfs_scrub program in use! Use at your own risk!
Phase 1: Find filesystem geometry.
/mnt/test: using 1 threads to scrub.
Phase 1: Memory used: 132k/0k (108k/25k), time: 0.00/ 0.00/ 0.00s
<snip>
Phase 4: Repair filesystem.
<snip>
Info: /mnt/test/some/victimdir directory entries: Attempting repair. (repair.c line 351)
Corruption: /mnt/test/some/victimdir directory entries: Repair unsuccessful; offline repair required. (repair.c line 204)
Source: https://blogs.oracle.com/linux/post/xfs-online-filesystem-repair
It is strange that xfs_scrub doesn't refuse to run, because the kernel
is supposed to return EOPNOTSUPP if we actually needed to run a repair,
and xfs_io's repair subcommand will perror that. And yet:
# xfs_io -x -c 'repair probe' /mnt/test
#
The first problem is commit dcb660f9222fd9 (4.15) which should have had
xchk_probe set the CORRUPT OFLAG so that any of the repair machinery
will get called at all.
It turns out that some refactoring that happened in the 6.6-6.8 era
broke the operation of this corner case. What we *really* want to
happen is that all the predicates that would steer xfs_scrub_metadata()
towards calling xrep_attempt() should function the same way that they do
when repair is compiled in; and then xrep_attempt gets to return the
fatal EOPNOTSUPP error code that causes the probe to fail.
Instead, commit 8336a64eb75cba (6.6) started the failwhale swimming by
hoisting OFLAG checking logic into a helper whose non-repair stub always
returns false, causing scrub to return "repair not needed" when in fact
the repair is not supported. Prior to that commit, the oflag checking
that was open-coded in scrub.c worked correctly.
Similarly, in commit 4bdfd7d15747b1 (6.8) we hoisted the IFLAG_REPAIR
and ALREADY_FIXED logic into a helper whose non-repair stub always
returns false, so we never enter the if test body that would have called
xrep_attempt, let alone fail to decode the OFLAGs correctly.
The final insult (yes, we're doing The Naked Gun now) is commit
48a72f60861f79 (6.8) in which we hoisted the "are we going to try a
repair?" predicate into yet another function with a non-repair stub
always returns false.
So. Fix xchk_probe to trigger xrep_probe, then fix all the other helper
predicates in scrub.h so that we always point to xrep_attempt, even if
online repair is disabled. Commit 48a72 is tagged here because the
scrub code prior to LTS 6.12 are incomplete and not worth patching.
Reported-by: David Flynn <david.flynn(a)oracle.com>
Cc: <stable(a)vger.kernel.org> # v6.8
Fixes: 48a72f60861f79 ("xfs: don't complain about unfixed metadata when repairs were injected")
Signed-off-by: "Darrick J. Wong" <djwong(a)kernel.org>
---
fs/xfs/scrub/common.h | 5 -----
fs/xfs/scrub/repair.h | 11 ++++++++++-
fs/xfs/scrub/scrub.c | 9 +++++++++
3 files changed, 19 insertions(+), 6 deletions(-)
diff --git a/fs/xfs/scrub/common.h b/fs/xfs/scrub/common.h
index eecb6d54c0f953..53a6b34ce54271 100644
--- a/fs/xfs/scrub/common.h
+++ b/fs/xfs/scrub/common.h
@@ -212,7 +212,6 @@ static inline bool xchk_skip_xref(struct xfs_scrub_metadata *sm)
bool xchk_dir_looks_zapped(struct xfs_inode *dp);
bool xchk_pptr_looks_zapped(struct xfs_inode *ip);
-#ifdef CONFIG_XFS_ONLINE_REPAIR
/* Decide if a repair is required. */
static inline bool xchk_needs_repair(const struct xfs_scrub_metadata *sm)
{
@@ -232,10 +231,6 @@ static inline bool xchk_could_repair(const struct xfs_scrub *sc)
return (sc->sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR) &&
!(sc->flags & XREP_ALREADY_FIXED);
}
-#else
-# define xchk_needs_repair(sc) (false)
-# define xchk_could_repair(sc) (false)
-#endif /* CONFIG_XFS_ONLINE_REPAIR */
int xchk_metadata_inode_forks(struct xfs_scrub *sc);
diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
index b649da1a93eb8c..b3b1fe62814e7b 100644
--- a/fs/xfs/scrub/repair.h
+++ b/fs/xfs/scrub/repair.h
@@ -173,7 +173,16 @@ bool xrep_buf_verify_struct(struct xfs_buf *bp, const struct xfs_buf_ops *ops);
#else
#define xrep_ino_dqattach(sc) (0)
-#define xrep_will_attempt(sc) (false)
+
+/*
+ * When online repair is not built into the kernel, we still want to attempt
+ * the repair so that the stub xrep_attempt below will return EOPNOTSUPP.
+ */
+static inline bool xrep_will_attempt(const struct xfs_scrub *sc)
+{
+ return (sc->sm->sm_flags & XFS_SCRUB_IFLAG_FORCE_REBUILD) ||
+ xchk_needs_repair(sc->sm);
+}
static inline int
xrep_attempt(
diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
index 950f5a58dcd967..09468f50781b24 100644
--- a/fs/xfs/scrub/scrub.c
+++ b/fs/xfs/scrub/scrub.c
@@ -149,6 +149,15 @@ xchk_probe(
if (xchk_should_terminate(sc, &error))
return error;
+ /*
+ * If the caller is probing to see if repair works, set the CORRUPT
+ * flag (without any of the usual tracing/logging) to force us into
+ * the repair codepaths. If repair is compiled into the kernel, we'll
+ * call xrep_probe and simulate a repair; otherwise, the repair
+ * codepaths return EOPNOTSUPP.
+ */
+ if (xchk_could_repair(sc))
+ sc->sm->sm_flags |= XFS_SCRUB_OFLAG_CORRUPT;
return 0;
}
Currently, DWC3 driver attempts to acquire the USB power supply only
once during the probe. If the USB power supply is not ready at that
time, the driver simply ignores the failure and continues the probe,
leading to permanent non-functioning of the gadget vbus_draw callback.
Address this problem by delaying the dwc3 driver initialization until
the USB power supply is registered.
Fixes: 6f0764b5adea ("usb: dwc3: add a power supply for current control")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kyle Tso <kyletso(a)google.com>
---
Note: This is a follow-up of https://lore.kernel.org/all/20240804084612.2561230-1-kyletso@google.com/
---
drivers/usb/dwc3/core.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 7578c5133568..1550c39e792a 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -1669,7 +1669,7 @@ static void dwc3_get_software_properties(struct dwc3 *dwc)
}
}
-static void dwc3_get_properties(struct dwc3 *dwc)
+static int dwc3_get_properties(struct dwc3 *dwc)
{
struct device *dev = dwc->dev;
u8 lpm_nyet_threshold;
@@ -1724,7 +1724,7 @@ static void dwc3_get_properties(struct dwc3 *dwc)
if (ret >= 0) {
dwc->usb_psy = power_supply_get_by_name(usb_psy_name);
if (!dwc->usb_psy)
- dev_err(dev, "couldn't get usb power supply\n");
+ return dev_err_probe(dev, -EPROBE_DEFER, "couldn't get usb power supply\n");
}
dwc->has_lpm_erratum = device_property_read_bool(dev,
@@ -1847,6 +1847,8 @@ static void dwc3_get_properties(struct dwc3 *dwc)
dwc->imod_interval = 0;
dwc->tx_fifo_resize_max_num = tx_fifo_resize_max_num;
+
+ return 0;
}
/* check whether the core supports IMOD */
@@ -2181,7 +2183,9 @@ static int dwc3_probe(struct platform_device *pdev)
dwc->regs = regs;
dwc->regs_size = resource_size(&dwc_res);
- dwc3_get_properties(dwc);
+ ret = dwc3_get_properties(dwc);
+ if (ret)
+ return ret;
dwc3_get_software_properties(dwc);
--
2.47.1.688.g23fc6f90ad-goog