commit 8e1278444446fc97778a5e5c99bca1ce0bbc5ec9 upstream.
The ptrace PEEKUSR/POKEUSR (aka PEEKUSER/POKEUSER) API allows a process
to read/write registers of another process.
To get/set a register, the API takes an index into an imaginary address
space called the "USER area", where the registers of the process are
laid out in some fashion.
The kernel then maps that index to a particular register in its own data
structures and gets/sets the value.
The API only allows a single machine-word to be read/written at a time.
So 4 bytes on 32-bit kernels and 8 bytes on 64-bit kernels.
The way floating point registers (FPRs) are addressed is somewhat
complicated, because double precision float values are 64-bit even on
32-bit CPUs. That means on 32-bit kernels each FPR occupies two
word-sized locations in the USER area. On 64-bit kernels each FPR
occupies one word-sized location in the USER area.
Internally the kernel stores the FPRs in an array of u64s, or if VSX is
enabled, an array of pairs of u64s where one half of each pair stores
the FPR. Which half of the pair stores the FPR depends on the kernel's
endianness.
To handle the different layouts of the FPRs depending on VSX/no-VSX and
big/little endian, the TS_FPR() macro was introduced.
Unfortunately the TS_FPR() macro does not take into account the fact
that the addressing of each FPR differs between 32-bit and 64-bit
kernels. It just takes the index into the "USER area" passed from
userspace and indexes into the fp_state.fpr array.
On 32-bit there are 64 indexes that address FPRs, but only 32 entries in
the fp_state.fpr array, meaning the user can read/write 256 bytes past
the end of the array. Because the fp_state sits in the middle of the
thread_struct there are various fields than can be overwritten,
including some pointers. As such it may be exploitable.
It has also been observed to cause systems to hang or otherwise
misbehave when using gdbserver, and is probably the root cause of this
report which could not be easily reproduced:
https://lore.kernel.org/linuxppc-dev/dc38afe9-6b78-f3f5-666b-986939e40fc6@k…
Rather than trying to make the TS_FPR() macro even more complicated to
fix the bug, or add more macros, instead add a special-case for 32-bit
kernels. This is more obvious and hopefully avoids a similar bug
happening again in future.
Note that because 32-bit kernels never have VSX enabled the code doesn't
need to consider TS_FPRWIDTH/OFFSET at all. Add a BUILD_BUG_ON() to
ensure that 32-bit && VSX is never enabled.
Fixes: 87fec0514f61 ("powerpc: PTRACE_PEEKUSR/PTRACE_POKEUSER of FPR registers in little endian builds")
Cc: stable(a)vger.kernel.org # v3.13+
Reported-by: Ariel Miculas <ariel.miculas(a)belden.com>
Tested-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/20220609133245.573565-1-mpe@ellerman.id.au
---
arch/powerpc/kernel/ptrace/ptrace-fpu.c | 20 ++++++++++++++------
arch/powerpc/kernel/ptrace/ptrace.c | 3 +++
2 files changed, 17 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/kernel/ptrace/ptrace-fpu.c b/arch/powerpc/kernel/ptrace/ptrace-fpu.c
index 5dca19361316..09c49632bfe5 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-fpu.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-fpu.c
@@ -17,9 +17,13 @@ int ptrace_get_fpr(struct task_struct *child, int index, unsigned long *data)
#ifdef CONFIG_PPC_FPU_REGS
flush_fp_to_thread(child);
- if (fpidx < (PT_FPSCR - PT_FPR0))
- memcpy(data, &child->thread.TS_FPR(fpidx), sizeof(long));
- else
+ if (fpidx < (PT_FPSCR - PT_FPR0)) {
+ if (IS_ENABLED(CONFIG_PPC32))
+ // On 32-bit the index we are passed refers to 32-bit words
+ *data = ((u32 *)child->thread.fp_state.fpr)[fpidx];
+ else
+ memcpy(data, &child->thread.TS_FPR(fpidx), sizeof(long));
+ } else
*data = child->thread.fp_state.fpscr;
#else
*data = 0;
@@ -39,9 +43,13 @@ int ptrace_put_fpr(struct task_struct *child, int index, unsigned long data)
#ifdef CONFIG_PPC_FPU_REGS
flush_fp_to_thread(child);
- if (fpidx < (PT_FPSCR - PT_FPR0))
- memcpy(&child->thread.TS_FPR(fpidx), &data, sizeof(long));
- else
+ if (fpidx < (PT_FPSCR - PT_FPR0)) {
+ if (IS_ENABLED(CONFIG_PPC32))
+ // On 32-bit the index we are passed refers to 32-bit words
+ ((u32 *)child->thread.fp_state.fpr)[fpidx] = data;
+ else
+ memcpy(&child->thread.TS_FPR(fpidx), &data, sizeof(long));
+ } else
child->thread.fp_state.fpscr = data;
#endif
diff --git a/arch/powerpc/kernel/ptrace/ptrace.c b/arch/powerpc/kernel/ptrace/ptrace.c
index 6d5026a9db4f..eab6fa59d9d2 100644
--- a/arch/powerpc/kernel/ptrace/ptrace.c
+++ b/arch/powerpc/kernel/ptrace/ptrace.c
@@ -450,4 +450,7 @@ void __init pt_regs_check(void)
#else
BUILD_BUG_ON(IS_ENABLED(CONFIG_HAVE_FUNCTION_DESCRIPTORS));
#endif
+
+ // ptrace_get/put_fpr() rely on PPC32 and VSX being incompatible
+ BUILD_BUG_ON(IS_ENABLED(CONFIG_PPC32) && IS_ENABLED(CONFIG_VSX));
}
--
2.35.3
When fallocate() is used on a shmem file, the pages we allocate can end
up with !PageUptodate.
Since UFFDIO_CONTINUE tries to find the existing page the user wants to
map with SGP_READ, we would fail to find such a page, since
shmem_getpage_gfp returns with a "NULL" pagep for SGP_READ if it
discovers !PageUptodate. As a result, UFFDIO_CONTINUE returns -EFAULT,
as it would do if the page wasn't found in the page cache at all.
This isn't the intended behavior. UFFDIO_CONTINUE is just trying to find
if a page exists, and doesn't care whether it still needs to be cleared
or not. So, instead of SGP_READ, pass in SGP_NOALLOC. This is the same,
except for one critical difference: in the !PageUptodate case,
SGP_NOALLOC will clear the page and then return it. With this change,
UFFDIO_CONTINUE works properly (succeeds) on a shmem file which has been
fallocated, but otherwise not modified.
Fixes: 153132571f02 ("userfaultfd/shmem: support UFFDIO_CONTINUE for shmem")
Cc: stable(a)vger.kernel.org
Signed-off-by: Axel Rasmussen <axelrasmussen(a)google.com>
---
mm/userfaultfd.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 4f4892a5f767..07d3befc80e4 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -246,7 +246,10 @@ static int mcontinue_atomic_pte(struct mm_struct *dst_mm,
struct page *page;
int ret;
- ret = shmem_getpage(inode, pgoff, &page, SGP_READ);
+ ret = shmem_getpage(inode, pgoff, &page, SGP_NOALLOC);
+ /* Our caller expects us to return -EFAULT if we failed to find page. */
+ if (ret == -ENOENT)
+ ret = -EFAULT;
if (ret)
goto out;
if (!page) {
--
2.36.1.476.g0c4daa206d-goog
When fallocate() is used on a shmem file, the pages we allocate can end
up with !PageUptodate.
Since UFFDIO_CONTINUE tries to find the existing page the user wants to
map with SGP_READ, we would fail to find such a page, since
shmem_getpage_gfp returns with a "NULL" pagep for SGP_READ if it
discovers !PageUptodate. As a result, UFFDIO_CONTINUE returns -EFAULT,
as it would do if the page wasn't found in the page cache at all.
This isn't the intended behavior. UFFDIO_CONTINUE is just trying to find
if a page exists, and doesn't care whether it still needs to be cleared
or not. So, instead of SGP_READ, pass in SGP_NOALLOC. This is the same,
except for one critical difference: in the !PageUptodate case,
SGP_NOALLOC will clear the page and then return it. With this change,
UFFDIO_CONTINUE works properly (succeeds) on a shmem file which has been
fallocated, but otherwise not modified.
Fixes: 153132571f02 ("userfaultfd/shmem: support UFFDIO_CONTINUE for shmem")
Cc: stable(a)vger.kernel.org
Signed-off-by: Axel Rasmussen <axelrasmussen(a)google.com>
---
mm/userfaultfd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 4f4892a5f767..c156f7f5b854 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -246,7 +246,7 @@ static int mcontinue_atomic_pte(struct mm_struct *dst_mm,
struct page *page;
int ret;
- ret = shmem_getpage(inode, pgoff, &page, SGP_READ);
+ ret = shmem_getpage(inode, pgoff, &page, SGP_NOALLOC);
if (ret)
goto out;
if (!page) {
--
2.36.1.255.ge46751e96f-goog
This is a preparation patch for the S29GL064N buffer writes fix. There
is no functional change.
Link: https://lore.kernel.org/r/b687c259-6413-26c9-d4c9-b3afa69ea124@pengutronix.…
Fixes: dfeae1073583("mtd: cfi_cmdset_0002: Change write buffer to check correct value")
Signed-off-by: Tokunori Ikegami <ikegami.t(a)gmail.com>
Cc: stable(a)vger.kernel.org
Acked-by: Vignesh Raghavendra <vigneshr(a)ti.com>
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20220323170458.5608-2-ikegami.t@gmail.com
---
drivers/mtd/chips/cfi_cmdset_0002.c | 77 ++++++++++++-----------------
1 file changed, 32 insertions(+), 45 deletions(-)
diff --git a/drivers/mtd/chips/cfi_cmdset_0002.c b/drivers/mtd/chips/cfi_cmdset_0002.c
index 870d1f1331b1..a8e1a961e844 100644
--- a/drivers/mtd/chips/cfi_cmdset_0002.c
+++ b/drivers/mtd/chips/cfi_cmdset_0002.c
@@ -730,50 +730,34 @@ static struct mtd_info *cfi_amdstd_setup(struct mtd_info *mtd)
}
/*
- * Return true if the chip is ready.
+ * Return true if the chip is ready and has the correct value.
*
* Ready is one of: read mode, query mode, erase-suspend-read mode (in any
* non-suspended sector) and is indicated by no toggle bits toggling.
*
+ * Error are indicated by toggling bits or bits held with the wrong value,
+ * or with bits toggling.
+ *
* Note that anything more complicated than checking if no bits are toggling
* (including checking DQ5 for an error status) is tricky to get working
* correctly and is therefore not done (particularly with interleaved chips
* as each chip must be checked independently of the others).
*/
-static int __xipram chip_ready(struct map_info *map, unsigned long addr)
+static int __xipram chip_ready(struct map_info *map, unsigned long addr,
+ map_word *expected)
{
map_word d, t;
+ int ret;
d = map_read(map, addr);
t = map_read(map, addr);
- return map_word_equal(map, d, t);
-}
+ ret = map_word_equal(map, d, t);
-/*
- * Return true if the chip is ready and has the correct value.
- *
- * Ready is one of: read mode, query mode, erase-suspend-read mode (in any
- * non-suspended sector) and it is indicated by no bits toggling.
- *
- * Error are indicated by toggling bits or bits held with the wrong value,
- * or with bits toggling.
- *
- * Note that anything more complicated than checking if no bits are toggling
- * (including checking DQ5 for an error status) is tricky to get working
- * correctly and is therefore not done (particularly with interleaved chips
- * as each chip must be checked independently of the others).
- *
- */
-static int __xipram chip_good(struct map_info *map, unsigned long addr, map_word expected)
-{
- map_word oldd, curd;
-
- oldd = map_read(map, addr);
- curd = map_read(map, addr);
+ if (!ret || !expected)
+ return ret;
- return map_word_equal(map, oldd, curd) &&
- map_word_equal(map, curd, expected);
+ return map_word_equal(map, t, *expected);
}
static int get_chip(struct map_info *map, struct flchip *chip, unsigned long adr, int mode)
@@ -790,7 +774,7 @@ static int get_chip(struct map_info *map, struct flchip *chip, unsigned long adr
case FL_STATUS:
for (;;) {
- if (chip_ready(map, adr))
+ if (chip_ready(map, adr, NULL))
break;
if (time_after(jiffies, timeo)) {
@@ -828,7 +812,7 @@ static int get_chip(struct map_info *map, struct flchip *chip, unsigned long adr
chip->state = FL_ERASE_SUSPENDING;
chip->erase_suspended = 1;
for (;;) {
- if (chip_ready(map, adr))
+ if (chip_ready(map, adr, NULL))
break;
if (time_after(jiffies, timeo)) {
@@ -1361,7 +1345,7 @@ static int do_otp_lock(struct map_info *map, struct flchip *chip, loff_t adr,
/* wait for chip to become ready */
timeo = jiffies + msecs_to_jiffies(2);
for (;;) {
- if (chip_ready(map, adr))
+ if (chip_ready(map, adr, NULL))
break;
if (time_after(jiffies, timeo)) {
@@ -1628,10 +1612,11 @@ static int __xipram do_write_oneword(struct map_info *map, struct flchip *chip,
}
/*
- * We check "time_after" and "!chip_good" before checking
- * "chip_good" to avoid the failure due to scheduling.
+ * We check "time_after" and "!chip_ready" before checking
+ * "chip_ready" to avoid the failure due to scheduling.
*/
- if (time_after(jiffies, timeo) && !chip_good(map, adr, datum)) {
+ if (time_after(jiffies, timeo) &&
+ !chip_ready(map, adr, &datum)) {
xip_enable(map, chip, adr);
printk(KERN_WARNING "MTD %s(): software timeout\n", __func__);
xip_disable(map, chip, adr);
@@ -1639,7 +1624,7 @@ static int __xipram do_write_oneword(struct map_info *map, struct flchip *chip,
break;
}
- if (chip_good(map, adr, datum))
+ if (chip_ready(map, adr, &datum))
break;
/* Latency issues. Drop the lock, wait a while and retry */
@@ -1883,13 +1868,13 @@ static int __xipram do_write_buffer(struct map_info *map, struct flchip *chip,
}
/*
- * We check "time_after" and "!chip_good" before checking "chip_good" to avoid
- * the failure due to scheduling.
+ * We check "time_after" and "!chip_ready" before checking
+ * "chip_ready" to avoid the failure due to scheduling.
*/
- if (time_after(jiffies, timeo) && !chip_good(map, adr, datum))
+ if (time_after(jiffies, timeo) && !chip_ready(map, adr, &datum))
break;
- if (chip_good(map, adr, datum)) {
+ if (chip_ready(map, adr, &datum)) {
xip_enable(map, chip, adr);
goto op_done;
}
@@ -2023,7 +2008,7 @@ static int cfi_amdstd_panic_wait(struct map_info *map, struct flchip *chip,
* If the driver thinks the chip is idle, and no toggle bits
* are changing, then the chip is actually idle for sure.
*/
- if (chip->state == FL_READY && chip_ready(map, adr))
+ if (chip->state == FL_READY && chip_ready(map, adr, NULL))
return 0;
/*
@@ -2040,7 +2025,7 @@ static int cfi_amdstd_panic_wait(struct map_info *map, struct flchip *chip,
/* wait for the chip to become ready */
for (i = 0; i < jiffies_to_usecs(timeo); i++) {
- if (chip_ready(map, adr))
+ if (chip_ready(map, adr, NULL))
return 0;
udelay(1);
@@ -2104,13 +2089,13 @@ static int do_panic_write_oneword(struct map_info *map, struct flchip *chip,
map_write(map, datum, adr);
for (i = 0; i < jiffies_to_usecs(uWriteTimeout); i++) {
- if (chip_ready(map, adr))
+ if (chip_ready(map, adr, NULL))
break;
udelay(1);
}
- if (!chip_good(map, adr, datum)) {
+ if (!chip_ready(map, adr, &datum)) {
/* reset on all failures. */
map_write(map, CMD(0xF0), chip->start);
/* FIXME - should have reset delay before continuing */
@@ -2251,6 +2236,7 @@ static int __xipram do_erase_chip(struct map_info *map, struct flchip *chip)
DECLARE_WAITQUEUE(wait, current);
int ret = 0;
int retry_cnt = 0;
+ map_word datum = map_word_ff(map);
adr = cfi->addr_unlock1;
@@ -2305,7 +2291,7 @@ static int __xipram do_erase_chip(struct map_info *map, struct flchip *chip)
chip->erase_suspended = 0;
}
- if (chip_good(map, adr, map_word_ff(map)))
+ if (chip_ready(map, adr, &datum))
break;
if (time_after(jiffies, timeo)) {
@@ -2347,6 +2333,7 @@ static int __xipram do_erase_oneblock(struct map_info *map, struct flchip *chip,
DECLARE_WAITQUEUE(wait, current);
int ret = 0;
int retry_cnt = 0;
+ map_word datum = map_word_ff(map);
adr += chip->start;
@@ -2401,7 +2388,7 @@ static int __xipram do_erase_oneblock(struct map_info *map, struct flchip *chip,
chip->erase_suspended = 0;
}
- if (chip_good(map, adr, map_word_ff(map))) {
+ if (chip_ready(map, adr, &datum)) {
xip_enable(map, chip, adr);
break;
}
@@ -2616,7 +2603,7 @@ static int __maybe_unused do_ppb_xxlock(struct map_info *map,
*/
timeo = jiffies + msecs_to_jiffies(2000); /* 2s max (un)locking */
for (;;) {
- if (chip_ready(map, adr))
+ if (chip_ready(map, adr, NULL))
break;
if (time_after(jiffies, timeo)) {
--
2.34.1
Invalidating the buffer memory in arch_sync_dma_for_device() for
FROM_DEVICE transfers
When using the streaming DMA API to map a buffer prior to inbound
non-coherent DMA (i.e. DMA_FROM_DEVICE), we invalidate any dirty CPU
cachelines so that they will not be written back during the transfer and
corrupt the buffer contents written by the DMA. This, however, poses two
potential problems:
(1) If the DMA transfer does not write to every byte in the buffer,
then the unwritten bytes will contain stale data once the transfer
has completed.
(2) If the buffer has a virtual alias in userspace, then stale data
may be visible via this alias during the period between performing
the cache invalidation and the DMA writes landing in memory.
Address both of these issues by cleaning (aka writing-back) the dirty
lines in arch_sync_dma_for_device(DMA_FROM_DEVICE) instead of discarding
them using invalidation.
Cc: Ard Biesheuvel <ardb(a)kernel.org>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Robin Murphy <robin.murphy(a)arm.com>
Cc: Russell King <linux(a)armlinux.org.uk>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20220606152150.GA31568@willie-the-truck
Signed-off-by: Will Deacon <will(a)kernel.org>
---
arch/arm64/mm/cache.S | 2 --
1 file changed, 2 deletions(-)
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index 0ea6cc25dc66..21c907987080 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -218,8 +218,6 @@ SYM_FUNC_ALIAS(__dma_flush_area, __pi___dma_flush_area)
*/
SYM_FUNC_START(__pi___dma_map_area)
add x1, x0, x1
- cmp w2, #DMA_FROM_DEVICE
- b.eq __pi_dcache_inval_poc
b __pi_dcache_clean_poc
SYM_FUNC_END(__pi___dma_map_area)
SYM_FUNC_ALIAS(__dma_map_area, __pi___dma_map_area)
--
2.36.1.476.g0c4daa206d-goog