The subordinate pointer can be null if we are out of bus number. The
function of_pci_prop_bus_range() deferences this pointer without checking
and may crashes the kernel.
This crash can be reproduced by starting a QEMU instance:
qemu-system-riscv64 -machine virt \
-kernel ../build-pci-riscv/arch/riscv/boot/Image \
-append "console=ttyS0 root=/dev/vda" \
-nographic \
-drive "file=root.img,format=raw,id=hd0" \
-device virtio-blk-device,drive=hd0 \
-device pcie-root-port,bus=pcie.0,slot=1,id=rp1,bus-reserve=0 \
-device pcie-pci-bridge,id=br1,bus=rp1
Then hot-add a bridge with
device_add pci-bridge,id=br2,bus=br1,chassis_nr=1,addr=1
Then the kernel crashes:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000088
[snip]
[<ffffffff804db234>] of_pci_add_properties+0x34c/0x3c6
[<ffffffff804c8228>] of_pci_make_dev_node+0xb6/0x116
[<ffffffff804a6b72>] pci_bus_add_device+0xa8/0xaa
[<ffffffff804a6ba4>] pci_bus_add_devices+0x30/0x6a
[<ffffffff804d3b5c>] shpchp_configure_device+0xa0/0x112
[<ffffffff804d2b3a>] board_added+0xce/0x354
[<ffffffff804d2e9a>] shpchp_enable_slot+0xda/0x30c
[<ffffffff804d336c>] shpchp_pushbutton_thread+0x84/0xa0
NULL check this pointer first before proceeding.
Fixes: 407d1a51921e ("PCI: Create device tree node for bridge")
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
Cc: stable(a)vger.kernel.org
---
drivers/pci/of_property.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/pci/of_property.c b/drivers/pci/of_property.c
index c2c7334152bc..5fb516807ba2 100644
--- a/drivers/pci/of_property.c
+++ b/drivers/pci/of_property.c
@@ -91,6 +91,9 @@ static int of_pci_prop_bus_range(struct pci_dev *pdev,
struct of_changeset *ocs,
struct device_node *np)
{
+ if (!pdev->subordinate)
+ return 0;
+
u32 bus_range[] = { pdev->subordinate->busn_res.start,
pdev->subordinate->busn_res.end };
--
2.39.2
pci_dev->subordinate pointer can be NULL if we run out of bus number. The
driver deferences this pointer without checking, and the kernel crashes.
This crash can be reproduced by starting a QEMU instance:
qemu-system-x86_64 -machine pc-q35-2.10 \
-kernel bzImage \
-drive "file=img,format=raw" \
-m 2048 -smp 1 -enable-kvm \
-append "console=ttyS0 root=/dev/sda debug" \
-nographic \
-device pcie-root-port,bus=pcie.0,slot=1,id=rp1 \
-device pcie-pci-bridge,id=br1,bus=rp1
Then hot-add a bridge with the QEMU command:
device_add pci-bridge,id=br2,bus=br1,chassis_nr=1,addr=1
Then the kernel crashes:
shpchp 0000:02:01.0: enabling device (0000 -> 0002)
shpchp 0000:02:01.0: enabling bus mastering
BUG: kernel NULL pointer dereference, address: 00000000000000da
[snip]
Call Trace:
<TASK>
? show_regs+0x63/0x70
? __die+0x23/0x70
? page_fault_oops+0x17a/0x480
? shpc_init+0x3fb/0x9d0
? search_module_extables+0x4e/0x80
? shpc_init+0x3fb/0x9d0
? kernelmode_fixup_or_oops+0x9b/0x120
? __bad_area_nosemaphore+0x16e/0x240
? bad_area_nosemaphore+0x11/0x20
? do_user_addr_fault+0x2a3/0x610
? exc_page_fault+0x6d/0x160
? asm_exc_page_fault+0x2b/0x30
? shpc_init+0x3fb/0x9d0
shpc_probe+0x92/0x390
NULL check this pointer first before proceeding. If there is no
secondary bus number, there is no point in initializing this hot-plug
controller, so just bails out.
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
Cc: stable(a)vger.kernel.org # all
---
This one exists since beginning of git history. So I didn't bother
with a Fixes: tag.
This patch is almost a copy-paste from pciehp
---
drivers/pci/hotplug/shpchp_core.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/pci/hotplug/shpchp_core.c b/drivers/pci/hotplug/shpchp_core.c
index 56c7795ed890..14cf9e894201 100644
--- a/drivers/pci/hotplug/shpchp_core.c
+++ b/drivers/pci/hotplug/shpchp_core.c
@@ -262,6 +262,12 @@ static int shpc_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
if (acpi_get_hp_hw_control_from_firmware(pdev))
return -ENODEV;
+ if (!pdev->subordinate) {
+ /* Can happen if we run out of bus numbers during probe */
+ pci_err(pdev, "Hotplug bridge without secondary bus, ignoring\n");
+ return -ENODEV;
+ }
+
ctrl = kzalloc(sizeof(*ctrl), GFP_KERNEL);
if (!ctrl)
goto err_out_none;
--
2.39.2
In function pci_scan_bridge_extend(), if the variable next_busnr gets to
256, "child = pci_find_bus()" will return bus 0 (root bus). Consequently,
we have a circular PCI topology. The scan will then go in circle until the
kernel crashes due to stack overflow.
This can be reproduced with:
qemu-system-x86_64 -machine pc-q35-2.10 \
-kernel bzImage \
-m 2048 -smp 1 -enable-kvm \
-append "console=ttyS0 root=/dev/sda debug" \
-nographic \
-device pcie-root-port,bus=pcie.0,slot=1,id=rp1,bus-reserve=253 \
-device pcie-root-port,bus=pcie.0,slot=2,id=rp2,bus-reserve=0 \
-device pcie-root-port,bus=pcie.0,slot=3,id=rp3,bus-reserve=0
Check if next_busnr "overflow" and bail out if this is the case.
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
Cc: stable(a)vger.kernel.org # all
---
This bug exists since the beginning of git history. So I didn't bother
tracing beyond git to see which patch introduced this.
---
drivers/pci/probe.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 1325fbae2f28..03caae76337c 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1382,6 +1382,9 @@ static int pci_scan_bridge_extend(struct pci_bus *bus, struct pci_dev *dev,
else
next_busnr = max + 1;
+ if (next_busnr == 256)
+ goto out;
+
/*
* Prevent assigning a bus number that already exists.
* This can happen when a bridge is hot-plugged, so in this
--
2.39.2
On Thu, May 30, 2024 at 10:05 PM Sasha Levin <sashal(a)kernel.org> wrote:
>
> This is a note to let you know that I've just added the patch titled
>
> ovl: add helper ovl_file_modified()
>
> to the 6.6-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> ovl-add-helper-ovl_file_modified.patch
> and it can be found in the queue-6.6 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
>
> commit f87db32c0cdadc7eea4a37560867da0bd0bb87e8
> Author: Amir Goldstein <amir73il(a)gmail.com>
> Date: Wed Sep 27 13:43:44 2023 +0300
>
> ovl: add helper ovl_file_modified()
>
> [ Upstream commit c002728f608183449673818076380124935e6b9b ]
>
> A simple wrapper for updating ovl inode size/mtime, to conform
> with ovl_file_accessed().
>
> Signed-off-by: Amir Goldstein <amir73il(a)gmail.com>
> Stable-dep-of: 7c98f7cb8fda ("remove call_{read,write}_iter() functions")
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
No objection to this patch, except for the fact that I think it is not
in the best
interest of the stable tree to backport 7c98f7cb8fda as is.
I suggest that you consider backporting only the parts of 7c98f7cb8fda that
open code call_{read,write}_iter() in call sites (some or all), if you
need those
as dependencies but actually leave the wrappers in the stable tree.
If the bots selected 7c98f7cb8fda to stable because of the Fixes: tag,
then I think that Fixes: tag was misleading the stable bots in this case.
Thanks,
Amir.
> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> index 8be4dc050d1ed..9fd88579bfbfb 100644
> --- a/fs/overlayfs/file.c
> +++ b/fs/overlayfs/file.c
> @@ -235,6 +235,12 @@ static loff_t ovl_llseek(struct file *file, loff_t offset, int whence)
> return ret;
> }
>
> +static void ovl_file_modified(struct file *file)
> +{
> + /* Update size/mtime */
> + ovl_copyattr(file_inode(file));
> +}
> +
> static void ovl_file_accessed(struct file *file)
> {
> struct inode *inode, *upperinode;
> @@ -290,10 +296,8 @@ static void ovl_aio_cleanup_handler(struct ovl_aio_req *aio_req)
> struct kiocb *orig_iocb = aio_req->orig_iocb;
>
> if (iocb->ki_flags & IOCB_WRITE) {
> - struct inode *inode = file_inode(orig_iocb->ki_filp);
> -
> kiocb_end_write(iocb);
> - ovl_copyattr(inode);
> + ovl_file_modified(orig_iocb->ki_filp);
> }
>
> orig_iocb->ki_pos = iocb->ki_pos;
> @@ -403,7 +407,7 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter)
> ovl_iocb_to_rwf(ifl));
> file_end_write(real.file);
> /* Update size */
> - ovl_copyattr(inode);
> + ovl_file_modified(file);
> } else {
> struct ovl_aio_req *aio_req;
>
> @@ -489,7 +493,7 @@ static ssize_t ovl_splice_write(struct pipe_inode_info *pipe, struct file *out,
>
> file_end_write(real.file);
> /* Update size */
> - ovl_copyattr(inode);
> + ovl_file_modified(out);
> revert_creds(old_cred);
> fdput(real);
>
> @@ -570,7 +574,7 @@ static long ovl_fallocate(struct file *file, int mode, loff_t offset, loff_t len
> revert_creds(old_cred);
>
> /* Update size */
> - ovl_copyattr(inode);
> + ovl_file_modified(file);
>
> fdput(real);
>
> @@ -654,7 +658,7 @@ static loff_t ovl_copyfile(struct file *file_in, loff_t pos_in,
> revert_creds(old_cred);
>
> /* Update size */
> - ovl_copyattr(inode_out);
> + ovl_file_modified(file_out);
>
> fdput(real_in);
> fdput(real_out);
On Thu, May 30, 2024 at 10:05 PM Sasha Levin <sashal(a)kernel.org> wrote:
>
> This is a note to let you know that I've just added the patch titled
>
> fs: move kiocb_start_write() into vfs_iocb_iter_write()
>
> to the 6.6-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> fs-move-kiocb_start_write-into-vfs_iocb_iter_write.patch
> and it can be found in the queue-6.6 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
>
> commit 17f38d69e7960a2b346db04750b0e4ba867c0b83
> Author: Amir Goldstein <amir73il(a)gmail.com>
> Date: Wed Nov 22 14:27:12 2023 +0200
>
> fs: move kiocb_start_write() into vfs_iocb_iter_write()
>
> [ Upstream commit 6ae654392bb516a0baa47fed1f085d84e8cad739 ]
>
> In vfs code, sb_start_write() is usually called after the permission hook
> in rw_verify_area(). vfs_iocb_iter_write() is an exception to this rule,
> where kiocb_start_write() is called by its callers.
>
> Move kiocb_start_write() from the callers into vfs_iocb_iter_write()
> after the rw_verify_area() checks, to make them "start-write-safe".
>
> The semantics of vfs_iocb_iter_write() is changed, so that the caller is
> responsible for calling kiocb_end_write() on completion only if async
> iocb was queued. The completion handlers of both callers were adapted
> to this semantic change.
>
This comment about semantics change looks like a clue from my past
self that backporting this commit as standalone is risky.
This commit was part of a pretty big shuffle in splice and ovl code.
I'd feel much more comfortable with backporting the entire ovl
series 14ab6d425e8067..5b02bfc1e7e3 and splice series
v6.7..6ae654392bb51 than just 3 individual commits in the middle.
Thanks,
Amir.
> This is needed for fanotify "pre content" events.
>
> Suggested-by: Jan Kara <jack(a)suse.cz>
> Suggested-by: Josef Bacik <josef(a)toxicpanda.com>
> Signed-off-by: Amir Goldstein <amir73il(a)gmail.com>
> Link: https://lore.kernel.org/r/20231122122715.2561213-14-amir73il@gmail.com
> Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
> Reviewed-by: Jan Kara <jack(a)suse.cz>
> Signed-off-by: Christian Brauner <brauner(a)kernel.org>
> Stable-dep-of: 7c98f7cb8fda ("remove call_{read,write}_iter() functions")
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c
> index 009d23cd435b5..5857241c59181 100644
> --- a/fs/cachefiles/io.c
> +++ b/fs/cachefiles/io.c
> @@ -259,7 +259,8 @@ static void cachefiles_write_complete(struct kiocb *iocb, long ret)
>
> _enter("%ld", ret);
>
> - kiocb_end_write(iocb);
> + if (ki->was_async)
> + kiocb_end_write(iocb);
>
> if (ret < 0)
> trace_cachefiles_io_error(object, inode, ret,
> @@ -319,8 +320,6 @@ int __cachefiles_write(struct cachefiles_object *object,
> ki->iocb.ki_complete = cachefiles_write_complete;
> atomic_long_add(ki->b_writing, &cache->b_writing);
>
> - kiocb_start_write(&ki->iocb);
> -
> get_file(ki->iocb.ki_filp);
> cachefiles_grab_object(object, cachefiles_obj_get_ioreq);
>
> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> index 9fd88579bfbfb..a1c64c2b8e204 100644
> --- a/fs/overlayfs/file.c
> +++ b/fs/overlayfs/file.c
> @@ -295,10 +295,8 @@ static void ovl_aio_cleanup_handler(struct ovl_aio_req *aio_req)
> struct kiocb *iocb = &aio_req->iocb;
> struct kiocb *orig_iocb = aio_req->orig_iocb;
>
> - if (iocb->ki_flags & IOCB_WRITE) {
> - kiocb_end_write(iocb);
> + if (iocb->ki_flags & IOCB_WRITE)
> ovl_file_modified(orig_iocb->ki_filp);
> - }
>
> orig_iocb->ki_pos = iocb->ki_pos;
> ovl_aio_put(aio_req);
> @@ -310,6 +308,9 @@ static void ovl_aio_rw_complete(struct kiocb *iocb, long res)
> struct ovl_aio_req, iocb);
> struct kiocb *orig_iocb = aio_req->orig_iocb;
>
> + if (iocb->ki_flags & IOCB_WRITE)
> + kiocb_end_write(iocb);
> +
> ovl_aio_cleanup_handler(aio_req);
> orig_iocb->ki_complete(orig_iocb, res);
> }
> @@ -421,7 +422,6 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter)
> aio_req->iocb.ki_flags = ifl;
> aio_req->iocb.ki_complete = ovl_aio_rw_complete;
> refcount_set(&aio_req->ref, 2);
> - kiocb_start_write(&aio_req->iocb);
> ret = vfs_iocb_iter_write(real.file, &aio_req->iocb, iter);
> ovl_aio_put(aio_req);
> if (ret != -EIOCBQUEUED)
> diff --git a/fs/read_write.c b/fs/read_write.c
> index 4771701c896ba..9a56949f3b8d1 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -865,6 +865,10 @@ static ssize_t do_iter_write(struct file *file, struct iov_iter *iter,
> return ret;
> }
>
> +/*
> + * Caller is responsible for calling kiocb_end_write() on completion
> + * if async iocb was queued.
> + */
> ssize_t vfs_iocb_iter_write(struct file *file, struct kiocb *iocb,
> struct iov_iter *iter)
> {
> @@ -885,7 +889,10 @@ ssize_t vfs_iocb_iter_write(struct file *file, struct kiocb *iocb,
> if (ret < 0)
> return ret;
>
> + kiocb_start_write(iocb);
> ret = call_write_iter(file, iocb, iter);
> + if (ret != -EIOCBQUEUED)
> + kiocb_end_write(iocb);
> if (ret > 0)
> fsnotify_modify(file);
>
On Thu, May 30, 2024 at 10:05 PM Sasha Levin <sashal(a)kernel.org> wrote:
>
> This is a note to let you know that I've just added the patch titled
>
> splice: remove permission hook from iter_file_splice_write()
>
> to the 6.6-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> splice-remove-permission-hook-from-iter_file_splice_.patch
> and it can be found in the queue-6.6 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
>
> commit 9519e9d1e625d4f01b3c8a1c32042e3f5da53b0b
> Author: Amir Goldstein <amir73il(a)gmail.com>
> Date: Thu Nov 23 18:51:44 2023 +0100
>
> splice: remove permission hook from iter_file_splice_write()
>
> [ Upstream commit d53471ba6f7ae97a4e223539029528108b705af1 ]
>
> All the callers of ->splice_write(), (e.g. do_splice_direct() and
> do_splice()) already check rw_verify_area() for the entire range
> and perform all the other checks that are in vfs_write_iter().
>
Alas, that is incorrect for 6.6.y, because it depends on prior commit
ca7ab482401c ("ovl: add permission hooks outside of do_splice_direct()")
And in any case, this commit is part of a pretty hairy shuffle in splice code.
I'd feel much more comfortable with backporting the entire series
0db1d53937fa..6ae654392bb51 than just 3 individual commits in the
middle of the series.
I looked into it and ca7ab482401c does not apply cleanly to 6.6.y -
it depends on the ovl changes 14ab6d425e8067..5b02bfc1e7e3.
Not only for conflict, but also for correct locking order.
That amounts to quite a few non trivial ovl and splice patches,
so maybe you need to reconsider, but on the up side, all those ovl
and splice patches are actually very subtle bug fixes, so I cannot
say that they are not stable tree worthy.
There is also a coda commit that depends on this for conflict:
581a4d003001 coda: convert to new timestamp accessors
I did not check if it all compiles or works, only that it applies cleanly.
> Instead of creating another tiny helper for special caller, just
> open-code it.
>
> This is needed for fanotify "pre content" events.
>
> Suggested-by: Jan Kara <jack(a)suse.cz>
> Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
> Signed-off-by: Amir Goldstein <amir73il(a)gmail.com>
> Link: https://lore.kernel.org/r/20231122122715.2561213-6-amir73il@gmail.com
> Signed-off-by: Christian Brauner <brauner(a)kernel.org>
> Stable-dep-of: 7c98f7cb8fda ("remove call_{read,write}_iter() functions")
Why would you want to backport this commit?
It hinders backporting work - it does not assist it.
Any new code that open codes call_{read,write}_iter() is not affected
by the existence of the helpers in stable kernels and any old code that
does use these helpers works as well.
Thanks,
Amir.
Hello,
Kernel 6.8 which is the default kernel of Ubundu 24.04 and other
distributions derived from it is at the end of its life for you !?!
As it provides better support for the Raspberry Pi 5 graphics card, I
hoped to be able to benefit from it under Raspberry Pi OS which still
uses the latest LTS kernel available.
But if you don't make kernel 6.8 an LTS kernel, for Ubuntu and its
derivatives, I will never be able to benefit from it on my RPi5 ;-(
In addition, I am afraid that OS derived from Ubuntu 24.04 LTS will
end up in difficulties due to this situation.
Best regards.
On riscv32, it is possible for the last page in virtual address space
(0xfffff000) to be allocated. This page overlaps with PTR_ERR, so that
shouldn't happen.
There is already some code to ensure memblock won't allocate the last page.
However, buddy allocator is left unchecked.
Fix this by reserving physical memory that would be mapped at virtual
addresses greater than 0xfffff000.
Reported-by: Björn Töpel <bjorn(a)kernel.org>
Closes: https://lore.kernel.org/linux-riscv/878r1ibpdn.fsf@all.your.base.are.belong…
Fixes: 76d2a0493a17 ("RISC-V: Init and Halt Code")
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
Cc: <stable(a)vger.kernel.org>
---
arch/riscv/mm/init.c | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 968761843203..7c985435b3fc 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -235,18 +235,19 @@ static void __init setup_bootmem(void)
kernel_map.va_pa_offset = PAGE_OFFSET - phys_ram_base;
/*
- * memblock allocator is not aware of the fact that last 4K bytes of
- * the addressable memory can not be mapped because of IS_ERR_VALUE
- * macro. Make sure that last 4k bytes are not usable by memblock
- * if end of dram is equal to maximum addressable memory. For 64-bit
- * kernel, this problem can't happen here as the end of the virtual
- * address space is occupied by the kernel mapping then this check must
- * be done as soon as the kernel mapping base address is determined.
+ * Reserve physical address space that would be mapped to virtual
+ * addresses greater than (void *)(-PAGE_SIZE) because:
+ * - This memory would overlap with ERR_PTR
+ * - This memory belongs to high memory, which is not supported
+ *
+ * This is not applicable to 64-bit kernel, because virtual addresses
+ * after (void *)(-PAGE_SIZE) are not linearly mapped: they are
+ * occupied by kernel mapping. Also it is unrealistic for high memory
+ * to exist on 64-bit platforms.
*/
if (!IS_ENABLED(CONFIG_64BIT)) {
- max_mapped_addr = __pa(~(ulong)0);
- if (max_mapped_addr == (phys_ram_end - 1))
- memblock_set_current_limit(max_mapped_addr - 4096);
+ max_mapped_addr = __va_to_pa_nodebug(-PAGE_SIZE);
+ memblock_reserve(max_mapped_addr, (phys_addr_t)-max_mapped_addr);
}
min_low_pfn = PFN_UP(phys_ram_base);
--
2.39.2