This series contains some bugfixes for the HNS3 ethernet driver.
Jian Shen (1):
  net: hns3: restore user pause configure when disable autoneg

Jie Wang (2):
  net: hns3: refactor hclge_mac_link_status_wait for interface reuse
  net: hns3: add wait until mac link down

Peiyang Wang (1):
  net: hns3: fix wrong print link down up

Yonglong Liu (2):
  net: hns3: fix side effects passed to min_t()
  net: hns3: fix deadlock issue when externel_lb and reset are executed
    together

 .../net/ethernet/hisilicon/hns3/hns3_enet.c | 17 ++++++++--
 .../hisilicon/hns3/hns3pf/hclge_main.c | 32 ++++++++++++++-----
 .../ethernet/hisilicon/hns3/hns3pf/hclge_tm.c | 2 +-
 .../ethernet/hisilicon/hns3/hns3pf/hclge_tm.h | 1 +
 4 files changed, 41 insertions(+), 11 deletions(-)
--
2.30.0
> On Fri, Jul 28, 2023 at 05:17:59PM -0400, Joel Fernandes wrote:
>
> On Jul 27, 2023, at 7:18 PM, Joel Fernandes <joel(a)joelfernandes.org>
> wrote:
>
> 
>
> On Jul 27, 2023, at 4:33 PM, Paul E. McKenney <paulmck(a)kernel.org>
> wrote:
>
> On Thu, Jul 27, 2023 at 10:39:17AM -0700, Guenter Roeck wrote:
>
> On 7/27/23 09:07, Paul E. McKenney wrote:
>
> [ ... ]
>
> No. However, (unrelated) in linux-next, rcu tests sometimes result
> in apparent hangs
>
> or long runtime.
>
> [ 0.778841] Mount-cache hash table entries: 512 (order: 0, 4096
> bytes, linear)
>
> [ 0.779011] Mountpoint-cache hash table entries: 512 (order: 0,
> 4096 bytes, linear)
>
> [ 0.797998] Running RCU synchronous self tests
>
> [ 0.798209] Running RCU synchronous self tests
>
> [ 0.912368] smpboot: CPU0: AMD Opteron 63xx class CPU (family:
> 0x15, model: 0x2, stepping: 0x0)
>
> [ 0.923398] RCU Tasks: Setting shift to 2 and lim to 1
> rcu_task_cb_adjust=1.
>
> [ 0.925419] Running RCU-tasks wait API self tests
>
> (hangs until aborted). This is primarily with Opteron CPUs, but also
> with others such as Haswell,
>
> Icelake-Server, and pentium3. It is all but impossible to bisect
> because it doesn't happen
>
> all the time. All I was able to figure out was that it has to do
> with rcu changes in linux-next.
>
> I'd be much more concerned about that.
>
> First I have heard of this, so thank you for letting me know.
>
> About what fraction of the time does this happen?
>
> Here is a sample test log from yesterday's -next. This is with
> x86_64.
>
> Today's -next always crashes, so no data.
>
> Building
> x86_64:q35:Broadwell-noTSX:defconfig:smp:net,e1000:mem256:ata:hd ...
> running ....... passed
>
> Building
> x86_64:q35:Cascadelake-Server:defconfig:smp:net,e1000e:mem256:ata:cd
> ... running .................R....... passed
>
> Building
> x86_64:q35:IvyBridge:defconfig:smp2:net,i82801:efi:mem512:nvme:hd
> ... running ...... passed
>
> Building
> x86_64:q35:SandyBridge:defconfig:smp4:net,ne2k_pci:efi32:mem1G:usb:h
> d ... running ......... passed
>
> Building
> x86_64:q35:SandyBridge:defconfig:smp8:net,ne2k_pci:mem1G:usb-hub:hd
> ... running ....... passed
>
> Building
> x86_64:q35:Haswell:defconfig:smp:tpm-tis:net,pcnet:mem2G:usb-uas:hd
> ... running .................R.... passed
>
> Building
> x86_64:q35:Skylake-Client:defconfig:smp2:tpm-tis:net,rtl8139:efi:mem
> 4G:sdhci:mmc:hd ... running ....... passed
>
> Building
> x86_64:q35:Conroe:defconfig:smp4:net,tulip:efi32:mem256:scsi[DC395]:
> hd ... running ....... passed
>
> Building
> x86_64:q35:Denverton:defconfig:smp2:net,tulip:efi:mem256:scsi[DC395]
> :hd ... running ....... passed
>
> Building
> x86_64:q35:EPYC-Milan:defconfig:smp:tpm-crb:net,tulip:mem256:scsi[DC
> 395]:hd ... running ....... passed
>
> Building
> x86_64:q35:Nehalem:defconfig:smp:net,virtio-net:mem512:scsi[AM53C974
> ]:hd ... running ....... passed
>
> Building
> x86_64:q35:Nehalem:defconfig:smp:net,virtio-net-old:mem512:scsi[AM53
> C974]:hd ... running ....... passed
>
> Building
> x86_64:q35:Westmere-IBRS:defconfig:smp2:tpm-crb:net,usb-ohci:efi:mem
> 1G:scsi[53C810]:cd ... running .................R........... passed
>
> Building
> x86_64:q35:Skylake-Server:defconfig:smp4:tpm-tis:net,e1000-82544gc:e
> fi32:mem2G:scsi[53C895A]:hd ... running ............. passed
>
> Building
> x86_64:pc:EPYC:defconfig:smp:pci-bridge:net,usb-uhci:mem4G:scsi[FUSI
> ON]:hd ... running ..................R.......... passed
>
> Building
> x86_64:q35:EPYC-IBPB:defconfig:smp2:net,e1000-82545em:efi:mem8G:scsi
> [MEGASAS]:hd ... running ....... passed
>
> Building
> x86_64:q35:Opteron_G5:defconfig:smp4:net,i82559c:efi32:mem256:scsi[M
> EGASAS2]:hd ... running ...... passed
>
> Building
> x86_64:q35:Opteron_G5:defconfig:smp4:net,i82559c:mem256:scsi[MEGASAS
> 2]:hd ... running .................R.............. failed (silent)
>
> Building
> x86_64:pc:Opteron_G5:defconfig:smp4:net,i82559c:mem256:scsi[MEGASAS2
> ]:hd ... running .......... passed
>
> Building x86_64:pc:phenom:defconfig:smp:net,i82559er:mem512:initrd
> ... running ........ passed
>
> Building
> x86_64:q35:Opteron_G1:defconfig:smp2:net,i82562:efi:mem1G:initrd ...
> running ...... passed
>
> Building
> x86_64:pc:Opteron_G2:defconfig:smp:net,usb:efi32:mem2G:scsi[virtio-p
> ci]:hd ... running .................R................. passed
>
> Building
> x86_64:pc:Opteron_G2:defconfig:smp:net,usb:efi32:mem2G:scsi[virtio-p
> ci-old]:hd ... running ................... passed
>
> Building
> x86_64:q35:core2duo:defconfig:smp2:net,i82559a:mem4G:virtio-pci:hd
> ... running ......... passed
>
> Building
> x86_64:q35:Broadwell:defconfig:smp4:net,i82558b:efi:mem8G:virtio:hd
> ... running ....... passed
>
> Building
> x86_64:q35:Nehalem:defconfig:smp2:net,i82558a:efi32:mem1G:virtio:hd
> ... running .................R... passed
>
> Building
> x86_64:q35:Icelake-Server:defconfig:preempt:smp4:net,ne2k_pci:efi:me
> m2G:virtio:cd ... running ......... passed
>
> Building
> x86_64:q35:Icelake-Server:defconfig:preempt:smp8:net,i82557a:mem4G:n
> vme:hd ... running ...... passed
>
> Building
> x86_64:q35:Skylake-Client-IBRS:defconfig:preempt:smp2:net,i82558b:ef
> i32:mem1G:sdhci:mmc:hd ... running ...... passed
>
> Building
> x86_64:q35:KnightsMill:defconfig:preempt:smp6:net,i82550:mem512:init
> rd ... running ...... passed
>
> Building
> x86_64:q35:Cooperlake:defconfig:smp2:net,usb-ohci:efi:mem1G:scsi[53C
> 810]:hd ... running ....... passed
>
> Building
> x86_64:q35:EPYC-Rome:defconfig:smp4:net,igb:mem2G:scsi[53C895A]:hd
> ... running ......... passed
>
> Building x86_64:pc:Opteron_G3:defconfig:nosmp:net,e1000:mem1G:usb:hd
> ... running ....................R................. failed (silent)
>
> Building
> x86_64:q35:Opteron_G4:defconfig:nosmp:net,ne2k_pci:efi:mem512:ata:hd
> ... running .....................R....... passed
>
> Building
> x86_64:q35:Haswell-noTSX-IBRS:defconfig:nosmp:net,pcnet:efi32:mem2G:
> ata:hd ... running .................R.............. failed (silent)
>
> An earlier test run:
>
> Building
> x86_64:q35:Broadwell-noTSX:defconfig:smp:net,e1000:mem256:ata:hd ...
> running ....... passed
>
> Building
> x86_64:q35:Cascadelake-Server:defconfig:smp:net,e1000e:mem256:ata:cd
> ... running .................R....... passed
>
> Building
> x86_64:q35:IvyBridge:defconfig:smp2:net,i82801:efi:mem512:nvme:hd
> ... running ........ passed
>
> Building
> x86_64:q35:SandyBridge:defconfig:smp4:net,ne2k_pci:efi32:mem1G:usb:h
> d ... running .......... passed
>
> Building
> x86_64:q35:SandyBridge:defconfig:smp8:net,ne2k_pci:mem1G:usb-hub:hd
> ... running ....... passed
>
> Building
> x86_64:q35:Haswell:defconfig:smp:tpm-tis:net,pcnet:mem2G:usb-uas:hd
> ... running .................R.... passed
>
> Building
> x86_64:q35:Skylake-Client:defconfig:smp2:tpm-tis:net,rtl8139:efi:mem
> 4G:sdhci:mmc:hd ... running ....... passed
>
> Building
> x86_64:q35:Conroe:defconfig:smp4:net,tulip:efi32:mem256:scsi[DC395]:
> hd ... running ......... passed
>
> Building
> x86_64:q35:Denverton:defconfig:smp2:net,tulip:efi:mem256:scsi[DC395]
> :hd ... running ....... passed
>
> Building
> x86_64:q35:EPYC-Milan:defconfig:smp:tpm-crb:net,tulip:mem256:scsi[DC
> 395]:hd ... running ....... passed
>
> Building
> x86_64:q35:Nehalem:defconfig:smp:net,virtio-net:mem512:scsi[AM53C974
> ]:hd ... running ....... passed
>
> Building
> x86_64:q35:Nehalem:defconfig:smp:net,virtio-net-old:mem512:scsi[AM53
> C974]:hd ... running ........ passed
>
> Building
> x86_64:q35:Westmere-IBRS:defconfig:smp2:tpm-crb:net,usb-ohci:efi:mem
> 1G:scsi[53C810]:cd ... running .......... passed
>
> Building
> x86_64:q35:Skylake-Server:defconfig:smp4:tpm-tis:net,e1000-82544gc:e
> fi32:mem2G:scsi[53C895A]:hd ... running .................R.....
> passed
>
> Building
> x86_64:pc:EPYC:defconfig:smp:pci-bridge:net,usb-uhci:mem4G:scsi[FUSI
> ON]:hd ... running .................R.............. failed (silent)
>
> Building
> x86_64:q35:EPYC-IBPB:defconfig:smp2:net,e1000-82545em:efi:mem8G:scsi
> [MEGASAS]:hd ... running ....... passed
>
> Building
> x86_64:q35:Opteron_G5:defconfig:smp4:net,i82559c:efi32:mem256:scsi[M
> EGASAS2]:hd ... running ....... passed
>
> Building
> x86_64:q35:Opteron_G5:defconfig:smp4:net,i82559c:mem256:scsi[MEGASAS
> 2]:hd ... running ....... passed
>
> Building
> x86_64:pc:Opteron_G5:defconfig:smp4:net,i82559c:mem256:scsi[MEGASAS2
> ]:hd ... running .......... passed
>
> Building x86_64:pc:phenom:defconfig:smp:net,i82559er:mem512:initrd
> ... running ........ passed
>
> Building
> x86_64:q35:Opteron_G1:defconfig:smp2:net,i82562:efi:mem1G:initrd ...
> running ...... passed
>
> Building
> x86_64:pc:Opteron_G2:defconfig:smp:net,usb:efi32:mem2G:scsi[virtio-p
> ci]:hd ... running .......... passed
>
> Building
> x86_64:pc:Opteron_G2:defconfig:smp:net,usb:efi32:mem2G:scsi[virtio-p
> ci-old]:hd ... running .......... passed
>
> Building
> x86_64:q35:core2duo:defconfig:smp2:net,i82559a:mem4G:virtio-pci:hd
> ... running ...... passed
>
> Building
> x86_64:q35:Broadwell:defconfig:smp4:net,i82558b:efi:mem8G:virtio:hd
> ... running ....... passed
>
> Building
> x86_64:q35:Nehalem:defconfig:smp2:net,i82558a:efi32:mem1G:virtio:hd
> ... running ...... passed
>
> Building
> x86_64:q35:Icelake-Server:defconfig:preempt:smp4:net,ne2k_pci:efi:me
> m2G:virtio:cd ... running ......... passed
>
> Building
> x86_64:q35:Icelake-Server:defconfig:preempt:smp8:net,i82557a:mem4G:n
> vme:hd ... running ....... passed
>
> Building
> x86_64:q35:Skylake-Client-IBRS:defconfig:preempt:smp2:net,i82558b:ef
> i32:mem1G:sdhci:mmc:hd ... running ....... passed
>
> Building
> x86_64:q35:KnightsMill:defconfig:preempt:smp6:net,i82550:mem512:init
> rd ... running ....... passed
>
> Building
> x86_64:q35:Cooperlake:defconfig:smp2:net,usb-ohci:efi:mem1G:scsi[53C
> 810]:hd ... running ........ passed
>
> Building
> x86_64:q35:EPYC-Rome:defconfig:smp4:net,igb:mem2G:scsi[53C895A]:hd
> ... running ......... passed
>
> Building x86_64:pc:Opteron_G3:defconfig:nosmp:net,e1000:mem1G:usb:hd
> ... running ....................R................. failed (silent)
>
> Building
> x86_64:q35:Opteron_G4:defconfig:nosmp:net,ne2k_pci:efi:mem512:ata:hd
> ... running ....... passed
>
> Building
> x86_64:q35:Haswell-noTSX-IBRS:defconfig:nosmp:net,pcnet:efi32:mem2G:
> ata:hd ... running ....... passed
>
> "R" means retry, and the dots reflect time expired. It looks like it
> happens most of the time,
>
> but not always, on affected CPUs. I don't have specific data for
> non-Intel CPUs. I don't think
>
> I see the problem there, but there is too much interference from
> other problems to be sure.
>
> For comparison, here is the result from the latest mainline:
>
> Building
> x86_64:q35:Broadwell-noTSX:defconfig:smp:net,e1000:mem256:ata:hd ...
> running ....... passed
>
> Building
> x86_64:q35:Cascadelake-Server:defconfig:smp:net,e1000e:mem256:ata:cd
> ... running .......... passed
>
> Building
> x86_64:q35:IvyBridge:defconfig:smp2:net,i82801:efi:mem512:nvme:hd
> ... running ...... passed
>
> Building
> x86_64:q35:SandyBridge:defconfig:smp4:net,ne2k_pci:efi32:mem1G:usb:h
> d ... running ......... passed
>
> Building
> x86_64:q35:SandyBridge:defconfig:smp8:net,ne2k_pci:mem1G:usb-hub:hd
> ... running ........... passed
>
> Building
> x86_64:q35:Haswell:defconfig:smp:tpm-tis:net,pcnet:mem2G:usb-uas:hd
> ... running ........ passed
>
> Building
> x86_64:q35:Skylake-Client:defconfig:smp2:tpm-tis:net,rtl8139:efi:mem
> 4G:sdhci:mmc:hd ... running ....... passed
>
> Building
> x86_64:q35:Conroe:defconfig:smp4:net,tulip:efi32:mem256:scsi[DC395]:
> hd ... running ....... passed
>
> Building
> x86_64:q35:Denverton:defconfig:smp2:net,tulip:efi:mem256:scsi[DC395]
> :hd ... running ....... passed
>
> Building
> x86_64:q35:EPYC-Milan:defconfig:smp:tpm-crb:net,tulip:mem256:scsi[DC
> 395]:hd ... running ....... passed
>
> Building
> x86_64:q35:Nehalem:defconfig:smp:net,virtio-net:mem512:scsi[AM53C974
> ]:hd ... running ....... passed
>
> Building
> x86_64:q35:Nehalem:defconfig:smp:net,virtio-net-old:mem512:scsi[AM53
> C974]:hd ... running ....... passed
>
> Building
> x86_64:q35:Westmere-IBRS:defconfig:smp2:tpm-crb:net,usb-ohci:efi:mem
> 1G:scsi[53C810]:cd ... running .......... passed
>
> Building
> x86_64:q35:Skylake-Server:defconfig:smp4:tpm-tis:net,e1000-82544gc:e
> fi32:mem2G:scsi[53C895A]:hd ... running ....... passed
>
> Building
> x86_64:pc:EPYC:defconfig:smp:pci-bridge:net,usb-uhci:mem4G:scsi[FUSI
> ON]:hd ... running ............. passed
>
> Building
> x86_64:q35:EPYC-IBPB:defconfig:smp2:net,e1000-82545em:efi:mem8G:scsi
> [MEGASAS]:hd ... running ....... passed
>
> Building
> x86_64:q35:Opteron_G5:defconfig:smp4:net,i82559c:efi32:mem256:scsi[M
> EGASAS2]:hd ... running ....... passed
>
> Building
> x86_64:q35:Opteron_G5:defconfig:smp4:net,i82559c:mem256:scsi[MEGASAS
> 2]:hd ... running ...... passed
>
> Building
> x86_64:pc:Opteron_G5:defconfig:smp4:net,i82559c:mem256:scsi[MEGASAS2
> ]:hd ... running ......... passed
>
> Building x86_64:pc:phenom:defconfig:smp:net,i82559er:mem512:initrd
> ... running ......... passed
>
> Building
> x86_64:q35:Opteron_G1:defconfig:smp2:net,i82562:efi:mem1G:initrd ...
> running ......... passed
>
> Building
> x86_64:pc:Opteron_G2:defconfig:smp:net,usb:efi32:mem2G:scsi[virtio-p
> ci]:hd ... running ......... passed
>
> Building
> x86_64:pc:Opteron_G2:defconfig:smp:net,usb:efi32:mem2G:scsi[virtio-p
> ci-old]:hd ... running ......... passed
>
> Building
> x86_64:q35:core2duo:defconfig:smp2:net,i82559a:mem4G:virtio-pci:hd
> ... running ...... passed
>
> Building
> x86_64:q35:Broadwell:defconfig:smp4:net,i82558b:efi:mem8G:virtio:hd
> ... running ....... passed
>
> Building
> x86_64:q35:Nehalem:defconfig:smp2:net,i82558a:efi32:mem1G:virtio:hd
> ... running ...... passed
>
> Building
> x86_64:q35:Icelake-Server:defconfig:preempt:smp4:net,ne2k_pci:efi:me
> m2G:virtio:cd ... running ............ passed
>
> Building
> x86_64:q35:Icelake-Server:defconfig:preempt:smp8:net,i82557a:mem4G:n
> vme:hd ... running ....... passed
>
> Building
> x86_64:q35:Skylake-Client-IBRS:defconfig:preempt:smp2:net,i82558b:ef
> i32:mem1G:sdhci:mmc:hd ... running ...... passed
>
> Building
> x86_64:q35:KnightsMill:defconfig:preempt:smp6:net,i82550:mem512:init
> rd ... running ...... passed
>
> Building
> x86_64:q35:Cooperlake:defconfig:smp2:net,usb-ohci:efi:mem1G:scsi[53C
> 810]:hd ... running ....... passed
>
> Building
> x86_64:q35:EPYC-Rome:defconfig:smp4:net,igb:mem2G:scsi[53C895A]:hd
> ... running .......... passed
>
> Building x86_64:pc:Opteron_G3:defconfig:nosmp:net,e1000:mem1G:usb:hd
> ... running .......... passed
>
> Building
> x86_64:q35:Opteron_G4:defconfig:nosmp:net,ne2k_pci:efi:mem512:ata:hd
> ... running ...... passed
>
> Building
> x86_64:q35:Haswell-noTSX-IBRS:defconfig:nosmp:net,pcnet:efi32:mem2G:
> ata:hd ... running ...... passed
>
> I freely confess that I am having a hard time imagining what would
>
> be CPU dependent in that code. Timing, maybe? Whatever the reason,
>
> I am not seeing these failures in my testing.
>
> So which of the following Kconfig options is defined in your
> .config?
>
> CONFIG_TASKS_RCU, CONFIG_TASKS_RUDE_RCU, and CONFIG_TASKS_TRACE_RCU.
>
> If you have more than one of them, could you please apply this patch
>
> and show me the corresponding console output from the resulting
> hang?
>
> FWIW, I am not able to repro this issue either. If a .config can be
> shared of the problem system, I can try it out to see if it can be
> reproduced on my side.
>
> I do see this now on 5.15 stable:
>
>TASKS03 ------- 3089 GPs (0.858056/s)
>QEMU killed
>TASKS03 no success message, 64 successful version messages
>!!! PID 3309783 hung at 3781 vs. 3600 seconds
>
> I have not looked too closely yet. The full test artifacts are here:
>
> [1]Artifacts of linux-5.15.y 5.15.123 :
> /tools/testing/selftests/rcutorture/res/2023.07.28-04.00.44 [Jenkins]
> [2]box.joelfernandes.org
> [3]apple-touch-icon.png
>
> Thanks,
>
> - Joel
>
> (Apologies if the email is html, I am sending from phone).
Heh. I have a script that runs lynx. Which isn't perfect, but usually
makes things at least somewhat legible.
This looks like the prototypical hard hang with interrupts disabled,
which could be anywhere in the kernel, including RCU. I am not seeing
this, but the usual cause when I have seen it in the past was deadlock
of irq-disabled locks. In one spectacular case, it was a timekeeping
failure that messed up a CPU-hotplug operation.
If this is reproducible, one trick would be to have a script look at
the console.log file, and have it do something (NMI? sysrq? something
else?) to qemu if output ceased for too long.
One way to do this without messing with the rcutorture scripting is to
grab the qemu-cmd file from this run, and then invoke that file from your
own script, possibly with suitable modifications to qemu's parameters.
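The "output ceased for too long" check suggested above can be sketched in C as follows. This is only an illustration under stated assumptions, not part of the rcutorture scripting: the helper name console_stalled() is hypothetical, and using the log file's modification time as the sign of life is an assumption. What to do to qemu once a stall is detected (NMI, sysrq, something else) is deliberately left out.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <time.h>

/*
 * Hypothetical helper: report whether the console log at 'path' has
 * gone quiet, meaning that it was last modified more than
 * 'timeout_secs' seconds before 'now'.  Returns false if the file
 * cannot be examined, since there is then nothing to judge.
 */
static bool console_stalled(const char *path, time_t now, long timeout_secs)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return false;

	return difftime(now, st.st_mtime) > (double)timeout_secs;
}
```

A wrapper that launches the qemu-cmd file could poll this helper periodically, passing time(NULL) as 'now', and act the first time it returns true.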
Thoughts?
Thanx, Paul
> Cheers,
> - Joel
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> commit 709a917710dc01798e01750ea628ece4bfc42b7b
> Author: Paul E. McKenney <paulmck(a)kernel.org>
> Date: Thu Jul 27 13:13:46 2023 -0700
>
>     rcu-tasks: Add printk()s to localize boot-time self-test hang
>
>     Currently, rcu_tasks_initiate_self_tests() prints a message and then
>     initiates self tests on up to three different RCU Tasks flavors. If one
>     of the flavors has a grace-period hang, it is not easy to work out which
>     of the three hung. This commit therefore prints a message prior to each
>     individual test.
>
>     Reported-by: Guenter Roeck <linux(a)roeck-us.net>
>     Signed-off-by: Paul E. McKenney <paulmck(a)kernel.org>
>
> diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
> index 56c470a489c8..427433c90935 100644
> --- a/kernel/rcu/tasks.h
> +++ b/kernel/rcu/tasks.h
> @@ -1981,20 +1981,22 @@ static void test_rcu_tasks_callback(struct rcu_head *rhp)
>  static void rcu_tasks_initiate_self_tests(void)
>  {
> -	pr_info("Running RCU-tasks wait API self tests\n");
>  #ifdef CONFIG_TASKS_RCU
> +	pr_info("Running RCU Tasks wait API self tests\n");
>  	tests[0].runstart = jiffies;
>  	synchronize_rcu_tasks();
>  	call_rcu_tasks(&tests[0].rh, test_rcu_tasks_callback);
>  #endif
>  #ifdef CONFIG_TASKS_RUDE_RCU
> +	pr_info("Running RCU Tasks Rude wait API self tests\n");
>  	tests[1].runstart = jiffies;
>  	synchronize_rcu_tasks_rude();
>  	call_rcu_tasks_rude(&tests[1].rh, test_rcu_tasks_callback);
>  #endif
>  #ifdef CONFIG_TASKS_TRACE_RCU
> +	pr_info("Running RCU Tasks Trace wait API self tests\n");
>  	tests[2].runstart = jiffies;
>  	synchronize_rcu_tasks_trace();
>  	call_rcu_tasks_trace(&tests[2].rh, test_rcu_tasks_callback);
>
>References
>
> Visible links:
> 1. http://box.joelfernandes.org:9080/job/rcutorture_stable/job/linux-5.15.y/la…
> 2. http://box.joelfernandes.org:9080/job/rcutorture_stable/job/linux-5.15.y/la…
> 3. http://box.joelfernandes.org:9080/job/rcutorture_stable/job/linux-5.15.y/la…
>
> Hidden links:
> 5. http://box.joelfernandes.org:9080/job/rcutorture_stable/job/linux-5.15.y/la…
This series primarily adds checks at relevant places in the venus driver where
out-of-bounds (OOB) accesses are possible due to unexpected payloads from the
venus firmware. Each patch describes the specific OOB possibility it addresses.
Please review and share your feedback.
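As a generic illustration of the kind of check described above, the sketch below validates a count reported by firmware before using it to index a fixed-size table. It is not taken from the venus driver: the structure, the MAX_CODECS limit, and the store_codecs() function are all hypothetical names chosen for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_CODECS 32	/* Hypothetical capacity of the local table. */

struct codec_table {
	uint32_t count;
	uint32_t codecs[MAX_CODECS];
};

/*
 * Copy 'fw_count' codec entries reported by firmware into 'tbl'.
 * Returns 0 on success, or -1 if the count would overflow the table
 * or exceeds what a payload of 'payload_len' bytes can actually hold.
 * On failure 'tbl' is left untouched.
 */
static int store_codecs(struct codec_table *tbl, const uint32_t *payload,
			size_t payload_len, uint32_t fw_count)
{
	uint32_t i;

	if (fw_count > MAX_CODECS)
		return -1;
	if ((size_t)fw_count > payload_len / sizeof(*payload))
		return -1;

	for (i = 0; i < fw_count; i++)
		tbl->codecs[i] = payload[i];
	tbl->count = fw_count;
	return 0;
}
```

Both checks run before any write, so a bogus count from the firmware side can neither overrun the destination table nor read past the received payload.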
Vikash Garodia (4):
  venus: hfi: add checks to perform sanity on queue pointers
  venus: hfi: fix the check to handle session buffer requirement
  venus: hfi: add checks to handle capabilities from firmware
  venus: hfi_parser: Add check to keep the number of codecs within range

 drivers/media/platform/qcom/venus/hfi_msgs.c | 2 +-
 drivers/media/platform/qcom/venus/hfi_parser.c | 27 ++++++++++++++++++++++++++
 drivers/media/platform/qcom/venus/hfi_venus.c | 8 ++++++++
 3 files changed, 36 insertions(+), 1 deletion(-)
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
The patch titled
Subject: nilfs2: fix use-after-free of nilfs_root in dirtying inodes via iput
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
nilfs2-fix-use-after-free-of-nilfs_root-in-dirtying-inodes-via-iput.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Subject: nilfs2: fix use-after-free of nilfs_root in dirtying inodes via iput
Date: Sat, 29 Jul 2023 04:13:18 +0900
During unmount process of nilfs2, nothing holds nilfs_root structure after
nilfs2 detaches its writer in nilfs_detach_log_writer(). Previously,
nilfs_evict_inode() could cause use-after-free read for nilfs_root if
inodes are left in "garbage_list" and released by nilfs_dispose_list at
the end of nilfs_detach_log_writer(), and this bug was fixed by commit
9b5a04ac3ad9 ("nilfs2: fix use-after-free bug of nilfs_root in
nilfs_evict_inode()").
However, it turned out that there is another possibility of UAF in the
call path where mark_inode_dirty_sync() is called from iput():
 nilfs_detach_log_writer()
   nilfs_dispose_list()
     iput()
       mark_inode_dirty_sync()
         __mark_inode_dirty()
           nilfs_dirty_inode()
             __nilfs_mark_inode_dirty()
               nilfs_load_inode_block() --> causes UAF of nilfs_root struct
This can happen after commit 0ae45f63d4ef ("vfs: add support for a
lazytime mount option"), which changed iput() to call
mark_inode_dirty_sync() on its final reference if i_state has I_DIRTY_TIME
flag and i_nlink is non-zero.
This issue appears after commit 28a65b49eb53 ("nilfs2: do not write dirty
data after degenerating to read-only") when using the syzbot reproducer,
but the issue has potentially existed before.
Fix this issue by adding a "purging flag" to the nilfs structure, setting
that flag while disposing the "garbage_list" and checking it in
__nilfs_mark_inode_dirty().
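The flag-based guard described above can be modeled in userspace as follows. This is a simplified sketch and not the nilfs2 code: struct fs_state, FS_PURGING, and every function name here are hypothetical stand-ins, the flag is a plain non-atomic bit, and the freed nilfs_root is modeled by a boolean so that the unsafe access shows up as an error return instead of a real use-after-free.

```c
#include <stdbool.h>

#define FS_PURGING 0x1UL	/* Hypothetical stand-in for the purging flag. */

struct fs_state {
	unsigned long flags;
	bool root_valid;	/* False once the root structure is freed. */
	int dirtied;		/* Number of inodes actually marked dirty. */
};

static void set_purging(struct fs_state *fs)
{
	fs->flags |= FS_PURGING;
}

static void clear_purging(struct fs_state *fs)
{
	fs->flags &= ~FS_PURGING;
}

static bool purging(const struct fs_state *fs)
{
	return (fs->flags & FS_PURGING) != 0;
}

/*
 * Returns 0 on success or when the request is skipped during purging,
 * and -1 where the real code would have touched freed memory.
 */
static int mark_inode_dirty(struct fs_state *fs)
{
	if (purging(fs))
		return 0;	/* Cleanup in progress, do not dirty. */
	if (!fs->root_valid)
		return -1;	/* Models the use-after-free. */
	fs->dirtied++;
	return 0;
}

/* Models detaching the writer and then disposing the garbage list. */
static int detach_and_dispose(struct fs_state *fs)
{
	int err;

	fs->root_valid = false;		/* Root is gone after detach. */
	set_purging(fs);
	err = mark_inode_dirty(fs);	/* Stands in for the iput() path. */
	clear_purging(fs);
	return err;
}
```

Because the flag is set only around the disposal step, a dirtying request outside that window is not skipped, which mirrors the stated goal of keeping the skip condition narrow.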
Unlike commit 9b5a04ac3ad9 ("nilfs2: fix use-after-free bug of nilfs_root
in nilfs_evict_inode()"), this patch does not rely on ns_writer to
determine whether to skip operations, so as not to break recovery on
mount. The nilfs_salvage_orphan_logs routine dirties the buffer of
salvaged data before attaching the log writer, so changing
__nilfs_mark_inode_dirty() to skip the operation when ns_writer is NULL
will cause recovery write to fail. The purpose of using the cleanup-only
flag is to allow for narrowing of such conditions.
Link: https://lkml.kernel.org/r/20230728191318.33047-1-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: syzbot+74db8b3087f293d3a13a(a)syzkaller.appspotmail.com
Closes: https://lkml.kernel.org/r/000000000000b4e906060113fd63@google.com
Fixes: 0ae45f63d4ef ("vfs: add support for a lazytime mount option")
Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: <stable(a)vger.kernel.org> # 4.0+
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/nilfs2/inode.c | 8 ++++++++
fs/nilfs2/segment.c | 2 ++
fs/nilfs2/the_nilfs.h | 2 ++
3 files changed, 12 insertions(+)
--- a/fs/nilfs2/inode.c~nilfs2-fix-use-after-free-of-nilfs_root-in-dirtying-inodes-via-iput
+++ a/fs/nilfs2/inode.c
@@ -1101,9 +1101,17 @@ int nilfs_set_file_dirty(struct inode *i

 int __nilfs_mark_inode_dirty(struct inode *inode, int flags)
 {
+	struct the_nilfs *nilfs = inode->i_sb->s_fs_info;
 	struct buffer_head *ibh;
 	int err;

+	/*
+	 * Do not dirty inodes after the log writer has been detached
+	 * and its nilfs_root struct has been freed.
+	 */
+	if (unlikely(nilfs_purging(nilfs)))
+		return 0;
+
 	err = nilfs_load_inode_block(inode, &ibh);
 	if (unlikely(err)) {
 		nilfs_warn(inode->i_sb,
--- a/fs/nilfs2/segment.c~nilfs2-fix-use-after-free-of-nilfs_root-in-dirtying-inodes-via-iput
+++ a/fs/nilfs2/segment.c
@@ -2845,6 +2845,7 @@ void nilfs_detach_log_writer(struct supe
 		nilfs_segctor_destroy(nilfs->ns_writer);
 		nilfs->ns_writer = NULL;
 	}
+	set_nilfs_purging(nilfs);

 	/* Force to free the list of dirty files */
 	spin_lock(&nilfs->ns_inode_lock);
@@ -2857,4 +2858,5 @@ void nilfs_detach_log_writer(struct supe
 	up_write(&nilfs->ns_segctor_sem);

 	nilfs_dispose_list(nilfs, &garbage_list, 1);
+	clear_nilfs_purging(nilfs);
 }
--- a/fs/nilfs2/the_nilfs.h~nilfs2-fix-use-after-free-of-nilfs_root-in-dirtying-inodes-via-iput
+++ a/fs/nilfs2/the_nilfs.h
@@ -29,6 +29,7 @@ enum {
 	THE_NILFS_DISCONTINUED, /* 'next' pointer chain has broken */
 	THE_NILFS_GC_RUNNING, /* gc process is running */
 	THE_NILFS_SB_DIRTY, /* super block is dirty */
+	THE_NILFS_PURGING, /* disposing dirty files for cleanup */
 };

 /**
@@ -208,6 +209,7 @@ THE_NILFS_FNS(INIT, init)
 THE_NILFS_FNS(DISCONTINUED, discontinued)
 THE_NILFS_FNS(GC_RUNNING, gc_running)
 THE_NILFS_FNS(SB_DIRTY, sb_dirty)
+THE_NILFS_FNS(PURGING, purging)

 /*
  * Mount option operations
_
Patches currently in -mm which might be from konishi.ryusuke(a)gmail.com are
nilfs2-fix-use-after-free-of-nilfs_root-in-dirtying-inodes-via-iput.patch
The patch titled
Subject: selftests: mm: ksm: fix incorrect evaluation of parameter
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
selftests-mm-ksm-fix-incorrect-evaluation-of-parameter.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Ayush Jain <ayush.jain3(a)amd.com>
Subject: selftests: mm: ksm: fix incorrect evaluation of parameter
Date: Fri, 28 Jul 2023 22:09:51 +0530
A missing break in ksm_tests leads to a kselftest hang when the parameter -s
is used.
In the current code flow, because of the missing break in -s, the -t case
parses the argument spilled over from -s. As -t accepts only 0 and 1 as
valid values, any -s argument greater than 1 or less than 0 results in a
ksm_tests failure.
This went undetected because, before the addition of option -t, the next
case -M would immediately break out of the switch statement, but that is no
longer the case.
Add the missing break statement.
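The fall-through described above can be reproduced in a few lines of standalone C. This is a reduced model and not the ksm_tests source: struct opts, apply_option(), and the 'fixed' parameter are hypothetical and exist only so that both the buggy and the corrected flow can be exercised from one function.

```c
#include <stdlib.h>

struct opts {
	long size_mb;
	int merge_type;
};

/*
 * Apply one option to 'o'.  Returns 0 on success and -1 on an invalid
 * value.  'fixed' selects whether case 's' ends with a break (the
 * corrected flow) or continues into case 't' (the buggy flow).
 */
static int apply_option(struct opts *o, int opt, const char *arg, int fixed)
{
	switch (opt) {
	case 's':
		o->size_mb = atol(arg);
		if (o->size_mb <= 0)
			return -1;	/* "Size must be greater than 0" */
		if (fixed)
			break;
		/* Without the break, the size argument is reparsed below. */
		/* fall through */
	case 't':
		{
			int tmp = atoi(arg);

			if (tmp < 0 || tmp > 1)
				return -1;	/* "Invalid merge type" */
			o->merge_type = tmp;
		}
		break;
	default:
		return -1;
	}
	return 0;
}
```

With fixed set to 0, a size of 100 is rejected by the merge-type check, while a size of 1 happens to pass and silently overwrites merge_type, which is one way such a bug can go unnoticed.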
----Before----
./ksm_tests -H -s 100
Invalid merge type
----After----
./ksm_tests -H -s 100
Number of normal pages: 0
Number of huge pages: 50
Total size: 100 MiB
Total time: 0.401732682 s
Average speed: 248.922 MiB/s
Link: https://lkml.kernel.org/r/20230728163952.4634-1-ayush.jain3@amd.com
Fixes: 07115fcc15b4 ("selftests/mm: add new selftests for KSM")
Signed-off-by: Ayush Jain <ayush.jain3(a)amd.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Cc: Stefan Roesch <shr(a)devkernel.io>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/mm/ksm_tests.c | 1 +
1 file changed, 1 insertion(+)
--- a/tools/testing/selftests/mm/ksm_tests.c~selftests-mm-ksm-fix-incorrect-evaluation-of-parameter
+++ a/tools/testing/selftests/mm/ksm_tests.c
@@ -831,6 +831,7 @@ int main(int argc, char *argv[])
 				printf("Size must be greater than 0\n");
 				return KSFT_FAIL;
 			}
+			break;
 		case 't':
 			{
 				int tmp = atoi(optarg);
_
Patches currently in -mm which might be from ayush.jain3(a)amd.com are
selftests-mm-ksm-fix-incorrect-evaluation-of-parameter.patch
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x d55901522f96082a43b9842d34867363c0cdbac5
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023072356-confirm-embezzle-c962@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
d55901522f96 ("keys: Fix linking a duplicate key to a keyring's assoc_array")
f7e47677e39a ("watch_queue: Add a key/keyring notification facility")
0858caa419e6 ("uapi: General notification queue definitions")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d55901522f96082a43b9842d34867363c0cdbac5 Mon Sep 17 00:00:00 2001
From: Petr Pavlu <petr.pavlu(a)suse.com>
Date: Thu, 23 Mar 2023 14:04:12 +0100
Subject: [PATCH] keys: Fix linking a duplicate key to a keyring's assoc_array
When making a DNS query inside the kernel using dns_query(), the request
code can in rare cases end up creating a duplicate index key in the
assoc_array of the destination keyring. It is eventually found by
a BUG_ON() check in the assoc_array implementation and results in
a crash.
Example report:
[2158499.700025] kernel BUG at ../lib/assoc_array.c:652!
[2158499.700039] invalid opcode: 0000 [#1] SMP PTI
[2158499.700065] CPU: 3 PID: 31985 Comm: kworker/3:1 Kdump: loaded Not tainted 5.3.18-150300.59.90-default #1 SLE15-SP3
[2158499.700096] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[2158499.700351] Workqueue: cifsiod cifs_resolve_server [cifs]
[2158499.700380] RIP: 0010:assoc_array_insert+0x85f/0xa40
[2158499.700401] Code: ff 74 2b 48 8b 3b 49 8b 45 18 4c 89 e6 48 83 e7 fe e8 95 ec 74 00 3b 45 88 7d db 85 c0 79 d4 0f 0b 0f 0b 0f 0b e8 41 f2 be ff <0f> 0b 0f 0b 81 7d 88 ff ff ff 7f 4c 89 eb 4c 8b ad 58 ff ff ff 0f
[2158499.700448] RSP: 0018:ffffc0bd6187faf0 EFLAGS: 00010282
[2158499.700470] RAX: ffff9f1ea7da2fe8 RBX: ffff9f1ea7da2fc1 RCX: 0000000000000005
[2158499.700492] RDX: 0000000000000000 RSI: 0000000000000005 RDI: 0000000000000000
[2158499.700515] RBP: ffffc0bd6187fbb0 R08: ffff9f185faf1100 R09: 0000000000000000
[2158499.700538] R10: ffff9f1ea7da2cc0 R11: 000000005ed8cec8 R12: ffffc0bd6187fc28
[2158499.700561] R13: ffff9f15feb8d000 R14: ffff9f1ea7da2fc0 R15: ffff9f168dc0d740
[2158499.700585] FS: 0000000000000000(0000) GS:ffff9f185fac0000(0000) knlGS:0000000000000000
[2158499.700610] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2158499.700630] CR2: 00007fdd94fca238 CR3: 0000000809d8c006 CR4: 00000000003706e0
[2158499.700702] Call Trace:
[2158499.700741] ? key_alloc+0x447/0x4b0
[2158499.700768] ? __key_link_begin+0x43/0xa0
[2158499.700790] __key_link_begin+0x43/0xa0
[2158499.700814] request_key_and_link+0x2c7/0x730
[2158499.700847] ? dns_resolver_read+0x20/0x20 [dns_resolver]
[2158499.700873] ? key_default_cmp+0x20/0x20
[2158499.700898] request_key_tag+0x43/0xa0
[2158499.700926] dns_query+0x114/0x2ca [dns_resolver]
[2158499.701127] dns_resolve_server_name_to_ip+0x194/0x310 [cifs]
[2158499.701164] ? scnprintf+0x49/0x90
[2158499.701190] ? __switch_to_asm+0x40/0x70
[2158499.701211] ? __switch_to_asm+0x34/0x70
[2158499.701405] reconn_set_ipaddr_from_hostname+0x81/0x2a0 [cifs]
[2158499.701603] cifs_resolve_server+0x4b/0xd0 [cifs]
[2158499.701632] process_one_work+0x1f8/0x3e0
[2158499.701658] worker_thread+0x2d/0x3f0
[2158499.701682] ? process_one_work+0x3e0/0x3e0
[2158499.701703] kthread+0x10d/0x130
[2158499.701723] ? kthread_park+0xb0/0xb0
[2158499.701746] ret_from_fork+0x1f/0x40
The situation occurs as follows:
* Some kernel facility invokes dns_query() to resolve a hostname, for
example, "abcdef". The function registers its global DNS resolver
cache as current->cred.thread_keyring and passes the query to
request_key_net() -> request_key_tag() -> request_key_and_link().
* Function request_key_and_link() creates a keyring_search_context
object. Its match_data.cmp method gets set via a call to
type->match_preparse() (resolves to dns_resolver_match_preparse()) to
dns_resolver_cmp().
* Function request_key_and_link() continues and invokes
search_process_keyrings_rcu(), which reports that no matching key was
found. Control then passes to request_key_and_link() ->
construct_alloc_key().
* Concurrently, a second task makes a similar DNS query for "abcdef."
and its result gets inserted into the DNS resolver cache.
* Back on the first task, function construct_alloc_key() first runs
__key_link_begin() to determine an assoc_array_edit operation to
insert a new key. Index keys in the array are compared exactly as-is,
using keyring_compare_object(). The operation finds that "abcdef" is
not yet present in the destination keyring.
* Function construct_alloc_key() continues and checks if a given key is
already present on some keyring by again calling
search_process_keyrings_rcu(). This search is done using
dns_resolver_cmp() and "abcdef" gets matched with now present key
"abcdef.".
* The found key is linked on the destination keyring by calling
__key_link() and using the previously calculated assoc_array_edit
operation. This inserts the "abcdef." key in the array but creates
a duplicate because the same index key is already present.
Fix the problem by postponing __key_link_begin() in
construct_alloc_key() until an actual key which should be linked into
the destination keyring is determined.
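The mismatch between the two comparisons described above can be modelled in
userspace. The following is only a sketch under stated assumptions, not the
kernel code: exact_match(), loose_match() and edit_is_stale() are
hypothetical helpers, and the loose comparison is assumed to ignore a single
trailing dot, which is what the "abcdef" / "abcdef." example implies. It
shows why an edit prepared for the requested index key must not be reused
for a key that was found by the looser comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Models the exact, as-is comparison used for index keys in the array. */
static bool exact_match(const char *a, const char *b)
{
	return strcmp(a, b) == 0;
}

/* Length of a name with one trailing '.' (if any) left out. */
static size_t len_without_trailing_dot(const char *s)
{
	size_t n = strlen(s);

	if (n > 0 && s[n - 1] == '.')
		n--;
	return n;
}

/*
 * Models a caller-supplied comparison that treats names differing only
 * by a trailing dot as equivalent (assumption made for this sketch).
 */
static bool loose_match(const char *a, const char *b)
{
	size_t la = len_without_trailing_dot(a);
	size_t lb = len_without_trailing_dot(b);

	return la == lb && memcmp(a, b, la) == 0;
}

/*
 * An edit prepared for 'requested' is stale if the key that will
 * actually be linked ('found') has a different exact index key.
 */
static bool edit_is_stale(const char *requested, const char *found)
{
	return loose_match(requested, found) && !exact_match(requested, found);
}

/* Small self-check that doubles as a usage example. */
static void self_check(void)
{
	assert(edit_is_stale("abcdef", "abcdef."));
	assert(!edit_is_stale("abcdef", "abcdef"));
}
```

In this model the fix corresponds to preparing the edit only once 'found' is
known, so the edit is always computed for the key that really gets linked.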
[jarkko(a)kernel.org: added a fixes tag and cc to stable]
Cc: stable(a)vger.kernel.org # v5.3+
Fixes: df593ee23e05 ("keys: Hoist locking out of __key_link_begin()")
Signed-off-by: Petr Pavlu <petr.pavlu(a)suse.com>
Reviewed-by: Joey Lee <jlee(a)suse.com>
Reviewed-by: Jarkko Sakkinen <jarkko(a)kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
diff --git a/security/keys/request_key.c b/security/keys/request_key.c
index 07a0ef2baacd..a7673ad86d18 100644
--- a/security/keys/request_key.c
+++ b/security/keys/request_key.c
@@ -401,17 +401,21 @@ static int construct_alloc_key(struct keyring_search_context *ctx,
set_bit(KEY_FLAG_USER_CONSTRUCT, &key->flags);
if (dest_keyring) {
- ret = __key_link_lock(dest_keyring, &ctx->index_key);
+ ret = __key_link_lock(dest_keyring, &key->index_key);
if (ret < 0)
goto link_lock_failed;
- ret = __key_link_begin(dest_keyring, &ctx->index_key, &edit);
- if (ret < 0)
- goto link_prealloc_failed;
}
- /* attach the key to the destination keyring under lock, but we do need
+ /*
+ * Attach the key to the destination keyring under lock, but we do need
* to do another check just in case someone beat us to it whilst we
- * waited for locks */
+ * waited for locks.
+ *
+ * The caller might specify a comparison function which looks for keys
+ * that do not exactly match but are still equivalent from the caller's
+ * perspective. The __key_link_begin() operation must be done only after
+ * an actual key is determined.
+ */
mutex_lock(&key_construction_mutex);
rcu_read_lock();
@@ -420,12 +424,16 @@ static int construct_alloc_key(struct keyring_search_context *ctx,
if (!IS_ERR(key_ref))
goto key_already_present;
- if (dest_keyring)
+ if (dest_keyring) {
+ ret = __key_link_begin(dest_keyring, &key->index_key, &edit);
+ if (ret < 0)
+ goto link_alloc_failed;
__key_link(dest_keyring, key, &edit);
+ }
mutex_unlock(&key_construction_mutex);
if (dest_keyring)
- __key_link_end(dest_keyring, &ctx->index_key, edit);
+ __key_link_end(dest_keyring, &key->index_key, edit);
mutex_unlock(&user->cons_lock);
*_key = key;
kleave(" = 0 [%d]", key_serial(key));
@@ -438,10 +446,13 @@ static int construct_alloc_key(struct keyring_search_context *ctx,
mutex_unlock(&key_construction_mutex);
key = key_ref_to_ptr(key_ref);
if (dest_keyring) {
+ ret = __key_link_begin(dest_keyring, &key->index_key, &edit);
+ if (ret < 0)
+ goto link_alloc_failed_unlocked;
ret = __key_link_check_live_key(dest_keyring, key);
if (ret == 0)
__key_link(dest_keyring, key, &edit);
- __key_link_end(dest_keyring, &ctx->index_key, edit);
+ __key_link_end(dest_keyring, &key->index_key, edit);
if (ret < 0)
goto link_check_failed;
}
@@ -456,8 +467,10 @@ static int construct_alloc_key(struct keyring_search_context *ctx,
kleave(" = %d [linkcheck]", ret);
return ret;
-link_prealloc_failed:
- __key_link_end(dest_keyring, &ctx->index_key, edit);
+link_alloc_failed:
+ mutex_unlock(&key_construction_mutex);
+link_alloc_failed_unlocked:
+ __key_link_end(dest_keyring, &key->index_key, edit);
link_lock_failed:
mutex_unlock(&user->cons_lock);
key_put(key);
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 0226436acf2495cde4b93e7400e5a87305c26054
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023072119-skipping-penalize-15f0@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
0226436acf24 ("mptcp: do not rely on implicit state check in mptcp_listen()")
cfdcfeed6449 ("mptcp: introduce 'sk' to replace 'sock->sk' in mptcp_listen()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0226436acf2495cde4b93e7400e5a87305c26054 Mon Sep 17 00:00:00 2001
From: Paolo Abeni <pabeni(a)redhat.com>
Date: Tue, 4 Jul 2023 22:44:34 +0200
Subject: [PATCH] mptcp: do not rely on implicit state check in mptcp_listen()
Since the blamed commit, closing the first subflow resets the first
subflow socket state to SS_UNCONNECTED.
The current mptcp listen implementation relies only on such
state to prevent touching not-fully-disconnected sockets.
Incoming mptcp fastclose (or paired endpoint removal) unconditionally
closes the first subflow.
All the above allows an incoming fastclose followed by a listen() call
to successfully race with a blocking recvmsg(), potentially causing the
latter to hit a divide by zero bug in cleanup_rbuf/__tcp_select_window().
Address the issue by explicitly checking the msk socket state in
mptcp_listen(). An alternative solution would be moving the first
subflow socket state update into mptcp_disconnect(), but in the long
term the first subflow socket should be removed: better to avoid relying
on it for internal consistency checks.
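The explicit check can be illustrated with a small userspace model. This is
a sketch only: struct sock_model, the MODEL_* constants and
listen_precheck() are hypothetical names made up for illustration and are
not kernel definitions. The point is that the listen path rejects the call
based on the socket's own state and type rather than on the state of another
object.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the socket states and types. */
enum model_state {
	MODEL_SS_UNCONNECTED,
	MODEL_SS_CONNECTING,
	MODEL_SS_CONNECTED,
	MODEL_SS_DISCONNECTING,
};

enum model_type {
	MODEL_SOCK_STREAM,
	MODEL_SOCK_DGRAM,
};

struct sock_model {
	enum model_state state;
	enum model_type type;
};

/*
 * Returns 0 if listen may proceed, -EINVAL otherwise. Only a fully
 * unconnected stream socket passes the check.
 */
static int listen_precheck(const struct sock_model *sock)
{
	if (sock->state != MODEL_SS_UNCONNECTED ||
	    sock->type != MODEL_SOCK_STREAM)
		return -EINVAL;
	return 0;
}

/* Small self-check that doubles as a usage example. */
static void self_check(void)
{
	struct sock_model busy = { MODEL_SS_CONNECTED, MODEL_SOCK_STREAM };

	assert(listen_precheck(&busy) == -EINVAL);
}
```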
Fixes: b29fcfb54cd7 ("mptcp: full disconnect implementation")
Cc: stable(a)vger.kernel.org
Reported-by: Christoph Paasch <cpaasch(a)apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/414
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts(a)tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 489a3defdde5..3613489eb6e3 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3703,6 +3703,11 @@ static int mptcp_listen(struct socket *sock, int backlog)
 	pr_debug("msk=%p", msk);
 
 	lock_sock(sk);
+
+	err = -EINVAL;
+	if (sock->state != SS_UNCONNECTED || sock->type != SOCK_STREAM)
+		goto unlock;
+
 	ssock = __mptcp_nmpc_socket(msk);
 	if (IS_ERR(ssock)) {
 		err = PTR_ERR(ssock);
We accidentally enforced PROT_NONE PTE/PMD permission checks for
follow_page() like we do for get_user_pages() and friends. That was
undesired, because follow_page() is usually only used to look up a currently
mapped page, not to actually access it. Further, follow_page() does not
actually trigger fault handling, but instead simply fails.
Let's restore that behavior by conditionally setting FOLL_FORCE if
FOLL_WRITE is not set. This way, for example, KSM and migration code will
no longer fail on PROT_NONE-mapped PTEs/PMDs.
Handling this internally doesn't require us to add any new FOLL_FORCE
usage outside of GUP code.
While at it, refuse to accept FOLL_FORCE: we don't even perform VMA
permission checks like in check_vma_flags(), so especially
FOLL_FORCE|FOLL_WRITE would be dodgy.
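The flag handling described above can be sketched in isolation. This is a
userspace model only: the MODEL_FOLL_* values are made-up bit values, not
the kernel's, and normalize_follow_flags() is a hypothetical helper. It
rejects caller-supplied PIN or FORCE flags and then adds FORCE internally
for lookups that do not request write access.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values, chosen for this sketch only. */
#define MODEL_FOLL_WRITE 0x01u
#define MODEL_FOLL_FORCE 0x02u
#define MODEL_FOLL_PIN   0x04u

/*
 * Returns false if the caller passed a flag that is refused. Otherwise
 * stores the effective flags in *out and returns true.
 */
static bool normalize_follow_flags(unsigned int flags, unsigned int *out)
{
	/* Callers may not pass PIN or FORCE themselves. */
	if (flags & (MODEL_FOLL_PIN | MODEL_FOLL_FORCE))
		return false;

	/* Read-only lookups get FORCE added internally. */
	if (!(flags & MODEL_FOLL_WRITE))
		flags |= MODEL_FOLL_FORCE;

	*out = flags;
	return true;
}

/* Small self-check that doubles as a usage example. */
static void self_check(void)
{
	unsigned int eff = 0;

	assert(normalize_follow_flags(0, &eff));
	assert(eff == MODEL_FOLL_FORCE);
}
```

Because FORCE is only ever added when WRITE is absent, the combination of
FORCE and WRITE cannot reach the lookup in this model.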
This issue was identified by code inspection. We'll add some
documentation regarding FOLL_FORCE next.
Reported-by: Peter Xu <peterx(a)redhat.com>
Fixes: 474098edac26 ("mm/gup: replace FOLL_NUMA by gup_can_follow_protnone()")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: David Hildenbrand <david(a)redhat.com>
---
mm/gup.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/mm/gup.c b/mm/gup.c
index 2493ffa10f4b..da9a5cc096ac 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -841,9 +841,17 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
 	if (vma_is_secretmem(vma))
 		return NULL;
 
-	if (WARN_ON_ONCE(foll_flags & FOLL_PIN))
+	if (WARN_ON_ONCE(foll_flags & (FOLL_PIN | FOLL_FORCE)))
 		return NULL;
 
+	/*
+	 * Traditionally, follow_page() succeeded on PROT_NONE-mapped pages
+	 * but failed follow_page(FOLL_WRITE) on R/O-mapped pages. Let's
+	 * keep these semantics by setting FOLL_FORCE if FOLL_WRITE is not set.
+	 */
+	if (!(foll_flags & FOLL_WRITE))
+		foll_flags |= FOLL_FORCE;
+
 	page = follow_page_mask(vma, address, foll_flags, &ctx);
 	if (ctx.pgmap)
 		put_dev_pagemap(ctx.pgmap);
--
2.41.0