Hi Sasha,
I think this patch shouldn't be added to the stable trees; it will
cause the TCU clock to be managed by the clock framework and
automatically gated after bootup since it has no client, causing a
global system lockup. It's only really applicable to 5.3.
Thanks,
-Paul
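For readers unfamiliar with the failure mode described above, here is a minimal sketch of it in plain C. It is a toy model, not kernel code: `struct toy_clk` and `toy_disable_unused()` are hypothetical names, and the `critical` field only stands in for the kind of flag the clock framework uses to exempt a clock from its late-boot sweep of unused clocks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a gate clock; NOT the kernel's clock structures. */
struct toy_clk {
	const char *name;
	unsigned int enable_count;	/* consumers currently holding it enabled */
	bool critical;			/* hypothetical "never gate this" marker */
	bool hw_enabled;		/* state of the gate in hardware */
};

/*
 * Gate every registered clock that is running but has no consumer and
 * is not marked critical. Returns the number of clocks gated.
 */
size_t toy_disable_unused(struct toy_clk *clks, size_t n)
{
	size_t gated = 0;

	for (size_t i = 0; i < n; i++) {
		if (clks[i].hw_enabled && clks[i].enable_count == 0 &&
		    !clks[i].critical) {
			clks[i].hw_enabled = false;
			gated++;
		}
	}
	return gated;
}
```

In this model, a "tcu" entry registered with no consumer and no critical marker is switched off by the sweep, which corresponds to the lockup scenario when nothing in the tree claims the clock.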
On Sat, Oct 5, 2019 at 19:56, Sasha Levin <sashal(a)kernel.org> wrote:
> This is a note to let you know that I've just added the patch titled
>
> clk: jz4740: Add TCU clock
>
> to the 4.4-stable tree which can be found at:
>
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> clk-jz4740-add-tcu-clock.patch
> and it can be found in the queue-4.4 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
>
> commit c72b7e327a8b4d3b870f76011f674134f9ac38f6
> Author: Paul Cercueil <paul(a)crapouillou.net>
> Date: Wed Jul 24 13:16:10 2019 -0400
>
> clk: jz4740: Add TCU clock
>
> [ Upstream commit 73dd11dc1a883d4c994d729dc9984f4890001157 ]
>
> Add the missing TCU clock to the list of clocks supplied by the CGU for
> the JZ4740 SoC.
>
> Signed-off-by: Paul Cercueil <paul(a)crapouillou.net>
> Tested-by: Mathieu Malaterre <malat(a)debian.org>
> Tested-by: Artur Rojek <contact(a)artur-rojek.eu>
> Acked-by: Stephen Boyd <sboyd(a)kernel.org>
> Acked-by: Rob Herring <robh(a)kernel.org>
> Signed-off-by: Paul Burton <paul.burton(a)mips.com>
> Cc: Ralf Baechle <ralf(a)linux-mips.org>
> Cc: James Hogan <jhogan(a)kernel.org>
> Cc: Jonathan Corbet <corbet(a)lwn.net>
> Cc: Lee Jones <lee.jones(a)linaro.org>
> Cc: Arnd Bergmann <arnd(a)arndb.de>
> Cc: Daniel Lezcano <daniel.lezcano(a)linaro.org>
> Cc: Thomas Gleixner <tglx(a)linutronix.de>
> Cc: Michael Turquette <mturquette(a)baylibre.com>
> Cc: Jason Cooper <jason(a)lakedaemon.net>
> Cc: Marc Zyngier <marc.zyngier(a)arm.com>
> Cc: Rob Herring <robh+dt(a)kernel.org>
> Cc: Mark Rutland <mark.rutland(a)arm.com>
> Cc: devicetree(a)vger.kernel.org
> Cc: linux-kernel(a)vger.kernel.org
> Cc: linux-doc(a)vger.kernel.org
> Cc: linux-mips(a)vger.kernel.org
> Cc: linux-clk(a)vger.kernel.org
> Cc: od(a)zcrc.me
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/drivers/clk/ingenic/jz4740-cgu.c b/drivers/clk/ingenic/jz4740-cgu.c
> index 305a26c2a800e..01b5b8b103888 100644
> --- a/drivers/clk/ingenic/jz4740-cgu.c
> +++ b/drivers/clk/ingenic/jz4740-cgu.c
> @@ -211,6 +211,12 @@ static const struct ingenic_cgu_clk_info jz4740_cgu_clocks[] = {
> .parents = { JZ4740_CLK_EXT, -1, -1, -1 },
> .gate = { CGU_REG_CLKGR, 5 },
> },
> +
> + [JZ4740_CLK_TCU] = {
> + "tcu", CGU_CLK_GATE,
> + .parents = { JZ4740_CLK_EXT, -1, -1, -1 },
> + .gate = { CGU_REG_CLKGR, 1 },
> + },
> };
>
> static void __init jz4740_cgu_init(struct device_node *np)
> diff --git a/include/dt-bindings/clock/jz4740-cgu.h b/include/dt-bindings/clock/jz4740-cgu.h
> index 43153d3e9bd26..ff7c27bc98e37 100644
> --- a/include/dt-bindings/clock/jz4740-cgu.h
> +++ b/include/dt-bindings/clock/jz4740-cgu.h
> @@ -33,5 +33,6 @@
> #define JZ4740_CLK_ADC 19
> #define JZ4740_CLK_I2C 20
> #define JZ4740_CLK_AIC 21
> +#define JZ4740_CLK_TCU 22
>
> #endif /* __DT_BINDINGS_CLOCK_JZ4740_CGU_H__ */
Hello,
We ran automated tests on a patchset that was proposed for merging into this
kernel tree. The patches were applied to:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
Commit: 40ce0335271a - Linux 5.3.4
The results of these automated tests are provided below.
Overall result: FAILED (see details below)
Merge: OK
Compile: OK
Tests: FAILED
All kernel binaries, config files, and logs are available for download here:
https://artifacts.cki-project.org/pipelines/208531
One or more kernel tests failed:
aarch64:
❌ Boot test
We hope that these logs can help you find the problem quickly. For the full
detail on our testing procedures, please scroll to the bottom of this message.
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
                  ,-.   ,-.
                 ( C ) ( K )  Continuous
                  `-',-.`-'   Kernel
                    ( I )     Integration
                     `-'
______________________________________________________________________________
Merge testing
-------------
We cloned this repository and checked out the following commit:
Repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
Commit: 40ce0335271a - Linux 5.3.4
We grabbed the 6a00b417bec3 commit of the stable queue repository.
We then merged the patchset with `git am`:
drm-vkms-fix-crc-worker-races.patch
drm-mcde-fix-uninitialized-variable.patch
drm-bridge-tc358767-increase-aux-transfer-length-lim.patch
drm-vkms-avoid-assigning-0-for-possible_crtc.patch
drm-panel-simple-fix-auo-g185han01-horizontal-blanki.patch
drm-amd-display-add-monitor-patch-to-add-t7-delay.patch
drm-amd-display-power-gate-all-dscs-at-driver-init-t.patch
drm-amd-display-fix-not-calling-ppsmu-to-trigger-pme.patch
drm-amd-display-clear-fec_ready-shadow-register-if-d.patch
drm-amd-display-copy-gsl-groups-when-committing-a-ne.patch
video-ssd1307fb-start-page-range-at-page_offset.patch
drm-tinydrm-kconfig-drivers-select-backlight_class_d.patch
drm-stm-attach-gem-fence-to-atomic-state.patch
drm-bridge-sii902x-fix-missing-reference-to-mclk-clo.patch
drm-panel-check-failure-cases-in-the-probe-func.patch
drm-rockchip-check-for-fast-link-training-before-ena.patch
drm-amdgpu-fix-hard-hang-for-s-g-display-bos.patch
drm-amd-display-use-proper-enum-conversion-functions.patch
drm-radeon-fix-eeh-during-kexec.patch
gpu-drm-radeon-fix-a-possible-null-pointer-dereferen.patch
clk-imx8mq-mark-ahb-clock-as-critical.patch
pci-rpaphp-avoid-a-sometimes-uninitialized-warning.patch
pinctrl-stmfx-update-pinconf-settings.patch
ipmi_si-only-schedule-continuously-in-the-thread-in-.patch
clk-qoriq-fix-wunused-const-variable.patch
clk-ingenic-jz4740-fix-pll-half-divider-not-read-wri.patch
clk-sunxi-ng-v3s-add-missing-clock-slices-for-mmc2-m.patch
drm-amd-display-fix-issue-where-252-255-values-are-c.patch
drm-amd-display-fix-frames_to_insert-math.patch
drm-amd-display-reprogram-vm-config-when-system-resu.patch
drm-amd-display-register-vupdate_no_lock-interrupts-.patch
powerpc-powernv-ioda2-allocate-tce-table-levels-on-d.patch
clk-actions-don-t-reference-clk_init_data-after-regi.patch
clk-sirf-don-t-reference-clk_init_data-after-registr.patch
clk-meson-axg-audio-don-t-reference-clk_init_data-af.patch
clk-sprd-don-t-reference-clk_init_data-after-registr.patch
clk-zx296718-don-t-reference-clk_init_data-after-reg.patch
clk-sunxi-don-t-call-clk_hw_get_name-on-a-hw-that-is.patch
powerpc-xmon-check-for-hv-mode-when-dumping-xive-inf.patch
powerpc-rtas-use-device-model-apis-and-serialization.patch
powerpc-ptdump-fix-walk_pagetables-address-mismatch.patch
powerpc-futex-fix-warning-oldval-may-be-used-uniniti.patch
powerpc-64s-radix-fix-memory-hotplug-section-page-ta.patch
powerpc-pseries-mobility-use-cond_resched-when-updat.patch
powerpc-perf-fix-imc-allocation-failure-handling.patch
pinctrl-tegra-fix-write-barrier-placement-in-pmx_wri.patch
powerpc-eeh-clear-stale-eeh_dev_no_handler-flag.patch
vfio_pci-restore-original-state-on-release.patch
drm-amdgpu-sdma5-fix-number-of-sdma5-trap-irq-types-.patch
drm-nouveau-kms-tu102-disable-input-lut-when-input-i.patch
drm-nouveau-volt-fix-for-some-cards-having-0-maximum.patch
pinctrl-amd-disable-spurious-firing-gpio-irqs.patch
clk-renesas-mstp-set-genpd_flag_always_on-for-clock-.patch
clk-renesas-cpg-mssr-set-genpd_flag_always_on-for-cl.patch
drm-amd-display-support-spdif.patch
drm-amd-powerpaly-fix-navi-series-custom-peak-level-.patch
drm-amd-display-fix-mpo-hubp-underflow-with-scatter-.patch
drm-amd-display-fix-trigger-not-generated-for-freesy.patch
selftests-powerpc-retry-on-host-facility-unavailable.patch
kbuild-do-not-enable-wimplicit-fallthrough-for-clang.patch
drm-amdgpu-si-fix-asic-tests.patch
powerpc-64s-exception-machine-check-use-correct-cfar.patch
pstore-fs-superblock-limits.patch
powerpc-eeh-clean-up-eeh-pes-after-recovery-finishes.patch
clk-qcom-gcc-sdm845-use-floor-ops-for-sdcc-clks.patch
powerpc-pseries-correctly-track-irq-state-in-default.patch
pinctrl-meson-gxbb-fix-wrong-pinning-definition-for-.patch
mailbox-mediatek-cmdq-clear-the-event-in-cmdq-initia.patch
arm-dts-dir685-drop-spi-cpol-from-the-display.patch
arm64-fix-unreachable-code-issue-with-cmpxchg.patch
clk-at91-select-parent-if-main-oscillator-or-bypass-.patch
clk-imx-pll14xx-avoid-glitch-when-set-rate.patch
clk-imx-clk-pll14xx-unbypass-pll-by-default.patch
clk-make-clk_bulk_get_all-return-a-valid-id.patch
powerpc-dump-kernel-log-before-carrying-out-fadump-o.patch
mbox-qcom-add-apcs-child-device-for-qcs404.patch
clk-sprd-add-missing-kfree.patch
scsi-core-reduce-memory-required-for-scsi-logging.patch
dma-buf-sw_sync-synchronize-signal-vs-syncpt-free.patch
f2fs-fix-to-drop-meta-node-pages-during-umount.patch
ext4-fix-potential-use-after-free-after-remounting-w.patch
mips-ingenic-disable-broken-btb-lookup-optimization.patch
clk-jz4740-add-tcu-clock.patch
mips-don-t-use-bc_false-uninitialized-in-__mm_isbran.patch
mips-tlbex-explicitly-cast-_page_no_exec-to-a-boolea.patch
i2c-cht-wc-fix-lockdep-warning.patch
mfd-intel-lpss-remove-d3cold-delay.patch
pci-tegra-fix-of-node-reference-leak.patch
hid-wacom-fix-several-minor-compiler-warnings.patch
rtc-bd70528-fix-driver-dependencies.patch
mips-atomic-fix-loongson_llsc_mb-wreckage.patch
pci-pci-hyperv-fix-build-errors-on-non-sysfs-config.patch
pci-layerscape-add-the-bar_fixed_64bit-property-to-t.patch
livepatch-nullify-obj-mod-in-klp_module_coming-s-err.patch
mips-atomic-fix-smp_mb__-before-after-_atomic.patch
arm-8898-1-mm-don-t-treat-faults-reported-from-cache.patch
soundwire-intel-fix-channel-number-reported-by-hardw.patch
pci-mobiveil-fix-the-cpu-base-address-setup-in-inbou.patch
arm-8875-1-kconfig-default-to-aeabi-w-clang.patch
rtc-snvs-fix-possible-race-condition.patch
rtc-pcf85363-pcf85263-fix-regmap-error-in-set_time.patch
power-supply-register-hwmon-devices-with-valid-names.patch
selinux-fix-residual-uses-of-current_security-for-th.patch
pci-add-pci_info_ratelimited-to-ratelimit-pci-separa.patch
hid-apple-fix-stuck-function-keys-when-using-fn.patch
pci-rockchip-propagate-errors-for-optional-regulator.patch
pci-histb-propagate-errors-for-optional-regulators.patch
pci-imx6-propagate-errors-for-optional-regulators.patch
pci-exynos-propagate-errors-for-optional-phys.patch
security-smack-fix-possible-null-pointer-dereference.patch
pci-use-static-const-struct-not-const-static-struct.patch
arm-8905-1-emit-__gnu_mcount_nc-when-using-clang-10..patch
arm-8903-1-ensure-that-usable-memory-in-bank-0-start.patch
i2c-tegra-move-suspend-handling-to-noirq-phase.patch
block-bfq-push-up-injection-only-after-setting-servi.patch
fat-work-around-race-with-userspace-s-read-via-block.patch
pktcdvd-remove-warning-on-attempting-to-register-non.patch
hypfs-fix-error-number-left-in-struct-pointer-member.patch
tools-power-x86-intel-speed-select-fix-high-priority.patch
crypto-hisilicon-fix-double-free-in-sec_free_hw_sgl.patch
mm-add-dummy-can_do_mlock-helper.patch
kbuild-clean-compressed-initramfs-image.patch
ocfs2-wait-for-recovering-done-after-direct-unlock-r.patch
kmemleak-increase-debug_kmemleak_early_log_size-defa.patch
arm64-consider-stack-randomization-for-mmap-base-onl.patch
mips-properly-account-for-stack-randomization-and-st.patch
arm-properly-account-for-stack-randomization-and-sta.patch
arm-use-stack_top-when-computing-mmap-base-address.patch
Compile testing
---------------
We compiled the kernel for 3 architectures:
aarch64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ Podman system integration test (as root)
✅ Podman system integration test (as user)
✅ Loopdev Sanity
✅ jvm test suite
✅ AMTU (Abstract Machine Test Utility)
✅ audit: audit testsuite test
✅ httpd: mod_ssl smoke sanity
✅ iotop: sanity
✅ tuned: tune-processes-through-perf
✅ Usex - version 1.9-29
✅ storage: SCSI VPD
✅ stress: stress-ng
🚧 ✅ LTP lite
🚧 ✅ POSIX pjd-fstest suites
Host 2:
❌ Boot test
⚡⚡⚡ xfstests: ext4
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ lvm thinp sanity
⚡⚡⚡ storage: software RAID testing
🚧 ⚡⚡⚡ Storage blktests
ppc64le:
Host 1:
✅ Boot test
✅ Podman system integration test (as root)
✅ Podman system integration test (as user)
✅ Loopdev Sanity
✅ jvm test suite
✅ AMTU (Abstract Machine Test Utility)
✅ audit: audit testsuite test
✅ httpd: mod_ssl smoke sanity
✅ iotop: sanity
✅ tuned: tune-processes-through-perf
✅ Usex - version 1.9-29
🚧 ✅ LTP lite
🚧 ✅ POSIX pjd-fstest suites
Host 2:
✅ Boot test
✅ xfstests: ext4
✅ selinux-policy: serge-testsuite
✅ lvm thinp sanity
✅ storage: software RAID testing
🚧 ✅ Storage blktests
x86_64:
Host 1:
✅ Boot test
✅ Podman system integration test (as root)
✅ Podman system integration test (as user)
✅ Loopdev Sanity
✅ jvm test suite
✅ AMTU (Abstract Machine Test Utility)
✅ audit: audit testsuite test
✅ httpd: mod_ssl smoke sanity
✅ iotop: sanity
✅ tuned: tune-processes-through-perf
✅ pciutils: sanity smoke test
✅ Usex - version 1.9-29
✅ storage: SCSI VPD
✅ stress: stress-ng
🚧 ✅ LTP lite
🚧 ✅ POSIX pjd-fstest suites
Host 2:
✅ Boot test
✅ xfstests: ext4
✅ selinux-policy: serge-testsuite
✅ lvm thinp sanity
✅ storage: software RAID testing
🚧 ✅ Storage blktests
Host 3:
✅ Boot test
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
Test sources: https://github.com/CKI-project/tests-beaker
💚 Pull requests are welcome for new tests or improvements to existing tests!
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
From: Daniel Borkmann <daniel(a)iogearbox.net>
commit c751798aa224fadc5124b49eeb38fb468c0fa039 upstream.
syzkaller managed to trigger the warning in bpf_jit_free() which checks via
bpf_prog_kallsyms_verify_off() for potentially unlinked JITed BPF progs
in kallsyms, and subsequently trips over GPF when walking kallsyms entries:
[...]
8021q: adding VLAN 0 to HW filter on device batadv0
8021q: adding VLAN 0 to HW filter on device batadv0
WARNING: CPU: 0 PID: 9869 at kernel/bpf/core.c:810 bpf_jit_free+0x1e8/0x2a0
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 9869 Comm: kworker/0:7 Not tainted 5.0.0-rc8+ #1
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events bpf_prog_free_deferred
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x113/0x167 lib/dump_stack.c:113
panic+0x212/0x40b kernel/panic.c:214
__warn.cold.8+0x1b/0x38 kernel/panic.c:571
report_bug+0x1a4/0x200 lib/bug.c:186
fixup_bug arch/x86/kernel/traps.c:178 [inline]
do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:290
invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973
RIP: 0010:bpf_jit_free+0x1e8/0x2a0
Code: 02 4c 89 e2 83 e2 07 38 d0 7f 08 84 c0 0f 85 86 00 00 00 48 ba 00 02 00 00 00 00 ad de 0f b6 43 02 49 39 d6 0f 84 5f fe ff ff <0f> 0b e9 58 fe ff ff 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1
RSP: 0018:ffff888092f67cd8 EFLAGS: 00010202
RAX: 0000000000000007 RBX: ffffc90001947000 RCX: ffffffff816e9d88
RDX: dead000000000200 RSI: 0000000000000008 RDI: ffff88808769f7f0
RBP: ffff888092f67d00 R08: fffffbfff1394059 R09: fffffbfff1394058
R10: fffffbfff1394058 R11: ffffffff89ca02c7 R12: ffffc90001947002
R13: ffffc90001947020 R14: ffffffff881eca80 R15: ffff88808769f7e8
BUG: unable to handle kernel paging request at fffffbfff400d000
#PF error: [normal kernel read fault]
PGD 21ffee067 P4D 21ffee067 PUD 21ffed067 PMD 9f942067 PTE 0
Oops: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 9869 Comm: kworker/0:7 Not tainted 5.0.0-rc8+ #1
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events bpf_prog_free_deferred
RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:495 [inline]
RIP: 0010:bpf_tree_comp kernel/bpf/core.c:558 [inline]
RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
RIP: 0010:bpf_prog_kallsyms_find+0x107/0x2e0 kernel/bpf/core.c:632
Code: 00 f0 ff ff 44 38 c8 7f 08 84 c0 0f 85 fa 00 00 00 41 f6 45 02 01 75 02 0f 0b 48 39 da 0f 82 92 00 00 00 48 89 d8 48 c1 e8 03 <42> 0f b6 04 30 84 c0 74 08 3c 03 0f 8e 45 01 00 00 8b 03 48 c1 e0
[...]
Upon further debugging, it turns out that whenever we trigger this
issue, the kallsyms removal in bpf_prog_ksym_node_del() was /skipped/,
and yet bpf_jit_free() reported that the entry is /in use/.
The problem is that symbol exposure via bpf_prog_kallsyms_add(), and also
perf_event_bpf_event(), was done /after/ bpf_prog_new_fd(). Once the
fd is exposed to the public, a parallel close request can come in right
before we attempt to do the bpf_prog_kallsyms_add().
Given that the prog reference count is one at this time, we start to rip
everything out from underneath us via bpf_prog_release() -> bpf_prog_put().
The memory is eventually released via deferred free, so we're seeing
that bpf_jit_free() has a kallsym entry because we added it from
bpf_prog_load() but /after/ bpf_prog_put() from the remote CPU.
Therefore, move both notifications /before/ we install the fd. The
issue was never seen between bpf_prog_alloc_id() and bpf_prog_new_fd()
because upon bpf_prog_get_fd_by_id() we'll take another reference to
the BPF prog, so we're still holding the original reference from the
bpf_prog_load().
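The ordering problem described above can be sketched as a single-threaded toy model in C. Everything here is hypothetical (`struct toy_prog`, `toy_prog_put()`, `toy_load()`); the racing close() is modelled as a put that runs immediately after the point where the fd would be installed, and none of this is the actual kernel implementation.

```c
#include <stdbool.h>

/* Toy stand-in for a BPF prog; NOT the kernel's struct bpf_prog. */
struct toy_prog {
	int refcnt;
	bool in_kallsyms;	/* linked into the symbol table */
	bool released;		/* last reference dropped, free is queued */
};

static void toy_kallsyms_add(struct toy_prog *p)
{
	p->in_kallsyms = true;
}

/* On the last reference: unlink from kallsyms, then queue the free. */
static void toy_prog_put(struct toy_prog *p)
{
	if (--p->refcnt == 0) {
		p->in_kallsyms = false;	/* no-op if it was never added */
		p->released = true;
	}
}

/*
 * Returns true if the deferred free would find a stale kallsyms entry.
 * add_before_fd selects the fixed ordering (notifications first).
 */
bool toy_load(bool add_before_fd)
{
	struct toy_prog p = { .refcnt = 1 };

	if (add_before_fd)
		toy_kallsyms_add(&p);

	/* fd is installed here; a parallel close drops the only reference */
	toy_prog_put(&p);

	if (!add_before_fd)
		toy_kallsyms_add(&p);	/* too late: prog already released */

	return p.released && p.in_kallsyms;
}
```

With the old ordering, `toy_load(false)` reports a stale entry; with the notifications moved ahead of the fd install, `toy_load(true)` does not.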
Fixes: 6ee52e2a3fe4 ("perf, bpf: Introduce PERF_RECORD_BPF_EVENT")
Fixes: 74451e66d516 ("bpf: make jited programs visible in traces")
Reported-by: syzbot+bd3bba6ff3fcea7a6ec6(a)syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Cc: Song Liu <songliubraving(a)fb.com>
Signed-off-by: Zubin Mithra <zsm(a)chromium.org>
---
Notes:
* Syzkaller triggered a WARNING on 4.14 kernels with the following
stacktrace:
Call Trace:
dump_stack+0x81/0xb3
panic+0x14a/0x2ad
? refcount_error_report+0xf6/0xf6
? set_fs+0x1a/0x29
? bpf_jit_free+0x8b/0xce
__warn+0xde/0x112
? bpf_jit_free+0x8b/0xce
report_bug+0x91/0xda
fixup_bug+0x2c/0x4c
do_error_trap+0xda/0x192
? fixup_bug+0x4c/0x4c
? hlock_class+0x6d/0x8b
? mark_lock+0x3a/0x26d
? trace_hardirqs_off_caller+0xf2/0xfb
? trace_hardirqs_off_thunk+0x1a/0x1c
invalid_op+0x1b/0x40
? bpf_jit_binary_free+0x15/0x20
? bpf_jit_free+0x7b/0xce
process_one_work+0x484/0x793
? wq_calc_node_cpumask.constprop.37+0x25/0x25
? worker_clr_flags+0x52/0x88
worker_thread+0x2b8/0x3d1
? rescuer_thread+0x425/0x425
kthread+0x192/0x1a2
? __list_del_entry+0x41/0x41
ret_from_fork+0x3a/0x50
* The commit is not present in linux-4.19.y. A backport has been sent
separately.
* The patch resolves conflicts inside bpf_prog_load that arise due to
trace_bpf_prog_load() not being present upstream when c751798aa224 was
applied and perf_event_bpf_event() not being present in linux-4.14.y.
* Tests run: Chrome OS tryjobs, Syzkaller reproducer
kernel/bpf/syscall.c | 30 ++++++++++++++++++------------
1 file changed, 18 insertions(+), 12 deletions(-)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 2d828d3469822..59d2e94ecb798 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1067,20 +1067,26 @@ static int bpf_prog_load(union bpf_attr *attr)
if (err)
goto free_used_maps;
- err = bpf_prog_new_fd(prog);
- if (err < 0) {
- /* failed to allocate fd.
- * bpf_prog_put() is needed because the above
- * bpf_prog_alloc_id() has published the prog
- * to the userspace and the userspace may
- * have refcnt-ed it through BPF_PROG_GET_FD_BY_ID.
- */
- bpf_prog_put(prog);
- return err;
- }
-
+ /* Upon success of bpf_prog_alloc_id(), the BPF prog is
+ * effectively publicly exposed. However, retrieving via
+ * bpf_prog_get_fd_by_id() will take another reference,
+ * therefore it cannot be gone underneath us.
+ *
+ * Only for the time /after/ successful bpf_prog_new_fd()
+ * and before returning to userspace, we might just hold
+ * one reference and any parallel close on that fd could
+ * rip everything out. Hence, below notifications must
+ * happen before bpf_prog_new_fd().
+ *
+ * Also, any failure handling from this point onwards must
+ * be using bpf_prog_put() given the program is exposed.
+ */
bpf_prog_kallsyms_add(prog);
trace_bpf_prog_load(prog, err);
+
+ err = bpf_prog_new_fd(prog);
+ if (err < 0)
+ bpf_prog_put(prog);
return err;
free_used_maps:
--
2.23.0.444.g18eeb5a265-goog
[ Upstream commit cb8acabbe33b110157955a7425ee876fb81e6bbc ]
Commit 7211aef86f79 ("block: mq-deadline: Fix write completion
handling") added a call to blk_mq_sched_mark_restart_hctx() in
dd_dispatch_request() to make sure that write request dispatching does
not stall when all target zones are locked. This fix left a subtle race
when a write completion happens during a dispatch execution on another
CPU:
CPU 0: Dispatch                    CPU1: write completion
dd_dispatch_request()
    lock(&dd->lock);
    ...
    lock(&dd->zone_lock);          dd_finish_request()
    rq = find request              lock(&dd->zone_lock);
    unlock(&dd->zone_lock);
                                   zone write unlock
                                   unlock(&dd->zone_lock);
                                   ...
                                   __blk_mq_free_request
                                       check restart flag (not set)
                                       -> queue not run
    ...
    if (!rq && have writes)
        blk_mq_sched_mark_restart_hctx()
    unlock(&dd->lock)
Since the dispatch context finishes after the write request completion
handling, the mark indicating that the queue needs a restart is not seen
by __blk_mq_free_request() and blk_mq_sched_restart() is not executed,
leading to a dispatch stall under 100% write workloads.
Fix this by moving the call to blk_mq_sched_mark_restart_hctx() from
dd_dispatch_request() into dd_finish_request() under the zone lock to
ensure full mutual exclusion between write request dispatch selection
and zone unlock on write request completion.
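The interleaving above, and the effect of moving the restart mark, can be sketched as a single-threaded toy model in C. The names (`struct toy_queue`, `toy_free_request()`, `toy_interleaving()`) are hypothetical and the steps are replayed in the order shown in the diagram; this is an illustration under those assumptions, not the mq-deadline code.

```c
#include <stdbool.h>

/* Toy scheduler state; NOT the kernel's struct deadline_data. */
struct toy_queue {
	bool zone_locked;	/* target zone has a write in flight */
	bool writes_pending;	/* write FIFO is not empty */
	bool restart_flag;	/* queue marked as needing a restart */
	bool queue_rerun;	/* queue was actually run again */
};

/* Request free path: rerun the queue only if a restart was marked. */
static void toy_free_request(struct toy_queue *q)
{
	if (q->restart_flag) {
		q->restart_flag = false;
		q->queue_rerun = true;
	}
}

/*
 * Replays the interleaving from the diagram. mark_in_finish selects the
 * fixed behaviour (mark restart in the completion path, under the zone
 * lock). Returns true if the queue gets rerun, false if it stalls.
 */
bool toy_interleaving(bool mark_in_finish)
{
	struct toy_queue q = { .zone_locked = true, .writes_pending = true };

	/* CPU 0: dispatch finds no request, the target zone is locked */
	bool rq_found = !q.zone_locked;

	/* CPU 1: completion unlocks the zone, then frees the request */
	q.zone_locked = false;
	if (mark_in_finish && q.writes_pending)
		q.restart_flag = true;
	toy_free_request(&q);

	/* CPU 0: the old code marks the restart only now, too late */
	if (!mark_in_finish && !rq_found && q.writes_pending)
		q.restart_flag = true;

	return q.queue_rerun;
}
```

In this model `toy_interleaving(false)` leaves the queue stalled, while `toy_interleaving(true)` reruns it, because the mark is set before the free path checks the flag.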
Fixes: 7211aef86f79 ("block: mq-deadline: Fix write completion handling")
Cc: stable(a)vger.kernel.org
Reported-by: Hans Holmberg <Hans.Holmberg(a)wdc.com>
Reviewed-by: Hans Holmberg <hans.holmberg(a)wdc.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Damien Le Moal <damien.lemoal(a)wdc.com>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
---
block/mq-deadline.c | 23 +++++++++++++----------
1 file changed, 13 insertions(+), 10 deletions(-)
diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index d5e21ce44d2c..69094d641062 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -376,13 +376,6 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd)
* hardware queue, but we may return a request that is for a
* different hardware queue. This is because mq-deadline has shared
* state for all hardware queues, in terms of sorting, FIFOs, etc.
- *
- * For a zoned block device, __dd_dispatch_request() may return NULL
- * if all the queued write requests are directed at zones that are already
- * locked due to on-going write requests. In this case, make sure to mark
- * the queue as needing a restart to ensure that the queue is run again
- * and the pending writes dispatched once the target zones for the ongoing
- * write requests are unlocked in dd_finish_request().
*/
static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
{
@@ -391,9 +384,6 @@ static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
spin_lock(&dd->lock);
rq = __dd_dispatch_request(dd);
- if (!rq && blk_queue_is_zoned(hctx->queue) &&
- !list_empty(&dd->fifo_list[WRITE]))
- blk_mq_sched_mark_restart_hctx(hctx);
spin_unlock(&dd->lock);
return rq;
@@ -559,6 +549,13 @@ static void dd_prepare_request(struct request *rq, struct bio *bio)
* spinlock so that the zone is never unlocked while deadline_fifo_request()
* or deadline_next_request() are executing. This function is called for
* all requests, whether or not these requests complete successfully.
+ *
+ * For a zoned block device, __dd_dispatch_request() may have stopped
+ * dispatching requests if all the queued requests are write requests directed
+ * at zones that are already locked due to on-going write requests. To ensure
+ * write request dispatch progress in this case, mark the queue as needing a
+ * restart to ensure that the queue is run again after completion of the
+ * request and zones being unlocked.
*/
static void dd_finish_request(struct request *rq)
{
@@ -570,6 +567,12 @@ static void dd_finish_request(struct request *rq)
spin_lock_irqsave(&dd->zone_lock, flags);
blk_req_zone_write_unlock(rq);
+ if (!list_empty(&dd->fifo_list[WRITE])) {
+ struct blk_mq_hw_ctx *hctx;
+
+ hctx = blk_mq_map_queue(q, rq->mq_ctx->cpu);
+ blk_mq_sched_mark_restart_hctx(hctx);
+ }
spin_unlock_irqrestore(&dd->zone_lock, flags);
}
}
--
2.21.0