This is the start of the stable review cycle for the 5.7.7 release. There are 265 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 01 Jul 2020 03:14:48 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/p...
or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.7.y and the diffstat can be found below.
-- Thanks, Sasha
-------------------------
Pseudo-Shortlog of commits:
Aaron Plattner (1): ALSA: hda: Add NVIDIA codec IDs 9a & 9d through a0 to patch table
Adam Ford (1): drm/panel-simple: fix connector type for LogicPD Type28 Display
Aditya Pakki (3): rocker: fix incorrect error handling in dma_rings_init RDMA/rvt: Fix potential memory leak caused by rvt_alloc_rq test_objagg: Fix potential memory leak in error handling
Al Cooper (1): xhci: Fix enumeration issue when setting max packet size for FS devices.
Al Viro (1): fix a braino in "sparc32: fix register window handling in genregs32_[gs]et()"
Alexander Lobakin (10): net: ethtool: add missing string for NETIF_F_GSO_TUNNEL_REMCSUM net: ethtool: add missing NETIF_F_GSO_FRAGLIST feature string net: qed: fix left elements count calculation net: qed: fix async event callbacks unregistering net: qede: stop adding events on an already destroyed workqueue net: qed: fix NVMe login fails over VFs net: qed: fix excessive QM ILT lines consumption net: qede: fix PTP initialization on recovery net: qede: fix use-after-free on recovery and AER handling net: qed: reset ILT block sizes before recomputing to fix crashes
Alexander Usyskin (1): mei: me: add tiger lake point device ids for H platforms.
Anand Moon (1): Revert "usb: dwc3: exynos: Add support for Exynos5422 suspend clk"
Anson Huang (1): soc: imx8m: Correct i.MX8MP UID fuse offset
Anton Eidelman (2): nvme-multipath: fix deadlock between ana_work and scan_work nvme-multipath: fix deadlock due to head->lock
Ard Biesheuvel (1): net: phy: mscc: avoid skcipher API for single block AES encryption
Arseny Solokha (1): powerpc/fsl_booke/32: Fix build with CONFIG_RANDOMIZE_BASE
Arvind Sankar (1): efi/x86: Setup stack correctly for efi_pe_entry
Babu Moger (1): x86/resctrl: Fix memory bandwidth counter width for AMD
Ben Widawsky (1): mm/memory_hotplug.c: fix false softlockup during pfn range removal
Bernard Zhao (1): drm/amd: fix potential memleak in err branch
Borislav Petkov (1): EDAC/amd64: Read back the scrub rate PCI register on F15h
Chaitanya Kulkarni (1): nvmet: fail outstanding host posted AEN req
Charles Keepax (1): regmap: Fix memory leak from regmap_register_patch
Christoffer Nielsen (1): ALSA: usb-audio: Add registration quirk for Kingston HyperX Cloud Flight S
Christopher Swenson (1): ALSA: usb-audio: Set 48 kHz rate for Rodecaster
Chuck Lever (2): SUNRPC: Properly set the @subbuf parameter of xdr_buf_subsegment() xprtrdma: Fix handling of RDMA_ERROR replies
Chuhong Yuan (1): USB: ohci-sm501: Add missed iounmap() in remove
Claudiu Beznea (3): net: macb: undo operations in case of failure net: macb: call pm_runtime_put_sync on failure path net: macb: free resources on failure path of at91ether_open()
Claudiu Manoil (1): enetc: Fix tx rings bitmap iteration range, irq handling
Colin Ian King (1): qed: add missing error test for DBG_STATUS_NO_MATCHING_FRAMING_MODE
Cong Wang (1): genetlink: clean up family attributes allocations
Dan Carpenter (3): x86/resctrl: Fix a NULL vs IS_ERR() static checker warning in rdt_cdp_peer_get() usb: gadget: udc: Potential Oops in error handling code Staging: rtl8723bs: prevent buffer overflow in update_sta_support_rate()
Daniel Gomez (1): drm: rcar-du: Fix build error
Daniel Vetter (1): drm/fb-helper: Fix vt restore
Dave Martin (1): arm64/sve: Eliminate data races on sve_default_vl
David Christensen (1): tg3: driver sleeps indefinitely when EEH errors exceed eeh_max_freezes
David Howells (3): rxrpc: Fix notification call on completion of discarded calls rxrpc: Fix handling of rwind from an ACK packet afs: Fix storage of cell names
David Milburn (1): nvmet: cleanups the loop in nvmet_async_events_process
David Rientjes (3): dma-direct: re-encrypt memory if dma_direct_alloc_pages() fails dma-direct: check return value when encrypting or decrypting memory dma-direct: add missing set_memory_decrypted() for coherent mapping
Dejin Zheng (1): net: phy: smsc: fix printing too many logs
Denis Efremov (2): drm/amd/display: Use kfree() to free rgb_user in calculate_user_regamma_ramp() drm/radeon: fix fb_div check in ni_init_smc_spll_table()
Denis Kirjanov (1): tcp: don't ignore ECN CWR on pure ACK
Dennis Dalessandro (1): IB/hfi1: Fix module use count flaw due to leftover module put calls
Dinghao Liu (1): hwrng: ks-sa - Fix runtime PM imbalance on error
Dmitry Baryshkov (1): pinctrl: qcom: spmi-gpio: fix warning about irq chip reusage
Doug Berger (1): net: bcmgenet: use hardware padding of runt frames
Drew Fustini (1): ARM: dts: am335x-pocketbeagle: Fix mmc0 Write Protect
Eddie James (1): i2c: fsi: Fix the port number field in status register
Eric Dumazet (2): net: increment xmit_recursion level in dev_direct_xmit() tcp: grow window for OOO packets only for SACK flows
Fabian Vogt (1): efi/tpm: Verify event log header before parsing
Fan Guo (1): RDMA/mad: Fix possible memory leak in ib_mad_post_receive_mads()
Filipe Manana (7): btrfs: fix a block group ref counter leak after failure to remove block group btrfs: fix bytes_may_use underflow when running balance and scrub in parallel btrfs: fix data block group relocation failure due to concurrent scrub btrfs: check if a log root exists before locking the log_mutex on unlink btrfs: fix hang on snapshot creation after RWF_NOWAIT write btrfs: fix failure of RWF_NOWAIT write into prealloc extent beyond eof btrfs: fix RWF_NOWAIT write not failling when we need to cow
Florian Fainelli (3): net: phy: Check harder for errors in get_phy_id() of: of_mdio: Correct loop scanning logic net: dsa: bcm_sf2: Fix node reference count
Frieder Schrempf (2): ARM: dts: imx6ul-kontron: Move watchdog from Kontron i.MX6UL/ULL board to SoM ARM: dts: imx6ul-kontron: Change WDOG_ANY signal from push-pull to open-drain
Gal Pressman (1): RDMA/efa: Set maximum pkeys device attribute
Gao Xiang (1): erofs: fix partially uninitialized misuse in z_erofs_onlinepage_fixup
Gaurav Singh (2): ethtool: Fix check in ethtool_rx_flow_rule_create bpf, xdp, samples: Fix null pointer dereference in *_user code
Geliang Tang (1): mptcp: drop sndr_key in mptcp_syn_options
Harish (1): selftests/powerpc: Fix build failure in ebb tests
Heikki Krogerus (1): usb: typec: mux: intel_pmc_mux: Fix DP alternate mode entry
Heiner Kallweit (1): r8169: fix firmware not resetting tp->ocp_base
Huaisheng Ye (1): dm writecache: correct uncommitted_block when discarding uncommitted entry
Huy Nguyen (1): xfrm: Fix double ESP trailer insertion in IPsec crypto offload.
Ido Schimmel (1): mlxsw: spectrum: Do not rely on machine endianness
Igor Mammedov (1): kvm: lapic: fix broken vcpu hotplug
Ilya Ponetayev (1): sch_cake: don't try to reallocate or unshare skb unconditionally
Jason A. Donenfeld (5): wireguard: device: avoid circular netns references wireguard: receive: account for napi_gro_receive never returning GRO_DROP socionext: account for napi_gro_receive never returning GRO_DROP wil6210: account for napi_gro_receive never returning GRO_DROP ACPI: configfs: Disallow loading ACPI tables when locked down
Jeremy Kerr (1): net: usb: ax88179_178a: fix packet alignment padding
Jiping Ma (1): arm64: perf: Report the PC value in REGS_ABI_32 mode
Jiri Slaby (1): syscalls: Fix offset type of ksys_ftruncate()
Joakim Tjernlund (1): cdc-acm: Add DISABLE_ECHO quirk for Microchip/SMSC chip
Johannes Weiner (1): mm: memcontrol: handle div0 crash race condition in memory.low
John van der Kamp (1): drm/amdgpu/display: Unlock mutex on error
Julian Wiedmann (1): s390/qeth: fix error handling for isolation mode cmds
Junxiao Bi (4): ocfs2: avoid inode removal while nfsd is accessing it ocfs2: load global_inode_alloc ocfs2: fix value of OCFS2_INVALID_SLOT ocfs2: fix panic on nfs server over ocfs2
Juri Lelli (2): sched/deadline: Initialize ->dl_boosted sched/core: Fix PI boosting between RT and DEADLINE tasks
Kai-Heng Feng (3): xhci: Poll for U0 after disabling USB2 LPM xhci: Return if xHCI doesn't support LPM ALSA: hda/realtek: Add mute LED and micmute LED support for HP systems
Kees Cook (1): x86/cpu: Use pinning mask for CR4 bits needing to be 0
Keith Busch (1): nvme-multipath: set bdi capabilities once
Krzysztof Kozlowski (1): spi: spi-fsl-dspi: Free DMA memory with matching function
Laurence Tratt (1): ALSA: usb-audio: Add implicit feedback quirk for SSL2+.
Leon Romanovsky (1): RDMA/core: Check that type_attrs is not NULL prior access
Li Jun (1): usb: typec: tcpci_rt1711h: avoid screaming irq causing boot hangs
Longfang Liu (1): USB: ehci: reopen solution for Synopsys HC bug
Lorenzo Bianconi (2): openvswitch: take into account de-fragmentation/gso_size in execute_check_pkt_len samples/bpf: xdp_redirect_cpu: Set MAX_CPUS according to NR_CPUS
Lu Baolu (3): iommu/vt-d: Set U/S bit in first level page table by default iommu/vt-d: Enable PCI ACS for platform opt in hint iommu/vt-d: Update scalable mode paging structure coherency
Luis Chamberlain (1): blktrace: break out of blktrace setup on concurrent calls
Macpaul Lin (2): usb: host: xhci-mtk: avoid runtime suspend when removing hcd ALSA: usb-audio: add quirk for Samsung USBC Headset (AKG)
Mans Rullgard (1): i2c: core: check returned size of emulated smbus block read
Marcelo Ricardo Leitner (1): sctp: Don't advertise IPv4 addresses if ipv6only is set on the socket
Mark Zhang (1): RDMA/cma: Protect bind_list and listen_list while finding matching cm id
Martin (1): bareudp: Fixed multiproto mode configuration
Martin Fuzzey (1): regulator: da9063: fix LDO9 suspend and warning.
Masahiro Yamada (1): kbuild: improve cc-option to clean up all temporary files
Masami Hiramatsu (2): kprobes: Suppress the suspicious RCU warning on kprobes tracing: Fix event trigger to accept redundant spaces
Mathias Nyman (1): xhci: Fix incorrect EP_STATE_MASK
Matt Fleming (1): x86/asm/64: Align start of __clear_user() loop to 16-bytes
Matthew Hagan (3): ARM: bcm: Select ARM_TIMER_SP804 for ARCH_BCM_NSP ARM: dts: NSP: Disable PL330 by default, add dma-coherent property ARM: dts: NSP: Correct FA2 mailbox node
Mauricio Faria de Oliveira (1): bcache: check and adjust logical block size for backing devices
Michael Chan (3): bnxt_en: Store the running firmware version code. bnxt_en: Do not enable legacy TX push on older firmware. bnxt_en: Fix statistics counters issue during ifdown with older firmware.
Michal Kalderon (1): RDMA/qedr: Fix KASAN: use-after-free in ucma_event_handler+0x532
Mikulas Patocka (1): dm writecache: add cond_resched to loop in persistent_memory_claim()
Minas Harutyunyan (1): usb: dwc2: Postponed gadget registration to the udc class driver
Muchun Song (1): mm/memcontrol.c: add missed css_put()
Nathan Chancellor (2): s390/vdso: Use $(LD) instead of $(CC) to link vDSO ACPI: sysfs: Fix pm_profile_attr type
Nathan Huckleberry (1): riscv/atomic: Fix sign extension for RV64I
Navid Emamdoost (1): sata_rcar: handle pm_runtime_get_sync failure cases
Neal Cardwell (2): tcp_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT bpf: tcp: bpf_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT
Olga Kornievskaia (1): NFSv4 fix CLOSE not waiting for direct IO compeletion
Oskar Holmlund (2): ARM: dts: Fix am33xx.dtsi USB ranges length ARM: dts: Fix am33xx.dtsi ti,sysc-mask wrong softreset flag
Pavel Begunkov (1): io_uring: fix hanging iopoll in case of -EAGAIN
Peter Chen (3): usb: cdns3: trace: using correct dir value usb: cdns3: ep0: fix the test mode set incorrectly usb: cdns3: ep0: add spinlock for cdns3_check_new_setup
Philipp Fent (1): efi/libstub: Fix path separator regression
Pierre-Louis Bossart (1): ASoC: soc-pcm: fix checks for multi-cpu FE dailinks
Qiushi Wu (2): efi/esrt: Fix reference count leak in esre_create_sysfs_entry. ASoC: rockchip: Fix a reference count leak.
Rafał Miłecki (1): ARM: dts: BCM5301X: Add missing memory "device_type" for Luxul XWC-2000
Rahul Lakkireddy (2): cxgb4: move handling L2T ARP failures to caller cxgb4: move PTP lock and unlock to caller in Tx path
Reinette Chatre (2): x86/cpu: Move resctrl CPUID code to resctrl/ x86/resctrl: Support CPUID enumeration of MBM counter width
Robin Gong (3): regualtor: pfuze100: correct sw1a/sw2 on pfuze3000 arm64: dts: imx8mm-evk: correct ldo1/ldo2 voltage range arm64: dts: imx8mn-ddr4-evk: correct ldo1/ldo2 voltage range
Roman Bolshakov (1): scsi: qla2xxx: Keep initiator ports after RSCN
Russell King (3): net: phylink: fix ethtool -A with attached PHYs net: phylink: ensure manual pause mode configuration takes effect netfilter: ipset: fix unaligned atomic access
Sabrina Dubroca (1): geneve: allow changing DF behavior after creation
Sagi Grimberg (2): nvme: fix possible deadlock when I/O is blocked nvme: don't protect ns mutation with ns->head->lock
Sami Tolvanen (1): recordmcount: support >64k sections
Sascha Ortmann (1): tracing/boottime: Fix kprobe multiple events
Sasha Levin (1): Linux 5.7.7-rc1
Sean Christopherson (3): KVM: nVMX: Plumb L2 GPA through to PML emulation KVM: VMX: Stop context switching MSR_IA32_UMWAIT_CONTROL x86/cpu: Reinitialize IA32_FEAT_CTL MSR on BSP during wakeup
SeongJae Park (1): scsi: lpfc: Avoid another null dereference in lpfc_sli4_hba_unset()
Shannon Nelson (2): ionic: update the queue count on open ionic: tame the watchdog timer on reconfig
Shay Drory (1): IB/mad: Fix use after free when destroying MAD agent
Shengjiu Wang (1): ASoC: fsl_ssi: Fix bclk calculation for mono channel
Srinivas Kandagatla (3): ASoC: q6asm: handle EOS correctly ASoc: q6afe: add support to get port direction ASoC: qcom: common: set correct directions for dailinks
Stanislav Fomichev (1): bpf: Don't return EINVAL from {get,set}sockopt when optlen > PAGE_SIZE
Steffen Maier (1): scsi: zfcp: Fix panic on ERP timeout for previously dismissed ERP action
Steven Rostedt (VMware) (1): ring-buffer: Zero out time extend if it is nested and not absolute
Stylon Wang (1): drm/amd/display: Enable output_bpc property on all outputs
Sven Auhagen (1): mvpp2: ethtool rxtx stats fix
Sven Schnelle (4): s390/seccomp: pass syscall arguments via seccomp_data s390/ptrace: return -ENOSYS when invalid syscall is supplied s390/ptrace: pass invalid syscall numbers to tracing s390/ptrace: fix setting syscall number
Taehee Yoo (3): net: core: reduce recursion limit value ip6_gre: fix use-after-free in ip6gre_tunnel_lookup() ip_tunnel: fix use-after-free in ip_tunnel_lookup()
Takashi Iwai (3): ALSA: usb-audio: Fix potential use-after-free of streams ALSA: usb-audio: Fix OOB access of mixer element list ALSA: hda/realtek - Add quirk for MSI GE63 laptop
Tang Bin (1): usb: host: ehci-exynos: Fix error check in exynos_ehci_probe()
Tariq Toukan (1): net: Do not clear the sock TX queue in sk_set_socket()
Thierry Reding (1): Revert "i2c: tegra: Fix suspending in active runtime PM state"
Thomas Falcon (2): ibmveth: Fix max MTU limit ibmvnic: Harden device login requests
Thomas Martitz (1): net: bridge: enfore alignment for ethernet address
Todd Kjos (1): binder: fix null deref of proc->context
Toke Høiland-Jørgensen (3): sch_cake: don't call diffserv parsing code when it is not needed sch_cake: fix a few style nits devmap: Use bpf_map_area_alloc() for allocating hash buckets
Tom Seewald (1): RDMA/siw: Fix pointer-to-int-cast warning in siw_rx_pbl()
Tomas Winkler (1): mei: me: disable mei interface on Mehlow server platforms
Tomasz Meresiński (1): usb: add USB_QUIRK_DELAY_INIT for Logitech C922
Tomi Valkeinen (1): drm/panel-simple: fix connector type for newhaven_nhd_43_480272ef_atxl
Tony Lindgren (6): bus: ti-sysc: Flush posted write on enable and disable bus: ti-sysc: Use optional clocks on for enable and wait for softreset bit bus: ti-sysc: Ignore clockactivity unless specified as a quirk bus: ti-sysc: Fix uninitialized framedonetv_irq ARM: OMAP2+: Fix legacy mode dss_reset ARM: dts: Fix duovero smsc interrupt for suspend
Trond Myklebust (1): pNFS/flexfiles: Fix list corruption if the mirror count changes
Vasily Averin (1): sunrpc: fixed rollback in rpc_gssd_dummy_populate()
Vasundhara Volam (1): bnxt_en: Read VPD info only for PFs
Vidya Sagar (1): pinctrl: tegra: Use noirq suspend/resume callbacks
Vincent Chen (1): clk: sifive: allocate sufficient memory for struct __prci_data
Vincent Guittot (1): sched/cfs: change initial value of runnable_avg
Vincenzo Frascino (1): s390/vdso: fix vDSO clock_getres()
Vishal Verma (1): nvdimm/region: always show the 'align' attribute
Vitaly Kuznetsov (1): Revert "KVM: VMX: Micro-optimize vmexit time when not exposing PMU"
Vlastimil Babka (1): mm, compaction: make capture control handling safe wrt interrupts
Waiman Long (2): mm, slab: fix sign conversion problem in memcg_uncharge_slab() mm/slab: use memzero_explicit() in kzfree()
Wang Hai (1): mld: fix memory leak in ipv6_mc_destroy_dev()
Wei Yongjun (1): mptcp: fix memory leak in mptcp_subflow_create_socket()
Weiping Zhang (1): block: update hctx map when use multiple maps
Wenhui Sheng (1): drm/amdgpu: add fw release for sdma v5_0
Will Deacon (1): arm64: sve: Fix build failure when ARM64_SVE=y and SYSCTL=n
Willem de Bruijn (1): selftests/net: report etf errors correctly
Xiaoyao Li (1): KVM: X86: Fix MSR range of APIC registers in X2APIC mode
Xiyu Yang (1): cifs: Fix cached_fid refcnt leak in open_shroot
Yang Yingliang (1): net: fix memleak in register_netdevice()
Yash Shah (1): RISC-V: Don't allow write+exec only page mapping request in mmap
Ye Bin (1): ata/libata: Fix usage of page address by page_address in ata_scsi_mode_select_xlat function
Yick W. Tse (1): ALSA: usb-audio: add quirk for Denon DCD-1500RE
Yoshihiro Shimoda (1): usb: renesas_usbhs: getting residue from callback_result
Zekun Shen (1): net: alx: fix race condition in alx_remove
Zhang Xiaoxu (2): cifs/smb3: Fix data inconsistent when punch hole cifs/smb3: Fix data inconsistent when zero file range
Zheng Bin (1): loop: replace kill_bdev with invalidate_bdev
guodeqing (1): net: Fix the arp error in some cases
yu kuai (2): block/bio-integrity: don't free 'buf' if bio_integrity_add_page() failed ARM: imx5: add missing put_device() call in imx_suspend_alloc_ocram()
Makefile | 4 +- arch/arm/boot/dts/am335x-pocketbeagle.dts | 1 - arch/arm/boot/dts/am33xx.dtsi | 4 +- arch/arm/boot/dts/bcm-nsp.dtsi | 10 +- arch/arm/boot/dts/bcm47094-luxul-xwc-2000.dts | 1 + arch/arm/boot/dts/bcm958522er.dts | 4 + arch/arm/boot/dts/bcm958525er.dts | 4 + arch/arm/boot/dts/bcm958525xmc.dts | 4 + arch/arm/boot/dts/bcm958622hr.dts | 4 + arch/arm/boot/dts/bcm958623hr.dts | 4 + arch/arm/boot/dts/bcm958625hr.dts | 4 + arch/arm/boot/dts/bcm958625k.dts | 4 + arch/arm/boot/dts/imx6ul-kontron-n6x1x-s.dtsi | 13 -- .../dts/imx6ul-kontron-n6x1x-som-common.dtsi | 13 ++ arch/arm/boot/dts/omap4-duovero-parlor.dts | 2 +- arch/arm/mach-bcm/Kconfig | 1 + arch/arm/mach-imx/pm-imx5.c | 6 +- arch/arm/mach-omap2/omap_hwmod.c | 2 +- arch/arm64/boot/dts/freescale/imx8mm-evk.dts | 4 +- .../boot/dts/freescale/imx8mn-ddr4-evk.dts | 4 +- arch/arm64/kernel/fpsimd.c | 31 +++-- arch/arm64/kernel/perf_regs.c | 25 +++- arch/powerpc/mm/nohash/kaslr_booke.c | 1 + arch/riscv/include/asm/cmpxchg.h | 8 +- arch/riscv/kernel/sys_riscv.c | 6 + arch/s390/include/asm/vdso.h | 1 + arch/s390/kernel/asm-offsets.c | 2 +- arch/s390/kernel/entry.S | 2 +- arch/s390/kernel/ptrace.c | 83 ++++++++++-- arch/s390/kernel/time.c | 1 + arch/s390/kernel/vdso64/Makefile | 10 +- arch/s390/kernel/vdso64/clock_getres.S | 10 +- arch/sparc/kernel/ptrace_32.c | 9 +- arch/x86/boot/compressed/head_64.S | 11 +- arch/x86/include/asm/cpu.h | 5 + arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/include/asm/mwait.h | 2 - arch/x86/include/asm/processor.h | 3 +- arch/x86/include/asm/resctrl_sched.h | 3 + arch/x86/kernel/cpu/centaur.c | 1 + arch/x86/kernel/cpu/common.c | 51 ++----- arch/x86/kernel/cpu/cpu.h | 4 - arch/x86/kernel/cpu/resctrl/core.c | 29 ++++ arch/x86/kernel/cpu/resctrl/internal.h | 1 + arch/x86/kernel/cpu/resctrl/rdtgroup.c | 1 + arch/x86/kernel/cpu/umwait.c | 6 - arch/x86/kernel/cpu/zhaoxin.c | 1 + arch/x86/kvm/lapic.c | 1 + arch/x86/kvm/mmu.h | 2 +- arch/x86/kvm/mmu/mmu.c | 4 +- arch/x86/kvm/mmu/paging_tmpl.h | 7 +- arch/x86/kvm/vmx/vmx.c | 27 +--- arch/x86/kvm/x86.c | 4 +- arch/x86/lib/usercopy_64.c | 1 + arch/x86/power/cpu.c | 6 + block/bio-integrity.c | 1 - block/blk-mq.c | 4 +- drivers/acpi/acpi_configfs.c | 6 +- drivers/acpi/sysfs.c | 4 +- drivers/android/binder.c | 14 +- drivers/ata/libata-scsi.c | 9 +- drivers/ata/sata_rcar.c | 11 +- drivers/base/regmap/regmap.c | 1 + drivers/block/loop.c | 8 +- drivers/bus/ti-sysc.c | 98 ++++++++++---- drivers/char/hw_random/ks-sa-rng.c | 1 + drivers/clk/sifive/fu540-prci.c | 5 +- drivers/edac/amd64_edac.c | 2 + drivers/firmware/efi/esrt.c | 2 +- drivers/firmware/efi/libstub/file.c | 16 ++- drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 6 +- drivers/gpu/drm/amd/amdkfd/kfd_process.c | 1 + .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 4 +- .../amd/display/amdgpu_dm/amdgpu_dm_hdcp.c | 6 +- .../amd/display/modules/color/color_gamma.c | 2 +- drivers/gpu/drm/drm_fb_helper.c | 63 +++++++-- drivers/gpu/drm/panel/panel-simple.c | 2 + drivers/gpu/drm/radeon/ni_dpm.c | 2 +- drivers/gpu/drm/rcar-du/Kconfig | 1 + drivers/i2c/busses/i2c-fsi.c | 2 +- drivers/i2c/busses/i2c-tegra.c | 9 -- drivers/i2c/i2c-core-smbus.c | 7 + drivers/infiniband/core/cma.c | 18 +++ drivers/infiniband/core/mad.c | 3 +- drivers/infiniband/core/rdma_core.c | 36 +++-- drivers/infiniband/hw/efa/efa_verbs.c | 1 + drivers/infiniband/hw/hfi1/debugfs.c | 19 +-- drivers/infiniband/hw/qedr/qedr_iw_cm.c | 13 +- drivers/infiniband/sw/rdmavt/qp.c | 6 +- drivers/infiniband/sw/siw/siw_qp_rx.c | 3 +- drivers/iommu/dmar.c | 3 +- drivers/iommu/intel-iommu.c | 18 ++- drivers/md/bcache/super.c | 22 ++- drivers/md/dm-writecache.c | 4 + drivers/misc/mei/hw-me-regs.h | 3 + drivers/misc/mei/hw-me.c | 70 +++++++++- drivers/misc/mei/hw-me.h | 17 ++- drivers/misc/mei/pci-me.c | 17 +-- drivers/net/bareudp.c | 3 + drivers/net/dsa/bcm_sf2.c | 2 + drivers/net/ethernet/atheros/alx/main.c | 9 +- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 36 ++++- drivers/net/ethernet/broadcom/bnxt/bnxt.h | 5 + .../net/ethernet/broadcom/genet/bcmgenet.c | 8 +- drivers/net/ethernet/broadcom/tg3.c | 4 +- drivers/net/ethernet/cadence/macb_main.c | 128 +++++++++++------- drivers/net/ethernet/chelsio/cxgb4/l2t.c | 52 ++++--- drivers/net/ethernet/chelsio/cxgb4/sge.c | 23 ++-- drivers/net/ethernet/freescale/enetc/enetc.c | 4 +- drivers/net/ethernet/ibm/ibmveth.c | 2 +- drivers/net/ethernet/ibm/ibmvnic.c | 21 ++- .../net/ethernet/marvell/mvpp2/mvpp2_main.c | 4 +- .../net/ethernet/mellanox/mlxsw/spectrum.c | 4 +- .../net/ethernet/mellanox/mlxsw/spectrum.h | 8 +- .../mellanox/mlxsw/spectrum_buffers.c | 2 +- .../ethernet/mellanox/mlxsw/spectrum_span.c | 2 +- .../net/ethernet/pensando/ionic/ionic_lif.c | 21 ++- drivers/net/ethernet/qlogic/qed/qed_cxt.c | 21 ++- drivers/net/ethernet/qlogic/qed/qed_debug.c | 3 +- drivers/net/ethernet/qlogic/qed/qed_dev.c | 9 +- drivers/net/ethernet/qlogic/qed/qed_iwarp.c | 2 - drivers/net/ethernet/qlogic/qed/qed_roce.c | 1 - drivers/net/ethernet/qlogic/qed/qed_vf.c | 23 +++- drivers/net/ethernet/qlogic/qede/qede_main.c | 3 +- drivers/net/ethernet/qlogic/qede/qede_ptp.c | 31 ++--- drivers/net/ethernet/qlogic/qede/qede_ptp.h | 2 +- drivers/net/ethernet/qlogic/qede/qede_rdma.c | 3 +- drivers/net/ethernet/realtek/r8169_main.c | 5 +- drivers/net/ethernet/rocker/rocker_main.c | 4 +- drivers/net/ethernet/socionext/netsec.c | 5 +- drivers/net/geneve.c | 1 + drivers/net/phy/Kconfig | 3 +- drivers/net/phy/mscc/mscc_macsec.c | 40 ++---- drivers/net/phy/phy_device.c | 6 +- drivers/net/phy/phylink.c | 45 ++++-- drivers/net/phy/smsc.c | 11 +- drivers/net/usb/ax88179_178a.c | 11 +- drivers/net/wireguard/device.c | 58 ++++---- drivers/net/wireguard/device.h | 3 +- drivers/net/wireguard/netlink.c | 14 +- drivers/net/wireguard/receive.c | 10 +- drivers/net/wireguard/socket.c | 25 +++- drivers/net/wireless/ath/wil6210/txrx.c | 39 ++---- drivers/nvdimm/region_devs.c | 14 +- drivers/nvme/host/core.c | 8 -- drivers/nvme/host/multipath.c | 46 ++++--- drivers/nvme/host/nvme.h | 2 + drivers/nvme/target/core.c | 40 ++++-- drivers/of/of_mdio.c | 9 +- drivers/pinctrl/qcom/pinctrl-spmi-gpio.c | 21 ++- drivers/pinctrl/tegra/pinctrl-tegra.c | 4 +- drivers/regulator/da9063-regulator.c | 1 - drivers/regulator/pfuze100-regulator.c | 60 +++++--- drivers/s390/net/qeth_core_main.c | 5 +- drivers/s390/scsi/zfcp_erp.c | 13 +- drivers/scsi/lpfc/lpfc_init.c | 3 +- drivers/scsi/qla2xxx/qla_gs.c | 4 +- drivers/soc/imx/soc-imx8m.c | 8 +- drivers/spi/spi-fsl-dspi.c | 8 +- .../staging/rtl8723bs/core/rtw_wlan_util.c | 4 +- drivers/usb/cdns3/ep0.c | 10 +- drivers/usb/cdns3/trace.h | 2 +- drivers/usb/class/cdc-acm.c | 2 + drivers/usb/core/quirks.c | 3 +- drivers/usb/dwc2/gadget.c | 6 - drivers/usb/dwc2/platform.c | 11 ++ drivers/usb/dwc3/dwc3-exynos.c | 9 -- drivers/usb/gadget/udc/mv_udc_core.c | 3 +- drivers/usb/host/ehci-exynos.c | 5 +- drivers/usb/host/ehci-pci.c | 7 + drivers/usb/host/ohci-sm501.c | 1 + drivers/usb/host/xhci-mtk.c | 5 +- drivers/usb/host/xhci.c | 9 +- drivers/usb/host/xhci.h | 2 +- drivers/usb/renesas_usbhs/fifo.c | 23 ++-- drivers/usb/renesas_usbhs/fifo.h | 2 +- drivers/usb/typec/mux/intel_pmc_mux.c | 13 +- drivers/usb/typec/tcpm/tcpci_rt1711h.c | 31 ++--- drivers/video/fbdev/core/fbcon.c | 3 +- fs/afs/cell.c | 9 ++ fs/afs/internal.h | 2 +- fs/btrfs/block-group.c | 19 ++- fs/btrfs/ctree.h | 2 + fs/btrfs/file.c | 15 +- fs/btrfs/inode.c | 39 ++++-- fs/btrfs/tree-log.c | 5 + fs/cifs/smb2ops.c | 12 ++ fs/erofs/zdata.h | 20 +-- fs/io_uring.c | 9 +- fs/nfs/direct.c | 13 +- fs/nfs/file.c | 1 + fs/nfs/flexfilelayout/flexfilelayout.c | 11 +- fs/ocfs2/dlmglue.c | 17 ++- fs/ocfs2/ocfs2.h | 1 + fs/ocfs2/ocfs2_fs.h | 4 +- fs/ocfs2/suballoc.c | 9 +- include/linux/intel-iommu.h | 1 + include/linux/netdevice.h | 2 +- include/linux/qed/qed_chain.h | 26 ++-- include/linux/syscalls.h | 2 +- include/linux/tpm_eventlog.h | 14 +- include/net/sctp/constants.h | 8 +- include/net/sock.h | 1 - include/net/xfrm.h | 1 + include/uapi/linux/fb.h | 1 + kernel/bpf/cgroup.c | 53 +++++--- kernel/bpf/devmap.c | 10 +- kernel/dma/direct.c | 26 +++- kernel/kprobes.c | 3 +- kernel/sched/core.c | 3 +- kernel/sched/deadline.c | 1 + kernel/sched/fair.c | 2 +- kernel/trace/blktrace.c | 13 ++ kernel/trace/ring_buffer.c | 2 +- kernel/trace/trace_boot.c | 8 +- kernel/trace/trace_events_trigger.c | 21 ++- lib/test_objagg.c | 4 +- mm/compaction.c | 17 ++- mm/memcontrol.c | 13 +- mm/memory_hotplug.c | 13 +- mm/slab.h | 4 +- mm/slab_common.c | 2 +- net/bridge/br_private.h | 2 +- net/core/dev.c | 9 ++ net/core/sock.c | 4 +- net/ethtool/common.c | 2 + net/ethtool/ioctl.c | 2 +- net/ipv4/fib_semantics.c | 2 +- net/ipv4/ip_tunnel.c | 14 +- net/ipv4/tcp_cubic.c | 5 +- net/ipv4/tcp_input.c | 26 +++- net/ipv6/ip6_gre.c | 9 +- net/ipv6/mcast.c | 1 + net/mptcp/options.c | 2 - net/mptcp/subflow.c | 4 +- net/netfilter/ipset/ip_set_core.c | 2 + net/netlink/genetlink.c | 28 ++-- net/openvswitch/actions.c | 9 +- net/rxrpc/call_accept.c | 7 + net/rxrpc/input.c | 7 +- net/sched/sch_cake.c | 58 +++++--- net/sctp/associola.c | 5 +- net/sctp/bind_addr.c | 1 + net/sctp/protocol.c | 3 +- net/sunrpc/rpc_pipe.c | 1 + net/sunrpc/xdr.c | 4 + net/sunrpc/xprtrdma/rpc_rdma.c | 9 +- net/xfrm/xfrm_device.c | 4 +- samples/bpf/xdp_monitor_user.c | 8 +- samples/bpf/xdp_redirect_cpu_kern.c | 2 +- samples/bpf/xdp_redirect_cpu_user.c | 34 ++--- samples/bpf/xdp_rxq_info_user.c | 13 +- scripts/Kbuild.include | 11 +- scripts/recordmcount.h | 98 +++++++++++++- sound/pci/hda/patch_hdmi.c | 5 + sound/pci/hda/patch_realtek.c | 3 + sound/soc/fsl/fsl_ssi.c | 13 +- sound/soc/qcom/common.c | 14 +- sound/soc/qcom/qdsp6/q6afe.c | 8 ++ sound/soc/qcom/qdsp6/q6afe.h | 1 + sound/soc/qcom/qdsp6/q6asm.c | 7 +- sound/soc/rockchip/rockchip_pdm.c | 4 +- sound/soc/soc-pcm.c | 6 +- sound/usb/format.c | 6 +- sound/usb/mixer.c | 15 +- sound/usb/mixer.h | 9 +- sound/usb/mixer_quirks.c | 3 +- sound/usb/pcm.c | 2 + sound/usb/quirks.c | 10 ++ tools/testing/selftests/bpf/progs/bpf_cubic.c | 5 +- tools/testing/selftests/net/so_txtime.c | 33 ++++- .../selftests/powerpc/pmu/ebb/Makefile | 2 +- tools/testing/selftests/wireguard/netns.sh | 13 +- 273 files changed, 2043 insertions(+), 1048 deletions(-)
From: Krzysztof Kozlowski krzk@kernel.org
commit 03fe7aaf0c3d40ef7feff2bdc7180c146989586a upstream.
Driver allocates DMA memory with dma_alloc_coherent() but frees it with dma_unmap_single().
This causes DMA warning during system shutdown (with DMA debugging) on Toradex Colibri VF50 module:
WARNING: CPU: 0 PID: 1 at ../kernel/dma/debug.c:1036 check_unmap+0x3fc/0xb04 DMA-API: fsl-edma 40098000.dma-controller: device driver frees DMA memory with wrong function [device address=0x0000000087040000] [size=8 bytes] [mapped as coherent] [unmapped as single] Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree) (unwind_backtrace) from [<8010bb34>] (show_stack+0x10/0x14) (show_stack) from [<8011ced8>] (__warn+0xf0/0x108) (__warn) from [<8011cf64>] (warn_slowpath_fmt+0x74/0xb8) (warn_slowpath_fmt) from [<8017d170>] (check_unmap+0x3fc/0xb04) (check_unmap) from [<8017d900>] (debug_dma_unmap_page+0x88/0x90) (debug_dma_unmap_page) from [<80601d68>] (dspi_release_dma+0x88/0x110) (dspi_release_dma) from [<80601e4c>] (dspi_shutdown+0x5c/0x80) (dspi_shutdown) from [<805845f8>] (device_shutdown+0x17c/0x220) (device_shutdown) from [<80143ef8>] (kernel_restart+0xc/0x50) (kernel_restart) from [<801441cc>] (__do_sys_reboot+0x18c/0x210) (__do_sys_reboot) from [<80100060>] (ret_fast_syscall+0x0/0x28) DMA-API: Mapped at: dma_alloc_attrs+0xa4/0x130 dspi_probe+0x568/0x7b4 platform_drv_probe+0x6c/0xa4 really_probe+0x208/0x348 driver_probe_device+0x5c/0xb4
Fixes: 90ba37033cb9 ("spi: spi-fsl-dspi: Add DMA support for Vybrid") Signed-off-by: Krzysztof Kozlowski krzk@kernel.org Acked-by: Vladimir Oltean vladimir.oltean@nxp.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1591803717-11218-1-git-send-email-krzk@kernel.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/spi/spi-fsl-dspi.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c index 2e9f9adc59003..88176eaca448f 100644 --- a/drivers/spi/spi-fsl-dspi.c +++ b/drivers/spi/spi-fsl-dspi.c @@ -584,14 +584,14 @@ static void dspi_release_dma(struct fsl_dspi *dspi) return;
if (dma->chan_tx) { - dma_unmap_single(dma->chan_tx->device->dev, dma->tx_dma_phys, - dma_bufsize, DMA_TO_DEVICE); + dma_free_coherent(dma->chan_tx->device->dev, dma_bufsize, + dma->tx_dma_buf, dma->tx_dma_phys); dma_release_channel(dma->chan_tx); }
if (dma->chan_rx) { - dma_unmap_single(dma->chan_rx->device->dev, dma->rx_dma_phys, - dma_bufsize, DMA_FROM_DEVICE); + dma_free_coherent(dma->chan_rx->device->dev, dma_bufsize, + dma->rx_dma_buf, dma->rx_dma_phys); dma_release_channel(dma->chan_rx); } }
From: yu kuai yukuai3@huawei.com
commit a75ca9303175d36af93c0937dd9b1a6422908b8d upstream.
commit e7bf90e5afe3 ("block/bio-integrity: fix a memory leak bug") added a kfree() for 'buf' if bio_integrity_add_page() returns '0'. However, the object will be freed in bio_integrity_free() since 'bio->bi_opf' and 'bio->bi_integrity' were set previousy in bio_integrity_alloc().
Fixes: commit e7bf90e5afe3 ("block/bio-integrity: fix a memory leak bug") Signed-off-by: yu kuai yukuai3@huawei.com Reviewed-by: Ming Lei ming.lei@redhat.com Reviewed-by: Bob Liu bob.liu@oracle.com Acked-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Jens Axboe axboe@kernel.dk Cc: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- block/bio-integrity.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/block/bio-integrity.c b/block/bio-integrity.c index bf62c25cde8f4..ae07dd78e9518 100644 --- a/block/bio-integrity.c +++ b/block/bio-integrity.c @@ -278,7 +278,6 @@ bool bio_integrity_prep(struct bio *bio)
if (ret == 0) { printk(KERN_ERR "could not attach integrity payload\n"); - kfree(buf); status = BLK_STS_RESOURCE; goto err_end_io; }
From: Claudiu Manoil claudiu.manoil@nxp.com
[ Upstream commit 0574e2000fc3103cbc69ba82ec1175ce171fdf5e ]
The rings bitmap of an interrupt vector encodes which of the device's rings were assigned to that interrupt vector. Hence the iteration range of the tx rings bitmap (for_each_set_bit()) should be the total number of Tx rings of that netdevice instead of the number of rings assigned to the interrupt vector. Since there are 2 cores, and one interrupt vector for each core, the number of rings asigned to an interrupt vector is half the number of available rings. The impact of this error is that the upper half of the tx rings could still generate interrupts during napi polling.
Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers") Signed-off-by: Claudiu Manoil claudiu.manoil@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/freescale/enetc/enetc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c index ccf2611f4a203..4486a0db8ef0c 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc.c +++ b/drivers/net/ethernet/freescale/enetc/enetc.c @@ -266,7 +266,7 @@ static irqreturn_t enetc_msix(int irq, void *data) /* disable interrupts */ enetc_wr_reg(v->rbier, 0);
- for_each_set_bit(i, &v->tx_rings_map, v->count_tx_rings) + for_each_set_bit(i, &v->tx_rings_map, ENETC_MAX_NUM_TXQS) enetc_wr_reg(v->tbier_base + ENETC_BDR_OFF(i), 0);
napi_schedule_irqoff(&v->napi); @@ -302,7 +302,7 @@ static int enetc_poll(struct napi_struct *napi, int budget) /* enable interrupts */ enetc_wr_reg(v->rbier, ENETC_RBIER_RXTIE);
- for_each_set_bit(i, &v->tx_rings_map, v->count_tx_rings) + for_each_set_bit(i, &v->tx_rings_map, ENETC_MAX_NUM_TXQS) enetc_wr_reg(v->tbier_base + ENETC_BDR_OFF(i), ENETC_TBIER_TXTIE);
From: Gaurav Singh gaurav1086@gmail.com
[ Upstream commit 21a739c64d3e9871186483a0cc3e7b52638c3d59 ]
Fix check in ethtool_rx_flow_rule_create
Fixes: eca4205f9ec3 ("ethtool: add ethtool_rx_flow_spec to flow_rule structure translator") Signed-off-by: Gaurav Singh gaurav1086@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ethtool/ioctl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c index 89d0b1827aaff..d3eeeb26396cd 100644 --- a/net/ethtool/ioctl.c +++ b/net/ethtool/ioctl.c @@ -2957,7 +2957,7 @@ ethtool_rx_flow_rule_create(const struct ethtool_rx_flow_spec_input *input) sizeof(match->mask.ipv6.dst)); } if (memcmp(v6_m_spec->ip6src, &zero_addr, sizeof(zero_addr)) || - memcmp(v6_m_spec->ip6src, &zero_addr, sizeof(zero_addr))) { + memcmp(v6_m_spec->ip6dst, &zero_addr, sizeof(zero_addr))) { match->dissector.used_keys |= BIT(FLOW_DISSECTOR_KEY_IPV6_ADDRS); match->dissector.offset[FLOW_DISSECTOR_KEY_IPV6_ADDRS] =
From: Sabrina Dubroca sd@queasysnail.net
[ Upstream commit 56c09de347e40804fc8dad155272fb9609e0a97b ]
Currently, trying to change the DF parameter of a geneve device does nothing:
# ip -d link show geneve1 14: geneve1: <snip> link/ether <snip> geneve id 1 remote 10.0.0.1 ttl auto df set dstport 6081 <snip> # ip link set geneve1 type geneve id 1 df unset # ip -d link show geneve1 14: geneve1: <snip> link/ether <snip> geneve id 1 remote 10.0.0.1 ttl auto df set dstport 6081 <snip>
We just need to update the value in geneve_changelink.
Fixes: a025fb5f49ad ("geneve: Allow configuration of DF behaviour") Signed-off-by: Sabrina Dubroca sd@queasysnail.net Reviewed-by: Stefano Brivio sbrivio@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/geneve.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c index 75266580b586d..4661ef865807f 100644 --- a/drivers/net/geneve.c +++ b/drivers/net/geneve.c @@ -1649,6 +1649,7 @@ static int geneve_changelink(struct net_device *dev, struct nlattr *tb[], geneve->collect_md = metadata; geneve->use_udp6_rx_checksums = use_udp6_rx_checksums; geneve->ttl_inherit = ttl_inherit; + geneve->df = df; geneve_unquiesce(geneve, gs4, gs6);
return 0;
From: Thomas Falcon tlfalcon@linux.ibm.com
[ Upstream commit 5948378b26d89f8aa5eac37629dbd0616ce8d7a7 ]
The max MTU limit defined for ibmveth is not accounting for virtual ethernet buffer overhead, which is twenty-two additional bytes set aside for the ethernet header and eight additional bytes of an opaque handle reserved for use by the hypervisor. Update the max MTU to reflect this overhead.
Fixes: d894be57ca92 ("ethernet: use net core MTU range checking in more drivers") Fixes: 110447f8269a ("ethernet: fix min/max MTU typos") Signed-off-by: Thomas Falcon tlfalcon@linux.ibm.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/ibm/ibmveth.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c index 96d36ae5049e1..c5c732601e35e 100644 --- a/drivers/net/ethernet/ibm/ibmveth.c +++ b/drivers/net/ethernet/ibm/ibmveth.c @@ -1715,7 +1715,7 @@ static int ibmveth_probe(struct vio_dev *dev, const struct vio_device_id *id) }
netdev->min_mtu = IBMVETH_MIN_MTU; - netdev->max_mtu = ETH_MAX_MTU; + netdev->max_mtu = ETH_MAX_MTU - IBMVETH_BUFF_OH;
memcpy(netdev->dev_addr, mac_addr_p, ETH_ALEN);
From: Wang Hai wanghai38@huawei.com
[ Upstream commit ea2fce88d2fd678ed9d45354ff49b73f1d5615dd ]
Commit a84d01647989 ("mld: fix memory leak in mld_del_delrec()") fixed the memory leak of MLD, but missing the ipv6_mc_destroy_dev() path, in which mca_sources are leaked after ma_put().
Using ip6_mc_clear_src() to take care of the missing free.
BUG: memory leak unreferenced object 0xffff8881113d3180 (size 64): comm "syz-executor071", pid 389, jiffies 4294887985 (age 17.943s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 ff 02 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 ................ backtrace: [<000000002cbc483c>] kmalloc include/linux/slab.h:555 [inline] [<000000002cbc483c>] kzalloc include/linux/slab.h:669 [inline] [<000000002cbc483c>] ip6_mc_add1_src net/ipv6/mcast.c:2237 [inline] [<000000002cbc483c>] ip6_mc_add_src+0x7f5/0xbb0 net/ipv6/mcast.c:2357 [<0000000058b8b1ff>] ip6_mc_source+0xe0c/0x1530 net/ipv6/mcast.c:449 [<000000000bfc4fb5>] do_ipv6_setsockopt.isra.12+0x1b2c/0x3b30 net/ipv6/ipv6_sockglue.c:754 [<00000000e4e7a722>] ipv6_setsockopt+0xda/0x150 net/ipv6/ipv6_sockglue.c:950 [<0000000029260d9a>] rawv6_setsockopt+0x45/0x100 net/ipv6/raw.c:1081 [<000000005c1b46f9>] __sys_setsockopt+0x131/0x210 net/socket.c:2132 [<000000008491f7db>] __do_sys_setsockopt net/socket.c:2148 [inline] [<000000008491f7db>] __se_sys_setsockopt net/socket.c:2145 [inline] [<000000008491f7db>] __x64_sys_setsockopt+0xba/0x150 net/socket.c:2145 [<00000000c7bc11c5>] do_syscall_64+0xa1/0x530 arch/x86/entry/common.c:295 [<000000005fb7a3f3>] entry_SYSCALL_64_after_hwframe+0x49/0xb3
Fixes: 1666d49e1d41 ("mld: do not remove mld souce list info when set link down") Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Wang Hai wanghai38@huawei.com Acked-by: Hangbin Liu liuhangbin@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/mcast.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c index eaa4c2cc2fbb2..c875c9b6edbe9 100644 --- a/net/ipv6/mcast.c +++ b/net/ipv6/mcast.c @@ -2618,6 +2618,7 @@ void ipv6_mc_destroy_dev(struct inet6_dev *idev) idev->mc_list = i->next;
write_unlock_bh(&idev->lock); + ip6_mc_clear_src(i); ma_put(i); write_lock_bh(&idev->lock); }
From: Ido Schimmel idosch@mellanox.com
[ Upstream commit f3fe412b0a634286a6a3753c3f9ff201e6bec716 ]
The second commit cited below performed a cast of 'u32 buffsize' to '(u16 *)' when calling mlxsw_sp_port_headroom_8x_adjust():
mlxsw_sp_port_headroom_8x_adjust(mlxsw_sp_port, (u16 *) &buffsize);
Colin noted that this will behave differently on big endian architectures compared to little endian architectures.
Fix this by following Colin's suggestion and have the function accept and return 'u32' instead of passing the current size by reference.
Fixes: da382875c616 ("mlxsw: spectrum: Extend to support Spectrum-3 ASIC") Fixes: 60833d54d56c ("mlxsw: spectrum: Adjust headroom buffers for 8x ports") Signed-off-by: Ido Schimmel idosch@mellanox.com Reported-by: Colin Ian King colin.king@canonical.com Suggested-by: Colin Ian King colin.king@canonical.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 4 ++-- drivers/net/ethernet/mellanox/mlxsw/spectrum.h | 8 +++----- drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c | 2 +- drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c | 2 +- 4 files changed, 7 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c index 3e4199246a18d..d9a2267aeaea2 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c @@ -990,10 +990,10 @@ int __mlxsw_sp_port_headroom_set(struct mlxsw_sp_port *mlxsw_sp_port, int mtu,
lossy = !(pfc || pause_en); thres_cells = mlxsw_sp_pg_buf_threshold_get(mlxsw_sp, mtu); - mlxsw_sp_port_headroom_8x_adjust(mlxsw_sp_port, &thres_cells); + thres_cells = mlxsw_sp_port_headroom_8x_adjust(mlxsw_sp_port, thres_cells); delay_cells = mlxsw_sp_pg_buf_delay_get(mlxsw_sp, mtu, delay, pfc, pause_en); - mlxsw_sp_port_headroom_8x_adjust(mlxsw_sp_port, &delay_cells); + delay_cells = mlxsw_sp_port_headroom_8x_adjust(mlxsw_sp_port, delay_cells); total_cells = thres_cells + delay_cells;
taken_headroom_cells += total_cells; diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h index e28ecb84b8164..6b2e4e730b189 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h @@ -395,17 +395,15 @@ mlxsw_sp_port_vlan_find_by_vid(const struct mlxsw_sp_port *mlxsw_sp_port, return NULL; }
-static inline void +static inline u32 mlxsw_sp_port_headroom_8x_adjust(const struct mlxsw_sp_port *mlxsw_sp_port, - u16 *p_size) + u32 size_cells) { /* Ports with eight lanes use two headroom buffers between which the * configured headroom size is split. Therefore, multiply the calculated * headroom size by two. */ - if (mlxsw_sp_port->mapping.width != 8) - return; - *p_size *= 2; + return mlxsw_sp_port->mapping.width == 8 ? 2 * size_cells : size_cells; }
enum mlxsw_sp_flood_type { diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c index 19bf0768ed788..2fb2cbd4f229e 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c @@ -312,7 +312,7 @@ static int mlxsw_sp_port_pb_init(struct mlxsw_sp_port *mlxsw_sp_port)
if (i == MLXSW_SP_PB_UNUSED) continue; - mlxsw_sp_port_headroom_8x_adjust(mlxsw_sp_port, &size); + size = mlxsw_sp_port_headroom_8x_adjust(mlxsw_sp_port, size); mlxsw_reg_pbmc_lossy_buffer_pack(pbmc_pl, i, size); } mlxsw_reg_pbmc_lossy_buffer_pack(pbmc_pl, diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c index 7c5032f9c8fff..76242c70d41ad 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c @@ -776,7 +776,7 @@ mlxsw_sp_span_port_buffsize_update(struct mlxsw_sp_port *mlxsw_sp_port, u16 mtu) speed = 0;
buffsize = mlxsw_sp_span_buffsize_get(mlxsw_sp, speed, mtu); - mlxsw_sp_port_headroom_8x_adjust(mlxsw_sp_port, (u16 *) &buffsize); + buffsize = mlxsw_sp_port_headroom_8x_adjust(mlxsw_sp_port, buffsize); mlxsw_reg_sbib_pack(sbib_pl, mlxsw_sp_port->local_port, buffsize); return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(sbib), sbib_pl); }
From: Sven Auhagen sven.auhagen@voleatech.de
[ Upstream commit cc970925feb9a38c2f0d34305518e00a3084ce85 ]
The ethtool rx and tx queue statistics are reporting wrong values. Fix reading out the correct ones.
Signed-off-by: Sven Auhagen sven.auhagen@voleatech.de Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c index b7b553602ea9f..24f4d8e0da989 100644 --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c @@ -1544,7 +1544,7 @@ static void mvpp2_read_stats(struct mvpp2_port *port) for (q = 0; q < port->ntxqs; q++) for (i = 0; i < ARRAY_SIZE(mvpp2_ethtool_txq_regs); i++) *pstats++ += mvpp2_read_index(port->priv, - MVPP22_CTRS_TX_CTR(port->id, i), + MVPP22_CTRS_TX_CTR(port->id, q), mvpp2_ethtool_txq_regs[i].offset);
/* Rxqs are numbered from 0 from the user standpoint, but not from the @@ -1553,7 +1553,7 @@ static void mvpp2_read_stats(struct mvpp2_port *port) for (q = 0; q < port->nrxqs; q++) for (i = 0; i < ARRAY_SIZE(mvpp2_ethtool_rxq_regs); i++) *pstats++ += mvpp2_read_index(port->priv, - port->first_rxq + i, + port->first_rxq + q, mvpp2_ethtool_rxq_regs[i].offset); }
From: Thomas Martitz t.martitz@avm.de
[ Upstream commit db7202dec92e6caa2706c21d6fc359af318bde2e ]
The eth_addr member is passed to ether_addr functions that require 2-byte alignment, therefore the member must be properly aligned to avoid unaligned accesses.
The problem is in place since the initial merge of multicast to unicast: commit 6db6f0eae6052b70885562e1733896647ec1d807 bridge: multicast to unicast
Fixes: 6db6f0eae605 ("bridge: multicast to unicast") Cc: Roopa Prabhu roopa@cumulusnetworks.com Cc: Nikolay Aleksandrov nikolay@cumulusnetworks.com Cc: David S. Miller davem@davemloft.net Cc: Jakub Kicinski kuba@kernel.org Cc: Felix Fietkau nbd@nbd.name Cc: stable@vger.kernel.org Signed-off-by: Thomas Martitz t.martitz@avm.de Acked-by: Nikolay Aleksandrov nikolay@cumulusnetworks.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/bridge/br_private.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 1f97703a52ffb..18430f79ac37e 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -217,8 +217,8 @@ struct net_bridge_port_group { struct rcu_head rcu; struct timer_list timer; struct br_ip addr; + unsigned char eth_addr[ETH_ALEN] __aligned(2); unsigned char flags; - unsigned char eth_addr[ETH_ALEN]; };
struct net_bridge_mdb_entry {
From: Taehee Yoo ap420073@gmail.com
[ Upstream commit fb7861d14c8d7edac65b2fcb6e8031cb138457b2 ]
In the current code, ->ndo_start_xmit() can be executed recursively only 10 times because of stack memory. But, in the case of the vxlan, 10 recursion limit value results in a stack overflow. In the current code, the nested interface is limited by 8 depth. There is no critical reason that the recursion limitation value should be 10. So, it would be good to be the same value with the limitation value of nesting interface depth.
Test commands: ip link add vxlan10 type vxlan vni 10 dstport 4789 srcport 4789 4789 ip link set vxlan10 up ip a a 192.168.10.1/24 dev vxlan10 ip n a 192.168.10.2 dev vxlan10 lladdr fc:22:33:44:55:66 nud permanent
for i in {9..0} do let A=$i+1 ip link add vxlan$i type vxlan vni $i dstport 4789 srcport 4789 4789 ip link set vxlan$i up ip a a 192.168.$i.1/24 dev vxlan$i ip n a 192.168.$i.2 dev vxlan$i lladdr fc:22:33:44:55:66 nud permanent bridge fdb add fc:22:33:44:55:66 dev vxlan$A dst 192.168.$i.2 self done hping3 192.168.10.2 -2 -d 60000
Splat looks like: [ 103.814237][ T1127] ============================================================================= [ 103.871955][ T1127] BUG kmalloc-2k (Tainted: G B ): Padding overwritten. 0x00000000897a2e4f-0x000 [ 103.873187][ T1127] ----------------------------------------------------------------------------- [ 103.873187][ T1127] [ 103.874252][ T1127] INFO: Slab 0x000000005cccc724 objects=5 used=5 fp=0x0000000000000000 flags=0x10000000001020 [ 103.881323][ T1127] CPU: 3 PID: 1127 Comm: hping3 Tainted: G B 5.7.0+ #575 [ 103.882131][ T1127] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 103.883006][ T1127] Call Trace: [ 103.883324][ T1127] dump_stack+0x96/0xdb [ 103.883716][ T1127] slab_err+0xad/0xd0 [ 103.884106][ T1127] ? _raw_spin_unlock+0x1f/0x30 [ 103.884620][ T1127] ? get_partial_node.isra.78+0x140/0x360 [ 103.885214][ T1127] slab_pad_check.part.53+0xf7/0x160 [ 103.885769][ T1127] ? pskb_expand_head+0x110/0xe10 [ 103.886316][ T1127] check_slab+0x97/0xb0 [ 103.886763][ T1127] alloc_debug_processing+0x84/0x1a0 [ 103.887308][ T1127] ___slab_alloc+0x5a5/0x630 [ 103.887765][ T1127] ? pskb_expand_head+0x110/0xe10 [ 103.888265][ T1127] ? lock_downgrade+0x730/0x730 [ 103.888762][ T1127] ? pskb_expand_head+0x110/0xe10 [ 103.889244][ T1127] ? __slab_alloc+0x3e/0x80 [ 103.889675][ T1127] __slab_alloc+0x3e/0x80 [ 103.890108][ T1127] __kmalloc_node_track_caller+0xc7/0x420 [ ... ]
Fixes: 11a766ce915f ("net: Increase xmit RECURSION_LIMIT to 10.") Signed-off-by: Taehee Yoo ap420073@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/netdevice.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 130a668049ab0..36c7ad24d54d9 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -3125,7 +3125,7 @@ static inline int dev_recursion_level(void) return this_cpu_read(softnet_data.xmit.recursion); }
-#define XMIT_RECURSION_LIMIT 10 +#define XMIT_RECURSION_LIMIT 8 static inline bool dev_xmit_recursion(void) { return unlikely(__this_cpu_read(softnet_data.xmit.recursion) >
From: Tariq Toukan tariqt@mellanox.com
[ Upstream commit 41b14fb8724d5a4b382a63cb4a1a61880347ccb8 ]
Clearing the sock TX queue in sk_set_socket() might cause unexpected out-of-order transmit when called from sock_orphan(), as outstanding packets can pick a different TX queue and bypass the ones already queued.
This is undesired in general. More specifically, it breaks the in-order scheduling property guarantee for device-offloaded TLS sockets.
Remove the call to sk_tx_queue_clear() in sk_set_socket(), and add it explicitly only where needed.
Fixes: e022f0b4a03f ("net: Introduce sk_tx_queue_mapping") Signed-off-by: Tariq Toukan tariqt@mellanox.com Reviewed-by: Boris Pismenny borisp@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/sock.h | 1 - net/core/sock.c | 2 ++ 2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/net/sock.h b/include/net/sock.h index 3e8c6d4b4b59f..46423e86dba50 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -1846,7 +1846,6 @@ static inline int sk_rx_queue_get(const struct sock *sk)
static inline void sk_set_socket(struct sock *sk, struct socket *sock) { - sk_tx_queue_clear(sk); sk->sk_socket = sock; }
diff --git a/net/core/sock.c b/net/core/sock.c index b714162213aea..da244f4d00363 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -1678,6 +1678,7 @@ struct sock *sk_alloc(struct net *net, int family, gfp_t priority, cgroup_sk_alloc(&sk->sk_cgrp_data); sock_update_classid(&sk->sk_cgrp_data); sock_update_netprioidx(&sk->sk_cgrp_data); + sk_tx_queue_clear(sk); }
return sk; @@ -1901,6 +1902,7 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority) */ sk_refcnt_debug_inc(newsk); sk_set_socket(newsk, NULL); + sk_tx_queue_clear(newsk); RCU_INIT_POINTER(newsk->sk_wq, NULL);
if (newsk->sk_prot->sockets_allocated)
From: Alexander Lobakin alobakin@pm.me
[ Upstream commit b4730ae6a443afe611afb4fb651c885c51003c15 ]
Commit e585f2363637 ("udp: Changes to udp_offload to support remote checksum offload") added new GSO type and a corresponding netdev feature, but missed Ethtool's 'netdev_features_strings' table. Give it a name so it will be exposed to userspace and become available for manual configuration.
v3: - decouple from "netdev_features_strings[] cleanup" series; - no functional changes.
v2: - don't split the "Fixes:" tag across lines; - no functional changes.
Fixes: e585f2363637 ("udp: Changes to udp_offload to support remote checksum offload") Signed-off-by: Alexander Lobakin alobakin@pm.me Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ethtool/common.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/net/ethtool/common.c b/net/ethtool/common.c index 423e640e3876d..7f7fff88c5d3e 100644 --- a/net/ethtool/common.c +++ b/net/ethtool/common.c @@ -40,6 +40,7 @@ const char netdev_features_strings[NETDEV_FEATURE_COUNT][ETH_GSTRING_LEN] = { [NETIF_F_GSO_UDP_TUNNEL_BIT] = "tx-udp_tnl-segmentation", [NETIF_F_GSO_UDP_TUNNEL_CSUM_BIT] = "tx-udp_tnl-csum-segmentation", [NETIF_F_GSO_PARTIAL_BIT] = "tx-gso-partial", + [NETIF_F_GSO_TUNNEL_REMCSUM_BIT] = "tx-tunnel-remcsum-segmentation", [NETIF_F_GSO_SCTP_BIT] = "tx-sctp-segmentation", [NETIF_F_GSO_ESP_BIT] = "tx-esp-segmentation", [NETIF_F_GSO_UDP_L4_BIT] = "tx-udp-segmentation",
From: Yang Yingliang yangyingliang@huawei.com
[ Upstream commit 814152a89ed52c722ab92e9fbabcac3cb8a39245 ]
I got a memleak report when doing some fuzz test:
unreferenced object 0xffff888112584000 (size 13599): comm "ip", pid 3048, jiffies 4294911734 (age 343.491s) hex dump (first 32 bytes): 74 61 70 30 00 00 00 00 00 00 00 00 00 00 00 00 tap0............ 00 ee d9 19 81 88 ff ff 00 00 00 00 00 00 00 00 ................ backtrace: [<000000002f60ba65>] __kmalloc_node+0x309/0x3a0 [<0000000075b211ec>] kvmalloc_node+0x7f/0xc0 [<00000000d3a97396>] alloc_netdev_mqs+0x76/0xfc0 [<00000000609c3655>] __tun_chr_ioctl+0x1456/0x3d70 [<000000001127ca24>] ksys_ioctl+0xe5/0x130 [<00000000b7d5e66a>] __x64_sys_ioctl+0x6f/0xb0 [<00000000e1023498>] do_syscall_64+0x56/0xa0 [<000000009ec0eb12>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 unreferenced object 0xffff888111845cc0 (size 8): comm "ip", pid 3048, jiffies 4294911734 (age 343.491s) hex dump (first 8 bytes): 74 61 70 30 00 88 ff ff tap0.... backtrace: [<000000004c159777>] kstrdup+0x35/0x70 [<00000000d8b496ad>] kstrdup_const+0x3d/0x50 [<00000000494e884a>] kvasprintf_const+0xf1/0x180 [<0000000097880a2b>] kobject_set_name_vargs+0x56/0x140 [<000000008fbdfc7b>] dev_set_name+0xab/0xe0 [<000000005b99e3b4>] netdev_register_kobject+0xc0/0x390 [<00000000602704fe>] register_netdevice+0xb61/0x1250 [<000000002b7ca244>] __tun_chr_ioctl+0x1cd1/0x3d70 [<000000001127ca24>] ksys_ioctl+0xe5/0x130 [<00000000b7d5e66a>] __x64_sys_ioctl+0x6f/0xb0 [<00000000e1023498>] do_syscall_64+0x56/0xa0 [<000000009ec0eb12>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 unreferenced object 0xffff88811886d800 (size 512): comm "ip", pid 3048, jiffies 4294911734 (age 343.491s) hex dump (first 32 bytes): 00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00 .....N.......... ff ff ff ff ff ff ff ff c0 66 3d a3 ff ff ff ff .........f=..... backtrace: [<0000000050315800>] device_add+0x61e/0x1950 [<0000000021008dfb>] netdev_register_kobject+0x17e/0x390 [<00000000602704fe>] register_netdevice+0xb61/0x1250 [<000000002b7ca244>] __tun_chr_ioctl+0x1cd1/0x3d70 [<000000001127ca24>] ksys_ioctl+0xe5/0x130 [<00000000b7d5e66a>] __x64_sys_ioctl+0x6f/0xb0 [<00000000e1023498>] do_syscall_64+0x56/0xa0 [<000000009ec0eb12>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
If call_netdevice_notifiers() failed, then rollback_registered() calls netdev_unregister_kobject() which holds the kobject. The reference cannot be put because the netdev won't be add to todo list, so it will leads a memleak, we need put the reference to avoid memleak.
Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/dev.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/net/core/dev.c b/net/core/dev.c index 93a279ab4e97c..096b0dfa95890 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -9435,6 +9435,13 @@ int register_netdevice(struct net_device *dev) rcu_barrier();
dev->reg_state = NETREG_UNREGISTERED; + /* We should put the kobject that hold in + * netdev_unregister_kobject(), otherwise + * the net device cannot be freed when + * driver calls free_netdev(), because the + * kobject is being hold. + */ + kobject_put(&dev->dev.kobj); } /* * Prevent userspace races by waiting until the network
From: guodeqing geffrey.guo@huawei.com
[ Upstream commit 5eea3a63ff4aba6a26002e657a6d21934b7e2b96 ]
ie., $ ifconfig eth0 6.6.6.6 netmask 255.255.255.0
$ ip rule add from 6.6.6.6 table 6666
$ ip route add 9.9.9.9 via 6.6.6.6
$ ping -I 6.6.6.6 9.9.9.9 PING 9.9.9.9 (9.9.9.9) from 6.6.6.6 : 56(84) bytes of data.
3 packets transmitted, 0 received, 100% packet loss, time 2079ms
$ arp Address HWtype HWaddress Flags Mask Iface 6.6.6.6 (incomplete) eth0
The arp request address is error, this is because fib_table_lookup in fib_check_nh lookup the destnation 9.9.9.9 nexthop, the scope of the fib result is RT_SCOPE_LINK,the correct scope is RT_SCOPE_HOST. Here I add a check of whether this is RT_TABLE_MAIN to solve this problem.
Fixes: 3bfd847203c6 ("net: Use passed in table for nexthop lookups") Signed-off-by: guodeqing geffrey.guo@huawei.com Reviewed-by: David Ahern dsahern@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/fib_semantics.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c index 55ca2e5218280..871c035be31f2 100644 --- a/net/ipv4/fib_semantics.c +++ b/net/ipv4/fib_semantics.c @@ -1109,7 +1109,7 @@ static int fib_check_nh_v4_gw(struct net *net, struct fib_nh *nh, u32 table, if (fl4.flowi4_scope < RT_SCOPE_LINK) fl4.flowi4_scope = RT_SCOPE_LINK;
- if (table) + if (table && table != RT_TABLE_MAIN) tbl = fib_get_table(net, table);
if (tbl)
From: Eric Dumazet edumazet@google.com
[ Upstream commit 0ad6f6e767ec2f613418cbc7ebe5ec4c35af540c ]
Back in commit f60e5990d9c1 ("ipv6: protect skb->sk accesses from recursive dereference inside the stack") Hannes added code so that IPv6 stack would not trust skb->sk for typical cases where packet goes through 'standard' xmit path (__dev_queue_xmit())
Alas af_packet had a dev_direct_xmit() path that was not dealing yet with xmit_recursion level.
Also change sk_mc_loop() to dump a stack once only.
Without this patch, syzbot was able to trigger :
[1] [ 153.567378] WARNING: CPU: 7 PID: 11273 at net/core/sock.c:721 sk_mc_loop+0x51/0x70 [ 153.567378] Modules linked in: nfnetlink ip6table_raw ip6table_filter iptable_raw iptable_nat nf_nat nf_conntrack nf_defrag_ipv4 nf_defrag_ipv6 iptable_filter macsec macvtap tap macvlan 8021q hsr wireguard libblake2s blake2s_x86_64 libblake2s_generic udp_tunnel ip6_udp_tunnel libchacha20poly1305 poly1305_x86_64 chacha_x86_64 libchacha curve25519_x86_64 libcurve25519_generic netdevsim batman_adv dummy team bridge stp llc w1_therm wire i2c_mux_pca954x i2c_mux cdc_acm ehci_pci ehci_hcd mlx4_en mlx4_ib ib_uverbs ib_core mlx4_core [ 153.567386] CPU: 7 PID: 11273 Comm: b159172088 Not tainted 5.8.0-smp-DEV #273 [ 153.567387] RIP: 0010:sk_mc_loop+0x51/0x70 [ 153.567388] Code: 66 83 f8 0a 75 24 0f b6 4f 12 b8 01 00 00 00 31 d2 d3 e0 a9 bf ef ff ff 74 07 48 8b 97 f0 02 00 00 0f b6 42 3a 83 e0 01 5d c3 <0f> 0b b8 01 00 00 00 5d c3 0f b6 87 18 03 00 00 5d c0 e8 04 83 e0 [ 153.567388] RSP: 0018:ffff95c69bb93990 EFLAGS: 00010212 [ 153.567388] RAX: 0000000000000011 RBX: ffff95c6e0ee3e00 RCX: 0000000000000007 [ 153.567389] RDX: ffff95c69ae50000 RSI: ffff95c6c30c3000 RDI: ffff95c6c30c3000 [ 153.567389] RBP: ffff95c69bb93990 R08: ffff95c69a77f000 R09: 0000000000000008 [ 153.567389] R10: 0000000000000040 R11: 00003e0e00026128 R12: ffff95c6c30c3000 [ 153.567390] R13: ffff95c6cc4fd500 R14: ffff95c6f84500c0 R15: ffff95c69aa13c00 [ 153.567390] FS: 00007fdc3a283700(0000) GS:ffff95c6ff9c0000(0000) knlGS:0000000000000000 [ 153.567390] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 153.567391] CR2: 00007ffee758e890 CR3: 0000001f9ba20003 CR4: 00000000001606e0 [ 153.567391] Call Trace: [ 153.567391] ip6_finish_output2+0x34e/0x550 [ 153.567391] __ip6_finish_output+0xe7/0x110 [ 153.567391] ip6_finish_output+0x2d/0xb0 [ 153.567392] ip6_output+0x77/0x120 [ 153.567392] ? __ip6_finish_output+0x110/0x110 [ 153.567392] ip6_local_out+0x3d/0x50 [ 153.567392] ipvlan_queue_xmit+0x56c/0x5e0 [ 153.567393] ? ksize+0x19/0x30 [ 153.567393] ipvlan_start_xmit+0x18/0x50 [ 153.567393] dev_direct_xmit+0xf3/0x1c0 [ 153.567393] packet_direct_xmit+0x69/0xa0 [ 153.567394] packet_sendmsg+0xbf0/0x19b0 [ 153.567394] ? plist_del+0x62/0xb0 [ 153.567394] sock_sendmsg+0x65/0x70 [ 153.567394] sock_write_iter+0x93/0xf0 [ 153.567394] new_sync_write+0x18e/0x1a0 [ 153.567395] __vfs_write+0x29/0x40 [ 153.567395] vfs_write+0xb9/0x1b0 [ 153.567395] ksys_write+0xb1/0xe0 [ 153.567395] __x64_sys_write+0x1a/0x20 [ 153.567395] do_syscall_64+0x43/0x70 [ 153.567396] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 153.567396] RIP: 0033:0x453549 [ 153.567396] Code: Bad RIP value. [ 153.567396] RSP: 002b:00007fdc3a282cc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 153.567397] RAX: ffffffffffffffda RBX: 00000000004d32d0 RCX: 0000000000453549 [ 153.567397] RDX: 0000000000000020 RSI: 0000000020000300 RDI: 0000000000000003 [ 153.567398] RBP: 00000000004d32d8 R08: 0000000000000000 R09: 0000000000000000 [ 153.567398] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004d32dc [ 153.567398] R13: 00007ffee742260f R14: 00007fdc3a282dc0 R15: 00007fdc3a283700 [ 153.567399] ---[ end trace c1d5ae2b1059ec62 ]---
f60e5990d9c1 ("ipv6: protect skb->sk accesses from recursive dereference inside the stack") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/dev.c | 2 ++ net/core/sock.c | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/core/dev.c b/net/core/dev.c index 096b0dfa95890..c9ee5d80d5ea1 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -4109,10 +4109,12 @@ int dev_direct_xmit(struct sk_buff *skb, u16 queue_id)
local_bh_disable();
+ dev_xmit_recursion_inc(); HARD_TX_LOCK(dev, txq, smp_processor_id()); if (!netif_xmit_frozen_or_drv_stopped(txq)) ret = netdev_start_xmit(skb, dev, txq, false); HARD_TX_UNLOCK(dev, txq); + dev_xmit_recursion_dec();
local_bh_enable();
diff --git a/net/core/sock.c b/net/core/sock.c index da244f4d00363..afe4a62adf8ff 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -707,7 +707,7 @@ bool sk_mc_loop(struct sock *sk) return inet6_sk(sk)->mc_loop; #endif } - WARN_ON(1); + WARN_ON_ONCE(1); return true; } EXPORT_SYMBOL(sk_mc_loop);
From: Jeremy Kerr jk@ozlabs.org
[ Upstream commit e869e7a17798d85829fa7d4f9bbe1eebd4b2d3f6 ]
Using a AX88179 device (0b95:1790), I see two bytes of appended data on every RX packet. For example, this 48-byte ping, using 0xff as a payload byte:
04:20:22.528472 IP 192.168.1.1 > 192.168.1.2: ICMP echo request, id 2447, seq 1, length 64 0x0000: 000a cd35 ea50 000a cd35 ea4f 0800 4500 0x0010: 0054 c116 4000 4001 f63e c0a8 0101 c0a8 0x0020: 0102 0800 b633 098f 0001 87ea cd5e 0000 0x0030: 0000 dcf2 0600 0000 0000 ffff ffff ffff 0x0040: ffff ffff ffff ffff ffff ffff ffff ffff 0x0050: ffff ffff ffff ffff ffff ffff ffff ffff 0x0060: ffff 961f
Those last two bytes - 96 1f - aren't part of the original packet.
In the ax88179 RX path, the usbnet rx_fixup function trims a 2-byte 'alignment pseudo header' from the start of the packet, and sets the length from a per-packet field populated by hardware. It looks like that length field *includes* the 2-byte header; the current driver assumes that it's excluded.
This change trims the 2-byte alignment header after we've set the packet length, so the resulting packet length is correct. While we're moving the comment around, this also fixes the spelling of 'pseudo'.
Signed-off-by: Jeremy Kerr jk@ozlabs.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/usb/ax88179_178a.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/net/usb/ax88179_178a.c b/drivers/net/usb/ax88179_178a.c index 93044cf1417a5..1fe4cc28d154d 100644 --- a/drivers/net/usb/ax88179_178a.c +++ b/drivers/net/usb/ax88179_178a.c @@ -1414,10 +1414,10 @@ static int ax88179_rx_fixup(struct usbnet *dev, struct sk_buff *skb) }
if (pkt_cnt == 0) { - /* Skip IP alignment psudo header */ - skb_pull(skb, 2); skb->len = pkt_len; - skb_set_tail_pointer(skb, pkt_len); + /* Skip IP alignment pseudo header */ + skb_pull(skb, 2); + skb_set_tail_pointer(skb, skb->len); skb->truesize = pkt_len + sizeof(struct sk_buff); ax88179_rx_checksum(skb, pkt_hdr); return 1; @@ -1426,8 +1426,9 @@ static int ax88179_rx_fixup(struct usbnet *dev, struct sk_buff *skb) ax_skb = skb_clone(skb, GFP_ATOMIC); if (ax_skb) { ax_skb->len = pkt_len; - ax_skb->data = skb->data + 2; - skb_set_tail_pointer(ax_skb, pkt_len); + /* Skip IP alignment pseudo header */ + skb_pull(ax_skb, 2); + skb_set_tail_pointer(ax_skb, ax_skb->len); ax_skb->truesize = pkt_len + sizeof(struct sk_buff); ax88179_rx_checksum(ax_skb, pkt_hdr); usbnet_skb_return(dev, ax_skb);
From: Lorenzo Bianconi lorenzo@kernel.org
[ Upstream commit 17843655708e1941c0653af3cd61be6948e36f43 ]
ovs connection tracking module performs de-fragmentation on incoming fragmented traffic. Take info account if traffic has been de-fragmented in execute_check_pkt_len action otherwise we will perform the wrong nested action considering the original packet size. This issue typically occurs if ovs-vswitchd adds a rule in the pipeline that requires connection tracking (e.g. OVN stateful ACLs) before execute_check_pkt_len action. Moreover take into account GSO fragment size for GSO packet in execute_check_pkt_len routine
Fixes: 4d5ec89fc8d14 ("net: openvswitch: Add a new action check_pkt_len") Signed-off-by: Lorenzo Bianconi lorenzo@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/openvswitch/actions.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c index fc0efd8833c84..2611657f40cac 100644 --- a/net/openvswitch/actions.c +++ b/net/openvswitch/actions.c @@ -1169,9 +1169,10 @@ static int execute_check_pkt_len(struct datapath *dp, struct sk_buff *skb, struct sw_flow_key *key, const struct nlattr *attr, bool last) { + struct ovs_skb_cb *ovs_cb = OVS_CB(skb); const struct nlattr *actions, *cpl_arg; + int len, max_len, rem = nla_len(attr); const struct check_pkt_len_arg *arg; - int rem = nla_len(attr); bool clone_flow_key;
/* The first netlink attribute in 'attr' is always @@ -1180,7 +1181,11 @@ static int execute_check_pkt_len(struct datapath *dp, struct sk_buff *skb, cpl_arg = nla_data(attr); arg = nla_data(cpl_arg);
- if (skb->len <= arg->pkt_len) { + len = ovs_cb->mru ? ovs_cb->mru + skb->mac_len : skb->len; + max_len = arg->pkt_len; + + if ((skb_is_gso(skb) && skb_gso_validate_mac_len(skb, max_len)) || + len <= max_len) { /* Second netlink attribute in 'attr' is always * 'OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_LESS_EQUAL'. */
From: Aditya Pakki pakki001@umn.edu
[ Upstream commit 58d0c864e1a759a15c9df78f50ea5a5c32b3989e ]
In rocker_dma_rings_init, the goto blocks in case of errors caused by the functions rocker_dma_cmd_ring_waits_alloc() and rocker_dma_ring_create() are incorrect. The patch fixes the order consistent with cleanup in rocker_dma_rings_fini().
Signed-off-by: Aditya Pakki pakki001@umn.edu Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/rocker/rocker_main.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/rocker/rocker_main.c b/drivers/net/ethernet/rocker/rocker_main.c index 7585cd2270ba4..fc99e7118e494 100644 --- a/drivers/net/ethernet/rocker/rocker_main.c +++ b/drivers/net/ethernet/rocker/rocker_main.c @@ -647,10 +647,10 @@ static int rocker_dma_rings_init(struct rocker *rocker) err_dma_event_ring_bufs_alloc: rocker_dma_ring_destroy(rocker, &rocker->event_ring); err_dma_event_ring_create: + rocker_dma_cmd_ring_waits_free(rocker); +err_dma_cmd_ring_waits_alloc: rocker_dma_ring_bufs_free(rocker, &rocker->cmd_ring, PCI_DMA_BIDIRECTIONAL); -err_dma_cmd_ring_waits_alloc: - rocker_dma_cmd_ring_waits_free(rocker); err_dma_cmd_ring_bufs_alloc: rocker_dma_ring_destroy(rocker, &rocker->cmd_ring); return err;
From: David Howells dhowells@redhat.com
[ Upstream commit 0041cd5a50442db6e456b145892a0eaf2dff061f ]
When preallocated service calls are being discarded, they're passed to ->discard_new_call() to have the caller clean up any attached higher-layer preallocated pieces before being marked completed. However, the act of marking them completed now invokes the call's notification function - which causes a problem because that function might assume that the previously freed pieces of memory are still there.
Fix this by setting a dummy notification function on the socket after calling ->discard_new_call().
This results in the following kasan message when the kafs module is removed.
================================================================== BUG: KASAN: use-after-free in afs_wake_up_async_call+0x6aa/0x770 fs/afs/rxrpc.c:707 Write of size 1 at addr ffff8880946c39e4 by task kworker/u4:1/21
CPU: 0 PID: 21 Comm: kworker/u4:1 Not tainted 5.8.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: netns cleanup_net Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x18f/0x20d lib/dump_stack.c:118 print_address_description.constprop.0.cold+0xd3/0x413 mm/kasan/report.c:383 __kasan_report mm/kasan/report.c:513 [inline] kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530 afs_wake_up_async_call+0x6aa/0x770 fs/afs/rxrpc.c:707 rxrpc_notify_socket+0x1db/0x5d0 net/rxrpc/recvmsg.c:40 __rxrpc_set_call_completion.part.0+0x172/0x410 net/rxrpc/recvmsg.c:76 __rxrpc_call_completed net/rxrpc/recvmsg.c:112 [inline] rxrpc_call_completed+0xca/0xf0 net/rxrpc/recvmsg.c:111 rxrpc_discard_prealloc+0x781/0xab0 net/rxrpc/call_accept.c:233 rxrpc_listen+0x147/0x360 net/rxrpc/af_rxrpc.c:245 afs_close_socket+0x95/0x320 fs/afs/rxrpc.c:110 afs_net_exit+0x1bc/0x310 fs/afs/main.c:155 ops_exit_list.isra.0+0xa8/0x150 net/core/net_namespace.c:186 cleanup_net+0x511/0xa50 net/core/net_namespace.c:603 process_one_work+0x965/0x1690 kernel/workqueue.c:2269 worker_thread+0x96/0xe10 kernel/workqueue.c:2415 kthread+0x3b5/0x4a0 kernel/kthread.c:291 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
Allocated by task 6820: save_stack+0x1b/0x40 mm/kasan/common.c:48 set_track mm/kasan/common.c:56 [inline] __kasan_kmalloc mm/kasan/common.c:494 [inline] __kasan_kmalloc.constprop.0+0xbf/0xd0 mm/kasan/common.c:467 kmem_cache_alloc_trace+0x153/0x7d0 mm/slab.c:3551 kmalloc include/linux/slab.h:555 [inline] kzalloc include/linux/slab.h:669 [inline] afs_alloc_call+0x55/0x630 fs/afs/rxrpc.c:141 afs_charge_preallocation+0xe9/0x2d0 fs/afs/rxrpc.c:757 afs_open_socket+0x292/0x360 fs/afs/rxrpc.c:92 afs_net_init+0xa6c/0xe30 fs/afs/main.c:125 ops_init+0xaf/0x420 net/core/net_namespace.c:151 setup_net+0x2de/0x860 net/core/net_namespace.c:341 copy_net_ns+0x293/0x590 net/core/net_namespace.c:482 create_new_namespaces+0x3fb/0xb30 kernel/nsproxy.c:110 unshare_nsproxy_namespaces+0xbd/0x1f0 kernel/nsproxy.c:231 ksys_unshare+0x43d/0x8e0 kernel/fork.c:2983 __do_sys_unshare kernel/fork.c:3051 [inline] __se_sys_unshare kernel/fork.c:3049 [inline] __x64_sys_unshare+0x2d/0x40 kernel/fork.c:3049 do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:359 entry_SYSCALL_64_after_hwframe+0x44/0xa9
Freed by task 21: save_stack+0x1b/0x40 mm/kasan/common.c:48 set_track mm/kasan/common.c:56 [inline] kasan_set_free_info mm/kasan/common.c:316 [inline] __kasan_slab_free+0xf7/0x140 mm/kasan/common.c:455 __cache_free mm/slab.c:3426 [inline] kfree+0x109/0x2b0 mm/slab.c:3757 afs_put_call+0x585/0xa40 fs/afs/rxrpc.c:190 rxrpc_discard_prealloc+0x764/0xab0 net/rxrpc/call_accept.c:230 rxrpc_listen+0x147/0x360 net/rxrpc/af_rxrpc.c:245 afs_close_socket+0x95/0x320 fs/afs/rxrpc.c:110 afs_net_exit+0x1bc/0x310 fs/afs/main.c:155 ops_exit_list.isra.0+0xa8/0x150 net/core/net_namespace.c:186 cleanup_net+0x511/0xa50 net/core/net_namespace.c:603 process_one_work+0x965/0x1690 kernel/workqueue.c:2269 worker_thread+0x96/0xe10 kernel/workqueue.c:2415 kthread+0x3b5/0x4a0 kernel/kthread.c:291 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
The buggy address belongs to the object at ffff8880946c3800 which belongs to the cache kmalloc-1k of size 1024 The buggy address is located 484 bytes inside of 1024-byte region [ffff8880946c3800, ffff8880946c3c00) The buggy address belongs to the page: page:ffffea000251b0c0 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 flags: 0xfffe0000000200(slab) raw: 00fffe0000000200 ffffea0002546508 ffffea00024fa248 ffff8880aa000c40 raw: 0000000000000000 ffff8880946c3000 0000000100000002 0000000000000000 page dumped because: kasan: bad access detected
Memory state around the buggy address: ffff8880946c3880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8880946c3900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880946c3980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^ ffff8880946c3a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8880946c3a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ==================================================================
Reported-by: syzbot+d3eccef36ddbd02713e9@syzkaller.appspotmail.com Fixes: 5ac0d62226a0 ("rxrpc: Fix missing notification") Signed-off-by: David Howells dhowells@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/rxrpc/call_accept.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/net/rxrpc/call_accept.c b/net/rxrpc/call_accept.c index b7611cc159e51..032ed76c0166d 100644 --- a/net/rxrpc/call_accept.c +++ b/net/rxrpc/call_accept.c @@ -22,6 +22,11 @@ #include <net/ip.h> #include "ar-internal.h"
+static void rxrpc_dummy_notify(struct sock *sk, struct rxrpc_call *call, + unsigned long user_call_ID) +{ +} + /* * Preallocate a single service call, connection and peer and, if possible, * give them a user ID and attach the user's side of the ID to them. @@ -228,6 +233,8 @@ void rxrpc_discard_prealloc(struct rxrpc_sock *rx) if (rx->discard_new_call) { _debug("discard %lx", call->user_call_ID); rx->discard_new_call(call, call->user_call_ID); + if (call->notify_rx) + call->notify_rx = rxrpc_dummy_notify; rxrpc_put_call(call, rxrpc_call_put_kernel); } rxrpc_call_completed(call);
From: Marcelo Ricardo Leitner marcelo.leitner@gmail.com
[ Upstream commit 471e39df96b9a4c4ba88a2da9e25a126624d7a9c ]
If a socket is set ipv6only, it will still send IPv4 addresses in the INIT and INIT_ACK packets. This potentially misleads the peer into using them, which then would cause association termination.
The fix is to not add IPv4 addresses to ipv6only sockets.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Corey Minyard cminyard@mvista.com Signed-off-by: Marcelo Ricardo Leitner marcelo.leitner@gmail.com Tested-by: Corey Minyard cminyard@mvista.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/sctp/constants.h | 8 +++++--- net/sctp/associola.c | 5 ++++- net/sctp/bind_addr.c | 1 + net/sctp/protocol.c | 3 ++- 4 files changed, 12 insertions(+), 5 deletions(-)
diff --git a/include/net/sctp/constants.h b/include/net/sctp/constants.h index 15b4d9aec7ff2..122d9e2d8dfde 100644 --- a/include/net/sctp/constants.h +++ b/include/net/sctp/constants.h @@ -353,11 +353,13 @@ enum { ipv4_is_anycast_6to4(a))
/* Flags used for the bind address copy functions. */ -#define SCTP_ADDR6_ALLOWED 0x00000001 /* IPv6 address is allowed by +#define SCTP_ADDR4_ALLOWED 0x00000001 /* IPv4 address is allowed by local sock family */ -#define SCTP_ADDR4_PEERSUPP 0x00000002 /* IPv4 address is supported by +#define SCTP_ADDR6_ALLOWED 0x00000002 /* IPv6 address is allowed by + local sock family */ +#define SCTP_ADDR4_PEERSUPP 0x00000004 /* IPv4 address is supported by peer */ -#define SCTP_ADDR6_PEERSUPP 0x00000004 /* IPv6 address is supported by +#define SCTP_ADDR6_PEERSUPP 0x00000008 /* IPv6 address is supported by peer */
/* Reasons to retransmit. */ diff --git a/net/sctp/associola.c b/net/sctp/associola.c index 437079a4883d2..732bc9a45190b 100644 --- a/net/sctp/associola.c +++ b/net/sctp/associola.c @@ -1565,12 +1565,15 @@ void sctp_assoc_rwnd_decrease(struct sctp_association *asoc, unsigned int len) int sctp_assoc_set_bind_addr_from_ep(struct sctp_association *asoc, enum sctp_scope scope, gfp_t gfp) { + struct sock *sk = asoc->base.sk; int flags;
/* Use scoping rules to determine the subset of addresses from * the endpoint. */ - flags = (PF_INET6 == asoc->base.sk->sk_family) ? SCTP_ADDR6_ALLOWED : 0; + flags = (PF_INET6 == sk->sk_family) ? SCTP_ADDR6_ALLOWED : 0; + if (!inet_v6_ipv6only(sk)) + flags |= SCTP_ADDR4_ALLOWED; if (asoc->peer.ipv4_address) flags |= SCTP_ADDR4_PEERSUPP; if (asoc->peer.ipv6_address) diff --git a/net/sctp/bind_addr.c b/net/sctp/bind_addr.c index 53bc61537f44f..701c5a4e441d9 100644 --- a/net/sctp/bind_addr.c +++ b/net/sctp/bind_addr.c @@ -461,6 +461,7 @@ static int sctp_copy_one_addr(struct net *net, struct sctp_bind_addr *dest, * well as the remote peer. */ if ((((AF_INET == addr->sa.sa_family) && + (flags & SCTP_ADDR4_ALLOWED) && (flags & SCTP_ADDR4_PEERSUPP))) || (((AF_INET6 == addr->sa.sa_family) && (flags & SCTP_ADDR6_ALLOWED) && diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c index 092d1afdee0d2..cde29f3c7fb3c 100644 --- a/net/sctp/protocol.c +++ b/net/sctp/protocol.c @@ -148,7 +148,8 @@ int sctp_copy_local_addr_list(struct net *net, struct sctp_bind_addr *bp, * sock as well as the remote peer. */ if (addr->a.sa.sa_family == AF_INET && - !(copy_flags & SCTP_ADDR4_PEERSUPP)) + (!(copy_flags & SCTP_ADDR4_ALLOWED) || + !(copy_flags & SCTP_ADDR4_PEERSUPP))) continue; if (addr->a.sa.sa_family == AF_INET6 && (!(copy_flags & SCTP_ADDR6_ALLOWED) ||
From: Denis Kirjanov kda@linux-powerpc.org
[ Upstream commit 2570284060b48f3f79d8f1a2698792f36c385e9a ]
there is a problem with the CWR flag set in an incoming ACK segment and it leads to the situation when the ECE flag is latched forever
the following packetdrill script shows what happens:
// Stack receives incoming segments with CE set +0.1 <[ect0] . 11001:12001(1000) ack 1001 win 65535 +0.0 <[ce] . 12001:13001(1000) ack 1001 win 65535 +0.0 <[ect0] P. 13001:14001(1000) ack 1001 win 65535
// Stack repsonds with ECN ECHO +0.0 >[noecn] . 1001:1001(0) ack 12001 +0.0 >[noecn] E. 1001:1001(0) ack 13001 +0.0 >[noecn] E. 1001:1001(0) ack 14001
// Write a packet +0.1 write(3, ..., 1000) = 1000 +0.0 >[ect0] PE. 1001:2001(1000) ack 14001
// Pure ACK received +0.01 <[noecn] W. 14001:14001(0) ack 2001 win 65535
// Since CWR was sent, this packet should NOT have ECE set
+0.1 write(3, ..., 1000) = 1000 +0.0 >[ect0] P. 2001:3001(1000) ack 14001 // but Linux will still keep ECE latched here, with packetdrill // flagging a missing ECE flag, expecting // >[ect0] PE. 2001:3001(1000) ack 14001 // in the script
In the situation above we will continue to send ECN ECHO packets and trigger the peer to reduce the congestion window. To avoid that we can check CWR on pure ACKs received.
v3: - Add a sequence check to avoid sending an ACK to an ACK
v2: - Adjusted the comment - move CWR check before checking for unacknowledged packets
Signed-off-by: Denis Kirjanov denis.kirjanov@suse.com Acked-by: Neal Cardwell ncardwell@google.com Signed-off-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/tcp_input.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 29c6fc8c77168..ccab8bc29e2b1 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -261,7 +261,8 @@ static void tcp_ecn_accept_cwr(struct sock *sk, const struct sk_buff *skb) * cwnd may be very low (even just 1 packet), so we should ACK * immediately. */ - inet_csk(sk)->icsk_ack.pending |= ICSK_ACK_NOW; + if (TCP_SKB_CB(skb)->seq != TCP_SKB_CB(skb)->end_seq) + inet_csk(sk)->icsk_ack.pending |= ICSK_ACK_NOW; } }
@@ -3683,6 +3684,15 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag) tcp_in_ack_event(sk, ack_ev_flags); }
+ /* This is a deviation from RFC3168 since it states that: + * "When the TCP data sender is ready to set the CWR bit after reducing + * the congestion window, it SHOULD set the CWR bit only on the first + * new data packet that it transmits." + * We accept CWR on pure ACKs to be more robust + * with widely-deployed TCP implementations that do this. + */ + tcp_ecn_accept_cwr(sk, skb); + /* We passed data and got it acked, remove any soft error * log. Something worked... */ @@ -4780,8 +4790,6 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb) skb_dst_drop(skb); __skb_pull(skb, tcp_hdr(skb)->doff * 4);
- tcp_ecn_accept_cwr(sk, skb); - tp->rx_opt.dsack = 0;
/* Queue data for delivery to the user.
From: Eric Dumazet edumazet@google.com
[ Upstream commit 662051215c758ae8545451628816204ed6cd372d ]
Back in 2013, we made a change that broke fast retransmit for non SACK flows.
Indeed, for these flows, a sender needs to receive three duplicate ACK before starting fast retransmit. Sending ACK with different receive window do not count.
Even if enabling SACK is strongly recommended these days, there still are some cases where it has to be disabled.
Not increasing the window seems better than having to rely on RTO.
After the fix, following packetdrill test gives :
// Initialize connection 0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1) = 0
+0 < S 0:0(0) win 32792 <mss 1000,nop,wscale 7> +0 > S. 0:0(0) ack 1 <mss 1460,nop,wscale 8> +0 < . 1:1(0) ack 1 win 514
+0 accept(3, ..., ...) = 4
+0 < . 1:1001(1000) ack 1 win 514 // Quick ack +0 > . 1:1(0) ack 1001 win 264
+0 < . 2001:3001(1000) ack 1 win 514 // DUPACK : Normally we should not change the window +0 > . 1:1(0) ack 1001 win 264
+0 < . 3001:4001(1000) ack 1 win 514 // DUPACK : Normally we should not change the window +0 > . 1:1(0) ack 1001 win 264
+0 < . 4001:5001(1000) ack 1 win 514 // DUPACK : Normally we should not change the window +0 > . 1:1(0) ack 1001 win 264
+0 < . 1001:2001(1000) ack 1 win 514 // Hole is repaired. +0 > . 1:1(0) ack 5001 win 272
Fixes: 4e4f1fc22681 ("tcp: properly increase rcv_ssthresh for ofo packets") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: Venkat Venkatsubra venkat.x.venkatsubra@oracle.com Acked-by: Neal Cardwell ncardwell@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/tcp_input.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index ccab8bc29e2b1..1fa009999f577 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -4603,7 +4603,11 @@ static void tcp_data_queue_ofo(struct sock *sk, struct sk_buff *skb) if (tcp_ooo_try_coalesce(sk, tp->ooo_last_skb, skb, &fragstolen)) { coalesce_done: - tcp_grow_window(sk, skb); + /* For non sack flows, do not grow window to force DUPACK + * and trigger fast retransmit. + */ + if (tcp_is_sack(tp)) + tcp_grow_window(sk, skb); kfree_skb_partial(skb, fragstolen); skb = NULL; goto add_sack; @@ -4687,7 +4691,11 @@ add_sack: tcp_sack_new_ofo_skb(sk, seq, end_seq); end: if (skb) { - tcp_grow_window(sk, skb); + /* For non sack flows, do not grow window to force DUPACK + * and trigger fast retransmit. + */ + if (tcp_is_sack(tp)) + tcp_grow_window(sk, skb); skb_condense(skb); skb_set_owner_r(skb, sk); }
From: David Christensen drc@linux.vnet.ibm.com
[ Upstream commit 3a2656a211caf35e56afc9425e6e518fa52f7fbc ]
The driver function tg3_io_error_detected() calls napi_disable twice, without an intervening napi_enable, when the number of EEH errors exceeds eeh_max_freezes, resulting in an indefinite sleep while holding rtnl_lock.
Add check for pcierr_recovery which skips code already executed for the "Frozen" state.
Signed-off-by: David Christensen drc@linux.vnet.ibm.com Reviewed-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/tg3.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c index ff98a82b7bc47..d71ce7634ac19 100644 --- a/drivers/net/ethernet/broadcom/tg3.c +++ b/drivers/net/ethernet/broadcom/tg3.c @@ -18170,8 +18170,8 @@ static pci_ers_result_t tg3_io_error_detected(struct pci_dev *pdev,
rtnl_lock();
- /* We probably don't have netdev yet */ - if (!netdev || !netif_running(netdev)) + /* Could be second call or maybe we don't have netdev yet */ + if (!netdev || tp->pcierr_recovery || !netif_running(netdev)) goto done;
/* We needn't recover from permanent error */
From: Taehee Yoo ap420073@gmail.com
[ Upstream commit dafabb6590cb15f300b77c095d50312e2c7c8e0f ]
In the datapath, the ip6gre_tunnel_lookup() is used and it internally uses fallback tunnel device pointer, which is fb_tunnel_dev. This pointer variable should be set to NULL when a fb interface is deleted. But there is no routine to set fb_tunnel_dev pointer to NULL. So, this pointer will be still used after interface is deleted and it eventually results in the use-after-free problem.
Test commands: ip netns add A ip netns add B ip link add eth0 type veth peer name eth1 ip link set eth0 netns A ip link set eth1 netns B
ip netns exec A ip link set lo up ip netns exec A ip link set eth0 up ip netns exec A ip link add ip6gre1 type ip6gre local fc:0::1 \ remote fc:0::2 ip netns exec A ip -6 a a fc:100::1/64 dev ip6gre1 ip netns exec A ip link set ip6gre1 up ip netns exec A ip -6 a a fc:0::1/64 dev eth0 ip netns exec A ip link set ip6gre0 up
ip netns exec B ip link set lo up ip netns exec B ip link set eth1 up ip netns exec B ip link add ip6gre1 type ip6gre local fc:0::2 \ remote fc:0::1 ip netns exec B ip -6 a a fc:100::2/64 dev ip6gre1 ip netns exec B ip link set ip6gre1 up ip netns exec B ip -6 a a fc:0::2/64 dev eth1 ip netns exec B ip link set ip6gre0 up ip netns exec A ping fc:100::2 -s 60000 & ip netns del B
Splat looks like: [ 73.087285][ C1] BUG: KASAN: use-after-free in ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre] [ 73.088361][ C1] Read of size 4 at addr ffff888040559218 by task ping/1429 [ 73.089317][ C1] [ 73.089638][ C1] CPU: 1 PID: 1429 Comm: ping Not tainted 5.7.0+ #602 [ 73.090531][ C1] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 73.091725][ C1] Call Trace: [ 73.092160][ C1] <IRQ> [ 73.092556][ C1] dump_stack+0x96/0xdb [ 73.093122][ C1] print_address_description.constprop.6+0x2cc/0x450 [ 73.094016][ C1] ? ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre] [ 73.094894][ C1] ? ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre] [ 73.095767][ C1] ? ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre] [ 73.096619][ C1] kasan_report+0x154/0x190 [ 73.097209][ C1] ? ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre] [ 73.097989][ C1] ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre] [ 73.098750][ C1] ? gre_del_protocol+0x60/0x60 [gre] [ 73.099500][ C1] gre_rcv+0x1c5/0x1450 [ip6_gre] [ 73.100199][ C1] ? ip6gre_header+0xf00/0xf00 [ip6_gre] [ 73.100985][ C1] ? rcu_read_lock_sched_held+0xc0/0xc0 [ 73.101830][ C1] ? ip6_input_finish+0x5/0xf0 [ 73.102483][ C1] ip6_protocol_deliver_rcu+0xcbb/0x1510 [ 73.103296][ C1] ip6_input_finish+0x5b/0xf0 [ 73.103920][ C1] ip6_input+0xcd/0x2c0 [ 73.104473][ C1] ? ip6_input_finish+0xf0/0xf0 [ 73.105115][ C1] ? rcu_read_lock_held+0x90/0xa0 [ 73.105783][ C1] ? rcu_read_lock_sched_held+0xc0/0xc0 [ 73.106548][ C1] ipv6_rcv+0x1f1/0x300 [ ... ]
Suggested-by: Eric Dumazet eric.dumazet@gmail.com Fixes: c12b395a4664 ("gre: Support GRE over IPv6") Signed-off-by: Taehee Yoo ap420073@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/ip6_gre.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c index 781ca8c07a0da..6532bde82b40a 100644 --- a/net/ipv6/ip6_gre.c +++ b/net/ipv6/ip6_gre.c @@ -127,6 +127,7 @@ static struct ip6_tnl *ip6gre_tunnel_lookup(struct net_device *dev, gre_proto == htons(ETH_P_ERSPAN2)) ? ARPHRD_ETHER : ARPHRD_IP6GRE; int score, cand_score = 4; + struct net_device *ndev;
for_each_ip_tunnel_rcu(t, ign->tunnels_r_l[h0 ^ h1]) { if (!ipv6_addr_equal(local, &t->parms.laddr) || @@ -238,9 +239,9 @@ static struct ip6_tnl *ip6gre_tunnel_lookup(struct net_device *dev, if (t && t->dev->flags & IFF_UP) return t;
- dev = ign->fb_tunnel_dev; - if (dev && dev->flags & IFF_UP) - return netdev_priv(dev); + ndev = READ_ONCE(ign->fb_tunnel_dev); + if (ndev && ndev->flags & IFF_UP) + return netdev_priv(ndev);
return NULL; } @@ -413,6 +414,8 @@ static void ip6gre_tunnel_uninit(struct net_device *dev)
ip6gre_tunnel_unlink_md(ign, t); ip6gre_tunnel_unlink(ign, t); + if (ign->fb_tunnel_dev == dev) + WRITE_ONCE(ign->fb_tunnel_dev, NULL); dst_cache_reset(&t->dst_cache); dev_put(dev); }
From: Florian Fainelli f.fainelli@gmail.com
[ Upstream commit b2ffc75e2e990b09903f9d15ccd53bc5f3a4217c ]
Commit 02a6efcab675 ("net: phy: allow scanning busses with missing phys") added a special condition to return -ENODEV in case -ENODEV or -EIO was returned from the first read of the MII_PHYSID1 register.
In case the MDIO bus data line pull-up is not strong enough, the MDIO bus controller will not flag this as a read error. This can happen when a pluggable daughter card is not connected and weak internal pull-ups are used (since that is the only option, otherwise the pins are floating).
The second read of MII_PHYSID2 will be correctly flagged an error though, but now we will return -EIO which will be treated as a hard error, thus preventing MDIO bus scanning loops to continue succesfully.
Apply the same logic to both register reads, thus allowing the scanning logic to proceed.
Fixes: 02a6efcab675 ("net: phy: allow scanning busses with missing phys") Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/phy/phy_device.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c index 697c74deb222b..0881b4b923632 100644 --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -798,8 +798,10 @@ static int get_phy_id(struct mii_bus *bus, int addr, u32 *phy_id,
/* Grab the bits from PHYIR2, and put them in the lower half */ phy_reg = mdiobus_read(bus, addr, MII_PHYSID2); - if (phy_reg < 0) - return -EIO; + if (phy_reg < 0) { + /* returning -ENODEV doesn't stop bus scanning */ + return (phy_reg == -EIO || phy_reg == -ENODEV) ? -ENODEV : -EIO; + }
*phy_id |= phy_reg;
From: Taehee Yoo ap420073@gmail.com
[ Upstream commit ba61539c6ae57f4146284a5cb4f7b7ed8d42bf45 ]
In the datapath, the ip_tunnel_lookup() is used and it internally uses fallback tunnel device pointer, which is fb_tunnel_dev. This pointer variable should be set to NULL when a fb interface is deleted. But there is no routine to set fb_tunnel_dev pointer to NULL. So, this pointer will be still used after interface is deleted and it eventually results in the use-after-free problem.
Test commands: ip netns add A ip netns add B ip link add eth0 type veth peer name eth1 ip link set eth0 netns A ip link set eth1 netns B
ip netns exec A ip link set lo up ip netns exec A ip link set eth0 up ip netns exec A ip link add gre1 type gre local 10.0.0.1 \ remote 10.0.0.2 ip netns exec A ip link set gre1 up ip netns exec A ip a a 10.0.100.1/24 dev gre1 ip netns exec A ip a a 10.0.0.1/24 dev eth0
ip netns exec B ip link set lo up ip netns exec B ip link set eth1 up ip netns exec B ip link add gre1 type gre local 10.0.0.2 \ remote 10.0.0.1 ip netns exec B ip link set gre1 up ip netns exec B ip a a 10.0.100.2/24 dev gre1 ip netns exec B ip a a 10.0.0.2/24 dev eth1 ip netns exec A hping3 10.0.100.2 -2 --flood -d 60000 & ip netns del B
Splat looks like: [ 77.793450][ C3] ================================================================== [ 77.794702][ C3] BUG: KASAN: use-after-free in ip_tunnel_lookup+0xcc4/0xf30 [ 77.795573][ C3] Read of size 4 at addr ffff888060bd9c84 by task hping3/2905 [ 77.796398][ C3] [ 77.796664][ C3] CPU: 3 PID: 2905 Comm: hping3 Not tainted 5.8.0-rc1+ #616 [ 77.797474][ C3] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 77.798453][ C3] Call Trace: [ 77.798815][ C3] <IRQ> [ 77.799142][ C3] dump_stack+0x9d/0xdb [ 77.799605][ C3] print_address_description.constprop.7+0x2cc/0x450 [ 77.800365][ C3] ? ip_tunnel_lookup+0xcc4/0xf30 [ 77.800908][ C3] ? ip_tunnel_lookup+0xcc4/0xf30 [ 77.801517][ C3] ? ip_tunnel_lookup+0xcc4/0xf30 [ 77.802145][ C3] kasan_report+0x154/0x190 [ 77.802821][ C3] ? ip_tunnel_lookup+0xcc4/0xf30 [ 77.803503][ C3] ip_tunnel_lookup+0xcc4/0xf30 [ 77.804165][ C3] __ipgre_rcv+0x1ab/0xaa0 [ip_gre] [ 77.804862][ C3] ? rcu_read_lock_sched_held+0xc0/0xc0 [ 77.805621][ C3] gre_rcv+0x304/0x1910 [ip_gre] [ 77.806293][ C3] ? lock_acquire+0x1a9/0x870 [ 77.806925][ C3] ? gre_rcv+0xfe/0x354 [gre] [ 77.807559][ C3] ? erspan_xmit+0x2e60/0x2e60 [ip_gre] [ 77.808305][ C3] ? rcu_read_lock_sched_held+0xc0/0xc0 [ 77.809032][ C3] ? rcu_read_lock_held+0x90/0xa0 [ 77.809713][ C3] gre_rcv+0x1b8/0x354 [gre] [ ... ]
Suggested-by: Eric Dumazet eric.dumazet@gmail.com Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.") Signed-off-by: Taehee Yoo ap420073@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/ip_tunnel.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c index cd4b84310d929..a0b4dc54f8a60 100644 --- a/net/ipv4/ip_tunnel.c +++ b/net/ipv4/ip_tunnel.c @@ -85,9 +85,10 @@ struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net *itn, __be32 remote, __be32 local, __be32 key) { - unsigned int hash; struct ip_tunnel *t, *cand = NULL; struct hlist_head *head; + struct net_device *ndev; + unsigned int hash;
hash = ip_tunnel_hash(key, remote); head = &itn->tunnels[hash]; @@ -162,8 +163,9 @@ struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net *itn, if (t && t->dev->flags & IFF_UP) return t;
- if (itn->fb_tunnel_dev && itn->fb_tunnel_dev->flags & IFF_UP) - return netdev_priv(itn->fb_tunnel_dev); + ndev = READ_ONCE(itn->fb_tunnel_dev); + if (ndev && ndev->flags & IFF_UP) + return netdev_priv(ndev);
return NULL; } @@ -1245,9 +1247,9 @@ void ip_tunnel_uninit(struct net_device *dev) struct ip_tunnel_net *itn;
itn = net_generic(net, tunnel->ip_tnl_net_id); - /* fb_tunnel_dev will be unregisted in net-exit call. */ - if (itn->fb_tunnel_dev != dev) - ip_tunnel_del(itn, netdev_priv(dev)); + ip_tunnel_del(itn, netdev_priv(dev)); + if (itn->fb_tunnel_dev == dev) + WRITE_ONCE(itn->fb_tunnel_dev, NULL);
dst_cache_reset(&tunnel->dst_cache); }
From: Michael Chan michael.chan@broadcom.com
[ Upstream commit d0ad2ea2bc185835f8a749302ad07b70528d2a09 ]
We currently only store the firmware version as a string for ethtool and devlink info. Store it also as a version code. The next 2 patches will need to check the firmware major version to determine some workarounds.
We also use the 16-bit firmware version fields if the firmware is newer and provides the 16-bit fields.
Reviewed-by: Edwin Peer edwin.peer@broadcom.com Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 22 ++++++++++++++++++---- drivers/net/ethernet/broadcom/bnxt/bnxt.h | 4 ++++ 2 files changed, 22 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 19c4a0a5727a4..83ed6f31a1fae 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -7217,8 +7217,9 @@ static int __bnxt_hwrm_ver_get(struct bnxt *bp, bool silent) static int bnxt_hwrm_ver_get(struct bnxt *bp) { struct hwrm_ver_get_output *resp = bp->hwrm_cmd_resp_addr; + u16 fw_maj, fw_min, fw_bld, fw_rsv; u32 dev_caps_cfg, hwrm_ver; - int rc; + int rc, len;
bp->hwrm_max_req_len = HWRM_MAX_REQ_LEN; mutex_lock(&bp->hwrm_cmd_lock); @@ -7250,9 +7251,22 @@ static int bnxt_hwrm_ver_get(struct bnxt *bp) resp->hwrm_intf_maj_8b, resp->hwrm_intf_min_8b, resp->hwrm_intf_upd_8b);
- snprintf(bp->fw_ver_str, BC_HWRM_STR_LEN, "%d.%d.%d.%d", - resp->hwrm_fw_maj_8b, resp->hwrm_fw_min_8b, - resp->hwrm_fw_bld_8b, resp->hwrm_fw_rsvd_8b); + fw_maj = le16_to_cpu(resp->hwrm_fw_major); + if (bp->hwrm_spec_code > 0x10803 && fw_maj) { + fw_min = le16_to_cpu(resp->hwrm_fw_minor); + fw_bld = le16_to_cpu(resp->hwrm_fw_build); + fw_rsv = le16_to_cpu(resp->hwrm_fw_patch); + len = FW_VER_STR_LEN; + } else { + fw_maj = resp->hwrm_fw_maj_8b; + fw_min = resp->hwrm_fw_min_8b; + fw_bld = resp->hwrm_fw_bld_8b; + fw_rsv = resp->hwrm_fw_rsvd_8b; + len = BC_HWRM_STR_LEN; + } + bp->fw_ver_code = BNXT_FW_VER_CODE(fw_maj, fw_min, fw_bld, fw_rsv); + snprintf(bp->fw_ver_str, len, "%d.%d.%d.%d", fw_maj, fw_min, fw_bld, + fw_rsv);
if (strlen(resp->active_pkg_name)) { int fw_ver_len = strlen(bp->fw_ver_str); diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h index 3d39638521d6c..a880aea0c20b5 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h @@ -1729,6 +1729,10 @@ struct bnxt { #define PHY_VER_STR_LEN (FW_VER_STR_LEN - BC_HWRM_STR_LEN) char fw_ver_str[FW_VER_STR_LEN]; char hwrm_ver_supp[FW_VER_STR_LEN]; + u64 fw_ver_code; +#define BNXT_FW_VER_CODE(maj, min, bld, rsv) \ + ((u64)(maj) << 48 | (u64)(min) << 32 | (u64)(bld) << 16 | (rsv)) + __be16 vxlan_port; u8 vxlan_port_cnt; __le16 vxlan_fw_dst_port_id;
From: Michael Chan michael.chan@broadcom.com
[ Upstream commit fed7edd18143c68c63ea049999a7e861123de6de ]
Older firmware may not support legacy TX push properly and may not be disabling it. So we check certain firmware versions that may have this problem and disable legacy TX push unconditionally.
Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.") Reviewed-by: Edwin Peer edwin.peer@broadcom.com Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 3 ++- drivers/net/ethernet/broadcom/bnxt/bnxt.h | 1 + 2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 83ed6f31a1fae..6bf97b3acdadc 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -6953,7 +6953,8 @@ static int __bnxt_hwrm_func_qcaps(struct bnxt *bp) bp->fw_cap |= BNXT_FW_CAP_ERR_RECOVER_RELOAD;
bp->tx_push_thresh = 0; - if (flags & FUNC_QCAPS_RESP_FLAGS_PUSH_MODE_SUPPORTED) + if ((flags & FUNC_QCAPS_RESP_FLAGS_PUSH_MODE_SUPPORTED) && + BNXT_FW_MAJ(bp) > 217) bp->tx_push_thresh = BNXT_TX_PUSH_THRESH;
hw_resc->max_rsscos_ctxs = le16_to_cpu(resp->max_rsscos_ctx); diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h index a880aea0c20b5..23ee433db8644 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h @@ -1732,6 +1732,7 @@ struct bnxt { u64 fw_ver_code; #define BNXT_FW_VER_CODE(maj, min, bld, rsv) \ ((u64)(maj) << 48 | (u64)(min) << 32 | (u64)(bld) << 16 | (rsv)) +#define BNXT_FW_MAJ(bp) ((bp)->fw_ver_code >> 48)
__be16 vxlan_port; u8 vxlan_port_cnt;
From: Michael Chan michael.chan@broadcom.com
[ Upstream commit c2dec363feb41544a76c8083aca2378990e17166 ]
On older firmware, the hardware statistics are not cleared when the driver frees the hardware stats contexts during ifdown. The driver expects these stats to be cleared and saves a copy before freeing the stats contexts. During the next ifup, the driver will likely allocate the same hardware stats contexts and this will cause a big increase in the counters as the old counters are added back to the saved counters.
We fix it by making an additional firmware call to clear the counters before freeing the hw stats contexts when the firmware is the older 20.x firmware.
Fixes: b8875ca356f1 ("bnxt_en: Save ring statistics before reset.") Reported-by: Jakub Kicinski kicinski@fb.com Reviewed-by: Vasundhara Volam vasundhara-v.volam@broadcom.com Signed-off-by: Michael Chan michael.chan@broadcom.com Tested-by: Jakub Kicinski kicinski@fb.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 6bf97b3acdadc..c202c2a3d140d 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -6293,6 +6293,7 @@ int bnxt_hwrm_set_coal(struct bnxt *bp)
static void bnxt_hwrm_stat_ctx_free(struct bnxt *bp) { + struct hwrm_stat_ctx_clr_stats_input req0 = {0}; struct hwrm_stat_ctx_free_input req = {0}; int i;
@@ -6302,6 +6303,7 @@ static void bnxt_hwrm_stat_ctx_free(struct bnxt *bp) if (BNXT_CHIP_TYPE_NITRO_A0(bp)) return;
+ bnxt_hwrm_cmd_hdr_init(bp, &req0, HWRM_STAT_CTX_CLR_STATS, -1, -1); bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_STAT_CTX_FREE, -1, -1);
mutex_lock(&bp->hwrm_cmd_lock); @@ -6311,7 +6313,11 @@ static void bnxt_hwrm_stat_ctx_free(struct bnxt *bp)
if (cpr->hw_stats_ctx_id != INVALID_STATS_CTX_ID) { req.stat_ctx_id = cpu_to_le32(cpr->hw_stats_ctx_id); - + if (BNXT_FW_MAJ(bp) <= 20) { + req0.stat_ctx_id = req.stat_ctx_id; + _hwrm_send_message(bp, &req0, sizeof(req0), + HWRM_CMD_TIMEOUT); + } _hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
From: Vasundhara Volam vasundhara-v.volam@broadcom.com
[ Upstream commit c55e28a8b43fcd7dc71868bd165705bc7741a7ca ]
Virtual functions does not have VPD information. This patch modifies calling bnxt_read_vpd_info() only for PFs and avoids an unnecessary error log.
Fixes: a0d0fd70fed5 ("bnxt_en: Read partno and serialno of the board from VPD") Signed-off-by: Vasundhara Volam vasundhara-v.volam@broadcom.com Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index c202c2a3d140d..b6fb5a1709c01 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -11884,7 +11884,8 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) dev->ethtool_ops = &bnxt_ethtool_ops; pci_set_drvdata(pdev, dev);
- bnxt_vpd_read_info(bp); + if (BNXT_PF(bp)) + bnxt_vpd_read_info(bp);
rc = bnxt_alloc_hwrm_resources(bp); if (rc)
From: Russell King rmk+kernel@armlinux.org.uk
[ Upstream commit c718af2d00a37587b09e5958d142da7569f3d55b ]
Fix a phylink's ethtool set_pauseparam support deadlock caused by phylib interacting with phylink: we must not hold the state lock while calling phylib functions that may call into phylink_phy_change().
Fixes: f904f15ea9b5 ("net: phylink: allow ethtool -A to change flow control advertisement") Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/phy/phylink.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c index 34ca12aec61b3..b0ddeab2a8d2f 100644 --- a/drivers/net/phy/phylink.c +++ b/drivers/net/phy/phylink.c @@ -1519,18 +1519,20 @@ int phylink_ethtool_set_pauseparam(struct phylink *pl, linkmode_set_pause(config->advertising, pause->tx_pause, pause->rx_pause);
- /* If we have a PHY, phylib will call our link state function if the - * mode has changed, which will trigger a resolve and update the MAC - * configuration. + if (!pl->phydev && !test_bit(PHYLINK_DISABLE_STOPPED, + &pl->phylink_disable_state)) + phylink_pcs_config(pl, true, &pl->link_config); + + mutex_unlock(&pl->state_mutex); + + /* If we have a PHY, a change of the pause frame advertisement will + * cause phylib to renegotiate (if AN is enabled) which will in turn + * call our phylink_phy_change() and trigger a resolve. Note that + * we can't hold our state mutex while calling phy_set_asym_pause(). */ - if (pl->phydev) { + if (pl->phydev) phy_set_asym_pause(pl->phydev, pause->rx_pause, pause->tx_pause); - } else if (!test_bit(PHYLINK_DISABLE_STOPPED, - &pl->phylink_disable_state)) { - phylink_pcs_config(pl, true, &pl->link_config); - } - mutex_unlock(&pl->state_mutex);
return 0; }
From: Russell King rmk+kernel@armlinux.org.uk
[ Upstream commit 2e919bc446faee429ac862a6cdb5e40017051f6b ]
We have been relying on link events and mac_config() when the manual pause modes are changed. With recent developments, such as moving the programming of link state to mac_link_up(), this no longer works.
To ensure that we update the MAC, we must generate a link-down followed by a link-up event; we can do that by setting mac_link_dropped and triggering a resolve.
Fixes: 91a208f2185a ("net: phylink: propagate resolved link config via mac_link_up()") Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/phy/phylink.c | 27 ++++++++++++++++++++++----- 1 file changed, 22 insertions(+), 5 deletions(-)
diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c index b0ddeab2a8d2f..ac38bead1cd25 100644 --- a/drivers/net/phy/phylink.c +++ b/drivers/net/phy/phylink.c @@ -1480,6 +1480,8 @@ int phylink_ethtool_set_pauseparam(struct phylink *pl, struct ethtool_pauseparam *pause) { struct phylink_link_state *config = &pl->link_config; + bool manual_changed; + int pause_state;
ASSERT_RTNL();
@@ -1494,15 +1496,15 @@ int phylink_ethtool_set_pauseparam(struct phylink *pl, !pause->autoneg && pause->rx_pause != pause->tx_pause) return -EINVAL;
- mutex_lock(&pl->state_mutex); - config->pause = 0; + pause_state = 0; if (pause->autoneg) - config->pause |= MLO_PAUSE_AN; + pause_state |= MLO_PAUSE_AN; if (pause->rx_pause) - config->pause |= MLO_PAUSE_RX; + pause_state |= MLO_PAUSE_RX; if (pause->tx_pause) - config->pause |= MLO_PAUSE_TX; + pause_state |= MLO_PAUSE_TX;
+ mutex_lock(&pl->state_mutex); /* * See the comments for linkmode_set_pause(), wrt the deficiencies * with the current implementation. A solution to this issue would @@ -1519,6 +1521,12 @@ int phylink_ethtool_set_pauseparam(struct phylink *pl, linkmode_set_pause(config->advertising, pause->tx_pause, pause->rx_pause);
+ manual_changed = (config->pause ^ pause_state) & MLO_PAUSE_AN || + (!(pause_state & MLO_PAUSE_AN) && + (config->pause ^ pause_state) & MLO_PAUSE_TXRX_MASK); + + config->pause = pause_state; + if (!pl->phydev && !test_bit(PHYLINK_DISABLE_STOPPED, &pl->phylink_disable_state)) phylink_pcs_config(pl, true, &pl->link_config); @@ -1534,6 +1542,15 @@ int phylink_ethtool_set_pauseparam(struct phylink *pl, phy_set_asym_pause(pl->phydev, pause->rx_pause, pause->tx_pause);
+ /* If the manual pause settings changed, make sure we trigger a + * resolve to update their state; we can not guarantee that the + * link will cycle. + */ + if (manual_changed) { + pl->mac_link_dropped = true; + phylink_run_resolve(pl); + } + return 0; } EXPORT_SYMBOL_GPL(phylink_ethtool_set_pauseparam);
From: Ilya Ponetayev i.ponetaev@ndmsystems.com
[ Upstream commit 9208d2863ac689a563b92f2161d8d1e7127d0add ]
cake_handle_diffserv() tries to linearize mac and network header parts of skb and to make it writable unconditionally. In some cases it leads to full skb reallocation, which reduces throughput and increases CPU load. Some measurements of IPv4 forward + NAPT on MIPS router with 580 MHz single-core CPU was conducted. It appears that on kernel 4.9 skb_try_make_writable() reallocates skb, if skb was allocated in ethernet driver via so-called 'build skb' method from page cache (it was discovered by strange increase of kmalloc-2048 slab at first).
Obtain DSCP value via read-only skb_header_pointer() call, and leave linearization only for DSCP bleaching or ECN CE setting. And, as an additional optimisation, skip diffserv parsing entirely if it is not needed by the current configuration.
Fixes: c87b4ecdbe8d ("sch_cake: Make sure we can write the IP header before changing DSCP bits") Signed-off-by: Ilya Ponetayev i.ponetaev@ndmsystems.com [ fix a few style issues, reflow commit message ] Signed-off-by: Toke Høiland-Jørgensen toke@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_cake.c | 41 ++++++++++++++++++++++++++++++----------- 1 file changed, 30 insertions(+), 11 deletions(-)
diff --git a/net/sched/sch_cake.c b/net/sched/sch_cake.c index 1496e87cd07bb..a92d6c57aa9a5 100644 --- a/net/sched/sch_cake.c +++ b/net/sched/sch_cake.c @@ -1516,30 +1516,49 @@ static unsigned int cake_drop(struct Qdisc *sch, struct sk_buff **to_free)
static u8 cake_handle_diffserv(struct sk_buff *skb, u16 wash) { - int wlen = skb_network_offset(skb); + const int offset = skb_network_offset(skb); + u16 *buf, buf_; u8 dscp;
switch (tc_skb_protocol(skb)) { case htons(ETH_P_IP): - wlen += sizeof(struct iphdr); - if (!pskb_may_pull(skb, wlen) || - skb_try_make_writable(skb, wlen)) + buf = skb_header_pointer(skb, offset, sizeof(buf_), &buf_); + if (unlikely(!buf)) return 0;
- dscp = ipv4_get_dsfield(ip_hdr(skb)) >> 2; - if (wash && dscp) + /* ToS is in the second byte of iphdr */ + dscp = ipv4_get_dsfield((struct iphdr *)buf) >> 2; + + if (wash && dscp) { + const int wlen = offset + sizeof(struct iphdr); + + if (!pskb_may_pull(skb, wlen) || + skb_try_make_writable(skb, wlen)) + return 0; + ipv4_change_dsfield(ip_hdr(skb), INET_ECN_MASK, 0); + } + return dscp;
case htons(ETH_P_IPV6): - wlen += sizeof(struct ipv6hdr); - if (!pskb_may_pull(skb, wlen) || - skb_try_make_writable(skb, wlen)) + buf = skb_header_pointer(skb, offset, sizeof(buf_), &buf_); + if (unlikely(!buf)) return 0;
- dscp = ipv6_get_dsfield(ipv6_hdr(skb)) >> 2; - if (wash && dscp) + /* Traffic class is in the first and second bytes of ipv6hdr */ + dscp = ipv6_get_dsfield((struct ipv6hdr *)buf) >> 2; + + if (wash && dscp) { + const int wlen = offset + sizeof(struct ipv6hdr); + + if (!pskb_may_pull(skb, wlen) || + skb_try_make_writable(skb, wlen)) + return 0; + ipv6_change_dsfield(ipv6_hdr(skb), INET_ECN_MASK, 0); + } + return dscp;
case htons(ETH_P_ARP):
From: Toke Høiland-Jørgensen toke@redhat.com
[ Upstream commit 8c95eca0bb8c4bd2231a0d581f1ad0d50c90488c ]
As a further optimisation of the diffserv parsing codepath, we can skip it entirely if CAKE is configured to neither use diffserv-based classification, nor to zero out the diffserv bits.
Fixes: c87b4ecdbe8d ("sch_cake: Make sure we can write the IP header before changing DSCP bits") Signed-off-by: Toke Høiland-Jørgensen toke@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_cake.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/net/sched/sch_cake.c b/net/sched/sch_cake.c index a92d6c57aa9a5..3482f9569dce5 100644 --- a/net/sched/sch_cake.c +++ b/net/sched/sch_cake.c @@ -1514,7 +1514,7 @@ static unsigned int cake_drop(struct Qdisc *sch, struct sk_buff **to_free) return idx + (tin << 16); }
-static u8 cake_handle_diffserv(struct sk_buff *skb, u16 wash) +static u8 cake_handle_diffserv(struct sk_buff *skb, bool wash) { const int offset = skb_network_offset(skb); u16 *buf, buf_; @@ -1575,14 +1575,17 @@ static struct cake_tin_data *cake_select_tin(struct Qdisc *sch, { struct cake_sched_data *q = qdisc_priv(sch); u32 tin, mark; + bool wash; u8 dscp;
/* Tin selection: Default to diffserv-based selection, allow overriding - * using firewall marks or skb->priority. + * using firewall marks or skb->priority. Call DSCP parsing early if + * wash is enabled, otherwise defer to below to skip unneeded parsing. */ - dscp = cake_handle_diffserv(skb, - q->rate_flags & CAKE_FLAG_WASH); mark = (skb->mark & q->fwmark_mask) >> q->fwmark_shft; + wash = !!(q->rate_flags & CAKE_FLAG_WASH); + if (wash) + dscp = cake_handle_diffserv(skb, wash);
if (q->tin_mode == CAKE_DIFFSERV_BESTEFFORT) tin = 0; @@ -1596,6 +1599,8 @@ static struct cake_tin_data *cake_select_tin(struct Qdisc *sch, tin = q->tin_order[TC_H_MIN(skb->priority) - 1];
else { + if (!wash) + dscp = cake_handle_diffserv(skb, wash); tin = q->tin_index[dscp];
if (unlikely(tin >= q->tin_cnt))
From: Toke Høiland-Jørgensen toke@redhat.com
[ Upstream commit 3f608f0c41360b11b04c763f348b712f651c8bac ]
I spotted a few nits when comparing the in-tree version of sch_cake with the out-of-tree one: A redundant error variable declaration shadowing an outer declaration, and an indentation alignment issue. Fix both of these.
Fixes: 046f6fd5daef ("sched: Add Common Applications Kept Enhanced (cake) qdisc") Signed-off-by: Toke Høiland-Jørgensen toke@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_cake.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/sched/sch_cake.c b/net/sched/sch_cake.c index 3482f9569dce5..9475fa81ea7f6 100644 --- a/net/sched/sch_cake.c +++ b/net/sched/sch_cake.c @@ -2678,7 +2678,7 @@ static int cake_init(struct Qdisc *sch, struct nlattr *opt, qdisc_watchdog_init(&q->watchdog, sch);
if (opt) { - int err = cake_change(sch, opt, extack); + err = cake_change(sch, opt, extack);
if (err) return err; @@ -2995,7 +2995,7 @@ static int cake_dump_class_stats(struct Qdisc *sch, unsigned long cl, PUT_STAT_S32(BLUE_TIMER_US, ktime_to_us( ktime_sub(now, - flow->cvars.blue_timer))); + flow->cvars.blue_timer))); } if (flow->cvars.dropping) { PUT_STAT_S32(DROP_NEXT_US,
From: Neal Cardwell ncardwell@google.com
[ Upstream commit b344579ca8478598937215f7005d6c7b84d28aee ]
Mirja Kuehlewind reported a bug in Linux TCP CUBIC Hystart, where Hystart HYSTART_DELAY mechanism can exit Slow Start spuriously on an ACK when the minimum rtt of a connection goes down. From inspection it is clear from the existing code that this could happen in an example like the following:
o The first 8 RTT samples in a round trip are 150ms, resulting in a curr_rtt of 150ms and a delay_min of 150ms.
o The 9th RTT sample is 100ms. The curr_rtt does not change after the first 8 samples, so curr_rtt remains 150ms. But delay_min can be lowered at any time, so delay_min falls to 100ms. The code executes the HYSTART_DELAY comparison between curr_rtt of 150ms and delay_min of 100ms, and the curr_rtt is declared far enough above delay_min to force a (spurious) exit of Slow start.
The fix here is simple: allow every RTT sample in a round trip to lower the curr_rtt.
Fixes: ae27e98a5152 ("[TCP] CUBIC v2.3") Reported-by: Mirja Kuehlewind mirja.kuehlewind@ericsson.com Signed-off-by: Neal Cardwell ncardwell@google.com Signed-off-by: Eric Dumazet edumazet@google.com Acked-by: Soheil Hassas Yeganeh soheil@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/tcp_cubic.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/tcp_cubic.c b/net/ipv4/tcp_cubic.c index 8f8eefd3a3ce1..c7bf5b26bf0c2 100644 --- a/net/ipv4/tcp_cubic.c +++ b/net/ipv4/tcp_cubic.c @@ -432,10 +432,9 @@ static void hystart_update(struct sock *sk, u32 delay)
if (hystart_detect & HYSTART_DELAY) { /* obtain the minimum delay of more than sampling packets */ + if (ca->curr_rtt > delay) + ca->curr_rtt = delay; if (ca->sample_cnt < HYSTART_MIN_SAMPLES) { - if (ca->curr_rtt > delay) - ca->curr_rtt = delay; - ca->sample_cnt++; } else { if (ca->curr_rtt > ca->delay_min +
From: Neal Cardwell ncardwell@google.com
[ Upstream commit 7d21d54d624777358ab6c7be7ff778808fef70ba ]
Apply the fix from: "tcp_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT" to the BPF implementation of TCP CUBIC congestion control.
Repeating the commit description here for completeness:
Mirja Kuehlewind reported a bug in Linux TCP CUBIC Hystart, where Hystart HYSTART_DELAY mechanism can exit Slow Start spuriously on an ACK when the minimum rtt of a connection goes down. From inspection it is clear from the existing code that this could happen in an example like the following:
o The first 8 RTT samples in a round trip are 150ms, resulting in a curr_rtt of 150ms and a delay_min of 150ms.
o The 9th RTT sample is 100ms. The curr_rtt does not change after the first 8 samples, so curr_rtt remains 150ms. But delay_min can be lowered at any time, so delay_min falls to 100ms. The code executes the HYSTART_DELAY comparison between curr_rtt of 150ms and delay_min of 100ms, and the curr_rtt is declared far enough above delay_min to force a (spurious) exit of Slow start.
The fix here is simple: allow every RTT sample in a round trip to lower the curr_rtt.
Fixes: 6de4a9c430b5 ("bpf: tcp: Add bpf_cubic example") Reported-by: Mirja Kuehlewind mirja.kuehlewind@ericsson.com Signed-off-by: Neal Cardwell ncardwell@google.com Signed-off-by: Eric Dumazet edumazet@google.com Acked-by: Soheil Hassas Yeganeh soheil@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/bpf/progs/bpf_cubic.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/bpf/progs/bpf_cubic.c b/tools/testing/selftests/bpf/progs/bpf_cubic.c index 7897c8f4d363b..ef574087f1e1f 100644 --- a/tools/testing/selftests/bpf/progs/bpf_cubic.c +++ b/tools/testing/selftests/bpf/progs/bpf_cubic.c @@ -480,10 +480,9 @@ static __always_inline void hystart_update(struct sock *sk, __u32 delay)
if (hystart_detect & HYSTART_DELAY) { /* obtain the minimum delay of more than sampling packets */ + if (ca->curr_rtt > delay) + ca->curr_rtt = delay; if (ca->sample_cnt < HYSTART_MIN_SAMPLES) { - if (ca->curr_rtt > delay) - ca->curr_rtt = delay; - ca->sample_cnt++; } else { if (ca->curr_rtt > ca->delay_min +
From: Claudiu Beznea claudiu.beznea@microchip.com
[ Upstream commit faa620876b01d6744f1599e279042bb8149247ab ]
Undo previously done operation in case macb_phylink_connect() fails. Since macb_reset_hw() is the 1st undo operation the napi_exit label was renamed to reset_hw.
Fixes: 7897b071ac3b ("net: macb: convert to phylink") Signed-off-by: Claudiu Beznea claudiu.beznea@microchip.com Acked-by: Nicolas Ferre nicolas.ferre@microchip.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/cadence/macb_main.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c index 67933079aeea5..257c4920cb886 100644 --- a/drivers/net/ethernet/cadence/macb_main.c +++ b/drivers/net/ethernet/cadence/macb_main.c @@ -2558,7 +2558,7 @@ static int macb_open(struct net_device *dev)
err = macb_phylink_connect(bp); if (err) - goto napi_exit; + goto reset_hw;
netif_tx_start_all_queues(dev);
@@ -2567,9 +2567,11 @@ static int macb_open(struct net_device *dev)
return 0;
-napi_exit: +reset_hw: + macb_reset_hw(bp); for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) napi_disable(&queue->napi); + macb_free_consistent(bp); pm_exit: pm_runtime_put_sync(&bp->pdev->dev); return err;
From: Heiner Kallweit hkallweit1@gmail.com
[ Upstream commit 89fbd26cca7ec9e82ec4787a4b6e95939b57d073 ]
Typically the firmware takes care that tp->ocp_base is reset to its default value. That's not the case (at least) for RTL8117. As a result subsequent PHY access reads/writes the wrong page and the link is broken. Fix this be resetting tp->ocp_base explicitly.
Fixes: 229c1e0dfd3d ("r8169: load firmware for RTL8168fp/RTL8117") Reported-by: Aaron Ma mapengyu@gmail.com Tested-by: Aaron Ma mapengyu@gmail.com Signed-off-by: Heiner Kallweit hkallweit1@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/realtek/r8169_main.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c index c51b48dc36397..7bda2671bd5b6 100644 --- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -2192,8 +2192,11 @@ static void rtl_release_firmware(struct rtl8169_private *tp) void r8169_apply_firmware(struct rtl8169_private *tp) { /* TODO: release firmware if rtl_fw_write_firmware signals failure. */ - if (tp->rtl_fw) + if (tp->rtl_fw) { rtl_fw_write_firmware(tp, tp->rtl_fw); + /* At least one firmware doesn't reset tp->ocp_base. */ + tp->ocp_base = OCP_STD_PHY_BASE; + } }
static void rtl8168_config_eee_mac(struct rtl8169_private *tp)
From: Geliang Tang geliangtang@gmail.com
[ Upstream commit b562f58bbc12444219b74a5d6524977a3d87a022 ]
In RFC 8684, we don't need to send sndr_key in SYN package anymore, so drop it.
Fixes: cc7972ea1932 ("mptcp: parse and emit MP_CAPABLE option according to v1 spec") Signed-off-by: Geliang Tang geliangtang@gmail.com Reviewed-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/options.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/net/mptcp/options.c b/net/mptcp/options.c index 1c20dd14b2aa2..2430bbfa34059 100644 --- a/net/mptcp/options.c +++ b/net/mptcp/options.c @@ -336,9 +336,7 @@ bool mptcp_syn_options(struct sock *sk, const struct sk_buff *skb, */ subflow->snd_isn = TCP_SKB_CB(skb)->end_seq; if (subflow->request_mptcp) { - pr_debug("local_key=%llu", subflow->local_key); opts->suboptions = OPTION_MPTCP_MPC_SYN; - opts->sndr_key = subflow->local_key; *size = TCPOLEN_MPTCP_MPC_SYN; return true; } else if (subflow->request_join) {
From: Wei Yongjun weiyongjun1@huawei.com
[ Upstream commit b8ad540dd4e40566c520dff491fc06c71ae6b989 ]
socket malloced by sock_create_kern() should be release before return in the error handling, otherwise it cause memory leak.
unreferenced object 0xffff88810910c000 (size 1216): comm "00000003_test_m", pid 12238, jiffies 4295050289 (age 54.237s) hex dump (first 32 bytes): 01 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 2f 30 0a 81 88 ff ff ........./0..... backtrace: [<00000000e877f89f>] sock_alloc_inode+0x18/0x1c0 [<0000000093d1dd51>] alloc_inode+0x63/0x1d0 [<000000005673fec6>] new_inode_pseudo+0x14/0xe0 [<00000000b5db6be8>] sock_alloc+0x3c/0x260 [<00000000e7e3cbb2>] __sock_create+0x89/0x620 [<0000000023e48593>] mptcp_subflow_create_socket+0xc0/0x5e0 [<00000000419795e4>] __mptcp_socket_create+0x1ad/0x3f0 [<00000000b2f942e8>] mptcp_stream_connect+0x281/0x4f0 [<00000000c80cd5cc>] __sys_connect_file+0x14d/0x190 [<00000000dc761f11>] __sys_connect+0x128/0x160 [<000000008b14e764>] __x64_sys_connect+0x6f/0xb0 [<000000007b4f93bd>] do_syscall_64+0xa1/0x530 [<00000000d3e770b6>] entry_SYSCALL_64_after_hwframe+0x49/0xb3
Fixes: 2303f994b3e1 ("mptcp: Associate MPTCP context with TCP socket") Signed-off-by: Wei Yongjun weiyongjun1@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/subflow.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index e6feb05a93dc3..db3e4e74e7857 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -1015,8 +1015,10 @@ int mptcp_subflow_create_socket(struct sock *sk, struct socket **new_sock) err = tcp_set_ulp(sf->sk, "mptcp"); release_sock(sf->sk);
- if (err) + if (err) { + sock_release(sf); return err; + }
/* the newly created socket really belongs to the owning MPTCP master * socket, even if for additional subflows the allocation is performed
From: Alexander Lobakin alobakin@pm.me
[ Upstream commit eddbf5d0204e550ee59de02bdc19fe90d4203dd6 ]
Commit 3b33583265ed ("net: Add fraglist GRO/GSO feature flags") missed an entry for NETIF_F_GSO_FRAGLIST in netdev_features_strings array. As a result, fraglist GSO feature is not shown in 'ethtool -k' output and can't be toggled on/off. The fix is trivial.
Fixes: 3b33583265ed ("net: Add fraglist GRO/GSO feature flags") Signed-off-by: Alexander Lobakin alobakin@pm.me Reviewed-by: Michal Kubecek mkubecek@suse.cz Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ethtool/common.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/net/ethtool/common.c b/net/ethtool/common.c index 7f7fff88c5d3e..aaecfc916a4db 100644 --- a/net/ethtool/common.c +++ b/net/ethtool/common.c @@ -44,6 +44,7 @@ const char netdev_features_strings[NETDEV_FEATURE_COUNT][ETH_GSTRING_LEN] = { [NETIF_F_GSO_SCTP_BIT] = "tx-sctp-segmentation", [NETIF_F_GSO_ESP_BIT] = "tx-esp-segmentation", [NETIF_F_GSO_UDP_L4_BIT] = "tx-udp-segmentation", + [NETIF_F_GSO_FRAGLIST_BIT] = "tx-gso-list",
[NETIF_F_FCOE_CRC_BIT] = "tx-checksum-fcoe-crc", [NETIF_F_SCTP_CRC_BIT] = "tx-checksum-sctp",
From: Claudiu Beznea claudiu.beznea@microchip.com
[ Upstream commit 0eaf228d574bd82a9aed73e3953bfb81721f4227 ]
Call pm_runtime_put_sync() on failure path of at91ether_open.
Fixes: e6a41c23df0d ("net: macb: ensure interface is not suspended on at91rm9200") Signed-off-by: Claudiu Beznea claudiu.beznea@microchip.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/cadence/macb_main.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c index 257c4920cb886..5705359a36124 100644 --- a/drivers/net/ethernet/cadence/macb_main.c +++ b/drivers/net/ethernet/cadence/macb_main.c @@ -3837,7 +3837,7 @@ static int at91ether_open(struct net_device *dev)
ret = at91ether_start(dev); if (ret) - return ret; + goto pm_exit;
/* Enable MAC interrupts */ macb_writel(lp, IER, MACB_BIT(RCOMP) | @@ -3850,11 +3850,15 @@ static int at91ether_open(struct net_device *dev)
ret = macb_phylink_connect(lp); if (ret) - return ret; + goto pm_exit;
netif_start_queue(dev);
return 0; + +pm_exit: + pm_runtime_put_sync(&lp->pdev->dev); + return ret; }
/* Close the interface */
From: Ard Biesheuvel ardb@kernel.org
[ Upstream commit 8acd2edbe0e8e36261d98d89ce91b810dd7f4b0d ]
The skcipher API dynamically instantiates the transformation object on request that implements the requested algorithm optimally on the given platform. This notion of optimality only matters for cases like bulk network or disk encryption, where performance can be a bottleneck, or in cases where the algorithm itself is not known at compile time.
In the mscc case, we are dealing with AES encryption of a single block, and so neither concern applies, and we are better off using the AES library interface, which is lightweight and safe for this kind of use.
Note that the scatterlist API does not permit references to buffers that are located on the stack, so the existing code is incorrect in any case, but avoiding the skcipher and scatterlist APIs entirely is the most straight-forward approach to fixing this.
Cc: Antoine Tenart antoine.tenart@bootlin.com Cc: Andrew Lunn andrew@lunn.ch Cc: Florian Fainelli f.fainelli@gmail.com Cc: Heiner Kallweit hkallweit1@gmail.com Cc: "David S. Miller" davem@davemloft.net Cc: Jakub Kicinski kuba@kernel.org Cc: stable@vger.kernel.org Fixes: 28c5107aa904e ("net: phy: mscc: macsec support") Reviewed-by: Eric Biggers ebiggers@google.com Signed-off-by: Ard Biesheuvel ardb@kernel.org Tested-by: Antoine Tenart antoine.tenart@bootlin.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/phy/Kconfig | 3 +-- drivers/net/phy/mscc/mscc_macsec.c | 40 +++++++----------------------- 2 files changed, 10 insertions(+), 33 deletions(-)
diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig index 3fa33d27eebaf..d140e3c93fe34 100644 --- a/drivers/net/phy/Kconfig +++ b/drivers/net/phy/Kconfig @@ -461,8 +461,7 @@ config MICROCHIP_T1_PHY config MICROSEMI_PHY tristate "Microsemi PHYs" depends on MACSEC || MACSEC=n - select CRYPTO_AES - select CRYPTO_ECB + select CRYPTO_LIB_AES if MACSEC ---help--- Currently supports VSC8514, VSC8530, VSC8531, VSC8540 and VSC8541 PHYs
diff --git a/drivers/net/phy/mscc/mscc_macsec.c b/drivers/net/phy/mscc/mscc_macsec.c index b4d3dc4068e27..d53ca884b5c9e 100644 --- a/drivers/net/phy/mscc/mscc_macsec.c +++ b/drivers/net/phy/mscc/mscc_macsec.c @@ -10,7 +10,7 @@ #include <linux/phy.h> #include <dt-bindings/net/mscc-phy-vsc8531.h>
-#include <crypto/skcipher.h> +#include <crypto/aes.h>
#include <net/macsec.h>
@@ -500,39 +500,17 @@ static u32 vsc8584_macsec_flow_context_id(struct macsec_flow *flow) static int vsc8584_macsec_derive_key(const u8 key[MACSEC_KEYID_LEN], u16 key_len, u8 hkey[16]) { - struct crypto_skcipher *tfm = crypto_alloc_skcipher("ecb(aes)", 0, 0); - struct skcipher_request *req = NULL; - struct scatterlist src, dst; - DECLARE_CRYPTO_WAIT(wait); - u32 input[4] = {0}; + const u8 input[AES_BLOCK_SIZE] = {0}; + struct crypto_aes_ctx ctx; int ret;
- if (IS_ERR(tfm)) - return PTR_ERR(tfm); - - req = skcipher_request_alloc(tfm, GFP_KERNEL); - if (!req) { - ret = -ENOMEM; - goto out; - } - - skcipher_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG | - CRYPTO_TFM_REQ_MAY_SLEEP, crypto_req_done, - &wait); - ret = crypto_skcipher_setkey(tfm, key, key_len); - if (ret < 0) - goto out; - - sg_init_one(&src, input, 16); - sg_init_one(&dst, hkey, 16); - skcipher_request_set_crypt(req, &src, &dst, 16, NULL); - - ret = crypto_wait_req(crypto_skcipher_encrypt(req), &wait); + ret = aes_expandkey(&ctx, key, key_len); + if (ret) + return ret;
-out: - skcipher_request_free(req); - crypto_free_skcipher(tfm); - return ret; + aes_encrypt(&ctx, hkey, input); + memzero_explicit(&ctx, sizeof(ctx)); + return 0; }
static int vsc8584_macsec_transformation(struct phy_device *phydev,
From: Florian Fainelli f.fainelli@gmail.com
[ Upstream commit 5a8d7f126c97d04d893f5e5be2b286437a0d01b0 ]
Commit 209c65b61d94 ("drivers/of/of_mdio.c:fix of_mdiobus_register()") introduced a break of the loop on the premise that a successful registration should exit the loop. The premise is correct but not to code, because rc && rc != -ENODEV is just a special error condition, that means we would exit the loop even with rc == -ENODEV which is absolutely not correct since this is the error code to indicate to the MDIO bus layer that scanning should continue.
Fix this by explicitly checking for rc = 0 as the only valid condition to break out of the loop.
Fixes: 209c65b61d94 ("drivers/of/of_mdio.c:fix of_mdiobus_register()") Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/of/of_mdio.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/of/of_mdio.c b/drivers/of/of_mdio.c index 9f982c0627a0d..95a3bb2e5eabf 100644 --- a/drivers/of/of_mdio.c +++ b/drivers/of/of_mdio.c @@ -303,10 +303,15 @@ int of_mdiobus_register(struct mii_bus *mdio, struct device_node *np) child, addr);
if (of_mdiobus_child_is_phy(child)) { + /* -ENODEV is the return code that PHYLIB has + * standardized on to indicate that bus + * scanning should continue. + */ rc = of_mdiobus_register_phy(mdio, child, addr); - if (rc && rc != -ENODEV) + if (!rc) + break; + if (rc != -ENODEV) goto unregister; - break; } } }
From: "Jason A. Donenfeld" Jason@zx2c4.com
[ Upstream commit 900575aa33a3eaaef802b31de187a85c4a4b4bd0 ]
Before, we took a reference to the creating netns if the new netns was different. This caused issues with circular references, with two wireguard interfaces swapping namespaces. The solution is to rather not take any extra references at all, but instead simply invalidate the creating netns pointer when that netns is deleted.
In order to prevent this from happening again, this commit improves the rough object leak tracking by allowing it to account for created and destroyed interfaces, aside from just peers and keys. That then makes it possible to check for the object leak when having two interfaces take a reference to each others' namespaces.
Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld Jason@zx2c4.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireguard/device.c | 58 ++++++++++------------ drivers/net/wireguard/device.h | 3 +- drivers/net/wireguard/netlink.c | 14 ++++-- drivers/net/wireguard/socket.c | 25 +++++++--- tools/testing/selftests/wireguard/netns.sh | 13 ++++- 5 files changed, 67 insertions(+), 46 deletions(-)
diff --git a/drivers/net/wireguard/device.c b/drivers/net/wireguard/device.c index 3ac3f8570ca1b..a8f151b1b5fab 100644 --- a/drivers/net/wireguard/device.c +++ b/drivers/net/wireguard/device.c @@ -45,17 +45,18 @@ static int wg_open(struct net_device *dev) if (dev_v6) dev_v6->cnf.addr_gen_mode = IN6_ADDR_GEN_MODE_NONE;
+ mutex_lock(&wg->device_update_lock); ret = wg_socket_init(wg, wg->incoming_port); if (ret < 0) - return ret; - mutex_lock(&wg->device_update_lock); + goto out; list_for_each_entry(peer, &wg->peer_list, peer_list) { wg_packet_send_staged_packets(peer); if (peer->persistent_keepalive_interval) wg_packet_send_keepalive(peer); } +out: mutex_unlock(&wg->device_update_lock); - return 0; + return ret; }
#ifdef CONFIG_PM_SLEEP @@ -225,6 +226,7 @@ static void wg_destruct(struct net_device *dev) list_del(&wg->device_list); rtnl_unlock(); mutex_lock(&wg->device_update_lock); + rcu_assign_pointer(wg->creating_net, NULL); wg->incoming_port = 0; wg_socket_reinit(wg, NULL, NULL); /* The final references are cleared in the below calls to destroy_workqueue. */ @@ -240,13 +242,11 @@ static void wg_destruct(struct net_device *dev) skb_queue_purge(&wg->incoming_handshakes); free_percpu(dev->tstats); free_percpu(wg->incoming_handshakes_worker); - if (wg->have_creating_net_ref) - put_net(wg->creating_net); kvfree(wg->index_hashtable); kvfree(wg->peer_hashtable); mutex_unlock(&wg->device_update_lock);
- pr_debug("%s: Interface deleted\n", dev->name); + pr_debug("%s: Interface destroyed\n", dev->name); free_netdev(dev); }
@@ -292,7 +292,7 @@ static int wg_newlink(struct net *src_net, struct net_device *dev, struct wg_device *wg = netdev_priv(dev); int ret = -ENOMEM;
- wg->creating_net = src_net; + rcu_assign_pointer(wg->creating_net, src_net); init_rwsem(&wg->static_identity.lock); mutex_init(&wg->socket_update_lock); mutex_init(&wg->device_update_lock); @@ -393,30 +393,26 @@ static struct rtnl_link_ops link_ops __read_mostly = { .newlink = wg_newlink, };
-static int wg_netdevice_notification(struct notifier_block *nb, - unsigned long action, void *data) +static void wg_netns_pre_exit(struct net *net) { - struct net_device *dev = ((struct netdev_notifier_info *)data)->dev; - struct wg_device *wg = netdev_priv(dev); - - ASSERT_RTNL(); - - if (action != NETDEV_REGISTER || dev->netdev_ops != &netdev_ops) - return 0; + struct wg_device *wg;
- if (dev_net(dev) == wg->creating_net && wg->have_creating_net_ref) { - put_net(wg->creating_net); - wg->have_creating_net_ref = false; - } else if (dev_net(dev) != wg->creating_net && - !wg->have_creating_net_ref) { - wg->have_creating_net_ref = true; - get_net(wg->creating_net); + rtnl_lock(); + list_for_each_entry(wg, &device_list, device_list) { + if (rcu_access_pointer(wg->creating_net) == net) { + pr_debug("%s: Creating namespace exiting\n", wg->dev->name); + netif_carrier_off(wg->dev); + mutex_lock(&wg->device_update_lock); + rcu_assign_pointer(wg->creating_net, NULL); + wg_socket_reinit(wg, NULL, NULL); + mutex_unlock(&wg->device_update_lock); + } } - return 0; + rtnl_unlock(); }
-static struct notifier_block netdevice_notifier = { - .notifier_call = wg_netdevice_notification +static struct pernet_operations pernet_ops = { + .pre_exit = wg_netns_pre_exit };
int __init wg_device_init(void) @@ -429,18 +425,18 @@ int __init wg_device_init(void) return ret; #endif
- ret = register_netdevice_notifier(&netdevice_notifier); + ret = register_pernet_device(&pernet_ops); if (ret) goto error_pm;
ret = rtnl_link_register(&link_ops); if (ret) - goto error_netdevice; + goto error_pernet;
return 0;
-error_netdevice: - unregister_netdevice_notifier(&netdevice_notifier); +error_pernet: + unregister_pernet_device(&pernet_ops); error_pm: #ifdef CONFIG_PM_SLEEP unregister_pm_notifier(&pm_notifier); @@ -451,7 +447,7 @@ error_pm: void wg_device_uninit(void) { rtnl_link_unregister(&link_ops); - unregister_netdevice_notifier(&netdevice_notifier); + unregister_pernet_device(&pernet_ops); #ifdef CONFIG_PM_SLEEP unregister_pm_notifier(&pm_notifier); #endif diff --git a/drivers/net/wireguard/device.h b/drivers/net/wireguard/device.h index b15a8be9d8169..4d0144e169478 100644 --- a/drivers/net/wireguard/device.h +++ b/drivers/net/wireguard/device.h @@ -40,7 +40,7 @@ struct wg_device { struct net_device *dev; struct crypt_queue encrypt_queue, decrypt_queue; struct sock __rcu *sock4, *sock6; - struct net *creating_net; + struct net __rcu *creating_net; struct noise_static_identity static_identity; struct workqueue_struct *handshake_receive_wq, *handshake_send_wq; struct workqueue_struct *packet_crypt_wq; @@ -56,7 +56,6 @@ struct wg_device { unsigned int num_peers, device_update_gen; u32 fwmark; u16 incoming_port; - bool have_creating_net_ref; };
int wg_device_init(void); diff --git a/drivers/net/wireguard/netlink.c b/drivers/net/wireguard/netlink.c index 802099c8828a6..20a4f3c0a0a19 100644 --- a/drivers/net/wireguard/netlink.c +++ b/drivers/net/wireguard/netlink.c @@ -511,11 +511,15 @@ static int wg_set_device(struct sk_buff *skb, struct genl_info *info) if (flags & ~__WGDEVICE_F_ALL) goto out;
- ret = -EPERM; - if ((info->attrs[WGDEVICE_A_LISTEN_PORT] || - info->attrs[WGDEVICE_A_FWMARK]) && - !ns_capable(wg->creating_net->user_ns, CAP_NET_ADMIN)) - goto out; + if (info->attrs[WGDEVICE_A_LISTEN_PORT] || info->attrs[WGDEVICE_A_FWMARK]) { + struct net *net; + rcu_read_lock(); + net = rcu_dereference(wg->creating_net); + ret = !net || !ns_capable(net->user_ns, CAP_NET_ADMIN) ? -EPERM : 0; + rcu_read_unlock(); + if (ret) + goto out; + }
++wg->device_update_gen;
diff --git a/drivers/net/wireguard/socket.c b/drivers/net/wireguard/socket.c index f9018027fc133..c33e2c81635fa 100644 --- a/drivers/net/wireguard/socket.c +++ b/drivers/net/wireguard/socket.c @@ -347,6 +347,7 @@ static void set_sock_opts(struct socket *sock)
int wg_socket_init(struct wg_device *wg, u16 port) { + struct net *net; int ret; struct udp_tunnel_sock_cfg cfg = { .sk_user_data = wg, @@ -371,37 +372,47 @@ int wg_socket_init(struct wg_device *wg, u16 port) }; #endif
+ rcu_read_lock(); + net = rcu_dereference(wg->creating_net); + net = net ? maybe_get_net(net) : NULL; + rcu_read_unlock(); + if (unlikely(!net)) + return -ENONET; + #if IS_ENABLED(CONFIG_IPV6) retry: #endif
- ret = udp_sock_create(wg->creating_net, &port4, &new4); + ret = udp_sock_create(net, &port4, &new4); if (ret < 0) { pr_err("%s: Could not create IPv4 socket\n", wg->dev->name); - return ret; + goto out; } set_sock_opts(new4); - setup_udp_tunnel_sock(wg->creating_net, new4, &cfg); + setup_udp_tunnel_sock(net, new4, &cfg);
#if IS_ENABLED(CONFIG_IPV6) if (ipv6_mod_enabled()) { port6.local_udp_port = inet_sk(new4->sk)->inet_sport; - ret = udp_sock_create(wg->creating_net, &port6, &new6); + ret = udp_sock_create(net, &port6, &new6); if (ret < 0) { udp_tunnel_sock_release(new4); if (ret == -EADDRINUSE && !port && retries++ < 100) goto retry; pr_err("%s: Could not create IPv6 socket\n", wg->dev->name); - return ret; + goto out; } set_sock_opts(new6); - setup_udp_tunnel_sock(wg->creating_net, new6, &cfg); + setup_udp_tunnel_sock(net, new6, &cfg); } #endif
wg_socket_reinit(wg, new4->sk, new6 ? new6->sk : NULL); - return 0; + ret = 0; +out: + put_net(net); + return ret; }
void wg_socket_reinit(struct wg_device *wg, struct sock *new4, diff --git a/tools/testing/selftests/wireguard/netns.sh b/tools/testing/selftests/wireguard/netns.sh index 17a1f53ceba01..d77f4829f1e07 100755 --- a/tools/testing/selftests/wireguard/netns.sh +++ b/tools/testing/selftests/wireguard/netns.sh @@ -587,9 +587,20 @@ ip0 link set wg0 up kill $ncat_pid ip0 link del wg0
+# Ensure there aren't circular reference loops +ip1 link add wg1 type wireguard +ip2 link add wg2 type wireguard +ip1 link set wg1 netns $netns2 +ip2 link set wg2 netns $netns1 +pp ip netns delete $netns1 +pp ip netns delete $netns2 +pp ip netns add $netns1 +pp ip netns add $netns2 + +sleep 2 # Wait for cleanup and grace periods declare -A objects while read -t 0.1 -r line 2>/dev/null || [[ $? -ne 142 ]]; do - [[ $line =~ .*(wg[0-9]+:\ [A-Z][a-z]+\ [0-9]+)\ .*(created|destroyed).* ]] || continue + [[ $line =~ .*(wg[0-9]+:\ [A-Z][a-z]+\ ?[0-9]*)\ .*(created|destroyed).* ]] || continue objects["${BASH_REMATCH[1]}"]+="${BASH_REMATCH[2]}" done < /dev/kmsg alldeleted=1
From: Martin martin.varghese@nokia.com
[ Upstream commit 4c98045c9b74feab837be58986c0517d3cc661f1 ]
Code to handle multiproto configuration is missing.
Fixes: 4b5f67232d95 ("net: Special handling for IP & MPLS") Signed-off-by: Martin martin.varghese@nokia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/bareudp.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/bareudp.c b/drivers/net/bareudp.c index 5d3c691a1c668..3dd46cd551145 100644 --- a/drivers/net/bareudp.c +++ b/drivers/net/bareudp.c @@ -572,6 +572,9 @@ static int bareudp2info(struct nlattr *data[], struct bareudp_conf *conf, if (data[IFLA_BAREUDP_SRCPORT_MIN]) conf->sport_min = nla_get_u16(data[IFLA_BAREUDP_SRCPORT_MIN]);
+ if (data[IFLA_BAREUDP_MULTIPROTO_MODE]) + conf->multi_proto_mode = true; + return 0; }
From: Shannon Nelson snelson@pensando.io
[ Upstream commit fa48494cce5f6360b0f8683cdf258fb45c666287 ]
Let the network stack know the real number of queues that we are using.
v2: added error checking
Fixes: 49d3b493673a ("ionic: disable the queues on link down") Signed-off-by: Shannon Nelson snelson@pensando.io Reviewed-by: Jakub Kicinski kuba@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/pensando/ionic/ionic_lif.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c b/drivers/net/ethernet/pensando/ionic/ionic_lif.c index 7aa037c3fe020..2729b0bb12739 100644 --- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c +++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c @@ -1653,6 +1653,14 @@ int ionic_open(struct net_device *netdev) if (err) goto err_out;
+ err = netif_set_real_num_tx_queues(netdev, lif->nxqs); + if (err) + goto err_txrx_deinit; + + err = netif_set_real_num_rx_queues(netdev, lif->nxqs); + if (err) + goto err_txrx_deinit; + /* don't start the queues until we have link */ if (netif_carrier_ok(netdev)) { err = ionic_start_queues(lif);
From: Florian Fainelli f.fainelli@gmail.com
[ Upstream commit 8dbe4c5d5e40fe140221024f7b16bec9f310bf70 ]
of_find_node_by_name() will do an of_node_put() on the "from" argument. With CONFIG_OF_DYNAMIC enabled which checks for device_node reference counts, we would be getting a warning like this:
[ 6.347230] refcount_t: increment on 0; use-after-free. [ 6.352498] WARNING: CPU: 3 PID: 77 at lib/refcount.c:156 refcount_inc_checked+0x38/0x44 [ 6.360601] Modules linked in: [ 6.363661] CPU: 3 PID: 77 Comm: kworker/3:1 Tainted: G W 5.4.46-gb78b3e9956e6 #13 [ 6.372546] Hardware name: BCM97278SV (DT) [ 6.376649] Workqueue: events deferred_probe_work_func [ 6.381796] pstate: 60000005 (nZCv daif -PAN -UAO) [ 6.386595] pc : refcount_inc_checked+0x38/0x44 [ 6.391133] lr : refcount_inc_checked+0x38/0x44 ... [ 6.478791] Call trace: [ 6.481243] refcount_inc_checked+0x38/0x44 [ 6.485433] kobject_get+0x3c/0x4c [ 6.488840] of_node_get+0x24/0x34 [ 6.492247] of_irq_find_parent+0x3c/0xe0 [ 6.496263] of_irq_parse_one+0xe4/0x1d0 [ 6.500191] irq_of_parse_and_map+0x44/0x84 [ 6.504381] bcm_sf2_sw_probe+0x22c/0x844 [ 6.508397] platform_drv_probe+0x58/0xa8 [ 6.512413] really_probe+0x238/0x3fc [ 6.516081] driver_probe_device+0x11c/0x12c [ 6.520358] __device_attach_driver+0xa8/0x100 [ 6.524808] bus_for_each_drv+0xb4/0xd0 [ 6.528650] __device_attach+0xd0/0x164 [ 6.532493] device_initial_probe+0x24/0x30 [ 6.536682] bus_probe_device+0x38/0x98 [ 6.540524] deferred_probe_work_func+0xa8/0xd4 [ 6.545061] process_one_work+0x178/0x288 [ 6.549078] process_scheduled_works+0x44/0x48 [ 6.553529] worker_thread+0x218/0x270 [ 6.557285] kthread+0xdc/0xe4 [ 6.560344] ret_from_fork+0x10/0x18 [ 6.563925] ---[ end trace 68f65caf69bb152a ]---
Fix this by adding a of_node_get() to increment the reference count prior to the call.
Fixes: afa3b592953b ("net: dsa: bcm_sf2: Ensure correct sub-node is parsed") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/bcm_sf2.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c index c7ac63f419184..946e41f020a56 100644 --- a/drivers/net/dsa/bcm_sf2.c +++ b/drivers/net/dsa/bcm_sf2.c @@ -1147,6 +1147,8 @@ static int bcm_sf2_sw_probe(struct platform_device *pdev) set_bit(0, priv->cfp.used); set_bit(0, priv->cfp.unique);
+ /* Balance of_node_put() done by of_find_node_by_name() */ + of_node_get(dn); ports = of_find_node_by_name(dn, "ports"); if (ports) { bcm_sf2_identify_ports(priv, ports);
From: Dejin Zheng zhengdejin5@gmail.com
[ Upstream commit 6d61f483f148b856d47a6c96d5d84054d5a9f849 ]
Commit 7ae7ad2f11ef47 ("net: phy: smsc: use phy_read_poll_timeout() to simplify the code") will print a lot of logs as follows when Ethernet cable is not connected:
[ 4.473105] SMSC LAN8710/LAN8720 2188000.ethernet-1:00: lan87xx_read_status failed: -110
When wait 640 ms for check ENERGYON bit, the timeout should not be regarded as an actual error and an error message also should not be printed. due to a hardware bug in LAN87XX device, it leads to unstable detection of plugging in Ethernet cable when LAN87xx is in Energy Detect Power-Down mode. the workaround for it involves, when the link is down, and at each read_status() call:
- disable EDPD mode, forcing the PHY out of low-power mode - waiting 640ms to see if we have any energy detected from the media - re-enable entry to EDPD mode
This is presumably enough to allow the PHY to notice that a cable is connected, and resume normal operations to negotiate with the partner. The problem is that when no media is detected, the 640ms wait times out and this commit was modified to prints an error message. it is an inappropriate conversion by used phy_read_poll_timeout() to introduce this bug. so fix this issue by use read_poll_timeout() to replace phy_read_poll_timeout().
Fixes: 7ae7ad2f11ef47 ("net: phy: smsc: use phy_read_poll_timeout() to simplify the code") Reported-by: Kevin Groeneveld kgroeneveld@gmail.com Signed-off-by: Dejin Zheng zhengdejin5@gmail.com Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/phy/smsc.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/net/phy/smsc.c b/drivers/net/phy/smsc.c index 93da7d3d0954c..74568ae161253 100644 --- a/drivers/net/phy/smsc.c +++ b/drivers/net/phy/smsc.c @@ -122,10 +122,13 @@ static int lan87xx_read_status(struct phy_device *phydev) if (rc < 0) return rc;
- /* Wait max 640 ms to detect energy */ - phy_read_poll_timeout(phydev, MII_LAN83C185_CTRL_STATUS, rc, - rc & MII_LAN83C185_ENERGYON, 10000, - 640000, true); + /* Wait max 640 ms to detect energy and the timeout is not + * an actual error. + */ + read_poll_timeout(phy_read, rc, + rc & MII_LAN83C185_ENERGYON || rc < 0, + 10000, 640000, true, phydev, + MII_LAN83C185_CTRL_STATUS); if (rc < 0) return rc;
From: Cong Wang xiyou.wangcong@gmail.com
[ Upstream commit b65ce380b754e77fbfdcfc83fd6e29c8ceedf431 ]
genl_family_rcv_msg_attrs_parse() and genl_family_rcv_msg_attrs_free() take a boolean parameter to determine whether allocate/free the family attrs. This is unnecessary as we can just check family->parallel_ops. More importantly, callers would not need to worry about pairing these parameters correctly after this patch.
And this fixes a memory leak, as after commit c36f05559104 ("genetlink: fix memory leaks in genl_family_rcv_msg_dumpit()") we call genl_family_rcv_msg_attrs_parse() for both parallel and non-parallel cases.
Fixes: c36f05559104 ("genetlink: fix memory leaks in genl_family_rcv_msg_dumpit()") Reported-by: Ido Schimmel idosch@idosch.org Signed-off-by: Cong Wang xiyou.wangcong@gmail.com Reviewed-by: Ido Schimmel idosch@mellanox.com Tested-by: Ido Schimmel idosch@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/netlink/genetlink.c | 28 ++++++++++++---------------- 1 file changed, 12 insertions(+), 16 deletions(-)
diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c index bcbba0bef1c2a..9c1c27f3a0894 100644 --- a/net/netlink/genetlink.c +++ b/net/netlink/genetlink.c @@ -474,8 +474,7 @@ genl_family_rcv_msg_attrs_parse(const struct genl_family *family, struct netlink_ext_ack *extack, const struct genl_ops *ops, int hdrlen, - enum genl_validate_flags no_strict_flag, - bool parallel) + enum genl_validate_flags no_strict_flag) { enum netlink_validation validate = ops->validate & no_strict_flag ? NL_VALIDATE_LIBERAL : @@ -486,7 +485,7 @@ genl_family_rcv_msg_attrs_parse(const struct genl_family *family, if (!family->maxattr) return NULL;
- if (parallel) { + if (family->parallel_ops) { attrbuf = kmalloc_array(family->maxattr + 1, sizeof(struct nlattr *), GFP_KERNEL); if (!attrbuf) @@ -498,7 +497,7 @@ genl_family_rcv_msg_attrs_parse(const struct genl_family *family, err = __nlmsg_parse(nlh, hdrlen, attrbuf, family->maxattr, family->policy, validate, extack); if (err) { - if (parallel) + if (family->parallel_ops) kfree(attrbuf); return ERR_PTR(err); } @@ -506,10 +505,9 @@ genl_family_rcv_msg_attrs_parse(const struct genl_family *family, }
static void genl_family_rcv_msg_attrs_free(const struct genl_family *family, - struct nlattr **attrbuf, - bool parallel) + struct nlattr **attrbuf) { - if (parallel) + if (family->parallel_ops) kfree(attrbuf); }
@@ -537,15 +535,14 @@ static int genl_start(struct netlink_callback *cb)
attrs = genl_family_rcv_msg_attrs_parse(ctx->family, ctx->nlh, ctx->extack, ops, ctx->hdrlen, - GENL_DONT_VALIDATE_DUMP_STRICT, - true); + GENL_DONT_VALIDATE_DUMP_STRICT); if (IS_ERR(attrs)) return PTR_ERR(attrs);
no_attrs: info = genl_dumpit_info_alloc(); if (!info) { - kfree(attrs); + genl_family_rcv_msg_attrs_free(ctx->family, attrs); return -ENOMEM; } info->family = ctx->family; @@ -562,7 +559,7 @@ no_attrs: }
if (rc) { - kfree(attrs); + genl_family_rcv_msg_attrs_free(info->family, info->attrs); genl_dumpit_info_free(info); cb->data = NULL; } @@ -591,7 +588,7 @@ static int genl_lock_done(struct netlink_callback *cb) rc = ops->done(cb); genl_unlock(); } - genl_family_rcv_msg_attrs_free(info->family, info->attrs, false); + genl_family_rcv_msg_attrs_free(info->family, info->attrs); genl_dumpit_info_free(info); return rc; } @@ -604,7 +601,7 @@ static int genl_parallel_done(struct netlink_callback *cb)
if (ops->done) rc = ops->done(cb); - genl_family_rcv_msg_attrs_free(info->family, info->attrs, true); + genl_family_rcv_msg_attrs_free(info->family, info->attrs); genl_dumpit_info_free(info); return rc; } @@ -671,8 +668,7 @@ static int genl_family_rcv_msg_doit(const struct genl_family *family,
attrbuf = genl_family_rcv_msg_attrs_parse(family, nlh, extack, ops, hdrlen, - GENL_DONT_VALIDATE_STRICT, - family->parallel_ops); + GENL_DONT_VALIDATE_STRICT); if (IS_ERR(attrbuf)) return PTR_ERR(attrbuf);
@@ -698,7 +694,7 @@ static int genl_family_rcv_msg_doit(const struct genl_family *family, family->post_doit(ops, skb, &info);
out: - genl_family_rcv_msg_attrs_free(family, attrbuf, family->parallel_ops); + genl_family_rcv_msg_attrs_free(family, attrbuf);
return err; }
From: David Milburn dmilburn@redhat.com
[ Upstream commit 1cdf9f7670a7d74e27177d5c390c2f8b3b9ba338 ]
Based-on-a-patch-by: Christoph Hellwig hch@infradead.org Tested-by: Yi Zhang yi.zhang@redhat.com Signed-off-by: David Milburn dmilburn@redhat.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/target/core.c | 15 ++++++--------- 1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c index aa5ca222c6f59..ac7ae031ce78d 100644 --- a/drivers/nvme/target/core.c +++ b/drivers/nvme/target/core.c @@ -134,15 +134,10 @@ static void nvmet_async_events_process(struct nvmet_ctrl *ctrl, u16 status) struct nvmet_async_event *aen; struct nvmet_req *req;
- while (1) { - mutex_lock(&ctrl->lock); - aen = list_first_entry_or_null(&ctrl->async_events, - struct nvmet_async_event, entry); - if (!aen || !ctrl->nr_async_event_cmds) { - mutex_unlock(&ctrl->lock); - break; - } - + mutex_lock(&ctrl->lock); + while (ctrl->nr_async_event_cmds && !list_empty(&ctrl->async_events)) { + aen = list_first_entry(&ctrl->async_events, + struct nvmet_async_event, entry); req = ctrl->async_event_cmds[--ctrl->nr_async_event_cmds]; if (status == 0) nvmet_set_result(req, nvmet_async_event_result(aen)); @@ -152,7 +147,9 @@ static void nvmet_async_events_process(struct nvmet_ctrl *ctrl, u16 status)
mutex_unlock(&ctrl->lock); nvmet_req_complete(req, status); + mutex_lock(&ctrl->lock); } + mutex_unlock(&ctrl->lock); }
static void nvmet_async_events_free(struct nvmet_ctrl *ctrl)
From: Chaitanya Kulkarni chaitanya.kulkarni@wdc.com
[ Upstream commit 819f7b88b48fd2bce932cfe3a38bf8fcefbcabe7 ]
In function nvmet_async_event_process() we only process AENs iff there is an open slot on the ctrl->async_event_cmds[] && aen event list posted by the target is not empty. This keeps host posted AEN outstanding if target generated AEN list is empty. We do cleanup the target generated entries from the aen list in nvmet_ctrl_free()-> nvmet_async_events_free() but we don't process AEN posted by the host. This leads to following problem :-
When processing admin sq at the time of nvmet_sq_destroy() holds an extra percpu reference(atomic value = 1), so in the following code path after switching to atomic rcu, release function (nvmet_sq_free()) is not getting called which blocks the sq->free_done in nvmet_sq_destroy() :-
nvmet_sq_destroy() percpu_ref_kill_and_confirm() - __percpu_ref_switch_mode() -- __percpu_ref_switch_to_atomic() --- call_rcu() -> percpu_ref_switch_to_atomic_rcu() ---- /* calls switch callback */ - percpu_ref_put() -- percpu_ref_put_many(ref, 1) --- else if (unlikely(atomic_long_sub_and_test(nr, &ref->count))) ---- ref->release(ref); <---- Not called.
This results in indefinite hang:-
void nvmet_sq_destroy(struct nvmet_sq *sq) ... if (ctrl && ctrl->sqs && ctrl->sqs[0] == sq) { nvmet_async_events_process(ctrl, status); percpu_ref_put(&sq->ref); } percpu_ref_kill_and_confirm(&sq->ref, nvmet_confirm_sq); wait_for_completion(&sq->confirm_done); wait_for_completion(&sq->free_done); <-- Hang here
Which breaks the further disconnect sequence. This problem seems to be introduced after commit 64f5e9cdd711b ("nvmet: fix memory leak when removing namespaces and controllers concurrently").
This patch processes ctrl->async_event_cmds[] in the admin sq destroy() context irrespetive of aen_list. Also we get rid of the controller's aen_list processing in the nvmet_sq_destroy() context and just ignore ctrl->aen_list.
This results in nvmet_async_events_process() being called from workqueue context so we adjust the code accordingly.
Fixes: 64f5e9cdd711 ("nvmet: fix memory leak when removing namespaces and controllers concurrently ") Signed-off-by: Chaitanya Kulkarni chaitanya.kulkarni@wdc.com Reviewed-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/target/core.c | 27 ++++++++++++++++++++------- 1 file changed, 20 insertions(+), 7 deletions(-)
diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c index ac7ae031ce78d..96deaf3484662 100644 --- a/drivers/nvme/target/core.c +++ b/drivers/nvme/target/core.c @@ -129,7 +129,22 @@ static u32 nvmet_async_event_result(struct nvmet_async_event *aen) return aen->event_type | (aen->event_info << 8) | (aen->log_page << 16); }
-static void nvmet_async_events_process(struct nvmet_ctrl *ctrl, u16 status) +static void nvmet_async_events_failall(struct nvmet_ctrl *ctrl) +{ + u16 status = NVME_SC_INTERNAL | NVME_SC_DNR; + struct nvmet_req *req; + + mutex_lock(&ctrl->lock); + while (ctrl->nr_async_event_cmds) { + req = ctrl->async_event_cmds[--ctrl->nr_async_event_cmds]; + mutex_unlock(&ctrl->lock); + nvmet_req_complete(req, status); + mutex_lock(&ctrl->lock); + } + mutex_unlock(&ctrl->lock); +} + +static void nvmet_async_events_process(struct nvmet_ctrl *ctrl) { struct nvmet_async_event *aen; struct nvmet_req *req; @@ -139,14 +154,13 @@ static void nvmet_async_events_process(struct nvmet_ctrl *ctrl, u16 status) aen = list_first_entry(&ctrl->async_events, struct nvmet_async_event, entry); req = ctrl->async_event_cmds[--ctrl->nr_async_event_cmds]; - if (status == 0) - nvmet_set_result(req, nvmet_async_event_result(aen)); + nvmet_set_result(req, nvmet_async_event_result(aen));
list_del(&aen->entry); kfree(aen);
mutex_unlock(&ctrl->lock); - nvmet_req_complete(req, status); + nvmet_req_complete(req, 0); mutex_lock(&ctrl->lock); } mutex_unlock(&ctrl->lock); @@ -169,7 +183,7 @@ static void nvmet_async_event_work(struct work_struct *work) struct nvmet_ctrl *ctrl = container_of(work, struct nvmet_ctrl, async_event_work);
- nvmet_async_events_process(ctrl, 0); + nvmet_async_events_process(ctrl); }
void nvmet_add_async_event(struct nvmet_ctrl *ctrl, u8 event_type, @@ -752,7 +766,6 @@ static void nvmet_confirm_sq(struct percpu_ref *ref)
void nvmet_sq_destroy(struct nvmet_sq *sq) { - u16 status = NVME_SC_INTERNAL | NVME_SC_DNR; struct nvmet_ctrl *ctrl = sq->ctrl;
/* @@ -760,7 +773,7 @@ void nvmet_sq_destroy(struct nvmet_sq *sq) * queue doesn't have outstanding requests on it. */ if (ctrl && ctrl->sqs && ctrl->sqs[0] == sq) - nvmet_async_events_process(ctrl, status); + nvmet_async_events_failall(ctrl); percpu_ref_kill_and_confirm(&sq->ref, nvmet_confirm_sq); wait_for_completion(&sq->confirm_done); wait_for_completion(&sq->free_done);
From: Al Viro viro@zeniv.linux.org.uk
[ Upstream commit 9d964e1b82d8182184153b70174f445ea616f053 ]
lost npc in PTRACE_SETREGSET, breaking PTRACE_SETREGS as well
Fixes: cf51e129b968 "sparc32: fix register window handling in genregs32_[gs]et()" Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- arch/sparc/kernel/ptrace_32.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/arch/sparc/kernel/ptrace_32.c b/arch/sparc/kernel/ptrace_32.c index 60f7205ebe40d..646dd58169ecb 100644 --- a/arch/sparc/kernel/ptrace_32.c +++ b/arch/sparc/kernel/ptrace_32.c @@ -168,12 +168,17 @@ static int genregs32_set(struct task_struct *target, if (ret || !count) return ret; ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, - ®s->y, + ®s->npc, 34 * sizeof(u32), 35 * sizeof(u32)); if (ret || !count) return ret; + ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, + ®s->y, + 35 * sizeof(u32), 36 * sizeof(u32)); + if (ret || !count) + return ret; return user_regset_copyin_ignore(&pos, &count, &kbuf, &ubuf, - 35 * sizeof(u32), 38 * sizeof(u32)); + 36 * sizeof(u32), 38 * sizeof(u32)); }
static int fpregs32_get(struct task_struct *target,
From: Takashi Iwai tiwai@suse.de
[ Upstream commit ff58bbc7b9704a5869204176f804eff57307fef0 ]
With the recent full-duplex support of implicit feedback streams, an endpoint can be still running after closing the capture stream as long as the playback stream with the sync-endpoint is running. In such a state, the URBs are still be handled and they may call retire_data_urb callback, which tries to transfer the data from the PCM buffer. Since the PCM stream gets closed, this may lead to use-after-free.
This patch adds the proper clearance of the callback at stopping the capture stream for addressing the possible UAF above.
Fixes: 10ce77e4817f ("ALSA: usb-audio: Add duplex sound support for USB devices using implicit feedback") Link: https://lore.kernel.org/r/20200616120921.12249-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/usb/pcm.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c index d61c2f1095b5c..3411e27eb64bb 100644 --- a/sound/usb/pcm.c +++ b/sound/usb/pcm.c @@ -1782,6 +1782,7 @@ static int snd_usb_substream_capture_trigger(struct snd_pcm_substream *substream return 0; case SNDRV_PCM_TRIGGER_STOP: stop_endpoints(subs); + subs->data_endpoint->retire_data_urb = NULL; subs->running = 0; return 0; case SNDRV_PCM_TRIGGER_PAUSE_PUSH:
From: Thierry Reding treding@nvidia.com
[ Upstream commit 78ad73421831247e46c31899a7bead02740e4bef ]
This reverts commit 9f42de8d4ec2304f10bbc51dc0484f3503d61196.
It's not safe to use pm_runtime_force_{suspend,resume}(), especially during the noirq phase of suspend. See also the guidance provided in commit 1e2ef05bb8cf ("PM: Limit race conditions between runtime PM and system sleep (v2)").
Signed-off-by: Thierry Reding treding@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-tegra.c | 9 --------- 1 file changed, 9 deletions(-)
diff --git a/drivers/i2c/busses/i2c-tegra.c b/drivers/i2c/busses/i2c-tegra.c index 4c4d17ddc96b9..7c88611c732c4 100644 --- a/drivers/i2c/busses/i2c-tegra.c +++ b/drivers/i2c/busses/i2c-tegra.c @@ -1769,14 +1769,9 @@ static int tegra_i2c_remove(struct platform_device *pdev) static int __maybe_unused tegra_i2c_suspend(struct device *dev) { struct tegra_i2c_dev *i2c_dev = dev_get_drvdata(dev); - int err;
i2c_mark_adapter_suspended(&i2c_dev->adapter);
- err = pm_runtime_force_suspend(dev); - if (err < 0) - return err; - return 0; }
@@ -1797,10 +1792,6 @@ static int __maybe_unused tegra_i2c_resume(struct device *dev) if (err) return err;
- err = pm_runtime_force_resume(dev); - if (err < 0) - return err; - i2c_mark_adapter_resumed(&i2c_dev->adapter);
return 0;
From: Filipe Manana fdmanana@suse.com
[ Upstream commit 9fecd13202f520f3f25d5b1c313adb740fe19773 ]
When removing a block group, if we fail to delete the block group's item from the extent tree, we jump to the 'out' label and end up decrementing the block group's reference count once only (by 1), resulting in a counter leak because the block group at that point was already removed from the block group cache rbtree - so we have to decrement the reference count twice, once for the rbtree and once for our lookup at the start of the function.
There is a second bug where if removing the free space tree entries (the call to remove_block_group_free_space()) fails we end up jumping to the 'out_put_group' label but end up decrementing the reference count only once, when we should have done it twice, since we have already removed the block group from the block group cache rbtree. This happens because the reference count decrement for the rbtree reference happens after attempting to remove the free space tree entries, which is far away from the place where we remove the block group from the rbtree.
To make things less error prone, decrement the reference count for the rbtree immediately after removing the block group from it. This also eleminates the need for two different exit labels on error, renaming 'out_put_label' to just 'out' and removing the old 'out'.
Fixes: f6033c5e333238 ("btrfs: fix block group leak when removing fails") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Nikolay Borisov nborisov@suse.com Reviewed-by: Anand Jain anand.jain@oracle.com Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/block-group.c | 19 +++++++++---------- 1 file changed, 9 insertions(+), 10 deletions(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c index 233c5663f2332..0c17f18b47940 100644 --- a/fs/btrfs/block-group.c +++ b/fs/btrfs/block-group.c @@ -916,7 +916,7 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans, path = btrfs_alloc_path(); if (!path) { ret = -ENOMEM; - goto out_put_group; + goto out; }
/* @@ -954,7 +954,7 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans, ret = btrfs_orphan_add(trans, BTRFS_I(inode)); if (ret) { btrfs_add_delayed_iput(inode); - goto out_put_group; + goto out; } clear_nlink(inode); /* One for the block groups ref */ @@ -977,13 +977,13 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
ret = btrfs_search_slot(trans, tree_root, &key, path, -1, 1); if (ret < 0) - goto out_put_group; + goto out; if (ret > 0) btrfs_release_path(path); if (ret == 0) { ret = btrfs_del_item(trans, tree_root, path); if (ret) - goto out_put_group; + goto out; btrfs_release_path(path); }
@@ -992,6 +992,9 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans, &fs_info->block_group_cache_tree); RB_CLEAR_NODE(&block_group->cache_node);
+ /* Once for the block groups rbtree */ + btrfs_put_block_group(block_group); + if (fs_info->first_logical_byte == block_group->start) fs_info->first_logical_byte = (u64)-1; spin_unlock(&fs_info->block_group_cache_lock); @@ -1102,10 +1105,7 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
ret = remove_block_group_free_space(trans, block_group); if (ret) - goto out_put_group; - - /* Once for the block groups rbtree */ - btrfs_put_block_group(block_group); + goto out;
ret = btrfs_search_slot(trans, root, &key, path, -1, 1); if (ret > 0) @@ -1128,10 +1128,9 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans, free_extent_map(em); }
-out_put_group: +out: /* Once for the lookup reference */ btrfs_put_block_group(block_group); -out: if (remove_rsv) btrfs_delayed_refs_rsv_release(fs_info, 1); btrfs_free_path(path);
From: Todd Kjos tkjos@google.com
commit d35d3660e065b69fdb8bf512f3d899f350afce52 upstream.
The binder driver makes the assumption proc->context pointer is invariant after initialization (as documented in the kerneldoc header for struct proc). However, in commit f0fe2c0f050d ("binder: prevent UAF for binderfs devices II") proc->context is set to NULL during binder_deferred_release().
Another proc was in the middle of setting up a transaction to the dying process and crashed on a NULL pointer deref on "context" which is a local set to &proc->context:
new_ref->data.desc = (node == context->binder_context_mgr_node) ? 0 : 1;
Here's the stack:
[ 5237.855435] Call trace: [ 5237.855441] binder_get_ref_for_node_olocked+0x100/0x2ec [ 5237.855446] binder_inc_ref_for_node+0x140/0x280 [ 5237.855451] binder_translate_binder+0x1d0/0x388 [ 5237.855456] binder_transaction+0x2228/0x3730 [ 5237.855461] binder_thread_write+0x640/0x25bc [ 5237.855466] binder_ioctl_write_read+0xb0/0x464 [ 5237.855471] binder_ioctl+0x30c/0x96c [ 5237.855477] do_vfs_ioctl+0x3e0/0x700 [ 5237.855482] __arm64_sys_ioctl+0x78/0xa4 [ 5237.855488] el0_svc_common+0xb4/0x194 [ 5237.855493] el0_svc_handler+0x74/0x98 [ 5237.855497] el0_svc+0x8/0xc
The fix is to move the kfree of the binder_device to binder_free_proc() so the binder_device is freed when we know there are no references remaining on the binder_proc.
Fixes: f0fe2c0f050d ("binder: prevent UAF for binderfs devices II") Acked-by: Christian Brauner christian.brauner@ubuntu.com Signed-off-by: Todd Kjos tkjos@google.com Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/20200622200715.114382-1-tkjos@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/android/binder.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/android/binder.c b/drivers/android/binder.c index e47c8a4c83db5..f50c5f182bb52 100644 --- a/drivers/android/binder.c +++ b/drivers/android/binder.c @@ -4686,8 +4686,15 @@ static struct binder_thread *binder_get_thread(struct binder_proc *proc)
static void binder_free_proc(struct binder_proc *proc) { + struct binder_device *device; + BUG_ON(!list_empty(&proc->todo)); BUG_ON(!list_empty(&proc->delivered_death)); + device = container_of(proc->context, struct binder_device, context); + if (refcount_dec_and_test(&device->ref)) { + kfree(proc->context->name); + kfree(device); + } binder_alloc_deferred_release(&proc->alloc); put_task_struct(proc->tsk); binder_stats_deleted(BINDER_STAT_PROC); @@ -5406,7 +5413,6 @@ static int binder_node_release(struct binder_node *node, int refs) static void binder_deferred_release(struct binder_proc *proc) { struct binder_context *context = proc->context; - struct binder_device *device; struct rb_node *n; int threads, nodes, incoming_refs, outgoing_refs, active_transactions;
@@ -5423,12 +5429,6 @@ static void binder_deferred_release(struct binder_proc *proc) context->binder_context_mgr_node = NULL; } mutex_unlock(&context->context_mgr_node_lock); - device = container_of(proc->context, struct binder_device, context); - if (refcount_dec_and_test(&device->ref)) { - kfree(context->name); - kfree(device); - } - proc->context = NULL; binder_inner_proc_lock(proc); /* * Make sure proc stays alive after we
From: Tomas Winkler tomas.winkler@intel.com
commit f76d77f50b343bc7f7d01e4c2771d43fb074f617 upstream.
For SPS firmware versions 5.0 and newer the way detection has changed. The detection is done now via PCI_CFG_HFS_3 register. To prevent conflict the previous method will get sps_4 suffix Disable both CNP_H and CNP_H_3 interfaces. CNP_H_3 requires a separate configuration as it doesn't support DMA.
Cc: stable@vger.kernel.org Signed-off-by: Tomas Winkler tomas.winkler@intel.com Link: https://lore.kernel.org/r/20200619165121.2145330-1-tomas.winkler@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/misc/mei/hw-me-regs.h | 2 ++ drivers/misc/mei/hw-me.c | 60 +++++++++++++++++++++++++++++++---- drivers/misc/mei/hw-me.h | 13 +++++--- drivers/misc/mei/pci-me.c | 16 +++++----- 4 files changed, 73 insertions(+), 18 deletions(-)
diff --git a/drivers/misc/mei/hw-me-regs.h b/drivers/misc/mei/hw-me-regs.h index 9392934e3a060..01b1bf74f2626 100644 --- a/drivers/misc/mei/hw-me-regs.h +++ b/drivers/misc/mei/hw-me-regs.h @@ -107,6 +107,8 @@ # define PCI_CFG_HFS_1_D0I3_MSK 0x80000000 #define PCI_CFG_HFS_2 0x48 #define PCI_CFG_HFS_3 0x60 +# define PCI_CFG_HFS_3_FW_SKU_MSK 0x00000070 +# define PCI_CFG_HFS_3_FW_SKU_SPS 0x00000060 #define PCI_CFG_HFS_4 0x64 #define PCI_CFG_HFS_5 0x68 #define PCI_CFG_HFS_6 0x6C diff --git a/drivers/misc/mei/hw-me.c b/drivers/misc/mei/hw-me.c index f620442addf52..f8155c1e811d7 100644 --- a/drivers/misc/mei/hw-me.c +++ b/drivers/misc/mei/hw-me.c @@ -1366,7 +1366,7 @@ static bool mei_me_fw_type_nm(struct pci_dev *pdev) #define MEI_CFG_FW_NM \ .quirk_probe = mei_me_fw_type_nm
-static bool mei_me_fw_type_sps(struct pci_dev *pdev) +static bool mei_me_fw_type_sps_4(struct pci_dev *pdev) { u32 reg; unsigned int devfn; @@ -1382,7 +1382,36 @@ static bool mei_me_fw_type_sps(struct pci_dev *pdev) return (reg & 0xf0000) == 0xf0000; }
-#define MEI_CFG_FW_SPS \ +#define MEI_CFG_FW_SPS_4 \ + .quirk_probe = mei_me_fw_type_sps_4 + +/** + * mei_me_fw_sku_sps() - check for sps sku + * + * Read ME FW Status register to check for SPS Firmware. + * The SPS FW is only signaled in pci function 0 + * + * @pdev: pci device + * + * Return: true in case of SPS firmware + */ +static bool mei_me_fw_type_sps(struct pci_dev *pdev) +{ + u32 reg; + u32 fw_type; + unsigned int devfn; + + devfn = PCI_DEVFN(PCI_SLOT(pdev->devfn), 0); + pci_bus_read_config_dword(pdev->bus, devfn, PCI_CFG_HFS_3, ®); + trace_mei_pci_cfg_read(&pdev->dev, "PCI_CFG_HFS_3", PCI_CFG_HFS_3, reg); + fw_type = (reg & PCI_CFG_HFS_3_FW_SKU_MSK); + + dev_dbg(&pdev->dev, "fw type is %d\n", fw_type); + + return fw_type == PCI_CFG_HFS_3_FW_SKU_SPS; +} + +#define MEI_CFG_FW_SPS \ .quirk_probe = mei_me_fw_type_sps
#define MEI_CFG_FW_VER_SUPP \ @@ -1452,10 +1481,17 @@ static const struct mei_cfg mei_me_pch8_cfg = { };
/* PCH8 Lynx Point with quirk for SPS Firmware exclusion */ -static const struct mei_cfg mei_me_pch8_sps_cfg = { +static const struct mei_cfg mei_me_pch8_sps_4_cfg = { MEI_CFG_PCH8_HFS, MEI_CFG_FW_VER_SUPP, - MEI_CFG_FW_SPS, + MEI_CFG_FW_SPS_4, +}; + +/* LBG with quirk for SPS (4.0) Firmware exclusion */ +static const struct mei_cfg mei_me_pch12_sps_4_cfg = { + MEI_CFG_PCH8_HFS, + MEI_CFG_FW_VER_SUPP, + MEI_CFG_FW_SPS_4, };
/* Cannon Lake and newer devices */ @@ -1465,8 +1501,18 @@ static const struct mei_cfg mei_me_pch12_cfg = { MEI_CFG_DMA_128, };
-/* LBG with quirk for SPS Firmware exclusion */ +/* Cannon Lake with quirk for SPS 5.0 and newer Firmware exclusion */ static const struct mei_cfg mei_me_pch12_sps_cfg = { + MEI_CFG_PCH8_HFS, + MEI_CFG_FW_VER_SUPP, + MEI_CFG_DMA_128, + MEI_CFG_FW_SPS, +}; + +/* Cannon Lake with quirk for SPS 5.0 and newer Firmware exclusion + * w/o DMA support + */ +static const struct mei_cfg mei_me_pch12_nodma_sps_cfg = { MEI_CFG_PCH8_HFS, MEI_CFG_FW_VER_SUPP, MEI_CFG_FW_SPS, @@ -1492,9 +1538,11 @@ static const struct mei_cfg *const mei_cfg_list[] = { [MEI_ME_PCH7_CFG] = &mei_me_pch7_cfg, [MEI_ME_PCH_CPT_PBG_CFG] = &mei_me_pch_cpt_pbg_cfg, [MEI_ME_PCH8_CFG] = &mei_me_pch8_cfg, - [MEI_ME_PCH8_SPS_CFG] = &mei_me_pch8_sps_cfg, + [MEI_ME_PCH8_SPS_4_CFG] = &mei_me_pch8_sps_4_cfg, [MEI_ME_PCH12_CFG] = &mei_me_pch12_cfg, + [MEI_ME_PCH12_SPS_4_CFG] = &mei_me_pch12_sps_4_cfg, [MEI_ME_PCH12_SPS_CFG] = &mei_me_pch12_sps_cfg, + [MEI_ME_PCH12_SPS_NODMA_CFG] = &mei_me_pch12_nodma_sps_cfg, [MEI_ME_PCH15_CFG] = &mei_me_pch15_cfg, };
diff --git a/drivers/misc/mei/hw-me.h b/drivers/misc/mei/hw-me.h index b6b94e2114645..52e0c6d578f27 100644 --- a/drivers/misc/mei/hw-me.h +++ b/drivers/misc/mei/hw-me.h @@ -1,6 +1,6 @@ /* SPDX-License-Identifier: GPL-2.0 */ /* - * Copyright (c) 2012-2019, Intel Corporation. All rights reserved. + * Copyright (c) 2012-2020, Intel Corporation. All rights reserved. * Intel Management Engine Interface (Intel MEI) Linux driver */
@@ -76,11 +76,14 @@ struct mei_me_hw { * with quirk for Node Manager exclusion. * @MEI_ME_PCH8_CFG: Platform Controller Hub Gen8 and newer * client platforms. - * @MEI_ME_PCH8_SPS_CFG: Platform Controller Hub Gen8 and newer + * @MEI_ME_PCH8_SPS_4_CFG: Platform Controller Hub Gen8 and newer * servers platforms with quirk for * SPS firmware exclusion. * @MEI_ME_PCH12_CFG: Platform Controller Hub Gen12 and newer - * @MEI_ME_PCH12_SPS_CFG: Platform Controller Hub Gen12 and newer + * @MEI_ME_PCH12_SPS_4_CFG:Platform Controller Hub Gen12 up to 4.0 + * servers platforms with quirk for + * SPS firmware exclusion. + * @MEI_ME_PCH12_SPS_CFG: Platform Controller Hub Gen12 5.0 and newer * servers platforms with quirk for * SPS firmware exclusion. * @MEI_ME_PCH15_CFG: Platform Controller Hub Gen15 and newer @@ -94,9 +97,11 @@ enum mei_cfg_idx { MEI_ME_PCH7_CFG, MEI_ME_PCH_CPT_PBG_CFG, MEI_ME_PCH8_CFG, - MEI_ME_PCH8_SPS_CFG, + MEI_ME_PCH8_SPS_4_CFG, MEI_ME_PCH12_CFG, + MEI_ME_PCH12_SPS_4_CFG, MEI_ME_PCH12_SPS_CFG, + MEI_ME_PCH12_SPS_NODMA_CFG, MEI_ME_PCH15_CFG, MEI_ME_NUM_CFG, }; diff --git a/drivers/misc/mei/pci-me.c b/drivers/misc/mei/pci-me.c index a1ed375fed374..f74c6113812d6 100644 --- a/drivers/misc/mei/pci-me.c +++ b/drivers/misc/mei/pci-me.c @@ -59,18 +59,18 @@ static const struct pci_device_id mei_me_pci_tbl[] = { {MEI_PCI_DEVICE(MEI_DEV_ID_PPT_1, MEI_ME_PCH7_CFG)}, {MEI_PCI_DEVICE(MEI_DEV_ID_PPT_2, MEI_ME_PCH7_CFG)}, {MEI_PCI_DEVICE(MEI_DEV_ID_PPT_3, MEI_ME_PCH7_CFG)}, - {MEI_PCI_DEVICE(MEI_DEV_ID_LPT_H, MEI_ME_PCH8_SPS_CFG)}, - {MEI_PCI_DEVICE(MEI_DEV_ID_LPT_W, MEI_ME_PCH8_SPS_CFG)}, + {MEI_PCI_DEVICE(MEI_DEV_ID_LPT_H, MEI_ME_PCH8_SPS_4_CFG)}, + {MEI_PCI_DEVICE(MEI_DEV_ID_LPT_W, MEI_ME_PCH8_SPS_4_CFG)}, {MEI_PCI_DEVICE(MEI_DEV_ID_LPT_LP, MEI_ME_PCH8_CFG)}, - {MEI_PCI_DEVICE(MEI_DEV_ID_LPT_HR, MEI_ME_PCH8_SPS_CFG)}, + {MEI_PCI_DEVICE(MEI_DEV_ID_LPT_HR, MEI_ME_PCH8_SPS_4_CFG)}, {MEI_PCI_DEVICE(MEI_DEV_ID_WPT_LP, MEI_ME_PCH8_CFG)}, {MEI_PCI_DEVICE(MEI_DEV_ID_WPT_LP_2, MEI_ME_PCH8_CFG)},
{MEI_PCI_DEVICE(MEI_DEV_ID_SPT, MEI_ME_PCH8_CFG)}, {MEI_PCI_DEVICE(MEI_DEV_ID_SPT_2, MEI_ME_PCH8_CFG)}, - {MEI_PCI_DEVICE(MEI_DEV_ID_SPT_H, MEI_ME_PCH8_SPS_CFG)}, - {MEI_PCI_DEVICE(MEI_DEV_ID_SPT_H_2, MEI_ME_PCH8_SPS_CFG)}, - {MEI_PCI_DEVICE(MEI_DEV_ID_LBG, MEI_ME_PCH12_SPS_CFG)}, + {MEI_PCI_DEVICE(MEI_DEV_ID_SPT_H, MEI_ME_PCH8_SPS_4_CFG)}, + {MEI_PCI_DEVICE(MEI_DEV_ID_SPT_H_2, MEI_ME_PCH8_SPS_4_CFG)}, + {MEI_PCI_DEVICE(MEI_DEV_ID_LBG, MEI_ME_PCH12_SPS_4_CFG)},
{MEI_PCI_DEVICE(MEI_DEV_ID_BXT_M, MEI_ME_PCH8_CFG)}, {MEI_PCI_DEVICE(MEI_DEV_ID_APL_I, MEI_ME_PCH8_CFG)}, @@ -84,8 +84,8 @@ static const struct pci_device_id mei_me_pci_tbl[] = {
{MEI_PCI_DEVICE(MEI_DEV_ID_CNP_LP, MEI_ME_PCH12_CFG)}, {MEI_PCI_DEVICE(MEI_DEV_ID_CNP_LP_3, MEI_ME_PCH8_CFG)}, - {MEI_PCI_DEVICE(MEI_DEV_ID_CNP_H, MEI_ME_PCH12_CFG)}, - {MEI_PCI_DEVICE(MEI_DEV_ID_CNP_H_3, MEI_ME_PCH8_CFG)}, + {MEI_PCI_DEVICE(MEI_DEV_ID_CNP_H, MEI_ME_PCH12_SPS_CFG)}, + {MEI_PCI_DEVICE(MEI_DEV_ID_CNP_H_3, MEI_ME_PCH12_SPS_NODMA_CFG)},
{MEI_PCI_DEVICE(MEI_DEV_ID_CMP_LP, MEI_ME_PCH12_CFG)}, {MEI_PCI_DEVICE(MEI_DEV_ID_CMP_LP_3, MEI_ME_PCH8_CFG)},
From: Alexander Usyskin alexander.usyskin@intel.com
commit 8c289ea064165237891a7b4be77b74d5cba8fa99 upstream.
Add Tiger Lake device ids H for HECI1. TGH_H is also used in Tatlow SPS platform we need to disable the mei interface there.
Cc: stable@vger.kernel.org Signed-off-by: Alexander Usyskin alexander.usyskin@intel.com Signed-off-by: Tomas Winkler tomas.winkler@intel.com Link: https://lore.kernel.org/r/20200619165121.2145330-7-tomas.winkler@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/misc/mei/hw-me-regs.h | 1 + drivers/misc/mei/hw-me.c | 10 ++++++++++ drivers/misc/mei/hw-me.h | 4 ++++ drivers/misc/mei/pci-me.c | 1 + 4 files changed, 16 insertions(+)
diff --git a/drivers/misc/mei/hw-me-regs.h b/drivers/misc/mei/hw-me-regs.h index 01b1bf74f2626..7becfc768bbcc 100644 --- a/drivers/misc/mei/hw-me-regs.h +++ b/drivers/misc/mei/hw-me-regs.h @@ -94,6 +94,7 @@ #define MEI_DEV_ID_JSP_N 0x4DE0 /* Jasper Lake Point N */
#define MEI_DEV_ID_TGP_LP 0xA0E0 /* Tiger Lake Point LP */ +#define MEI_DEV_ID_TGP_H 0x43E0 /* Tiger Lake Point H */
#define MEI_DEV_ID_MCC 0x4B70 /* Mule Creek Canyon (EHL) */ #define MEI_DEV_ID_MCC_4 0x4B75 /* Mule Creek Canyon 4 (EHL) */ diff --git a/drivers/misc/mei/hw-me.c b/drivers/misc/mei/hw-me.c index f8155c1e811d7..7649710a2ab9e 100644 --- a/drivers/misc/mei/hw-me.c +++ b/drivers/misc/mei/hw-me.c @@ -1526,6 +1526,15 @@ static const struct mei_cfg mei_me_pch15_cfg = { MEI_CFG_TRC, };
+/* Tiger Lake with quirk for SPS 5.0 and newer Firmware exclusion */ +static const struct mei_cfg mei_me_pch15_sps_cfg = { + MEI_CFG_PCH8_HFS, + MEI_CFG_FW_VER_SUPP, + MEI_CFG_DMA_128, + MEI_CFG_TRC, + MEI_CFG_FW_SPS, +}; + /* * mei_cfg_list - A list of platform platform specific configurations. * Note: has to be synchronized with enum mei_cfg_idx. @@ -1544,6 +1553,7 @@ static const struct mei_cfg *const mei_cfg_list[] = { [MEI_ME_PCH12_SPS_CFG] = &mei_me_pch12_sps_cfg, [MEI_ME_PCH12_SPS_NODMA_CFG] = &mei_me_pch12_nodma_sps_cfg, [MEI_ME_PCH15_CFG] = &mei_me_pch15_cfg, + [MEI_ME_PCH15_SPS_CFG] = &mei_me_pch15_sps_cfg, };
const struct mei_cfg *mei_me_get_cfg(kernel_ulong_t idx) diff --git a/drivers/misc/mei/hw-me.h b/drivers/misc/mei/hw-me.h index 52e0c6d578f27..6a8973649c490 100644 --- a/drivers/misc/mei/hw-me.h +++ b/drivers/misc/mei/hw-me.h @@ -87,6 +87,9 @@ struct mei_me_hw { * servers platforms with quirk for * SPS firmware exclusion. * @MEI_ME_PCH15_CFG: Platform Controller Hub Gen15 and newer + * @MEI_ME_PCH15_SPS_CFG: Platform Controller Hub Gen15 and newer + * servers platforms with quirk for + * SPS firmware exclusion. * @MEI_ME_NUM_CFG: Upper Sentinel. */ enum mei_cfg_idx { @@ -103,6 +106,7 @@ enum mei_cfg_idx { MEI_ME_PCH12_SPS_CFG, MEI_ME_PCH12_SPS_NODMA_CFG, MEI_ME_PCH15_CFG, + MEI_ME_PCH15_SPS_CFG, MEI_ME_NUM_CFG, };
diff --git a/drivers/misc/mei/pci-me.c b/drivers/misc/mei/pci-me.c index f74c6113812d6..81e759674c1b5 100644 --- a/drivers/misc/mei/pci-me.c +++ b/drivers/misc/mei/pci-me.c @@ -96,6 +96,7 @@ static const struct pci_device_id mei_me_pci_tbl[] = { {MEI_PCI_DEVICE(MEI_DEV_ID_ICP_LP, MEI_ME_PCH12_CFG)},
{MEI_PCI_DEVICE(MEI_DEV_ID_TGP_LP, MEI_ME_PCH15_CFG)}, + {MEI_PCI_DEVICE(MEI_DEV_ID_TGP_H, MEI_ME_PCH15_SPS_CFG)},
{MEI_PCI_DEVICE(MEI_DEV_ID_JSP_N, MEI_ME_PCH15_CFG)},
From: Anand Moon linux.amoon@gmail.com
commit ad38beb373a14e082f4e64b68c0b6e6b09764680 upstream.
This reverts commit 07f6842341abe978e6375078f84506ec3280ece5.
Since SCLK_SCLK_USBD300 suspend clock need to be configured for phy module, I wrongly mapped this clock to DWC3 code.
Cc: Felipe Balbi balbi@kernel.org Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Anand Moon linux.amoon@gmail.com Cc: stable stable@vger.kernel.org Fixes: 07f6842341ab ("usb: dwc3: exynos: Add support for Exynos5422 suspend clk") Link: https://lore.kernel.org/r/20200623074637.756-1-linux.amoon@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc3/dwc3-exynos.c | 9 --------- 1 file changed, 9 deletions(-)
diff --git a/drivers/usb/dwc3/dwc3-exynos.c b/drivers/usb/dwc3/dwc3-exynos.c index 48b68b6f0dc82..90bb022737da8 100644 --- a/drivers/usb/dwc3/dwc3-exynos.c +++ b/drivers/usb/dwc3/dwc3-exynos.c @@ -162,12 +162,6 @@ static const struct dwc3_exynos_driverdata exynos5250_drvdata = { .suspend_clk_idx = -1, };
-static const struct dwc3_exynos_driverdata exynos5420_drvdata = { - .clk_names = { "usbdrd30", "usbdrd30_susp_clk"}, - .num_clks = 2, - .suspend_clk_idx = 1, -}; - static const struct dwc3_exynos_driverdata exynos5433_drvdata = { .clk_names = { "aclk", "susp_clk", "pipe_pclk", "phyclk" }, .num_clks = 4, @@ -184,9 +178,6 @@ static const struct of_device_id exynos_dwc3_match[] = { { .compatible = "samsung,exynos5250-dwusb3", .data = &exynos5250_drvdata, - }, { - .compatible = "samsung,exynos5420-dwusb3", - .data = &exynos5420_drvdata, }, { .compatible = "samsung,exynos5433-dwusb3", .data = &exynos5433_drvdata,
From: Chuhong Yuan hslester96@gmail.com
commit 07c112fb09c86c0231f6ff0061a000ffe91c8eb9 upstream.
This driver misses calling iounmap() in remove to undo the ioremap() called in probe. Add the missed call to fix it.
Fixes: f54aab6ebcec ("usb: ohci-sm501 driver") Cc: stable stable@vger.kernel.org Signed-off-by: Chuhong Yuan hslester96@gmail.com Acked-by: Alan Stern stern@rowland.harvard.edu Link: https://lore.kernel.org/r/20200610024844.3628408-1-hslester96@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/ohci-sm501.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/usb/host/ohci-sm501.c b/drivers/usb/host/ohci-sm501.c index cff9652403270..b91d50da6127f 100644 --- a/drivers/usb/host/ohci-sm501.c +++ b/drivers/usb/host/ohci-sm501.c @@ -191,6 +191,7 @@ static int ohci_hcd_sm501_drv_remove(struct platform_device *pdev) struct resource *mem;
usb_remove_hcd(hcd); + iounmap(hcd->regs); release_mem_region(hcd->rsrc_start, hcd->rsrc_len); usb_put_hcd(hcd); mem = platform_get_resource(pdev, IORESOURCE_MEM, 1);
From: Minas Harutyunyan Minas.Harutyunyan@synopsys.com
commit 207324a321a866401b098cadf19e4a2dd6584622 upstream.
During dwc2 driver probe, after gadget registration to the udc class driver, if exist any builtin function driver it immediately bound to dwc2 and after init host side (dwc2_hcd_init()) stucked in host mode. Patch postpone gadget registration after host side initialization done.
Fixes: 117777b2c3bb9 ("usb: dwc2: Move gadget probe function into platform code") Reported-by: kbuild test robot lkp@intel.com Tested-by: Marek Vasut marex@denx.de Cc: stable stable@vger.kernel.org Signed-off-by: Minas Harutyunyan hminas@synopsys.com Link: https://lore.kernel.org/r/f21cb38fecc72a230b86155d94c7e60c9cb66f58.159169093... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc2/gadget.c | 6 ------ drivers/usb/dwc2/platform.c | 11 +++++++++++ 2 files changed, 11 insertions(+), 6 deletions(-)
diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c index 12b98b4662872..7faf5f8c056d4 100644 --- a/drivers/usb/dwc2/gadget.c +++ b/drivers/usb/dwc2/gadget.c @@ -4920,12 +4920,6 @@ int dwc2_gadget_init(struct dwc2_hsotg *hsotg) epnum, 0); }
- ret = usb_add_gadget_udc(dev, &hsotg->gadget); - if (ret) { - dwc2_hsotg_ep_free_request(&hsotg->eps_out[0]->ep, - hsotg->ctrl_req); - return ret; - } dwc2_hsotg_dump(hsotg);
return 0; diff --git a/drivers/usb/dwc2/platform.c b/drivers/usb/dwc2/platform.c index 69972750e1612..5684c4781af9e 100644 --- a/drivers/usb/dwc2/platform.c +++ b/drivers/usb/dwc2/platform.c @@ -536,6 +536,17 @@ static int dwc2_driver_probe(struct platform_device *dev) if (hsotg->dr_mode == USB_DR_MODE_PERIPHERAL) dwc2_lowlevel_hw_disable(hsotg);
+#if IS_ENABLED(CONFIG_USB_DWC2_PERIPHERAL) || \ + IS_ENABLED(CONFIG_USB_DWC2_DUAL_ROLE) + /* Postponed adding a new gadget to the udc class driver list */ + if (hsotg->gadget_enabled) { + retval = usb_add_gadget_udc(hsotg->dev, &hsotg->gadget); + if (retval) { + dwc2_hsotg_remove(hsotg); + goto error_init; + } + } +#endif /* CONFIG_USB_DWC2_PERIPHERAL || CONFIG_USB_DWC2_DUAL_ROLE */ return 0;
error_init:
From: Tomasz Meresiński tomasz@meresinski.eu
commit 5d8021923e8a8cc37a421a64e27c7221f0fee33c upstream.
The Logitech C922, just like other Logitech webcams, needs the USB_QUIRK_DELAY_INIT or it will randomly not respond after device connection
Signed-off-by: Tomasz Meresiński tomasz@meresinski.eu Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/20200603203347.7792-1-tomasz@meresinski.eu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/core/quirks.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/core/quirks.c b/drivers/usb/core/quirks.c index 3e8efe759c3e6..e0b77674869ce 100644 --- a/drivers/usb/core/quirks.c +++ b/drivers/usb/core/quirks.c @@ -218,11 +218,12 @@ static const struct usb_device_id usb_quirk_list[] = { /* Logitech HD Webcam C270 */ { USB_DEVICE(0x046d, 0x0825), .driver_info = USB_QUIRK_RESET_RESUME },
- /* Logitech HD Pro Webcams C920, C920-C, C925e and C930e */ + /* Logitech HD Pro Webcams C920, C920-C, C922, C925e and C930e */ { USB_DEVICE(0x046d, 0x082d), .driver_info = USB_QUIRK_DELAY_INIT }, { USB_DEVICE(0x046d, 0x0841), .driver_info = USB_QUIRK_DELAY_INIT }, { USB_DEVICE(0x046d, 0x0843), .driver_info = USB_QUIRK_DELAY_INIT }, { USB_DEVICE(0x046d, 0x085b), .driver_info = USB_QUIRK_DELAY_INIT }, + { USB_DEVICE(0x046d, 0x085c), .driver_info = USB_QUIRK_DELAY_INIT },
/* Logitech ConferenceCam CC3000e */ { USB_DEVICE(0x046d, 0x0847), .driver_info = USB_QUIRK_DELAY_INIT },
From: Longfang Liu liulongfang@huawei.com
commit 1ddcb71a3edf0e1682b6e056158e4c4b00325f66 upstream.
A Synopsys USB2.0 core used in Huawei Kunpeng920 SoC has a bug which might cause the host controller not issuing ping.
Bug description: After indicating an Interrupt on Async Advance, the software uses the doorbell mechanism to delete the Next Link queue head of the last executed queue head. At this time, the host controller still references the removed queue head(the queue head is NULL). NULL reference causes the host controller to lose the USB device.
Solution: After deleting the Next Link queue head, when has_synopsys_hc_bug set to 1,the software can write one of the valid queue head addresses to the ASYNCLISTADDR register to allow the host controller to get the valid queue head. in order to solve that problem, this patch set the flag for Huawei Kunpeng920
There are detailed instructions and solutions in this patch: commit 2f7ac6c19997 ("USB: ehci: add workaround for Synopsys HC bug")
Signed-off-by: Longfang Liu liulongfang@huawei.com Cc: stable stable@vger.kernel.org Acked-by: Alan Stern stern@rowland.harvard.edu Link: https://lore.kernel.org/r/1591588019-44284-1-git-send-email-liulongfang@huaw... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/ehci-pci.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/usb/host/ehci-pci.c b/drivers/usb/host/ehci-pci.c index 1a48ab1bd3b2a..7ff2cbdcd0b22 100644 --- a/drivers/usb/host/ehci-pci.c +++ b/drivers/usb/host/ehci-pci.c @@ -216,6 +216,13 @@ static int ehci_pci_setup(struct usb_hcd *hcd) ehci_info(ehci, "applying MosChip frame-index workaround\n"); ehci->frame_index_bug = 1; break; + case PCI_VENDOR_ID_HUAWEI: + /* Synopsys HC bug */ + if (pdev->device == 0xa239) { + ehci_info(ehci, "applying Synopsys HC workaround\n"); + ehci->has_synopsys_hc_bug = 1; + } + break; }
/* optional debug port, normally in the first BAR */
From: Macpaul Lin macpaul.lin@mediatek.com
commit a24d5072e87457a14023ee1dd3fc8b1e76f899ef upstream.
When runtime suspend was enabled, runtime suspend might happen when xhci is removing hcd. This might cause kernel panic when hcd has been freed but runtime pm suspend related handle need to reference it.
Signed-off-by: Macpaul Lin macpaul.lin@mediatek.com Reviewed-by: Chunfeng Yun chunfeng.yun@mediatek.com Cc: stable@vger.kernel.org Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20200624135949.22611-4-mathias.nyman@linux.intel.c... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-mtk.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/host/xhci-mtk.c b/drivers/usb/host/xhci-mtk.c index bfbdb3ceed291..4311d4c9b68de 100644 --- a/drivers/usb/host/xhci-mtk.c +++ b/drivers/usb/host/xhci-mtk.c @@ -587,6 +587,9 @@ static int xhci_mtk_remove(struct platform_device *dev) struct xhci_hcd *xhci = hcd_to_xhci(hcd); struct usb_hcd *shared_hcd = xhci->shared_hcd;
+ pm_runtime_put_noidle(&dev->dev); + pm_runtime_disable(&dev->dev); + usb_remove_hcd(shared_hcd); xhci->shared_hcd = NULL; device_init_wakeup(&dev->dev, false); @@ -597,8 +600,6 @@ static int xhci_mtk_remove(struct platform_device *dev) xhci_mtk_sch_exit(mtk); xhci_mtk_clks_disable(mtk); xhci_mtk_ldos_disable(mtk); - pm_runtime_put_sync(&dev->dev); - pm_runtime_disable(&dev->dev);
return 0; }
From: Kai-Heng Feng kai.heng.feng@canonical.com
commit b3d71abd135e6919ca0b6cab463738472653ddfb upstream.
USB2 devices with LPM enabled may interrupt the system suspend: [ 932.510475] usb 1-7: usb suspend, wakeup 0 [ 932.510549] hub 1-0:1.0: hub_suspend [ 932.510581] usb usb1: bus suspend, wakeup 0 [ 932.510590] xhci_hcd 0000:00:14.0: port 9 not suspended [ 932.510593] xhci_hcd 0000:00:14.0: port 8 not suspended .. [ 932.520323] xhci_hcd 0000:00:14.0: Port change event, 1-7, id 7, portsc: 0x400e03 .. [ 932.591405] PM: pci_pm_suspend(): hcd_pci_suspend+0x0/0x30 returns -16 [ 932.591414] PM: dpm_run_callback(): pci_pm_suspend+0x0/0x160 returns -16 [ 932.591418] PM: Device 0000:00:14.0 failed to suspend async: error -16
During system suspend, USB core will let HC suspends the device if it doesn't have remote wakeup enabled and doesn't have any children. However, from the log above we can see that the usb 1-7 doesn't get bus suspended due to not in U0. After a while the port finished U2 -> U0 transition, interrupts the suspend process.
The observation is that after disabling LPM, port doesn't transit to U0 immediately and can linger in U2. xHCI spec 4.23.5.2 states that the maximum exit latency for USB2 LPM should be BESL + 10us. The BESL for the affected device is advertised as 400us, which is still not enough based on my testing result.
So let's use the maximum permitted latency, 10000, to poll for U0 status to solve the issue.
Cc: stable@vger.kernel.org Signed-off-by: Kai-Heng Feng kai.heng.feng@canonical.com Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20200624135949.22611-6-mathias.nyman@linux.intel.c... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c index bee5deccc83d8..10d76205a79ca 100644 --- a/drivers/usb/host/xhci.c +++ b/drivers/usb/host/xhci.c @@ -4471,6 +4471,9 @@ static int xhci_set_usb2_hardware_lpm(struct usb_hcd *hcd, mutex_lock(hcd->bandwidth_mutex); xhci_change_max_exit_latency(xhci, udev, 0); mutex_unlock(hcd->bandwidth_mutex); + readl_poll_timeout(ports[port_num]->addr, pm_val, + (pm_val & PORT_PLS_MASK) == XDEV_U0, + 100, 10000); return 0; } }
From: Tang Bin tangbin@cmss.chinamobile.com
commit 44ed240d62736ad29943ec01e41e194b96f7c5e9 upstream.
If the function platform_get_irq() failed, the negative value returned will not be detected here. So fix error handling in exynos_ehci_probe(). And when get irq failed, the function platform_get_irq() logs an error message, so remove redundant message here.
Fixes: 1bcc5aa87f04 ("USB: Add initial S5P EHCI driver") Cc: stable stable@vger.kernel.org Signed-off-by: Zhang Shengju zhangshengju@cmss.chinamobile.com Signed-off-by: Tang Bin tangbin@cmss.chinamobile.com Link: https://lore.kernel.org/r/20200602114708.28620-1-tangbin@cmss.chinamobile.co... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/ehci-exynos.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/usb/host/ehci-exynos.c b/drivers/usb/host/ehci-exynos.c index a4e9abcbdc4fc..1a9b7572e17fe 100644 --- a/drivers/usb/host/ehci-exynos.c +++ b/drivers/usb/host/ehci-exynos.c @@ -203,9 +203,8 @@ static int exynos_ehci_probe(struct platform_device *pdev) hcd->rsrc_len = resource_size(res);
irq = platform_get_irq(pdev, 0); - if (!irq) { - dev_err(&pdev->dev, "Failed to get IRQ\n"); - err = -ENODEV; + if (irq < 0) { + err = irq; goto fail_io; }
From: Li Jun jun.li@nxp.com
commit 302c570bf36e997d55ad0d60628a2feec76954a4 upstream.
John reported screaming irq caused by rt1711h when system boot[1], this is because irq request is done before tcpci_register_port(), so the chip->tcpci has not been setup, irq handler is entered but can't do anything, this patch is to address this by moving the irq request after tcpci_register_port().
[1] https://lore.kernel.org/linux-usb/20200530040157.31038-1-john.stultz@linaro....
Fixes: ce08eaeb6388 ("staging: typec: rt1711h typec chip driver") Cc: stable stable@vger.kernel.org # v4.18+ Cc: John Stultz john.stultz@linaro.org Reported-and-tested-by: John Stultz john.stultz@linaro.org Reviewed-by: Guenter Roeck linux@roeck-us.net Reviewed-by: Heikki Krogerus heikki.krogerus@linux.intel.com Signed-off-by: Li Jun jun.li@nxp.com Link: https://lore.kernel.org/r/20200604112118.38062-1-jun.li@nxp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/typec/tcpm/tcpci_rt1711h.c | 31 +++++++++----------------- 1 file changed, 10 insertions(+), 21 deletions(-)
diff --git a/drivers/usb/typec/tcpm/tcpci_rt1711h.c b/drivers/usb/typec/tcpm/tcpci_rt1711h.c index 017389021b96a..b56a0880a0441 100644 --- a/drivers/usb/typec/tcpm/tcpci_rt1711h.c +++ b/drivers/usb/typec/tcpm/tcpci_rt1711h.c @@ -179,26 +179,6 @@ out: return tcpci_irq(chip->tcpci); }
-static int rt1711h_init_alert(struct rt1711h_chip *chip, - struct i2c_client *client) -{ - int ret; - - /* Disable chip interrupts before requesting irq */ - ret = rt1711h_write16(chip, TCPC_ALERT_MASK, 0); - if (ret < 0) - return ret; - - ret = devm_request_threaded_irq(chip->dev, client->irq, NULL, - rt1711h_irq, - IRQF_ONESHOT | IRQF_TRIGGER_LOW, - dev_name(chip->dev), chip); - if (ret < 0) - return ret; - enable_irq_wake(client->irq); - return 0; -} - static int rt1711h_sw_reset(struct rt1711h_chip *chip) { int ret; @@ -260,7 +240,8 @@ static int rt1711h_probe(struct i2c_client *client, if (ret < 0) return ret;
- ret = rt1711h_init_alert(chip, client); + /* Disable chip interrupts before requesting irq */ + ret = rt1711h_write16(chip, TCPC_ALERT_MASK, 0); if (ret < 0) return ret;
@@ -271,6 +252,14 @@ static int rt1711h_probe(struct i2c_client *client, if (IS_ERR_OR_NULL(chip->tcpci)) return PTR_ERR(chip->tcpci);
+ ret = devm_request_threaded_irq(chip->dev, client->irq, NULL, + rt1711h_irq, + IRQF_ONESHOT | IRQF_TRIGGER_LOW, + dev_name(chip->dev), chip); + if (ret < 0) + return ret; + enable_irq_wake(client->irq); + return 0; }
From: Heikki Krogerus heikki.krogerus@linux.intel.com
commit 130206a88683d859f63ed6d4a56ab5c2b4930c8e upstream.
The PMC needs to be notified separately about HPD (hotplug detected) signal being high after mode entry. There is a bit "HPD High" in the Alternate Mode Request that the driver already sets, but that bit is only valid when the DisplayPort Alternate Mode is directly entered from disconnected state.
Fixes: 5c4edcdbcd97 ("usb: typec: mux: intel: Fix DP_HPD_LVL bit field") Signed-off-by: Heikki Krogerus heikki.krogerus@linux.intel.com Cc: stable stable@vger.kernel.org Tested-by: Prashant Malani pmalani@chromium.org Link: https://lore.kernel.org/r/20200529131753.15587-1-heikki.krogerus@linux.intel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/typec/mux/intel_pmc_mux.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/typec/mux/intel_pmc_mux.c b/drivers/usb/typec/mux/intel_pmc_mux.c index c22e5c4bbf1a9..e35508f5e1287 100644 --- a/drivers/usb/typec/mux/intel_pmc_mux.c +++ b/drivers/usb/typec/mux/intel_pmc_mux.c @@ -129,7 +129,8 @@ pmc_usb_mux_dp_hpd(struct pmc_usb_port *port, struct typec_mux_state *state) msg[0] = PMC_USB_DP_HPD; msg[0] |= port->usb3_port << PMC_USB_MSG_USB3_PORT_SHIFT;
- msg[1] = PMC_USB_DP_HPD_IRQ; + if (data->status & DP_STATUS_IRQ_HPD) + msg[1] = PMC_USB_DP_HPD_IRQ;
if (data->status & DP_STATUS_HPD_STATE) msg[1] |= PMC_USB_DP_HPD_LVL; @@ -142,6 +143,7 @@ pmc_usb_mux_dp(struct pmc_usb_port *port, struct typec_mux_state *state) { struct typec_displayport_data *data = state->data; struct altmode_req req = { }; + int ret;
if (data->status & DP_STATUS_IRQ_HPD) return pmc_usb_mux_dp_hpd(port, state); @@ -161,7 +163,14 @@ pmc_usb_mux_dp(struct pmc_usb_port *port, struct typec_mux_state *state) if (data->status & DP_STATUS_HPD_STATE) req.mode_data |= PMC_USB_ALTMODE_HPD_HIGH;
- return pmc_usb_command(port, (void *)&req, sizeof(req)); + ret = pmc_usb_command(port, (void *)&req, sizeof(req)); + if (ret) + return ret; + + if (data->status & DP_STATUS_HPD_STATE) + return pmc_usb_mux_dp_hpd(port, state); + + return 0; }
static int
From: Laurence Tratt laurie@tratt.net
commit e7585db1b0b5b4e4eb1967bb1472df308f7ffcbf upstream.
This uses the same quirk as the Motu M2 and M4 to ensure the driver uses the audio interface's clock. Tested on an SSL2+.
Signed-off-by: Laurence Tratt laurie@tratt.net Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200612111807.dgnig6rwhmsl2bod@overdrive.tratt.ne... Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/usb/pcm.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c index 3411e27eb64bb..39aec83f8aca4 100644 --- a/sound/usb/pcm.c +++ b/sound/usb/pcm.c @@ -367,6 +367,7 @@ static int set_sync_ep_implicit_fb_quirk(struct snd_usb_substream *subs, ifnum = 0; goto add_sync_ep_from_ifnum; case USB_ID(0x07fd, 0x0008): /* MOTU M Series */ + case USB_ID(0x31e9, 0x0002): /* Solid State Logic SSL2+ */ ep = 0x81; ifnum = 2; goto add_sync_ep_from_ifnum;
From: "Yick W. Tse" y_w_tse@yahoo.com.hk
commit c9808bbfed3cfc911ecb60fe8e80c0c27876c657 upstream.
fix error "clock source 41 is not valid, cannot use"
[] New USB device found, idVendor=154e, idProduct=1002, bcdDevice= 1.00 [] New USB device strings: Mfr=1, Product=2, SerialNumber=0 [] Product: DCD-1500RE [] Manufacturer: D & M Holdings Inc. [] [] clock source 41 is not valid, cannot use [] usbcore: registered new interface driver snd-usb-audio
Signed-off-by: Yick W. Tse y_w_tse@yahoo.com.hk Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1373857985.210365.1592048406997@mail.yahoo.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/usb/quirks.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c index d8a765be5dfe8..aa964847b39b2 100644 --- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -1505,6 +1505,7 @@ bool snd_usb_get_sample_rate_quirk(struct snd_usb_audio *chip) static bool is_itf_usb_dsd_dac(unsigned int id) { switch (id) { + case USB_ID(0x154e, 0x1002): /* Denon DCD-1500RE */ case USB_ID(0x154e, 0x1003): /* Denon DA-300USB */ case USB_ID(0x154e, 0x3005): /* Marantz HD-DAC1 */ case USB_ID(0x154e, 0x3006): /* Marantz SA-14S1 */
From: Christopher Swenson swenson@swenson.io
commit 8abf41dcd1bcdda0d09905fb59d18f45c035c752 upstream.
Like the Line6 devices, the Rode Rodecaster Pro does not support UAC2_CS_RANGE and only supports a sample rate of 48 kHz.
Tested against a Rode Rodecaster Pro.
Tested-by: Christopher Swenson swenson@swenson.io Signed-off-by: Christopher Swenson swenson@swenson.io Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/ebdb9e72-9649-0b5e-b9b9-d757dbf26927@swenson.io Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/usb/format.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/sound/usb/format.c b/sound/usb/format.c index 5ffb457cc88cf..1b28d01d1f4cd 100644 --- a/sound/usb/format.c +++ b/sound/usb/format.c @@ -394,8 +394,9 @@ skip_rate: return nr_rates; }
-/* Line6 Helix series don't support the UAC2_CS_RANGE usb function - * call. Return a static table of known clock rates. +/* Line6 Helix series and the Rode Rodecaster Pro don't support the + * UAC2_CS_RANGE usb function call. Return a static table of known + * clock rates. */ static int line6_parse_audio_format_rates_quirk(struct snd_usb_audio *chip, struct audioformat *fp) @@ -408,6 +409,7 @@ static int line6_parse_audio_format_rates_quirk(struct snd_usb_audio *chip, case USB_ID(0x0e41, 0x4248): /* Line6 Helix >= fw 2.82 */ case USB_ID(0x0e41, 0x4249): /* Line6 Helix Rack >= fw 2.82 */ case USB_ID(0x0e41, 0x424a): /* Line6 Helix LT >= fw 2.82 */ + case USB_ID(0x19f7, 0x0011): /* Rode Rodecaster Pro */ return set_fixed_rate(fp, 48000, SNDRV_PCM_RATE_48000); }
From: Christoffer Nielsen cn@obviux.dk
commit 73094608b8e214952444fb104651704c98a37aeb upstream.
Similar to the Kingston HyperX AMP, the Kingston HyperX Cloud Alpha S (0951:0x16ea) uses two interfaces, but only the second interface contains the capture stream. This patch delays the registration until the second interface appears.
Signed-off-by: Christoffer Nielsen cn@obviux.dk Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/CAOtG2YHOM3zy+ed9KS-J4HkZo_QGzcUG9MigSp4e4_-13r6B=... Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/usb/quirks.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c index aa964847b39b2..b5e9fe0486189 100644 --- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -1844,6 +1844,7 @@ struct registration_quirk { static const struct registration_quirk registration_quirks[] = { REG_QUIRK_ENTRY(0x0951, 0x16d8, 2), /* Kingston HyperX AMP */ REG_QUIRK_ENTRY(0x0951, 0x16ed, 2), /* Kingston HyperX Cloud Alpha S */ + REG_QUIRK_ENTRY(0x0951, 0x16ea, 2), /* Kingston HyperX Cloud Flight S */ { 0 } /* terminator */ };
From: Macpaul Lin macpaul.lin@mediatek.com
commit a32a1fc99807244d920d274adc46ba04b538cc8a upstream.
We've found Samsung USBC Headset (AKG) (VID: 0x04e8, PID: 0xa051) need a tiny delay after each class compliant request. Otherwise the device might not be able to be recognized each times.
Signed-off-by: Chihhao Chen chihhao.chen@mediatek.com Signed-off-by: Macpaul Lin macpaul.lin@mediatek.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1592910203-24035-1-git-send-email-macpaul.lin@medi... Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/usb/quirks.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c index b5e9fe0486189..d7d900ebcf37f 100644 --- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -1647,6 +1647,14 @@ void snd_usb_ctl_msg_quirk(struct usb_device *dev, unsigned int pipe, chip->usb_id == USB_ID(0x0951, 0x16ad)) && (requesttype & USB_TYPE_MASK) == USB_TYPE_CLASS) usleep_range(1000, 2000); + + /* + * Samsung USBC Headset (AKG) need a tiny delay after each + * class compliant request. (Model number: AAM625R or AAM627R) + */ + if (chip->usb_id == USB_ID(0x04e8, 0xa051) && + (requesttype & USB_TYPE_MASK) == USB_TYPE_CLASS) + usleep_range(5000, 6000); }
/*
From: Takashi Iwai tiwai@suse.de
commit 220345e98f1cdc768eeb6e3364a0fa7ab9647fe7 upstream.
The USB-audio mixer code holds a linked list of usb_mixer_elem_list, and several operations are performed for each mixer element. A few of them (snd_usb_mixer_notify_id() and snd_usb_mixer_interrupt_v2()) assume each mixer element being a usb_mixer_elem_info object that is a subclass of usb_mixer_elem_list, cast via container_of() and access it members. This may result in an out-of-bound access when a non-standard list element has been added, as spotted by syzkaller recently.
This patch adds a new field, is_std_info, in usb_mixer_elem_list to indicate that the element is the usb_mixer_elem_info type or not, and skip the access to such an element if needed.
Reported-by: syzbot+fb14314433463ad51625@syzkaller.appspotmail.com Reported-by: syzbot+2405ca3401e943c538b5@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200624122340.9615-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/usb/mixer.c | 15 +++++++++++---- sound/usb/mixer.h | 9 +++++++-- sound/usb/mixer_quirks.c | 3 ++- 3 files changed, 20 insertions(+), 7 deletions(-)
diff --git a/sound/usb/mixer.c b/sound/usb/mixer.c index 15769f266790e..eab0fd4fd7c33 100644 --- a/sound/usb/mixer.c +++ b/sound/usb/mixer.c @@ -581,8 +581,9 @@ static int check_matrix_bitmap(unsigned char *bmap, * if failed, give up and free the control instance. */
-int snd_usb_mixer_add_control(struct usb_mixer_elem_list *list, - struct snd_kcontrol *kctl) +int snd_usb_mixer_add_list(struct usb_mixer_elem_list *list, + struct snd_kcontrol *kctl, + bool is_std_info) { struct usb_mixer_interface *mixer = list->mixer; int err; @@ -596,6 +597,7 @@ int snd_usb_mixer_add_control(struct usb_mixer_elem_list *list, return err; } list->kctl = kctl; + list->is_std_info = is_std_info; list->next_id_elem = mixer->id_elems[list->id]; mixer->id_elems[list->id] = list; return 0; @@ -3234,8 +3236,11 @@ void snd_usb_mixer_notify_id(struct usb_mixer_interface *mixer, int unitid) unitid = delegate_notify(mixer, unitid, NULL, NULL);
for_each_mixer_elem(list, mixer, unitid) { - struct usb_mixer_elem_info *info = - mixer_elem_list_to_info(list); + struct usb_mixer_elem_info *info; + + if (!list->is_std_info) + continue; + info = mixer_elem_list_to_info(list); /* invalidate cache, so the value is read from the device */ info->cached = 0; snd_ctl_notify(mixer->chip->card, SNDRV_CTL_EVENT_MASK_VALUE, @@ -3315,6 +3320,8 @@ static void snd_usb_mixer_interrupt_v2(struct usb_mixer_interface *mixer,
if (!list->kctl) continue; + if (!list->is_std_info) + continue;
info = mixer_elem_list_to_info(list); if (count > 1 && info->control != control) diff --git a/sound/usb/mixer.h b/sound/usb/mixer.h index 41ec9dc4139bb..c29e27ac43a7a 100644 --- a/sound/usb/mixer.h +++ b/sound/usb/mixer.h @@ -66,6 +66,7 @@ struct usb_mixer_elem_list { struct usb_mixer_elem_list *next_id_elem; /* list of controls with same id */ struct snd_kcontrol *kctl; unsigned int id; + bool is_std_info; usb_mixer_elem_dump_func_t dump; usb_mixer_elem_resume_func_t resume; }; @@ -103,8 +104,12 @@ void snd_usb_mixer_notify_id(struct usb_mixer_interface *mixer, int unitid); int snd_usb_mixer_set_ctl_value(struct usb_mixer_elem_info *cval, int request, int validx, int value_set);
-int snd_usb_mixer_add_control(struct usb_mixer_elem_list *list, - struct snd_kcontrol *kctl); +int snd_usb_mixer_add_list(struct usb_mixer_elem_list *list, + struct snd_kcontrol *kctl, + bool is_std_info); + +#define snd_usb_mixer_add_control(list, kctl) \ + snd_usb_mixer_add_list(list, kctl, true)
void snd_usb_mixer_elem_init_std(struct usb_mixer_elem_list *list, struct usb_mixer_interface *mixer, diff --git a/sound/usb/mixer_quirks.c b/sound/usb/mixer_quirks.c index aad2683ff7933..260607144f565 100644 --- a/sound/usb/mixer_quirks.c +++ b/sound/usb/mixer_quirks.c @@ -158,7 +158,8 @@ static int add_single_ctl_with_resume(struct usb_mixer_interface *mixer, return -ENOMEM; } kctl->private_free = snd_usb_mixer_elem_free; - return snd_usb_mixer_add_control(list, kctl); + /* don't use snd_usb_mixer_add_control() here, this is a special list element */ + return snd_usb_mixer_add_list(list, kctl, false); }
/*
From: Peter Chen peter.chen@nxp.com
commit ba3a80fe0fb67d8790f62b7bc60df97406d89871 upstream.
It should use the correct direction value from register, not depends on previous software setting. It fixed the EP number wrong issue at trace when the TRBERR interrupt occurs for EP0IN.
When the EP0IN IOC has finished, software prepares the setup packet request, the expected direction is OUT, but at that time, the TRBERR for EP0IN may occur since it is DMULT mode, the DMA does not stop until TRBERR has met.
Cc: stable@vger.kernel.org Fixes: 7733f6c32e36 ("usb: cdns3: Add Cadence USB3 DRD Driver") Reviewed-by: Pawel Laszczak pawell@cadence.com Signed-off-by: Peter Chen peter.chen@nxp.com Signed-off-by: Felipe Balbi balbi@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/cdns3/trace.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/cdns3/trace.h b/drivers/usb/cdns3/trace.h index 8d121e207fd8f..755c565822575 100644 --- a/drivers/usb/cdns3/trace.h +++ b/drivers/usb/cdns3/trace.h @@ -156,7 +156,7 @@ DECLARE_EVENT_CLASS(cdns3_log_ep0_irq, __dynamic_array(char, str, CDNS3_MSG_MAX) ), TP_fast_assign( - __entry->ep_dir = priv_dev->ep0_data_dir; + __entry->ep_dir = priv_dev->selected_ep; __entry->ep_sts = ep_sts; ), TP_printk("%s", cdns3_decode_ep0_irq(__get_str(str),
From: Peter Chen peter.chen@nxp.com
commit b51e1cf64f93acebb6d8afbacd648a6ecefc39b4 upstream.
The 'tmode' is ctrl->wIndex, changing it as the real test mode value for register assignment.
Cc: stable@vger.kernel.org Fixes: 7733f6c32e36 ("usb: cdns3: Add Cadence USB3 DRD Driver") Reviewed-by: Jun Li jun.li@nxp.com Signed-off-by: Peter Chen peter.chen@nxp.com Signed-off-by: Felipe Balbi balbi@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/cdns3/ep0.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/cdns3/ep0.c b/drivers/usb/cdns3/ep0.c index e71240b386b49..f785ce84aba62 100644 --- a/drivers/usb/cdns3/ep0.c +++ b/drivers/usb/cdns3/ep0.c @@ -327,7 +327,8 @@ static int cdns3_ep0_feature_handle_device(struct cdns3_device *priv_dev, if (!set || (tmode & 0xff) != 0) return -EINVAL;
- switch (tmode >> 8) { + tmode >>= 8; + switch (tmode) { case TEST_J: case TEST_K: case TEST_SE0_NAK:
From: Peter Chen peter.chen@nxp.com
commit 2587a029fa2a877d0a8dda955ef1b24c94b4bd0e upstream.
The other thread may access other endpoints when the cdns3_check_new_setup is handling, add spinlock to protect it.
Cc: stable@vger.kernel.org Fixes: 7733f6c32e36 ("usb: cdns3: Add Cadence USB3 DRD Driver") Reviewed-by: Pawel Laszczak pawell@cadence.com Signed-off-by: Peter Chen peter.chen@nxp.com Signed-off-by: Felipe Balbi balbi@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/cdns3/ep0.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/usb/cdns3/ep0.c b/drivers/usb/cdns3/ep0.c index f785ce84aba62..da4c5eb03d7ee 100644 --- a/drivers/usb/cdns3/ep0.c +++ b/drivers/usb/cdns3/ep0.c @@ -712,15 +712,17 @@ static int cdns3_gadget_ep0_queue(struct usb_ep *ep, int ret = 0; u8 zlp = 0;
+ spin_lock_irqsave(&priv_dev->lock, flags); trace_cdns3_ep0_queue(priv_dev, request);
/* cancel the request if controller receive new SETUP packet. */ - if (cdns3_check_new_setup(priv_dev)) + if (cdns3_check_new_setup(priv_dev)) { + spin_unlock_irqrestore(&priv_dev->lock, flags); return -ECONNRESET; + }
/* send STATUS stage. Should be called only for SET_CONFIGURATION */ if (priv_dev->ep0_stage == CDNS3_STATUS_STAGE) { - spin_lock_irqsave(&priv_dev->lock, flags); cdns3_select_ep(priv_dev, 0x00);
erdy_sent = !priv_dev->hw_configured_flag; @@ -745,7 +747,6 @@ static int cdns3_gadget_ep0_queue(struct usb_ep *ep, return 0; }
- spin_lock_irqsave(&priv_dev->lock, flags); if (!list_empty(&priv_ep->pending_req_list)) { dev_err(priv_dev->dev, "can't handle multiple requests for ep0\n");
From: Roman Bolshakov r.bolshakov@yadro.com
commit 632f24f09d5b7c8a2f94932c3391ca957ae76cc4 upstream.
The driver performs SCR (state change registration) in all modes including pure target mode.
For each RSCN, scan_needed flag is set in qla2x00_handle_rscn() for the port mentioned in the RSCN and fabric rescan is scheduled. During the rescan, GNN_FT handler, qla24xx_async_gnnft_done() deletes session of the port that caused the RSCN.
In target mode, the session deletion has an impact on ATIO handler, qlt_24xx_atio_pkt(). Target responds with SAM STATUS BUSY to I/O incoming from the deleted session. qlt_handle_cmd_for_atio() and qlt_handle_task_mgmt() return -EFAULT if they are not able to find session of the command/TMF, and that results in invocation of qlt_send_busy():
qlt_24xx_atio_pkt_all_vps: qla_target(0): type 6 ox_id 0014 qla_target(0): Unable to send command to target, sending BUSY status
Such response causes command timeout on the initiator. Error handler thread on the initiator will be spawned to abort the commands:
scsi 23:0:0:0: tag#0 abort scheduled scsi 23:0:0:0: tag#0 aborting command qla2xxx [0000:af:00.0]-188c:23: Entered qla24xx_abort_command. qla2xxx [0000:af:00.0]-801c:23: Abort command issued nexus=23:0:0 -- 0 2003.
Command abort is rejected by target and fails (2003), error handler then tries to perform DEVICE RESET and TARGET RESET but they're also doomed to fail because TMFs are ignored for the deleted sessions.
Then initiator makes BUS RESET that resets the link via qla2x00_full_login_lip(). BUS RESET succeeds and brings initiator port up, SAN switch detects that and sends RSCN to the target port and it fails again the same way as described above. It never goes out of the loop.
The change breaks the RSCN loop by keeping initiator sessions mentioned in RSCN payload in all modes, including dual and pure target mode.
Link: https://lore.kernel.org/r/20200605144435.27023-1-r.bolshakov@yadro.com Fixes: 2037ce49d30a ("scsi: qla2xxx: Fix stale session") Cc: Quinn Tran qutran@marvell.com Cc: Arun Easi aeasi@marvell.com Cc: Nilesh Javali njavali@marvell.com Cc: Bart Van Assche bvanassche@acm.org Cc: Daniel Wagner dwagner@suse.de Cc: Himanshu Madhani himanshu.madhani@oracle.com Cc: Martin Wilck mwilck@suse.com Cc: stable@vger.kernel.org # v5.4+ Reviewed-by: Daniel Wagner dwagner@suse.de Reviewed-by: Shyam Sundar ssundar@marvell.com Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com Signed-off-by: Roman Bolshakov r.bolshakov@yadro.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/qla2xxx/qla_gs.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/qla2xxx/qla_gs.c b/drivers/scsi/qla2xxx/qla_gs.c index 42c3ad27f1cbc..df670fba2ab8a 100644 --- a/drivers/scsi/qla2xxx/qla_gs.c +++ b/drivers/scsi/qla2xxx/qla_gs.c @@ -3496,7 +3496,9 @@ void qla24xx_async_gnnft_done(scsi_qla_host_t *vha, srb_t *sp) qla2x00_clear_loop_id(fcport); fcport->flags |= FCF_FABRIC_DEVICE; } else if (fcport->d_id.b24 != rp->id.b24 || - fcport->scan_needed) { + (fcport->scan_needed && + fcport->port_type != FCT_INITIATOR && + fcport->port_type != FCT_NVME_INITIATOR)) { qlt_schedule_sess_for_deletion(fcport); } fcport->d_id.b24 = rp->id.b24;
From: Steffen Maier maier@linux.ibm.com
commit 936e6b85da0476dd2edac7c51c68072da9fb4ba2 upstream.
Suppose that, for unrelated reasons, FSF requests on behalf of recovery are very slow and can run into the ERP timeout.
In the case at hand, we did adapter recovery to a large degree. However due to the slowness a LUN open is pending so the corresponding fc_rport remains blocked. After fast_io_fail_tmo we trigger close physical port recovery for the port under which the LUN should have been opened. The new higher order port recovery dismisses the pending LUN open ERP action and dismisses the pending LUN open FSF request. Such dismissal decouples the ERP action from the pending corresponding FSF request by setting zfcp_fsf_req->erp_action to NULL (among other things) [zfcp_erp_strategy_check_fsfreq()].
If now the ERP timeout for the pending open LUN request runs out, we must not use zfcp_fsf_req->erp_action in the ERP timeout handler. This is a problem since v4.15 commit 75492a51568b ("s390/scsi: Convert timers to use timer_setup()"). Before that we intentionally only passed zfcp_erp_action as context argument to zfcp_erp_timeout_handler().
Note: The lifetime of the corresponding zfcp_fsf_req object continues until a (late) response or an (unrelated) adapter recovery.
Just like the regular response path ignores dismissed requests [zfcp_fsf_req_complete() => zfcp_fsf_protstatus_eval() => return early] the ERP timeout handler now needs to ignore dismissed requests. So simply return early in the ERP timeout handler if the FSF request is marked as dismissed in its status flags. To protect against the race where zfcp_erp_strategy_check_fsfreq() dismisses and sets zfcp_fsf_req->erp_action to NULL after our previous status flag check, return early if zfcp_fsf_req->erp_action is NULL. After all, the former ERP action does not need to be woken up as that was already done as part of the dismissal above [zfcp_erp_action_dismiss()].
This fixes the following panic due to kernel page fault in IRQ context:
Unable to handle kernel pointer dereference in virtual kernel address space Failing address: 0000000000000000 TEID: 0000000000000483 Fault in home space mode while using kernel ASCE. AS:000009859238c00b R2:00000e3e7ffd000b R3:00000e3e7ffcc007 S:00000e3e7ffd7000 P:000000000000013d Oops: 0004 ilc:2 [#1] SMP Modules linked in: ... CPU: 82 PID: 311273 Comm: stress Kdump: loaded Tainted: G E X ... Hardware name: IBM 8561 T01 701 (LPAR) Krnl PSW : 0404c00180000000 001fffff80549be0 (zfcp_erp_notify+0x40/0xc0 [zfcp]) R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 Krnl GPRS: 0000000000000080 00000e3d00000000 00000000000000f0 0000000000030000 000000010028e700 000000000400a39c 000000010028e700 00000e3e7cf87e02 0000000010000000 0700098591cb67f0 0000000000000000 0000000000000000 0000033840e9a000 0000000000000000 001fffe008d6bc18 001fffe008d6bbc8 Krnl Code: 001fffff80549bd4: a7180000 lhi %r1,0 001fffff80549bd8: 4120a0f0 la %r2,240(%r10) #001fffff80549bdc: a53e0003 llilh %r3,3 >001fffff80549be0: ba132000 cs %r1,%r3,0(%r2) 001fffff80549be4: a7740037 brc 7,1fffff80549c52 001fffff80549be8: e320b0180004 lg %r2,24(%r11) 001fffff80549bee: e31020e00004 lg %r1,224(%r2) 001fffff80549bf4: 412020e0 la %r2,224(%r2) Call Trace: [<001fffff80549be0>] zfcp_erp_notify+0x40/0xc0 [zfcp] [<00000985915e26f0>] call_timer_fn+0x38/0x190 [<00000985915e2944>] expire_timers+0xfc/0x190 [<00000985915e2ac4>] run_timer_softirq+0xec/0x218 [<0000098591ca7c4c>] __do_softirq+0x144/0x398 [<00000985915110aa>] do_softirq_own_stack+0x72/0x88 [<0000098591551b58>] irq_exit+0xb0/0xb8 [<0000098591510c6a>] do_IRQ+0x82/0xb0 [<0000098591ca7140>] ext_int_handler+0x128/0x12c [<0000098591722d98>] clear_subpage.constprop.13+0x38/0x60 ([<000009859172ae4c>] clear_huge_page+0xec/0x250) [<000009859177e7a2>] do_huge_pmd_anonymous_page+0x32a/0x768 [<000009859172a712>] __handle_mm_fault+0x88a/0x900 [<000009859172a860>] handle_mm_fault+0xd8/0x1b0 [<0000098591529ef6>] do_dat_exception+0x136/0x3e8 [<0000098591ca6d34>] pgm_check_handler+0x1c8/0x220 Last Breaking-Event-Address: [<001fffff80549c88>] zfcp_erp_timeout_handler+0x10/0x18 [zfcp] Kernel panic - not syncing: Fatal exception in interrupt
Link: https://lore.kernel.org/r/20200623140242.98864-1-maier@linux.ibm.com Fixes: 75492a51568b ("s390/scsi: Convert timers to use timer_setup()") Cc: stable@vger.kernel.org #4.15+ Reviewed-by: Julian Wiedmann jwi@linux.ibm.com Signed-off-by: Steffen Maier maier@linux.ibm.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/s390/scsi/zfcp_erp.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c index 3d0bc000f5005..c621e8f0897f2 100644 --- a/drivers/s390/scsi/zfcp_erp.c +++ b/drivers/s390/scsi/zfcp_erp.c @@ -576,7 +576,10 @@ static void zfcp_erp_strategy_check_fsfreq(struct zfcp_erp_action *act) ZFCP_STATUS_ERP_TIMEDOUT)) { req->status |= ZFCP_STATUS_FSFREQ_DISMISSED; zfcp_dbf_rec_run("erscf_1", act); - req->erp_action = NULL; + /* lock-free concurrent access with + * zfcp_erp_timeout_handler() + */ + WRITE_ONCE(req->erp_action, NULL); } if (act->status & ZFCP_STATUS_ERP_TIMEDOUT) zfcp_dbf_rec_run("erscf_2", act); @@ -612,8 +615,14 @@ void zfcp_erp_notify(struct zfcp_erp_action *erp_action, unsigned long set_mask) void zfcp_erp_timeout_handler(struct timer_list *t) { struct zfcp_fsf_req *fsf_req = from_timer(fsf_req, t, timer); - struct zfcp_erp_action *act = fsf_req->erp_action; + struct zfcp_erp_action *act;
+ if (fsf_req->status & ZFCP_STATUS_FSFREQ_DISMISSED) + return; + /* lock-free concurrent access with zfcp_erp_strategy_check_fsfreq() */ + act = READ_ONCE(fsf_req->erp_action); + if (!act) + return; zfcp_erp_notify(act, ZFCP_STATUS_ERP_TIMEDOUT); }
From: Xiyu Yang xiyuyang19@fudan.edu.cn
commit 77577de64167aa0643d47ffbaacf3642632b321b upstream.
open_shroot() invokes kref_get(), which increases the refcount of the "tcon->crfid" object. When open_shroot() returns not zero, it means the open operation failed and close_shroot() will not be called to decrement the refcount of the "tcon->crfid".
The reference counting issue happens in one normal path of open_shroot(). When the cached root have been opened successfully in a concurrent process, the function increases the refcount and jump to "oshr_free" to return. However the current return value "rc" may not equal to 0, thus the increased refcount will not be balanced outside the function, causing a refcnt leak.
Fix this issue by setting the value of "rc" to 0 before jumping to "oshr_free" label.
Signed-off-by: Xiyu Yang xiyuyang19@fudan.edu.cn Signed-off-by: Xin Tan tanxin.ctf@gmail.com Signed-off-by: Steve French stfrench@microsoft.com CC: Stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/cifs/smb2ops.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c index f829f4165d38c..c6a4caa7053e9 100644 --- a/fs/cifs/smb2ops.c +++ b/fs/cifs/smb2ops.c @@ -759,6 +759,7 @@ int open_shroot(unsigned int xid, struct cifs_tcon *tcon, /* close extra handle outside of crit sec */ SMB2_close(xid, tcon, fid.persistent_fid, fid.volatile_fid); } + rc = 0; goto oshr_free; }
From: Zhang Xiaoxu zhangxiaoxu5@huawei.com
commit acc91c2d8de4ef46ed751c5f9df99ed9a109b100 upstream.
When punch hole success, we also can read old data from file: # strace -e trace=pread64,fallocate xfs_io -f -c "pread 20 40" \ -c "fpunch 20 40" -c"pread 20 40" file pread64(3, " version 5.8.0-rc1+"..., 40, 20) = 40 fallocate(3, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 20, 40) = 0 pread64(3, " version 5.8.0-rc1+"..., 40, 20) = 40
CIFS implements the fallocate(FALLOCATE_FL_PUNCH_HOLE) with send SMB ioctl(FSCTL_SET_ZERO_DATA) to server. It just set the range of the remote file to zero, but local page caches not updated, then the local page caches inconsistent with server.
Also can be found by xfstests generic/316.
So, we need to remove the page caches before send the SMB ioctl(FSCTL_SET_ZERO_DATA) to server.
Fixes: 31742c5a33176 ("enable fallocate punch hole ("fallocate -p") for SMB3") Suggested-by: Pavel Shilovsky pshilov@microsoft.com Reviewed-by: Pavel Shilovsky pshilov@microsoft.com Signed-off-by: Zhang Xiaoxu zhangxiaoxu5@huawei.com Cc: stable@vger.kernel.org # v3.17 Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/cifs/smb2ops.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c index c6a4caa7053e9..3de9113eb8e35 100644 --- a/fs/cifs/smb2ops.c +++ b/fs/cifs/smb2ops.c @@ -3211,6 +3211,12 @@ static long smb3_punch_hole(struct file *file, struct cifs_tcon *tcon, return rc; }
+ /* + * We implement the punch hole through ioctl, so we need remove the page + * caches first, otherwise the data may be inconsistent with the server. + */ + truncate_pagecache_range(inode, offset, offset + len - 1); + cifs_dbg(FYI, "Offset %lld len %lld\n", offset, len);
fsctl_buf.FileOffset = cpu_to_le64(offset);
From: Zhang Xiaoxu zhangxiaoxu5@huawei.com
commit 6b69040247e14b43419a520f841f2b3052833df9 upstream.
CIFS implements the fallocate(FALLOC_FL_ZERO_RANGE) with send SMB ioctl(FSCTL_SET_ZERO_DATA) to server. It just set the range of the remote file to zero, but local page cache not update, then the data inconsistent with server, which leads the xfstest generic/008 failed.
So we need to remove the local page caches before send SMB ioctl(FSCTL_SET_ZERO_DATA) to server. After next read, it will re-cache it.
Fixes: 30175628bf7f5 ("[SMB3] Enable fallocate -z support for SMB3 mounts") Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Zhang Xiaoxu zhangxiaoxu5@huawei.com Reviewed-by: Pavel Shilovsky pshilov@microsoft.com Cc: stable@vger.kernel.org # v3.17 Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/cifs/smb2ops.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c index 3de9113eb8e35..6fc69c3b2749d 100644 --- a/fs/cifs/smb2ops.c +++ b/fs/cifs/smb2ops.c @@ -3145,6 +3145,11 @@ static long smb3_zero_range(struct file *file, struct cifs_tcon *tcon, trace_smb3_zero_enter(xid, cfile->fid.persistent_fid, tcon->tid, ses->Suid, offset, len);
+ /* + * We zero the range through ioctl, so we need remove the page caches + * first, otherwise the data may be inconsistent with the server. + */ + truncate_pagecache_range(inode, offset, offset + len - 1);
/* if file not oplocked can't be sure whether asking to extend size */ if (!CIFS_CACHE_READ(cifsi))
From: Mathias Nyman mathias.nyman@linux.intel.com
commit dceea67058fe22075db3aed62d5cb62092be5053 upstream.
EP_STATE_MASK should be 0x7 instead of 0xf
xhci spec 6.2.3 shows that the EP state field in the endpoint context data structure consist of bits [2:0]. The old value included a bit from the next field which fortunately is a RsvdZ region. So hopefully this hasn't caused too much harm
Cc: stable@vger.kernel.org Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20200624135949.22611-2-mathias.nyman@linux.intel.c... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h index 86cfefdd6632b..c80710e474769 100644 --- a/drivers/usb/host/xhci.h +++ b/drivers/usb/host/xhci.h @@ -716,7 +716,7 @@ struct xhci_ep_ctx { * 4 - TRB error * 5-7 - reserved */ -#define EP_STATE_MASK (0xf) +#define EP_STATE_MASK (0x7) #define EP_STATE_DISABLED 0 #define EP_STATE_RUNNING 1 #define EP_STATE_HALTED 2
From: Al Cooper alcooperx@gmail.com
commit a73d9d9cfc3cfceabd91fb0b0c13e4062b6dbcd7 upstream.
Unable to complete the enumeration of a USB TV Tuner device.
Per XHCI spec (4.6.5), the EP state field of the input context shall be cleared for a set address command. In the special case of an FS device that has "MaxPacketSize0 = 8", the Linux XHCI driver does not do this before evaluating the context. With an XHCI controller that checks the EP state field for parameter context error this causes a problem in cases such as the device getting reset again after enumeration.
When that field is cleared, the problem does not occur.
This was found and fixed by Sasi Kumar.
Cc: stable@vger.kernel.org Signed-off-by: Al Cooper alcooperx@gmail.com Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20200624135949.22611-3-mathias.nyman@linux.intel.c... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c index 10d76205a79ca..dcc987e1d57b0 100644 --- a/drivers/usb/host/xhci.c +++ b/drivers/usb/host/xhci.c @@ -1430,6 +1430,7 @@ static int xhci_check_maxpacket(struct xhci_hcd *xhci, unsigned int slot_id, xhci->devs[slot_id]->out_ctx, ep_index);
ep_ctx = xhci_get_ep_ctx(xhci, command->in_ctx, ep_index); + ep_ctx->ep_info &= cpu_to_le32(~EP_STATE_MASK);/* must clear */ ep_ctx->ep_info2 &= cpu_to_le32(~MAX_PACKET_MASK); ep_ctx->ep_info2 |= cpu_to_le32(MAX_PACKET(max_packet_size));
From: Kai-Heng Feng kai.heng.feng@canonical.com
commit f0c472a6da51f9fac15e80fe2fd9c83b68754cff upstream.
Just return if xHCI is quirked to disable LPM. We can save some time from reading registers and doing spinlocks.
Add stable tag as we want this patch together with the next one, "Poll for U0 after disabling USB2 LPM" which fixes a suspend issue for some USB2 LPM devices
Cc: stable@vger.kernel.org Signed-off-by: Kai-Heng Feng kai.heng.feng@canonical.com Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20200624135949.22611-5-mathias.nyman@linux.intel.c... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c index dcc987e1d57b0..ed468eed299c5 100644 --- a/drivers/usb/host/xhci.c +++ b/drivers/usb/host/xhci.c @@ -4391,6 +4391,9 @@ static int xhci_set_usb2_hardware_lpm(struct usb_hcd *hcd, int hird, exit_latency; int ret;
+ if (xhci->quirks & XHCI_HW_LPM_DISABLE) + return -EPERM; + if (hcd->speed >= HCD_USB3 || !xhci->hw_lpm_support || !udev->lpm_capable) return -EPERM; @@ -4413,7 +4416,7 @@ static int xhci_set_usb2_hardware_lpm(struct usb_hcd *hcd, xhci_dbg(xhci, "%s port %d USB2 hardware LPM\n", enable ? "enable" : "disable", port_num + 1);
- if (enable && !(xhci->quirks & XHCI_HW_LPM_DISABLE)) { + if (enable) { /* Host supports BESL timeout instead of HIRD */ if (udev->usb2_hw_lpm_besl_capable) { /* if device doesn't have a preferred BESL value use a
From: Joakim Tjernlund joakim.tjernlund@infinera.com
commit 03894573f2913181ee5aae0089f333b2131f2d4b upstream.
USB_DEVICE(0x0424, 0x274e) can send data before cdc_acm is ready, causing garbage chars on the TTY causing stray input to the shell and/or login prompt.
Signed-off-by: Joakim Tjernlund joakim.tjernlund@infinera.com Cc: stable@vger.kernel.org Acked-by: Oliver Neukum oneukum@suse.com Link: https://lore.kernel.org/r/20200605105418.22263-1-joakim.tjernlund@infinera.c... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/class/cdc-acm.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/usb/class/cdc-acm.c b/drivers/usb/class/cdc-acm.c index f67088bb8218b..d5187b50fc828 100644 --- a/drivers/usb/class/cdc-acm.c +++ b/drivers/usb/class/cdc-acm.c @@ -1689,6 +1689,8 @@ static int acm_pre_reset(struct usb_interface *intf)
static const struct usb_device_id acm_ids[] = { /* quirky and broken devices */ + { USB_DEVICE(0x0424, 0x274e), /* Microchip Technology, Inc. (formerly SMSC) */ + .driver_info = DISABLE_ECHO, }, /* DISABLE ECHO in termios flag */ { USB_DEVICE(0x076d, 0x0006), /* Denso Cradle CU-321 */ .driver_info = NO_UNION_NORMAL, },/* has no union descriptor */ { USB_DEVICE(0x17ef, 0x7000), /* Lenovo USB modem */
From: Zheng Bin zhengbin13@huawei.com
[ Upstream commit f4bd34b139a3fa2808c4205f12714c65e1548c6c ]
When a filesystem is mounted on a loop device and on a loop ioctl LOOP_SET_STATUS64, because of kill_bdev, buffer_head mappings are getting destroyed. kill_bdev truncate_inode_pages truncate_inode_pages_range do_invalidatepage block_invalidatepage discard_buffer -->clear BH_Mapped flag
sb_bread __bread_gfp bh = __getblk_gfp -->discard_buffer clear BH_Mapped flag __bread_slow submit_bh submit_bh_wbc BUG_ON(!buffer_mapped(bh)) --> hit this BUG_ON
Fixes: 5db470e229e2 ("loop: drop caches if offset or block_size are changed") Signed-off-by: Zheng Bin zhengbin13@huawei.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/block/loop.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c index da693e6a834e5..418bb4621255a 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1289,7 +1289,7 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info) if (lo->lo_offset != info->lo_offset || lo->lo_sizelimit != info->lo_sizelimit) { sync_blockdev(lo->lo_device); - kill_bdev(lo->lo_device); + invalidate_bdev(lo->lo_device); }
/* I/O need to be drained during transfer transition */ @@ -1320,7 +1320,7 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
if (lo->lo_offset != info->lo_offset || lo->lo_sizelimit != info->lo_sizelimit) { - /* kill_bdev should have truncated all the pages */ + /* invalidate_bdev should have truncated all the pages */ if (lo->lo_device->bd_inode->i_mapping->nrpages) { err = -EAGAIN; pr_warn("%s: loop%d (%s) has still dirty pages (nrpages=%lu)\n", @@ -1565,11 +1565,11 @@ static int loop_set_block_size(struct loop_device *lo, unsigned long arg) return 0;
sync_blockdev(lo->lo_device); - kill_bdev(lo->lo_device); + invalidate_bdev(lo->lo_device);
blk_mq_freeze_queue(lo->lo_queue);
- /* kill_bdev should have truncated all the pages */ + /* invalidate_bdev should have truncated all the pages */ if (lo->lo_device->bd_inode->i_mapping->nrpages) { err = -EAGAIN; pr_warn("%s: loop%d (%s) has still dirty pages (nrpages=%lu)\n",
From: Shay Drory shayd@mellanox.com
commit 116a1b9f1cb769b83e5adff323f977a62b1dcb2e upstream.
Currently, when RMPP MADs are processed while the MAD agent is destroyed, it could result in use after free of rmpp_recv, as decribed below:
cpu-0 cpu-1 ----- ----- ib_mad_recv_done() ib_mad_complete_recv() ib_process_rmpp_recv_wc() unregister_mad_agent() ib_cancel_rmpp_recvs() cancel_delayed_work() process_rmpp_data() start_rmpp() queue_delayed_work(rmpp_recv->cleanup_work) destroy_rmpp_recv() free_rmpp_recv() cleanup_work()[1] spin_lock_irqsave(&rmpp_recv->agent->lock) <-- use after free
[1] cleanup_work() == recv_cleanup_handler
Fix it by waiting for the MAD agent reference count becoming zero before calling to ib_cancel_rmpp_recvs().
Fixes: 9a41e38a467c ("IB/mad: Use IDR for agent IDs") Link: https://lore.kernel.org/r/20200621104738.54850-2-leon@kernel.org Signed-off-by: Shay Drory shayd@mellanox.com Reviewed-by: Maor Gottlieb maorg@mellanox.com Signed-off-by: Leon Romanovsky leonro@mellanox.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/infiniband/core/mad.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c index c54db13fa9b0a..f7626ebcf31cf 100644 --- a/drivers/infiniband/core/mad.c +++ b/drivers/infiniband/core/mad.c @@ -639,10 +639,10 @@ static void unregister_mad_agent(struct ib_mad_agent_private *mad_agent_priv) xa_erase(&ib_mad_clients, mad_agent_priv->agent.hi_tid);
flush_workqueue(port_priv->wq); - ib_cancel_rmpp_recvs(mad_agent_priv);
deref_mad_agent(mad_agent_priv); wait_for_completion(&mad_agent_priv->comp); + ib_cancel_rmpp_recvs(mad_agent_priv);
ib_mad_agent_security_cleanup(&mad_agent_priv->agent);
From: Dennis Dalessandro dennis.dalessandro@intel.com
commit 822fbd37410639acdae368ea55477ddd3498651d upstream.
When the try_module_get calls were removed from opening and closing of the i2c debugfs file, the corresponding module_put calls were missed. This results in an inaccurate module use count that requires a power cycle to fix.
Fixes: 09fbca8e6240 ("IB/hfi1: No need to use try_module_get for debugfs") Link: https://lore.kernel.org/r/20200623203230.106975.76240.stgit@awfm-01.aw.intel... Cc: stable@vger.kernel.org Reviewed-by: Kaike Wan kaike.wan@intel.com Reviewed-by: Mike Marciniszyn mike.marciniszyn@intel.com Signed-off-by: Dennis Dalessandro dennis.dalessandro@intel.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/infiniband/hw/hfi1/debugfs.c | 19 ++----------------- 1 file changed, 2 insertions(+), 17 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/debugfs.c b/drivers/infiniband/hw/hfi1/debugfs.c index 4633a0ce1a8c0..2ced236e1553c 100644 --- a/drivers/infiniband/hw/hfi1/debugfs.c +++ b/drivers/infiniband/hw/hfi1/debugfs.c @@ -985,15 +985,10 @@ static ssize_t qsfp2_debugfs_read(struct file *file, char __user *buf, static int __i2c_debugfs_open(struct inode *in, struct file *fp, u32 target) { struct hfi1_pportdata *ppd; - int ret;
ppd = private2ppd(fp);
- ret = acquire_chip_resource(ppd->dd, i2c_target(target), 0); - if (ret) /* failed - release the module */ - module_put(THIS_MODULE); - - return ret; + return acquire_chip_resource(ppd->dd, i2c_target(target), 0); }
static int i2c1_debugfs_open(struct inode *in, struct file *fp) @@ -1013,7 +1008,6 @@ static int __i2c_debugfs_release(struct inode *in, struct file *fp, u32 target) ppd = private2ppd(fp);
release_chip_resource(ppd->dd, i2c_target(target)); - module_put(THIS_MODULE);
return 0; } @@ -1031,18 +1025,10 @@ static int i2c2_debugfs_release(struct inode *in, struct file *fp) static int __qsfp_debugfs_open(struct inode *in, struct file *fp, u32 target) { struct hfi1_pportdata *ppd; - int ret; - - if (!try_module_get(THIS_MODULE)) - return -ENODEV;
ppd = private2ppd(fp);
- ret = acquire_chip_resource(ppd->dd, i2c_target(target), 0); - if (ret) /* failed - release the module */ - module_put(THIS_MODULE); - - return ret; + return acquire_chip_resource(ppd->dd, i2c_target(target), 0); }
static int qsfp1_debugfs_open(struct inode *in, struct file *fp) @@ -1062,7 +1048,6 @@ static int __qsfp_debugfs_release(struct inode *in, struct file *fp, u32 target) ppd = private2ppd(fp);
release_chip_resource(ppd->dd, i2c_target(target)); - module_put(THIS_MODULE);
return 0; }
From: Tony Lindgren tony@atomide.com
[ Upstream commit 5ce8aee81be6c8bc19051d7c7b0d3cbb7ac5fc3f ]
Looks like we're missing flush of posted write after module enable and disable. I've seen occasional errors accessing various modules, and it is suspected that the lack of posted writes can also cause random reboots.
The errors we can see are similar to the one below from spi for example:
44000000.ocp:L3 Custom Error: MASTER MPU TARGET L4CFG (Read): Data Access in User mode during Functional access ... mcspi_wait_for_reg_bit omap2_mcspi_transfer_one spi_transfer_one_message ...
We also want to also flush posted write for disable. The clkctrl clock disable happens after module disable, and we don't want to have the module potentially stay active while we're trying to disable the clock.
Fixes: d59b60564cbf ("bus: ti-sysc: Add generic enable/disable functions") Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bus/ti-sysc.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/bus/ti-sysc.c b/drivers/bus/ti-sysc.c index e5f5f48d69d2f..369c97c3e0c0b 100644 --- a/drivers/bus/ti-sysc.c +++ b/drivers/bus/ti-sysc.c @@ -991,6 +991,9 @@ set_autoidle: sysc_write_sysconfig(ddata, reg); }
+ /* Flush posted write */ + sysc_read(ddata, ddata->offsets[SYSC_SYSCONFIG]); + if (ddata->module_enable_quirk) ddata->module_enable_quirk(ddata);
@@ -1071,6 +1074,9 @@ set_sidle: reg |= 1 << regbits->autoidle_shift; sysc_write_sysconfig(ddata, reg);
+ /* Flush posted write */ + sysc_read(ddata, ddata->offsets[SYSC_SYSCONFIG]); + return 0; }
From: Tony Lindgren tony@atomide.com
[ Upstream commit d46f9fbec71997420e4fb83c04d9affdf423f879 ]
Some modules reset automatically when idled, and when re-enabled, we must wait for the automatic OCP softreset to complete. And if optional clocks are configured, we need to keep the clocks on while waiting for the reset to complete.
Let's fix the issue by moving the OCP softreset code to a separate function sysc_wait_softreset(), and call it also from sysc_enable_module() with the optional clocks enabled.
This is based on what we're already doing for legacy platform data booting in _enable_sysc().
Fixes: 7324a7a0d5e2 ("bus: ti-sysc: Implement display subsystem reset quirk") Reported-by: Faiz Abbas faiz_abbas@ti.com Cc: Laurent Pinchart laurent.pinchart@ideasonboard.com Cc: Tomi Valkeinen tomi.valkeinen@ti.com Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bus/ti-sysc.c | 80 ++++++++++++++++++++++++++++++++----------- 1 file changed, 60 insertions(+), 20 deletions(-)
diff --git a/drivers/bus/ti-sysc.c b/drivers/bus/ti-sysc.c index 369c97c3e0c0b..a3a2c269e9ad7 100644 --- a/drivers/bus/ti-sysc.c +++ b/drivers/bus/ti-sysc.c @@ -221,6 +221,35 @@ static u32 sysc_read_sysstatus(struct sysc *ddata) return sysc_read(ddata, offset); }
+/* Poll on reset status */ +static int sysc_wait_softreset(struct sysc *ddata) +{ + u32 sysc_mask, syss_done, rstval; + int syss_offset, error = 0; + + syss_offset = ddata->offsets[SYSC_SYSSTATUS]; + sysc_mask = BIT(ddata->cap->regbits->srst_shift); + + if (ddata->cfg.quirks & SYSS_QUIRK_RESETDONE_INVERTED) + syss_done = 0; + else + syss_done = ddata->cfg.syss_mask; + + if (syss_offset >= 0) { + error = readx_poll_timeout(sysc_read_sysstatus, ddata, rstval, + (rstval & ddata->cfg.syss_mask) == + syss_done, + 100, MAX_MODULE_SOFTRESET_WAIT); + + } else if (ddata->cfg.quirks & SYSC_QUIRK_RESET_STATUS) { + error = readx_poll_timeout(sysc_read_sysconfig, ddata, rstval, + !(rstval & sysc_mask), + 100, MAX_MODULE_SOFTRESET_WAIT); + } + + return error; +} + static int sysc_add_named_clock_from_child(struct sysc *ddata, const char *name, const char *optfck_name) @@ -925,8 +954,34 @@ static int sysc_enable_module(struct device *dev) struct sysc *ddata; const struct sysc_regbits *regbits; u32 reg, idlemodes, best_mode; + int error;
ddata = dev_get_drvdata(dev); + + /* + * Some modules like DSS reset automatically on idle. Enable optional + * reset clocks and wait for OCP softreset to complete. + */ + if (ddata->cfg.quirks & SYSC_QUIRK_OPT_CLKS_IN_RESET) { + error = sysc_enable_opt_clocks(ddata); + if (error) { + dev_err(ddata->dev, + "Optional clocks failed for enable: %i\n", + error); + return error; + } + } + error = sysc_wait_softreset(ddata); + if (error) + dev_warn(ddata->dev, "OCP softreset timed out\n"); + if (ddata->cfg.quirks & SYSC_QUIRK_OPT_CLKS_IN_RESET) + sysc_disable_opt_clocks(ddata); + + /* + * Some subsystem private interconnects, like DSS top level module, + * need only the automatic OCP softreset handling with no sysconfig + * register bits to configure. + */ if (ddata->offsets[SYSC_SYSCONFIG] == -ENODEV) return 0;
@@ -1828,11 +1883,10 @@ static int sysc_legacy_init(struct sysc *ddata) */ static int sysc_reset(struct sysc *ddata) { - int sysc_offset, syss_offset, sysc_val, rstval, error = 0; - u32 sysc_mask, syss_done; + int sysc_offset, sysc_val, error; + u32 sysc_mask;
sysc_offset = ddata->offsets[SYSC_SYSCONFIG]; - syss_offset = ddata->offsets[SYSC_SYSSTATUS];
if (ddata->legacy_mode || ddata->cap->regbits->srst_shift < 0 || @@ -1841,11 +1895,6 @@ static int sysc_reset(struct sysc *ddata)
sysc_mask = BIT(ddata->cap->regbits->srst_shift);
- if (ddata->cfg.quirks & SYSS_QUIRK_RESETDONE_INVERTED) - syss_done = 0; - else - syss_done = ddata->cfg.syss_mask; - if (ddata->pre_reset_quirk) ddata->pre_reset_quirk(ddata);
@@ -1862,18 +1911,9 @@ static int sysc_reset(struct sysc *ddata) if (ddata->post_reset_quirk) ddata->post_reset_quirk(ddata);
- /* Poll on reset status */ - if (syss_offset >= 0) { - error = readx_poll_timeout(sysc_read_sysstatus, ddata, rstval, - (rstval & ddata->cfg.syss_mask) == - syss_done, - 100, MAX_MODULE_SOFTRESET_WAIT); - - } else if (ddata->cfg.quirks & SYSC_QUIRK_RESET_STATUS) { - error = readx_poll_timeout(sysc_read_sysconfig, ddata, rstval, - !(rstval & sysc_mask), - 100, MAX_MODULE_SOFTRESET_WAIT); - } + error = sysc_wait_softreset(ddata); + if (error) + dev_warn(ddata->dev, "OCP softreset timed out\n");
if (ddata->reset_done_quirk) ddata->reset_done_quirk(ddata);
From: Tony Lindgren tony@atomide.com
[ Upstream commit 08b91dd6e547467fad61a7c201ff71080d7ad65a ]
We must ignore the clockactivity bit for most modules and not set it unless specified for the module with SYSC_QUIRK_USE_CLOCKACT. Otherwise the interface clock can be automatically gated constantly causing unexpected performance issues.
Fixes: ae9ae12e9daa ("bus: ti-sysc: Handle clockactivity for enable and disable") Cc: Laurent Pinchart laurent.pinchart@ideasonboard.com Cc: Tomi Valkeinen tomi.valkeinen@ti.com Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bus/ti-sysc.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/bus/ti-sysc.c b/drivers/bus/ti-sysc.c index a3a2c269e9ad7..4f640a635ded1 100644 --- a/drivers/bus/ti-sysc.c +++ b/drivers/bus/ti-sysc.c @@ -988,10 +988,13 @@ static int sysc_enable_module(struct device *dev) regbits = ddata->cap->regbits; reg = sysc_read(ddata, ddata->offsets[SYSC_SYSCONFIG]);
- /* Set CLOCKACTIVITY, we only use it for ick */ + /* + * Set CLOCKACTIVITY, we only use it for ick. And we only configure it + * based on the SYSC_QUIRK_USE_CLOCKACT flag, not based on the hardware + * capabilities. See the old HWMOD_SET_DEFAULT_CLOCKACT flag. + */ if (regbits->clkact_shift >= 0 && - (ddata->cfg.quirks & SYSC_QUIRK_USE_CLOCKACT || - ddata->cfg.sysc_val & BIT(regbits->clkact_shift))) + (ddata->cfg.quirks & SYSC_QUIRK_USE_CLOCKACT)) reg |= SYSC_CLOCACT_ICK << regbits->clkact_shift;
/* Set SIDLE mode */
From: Tony Lindgren tony@atomide.com
[ Upstream commit 085bc0e576a4bf53b67a917c54908f299a2fb949 ]
We are currently only setting the framedonetv_irq disabled for the SoCs that don't have it. But we are never setting it enabled for the SoCs that have it. Let's initialized it to true by default.
Fixes: 7324a7a0d5e2 ("bus: ti-sysc: Implement display subsystem reset quirk") Cc: Laurent Pinchart laurent.pinchart@ideasonboard.com Cc: Tomi Valkeinen tomi.valkeinen@ti.com Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bus/ti-sysc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/bus/ti-sysc.c b/drivers/bus/ti-sysc.c index 4f640a635ded1..db9541f385055 100644 --- a/drivers/bus/ti-sysc.c +++ b/drivers/bus/ti-sysc.c @@ -1552,7 +1552,7 @@ static u32 sysc_quirk_dispc(struct sysc *ddata, int dispc_offset, bool lcd_en, digit_en, lcd2_en = false, lcd3_en = false; const int lcd_en_mask = BIT(0), digit_en_mask = BIT(1); int manager_count; - bool framedonetv_irq; + bool framedonetv_irq = true; u32 val, irq_mask = 0;
switch (sysc_soc->soc) { @@ -1569,6 +1569,7 @@ static u32 sysc_quirk_dispc(struct sysc *ddata, int dispc_offset, break; case SOC_AM4: manager_count = 1; + framedonetv_irq = false; break; case SOC_UNKNOWN: default:
From: Tony Lindgren tony@atomide.com
[ Upstream commit 77cad9dbc957f23a73169e8a8971186744296614 ]
We must check for "dss_core" instead of "dss" to avoid also matching also "dss_dispc". This only matters for the mixed case of data configured in device tree but with legacy booting ti,hwmods property still enabled.
Fixes: 8b30919a4e3c ("ARM: OMAP2+: Handle reset quirks for dynamically allocated modules") Cc: Laurent Pinchart laurent.pinchart@ideasonboard.com Cc: Tomi Valkeinen tomi.valkeinen@ti.com Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/mach-omap2/omap_hwmod.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/mach-omap2/omap_hwmod.c b/arch/arm/mach-omap2/omap_hwmod.c index 82706af307dee..c630457bb228e 100644 --- a/arch/arm/mach-omap2/omap_hwmod.c +++ b/arch/arm/mach-omap2/omap_hwmod.c @@ -3489,7 +3489,7 @@ static const struct omap_hwmod_reset dra7_reset_quirks[] = { };
static const struct omap_hwmod_reset omap_reset_quirks[] = { - { .match = "dss", .len = 3, .reset = omap_dss_reset, }, + { .match = "dss_core", .len = 8, .reset = omap_dss_reset, }, { .match = "hdq1w", .len = 5, .reset = omap_hdq1w_reset, }, { .match = "i2c", .len = 3, .reset = omap_i2c_reset, }, { .match = "wd_timer", .len = 8, .reset = omap2_wd_timer_reset, },
From: Huy Nguyen huyn@mellanox.com
[ Upstream commit 94579ac3f6d0820adc83b5dc5358ead0158101e9 ]
During IPsec performance testing, we see bad ICMP checksum. The error packet has duplicated ESP trailer due to double validate_xmit_xfrm calls. The first call is from ip_output, but the packet cannot be sent because netif_xmit_frozen_or_stopped is true and the packet gets dev_requeue_skb. The second call is from NET_TX softirq. However after the first call, the packet already has the ESP trailer.
Fix by marking the skb with XFRM_XMIT bit after the packet is handled by validate_xmit_xfrm to avoid duplicate ESP trailer insertion.
Fixes: f6e27114a60a ("net: Add a xfrm validate function to validate_xmit_skb") Signed-off-by: Huy Nguyen huyn@mellanox.com Reviewed-by: Boris Pismenny borisp@mellanox.com Reviewed-by: Raed Salem raeds@mellanox.com Reviewed-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/xfrm.h | 1 + net/xfrm/xfrm_device.c | 4 +++- 2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/include/net/xfrm.h b/include/net/xfrm.h index 8f71c111e65af..03024701c79f7 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -1013,6 +1013,7 @@ struct xfrm_offload { #define XFRM_GRO 32 #define XFRM_ESP_NO_TRAILER 64 #define XFRM_DEV_RESUME 128 +#define XFRM_XMIT 256
__u32 status; #define CRYPTO_SUCCESS 1 diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c index f50d1f97cf8ec..626096bd0d294 100644 --- a/net/xfrm/xfrm_device.c +++ b/net/xfrm/xfrm_device.c @@ -108,7 +108,7 @@ struct sk_buff *validate_xmit_xfrm(struct sk_buff *skb, netdev_features_t featur struct xfrm_offload *xo = xfrm_offload(skb); struct sec_path *sp;
- if (!xo) + if (!xo || (xo->flags & XFRM_XMIT)) return skb;
if (!(features & NETIF_F_HW_ESP)) @@ -129,6 +129,8 @@ struct sk_buff *validate_xmit_xfrm(struct sk_buff *skb, netdev_features_t featur return skb; }
+ xo->flags |= XFRM_XMIT; + if (skb_is_gso(skb)) { struct net_device *dev = skb->dev;
From: Oskar Holmlund oskar@ohdata.se
[ Upstream commit 3f311e8993ed18fb7325373ec0f82a7f8e8be82e ]
AM335x TRM: Table 2-1 defines USBSS - USB Queue Manager in memory region 0x4740 0000 to 0x4740 7FFF.
Looks like the older TRM revisions list the range from 0x5000 to 0x8000 as reserved.
Fixes: 0782e8572ce4 ("ARM: dts: Probe am335x musb with ti-sysc") Signed-off-by: Oskar Holmlund oskar@ohdata.se [tony@atomide.com: updated comments] Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/am33xx.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/am33xx.dtsi b/arch/arm/boot/dts/am33xx.dtsi index a35f5052d76f6..be76ded7e4c0c 100644 --- a/arch/arm/boot/dts/am33xx.dtsi +++ b/arch/arm/boot/dts/am33xx.dtsi @@ -347,7 +347,7 @@ clock-names = "fck"; #address-cells = <1>; #size-cells = <1>; - ranges = <0x0 0x47400000 0x5000>; + ranges = <0x0 0x47400000 0x8000>;
usb0_phy: usb-phy@1300 { compatible = "ti,am335x-usb-phy";
From: Oskar Holmlund oskar@ohdata.se
[ Upstream commit 9f872f924545324a06fa216ad38132804c20f2db ]
AM335x TRM: Figure 16-23 define sysconfig register and soft_reset are in first position corresponding to SYSC_OMAP4_SOFTRESET defined in ti-sysc.h.
Fixes: 0782e8572ce4 ("ARM: dts: Probe am335x musb with ti-sysc") Signed-off-by: Oskar Holmlund oskar@ohdata.se Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/am33xx.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/am33xx.dtsi b/arch/arm/boot/dts/am33xx.dtsi index be76ded7e4c0c..ed6634d34c3c7 100644 --- a/arch/arm/boot/dts/am33xx.dtsi +++ b/arch/arm/boot/dts/am33xx.dtsi @@ -335,7 +335,7 @@ <0x47400010 0x4>; reg-names = "rev", "sysc"; ti,sysc-mask = <(SYSC_OMAP4_FREEEMU | - SYSC_OMAP2_SOFTRESET)>; + SYSC_OMAP4_SOFTRESET)>; ti,sysc-midle = <SYSC_IDLE_FORCE>, <SYSC_IDLE_NO>, <SYSC_IDLE_SMART>;
From: Srinivas Kandagatla srinivas.kandagatla@linaro.org
[ Upstream commit 6476b60f32866be49d05e2e0163f337374c55b06 ]
Successful send of EOS command does not indicate that EOS is actually finished, correct event to wait EOS is finished is EOS_RENDERED event. EOS_RENDERED means that the DSP has finished processing all the buffers for that particular session and stream.
This patch fixes EOS handling!
Fixes: 68fd8480bb7b ("ASoC: qdsp6: q6asm: Add support to audio stream apis") Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20200611124159.20742-3-srinivas.kandagatla@linaro.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/qcom/qdsp6/q6asm.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/sound/soc/qcom/qdsp6/q6asm.c b/sound/soc/qcom/qdsp6/q6asm.c index 0e0e8f7a460ab..ae4b2cabdf2d6 100644 --- a/sound/soc/qcom/qdsp6/q6asm.c +++ b/sound/soc/qcom/qdsp6/q6asm.c @@ -25,6 +25,7 @@ #define ASM_STREAM_CMD_FLUSH 0x00010BCE #define ASM_SESSION_CMD_PAUSE 0x00010BD3 #define ASM_DATA_CMD_EOS 0x00010BDB +#define ASM_DATA_EVENT_RENDERED_EOS 0x00010C1C #define ASM_NULL_POPP_TOPOLOGY 0x00010C68 #define ASM_STREAM_CMD_FLUSH_READBUFS 0x00010C09 #define ASM_STREAM_CMD_SET_ENCDEC_PARAM 0x00010C10 @@ -622,9 +623,6 @@ static int32_t q6asm_stream_callback(struct apr_device *adev, case ASM_SESSION_CMD_SUSPEND: client_event = ASM_CLIENT_EVENT_CMD_SUSPEND_DONE; break; - case ASM_DATA_CMD_EOS: - client_event = ASM_CLIENT_EVENT_CMD_EOS_DONE; - break; case ASM_STREAM_CMD_FLUSH: client_event = ASM_CLIENT_EVENT_CMD_FLUSH_DONE; break; @@ -727,6 +725,9 @@ static int32_t q6asm_stream_callback(struct apr_device *adev, spin_unlock_irqrestore(&ac->lock, flags); }
+ break; + case ASM_DATA_EVENT_RENDERED_EOS: + client_event = ASM_CLIENT_EVENT_CMD_EOS_DONE; break; }
From: Martin Fuzzey martin.fuzzey@flowbird.group
[ Upstream commit d7442ba13d62de9afc4e57344a676f9f4623dcaa ]
Commit 99f75ce66619 ("regulator: da9063: fix suspend") converted the regulators to use a common (corrected) suspend bit setting but one of regulators (LDO9) slipped through the crack.
This means that the original problem was not fixed for LDO9 and also leads to a warning found by the test robot. da9063-regulator.c:515:3: warning: initialized field overwritten
Fix this by converting that regulator too like the others.
Fixes: 99f75ce66619 ("regulator: da9063: fix suspend") Reported-by: kernel test robot lkp@intel.com
Signed-off-by: Martin Fuzzey martin.fuzzey@flowbird.group Link: https://lore.kernel.org/r/1591959073-16792-1-git-send-email-martin.fuzzey@fl... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/regulator/da9063-regulator.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/regulator/da9063-regulator.c b/drivers/regulator/da9063-regulator.c index e1d6c8f6d40bb..fe65b5acaf280 100644 --- a/drivers/regulator/da9063-regulator.c +++ b/drivers/regulator/da9063-regulator.c @@ -512,7 +512,6 @@ static const struct da9063_regulator_info da9063_regulator_info[] = { }, { DA9063_LDO(DA9063, LDO9, 950, 50, 3600), - .suspend = BFIELD(DA9063_REG_LDO9_CONT, DA9063_VLDO9_SEL), }, { DA9063_LDO(DA9063, LDO11, 900, 50, 3600),
From: Rafał Miłecki rafal@milecki.pl
[ Upstream commit de1f6d9304c38e414552c3565d36286609ced0c1 ]
This property is needed since commit abe60a3a7afb ("ARM: dts: Kill off skeleton{64}.dtsi"). Without it booting silently hangs at: [ 0.000000] Memory policy: Data cache writealloc
Fixes: 984829e2d39b ("ARM: dts: BCM5301X: Add DT for Luxul XWC-2000") Signed-off-by: Rafał Miłecki rafal@milecki.pl Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/bcm47094-luxul-xwc-2000.dts | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/arm/boot/dts/bcm47094-luxul-xwc-2000.dts b/arch/arm/boot/dts/bcm47094-luxul-xwc-2000.dts index 334325390aed0..29bbecd36f65d 100644 --- a/arch/arm/boot/dts/bcm47094-luxul-xwc-2000.dts +++ b/arch/arm/boot/dts/bcm47094-luxul-xwc-2000.dts @@ -17,6 +17,7 @@ };
memory { + device_type = "memory"; reg = <0x00000000 0x08000000 0x88000000 0x18000000>; };
From: Matthew Hagan mnhagan88@gmail.com
[ Upstream commit 0386e9ce5877ee73e07675529d5ae594d00f0900 ]
The NSP SoC includes an SP804 timer so should be enabled here.
Fixes: a0efb0d28b77 ("ARM: dts: NSP: Add SP804 Support to DT") Signed-off-by: Matthew Hagan mnhagan88@gmail.com Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/mach-bcm/Kconfig | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/arm/mach-bcm/Kconfig b/arch/arm/mach-bcm/Kconfig index 6aa938b949db2..1df0ee01ee02b 100644 --- a/arch/arm/mach-bcm/Kconfig +++ b/arch/arm/mach-bcm/Kconfig @@ -53,6 +53,7 @@ config ARCH_BCM_NSP select ARM_ERRATA_754322 select ARM_ERRATA_775420 select ARM_ERRATA_764369 if SMP + select ARM_TIMER_SP804 select THERMAL select THERMAL_OF help
From: Reinette Chatre reinette.chatre@intel.com
[ Upstream commit 0118ad82c2a64ebcf15d7565ed35361407efadfa ]
The function determining a platform's support and properties of cache occupancy and memory bandwidth monitoring (properties of X86_FEATURE_CQM_LLC) can be found among the common CPU code. After the feature's properties is populated in the per-CPU data the resctrl subsystem is the only consumer (via boot_cpu_data).
Move the function that obtains the CPU information used by resctrl to the resctrl subsystem and rename it from init_cqm() to resctrl_cpu_detect(). The function continues to be called from the common CPU code. This move is done in preparation of the addition of some vendor specific code.
No functional change.
Suggested-by: Borislav Petkov bp@suse.de Signed-off-by: Reinette Chatre reinette.chatre@intel.com Signed-off-by: Borislav Petkov bp@suse.de Link: https://lkml.kernel.org/r/38433b99f9d16c8f4ee796f8cc42b871531fa203.158871569... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/resctrl_sched.h | 3 +++ arch/x86/kernel/cpu/common.c | 27 ++------------------------- arch/x86/kernel/cpu/resctrl/core.c | 24 ++++++++++++++++++++++++ 3 files changed, 29 insertions(+), 25 deletions(-)
diff --git a/arch/x86/include/asm/resctrl_sched.h b/arch/x86/include/asm/resctrl_sched.h index f6b7fe2833cc7..c8a27cbbdae29 100644 --- a/arch/x86/include/asm/resctrl_sched.h +++ b/arch/x86/include/asm/resctrl_sched.h @@ -84,9 +84,12 @@ static inline void resctrl_sched_in(void) __resctrl_sched_in(); }
+void resctrl_cpu_detect(struct cpuinfo_x86 *c); + #else
static inline void resctrl_sched_in(void) {} +static inline void resctrl_cpu_detect(struct cpuinfo_x86 *c) {}
#endif /* CONFIG_X86_CPU_RESCTRL */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 8293ee5149758..e2e4af685f9b4 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -56,6 +56,7 @@ #include <asm/intel-family.h> #include <asm/cpu_device_id.h> #include <asm/uv/uv.h> +#include <asm/resctrl_sched.h>
#include "cpu.h"
@@ -854,30 +855,6 @@ static void init_speculation_control(struct cpuinfo_x86 *c) } }
-static void init_cqm(struct cpuinfo_x86 *c) -{ - if (!cpu_has(c, X86_FEATURE_CQM_LLC)) { - c->x86_cache_max_rmid = -1; - c->x86_cache_occ_scale = -1; - return; - } - - /* will be overridden if occupancy monitoring exists */ - c->x86_cache_max_rmid = cpuid_ebx(0xf); - - if (cpu_has(c, X86_FEATURE_CQM_OCCUP_LLC) || - cpu_has(c, X86_FEATURE_CQM_MBM_TOTAL) || - cpu_has(c, X86_FEATURE_CQM_MBM_LOCAL)) { - u32 eax, ebx, ecx, edx; - - /* QoS sub-leaf, EAX=0Fh, ECX=1 */ - cpuid_count(0xf, 1, &eax, &ebx, &ecx, &edx); - - c->x86_cache_max_rmid = ecx; - c->x86_cache_occ_scale = ebx; - } -} - void get_cpu_cap(struct cpuinfo_x86 *c) { u32 eax, ebx, ecx, edx; @@ -945,7 +922,7 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
init_scattered_cpuid_features(c); init_speculation_control(c); - init_cqm(c); + resctrl_cpu_detect(c);
/* * Clear/Set all flags overridden by options, after probe. diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c index d8cc5223b7ce8..49599733fa94d 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -958,6 +958,30 @@ static __init void rdt_init_res_defs(void)
static enum cpuhp_state rdt_online;
+void resctrl_cpu_detect(struct cpuinfo_x86 *c) +{ + if (!cpu_has(c, X86_FEATURE_CQM_LLC)) { + c->x86_cache_max_rmid = -1; + c->x86_cache_occ_scale = -1; + return; + } + + /* will be overridden if occupancy monitoring exists */ + c->x86_cache_max_rmid = cpuid_ebx(0xf); + + if (cpu_has(c, X86_FEATURE_CQM_OCCUP_LLC) || + cpu_has(c, X86_FEATURE_CQM_MBM_TOTAL) || + cpu_has(c, X86_FEATURE_CQM_MBM_LOCAL)) { + u32 eax, ebx, ecx, edx; + + /* QoS sub-leaf, EAX=0Fh, ECX=1 */ + cpuid_count(0xf, 1, &eax, &ebx, &ecx, &edx); + + c->x86_cache_max_rmid = ecx; + c->x86_cache_occ_scale = ebx; + } +} + static int __init resctrl_late_init(void) { struct rdt_resource *r;
From: Reinette Chatre reinette.chatre@intel.com
[ Upstream commit f3d44f18b0662327c42128b9d3604489bdb6e36f ]
The original Memory Bandwidth Monitoring (MBM) architectural definition defines counters of up to 62 bits in the IA32_QM_CTR MSR while the first-generation MBM implementation uses statically defined 24 bit counters.
Expand the MBM CPUID enumeration properties to include the MBM counter width. The previously undefined EAX output register contains, in bits [7:0], the MBM counter width encoded as an offset from 24 bits. Enumerating this property is only specified for Intel CPUs.
Suggested-by: Borislav Petkov bp@suse.de Signed-off-by: Reinette Chatre reinette.chatre@intel.com Signed-off-by: Borislav Petkov bp@suse.de Link: https://lkml.kernel.org/r/afa3af2f753f6bc301fb743bc8944e749cb24afa.158871569... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/processor.h | 3 ++- arch/x86/kernel/cpu/resctrl/core.c | 5 +++++ 2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 3bcf27caf6c9f..c4e8fd709cf67 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -113,9 +113,10 @@ struct cpuinfo_x86 { /* in KB - valid for CPUS which support this call: */ unsigned int x86_cache_size; int x86_cache_alignment; /* In bytes */ - /* Cache QoS architectural values: */ + /* Cache QoS architectural values, valid only on the BSP: */ int x86_cache_max_rmid; /* max index */ int x86_cache_occ_scale; /* scale to bytes */ + int x86_cache_mbm_width_offset; int x86_power; unsigned long loops_per_jiffy; /* cpuid returned max cores value: */ diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c index 49599733fa94d..d242e864726c0 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -963,6 +963,7 @@ void resctrl_cpu_detect(struct cpuinfo_x86 *c) if (!cpu_has(c, X86_FEATURE_CQM_LLC)) { c->x86_cache_max_rmid = -1; c->x86_cache_occ_scale = -1; + c->x86_cache_mbm_width_offset = -1; return; }
@@ -979,6 +980,10 @@ void resctrl_cpu_detect(struct cpuinfo_x86 *c)
c->x86_cache_max_rmid = ecx; c->x86_cache_occ_scale = ebx; + if (c->x86_vendor == X86_VENDOR_INTEL) + c->x86_cache_mbm_width_offset = eax & 0xff; + else + c->x86_cache_mbm_width_offset = -1; } }
From: Babu Moger babu.moger@amd.com
[ Upstream commit 2c18bd525c47f882f033b0a813ecd09c93e1ecdf ]
Memory bandwidth is calculated reading the monitoring counter at two intervals and calculating the delta. It is the software’s responsibility to read the count often enough to avoid having the count roll over _twice_ between reads.
The current code hardcodes the bandwidth monitoring counter's width to 24 bits for AMD. This is due to default base counter width which is 24. Currently, AMD does not implement the CPUID 0xF.[ECX=1]:EAX to adjust the counter width. But, the AMD hardware supports much wider bandwidth counter with the default width of 44 bits.
Kernel reads these monitoring counters every 1 second and adjusts the counter value for overflow. With 24 bits and scale value of 64 for AMD, it can only measure up to 1GB/s without overflowing. For the rates above 1GB/s this will fail to measure the bandwidth.
Fix the issue setting the default width to 44 bits by adjusting the offset.
AMD future products will implement CPUID 0xF.[ECX=1]:EAX.
[ bp: Let the line stick out and drop {}-brackets around a single statement. ]
Fixes: 4d05bf71f157 ("x86/resctrl: Introduce AMD QOS feature") Signed-off-by: Babu Moger babu.moger@amd.com Signed-off-by: Borislav Petkov bp@suse.de Link: https://lkml.kernel.org/r/159129975546.62538.5656031125604254041.stgit@naple... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kernel/cpu/resctrl/core.c | 8 ++++---- arch/x86/kernel/cpu/resctrl/internal.h | 1 + 2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c index d242e864726c0..c1551541c7a53 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -980,10 +980,10 @@ void resctrl_cpu_detect(struct cpuinfo_x86 *c)
c->x86_cache_max_rmid = ecx; c->x86_cache_occ_scale = ebx; - if (c->x86_vendor == X86_VENDOR_INTEL) - c->x86_cache_mbm_width_offset = eax & 0xff; - else - c->x86_cache_mbm_width_offset = -1; + c->x86_cache_mbm_width_offset = eax & 0xff; + + if (c->x86_vendor == X86_VENDOR_AMD && !c->x86_cache_mbm_width_offset) + c->x86_cache_mbm_width_offset = MBM_CNTR_WIDTH_OFFSET_AMD; } }
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h index 3dd13f3a8b231..0963864757143 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -37,6 +37,7 @@ #define MBA_IS_LINEAR 0x4 #define MBA_MAX_MBPS U32_MAX #define MAX_MBA_BW_AMD 0x800 +#define MBM_CNTR_WIDTH_OFFSET_AMD 20
#define RMID_VAL_ERROR BIT_ULL(63) #define RMID_VAL_UNAVAIL BIT_ULL(62)
From: Fabian Vogt fvogt@suse.de
[ Upstream commit 7dfc06a0f25b593a9f51992f540c0f80a57f3629 ]
It is possible that the first event in the event log is not actually a log header at all, but rather a normal event. This leads to the cast in __calc_tpm2_event_size being an invalid conversion, which means that the values read are effectively garbage. Depending on the first event's contents, this leads either to apparently normal behaviour, a crash or a freeze.
While this behaviour of the firmware is not in accordance with the TCG Client EFI Specification, this happens on a Dell Precision 5510 with the TPM enabled but hidden from the OS ("TPM On" disabled, state otherwise untouched). The EFI firmware claims that the TPM is present and active and that it supports the TCG 2.0 event log format.
Fortunately, this can be worked around by simply checking the header of the first event and the event log header signature itself.
Commit b4f1874c6216 ("tpm: check event log version before reading final events") addressed a similar issue also found on Dell models.
Fixes: 6b0326190205 ("efi: Attempt to get the TCG2 event log in the boot stub") Signed-off-by: Fabian Vogt fvogt@suse.de Link: https://lore.kernel.org/r/1927248.evlx2EsYKh@linux-e202.suse.de Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1165773 Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/tpm_eventlog.h | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/include/linux/tpm_eventlog.h b/include/linux/tpm_eventlog.h index c253461b1c4e6..96d36b7a13440 100644 --- a/include/linux/tpm_eventlog.h +++ b/include/linux/tpm_eventlog.h @@ -81,6 +81,8 @@ struct tcg_efi_specid_event_algs { u16 digest_size; } __packed;
+#define TCG_SPECID_SIG "Spec ID Event03" + struct tcg_efi_specid_event_head { u8 signature[16]; u32 platform_class; @@ -171,6 +173,7 @@ static inline int __calc_tpm2_event_size(struct tcg_pcr_event2_head *event, int i; int j; u32 count, event_type; + const u8 zero_digest[sizeof(event_header->digest)] = {0};
marker = event; marker_start = marker; @@ -198,10 +201,19 @@ static inline int __calc_tpm2_event_size(struct tcg_pcr_event2_head *event, count = READ_ONCE(event->count); event_type = READ_ONCE(event->event_type);
+ /* Verify that it's the log header */ + if (event_header->pcr_idx != 0 || + event_header->event_type != NO_ACTION || + memcmp(event_header->digest, zero_digest, sizeof(zero_digest))) { + size = 0; + goto out; + } + efispecid = (struct tcg_efi_specid_event_head *)event_header->event;
/* Check if event is malformed. */ - if (count > efispecid->num_algs) { + if (memcmp(efispecid->signature, TCG_SPECID_SIG, + sizeof(TCG_SPECID_SIG)) || count > efispecid->num_algs) { size = 0; goto out; }
From: Qiushi Wu wu000273@umn.edu
[ Upstream commit 4ddf4739be6e375116c375f0a68bf3893ffcee21 ]
kobject_init_and_add() takes reference even when it fails. If this function returns an error, kobject_put() must be called to properly clean up the memory associated with the object. Previous commit "b8eb718348b8" fixed a similar problem.
Fixes: 0bb549052d33 ("efi: Add esrt support") Signed-off-by: Qiushi Wu wu000273@umn.edu Link: https://lore.kernel.org/r/20200528183804.4497-1-wu000273@umn.edu Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/firmware/efi/esrt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/firmware/efi/esrt.c b/drivers/firmware/efi/esrt.c index e3d6926965834..d5915272141fd 100644 --- a/drivers/firmware/efi/esrt.c +++ b/drivers/firmware/efi/esrt.c @@ -181,7 +181,7 @@ static int esre_create_sysfs_entry(void *esre, int entry_num) rc = kobject_init_and_add(&entry->kobj, &esre1_ktype, NULL, "entry%d", entry_num); if (rc) { - kfree(entry); + kobject_put(&entry->kobj); return rc; } }
From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
[ Upstream commit 96bf62f018f40cb5d4e4bed95e50fd990a2354af ]
soc_dpcm_fe_runtime_update() is called for all dailinks, and we want to first discard all back-ends, then deal with front-ends.
The existing code first reports an error with multi-cpu front-ends, and that check needs to be moved after we know that we are dealing with a front-end.
Fixes: 6e1276a5e613d ('ASoC: Return error if the function does not support multi-cpu') Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com BugLink: https://github.com/thesofproject/linux/issues/1970 Link: https://lore.kernel.org/r/20200612203507.25621-1-pierre-louis.bossart@linux.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/soc-pcm.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c index 39ce61c5b8744..fde097a7aad32 100644 --- a/sound/soc/soc-pcm.c +++ b/sound/soc/soc-pcm.c @@ -2749,15 +2749,15 @@ static int soc_dpcm_fe_runtime_update(struct snd_soc_pcm_runtime *fe, int new) int count, paths; int ret;
+ if (!fe->dai_link->dynamic) + return 0; + if (fe->num_cpus > 1) { dev_err(fe->dev, "%s doesn't support Multi CPU yet\n", __func__); return -EINVAL; }
- if (!fe->dai_link->dynamic) - return 0; - /* only check active links */ if (!fe->cpu_dai->active) return 0;
From: Srinivas Kandagatla srinivas.kandagatla@linaro.org
[ Upstream commit 4a95737440d426e93441d49d11abf4c6526d4666 ]
This patch adds support to q6afe_is_rx_port() to get direction of DSP BE dai port, this is useful for setting dailink directions correctly.
Fixes: c25e295cd77b (ASoC: qcom: Add support to parse common audio device nodes) Reported-by: John Stultz john.stultz@linaro.org Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Reviewed-by: Vinod Koul vkoul@kernel.org Link: https://lore.kernel.org/r/20200612123711.29130-1-srinivas.kandagatla@linaro.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/qcom/qdsp6/q6afe.c | 8 ++++++++ sound/soc/qcom/qdsp6/q6afe.h | 1 + 2 files changed, 9 insertions(+)
diff --git a/sound/soc/qcom/qdsp6/q6afe.c b/sound/soc/qcom/qdsp6/q6afe.c index e0945f7a58c81..0ce4eb60f9848 100644 --- a/sound/soc/qcom/qdsp6/q6afe.c +++ b/sound/soc/qcom/qdsp6/q6afe.c @@ -800,6 +800,14 @@ int q6afe_get_port_id(int index) } EXPORT_SYMBOL_GPL(q6afe_get_port_id);
+int q6afe_is_rx_port(int index) +{ + if (index < 0 || index >= AFE_PORT_MAX) + return -EINVAL; + + return port_maps[index].is_rx; +} +EXPORT_SYMBOL_GPL(q6afe_is_rx_port); static int afe_apr_send_pkt(struct q6afe *afe, struct apr_pkt *pkt, struct q6afe_port *port) { diff --git a/sound/soc/qcom/qdsp6/q6afe.h b/sound/soc/qcom/qdsp6/q6afe.h index c7ed5422baffd..1a0f80a14afea 100644 --- a/sound/soc/qcom/qdsp6/q6afe.h +++ b/sound/soc/qcom/qdsp6/q6afe.h @@ -198,6 +198,7 @@ int q6afe_port_start(struct q6afe_port *port); int q6afe_port_stop(struct q6afe_port *port); void q6afe_port_put(struct q6afe_port *port); int q6afe_get_port_id(int index); +int q6afe_is_rx_port(int index); void q6afe_hdmi_port_prepare(struct q6afe_port *port, struct q6afe_hdmi_cfg *cfg); void q6afe_slim_port_prepare(struct q6afe_port *port,
From: Srinivas Kandagatla srinivas.kandagatla@linaro.org
[ Upstream commit a2120089251f1fe221305e88df99af16f940e236 ]
Currently both FE and BE dai-links are configured bi-directional, However the DSP BE dais are only single directional, so set the directions as supported by the BE dais.
Fixes: c25e295cd77b (ASoC: qcom: Add support to parse common audio device nodes) Reported-by: John Stultz john.stultz@linaro.org Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Tested-by: John Stultz john.stultz@linaro.org Reviewed-by: Vinod Koul vkoul@kernel.org Link: https://lore.kernel.org/r/20200612123711.29130-2-srinivas.kandagatla@linaro.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/qcom/common.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/sound/soc/qcom/common.c b/sound/soc/qcom/common.c index 6c20bdd850f33..8ada4ecba8472 100644 --- a/sound/soc/qcom/common.c +++ b/sound/soc/qcom/common.c @@ -4,6 +4,7 @@
#include <linux/module.h> #include "common.h" +#include "qdsp6/q6afe.h"
int qcom_snd_parse_of(struct snd_soc_card *card) { @@ -101,6 +102,15 @@ int qcom_snd_parse_of(struct snd_soc_card *card) } link->no_pcm = 1; link->ignore_pmdown_time = 1; + + if (q6afe_is_rx_port(link->id)) { + link->dpcm_playback = 1; + link->dpcm_capture = 0; + } else { + link->dpcm_playback = 0; + link->dpcm_capture = 1; + } + } else { dlc = devm_kzalloc(dev, sizeof(*dlc), GFP_KERNEL); if (!dlc) @@ -113,12 +123,12 @@ int qcom_snd_parse_of(struct snd_soc_card *card) link->codecs->dai_name = "snd-soc-dummy-dai"; link->codecs->name = "snd-soc-dummy"; link->dynamic = 1; + link->dpcm_playback = 1; + link->dpcm_capture = 1; }
link->ignore_suspend = 1; link->nonatomic = 1; - link->dpcm_playback = 1; - link->dpcm_capture = 1; link->stream_name = link->name; link++;
From: Robin Gong yibin.gong@nxp.com
[ Upstream commit 6f1cf5257acc6e6242ddf2f52bc7912aed77b79f ]
PFUZE100_SWB_REG is not proper for sw1a/sw2, because enable_mask/enable_reg is not correct. On PFUZE3000, sw1a/sw2 should be the same as sw1a/sw2 on pfuze100 except that voltages are not linear, so add new PFUZE3000_SW_REG and pfuze3000_sw_regulator_ops which like the non-linear PFUZE100_SW_REG and pfuze100_sw_regulator_ops.
Fixes: 1dced996ee70 ("regulator: pfuze100: update voltage setting for pfuze3000 sw1a") Reported-by: Christophe Meynard Christophe.Meynard@ign.fr Signed-off-by: Robin Gong yibin.gong@nxp.com Link: https://lore.kernel.org/r/1592171648-8752-1-git-send-email-yibin.gong@nxp.co... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/regulator/pfuze100-regulator.c | 60 +++++++++++++++++--------- 1 file changed, 39 insertions(+), 21 deletions(-)
diff --git a/drivers/regulator/pfuze100-regulator.c b/drivers/regulator/pfuze100-regulator.c index 689537927f6f7..4c8e8b4722872 100644 --- a/drivers/regulator/pfuze100-regulator.c +++ b/drivers/regulator/pfuze100-regulator.c @@ -209,6 +209,19 @@ static const struct regulator_ops pfuze100_swb_regulator_ops = {
};
+static const struct regulator_ops pfuze3000_sw_regulator_ops = { + .enable = regulator_enable_regmap, + .disable = regulator_disable_regmap, + .is_enabled = regulator_is_enabled_regmap, + .list_voltage = regulator_list_voltage_table, + .map_voltage = regulator_map_voltage_ascend, + .set_voltage_sel = regulator_set_voltage_sel_regmap, + .get_voltage_sel = regulator_get_voltage_sel_regmap, + .set_voltage_time_sel = regulator_set_voltage_time_sel, + .set_ramp_delay = pfuze100_set_ramp_delay, + +}; + #define PFUZE100_FIXED_REG(_chip, _name, base, voltage) \ [_chip ## _ ## _name] = { \ .desc = { \ @@ -318,23 +331,28 @@ static const struct regulator_ops pfuze100_swb_regulator_ops = { .stby_mask = 0x20, \ }
- -#define PFUZE3000_SW2_REG(_chip, _name, base, min, max, step) { \ - .desc = { \ - .name = #_name,\ - .n_voltages = ((max) - (min)) / (step) + 1, \ - .ops = &pfuze100_sw_regulator_ops, \ - .type = REGULATOR_VOLTAGE, \ - .id = _chip ## _ ## _name, \ - .owner = THIS_MODULE, \ - .min_uV = (min), \ - .uV_step = (step), \ - .vsel_reg = (base) + PFUZE100_VOL_OFFSET, \ - .vsel_mask = 0x7, \ - }, \ - .stby_reg = (base) + PFUZE100_STANDBY_OFFSET, \ - .stby_mask = 0x7, \ -} +/* No linar case for the some switches of PFUZE3000 */ +#define PFUZE3000_SW_REG(_chip, _name, base, mask, voltages) \ + [_chip ## _ ## _name] = { \ + .desc = { \ + .name = #_name, \ + .n_voltages = ARRAY_SIZE(voltages), \ + .ops = &pfuze3000_sw_regulator_ops, \ + .type = REGULATOR_VOLTAGE, \ + .id = _chip ## _ ## _name, \ + .owner = THIS_MODULE, \ + .volt_table = voltages, \ + .vsel_reg = (base) + PFUZE100_VOL_OFFSET, \ + .vsel_mask = (mask), \ + .enable_reg = (base) + PFUZE100_MODE_OFFSET, \ + .enable_mask = 0xf, \ + .enable_val = 0x8, \ + .enable_time = 500, \ + }, \ + .stby_reg = (base) + PFUZE100_STANDBY_OFFSET, \ + .stby_mask = (mask), \ + .sw_reg = true, \ + }
#define PFUZE3000_SW3_REG(_chip, _name, base, min, max, step) { \ .desc = { \ @@ -391,9 +409,9 @@ static struct pfuze_regulator pfuze200_regulators[] = { };
static struct pfuze_regulator pfuze3000_regulators[] = { - PFUZE100_SWB_REG(PFUZE3000, SW1A, PFUZE100_SW1ABVOL, 0x1f, pfuze3000_sw1a), + PFUZE3000_SW_REG(PFUZE3000, SW1A, PFUZE100_SW1ABVOL, 0x1f, pfuze3000_sw1a), PFUZE100_SW_REG(PFUZE3000, SW1B, PFUZE100_SW1CVOL, 700000, 1475000, 25000), - PFUZE100_SWB_REG(PFUZE3000, SW2, PFUZE100_SW2VOL, 0x7, pfuze3000_sw2lo), + PFUZE3000_SW_REG(PFUZE3000, SW2, PFUZE100_SW2VOL, 0x7, pfuze3000_sw2lo), PFUZE3000_SW3_REG(PFUZE3000, SW3, PFUZE100_SW3AVOL, 900000, 1650000, 50000), PFUZE100_SWB_REG(PFUZE3000, SWBST, PFUZE100_SWBSTCON1, 0x3, pfuze100_swbst), PFUZE100_SWB_REG(PFUZE3000, VSNVS, PFUZE100_VSNVSVOL, 0x7, pfuze100_vsnvs), @@ -407,8 +425,8 @@ static struct pfuze_regulator pfuze3000_regulators[] = { };
static struct pfuze_regulator pfuze3001_regulators[] = { - PFUZE100_SWB_REG(PFUZE3001, SW1, PFUZE100_SW1ABVOL, 0x1f, pfuze3000_sw1a), - PFUZE100_SWB_REG(PFUZE3001, SW2, PFUZE100_SW2VOL, 0x7, pfuze3000_sw2lo), + PFUZE3000_SW_REG(PFUZE3001, SW1, PFUZE100_SW1ABVOL, 0x1f, pfuze3000_sw1a), + PFUZE3000_SW_REG(PFUZE3001, SW2, PFUZE100_SW2VOL, 0x7, pfuze3000_sw2lo), PFUZE3000_SW3_REG(PFUZE3001, SW3, PFUZE100_SW3AVOL, 900000, 1650000, 50000), PFUZE100_SWB_REG(PFUZE3001, VSNVS, PFUZE100_VSNVSVOL, 0x7, pfuze100_vsnvs), PFUZE100_VGEN_REG(PFUZE3001, VLDO1, PFUZE100_VGEN1VOL, 1800000, 3300000, 100000),
From: Philipp Fent fent@in.tum.de
[ Upstream commit 7a88a6227dc7f2e723bba11ece05e57bd8dce8e4 ]
Commit 9302c1bb8e47 ("efi/libstub: Rewrite file I/O routine") introduced a regression that made a couple of (badly configured) systems fail to boot [1]: Until 5.6, we silently accepted Unix-style file separators in EFI paths, which might violate the EFI standard, but are an easy to make mistake. This fix restores the pre-5.7 behaviour.
[1] https://bbs.archlinux.org/viewtopic.php?id=256273
Fixes: 9302c1bb8e47 ("efi/libstub: Rewrite file I/O routine") Signed-off-by: Philipp Fent fent@in.tum.de Link: https://lore.kernel.org/r/20200615115109.7823-1-fent@in.tum.de [ardb: rewrite as chained if/else statements] Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/firmware/efi/libstub/file.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/drivers/firmware/efi/libstub/file.c b/drivers/firmware/efi/libstub/file.c index ea66b1f16a79d..f1c4faf58c764 100644 --- a/drivers/firmware/efi/libstub/file.c +++ b/drivers/firmware/efi/libstub/file.c @@ -104,12 +104,20 @@ static int find_file_option(const efi_char16_t *cmdline, int cmdline_len, if (!found) return 0;
+ /* Skip any leading slashes */ + while (cmdline[i] == L'/' || cmdline[i] == L'\') + i++; + while (--result_len > 0 && i < cmdline_len) { - if (cmdline[i] == L'\0' || - cmdline[i] == L'\n' || - cmdline[i] == L' ') + efi_char16_t c = cmdline[i++]; + + if (c == L'\0' || c == L'\n' || c == L' ') break; - *result++ = cmdline[i++]; + else if (c == L'/') + /* Replace UNIX dir separators with EFI standard ones */ + *result++ = L'\'; + else + *result++ = c; } *result = L'\0'; return i;
From: Tom Seewald tseewald@gmail.com
[ Upstream commit 6769b275a313c76ddcd7d94c632032326db5f759 ]
The variable buf_addr is type dma_addr_t, which may not be the same size as a pointer. To ensure it is the correct size, cast to a uintptr_t.
Fixes: c536277e0db1 ("RDMA/siw: Fix 64/32bit pointer inconsistency") Link: https://lore.kernel.org/r/20200610174717.15932-1-tseewald@gmail.com Signed-off-by: Tom Seewald tseewald@gmail.com Reviewed-by: Bernard Metzler bmt@zurich.ibm.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/sw/siw/siw_qp_rx.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/sw/siw/siw_qp_rx.c b/drivers/infiniband/sw/siw/siw_qp_rx.c index 650520244ed0c..7271d705f4b06 100644 --- a/drivers/infiniband/sw/siw/siw_qp_rx.c +++ b/drivers/infiniband/sw/siw/siw_qp_rx.c @@ -139,7 +139,8 @@ static int siw_rx_pbl(struct siw_rx_stream *srx, int *pbl_idx, break;
bytes = min(bytes, len); - if (siw_rx_kva(srx, (void *)buf_addr, bytes) == bytes) { + if (siw_rx_kva(srx, (void *)(uintptr_t)buf_addr, bytes) == + bytes) { copied += bytes; offset += bytes; len -= bytes;
From: Matthew Hagan mnhagan88@gmail.com
[ Upstream commit b9dbe0101e344e8339406a11b7a91d4a0c50ad13 ]
Currently the PL330 is enabled by default. However if left in IDM reset, as is the case with the Meraki and Synology NSP devices, the system will hang when probing for the PL330's AMBA peripheral ID. We therefore should be able to disable it in these cases.
The PL330 is also included among of the list of peripherals put into coherent mode, so "dma-coherent" has been added here as well.
Fixes: 5fa1026a3e4d ("ARM: dts: NSP: Add PL330 support") Signed-off-by: Matthew Hagan mnhagan88@gmail.com Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/bcm-nsp.dtsi | 4 +++- arch/arm/boot/dts/bcm958522er.dts | 4 ++++ arch/arm/boot/dts/bcm958525er.dts | 4 ++++ arch/arm/boot/dts/bcm958525xmc.dts | 4 ++++ arch/arm/boot/dts/bcm958622hr.dts | 4 ++++ arch/arm/boot/dts/bcm958623hr.dts | 4 ++++ arch/arm/boot/dts/bcm958625hr.dts | 4 ++++ arch/arm/boot/dts/bcm958625k.dts | 4 ++++ 8 files changed, 31 insertions(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/bcm-nsp.dtsi b/arch/arm/boot/dts/bcm-nsp.dtsi index da6d70f09ef19..920c0f561e5ce 100644 --- a/arch/arm/boot/dts/bcm-nsp.dtsi +++ b/arch/arm/boot/dts/bcm-nsp.dtsi @@ -200,7 +200,7 @@ status = "disabled"; };
- dma@20000 { + dma: dma@20000 { compatible = "arm,pl330", "arm,primecell"; reg = <0x20000 0x1000>; interrupts = <GIC_SPI 47 IRQ_TYPE_LEVEL_HIGH>, @@ -215,6 +215,8 @@ clocks = <&iprocslow>; clock-names = "apb_pclk"; #dma-cells = <1>; + dma-coherent; + status = "disabled"; };
sdio: sdhci@21000 { diff --git a/arch/arm/boot/dts/bcm958522er.dts b/arch/arm/boot/dts/bcm958522er.dts index 8c388eb8a08f8..7be4c4e628e02 100644 --- a/arch/arm/boot/dts/bcm958522er.dts +++ b/arch/arm/boot/dts/bcm958522er.dts @@ -58,6 +58,10 @@
/* USB 3 support needed to be complete */
+&dma { + status = "okay"; +}; + &amac0 { status = "okay"; }; diff --git a/arch/arm/boot/dts/bcm958525er.dts b/arch/arm/boot/dts/bcm958525er.dts index c339771bb22e0..e58ed7e953460 100644 --- a/arch/arm/boot/dts/bcm958525er.dts +++ b/arch/arm/boot/dts/bcm958525er.dts @@ -58,6 +58,10 @@
/* USB 3 support needed to be complete */
+&dma { + status = "okay"; +}; + &amac0 { status = "okay"; }; diff --git a/arch/arm/boot/dts/bcm958525xmc.dts b/arch/arm/boot/dts/bcm958525xmc.dts index 1c72ec8288de4..716da62f57885 100644 --- a/arch/arm/boot/dts/bcm958525xmc.dts +++ b/arch/arm/boot/dts/bcm958525xmc.dts @@ -58,6 +58,10 @@
/* XHCI support needed to be complete */
+&dma { + status = "okay"; +}; + &amac0 { status = "okay"; }; diff --git a/arch/arm/boot/dts/bcm958622hr.dts b/arch/arm/boot/dts/bcm958622hr.dts index 96a021cebd97b..a49c2fd21f4a8 100644 --- a/arch/arm/boot/dts/bcm958622hr.dts +++ b/arch/arm/boot/dts/bcm958622hr.dts @@ -58,6 +58,10 @@
/* USB 3 and SLIC support needed to be complete */
+&dma { + status = "okay"; +}; + &amac0 { status = "okay"; }; diff --git a/arch/arm/boot/dts/bcm958623hr.dts b/arch/arm/boot/dts/bcm958623hr.dts index b2c7f21d471e6..dd6dff6452b87 100644 --- a/arch/arm/boot/dts/bcm958623hr.dts +++ b/arch/arm/boot/dts/bcm958623hr.dts @@ -58,6 +58,10 @@
/* USB 3 and SLIC support needed to be complete */
+&dma { + status = "okay"; +}; + &amac0 { status = "okay"; }; diff --git a/arch/arm/boot/dts/bcm958625hr.dts b/arch/arm/boot/dts/bcm958625hr.dts index 536fb24f38bb7..a71371b4065ed 100644 --- a/arch/arm/boot/dts/bcm958625hr.dts +++ b/arch/arm/boot/dts/bcm958625hr.dts @@ -69,6 +69,10 @@ status = "okay"; };
+&dma { + status = "okay"; +}; + &amac0 { status = "okay"; }; diff --git a/arch/arm/boot/dts/bcm958625k.dts b/arch/arm/boot/dts/bcm958625k.dts index 3fcca12d83c2d..7b84b54436edd 100644 --- a/arch/arm/boot/dts/bcm958625k.dts +++ b/arch/arm/boot/dts/bcm958625k.dts @@ -48,6 +48,10 @@ }; };
+&dma { + status = "okay"; +}; + &amac0 { status = "okay"; };
From: Shengjiu Wang shengjiu.wang@nxp.com
[ Upstream commit ed1220df6e666500ebf58c4f2fccc681941646fb ]
For mono channel, SSI will switch to Normal mode.
In Normal mode and Network mode, the Word Length Control bits control the word length divider in clock generator, which is different with I2S Master mode (the word length is fixed to 32bit), it should be the value of params_width(hw_params).
The condition "slots == 2" is not good for I2S Master mode, because for Network mode and Normal mode, the slots can also be 2. Then we need to use (ssi->i2s_net & SSI_SCR_I2S_MODE_MASK) to check if it is I2S Master mode.
So we refine the formula for mono channel, otherwise there will be sound issue for S24_LE.
Fixes: b0a7043d5c2c ("ASoC: fsl_ssi: Caculate bit clock rate using slot number and width") Signed-off-by: Shengjiu Wang shengjiu.wang@nxp.com Reviewed-by: Nicolin Chen nicoleotsuka@gmail.com Link: https://lore.kernel.org/r/034eff1435ff6ce300b6c781130cefd9db22ab9a.159227614... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/fsl/fsl_ssi.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c index bad89b0d129e7..1a2fa7f181423 100644 --- a/sound/soc/fsl/fsl_ssi.c +++ b/sound/soc/fsl/fsl_ssi.c @@ -678,8 +678,9 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream *substream, struct regmap *regs = ssi->regs; u32 pm = 999, div2, psr, stccr, mask, afreq, factor, i; unsigned long clkrate, baudrate, tmprate; - unsigned int slots = params_channels(hw_params); - unsigned int slot_width = 32; + unsigned int channels = params_channels(hw_params); + unsigned int slot_width = params_width(hw_params); + unsigned int slots = 2; u64 sub, savesub = 100000; unsigned int freq; bool baudclk_is_used; @@ -688,10 +689,14 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream *substream, /* Override slots and slot_width if being specifically set... */ if (ssi->slots) slots = ssi->slots; - /* ...but keep 32 bits if slots is 2 -- I2S Master mode */ - if (ssi->slot_width && slots != 2) + if (ssi->slot_width) slot_width = ssi->slot_width;
+ /* ...but force 32 bits for stereo audio using I2S Master Mode */ + if (channels == 2 && + (ssi->i2s_net & SSI_SCR_I2S_MODE_MASK) == SSI_SCR_I2S_MODE_MASTER) + slot_width = 32; + /* Generate bit clock based on the slot number and slot width */ freq = slots * slot_width * params_rate(hw_params);
From: Lorenzo Bianconi lorenzo@kernel.org
[ Upstream commit 6a09815428547657f3ffd2f5c31ac2a191e7fdf3 ]
xdp_redirect_cpu is currently failing in bpf_prog_load_xattr() allocating cpu_map map if CONFIG_NR_CPUS is less than 64 since cpu_map_alloc() requires max_entries to be less than NR_CPUS. Set cpu_map max_entries according to NR_CPUS in xdp_redirect_cpu_kern.c and get currently running cpus in xdp_redirect_cpu_user.c
Signed-off-by: Lorenzo Bianconi lorenzo@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/374472755001c260158c4e4b22f193bdd3c56fb7.1589300... Signed-off-by: Sasha Levin sashal@kernel.org --- samples/bpf/xdp_redirect_cpu_kern.c | 2 +- samples/bpf/xdp_redirect_cpu_user.c | 29 ++++++++++++++++------------- 2 files changed, 17 insertions(+), 14 deletions(-)
diff --git a/samples/bpf/xdp_redirect_cpu_kern.c b/samples/bpf/xdp_redirect_cpu_kern.c index 313a8fe6d125c..2baf8db1f7e70 100644 --- a/samples/bpf/xdp_redirect_cpu_kern.c +++ b/samples/bpf/xdp_redirect_cpu_kern.c @@ -15,7 +15,7 @@ #include <bpf/bpf_helpers.h> #include "hash_func01.h"
-#define MAX_CPUS 64 /* WARNING - sync with _user.c */ +#define MAX_CPUS NR_CPUS
/* Special map type that can XDP_REDIRECT frames to another CPU */ struct { diff --git a/samples/bpf/xdp_redirect_cpu_user.c b/samples/bpf/xdp_redirect_cpu_user.c index 15bdf047a2221..9b8f21abeac47 100644 --- a/samples/bpf/xdp_redirect_cpu_user.c +++ b/samples/bpf/xdp_redirect_cpu_user.c @@ -13,6 +13,7 @@ static const char *__doc__ = #include <unistd.h> #include <locale.h> #include <sys/resource.h> +#include <sys/sysinfo.h> #include <getopt.h> #include <net/if.h> #include <time.h> @@ -24,8 +25,6 @@ static const char *__doc__ = #include <arpa/inet.h> #include <linux/if_link.h>
-#define MAX_CPUS 64 /* WARNING - sync with _kern.c */ - /* How many xdp_progs are defined in _kern.c */ #define MAX_PROG 6
@@ -40,6 +39,7 @@ static char *ifname; static __u32 prog_id;
static __u32 xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST; +static int n_cpus; static int cpu_map_fd; static int rx_cnt_map_fd; static int redirect_err_cnt_map_fd; @@ -170,7 +170,7 @@ struct stats_record { struct record redir_err; struct record kthread; struct record exception; - struct record enq[MAX_CPUS]; + struct record enq[]; };
static bool map_collect_percpu(int fd, __u32 key, struct record *rec) @@ -225,10 +225,11 @@ static struct datarec *alloc_record_per_cpu(void) static struct stats_record *alloc_stats_record(void) { struct stats_record *rec; - int i; + int i, size;
- rec = malloc(sizeof(*rec)); - memset(rec, 0, sizeof(*rec)); + size = sizeof(*rec) + n_cpus * sizeof(struct record); + rec = malloc(size); + memset(rec, 0, size); if (!rec) { fprintf(stderr, "Mem alloc error\n"); exit(EXIT_FAIL_MEM); @@ -237,7 +238,7 @@ static struct stats_record *alloc_stats_record(void) rec->redir_err.cpu = alloc_record_per_cpu(); rec->kthread.cpu = alloc_record_per_cpu(); rec->exception.cpu = alloc_record_per_cpu(); - for (i = 0; i < MAX_CPUS; i++) + for (i = 0; i < n_cpus; i++) rec->enq[i].cpu = alloc_record_per_cpu();
return rec; @@ -247,7 +248,7 @@ static void free_stats_record(struct stats_record *r) { int i;
- for (i = 0; i < MAX_CPUS; i++) + for (i = 0; i < n_cpus; i++) free(r->enq[i].cpu); free(r->exception.cpu); free(r->kthread.cpu); @@ -350,7 +351,7 @@ static void stats_print(struct stats_record *stats_rec, }
/* cpumap enqueue stats */ - for (to_cpu = 0; to_cpu < MAX_CPUS; to_cpu++) { + for (to_cpu = 0; to_cpu < n_cpus; to_cpu++) { char *fmt = "%-15s %3d:%-3d %'-14.0f %'-11.0f %'-10.2f %s\n"; char *fm2 = "%-15s %3s:%-3d %'-14.0f %'-11.0f %'-10.2f %s\n"; char *errstr = ""; @@ -475,7 +476,7 @@ static void stats_collect(struct stats_record *rec) map_collect_percpu(fd, 1, &rec->redir_err);
fd = cpumap_enqueue_cnt_map_fd; - for (i = 0; i < MAX_CPUS; i++) + for (i = 0; i < n_cpus; i++) map_collect_percpu(fd, i, &rec->enq[i]);
fd = cpumap_kthread_cnt_map_fd; @@ -549,10 +550,10 @@ static int create_cpu_entry(__u32 cpu, __u32 queue_size, */ static void mark_cpus_unavailable(void) { - __u32 invalid_cpu = MAX_CPUS; + __u32 invalid_cpu = n_cpus; int ret, i;
- for (i = 0; i < MAX_CPUS; i++) { + for (i = 0; i < n_cpus; i++) { ret = bpf_map_update_elem(cpus_available_map_fd, &i, &invalid_cpu, 0); if (ret) { @@ -688,6 +689,8 @@ int main(int argc, char **argv) int prog_fd; __u32 qsize;
+ n_cpus = get_nprocs_conf(); + /* Notice: choosing he queue size is very important with the * ixgbe driver, because it's driver page recycling trick is * dependend on pages being returned quickly. The number of @@ -757,7 +760,7 @@ int main(int argc, char **argv) case 'c': /* Add multiple CPUs */ add_cpu = strtoul(optarg, NULL, 0); - if (add_cpu >= MAX_CPUS) { + if (add_cpu >= n_cpus) { fprintf(stderr, "--cpu nr too large for cpumap err(%d):%s\n", errno, strerror(errno));
From: Gaurav Singh gaurav1086@gmail.com
[ Upstream commit 6903cdae9f9f08d61e49c16cbef11c293e33a615 ]
Memset on the pointer right after malloc can cause a NULL pointer deference if it failed to allocate memory. A simple fix is to replace malloc()/memset() pair with a simple call to calloc().
Fixes: 0fca931a6f21 ("samples/bpf: program demonstrating access to xdp_rxq_info") Signed-off-by: Gaurav Singh gaurav1086@gmail.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Jesper Dangaard Brouer brouer@redhat.com Acked-by: John Fastabend john.fastabend@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- samples/bpf/xdp_monitor_user.c | 8 ++------ samples/bpf/xdp_redirect_cpu_user.c | 7 ++----- samples/bpf/xdp_rxq_info_user.c | 13 +++---------- 3 files changed, 7 insertions(+), 21 deletions(-)
diff --git a/samples/bpf/xdp_monitor_user.c b/samples/bpf/xdp_monitor_user.c index dd558cbb23094..ef53b93db5732 100644 --- a/samples/bpf/xdp_monitor_user.c +++ b/samples/bpf/xdp_monitor_user.c @@ -509,11 +509,8 @@ static void *alloc_rec_per_cpu(int record_size) { unsigned int nr_cpus = bpf_num_possible_cpus(); void *array; - size_t size;
- size = record_size * nr_cpus; - array = malloc(size); - memset(array, 0, size); + array = calloc(nr_cpus, record_size); if (!array) { fprintf(stderr, "Mem alloc error (nr_cpus:%u)\n", nr_cpus); exit(EXIT_FAIL_MEM); @@ -528,8 +525,7 @@ static struct stats_record *alloc_stats_record(void) int i;
/* Alloc main stats_record structure */ - rec = malloc(sizeof(*rec)); - memset(rec, 0, sizeof(*rec)); + rec = calloc(1, sizeof(*rec)); if (!rec) { fprintf(stderr, "Mem alloc error\n"); exit(EXIT_FAIL_MEM); diff --git a/samples/bpf/xdp_redirect_cpu_user.c b/samples/bpf/xdp_redirect_cpu_user.c index 9b8f21abeac47..e86fed5cdb92c 100644 --- a/samples/bpf/xdp_redirect_cpu_user.c +++ b/samples/bpf/xdp_redirect_cpu_user.c @@ -210,11 +210,8 @@ static struct datarec *alloc_record_per_cpu(void) { unsigned int nr_cpus = bpf_num_possible_cpus(); struct datarec *array; - size_t size;
- size = sizeof(struct datarec) * nr_cpus; - array = malloc(size); - memset(array, 0, size); + array = calloc(nr_cpus, sizeof(struct datarec)); if (!array) { fprintf(stderr, "Mem alloc error (nr_cpus:%u)\n", nr_cpus); exit(EXIT_FAIL_MEM); @@ -229,11 +226,11 @@ static struct stats_record *alloc_stats_record(void)
size = sizeof(*rec) + n_cpus * sizeof(struct record); rec = malloc(size); - memset(rec, 0, size); if (!rec) { fprintf(stderr, "Mem alloc error\n"); exit(EXIT_FAIL_MEM); } + memset(rec, 0, size); rec->rx_cnt.cpu = alloc_record_per_cpu(); rec->redir_err.cpu = alloc_record_per_cpu(); rec->kthread.cpu = alloc_record_per_cpu(); diff --git a/samples/bpf/xdp_rxq_info_user.c b/samples/bpf/xdp_rxq_info_user.c index 4fe47502ebed4..caa4e7ffcfc7b 100644 --- a/samples/bpf/xdp_rxq_info_user.c +++ b/samples/bpf/xdp_rxq_info_user.c @@ -198,11 +198,8 @@ static struct datarec *alloc_record_per_cpu(void) { unsigned int nr_cpus = bpf_num_possible_cpus(); struct datarec *array; - size_t size;
- size = sizeof(struct datarec) * nr_cpus; - array = malloc(size); - memset(array, 0, size); + array = calloc(nr_cpus, sizeof(struct datarec)); if (!array) { fprintf(stderr, "Mem alloc error (nr_cpus:%u)\n", nr_cpus); exit(EXIT_FAIL_MEM); @@ -214,11 +211,8 @@ static struct record *alloc_record_per_rxq(void) { unsigned int nr_rxqs = bpf_map__def(rx_queue_index_map)->max_entries; struct record *array; - size_t size;
- size = sizeof(struct record) * nr_rxqs; - array = malloc(size); - memset(array, 0, size); + array = calloc(nr_rxqs, sizeof(struct record)); if (!array) { fprintf(stderr, "Mem alloc error (nr_rxqs:%u)\n", nr_rxqs); exit(EXIT_FAIL_MEM); @@ -232,8 +226,7 @@ static struct stats_record *alloc_stats_record(void) struct stats_record *rec; int i;
- rec = malloc(sizeof(*rec)); - memset(rec, 0, sizeof(*rec)); + rec = calloc(1, sizeof(struct stats_record)); if (!rec) { fprintf(stderr, "Mem alloc error\n"); exit(EXIT_FAIL_MEM);
From: Drew Fustini drew@beagleboard.org
[ Upstream commit d7af722344e6dc52d87649100516515263e15c75 ]
AM3358 pin mcasp0_aclkr (ZCZ ball B13) [0] is routed to P1.31 header [1] Mode 4 of this pin is mmc0_sdwp (SD Write Protect). A signal connected to P1.31 may accidentally trigger mmc0 write protection. To avoid this situation, do not put mcasp0_aclkr in mode 4 (mmc0_sdwp) by default.
[0] http://www.ti.com/lit/ds/symlink/am3358.pdf [1] https://github.com/beagleboard/pocketbeagle/wiki/System-Reference-Manual#531...
Fixes: 047905376a16 (ARM: dts: Add am335x-pocketbeagle) Signed-off-by: Robert Nelson robertcnelson@gmail.com Signed-off-by: Drew Fustini drew@beagleboard.org Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/am335x-pocketbeagle.dts | 1 - 1 file changed, 1 deletion(-)
diff --git a/arch/arm/boot/dts/am335x-pocketbeagle.dts b/arch/arm/boot/dts/am335x-pocketbeagle.dts index 4da719098028f..f0b222201b867 100644 --- a/arch/arm/boot/dts/am335x-pocketbeagle.dts +++ b/arch/arm/boot/dts/am335x-pocketbeagle.dts @@ -88,7 +88,6 @@ AM33XX_PADCONF(AM335X_PIN_MMC0_DAT3, PIN_INPUT_PULLUP, MUX_MODE0) AM33XX_PADCONF(AM335X_PIN_MMC0_CMD, PIN_INPUT_PULLUP, MUX_MODE0) AM33XX_PADCONF(AM335X_PIN_MMC0_CLK, PIN_INPUT_PULLUP, MUX_MODE0) - AM33XX_PADCONF(AM335X_PIN_MCASP0_ACLKR, PIN_INPUT, MUX_MODE4) /* (B12) mcasp0_aclkr.mmc0_sdwp */ >; };
From: Tony Lindgren tony@atomide.com
[ Upstream commit 9cf28e41f9f768791f54ee18333239fda6927ed8 ]
While testing the recent suspend and resume regressions I noticed that duovero can still end up losing edge gpio interrupts on runtime suspend. This causes NFSroot easily stopping working after resume on duovero.
Let's fix the issue by using gpio level interrupts for smsc as then the gpio interrupt state is seen by the gpio controller on resume.
Fixes: 731b409878a3 ("ARM: dts: Configure duovero for to allow core retention during idle") Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/omap4-duovero-parlor.dts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/omap4-duovero-parlor.dts b/arch/arm/boot/dts/omap4-duovero-parlor.dts index 8047e8cdb3af0..4548d87534e37 100644 --- a/arch/arm/boot/dts/omap4-duovero-parlor.dts +++ b/arch/arm/boot/dts/omap4-duovero-parlor.dts @@ -139,7 +139,7 @@ ethernet@gpmc { reg = <5 0 0xff>; interrupt-parent = <&gpio2>; - interrupts = <12 IRQ_TYPE_EDGE_FALLING>; /* gpio_44 */ + interrupts = <12 IRQ_TYPE_LEVEL_LOW>; /* gpio_44 */
phy-mode = "mii";
From: David Rientjes rientjes@google.com
[ Upstream commit 96a539fa3bb71f443ae08e57b9f63d6e5bb2207c ]
If arch_dma_set_uncached() fails after memory has been decrypted, it needs to be re-encrypted before freeing.
Fixes: fa7e2247c572 ("dma-direct: make uncached_kernel_address more general") Signed-off-by: David Rientjes rientjes@google.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/dma/direct.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c index 8f4bbdaf965eb..4e789c46ff0bf 100644 --- a/kernel/dma/direct.c +++ b/kernel/dma/direct.c @@ -186,7 +186,7 @@ void *dma_direct_alloc_pages(struct device *dev, size_t size, arch_dma_prep_coherent(page, size); ret = arch_dma_set_uncached(ret, size); if (IS_ERR(ret)) - goto out_free_pages; + goto out_encrypt_pages; } done: if (force_dma_unencrypted(dev)) @@ -194,6 +194,11 @@ done: else *dma_handle = phys_to_dma(dev, page_to_phys(page)); return ret; + +out_encrypt_pages: + if (force_dma_unencrypted(dev)) + set_memory_encrypted((unsigned long)page_address(page), + 1 << get_order(size)); out_free_pages: dma_free_contiguous(dev, page, size); return NULL;
From: David Rientjes rientjes@google.com
[ Upstream commit 56fccf21d1961a06e2a0c96ce446ebf036651062 ]
__change_page_attr() can fail which will cause set_memory_encrypted() and set_memory_decrypted() to return non-zero.
If the device requires unencrypted DMA memory and decryption fails, simply free the memory and fail.
If attempting to re-encrypt in the failure path and that encryption fails, there is no alternative other than to leak the memory.
Fixes: c10f07aa27da ("dma/direct: Handle force decryption for DMA coherent buffers in common code") Signed-off-by: David Rientjes rientjes@google.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/dma/direct.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c index 4e789c46ff0bf..98c445dcb308b 100644 --- a/kernel/dma/direct.c +++ b/kernel/dma/direct.c @@ -124,6 +124,7 @@ void *dma_direct_alloc_pages(struct device *dev, size_t size, { struct page *page; void *ret; + int err;
if (IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) && dma_alloc_need_uncached(dev, attrs) && @@ -176,8 +177,12 @@ void *dma_direct_alloc_pages(struct device *dev, size_t size, }
ret = page_address(page); - if (force_dma_unencrypted(dev)) - set_memory_decrypted((unsigned long)ret, 1 << get_order(size)); + if (force_dma_unencrypted(dev)) { + err = set_memory_decrypted((unsigned long)ret, + 1 << get_order(size)); + if (err) + goto out_free_pages; + }
memset(ret, 0, size);
@@ -196,9 +201,13 @@ done: return ret;
out_encrypt_pages: - if (force_dma_unencrypted(dev)) - set_memory_encrypted((unsigned long)page_address(page), - 1 << get_order(size)); + if (force_dma_unencrypted(dev)) { + err = set_memory_encrypted((unsigned long)page_address(page), + 1 << get_order(size)); + /* If memory cannot be re-encrypted, it must be leaked */ + if (err) + return NULL; + } out_free_pages: dma_free_contiguous(dev, page, size); return NULL;
From: Dan Carpenter dan.carpenter@oracle.com
[ Upstream commit cc5277fe66cf3ad68f41f1c539b2ef0d5e432974 ]
The callers don't expect *d_cdp to be set to an error pointer, they only check for NULL. This leads to a static checker warning:
arch/x86/kernel/cpu/resctrl/rdtgroup.c:2648 __init_one_rdt_domain() warn: 'd_cdp' could be an error pointer
This would not trigger a bug in this specific case because __init_one_rdt_domain() calls it with a valid domain that would not have a negative id and thus not trigger the return of the ERR_PTR(). If this was a negative domain id then the call to rdt_find_domain() in domain_add_cpu() would have returned the ERR_PTR() much earlier and the creation of the domain with an invalid id would have been prevented.
Even though a bug is not triggered currently the right and safe thing to do is to set the pointer to NULL because that is what can be checked for when the caller is handling the CDP and non-CDP cases.
Fixes: 52eb74339a62 ("x86/resctrl: Fix rdt_find_domain() return value and checks") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Borislav Petkov bp@suse.de Acked-by: Reinette Chatre reinette.chatre@intel.com Acked-by: Fenghua Yu fenghua.yu@intel.com Link: https://lkml.kernel.org/r/20200602193611.GA190851@mwanda Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index 5a359d9fcc055..29a3878ab3c01 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -1117,6 +1117,7 @@ static int rdt_cdp_peer_get(struct rdt_resource *r, struct rdt_domain *d, _d_cdp = rdt_find_domain(_r_cdp, d->id, NULL); if (WARN_ON(IS_ERR_OR_NULL(_d_cdp))) { _r_cdp = NULL; + _d_cdp = NULL; ret = -EINVAL; }
From: Arvind Sankar nivedita@alum.mit.edu
[ Upstream commit 41d90b0c1108d1e46c48cf79964636c553844f4c ]
Commit
17054f492dfd ("efi/x86: Implement mixed mode boot without the handover protocol")
introduced a new entry point for the EFI stub to be booted in mixed mode on 32-bit firmware.
When entered via efi32_pe_entry, control is first transferred to startup_32 to setup for the switch to long mode, and then the EFI stub proper is entered via efi_pe_entry. efi_pe_entry is an MS ABI function, and the ABI requires 32 bytes of shadow stack space to be allocated by the caller, as well as the stack being aligned to 8 mod 16 on entry.
Allocate 40 bytes on the stack before switching to 64-bit mode when calling efi_pe_entry to account for this.
For robustness, explicitly align boot_stack_end to 16 bytes. It is currently implicitly aligned since .bss is cacheline-size aligned, head_64.o is the first object file with a .bss section, and the heap and boot sizes are aligned.
Fixes: 17054f492dfd ("efi/x86: Implement mixed mode boot without the handover protocol") Signed-off-by: Arvind Sankar nivedita@alum.mit.edu Link: https://lore.kernel.org/r/20200617131957.2507632-1-nivedita@alum.mit.edu Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/boot/compressed/head_64.S | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S index 76d1d64d51e38..41f7922086220 100644 --- a/arch/x86/boot/compressed/head_64.S +++ b/arch/x86/boot/compressed/head_64.S @@ -213,7 +213,6 @@ SYM_FUNC_START(startup_32) * We place all of the values on our mini stack so lret can * used to perform that far jump. */ - pushl $__KERNEL_CS leal startup_64(%ebp), %eax #ifdef CONFIG_EFI_MIXED movl efi32_boot_args(%ebp), %edi @@ -224,11 +223,20 @@ SYM_FUNC_START(startup_32) movl efi32_boot_args+8(%ebp), %edx // saved bootparams pointer cmpl $0, %edx jnz 1f + /* + * efi_pe_entry uses MS calling convention, which requires 32 bytes of + * shadow space on the stack even if all arguments are passed in + * registers. We also need an additional 8 bytes for the space that + * would be occupied by the return address, and this also results in + * the correct stack alignment for entry. + */ + subl $40, %esp leal efi_pe_entry(%ebp), %eax movl %edi, %ecx // MS calling convention movl %esi, %edx 1: #endif + pushl $__KERNEL_CS pushl %eax
/* Enter paged protected Mode, activating Long Mode */ @@ -776,6 +784,7 @@ SYM_DATA_LOCAL(boot_heap, .fill BOOT_HEAP_SIZE, 1, 0)
SYM_DATA_START_LOCAL(boot_stack) .fill BOOT_STACK_SIZE, 1, 0 + .balign 16 SYM_DATA_END_LABEL(boot_stack, SYM_L_LOCAL, boot_stack_end)
/*
From: Charles Keepax ckeepax@opensource.cirrus.com
[ Upstream commit 95b2c3ec4cb1689db2389c251d39f64490ba641c ]
When a register patch is registered the reg_sequence is copied but the memory allocated is never freed. Add a kfree in regmap_exit to clean it up.
Fixes: 22f0d90a3482 ("regmap: Support register patch sets") Signed-off-by: Charles Keepax ckeepax@opensource.cirrus.com Link: https://lore.kernel.org/r/20200617152129.19655-1-ckeepax@opensource.cirrus.c... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/base/regmap/regmap.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/base/regmap/regmap.c b/drivers/base/regmap/regmap.c index 59f911e577192..508bbd6ea4396 100644 --- a/drivers/base/regmap/regmap.c +++ b/drivers/base/regmap/regmap.c @@ -1356,6 +1356,7 @@ void regmap_exit(struct regmap *map) if (map->hwlock) hwspin_lock_free(map->hwlock); kfree_const(map->name); + kfree(map->patch); kfree(map); } EXPORT_SYMBOL_GPL(regmap_exit);
From: Toke Høiland-Jørgensen toke@redhat.com
[ Upstream commit 99c51064fb06146b3d494b745c947e438a10aaa7 ]
Syzkaller discovered that creating a hash of type devmap_hash with a large number of entries can hit the memory allocator limit for allocating contiguous memory regions. There's really no reason to use kmalloc_array() directly in the devmap code, so just switch it to the existing bpf_map_area_alloc() function that is used elsewhere.
Fixes: 6f9d451ab1a3 ("xdp: Add devmap_hash map type for looking up devices by hashed index") Reported-by: Xiumei Mu xmu@redhat.com Signed-off-by: Toke Høiland-Jørgensen toke@redhat.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: John Fastabend john.fastabend@gmail.com Link: https://lore.kernel.org/bpf/20200616142829.114173-1-toke@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/devmap.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c index 58bdca5d978a8..badf382bbd365 100644 --- a/kernel/bpf/devmap.c +++ b/kernel/bpf/devmap.c @@ -85,12 +85,13 @@ static DEFINE_PER_CPU(struct list_head, dev_flush_list); static DEFINE_SPINLOCK(dev_map_lock); static LIST_HEAD(dev_map_list);
-static struct hlist_head *dev_map_create_hash(unsigned int entries) +static struct hlist_head *dev_map_create_hash(unsigned int entries, + int numa_node) { int i; struct hlist_head *hash;
- hash = kmalloc_array(entries, sizeof(*hash), GFP_KERNEL); + hash = bpf_map_area_alloc(entries * sizeof(*hash), numa_node); if (hash != NULL) for (i = 0; i < entries; i++) INIT_HLIST_HEAD(&hash[i]); @@ -138,7 +139,8 @@ static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr) return -EINVAL;
if (attr->map_type == BPF_MAP_TYPE_DEVMAP_HASH) { - dtab->dev_index_head = dev_map_create_hash(dtab->n_buckets); + dtab->dev_index_head = dev_map_create_hash(dtab->n_buckets, + dtab->map.numa_node); if (!dtab->dev_index_head) goto free_charge;
@@ -223,7 +225,7 @@ static void dev_map_free(struct bpf_map *map) } }
- kfree(dtab->dev_index_head); + bpf_map_area_free(dtab->dev_index_head); } else { for (i = 0; i < dtab->map.max_entries; i++) { struct bpf_dtab_netdev *dev;
From: Stanislav Fomichev sdf@google.com
[ Upstream commit d8fe449a9c51a37d844ab607e14e2f5c657d3cf2 ]
Attaching to these hooks can break iptables because its optval is usually quite big, or at least bigger than the current PAGE_SIZE limit. David also mentioned some SCTP options can be big (around 256k).
For such optvals we expose only the first PAGE_SIZE bytes to the BPF program. BPF program has two options: 1. Set ctx->optlen to 0 to indicate that the BPF's optval should be ignored and the kernel should use original userspace value. 2. Set ctx->optlen to something that's smaller than the PAGE_SIZE.
v5: * use ctx->optlen == 0 with trimmed buffer (Alexei Starovoitov) * update the docs accordingly
v4: * use temporary buffer to avoid optval == optval_end == NULL; this removes the corner case in the verifier that might assume non-zero PTR_TO_PACKET/PTR_TO_PACKET_END.
v3: * don't increase the limit, bypass the argument
v2: * proper comments formatting (Jakub Kicinski)
Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks") Signed-off-by: Stanislav Fomichev sdf@google.com Signed-off-by: Alexei Starovoitov ast@kernel.org Cc: David Laight David.Laight@ACULAB.COM Link: https://lore.kernel.org/bpf/20200617010416.93086-1-sdf@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/cgroup.c | 53 ++++++++++++++++++++++++++++----------------- 1 file changed, 33 insertions(+), 20 deletions(-)
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c index cb305e71e7deb..25aebd21c15b1 100644 --- a/kernel/bpf/cgroup.c +++ b/kernel/bpf/cgroup.c @@ -1240,16 +1240,23 @@ static bool __cgroup_bpf_prog_array_is_empty(struct cgroup *cgrp,
static int sockopt_alloc_buf(struct bpf_sockopt_kern *ctx, int max_optlen) { - if (unlikely(max_optlen > PAGE_SIZE) || max_optlen < 0) + if (unlikely(max_optlen < 0)) return -EINVAL;
+ if (unlikely(max_optlen > PAGE_SIZE)) { + /* We don't expose optvals that are greater than PAGE_SIZE + * to the BPF program. + */ + max_optlen = PAGE_SIZE; + } + ctx->optval = kzalloc(max_optlen, GFP_USER); if (!ctx->optval) return -ENOMEM;
ctx->optval_end = ctx->optval + max_optlen;
- return 0; + return max_optlen; }
static void sockopt_free_buf(struct bpf_sockopt_kern *ctx) @@ -1283,13 +1290,13 @@ int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level, */ max_optlen = max_t(int, 16, *optlen);
- ret = sockopt_alloc_buf(&ctx, max_optlen); - if (ret) - return ret; + max_optlen = sockopt_alloc_buf(&ctx, max_optlen); + if (max_optlen < 0) + return max_optlen;
ctx.optlen = *optlen;
- if (copy_from_user(ctx.optval, optval, *optlen) != 0) { + if (copy_from_user(ctx.optval, optval, min(*optlen, max_optlen)) != 0) { ret = -EFAULT; goto out; } @@ -1317,8 +1324,14 @@ int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level, /* export any potential modifications */ *level = ctx.level; *optname = ctx.optname; - *optlen = ctx.optlen; - *kernel_optval = ctx.optval; + + /* optlen == 0 from BPF indicates that we should + * use original userspace data. + */ + if (ctx.optlen != 0) { + *optlen = ctx.optlen; + *kernel_optval = ctx.optval; + } }
out: @@ -1350,12 +1363,12 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level, __cgroup_bpf_prog_array_is_empty(cgrp, BPF_CGROUP_GETSOCKOPT)) return retval;
- ret = sockopt_alloc_buf(&ctx, max_optlen); - if (ret) - return ret; - ctx.optlen = max_optlen;
+ max_optlen = sockopt_alloc_buf(&ctx, max_optlen); + if (max_optlen < 0) + return max_optlen; + if (!retval) { /* If kernel getsockopt finished successfully, * copy whatever was returned to the user back @@ -1369,10 +1382,8 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level, goto out; }
- if (ctx.optlen > max_optlen) - ctx.optlen = max_optlen; - - if (copy_from_user(ctx.optval, optval, ctx.optlen) != 0) { + if (copy_from_user(ctx.optval, optval, + min(ctx.optlen, max_optlen)) != 0) { ret = -EFAULT; goto out; } @@ -1401,10 +1412,12 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level, goto out; }
- if (copy_to_user(optval, ctx.optval, ctx.optlen) || - put_user(ctx.optlen, optlen)) { - ret = -EFAULT; - goto out; + if (ctx.optlen != 0) { + if (copy_to_user(optval, ctx.optval, ctx.optlen) || + put_user(ctx.optlen, optlen)) { + ret = -EFAULT; + goto out; + } }
ret = ctx.retval;
From: Matthew Hagan mnhagan88@gmail.com
[ Upstream commit ac4e106d8934a5894811fc263f4b03fc8ed0fb7a ]
The FA2 mailbox is specified at 0x18025000 but should actually be 0x18025c00, length 0x400 according to socregs_nsp.h and board_bu.c. Also the interrupt was off by one and should be GIC SPI 151 instead of 150.
Fixes: 17d517172300 ("ARM: dts: NSP: Add mailbox (PDC) to NSP") Signed-off-by: Matthew Hagan mnhagan88@gmail.com Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/bcm-nsp.dtsi | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/arm/boot/dts/bcm-nsp.dtsi b/arch/arm/boot/dts/bcm-nsp.dtsi index 920c0f561e5ce..3175266ede646 100644 --- a/arch/arm/boot/dts/bcm-nsp.dtsi +++ b/arch/arm/boot/dts/bcm-nsp.dtsi @@ -259,10 +259,10 @@ status = "disabled"; };
- mailbox: mailbox@25000 { + mailbox: mailbox@25c00 { compatible = "brcm,iproc-fa2-mbox"; - reg = <0x25000 0x445>; - interrupts = <GIC_SPI 150 IRQ_TYPE_LEVEL_HIGH>; + reg = <0x25c00 0x400>; + interrupts = <GIC_SPI 151 IRQ_TYPE_LEVEL_HIGH>; #mbox-cells = <1>; brcm,rx-status-len = <32>; brcm,use-bcm-hdr;
From: David Howells dhowells@redhat.com
[ Upstream commit a2ad7c21ad8cf1ce4ad65e13df1c2a1c29b38ac5 ]
The handling of the receive window size (rwind) from a received ACK packet is not correct. The rxrpc_input_ackinfo() function currently checks the current Tx window size against the rwind from the ACK to see if it has changed, but then limits the rwind size before storing it in the tx_winsize member and, if it increased, wake up the transmitting process. This means that if rwind > RXRPC_RXTX_BUFF_SIZE - 1, this path will always be followed.
Fix this by limiting rwind before we compare it to tx_winsize.
The effect of this can be seen by enabling the rxrpc_rx_rwind_change tracepoint.
Fixes: 702f2ac87a9a ("rxrpc: Wake up the transmitter if Rx window size increases on the peer") Signed-off-by: David Howells dhowells@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/rxrpc/input.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c index 3be4177baf707..22dec6049e1bb 100644 --- a/net/rxrpc/input.c +++ b/net/rxrpc/input.c @@ -723,13 +723,12 @@ static void rxrpc_input_ackinfo(struct rxrpc_call *call, struct sk_buff *skb, ntohl(ackinfo->rxMTU), ntohl(ackinfo->maxMTU), rwind, ntohl(ackinfo->jumbo_max));
+ if (rwind > RXRPC_RXTX_BUFF_SIZE - 1) + rwind = RXRPC_RXTX_BUFF_SIZE - 1; if (call->tx_winsize != rwind) { - if (rwind > RXRPC_RXTX_BUFF_SIZE - 1) - rwind = RXRPC_RXTX_BUFF_SIZE - 1; if (rwind > call->tx_winsize) wake = true; - trace_rxrpc_rx_rwind_change(call, sp->hdr.serial, - ntohl(ackinfo->rwind), wake); + trace_rxrpc_rx_rwind_change(call, sp->hdr.serial, rwind, wake); call->tx_winsize = rwind; }
From: Aditya Pakki pakki001@umn.edu
[ Upstream commit 90a239ee25fa3a483facec3de7c144361a3d3a51 ]
In case of failure of alloc_ud_wq_attr(), the memory allocated by rvt_alloc_rq() is not freed. Fix it by calling rvt_free_rq() using the existing clean-up code.
Fixes: d310c4bf8aea ("IB/{rdmavt, hfi1, qib}: Remove AH refcount for UD QPs") Link: https://lore.kernel.org/r/20200614041148.131983-1-pakki001@umn.edu Signed-off-by: Aditya Pakki pakki001@umn.edu Acked-by: Dennis Dalessandro dennis.dalessandro@intel.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/sw/rdmavt/qp.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/sw/rdmavt/qp.c b/drivers/infiniband/sw/rdmavt/qp.c index 500a7ee04c44e..ca29954a54ac3 100644 --- a/drivers/infiniband/sw/rdmavt/qp.c +++ b/drivers/infiniband/sw/rdmavt/qp.c @@ -1196,7 +1196,7 @@ struct ib_qp *rvt_create_qp(struct ib_pd *ibpd, err = alloc_ud_wq_attr(qp, rdi->dparms.node); if (err) { ret = (ERR_PTR(err)); - goto bail_driver_priv; + goto bail_rq_rvt; }
err = alloc_qpn(rdi, &rdi->qp_dev->qpn_table, @@ -1300,9 +1300,11 @@ bail_qpn: rvt_free_qpn(&rdi->qp_dev->qpn_table, qp->ibqp.qp_num);
bail_rq_wq: - rvt_free_rq(&qp->r_rq); free_ud_wq_attr(qp);
+bail_rq_rvt: + rvt_free_rq(&qp->r_rq); + bail_driver_priv: rdi->driver_f.qp_priv_free(rdi, qp);
From: Gal Pressman galpress@amazon.com
[ Upstream commit 0133654d8eb8607eacc96badfe49bf992155f4cb ]
The max_pkeys device attribute was not set in query device verb, set it to one in order to account for the default pkey (0xffff). This information is exposed to userspace and can cause malfunction
Fixes: 40909f664d27 ("RDMA/efa: Add EFA verbs implementation") Link: https://lore.kernel.org/r/20200614103534.88060-1-galpress@amazon.com Reviewed-by: Firas JahJah firasj@amazon.com Reviewed-by: Yossi Leybovich sleybo@amazon.com Signed-off-by: Gal Pressman galpress@amazon.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/efa/efa_verbs.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/infiniband/hw/efa/efa_verbs.c b/drivers/infiniband/hw/efa/efa_verbs.c index 5c57098a4aee5..3420c77424861 100644 --- a/drivers/infiniband/hw/efa/efa_verbs.c +++ b/drivers/infiniband/hw/efa/efa_verbs.c @@ -209,6 +209,7 @@ int efa_query_device(struct ib_device *ibdev, props->max_send_sge = dev_attr->max_sq_sge; props->max_recv_sge = dev_attr->max_rq_sge; props->max_sge_rd = dev_attr->max_wr_rdma_sge; + props->max_pkeys = 1;
if (udata && udata->outlen) { resp.max_sq_sge = dev_attr->max_sq_sge;
From: Michal Kalderon michal.kalderon@marvell.com
[ Upstream commit 0dfbd5ecf28cbcb81674c49d34ee97366db1be44 ]
Private data passed to iwarp_cm_handler is copied for connection request / response, but ignored otherwise. If junk is passed, it is stored in the event and used later in the event processing.
The driver passes an old junk pointer during connection close which leads to a use-after-free on event processing. Set private data to NULL for events that don 't have private data.
BUG: KASAN: use-after-free in ucma_event_handler+0x532/0x560 [rdma_ucm] kernel: Read of size 4 at addr ffff8886caa71200 by task kworker/u128:1/5250 kernel: kernel: Workqueue: iw_cm_wq cm_work_handler [iw_cm] kernel: Call Trace: kernel: dump_stack+0x8c/0xc0 kernel: print_address_description.constprop.0+0x1b/0x210 kernel: ? ucma_event_handler+0x532/0x560 [rdma_ucm] kernel: ? ucma_event_handler+0x532/0x560 [rdma_ucm] kernel: __kasan_report.cold+0x1a/0x33 kernel: ? ucma_event_handler+0x532/0x560 [rdma_ucm] kernel: kasan_report+0xe/0x20 kernel: check_memory_region+0x130/0x1a0 kernel: memcpy+0x20/0x50 kernel: ucma_event_handler+0x532/0x560 [rdma_ucm] kernel: ? __rpc_execute+0x608/0x620 [sunrpc] kernel: cma_iw_handler+0x212/0x330 [rdma_cm] kernel: ? iw_conn_req_handler+0x6e0/0x6e0 [rdma_cm] kernel: ? enqueue_timer+0x86/0x140 kernel: ? _raw_write_lock_irq+0xd0/0xd0 kernel: cm_work_handler+0xd3d/0x1070 [iw_cm]
Fixes: e411e0587e0d ("RDMA/qedr: Add iWARP connection management functions") Link: https://lore.kernel.org/r/20200616093408.17827-1-michal.kalderon@marvell.com Signed-off-by: Ariel Elior ariel.elior@marvell.com Signed-off-by: Michal Kalderon michal.kalderon@marvell.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/qedr/qedr_iw_cm.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/hw/qedr/qedr_iw_cm.c b/drivers/infiniband/hw/qedr/qedr_iw_cm.c index 792eecd206b61..97fc7dd353b04 100644 --- a/drivers/infiniband/hw/qedr/qedr_iw_cm.c +++ b/drivers/infiniband/hw/qedr/qedr_iw_cm.c @@ -150,8 +150,17 @@ qedr_iw_issue_event(void *context, if (params->cm_info) { event.ird = params->cm_info->ird; event.ord = params->cm_info->ord; - event.private_data_len = params->cm_info->private_data_len; - event.private_data = (void *)params->cm_info->private_data; + /* Only connect_request and reply have valid private data + * the rest of the events this may be left overs from + * connection establishment. CONNECT_REQUEST is issued via + * qedr_iw_mpa_request + */ + if (event_type == IW_CM_EVENT_CONNECT_REPLY) { + event.private_data_len = + params->cm_info->private_data_len; + event.private_data = + (void *)params->cm_info->private_data; + } }
if (ep->cm_id)
From: Mark Zhang markz@mellanox.com
[ Upstream commit 730c8912484186d4623d0c76509066d285c3a755 ]
The bind_list and listen_list must be accessed under a lock, add the missing locking around the access in cm_ib_id_from_event()
In addition add lockdep asserts to make it clearer what the locking semantic is here.
general protection fault: 0000 [#1] SMP NOPTI CPU: 226 PID: 126135 Comm: kworker/226:1 Tainted: G OE 4.12.14-150.47-default #1 SLE15 Hardware name: Cray Inc. Windom/Windom, BIOS 0.8.7 01-10-2020 Workqueue: ib_cm cm_work_handler [ib_cm] task: ffff9c5a60a1d2c0 task.stack: ffffc1d91f554000 RIP: 0010:cma_ib_req_handler+0x3f1/0x11b0 [rdma_cm] RSP: 0018:ffffc1d91f557b40 EFLAGS: 00010286 RAX: deacffffffffff30 RBX: 0000000000000001 RCX: ffff9c2af5bb6000 RDX: 00000000000000a9 RSI: ffff9c5aa4ed2f10 RDI: ffffc1d91f557b08 RBP: ffffc1d91f557d90 R08: ffff9c340cc80000 R09: ffff9c2c0f901900 R10: 0000000000000000 R11: 0000000000000001 R12: deacffffffffff30 R13: ffff9c5a48aeec00 R14: ffffc1d91f557c30 R15: ffff9c5c2eea3688 FS: 0000000000000000(0000) GS:ffff9c5c2fa80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00002b5cc03fa320 CR3: 0000003f8500a000 CR4: 00000000003406e0 Call Trace: ? rdma_addr_cancel+0xa0/0xa0 [ib_core] ? cm_process_work+0x28/0x140 [ib_cm] cm_process_work+0x28/0x140 [ib_cm] ? cm_get_bth_pkey.isra.44+0x34/0xa0 [ib_cm] cm_work_handler+0xa06/0x1a6f [ib_cm] ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x40/0x70 ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x40/0x70 ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x40/0x70 ? __switch_to+0x7c/0x4b0 ? __switch_to_asm+0x40/0x70 ? __switch_to_asm+0x34/0x70 process_one_work+0x1da/0x400 worker_thread+0x2b/0x3f0 ? process_one_work+0x400/0x400 kthread+0x118/0x140 ? kthread_create_on_node+0x40/0x40 ret_from_fork+0x22/0x40 Code: 00 66 83 f8 02 0f 84 ca 05 00 00 49 8b 84 24 d0 01 00 00 48 85 c0 0f 84 68 07 00 00 48 2d d0 01 00 00 49 89 c4 0f 84 59 07 00 00 <41> 0f b7 44 24 20 49 8b 77 50 66 83 f8 0a 75 9e 49 8b 7c 24 28
Fixes: 4c21b5bcef73 ("IB/cma: Add net_dev and private data checks to RDMA CM") Link: https://lore.kernel.org/r/20200616104304.2426081-1-leon@kernel.org Signed-off-by: Mark Zhang markz@mellanox.com Reviewed-by: Maor Gottlieb maorg@mellanox.com Signed-off-by: Leon Romanovsky leonro@mellanox.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/core/cma.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+)
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index 26e6f7df247b6..12ada58c96a90 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -1619,6 +1619,8 @@ static struct rdma_id_private *cma_find_listener( { struct rdma_id_private *id_priv, *id_priv_dev;
+ lockdep_assert_held(&lock); + if (!bind_list) return ERR_PTR(-EINVAL);
@@ -1665,6 +1667,7 @@ cma_ib_id_from_event(struct ib_cm_id *cm_id, } }
+ mutex_lock(&lock); /* * Net namespace might be getting deleted while route lookup, * cm_id lookup is in progress. Therefore, perform netdevice @@ -1706,6 +1709,7 @@ cma_ib_id_from_event(struct ib_cm_id *cm_id, id_priv = cma_find_listener(bind_list, cm_id, ib_event, req, *net_dev); err: rcu_read_unlock(); + mutex_unlock(&lock); if (IS_ERR(id_priv) && *net_dev) { dev_put(*net_dev); *net_dev = NULL; @@ -2481,6 +2485,8 @@ static void cma_listen_on_dev(struct rdma_id_private *id_priv, struct net *net = id_priv->id.route.addr.dev_addr.net; int ret;
+ lockdep_assert_held(&lock); + if (cma_family(id_priv) == AF_IB && !rdma_cap_ib_cm(cma_dev->device, 1)) return;
@@ -3308,6 +3314,8 @@ static void cma_bind_port(struct rdma_bind_list *bind_list, u64 sid, mask; __be16 port;
+ lockdep_assert_held(&lock); + addr = cma_src_addr(id_priv); port = htons(bind_list->port);
@@ -3336,6 +3344,8 @@ static int cma_alloc_port(enum rdma_ucm_port_space ps, struct rdma_bind_list *bind_list; int ret;
+ lockdep_assert_held(&lock); + bind_list = kzalloc(sizeof *bind_list, GFP_KERNEL); if (!bind_list) return -ENOMEM; @@ -3362,6 +3372,8 @@ static int cma_port_is_unique(struct rdma_bind_list *bind_list, struct sockaddr *saddr = cma_src_addr(id_priv); __be16 dport = cma_port(daddr);
+ lockdep_assert_held(&lock); + hlist_for_each_entry(cur_id, &bind_list->owners, node) { struct sockaddr *cur_daddr = cma_dst_addr(cur_id); struct sockaddr *cur_saddr = cma_src_addr(cur_id); @@ -3401,6 +3413,8 @@ static int cma_alloc_any_port(enum rdma_ucm_port_space ps, unsigned int rover; struct net *net = id_priv->id.route.addr.dev_addr.net;
+ lockdep_assert_held(&lock); + inet_get_local_port_range(net, &low, &high); remaining = (high - low) + 1; rover = prandom_u32() % remaining + low; @@ -3448,6 +3462,8 @@ static int cma_check_port(struct rdma_bind_list *bind_list, struct rdma_id_private *cur_id; struct sockaddr *addr, *cur_addr;
+ lockdep_assert_held(&lock); + addr = cma_src_addr(id_priv); hlist_for_each_entry(cur_id, &bind_list->owners, node) { if (id_priv == cur_id) @@ -3478,6 +3494,8 @@ static int cma_use_port(enum rdma_ucm_port_space ps, unsigned short snum; int ret;
+ lockdep_assert_held(&lock); + snum = ntohs(cma_port(cma_src_addr(id_priv))); if (snum < PROT_SOCK && !capable(CAP_NET_BIND_SERVICE)) return -EACCES;
From: Leon Romanovsky leonro@mellanox.com
[ Upstream commit 4121fb0db68ed4de574f9bdc630b75fcc99b4835 ]
In disassociate flow, the type_attrs is set to be NULL, which is in an implicit way is checked in alloc_uobj() by "if (!attrs->context)".
Change the logic to rely on that check, to be consistent with other alloc_uobj() places that will fix the following kernel splat.
BUG: kernel NULL pointer dereference, address: 0000000000000018 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 3 PID: 2743 Comm: python3 Not tainted 5.7.0-rc6-for-upstream-perf-2020-05-23_19-04-38-5 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 RIP: 0010:alloc_begin_fd_uobject+0x18/0xf0 [ib_uverbs] Code: 89 43 48 eb 97 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 55 49 89 f5 41 54 55 48 89 fd 53 48 83 ec 08 48 8b 1f <48> 8b 43 18 48 8b 80 80 00 00 00 48 3d 20 10 33 a0 74 1c 48 3d 30 RSP: 0018:ffffc90001127b70 EFLAGS: 00010282 RAX: ffffffffa0339fe0 RBX: 0000000000000000 RCX: 8000000000000007 RDX: fffffffffffffffb RSI: ffffc90001127d28 RDI: ffff88843fe1f600 RBP: ffff88843fe1f600 R08: ffff888461eb06d8 R09: ffff888461eb06f8 R10: ffff888461eb0700 R11: 0000000000000000 R12: ffff88846a5f6450 R13: ffffc90001127d28 R14: ffff88845d7d6ea0 R15: ffffc90001127cb8 FS: 00007f469bff1540(0000) GS:ffff88846f980000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000018 CR3: 0000000450018003 CR4: 0000000000760ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: ? xa_store+0x28/0x40 rdma_alloc_begin_uobject+0x4f/0x90 [ib_uverbs] ib_uverbs_create_comp_channel+0x87/0xf0 [ib_uverbs] ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xb1/0xf0 [ib_uverbs] ib_uverbs_cmd_verbs.isra.8+0x96d/0xae0 [ib_uverbs] ? get_page_from_freelist+0x3bb/0xf70 ? _copy_to_user+0x22/0x30 ? uverbs_disassociate_api+0xd0/0xd0 [ib_uverbs] ? __wake_up_common_lock+0x87/0xc0 ib_uverbs_ioctl+0xbc/0x130 [ib_uverbs] ksys_ioctl+0x83/0xc0 ? ksys_write+0x55/0xd0 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x48/0x130 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f469ac43267
Fixes: 849e149063bd ("RDMA/core: Do not allow alloc_commit to fail") Link: https://lore.kernel.org/r/20200617061826.2625359-1-leon@kernel.org Signed-off-by: Leon Romanovsky leonro@mellanox.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/core/rdma_core.c | 36 +++++++++++++++++------------ 1 file changed, 21 insertions(+), 15 deletions(-)
diff --git a/drivers/infiniband/core/rdma_core.c b/drivers/infiniband/core/rdma_core.c index e0a5e897e4b1d..75bcbc625616e 100644 --- a/drivers/infiniband/core/rdma_core.c +++ b/drivers/infiniband/core/rdma_core.c @@ -459,40 +459,46 @@ static struct ib_uobject * alloc_begin_fd_uobject(const struct uverbs_api_object *obj, struct uverbs_attr_bundle *attrs) { - const struct uverbs_obj_fd_type *fd_type = - container_of(obj->type_attrs, struct uverbs_obj_fd_type, type); + const struct uverbs_obj_fd_type *fd_type; int new_fd; - struct ib_uobject *uobj; + struct ib_uobject *uobj, *ret; struct file *filp;
+ uobj = alloc_uobj(attrs, obj); + if (IS_ERR(uobj)) + return uobj; + + fd_type = + container_of(obj->type_attrs, struct uverbs_obj_fd_type, type); if (WARN_ON(fd_type->fops->release != &uverbs_uobject_fd_release && - fd_type->fops->release != &uverbs_async_event_release)) - return ERR_PTR(-EINVAL); + fd_type->fops->release != &uverbs_async_event_release)) { + ret = ERR_PTR(-EINVAL); + goto err_fd; + }
new_fd = get_unused_fd_flags(O_CLOEXEC); - if (new_fd < 0) - return ERR_PTR(new_fd); - - uobj = alloc_uobj(attrs, obj); - if (IS_ERR(uobj)) + if (new_fd < 0) { + ret = ERR_PTR(new_fd); goto err_fd; + }
/* Note that uverbs_uobject_fd_release() is called during abort */ filp = anon_inode_getfile(fd_type->name, fd_type->fops, NULL, fd_type->flags); if (IS_ERR(filp)) { - uverbs_uobject_put(uobj); - uobj = ERR_CAST(filp); - goto err_fd; + ret = ERR_CAST(filp); + goto err_getfile; } uobj->object = filp;
uobj->id = new_fd; return uobj;
-err_fd: +err_getfile: put_unused_fd(new_fd); - return uobj; +err_fd: + uverbs_uobject_put(uobj); + return ret; }
struct ib_uobject *rdma_alloc_begin_uobject(const struct uverbs_api_object *obj,
From: Qiushi Wu wu000273@umn.edu
[ Upstream commit f141a422159a199f4c8dedb7e0df55b3b2cf16cd ]
Calling pm_runtime_get_sync increments the counter even in case of failure, causing incorrect ref count if pm_runtime_put is not called in error handling paths. Call pm_runtime_put if pm_runtime_get_sync fails.
Fixes: fc05a5b22253 ("ASoC: rockchip: add support for pdm controller") Signed-off-by: Qiushi Wu wu000273@umn.edu Reviewed-by: Heiko Stuebner heiko@sntech.de Link: https://lore.kernel.org/r/20200613205158.27296-1-wu000273@umn.edu Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/rockchip/rockchip_pdm.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/sound/soc/rockchip/rockchip_pdm.c b/sound/soc/rockchip/rockchip_pdm.c index 7cd42fcfcf38a..1707414cfa921 100644 --- a/sound/soc/rockchip/rockchip_pdm.c +++ b/sound/soc/rockchip/rockchip_pdm.c @@ -590,8 +590,10 @@ static int rockchip_pdm_resume(struct device *dev) int ret;
ret = pm_runtime_get_sync(dev); - if (ret < 0) + if (ret < 0) { + pm_runtime_put(dev); return ret; + }
ret = regcache_sync(pdm->regmap);
From: Julian Wiedmann jwi@linux.ibm.com
[ Upstream commit e2dfcfba00ba4a414617ef4c5a8501fe21567eb3 ]
Current(?) OSA devices also store their cmd-specific return codes for SET_ACCESS_CONTROL cmds into the top-level cmd->hdr.return_code. So once we added stricter checking for the top-level field a while ago, none of the error logic that rolls back the user's configuration to its old state is applied any longer.
For this specific cmd, go back to the old model where we peek into the cmd structure even though the top-level field indicated an error.
Fixes: 686c97ee29c8 ("s390/qeth: fix error handling in adapter command callbacks") Signed-off-by: Julian Wiedmann jwi@linux.ibm.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/s390/net/qeth_core_main.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c index 569966bdc5138..60d675fefac7d 100644 --- a/drivers/s390/net/qeth_core_main.c +++ b/drivers/s390/net/qeth_core_main.c @@ -4265,9 +4265,6 @@ static int qeth_setadpparms_set_access_ctrl_cb(struct qeth_card *card, int fallback = *(int *)reply->param;
QETH_CARD_TEXT(card, 4, "setaccb"); - if (cmd->hdr.return_code) - return -EIO; - qeth_setadpparms_inspect_rc(cmd);
access_ctrl_req = &cmd->data.setadapterparms.data.set_access_ctrl; QETH_CARD_TEXT_(card, 2, "rc=%d", @@ -4277,7 +4274,7 @@ static int qeth_setadpparms_set_access_ctrl_cb(struct qeth_card *card, QETH_DBF_MESSAGE(3, "ERR:SET_ACCESS_CTRL(%#x) on device %x: %#x\n", access_ctrl_req->subcmd_code, CARD_DEVID(card), cmd->data.setadapterparms.hdr.return_code); - switch (cmd->data.setadapterparms.hdr.return_code) { + switch (qeth_setadpparms_inspect_rc(cmd)) { case SET_ACCESS_CTRL_RC_SUCCESS: if (card->options.isolation == ISOLATION_MODE_NONE) { dev_info(&card->gdev->dev,
From: Fan Guo guofan5@huawei.com
[ Upstream commit a17f4bed811c60712d8131883cdba11a105d0161 ]
If ib_dma_mapping_error() returns non-zero value, ib_mad_post_receive_mads() will jump out of loops and return -ENOMEM without freeing mad_priv. Fix this memory-leak problem by freeing mad_priv in this case.
Fixes: 2c34e68f4261 ("IB/mad: Check and handle potential DMA mapping errors") Link: https://lore.kernel.org/r/20200612063824.180611-1-guofan5@huawei.com Signed-off-by: Fan Guo guofan5@huawei.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/core/mad.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c index f7626ebcf31cf..049c9cdc10dec 100644 --- a/drivers/infiniband/core/mad.c +++ b/drivers/infiniband/core/mad.c @@ -2941,6 +2941,7 @@ static int ib_mad_post_receive_mads(struct ib_mad_qp_info *qp_info, DMA_FROM_DEVICE); if (unlikely(ib_dma_mapping_error(qp_info->port_priv->device, sg_list.addr))) { + kfree(mad_priv); ret = -ENOMEM; break; }
From: Willem de Bruijn willemb@google.com
[ Upstream commit ca8826095e4d4afc0ccaead27bba6e4b623a12ae ]
The ETF qdisc can queue skbs that it could not pace on the errqueue.
Address a few issues in the selftest
- recv buffer size was too small, and incorrectly calculated - compared errno to ee_code instead of ee_errno - missed invalid request error type
v2: - fix a few checkpatch --strict indentation warnings
Fixes: ea6a547669b3 ("selftests/net: make so_txtime more robust to timer variance") Signed-off-by: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/net/so_txtime.c | 33 +++++++++++++++++++------ 1 file changed, 26 insertions(+), 7 deletions(-)
diff --git a/tools/testing/selftests/net/so_txtime.c b/tools/testing/selftests/net/so_txtime.c index 383bac05ac324..ceaad78e96674 100644 --- a/tools/testing/selftests/net/so_txtime.c +++ b/tools/testing/selftests/net/so_txtime.c @@ -15,8 +15,9 @@ #include <inttypes.h> #include <linux/net_tstamp.h> #include <linux/errqueue.h> +#include <linux/if_ether.h> #include <linux/ipv6.h> -#include <linux/tcp.h> +#include <linux/udp.h> #include <stdbool.h> #include <stdlib.h> #include <stdio.h> @@ -140,8 +141,8 @@ static void do_recv_errqueue_timeout(int fdt) { char control[CMSG_SPACE(sizeof(struct sock_extended_err)) + CMSG_SPACE(sizeof(struct sockaddr_in6))] = {0}; - char data[sizeof(struct ipv6hdr) + - sizeof(struct tcphdr) + 1]; + char data[sizeof(struct ethhdr) + sizeof(struct ipv6hdr) + + sizeof(struct udphdr) + 1]; struct sock_extended_err *err; struct msghdr msg = {0}; struct iovec iov = {0}; @@ -159,6 +160,8 @@ static void do_recv_errqueue_timeout(int fdt) msg.msg_controllen = sizeof(control);
while (1) { + const char *reason; + ret = recvmsg(fdt, &msg, MSG_ERRQUEUE); if (ret == -1 && errno == EAGAIN) break; @@ -176,14 +179,30 @@ static void do_recv_errqueue_timeout(int fdt) err = (struct sock_extended_err *)CMSG_DATA(cm); if (err->ee_origin != SO_EE_ORIGIN_TXTIME) error(1, 0, "errqueue: origin 0x%x\n", err->ee_origin); - if (err->ee_code != ECANCELED) - error(1, 0, "errqueue: code 0x%x\n", err->ee_code); + + switch (err->ee_errno) { + case ECANCELED: + if (err->ee_code != SO_EE_CODE_TXTIME_MISSED) + error(1, 0, "errqueue: unknown ECANCELED %u\n", + err->ee_code); + reason = "missed txtime"; + break; + case EINVAL: + if (err->ee_code != SO_EE_CODE_TXTIME_INVALID_PARAM) + error(1, 0, "errqueue: unknown EINVAL %u\n", + err->ee_code); + reason = "invalid txtime"; + break; + default: + error(1, 0, "errqueue: errno %u code %u\n", + err->ee_errno, err->ee_code); + };
tstamp = ((int64_t) err->ee_data) << 32 | err->ee_info; tstamp -= (int64_t) glob_tstart; tstamp /= 1000 * 1000; - fprintf(stderr, "send: pkt %c at %" PRId64 "ms dropped\n", - data[ret - 1], tstamp); + fprintf(stderr, "send: pkt %c at %" PRId64 "ms dropped: %s\n", + data[ret - 1], tstamp, reason);
msg.msg_flags = 0; msg.msg_controllen = sizeof(control);
From: Shannon Nelson snelson@pensando.io
[ Upstream commit b59eabd23ee53e8c3492602722e1697f9a2f63ad ]
Even with moving netif_tx_disable() to an earlier point when taking down the queues for a reconfiguration, we still end up with the occasional netdev watchdog Tx Timeout complaint. The old method of using netif_trans_update() works fine for queue 0, but has no effect on the remaining queues. Using netif_device_detach() allows us to signal to the watchdog to ignore us for the moment.
Fixes: beead698b173 ("ionic: Add the basic NDO callbacks for netdev support") Signed-off-by: Shannon Nelson snelson@pensando.io Acked-by: Jonathan Toppins jtoppins@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/pensando/ionic/ionic_lif.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c b/drivers/net/ethernet/pensando/ionic/ionic_lif.c index 2729b0bb12739..790d4854b8ef5 100644 --- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c +++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c @@ -1682,8 +1682,8 @@ static void ionic_stop_queues(struct ionic_lif *lif) if (!test_and_clear_bit(IONIC_LIF_F_UP, lif->state)) return;
- ionic_txrx_disable(lif); netif_tx_disable(lif->netdev); + ionic_txrx_disable(lif); }
int ionic_stop(struct net_device *netdev) @@ -1949,18 +1949,19 @@ int ionic_reset_queues(struct ionic_lif *lif) bool running; int err = 0;
- /* Put off the next watchdog timeout */ - netif_trans_update(lif->netdev); - err = ionic_wait_for_bit(lif, IONIC_LIF_F_QUEUE_RESET); if (err) return err;
running = netif_running(lif->netdev); - if (running) + if (running) { + netif_device_detach(lif->netdev); err = ionic_stop(lif->netdev); - if (!err && running) + } + if (!err && running) { ionic_open(lif->netdev); + netif_device_attach(lif->netdev); + }
clear_bit(IONIC_LIF_F_QUEUE_RESET, lif->state);
From: Vitaly Kuznetsov vkuznets@redhat.com
[ Upstream commit 49097762fa405cdc16f8f597f6d27c078d4a31e9 ]
Guest crashes are observed on a Cascade Lake system when 'perf top' is launched on the host, e.g.
BUG: unable to handle kernel paging request at fffffe0000073038 PGD 7ffa7067 P4D 7ffa7067 PUD 7ffa6067 PMD 7ffa5067 PTE ffffffffff120 Oops: 0000 [#1] SMP PTI CPU: 1 PID: 1 Comm: systemd Not tainted 4.18.0+ #380 ... Call Trace: serial8250_console_write+0xfe/0x1f0 call_console_drivers.constprop.0+0x9d/0x120 console_unlock+0x1ea/0x460
Call traces are different but the crash is imminent. The problem was blindly bisected to the commit 041bc42ce2d0 ("KVM: VMX: Micro-optimize vmexit time when not exposing PMU"). It was also confirmed that the issue goes away if PMU is exposed to the guest.
With some instrumentation of the guest we can see what is being switched (when we do atomic_switch_perf_msrs()):
vmx_vcpu_run: switching 2 msrs vmx_vcpu_run: switching MSR38f guest: 70000000d host: 70000000f vmx_vcpu_run: switching MSR3f1 guest: 0 host: 2
The current guess is that PEBS (MSR_IA32_PEBS_ENABLE, 0x3f1) is to blame. Regardless of whether PMU is exposed to the guest or not, PEBS needs to be disabled upon switch.
This reverts commit 041bc42ce2d0efac3b85bbb81dea8c74b81f4ef9.
Reported-by: Maxime Coquelin maxime.coquelin@redhat.com Signed-off-by: Vitaly Kuznetsov vkuznets@redhat.com Message-Id: 20200619094046.654019-1-vkuznets@redhat.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/vmx/vmx.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index d7aa0dfab8bbd..8f59f8c8fd05d 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -6575,8 +6575,7 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
pt_guest_enter(vmx);
- if (vcpu_to_pmu(vcpu)->version) - atomic_switch_perf_msrs(vmx); + atomic_switch_perf_msrs(vmx); atomic_switch_umwait_control_msr(vmx);
if (enable_preemption_timer)
From: Lu Baolu baolu.lu@linux.intel.com
[ Upstream commit 16ecf10e815d70d11d2300243f4a3b4c7c5acac7 ]
When using first-level translation for IOVA, currently the U/S bit in the page table is cleared which implies DMA requests with user privilege are blocked. As the result, following error messages might be observed when passing through a device to user level:
DMAR: DRHD: handling fault status reg 3 DMAR: [DMA Read] Request device [41:00.0] PASID 1 fault addr 7ecdcd000 [fault reason 129] SM: U/S set 0 for first-level translation with user privilege
This fixes it by setting U/S bit in the first level page table and makes IOVA over first level compatible with previous second-level translation.
Fixes: b802d070a52a1 ("iommu/vt-d: Use iova over first level") Reported-by: Xin Zeng xin.zeng@intel.com Signed-off-by: Lu Baolu baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20200622231345.29722-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/intel-iommu.c | 5 ++--- include/linux/intel-iommu.h | 1 + 2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index fde7aba49b746..a0b9ea0805210 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -943,7 +943,7 @@ static struct dma_pte *pfn_to_dma_pte(struct dmar_domain *domain, domain_flush_cache(domain, tmp_page, VTD_PAGE_SIZE); pteval = ((uint64_t)virt_to_dma_pfn(tmp_page) << VTD_PAGE_SHIFT) | DMA_PTE_READ | DMA_PTE_WRITE; if (domain_use_first_level(domain)) - pteval |= DMA_FL_PTE_XD; + pteval |= DMA_FL_PTE_XD | DMA_FL_PTE_US; if (cmpxchg64(&pte->val, 0ULL, pteval)) /* Someone else set it while we were thinking; use theirs. */ free_pgtable_page(tmp_page); @@ -2034,7 +2034,6 @@ static inline void context_set_sm_rid2pasid(struct context_entry *context, unsigned long pasid) { context->hi |= pasid & ((1 << 20) - 1); - context->hi |= (1 << 20); }
/* @@ -2326,7 +2325,7 @@ static int __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
attr = prot & (DMA_PTE_READ | DMA_PTE_WRITE | DMA_PTE_SNP); if (domain_use_first_level(domain)) - attr |= DMA_FL_PTE_PRESENT | DMA_FL_PTE_XD; + attr |= DMA_FL_PTE_PRESENT | DMA_FL_PTE_XD | DMA_FL_PTE_US;
if (!sg) { sg_res = nr_pages; diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h index de23fb95fe918..64a5335046b00 100644 --- a/include/linux/intel-iommu.h +++ b/include/linux/intel-iommu.h @@ -40,6 +40,7 @@ #define DMA_PTE_SNP BIT_ULL(11)
#define DMA_FL_PTE_PRESENT BIT_ULL(0) +#define DMA_FL_PTE_US BIT_ULL(2) #define DMA_FL_PTE_XD BIT_ULL(63)
#define CONTEXT_TT_MULTI_LEVEL 0
From: Lu Baolu baolu.lu@linux.intel.com
[ Upstream commit 50310600ebda74b9988467e2e6128711c7ba56fc ]
PCI ACS is disabled if Intel IOMMU is off by default or intel_iommu=off is used in command line. Unfortunately, Intel IOMMU will be forced on if there're devices sitting on an external facing PCI port that is marked as untrusted (for example, thunderbolt peripherals). That means, PCI ACS is disabled while Intel IOMMU is forced on to isolate those devices. As the result, the devices of an MFD will be grouped by a single group even the ACS is supported on device.
[ 0.691263] pci 0000:00:07.1: Adding to iommu group 3 [ 0.691277] pci 0000:00:07.2: Adding to iommu group 3 [ 0.691292] pci 0000:00:07.3: Adding to iommu group 3
Fix it by requesting PCI ACS when Intel IOMMU is detected with platform opt in hint.
Fixes: 89a6079df791a ("iommu/vt-d: Force IOMMU on for platform opt in hint") Co-developed-by: Lalithambika Krishnakumar lalithambika.krishnakumar@intel.com Signed-off-by: Lalithambika Krishnakumar lalithambika.krishnakumar@intel.com Signed-off-by: Lu Baolu baolu.lu@linux.intel.com Reviewed-by: Mika Westerberg mika.westerberg@linux.intel.com Cc: Mika Westerberg mika.westerberg@linux.intel.com Cc: Ashok Raj ashok.raj@intel.com Link: https://lore.kernel.org/r/20200622231345.29722-5-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/dmar.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c index f77dae7ba7d40..7df5621bba8de 100644 --- a/drivers/iommu/dmar.c +++ b/drivers/iommu/dmar.c @@ -898,7 +898,8 @@ int __init detect_intel_iommu(void) if (!ret) ret = dmar_walk_dmar_table((struct acpi_table_dmar *)dmar_tbl, &validate_drhd_cb); - if (!ret && !no_iommu && !iommu_detected && !dmar_disabled) { + if (!ret && !no_iommu && !iommu_detected && + (!dmar_disabled || dmar_platform_optin())) { iommu_detected = 1; /* Make sure ACS will be enabled */ pci_request_acs();
From: Lu Baolu baolu.lu@linux.intel.com
[ Upstream commit 04c00956ee3cd138fd38560a91452a804a8c5550 ]
The Scalable-mode Page-walk Coherency (SMPWC) field in the VT-d extended capability register indicates the hardware coherency behavior on paging structures accessed through the pasid table entry. This is ignored in current code and using ECAP.C instead which is only valid in legacy mode. Fix this so that paging structure updates could be manually flushed from the cache line if hardware page walking is not snooped.
Fixes: 765b6a98c1de3 ("iommu/vt-d: Enumerate the scalable mode capability") Signed-off-by: Lu Baolu baolu.lu@linux.intel.com Cc: Ashok Raj ashok.raj@intel.com Cc: Kevin Tian kevin.tian@intel.com Cc: Jacob Pan jacob.jun.pan@linux.intel.com Link: https://lore.kernel.org/r/20200622231345.29722-6-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/intel-iommu.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index a0b9ea0805210..34b2ed91cf4d9 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -634,6 +634,12 @@ struct intel_iommu *domain_get_iommu(struct dmar_domain *domain) return g_iommus[iommu_id]; }
+static inline bool iommu_paging_structure_coherency(struct intel_iommu *iommu) +{ + return sm_supported(iommu) ? + ecap_smpwc(iommu->ecap) : ecap_coherent(iommu->ecap); +} + static void domain_update_iommu_coherency(struct dmar_domain *domain) { struct dmar_drhd_unit *drhd; @@ -645,7 +651,7 @@ static void domain_update_iommu_coherency(struct dmar_domain *domain)
for_each_domain_iommu(i, domain) { found = true; - if (!ecap_coherent(g_iommus[i]->ecap)) { + if (!iommu_paging_structure_coherency(g_iommus[i])) { domain->iommu_coherency = 0; break; } @@ -656,7 +662,7 @@ static void domain_update_iommu_coherency(struct dmar_domain *domain) /* No hardware attached; use lowest common denominator */ rcu_read_lock(); for_each_active_iommu(iommu, drhd) { - if (!ecap_coherent(iommu->ecap)) { + if (!iommu_paging_structure_coherency(iommu)) { domain->iommu_coherency = 0; break; } @@ -2177,7 +2183,8 @@ static int domain_context_mapping_one(struct dmar_domain *domain,
context_set_fault_enable(context); context_set_present(context); - domain_flush_cache(domain, context, sizeof(*context)); + if (!ecap_coherent(iommu->ecap)) + clflush_cache_range(context, sizeof(*context));
/* * It's a non-present to present mapping. If hardware doesn't cache
From: Anson Huang Anson.Huang@nxp.com
[ Upstream commit c95c9693b112f312b59c5d100fd09a1349970fab ]
Correct i.MX8MP UID fuse offset according to fuse map:
UID_LOW: 0x420 UID_HIGH: 0x430
Fixes: fc40200ebf82 ("soc: imx: increase build coverage for imx8m soc driver") Fixes: 18f662a73862 ("soc: imx: Add i.MX8MP SoC driver support") Signed-off-by: Anson Huang Anson.Huang@nxp.com Reviewed-by: Iuliana Prodan iuliana.prodan@nxp.com Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/soc/imx/soc-imx8m.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/soc/imx/soc-imx8m.c b/drivers/soc/imx/soc-imx8m.c index 719e1f189ebf2..ada0d8804d84c 100644 --- a/drivers/soc/imx/soc-imx8m.c +++ b/drivers/soc/imx/soc-imx8m.c @@ -22,6 +22,8 @@ #define OCOTP_UID_LOW 0x410 #define OCOTP_UID_HIGH 0x420
+#define IMX8MP_OCOTP_UID_OFFSET 0x10 + /* Same as ANADIG_DIGPROG_IMX7D */ #define ANADIG_DIGPROG_IMX8MM 0x800
@@ -88,6 +90,8 @@ static void __init imx8mm_soc_uid(void) { void __iomem *ocotp_base; struct device_node *np; + u32 offset = of_machine_is_compatible("fsl,imx8mp") ? + IMX8MP_OCOTP_UID_OFFSET : 0;
np = of_find_compatible_node(NULL, NULL, "fsl,imx8mm-ocotp"); if (!np) @@ -96,9 +100,9 @@ static void __init imx8mm_soc_uid(void) ocotp_base = of_iomap(np, 0); WARN_ON(!ocotp_base);
- soc_uid = readl_relaxed(ocotp_base + OCOTP_UID_HIGH); + soc_uid = readl_relaxed(ocotp_base + OCOTP_UID_HIGH + offset); soc_uid <<= 32; - soc_uid |= readl_relaxed(ocotp_base + OCOTP_UID_LOW); + soc_uid |= readl_relaxed(ocotp_base + OCOTP_UID_LOW + offset);
iounmap(ocotp_base); of_node_put(np);
From: David Rientjes rientjes@google.com
[ Upstream commit 1a2b3357e860d890f8045367b179c7e7e802cd71 ]
When a coherent mapping is created in dma_direct_alloc_pages(), it needs to be decrypted if the device requires unencrypted DMA before returning.
Fixes: 3acac065508f ("dma-mapping: merge the generic remapping helpers into dma-direct") Signed-off-by: David Rientjes rientjes@google.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/dma/direct.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c index 98c445dcb308b..2270930f36f83 100644 --- a/kernel/dma/direct.c +++ b/kernel/dma/direct.c @@ -161,6 +161,12 @@ void *dma_direct_alloc_pages(struct device *dev, size_t size, __builtin_return_address(0)); if (!ret) goto out_free_pages; + if (force_dma_unencrypted(dev)) { + err = set_memory_decrypted((unsigned long)ret, + 1 << get_order(size)); + if (err) + goto out_free_pages; + } memset(ret, 0, size); goto done; }
From: Alexander Lobakin alobakin@marvell.com
[ Upstream commit 97dd1abd026ae4e6a82fa68645928404ad483409 ]
qed_chain_get_element_left{,_u32} returned 0 when the difference between producer and consumer page count was equal to the total page count. Fix this by conditional expanding of producer value (vs unconditional). This allowed to eliminate normalizaton against total page count, which was the cause of this bug.
Misc: replace open-coded constants with common defines.
Fixes: a91eb52abb50 ("qed: Revisit chain implementation") Signed-off-by: Alexander Lobakin alobakin@marvell.com Signed-off-by: Igor Russkikh irusskikh@marvell.com Signed-off-by: Michal Kalderon michal.kalderon@marvell.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/qed/qed_chain.h | 26 ++++++++++++++++---------- 1 file changed, 16 insertions(+), 10 deletions(-)
diff --git a/include/linux/qed/qed_chain.h b/include/linux/qed/qed_chain.h index 733fad7dfbed9..6d15040c642cb 100644 --- a/include/linux/qed/qed_chain.h +++ b/include/linux/qed/qed_chain.h @@ -207,28 +207,34 @@ static inline u32 qed_chain_get_cons_idx_u32(struct qed_chain *p_chain)
static inline u16 qed_chain_get_elem_left(struct qed_chain *p_chain) { + u16 elem_per_page = p_chain->elem_per_page; + u32 prod = p_chain->u.chain16.prod_idx; + u32 cons = p_chain->u.chain16.cons_idx; u16 used;
- used = (u16) (((u32)0x10000 + - (u32)p_chain->u.chain16.prod_idx) - - (u32)p_chain->u.chain16.cons_idx); + if (prod < cons) + prod += (u32)U16_MAX + 1; + + used = (u16)(prod - cons); if (p_chain->mode == QED_CHAIN_MODE_NEXT_PTR) - used -= p_chain->u.chain16.prod_idx / p_chain->elem_per_page - - p_chain->u.chain16.cons_idx / p_chain->elem_per_page; + used -= prod / elem_per_page - cons / elem_per_page;
return (u16)(p_chain->capacity - used); }
static inline u32 qed_chain_get_elem_left_u32(struct qed_chain *p_chain) { + u16 elem_per_page = p_chain->elem_per_page; + u64 prod = p_chain->u.chain32.prod_idx; + u64 cons = p_chain->u.chain32.cons_idx; u32 used;
- used = (u32) (((u64)0x100000000ULL + - (u64)p_chain->u.chain32.prod_idx) - - (u64)p_chain->u.chain32.cons_idx); + if (prod < cons) + prod += (u64)U32_MAX + 1; + + used = (u32)(prod - cons); if (p_chain->mode == QED_CHAIN_MODE_NEXT_PTR) - used -= p_chain->u.chain32.prod_idx / p_chain->elem_per_page - - p_chain->u.chain32.cons_idx / p_chain->elem_per_page; + used -= (u32)(prod / elem_per_page - cons / elem_per_page);
return p_chain->capacity - used; }
From: Alexander Lobakin alobakin@marvell.com
[ Upstream commit 31333c1a2521ff4b4ceb0c29de492549cd4a8de3 ]
qed_spq_unregister_async_cb() should be called before qed_rdma_info_free() to avoid crash-spawning uses-after-free. Instead of calling it from each subsystem exit code, do it in one place on PF down.
Fixes: 291d57f67d24 ("qed: Fix rdma_info structure allocation") Signed-off-by: Alexander Lobakin alobakin@marvell.com Signed-off-by: Igor Russkikh irusskikh@marvell.com Signed-off-by: Michal Kalderon michal.kalderon@marvell.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/qlogic/qed/qed_dev.c | 9 +++++++-- drivers/net/ethernet/qlogic/qed/qed_iwarp.c | 2 -- drivers/net/ethernet/qlogic/qed/qed_roce.c | 1 - 3 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_dev.c b/drivers/net/ethernet/qlogic/qed/qed_dev.c index 38a65b984e476..9b00988fb77e1 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_dev.c +++ b/drivers/net/ethernet/qlogic/qed/qed_dev.c @@ -1368,6 +1368,8 @@ static void qed_dbg_user_data_free(struct qed_hwfn *p_hwfn)
void qed_resc_free(struct qed_dev *cdev) { + struct qed_rdma_info *rdma_info; + struct qed_hwfn *p_hwfn; int i;
if (IS_VF(cdev)) { @@ -1385,7 +1387,8 @@ void qed_resc_free(struct qed_dev *cdev) qed_llh_free(cdev);
for_each_hwfn(cdev, i) { - struct qed_hwfn *p_hwfn = &cdev->hwfns[i]; + p_hwfn = cdev->hwfns + i; + rdma_info = p_hwfn->p_rdma_info;
qed_cxt_mngr_free(p_hwfn); qed_qm_info_free(p_hwfn); @@ -1404,8 +1407,10 @@ void qed_resc_free(struct qed_dev *cdev) qed_ooo_free(p_hwfn); }
- if (QED_IS_RDMA_PERSONALITY(p_hwfn)) + if (QED_IS_RDMA_PERSONALITY(p_hwfn) && rdma_info) { + qed_spq_unregister_async_cb(p_hwfn, rdma_info->proto); qed_rdma_info_free(p_hwfn); + }
qed_iov_free(p_hwfn); qed_l2_free(p_hwfn); diff --git a/drivers/net/ethernet/qlogic/qed/qed_iwarp.c b/drivers/net/ethernet/qlogic/qed/qed_iwarp.c index d2fe61a5cf567..5409a2da6106d 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_iwarp.c +++ b/drivers/net/ethernet/qlogic/qed/qed_iwarp.c @@ -2836,8 +2836,6 @@ int qed_iwarp_stop(struct qed_hwfn *p_hwfn) if (rc) return rc;
- qed_spq_unregister_async_cb(p_hwfn, PROTOCOLID_IWARP); - return qed_iwarp_ll2_stop(p_hwfn); }
diff --git a/drivers/net/ethernet/qlogic/qed/qed_roce.c b/drivers/net/ethernet/qlogic/qed/qed_roce.c index 37e70562a9645..f15c26ef88709 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_roce.c +++ b/drivers/net/ethernet/qlogic/qed/qed_roce.c @@ -113,7 +113,6 @@ void qed_roce_stop(struct qed_hwfn *p_hwfn) break; } } - qed_spq_unregister_async_cb(p_hwfn, PROTOCOLID_ROCE); }
static void qed_rdma_copy_gids(struct qed_rdma_qp *qp, __le32 *src_gid,
From: Alexander Lobakin alobakin@marvell.com
[ Upstream commit 4079c7f7a2a00ab403c177ce723b560de59139c3 ]
Set rdma_wq pointer to NULL after destroying the workqueue and check for it when adding new events to fix crashes on driver unload.
Fixes: cee9fbd8e2e9 ("qede: Add qedr framework") Signed-off-by: Alexander Lobakin alobakin@marvell.com Signed-off-by: Igor Russkikh irusskikh@marvell.com Signed-off-by: Michal Kalderon michal.kalderon@marvell.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/qlogic/qede/qede_rdma.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/qlogic/qede/qede_rdma.c b/drivers/net/ethernet/qlogic/qede/qede_rdma.c index 2d873ae8a234d..668ccc9d49f83 100644 --- a/drivers/net/ethernet/qlogic/qede/qede_rdma.c +++ b/drivers/net/ethernet/qlogic/qede/qede_rdma.c @@ -105,6 +105,7 @@ static void qede_rdma_destroy_wq(struct qede_dev *edev)
qede_rdma_cleanup_event(edev); destroy_workqueue(edev->rdma_info.rdma_wq); + edev->rdma_info.rdma_wq = NULL; }
int qede_rdma_dev_add(struct qede_dev *edev, bool recovery) @@ -325,7 +326,7 @@ static void qede_rdma_add_event(struct qede_dev *edev, if (edev->rdma_info.exp_recovery) return;
- if (!edev->rdma_info.qedr_dev) + if (!edev->rdma_info.qedr_dev || !edev->rdma_info.rdma_wq) return;
/* We don't want the cleanup flow to start while we're allocating and
From: Alexander Lobakin alobakin@marvell.com
[ Upstream commit ccd7c7ce167a21dbf2b698ffcf00f11d96d44f9b ]
25ms sleep cycles in waiting for PF response are excessive and may lead to different timeout failures.
Start to wait with short udelays, and in most cases polling will end here. If the time was not sufficient, switch to msleeps. usleep_range() may go far beyond 100us depending on platform and tick configuration, hence atomic udelays for consistency.
Also add explicit DMA barriers since 'done' always comes from a shared request-response DMA pool, and note that in the comment nearby.
Fixes: 1408cc1fa48c ("qed: Introduce VFs") Signed-off-by: Alexander Lobakin alobakin@marvell.com Signed-off-by: Igor Russkikh irusskikh@marvell.com Signed-off-by: Michal Kalderon michal.kalderon@marvell.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/qlogic/qed/qed_vf.c | 23 ++++++++++++++++++----- 1 file changed, 18 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_vf.c b/drivers/net/ethernet/qlogic/qed/qed_vf.c index 856051f50eb75..adc2c8f3d48ef 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_vf.c +++ b/drivers/net/ethernet/qlogic/qed/qed_vf.c @@ -81,12 +81,17 @@ static void qed_vf_pf_req_end(struct qed_hwfn *p_hwfn, int req_status) mutex_unlock(&(p_hwfn->vf_iov_info->mutex)); }
+#define QED_VF_CHANNEL_USLEEP_ITERATIONS 90 +#define QED_VF_CHANNEL_USLEEP_DELAY 100 +#define QED_VF_CHANNEL_MSLEEP_ITERATIONS 10 +#define QED_VF_CHANNEL_MSLEEP_DELAY 25 + static int qed_send_msg2pf(struct qed_hwfn *p_hwfn, u8 *done, u32 resp_size) { union vfpf_tlvs *p_req = p_hwfn->vf_iov_info->vf2pf_request; struct ustorm_trigger_vf_zone trigger; struct ustorm_vf_zone *zone_data; - int rc = 0, time = 100; + int iter, rc = 0;
zone_data = (struct ustorm_vf_zone *)PXP_VF_BAR0_START_USDM_ZONE_B;
@@ -126,11 +131,19 @@ static int qed_send_msg2pf(struct qed_hwfn *p_hwfn, u8 *done, u32 resp_size) REG_WR(p_hwfn, (uintptr_t)&zone_data->trigger, *((u32 *)&trigger));
/* When PF would be done with the response, it would write back to the - * `done' address. Poll until then. + * `done' address from a coherent DMA zone. Poll until then. */ - while ((!*done) && time) { - msleep(25); - time--; + + iter = QED_VF_CHANNEL_USLEEP_ITERATIONS; + while (!*done && iter--) { + udelay(QED_VF_CHANNEL_USLEEP_DELAY); + dma_rmb(); + } + + iter = QED_VF_CHANNEL_MSLEEP_ITERATIONS; + while (!*done && iter--) { + msleep(QED_VF_CHANNEL_MSLEEP_DELAY); + dma_rmb(); }
if (!*done) {
From: Alexander Lobakin alobakin@marvell.com
[ Upstream commit d434d02f7e7c24c721365fd594ed781acb18e0da ]
This is likely a copy'n'paste mistake. The amount of ILT lines to reserve for a single VF was being multiplied by the total VFs count. This led to a huge redundancy in reservation and potential lines drainouts.
Fixes: 1408cc1fa48c ("qed: Introduce VFs") Signed-off-by: Alexander Lobakin alobakin@marvell.com Signed-off-by: Igor Russkikh irusskikh@marvell.com Signed-off-by: Michal Kalderon michal.kalderon@marvell.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/qlogic/qed/qed_cxt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_cxt.c b/drivers/net/ethernet/qlogic/qed/qed_cxt.c index 1a636bad717dc..1880aa1d3bb59 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_cxt.c +++ b/drivers/net/ethernet/qlogic/qed/qed_cxt.c @@ -270,7 +270,7 @@ static void qed_cxt_qm_iids(struct qed_hwfn *p_hwfn, vf_tids += segs[NUM_TASK_PF_SEGMENTS].count; }
- iids->vf_cids += vf_cids * p_mngr->vf_count; + iids->vf_cids = vf_cids; iids->tids += vf_tids * p_mngr->vf_count;
DP_VERBOSE(p_hwfn, QED_MSG_ILT,
From: Alexander Lobakin alobakin@marvell.com
[ Upstream commit 1c85f394c2206ea3835f43534d5675f0574e1b70 ]
Currently PTP cyclecounter and timecounter are initialized only on the first probing and are cleaned up during removal. This means that PTP becomes non-functional after device recovery. Fix this by unconditional PTP initialization on probing and clearing Tx pending bit on exiting.
Fixes: ccc67ef50b90 ("qede: Error recovery process") Signed-off-by: Alexander Lobakin alobakin@marvell.com Signed-off-by: Igor Russkikh irusskikh@marvell.com Signed-off-by: Michal Kalderon michal.kalderon@marvell.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/qlogic/qede/qede_main.c | 2 +- drivers/net/ethernet/qlogic/qede/qede_ptp.c | 31 ++++++++------------ drivers/net/ethernet/qlogic/qede/qede_ptp.h | 2 +- 3 files changed, 15 insertions(+), 20 deletions(-)
diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c b/drivers/net/ethernet/qlogic/qede/qede_main.c index 1a83d1fd8ccd6..51bb5866a212c 100644 --- a/drivers/net/ethernet/qlogic/qede/qede_main.c +++ b/drivers/net/ethernet/qlogic/qede/qede_main.c @@ -1158,7 +1158,7 @@ static int __qede_probe(struct pci_dev *pdev, u32 dp_module, u8 dp_level,
/* PTP not supported on VFs */ if (!is_vf) - qede_ptp_enable(edev, (mode == QEDE_PROBE_NORMAL)); + qede_ptp_enable(edev);
edev->ops->register_ops(cdev, &qede_ll_ops, edev);
diff --git a/drivers/net/ethernet/qlogic/qede/qede_ptp.c b/drivers/net/ethernet/qlogic/qede/qede_ptp.c index 4c7f7a7fc1515..cd5841a9415ef 100644 --- a/drivers/net/ethernet/qlogic/qede/qede_ptp.c +++ b/drivers/net/ethernet/qlogic/qede/qede_ptp.c @@ -412,6 +412,7 @@ void qede_ptp_disable(struct qede_dev *edev) if (ptp->tx_skb) { dev_kfree_skb_any(ptp->tx_skb); ptp->tx_skb = NULL; + clear_bit_unlock(QEDE_FLAGS_PTP_TX_IN_PRORGESS, &edev->flags); }
/* Disable PTP in HW */ @@ -423,7 +424,7 @@ void qede_ptp_disable(struct qede_dev *edev) edev->ptp = NULL; }
-static int qede_ptp_init(struct qede_dev *edev, bool init_tc) +static int qede_ptp_init(struct qede_dev *edev) { struct qede_ptp *ptp; int rc; @@ -444,25 +445,19 @@ static int qede_ptp_init(struct qede_dev *edev, bool init_tc) /* Init work queue for Tx timestamping */ INIT_WORK(&ptp->work, qede_ptp_task);
- /* Init cyclecounter and timecounter. This is done only in the first - * load. If done in every load, PTP application will fail when doing - * unload / load (e.g. MTU change) while it is running. - */ - if (init_tc) { - memset(&ptp->cc, 0, sizeof(ptp->cc)); - ptp->cc.read = qede_ptp_read_cc; - ptp->cc.mask = CYCLECOUNTER_MASK(64); - ptp->cc.shift = 0; - ptp->cc.mult = 1; - - timecounter_init(&ptp->tc, &ptp->cc, - ktime_to_ns(ktime_get_real())); - } + /* Init cyclecounter and timecounter */ + memset(&ptp->cc, 0, sizeof(ptp->cc)); + ptp->cc.read = qede_ptp_read_cc; + ptp->cc.mask = CYCLECOUNTER_MASK(64); + ptp->cc.shift = 0; + ptp->cc.mult = 1;
- return rc; + timecounter_init(&ptp->tc, &ptp->cc, ktime_to_ns(ktime_get_real())); + + return 0; }
-int qede_ptp_enable(struct qede_dev *edev, bool init_tc) +int qede_ptp_enable(struct qede_dev *edev) { struct qede_ptp *ptp; int rc; @@ -483,7 +478,7 @@ int qede_ptp_enable(struct qede_dev *edev, bool init_tc)
edev->ptp = ptp;
- rc = qede_ptp_init(edev, init_tc); + rc = qede_ptp_init(edev); if (rc) goto err1;
diff --git a/drivers/net/ethernet/qlogic/qede/qede_ptp.h b/drivers/net/ethernet/qlogic/qede/qede_ptp.h index 691a14c4b2c5a..89c7f3cf3ee28 100644 --- a/drivers/net/ethernet/qlogic/qede/qede_ptp.h +++ b/drivers/net/ethernet/qlogic/qede/qede_ptp.h @@ -41,7 +41,7 @@ void qede_ptp_rx_ts(struct qede_dev *edev, struct sk_buff *skb); void qede_ptp_tx_ts(struct qede_dev *edev, struct sk_buff *skb); int qede_ptp_hw_ts(struct qede_dev *edev, struct ifreq *req); void qede_ptp_disable(struct qede_dev *edev); -int qede_ptp_enable(struct qede_dev *edev, bool init_tc); +int qede_ptp_enable(struct qede_dev *edev); int qede_ptp_get_ts_info(struct qede_dev *edev, struct ethtool_ts_info *ts);
static inline void qede_ptp_record_rx_ts(struct qede_dev *edev,
From: Alexander Lobakin alobakin@marvell.com
[ Upstream commit ec6c80590bde6b5dfa4970fffa3572f1acd313ca ]
Set edev->cdev pointer to NULL after calling remove() callback to avoid using of already freed object.
Fixes: ccc67ef50b90 ("qede: Error recovery process") Signed-off-by: Alexander Lobakin alobakin@marvell.com Signed-off-by: Igor Russkikh irusskikh@marvell.com Signed-off-by: Michal Kalderon michal.kalderon@marvell.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/qlogic/qede/qede_main.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c b/drivers/net/ethernet/qlogic/qede/qede_main.c index 51bb5866a212c..26eb58e7e0765 100644 --- a/drivers/net/ethernet/qlogic/qede/qede_main.c +++ b/drivers/net/ethernet/qlogic/qede/qede_main.c @@ -1247,6 +1247,7 @@ static void __qede_remove(struct pci_dev *pdev, enum qede_remove_mode mode) if (system_state == SYSTEM_POWER_OFF) return; qed_ops->common->remove(cdev); + edev->cdev = NULL;
/* Since this can happen out-of-sync with other flows, * don't release the netdevice until after slowpath stop
From: Alexander Lobakin alobakin@marvell.com
[ Upstream commit c221dd1831732449f844e03631a34fd21c2b9429 ]
Sizes of all ILT blocks must be reset before ILT recomputing when disabling clients, or memory allocation may exceed ILT shadow array and provoke system crashes.
Fixes: 1408cc1fa48c ("qed: Introduce VFs") Signed-off-by: Alexander Lobakin alobakin@marvell.com Signed-off-by: Igor Russkikh irusskikh@marvell.com Signed-off-by: Michal Kalderon michal.kalderon@marvell.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/qlogic/qed/qed_cxt.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_cxt.c b/drivers/net/ethernet/qlogic/qed/qed_cxt.c index 1880aa1d3bb59..aeed8939f410a 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_cxt.c +++ b/drivers/net/ethernet/qlogic/qed/qed_cxt.c @@ -442,6 +442,20 @@ static struct qed_ilt_cli_blk *qed_cxt_set_blk(struct qed_ilt_cli_blk *p_blk) return p_blk; }
+static void qed_cxt_ilt_blk_reset(struct qed_hwfn *p_hwfn) +{ + struct qed_ilt_client_cfg *clients = p_hwfn->p_cxt_mngr->clients; + u32 cli_idx, blk_idx; + + for (cli_idx = 0; cli_idx < MAX_ILT_CLIENTS; cli_idx++) { + for (blk_idx = 0; blk_idx < ILT_CLI_PF_BLOCKS; blk_idx++) + clients[cli_idx].pf_blks[blk_idx].total_size = 0; + + for (blk_idx = 0; blk_idx < ILT_CLI_VF_BLOCKS; blk_idx++) + clients[cli_idx].vf_blks[blk_idx].total_size = 0; + } +} + int qed_cxt_cfg_ilt_compute(struct qed_hwfn *p_hwfn, u32 *line_count) { struct qed_cxt_mngr *p_mngr = p_hwfn->p_cxt_mngr; @@ -461,6 +475,11 @@ int qed_cxt_cfg_ilt_compute(struct qed_hwfn *p_hwfn, u32 *line_count)
p_mngr->pf_start_line = RESC_START(p_hwfn, QED_ILT);
+ /* Reset all ILT blocks at the beginning of ILT computing in order + * to prevent memory allocation for irrelevant blocks afterwards. + */ + qed_cxt_ilt_blk_reset(p_hwfn); + DP_VERBOSE(p_hwfn, QED_MSG_ILT, "hwfn [%d] - Set context manager starting line to be 0x%08x\n", p_hwfn->my_id, p_hwfn->p_cxt_mngr->pf_start_line);
From: Rahul Lakkireddy rahul.lakkireddy@chelsio.com
[ Upstream commit 11d8cd5c9f3b46f397f889cefdb66795518aaebd ]
Move code handling L2T ARP failures to the only caller.
Fixes following sparse warning: skbuff.h:2091:29: warning: context imbalance in 'handle_failed_resolution' - unexpected unlock
Fixes: 749cb5fe48bb ("cxgb4: Replace arpq_head/arpq_tail with SKB double link-list code") Signed-off-by: Rahul Lakkireddy rahul.lakkireddy@chelsio.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/chelsio/cxgb4/l2t.c | 52 +++++++++++------------- 1 file changed, 24 insertions(+), 28 deletions(-)
diff --git a/drivers/net/ethernet/chelsio/cxgb4/l2t.c b/drivers/net/ethernet/chelsio/cxgb4/l2t.c index 72b37a66c7d88..0ed20a9cca144 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/l2t.c +++ b/drivers/net/ethernet/chelsio/cxgb4/l2t.c @@ -502,41 +502,20 @@ u64 cxgb4_select_ntuple(struct net_device *dev, } EXPORT_SYMBOL(cxgb4_select_ntuple);
-/* - * Called when address resolution fails for an L2T entry to handle packets - * on the arpq head. If a packet specifies a failure handler it is invoked, - * otherwise the packet is sent to the device. - */ -static void handle_failed_resolution(struct adapter *adap, struct l2t_entry *e) -{ - struct sk_buff *skb; - - while ((skb = __skb_dequeue(&e->arpq)) != NULL) { - const struct l2t_skb_cb *cb = L2T_SKB_CB(skb); - - spin_unlock(&e->lock); - if (cb->arp_err_handler) - cb->arp_err_handler(cb->handle, skb); - else - t4_ofld_send(adap, skb); - spin_lock(&e->lock); - } -} - /* * Called when the host's neighbor layer makes a change to some entry that is * loaded into the HW L2 table. */ void t4_l2t_update(struct adapter *adap, struct neighbour *neigh) { - struct l2t_entry *e; - struct sk_buff_head *arpq = NULL; - struct l2t_data *d = adap->l2t; unsigned int addr_len = neigh->tbl->key_len; u32 *addr = (u32 *) neigh->primary_key; - int ifidx = neigh->dev->ifindex; - int hash = addr_hash(d, addr, addr_len, ifidx); + int hash, ifidx = neigh->dev->ifindex; + struct sk_buff_head *arpq = NULL; + struct l2t_data *d = adap->l2t; + struct l2t_entry *e;
+ hash = addr_hash(d, addr, addr_len, ifidx); read_lock_bh(&d->lock); for (e = d->l2tab[hash].first; e; e = e->next) if (!addreq(e, addr) && e->ifindex == ifidx) { @@ -569,8 +548,25 @@ void t4_l2t_update(struct adapter *adap, struct neighbour *neigh) write_l2e(adap, e, 0); }
- if (arpq) - handle_failed_resolution(adap, e); + if (arpq) { + struct sk_buff *skb; + + /* Called when address resolution fails for an L2T + * entry to handle packets on the arpq head. If a + * packet specifies a failure handler it is invoked, + * otherwise the packet is sent to the device. + */ + while ((skb = __skb_dequeue(&e->arpq)) != NULL) { + const struct l2t_skb_cb *cb = L2T_SKB_CB(skb); + + spin_unlock(&e->lock); + if (cb->arp_err_handler) + cb->arp_err_handler(cb->handle, skb); + else + t4_ofld_send(adap, skb); + spin_lock(&e->lock); + } + } spin_unlock_bh(&e->lock); }
From: Rahul Lakkireddy rahul.lakkireddy@chelsio.com
[ Upstream commit 030c98824deaba205787af501a074b3d7f46e32e ]
Check for whether PTP is enabled or not at the caller and perform locking/unlocking at the caller.
Fixes following sparse warning: sge.c:1641:26: warning: context imbalance in 'cxgb4_eth_xmit' - different lock contexts for basic block
Fixes: a456950445a0 ("cxgb4: time stamping interface for PTP") Signed-off-by: Rahul Lakkireddy rahul.lakkireddy@chelsio.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/chelsio/cxgb4/sge.c | 23 +++++++++++------------ 1 file changed, 11 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/chelsio/cxgb4/sge.c b/drivers/net/ethernet/chelsio/cxgb4/sge.c index 6516c45864b35..db8106d9d6edf 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/sge.c +++ b/drivers/net/ethernet/chelsio/cxgb4/sge.c @@ -1425,12 +1425,10 @@ static netdev_tx_t cxgb4_eth_xmit(struct sk_buff *skb, struct net_device *dev)
qidx = skb_get_queue_mapping(skb); if (ptp_enabled) { - spin_lock(&adap->ptp_lock); if (!(adap->ptp_tx_skb)) { skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS; adap->ptp_tx_skb = skb_get(skb); } else { - spin_unlock(&adap->ptp_lock); goto out_free; } q = &adap->sge.ptptxq; @@ -1444,11 +1442,8 @@ static netdev_tx_t cxgb4_eth_xmit(struct sk_buff *skb, struct net_device *dev)
#ifdef CONFIG_CHELSIO_T4_FCOE ret = cxgb_fcoe_offload(skb, adap, pi, &cntrl); - if (unlikely(ret == -ENOTSUPP)) { - if (ptp_enabled) - spin_unlock(&adap->ptp_lock); + if (unlikely(ret == -EOPNOTSUPP)) goto out_free; - } #endif /* CONFIG_CHELSIO_T4_FCOE */
chip_ver = CHELSIO_CHIP_VERSION(adap->params.chip); @@ -1461,8 +1456,6 @@ static netdev_tx_t cxgb4_eth_xmit(struct sk_buff *skb, struct net_device *dev) dev_err(adap->pdev_dev, "%s: Tx ring %u full while queue awake!\n", dev->name, qidx); - if (ptp_enabled) - spin_unlock(&adap->ptp_lock); return NETDEV_TX_BUSY; }
@@ -1481,8 +1474,6 @@ static netdev_tx_t cxgb4_eth_xmit(struct sk_buff *skb, struct net_device *dev) unlikely(cxgb4_map_skb(adap->pdev_dev, skb, sgl_sdesc->addr) < 0)) { memset(sgl_sdesc->addr, 0, sizeof(sgl_sdesc->addr)); q->mapping_err++; - if (ptp_enabled) - spin_unlock(&adap->ptp_lock); goto out_free; }
@@ -1630,8 +1621,6 @@ static netdev_tx_t cxgb4_eth_xmit(struct sk_buff *skb, struct net_device *dev) txq_advance(&q->q, ndesc);
cxgb4_ring_tx_db(adap, &q->q, ndesc); - if (ptp_enabled) - spin_unlock(&adap->ptp_lock); return NETDEV_TX_OK;
out_free: @@ -2365,6 +2354,16 @@ netdev_tx_t t4_start_xmit(struct sk_buff *skb, struct net_device *dev) if (unlikely(qid >= pi->nqsets)) return cxgb4_ethofld_xmit(skb, dev);
+ if (is_ptp_enabled(skb, dev)) { + struct adapter *adap = netdev2adap(dev); + netdev_tx_t ret; + + spin_lock(&adap->ptp_lock); + ret = cxgb4_eth_xmit(skb, dev); + spin_unlock(&adap->ptp_lock); + return ret; + } + return cxgb4_eth_xmit(skb, dev); }
From: yu kuai yukuai3@huawei.com
[ Upstream commit 586745f1598ccf71b0a5a6df2222dee0a865954e ]
if of_find_device_by_node() succeed, imx_suspend_alloc_ocram() doesn't have a corresponding put_device(). Thus add a jump target to fix the exception handling for this function implementation.
Fixes: 1579c7b9fe01 ("ARM: imx53: Set DDR pins to high impedance when in suspend to RAM.") Signed-off-by: yu kuai yukuai3@huawei.com Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/mach-imx/pm-imx5.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/arm/mach-imx/pm-imx5.c b/arch/arm/mach-imx/pm-imx5.c index f057df813f83a..e9962b48e30cb 100644 --- a/arch/arm/mach-imx/pm-imx5.c +++ b/arch/arm/mach-imx/pm-imx5.c @@ -295,14 +295,14 @@ static int __init imx_suspend_alloc_ocram( if (!ocram_pool) { pr_warn("%s: ocram pool unavailable!\n", __func__); ret = -ENODEV; - goto put_node; + goto put_device; }
ocram_base = gen_pool_alloc(ocram_pool, size); if (!ocram_base) { pr_warn("%s: unable to alloc ocram!\n", __func__); ret = -ENOMEM; - goto put_node; + goto put_device; }
phys = gen_pool_virt_to_phys(ocram_pool, ocram_base); @@ -312,6 +312,8 @@ static int __init imx_suspend_alloc_ocram( if (virt_out) *virt_out = virt;
+put_device: + put_device(&pdev->dev); put_node: of_node_put(node);
From: SeongJae Park sjpark@amazon.de
[ Upstream commit 46da547e21d6cefceec3fb3dba5ebbca056627fc ]
Commit cdb42becdd40 ("scsi: lpfc: Replace io_channels for nvme and fcp with general hdw_queues per cpu") has introduced static checker warnings for potential null dereferences in 'lpfc_sli4_hba_unset()' and commit 1ffdd2c0440d ("scsi: lpfc: resolve static checker warning in lpfc_sli4_hba_unset") has tried to fix it. However, yet another potential null dereference is remaining. This commit fixes it.
This bug was discovered and resolved using Coverity Static Analysis Security Testing (SAST) by Synopsys, Inc.
Link: https://lore.kernel.org/r/20200623084122.30633-1-sjpark@amazon.com Fixes: 1ffdd2c0440d ("scsi: lpfc: resolve static checker warning inlpfc_sli4_hba_unset") Fixes: cdb42becdd40 ("scsi: lpfc: Replace io_channels for nvme and fcp with general hdw_queues per cpu") Reviewed-by: James Smart james.smart@broadcom.com Signed-off-by: SeongJae Park sjpark@amazon.de Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/lpfc/lpfc_init.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c index 4104bdcdbb6fd..70be1f5de8730 100644 --- a/drivers/scsi/lpfc/lpfc_init.c +++ b/drivers/scsi/lpfc/lpfc_init.c @@ -11895,7 +11895,8 @@ lpfc_sli4_hba_unset(struct lpfc_hba *phba) lpfc_sli4_xri_exchange_busy_wait(phba);
/* per-phba callback de-registration for hotplug event */ - lpfc_cpuhp_remove(phba); + if (phba->pport) + lpfc_cpuhp_remove(phba);
/* Disable PCI subsystem interrupt */ lpfc_sli4_disable_intr(phba);
From: Dan Carpenter dan.carpenter@oracle.com
[ Upstream commit e55f3c37cb8d31c7e301f46396b2ac6a19eb3a7c ]
If this is in "transceiver" mode the the ->qwork isn't required and is a NULL pointer. This can lead to a NULL dereference when we call destroy_workqueue(udc->qwork).
Fixes: 3517c31a8ece ("usb: gadget: mv_udc: use devm_xxx for probe") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Felipe Balbi balbi@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/gadget/udc/mv_udc_core.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/gadget/udc/mv_udc_core.c b/drivers/usb/gadget/udc/mv_udc_core.c index cafde053788bb..80a1b52c656e0 100644 --- a/drivers/usb/gadget/udc/mv_udc_core.c +++ b/drivers/usb/gadget/udc/mv_udc_core.c @@ -2313,7 +2313,8 @@ static int mv_udc_probe(struct platform_device *pdev) return 0;
err_create_workqueue: - destroy_workqueue(udc->qwork); + if (udc->qwork) + destroy_workqueue(udc->qwork); err_destroy_dma: dma_pool_destroy(udc->dtd_pool); err_free_dma:
From: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com
[ Upstream commit ea0efd687b01355cd799c8643d0c636ba4859ffc ]
This driver assumed that dmaengine_tx_status() could return the residue even if the transfer was completed. However, this was not correct usage [1] and this caused to break getting the residue after the commit 24461d9792c2 ("dmaengine: virt-dma: Fix access after free in vchan_complete()") actually. So, this is possible to get wrong received size if the usb controller gets a short packet. For example, g_zero driver causes "bad OUT byte" errors.
The usb-dmac driver will support the callback_result, so this driver can use it to get residue correctly. Note that even if the usb-dmac driver has not supported the callback_result yet, this patch doesn't cause any side-effects.
[1] https://lore.kernel.org/dmaengine/20200616165550.GP2324254@vkoul-mobl/
Reported-by: Hien Dang hien.dang.eb@renesas.com Fixes: 24461d9792c2 ("dmaengine: virt-dma: Fix access after free in vchan_complete()") Signed-off-by: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com Link: https://lore.kernel.org/r/1592482277-19563-1-git-send-email-yoshihiro.shimod... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/renesas_usbhs/fifo.c | 23 +++++++++++++---------- drivers/usb/renesas_usbhs/fifo.h | 2 +- 2 files changed, 14 insertions(+), 11 deletions(-)
diff --git a/drivers/usb/renesas_usbhs/fifo.c b/drivers/usb/renesas_usbhs/fifo.c index 01c6a48c41bc9..ac9a81ae82164 100644 --- a/drivers/usb/renesas_usbhs/fifo.c +++ b/drivers/usb/renesas_usbhs/fifo.c @@ -803,7 +803,8 @@ static int __usbhsf_dma_map_ctrl(struct usbhs_pkt *pkt, int map) return info->dma_map_ctrl(chan->device->dev, pkt, map); }
-static void usbhsf_dma_complete(void *arg); +static void usbhsf_dma_complete(void *arg, + const struct dmaengine_result *result); static void usbhsf_dma_xfer_preparing(struct usbhs_pkt *pkt) { struct usbhs_pipe *pipe = pkt->pipe; @@ -813,6 +814,7 @@ static void usbhsf_dma_xfer_preparing(struct usbhs_pkt *pkt) struct dma_chan *chan; struct device *dev = usbhs_priv_to_dev(priv); enum dma_transfer_direction dir; + dma_cookie_t cookie;
fifo = usbhs_pipe_to_fifo(pipe); if (!fifo) @@ -827,11 +829,11 @@ static void usbhsf_dma_xfer_preparing(struct usbhs_pkt *pkt) if (!desc) return;
- desc->callback = usbhsf_dma_complete; - desc->callback_param = pipe; + desc->callback_result = usbhsf_dma_complete; + desc->callback_param = pkt;
- pkt->cookie = dmaengine_submit(desc); - if (pkt->cookie < 0) { + cookie = dmaengine_submit(desc); + if (cookie < 0) { dev_err(dev, "Failed to submit dma descriptor\n"); return; } @@ -1152,12 +1154,10 @@ static size_t usbhs_dma_calc_received_size(struct usbhs_pkt *pkt, struct dma_chan *chan, int dtln) { struct usbhs_pipe *pipe = pkt->pipe; - struct dma_tx_state state; size_t received_size; int maxp = usbhs_pipe_get_maxpacket(pipe);
- dmaengine_tx_status(chan, pkt->cookie, &state); - received_size = pkt->length - state.residue; + received_size = pkt->length - pkt->dma_result->residue;
if (dtln) { received_size -= USBHS_USB_DMAC_XFER_SIZE; @@ -1363,13 +1363,16 @@ static int usbhsf_irq_ready(struct usbhs_priv *priv, return 0; }
-static void usbhsf_dma_complete(void *arg) +static void usbhsf_dma_complete(void *arg, + const struct dmaengine_result *result) { - struct usbhs_pipe *pipe = arg; + struct usbhs_pkt *pkt = arg; + struct usbhs_pipe *pipe = pkt->pipe; struct usbhs_priv *priv = usbhs_pipe_to_priv(pipe); struct device *dev = usbhs_priv_to_dev(priv); int ret;
+ pkt->dma_result = result; ret = usbhsf_pkt_handler(pipe, USBHSF_PKT_DMA_DONE); if (ret < 0) dev_err(dev, "dma_complete run_error %d : %d\n", diff --git a/drivers/usb/renesas_usbhs/fifo.h b/drivers/usb/renesas_usbhs/fifo.h index c3d3cc35cee0f..4a7dc23ce3d35 100644 --- a/drivers/usb/renesas_usbhs/fifo.h +++ b/drivers/usb/renesas_usbhs/fifo.h @@ -50,7 +50,7 @@ struct usbhs_pkt { struct usbhs_pkt *pkt); struct work_struct work; dma_addr_t dma; - dma_cookie_t cookie; + const struct dmaengine_result *dma_result; void *buf; int length; int trans;
From: Keith Busch kbusch@kernel.org
[ Upstream commit b2ce4d90690bd29ce5b554e203cd03682dd59697 ]
The queues' backing device info capabilities don't change with each namespace revalidation. Set it only when each path's request_queue is initially added to a multipath queue.
Signed-off-by: Keith Busch kbusch@kernel.org Reviewed-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 7 ------- drivers/nvme/host/multipath.c | 8 ++++++++ 2 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 7b4cbe2c69541..887139f8fa53b 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -1910,13 +1910,6 @@ static void __nvme_revalidate_disk(struct gendisk *disk, struct nvme_id_ns *id) if (ns->head->disk) { nvme_update_disk_info(ns->head->disk, ns, id); blk_queue_stack_limits(ns->head->disk->queue, ns->queue); - if (bdi_cap_stable_pages_required(ns->queue->backing_dev_info)) { - struct backing_dev_info *info = - ns->head->disk->queue->backing_dev_info; - - info->capabilities |= BDI_CAP_STABLE_WRITES; - } - revalidate_disk(ns->head->disk); } #endif diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c index 54603bd3e02de..9f2844935fdfa 100644 --- a/drivers/nvme/host/multipath.c +++ b/drivers/nvme/host/multipath.c @@ -3,6 +3,7 @@ * Copyright (c) 2017-2018 Christoph Hellwig. */
+#include <linux/backing-dev.h> #include <linux/moduleparam.h> #include <trace/events/block.h> #include "nvme.h" @@ -666,6 +667,13 @@ void nvme_mpath_add_disk(struct nvme_ns *ns, struct nvme_id_ns *id) nvme_mpath_set_live(ns); mutex_unlock(&ns->head->lock); } + + if (bdi_cap_stable_pages_required(ns->queue->backing_dev_info)) { + struct backing_dev_info *info = + ns->head->disk->queue->backing_dev_info; + + info->capabilities |= BDI_CAP_STABLE_WRITES; + } }
void nvme_mpath_remove_disk(struct nvme_ns_head *head)
From: Sagi Grimberg sagi@grimberg.me
[ Upstream commit 3b4b19721ec652ad2c4fe51dfbe5124212b5f581 ]
Revert fab7772bfbcf ("nvme-multipath: revalidate nvme_ns_head gendisk in nvme_validate_ns")
When adding a new namespace to the head disk (via nvme_mpath_set_live) we will see partition scan which triggers I/O on the mpath device node. This process will usually be triggered from the scan_work which holds the scan_lock. If I/O blocks (if we got ana change currently have only available paths but none are accessible) this can deadlock on the head disk bd_mutex as both partition scan I/O takes it, and head disk revalidation takes it to check for resize (also triggered from scan_work on a different path). See trace [1].
The mpath disk revalidation was originally added to detect online disk size change, but this is no longer needed since commit cb224c3af4df ("nvme: Convert to use set_capacity_revalidate_and_notify") which already updates resize info without unnecessarily revalidating the disk (the mpath disk doesn't even implement .revalidate_disk fop).
[1]: -- kernel: INFO: task kworker/u65:9:494 blocked for more than 241 seconds. kernel: Tainted: G OE 5.3.5-050305-generic #201910071830 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kernel: kworker/u65:9 D 0 494 2 0x80004000 kernel: Workqueue: nvme-wq nvme_scan_work [nvme_core] kernel: Call Trace: kernel: __schedule+0x2b9/0x6c0 kernel: schedule+0x42/0xb0 kernel: schedule_preempt_disabled+0xe/0x10 kernel: __mutex_lock.isra.0+0x182/0x4f0 kernel: __mutex_lock_slowpath+0x13/0x20 kernel: mutex_lock+0x2e/0x40 kernel: revalidate_disk+0x63/0xa0 kernel: __nvme_revalidate_disk+0xfe/0x110 [nvme_core] kernel: nvme_revalidate_disk+0xa4/0x160 [nvme_core] kernel: ? evict+0x14c/0x1b0 kernel: revalidate_disk+0x2b/0xa0 kernel: nvme_validate_ns+0x49/0x940 [nvme_core] kernel: ? blk_mq_free_request+0xd2/0x100 kernel: ? __nvme_submit_sync_cmd+0xbe/0x1e0 [nvme_core] kernel: nvme_scan_work+0x24f/0x380 [nvme_core] kernel: process_one_work+0x1db/0x380 kernel: worker_thread+0x249/0x400 kernel: kthread+0x104/0x140 kernel: ? process_one_work+0x380/0x380 kernel: ? kthread_park+0x80/0x80 kernel: ret_from_fork+0x1f/0x40 ... kernel: INFO: task kworker/u65:1:2630 blocked for more than 241 seconds. kernel: Tainted: G OE 5.3.5-050305-generic #201910071830 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kernel: kworker/u65:1 D 0 2630 2 0x80004000 kernel: Workqueue: nvme-wq nvme_scan_work [nvme_core] kernel: Call Trace: kernel: __schedule+0x2b9/0x6c0 kernel: schedule+0x42/0xb0 kernel: io_schedule+0x16/0x40 kernel: do_read_cache_page+0x438/0x830 kernel: ? __switch_to_asm+0x34/0x70 kernel: ? file_fdatawait_range+0x30/0x30 kernel: read_cache_page+0x12/0x20 kernel: read_dev_sector+0x27/0xc0 kernel: read_lba+0xc1/0x220 kernel: ? kmem_cache_alloc_trace+0x19c/0x230 kernel: efi_partition+0x1e6/0x708 kernel: ? vsnprintf+0x39e/0x4e0 kernel: ? snprintf+0x49/0x60 kernel: check_partition+0x154/0x244 kernel: rescan_partitions+0xae/0x280 kernel: __blkdev_get+0x40f/0x560 kernel: blkdev_get+0x3d/0x140 kernel: __device_add_disk+0x388/0x480 kernel: device_add_disk+0x13/0x20 kernel: nvme_mpath_set_live+0x119/0x140 [nvme_core] kernel: nvme_update_ns_ana_state+0x5c/0x60 [nvme_core] kernel: nvme_set_ns_ana_state+0x1e/0x30 [nvme_core] kernel: nvme_parse_ana_log+0xa1/0x180 [nvme_core] kernel: ? nvme_update_ns_ana_state+0x60/0x60 [nvme_core] kernel: nvme_mpath_add_disk+0x47/0x90 [nvme_core] kernel: nvme_validate_ns+0x396/0x940 [nvme_core] kernel: ? blk_mq_free_request+0xd2/0x100 kernel: nvme_scan_work+0x24f/0x380 [nvme_core] kernel: process_one_work+0x1db/0x380 kernel: worker_thread+0x249/0x400 kernel: kthread+0x104/0x140 kernel: ? process_one_work+0x380/0x380 kernel: ? kthread_park+0x80/0x80 kernel: ret_from_fork+0x1f/0x40 --
Fixes: fab7772bfbcf ("nvme-multipath: revalidate nvme_ns_head gendisk in nvme_validate_ns") Signed-off-by: Anton Eidelman anton@lightbitslabs.com Signed-off-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 887139f8fa53b..85ce6c682849e 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -1910,7 +1910,6 @@ static void __nvme_revalidate_disk(struct gendisk *disk, struct nvme_id_ns *id) if (ns->head->disk) { nvme_update_disk_info(ns->head->disk, ns, id); blk_queue_stack_limits(ns->head->disk->queue, ns->queue); - revalidate_disk(ns->head->disk); } #endif }
From: Anton Eidelman anton@lightbitslabs.com
[ Upstream commit 489dd102a2c7c94d783a35f9412eb085b8da1aa4 ]
When scan_work calls nvme_mpath_add_disk() this holds ana_lock and invokes nvme_parse_ana_log(), which may issue IO in device_add_disk() and hang waiting for an accessible path. While nvme_mpath_set_live() only called when nvme_state_is_live(), a transition may cause NVME_SC_ANA_TRANSITION and requeue the IO.
In order to recover and complete the IO ana_work on the same ctrl should be able to update the path state and remove NVME_NS_ANA_PENDING.
The deadlock occurs because scan_work keeps holding ana_lock, so ana_work hangs [1].
Fix: Now nvme_mpath_add_disk() uses nvme_parse_ana_log() to obtain a copy of the ANA group desc, and then calls nvme_update_ns_ana_state() without holding ana_lock.
[1]: kernel: Workqueue: nvme-wq nvme_scan_work [nvme_core] kernel: Call Trace: kernel: __schedule+0x2b9/0x6c0 kernel: schedule+0x42/0xb0 kernel: io_schedule+0x16/0x40 kernel: do_read_cache_page+0x438/0x830 kernel: read_cache_page+0x12/0x20 kernel: read_dev_sector+0x27/0xc0 kernel: read_lba+0xc1/0x220 kernel: efi_partition+0x1e6/0x708 kernel: check_partition+0x154/0x244 kernel: rescan_partitions+0xae/0x280 kernel: __blkdev_get+0x40f/0x560 kernel: blkdev_get+0x3d/0x140 kernel: __device_add_disk+0x388/0x480 kernel: device_add_disk+0x13/0x20 kernel: nvme_mpath_set_live+0x119/0x140 [nvme_core] kernel: nvme_update_ns_ana_state+0x5c/0x60 [nvme_core] kernel: nvme_set_ns_ana_state+0x1e/0x30 [nvme_core] kernel: nvme_parse_ana_log+0xa1/0x180 [nvme_core] kernel: nvme_mpath_add_disk+0x47/0x90 [nvme_core] kernel: nvme_validate_ns+0x396/0x940 [nvme_core] kernel: nvme_scan_work+0x24f/0x380 [nvme_core] kernel: process_one_work+0x1db/0x380 kernel: worker_thread+0x249/0x400 kernel: kthread+0x104/0x140
kernel: Workqueue: nvme-wq nvme_ana_work [nvme_core] kernel: Call Trace: kernel: __schedule+0x2b9/0x6c0 kernel: schedule+0x42/0xb0 kernel: schedule_preempt_disabled+0xe/0x10 kernel: __mutex_lock.isra.0+0x182/0x4f0 kernel: ? __switch_to_asm+0x34/0x70 kernel: ? select_task_rq_fair+0x1aa/0x5c0 kernel: ? kvm_sched_clock_read+0x11/0x20 kernel: ? sched_clock+0x9/0x10 kernel: __mutex_lock_slowpath+0x13/0x20 kernel: mutex_lock+0x2e/0x40 kernel: nvme_read_ana_log+0x3a/0x100 [nvme_core] kernel: nvme_ana_work+0x15/0x20 [nvme_core] kernel: process_one_work+0x1db/0x380 kernel: worker_thread+0x4d/0x400 kernel: kthread+0x104/0x140 kernel: ? process_one_work+0x380/0x380 kernel: ? kthread_park+0x80/0x80 kernel: ret_from_fork+0x35/0x40
Fixes: 0d0b660f214d ("nvme: add ANA support") Signed-off-by: Anton Eidelman anton@lightbitslabs.com Signed-off-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/multipath.c | 24 ++++++++++++++++-------- 1 file changed, 16 insertions(+), 8 deletions(-)
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c index 9f2844935fdfa..fece4654fa3e7 100644 --- a/drivers/nvme/host/multipath.c +++ b/drivers/nvme/host/multipath.c @@ -641,26 +641,34 @@ static ssize_t ana_state_show(struct device *dev, struct device_attribute *attr, } DEVICE_ATTR_RO(ana_state);
-static int nvme_set_ns_ana_state(struct nvme_ctrl *ctrl, +static int nvme_lookup_ana_group_desc(struct nvme_ctrl *ctrl, struct nvme_ana_group_desc *desc, void *data) { - struct nvme_ns *ns = data; + struct nvme_ana_group_desc *dst = data;
- if (ns->ana_grpid == le32_to_cpu(desc->grpid)) { - nvme_update_ns_ana_state(desc, ns); - return -ENXIO; /* just break out of the loop */ - } + if (desc->grpid != dst->grpid) + return 0;
- return 0; + *dst = *desc; + return -ENXIO; /* just break out of the loop */ }
void nvme_mpath_add_disk(struct nvme_ns *ns, struct nvme_id_ns *id) { if (nvme_ctrl_use_ana(ns->ctrl)) { + struct nvme_ana_group_desc desc = { + .grpid = id->anagrpid, + .state = 0, + }; + mutex_lock(&ns->ctrl->ana_lock); ns->ana_grpid = le32_to_cpu(id->anagrpid); - nvme_parse_ana_log(ns->ctrl, ns, nvme_set_ns_ana_state); + nvme_parse_ana_log(ns->ctrl, &desc, nvme_lookup_ana_group_desc); mutex_unlock(&ns->ctrl->ana_lock); + if (desc.state) { + /* found the group desc: update */ + nvme_update_ns_ana_state(&desc, ns); + } } else { mutex_lock(&ns->head->lock); ns->ana_state = NVME_ANA_OPTIMIZED;
From: Sagi Grimberg sagi@grimberg.me
[ Upstream commit e164471dcf19308d154adb69e7760d8ba426a77f ]
Right now ns->head->lock is protecting namespace mutation which is wrong and unneeded. Move it to only protect against head mutations. While we're at it, remove unnecessary ns->head reference as we already have head pointer.
The problem with this is that the head->lock spans mpath disk node I/O that may block under some conditions (if for example the controller is disconnecting or the path became inaccessible), The locking scheme does not allow any other path to enable itself, preventing blocked I/O to complete and forward-progress from there.
This is a preparation patch for the fix in a subsequent patch where the disk I/O will also be done outside the head->lock.
Fixes: 0d0b660f214d ("nvme: add ANA support") Signed-off-by: Anton Eidelman anton@lightbitslabs.com Signed-off-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/multipath.c | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-)
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c index fece4654fa3e7..f4287d8550a9f 100644 --- a/drivers/nvme/host/multipath.c +++ b/drivers/nvme/host/multipath.c @@ -410,11 +410,10 @@ static void nvme_mpath_set_live(struct nvme_ns *ns) { struct nvme_ns_head *head = ns->head;
- lockdep_assert_held(&ns->head->lock); - if (!head->disk) return;
+ mutex_lock(&head->lock); if (!(head->disk->flags & GENHD_FL_UP)) device_add_disk(&head->subsys->dev, head->disk, nvme_ns_id_attr_groups); @@ -427,9 +426,10 @@ static void nvme_mpath_set_live(struct nvme_ns *ns) __nvme_find_path(head, node); srcu_read_unlock(&head->srcu, srcu_idx); } + mutex_unlock(&head->lock);
- synchronize_srcu(&ns->head->srcu); - kblockd_schedule_work(&ns->head->requeue_work); + synchronize_srcu(&head->srcu); + kblockd_schedule_work(&head->requeue_work); }
static int nvme_parse_ana_log(struct nvme_ctrl *ctrl, void *data, @@ -484,14 +484,12 @@ static inline bool nvme_state_is_live(enum nvme_ana_state state) static void nvme_update_ns_ana_state(struct nvme_ana_group_desc *desc, struct nvme_ns *ns) { - mutex_lock(&ns->head->lock); ns->ana_grpid = le32_to_cpu(desc->grpid); ns->ana_state = desc->state; clear_bit(NVME_NS_ANA_PENDING, &ns->flags);
if (nvme_state_is_live(ns->ana_state)) nvme_mpath_set_live(ns); - mutex_unlock(&ns->head->lock); }
static int nvme_update_ana_state(struct nvme_ctrl *ctrl, @@ -670,10 +668,8 @@ void nvme_mpath_add_disk(struct nvme_ns *ns, struct nvme_id_ns *id) nvme_update_ns_ana_state(&desc, ns); } } else { - mutex_lock(&ns->head->lock); ns->ana_state = NVME_ANA_OPTIMIZED; nvme_mpath_set_live(ns); - mutex_unlock(&ns->head->lock); }
if (bdi_cap_stable_pages_required(ns->queue->backing_dev_info)) {
From: Anton Eidelman anton@lightbitslabs.com
[ Upstream commit d8a22f85609fadb46ba699e0136cc3ebdeebff79 ]
In the following scenario scan_work and ana_work will deadlock:
When scan_work calls nvme_mpath_add_disk() this holds ana_lock and invokes nvme_parse_ana_log(), which may issue IO in device_add_disk() and hang waiting for an accessible path.
While nvme_mpath_set_live() only called when nvme_state_is_live(), a transition may cause NVME_SC_ANA_TRANSITION and requeue the IO.
Since nvme_mpath_set_live() holds ns->head->lock, an ana_work on ANY ctrl will not be able to complete nvme_mpath_set_live() on the same ns->head, which is required in order to update the new accessible path and remove NVME_NS_ANA_PENDING.. Therefore IO never completes: deadlock [1].
Fix: Move device_add_disk out of the head->lock and protect it with an atomic test_and_set for a new NVME_NS_HEAD_HAS_DISK bit.
[1]: kernel: INFO: task kworker/u8:2:160 blocked for more than 120 seconds. kernel: Tainted: G OE 5.3.5-050305-generic #201910071830 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kernel: kworker/u8:2 D 0 160 2 0x80004000 kernel: Workqueue: nvme-wq nvme_ana_work [nvme_core] kernel: Call Trace: kernel: __schedule+0x2b9/0x6c0 kernel: schedule+0x42/0xb0 kernel: schedule_preempt_disabled+0xe/0x10 kernel: __mutex_lock.isra.0+0x182/0x4f0 kernel: __mutex_lock_slowpath+0x13/0x20 kernel: mutex_lock+0x2e/0x40 kernel: nvme_update_ns_ana_state+0x22/0x60 [nvme_core] kernel: nvme_update_ana_state+0xca/0xe0 [nvme_core] kernel: nvme_parse_ana_log+0xa1/0x180 [nvme_core] kernel: nvme_read_ana_log+0x76/0x100 [nvme_core] kernel: nvme_ana_work+0x15/0x20 [nvme_core] kernel: process_one_work+0x1db/0x380 kernel: worker_thread+0x4d/0x400 kernel: kthread+0x104/0x140 kernel: ret_from_fork+0x35/0x40 kernel: INFO: task kworker/u8:4:439 blocked for more than 120 seconds. kernel: Tainted: G OE 5.3.5-050305-generic #201910071830 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kernel: kworker/u8:4 D 0 439 2 0x80004000 kernel: Workqueue: nvme-wq nvme_scan_work [nvme_core] kernel: Call Trace: kernel: __schedule+0x2b9/0x6c0 kernel: schedule+0x42/0xb0 kernel: io_schedule+0x16/0x40 kernel: do_read_cache_page+0x438/0x830 kernel: read_cache_page+0x12/0x20 kernel: read_dev_sector+0x27/0xc0 kernel: read_lba+0xc1/0x220 kernel: efi_partition+0x1e6/0x708 kernel: check_partition+0x154/0x244 kernel: rescan_partitions+0xae/0x280 kernel: __blkdev_get+0x40f/0x560 kernel: blkdev_get+0x3d/0x140 kernel: __device_add_disk+0x388/0x480 kernel: device_add_disk+0x13/0x20 kernel: nvme_mpath_set_live+0x119/0x140 [nvme_core] kernel: nvme_update_ns_ana_state+0x5c/0x60 [nvme_core] kernel: nvme_mpath_add_disk+0xbe/0x100 [nvme_core] kernel: nvme_validate_ns+0x396/0x940 [nvme_core] kernel: nvme_scan_work+0x256/0x390 [nvme_core] kernel: process_one_work+0x1db/0x380 kernel: worker_thread+0x4d/0x400 kernel: kthread+0x104/0x140 kernel: ret_from_fork+0x35/0x40
Fixes: 0d0b660f214d ("nvme: add ANA support") Signed-off-by: Anton Eidelman anton@lightbitslabs.com Signed-off-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/multipath.c | 4 ++-- drivers/nvme/host/nvme.h | 2 ++ 2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c index f4287d8550a9f..d1cb65698288b 100644 --- a/drivers/nvme/host/multipath.c +++ b/drivers/nvme/host/multipath.c @@ -413,11 +413,11 @@ static void nvme_mpath_set_live(struct nvme_ns *ns) if (!head->disk) return;
- mutex_lock(&head->lock); - if (!(head->disk->flags & GENHD_FL_UP)) + if (!test_and_set_bit(NVME_NSHEAD_DISK_LIVE, &head->flags)) device_add_disk(&head->subsys->dev, head->disk, nvme_ns_id_attr_groups);
+ mutex_lock(&head->lock); if (nvme_path_is_optimized(ns)) { int node, srcu_idx;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h index 2e04a36296d95..719342600be62 100644 --- a/drivers/nvme/host/nvme.h +++ b/drivers/nvme/host/nvme.h @@ -359,6 +359,8 @@ struct nvme_ns_head { spinlock_t requeue_lock; struct work_struct requeue_work; struct mutex lock; + unsigned long flags; +#define NVME_NSHEAD_DISK_LIVE 0 struct nvme_ns __rcu *current_path[]; #endif };
From: Colin Ian King colin.king@canonical.com
[ Upstream commit a51243860893615b457b149da1ef5df0a4f1ffc2 ]
The error DBG_STATUS_NO_MATCHING_FRAMING_MODE was added to the enum enum dbg_status however there is a missing corresponding entry for this in the array s_status_str. This causes an out-of-bounds read when indexing into the last entry of s_status_str. Fix this by adding in the missing entry.
Addresses-Coverity: ("Out-of-bounds read"). Fixes: 2d22bc8354b1 ("qed: FW 8.42.2.0 debug features") Signed-off-by: Colin Ian King colin.king@canonical.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/qlogic/qed/qed_debug.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_debug.c b/drivers/net/ethernet/qlogic/qed/qed_debug.c index f4eebaabb6d0d..3e56b6056b477 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_debug.c +++ b/drivers/net/ethernet/qlogic/qed/qed_debug.c @@ -5568,7 +5568,8 @@ static const char * const s_status_str[] = {
/* DBG_STATUS_INVALID_FILTER_TRIGGER_DWORDS */ "The filter/trigger constraint dword offsets are not enabled for recording", - + /* DBG_STATUS_NO_MATCHING_FRAMING_MODE */ + "No matching framing mode",
/* DBG_STATUS_VFC_READ_ERROR */ "Error reading from VFC",
From: Russell King rmk+kernel@armlinux.org.uk
[ Upstream commit 715028460082d07a7ec6fcd87b14b46784346a72 ]
When using ip_set with counters and comment, traffic causes the kernel to panic on 32-bit ARM:
Alignment trap: not handling instruction e1b82f9f at [<bf01b0dc>] Unhandled fault: alignment exception (0x221) at 0xea08133c PC is at ip_set_match_extensions+0xe0/0x224 [ip_set]
The problem occurs when we try to update the 64-bit counters - the faulting address above is not 64-bit aligned. The problem occurs due to the way elements are allocated, for example:
set->dsize = ip_set_elem_len(set, tb, 0, 0); map = ip_set_alloc(sizeof(*map) + elements * set->dsize);
If the element has a requirement for a member to be 64-bit aligned, and set->dsize is not a multiple of 8, but is a multiple of four, then every odd numbered elements will be misaligned - and hitting an atomic64_add() on that element will cause the kernel to panic.
ip_set_elem_len() must return a size that is rounded to the maximum alignment of any extension field stored in the element. This change ensures that is the case.
Fixes: 95ad1f4a9358 ("netfilter: ipset: Fix extension alignment") Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Acked-by: Jozsef Kadlecsik kadlec@netfilter.org Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/ipset/ip_set_core.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c index 340cb955af25c..56621d6bfd297 100644 --- a/net/netfilter/ipset/ip_set_core.c +++ b/net/netfilter/ipset/ip_set_core.c @@ -460,6 +460,8 @@ ip_set_elem_len(struct ip_set *set, struct nlattr *tb[], size_t len, for (id = 0; id < IPSET_EXT_ID_MAX; id++) { if (!add_extension(id, cadt_flags, tb)) continue; + if (align < ip_set_extensions[id].align) + align = ip_set_extensions[id].align; len = ALIGN(len, ip_set_extensions[id].align); set->offset[id] = len; set->extensions |= ip_set_extensions[id].type;
From: Doug Berger opendmb@gmail.com
[ Upstream commit 20d1f2d1b024f6be199a3bedf1578a1d21592bc5 ]
When commit 474ea9cafc45 ("net: bcmgenet: correctly pad short packets") added the call to skb_padto() it should have been located before the nr_frags parameter was read since that value could be changed when padding packets with lengths between 55 and 59 bytes (inclusive).
The use of a stale nr_frags value can cause corruption of the pad data when tx-scatter-gather is enabled. This corruption of the pad can cause invalid checksum computation when hardware offload of tx-checksum is also enabled.
Since the original reason for the padding was corrected by commit 7dd399130efb ("net: bcmgenet: fix skb_len in bcmgenet_xmit_single()") we can remove the software padding all together and make use of hardware padding of short frames as long as the hardware also always appends the FCS value to the frame.
Fixes: 474ea9cafc45 ("net: bcmgenet: correctly pad short packets") Signed-off-by: Doug Berger opendmb@gmail.com Acked-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/genet/bcmgenet.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c index 38bdfd4b46f0f..dde1c23c8e399 100644 --- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c @@ -1520,11 +1520,6 @@ static netdev_tx_t bcmgenet_xmit(struct sk_buff *skb, struct net_device *dev) goto out; }
- if (skb_padto(skb, ETH_ZLEN)) { - ret = NETDEV_TX_OK; - goto out; - } - /* Retain how many bytes will be sent on the wire, without TSB inserted * by transmit checksum offload */ @@ -1571,6 +1566,9 @@ static netdev_tx_t bcmgenet_xmit(struct sk_buff *skb, struct net_device *dev) len_stat = (size << DMA_BUFLENGTH_SHIFT) | (priv->hw_params->qtag_mask << DMA_TX_QTAG_SHIFT);
+ /* Note: if we ever change from DMA_TX_APPEND_CRC below we + * will need to restore software padding of "runt" packets + */ if (!i) { len_stat |= DMA_TX_APPEND_CRC | DMA_SOP; if (skb->ip_summed == CHECKSUM_PARTIAL)
From: Pavel Begunkov asml.silence@gmail.com
[ Upstream commit cd664b0e35cb1202f40c259a1a5ea791d18c879d ]
io_do_iopoll() won't do anything with a request unless req->iopoll_completed is set. So io_complete_rw_iopoll() has to set it, otherwise io_do_iopoll() will poll a file again and again even though the request of interest was completed long time ago.
Also, remove -EAGAIN check from io_issue_sqe() as it races with the changed lines. The request will take the long way and be resubmitted from io_iopoll*().
io_kiocb's result and iopoll_completed")
Fixes: bbde017a32b3 ("io_uring: add memory barrier to synchronize Signed-off-by: Pavel Begunkov asml.silence@gmail.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/io_uring.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c index 1829be7f63a35..4ab1728de247c 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -1942,10 +1942,8 @@ static void io_complete_rw_iopoll(struct kiocb *kiocb, long res, long res2)
WRITE_ONCE(req->result, res); /* order with io_poll_complete() checking ->result */ - if (res != -EAGAIN) { - smp_wmb(); - WRITE_ONCE(req->iopoll_completed, 1); - } + smp_wmb(); + WRITE_ONCE(req->iopoll_completed, 1); }
/* @@ -5425,9 +5423,6 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe, if ((ctx->flags & IORING_SETUP_IOPOLL) && req->file) { const bool in_async = io_wq_current_is_worker();
- if (req->result == -EAGAIN) - return -EAGAIN; - /* workqueue context doesn't hold uring_lock, grab it now */ if (in_async) mutex_lock(&ctx->uring_lock);
From: Vincent Chen vincent.chen@sifive.com
[ Upstream commit d0a5fdf4cc83dabcdea668f971b8a2e916437711 ]
The (struct __prci_data).hw_clks.hws is an array with dynamic elements. Using struct_size(pd, hw_clks.hws, ARRAY_SIZE(__prci_init_clocks)) instead of sizeof(*pd) to get the correct memory size of struct __prci_data for sifive/fu540-prci. After applying this modifications, the kernel runs smoothly with CONFIG_SLAB_FREELIST_RANDOM enabled on the HiFive unleashed board.
Fixes: 30b8e27e3b58 ("clk: sifive: add a driver for the SiFive FU540 PRCI IP block") Signed-off-by: Vincent Chen vincent.chen@sifive.com Signed-off-by: Palmer Dabbelt palmerdabbelt@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/sifive/fu540-prci.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/clk/sifive/fu540-prci.c b/drivers/clk/sifive/fu540-prci.c index 6282ee2f361cd..a8901f90a61ac 100644 --- a/drivers/clk/sifive/fu540-prci.c +++ b/drivers/clk/sifive/fu540-prci.c @@ -586,7 +586,10 @@ static int sifive_fu540_prci_probe(struct platform_device *pdev) struct __prci_data *pd; int r;
- pd = devm_kzalloc(dev, sizeof(*pd), GFP_KERNEL); + pd = devm_kzalloc(dev, + struct_size(pd, hw_clks.hws, + ARRAY_SIZE(__prci_init_clocks)), + GFP_KERNEL); if (!pd) return -ENOMEM;
From: Eddie James eajames@linux.ibm.com
[ Upstream commit 502035e284cc7e9efef22b01771d822d49698ab9 ]
The port number field in the status register was not correct, so fix it.
Fixes: d6ffb6300116 ("i2c: Add FSI-attached I2C master algorithm") Signed-off-by: Eddie James eajames@linux.ibm.com Signed-off-by: Joel Stanley joel@jms.id.au Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-fsi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/i2c/busses/i2c-fsi.c b/drivers/i2c/busses/i2c-fsi.c index e0c256922d4f1..977d6f524649c 100644 --- a/drivers/i2c/busses/i2c-fsi.c +++ b/drivers/i2c/busses/i2c-fsi.c @@ -98,7 +98,7 @@ #define I2C_STAT_DAT_REQ BIT(25) #define I2C_STAT_CMD_COMP BIT(24) #define I2C_STAT_STOP_ERR BIT(23) -#define I2C_STAT_MAX_PORT GENMASK(19, 16) +#define I2C_STAT_MAX_PORT GENMASK(22, 16) #define I2C_STAT_ANY_INT BIT(15) #define I2C_STAT_SCL_IN BIT(11) #define I2C_STAT_SDA_IN BIT(10)
From: Claudiu Beznea claudiu.beznea@microchip.com
[ Upstream commit 33fdef24c9ac20c68b71b363e423fbf9ad2bfc1e ]
DMA buffers were not freed on failure path of at91ether_open(). Along with changes for freeing the DMA buffers the enable/disable interrupt instructions were moved to at91ether_start()/at91ether_stop() functions and the operations on at91ether_stop() were done in their reverse order (compared with how is done in at91ether_start()): before this patch the operation order on interface open path was as follows: 1/ alloc DMA buffers 2/ enable tx, rx 3/ enable interrupts and the order on interface close path was as follows: 1/ disable tx, rx 2/ disable interrupts 3/ free dma buffers.
Fixes: 7897b071ac3b ("net: macb: convert to phylink") Signed-off-by: Claudiu Beznea claudiu.beznea@microchip.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/cadence/macb_main.c | 116 ++++++++++++++--------- 1 file changed, 73 insertions(+), 43 deletions(-)
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c index 5705359a36124..52582e8ed90e5 100644 --- a/drivers/net/ethernet/cadence/macb_main.c +++ b/drivers/net/ethernet/cadence/macb_main.c @@ -3762,15 +3762,9 @@ static int macb_init(struct platform_device *pdev)
static struct sifive_fu540_macb_mgmt *mgmt;
-/* Initialize and start the Receiver and Transmit subsystems */ -static int at91ether_start(struct net_device *dev) +static int at91ether_alloc_coherent(struct macb *lp) { - struct macb *lp = netdev_priv(dev); struct macb_queue *q = &lp->queues[0]; - struct macb_dma_desc *desc; - dma_addr_t addr; - u32 ctl; - int i;
q->rx_ring = dma_alloc_coherent(&lp->pdev->dev, (AT91ETHER_MAX_RX_DESCR * @@ -3792,6 +3786,43 @@ static int at91ether_start(struct net_device *dev) return -ENOMEM; }
+ return 0; +} + +static void at91ether_free_coherent(struct macb *lp) +{ + struct macb_queue *q = &lp->queues[0]; + + if (q->rx_ring) { + dma_free_coherent(&lp->pdev->dev, + AT91ETHER_MAX_RX_DESCR * + macb_dma_desc_get_size(lp), + q->rx_ring, q->rx_ring_dma); + q->rx_ring = NULL; + } + + if (q->rx_buffers) { + dma_free_coherent(&lp->pdev->dev, + AT91ETHER_MAX_RX_DESCR * + AT91ETHER_MAX_RBUFF_SZ, + q->rx_buffers, q->rx_buffers_dma); + q->rx_buffers = NULL; + } +} + +/* Initialize and start the Receiver and Transmit subsystems */ +static int at91ether_start(struct macb *lp) +{ + struct macb_queue *q = &lp->queues[0]; + struct macb_dma_desc *desc; + dma_addr_t addr; + u32 ctl; + int i, ret; + + ret = at91ether_alloc_coherent(lp); + if (ret) + return ret; + addr = q->rx_buffers_dma; for (i = 0; i < AT91ETHER_MAX_RX_DESCR; i++) { desc = macb_rx_desc(q, i); @@ -3813,9 +3844,39 @@ static int at91ether_start(struct net_device *dev) ctl = macb_readl(lp, NCR); macb_writel(lp, NCR, ctl | MACB_BIT(RE) | MACB_BIT(TE));
+ /* Enable MAC interrupts */ + macb_writel(lp, IER, MACB_BIT(RCOMP) | + MACB_BIT(RXUBR) | + MACB_BIT(ISR_TUND) | + MACB_BIT(ISR_RLE) | + MACB_BIT(TCOMP) | + MACB_BIT(ISR_ROVR) | + MACB_BIT(HRESP)); + return 0; }
+static void at91ether_stop(struct macb *lp) +{ + u32 ctl; + + /* Disable MAC interrupts */ + macb_writel(lp, IDR, MACB_BIT(RCOMP) | + MACB_BIT(RXUBR) | + MACB_BIT(ISR_TUND) | + MACB_BIT(ISR_RLE) | + MACB_BIT(TCOMP) | + MACB_BIT(ISR_ROVR) | + MACB_BIT(HRESP)); + + /* Disable Receiver and Transmitter */ + ctl = macb_readl(lp, NCR); + macb_writel(lp, NCR, ctl & ~(MACB_BIT(TE) | MACB_BIT(RE))); + + /* Free resources. */ + at91ether_free_coherent(lp); +} + /* Open the ethernet interface */ static int at91ether_open(struct net_device *dev) { @@ -3835,27 +3896,20 @@ static int at91ether_open(struct net_device *dev)
macb_set_hwaddr(lp);
- ret = at91ether_start(dev); + ret = at91ether_start(lp); if (ret) goto pm_exit;
- /* Enable MAC interrupts */ - macb_writel(lp, IER, MACB_BIT(RCOMP) | - MACB_BIT(RXUBR) | - MACB_BIT(ISR_TUND) | - MACB_BIT(ISR_RLE) | - MACB_BIT(TCOMP) | - MACB_BIT(ISR_ROVR) | - MACB_BIT(HRESP)); - ret = macb_phylink_connect(lp); if (ret) - goto pm_exit; + goto stop;
netif_start_queue(dev);
return 0;
+stop: + at91ether_stop(lp); pm_exit: pm_runtime_put_sync(&lp->pdev->dev); return ret; @@ -3865,37 +3919,13 @@ pm_exit: static int at91ether_close(struct net_device *dev) { struct macb *lp = netdev_priv(dev); - struct macb_queue *q = &lp->queues[0]; - u32 ctl; - - /* Disable Receiver and Transmitter */ - ctl = macb_readl(lp, NCR); - macb_writel(lp, NCR, ctl & ~(MACB_BIT(TE) | MACB_BIT(RE))); - - /* Disable MAC interrupts */ - macb_writel(lp, IDR, MACB_BIT(RCOMP) | - MACB_BIT(RXUBR) | - MACB_BIT(ISR_TUND) | - MACB_BIT(ISR_RLE) | - MACB_BIT(TCOMP) | - MACB_BIT(ISR_ROVR) | - MACB_BIT(HRESP));
netif_stop_queue(dev);
phylink_stop(lp->phylink); phylink_disconnect_phy(lp->phylink);
- dma_free_coherent(&lp->pdev->dev, - AT91ETHER_MAX_RX_DESCR * - macb_dma_desc_get_size(lp), - q->rx_ring, q->rx_ring_dma); - q->rx_ring = NULL; - - dma_free_coherent(&lp->pdev->dev, - AT91ETHER_MAX_RX_DESCR * AT91ETHER_MAX_RBUFF_SZ, - q->rx_buffers, q->rx_buffers_dma); - q->rx_buffers = NULL; + at91ether_stop(lp);
return pm_runtime_put(&lp->pdev->dev); }
From: "Jason A. Donenfeld" Jason@zx2c4.com
[ Upstream commit df08126e3833e9dca19e2407db5f5860a7c194fb ]
The napi_gro_receive function no longer returns GRO_DROP ever, making handling GRO_DROP dead code. This commit removes that dead code. Further, it's not even clear that device drivers have any business in taking action after passing off received packets; that's arguably out of their hands.
Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Fixes: 6570bc79c0df ("net: core: use listified Rx for GRO_NORMAL in napi_gro_receive()") Signed-off-by: Jason A. Donenfeld Jason@zx2c4.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireguard/receive.c | 10 ++-------- 1 file changed, 2 insertions(+), 8 deletions(-)
diff --git a/drivers/net/wireguard/receive.c b/drivers/net/wireguard/receive.c index 91438144e4f7a..9b2ab6fc91cdd 100644 --- a/drivers/net/wireguard/receive.c +++ b/drivers/net/wireguard/receive.c @@ -414,14 +414,8 @@ static void wg_packet_consume_data_done(struct wg_peer *peer, if (unlikely(routed_peer != peer)) goto dishonest_packet_peer;
- if (unlikely(napi_gro_receive(&peer->napi, skb) == GRO_DROP)) { - ++dev->stats.rx_dropped; - net_dbg_ratelimited("%s: Failed to give packet to userspace from peer %llu (%pISpfsc)\n", - dev->name, peer->internal_id, - &peer->endpoint.addr); - } else { - update_rx_stats(peer, message_data_len(len_before_trim)); - } + napi_gro_receive(&peer->napi, skb); + update_rx_stats(peer, message_data_len(len_before_trim)); return;
dishonest_packet_peer:
From: "Jason A. Donenfeld" Jason@zx2c4.com
[ Upstream commit e5e7d8052f6140985c03bd49ebaa0af9c2944bc6 ]
The napi_gro_receive function no longer returns GRO_DROP ever, making handling GRO_DROP dead code. This commit removes that dead code. Further, it's not even clear that device drivers have any business in taking action after passing off received packets; that's arguably out of their hands.
Fixes: 6570bc79c0df ("net: core: use listified Rx for GRO_NORMAL in napi_gro_receive()") Signed-off-by: Jason A. Donenfeld Jason@zx2c4.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/socionext/netsec.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/socionext/netsec.c b/drivers/net/ethernet/socionext/netsec.c index a5a0fb60193ab..5a70c49bf454f 100644 --- a/drivers/net/ethernet/socionext/netsec.c +++ b/drivers/net/ethernet/socionext/netsec.c @@ -1038,8 +1038,9 @@ static int netsec_process_rx(struct netsec_priv *priv, int budget) skb->ip_summed = CHECKSUM_UNNECESSARY;
next: - if ((skb && napi_gro_receive(&priv->napi, skb) != GRO_DROP) || - xdp_result) { + if (skb) + napi_gro_receive(&priv->napi, skb); + if (skb || xdp_result) { ndev->stats.rx_packets++; ndev->stats.rx_bytes += xdp.data_end - xdp.data; }
From: "Jason A. Donenfeld" Jason@zx2c4.com
[ Upstream commit 045790b7bc66a75070c112a61558c639cef2263e ]
The napi_gro_receive function no longer returns GRO_DROP ever, making handling GRO_DROP dead code. This commit removes that dead code. Further, it's not even clear that device drivers have any business in taking action after passing off received packets; that's arguably out of their hands. In this case, too, the non-gro path didn't bother checking the return value. Plus, this had some clunky debugging functions that duplicated code from elsewhere and was generally pretty messy. So, this commit cleans that all up too.
Fixes: 6570bc79c0df ("net: core: use listified Rx for GRO_NORMAL in napi_gro_receive()") Signed-off-by: Jason A. Donenfeld Jason@zx2c4.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/wil6210/txrx.c | 39 +++++++------------------ 1 file changed, 11 insertions(+), 28 deletions(-)
diff --git a/drivers/net/wireless/ath/wil6210/txrx.c b/drivers/net/wireless/ath/wil6210/txrx.c index bc8c15fb609dc..080e5aa60bea4 100644 --- a/drivers/net/wireless/ath/wil6210/txrx.c +++ b/drivers/net/wireless/ath/wil6210/txrx.c @@ -897,7 +897,6 @@ static void wil_rx_handle_eapol(struct wil6210_vif *vif, struct sk_buff *skb) void wil_netif_rx(struct sk_buff *skb, struct net_device *ndev, int cid, struct wil_net_stats *stats, bool gro) { - gro_result_t rc = GRO_NORMAL; struct wil6210_vif *vif = ndev_to_vif(ndev); struct wil6210_priv *wil = ndev_to_wil(ndev); struct wireless_dev *wdev = vif_to_wdev(vif); @@ -908,22 +907,16 @@ void wil_netif_rx(struct sk_buff *skb, struct net_device *ndev, int cid, */ int mcast = is_multicast_ether_addr(da); struct sk_buff *xmit_skb = NULL; - static const char * const gro_res_str[] = { - [GRO_MERGED] = "GRO_MERGED", - [GRO_MERGED_FREE] = "GRO_MERGED_FREE", - [GRO_HELD] = "GRO_HELD", - [GRO_NORMAL] = "GRO_NORMAL", - [GRO_DROP] = "GRO_DROP", - [GRO_CONSUMED] = "GRO_CONSUMED", - };
if (wdev->iftype == NL80211_IFTYPE_STATION) { sa = wil_skb_get_sa(skb); if (mcast && ether_addr_equal(sa, ndev->dev_addr)) { /* mcast packet looped back to us */ - rc = GRO_DROP; dev_kfree_skb(skb); - goto stats; + ndev->stats.rx_dropped++; + stats->rx_dropped++; + wil_dbg_txrx(wil, "Rx drop %d bytes\n", len); + return; } } else if (wdev->iftype == NL80211_IFTYPE_AP && !vif->ap_isolate) { if (mcast) { @@ -967,26 +960,16 @@ void wil_netif_rx(struct sk_buff *skb, struct net_device *ndev, int cid, wil_rx_handle_eapol(vif, skb);
if (gro) - rc = napi_gro_receive(&wil->napi_rx, skb); + napi_gro_receive(&wil->napi_rx, skb); else netif_rx_ni(skb); - wil_dbg_txrx(wil, "Rx complete %d bytes => %s\n", - len, gro_res_str[rc]); - } -stats: - /* statistics. rc set to GRO_NORMAL for AP bridging */ - if (unlikely(rc == GRO_DROP)) { - ndev->stats.rx_dropped++; - stats->rx_dropped++; - wil_dbg_txrx(wil, "Rx drop %d bytes\n", len); - } else { - ndev->stats.rx_packets++; - stats->rx_packets++; - ndev->stats.rx_bytes += len; - stats->rx_bytes += len; - if (mcast) - ndev->stats.multicast++; } + ndev->stats.rx_packets++; + stats->rx_packets++; + ndev->stats.rx_bytes += len; + stats->rx_bytes += len; + if (mcast) + ndev->stats.multicast++; }
void wil_netif_rx_any(struct sk_buff *skb, struct net_device *ndev)
From: Harish harish@linux.ibm.com
[ Upstream commit 896066aa0685af3434637998b76218c2045142a8 ]
We use OUTPUT directory as TMPOUT for checking no-pie option.
Since commit f2f02ebd8f38 ("kbuild: improve cc-option to clean up all temporary files") when building powerpc/ from selftests directory, the OUTPUT directory points to powerpc/pmu/ebb/ and gets removed when checking for -no-pie option in try-run routine, subsequently build fails with the following:
$ make -C powerpc ... TARGET=ebb; BUILD_TARGET=$OUTPUT/$TARGET; mkdir -p $BUILD_TARGET; make OUTPUT=$BUILD_TARGET -k -C $TARGET all make[2]: Entering directory '/home/linux-master/tools/testing/selftests/powerpc/pmu/ebb' make[2]: *** No rule to make target 'Makefile'. make[2]: Failed to remake makefile 'Makefile'. make[2]: *** No rule to make target 'ebb.c', needed by '/home/linux-master/tools/testing/selftests/powerpc/pmu/ebb/reg_access_test'. make[2]: *** No rule to make target 'ebb_handler.S', needed by '/home/linux-master/tools/testing/selftests/powerpc/pmu/ebb/reg_access_test'. make[2]: *** No rule to make target 'trace.c', needed by '/home/linux-master/tools/testing/selftests/powerpc/pmu/ebb/reg_access_test'. make[2]: *** No rule to make target 'busy_loop.S', needed by '/home/linux-master/tools/testing/selftests/powerpc/pmu/ebb/reg_access_test'. make[2]: Target 'all' not remade because of errors.
Fix this by adding a suffix to the OUTPUT directory so that the failure is avoided.
Fixes: 9686813f6e9d ("selftests/powerpc: Fix try-run when source tree is not writable") Signed-off-by: Harish harish@linux.ibm.com [mpe: Mention that commit that triggered the breakage] Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20200625165721.264904-1-harish@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/powerpc/pmu/ebb/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/powerpc/pmu/ebb/Makefile b/tools/testing/selftests/powerpc/pmu/ebb/Makefile index ca35dd8848b0a..af3df79d8163f 100644 --- a/tools/testing/selftests/powerpc/pmu/ebb/Makefile +++ b/tools/testing/selftests/powerpc/pmu/ebb/Makefile @@ -7,7 +7,7 @@ noarg: # The EBB handler is 64-bit code and everything links against it CFLAGS += -m64
-TMPOUT = $(OUTPUT)/ +TMPOUT = $(OUTPUT)/TMPDIR/ # Toolchains may build PIE by default which breaks the assembly no-pie-option := $(call try-run, echo 'int main() { return 0; }' | \ $(CC) -Werror $(KBUILD_CPPFLAGS) $(CC_OPTION_CFLAGS) -no-pie -x c - -o "$$TMP", -no-pie)
From: Mans Rullgard mans@mansr.com
[ Upstream commit 40e05200593af06633f64ab0effff052eee6f076 ]
If the i2c bus driver ignores the I2C_M_RECV_LEN flag (as some of them do), it is possible for an I2C_SMBUS_BLOCK_DATA read issued on some random device to return an arbitrary value in the first byte (and nothing else). When this happens, i2c_smbus_xfer_emulated() will happily write past the end of the supplied data buffer, thus causing Bad Things to happen. To prevent this, check the size before copying the data block and return an error if it is too large.
Fixes: 209d27c3b167 ("i2c: Emulate SMBus block read over I2C") Signed-off-by: Mans Rullgard mans@mansr.com [wsa: use better errno] Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/i2c-core-smbus.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/i2c/i2c-core-smbus.c b/drivers/i2c/i2c-core-smbus.c index b34d2ff069318..bbb70a8a411e3 100644 --- a/drivers/i2c/i2c-core-smbus.c +++ b/drivers/i2c/i2c-core-smbus.c @@ -495,6 +495,13 @@ static s32 i2c_smbus_xfer_emulated(struct i2c_adapter *adapter, u16 addr, break; case I2C_SMBUS_BLOCK_DATA: case I2C_SMBUS_BLOCK_PROC_CALL: + if (msg[1].buf[0] > I2C_SMBUS_BLOCK_MAX) { + dev_err(&adapter->dev, + "Invalid block size returned: %d\n", + msg[1].buf[0]); + status = -EPROTO; + goto cleanup; + } for (i = 0; i < msg[1].buf[0] + 1; i++) data->block[i] = msg[1].buf[i]; break;
From: David Howells dhowells@redhat.com
[ Upstream commit 719fdd32921fb7e3208db8832d32ae1c2d68900f ]
The cell name stored in the afs_cell struct is a 64-char + NUL buffer - when it needs to be able to handle up to AFS_MAXCELLNAME (256 chars) + NUL.
Fix this by changing the array to a pointer and allocating the string.
Found using Coverity.
Fixes: 989782dcdc91 ("afs: Overhaul cell database management") Reported-by: Colin Ian King colin.king@canonical.com Signed-off-by: David Howells dhowells@redhat.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/afs/cell.c | 9 +++++++++ fs/afs/internal.h | 2 +- 2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/fs/afs/cell.c b/fs/afs/cell.c index 78ba5f9322879..296b489861a9a 100644 --- a/fs/afs/cell.c +++ b/fs/afs/cell.c @@ -154,10 +154,17 @@ static struct afs_cell *afs_alloc_cell(struct afs_net *net, return ERR_PTR(-ENOMEM); }
+ cell->name = kmalloc(namelen + 1, GFP_KERNEL); + if (!cell->name) { + kfree(cell); + return ERR_PTR(-ENOMEM); + } + cell->net = net; cell->name_len = namelen; for (i = 0; i < namelen; i++) cell->name[i] = tolower(name[i]); + cell->name[i] = 0;
atomic_set(&cell->usage, 2); INIT_WORK(&cell->manager, afs_manage_cell); @@ -203,6 +210,7 @@ parse_failed: if (ret == -EINVAL) printk(KERN_ERR "kAFS: bad VL server IP address\n"); error: + kfree(cell->name); kfree(cell); _leave(" = %d", ret); return ERR_PTR(ret); @@ -483,6 +491,7 @@ static void afs_cell_destroy(struct rcu_head *rcu)
afs_put_vlserverlist(cell->net, rcu_access_pointer(cell->vl_servers)); key_put(cell->anonymous_key); + kfree(cell->name); kfree(cell);
_leave(" [destroyed]"); diff --git a/fs/afs/internal.h b/fs/afs/internal.h index 98e0cebd5e5e9..c67a9767397d0 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -397,7 +397,7 @@ struct afs_cell { struct afs_vlserver_list __rcu *vl_servers;
u8 name_len; /* Length of name */ - char name[64 + 1]; /* Cell name, case-flattened and NUL-padded */ + char *name; /* Cell name, case-flattened and NUL-padded */ };
/*
From: Juri Lelli juri.lelli@redhat.com
[ Upstream commit ce9bc3b27f2a21a7969b41ffb04df8cf61bd1592 ]
syzbot reported the following warning triggered via SYSC_sched_setattr():
WARNING: CPU: 0 PID: 6973 at kernel/sched/deadline.c:593 setup_new_dl_entity /kernel/sched/deadline.c:594 [inline] WARNING: CPU: 0 PID: 6973 at kernel/sched/deadline.c:593 enqueue_dl_entity /kernel/sched/deadline.c:1370 [inline] WARNING: CPU: 0 PID: 6973 at kernel/sched/deadline.c:593 enqueue_task_dl+0x1c17/0x2ba0 /kernel/sched/deadline.c:1441
This happens because the ->dl_boosted flag is currently not initialized by __dl_clear_params() (unlike the other flags) and setup_new_dl_entity() rightfully complains about it.
Initialize dl_boosted to 0.
Fixes: 2d3d891d3344 ("sched/deadline: Add SCHED_DEADLINE inheritance logic") Reported-by: syzbot+5ac8bac25f95e8b221e7@syzkaller.appspotmail.com Signed-off-by: Juri Lelli juri.lelli@redhat.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Ingo Molnar mingo@kernel.org Tested-by: Daniel Wagner dwagner@suse.de Link: https://lkml.kernel.org/r/20200617072919.818409-1-juri.lelli@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/deadline.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index 504d2f51b0d63..f63f337c7147a 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -2692,6 +2692,7 @@ void __dl_clear_params(struct task_struct *p) dl_se->dl_bw = 0; dl_se->dl_density = 0;
+ dl_se->dl_boosted = 0; dl_se->dl_throttled = 0; dl_se->dl_yielded = 0; dl_se->dl_non_contending = 0;
From: Juri Lelli juri.lelli@redhat.com
[ Upstream commit 740797ce3a124b7dd22b7fb832d87bc8fba1cf6f ]
syzbot reported the following warning:
WARNING: CPU: 1 PID: 6351 at kernel/sched/deadline.c:628 enqueue_task_dl+0x22da/0x38a0 kernel/sched/deadline.c:1504
At deadline.c:628 we have:
623 static inline void setup_new_dl_entity(struct sched_dl_entity *dl_se) 624 { 625 struct dl_rq *dl_rq = dl_rq_of_se(dl_se); 626 struct rq *rq = rq_of_dl_rq(dl_rq); 627 628 WARN_ON(dl_se->dl_boosted); 629 WARN_ON(dl_time_before(rq_clock(rq), dl_se->deadline)); [...] }
Which means that setup_new_dl_entity() has been called on a task currently boosted. This shouldn't happen though, as setup_new_dl_entity() is only called when the 'dynamic' deadline of the new entity is in the past w.r.t. rq_clock and boosted tasks shouldn't verify this condition.
Digging through the PI code I noticed that what above might in fact happen if an RT tasks blocks on an rt_mutex hold by a DEADLINE task. In the first branch of boosting conditions we check only if a pi_task 'dynamic' deadline is earlier than mutex holder's and in this case we set mutex holder to be dl_boosted. However, since RT 'dynamic' deadlines are only initialized if such tasks get boosted at some point (or if they become DEADLINE of course), in general RT 'dynamic' deadlines are usually equal to 0 and this verifies the aforementioned condition.
Fix it by checking that the potential donor task is actually (even if temporary because in turn boosted) running at DEADLINE priority before using its 'dynamic' deadline value.
Fixes: 2d3d891d3344 ("sched/deadline: Add SCHED_DEADLINE inheritance logic") Reported-by: syzbot+119ba87189432ead09b4@syzkaller.appspotmail.com Signed-off-by: Juri Lelli juri.lelli@redhat.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Ingo Molnar mingo@kernel.org Reviewed-by: Daniel Bristot de Oliveira bristot@redhat.com Tested-by: Daniel Wagner dwagner@suse.de Link: https://lkml.kernel.org/r/20181119153201.GB2119@localhost.localdomain Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/core.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 5eccfb816d23b..f2618ade80479 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -4461,7 +4461,8 @@ void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task) */ if (dl_prio(prio)) { if (!dl_prio(p->normal_prio) || - (pi_task && dl_entity_preempt(&pi_task->dl, &p->dl))) { + (pi_task && dl_prio(pi_task->prio) && + dl_entity_preempt(&pi_task->dl, &p->dl))) { p->dl.dl_boosted = 1; queue_flag |= ENQUEUE_REPLENISH; } else
From: Vincent Guittot vincent.guittot@linaro.org
[ Upstream commit e21cf43406a190adfcc4bfe592768066fb3aaa9b ]
Some performance regression on reaim benchmark have been raised with commit 070f5e860ee2 ("sched/fair: Take into account runnable_avg to classify group")
The problem comes from the init value of runnable_avg which is initialized with max value. This can be a problem if the newly forked task is finally a short task because the group of CPUs is wrongly set to overloaded and tasks are pulled less agressively.
Set initial value of runnable_avg equals to util_avg to reflect that there is no waiting time so far.
Fixes: 070f5e860ee2 ("sched/fair: Take into account runnable_avg to classify group") Reported-by: kernel test robot rong.a.chen@intel.com Signed-off-by: Vincent Guittot vincent.guittot@linaro.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/20200624154422.29166-1-vincent.guittot@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/fair.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 2ae7e30ccb33c..5725199b32dcf 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -807,7 +807,7 @@ void post_init_entity_util_avg(struct task_struct *p) } }
- sa->runnable_avg = cpu_scale; + sa->runnable_avg = sa->util_avg;
if (p->sched_class != &fair_sched_class) { /*
From: Navid Emamdoost navid.emamdoost@gmail.com
[ Upstream commit eea1238867205b9e48a67c1a63219529a73c46fd ]
Calling pm_runtime_get_sync increments the counter even in case of failure, causing incorrect ref count. Call pm_runtime_put if pm_runtime_get_sync fails.
Signed-off-by: Navid Emamdoost navid.emamdoost@gmail.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ata/sata_rcar.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/ata/sata_rcar.c b/drivers/ata/sata_rcar.c index 980aacdbcf3b4..141ac600b64c8 100644 --- a/drivers/ata/sata_rcar.c +++ b/drivers/ata/sata_rcar.c @@ -907,7 +907,7 @@ static int sata_rcar_probe(struct platform_device *pdev) pm_runtime_enable(dev); ret = pm_runtime_get_sync(dev); if (ret < 0) - goto err_pm_disable; + goto err_pm_put;
host = ata_host_alloc(dev, 1); if (!host) { @@ -937,7 +937,6 @@ static int sata_rcar_probe(struct platform_device *pdev)
err_pm_put: pm_runtime_put(dev); -err_pm_disable: pm_runtime_disable(dev); return ret; } @@ -991,8 +990,10 @@ static int sata_rcar_resume(struct device *dev) int ret;
ret = pm_runtime_get_sync(dev); - if (ret < 0) + if (ret < 0) { + pm_runtime_put(dev); return ret; + }
if (priv->type == RCAR_GEN3_SATA) { sata_rcar_init_module(priv); @@ -1017,8 +1018,10 @@ static int sata_rcar_restore(struct device *dev) int ret;
ret = pm_runtime_get_sync(dev); - if (ret < 0) + if (ret < 0) { + pm_runtime_put(dev); return ret; + }
sata_rcar_setup_port(host);
From: Ye Bin yebin10@huawei.com
[ Upstream commit f650ef61e040bcb175dd8762164b00a5d627f20e ]
BUG: KASAN: use-after-free in ata_scsi_mode_select_xlat+0x10bd/0x10f0 drivers/ata/libata-scsi.c:4045 Read of size 1 at addr ffff88803b8cd003 by task syz-executor.6/12621
CPU: 1 PID: 12621 Comm: syz-executor.6 Not tainted 4.19.95 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xac/0xee lib/dump_stack.c:118 print_address_description+0x60/0x223 mm/kasan/report.c:253 kasan_report_error mm/kasan/report.c:351 [inline] kasan_report mm/kasan/report.c:409 [inline] kasan_report.cold+0xae/0x2d8 mm/kasan/report.c:393 ata_scsi_mode_select_xlat+0x10bd/0x10f0 drivers/ata/libata-scsi.c:4045 ata_scsi_translate+0x2da/0x680 drivers/ata/libata-scsi.c:2035 __ata_scsi_queuecmd drivers/ata/libata-scsi.c:4360 [inline] ata_scsi_queuecmd+0x2e4/0x790 drivers/ata/libata-scsi.c:4409 scsi_dispatch_cmd+0x2ee/0x6c0 drivers/scsi/scsi_lib.c:1867 scsi_queue_rq+0xfd7/0x1990 drivers/scsi/scsi_lib.c:2170 blk_mq_dispatch_rq_list+0x1e1/0x19a0 block/blk-mq.c:1186 blk_mq_do_dispatch_sched+0x147/0x3d0 block/blk-mq-sched.c:108 blk_mq_sched_dispatch_requests+0x427/0x680 block/blk-mq-sched.c:204 __blk_mq_run_hw_queue+0xbc/0x200 block/blk-mq.c:1308 __blk_mq_delay_run_hw_queue+0x3c0/0x460 block/blk-mq.c:1376 blk_mq_run_hw_queue+0x152/0x310 block/blk-mq.c:1413 blk_mq_sched_insert_request+0x337/0x6c0 block/blk-mq-sched.c:397 blk_execute_rq_nowait+0x124/0x320 block/blk-exec.c:64 blk_execute_rq+0xc5/0x112 block/blk-exec.c:101 sg_scsi_ioctl+0x3b0/0x6a0 block/scsi_ioctl.c:507 sg_ioctl+0xd37/0x23f0 drivers/scsi/sg.c:1106 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:501 [inline] do_vfs_ioctl+0xae6/0x1030 fs/ioctl.c:688 ksys_ioctl+0x76/0xa0 fs/ioctl.c:705 __do_sys_ioctl fs/ioctl.c:712 [inline] __se_sys_ioctl fs/ioctl.c:710 [inline] __x64_sys_ioctl+0x6f/0xb0 fs/ioctl.c:710 do_syscall_64+0xa0/0x2e0 arch/x86/entry/common.c:293 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x45c479 Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fb0e9602c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007fb0e96036d4 RCX: 000000000045c479 RDX: 0000000020000040 RSI: 0000000000000001 RDI: 0000000000000003 RBP: 000000000076bfc0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 000000000000046d R14: 00000000004c6e1a R15: 000000000076bfcc
Allocated by task 12577: set_track mm/kasan/kasan.c:460 [inline] kasan_kmalloc mm/kasan/kasan.c:553 [inline] kasan_kmalloc+0xbf/0xe0 mm/kasan/kasan.c:531 __kmalloc+0xf3/0x1e0 mm/slub.c:3749 kmalloc include/linux/slab.h:520 [inline] load_elf_phdrs+0x118/0x1b0 fs/binfmt_elf.c:441 load_elf_binary+0x2de/0x4610 fs/binfmt_elf.c:737 search_binary_handler fs/exec.c:1654 [inline] search_binary_handler+0x15c/0x4e0 fs/exec.c:1632 exec_binprm fs/exec.c:1696 [inline] __do_execve_file.isra.0+0xf52/0x1a90 fs/exec.c:1820 do_execveat_common fs/exec.c:1866 [inline] do_execve fs/exec.c:1883 [inline] __do_sys_execve fs/exec.c:1964 [inline] __se_sys_execve fs/exec.c:1959 [inline] __x64_sys_execve+0x8a/0xb0 fs/exec.c:1959 do_syscall_64+0xa0/0x2e0 arch/x86/entry/common.c:293 entry_SYSCALL_64_after_hwframe+0x44/0xa9
Freed by task 12577: set_track mm/kasan/kasan.c:460 [inline] __kasan_slab_free+0x129/0x170 mm/kasan/kasan.c:521 slab_free_hook mm/slub.c:1370 [inline] slab_free_freelist_hook mm/slub.c:1397 [inline] slab_free mm/slub.c:2952 [inline] kfree+0x8b/0x1a0 mm/slub.c:3904 load_elf_binary+0x1be7/0x4610 fs/binfmt_elf.c:1118 search_binary_handler fs/exec.c:1654 [inline] search_binary_handler+0x15c/0x4e0 fs/exec.c:1632 exec_binprm fs/exec.c:1696 [inline] __do_execve_file.isra.0+0xf52/0x1a90 fs/exec.c:1820 do_execveat_common fs/exec.c:1866 [inline] do_execve fs/exec.c:1883 [inline] __do_sys_execve fs/exec.c:1964 [inline] __se_sys_execve fs/exec.c:1959 [inline] __x64_sys_execve+0x8a/0xb0 fs/exec.c:1959 do_syscall_64+0xa0/0x2e0 arch/x86/entry/common.c:293 entry_SYSCALL_64_after_hwframe+0x44/0xa9
The buggy address belongs to the object at ffff88803b8ccf00 which belongs to the cache kmalloc-512 of size 512 The buggy address is located 259 bytes inside of 512-byte region [ffff88803b8ccf00, ffff88803b8cd100) The buggy address belongs to the page: page:ffffea0000ee3300 count:1 mapcount:0 mapping:ffff88806cc03080 index:0xffff88803b8cc780 compound_mapcount: 0 flags: 0x100000000008100(slab|head) raw: 0100000000008100 ffffea0001104080 0000000200000002 ffff88806cc03080 raw: ffff88803b8cc780 00000000800c000b 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected
Memory state around the buggy address: ffff88803b8ccf00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88803b8ccf80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88803b8cd000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^ ffff88803b8cd080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88803b8cd100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
You can refer to "https://www.lkml.org/lkml/2019/1/17/474" reproduce this error.
The exception code is "bd_len = p[3];", "p" value is ffff88803b8cd000 which belongs to the cache kmalloc-512 of size 512. The "page_address(sg_page(scsi_sglist(scmd)))" maybe from sg_scsi_ioctl function "buffer" which allocated by kzalloc, so "buffer" may not page aligned. This also looks completely buggy on highmem systems and really needs to use a kmap_atomic. --Christoph Hellwig To address above bugs, Paolo Bonzini advise to simpler to just make a char array of size CACHE_MPAGE_LEN+8+8+4-2(or just 64 to make it easy), use sg_copy_to_buffer to copy from the sglist into the buffer, and workthere.
Signed-off-by: Ye Bin yebin10@huawei.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ata/libata-scsi.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c index 36e588d88b956..c10deb87015ba 100644 --- a/drivers/ata/libata-scsi.c +++ b/drivers/ata/libata-scsi.c @@ -3692,12 +3692,13 @@ static unsigned int ata_scsi_mode_select_xlat(struct ata_queued_cmd *qc) { struct scsi_cmnd *scmd = qc->scsicmd; const u8 *cdb = scmd->cmnd; - const u8 *p; u8 pg, spg; unsigned six_byte, pg_len, hdr_len, bd_len; int len; u16 fp = (u16)-1; u8 bp = 0xff; + u8 buffer[64]; + const u8 *p = buffer;
VPRINTK("ENTER\n");
@@ -3731,12 +3732,14 @@ static unsigned int ata_scsi_mode_select_xlat(struct ata_queued_cmd *qc) if (!scsi_sg_count(scmd) || scsi_sglist(scmd)->length < len) goto invalid_param_len;
- p = page_address(sg_page(scsi_sglist(scmd))); - /* Move past header and block descriptors. */ if (len < hdr_len) goto invalid_param_len;
+ if (!sg_copy_to_buffer(scsi_sglist(scmd), scsi_sg_count(scmd), + buffer, sizeof(buffer))) + goto invalid_param_len; + if (six_byte) bd_len = p[3]; else
From: Denis Efremov efremov@linux.com
[ Upstream commit 43a562774fceba867e8eebba977d7d42f8a2eac7 ]
Use kfree() instead of kvfree() to free rgb_user in calculate_user_regamma_ramp() because the memory is allocated with kcalloc().
Signed-off-by: Denis Efremov efremov@linux.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/modules/color/color_gamma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/modules/color/color_gamma.c b/drivers/gpu/drm/amd/display/modules/color/color_gamma.c index e89694eb90b4c..700f0039df7bd 100644 --- a/drivers/gpu/drm/amd/display/modules/color/color_gamma.c +++ b/drivers/gpu/drm/amd/display/modules/color/color_gamma.c @@ -1777,7 +1777,7 @@ bool calculate_user_regamma_ramp(struct dc_transfer_func *output_tf,
kfree(rgb_regamma); rgb_regamma_alloc_fail: - kvfree(rgb_user); + kfree(rgb_user); rgb_user_alloc_fail: return ret; }
From: Nathan Huckleberry nhuck@google.com
[ Upstream commit 6c58f25e6938c073198af8b1e1832f83f8f0df33 ]
The argument passed to cmpxchg is not guaranteed to be sign extended, but lr.w sign extends on RV64I. This makes cmpxchg fail on clang built kernels when __old is negative.
To fix this, we just cast __old to long which sign extends on RV64I. With this fix, clang built RISC-V kernels now boot.
Link: https://github.com/ClangBuiltLinux/linux/issues/867 Signed-off-by: Nathan Huckleberry nhuck@google.com Signed-off-by: Palmer Dabbelt palmerdabbelt@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/include/asm/cmpxchg.h | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h index d969bab4a26b5..262e5bbb27760 100644 --- a/arch/riscv/include/asm/cmpxchg.h +++ b/arch/riscv/include/asm/cmpxchg.h @@ -179,7 +179,7 @@ " bnez %1, 0b\n" \ "1:\n" \ : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr) \ - : "rJ" (__old), "rJ" (__new) \ + : "rJ" ((long)__old), "rJ" (__new) \ : "memory"); \ break; \ case 8: \ @@ -224,7 +224,7 @@ RISCV_ACQUIRE_BARRIER \ "1:\n" \ : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr) \ - : "rJ" (__old), "rJ" (__new) \ + : "rJ" ((long)__old), "rJ" (__new) \ : "memory"); \ break; \ case 8: \ @@ -270,7 +270,7 @@ " bnez %1, 0b\n" \ "1:\n" \ : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr) \ - : "rJ" (__old), "rJ" (__new) \ + : "rJ" ((long)__old), "rJ" (__new) \ : "memory"); \ break; \ case 8: \ @@ -316,7 +316,7 @@ " fence rw, rw\n" \ "1:\n" \ : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr) \ - : "rJ" (__old), "rJ" (__new) \ + : "rJ" ((long)__old), "rJ" (__new) \ : "memory"); \ break; \ case 8: \
From: Mauricio Faria de Oliveira mfo@canonical.com
[ Upstream commit dcacbc1242c71e18fa9d2eadc5647e115c9c627d ]
It's possible for a block driver to set logical block size to a value greater than page size incorrectly; e.g. bcache takes the value from the superblock, set by the user w/ make-bcache.
This causes a BUG/NULL pointer dereference in the path:
__blkdev_get() -> set_init_blocksize() // set i_blkbits based on ... -> bdev_logical_block_size() -> queue_logical_block_size() // ... this value -> bdev_disk_changed() ... -> blkdev_readpage() -> block_read_full_page() -> create_page_buffers() // size = 1 << i_blkbits -> create_empty_buffers() // give size/take pointer -> alloc_page_buffers() // return NULL .. BUG!
Because alloc_page_buffers() is called with size > PAGE_SIZE, thus it initializes head = NULL, skips the loop, return head; then create_empty_buffers() gets (and uses) the NULL pointer.
This has been around longer than commit ad6bf88a6c19 ("block: fix an integer overflow in logical block size"); however, it increased the range of values that can trigger the issue.
Previously only 8k/16k/32k (on x86/4k page size) would do it, as greater values overflow unsigned short to zero, and queue_ logical_block_size() would then use the default of 512.
Now the range with unsigned int is much larger, and users w/ the 512k value, which happened to be zero'ed previously and work fine, started to hit this issue -- as the zero is gone, and queue_logical_block_size() does return 512k (>PAGE_SIZE.)
Fix this by checking the bcache device's logical block size, and if it's greater than page size, fallback to the backing/ cached device's logical page size.
This doesn't affect cache devices as those are still checked for block/page size in read_super(); only the backing/cached devices are not.
Apparently it's a regression from commit 2903381fce71 ("bcache: Take data offset from the bdev superblock."), moving the check into BCACHE_SB_VERSION_CDEV only. Now that we have superblocks of backing devices out there with this larger value, we cannot refuse to load them (i.e., have a similar check in _BDEV.)
Ideally perhaps bcache should use all values from the backing device (physical/logical/io_min block size)? But for now just fix the problematic case.
Test-case:
# IMG=/root/disk.img # dd if=/dev/zero of=$IMG bs=1 count=0 seek=1G # DEV=$(losetup --find --show $IMG) # make-bcache --bdev $DEV --block 8k < see dmesg >
Before:
# uname -r 5.7.0-rc7
[ 55.944046] BUG: kernel NULL pointer dereference, address: 0000000000000000 ... [ 55.949742] CPU: 3 PID: 610 Comm: bcache-register Not tainted 5.7.0-rc7 #4 ... [ 55.952281] RIP: 0010:create_empty_buffers+0x1a/0x100 ... [ 55.966434] Call Trace: [ 55.967021] create_page_buffers+0x48/0x50 [ 55.967834] block_read_full_page+0x49/0x380 [ 55.972181] do_read_cache_page+0x494/0x610 [ 55.974780] read_part_sector+0x2d/0xaa [ 55.975558] read_lba+0x10e/0x1e0 [ 55.977904] efi_partition+0x120/0x5a6 [ 55.980227] blk_add_partitions+0x161/0x390 [ 55.982177] bdev_disk_changed+0x61/0xd0 [ 55.982961] __blkdev_get+0x350/0x490 [ 55.983715] __device_add_disk+0x318/0x480 [ 55.984539] bch_cached_dev_run+0xc5/0x270 [ 55.986010] register_bcache.cold+0x122/0x179 [ 55.987628] kernfs_fop_write+0xbc/0x1a0 [ 55.988416] vfs_write+0xb1/0x1a0 [ 55.989134] ksys_write+0x5a/0xd0 [ 55.989825] do_syscall_64+0x43/0x140 [ 55.990563] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 55.991519] RIP: 0033:0x7f7d60ba3154 ...
After:
# uname -r 5.7.0.bcachelbspgsz
[ 31.672460] bcache: bcache_device_init() bcache0: sb/logical block size (8192) greater than page size (4096) falling back to device logical block size (512) [ 31.675133] bcache: register_bdev() registered backing device loop0
# grep ^ /sys/block/bcache0/queue/*_block_size /sys/block/bcache0/queue/logical_block_size:512 /sys/block/bcache0/queue/physical_block_size:8192
Reported-by: Ryan Finnie ryan@finnie.org Reported-by: Sebastian Marsching sebastian@marsching.com Signed-off-by: Mauricio Faria de Oliveira mfo@canonical.com Signed-off-by: Coly Li colyli@suse.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/bcache/super.c | 22 +++++++++++++++++++--- 1 file changed, 19 insertions(+), 3 deletions(-)
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index 4d8bf731b118c..a2e5a0fcd7d5c 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -819,7 +819,8 @@ static void bcache_device_free(struct bcache_device *d) }
static int bcache_device_init(struct bcache_device *d, unsigned int block_size, - sector_t sectors, make_request_fn make_request_fn) + sector_t sectors, make_request_fn make_request_fn, + struct block_device *cached_bdev) { struct request_queue *q; const size_t max_stripes = min_t(size_t, INT_MAX, @@ -885,6 +886,21 @@ static int bcache_device_init(struct bcache_device *d, unsigned int block_size, q->limits.io_min = block_size; q->limits.logical_block_size = block_size; q->limits.physical_block_size = block_size; + + if (q->limits.logical_block_size > PAGE_SIZE && cached_bdev) { + /* + * This should only happen with BCACHE_SB_VERSION_BDEV. + * Block/page size is checked for BCACHE_SB_VERSION_CDEV. + */ + pr_info("%s: sb/logical block size (%u) greater than page size " + "(%lu) falling back to device logical block size (%u)", + d->disk->disk_name, q->limits.logical_block_size, + PAGE_SIZE, bdev_logical_block_size(cached_bdev)); + + /* This also adjusts physical block size/min io size if needed */ + blk_queue_logical_block_size(q, bdev_logical_block_size(cached_bdev)); + } + blk_queue_flag_set(QUEUE_FLAG_NONROT, d->disk->queue); blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, d->disk->queue); blk_queue_flag_set(QUEUE_FLAG_DISCARD, d->disk->queue); @@ -1342,7 +1358,7 @@ static int cached_dev_init(struct cached_dev *dc, unsigned int block_size)
ret = bcache_device_init(&dc->disk, block_size, dc->bdev->bd_part->nr_sects - dc->sb.data_offset, - cached_dev_make_request); + cached_dev_make_request, dc->bdev); if (ret) return ret;
@@ -1455,7 +1471,7 @@ static int flash_dev_run(struct cache_set *c, struct uuid_entry *u) kobject_init(&d->kobj, &bch_flash_dev_ktype);
if (bcache_device_init(d, block_bytes(c), u->sectors, - flash_dev_make_request)) + flash_dev_make_request, NULL)) goto err;
bcache_device_attach(d, c, u - c->uuids);
From: Dinghao Liu dinghao.liu@zju.edu.cn
[ Upstream commit 95459261c99f1621d90bc628c2a48e60b7cf9a88 ]
pm_runtime_get_sync() increments the runtime PM usage counter even the call returns an error code. Thus a pairing decrement is needed on the error handling path to keep the counter balanced.
Signed-off-by: Dinghao Liu dinghao.liu@zju.edu.cn Reviewed-by: Alexander Sverdlin alexander.sverdlin@nokia.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/char/hw_random/ks-sa-rng.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/char/hw_random/ks-sa-rng.c b/drivers/char/hw_random/ks-sa-rng.c index e2330e757f1ff..001617033d6a2 100644 --- a/drivers/char/hw_random/ks-sa-rng.c +++ b/drivers/char/hw_random/ks-sa-rng.c @@ -244,6 +244,7 @@ static int ks_sa_rng_probe(struct platform_device *pdev) ret = pm_runtime_get_sync(dev); if (ret < 0) { dev_err(dev, "Failed to enable SA power-domain\n"); + pm_runtime_put_noidle(dev); pm_runtime_disable(dev); return ret; }
From: Dave Martin Dave.Martin@arm.com
[ Upstream commit 1e570f512cbdc5e9e401ba640d9827985c1bea1e ]
sve_default_vl can be modified via the /proc/sys/abi/sve_default_vl sysctl concurrently with use, and modified concurrently by multiple threads.
Adding a lock for this seems overkill, and I don't want to think any more than necessary, so just define wrappers using READ_ONCE()/ WRITE_ONCE().
This will avoid the possibility of torn accesses and repeated loads and stores.
There's no evidence yet that this is going wrong in practice: this is just hygiene. For generic sysctl users, it would be better to build this kind of thing into the sysctl common code somehow.
Reported-by: Will Deacon will@kernel.org Signed-off-by: Dave Martin Dave.Martin@arm.com Link: https://lore.kernel.org/r/1591808590-20210-3-git-send-email-Dave.Martin@arm.... [will: move set_sve_default_vl() inside #ifdef to squash allnoconfig warning] Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/kernel/fpsimd.c | 25 ++++++++++++++++++------- 1 file changed, 18 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c index 94289d1269933..4a77263c183b3 100644 --- a/arch/arm64/kernel/fpsimd.c +++ b/arch/arm64/kernel/fpsimd.c @@ -12,6 +12,7 @@ #include <linux/bug.h> #include <linux/cache.h> #include <linux/compat.h> +#include <linux/compiler.h> #include <linux/cpu.h> #include <linux/cpu_pm.h> #include <linux/kernel.h> @@ -119,10 +120,20 @@ struct fpsimd_last_state_struct { static DEFINE_PER_CPU(struct fpsimd_last_state_struct, fpsimd_last_state);
/* Default VL for tasks that don't set it explicitly: */ -static int sve_default_vl = -1; +static int __sve_default_vl = -1; + +static int get_sve_default_vl(void) +{ + return READ_ONCE(__sve_default_vl); +}
#ifdef CONFIG_ARM64_SVE
+static void set_sve_default_vl(int val) +{ + WRITE_ONCE(__sve_default_vl, val); +} + /* Maximum supported vector length across all CPUs (initially poisoned) */ int __ro_after_init sve_max_vl = SVE_VL_MIN; int __ro_after_init sve_max_virtualisable_vl = SVE_VL_MIN; @@ -345,7 +356,7 @@ static int sve_proc_do_default_vl(struct ctl_table *table, int write, loff_t *ppos) { int ret; - int vl = sve_default_vl; + int vl = get_sve_default_vl(); struct ctl_table tmp_table = { .data = &vl, .maxlen = sizeof(vl), @@ -362,7 +373,7 @@ static int sve_proc_do_default_vl(struct ctl_table *table, int write, if (!sve_vl_valid(vl)) return -EINVAL;
- sve_default_vl = find_supported_vector_length(vl); + set_sve_default_vl(find_supported_vector_length(vl)); return 0; }
@@ -869,7 +880,7 @@ void __init sve_setup(void) * For the default VL, pick the maximum supported value <= 64. * VL == 64 is guaranteed not to grow the signal frame. */ - sve_default_vl = find_supported_vector_length(64); + set_sve_default_vl(find_supported_vector_length(64));
bitmap_andnot(tmp_map, sve_vq_partial_map, sve_vq_map, SVE_VQ_MAX); @@ -890,7 +901,7 @@ void __init sve_setup(void) pr_info("SVE: maximum available vector length %u bytes per vector\n", sve_max_vl); pr_info("SVE: default vector length %u bytes per vector\n", - sve_default_vl); + get_sve_default_vl());
/* KVM decides whether to support mismatched systems. Just warn here: */ if (sve_max_virtualisable_vl < sve_max_vl) @@ -1030,13 +1041,13 @@ void fpsimd_flush_thread(void) * vector length configured: no kernel task can become a user * task without an exec and hence a call to this function. * By the time the first call to this function is made, all - * early hardware probing is complete, so sve_default_vl + * early hardware probing is complete, so __sve_default_vl * should be valid. * If a bug causes this to go wrong, we make some noise and * try to fudge thread.sve_vl to a safe value here. */ vl = current->thread.sve_vl_onexec ? - current->thread.sve_vl_onexec : sve_default_vl; + current->thread.sve_vl_onexec : get_sve_default_vl();
if (WARN_ON(!sve_vl_valid(vl))) vl = SVE_VL_MIN;
From: Thomas Falcon tlfalcon@linux.ibm.com
[ Upstream commit dff515a3e71dc8ab3b9dcc2e23a9b5fca88b3c18 ]
The VNIC driver's "login" command sequence is the final step in the driver's initialization process with device firmware, confirming the available device queue resources to be utilized by the driver. Under high system load, firmware may not respond to the request in a timely manner or may abort the request. In such cases, the driver should reattempt the login command sequence. In case of a device error, the number of retries is bounded.
Signed-off-by: Thomas Falcon tlfalcon@linux.ibm.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/ibm/ibmvnic.c | 21 +++++++++++++++++---- 1 file changed, 17 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index 1b4d04e4474bb..2baf7b3ff4cbf 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -842,12 +842,13 @@ static int ibmvnic_login(struct net_device *netdev) struct ibmvnic_adapter *adapter = netdev_priv(netdev); unsigned long timeout = msecs_to_jiffies(30000); int retry_count = 0; + int retries = 10; bool retry; int rc;
do { retry = false; - if (retry_count > IBMVNIC_MAX_QUEUES) { + if (retry_count > retries) { netdev_warn(netdev, "Login attempts exceeded\n"); return -1; } @@ -862,11 +863,23 @@ static int ibmvnic_login(struct net_device *netdev)
if (!wait_for_completion_timeout(&adapter->init_done, timeout)) { - netdev_warn(netdev, "Login timed out\n"); - return -1; + netdev_warn(netdev, "Login timed out, retrying...\n"); + retry = true; + adapter->init_done_rc = 0; + retry_count++; + continue; }
- if (adapter->init_done_rc == PARTIALSUCCESS) { + if (adapter->init_done_rc == ABORTED) { + netdev_warn(netdev, "Login aborted, retrying...\n"); + retry = true; + adapter->init_done_rc = 0; + retry_count++; + /* FW or device may be busy, so + * wait a bit before retrying login + */ + msleep(500); + } else if (adapter->init_done_rc == PARTIALSUCCESS) { retry_count++; release_sub_crqs(adapter, 1);
From: Zekun Shen bruceshenzk@gmail.com
[ Upstream commit e89df5c4322c1bf495f62d74745895b5fd2a4393 ]
There is a race condition exist during termination. The path is alx_stop and then alx_remove. An alx_schedule_link_check could be called before alx_stop by interrupt handler and invoke alx_link_check later. Alx_stop frees the napis, and alx_remove cancels any pending works. If any of the work is scheduled before termination and invoked before alx_remove, a null-ptr-deref occurs because both expect alx->napis[i].
This patch fix the race condition by moving cancel_work_sync functions before alx_free_napis inside alx_stop. Because interrupt handler can call alx_schedule_link_check again, alx_free_irq is moved before cancel_work_sync calls too.
Signed-off-by: Zekun Shen bruceshenzk@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/atheros/alx/main.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/atheros/alx/main.c b/drivers/net/ethernet/atheros/alx/main.c index b9b4edb913c13..9b7f1af5f5747 100644 --- a/drivers/net/ethernet/atheros/alx/main.c +++ b/drivers/net/ethernet/atheros/alx/main.c @@ -1249,8 +1249,12 @@ out_disable_adv_intr:
static void __alx_stop(struct alx_priv *alx) { - alx_halt(alx); alx_free_irq(alx); + + cancel_work_sync(&alx->link_check_wk); + cancel_work_sync(&alx->reset_wk); + + alx_halt(alx); alx_free_rings(alx); alx_free_napis(alx); } @@ -1855,9 +1859,6 @@ static void alx_remove(struct pci_dev *pdev) struct alx_priv *alx = pci_get_drvdata(pdev); struct alx_hw *hw = &alx->hw;
- cancel_work_sync(&alx->link_check_wk); - cancel_work_sync(&alx->reset_wk); - /* restore permanent mac address */ alx_set_macaddr(hw, hw->perm_addr);
From: Aditya Pakki pakki001@umn.edu
[ Upstream commit a6379f0ad6375a707e915518ecd5c2270afcd395 ]
In case of failure of check_expect_hints_stats(), the resources allocated by objagg_hints_get should be freed. The patch fixes this issue.
Signed-off-by: Aditya Pakki pakki001@umn.edu Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- lib/test_objagg.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/lib/test_objagg.c b/lib/test_objagg.c index 72c1abfa154dc..da137939a4100 100644 --- a/lib/test_objagg.c +++ b/lib/test_objagg.c @@ -979,10 +979,10 @@ err_check_expect_stats2: err_world2_obj_get: for (i--; i >= 0; i--) world_obj_put(&world2, objagg, hints_case->key_ids[i]); - objagg_hints_put(hints); - objagg_destroy(objagg2); i = hints_case->key_ids_count; + objagg_destroy(objagg2); err_check_expect_hints_stats: + objagg_hints_put(hints); err_hints_get: err_check_expect_stats: err_world_obj_get:
From: Dmitry Baryshkov dmitry.baryshkov@linaro.org
[ Upstream commit 5e50311556c9f409a85740e3cb4c4511e7e27da0 ]
Fix the following warnings caused by reusage of the same irq_chip instance for all spmi-gpio gpio_irq_chip instances. Instead embed irq_chip into pmic_gpio_state struct.
gpio gpiochip2: (c440000.qcom,spmi:pmic@2:gpio@c000): detected irqchip that is shared with multiple gpiochips: please fix the driver. gpio gpiochip3: (c440000.qcom,spmi:pmic@4:gpio@c000): detected irqchip that is shared with multiple gpiochips: please fix the driver. gpio gpiochip4: (c440000.qcom,spmi:pmic@a:gpio@c000): detected irqchip that is shared with multiple gpiochips: please fix the driver.
Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Acked-by: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org Link: https://lore.kernel.org/r/20200604002817.667160-1-dmitry.baryshkov@linaro.or... Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pinctrl/qcom/pinctrl-spmi-gpio.c | 21 ++++++++++----------- 1 file changed, 10 insertions(+), 11 deletions(-)
diff --git a/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c b/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c index fe0be8a6ebb7b..092a48e4dff57 100644 --- a/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c +++ b/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c @@ -170,6 +170,7 @@ struct pmic_gpio_state { struct regmap *map; struct pinctrl_dev *ctrl; struct gpio_chip chip; + struct irq_chip irq; };
static const struct pinconf_generic_params pmic_gpio_bindings[] = { @@ -917,16 +918,6 @@ static int pmic_gpio_populate(struct pmic_gpio_state *state, return 0; }
-static struct irq_chip pmic_gpio_irq_chip = { - .name = "spmi-gpio", - .irq_ack = irq_chip_ack_parent, - .irq_mask = irq_chip_mask_parent, - .irq_unmask = irq_chip_unmask_parent, - .irq_set_type = irq_chip_set_type_parent, - .irq_set_wake = irq_chip_set_wake_parent, - .flags = IRQCHIP_MASK_ON_SUSPEND, -}; - static int pmic_gpio_domain_translate(struct irq_domain *domain, struct irq_fwspec *fwspec, unsigned long *hwirq, @@ -1053,8 +1044,16 @@ static int pmic_gpio_probe(struct platform_device *pdev) if (!parent_domain) return -ENXIO;
+ state->irq.name = "spmi-gpio", + state->irq.irq_ack = irq_chip_ack_parent, + state->irq.irq_mask = irq_chip_mask_parent, + state->irq.irq_unmask = irq_chip_unmask_parent, + state->irq.irq_set_type = irq_chip_set_type_parent, + state->irq.irq_set_wake = irq_chip_set_wake_parent, + state->irq.flags = IRQCHIP_MASK_ON_SUSPEND, + girq = &state->chip.irq; - girq->chip = &pmic_gpio_irq_chip; + girq->chip = &state->irq; girq->default_type = IRQ_TYPE_NONE; girq->handler = handle_level_irq; girq->fwnode = of_node_to_fwnode(state->dev->of_node);
From: Vidya Sagar vidyas@nvidia.com
[ Upstream commit 782b6b69847f34dda330530493ea62b7de3fd06a ]
Use noirq suspend/resume callbacks as other drivers which implement noirq suspend/resume callbacks (Ex:- PCIe) depend on pinctrl driver to configure the signals used by their respective devices in the noirq phase.
Signed-off-by: Vidya Sagar vidyas@nvidia.com Reviewed-by: Dmitry Osipenko digetx@gmail.com Link: https://lore.kernel.org/r/20200604174935.26560-1-vidyas@nvidia.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pinctrl/tegra/pinctrl-tegra.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/pinctrl/tegra/pinctrl-tegra.c b/drivers/pinctrl/tegra/pinctrl-tegra.c index 21661f6490d68..195cfe557511b 100644 --- a/drivers/pinctrl/tegra/pinctrl-tegra.c +++ b/drivers/pinctrl/tegra/pinctrl-tegra.c @@ -731,8 +731,8 @@ static int tegra_pinctrl_resume(struct device *dev) }
const struct dev_pm_ops tegra_pinctrl_pm = { - .suspend = &tegra_pinctrl_suspend, - .resume = &tegra_pinctrl_resume + .suspend_noirq = &tegra_pinctrl_suspend, + .resume_noirq = &tegra_pinctrl_resume };
static bool tegra_pinctrl_gpio_node_has_range(struct tegra_pmx *pmx)
From: Sven Schnelle svens@linux.ibm.com
[ Upstream commit 664f5f8de825648d1d31f6f5652e3cd117c77b50 ]
Use __secure_computing() and pass the register data via seccomp_data so secure computing doesn't have to fetch it again.
Signed-off-by: Sven Schnelle svens@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/ptrace.c | 31 ++++++++++++++++++++++++++----- 1 file changed, 26 insertions(+), 5 deletions(-)
diff --git a/arch/s390/kernel/ptrace.c b/arch/s390/kernel/ptrace.c index 58faa12542a1b..9eee345568890 100644 --- a/arch/s390/kernel/ptrace.c +++ b/arch/s390/kernel/ptrace.c @@ -839,6 +839,9 @@ asmlinkage long do_syscall_trace_enter(struct pt_regs *regs) { unsigned long mask = -1UL;
+ if (is_compat_task()) + mask = 0xffffffff; + /* * The sysc_tracesys code in entry.S stored the system * call number to gprs[2]. @@ -855,17 +858,35 @@ asmlinkage long do_syscall_trace_enter(struct pt_regs *regs) return -1; }
+#ifdef CONFIG_SECCOMP /* Do the secure computing check after ptrace. */ - if (secure_computing()) { - /* seccomp failures shouldn't expose any additional code. */ - return -1; + if (unlikely(test_thread_flag(TIF_SECCOMP))) { + struct seccomp_data sd; + + if (is_compat_task()) { + sd.instruction_pointer = regs->psw.addr & 0x7fffffff; + sd.arch = AUDIT_ARCH_S390; + } else { + sd.instruction_pointer = regs->psw.addr; + sd.arch = AUDIT_ARCH_S390X; + } + + sd.nr = regs->gprs[2] & 0xffff; + sd.args[0] = regs->orig_gpr2 & mask; + sd.args[1] = regs->gprs[3] & mask; + sd.args[2] = regs->gprs[4] & mask; + sd.args[3] = regs->gprs[5] & mask; + sd.args[4] = regs->gprs[6] & mask; + sd.args[5] = regs->gprs[7] & mask; + + if (__secure_computing(&sd) == -1) + return -1; } +#endif /* CONFIG_SECCOMP */
if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT))) trace_sys_enter(regs, regs->gprs[2]);
- if (is_compat_task()) - mask = 0xffffffff;
audit_syscall_entry(regs->gprs[2], regs->orig_gpr2 & mask, regs->gprs[3] &mask, regs->gprs[4] &mask,
From: Sven Schnelle svens@linux.ibm.com
[ Upstream commit cd29fa798001075a554b978df3a64e6656c25794 ]
The current code returns the syscall number which an invalid syscall number is supplied and tracing is enabled. This makes the strace testsuite fail.
Signed-off-by: Sven Schnelle svens@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/ptrace.c | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/arch/s390/kernel/ptrace.c b/arch/s390/kernel/ptrace.c index 9eee345568890..3f29646313e82 100644 --- a/arch/s390/kernel/ptrace.c +++ b/arch/s390/kernel/ptrace.c @@ -838,6 +838,7 @@ long compat_arch_ptrace(struct task_struct *child, compat_long_t request, asmlinkage long do_syscall_trace_enter(struct pt_regs *regs) { unsigned long mask = -1UL; + long ret = -1;
if (is_compat_task()) mask = 0xffffffff; @@ -854,8 +855,7 @@ asmlinkage long do_syscall_trace_enter(struct pt_regs *regs) * debugger stored an invalid system call number. Skip * the system call and the system call restart handling. */ - clear_pt_regs_flag(regs, PIF_SYSCALL); - return -1; + goto skip; }
#ifdef CONFIG_SECCOMP @@ -871,7 +871,7 @@ asmlinkage long do_syscall_trace_enter(struct pt_regs *regs) sd.arch = AUDIT_ARCH_S390X; }
- sd.nr = regs->gprs[2] & 0xffff; + sd.nr = regs->int_code & 0xffff; sd.args[0] = regs->orig_gpr2 & mask; sd.args[1] = regs->gprs[3] & mask; sd.args[2] = regs->gprs[4] & mask; @@ -880,19 +880,26 @@ asmlinkage long do_syscall_trace_enter(struct pt_regs *regs) sd.args[5] = regs->gprs[7] & mask;
if (__secure_computing(&sd) == -1) - return -1; + goto skip; } #endif /* CONFIG_SECCOMP */
if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT))) - trace_sys_enter(regs, regs->gprs[2]); + trace_sys_enter(regs, regs->int_code & 0xffff);
- audit_syscall_entry(regs->gprs[2], regs->orig_gpr2 & mask, + audit_syscall_entry(regs->int_code & 0xffff, regs->orig_gpr2 & mask, regs->gprs[3] &mask, regs->gprs[4] &mask, regs->gprs[5] &mask);
+ if ((signed long)regs->gprs[2] >= NR_syscalls) { + regs->gprs[2] = -ENOSYS; + ret = -ENOSYS; + } return regs->gprs[2]; +skip: + clear_pt_regs_flag(regs, PIF_SYSCALL); + return ret; }
asmlinkage void do_syscall_trace_exit(struct pt_regs *regs)
From: Sven Schnelle svens@linux.ibm.com
[ Upstream commit 00332c16b1604242a56289ff2b26e283dbad0812 ]
tracing expects to see invalid syscalls, so pass it through. The syscall path in entry.S checks the syscall number before looking up the handler, so it is still safe.
Signed-off-by: Sven Schnelle svens@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/entry.S | 2 +- arch/s390/kernel/ptrace.c | 6 ++---- 2 files changed, 3 insertions(+), 5 deletions(-)
diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S index 3ae64914bd144..9584e743102b7 100644 --- a/arch/s390/kernel/entry.S +++ b/arch/s390/kernel/entry.S @@ -368,9 +368,9 @@ ENTRY(system_call) jnz .Lsysc_nr_ok # svc 0: system call number in %r1 llgfr %r1,%r1 # clear high word in r1 + sth %r1,__PT_INT_CODE+2(%r11) cghi %r1,NR_syscalls jnl .Lsysc_nr_ok - sth %r1,__PT_INT_CODE+2(%r11) slag %r8,%r1,3 .Lsysc_nr_ok: xc __SF_BACKCHAIN(8,%r15),__SF_BACKCHAIN(%r15) diff --git a/arch/s390/kernel/ptrace.c b/arch/s390/kernel/ptrace.c index 3f29646313e82..fca78b269349d 100644 --- a/arch/s390/kernel/ptrace.c +++ b/arch/s390/kernel/ptrace.c @@ -848,11 +848,9 @@ asmlinkage long do_syscall_trace_enter(struct pt_regs *regs) * call number to gprs[2]. */ if (test_thread_flag(TIF_SYSCALL_TRACE) && - (tracehook_report_syscall_entry(regs) || - regs->gprs[2] >= NR_syscalls)) { + tracehook_report_syscall_entry(regs)) { /* - * Tracing decided this syscall should not happen or the - * debugger stored an invalid system call number. Skip + * Tracing decided this syscall should not happen. Skip * the system call and the system call restart handling. */ goto skip;
From: Sven Schnelle svens@linux.ibm.com
[ Upstream commit 873e5a763d604c32988c4a78913a8dab3862d2f9 ]
When strace wants to update the syscall number, it sets GPR2 to the desired number and updates the GPR via PTRACE_SETREGSET. It doesn't update regs->int_code which would cause the old syscall executed on syscall restart. As we cannot change the ptrace ABI and don't have a field for the interruption code, check whether the tracee is in a syscall and the last instruction was svc. In that case assume that the tracer wants to update the syscall number and copy the GPR2 value to regs->int_code.
Signed-off-by: Sven Schnelle svens@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/ptrace.c | 31 ++++++++++++++++++++++++++++++- 1 file changed, 30 insertions(+), 1 deletion(-)
diff --git a/arch/s390/kernel/ptrace.c b/arch/s390/kernel/ptrace.c index fca78b269349d..e007224b65bb2 100644 --- a/arch/s390/kernel/ptrace.c +++ b/arch/s390/kernel/ptrace.c @@ -324,6 +324,25 @@ static inline void __poke_user_per(struct task_struct *child, child->thread.per_user.end = data; }
+static void fixup_int_code(struct task_struct *child, addr_t data) +{ + struct pt_regs *regs = task_pt_regs(child); + int ilc = regs->int_code >> 16; + u16 insn; + + if (ilc > 6) + return; + + if (ptrace_access_vm(child, regs->psw.addr - (regs->int_code >> 16), + &insn, sizeof(insn), FOLL_FORCE) != sizeof(insn)) + return; + + /* double check that tracee stopped on svc instruction */ + if ((insn >> 8) != 0xa) + return; + + regs->int_code = 0x20000 | (data & 0xffff); +} /* * Write a word to the user area of a process at location addr. This * operation does have an additional problem compared to peek_user. @@ -335,7 +354,9 @@ static int __poke_user(struct task_struct *child, addr_t addr, addr_t data) struct user *dummy = NULL; addr_t offset;
+ if (addr < (addr_t) &dummy->regs.acrs) { + struct pt_regs *regs = task_pt_regs(child); /* * psw and gprs are stored on the stack */ @@ -353,7 +374,11 @@ static int __poke_user(struct task_struct *child, addr_t addr, addr_t data) /* Invalid addressing mode bits */ return -EINVAL; } - *(addr_t *)((addr_t) &task_pt_regs(child)->psw + addr) = data; + + if (test_pt_regs_flag(regs, PIF_SYSCALL) && + addr == offsetof(struct user, regs.gprs[2])) + fixup_int_code(child, data); + *(addr_t *)((addr_t) ®s->psw + addr) = data;
} else if (addr < (addr_t) (&dummy->regs.orig_gpr2)) { /* @@ -719,6 +744,10 @@ static int __poke_user_compat(struct task_struct *child, regs->psw.mask = (regs->psw.mask & ~PSW_MASK_BA) | (__u64)(tmp & PSW32_ADDR_AMODE); } else { + + if (test_pt_regs_flag(regs, PIF_SYSCALL) && + addr == offsetof(struct compat_user, regs.gprs[2])) + fixup_int_code(child, data); /* gpr 0-15 */ *(__u32*)((addr_t) ®s->psw + addr*2 + 4) = tmp; }
From: Nathan Chancellor natechancellor@gmail.com
[ Upstream commit 2b2a25845d534ac6d55086e35c033961fdd83a26 ]
Currently, the VDSO is being linked through $(CC). This does not match how the rest of the kernel links objects, which is through the $(LD) variable.
When clang is built in a default configuration, it first attempts to use the target triple's default linker, which is just ld. However, the user can override this through the CLANG_DEFAULT_LINKER cmake define so that clang uses another linker by default, such as LLVM's own linker, ld.lld. This can be useful to get more optimized links across various different projects.
However, this is problematic for the s390 vDSO because ld.lld does not have any s390 emulatiom support:
https://github.com/llvm/llvm-project/blob/llvmorg-10.0.1-rc1/lld/ELF/Driver....
Thus, if a user is using a toolchain with ld.lld as the default, they will see an error, even if they have specified ld.bfd through the LD make variable:
$ make -j"$(nproc)" -s ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- LLVM=1 \ LD=s390x-linux-gnu-ld \ defconfig arch/s390/kernel/vdso64/ ld.lld: error: unknown emulation: elf64_s390 clang-11: error: linker command failed with exit code 1 (use -v to see invocation)
Normally, '-fuse-ld=bfd' could be used to get around this; however, this can be fragile, depending on paths and variable naming. The cleaner solution for the kernel is to take advantage of the fact that $(LD) can be invoked directly, which bypasses the heuristics of $(CC) and respects the user's choice. Similar changes have been done for ARM, ARM64, and MIPS.
Link: https://lkml.kernel.org/r/20200602192523.32758-1-natechancellor@gmail.com Link: https://github.com/ClangBuiltLinux/linux/issues/1041 Signed-off-by: Nathan Chancellor natechancellor@gmail.com Reviewed-by: Nick Desaulniers ndesaulniers@google.com [heiko.carstens@de.ibm.com: add --build-id flag] Signed-off-by: Heiko Carstens heiko.carstens@de.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/vdso64/Makefile | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/arch/s390/kernel/vdso64/Makefile b/arch/s390/kernel/vdso64/Makefile index bec19e7e6e1cf..4a66a1cb919b1 100644 --- a/arch/s390/kernel/vdso64/Makefile +++ b/arch/s390/kernel/vdso64/Makefile @@ -18,8 +18,8 @@ KBUILD_AFLAGS_64 += -m64 -s
KBUILD_CFLAGS_64 := $(filter-out -m64,$(KBUILD_CFLAGS)) KBUILD_CFLAGS_64 += -m64 -fPIC -shared -fno-common -fno-builtin -KBUILD_CFLAGS_64 += -nostdlib -Wl,-soname=linux-vdso64.so.1 \ - -Wl,--hash-style=both +ldflags-y := -fPIC -shared -nostdlib -soname=linux-vdso64.so.1 \ + --hash-style=both --build-id -T
$(targets:%=$(obj)/%.dbg): KBUILD_CFLAGS = $(KBUILD_CFLAGS_64) $(targets:%=$(obj)/%.dbg): KBUILD_AFLAGS = $(KBUILD_AFLAGS_64) @@ -37,8 +37,8 @@ KASAN_SANITIZE := n $(obj)/vdso64_wrapper.o : $(obj)/vdso64.so
# link rule for the .so file, .lds has to be first -$(obj)/vdso64.so.dbg: $(src)/vdso64.lds $(obj-vdso64) FORCE - $(call if_changed,vdso64ld) +$(obj)/vdso64.so.dbg: $(obj)/vdso64.lds $(obj-vdso64) FORCE + $(call if_changed,ld)
# strip rule for the .so file $(obj)/%.so: OBJCOPYFLAGS := -S @@ -50,8 +50,6 @@ $(obj-vdso64): %.o: %.S FORCE $(call if_changed_dep,vdso64as)
# actual build commands -quiet_cmd_vdso64ld = VDSO64L $@ - cmd_vdso64ld = $(CC) $(c_flags) -Wl,-T $(filter %.lds %.o,$^) -o $@ quiet_cmd_vdso64as = VDSO64A $@ cmd_vdso64as = $(CC) $(a_flags) -c -o $@ $<
From: Vincenzo Frascino vincenzo.frascino@arm.com
[ Upstream commit 478237a595120a18e9b52fd2c57a6e8b7a01e411 ]
clock_getres in the vDSO library has to preserve the same behaviour of posix_get_hrtimer_res().
In particular, posix_get_hrtimer_res() does: sec = 0; ns = hrtimer_resolution; and hrtimer_resolution depends on the enablement of the high resolution timers that can happen either at compile or at run time.
Fix the s390 vdso implementation of clock_getres keeping a copy of hrtimer_resolution in vdso data and using that directly.
Link: https://lkml.kernel.org/r/20200324121027.21665-1-vincenzo.frascino@arm.com Signed-off-by: Vincenzo Frascino vincenzo.frascino@arm.com Acked-by: Martin Schwidefsky schwidefsky@de.ibm.com [heiko.carstens@de.ibm.com: use llgf for proper zero extension] Signed-off-by: Heiko Carstens heiko.carstens@de.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/include/asm/vdso.h | 1 + arch/s390/kernel/asm-offsets.c | 2 +- arch/s390/kernel/time.c | 1 + arch/s390/kernel/vdso64/clock_getres.S | 10 +++++----- 4 files changed, 8 insertions(+), 6 deletions(-)
diff --git a/arch/s390/include/asm/vdso.h b/arch/s390/include/asm/vdso.h index 3bcfdeb013951..0cd085cdeb4f2 100644 --- a/arch/s390/include/asm/vdso.h +++ b/arch/s390/include/asm/vdso.h @@ -36,6 +36,7 @@ struct vdso_data { __u32 tk_shift; /* Shift used for xtime_nsec 0x60 */ __u32 ts_dir; /* TOD steering direction 0x64 */ __u64 ts_end; /* TOD steering end 0x68 */ + __u32 hrtimer_res; /* hrtimer resolution 0x70 */ };
struct vdso_per_cpu_data { diff --git a/arch/s390/kernel/asm-offsets.c b/arch/s390/kernel/asm-offsets.c index e80f0e6f59722..46f84cb0d5528 100644 --- a/arch/s390/kernel/asm-offsets.c +++ b/arch/s390/kernel/asm-offsets.c @@ -76,6 +76,7 @@ int main(void) OFFSET(__VDSO_TK_SHIFT, vdso_data, tk_shift); OFFSET(__VDSO_TS_DIR, vdso_data, ts_dir); OFFSET(__VDSO_TS_END, vdso_data, ts_end); + OFFSET(__VDSO_CLOCK_REALTIME_RES, vdso_data, hrtimer_res); OFFSET(__VDSO_ECTG_BASE, vdso_per_cpu_data, ectg_timer_base); OFFSET(__VDSO_ECTG_USER, vdso_per_cpu_data, ectg_user_time); OFFSET(__VDSO_GETCPU_VAL, vdso_per_cpu_data, getcpu_val); @@ -86,7 +87,6 @@ int main(void) DEFINE(__CLOCK_REALTIME_COARSE, CLOCK_REALTIME_COARSE); DEFINE(__CLOCK_MONOTONIC_COARSE, CLOCK_MONOTONIC_COARSE); DEFINE(__CLOCK_THREAD_CPUTIME_ID, CLOCK_THREAD_CPUTIME_ID); - DEFINE(__CLOCK_REALTIME_RES, MONOTONIC_RES_NSEC); DEFINE(__CLOCK_COARSE_RES, LOW_RES_NSEC); BLANK(); /* idle data offsets */ diff --git a/arch/s390/kernel/time.c b/arch/s390/kernel/time.c index f9d070d016e35..b1113b5194325 100644 --- a/arch/s390/kernel/time.c +++ b/arch/s390/kernel/time.c @@ -301,6 +301,7 @@ void update_vsyscall(struct timekeeper *tk)
vdso_data->tk_mult = tk->tkr_mono.mult; vdso_data->tk_shift = tk->tkr_mono.shift; + vdso_data->hrtimer_res = hrtimer_resolution; smp_wmb(); ++vdso_data->tb_update_count; } diff --git a/arch/s390/kernel/vdso64/clock_getres.S b/arch/s390/kernel/vdso64/clock_getres.S index 081435398e0a1..0c79caa32b592 100644 --- a/arch/s390/kernel/vdso64/clock_getres.S +++ b/arch/s390/kernel/vdso64/clock_getres.S @@ -17,12 +17,14 @@ .type __kernel_clock_getres,@function __kernel_clock_getres: CFI_STARTPROC - larl %r1,4f + larl %r1,3f + lg %r0,0(%r1) cghi %r2,__CLOCK_REALTIME_COARSE je 0f cghi %r2,__CLOCK_MONOTONIC_COARSE je 0f - larl %r1,3f + larl %r1,_vdso_data + llgf %r0,__VDSO_CLOCK_REALTIME_RES(%r1) cghi %r2,__CLOCK_REALTIME je 0f cghi %r2,__CLOCK_MONOTONIC @@ -36,7 +38,6 @@ __kernel_clock_getres: jz 2f 0: ltgr %r3,%r3 jz 1f /* res == NULL */ - lg %r0,0(%r1) xc 0(8,%r3),0(%r3) /* set tp->tv_sec to zero */ stg %r0,8(%r3) /* store tp->tv_usec */ 1: lghi %r2,0 @@ -45,6 +46,5 @@ __kernel_clock_getres: svc 0 br %r14 CFI_ENDPROC -3: .quad __CLOCK_REALTIME_RES -4: .quad __CLOCK_COARSE_RES +3: .quad __CLOCK_COARSE_RES .size __kernel_clock_getres,.-__kernel_clock_getres
From: Will Deacon will@kernel.org
[ Upstream commit e575fb9e76c8e33440fb859572a8b7d430f053d6 ]
When I squashed the 'allnoconfig' compiler warning about the set_sve_default_vl() function being defined but not used in commit 1e570f512cbd ("arm64/sve: Eliminate data races on sve_default_vl"), I accidentally broke the build for configs where ARM64_SVE is enabled, but SYSCTL is not.
Fix this by only compiling the SVE sysctl support if both CONFIG_SVE=y and CONFIG_SYSCTL=y.
Cc: Dave Martin Dave.Martin@arm.com Reported-by: Qian Cai cai@lca.pw Link: https://lore.kernel.org/r/20200616131808.GA1040@lca.pw Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/kernel/fpsimd.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c index 4a77263c183b3..befc5a715dc4e 100644 --- a/arch/arm64/kernel/fpsimd.c +++ b/arch/arm64/kernel/fpsimd.c @@ -349,7 +349,7 @@ static unsigned int find_supported_vector_length(unsigned int vl) return sve_vl_from_vq(__bit_to_vq(bit)); }
-#ifdef CONFIG_SYSCTL +#if defined(CONFIG_ARM64_SVE) && defined(CONFIG_SYSCTL)
static int sve_proc_do_default_vl(struct ctl_table *table, int write, void __user *buffer, size_t *lenp, @@ -395,9 +395,9 @@ static int __init sve_sysctl_init(void) return 0; }
-#else /* ! CONFIG_SYSCTL */ +#else /* ! (CONFIG_ARM64_SVE && CONFIG_SYSCTL) */ static int __init sve_sysctl_init(void) { return 0; } -#endif /* ! CONFIG_SYSCTL */ +#endif /* ! (CONFIG_ARM64_SVE && CONFIG_SYSCTL) */
#define ZREG(sve_state, vq, n) ((char *)(sve_state) + \ (SVE_SIG_ZREG_OFFSET(vq, n) - SVE_SIG_REGS_OFFSET))
From: Masahiro Yamada masahiroy@kernel.org
[ Upstream commit f2f02ebd8f3833626642688b2d2c6a7b3c141fa9 ]
When cc-option and friends evaluate compiler flags, the temporary file $$TMP is created as an output object, and automatically cleaned up. The actual file path of $$TMP is .<pid>.tmp, here <pid> is the process ID of $(shell ...) invoked from cc-option. (Please note $$$$ is the escape sequence of $$).
Such garbage files are cleaned up in most cases, but some compiler flags create additional output files.
For example, -gsplit-dwarf creates a .dwo file.
When CONFIG_DEBUG_INFO_SPLIT=y, you will see a bunch of .<pid>.dwo files left in the top of build directories. You may not notice them unless you do 'ls -a', but the garbage files will increase every time you run 'make'.
This commit changes the temporary object path to .tmp_<pid>/tmp, and removes .tmp_<pid> directory when exiting. Separate build artifacts such as *.dwo will be cleaned up all together because their file paths are usually determined based on the base name of the object.
Another example is -ftest-coverage, which outputs the coverage data into <base-name-of-object>.gcno
Signed-off-by: Masahiro Yamada masahiroy@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- scripts/Kbuild.include | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include index 6cabf20ce66a3..fe427f7fcfb31 100644 --- a/scripts/Kbuild.include +++ b/scripts/Kbuild.include @@ -86,20 +86,21 @@ cc-cross-prefix = $(firstword $(foreach c, $(1), \ $(if $(shell command -v -- $(c)gcc 2>/dev/null), $(c))))
# output directory for tests below -TMPOUT := $(if $(KBUILD_EXTMOD),$(firstword $(KBUILD_EXTMOD))/) +TMPOUT = $(if $(KBUILD_EXTMOD),$(firstword $(KBUILD_EXTMOD))/).tmp_$$$$
# try-run # Usage: option = $(call try-run, $(CC)...-o "$$TMP",option-ok,otherwise) # Exit code chooses option. "$$TMP" serves as a temporary file and is # automatically cleaned up. try-run = $(shell set -e; \ - TMP="$(TMPOUT).$$$$.tmp"; \ - TMPO="$(TMPOUT).$$$$.o"; \ + TMP=$(TMPOUT)/tmp; \ + TMPO=$(TMPOUT)/tmp.o; \ + mkdir -p $(TMPOUT); \ + trap "rm -rf $(TMPOUT)" EXIT; \ if ($(1)) >/dev/null 2>&1; \ then echo "$(2)"; \ else echo "$(3)"; \ - fi; \ - rm -f "$$TMP" "$$TMPO") + fi)
# as-option # Usage: cflags-y += $(call as-option,-Wa$(comma)-isa=foo,)
From: Sami Tolvanen samitolvanen@google.com
[ Upstream commit 4ef57b21d6fb49d2b25c47e4cff467a0c2c8b6b7 ]
When compiling a kernel with Clang and LTO, we need to run recordmcount on vmlinux.o with a large number of sections, which currently fails as the program doesn't understand extended section indexes. This change adds support for processing binaries with >64k sections.
Link: https://lkml.kernel.org/r/20200424193046.160744-1-samitolvanen@google.com Link: https://lore.kernel.org/lkml/CAK7LNARbZhoaA=Nnuw0=gBrkuKbr_4Ng_Ei57uafujZf7X...
Cc: Kees Cook keescook@chromium.org Reviewed-by: Matt Helsley mhelsley@vmware.com Signed-off-by: Sami Tolvanen samitolvanen@google.com Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- scripts/recordmcount.h | 98 +++++++++++++++++++++++++++++++++++++++--- 1 file changed, 92 insertions(+), 6 deletions(-)
diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h index 74eab03e31d4d..f9b19524da112 100644 --- a/scripts/recordmcount.h +++ b/scripts/recordmcount.h @@ -29,6 +29,11 @@ #undef has_rel_mcount #undef tot_relsize #undef get_mcountsym +#undef find_symtab +#undef get_shnum +#undef set_shnum +#undef get_shstrndx +#undef get_symindex #undef get_sym_str_and_relp #undef do_func #undef Elf_Addr @@ -58,6 +63,11 @@ # define __has_rel_mcount __has64_rel_mcount # define has_rel_mcount has64_rel_mcount # define tot_relsize tot64_relsize +# define find_symtab find_symtab64 +# define get_shnum get_shnum64 +# define set_shnum set_shnum64 +# define get_shstrndx get_shstrndx64 +# define get_symindex get_symindex64 # define get_sym_str_and_relp get_sym_str_and_relp_64 # define do_func do64 # define get_mcountsym get_mcountsym_64 @@ -91,6 +101,11 @@ # define __has_rel_mcount __has32_rel_mcount # define has_rel_mcount has32_rel_mcount # define tot_relsize tot32_relsize +# define find_symtab find_symtab32 +# define get_shnum get_shnum32 +# define set_shnum set_shnum32 +# define get_shstrndx get_shstrndx32 +# define get_symindex get_symindex32 # define get_sym_str_and_relp get_sym_str_and_relp_32 # define do_func do32 # define get_mcountsym get_mcountsym_32 @@ -173,6 +188,67 @@ static int MIPS_is_fake_mcount(Elf_Rel const *rp) return is_fake; }
+static unsigned int get_symindex(Elf_Sym const *sym, Elf32_Word const *symtab, + Elf32_Word const *symtab_shndx) +{ + unsigned long offset; + int index; + + if (sym->st_shndx != SHN_XINDEX) + return w2(sym->st_shndx); + + offset = (unsigned long)sym - (unsigned long)symtab; + index = offset / sizeof(*sym); + + return w(symtab_shndx[index]); +} + +static unsigned int get_shnum(Elf_Ehdr const *ehdr, Elf_Shdr const *shdr0) +{ + if (shdr0 && !ehdr->e_shnum) + return w(shdr0->sh_size); + + return w2(ehdr->e_shnum); +} + +static void set_shnum(Elf_Ehdr *ehdr, Elf_Shdr *shdr0, unsigned int new_shnum) +{ + if (new_shnum >= SHN_LORESERVE) { + ehdr->e_shnum = 0; + shdr0->sh_size = w(new_shnum); + } else + ehdr->e_shnum = w2(new_shnum); +} + +static int get_shstrndx(Elf_Ehdr const *ehdr, Elf_Shdr const *shdr0) +{ + if (ehdr->e_shstrndx != SHN_XINDEX) + return w2(ehdr->e_shstrndx); + + return w(shdr0->sh_link); +} + +static void find_symtab(Elf_Ehdr *const ehdr, Elf_Shdr const *shdr0, + unsigned const nhdr, Elf32_Word **symtab, + Elf32_Word **symtab_shndx) +{ + Elf_Shdr const *relhdr; + unsigned k; + + *symtab = NULL; + *symtab_shndx = NULL; + + for (relhdr = shdr0, k = nhdr; k; --k, ++relhdr) { + if (relhdr->sh_type == SHT_SYMTAB) + *symtab = (void *)ehdr + relhdr->sh_offset; + else if (relhdr->sh_type == SHT_SYMTAB_SHNDX) + *symtab_shndx = (void *)ehdr + relhdr->sh_offset; + + if (*symtab && *symtab_shndx) + break; + } +} + /* Append the new shstrtab, Elf_Shdr[], __mcount_loc and its relocations. */ static int append_func(Elf_Ehdr *const ehdr, Elf_Shdr *const shstr, @@ -188,10 +264,12 @@ static int append_func(Elf_Ehdr *const ehdr, char const *mc_name = (sizeof(Elf_Rela) == rel_entsize) ? ".rela__mcount_loc" : ".rel__mcount_loc"; - unsigned const old_shnum = w2(ehdr->e_shnum); uint_t const old_shoff = _w(ehdr->e_shoff); uint_t const old_shstr_sh_size = _w(shstr->sh_size); uint_t const old_shstr_sh_offset = _w(shstr->sh_offset); + Elf_Shdr *const shdr0 = (Elf_Shdr *)(old_shoff + (void *)ehdr); + unsigned int const old_shnum = get_shnum(ehdr, shdr0); + unsigned int const new_shnum = 2 + old_shnum; /* {.rel,}__mcount_loc */ uint_t t = 1 + strlen(mc_name) + _w(shstr->sh_size); uint_t new_e_shoff;
@@ -201,6 +279,8 @@ static int append_func(Elf_Ehdr *const ehdr, t += (_align & -t); /* word-byte align */ new_e_shoff = t;
+ set_shnum(ehdr, shdr0, new_shnum); + /* body for new shstrtab */ if (ulseek(sb.st_size, SEEK_SET) < 0) return -1; @@ -255,7 +335,6 @@ static int append_func(Elf_Ehdr *const ehdr, return -1;
ehdr->e_shoff = _w(new_e_shoff); - ehdr->e_shnum = w2(2 + w2(ehdr->e_shnum)); /* {.rel,}__mcount_loc */ if (ulseek(0, SEEK_SET) < 0) return -1; if (uwrite(ehdr, sizeof(*ehdr)) < 0) @@ -434,6 +513,8 @@ static int find_secsym_ndx(unsigned const txtndx, uint_t *const recvalp, unsigned int *sym_index, Elf_Shdr const *const symhdr, + Elf32_Word const *symtab, + Elf32_Word const *symtab_shndx, Elf_Ehdr const *const ehdr) { Elf_Sym const *const sym0 = (Elf_Sym const *)(_w(symhdr->sh_offset) @@ -445,7 +526,7 @@ static int find_secsym_ndx(unsigned const txtndx, for (symp = sym0, t = nsym; t; --t, ++symp) { unsigned int const st_bind = ELF_ST_BIND(symp->st_info);
- if (txtndx == w2(symp->st_shndx) + if (txtndx == get_symindex(symp, symtab, symtab_shndx) /* avoid STB_WEAK */ && (STB_LOCAL == st_bind || STB_GLOBAL == st_bind)) { /* function symbols on ARM have quirks, avoid them */ @@ -516,21 +597,23 @@ static unsigned tot_relsize(Elf_Shdr const *const shdr0, return totrelsz; }
- /* Overall supervision for Elf32 ET_REL file. */ static int do_func(Elf_Ehdr *const ehdr, char const *const fname, unsigned const reltype) { Elf_Shdr *const shdr0 = (Elf_Shdr *)(_w(ehdr->e_shoff) + (void *)ehdr); - unsigned const nhdr = w2(ehdr->e_shnum); - Elf_Shdr *const shstr = &shdr0[w2(ehdr->e_shstrndx)]; + unsigned const nhdr = get_shnum(ehdr, shdr0); + Elf_Shdr *const shstr = &shdr0[get_shstrndx(ehdr, shdr0)]; char const *const shstrtab = (char const *)(_w(shstr->sh_offset) + (void *)ehdr);
Elf_Shdr const *relhdr; unsigned k;
+ Elf32_Word *symtab; + Elf32_Word *symtab_shndx; + /* Upper bound on space: assume all relevant relocs are for mcount. */ unsigned totrelsz;
@@ -561,6 +644,8 @@ static int do_func(Elf_Ehdr *const ehdr, char const *const fname, return -1; }
+ find_symtab(ehdr, shdr0, nhdr, &symtab, &symtab_shndx); + for (relhdr = shdr0, k = nhdr; k; --k, ++relhdr) { char const *const txtname = has_rel_mcount(relhdr, shdr0, shstrtab, fname); @@ -577,6 +662,7 @@ static int do_func(Elf_Ehdr *const ehdr, char const *const fname, result = find_secsym_ndx(w(relhdr->sh_info), txtname, &recval, &recsym, &shdr0[symsec_sh_link], + symtab, symtab_shndx, ehdr); if (result) goto out;
From: Masami Hiramatsu mhiramat@kernel.org
[ Upstream commit 6743ad432ec92e680cd0d9db86cb17b949cf5a43 ]
Anders reported that the lockdep warns that suspicious RCU list usage in register_kprobe() (detected by CONFIG_PROVE_RCU_LIST.) This is because get_kprobe() access kprobe_table[] by hlist_for_each_entry_rcu() without rcu_read_lock.
If we call get_kprobe() from the breakpoint handler context, it is run with preempt disabled, so this is not a problem. But in other cases, instead of rcu_read_lock(), we locks kprobe_mutex so that the kprobe_table[] is not updated. So, current code is safe, but still not good from the view point of RCU.
Joel suggested that we can silent that warning by passing lockdep_is_held() to the last argument of hlist_for_each_entry_rcu().
Add lockdep_is_held(&kprobe_mutex) at the end of the hlist_for_each_entry_rcu() to suppress the warning.
Link: http://lkml.kernel.org/r/158927055350.27680.10261450713467997503.stgit@devno...
Reported-by: Anders Roxell anders.roxell@linaro.org Suggested-by: Joel Fernandes joel@joelfernandes.org Reviewed-by: Joel Fernandes (Google) joel@joelfernandes.org Signed-off-by: Masami Hiramatsu mhiramat@kernel.org Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/kprobes.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/kprobes.c b/kernel/kprobes.c index 195ecb955fcc5..950a5cfd262ce 100644 --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -326,7 +326,8 @@ struct kprobe *get_kprobe(void *addr) struct kprobe *p;
head = &kprobe_table[hash_ptr(addr, KPROBE_HASH_BITS)]; - hlist_for_each_entry_rcu(p, head, hlist) { + hlist_for_each_entry_rcu(p, head, hlist, + lockdep_is_held(&kprobe_mutex)) { if (p->addr == addr) return p; }
From: Luis Chamberlain mcgrof@kernel.org
[ Upstream commit 1b0b283648163dae2a214ca28ed5a99f62a77319 ]
We use one blktrace per request_queue, that means one per the entire disk. So we cannot run one blktrace on say /dev/vda and then /dev/vda1, or just two calls on /dev/vda.
We check for concurrent setup only at the very end of the blktrace setup though.
If we try to run two concurrent blktraces on the same block device the second one will fail, and the first one seems to go on. However when one tries to kill the first one one will see things like this:
The kernel will show these:
``` debugfs: File 'dropped' in directory 'nvme1n1' already present! debugfs: File 'msg' in directory 'nvme1n1' already present! debugfs: File 'trace0' in directory 'nvme1n1' already present! ``
And userspace just sees this error message for the second call:
``` blktrace /dev/nvme1n1 BLKTRACESETUP(2) /dev/nvme1n1 failed: 5/Input/output error ```
The first userspace process #1 will also claim that the files were taken underneath their nose as well. The files are taken away form the first process given that when the second blktrace fails, it will follow up with a BLKTRACESTOP and BLKTRACETEARDOWN. This means that even if go-happy process #1 is waiting for blktrace data, we *have* been asked to take teardown the blktrace.
This can easily be reproduced with break-blktrace [0] run_0005.sh test.
Just break out early if we know we're already going to fail, this will prevent trying to create the files all over again, which we know still exist.
[0] https://github.com/mcgrof/break-blktrace
Signed-off-by: Luis Chamberlain mcgrof@kernel.org Signed-off-by: Jan Kara jack@suse.cz Reviewed-by: Bart Van Assche bvanassche@acm.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/blktrace.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c index 35610a4be4a9e..085fceca33774 100644 --- a/kernel/trace/blktrace.c +++ b/kernel/trace/blktrace.c @@ -3,6 +3,9 @@ * Copyright (C) 2006 Jens Axboe axboe@kernel.dk * */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include <linux/kernel.h> #include <linux/blkdev.h> #include <linux/blktrace_api.h> @@ -494,6 +497,16 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev, */ strreplace(buts->name, '/', '_');
+ /* + * bdev can be NULL, as with scsi-generic, this is a helpful as + * we can be. + */ + if (q->blk_trace) { + pr_warn("Concurrent blktraces are not allowed on %s\n", + buts->name); + return -EBUSY; + } + bt = kzalloc(sizeof(*bt), GFP_KERNEL); if (!bt) return -ENOMEM;
From: Vishal Verma vishal.l.verma@intel.com
[ Upstream commit 543094e19c82b5d171e139d09a1a3ea0a7361117 ]
It is possible that a platform that is capable of 'namespace labels' comes up without the labels properly initialized. In this case, the region's 'align' attribute is hidden. Howerver, once the user does initialize he labels, the 'align' attribute still stays hidden, which is unexpected.
The sysfs_update_group() API is meant to address this, and could be called during region probe, but it has entanglements with the device 'lockdep_mutex'. Therefore, simply make the 'align' attribute always visible. It doesn't matter what it says for label-less namespaces, since it is not possible to change their allocation anyway.
Suggested-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Vishal Verma vishal.l.verma@intel.com Cc: Dan Williams dan.j.williams@intel.com Link: https://lore.kernel.org/r/20200520225026.29426-1-vishal.l.verma@intel.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvdimm/region_devs.c | 14 ++------------ 1 file changed, 2 insertions(+), 12 deletions(-)
diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c index ccbb5b43b8b2c..4502f9c4708d0 100644 --- a/drivers/nvdimm/region_devs.c +++ b/drivers/nvdimm/region_devs.c @@ -679,18 +679,8 @@ static umode_t region_visible(struct kobject *kobj, struct attribute *a, int n) return a->mode; }
- if (a == &dev_attr_align.attr) { - int i; - - for (i = 0; i < nd_region->ndr_mappings; i++) { - struct nd_mapping *nd_mapping = &nd_region->mapping[i]; - struct nvdimm *nvdimm = nd_mapping->nvdimm; - - if (test_bit(NDD_LABELING, &nvdimm->flags)) - return a->mode; - } - return 0; - } + if (a == &dev_attr_align.attr) + return a->mode;
if (a != &dev_attr_set_cookie.attr && a != &dev_attr_available_size.attr)
From: Weiping Zhang zhangweiping@didiglobal.com
[ Upstream commit fe35ec58f0d339221643287bbb7cee15c93a5389 ]
There is an issue when tune the number for read and write queues, if the total queue count was not changed. The hctx->type cannot be updated, since __blk_mq_update_nr_hw_queues will return directly if the total queue count has not been changed.
Reproduce:
dmesg | grep "default/read/poll" [ 2.607459] nvme nvme0: 48/0/0 default/read/poll queues cat /sys/kernel/debug/block/nvme0n1/hctx*/type | sort | uniq -c 48 default
tune the write queues to 24: echo 24 > /sys/module/nvme/parameters/write_queues echo 1 > /sys/block/nvme0n1/device/reset_controller
dmesg | grep "default/read/poll" [ 433.547235] nvme nvme0: 24/24/0 default/read/poll queues
cat /sys/kernel/debug/block/nvme0n1/hctx*/type | sort | uniq -c 48 default
The driver's hardware queue mapping is not same as block layer.
Signed-off-by: Weiping Zhang zhangweiping@didiglobal.com Reviewed-by: Ming Lei ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-mq.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c index 98a702761e2cc..8f580e66691b9 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -3328,7 +3328,9 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
if (set->nr_maps == 1 && nr_hw_queues > nr_cpu_ids) nr_hw_queues = nr_cpu_ids; - if (nr_hw_queues < 1 || nr_hw_queues == set->nr_hw_queues) + if (nr_hw_queues < 1) + return; + if (set->nr_maps == 1 && nr_hw_queues == set->nr_hw_queues) return;
list_for_each_entry(q, &set->tag_list, tag_set_list)
From: Yash Shah yash.shah@sifive.com
[ Upstream commit e0d17c842c0f824fd4df9f4688709fc6907201e1 ]
As per the table 4.4 of version "20190608-Priv-MSU-Ratified" of the RISC-V instruction set manual[0], the PTE permission bit combination of "write+exec only" is reserved for future use. Hence, don't allow such mapping request in mmap call.
An issue is been reported by David Abdurachmanov, that while running stress-ng with "sysbadaddr" argument, RCU stalls are observed on RISC-V specific kernel.
This issue arises when the stress-sysbadaddr request for pages with "write+exec only" permission bits and then passes the address obtain from this mmap call to various system call. For the riscv kernel, the mmap call should fail for this particular combination of permission bits since it's not valid.
[0]: http://dabbelt.com/~palmer/keep/riscv-isa-manual/riscv-privileged-20190608-1...
Signed-off-by: Yash Shah yash.shah@sifive.com Reported-by: David Abdurachmanov david.abdurachmanov@gmail.com [Palmer: Refer to the latest ISA specification at the only link I could find, and update the terminology.] Signed-off-by: Palmer Dabbelt palmerdabbelt@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kernel/sys_riscv.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/arch/riscv/kernel/sys_riscv.c b/arch/riscv/kernel/sys_riscv.c index f3619f59d85cc..12f8a7fce78b1 100644 --- a/arch/riscv/kernel/sys_riscv.c +++ b/arch/riscv/kernel/sys_riscv.c @@ -8,6 +8,7 @@ #include <linux/syscalls.h> #include <asm/unistd.h> #include <asm/cacheflush.h> +#include <asm-generic/mman-common.h>
static long riscv_sys_mmap(unsigned long addr, unsigned long len, unsigned long prot, unsigned long flags, @@ -16,6 +17,11 @@ static long riscv_sys_mmap(unsigned long addr, unsigned long len, { if (unlikely(offset & (~PAGE_MASK >> page_shift_offset))) return -EINVAL; + + if ((prot & PROT_WRITE) && (prot & PROT_EXEC)) + if (unlikely(!(prot & PROT_READ))) + return -EINVAL; + return ksys_mmap_pgoff(addr, len, prot, flags, fd, offset >> (PAGE_SHIFT - page_shift_offset)); }
From: Jiri Slaby jslaby@suse.cz
commit 8e742aa79780b13cd300a42198c1a4cea9c89905 upstream.
After the commit below, truncate() on x86 32bit uses ksys_ftruncate(). But ksys_ftruncate() truncates the offset to unsigned long.
Switch the type of offset to loff_t which is what do_sys_ftruncate() expects.
Fixes: 121b32a58a3a (x86/entry/32: Use IA32-specific wrappers for syscalls taking 64-bit arguments) Signed-off-by: Jiri Slaby jslaby@suse.cz Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Brian Gerst brgerst@gmail.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200610114851.28549-1-jslaby@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/syscalls.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h index 1815065d52f37..1b0c7813197b0 100644 --- a/include/linux/syscalls.h +++ b/include/linux/syscalls.h @@ -1358,7 +1358,7 @@ static inline long ksys_lchown(const char __user *filename, uid_t user,
extern long do_sys_ftruncate(unsigned int fd, loff_t length, int small);
-static inline long ksys_ftruncate(unsigned int fd, unsigned long length) +static inline long ksys_ftruncate(unsigned int fd, loff_t length) { return do_sys_ftruncate(fd, length, 1); }
From: Aaron Plattner aplattner@nvidia.com
commit adb36a8203831e40494a92095dacd566b2ad4a69 upstream.
These IDs are for upcoming NVIDIA chips with audio functions that are largely similar to the existing ones.
Signed-off-by: Aaron Plattner aplattner@nvidia.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200611180845.39942-1-aplattner@nvidia.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_hdmi.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/sound/pci/hda/patch_hdmi.c b/sound/pci/hda/patch_hdmi.c index 93760a3564cfa..137d655fed8f8 100644 --- a/sound/pci/hda/patch_hdmi.c +++ b/sound/pci/hda/patch_hdmi.c @@ -4145,6 +4145,11 @@ HDA_CODEC_ENTRY(0x10de0095, "GPU 95 HDMI/DP", patch_nvhdmi), HDA_CODEC_ENTRY(0x10de0097, "GPU 97 HDMI/DP", patch_nvhdmi), HDA_CODEC_ENTRY(0x10de0098, "GPU 98 HDMI/DP", patch_nvhdmi), HDA_CODEC_ENTRY(0x10de0099, "GPU 99 HDMI/DP", patch_nvhdmi), +HDA_CODEC_ENTRY(0x10de009a, "GPU 9a HDMI/DP", patch_nvhdmi), +HDA_CODEC_ENTRY(0x10de009d, "GPU 9d HDMI/DP", patch_nvhdmi), +HDA_CODEC_ENTRY(0x10de009e, "GPU 9e HDMI/DP", patch_nvhdmi), +HDA_CODEC_ENTRY(0x10de009f, "GPU 9f HDMI/DP", patch_nvhdmi), +HDA_CODEC_ENTRY(0x10de00a0, "GPU a0 HDMI/DP", patch_nvhdmi), HDA_CODEC_ENTRY(0x10de8001, "MCP73 HDMI", patch_nvhdmi_2ch), HDA_CODEC_ENTRY(0x10de8067, "MCP67/68 HDMI", patch_nvhdmi_2ch), HDA_CODEC_ENTRY(0x11069f80, "VX900 HDMI/DP", patch_via_hdmi),
From: Takashi Iwai tiwai@suse.de
commit a0b03952a797591d4b6d6fa7b9b7872e27783729 upstream.
MSI GE63 laptop with ALC1220 codec requires the very same quirk (ALC1220_FIXUP_CLEVO_P950) as other MSI devices for the proper sound output.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=208057 Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200616132150.8778-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index e057ecb5a9045..f58102925c705 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -2460,6 +2460,7 @@ static const struct snd_pci_quirk alc882_fixup_tbl[] = { SND_PCI_QUIRK(0x1458, 0xa0b8, "Gigabyte AZ370-Gaming", ALC1220_FIXUP_GB_DUAL_CODECS), SND_PCI_QUIRK(0x1458, 0xa0cd, "Gigabyte X570 Aorus Master", ALC1220_FIXUP_CLEVO_P950), SND_PCI_QUIRK(0x1458, 0xa0ce, "Gigabyte X570 Aorus Xtreme", ALC1220_FIXUP_CLEVO_P950), + SND_PCI_QUIRK(0x1462, 0x11f7, "MSI-GE63", ALC1220_FIXUP_CLEVO_P950), SND_PCI_QUIRK(0x1462, 0x1228, "MSI-GP63", ALC1220_FIXUP_CLEVO_P950), SND_PCI_QUIRK(0x1462, 0x1275, "MSI-GL63", ALC1220_FIXUP_CLEVO_P950), SND_PCI_QUIRK(0x1462, 0x1276, "MSI-GL73", ALC1220_FIXUP_CLEVO_P950),
From: Kai-Heng Feng kai.heng.feng@canonical.com
commit b2c22910fe5aae10b7e17b0721e63a3edf0c9553 upstream.
There are two more HP systems control mute LED from HDA codec and need to expose micmute led class so SoF can control micmute LED.
Add quirks to support them.
Signed-off-by: Kai-Heng Feng kai.heng.feng@canonical.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200617102906.16156-2-kai.heng.feng@canonical.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index f58102925c705..cb689878ba205 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -7436,6 +7436,8 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x103c, 0x83b9, "HP Spectre x360", ALC269_FIXUP_HP_MUTE_LED_MIC3), SND_PCI_QUIRK(0x103c, 0x8497, "HP Envy x360", ALC269_FIXUP_HP_MUTE_LED_MIC3), SND_PCI_QUIRK(0x103c, 0x84e7, "HP Pavilion 15", ALC269_FIXUP_HP_MUTE_LED_MIC3), + SND_PCI_QUIRK(0x103c, 0x869d, "HP", ALC236_FIXUP_HP_MUTE_LED), + SND_PCI_QUIRK(0x103c, 0x8729, "HP", ALC285_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8736, "HP", ALC285_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x877a, "HP", ALC285_FIXUP_HP_MUTE_LED), SND_PCI_QUIRK(0x103c, 0x877d, "HP", ALC236_FIXUP_HP_MUTE_LED),
From: Nathan Chancellor natechancellor@gmail.com
commit e6d701dca9893990d999fd145e3e07223c002b06 upstream.
When running a kernel with Clang's Control Flow Integrity implemented, there is a violation that happens when accessing /sys/firmware/acpi/pm_profile:
$ cat /sys/firmware/acpi/pm_profile 0
$ dmesg ... [ 17.352564] ------------[ cut here ]------------ [ 17.352568] CFI failure (target: acpi_show_profile+0x0/0x8): [ 17.352572] WARNING: CPU: 3 PID: 497 at kernel/cfi.c:29 __cfi_check_fail+0x33/0x40 [ 17.352573] Modules linked in: [ 17.352575] CPU: 3 PID: 497 Comm: cat Tainted: G W 5.7.0-microsoft-standard+ #1 [ 17.352576] RIP: 0010:__cfi_check_fail+0x33/0x40 [ 17.352577] Code: 48 c7 c7 50 b3 85 84 48 c7 c6 50 0a 4e 84 e8 a4 d8 60 00 85 c0 75 02 5b c3 48 c7 c7 dc 5e 49 84 48 89 de 31 c0 e8 7d 06 eb ff <0f> 0b 5b c3 00 00 cc cc 00 00 cc cc 00 85 f6 74 25 41 b9 ea ff ff [ 17.352577] RSP: 0018:ffffaa6dc3c53d30 EFLAGS: 00010246 [ 17.352578] RAX: 331267e0c06cee00 RBX: ffffffff83d85890 RCX: ffffffff8483a6f8 [ 17.352579] RDX: ffff9cceabbb37c0 RSI: 0000000000000082 RDI: ffffffff84bb9e1c [ 17.352579] RBP: ffffffff845b2bc8 R08: 0000000000000001 R09: ffff9cceabbba200 [ 17.352579] R10: 000000000000019d R11: 0000000000000000 R12: ffff9cc947766f00 [ 17.352580] R13: ffffffff83d6bd50 R14: ffff9ccc6fa80000 R15: ffffffff845bd328 [ 17.352582] FS: 00007fdbc8d13580(0000) GS:ffff9cce91ac0000(0000) knlGS:0000000000000000 [ 17.352582] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 17.352583] CR2: 00007fdbc858e000 CR3: 00000005174d0000 CR4: 0000000000340ea0 [ 17.352584] Call Trace: [ 17.352586] ? rev_id_show+0x8/0x8 [ 17.352587] ? __cfi_check+0x45bac/0x4b640 [ 17.352589] ? kobj_attr_show+0x73/0x80 [ 17.352590] ? sysfs_kf_seq_show+0xc1/0x140 [ 17.352592] ? ext4_seq_options_show.cfi_jt+0x8/0x8 [ 17.352593] ? seq_read+0x180/0x600 [ 17.352595] ? sysfs_create_file_ns.cfi_jt+0x10/0x10 [ 17.352596] ? tlbflush_read_file+0x8/0x8 [ 17.352597] ? __vfs_read+0x6b/0x220 [ 17.352598] ? handle_mm_fault+0xa23/0x11b0 [ 17.352599] ? vfs_read+0xa2/0x130 [ 17.352599] ? ksys_read+0x6a/0xd0 [ 17.352601] ? __do_sys_getpgrp+0x8/0x8 [ 17.352602] ? do_syscall_64+0x72/0x120 [ 17.352603] ? entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 17.352604] ---[ end trace 7b1fa81dc897e419 ]---
When /sys/firmware/acpi/pm_profile is read, sysfs_kf_seq_show is called, which in turn calls kobj_attr_show, which gets the ->show callback member by calling container_of on attr (casting it to struct kobj_attribute) then calls it.
There is a CFI violation because pm_profile_attr is of type struct device_attribute but kobj_attr_show calls ->show expecting it to be from struct kobj_attribute. CFI checking ensures that function pointer types match when doing indirect calls. Fix pm_profile_attr to be defined in terms of kobj_attribute so there is no violation or mismatch.
Fixes: 362b646062b2 ("ACPI: Export FADT pm_profile integer value to userspace") Link: https://github.com/ClangBuiltLinux/linux/issues/1051 Reported-by: yuu ichii byahu140@heisei.be Signed-off-by: Nathan Chancellor natechancellor@gmail.com Cc: 3.10+ stable@vger.kernel.org # 3.10+ Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/sysfs.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/acpi/sysfs.c b/drivers/acpi/sysfs.c index 3a89909b50a6c..76c668c05fa03 100644 --- a/drivers/acpi/sysfs.c +++ b/drivers/acpi/sysfs.c @@ -938,13 +938,13 @@ static void __exit interrupt_stats_exit(void) }
static ssize_t -acpi_show_profile(struct device *dev, struct device_attribute *attr, +acpi_show_profile(struct kobject *kobj, struct kobj_attribute *attr, char *buf) { return sprintf(buf, "%d\n", acpi_gbl_FADT.preferred_profile); }
-static const struct device_attribute pm_profile_attr = +static const struct kobj_attribute pm_profile_attr = __ATTR(pm_profile, S_IRUGO, acpi_show_profile, NULL);
static ssize_t hotplug_enabled_show(struct kobject *kobj,
From: "Jason A. Donenfeld" Jason@zx2c4.com
commit 75b0cea7bf307f362057cc778efe89af4c615354 upstream.
Like other vectors already patched, this one here allows the root user to load ACPI tables, which enables arbitrary physical address writes, which in turn makes it possible to disable lockdown.
Prevents this by checking the lockdown status before allowing a new ACPI table to be installed. The link in the trailer shows a PoC of how this might be used.
Link: https://git.zx2c4.com/american-unsigned-language/tree/american-unsigned-lang... Cc: 5.4+ stable@vger.kernel.org # 5.4+ Signed-off-by: Jason A. Donenfeld Jason@zx2c4.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/acpi_configfs.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/acpi/acpi_configfs.c b/drivers/acpi/acpi_configfs.c index ece8c1a921cc1..88c8af455ea3f 100644 --- a/drivers/acpi/acpi_configfs.c +++ b/drivers/acpi/acpi_configfs.c @@ -11,6 +11,7 @@ #include <linux/module.h> #include <linux/configfs.h> #include <linux/acpi.h> +#include <linux/security.h>
#include "acpica/accommon.h" #include "acpica/actables.h" @@ -28,7 +29,10 @@ static ssize_t acpi_table_aml_write(struct config_item *cfg, { const struct acpi_table_header *header = data; struct acpi_table *table; - int ret; + int ret = security_locked_down(LOCKDOWN_ACPI_TABLES); + + if (ret) + return ret;
table = container_of(cfg, struct acpi_table, cfg);
From: Gao Xiang hsiangkao@redhat.com
commit 3c597282887fd55181578996dca52ce697d985a5 upstream.
Hongyu reported "id != index" in z_erofs_onlinepage_fixup() with specific aarch64 environment easily, which wasn't shown before.
After digging into that, I found that high 32 bits of page->private was set to 0xaaaaaaaa rather than 0 (due to z_erofs_onlinepage_init behavior with specific compiler options). Actually we only use low 32 bits to keep the page information since page->private is only 4 bytes on most 32-bit platforms. However z_erofs_onlinepage_fixup() uses the upper 32 bits by mistake.
Let's fix it now.
Reported-and-tested-by: Hongyu Jin hongyu.jin@unisoc.com Fixes: 3883a79abd02 ("staging: erofs: introduce VLE decompression support") Cc: stable@vger.kernel.org # 4.19+ Reviewed-by: Chao Yu yuchao0@huawei.com Link: https://lore.kernel.org/r/20200618234349.22553-1-hsiangkao@aol.com Signed-off-by: Gao Xiang hsiangkao@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/erofs/zdata.h | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/fs/erofs/zdata.h b/fs/erofs/zdata.h index 7824f5563a552..9b66c28b3ae9d 100644 --- a/fs/erofs/zdata.h +++ b/fs/erofs/zdata.h @@ -144,22 +144,22 @@ static inline void z_erofs_onlinepage_init(struct page *page) static inline void z_erofs_onlinepage_fixup(struct page *page, uintptr_t index, bool down) { - unsigned long *p, o, v, id; -repeat: - p = &page_private(page); - o = READ_ONCE(*p); + union z_erofs_onlinepage_converter u = { .v = &page_private(page) }; + int orig, orig_index, val;
- id = o >> Z_EROFS_ONLINEPAGE_INDEX_SHIFT; - if (id) { +repeat: + orig = atomic_read(u.o); + orig_index = orig >> Z_EROFS_ONLINEPAGE_INDEX_SHIFT; + if (orig_index) { if (!index) return;
- DBG_BUGON(id != index); + DBG_BUGON(orig_index != index); }
- v = (index << Z_EROFS_ONLINEPAGE_INDEX_SHIFT) | - ((o & Z_EROFS_ONLINEPAGE_COUNT_MASK) + (unsigned int)down); - if (cmpxchg(p, o, v) != o) + val = (index << Z_EROFS_ONLINEPAGE_INDEX_SHIFT) | + ((orig & Z_EROFS_ONLINEPAGE_COUNT_MASK) + (unsigned int)down); + if (atomic_cmpxchg(u.o, orig, val) != orig) goto repeat; }
From: Xiaoyao Li xiaoyao.li@intel.com
commit bf10bd0be53282183f374af23577b18b5fbf7801 upstream.
Only MSR address range 0x800 through 0x8ff is architecturally reserved and dedicated for accessing APIC registers in x2APIC mode.
Fixes: 0105d1a52640 ("KVM: x2apic interface to lapic") Signed-off-by: Xiaoyao Li xiaoyao.li@intel.com Message-Id: 20200616073307.16440-1-xiaoyao.li@intel.com Cc: stable@vger.kernel.org Reviewed-by: Sean Christopherson sean.j.christopherson@intel.com Reviewed-by: Jim Mattson jmattson@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/x86.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 97c5a92146f9d..5f08eeac16c87 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2784,7 +2784,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) return kvm_mtrr_set_msr(vcpu, msr, data); case MSR_IA32_APICBASE: return kvm_set_apic_base(vcpu, msr_info); - case APIC_BASE_MSR ... APIC_BASE_MSR + 0x3ff: + case APIC_BASE_MSR ... APIC_BASE_MSR + 0xff: return kvm_x2apic_msr_write(vcpu, msr, data); case MSR_IA32_TSCDEADLINE: kvm_set_lapic_tscdeadline_msr(vcpu, data); @@ -3112,7 +3112,7 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) case MSR_IA32_APICBASE: msr_info->data = kvm_get_apic_base(vcpu); break; - case APIC_BASE_MSR ... APIC_BASE_MSR + 0x3ff: + case APIC_BASE_MSR ... APIC_BASE_MSR + 0xff: return kvm_x2apic_msr_read(vcpu, msr_info->index, &msr_info->data); case MSR_IA32_TSCDEADLINE: msr_info->data = kvm_get_lapic_tscdeadline_msr(vcpu);
From: Igor Mammedov imammedo@redhat.com
commit af28dfacbe00d53df5dec2bf50640df33138b1fe upstream.
Guest fails to online hotplugged CPU with error smpboot: do_boot_cpu failed(-1) to wakeup CPU#4
It's caused by the fact that kvm_apic_set_state(), which used to call recalculate_apic_map() unconditionally and pulled hotplugged CPU into apic map, is updating map conditionally on state changes. In this case the APIC map is not considered dirty and the is not updated.
Fix the issue by forcing unconditional update from kvm_apic_set_state(), like it used to be.
Fixes: 4abaffce4d25a ("KVM: LAPIC: Recalculate apic map in batch") Cc: stable@vger.kernel.org Signed-off-by: Igor Mammedov imammedo@redhat.com Message-Id: 20200622160830.426022-1-imammedo@redhat.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/lapic.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index 9af25c97612a7..8967e320a9784 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -2512,6 +2512,7 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s) } memcpy(vcpu->arch.apic->regs, s->regs, sizeof(*s));
+ apic->vcpu->kvm->arch.apic_map_dirty = true; kvm_recalculate_apic_map(vcpu->kvm); kvm_apic_set_version(vcpu);
From: Sean Christopherson sean.j.christopherson@intel.com
commit 2dbebf7ae1ed9a420d954305e2c9d5ed39ec57c3 upstream.
Explicitly pass the L2 GPA to kvm_arch_write_log_dirty(), which for all intents and purposes is vmx_write_pml_buffer(), instead of having the latter pull the GPA from vmcs.GUEST_PHYSICAL_ADDRESS. If the dirty bit update is the result of KVM emulation (rare for L2), then the GPA in the VMCS may be stale and/or hold a completely unrelated GPA.
Fixes: c5f983f6e8455 ("nVMX: Implement emulated Page Modification Logging") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Message-Id: 20200622215832.22090-2-sean.j.christopherson@intel.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/kvm/mmu.h | 2 +- arch/x86/kvm/mmu/mmu.c | 4 ++-- arch/x86/kvm/mmu/paging_tmpl.h | 7 ++++--- arch/x86/kvm/vmx/vmx.c | 6 +++--- 5 files changed, 11 insertions(+), 10 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 0a6b35353fc79..86e2e0272c576 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1195,7 +1195,7 @@ struct kvm_x86_ops { void (*enable_log_dirty_pt_masked)(struct kvm *kvm, struct kvm_memory_slot *slot, gfn_t offset, unsigned long mask); - int (*write_log_dirty)(struct kvm_vcpu *vcpu); + int (*write_log_dirty)(struct kvm_vcpu *vcpu, gpa_t l2_gpa);
/* pmu operations of sub-arch */ const struct kvm_pmu_ops *pmu_ops; diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h index 8a3b1bce722a4..d0e3b1b6845b5 100644 --- a/arch/x86/kvm/mmu.h +++ b/arch/x86/kvm/mmu.h @@ -222,7 +222,7 @@ void kvm_mmu_gfn_disallow_lpage(struct kvm_memory_slot *slot, gfn_t gfn); void kvm_mmu_gfn_allow_lpage(struct kvm_memory_slot *slot, gfn_t gfn); bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm, struct kvm_memory_slot *slot, u64 gfn); -int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu); +int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu, gpa_t l2_gpa);
int kvm_mmu_post_init_vm(struct kvm *kvm); void kvm_mmu_pre_destroy_vm(struct kvm *kvm); diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 92d0569541943..eb27ab47d6073 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1746,10 +1746,10 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm, * Emulate arch specific page modification logging for the * nested hypervisor */ -int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu) +int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu, gpa_t l2_gpa) { if (kvm_x86_ops.write_log_dirty) - return kvm_x86_ops.write_log_dirty(vcpu); + return kvm_x86_ops.write_log_dirty(vcpu, l2_gpa);
return 0; } diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index 9bdf9b7d9a962..7098f843eabdd 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -235,7 +235,7 @@ static inline unsigned FNAME(gpte_access)(u64 gpte) static int FNAME(update_accessed_dirty_bits)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, struct guest_walker *walker, - int write_fault) + gpa_t addr, int write_fault) { unsigned level, index; pt_element_t pte, orig_pte; @@ -260,7 +260,7 @@ static int FNAME(update_accessed_dirty_bits)(struct kvm_vcpu *vcpu, !(pte & PT_GUEST_DIRTY_MASK)) { trace_kvm_mmu_set_dirty_bit(table_gfn, index, sizeof(pte)); #if PTTYPE == PTTYPE_EPT - if (kvm_arch_write_log_dirty(vcpu)) + if (kvm_arch_write_log_dirty(vcpu, addr)) return -EINVAL; #endif pte |= PT_GUEST_DIRTY_MASK; @@ -457,7 +457,8 @@ retry_walk: (PT_GUEST_DIRTY_SHIFT - PT_GUEST_ACCESSED_SHIFT);
if (unlikely(!accessed_dirty)) { - ret = FNAME(update_accessed_dirty_bits)(vcpu, mmu, walker, write_fault); + ret = FNAME(update_accessed_dirty_bits)(vcpu, mmu, walker, + addr, write_fault); if (unlikely(ret < 0)) goto error; else if (ret) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 8f59f8c8fd05d..2d7ce012b1d5f 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7333,11 +7333,11 @@ static void vmx_flush_log_dirty(struct kvm *kvm) kvm_flush_pml_buffers(kvm); }
-static int vmx_write_pml_buffer(struct kvm_vcpu *vcpu) +static int vmx_write_pml_buffer(struct kvm_vcpu *vcpu, gpa_t gpa) { struct vmcs12 *vmcs12; struct vcpu_vmx *vmx = to_vmx(vcpu); - gpa_t gpa, dst; + gpa_t dst;
if (is_guest_mode(vcpu)) { WARN_ON_ONCE(vmx->nested.pml_full); @@ -7356,7 +7356,7 @@ static int vmx_write_pml_buffer(struct kvm_vcpu *vcpu) return 1; }
- gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS) & ~0xFFFull; + gpa &= ~0xFFFull; dst = vmcs12->pml_address + sizeof(u64) * vmcs12->guest_pml_index;
if (kvm_write_guest_page(vcpu->kvm, gpa_to_gfn(dst), &gpa,
From: Sean Christopherson sean.j.christopherson@intel.com
commit bf09fb6cba4f7099620cc9ed32d94c27c4af992e upstream.
Remove support for context switching between the guest's and host's desired UMWAIT_CONTROL. Propagating the guest's value to hardware isn't required for correct functionality, e.g. KVM intercepts reads and writes to the MSR, and the latency effects of the settings controlled by the MSR are not architecturally visible.
As a general rule, KVM should not allow the guest to control power management settings unless explicitly enabled by userspace, e.g. see KVM_CAP_X86_DISABLE_EXITS. E.g. Intel's SDM explicitly states that C0.2 can improve the performance of SMT siblings. A devious guest could disable C0.2 so as to improve the performance of their workloads at the detriment to workloads running in the host or on other VMs.
Wholesale removal of UMWAIT_CONTROL context switching also fixes a race condition where updates from the host may cause KVM to enter the guest with the incorrect value. Because updates are are propagated to all CPUs via IPI (SMP function callback), the value in hardware may be stale with respect to the cached value and KVM could enter the guest with the wrong value in hardware. As above, the guest can't observe the bad value, but it's a weird and confusing wart in the implementation.
Removal also fixes the unnecessary usage of VMX's atomic load/store MSR lists. Using the lists is only necessary for MSRs that are required for correct functionality immediately upon VM-Enter/VM-Exit, e.g. EFER on old hardware, or for MSRs that need to-the-uop precision, e.g. perf related MSRs. For UMWAIT_CONTROL, the effects are only visible in the kernel via TPAUSE/delay(), and KVM doesn't do any form of delay in vcpu_vmx_run(). Using the atomic lists is undesirable as they are more expensive than direct RDMSR/WRMSR.
Furthermore, even if giving the guest control of the MSR is legitimate, e.g. in pass-through scenarios, it's not clear that the benefits would outweigh the overhead. E.g. saving and restoring an MSR across a VMX roundtrip costs ~250 cycles, and if the guest diverged from the host that cost would be paid on every run of the guest. In other words, if there is a legitimate use case then it should be enabled by a new per-VM capability.
Note, KVM still needs to emulate MSR_IA32_UMWAIT_CONTROL so that it can correctly expose other WAITPKG features to the guest, e.g. TPAUSE, UMWAIT and UMONITOR.
Fixes: 6e3ba4abcea56 ("KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL") Cc: stable@vger.kernel.org Cc: Jingqi Liu jingqi.liu@intel.com Cc: Tao Xu tao3.xu@intel.com Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Message-Id: 20200623005135.10414-1-sean.j.christopherson@intel.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/mwait.h | 2 -- arch/x86/kernel/cpu/umwait.c | 6 ------ arch/x86/kvm/vmx/vmx.c | 18 ------------------ 3 files changed, 26 deletions(-)
diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h index b809f117f3f46..9d5252c9685c6 100644 --- a/arch/x86/include/asm/mwait.h +++ b/arch/x86/include/asm/mwait.h @@ -23,8 +23,6 @@ #define MWAITX_MAX_LOOPS ((u32)-1) #define MWAITX_DISABLE_CSTATES 0xf0
-u32 get_umwait_control_msr(void); - static inline void __monitor(const void *eax, unsigned long ecx, unsigned long edx) { diff --git a/arch/x86/kernel/cpu/umwait.c b/arch/x86/kernel/cpu/umwait.c index 300e3fd5ade3e..ec8064c0ae035 100644 --- a/arch/x86/kernel/cpu/umwait.c +++ b/arch/x86/kernel/cpu/umwait.c @@ -18,12 +18,6 @@ */ static u32 umwait_control_cached = UMWAIT_CTRL_VAL(100000, UMWAIT_C02_ENABLE);
-u32 get_umwait_control_msr(void) -{ - return umwait_control_cached; -} -EXPORT_SYMBOL_GPL(get_umwait_control_msr); - /* * Cache the original IA32_UMWAIT_CONTROL MSR value which is configured by * hardware or BIOS before kernel boot. diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 2d7ce012b1d5f..390ec34e4b4f1 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -6467,23 +6467,6 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx) msrs[i].host, false); }
-static void atomic_switch_umwait_control_msr(struct vcpu_vmx *vmx) -{ - u32 host_umwait_control; - - if (!vmx_has_waitpkg(vmx)) - return; - - host_umwait_control = get_umwait_control_msr(); - - if (vmx->msr_ia32_umwait_control != host_umwait_control) - add_atomic_switch_msr(vmx, MSR_IA32_UMWAIT_CONTROL, - vmx->msr_ia32_umwait_control, - host_umwait_control, false); - else - clear_atomic_switch_msr(vmx, MSR_IA32_UMWAIT_CONTROL); -} - static void vmx_update_hv_timer(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -6576,7 +6559,6 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu) pt_guest_enter(vmx);
atomic_switch_perf_msrs(vmx); - atomic_switch_umwait_control_msr(vmx);
if (enable_preemption_timer) vmx_update_hv_timer(vcpu);
From: Kees Cook keescook@chromium.org
commit a13b9d0b97211579ea63b96c606de79b963c0f47 upstream.
The X86_CR4_FSGSBASE bit of CR4 should not change after boot[1]. Older kernels should enforce this bit to zero, and newer kernels need to enforce it depending on boot-time configuration (e.g. "nofsgsbase"). To support a pinned bit being either 1 or 0, use an explicit mask in combination with the expected pinned bit values.
[1] https://lore.kernel.org/lkml/20200527103147.GI325280@hirez.programming.kicks...
Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/202006082013.71E29A42@keescook Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/cpu/common.c | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index e2e4af685f9b4..c669a5756bdf6 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -348,6 +348,9 @@ out: cr4_clear_bits(X86_CR4_UMIP); }
+/* These bits should not change their value after CPU init is finished. */ +static const unsigned long cr4_pinned_mask = + X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP | X86_CR4_FSGSBASE; static DEFINE_STATIC_KEY_FALSE_RO(cr_pinning); static unsigned long cr4_pinned_bits __ro_after_init;
@@ -372,20 +375,20 @@ EXPORT_SYMBOL(native_write_cr0);
void native_write_cr4(unsigned long val) { - unsigned long bits_missing = 0; + unsigned long bits_changed = 0;
set_register: asm volatile("mov %0,%%cr4": "+r" (val), "+m" (cr4_pinned_bits));
if (static_branch_likely(&cr_pinning)) { - if (unlikely((val & cr4_pinned_bits) != cr4_pinned_bits)) { - bits_missing = ~val & cr4_pinned_bits; - val |= bits_missing; + if (unlikely((val & cr4_pinned_mask) != cr4_pinned_bits)) { + bits_changed = (val & cr4_pinned_mask) ^ cr4_pinned_bits; + val = (val & ~cr4_pinned_mask) | cr4_pinned_bits; goto set_register; } - /* Warn after we've set the missing bits. */ - WARN_ONCE(bits_missing, "CR4 bits went missing: %lx!?\n", - bits_missing); + /* Warn after we've corrected the changed bits. */ + WARN_ONCE(bits_changed, "pinned CR4 bits changed: 0x%lx!?\n", + bits_changed); } } EXPORT_SYMBOL(native_write_cr4); @@ -397,7 +400,7 @@ void cr4_init(void) if (boot_cpu_has(X86_FEATURE_PCID)) cr4 |= X86_CR4_PCIDE; if (static_branch_likely(&cr_pinning)) - cr4 |= cr4_pinned_bits; + cr4 = (cr4 & ~cr4_pinned_mask) | cr4_pinned_bits;
__write_cr4(cr4);
@@ -412,10 +415,7 @@ void cr4_init(void) */ static void __init setup_cr_pinning(void) { - unsigned long mask; - - mask = (X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP); - cr4_pinned_bits = this_cpu_read(cpu_tlbstate.cr4) & mask; + cr4_pinned_bits = this_cpu_read(cpu_tlbstate.cr4) & cr4_pinned_mask; static_key_enable(&cr_pinning.key); }
From: Sean Christopherson sean.j.christopherson@intel.com
commit 5d5103595e9e53048bb7e70ee2673c897ab38300 upstream.
Reinitialize IA32_FEAT_CTL on the BSP during wakeup to handle the case where firmware doesn't initialize or save/restore across S3. This fixes a bug where IA32_FEAT_CTL is left uninitialized and results in VMXON taking a #GP due to VMX not being fully enabled, i.e. breaks KVM.
Use init_ia32_feat_ctl() to "restore" IA32_FEAT_CTL as it already deals with the case where the MSR is locked, and because APs already redo init_ia32_feat_ctl() during suspend by virtue of the SMP boot flow being used to reinitialize APs upon wakeup. Do the call in the early wakeup flow to avoid dependencies in the syscore_ops chain, e.g. simply adding a resume hook is not guaranteed to work, as KVM does VMXON in its own resume hook, kvm_resume(), when KVM has active guests.
Fixes: 21bd3467a58e ("KVM: VMX: Drop initialization of IA32_FEAT_CTL MSR") Reported-by: Brad Campbell lists2009@fnarfbargle.com Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Borislav Petkov bp@suse.de Reviewed-by: Liam Merwick liam.merwick@oracle.com Reviewed-by: Maxim Levitsky mlevitsk@redhat.com Tested-by: Brad Campbell lists2009@fnarfbargle.com Cc: stable@vger.kernel.org # v5.6 Link: https://lkml.kernel.org/r/20200608174134.11157-1-sean.j.christopherson@intel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/cpu.h | 5 +++++ arch/x86/kernel/cpu/centaur.c | 1 + arch/x86/kernel/cpu/cpu.h | 4 ---- arch/x86/kernel/cpu/zhaoxin.c | 1 + arch/x86/power/cpu.c | 6 ++++++ 5 files changed, 13 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h index dd17c2da1af5f..da78ccbd493b7 100644 --- a/arch/x86/include/asm/cpu.h +++ b/arch/x86/include/asm/cpu.h @@ -58,4 +58,9 @@ static inline bool handle_guest_split_lock(unsigned long ip) return false; } #endif +#ifdef CONFIG_IA32_FEAT_CTL +void init_ia32_feat_ctl(struct cpuinfo_x86 *c); +#else +static inline void init_ia32_feat_ctl(struct cpuinfo_x86 *c) {} +#endif #endif /* _ASM_X86_CPU_H */ diff --git a/arch/x86/kernel/cpu/centaur.c b/arch/x86/kernel/cpu/centaur.c index 426792565d864..c5cf336e50776 100644 --- a/arch/x86/kernel/cpu/centaur.c +++ b/arch/x86/kernel/cpu/centaur.c @@ -3,6 +3,7 @@ #include <linux/sched.h> #include <linux/sched/clock.h>
+#include <asm/cpu.h> #include <asm/cpufeature.h> #include <asm/e820/api.h> #include <asm/mtrr.h> diff --git a/arch/x86/kernel/cpu/cpu.h b/arch/x86/kernel/cpu/cpu.h index fb538fccd24c0..9d033693519aa 100644 --- a/arch/x86/kernel/cpu/cpu.h +++ b/arch/x86/kernel/cpu/cpu.h @@ -81,8 +81,4 @@ extern void update_srbds_msr(void);
extern u64 x86_read_arch_cap_msr(void);
-#ifdef CONFIG_IA32_FEAT_CTL -void init_ia32_feat_ctl(struct cpuinfo_x86 *c); -#endif - #endif /* ARCH_X86_CPU_H */ diff --git a/arch/x86/kernel/cpu/zhaoxin.c b/arch/x86/kernel/cpu/zhaoxin.c index df1358ba622bd..05fa4ef634902 100644 --- a/arch/x86/kernel/cpu/zhaoxin.c +++ b/arch/x86/kernel/cpu/zhaoxin.c @@ -2,6 +2,7 @@ #include <linux/sched.h> #include <linux/sched/clock.h>
+#include <asm/cpu.h> #include <asm/cpufeature.h>
#include "cpu.h" diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c index aaff9ed7ff45c..b0d3c5ca6d807 100644 --- a/arch/x86/power/cpu.c +++ b/arch/x86/power/cpu.c @@ -193,6 +193,8 @@ static void fix_processor_context(void) */ static void notrace __restore_processor_state(struct saved_context *ctxt) { + struct cpuinfo_x86 *c; + if (ctxt->misc_enable_saved) wrmsrl(MSR_IA32_MISC_ENABLE, ctxt->misc_enable); /* @@ -263,6 +265,10 @@ static void notrace __restore_processor_state(struct saved_context *ctxt) mtrr_bp_restore(); perf_restore_debug_store(); msr_restore_context(ctxt); + + c = &cpu_data(smp_processor_id()); + if (cpu_has(c, X86_FEATURE_MSR_IA32_FEAT_CTL)) + init_ia32_feat_ctl(c); }
/* Needed by apm.c */
From: Matt Fleming matt@codeblueprint.co.uk
commit bb5570ad3b54e7930997aec76ab68256d5236d94 upstream.
x86 CPUs can suffer severe performance drops if a tight loop, such as the ones in __clear_user(), straddles a 16-byte instruction fetch window, or worse, a 64-byte cacheline. This issues was discovered in the SUSE kernel with the following commit,
1153933703d9 ("x86/asm/64: Micro-optimize __clear_user() - Use immediate constants")
which increased the code object size from 10 bytes to 15 bytes and caused the 8-byte copy loop in __clear_user() to be split across a 64-byte cacheline.
Aligning the start of the loop to 16-bytes makes this fit neatly inside a single instruction fetch window again and restores the performance of __clear_user() which is used heavily when reading from /dev/zero.
Here are some numbers from running libmicro's read_z* and pread_z* microbenchmarks which read from /dev/zero:
Zen 1 (Naples)
libmicro-file 5.7.0-rc6 5.7.0-rc6 5.7.0-rc6 revert-1153933703d9+ align16+ Time mean95-pread_z100k 9.9195 ( 0.00%) 5.9856 ( 39.66%) 5.9938 ( 39.58%) Time mean95-pread_z10k 1.1378 ( 0.00%) 0.7450 ( 34.52%) 0.7467 ( 34.38%) Time mean95-pread_z1k 0.2623 ( 0.00%) 0.2251 ( 14.18%) 0.2252 ( 14.15%) Time mean95-pread_zw100k 9.9974 ( 0.00%) 6.0648 ( 39.34%) 6.0756 ( 39.23%) Time mean95-read_z100k 9.8940 ( 0.00%) 5.9885 ( 39.47%) 5.9994 ( 39.36%) Time mean95-read_z10k 1.1394 ( 0.00%) 0.7483 ( 34.33%) 0.7482 ( 34.33%)
Note that this doesn't affect Haswell or Broadwell microarchitectures which seem to avoid the alignment issue by executing the loop straight out of the Loop Stream Detector (verified using perf events).
Fixes: 1153933703d9 ("x86/asm/64: Micro-optimize __clear_user() - Use immediate constants") Signed-off-by: Matt Fleming matt@codeblueprint.co.uk Signed-off-by: Borislav Petkov bp@suse.de Cc: stable@vger.kernel.org # v4.19+ Link: https://lkml.kernel.org/r/20200618102002.30034-1-matt@codeblueprint.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/lib/usercopy_64.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/x86/lib/usercopy_64.c b/arch/x86/lib/usercopy_64.c index fff28c6f73a21..b0dfac3d3df71 100644 --- a/arch/x86/lib/usercopy_64.c +++ b/arch/x86/lib/usercopy_64.c @@ -24,6 +24,7 @@ unsigned long __clear_user(void __user *addr, unsigned long size) asm volatile( " testq %[size8],%[size8]\n" " jz 4f\n" + " .align 16\n" "0: movq $0,(%[dst])\n" " addq $8,%[dst]\n" " decl %%ecx ; jnz 0b\n"
From: Filipe Manana fdmanana@suse.com
commit 6bd335b469f945f75474c11e3f577f85409f39c3 upstream.
When balance and scrub are running in parallel it is possible to end up with an underflow of the bytes_may_use counter of the data space_info object, which triggers a warning like the following:
[134243.793196] BTRFS info (device sdc): relocating block group 1104150528 flags data [134243.806891] ------------[ cut here ]------------ [134243.807561] WARNING: CPU: 1 PID: 26884 at fs/btrfs/space-info.h:125 btrfs_add_reserved_bytes+0x1da/0x280 [btrfs] [134243.808819] Modules linked in: btrfs blake2b_generic xor (...) [134243.815779] CPU: 1 PID: 26884 Comm: kworker/u8:8 Tainted: G W 5.6.0-rc7-btrfs-next-58 #5 [134243.816944] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [134243.818389] Workqueue: writeback wb_workfn (flush-btrfs-108483) [134243.819186] RIP: 0010:btrfs_add_reserved_bytes+0x1da/0x280 [btrfs] [134243.819963] Code: 0b f2 85 (...) [134243.822271] RSP: 0018:ffffa4160aae7510 EFLAGS: 00010287 [134243.822929] RAX: 000000000000c000 RBX: ffff96159a8c1000 RCX: 0000000000000000 [134243.823816] RDX: 0000000000008000 RSI: 0000000000000000 RDI: ffff96158067a810 [134243.824742] RBP: ffff96158067a800 R08: 0000000000000001 R09: 0000000000000000 [134243.825636] R10: ffff961501432a40 R11: 0000000000000000 R12: 000000000000c000 [134243.826532] R13: 0000000000000001 R14: ffffffffffff4000 R15: ffff96158067a810 [134243.827432] FS: 0000000000000000(0000) GS:ffff9615baa00000(0000) knlGS:0000000000000000 [134243.828451] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [134243.829184] CR2: 000055bd7e414000 CR3: 00000001077be004 CR4: 00000000003606e0 [134243.830083] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [134243.830975] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [134243.831867] Call Trace: [134243.832211] find_free_extent+0x4a0/0x16c0 [btrfs] [134243.832846] btrfs_reserve_extent+0x91/0x180 [btrfs] [134243.833487] cow_file_range+0x12d/0x490 [btrfs] [134243.834080] fallback_to_cow+0x82/0x1b0 [btrfs] [134243.834689] ? release_extent_buffer+0x121/0x170 [btrfs] [134243.835370] run_delalloc_nocow+0x33f/0xa30 [btrfs] [134243.836032] btrfs_run_delalloc_range+0x1ea/0x6d0 [btrfs] [134243.836725] ? find_lock_delalloc_range+0x221/0x250 [btrfs] [134243.837450] writepage_delalloc+0xe8/0x150 [btrfs] [134243.838059] __extent_writepage+0xe8/0x4c0 [btrfs] [134243.838674] extent_write_cache_pages+0x237/0x530 [btrfs] [134243.839364] extent_writepages+0x44/0xa0 [btrfs] [134243.839946] do_writepages+0x23/0x80 [134243.840401] __writeback_single_inode+0x59/0x700 [134243.841006] writeback_sb_inodes+0x267/0x5f0 [134243.841548] __writeback_inodes_wb+0x87/0xe0 [134243.842091] wb_writeback+0x382/0x590 [134243.842574] ? wb_workfn+0x4a2/0x6c0 [134243.843030] wb_workfn+0x4a2/0x6c0 [134243.843468] process_one_work+0x26d/0x6a0 [134243.843978] worker_thread+0x4f/0x3e0 [134243.844452] ? process_one_work+0x6a0/0x6a0 [134243.844981] kthread+0x103/0x140 [134243.845400] ? kthread_create_worker_on_cpu+0x70/0x70 [134243.846030] ret_from_fork+0x3a/0x50 [134243.846494] irq event stamp: 0 [134243.846892] hardirqs last enabled at (0): [<0000000000000000>] 0x0 [134243.847682] hardirqs last disabled at (0): [<ffffffffb2abdedf>] copy_process+0x74f/0x2020 [134243.848687] softirqs last enabled at (0): [<ffffffffb2abdedf>] copy_process+0x74f/0x2020 [134243.849913] softirqs last disabled at (0): [<0000000000000000>] 0x0 [134243.850698] ---[ end trace bd7c03622e0b0a96 ]--- [134243.851335] ------------[ cut here ]------------
When relocating a data block group, for each extent allocated in the block group we preallocate another extent with the same size for the data relocation inode (we do it at prealloc_file_extent_cluster()). We reserve space by calling btrfs_check_data_free_space(), which ends up incrementing the data space_info's bytes_may_use counter, and then call btrfs_prealloc_file_range() to allocate the extent, which always decrements the bytes_may_use counter by the same amount.
The expectation is that writeback of the data relocation inode always follows a NOCOW path, by writing into the preallocated extents. However, when starting writeback we might end up falling back into the COW path, because the block group that contains the preallocated extent was turned into RO mode by a scrub running in parallel. The COW path then calls the extent allocator which ends up calling btrfs_add_reserved_bytes(), and this function decrements the bytes_may_use counter of the data space_info object by an amount corresponding to the size of the allocated extent, despite we haven't previously incremented it. When the counter currently has a value smaller then the allocated extent we reset the counter to 0 and emit a warning, otherwise we just decrement it and slowly mess up with this counter which is crucial for space reservation, the end result can be granting reserved space to tasks when there isn't really enough free space, and having the tasks fail later in critical places where error handling consists of a transaction abort or hitting a BUG_ON().
Fix this by making sure that if we fallback to the COW path for a data relocation inode, we increment the bytes_may_use counter of the data space_info object. The COW path will then decrement it at btrfs_add_reserved_bytes() on success or through its error handling part by a call to extent_clear_unlock_delalloc() (which ends up calling btrfs_clear_delalloc_extent() that does the decrement operation) in case of an error.
Test case btrfs/061 from fstests could sporadically trigger this.
CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Josef Bacik josef@toxicpanda.com Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/inode.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 66dd919fc7235..e141dabed0ffb 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -1361,6 +1361,8 @@ static int fallback_to_cow(struct inode *inode, struct page *locked_page, int *page_started, unsigned long *nr_written) { const bool is_space_ino = btrfs_is_free_space_inode(BTRFS_I(inode)); + const bool is_reloc_ino = (BTRFS_I(inode)->root->root_key.objectid == + BTRFS_DATA_RELOC_TREE_OBJECTID); const u64 range_bytes = end + 1 - start; struct extent_io_tree *io_tree = &BTRFS_I(inode)->io_tree; u64 range_start = start; @@ -1391,18 +1393,23 @@ static int fallback_to_cow(struct inode *inode, struct page *locked_page, * data space info, which we incremented in the step above. * * If we need to fallback to cow and the inode corresponds to a free - * space cache inode, we must also increment bytes_may_use of the data - * space_info for the same reason. Space caches always get a prealloc + * space cache inode or an inode of the data relocation tree, we must + * also increment bytes_may_use of the data space_info for the same + * reason. Space caches and relocated data extents always get a prealloc * extent for them, however scrub or balance may have set the block - * group that contains that extent to RO mode. + * group that contains that extent to RO mode and therefore force COW + * when starting writeback. */ count = count_range_bits(io_tree, &range_start, end, range_bytes, EXTENT_NORESERVE, 0); - if (count > 0 || is_space_ino) { - const u64 bytes = is_space_ino ? range_bytes : count; + if (count > 0 || is_space_ino || is_reloc_ino) { + u64 bytes = count; struct btrfs_fs_info *fs_info = BTRFS_I(inode)->root->fs_info; struct btrfs_space_info *sinfo = fs_info->data_sinfo;
+ if (is_space_ino || is_reloc_ino) + bytes = range_bytes; + spin_lock(&sinfo->lock); btrfs_space_info_update_bytes_may_use(fs_info, sinfo, bytes); spin_unlock(&sinfo->lock);
From: Filipe Manana fdmanana@suse.com
commit 432cd2a10f1c10cead91fe706ff5dc52f06d642a upstream.
When running relocation of a data block group while scrub is running in parallel, it is possible that the relocation will fail and abort the current transaction with an -EINVAL error:
[134243.988595] BTRFS info (device sdc): found 14 extents, stage: move data extents [134243.999871] ------------[ cut here ]------------ [134244.000741] BTRFS: Transaction aborted (error -22) [134244.001692] WARNING: CPU: 0 PID: 26954 at fs/btrfs/ctree.c:1071 __btrfs_cow_block+0x6a7/0x790 [btrfs] [134244.003380] Modules linked in: btrfs blake2b_generic xor raid6_pq (...) [134244.012577] CPU: 0 PID: 26954 Comm: btrfs Tainted: G W 5.6.0-rc7-btrfs-next-58 #5 [134244.014162] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [134244.016184] RIP: 0010:__btrfs_cow_block+0x6a7/0x790 [btrfs] [134244.017151] Code: 48 c7 c7 (...) [134244.020549] RSP: 0018:ffffa41607863888 EFLAGS: 00010286 [134244.021515] RAX: 0000000000000000 RBX: ffff9614bdfe09c8 RCX: 0000000000000000 [134244.022822] RDX: 0000000000000001 RSI: ffffffffb3d63980 RDI: 0000000000000001 [134244.024124] RBP: ffff961589e8c000 R08: 0000000000000000 R09: 0000000000000001 [134244.025424] R10: ffffffffc0ae5955 R11: 0000000000000000 R12: ffff9614bd530d08 [134244.026725] R13: ffff9614ced41b88 R14: ffff9614bdfe2a48 R15: 0000000000000000 [134244.028024] FS: 00007f29b63c08c0(0000) GS:ffff9615ba600000(0000) knlGS:0000000000000000 [134244.029491] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [134244.030560] CR2: 00007f4eb339b000 CR3: 0000000130d6e006 CR4: 00000000003606f0 [134244.031997] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [134244.033153] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [134244.034484] Call Trace: [134244.034984] btrfs_cow_block+0x12b/0x2b0 [btrfs] [134244.035859] do_relocation+0x30b/0x790 [btrfs] [134244.036681] ? do_raw_spin_unlock+0x49/0xc0 [134244.037460] ? _raw_spin_unlock+0x29/0x40 [134244.038235] relocate_tree_blocks+0x37b/0x730 [btrfs] [134244.039245] relocate_block_group+0x388/0x770 [btrfs] [134244.040228] btrfs_relocate_block_group+0x161/0x2e0 [btrfs] [134244.041323] btrfs_relocate_chunk+0x36/0x110 [btrfs] [134244.041345] btrfs_balance+0xc06/0x1860 [btrfs] [134244.043382] ? btrfs_ioctl_balance+0x27c/0x310 [btrfs] [134244.045586] btrfs_ioctl_balance+0x1ed/0x310 [btrfs] [134244.045611] btrfs_ioctl+0x1880/0x3760 [btrfs] [134244.049043] ? do_raw_spin_unlock+0x49/0xc0 [134244.049838] ? _raw_spin_unlock+0x29/0x40 [134244.050587] ? __handle_mm_fault+0x11b3/0x14b0 [134244.051417] ? ksys_ioctl+0x92/0xb0 [134244.052070] ksys_ioctl+0x92/0xb0 [134244.052701] ? trace_hardirqs_off_thunk+0x1a/0x1c [134244.053511] __x64_sys_ioctl+0x16/0x20 [134244.054206] do_syscall_64+0x5c/0x280 [134244.054891] entry_SYSCALL_64_after_hwframe+0x49/0xbe [134244.055819] RIP: 0033:0x7f29b51c9dd7 [134244.056491] Code: 00 00 00 (...) [134244.059767] RSP: 002b:00007ffcccc1dd08 EFLAGS: 00000202 ORIG_RAX: 0000000000000010 [134244.061168] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f29b51c9dd7 [134244.062474] RDX: 00007ffcccc1dda0 RSI: 00000000c4009420 RDI: 0000000000000003 [134244.063771] RBP: 0000000000000003 R08: 00005565cea4b000 R09: 0000000000000000 [134244.065032] R10: 0000000000000541 R11: 0000000000000202 R12: 00007ffcccc2060a [134244.066327] R13: 00007ffcccc1dda0 R14: 0000000000000002 R15: 00007ffcccc1dec0 [134244.067626] irq event stamp: 0 [134244.068202] hardirqs last enabled at (0): [<0000000000000000>] 0x0 [134244.069351] hardirqs last disabled at (0): [<ffffffffb2abdedf>] copy_process+0x74f/0x2020 [134244.070909] softirqs last enabled at (0): [<ffffffffb2abdedf>] copy_process+0x74f/0x2020 [134244.072392] softirqs last disabled at (0): [<0000000000000000>] 0x0 [134244.073432] ---[ end trace bd7c03622e0b0a99 ]---
The -EINVAL error comes from the following chain of function calls:
__btrfs_cow_block() <-- aborts the transaction btrfs_reloc_cow_block() replace_file_extents() get_new_location() <-- returns -EINVAL
When relocating a data block group, for each allocated extent of the block group, we preallocate another extent (at prealloc_file_extent_cluster()), associated with the data relocation inode, and then dirty all its pages. These preallocated extents have, and must have, the same size that extents from the data block group being relocated have.
Later before we start the relocation stage that updates pointers (bytenr field of file extent items) to point to the the new extents, we trigger writeback for the data relocation inode. The expectation is that writeback will write the pages to the previously preallocated extents, that it follows the NOCOW path. That is generally the case, however, if a scrub is running it may have turned the block group that contains those extents into RO mode, in which case writeback falls back to the COW path.
However in the COW path instead of allocating exactly one extent with the expected size, the allocator may end up allocating several smaller extents due to free space fragmentation - because we tell it at cow_file_range() that the minimum allocation size can match the filesystem's sector size. This later breaks the relocation's expectation that an extent associated to a file extent item in the data relocation inode has the same size as the respective extent pointed by a file extent item in another tree - in this case the extent to which the relocation inode poins to is smaller, causing relocation.c:get_new_location() to return -EINVAL.
For example, if we are relocating a data block group X that has a logical address of X and the block group has an extent allocated at the logical address X + 128KiB with a size of 64KiB:
1) At prealloc_file_extent_cluster() we allocate an extent for the data relocation inode with a size of 64KiB and associate it to the file offset 128KiB (X + 128KiB - X) of the data relocation inode. This preallocated extent was allocated at block group Z;
2) A scrub running in parallel turns block group Z into RO mode and starts scrubing its extents;
3) Relocation triggers writeback for the data relocation inode;
4) When running delalloc (btrfs_run_delalloc_range()), we try first the NOCOW path because the data relocation inode has BTRFS_INODE_PREALLOC set in its flags. However, because block group Z is in RO mode, the NOCOW path (run_delalloc_nocow()) falls back into the COW path, by calling cow_file_range();
5) At cow_file_range(), in the first iteration of the while loop we call btrfs_reserve_extent() to allocate a 64KiB extent and pass it a minimum allocation size of 4KiB (fs_info->sectorsize). Due to free space fragmentation, btrfs_reserve_extent() ends up allocating two extents of 32KiB each, each one on a different iteration of that while loop;
6) Writeback of the data relocation inode completes;
7) Relocation proceeds and ends up at relocation.c:replace_file_extents(), with a leaf which has a file extent item that points to the data extent from block group X, that has a logical address (bytenr) of X + 128KiB and a size of 64KiB. Then it calls get_new_location(), which does a lookup in the data relocation tree for a file extent item starting at offset 128KiB (X + 128KiB - X) and belonging to the data relocation inode. It finds a corresponding file extent item, however that item points to an extent that has a size of 32KiB, which doesn't match the expected size of 64KiB, resuling in -EINVAL being returned from this function and propagated up to __btrfs_cow_block(), which aborts the current transaction.
To fix this make sure that at cow_file_range() when we call the allocator we pass it a minimum allocation size corresponding the desired extent size if the inode belongs to the data relocation tree, otherwise pass it the filesystem's sector size as the minimum allocation size.
CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Josef Bacik josef@toxicpanda.com Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/inode.c | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index e141dabed0ffb..18111b0263d71 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -985,6 +985,7 @@ static noinline int cow_file_range(struct inode *inode, u64 num_bytes; unsigned long ram_size; u64 cur_alloc_size = 0; + u64 min_alloc_size; u64 blocksize = fs_info->sectorsize; struct btrfs_key ins; struct extent_map *em; @@ -1035,10 +1036,26 @@ static noinline int cow_file_range(struct inode *inode, btrfs_drop_extent_cache(BTRFS_I(inode), start, start + num_bytes - 1, 0);
+ /* + * Relocation relies on the relocated extents to have exactly the same + * size as the original extents. Normally writeback for relocation data + * extents follows a NOCOW path because relocation preallocates the + * extents. However, due to an operation such as scrub turning a block + * group to RO mode, it may fallback to COW mode, so we must make sure + * an extent allocated during COW has exactly the requested size and can + * not be split into smaller extents, otherwise relocation breaks and + * fails during the stage where it updates the bytenr of file extent + * items. + */ + if (root->root_key.objectid == BTRFS_DATA_RELOC_TREE_OBJECTID) + min_alloc_size = num_bytes; + else + min_alloc_size = fs_info->sectorsize; + while (num_bytes > 0) { cur_alloc_size = num_bytes; ret = btrfs_reserve_extent(root, cur_alloc_size, cur_alloc_size, - fs_info->sectorsize, 0, alloc_hint, + min_alloc_size, 0, alloc_hint, &ins, 1, 1); if (ret < 0) goto out_unlock;
From: Filipe Manana fdmanana@suse.com
commit e7a79811d0db136dc2d336b56d54cf1b774ce972 upstream.
This brings back an optimization that commit e678934cbe5f02 ("btrfs: Remove unnecessary check from join_running_log_trans") removed, but in a different form. So it's almost equivalent to a revert.
That commit removed an optimization where we avoid locking a root's log_mutex when there is no log tree created in the current transaction. The affected code path is triggered through unlink operations.
That commit was based on the assumption that the optimization was not necessary because we used to have the following checks when the patch was authored:
int btrfs_del_dir_entries_in_log(...) { (...) if (dir->logged_trans < trans->transid) return 0;
ret = join_running_log_trans(root); (...) }
int btrfs_del_inode_ref_in_log(...) { (...) if (inode->logged_trans < trans->transid) return 0;
ret = join_running_log_trans(root); (...) }
However before that patch was merged, another patch was merged first which replaced those checks because they were buggy.
That other patch corresponds to commit 803f0f64d17769 ("Btrfs: fix fsync not persisting dentry deletions due to inode evictions"). The assumption that if the logged_trans field of an inode had a smaller value then the current transaction's generation (transid) meant that the inode was not logged in the current transaction was only correct if the inode was not evicted and reloaded in the current transaction. So the corresponding bug fix changed those checks and replaced them with the following helper function:
static bool inode_logged(struct btrfs_trans_handle *trans, struct btrfs_inode *inode) { if (inode->logged_trans == trans->transid) return true;
if (inode->last_trans == trans->transid && test_bit(BTRFS_INODE_NEEDS_FULL_SYNC, &inode->runtime_flags) && !test_bit(BTRFS_FS_LOG_RECOVERING, &trans->fs_info->flags)) return true;
return false; }
So if we have a subvolume without a log tree in the current transaction (because we had no fsyncs), every time we unlink an inode we can end up trying to lock the log_mutex of the root through join_running_log_trans() twice, once for the inode being unlinked (by btrfs_del_inode_ref_in_log()) and once for the parent directory (with btrfs_del_dir_entries_in_log()).
This means if we have several unlink operations happening in parallel for inodes in the same subvolume, and the those inodes and/or their parent inode were changed in the current transaction, we end up having a lot of contention on the log_mutex.
The test robots from intel reported a -30.7% performance regression for a REAIM test after commit e678934cbe5f02 ("btrfs: Remove unnecessary check from join_running_log_trans").
So just bring back the optimization to join_running_log_trans() where we check first if a log root exists before trying to lock the log_mutex. This is done by checking for a bit that is set on the root when a log tree is created and removed when a log tree is freed (at transaction commit time).
Commit e678934cbe5f02 ("btrfs: Remove unnecessary check from join_running_log_trans") was merged in the 5.4 merge window while commit 803f0f64d17769 ("Btrfs: fix fsync not persisting dentry deletions due to inode evictions") was merged in the 5.3 merge window. But the first commit was actually authored before the second commit (May 23 2019 vs June 19 2019).
Reported-by: kernel test robot rong.a.chen@intel.com Link: https://lore.kernel.org/lkml/20200611090233.GL12456@shao2-debian/ Fixes: e678934cbe5f02 ("btrfs: Remove unnecessary check from join_running_log_trans") CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Josef Bacik josef@toxicpanda.com Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/ctree.h | 2 ++ fs/btrfs/tree-log.c | 5 +++++ 2 files changed, 7 insertions(+)
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 196d4511f8126..09e6dff8a8f85 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -988,6 +988,8 @@ enum { BTRFS_ROOT_DEAD_RELOC_TREE, /* Mark dead root stored on device whose cleanup needs to be resumed */ BTRFS_ROOT_DEAD_TREE, + /* The root has a log tree. Used only for subvolume roots. */ + BTRFS_ROOT_HAS_LOG_TREE, };
/* diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c index ea72b9d54ec86..bdfc421494481 100644 --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -169,6 +169,7 @@ static int start_log_trans(struct btrfs_trans_handle *trans, if (ret) goto out;
+ set_bit(BTRFS_ROOT_HAS_LOG_TREE, &root->state); clear_bit(BTRFS_ROOT_MULTI_LOG_TASKS, &root->state); root->log_start_pid = current->pid; } @@ -195,6 +196,9 @@ static int join_running_log_trans(struct btrfs_root *root) { int ret = -ENOENT;
+ if (!test_bit(BTRFS_ROOT_HAS_LOG_TREE, &root->state)) + return ret; + mutex_lock(&root->log_mutex); if (root->log_root) { ret = 0; @@ -3312,6 +3316,7 @@ int btrfs_free_log(struct btrfs_trans_handle *trans, struct btrfs_root *root) if (root->log_root) { free_log_tree(trans, root->log_root); root->log_root = NULL; + clear_bit(BTRFS_ROOT_HAS_LOG_TREE, &root->state); } return 0; }
From: Filipe Manana fdmanana@suse.com
commit f2cb2f39ccc30fa13d3ac078d461031a63960e5b upstream.
If we do a successful RWF_NOWAIT write we end up locking the snapshot lock of the inode, through a call to check_can_nocow(), but we never unlock it.
This means the next attempt to create a snapshot on the subvolume will hang forever.
Trivial reproducer:
$ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt
$ touch /mnt/foobar $ chattr +C /mnt/foobar $ xfs_io -d -c "pwrite -S 0xab 0 64K" /mnt/foobar $ xfs_io -d -c "pwrite -N -V 1 -S 0xfe 0 64K" /mnt/foobar
$ btrfs subvolume snapshot -r /mnt /mnt/snap --> hangs
Fix this by unlocking the snapshot lock if check_can_nocow() returned success.
Fixes: edf064e7c6fec3 ("btrfs: nowait aio support") CC: stable@vger.kernel.org # 4.14+ Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/file.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 719e68ab552c5..484803f8b2290 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1922,6 +1922,8 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, inode_unlock(inode); return -EAGAIN; } + /* check_can_nocow() locks the snapshot lock on success */ + btrfs_drew_write_unlock(&root->snapshot_lock); }
current->backing_dev_info = inode_to_bdi(inode);
From: Filipe Manana fdmanana@suse.com
commit 4b1946284dd6641afdb9457101056d9e6ee6204c upstream.
If we attempt to write to prealloc extent located after eof using a RWF_NOWAIT write, we always fail with -EAGAIN.
We do actually check if we have an allocated extent for the write at the start of btrfs_file_write_iter() through a call to check_can_nocow(), but later when we go into the actual direct IO write path we simply return -EAGAIN if the write starts at or beyond EOF.
Trivial to reproduce:
$ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt
$ touch /mnt/foo $ chattr +C /mnt/foo
$ xfs_io -d -c "pwrite -S 0xab 0 64K" /mnt/foo wrote 65536/65536 bytes at offset 0 64 KiB, 16 ops; 0.0004 sec (135.575 MiB/sec and 34707.1584 ops/sec)
$ xfs_io -c "falloc -k 64K 1M" /mnt/foo
$ xfs_io -d -c "pwrite -N -V 1 -S 0xfe -b 64K 64K 64K" /mnt/foo pwrite: Resource temporarily unavailable
On xfs and ext4 the write succeeds, as expected.
Fix this by removing the wrong check at btrfs_direct_IO().
Fixes: edf064e7c6fec3 ("btrfs: nowait aio support") CC: stable@vger.kernel.org # 4.14+ Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/inode.c | 3 --- 1 file changed, 3 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 18111b0263d71..6aa200e373c8f 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -8262,9 +8262,6 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter) dio_data.overwrite = 1; inode_unlock(inode); relock = true; - } else if (iocb->ki_flags & IOCB_NOWAIT) { - ret = -EAGAIN; - goto out; } ret = btrfs_delalloc_reserve_space(inode, &data_reserved, offset, count);
From: Filipe Manana fdmanana@suse.com
commit 260a63395f90f67d6ab89e4266af9e3dc34a77e9 upstream.
If we attempt to do a RWF_NOWAIT write against a file range for which we can only do NOCOW for a part of it, due to the existence of holes or shared extents for example, we proceed with the write as if it were possible to NOCOW the whole range.
Example:
$ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt
$ touch /mnt/sdj/bar $ chattr +C /mnt/sdj/bar
$ xfs_io -d -c "pwrite -S 0xab -b 256K 0 256K" /mnt/bar wrote 262144/262144 bytes at offset 0 256 KiB, 1 ops; 0.0003 sec (694.444 MiB/sec and 2777.7778 ops/sec)
$ xfs_io -c "fpunch 64K 64K" /mnt/bar $ sync
$ xfs_io -d -c "pwrite -N -V 1 -b 128K -S 0xfe 0 128K" /mnt/bar wrote 131072/131072 bytes at offset 0 128 KiB, 1 ops; 0.0007 sec (160.051 MiB/sec and 1280.4097 ops/sec)
This last write should fail with -EAGAIN since the file range from 64K to 128K is a hole. On xfs it fails, as expected, but on ext4 it currently succeeds because apparently it is expensive to check if there are extents allocated for the whole range, but I'll check with the ext4 people.
Fix the issue by checking if check_can_nocow() returns a number of NOCOW'able bytes smaller then the requested number of bytes, and if it does return -EAGAIN.
Fixes: edf064e7c6fec3 ("btrfs: nowait aio support") CC: stable@vger.kernel.org # 4.14+ Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/file.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 484803f8b2290..52d565ff66e2d 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1912,18 +1912,29 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, pos = iocb->ki_pos; count = iov_iter_count(from); if (iocb->ki_flags & IOCB_NOWAIT) { + size_t nocow_bytes = count; + /* * We will allocate space in case nodatacow is not set, * so bail */ if (!(BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW | BTRFS_INODE_PREALLOC)) || - check_can_nocow(BTRFS_I(inode), pos, &count) <= 0) { + check_can_nocow(BTRFS_I(inode), pos, &nocow_bytes) <= 0) { inode_unlock(inode); return -EAGAIN; } /* check_can_nocow() locks the snapshot lock on success */ btrfs_drew_write_unlock(&root->snapshot_lock); + /* + * There are holes in the range or parts of the range that must + * be COWed (shared extents, RO block groups, etc), so just bail + * out. + */ + if (nocow_bytes < count) { + inode_unlock(inode); + return -EAGAIN; + } }
current->backing_dev_info = inode_to_bdi(inode);
From: Vlastimil Babka vbabka@suse.cz
commit b9e20f0da1f5c9c68689450a8cb436c9486434c8 upstream.
Hugh reports:
"While stressing compaction, one run oopsed on NULL capc->cc in __free_one_page()'s task_capc(zone): compact_zone_order() had been interrupted, and a page was being freed in the return from interrupt.
Though you would not expect it from the source, both gccs I was using (4.8.1 and 7.5.0) had chosen to compile compact_zone_order() with the ".cc = &cc" implemented by mov %rbx,-0xb0(%rbp) immediately before callq compact_zone - long after the "current->capture_control = &capc". An interrupt in between those finds capc->cc NULL (zeroed by an earlier rep stos).
This could presumably be fixed by a barrier() before setting current->capture_control in compact_zone_order(); but would also need more care on return from compact_zone(), in order not to risk leaking a page captured by interrupt just before capture_control is reset.
Maybe that is the preferable fix, but I felt safer for task_capc() to exclude the rather surprising possibility of capture at interrupt time"
I have checked that gcc10 also behaves the same.
The advantage of fix in compact_zone_order() is that we don't add another test in the page freeing hot path, and that it might prevent future problems if we stop exposing pointers to uninitialized structures in current task.
So this patch implements the suggestion for compact_zone_order() with barrier() (and WRITE_ONCE() to prevent store tearing) for setting current->capture_control, and prevents page leaking with WRITE_ONCE/READ_ONCE in the proper order.
Link: http://lkml.kernel.org/r/20200616082649.27173-1-vbabka@suse.cz Fixes: 5e1f0f098b46 ("mm, compaction: capture a page under direct compaction") Signed-off-by: Vlastimil Babka vbabka@suse.cz Reported-by: Hugh Dickins hughd@google.com Suggested-by: Hugh Dickins hughd@google.com Acked-by: Hugh Dickins hughd@google.com Cc: Alex Shi alex.shi@linux.alibaba.com Cc: Li Wang liwang@redhat.com Cc: Mel Gorman mgorman@techsingularity.net Cc: stable@vger.kernel.org [5.1+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/compaction.c | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c index 46f0fcc93081e..65b568e195823 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -2318,15 +2318,26 @@ static enum compact_result compact_zone_order(struct zone *zone, int order, .page = NULL, };
- current->capture_control = &capc; + /* + * Make sure the structs are really initialized before we expose the + * capture control, in case we are interrupted and the interrupt handler + * frees a page. + */ + barrier(); + WRITE_ONCE(current->capture_control, &capc);
ret = compact_zone(&cc, &capc);
VM_BUG_ON(!list_empty(&cc.freepages)); VM_BUG_ON(!list_empty(&cc.migratepages));
- *capture = capc.page; - current->capture_control = NULL; + /* + * Make sure we hide capture control first before we read the captured + * page pointer, otherwise an interrupt could free and capture a page + * and we would leak it. + */ + WRITE_ONCE(current->capture_control, NULL); + *capture = READ_ONCE(capc.page);
return ret; }
From: Waiman Long longman@redhat.com
commit d7670879c5c4aa443d518fb234a9e5f30931efa3 upstream.
It was found that running the LTP test on a PowerPC system could produce erroneous values in /proc/meminfo, like:
MemTotal: 531915072 kB MemFree: 507962176 kB MemAvailable: 1100020596352 kB
Using bisection, the problem is tracked down to commit 9c315e4d7d8c ("mm: memcg/slab: cache page number in memcg_(un)charge_slab()").
In memcg_uncharge_slab() with a "int order" argument:
unsigned int nr_pages = 1 << order; : mod_lruvec_state(lruvec, cache_vmstat_idx(s), -nr_pages);
The mod_lruvec_state() function will eventually call the __mod_zone_page_state() which accepts a long argument. Depending on the compiler and how inlining is done, "-nr_pages" may be treated as a negative number or a very large positive number. Apparently, it was treated as a large positive number in that PowerPC system leading to incorrect stat counts. This problem hasn't been seen in x86-64 yet, perhaps the gcc compiler there has some slight difference in behavior.
It is fixed by making nr_pages a signed value. For consistency, a similar change is applied to memcg_charge_slab() as well.
Link: http://lkml.kernel.org/r/20200620184719.10994-1-longman@redhat.com Fixes: 9c315e4d7d8c ("mm: memcg/slab: cache page number in memcg_(un)charge_slab()"). Signed-off-by: Waiman Long longman@redhat.com Acked-by: Roman Gushchin guro@fb.com Cc: Christoph Lameter cl@linux.com Cc: Pekka Enberg penberg@kernel.org Cc: David Rientjes rientjes@google.com Cc: Joonsoo Kim iamjoonsoo.kim@lge.com Cc: Shakeel Butt shakeelb@google.com Cc: Johannes Weiner hannes@cmpxchg.org Cc: Michal Hocko mhocko@kernel.org Cc: Vladimir Davydov vdavydov.dev@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/slab.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/slab.h b/mm/slab.h index 207c83ef6e06d..74f7e09a7cfd1 100644 --- a/mm/slab.h +++ b/mm/slab.h @@ -348,7 +348,7 @@ static __always_inline int memcg_charge_slab(struct page *page, gfp_t gfp, int order, struct kmem_cache *s) { - unsigned int nr_pages = 1 << order; + int nr_pages = 1 << order; struct mem_cgroup *memcg; struct lruvec *lruvec; int ret; @@ -388,7 +388,7 @@ out: static __always_inline void memcg_uncharge_slab(struct page *page, int order, struct kmem_cache *s) { - unsigned int nr_pages = 1 << order; + int nr_pages = 1 << order; struct mem_cgroup *memcg; struct lruvec *lruvec;
From: Waiman Long longman@redhat.com
commit 8982ae527fbef170ef298650c15d55a9ccd33973 upstream.
The kzfree() function is normally used to clear some sensitive information, like encryption keys, in the buffer before freeing it back to the pool. Memset() is currently used for buffer clearing. However unlikely, there is still a non-zero probability that the compiler may choose to optimize away the memory clearing especially if LTO is being used in the future.
To make sure that this optimization will never happen, memzero_explicit(), which is introduced in v3.18, is now used in kzfree() to future-proof it.
Link: http://lkml.kernel.org/r/20200616154311.12314-2-longman@redhat.com Fixes: 3ef0e5ba4673 ("slab: introduce kzfree()") Signed-off-by: Waiman Long longman@redhat.com Acked-by: Michal Hocko mhocko@suse.com Cc: David Howells dhowells@redhat.com Cc: Jarkko Sakkinen jarkko.sakkinen@linux.intel.com Cc: James Morris jmorris@namei.org Cc: "Serge E. Hallyn" serge@hallyn.com Cc: Joe Perches joe@perches.com Cc: Matthew Wilcox willy@infradead.org Cc: David Rientjes rientjes@google.com Cc: Johannes Weiner hannes@cmpxchg.org Cc: Dan Carpenter dan.carpenter@oracle.com Cc: "Jason A . Donenfeld" Jason@zx2c4.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/slab_common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/slab_common.c b/mm/slab_common.c index 9e72ba2241750..37d48a56431d0 100644 --- a/mm/slab_common.c +++ b/mm/slab_common.c @@ -1726,7 +1726,7 @@ void kzfree(const void *p) if (unlikely(ZERO_OR_NULL_PTR(mem))) return; ks = ksize(mem); - memset(mem, 0, ks); + memzero_explicit(mem, ks); kfree(mem); } EXPORT_SYMBOL(kzfree);
From: Junxiao Bi junxiao.bi@oracle.com
commit 4cd9973f9ff69e37dd0ba2bd6e6423f8179c329a upstream.
Patch series "ocfs2: fix nfsd over ocfs2 issues", v2.
This is a series of patches to fix issues on nfsd over ocfs2. patch 1 is to avoid inode removed while nfsd access it patch 2 & 3 is to fix a panic issue.
This patch (of 4):
When nfsd is getting file dentry using handle or parent dentry of some dentry, one cluster lock is used to avoid inode removed from other node, but it still could be removed from local node, so use a rw lock to avoid this.
Link: http://lkml.kernel.org/r/20200616183829.87211-1-junxiao.bi@oracle.com Link: http://lkml.kernel.org/r/20200616183829.87211-2-junxiao.bi@oracle.com Signed-off-by: Junxiao Bi junxiao.bi@oracle.com Reviewed-by: Joseph Qi joseph.qi@linux.alibaba.com Cc: Changwei Ge gechangwei@live.cn Cc: Gang He ghe@suse.com Cc: Joel Becker jlbec@evilplan.org Cc: Jun Piao piaojun@huawei.com Cc: Mark Fasheh mark@fasheh.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ocfs2/dlmglue.c | 17 ++++++++++++++++- fs/ocfs2/ocfs2.h | 1 + 2 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c index 152a0fc4e9051..751bc4dc74663 100644 --- a/fs/ocfs2/dlmglue.c +++ b/fs/ocfs2/dlmglue.c @@ -689,6 +689,12 @@ static void ocfs2_nfs_sync_lock_res_init(struct ocfs2_lock_res *res, &ocfs2_nfs_sync_lops, osb); }
+static void ocfs2_nfs_sync_lock_init(struct ocfs2_super *osb) +{ + ocfs2_nfs_sync_lock_res_init(&osb->osb_nfs_sync_lockres, osb); + init_rwsem(&osb->nfs_sync_rwlock); +} + void ocfs2_trim_fs_lock_res_init(struct ocfs2_super *osb) { struct ocfs2_lock_res *lockres = &osb->osb_trim_fs_lockres; @@ -2855,6 +2861,11 @@ int ocfs2_nfs_sync_lock(struct ocfs2_super *osb, int ex) if (ocfs2_is_hard_readonly(osb)) return -EROFS;
+ if (ex) + down_write(&osb->nfs_sync_rwlock); + else + down_read(&osb->nfs_sync_rwlock); + if (ocfs2_mount_local(osb)) return 0;
@@ -2873,6 +2884,10 @@ void ocfs2_nfs_sync_unlock(struct ocfs2_super *osb, int ex) if (!ocfs2_mount_local(osb)) ocfs2_cluster_unlock(osb, lockres, ex ? LKM_EXMODE : LKM_PRMODE); + if (ex) + up_write(&osb->nfs_sync_rwlock); + else + up_read(&osb->nfs_sync_rwlock); }
int ocfs2_trim_fs_lock(struct ocfs2_super *osb, @@ -3340,7 +3355,7 @@ int ocfs2_dlm_init(struct ocfs2_super *osb) local: ocfs2_super_lock_res_init(&osb->osb_super_lockres, osb); ocfs2_rename_lock_res_init(&osb->osb_rename_lockres, osb); - ocfs2_nfs_sync_lock_res_init(&osb->osb_nfs_sync_lockres, osb); + ocfs2_nfs_sync_lock_init(osb); ocfs2_orphan_scan_lock_res_init(&osb->osb_orphan_scan.os_lockres, osb);
osb->cconn = conn; diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h index 9150cfa4df7dc..9461bd3e1c0c8 100644 --- a/fs/ocfs2/ocfs2.h +++ b/fs/ocfs2/ocfs2.h @@ -394,6 +394,7 @@ struct ocfs2_super struct ocfs2_lock_res osb_super_lockres; struct ocfs2_lock_res osb_rename_lockres; struct ocfs2_lock_res osb_nfs_sync_lockres; + struct rw_semaphore nfs_sync_rwlock; struct ocfs2_lock_res osb_trim_fs_lockres; struct mutex obs_trim_fs_mutex; struct ocfs2_dlm_debug *osb_dlm_debug;
From: Junxiao Bi junxiao.bi@oracle.com
commit 7569d3c754e452769a5747eeeba488179e38a5da upstream.
Set global_inode_alloc as OCFS2_FIRST_ONLINE_SYSTEM_INODE, that will make it load during mount. It can be used to test whether some global/system inodes are valid. One use case is that nfsd will test whether root inode is valid.
Link: http://lkml.kernel.org/r/20200616183829.87211-3-junxiao.bi@oracle.com Signed-off-by: Junxiao Bi junxiao.bi@oracle.com Reviewed-by: Joseph Qi joseph.qi@linux.alibaba.com Cc: Changwei Ge gechangwei@live.cn Cc: Gang He ghe@suse.com Cc: Joel Becker jlbec@evilplan.org Cc: Jun Piao piaojun@huawei.com Cc: Mark Fasheh mark@fasheh.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ocfs2/ocfs2_fs.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/ocfs2/ocfs2_fs.h b/fs/ocfs2/ocfs2_fs.h index 0dd8c41bafd4c..3fc99659ed09d 100644 --- a/fs/ocfs2/ocfs2_fs.h +++ b/fs/ocfs2/ocfs2_fs.h @@ -326,8 +326,8 @@ struct ocfs2_system_inode_info { enum { BAD_BLOCK_SYSTEM_INODE = 0, GLOBAL_INODE_ALLOC_SYSTEM_INODE, +#define OCFS2_FIRST_ONLINE_SYSTEM_INODE GLOBAL_INODE_ALLOC_SYSTEM_INODE SLOT_MAP_SYSTEM_INODE, -#define OCFS2_FIRST_ONLINE_SYSTEM_INODE SLOT_MAP_SYSTEM_INODE HEARTBEAT_SYSTEM_INODE, GLOBAL_BITMAP_SYSTEM_INODE, USER_QUOTA_SYSTEM_INODE,
From: Junxiao Bi junxiao.bi@oracle.com
commit 9277f8334ffc719fe922d776444d6e4e884dbf30 upstream.
In the ocfs2 disk layout, slot number is 16 bits, but in ocfs2 implementation, slot number is 32 bits. Usually this will not cause any issue, because slot number is converted from u16 to u32, but OCFS2_INVALID_SLOT was defined as -1, when an invalid slot number from disk was obtained, its value was (u16)-1, and it was converted to u32. Then the following checking in get_local_system_inode will be always skipped:
static struct inode **get_local_system_inode(struct ocfs2_super *osb, int type, u32 slot) { BUG_ON(slot == OCFS2_INVALID_SLOT); ... }
Link: http://lkml.kernel.org/r/20200616183829.87211-5-junxiao.bi@oracle.com Signed-off-by: Junxiao Bi junxiao.bi@oracle.com Reviewed-by: Joseph Qi joseph.qi@linux.alibaba.com Cc: Mark Fasheh mark@fasheh.com Cc: Joel Becker jlbec@evilplan.org Cc: Changwei Ge gechangwei@live.cn Cc: Gang He ghe@suse.com Cc: Jun Piao piaojun@huawei.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ocfs2/ocfs2_fs.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/ocfs2/ocfs2_fs.h b/fs/ocfs2/ocfs2_fs.h index 3fc99659ed09d..19137c6d087bc 100644 --- a/fs/ocfs2/ocfs2_fs.h +++ b/fs/ocfs2/ocfs2_fs.h @@ -290,7 +290,7 @@ #define OCFS2_MAX_SLOTS 255
/* Slot map indicator for an empty slot */ -#define OCFS2_INVALID_SLOT -1 +#define OCFS2_INVALID_SLOT ((u16)-1)
#define OCFS2_VOL_UUID_LEN 16 #define OCFS2_MAX_VOL_LABEL_LEN 64
From: Junxiao Bi junxiao.bi@oracle.com
commit e5a15e17a78d58f933d17cafedfcf7486a29f5b4 upstream.
The following kernel panic was captured when running nfs server over ocfs2, at that time ocfs2_test_inode_bit() was checking whether one inode locating at "blkno" 5 was valid, that is ocfs2 root inode, its "suballoc_slot" was OCFS2_INVALID_SLOT(65535) and it was allocted from //global_inode_alloc, but here it wrongly assumed that it was got from per slot inode alloctor which would cause array overflow and trigger kernel panic.
BUG: unable to handle kernel paging request at 0000000000001088 IP: [<ffffffff816f6898>] _raw_spin_lock+0x18/0xf0 PGD 1e06ba067 PUD 1e9e7d067 PMD 0 Oops: 0002 [#1] SMP CPU: 6 PID: 24873 Comm: nfsd Not tainted 4.1.12-124.36.1.el6uek.x86_64 #2 Hardware name: Huawei CH121 V3/IT11SGCA1, BIOS 3.87 02/02/2018 RIP: _raw_spin_lock+0x18/0xf0 RSP: e02b:ffff88005ae97908 EFLAGS: 00010206 RAX: ffff88005ae98000 RBX: 0000000000001088 RCX: 0000000000000000 RDX: 0000000000020000 RSI: 0000000000000009 RDI: 0000000000001088 RBP: ffff88005ae97928 R08: 0000000000000000 R09: ffff880212878e00 R10: 0000000000007ff0 R11: 0000000000000000 R12: 0000000000001088 R13: ffff8800063c0aa8 R14: ffff8800650c27d0 R15: 000000000000ffff FS: 0000000000000000(0000) GS:ffff880218180000(0000) knlGS:ffff880218180000 CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000001088 CR3: 00000002033d0000 CR4: 0000000000042660 Call Trace: igrab+0x1e/0x60 ocfs2_get_system_file_inode+0x63/0x3a0 [ocfs2] ocfs2_test_inode_bit+0x328/0xa00 [ocfs2] ocfs2_get_parent+0xba/0x3e0 [ocfs2] reconnect_path+0xb5/0x300 exportfs_decode_fh+0xf6/0x2b0 fh_verify+0x350/0x660 [nfsd] nfsd4_putfh+0x4d/0x60 [nfsd] nfsd4_proc_compound+0x3d3/0x6f0 [nfsd] nfsd_dispatch+0xe0/0x290 [nfsd] svc_process_common+0x412/0x6a0 [sunrpc] svc_process+0x123/0x210 [sunrpc] nfsd+0xff/0x170 [nfsd] kthread+0xcb/0xf0 ret_from_fork+0x61/0x90 Code: 83 c2 02 0f b7 f2 e8 18 dc 91 ff 66 90 eb bf 0f 1f 40 00 55 48 89 e5 41 56 41 55 41 54 53 0f 1f 44 00 00 48 89 fb ba 00 00 02 00 <f0> 0f c1 17 89 d0 45 31 e4 45 31 ed c1 e8 10 66 39 d0 41 89 c6 RIP _raw_spin_lock+0x18/0xf0 CR2: 0000000000001088 ---[ end trace 7264463cd1aac8f9 ]--- Kernel panic - not syncing: Fatal exception
Link: http://lkml.kernel.org/r/20200616183829.87211-4-junxiao.bi@oracle.com Signed-off-by: Junxiao Bi junxiao.bi@oracle.com Reviewed-by: Joseph Qi joseph.qi@linux.alibaba.com Cc: Changwei Ge gechangwei@live.cn Cc: Gang He ghe@suse.com Cc: Joel Becker jlbec@evilplan.org Cc: Jun Piao piaojun@huawei.com Cc: Mark Fasheh mark@fasheh.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ocfs2/suballoc.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c index 4836becb7578a..45745cc3408a5 100644 --- a/fs/ocfs2/suballoc.c +++ b/fs/ocfs2/suballoc.c @@ -2825,9 +2825,12 @@ int ocfs2_test_inode_bit(struct ocfs2_super *osb, u64 blkno, int *res) goto bail; }
- inode_alloc_inode = - ocfs2_get_system_file_inode(osb, INODE_ALLOC_SYSTEM_INODE, - suballoc_slot); + if (suballoc_slot == (u16)OCFS2_INVALID_SLOT) + inode_alloc_inode = ocfs2_get_system_file_inode(osb, + GLOBAL_INODE_ALLOC_SYSTEM_INODE, suballoc_slot); + else + inode_alloc_inode = ocfs2_get_system_file_inode(osb, + INODE_ALLOC_SYSTEM_INODE, suballoc_slot); if (!inode_alloc_inode) { /* the error code could be inaccurate, but we are not able to * get the correct one. */
From: Johannes Weiner hannes@cmpxchg.org
commit cd324edce598ebddde44162a2aa01321c1261b9e upstream.
Tejun reports seeing rare div0 crashes in memory.low stress testing:
RIP: 0010:mem_cgroup_calculate_protection+0xed/0x150 Code: 0f 46 d1 4c 39 d8 72 57 f6 05 16 d6 42 01 40 74 1f 4c 39 d8 76 1a 4c 39 d1 76 15 4c 29 d1 4c 29 d8 4d 29 d9 31 d2 48 0f af c1 <49> f7 f1 49 01 c2 4c 89 96 38 01 00 00 5d c3 48 0f af c7 31 d2 49 RSP: 0018:ffffa14e01d6fcd0 EFLAGS: 00010246 RAX: 000000000243e384 RBX: 0000000000000000 RCX: 0000000000008f4b RDX: 0000000000000000 RSI: ffff8b89bee84000 RDI: 0000000000000000 RBP: ffffa14e01d6fcd0 R08: ffff8b89ca7d40f8 R09: 0000000000000000 R10: 0000000000000000 R11: 00000000006422f7 R12: 0000000000000000 R13: ffff8b89d9617000 R14: ffff8b89bee84000 R15: ffffa14e01d6fdb8 FS: 0000000000000000(0000) GS:ffff8b8a1f1c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f93b1fc175b CR3: 000000016100a000 CR4: 0000000000340ea0 Call Trace: shrink_node+0x1e5/0x6c0 balance_pgdat+0x32d/0x5f0 kswapd+0x1d7/0x3d0 kthread+0x11c/0x160 ret_from_fork+0x1f/0x30
This happens when parent_usage == siblings_protected.
We check that usage is bigger than protected, which should imply parent_usage being bigger than siblings_protected. However, we don't read (or even update) these values atomically, and they can be out of sync as the memory state changes under us. A bit of fluctuation around the target protection isn't a big deal, but we need to handle the div0 case.
Check the parent state explicitly to make sure we have a reasonable positive value for the divisor.
Link: http://lkml.kernel.org/r/20200615140658.601684-1-hannes@cmpxchg.org Fixes: 8a931f801340 ("mm: memcontrol: recursive memory.low protection") Signed-off-by: Johannes Weiner hannes@cmpxchg.org Reported-by: Tejun Heo tj@kernel.org Acked-by: Michal Hocko mhocko@suse.com Acked-by: Chris Down chris@chrisdown.name Cc: Roman Gushchin guro@fb.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/memcontrol.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c index a3b97f1039665..1b05e25d8aef2 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -6349,11 +6349,16 @@ static unsigned long effective_protection(unsigned long usage, * We're using unprotected memory for the weight so that if * some cgroups DO claim explicit protection, we don't protect * the same bytes twice. + * + * Check both usage and parent_usage against the respective + * protected values. One should imply the other, but they + * aren't read atomically - make sure the division is sane. */ if (!(cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_RECURSIVE_PROT)) return ep; - - if (parent_effective > siblings_protected && usage > protected) { + if (parent_effective > siblings_protected && + parent_usage > siblings_protected && + usage > protected) { unsigned long unclaimed;
unclaimed = parent_effective - siblings_protected;
From: Muchun Song songmuchun@bytedance.com
commit 3a98990ae2150277ed34d3b248c60e68bf2244b2 upstream.
We should put the css reference when memory allocation failed.
Link: http://lkml.kernel.org/r/20200614122653.98829-1-songmuchun@bytedance.com Fixes: f0a3a24b532d ("mm: memcg/slab: rework non-root kmem_cache lifecycle management") Signed-off-by: Muchun Song songmuchun@bytedance.com Acked-by: Roman Gushchin guro@fb.com Acked-by: Michal Hocko mhocko@suse.com Cc: Johannes Weiner hannes@cmpxchg.org Cc: Vladimir Davydov vdavydov.dev@gmail.com Cc: Qian Cai cai@lca.pw Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/memcontrol.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 1b05e25d8aef2..ef0e291a8cf46 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2790,8 +2790,10 @@ static void memcg_schedule_kmem_cache_create(struct mem_cgroup *memcg, return;
cw = kmalloc(sizeof(*cw), GFP_NOWAIT | __GFP_NOWARN); - if (!cw) + if (!cw) { + css_put(&memcg->css); return; + }
cw->memcg = memcg; cw->cachep = cachep;
From: Ben Widawsky ben.widawsky@intel.com
commit b7e3debdd0408c0dca5d4750371afa5003f792dc upstream.
When working with very large nodes, poisoning the struct pages (for which there will be very many) can take a very long time. If the system is using voluntary preemptions, the software watchdog will not be able to detect forward progress. This patch addresses this issue by offering to give up time like __remove_pages() does. This behavior was introduced in v5.6 with: commit d33695b16a9f ("mm/memory_hotplug: poison memmap in remove_pfn_range_from_zone()")
Alternately, init_page_poison could do this cond_resched(), but it seems to me that the caller of init_page_poison() is what actually knows whether or not it should relax its own priority.
Based on Dan's notes, I think this is perfectly safe: commit f931ab479dd2 ("mm: fix devm_memremap_pages crash, use mem_hotplug_{begin, done}")
Aside from fixing the lockup, it is also a friendlier thing to do on lower core systems that might wipe out large chunks of hotplug memory (probably not a very common case).
Fixes this kind of splat:
watchdog: BUG: soft lockup - CPU#46 stuck for 22s! [daxctl:9922] irq event stamp: 138450 hardirqs last enabled at (138449): [<ffffffffa1001f26>] trace_hardirqs_on_thunk+0x1a/0x1c hardirqs last disabled at (138450): [<ffffffffa1001f42>] trace_hardirqs_off_thunk+0x1a/0x1c softirqs last enabled at (138448): [<ffffffffa1e00347>] __do_softirq+0x347/0x456 softirqs last disabled at (138443): [<ffffffffa10c416d>] irq_exit+0x7d/0xb0 CPU: 46 PID: 9922 Comm: daxctl Not tainted 5.7.0-BEN-14238-g373c6049b336 #30 Hardware name: Intel Corporation PURLEY/PURLEY, BIOS PLYXCRB1.86B.0578.D07.1902280810 02/28/2019 RIP: 0010:memset_erms+0x9/0x10 Code: c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 f3 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 <f3> aa 4c 89 c8 c3 90 49 89 fa 40 0f b6 ce 48 b8 01 01 01 01 01 01 Call Trace: remove_pfn_range_from_zone+0x3a/0x380 memunmap_pages+0x17f/0x280 release_nodes+0x22a/0x260 __device_release_driver+0x172/0x220 device_driver_detach+0x3e/0xa0 unbind_store+0x113/0x130 kernfs_fop_write+0xdc/0x1c0 vfs_write+0xde/0x1d0 ksys_write+0x58/0xd0 do_syscall_64+0x5a/0x120 entry_SYSCALL_64_after_hwframe+0x49/0xb3 Built 2 zonelists, mobility grouping on. Total pages: 49050381 Policy zone: Normal Built 3 zonelists, mobility grouping on. Total pages: 49312525 Policy zone: Normal
David said: "It really only is an issue for devmem. Ordinary hotplugged system memory is not affected (onlined/offlined in memory block granularity)."
Link: http://lkml.kernel.org/r/20200619231213.1160351-1-ben.widawsky@intel.com Fixes: commit d33695b16a9f ("mm/memory_hotplug: poison memmap in remove_pfn_range_from_zone()") Signed-off-by: Ben Widawsky ben.widawsky@intel.com Reported-by: "Scargall, Steve" steve.scargall@intel.com Reported-by: Ben Widawsky ben.widawsky@intel.com Acked-by: David Hildenbrand david@redhat.com Cc: Dan Williams dan.j.williams@intel.com Cc: Vishal Verma vishal.l.verma@intel.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/memory_hotplug.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index fc0aad0bc1f54..744a3ea284b78 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -468,11 +468,20 @@ void __ref remove_pfn_range_from_zone(struct zone *zone, unsigned long start_pfn, unsigned long nr_pages) { + const unsigned long end_pfn = start_pfn + nr_pages; struct pglist_data *pgdat = zone->zone_pgdat; - unsigned long flags; + unsigned long pfn, cur_nr_pages, flags;
/* Poison struct pages because they are now uninitialized again. */ - page_init_poison(pfn_to_page(start_pfn), sizeof(struct page) * nr_pages); + for (pfn = start_pfn; pfn < end_pfn; pfn += cur_nr_pages) { + cond_resched(); + + /* Select all remaining pages up to the next section boundary */ + cur_nr_pages = + min(end_pfn - pfn, SECTION_ALIGN_UP(pfn + 1) - pfn); + page_init_poison(pfn_to_page(pfn), + sizeof(struct page) * cur_nr_pages); + }
#ifdef CONFIG_ZONE_DEVICE /*
From: Jiping Ma jiping.ma2@windriver.com
commit 8dfe804a4031ca6ba3a3efb2048534249b64f3a5 upstream.
A 32-bit perf querying the registers of a compat task using REGS_ABI_32 will receive zeroes from w15, when it expects to find the PC.
Return the PC value for register dwarf register 15 when returning register values for a compat task to perf.
Cc: stable@vger.kernel.org Acked-by: Mark Rutland mark.rutland@arm.com Signed-off-by: Jiping Ma jiping.ma2@windriver.com Link: https://lore.kernel.org/r/1589165527-188401-1-git-send-email-jiping.ma2@wind... [will: Shuffled code and added a comment] Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kernel/perf_regs.c | 25 ++++++++++++++++++++++--- 1 file changed, 22 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kernel/perf_regs.c b/arch/arm64/kernel/perf_regs.c index 0bbac612146ea..666b225aeb3ad 100644 --- a/arch/arm64/kernel/perf_regs.c +++ b/arch/arm64/kernel/perf_regs.c @@ -15,15 +15,34 @@ u64 perf_reg_value(struct pt_regs *regs, int idx) return 0;
/* - * Compat (i.e. 32 bit) mode: - * - PC has been set in the pt_regs struct in kernel_entry, - * - Handle SP and LR here. + * Our handling of compat tasks (PERF_SAMPLE_REGS_ABI_32) is weird, but + * we're stuck with it for ABI compatability reasons. + * + * For a 32-bit consumer inspecting a 32-bit task, then it will look at + * the first 16 registers (see arch/arm/include/uapi/asm/perf_regs.h). + * These correspond directly to a prefix of the registers saved in our + * 'struct pt_regs', with the exception of the PC, so we copy that down + * (x15 corresponds to SP_hyp in the architecture). + * + * So far, so good. + * + * The oddity arises when a 64-bit consumer looks at a 32-bit task and + * asks for registers beyond PERF_REG_ARM_MAX. In this case, we return + * SP_usr, LR_usr and PC in the positions where the AArch64 SP, LR and + * PC registers would normally live. The initial idea was to allow a + * 64-bit unwinder to unwind a 32-bit task and, although it's not clear + * how well that works in practice, somebody might be relying on it. + * + * At the time we make a sample, we don't know whether the consumer is + * 32-bit or 64-bit, so we have to cater for both possibilities. */ if (compat_user_mode(regs)) { if ((u32)idx == PERF_REG_ARM64_SP) return regs->compat_sp; if ((u32)idx == PERF_REG_ARM64_LR) return regs->compat_lr; + if (idx == 15) + return regs->pc; }
if ((u32)idx == PERF_REG_ARM64_SP)
From: Robin Gong yibin.gong@nxp.com
commit 4fd6b5735c03c0955d93960d31f17d7144f5578f upstream.
Correct ldo1 voltage range from wrong high group(3.0V~3.3V) to low group (1.6V~1.9V) because the ldo1 should be 1.8V. Actually, two voltage groups have been supported at bd718x7-regulator driver, hence, just corrrect the voltage range to 1.6V~3.3V. For ldo2@0.8V, correct voltage range too. Otherwise, ldo1 would be kept @3.0V and ldo2@0.9V which violate i.mx8mm datasheet as the below warning log in kernel:
[ 0.995524] LDO1: Bringing 1800000uV into 3000000-3000000uV [ 0.999196] LDO2: Bringing 800000uV into 900000-900000uV
Fixes: 78cc25fa265d ("arm64: dts: imx8mm-evk: Add BD71847 PMIC") Cc: stable@vger.kernel.org Signed-off-by: Robin Gong yibin.gong@nxp.com Reviewed-by: Dong Aisheng aisheng.dong@nxp.com Reviewed-by: Fabio Estevam festevam@gmail.com Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/boot/dts/freescale/imx8mm-evk.dts | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/boot/dts/freescale/imx8mm-evk.dts b/arch/arm64/boot/dts/freescale/imx8mm-evk.dts index 951e14a3de0e5..22aed2806fda8 100644 --- a/arch/arm64/boot/dts/freescale/imx8mm-evk.dts +++ b/arch/arm64/boot/dts/freescale/imx8mm-evk.dts @@ -196,7 +196,7 @@
ldo1_reg: LDO1 { regulator-name = "LDO1"; - regulator-min-microvolt = <3000000>; + regulator-min-microvolt = <1600000>; regulator-max-microvolt = <3300000>; regulator-boot-on; regulator-always-on; @@ -204,7 +204,7 @@
ldo2_reg: LDO2 { regulator-name = "LDO2"; - regulator-min-microvolt = <900000>; + regulator-min-microvolt = <800000>; regulator-max-microvolt = <900000>; regulator-boot-on; regulator-always-on;
From: Robin Gong yibin.gong@nxp.com
commit cfb12c8952f617df58d73d24161e539a035d82b0 upstream.
Correct ldo1 voltage range from wrong high group(3.0V~3.3V) to low group (1.6V~1.9V) because the ldo1 should be 1.8V. Actually, two voltage groups have been supported at bd718x7-regulator driver, hence, just corrrect the voltage range to 1.6V~3.3V. For ldo2@0.8V, correct voltage range too. Otherwise, ldo1 would be kept @3.0V and ldo2@0.9V which violate i.mx8mn datasheet as the below warning log in kernel:
[ 0.995524] LDO1: Bringing 1800000uV into 3000000-3000000uV [ 0.999196] LDO2: Bringing 800000uV into 900000-900000uV
Fixes: 3e44dd09736d ("arm64: dts: imx8mn-ddr4-evk: Add rohm,bd71847 PMIC support") Cc: stable@vger.kernel.org Signed-off-by: Robin Gong yibin.gong@nxp.com Reviewed-by: Dong Aisheng aisheng.dong@nxp.com Reviewed-by: Fabio Estevam festevam@gmail.com Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts b/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts index 2497eebb57393..fe49dbc535e1f 100644 --- a/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts +++ b/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts @@ -101,7 +101,7 @@
ldo1_reg: LDO1 { regulator-name = "LDO1"; - regulator-min-microvolt = <3000000>; + regulator-min-microvolt = <1600000>; regulator-max-microvolt = <3300000>; regulator-boot-on; regulator-always-on; @@ -109,7 +109,7 @@
ldo2_reg: LDO2 { regulator-name = "LDO2"; - regulator-min-microvolt = <900000>; + regulator-min-microvolt = <800000>; regulator-max-microvolt = <900000>; regulator-boot-on; regulator-always-on;
From: Sascha Ortmann sascha.ortmann@stud.uni-hannover.de
commit 20dc3847cc2fc886ee4eb9112e6e2fad9419b0c7 upstream.
Fix boottime kprobe events to report and abort after each failure when adding probes.
As an example, when we try to set multiprobe kprobe events in bootconfig like this:
ftrace.event.kprobes.vfsevents { probes = "vfs_read $arg1 $arg2,, !error! not reported;?", // leads to error "vfs_write $arg1 $arg2" }
This will not work as expected. After commit da0f1f4167e3af69e ("tracing/boottime: Fix kprobe event API usage"), the function trace_boot_add_kprobe_event will not produce any error message when adding a probe fails at kprobe_event_gen_cmd_start. Furthermore, we continue to add probes when kprobe_event_gen_cmd_end fails (and kprobe_event_gen_cmd_start did not fail). In this case the function even returns successfully when the last call to kprobe_event_gen_cmd_end is successful.
The behaviour of reporting and aborting after failures is not consistent.
The function trace_boot_add_kprobe_event now reports each failure and stops adding probes immediately.
Link: https://lkml.kernel.org/r/20200618163301.25854-1-sascha.ortmann@stud.uni-han...
Cc: stable@vger.kernel.org Cc: linux-kernel@i4.cs.fau.de Co-developed-by: Maximilian Werner maximilian.werner96@gmail.com Fixes: da0f1f4167e3 ("tracing/boottime: Fix kprobe event API usage") Acked-by: Masami Hiramatsu mhiramat@kernel.org Signed-off-by: Maximilian Werner maximilian.werner96@gmail.com Signed-off-by: Sascha Ortmann sascha.ortmann@stud.uni-hannover.de Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_boot.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/kernel/trace/trace_boot.c b/kernel/trace/trace_boot.c index 9de29bb45a27f..fdc5abc00bf84 100644 --- a/kernel/trace/trace_boot.c +++ b/kernel/trace/trace_boot.c @@ -101,12 +101,16 @@ trace_boot_add_kprobe_event(struct xbc_node *node, const char *event) kprobe_event_cmd_init(&cmd, buf, MAX_BUF_LEN);
ret = kprobe_event_gen_cmd_start(&cmd, event, val); - if (ret) + if (ret) { + pr_err("Failed to generate probe: %s\n", buf); break; + }
ret = kprobe_event_gen_cmd_end(&cmd); - if (ret) + if (ret) { pr_err("Failed to add probe: %s\n", buf); + break; + } }
return ret;
From: Masami Hiramatsu mhiramat@kernel.org
commit 6784beada631800f2c5afd567e5628c843362cee upstream.
Fix the event trigger to accept redundant spaces in the trigger input.
For example, these return -EINVAL
echo " traceon" > events/ftrace/print/trigger echo "traceon if common_pid == 0" > events/ftrace/print/trigger echo "disable_event:kmem:kmalloc " > events/ftrace/print/trigger
But these are hard to find what is wrong.
To fix this issue, use skip_spaces() to remove spaces in front of actual tokens, and set NULL if there is no token.
Link: http://lkml.kernel.org/r/159262476352.185015.5261566783045364186.stgit@devno...
Cc: Tom Zanussi zanussi@kernel.org Cc: stable@vger.kernel.org Fixes: 85f2b08268c0 ("tracing: Add basic event trigger framework") Reviewed-by: Tom Zanussi zanussi@kernel.org Signed-off-by: Masami Hiramatsu mhiramat@kernel.org Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_events_trigger.c | 21 +++++++++++++++++++-- 1 file changed, 19 insertions(+), 2 deletions(-)
diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c index 3a74736da363a..f725802160c0b 100644 --- a/kernel/trace/trace_events_trigger.c +++ b/kernel/trace/trace_events_trigger.c @@ -216,11 +216,17 @@ static int event_trigger_regex_open(struct inode *inode, struct file *file)
int trigger_process_regex(struct trace_event_file *file, char *buff) { - char *command, *next = buff; + char *command, *next; struct event_command *p; int ret = -EINVAL;
+ next = buff = skip_spaces(buff); command = strsep(&next, ": \t"); + if (next) { + next = skip_spaces(next); + if (!*next) + next = NULL; + } command = (command[0] != '!') ? command : command + 1;
mutex_lock(&trigger_cmd_mutex); @@ -630,8 +636,14 @@ event_trigger_callback(struct event_command *cmd_ops, int ret;
/* separate the trigger from the filter (t:n [if filter]) */ - if (param && isdigit(param[0])) + if (param && isdigit(param[0])) { trigger = strsep(¶m, " \t"); + if (param) { + param = skip_spaces(param); + if (!*param) + param = NULL; + } + }
trigger_ops = cmd_ops->get_trigger_ops(cmd, trigger);
@@ -1368,6 +1380,11 @@ int event_enable_trigger_func(struct event_command *cmd_ops, trigger = strsep(¶m, " \t"); if (!trigger) return -EINVAL; + if (param) { + param = skip_spaces(param); + if (!*param) + param = NULL; + }
system = strsep(&trigger, ":"); if (!trigger)
From: "Steven Rostedt (VMware)" rostedt@goodmis.org
commit 097350d1c6e1f5808cae142006f18a0bbc57018d upstream.
Currently the ring buffer makes events that happen in interrupts that preempt another event have a delta of zero. (Hopefully we can change this soon). But this is to deal with the races of updating a global counter with lockless and nesting functions updating deltas.
With the addition of absolute time stamps, the time extend didn't follow this rule. A time extend can happen if two events happen longer than 2^27 nanoseconds appart, as the delta time field in each event is only 27 bits. If that happens, then a time extend is injected with 2^59 bits of nanoseconds to use (18 years). But if the 2^27 nanoseconds happen between two events, and as it is writing the event, an interrupt triggers, it will see the 2^27 difference as well and inject a time extend of its own. But a recent change made the time extend logic not take into account the nesting, and this can cause two time extend deltas to happen moving the time stamp much further ahead than the current time. This gets all reset when the ring buffer moves to the next page, but that can cause time to appear to go backwards.
This was observed in a trace-cmd recording, and since the data is saved in a file, with trace-cmd report --debug, it was possible to see that this indeed did happen!
bash-52501 110d... 81778.908247: sched_switch: bash:52501 [120] S ==> swapper/110:0 [120] [12770284:0x2e8:64] <idle>-0 110d... 81778.908757: sched_switch: swapper/110:0 [120] R ==> bash:52501 [120] [509947:0x32c:64] TIME EXTEND: delta:306454770 length:0 bash-52501 110.... 81779.215212: sched_swap_numa: src_pid=52501 src_tgid=52388 src_ngid=52501 src_cpu=110 src_nid=2 dst_pid=52509 dst_tgid=52388 dst_ngid=52501 dst_cpu=49 dst_nid=1 [0:0x378:48] TIME EXTEND: delta:306458165 length:0 bash-52501 110dNh. 81779.521670: sched_wakeup: migration/110:565 [0] success=1 CPU:110 [0:0x3b4:40]
and at the next page, caused the time to go backwards:
bash-52504 110d... 81779.685411: sched_switch: bash:52504 [120] S ==> swapper/110:0 [120] [8347057:0xfb4:64] CPU:110 [SUBBUFFER START] [81779379165886:0x1320000] <idle>-0 110dN.. 81779.379166: sched_wakeup: bash:52504 [120] success=1 CPU:110 [0:0x10:40] <idle>-0 110d... 81779.379167: sched_switch: swapper/110:0 [120] R ==> bash:52504 [120] [1168:0x3c:64]
Link: https://lkml.kernel.org/r/20200622151815.345d1bf5@oasis.local.home
Cc: Ingo Molnar mingo@kernel.org Cc: Andrew Morton akpm@linux-foundation.org Cc: Tom Zanussi zanussi@kernel.org Cc: stable@vger.kernel.org Fixes: dc4e2801d400b ("ring-buffer: Redefine the unimplemented RINGBUF_TYPE_TIME_STAMP") Reported-by: Julia Lawall julia.lawall@inria.fr Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/ring_buffer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index b8e1ca48be50f..00867ff82412a 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -2427,7 +2427,7 @@ rb_update_event(struct ring_buffer_per_cpu *cpu_buffer, if (unlikely(info->add_timestamp)) { bool abs = ring_buffer_time_stamp_abs(cpu_buffer->buffer);
- event = rb_add_time_stamp(event, info->delta, abs); + event = rb_add_time_stamp(event, abs ? info->delta : delta, abs); length -= RB_LEN_TIME_EXTEND; delta = 0; }
From: Stylon Wang stylon.wang@amd.com
commit 5ae9c378c3d88b40af72f8e8f961808e29f3e70b upstream.
[Why] Connector property output_bpc is available on DP/eDP only. New IGT tests would benifit if this property works on HDMI.
[How] Enable this read-only property on all types of connectors.
Signed-off-by: Stylon Wang stylon.wang@amd.com Reviewed-by: Nicholas Kazlauskas Nicholas.Kazlauskas@amd.com Acked-by: Rodrigo Siqueira Rodrigo.Siqueira@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c index 0461fecd68db3..11491ae1effc2 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c @@ -1017,7 +1017,6 @@ static const struct { {"link_settings", &dp_link_settings_debugfs_fops}, {"phy_settings", &dp_phy_settings_debugfs_fop}, {"test_pattern", &dp_phy_test_pattern_fops}, - {"output_bpc", &output_bpc_fops}, {"vrr_range", &vrr_range_fops}, {"sdp_message", &sdp_message_fops}, {"aux_dpcd_address", &dp_dpcd_address_debugfs_fops}, @@ -1090,6 +1089,9 @@ void connector_debugfs_init(struct amdgpu_dm_connector *connector) debugfs_create_file_unsafe("force_yuv420_output", 0644, dir, connector, &force_yuv420_output_fops);
+ debugfs_create_file("output_bpc", 0644, dir, connector, + &output_bpc_fops); + connector->debugfs_dpcd_address = 0; connector->debugfs_dpcd_size = 0;
From: Bernard Zhao bernard@vivo.com
commit b5b78a6c8d8cb9c307bc6b16a754603424459d6e upstream.
The function kobject_init_and_add alloc memory like: kobject_init_and_add->kobject_add_varg->kobject_set_name_vargs ->kvasprintf_const->kstrdup_const->kstrdup->kmalloc_track_caller ->kmalloc_slab, in err branch this memory not free. If use kmemleak, this path maybe catched. These changes are to add kobject_put in kobject_init_and_add failed branch, fix potential memleak.
Signed-off-by: Bernard Zhao bernard@vivo.com Reviewed-by: Felix Kuehling Felix.Kuehling@amd.com Signed-off-by: Felix Kuehling Felix.Kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdkfd/kfd_process.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c index fe0cd49d4ea7c..d8c74aa4e5650 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c @@ -396,6 +396,7 @@ struct kfd_process *kfd_create_process(struct file *filep) (int)process->lead_thread->pid); if (ret) { pr_warn("Creating procfs pid directory failed"); + kobject_put(process->kobj); goto out; }
From: Daniel Gomez dagmcr@gmail.com
commit 5f9af404eec82981c4345c9943be48422234e7ab upstream.
Select DRM_KMS_HELPER dependency.
Build error when DRM_KMS_HELPER is not selected:
drivers/gpu/drm/rcar-du/rcar_lvds.o:(.rodata+0xd48): undefined reference to `drm_atomic_helper_bridge_duplicate_state' drivers/gpu/drm/rcar-du/rcar_lvds.o:(.rodata+0xd50): undefined reference to `drm_atomic_helper_bridge_destroy_state' drivers/gpu/drm/rcar-du/rcar_lvds.o:(.rodata+0xd70): undefined reference to `drm_atomic_helper_bridge_reset' drivers/gpu/drm/rcar-du/rcar_lvds.o:(.rodata+0xdc8): undefined reference to `drm_atomic_helper_connector_reset' drivers/gpu/drm/rcar-du/rcar_lvds.o:(.rodata+0xde0): undefined reference to `drm_helper_probe_single_connector_modes' drivers/gpu/drm/rcar-du/rcar_lvds.o:(.rodata+0xe08): undefined reference to `drm_atomic_helper_connector_duplicate_state' drivers/gpu/drm/rcar-du/rcar_lvds.o:(.rodata+0xe10): undefined reference to `drm_atomic_helper_connector_destroy_state'
Fixes: c6a27fa41fab ("drm: rcar-du: Convert LVDS encoder code to bridge driver") Cc: stable@vger.kernel.org Signed-off-by: Daniel Gomez dagmcr@gmail.com Reviewed-by: Emil Velikov emil.l.velikov@gmail.com Reviewed-by: Kieran Bingham kieran.bingham+renesas@ideasonboard.com Reviewed-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Signed-off-by: Laurent Pinchart laurent.pinchart+renesas@ideasonboard.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/rcar-du/Kconfig | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/rcar-du/Kconfig b/drivers/gpu/drm/rcar-du/Kconfig index 0919f1f159a49..f65d1489dc509 100644 --- a/drivers/gpu/drm/rcar-du/Kconfig +++ b/drivers/gpu/drm/rcar-du/Kconfig @@ -31,6 +31,7 @@ config DRM_RCAR_DW_HDMI config DRM_RCAR_LVDS tristate "R-Car DU LVDS Encoder Support" depends on DRM && DRM_BRIDGE && OF + select DRM_KMS_HELPER select DRM_PANEL select OF_FLATTREE select OF_OVERLAY
From: Denis Efremov efremov@linux.com
commit 35f760b44b1b9cb16a306bdcc7220fbbf78c4789 upstream.
clk_s is checked twice in a row in ni_init_smc_spll_table(). fb_div should be checked instead.
Fixes: 69e0b57a91ad ("drm/radeon/kms: add dpm support for cayman (v5)") Cc: stable@vger.kernel.org Signed-off-by: Denis Efremov efremov@linux.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/radeon/ni_dpm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/radeon/ni_dpm.c b/drivers/gpu/drm/radeon/ni_dpm.c index b57c37ddd164c..c7fbb7932f379 100644 --- a/drivers/gpu/drm/radeon/ni_dpm.c +++ b/drivers/gpu/drm/radeon/ni_dpm.c @@ -2127,7 +2127,7 @@ static int ni_init_smc_spll_table(struct radeon_device *rdev) if (clk_s & ~(SMC_NISLANDS_SPLL_DIV_TABLE_CLKS_MASK >> SMC_NISLANDS_SPLL_DIV_TABLE_CLKS_SHIFT)) ret = -EINVAL;
- if (clk_s & ~(SMC_NISLANDS_SPLL_DIV_TABLE_CLKS_MASK >> SMC_NISLANDS_SPLL_DIV_TABLE_CLKS_SHIFT)) + if (fb_div & ~(SMC_NISLANDS_SPLL_DIV_TABLE_FBDIV_MASK >> SMC_NISLANDS_SPLL_DIV_TABLE_FBDIV_SHIFT)) ret = -EINVAL;
if (clk_v & ~(SMC_NISLANDS_SPLL_DIV_TABLE_CLKV_MASK >> SMC_NISLANDS_SPLL_DIV_TABLE_CLKV_SHIFT))
From: Daniel Vetter daniel.vetter@ffwll.ch
commit dc5bdb68b5b369d5bc7d1de96fa64cc1737a6320 upstream.
In the past we had a pile of hacks to orchestrate access between fbdev emulation and native kms clients. We've tried to streamline this, by always preferring the kms side above fbdev calls when a drm master exists, because drm master controls access to the display resources.
Unfortunately this breaks existing userspace, specifically Xorg. When exiting Xorg first restores the console to text mode using the KDSET ioctl on the vt. This does nothing, because a drm master is still around. Then it drops the drm master status, which again does nothing, because logind is keeping additional drm fd open to be able to orchestrate vt switches. In the past this is the point where fbdev was restored, as part of the ->lastclose hook on the drm side.
Now to fix this regression we don't want to go back to letting fbdev restore things whenever it feels like, or to the pile of hacks we've had before. Instead try and go with a minimal exception to make the KDSET case work again, and nothing else.
This means that if userspace does a KDSET call when switching between graphical compositors, there will be some flickering with fbcon showing up for a bit. But a) that's not a regression and b) userspace can fix it by improving the vt switching dance - logind should have all the information it needs.
While pondering all this I'm also wondering wheter we should have a SWITCH_MASTER ioctl to allow race-free master status handover. But that's for another day.
v2: Somehow forgot to cc all the fbdev people.
v3: Fix typo Alex spotted.
Reviewed-by: Alex Deucher alexander.deucher@amd.com Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=208179 Cc: shlomo@fastmail.com Reported-and-Tested-by: shlomo@fastmail.com Cc: Michel Dänzer michel@daenzer.net Fixes: 64914da24ea9 ("drm/fbdev-helper: don't force restores") Cc: Noralf Trønnes noralf@tronnes.org Cc: Thomas Zimmermann tzimmermann@suse.de Cc: Daniel Vetter daniel.vetter@intel.com Cc: Maarten Lankhorst maarten.lankhorst@linux.intel.com Cc: Maxime Ripard mripard@kernel.org Cc: David Airlie airlied@linux.ie Cc: Daniel Vetter daniel@ffwll.ch Cc: dri-devel@lists.freedesktop.org Cc: stable@vger.kernel.org # v5.7+ Cc: Bartlomiej Zolnierkiewicz b.zolnierkie@samsung.com Cc: Geert Uytterhoeven geert@linux-m68k.org Cc: Nathan Chancellor natechancellor@gmail.com Cc: Qiujun Huang hqjagain@gmail.com Cc: Peter Rosin peda@axentia.se Cc: linux-fbdev@vger.kernel.org Signed-off-by: Daniel Vetter daniel.vetter@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20200624092910.3280448-1-danie... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/drm_fb_helper.c | 63 +++++++++++++++++++++++++------- drivers/video/fbdev/core/fbcon.c | 3 +- include/uapi/linux/fb.h | 1 + 3 files changed, 52 insertions(+), 15 deletions(-)
diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c index a9771de4d17e6..c7be39a00d437 100644 --- a/drivers/gpu/drm/drm_fb_helper.c +++ b/drivers/gpu/drm/drm_fb_helper.c @@ -227,18 +227,9 @@ int drm_fb_helper_debug_leave(struct fb_info *info) } EXPORT_SYMBOL(drm_fb_helper_debug_leave);
-/** - * drm_fb_helper_restore_fbdev_mode_unlocked - restore fbdev configuration - * @fb_helper: driver-allocated fbdev helper, can be NULL - * - * This should be called from driver's drm &drm_driver.lastclose callback - * when implementing an fbcon on top of kms using this helper. This ensures that - * the user isn't greeted with a black screen when e.g. X dies. - * - * RETURNS: - * Zero if everything went ok, negative error code otherwise. - */ -int drm_fb_helper_restore_fbdev_mode_unlocked(struct drm_fb_helper *fb_helper) +static int +__drm_fb_helper_restore_fbdev_mode_unlocked(struct drm_fb_helper *fb_helper, + bool force) { bool do_delayed; int ret; @@ -250,7 +241,16 @@ int drm_fb_helper_restore_fbdev_mode_unlocked(struct drm_fb_helper *fb_helper) return 0;
mutex_lock(&fb_helper->lock); - ret = drm_client_modeset_commit(&fb_helper->client); + if (force) { + /* + * Yes this is the _locked version which expects the master lock + * to be held. But for forced restores we're intentionally + * racing here, see drm_fb_helper_set_par(). + */ + ret = drm_client_modeset_commit_locked(&fb_helper->client); + } else { + ret = drm_client_modeset_commit(&fb_helper->client); + }
do_delayed = fb_helper->delayed_hotplug; if (do_delayed) @@ -262,6 +262,22 @@ int drm_fb_helper_restore_fbdev_mode_unlocked(struct drm_fb_helper *fb_helper)
return ret; } + +/** + * drm_fb_helper_restore_fbdev_mode_unlocked - restore fbdev configuration + * @fb_helper: driver-allocated fbdev helper, can be NULL + * + * This should be called from driver's drm &drm_driver.lastclose callback + * when implementing an fbcon on top of kms using this helper. This ensures that + * the user isn't greeted with a black screen when e.g. X dies. + * + * RETURNS: + * Zero if everything went ok, negative error code otherwise. + */ +int drm_fb_helper_restore_fbdev_mode_unlocked(struct drm_fb_helper *fb_helper) +{ + return __drm_fb_helper_restore_fbdev_mode_unlocked(fb_helper, false); +} EXPORT_SYMBOL(drm_fb_helper_restore_fbdev_mode_unlocked);
#ifdef CONFIG_MAGIC_SYSRQ @@ -1310,6 +1326,7 @@ int drm_fb_helper_set_par(struct fb_info *info) { struct drm_fb_helper *fb_helper = info->par; struct fb_var_screeninfo *var = &info->var; + bool force;
if (oops_in_progress) return -EBUSY; @@ -1319,7 +1336,25 @@ int drm_fb_helper_set_par(struct fb_info *info) return -EINVAL; }
- drm_fb_helper_restore_fbdev_mode_unlocked(fb_helper); + /* + * Normally we want to make sure that a kms master takes precedence over + * fbdev, to avoid fbdev flickering and occasionally stealing the + * display status. But Xorg first sets the vt back to text mode using + * the KDSET IOCTL with KD_TEXT, and only after that drops the master + * status when exiting. + * + * In the past this was caught by drm_fb_helper_lastclose(), but on + * modern systems where logind always keeps a drm fd open to orchestrate + * the vt switching, this doesn't work. + * + * To not break the userspace ABI we have this special case here, which + * is only used for the above case. Everything else uses the normal + * commit function, which ensures that we never steal the display from + * an active drm master. + */ + force = var->activate & FB_ACTIVATE_KD_TEXT; + + __drm_fb_helper_restore_fbdev_mode_unlocked(fb_helper, force);
return 0; } diff --git a/drivers/video/fbdev/core/fbcon.c b/drivers/video/fbdev/core/fbcon.c index 9d28a8e3328fb..e2a490c5ae08f 100644 --- a/drivers/video/fbdev/core/fbcon.c +++ b/drivers/video/fbdev/core/fbcon.c @@ -2402,7 +2402,8 @@ static int fbcon_blank(struct vc_data *vc, int blank, int mode_switch) ops->graphics = 1;
if (!blank) { - var.activate = FB_ACTIVATE_NOW | FB_ACTIVATE_FORCE; + var.activate = FB_ACTIVATE_NOW | FB_ACTIVATE_FORCE | + FB_ACTIVATE_KD_TEXT; fb_set_var(info, &var); ops->graphics = 0; ops->var = info->var; diff --git a/include/uapi/linux/fb.h b/include/uapi/linux/fb.h index b6aac7ee1f670..4c14e8be72677 100644 --- a/include/uapi/linux/fb.h +++ b/include/uapi/linux/fb.h @@ -205,6 +205,7 @@ struct fb_bitfield { #define FB_ACTIVATE_ALL 64 /* change all VCs on this fb */ #define FB_ACTIVATE_FORCE 128 /* force apply even when no change*/ #define FB_ACTIVATE_INV_MODE 256 /* invalidate videomode */ +#define FB_ACTIVATE_KD_TEXT 512 /* for KDSET vt ioctl */
#define FB_ACCELF_TEXT 1 /* (OBSOLETE) see fb_info.flags and vc_mode */
From: Wenhui Sheng Wenhui.Sheng@amd.com
commit edfaf6fa73f15568d4337f208b2333f647c35810 upstream.
sdma fw isn't released when module exit
Reviewed-by: Hawking Zhang Hawking.Zhang@amd.com Signed-off-by: Wenhui Sheng Wenhui.Sheng@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c index d2840c2f62865..1dc57079933ce 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c @@ -1261,8 +1261,12 @@ static int sdma_v5_0_sw_fini(void *handle) struct amdgpu_device *adev = (struct amdgpu_device *)handle; int i;
- for (i = 0; i < adev->sdma.num_instances; i++) + for (i = 0; i < adev->sdma.num_instances; i++) { + if (adev->sdma.instance[i].fw != NULL) + release_firmware(adev->sdma.instance[i].fw); + amdgpu_ring_fini(&adev->sdma.instance[i].ring); + }
return 0; }
From: John van der Kamp sjonny@suffe.me.uk
commit ee434a4f9f5ea15b0f84bddd8c012838cf9472c5 upstream.
Make sure we pass through ret label to unlock the mutex.
Signed-off-by: John van der Kamp sjonny@suffe.me.uk Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c index dcf84a61de37f..949d10ef83040 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c @@ -510,8 +510,10 @@ static ssize_t srm_data_read(struct file *filp, struct kobject *kobj, struct bin
srm = psp_get_srm(work->hdcp.config.psp.handle, &srm_version, &srm_size);
- if (!srm) - return -EINVAL; + if (!srm) { + ret = -EINVAL; + goto ret; + }
if (pos >= srm_size) ret = 0;
From: Tomi Valkeinen tomi.valkeinen@ti.com
commit 8a4f5e1185db61bce6ce3a5dce6381a77bcf94e6 upstream.
Add connector type for newhaven_nhd_43_480272ef_atxl, as drm_panel_bridge_add() requires connector type to be set.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com Cc: stable@vger.kernel.org # v5.5+ Signed-off-by: Sam Ravnborg sam@ravnborg.org Link: https://patchwork.freedesktop.org/patch/msgid/20200609102809.753203-1-tomi.v... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/panel/panel-simple.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/panel/panel-simple.c b/drivers/gpu/drm/panel/panel-simple.c index 3ad828eaefe1c..00c1a8dc4ce8f 100644 --- a/drivers/gpu/drm/panel/panel-simple.c +++ b/drivers/gpu/drm/panel/panel-simple.c @@ -2465,6 +2465,7 @@ static const struct panel_desc newhaven_nhd_43_480272ef_atxl = { .bus_format = MEDIA_BUS_FMT_RGB888_1X24, .bus_flags = DRM_BUS_FLAG_DE_HIGH | DRM_BUS_FLAG_PIXDATA_DRIVE_POSEDGE | DRM_BUS_FLAG_SYNC_DRIVE_POSEDGE, + .connector_type = DRM_MODE_CONNECTOR_DPI, };
static const struct display_timing nlt_nl192108ac18_02d_timing = {
From: Adam Ford aford173@gmail.com
commit efb94790852ae673b18efde1b171d284689ff333 upstream.
The LogicPD Type28 display used by several Logic PD products has not worked since v5.6.
The connector type for the LogicPD Type 28 display is missing and drm_panel_bridge_add() requires connector type to be set.
Signed-off-by: Adam Ford aford173@gmail.com Fixes: 0d35408afbeb ("drm/panel: simple: Add Logic PD Type 28 display support") Cc: Adam Ford aford173@gmail.com Cc: Sam Ravnborg sam@ravnborg.org Cc: Thierry Reding thierry.reding@gmail.com Cc: dri-devel@lists.freedesktop.org Cc: stable@vger.kernel.org # v5.6+ Signed-off-by: Sam Ravnborg sam@ravnborg.org Link: https://patchwork.freedesktop.org/patch/msgid/20200615131934.12440-1-aford17... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/panel/panel-simple.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/panel/panel-simple.c b/drivers/gpu/drm/panel/panel-simple.c index 00c1a8dc4ce8f..db91b3c031a13 100644 --- a/drivers/gpu/drm/panel/panel-simple.c +++ b/drivers/gpu/drm/panel/panel-simple.c @@ -2297,6 +2297,7 @@ static const struct panel_desc logicpd_type_28 = { .bus_format = MEDIA_BUS_FMT_RGB888_1X24, .bus_flags = DRM_BUS_FLAG_DE_HIGH | DRM_BUS_FLAG_PIXDATA_DRIVE_POSEDGE | DRM_BUS_FLAG_SYNC_DRIVE_NEGEDGE, + .connector_type = DRM_MODE_CONNECTOR_DPI, };
static const struct panel_desc mitsubishi_aa070mc01 = {
From: Frieder Schrempf frieder.schrempf@kontron.de
commit 04a2c05179b732a4c097f0a9c701ef4c9a37e1e3 upstream.
The watchdog's WDOG_ANY signal is used to trigger a POR of the SoC, if a soft reset is issued. As the SoM hardware connects the WDOG_ANY and the POR signals, the watchdog node itself and the pin configuration should be part of the common SoM devicetree. Let's move it from the baseboard's devicetree to its proper place.
Fixes: 1ea4b76cdfde ("ARM: dts: imx6ul-kontron-n6310: Add Kontron i.MX6UL N6310 SoM and boards") Cc: stable@vger.kernel.org Signed-off-by: Frieder Schrempf frieder.schrempf@kontron.de Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/boot/dts/imx6ul-kontron-n6x1x-s.dtsi | 13 ------------- .../boot/dts/imx6ul-kontron-n6x1x-som-common.dtsi | 13 +++++++++++++ 2 files changed, 13 insertions(+), 13 deletions(-)
diff --git a/arch/arm/boot/dts/imx6ul-kontron-n6x1x-s.dtsi b/arch/arm/boot/dts/imx6ul-kontron-n6x1x-s.dtsi index f05e918412027..53a25fba34f69 100644 --- a/arch/arm/boot/dts/imx6ul-kontron-n6x1x-s.dtsi +++ b/arch/arm/boot/dts/imx6ul-kontron-n6x1x-s.dtsi @@ -232,13 +232,6 @@ status = "okay"; };
-&wdog1 { - pinctrl-names = "default"; - pinctrl-0 = <&pinctrl_wdog>; - fsl,ext-reset-output; - status = "okay"; -}; - &iomuxc { pinctrl-0 = <&pinctrl_reset_out &pinctrl_gpio>;
@@ -409,10 +402,4 @@ MX6UL_PAD_NAND_DATA03__USDHC2_DATA3 0x170f9 >; }; - - pinctrl_wdog: wdoggrp { - fsl,pins = < - MX6UL_PAD_GPIO1_IO09__WDOG1_WDOG_ANY 0x30b0 - >; - }; }; diff --git a/arch/arm/boot/dts/imx6ul-kontron-n6x1x-som-common.dtsi b/arch/arm/boot/dts/imx6ul-kontron-n6x1x-som-common.dtsi index a17af4d9bfdf9..fc316408721d0 100644 --- a/arch/arm/boot/dts/imx6ul-kontron-n6x1x-som-common.dtsi +++ b/arch/arm/boot/dts/imx6ul-kontron-n6x1x-som-common.dtsi @@ -57,6 +57,13 @@ status = "okay"; };
+&wdog1 { + pinctrl-names = "default"; + pinctrl-0 = <&pinctrl_wdog>; + fsl,ext-reset-output; + status = "okay"; +}; + &iomuxc { pinctrl-names = "default"; pinctrl-0 = <&pinctrl_reset_out>; @@ -106,4 +113,10 @@ MX6UL_PAD_SNVS_TAMPER9__GPIO5_IO09 0x1b0b0 >; }; + + pinctrl_wdog: wdoggrp { + fsl,pins = < + MX6UL_PAD_GPIO1_IO09__WDOG1_WDOG_ANY 0x30b0 + >; + }; };
From: Frieder Schrempf frieder.schrempf@kontron.de
commit d22a16cc92e04d053fd807ef3587e4f135e4206f upstream.
The WDOG_ANY signal is connected to the RESET_IN signal of the SoM and baseboard. It is currently configured as push-pull, which means that if some external device like a programmer wants to assert the RESET_IN signal by pulling it to ground, it drives against the high level WDOG_ANY output of the SoC.
To fix this we set the WDOG_ANY signal to open-drain configuration. That way we make sure that the RESET_IN can be asserted by the watchdog as well as by external devices.
Fixes: 1ea4b76cdfde ("ARM: dts: imx6ul-kontron-n6310: Add Kontron i.MX6UL N6310 SoM and boards") Cc: stable@vger.kernel.org Signed-off-by: Frieder Schrempf frieder.schrempf@kontron.de Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/boot/dts/imx6ul-kontron-n6x1x-som-common.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/imx6ul-kontron-n6x1x-som-common.dtsi b/arch/arm/boot/dts/imx6ul-kontron-n6x1x-som-common.dtsi index fc316408721d0..61ba21a605a81 100644 --- a/arch/arm/boot/dts/imx6ul-kontron-n6x1x-som-common.dtsi +++ b/arch/arm/boot/dts/imx6ul-kontron-n6x1x-som-common.dtsi @@ -116,7 +116,7 @@
pinctrl_wdog: wdoggrp { fsl,pins = < - MX6UL_PAD_GPIO1_IO09__WDOG1_WDOG_ANY 0x30b0 + MX6UL_PAD_GPIO1_IO09__WDOG1_WDOG_ANY 0x18b0 >; }; };
From: Dan Carpenter dan.carpenter@oracle.com
commit b65a2d8c8614386f7e8d38ea150749f8a862f431 upstream.
The "ie_len" variable is in the 0-255 range and it comes from the network. If it's over NDIS_802_11_LENGTH_RATES_EX (16) then that will lead to memory corruption.
Fixes: 554c0a3abf21 ("staging: Add rtl8723bs sdio wifi driver") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/20200603101958.GA1845750@mwanda Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/rtl8723bs/core/rtw_wlan_util.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/staging/rtl8723bs/core/rtw_wlan_util.c b/drivers/staging/rtl8723bs/core/rtw_wlan_util.c index 110338dbe3728..cc60f6a28d700 100644 --- a/drivers/staging/rtl8723bs/core/rtw_wlan_util.c +++ b/drivers/staging/rtl8723bs/core/rtw_wlan_util.c @@ -1830,12 +1830,14 @@ int update_sta_support_rate(struct adapter *padapter, u8 *pvar_ie, uint var_ie_l pIE = (struct ndis_80211_var_ie *)rtw_get_ie(pvar_ie, _SUPPORTEDRATES_IE_, &ie_len, var_ie_len); if (!pIE) return _FAIL; + if (ie_len > sizeof(pmlmeinfo->FW_sta_info[cam_idx].SupportedRates)) + return _FAIL;
memcpy(pmlmeinfo->FW_sta_info[cam_idx].SupportedRates, pIE->data, ie_len); supportRateNum = ie_len;
pIE = (struct ndis_80211_var_ie *)rtw_get_ie(pvar_ie, _EXT_SUPPORTEDRATES_IE_, &ie_len, var_ie_len); - if (pIE) + if (pIE && (ie_len <= sizeof(pmlmeinfo->FW_sta_info[cam_idx].SupportedRates) - supportRateNum)) memcpy((pmlmeinfo->FW_sta_info[cam_idx].SupportedRates + supportRateNum), pIE->data, ie_len);
return _SUCCESS;
From: Arseny Solokha asolokha@kb.kras.ru
commit 7e4773f73dcfb92e7e33532162f722ec291e75a4 upstream.
Building the current 5.8 kernel for an e500 machine with CONFIG_RANDOMIZE_BASE=y and CONFIG_BLOCK=n yields the following failure:
arch/powerpc/mm/nohash/kaslr_booke.c: In function 'kaslr_early_init': arch/powerpc/mm/nohash/kaslr_booke.c:387:2: error: implicit declaration of function 'flush_icache_range'; did you mean 'flush_tlb_range'?
Indeed, including asm/cacheflush.h into kaslr_booke.c fixes the build.
Fixes: 2b0e86cc5de6 ("powerpc/fsl_booke/32: implement KASLR infrastructure") Cc: stable@vger.kernel.org # v5.5+ Signed-off-by: Arseny Solokha asolokha@kb.kras.ru Reviewed-by: Jason Yan yanaijie@huawei.com Acked-by: Scott Wood oss@buserror.net [mpe: Tweak change log to mention CONFIG_BLOCK=n] Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20200613162801.1946619-1-asolokha@kb.kras.ru Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/mm/nohash/kaslr_booke.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/powerpc/mm/nohash/kaslr_booke.c b/arch/powerpc/mm/nohash/kaslr_booke.c index 4a75f2d9bf0e0..bce0e5349978f 100644 --- a/arch/powerpc/mm/nohash/kaslr_booke.c +++ b/arch/powerpc/mm/nohash/kaslr_booke.c @@ -14,6 +14,7 @@ #include <linux/memblock.h> #include <linux/libfdt.h> #include <linux/crash_core.h> +#include <asm/cacheflush.h> #include <asm/pgalloc.h> #include <asm/prom.h> #include <asm/kdump.h>
From: Vasily Averin vvs@virtuozzo.com
commit b7ade38165ca0001c5a3bd5314a314abbbfbb1b7 upstream.
__rpc_depopulate(gssd_dentry) was lost on error path
cc: stable@vger.kernel.org Fixes: commit 4b9a445e3eeb ("sunrpc: create a new dummy pipe for gssd to hold open") Signed-off-by: Vasily Averin vvs@virtuozzo.com Reviewed-by: Jeff Layton jlayton@redhat.com Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sunrpc/rpc_pipe.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c index 39e14d5edaf13..e9d0953522f09 100644 --- a/net/sunrpc/rpc_pipe.c +++ b/net/sunrpc/rpc_pipe.c @@ -1317,6 +1317,7 @@ rpc_gssd_dummy_populate(struct dentry *root, struct rpc_pipe *pipe_data) q.len = strlen(gssd_dummy_clnt_dir[0].name); clnt_dentry = d_hash_and_lookup(gssd_dentry, &q); if (!clnt_dentry) { + __rpc_depopulate(gssd_dentry, gssd_dummy_clnt_dir, 0, 1); pipe_dentry = ERR_PTR(-ENOENT); goto out; }
From: Chuck Lever chuck.lever@oracle.com
commit 89a3c9f5b9f0bcaa9aea3e8b2a616fcaea9aad78 upstream.
@subbuf is an output parameter of xdr_buf_subsegment(). A survey of call sites shows that @subbuf is always uninitialized before xdr_buf_segment() is invoked by callers.
There are some execution paths through xdr_buf_subsegment() that do not set all of the fields in @subbuf, leaving some pointer fields containing garbage addresses. Subsequent processing of that buffer then results in a page fault.
Signed-off-by: Chuck Lever chuck.lever@oracle.com Cc: stable@vger.kernel.org Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sunrpc/xdr.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c index 6f7d82fb1eb0a..be11d672b5b97 100644 --- a/net/sunrpc/xdr.c +++ b/net/sunrpc/xdr.c @@ -1118,6 +1118,7 @@ xdr_buf_subsegment(struct xdr_buf *buf, struct xdr_buf *subbuf, base = 0; } else { base -= buf->head[0].iov_len; + subbuf->head[0].iov_base = buf->head[0].iov_base; subbuf->head[0].iov_len = 0; }
@@ -1130,6 +1131,8 @@ xdr_buf_subsegment(struct xdr_buf *buf, struct xdr_buf *subbuf, base = 0; } else { base -= buf->page_len; + subbuf->pages = buf->pages; + subbuf->page_base = 0; subbuf->page_len = 0; }
@@ -1141,6 +1144,7 @@ xdr_buf_subsegment(struct xdr_buf *buf, struct xdr_buf *subbuf, base = 0; } else { base -= buf->tail[0].iov_len; + subbuf->tail[0].iov_base = buf->tail[0].iov_base; subbuf->tail[0].iov_len = 0; }
From: Trond Myklebust trond.myklebust@hammerspace.com
commit 8b04013737341442ed914b336cde866b902664ae upstream.
If the mirror count changes in the new layout we pick up inside ff_layout_pg_init_write(), then we can end up adding the request to the wrong mirror and corrupting the mirror->pg_list.
Fixes: d600ad1f2bdb ("NFS41: pop some layoutget errors to application") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfs/flexfilelayout/flexfilelayout.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c index 7d399f72ebbbf..de03e440b7eef 100644 --- a/fs/nfs/flexfilelayout/flexfilelayout.c +++ b/fs/nfs/flexfilelayout/flexfilelayout.c @@ -907,9 +907,8 @@ retry: goto out_mds;
/* Use a direct mapping of ds_idx to pgio mirror_idx */ - if (WARN_ON_ONCE(pgio->pg_mirror_count != - FF_LAYOUT_MIRROR_COUNT(pgio->pg_lseg))) - goto out_mds; + if (pgio->pg_mirror_count != FF_LAYOUT_MIRROR_COUNT(pgio->pg_lseg)) + goto out_eagain;
for (i = 0; i < pgio->pg_mirror_count; i++) { mirror = FF_LAYOUT_COMP(pgio->pg_lseg, i); @@ -931,7 +930,10 @@ retry: (NFS_MOUNT_SOFT|NFS_MOUNT_SOFTERR)) pgio->pg_maxretrans = io_maxretrans; return; - +out_eagain: + pnfs_generic_pg_cleanup(pgio); + pgio->pg_error = -EAGAIN; + return; out_mds: trace_pnfs_mds_fallback_pg_init_write(pgio->pg_inode, 0, NFS4_MAX_UINT64, IOMODE_RW, @@ -941,6 +943,7 @@ out_mds: pgio->pg_lseg = NULL; pgio->pg_maxretrans = 0; nfs_pageio_reset_write_mds(pgio); + pgio->pg_error = -EAGAIN; }
static unsigned int
From: Olga Kornievskaia olga.kornievskaia@gmail.com
commit d03727b248d0dae6199569a8d7b629a681154633 upstream.
Figuring out the root case for the REMOVE/CLOSE race and suggesting the solution was done by Neil Brown.
Currently what happens is that direct IO calls hold a reference on the open context which is decremented as an asynchronous task in the nfs_direct_complete(). Before reference is decremented, control is returned to the application which is free to close the file. When close is being processed, it decrements its reference on the open_context but since directIO still holds one, it doesn't sent a close on the wire. It returns control to the application which is free to do other operations. For instance, it can delete a file. Direct IO is finally releasing its reference and triggering an asynchronous close. Which races with the REMOVE. On the server, REMOVE can be processed before the CLOSE, failing the REMOVE with EACCES as the file is still opened.
Signed-off-by: Olga Kornievskaia kolga@netapp.com Suggested-by: Neil Brown neilb@suse.com CC: stable@vger.kernel.org Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfs/direct.c | 13 +++++++++---- fs/nfs/file.c | 1 + 2 files changed, 10 insertions(+), 4 deletions(-)
diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c index d49b1d1979084..f0c3f0123131e 100644 --- a/fs/nfs/direct.c +++ b/fs/nfs/direct.c @@ -267,8 +267,6 @@ static void nfs_direct_complete(struct nfs_direct_req *dreq) { struct inode *inode = dreq->inode;
- inode_dio_end(inode); - if (dreq->iocb) { long res = (long) dreq->error; if (dreq->count != 0) { @@ -280,7 +278,10 @@ static void nfs_direct_complete(struct nfs_direct_req *dreq)
complete(&dreq->completion);
+ igrab(inode); nfs_direct_req_release(dreq); + inode_dio_end(inode); + iput(inode); }
static void nfs_direct_read_completion(struct nfs_pgio_header *hdr) @@ -410,8 +411,10 @@ static ssize_t nfs_direct_read_schedule_iovec(struct nfs_direct_req *dreq, * generic layer handle the completion. */ if (requested_bytes == 0) { - inode_dio_end(inode); + igrab(inode); nfs_direct_req_release(dreq); + inode_dio_end(inode); + iput(inode); return result < 0 ? result : -EIO; }
@@ -864,8 +867,10 @@ static ssize_t nfs_direct_write_schedule_iovec(struct nfs_direct_req *dreq, * generic layer handle the completion. */ if (requested_bytes == 0) { - inode_dio_end(inode); + igrab(inode); nfs_direct_req_release(dreq); + inode_dio_end(inode); + iput(inode); return result < 0 ? result : -EIO; }
diff --git a/fs/nfs/file.c b/fs/nfs/file.c index f96367a2463e3..ccd6c1637b270 100644 --- a/fs/nfs/file.c +++ b/fs/nfs/file.c @@ -83,6 +83,7 @@ nfs_file_release(struct inode *inode, struct file *filp) dprintk("NFS: release(%pD2)\n", filp);
nfs_inc_stats(inode, NFSIOS_VFSRELEASE); + inode_dio_wait(inode); nfs_file_clear_open_context(filp); return 0; }
From: Borislav Petkov bp@suse.de
commit ee470bb25d0dcdf126f586ec0ae6dca66cb340a4 upstream.
Commit:
da92110dfdfa ("EDAC, amd64_edac: Extend scrub rate support to F15hM60h")
added support for F15h, model 0x60 CPUs but in doing so, missed to read back SCRCTRL PCI config register on F15h CPUs which are *not* model 0x60. Add that read so that doing
$ cat /sys/devices/system/edac/mc/mc0/sdram_scrub_rate
can show the previously set DRAM scrub rate.
Fixes: da92110dfdfa ("EDAC, amd64_edac: Extend scrub rate support to F15hM60h") Reported-by: Anders Andersson pipatron@gmail.com Signed-off-by: Borislav Petkov bp@suse.de Cc: stable@vger.kernel.org #v4.4.. Link: https://lkml.kernel.org/r/CAKkunMbNWppx_i6xSdDHLseA2QQmGJqj_crY=NF-GZML5np4V... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/edac/amd64_edac.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c index 4e9994de0b900..0d89c3e473bdc 100644 --- a/drivers/edac/amd64_edac.c +++ b/drivers/edac/amd64_edac.c @@ -272,6 +272,8 @@ static int get_scrub_rate(struct mem_ctl_info *mci)
if (pvt->model == 0x60) amd64_read_pci_cfg(pvt->F2, F15H_M60H_SCRCTRL, &scrubval); + else + amd64_read_pci_cfg(pvt->F3, SCRCTRL, &scrubval); } else { amd64_read_pci_cfg(pvt->F3, SCRCTRL, &scrubval); }
From: Chuck Lever chuck.lever@oracle.com
commit 7b2182ec381f8ea15c7eb1266d6b5d7da620ad93 upstream.
The RPC client currently doesn't handle ERR_CHUNK replies correctly. rpcrdma_complete_rqst() incorrectly passes a negative number to xprt_complete_rqst() as the number of bytes copied. Instead, set task->tk_status to the error value, and return zero bytes copied.
In these cases, return -EIO rather than -EREMOTEIO. The RPC client's finite state machine doesn't know what to do with -EREMOTEIO.
Additional clean ups: - Don't double-count RDMA_ERROR replies - Remove a stale comment
Signed-off-by: Chuck Lever chuck.lever@oracle.com Cc: stable@kernel.vger.org Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sunrpc/xprtrdma/rpc_rdma.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c index 3c627dc685cc8..57118e342c8eb 100644 --- a/net/sunrpc/xprtrdma/rpc_rdma.c +++ b/net/sunrpc/xprtrdma/rpc_rdma.c @@ -1349,8 +1349,7 @@ rpcrdma_decode_error(struct rpcrdma_xprt *r_xprt, struct rpcrdma_rep *rep, be32_to_cpup(p), be32_to_cpu(rep->rr_xid)); }
- r_xprt->rx_stats.bad_reply_count++; - return -EREMOTEIO; + return -EIO; }
/* Perform XID lookup, reconstruction of the RPC reply, and @@ -1387,13 +1386,11 @@ out: spin_unlock(&xprt->queue_lock); return;
-/* If the incoming reply terminated a pending RPC, the next - * RPC call will post a replacement receive buffer as it is - * being marshaled. - */ out_badheader: trace_xprtrdma_reply_hdr(rep); r_xprt->rx_stats.bad_reply_count++; + rqst->rq_task->tk_status = status; + status = 0; goto out; }
From: Huaisheng Ye yehs1@lenovo.com
commit 39495b12ef1cf602e6abd350dce2ef4199906531 upstream.
When uncommitted entry has been discarded, correct wc->uncommitted_block for getting the exact number.
Fixes: 48debafe4f2fe ("dm: add writecache target") Cc: stable@vger.kernel.org Signed-off-by: Huaisheng Ye yehs1@lenovo.com Acked-by: Mikulas Patocka mpatocka@redhat.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/md/dm-writecache.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c index 613c171b1b6d2..856b3515654ff 100644 --- a/drivers/md/dm-writecache.c +++ b/drivers/md/dm-writecache.c @@ -857,6 +857,8 @@ static void writecache_discard(struct dm_writecache *wc, sector_t start, sector_ writecache_wait_for_ios(wc, WRITE); discarded_something = true; } + if (!writecache_entry_is_committed(wc, e)) + wc->uncommitted_blocks--; writecache_free_entry(wc, e); }
From: Mikulas Patocka mpatocka@redhat.com
commit d35bd764e6899a7bea71958f08d16cea5bfa1919 upstream.
Add cond_resched() to a loop that fills in the mapper memory area because the loop can be executed many times.
Fixes: 48debafe4f2fe ("dm: add writecache target") Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka mpatocka@redhat.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/md/dm-writecache.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c index 856b3515654ff..5cc94f57421ca 100644 --- a/drivers/md/dm-writecache.c +++ b/drivers/md/dm-writecache.c @@ -286,6 +286,8 @@ static int persistent_memory_claim(struct dm_writecache *wc) while (daa-- && i < p) { pages[i++] = pfn_t_to_page(pfn); pfn.val++; + if (!(i & 15)) + cond_resched(); } } while (i < p); wc->memory_map = vmap(pages, p, VM_MAP, PAGE_KERNEL);
--- Makefile | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/Makefile b/Makefile index f928cd1dfdc17..a07567cf0754f 100644 --- a/Makefile +++ b/Makefile @@ -1,8 +1,8 @@ # SPDX-License-Identifier: GPL-2.0 VERSION = 5 PATCHLEVEL = 7 -SUBLEVEL = 6 -EXTRAVERSION = +SUBLEVEL = 7 +EXTRAVERSION = -rc1 NAME = Kleptomaniac Octopus
# *DOCUMENTATION*
Hi Sasha,
On 6/29/20 9:13 AM, Sasha Levin wrote:
This is the start of the stable review cycle for the 5.7.7 release. There are 265 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 01 Jul 2020 03:14:48 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/p...
Looks like patch naming convention has changed. My scripts look for the following convention Greg uses. Are you planning to use the above going forward? My scripts failed looking for the usual naming convention.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.7.6-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.7.y and the diffstat can be found below.
thanks, -- Shuah
On Mon, Jun 29, 2020 at 02:37:53PM -0600, Shuah Khan wrote:
Hi Sasha,
On 6/29/20 9:13 AM, Sasha Levin wrote:
This is the start of the stable review cycle for the 5.7.7 release. There are 265 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 01 Jul 2020 03:14:48 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/p...
Looks like patch naming convention has changed. My scripts look for the following convention Greg uses. Are you planning to use the above going forward? My scripts failed looking for the usual naming convention.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.7.6-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.7.y and the diffstat can be found below.
Sorry for that. I was hoping to avoid using the signed upload mechanism Greg was using by simply pointing the links to automatically generated patches on cgit (the git.kernel.org interface).
Would it be ok to change the pattern matching here? Something like this should work for both Greg's format and my own (and whatever may come next):
grep -A1 "The whole patch series can be found in one patch at:" | tail -n1 | sed 's/\t//'
On Mon, Jun 29, 2020 at 07:18:26PM -0400, Sasha Levin wrote:
On Mon, Jun 29, 2020 at 02:37:53PM -0600, Shuah Khan wrote:
Hi Sasha,
On 6/29/20 9:13 AM, Sasha Levin wrote:
This is the start of the stable review cycle for the 5.7.7 release. There are 265 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 01 Jul 2020 03:14:48 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/p...
Looks like patch naming convention has changed. My scripts look for the following convention Greg uses. Are you planning to use the above going forward? My scripts failed looking for the usual naming convention.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.7.6-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.7.y and the diffstat can be found below.
Sorry for that. I was hoping to avoid using the signed upload mechanism Greg was using by simply pointing the links to automatically generated patches on cgit (the git.kernel.org interface).
Would it be ok to change the pattern matching here? Something like this should work for both Greg's format and my own (and whatever may come next):
grep -A1 "The whole patch series can be found in one patch at:" | tail -n1 | sed 's/\t//'
If those don't work, I can still push out -rc1 patches.
It might be best given that the above -rc.git tree is unstable and can, and will, change, and patches stored on kernel.org will not.
thanks,
greg k-h
On Tue, Jun 30, 2020 at 10:38:45AM +0200, Greg Kroah-Hartman wrote:
On Mon, Jun 29, 2020 at 07:18:26PM -0400, Sasha Levin wrote:
On Mon, Jun 29, 2020 at 02:37:53PM -0600, Shuah Khan wrote:
Hi Sasha,
On 6/29/20 9:13 AM, Sasha Levin wrote:
This is the start of the stable review cycle for the 5.7.7 release. There are 265 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 01 Jul 2020 03:14:48 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/p...
Looks like patch naming convention has changed. My scripts look for the following convention Greg uses. Are you planning to use the above going forward? My scripts failed looking for the usual naming convention.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.7.6-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.7.y and the diffstat can be found below.
Sorry for that. I was hoping to avoid using the signed upload mechanism Greg was using by simply pointing the links to automatically generated patches on cgit (the git.kernel.org interface).
Would it be ok to change the pattern matching here? Something like this should work for both Greg's format and my own (and whatever may come next):
grep -A1 "The whole patch series can be found in one patch at:" | tail -n1 | sed 's/\t//'
If those don't work, I can still push out -rc1 patches.
It might be best given that the above -rc.git tree is unstable and can, and will, change, and patches stored on kernel.org will not.
That's a good point. Maybe we should push tags for -rc releases too? that would allow us to keep stable links using the git.kernel.org interface.
Hello!
On Tue, 30 Jun 2020 at 10:12, Sasha Levin sashal@kernel.org wrote:
On Tue, Jun 30, 2020 at 10:38:45AM +0200, Greg Kroah-Hartman wrote:
On Mon, Jun 29, 2020 at 07:18:26PM -0400, Sasha Levin wrote:
On Mon, Jun 29, 2020 at 02:37:53PM -0600, Shuah Khan wrote:
Hi Sasha,
On 6/29/20 9:13 AM, Sasha Levin wrote:
This is the start of the stable review cycle for the 5.7.7 release. There are 265 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 01 Jul 2020 03:14:48 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/p...
Looks like patch naming convention has changed. My scripts look for the following convention Greg uses. Are you planning to use the above going forward? My scripts failed looking for the usual naming convention.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.7.6-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.7.y and the diffstat can be found below.
Sorry for that. I was hoping to avoid using the signed upload mechanism Greg was using by simply pointing the links to automatically generated patches on cgit (the git.kernel.org interface).
Would it be ok to change the pattern matching here? Something like this should work for both Greg's format and my own (and whatever may come next):
grep -A1 "The whole patch series can be found in one patch at:" | tail -n1 | sed 's/\t//'
If those don't work, I can still push out -rc1 patches.
It might be best given that the above -rc.git tree is unstable and can, and will, change, and patches stored on kernel.org will not.
That's a good point. Maybe we should push tags for -rc releases too?
That would be GREAT for those CI's or processes looking for a definite trigger to use.
Greetings!
Daniel Díaz daniel.diaz@linaro.org
On Tue, Jun 30, 2020 at 11:12:48AM -0400, Sasha Levin wrote:
On Tue, Jun 30, 2020 at 10:38:45AM +0200, Greg Kroah-Hartman wrote:
On Mon, Jun 29, 2020 at 07:18:26PM -0400, Sasha Levin wrote:
On Mon, Jun 29, 2020 at 02:37:53PM -0600, Shuah Khan wrote:
Hi Sasha,
On 6/29/20 9:13 AM, Sasha Levin wrote:
This is the start of the stable review cycle for the 5.7.7 release. There are 265 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 01 Jul 2020 03:14:48 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/p...
Looks like patch naming convention has changed. My scripts look for the following convention Greg uses. Are you planning to use the above going forward? My scripts failed looking for the usual naming convention.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.7.6-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.7.y and the diffstat can be found below.
Sorry for that. I was hoping to avoid using the signed upload mechanism Greg was using by simply pointing the links to automatically generated patches on cgit (the git.kernel.org interface).
Would it be ok to change the pattern matching here? Something like this should work for both Greg's format and my own (and whatever may come next):
grep -A1 "The whole patch series can be found in one patch at:" | tail -n1 | sed 's/\t//'
If those don't work, I can still push out -rc1 patches.
It might be best given that the above -rc.git tree is unstable and can, and will, change, and patches stored on kernel.org will not.
That's a good point. Maybe we should push tags for -rc releases too? that would allow us to keep stable links using the git.kernel.org interface.
If we really want to do this, then yes, we could. But that kind of goes against what we (well I) have been doing in the past with that tree...
thanks,
greg k-h
On Tue, Jun 30, 2020 at 05:33:25PM +0200, Greg Kroah-Hartman wrote:
On Tue, Jun 30, 2020 at 11:12:48AM -0400, Sasha Levin wrote:
On Tue, Jun 30, 2020 at 10:38:45AM +0200, Greg Kroah-Hartman wrote:
On Mon, Jun 29, 2020 at 07:18:26PM -0400, Sasha Levin wrote:
On Mon, Jun 29, 2020 at 02:37:53PM -0600, Shuah Khan wrote:
Hi Sasha,
On 6/29/20 9:13 AM, Sasha Levin wrote:
This is the start of the stable review cycle for the 5.7.7 release. There are 265 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 01 Jul 2020 03:14:48 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/p...
Looks like patch naming convention has changed. My scripts look for the following convention Greg uses. Are you planning to use the above going forward? My scripts failed looking for the usual naming convention.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.7.6-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.7.y and the diffstat can be found below.
Sorry for that. I was hoping to avoid using the signed upload mechanism Greg was using by simply pointing the links to automatically generated patches on cgit (the git.kernel.org interface).
Would it be ok to change the pattern matching here? Something like this should work for both Greg's format and my own (and whatever may come next):
grep -A1 "The whole patch series can be found in one patch at:" | tail -n1 | sed 's/\t//'
If those don't work, I can still push out -rc1 patches.
It might be best given that the above -rc.git tree is unstable and can, and will, change, and patches stored on kernel.org will not.
That's a good point. Maybe we should push tags for -rc releases too? that would allow us to keep stable links using the git.kernel.org interface.
If we really want to do this, then yes, we could. But that kind of goes against what we (well I) have been doing in the past with that tree...
We've been force pushing it, yes, but if we add tags it would just keep older version around so I don't think it would change much for our workflow, but it would make it easy to get to older versions.
On 7/1/20 8:40 AM, Sasha Levin wrote:
On Tue, Jun 30, 2020 at 05:33:25PM +0200, Greg Kroah-Hartman wrote:
On Tue, Jun 30, 2020 at 11:12:48AM -0400, Sasha Levin wrote:
On Tue, Jun 30, 2020 at 10:38:45AM +0200, Greg Kroah-Hartman wrote:
On Mon, Jun 29, 2020 at 07:18:26PM -0400, Sasha Levin wrote:
On Mon, Jun 29, 2020 at 02:37:53PM -0600, Shuah Khan wrote:
Hi Sasha,
On 6/29/20 9:13 AM, Sasha Levin wrote: > > This is the start of the stable review cycle for the 5.7.7
release.
> There are 265 patches in this series, all will be posted as a
response
> to this one. If anyone has any issues with these being
applied, please
> let me know. > > Responses should be made by Wed 01 Jul 2020 03:14:48 PM UTC. > Anything received after that time might be too late. > > The whole patch series can be found in one patch at: >
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/p...
>
Looks like patch naming convention has changed. My scripts look for the following convention Greg uses. Are you planning to use the above going forward? My scripts failed looking for the usual naming convention.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.7.6-rc1.g...
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.7.y
and the diffstat can be found below.
Sorry for that. I was hoping to avoid using the signed upload
mechanism
Greg was using by simply pointing the links to automatically
generated
patches on cgit (the git.kernel.org interface).
Would it be ok to change the pattern matching here? Something
like this
should work for both Greg's format and my own (and whatever may come next):
grep -A1 "The whole patch series can be found in one patch
at:" | tail -n1 | sed 's/\t//'
If those don't work, I can still push out -rc1 patches.
It might be best given that the above -rc.git tree is unstable and
can,
and will, change, and patches stored on kernel.org will not.
That's a good point. Maybe we should push tags for -rc releases too? that would allow us to keep stable links using the git.kernel.org interface.
If we really want to do this, then yes, we could. But that kind of goes against what we (well I) have been doing in the past with that tree...
We've been force pushing it, yes, but if we add tags it would just keep older version around so I don't think it would change much for our workflow, but it would make it easy to get to older versions.
I can make changes to my scripts as long as the process is consistent for stable releases. Let mw know what you come up with, I will make adjustments to my scripts.
thanks, -- Shuah
On Mon, 29 Jun 2020 at 20:48, Sasha Levin sashal@kernel.org wrote:
This is the start of the stable review cycle for the 5.7.7 release. There are 265 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 01 Jul 2020 03:14:48 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/p...
or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.7.y and the diffstat can be found below.
-- Thanks, Sasha
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Summary ------------------------------------------------------------------------
kernel: 5.7.7-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-5.7.y git commit: 97943c6d41ef2b05f4e064eb49a538ff4b405809 git describe: v5.7.6-265-g97943c6d41ef Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-5.7-oe/build/v5.7.6-265-g...
No regressions (compared to build v5.7.6)
No fixes (compared to build v5.7.6)
Ran 36511 total tests in the following environments and test suites.
Environments -------------- - dragonboard-410c - hi6220-hikey - i386 - juno-r2 - juno-r2-compat - juno-r2-kasan - nxp-ls2088 - qemu_arm - qemu_arm64 - qemu_i386 - qemu_x86_64 - x15 - x86 - x86-kasan
Test Suites ----------- * build * install-android-platform-tools-r2600 * install-android-platform-tools-r2800 * kselftest * kselftest/drivers * kselftest/filesystems * kselftest/net * libhugetlbfs * linux-log-parser * ltp-containers-tests * ltp-cve-tests * ltp-dio-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-mm-tests * ltp-sched-tests * ltp-syscalls-tests * perf * v4l2-compliance * kvm-unit-tests * ltp-cap_bounds-tests * ltp-commands-tests * ltp-cpuhotplug-tests * ltp-crypto-tests * ltp-math-tests * network-basic-tests * ltp-controllers-tests * ltp-nptl-tests * ltp-open-posix-tests * ltp-pty-tests * ltp-securebits-tests * kselftest-vsyscall-mode-native * kselftest-vsyscall-mode-native/drivers * kselftest-vsyscall-mode-native/filesystems * kselftest-vsyscall-mode-native/net * kselftest-vsyscall-mode-none * kselftest-vsyscall-mode-none/drivers * kselftest-vsyscall-mode-none/filesystems * kselftest-vsyscall-mode-none/net
On Tue, Jun 30, 2020 at 10:44:12AM +0530, Naresh Kamboju wrote:
On Mon, 29 Jun 2020 at 20:48, Sasha Levin sashal@kernel.org wrote:
This is the start of the stable review cycle for the 5.7.7 release. There are 265 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 01 Jul 2020 03:14:48 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/p...
or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.7.y and the diffstat can be found below.
-- Thanks, Sasha
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Thanks for testing!
On 29/06/2020 16:13, Sasha Levin wrote:
This is the start of the stable review cycle for the 5.7.7 release. There are 265 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 01 Jul 2020 03:14:48 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/p...
or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.7.y and the diffstat can be found below.
-- Thanks, Sasha
All tests are passing for Tegra ...
Test results for stable-v5.7: 11 builds: 11 pass, 0 fail 26 boots: 26 pass, 0 fail 56 tests: 56 pass, 0 fail
Linux version: 5.7.7-rc1-g97943c6d41ef Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Cheers Jon
On Tue, Jun 30, 2020 at 10:15:31AM +0100, Jon Hunter wrote:
On 29/06/2020 16:13, Sasha Levin wrote:
This is the start of the stable review cycle for the 5.7.7 release. There are 265 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 01 Jul 2020 03:14:48 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/p...
or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.7.y and the diffstat can be found below.
-- Thanks, Sasha
All tests are passing for Tegra ...
Test results for stable-v5.7: 11 builds: 11 pass, 0 fail 26 boots: 26 pass, 0 fail 56 tests: 56 pass, 0 fail
Linux version: 5.7.7-rc1-g97943c6d41ef Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Thanks for the testing Jon!
On Mon, Jun 29, 2020 at 11:13:53AM -0400, Sasha Levin wrote:
This is the start of the stable review cycle for the 5.7.7 release. There are 265 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 01 Jul 2020 03:14:48 PM UTC. Anything received after that time might be too late.
Build results: total: 155 pass: 155 fail: 0 Qemu test results: total: 430 pass: 430 fail: 0
Guenter
On Tue, Jun 30, 2020 at 10:22:51AM -0700, Guenter Roeck wrote:
On Mon, Jun 29, 2020 at 11:13:53AM -0400, Sasha Levin wrote:
This is the start of the stable review cycle for the 5.7.7 release. There are 265 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 01 Jul 2020 03:14:48 PM UTC. Anything received after that time might be too late.
Build results: total: 155 pass: 155 fail: 0 Qemu test results: total: 430 pass: 430 fail: 0
Thanks Guenter!
linux-stable-mirror@lists.linaro.org