This is the start of the stable review cycle for the 5.10.30 release. There are 188 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 14 Apr 2021 08:39:44 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.30-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.10.30-rc1
Vlad Buslov vladbu@nvidia.com Revert "net: sched: bump refcount for new action in ACT replace mode"
Alexander Aring aahringo@redhat.com net: ieee802154: stop dump llsec params for monitors
Alexander Aring aahringo@redhat.com net: ieee802154: forbid monitor for del llsec seclevel
Alexander Aring aahringo@redhat.com net: ieee802154: forbid monitor for set llsec params
Alexander Aring aahringo@redhat.com net: ieee802154: fix nl802154 del llsec devkey
Alexander Aring aahringo@redhat.com net: ieee802154: fix nl802154 add llsec key
Alexander Aring aahringo@redhat.com net: ieee802154: fix nl802154 del llsec dev
Alexander Aring aahringo@redhat.com net: ieee802154: fix nl802154 del llsec key
Alexander Aring aahringo@redhat.com net: ieee802154: nl-mac: fix check on panid
Pavel Skripkin paskripkin@gmail.com net: mac802154: Fix general protection fault
Pavel Skripkin paskripkin@gmail.com drivers: net: fix memory leak in peak_usb_create_dev
Pavel Skripkin paskripkin@gmail.com drivers: net: fix memory leak in atusb_probe
Phillip Potter phil@philpotter.co.uk net: tun: set tun->dev->addr_len during TUNSETLINK processing
Du Cheng ducheng2@gmail.com cfg80211: remove WARN_ON() in cfg80211_sme_connect
Andy Shevchenko andriy.shevchenko@linux.intel.com gpiolib: Read "gpio-line-names" from a firmware node
Kumar Kartikeya Dwivedi memxor@gmail.com net: sched: bump refcount for new action in ACT replace mode
Rafał Miłecki rafal@milecki.pl dt-bindings: net: ethernet-controller: fix typo in NVMEM
Arnd Bergmann arnd@arndb.de lockdep: Address clang -Wformat warning printing for %hd
Krzysztof Kozlowski krzysztof.kozlowski@canonical.com clk: socfpga: fix iomem pointer cast on 64-bit
William Roche william.roche@oracle.com RAS/CEC: Correct ce_add_elem()'s returned values
Eli Cohen elic@nvidia.com vdpa/mlx5: Fix wrong use of bit numbers
Si-Wei Liu si-wei.liu@oracle.com vdpa/mlx5: should exclude header length and fcs from mtu
Leon Romanovsky leon@kernel.org RDMA/addr: Be strict with gid size
Grzegorz Siwik grzegorz.siwik@intel.com i40e: Fix parameters in aq_get_phy_register()
Dom Cobley popcornmix@gmail.com drm/vc4: crtc: Reduce PV fifo threshold on hvs4
Kamal Heib kamalheib1@gmail.com RDMA/qedr: Fix kernel panic when trying to access recv_cq
Jin Yao yao.jin@linux.intel.com perf report: Fix wrong LBR block sorting
Potnuri Bharat Teja bharat@chelsio.com RDMA/cxgb4: check for ipv6 address properly while destroying listener
Aya Levin ayal@nvidia.com net/mlx5: Fix PBMC register mapping
Aya Levin ayal@nvidia.com net/mlx5: Fix PPLM register mapping
Raed Salem raeds@nvidia.com net/mlx5: Fix placement of log_max_flow_counter
Guangbin Huang huangguangbin2@huawei.com net: hns3: clear VF down state bit before request link status
Xin Long lucien.xin@gmail.com tipc: increment the tmp aead refcnt before attaching it
Marc Kleine-Budde mkl@pengutronix.de can: mcp251x: fix support for half duplex SPI host controllers
Luca Coelho luciano.coelho@intel.com iwlwifi: fix 11ax disabled bit in the regulatory capability flags
Andy Shevchenko andriy.shevchenko@linux.intel.com i2c: designware: Adjust bus_freq_hz when refuse high speed mode set
Ilya Maximets i.maximets@ovn.org openvswitch: fix send of uninitialized stack memory in ct limit reply
Zheng Yongjun zhengyongjun3@huawei.com net: openvswitch: conntrack: simplify the return expression of ovs_ct_limit_get_default_limit()
Adrian Hunter adrian.hunter@intel.com perf inject: Fix repipe usage
Alexander Gordeev agordeev@linux.ibm.com s390/cpcmd: fix inline assembly register clobbering
Zqiang qiang.zhang@windriver.com workqueue: Move the position of debug_work_activate() in __queue_work()
Lukasz Bartosik lb@semihalf.com clk: fix invalid usage of list cursor in unregister
Lukasz Bartosik lb@semihalf.com clk: fix invalid usage of list cursor in register
Claudiu Beznea claudiu.beznea@microchip.com net: macb: restore cmp registers on resume path
Yunjian Wang wangyunjian@huawei.com net: cls_api: Fix uninitialised struct field bo->unlocked_driver_cb
Can Guo cang@codeaurora.org scsi: ufs: core: Fix wrong Task Tag used in task management request UPIUs
Can Guo cang@codeaurora.org scsi: ufs: core: Fix task management request completion timeout
Paolo Abeni pabeni@redhat.com mptcp: forbit mcast-related sockopt on MPTCP sockets
Norman Maurer norman_maurer@apple.com net: udp: Add support for getsockopt(..., ..., UDP_GRO, ..., ...);
Stephen Boyd swboyd@chromium.org drm/msm: Set drvdata to NULL when msm_drm_init() fails
Md Haris Iqbal haris.iqbal@cloud.ionos.com RDMA/rtrs-clt: Close rtrs client conn before destroying rtrs clt session files
Eryk Rybak eryk.roch.rybak@intel.com i40e: Fix display statistics for veb_tc
Arnd Bergmann arnd@arndb.de soc/fsl: qbman: fix conflicting alignment attributes
Ong Boon Leong boon.leong.ong@intel.com xdp: fix xdp_return_frame() kernel BUG throw for page_pool memory model
Lv Yunlong lyl2019@mail.ustc.edu.cn net/rds: Fix a use after free in rds_message_map_pages
Daniel Jurgens danielj@mellanox.com net/mlx5: Don't request more than supported EQs
Aya Levin ayal@nvidia.com net/mlx5e: Fix ethtool indication of connector type
Ariel Levkovich lariel@nvidia.com net/mlx5e: Fix mapping of ct_label zero
Bastian Germann bage@linutronix.de ASoC: sunxi: sun4i-codec: fill ASoC card owner
周琰杰 (Zhou Yanjie) zhouyanjie@wanyeetech.com I2C: JZ4780: Fix bug for Ingenic X1000.
Florian Fainelli f.fainelli@gmail.com net: phy: broadcom: Only advertise EEE for supported modes
Yinjun Zhang yinjun.zhang@corigine.com nfp: flower: ignore duplicate merge hints from FW
Loic Poulain loic.poulain@linaro.org net: qrtr: Fix memory leak on qrtr_tx_wait failure
Milton Miller miltonm@us.ibm.com net/ncsi: Avoid channel_monitor hrtimer deadlock
Stefan Riedmueller s.riedmueller@phytec.de ARM: dts: imx6: pbab01: Set vmmc supply for both SD interfaces
Lv Yunlong lyl2019@mail.ustc.edu.cn net:tipc: Fix a double free in tipc_sk_mcast_rcv
Rahul Lakkireddy rahul.lakkireddy@chelsio.com cxgb4: avoid collecting SGE_QBASE regs during traffic
Maxim Kochetkov fido_max@inbox.ru net: dsa: Fix type was not set for devlink port
Claudiu Manoil claudiu.manoil@nxp.com gianfar: Handle error code at MAC address change
Lv Yunlong lyl2019@mail.ustc.edu.cn ethernet: myri10ge: Fix a use after free in myri10ge_sw_tso
Ido Schimmel idosch@nvidia.com mlxsw: spectrum: Fix ECN marking in tunnel decapsulation
Oliver Hartkopp socketcan@hartkopp.net can: isotp: fix msg_namelen values depending on CAN_REQUIRED_SIZE
Oliver Hartkopp socketcan@hartkopp.net can: bcm/raw: fix msg_namelen values depending on CAN_REQUIRED_SIZE
Steffen Klassert steffen.klassert@secunet.com xfrm: Provide private skb extensions for segmented and hw offloaded ESP packets
Oliver Stäbler oliver.staebler@bytesatwork.ch arm64: dts: imx8mm/q: Fix pad control of SD1_DATA0
Lv Yunlong lyl2019@mail.ustc.edu.cn drivers/net/wan/hdlc_fr: Fix a double free in pvc_xmit
Eric Dumazet edumazet@google.com sch_red: fix off-by-one checks in red_check_params()
Antoine Tenart atenart@kernel.org geneve: do not modify the shared tunnel info when PMTU triggers an ICMP reply
Antoine Tenart atenart@kernel.org vxlan: do not modify the shared tunnel info when PMTU triggers an ICMP reply
Shyam Sundar S K Shyam-sundar.S-k@amd.com amd-xgbe: Update DMA coherency values
Al Viro viro@zeniv.linux.org.uk hostfs: fix memory handling in follow_link()
Eryk Rybak eryk.roch.rybak@intel.com i40e: Fix kernel oops when i40e driver removes VF's
Mateusz Palczewski mateusz.palczewski@intel.com i40e: Added Asym_Pause to supported link modes
Norbert Ciosek norbertx.ciosek@intel.com virtchnl: Fix layout of RSS structures
Steffen Klassert steffen.klassert@secunet.com xfrm: Fix NULL pointer dereference on policy lookup
Shengjiu Wang shengjiu.wang@nxp.com ASoC: wm8960: Fix wrong bclk and lrclk with pll enabled for some chips
Guennadi Liakhovetski guennadi.liakhovetski@linux.intel.com ASoC: SOF: Intel: HDA: fix core status verification
Xin Long lucien.xin@gmail.com esp: delete NETIF_F_SCTP_CRC bit from features for esp offload
Ahmed S. Darwish a.darwish@linutronix.de net: xfrm: Localize sequence counter per network namespace
Carlos Leija cileija@ti.com ARM: OMAP4: PM: update ROM return address for OSWR and OFF
Tony Lindgren tony@atomide.com ARM: OMAP4: Fix PMIC voltage domains for bionic
Geert Uytterhoeven geert+renesas@glider.be regulator: bd9571mwv: Fix AVS and DVFS voltage range
Arnd Bergmann arnd@arndb.de remoteproc: qcom: pil_info: avoid 64-bit division
Evan Nimmo evan.nimmo@alliedtelesis.co.nz xfrm: Use actual socket sk instead of skb socket for xfrm_output_resume
Eyal Birger eyal.birger@gmail.com xfrm: interface: fix ipv4 pmtu check to honor ip header df
Chinh T Cao chinh.t.cao@intel.com ice: Recognize 860 as iSCSI port in CEE mode
Chinh T Cao chinh.t.cao@intel.com ice: Refactor DCB related variables out of the ice_port_info struct
Vlad Buslov vladbu@nvidia.com net: sched: fix err handler in tcf_action_init()
Paolo Bonzini pbonzini@redhat.com KVM: x86/mmu: preserve pending TLB flush across calls to kvm_tdp_mmu_zap_sp
Sean Christopherson seanjc@google.com KVM: x86/mmu: Don't allow TDP MMU to yield when recovering NX pages
Sean Christopherson seanjc@google.com KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping
Sean Christopherson seanjc@google.com KVM: x86/mmu: Ensure TLBs are flushed when yielding during GFN range zap
Ben Gardon bgardon@google.com KVM: x86/mmu: Yield in TDU MMU iter even if no SPTES changed
Ben Gardon bgardon@google.com KVM: x86/mmu: Ensure forward progress when yielding in TDP MMU iter
Ben Gardon bgardon@google.com KVM: x86/mmu: Rename goal_gfn to next_last_level_gfn
Ben Gardon bgardon@google.com KVM: x86/mmu: Merge flush and non-flush tdp_mmu_iter_cond_resched
Ben Gardon bgardon@google.com KVM: x86/mmu: change TDP MMU yield function returns to match cond_resched
Wolfram Sang wsa+renesas@sang-engineering.com i2c: turn recovery error on init to debug
Roman Gushchin guro@fb.com percpu: make pcpu_nr_empty_pop_pages per chunk type
Roman Bolshakov r.bolshakov@yadro.com scsi: target: iscsi: Fix zero tag inside a trace event
Viswas G Viswas.G@microchip.com scsi: pm80xx: Fix chip initialization failure
Saravana Kannan saravanak@google.com driver core: Fix locking bug in deferred_probe_timeout_work_func()
Shuah Khan skhan@linuxfoundation.org usbip: synchronize event handler with sysfs code paths
Shuah Khan skhan@linuxfoundation.org usbip: vudc synchronize sysfs code paths
Shuah Khan skhan@linuxfoundation.org usbip: stub-dev synchronize sysfs code paths
Shuah Khan skhan@linuxfoundation.org usbip: add sysfs_lock to synchronize sysfs code paths
Dan Carpenter dan.carpenter@oracle.com thunderbolt: Fix off by one in tb_port_find_retimer()
Dan Carpenter dan.carpenter@oracle.com thunderbolt: Fix a leak in tb_retimer_add()
Paolo Abeni pabeni@redhat.com net: let skb_orphan_partial wake-up waiters.
Maciej Żenczykowski maze@google.com net-ipv6: bugfix - raw & sctp - switch to ipv6_can_nonlocal_bind()
Kurt Kanzenbach kurt@linutronix.de net: hsr: Reset MAC header for Tx path
Johannes Berg johannes.berg@intel.com mac80211: fix TXQ AC confusion
Ben Greear greearb@candelatech.com mac80211: fix time-is-after bug in mlme
Johannes Berg johannes.berg@intel.com cfg80211: check S1G beacon compat element length
Johannes Berg johannes.berg@intel.com nl80211: fix potential leak of ACL params
Johannes Berg johannes.berg@intel.com nl80211: fix beacon head validation
Vlad Buslov vladbu@nvidia.com net: sched: fix action overwrite reference counting
Pavel Tikhomirov ptikhomirov@virtuozzo.com net: sched: sch_teql: fix null-pointer dereference
Eli Cohen elic@nvidia.com vdpa/mlx5: Fix suspend/resume index restoration
Arkadiusz Kubalewski arkadiusz.kubalewski@intel.com i40e: Fix sparse errors in i40e_txrx.c
Arkadiusz Kubalewski arkadiusz.kubalewski@intel.com i40e: Fix sparse error: uninitialized symbol 'ring'
Arkadiusz Kubalewski arkadiusz.kubalewski@intel.com i40e: Fix sparse error: 'vsi->netdev' could be null
Arkadiusz Kubalewski arkadiusz.kubalewski@intel.com i40e: Fix sparse warning: missing error code 'err'
Eric Dumazet edumazet@google.com virtio_net: Do not pull payload in skb->head
Eric Dumazet edumazet@google.com net: ensure mac header is set in virtio_net_hdr_to_skb()
John Fastabend john.fastabend@gmail.com bpf, sockmap: Fix incorrect fwd_alloc accounting
John Fastabend john.fastabend@gmail.com bpf, sockmap: Fix sk->prot unhash op reset
Dave Marchevsky davemarchevsky@fb.com bpf: Refcount task stack in bpf_get_task_stack
Ciara Loftus ciara.loftus@intel.com libbpf: Only create rx and tx XDP rings when necessary
Ciara Loftus ciara.loftus@intel.com libbpf: Restore umem state after socket create failure
Ciara Loftus ciara.loftus@intel.com libbpf: Ensure umem pointer is non-NULL before dereferencing
Lv Yunlong lyl2019@mail.ustc.edu.cn ethernet/netronome/nfp: Fix a use after free in nfp_bpf_ctrl_msg_rx
Lorenz Bauer lmb@cloudflare.com bpf: link: Refuse non-O_RDWR flags in BPF_OBJ_GET
Toke Høiland-Jørgensen toke@redhat.com bpf: Enforce that struct_ops programs be GPL-only
Pedro Tammela pctammela@gmail.com libbpf: Fix bail out from 'ringbuf_process_ring()' on error
Anirudh Rayabharam mail@anirudhrb.com net: hso: fix null-ptr-deref during tty device unregistration
Yongxin Liu yongxin.liu@windriver.com ice: fix memory leak of aRFS after resuming from suspend
Johannes Berg johannes.berg@intel.com iwlwifi: pcie: properly set LTR workarounds on 22000 devices
Robert Malz robertx.malz@intel.com ice: Cleanup fltr list in case of allocation issues
Anirudh Venkataramanan anirudh.venkataramanan@intel.com ice: Use port number instead of PF ID for WoL
Jacek Bułatek jacekx.bulatek@intel.com ice: Fix for dereference of NULL pointer
Dave Ertman david.m.ertman@intel.com ice: remove DCBNL_DEVRESET bit from PF state
Bruce Allan bruce.w.allan@intel.com ice: fix memory allocation call
Krzysztof Goreczny krzysztof.goreczny@intel.com ice: prevent ice_open and ice_stop during reset
Fabio Pricoco fabio.pricoco@intel.com ice: Increase control queue timeout
Anirudh Venkataramanan anirudh.venkataramanan@intel.com ice: Continue probe on link/PHY errors
Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp batman-adv: initialize "struct batadv_tvlv_tt_vlan_data"->reserved field
Marek Behún kabel@kernel.org ARM: dts: turris-omnia: configure LED[2]/INTn pin as interrupt pin
Gao Xiang hsiangkao@redhat.com parisc: avoid a warning on u8 cast for cmpxchg on u8 pointers
Helge Deller deller@gmx.de parisc: parisc-agp requires SBA IOMMU driver
Ilya Lipnitskiy ilya.lipnitskiy@gmail.com of: property: fw_devlink: do not link ".*,nr-gpios"
Wong Vee Khee vee.khee.wong@linux.intel.com ethtool: fix incorrect datatype in set_eee ops
Jack Qiu jack.qiu@huawei.com fs: direct-io: fix missing sdio->boundary
Wengang Wang wen.gang.wang@oracle.com ocfs2: fix deadlock between setattr and dio_end_io_write
Mike Rapoport rppt@kernel.org nds32: flush_dcache_page: use page_mapping_file to avoid races with swapoff
Sergei Trofimovich slyfox@gentoo.org ia64: fix user_stack_pointer() for ptrace()
Nick Desaulniers ndesaulniers@google.com gcov: re-fix clang-11+ support
Al Viro viro@zeniv.linux.org.uk LOOKUP_MOUNTPOINT: we are cleaning "jumped" flag too late
Mike Marciniszyn mike.marciniszyn@cornelisnetworks.com IB/hfi1: Fix probe time panic when AIP is enabled with a buggy BIOS
Vitaly Kuznetsov vkuznets@redhat.com ACPI: processor: Fix build when CONFIG_ACPI_PROCESSOR=m
Takashi Iwai tiwai@suse.de drm/i915: Fix invalid access to ACPI _DSM objects
Martin Blumenstingl martin.blumenstingl@googlemail.com net: dsa: lantiq_gswip: Configure all remaining GSWIP_MII_CFG bits
Martin Blumenstingl martin.blumenstingl@googlemail.com net: dsa: lantiq_gswip: Don't use PHY auto polling
Martin Blumenstingl martin.blumenstingl@googlemail.com net: dsa: lantiq_gswip: Let GSWIP automatically set the xMII clock
Muhammad Usama Anjum musamaanjum@gmail.com net: ipv6: check for validity before dereferencing cfg->fc_nlinfo.nlh
Luca Fancellu luca.fancellu@arm.com xen/evtchn: Change irq_info lock to raw_spinlock_t
Ondrej Mosnacek omosnace@redhat.com selinux: fix race between old and new sidtab
Ondrej Mosnacek omosnace@redhat.com selinux: fix cond_list corruption when changing booleans
Ondrej Mosnacek omosnace@redhat.com selinux: make nslot handling in avtab more robust
Xiaoming Ni nixiaoming@huawei.com nfc: Avoid endless loops caused by repeated llcp_sock_connect()
Xiaoming Ni nixiaoming@huawei.com nfc: fix memory leak in llcp_sock_connect()
Xiaoming Ni nixiaoming@huawei.com nfc: fix refcount leak in llcp_sock_connect()
Xiaoming Ni nixiaoming@huawei.com nfc: fix refcount leak in llcp_sock_bind()
Hans de Goede hdegoede@redhat.com ASoC: intel: atom: Stop advertising non working S24LE support
Takashi Iwai tiwai@suse.de ALSA: hda/conexant: Apply quirk for another HP ZBook G5 model
Takashi Iwai tiwai@suse.de ALSA: hda/realtek: Fix speaker amp setup on Acer Aspire E1
Jonas Holmberg jonashg@axis.com ALSA: aloop: Fix initialization of controls
Dmitry Safonov 0x7f454c46@gmail.com xfrm/compat: Cleanup WARN()s that can be user-triggered
-------------
Diffstat:
.../bindings/net/ethernet-controller.yaml | 2 +- Makefile | 4 +- arch/arm/boot/dts/armada-385-turris-omnia.dts | 1 + arch/arm/boot/dts/imx6qdl-phytec-pfla02.dtsi | 2 + arch/arm/mach-omap2/omap-secure.c | 39 +++++ arch/arm/mach-omap2/omap-secure.h | 1 + arch/arm/mach-omap2/pmic-cpcap.c | 4 +- arch/arm64/boot/dts/freescale/imx8mm-pinfunc.h | 2 +- arch/arm64/boot/dts/freescale/imx8mq-pinfunc.h | 2 +- arch/ia64/include/asm/ptrace.h | 8 +- arch/nds32/mm/cacheflush.c | 2 +- arch/parisc/include/asm/cmpxchg.h | 2 +- arch/s390/kernel/cpcmd.c | 6 +- arch/x86/include/asm/smp.h | 2 +- arch/x86/kernel/smpboot.c | 26 ++- arch/x86/kvm/mmu/mmu.c | 13 +- arch/x86/kvm/mmu/tdp_iter.c | 30 +--- arch/x86/kvm/mmu/tdp_iter.h | 11 +- arch/x86/kvm/mmu/tdp_mmu.c | 99 +++++++---- arch/x86/kvm/mmu/tdp_mmu.h | 18 +- drivers/acpi/processor_idle.c | 4 +- drivers/base/dd.c | 8 +- drivers/char/agp/Kconfig | 2 +- drivers/clk/clk.c | 47 +++-- drivers/clk/socfpga/clk-gate.c | 2 +- drivers/gpio/gpiolib.c | 12 +- drivers/gpu/drm/i915/display/intel_acpi.c | 22 ++- drivers/gpu/drm/msm/msm_drv.c | 1 + drivers/gpu/drm/vc4/vc4_crtc.c | 17 ++ drivers/i2c/busses/i2c-designware-master.c | 1 + drivers/i2c/busses/i2c-jz4780.c | 4 +- drivers/i2c/i2c-core-base.c | 7 +- drivers/infiniband/core/addr.c | 4 +- drivers/infiniband/hw/cxgb4/cm.c | 3 +- drivers/infiniband/hw/hfi1/affinity.c | 21 +-- drivers/infiniband/hw/hfi1/hfi.h | 1 + drivers/infiniband/hw/hfi1/init.c | 10 +- drivers/infiniband/hw/hfi1/netdev_rx.c | 3 +- drivers/infiniband/hw/qedr/verbs.c | 3 +- drivers/infiniband/ulp/rtrs/rtrs-clt.c | 2 +- drivers/net/can/spi/mcp251x.c | 24 ++- drivers/net/can/usb/peak_usb/pcan_usb_core.c | 6 +- drivers/net/dsa/lantiq_gswip.c | 195 ++++++++++++++++++--- drivers/net/ethernet/amd/xgbe/xgbe.h | 6 +- drivers/net/ethernet/cadence/macb_main.c | 7 + drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c | 23 ++- drivers/net/ethernet/chelsio/cxgb4/t4_hw.c | 3 +- drivers/net/ethernet/freescale/gianfar.c | 6 +- .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 4 +- drivers/net/ethernet/intel/i40e/i40e.h | 1 + drivers/net/ethernet/intel/i40e/i40e_debugfs.c | 3 + drivers/net/ethernet/intel/i40e/i40e_ethtool.c | 55 +++++- drivers/net/ethernet/intel/i40e/i40e_main.c | 11 +- drivers/net/ethernet/intel/i40e/i40e_txrx.c | 12 +- drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 9 + drivers/net/ethernet/intel/ice/ice.h | 4 +- drivers/net/ethernet/intel/ice/ice_common.c | 2 +- drivers/net/ethernet/intel/ice/ice_controlq.h | 4 +- drivers/net/ethernet/intel/ice/ice_dcb.c | 76 +++++--- drivers/net/ethernet/intel/ice/ice_dcb_lib.c | 47 ++--- drivers/net/ethernet/intel/ice/ice_dcb_nl.c | 52 +++--- drivers/net/ethernet/intel/ice/ice_ethtool.c | 8 +- drivers/net/ethernet/intel/ice/ice_lib.c | 7 +- drivers/net/ethernet/intel/ice/ice_main.c | 53 ++++-- drivers/net/ethernet/intel/ice/ice_switch.c | 15 +- drivers/net/ethernet/intel/ice/ice_txrx.c | 2 +- drivers/net/ethernet/intel/ice/ice_type.h | 17 +- drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c | 36 +++- .../net/ethernet/mellanox/mlx5/core/en_ethtool.c | 22 +-- drivers/net/ethernet/mellanox/mlx5/core/eq.c | 13 +- drivers/net/ethernet/mellanox/mlxsw/spectrum.h | 15 ++ .../net/ethernet/mellanox/mlxsw/spectrum_ipip.c | 7 +- drivers/net/ethernet/mellanox/mlxsw/spectrum_nve.c | 7 +- drivers/net/ethernet/myricom/myri10ge/myri10ge.c | 2 +- drivers/net/ethernet/netronome/nfp/bpf/cmsg.c | 1 + drivers/net/ethernet/netronome/nfp/flower/main.h | 8 + .../net/ethernet/netronome/nfp/flower/metadata.c | 16 +- .../net/ethernet/netronome/nfp/flower/offload.c | 48 ++++- drivers/net/geneve.c | 24 ++- drivers/net/ieee802154/atusb.c | 1 + drivers/net/phy/bcm-phy-lib.c | 13 +- drivers/net/tun.c | 48 +++++ drivers/net/usb/hso.c | 33 ++-- drivers/net/virtio_net.c | 10 +- drivers/net/vxlan.c | 18 +- drivers/net/wan/hdlc_fr.c | 5 +- drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.c | 2 +- .../wireless/intel/iwlwifi/pcie/ctxt-info-gen3.c | 31 +--- .../net/wireless/intel/iwlwifi/pcie/ctxt-info.c | 3 +- .../net/wireless/intel/iwlwifi/pcie/trans-gen2.c | 35 ++++ drivers/of/property.c | 11 +- drivers/ras/cec.c | 15 +- drivers/regulator/bd9571mwv-regulator.c | 4 +- drivers/remoteproc/qcom_pil_info.c | 2 +- drivers/scsi/pm8001/pm8001_hwi.c | 8 +- drivers/scsi/ufs/ufshcd.c | 31 ++-- drivers/soc/fsl/qbman/qman.c | 2 +- drivers/target/iscsi/iscsi_target.c | 3 +- drivers/thunderbolt/retimer.c | 4 +- drivers/usb/usbip/stub_dev.c | 11 +- drivers/usb/usbip/usbip_common.h | 3 + drivers/usb/usbip/usbip_event.c | 2 + drivers/usb/usbip/vhci_hcd.c | 1 + drivers/usb/usbip/vhci_sysfs.c | 30 +++- drivers/usb/usbip/vudc_dev.c | 1 + drivers/usb/usbip/vudc_sysfs.c | 5 + drivers/vdpa/mlx5/core/mlx5_vdpa.h | 4 + drivers/vdpa/mlx5/net/mlx5_vnet.c | 40 +++-- drivers/xen/events/events_base.c | 12 +- fs/direct-io.c | 5 +- fs/hostfs/hostfs_kern.c | 7 +- fs/namei.c | 8 +- fs/ocfs2/aops.c | 11 +- fs/ocfs2/file.c | 8 +- include/linux/avf/virtchnl.h | 2 - include/linux/mlx5/mlx5_ifc.h | 10 +- include/linux/skmsg.h | 7 +- include/linux/virtio_net.h | 16 +- include/net/act_api.h | 12 +- include/net/netns/xfrm.h | 4 +- include/net/red.h | 4 +- include/net/sock.h | 9 + include/net/xfrm.h | 4 +- kernel/bpf/inode.c | 2 +- kernel/bpf/stackmap.c | 12 +- kernel/bpf/verifier.c | 5 + kernel/gcov/clang.c | 29 +-- kernel/locking/lockdep.c | 2 +- kernel/workqueue.c | 2 +- mm/percpu-internal.h | 2 +- mm/percpu-stats.c | 9 +- mm/percpu.c | 14 +- net/batman-adv/translation-table.c | 2 + net/can/bcm.c | 10 +- net/can/isotp.c | 11 +- net/can/raw.c | 14 +- net/core/skmsg.c | 12 +- net/core/sock.c | 12 +- net/core/xdp.c | 3 +- net/dsa/dsa2.c | 8 +- net/ethtool/eee.c | 4 +- net/hsr/hsr_device.c | 1 + net/hsr/hsr_forward.c | 6 - net/ieee802154/nl-mac.c | 7 +- net/ieee802154/nl802154.c | 23 ++- net/ipv4/ah4.c | 2 +- net/ipv4/esp4.c | 2 +- net/ipv4/esp4_offload.c | 17 +- net/ipv4/udp.c | 4 + net/ipv6/ah6.c | 2 +- net/ipv6/esp6.c | 2 +- net/ipv6/esp6_offload.c | 17 +- net/ipv6/raw.c | 2 +- net/ipv6/route.c | 8 +- net/mac80211/mlme.c | 5 +- net/mac80211/tx.c | 2 +- net/mac802154/llsec.c | 2 +- net/mptcp/protocol.c | 45 +++++ net/ncsi/ncsi-manage.c | 20 ++- net/nfc/llcp_sock.c | 10 ++ net/openvswitch/conntrack.c | 14 +- net/qrtr/qrtr.c | 5 +- net/rds/message.c | 3 +- net/sched/act_api.c | 48 +++-- net/sched/cls_api.c | 16 +- net/sched/sch_teql.c | 3 + net/sctp/ipv6.c | 7 +- net/tipc/crypto.c | 3 +- net/tipc/socket.c | 2 +- net/wireless/nl80211.c | 10 +- net/wireless/scan.c | 14 +- net/wireless/sme.c | 2 +- net/xfrm/xfrm_compat.c | 12 +- net/xfrm/xfrm_device.c | 2 - net/xfrm/xfrm_interface.c | 3 + net/xfrm/xfrm_output.c | 10 +- net/xfrm/xfrm_state.c | 10 +- security/selinux/ss/avtab.c | 101 ++++------- security/selinux/ss/avtab.h | 2 +- security/selinux/ss/conditional.c | 12 +- security/selinux/ss/services.c | 157 +++++++++++++---- security/selinux/ss/sidtab.c | 21 +++ security/selinux/ss/sidtab.h | 4 + sound/drivers/aloop.c | 11 +- sound/pci/hda/patch_conexant.c | 1 + sound/pci/hda/patch_realtek.c | 16 ++ sound/soc/codecs/wm8960.c | 8 +- sound/soc/intel/atom/sst-mfld-platform-pcm.c | 6 +- sound/soc/sof/intel/hda-dsp.c | 15 +- sound/soc/sunxi/sun4i-codec.c | 5 + tools/lib/bpf/ringbuf.c | 2 +- tools/lib/bpf/xsk.c | 57 +++--- tools/perf/builtin-inject.c | 2 +- tools/perf/util/block-info.c | 6 +- 194 files changed, 1844 insertions(+), 870 deletions(-)
From: Dmitry Safonov dima@arista.com
commit ef19e111337f6c3dca7019a8bad5fbc6fb18d635 upstream.
Replace WARN_ONCE() that can be triggered from userspace with pr_warn_once(). Those still give user a hint what's the issue.
I've left WARN()s that are not possible to trigger with current code-base and that would mean that the code has issues: - relying on current compat_msg_min[type] <= xfrm_msg_min[type] - expected 4-byte padding size difference between compat_msg_min[type] and xfrm_msg_min[type] - compat_policy[type].len <= xfrma_policy[type].len (for every type)
Reported-by: syzbot+834ffd1afc7212eb8147@syzkaller.appspotmail.com Fixes: 5f3eea6b7e8f ("xfrm/compat: Attach xfrm dumps to 64=>32 bit translator") Cc: "David S. Miller" davem@davemloft.net Cc: Eric Dumazet eric.dumazet@gmail.com Cc: Herbert Xu herbert@gondor.apana.org.au Cc: Jakub Kicinski kuba@kernel.org Cc: Steffen Klassert steffen.klassert@secunet.com Cc: netdev@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Dmitry Safonov dima@arista.com Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/xfrm/xfrm_compat.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
--- a/net/xfrm/xfrm_compat.c +++ b/net/xfrm/xfrm_compat.c @@ -216,7 +216,7 @@ static struct nlmsghdr *xfrm_nlmsg_put_c case XFRM_MSG_GETSADINFO: case XFRM_MSG_GETSPDINFO: default: - WARN_ONCE(1, "unsupported nlmsg_type %d", nlh_src->nlmsg_type); + pr_warn_once("unsupported nlmsg_type %d\n", nlh_src->nlmsg_type); return ERR_PTR(-EOPNOTSUPP); }
@@ -277,7 +277,7 @@ static int xfrm_xlate64_attr(struct sk_b return xfrm_nla_cpy(dst, src, nla_len(src)); default: BUILD_BUG_ON(XFRMA_MAX != XFRMA_IF_ID); - WARN_ONCE(1, "unsupported nla_type %d", src->nla_type); + pr_warn_once("unsupported nla_type %d\n", src->nla_type); return -EOPNOTSUPP; } } @@ -315,8 +315,10 @@ static int xfrm_alloc_compat(struct sk_b struct sk_buff *new = NULL; int err;
- if (WARN_ON_ONCE(type >= ARRAY_SIZE(xfrm_msg_min))) + if (type >= ARRAY_SIZE(xfrm_msg_min)) { + pr_warn_once("unsupported nlmsg_type %d\n", nlh_src->nlmsg_type); return -EOPNOTSUPP; + }
if (skb_shinfo(skb)->frag_list == NULL) { new = alloc_skb(skb->len + skb_tailroom(skb), GFP_ATOMIC); @@ -378,6 +380,10 @@ static int xfrm_attr_cpy32(void *dst, si struct nlmsghdr *nlmsg = dst; struct nlattr *nla;
+ /* xfrm_user_rcv_msg_compat() relies on fact that 32-bit messages + * have the same len or shorted than 64-bit ones. + * 32-bit translation that is bigger than 64-bit original is unexpected. + */ if (WARN_ON_ONCE(copy_len > payload)) copy_len = payload;
From: Jonas Holmberg jonashg@axis.com
commit 168632a495f49f33a18c2d502fc249d7610375e9 upstream.
Add a control to the card before copying the id so that the numid field is initialized in the copy. Otherwise the numid field of active_id, format_id, rate_id and channels_id will be the same (0) and snd_ctl_notify() will not queue the events properly.
Signed-off-by: Jonas Holmberg jonashg@axis.com Reviewed-by: Jaroslav Kysela perex@perex.cz Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210407075428.2666787-1-jonashg@axis.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/drivers/aloop.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
--- a/sound/drivers/aloop.c +++ b/sound/drivers/aloop.c @@ -1572,6 +1572,14 @@ static int loopback_mixer_new(struct loo return -ENOMEM; kctl->id.device = dev; kctl->id.subdevice = substr; + + /* Add the control before copying the id so that + * the numid field of the id is set in the copy. + */ + err = snd_ctl_add(card, kctl); + if (err < 0) + return err; + switch (idx) { case ACTIVE_IDX: setup->active_id = kctl->id; @@ -1588,9 +1596,6 @@ static int loopback_mixer_new(struct loo default: break; } - err = snd_ctl_add(card, kctl); - if (err < 0) - return err; } } }
From: Takashi Iwai tiwai@suse.de
commit c8426b2700b57d2760ff335840a02f66a64b6044 upstream.
We've got a report about Acer Aspire E1 (PCI SSID 1025:0840) that loses the speaker output after resume. With the comparison of COEF dumps, it was identified that the COEF 0x0d bits 0x6000 corresponds to the speaker amp.
This patch adds the specific quirk for the device to restore the COEF bits at the codec (re-)initialization.
BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1183869 Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210407095730.12560-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -3927,6 +3927,15 @@ static void alc271_fixup_dmic(struct hda snd_hda_sequence_write(codec, verbs); }
+/* Fix the speaker amp after resume, etc */ +static void alc269vb_fixup_aspire_e1_coef(struct hda_codec *codec, + const struct hda_fixup *fix, + int action) +{ + if (action == HDA_FIXUP_ACT_INIT) + alc_update_coef_idx(codec, 0x0d, 0x6000, 0x6000); +} + static void alc269_fixup_pcm_44k(struct hda_codec *codec, const struct hda_fixup *fix, int action) { @@ -6301,6 +6310,7 @@ enum { ALC283_FIXUP_HEADSET_MIC, ALC255_FIXUP_MIC_MUTE_LED, ALC282_FIXUP_ASPIRE_V5_PINS, + ALC269VB_FIXUP_ASPIRE_E1_COEF, ALC280_FIXUP_HP_GPIO4, ALC286_FIXUP_HP_GPIO_LED, ALC280_FIXUP_HP_GPIO2_MIC_HOTKEY, @@ -6979,6 +6989,10 @@ static const struct hda_fixup alc269_fix { }, }, }, + [ALC269VB_FIXUP_ASPIRE_E1_COEF] = { + .type = HDA_FIXUP_FUNC, + .v.func = alc269vb_fixup_aspire_e1_coef, + }, [ALC280_FIXUP_HP_GPIO4] = { .type = HDA_FIXUP_FUNC, .v.func = alc280_fixup_hp_gpio4, @@ -7901,6 +7915,7 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x1025, 0x0762, "Acer Aspire E1-472", ALC271_FIXUP_HP_GATE_MIC_JACK_E1_572), SND_PCI_QUIRK(0x1025, 0x0775, "Acer Aspire E1-572", ALC271_FIXUP_HP_GATE_MIC_JACK_E1_572), SND_PCI_QUIRK(0x1025, 0x079b, "Acer Aspire V5-573G", ALC282_FIXUP_ASPIRE_V5_PINS), + SND_PCI_QUIRK(0x1025, 0x0840, "Acer Aspire E1", ALC269VB_FIXUP_ASPIRE_E1_COEF), SND_PCI_QUIRK(0x1025, 0x101c, "Acer Veriton N2510G", ALC269_FIXUP_LIFEBOOK), SND_PCI_QUIRK(0x1025, 0x102b, "Acer Aspire C24-860", ALC286_FIXUP_ACER_AIO_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1025, 0x1065, "Acer Aspire C20-820", ALC269VC_FIXUP_ACER_HEADSET_MIC), @@ -8395,6 +8410,7 @@ static const struct hda_model_fixup alc2 {.id = ALC283_FIXUP_HEADSET_MIC, .name = "alc283-headset"}, {.id = ALC255_FIXUP_MIC_MUTE_LED, .name = "alc255-dell-mute"}, {.id = ALC282_FIXUP_ASPIRE_V5_PINS, .name = "aspire-v5"}, + {.id = ALC269VB_FIXUP_ASPIRE_E1_COEF, .name = "aspire-e1-coef"}, {.id = ALC280_FIXUP_HP_GPIO4, .name = "hp-gpio4"}, {.id = ALC286_FIXUP_HP_GPIO_LED, .name = "hp-gpio-led"}, {.id = ALC280_FIXUP_HP_GPIO2_MIC_HOTKEY, .name = "hp-gpio2-hotkey"},
From: Takashi Iwai tiwai@suse.de
commit c6423ed2da6214a68527446b5f8e09cf7162b2ce upstream.
There is another HP ZBook G5 model with the PCI SSID 103c:844f that requires the same quirk for controlling the mute LED. Add the corresponding entry to the quirk table.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=212407 Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210401171314.667-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_conexant.c | 1 + 1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_conexant.c +++ b/sound/pci/hda/patch_conexant.c @@ -944,6 +944,7 @@ static const struct snd_pci_quirk cxt506 SND_PCI_QUIRK(0x103c, 0x829a, "HP 800 G3 DM", CXT_FIXUP_HP_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x103c, 0x8402, "HP ProBook 645 G4", CXT_FIXUP_MUTE_LED_GPIO), SND_PCI_QUIRK(0x103c, 0x8427, "HP ZBook Studio G5", CXT_FIXUP_HP_ZBOOK_MUTE_LED), + SND_PCI_QUIRK(0x103c, 0x844f, "HP ZBook Studio G5", CXT_FIXUP_HP_ZBOOK_MUTE_LED), SND_PCI_QUIRK(0x103c, 0x8455, "HP Z2 G4", CXT_FIXUP_HP_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x103c, 0x8456, "HP Z2 G4 SFF", CXT_FIXUP_HP_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x103c, 0x8457, "HP Z2 G4 mini", CXT_FIXUP_HP_MIC_NO_PRESENCE),
From: Hans de Goede hdegoede@redhat.com
commit aa65bacdb70e549a81de03ec72338e1047842883 upstream.
The SST firmware's media and deep-buffer inputs are hardcoded to S16LE, the corresponding DAIs don't have a hw_params callback and their prepare callback also does not take the format into account.
So far the advertising of non working S24LE support has not caused issues because pulseaudio defaults to S16LE, but changing pulse-audio's config to use S24LE will result in broken sound.
Pipewire is replacing pulse now and pipewire prefers S24LE over S16LE when available, causing the problem of the broken S24LE support to come to the surface now.
Cc: stable@vger.kernel.org BugLink: https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/866 Fixes: 098c2cd281409 ("ASoC: Intel: Atom: add 24-bit support for media playback and capture") Acked-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20210324132711.216152-2-hdegoede@redhat.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/intel/atom/sst-mfld-platform-pcm.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/sound/soc/intel/atom/sst-mfld-platform-pcm.c +++ b/sound/soc/intel/atom/sst-mfld-platform-pcm.c @@ -488,14 +488,14 @@ static struct snd_soc_dai_driver sst_pla .channels_min = SST_STEREO, .channels_max = SST_STEREO, .rates = SNDRV_PCM_RATE_44100|SNDRV_PCM_RATE_48000, - .formats = SNDRV_PCM_FMTBIT_S16_LE | SNDRV_PCM_FMTBIT_S24_LE, + .formats = SNDRV_PCM_FMTBIT_S16_LE, }, .capture = { .stream_name = "Headset Capture", .channels_min = 1, .channels_max = 2, .rates = SNDRV_PCM_RATE_44100|SNDRV_PCM_RATE_48000, - .formats = SNDRV_PCM_FMTBIT_S16_LE | SNDRV_PCM_FMTBIT_S24_LE, + .formats = SNDRV_PCM_FMTBIT_S16_LE, }, }, { @@ -506,7 +506,7 @@ static struct snd_soc_dai_driver sst_pla .channels_min = SST_STEREO, .channels_max = SST_STEREO, .rates = SNDRV_PCM_RATE_44100|SNDRV_PCM_RATE_48000, - .formats = SNDRV_PCM_FMTBIT_S16_LE | SNDRV_PCM_FMTBIT_S24_LE, + .formats = SNDRV_PCM_FMTBIT_S16_LE, }, }, {
From: Xiaoming Ni nixiaoming@huawei.com
commit c33b1cc62ac05c1dbb1cdafe2eb66da01c76ca8d upstream.
nfc_llcp_local_get() is invoked in llcp_sock_bind(), but nfc_llcp_local_put() is not invoked in subsequent failure branches. As a result, refcount leakage occurs. To fix it, add calling nfc_llcp_local_put().
fix CVE-2020-25670 Fixes: c7aa12252f51 ("NFC: Take a reference on the LLCP local pointer when creating a socket") Reported-by: "kiyin(尹亮)" kiyin@tencent.com Link: https://www.openwall.com/lists/oss-security/2020/11/01/1 Cc: stable@vger.kernel.org #v3.6 Signed-off-by: Xiaoming Ni nixiaoming@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/nfc/llcp_sock.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/net/nfc/llcp_sock.c +++ b/net/nfc/llcp_sock.c @@ -108,11 +108,13 @@ static int llcp_sock_bind(struct socket llcp_sock->service_name_len, GFP_KERNEL); if (!llcp_sock->service_name) { + nfc_llcp_local_put(llcp_sock->local); ret = -ENOMEM; goto put_dev; } llcp_sock->ssap = nfc_llcp_get_sdp_ssap(local, llcp_sock); if (llcp_sock->ssap == LLCP_SAP_MAX) { + nfc_llcp_local_put(llcp_sock->local); kfree(llcp_sock->service_name); llcp_sock->service_name = NULL; ret = -EADDRINUSE;
From: Xiaoming Ni nixiaoming@huawei.com
commit 8a4cd82d62b5ec7e5482333a72b58a4eea4979f0 upstream.
nfc_llcp_local_get() is invoked in llcp_sock_connect(), but nfc_llcp_local_put() is not invoked in subsequent failure branches. As a result, refcount leakage occurs. To fix it, add calling nfc_llcp_local_put().
fix CVE-2020-25671 Fixes: c7aa12252f51 ("NFC: Take a reference on the LLCP local pointer when creating a socket") Reported-by: "kiyin(尹亮)" kiyin@tencent.com Link: https://www.openwall.com/lists/oss-security/2020/11/01/1 Cc: stable@vger.kernel.org #v3.6 Signed-off-by: Xiaoming Ni nixiaoming@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/nfc/llcp_sock.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/net/nfc/llcp_sock.c +++ b/net/nfc/llcp_sock.c @@ -704,6 +704,7 @@ static int llcp_sock_connect(struct sock llcp_sock->local = nfc_llcp_local_get(local); llcp_sock->ssap = nfc_llcp_get_local_ssap(local); if (llcp_sock->ssap == LLCP_SAP_MAX) { + nfc_llcp_local_put(llcp_sock->local); ret = -ENOMEM; goto put_dev; } @@ -748,6 +749,7 @@ sock_unlink:
sock_llcp_release: nfc_llcp_put_ssap(local, llcp_sock->ssap); + nfc_llcp_local_put(llcp_sock->local);
put_dev: nfc_put_device(dev);
From: Xiaoming Ni nixiaoming@huawei.com
commit 7574fcdbdcb335763b6b322f6928dc0fd5730451 upstream.
In llcp_sock_connect(), use kmemdup to allocate memory for "llcp_sock->service_name". The memory is not released in the sock_unlink label of the subsequent failure branch. As a result, memory leakage occurs.
fix CVE-2020-25672
Fixes: d646960f7986 ("NFC: Initial LLCP support") Reported-by: "kiyin(尹亮)" kiyin@tencent.com Link: https://www.openwall.com/lists/oss-security/2020/11/01/1 Cc: stable@vger.kernel.org #v3.3 Signed-off-by: Xiaoming Ni nixiaoming@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/nfc/llcp_sock.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/net/nfc/llcp_sock.c +++ b/net/nfc/llcp_sock.c @@ -746,6 +746,8 @@ static int llcp_sock_connect(struct sock
sock_unlink: nfc_llcp_sock_unlink(&local->connecting_sockets, sk); + kfree(llcp_sock->service_name); + llcp_sock->service_name = NULL;
sock_llcp_release: nfc_llcp_put_ssap(local, llcp_sock->ssap);
From: Xiaoming Ni nixiaoming@huawei.com
commit 4b5db93e7f2afbdfe3b78e37879a85290187e6f1 upstream.
When sock_wait_state() returns -EINPROGRESS, "sk->sk_state" is LLCP_CONNECTING. In this case, llcp_sock_connect() is repeatedly invoked, nfc_llcp_sock_link() will add sk to local->connecting_sockets twice. sk->sk_node->next will point to itself, that will make an endless loop and hang-up the system. To fix it, check whether sk->sk_state is LLCP_CONNECTING in llcp_sock_connect() to avoid repeated invoking.
Fixes: b4011239a08e ("NFC: llcp: Fix non blocking sockets connections") Reported-by: "kiyin(尹亮)" kiyin@tencent.com Link: https://www.openwall.com/lists/oss-security/2020/11/01/1 Cc: stable@vger.kernel.org #v3.11 Signed-off-by: Xiaoming Ni nixiaoming@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/nfc/llcp_sock.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/net/nfc/llcp_sock.c +++ b/net/nfc/llcp_sock.c @@ -673,6 +673,10 @@ static int llcp_sock_connect(struct sock ret = -EISCONN; goto error; } + if (sk->sk_state == LLCP_CONNECTING) { + ret = -EINPROGRESS; + goto error; + }
dev = nfc_get_device(addr->dev_idx); if (dev == NULL) {
From: Ondrej Mosnacek omosnace@redhat.com
commit 442dc00f82a9727dc0c48c44f792c168f593c6df upstream.
1. Make sure all fileds are initialized in avtab_init(). 2. Slightly refactor avtab_alloc() to use the above fact. 3. Use h->nslot == 0 as a sentinel in the access functions to prevent dereferencing h->htable when it's not allocated.
Cc: stable@vger.kernel.org Signed-off-by: Ondrej Mosnacek omosnace@redhat.com Signed-off-by: Paul Moore paul@paul-moore.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- security/selinux/ss/avtab.c | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-)
--- a/security/selinux/ss/avtab.c +++ b/security/selinux/ss/avtab.c @@ -109,7 +109,7 @@ static int avtab_insert(struct avtab *h, struct avtab_node *prev, *cur, *newnode; u16 specified = key->specified & ~(AVTAB_ENABLED|AVTAB_ENABLED_OLD);
- if (!h) + if (!h || !h->nslot) return -EINVAL;
hvalue = avtab_hash(key, h->mask); @@ -154,7 +154,7 @@ avtab_insert_nonunique(struct avtab *h, struct avtab_node *prev, *cur; u16 specified = key->specified & ~(AVTAB_ENABLED|AVTAB_ENABLED_OLD);
- if (!h) + if (!h || !h->nslot) return NULL; hvalue = avtab_hash(key, h->mask); for (prev = NULL, cur = h->htable[hvalue]; @@ -184,7 +184,7 @@ struct avtab_datum *avtab_search(struct struct avtab_node *cur; u16 specified = key->specified & ~(AVTAB_ENABLED|AVTAB_ENABLED_OLD);
- if (!h) + if (!h || !h->nslot) return NULL;
hvalue = avtab_hash(key, h->mask); @@ -220,7 +220,7 @@ avtab_search_node(struct avtab *h, struc struct avtab_node *cur; u16 specified = key->specified & ~(AVTAB_ENABLED|AVTAB_ENABLED_OLD);
- if (!h) + if (!h || !h->nslot) return NULL;
hvalue = avtab_hash(key, h->mask); @@ -295,6 +295,7 @@ void avtab_destroy(struct avtab *h) } kvfree(h->htable); h->htable = NULL; + h->nel = 0; h->nslot = 0; h->mask = 0; } @@ -303,14 +304,15 @@ void avtab_init(struct avtab *h) { h->htable = NULL; h->nel = 0; + h->nslot = 0; + h->mask = 0; }
int avtab_alloc(struct avtab *h, u32 nrules) { - u32 mask = 0; u32 shift = 0; u32 work = nrules; - u32 nslot = 0; + u32 nslot;
if (nrules == 0) goto avtab_alloc_out; @@ -324,16 +326,15 @@ int avtab_alloc(struct avtab *h, u32 nru nslot = 1 << shift; if (nslot > MAX_AVTAB_HASH_BUCKETS) nslot = MAX_AVTAB_HASH_BUCKETS; - mask = nslot - 1;
h->htable = kvcalloc(nslot, sizeof(void *), GFP_KERNEL); if (!h->htable) return -ENOMEM;
- avtab_alloc_out: - h->nel = 0; h->nslot = nslot; - h->mask = mask; + h->mask = nslot - 1; + +avtab_alloc_out: pr_debug("SELinux: %d avtab hash slots, %d rules.\n", h->nslot, nrules); return 0;
From: Ondrej Mosnacek omosnace@redhat.com
commit d8f5f0ea5b86300390b026b6c6e7836b7150814a upstream.
Currently, duplicate_policydb_cond_list() first copies the whole conditional avtab and then tries to link to the correct entries in cond_dup_av_list() using avtab_search(). However, since the conditional avtab may contain multiple entries with the same key, this approach often fails to find the right entry, potentially leading to wrong rules being activated/deactivated when booleans are changed.
To fix this, instead start with an empty conditional avtab and add the individual entries one-by-one while building the new av_lists. This approach leads to the correct result, since each entry is present in the av_lists exactly once.
The issue can be reproduced with Fedora policy as follows:
# sesearch -s ftpd_t -t public_content_rw_t -c dir -p create -A allow ftpd_t non_security_file_type:dir { add_name create getattr ioctl link lock open read remove_name rename reparent rmdir search setattr unlink watch watch_reads write }; [ ftpd_full_access ]:True allow ftpd_t public_content_rw_t:dir { add_name create link remove_name rename reparent rmdir setattr unlink watch watch_reads write }; [ ftpd_anon_write ]:True # setsebool ftpd_anon_write=off ftpd_connect_all_unreserved=off ftpd_connect_db=off ftpd_full_access=off
On fixed kernels, the sesearch output is the same after the setsebool command:
# sesearch -s ftpd_t -t public_content_rw_t -c dir -p create -A allow ftpd_t non_security_file_type:dir { add_name create getattr ioctl link lock open read remove_name rename reparent rmdir search setattr unlink watch watch_reads write }; [ ftpd_full_access ]:True allow ftpd_t public_content_rw_t:dir { add_name create link remove_name rename reparent rmdir setattr unlink watch watch_reads write }; [ ftpd_anon_write ]:True
While on the broken kernels, it will be different:
# sesearch -s ftpd_t -t public_content_rw_t -c dir -p create -A allow ftpd_t non_security_file_type:dir { add_name create getattr ioctl link lock open read remove_name rename reparent rmdir search setattr unlink watch watch_reads write }; [ ftpd_full_access ]:True allow ftpd_t non_security_file_type:dir { add_name create getattr ioctl link lock open read remove_name rename reparent rmdir search setattr unlink watch watch_reads write }; [ ftpd_full_access ]:True allow ftpd_t non_security_file_type:dir { add_name create getattr ioctl link lock open read remove_name rename reparent rmdir search setattr unlink watch watch_reads write }; [ ftpd_full_access ]:True
While there, also simplify the computation of nslots. This changes the nslots values for nrules 2 or 3 to just two slots instead of 4, which makes the sequence more consistent.
Cc: stable@vger.kernel.org Fixes: c7c556f1e81b ("selinux: refactor changing booleans") Signed-off-by: Ondrej Mosnacek omosnace@redhat.com Signed-off-by: Paul Moore paul@paul-moore.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- security/selinux/ss/avtab.c | 86 +++++++++++--------------------------- security/selinux/ss/avtab.h | 2 security/selinux/ss/conditional.c | 12 ++--- 3 files changed, 32 insertions(+), 68 deletions(-)
--- a/security/selinux/ss/avtab.c +++ b/security/selinux/ss/avtab.c @@ -308,24 +308,10 @@ void avtab_init(struct avtab *h) h->mask = 0; }
-int avtab_alloc(struct avtab *h, u32 nrules) +static int avtab_alloc_common(struct avtab *h, u32 nslot) { - u32 shift = 0; - u32 work = nrules; - u32 nslot; - - if (nrules == 0) - goto avtab_alloc_out; - - while (work) { - work = work >> 1; - shift++; - } - if (shift > 2) - shift = shift - 2; - nslot = 1 << shift; - if (nslot > MAX_AVTAB_HASH_BUCKETS) - nslot = MAX_AVTAB_HASH_BUCKETS; + if (!nslot) + return 0;
h->htable = kvcalloc(nslot, sizeof(void *), GFP_KERNEL); if (!h->htable) @@ -333,59 +319,37 @@ int avtab_alloc(struct avtab *h, u32 nru
h->nslot = nslot; h->mask = nslot - 1; - -avtab_alloc_out: - pr_debug("SELinux: %d avtab hash slots, %d rules.\n", - h->nslot, nrules); return 0; }
-int avtab_duplicate(struct avtab *new, struct avtab *orig) +int avtab_alloc(struct avtab *h, u32 nrules) { - int i; - struct avtab_node *node, *tmp, *tail; - - memset(new, 0, sizeof(*new)); + int rc; + u32 nslot = 0;
- new->htable = kvcalloc(orig->nslot, sizeof(void *), GFP_KERNEL); - if (!new->htable) - return -ENOMEM; - new->nslot = orig->nslot; - new->mask = orig->mask; - - for (i = 0; i < orig->nslot; i++) { - tail = NULL; - for (node = orig->htable[i]; node; node = node->next) { - tmp = kmem_cache_zalloc(avtab_node_cachep, GFP_KERNEL); - if (!tmp) - goto error; - tmp->key = node->key; - if (tmp->key.specified & AVTAB_XPERMS) { - tmp->datum.u.xperms = - kmem_cache_zalloc(avtab_xperms_cachep, - GFP_KERNEL); - if (!tmp->datum.u.xperms) { - kmem_cache_free(avtab_node_cachep, tmp); - goto error; - } - tmp->datum.u.xperms = node->datum.u.xperms; - } else - tmp->datum.u.data = node->datum.u.data; - - if (tail) - tail->next = tmp; - else - new->htable[i] = tmp; - - tail = tmp; - new->nel++; + if (nrules != 0) { + u32 shift = 1; + u32 work = nrules >> 3; + while (work) { + work >>= 1; + shift++; } + nslot = 1 << shift; + if (nslot > MAX_AVTAB_HASH_BUCKETS) + nslot = MAX_AVTAB_HASH_BUCKETS; + + rc = avtab_alloc_common(h, nslot); + if (rc) + return rc; }
+ pr_debug("SELinux: %d avtab hash slots, %d rules.\n", nslot, nrules); return 0; -error: - avtab_destroy(new); - return -ENOMEM; +} + +int avtab_alloc_dup(struct avtab *new, const struct avtab *orig) +{ + return avtab_alloc_common(new, orig->nslot); }
void avtab_hash_eval(struct avtab *h, char *tag) --- a/security/selinux/ss/avtab.h +++ b/security/selinux/ss/avtab.h @@ -89,7 +89,7 @@ struct avtab {
void avtab_init(struct avtab *h); int avtab_alloc(struct avtab *, u32); -int avtab_duplicate(struct avtab *new, struct avtab *orig); +int avtab_alloc_dup(struct avtab *new, const struct avtab *orig); struct avtab_datum *avtab_search(struct avtab *h, struct avtab_key *k); void avtab_destroy(struct avtab *h); void avtab_hash_eval(struct avtab *h, char *tag); --- a/security/selinux/ss/conditional.c +++ b/security/selinux/ss/conditional.c @@ -605,7 +605,6 @@ static int cond_dup_av_list(struct cond_ struct cond_av_list *orig, struct avtab *avtab) { - struct avtab_node *avnode; u32 i;
memset(new, 0, sizeof(*new)); @@ -615,10 +614,11 @@ static int cond_dup_av_list(struct cond_ return -ENOMEM;
for (i = 0; i < orig->len; i++) { - avnode = avtab_search_node(avtab, &orig->nodes[i]->key); - if (WARN_ON(!avnode)) - return -EINVAL; - new->nodes[i] = avnode; + new->nodes[i] = avtab_insert_nonunique(avtab, + &orig->nodes[i]->key, + &orig->nodes[i]->datum); + if (!new->nodes[i]) + return -ENOMEM; new->len++; }
@@ -630,7 +630,7 @@ static int duplicate_policydb_cond_list( { int rc, i, j;
- rc = avtab_duplicate(&newp->te_cond_avtab, &origp->te_cond_avtab); + rc = avtab_alloc_dup(&newp->te_cond_avtab, &origp->te_cond_avtab); if (rc) return rc;
From: Ondrej Mosnacek omosnace@redhat.com
commit 9ad6e9cb39c66366bf7b9aece114aca277981a1f upstream.
Since commit 1b8b31a2e612 ("selinux: convert policy read-write lock to RCU"), there is a small window during policy load where the new policy pointer has already been installed, but some threads may still be holding the old policy pointer in their read-side RCU critical sections. This means that there may be conflicting attempts to add a new SID entry to both tables via sidtab_context_to_sid().
See also (and the rest of the thread): https://lore.kernel.org/selinux/CAFqZXNvfux46_f8gnvVvRYMKoes24nwm2n3sPbMjrB8...
Fix this by installing the new policy pointer under the old sidtab's spinlock along with marking the old sidtab as "frozen". Then, if an attempt to add new entry to a "frozen" sidtab is detected, make sidtab_context_to_sid() return -ESTALE to indicate that a new policy has been installed and that the caller will have to abort the policy transaction and try again after re-taking the policy pointer (which is guaranteed to be a newer policy). This requires adding a retry-on-ESTALE logic to all callers of sidtab_context_to_sid(), but fortunately these are easy to determine and aren't that many.
This seems to be the simplest solution for this problem, even if it looks somewhat ugly. Note that other places in the kernel (e.g. do_mknodat() in fs/namei.c) use similar stale-retry patterns, so I think it's reasonable.
Cc: stable@vger.kernel.org Fixes: 1b8b31a2e612 ("selinux: convert policy read-write lock to RCU") Signed-off-by: Ondrej Mosnacek omosnace@redhat.com Signed-off-by: Paul Moore paul@paul-moore.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- security/selinux/ss/services.c | 157 +++++++++++++++++++++++++++++++---------- security/selinux/ss/sidtab.c | 21 +++++ security/selinux/ss/sidtab.h | 4 + 3 files changed, 145 insertions(+), 37 deletions(-)
--- a/security/selinux/ss/services.c +++ b/security/selinux/ss/services.c @@ -1553,6 +1553,7 @@ static int security_context_to_sid_core( if (!str) goto out; } +retry: rcu_read_lock(); policy = rcu_dereference(state->policy); policydb = &policy->policydb; @@ -1566,6 +1567,15 @@ static int security_context_to_sid_core( } else if (rc) goto out_unlock; rc = sidtab_context_to_sid(sidtab, &context, sid); + if (rc == -ESTALE) { + rcu_read_unlock(); + if (context.str) { + str = context.str; + context.str = NULL; + } + context_destroy(&context); + goto retry; + } context_destroy(&context); out_unlock: rcu_read_unlock(); @@ -1715,7 +1725,7 @@ static int security_compute_sid(struct s struct selinux_policy *policy; struct policydb *policydb; struct sidtab *sidtab; - struct class_datum *cladatum = NULL; + struct class_datum *cladatum; struct context *scontext, *tcontext, newcontext; struct sidtab_entry *sentry, *tentry; struct avtab_key avkey; @@ -1737,6 +1747,8 @@ static int security_compute_sid(struct s goto out; }
+retry: + cladatum = NULL; context_init(&newcontext);
rcu_read_lock(); @@ -1881,6 +1893,11 @@ static int security_compute_sid(struct s } /* Obtain the sid for the context. */ rc = sidtab_context_to_sid(sidtab, &newcontext, out_sid); + if (rc == -ESTALE) { + rcu_read_unlock(); + context_destroy(&newcontext); + goto retry; + } out_unlock: rcu_read_unlock(); context_destroy(&newcontext); @@ -2192,6 +2209,7 @@ void selinux_policy_commit(struct selinu struct selinux_load_state *load_state) { struct selinux_policy *oldpolicy, *newpolicy = load_state->policy; + unsigned long flags; u32 seqno;
oldpolicy = rcu_dereference_protected(state->policy, @@ -2213,7 +2231,13 @@ void selinux_policy_commit(struct selinu seqno = newpolicy->latest_granting;
/* Install the new policy. */ - rcu_assign_pointer(state->policy, newpolicy); + if (oldpolicy) { + sidtab_freeze_begin(oldpolicy->sidtab, &flags); + rcu_assign_pointer(state->policy, newpolicy); + sidtab_freeze_end(oldpolicy->sidtab, &flags); + } else { + rcu_assign_pointer(state->policy, newpolicy); + }
/* Load the policycaps from the new policy */ security_load_policycaps(state, newpolicy); @@ -2357,13 +2381,15 @@ int security_port_sid(struct selinux_sta struct policydb *policydb; struct sidtab *sidtab; struct ocontext *c; - int rc = 0; + int rc;
if (!selinux_initialized(state)) { *out_sid = SECINITSID_PORT; return 0; }
+retry: + rc = 0; rcu_read_lock(); policy = rcu_dereference(state->policy); policydb = &policy->policydb; @@ -2382,6 +2408,10 @@ int security_port_sid(struct selinux_sta if (!c->sid[0]) { rc = sidtab_context_to_sid(sidtab, &c->context[0], &c->sid[0]); + if (rc == -ESTALE) { + rcu_read_unlock(); + goto retry; + } if (rc) goto out; } @@ -2408,13 +2438,15 @@ int security_ib_pkey_sid(struct selinux_ struct policydb *policydb; struct sidtab *sidtab; struct ocontext *c; - int rc = 0; + int rc;
if (!selinux_initialized(state)) { *out_sid = SECINITSID_UNLABELED; return 0; }
+retry: + rc = 0; rcu_read_lock(); policy = rcu_dereference(state->policy); policydb = &policy->policydb; @@ -2435,6 +2467,10 @@ int security_ib_pkey_sid(struct selinux_ rc = sidtab_context_to_sid(sidtab, &c->context[0], &c->sid[0]); + if (rc == -ESTALE) { + rcu_read_unlock(); + goto retry; + } if (rc) goto out; } @@ -2460,13 +2496,15 @@ int security_ib_endport_sid(struct selin struct policydb *policydb; struct sidtab *sidtab; struct ocontext *c; - int rc = 0; + int rc;
if (!selinux_initialized(state)) { *out_sid = SECINITSID_UNLABELED; return 0; }
+retry: + rc = 0; rcu_read_lock(); policy = rcu_dereference(state->policy); policydb = &policy->policydb; @@ -2487,6 +2525,10 @@ int security_ib_endport_sid(struct selin if (!c->sid[0]) { rc = sidtab_context_to_sid(sidtab, &c->context[0], &c->sid[0]); + if (rc == -ESTALE) { + rcu_read_unlock(); + goto retry; + } if (rc) goto out; } @@ -2510,7 +2552,7 @@ int security_netif_sid(struct selinux_st struct selinux_policy *policy; struct policydb *policydb; struct sidtab *sidtab; - int rc = 0; + int rc; struct ocontext *c;
if (!selinux_initialized(state)) { @@ -2518,6 +2560,8 @@ int security_netif_sid(struct selinux_st return 0; }
+retry: + rc = 0; rcu_read_lock(); policy = rcu_dereference(state->policy); policydb = &policy->policydb; @@ -2534,10 +2578,18 @@ int security_netif_sid(struct selinux_st if (!c->sid[0] || !c->sid[1]) { rc = sidtab_context_to_sid(sidtab, &c->context[0], &c->sid[0]); + if (rc == -ESTALE) { + rcu_read_unlock(); + goto retry; + } if (rc) goto out; rc = sidtab_context_to_sid(sidtab, &c->context[1], &c->sid[1]); + if (rc == -ESTALE) { + rcu_read_unlock(); + goto retry; + } if (rc) goto out; } @@ -2587,6 +2639,7 @@ int security_node_sid(struct selinux_sta return 0; }
+retry: rcu_read_lock(); policy = rcu_dereference(state->policy); policydb = &policy->policydb; @@ -2635,6 +2688,10 @@ int security_node_sid(struct selinux_sta rc = sidtab_context_to_sid(sidtab, &c->context[0], &c->sid[0]); + if (rc == -ESTALE) { + rcu_read_unlock(); + goto retry; + } if (rc) goto out; } @@ -2676,18 +2733,24 @@ int security_get_user_sids(struct selinu struct sidtab *sidtab; struct context *fromcon, usercon; u32 *mysids = NULL, *mysids2, sid; - u32 mynel = 0, maxnel = SIDS_NEL; + u32 i, j, mynel, maxnel = SIDS_NEL; struct user_datum *user; struct role_datum *role; struct ebitmap_node *rnode, *tnode; - int rc = 0, i, j; + int rc;
*sids = NULL; *nel = 0;
if (!selinux_initialized(state)) - goto out; + return 0; + + mysids = kcalloc(maxnel, sizeof(*mysids), GFP_KERNEL); + if (!mysids) + return -ENOMEM;
+retry: + mynel = 0; rcu_read_lock(); policy = rcu_dereference(state->policy); policydb = &policy->policydb; @@ -2707,11 +2770,6 @@ int security_get_user_sids(struct selinu
usercon.user = user->value;
- rc = -ENOMEM; - mysids = kcalloc(maxnel, sizeof(*mysids), GFP_ATOMIC); - if (!mysids) - goto out_unlock; - ebitmap_for_each_positive_bit(&user->roles, rnode, i) { role = policydb->role_val_to_struct[i]; usercon.role = i + 1; @@ -2723,6 +2781,10 @@ int security_get_user_sids(struct selinu continue;
rc = sidtab_context_to_sid(sidtab, &usercon, &sid); + if (rc == -ESTALE) { + rcu_read_unlock(); + goto retry; + } if (rc) goto out_unlock; if (mynel < maxnel) { @@ -2745,14 +2807,14 @@ out_unlock: rcu_read_unlock(); if (rc || !mynel) { kfree(mysids); - goto out; + return rc; }
rc = -ENOMEM; mysids2 = kcalloc(mynel, sizeof(*mysids2), GFP_KERNEL); if (!mysids2) { kfree(mysids); - goto out; + return rc; } for (i = 0, j = 0; i < mynel; i++) { struct av_decision dummy_avd; @@ -2765,12 +2827,10 @@ out_unlock: mysids2[j++] = mysids[i]; cond_resched(); } - rc = 0; kfree(mysids); *sids = mysids2; *nel = j; -out: - return rc; + return 0; }
/** @@ -2783,6 +2843,9 @@ out: * Obtain a SID to use for a file in a filesystem that * cannot support xattr or use a fixed labeling behavior like * transition SIDs or task SIDs. + * + * WARNING: This function may return -ESTALE, indicating that the caller + * must retry the operation after re-acquiring the policy pointer! */ static inline int __security_genfs_sid(struct selinux_policy *policy, const char *fstype, @@ -2861,11 +2924,13 @@ int security_genfs_sid(struct selinux_st return 0; }
- rcu_read_lock(); - policy = rcu_dereference(state->policy); - retval = __security_genfs_sid(policy, - fstype, path, orig_sclass, sid); - rcu_read_unlock(); + do { + rcu_read_lock(); + policy = rcu_dereference(state->policy); + retval = __security_genfs_sid(policy, fstype, path, + orig_sclass, sid); + rcu_read_unlock(); + } while (retval == -ESTALE); return retval; }
@@ -2888,7 +2953,7 @@ int security_fs_use(struct selinux_state struct selinux_policy *policy; struct policydb *policydb; struct sidtab *sidtab; - int rc = 0; + int rc; struct ocontext *c; struct superblock_security_struct *sbsec = sb->s_security; const char *fstype = sb->s_type->name; @@ -2899,6 +2964,8 @@ int security_fs_use(struct selinux_state return 0; }
+retry: + rc = 0; rcu_read_lock(); policy = rcu_dereference(state->policy); policydb = &policy->policydb; @@ -2916,6 +2983,10 @@ int security_fs_use(struct selinux_state if (!c->sid[0]) { rc = sidtab_context_to_sid(sidtab, &c->context[0], &c->sid[0]); + if (rc == -ESTALE) { + rcu_read_unlock(); + goto retry; + } if (rc) goto out; } @@ -2923,6 +2994,10 @@ int security_fs_use(struct selinux_state } else { rc = __security_genfs_sid(policy, fstype, "/", SECCLASS_DIR, &sbsec->sid); + if (rc == -ESTALE) { + rcu_read_unlock(); + goto retry; + } if (rc) { sbsec->behavior = SECURITY_FS_USE_NONE; rc = 0; @@ -3132,12 +3207,13 @@ int security_sid_mls_copy(struct selinux u32 len; int rc;
- rc = 0; if (!selinux_initialized(state)) { *new_sid = sid; - goto out; + return 0; }
+retry: + rc = 0; context_init(&newcon);
rcu_read_lock(); @@ -3196,10 +3272,14 @@ int security_sid_mls_copy(struct selinux } } rc = sidtab_context_to_sid(sidtab, &newcon, new_sid); + if (rc == -ESTALE) { + rcu_read_unlock(); + context_destroy(&newcon); + goto retry; + } out_unlock: rcu_read_unlock(); context_destroy(&newcon); -out: return rc; }
@@ -3796,6 +3876,8 @@ int security_netlbl_secattr_to_sid(struc return 0; }
+retry: + rc = 0; rcu_read_lock(); policy = rcu_dereference(state->policy); policydb = &policy->policydb; @@ -3822,23 +3904,24 @@ int security_netlbl_secattr_to_sid(struc goto out; } rc = -EIDRM; - if (!mls_context_isvalid(policydb, &ctx_new)) - goto out_free; + if (!mls_context_isvalid(policydb, &ctx_new)) { + ebitmap_destroy(&ctx_new.range.level[0].cat); + goto out; + }
rc = sidtab_context_to_sid(sidtab, &ctx_new, sid); + ebitmap_destroy(&ctx_new.range.level[0].cat); + if (rc == -ESTALE) { + rcu_read_unlock(); + goto retry; + } if (rc) - goto out_free; + goto out;
security_netlbl_cache_add(secattr, *sid); - - ebitmap_destroy(&ctx_new.range.level[0].cat); } else *sid = SECSID_NULL;
- rcu_read_unlock(); - return 0; -out_free: - ebitmap_destroy(&ctx_new.range.level[0].cat); out: rcu_read_unlock(); return rc; --- a/security/selinux/ss/sidtab.c +++ b/security/selinux/ss/sidtab.c @@ -39,6 +39,7 @@ int sidtab_init(struct sidtab *s) for (i = 0; i < SECINITSID_NUM; i++) s->isids[i].set = 0;
+ s->frozen = false; s->count = 0; s->convert = NULL; hash_init(s->context_to_sid); @@ -281,6 +282,15 @@ int sidtab_context_to_sid(struct sidtab if (*sid) goto out_unlock;
+ if (unlikely(s->frozen)) { + /* + * This sidtab is now frozen - tell the caller to abort and + * get the new one. + */ + rc = -ESTALE; + goto out_unlock; + } + count = s->count; convert = s->convert;
@@ -474,6 +484,17 @@ void sidtab_cancel_convert(struct sidtab spin_unlock_irqrestore(&s->lock, flags); }
+void sidtab_freeze_begin(struct sidtab *s, unsigned long *flags) __acquires(&s->lock) +{ + spin_lock_irqsave(&s->lock, *flags); + s->frozen = true; + s->convert = NULL; +} +void sidtab_freeze_end(struct sidtab *s, unsigned long *flags) __releases(&s->lock) +{ + spin_unlock_irqrestore(&s->lock, *flags); +} + static void sidtab_destroy_entry(struct sidtab_entry *entry) { context_destroy(&entry->context); --- a/security/selinux/ss/sidtab.h +++ b/security/selinux/ss/sidtab.h @@ -86,6 +86,7 @@ struct sidtab { u32 count; /* access only under spinlock */ struct sidtab_convert_params *convert; + bool frozen; spinlock_t lock;
#if CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE > 0 @@ -125,6 +126,9 @@ int sidtab_convert(struct sidtab *s, str
void sidtab_cancel_convert(struct sidtab *s);
+void sidtab_freeze_begin(struct sidtab *s, unsigned long *flags) __acquires(&s->lock); +void sidtab_freeze_end(struct sidtab *s, unsigned long *flags) __releases(&s->lock); + int sidtab_context_to_sid(struct sidtab *s, struct context *context, u32 *sid);
void sidtab_destroy(struct sidtab *s);
From: Luca Fancellu luca.fancellu@arm.com
commit d120198bd5ff1d41808b6914e1eb89aff937415c upstream.
Unmask operation must be called with interrupt disabled, on preempt_rt spin_lock_irqsave/spin_unlock_irqrestore don't disable/enable interrupts, so use raw_* implementation and change lock variable in struct irq_info from spinlock_t to raw_spinlock_t
Cc: stable@vger.kernel.org Fixes: 25da4618af24 ("xen/events: don't unmask an event channel when an eoi is pending") Signed-off-by: Luca Fancellu luca.fancellu@arm.com Reviewed-by: Julien Grall jgrall@amazon.com Reviewed-by: Wei Liu wei.liu@kernel.org Link: https://lore.kernel.org/r/20210406105105.10141-1-luca.fancellu@arm.com Signed-off-by: Boris Ostrovsky boris.ostrovsky@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/xen/events/events_base.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -108,7 +108,7 @@ struct irq_info { unsigned short eoi_cpu; /* EOI must happen on this cpu-1 */ unsigned int irq_epoch; /* If eoi_cpu valid: irq_epoch of event */ u64 eoi_time; /* Time in jiffies when to EOI. */ - spinlock_t lock; + raw_spinlock_t lock;
union { unsigned short virq; @@ -280,7 +280,7 @@ static int xen_irq_info_common_setup(str info->evtchn = evtchn; info->cpu = cpu; info->mask_reason = EVT_MASK_REASON_EXPLICIT; - spin_lock_init(&info->lock); + raw_spin_lock_init(&info->lock);
ret = set_evtchn_to_irq(evtchn, irq); if (ret < 0) @@ -432,28 +432,28 @@ static void do_mask(struct irq_info *inf { unsigned long flags;
- spin_lock_irqsave(&info->lock, flags); + raw_spin_lock_irqsave(&info->lock, flags);
if (!info->mask_reason) mask_evtchn(info->evtchn);
info->mask_reason |= reason;
- spin_unlock_irqrestore(&info->lock, flags); + raw_spin_unlock_irqrestore(&info->lock, flags); }
static void do_unmask(struct irq_info *info, u8 reason) { unsigned long flags;
- spin_lock_irqsave(&info->lock, flags); + raw_spin_lock_irqsave(&info->lock, flags);
info->mask_reason &= ~reason;
if (!info->mask_reason) unmask_evtchn(info->evtchn);
- spin_unlock_irqrestore(&info->lock, flags); + raw_spin_unlock_irqrestore(&info->lock, flags); }
#ifdef CONFIG_X86
From: Muhammad Usama Anjum musamaanjum@gmail.com
commit 864db232dc7036aa2de19749c3d5be0143b24f8f upstream.
nlh is being checked for validtity two times when it is dereferenced in this function. Check for validity again when updating the flags through nlh pointer to make the dereferencing safe.
CC: stable@vger.kernel.org Addresses-Coverity: ("NULL pointer dereference") Signed-off-by: Muhammad Usama Anjum musamaanjum@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/route.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
--- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -5203,9 +5203,11 @@ static int ip6_route_multipath_add(struc * nexthops have been replaced by first new, the rest should * be added to it. */ - cfg->fc_nlinfo.nlh->nlmsg_flags &= ~(NLM_F_EXCL | - NLM_F_REPLACE); - cfg->fc_nlinfo.nlh->nlmsg_flags |= NLM_F_CREATE; + if (cfg->fc_nlinfo.nlh) { + cfg->fc_nlinfo.nlh->nlmsg_flags &= ~(NLM_F_EXCL | + NLM_F_REPLACE); + cfg->fc_nlinfo.nlh->nlmsg_flags |= NLM_F_CREATE; + } nhn++; }
From: Martin Blumenstingl martin.blumenstingl@googlemail.com
commit 3e6fdeb28f4c331acbd27bdb0effc4befd4ef8e8 upstream.
The xMII interface clock depends on the PHY interface (MII, RMII, RGMII) as well as the current link speed. Explicitly configure the GSWIP to automatically select the appropriate xMII interface clock.
This fixes an issue seen by some users where ports using an external RMII or RGMII PHY were deaf (no RX or TX traffic could be seen). Most likely this is due to an "invalid" xMII clock being selected either by the bootloader or hardware-defaults.
Fixes: 14fceff4771e51 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200") Signed-off-by: Martin Blumenstingl martin.blumenstingl@googlemail.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/lantiq_gswip.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
--- a/drivers/net/dsa/lantiq_gswip.c +++ b/drivers/net/dsa/lantiq_gswip.c @@ -811,10 +811,15 @@ static int gswip_setup(struct dsa_switch /* Configure the MDIO Clock 2.5 MHz */ gswip_mdio_mask(priv, 0xff, 0x09, GSWIP_MDIO_MDC_CFG1);
- /* Disable the xMII link */ - for (i = 0; i < priv->hw_info->max_ports; i++) + for (i = 0; i < priv->hw_info->max_ports; i++) { + /* Disable the xMII link */ gswip_mii_mask_cfg(priv, GSWIP_MII_CFG_EN, 0, i);
+ /* Automatically select the xMII interface clock */ + gswip_mii_mask_cfg(priv, GSWIP_MII_CFG_RATE_MASK, + GSWIP_MII_CFG_RATE_AUTO, i); + } + /* enable special tag insertion on cpu port */ gswip_switch_mask(priv, 0, GSWIP_FDMA_PCTRL_STEN, GSWIP_FDMA_PCTRLp(cpu_port));
From: Martin Blumenstingl martin.blumenstingl@googlemail.com
commit 3e9005be87777afc902b9f5497495898202d335d upstream.
PHY auto polling on the GSWIP hardware can be used so link changes (speed, link up/down, etc.) can be detected automatically. Internally GSWIP reads the PHY's registers for this functionality. Based on this automatic detection GSWIP can also automatically re-configure it's port settings. Unfortunately this auto polling (and configuration) mechanism seems to cause various issues observed by different people on different devices: - FritzBox 7360v2: the two Gbit/s ports (connected to the two internal PHY11G instances) are working fine but the two Fast Ethernet ports (using an AR8030 RMII PHY) are completely dead (neither RX nor TX are received). It turns out that the AR8030 PHY sets the BMSR_ESTATEN bit as well as the ESTATUS_1000_TFULL and ESTATUS_1000_XFULL bits. This makes the PHY auto polling state machine (rightfully?) think that the established link speed (when the other side is Gbit/s capable) is 1Gbit/s. - None of the Ethernet ports on the Zyxel P-2812HNU-F1 (two are connected to the internal PHY11G GPHYs while the other three are external RGMII PHYs) are working. Neither RX nor TX traffic was observed. It is not clear which part of the PHY auto polling state- machine caused this. - FritzBox 7412 (only one LAN port which is connected to one of the internal GPHYs running in PHY22F / Fast Ethernet mode) was seeing random disconnects (link down events could be seen). Sometimes all traffic would stop after such disconnect. It is not clear which part of the PHY auto polling state-machine cauased this. - TP-Link TD-W9980 (two ports are connected to the internal GPHYs running in PHY11G / Gbit/s mode, the other two are external RGMII PHYs) was affected by similar issues as the FritzBox 7412 just without the "link down" events
Switch to software based configuration instead of PHY auto polling (and letting the GSWIP hardware configure the ports automatically) for the following link parameters: - link up/down - link speed - full/half duplex - flow control (RX / TX pause)
After a big round of manual testing by various people (who helped test this on OpenWrt) it turns out that this fixes all reported issues.
Additionally it can be considered more future proof because any "quirk" which is implemented for a PHY on the driver side can now be used with the GSWIP hardware as well because Linux is in control of the link parameters.
As a nice side-effect this also solves a problem where fixed-links were not supported previously because we were relying on the PHY auto polling mechanism, which cannot work for fixed-links as there's no PHY from where it can read the registers. Configuring the link settings on the GSWIP ports means that we now use the settings from device-tree also for ports with fixed-links.
Fixes: 14fceff4771e51 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200") Fixes: 3e6fdeb28f4c33 ("net: dsa: lantiq_gswip: Let GSWIP automatically set the xMII clock") Cc: stable@vger.kernel.org Acked-by: Hauke Mehrtens hauke@hauke-m.de Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: Martin Blumenstingl martin.blumenstingl@googlemail.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/lantiq_gswip.c | 185 +++++++++++++++++++++++++++++++++++------ 1 file changed, 159 insertions(+), 26 deletions(-)
--- a/drivers/net/dsa/lantiq_gswip.c +++ b/drivers/net/dsa/lantiq_gswip.c @@ -190,6 +190,23 @@ #define GSWIP_PCE_DEFPVID(p) (0x486 + ((p) * 0xA))
#define GSWIP_MAC_FLEN 0x8C5 +#define GSWIP_MAC_CTRL_0p(p) (0x903 + ((p) * 0xC)) +#define GSWIP_MAC_CTRL_0_PADEN BIT(8) +#define GSWIP_MAC_CTRL_0_FCS_EN BIT(7) +#define GSWIP_MAC_CTRL_0_FCON_MASK 0x0070 +#define GSWIP_MAC_CTRL_0_FCON_AUTO 0x0000 +#define GSWIP_MAC_CTRL_0_FCON_RX 0x0010 +#define GSWIP_MAC_CTRL_0_FCON_TX 0x0020 +#define GSWIP_MAC_CTRL_0_FCON_RXTX 0x0030 +#define GSWIP_MAC_CTRL_0_FCON_NONE 0x0040 +#define GSWIP_MAC_CTRL_0_FDUP_MASK 0x000C +#define GSWIP_MAC_CTRL_0_FDUP_AUTO 0x0000 +#define GSWIP_MAC_CTRL_0_FDUP_EN 0x0004 +#define GSWIP_MAC_CTRL_0_FDUP_DIS 0x000C +#define GSWIP_MAC_CTRL_0_GMII_MASK 0x0003 +#define GSWIP_MAC_CTRL_0_GMII_AUTO 0x0000 +#define GSWIP_MAC_CTRL_0_GMII_MII 0x0001 +#define GSWIP_MAC_CTRL_0_GMII_RGMII 0x0002 #define GSWIP_MAC_CTRL_2p(p) (0x905 + ((p) * 0xC)) #define GSWIP_MAC_CTRL_2_MLEN BIT(3) /* Maximum Untagged Frame Lnegth */
@@ -653,16 +670,13 @@ static int gswip_port_enable(struct dsa_ GSWIP_SDMA_PCTRLp(port));
if (!dsa_is_cpu_port(ds, port)) { - u32 macconf = GSWIP_MDIO_PHY_LINK_AUTO | - GSWIP_MDIO_PHY_SPEED_AUTO | - GSWIP_MDIO_PHY_FDUP_AUTO | - GSWIP_MDIO_PHY_FCONTX_AUTO | - GSWIP_MDIO_PHY_FCONRX_AUTO | - (phydev->mdio.addr & GSWIP_MDIO_PHY_ADDR_MASK); - - gswip_mdio_w(priv, macconf, GSWIP_MDIO_PHYp(port)); - /* Activate MDIO auto polling */ - gswip_mdio_mask(priv, 0, BIT(port), GSWIP_MDIO_MDC_CFG0); + u32 mdio_phy = 0; + + if (phydev) + mdio_phy = phydev->mdio.addr & GSWIP_MDIO_PHY_ADDR_MASK; + + gswip_mdio_mask(priv, GSWIP_MDIO_PHY_ADDR_MASK, mdio_phy, + GSWIP_MDIO_PHYp(port)); }
return 0; @@ -675,14 +689,6 @@ static void gswip_port_disable(struct ds if (!dsa_is_user_port(ds, port)) return;
- if (!dsa_is_cpu_port(ds, port)) { - gswip_mdio_mask(priv, GSWIP_MDIO_PHY_LINK_DOWN, - GSWIP_MDIO_PHY_LINK_MASK, - GSWIP_MDIO_PHYp(port)); - /* Deactivate MDIO auto polling */ - gswip_mdio_mask(priv, BIT(port), 0, GSWIP_MDIO_MDC_CFG0); - } - gswip_switch_mask(priv, GSWIP_FDMA_PCTRL_EN, 0, GSWIP_FDMA_PCTRLp(port)); gswip_switch_mask(priv, GSWIP_SDMA_PCTRL_EN, 0, @@ -806,20 +812,31 @@ static int gswip_setup(struct dsa_switch gswip_switch_w(priv, BIT(cpu_port), GSWIP_PCE_PMAP2); gswip_switch_w(priv, BIT(cpu_port), GSWIP_PCE_PMAP3);
- /* disable PHY auto polling */ + /* Deactivate MDIO PHY auto polling. Some PHYs as the AR8030 have an + * interoperability problem with this auto polling mechanism because + * their status registers think that the link is in a different state + * than it actually is. For the AR8030 it has the BMSR_ESTATEN bit set + * as well as ESTATUS_1000_TFULL and ESTATUS_1000_XFULL. This makes the + * auto polling state machine consider the link being negotiated with + * 1Gbit/s. Since the PHY itself is a Fast Ethernet RMII PHY this leads + * to the switch port being completely dead (RX and TX are both not + * working). + * Also with various other PHY / port combinations (PHY11G GPHY, PHY22F + * GPHY, external RGMII PEF7071/7072) any traffic would stop. Sometimes + * it would work fine for a few minutes to hours and then stop, on + * other device it would no traffic could be sent or received at all. + * Testing shows that when PHY auto polling is disabled these problems + * go away. + */ gswip_mdio_w(priv, 0x0, GSWIP_MDIO_MDC_CFG0); + /* Configure the MDIO Clock 2.5 MHz */ gswip_mdio_mask(priv, 0xff, 0x09, GSWIP_MDIO_MDC_CFG1);
- for (i = 0; i < priv->hw_info->max_ports; i++) { - /* Disable the xMII link */ + /* Disable the xMII link */ + for (i = 0; i < priv->hw_info->max_ports; i++) gswip_mii_mask_cfg(priv, GSWIP_MII_CFG_EN, 0, i);
- /* Automatically select the xMII interface clock */ - gswip_mii_mask_cfg(priv, GSWIP_MII_CFG_RATE_MASK, - GSWIP_MII_CFG_RATE_AUTO, i); - } - /* enable special tag insertion on cpu port */ gswip_switch_mask(priv, 0, GSWIP_FDMA_PCTRL_STEN, GSWIP_FDMA_PCTRLp(cpu_port)); @@ -1469,6 +1486,112 @@ unsupported: return; }
+static void gswip_port_set_link(struct gswip_priv *priv, int port, bool link) +{ + u32 mdio_phy; + + if (link) + mdio_phy = GSWIP_MDIO_PHY_LINK_UP; + else + mdio_phy = GSWIP_MDIO_PHY_LINK_DOWN; + + gswip_mdio_mask(priv, GSWIP_MDIO_PHY_LINK_MASK, mdio_phy, + GSWIP_MDIO_PHYp(port)); +} + +static void gswip_port_set_speed(struct gswip_priv *priv, int port, int speed, + phy_interface_t interface) +{ + u32 mdio_phy = 0, mii_cfg = 0, mac_ctrl_0 = 0; + + switch (speed) { + case SPEED_10: + mdio_phy = GSWIP_MDIO_PHY_SPEED_M10; + + if (interface == PHY_INTERFACE_MODE_RMII) + mii_cfg = GSWIP_MII_CFG_RATE_M50; + else + mii_cfg = GSWIP_MII_CFG_RATE_M2P5; + + mac_ctrl_0 = GSWIP_MAC_CTRL_0_GMII_MII; + break; + + case SPEED_100: + mdio_phy = GSWIP_MDIO_PHY_SPEED_M100; + + if (interface == PHY_INTERFACE_MODE_RMII) + mii_cfg = GSWIP_MII_CFG_RATE_M50; + else + mii_cfg = GSWIP_MII_CFG_RATE_M25; + + mac_ctrl_0 = GSWIP_MAC_CTRL_0_GMII_MII; + break; + + case SPEED_1000: + mdio_phy = GSWIP_MDIO_PHY_SPEED_G1; + + mii_cfg = GSWIP_MII_CFG_RATE_M125; + + mac_ctrl_0 = GSWIP_MAC_CTRL_0_GMII_RGMII; + break; + } + + gswip_mdio_mask(priv, GSWIP_MDIO_PHY_SPEED_MASK, mdio_phy, + GSWIP_MDIO_PHYp(port)); + gswip_mii_mask_cfg(priv, GSWIP_MII_CFG_RATE_MASK, mii_cfg, port); + gswip_switch_mask(priv, GSWIP_MAC_CTRL_0_GMII_MASK, mac_ctrl_0, + GSWIP_MAC_CTRL_0p(port)); +} + +static void gswip_port_set_duplex(struct gswip_priv *priv, int port, int duplex) +{ + u32 mac_ctrl_0, mdio_phy; + + if (duplex == DUPLEX_FULL) { + mac_ctrl_0 = GSWIP_MAC_CTRL_0_FDUP_EN; + mdio_phy = GSWIP_MDIO_PHY_FDUP_EN; + } else { + mac_ctrl_0 = GSWIP_MAC_CTRL_0_FDUP_DIS; + mdio_phy = GSWIP_MDIO_PHY_FDUP_DIS; + } + + gswip_switch_mask(priv, GSWIP_MAC_CTRL_0_FDUP_MASK, mac_ctrl_0, + GSWIP_MAC_CTRL_0p(port)); + gswip_mdio_mask(priv, GSWIP_MDIO_PHY_FDUP_MASK, mdio_phy, + GSWIP_MDIO_PHYp(port)); +} + +static void gswip_port_set_pause(struct gswip_priv *priv, int port, + bool tx_pause, bool rx_pause) +{ + u32 mac_ctrl_0, mdio_phy; + + if (tx_pause && rx_pause) { + mac_ctrl_0 = GSWIP_MAC_CTRL_0_FCON_RXTX; + mdio_phy = GSWIP_MDIO_PHY_FCONTX_EN | + GSWIP_MDIO_PHY_FCONRX_EN; + } else if (tx_pause) { + mac_ctrl_0 = GSWIP_MAC_CTRL_0_FCON_TX; + mdio_phy = GSWIP_MDIO_PHY_FCONTX_EN | + GSWIP_MDIO_PHY_FCONRX_DIS; + } else if (rx_pause) { + mac_ctrl_0 = GSWIP_MAC_CTRL_0_FCON_RX; + mdio_phy = GSWIP_MDIO_PHY_FCONTX_DIS | + GSWIP_MDIO_PHY_FCONRX_EN; + } else { + mac_ctrl_0 = GSWIP_MAC_CTRL_0_FCON_NONE; + mdio_phy = GSWIP_MDIO_PHY_FCONTX_DIS | + GSWIP_MDIO_PHY_FCONRX_DIS; + } + + gswip_switch_mask(priv, GSWIP_MAC_CTRL_0_FCON_MASK, + mac_ctrl_0, GSWIP_MAC_CTRL_0p(port)); + gswip_mdio_mask(priv, + GSWIP_MDIO_PHY_FCONTX_MASK | + GSWIP_MDIO_PHY_FCONRX_MASK, + mdio_phy, GSWIP_MDIO_PHYp(port)); +} + static void gswip_phylink_mac_config(struct dsa_switch *ds, int port, unsigned int mode, const struct phylink_link_state *state) @@ -1525,6 +1648,9 @@ static void gswip_phylink_mac_link_down( struct gswip_priv *priv = ds->priv;
gswip_mii_mask_cfg(priv, GSWIP_MII_CFG_EN, 0, port); + + if (!dsa_is_cpu_port(ds, port)) + gswip_port_set_link(priv, port, false); }
static void gswip_phylink_mac_link_up(struct dsa_switch *ds, int port, @@ -1536,6 +1662,13 @@ static void gswip_phylink_mac_link_up(st { struct gswip_priv *priv = ds->priv;
+ if (!dsa_is_cpu_port(ds, port)) { + gswip_port_set_link(priv, port, true); + gswip_port_set_speed(priv, port, speed, interface); + gswip_port_set_duplex(priv, port, duplex); + gswip_port_set_pause(priv, port, tx_pause, rx_pause); + } + gswip_mii_mask_cfg(priv, 0, GSWIP_MII_CFG_EN, port); }
From: Martin Blumenstingl martin.blumenstingl@googlemail.com
commit 4b5923249b8fa427943b50b8f35265176472be38 upstream.
There are a few more bits in the GSWIP_MII_CFG register for which we did rely on the boot-loader (or the hardware defaults) to set them up properly.
For some external RMII PHYs we need to select the GSWIP_MII_CFG_RMII_CLK bit and also we should un-set it for non-RMII PHYs. The GSWIP_MII_CFG_RMII_CLK bit is ignored for other PHY connection modes.
The GSWIP IP also supports in-band auto-negotiation for RGMII PHYs when the GSWIP_MII_CFG_RGMII_IBS bit is set. Clear this bit always as there's no known hardware which uses this (so it is not tested yet).
Clear the xMII isolation bit when set at initialization time if it was previously set by the bootloader. Not doing so could lead to no traffic (neither RX nor TX) on a port with this bit set.
While here, also add the GSWIP_MII_CFG_RESET bit. We don't need to manage it because this bit is self-clearning when set. We still add it here to get a better overview of the GSWIP_MII_CFG register.
Fixes: 14fceff4771e51 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200") Cc: stable@vger.kernel.org Suggested-by: Hauke Mehrtens hauke@hauke-m.de Acked-by: Hauke Mehrtens hauke@hauke-m.de Signed-off-by: Martin Blumenstingl martin.blumenstingl@googlemail.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/lantiq_gswip.c | 19 ++++++++++++++++--- 1 file changed, 16 insertions(+), 3 deletions(-)
--- a/drivers/net/dsa/lantiq_gswip.c +++ b/drivers/net/dsa/lantiq_gswip.c @@ -93,8 +93,12 @@
/* GSWIP MII Registers */ #define GSWIP_MII_CFGp(p) (0x2 * (p)) +#define GSWIP_MII_CFG_RESET BIT(15) #define GSWIP_MII_CFG_EN BIT(14) +#define GSWIP_MII_CFG_ISOLATE BIT(13) #define GSWIP_MII_CFG_LDCLKDIS BIT(12) +#define GSWIP_MII_CFG_RGMII_IBS BIT(8) +#define GSWIP_MII_CFG_RMII_CLK BIT(7) #define GSWIP_MII_CFG_MODE_MIIP 0x0 #define GSWIP_MII_CFG_MODE_MIIM 0x1 #define GSWIP_MII_CFG_MODE_RMIIP 0x2 @@ -833,9 +837,11 @@ static int gswip_setup(struct dsa_switch /* Configure the MDIO Clock 2.5 MHz */ gswip_mdio_mask(priv, 0xff, 0x09, GSWIP_MDIO_MDC_CFG1);
- /* Disable the xMII link */ + /* Disable the xMII interface and clear it's isolation bit */ for (i = 0; i < priv->hw_info->max_ports; i++) - gswip_mii_mask_cfg(priv, GSWIP_MII_CFG_EN, 0, i); + gswip_mii_mask_cfg(priv, + GSWIP_MII_CFG_EN | GSWIP_MII_CFG_ISOLATE, + 0, i);
/* enable special tag insertion on cpu port */ gswip_switch_mask(priv, 0, GSWIP_FDMA_PCTRL_STEN, @@ -1611,6 +1617,9 @@ static void gswip_phylink_mac_config(str break; case PHY_INTERFACE_MODE_RMII: miicfg |= GSWIP_MII_CFG_MODE_RMIIM; + + /* Configure the RMII clock as output: */ + miicfg |= GSWIP_MII_CFG_RMII_CLK; break; case PHY_INTERFACE_MODE_RGMII: case PHY_INTERFACE_MODE_RGMII_ID: @@ -1623,7 +1632,11 @@ static void gswip_phylink_mac_config(str "Unsupported interface: %d\n", state->interface); return; } - gswip_mii_mask_cfg(priv, GSWIP_MII_CFG_MODE_MASK, miicfg, port); + + gswip_mii_mask_cfg(priv, + GSWIP_MII_CFG_MODE_MASK | GSWIP_MII_CFG_RMII_CLK | + GSWIP_MII_CFG_RGMII_IBS | GSWIP_MII_CFG_LDCLKDIS, + miicfg, port);
switch (state->interface) { case PHY_INTERFACE_MODE_RGMII_ID:
From: Takashi Iwai tiwai@suse.de
commit b6a37a93c9ac3900987c79b726d0bb3699d8db4e upstream.
intel_dsm_platform_mux_info() tries to parse the ACPI package data from _DSM for the debug information, but it assumes the fixed format without checking what values are stored in the elements actually. When an unexpected value is returned from BIOS, it may lead to GPF or NULL dereference, as reported recently.
Add the checks of the contents in the returned values and skip the values for invalid cases.
v1->v2: Check the info contents before dereferencing, too
BugLink: http://bugzilla.opensuse.org/show_bug.cgi?id=1184074 Cc: stable@vger.kernel.org Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20210402082317.871-1-tiwai@sus... (cherry picked from commit 337d7a1621c7f02af867229990ac67c97da1b53a) Signed-off-by: Rodrigo Vivi rodrigo.vivi@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/display/intel_acpi.c | 22 ++++++++++++++++++++-- 1 file changed, 20 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/i915/display/intel_acpi.c +++ b/drivers/gpu/drm/i915/display/intel_acpi.c @@ -84,13 +84,31 @@ static void intel_dsm_platform_mux_info( return; }
+ if (!pkg->package.count) { + DRM_DEBUG_DRIVER("no connection in _DSM\n"); + return; + } + connector_count = &pkg->package.elements[0]; DRM_DEBUG_DRIVER("MUX info connectors: %lld\n", (unsigned long long)connector_count->integer.value); for (i = 1; i < pkg->package.count; i++) { union acpi_object *obj = &pkg->package.elements[i]; - union acpi_object *connector_id = &obj->package.elements[0]; - union acpi_object *info = &obj->package.elements[1]; + union acpi_object *connector_id; + union acpi_object *info; + + if (obj->type != ACPI_TYPE_PACKAGE || obj->package.count < 2) { + DRM_DEBUG_DRIVER("Invalid object for MUX #%d\n", i); + continue; + } + + connector_id = &obj->package.elements[0]; + info = &obj->package.elements[1]; + if (info->type != ACPI_TYPE_BUFFER || info->buffer.length < 4) { + DRM_DEBUG_DRIVER("Invalid info for MUX obj #%d\n", i); + continue; + } + DRM_DEBUG_DRIVER("Connector id: 0x%016llx\n", (unsigned long long)connector_id->integer.value); DRM_DEBUG_DRIVER(" port id: %s\n",
From: Vitaly Kuznetsov vkuznets@redhat.com
commit fa26d0c778b432d3d9814ea82552e813b33eeb5c upstream.
Commit 8cdddd182bd7 ("ACPI: processor: Fix CPU0 wakeup in acpi_idle_play_dead()") tried to fix CPU0 hotplug breakage by copying wakeup_cpu0() + start_cpu0() logic from hlt_play_dead()//mwait_play_dead() into acpi_idle_play_dead(). The problem is that these functions are not exported to modules so when CONFIG_ACPI_PROCESSOR=m build fails.
The issue could've been fixed by exporting both wakeup_cpu0()/start_cpu0() (the later from assembly) but it seems putting the whole pattern into a new function and exporting it instead is better.
Reported-by: kernel test robot lkp@intel.com Fixes: 8cdddd182bd7 ("CPI: processor: Fix CPU0 wakeup in acpi_idle_play_dead()") Cc: stable@vger.kernel.org # 5.10+ Signed-off-by: Vitaly Kuznetsov vkuznets@redhat.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/smp.h | 2 +- arch/x86/kernel/smpboot.c | 26 ++++++++++++-------------- drivers/acpi/processor_idle.c | 4 +--- 3 files changed, 14 insertions(+), 18 deletions(-)
--- a/arch/x86/include/asm/smp.h +++ b/arch/x86/include/asm/smp.h @@ -132,7 +132,7 @@ void native_play_dead(void); void play_dead_common(void); void wbinvd_on_cpu(int cpu); int wbinvd_on_all_cpus(void); -bool wakeup_cpu0(void); +void cond_wakeup_cpu0(void);
void native_smp_send_reschedule(int cpu); void native_send_call_func_ipi(const struct cpumask *mask); --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -1655,13 +1655,17 @@ void play_dead_common(void) local_irq_disable(); }
-bool wakeup_cpu0(void) +/** + * cond_wakeup_cpu0 - Wake up CPU0 if needed. + * + * If NMI wants to wake up CPU0, start CPU0. + */ +void cond_wakeup_cpu0(void) { if (smp_processor_id() == 0 && enable_start_cpu0) - return true; - - return false; + start_cpu0(); } +EXPORT_SYMBOL_GPL(cond_wakeup_cpu0);
/* * We need to flush the caches before going to sleep, lest we have @@ -1730,11 +1734,8 @@ static inline void mwait_play_dead(void) __monitor(mwait_ptr, 0, 0); mb(); __mwait(eax, 0); - /* - * If NMI wants to wake up CPU0, start CPU0. - */ - if (wakeup_cpu0()) - start_cpu0(); + + cond_wakeup_cpu0(); } }
@@ -1745,11 +1746,8 @@ void hlt_play_dead(void)
while (1) { native_halt(); - /* - * If NMI wants to wake up CPU0, start CPU0. - */ - if (wakeup_cpu0()) - start_cpu0(); + + cond_wakeup_cpu0(); } }
--- a/drivers/acpi/processor_idle.c +++ b/drivers/acpi/processor_idle.c @@ -545,9 +545,7 @@ static int acpi_idle_play_dead(struct cp return -ENODEV;
#if defined(CONFIG_X86) && defined(CONFIG_HOTPLUG_CPU) - /* If NMI wants to wake up CPU0, start CPU0. */ - if (wakeup_cpu0()) - start_cpu0(); + cond_wakeup_cpu0(); #endif }
From: Mike Marciniszyn mike.marciniszyn@cornelisnetworks.com
commit 5de61a47eb9064cbbc5f3360d639e8e34a690a54 upstream.
A panic can result when AIP is enabled:
BUG: unable to handle kernel NULL pointer dereference at 000000000000000 PGD 0 P4D 0 Oops: 0000 1 SMP PTI CPU: 70 PID: 981 Comm: systemd-udevd Tainted: G OE --------- - - 4.18.0-240.el8.x86_64 #1 Hardware name: Intel Corporation S2600KP/S2600KP, BIOS SE5C610.86B.01.01.0005.101720141054 10/17/2014 RIP: 0010:__bitmap_and+0x1b/0x70 RSP: 0018:ffff99aa0845f9f0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff8d5a6fc18000 RCX: 0000000000000048 RDX: 0000000000000000 RSI: ffffffffc06336f0 RDI: ffff8d5a8fa67750 RBP: 0000000000000079 R08: 0000000fffffffff R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: ffffffffc06336f0 R13: 00000000000000a0 R14: ffff8d5a6fc18000 R15: 0000000000000003 FS: 00007fec137a5980(0000) GS:ffff8d5a9fa80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000a04b48002 CR4: 00000000001606e0 Call Trace: hfi1_num_netdev_contexts+0x7c/0x110 [hfi1] hfi1_init_dd+0xd7f/0x1a90 [hfi1] ? pci_bus_read_config_dword+0x49/0x70 ? pci_mmcfg_read+0x3e/0xe0 do_init_one.isra.18+0x336/0x640 [hfi1] local_pci_probe+0x41/0x90 pci_device_probe+0x105/0x1c0 really_probe+0x212/0x440 driver_probe_device+0x49/0xc0 device_driver_attach+0x50/0x60 __driver_attach+0x61/0x130 ? device_driver_attach+0x60/0x60 bus_for_each_dev+0x77/0xc0 ? klist_add_tail+0x3b/0x70 bus_add_driver+0x14d/0x1e0 ? dev_init+0x10b/0x10b [hfi1] driver_register+0x6b/0xb0 ? dev_init+0x10b/0x10b [hfi1] hfi1_mod_init+0x1e6/0x20a [hfi1] do_one_initcall+0x46/0x1c3 ? free_unref_page_commit+0x91/0x100 ? _cond_resched+0x15/0x30 ? kmem_cache_alloc_trace+0x140/0x1c0 do_init_module+0x5a/0x220 load_module+0x14b4/0x17e0 ? __do_sys_finit_module+0xa8/0x110 __do_sys_finit_module+0xa8/0x110 do_syscall_64+0x5b/0x1a0
The issue happens when pcibus_to_node() returns NO_NUMA_NODE.
Fix this issue by moving the initialization of dd->node to hfi1_devdata allocation and remove the other pcibus_to_node() calls in the probe path and use dd->node instead.
Affinity logic is adjusted to use a new field dd->affinity_entry as a guard instead of dd->node.
Fixes: 4730f4a6c6b2 ("IB/hfi1: Activate the dummy netdev") Link: https://lore.kernel.org/r/1617025700-31865-4-git-send-email-dennis.dalessand... Cc: stable@vger.kernel.org Signed-off-by: Mike Marciniszyn mike.marciniszyn@cornelisnetworks.com Signed-off-by: Dennis Dalessandro dennis.dalessandro@cornelisnetworks.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/infiniband/hw/hfi1/affinity.c | 21 +++++---------------- drivers/infiniband/hw/hfi1/hfi.h | 1 + drivers/infiniband/hw/hfi1/init.c | 10 +++++++++- drivers/infiniband/hw/hfi1/netdev_rx.c | 3 +-- 4 files changed, 16 insertions(+), 19 deletions(-)
--- a/drivers/infiniband/hw/hfi1/affinity.c +++ b/drivers/infiniband/hw/hfi1/affinity.c @@ -632,22 +632,11 @@ static void _dev_comp_vect_cpu_mask_clea */ int hfi1_dev_affinity_init(struct hfi1_devdata *dd) { - int node = pcibus_to_node(dd->pcidev->bus); struct hfi1_affinity_node *entry; const struct cpumask *local_mask; int curr_cpu, possible, i, ret; bool new_entry = false;
- /* - * If the BIOS does not have the NUMA node information set, select - * NUMA 0 so we get consistent performance. - */ - if (node < 0) { - dd_dev_err(dd, "Invalid PCI NUMA node. Performance may be affected\n"); - node = 0; - } - dd->node = node; - local_mask = cpumask_of_node(dd->node); if (cpumask_first(local_mask) >= nr_cpu_ids) local_mask = topology_core_cpumask(0); @@ -660,7 +649,7 @@ int hfi1_dev_affinity_init(struct hfi1_d * create an entry in the global affinity structure and initialize it. */ if (!entry) { - entry = node_affinity_allocate(node); + entry = node_affinity_allocate(dd->node); if (!entry) { dd_dev_err(dd, "Unable to allocate global affinity node\n"); @@ -751,6 +740,7 @@ int hfi1_dev_affinity_init(struct hfi1_d if (new_entry) node_affinity_add_tail(entry);
+ dd->affinity_entry = entry; mutex_unlock(&node_affinity.lock);
return 0; @@ -766,10 +756,9 @@ void hfi1_dev_affinity_clean_up(struct h { struct hfi1_affinity_node *entry;
- if (dd->node < 0) - return; - mutex_lock(&node_affinity.lock); + if (!dd->affinity_entry) + goto unlock; entry = node_affinity_lookup(dd->node); if (!entry) goto unlock; @@ -780,8 +769,8 @@ void hfi1_dev_affinity_clean_up(struct h */ _dev_comp_vect_cpu_mask_clean_up(dd, entry); unlock: + dd->affinity_entry = NULL; mutex_unlock(&node_affinity.lock); - dd->node = NUMA_NO_NODE; }
/* --- a/drivers/infiniband/hw/hfi1/hfi.h +++ b/drivers/infiniband/hw/hfi1/hfi.h @@ -1409,6 +1409,7 @@ struct hfi1_devdata { spinlock_t irq_src_lock; int vnic_num_vports; struct net_device *dummy_netdev; + struct hfi1_affinity_node *affinity_entry;
/* Keeps track of IPoIB RSM rule users */ atomic_t ipoib_rsm_usr_num; --- a/drivers/infiniband/hw/hfi1/init.c +++ b/drivers/infiniband/hw/hfi1/init.c @@ -1277,7 +1277,6 @@ static struct hfi1_devdata *hfi1_alloc_d dd->pport = (struct hfi1_pportdata *)(dd + 1); dd->pcidev = pdev; pci_set_drvdata(pdev, dd); - dd->node = NUMA_NO_NODE;
ret = xa_alloc_irq(&hfi1_dev_table, &dd->unit, dd, xa_limit_32b, GFP_KERNEL); @@ -1287,6 +1286,15 @@ static struct hfi1_devdata *hfi1_alloc_d goto bail; } rvt_set_ibdev_name(&dd->verbs_dev.rdi, "%s_%d", class_name(), dd->unit); + /* + * If the BIOS does not have the NUMA node information set, select + * NUMA 0 so we get consistent performance. + */ + dd->node = pcibus_to_node(pdev->bus); + if (dd->node == NUMA_NO_NODE) { + dd_dev_err(dd, "Invalid PCI NUMA node. Performance may be affected\n"); + dd->node = 0; + }
/* * Initialize all locks for the device. This needs to be as early as --- a/drivers/infiniband/hw/hfi1/netdev_rx.c +++ b/drivers/infiniband/hw/hfi1/netdev_rx.c @@ -173,8 +173,7 @@ u32 hfi1_num_netdev_contexts(struct hfi1 return 0; }
- cpumask_and(node_cpu_mask, cpu_mask, - cpumask_of_node(pcibus_to_node(dd->pcidev->bus))); + cpumask_and(node_cpu_mask, cpu_mask, cpumask_of_node(dd->node));
available_cpus = cpumask_weight(node_cpu_mask);
From: Al Viro viro@zeniv.linux.org.uk
commit 4f0ed93fb92d3528c73c80317509df3f800a222b upstream.
That (and traversals in case of umount .) should be done before complete_walk(). Either a braino or mismerge damage on queue reorders - either way, I should've spotted that much earlier.
Fucked-up-by: Al Viro viro@zeniv.linux.org.uk X-Paperbag: Brown Fixes: 161aff1d93ab "LOOKUP_MOUNTPOINT: fold path_mountpointat() into path_lookupat()" Cc: stable@vger.kernel.org # v5.7+ Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/namei.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/fs/namei.c +++ b/fs/namei.c @@ -2328,16 +2328,16 @@ static int path_lookupat(struct nameidat while (!(err = link_path_walk(s, nd)) && (s = lookup_last(nd)) != NULL) ; + if (!err && unlikely(nd->flags & LOOKUP_MOUNTPOINT)) { + err = handle_lookup_down(nd); + nd->flags &= ~LOOKUP_JUMPED; // no d_weak_revalidate(), please... + } if (!err) err = complete_walk(nd);
if (!err && nd->flags & LOOKUP_DIRECTORY) if (!d_can_lookup(nd->path.dentry)) err = -ENOTDIR; - if (!err && unlikely(nd->flags & LOOKUP_MOUNTPOINT)) { - err = handle_lookup_down(nd); - nd->flags &= ~LOOKUP_JUMPED; // no d_weak_revalidate(), please... - } if (!err) { *path = nd->path; nd->path.mnt = NULL;
From: Nick Desaulniers ndesaulniers@google.com
commit 9562fd132985ea9185388a112e50f2a51557827d upstream.
LLVM changed the expected function signature for llvm_gcda_emit_function() in the clang-11 release. Users of clang-11 or newer may have noticed their kernels producing invalid coverage information:
$ llvm-cov gcov -a -c -u -f -b <input>.gcda -- gcno=<input>.gcno 1 <func>: checksum mismatch, \ (<lineno chksum A>, <cfg chksum B>) != (<lineno chksum A>, <cfg chksum C>) 2 Invalid .gcda File! ...
Fix up the function signatures so calling this function interprets its parameters correctly and computes the correct cfg checksum. In particular, in clang-11, the additional checksum is no longer optional.
Link: https://reviews.llvm.org/rG25544ce2df0daa4304c07e64b9c8b0f7df60c11d Link: https://lkml.kernel.org/r/20210408184631.1156669-1-ndesaulniers@google.com Reported-by: Prasad Sodagudi psodagud@quicinc.com Tested-by: Prasad Sodagudi psodagud@quicinc.com Signed-off-by: Nick Desaulniers ndesaulniers@google.com Reviewed-by: Nathan Chancellor nathan@kernel.org Cc: stable@vger.kernel.org [5.4+] Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org --- kernel/gcov/clang.c | 29 +++++++++++++++++++---------- 1 file changed, 19 insertions(+), 10 deletions(-)
--- a/kernel/gcov/clang.c +++ b/kernel/gcov/clang.c @@ -70,7 +70,9 @@ struct gcov_fn_info {
u32 ident; u32 checksum; +#if CONFIG_CLANG_VERSION < 110000 u8 use_extra_checksum; +#endif u32 cfg_checksum;
u32 num_counters; @@ -145,10 +147,8 @@ void llvm_gcda_emit_function(u32 ident,
list_add_tail(&info->head, ¤t_info->functions); } -EXPORT_SYMBOL(llvm_gcda_emit_function); #else -void llvm_gcda_emit_function(u32 ident, u32 func_checksum, - u8 use_extra_checksum, u32 cfg_checksum) +void llvm_gcda_emit_function(u32 ident, u32 func_checksum, u32 cfg_checksum) { struct gcov_fn_info *info = kzalloc(sizeof(*info), GFP_KERNEL);
@@ -158,12 +158,11 @@ void llvm_gcda_emit_function(u32 ident, INIT_LIST_HEAD(&info->head); info->ident = ident; info->checksum = func_checksum; - info->use_extra_checksum = use_extra_checksum; info->cfg_checksum = cfg_checksum; list_add_tail(&info->head, ¤t_info->functions); } -EXPORT_SYMBOL(llvm_gcda_emit_function); #endif +EXPORT_SYMBOL(llvm_gcda_emit_function);
void llvm_gcda_emit_arcs(u32 num_counters, u64 *counters) { @@ -293,11 +292,16 @@ int gcov_info_is_compatible(struct gcov_ !list_is_last(&fn_ptr2->head, &info2->functions)) { if (fn_ptr1->checksum != fn_ptr2->checksum) return false; +#if CONFIG_CLANG_VERSION < 110000 if (fn_ptr1->use_extra_checksum != fn_ptr2->use_extra_checksum) return false; if (fn_ptr1->use_extra_checksum && fn_ptr1->cfg_checksum != fn_ptr2->cfg_checksum) return false; +#else + if (fn_ptr1->cfg_checksum != fn_ptr2->cfg_checksum) + return false; +#endif fn_ptr1 = list_next_entry(fn_ptr1, head); fn_ptr2 = list_next_entry(fn_ptr2, head); } @@ -529,17 +533,22 @@ static size_t convert_to_gcda(char *buff
list_for_each_entry(fi_ptr, &info->functions, head) { u32 i; - u32 len = 2; - - if (fi_ptr->use_extra_checksum) - len++;
pos += store_gcov_u32(buffer, pos, GCOV_TAG_FUNCTION); - pos += store_gcov_u32(buffer, pos, len); +#if CONFIG_CLANG_VERSION < 110000 + pos += store_gcov_u32(buffer, pos, + fi_ptr->use_extra_checksum ? 3 : 2); +#else + pos += store_gcov_u32(buffer, pos, 3); +#endif pos += store_gcov_u32(buffer, pos, fi_ptr->ident); pos += store_gcov_u32(buffer, pos, fi_ptr->checksum); +#if CONFIG_CLANG_VERSION < 110000 if (fi_ptr->use_extra_checksum) pos += store_gcov_u32(buffer, pos, fi_ptr->cfg_checksum); +#else + pos += store_gcov_u32(buffer, pos, fi_ptr->cfg_checksum); +#endif
pos += store_gcov_u32(buffer, pos, GCOV_TAG_COUNTER_BASE); pos += store_gcov_u32(buffer, pos, fi_ptr->num_counters * 2);
From: Sergei Trofimovich slyfox@gentoo.org
commit 7ad1e366167837daeb93d0bacb57dee820b0b898 upstream.
ia64 has two stacks:
- memory stack (or stack), pointed at by by r12
- register backing store (register stack), pointed at by ar.bsp/ar.bspstore with complications around dirty register frame on CPU.
In [1] Dmitry noticed that PTRACE_GET_SYSCALL_INFO returns the register stack instead memory stack.
The bug comes from the fact that user_stack_pointer() and current_user_stack_pointer() don't return the same register:
ulong user_stack_pointer(struct pt_regs *regs) { return regs->ar_bspstore; } #define current_user_stack_pointer() (current_pt_regs()->r12)
The change gets both back in sync.
I think ptrace(PTRACE_GET_SYSCALL_INFO) is the only affected user by this bug on ia64.
The change fixes 'rt_sigreturn.gen.test' strace test where it was observed initially.
Link: https://bugs.gentoo.org/769614 [1] Link: https://lkml.kernel.org/r/20210331084447.2561532-1-slyfox@gentoo.org Signed-off-by: Sergei Trofimovich slyfox@gentoo.org Reported-by: Dmitry V. Levin ldv@altlinux.org Cc: Oleg Nesterov oleg@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/ia64/include/asm/ptrace.h | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-)
--- a/arch/ia64/include/asm/ptrace.h +++ b/arch/ia64/include/asm/ptrace.h @@ -54,8 +54,7 @@
static inline unsigned long user_stack_pointer(struct pt_regs *regs) { - /* FIXME: should this be bspstore + nr_dirty regs? */ - return regs->ar_bspstore; + return regs->r12; }
static inline int is_syscall_success(struct pt_regs *regs) @@ -79,11 +78,6 @@ static inline long regs_return_value(str unsigned long __ip = instruction_pointer(regs); \ (__ip & ~3UL) + ((__ip & 3UL) << 2); \ }) -/* - * Why not default? Because user_stack_pointer() on ia64 gives register - * stack backing store instead... - */ -#define current_user_stack_pointer() (current_pt_regs()->r12)
/* given a pointer to a task_struct, return the user's pt_regs */ # define task_pt_regs(t) (((struct pt_regs *) ((char *) (t) + IA64_STK_OFFSET)) - 1)
From: Mike Rapoport rppt@linux.ibm.com
commit a3a8833dffb7e7329c2586b8bfc531adb503f123 upstream.
Commit cb9f753a3731 ("mm: fix races between swapoff and flush dcache") updated flush_dcache_page implementations on several architectures to use page_mapping_file() in order to avoid races between page_mapping() and swapoff().
This update missed arch/nds32 and there is a possibility of a race there.
Replace page_mapping() with page_mapping_file() in nds32 implementation of flush_dcache_page().
Link: https://lkml.kernel.org/r/20210330175126.26500-1-rppt@kernel.org Fixes: cb9f753a3731 ("mm: fix races between swapoff and flush dcache") Signed-off-by: Mike Rapoport rppt@linux.ibm.com Reviewed-by: Matthew Wilcox (Oracle) willy@infradead.org Acked-by: Greentime Hu green.hu@gmail.com Cc: Huang Ying ying.huang@intel.com Cc: Nick Hu nickhu@andestech.com Cc: Vincent Chen deanbo422@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/nds32/mm/cacheflush.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/nds32/mm/cacheflush.c +++ b/arch/nds32/mm/cacheflush.c @@ -238,7 +238,7 @@ void flush_dcache_page(struct page *page { struct address_space *mapping;
- mapping = page_mapping(page); + mapping = page_mapping_file(page); if (mapping && !mapping_mapped(mapping)) set_bit(PG_dcache_dirty, &page->flags); else {
From: Wengang Wang wen.gang.wang@oracle.com
commit 90bd070aae6c4fb5d302f9c4b9c88be60c8197ec upstream.
The following deadlock is detected:
truncate -> setattr path is waiting for pending direct IO to be done (inode->i_dio_count become zero) with inode->i_rwsem held (down_write).
PID: 14827 TASK: ffff881686a9af80 CPU: 20 COMMAND: "ora_p005_hrltd9" #0 __schedule at ffffffff818667cc #1 schedule at ffffffff81866de6 #2 inode_dio_wait at ffffffff812a2d04 #3 ocfs2_setattr at ffffffffc05f322e [ocfs2] #4 notify_change at ffffffff812a5a09 #5 do_truncate at ffffffff812808f5 #6 do_sys_ftruncate.constprop.18 at ffffffff81280cf2 #7 sys_ftruncate at ffffffff81280d8e #8 do_syscall_64 at ffffffff81003949 #9 entry_SYSCALL_64_after_hwframe at ffffffff81a001ad
dio completion path is going to complete one direct IO (decrement inode->i_dio_count), but before that it hung at locking inode->i_rwsem:
#0 __schedule+700 at ffffffff818667cc #1 schedule+54 at ffffffff81866de6 #2 rwsem_down_write_failed+536 at ffffffff8186aa28 #3 call_rwsem_down_write_failed+23 at ffffffff8185a1b7 #4 down_write+45 at ffffffff81869c9d #5 ocfs2_dio_end_io_write+180 at ffffffffc05d5444 [ocfs2] #6 ocfs2_dio_end_io+85 at ffffffffc05d5a85 [ocfs2] #7 dio_complete+140 at ffffffff812c873c #8 dio_aio_complete_work+25 at ffffffff812c89f9 #9 process_one_work+361 at ffffffff810b1889 #10 worker_thread+77 at ffffffff810b233d #11 kthread+261 at ffffffff810b7fd5 #12 ret_from_fork+62 at ffffffff81a0035e
Thus above forms ABBA deadlock. The same deadlock was mentioned in upstream commit 28f5a8a7c033 ("ocfs2: should wait dio before inode lock in ocfs2_setattr()"). It seems that that commit only removed the cluster lock (the victim of above dead lock) from the ABBA deadlock party.
End-user visible effects: Process hang in truncate -> ocfs2_setattr path and other processes hang at ocfs2_dio_end_io_write path.
This is to fix the deadlock itself. It removes inode_lock() call from dio completion path to remove the deadlock and add ip_alloc_sem lock in setattr path to synchronize the inode modifications.
[wen.gang.wang@oracle.com: remove the "had_alloc_lock" as suggested] Link: https://lkml.kernel.org/r/20210402171344.1605-1-wen.gang.wang@oracle.com
Link: https://lkml.kernel.org/r/20210331203654.3911-1-wen.gang.wang@oracle.com Signed-off-by: Wengang Wang wen.gang.wang@oracle.com Reviewed-by: Joseph Qi joseph.qi@linux.alibaba.com Cc: Mark Fasheh mark@fasheh.com Cc: Joel Becker jlbec@evilplan.org Cc: Junxiao Bi junxiao.bi@oracle.com Cc: Changwei Ge gechangwei@live.cn Cc: Gang He ghe@suse.com Cc: Jun Piao piaojun@huawei.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ocfs2/aops.c | 11 +---------- fs/ocfs2/file.c | 8 ++++++-- 2 files changed, 7 insertions(+), 12 deletions(-)
--- a/fs/ocfs2/aops.c +++ b/fs/ocfs2/aops.c @@ -2295,7 +2295,7 @@ static int ocfs2_dio_end_io_write(struct struct ocfs2_alloc_context *meta_ac = NULL; handle_t *handle = NULL; loff_t end = offset + bytes; - int ret = 0, credits = 0, locked = 0; + int ret = 0, credits = 0;
ocfs2_init_dealloc_ctxt(&dealloc);
@@ -2306,13 +2306,6 @@ static int ocfs2_dio_end_io_write(struct !dwc->dw_orphaned) goto out;
- /* ocfs2_file_write_iter will get i_mutex, so we need not lock if we - * are in that context. */ - if (dwc->dw_writer_pid != task_pid_nr(current)) { - inode_lock(inode); - locked = 1; - } - ret = ocfs2_inode_lock(inode, &di_bh, 1); if (ret < 0) { mlog_errno(ret); @@ -2393,8 +2386,6 @@ out: if (meta_ac) ocfs2_free_alloc_context(meta_ac); ocfs2_run_deallocs(osb, &dealloc); - if (locked) - inode_unlock(inode); ocfs2_dio_free_write_ctx(inode, dwc);
return ret; --- a/fs/ocfs2/file.c +++ b/fs/ocfs2/file.c @@ -1244,22 +1244,24 @@ int ocfs2_setattr(struct dentry *dentry, goto bail_unlock; } } + down_write(&OCFS2_I(inode)->ip_alloc_sem); handle = ocfs2_start_trans(osb, OCFS2_INODE_UPDATE_CREDITS + 2 * ocfs2_quota_trans_credits(sb)); if (IS_ERR(handle)) { status = PTR_ERR(handle); mlog_errno(status); - goto bail_unlock; + goto bail_unlock_alloc; } status = __dquot_transfer(inode, transfer_to); if (status < 0) goto bail_commit; } else { + down_write(&OCFS2_I(inode)->ip_alloc_sem); handle = ocfs2_start_trans(osb, OCFS2_INODE_UPDATE_CREDITS); if (IS_ERR(handle)) { status = PTR_ERR(handle); mlog_errno(status); - goto bail_unlock; + goto bail_unlock_alloc; } }
@@ -1272,6 +1274,8 @@ int ocfs2_setattr(struct dentry *dentry,
bail_commit: ocfs2_commit_trans(osb, handle); +bail_unlock_alloc: + up_write(&OCFS2_I(inode)->ip_alloc_sem); bail_unlock: if (status && inode_locked) { ocfs2_inode_unlock_tracker(inode, 1, &oh, had_lock);
From: Jack Qiu jack.qiu@huawei.com
commit df41872b68601059dd4a84858952dcae58acd331 upstream.
I encountered a hung task issue, but not a performance one. I run DIO on a device (need lba continuous, for example open channel ssd), maybe hungtask in below case:
DIO: Checkpoint: get addr A(at boundary), merge into BIO, no submit because boundary missing flush dirty data(get addr A+1), wait IO(A+1) writeback timeout, because DIO(A) didn't submit get addr A+2 fail, because checkpoint is doing
dio_send_cur_page() may clear sdio->boundary, so prevent it from missing a boundary.
Link: https://lkml.kernel.org/r/20210322042253.38312-1-jack.qiu@huawei.com Fixes: b1058b981272 ("direct-io: submit bio after boundary buffer is added to it") Signed-off-by: Jack Qiu jack.qiu@huawei.com Reviewed-by: Jan Kara jack@suse.cz Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/direct-io.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/fs/direct-io.c +++ b/fs/direct-io.c @@ -810,6 +810,7 @@ submit_page_section(struct dio *dio, str struct buffer_head *map_bh) { int ret = 0; + int boundary = sdio->boundary; /* dio_send_cur_page may clear it */
if (dio->op == REQ_OP_WRITE) { /* @@ -848,10 +849,10 @@ submit_page_section(struct dio *dio, str sdio->cur_page_fs_offset = sdio->block_in_file << sdio->blkbits; out: /* - * If sdio->boundary then we want to schedule the IO now to + * If boundary then we want to schedule the IO now to * avoid metadata seeks. */ - if (sdio->boundary) { + if (boundary) { ret = dio_send_cur_page(dio, sdio, map_bh); if (sdio->bio) dio_bio_submit(dio, sdio);
From: Wong Vee Khee vee.khee.wong@linux.intel.com
commit 63cf32389925e234d166fb1a336b46de7f846003 upstream.
The member 'tx_lpi_timer' is defined with __u32 datatype in the ethtool header file. Hence, we should use ethnl_update_u32() in set_eee ops.
Fixes: fd77be7bd43c ("ethtool: set EEE settings with EEE_SET request") Cc: stable@vger.kernel.org # 5.10.x Cc: Michal Kubecek mkubecek@suse.cz Signed-off-by: Wong Vee Khee vee.khee.wong@linux.intel.com Reviewed-by: Jakub Kicinski kuba@kernel.org Reviewed-by: Michal Kubecek mkubecek@suse.cz Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ethtool/eee.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/net/ethtool/eee.c +++ b/net/ethtool/eee.c @@ -169,8 +169,8 @@ int ethnl_set_eee(struct sk_buff *skb, s ethnl_update_bool32(&eee.eee_enabled, tb[ETHTOOL_A_EEE_ENABLED], &mod); ethnl_update_bool32(&eee.tx_lpi_enabled, tb[ETHTOOL_A_EEE_TX_LPI_ENABLED], &mod); - ethnl_update_bool32(&eee.tx_lpi_timer, tb[ETHTOOL_A_EEE_TX_LPI_TIMER], - &mod); + ethnl_update_u32(&eee.tx_lpi_timer, tb[ETHTOOL_A_EEE_TX_LPI_TIMER], + &mod); ret = 0; if (!mod) goto out_ops;
From: Ilya Lipnitskiy ilya.lipnitskiy@gmail.com
commit d473d32c2fbac2d1d7082c61899cfebd34eb267a upstream.
[<vendor>,]nr-gpios property is used by some GPIO drivers[0] to indicate the number of GPIOs present on a system, not define a GPIO. nr-gpios is not configured by #gpio-cells and can't be parsed along with other "*-gpios" properties.
nr-gpios without the "<vendor>," prefix is not allowed by the DT spec[1], so only add exception for the ",nr-gpios" suffix and let the error message continue being printed for non-compliant implementations.
[0] nr-gpios is referenced in Documentation/devicetree/bindings/gpio: - gpio-adnp.txt - gpio-xgene-sb.txt - gpio-xlp.txt - snps,dw-apb-gpio.yaml
[1] Link: https://github.com/devicetree-org/dt-schema/blob/cb53a16a1eb3e2169ce170c071e...
Fixes errors such as: OF: /palmbus@300000/gpio@600: could not find phandle
Fixes: 7f00be96f125 ("of: property: Add device link support for interrupt-parent, dmas and -gpio(s)") Signed-off-by: Ilya Lipnitskiy ilya.lipnitskiy@gmail.com Cc: Saravana Kannan saravanak@google.com Cc: stable@vger.kernel.org # v5.5+ Link: https://lore.kernel.org/r/20210405222540.18145-1-ilya.lipnitskiy@gmail.com Signed-off-by: Rob Herring robh@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/of/property.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
--- a/drivers/of/property.c +++ b/drivers/of/property.c @@ -1308,7 +1308,16 @@ DEFINE_SIMPLE_PROP(pinctrl7, "pinctrl-7" DEFINE_SIMPLE_PROP(pinctrl8, "pinctrl-8", NULL) DEFINE_SUFFIX_PROP(regulators, "-supply", NULL) DEFINE_SUFFIX_PROP(gpio, "-gpio", "#gpio-cells") -DEFINE_SUFFIX_PROP(gpios, "-gpios", "#gpio-cells") + +static struct device_node *parse_gpios(struct device_node *np, + const char *prop_name, int index) +{ + if (!strcmp_suffix(prop_name, ",nr-gpios")) + return NULL; + + return parse_suffix_prop_cells(np, prop_name, index, "-gpios", + "#gpio-cells"); +}
static struct device_node *parse_iommu_maps(struct device_node *np, const char *prop_name, int index)
From: Helge Deller deller@gmx.de
commit 9054284e8846b0105aad43a4e7174ca29fffbc44 upstream.
Add a dependency to the SBA IOMMU driver to avoid: ERROR: modpost: "sba_list" [drivers/char/agp/parisc-agp.ko] undefined!
Reported-by: kernel test robot lkp@intel.com Cc: stable@vger.kernel.org Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/char/agp/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/char/agp/Kconfig +++ b/drivers/char/agp/Kconfig @@ -125,7 +125,7 @@ config AGP_HP_ZX1
config AGP_PARISC tristate "HP Quicksilver AGP support" - depends on AGP && PARISC && 64BIT + depends on AGP && PARISC && 64BIT && IOMMU_SBA help This option gives you AGP GART support for the HP Quicksilver AGP bus adapter on HP PA-RISC machines (Ok, just on the C8000
From: Gao Xiang hsiangkao@redhat.com
commit 4d752e5af63753ab5140fc282929b98eaa4bd12e upstream.
commit b344d6a83d01 ("parisc: add support for cmpxchg on u8 pointers") can generate a sparse warning ("cast truncates bits from constant value"), which has been reported several times [1] [2] [3].
The original code worked as expected, but anyway, let silence such sparse warning as what others did [4].
[1] https://lore.kernel.org/r/202104061220.nRMBwCXw-lkp@intel.com [2] https://lore.kernel.org/r/202012291914.T5Agcn99-lkp@intel.com [3] https://lore.kernel.org/r/202008210829.KVwn7Xeh%25lkp@intel.com [4] https://lore.kernel.org/r/20210315131512.133720-2-jacopo+renesas@jmondi.org Cc: Liam Beguin liambeguin@gmail.com Cc: Helge Deller deller@gmx.de Cc: stable@vger.kernel.org # v5.8+ Signed-off-by: Gao Xiang hsiangkao@redhat.com Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/parisc/include/asm/cmpxchg.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/parisc/include/asm/cmpxchg.h +++ b/arch/parisc/include/asm/cmpxchg.h @@ -72,7 +72,7 @@ __cmpxchg(volatile void *ptr, unsigned l #endif case 4: return __cmpxchg_u32((unsigned int *)ptr, (unsigned int)old, (unsigned int)new_); - case 1: return __cmpxchg_u8((u8 *)ptr, (u8)old, (u8)new_); + case 1: return __cmpxchg_u8((u8 *)ptr, old & 0xff, new_ & 0xff); } __cmpxchg_called_with_bad_pointer(); return old;
From: Marek Behún kabel@kernel.org
commit a26c56ae67fa9fbb45a8a232dcd7ebaa7af16086 upstream.
Use the `marvell,reg-init` DT property to configure the LED[2]/INTn pin of the Marvell 88E1514 ethernet PHY on Turris Omnia into interrupt mode.
Without this the pin is by default in LED[2] mode, and the Marvell PHY driver configures LED[2] into "On - Link, Blink - Activity" mode.
This fixes the issue where the pca9538 GPIO/interrupt controller (which can't mask interrupts in HW) received too many interrupts and after a time started ignoring the interrupt with error message: IRQ 71: nobody cared
There is a work in progress to have the Marvell PHY driver support parsing PHY LED nodes from OF and registering the LEDs as Linux LED class devices. Once this is done the PHY driver can also automatically set the pin into INTn mode if it does not find LED[2] in OF.
Until then, though, we fix this via `marvell,reg-init` DT property.
Signed-off-by: Marek Behún kabel@kernel.org Reported-by: Rui Salvaterra rsalvaterra@gmail.com Fixes: 26ca8b52d6e1 ("ARM: dts: add support for Turris Omnia") Cc: Uwe Kleine-König uwe@kleine-koenig.org Cc: linux-arm-kernel@lists.infradead.org Cc: Andrew Lunn andrew@lunn.ch Cc: Gregory CLEMENT gregory.clement@bootlin.com Cc: stable@vger.kernel.org Tested-by: Rui Salvaterra rsalvaterra@gmail.com Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: Gregory CLEMENT gregory.clement@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/boot/dts/armada-385-turris-omnia.dts | 1 + 1 file changed, 1 insertion(+)
--- a/arch/arm/boot/dts/armada-385-turris-omnia.dts +++ b/arch/arm/boot/dts/armada-385-turris-omnia.dts @@ -236,6 +236,7 @@ status = "okay"; compatible = "ethernet-phy-id0141.0DD1", "ethernet-phy-ieee802.3-c22"; reg = <1>; + marvell,reg-init = <3 18 0 0x4985>;
/* irq is connected to &pcawan pin 7 */ };
From: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp
commit 08c27f3322fec11950b8f1384aa0f3b11d028528 upstream.
KMSAN found uninitialized value at batadv_tt_prepare_tvlv_local_data() [1], for commit ced72933a5e8ab52 ("batman-adv: use CRC32C instead of CRC16 in TT code") inserted 'reserved' field into "struct batadv_tvlv_tt_data" and commit 7ea7b4a142758dea ("batman-adv: make the TT CRC logic VLAN specific") moved that field to "struct batadv_tvlv_tt_vlan_data" but left that field uninitialized.
[1] https://syzkaller.appspot.com/bug?id=07f3e6dba96f0eb3cabab986adcd8a58b9bdbe9...
Reported-by: syzbot syzbot+50ee810676e6a089487b@syzkaller.appspotmail.com Tested-by: syzbot syzbot+50ee810676e6a089487b@syzkaller.appspotmail.com Signed-off-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Fixes: ced72933a5e8ab52 ("batman-adv: use CRC32C instead of CRC16 in TT code") Fixes: 7ea7b4a142758dea ("batman-adv: make the TT CRC logic VLAN specific") Acked-by: Sven Eckelmann sven@narfation.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/batman-adv/translation-table.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/net/batman-adv/translation-table.c +++ b/net/batman-adv/translation-table.c @@ -891,6 +891,7 @@ batadv_tt_prepare_tvlv_global_data(struc hlist_for_each_entry(vlan, &orig_node->vlan_list, list) { tt_vlan->vid = htons(vlan->vid); tt_vlan->crc = htonl(vlan->tt.crc); + tt_vlan->reserved = 0;
tt_vlan++; } @@ -974,6 +975,7 @@ batadv_tt_prepare_tvlv_local_data(struct
tt_vlan->vid = htons(vlan->vid); tt_vlan->crc = htonl(vlan->tt.crc); + tt_vlan->reserved = 0;
tt_vlan++; }
From: Anirudh Venkataramanan anirudh.venkataramanan@intel.com
commit 08771bce330036d473be6ce851cd00bcd351ebf6 upstream.
An incorrect NVM update procedure can result in the driver failing probe. In this case, the recommended resolution method is to update the NVM using the right procedure. However, if the driver fails probe, the user will not be able to update the NVM. So do not fail probe on link/PHY errors.
Fixes: 1a3571b5938c ("ice: restore PHY settings on media insertion") Signed-off-by: Anirudh Venkataramanan anirudh.venkataramanan@intel.com Tested-by: Tony Brelinski tonyx.brelinski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/ice/ice_main.c | 15 ++++++--------- 1 file changed, 6 insertions(+), 9 deletions(-)
--- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -4170,28 +4170,25 @@ ice_probe(struct pci_dev *pdev, const st goto err_send_version_unroll; }
+ /* not a fatal error if this fails */ err = ice_init_nvm_phy_type(pf->hw.port_info); - if (err) { + if (err) dev_err(dev, "ice_init_nvm_phy_type failed: %d\n", err); - goto err_send_version_unroll; - }
+ /* not a fatal error if this fails */ err = ice_update_link_info(pf->hw.port_info); - if (err) { + if (err) dev_err(dev, "ice_update_link_info failed: %d\n", err); - goto err_send_version_unroll; - }
ice_init_link_dflt_override(pf->hw.port_info);
/* if media available, initialize PHY settings */ if (pf->hw.port_info->phy.link_info.link_info & ICE_AQ_MEDIA_AVAILABLE) { + /* not a fatal error if this fails */ err = ice_init_phy_user_cfg(pf->hw.port_info); - if (err) { + if (err) dev_err(dev, "ice_init_phy_user_cfg failed: %d\n", err); - goto err_send_version_unroll; - }
if (!test_bit(ICE_FLAG_LINK_DOWN_ON_CLOSE_ENA, pf->flags)) { struct ice_vsi *vsi = ice_get_main_vsi(pf);
From: Fabio Pricoco fabio.pricoco@intel.com
commit f88c529ac77b3c21819d2cf1dfcfae1937849743 upstream.
250 msec timeout is insufficient for some AQ commands. Advice from FW team was to increase the timeout. Increase to 1 second.
Fixes: 7ec59eeac804 ("ice: Add support for control queues") Signed-off-by: Fabio Pricoco fabio.pricoco@intel.com Tested-by: Tony Brelinski tonyx.brelinski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/ice/ice_controlq.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/intel/ice/ice_controlq.h +++ b/drivers/net/ethernet/intel/ice/ice_controlq.h @@ -31,8 +31,8 @@ enum ice_ctl_q { ICE_CTL_Q_MAILBOX, };
-/* Control Queue timeout settings - max delay 250ms */ -#define ICE_CTL_Q_SQ_CMD_TIMEOUT 2500 /* Count 2500 times */ +/* Control Queue timeout settings - max delay 1s */ +#define ICE_CTL_Q_SQ_CMD_TIMEOUT 10000 /* Count 10000 times */ #define ICE_CTL_Q_SQ_CMD_USEC 100 /* Check every 100usec */ #define ICE_CTL_Q_ADMIN_INIT_TIMEOUT 10 /* Count 10 times */ #define ICE_CTL_Q_ADMIN_INIT_MSEC 100 /* Check every 100msec */
From: Krzysztof Goreczny krzysztof.goreczny@intel.com
commit e95fc8573e07c5e4825df4650fd8b8c93fad27a7 upstream.
There is a possibility of race between ice_open or ice_stop calls performed by OS and reset handling routine both trying to modify VSI resources. Observed scenarios: - reset handler deallocates memory in ice_vsi_free_arrays and ice_open tries to access it in ice_vsi_cfg_txq leading to driver crash - reset handler deallocates memory in ice_vsi_free_arrays and ice_close tries to access it in ice_down leading to driver crash - reset handler clears port scheduler topology and sets port state to ICE_SCHED_PORT_STATE_INIT leading to ice_ena_vsi_txq fail in ice_open
To prevent this additional checks in ice_open and ice_stop are introduced to make sure that OS is not allowed to alter VSI config while reset is in progress.
Fixes: cdedef59deb0 ("ice: Configure VSIs for Tx/Rx") Signed-off-by: Krzysztof Goreczny krzysztof.goreczny@intel.com Tested-by: Tony Brelinski tonyx.brelinski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/ice/ice.h | 1 + drivers/net/ethernet/intel/ice/ice_lib.c | 4 ++-- drivers/net/ethernet/intel/ice/ice_main.c | 28 ++++++++++++++++++++++++++++ 3 files changed, 31 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/intel/ice/ice.h +++ b/drivers/net/ethernet/intel/ice/ice.h @@ -605,6 +605,7 @@ int ice_fdir_create_dflt_rules(struct ic int ice_aq_wait_for_event(struct ice_pf *pf, u16 opcode, unsigned long timeout, struct ice_rq_event_info *event); int ice_open(struct net_device *netdev); +int ice_open_internal(struct net_device *netdev); int ice_stop(struct net_device *netdev); void ice_service_task_schedule(struct ice_pf *pf);
--- a/drivers/net/ethernet/intel/ice/ice_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_lib.c @@ -2489,7 +2489,7 @@ int ice_ena_vsi(struct ice_vsi *vsi, boo if (!locked) rtnl_lock();
- err = ice_open(vsi->netdev); + err = ice_open_internal(vsi->netdev);
if (!locked) rtnl_unlock(); @@ -2518,7 +2518,7 @@ void ice_dis_vsi(struct ice_vsi *vsi, bo if (!locked) rtnl_lock();
- ice_stop(vsi->netdev); + ice_vsi_close(vsi);
if (!locked) rtnl_unlock(); --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -6615,6 +6615,28 @@ static void ice_tx_timeout(struct net_de int ice_open(struct net_device *netdev) { struct ice_netdev_priv *np = netdev_priv(netdev); + struct ice_pf *pf = np->vsi->back; + + if (ice_is_reset_in_progress(pf->state)) { + netdev_err(netdev, "can't open net device while reset is in progress"); + return -EBUSY; + } + + return ice_open_internal(netdev); +} + +/** + * ice_open_internal - Called when a network interface becomes active + * @netdev: network interface device structure + * + * Internal ice_open implementation. Should not be used directly except for ice_open and reset + * handling routine + * + * Returns 0 on success, negative value on failure + */ +int ice_open_internal(struct net_device *netdev) +{ + struct ice_netdev_priv *np = netdev_priv(netdev); struct ice_vsi *vsi = np->vsi; struct ice_pf *pf = vsi->back; struct ice_port_info *pi; @@ -6693,6 +6715,12 @@ int ice_stop(struct net_device *netdev) { struct ice_netdev_priv *np = netdev_priv(netdev); struct ice_vsi *vsi = np->vsi; + struct ice_pf *pf = vsi->back; + + if (ice_is_reset_in_progress(pf->state)) { + netdev_err(netdev, "can't stop net device while reset is in progress"); + return -EBUSY; + }
ice_vsi_close(vsi);
From: Bruce Allan bruce.w.allan@intel.com
commit 59df14f9cc2326bd6432d60eca0df8201d9d3d4b upstream.
Fix the order of number of array members and member size parameters in a *calloc() call.
Fixes: b3c3890489f6 ("ice: avoid unnecessary single-member variable-length structs") Signed-off-by: Bruce Allan bruce.w.allan@intel.com Tested-by: Tony Brelinski tonyx.brelinski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/ice/ice_common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/intel/ice/ice_common.c +++ b/drivers/net/ethernet/intel/ice/ice_common.c @@ -717,8 +717,8 @@ static enum ice_status ice_cfg_fw_log(st
if (!data) { data = devm_kcalloc(ice_hw_to_dev(hw), - sizeof(*data), ICE_AQC_FW_LOG_ID_MAX, + sizeof(*data), GFP_KERNEL); if (!data) return ICE_ERR_NO_MEMORY;
From: Dave Ertman david.m.ertman@intel.com
commit 741b7b743bbcb5a3848e4e55982064214f900d2f upstream.
The original purpose of the ICE_DCBNL_DEVRESET was to protect the driver during DCBNL device resets. But, the flow for DCBNL device resets now consists of only calls up the stack such as dev_close() and dev_open() that will result in NDO calls to the driver. These will be handled with state changes from the stack. Also, there is a problem of the dev_close and dev_open being blocked by checks for reset in progress also using the ICE_DCBNL_DEVRESET bit.
Since the ICE_DCBNL_DEVRESET bit is not necessary for protecting the driver from DCBNL device resets and it is actually blocking changes coming from the DCBNL interface, remove the bit from the PF state and don't block driver function based on DCBNL reset in progress.
Fixes: b94b013eb626 ("ice: Implement DCBNL support") Signed-off-by: Dave Ertman david.m.ertman@intel.com Tested-by: Tony Brelinski tonyx.brelinski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/ice/ice.h | 1 - drivers/net/ethernet/intel/ice/ice_dcb_nl.c | 2 -- drivers/net/ethernet/intel/ice/ice_lib.c | 1 - 3 files changed, 4 deletions(-)
--- a/drivers/net/ethernet/intel/ice/ice.h +++ b/drivers/net/ethernet/intel/ice/ice.h @@ -194,7 +194,6 @@ enum ice_state { __ICE_NEEDS_RESTART, __ICE_PREPARED_FOR_RESET, /* set by driver when prepared */ __ICE_RESET_OICR_RECV, /* set by driver after rcv reset OICR */ - __ICE_DCBNL_DEVRESET, /* set by dcbnl devreset */ __ICE_PFR_REQ, /* set by driver and peers */ __ICE_CORER_REQ, /* set by driver and peers */ __ICE_GLOBR_REQ, /* set by driver and peers */ --- a/drivers/net/ethernet/intel/ice/ice_dcb_nl.c +++ b/drivers/net/ethernet/intel/ice/ice_dcb_nl.c @@ -18,12 +18,10 @@ static void ice_dcbnl_devreset(struct ne while (ice_is_reset_in_progress(pf->state)) usleep_range(1000, 2000);
- set_bit(__ICE_DCBNL_DEVRESET, pf->state); dev_close(netdev); netdev_state_change(netdev); dev_open(netdev, NULL); netdev_state_change(netdev); - clear_bit(__ICE_DCBNL_DEVRESET, pf->state); }
/** --- a/drivers/net/ethernet/intel/ice/ice_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_lib.c @@ -2944,7 +2944,6 @@ err_vsi: bool ice_is_reset_in_progress(unsigned long *state) { return test_bit(__ICE_RESET_OICR_RECV, state) || - test_bit(__ICE_DCBNL_DEVRESET, state) || test_bit(__ICE_PFR_REQ, state) || test_bit(__ICE_CORER_REQ, state) || test_bit(__ICE_GLOBR_REQ, state);
From: Jacek Bułatek jacekx.bulatek@intel.com
commit 7a91d3f02b04b2fb18c2dfa8b6c4e5a40a2753f5 upstream.
Add handling of allocation fault for ice_vsi_list_map_info.
Also *fi should not be NULL pointer, it is a reference to raw data field, so remove this variable and use the reference directly.
Fixes: 9daf8208dd4d ("ice: Add support for switch filter programming") Signed-off-by: Jacek Bułatek jacekx.bulatek@intel.com Co-developed-by: Haiyue Wang haiyue.wang@intel.com Signed-off-by: Haiyue Wang haiyue.wang@intel.com Tested-by: Tony Brelinski tonyx.brelinski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/ice/ice_switch.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
--- a/drivers/net/ethernet/intel/ice/ice_switch.c +++ b/drivers/net/ethernet/intel/ice/ice_switch.c @@ -1239,6 +1239,9 @@ ice_add_update_vsi_list(struct ice_hw *h ice_create_vsi_list_map(hw, &vsi_handle_arr[0], 2, vsi_list_id);
+ if (!m_entry->vsi_list_info) + return ICE_ERR_NO_MEMORY; + /* If this entry was large action then the large action needs * to be updated to point to FWD to VSI list */ @@ -2224,6 +2227,7 @@ ice_vsi_uses_fltr(struct ice_fltr_mgmt_l return ((fm_entry->fltr_info.fltr_act == ICE_FWD_TO_VSI && fm_entry->fltr_info.vsi_handle == vsi_handle) || (fm_entry->fltr_info.fltr_act == ICE_FWD_TO_VSI_LIST && + fm_entry->vsi_list_info && (test_bit(vsi_handle, fm_entry->vsi_list_info->vsi_map)))); }
@@ -2296,14 +2300,12 @@ ice_add_to_vsi_fltr_list(struct ice_hw * return ICE_ERR_PARAM;
list_for_each_entry(fm_entry, lkup_list_head, list_entry) { - struct ice_fltr_info *fi; - - fi = &fm_entry->fltr_info; - if (!fi || !ice_vsi_uses_fltr(fm_entry, vsi_handle)) + if (!ice_vsi_uses_fltr(fm_entry, vsi_handle)) continue;
status = ice_add_entry_to_vsi_fltr_list(hw, vsi_handle, - vsi_list_head, fi); + vsi_list_head, + &fm_entry->fltr_info); if (status) return status; }
From: Anirudh Venkataramanan anirudh.venkataramanan@intel.com
commit 3176551979b92b02756979c0f1e2d03d1fc82b1e upstream.
As per the spec, the WoL control word read from the NVM should be interpreted as port numbers, and not PF numbers. So when checking if WoL supported, use the port number instead of the PF ID.
Also, ice_is_wol_supported doesn't really need a pointer to the pf struct, but just needs a pointer to the hw instance.
Fixes: 769c500dcc1e ("ice: Add advanced power mgmt for WoL") Signed-off-by: Anirudh Venkataramanan anirudh.venkataramanan@intel.com Tested-by: Tony Brelinski tonyx.brelinski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/ice/ice.h | 2 +- drivers/net/ethernet/intel/ice/ice_ethtool.c | 4 ++-- drivers/net/ethernet/intel/ice/ice_main.c | 9 ++++----- 3 files changed, 7 insertions(+), 8 deletions(-)
--- a/drivers/net/ethernet/intel/ice/ice.h +++ b/drivers/net/ethernet/intel/ice/ice.h @@ -586,7 +586,7 @@ int ice_schedule_reset(struct ice_pf *pf void ice_print_link_msg(struct ice_vsi *vsi, bool isup); const char *ice_stat_str(enum ice_status stat_err); const char *ice_aq_str(enum ice_aq_err aq_err); -bool ice_is_wol_supported(struct ice_pf *pf); +bool ice_is_wol_supported(struct ice_hw *hw); int ice_fdir_write_fltr(struct ice_pf *pf, struct ice_fdir_fltr *input, bool add, bool is_tun); --- a/drivers/net/ethernet/intel/ice/ice_ethtool.c +++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c @@ -3472,7 +3472,7 @@ static void ice_get_wol(struct net_devic netdev_warn(netdev, "Wake on LAN is not supported on this interface!\n");
/* Get WoL settings based on the HW capability */ - if (ice_is_wol_supported(pf)) { + if (ice_is_wol_supported(&pf->hw)) { wol->supported = WAKE_MAGIC; wol->wolopts = pf->wol_ena ? WAKE_MAGIC : 0; } else { @@ -3492,7 +3492,7 @@ static int ice_set_wol(struct net_device struct ice_vsi *vsi = np->vsi; struct ice_pf *pf = vsi->back;
- if (vsi->type != ICE_VSI_PF || !ice_is_wol_supported(pf)) + if (vsi->type != ICE_VSI_PF || !ice_is_wol_supported(&pf->hw)) return -EOPNOTSUPP;
/* only magic packet is supported */ --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -3515,15 +3515,14 @@ static int ice_init_interrupt_scheme(str }
/** - * ice_is_wol_supported - get NVM state of WoL - * @pf: board private structure + * ice_is_wol_supported - check if WoL is supported + * @hw: pointer to hardware info * * Check if WoL is supported based on the HW configuration. * Returns true if NVM supports and enables WoL for this port, false otherwise */ -bool ice_is_wol_supported(struct ice_pf *pf) +bool ice_is_wol_supported(struct ice_hw *hw) { - struct ice_hw *hw = &pf->hw; u16 wol_ctrl;
/* A bit set to 1 in the NVM Software Reserved Word 2 (WoL control @@ -3532,7 +3531,7 @@ bool ice_is_wol_supported(struct ice_pf if (ice_read_sr_word(hw, ICE_SR_NVM_WOL_CFG, &wol_ctrl)) return false;
- return !(BIT(hw->pf_id) & wol_ctrl); + return !(BIT(hw->port_info->lport) & wol_ctrl); }
/**
From: Robert Malz robertx.malz@intel.com
commit b7eeb52721fe417730fc5adc5cbeeb5fe349ab26 upstream.
When ice_remove_vsi_lkup_fltr is called, by calling ice_add_to_vsi_fltr_list local copy of vsi filter list is created. If any issues during creation of vsi filter list occurs it up for the caller to free already allocated memory. This patch ensures proper memory deallocation in these cases.
Fixes: 80d144c9ac82 ("ice: Refactor switch rule management structures and functions") Signed-off-by: Robert Malz robertx.malz@intel.com Tested-by: Tony Brelinski tonyx.brelinski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/ice/ice_switch.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/intel/ice/ice_switch.c +++ b/drivers/net/ethernet/intel/ice/ice_switch.c @@ -2628,7 +2628,7 @@ ice_remove_vsi_lkup_fltr(struct ice_hw * &remove_list_head); mutex_unlock(rule_lock); if (status) - return; + goto free_fltr_list;
switch (lkup) { case ICE_SW_LKUP_MAC: @@ -2651,6 +2651,7 @@ ice_remove_vsi_lkup_fltr(struct ice_hw * break; }
+free_fltr_list: list_for_each_entry_safe(fm_entry, tmp, &remove_list_head, list_entry) { list_del(&fm_entry->list_entry); devm_kfree(ice_hw_to_dev(hw), fm_entry);
From: Johannes Berg johannes.berg@intel.com
commit 25628bc08d4526d3673ca7d039eb636aa9006076 upstream.
As the context info gen3 code is only called for >=AX210 devices (from iwl_trans_pcie_gen2_start_fw()) the code there to set LTR on 22000 devices cannot actually do anything (22000 < AX210).
Fix this by moving the LTR code to iwl_trans_pcie_gen2_start_fw() where it can handle both devices. This then requires that we kick the firmware only after that rather than doing it from the context info code.
Note that this again had a dead branch in gen3 code, which I've removed here.
Signed-off-by: Johannes Berg johannes.berg@intel.com Fixes: ed0022da8bd9 ("iwlwifi: pcie: set LTR on more devices") Signed-off-by: Luca Coelho luciano.coelho@intel.com Signed-off-by: Kalle Valo kvalo@codeaurora.org Link: https://lore.kernel.org/r/iwlwifi.20210326125611.675486178ed1.Ib61463aba6920... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info-gen3.c | 31 ------------- drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info.c | 3 - drivers/net/wireless/intel/iwlwifi/pcie/trans-gen2.c | 35 +++++++++++++++ 3 files changed, 37 insertions(+), 32 deletions(-)
--- a/drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info-gen3.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info-gen3.c @@ -5,7 +5,7 @@ * * GPL LICENSE SUMMARY * - * Copyright(c) 2018 - 2020 Intel Corporation + * Copyright(c) 2018 - 2021 Intel Corporation * * This program is free software; you can redistribute it and/or modify * it under the terms of version 2 of the GNU General Public License as @@ -122,15 +122,6 @@ int iwl_pcie_ctxt_info_gen3_init(struct const struct fw_img *fw) { struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); - u32 ltr_val = CSR_LTR_LONG_VAL_AD_NO_SNOOP_REQ | - u32_encode_bits(CSR_LTR_LONG_VAL_AD_SCALE_USEC, - CSR_LTR_LONG_VAL_AD_NO_SNOOP_SCALE) | - u32_encode_bits(250, - CSR_LTR_LONG_VAL_AD_NO_SNOOP_VAL) | - CSR_LTR_LONG_VAL_AD_SNOOP_REQ | - u32_encode_bits(CSR_LTR_LONG_VAL_AD_SCALE_USEC, - CSR_LTR_LONG_VAL_AD_SNOOP_SCALE) | - u32_encode_bits(250, CSR_LTR_LONG_VAL_AD_SNOOP_VAL); struct iwl_context_info_gen3 *ctxt_info_gen3; struct iwl_prph_scratch *prph_scratch; struct iwl_prph_scratch_ctrl_cfg *prph_sc_ctrl; @@ -264,26 +255,6 @@ int iwl_pcie_ctxt_info_gen3_init(struct iwl_set_bit(trans, CSR_CTXT_INFO_BOOT_CTRL, CSR_AUTO_FUNC_BOOT_ENA);
- /* - * To workaround hardware latency issues during the boot process, - * initialize the LTR to ~250 usec (see ltr_val above). - * The firmware initializes this again later (to a smaller value). - */ - if ((trans->trans_cfg->device_family == IWL_DEVICE_FAMILY_AX210 || - trans->trans_cfg->device_family == IWL_DEVICE_FAMILY_22000) && - !trans->trans_cfg->integrated) { - iwl_write32(trans, CSR_LTR_LONG_VAL_AD, ltr_val); - } else if (trans->trans_cfg->integrated && - trans->trans_cfg->device_family == IWL_DEVICE_FAMILY_22000) { - iwl_write_prph(trans, HPM_MAC_LTR_CSR, HPM_MAC_LRT_ENABLE_ALL); - iwl_write_prph(trans, HPM_UMAC_LTR, ltr_val); - } - - if (trans->trans_cfg->device_family >= IWL_DEVICE_FAMILY_AX210) - iwl_write_umac_prph(trans, UREG_CPU_INIT_RUN, 1); - else - iwl_set_bit(trans, CSR_GP_CNTRL, CSR_AUTO_FUNC_INIT); - return 0;
err_free_ctxt_info: --- a/drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info.c @@ -6,7 +6,7 @@ * GPL LICENSE SUMMARY * * Copyright(c) 2017 Intel Deutschland GmbH - * Copyright(c) 2018 - 2020 Intel Corporation + * Copyright(c) 2018 - 2021 Intel Corporation * * This program is free software; you can redistribute it and/or modify * it under the terms of version 2 of the GNU General Public License as @@ -288,7 +288,6 @@ int iwl_pcie_ctxt_info_init(struct iwl_t
/* kick FW self load */ iwl_write64(trans, CSR_CTXT_INFO_BA, trans_pcie->ctxt_info_dma_addr); - iwl_write_prph(trans, UREG_CPU_INIT_RUN, 1);
/* Context info will be released upon alive or failure to get one */
--- a/drivers/net/wireless/intel/iwlwifi/pcie/trans-gen2.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/trans-gen2.c @@ -281,6 +281,34 @@ void iwl_trans_pcie_gen2_fw_alive(struct mutex_unlock(&trans_pcie->mutex); }
+static void iwl_pcie_set_ltr(struct iwl_trans *trans) +{ + u32 ltr_val = CSR_LTR_LONG_VAL_AD_NO_SNOOP_REQ | + u32_encode_bits(CSR_LTR_LONG_VAL_AD_SCALE_USEC, + CSR_LTR_LONG_VAL_AD_NO_SNOOP_SCALE) | + u32_encode_bits(250, + CSR_LTR_LONG_VAL_AD_NO_SNOOP_VAL) | + CSR_LTR_LONG_VAL_AD_SNOOP_REQ | + u32_encode_bits(CSR_LTR_LONG_VAL_AD_SCALE_USEC, + CSR_LTR_LONG_VAL_AD_SNOOP_SCALE) | + u32_encode_bits(250, CSR_LTR_LONG_VAL_AD_SNOOP_VAL); + + /* + * To workaround hardware latency issues during the boot process, + * initialize the LTR to ~250 usec (see ltr_val above). + * The firmware initializes this again later (to a smaller value). + */ + if ((trans->trans_cfg->device_family == IWL_DEVICE_FAMILY_AX210 || + trans->trans_cfg->device_family == IWL_DEVICE_FAMILY_22000) && + !trans->trans_cfg->integrated) { + iwl_write32(trans, CSR_LTR_LONG_VAL_AD, ltr_val); + } else if (trans->trans_cfg->integrated && + trans->trans_cfg->device_family == IWL_DEVICE_FAMILY_22000) { + iwl_write_prph(trans, HPM_MAC_LTR_CSR, HPM_MAC_LRT_ENABLE_ALL); + iwl_write_prph(trans, HPM_UMAC_LTR, ltr_val); + } +} + int iwl_trans_pcie_gen2_start_fw(struct iwl_trans *trans, const struct fw_img *fw, bool run_in_rfkill) { @@ -347,6 +375,13 @@ int iwl_trans_pcie_gen2_start_fw(struct if (ret) goto out;
+ iwl_pcie_set_ltr(trans); + + if (trans->trans_cfg->device_family >= IWL_DEVICE_FAMILY_AX210) + iwl_write_umac_prph(trans, UREG_CPU_INIT_RUN, 1); + else + iwl_write_prph(trans, UREG_CPU_INIT_RUN, 1); + /* re-check RF-Kill state since we may have missed the interrupt */ hw_rfkill = iwl_pcie_check_hw_rf_kill(trans); if (hw_rfkill && !run_in_rfkill)
From: Yongxin Liu yongxin.liu@windriver.com
commit 1831da7ea5bdf5531d78bcf81f526faa4c4375fa upstream.
In ice_suspend(), ice_clear_interrupt_scheme() is called, and then irq_free_descs() will be eventually called to free irq and its descriptor.
In ice_resume(), ice_init_interrupt_scheme() is called to allocate new irqs. However, in ice_rebuild_arfs(), struct irq_glue and struct cpu_rmap maybe cannot be freed, if the irqs that released in ice_suspend() were reassigned to other devices, which makes irq descriptor's affinity_notify lost.
So call ice_free_cpu_rx_rmap() before ice_clear_interrupt_scheme(), which can make sure all irq_glue and cpu_rmap can be correctly released before corresponding irq and descriptor are released.
Fix the following memory leak.
unreferenced object 0xffff95bd951afc00 (size 512): comm "kworker/0:1", pid 134, jiffies 4294684283 (age 13051.958s) hex dump (first 32 bytes): 18 00 00 00 18 00 18 00 70 fc 1a 95 bd 95 ff ff ........p....... 00 00 ff ff 01 00 ff ff 02 00 ff ff 03 00 ff ff ................ backtrace: [<0000000072e4b914>] __kmalloc+0x336/0x540 [<0000000054642a87>] alloc_cpu_rmap+0x3b/0xb0 [<00000000f220deec>] ice_set_cpu_rx_rmap+0x6a/0x110 [ice] [<000000002370a632>] ice_probe+0x941/0x1180 [ice] [<00000000d692edba>] local_pci_probe+0x47/0xa0 [<00000000503934f0>] work_for_cpu_fn+0x1a/0x30 [<00000000555a9e4a>] process_one_work+0x1dd/0x410 [<000000002c4b414a>] worker_thread+0x221/0x3f0 [<00000000bb2b556b>] kthread+0x14c/0x170 [<00000000ad2cf1cd>] ret_from_fork+0x1f/0x30 unreferenced object 0xffff95bd81b0a2a0 (size 96): comm "kworker/0:1", pid 134, jiffies 4294684283 (age 13051.958s) hex dump (first 32 bytes): 38 00 00 00 01 00 00 00 e0 ff ff ff 0f 00 00 00 8............... b0 a2 b0 81 bd 95 ff ff b0 a2 b0 81 bd 95 ff ff ................ backtrace: [<00000000582dd5c5>] kmem_cache_alloc_trace+0x31f/0x4c0 [<000000002659850d>] irq_cpu_rmap_add+0x25/0xe0 [<00000000495a3055>] ice_set_cpu_rx_rmap+0xb4/0x110 [ice] [<000000002370a632>] ice_probe+0x941/0x1180 [ice] [<00000000d692edba>] local_pci_probe+0x47/0xa0 [<00000000503934f0>] work_for_cpu_fn+0x1a/0x30 [<00000000555a9e4a>] process_one_work+0x1dd/0x410 [<000000002c4b414a>] worker_thread+0x221/0x3f0 [<00000000bb2b556b>] kthread+0x14c/0x170 [<00000000ad2cf1cd>] ret_from_fork+0x1f/0x30
Fixes: 769c500dcc1e ("ice: Add advanced power mgmt for WoL") Signed-off-by: Yongxin Liu yongxin.liu@windriver.com Tested-by: Tony Brelinski tonyx.brelinski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/ice/ice_main.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -4538,6 +4538,7 @@ static int __maybe_unused ice_suspend(st continue; ice_vsi_free_q_vectors(pf->vsi[v]); } + ice_free_cpu_rx_rmap(ice_get_main_vsi(pf)); ice_clear_interrupt_scheme(pf);
pci_save_state(pdev);
From: Anirudh Rayabharam mail@anirudhrb.com
commit 8a12f8836145ffe37e9c8733dce18c22fb668b66 upstream.
Multiple ttys try to claim the same the minor number causing a double unregistration of the same device. The first unregistration succeeds but the next one results in a null-ptr-deref.
The get_free_serial_index() function returns an available minor number but doesn't assign it immediately. The assignment is done by the caller later. But before this assignment, calls to get_free_serial_index() would return the same minor number.
Fix this by modifying get_free_serial_index to assign the minor number immediately after one is found to be and rename it to obtain_minor() to better reflect what it does. Similary, rename set_serial_by_index() to release_minor() and modify it to free up the minor number of the given hso_serial. Every obtain_minor() should have corresponding release_minor() call.
Fixes: 72dc1c096c705 ("HSO: add option hso driver") Reported-by: syzbot+c49fe6089f295a05e6f8@syzkaller.appspotmail.com Tested-by: syzbot+c49fe6089f295a05e6f8@syzkaller.appspotmail.com Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Anirudh Rayabharam mail@anirudhrb.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/usb/hso.c | 33 ++++++++++++--------------------- 1 file changed, 12 insertions(+), 21 deletions(-)
--- a/drivers/net/usb/hso.c +++ b/drivers/net/usb/hso.c @@ -611,7 +611,7 @@ static struct hso_serial *get_serial_by_ return serial; }
-static int get_free_serial_index(void) +static int obtain_minor(struct hso_serial *serial) { int index; unsigned long flags; @@ -619,8 +619,10 @@ static int get_free_serial_index(void) spin_lock_irqsave(&serial_table_lock, flags); for (index = 0; index < HSO_SERIAL_TTY_MINORS; index++) { if (serial_table[index] == NULL) { + serial_table[index] = serial->parent; + serial->minor = index; spin_unlock_irqrestore(&serial_table_lock, flags); - return index; + return 0; } } spin_unlock_irqrestore(&serial_table_lock, flags); @@ -629,15 +631,12 @@ static int get_free_serial_index(void) return -1; }
-static void set_serial_by_index(unsigned index, struct hso_serial *serial) +static void release_minor(struct hso_serial *serial) { unsigned long flags;
spin_lock_irqsave(&serial_table_lock, flags); - if (serial) - serial_table[index] = serial->parent; - else - serial_table[index] = NULL; + serial_table[serial->minor] = NULL; spin_unlock_irqrestore(&serial_table_lock, flags); }
@@ -2230,6 +2229,7 @@ static int hso_stop_serial_device(struct static void hso_serial_tty_unregister(struct hso_serial *serial) { tty_unregister_device(tty_drv, serial->minor); + release_minor(serial); }
static void hso_serial_common_free(struct hso_serial *serial) @@ -2253,24 +2253,22 @@ static void hso_serial_common_free(struc static int hso_serial_common_create(struct hso_serial *serial, int num_urbs, int rx_size, int tx_size) { - int minor; int i;
tty_port_init(&serial->port);
- minor = get_free_serial_index(); - if (minor < 0) + if (obtain_minor(serial)) goto exit2;
/* register our minor number */ serial->parent->dev = tty_port_register_device_attr(&serial->port, - tty_drv, minor, &serial->parent->interface->dev, + tty_drv, serial->minor, &serial->parent->interface->dev, serial->parent, hso_serial_dev_groups); - if (IS_ERR(serial->parent->dev)) + if (IS_ERR(serial->parent->dev)) { + release_minor(serial); goto exit2; + }
- /* fill in specific data for later use */ - serial->minor = minor; serial->magic = HSO_SERIAL_MAGIC; spin_lock_init(&serial->serial_lock); serial->num_rx_urbs = num_urbs; @@ -2667,9 +2665,6 @@ static struct hso_device *hso_create_bul
serial->write_data = hso_std_serial_write_data;
- /* and record this serial */ - set_serial_by_index(serial->minor, serial); - /* setup the proc dirs and files if needed */ hso_log_port(hso_dev);
@@ -2726,9 +2721,6 @@ struct hso_device *hso_create_mux_serial serial->shared_int->ref_count++; mutex_unlock(&serial->shared_int->shared_int_lock);
- /* and record this serial */ - set_serial_by_index(serial->minor, serial); - /* setup the proc dirs and files if needed */ hso_log_port(hso_dev);
@@ -3113,7 +3105,6 @@ static void hso_free_interface(struct us cancel_work_sync(&serial_table[i]->async_get_intf); hso_serial_tty_unregister(serial); kref_put(&serial_table[i]->ref, hso_serial_ref_free); - set_serial_by_index(i, NULL); } }
From: Pedro Tammela pctammela@gmail.com
commit 6032ebb54c60cae24329f6aba3ce0c1ca8ad6abe upstream.
The current code bails out with negative and positive returns. If the callback returns a positive return code, 'ring_buffer__consume()' and 'ring_buffer__poll()' will return a spurious number of records consumed, but mostly important will continue the processing loop.
This patch makes positive returns from the callback a no-op.
Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support") Signed-off-by: Pedro Tammela pctammela@mojatatu.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210325150115.138750-1-pctammela@mojatatu.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/lib/bpf/ringbuf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/lib/bpf/ringbuf.c +++ b/tools/lib/bpf/ringbuf.c @@ -227,7 +227,7 @@ static int ringbuf_process_ring(struct r if ((len & BPF_RINGBUF_DISCARD_BIT) == 0) { sample = (void *)len_ptr + BPF_RINGBUF_HDR_SZ; err = r->sample_cb(r->ctx, sample, len); - if (err) { + if (err < 0) { /* update consumer pos and bail out */ smp_store_release(r->consumer_pos, cons_pos);
From: Toke Høiland-Jørgensen toke@redhat.com
commit 12aa8a9467b354ef893ce0fc5719a4de4949a9fb upstream.
With the introduction of the struct_ops program type, it became possible to implement kernel functionality in BPF, making it viable to use BPF in place of a regular kernel module for these particular operations.
Thus far, the only user of this mechanism is for implementing TCP congestion control algorithms. These are clearly marked as GPL-only when implemented as modules (as seen by the use of EXPORT_SYMBOL_GPL for tcp_register_congestion_control()), so it seems like an oversight that this was not carried over to BPF implementations. Since this is the only user of the struct_ops mechanism, just enforcing GPL-only for the struct_ops program type seems like the simplest way to fix this.
Fixes: 0baf26b0fcd7 ("bpf: tcp: Support tcp_congestion_ops in bpf") Signed-off-by: Toke Høiland-Jørgensen toke@redhat.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Martin KaFai Lau kafai@fb.com Link: https://lore.kernel.org/bpf/20210326100314.121853-1-toke@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/bpf/verifier.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -11429,6 +11429,11 @@ static int check_struct_ops_btf_id(struc u32 btf_id, member_idx; const char *mname;
+ if (!prog->gpl_compatible) { + verbose(env, "struct ops programs must have a GPL compatible license\n"); + return -EINVAL; + } + btf_id = prog->aux->attach_btf_id; st_ops = bpf_struct_ops_find(btf_id); if (!st_ops) {
From: Lorenz Bauer lmb@cloudflare.com
commit 25fc94b2f02d832fa8e29419699dcc20b0b05c6a upstream.
Invoking BPF_OBJ_GET on a pinned bpf_link checks the path access permissions based on file_flags, but the returned fd ignores flags. This means that any user can acquire a "read-write" fd for a pinned link with mode 0664 by invoking BPF_OBJ_GET with BPF_F_RDONLY in file_flags. The fd can be used to invoke BPF_LINK_DETACH, etc.
Fix this by refusing non-O_RDWR flags in BPF_OBJ_GET. This works because OBJ_GET by default returns a read write mapping and libbpf doesn't expose a way to override this behaviour for programs and links.
Fixes: 70ed506c3bbc ("bpf: Introduce pinnable bpf_link abstraction") Signed-off-by: Lorenz Bauer lmb@cloudflare.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andrii@kernel.org Acked-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20210326160501.46234-1-lmb@cloudflare.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/bpf/inode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/bpf/inode.c +++ b/kernel/bpf/inode.c @@ -546,7 +546,7 @@ int bpf_obj_get_user(const char __user * else if (type == BPF_TYPE_MAP) ret = bpf_map_new_fd(raw, f_flags); else if (type == BPF_TYPE_LINK) - ret = bpf_link_new_fd(raw); + ret = (f_flags != O_RDWR) ? -EINVAL : bpf_link_new_fd(raw); else return -ENOENT;
From: Lv Yunlong lyl2019@mail.ustc.edu.cn
commit 6e5a03bcba44e080a6bf300194a68ce9bb1e5184 upstream.
In nfp_bpf_ctrl_msg_rx, if nfp_ccm_get_type(skb) == NFP_CCM_TYPE_BPF_BPF_EVENT is true, the skb will be freed. But the skb is still used by nfp_ccm_rx(&bpf->ccm, skb).
My patch adds a return when the skb was freed.
Fixes: bcf0cafab44fd ("nfp: split out common control message handling code") Signed-off-by: Lv Yunlong lyl2019@mail.ustc.edu.cn Reviewed-by: Jakub Kicinski kuba@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/netronome/nfp/bpf/cmsg.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/ethernet/netronome/nfp/bpf/cmsg.c +++ b/drivers/net/ethernet/netronome/nfp/bpf/cmsg.c @@ -454,6 +454,7 @@ void nfp_bpf_ctrl_msg_rx(struct nfp_app dev_consume_skb_any(skb); else dev_kfree_skb_any(skb); + return; }
nfp_ccm_rx(&bpf->ccm, skb);
From: Ciara Loftus ciara.loftus@intel.com
commit df662016310aa4475d7986fd726af45c8fe4f362 upstream.
Calls to xsk_socket__create dereference the umem to access the fill_save and comp_save pointers. Make sure the umem is non-NULL before doing this.
Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices") Signed-off-by: Ciara Loftus ciara.loftus@intel.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Magnus Karlsson magnus.karlsson@intel.com Link: https://lore.kernel.org/bpf/20210331061218.1647-2-ciara.loftus@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/lib/bpf/xsk.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/tools/lib/bpf/xsk.c +++ b/tools/lib/bpf/xsk.c @@ -870,6 +870,9 @@ int xsk_socket__create(struct xsk_socket struct xsk_ring_cons *rx, struct xsk_ring_prod *tx, const struct xsk_socket_config *usr_config) { + if (!umem) + return -EFAULT; + return xsk_socket__create_shared(xsk_ptr, ifname, queue_id, umem, rx, tx, umem->fill_save, umem->comp_save, usr_config);
From: Ciara Loftus ciara.loftus@intel.com
commit 43f1bc1efff16f553dd573d02eb7a15750925568 upstream.
If the call to xsk_socket__create fails, the user may want to retry the socket creation using the same umem. Ensure that the umem is in the same state on exit if the call fails by: 1. ensuring the umem _save pointers are unmodified. 2. not unmapping the set of umem rings that were set up with the umem during xsk_umem__create, since those maps existed before the call to xsk_socket__create and should remain in tact even in the event of failure.
Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices") Signed-off-by: Ciara Loftus ciara.loftus@intel.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210331061218.1647-3-ciara.loftus@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/lib/bpf/xsk.c | 41 +++++++++++++++++++++++------------------ 1 file changed, 23 insertions(+), 18 deletions(-)
--- a/tools/lib/bpf/xsk.c +++ b/tools/lib/bpf/xsk.c @@ -628,26 +628,30 @@ static struct xsk_ctx *xsk_get_ctx(struc return NULL; }
-static void xsk_put_ctx(struct xsk_ctx *ctx) +static void xsk_put_ctx(struct xsk_ctx *ctx, bool unmap) { struct xsk_umem *umem = ctx->umem; struct xdp_mmap_offsets off; int err;
- if (--ctx->refcount == 0) { - err = xsk_get_mmap_offsets(umem->fd, &off); - if (!err) { - munmap(ctx->fill->ring - off.fr.desc, - off.fr.desc + umem->config.fill_size * - sizeof(__u64)); - munmap(ctx->comp->ring - off.cr.desc, - off.cr.desc + umem->config.comp_size * - sizeof(__u64)); - } + if (--ctx->refcount) + return;
- list_del(&ctx->list); - free(ctx); - } + if (!unmap) + goto out_free; + + err = xsk_get_mmap_offsets(umem->fd, &off); + if (err) + goto out_free; + + munmap(ctx->fill->ring - off.fr.desc, off.fr.desc + umem->config.fill_size * + sizeof(__u64)); + munmap(ctx->comp->ring - off.cr.desc, off.cr.desc + umem->config.comp_size * + sizeof(__u64)); + +out_free: + list_del(&ctx->list); + free(ctx); }
static struct xsk_ctx *xsk_create_ctx(struct xsk_socket *xsk, @@ -682,8 +686,6 @@ static struct xsk_ctx *xsk_create_ctx(st memcpy(ctx->ifname, ifname, IFNAMSIZ - 1); ctx->ifname[IFNAMSIZ - 1] = '\0';
- umem->fill_save = NULL; - umem->comp_save = NULL; ctx->fill = fill; ctx->comp = comp; list_add(&ctx->list, &umem->ctx_list); @@ -705,6 +707,7 @@ int xsk_socket__create_shared(struct xsk struct xsk_socket *xsk; struct xsk_ctx *ctx; int err, ifindex; + bool unmap = umem->fill_save != fill;
if (!umem || !xsk_ptr || !(rx || tx)) return -EFAULT; @@ -845,6 +848,8 @@ int xsk_socket__create_shared(struct xsk }
*xsk_ptr = xsk; + umem->fill_save = NULL; + umem->comp_save = NULL; return 0;
out_mmap_tx: @@ -856,7 +861,7 @@ out_mmap_rx: munmap(rx_map, off.rx.desc + xsk->config.rx_size * sizeof(struct xdp_desc)); out_put_ctx: - xsk_put_ctx(ctx); + xsk_put_ctx(ctx, unmap); out_socket: if (--umem->refcount) close(xsk->fd); @@ -922,7 +927,7 @@ void xsk_socket__delete(struct xsk_socke } }
- xsk_put_ctx(ctx); + xsk_put_ctx(ctx, true);
umem->refcount--; /* Do not close an fd that also has an associated umem connected
From: Ciara Loftus ciara.loftus@intel.com
commit ca7a83e2487ad0bc9a3e0e7a8645354aa1782f13 upstream.
Prior to this commit xsk_socket__create(_shared) always attempted to create the rx and tx rings for the socket. However this causes an issue when the socket being setup is that which shares the fd with the UMEM. If a previous call to this function failed with this socket after the rings were set up, a subsequent call would always fail because the rings are not torn down after the first call and when we try to set them up again we encounter an error because they already exist. Solve this by remembering whether the rings were set up by introducing new bools to struct xsk_umem which represent the ring setup status and using them to determine whether or not to set up the rings.
Fixes: 1cad07884239 ("libbpf: add support for using AF_XDP sockets") Signed-off-by: Ciara Loftus ciara.loftus@intel.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210331061218.1647-4-ciara.loftus@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/lib/bpf/xsk.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-)
--- a/tools/lib/bpf/xsk.c +++ b/tools/lib/bpf/xsk.c @@ -54,6 +54,8 @@ struct xsk_umem { int fd; int refcount; struct list_head ctx_list; + bool rx_ring_setup_done; + bool tx_ring_setup_done; };
struct xsk_ctx { @@ -708,6 +710,7 @@ int xsk_socket__create_shared(struct xsk struct xsk_ctx *ctx; int err, ifindex; bool unmap = umem->fill_save != fill; + bool rx_setup_done = false, tx_setup_done = false;
if (!umem || !xsk_ptr || !(rx || tx)) return -EFAULT; @@ -735,6 +738,8 @@ int xsk_socket__create_shared(struct xsk } } else { xsk->fd = umem->fd; + rx_setup_done = umem->rx_ring_setup_done; + tx_setup_done = umem->tx_ring_setup_done; }
ctx = xsk_get_ctx(umem, ifindex, queue_id); @@ -753,7 +758,7 @@ int xsk_socket__create_shared(struct xsk } xsk->ctx = ctx;
- if (rx) { + if (rx && !rx_setup_done) { err = setsockopt(xsk->fd, SOL_XDP, XDP_RX_RING, &xsk->config.rx_size, sizeof(xsk->config.rx_size)); @@ -761,8 +766,10 @@ int xsk_socket__create_shared(struct xsk err = -errno; goto out_put_ctx; } + if (xsk->fd == umem->fd) + umem->rx_ring_setup_done = true; } - if (tx) { + if (tx && !tx_setup_done) { err = setsockopt(xsk->fd, SOL_XDP, XDP_TX_RING, &xsk->config.tx_size, sizeof(xsk->config.tx_size)); @@ -770,6 +777,8 @@ int xsk_socket__create_shared(struct xsk err = -errno; goto out_put_ctx; } + if (xsk->fd == umem->fd) + umem->rx_ring_setup_done = true; }
err = xsk_get_mmap_offsets(xsk->fd, &off);
From: Dave Marchevsky davemarchevsky@fb.com
commit 06ab134ce8ecfa5a69e850f88f81c8a4c3fa91df upstream.
On x86 the struct pt_regs * grabbed by task_pt_regs() points to an offset of task->stack. The pt_regs are later dereferenced in __bpf_get_stack (e.g. by user_mode() check). This can cause a fault if the task in question exits while bpf_get_task_stack is executing, as warned by task_stack_page's comment:
* When accessing the stack of a non-current task that might exit, use * try_get_task_stack() instead. task_stack_page will return a pointer * that could get freed out from under you.
Taking the comment's advice and using try_get_task_stack() and put_task_stack() to hold task->stack refcount, or bail early if it's already 0. Incrementing stack_refcount will ensure the task's stack sticks around while we're using its data.
I noticed this bug while testing a bpf task iter similar to bpf_iter_task_stack in selftests, except mine grabbed user stack, and getting intermittent crashes, which resulted in dumps like:
BUG: unable to handle page fault for address: 0000000000003fe0 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page RIP: 0010:__bpf_get_stack+0xd0/0x230 <snip...> Call Trace: bpf_prog_0a2be35c092cb190_get_task_stacks+0x5d/0x3ec bpf_iter_run_prog+0x24/0x81 __task_seq_show+0x58/0x80 bpf_seq_read+0xf7/0x3d0 vfs_read+0x91/0x140 ksys_read+0x59/0xd0 do_syscall_64+0x48/0x120 entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: fa28dcb82a38 ("bpf: Introduce helper bpf_get_task_stack()") Signed-off-by: Dave Marchevsky davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20210401000747.3648767-1-davemarchevsky@fb.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/bpf/stackmap.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
--- a/kernel/bpf/stackmap.c +++ b/kernel/bpf/stackmap.c @@ -662,9 +662,17 @@ const struct bpf_func_proto bpf_get_stac BPF_CALL_4(bpf_get_task_stack, struct task_struct *, task, void *, buf, u32, size, u64, flags) { - struct pt_regs *regs = task_pt_regs(task); + struct pt_regs *regs; + long res;
- return __bpf_get_stack(regs, task, NULL, buf, size, flags); + if (!try_get_task_stack(task)) + return -EFAULT; + + regs = task_pt_regs(task); + res = __bpf_get_stack(regs, task, NULL, buf, size, flags); + put_task_stack(task); + + return res; }
BTF_ID_LIST_SINGLE(bpf_get_task_stack_btf_ids, struct, task_struct)
From: John Fastabend john.fastabend@gmail.com
commit 1c84b33101c82683dee8b06761ca1f69e78c8ee7 upstream.
In '4da6a196f93b1' we fixed a potential unhash loop caused when a TLS socket in a sockmap was removed from the sockmap. This happened because the unhash operation on the TLS ctx continued to point at the sockmap implementation of unhash even though the psock has already been removed. The sockmap unhash handler when a psock is removed does the following,
void sock_map_unhash(struct sock *sk) { void (*saved_unhash)(struct sock *sk); struct sk_psock *psock;
rcu_read_lock(); psock = sk_psock(sk); if (unlikely(!psock)) { rcu_read_unlock(); if (sk->sk_prot->unhash) sk->sk_prot->unhash(sk); return; } [...] }
The unlikely() case is there to handle the case where psock is detached but the proto ops have not been updated yet. But, in the above case with TLS and removed psock we never fixed sk_prot->unhash() and unhash() points back to sock_map_unhash resulting in a loop. To fix this we added this bit of code,
static inline void sk_psock_restore_proto(struct sock *sk, struct sk_psock *psock) { sk->sk_prot->unhash = psock->saved_unhash;
This will set the sk_prot->unhash back to its saved value. This is the correct callback for a TLS socket that has been removed from the sock_map. Unfortunately, this also overwrites the unhash pointer for all psocks. We effectively break sockmap unhash handling for any future socks. Omitting the unhash operation will leave stale entries in the map if a socket transition through unhash, but does not do close() op.
To fix set unhash correctly before calling into tls_update. This way the TLS enabled socket will point to the saved unhash() handler.
Fixes: 4da6a196f93b1 ("bpf: Sockmap/tls, during free we may call tcp_bpf_unhash() in loop") Reported-by: Cong Wang xiyou.wangcong@gmail.com Reported-by: Lorenz Bauer lmb@cloudflare.com Suggested-by: Cong Wang xiyou.wangcong@gmail.com Signed-off-by: John Fastabend john.fastabend@gmail.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/161731441904.68884.15593917809745631972.stgit@jo... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/skmsg.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/include/linux/skmsg.h +++ b/include/linux/skmsg.h @@ -349,8 +349,13 @@ static inline void sk_psock_update_proto static inline void sk_psock_restore_proto(struct sock *sk, struct sk_psock *psock) { - sk->sk_prot->unhash = psock->saved_unhash; if (inet_csk_has_ulp(sk)) { + /* TLS does not have an unhash proto in SW cases, but we need + * to ensure we stop using the sock_map unhash routine because + * the associated psock is being removed. So use the original + * unhash handler. + */ + WRITE_ONCE(sk->sk_prot->unhash, psock->saved_unhash); tcp_update_ulp(sk, psock->sk_proto, psock->saved_write_space); } else { sk->sk_write_space = psock->saved_write_space;
From: John Fastabend john.fastabend@gmail.com
commit 144748eb0c445091466c9b741ebd0bfcc5914f3d upstream.
Incorrect accounting fwd_alloc can result in a warning when the socket is torn down,
[18455.319240] WARNING: CPU: 0 PID: 24075 at net/core/stream.c:208 sk_stream_kill_queues+0x21f/0x230 [...] [18455.319543] Call Trace: [18455.319556] inet_csk_destroy_sock+0xba/0x1f0 [18455.319577] tcp_rcv_state_process+0x1b4e/0x2380 [18455.319593] ? lock_downgrade+0x3a0/0x3a0 [18455.319617] ? tcp_finish_connect+0x1e0/0x1e0 [18455.319631] ? sk_reset_timer+0x15/0x70 [18455.319646] ? tcp_schedule_loss_probe+0x1b2/0x240 [18455.319663] ? lock_release+0xb2/0x3f0 [18455.319676] ? __release_sock+0x8a/0x1b0 [18455.319690] ? lock_downgrade+0x3a0/0x3a0 [18455.319704] ? lock_release+0x3f0/0x3f0 [18455.319717] ? __tcp_close+0x2c6/0x790 [18455.319736] ? tcp_v4_do_rcv+0x168/0x370 [18455.319750] tcp_v4_do_rcv+0x168/0x370 [18455.319767] __release_sock+0xbc/0x1b0 [18455.319785] __tcp_close+0x2ee/0x790 [18455.319805] tcp_close+0x20/0x80
This currently happens because on redirect case we do skb_set_owner_r() with the original sock. This increments the fwd_alloc memory accounting on the original sock. Then on redirect we may push this into the queue of the psock we are redirecting to. When the skb is flushed from the queue we give the memory back to the original sock. The problem is if the original sock is destroyed/closed with skbs on another psocks queue then the original sock will not have a way to reclaim the memory before being destroyed. Then above warning will be thrown
sockA sockB
sk_psock_strp_read() sk_psock_verdict_apply() -- SK_REDIRECT -- sk_psock_skb_redirect() skb_queue_tail(psock_other->ingress_skb..)
sk_close() sock_map_unref() sk_psock_put() sk_psock_drop() sk_psock_zap_ingress()
At this point we have torn down our own psock, but have the outstanding skb in psock_other. Note that SK_PASS doesn't have this problem because the sk_psock_drop() logic releases the skb, its still associated with our psock.
To resolve lets only account for sockets on the ingress queue that are still associated with the current socket. On the redirect case we will check memory limits per 6fa9201a89898, but will omit fwd_alloc accounting until skb is actually enqueued. When the skb is sent via skb_send_sock_locked or received with sk_psock_skb_ingress memory will be claimed on psock_other.
Fixes: 6fa9201a89898 ("bpf, sockmap: Avoid returning unneeded EAGAIN when redirecting to self") Reported-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: John Fastabend john.fastabend@gmail.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/161731444013.68884.4021114312848535993.stgit@joh... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/skmsg.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-)
--- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -488,6 +488,7 @@ static int sk_psock_skb_ingress_self(str if (unlikely(!msg)) return -EAGAIN; sk_msg_init(msg); + skb_set_owner_r(skb, sk); return sk_psock_skb_ingress_enqueue(skb, psock, sk, msg); }
@@ -791,7 +792,6 @@ static void sk_psock_tls_verdict_apply(s { switch (verdict) { case __SK_REDIRECT: - skb_set_owner_r(skb, sk); sk_psock_skb_redirect(skb); break; case __SK_PASS: @@ -809,10 +809,6 @@ int sk_psock_tls_strp_read(struct sk_pso rcu_read_lock(); prog = READ_ONCE(psock->progs.skb_verdict); if (likely(prog)) { - /* We skip full set_owner_r here because if we do a SK_PASS - * or SK_DROP we can skip skb memory accounting and use the - * TLS context. - */ skb->sk = psock->sk; tcp_skb_bpf_redirect_clear(skb); ret = sk_psock_bpf_run(psock, prog, skb); @@ -881,12 +877,13 @@ static void sk_psock_strp_read(struct st kfree_skb(skb); goto out; } - skb_set_owner_r(skb, sk); prog = READ_ONCE(psock->progs.skb_verdict); if (likely(prog)) { + skb->sk = sk; tcp_skb_bpf_redirect_clear(skb); ret = sk_psock_bpf_run(psock, prog, skb); ret = sk_psock_map_verd(ret, tcp_skb_bpf_redirect_fetch(skb)); + skb->sk = NULL; } sk_psock_verdict_apply(psock, skb, ret); out: @@ -957,12 +954,13 @@ static int sk_psock_verdict_recv(read_de kfree_skb(skb); goto out; } - skb_set_owner_r(skb, sk); prog = READ_ONCE(psock->progs.skb_verdict); if (likely(prog)) { + skb->sk = sk; tcp_skb_bpf_redirect_clear(skb); ret = sk_psock_bpf_run(psock, prog, skb); ret = sk_psock_map_verd(ret, tcp_skb_bpf_redirect_fetch(skb)); + skb->sk = NULL; } sk_psock_verdict_apply(psock, skb, ret); out:
From: Eric Dumazet edumazet@google.com
commit 61431a5907fc36d0738e9a547c7e1556349a03e9 upstream.
Commit 924a9bc362a5 ("net: check if protocol extracted by virtio_net_hdr_set_proto is correct") added a call to dev_parse_header_protocol() but mac_header is not yet set.
This means that eth_hdr() reads complete garbage, and syzbot complained about it [1]
This patch resets mac_header earlier, to get more coverage about this change.
Audit of virtio_net_hdr_to_skb() callers shows that this change should be safe.
[1]
BUG: KASAN: use-after-free in eth_header_parse_protocol+0xdc/0xe0 net/ethernet/eth.c:282 Read of size 2 at addr ffff888017a6200b by task syz-executor313/8409
CPU: 1 PID: 8409 Comm: syz-executor313 Not tainted 5.12.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x141/0x1d7 lib/dump_stack.c:120 print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:232 __kasan_report mm/kasan/report.c:399 [inline] kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:416 eth_header_parse_protocol+0xdc/0xe0 net/ethernet/eth.c:282 dev_parse_header_protocol include/linux/netdevice.h:3177 [inline] virtio_net_hdr_to_skb.constprop.0+0x99d/0xcd0 include/linux/virtio_net.h:83 packet_snd net/packet/af_packet.c:2994 [inline] packet_sendmsg+0x2325/0x52b0 net/packet/af_packet.c:3031 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:674 sock_no_sendpage+0xf3/0x130 net/core/sock.c:2860 kernel_sendpage.part.0+0x1ab/0x350 net/socket.c:3631 kernel_sendpage net/socket.c:3628 [inline] sock_sendpage+0xe5/0x140 net/socket.c:947 pipe_to_sendpage+0x2ad/0x380 fs/splice.c:364 splice_from_pipe_feed fs/splice.c:418 [inline] __splice_from_pipe+0x43e/0x8a0 fs/splice.c:562 splice_from_pipe fs/splice.c:597 [inline] generic_splice_sendpage+0xd4/0x140 fs/splice.c:746 do_splice_from fs/splice.c:767 [inline] do_splice+0xb7e/0x1940 fs/splice.c:1079 __do_splice+0x134/0x250 fs/splice.c:1144 __do_sys_splice fs/splice.c:1350 [inline] __se_sys_splice fs/splice.c:1332 [inline] __x64_sys_splice+0x198/0x250 fs/splice.c:1332 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
Fixes: 924a9bc362a5 ("net: check if protocol extracted by virtio_net_hdr_set_proto is correct") Signed-off-by: Eric Dumazet edumazet@google.com Cc: Balazs Nemeth bnemeth@redhat.com Cc: Willem de Bruijn willemb@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/virtio_net.h | 2 ++ 1 file changed, 2 insertions(+)
--- a/include/linux/virtio_net.h +++ b/include/linux/virtio_net.h @@ -62,6 +62,8 @@ static inline int virtio_net_hdr_to_skb( return -EINVAL; }
+ skb_reset_mac_header(skb); + if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) { u16 start = __virtio16_to_cpu(little_endian, hdr->csum_start); u16 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
From: Eric Dumazet edumazet@google.com
commit 0f6925b3e8da0dbbb52447ca8a8b42b371aac7db upstream.
Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs") brought a ~10% performance drop.
The reason for the performance drop was that GRO was forced to chain sk_buff (using skb_shinfo(skb)->frag_list), which uses more memory but also cause packet consumers to go over a lot of overhead handling all the tiny skbs.
It turns out that virtio_net page_to_skb() has a wrong strategy : It allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then copies 128 bytes from the page, before feeding the packet to GRO stack.
This was suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs") because GRO was using 2 frags per MSS, meaning we were not packing MSS with 100% efficiency.
Fix is to pull only the ethernet header in page_to_skb()
Then, we change virtio_net_hdr_to_skb() to pull the missing headers, instead of assuming they were already pulled by callers.
This fixes the performance regression, but could also allow virtio_net to accept packets with more than 128bytes of headers.
Many thanks to Xuan Zhuo for his report, and his tests/help.
Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs") Reported-by: Xuan Zhuo xuanzhuo@linux.alibaba.com Link: https://www.spinics.net/lists/netdev/msg731397.html Co-Developed-by: Xuan Zhuo xuanzhuo@linux.alibaba.com Signed-off-by: Xuan Zhuo xuanzhuo@linux.alibaba.com Signed-off-by: Eric Dumazet edumazet@google.com Cc: "Michael S. Tsirkin" mst@redhat.com Cc: Jason Wang jasowang@redhat.com Cc: virtualization@lists.linux-foundation.org Acked-by: Jason Wang jasowang@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/virtio_net.c | 10 +++++++--- include/linux/virtio_net.h | 14 +++++++++----- 2 files changed, 16 insertions(+), 8 deletions(-)
--- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -406,9 +406,13 @@ static struct sk_buff *page_to_skb(struc offset += hdr_padded_len; p += hdr_padded_len;
- copy = len; - if (copy > skb_tailroom(skb)) - copy = skb_tailroom(skb); + /* Copy all frame if it fits skb->head, otherwise + * we let virtio_net_hdr_to_skb() and GRO pull headers as needed. + */ + if (len <= skb_tailroom(skb)) + copy = len; + else + copy = ETH_HLEN + metasize; skb_put_data(skb, p, copy);
if (metasize) { --- a/include/linux/virtio_net.h +++ b/include/linux/virtio_net.h @@ -65,14 +65,18 @@ static inline int virtio_net_hdr_to_skb( skb_reset_mac_header(skb);
if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) { - u16 start = __virtio16_to_cpu(little_endian, hdr->csum_start); - u16 off = __virtio16_to_cpu(little_endian, hdr->csum_offset); + u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start); + u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset); + u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16)); + + if (!pskb_may_pull(skb, needed)) + return -EINVAL;
if (!skb_partial_csum_set(skb, start, off)) return -EINVAL;
p_off = skb_transport_offset(skb) + thlen; - if (p_off > skb_headlen(skb)) + if (!pskb_may_pull(skb, p_off)) return -EINVAL; } else { /* gso packets without NEEDS_CSUM do not set transport_offset. @@ -102,14 +106,14 @@ retry: }
p_off = keys.control.thoff + thlen; - if (p_off > skb_headlen(skb) || + if (!pskb_may_pull(skb, p_off) || keys.basic.ip_proto != ip_proto) return -EINVAL;
skb_set_transport_header(skb, keys.control.thoff); } else if (gso_type) { p_off = thlen; - if (p_off > skb_headlen(skb)) + if (!pskb_may_pull(skb, p_off)) return -EINVAL; } }
On Mon, Apr 12, 2021 at 10:39:29AM +0200, Greg Kroah-Hartman wrote:
From: Eric Dumazet edumazet@google.com
commit 0f6925b3e8da0dbbb52447ca8a8b42b371aac7db upstream.
Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs") brought a ~10% performance drop.
The reason for the performance drop was that GRO was forced to chain sk_buff (using skb_shinfo(skb)->frag_list), which uses more memory but also cause packet consumers to go over a lot of overhead handling all the tiny skbs.
It turns out that virtio_net page_to_skb() has a wrong strategy : It allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then copies 128 bytes from the page, before feeding the packet to GRO stack.
This was suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs") because GRO was using 2 frags per MSS, meaning we were not packing MSS with 100% efficiency.
Fix is to pull only the ethernet header in page_to_skb()
Then, we change virtio_net_hdr_to_skb() to pull the missing headers, instead of assuming they were already pulled by callers.
This fixes the performance regression, but could also allow virtio_net to accept packets with more than 128bytes of headers.
Many thanks to Xuan Zhuo for his report, and his tests/help.
Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs") Reported-by: Xuan Zhuo xuanzhuo@linux.alibaba.com Link: https://www.spinics.net/lists/netdev/msg731397.html Co-Developed-by: Xuan Zhuo xuanzhuo@linux.alibaba.com Signed-off-by: Xuan Zhuo xuanzhuo@linux.alibaba.com Signed-off-by: Eric Dumazet edumazet@google.com Cc: "Michael S. Tsirkin" mst@redhat.com Cc: Jason Wang jasowang@redhat.com Cc: virtualization@lists.linux-foundation.org Acked-by: Jason Wang jasowang@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Note that an issue related to this patch was recently reported. It's quite possible that the root cause is a bug elsewhere in the kernel, but it probably makes sense to defer the backport until we know more ...
drivers/net/virtio_net.c | 10 +++++++--- include/linux/virtio_net.h | 14 +++++++++----- 2 files changed, 16 insertions(+), 8 deletions(-)
--- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -406,9 +406,13 @@ static struct sk_buff *page_to_skb(struc offset += hdr_padded_len; p += hdr_padded_len;
- copy = len;
- if (copy > skb_tailroom(skb))
copy = skb_tailroom(skb);
- /* Copy all frame if it fits skb->head, otherwise
* we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
*/
- if (len <= skb_tailroom(skb))
copy = len;
- else
skb_put_data(skb, p, copy);copy = ETH_HLEN + metasize;
if (metasize) { --- a/include/linux/virtio_net.h +++ b/include/linux/virtio_net.h @@ -65,14 +65,18 @@ static inline int virtio_net_hdr_to_skb( skb_reset_mac_header(skb); if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
u16 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
u16 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
if (!pskb_may_pull(skb, needed))
return -EINVAL;
if (!skb_partial_csum_set(skb, start, off)) return -EINVAL; p_off = skb_transport_offset(skb) + thlen;
if (p_off > skb_headlen(skb))
} else { /* gso packets without NEEDS_CSUM do not set transport_offset.if (!pskb_may_pull(skb, p_off)) return -EINVAL;
@@ -102,14 +106,14 @@ retry: } p_off = keys.control.thoff + thlen;
if (p_off > skb_headlen(skb) ||
if (!pskb_may_pull(skb, p_off) || keys.basic.ip_proto != ip_proto) return -EINVAL;
skb_set_transport_header(skb, keys.control.thoff); } else if (gso_type) { p_off = thlen;
if (p_off > skb_headlen(skb))
} }if (!pskb_may_pull(skb, p_off)) return -EINVAL;
On Mon, Apr 12, 2021 at 05:11:40AM -0400, Michael S. Tsirkin wrote:
On Mon, Apr 12, 2021 at 10:39:29AM +0200, Greg Kroah-Hartman wrote:
From: Eric Dumazet edumazet@google.com
commit 0f6925b3e8da0dbbb52447ca8a8b42b371aac7db upstream.
Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs") brought a ~10% performance drop.
The reason for the performance drop was that GRO was forced to chain sk_buff (using skb_shinfo(skb)->frag_list), which uses more memory but also cause packet consumers to go over a lot of overhead handling all the tiny skbs.
It turns out that virtio_net page_to_skb() has a wrong strategy : It allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then copies 128 bytes from the page, before feeding the packet to GRO stack.
This was suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs") because GRO was using 2 frags per MSS, meaning we were not packing MSS with 100% efficiency.
Fix is to pull only the ethernet header in page_to_skb()
Then, we change virtio_net_hdr_to_skb() to pull the missing headers, instead of assuming they were already pulled by callers.
This fixes the performance regression, but could also allow virtio_net to accept packets with more than 128bytes of headers.
Many thanks to Xuan Zhuo for his report, and his tests/help.
Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs") Reported-by: Xuan Zhuo xuanzhuo@linux.alibaba.com Link: https://www.spinics.net/lists/netdev/msg731397.html Co-Developed-by: Xuan Zhuo xuanzhuo@linux.alibaba.com Signed-off-by: Xuan Zhuo xuanzhuo@linux.alibaba.com Signed-off-by: Eric Dumazet edumazet@google.com Cc: "Michael S. Tsirkin" mst@redhat.com Cc: Jason Wang jasowang@redhat.com Cc: virtualization@lists.linux-foundation.org Acked-by: Jason Wang jasowang@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Note that an issue related to this patch was recently reported. It's quite possible that the root cause is a bug elsewhere in the kernel, but it probably makes sense to defer the backport until we know more ...
Thanks, I'll go drop it from all 4 queues. If you all find out that all is good, and it should be added back, please let us at stable@vger know about it.
thanks,
greg k-h
From: Arkadiusz Kubalewski arkadiusz.kubalewski@intel.com
commit 8a1e918d833ca5c391c4ded5dc006e2d1ce6d37c upstream.
Set proper return values inside error checking if-statements.
Previously following warning was produced when compiling against sparse. i40e_main.c:15162 i40e_init_recovery_mode() warn: missing error code 'err'
Fixes: 4ff0ee1af0169 ("i40e: Introduce recovery mode support") Signed-off-by: Aleksandr Loktionov aleksandr.loktionov@intel.com Signed-off-by: Arkadiusz Kubalewski arkadiusz.kubalewski@intel.com Tested-by: Dave Switzer david.switzer@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/i40e/i40e_main.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -14647,12 +14647,16 @@ static int i40e_init_recovery_mode(struc * in order to register the netdev */ v_idx = i40e_vsi_mem_alloc(pf, I40E_VSI_MAIN); - if (v_idx < 0) + if (v_idx < 0) { + err = v_idx; goto err_switch_setup; + } pf->lan_vsi = v_idx; vsi = pf->vsi[v_idx]; - if (!vsi) + if (!vsi) { + err = -EFAULT; goto err_switch_setup; + } vsi->alloc_queue_pairs = 1; err = i40e_config_netdev(vsi); if (err)
From: Arkadiusz Kubalewski arkadiusz.kubalewski@intel.com
commit 6b5674fe6b9bf05394886ebcec62b2d7dae88c42 upstream.
Remove vsi->netdev->name from the trace. This is redundant information. With the devinfo trace, the adapter is already identifiable.
Previously following error was produced when compiling against sparse. i40e_main.c:2571 i40e_sync_vsi_filters() error: we previously assumed 'vsi->netdev' could be null (see line 2323)
Fixes: b603f9dc20af ("i40e: Log info when PF is entering and leaving Allmulti mode.") Signed-off-by: Aleksandr Loktionov aleksandr.loktionov@intel.com Signed-off-by: Arkadiusz Kubalewski arkadiusz.kubalewski@intel.com Tested-by: Dave Switzer david.switzer@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/i40e/i40e_main.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -2560,8 +2560,7 @@ int i40e_sync_vsi_filters(struct i40e_vs i40e_stat_str(hw, aq_ret), i40e_aq_str(hw, hw->aq.asq_last_status)); } else { - dev_info(&pf->pdev->dev, "%s is %s allmulti mode.\n", - vsi->netdev->name, + dev_info(&pf->pdev->dev, "%s allmulti mode.\n", cur_multipromisc ? "entering" : "leaving"); } }
From: Arkadiusz Kubalewski arkadiusz.kubalewski@intel.com
commit d6d04ee6d2c9bb5084c8f6074195d6aa0024e825 upstream.
Init pointer with NULL in default switch case statement.
Previously the error was produced when compiling against sparse. i40e_debugfs.c:582 i40e_dbg_dump_desc() error: uninitialized symbol 'ring'.
Fixes: 44ea803e2fa7 ("i40e: introduce new dump desc XDP command") Signed-off-by: Aleksandr Loktionov aleksandr.loktionov@intel.com Signed-off-by: Arkadiusz Kubalewski arkadiusz.kubalewski@intel.com Tested-by: Dave Switzer david.switzer@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/i40e/i40e_debugfs.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c +++ b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c @@ -578,6 +578,9 @@ static void i40e_dbg_dump_desc(int cnt, case RING_TYPE_XDP: ring = kmemdup(vsi->xdp_rings[ring_id], sizeof(*ring), GFP_KERNEL); break; + default: + ring = NULL; + break; } if (!ring) return;
From: Arkadiusz Kubalewski arkadiusz.kubalewski@intel.com
commit 12738ac4754ec92a6a45bf3677d8da780a1412b3 upstream.
Remove error handling through pointers. Instead use plain int to return value from i40e_run_xdp(...).
Previously: - sparse errors were produced during compilation: i40e_txrx.c:2338 i40e_run_xdp() error: (-2147483647) too low for ERR_PTR i40e_txrx.c:2558 i40e_clean_rx_irq() error: 'skb' dereferencing possible ERR_PTR()
- sk_buff* was used to return value, but it has never had valid pointer to sk_buff. Returned value was always int handled as a pointer.
Fixes: 0c8493d90b6b ("i40e: add XDP support for pass and drop actions") Fixes: 2e6893123830 ("i40e: split XDP_TX tail and XDP_REDIRECT map flushing") Signed-off-by: Aleksandr Loktionov aleksandr.loktionov@intel.com Signed-off-by: Arkadiusz Kubalewski arkadiusz.kubalewski@intel.com Tested-by: Dave Switzer david.switzer@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/i40e/i40e_txrx.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-)
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c @@ -2187,8 +2187,7 @@ int i40e_xmit_xdp_tx_ring(struct xdp_buf * @rx_ring: Rx ring being processed * @xdp: XDP buffer containing the frame **/ -static struct sk_buff *i40e_run_xdp(struct i40e_ring *rx_ring, - struct xdp_buff *xdp) +static int i40e_run_xdp(struct i40e_ring *rx_ring, struct xdp_buff *xdp) { int err, result = I40E_XDP_PASS; struct i40e_ring *xdp_ring; @@ -2227,7 +2226,7 @@ static struct sk_buff *i40e_run_xdp(stru } xdp_out: rcu_read_unlock(); - return ERR_PTR(-result); + return result; }
/** @@ -2339,6 +2338,7 @@ static int i40e_clean_rx_irq(struct i40e unsigned int xdp_xmit = 0; bool failure = false; struct xdp_buff xdp; + int xdp_res = 0;
#if (PAGE_SIZE < 8192) xdp.frame_sz = i40e_rx_frame_truesize(rx_ring, 0); @@ -2405,12 +2405,10 @@ static int i40e_clean_rx_irq(struct i40e /* At larger PAGE_SIZE, frame_sz depend on len size */ xdp.frame_sz = i40e_rx_frame_truesize(rx_ring, size); #endif - skb = i40e_run_xdp(rx_ring, &xdp); + xdp_res = i40e_run_xdp(rx_ring, &xdp); }
- if (IS_ERR(skb)) { - unsigned int xdp_res = -PTR_ERR(skb); - + if (xdp_res) { if (xdp_res & (I40E_XDP_TX | I40E_XDP_REDIR)) { xdp_xmit |= xdp_res; i40e_rx_buffer_flip(rx_ring, rx_buffer, size);
From: Eli Cohen elic@nvidia.com
commit bc04d93ea30a0a8eb2a2648b848cef35d1f6f798 upstream.
When we suspend the VM, the VDPA interface will be reset. When the VM is resumed again, clear_virtqueues() will clear the available and used indices resulting in hardware virqtqueue objects becoming out of sync. We can avoid this function alltogether since qemu will clear them if required, e.g. when the VM went through a reboot.
Moreover, since the hw available and used indices should always be identical on query and should be restored to the same value same value for virtqueues that complete in order, we set the single value provided by set_vq_state(). In get_vq_state() we return the value of hardware used index.
Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index after change map") Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Eli Cohen elic@nvidia.com Link: https://lore.kernel.org/r/20210408091047.4269-6-elic@nvidia.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Acked-by: Jason Wang jasowang@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/vdpa/mlx5/net/mlx5_vnet.c | 21 ++++++++------------- 1 file changed, 8 insertions(+), 13 deletions(-)
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c @@ -1154,6 +1154,7 @@ static void suspend_vq(struct mlx5_vdpa_ return; } mvq->avail_idx = attr.available_index; + mvq->used_idx = attr.used_index; }
static void suspend_vqs(struct mlx5_vdpa_net *ndev) @@ -1411,6 +1412,7 @@ static int mlx5_vdpa_set_vq_state(struct return -EINVAL; }
+ mvq->used_idx = state->avail_index; mvq->avail_idx = state->avail_index; return 0; } @@ -1428,7 +1430,11 @@ static int mlx5_vdpa_get_vq_state(struct * that cares about emulating the index after vq is stopped. */ if (!mvq->initialized) { - state->avail_index = mvq->avail_idx; + /* Firmware returns a wrong value for the available index. + * Since both values should be identical, we take the value of + * used_idx which is reported correctly. + */ + state->avail_index = mvq->used_idx; return 0; }
@@ -1437,7 +1443,7 @@ static int mlx5_vdpa_get_vq_state(struct mlx5_vdpa_warn(mvdev, "failed to query virtqueue\n"); return err; } - state->avail_index = attr.available_index; + state->avail_index = attr.used_index; return 0; }
@@ -1525,16 +1531,6 @@ static void teardown_virtqueues(struct m } }
-static void clear_virtqueues(struct mlx5_vdpa_net *ndev) -{ - int i; - - for (i = ndev->mvdev.max_vqs - 1; i >= 0; i--) { - ndev->vqs[i].avail_idx = 0; - ndev->vqs[i].used_idx = 0; - } -} - /* TODO: cross-endian support */ static inline bool mlx5_vdpa_is_little_endian(struct mlx5_vdpa_dev *mvdev) { @@ -1770,7 +1766,6 @@ static void mlx5_vdpa_set_status(struct if (!status) { mlx5_vdpa_info(mvdev, "performing device reset\n"); teardown_driver(ndev); - clear_virtqueues(ndev); mlx5_vdpa_destroy_mr(&ndev->mvdev); ndev->mvdev.status = 0; ndev->mvdev.mlx_features = 0;
From: Pavel Tikhomirov ptikhomirov@virtuozzo.com
commit 1ffbc7ea91606e4abd10eb60de5367f1c86daf5e upstream.
Reproduce:
modprobe sch_teql tc qdisc add dev teql0 root teql0
This leads to (for instance in Centos 7 VM) OOPS:
[ 532.366633] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8 [ 532.366733] IP: [<ffffffffc06124a8>] teql_destroy+0x18/0x100 [sch_teql] [ 532.366825] PGD 80000001376d5067 PUD 137e37067 PMD 0 [ 532.366906] Oops: 0000 [#1] SMP [ 532.366987] Modules linked in: sch_teql ... [ 532.367945] CPU: 1 PID: 3026 Comm: tc Kdump: loaded Tainted: G ------------ T 3.10.0-1062.7.1.el7.x86_64 #1 [ 532.368041] Hardware name: Virtuozzo KVM, BIOS 1.11.0-2.vz7.2 04/01/2014 [ 532.368125] task: ffff8b7d37d31070 ti: ffff8b7c9fdbc000 task.ti: ffff8b7c9fdbc000 [ 532.368224] RIP: 0010:[<ffffffffc06124a8>] [<ffffffffc06124a8>] teql_destroy+0x18/0x100 [sch_teql] [ 532.368320] RSP: 0018:ffff8b7c9fdbf8e0 EFLAGS: 00010286 [ 532.368394] RAX: ffffffffc0612490 RBX: ffff8b7cb1565e00 RCX: ffff8b7d35ba2000 [ 532.368476] RDX: ffff8b7d35ba2000 RSI: 0000000000000000 RDI: ffff8b7cb1565e00 [ 532.368557] RBP: ffff8b7c9fdbf8f8 R08: ffff8b7d3fd1f140 R09: ffff8b7d3b001600 [ 532.368638] R10: ffff8b7d3b001600 R11: ffffffff84c7d65b R12: 00000000ffffffd8 [ 532.368719] R13: 0000000000008000 R14: ffff8b7d35ba2000 R15: ffff8b7c9fdbf9a8 [ 532.368800] FS: 00007f6a4e872740(0000) GS:ffff8b7d3fd00000(0000) knlGS:0000000000000000 [ 532.368885] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 532.368961] CR2: 00000000000000a8 CR3: 00000001396ee000 CR4: 00000000000206e0 [ 532.369046] Call Trace: [ 532.369159] [<ffffffff84c8192e>] qdisc_create+0x36e/0x450 [ 532.369268] [<ffffffff846a9b49>] ? ns_capable+0x29/0x50 [ 532.369366] [<ffffffff849afde2>] ? nla_parse+0x32/0x120 [ 532.369442] [<ffffffff84c81b4c>] tc_modify_qdisc+0x13c/0x610 [ 532.371508] [<ffffffff84c693e7>] rtnetlink_rcv_msg+0xa7/0x260 [ 532.372668] [<ffffffff84907b65>] ? sock_has_perm+0x75/0x90 [ 532.373790] [<ffffffff84c69340>] ? rtnl_newlink+0x890/0x890 [ 532.374914] [<ffffffff84c8da7b>] netlink_rcv_skb+0xab/0xc0 [ 532.376055] [<ffffffff84c63708>] rtnetlink_rcv+0x28/0x30 [ 532.377204] [<ffffffff84c8d400>] netlink_unicast+0x170/0x210 [ 532.378333] [<ffffffff84c8d7a8>] netlink_sendmsg+0x308/0x420 [ 532.379465] [<ffffffff84c2f3a6>] sock_sendmsg+0xb6/0xf0 [ 532.380710] [<ffffffffc034a56e>] ? __xfs_filemap_fault+0x8e/0x1d0 [xfs] [ 532.381868] [<ffffffffc034a75c>] ? xfs_filemap_fault+0x2c/0x30 [xfs] [ 532.383037] [<ffffffff847ec23a>] ? __do_fault.isra.61+0x8a/0x100 [ 532.384144] [<ffffffff84c30269>] ___sys_sendmsg+0x3e9/0x400 [ 532.385268] [<ffffffff847f3fad>] ? handle_mm_fault+0x39d/0x9b0 [ 532.386387] [<ffffffff84d88678>] ? __do_page_fault+0x238/0x500 [ 532.387472] [<ffffffff84c31921>] __sys_sendmsg+0x51/0x90 [ 532.388560] [<ffffffff84c31972>] SyS_sendmsg+0x12/0x20 [ 532.389636] [<ffffffff84d8dede>] system_call_fastpath+0x25/0x2a [ 532.390704] [<ffffffff84d8de21>] ? system_call_after_swapgs+0xae/0x146 [ 532.391753] Code: 00 00 00 00 00 00 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 41 55 41 54 53 48 8b b7 48 01 00 00 48 89 fb <48> 8b 8e a8 00 00 00 48 85 c9 74 43 48 89 ca eb 0f 0f 1f 80 00 [ 532.394036] RIP [<ffffffffc06124a8>] teql_destroy+0x18/0x100 [sch_teql] [ 532.395127] RSP <ffff8b7c9fdbf8e0> [ 532.396179] CR2: 00000000000000a8
Null pointer dereference happens on master->slaves dereference in teql_destroy() as master is null-pointer.
When qdisc_create() calls teql_qdisc_init() it imediately fails after check "if (m->dev == dev)" because both devices are teql0, and it does not set qdisc_priv(sch)->m leaving it zero on error path, then qdisc_create() imediately calls teql_destroy() which does not expect zero master pointer and we get OOPS.
Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation") Signed-off-by: Pavel Tikhomirov ptikhomirov@virtuozzo.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_teql.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/net/sched/sch_teql.c +++ b/net/sched/sch_teql.c @@ -134,6 +134,9 @@ teql_destroy(struct Qdisc *sch) struct teql_sched_data *dat = qdisc_priv(sch); struct teql_master *master = dat->m;
+ if (!master) + return; + prev = master->slaves; if (prev) { do {
From: Vlad Buslov vladbu@nvidia.com
commit 87c750e8c38bce706eb32e4d8f1e3402f2cebbd4 upstream.
Action init code increments reference counter when it changes an action. This is the desired behavior for cls API which needs to obtain action reference for every classifier that points to action. However, act API just needs to change the action and releases the reference before returning. This sequence breaks when the requested action doesn't exist, which causes act API init code to create new action with specified index, but action is still released before returning and is deleted (unless it was referenced concurrently by cls API).
Reproduction:
$ sudo tc actions ls action gact $ sudo tc actions change action gact drop index 1 $ sudo tc actions ls action gact
Extend tcf_action_init() to accept 'init_res' array and initialize it with action->ops->init() result. In tcf_action_add() remove pointers to created actions from actions array before passing it to tcf_action_put_many().
Fixes: cae422f379f3 ("net: sched: use reference counting action init") Reported-by: Kumar Kartikeya Dwivedi memxor@gmail.com Signed-off-by: Vlad Buslov vladbu@nvidia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/act_api.h | 5 +++-- net/sched/act_api.c | 22 +++++++++++++++------- net/sched/cls_api.c | 9 +++++---- 3 files changed, 23 insertions(+), 13 deletions(-)
--- a/include/net/act_api.h +++ b/include/net/act_api.h @@ -185,7 +185,7 @@ int tcf_action_exec(struct sk_buff *skb, int nr_actions, struct tcf_result *res); int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla, struct nlattr *est, char *name, int ovr, int bind, - struct tc_action *actions[], size_t *attr_size, + struct tc_action *actions[], int init_res[], size_t *attr_size, bool rtnl_held, struct netlink_ext_ack *extack); struct tc_action_ops *tc_action_load_ops(char *name, struct nlattr *nla, bool rtnl_held, @@ -193,7 +193,8 @@ struct tc_action_ops *tc_action_load_ops struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp, struct nlattr *nla, struct nlattr *est, char *name, int ovr, int bind, - struct tc_action_ops *ops, bool rtnl_held, + struct tc_action_ops *a_o, int *init_res, + bool rtnl_held, struct netlink_ext_ack *extack); int tcf_action_dump(struct sk_buff *skb, struct tc_action *actions[], int bind, int ref, bool terse); --- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -972,7 +972,8 @@ struct tc_action_ops *tc_action_load_ops struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp, struct nlattr *nla, struct nlattr *est, char *name, int ovr, int bind, - struct tc_action_ops *a_o, bool rtnl_held, + struct tc_action_ops *a_o, int *init_res, + bool rtnl_held, struct netlink_ext_ack *extack) { struct nla_bitfield32 flags = { 0, 0 }; @@ -1008,6 +1009,7 @@ struct tc_action *tcf_action_init_1(stru } if (err < 0) goto err_out; + *init_res = err;
if (!name && tb[TCA_ACT_COOKIE]) tcf_set_action_cookie(&a->act_cookie, cookie); @@ -1036,7 +1038,7 @@ err_out:
int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla, struct nlattr *est, char *name, int ovr, int bind, - struct tc_action *actions[], size_t *attr_size, + struct tc_action *actions[], int init_res[], size_t *attr_size, bool rtnl_held, struct netlink_ext_ack *extack) { struct tc_action_ops *ops[TCA_ACT_MAX_PRIO] = {}; @@ -1064,7 +1066,8 @@ int tcf_action_init(struct net *net, str
for (i = 1; i <= TCA_ACT_MAX_PRIO && tb[i]; i++) { act = tcf_action_init_1(net, tp, tb[i], est, name, ovr, bind, - ops[i - 1], rtnl_held, extack); + ops[i - 1], &init_res[i - 1], rtnl_held, + extack); if (IS_ERR(act)) { err = PTR_ERR(act); goto err; @@ -1477,12 +1480,13 @@ static int tcf_action_add(struct net *ne struct netlink_ext_ack *extack) { size_t attr_size = 0; - int loop, ret; + int loop, ret, i; struct tc_action *actions[TCA_ACT_MAX_PRIO] = {}; + int init_res[TCA_ACT_MAX_PRIO] = {};
for (loop = 0; loop < 10; loop++) { ret = tcf_action_init(net, NULL, nla, NULL, NULL, ovr, 0, - actions, &attr_size, true, extack); + actions, init_res, &attr_size, true, extack); if (ret != -EAGAIN) break; } @@ -1490,8 +1494,12 @@ static int tcf_action_add(struct net *ne if (ret < 0) return ret; ret = tcf_add_notify(net, n, actions, portid, attr_size, extack); - if (ovr) - tcf_action_put_many(actions); + + /* only put existing actions */ + for (i = 0; i < TCA_ACT_MAX_PRIO; i++) + if (init_res[i] == ACT_P_CREATED) + actions[i] = NULL; + tcf_action_put_many(actions);
return ret; } --- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c @@ -3051,6 +3051,7 @@ int tcf_exts_validate(struct net *net, s { #ifdef CONFIG_NET_CLS_ACT { + int init_res[TCA_ACT_MAX_PRIO] = {}; struct tc_action *act; size_t attr_size = 0;
@@ -3062,8 +3063,8 @@ int tcf_exts_validate(struct net *net, s return PTR_ERR(a_o); act = tcf_action_init_1(net, tp, tb[exts->police], rate_tlv, "police", ovr, - TCA_ACT_BIND, a_o, rtnl_held, - extack); + TCA_ACT_BIND, a_o, init_res, + rtnl_held, extack); if (IS_ERR(act)) { module_put(a_o->owner); return PTR_ERR(act); @@ -3078,8 +3079,8 @@ int tcf_exts_validate(struct net *net, s
err = tcf_action_init(net, tp, tb[exts->action], rate_tlv, NULL, ovr, TCA_ACT_BIND, - exts->actions, &attr_size, - rtnl_held, extack); + exts->actions, init_res, + &attr_size, rtnl_held, extack); if (err < 0) return err; exts->nr_actions = err;
From: Johannes Berg johannes.berg@intel.com
commit 9a6847ba1747858ccac53c5aba3e25c54fbdf846 upstream.
If the beacon head attribute (NL80211_ATTR_BEACON_HEAD) is too short to even contain the frame control field, we access uninitialized data beyond the buffer. Fix this by checking the minimal required size first. We used to do this until S1G support was added, where the fixed data portion has a different size.
Reported-and-tested-by: syzbot+72b99dcf4607e8c770f3@syzkaller.appspotmail.com Suggested-by: Eric Dumazet eric.dumazet@gmail.com Fixes: 1d47f1198d58 ("nl80211: correctly validate S1G beacon head") Signed-off-by: Johannes Berg johannes.berg@intel.com Link: https://lore.kernel.org/r/20210408154518.d9b06d39b4ee.Iff908997b2a4067e8d456... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/wireless/nl80211.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/net/wireless/nl80211.c +++ b/net/wireless/nl80211.c @@ -209,9 +209,13 @@ static int validate_beacon_head(const st unsigned int len = nla_len(attr); const struct element *elem; const struct ieee80211_mgmt *mgmt = (void *)data; - bool s1g_bcn = ieee80211_is_s1g_beacon(mgmt->frame_control); unsigned int fixedlen, hdrlen; + bool s1g_bcn;
+ if (len < offsetofend(typeof(*mgmt), frame_control)) + goto err; + + s1g_bcn = ieee80211_is_s1g_beacon(mgmt->frame_control); if (s1g_bcn) { fixedlen = offsetof(struct ieee80211_ext, u.s1g_beacon.variable);
From: Johannes Berg johannes.berg@intel.com
commit abaf94ecc9c356d0b885a84edef4905cdd89cfdd upstream.
In case nl80211_parse_unsol_bcast_probe_resp() results in an error, need to "goto out" instead of just returning to free possibly allocated data.
Fixes: 7443dcd1f171 ("nl80211: Unsolicited broadcast probe response support") Link: https://lore.kernel.org/r/20210408142833.d8bc2e2e454a.If290b1ba85789726a671f... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/wireless/nl80211.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/net/wireless/nl80211.c +++ b/net/wireless/nl80211.c @@ -5,7 +5,7 @@ * Copyright 2006-2010 Johannes Berg johannes@sipsolutions.net * Copyright 2013-2014 Intel Mobile Communications GmbH * Copyright 2015-2017 Intel Deutschland GmbH - * Copyright (C) 2018-2020 Intel Corporation + * Copyright (C) 2018-2021 Intel Corporation */
#include <linux/if.h> @@ -5324,7 +5324,7 @@ static int nl80211_start_ap(struct sk_bu rdev, info->attrs[NL80211_ATTR_UNSOL_BCAST_PROBE_RESP], ¶ms); if (err) - return err; + goto out; }
nl80211_calculate_ap_params(¶ms);
From: Johannes Berg johannes.berg@intel.com
commit b5ac0146492fc5c199de767e492be8a66471011a upstream.
We need to check the length of this element so that we don't access data beyond its end. Fix that.
Fixes: 9eaffe5078ca ("cfg80211: convert S1G beacon to scan results") Link: https://lore.kernel.org/r/20210408142826.f6f4525012de.I9fdeff0afdc683a6024e5... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/wireless/scan.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
--- a/net/wireless/scan.c +++ b/net/wireless/scan.c @@ -2351,14 +2351,16 @@ cfg80211_inform_single_bss_frame_data(st return NULL;
if (ext) { - struct ieee80211_s1g_bcn_compat_ie *compat; - u8 *ie; + const struct ieee80211_s1g_bcn_compat_ie *compat; + const struct element *elem;
- ie = (void *)cfg80211_find_ie(WLAN_EID_S1G_BCN_COMPAT, - variable, ielen); - if (!ie) + elem = cfg80211_find_elem(WLAN_EID_S1G_BCN_COMPAT, + variable, ielen); + if (!elem) return NULL; - compat = (void *)(ie + 2); + if (elem->datalen < sizeof(*compat)) + return NULL; + compat = (void *)elem->data; bssid = ext->u.s1g_beacon.sa; capability = le16_to_cpu(compat->compat_info); beacon_int = le16_to_cpu(compat->beacon_int);
From: Ben Greear greearb@candelatech.com
commit 7d73cd946d4bc7d44cdc5121b1c61d5d71425dea upstream.
The incorrect timeout check caused probing to happen when it did not need to happen. This in turn caused tx performance drop for around 5 seconds in ath10k-ct driver. Possibly that tx drop is due to a secondary issue, but fixing the probe to not happen when traffic is running fixes the symptom.
Signed-off-by: Ben Greear greearb@candelatech.com Fixes: 9abf4e49830d ("mac80211: optimize station connection monitor") Acked-by: Felix Fietkau nbd@nbd.name Link: https://lore.kernel.org/r/20210330230749.14097-1-greearb@candelatech.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mac80211/mlme.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/net/mac80211/mlme.c +++ b/net/mac80211/mlme.c @@ -4660,7 +4660,10 @@ static void ieee80211_sta_conn_mon_timer timeout = sta->rx_stats.last_rx; timeout += IEEE80211_CONNECTION_IDLE_TIME;
- if (time_is_before_jiffies(timeout)) { + /* If timeout is after now, then update timer to fire at + * the later date, but do not actually probe at this time. + */ + if (time_is_after_jiffies(timeout)) { mod_timer(&ifmgd->conn_mon_timer, round_jiffies_up(timeout)); return; }
From: Johannes Berg johannes.berg@intel.com
commit 1153a74768a9212daadbb50767aa400bc6a0c9b0 upstream.
Normally, TXQs have
txq->tid = tid; txq->ac = ieee80211_ac_from_tid(tid);
However, the special management TXQ actually has
txq->tid = IEEE80211_NUM_TIDS; // 16 txq->ac = IEEE80211_AC_VO;
This makes sense, but ieee80211_ac_from_tid(16) is the same as ieee80211_ac_from_tid(0) which is just IEEE80211_AC_BE.
Now, normally this is fine. However, if the netdev queues were stopped, then the code in ieee80211_tx_dequeue() will propagate the stop from the interface (vif->txqs_stopped[]) if the AC 2 (ieee80211_ac_from_tid(txq->tid)) is marked as stopped. On wake, however, __ieee80211_wake_txqs() will wake the TXQ if AC 0 (txq->ac) is woken up.
If a driver stops all queues with ieee80211_stop_tx_queues() and then wakes them again with ieee80211_wake_tx_queues(), the ieee80211_wake_txqs() tasklet will run to resync queue and TXQ state. If all queues were woken, then what'll happen is that _ieee80211_wake_txqs() will run in order of HW queues 0-3, typically (and certainly for iwlwifi) corresponding to ACs 0-3, so it'll call __ieee80211_wake_txqs() for each AC in order 0-3.
When __ieee80211_wake_txqs() is called for AC 0 (VO) that'll wake up the management TXQ (remember its tid is 16), and the driver's wake_tx_queue() will be called. That tries to get a frame, which will immediately *stop* the TXQ again, because now we check against AC 2, and AC 2 hasn't yet been marked as woken up again in sdata->vif.txqs_stopped[] since we're only in the __ieee80211_wake_txqs() call for AC 0.
Thus, the management TXQ will never be started again.
Fix this by checking txq->ac directly instead of calculating the AC as ieee80211_ac_from_tid(txq->tid).
Fixes: adf8ed01e4fd ("mac80211: add an optional TXQ for other PS-buffered frames") Acked-by: Toke Høiland-Jørgensen toke@redhat.com Link: https://lore.kernel.org/r/20210323210500.bf4d50afea4a.I136ffde910486301f8818... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mac80211/tx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/mac80211/tx.c +++ b/net/mac80211/tx.c @@ -3605,7 +3605,7 @@ begin: test_bit(IEEE80211_TXQ_STOP_NETIF_TX, &txqi->flags)) goto out;
- if (vif->txqs_stopped[ieee80211_ac_from_tid(txq->tid)]) { + if (vif->txqs_stopped[txq->ac]) { set_bit(IEEE80211_TXQ_STOP_NETIF_TX, &txqi->flags); goto out; }
From: Kurt Kanzenbach kurt@linutronix.de
commit 9d6803921a16f4d768dc41a75375629828f4d91e upstream.
Reset MAC header in HSR Tx path. This is needed, because direct packet transmission, e.g. by specifying PACKET_QDISC_BYPASS does not reset the MAC header.
This has been observed using the following setup:
|$ ip link add name hsr0 type hsr slave1 lan0 slave2 lan1 supervision 45 version 1 |$ ifconfig hsr0 up |$ ./test hsr0
The test binary is using mmap'ed sockets and is specifying the PACKET_QDISC_BYPASS socket option.
This patch resolves the following warning on a non-patched kernel:
|[ 112.725394] ------------[ cut here ]------------ |[ 112.731418] WARNING: CPU: 1 PID: 257 at net/hsr/hsr_forward.c:560 hsr_forward_skb+0x484/0x568 |[ 112.739962] net/hsr/hsr_forward.c:560: Malformed frame (port_src hsr0)
The warning can be safely removed, because the other call sites of hsr_forward_skb() make sure that the skb is prepared correctly.
Fixes: d346a3fae3ff ("packet: introduce PACKET_QDISC_BYPASS socket option") Signed-off-by: Kurt Kanzenbach kurt@linutronix.de Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/hsr/hsr_device.c | 1 + net/hsr/hsr_forward.c | 6 ------ 2 files changed, 1 insertion(+), 6 deletions(-)
--- a/net/hsr/hsr_device.c +++ b/net/hsr/hsr_device.c @@ -217,6 +217,7 @@ static netdev_tx_t hsr_dev_xmit(struct s master = hsr_port_get_hsr(hsr, HSR_PT_MASTER); if (master) { skb->dev = master->dev; + skb_reset_mac_header(skb); hsr_forward_skb(skb, master); } else { atomic_long_inc(&dev->tx_dropped); --- a/net/hsr/hsr_forward.c +++ b/net/hsr/hsr_forward.c @@ -528,12 +528,6 @@ void hsr_forward_skb(struct sk_buff *skb { struct hsr_frame_info frame;
- if (skb_mac_header(skb) != skb->data) { - WARN_ONCE(1, "%s:%d: Malformed frame (port_src %s)\n", - __FILE__, __LINE__, port->dev->name); - goto out_drop; - } - if (fill_frame_info(&frame, skb, port) < 0) goto out_drop;
From: Maciej Żenczykowski maze@google.com
commit 630e4576f83accf90366686f39808d665d8dbecc upstream.
Found by virtue of ipv6 raw sockets not honouring the per-socket IP{,V6}_FREEBIND setting.
Based on hits found via: git grep '[.]ip_nonlocal_bind' We fix both raw ipv6 sockets to honour IP{,V6}_FREEBIND and IP{,V6}_TRANSPARENT, and we fix sctp sockets to honour IP{,V6}_TRANSPARENT (they already honoured FREEBIND), and not just the ipv6 'ip_nonlocal_bind' sysctl.
The helper is defined as: static inline bool ipv6_can_nonlocal_bind(struct net *net, struct inet_sock *inet) { return net->ipv6.sysctl.ip_nonlocal_bind || inet->freebind || inet->transparent; } so this change only widens the accepted opt-outs and is thus a clean bugfix.
I'm not entirely sure what 'fixes' tag to add, since this is AFAICT an ancient bug, but IMHO this should be applied to stable kernels as far back as possible. As such I'm adding a 'fixes' tag with the commit that originally added the helper, which happened in 4.19. Backporting to older LTS kernels (at least 4.9 and 4.14) would presumably require open-coding it or backporting the helper as well.
Other possibly relevant commits: v4.18-rc6-1502-g83ba4645152d net: add helpers checking if socket can be bound to nonlocal address v4.18-rc6-1431-gd0c1f01138c4 net/ipv6: allow any source address for sendmsg pktinfo with ip_nonlocal_bind v4.14-rc5-271-gb71d21c274ef sctp: full support for ipv6 ip_nonlocal_bind & IP_FREEBIND v4.7-rc7-1883-g9b9742022888 sctp: support ipv6 nonlocal bind v4.1-12247-g35a256fee52c ipv6: Nonlocal bind
Cc: Lorenzo Colitti lorenzo@google.com Fixes: 83ba4645152d ("net: add helpers checking if socket can be bound to nonlocal address") Signed-off-by: Maciej Żenczykowski maze@google.com Reviewed-By: Lorenzo Colitti lorenzo@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/raw.c | 2 +- net/sctp/ipv6.c | 7 +++---- 2 files changed, 4 insertions(+), 5 deletions(-)
--- a/net/ipv6/raw.c +++ b/net/ipv6/raw.c @@ -298,7 +298,7 @@ static int rawv6_bind(struct sock *sk, s */ v4addr = LOOPBACK4_IPV6; if (!(addr_type & IPV6_ADDR_MULTICAST) && - !sock_net(sk)->ipv6.sysctl.ip_nonlocal_bind) { + !ipv6_can_nonlocal_bind(sock_net(sk), inet)) { err = -EADDRNOTAVAIL; if (!ipv6_chk_addr(sock_net(sk), &addr->sin6_addr, dev, 0)) { --- a/net/sctp/ipv6.c +++ b/net/sctp/ipv6.c @@ -643,8 +643,8 @@ static int sctp_v6_available(union sctp_ if (!(type & IPV6_ADDR_UNICAST)) return 0;
- return sp->inet.freebind || net->ipv6.sysctl.ip_nonlocal_bind || - ipv6_chk_addr(net, in6, NULL, 0); + return ipv6_can_nonlocal_bind(net, &sp->inet) || + ipv6_chk_addr(net, in6, NULL, 0); }
/* This function checks if the address is a valid address to be used for @@ -933,8 +933,7 @@ static int sctp_inet6_bind_verify(struct net = sock_net(&opt->inet.sk); rcu_read_lock(); dev = dev_get_by_index_rcu(net, addr->v6.sin6_scope_id); - if (!dev || !(opt->inet.freebind || - net->ipv6.sysctl.ip_nonlocal_bind || + if (!dev || !(ipv6_can_nonlocal_bind(net, &opt->inet) || ipv6_chk_addr(net, &addr->v6.sin6_addr, dev, 0))) { rcu_read_unlock();
From: Paolo Abeni pabeni@redhat.com
commit 9adc89af724f12a03b47099cd943ed54e877cd59 upstream.
Currently the mentioned helper can end-up freeing the socket wmem without waking-up any processes waiting for more write memory.
If the partially orphaned skb is attached to an UDP (or raw) socket, the lack of wake-up can hang the user-space.
Even for TCP sockets not calling the sk destructor could have bad effects on TSQ.
Address the issue using skb_orphan to release the sk wmem before setting the new sock_efree destructor. Additionally bundle the whole ownership update in a new helper, so that later other potential users could avoid duplicate code.
v1 -> v2: - use skb_orphan() instead of sort of open coding it (Eric) - provide an helper for the ownership change (Eric)
Fixes: f6ba8d33cfbb ("netem: fix skb_orphan_partial()") Suggested-by: Eric Dumazet edumazet@google.com Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/sock.h | 9 +++++++++ net/core/sock.c | 12 +++--------- 2 files changed, 12 insertions(+), 9 deletions(-)
--- a/include/net/sock.h +++ b/include/net/sock.h @@ -2197,6 +2197,15 @@ static inline void skb_set_owner_r(struc sk_mem_charge(sk, skb->truesize); }
+static inline void skb_set_owner_sk_safe(struct sk_buff *skb, struct sock *sk) +{ + if (sk && refcount_inc_not_zero(&sk->sk_refcnt)) { + skb_orphan(skb); + skb->destructor = sock_efree; + skb->sk = sk; + } +} + void sk_reset_timer(struct sock *sk, struct timer_list *timer, unsigned long expires);
--- a/net/core/sock.c +++ b/net/core/sock.c @@ -2099,16 +2099,10 @@ void skb_orphan_partial(struct sk_buff * if (skb_is_tcp_pure_ack(skb)) return;
- if (can_skb_orphan_partial(skb)) { - struct sock *sk = skb->sk; - - if (refcount_inc_not_zero(&sk->sk_refcnt)) { - WARN_ON(refcount_sub_and_test(skb->truesize, &sk->sk_wmem_alloc)); - skb->destructor = sock_efree; - } - } else { + if (can_skb_orphan_partial(skb)) + skb_set_owner_sk_safe(skb, skb->sk); + else skb_orphan(skb); - } } EXPORT_SYMBOL(skb_orphan_partial);
From: Dan Carpenter dan.carpenter@oracle.com
commit bec4d7c93afc07dd0454ae41c559513f858cfb83 upstream.
After the device_register() succeeds, then the correct way to clean up is to call device_unregister(). The unregister calls both device_del() and device_put(). Since this code was only device_del() it results in a memory leak.
Fixes: dacb12877d92 ("thunderbolt: Add support for on-board retimers") Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Reviewed-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/thunderbolt/retimer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/thunderbolt/retimer.c +++ b/drivers/thunderbolt/retimer.c @@ -347,7 +347,7 @@ static int tb_retimer_add(struct tb_port ret = tb_retimer_nvm_add(rt); if (ret) { dev_err(&rt->dev, "failed to add NVM devices: %d\n", ret); - device_del(&rt->dev); + device_unregister(&rt->dev); return ret; }
From: Dan Carpenter dan.carpenter@oracle.com
commit 08fe7ae1857080f5075df5ac7fef2ecd4e289117 upstream.
This array uses 1-based indexing so it corrupts memory one element beyond of the array. Fix it by making the array one element larger.
Fixes: dacb12877d92 ("thunderbolt: Add support for on-board retimers") Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/thunderbolt/retimer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/thunderbolt/retimer.c +++ b/drivers/thunderbolt/retimer.c @@ -406,7 +406,7 @@ static struct tb_retimer *tb_port_find_r */ int tb_retimer_scan(struct tb_port *port) { - u32 status[TB_MAX_RETIMER_INDEX] = {}; + u32 status[TB_MAX_RETIMER_INDEX + 1] = {}; int ret, i, last_idx = 0;
if (!port->cap_usb4)
From: Shuah Khan skhan@linuxfoundation.org
commit 4e9c93af7279b059faf5bb1897ee90512b258a12 upstream.
Fuzzing uncovered race condition between sysfs code paths in usbip drivers. Device connect/disconnect code paths initiated through sysfs interface are prone to races if disconnect happens during connect and vice versa.
This problem is common to all drivers while it can be reproduced easily in vhci_hcd. Add a sysfs_lock to usbip_device struct to protect the paths.
Use this in vhci_hcd to protect sysfs paths. For a complete fix, usip_host and usip-vudc drivers and the event handler will have to use this lock to protect the paths. These changes will be done in subsequent patches.
Cc: stable@vger.kernel.org Reported-and-tested-by: syzbot+a93fba6d384346a761e3@syzkaller.appspotmail.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Link: https://lore.kernel.org/r/b6568f7beae702bbc236a545d3c020106ca75eac.161680711... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/usbip/usbip_common.h | 3 +++ drivers/usb/usbip/vhci_hcd.c | 1 + drivers/usb/usbip/vhci_sysfs.c | 30 +++++++++++++++++++++++++----- 3 files changed, 29 insertions(+), 5 deletions(-)
--- a/drivers/usb/usbip/usbip_common.h +++ b/drivers/usb/usbip/usbip_common.h @@ -263,6 +263,9 @@ struct usbip_device { /* lock for status */ spinlock_t lock;
+ /* mutex for synchronizing sysfs store paths */ + struct mutex sysfs_lock; + int sockfd; struct socket *tcp_socket;
--- a/drivers/usb/usbip/vhci_hcd.c +++ b/drivers/usb/usbip/vhci_hcd.c @@ -1101,6 +1101,7 @@ static void vhci_device_init(struct vhci vdev->ud.side = USBIP_VHCI; vdev->ud.status = VDEV_ST_NULL; spin_lock_init(&vdev->ud.lock); + mutex_init(&vdev->ud.sysfs_lock);
INIT_LIST_HEAD(&vdev->priv_rx); INIT_LIST_HEAD(&vdev->priv_tx); --- a/drivers/usb/usbip/vhci_sysfs.c +++ b/drivers/usb/usbip/vhci_sysfs.c @@ -185,6 +185,8 @@ static int vhci_port_disconnect(struct v
usbip_dbg_vhci_sysfs("enter\n");
+ mutex_lock(&vdev->ud.sysfs_lock); + /* lock */ spin_lock_irqsave(&vhci->lock, flags); spin_lock(&vdev->ud.lock); @@ -195,6 +197,7 @@ static int vhci_port_disconnect(struct v /* unlock */ spin_unlock(&vdev->ud.lock); spin_unlock_irqrestore(&vhci->lock, flags); + mutex_unlock(&vdev->ud.sysfs_lock);
return -EINVAL; } @@ -205,6 +208,8 @@ static int vhci_port_disconnect(struct v
usbip_event_add(&vdev->ud, VDEV_EVENT_DOWN);
+ mutex_unlock(&vdev->ud.sysfs_lock); + return 0; }
@@ -349,30 +354,36 @@ static ssize_t attach_store(struct devic else vdev = &vhci->vhci_hcd_hs->vdev[rhport];
+ mutex_lock(&vdev->ud.sysfs_lock); + /* Extract socket from fd. */ socket = sockfd_lookup(sockfd, &err); if (!socket) { dev_err(dev, "failed to lookup sock"); - return -EINVAL; + err = -EINVAL; + goto unlock_mutex; } if (socket->type != SOCK_STREAM) { dev_err(dev, "Expecting SOCK_STREAM - found %d", socket->type); sockfd_put(socket); - return -EINVAL; + err = -EINVAL; + goto unlock_mutex; }
/* create threads before locking */ tcp_rx = kthread_create(vhci_rx_loop, &vdev->ud, "vhci_rx"); if (IS_ERR(tcp_rx)) { sockfd_put(socket); - return -EINVAL; + err = -EINVAL; + goto unlock_mutex; } tcp_tx = kthread_create(vhci_tx_loop, &vdev->ud, "vhci_tx"); if (IS_ERR(tcp_tx)) { kthread_stop(tcp_rx); sockfd_put(socket); - return -EINVAL; + err = -EINVAL; + goto unlock_mutex; }
/* get task structs now */ @@ -397,7 +408,8 @@ static ssize_t attach_store(struct devic * Will be retried from userspace * if there's another free port. */ - return -EBUSY; + err = -EBUSY; + goto unlock_mutex; }
dev_info(dev, "pdev(%u) rhport(%u) sockfd(%d)\n", @@ -422,7 +434,15 @@ static ssize_t attach_store(struct devic
rh_port_connect(vdev, speed);
+ dev_info(dev, "Device attached\n"); + + mutex_unlock(&vdev->ud.sysfs_lock); + return count; + +unlock_mutex: + mutex_unlock(&vdev->ud.sysfs_lock); + return err; } static DEVICE_ATTR_WO(attach);
From: Shuah Khan skhan@linuxfoundation.org
commit 9dbf34a834563dada91366c2ac266f32ff34641a upstream.
Fuzzing uncovered race condition between sysfs code paths in usbip drivers. Device connect/disconnect code paths initiated through sysfs interface are prone to races if disconnect happens during connect and vice versa.
Use sysfs_lock to protect sysfs paths in stub-dev.
Cc: stable@vger.kernel.org Reported-and-tested-by: syzbot+a93fba6d384346a761e3@syzkaller.appspotmail.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Link: https://lore.kernel.org/r/2b182f3561b4a065bf3bf6dce3b0e9944ba17b3f.161680711... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/usbip/stub_dev.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
--- a/drivers/usb/usbip/stub_dev.c +++ b/drivers/usb/usbip/stub_dev.c @@ -63,6 +63,7 @@ static ssize_t usbip_sockfd_store(struct
dev_info(dev, "stub up\n");
+ mutex_lock(&sdev->ud.sysfs_lock); spin_lock_irq(&sdev->ud.lock);
if (sdev->ud.status != SDEV_ST_AVAILABLE) { @@ -87,13 +88,13 @@ static ssize_t usbip_sockfd_store(struct tcp_rx = kthread_create(stub_rx_loop, &sdev->ud, "stub_rx"); if (IS_ERR(tcp_rx)) { sockfd_put(socket); - return -EINVAL; + goto unlock_mutex; } tcp_tx = kthread_create(stub_tx_loop, &sdev->ud, "stub_tx"); if (IS_ERR(tcp_tx)) { kthread_stop(tcp_rx); sockfd_put(socket); - return -EINVAL; + goto unlock_mutex; }
/* get task structs now */ @@ -112,6 +113,8 @@ static ssize_t usbip_sockfd_store(struct wake_up_process(sdev->ud.tcp_rx); wake_up_process(sdev->ud.tcp_tx);
+ mutex_unlock(&sdev->ud.sysfs_lock); + } else { dev_info(dev, "stub down\n");
@@ -122,6 +125,7 @@ static ssize_t usbip_sockfd_store(struct spin_unlock_irq(&sdev->ud.lock);
usbip_event_add(&sdev->ud, SDEV_EVENT_DOWN); + mutex_unlock(&sdev->ud.sysfs_lock); }
return count; @@ -130,6 +134,8 @@ sock_err: sockfd_put(socket); err: spin_unlock_irq(&sdev->ud.lock); +unlock_mutex: + mutex_unlock(&sdev->ud.sysfs_lock); return -EINVAL; } static DEVICE_ATTR_WO(usbip_sockfd); @@ -270,6 +276,7 @@ static struct stub_device *stub_device_a sdev->ud.side = USBIP_STUB; sdev->ud.status = SDEV_ST_AVAILABLE; spin_lock_init(&sdev->ud.lock); + mutex_init(&sdev->ud.sysfs_lock); sdev->ud.tcp_socket = NULL; sdev->ud.sockfd = -1;
From: Shuah Khan skhan@linuxfoundation.org
commit bd8b82042269a95db48074b8bb400678dbac1815 upstream.
Fuzzing uncovered race condition between sysfs code paths in usbip drivers. Device connect/disconnect code paths initiated through sysfs interface are prone to races if disconnect happens during connect and vice versa.
Use sysfs_lock to protect sysfs paths in vudc.
Cc: stable@vger.kernel.org Reported-and-tested-by: syzbot+a93fba6d384346a761e3@syzkaller.appspotmail.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Link: https://lore.kernel.org/r/caabcf3fc87bdae970509b5ff32d05bb7ce2fb15.161680711... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/usbip/vudc_dev.c | 1 + drivers/usb/usbip/vudc_sysfs.c | 5 +++++ 2 files changed, 6 insertions(+)
--- a/drivers/usb/usbip/vudc_dev.c +++ b/drivers/usb/usbip/vudc_dev.c @@ -572,6 +572,7 @@ static int init_vudc_hw(struct vudc *udc init_waitqueue_head(&udc->tx_waitq);
spin_lock_init(&ud->lock); + mutex_init(&ud->sysfs_lock); ud->status = SDEV_ST_AVAILABLE; ud->side = USBIP_VUDC;
--- a/drivers/usb/usbip/vudc_sysfs.c +++ b/drivers/usb/usbip/vudc_sysfs.c @@ -112,6 +112,7 @@ static ssize_t usbip_sockfd_store(struct dev_err(dev, "no device"); return -ENODEV; } + mutex_lock(&udc->ud.sysfs_lock); spin_lock_irqsave(&udc->lock, flags); /* Don't export what we don't have */ if (!udc->driver || !udc->pullup) { @@ -187,6 +188,8 @@ static ssize_t usbip_sockfd_store(struct
wake_up_process(udc->ud.tcp_rx); wake_up_process(udc->ud.tcp_tx); + + mutex_unlock(&udc->ud.sysfs_lock); return count;
} else { @@ -207,6 +210,7 @@ static ssize_t usbip_sockfd_store(struct }
spin_unlock_irqrestore(&udc->lock, flags); + mutex_unlock(&udc->ud.sysfs_lock);
return count;
@@ -216,6 +220,7 @@ unlock_ud: spin_unlock_irq(&udc->ud.lock); unlock: spin_unlock_irqrestore(&udc->lock, flags); + mutex_unlock(&udc->ud.sysfs_lock);
return ret; }
From: Shuah Khan skhan@linuxfoundation.org
commit 363eaa3a450abb4e63bd6e3ad79d1f7a0f717814 upstream.
Fuzzing uncovered race condition between sysfs code paths in usbip drivers. Device connect/disconnect code paths initiated through sysfs interface are prone to races if disconnect happens during connect and vice versa.
Use sysfs_lock to synchronize event handler with sysfs paths in usbip drivers.
Cc: stable@vger.kernel.org Reported-and-tested-by: syzbot+a93fba6d384346a761e3@syzkaller.appspotmail.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Link: https://lore.kernel.org/r/c5c8723d3f29dfe3d759cfaafa7dd16b0dfe2918.161680711... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/usbip/usbip_event.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/usb/usbip/usbip_event.c +++ b/drivers/usb/usbip/usbip_event.c @@ -70,6 +70,7 @@ static void event_handler(struct work_st while ((ud = get_event()) != NULL) { usbip_dbg_eh("pending event %lx\n", ud->event);
+ mutex_lock(&ud->sysfs_lock); /* * NOTE: shutdown must come first. * Shutdown the device. @@ -90,6 +91,7 @@ static void event_handler(struct work_st ud->eh_ops.unusable(ud); unset_event(ud, USBIP_EH_UNUSABLE); } + mutex_unlock(&ud->sysfs_lock);
wake_up(&ud->eh_waitq); }
From: Saravana Kannan saravanak@google.com
commit eed6e41813deb9ee622cd9242341f21430d7789f upstream.
list_for_each_entry_safe() is only useful if we are deleting nodes in a linked list within the loop. It doesn't protect against other threads adding/deleting nodes to the list in parallel. We need to grab deferred_probe_mutex when traversing the deferred_probe_pending_list.
Cc: stable@vger.kernel.org Fixes: 25b4e70dcce9 ("driver core: allow stopping deferred probe after init") Signed-off-by: Saravana Kannan saravanak@google.com Link: https://lore.kernel.org/r/20210402040342.2944858-2-saravanak@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/base/dd.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
--- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -292,14 +292,16 @@ int driver_deferred_probe_check_state(st
static void deferred_probe_timeout_work_func(struct work_struct *work) { - struct device_private *private, *p; + struct device_private *p;
driver_deferred_probe_timeout = 0; driver_deferred_probe_trigger(); flush_work(&deferred_probe_work);
- list_for_each_entry_safe(private, p, &deferred_probe_pending_list, deferred_probe) - dev_info(private->device, "deferred probe pending\n"); + mutex_lock(&deferred_probe_mutex); + list_for_each_entry(p, &deferred_probe_pending_list, deferred_probe) + dev_info(p->device, "deferred probe pending\n"); + mutex_unlock(&deferred_probe_mutex); wake_up_all(&probe_timeout_waitqueue); } static DECLARE_DELAYED_WORK(deferred_probe_timeout_work, deferred_probe_timeout_work_func);
From: Viswas G Viswas.G@microchip.com
commit 65df7d1986a1909a0869419919e7d9c78d70407e upstream.
Inbound and outbound queues were not properly configured and that lead to MPI configuration failure.
Fixes: 05c6c029a44d ("scsi: pm80xx: Increase number of supported queues") Cc: stable@vger.kernel.org # 5.10+ Link: https://lore.kernel.org/r/20210402054212.17834-1-Viswas.G@microchip.com.com Reported-and-tested-by: Ash Izat ash@ai0.uk Signed-off-by: Viswas G Viswas.G@microchip.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/pm8001/pm8001_hwi.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/scsi/pm8001/pm8001_hwi.c +++ b/drivers/scsi/pm8001/pm8001_hwi.c @@ -223,7 +223,7 @@ static void init_default_table_values(st PM8001_EVENT_LOG_SIZE; pm8001_ha->main_cfg_tbl.pm8001_tbl.iop_event_log_option = 0x01; pm8001_ha->main_cfg_tbl.pm8001_tbl.fatal_err_interrupt = 0x01; - for (i = 0; i < PM8001_MAX_INB_NUM; i++) { + for (i = 0; i < pm8001_ha->max_q_num; i++) { pm8001_ha->inbnd_q_tbl[i].element_pri_size_cnt = PM8001_MPI_QUEUE | (pm8001_ha->iomb_size << 16) | (0x00<<30); pm8001_ha->inbnd_q_tbl[i].upper_base_addr = @@ -249,7 +249,7 @@ static void init_default_table_values(st pm8001_ha->inbnd_q_tbl[i].producer_idx = 0; pm8001_ha->inbnd_q_tbl[i].consumer_index = 0; } - for (i = 0; i < PM8001_MAX_OUTB_NUM; i++) { + for (i = 0; i < pm8001_ha->max_q_num; i++) { pm8001_ha->outbnd_q_tbl[i].element_size_cnt = PM8001_MPI_QUEUE | (pm8001_ha->iomb_size << 16) | (0x01<<30); pm8001_ha->outbnd_q_tbl[i].upper_base_addr = @@ -671,9 +671,9 @@ static int pm8001_chip_init(struct pm800 read_outbnd_queue_table(pm8001_ha); /* update main config table ,inbound table and outbound table */ update_main_config_table(pm8001_ha); - for (i = 0; i < PM8001_MAX_INB_NUM; i++) + for (i = 0; i < pm8001_ha->max_q_num; i++) update_inbnd_queue_table(pm8001_ha, i); - for (i = 0; i < PM8001_MAX_OUTB_NUM; i++) + for (i = 0; i < pm8001_ha->max_q_num; i++) update_outbnd_queue_table(pm8001_ha, i); /* 8081 controller donot require these operations */ if (deviceid != 0x8081 && deviceid != 0x0042) {
From: Roman Bolshakov r.bolshakov@yadro.com
commit 0352c3d3959a6cf543075b88c7e662fd3546f12e upstream.
target_sequencer_start event is triggered inside target_cmd_init_cdb(). se_cmd.tag is not initialized with ITT at the moment so the event always prints zero tag.
Link: https://lore.kernel.org/r/20210403215415.95077-1-r.bolshakov@yadro.com Cc: stable@vger.kernel.org # 5.10+ Reviewed-by: Mike Christie michael.christie@oracle.com Signed-off-by: Roman Bolshakov r.bolshakov@yadro.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/target/iscsi/iscsi_target.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/target/iscsi/iscsi_target.c +++ b/drivers/target/iscsi/iscsi_target.c @@ -1166,6 +1166,7 @@ int iscsit_setup_scsi_cmd(struct iscsi_c
target_get_sess_cmd(&cmd->se_cmd, true);
+ cmd->se_cmd.tag = (__force u32)cmd->init_task_tag; cmd->sense_reason = target_cmd_init_cdb(&cmd->se_cmd, hdr->cdb); if (cmd->sense_reason) { if (cmd->sense_reason == TCM_OUT_OF_RESOURCES) { @@ -1180,8 +1181,6 @@ int iscsit_setup_scsi_cmd(struct iscsi_c if (cmd->sense_reason) goto attach_cmd;
- /* only used for printks or comparing with ->ref_task_tag */ - cmd->se_cmd.tag = (__force u32)cmd->init_task_tag; cmd->sense_reason = target_cmd_parse_cdb(&cmd->se_cmd); if (cmd->sense_reason) goto attach_cmd;
From: Roman Gushchin guro@fb.com
commit 0760fa3d8f7fceeea508b98899f1c826e10ffe78 upstream.
nr_empty_pop_pages is used to guarantee that there are some free populated pages to satisfy atomic allocations. Accounted and non-accounted allocations are using separate sets of chunks, so both need to have a surplus of empty pages.
This commit makes pcpu_nr_empty_pop_pages and the corresponding logic per chunk type.
[Dennis] This issue came up as I was reviewing [1] and realized I missed this. Simultaneously, it was reported btrfs was seeing failed atomic allocations in fsstress tests [2] and [3].
[1] https://lore.kernel.org/linux-mm/20210324190626.564297-1-guro@fb.com/ [2] https://lore.kernel.org/linux-mm/20210401185158.3275.409509F4@e16-tech.com/ [3] https://lore.kernel.org/linux-mm/CAL3q7H5RNBjCi708GH7jnczAOe0BLnacT9C+OBgA-D...
Fixes: 3c7be18ac9a0 ("mm: memcg/percpu: account percpu memory to memory cgroups") Cc: stable@vger.kernel.org # 5.9+ Signed-off-by: Roman Gushchin guro@fb.com Tested-by: Filipe Manana fdmanana@suse.com Signed-off-by: Dennis Zhou dennis@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/percpu-internal.h | 2 +- mm/percpu-stats.c | 9 +++++++-- mm/percpu.c | 14 +++++++------- 3 files changed, 15 insertions(+), 10 deletions(-)
--- a/mm/percpu-internal.h +++ b/mm/percpu-internal.h @@ -87,7 +87,7 @@ extern spinlock_t pcpu_lock;
extern struct list_head *pcpu_chunk_lists; extern int pcpu_nr_slots; -extern int pcpu_nr_empty_pop_pages; +extern int pcpu_nr_empty_pop_pages[];
extern struct pcpu_chunk *pcpu_first_chunk; extern struct pcpu_chunk *pcpu_reserved_chunk; --- a/mm/percpu-stats.c +++ b/mm/percpu-stats.c @@ -145,6 +145,7 @@ static int percpu_stats_show(struct seq_ int slot, max_nr_alloc; int *buffer; enum pcpu_chunk_type type; + int nr_empty_pop_pages;
alloc_buffer: spin_lock_irq(&pcpu_lock); @@ -165,7 +166,11 @@ alloc_buffer: goto alloc_buffer; }
-#define PL(X) \ + nr_empty_pop_pages = 0; + for (type = 0; type < PCPU_NR_CHUNK_TYPES; type++) + nr_empty_pop_pages += pcpu_nr_empty_pop_pages[type]; + +#define PL(X) \ seq_printf(m, " %-20s: %12lld\n", #X, (long long int)pcpu_stats_ai.X)
seq_printf(m, @@ -196,7 +201,7 @@ alloc_buffer: PU(nr_max_chunks); PU(min_alloc_size); PU(max_alloc_size); - P("empty_pop_pages", pcpu_nr_empty_pop_pages); + P("empty_pop_pages", nr_empty_pop_pages); seq_putc(m, '\n');
#undef PU --- a/mm/percpu.c +++ b/mm/percpu.c @@ -172,10 +172,10 @@ struct list_head *pcpu_chunk_lists __ro_ static LIST_HEAD(pcpu_map_extend_chunks);
/* - * The number of empty populated pages, protected by pcpu_lock. The - * reserved chunk doesn't contribute to the count. + * The number of empty populated pages by chunk type, protected by pcpu_lock. + * The reserved chunk doesn't contribute to the count. */ -int pcpu_nr_empty_pop_pages; +int pcpu_nr_empty_pop_pages[PCPU_NR_CHUNK_TYPES];
/* * The number of populated pages in use by the allocator, protected by @@ -555,7 +555,7 @@ static inline void pcpu_update_empty_pag { chunk->nr_empty_pop_pages += nr; if (chunk != pcpu_reserved_chunk) - pcpu_nr_empty_pop_pages += nr; + pcpu_nr_empty_pop_pages[pcpu_chunk_type(chunk)] += nr; }
/* @@ -1831,7 +1831,7 @@ area_found: mutex_unlock(&pcpu_alloc_mutex); }
- if (pcpu_nr_empty_pop_pages < PCPU_EMPTY_POP_PAGES_LOW) + if (pcpu_nr_empty_pop_pages[type] < PCPU_EMPTY_POP_PAGES_LOW) pcpu_schedule_balance_work();
/* clear the areas and return address relative to base address */ @@ -1999,7 +1999,7 @@ retry_pop: pcpu_atomic_alloc_failed = false; } else { nr_to_pop = clamp(PCPU_EMPTY_POP_PAGES_HIGH - - pcpu_nr_empty_pop_pages, + pcpu_nr_empty_pop_pages[type], 0, PCPU_EMPTY_POP_PAGES_HIGH); }
@@ -2579,7 +2579,7 @@ void __init pcpu_setup_first_chunk(const
/* link the first chunk in */ pcpu_first_chunk = chunk; - pcpu_nr_empty_pop_pages = pcpu_first_chunk->nr_empty_pop_pages; + pcpu_nr_empty_pop_pages[PCPU_CHUNK_ROOT] = pcpu_first_chunk->nr_empty_pop_pages; pcpu_chunk_relocate(pcpu_first_chunk, -1);
/* include all regions of the first chunk */
From: Wolfram Sang wsa+renesas@sang-engineering.com
commit e409a6a3e0690efdef9b8a96197bc61ff117cfaf upstream.
In some configurations, recovery is optional. So, don't throw an error when it is not used because e.g. pinctrl settings for recovery are not provided. Reword the message and make it debug output.
Reported-by: Klaus Kudielka klaus.kudielka@gmail.com Tested-by: Klaus Kudielka klaus.kudielka@gmail.com Signed-off-by: Wolfram Sang wsa+renesas@sang-engineering.com Signed-off-by: Wolfram Sang wsa@kernel.org Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i2c/i2c-core-base.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/drivers/i2c/i2c-core-base.c +++ b/drivers/i2c/i2c-core-base.c @@ -378,7 +378,7 @@ static int i2c_gpio_init_recovery(struct static int i2c_init_recovery(struct i2c_adapter *adap) { struct i2c_bus_recovery_info *bri = adap->bus_recovery_info; - char *err_str; + char *err_str, *err_level = KERN_ERR;
if (!bri) return 0; @@ -387,7 +387,8 @@ static int i2c_init_recovery(struct i2c_ return -EPROBE_DEFER;
if (!bri->recover_bus) { - err_str = "no recover_bus() found"; + err_str = "no suitable method provided"; + err_level = KERN_DEBUG; goto err; }
@@ -414,7 +415,7 @@ static int i2c_init_recovery(struct i2c_
return 0; err: - dev_err(&adap->dev, "Not using recovery: %s\n", err_str); + dev_printk(err_level, &adap->dev, "Not using recovery: %s\n", err_str); adap->bus_recovery_info = NULL;
return -EINVAL;
From: Ben Gardon bgardon@google.com
[ Upstream commit e28a436ca4f65384cceaf3f4da0e00aa74244e6a ]
Currently the TDP MMU yield / cond_resched functions either return nothing or return true if the TLBs were not flushed. These are confusing semantics, especially when making control flow decisions in calling functions.
To clean things up, change both functions to have the same return value semantics as cond_resched: true if the thread yielded, false if it did not. If the function yielded in the _flush_ version, then the TLBs will have been flushed.
Reviewed-by: Peter Feiner pfeiner@google.com Acked-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Ben Gardon bgardon@google.com Message-Id: 20210202185734.1680553-2-bgardon@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/mmu/tdp_mmu.c | 39 ++++++++++++++++++++++++++++---------- 1 file changed, 29 insertions(+), 10 deletions(-)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index ffa0bd0e033f..22efd016f05e 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -405,8 +405,15 @@ static inline void tdp_mmu_set_spte_no_dirty_log(struct kvm *kvm, _mmu->shadow_root_level, _start, _end)
/* - * Flush the TLB if the process should drop kvm->mmu_lock. - * Return whether the caller still needs to flush the tlb. + * Flush the TLB and yield if the MMU lock is contended or this thread needs to + * return control to the scheduler. + * + * If this function yields, it will also reset the tdp_iter's walk over the + * paging structure and the calling function should allow the iterator to + * continue its traversal from the paging structure root. + * + * Return true if this function yielded, the TLBs were flushed, and the + * iterator's traversal was reset. Return false if a yield was not needed. */ static bool tdp_mmu_iter_flush_cond_resched(struct kvm *kvm, struct tdp_iter *iter) { @@ -414,18 +421,32 @@ static bool tdp_mmu_iter_flush_cond_resched(struct kvm *kvm, struct tdp_iter *it kvm_flush_remote_tlbs(kvm); cond_resched_lock(&kvm->mmu_lock); tdp_iter_refresh_walk(iter); - return false; - } else { return true; } + + return false; }
-static void tdp_mmu_iter_cond_resched(struct kvm *kvm, struct tdp_iter *iter) +/* + * Yield if the MMU lock is contended or this thread needs to return control + * to the scheduler. + * + * If this function yields, it will also reset the tdp_iter's walk over the + * paging structure and the calling function should allow the iterator to + * continue its traversal from the paging structure root. + * + * Return true if this function yielded and the iterator's traversal was reset. + * Return false if a yield was not needed. + */ +static bool tdp_mmu_iter_cond_resched(struct kvm *kvm, struct tdp_iter *iter) { if (need_resched() || spin_needbreak(&kvm->mmu_lock)) { cond_resched_lock(&kvm->mmu_lock); tdp_iter_refresh_walk(iter); + return true; } + + return false; }
/* @@ -461,10 +482,8 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
tdp_mmu_set_spte(kvm, &iter, 0);
- if (can_yield) - flush_needed = tdp_mmu_iter_flush_cond_resched(kvm, &iter); - else - flush_needed = true; + flush_needed = !can_yield || + !tdp_mmu_iter_flush_cond_resched(kvm, &iter); } return flush_needed; } @@ -1061,7 +1080,7 @@ static void zap_collapsible_spte_range(struct kvm *kvm,
tdp_mmu_set_spte(kvm, &iter, 0);
- spte_set = tdp_mmu_iter_flush_cond_resched(kvm, &iter); + spte_set = !tdp_mmu_iter_flush_cond_resched(kvm, &iter); }
if (spte_set)
From: Ben Gardon bgardon@google.com
[ Upstream commit e139a34ef9d5627a41e1c02210229082140d1f92 ]
The flushing and non-flushing variants of tdp_mmu_iter_cond_resched have almost identical implementations. Merge the two functions and add a flush parameter.
Signed-off-by: Ben Gardon bgardon@google.com Message-Id: 20210202185734.1680553-12-bgardon@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/mmu/tdp_mmu.c | 42 ++++++++++++-------------------------- 1 file changed, 13 insertions(+), 29 deletions(-)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 22efd016f05e..3b14d0008f92 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -404,33 +404,13 @@ static inline void tdp_mmu_set_spte_no_dirty_log(struct kvm *kvm, for_each_tdp_pte(_iter, __va(_mmu->root_hpa), \ _mmu->shadow_root_level, _start, _end)
-/* - * Flush the TLB and yield if the MMU lock is contended or this thread needs to - * return control to the scheduler. - * - * If this function yields, it will also reset the tdp_iter's walk over the - * paging structure and the calling function should allow the iterator to - * continue its traversal from the paging structure root. - * - * Return true if this function yielded, the TLBs were flushed, and the - * iterator's traversal was reset. Return false if a yield was not needed. - */ -static bool tdp_mmu_iter_flush_cond_resched(struct kvm *kvm, struct tdp_iter *iter) -{ - if (need_resched() || spin_needbreak(&kvm->mmu_lock)) { - kvm_flush_remote_tlbs(kvm); - cond_resched_lock(&kvm->mmu_lock); - tdp_iter_refresh_walk(iter); - return true; - } - - return false; -} - /* * Yield if the MMU lock is contended or this thread needs to return control * to the scheduler. * + * If this function should yield and flush is set, it will perform a remote + * TLB flush before yielding. + * * If this function yields, it will also reset the tdp_iter's walk over the * paging structure and the calling function should allow the iterator to * continue its traversal from the paging structure root. @@ -438,9 +418,13 @@ static bool tdp_mmu_iter_flush_cond_resched(struct kvm *kvm, struct tdp_iter *it * Return true if this function yielded and the iterator's traversal was reset. * Return false if a yield was not needed. */ -static bool tdp_mmu_iter_cond_resched(struct kvm *kvm, struct tdp_iter *iter) +static inline bool tdp_mmu_iter_cond_resched(struct kvm *kvm, + struct tdp_iter *iter, bool flush) { if (need_resched() || spin_needbreak(&kvm->mmu_lock)) { + if (flush) + kvm_flush_remote_tlbs(kvm); + cond_resched_lock(&kvm->mmu_lock); tdp_iter_refresh_walk(iter); return true; @@ -483,7 +467,7 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, tdp_mmu_set_spte(kvm, &iter, 0);
flush_needed = !can_yield || - !tdp_mmu_iter_flush_cond_resched(kvm, &iter); + !tdp_mmu_iter_cond_resched(kvm, &iter, true); } return flush_needed; } @@ -852,7 +836,7 @@ static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte); spte_set = true;
- tdp_mmu_iter_cond_resched(kvm, &iter); + tdp_mmu_iter_cond_resched(kvm, &iter, false); } return spte_set; } @@ -911,7 +895,7 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte); spte_set = true;
- tdp_mmu_iter_cond_resched(kvm, &iter); + tdp_mmu_iter_cond_resched(kvm, &iter, false); } return spte_set; } @@ -1027,7 +1011,7 @@ static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, tdp_mmu_set_spte(kvm, &iter, new_spte); spte_set = true;
- tdp_mmu_iter_cond_resched(kvm, &iter); + tdp_mmu_iter_cond_resched(kvm, &iter, false); }
return spte_set; @@ -1080,7 +1064,7 @@ static void zap_collapsible_spte_range(struct kvm *kvm,
tdp_mmu_set_spte(kvm, &iter, 0);
- spte_set = !tdp_mmu_iter_flush_cond_resched(kvm, &iter); + spte_set = !tdp_mmu_iter_cond_resched(kvm, &iter, true); }
if (spte_set)
From: Ben Gardon bgardon@google.com
[ Upstream commit 74953d3530280dc53256054e1906f58d07bfba44 ]
The goal_gfn field in tdp_iter can be misleading as it implies that it is the iterator's final goal. It is really a target for the lowest gfn mapped by the leaf level SPTE the iterator will traverse towards. Change the field's name to be more precise.
Signed-off-by: Ben Gardon bgardon@google.com Message-Id: 20210202185734.1680553-13-bgardon@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/mmu/tdp_iter.c | 20 ++++++++++---------- arch/x86/kvm/mmu/tdp_iter.h | 4 ++-- 2 files changed, 12 insertions(+), 12 deletions(-)
diff --git a/arch/x86/kvm/mmu/tdp_iter.c b/arch/x86/kvm/mmu/tdp_iter.c index 87b7e16911db..9917c55b7d24 100644 --- a/arch/x86/kvm/mmu/tdp_iter.c +++ b/arch/x86/kvm/mmu/tdp_iter.c @@ -22,21 +22,21 @@ static gfn_t round_gfn_for_level(gfn_t gfn, int level)
/* * Sets a TDP iterator to walk a pre-order traversal of the paging structure - * rooted at root_pt, starting with the walk to translate goal_gfn. + * rooted at root_pt, starting with the walk to translate next_last_level_gfn. */ void tdp_iter_start(struct tdp_iter *iter, u64 *root_pt, int root_level, - int min_level, gfn_t goal_gfn) + int min_level, gfn_t next_last_level_gfn) { WARN_ON(root_level < 1); WARN_ON(root_level > PT64_ROOT_MAX_LEVEL);
- iter->goal_gfn = goal_gfn; + iter->next_last_level_gfn = next_last_level_gfn; iter->root_level = root_level; iter->min_level = min_level; iter->level = root_level; iter->pt_path[iter->level - 1] = root_pt;
- iter->gfn = round_gfn_for_level(iter->goal_gfn, iter->level); + iter->gfn = round_gfn_for_level(iter->next_last_level_gfn, iter->level); tdp_iter_refresh_sptep(iter);
iter->valid = true; @@ -82,7 +82,7 @@ static bool try_step_down(struct tdp_iter *iter)
iter->level--; iter->pt_path[iter->level - 1] = child_pt; - iter->gfn = round_gfn_for_level(iter->goal_gfn, iter->level); + iter->gfn = round_gfn_for_level(iter->next_last_level_gfn, iter->level); tdp_iter_refresh_sptep(iter);
return true; @@ -106,7 +106,7 @@ static bool try_step_side(struct tdp_iter *iter) return false;
iter->gfn += KVM_PAGES_PER_HPAGE(iter->level); - iter->goal_gfn = iter->gfn; + iter->next_last_level_gfn = iter->gfn; iter->sptep++; iter->old_spte = READ_ONCE(*iter->sptep);
@@ -166,13 +166,13 @@ void tdp_iter_next(struct tdp_iter *iter) */ void tdp_iter_refresh_walk(struct tdp_iter *iter) { - gfn_t goal_gfn = iter->goal_gfn; + gfn_t next_last_level_gfn = iter->next_last_level_gfn;
- if (iter->gfn > goal_gfn) - goal_gfn = iter->gfn; + if (iter->gfn > next_last_level_gfn) + next_last_level_gfn = iter->gfn;
tdp_iter_start(iter, iter->pt_path[iter->root_level - 1], - iter->root_level, iter->min_level, goal_gfn); + iter->root_level, iter->min_level, next_last_level_gfn); }
u64 *tdp_iter_root_pt(struct tdp_iter *iter) diff --git a/arch/x86/kvm/mmu/tdp_iter.h b/arch/x86/kvm/mmu/tdp_iter.h index 47170d0dc98e..b2dd269c631f 100644 --- a/arch/x86/kvm/mmu/tdp_iter.h +++ b/arch/x86/kvm/mmu/tdp_iter.h @@ -15,7 +15,7 @@ struct tdp_iter { * The iterator will traverse the paging structure towards the mapping * for this GFN. */ - gfn_t goal_gfn; + gfn_t next_last_level_gfn; /* Pointers to the page tables traversed to reach the current SPTE */ u64 *pt_path[PT64_ROOT_MAX_LEVEL]; /* A pointer to the current SPTE */ @@ -52,7 +52,7 @@ struct tdp_iter { u64 *spte_to_child_pt(u64 pte, int level);
void tdp_iter_start(struct tdp_iter *iter, u64 *root_pt, int root_level, - int min_level, gfn_t goal_gfn); + int min_level, gfn_t next_last_level_gfn); void tdp_iter_next(struct tdp_iter *iter); void tdp_iter_refresh_walk(struct tdp_iter *iter); u64 *tdp_iter_root_pt(struct tdp_iter *iter);
From: Ben Gardon bgardon@google.com
[ Upstream commit ed5e484b79e8a9b8be714bd85b6fc70bd6dc99a7 ]
In some functions the TDP iter risks not making forward progress if two threads livelock yielding to one another. This is possible if two threads are trying to execute wrprot_gfn_range. Each could write protect an entry and then yield. This would reset the tdp_iter's walk over the paging structure and the loop would end up repeating the same entry over and over, preventing either thread from making forward progress.
Fix this issue by only yielding if the loop has made forward progress since the last yield.
Fixes: a6a0b05da9f3 ("kvm: x86/mmu: Support dirty logging for the TDP MMU") Reviewed-by: Peter Feiner pfeiner@google.com Signed-off-by: Ben Gardon bgardon@google.com
Message-Id: 20210202185734.1680553-14-bgardon@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/mmu/tdp_iter.c | 18 +----------------- arch/x86/kvm/mmu/tdp_iter.h | 7 ++++++- arch/x86/kvm/mmu/tdp_mmu.c | 21 ++++++++++++++++----- 3 files changed, 23 insertions(+), 23 deletions(-)
diff --git a/arch/x86/kvm/mmu/tdp_iter.c b/arch/x86/kvm/mmu/tdp_iter.c index 9917c55b7d24..1a09d212186b 100644 --- a/arch/x86/kvm/mmu/tdp_iter.c +++ b/arch/x86/kvm/mmu/tdp_iter.c @@ -31,6 +31,7 @@ void tdp_iter_start(struct tdp_iter *iter, u64 *root_pt, int root_level, WARN_ON(root_level > PT64_ROOT_MAX_LEVEL);
iter->next_last_level_gfn = next_last_level_gfn; + iter->yielded_gfn = iter->next_last_level_gfn; iter->root_level = root_level; iter->min_level = min_level; iter->level = root_level; @@ -158,23 +159,6 @@ void tdp_iter_next(struct tdp_iter *iter) iter->valid = false; }
-/* - * Restart the walk over the paging structure from the root, starting from the - * highest gfn the iterator had previously reached. Assumes that the entire - * paging structure, except the root page, may have been completely torn down - * and rebuilt. - */ -void tdp_iter_refresh_walk(struct tdp_iter *iter) -{ - gfn_t next_last_level_gfn = iter->next_last_level_gfn; - - if (iter->gfn > next_last_level_gfn) - next_last_level_gfn = iter->gfn; - - tdp_iter_start(iter, iter->pt_path[iter->root_level - 1], - iter->root_level, iter->min_level, next_last_level_gfn); -} - u64 *tdp_iter_root_pt(struct tdp_iter *iter) { return iter->pt_path[iter->root_level - 1]; diff --git a/arch/x86/kvm/mmu/tdp_iter.h b/arch/x86/kvm/mmu/tdp_iter.h index b2dd269c631f..d480c540ee27 100644 --- a/arch/x86/kvm/mmu/tdp_iter.h +++ b/arch/x86/kvm/mmu/tdp_iter.h @@ -16,6 +16,12 @@ struct tdp_iter { * for this GFN. */ gfn_t next_last_level_gfn; + /* + * The next_last_level_gfn at the time when the thread last + * yielded. Only yielding when the next_last_level_gfn != + * yielded_gfn helps ensure forward progress. + */ + gfn_t yielded_gfn; /* Pointers to the page tables traversed to reach the current SPTE */ u64 *pt_path[PT64_ROOT_MAX_LEVEL]; /* A pointer to the current SPTE */ @@ -54,7 +60,6 @@ u64 *spte_to_child_pt(u64 pte, int level); void tdp_iter_start(struct tdp_iter *iter, u64 *root_pt, int root_level, int min_level, gfn_t next_last_level_gfn); void tdp_iter_next(struct tdp_iter *iter); -void tdp_iter_refresh_walk(struct tdp_iter *iter); u64 *tdp_iter_root_pt(struct tdp_iter *iter);
#endif /* __KVM_X86_MMU_TDP_ITER_H */ diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 3b14d0008f92..f0bc5d3ce3d4 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -412,8 +412,9 @@ static inline void tdp_mmu_set_spte_no_dirty_log(struct kvm *kvm, * TLB flush before yielding. * * If this function yields, it will also reset the tdp_iter's walk over the - * paging structure and the calling function should allow the iterator to - * continue its traversal from the paging structure root. + * paging structure and the calling function should skip to the next + * iteration to allow the iterator to continue its traversal from the + * paging structure root. * * Return true if this function yielded and the iterator's traversal was reset. * Return false if a yield was not needed. @@ -421,12 +422,22 @@ static inline void tdp_mmu_set_spte_no_dirty_log(struct kvm *kvm, static inline bool tdp_mmu_iter_cond_resched(struct kvm *kvm, struct tdp_iter *iter, bool flush) { + /* Ensure forward progress has been made before yielding. */ + if (iter->next_last_level_gfn == iter->yielded_gfn) + return false; + if (need_resched() || spin_needbreak(&kvm->mmu_lock)) { if (flush) kvm_flush_remote_tlbs(kvm);
cond_resched_lock(&kvm->mmu_lock); - tdp_iter_refresh_walk(iter); + + WARN_ON(iter->gfn > iter->next_last_level_gfn); + + tdp_iter_start(iter, iter->pt_path[iter->root_level - 1], + iter->root_level, iter->min_level, + iter->next_last_level_gfn); + return true; }
@@ -466,8 +477,8 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
tdp_mmu_set_spte(kvm, &iter, 0);
- flush_needed = !can_yield || - !tdp_mmu_iter_cond_resched(kvm, &iter, true); + flush_needed = !(can_yield && + tdp_mmu_iter_cond_resched(kvm, &iter, true)); } return flush_needed; }
From: Ben Gardon bgardon@google.com
[ Upstream commit 1af4a96025b33587ca953c7ef12a1b20c6e70412 ]
Given certain conditions, some TDP MMU functions may not yield reliably / frequently enough. For example, if a paging structure was very large but had few, if any writable entries, wrprot_gfn_range could traverse many entries before finding a writable entry and yielding because the check for yielding only happens after an SPTE is modified.
Fix this issue by moving the yield to the beginning of the loop.
Fixes: a6a0b05da9f3 ("kvm: x86/mmu: Support dirty logging for the TDP MMU") Reviewed-by: Peter Feiner pfeiner@google.com Signed-off-by: Ben Gardon bgardon@google.com
Message-Id: 20210202185734.1680553-15-bgardon@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/mmu/tdp_mmu.c | 32 ++++++++++++++++++++++---------- 1 file changed, 22 insertions(+), 10 deletions(-)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index f0bc5d3ce3d4..0d17457f1c84 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -462,6 +462,12 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, bool flush_needed = false;
tdp_root_for_each_pte(iter, root, start, end) { + if (can_yield && + tdp_mmu_iter_cond_resched(kvm, &iter, flush_needed)) { + flush_needed = false; + continue; + } + if (!is_shadow_present_pte(iter.old_spte)) continue;
@@ -476,9 +482,7 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, continue;
tdp_mmu_set_spte(kvm, &iter, 0); - - flush_needed = !(can_yield && - tdp_mmu_iter_cond_resched(kvm, &iter, true)); + flush_needed = true; } return flush_needed; } @@ -838,6 +842,9 @@ static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
for_each_tdp_pte_min_level(iter, root->spt, root->role.level, min_level, start, end) { + if (tdp_mmu_iter_cond_resched(kvm, &iter, false)) + continue; + if (!is_shadow_present_pte(iter.old_spte) || !is_last_spte(iter.old_spte, iter.level)) continue; @@ -846,8 +853,6 @@ static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte); spte_set = true; - - tdp_mmu_iter_cond_resched(kvm, &iter, false); } return spte_set; } @@ -891,6 +896,9 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, bool spte_set = false;
tdp_root_for_each_leaf_pte(iter, root, start, end) { + if (tdp_mmu_iter_cond_resched(kvm, &iter, false)) + continue; + if (spte_ad_need_write_protect(iter.old_spte)) { if (is_writable_pte(iter.old_spte)) new_spte = iter.old_spte & ~PT_WRITABLE_MASK; @@ -905,8 +913,6 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte); spte_set = true; - - tdp_mmu_iter_cond_resched(kvm, &iter, false); } return spte_set; } @@ -1014,6 +1020,9 @@ static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, bool spte_set = false;
tdp_root_for_each_pte(iter, root, start, end) { + if (tdp_mmu_iter_cond_resched(kvm, &iter, false)) + continue; + if (!is_shadow_present_pte(iter.old_spte)) continue;
@@ -1021,8 +1030,6 @@ static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
tdp_mmu_set_spte(kvm, &iter, new_spte); spte_set = true; - - tdp_mmu_iter_cond_resched(kvm, &iter, false); }
return spte_set; @@ -1063,6 +1070,11 @@ static void zap_collapsible_spte_range(struct kvm *kvm, bool spte_set = false;
tdp_root_for_each_pte(iter, root, start, end) { + if (tdp_mmu_iter_cond_resched(kvm, &iter, spte_set)) { + spte_set = false; + continue; + } + if (!is_shadow_present_pte(iter.old_spte) || !is_last_spte(iter.old_spte, iter.level)) continue; @@ -1075,7 +1087,7 @@ static void zap_collapsible_spte_range(struct kvm *kvm,
tdp_mmu_set_spte(kvm, &iter, 0);
- spte_set = !tdp_mmu_iter_cond_resched(kvm, &iter, true); + spte_set = true; }
if (spte_set)
From: Sean Christopherson seanjc@google.com
[ Upstream commit a835429cda91621fca915d80672a157b47738afb ]
When flushing a range of GFNs across multiple roots, ensure any pending flush from a previous root is honored before yielding while walking the tables of the current root.
Note, kvm_tdp_mmu_zap_gfn_range() now intentionally overwrites its local "flush" with the result to avoid redundant flushes. zap_gfn_range() preserves and return the incoming "flush", unless of course the flush was performed prior to yielding and no new flush was triggered.
Fixes: 1af4a96025b3 ("KVM: x86/mmu: Yield in TDU MMU iter even if no SPTES changed") Cc: stable@vger.kernel.org Reviewed-by: Ben Gardon bgardon@google.com Signed-off-by: Sean Christopherson seanjc@google.com Message-Id: 20210325200119.1359384-2-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/mmu/tdp_mmu.c | 24 +++++++++++++----------- 1 file changed, 13 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 0d17457f1c84..f534c0a15f2b 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -103,7 +103,7 @@ bool is_tdp_mmu_root(struct kvm *kvm, hpa_t hpa) }
static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, - gfn_t start, gfn_t end, bool can_yield); + gfn_t start, gfn_t end, bool can_yield, bool flush);
void kvm_tdp_mmu_free_root(struct kvm *kvm, struct kvm_mmu_page *root) { @@ -116,7 +116,7 @@ void kvm_tdp_mmu_free_root(struct kvm *kvm, struct kvm_mmu_page *root)
list_del(&root->link);
- zap_gfn_range(kvm, root, 0, max_gfn, false); + zap_gfn_range(kvm, root, 0, max_gfn, false, false);
free_page((unsigned long)root->spt); kmem_cache_free(mmu_page_header_cache, root); @@ -453,18 +453,19 @@ static inline bool tdp_mmu_iter_cond_resched(struct kvm *kvm, * scheduler needs the CPU or there is contention on the MMU lock. If this * function cannot yield, it will not release the MMU lock or reschedule and * the caller must ensure it does not supply too large a GFN range, or the - * operation can cause a soft lockup. + * operation can cause a soft lockup. Note, in some use cases a flush may be + * required by prior actions. Ensure the pending flush is performed prior to + * yielding. */ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, - gfn_t start, gfn_t end, bool can_yield) + gfn_t start, gfn_t end, bool can_yield, bool flush) { struct tdp_iter iter; - bool flush_needed = false;
tdp_root_for_each_pte(iter, root, start, end) { if (can_yield && - tdp_mmu_iter_cond_resched(kvm, &iter, flush_needed)) { - flush_needed = false; + tdp_mmu_iter_cond_resched(kvm, &iter, flush)) { + flush = false; continue; }
@@ -482,9 +483,10 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, continue;
tdp_mmu_set_spte(kvm, &iter, 0); - flush_needed = true; + flush = true; } - return flush_needed; + + return flush; }
/* @@ -499,7 +501,7 @@ bool kvm_tdp_mmu_zap_gfn_range(struct kvm *kvm, gfn_t start, gfn_t end) bool flush = false;
for_each_tdp_mmu_root_yield_safe(kvm, root) - flush |= zap_gfn_range(kvm, root, start, end, true); + flush = zap_gfn_range(kvm, root, start, end, true, flush);
return flush; } @@ -691,7 +693,7 @@ static int zap_gfn_range_hva_wrapper(struct kvm *kvm, struct kvm_mmu_page *root, gfn_t start, gfn_t end, unsigned long unused) { - return zap_gfn_range(kvm, root, start, end, false); + return zap_gfn_range(kvm, root, start, end, false, false); }
int kvm_tdp_mmu_zap_hva_range(struct kvm *kvm, unsigned long start,
From: Sean Christopherson seanjc@google.com
[ Upstream commit 048f49809c526348775425420fb5b8e84fd9a133 ]
Honor the "flush needed" return from kvm_tdp_mmu_zap_gfn_range(), which does the flush itself if and only if it yields (which it will never do in this particular scenario), and otherwise expects the caller to do the flush. If pages are zapped from the TDP MMU but not the legacy MMU, then no flush will occur.
Fixes: 29cf0f5007a2 ("kvm: x86/mmu: NX largepage recovery for TDP MMU") Cc: stable@vger.kernel.org Cc: Ben Gardon bgardon@google.com Signed-off-by: Sean Christopherson seanjc@google.com Message-Id: 20210325200119.1359384-3-seanjc@google.com Reviewed-by: Ben Gardon bgardon@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/mmu/mmu.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index dacbd13d32c6..354f9926a183 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5972,6 +5972,8 @@ static void kvm_recover_nx_lpages(struct kvm *kvm) struct kvm_mmu_page *sp; unsigned int ratio; LIST_HEAD(invalid_list); + bool flush = false; + gfn_t gfn_end; ulong to_zap;
rcu_idx = srcu_read_lock(&kvm->srcu); @@ -5993,19 +5995,20 @@ static void kvm_recover_nx_lpages(struct kvm *kvm) lpage_disallowed_link); WARN_ON_ONCE(!sp->lpage_disallowed); if (sp->tdp_mmu_page) - kvm_tdp_mmu_zap_gfn_range(kvm, sp->gfn, - sp->gfn + KVM_PAGES_PER_HPAGE(sp->role.level)); - else { + gfn_end = sp->gfn + KVM_PAGES_PER_HPAGE(sp->role.level); + flush = kvm_tdp_mmu_zap_gfn_range(kvm, sp->gfn, gfn_end); + } else { kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list); WARN_ON_ONCE(sp->lpage_disallowed); }
if (need_resched() || spin_needbreak(&kvm->mmu_lock)) { - kvm_mmu_commit_zap_page(kvm, &invalid_list); + kvm_mmu_remote_flush_or_zap(kvm, &invalid_list, flush); cond_resched_lock(&kvm->mmu_lock); + flush = false; } } - kvm_mmu_commit_zap_page(kvm, &invalid_list); + kvm_mmu_remote_flush_or_zap(kvm, &invalid_list, flush);
spin_unlock(&kvm->mmu_lock); srcu_read_unlock(&kvm->srcu, rcu_idx);
From: Sean Christopherson seanjc@google.com
[ Upstream commit 33a3164161fc86b9cc238f7f2aa2ccb1d5559b1c ]
Prevent the TDP MMU from yielding when zapping a gfn range during NX page recovery. If a flush is pending from a previous invocation of the zapping helper, either in the TDP MMU or the legacy MMU, but the TDP MMU has not accumulated a flush for the current invocation, then yielding will release mmu_lock with stale TLB entries.
That being said, this isn't technically a bug fix in the current code, as the TDP MMU will never yield in this case. tdp_mmu_iter_cond_resched() will yield if and only if it has made forward progress, as defined by the current gfn vs. the last yielded (or starting) gfn. Because zapping a single shadow page is guaranteed to (a) find that page and (b) step sideways at the level of the shadow page, the TDP iter will break its loop before getting a chance to yield.
But that is all very, very subtle, and will break at the slightest sneeze, e.g. zapping while holding mmu_lock for read would break as the TDP MMU wouldn't be guaranteed to see the present shadow page, and thus could step sideways at a lower level.
Cc: Ben Gardon bgardon@google.com Signed-off-by: Sean Christopherson seanjc@google.com Message-Id: 20210325200119.1359384-4-seanjc@google.com [Add lockdep assertion. - Paolo] Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/mmu/mmu.c | 6 ++---- arch/x86/kvm/mmu/tdp_mmu.c | 5 +++-- arch/x86/kvm/mmu/tdp_mmu.h | 18 +++++++++++++++++- 3 files changed, 22 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 354f9926a183..defdd717e9da 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5973,7 +5973,6 @@ static void kvm_recover_nx_lpages(struct kvm *kvm) unsigned int ratio; LIST_HEAD(invalid_list); bool flush = false; - gfn_t gfn_end; ulong to_zap;
rcu_idx = srcu_read_lock(&kvm->srcu); @@ -5994,9 +5993,8 @@ static void kvm_recover_nx_lpages(struct kvm *kvm) struct kvm_mmu_page, lpage_disallowed_link); WARN_ON_ONCE(!sp->lpage_disallowed); - if (sp->tdp_mmu_page) - gfn_end = sp->gfn + KVM_PAGES_PER_HPAGE(sp->role.level); - flush = kvm_tdp_mmu_zap_gfn_range(kvm, sp->gfn, gfn_end); + if (sp->tdp_mmu_page) { + flush = kvm_tdp_mmu_zap_sp(kvm, sp); } else { kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list); WARN_ON_ONCE(sp->lpage_disallowed); diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index f534c0a15f2b..61c00f8631f1 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -495,13 +495,14 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, * SPTEs have been cleared and a TLB flush is needed before releasing the * MMU lock. */ -bool kvm_tdp_mmu_zap_gfn_range(struct kvm *kvm, gfn_t start, gfn_t end) +bool __kvm_tdp_mmu_zap_gfn_range(struct kvm *kvm, gfn_t start, gfn_t end, + bool can_yield) { struct kvm_mmu_page *root; bool flush = false;
for_each_tdp_mmu_root_yield_safe(kvm, root) - flush = zap_gfn_range(kvm, root, start, end, true, flush); + flush = zap_gfn_range(kvm, root, start, end, can_yield, flush);
return flush; } diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h index cbbdbadd1526..a7a3f6db263d 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.h +++ b/arch/x86/kvm/mmu/tdp_mmu.h @@ -12,7 +12,23 @@ bool is_tdp_mmu_root(struct kvm *kvm, hpa_t root); hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu); void kvm_tdp_mmu_free_root(struct kvm *kvm, struct kvm_mmu_page *root);
-bool kvm_tdp_mmu_zap_gfn_range(struct kvm *kvm, gfn_t start, gfn_t end); +bool __kvm_tdp_mmu_zap_gfn_range(struct kvm *kvm, gfn_t start, gfn_t end, + bool can_yield); +static inline bool kvm_tdp_mmu_zap_gfn_range(struct kvm *kvm, gfn_t start, + gfn_t end) +{ + return __kvm_tdp_mmu_zap_gfn_range(kvm, start, end, true); +} +static inline bool kvm_tdp_mmu_zap_sp(struct kvm *kvm, struct kvm_mmu_page *sp) +{ + gfn_t end = sp->gfn + KVM_PAGES_PER_HPAGE(sp->role.level); + + /* + * Don't allow yielding, as the caller may have pending pages to zap + * on the shadow MMU. + */ + return __kvm_tdp_mmu_zap_gfn_range(kvm, sp->gfn, end, false); +} void kvm_tdp_mmu_zap_all(struct kvm *kvm);
int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
From: Paolo Bonzini pbonzini@redhat.com
[ Upstream commit 315f02c60d9425b38eb8ad7f21b8a35e40db23f9 ]
Right now, if a call to kvm_tdp_mmu_zap_sp returns false, the caller will skip the TLB flush, which is wrong. There are two ways to fix it:
- since kvm_tdp_mmu_zap_sp will not yield and therefore will not flush the TLB itself, we could change the call to kvm_tdp_mmu_zap_sp to use "flush |= ..."
- or we can chain the flush argument through kvm_tdp_mmu_zap_sp down to __kvm_tdp_mmu_zap_gfn_range. Note that kvm_tdp_mmu_zap_sp will neither yield nor flush, so flush would never go from true to false.
This patch does the former to simplify application to stable kernels, and to make it further clearer that kvm_tdp_mmu_zap_sp will not flush.
Cc: seanjc@google.com Fixes: 048f49809c526 ("KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping") Cc: stable@vger.kernel.org # 5.10.x: 048f49809c: KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping Cc: stable@vger.kernel.org # 5.10.x: 33a3164161: KVM: x86/mmu: Don't allow TDP MMU to yield when recovering NX pages Cc: stable@vger.kernel.org Reviewed-by: Sean Christopherson seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/mmu/mmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index defdd717e9da..15717a28b212 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5994,7 +5994,7 @@ static void kvm_recover_nx_lpages(struct kvm *kvm) lpage_disallowed_link); WARN_ON_ONCE(!sp->lpage_disallowed); if (sp->tdp_mmu_page) { - flush = kvm_tdp_mmu_zap_sp(kvm, sp); + flush |= kvm_tdp_mmu_zap_sp(kvm, sp); } else { kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list); WARN_ON_ONCE(sp->lpage_disallowed);
From: Vlad Buslov vladbu@nvidia.com
[ Upstream commit b3650bf76a32380d4d80a3e21b5583e7303f216c ]
With recent changes that separated action module load from action initialization tcf_action_init() function error handling code was modified to manually release the loaded modules if loading/initialization of any further action in same batch failed. For the case when all modules successfully loaded and some of the actions were initialized before one of them failed in init handler. In this case for all previous actions the module will be released twice by the error handler: First time by the loop that manually calls module_put() for all ops, and second time by the action destroy code that puts the module after destroying the action.
Reproduction:
$ sudo tc actions add action simple sdata "2" index 2 $ sudo tc actions add action simple sdata "1" index 1 \ action simple sdata "2" index 2 RTNETLINK answers: File exists We have an error talking to the kernel $ sudo tc actions ls action simple total acts 1
action order 0: Simple <"2"> index 2 ref 1 bind 0 $ sudo tc actions flush action simple $ sudo tc actions ls action simple $ sudo tc actions add action simple sdata "2" index 2 Error: Failed to load TC action module. We have an error talking to the kernel $ lsmod | grep simple act_simple 20480 -1
Fix the issue by modifying module reference counting handling in action initialization code:
- Get module reference in tcf_idr_create() and put it in tcf_idr_release() instead of taking over the reference held by the caller.
- Modify users of tcf_action_init_1() to always release the module reference which they obtain before calling init function instead of assuming that created action takes over the reference.
- Finally, modify tcf_action_init_1() to not release the module reference when overwriting existing action as this is no longer necessary since both upper and lower layers obtain and manage their own module references independently.
Fixes: d349f9976868 ("net_sched: fix RTNL deadlock again caused by request_module()") Suggested-by: Cong Wang xiyou.wangcong@gmail.com Signed-off-by: Vlad Buslov vladbu@nvidia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/act_api.h | 7 +------ net/sched/act_api.c | 26 ++++++++++++++++---------- net/sched/cls_api.c | 5 ++--- 3 files changed, 19 insertions(+), 19 deletions(-)
diff --git a/include/net/act_api.h b/include/net/act_api.h index 27786090ae8e..2c88b8af3cdb 100644 --- a/include/net/act_api.h +++ b/include/net/act_api.h @@ -170,12 +170,7 @@ void tcf_idr_insert_many(struct tc_action *actions[]); void tcf_idr_cleanup(struct tc_action_net *tn, u32 index); int tcf_idr_check_alloc(struct tc_action_net *tn, u32 *index, struct tc_action **a, int bind); -int __tcf_idr_release(struct tc_action *a, bool bind, bool strict); - -static inline int tcf_idr_release(struct tc_action *a, bool bind) -{ - return __tcf_idr_release(a, bind, false); -} +int tcf_idr_release(struct tc_action *a, bool bind);
int tcf_register_action(struct tc_action_ops *a, struct pernet_operations *ops); int tcf_unregister_action(struct tc_action_ops *a, diff --git a/net/sched/act_api.c b/net/sched/act_api.c index df1e494574d8..88e14cfeb5d5 100644 --- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -142,7 +142,7 @@ static int __tcf_action_put(struct tc_action *p, bool bind) return 0; }
-int __tcf_idr_release(struct tc_action *p, bool bind, bool strict) +static int __tcf_idr_release(struct tc_action *p, bool bind, bool strict) { int ret = 0;
@@ -168,7 +168,18 @@ int __tcf_idr_release(struct tc_action *p, bool bind, bool strict)
return ret; } -EXPORT_SYMBOL(__tcf_idr_release); + +int tcf_idr_release(struct tc_action *a, bool bind) +{ + const struct tc_action_ops *ops = a->ops; + int ret; + + ret = __tcf_idr_release(a, bind, false); + if (ret == ACT_P_DELETED) + module_put(ops->owner); + return ret; +} +EXPORT_SYMBOL(tcf_idr_release);
static size_t tcf_action_shared_attrs_size(const struct tc_action *act) { @@ -445,6 +456,7 @@ int tcf_idr_create(struct tc_action_net *tn, u32 index, struct nlattr *est, }
p->idrinfo = idrinfo; + __module_get(ops->owner); p->ops = ops; *a = p; return 0; @@ -1017,13 +1029,6 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp, if (!name) a->hw_stats = hw_stats;
- /* module count goes up only when brand new policy is created - * if it exists and is only bound to in a_o->init() then - * ACT_P_CREATED is not returned (a zero is). - */ - if (err != ACT_P_CREATED) - module_put(a_o->owner); - return a;
err_out: @@ -1083,7 +1088,8 @@ int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla, tcf_idr_insert_many(actions);
*attr_size = tcf_action_full_attrs_size(sz); - return i - 1; + err = i - 1; + goto err_mod;
err: tcf_action_destroy(actions, bind); diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c index b45eb7f6cd49..d48ba4dee9a5 100644 --- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c @@ -3065,10 +3065,9 @@ int tcf_exts_validate(struct net *net, struct tcf_proto *tp, struct nlattr **tb, rate_tlv, "police", ovr, TCA_ACT_BIND, a_o, init_res, rtnl_held, extack); - if (IS_ERR(act)) { - module_put(a_o->owner); + module_put(a_o->owner); + if (IS_ERR(act)) return PTR_ERR(act); - }
act->type = exts->type = TCA_OLD_COMPAT; exts->actions[0] = act;
From: Chinh T Cao chinh.t.cao@intel.com
[ Upstream commit fc2d1165d4a424dd325ae1f45806565350a58013 ]
Refactor the DCB related variables out of the ice_port_info_struct. The goal is to make the ice_port_info struct cleaner.
Signed-off-by: Chinh T Cao chinh.t.cao@intel.com Co-developed-by: Dave Ertman david.m.ertman@intel.com Signed-off-by: Dave Ertman david.m.ertman@intel.com Tested-by: Tony Brelinski tonyx.brelinski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_dcb.c | 40 ++++++++-------- drivers/net/ethernet/intel/ice/ice_dcb_lib.c | 47 +++++++++--------- drivers/net/ethernet/intel/ice/ice_dcb_nl.c | 50 ++++++++++---------- drivers/net/ethernet/intel/ice/ice_ethtool.c | 4 +- drivers/net/ethernet/intel/ice/ice_lib.c | 2 +- drivers/net/ethernet/intel/ice/ice_txrx.c | 2 +- drivers/net/ethernet/intel/ice/ice_type.h | 16 ++++--- 7 files changed, 83 insertions(+), 78 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_dcb.c b/drivers/net/ethernet/intel/ice/ice_dcb.c index 2a3147ee0bbb..e42727941ef5 100644 --- a/drivers/net/ethernet/intel/ice/ice_dcb.c +++ b/drivers/net/ethernet/intel/ice/ice_dcb.c @@ -850,9 +850,9 @@ ice_get_ieee_or_cee_dcb_cfg(struct ice_port_info *pi, u8 dcbx_mode) return ICE_ERR_PARAM;
if (dcbx_mode == ICE_DCBX_MODE_IEEE) - dcbx_cfg = &pi->local_dcbx_cfg; + dcbx_cfg = &pi->qos_cfg.local_dcbx_cfg; else if (dcbx_mode == ICE_DCBX_MODE_CEE) - dcbx_cfg = &pi->desired_dcbx_cfg; + dcbx_cfg = &pi->qos_cfg.desired_dcbx_cfg;
/* Get Local DCB Config in case of ICE_DCBX_MODE_IEEE * or get CEE DCB Desired Config in case of ICE_DCBX_MODE_CEE @@ -863,7 +863,7 @@ ice_get_ieee_or_cee_dcb_cfg(struct ice_port_info *pi, u8 dcbx_mode) goto out;
/* Get Remote DCB Config */ - dcbx_cfg = &pi->remote_dcbx_cfg; + dcbx_cfg = &pi->qos_cfg.remote_dcbx_cfg; ret = ice_aq_get_dcb_cfg(pi->hw, ICE_AQ_LLDP_MIB_REMOTE, ICE_AQ_LLDP_BRID_TYPE_NEAREST_BRID, dcbx_cfg); /* Don't treat ENOENT as an error for Remote MIBs */ @@ -892,14 +892,14 @@ enum ice_status ice_get_dcb_cfg(struct ice_port_info *pi) ret = ice_aq_get_cee_dcb_cfg(pi->hw, &cee_cfg, NULL); if (!ret) { /* CEE mode */ - dcbx_cfg = &pi->local_dcbx_cfg; + dcbx_cfg = &pi->qos_cfg.local_dcbx_cfg; dcbx_cfg->dcbx_mode = ICE_DCBX_MODE_CEE; dcbx_cfg->tlv_status = le32_to_cpu(cee_cfg.tlv_status); ice_cee_to_dcb_cfg(&cee_cfg, dcbx_cfg); ret = ice_get_ieee_or_cee_dcb_cfg(pi, ICE_DCBX_MODE_CEE); } else if (pi->hw->adminq.sq_last_status == ICE_AQ_RC_ENOENT) { /* CEE mode not enabled try querying IEEE data */ - dcbx_cfg = &pi->local_dcbx_cfg; + dcbx_cfg = &pi->qos_cfg.local_dcbx_cfg; dcbx_cfg->dcbx_mode = ICE_DCBX_MODE_IEEE; ret = ice_get_ieee_or_cee_dcb_cfg(pi, ICE_DCBX_MODE_IEEE); } @@ -916,26 +916,26 @@ enum ice_status ice_get_dcb_cfg(struct ice_port_info *pi) */ enum ice_status ice_init_dcb(struct ice_hw *hw, bool enable_mib_change) { - struct ice_port_info *pi = hw->port_info; + struct ice_qos_cfg *qos_cfg = &hw->port_info->qos_cfg; enum ice_status ret = 0;
if (!hw->func_caps.common_cap.dcb) return ICE_ERR_NOT_SUPPORTED;
- pi->is_sw_lldp = true; + qos_cfg->is_sw_lldp = true;
/* Get DCBX status */ - pi->dcbx_status = ice_get_dcbx_status(hw); + qos_cfg->dcbx_status = ice_get_dcbx_status(hw);
- if (pi->dcbx_status == ICE_DCBX_STATUS_DONE || - pi->dcbx_status == ICE_DCBX_STATUS_IN_PROGRESS || - pi->dcbx_status == ICE_DCBX_STATUS_NOT_STARTED) { + if (qos_cfg->dcbx_status == ICE_DCBX_STATUS_DONE || + qos_cfg->dcbx_status == ICE_DCBX_STATUS_IN_PROGRESS || + qos_cfg->dcbx_status == ICE_DCBX_STATUS_NOT_STARTED) { /* Get current DCBX configuration */ - ret = ice_get_dcb_cfg(pi); + ret = ice_get_dcb_cfg(hw->port_info); if (ret) return ret; - pi->is_sw_lldp = false; - } else if (pi->dcbx_status == ICE_DCBX_STATUS_DIS) { + qos_cfg->is_sw_lldp = false; + } else if (qos_cfg->dcbx_status == ICE_DCBX_STATUS_DIS) { return ICE_ERR_NOT_READY; }
@@ -943,7 +943,7 @@ enum ice_status ice_init_dcb(struct ice_hw *hw, bool enable_mib_change) if (enable_mib_change) { ret = ice_aq_cfg_lldp_mib_change(hw, true, NULL); if (ret) - pi->is_sw_lldp = true; + qos_cfg->is_sw_lldp = true; }
return ret; @@ -958,21 +958,21 @@ enum ice_status ice_init_dcb(struct ice_hw *hw, bool enable_mib_change) */ enum ice_status ice_cfg_lldp_mib_change(struct ice_hw *hw, bool ena_mib) { - struct ice_port_info *pi = hw->port_info; + struct ice_qos_cfg *qos_cfg = &hw->port_info->qos_cfg; enum ice_status ret;
if (!hw->func_caps.common_cap.dcb) return ICE_ERR_NOT_SUPPORTED;
/* Get DCBX status */ - pi->dcbx_status = ice_get_dcbx_status(hw); + qos_cfg->dcbx_status = ice_get_dcbx_status(hw);
- if (pi->dcbx_status == ICE_DCBX_STATUS_DIS) + if (qos_cfg->dcbx_status == ICE_DCBX_STATUS_DIS) return ICE_ERR_NOT_READY;
ret = ice_aq_cfg_lldp_mib_change(hw, ena_mib, NULL); if (!ret) - pi->is_sw_lldp = !ena_mib; + qos_cfg->is_sw_lldp = !ena_mib;
return ret; } @@ -1270,7 +1270,7 @@ enum ice_status ice_set_dcb_cfg(struct ice_port_info *pi) hw = pi->hw;
/* update the HW local config */ - dcbcfg = &pi->local_dcbx_cfg; + dcbcfg = &pi->qos_cfg.local_dcbx_cfg; /* Allocate the LLDPDU */ lldpmib = devm_kzalloc(ice_hw_to_dev(hw), ICE_LLDPDU_SIZE, GFP_KERNEL); if (!lldpmib) diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c index 36abd6b7280c..1e8f71ffc8ce 100644 --- a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c @@ -28,7 +28,7 @@ void ice_vsi_cfg_netdev_tc(struct ice_vsi *vsi, u8 ena_tc) if (netdev_set_num_tc(netdev, vsi->tc_cfg.numtc)) return;
- dcbcfg = &pf->hw.port_info->local_dcbx_cfg; + dcbcfg = &pf->hw.port_info->qos_cfg.local_dcbx_cfg;
ice_for_each_traffic_class(i) if (vsi->tc_cfg.ena_tc & BIT(i)) @@ -134,7 +134,7 @@ static u8 ice_dcb_get_mode(struct ice_port_info *port_info, bool host) else mode = DCB_CAP_DCBX_LLD_MANAGED;
- if (port_info->local_dcbx_cfg.dcbx_mode & ICE_DCBX_MODE_CEE) + if (port_info->qos_cfg.local_dcbx_cfg.dcbx_mode & ICE_DCBX_MODE_CEE) return mode | DCB_CAP_DCBX_VER_CEE; else return mode | DCB_CAP_DCBX_VER_IEEE; @@ -277,10 +277,10 @@ int ice_pf_dcb_cfg(struct ice_pf *pf, struct ice_dcbx_cfg *new_cfg, bool locked) int ret = ICE_DCB_NO_HW_CHG; struct ice_vsi *pf_vsi;
- curr_cfg = &pf->hw.port_info->local_dcbx_cfg; + curr_cfg = &pf->hw.port_info->qos_cfg.local_dcbx_cfg;
/* FW does not care if change happened */ - if (!pf->hw.port_info->is_sw_lldp) + if (!pf->hw.port_info->qos_cfg.is_sw_lldp) ret = ICE_DCB_HW_CHG_RST;
/* Enable DCB tagging only when more than one TC */ @@ -327,7 +327,7 @@ int ice_pf_dcb_cfg(struct ice_pf *pf, struct ice_dcbx_cfg *new_cfg, bool locked) /* Only send new config to HW if we are in SW LLDP mode. Otherwise, * the new config came from the HW in the first place. */ - if (pf->hw.port_info->is_sw_lldp) { + if (pf->hw.port_info->qos_cfg.is_sw_lldp) { ret = ice_set_dcb_cfg(pf->hw.port_info); if (ret) { dev_err(dev, "Set DCB Config failed\n"); @@ -360,7 +360,7 @@ int ice_pf_dcb_cfg(struct ice_pf *pf, struct ice_dcbx_cfg *new_cfg, bool locked) */ static void ice_cfg_etsrec_defaults(struct ice_port_info *pi) { - struct ice_dcbx_cfg *dcbcfg = &pi->local_dcbx_cfg; + struct ice_dcbx_cfg *dcbcfg = &pi->qos_cfg.local_dcbx_cfg; u8 i;
/* Ensure ETS recommended DCB configuration is not already set */ @@ -446,7 +446,7 @@ void ice_dcb_rebuild(struct ice_pf *pf)
mutex_lock(&pf->tc_mutex);
- if (!pf->hw.port_info->is_sw_lldp) + if (!pf->hw.port_info->qos_cfg.is_sw_lldp) ice_cfg_etsrec_defaults(pf->hw.port_info);
ret = ice_set_dcb_cfg(pf->hw.port_info); @@ -455,9 +455,9 @@ void ice_dcb_rebuild(struct ice_pf *pf) goto dcb_error; }
- if (!pf->hw.port_info->is_sw_lldp) { + if (!pf->hw.port_info->qos_cfg.is_sw_lldp) { ret = ice_cfg_lldp_mib_change(&pf->hw, true); - if (ret && !pf->hw.port_info->is_sw_lldp) { + if (ret && !pf->hw.port_info->qos_cfg.is_sw_lldp) { dev_err(dev, "Failed to register for MIB changes\n"); goto dcb_error; } @@ -510,11 +510,12 @@ static int ice_dcb_init_cfg(struct ice_pf *pf, bool locked) int ret = 0;
pi = pf->hw.port_info; - newcfg = kmemdup(&pi->local_dcbx_cfg, sizeof(*newcfg), GFP_KERNEL); + newcfg = kmemdup(&pi->qos_cfg.local_dcbx_cfg, sizeof(*newcfg), + GFP_KERNEL); if (!newcfg) return -ENOMEM;
- memset(&pi->local_dcbx_cfg, 0, sizeof(*newcfg)); + memset(&pi->qos_cfg.local_dcbx_cfg, 0, sizeof(*newcfg));
dev_info(ice_pf_to_dev(pf), "Configuring initial DCB values\n"); if (ice_pf_dcb_cfg(pf, newcfg, locked)) @@ -545,7 +546,7 @@ static int ice_dcb_sw_dflt_cfg(struct ice_pf *pf, bool ets_willing, bool locked) if (!dcbcfg) return -ENOMEM;
- memset(&pi->local_dcbx_cfg, 0, sizeof(*dcbcfg)); + memset(&pi->qos_cfg.local_dcbx_cfg, 0, sizeof(*dcbcfg));
dcbcfg->etscfg.willing = ets_willing ? 1 : 0; dcbcfg->etscfg.maxtcs = hw->func_caps.common_cap.maxtc; @@ -608,7 +609,7 @@ static bool ice_dcb_tc_contig(u8 *prio_table) */ static int ice_dcb_noncontig_cfg(struct ice_pf *pf) { - struct ice_dcbx_cfg *dcbcfg = &pf->hw.port_info->local_dcbx_cfg; + struct ice_dcbx_cfg *dcbcfg = &pf->hw.port_info->qos_cfg.local_dcbx_cfg; struct device *dev = ice_pf_to_dev(pf); int ret;
@@ -638,7 +639,7 @@ static int ice_dcb_noncontig_cfg(struct ice_pf *pf) */ void ice_pf_dcb_recfg(struct ice_pf *pf) { - struct ice_dcbx_cfg *dcbcfg = &pf->hw.port_info->local_dcbx_cfg; + struct ice_dcbx_cfg *dcbcfg = &pf->hw.port_info->qos_cfg.local_dcbx_cfg; u8 tc_map = 0; int v, ret;
@@ -691,7 +692,7 @@ int ice_init_pf_dcb(struct ice_pf *pf, bool locked) port_info = hw->port_info;
err = ice_init_dcb(hw, false); - if (err && !port_info->is_sw_lldp) { + if (err && !port_info->qos_cfg.is_sw_lldp) { dev_err(dev, "Error initializing DCB %d\n", err); goto dcb_init_err; } @@ -858,7 +859,7 @@ ice_dcb_process_lldp_set_mib_change(struct ice_pf *pf, /* Update the remote cached instance and return */ ret = ice_aq_get_dcb_cfg(pi->hw, ICE_AQ_LLDP_MIB_REMOTE, ICE_AQ_LLDP_BRID_TYPE_NEAREST_BRID, - &pi->remote_dcbx_cfg); + &pi->qos_cfg.remote_dcbx_cfg); if (ret) { dev_err(dev, "Failed to get remote DCB config\n"); return; @@ -868,10 +869,11 @@ ice_dcb_process_lldp_set_mib_change(struct ice_pf *pf, mutex_lock(&pf->tc_mutex);
/* store the old configuration */ - tmp_dcbx_cfg = pf->hw.port_info->local_dcbx_cfg; + tmp_dcbx_cfg = pf->hw.port_info->qos_cfg.local_dcbx_cfg;
/* Reset the old DCBX configuration data */ - memset(&pi->local_dcbx_cfg, 0, sizeof(pi->local_dcbx_cfg)); + memset(&pi->qos_cfg.local_dcbx_cfg, 0, + sizeof(pi->qos_cfg.local_dcbx_cfg));
/* Get updated DCBX data from firmware */ ret = ice_get_dcb_cfg(pf->hw.port_info); @@ -881,7 +883,8 @@ ice_dcb_process_lldp_set_mib_change(struct ice_pf *pf, }
/* No change detected in DCBX configs */ - if (!memcmp(&tmp_dcbx_cfg, &pi->local_dcbx_cfg, sizeof(tmp_dcbx_cfg))) { + if (!memcmp(&tmp_dcbx_cfg, &pi->qos_cfg.local_dcbx_cfg, + sizeof(tmp_dcbx_cfg))) { dev_dbg(dev, "No change detected in DCBX configuration.\n"); goto out; } @@ -889,13 +892,13 @@ ice_dcb_process_lldp_set_mib_change(struct ice_pf *pf, pf->dcbx_cap = ice_dcb_get_mode(pi, false);
need_reconfig = ice_dcb_need_recfg(pf, &tmp_dcbx_cfg, - &pi->local_dcbx_cfg); - ice_dcbnl_flush_apps(pf, &tmp_dcbx_cfg, &pi->local_dcbx_cfg); + &pi->qos_cfg.local_dcbx_cfg); + ice_dcbnl_flush_apps(pf, &tmp_dcbx_cfg, &pi->qos_cfg.local_dcbx_cfg); if (!need_reconfig) goto out;
/* Enable DCB tagging only when more than one TC */ - if (ice_dcb_get_num_tc(&pi->local_dcbx_cfg) > 1) { + if (ice_dcb_get_num_tc(&pi->qos_cfg.local_dcbx_cfg) > 1) { dev_dbg(dev, "DCB tagging enabled (num TC > 1)\n"); set_bit(ICE_FLAG_DCB_ENA, pf->flags); } else { diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_nl.c b/drivers/net/ethernet/intel/ice/ice_dcb_nl.c index 162348ef090b..4180f1f35fb8 100644 --- a/drivers/net/ethernet/intel/ice/ice_dcb_nl.c +++ b/drivers/net/ethernet/intel/ice/ice_dcb_nl.c @@ -32,12 +32,10 @@ static void ice_dcbnl_devreset(struct net_device *netdev) static int ice_dcbnl_getets(struct net_device *netdev, struct ieee_ets *ets) { struct ice_dcbx_cfg *dcbxcfg; - struct ice_port_info *pi; struct ice_pf *pf;
pf = ice_netdev_to_pf(netdev); - pi = pf->hw.port_info; - dcbxcfg = &pi->local_dcbx_cfg; + dcbxcfg = &pf->hw.port_info->qos_cfg.local_dcbx_cfg;
ets->willing = dcbxcfg->etscfg.willing; ets->ets_cap = dcbxcfg->etscfg.maxtcs; @@ -72,7 +70,7 @@ static int ice_dcbnl_setets(struct net_device *netdev, struct ieee_ets *ets) !(pf->dcbx_cap & DCB_CAP_DCBX_VER_IEEE)) return -EINVAL;
- new_cfg = &pf->hw.port_info->desired_dcbx_cfg; + new_cfg = &pf->hw.port_info->qos_cfg.desired_dcbx_cfg;
mutex_lock(&pf->tc_mutex);
@@ -157,6 +155,7 @@ static u8 ice_dcbnl_getdcbx(struct net_device *netdev) static u8 ice_dcbnl_setdcbx(struct net_device *netdev, u8 mode) { struct ice_pf *pf = ice_netdev_to_pf(netdev); + struct ice_qos_cfg *qos_cfg;
/* if FW LLDP agent is running, DCBNL not allowed to change mode */ if (test_bit(ICE_FLAG_FW_LLDP_AGENT, pf->flags)) @@ -173,10 +172,11 @@ static u8 ice_dcbnl_setdcbx(struct net_device *netdev, u8 mode) return ICE_DCB_NO_HW_CHG;
pf->dcbx_cap = mode; + qos_cfg = &pf->hw.port_info->qos_cfg; if (mode & DCB_CAP_DCBX_VER_CEE) - pf->hw.port_info->local_dcbx_cfg.dcbx_mode = ICE_DCBX_MODE_CEE; + qos_cfg->local_dcbx_cfg.dcbx_mode = ICE_DCBX_MODE_CEE; else - pf->hw.port_info->local_dcbx_cfg.dcbx_mode = ICE_DCBX_MODE_IEEE; + qos_cfg->local_dcbx_cfg.dcbx_mode = ICE_DCBX_MODE_IEEE;
dev_info(ice_pf_to_dev(pf), "DCBx mode = 0x%x\n", mode); return ICE_DCB_HW_CHG_RST; @@ -227,7 +227,7 @@ static int ice_dcbnl_getpfc(struct net_device *netdev, struct ieee_pfc *pfc) struct ice_dcbx_cfg *dcbxcfg; int i;
- dcbxcfg = &pi->local_dcbx_cfg; + dcbxcfg = &pi->qos_cfg.local_dcbx_cfg; pfc->pfc_cap = dcbxcfg->pfc.pfccap; pfc->pfc_en = dcbxcfg->pfc.pfcena; pfc->mbc = dcbxcfg->pfc.mbc; @@ -258,7 +258,7 @@ static int ice_dcbnl_setpfc(struct net_device *netdev, struct ieee_pfc *pfc)
mutex_lock(&pf->tc_mutex);
- new_cfg = &pf->hw.port_info->desired_dcbx_cfg; + new_cfg = &pf->hw.port_info->qos_cfg.desired_dcbx_cfg;
if (pfc->pfc_cap) new_cfg->pfc.pfccap = pfc->pfc_cap; @@ -295,9 +295,9 @@ ice_dcbnl_get_pfc_cfg(struct net_device *netdev, int prio, u8 *setting) if (prio >= ICE_MAX_USER_PRIORITY) return;
- *setting = (pi->local_dcbx_cfg.pfc.pfcena >> prio) & 0x1; + *setting = (pi->qos_cfg.local_dcbx_cfg.pfc.pfcena >> prio) & 0x1; dev_dbg(ice_pf_to_dev(pf), "Get PFC Config up=%d, setting=%d, pfcenable=0x%x\n", - prio, *setting, pi->local_dcbx_cfg.pfc.pfcena); + prio, *setting, pi->qos_cfg.local_dcbx_cfg.pfc.pfcena); }
/** @@ -318,7 +318,7 @@ static void ice_dcbnl_set_pfc_cfg(struct net_device *netdev, int prio, u8 set) if (prio >= ICE_MAX_USER_PRIORITY) return;
- new_cfg = &pf->hw.port_info->desired_dcbx_cfg; + new_cfg = &pf->hw.port_info->qos_cfg.desired_dcbx_cfg;
new_cfg->pfc.pfccap = pf->hw.func_caps.common_cap.maxtc; if (set) @@ -340,7 +340,7 @@ static u8 ice_dcbnl_getpfcstate(struct net_device *netdev) struct ice_port_info *pi = pf->hw.port_info;
/* Return enabled if any UP enabled for PFC */ - if (pi->local_dcbx_cfg.pfc.pfcena) + if (pi->qos_cfg.local_dcbx_cfg.pfc.pfcena) return 1;
return 0; @@ -380,8 +380,8 @@ static u8 ice_dcbnl_setstate(struct net_device *netdev, u8 state)
if (state) { set_bit(ICE_FLAG_DCB_ENA, pf->flags); - memcpy(&pf->hw.port_info->desired_dcbx_cfg, - &pf->hw.port_info->local_dcbx_cfg, + memcpy(&pf->hw.port_info->qos_cfg.desired_dcbx_cfg, + &pf->hw.port_info->qos_cfg.local_dcbx_cfg, sizeof(struct ice_dcbx_cfg)); } else { clear_bit(ICE_FLAG_DCB_ENA, pf->flags); @@ -415,7 +415,7 @@ ice_dcbnl_get_pg_tc_cfg_tx(struct net_device *netdev, int prio, if (prio >= ICE_MAX_USER_PRIORITY) return;
- *pgid = pi->local_dcbx_cfg.etscfg.prio_table[prio]; + *pgid = pi->qos_cfg.local_dcbx_cfg.etscfg.prio_table[prio]; dev_dbg(ice_pf_to_dev(pf), "Get PG config prio=%d tc=%d\n", prio, *pgid); } @@ -446,7 +446,7 @@ ice_dcbnl_set_pg_tc_cfg_tx(struct net_device *netdev, int tc, if (tc >= ICE_MAX_TRAFFIC_CLASS) return;
- new_cfg = &pf->hw.port_info->desired_dcbx_cfg; + new_cfg = &pf->hw.port_info->qos_cfg.desired_dcbx_cfg;
/* prio_type, bwg_id and bw_pct per UP are not supported */
@@ -476,7 +476,7 @@ ice_dcbnl_get_pg_bwg_cfg_tx(struct net_device *netdev, int pgid, u8 *bw_pct) if (pgid >= ICE_MAX_TRAFFIC_CLASS) return;
- *bw_pct = pi->local_dcbx_cfg.etscfg.tcbwtable[pgid]; + *bw_pct = pi->qos_cfg.local_dcbx_cfg.etscfg.tcbwtable[pgid]; dev_dbg(ice_pf_to_dev(pf), "Get PG BW config tc=%d bw_pct=%d\n", pgid, *bw_pct); } @@ -500,7 +500,7 @@ ice_dcbnl_set_pg_bwg_cfg_tx(struct net_device *netdev, int pgid, u8 bw_pct) if (pgid >= ICE_MAX_TRAFFIC_CLASS) return;
- new_cfg = &pf->hw.port_info->desired_dcbx_cfg; + new_cfg = &pf->hw.port_info->qos_cfg.desired_dcbx_cfg;
new_cfg->etscfg.tcbwtable[pgid] = bw_pct; } @@ -530,7 +530,7 @@ ice_dcbnl_get_pg_tc_cfg_rx(struct net_device *netdev, int prio, if (prio >= ICE_MAX_USER_PRIORITY) return;
- *pgid = pi->local_dcbx_cfg.etscfg.prio_table[prio]; + *pgid = pi->qos_cfg.local_dcbx_cfg.etscfg.prio_table[prio]; }
/** @@ -701,9 +701,9 @@ static int ice_dcbnl_setapp(struct net_device *netdev, struct dcb_app *app)
mutex_lock(&pf->tc_mutex);
- new_cfg = &pf->hw.port_info->desired_dcbx_cfg; + new_cfg = &pf->hw.port_info->qos_cfg.desired_dcbx_cfg;
- old_cfg = &pf->hw.port_info->local_dcbx_cfg; + old_cfg = &pf->hw.port_info->qos_cfg.local_dcbx_cfg;
if (old_cfg->numapps == ICE_DCBX_MAX_APPS) { ret = -EINVAL; @@ -753,7 +753,7 @@ static int ice_dcbnl_delapp(struct net_device *netdev, struct dcb_app *app) return -EINVAL;
mutex_lock(&pf->tc_mutex); - old_cfg = &pf->hw.port_info->local_dcbx_cfg; + old_cfg = &pf->hw.port_info->qos_cfg.local_dcbx_cfg;
if (old_cfg->numapps <= 1) goto delapp_out; @@ -762,7 +762,7 @@ static int ice_dcbnl_delapp(struct net_device *netdev, struct dcb_app *app) if (ret) goto delapp_out;
- new_cfg = &pf->hw.port_info->desired_dcbx_cfg; + new_cfg = &pf->hw.port_info->qos_cfg.desired_dcbx_cfg;
for (i = 1; i < new_cfg->numapps; i++) { if (app->selector == new_cfg->app[i].selector && @@ -815,7 +815,7 @@ static u8 ice_dcbnl_cee_set_all(struct net_device *netdev) !(pf->dcbx_cap & DCB_CAP_DCBX_VER_CEE)) return ICE_DCB_NO_HW_CHG;
- new_cfg = &pf->hw.port_info->desired_dcbx_cfg; + new_cfg = &pf->hw.port_info->qos_cfg.desired_dcbx_cfg;
mutex_lock(&pf->tc_mutex);
@@ -886,7 +886,7 @@ void ice_dcbnl_set_all(struct ice_vsi *vsi) if (!test_bit(ICE_FLAG_DCB_ENA, pf->flags)) return;
- dcbxcfg = &pi->local_dcbx_cfg; + dcbxcfg = &pi->qos_cfg.local_dcbx_cfg;
for (i = 0; i < dcbxcfg->numapps; i++) { u8 prio, tc_map; diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c index deb62b0c3855..d70573f5072c 100644 --- a/drivers/net/ethernet/intel/ice/ice_ethtool.c +++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c @@ -2986,7 +2986,7 @@ ice_get_pauseparam(struct net_device *netdev, struct ethtool_pauseparam *pause) pause->rx_pause = 0; pause->tx_pause = 0;
- dcbx_cfg = &pi->local_dcbx_cfg; + dcbx_cfg = &pi->qos_cfg.local_dcbx_cfg;
pcaps = kzalloc(sizeof(*pcaps), GFP_KERNEL); if (!pcaps) @@ -3038,7 +3038,7 @@ ice_set_pauseparam(struct net_device *netdev, struct ethtool_pauseparam *pause)
pi = vsi->port_info; hw_link_info = &pi->phy.link_info; - dcbx_cfg = &pi->local_dcbx_cfg; + dcbx_cfg = &pi->qos_cfg.local_dcbx_cfg; link_up = hw_link_info->link_info & ICE_AQ_LINK_UP;
/* Changing the port's flow control is not supported if this isn't the diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c index 3417de29facf..170367eaa95a 100644 --- a/drivers/net/ethernet/intel/ice/ice_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_lib.c @@ -2078,7 +2078,7 @@ int ice_cfg_vlan_pruning(struct ice_vsi *vsi, bool ena, bool vlan_promisc)
static void ice_vsi_set_tc_cfg(struct ice_vsi *vsi) { - struct ice_dcbx_cfg *cfg = &vsi->port_info->local_dcbx_cfg; + struct ice_dcbx_cfg *cfg = &vsi->port_info->qos_cfg.local_dcbx_cfg;
vsi->tc_cfg.ena_tc = ice_dcb_get_ena_tc(cfg); vsi->tc_cfg.numtc = ice_dcb_get_num_tc(cfg); diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c index af5b7f33db9a..0f2544c420ac 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx.c +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c @@ -2421,7 +2421,7 @@ ice_xmit_frame_ring(struct sk_buff *skb, struct ice_ring *tx_ring) /* allow CONTROL frames egress from main VSI if FW LLDP disabled */ if (unlikely(skb->priority == TC_PRIO_CONTROL && vsi->type == ICE_VSI_PF && - vsi->port_info->is_sw_lldp)) + vsi->port_info->qos_cfg.is_sw_lldp)) offload.cd_qw1 |= (u64)(ICE_TX_DESC_DTYPE_CTX | ICE_TX_CTX_DESC_SWTCH_UPLINK << ICE_TXD_CTX_QW1_CMD_S); diff --git a/drivers/net/ethernet/intel/ice/ice_type.h b/drivers/net/ethernet/intel/ice/ice_type.h index 2226a291a394..c09c085f637a 100644 --- a/drivers/net/ethernet/intel/ice/ice_type.h +++ b/drivers/net/ethernet/intel/ice/ice_type.h @@ -514,6 +514,14 @@ struct ice_dcbx_cfg { #define ICE_DCBX_APPS_NON_WILLING 0x1 };
+struct ice_qos_cfg { + struct ice_dcbx_cfg local_dcbx_cfg; /* Oper/Local Cfg */ + struct ice_dcbx_cfg desired_dcbx_cfg; /* CEE Desired Cfg */ + struct ice_dcbx_cfg remote_dcbx_cfg; /* Peer Cfg */ + u8 dcbx_status : 3; /* see ICE_DCBX_STATUS_DIS */ + u8 is_sw_lldp : 1; +}; + struct ice_port_info { struct ice_sched_node *root; /* Root Node per Port */ struct ice_hw *hw; /* back pointer to HW instance */ @@ -537,13 +545,7 @@ struct ice_port_info { sib_head[ICE_MAX_TRAFFIC_CLASS][ICE_AQC_TOPO_MAX_LEVEL_NUM]; /* List contain profile ID(s) and other params per layer */ struct list_head rl_prof_list[ICE_AQC_TOPO_MAX_LEVEL_NUM]; - struct ice_dcbx_cfg local_dcbx_cfg; /* Oper/Local Cfg */ - /* DCBX info */ - struct ice_dcbx_cfg remote_dcbx_cfg; /* Peer Cfg */ - struct ice_dcbx_cfg desired_dcbx_cfg; /* CEE Desired Cfg */ - /* LLDP/DCBX Status */ - u8 dcbx_status:3; /* see ICE_DCBX_STATUS_DIS */ - u8 is_sw_lldp:1; + struct ice_qos_cfg qos_cfg; u8 is_vf:1; };
From: Chinh T Cao chinh.t.cao@intel.com
[ Upstream commit aeac8ce864d9c0836e12ed5b5cc80f62f3cccb7c ]
iSCSI can use both TCP ports 860 and 3260. However, in our current implementation, the ice_aqc_opc_get_cee_dcb_cfg (0x0A07) AQ command doesn't provide a way to communicate the protocol port number to the AQ's caller. Thus, we assume that 3260 is the iSCSI port number at the AQ's caller layer.
Rely on the dcbx-willing mode, desired QoS and remote QoS configuration to determine which port number that iSCSI will use.
Fixes: 0ebd3ff13cca ("ice: Add code for DCB initialization part 2/4") Signed-off-by: Chinh T Cao chinh.t.cao@intel.com Tested-by: Tony Brelinski tonyx.brelinski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_dcb.c | 38 +++++++++++++++++------ drivers/net/ethernet/intel/ice/ice_type.h | 1 + 2 files changed, 30 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_dcb.c b/drivers/net/ethernet/intel/ice/ice_dcb.c index e42727941ef5..211ac6f907ad 100644 --- a/drivers/net/ethernet/intel/ice/ice_dcb.c +++ b/drivers/net/ethernet/intel/ice/ice_dcb.c @@ -738,22 +738,27 @@ ice_aq_get_cee_dcb_cfg(struct ice_hw *hw, /** * ice_cee_to_dcb_cfg * @cee_cfg: pointer to CEE configuration struct - * @dcbcfg: DCB configuration struct + * @pi: port information structure * * Convert CEE configuration from firmware to DCB configuration */ static void ice_cee_to_dcb_cfg(struct ice_aqc_get_cee_dcb_cfg_resp *cee_cfg, - struct ice_dcbx_cfg *dcbcfg) + struct ice_port_info *pi) { u32 status, tlv_status = le32_to_cpu(cee_cfg->tlv_status); u32 ice_aqc_cee_status_mask, ice_aqc_cee_status_shift; + u8 i, j, err, sync, oper, app_index, ice_app_sel_type; u16 app_prio = le16_to_cpu(cee_cfg->oper_app_prio); - u8 i, err, sync, oper, app_index, ice_app_sel_type; u16 ice_aqc_cee_app_mask, ice_aqc_cee_app_shift; + struct ice_dcbx_cfg *cmp_dcbcfg, *dcbcfg; u16 ice_app_prot_id_type;
- /* CEE PG data to ETS config */ + dcbcfg = &pi->qos_cfg.local_dcbx_cfg; + dcbcfg->dcbx_mode = ICE_DCBX_MODE_CEE; + dcbcfg->tlv_status = tlv_status; + + /* CEE PG data */ dcbcfg->etscfg.maxtcs = cee_cfg->oper_num_tc;
/* Note that the FW creates the oper_prio_tc nibbles reversed @@ -780,10 +785,16 @@ ice_cee_to_dcb_cfg(struct ice_aqc_get_cee_dcb_cfg_resp *cee_cfg, } }
- /* CEE PFC data to ETS config */ + /* CEE PFC data */ dcbcfg->pfc.pfcena = cee_cfg->oper_pfc_en; dcbcfg->pfc.pfccap = ICE_MAX_TRAFFIC_CLASS;
+ /* CEE APP TLV data */ + if (dcbcfg->app_mode == ICE_DCBX_APPS_NON_WILLING) + cmp_dcbcfg = &pi->qos_cfg.desired_dcbx_cfg; + else + cmp_dcbcfg = &pi->qos_cfg.remote_dcbx_cfg; + app_index = 0; for (i = 0; i < 3; i++) { if (i == 0) { @@ -802,6 +813,18 @@ ice_cee_to_dcb_cfg(struct ice_aqc_get_cee_dcb_cfg_resp *cee_cfg, ice_aqc_cee_app_shift = ICE_AQC_CEE_APP_ISCSI_S; ice_app_sel_type = ICE_APP_SEL_TCPIP; ice_app_prot_id_type = ICE_APP_PROT_ID_ISCSI; + + for (j = 0; j < cmp_dcbcfg->numapps; j++) { + u16 prot_id = cmp_dcbcfg->app[j].prot_id; + u8 sel = cmp_dcbcfg->app[j].selector; + + if (sel == ICE_APP_SEL_TCPIP && + (prot_id == ICE_APP_PROT_ID_ISCSI || + prot_id == ICE_APP_PROT_ID_ISCSI_860)) { + ice_app_prot_id_type = prot_id; + break; + } + } } else { /* FIP APP */ ice_aqc_cee_status_mask = ICE_AQC_CEE_FIP_STATUS_M; @@ -892,11 +915,8 @@ enum ice_status ice_get_dcb_cfg(struct ice_port_info *pi) ret = ice_aq_get_cee_dcb_cfg(pi->hw, &cee_cfg, NULL); if (!ret) { /* CEE mode */ - dcbx_cfg = &pi->qos_cfg.local_dcbx_cfg; - dcbx_cfg->dcbx_mode = ICE_DCBX_MODE_CEE; - dcbx_cfg->tlv_status = le32_to_cpu(cee_cfg.tlv_status); - ice_cee_to_dcb_cfg(&cee_cfg, dcbx_cfg); ret = ice_get_ieee_or_cee_dcb_cfg(pi, ICE_DCBX_MODE_CEE); + ice_cee_to_dcb_cfg(&cee_cfg, pi); } else if (pi->hw->adminq.sq_last_status == ICE_AQ_RC_ENOENT) { /* CEE mode not enabled try querying IEEE data */ dcbx_cfg = &pi->qos_cfg.local_dcbx_cfg; diff --git a/drivers/net/ethernet/intel/ice/ice_type.h b/drivers/net/ethernet/intel/ice/ice_type.h index c09c085f637a..1bed183d96a0 100644 --- a/drivers/net/ethernet/intel/ice/ice_type.h +++ b/drivers/net/ethernet/intel/ice/ice_type.h @@ -493,6 +493,7 @@ struct ice_dcb_app_priority_table { #define ICE_TLV_STATUS_ERR 0x4 #define ICE_APP_PROT_ID_FCOE 0x8906 #define ICE_APP_PROT_ID_ISCSI 0x0cbc +#define ICE_APP_PROT_ID_ISCSI_860 0x035c #define ICE_APP_PROT_ID_FIP 0x8914 #define ICE_APP_SEL_ETHTYPE 0x1 #define ICE_APP_SEL_TCPIP 0x2
From: Eyal Birger eyal.birger@gmail.com
[ Upstream commit 8fc0e3b6a8666d656923d214e4dc791e9a17164a ]
Frag needed should only be sent if the header enables DF.
This fix allows packets larger than MTU to pass the xfrm interface and be fragmented after encapsulation, aligning behavior with non-interface xfrm.
Fixes: f203b76d7809 ("xfrm: Add virtual xfrm interfaces") Signed-off-by: Eyal Birger eyal.birger@gmail.com Reviewed-by: Sabrina Dubroca sd@queasysnail.net Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/xfrm/xfrm_interface.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c index 9b8e292a7c6a..e9ce23343f5c 100644 --- a/net/xfrm/xfrm_interface.c +++ b/net/xfrm/xfrm_interface.c @@ -305,6 +305,8 @@ xfrmi_xmit2(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
icmpv6_ndo_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); } else { + if (!(ip_hdr(skb)->frag_off & htons(IP_DF))) + goto xmit; icmp_ndo_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu)); } @@ -313,6 +315,7 @@ xfrmi_xmit2(struct sk_buff *skb, struct net_device *dev, struct flowi *fl) return -EMSGSIZE; }
+xmit: xfrmi_scrub_packet(skb, !net_eq(xi->net, dev_net(dev))); skb_dst_set(skb, dst); skb->dev = tdev;
From: Evan Nimmo evan.nimmo@alliedtelesis.co.nz
[ Upstream commit 9ab1265d52314fce1b51e8665ea6dbc9ac1a027c ]
A situation can occur where the interface bound to the sk is different to the interface bound to the sk attached to the skb. The interface bound to the sk is the correct one however this information is lost inside xfrm_output2 and instead the sk on the skb is used in xfrm_output_resume instead. This assumes that the sk bound interface and the bound interface attached to the sk within the skb are the same which can lead to lookup failures inside ip_route_me_harder resulting in the packet being dropped.
We have an l2tp v3 tunnel with ipsec protection. The tunnel is in the global VRF however we have an encapsulated dot1q tunnel interface that is within a different VRF. We also have a mangle rule that marks the packets causing them to be processed inside ip_route_me_harder.
Prior to commit 31c70d5956fc ("l2tp: keep original skb ownership") this worked fine as the sk attached to the skb was changed from the dot1q encapsulated interface to the sk for the tunnel which meant the interface bound to the sk and the interface bound to the skb were identical. Commit 46d6c5ae953c ("netfilter: use actual socket sk rather than skb sk when routing harder") fixed some of these issues however a similar problem existed in the xfrm code.
Fixes: 31c70d5956fc ("l2tp: keep original skb ownership") Signed-off-by: Evan Nimmo evan.nimmo@alliedtelesis.co.nz Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/xfrm.h | 2 +- net/ipv4/ah4.c | 2 +- net/ipv4/esp4.c | 2 +- net/ipv6/ah6.c | 2 +- net/ipv6/esp6.c | 2 +- net/xfrm/xfrm_output.c | 10 +++++----- 6 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/include/net/xfrm.h b/include/net/xfrm.h index b2a06f10b62c..bfbc7810df94 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -1557,7 +1557,7 @@ int xfrm_trans_queue_net(struct net *net, struct sk_buff *skb, int xfrm_trans_queue(struct sk_buff *skb, int (*finish)(struct net *, struct sock *, struct sk_buff *)); -int xfrm_output_resume(struct sk_buff *skb, int err); +int xfrm_output_resume(struct sock *sk, struct sk_buff *skb, int err); int xfrm_output(struct sock *sk, struct sk_buff *skb);
#if IS_ENABLED(CONFIG_NET_PKTGEN) diff --git a/net/ipv4/ah4.c b/net/ipv4/ah4.c index d99e1be94019..36ed85bf2ad5 100644 --- a/net/ipv4/ah4.c +++ b/net/ipv4/ah4.c @@ -141,7 +141,7 @@ static void ah_output_done(struct crypto_async_request *base, int err) }
kfree(AH_SKB_CB(skb)->tmp); - xfrm_output_resume(skb, err); + xfrm_output_resume(skb->sk, skb, err); }
static int ah_output(struct xfrm_state *x, struct sk_buff *skb) diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c index a3271ec3e162..4b834bbf95e0 100644 --- a/net/ipv4/esp4.c +++ b/net/ipv4/esp4.c @@ -279,7 +279,7 @@ static void esp_output_done(struct crypto_async_request *base, int err) x->encap && x->encap->encap_type == TCP_ENCAP_ESPINTCP) esp_output_tail_tcp(x, skb); else - xfrm_output_resume(skb, err); + xfrm_output_resume(skb->sk, skb, err); } }
diff --git a/net/ipv6/ah6.c b/net/ipv6/ah6.c index 440080da805b..080ee7f44c64 100644 --- a/net/ipv6/ah6.c +++ b/net/ipv6/ah6.c @@ -316,7 +316,7 @@ static void ah6_output_done(struct crypto_async_request *base, int err) }
kfree(AH_SKB_CB(skb)->tmp); - xfrm_output_resume(skb, err); + xfrm_output_resume(skb->sk, skb, err); }
static int ah6_output(struct xfrm_state *x, struct sk_buff *skb) diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c index 2b804fcebcc6..4071cb7c7a15 100644 --- a/net/ipv6/esp6.c +++ b/net/ipv6/esp6.c @@ -314,7 +314,7 @@ static void esp_output_done(struct crypto_async_request *base, int err) x->encap && x->encap->encap_type == TCP_ENCAP_ESPINTCP) esp_output_tail_tcp(x, skb); else - xfrm_output_resume(skb, err); + xfrm_output_resume(skb->sk, skb, err); } }
diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c index a7ab19353313..b81ca117dac7 100644 --- a/net/xfrm/xfrm_output.c +++ b/net/xfrm/xfrm_output.c @@ -503,22 +503,22 @@ out: return err; }
-int xfrm_output_resume(struct sk_buff *skb, int err) +int xfrm_output_resume(struct sock *sk, struct sk_buff *skb, int err) { struct net *net = xs_net(skb_dst(skb)->xfrm);
while (likely((err = xfrm_output_one(skb, err)) == 0)) { nf_reset_ct(skb);
- err = skb_dst(skb)->ops->local_out(net, skb->sk, skb); + err = skb_dst(skb)->ops->local_out(net, sk, skb); if (unlikely(err != 1)) goto out;
if (!skb_dst(skb)->xfrm) - return dst_output(net, skb->sk, skb); + return dst_output(net, sk, skb);
err = nf_hook(skb_dst(skb)->ops->family, - NF_INET_POST_ROUTING, net, skb->sk, skb, + NF_INET_POST_ROUTING, net, sk, skb, NULL, skb_dst(skb)->dev, xfrm_output2); if (unlikely(err != 1)) goto out; @@ -534,7 +534,7 @@ EXPORT_SYMBOL_GPL(xfrm_output_resume);
static int xfrm_output2(struct net *net, struct sock *sk, struct sk_buff *skb) { - return xfrm_output_resume(skb, 1); + return xfrm_output_resume(sk, skb, 1); }
static int xfrm_output_gso(struct net *net, struct sock *sk, struct sk_buff *skb)
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 7029e783027706b427bbfbdf8558252c1dac6fa0 ]
On 32-bit machines with 64-bit resource_size_t, the driver causes a link failure because of the 64-bit division:
arm-linux-gnueabi-ld: drivers/remoteproc/qcom_pil_info.o: in function `qcom_pil_info_store': qcom_pil_info.c:(.text+0x1ec): undefined reference to `__aeabi_uldivmod'
Add a cast to an u32 to avoid this. If the resource exceeds 4GB, there are bigger problems.
Fixes: 549b67da660d ("remoteproc: qcom: Introduce helper to store pil info in IMEM") Signed-off-by: Arnd Bergmann arnd@arndb.de Link: https://lore.kernel.org/r/20210103135628.3702427-1-arnd@kernel.org Signed-off-by: Bjorn Andersson bjorn.andersson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/remoteproc/qcom_pil_info.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/remoteproc/qcom_pil_info.c b/drivers/remoteproc/qcom_pil_info.c index 5521c4437ffa..7c007dd7b200 100644 --- a/drivers/remoteproc/qcom_pil_info.c +++ b/drivers/remoteproc/qcom_pil_info.c @@ -56,7 +56,7 @@ static int qcom_pil_info_init(void) memset_io(base, 0, resource_size(&imem));
_reloc.base = base; - _reloc.num_entries = resource_size(&imem) / PIL_RELOC_ENTRY_SIZE; + _reloc.num_entries = (u32)resource_size(&imem) / PIL_RELOC_ENTRY_SIZE;
return 0; }
From: Geert Uytterhoeven geert+renesas@glider.be
[ Upstream commit 3b6e7088afc919f5b52e4d2de8501ad34d35b09b ]
According to Table 30 ("DVFS_MoniVDAC [6:0] Setting Table") in the BD9571MWV-M Datasheet Rev. 002, the valid voltage range is 600..1100 mV (settings 0x3c..0x6e). While the lower limit is taken into account (by setting regulator_desc.linear_min_sel to 0x3c), the upper limit is not.
Fix this by reducing regulator_desc.n_voltages from 0x80 to 0x6f.
Fixes: e85c5a153fe237f2 ("regulator: Add ROHM BD9571MWV-M PMIC regulator driver") Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Link: https://lore.kernel.org/r/20210312130242.3390038-2-geert+renesas@glider.be Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/regulator/bd9571mwv-regulator.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/regulator/bd9571mwv-regulator.c b/drivers/regulator/bd9571mwv-regulator.c index e690c2ce5b3c..25e33028871c 100644 --- a/drivers/regulator/bd9571mwv-regulator.c +++ b/drivers/regulator/bd9571mwv-regulator.c @@ -124,7 +124,7 @@ static const struct regulator_ops vid_ops = {
static const struct regulator_desc regulators[] = { BD9571MWV_REG("VD09", "vd09", VD09, avs_ops, 0, 0x7f, - 0x80, 600000, 10000, 0x3c), + 0x6f, 600000, 10000, 0x3c), BD9571MWV_REG("VD18", "vd18", VD18, vid_ops, BD9571MWV_VD18_VID, 0xf, 16, 1625000, 25000, 0), BD9571MWV_REG("VD25", "vd25", VD25, vid_ops, BD9571MWV_VD25_VID, 0xf, @@ -133,7 +133,7 @@ static const struct regulator_desc regulators[] = { 11, 2800000, 100000, 0), BD9571MWV_REG("DVFS", "dvfs", DVFS, reg_ops, BD9571MWV_DVFS_MONIVDAC, 0x7f, - 0x80, 600000, 10000, 0x3c), + 0x6f, 600000, 10000, 0x3c), };
#ifdef CONFIG_PM_SLEEP
From: Tony Lindgren tony@atomide.com
[ Upstream commit 30916faa1a6009122e10d0c42338b8db44a36fde ]
We are now registering the mpu domain three times instead of registering mpu, core and iva domains like we should.
Fixes: d44fa156dcb2 ("ARM: OMAP2+: Configure voltage controller for cpcap") Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/mach-omap2/pmic-cpcap.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm/mach-omap2/pmic-cpcap.c b/arch/arm/mach-omap2/pmic-cpcap.c index 09076ad0576d..668dc84fd31e 100644 --- a/arch/arm/mach-omap2/pmic-cpcap.c +++ b/arch/arm/mach-omap2/pmic-cpcap.c @@ -246,10 +246,10 @@ int __init omap4_cpcap_init(void) omap_voltage_register_pmic(voltdm, &omap443x_max8952_mpu);
if (of_machine_is_compatible("motorola,droid-bionic")) { - voltdm = voltdm_lookup("mpu"); + voltdm = voltdm_lookup("core"); omap_voltage_register_pmic(voltdm, &omap_cpcap_core);
- voltdm = voltdm_lookup("mpu"); + voltdm = voltdm_lookup("iva"); omap_voltage_register_pmic(voltdm, &omap_cpcap_iva); } else { voltdm = voltdm_lookup("core");
From: Carlos Leija cileija@ti.com
[ Upstream commit b3d09a06d89f474cb52664e016849315a97e09d9 ]
We need to add a dummy smc call to the cpuidle wakeup path to force the ROM code to save the return address after MMU is enabled again. This is needed to prevent random hangs on secure devices like droid4.
Otherwise the system will eventually hang when entering deeper SoC idle states with the core and mpu domains in open-switch retention (OSWR). The hang happens as the ROM code tries to use the earlier physical return address set by omap-headsmp.S with MMU off while waking up CPU1 again.
The hangs started happening in theory already with commit caf8c87d7ff2 ("ARM: OMAP2+: Allow core oswr for omap4"), but in practise the issue went unnoticed as various drivers were often blocking any deeper idle states with hardware autoidle features.
This patch is based on an earlier TI Linux kernel tree commit 92f0b3028d9e ("OMAP4: PM: update ROM return address for OSWR and OFF") written by Carlos Leija cileija@ti.com, Praneeth Bajjuri praneeth@ti.com, and Bryan Buckley bryan.buckley@ti.com. A later version of the patch was updated to use CPU_PM notifiers by Tero Kristo t-kristo@ti.com.
Signed-off-by: Carlos Leija cileija@ti.com Signed-off-by: Praneeth Bajjuri praneeth@ti.com Signed-off-by: Bryan Buckley bryan.buckley@ti.com Signed-off-by: Tero Kristo t-kristo@ti.com Fixes: caf8c87d7ff2 ("ARM: OMAP2+: Allow core oswr for omap4") Reported-by: Carl Philipp Klemm philipp@uvos.xyz Reported-by: Merlijn Wajer merlijn@wizzup.org Cc: Ivan Jelincic parazyd@dyne.org Cc: Pavel Machek pavel@ucw.cz Cc: Sebastian Reichel sre@kernel.org Cc: Tero Kristo kristo@kernel.org [tony@atomide.com: updated to apply, updated description] Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/mach-omap2/omap-secure.c | 39 +++++++++++++++++++++++++++++++ arch/arm/mach-omap2/omap-secure.h | 1 + 2 files changed, 40 insertions(+)
diff --git a/arch/arm/mach-omap2/omap-secure.c b/arch/arm/mach-omap2/omap-secure.c index f70d561f37f7..0659ab4cb0af 100644 --- a/arch/arm/mach-omap2/omap-secure.c +++ b/arch/arm/mach-omap2/omap-secure.c @@ -9,6 +9,7 @@ */
#include <linux/arm-smccc.h> +#include <linux/cpu_pm.h> #include <linux/kernel.h> #include <linux/init.h> #include <linux/io.h> @@ -20,6 +21,7 @@
#include "common.h" #include "omap-secure.h" +#include "soc.h"
static phys_addr_t omap_secure_memblock_base;
@@ -213,3 +215,40 @@ void __init omap_secure_init(void) { omap_optee_init_check(); } + +/* + * Dummy dispatcher call after core OSWR and MPU off. Updates the ROM return + * address after MMU has been re-enabled after CPU1 has been woken up again. + * Otherwise the ROM code will attempt to use the earlier physical return + * address that got set with MMU off when waking up CPU1. Only used on secure + * devices. + */ +static int cpu_notifier(struct notifier_block *nb, unsigned long cmd, void *v) +{ + switch (cmd) { + case CPU_CLUSTER_PM_EXIT: + omap_secure_dispatcher(OMAP4_PPA_SERVICE_0, + FLAG_START_CRITICAL, + 0, 0, 0, 0, 0); + break; + default: + break; + } + + return NOTIFY_OK; +} + +static struct notifier_block secure_notifier_block = { + .notifier_call = cpu_notifier, +}; + +static int __init secure_pm_init(void) +{ + if (omap_type() == OMAP2_DEVICE_TYPE_GP || !soc_is_omap44xx()) + return 0; + + cpu_pm_register_notifier(&secure_notifier_block); + + return 0; +} +omap_arch_initcall(secure_pm_init); diff --git a/arch/arm/mach-omap2/omap-secure.h b/arch/arm/mach-omap2/omap-secure.h index 4aaa95706d39..172069f31616 100644 --- a/arch/arm/mach-omap2/omap-secure.h +++ b/arch/arm/mach-omap2/omap-secure.h @@ -50,6 +50,7 @@ #define OMAP5_DRA7_MON_SET_ACR_INDEX 0x107
/* Secure PPA(Primary Protected Application) APIs */ +#define OMAP4_PPA_SERVICE_0 0x21 #define OMAP4_PPA_L2_POR_INDEX 0x23 #define OMAP4_PPA_CPU_ACTRL_SMP_INDEX 0x25
From: Ahmed S. Darwish a.darwish@linutronix.de
[ Upstream commit e88add19f68191448427a6e4eb059664650a837f ]
A sequence counter write section must be serialized or its internal state can get corrupted. The "xfrm_state_hash_generation" seqcount is global, but its write serialization lock (net->xfrm.xfrm_state_lock) is instantiated per network namespace. The write protection is thus insufficient.
To provide full protection, localize the sequence counter per network namespace instead. This should be safe as both the seqcount read and write sections access data exclusively within the network namespace. It also lays the foundation for transforming "xfrm_state_hash_generation" data type from seqcount_t to seqcount_LOCKNAME_t in further commits.
Fixes: b65e3d7be06f ("xfrm: state: add sequence count to detect hash resizes") Signed-off-by: Ahmed S. Darwish a.darwish@linutronix.de Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/netns/xfrm.h | 4 +++- net/xfrm/xfrm_state.c | 10 +++++----- 2 files changed, 8 insertions(+), 6 deletions(-)
diff --git a/include/net/netns/xfrm.h b/include/net/netns/xfrm.h index 59f45b1e9dac..b59d73d529ba 100644 --- a/include/net/netns/xfrm.h +++ b/include/net/netns/xfrm.h @@ -72,7 +72,9 @@ struct netns_xfrm { #if IS_ENABLED(CONFIG_IPV6) struct dst_ops xfrm6_dst_ops; #endif - spinlock_t xfrm_state_lock; + spinlock_t xfrm_state_lock; + seqcount_t xfrm_state_hash_generation; + spinlock_t xfrm_policy_lock; struct mutex xfrm_cfg_mutex; }; diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 2f1517827995..77499abd9f99 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -44,7 +44,6 @@ static void xfrm_state_gc_task(struct work_struct *work); */
static unsigned int xfrm_state_hashmax __read_mostly = 1 * 1024 * 1024; -static __read_mostly seqcount_t xfrm_state_hash_generation = SEQCNT_ZERO(xfrm_state_hash_generation); static struct kmem_cache *xfrm_state_cache __ro_after_init;
static DECLARE_WORK(xfrm_state_gc_work, xfrm_state_gc_task); @@ -140,7 +139,7 @@ static void xfrm_hash_resize(struct work_struct *work) }
spin_lock_bh(&net->xfrm.xfrm_state_lock); - write_seqcount_begin(&xfrm_state_hash_generation); + write_seqcount_begin(&net->xfrm.xfrm_state_hash_generation);
nhashmask = (nsize / sizeof(struct hlist_head)) - 1U; odst = xfrm_state_deref_prot(net->xfrm.state_bydst, net); @@ -156,7 +155,7 @@ static void xfrm_hash_resize(struct work_struct *work) rcu_assign_pointer(net->xfrm.state_byspi, nspi); net->xfrm.state_hmask = nhashmask;
- write_seqcount_end(&xfrm_state_hash_generation); + write_seqcount_end(&net->xfrm.xfrm_state_hash_generation); spin_unlock_bh(&net->xfrm.xfrm_state_lock);
osize = (ohashmask + 1) * sizeof(struct hlist_head); @@ -1061,7 +1060,7 @@ xfrm_state_find(const xfrm_address_t *daddr, const xfrm_address_t *saddr,
to_put = NULL;
- sequence = read_seqcount_begin(&xfrm_state_hash_generation); + sequence = read_seqcount_begin(&net->xfrm.xfrm_state_hash_generation);
rcu_read_lock(); h = xfrm_dst_hash(net, daddr, saddr, tmpl->reqid, encap_family); @@ -1174,7 +1173,7 @@ out: if (to_put) xfrm_state_put(to_put);
- if (read_seqcount_retry(&xfrm_state_hash_generation, sequence)) { + if (read_seqcount_retry(&net->xfrm.xfrm_state_hash_generation, sequence)) { *err = -EAGAIN; if (x) { xfrm_state_put(x); @@ -2664,6 +2663,7 @@ int __net_init xfrm_state_init(struct net *net) net->xfrm.state_num = 0; INIT_WORK(&net->xfrm.state_hash_work, xfrm_hash_resize); spin_lock_init(&net->xfrm.xfrm_state_lock); + seqcount_init(&net->xfrm.xfrm_state_hash_generation); return 0;
out_byspi:
From: Xin Long lucien.xin@gmail.com
[ Upstream commit 154deab6a3ba47792936edf77f2f13a1cbc4351d ]
Now in esp4/6_gso_segment(), before calling inner proto .gso_segment, NETIF_F_CSUM_MASK bits are deleted, as HW won't be able to do the csum for inner proto due to the packet encrypted already.
So the UDP/TCP packet has to do the checksum on its own .gso_segment. But SCTP is using CRC checksum, and for that NETIF_F_SCTP_CRC should be deleted to make SCTP do the csum in own .gso_segment as well.
In Xiumei's testing with SCTP over IPsec/veth, the packets are kept dropping due to the wrong CRC checksum.
Reported-by: Xiumei Mu xmu@redhat.com Fixes: 7862b4058b9f ("esp: Add gso handlers for esp4 and esp6") Signed-off-by: Xin Long lucien.xin@gmail.com Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/esp4_offload.c | 6 ++++-- net/ipv6/esp6_offload.c | 6 ++++-- 2 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/esp4_offload.c b/net/ipv4/esp4_offload.c index 5bda5aeda579..d5c0f5a2a551 100644 --- a/net/ipv4/esp4_offload.c +++ b/net/ipv4/esp4_offload.c @@ -217,10 +217,12 @@ static struct sk_buff *esp4_gso_segment(struct sk_buff *skb,
if ((!(skb->dev->gso_partial_features & NETIF_F_HW_ESP) && !(features & NETIF_F_HW_ESP)) || x->xso.dev != skb->dev) - esp_features = features & ~(NETIF_F_SG | NETIF_F_CSUM_MASK); + esp_features = features & ~(NETIF_F_SG | NETIF_F_CSUM_MASK | + NETIF_F_SCTP_CRC); else if (!(features & NETIF_F_HW_ESP_TX_CSUM) && !(skb->dev->gso_partial_features & NETIF_F_HW_ESP_TX_CSUM)) - esp_features = features & ~NETIF_F_CSUM_MASK; + esp_features = features & ~(NETIF_F_CSUM_MASK | + NETIF_F_SCTP_CRC);
xo->flags |= XFRM_GSO_SEGMENT;
diff --git a/net/ipv6/esp6_offload.c b/net/ipv6/esp6_offload.c index 1ca516fb30e1..f35203ab39f5 100644 --- a/net/ipv6/esp6_offload.c +++ b/net/ipv6/esp6_offload.c @@ -254,9 +254,11 @@ static struct sk_buff *esp6_gso_segment(struct sk_buff *skb, skb->encap_hdr_csum = 1;
if (!(features & NETIF_F_HW_ESP) || x->xso.dev != skb->dev) - esp_features = features & ~(NETIF_F_SG | NETIF_F_CSUM_MASK); + esp_features = features & ~(NETIF_F_SG | NETIF_F_CSUM_MASK | + NETIF_F_SCTP_CRC); else if (!(features & NETIF_F_HW_ESP_TX_CSUM)) - esp_features = features & ~NETIF_F_CSUM_MASK; + esp_features = features & ~(NETIF_F_CSUM_MASK | + NETIF_F_SCTP_CRC);
xo->flags |= XFRM_GSO_SEGMENT;
From: Guennadi Liakhovetski guennadi.liakhovetski@linux.intel.com
[ Upstream commit 927280909fa7d8e61596800d82f18047c6cfbbe4 ]
When checking for enabled cores it isn't enough to check that some of the requested cores are running, we have to check that all of them are.
Fixes: 747503b1813a ("ASoC: SOF: Intel: Add Intel specific HDA DSP HW operations") Reviewed-by: Kai Vehmanen kai.vehmanen@linux.intel.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Signed-off-by: Guennadi Liakhovetski guennadi.liakhovetski@linux.intel.com Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Link: https://lore.kernel.org/r/20210322163728.16616-2-pierre-louis.bossart@linux.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/sof/intel/hda-dsp.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/sound/soc/sof/intel/hda-dsp.c b/sound/soc/sof/intel/hda-dsp.c index c731b9bd60b4..85ec4361c8c4 100644 --- a/sound/soc/sof/intel/hda-dsp.c +++ b/sound/soc/sof/intel/hda-dsp.c @@ -226,10 +226,17 @@ bool hda_dsp_core_is_enabled(struct snd_sof_dev *sdev,
val = snd_sof_dsp_read(sdev, HDA_DSP_BAR, HDA_DSP_REG_ADSPCS);
- is_enable = (val & HDA_DSP_ADSPCS_CPA_MASK(core_mask)) && - (val & HDA_DSP_ADSPCS_SPA_MASK(core_mask)) && - !(val & HDA_DSP_ADSPCS_CRST_MASK(core_mask)) && - !(val & HDA_DSP_ADSPCS_CSTALL_MASK(core_mask)); +#define MASK_IS_EQUAL(v, m, field) ({ \ + u32 _m = field(m); \ + ((v) & _m) == _m; \ +}) + + is_enable = MASK_IS_EQUAL(val, core_mask, HDA_DSP_ADSPCS_CPA_MASK) && + MASK_IS_EQUAL(val, core_mask, HDA_DSP_ADSPCS_SPA_MASK) && + !(val & HDA_DSP_ADSPCS_CRST_MASK(core_mask)) && + !(val & HDA_DSP_ADSPCS_CSTALL_MASK(core_mask)); + +#undef MASK_IS_EQUAL
dev_dbg(sdev->dev, "DSP core(s) enabled? %d : core_mask %x\n", is_enable, core_mask);
From: Shengjiu Wang shengjiu.wang@nxp.com
[ Upstream commit 16b82e75c15a7dbd564ea3654f3feb61df9e1e6f ]
The input MCLK is 12.288MHz, the desired output sysclk is 11.2896MHz and sample rate is 44100Hz, with the configuration pllprescale=2, postscale=sysclkdiv=1, some chip may have wrong bclk and lrclk output with pll enabled in master mode, but with the configuration pllprescale=1, postscale=2, the output clock is correct.
From Datasheet, the PLL performs best when f2 is between
90MHz and 100MHz when the desired sysclk output is 11.2896MHz or 12.288MHz, so sysclkdiv = 2 (f2/8) is the best choice.
So search available sysclk_divs from 2 to 1 other than from 1 to 2.
Fixes: 84fdc00d519f ("ASoC: codec: wm9860: Refactor PLL out freq search") Signed-off-by: Shengjiu Wang shengjiu.wang@nxp.com Acked-by: Charles Keepax ckeepax@opensource.cirrus.com Link: https://lore.kernel.org/r/1616150926-22892-1-git-send-email-shengjiu.wang@nx... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/wm8960.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/sound/soc/codecs/wm8960.c b/sound/soc/codecs/wm8960.c index 660ec46eecf2..ceaf3bbb18e6 100644 --- a/sound/soc/codecs/wm8960.c +++ b/sound/soc/codecs/wm8960.c @@ -707,7 +707,13 @@ int wm8960_configure_pll(struct snd_soc_component *component, int freq_in, best_freq_out = -EINVAL; *sysclk_idx = *dac_idx = *bclk_idx = -1;
- for (i = 0; i < ARRAY_SIZE(sysclk_divs); ++i) { + /* + * From Datasheet, the PLL performs best when f2 is between + * 90MHz and 100MHz, the desired sysclk output is 11.2896MHz + * or 12.288MHz, then sysclkdiv = 2 is the best choice. + * So search sysclk_divs from 2 to 1 other than from 1 to 2. + */ + for (i = ARRAY_SIZE(sysclk_divs) - 1; i >= 0; --i) { if (sysclk_divs[i] == -1) continue; for (j = 0; j < ARRAY_SIZE(dac_divs); ++j) {
From: Steffen Klassert steffen.klassert@secunet.com
[ Upstream commit b1e3a5607034aa0a481c6f69a6893049406665fb ]
When xfrm interfaces are used in combination with namespaces and ESP offload, we get a dst_entry NULL pointer dereference. This is because we don't have a dst_entry attached in the ESP offloading case and we need to do a policy lookup before the namespace transition.
Fix this by expicit checking of skb_dst(skb) before accessing it.
Fixes: f203b76d78092 ("xfrm: Add virtual xfrm interfaces") Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/xfrm.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/net/xfrm.h b/include/net/xfrm.h index bfbc7810df94..c58a6d4eb610 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -1097,7 +1097,7 @@ static inline int __xfrm_policy_check2(struct sock *sk, int dir, return __xfrm_policy_check(sk, ndir, skb, family);
return (!net->xfrm.policy_count[dir] && !secpath_exists(skb)) || - (skb_dst(skb)->flags & DST_NOPOLICY) || + (skb_dst(skb) && (skb_dst(skb)->flags & DST_NOPOLICY)) || __xfrm_policy_check(sk, ndir, skb, family); }
From: Norbert Ciosek norbertx.ciosek@intel.com
[ Upstream commit 22f8b5df881e9f1302514bbbbbb8649c2051de55 ]
Remove padding from RSS structures. Previous layout could lead to unwanted compiler optimizations in loops when iterating over key and lut arrays.
Fixes: 65ece6de0114 ("virtchnl: Add missing explicit padding to structures") Signed-off-by: Norbert Ciosek norbertx.ciosek@intel.com Tested-by: Konrad Jankowski konrad0.jankowski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/avf/virtchnl.h | 2 -- 1 file changed, 2 deletions(-)
diff --git a/include/linux/avf/virtchnl.h b/include/linux/avf/virtchnl.h index 40bad71865ea..532bcbfc4716 100644 --- a/include/linux/avf/virtchnl.h +++ b/include/linux/avf/virtchnl.h @@ -476,7 +476,6 @@ struct virtchnl_rss_key { u16 vsi_id; u16 key_len; u8 key[1]; /* RSS hash key, packed bytes */ - u8 pad[1]; };
VIRTCHNL_CHECK_STRUCT_LEN(6, virtchnl_rss_key); @@ -485,7 +484,6 @@ struct virtchnl_rss_lut { u16 vsi_id; u16 lut_entries; u8 lut[1]; /* RSS lookup table */ - u8 pad[1]; };
VIRTCHNL_CHECK_STRUCT_LEN(6, virtchnl_rss_lut);
From: Mateusz Palczewski mateusz.palczewski@intel.com
[ Upstream commit 90449e98c265296329446c7abcd2aae3b20c0bc9 ]
Add Asym_Pause to supported link modes (it is supported by HW). Lack of Asym_Pause in supported modes can cause several problems, i.e. it won't be possible to turn the autonegotiation on with asymmetric pause settings (i.e. Tx on, Rx off).
Fixes: 4e91bcd5d47a ("i40e: Finish implementation of ethtool get settings") Signed-off-by: Dawid Lukwinski dawid.lukwinski@intel.com Signed-off-by: Mateusz Palczewski mateusz.palczewski@intel.com Reviewed-by: Aleksandr Loktionov aleksandr.loktionov@intel.com Reviewed-by: Przemyslaw Patynowski przemyslawx.patynowski@intel.com Tested-by: Tony Brelinski tonyx.brelinski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/i40e/i40e_ethtool.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c index 9e81f85ee2d8..a92fac6f1389 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c +++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c @@ -1101,6 +1101,7 @@ static int i40e_get_link_ksettings(struct net_device *netdev,
/* Set flow control settings */ ethtool_link_ksettings_add_link_mode(ks, supported, Pause); + ethtool_link_ksettings_add_link_mode(ks, supported, Asym_Pause);
switch (hw->fc.requested_mode) { case I40E_FC_FULL:
From: Eryk Rybak eryk.roch.rybak@intel.com
[ Upstream commit 347b5650cd158d1d953487cc2bec567af5c5bf96 ]
Fix the reason of kernel oops when i40e driver removed VFs. Added new __I40E_VFS_RELEASING state to signalize releasing process by PF, that it makes possible to exit of reset VF procedure. Without this patch, it is possible to suspend the VFs reset by releasing VFs resources procedure. Retrying the reset after the timeout works on the freed VF memory causing a kernel oops.
Fixes: d43d60e5eb95 ("i40e: ensure reset occurs when disabling VF") Signed-off-by: Eryk Rybak eryk.roch.rybak@intel.com Signed-off-by: Grzegorz Szczurek grzegorzx.szczurek@intel.com Reviewed-by: Aleksandr Loktionov aleksandr.loktionov@intel.com Tested-by: Konrad Jankowski konrad0.jankowski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/i40e/i40e.h | 1 + drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 9 +++++++++ 2 files changed, 10 insertions(+)
diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h index 118473dfdcbd..fe1258778cbc 100644 --- a/drivers/net/ethernet/intel/i40e/i40e.h +++ b/drivers/net/ethernet/intel/i40e/i40e.h @@ -142,6 +142,7 @@ enum i40e_state_t { __I40E_VIRTCHNL_OP_PENDING, __I40E_RECOVERY_MODE, __I40E_VF_RESETS_DISABLED, /* disable resets during i40e_remove */ + __I40E_VFS_RELEASING, /* This must be last as it determines the size of the BITMAP */ __I40E_STATE_SIZE__, }; diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c index 3b269c70dcfe..e4f13a49c3df 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c +++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c @@ -137,6 +137,7 @@ void i40e_vc_notify_vf_reset(struct i40e_vf *vf) **/ static inline void i40e_vc_disable_vf(struct i40e_vf *vf) { + struct i40e_pf *pf = vf->pf; int i;
i40e_vc_notify_vf_reset(vf); @@ -147,6 +148,11 @@ static inline void i40e_vc_disable_vf(struct i40e_vf *vf) * ensure a reset. */ for (i = 0; i < 20; i++) { + /* If PF is in VFs releasing state reset VF is impossible, + * so leave it. + */ + if (test_bit(__I40E_VFS_RELEASING, pf->state)) + return; if (i40e_reset_vf(vf, false)) return; usleep_range(10000, 20000); @@ -1574,6 +1580,8 @@ void i40e_free_vfs(struct i40e_pf *pf)
if (!pf->vf) return; + + set_bit(__I40E_VFS_RELEASING, pf->state); while (test_and_set_bit(__I40E_VF_DISABLE, pf->state)) usleep_range(1000, 2000);
@@ -1631,6 +1639,7 @@ void i40e_free_vfs(struct i40e_pf *pf) } } clear_bit(__I40E_VF_DISABLE, pf->state); + clear_bit(__I40E_VFS_RELEASING, pf->state); }
#ifdef CONFIG_PCI_IOV
From: Al Viro viro@zeniv.linux.org.uk
[ Upstream commit 7f6c411c9b50cfab41cc798e003eff27608c7016 ]
1) argument should not be freed in any case - the caller already has it as ->s_fs_info (and uses it a lot afterwards) 2) allocate readlink buffer with kmalloc() - the caller has no way to tell if it's got that (on absolute symlink) or a result of kasprintf(). Sure, for SLAB and SLUB kfree() works on results of kmem_cache_alloc(), but that's not documented anywhere, might change in the future *and* is already not true for SLOB.
Fixes: 52b209f7b848 ("get rid of hostfs_read_inode()") Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/hostfs/hostfs_kern.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/fs/hostfs/hostfs_kern.c b/fs/hostfs/hostfs_kern.c index c070c0d8e3e9..d4e360234579 100644 --- a/fs/hostfs/hostfs_kern.c +++ b/fs/hostfs/hostfs_kern.c @@ -142,7 +142,7 @@ static char *follow_link(char *link) char *name, *resolved, *end; int n;
- name = __getname(); + name = kmalloc(PATH_MAX, GFP_KERNEL); if (!name) { n = -ENOMEM; goto out_free; @@ -171,12 +171,11 @@ static char *follow_link(char *link) goto out_free; }
- __putname(name); - kfree(link); + kfree(name); return resolved;
out_free: - __putname(name); + kfree(name); return ERR_PTR(n); }
From: Shyam Sundar S K Shyam-sundar.S-k@amd.com
[ Upstream commit d75135082698140a26a56defe1bbc1b06f26a41f ]
Based on the IOMMU configuration, the current cache control settings can result in possible coherency issues. The hardware team has recommended new settings for the PCI device path to eliminate the issue.
Fixes: 6f595959c095 ("amd-xgbe: Adjust register settings to improve performance") Signed-off-by: Shyam Sundar S K Shyam-sundar.S-k@amd.com Acked-by: Tom Lendacky thomas.lendacky@amd.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/amd/xgbe/xgbe.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/amd/xgbe/xgbe.h b/drivers/net/ethernet/amd/xgbe/xgbe.h index ba8321ec1ee7..3305979a9f7c 100644 --- a/drivers/net/ethernet/amd/xgbe/xgbe.h +++ b/drivers/net/ethernet/amd/xgbe/xgbe.h @@ -180,9 +180,9 @@ #define XGBE_DMA_SYS_AWCR 0x30303030
/* DMA cache settings - PCI device */ -#define XGBE_DMA_PCI_ARCR 0x00000003 -#define XGBE_DMA_PCI_AWCR 0x13131313 -#define XGBE_DMA_PCI_AWARCR 0x00000313 +#define XGBE_DMA_PCI_ARCR 0x000f0f0f +#define XGBE_DMA_PCI_AWCR 0x0f0f0f0f +#define XGBE_DMA_PCI_AWARCR 0x00000f0f
/* DMA channel interrupt modes */ #define XGBE_IRQ_MODE_EDGE 0
From: Antoine Tenart atenart@kernel.org
[ Upstream commit 30a93d2b7d5a7cbb53ac19c9364a256d1aa6c08a ]
When the interface is part of a bridge or an Open vSwitch port and a packet exceed a PMTU estimate, an ICMP reply is sent to the sender. When using the external mode (collect metadata) the source and destination addresses are reversed, so that Open vSwitch can match the packet against an existing (reverse) flow.
But inverting the source and destination addresses in the shared ip_tunnel_info will make following packets of the flow to use a wrong destination address (packets will be tunnelled to itself), if the flow isn't updated. Which happens with Open vSwitch, until the flow times out.
Fixes this by uncloning the skb's ip_tunnel_info before inverting its source and destination addresses, so that the modification will only be made for the PTMU packet, not the following ones.
Fixes: fc68c99577cc ("vxlan: Support for PMTU discovery on directly bridged links") Tested-by: Eelco Chaudron echaudro@redhat.com Reviewed-by: Eelco Chaudron echaudro@redhat.com Signed-off-by: Antoine Tenart atenart@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/vxlan.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c index 50cb8f045a1e..d3b698d9e2e6 100644 --- a/drivers/net/vxlan.c +++ b/drivers/net/vxlan.c @@ -2724,12 +2724,17 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev, goto tx_error; } else if (err) { if (info) { + struct ip_tunnel_info *unclone; struct in_addr src, dst;
+ unclone = skb_tunnel_info_unclone(skb); + if (unlikely(!unclone)) + goto tx_error; + src = remote_ip.sin.sin_addr; dst = local_ip.sin.sin_addr; - info->key.u.ipv4.src = src.s_addr; - info->key.u.ipv4.dst = dst.s_addr; + unclone->key.u.ipv4.src = src.s_addr; + unclone->key.u.ipv4.dst = dst.s_addr; } vxlan_encap_bypass(skb, vxlan, vxlan, vni, false); dst_release(ndst); @@ -2780,12 +2785,17 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev, goto tx_error; } else if (err) { if (info) { + struct ip_tunnel_info *unclone; struct in6_addr src, dst;
+ unclone = skb_tunnel_info_unclone(skb); + if (unlikely(!unclone)) + goto tx_error; + src = remote_ip.sin6.sin6_addr; dst = local_ip.sin6.sin6_addr; - info->key.u.ipv6.src = src; - info->key.u.ipv6.dst = dst; + unclone->key.u.ipv6.src = src; + unclone->key.u.ipv6.dst = dst; }
vxlan_encap_bypass(skb, vxlan, vxlan, vni, false);
From: Antoine Tenart atenart@kernel.org
[ Upstream commit 68c1a943ef37bafde5ea2383e8ca224c7169ee31 ]
When the interface is part of a bridge or an Open vSwitch port and a packet exceed a PMTU estimate, an ICMP reply is sent to the sender. When using the external mode (collect metadata) the source and destination addresses are reversed, so that Open vSwitch can match the packet against an existing (reverse) flow.
But inverting the source and destination addresses in the shared ip_tunnel_info will make following packets of the flow to use a wrong destination address (packets will be tunnelled to itself), if the flow isn't updated. Which happens with Open vSwitch, until the flow times out.
Fixes this by uncloning the skb's ip_tunnel_info before inverting its source and destination addresses, so that the modification will only be made for the PTMU packet, not the following ones.
Fixes: c1a800e88dbf ("geneve: Support for PMTU discovery on directly bridged links") Tested-by: Eelco Chaudron echaudro@redhat.com Reviewed-by: Eelco Chaudron echaudro@redhat.com Signed-off-by: Antoine Tenart atenart@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/geneve.c | 24 ++++++++++++++++++++---- 1 file changed, 20 insertions(+), 4 deletions(-)
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c index 1426bfc009bc..abd37f26af68 100644 --- a/drivers/net/geneve.c +++ b/drivers/net/geneve.c @@ -907,8 +907,16 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
info = skb_tunnel_info(skb); if (info) { - info->key.u.ipv4.dst = fl4.saddr; - info->key.u.ipv4.src = fl4.daddr; + struct ip_tunnel_info *unclone; + + unclone = skb_tunnel_info_unclone(skb); + if (unlikely(!unclone)) { + dst_release(&rt->dst); + return -ENOMEM; + } + + unclone->key.u.ipv4.dst = fl4.saddr; + unclone->key.u.ipv4.src = fl4.daddr; }
if (!pskb_may_pull(skb, ETH_HLEN)) { @@ -992,8 +1000,16 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev, struct ip_tunnel_info *info = skb_tunnel_info(skb);
if (info) { - info->key.u.ipv6.dst = fl6.saddr; - info->key.u.ipv6.src = fl6.daddr; + struct ip_tunnel_info *unclone; + + unclone = skb_tunnel_info_unclone(skb); + if (unlikely(!unclone)) { + dst_release(dst); + return -ENOMEM; + } + + unclone->key.u.ipv6.dst = fl6.saddr; + unclone->key.u.ipv6.src = fl6.daddr; }
if (!pskb_may_pull(skb, ETH_HLEN)) {
From: Eric Dumazet edumazet@google.com
[ Upstream commit 3a87571f0ffc51ba3bf3ecdb6032861d0154b164 ]
This fixes following syzbot report:
UBSAN: shift-out-of-bounds in ./include/net/red.h:237:23 shift exponent 32 is too large for 32-bit type 'unsigned int' CPU: 1 PID: 8418 Comm: syz-executor170 Not tainted 5.12.0-rc4-next-20210324-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x141/0x1d7 lib/dump_stack.c:120 ubsan_epilogue+0xb/0x5a lib/ubsan.c:148 __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:327 red_set_parms include/net/red.h:237 [inline] choke_change.cold+0x3c/0xc8 net/sched/sch_choke.c:414 qdisc_create+0x475/0x12f0 net/sched/sch_api.c:1247 tc_modify_qdisc+0x4c8/0x1a50 net/sched/sch_api.c:1663 rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5553 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:674 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2350 ___sys_sendmsg+0xf3/0x170 net/socket.c:2404 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2433 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x43f039 Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffdfa725168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000400488 RCX: 000000000043f039 RDX: 0000000000000000 RSI: 0000000020000040 RDI: 0000000000000004 RBP: 0000000000403020 R08: 0000000000400488 R09: 0000000000400488 R10: 0000000000400488 R11: 0000000000000246 R12: 00000000004030b0 R13: 0000000000000000 R14: 00000000004ac018 R15: 0000000000400488
Fixes: 8afa10cbe281 ("net_sched: red: Avoid illegal values") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/red.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/net/red.h b/include/net/red.h index 9e6647c4ccd1..cc9f6b0d7f1e 100644 --- a/include/net/red.h +++ b/include/net/red.h @@ -171,9 +171,9 @@ static inline void red_set_vars(struct red_vars *v) static inline bool red_check_params(u32 qth_min, u32 qth_max, u8 Wlog, u8 Scell_log, u8 *stab) { - if (fls(qth_min) + Wlog > 32) + if (fls(qth_min) + Wlog >= 32) return false; - if (fls(qth_max) + Wlog > 32) + if (fls(qth_max) + Wlog >= 32) return false; if (Scell_log >= 32) return false;
From: Lv Yunlong lyl2019@mail.ustc.edu.cn
[ Upstream commit 1b479fb801602b22512f53c19b1f93a4fc5d5d9d ]
In pvc_xmit, if __skb_pad(skb, pad, false) failed, it will free the skb in the first time and goto drop. But the same skb is freed by kfree_skb(skb) in the second time in drop.
Maintaining the original function unchanged, my patch adds a new label out to avoid the double free if __skb_pad() failed.
Fixes: f5083d0cee08a ("drivers/net/wan/hdlc_fr: Improvements to the code of pvc_xmit") Signed-off-by: Lv Yunlong lyl2019@mail.ustc.edu.cn Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wan/hdlc_fr.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wan/hdlc_fr.c b/drivers/net/wan/hdlc_fr.c index 409e5a7ad8e2..857912ae84d7 100644 --- a/drivers/net/wan/hdlc_fr.c +++ b/drivers/net/wan/hdlc_fr.c @@ -415,7 +415,7 @@ static netdev_tx_t pvc_xmit(struct sk_buff *skb, struct net_device *dev)
if (pad > 0) { /* Pad the frame with zeros */ if (__skb_pad(skb, pad, false)) - goto drop; + goto out; skb_put(skb, pad); } } @@ -448,8 +448,9 @@ static netdev_tx_t pvc_xmit(struct sk_buff *skb, struct net_device *dev) return NETDEV_TX_OK;
drop: - dev->stats.tx_dropped++; kfree_skb(skb); +out: + dev->stats.tx_dropped++; return NETDEV_TX_OK; }
From: Oliver Stäbler oliver.staebler@bytesatwork.ch
[ Upstream commit 5cfad4f45806f6f898b63b8c77cea7452c704cb3 ]
Fix address of the pad control register (IOMUXC_SW_PAD_CTL_PAD_SD1_DATA0) for SD1_DATA0_GPIO2_IO2. This seems to be a typo but it leads to an exception when pinctrl is applied due to wrong memory address access.
Signed-off-by: Oliver Stäbler oliver.staebler@bytesatwork.ch Reviewed-by: Fabio Estevam festevam@gmail.com Acked-by: Rob Herring robh@kernel.org Fixes: c1c9d41319c3 ("dt-bindings: imx: Add pinctrl binding doc for imx8mm") Fixes: 748f908cc882 ("arm64: add basic DTS for i.MX8MQ") Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/freescale/imx8mm-pinfunc.h | 2 +- arch/arm64/boot/dts/freescale/imx8mq-pinfunc.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/boot/dts/freescale/imx8mm-pinfunc.h b/arch/arm64/boot/dts/freescale/imx8mm-pinfunc.h index 5ccc4cc91959..a003e6af3353 100644 --- a/arch/arm64/boot/dts/freescale/imx8mm-pinfunc.h +++ b/arch/arm64/boot/dts/freescale/imx8mm-pinfunc.h @@ -124,7 +124,7 @@ #define MX8MM_IOMUXC_SD1_CMD_USDHC1_CMD 0x0A4 0x30C 0x000 0x0 0x0 #define MX8MM_IOMUXC_SD1_CMD_GPIO2_IO1 0x0A4 0x30C 0x000 0x5 0x0 #define MX8MM_IOMUXC_SD1_DATA0_USDHC1_DATA0 0x0A8 0x310 0x000 0x0 0x0 -#define MX8MM_IOMUXC_SD1_DATA0_GPIO2_IO2 0x0A8 0x31 0x000 0x5 0x0 +#define MX8MM_IOMUXC_SD1_DATA0_GPIO2_IO2 0x0A8 0x310 0x000 0x5 0x0 #define MX8MM_IOMUXC_SD1_DATA1_USDHC1_DATA1 0x0AC 0x314 0x000 0x0 0x0 #define MX8MM_IOMUXC_SD1_DATA1_GPIO2_IO3 0x0AC 0x314 0x000 0x5 0x0 #define MX8MM_IOMUXC_SD1_DATA2_USDHC1_DATA2 0x0B0 0x318 0x000 0x0 0x0 diff --git a/arch/arm64/boot/dts/freescale/imx8mq-pinfunc.h b/arch/arm64/boot/dts/freescale/imx8mq-pinfunc.h index b94b02080a34..68e8fa172974 100644 --- a/arch/arm64/boot/dts/freescale/imx8mq-pinfunc.h +++ b/arch/arm64/boot/dts/freescale/imx8mq-pinfunc.h @@ -130,7 +130,7 @@ #define MX8MQ_IOMUXC_SD1_CMD_USDHC1_CMD 0x0A4 0x30C 0x000 0x0 0x0 #define MX8MQ_IOMUXC_SD1_CMD_GPIO2_IO1 0x0A4 0x30C 0x000 0x5 0x0 #define MX8MQ_IOMUXC_SD1_DATA0_USDHC1_DATA0 0x0A8 0x310 0x000 0x0 0x0 -#define MX8MQ_IOMUXC_SD1_DATA0_GPIO2_IO2 0x0A8 0x31 0x000 0x5 0x0 +#define MX8MQ_IOMUXC_SD1_DATA0_GPIO2_IO2 0x0A8 0x310 0x000 0x5 0x0 #define MX8MQ_IOMUXC_SD1_DATA1_USDHC1_DATA1 0x0AC 0x314 0x000 0x0 0x0 #define MX8MQ_IOMUXC_SD1_DATA1_GPIO2_IO3 0x0AC 0x314 0x000 0x5 0x0 #define MX8MQ_IOMUXC_SD1_DATA2_USDHC1_DATA2 0x0B0 0x318 0x000 0x0 0x0
From: Steffen Klassert steffen.klassert@secunet.com
[ Upstream commit c7dbf4c08868d9db89b8bfe8f8245ca61b01ed2f ]
Commit 94579ac3f6d0 ("xfrm: Fix double ESP trailer insertion in IPsec crypto offload.") added a XFRM_XMIT flag to avoid duplicate ESP trailer insertion on HW offload. This flag is set on the secpath that is shared amongst segments. This lead to a situation where some segments are not transformed correctly when segmentation happens at layer 3.
Fix this by using private skb extensions for segmented and hw offloaded ESP packets.
Fixes: 94579ac3f6d0 ("xfrm: Fix double ESP trailer insertion in IPsec crypto offload.") Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/esp4_offload.c | 11 ++++++++++- net/ipv6/esp6_offload.c | 11 ++++++++++- net/xfrm/xfrm_device.c | 2 -- 3 files changed, 20 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/esp4_offload.c b/net/ipv4/esp4_offload.c index d5c0f5a2a551..5aa7344dbec7 100644 --- a/net/ipv4/esp4_offload.c +++ b/net/ipv4/esp4_offload.c @@ -314,8 +314,17 @@ static int esp_xmit(struct xfrm_state *x, struct sk_buff *skb, netdev_features_ ip_hdr(skb)->tot_len = htons(skb->len); ip_send_check(ip_hdr(skb));
- if (hw_offload) + if (hw_offload) { + if (!skb_ext_add(skb, SKB_EXT_SEC_PATH)) + return -ENOMEM; + + xo = xfrm_offload(skb); + if (!xo) + return -EINVAL; + + xo->flags |= XFRM_XMIT; return 0; + }
err = esp_output_tail(x, skb, &esp); if (err) diff --git a/net/ipv6/esp6_offload.c b/net/ipv6/esp6_offload.c index f35203ab39f5..4af56affaafd 100644 --- a/net/ipv6/esp6_offload.c +++ b/net/ipv6/esp6_offload.c @@ -348,8 +348,17 @@ static int esp6_xmit(struct xfrm_state *x, struct sk_buff *skb, netdev_features
ipv6_hdr(skb)->payload_len = htons(len);
- if (hw_offload) + if (hw_offload) { + if (!skb_ext_add(skb, SKB_EXT_SEC_PATH)) + return -ENOMEM; + + xo = xfrm_offload(skb); + if (!xo) + return -EINVAL; + + xo->flags |= XFRM_XMIT; return 0; + }
err = esp6_output_tail(x, skb, &esp); if (err) diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c index edf11893dbe8..6d6917b68856 100644 --- a/net/xfrm/xfrm_device.c +++ b/net/xfrm/xfrm_device.c @@ -134,8 +134,6 @@ struct sk_buff *validate_xmit_xfrm(struct sk_buff *skb, netdev_features_t featur return skb; }
- xo->flags |= XFRM_XMIT; - if (skb_is_gso(skb) && unlikely(x->xso.dev != dev)) { struct sk_buff *segs;
From: Oliver Hartkopp socketcan@hartkopp.net
[ Upstream commit 9e9714742fb70467464359693a73b911a630226f ]
Since commit f5223e9eee65 ("can: extend sockaddr_can to include j1939 members") the sockaddr_can has been extended in size and a new CAN_REQUIRED_SIZE macro has been introduced to calculate the protocol specific needed size.
The ABI for the msg_name and msg_namelen has not been adapted to the new CAN_REQUIRED_SIZE macro for the other CAN protocols which leads to a problem when an existing binary reads the (increased) struct sockaddr_can in msg_name.
Fixes: f5223e9eee65 ("can: extend sockaddr_can to include j1939 members") Reported-by: Richard Weinberger richard@nod.at Tested-by: Richard Weinberger richard@nod.at Acked-by: Kurt Van Dijck dev.kurt@vandijck-laurijssen.be Link: https://lore.kernel.org/linux-can/1135648123.112255.1616613706554.JavaMail.z... Link: https://lore.kernel.org/r/20210325125850.1620-1-socketcan@hartkopp.net Signed-off-by: Oliver Hartkopp socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/can/bcm.c | 10 ++++++---- net/can/raw.c | 14 ++++++++------ 2 files changed, 14 insertions(+), 10 deletions(-)
diff --git a/net/can/bcm.c b/net/can/bcm.c index 0e5c37be4a2b..909b9e684e04 100644 --- a/net/can/bcm.c +++ b/net/can/bcm.c @@ -86,6 +86,8 @@ MODULE_LICENSE("Dual BSD/GPL"); MODULE_AUTHOR("Oliver Hartkopp oliver.hartkopp@volkswagen.de"); MODULE_ALIAS("can-proto-2");
+#define BCM_MIN_NAMELEN CAN_REQUIRED_SIZE(struct sockaddr_can, can_ifindex) + /* * easy access to the first 64 bit of can(fd)_frame payload. cp->data is * 64 bit aligned so the offset has to be multiples of 8 which is ensured @@ -1292,7 +1294,7 @@ static int bcm_sendmsg(struct socket *sock, struct msghdr *msg, size_t size) /* no bound device as default => check msg_name */ DECLARE_SOCKADDR(struct sockaddr_can *, addr, msg->msg_name);
- if (msg->msg_namelen < CAN_REQUIRED_SIZE(*addr, can_ifindex)) + if (msg->msg_namelen < BCM_MIN_NAMELEN) return -EINVAL;
if (addr->can_family != AF_CAN) @@ -1534,7 +1536,7 @@ static int bcm_connect(struct socket *sock, struct sockaddr *uaddr, int len, struct net *net = sock_net(sk); int ret = 0;
- if (len < CAN_REQUIRED_SIZE(*addr, can_ifindex)) + if (len < BCM_MIN_NAMELEN) return -EINVAL;
lock_sock(sk); @@ -1616,8 +1618,8 @@ static int bcm_recvmsg(struct socket *sock, struct msghdr *msg, size_t size, sock_recv_ts_and_drops(msg, sk, skb);
if (msg->msg_name) { - __sockaddr_check_size(sizeof(struct sockaddr_can)); - msg->msg_namelen = sizeof(struct sockaddr_can); + __sockaddr_check_size(BCM_MIN_NAMELEN); + msg->msg_namelen = BCM_MIN_NAMELEN; memcpy(msg->msg_name, skb->cb, msg->msg_namelen); }
diff --git a/net/can/raw.c b/net/can/raw.c index 6ec8aa1d0da4..95113b0898b2 100644 --- a/net/can/raw.c +++ b/net/can/raw.c @@ -60,6 +60,8 @@ MODULE_LICENSE("Dual BSD/GPL"); MODULE_AUTHOR("Urs Thuermann urs.thuermann@volkswagen.de"); MODULE_ALIAS("can-proto-1");
+#define RAW_MIN_NAMELEN CAN_REQUIRED_SIZE(struct sockaddr_can, can_ifindex) + #define MASK_ALL 0
/* A raw socket has a list of can_filters attached to it, each receiving @@ -394,7 +396,7 @@ static int raw_bind(struct socket *sock, struct sockaddr *uaddr, int len) int err = 0; int notify_enetdown = 0;
- if (len < CAN_REQUIRED_SIZE(*addr, can_ifindex)) + if (len < RAW_MIN_NAMELEN) return -EINVAL; if (addr->can_family != AF_CAN) return -EINVAL; @@ -475,11 +477,11 @@ static int raw_getname(struct socket *sock, struct sockaddr *uaddr, if (peer) return -EOPNOTSUPP;
- memset(addr, 0, sizeof(*addr)); + memset(addr, 0, RAW_MIN_NAMELEN); addr->can_family = AF_CAN; addr->can_ifindex = ro->ifindex;
- return sizeof(*addr); + return RAW_MIN_NAMELEN; }
static int raw_setsockopt(struct socket *sock, int level, int optname, @@ -731,7 +733,7 @@ static int raw_sendmsg(struct socket *sock, struct msghdr *msg, size_t size) if (msg->msg_name) { DECLARE_SOCKADDR(struct sockaddr_can *, addr, msg->msg_name);
- if (msg->msg_namelen < CAN_REQUIRED_SIZE(*addr, can_ifindex)) + if (msg->msg_namelen < RAW_MIN_NAMELEN) return -EINVAL;
if (addr->can_family != AF_CAN) @@ -824,8 +826,8 @@ static int raw_recvmsg(struct socket *sock, struct msghdr *msg, size_t size, sock_recv_ts_and_drops(msg, sk, skb);
if (msg->msg_name) { - __sockaddr_check_size(sizeof(struct sockaddr_can)); - msg->msg_namelen = sizeof(struct sockaddr_can); + __sockaddr_check_size(RAW_MIN_NAMELEN); + msg->msg_namelen = RAW_MIN_NAMELEN; memcpy(msg->msg_name, skb->cb, msg->msg_namelen); }
From: Oliver Hartkopp socketcan@hartkopp.net
[ Upstream commit f522d9559b07854c231cf8f0b8cb5a3578f8b44e ]
Since commit f5223e9eee65 ("can: extend sockaddr_can to include j1939 members") the sockaddr_can has been extended in size and a new CAN_REQUIRED_SIZE macro has been introduced to calculate the protocol specific needed size.
The ABI for the msg_name and msg_namelen has not been adapted to the new CAN_REQUIRED_SIZE macro for the other CAN protocols which leads to a problem when an existing binary reads the (increased) struct sockaddr_can in msg_name.
Fixes: e057dd3fc20f ("can: add ISO 15765-2:2016 transport protocol") Reported-by: Richard Weinberger richard@nod.at Acked-by: Kurt Van Dijck dev.kurt@vandijck-laurijssen.be Link: https://lore.kernel.org/linux-can/1135648123.112255.1616613706554.JavaMail.z... Link: https://lore.kernel.org/r/20210325125850.1620-2-socketcan@hartkopp.net Signed-off-by: Oliver Hartkopp socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/can/isotp.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/net/can/isotp.c b/net/can/isotp.c index ea1e227b8e54..d5780ab29e09 100644 --- a/net/can/isotp.c +++ b/net/can/isotp.c @@ -77,6 +77,8 @@ MODULE_LICENSE("Dual BSD/GPL"); MODULE_AUTHOR("Oliver Hartkopp socketcan@hartkopp.net"); MODULE_ALIAS("can-proto-6");
+#define ISOTP_MIN_NAMELEN CAN_REQUIRED_SIZE(struct sockaddr_can, can_addr.tp) + #define SINGLE_MASK(id) (((id) & CAN_EFF_FLAG) ? \ (CAN_EFF_MASK | CAN_EFF_FLAG | CAN_RTR_FLAG) : \ (CAN_SFF_MASK | CAN_EFF_FLAG | CAN_RTR_FLAG)) @@ -981,7 +983,8 @@ static int isotp_recvmsg(struct socket *sock, struct msghdr *msg, size_t size, sock_recv_timestamp(msg, sk, skb);
if (msg->msg_name) { - msg->msg_namelen = sizeof(struct sockaddr_can); + __sockaddr_check_size(ISOTP_MIN_NAMELEN); + msg->msg_namelen = ISOTP_MIN_NAMELEN; memcpy(msg->msg_name, skb->cb, msg->msg_namelen); }
@@ -1050,7 +1053,7 @@ static int isotp_bind(struct socket *sock, struct sockaddr *uaddr, int len) int err = 0; int notify_enetdown = 0;
- if (len < CAN_REQUIRED_SIZE(struct sockaddr_can, can_addr.tp)) + if (len < ISOTP_MIN_NAMELEN) return -EINVAL;
if (addr->can_addr.tp.rx_id == addr->can_addr.tp.tx_id) @@ -1136,13 +1139,13 @@ static int isotp_getname(struct socket *sock, struct sockaddr *uaddr, int peer) if (peer) return -EOPNOTSUPP;
- memset(addr, 0, sizeof(*addr)); + memset(addr, 0, ISOTP_MIN_NAMELEN); addr->can_family = AF_CAN; addr->can_ifindex = so->ifindex; addr->can_addr.tp.rx_id = so->rxid; addr->can_addr.tp.tx_id = so->txid;
- return sizeof(*addr); + return ISOTP_MIN_NAMELEN; }
static int isotp_setsockopt(struct socket *sock, int level, int optname,
From: Ido Schimmel idosch@nvidia.com
[ Upstream commit 66167c310deb4ac1725f81004fb4b504676ad0bf ]
Cited commit changed the behavior of the software data path with regards to the ECN marking of decapsulated packets. However, the commit did not change other callers of __INET_ECN_decapsulate(), namely mlxsw. The driver is using the function in order to ensure that the hardware and software data paths act the same with regards to the ECN marking of decapsulated packets.
The discrepancy was uncovered by commit 5aa3c334a449 ("selftests: forwarding: vxlan_bridge_1d: Fix vxlan ecn decapsulate value") that aligned the selftest to the new behavior. Without this patch the selftest passes when used with veth pairs, but fails when used with mlxsw netdevs.
Fix this by instructing the device to propagate the ECT(1) mark from the outer header to the inner header when the inner header is ECT(0), for both NVE and IP-in-IP tunnels.
A helper is added in order not to duplicate the code between both tunnel types.
Fixes: b723748750ec ("tunnel: Propagate ECT(1) when decapsulating as recommended by RFC6040") Signed-off-by: Ido Schimmel idosch@nvidia.com Reviewed-by: Petr Machata petrm@nvidia.com Acked-by: Toke Høiland-Jørgensen toke@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlxsw/spectrum.h | 15 +++++++++++++++ .../net/ethernet/mellanox/mlxsw/spectrum_ipip.c | 7 +++---- .../net/ethernet/mellanox/mlxsw/spectrum_nve.c | 7 +++---- 3 files changed, 21 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h index 74b3959b36d4..3e7576e671df 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h @@ -20,6 +20,7 @@ #include <net/red.h> #include <net/vxlan.h> #include <net/flow_offload.h> +#include <net/inet_ecn.h>
#include "port.h" #include "core.h" @@ -345,6 +346,20 @@ struct mlxsw_sp_port_type_speed_ops { u32 (*ptys_proto_cap_masked_get)(u32 eth_proto_cap); };
+static inline u8 mlxsw_sp_tunnel_ecn_decap(u8 outer_ecn, u8 inner_ecn, + bool *trap_en) +{ + bool set_ce = false; + + *trap_en = !!__INET_ECN_decapsulate(outer_ecn, inner_ecn, &set_ce); + if (set_ce) + return INET_ECN_CE; + else if (outer_ecn == INET_ECN_ECT_1 && inner_ecn == INET_ECN_ECT_0) + return INET_ECN_ECT_1; + else + return inner_ecn; +} + static inline struct net_device * mlxsw_sp_bridge_vxlan_dev_find(struct net_device *br_dev) { diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.c index a8525992528f..3262a2c15ea7 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.c @@ -371,12 +371,11 @@ static int mlxsw_sp_ipip_ecn_decap_init_one(struct mlxsw_sp *mlxsw_sp, u8 inner_ecn, u8 outer_ecn) { char tidem_pl[MLXSW_REG_TIDEM_LEN]; - bool trap_en, set_ce = false; u8 new_inner_ecn; + bool trap_en;
- trap_en = __INET_ECN_decapsulate(outer_ecn, inner_ecn, &set_ce); - new_inner_ecn = set_ce ? INET_ECN_CE : inner_ecn; - + new_inner_ecn = mlxsw_sp_tunnel_ecn_decap(outer_ecn, inner_ecn, + &trap_en); mlxsw_reg_tidem_pack(tidem_pl, outer_ecn, inner_ecn, new_inner_ecn, trap_en, trap_en ? MLXSW_TRAP_ID_DECAP_ECN0 : 0); return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(tidem), tidem_pl); diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_nve.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_nve.c index 54d3e7dcd303..a2d1b95d1f58 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_nve.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_nve.c @@ -909,12 +909,11 @@ static int __mlxsw_sp_nve_ecn_decap_init(struct mlxsw_sp *mlxsw_sp, u8 inner_ecn, u8 outer_ecn) { char tndem_pl[MLXSW_REG_TNDEM_LEN]; - bool trap_en, set_ce = false; u8 new_inner_ecn; + bool trap_en;
- trap_en = !!__INET_ECN_decapsulate(outer_ecn, inner_ecn, &set_ce); - new_inner_ecn = set_ce ? INET_ECN_CE : inner_ecn; - + new_inner_ecn = mlxsw_sp_tunnel_ecn_decap(outer_ecn, inner_ecn, + &trap_en); mlxsw_reg_tndem_pack(tndem_pl, outer_ecn, inner_ecn, new_inner_ecn, trap_en, trap_en ? MLXSW_TRAP_ID_DECAP_ECN0 : 0); return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(tndem), tndem_pl);
From: Lv Yunlong lyl2019@mail.ustc.edu.cn
[ Upstream commit 63415767a2446136372e777cde5bb351f21ec21d ]
In myri10ge_sw_tso, the skb_list_walk_safe macro will set (curr) = (segs) and (next) = (curr)->next. If status!=0 is true, the memory pointed by curr and segs will be free by dev_kfree_skb_any(curr). But later, the segs is used by segs = segs->next and causes a uaf.
As (next) = (curr)->next, my patch replaces seg->next to next.
Fixes: 536577f36ff7a ("net: myri10ge: use skb_list_walk_safe helper for gso segments") Signed-off-by: Lv Yunlong lyl2019@mail.ustc.edu.cn Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/myricom/myri10ge/myri10ge.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/myricom/myri10ge/myri10ge.c b/drivers/net/ethernet/myricom/myri10ge/myri10ge.c index 1634ca6d4a8f..c84c8bf2bc20 100644 --- a/drivers/net/ethernet/myricom/myri10ge/myri10ge.c +++ b/drivers/net/ethernet/myricom/myri10ge/myri10ge.c @@ -2897,7 +2897,7 @@ static netdev_tx_t myri10ge_sw_tso(struct sk_buff *skb, dev_kfree_skb_any(curr); if (segs != NULL) { curr = segs; - segs = segs->next; + segs = next; curr->next = NULL; dev_kfree_skb_any(segs); }
From: Claudiu Manoil claudiu.manoil@nxp.com
[ Upstream commit bff5b62585123823842833ab20b1c0a7fa437f8c ]
Handle return error code of eth_mac_addr();
Fixes: 3d23a05c75c7 ("gianfar: Enable changing mac addr when if up") Signed-off-by: Claudiu Manoil claudiu.manoil@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/freescale/gianfar.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c index 4fab2ee5bbf5..e4d9c4c640e5 100644 --- a/drivers/net/ethernet/freescale/gianfar.c +++ b/drivers/net/ethernet/freescale/gianfar.c @@ -364,7 +364,11 @@ static void gfar_set_mac_for_addr(struct net_device *dev, int num,
static int gfar_set_mac_addr(struct net_device *dev, void *p) { - eth_mac_addr(dev, p); + int ret; + + ret = eth_mac_addr(dev, p); + if (ret) + return ret;
gfar_set_mac_for_addr(dev, 0, dev->dev_addr);
From: Maxim Kochetkov fido_max@inbox.ru
[ Upstream commit fb6ec87f7229b92baa81b35cbc76f2626d5bfadb ]
If PHY is not available on DSA port (described at devicetree but absent or failed to detect) then kernel prints warning after 3700 secs:
[ 3707.948771] ------------[ cut here ]------------ [ 3707.948784] Type was not set for devlink port. [ 3707.948894] WARNING: CPU: 1 PID: 17 at net/core/devlink.c:8097 0xc083f9d8
We should unregister the devlink port as a user port and re-register it as an unused port before executing "continue" in case of dsa_port_setup error.
Fixes: 86f8b1c01a0a ("net: dsa: Do not make user port errors fatal") Signed-off-by: Maxim Kochetkov fido_max@inbox.ru Reviewed-by: Vladimir Oltean olteanv@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/dsa/dsa2.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c index a04fd637b4cd..3ada338d7e08 100644 --- a/net/dsa/dsa2.c +++ b/net/dsa/dsa2.c @@ -533,8 +533,14 @@ static int dsa_tree_setup_switches(struct dsa_switch_tree *dst)
list_for_each_entry(dp, &dst->ports, list) { err = dsa_port_setup(dp); - if (err) + if (err) { + dsa_port_devlink_teardown(dp); + dp->type = DSA_PORT_TYPE_UNUSED; + err = dsa_port_devlink_setup(dp); + if (err) + goto teardown; continue; + } }
return 0;
From: Rahul Lakkireddy rahul.lakkireddy@chelsio.com
[ Upstream commit 1bfb3dea965ff9f6226fd1709338f227363b6061 ]
Accessing SGE_QBASE_MAP[0-3] and SGE_QBASE_INDEX registers can lead to SGE missing doorbells under heavy traffic. So, only collect them when adapter is idle. Also update the regdump range to skip collecting these registers.
Fixes: 80a95a80d358 ("cxgb4: collect SGE PF/VF queue map") Signed-off-by: Rahul Lakkireddy rahul.lakkireddy@chelsio.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/chelsio/cxgb4/cudbg_lib.c | 23 +++++++++++++++---- drivers/net/ethernet/chelsio/cxgb4/t4_hw.c | 3 ++- 2 files changed, 21 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c index 75474f810249..c5b0e725b238 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c @@ -1794,11 +1794,25 @@ int cudbg_collect_sge_indirect(struct cudbg_init *pdbg_init, struct cudbg_buffer temp_buff = { 0 }; struct sge_qbase_reg_field *sge_qbase; struct ireg_buf *ch_sge_dbg; + u8 padap_running = 0; int i, rc; + u32 size;
- rc = cudbg_get_buff(pdbg_init, dbg_buff, - sizeof(*ch_sge_dbg) * 2 + sizeof(*sge_qbase), - &temp_buff); + /* Accessing SGE_QBASE_MAP[0-3] and SGE_QBASE_INDEX regs can + * lead to SGE missing doorbells under heavy traffic. So, only + * collect them when adapter is idle. + */ + for_each_port(padap, i) { + padap_running = netif_running(padap->port[i]); + if (padap_running) + break; + } + + size = sizeof(*ch_sge_dbg) * 2; + if (!padap_running) + size += sizeof(*sge_qbase); + + rc = cudbg_get_buff(pdbg_init, dbg_buff, size, &temp_buff); if (rc) return rc;
@@ -1820,7 +1834,8 @@ int cudbg_collect_sge_indirect(struct cudbg_init *pdbg_init, ch_sge_dbg++; }
- if (CHELSIO_CHIP_VERSION(padap->params.chip) > CHELSIO_T5) { + if (CHELSIO_CHIP_VERSION(padap->params.chip) > CHELSIO_T5 && + !padap_running) { sge_qbase = (struct sge_qbase_reg_field *)ch_sge_dbg; /* 1 addr reg SGE_QBASE_INDEX and 4 data reg * SGE_QBASE_MAP[0-3] diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c index 98d01a7497ec..581670dced6e 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c +++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c @@ -2090,7 +2090,8 @@ void t4_get_regs(struct adapter *adap, void *buf, size_t buf_size) 0x1190, 0x1194, 0x11a0, 0x11a4, 0x11b0, 0x11b4, - 0x11fc, 0x1274, + 0x11fc, 0x123c, + 0x1254, 0x1274, 0x1280, 0x133c, 0x1800, 0x18fc, 0x3000, 0x302c,
From: Lv Yunlong lyl2019@mail.ustc.edu.cn
[ Upstream commit 6bf24dc0cc0cc43b29ba344b66d78590e687e046 ]
In the if(skb_peek(arrvq) == skb) branch, it calls __skb_dequeue(arrvq) to get the skb by skb = skb_peek(arrvq). Then __skb_dequeue() unlinks the skb from arrvq and returns the skb which equals to skb_peek(arrvq). After __skb_dequeue(arrvq) finished, the skb is freed by kfree_skb(__skb_dequeue(arrvq)) in the first time.
Unfortunately, the same skb is freed in the second time by kfree_skb(skb) after the branch completed.
My patch removes kfree_skb() in the if(skb_peek(arrvq) == skb) branch, because this skb will be freed by kfree_skb(skb) finally.
Fixes: cb1b728096f54 ("tipc: eliminate race condition at multicast reception") Signed-off-by: Lv Yunlong lyl2019@mail.ustc.edu.cn Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/tipc/socket.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/tipc/socket.c b/net/tipc/socket.c index e795a8a2955b..5b18c6a46cfb 100644 --- a/net/tipc/socket.c +++ b/net/tipc/socket.c @@ -1244,7 +1244,7 @@ void tipc_sk_mcast_rcv(struct net *net, struct sk_buff_head *arrvq, spin_lock_bh(&inputq->lock); if (skb_peek(arrvq) == skb) { skb_queue_splice_tail_init(&tmpq, inputq); - kfree_skb(__skb_dequeue(arrvq)); + __skb_dequeue(arrvq); } spin_unlock_bh(&inputq->lock); __skb_queue_purge(&tmpq);
From: Stefan Riedmueller s.riedmueller@phytec.de
[ Upstream commit f57011e72f5fe0421ec7a812beb1b57bdf4bb47f ]
Setting the vmmc supplies is crucial since otherwise the supplying regulators get disabled and the SD interfaces are no longer powered which leads to system failures if the system is booted from that SD interface.
Fixes: 1e44d3f880d5 ("ARM i.MX6Q: dts: Enable I2C1 with EEPROM and PMIC on Phytec phyFLEX-i.MX6 Ouad module") Signed-off-by: Stefan Riedmueller s.riedmueller@phytec.de Reviewed-by: Fabio Estevam festevam@gmail.com Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/imx6qdl-phytec-pfla02.dtsi | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/arch/arm/boot/dts/imx6qdl-phytec-pfla02.dtsi b/arch/arm/boot/dts/imx6qdl-phytec-pfla02.dtsi index e361df26a168..5f84e9f2b576 100644 --- a/arch/arm/boot/dts/imx6qdl-phytec-pfla02.dtsi +++ b/arch/arm/boot/dts/imx6qdl-phytec-pfla02.dtsi @@ -432,6 +432,7 @@ pinctrl-0 = <&pinctrl_usdhc2>; cd-gpios = <&gpio1 4 GPIO_ACTIVE_LOW>; wp-gpios = <&gpio1 2 GPIO_ACTIVE_HIGH>; + vmmc-supply = <&vdd_sd1_reg>; status = "disabled"; };
@@ -441,5 +442,6 @@ &pinctrl_usdhc3_cdwp>; cd-gpios = <&gpio1 27 GPIO_ACTIVE_LOW>; wp-gpios = <&gpio1 29 GPIO_ACTIVE_HIGH>; + vmmc-supply = <&vdd_sd0_reg>; status = "disabled"; };
From: Milton Miller miltonm@us.ibm.com
[ Upstream commit 03cb4d05b4ea9a3491674ca40952adb708d549fa ]
Calling ncsi_stop_channel_monitor from channel_monitor is a guaranteed deadlock on SMP because stop calls del_timer_sync on the timer that invoked channel_monitor as its timer function.
Recognise the inherent race of marking the monitor disabled before deleting the timer by just returning if enable was cleared. After a timeout (the default case -- reset to START when response received) just mark the monitor.enabled false.
If the channel has an entry on the channel_queue list, or if the state is not ACTIVE or INACTIVE, then warn and mark the timer stopped and don't restart, as the locking is broken somehow.
Fixes: 0795fb2021f0 ("net/ncsi: Stop monitor if channel times out or is inactive") Signed-off-by: Milton Miller miltonm@us.ibm.com Signed-off-by: Eddie James eajames@linux.ibm.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/ncsi/ncsi-manage.c | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c index a9cb355324d1..ffff8da707b8 100644 --- a/net/ncsi/ncsi-manage.c +++ b/net/ncsi/ncsi-manage.c @@ -105,13 +105,20 @@ static void ncsi_channel_monitor(struct timer_list *t) monitor_state = nc->monitor.state; spin_unlock_irqrestore(&nc->lock, flags);
- if (!enabled || chained) { - ncsi_stop_channel_monitor(nc); - return; - } + if (!enabled) + return; /* expected race disabling timer */ + if (WARN_ON_ONCE(chained)) + goto bad_state; + if (state != NCSI_CHANNEL_INACTIVE && state != NCSI_CHANNEL_ACTIVE) { - ncsi_stop_channel_monitor(nc); +bad_state: + netdev_warn(ndp->ndev.dev, + "Bad NCSI monitor state channel %d 0x%x %s queue\n", + nc->id, state, chained ? "on" : "off"); + spin_lock_irqsave(&nc->lock, flags); + nc->monitor.enabled = false; + spin_unlock_irqrestore(&nc->lock, flags); return; }
@@ -136,10 +143,9 @@ static void ncsi_channel_monitor(struct timer_list *t) ncsi_report_link(ndp, true); ndp->flags |= NCSI_DEV_RESHUFFLE;
- ncsi_stop_channel_monitor(nc); - ncm = &nc->modes[NCSI_MODE_LINK]; spin_lock_irqsave(&nc->lock, flags); + nc->monitor.enabled = false; nc->state = NCSI_CHANNEL_INVISIBLE; ncm->data[2] &= ~0x1; spin_unlock_irqrestore(&nc->lock, flags);
From: Loic Poulain loic.poulain@linaro.org
[ Upstream commit 8a03dd925786bdc3834d56ccc980bb70668efa35 ]
qrtr_tx_wait does not check for radix_tree_insert failure, causing the 'flow' object to be unreferenced after qrtr_tx_wait return. Fix that by releasing flow on radix_tree_insert failure.
Fixes: 5fdeb0d372ab ("net: qrtr: Implement outgoing flow control") Reported-by: syzbot+739016799a89c530b32a@syzkaller.appspotmail.com Signed-off-by: Loic Poulain loic.poulain@linaro.org Reviewed-by: Bjorn Andersson bjorn.andersson@linaro.org Reviewed-by: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/qrtr/qrtr.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/qrtr/qrtr.c b/net/qrtr/qrtr.c index 45fbf5f4dcd2..93a7edcff11e 100644 --- a/net/qrtr/qrtr.c +++ b/net/qrtr/qrtr.c @@ -266,7 +266,10 @@ static int qrtr_tx_wait(struct qrtr_node *node, int dest_node, int dest_port, flow = kzalloc(sizeof(*flow), GFP_KERNEL); if (flow) { init_waitqueue_head(&flow->resume_tx); - radix_tree_insert(&node->qrtr_tx_flow, key, flow); + if (radix_tree_insert(&node->qrtr_tx_flow, key, flow)) { + kfree(flow); + flow = NULL; + } } } mutex_unlock(&node->qrtr_tx_lock);
From: Yinjun Zhang yinjun.zhang@corigine.com
[ Upstream commit 2ea538dbee1c79f6f6c24a6f2f82986e4b7ccb78 ]
A merge hint message needs some time to process before the merged flow actually reaches the firmware, during which we may get duplicate merge hints if there're more than one packet that hit the pre-merged flow. And processing duplicate merge hints will cost extra host_ctx's which are a limited resource.
Avoid the duplicate merge by using hash table to store the sub_flows to be merged.
Fixes: 8af56f40e53b ("nfp: flower: offload merge flows") Signed-off-by: Yinjun Zhang yinjun.zhang@corigine.com Signed-off-by: Louis Peens louis.peens@corigine.com Signed-off-by: Simon Horman simon.horman@netronome.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/netronome/nfp/flower/main.h | 8 ++++ .../ethernet/netronome/nfp/flower/metadata.c | 16 ++++++- .../ethernet/netronome/nfp/flower/offload.c | 48 ++++++++++++++++++- 3 files changed, 69 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/flower/main.h b/drivers/net/ethernet/netronome/nfp/flower/main.h index caf12eec9945..56833a41f3d2 100644 --- a/drivers/net/ethernet/netronome/nfp/flower/main.h +++ b/drivers/net/ethernet/netronome/nfp/flower/main.h @@ -190,6 +190,7 @@ struct nfp_fl_internal_ports { * @qos_rate_limiters: Current active qos rate limiters * @qos_stats_lock: Lock on qos stats updates * @pre_tun_rule_cnt: Number of pre-tunnel rules offloaded + * @merge_table: Hash table to store merged flows */ struct nfp_flower_priv { struct nfp_app *app; @@ -223,6 +224,7 @@ struct nfp_flower_priv { unsigned int qos_rate_limiters; spinlock_t qos_stats_lock; /* Protect the qos stats */ int pre_tun_rule_cnt; + struct rhashtable merge_table; };
/** @@ -350,6 +352,12 @@ struct nfp_fl_payload_link { };
extern const struct rhashtable_params nfp_flower_table_params; +extern const struct rhashtable_params merge_table_params; + +struct nfp_merge_info { + u64 parent_ctx; + struct rhash_head ht_node; +};
struct nfp_fl_stats_frame { __be32 stats_con_id; diff --git a/drivers/net/ethernet/netronome/nfp/flower/metadata.c b/drivers/net/ethernet/netronome/nfp/flower/metadata.c index aa06fcb38f8b..327bb56b3ef5 100644 --- a/drivers/net/ethernet/netronome/nfp/flower/metadata.c +++ b/drivers/net/ethernet/netronome/nfp/flower/metadata.c @@ -490,6 +490,12 @@ const struct rhashtable_params nfp_flower_table_params = { .automatic_shrinking = true, };
+const struct rhashtable_params merge_table_params = { + .key_offset = offsetof(struct nfp_merge_info, parent_ctx), + .head_offset = offsetof(struct nfp_merge_info, ht_node), + .key_len = sizeof(u64), +}; + int nfp_flower_metadata_init(struct nfp_app *app, u64 host_ctx_count, unsigned int host_num_mems) { @@ -506,6 +512,10 @@ int nfp_flower_metadata_init(struct nfp_app *app, u64 host_ctx_count, if (err) goto err_free_flow_table;
+ err = rhashtable_init(&priv->merge_table, &merge_table_params); + if (err) + goto err_free_stats_ctx_table; + get_random_bytes(&priv->mask_id_seed, sizeof(priv->mask_id_seed));
/* Init ring buffer and unallocated mask_ids. */ @@ -513,7 +523,7 @@ int nfp_flower_metadata_init(struct nfp_app *app, u64 host_ctx_count, kmalloc_array(NFP_FLOWER_MASK_ENTRY_RS, NFP_FLOWER_MASK_ELEMENT_RS, GFP_KERNEL); if (!priv->mask_ids.mask_id_free_list.buf) - goto err_free_stats_ctx_table; + goto err_free_merge_table;
priv->mask_ids.init_unallocated = NFP_FLOWER_MASK_ENTRY_RS - 1;
@@ -550,6 +560,8 @@ err_free_last_used: kfree(priv->mask_ids.last_used); err_free_mask_id: kfree(priv->mask_ids.mask_id_free_list.buf); +err_free_merge_table: + rhashtable_destroy(&priv->merge_table); err_free_stats_ctx_table: rhashtable_destroy(&priv->stats_ctx_table); err_free_flow_table: @@ -568,6 +580,8 @@ void nfp_flower_metadata_cleanup(struct nfp_app *app) nfp_check_rhashtable_empty, NULL); rhashtable_free_and_destroy(&priv->stats_ctx_table, nfp_check_rhashtable_empty, NULL); + rhashtable_free_and_destroy(&priv->merge_table, + nfp_check_rhashtable_empty, NULL); kvfree(priv->stats); kfree(priv->mask_ids.mask_id_free_list.buf); kfree(priv->mask_ids.last_used); diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/drivers/net/ethernet/netronome/nfp/flower/offload.c index d72225d64a75..e95969c462e4 100644 --- a/drivers/net/ethernet/netronome/nfp/flower/offload.c +++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c @@ -1009,6 +1009,8 @@ int nfp_flower_merge_offloaded_flows(struct nfp_app *app, struct netlink_ext_ack *extack = NULL; struct nfp_fl_payload *merge_flow; struct nfp_fl_key_ls merge_key_ls; + struct nfp_merge_info *merge_info; + u64 parent_ctx = 0; int err;
ASSERT_RTNL(); @@ -1019,6 +1021,15 @@ int nfp_flower_merge_offloaded_flows(struct nfp_app *app, nfp_flower_is_merge_flow(sub_flow2)) return -EINVAL;
+ /* check if the two flows are already merged */ + parent_ctx = (u64)(be32_to_cpu(sub_flow1->meta.host_ctx_id)) << 32; + parent_ctx |= (u64)(be32_to_cpu(sub_flow2->meta.host_ctx_id)); + if (rhashtable_lookup_fast(&priv->merge_table, + &parent_ctx, merge_table_params)) { + nfp_flower_cmsg_warn(app, "The two flows are already merged.\n"); + return 0; + } + err = nfp_flower_can_merge(sub_flow1, sub_flow2); if (err) return err; @@ -1060,16 +1071,33 @@ int nfp_flower_merge_offloaded_flows(struct nfp_app *app, if (err) goto err_release_metadata;
+ merge_info = kmalloc(sizeof(*merge_info), GFP_KERNEL); + if (!merge_info) { + err = -ENOMEM; + goto err_remove_rhash; + } + merge_info->parent_ctx = parent_ctx; + err = rhashtable_insert_fast(&priv->merge_table, &merge_info->ht_node, + merge_table_params); + if (err) + goto err_destroy_merge_info; + err = nfp_flower_xmit_flow(app, merge_flow, NFP_FLOWER_CMSG_TYPE_FLOW_MOD); if (err) - goto err_remove_rhash; + goto err_remove_merge_info;
merge_flow->in_hw = true; sub_flow1->in_hw = false;
return 0;
+err_remove_merge_info: + WARN_ON_ONCE(rhashtable_remove_fast(&priv->merge_table, + &merge_info->ht_node, + merge_table_params)); +err_destroy_merge_info: + kfree(merge_info); err_remove_rhash: WARN_ON_ONCE(rhashtable_remove_fast(&priv->flow_table, &merge_flow->fl_node, @@ -1359,7 +1387,9 @@ nfp_flower_remove_merge_flow(struct nfp_app *app, { struct nfp_flower_priv *priv = app->priv; struct nfp_fl_payload_link *link, *temp; + struct nfp_merge_info *merge_info; struct nfp_fl_payload *origin; + u64 parent_ctx = 0; bool mod = false; int err;
@@ -1396,8 +1426,22 @@ nfp_flower_remove_merge_flow(struct nfp_app *app, err_free_links: /* Clean any links connected with the merged flow. */ list_for_each_entry_safe(link, temp, &merge_flow->linked_flows, - merge_flow.list) + merge_flow.list) { + u32 ctx_id = be32_to_cpu(link->sub_flow.flow->meta.host_ctx_id); + + parent_ctx = (parent_ctx << 32) | (u64)(ctx_id); nfp_flower_unlink_flow(link); + } + + merge_info = rhashtable_lookup_fast(&priv->merge_table, + &parent_ctx, + merge_table_params); + if (merge_info) { + WARN_ON_ONCE(rhashtable_remove_fast(&priv->merge_table, + &merge_info->ht_node, + merge_table_params)); + kfree(merge_info); + }
kfree(merge_flow->action_data); kfree(merge_flow->mask_data);
From: Florian Fainelli f.fainelli@gmail.com
[ Upstream commit c056d480b40a68f2520ccc156c7fae672d69d57d ]
We should not be advertising EEE for modes that we do not support, correct that oversight by looking at the PHY device supported linkmodes.
Fixes: 99cec8a4dda2 ("net: phy: broadcom: Allow enabling or disabling of EEE") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/bcm-phy-lib.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/net/phy/bcm-phy-lib.c b/drivers/net/phy/bcm-phy-lib.c index ef6825b30323..b72e11ca974f 100644 --- a/drivers/net/phy/bcm-phy-lib.c +++ b/drivers/net/phy/bcm-phy-lib.c @@ -328,7 +328,7 @@ EXPORT_SYMBOL_GPL(bcm_phy_enable_apd);
int bcm_phy_set_eee(struct phy_device *phydev, bool enable) { - int val; + int val, mask = 0;
/* Enable EEE at PHY level */ val = phy_read_mmd(phydev, MDIO_MMD_AN, BRCM_CL45VEN_EEE_CONTROL); @@ -347,10 +347,17 @@ int bcm_phy_set_eee(struct phy_device *phydev, bool enable) if (val < 0) return val;
+ if (linkmode_test_bit(ETHTOOL_LINK_MODE_1000baseT_Full_BIT, + phydev->supported)) + mask |= MDIO_EEE_1000T; + if (linkmode_test_bit(ETHTOOL_LINK_MODE_100baseT_Full_BIT, + phydev->supported)) + mask |= MDIO_EEE_100TX; + if (enable) - val |= (MDIO_EEE_100TX | MDIO_EEE_1000T); + val |= mask; else - val &= ~(MDIO_EEE_100TX | MDIO_EEE_1000T); + val &= ~mask;
phy_write_mmd(phydev, MDIO_MMD_AN, BCM_CL45VEN_EEE_ADV, (u32)val);
From: 周琰杰 (Zhou Yanjie) zhouyanjie@wanyeetech.com
[ Upstream commit 942bfbecc0281c75db84f744b9b77b0f2396f484 ]
Only send "X1000_I2C_DC_STOP" when last byte, or it will cause error when I2C write operation which should look like this:
device_addr + w, reg_addr, data;
But without this patch, it looks like this:
device_addr + w, reg_addr, device_addr + w, data;
Fixes: 21575a7a8d4c ("I2C: JZ4780: Add support for the X1000.") Reported-by: 杨文龙 (Yang Wenlong) ywltyut@sina.cn Tested-by: 杨文龙 (Yang Wenlong) ywltyut@sina.cn Signed-off-by: 周琰杰 (Zhou Yanjie) zhouyanjie@wanyeetech.com Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-jz4780.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/i2c/busses/i2c-jz4780.c b/drivers/i2c/busses/i2c-jz4780.c index cb4a25ebb890..2a946c207928 100644 --- a/drivers/i2c/busses/i2c-jz4780.c +++ b/drivers/i2c/busses/i2c-jz4780.c @@ -526,8 +526,8 @@ static irqreturn_t jz4780_i2c_irq(int irqno, void *dev_id) i2c_sta = jz4780_i2c_readw(i2c, JZ4780_I2C_STA); data = *i2c->wbuf; data &= ~JZ4780_I2C_DC_READ; - if ((!i2c->stop_hold) && (i2c->cdata->version >= - ID_X1000)) + if ((i2c->wt_len == 1) && (!i2c->stop_hold) && + (i2c->cdata->version >= ID_X1000)) data |= X1000_I2C_DC_STOP; jz4780_i2c_writew(i2c, JZ4780_I2C_DC, data); i2c->wbuf++;
From: Bastian Germann bage@linutronix.de
[ Upstream commit 7c0d6e482062eb5c06ecccfab340abc523bdca00 ]
card->owner is a required property and since commit 81033c6b584b ("ALSA: core: Warn on empty module") a warning is issued if it is empty. Add it. This fixes following warning observed on Lamobo R1:
WARNING: CPU: 1 PID: 190 at sound/core/init.c:207 snd_card_new+0x430/0x480 [snd] Modules linked in: sun4i_codec(E+) sun4i_backend(E+) snd_soc_core(E) ... CPU: 1 PID: 190 Comm: systemd-udevd Tainted: G C E 5.10.0-1-armmp #1 Debian 5.10.4-1 Hardware name: Allwinner sun7i (A20) Family Call trace: (snd_card_new [snd]) (snd_soc_bind_card [snd_soc_core]) (snd_soc_register_card [snd_soc_core]) (sun4i_codec_probe [sun4i_codec])
Fixes: 45fb6b6f2aa3 ("ASoC: sunxi: add support for the on-chip codec on early Allwinner SoCs") Related: commit 3c27ea23ffb4 ("ASoC: qcom: Set card->owner to avoid warnings") Related: commit ec653df2a0cb ("drm/vc4/vc4_hdmi: fill ASoC card owner") Cc: linux-arm-kernel@lists.infradead.org Cc: alsa-devel@alsa-project.org Signed-off-by: Bastian Germann bage@linutronix.de Link: https://lore.kernel.org/r/20210331151843.30583-1-bage@linutronix.de Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/sunxi/sun4i-codec.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/sound/soc/sunxi/sun4i-codec.c b/sound/soc/sunxi/sun4i-codec.c index 6c13cc84b3fb..2173991c13db 100644 --- a/sound/soc/sunxi/sun4i-codec.c +++ b/sound/soc/sunxi/sun4i-codec.c @@ -1364,6 +1364,7 @@ static struct snd_soc_card *sun4i_codec_create_card(struct device *dev) return ERR_PTR(-ENOMEM);
card->dev = dev; + card->owner = THIS_MODULE; card->name = "sun4i-codec"; card->dapm_widgets = sun4i_codec_card_dapm_widgets; card->num_dapm_widgets = ARRAY_SIZE(sun4i_codec_card_dapm_widgets); @@ -1396,6 +1397,7 @@ static struct snd_soc_card *sun6i_codec_create_card(struct device *dev) return ERR_PTR(-ENOMEM);
card->dev = dev; + card->owner = THIS_MODULE; card->name = "A31 Audio Codec"; card->dapm_widgets = sun6i_codec_card_dapm_widgets; card->num_dapm_widgets = ARRAY_SIZE(sun6i_codec_card_dapm_widgets); @@ -1449,6 +1451,7 @@ static struct snd_soc_card *sun8i_a23_codec_create_card(struct device *dev) return ERR_PTR(-ENOMEM);
card->dev = dev; + card->owner = THIS_MODULE; card->name = "A23 Audio Codec"; card->dapm_widgets = sun6i_codec_card_dapm_widgets; card->num_dapm_widgets = ARRAY_SIZE(sun6i_codec_card_dapm_widgets); @@ -1487,6 +1490,7 @@ static struct snd_soc_card *sun8i_h3_codec_create_card(struct device *dev) return ERR_PTR(-ENOMEM);
card->dev = dev; + card->owner = THIS_MODULE; card->name = "H3 Audio Codec"; card->dapm_widgets = sun6i_codec_card_dapm_widgets; card->num_dapm_widgets = ARRAY_SIZE(sun6i_codec_card_dapm_widgets); @@ -1525,6 +1529,7 @@ static struct snd_soc_card *sun8i_v3s_codec_create_card(struct device *dev) return ERR_PTR(-ENOMEM);
card->dev = dev; + card->owner = THIS_MODULE; card->name = "V3s Audio Codec"; card->dapm_widgets = sun6i_codec_card_dapm_widgets; card->num_dapm_widgets = ARRAY_SIZE(sun6i_codec_card_dapm_widgets);
From: Ariel Levkovich lariel@nvidia.com
[ Upstream commit d24f847e54214049814b9515771622eaab3f42ab ]
ct_label 0 is a default label each flow has and therefore there can be rules that match on ct_label=0 without a prior rule that set the ct_label to this value.
The ct_label value is not used directly in the HW rules and instead it is mapped to some id within a defined range and this id is used to set and match the metadata register which carries the ct_label.
If we have a rule that matches on ct_label=0, the hw rule will perform matching on a value that is != 0 because of the mapping from label to id. Since the metadata register default value is 0 and it was never set before to anything else by an action that sets the ct_label, there will always be a mismatch between that register and the value in the rule.
To support such rule, a forced mapping of ct_label 0 to id=0 is done so that it will match the metadata register default value of 0.
Fixes: 54b154ecfb8c ("net/mlx5e: CT: Map 128 bits labels to 32 bit map ID") Signed-off-by: Ariel Levkovich lariel@nvidia.com Reviewed-by: Roi Dayan roid@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/mellanox/mlx5/core/en/tc_ct.c | 36 +++++++++++++++---- 1 file changed, 29 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c index b42396df3111..0469f53dfb99 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c @@ -184,6 +184,28 @@ mlx5_tc_ct_entry_has_nat(struct mlx5_ct_entry *entry) return !!(entry->tuple_nat_node.next); }
+static int +mlx5_get_label_mapping(struct mlx5_tc_ct_priv *ct_priv, + u32 *labels, u32 *id) +{ + if (!memchr_inv(labels, 0, sizeof(u32) * 4)) { + *id = 0; + return 0; + } + + if (mapping_add(ct_priv->labels_mapping, labels, id)) + return -EOPNOTSUPP; + + return 0; +} + +static void +mlx5_put_label_mapping(struct mlx5_tc_ct_priv *ct_priv, u32 id) +{ + if (id) + mapping_remove(ct_priv->labels_mapping, id); +} + static int mlx5_tc_ct_rule_to_tuple(struct mlx5_ct_tuple *tuple, struct flow_rule *rule) { @@ -435,7 +457,7 @@ mlx5_tc_ct_entry_del_rule(struct mlx5_tc_ct_priv *ct_priv, mlx5_tc_rule_delete(netdev_priv(ct_priv->netdev), zone_rule->rule, attr); mlx5e_mod_hdr_detach(ct_priv->dev, ct_priv->mod_hdr_tbl, zone_rule->mh); - mapping_remove(ct_priv->labels_mapping, attr->ct_attr.ct_labels_id); + mlx5_put_label_mapping(ct_priv, attr->ct_attr.ct_labels_id); kfree(attr); }
@@ -638,8 +660,8 @@ mlx5_tc_ct_entry_create_mod_hdr(struct mlx5_tc_ct_priv *ct_priv, if (!meta) return -EOPNOTSUPP;
- err = mapping_add(ct_priv->labels_mapping, meta->ct_metadata.labels, - &attr->ct_attr.ct_labels_id); + err = mlx5_get_label_mapping(ct_priv, meta->ct_metadata.labels, + &attr->ct_attr.ct_labels_id); if (err) return -EOPNOTSUPP; if (nat) { @@ -675,7 +697,7 @@ mlx5_tc_ct_entry_create_mod_hdr(struct mlx5_tc_ct_priv *ct_priv,
err_mapping: dealloc_mod_hdr_actions(&mod_acts); - mapping_remove(ct_priv->labels_mapping, attr->ct_attr.ct_labels_id); + mlx5_put_label_mapping(ct_priv, attr->ct_attr.ct_labels_id); return err; }
@@ -743,7 +765,7 @@ mlx5_tc_ct_entry_add_rule(struct mlx5_tc_ct_priv *ct_priv, err_rule: mlx5e_mod_hdr_detach(ct_priv->dev, ct_priv->mod_hdr_tbl, zone_rule->mh); - mapping_remove(ct_priv->labels_mapping, attr->ct_attr.ct_labels_id); + mlx5_put_label_mapping(ct_priv, attr->ct_attr.ct_labels_id); err_mod_hdr: kfree(attr); err_attr: @@ -1198,7 +1220,7 @@ void mlx5_tc_ct_match_del(struct mlx5_tc_ct_priv *priv, struct mlx5_ct_attr *ct_ if (!priv || !ct_attr->ct_labels_id) return;
- mapping_remove(priv->labels_mapping, ct_attr->ct_labels_id); + mlx5_put_label_mapping(priv, ct_attr->ct_labels_id); }
int @@ -1276,7 +1298,7 @@ mlx5_tc_ct_match_add(struct mlx5_tc_ct_priv *priv, ct_labels[1] = key->ct_labels[1] & mask->ct_labels[1]; ct_labels[2] = key->ct_labels[2] & mask->ct_labels[2]; ct_labels[3] = key->ct_labels[3] & mask->ct_labels[3]; - if (mapping_add(priv->labels_mapping, ct_labels, &ct_attr->ct_labels_id)) + if (mlx5_get_label_mapping(priv, ct_labels, &ct_attr->ct_labels_id)) return -EOPNOTSUPP; mlx5e_tc_match_to_reg_match(spec, LABELS_TO_REG, ct_attr->ct_labels_id, MLX5_CT_LABELS_MASK);
From: Aya Levin ayal@nvidia.com
[ Upstream commit 3211434dfe7a66fcf55e43961ea524b78336c04c ]
Use connector_type read from PTYS register when it's valid, based on corresponding capability bit.
Fixes: 5b4793f81745 ("net/mlx5e: Add support for reading connector type from PTYS") Signed-off-by: Aya Levin ayal@nvidia.com Reviewed-by: Eran Ben Elisha eranbe@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/mellanox/mlx5/core/en_ethtool.c | 22 +++++++++---------- 1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c index bcd05457647e..986f0d86e94d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c @@ -744,11 +744,11 @@ static int get_fec_supported_advertised(struct mlx5_core_dev *dev, return 0; }
-static void ptys2ethtool_supported_advertised_port(struct ethtool_link_ksettings *link_ksettings, - u32 eth_proto_cap, - u8 connector_type, bool ext) +static void ptys2ethtool_supported_advertised_port(struct mlx5_core_dev *mdev, + struct ethtool_link_ksettings *link_ksettings, + u32 eth_proto_cap, u8 connector_type) { - if ((!connector_type && !ext) || connector_type >= MLX5E_CONNECTOR_TYPE_NUMBER) { + if (!MLX5_CAP_PCAM_FEATURE(mdev, ptys_connector_type)) { if (eth_proto_cap & (MLX5E_PROT_MASK(MLX5E_10GBASE_CR) | MLX5E_PROT_MASK(MLX5E_10GBASE_SR) | MLX5E_PROT_MASK(MLX5E_40GBASE_CR4) @@ -884,9 +884,9 @@ static int ptys2connector_type[MLX5E_CONNECTOR_TYPE_NUMBER] = { [MLX5E_PORT_OTHER] = PORT_OTHER, };
-static u8 get_connector_port(u32 eth_proto, u8 connector_type, bool ext) +static u8 get_connector_port(struct mlx5_core_dev *mdev, u32 eth_proto, u8 connector_type) { - if ((connector_type || ext) && connector_type < MLX5E_CONNECTOR_TYPE_NUMBER) + if (MLX5_CAP_PCAM_FEATURE(mdev, ptys_connector_type)) return ptys2connector_type[connector_type];
if (eth_proto & @@ -987,11 +987,11 @@ int mlx5e_ethtool_get_link_ksettings(struct mlx5e_priv *priv, data_rate_oper, link_ksettings);
eth_proto_oper = eth_proto_oper ? eth_proto_oper : eth_proto_cap; - - link_ksettings->base.port = get_connector_port(eth_proto_oper, - connector_type, ext); - ptys2ethtool_supported_advertised_port(link_ksettings, eth_proto_admin, - connector_type, ext); + connector_type = connector_type < MLX5E_CONNECTOR_TYPE_NUMBER ? + connector_type : MLX5E_PORT_UNKNOWN; + link_ksettings->base.port = get_connector_port(mdev, eth_proto_oper, connector_type); + ptys2ethtool_supported_advertised_port(mdev, link_ksettings, eth_proto_admin, + connector_type); get_lp_advertising(mdev, eth_proto_lp, link_ksettings);
if (an_status == MLX5_AN_COMPLETE)
From: Daniel Jurgens danielj@mellanox.com
[ Upstream commit a7b76002ae78cd230ee652ccdfedf21aa94fcecc ]
Calculating the number of compeltion EQs based on the number of available IRQ vectors doesn't work now that all async EQs share one IRQ. Thus the max number of EQs can be exceeded on systems with more than approximately 256 CPUs. Take this into account when calculating the number of available completion EQs.
Fixes: 81bfa206032a ("net/mlx5: Use a single IRQ for all async EQs") Signed-off-by: Daniel Jurgens danielj@mellanox.com Reviewed-by: Parav Pandit parav@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/eq.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c index 8ebfe782f95e..ccd53a7a2b80 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c @@ -926,13 +926,24 @@ void mlx5_core_eq_free_irqs(struct mlx5_core_dev *dev) mutex_unlock(&table->lock); }
+#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING +#define MLX5_MAX_ASYNC_EQS 4 +#else +#define MLX5_MAX_ASYNC_EQS 3 +#endif + int mlx5_eq_table_create(struct mlx5_core_dev *dev) { struct mlx5_eq_table *eq_table = dev->priv.eq_table; + int num_eqs = MLX5_CAP_GEN(dev, max_num_eqs) ? + MLX5_CAP_GEN(dev, max_num_eqs) : + 1 << MLX5_CAP_GEN(dev, log_max_eq); int err;
eq_table->num_comp_eqs = - mlx5_irq_get_num_comp(eq_table->irq_table); + min_t(int, + mlx5_irq_get_num_comp(eq_table->irq_table), + num_eqs - MLX5_MAX_ASYNC_EQS);
err = create_async_eqs(dev); if (err) {
From: Lv Yunlong lyl2019@mail.ustc.edu.cn
[ Upstream commit bdc2ab5c61a5c07388f4820ff21e787b4dfd1ced ]
In rds_message_map_pages, the rm is freed by rds_message_put(rm). But rm is still used by rm->data.op_sg in return value.
My patch assigns ERR_CAST(rm->data.op_sg) to err before the rm is freed to avoid the uaf.
Fixes: 7dba92037baf3 ("net/rds: Use ERR_PTR for rds_message_alloc_sgs()") Signed-off-by: Lv Yunlong lyl2019@mail.ustc.edu.cn Reviewed-by: Håkon Bugge haakon.bugge@oracle.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/rds/message.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/rds/message.c b/net/rds/message.c index 071a261fdaab..799034e0f513 100644 --- a/net/rds/message.c +++ b/net/rds/message.c @@ -347,8 +347,9 @@ struct rds_message *rds_message_map_pages(unsigned long *page_addrs, unsigned in rm->data.op_nents = DIV_ROUND_UP(total_len, PAGE_SIZE); rm->data.op_sg = rds_message_alloc_sgs(rm, num_sgs); if (IS_ERR(rm->data.op_sg)) { + void *err = ERR_CAST(rm->data.op_sg); rds_message_put(rm); - return ERR_CAST(rm->data.op_sg); + return err; }
for (i = 0; i < rm->data.op_nents; ++i) {
From: Ong Boon Leong boon.leong.ong@intel.com
[ Upstream commit 622d13694b5f048c01caa7ba548498d9880d4cb0 ]
xdp_return_frame() may be called outside of NAPI context to return xdpf back to page_pool. xdp_return_frame() calls __xdp_return() with napi_direct = false. For page_pool memory model, __xdp_return() calls xdp_return_frame_no_direct() unconditionally and below false negative kernel BUG throw happened under preempt-rt build:
[ 430.450355] BUG: using smp_processor_id() in preemptible [00000000] code: modprobe/3884 [ 430.451678] caller is __xdp_return+0x1ff/0x2e0 [ 430.452111] CPU: 0 PID: 3884 Comm: modprobe Tainted: G U E 5.12.0-rc2+ #45
Changes in v2: - This patch fixes the issue by making xdp_return_frame_no_direct() is only called if napi_direct = true, as recommended for better by Jesper Dangaard Brouer. Thanks!
Fixes: 2539650fadbf ("xdp: Helpers for disabling napi_direct of xdp_return_frame") Signed-off-by: Ong Boon Leong boon.leong.ong@intel.com Acked-by: Jesper Dangaard Brouer brouer@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/xdp.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/core/xdp.c b/net/core/xdp.c index d900cebc0acd..b8d7fa47d293 100644 --- a/net/core/xdp.c +++ b/net/core/xdp.c @@ -349,7 +349,8 @@ static void __xdp_return(void *data, struct xdp_mem_info *mem, bool napi_direct, /* mem->id is valid, checked in xdp_rxq_info_reg_mem_model() */ xa = rhashtable_lookup(mem_id_ht, &mem->id, mem_id_rht_params); page = virt_to_head_page(data); - napi_direct &= !xdp_return_frame_no_direct(); + if (napi_direct && xdp_return_frame_no_direct()) + napi_direct = false; page_pool_put_full_page(xa->page_pool, page, napi_direct); rcu_read_unlock(); break;
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 040f31196e8b2609613f399793b9225271b79471 ]
When building with W=1, gcc points out that the __packed attribute on struct qm_eqcr_entry conflicts with the 8-byte alignment attribute on struct qm_fd inside it:
drivers/soc/fsl/qbman/qman.c:189:1: error: alignment 1 of 'struct qm_eqcr_entry' is less than 8 [-Werror=packed-not-aligned]
I assume that the alignment attribute is the correct one, and that qm_eqcr_entry cannot actually be unaligned in memory, so add the same alignment on the outer struct.
Fixes: c535e923bb97 ("soc/fsl: Introduce DPAA 1.x QMan device driver") Signed-off-by: Arnd Bergmann arnd@arndb.de Link: https://lore.kernel.org/r/20210323131530.2619900-1-arnd@kernel.org' Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/soc/fsl/qbman/qman.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/soc/fsl/qbman/qman.c b/drivers/soc/fsl/qbman/qman.c index 9888a7061873..feb97470699d 100644 --- a/drivers/soc/fsl/qbman/qman.c +++ b/drivers/soc/fsl/qbman/qman.c @@ -186,7 +186,7 @@ struct qm_eqcr_entry { __be32 tag; struct qm_fd fd; u8 __reserved3[32]; -} __packed; +} __packed __aligned(8); #define QM_EQCR_VERB_VBIT 0x80 #define QM_EQCR_VERB_CMD_MASK 0x61 /* but only one value; */ #define QM_EQCR_VERB_CMD_ENQUEUE 0x01
From: Eryk Rybak eryk.roch.rybak@intel.com
[ Upstream commit c3214de929dbf1b7374add8bbed30ce82b197bbb ]
If veb-stats was enabled, the ethtool stats triggered a warning due to invalid size: 'unexpected stat size for veb.tc_%u_tx_packets'. This was due to an incorrect structure definition for the statistics. Structures and functions have been improved in line with requirements for the presentation of statistics, in particular for the functions: 'i40e_add_ethtool_stats' and 'i40e_add_stat_strings'.
Fixes: 1510ae0be2a4 ("i40e: convert VEB TC stats to use an i40e_stats array") Signed-off-by: Eryk Rybak eryk.roch.rybak@intel.com Signed-off-by: Grzegorz Szczurek grzegorzx.szczurek@intel.com Reviewed-by: Aleksandr Loktionov aleksandr.loktionov@intel.com Tested-by: Dave Switzer david.switzer@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/intel/i40e/i40e_ethtool.c | 52 ++++++++++++++++--- 1 file changed, 46 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c index a92fac6f1389..849e38be69ff 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c +++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c @@ -232,6 +232,8 @@ static void __i40e_add_stat_strings(u8 **p, const struct i40e_stats stats[], I40E_STAT(struct i40e_vsi, _name, _stat) #define I40E_VEB_STAT(_name, _stat) \ I40E_STAT(struct i40e_veb, _name, _stat) +#define I40E_VEB_TC_STAT(_name, _stat) \ + I40E_STAT(struct i40e_cp_veb_tc_stats, _name, _stat) #define I40E_PFC_STAT(_name, _stat) \ I40E_STAT(struct i40e_pfc_stats, _name, _stat) #define I40E_QUEUE_STAT(_name, _stat) \ @@ -266,11 +268,18 @@ static const struct i40e_stats i40e_gstrings_veb_stats[] = { I40E_VEB_STAT("veb.rx_unknown_protocol", stats.rx_unknown_protocol), };
+struct i40e_cp_veb_tc_stats { + u64 tc_rx_packets; + u64 tc_rx_bytes; + u64 tc_tx_packets; + u64 tc_tx_bytes; +}; + static const struct i40e_stats i40e_gstrings_veb_tc_stats[] = { - I40E_VEB_STAT("veb.tc_%u_tx_packets", tc_stats.tc_tx_packets), - I40E_VEB_STAT("veb.tc_%u_tx_bytes", tc_stats.tc_tx_bytes), - I40E_VEB_STAT("veb.tc_%u_rx_packets", tc_stats.tc_rx_packets), - I40E_VEB_STAT("veb.tc_%u_rx_bytes", tc_stats.tc_rx_bytes), + I40E_VEB_TC_STAT("veb.tc_%u_tx_packets", tc_tx_packets), + I40E_VEB_TC_STAT("veb.tc_%u_tx_bytes", tc_tx_bytes), + I40E_VEB_TC_STAT("veb.tc_%u_rx_packets", tc_rx_packets), + I40E_VEB_TC_STAT("veb.tc_%u_rx_bytes", tc_rx_bytes), };
static const struct i40e_stats i40e_gstrings_misc_stats[] = { @@ -2217,6 +2226,29 @@ static int i40e_get_sset_count(struct net_device *netdev, int sset) } }
+/** + * i40e_get_veb_tc_stats - copy VEB TC statistics to formatted structure + * @tc: the TC statistics in VEB structure (veb->tc_stats) + * @i: the index of traffic class in (veb->tc_stats) structure to copy + * + * Copy VEB TC statistics from structure of arrays (veb->tc_stats) to + * one dimensional structure i40e_cp_veb_tc_stats. + * Produce formatted i40e_cp_veb_tc_stats structure of the VEB TC + * statistics for the given TC. + **/ +static struct i40e_cp_veb_tc_stats +i40e_get_veb_tc_stats(struct i40e_veb_tc_stats *tc, unsigned int i) +{ + struct i40e_cp_veb_tc_stats veb_tc = { + .tc_rx_packets = tc->tc_rx_packets[i], + .tc_rx_bytes = tc->tc_rx_bytes[i], + .tc_tx_packets = tc->tc_tx_packets[i], + .tc_tx_bytes = tc->tc_tx_bytes[i], + }; + + return veb_tc; +} + /** * i40e_get_pfc_stats - copy HW PFC statistics to formatted structure * @pf: the PF device structure @@ -2301,8 +2333,16 @@ static void i40e_get_ethtool_stats(struct net_device *netdev, i40e_gstrings_veb_stats);
for (i = 0; i < I40E_MAX_TRAFFIC_CLASS; i++) - i40e_add_ethtool_stats(&data, veb_stats ? veb : NULL, - i40e_gstrings_veb_tc_stats); + if (veb_stats) { + struct i40e_cp_veb_tc_stats veb_tc = + i40e_get_veb_tc_stats(&veb->tc_stats, i); + + i40e_add_ethtool_stats(&data, &veb_tc, + i40e_gstrings_veb_tc_stats); + } else { + i40e_add_ethtool_stats(&data, NULL, + i40e_gstrings_veb_tc_stats); + }
i40e_add_ethtool_stats(&data, pf, i40e_gstrings_stats);
From: Md Haris Iqbal haris.iqbal@cloud.ionos.com
[ Upstream commit 7582207b1059129e59eb92026fca2cfc088a74fc ]
KASAN detected the following BUG:
BUG: KASAN: use-after-free in rtrs_clt_update_wc_stats+0x41/0x100 [rtrs_client] Read of size 8 at addr ffff88bf2fb4adc0 by task swapper/0/0
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 5.4.84-pserver #5.4.84-1+feature+linux+5.4.y+dbg+20201216.1319+b6b887b~deb10 Hardware name: Supermicro H8QG6/H8QG6, BIOS 3.00 09/04/2012 Call Trace: <IRQ> dump_stack+0x96/0xe0 print_address_description.constprop.4+0x1f/0x300 ? irq_work_claim+0x2e/0x50 __kasan_report.cold.8+0x78/0x92 ? rtrs_clt_update_wc_stats+0x41/0x100 [rtrs_client] kasan_report+0x10/0x20 rtrs_clt_update_wc_stats+0x41/0x100 [rtrs_client] rtrs_clt_rdma_done+0xb1/0x760 [rtrs_client] ? lockdep_hardirqs_on+0x1a8/0x290 ? process_io_rsp+0xb0/0xb0 [rtrs_client] ? mlx4_ib_destroy_cq+0x100/0x100 [mlx4_ib] ? add_interrupt_randomness+0x1a2/0x340 __ib_process_cq+0x97/0x100 [ib_core] ib_poll_handler+0x41/0xb0 [ib_core] irq_poll_softirq+0xe0/0x260 __do_softirq+0x127/0x672 irq_exit+0xd1/0xe0 do_IRQ+0xa3/0x1d0 common_interrupt+0xf/0xf </IRQ> RIP: 0010:cpuidle_enter_state+0xea/0x780 Code: 31 ff e8 99 48 47 ff 80 7c 24 08 00 74 12 9c 58 f6 c4 02 0f 85 53 05 00 00 31 ff e8 b0 6f 53 ff e8 ab 4f 5e ff fb 8b 44 24 04 <85> c0 0f 89 f3 01 00 00 48 8d 7b 14 e8 65 1e 77 ff c7 43 14 00 00 RSP: 0018:ffffffffab007d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffca RAX: 0000000000000002 RBX: ffff88b803d69800 RCX: ffffffffa91a8298 RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffffffffab021414 RBP: ffffffffab6329e0 R08: 0000000000000002 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002 R13: 000000bf39d82466 R14: ffffffffab632aa0 R15: ffffffffab632ae0 ? lockdep_hardirqs_on+0x1a8/0x290 ? cpuidle_enter_state+0xe5/0x780 cpuidle_enter+0x3c/0x60 do_idle+0x2fb/0x390 ? arch_cpu_idle_exit+0x40/0x40 ? schedule+0x94/0x120 cpu_startup_entry+0x19/0x1b start_kernel+0x5da/0x61b ? thread_stack_cache_init+0x6/0x6 ? load_ucode_amd_bsp+0x6f/0xc4 ? init_amd_microcode+0xa6/0xa6 ? x86_family+0x5/0x20 ? load_ucode_bsp+0x182/0x1fd secondary_startup_64+0xa4/0xb0
Allocated by task 5730: save_stack+0x19/0x80 __kasan_kmalloc.constprop.9+0xc1/0xd0 kmem_cache_alloc_trace+0x15b/0x350 alloc_sess+0xf4/0x570 [rtrs_client] rtrs_clt_open+0x3b4/0x780 [rtrs_client] find_and_get_or_create_sess+0x649/0x9d0 [rnbd_client] rnbd_clt_map_device+0xd7/0xf50 [rnbd_client] rnbd_clt_map_device_store+0x4ee/0x970 [rnbd_client] kernfs_fop_write+0x141/0x240 vfs_write+0xf3/0x280 ksys_write+0xba/0x150 do_syscall_64+0x68/0x270 entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 5822: save_stack+0x19/0x80 __kasan_slab_free+0x125/0x170 kfree+0xe7/0x3f0 kobject_put+0xd3/0x240 rtrs_clt_destroy_sess_files+0x3f/0x60 [rtrs_client] rtrs_clt_close+0x3c/0x80 [rtrs_client] close_rtrs+0x45/0x80 [rnbd_client] rnbd_client_exit+0x10f/0x2bd [rnbd_client] __x64_sys_delete_module+0x27b/0x340 do_syscall_64+0x68/0x270 entry_SYSCALL_64_after_hwframe+0x49/0xbe
When rtrs_clt_close is triggered, it iterates over all the present rtrs_clt_sess and triggers close on them. However, the call to rtrs_clt_destroy_sess_files is done before the rtrs_clt_close_conns. This is incorrect since during the initialization phase we allocate rtrs_clt_sess first, and then we go ahead and create rtrs_clt_con for it.
If we free the rtrs_clt_sess structure before closing the rtrs_clt_con, it may so happen that an inflight IO completion would trigger the function rtrs_clt_rdma_done, which would lead to the above UAF case.
Hence close the rtrs_clt_con connections first, and then trigger the destruction of session files.
Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Link: https://lore.kernel.org/r/20210325153308.1214057-12-gi-oh.kim@ionos.com Signed-off-by: Md Haris Iqbal haris.iqbal@ionos.com Signed-off-by: Jack Wang jinpu.wang@ionos.com Signed-off-by: Gioh Kim gi-oh.kim@ionos.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/ulp/rtrs/rtrs-clt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c index 6eb95e3c4c8a..6ff97fbf8756 100644 --- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c +++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c @@ -2739,8 +2739,8 @@ void rtrs_clt_close(struct rtrs_clt *clt)
/* Now it is safe to iterate over all paths without locks */ list_for_each_entry_safe(sess, tmp, &clt->paths_list, s.entry) { - rtrs_clt_destroy_sess_files(sess, NULL); rtrs_clt_close_conns(sess, true); + rtrs_clt_destroy_sess_files(sess, NULL); kobject_put(&sess->kobj); } free_clt(clt);
From: Stephen Boyd swboyd@chromium.org
[ Upstream commit 5620b135aea49a8f41c86aaecfcb1598a7774121 ]
We should set the platform device's driver data to NULL here so that code doesn't assume the struct drm_device pointer is valid when it could have been destroyed. The lifetime of this pointer is managed by a kref but when msm_drm_init() fails we call drm_dev_put() on the pointer which will free the pointer's memory. This driver uses the component model, so there's sort of two "probes" in this file, one for the platform device i.e. msm_pdev_probe() and one for the component i.e. msm_drm_bind(). The msm_drm_bind() code is using the platform device's driver data to store struct drm_device so the two functions are intertwined.
This relationship becomes a problem for msm_pdev_shutdown() when it tests the NULL-ness of the pointer to see if it should call drm_atomic_helper_shutdown(). The NULL test is a proxy check for if the pointer has been freed by kref_put(). If the drm_device has been destroyed, then we shouldn't call the shutdown helper, and we know that is the case if msm_drm_init() failed, therefore set the driver data to NULL so that this pointer liveness is tracked properly.
Fixes: 9d5cbf5fe46e ("drm/msm: add shutdown support for display platform_driver") Cc: Dmitry Baryshkov dmitry.baryshkov@linaro.org Cc: Fabio Estevam festevam@gmail.com Cc: Krishna Manikandan mkrishn@codeaurora.org Signed-off-by: Stephen Boyd swboyd@chromium.org Message-Id: 20210325212822.3663144-1-swboyd@chromium.org Signed-off-by: Rob Clark robdclark@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/msm_drv.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c index b38ebccad42f..0aacc43faefa 100644 --- a/drivers/gpu/drm/msm/msm_drv.c +++ b/drivers/gpu/drm/msm/msm_drv.c @@ -557,6 +557,7 @@ err_free_priv: kfree(priv); err_put_drm_dev: drm_dev_put(ddev); + platform_set_drvdata(pdev, NULL); return ret; }
From: Norman Maurer norman_maurer@apple.com
[ Upstream commit 98184612aca0a9ee42b8eb0262a49900ee9eef0d ]
Support for UDP_GRO was added in the past but the implementation for getsockopt was missed which did lead to an error when we tried to retrieve the setting for UDP_GRO. This patch adds the missing switch case for UDP_GRO
Fixes: e20cf8d3f1f7 ("udp: implement GRO for plain UDP sockets.") Signed-off-by: Norman Maurer norman_maurer@apple.com Reviewed-by: David Ahern dsahern@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/udp.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index e37a2fa65c29..4a2fd286787c 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -2747,6 +2747,10 @@ int udp_lib_getsockopt(struct sock *sk, int level, int optname, val = up->gso_size; break;
+ case UDP_GRO: + val = up->gro_enabled; + break; + /* The following two cannot be changed on UDP sockets, the return is * always 0 (which corresponds to the full checksum coverage of UDP). */ case UDPLITE_SEND_CSCOV:
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit 86581852d7710990d8af9dadfe9a661f0abf2114 ]
Unrolling mcast state at msk dismantel time is bug prone, as syzkaller reported:
====================================================== WARNING: possible circular locking dependency detected 5.11.0-syzkaller #0 Not tainted ------------------------------------------------------ syz-executor905/8822 is trying to acquire lock: ffffffff8d678fe8 (rtnl_mutex){+.+.}-{3:3}, at: ipv6_sock_mc_close+0xd7/0x110 net/ipv6/mcast.c:323
but task is already holding lock: ffff888024390120 (sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1600 [inline] ffff888024390120 (sk_lock-AF_INET6){+.+.}-{0:0}, at: mptcp6_release+0x57/0x130 net/mptcp/protocol.c:3507
which lock already depends on the new lock.
Instead we can simply forbit any mcast-related setsockopt
Fixes: 717e79c867ca5 ("mptcp: Add setsockopt()/getsockopt() socket operations") Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Mat Martineau mathew.j.martineau@linux.intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/mptcp/protocol.c | 45 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 45 insertions(+)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index f56b2e331bb6..27f6672589ce 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2245,6 +2245,48 @@ static int mptcp_setsockopt_v6(struct mptcp_sock *msk, int optname, return ret; }
+static bool mptcp_unsupported(int level, int optname) +{ + if (level == SOL_IP) { + switch (optname) { + case IP_ADD_MEMBERSHIP: + case IP_ADD_SOURCE_MEMBERSHIP: + case IP_DROP_MEMBERSHIP: + case IP_DROP_SOURCE_MEMBERSHIP: + case IP_BLOCK_SOURCE: + case IP_UNBLOCK_SOURCE: + case MCAST_JOIN_GROUP: + case MCAST_LEAVE_GROUP: + case MCAST_JOIN_SOURCE_GROUP: + case MCAST_LEAVE_SOURCE_GROUP: + case MCAST_BLOCK_SOURCE: + case MCAST_UNBLOCK_SOURCE: + case MCAST_MSFILTER: + return true; + } + return false; + } + if (level == SOL_IPV6) { + switch (optname) { + case IPV6_ADDRFORM: + case IPV6_ADD_MEMBERSHIP: + case IPV6_DROP_MEMBERSHIP: + case IPV6_JOIN_ANYCAST: + case IPV6_LEAVE_ANYCAST: + case MCAST_JOIN_GROUP: + case MCAST_LEAVE_GROUP: + case MCAST_JOIN_SOURCE_GROUP: + case MCAST_LEAVE_SOURCE_GROUP: + case MCAST_BLOCK_SOURCE: + case MCAST_UNBLOCK_SOURCE: + case MCAST_MSFILTER: + return true; + } + return false; + } + return false; +} + static int mptcp_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval, unsigned int optlen) { @@ -2253,6 +2295,9 @@ static int mptcp_setsockopt(struct sock *sk, int level, int optname,
pr_debug("msk=%p", msk);
+ if (mptcp_unsupported(level, optname)) + return -ENOPROTOOPT; + if (level == SOL_SOCKET) return mptcp_setsockopt_sol_socket(msk, optname, optval, optlen);
From: Can Guo cang@codeaurora.org
[ Upstream commit 1235fc569e0bf541ddda0a1224d4c6fa6d914890 ]
ufshcd_tmc_handler() calls blk_mq_tagset_busy_iter(fn = ufshcd_compl_tm()), but since blk_mq_tagset_busy_iter() only iterates over all reserved tags and requests which are not in IDLE state, ufshcd_compl_tm() never gets a chance to run. Thus, TMR always ends up with completion timeout. Fix it by calling blk_mq_start_request() in __ufshcd_issue_tm_cmd().
Link: https://lore.kernel.org/r/1617262750-4864-2-git-send-email-cang@codeaurora.o... Fixes: 69a6c269c097 ("scsi: ufs: Use blk_{get,put}_request() to allocate and free TMFs") Reviewed-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Can Guo cang@codeaurora.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/ufs/ufshcd.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 97d9d5d99adc..7e1168ee2474 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -6274,6 +6274,7 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
spin_lock_irqsave(host->host_lock, flags); task_tag = hba->nutrs + free_slot; + blk_mq_start_request(req);
treq->req_header.dword_0 |= cpu_to_be32(task_tag);
From: Can Guo cang@codeaurora.org
[ Upstream commit 4b42d557a8add52b9a9924fb31e40a218aab7801 ]
In __ufshcd_issue_tm_cmd(), it is not correct to use hba->nutrs + req->tag as the Task Tag in a TMR UPIU. Directly use req->tag as the Task Tag.
Fixes: e293313262d3 ("scsi: ufs: Fix broken task management command implementation") Link: https://lore.kernel.org/r/1617262750-4864-3-git-send-email-cang@codeaurora.o... Reviewed-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Can Guo cang@codeaurora.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/ufs/ufshcd.c | 30 +++++++++++++----------------- 1 file changed, 13 insertions(+), 17 deletions(-)
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 7e1168ee2474..4215d9a8e5de 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -6256,38 +6256,34 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba, DECLARE_COMPLETION_ONSTACK(wait); struct request *req; unsigned long flags; - int free_slot, task_tag, err; + int task_tag, err;
/* - * Get free slot, sleep if slots are unavailable. - * Even though we use wait_event() which sleeps indefinitely, - * the maximum wait time is bounded by %TM_CMD_TIMEOUT. + * blk_get_request() is used here only to get a free tag. */ req = blk_get_request(q, REQ_OP_DRV_OUT, 0); if (IS_ERR(req)) return PTR_ERR(req);
req->end_io_data = &wait; - free_slot = req->tag; - WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs); ufshcd_hold(hba, false);
spin_lock_irqsave(host->host_lock, flags); - task_tag = hba->nutrs + free_slot; blk_mq_start_request(req);
+ task_tag = req->tag; treq->req_header.dword_0 |= cpu_to_be32(task_tag);
- memcpy(hba->utmrdl_base_addr + free_slot, treq, sizeof(*treq)); - ufshcd_vops_setup_task_mgmt(hba, free_slot, tm_function); + memcpy(hba->utmrdl_base_addr + task_tag, treq, sizeof(*treq)); + ufshcd_vops_setup_task_mgmt(hba, task_tag, tm_function);
/* send command to the controller */ - __set_bit(free_slot, &hba->outstanding_tasks); + __set_bit(task_tag, &hba->outstanding_tasks);
/* Make sure descriptors are ready before ringing the task doorbell */ wmb();
- ufshcd_writel(hba, 1 << free_slot, REG_UTP_TASK_REQ_DOOR_BELL); + ufshcd_writel(hba, 1 << task_tag, REG_UTP_TASK_REQ_DOOR_BELL); /* Make sure that doorbell is committed immediately */ wmb();
@@ -6307,24 +6303,24 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba, ufshcd_add_tm_upiu_trace(hba, task_tag, "tm_complete_err"); dev_err(hba->dev, "%s: task management cmd 0x%.2x timed-out\n", __func__, tm_function); - if (ufshcd_clear_tm_cmd(hba, free_slot)) - dev_WARN(hba->dev, "%s: unable clear tm cmd (slot %d) after timeout\n", - __func__, free_slot); + if (ufshcd_clear_tm_cmd(hba, task_tag)) + dev_WARN(hba->dev, "%s: unable to clear tm cmd (slot %d) after timeout\n", + __func__, task_tag); err = -ETIMEDOUT; } else { err = 0; - memcpy(treq, hba->utmrdl_base_addr + free_slot, sizeof(*treq)); + memcpy(treq, hba->utmrdl_base_addr + task_tag, sizeof(*treq));
ufshcd_add_tm_upiu_trace(hba, task_tag, "tm_complete"); }
spin_lock_irqsave(hba->host->host_lock, flags); - __clear_bit(free_slot, &hba->outstanding_tasks); + __clear_bit(task_tag, &hba->outstanding_tasks); spin_unlock_irqrestore(hba->host->host_lock, flags);
+ ufshcd_release(hba); blk_put_request(req);
- ufshcd_release(hba); return err; }
From: Yunjian Wang wangyunjian@huawei.com
[ Upstream commit 990b03b05b2fba79de2a1ee9dc359fc552d95ba6 ]
The 'unlocked_driver_cb' struct field in 'bo' is not being initialized in tcf_block_offload_init(). The uninitialized 'unlocked_driver_cb' will be used when calling unlocked_driver_cb(). So initialize 'bo' to zero to avoid the issue.
Addresses-Coverity: ("Uninitialized scalar variable") Fixes: 0fdcf78d5973 ("net: use flow_indr_dev_setup_offload()") Signed-off-by: Yunjian Wang wangyunjian@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/cls_api.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c index d48ba4dee9a5..9383dc29ead5 100644 --- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c @@ -646,7 +646,7 @@ static void tc_block_indr_cleanup(struct flow_block_cb *block_cb) struct net_device *dev = block_cb->indr.dev; struct Qdisc *sch = block_cb->indr.sch; struct netlink_ext_ack extack = {}; - struct flow_block_offload bo; + struct flow_block_offload bo = {};
tcf_block_offload_init(&bo, dev, sch, FLOW_BLOCK_UNBIND, block_cb->indr.binder_type,
From: Claudiu Beznea claudiu.beznea@microchip.com
[ Upstream commit a14d273ba15968495896a38b7b3399dba66d0270 ]
Restore CMP screener registers on resume path.
Fixes: c1e85c6ce57ef ("net: macb: save/restore the remaining registers and features") Signed-off-by: Claudiu Beznea claudiu.beznea@microchip.com Acked-by: Nicolas Ferre nicolas.ferre@microchip.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/cadence/macb_main.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c index 286f0341bdf8..48a6bda2a8cc 100644 --- a/drivers/net/ethernet/cadence/macb_main.c +++ b/drivers/net/ethernet/cadence/macb_main.c @@ -3111,6 +3111,9 @@ static void gem_prog_cmp_regs(struct macb *bp, struct ethtool_rx_flow_spec *fs) bool cmp_b = false; bool cmp_c = false;
+ if (!macb_is_gem(bp)) + return; + tp4sp_v = &(fs->h_u.tcp_ip4_spec); tp4sp_m = &(fs->m_u.tcp_ip4_spec);
@@ -3479,6 +3482,7 @@ static void macb_restore_features(struct macb *bp) { struct net_device *netdev = bp->dev; netdev_features_t features = netdev->features; + struct ethtool_rx_fs_item *item;
/* TX checksum offload */ macb_set_txcsum_feature(bp, features); @@ -3487,6 +3491,9 @@ static void macb_restore_features(struct macb *bp) macb_set_rxcsum_feature(bp, features);
/* RX Flow Filters */ + list_for_each_entry(item, &bp->rx_fs_list.list, list) + gem_prog_cmp_regs(bp, &item->fs); + macb_set_rxflow_feature(bp, features); }
From: Lukasz Bartosik lb@semihalf.com
[ Upstream commit 8d3c0c01cb2e36b2bf3c06a82b18b228d0c8f5d0 ]
Fix invalid usage of a list_for_each_entry cursor in clk_notifier_register(). When list is empty or if the list is completely traversed (without breaking from the loop on one of the entries) then the list cursor does not point to a valid entry and therefore should not be used.
The issue was dicovered when running 5.12-rc1 kernel on x86_64 with KASAN enabled: BUG: KASAN: global-out-of-bounds in clk_notifier_register+0xab/0x230 Read of size 8 at addr ffffffffa0d10588 by task swapper/0/1
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.12.0-rc1 #1 Hardware name: Google Caroline/Caroline, BIOS Google_Caroline.7820.430.0 07/20/2018 Call Trace: dump_stack+0xee/0x15c print_address_description+0x1e/0x2dc kasan_report+0x188/0x1ce ? clk_notifier_register+0xab/0x230 ? clk_prepare_lock+0x15/0x7b ? clk_notifier_register+0xab/0x230 clk_notifier_register+0xab/0x230 dw8250_probe+0xc01/0x10d4 ... Memory state around the buggy address: ffffffffa0d10480: 00 00 00 00 00 03 f9 f9 f9 f9 f9 f9 00 00 00 00 ffffffffa0d10500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f9 f9
ffffffffa0d10580: f9 f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00
^ ffffffffa0d10600: 00 00 00 00 00 00 f9 f9 f9 f9 f9 f9 00 00 00 00 ffffffffa0d10680: 00 00 00 00 00 00 00 00 f9 f9 f9 f9 00 00 00 00 ==================================================================
Fixes: b2476490ef11 ("clk: introduce the common clock framework") Reported-by: Lukasz Majczak lma@semihalf.com Signed-off-by: Lukasz Bartosik lb@semihalf.com Link: https://lore.kernel.org/r/20210401225149.18826-1-lb@semihalf.com Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/clk.c | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c index f83dac54ed85..dae090124263 100644 --- a/drivers/clk/clk.c +++ b/drivers/clk/clk.c @@ -4262,20 +4262,19 @@ int clk_notifier_register(struct clk *clk, struct notifier_block *nb) /* search the list of notifiers for this clk */ list_for_each_entry(cn, &clk_notifier_list, node) if (cn->clk == clk) - break; + goto found;
/* if clk wasn't in the notifier list, allocate new clk_notifier */ - if (cn->clk != clk) { - cn = kzalloc(sizeof(*cn), GFP_KERNEL); - if (!cn) - goto out; + cn = kzalloc(sizeof(*cn), GFP_KERNEL); + if (!cn) + goto out;
- cn->clk = clk; - srcu_init_notifier_head(&cn->notifier_head); + cn->clk = clk; + srcu_init_notifier_head(&cn->notifier_head);
- list_add(&cn->node, &clk_notifier_list); - } + list_add(&cn->node, &clk_notifier_list);
+found: ret = srcu_notifier_chain_register(&cn->notifier_head, nb);
clk->core->notifier_count++;
From: Lukasz Bartosik lb@semihalf.com
[ Upstream commit 7045465500e465b09f09d6e5bdc260a9f1aab97b ]
Fix invalid usage of a list_for_each_entry cursor in clk_notifier_unregister(). When list is empty or if the list is completely traversed (without breaking from the loop on one of the entries) then the list cursor does not point to a valid entry and therefore should not be used. The patch fixes a logical bug that hasn't been seen in pratice however it is analogus to the bug fixed in clk_notifier_register().
The issue was dicovered when running 5.12-rc1 kernel on x86_64 with KASAN enabled: BUG: KASAN: global-out-of-bounds in clk_notifier_register+0xab/0x230 Read of size 8 at addr ffffffffa0d10588 by task swapper/0/1
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.12.0-rc1 #1 Hardware name: Google Caroline/Caroline, BIOS Google_Caroline.7820.430.0 07/20/2018 Call Trace: dump_stack+0xee/0x15c print_address_description+0x1e/0x2dc kasan_report+0x188/0x1ce ? clk_notifier_register+0xab/0x230 ? clk_prepare_lock+0x15/0x7b ? clk_notifier_register+0xab/0x230 clk_notifier_register+0xab/0x230 dw8250_probe+0xc01/0x10d4 ... Memory state around the buggy address: ffffffffa0d10480: 00 00 00 00 00 03 f9 f9 f9 f9 f9 f9 00 00 00 00 ffffffffa0d10500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f9 f9
ffffffffa0d10580: f9 f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00
^ ffffffffa0d10600: 00 00 00 00 00 00 f9 f9 f9 f9 f9 f9 00 00 00 00 ffffffffa0d10680: 00 00 00 00 00 00 00 00 f9 f9 f9 f9 00 00 00 00 ==================================================================
Fixes: b2476490ef11 ("clk: introduce the common clock framework") Reported-by: Lukasz Majczak lma@semihalf.com Signed-off-by: Lukasz Bartosik lb@semihalf.com Link: https://lore.kernel.org/r/20210401225149.18826-2-lb@semihalf.com Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/clk.c | 30 +++++++++++++----------------- 1 file changed, 13 insertions(+), 17 deletions(-)
diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c index dae090124263..61c78714c095 100644 --- a/drivers/clk/clk.c +++ b/drivers/clk/clk.c @@ -4299,32 +4299,28 @@ EXPORT_SYMBOL_GPL(clk_notifier_register); */ int clk_notifier_unregister(struct clk *clk, struct notifier_block *nb) { - struct clk_notifier *cn = NULL; - int ret = -EINVAL; + struct clk_notifier *cn; + int ret = -ENOENT;
if (!clk || !nb) return -EINVAL;
clk_prepare_lock();
- list_for_each_entry(cn, &clk_notifier_list, node) - if (cn->clk == clk) - break; - - if (cn->clk == clk) { - ret = srcu_notifier_chain_unregister(&cn->notifier_head, nb); + list_for_each_entry(cn, &clk_notifier_list, node) { + if (cn->clk == clk) { + ret = srcu_notifier_chain_unregister(&cn->notifier_head, nb);
- clk->core->notifier_count--; + clk->core->notifier_count--;
- /* XXX the notifier code should handle this better */ - if (!cn->notifier_head.head) { - srcu_cleanup_notifier_head(&cn->notifier_head); - list_del(&cn->node); - kfree(cn); + /* XXX the notifier code should handle this better */ + if (!cn->notifier_head.head) { + srcu_cleanup_notifier_head(&cn->notifier_head); + list_del(&cn->node); + kfree(cn); + } + break; } - - } else { - ret = -ENOENT; }
clk_prepare_unlock();
From: Zqiang qiang.zhang@windriver.com
[ Upstream commit 0687c66b5f666b5ad433f4e94251590d9bc9d10e ]
The debug_work_activate() is called on the premise that the work can be inserted, because if wq be in WQ_DRAINING status, insert work may be failed.
Fixes: e41e704bc4f4 ("workqueue: improve destroy_workqueue() debuggability") Signed-off-by: Zqiang qiang.zhang@windriver.com Reviewed-by: Lai Jiangshan jiangshanlai@gmail.com Signed-off-by: Tejun Heo tj@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/workqueue.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 1d99c52cc99a..1e2ca744dadb 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -1409,7 +1409,6 @@ static void __queue_work(int cpu, struct workqueue_struct *wq, */ lockdep_assert_irqs_disabled();
- debug_work_activate(work);
/* if draining, only works from the same workqueue are allowed */ if (unlikely(wq->flags & __WQ_DRAINING) && @@ -1491,6 +1490,7 @@ retry: worklist = &pwq->delayed_works; }
+ debug_work_activate(work); insert_work(pwq, work, worklist, work_flags);
out:
From: Alexander Gordeev agordeev@linux.ibm.com
[ Upstream commit 7a2f91441b2c1d81b77c1cd816a4659f4abc9cbe ]
Register variables initialized using arithmetic. That leads to kasan instrumentaton code corrupting the registers contents. Follow GCC guidlines and use temporary variables for assigning init values to register variables.
Fixes: 94c12cc7d196 ("[S390] Inline assembly cleanup.") Signed-off-by: Alexander Gordeev agordeev@linux.ibm.com Acked-by: Ilya Leoshkevich iii@linux.ibm.com Link: https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Local-Register-Variables.html Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/cpcmd.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/s390/kernel/cpcmd.c b/arch/s390/kernel/cpcmd.c index af013b4244d3..2da027359798 100644 --- a/arch/s390/kernel/cpcmd.c +++ b/arch/s390/kernel/cpcmd.c @@ -37,10 +37,12 @@ static int diag8_noresponse(int cmdlen)
static int diag8_response(int cmdlen, char *response, int *rlen) { + unsigned long _cmdlen = cmdlen | 0x40000000L; + unsigned long _rlen = *rlen; register unsigned long reg2 asm ("2") = (addr_t) cpcmd_buf; register unsigned long reg3 asm ("3") = (addr_t) response; - register unsigned long reg4 asm ("4") = cmdlen | 0x40000000L; - register unsigned long reg5 asm ("5") = *rlen; + register unsigned long reg4 asm ("4") = _cmdlen; + register unsigned long reg5 asm ("5") = _rlen;
asm volatile( " diag %2,%0,0x8\n"
From: Adrian Hunter adrian.hunter@intel.com
[ Upstream commit 026334a3bb6a3919b42aba9fc11843db2b77fd41 ]
Since commit 14d3d54052539a1e ("perf session: Try to read pipe data from file") 'perf inject' has started printing "PERFILE2h" when not processing pipes.
The commit exposed perf to the possiblity that the input is not a pipe but the 'repipe' parameter gets used. That causes the printing because perf inject sets 'repipe' to true always.
The 'repipe' parameter of perf_session__new() is used by 2 functions:
- perf_file_header__read_pipe() - trace_report()
In both cases, the functions copy data to STDOUT_FILENO when 'repipe' is true.
Fix by setting 'repipe' to true only if the output is a pipe.
Fixes: e558a5bd8b74aff4 ("perf inject: Work with files") Signed-off-by: Adrian Hunter adrian.hunter@intel.com Acked-by: Jiri Olsa jolsa@redhat.com Cc: Andrew Vagin avagin@openvz.org Link: http://lore.kernel.org/lkml/20210401103605.9000-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/builtin-inject.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c index 0462dc8db2e3..5320ac1b1285 100644 --- a/tools/perf/builtin-inject.c +++ b/tools/perf/builtin-inject.c @@ -904,7 +904,7 @@ int cmd_inject(int argc, const char **argv) }
data.path = inject.input_name; - inject.session = perf_session__new(&data, true, &inject.tool); + inject.session = perf_session__new(&data, inject.output.is_pipe, &inject.tool); if (IS_ERR(inject.session)) return PTR_ERR(inject.session);
From: Zheng Yongjun zhengyongjun3@huawei.com
[ Upstream commit 5e359044c107ecbdc2e9b3fd5ce296006e6de4bc ]
Simplify the return expression.
Signed-off-by: Zheng Yongjun zhengyongjun3@huawei.com Reviewed-by: Eelco Chaudron echaudro@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/openvswitch/conntrack.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c index 4beb96139d77..96a49aa3a128 100644 --- a/net/openvswitch/conntrack.c +++ b/net/openvswitch/conntrack.c @@ -2025,15 +2025,11 @@ static int ovs_ct_limit_get_default_limit(struct ovs_ct_limit_info *info, struct sk_buff *reply) { struct ovs_zone_limit zone_limit; - int err;
zone_limit.zone_id = OVS_ZONE_LIMIT_DEFAULT_ZONE; zone_limit.limit = info->default_limit; - err = nla_put_nohdr(reply, sizeof(zone_limit), &zone_limit); - if (err) - return err;
- return 0; + return nla_put_nohdr(reply, sizeof(zone_limit), &zone_limit); }
static int __ovs_ct_limit_get_zone_limit(struct net *net,
From: Ilya Maximets i.maximets@ovn.org
[ Upstream commit 4d51419d49930be2701c2633ae271b350397c3ca ]
'struct ovs_zone_limit' has more members than initialized in ovs_ct_limit_get_default_limit(). The rest of the memory is a random kernel stack content that ends up being sent to userspace.
Fix that by using designated initializer that will clear all non-specified fields.
Fixes: 11efd5cb04a1 ("openvswitch: Support conntrack zone limit") Signed-off-by: Ilya Maximets i.maximets@ovn.org Acked-by: Tonghao Zhang xiangxia.m.yue@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/openvswitch/conntrack.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c index 96a49aa3a128..a11b558813c1 100644 --- a/net/openvswitch/conntrack.c +++ b/net/openvswitch/conntrack.c @@ -2024,10 +2024,10 @@ static int ovs_ct_limit_del_zone_limit(struct nlattr *nla_zone_limit, static int ovs_ct_limit_get_default_limit(struct ovs_ct_limit_info *info, struct sk_buff *reply) { - struct ovs_zone_limit zone_limit; - - zone_limit.zone_id = OVS_ZONE_LIMIT_DEFAULT_ZONE; - zone_limit.limit = info->default_limit; + struct ovs_zone_limit zone_limit = { + .zone_id = OVS_ZONE_LIMIT_DEFAULT_ZONE, + .limit = info->default_limit, + };
return nla_put_nohdr(reply, sizeof(zone_limit), &zone_limit); }
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
[ Upstream commit 5e729bc54bda705f64941008b018b4e41a4322bf ]
When hardware doesn't support High Speed Mode, we forget bus_freq_hz timing adjustment. This makes the timings and real registers being unsynchronized. Adjust bus_freq_hz when refuse high speed mode set.
Fixes: b6e67145f149 ("i2c: designware: Enable high speed mode") Reported-by: "Song Bao Hua (Barry Song)" song.bao.hua@hisilicon.com Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Reviewed-by: Barry Song song.bao.hua@hisilicon.com Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-designware-master.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/i2c/busses/i2c-designware-master.c b/drivers/i2c/busses/i2c-designware-master.c index d6425ad6e6a3..2871cf2ee8b4 100644 --- a/drivers/i2c/busses/i2c-designware-master.c +++ b/drivers/i2c/busses/i2c-designware-master.c @@ -129,6 +129,7 @@ static int i2c_dw_set_timings_master(struct dw_i2c_dev *dev) if ((comp_param1 & DW_IC_COMP_PARAM_1_SPEED_MODE_MASK) != DW_IC_COMP_PARAM_1_SPEED_MODE_HIGH) { dev_err(dev->dev, "High Speed not supported!\n"); + t->bus_freq_hz = I2C_MAX_FAST_MODE_FREQ; dev->master_cfg &= ~DW_IC_CON_SPEED_MASK; dev->master_cfg |= DW_IC_CON_SPEED_FAST; dev->hs_hcnt = 0;
From: Luca Coelho luciano.coelho@intel.com
[ Upstream commit 07cc40fec9a85e669ea12e161a438d2cbd76f1ed ]
When version 2 of the regulatory capability flags API was implemented, the flag to disable 11ax was defined as bit 13, but this was later changed and the bit remained as bit 10, like in version 1. This was never changed in the driver, so we were checking for the wrong bit in newer devices. Fix it.
Signed-off-by: Luca Coelho luciano.coelho@intel.com Fixes: e27c506a985c ("iwlwifi: regulatory: regulatory capabilities api change") Signed-off-by: Luca Coelho luciano.coelho@intel.com Signed-off-by: Kalle Valo kvalo@codeaurora.org Link: https://lore.kernel.org/r/iwlwifi.20210326125611.6d28516b59cd.Id0248d5e46626... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.c b/drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.c index 6d19de3058d2..cbde21e772b1 100644 --- a/drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.c +++ b/drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.c @@ -285,7 +285,7 @@ enum iwl_reg_capa_flags_v2 { REG_CAPA_V2_MCS_9_ALLOWED = BIT(6), REG_CAPA_V2_WEATHER_DISABLED = BIT(7), REG_CAPA_V2_40MHZ_ALLOWED = BIT(8), - REG_CAPA_V2_11AX_DISABLED = BIT(13), + REG_CAPA_V2_11AX_DISABLED = BIT(10), };
/*
From: Marc Kleine-Budde mkl@pengutronix.de
[ Upstream commit 617085fca6375e2c1667d1fbfc6adc4034c85f04 ]
Some SPI host controllers do not support full-duplex SPI transfers.
The function mcp251x_spi_trans() does a full duplex transfer. It is used in several places in the driver, where a TX half duplex transfer is sufficient.
To fix support for half duplex SPI host controllers, this patch introduces a new function mcp251x_spi_write() and changes all callers that do a TX half duplex transfer to use mcp251x_spi_write().
Fixes: e0e25001d088 ("can: mcp251x: add support for half duplex controllers") Link: https://lore.kernel.org/r/20210330100246.1074375-1-mkl@pengutronix.de Cc: Tim Harvey tharvey@gateworks.com Tested-By: Tim Harvey tharvey@gateworks.com Reported-by: Gerhard Bertelsmann info@gerhard-bertelsmann.de Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/spi/mcp251x.c | 24 ++++++++++++++++++------ 1 file changed, 18 insertions(+), 6 deletions(-)
diff --git a/drivers/net/can/spi/mcp251x.c b/drivers/net/can/spi/mcp251x.c index 22d814ae4edc..42c3046fa304 100644 --- a/drivers/net/can/spi/mcp251x.c +++ b/drivers/net/can/spi/mcp251x.c @@ -314,6 +314,18 @@ static int mcp251x_spi_trans(struct spi_device *spi, int len) return ret; }
+static int mcp251x_spi_write(struct spi_device *spi, int len) +{ + struct mcp251x_priv *priv = spi_get_drvdata(spi); + int ret; + + ret = spi_write(spi, priv->spi_tx_buf, len); + if (ret) + dev_err(&spi->dev, "spi write failed: ret = %d\n", ret); + + return ret; +} + static u8 mcp251x_read_reg(struct spi_device *spi, u8 reg) { struct mcp251x_priv *priv = spi_get_drvdata(spi); @@ -361,7 +373,7 @@ static void mcp251x_write_reg(struct spi_device *spi, u8 reg, u8 val) priv->spi_tx_buf[1] = reg; priv->spi_tx_buf[2] = val;
- mcp251x_spi_trans(spi, 3); + mcp251x_spi_write(spi, 3); }
static void mcp251x_write_2regs(struct spi_device *spi, u8 reg, u8 v1, u8 v2) @@ -373,7 +385,7 @@ static void mcp251x_write_2regs(struct spi_device *spi, u8 reg, u8 v1, u8 v2) priv->spi_tx_buf[2] = v1; priv->spi_tx_buf[3] = v2;
- mcp251x_spi_trans(spi, 4); + mcp251x_spi_write(spi, 4); }
static void mcp251x_write_bits(struct spi_device *spi, u8 reg, @@ -386,7 +398,7 @@ static void mcp251x_write_bits(struct spi_device *spi, u8 reg, priv->spi_tx_buf[2] = mask; priv->spi_tx_buf[3] = val;
- mcp251x_spi_trans(spi, 4); + mcp251x_spi_write(spi, 4); }
static u8 mcp251x_read_stat(struct spi_device *spi) @@ -618,7 +630,7 @@ static void mcp251x_hw_tx_frame(struct spi_device *spi, u8 *buf, buf[i]); } else { memcpy(priv->spi_tx_buf, buf, TXBDAT_OFF + len); - mcp251x_spi_trans(spi, TXBDAT_OFF + len); + mcp251x_spi_write(spi, TXBDAT_OFF + len); } }
@@ -650,7 +662,7 @@ static void mcp251x_hw_tx(struct spi_device *spi, struct can_frame *frame,
/* use INSTRUCTION_RTS, to avoid "repeated frame problem" */ priv->spi_tx_buf[0] = INSTRUCTION_RTS(1 << tx_buf_idx); - mcp251x_spi_trans(priv->spi, 1); + mcp251x_spi_write(priv->spi, 1); }
static void mcp251x_hw_rx_frame(struct spi_device *spi, u8 *buf, @@ -888,7 +900,7 @@ static int mcp251x_hw_reset(struct spi_device *spi) mdelay(MCP251X_OST_DELAY_MS);
priv->spi_tx_buf[0] = INSTRUCTION_RESET; - ret = mcp251x_spi_trans(spi, 1); + ret = mcp251x_spi_write(spi, 1); if (ret) return ret;
From: Xin Long lucien.xin@gmail.com
[ Upstream commit 2a2403ca3add03f542f6b34bef9f74649969b06d ]
Li Shuang found a NULL pointer dereference crash in her testing:
[] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 [] RIP: 0010:tipc_crypto_rcv_complete+0xc8/0x7e0 [tipc] [] Call Trace: [] <IRQ> [] tipc_crypto_rcv+0x2d9/0x8f0 [tipc] [] tipc_rcv+0x2fc/0x1120 [tipc] [] tipc_udp_recv+0xc6/0x1e0 [tipc] [] udpv6_queue_rcv_one_skb+0x16a/0x460 [] udp6_unicast_rcv_skb.isra.35+0x41/0xa0 [] ip6_protocol_deliver_rcu+0x23b/0x4c0 [] ip6_input+0x3d/0xb0 [] ipv6_rcv+0x395/0x510 [] __netif_receive_skb_core+0x5fc/0xc40
This is caused by NULL returned by tipc_aead_get(), and then crashed when dereferencing it later in tipc_crypto_rcv_complete(). This might happen when tipc_crypto_rcv_complete() is called by two threads at the same time: the tmp attached by tipc_crypto_key_attach() in one thread may be released by the one attached by that in the other thread.
This patch is to fix it by incrementing the tmp's refcnt before attaching it instead of calling tipc_aead_get() after attaching it.
Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication") Reported-by: Li Shuang shuali@redhat.com Signed-off-by: Xin Long lucien.xin@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/tipc/crypto.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/tipc/crypto.c b/net/tipc/crypto.c index 740ab9ae41a6..86eb6d679225 100644 --- a/net/tipc/crypto.c +++ b/net/tipc/crypto.c @@ -1934,12 +1934,13 @@ static void tipc_crypto_rcv_complete(struct net *net, struct tipc_aead *aead, goto rcv; if (tipc_aead_clone(&tmp, aead) < 0) goto rcv; + WARN_ON(!refcount_inc_not_zero(&tmp->refcnt)); if (tipc_crypto_key_attach(rx, tmp, ehdr->tx_key, false) < 0) { tipc_aead_free(&tmp->rcu); goto rcv; } tipc_aead_put(aead); - aead = tipc_aead_get(tmp); + aead = tmp; }
if (unlikely(err)) {
From: Guangbin Huang huangguangbin2@huawei.com
[ Upstream commit ed7bedd2c3ca040f1e8ea02c6590a93116b1ec78 ]
Currently, the VF down state bit is cleared after VF sending link status request command. There is problem that when VF gets link status replied from PF, the down state bit may still set as 1. In this case, the link status replied from PF will be ignored and always set VF link status to down.
To fix this problem, clear VF down state bit before VF requests link status.
Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support") Signed-off-by: Guangbin Huang huangguangbin2@huawei.com Signed-off-by: Huazhong Tan tanhuazhong@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c index dc5d150a9c54..ac6980acb6f0 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c @@ -2554,14 +2554,14 @@ static int hclgevf_ae_start(struct hnae3_handle *handle) { struct hclgevf_dev *hdev = hclgevf_ae_get_hdev(handle);
+ clear_bit(HCLGEVF_STATE_DOWN, &hdev->state); + hclgevf_reset_tqp_stats(handle);
hclgevf_request_link_info(hdev);
hclgevf_update_link_mode(hdev);
- clear_bit(HCLGEVF_STATE_DOWN, &hdev->state); - return 0; }
From: Raed Salem raeds@nvidia.com
[ Upstream commit a14587dfc5ad2312dabdd42a610d80ecd0dc8bea ]
The cited commit wrongly placed log_max_flow_counter field of mlx5_ifc_flow_table_prop_layout_bits, align it to the HW spec intended placement.
Fixes: 16f1c5bb3ed7 ("net/mlx5: Check device capability for maximum flow counters") Signed-off-by: Raed Salem raeds@nvidia.com Reviewed-by: Roi Dayan roid@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/mlx5/mlx5_ifc.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h index 233352447b1a..19c1cb214953 100644 --- a/include/linux/mlx5/mlx5_ifc.h +++ b/include/linux/mlx5/mlx5_ifc.h @@ -435,11 +435,11 @@ struct mlx5_ifc_flow_table_prop_layout_bits { u8 reserved_at_60[0x18]; u8 log_max_ft_num[0x8];
- u8 reserved_at_80[0x18]; + u8 reserved_at_80[0x10]; + u8 log_max_flow_counter[0x8]; u8 log_max_destination[0x8];
- u8 log_max_flow_counter[0x8]; - u8 reserved_at_a8[0x10]; + u8 reserved_at_a0[0x18]; u8 log_max_flow[0x8];
u8 reserved_at_c0[0x40];
From: Aya Levin ayal@nvidia.com
[ Upstream commit ce28f0fd670ddffcd564ce7119bdefbaf08f02d3 ]
Add reserved mapping to cover all the register in order to avoid setting arbitrary values to newer FW which implements the reserved fields.
Fixes: a58837f52d43 ("net/mlx5e: Expose FEC feilds and related capability bit") Signed-off-by: Aya Levin ayal@nvidia.com Reviewed-by: Moshe Shemesh moshe@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/mlx5/mlx5_ifc.h | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h index 19c1cb214953..4b3b2bf83720 100644 --- a/include/linux/mlx5/mlx5_ifc.h +++ b/include/linux/mlx5/mlx5_ifc.h @@ -8719,6 +8719,8 @@ struct mlx5_ifc_pplm_reg_bits {
u8 fec_override_admin_100g_2x[0x10]; u8 fec_override_admin_50g_1x[0x10]; + + u8 reserved_at_140[0x140]; };
struct mlx5_ifc_ppcnt_reg_bits {
From: Aya Levin ayal@nvidia.com
[ Upstream commit 534b1204ca4694db1093b15cf3e79a99fcb6a6da ]
Add reserved mapping to cover all the register in order to avoid setting arbitrary values to newer FW which implements the reserved fields.
Fixes: 50b4a3c23646 ("net/mlx5: PPTB and PBMC register firmware command support") Signed-off-by: Aya Levin ayal@nvidia.com Reviewed-by: Moshe Shemesh moshe@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/mlx5/mlx5_ifc.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h index 4b3b2bf83720..cc9ee0776974 100644 --- a/include/linux/mlx5/mlx5_ifc.h +++ b/include/linux/mlx5/mlx5_ifc.h @@ -10058,7 +10058,7 @@ struct mlx5_ifc_pbmc_reg_bits {
struct mlx5_ifc_bufferx_reg_bits buffer[10];
- u8 reserved_at_2e0[0x40]; + u8 reserved_at_2e0[0x80]; };
struct mlx5_ifc_qtct_reg_bits {
From: Potnuri Bharat Teja bharat@chelsio.com
[ Upstream commit 603c4690b01aaffe3a6c3605a429f6dac39852ae ]
ipv6 bit is wrongly set by the below which causes fatal adapter lookup engine errors for ipv4 connections while destroying a listener. Fix it to properly check the local address for ipv6.
Fixes: 3408be145a5d ("RDMA/cxgb4: Fix adapter LE hash errors while destroying ipv6 listening server") Link: https://lore.kernel.org/r/20210331135715.30072-1-bharat@chelsio.com Signed-off-by: Potnuri Bharat Teja bharat@chelsio.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/cxgb4/cm.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/cxgb4/cm.c b/drivers/infiniband/hw/cxgb4/cm.c index 81903749d241..e42c812e74c3 100644 --- a/drivers/infiniband/hw/cxgb4/cm.c +++ b/drivers/infiniband/hw/cxgb4/cm.c @@ -3616,7 +3616,8 @@ int c4iw_destroy_listen(struct iw_cm_id *cm_id) c4iw_init_wr_wait(ep->com.wr_waitp); err = cxgb4_remove_server( ep->com.dev->rdev.lldi.ports[0], ep->stid, - ep->com.dev->rdev.lldi.rxq_ids[0], true); + ep->com.dev->rdev.lldi.rxq_ids[0], + ep->com.local_addr.ss_family == AF_INET6); if (err) goto done; err = c4iw_wait_for_reply(&ep->com.dev->rdev, ep->com.wr_waitp,
From: Jin Yao yao.jin@linux.intel.com
[ Upstream commit f2013278ae40b89cc27916366c407ce5261815ef ]
When '--total-cycles' is specified, it supports sorting for all blocks by 'Sampled Cycles%'. This is useful to concentrate on the globally hottest blocks.
'Sampled Cycles%' - block sampled cycles aggregation / total sampled cycles
But in current code, it doesn't use the cycles aggregation. Part of 'cycles' counting is possibly dropped for some overlap jumps. But for identifying the hot block, we always need the full cycles.
# perf record -b ./triad_loop # perf report --total-cycles --stdio
Before:
# # Sampled Cycles% Sampled Cycles Avg Cycles% Avg Cycles [Program Block Range] Shared Object # ............... .............. ........... .......... ............................................................. ................. # 0.81% 793 4.32% 793 [setup-vdso.h:34 -> setup-vdso.h:40] ld-2.27.so 0.49% 480 0.87% 160 [native_write_msr+0 -> native_write_msr+16] [kernel.kallsyms] 0.48% 476 0.52% 95 [native_read_msr+0 -> native_read_msr+29] [kernel.kallsyms] 0.31% 303 1.65% 303 [nmi_restore+0 -> nmi_restore+37] [kernel.kallsyms] 0.26% 255 1.39% 255 [nohz_balance_exit_idle+75 -> nohz_balance_exit_idle+162] [kernel.kallsyms] 0.24% 234 1.28% 234 [end_repeat_nmi+67 -> end_repeat_nmi+83] [kernel.kallsyms] 0.23% 227 1.24% 227 [__irqentry_text_end+96 -> __irqentry_text_end+126] [kernel.kallsyms] 0.20% 194 1.06% 194 [native_set_debugreg+52 -> native_set_debugreg+56] [kernel.kallsyms] 0.11% 106 0.14% 26 [native_sched_clock+0 -> native_sched_clock+98] [kernel.kallsyms] 0.10% 97 0.53% 97 [trigger_load_balance+0 -> trigger_load_balance+67] [kernel.kallsyms] 0.09% 85 0.46% 85 [get-dynamic-info.h:102 -> get-dynamic-info.h:111] ld-2.27.so ... 0.00% 92.7K 0.02% 4 [triad_loop.c:64 -> triad_loop.c:65] triad_loop
The hottest block '[triad_loop.c:64 -> triad_loop.c:65]' is not at the top of output.
After:
# Sampled Cycles% Sampled Cycles Avg Cycles% Avg Cycles [Program Block Range] Shared Object # ............... .............. ........... .......... .............................................................. ................. # 94.35% 92.7K 0.02% 4 [triad_loop.c:64 -> triad_loop.c:65] triad_loop 0.81% 793 4.32% 793 [setup-vdso.h:34 -> setup-vdso.h:40] ld-2.27.so 0.49% 480 0.87% 160 [native_write_msr+0 -> native_write_msr+16] [kernel.kallsyms] 0.48% 476 0.52% 95 [native_read_msr+0 -> native_read_msr+29] [kernel.kallsyms] 0.31% 303 1.65% 303 [nmi_restore+0 -> nmi_restore+37] [kernel.kallsyms] 0.26% 255 1.39% 255 [nohz_balance_exit_idle+75 -> nohz_balance_exit_idle+162] [kernel.kallsyms] 0.24% 234 1.28% 234 [end_repeat_nmi+67 -> end_repeat_nmi+83] [kernel.kallsyms] 0.23% 227 1.24% 227 [__irqentry_text_end+96 -> __irqentry_text_end+126] [kernel.kallsyms] 0.20% 194 1.06% 194 [native_set_debugreg+52 -> native_set_debugreg+56] [kernel.kallsyms] 0.11% 106 0.14% 26 [native_sched_clock+0 -> native_sched_clock+98] [kernel.kallsyms] 0.10% 97 0.53% 97 [trigger_load_balance+0 -> trigger_load_balance+67] [kernel.kallsyms] 0.09% 85 0.46% 85 [get-dynamic-info.h:102 -> get-dynamic-info.h:111] ld-2.27.so 0.08% 82 0.06% 11 [intel_pmu_drain_pebs_nhm+580 -> intel_pmu_drain_pebs_nhm+627] [kernel.kallsyms] 0.08% 77 0.42% 77 [lru_add_drain_cpu+0 -> lru_add_drain_cpu+133] [kernel.kallsyms] 0.08% 74 0.10% 18 [handle_pmi_common+271 -> handle_pmi_common+310] [kernel.kallsyms] 0.08% 74 0.40% 74 [get-dynamic-info.h:131 -> get-dynamic-info.h:157] ld-2.27.so 0.07% 69 0.09% 17 [intel_pmu_drain_pebs_nhm+432 -> intel_pmu_drain_pebs_nhm+468] [kernel.kallsyms]
Now the hottest block is reported at the top of output.
Fixes: b65a7d372b1a55db ("perf hist: Support block formats with compare/sort/display") Signed-off-by: Jin Yao yao.jin@linux.intel.com Reviewed-by: Andi Kleen ak@linux.intel.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Jin Yao yao.jin@intel.com Cc: Jiri Olsa jolsa@kernel.org Cc: Kan Liang kan.liang@linux.intel.com Cc: Peter Zijlstra peterz@infradead.org Link: http://lore.kernel.org/lkml/20210407024452.29988-1-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/block-info.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c index 423ec69bda6c..5ecd4f401f32 100644 --- a/tools/perf/util/block-info.c +++ b/tools/perf/util/block-info.c @@ -201,7 +201,7 @@ static int block_total_cycles_pct_entry(struct perf_hpp_fmt *fmt, double ratio = 0.0;
if (block_fmt->total_cycles) - ratio = (double)bi->cycles / (double)block_fmt->total_cycles; + ratio = (double)bi->cycles_aggr / (double)block_fmt->total_cycles;
return color_pct(hpp, block_fmt->width, 100.0 * ratio); } @@ -216,9 +216,9 @@ static int64_t block_total_cycles_pct_sort(struct perf_hpp_fmt *fmt, double l, r;
if (block_fmt->total_cycles) { - l = ((double)bi_l->cycles / + l = ((double)bi_l->cycles_aggr / (double)block_fmt->total_cycles) * 100000.0; - r = ((double)bi_r->cycles / + r = ((double)bi_r->cycles_aggr / (double)block_fmt->total_cycles) * 100000.0; return (int64_t)l - (int64_t)r; }
From: Kamal Heib kamalheib1@gmail.com
[ Upstream commit e1ad897b9c738d5550be6762bf3a6ef1672259a4 ]
As INI QP does not require a recv_cq, avoid the following null pointer dereference by checking if the qp_type is not INI before trying to extract the recv_cq.
BUG: kernel NULL pointer dereference, address: 00000000000000e0 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 0 PID: 54250 Comm: mpitests-IMB-MP Not tainted 5.12.0-rc5 #1 Hardware name: Dell Inc. PowerEdge R320/0KM5PX, BIOS 2.7.0 08/19/2019 RIP: 0010:qedr_create_qp+0x378/0x820 [qedr] Code: 02 00 00 50 e8 29 d4 a9 d1 48 83 c4 18 e9 65 fe ff ff 48 8b 53 10 48 8b 43 18 44 8b 82 e0 00 00 00 45 85 c0 0f 84 10 74 00 00 <8b> b8 e0 00 00 00 85 ff 0f 85 50 fd ff ff e9 fd 73 00 00 48 8d bd RSP: 0018:ffff9c8f056f7a70 EFLAGS: 00010202 RAX: 0000000000000000 RBX: ffff9c8f056f7b58 RCX: 0000000000000009 RDX: ffff8c41a9744c00 RSI: ffff9c8f056f7b58 RDI: ffff8c41c0dfa280 RBP: ffff8c41c0dfa280 R08: 0000000000000002 R09: 0000000000000001 R10: 0000000000000000 R11: ffff8c41e06fc608 R12: ffff8c4194052000 R13: 0000000000000000 R14: ffff8c4191546070 R15: ffff8c41c0dfa280 FS: 00007f78b2787b80(0000) GS:ffff8c43a3200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000e0 CR3: 00000001011d6002 CR4: 00000000001706f0 Call Trace: ib_uverbs_handler_UVERBS_METHOD_QP_CREATE+0x4e4/0xb90 [ib_uverbs] ? ib_uverbs_cq_event_handler+0x30/0x30 [ib_uverbs] ib_uverbs_run_method+0x6f6/0x7a0 [ib_uverbs] ? ib_uverbs_handler_UVERBS_METHOD_QP_DESTROY+0x70/0x70 [ib_uverbs] ? __cond_resched+0x15/0x30 ? __kmalloc+0x5a/0x440 ib_uverbs_cmd_verbs+0x195/0x360 [ib_uverbs] ? xa_load+0x6e/0x90 ? cred_has_capability+0x7c/0x130 ? avc_has_extended_perms+0x17f/0x440 ? vma_link+0xae/0xb0 ? vma_set_page_prot+0x2a/0x60 ? mmap_region+0x298/0x6c0 ? do_mmap+0x373/0x520 ? selinux_file_ioctl+0x17f/0x220 ib_uverbs_ioctl+0xa7/0x110 [ib_uverbs] __x64_sys_ioctl+0x84/0xc0 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f78b120262b
Fixes: 06e8d1df46ed ("RDMA/qedr: Add support for user mode XRC-SRQ's") Link: https://lore.kernel.org/r/20210404125501.154789-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib kamalheib1@gmail.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/qedr/verbs.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c index 511c95bb3d01..cdfb7732dff3 100644 --- a/drivers/infiniband/hw/qedr/verbs.c +++ b/drivers/infiniband/hw/qedr/verbs.c @@ -1241,7 +1241,8 @@ static int qedr_check_qp_attrs(struct ib_pd *ibpd, struct qedr_dev *dev, * TGT QP isn't associated with RQ/SQ */ if ((attrs->qp_type != IB_QPT_GSI) && (dev->gsi_qp_created) && - (attrs->qp_type != IB_QPT_XRC_TGT)) { + (attrs->qp_type != IB_QPT_XRC_TGT) && + (attrs->qp_type != IB_QPT_XRC_INI)) { struct qedr_cq *send_cq = get_qedr_cq(attrs->send_cq); struct qedr_cq *recv_cq = get_qedr_cq(attrs->recv_cq);
From: Dom Cobley popcornmix@gmail.com
[ Upstream commit eb9dfdd1ed40357b99a4201c8534c58c562e48c9 ]
Experimentally have found PV on hvs4 reports fifo full error with expected settings and does not with one less
This appears as: [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:82:crtc-3] flip_done timed out
with bit 10 of PV_STAT set "HVS driving pixels when the PV FIFO is full"
Fixes: c8b75bca92cb ("drm/vc4: Add KMS support for Raspberry Pi.") Signed-off-by: Dom Cobley popcornmix@gmail.com Signed-off-by: Maxime Ripard maxime@cerno.tech Link: https://patchwork.freedesktop.org/patch/msgid/20210318161328.1471556-3-maxim... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/vc4/vc4_crtc.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)
diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c index 482219fb4db2..1d2416d466a3 100644 --- a/drivers/gpu/drm/vc4/vc4_crtc.c +++ b/drivers/gpu/drm/vc4/vc4_crtc.c @@ -210,6 +210,7 @@ static u32 vc4_get_fifo_full_level(struct vc4_crtc *vc4_crtc, u32 format) { const struct vc4_crtc_data *crtc_data = vc4_crtc_to_vc4_crtc_data(vc4_crtc); const struct vc4_pv_data *pv_data = vc4_crtc_to_vc4_pv_data(vc4_crtc); + struct vc4_dev *vc4 = to_vc4_dev(vc4_crtc->base.dev); u32 fifo_len_bytes = pv_data->fifo_depth;
/* @@ -238,6 +239,22 @@ static u32 vc4_get_fifo_full_level(struct vc4_crtc *vc4_crtc, u32 format) if (crtc_data->hvs_output == 5) return 32;
+ /* + * It looks like in some situations, we will overflow + * the PixelValve FIFO (with the bit 10 of PV stat being + * set) and stall the HVS / PV, eventually resulting in + * a page flip timeout. + * + * Displaying the video overlay during a playback with + * Kodi on an RPi3 seems to be a great solution with a + * failure rate around 50%. + * + * Removing 1 from the FIFO full level however + * seems to completely remove that issue. + */ + if (!vc4->hvs->hvs5) + return fifo_len_bytes - 3 * HVS_FIFO_LATENCY_PIX - 1; + return fifo_len_bytes - 3 * HVS_FIFO_LATENCY_PIX; } }
From: Grzegorz Siwik grzegorz.siwik@intel.com
[ Upstream commit b2d0efc4be7ed320e33eaa9b6dd6f3f6011ffb8e ]
Change parameters order in aq_get_phy_register() due to wrong statistics in PHY reported by ethtool. Previously all PHY statistics were exactly the same for all interfaces Now statistics are reported correctly - different for different interfaces
Fixes: 0514db37dd78 ("i40e: Extend PHY access with page change flag") Signed-off-by: Grzegorz Siwik grzegorz.siwik@intel.com Tested-by: Dave Switzer david.switzer@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/i40e/i40e_ethtool.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c index 849e38be69ff..31d48a85cfaf 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c +++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c @@ -5285,7 +5285,7 @@ static int i40e_get_module_eeprom(struct net_device *netdev,
status = i40e_aq_get_phy_register(hw, I40E_AQ_PHY_REG_ACCESS_EXTERNAL_MODULE, - true, addr, offset, &value, NULL); + addr, true, offset, &value, NULL); if (status) return -EIO; data[i] = value;
From: Leon Romanovsky leonro@nvidia.com
[ Upstream commit d1c803a9ccd7bd3aff5e989ccfb39ed3b799b975 ]
The nla_len() is less than or equal to 16. If it's less than 16 then end of the "gid" buffer is uninitialized.
Fixes: ae43f8286730 ("IB/core: Add IP to GID netlink offload") Link: https://lore.kernel.org/r/20210405074434.264221-1-leon@kernel.org Reported-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Mark Bloch mbloch@nvidia.com Signed-off-by: Leon Romanovsky leonro@nvidia.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/core/addr.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c index 0abce004a959..65e3e7df8a4b 100644 --- a/drivers/infiniband/core/addr.c +++ b/drivers/infiniband/core/addr.c @@ -76,7 +76,9 @@ static struct workqueue_struct *addr_wq;
static const struct nla_policy ib_nl_addr_policy[LS_NLA_TYPE_MAX] = { [LS_NLA_TYPE_DGID] = {.type = NLA_BINARY, - .len = sizeof(struct rdma_nla_ls_gid)}, + .len = sizeof(struct rdma_nla_ls_gid), + .validation_type = NLA_VALIDATE_MIN, + .min = sizeof(struct rdma_nla_ls_gid)}, };
static inline bool ib_nl_is_good_ip_resp(const struct nlmsghdr *nlh)
From: Si-Wei Liu si-wei.liu@oracle.com
[ Upstream commit d084d996aaf53c0cc583dc75a4fc2a67fe485846 ]
When feature VIRTIO_NET_F_MTU is negotiated on mlx5_vdpa, 22 extra bytes worth of MTU length is shown in guest. This is because the mlx5_query_port_max_mtu API returns the "hardware" MTU value, which does not just contain the Ethernet payload, but includes extra lengths starting from the Ethernet header up to the FCS altogether.
Fix the MTU so packets won't get dropped silently.
Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Si-Wei Liu si-wei.liu@oracle.com Acked-by: Jason Wang jasowang@redhat.com Acked-by: Eli Cohen elic@nvidia.com Link: https://lore.kernel.org/r/20210408091047.4269-2-elic@nvidia.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/vdpa/mlx5/core/mlx5_vdpa.h | 4 ++++ drivers/vdpa/mlx5/net/mlx5_vnet.c | 15 ++++++++++++++- 2 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h index 08f742fd2409..b6cc53ba980c 100644 --- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h +++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h @@ -4,9 +4,13 @@ #ifndef __MLX5_VDPA_H__ #define __MLX5_VDPA_H__
+#include <linux/etherdevice.h> +#include <linux/if_vlan.h> #include <linux/vdpa.h> #include <linux/mlx5/driver.h>
+#define MLX5V_ETH_HARD_MTU (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN) + struct mlx5_vdpa_direct_mr { u64 start; u64 end; diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c index 36e5933f11bc..545160ee2a62 100644 --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c @@ -1887,6 +1887,19 @@ static const struct vdpa_config_ops mlx5_vdpa_ops = { .free = mlx5_vdpa_free, };
+static int query_mtu(struct mlx5_core_dev *mdev, u16 *mtu) +{ + u16 hw_mtu; + int err; + + err = mlx5_query_nic_vport_mtu(mdev, &hw_mtu); + if (err) + return err; + + *mtu = hw_mtu - MLX5V_ETH_HARD_MTU; + return 0; +} + static int alloc_resources(struct mlx5_vdpa_net *ndev) { struct mlx5_vdpa_net_resources *res = &ndev->res; @@ -1969,7 +1982,7 @@ void *mlx5_vdpa_add_dev(struct mlx5_core_dev *mdev) init_mvqs(ndev); mutex_init(&ndev->reslock); config = &ndev->config; - err = mlx5_query_nic_vport_mtu(mdev, &ndev->mtu); + err = query_mtu(mdev, &ndev->mtu); if (err) goto err_mtu;
From: Eli Cohen elic@nvidia.com
[ Upstream commit 4b454a82418dd76d8c0590bb3f7a99a63ea57dc5 ]
VIRTIO_F_VERSION_1 is a bit number. Use BIT_ULL() with mask conditionals.
Also, in mlx5_vdpa_is_little_endian() use BIT_ULL for consistency with the rest of the code.
Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Eli Cohen elic@nvidia.com Link: https://lore.kernel.org/r/20210408091047.4269-5-elic@nvidia.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Acked-by: Jason Wang jasowang@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/vdpa/mlx5/net/mlx5_vnet.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c index 545160ee2a62..65cfbd377130 100644 --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c @@ -805,7 +805,7 @@ static int create_virtqueue(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtque MLX5_SET(virtio_q, vq_ctx, event_qpn_or_msix, mvq->fwqp.mqp.qpn); MLX5_SET(virtio_q, vq_ctx, queue_size, mvq->num_ent); MLX5_SET(virtio_q, vq_ctx, virtio_version_1_0, - !!(ndev->mvdev.actual_features & VIRTIO_F_VERSION_1)); + !!(ndev->mvdev.actual_features & BIT_ULL(VIRTIO_F_VERSION_1))); MLX5_SET64(virtio_q, vq_ctx, desc_addr, mvq->desc_addr); MLX5_SET64(virtio_q, vq_ctx, used_addr, mvq->device_addr); MLX5_SET64(virtio_q, vq_ctx, available_addr, mvq->driver_addr); @@ -1535,7 +1535,7 @@ static void teardown_virtqueues(struct mlx5_vdpa_net *ndev) static inline bool mlx5_vdpa_is_little_endian(struct mlx5_vdpa_dev *mvdev) { return virtio_legacy_is_little_endian() || - (mvdev->actual_features & (1ULL << VIRTIO_F_VERSION_1)); + (mvdev->actual_features & BIT_ULL(VIRTIO_F_VERSION_1)); }
static __virtio16 cpu_to_mlx5vdpa16(struct mlx5_vdpa_dev *mvdev, u16 val)
From: William Roche william.roche@oracle.com
commit 3a62583c2853b0ab37a57dde79decea210b5fb89 upstream.
ce_add_elem() uses different return values to signal a result from adding an element to the collector. Commit in Fixes: broke the case where the element being added is not found in the array. Correct that.
[ bp: Rewrite commit message, add kernel-doc comments. ]
Fixes: de0e0624d86f ("RAS/CEC: Check count_threshold unconditionally") Signed-off-by: William Roche william.roche@oracle.com Signed-off-by: Borislav Petkov bp@suse.de Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1617722939-29670-1-git-send-email-william.roche@or... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/ras/cec.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-)
--- a/drivers/ras/cec.c +++ b/drivers/ras/cec.c @@ -309,11 +309,20 @@ static bool sanity_check(struct ce_array return ret; }
+/** + * cec_add_elem - Add an element to the CEC array. + * @pfn: page frame number to insert + * + * Return values: + * - <0: on error + * - 0: on success + * - >0: when the inserted pfn was offlined + */ static int cec_add_elem(u64 pfn) { struct ce_array *ca = &ce_arr; + int count, err, ret = 0; unsigned int to = 0; - int count, ret = 0;
/* * We can be called very early on the identify_cpu() path where we are @@ -330,8 +339,8 @@ static int cec_add_elem(u64 pfn) if (ca->n == MAX_ELEMS) WARN_ON(!del_lru_elem_unlocked(ca));
- ret = find_elem(ca, pfn, &to); - if (ret < 0) { + err = find_elem(ca, pfn, &to); + if (err < 0) { /* * Shift range [to-end] to make room for one more element. */
From: Krzysztof Kozlowski krzysztof.kozlowski@canonical.com
commit 2867b9746cef78745c594894aece6f8ef826e0b4 upstream.
Pointers should be cast with uintptr_t instead of integer. This fixes warning when compile testing on ARM64:
drivers/clk/socfpga/clk-gate.c: In function ‘socfpga_clk_recalc_rate’: drivers/clk/socfpga/clk-gate.c:102:7: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
Fixes: b7cec13f082f ("clk: socfpga: Look for the GPIO_DB_CLK by its offset") Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@canonical.com Acked-by: Dinh Nguyen dinguyen@kernel.org Link: https://lore.kernel.org/r/20210314110709.32599-1-krzysztof.kozlowski@canonic... Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clk/socfpga/clk-gate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/clk/socfpga/clk-gate.c +++ b/drivers/clk/socfpga/clk-gate.c @@ -99,7 +99,7 @@ static unsigned long socfpga_clk_recalc_ val = readl(socfpgaclk->div_reg) >> socfpgaclk->shift; val &= GENMASK(socfpgaclk->width - 1, 0); /* Check for GPIO_DB_CLK by its offset */ - if ((int) socfpgaclk->div_reg & SOCFPGA_GPIO_DB_CLK_OFFSET) + if ((uintptr_t) socfpgaclk->div_reg & SOCFPGA_GPIO_DB_CLK_OFFSET) div = val + 1; else div = (1 << val);
From: Arnd Bergmann arnd@arndb.de
commit 6d48b7912cc72275dc7c59ff961c8bac7ef66a92 upstream.
Clang doesn't like format strings that truncate a 32-bit value to something shorter:
kernel/locking/lockdep.c:709:4: error: format specifies type 'short' but the argument has type 'int' [-Werror,-Wformat]
In this case, the warning is a slightly questionable, as it could realize that both class->wait_type_outer and class->wait_type_inner are in fact 8-bit struct members, even though the result of the ?: operator becomes an 'int'.
However, there is really no point in printing the number as a 16-bit 'short' rather than either an 8-bit or 32-bit number, so just change it to a normal %d.
Fixes: de8f5e4f2dc1 ("lockdep: Introduce wait-type checks") Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/20210322115531.3987555-1-arnd@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/locking/lockdep.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/locking/lockdep.c +++ b/kernel/locking/lockdep.c @@ -705,7 +705,7 @@ static void print_lock_name(struct lock_
printk(KERN_CONT " ("); __print_lock_name(class); - printk(KERN_CONT "){%s}-{%hd:%hd}", usage, + printk(KERN_CONT "){%s}-{%d:%d}", usage, class->wait_type_outer ?: class->wait_type_inner, class->wait_type_inner); }
From: Rafał Miłecki rafal@milecki.pl
commit af9d316f3dd6d1385fbd1631b5103e620fc4298a upstream.
The correct property name is "nvmem-cell-names". This is what: 1. Was originally documented in the ethernet.txt 2. Is used in DTS files 3. Matches standard syntax for phandles 4. Linux net subsystem checks for
Fixes: 9d3de3c58347 ("dt-bindings: net: Add YAML schemas for the generic Ethernet options") Signed-off-by: Rafał Miłecki rafal@milecki.pl Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/devicetree/bindings/net/ethernet-controller.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/Documentation/devicetree/bindings/net/ethernet-controller.yaml +++ b/Documentation/devicetree/bindings/net/ethernet-controller.yaml @@ -49,7 +49,7 @@ properties: description: Reference to an nvmem node for the MAC address
- nvmem-cells-names: + nvmem-cell-names: const: mac-address
phy-connection-type:
From: Kumar Kartikeya Dwivedi memxor@gmail.com
commit 6855e8213e06efcaf7c02a15e12b1ae64b9a7149 upstream.
Currently, action creation using ACT API in replace mode is buggy. When invoking for non-existent action index 42,
tc action replace action bpf obj foo.o sec <xyz> index 42
kernel creates the action, fills up the netlink response, and then just deletes the action after notifying userspace.
tc action show action bpf
doesn't list the action.
This happens due to the following sequence when ovr = 1 (replace mode) is enabled:
tcf_idr_check_alloc is used to atomically check and either obtain reference for existing action at index, or reserve the index slot using a dummy entry (ERR_PTR(-EBUSY)).
This is necessary as pointers to these actions will be held after dropping the idrinfo lock, so bumping the reference count is necessary as we need to insert the actions, and notify userspace by dumping their attributes. Finally, we drop the reference we took using the tcf_action_put_many call in tcf_action_add. However, for the case where a new action is created due to free index, its refcount remains one. This when paired with the put_many call leads to the kernel setting up the action, notifying userspace of its creation, and then tearing it down. For existing actions, the refcount is still held so they remain unaffected.
Fortunately due to rtnl_lock serialization requirement, such an action with refcount == 1 will not be concurrently deleted by anything else, at best CLS API can move its refcount up and down by binding to it after it has been published from tcf_idr_insert_many. Since refcount is atleast one until put_many call, CLS API cannot delete it. Also __tcf_action_put release path already ensures deterministic outcome (either new action will be created or existing action will be reused in case CLS API tries to bind to action concurrently) due to idr lock serialization.
We fix this by making refcount of newly created actions as 2 in ACT API replace mode. A relaxed store will suffice as visibility is ensured only after the tcf_idr_insert_many call.
Note that in case of creation or overwriting using CLS API only (i.e. bind = 1), overwriting existing action object is not allowed, and any such request is silently ignored (without error).
The refcount bump that occurs in tcf_idr_check_alloc call there for existing action will pair with tcf_exts_destroy call made from the owner module for the same action. In case of action creation, there is no existing action, so no tcf_exts_destroy callback happens.
This means no code changes for CLS API.
Fixes: cae422f379f3 ("net: sched: use reference counting action init") Signed-off-by: Kumar Kartikeya Dwivedi memxor@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/act_api.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -1029,6 +1029,9 @@ struct tc_action *tcf_action_init_1(stru if (!name) a->hw_stats = hw_stats;
+ if (!bind && ovr && err == ACT_P_CREATED) + refcount_set(&a->tcfa_refcnt, 2); + return a;
err_out:
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
commit b41ba2ec54a70908067034f139aa23d0dd2985ce upstream.
On STM32MP1, the GPIO banks are subnodes of pin-controller@50002000, see arch/arm/boot/dts/stm32mp151.dtsi. The driver for pin-controller@50002000 is in drivers/pinctrl/stm32/pinctrl-stm32.c and iterates over all of its DT subnodes when registering each GPIO bank gpiochip. Each gpiochip has:
- gpio_chip.parent = dev, where dev is the device node of the pin controller - gpio_chip.of_node = np, which is the OF node of the GPIO bank
Therefore, dev_fwnode(chip->parent) != of_fwnode_handle(chip.of_node), i.e. pin-controller@50002000 != pin-controller@50002000/gpio@5000*000.
The original code behaved correctly, as it extracted the "gpio-line-names" from of_fwnode_handle(chip.of_node) = pin-controller@50002000/gpio@5000*000.
To achieve the same behaviour, read property from the firmware node.
Fixes: 7cba1a4d5e162 ("gpiolib: generalize devprop_gpiochip_set_names() for device properties") Reported-by: Marek Vasut marex@denx.de Reported-by: Roman Guskov rguskov@dh-electronics.com Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Tested-by: Marek Vasut marex@denx.de Reviewed-by: Marek Vasut marex@denx.de Signed-off-by: Bartosz Golaszewski bgolaszewski@baylibre.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpio/gpiolib.c | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-)
--- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -364,22 +364,18 @@ static int gpiochip_set_desc_names(struc * * Looks for device property "gpio-line-names" and if it exists assigns * GPIO line names for the chip. The memory allocated for the assigned - * names belong to the underlying software node and should not be released + * names belong to the underlying firmware node and should not be released * by the caller. */ static int devprop_gpiochip_set_names(struct gpio_chip *chip) { struct gpio_device *gdev = chip->gpiodev; - struct device *dev = chip->parent; + struct fwnode_handle *fwnode = dev_fwnode(&gdev->dev); const char **names; int ret, i; int count;
- /* GPIO chip may not have a parent device whose properties we inspect. */ - if (!dev) - return 0; - - count = device_property_string_array_count(dev, "gpio-line-names"); + count = fwnode_property_string_array_count(fwnode, "gpio-line-names"); if (count < 0) return 0;
@@ -393,7 +389,7 @@ static int devprop_gpiochip_set_names(st if (!names) return -ENOMEM;
- ret = device_property_read_string_array(dev, "gpio-line-names", + ret = fwnode_property_read_string_array(fwnode, "gpio-line-names", names, count); if (ret < 0) { dev_warn(&gdev->dev, "failed to read GPIO line names\n");
From: Du Cheng ducheng2@gmail.com
commit 1b5ab825d9acc0f27d2f25c6252f3526832a9626 upstream.
A WARN_ON(wdev->conn) would trigger in cfg80211_sme_connect(), if multiple send_msg(NL80211_CMD_CONNECT) system calls are made from the userland, which should be anticipated and handled by the wireless driver. Remove this WARN_ON() to prevent kernel panic if kernel is configured to "panic_on_warn".
Bug reported by syzbot.
Reported-by: syzbot+5f9392825de654244975@syzkaller.appspotmail.com Signed-off-by: Du Cheng ducheng2@gmail.com Link: https://lore.kernel.org/r/20210407162756.6101-1-ducheng2@gmail.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/wireless/sme.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/wireless/sme.c +++ b/net/wireless/sme.c @@ -530,7 +530,7 @@ static int cfg80211_sme_connect(struct w cfg80211_sme_free(wdev); }
- if (WARN_ON(wdev->conn)) + if (wdev->conn) return -EINPROGRESS;
wdev->conn = kzalloc(sizeof(*wdev->conn), GFP_KERNEL);
From: Phillip Potter phil@philpotter.co.uk
commit cca8ea3b05c972ffb5295367e6c544369b45fbdd upstream.
When changing type with TUNSETLINK ioctl command, set tun->dev->addr_len to match the appropriate type, using new tun_get_addr_len utility function which returns appropriate address length for given type. Fixes a KMSAN-found uninit-value bug reported by syzbot at: https://syzkaller.appspot.com/bug?id=0766d38c656abeace60621896d705743aeefed5...
Reported-by: syzbot+001516d86dbe88862cec@syzkaller.appspotmail.com Diagnosed-by: Eric Dumazet edumazet@google.com Signed-off-by: Phillip Potter phil@philpotter.co.uk Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/tun.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 48 insertions(+)
--- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -69,6 +69,14 @@ #include <linux/bpf.h> #include <linux/bpf_trace.h> #include <linux/mutex.h> +#include <linux/ieee802154.h> +#include <linux/if_ltalk.h> +#include <uapi/linux/if_fddi.h> +#include <uapi/linux/if_hippi.h> +#include <uapi/linux/if_fc.h> +#include <net/ax25.h> +#include <net/rose.h> +#include <net/6lowpan.h>
#include <linux/uaccess.h> #include <linux/proc_fs.h> @@ -2978,6 +2986,45 @@ static int tun_set_ebpf(struct tun_struc return __tun_set_ebpf(tun, prog_p, prog); }
+/* Return correct value for tun->dev->addr_len based on tun->dev->type. */ +static unsigned char tun_get_addr_len(unsigned short type) +{ + switch (type) { + case ARPHRD_IP6GRE: + case ARPHRD_TUNNEL6: + return sizeof(struct in6_addr); + case ARPHRD_IPGRE: + case ARPHRD_TUNNEL: + case ARPHRD_SIT: + return 4; + case ARPHRD_ETHER: + return ETH_ALEN; + case ARPHRD_IEEE802154: + case ARPHRD_IEEE802154_MONITOR: + return IEEE802154_EXTENDED_ADDR_LEN; + case ARPHRD_PHONET_PIPE: + case ARPHRD_PPP: + case ARPHRD_NONE: + return 0; + case ARPHRD_6LOWPAN: + return EUI64_ADDR_LEN; + case ARPHRD_FDDI: + return FDDI_K_ALEN; + case ARPHRD_HIPPI: + return HIPPI_ALEN; + case ARPHRD_IEEE802: + return FC_ALEN; + case ARPHRD_ROSE: + return ROSE_ADDR_LEN; + case ARPHRD_NETROM: + return AX25_ADDR_LEN; + case ARPHRD_LOCALTLK: + return LTALK_ALEN; + default: + return 0; + } +} + static long __tun_chr_ioctl(struct file *file, unsigned int cmd, unsigned long arg, int ifreq_len) { @@ -3133,6 +3180,7 @@ static long __tun_chr_ioctl(struct file ret = -EBUSY; } else { tun->dev->type = (int) arg; + tun->dev->addr_len = tun_get_addr_len(tun->dev->type); netif_info(tun, drv, tun->dev, "linktype set to %d\n", tun->dev->type); ret = 0;
From: Pavel Skripkin paskripkin@gmail.com
commit 6b9fbe16955152626557ec6f439f3407b7769941 upstream.
syzbot reported memory leak in atusb_probe()[1]. The problem was in atusb_alloc_urbs(). Since urb is anchored, we need to release the reference to correctly free the urb
backtrace: [<ffffffff82ba0466>] kmalloc include/linux/slab.h:559 [inline] [<ffffffff82ba0466>] usb_alloc_urb+0x66/0xe0 drivers/usb/core/urb.c:74 [<ffffffff82ad3888>] atusb_alloc_urbs drivers/net/ieee802154/atusb.c:362 [inline][2] [<ffffffff82ad3888>] atusb_probe+0x158/0x820 drivers/net/ieee802154/atusb.c:1038 [1]
Reported-by: syzbot+28a246747e0a465127f3@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin paskripkin@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ieee802154/atusb.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/ieee802154/atusb.c +++ b/drivers/net/ieee802154/atusb.c @@ -365,6 +365,7 @@ static int atusb_alloc_urbs(struct atusb return -ENOMEM; } usb_anchor_urb(urb, &atusb->idle_urbs); + usb_free_urb(urb); n--; } return 0;
From: Pavel Skripkin paskripkin@gmail.com
commit a0b96b4a62745397aee662670cfc2157bac03f55 upstream.
syzbot reported memory leak in peak_usb. The problem was in case of failure after calling ->dev_init()[2] in peak_usb_create_dev()[1]. The data allocated int dev_init() wasn't freed, so simple ->dev_free() call fix this problem.
backtrace: [<0000000079d6542a>] kmalloc include/linux/slab.h:552 [inline] [<0000000079d6542a>] kzalloc include/linux/slab.h:682 [inline] [<0000000079d6542a>] pcan_usb_fd_init+0x156/0x210 drivers/net/can/usb/peak_usb/pcan_usb_fd.c:868 [2] [<00000000c09f9057>] peak_usb_create_dev drivers/net/can/usb/peak_usb/pcan_usb_core.c:851 [inline] [1] [<00000000c09f9057>] peak_usb_probe+0x389/0x490 drivers/net/can/usb/peak_usb/pcan_usb_core.c:949
Reported-by: syzbot+91adee8d9ebb9193d22d@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin paskripkin@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/usb/peak_usb/pcan_usb_core.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/net/can/usb/peak_usb/pcan_usb_core.c +++ b/drivers/net/can/usb/peak_usb/pcan_usb_core.c @@ -856,7 +856,7 @@ static int peak_usb_create_dev(const str if (dev->adapter->dev_set_bus) { err = dev->adapter->dev_set_bus(dev, 0); if (err) - goto lbl_unregister_candev; + goto adap_dev_free; }
/* get device number early */ @@ -868,6 +868,10 @@ static int peak_usb_create_dev(const str
return 0;
+adap_dev_free: + if (dev->adapter->dev_free) + dev->adapter->dev_free(dev); + lbl_unregister_candev: unregister_candev(netdev);
From: Pavel Skripkin paskripkin@gmail.com
commit 1165affd484889d4986cf3b724318935a0b120d8 upstream.
syzbot found general protection fault in crypto_destroy_tfm()[1]. It was caused by wrong clean up loop in llsec_key_alloc(). If one of the tfm array members is in IS_ERR() range it will cause general protection fault in clean up function [1].
Call Trace: crypto_free_aead include/crypto/aead.h:191 [inline] [1] llsec_key_alloc net/mac802154/llsec.c:156 [inline] mac802154_llsec_key_add+0x9e0/0xcc0 net/mac802154/llsec.c:249 ieee802154_add_llsec_key+0x56/0x80 net/mac802154/cfg.c:338 rdev_add_llsec_key net/ieee802154/rdev-ops.h:260 [inline] nl802154_add_llsec_key+0x3d3/0x560 net/ieee802154/nl802154.c:1584 genl_family_rcv_msg_doit+0x228/0x320 net/netlink/genetlink.c:739 genl_family_rcv_msg net/netlink/genetlink.c:783 [inline] genl_rcv_msg+0x328/0x580 net/netlink/genetlink.c:800 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502 genl_rcv+0x24/0x40 net/netlink/genetlink.c:811 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:674 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2350 ___sys_sendmsg+0xf3/0x170 net/socket.c:2404 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2433 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xae
Signed-off-by: Pavel Skripkin paskripkin@gmail.com Reported-by: syzbot+9ec037722d2603a9f52e@syzkaller.appspotmail.com Acked-by: Alexander Aring aahringo@redhat.com Link: https://lore.kernel.org/r/20210304152125.1052825-1-paskripkin@gmail.com Signed-off-by: Stefan Schmidt stefan@datenfreihafen.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mac802154/llsec.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/mac802154/llsec.c +++ b/net/mac802154/llsec.c @@ -152,7 +152,7 @@ err_tfm0: crypto_free_sync_skcipher(key->tfm0); err_tfm: for (i = 0; i < ARRAY_SIZE(key->tfm); i++) - if (key->tfm[i]) + if (!IS_ERR_OR_NULL(key->tfm[i])) crypto_free_aead(key->tfm[i]);
kfree_sensitive(key);
From: Alexander Aring aahringo@redhat.com
commit 6f7f657f24405f426212c09260bf7fe8a52cef33 upstream.
This patch fixes a null pointer derefence for panid handle by move the check for the netlink variable directly before accessing them.
Reported-by: syzbot+d4c07de0144f6f63be3a@syzkaller.appspotmail.com Signed-off-by: Alexander Aring aahringo@redhat.com Link: https://lore.kernel.org/r/20210228151817.95700-4-aahringo@redhat.com Signed-off-by: Stefan Schmidt stefan@datenfreihafen.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ieee802154/nl-mac.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/net/ieee802154/nl-mac.c +++ b/net/ieee802154/nl-mac.c @@ -551,9 +551,7 @@ ieee802154_llsec_parse_key_id(struct gen desc->mode = nla_get_u8(info->attrs[IEEE802154_ATTR_LLSEC_KEY_MODE]);
if (desc->mode == IEEE802154_SCF_KEY_IMPLICIT) { - if (!info->attrs[IEEE802154_ATTR_PAN_ID] && - !(info->attrs[IEEE802154_ATTR_SHORT_ADDR] || - info->attrs[IEEE802154_ATTR_HW_ADDR])) + if (!info->attrs[IEEE802154_ATTR_PAN_ID]) return -EINVAL;
desc->device_addr.pan_id = nla_get_shortaddr(info->attrs[IEEE802154_ATTR_PAN_ID]); @@ -562,6 +560,9 @@ ieee802154_llsec_parse_key_id(struct gen desc->device_addr.mode = IEEE802154_ADDR_SHORT; desc->device_addr.short_addr = nla_get_shortaddr(info->attrs[IEEE802154_ATTR_SHORT_ADDR]); } else { + if (!info->attrs[IEEE802154_ATTR_HW_ADDR]) + return -EINVAL; + desc->device_addr.mode = IEEE802154_ADDR_LONG; desc->device_addr.extended_addr = nla_get_hwaddr(info->attrs[IEEE802154_ATTR_HW_ADDR]); }
From: Alexander Aring aahringo@redhat.com
commit 37feaaf5ceb2245e474369312bb7b922ce7bce69 upstream.
This patch fixes a nullpointer dereference if NL802154_ATTR_SEC_KEY is not set by the user. If this is the case nl802154 will return -EINVAL.
Reported-by: syzbot+ac5c11d2959a8b3c4806@syzkaller.appspotmail.com Signed-off-by: Alexander Aring aahringo@redhat.com Link: https://lore.kernel.org/r/20210221174321.14210-1-aahringo@redhat.com Signed-off-by: Stefan Schmidt stefan@datenfreihafen.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ieee802154/nl802154.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/ieee802154/nl802154.c +++ b/net/ieee802154/nl802154.c @@ -1592,7 +1592,8 @@ static int nl802154_del_llsec_key(struct struct nlattr *attrs[NL802154_KEY_ATTR_MAX + 1]; struct ieee802154_llsec_key_id id;
- if (nla_parse_nested_deprecated(attrs, NL802154_KEY_ATTR_MAX, info->attrs[NL802154_ATTR_SEC_KEY], nl802154_key_policy, info->extack)) + if (!info->attrs[NL802154_ATTR_SEC_KEY] || + nla_parse_nested_deprecated(attrs, NL802154_KEY_ATTR_MAX, info->attrs[NL802154_ATTR_SEC_KEY], nl802154_key_policy, info->extack)) return -EINVAL;
if (ieee802154_llsec_parse_key_id(attrs[NL802154_KEY_ATTR_ID], &id) < 0)
From: Alexander Aring aahringo@redhat.com
commit 3d1eac2f45585690d942cf47fd7fbd04093ebd1b upstream.
This patch fixes a nullpointer dereference if NL802154_ATTR_SEC_DEVICE is not set by the user. If this is the case nl802154 will return -EINVAL.
Reported-by: syzbot+d946223c2e751d136c94@syzkaller.appspotmail.com Signed-off-by: Alexander Aring aahringo@redhat.com Link: https://lore.kernel.org/r/20210221174321.14210-2-aahringo@redhat.com Signed-off-by: Stefan Schmidt stefan@datenfreihafen.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ieee802154/nl802154.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/ieee802154/nl802154.c +++ b/net/ieee802154/nl802154.c @@ -1758,7 +1758,8 @@ static int nl802154_del_llsec_dev(struct struct nlattr *attrs[NL802154_DEV_ATTR_MAX + 1]; __le64 extended_addr;
- if (nla_parse_nested_deprecated(attrs, NL802154_DEV_ATTR_MAX, info->attrs[NL802154_ATTR_SEC_DEVICE], nl802154_dev_policy, info->extack)) + if (!info->attrs[NL802154_ATTR_SEC_DEVICE] || + nla_parse_nested_deprecated(attrs, NL802154_DEV_ATTR_MAX, info->attrs[NL802154_ATTR_SEC_DEVICE], nl802154_dev_policy, info->extack)) return -EINVAL;
if (!attrs[NL802154_DEV_ATTR_EXTENDED_ADDR])
From: Alexander Aring aahringo@redhat.com
commit 20d5fe2d7103f5c43ad11a3d6d259e9d61165c35 upstream.
This patch fixes a nullpointer dereference if NL802154_ATTR_SEC_KEY is not set by the user. If this is the case nl802154 will return -EINVAL.
Reported-by: syzbot+ce4e062c2d51977ddc50@syzkaller.appspotmail.com Signed-off-by: Alexander Aring aahringo@redhat.com Link: https://lore.kernel.org/r/20210221174321.14210-3-aahringo@redhat.com Signed-off-by: Stefan Schmidt stefan@datenfreihafen.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ieee802154/nl802154.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/ieee802154/nl802154.c +++ b/net/ieee802154/nl802154.c @@ -1544,7 +1544,8 @@ static int nl802154_add_llsec_key(struct struct ieee802154_llsec_key_id id = { }; u32 commands[NL802154_CMD_FRAME_NR_IDS / 32] = { };
- if (nla_parse_nested_deprecated(attrs, NL802154_KEY_ATTR_MAX, info->attrs[NL802154_ATTR_SEC_KEY], nl802154_key_policy, info->extack)) + if (!info->attrs[NL802154_ATTR_SEC_KEY] || + nla_parse_nested_deprecated(attrs, NL802154_KEY_ATTR_MAX, info->attrs[NL802154_ATTR_SEC_KEY], nl802154_key_policy, info->extack)) return -EINVAL;
if (!attrs[NL802154_KEY_ATTR_USAGE_FRAMES] ||
From: Alexander Aring aahringo@redhat.com
commit 27c746869e1a135dffc2f2a80715bb7aa00445b4 upstream.
This patch fixes a nullpointer dereference if NL802154_ATTR_SEC_DEVKEY is not set by the user. If this is the case nl802154 will return -EINVAL.
Reported-by: syzbot+368672e0da240db53b5f@syzkaller.appspotmail.com Signed-off-by: Alexander Aring aahringo@redhat.com Link: https://lore.kernel.org/r/20210221174321.14210-4-aahringo@redhat.com Signed-off-by: Stefan Schmidt stefan@datenfreihafen.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ieee802154/nl802154.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/ieee802154/nl802154.c +++ b/net/ieee802154/nl802154.c @@ -1916,7 +1916,8 @@ static int nl802154_del_llsec_devkey(str struct ieee802154_llsec_device_key key; __le64 extended_addr;
- if (nla_parse_nested_deprecated(attrs, NL802154_DEVKEY_ATTR_MAX, info->attrs[NL802154_ATTR_SEC_DEVKEY], nl802154_devkey_policy, info->extack)) + if (!info->attrs[NL802154_ATTR_SEC_DEVKEY] || + nla_parse_nested_deprecated(attrs, NL802154_DEVKEY_ATTR_MAX, info->attrs[NL802154_ATTR_SEC_DEVKEY], nl802154_devkey_policy, info->extack)) return -EINVAL;
if (!attrs[NL802154_DEVKEY_ATTR_EXTENDED_ADDR])
From: Alexander Aring aahringo@redhat.com
commit 88c17855ac4291fb462e13a86b7516773b6c932e upstream.
This patch forbids to set llsec params for monitor interfaces which we don't support yet.
Reported-by: syzbot+8b6719da8a04beeafcc3@syzkaller.appspotmail.com Signed-off-by: Alexander Aring aahringo@redhat.com Link: https://lore.kernel.org/r/20210405003054.256017-3-aahringo@redhat.com Signed-off-by: Stefan Schmidt stefan@datenfreihafen.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ieee802154/nl802154.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/net/ieee802154/nl802154.c +++ b/net/ieee802154/nl802154.c @@ -1384,6 +1384,9 @@ static int nl802154_set_llsec_params(str u32 changed = 0; int ret;
+ if (wpan_dev->iftype == NL802154_IFTYPE_MONITOR) + return -EOPNOTSUPP; + if (info->attrs[NL802154_ATTR_SEC_ENABLED]) { u8 enabled;
From: Alexander Aring aahringo@redhat.com
commit 9dde130937e95b72adfae64ab21d6e7e707e2dac upstream.
This patch forbids to del llsec seclevel for monitor interfaces which we don't support yet. Otherwise we will access llsec mib which isn't initialized for monitors.
Reported-by: syzbot+fbf4fc11a819824e027b@syzkaller.appspotmail.com Signed-off-by: Alexander Aring aahringo@redhat.com Link: https://lore.kernel.org/r/20210405003054.256017-15-aahringo@redhat.com Signed-off-by: Stefan Schmidt stefan@datenfreihafen.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ieee802154/nl802154.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/net/ieee802154/nl802154.c +++ b/net/ieee802154/nl802154.c @@ -2092,6 +2092,9 @@ static int nl802154_del_llsec_seclevel(s struct wpan_dev *wpan_dev = dev->ieee802154_ptr; struct ieee802154_llsec_seclevel sl;
+ if (wpan_dev->iftype == NL802154_IFTYPE_MONITOR) + return -EOPNOTSUPP; + if (!info->attrs[NL802154_ATTR_SEC_LEVEL] || llsec_parse_seclevel(info->attrs[NL802154_ATTR_SEC_LEVEL], &sl) < 0)
From: Alexander Aring aahringo@redhat.com
commit 1534efc7bbc1121e92c86c2dabebaf2c9dcece19 upstream.
This patch stops dumping llsec params for monitors which we don't support yet. Otherwise we will access llsec mib which isn't initialized for monitors.
Reported-by: syzbot+cde43a581a8e5f317bc2@syzkaller.appspotmail.com Signed-off-by: Alexander Aring aahringo@redhat.com Link: https://lore.kernel.org/r/20210405003054.256017-16-aahringo@redhat.com Signed-off-by: Stefan Schmidt stefan@datenfreihafen.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ieee802154/nl802154.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/net/ieee802154/nl802154.c +++ b/net/ieee802154/nl802154.c @@ -820,8 +820,13 @@ nl802154_send_iface(struct sk_buff *msg, goto nla_put_failure;
#ifdef CONFIG_IEEE802154_NL802154_EXPERIMENTAL + if (wpan_dev->iftype == NL802154_IFTYPE_MONITOR) + goto out; + if (nl802154_get_llsec_params(msg, rdev, wpan_dev) < 0) goto nla_put_failure; + +out: #endif /* CONFIG_IEEE802154_NL802154_EXPERIMENTAL */
genlmsg_end(msg, hdr);
From: Vlad Buslov vladbu@nvidia.com
commit 4ba86128ba077fbb7d86516ae24ed642e6c3adef upstream.
This reverts commit 6855e8213e06efcaf7c02a15e12b1ae64b9a7149.
Following commit in series fixes the issue without introducing regression in error rollback of tcf_action_destroy().
Signed-off-by: Vlad Buslov vladbu@nvidia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/act_api.c | 3 --- 1 file changed, 3 deletions(-)
--- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -1029,9 +1029,6 @@ struct tc_action *tcf_action_init_1(stru if (!name) a->hw_stats = hw_stats;
- if (!bind && ovr && err == ACT_P_CREATED) - refcount_set(&a->tcfa_refcnt, 2); - return a;
err_out:
Hello all,
On Mon, Apr 12, 2021 at 10:38:34AM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.10.30 release. There are 188 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 14 Apr 2021 08:39:44 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.30-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
thanks,
greg k-h
I tested 5.10.30-rc1 on my i686, the kernel was compiled with gcc 10.2 using make localmodconfig. I found no issues to compile and run the kernel.
Selftest resluts [ok/not ok]: [1434/82]
"Regression" (compared with 5.10.29-rc1):
75d74 < not ok 1 selftests: rtc: rtctest # TIMEOUT 90 seconds 80d78 < not ok 9 selftests: timers: rtcpie # exit=255
These test results were fluctuating for the previous kernels as well, thus I can't conclude that this "regression" is actually a real regression.
Tested-by: Andrei Rabusov a.rabusov@tum.de
On Mon, 12 Apr 2021 10:38:34 +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.10.30 release. There are 188 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 14 Apr 2021 08:39:44 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.30-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v5.10: 12 builds: 12 pass, 0 fail 28 boots: 28 pass, 0 fail 70 tests: 70 pass, 0 fail
Linux version: 5.10.30-rc1-g8ac4b1deedaa Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
On 4/12/2021 1:38 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.10.30 release. There are 188 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 14 Apr 2021 08:39:44 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.30-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB, using 32-bit and 64-bit ARM kernels:
Tested-by: Florian Fainelli f.fainelli@gmail.com
We ran tests on this kernel:
commit 8ac4b1deedaa507b5d0f46316e7f32004dd99cd1 (HEAD -> rc/linux-5.10.y, stable_rc/linux-5.10.y) Author: Greg Kroah-Hartman gregkh@linuxfoundation.org Date: Mon Apr 12 10:40:10 2021 +0200
Linux 5.10.30-rc1
32/32 test passed, 0 failed, 0 errors, 0 warnings.
Specific tests run:
(01/32) 32ltp.py:LTP.test_nptl: PASS (6.55 s) (02/32) 32ltp.py:LTP.test_math: PASS (2.05 s) (03/32) 32ltp.py:LTP.test_hugetlb: PASS (0.07 s) (04/32) 32ltp.py:LTP.test_ipc: PASS (20.08 s) (05/32) 32ltp.py:LTP.test_uevent: PASS (0.05 s) (06/32) 32ltp.py:LTP.test_containers: PASS (36.71 s) (07/32) 32ltp.py:LTP.test_filecaps: PASS (0.11 s) (08/32) 32ltp.py:LTP.test_hyperthreading: PASS (71.18 s) (09/32) 32kpatch.sh: PASS (12.59 s) (10/32) 32perf.py:PerfNonPriv.test_perf_help: PASS (0.07 s) (11/32) 32perf.py:PerfNonPriv.test_perf_version: PASS (0.06 s) (12/32) 32perf.py:PerfNonPriv.test_perf_list: PASS (0.39 s) (13/32) 32perf.py:PerfPriv.test_perf_record: PASS (4.71 s) (14/32) 32perf.py:PerfPriv.test_perf_cmd_kallsyms: PASS (0.31 s) (15/32) 32perf.py:PerfPriv.test_perf_cmd_annotate: PASS (6.86 s) (16/32) 32perf.py:PerfPriv.test_perf_cmd_evlist: PASS (0.07 s) (17/32) 32perf.py:PerfPriv.test_perf_cmd_script: PASS (0.49 s) (18/32) 32perf.py:PerfPriv.test_perf_stat: PASS (3.22 s) (19/32) 32perf.py:PerfPriv.test_perf_bench: PASS (0.07 s) (20/32) 32kselftest.py:kselftest.test_sysctl: PASS (0.19 s) (21/32) 32kselftest.py:kselftest.test_size: PASS (0.02 s) (22/32) 32kselftest.py:kselftest.test_sync: PASS (0.46 s) (23/32) 32kselftest.py:kselftest.test_capabilities: PASS (0.03 s) (24/32) 32kselftest.py:kselftest.test_x86: PASS (0.28 s) (25/32) 32kselftest.py:kselftest.test_pidfd: PASS (13.01 s) (26/32) 32kselftest.py:kselftest.test_membarrier: PASS (0.22 s) (27/32) 32kselftest.py:kselftest.test_sigaltstack: PASS (0.03 s) (28/32) 32kselftest.py:kselftest.test_tmpfs: PASS (0.03 s) (29/32) 32kselftest.py:kselftest.test_user: PASS (0.02 s) (30/32) 32kselftest.py:kselftest.test_sched: PASS (0.03 s) (31/32) 32kselftest.py:kselftest.test_timens: PASS (0.09 s) (32/32) 32kselftest.py:kselftest.test_timers: PASS (555.01 s) RESULTS : PASS 32 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
Tested-by: Patrick McCormick pmccormick@digitalocean.com
On Mon, Apr 12, 2021 at 12:08 PM Florian Fainelli f.fainelli@gmail.com wrote:
On 4/12/2021 1:38 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.10.30 release. There are 188 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 14 Apr 2021 08:39:44 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.30-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB, using 32-bit and 64-bit ARM kernels:
Tested-by: Florian Fainelli f.fainelli@gmail.com
Florian
On 4/12/21 2:38 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.10.30 release. There are 188 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 14 Apr 2021 08:39:44 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.30-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
On Mon, Apr 12, 2021 at 10:38:34AM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.10.30 release. There are 188 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 14 Apr 2021 08:39:44 +0000. Anything received after that time might be too late.
Build results: total: 156 pass: 156 fail: 0 Qemu test results: total: 454 pass: 454 fail: 0
Tested-by: Guenter Roeck linux@roeck-us.net
Guenter
On Mon, 12 Apr 2021 at 14:23, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.10.30 release. There are 188 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 14 Apr 2021 08:39:44 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.30-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
## Build * kernel: 5.10.30-rc1 * git: ['https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git', 'https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc'] * git branch: linux-5.10.y * git commit: 8ac4b1deedaa507b5d0f46316e7f32004dd99cd1 * git describe: v5.10.29-189-g8ac4b1deedaa * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.10.y/build/v5.10....
## No regressions (compared to v5.10.28-42-g18f507c37f33)
## No fixes (compared to v5.10.28-42-g18f507c37f33)
## Test result summary total: 75890, pass: 63301, fail: 2174, skip: 10129, xfail: 286,
## Build Summary * arc: 10 total, 10 passed, 0 failed * arm: 192 total, 192 passed, 0 failed * arm64: 26 total, 26 passed, 0 failed * dragonboard-410c: 1 total, 1 passed, 0 failed * hi6220-hikey: 1 total, 1 passed, 0 failed * i386: 26 total, 25 passed, 1 failed * juno-r2: 1 total, 1 passed, 0 failed * mips: 45 total, 45 passed, 0 failed * parisc: 9 total, 9 passed, 0 failed * powerpc: 27 total, 27 passed, 0 failed * riscv: 21 total, 21 passed, 0 failed * s390: 18 total, 18 passed, 0 failed * sh: 18 total, 18 passed, 0 failed * sparc: 9 total, 9 passed, 0 failed * x15: 2 total, 1 passed, 1 failed * x86: 1 total, 1 passed, 0 failed * x86_64: 26 total, 26 passed, 0 failed
## Test suites summary * fwts * igt-gpu-tools * install-android-platform-tools-r2600 * kselftest- * kselftest-android * kselftest-bpf * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-efivarfs * kselftest-filesystems * kselftest-firmware * kselftest-fpu * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-ir * kselftest-kcmp * kselftest-kexec * kselftest-kvm * kselftest-lib * kselftest-livepatch * kselftest-lkdtm * kselftest-membarrier * kselftest-memfd * kselftest-memory-hotplug * kselftest-mincore * kselftest-mount * kselftest-mqueue * kselftest-net * kselftest-netfilter * kselftest-nsfs * kselftest-openat2 * kselftest-pid_namespace * kselftest-pidfd * kselftest-proc * kselftest-pstore * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-seccomp * kselftest-sigaltstack * kselftest-size * kselftest-splice * kselftest-static_keys * kselftest-sync * kselftest-sysctl * kselftest-tc-testing * kselftest-timens * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user * kselftest-vm * kselftest-vsyscall-mode-native- * kselftest-vsyscall-mode-none- * kselftest-x86 * kselftest-zram * kunit * kvm-unit-tests * libhugetlbfs * linux-log-parser * ltp-cap_bounds-tests * ltp-commands-tests * ltp-containers-tests * ltp-controllers-tests * ltp-cpuhotplug-tests * ltp-crypto-tests * ltp-cve-tests * ltp-dio-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-open-posix-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-tracing-tests * network-basic-tests * perf * rcutorture * ssuite * v4l2-compliance
-- Linaro LKFT https://lkft.linaro.org
On 2021/4/12 16:38, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.10.30 release. There are 188 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 14 Apr 2021 08:39:44 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.30-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
thanks,
greg k-h
Tested on arm64 and x86 for 5.10.30-rc1,
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git Branch: linux-5.10.y Version: 5.10.30-rc1 Commit: 8ac4b1deedaa507b5d0f46316e7f32004dd99cd1 Compiler: gcc version 7.3.0 (GCC)
arm64: -------------------------------------------------------------------- Testcase Result Summary: total: 5264 passed: 5264 failed: 0 timeout: 0 --------------------------------------------------------------------
x86: -------------------------------------------------------------------- Testcase Result Summary: total: 5264 passed: 5264 failed: 0 timeout: 0 --------------------------------------------------------------------
Tested-by: Hulk Robot hulkrobot@huawei.com
Hi!
This is the start of the stable review cycle for the 5.10.30 release. There are 188 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
CIP testing did not find any problems here:
https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/tree/linux-5...
Tested-by: Pavel Machek (CIP) pavel@denx.de
Best regards, Pavel
Hi Greg,
On Mon, Apr 12, 2021 at 10:38:34AM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.10.30 release. There are 188 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 14 Apr 2021 08:39:44 +0000. Anything received after that time might be too late.
Build test: (gcc version 10.2.1 20210406) mips: 63 configs -> no new failure arm: 105 configs -> no new failure x86_64: 2 configs -> no failure
Boot test: x86_64: Booted on my test laptop. No regression.
Tested-by: Sudip Mukherjee sudip.mukherjee@codethink.co.uk
-- Regards Sudip
linux-stable-mirror@lists.linaro.org