This is the start of the stable review cycle for the 5.11.11 release. There are 254 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 31 Mar 2021 07:55:56 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.11.11-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.11.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.11.11-rc1
Alexei Starovoitov ast@kernel.org selftest/bpf: Add a test to check trampoline freeing logic.
Marc Kleine-Budde mkl@pengutronix.de can: peak_usb: Revert "can: peak_usb: add forgotten supported devices"
Christoph Hellwig hch@lst.de nvme: fix the nsid value to print in nvme_validate_or_alloc_ns
Roger Pau Monne roger.pau@citrix.com Revert "xen: fix p2m size in dom0 for disabled memory hotplug case"
Sabyrzhan Tasbolatov snovitoll@gmail.com fs/ext4: fix integer overflow in s_log_groups_per_flex
Jan Kara jack@suse.cz ext4: add reclaim checks to xattr code
Markus Theil markus.theil@tu-ilmenau.de mac80211: fix double free in ibss_leave
Florian Fainelli f.fainelli@gmail.com net: dsa: b53: VLAN filtering is global to all users
Heiner Kallweit hkallweit1@gmail.com r8169: fix DMA being used after buffer free if WoL is enabled
Martin Willi martin@strongswan.org can: dev: Move device back to init netns on owning netns delete
Arnd Bergmann arnd@arndb.de ch_ktls: fix enum-conversion warning
Matthew Wilcox (Oracle) willy@infradead.org fs/cachefiles: Remove wait_bit_key layout dependency
Isaku Yamahata isaku.yamahata@intel.com x86/mem_encrypt: Correct physical address calculation in __set_clr_pte_enc()
Thomas Gleixner tglx@linutronix.de locking/mutex: Fix non debug version of mutex_lock_io_nested()
Shyam Prasad N sprasad@microsoft.com cifs: Adjust key sizes and key generation routines for AES256 encryption
Steve French stfrench@microsoft.com smb3: fix cached file size problems in duplicate extents (reflink)
Jia-Ju Bai baijiaju1990@gmail.com scsi: mpt3sas: Fix error return code of mpt3sas_base_attach()
Jia-Ju Bai baijiaju1990@gmail.com scsi: qedi: Fix error return code of qedi_alloc_global_queues()
Bart Van Assche bvanassche@acm.org scsi: Revert "qla2xxx: Make sure that aborted commands are freed"
David Jeffery djeffery@redhat.com block: recalculate segment count for multi-segment discards correctly
Pavel Begunkov asml.silence@gmail.com io_uring: fix provide_buffers sign extension
Ian Rogers irogers@google.com perf synthetic events: Avoid write of uninitialized memory when generating PERF_RECORD_MMAP* records
Adrian Hunter adrian.hunter@intel.com perf auxtrace: Fix auxtrace queue conflict
Andy Shevchenko andriy.shevchenko@linux.intel.com ACPI: scan: Use unique number for instance_no
Rafael J. Wysocki rafael.j.wysocki@intel.com ACPI: scan: Rearrange memory allocation in acpi_device_add()
Mark Tomlinson mark.tomlinson@alliedtelesis.co.nz Revert "netfilter: x_tables: Update remaining dereference to RCU"
Sean Christopherson seanjc@google.com mm/mmu_notifiers: ensure range_end() is paired with range_start()
Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com dm table: Fix zoned model check and zone sectors check
Pavel Tatashin pasha.tatashin@soleen.com arm64: mm: correct the inside linear map range during hotplug check
Anshuman Khandual anshuman.khandual@arm.com arm64/mm: define arch_get_mappable_range()
Hans de Goede hdegoede@redhat.com platform/x86: dell-wmi-sysman: Cleanup create_attributes_level_sysfs_files()
Stanislav Fomichev sdf@google.com bpf: Use NOP_ATOMIC5 instead of emit_nops(&prog, 5) for BPF_TRAMP_F_CALL_ORIG
Alexei Starovoitov ast@kernel.org bpf: Fix fexit trampoline.
Mark Tomlinson mark.tomlinson@alliedtelesis.co.nz netfilter: x_tables: Use correct memory barriers.
Mark Tomlinson mark.tomlinson@alliedtelesis.co.nz Revert "netfilter: x_tables: Switch synchronization to RCU"
Florian Fainelli f.fainelli@gmail.com net: phy: broadcom: Fix RGMII delays for BCM50160 and BCM50610M
Robert Hancock robert.hancock@calian.com net: phy: broadcom: Set proper 1000BaseX/SGMII interface mode for BCM54616S
Florian Fainelli f.fainelli@gmail.com net: phy: broadcom: Avoid forward for bcm54xx_config_clock_delay()
Michael Walle michael@walle.cc net: phy: introduce phydev->port
Robert Hancock robert.hancock@calian.com net: axienet: Fix probe error cleanup
Li RongQing lirongqing@baidu.com igb: avoid premature Rx buffer reuse
Daniel Borkmann daniel@iogearbox.net net, bpf: Fix ip6ip6 crash with collect_md populated skbs
Daniel Borkmann daniel@iogearbox.net net: Consolidate common blackhole dst ops
Sasha Levin sashal@kernel.org bpf: Don't do bpf_cgroup_storage_set() for kuprobe/tp programs
Mike Rapoport rppt@kernel.org mm: memblock: fix section mismatch warning again
Potnuri Bharat Teja bharat@chelsio.com RDMA/cxgb4: Fix adapter LE hash errors while destroying ipv6 listening server
Roger Pau Monne roger.pau@citrix.com xen/x86: make XEN_BALLOON_MEMORY_HOTPLUG_LIMIT depend on MEMORY_HOTPLUG
Colin Ian King colin.king@canonical.com octeontx2-af: Fix memory leak of object buf
Vladimir Oltean vladimir.oltean@nxp.com net: bridge: don't notify switchdev for local FDB addresses
David E. Box david.e.box@linux.intel.com platform/x86: intel_pmt_crashlog: Fix incorrect macros
Lukasz Luba lukasz.luba@arm.com PM: EM: postpone creating the debugfs dir till fs_initcall
Andy Shevchenko andriy.shevchenko@linux.intel.com mfd: intel_quark_i2c_gpio: Revert "Constify static struct resources"
Aya Levin ayal@nvidia.com net/mlx5e: Fix error path for ethtool set-priv-flag
Dima Chumak dchumak@nvidia.com net/mlx5e: Offload tuple rewrite for non-CT flows
Alaa Hleihel alaa@nvidia.com net/mlx5e: Allow to match on MPLS parameters only for MPLS over UDP
Huy Nguyen huyn@nvidia.com net/mlx5: Add back multicast stats for uplink representor
Rafael J. Wysocki rafael.j.wysocki@intel.com PM: runtime: Defer suspending suppliers
Pavel Tatashin pasha.tatashin@soleen.com arm64: kdump: update ppos when reading elfcorehdr
Fabio Estevam festevam@gmail.com drm/msm: Fix suspend/resume on i.MX5
Dmitry Baryshkov dmitry.baryshkov@linaro.org drm/msm: fix shutdown hook in case GPU components failed to bind
Hans de Goede hdegoede@redhat.com platform/x86: dell-wmi-sysman: Make sysman_init() return -ENODEV of the interfaces are not found
Hans de Goede hdegoede@redhat.com platform/x86: dell-wmi-sysman: Cleanup sysman_init() error-exit handling
Hans de Goede hdegoede@redhat.com platform/x86: dell-wmi-sysman: Fix release_attributes_data() getting called twice on init_bios_attributes() failure
Hans de Goede hdegoede@redhat.com platform/x86: dell-wmi-sysman: Make it safe to call exit_foo_attributes() multiple times
Hans de Goede hdegoede@redhat.com platform/x86: dell-wmi-sysman: Fix possible NULL pointer deref on exit
Hans de Goede hdegoede@redhat.com platform/x86: dell-wmi-sysman: Fix crash caused by calling kset_unregister twice
Oliver Hartkopp socketcan@hartkopp.net can: isotp: tx-path: zero initialize outgoing CAN frames
Zqiang qiang.zhang@windriver.com bpf: Fix umd memory leak in copy_process()
Jean-Philippe Brucker jean-philippe@linaro.org libbpf: Fix BTF dump of pointer-to-array-of-struct
Hangbin Liu liuhangbin@gmail.com selftests: forwarding: vxlan_bridge_1d: Fix vxlan ecn decapsulate value
David Brazdil dbrazdil@google.com selinux: vsock: Set SID for socket returned by accept()
Corentin Labbe clabbe@baylibre.com net: stmmac: dwmac-sun8i: Provide TX and RX fifo sizes
Hayes Wang hayeswang@realtek.com r8152: limit the RX buffer size of RTL8153A for USB 2.0
Xin Long lucien.xin@gmail.com sctp: move sk_route_caps check and set into sctp_outq_flush_transports
Jesse Brandeburg jesse.brandeburg@intel.com igb: check timestamp validity
Johan Hovold johan@kernel.org net: cdc-phonet: fix data-interface release on probe failure
Jiri Bohac jbohac@suse.cz net: check all name nodes in __dev_alloc_name
Hariprasad Kelam hkelam@marvell.com octeontx2-af: fix infinite loop in unmapping NPC counter
Geetha sowjanya gakula@marvell.com octeontx2-pf: Clear RSS enable flag on interace down
Geetha sowjanya gakula@marvell.com octeontx2-af: Fix irq free in rvu teardown
Subbaraya Sundeep sbhatta@marvell.com octeontx2-af: Remove TOS field from MKEX TX
Rakesh Babu rsaladi2@marvell.com octeontx2-af: Formatting debugfs entry rsrc_alloc.
Jakub Kicinski kuba@kernel.org ipv6: weaken the v4mapped source check
dillon min dillon.minfei@gmail.com ARM: dts: imx6ull: fix ubi filesystem mount failed
Kumar Kartikeya Dwivedi memxor@gmail.com libbpf: Use SOCK_CLOEXEC when opening the netlink socket
Namhyung Kim namhyung@kernel.org libbpf: Fix error path in bpf_object__elf_init()
Yinjun Zhang yinjun.zhang@corigine.com netfilter: flowtable: Make sure GC works periodically in idle system
Pablo Neira Ayuso pablo@netfilter.org netfilter: nftables: allow to update flowtable flags
Pablo Neira Ayuso pablo@netfilter.org netfilter: nftables: report EOPNOTSUPP on unsupported flowtable flags
wenxu wenxu@ucloud.cn net/sched: cls_flower: fix only mask bit check in the validate_ct_state
Shannon Nelson snelson@pensando.io ionic: linearize tso skb with too many frags
Dmitry Baryshkov dmitry.baryshkov@linaro.org drm/msm/dsi: fix check-before-set in the 7nm dsi_pll code
Alexei Starovoitov ast@kernel.org ftrace: Fix modify_ftrace_direct.
Louis Peens louis.peens@corigine.com nfp: flower: fix pre_tun mask id allocation
Louis Peens louis.peens@corigine.com nfp: flower: add ipv6 bit to pre_tunnel control message
Louis Peens louis.peens@corigine.com nfp: flower: fix unsupported pre_tunnel flows
Carlos Llamas cmllamas@google.com selftests/net: fix warnings on reuseaddr_ports_exhausted
Brian Norris briannorris@chromium.org mac80211: Allow HE operation to be longer than expected.
Johannes Berg johannes.berg@intel.com mac80211: fix rate mask reset
Torin Cooper-Bennun torin@maxiluxsystems.com can: m_can: m_can_rx_peripheral(): fix RX being blocked by errors
Torin Cooper-Bennun torin@maxiluxsystems.com can: m_can: m_can_do_rx_poll(): fix extraneous msg loss warning
Tong Zhang ztong0001@gmail.com can: c_can: move runtime PM enable/disable to c_can_platform
Tong Zhang ztong0001@gmail.com can: c_can_pci: c_can_pci_remove(): fix use-after-free
Jimmy Assarsson extja@kvaser.com can: kvaser_pciefd: Always disable bus load reporting
Angelo Dureghello angelo@kernel-space.org can: flexcan: flexcan_chip_freeze(): fix chip freeze for missing bitrate
Stephane Grosjean s.grosjean@peak-system.com can: peak_usb: add forgotten supported devices
Marc Kleine-Budde mkl@pengutronix.de can: isotp: TX-path: ensure that CAN frame flags are initialized
Marc Kleine-Budde mkl@pengutronix.de can: isotp: isotp_setsockopt(): only allow to set low level TX flags for CAN-FD
Davide Caratti dcaratti@redhat.com mptcp: fix ADD_ADDR HMAC in case port is specified
Alexander Ovechkin ovov@yandex-team.ru tcp: relookup sock for RST+ACK packets handled by obsolete req sock
Eric Dumazet edumazet@google.com tipc: better validate user input in tipc_nl_retrieve_key()
Ong Boon Leong boon.leong.ong@intel.com net: phylink: Fix phylink_err() function name error in phylink_major_config
Xie He xie.he.0141@gmail.com net: hdlc_x25: Prevent racing between "x25_close" and "x25_xmit"/"x25_rx"
Florian Westphal fw@strlen.de netfilter: ctnetlink: fix dump of the expect mask attribute
Hangbin Liu liuhangbin@gmail.com selftests/bpf: Set gopt opt_class to 0 if get tunnel opt failed
Alexander Lobakin alobakin@pm.me flow_dissector: fix byteorder of dissected ICMP ID
Eric Dumazet edumazet@google.com net: qrtr: fix a kernel-infoleak in qrtr_recvmsg()
Alex Elder elder@linaro.org net: ipa: terminate message handler arrays
Douglas Anderson dianders@chromium.org clk: qcom: gcc-sc7180: Use floor ops for the correct sdcc1 clk
Dylan Hung dylan_hung@aspeedtech.com ftgmac100: Restart MAC HW once
Magnus Karlsson magnus.karlsson@intel.com ice: fix napi work done reporting in xsk path
Florian Fainelli f.fainelli@gmail.com net: phy: broadcom: Add power down exit reset state delay
Lv Yunlong lyl2019@mail.ustc.edu.cn net/qlcnic: Fix a use after free in qlcnic_83xx_get_minidump_template
David Gow davidgow@google.com kunit: tool: Disable PAGE_POISONING under --alltests
Dinghao Liu dinghao.liu@zju.edu.cn e1000e: Fix error handling in e1000_set_d0_lplu_state_82571
Vitaly Lifshits vitaly.lifshits@intel.com e1000e: add rtnl_lock() to e1000_reset_task
Andre Guedes andre.guedes@intel.com igc: Fix igc_ptp_rx_pktstamp()
Muhammad Husaini Zulkifli muhammad.husaini.zulkifli@intel.com igc: Fix Supported Pause Frame Link Setting
Muhammad Husaini Zulkifli muhammad.husaini.zulkifli@intel.com igc: Fix Pause Frame Advertising
Sasha Neftin sasha.neftin@intel.com igc: reinit_locked() should be called with rtnl_lock
Florian Fainelli f.fainelli@gmail.com net: dsa: bcm_sf2: Qualify phydev->dev_flags based on port
Eric Dumazet edumazet@google.com net: sched: validate stab values
Eric Dumazet edumazet@google.com macvlan: macvlan_count_rx() needs to be aware of preemption
Ido Schimmel idosch@nvidia.com drop_monitor: Perform cleanup upon probe registration failure
Wei Wang weiwan@google.com ipv6: fix suspecious RCU usage warning
Parav Pandit parav@nvidia.com net/mlx5e: E-switch, Fix rate calculation division
Maor Dickman maord@nvidia.com net/mlx5e: Don't match on Geneve options in case option masks are all zero
Maxim Mikityanskiy maximmi@mellanox.com net/mlx5e: Revert parameters on errors when changing PTP state without reset
Maxim Mikityanskiy maximmi@mellanox.com net/mlx5e: When changing XDP program without reset, take refs for XSK RQs
Aya Levin ayal@nvidia.com net/mlx5e: Set PTP channel pointer explicitly to NULL
Tariq Toukan tariqt@nvidia.com net/mlx5e: RX, Mind the MPWQE gaps when calculating offsets
Georgi Valkov gvalkov@abv.bg libbpf: Fix INSTALL flag order
Tal Lossos tallossos@gmail.com bpf: Change inode_storage's lookup_elem return value from NULL to -EBADF
Alexei Starovoitov ast@kernel.org bpf: Dont allow vmlinux BTF to be used in map_create and prog_load.
Maciej Fijalkowski maciej.fijalkowski@intel.com veth: Store queue_mapping independently of XDP prog presence
Tony Lindgren tony@atomide.com soc: ti: omap-prm: Fix occasional abort on reset deassert for dra7 iva
Tony Lindgren tony@atomide.com ARM: OMAP2+: Fix smartreflex init regression after dropping legacy data
Tony Lindgren tony@atomide.com soc: ti: omap-prm: Fix reboot issue with invalid pcie reset map for dra7
Grygorii Strashko grygorii.strashko@ti.com bus: omap_l3_noc: mark l3 irqs as IRQF_NO_THREAD
Mikulas Patocka mpatocka@redhat.com dm ioctl: fix out of bounds array access when no devices
Mikulas Patocka mpatocka@redhat.com dm: don't report "detected capacity change" on device creation
JeongHyeon Lee jhs2.lee@samsung.com dm verity: fix DM_VERITY_OPTS_MAX value
Imre Deak imre.deak@intel.com drm/i915: Fix the GT fence revocation runtime PM logic
Jani Nikula jani.nikula@intel.com drm/i915/dsc: fix DSS CTL register usage for ICL DSI transcoders
Alex Deucher alexander.deucher@amd.com drm/amdgpu: Add additional Sienna Cichlid PCI ID
Prike Liang Prike.Liang@amd.com drm/amdgpu: fix the hibernation suspend with s0ix
Alex Deucher alexander.deucher@amd.com drm/amdgpu/display: restore AUX_DPHY_TX_CONTROL for DCN2.x
Kenneth Feng kenneth.feng@amd.com drm/amd/pm: workaround for audio noise issue
Daniel Vetter daniel.vetter@ffwll.ch drm/etnaviv: Use FOLL_FORCE for userptr
Lyude Paul lyude@redhat.com drm/nouveau/kms/nve4-nv108: Limit cursors to 128x128
Mimi Zohar zohar@linux.ibm.com integrity: double check iint_cache was initialized
Claudiu Beznea claudiu.beznea@microchip.com ARM: dts: at91-sama5d27_som1: fix phy address to 7
Nicolas Ferre nicolas.ferre@microchip.com ARM: dts: at91: sam9x60: fix mux-mask to match product's datasheet
Federico Pellegrin fede@evolware.org ARM: dts: at91: sam9x60: fix mux-mask for PA7 so it can be set to A, B and C
Horia Geantă horia.geanta@nxp.com arm64: dts: ls1043a: mark crypto engine dma coherent
Horia Geantă horia.geanta@nxp.com arm64: dts: ls1012a: mark crypto engine dma coherent
Horia Geantă horia.geanta@nxp.com arm64: dts: ls1046a: mark crypto engine dma coherent
Mark Rutland mark.rutland@arm.com arm64: stacktrace: don't trace arch_stack_walk()
Vegard Nossum vegard.nossum@oracle.com ACPICA: Always create namespace nodes using acpi_ns_create_node()
Chris Chiu chris.chiu@canonical.com ACPI: video: Add missing callback back for Sony VPCEH3U1E
Ira Weiny ira.weiny@intel.com mm/highmem: fix CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP
Nick Desaulniers ndesaulniers@google.com gcov: fix clang-11+ support
Andrey Konovalov andreyknvl@google.com kasan: fix per-page tags for non-page_alloc pages
Miaohe Lin linmiaohe@huawei.com hugetlb_cgroup: fix imbalanced css_get and css_put pair for shared mappings
Phillip Lougher phillip@squashfs.org.uk squashfs: fix xattr id and id lookup sanity checks
Sean Nyekjaer sean@geanix.com squashfs: fix inode lookup sanity checks
Thomas Hebb tommyhebb@gmail.com z3fold: prevent reclaim/free race for headless pages
Ido Schimmel idosch@nvidia.com psample: Fix user API breakage
Hans de Goede hdegoede@redhat.com platform/x86: intel-vbtn: Stop reporting SW_DOCK events
Mian Yousaf Kaukab ykaukab@suse.de netsec: restore phy power state after controller reset
Ondrej Mosnacek omosnace@redhat.com selinux: fix variable scope issue in live sidtab conversion
Ondrej Mosnacek omosnace@redhat.com selinux: don't log MAC_POLICY_LOAD record on failed policy load
Filipe Manana fdmanana@suse.com btrfs: fix subvolume/snapshot deletion not triggered on mount
Filipe Manana fdmanana@suse.com btrfs: fix sleep while in non-sleep context during qgroup removal
Josef Bacik josef@toxicpanda.com btrfs: initialize device::fs_info always
Omar Sandoval osandov@fb.com btrfs: fix check_data_csum() error message for direct I/O
Josef Bacik josef@toxicpanda.com btrfs: do not initialize dev replace for bad dev root
Josef Bacik josef@toxicpanda.com btrfs: do not initialize dev stats if we have no dev_root
Sean Christopherson seanjc@google.com KVM: x86: Protect userspace MSR filter with SRCU, and set atomically-ish
Peter Zijlstra peterz@infradead.org static_call: Fix static_call_set_init()
Peter Zijlstra peterz@infradead.org static_call: Fix the module key fixup
Josh Poimboeuf jpoimboe@redhat.com static_call: Allow module use without exposing static_call_key
Peter Zijlstra peterz@infradead.org static_call: Pull some static_call declarations to the type headers
Sergei Trofimovich slyfox@gentoo.org ia64: fix ptrace(PTRACE_SYSCALL_INFO_EXIT) sign
Sergei Trofimovich slyfox@gentoo.org ia64: fix ia64_syscall_get_set_arguments() for break-based syscalls
Fenghua Yu fenghua.yu@intel.com mm/fork: clear PASID for new mm
Pavel Begunkov asml.silence@gmail.com io_uring: cancel deferred requests in try_cancel
Daniel Wagner dwagner@suse.de block: Suppress uevent for hidden device when removed
J. Bruce Fields bfields@redhat.com nfs: we don't support removing system.nfs4_acl
Dmitry Monakhov dmtrmonakhov@yandex-team.ru nvme-pci: add the DISABLE_WRITE_ZEROES quirk for a Samsung PM1725a
Lv Yunlong lyl2019@mail.ustc.edu.cn nvme-rdma: Fix a use after free in nvmet_rdma_write_data_done
Chaitanya Kulkarni chaitanya.kulkarni@wdc.com nvme-core: check ctrl css before setting up zns
Hannes Reinecke hare@suse.de nvme-fc: return NVME_SC_HOST_ABORTED_CMD when a command has been aborted
Hannes Reinecke hare@suse.de nvme-fc: set NVME_REQ_CANCELLED in nvme_fc_terminate_exchange()
Hannes Reinecke hare@suse.de nvme: add NVME_REQ_CANCELLED flag in nvme_cancel_request()
Hannes Reinecke hare@suse.de nvme: simplify error logic in nvme_validate_ns()
Christian König christian.koenig@amd.com drm/radeon: fix AGP dependency
Nirmoy Das nirmoy.das@amd.com drm/amdgpu: fb BO should be ttm_bo_type_device
Zhan Liu zhan.liu@amd.com drm/amdgpu/display: Use wm_table.entries for dcn301 calculate_wm
Dillon Varone dillon.varone@amd.com drm/amd/display: Enabled pipe harvesting in dcn30
Sung Lee sung.lee@amd.com drm/amd/display: Revert dram_clock_change_latency for DCN2.1
Qingqing Zhuo qingqing.zhuo@amd.com drm/amd/display: Enable pflip interrupt upon pipe enable
Damien Le Moal damien.lemoal@wdc.com block: Fix REQ_OP_ZONE_RESET_ALL handling
satya priya skakit@codeaurora.org regulator: qcom-rpmh: Use correct buck for S1C regulator
satya priya skakit@codeaurora.org regulator: qcom-rpmh: Correct the pmic5_hfsmps515 buck
Mark Brown broonie@kernel.org kselftest: arm64: Fix exit code of sve-ptrace
Peter Zijlstra peterz@infradead.org u64_stats,lockdep: Fix u64_stats_init() vs lockdep
Julian Braha julianbraha@gmail.com staging: rtl8192e: fix kconfig dependency on CRYPTO
Tomer Tayar ttayar@habana.ai habanalabs: Disable file operations after device is removed
Tomer Tayar ttayar@habana.ai habanalabs: Call put_pid() when releasing control device
Rob Gardner rob.gardner@oracle.com sparc64: Fix opcode filtering in handling of no fault loads
Wei Yongjun weiyongjun1@huawei.com umem: fix error return code in mm_pci_probe()
Jiri Slaby jirislaby@kernel.org kbuild: dummy-tools: fix inverted tests for gcc
Masahiro Yamada masahiroy@kernel.org kbuild: add image_name to no-sync-config-targets
Paul Cercueil paul@crapouillou.net irqchip/ingenic: Add support for the JZ4760
Paulo Alcantara pc@cjr.nz cifs: change noisy error message to FYI
Tong Zhang ztong0001@gmail.com atm: idt77252: fix null-ptr-dereference
Tong Zhang ztong0001@gmail.com atm: uPD98402: fix incorrect allocation
Alex Marginean alexandru.marginean@nxp.com net: enetc: set MAC RX FIFO to recommended value
Paul Cercueil paul@crapouillou.net net: davicom: Use platform_get_irq_optional()
Jia-Ju Bai baijiaju1990@gmail.com net: wan: fix error return code of uhdlc_init()
Jia-Ju Bai baijiaju1990@gmail.com net: hisilicon: hns: fix error return code of hns_nic_clear_all_rx_fetch()
Frank Sorenson sorenson@redhat.com NFS: Correct size calculation for create reply length
Timo Rothenpieler timo@rothenpieler.org nfs: fix PNFS_FLEXFILE_LAYOUT Kconfig default
Yang Li yang.lee@linux.alibaba.com gpiolib: acpi: Add missing IRQF_ONESHOT
Sudeep Holla sudeep.holla@arm.com cpufreq: blacklist Arm Vexpress platforms in cpufreq-dt-platdev
Bob Peterson rpeterso@redhat.com gfs2: fix use-after-free in trans_drain
Aurelien Aptel aaptel@suse.com cifs: ask for more credit on async read/write code paths
Michael Braun michael-dev@fami-braun.de gianfar: fix jumbo packets+napi+rx overrun crash
Denis Efremov efremov@linux.com sun/niu: fix wrong RXMAC_BC_FRM_CNT_COUNT count
Jia-Ju Bai baijiaju1990@gmail.com net: intel: iavf: fix error return code of iavf_init_get_resources()
Jia-Ju Bai baijiaju1990@gmail.com net: tehuti: fix error return code in bdx_probe()
Xunlei Pang xlpang@linux.alibaba.com blk-cgroup: Fix the recursive blkg rwstat
Nitin Rawat nitirawa@codeaurora.org scsi: ufs: ufs-qcom: Disable interrupt in reset path
Dinghao Liu dinghao.liu@zju.edu.cn ixgbe: Fix memleak in ixgbe_configure_clsu32
Mark Pearson markpearson@lenovo.com ALSA: hda: ignore invalid NHLT table
Hayes Wang hayeswang@realtek.com Revert "r8152: adjust the settings about MAC clock speed down for RTL8153"
Tong Zhang ztong0001@gmail.com atm: lanai: dont run lanai_dev_close if not open
Tong Zhang ztong0001@gmail.com atm: eni: dont release is never initialized
Michael Ellerman mpe@ellerman.id.au powerpc/4xx: Fix build errors from mfdcr()
Heiko Thiery heiko.thiery@gmail.com net: fec: ptp: avoid register access when ipg clock is disabled
Joakim Zhang qiangqing.zhang@nxp.com net: stmmac: fix dma physical address of descriptor when display ring
Felix Fietkau nbd@nbd.name mt76: mt7915: only modify tx buffer list after allocating tx token id
Felix Fietkau nbd@nbd.name mt76: fix tx skb error handling in mt76_dma_tx_queue_skb
-------------
Diffstat:
Documentation/virt/kvm/api.rst | 6 +- Makefile | 7 +- arch/arm/boot/dts/at91-sam9x60ek.dts | 8 - arch/arm/boot/dts/at91-sama5d27_som1.dtsi | 4 +- arch/arm/boot/dts/imx6ull-myir-mys-6ulx-eval.dts | 1 + arch/arm/boot/dts/sam9x60.dtsi | 9 + arch/arm/mach-omap2/sr_device.c | 75 +++++-- arch/arm64/boot/dts/freescale/fsl-ls1012a.dtsi | 1 + arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi | 1 + arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi | 1 + arch/arm64/kernel/crash_dump.c | 2 + arch/arm64/kernel/stacktrace.c | 9 +- arch/arm64/mm/mmu.c | 32 ++- arch/ia64/include/asm/syscall.h | 2 +- arch/ia64/kernel/ptrace.c | 24 ++- arch/powerpc/include/asm/dcr-native.h | 8 +- arch/sparc/kernel/traps_64.c | 13 +- arch/x86/include/asm/kvm_host.h | 14 +- arch/x86/include/asm/static_call.h | 7 + arch/x86/include/asm/xen/page.h | 12 -- arch/x86/kvm/x86.c | 109 ++++++----- arch/x86/mm/mem_encrypt.c | 2 +- arch/x86/net/bpf_jit_comp.c | 27 ++- arch/x86/xen/p2m.c | 7 +- arch/x86/xen/setup.c | 16 +- block/blk-cgroup-rwstat.c | 3 +- block/blk-merge.c | 8 + block/blk-zoned.c | 2 +- block/genhd.c | 4 +- drivers/acpi/acpica/nsaccess.c | 3 +- drivers/acpi/internal.h | 6 +- drivers/acpi/scan.c | 88 +++++---- drivers/acpi/video_detect.c | 1 + drivers/atm/eni.c | 3 +- drivers/atm/idt77105.c | 4 +- drivers/atm/lanai.c | 5 +- drivers/atm/uPD98402.c | 2 +- drivers/base/power/runtime.c | 45 ++++- drivers/block/umem.c | 5 +- drivers/bus/omap_l3_noc.c | 4 +- drivers/clk/qcom/gcc-sc7180.c | 4 +- drivers/cpufreq/cpufreq-dt-platdev.c | 2 + drivers/gpio/gpiolib-acpi.c | 2 +- drivers/gpu/drm/Kconfig | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 +- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c | 2 +- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 + drivers/gpu/drm/amd/display/dc/dc.h | 1 + drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c | 11 ++ drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.h | 6 + .../drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c | 7 + drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c | 1 + drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 6 + .../drm/amd/display/dc/dcn20/dcn20_link_encoder.c | 3 +- drivers/gpu/drm/amd/display/dc/dcn21/dcn21_hubp.c | 1 + .../gpu/drm/amd/display/dc/dcn21/dcn21_resource.c | 2 +- drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hubp.c | 1 + .../gpu/drm/amd/display/dc/dcn30/dcn30_resource.c | 31 +++ .../drm/amd/display/dc/dcn301/dcn301_resource.c | 96 ++++++++- drivers/gpu/drm/amd/display/dc/inc/hw/hubp.h | 2 + .../gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c | 54 +++++ .../gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c | 74 +++++-- .../gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c | 24 +++ .../gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c | 25 +++ drivers/gpu/drm/etnaviv/etnaviv_gem.c | 2 +- drivers/gpu/drm/i915/display/intel_vdsc.c | 10 +- drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c | 13 +- drivers/gpu/drm/i915/intel_runtime_pm.c | 29 ++- drivers/gpu/drm/i915/intel_runtime_pm.h | 5 + drivers/gpu/drm/msm/dsi/pll/dsi_pll.c | 2 +- drivers/gpu/drm/msm/dsi/pll/dsi_pll.h | 6 +- drivers/gpu/drm/msm/dsi/pll/dsi_pll_7nm.c | 5 +- drivers/gpu/drm/msm/msm_drv.c | 12 ++ drivers/gpu/drm/nouveau/dispnv50/disp.c | 13 +- drivers/infiniband/hw/cxgb4/cm.c | 4 +- drivers/irqchip/irq-ingenic-tcu.c | 1 + drivers/irqchip/irq-ingenic.c | 1 + drivers/md/dm-ioctl.c | 2 +- drivers/md/dm-table.c | 33 +++- drivers/md/dm-verity-target.c | 2 +- drivers/md/dm-zoned-target.c | 2 +- drivers/md/dm.c | 5 +- drivers/mfd/intel_quark_i2c_gpio.c | 6 +- drivers/misc/habanalabs/common/device.c | 40 +++- drivers/misc/habanalabs/common/habanalabs_ioctl.c | 12 ++ drivers/net/can/c_can/c_can.c | 24 +-- drivers/net/can/c_can/c_can_pci.c | 3 +- drivers/net/can/c_can/c_can_platform.c | 6 +- drivers/net/can/dev.c | 1 + drivers/net/can/flexcan.c | 8 +- drivers/net/can/kvaser_pciefd.c | 4 + drivers/net/can/m_can/m_can.c | 5 +- drivers/net/dsa/b53/b53_common.c | 14 +- drivers/net/dsa/bcm_sf2.c | 6 +- .../chelsio/inline_crypto/ch_ktls/chcr_ktls.c | 2 +- drivers/net/ethernet/davicom/dm9000.c | 2 +- drivers/net/ethernet/faraday/ftgmac100.c | 1 + drivers/net/ethernet/freescale/enetc/enetc_hw.h | 2 + drivers/net/ethernet/freescale/enetc/enetc_pf.c | 6 + drivers/net/ethernet/freescale/fec_ptp.c | 7 + drivers/net/ethernet/freescale/gianfar.c | 15 ++ drivers/net/ethernet/hisilicon/hns/hns_enet.c | 4 +- drivers/net/ethernet/intel/e1000e/82571.c | 2 + drivers/net/ethernet/intel/e1000e/netdev.c | 6 +- drivers/net/ethernet/intel/iavf/iavf_main.c | 3 +- drivers/net/ethernet/intel/ice/ice_base.c | 6 +- drivers/net/ethernet/intel/ice/ice_xsk.c | 10 +- drivers/net/ethernet/intel/igb/igb.h | 4 +- drivers/net/ethernet/intel/igb/igb_main.c | 33 ++-- drivers/net/ethernet/intel/igb/igb_ptp.c | 31 ++- drivers/net/ethernet/intel/igc/igc.h | 2 +- drivers/net/ethernet/intel/igc/igc_ethtool.c | 7 +- drivers/net/ethernet/intel/igc/igc_main.c | 9 + drivers/net/ethernet/intel/igc/igc_ptp.c | 72 ++++--- drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 6 +- .../ethernet/marvell/octeontx2/af/npc_profile.h | 2 - drivers/net/ethernet/marvell/octeontx2/af/rvu.c | 6 +- .../ethernet/marvell/octeontx2/af/rvu_debugfs.c | 48 +++-- .../net/ethernet/marvell/octeontx2/af/rvu_npc.c | 2 +- .../net/ethernet/marvell/octeontx2/nic/otx2_pf.c | 5 + drivers/net/ethernet/mellanox/mlx5/core/en.h | 7 +- drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c | 3 +- .../ethernet/mellanox/mlx5/core/en/tc_tun_geneve.c | 4 + .../net/ethernet/mellanox/mlx5/core/en_ethtool.c | 11 +- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 21 +- drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 4 +- drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 57 ++++-- .../net/ethernet/netronome/nfp/flower/metadata.c | 24 ++- .../net/ethernet/netronome/nfp/flower/offload.c | 18 ++ .../ethernet/netronome/nfp/flower/tunnel_conf.c | 15 +- drivers/net/ethernet/pensando/ionic/ionic_txrx.c | 13 +- .../net/ethernet/qlogic/qlcnic/qlcnic_minidump.c | 3 + drivers/net/ethernet/realtek/r8169_main.c | 6 +- drivers/net/ethernet/socionext/netsec.c | 9 +- drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c | 2 + drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c | 50 ++++- drivers/net/ethernet/stmicro/stmmac/enh_desc.c | 9 +- drivers/net/ethernet/stmicro/stmmac/hwif.h | 3 +- drivers/net/ethernet/stmicro/stmmac/norm_desc.c | 9 +- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 57 ++++-- drivers/net/ethernet/sun/niu.c | 2 - drivers/net/ethernet/tehuti/tehuti.c | 1 + drivers/net/ethernet/xilinx/xilinx_axienet_main.c | 35 ++-- drivers/net/ipa/ipa_qmi.c | 2 + drivers/net/phy/broadcom.c | 147 ++++++++++---- drivers/net/phy/dp83822.c | 3 + drivers/net/phy/dp83869.c | 4 + drivers/net/phy/lxt.c | 1 + drivers/net/phy/marvell.c | 1 + drivers/net/phy/marvell10g.c | 2 + drivers/net/phy/micrel.c | 14 +- drivers/net/phy/phy.c | 2 +- drivers/net/phy/phy_device.c | 9 + drivers/net/phy/phylink.c | 2 +- drivers/net/usb/cdc-phonet.c | 2 + drivers/net/usb/r8152.c | 40 +--- drivers/net/veth.c | 3 +- drivers/net/wan/fsl_ucc_hdlc.c | 8 +- drivers/net/wan/hdlc_x25.c | 42 +++- drivers/net/wireless/mediatek/mt76/dma.c | 15 +- drivers/net/wireless/mediatek/mt76/mt7915/mac.c | 10 +- drivers/nvme/host/core.c | 15 +- drivers/nvme/host/fc.c | 3 +- drivers/nvme/host/pci.c | 1 + drivers/nvme/target/rdma.c | 5 +- .../platform/x86/dell-wmi-sysman/enum-attributes.c | 3 + .../platform/x86/dell-wmi-sysman/int-attributes.c | 3 + .../x86/dell-wmi-sysman/passobj-attributes.c | 3 + .../x86/dell-wmi-sysman/string-attributes.c | 3 + drivers/platform/x86/dell-wmi-sysman/sysman.c | 84 +++----- drivers/platform/x86/intel-vbtn.c | 12 +- drivers/platform/x86/intel_pmt_crashlog.c | 13 +- drivers/regulator/qcom-rpmh-regulator.c | 6 +- drivers/scsi/mpt3sas/mpt3sas_base.c | 8 +- drivers/scsi/qedi/qedi_main.c | 1 + drivers/scsi/qla2xxx/qla_target.c | 13 +- drivers/scsi/qla2xxx/tcm_qla2xxx.c | 4 - drivers/scsi/ufs/ufs-qcom.c | 10 + drivers/soc/ti/omap_prm.c | 8 +- drivers/staging/rtl8192e/Kconfig | 1 + drivers/xen/Kconfig | 4 +- fs/btrfs/dev-replace.c | 3 + fs/btrfs/disk-io.c | 19 +- fs/btrfs/inode.c | 14 +- fs/btrfs/qgroup.c | 12 +- fs/btrfs/volumes.c | 3 + fs/cachefiles/rdwr.c | 7 +- fs/cifs/cifsglob.h | 4 +- fs/cifs/cifspdu.h | 5 + fs/cifs/smb2glob.h | 1 + fs/cifs/smb2ops.c | 27 ++- fs/cifs/smb2pdu.c | 6 +- fs/cifs/smb2transport.c | 37 +++- fs/cifs/transport.c | 2 +- fs/ext4/mballoc.c | 11 +- fs/ext4/xattr.c | 4 + fs/gfs2/log.c | 4 + fs/gfs2/trans.c | 2 + fs/io_uring.c | 14 +- fs/nfs/Kconfig | 2 +- fs/nfs/nfs3xdr.c | 3 +- fs/nfs/nfs4proc.c | 3 + fs/squashfs/export.c | 8 +- fs/squashfs/id.c | 6 +- fs/squashfs/squashfs_fs.h | 1 + fs/squashfs/xattr_id.c | 6 +- include/acpi/acpi_bus.h | 1 + include/asm-generic/vmlinux.lds.h | 5 +- include/linux/bpf.h | 33 +++- include/linux/brcmphy.h | 4 + include/linux/device-mapper.h | 15 +- include/linux/hugetlb_cgroup.h | 15 +- include/linux/if_macvlan.h | 3 +- include/linux/memblock.h | 4 +- include/linux/mm.h | 18 +- include/linux/mm_types.h | 1 + include/linux/mmu_notifier.h | 10 +- include/linux/mutex.h | 2 +- include/linux/netfilter/x_tables.h | 7 +- include/linux/pagemap.h | 1 - include/linux/phy.h | 2 + include/linux/static_call.h | 43 ++-- include/linux/static_call_types.h | 50 +++++ include/linux/u64_stats_sync.h | 7 +- include/linux/usermode_driver.h | 1 + include/net/dst.h | 11 ++ include/net/inet_connection_sock.h | 2 +- include/net/netfilter/nf_tables.h | 3 + include/net/nexthop.h | 24 +++ include/net/red.h | 10 +- include/net/rtnetlink.h | 2 + include/uapi/linux/psample.h | 5 +- kernel/bpf/bpf_inode_storage.c | 2 +- kernel/bpf/bpf_struct_ops.c | 2 +- kernel/bpf/core.c | 4 +- kernel/bpf/preload/bpf_preload_kern.c | 19 +- kernel/bpf/syscall.c | 5 + kernel/bpf/trampoline.c | 218 ++++++++++++++++----- kernel/bpf/verifier.c | 4 + kernel/fork.c | 8 + kernel/gcov/clang.c | 69 +++++++ kernel/power/energy_model.c | 2 +- kernel/static_call.c | 71 ++++++- kernel/trace/ftrace.c | 43 +++- kernel/usermode_driver.c | 21 +- mm/highmem.c | 4 +- mm/hugetlb.c | 41 +++- mm/hugetlb_cgroup.c | 10 +- mm/mmu_notifier.c | 23 +++ mm/z3fold.c | 16 +- net/bridge/br_switchdev.c | 2 + net/can/isotp.c | 18 +- net/core/dev.c | 14 +- net/core/drop_monitor.c | 23 +++ net/core/dst.c | 59 ++++-- net/core/flow_dissector.c | 2 +- net/dccp/ipv6.c | 5 + net/ipv4/inet_connection_sock.c | 7 +- net/ipv4/netfilter/arp_tables.c | 16 +- net/ipv4/netfilter/ip_tables.c | 16 +- net/ipv4/route.c | 45 +---- net/ipv4/tcp_minisocks.c | 7 +- net/ipv6/ip6_fib.c | 2 +- net/ipv6/ip6_input.c | 10 - net/ipv6/netfilter/ip6_tables.c | 16 +- net/ipv6/route.c | 36 +--- net/ipv6/tcp_ipv6.c | 5 + net/mac80211/cfg.c | 4 +- net/mac80211/ibss.c | 2 + net/mac80211/mlme.c | 2 +- net/mac80211/util.c | 2 +- net/mptcp/options.c | 24 ++- net/mptcp/subflow.c | 5 + net/netfilter/nf_conntrack_netlink.c | 1 + net/netfilter/nf_flow_table_core.c | 2 +- net/netfilter/nf_tables_api.c | 19 +- net/netfilter/x_tables.c | 49 +++-- net/qrtr/qrtr.c | 5 + net/sched/cls_flower.c | 2 +- net/sched/sch_choke.c | 7 +- net/sched/sch_gred.c | 2 +- net/sched/sch_red.c | 7 +- net/sched/sch_sfq.c | 2 +- net/sctp/output.c | 7 - net/sctp/outqueue.c | 7 + net/tipc/node.c | 11 +- net/vmw_vsock/af_vsock.c | 1 + scripts/dummy-tools/gcc | 5 + security/integrity/iint.c | 8 + security/selinux/include/security.h | 15 +- security/selinux/selinuxfs.c | 13 +- security/selinux/ss/services.c | 63 +++--- sound/hda/intel-nhlt.c | 5 + tools/include/linux/static_call_types.h | 50 +++++ tools/lib/bpf/Makefile | 2 +- tools/lib/bpf/btf_dump.c | 2 +- tools/lib/bpf/libbpf.c | 3 +- tools/lib/bpf/netlink.c | 2 +- tools/objtool/check.c | 17 +- tools/perf/util/auxtrace.c | 4 - tools/perf/util/synthetic-events.c | 9 +- tools/testing/kunit/configs/broken_on_uml.config | 2 + tools/testing/selftests/arm64/fp/sve-ptrace.c | 2 +- .../testing/selftests/bpf/prog_tests/fexit_sleep.c | 82 ++++++++ tools/testing/selftests/bpf/progs/fexit_sleep.c | 31 +++ .../testing/selftests/bpf/progs/test_tunnel_kern.c | 6 +- .../selftests/net/forwarding/vxlan_bridge_1d.sh | 2 +- .../selftests/net/reuseaddr_ports_exhausted.c | 32 +-- 309 files changed, 3116 insertions(+), 1113 deletions(-)
From: Felix Fietkau nbd@nbd.name
[ Upstream commit ae064fc0e32a4d28389086d9f4b260a0c157cfee ]
When running out of room in the tx queue after calling drv->tx_prepare_skb, the buffer list will already have been modified on MT7615 and newer drivers. This can leak a DMA mapping and will show up as swiotlb allocation failures on x86.
Fix this by moving the queue length check further up. This is less accurate, since it can overestimate the needed room in the queue on MT7615 and newer, but the difference is small enough to not matter in practice.
Signed-off-by: Felix Fietkau nbd@nbd.name Signed-off-by: Kalle Valo kvalo@codeaurora.org Link: https://lore.kernel.org/r/20210216135119.23809-1-nbd@nbd.name Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/mediatek/mt76/dma.c | 15 ++++++--------- 1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/drivers/net/wireless/mediatek/mt76/dma.c b/drivers/net/wireless/mediatek/mt76/dma.c index 9bf13994c036..680c899a96d7 100644 --- a/drivers/net/wireless/mediatek/mt76/dma.c +++ b/drivers/net/wireless/mediatek/mt76/dma.c @@ -345,7 +345,6 @@ mt76_dma_tx_queue_skb(struct mt76_dev *dev, struct mt76_queue *q, }; struct ieee80211_hw *hw; int len, n = 0, ret = -ENOMEM; - struct mt76_queue_entry e; struct mt76_txwi_cache *t; struct sk_buff *iter; dma_addr_t addr; @@ -387,6 +386,11 @@ mt76_dma_tx_queue_skb(struct mt76_dev *dev, struct mt76_queue *q, } tx_info.nbuf = n;
+ if (q->queued + (tx_info.nbuf + 1) / 2 >= q->ndesc - 1) { + ret = -ENOMEM; + goto unmap; + } + dma_sync_single_for_cpu(dev->dev, t->dma_addr, dev->drv->txwi_size, DMA_TO_DEVICE); ret = dev->drv->tx_prepare_skb(dev, txwi, q->qid, wcid, sta, &tx_info); @@ -395,11 +399,6 @@ mt76_dma_tx_queue_skb(struct mt76_dev *dev, struct mt76_queue *q, if (ret < 0) goto unmap;
- if (q->queued + (tx_info.nbuf + 1) / 2 >= q->ndesc - 1) { - ret = -ENOMEM; - goto unmap; - } - return mt76_dma_add_buf(dev, q, tx_info.buf, tx_info.nbuf, tx_info.info, tx_info.skb, t);
@@ -415,9 +414,7 @@ mt76_dma_tx_queue_skb(struct mt76_dev *dev, struct mt76_queue *q, dev->test.tx_done--; #endif
- e.skb = tx_info.skb; - e.txwi = t; - dev->drv->tx_complete_skb(dev, &e); + dev_kfree_skb(tx_info.skb); mt76_put_txwi(dev, t); return ret; }
From: Felix Fietkau nbd@nbd.name
[ Upstream commit 94f0e6256c2ab6803c935634aa1f653174c94879 ]
Modifying the tx buffer list too early can leak DMA mappings
Signed-off-by: Felix Fietkau nbd@nbd.name Signed-off-by: Kalle Valo kvalo@codeaurora.org Link: https://lore.kernel.org/r/20210216135119.23809-2-nbd@nbd.name Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/mediatek/mt76/mt7915/mac.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/net/wireless/mediatek/mt76/mt7915/mac.c b/drivers/net/wireless/mediatek/mt76/mt7915/mac.c index 1b4d65310b88..c9dd6867e125 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7915/mac.c +++ b/drivers/net/wireless/mediatek/mt76/mt7915/mac.c @@ -957,11 +957,6 @@ int mt7915_tx_prepare_skb(struct mt76_dev *mdev, void *txwi_ptr, } txp->nbuf = nbuf;
- /* pass partial skb header to fw */ - tx_info->buf[1].len = MT_CT_PARSE_LEN; - tx_info->buf[1].skip_unmap = true; - tx_info->nbuf = MT_CT_DMA_BUF_NUM; - txp->flags = cpu_to_le16(MT_CT_INFO_APPLY_TXD | MT_CT_INFO_FROM_HOST);
if (!key) @@ -999,6 +994,11 @@ int mt7915_tx_prepare_skb(struct mt76_dev *mdev, void *txwi_ptr, txp->rept_wds_wcid = cpu_to_le16(0x3ff); tx_info->skb = DMA_DUMMY_DATA;
+ /* pass partial skb header to fw */ + tx_info->buf[1].len = MT_CT_PARSE_LEN; + tx_info->buf[1].skip_unmap = true; + tx_info->nbuf = MT_CT_DMA_BUF_NUM; + return 0; }
From: Joakim Zhang qiangqing.zhang@nxp.com
[ Upstream commit bfaf91ca848e758ed7be99b61fd936d03819fa56 ]
Driver uses dma_alloc_coherent to allocate dma memory for descriptors, dma_alloc_coherent will return both the virtual address and physical address. AFAIK, virt_to_phys could not convert virtual address to physical address, for which memory is allocated by dma_alloc_coherent.
dwmac4_display_ring() function is broken for various descriptor, it only support normal descriptor(struct dma_desc) now, this patch also extends to support all descriptor types.
Signed-off-by: Joakim Zhang qiangqing.zhang@nxp.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/stmicro/stmmac/dwmac4_descs.c | 50 +++++++++++++--- .../net/ethernet/stmicro/stmmac/enh_desc.c | 9 ++- drivers/net/ethernet/stmicro/stmmac/hwif.h | 3 +- .../net/ethernet/stmicro/stmmac/norm_desc.c | 9 ++- .../net/ethernet/stmicro/stmmac/stmmac_main.c | 57 ++++++++++++------- 5 files changed, 94 insertions(+), 34 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c index 2ecd3a8a690c..cbf4429fb1d2 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c @@ -402,19 +402,53 @@ static void dwmac4_rd_set_tx_ic(struct dma_desc *p) p->des2 |= cpu_to_le32(TDES2_INTERRUPT_ON_COMPLETION); }
-static void dwmac4_display_ring(void *head, unsigned int size, bool rx) +static void dwmac4_display_ring(void *head, unsigned int size, bool rx, + dma_addr_t dma_rx_phy, unsigned int desc_size) { - struct dma_desc *p = (struct dma_desc *)head; + dma_addr_t dma_addr; int i;
pr_info("%s descriptor ring:\n", rx ? "RX" : "TX");
- for (i = 0; i < size; i++) { - pr_info("%03d [0x%x]: 0x%x 0x%x 0x%x 0x%x\n", - i, (unsigned int)virt_to_phys(p), - le32_to_cpu(p->des0), le32_to_cpu(p->des1), - le32_to_cpu(p->des2), le32_to_cpu(p->des3)); - p++; + if (desc_size == sizeof(struct dma_desc)) { + struct dma_desc *p = (struct dma_desc *)head; + + for (i = 0; i < size; i++) { + dma_addr = dma_rx_phy + i * sizeof(*p); + pr_info("%03d [%pad]: 0x%x 0x%x 0x%x 0x%x\n", + i, &dma_addr, + le32_to_cpu(p->des0), le32_to_cpu(p->des1), + le32_to_cpu(p->des2), le32_to_cpu(p->des3)); + p++; + } + } else if (desc_size == sizeof(struct dma_extended_desc)) { + struct dma_extended_desc *extp = (struct dma_extended_desc *)head; + + for (i = 0; i < size; i++) { + dma_addr = dma_rx_phy + i * sizeof(*extp); + pr_info("%03d [%pad]: 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x\n", + i, &dma_addr, + le32_to_cpu(extp->basic.des0), le32_to_cpu(extp->basic.des1), + le32_to_cpu(extp->basic.des2), le32_to_cpu(extp->basic.des3), + le32_to_cpu(extp->des4), le32_to_cpu(extp->des5), + le32_to_cpu(extp->des6), le32_to_cpu(extp->des7)); + extp++; + } + } else if (desc_size == sizeof(struct dma_edesc)) { + struct dma_edesc *ep = (struct dma_edesc *)head; + + for (i = 0; i < size; i++) { + dma_addr = dma_rx_phy + i * sizeof(*ep); + pr_info("%03d [%pad]: 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x\n", + i, &dma_addr, + le32_to_cpu(ep->des4), le32_to_cpu(ep->des5), + le32_to_cpu(ep->des6), le32_to_cpu(ep->des7), + le32_to_cpu(ep->basic.des0), le32_to_cpu(ep->basic.des1), + le32_to_cpu(ep->basic.des2), le32_to_cpu(ep->basic.des3)); + ep++; + } + } else { + pr_err("unsupported descriptor!"); } }
diff --git a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c index d02cec296f51..6650edfab5bc 100644 --- a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c +++ b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c @@ -417,19 +417,22 @@ static int enh_desc_get_rx_timestamp_status(void *desc, void *next_desc, } }
-static void enh_desc_display_ring(void *head, unsigned int size, bool rx) +static void enh_desc_display_ring(void *head, unsigned int size, bool rx, + dma_addr_t dma_rx_phy, unsigned int desc_size) { struct dma_extended_desc *ep = (struct dma_extended_desc *)head; + dma_addr_t dma_addr; int i;
pr_info("Extended %s descriptor ring:\n", rx ? "RX" : "TX");
for (i = 0; i < size; i++) { u64 x; + dma_addr = dma_rx_phy + i * sizeof(*ep);
x = *(u64 *)ep; - pr_info("%03d [0x%x]: 0x%x 0x%x 0x%x 0x%x\n", - i, (unsigned int)virt_to_phys(ep), + pr_info("%03d [%pad]: 0x%x 0x%x 0x%x 0x%x\n", + i, &dma_addr, (unsigned int)x, (unsigned int)(x >> 32), ep->basic.des2, ep->basic.des3); ep++; diff --git a/drivers/net/ethernet/stmicro/stmmac/hwif.h b/drivers/net/ethernet/stmicro/stmmac/hwif.h index 15d7b8261189..979ac9fca23c 100644 --- a/drivers/net/ethernet/stmicro/stmmac/hwif.h +++ b/drivers/net/ethernet/stmicro/stmmac/hwif.h @@ -78,7 +78,8 @@ struct stmmac_desc_ops { /* get rx timestamp status */ int (*get_rx_timestamp_status)(void *desc, void *next_desc, u32 ats); /* Display ring */ - void (*display_ring)(void *head, unsigned int size, bool rx); + void (*display_ring)(void *head, unsigned int size, bool rx, + dma_addr_t dma_rx_phy, unsigned int desc_size); /* set MSS via context descriptor */ void (*set_mss)(struct dma_desc *p, unsigned int mss); /* get descriptor skbuff address */ diff --git a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c index f083360e4ba6..98ef43f35802 100644 --- a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c +++ b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c @@ -269,19 +269,22 @@ static int ndesc_get_rx_timestamp_status(void *desc, void *next_desc, u32 ats) return 1; }
-static void ndesc_display_ring(void *head, unsigned int size, bool rx) +static void ndesc_display_ring(void *head, unsigned int size, bool rx, + dma_addr_t dma_rx_phy, unsigned int desc_size) { struct dma_desc *p = (struct dma_desc *)head; + dma_addr_t dma_addr; int i;
pr_info("%s descriptor ring:\n", rx ? "RX" : "TX");
for (i = 0; i < size; i++) { u64 x; + dma_addr = dma_rx_phy + i * sizeof(*p);
x = *(u64 *)p; - pr_info("%03d [0x%x]: 0x%x 0x%x 0x%x 0x%x", - i, (unsigned int)virt_to_phys(p), + pr_info("%03d [%pad]: 0x%x 0x%x 0x%x 0x%x", + i, &dma_addr, (unsigned int)x, (unsigned int)(x >> 32), p->des2, p->des3); p++; diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index e87961432a79..4749bd0af160 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -1133,6 +1133,7 @@ static int stmmac_phy_setup(struct stmmac_priv *priv) static void stmmac_display_rx_rings(struct stmmac_priv *priv) { u32 rx_cnt = priv->plat->rx_queues_to_use; + unsigned int desc_size; void *head_rx; u32 queue;
@@ -1142,19 +1143,24 @@ static void stmmac_display_rx_rings(struct stmmac_priv *priv)
pr_info("\tRX Queue %u rings\n", queue);
- if (priv->extend_desc) + if (priv->extend_desc) { head_rx = (void *)rx_q->dma_erx; - else + desc_size = sizeof(struct dma_extended_desc); + } else { head_rx = (void *)rx_q->dma_rx; + desc_size = sizeof(struct dma_desc); + }
/* Display RX ring */ - stmmac_display_ring(priv, head_rx, priv->dma_rx_size, true); + stmmac_display_ring(priv, head_rx, priv->dma_rx_size, true, + rx_q->dma_rx_phy, desc_size); } }
static void stmmac_display_tx_rings(struct stmmac_priv *priv) { u32 tx_cnt = priv->plat->tx_queues_to_use; + unsigned int desc_size; void *head_tx; u32 queue;
@@ -1164,14 +1170,19 @@ static void stmmac_display_tx_rings(struct stmmac_priv *priv)
pr_info("\tTX Queue %d rings\n", queue);
- if (priv->extend_desc) + if (priv->extend_desc) { head_tx = (void *)tx_q->dma_etx; - else if (tx_q->tbs & STMMAC_TBS_AVAIL) + desc_size = sizeof(struct dma_extended_desc); + } else if (tx_q->tbs & STMMAC_TBS_AVAIL) { head_tx = (void *)tx_q->dma_entx; - else + desc_size = sizeof(struct dma_edesc); + } else { head_tx = (void *)tx_q->dma_tx; + desc_size = sizeof(struct dma_desc); + }
- stmmac_display_ring(priv, head_tx, priv->dma_tx_size, false); + stmmac_display_ring(priv, head_tx, priv->dma_tx_size, false, + tx_q->dma_tx_phy, desc_size); } }
@@ -3740,18 +3751,23 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit, u32 queue) unsigned int count = 0, error = 0, len = 0; int status = 0, coe = priv->hw->rx_csum; unsigned int next_entry = rx_q->cur_rx; + unsigned int desc_size; struct sk_buff *skb = NULL;
if (netif_msg_rx_status(priv)) { void *rx_head;
netdev_dbg(priv->dev, "%s: descriptor ring:\n", __func__); - if (priv->extend_desc) + if (priv->extend_desc) { rx_head = (void *)rx_q->dma_erx; - else + desc_size = sizeof(struct dma_extended_desc); + } else { rx_head = (void *)rx_q->dma_rx; + desc_size = sizeof(struct dma_desc); + }
- stmmac_display_ring(priv, rx_head, priv->dma_rx_size, true); + stmmac_display_ring(priv, rx_head, priv->dma_rx_size, true, + rx_q->dma_rx_phy, desc_size); } while (count < limit) { unsigned int buf1_len = 0, buf2_len = 0; @@ -4319,24 +4335,27 @@ static int stmmac_set_mac_address(struct net_device *ndev, void *addr) static struct dentry *stmmac_fs_dir;
static void sysfs_display_ring(void *head, int size, int extend_desc, - struct seq_file *seq) + struct seq_file *seq, dma_addr_t dma_phy_addr) { int i; struct dma_extended_desc *ep = (struct dma_extended_desc *)head; struct dma_desc *p = (struct dma_desc *)head; + dma_addr_t dma_addr;
for (i = 0; i < size; i++) { if (extend_desc) { - seq_printf(seq, "%d [0x%x]: 0x%x 0x%x 0x%x 0x%x\n", - i, (unsigned int)virt_to_phys(ep), + dma_addr = dma_phy_addr + i * sizeof(*ep); + seq_printf(seq, "%d [%pad]: 0x%x 0x%x 0x%x 0x%x\n", + i, &dma_addr, le32_to_cpu(ep->basic.des0), le32_to_cpu(ep->basic.des1), le32_to_cpu(ep->basic.des2), le32_to_cpu(ep->basic.des3)); ep++; } else { - seq_printf(seq, "%d [0x%x]: 0x%x 0x%x 0x%x 0x%x\n", - i, (unsigned int)virt_to_phys(p), + dma_addr = dma_phy_addr + i * sizeof(*p); + seq_printf(seq, "%d [%pad]: 0x%x 0x%x 0x%x 0x%x\n", + i, &dma_addr, le32_to_cpu(p->des0), le32_to_cpu(p->des1), le32_to_cpu(p->des2), le32_to_cpu(p->des3)); p++; @@ -4364,11 +4383,11 @@ static int stmmac_rings_status_show(struct seq_file *seq, void *v) if (priv->extend_desc) { seq_printf(seq, "Extended descriptor ring:\n"); sysfs_display_ring((void *)rx_q->dma_erx, - priv->dma_rx_size, 1, seq); + priv->dma_rx_size, 1, seq, rx_q->dma_rx_phy); } else { seq_printf(seq, "Descriptor ring:\n"); sysfs_display_ring((void *)rx_q->dma_rx, - priv->dma_rx_size, 0, seq); + priv->dma_rx_size, 0, seq, rx_q->dma_rx_phy); } }
@@ -4380,11 +4399,11 @@ static int stmmac_rings_status_show(struct seq_file *seq, void *v) if (priv->extend_desc) { seq_printf(seq, "Extended descriptor ring:\n"); sysfs_display_ring((void *)tx_q->dma_etx, - priv->dma_tx_size, 1, seq); + priv->dma_tx_size, 1, seq, tx_q->dma_tx_phy); } else if (!(tx_q->tbs & STMMAC_TBS_AVAIL)) { seq_printf(seq, "Descriptor ring:\n"); sysfs_display_ring((void *)tx_q->dma_tx, - priv->dma_tx_size, 0, seq); + priv->dma_tx_size, 0, seq, tx_q->dma_tx_phy); } }
From: Heiko Thiery heiko.thiery@gmail.com
[ Upstream commit 6a4d7234ae9a3bb31181f348ade9bbdb55aeb5c5 ]
When accessing the timecounter register on an i.MX8MQ the kernel hangs. This is only the case when the interface is down. This can be reproduced by reading with 'phc_ctrl eth0 get'.
Like described in the change in 91c0d987a9788dcc5fe26baafd73bf9242b68900 the igp clock is disabled when the interface is down and leads to a system hang.
So we check if the ptp clock status before reading the timecounter register.
Signed-off-by: Heiko Thiery heiko.thiery@gmail.com Acked-by: Richard Cochran richardcochran@gmail.com Link: https://lore.kernel.org/r/20210225211514.9115-1-heiko.thiery@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/freescale/fec_ptp.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/freescale/fec_ptp.c b/drivers/net/ethernet/freescale/fec_ptp.c index 2e344aada4c6..1753807cbf97 100644 --- a/drivers/net/ethernet/freescale/fec_ptp.c +++ b/drivers/net/ethernet/freescale/fec_ptp.c @@ -377,9 +377,16 @@ static int fec_ptp_gettime(struct ptp_clock_info *ptp, struct timespec64 *ts) u64 ns; unsigned long flags;
+ mutex_lock(&adapter->ptp_clk_mutex); + /* Check the ptp clock */ + if (!adapter->ptp_clk_on) { + mutex_unlock(&adapter->ptp_clk_mutex); + return -EINVAL; + } spin_lock_irqsave(&adapter->tmreg_lock, flags); ns = timecounter_read(&adapter->tc); spin_unlock_irqrestore(&adapter->tmreg_lock, flags); + mutex_unlock(&adapter->ptp_clk_mutex);
*ts = ns_to_timespec64(ns);
From: Michael Ellerman mpe@ellerman.id.au
[ Upstream commit eead089311f4d935ab5d1d8fbb0c42ad44699ada ]
lkp reported a build error in fsp2.o:
CC arch/powerpc/platforms/44x/fsp2.o {standard input}:577: Error: unsupported relocation against base
Which comes from:
pr_err("GESR0: 0x%08x\n", mfdcr(base + PLB4OPB_GESR0));
Where our mfdcr() macro is stringifying "base + PLB4OPB_GESR0", and passing that to the assembler, which obviously doesn't work.
The mfdcr() macro already checks that the argument is constant using __builtin_constant_p(), and if not calls the out-of-line version of mfdcr(). But in this case GCC is smart enough to notice that "base + PLB4OPB_GESR0" will be constant, even though it's not something we can immediately stringify into a register number.
Segher pointed out that passing the register number to the inline asm as a constant would be better, and in fact it fixes the build error, presumably because it gives GCC a chance to resolve the value.
While we're at it, change mtdcr() similarly.
Reported-by: kernel test robot lkp@intel.com Suggested-by: Segher Boessenkool segher@kernel.crashing.org Signed-off-by: Michael Ellerman mpe@ellerman.id.au Acked-by: Feng Tang feng.tang@intel.com Link: https://lore.kernel.org/r/20210218123058.748882-1-mpe@ellerman.id.au Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/include/asm/dcr-native.h | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/include/asm/dcr-native.h b/arch/powerpc/include/asm/dcr-native.h index 7141ccea8c94..a92059964579 100644 --- a/arch/powerpc/include/asm/dcr-native.h +++ b/arch/powerpc/include/asm/dcr-native.h @@ -53,8 +53,8 @@ static inline void mtdcrx(unsigned int reg, unsigned int val) #define mfdcr(rn) \ ({unsigned int rval; \ if (__builtin_constant_p(rn) && rn < 1024) \ - asm volatile("mfdcr %0," __stringify(rn) \ - : "=r" (rval)); \ + asm volatile("mfdcr %0, %1" : "=r" (rval) \ + : "n" (rn)); \ else if (likely(cpu_has_feature(CPU_FTR_INDEXED_DCR))) \ rval = mfdcrx(rn); \ else \ @@ -64,8 +64,8 @@ static inline void mtdcrx(unsigned int reg, unsigned int val) #define mtdcr(rn, v) \ do { \ if (__builtin_constant_p(rn) && rn < 1024) \ - asm volatile("mtdcr " __stringify(rn) ",%0" \ - : : "r" (v)); \ + asm volatile("mtdcr %0, %1" \ + : : "n" (rn), "r" (v)); \ else if (likely(cpu_has_feature(CPU_FTR_INDEXED_DCR))) \ mtdcrx(rn, v); \ else \
From: Tong Zhang ztong0001@gmail.com
[ Upstream commit 4deb550bc3b698a1f03d0332cde3df154d1b6c1e ]
label err_eni_release is reachable when eni_start() fail. In eni_start() it calls dev->phy->start() in the last step, if start() fail we don't need to call phy->stop(), if start() is never called, we neither need to call phy->stop(), otherwise null-ptr-deref will happen.
In order to fix this issue, don't call phy->stop() in label err_eni_release
[ 4.875714] ================================================================== [ 4.876091] BUG: KASAN: null-ptr-deref in suni_stop+0x47/0x100 [suni] [ 4.876433] Read of size 8 at addr 0000000000000030 by task modprobe/95 [ 4.876778] [ 4.876862] CPU: 0 PID: 95 Comm: modprobe Not tainted 5.11.0-rc7-00090-gdcc0b49040c7 #2 [ 4.877290] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-48-gd94 [ 4.877876] Call Trace: [ 4.878009] dump_stack+0x7d/0xa3 [ 4.878191] kasan_report.cold+0x10c/0x10e [ 4.878410] ? __slab_free+0x2f0/0x340 [ 4.878612] ? suni_stop+0x47/0x100 [suni] [ 4.878832] suni_stop+0x47/0x100 [suni] [ 4.879043] eni_do_release+0x3b/0x70 [eni] [ 4.879269] eni_init_one.cold+0x1152/0x1747 [eni] [ 4.879528] ? _raw_spin_lock_irqsave+0x7b/0xd0 [ 4.879768] ? eni_ioctl+0x270/0x270 [eni] [ 4.879990] ? __mutex_lock_slowpath+0x10/0x10 [ 4.880226] ? eni_ioctl+0x270/0x270 [eni] [ 4.880448] local_pci_probe+0x6f/0xb0 [ 4.880650] pci_device_probe+0x171/0x240 [ 4.880864] ? pci_device_remove+0xe0/0xe0 [ 4.881086] ? kernfs_create_link+0xb6/0x110 [ 4.881315] ? sysfs_do_create_link_sd.isra.0+0x76/0xe0 [ 4.881594] really_probe+0x161/0x420 [ 4.881791] driver_probe_device+0x6d/0xd0 [ 4.882010] device_driver_attach+0x82/0x90 [ 4.882233] ? device_driver_attach+0x90/0x90 [ 4.882465] __driver_attach+0x60/0x100 [ 4.882671] ? device_driver_attach+0x90/0x90 [ 4.882903] bus_for_each_dev+0xe1/0x140 [ 4.883114] ? subsys_dev_iter_exit+0x10/0x10 [ 4.883346] ? klist_node_init+0x61/0x80 [ 4.883557] bus_add_driver+0x254/0x2a0 [ 4.883764] driver_register+0xd3/0x150 [ 4.883971] ? 0xffffffffc0038000 [ 4.884149] do_one_initcall+0x84/0x250 [ 4.884355] ? trace_event_raw_event_initcall_finish+0x150/0x150 [ 4.884674] ? unpoison_range+0xf/0x30 [ 4.884875] ? ____kasan_kmalloc.constprop.0+0x84/0xa0 [ 4.885150] ? unpoison_range+0xf/0x30 [ 4.885352] ? unpoison_range+0xf/0x30 [ 4.885557] do_init_module+0xf8/0x350 [ 4.885760] load_module+0x3fe6/0x4340 [ 4.885960] ? vm_unmap_ram+0x1d0/0x1d0 [ 4.886166] ? ____kasan_kmalloc.constprop.0+0x84/0xa0 [ 4.886441] ? module_frob_arch_sections+0x20/0x20 [ 4.886697] ? __do_sys_finit_module+0x108/0x170 [ 4.886941] __do_sys_finit_module+0x108/0x170 [ 4.887178] ? __ia32_sys_init_module+0x40/0x40 [ 4.887419] ? file_open_root+0x200/0x200 [ 4.887634] ? do_sys_open+0x85/0xe0 [ 4.887826] ? filp_open+0x50/0x50 [ 4.888009] ? fpregs_assert_state_consistent+0x4d/0x60 [ 4.888287] ? exit_to_user_mode_prepare+0x2f/0x130 [ 4.888547] do_syscall_64+0x33/0x40 [ 4.888739] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 4.889010] RIP: 0033:0x7ff62fcf1cf7 [ 4.889202] Code: 48 89 57 30 48 8b 04 24 48 89 47 38 e9 1d a0 02 00 48 89 f8 48 89 f71 [ 4.890172] RSP: 002b:00007ffe6644ade8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 4.890570] RAX: ffffffffffffffda RBX: 0000000000f2ca70 RCX: 00007ff62fcf1cf7 [ 4.890944] RDX: 0000000000000000 RSI: 0000000000f2b9e0 RDI: 0000000000000003 [ 4.891318] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000001 [ 4.891691] R10: 00007ff62fd55300 R11: 0000000000000246 R12: 0000000000f2b9e0 [ 4.892064] R13: 0000000000000000 R14: 0000000000f2bdd0 R15: 0000000000000001 [ 4.892439] ==================================================================
Signed-off-by: Tong Zhang ztong0001@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/atm/eni.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/atm/eni.c b/drivers/atm/eni.c index 316a9947541f..b574cce98dc3 100644 --- a/drivers/atm/eni.c +++ b/drivers/atm/eni.c @@ -2260,7 +2260,8 @@ static int eni_init_one(struct pci_dev *pci_dev, return rc;
err_eni_release: - eni_do_release(dev); + dev->phy = NULL; + iounmap(ENI_DEV(dev)->ioaddr); err_unregister: atm_dev_deregister(dev); err_free_consistent:
From: Tong Zhang ztong0001@gmail.com
[ Upstream commit a2bd45834e83d6c5a04d397bde13d744a4812dfc ]
lanai_dev_open() can fail. When it fail, lanai->base is unmapped and the pci device is disabled. The caller, lanai_init_one(), then tries to run atm_dev_deregister(). This will subsequently call lanai_dev_close() and use the already released MMIO area.
To fix this issue, set the lanai->base to NULL if open fail, and test the flag in lanai_dev_close().
[ 8.324153] lanai: lanai_start() failed, err=19 [ 8.324819] lanai(itf 0): shutting down interface [ 8.325211] BUG: unable to handle page fault for address: ffffc90000180024 [ 8.325781] #PF: supervisor write access in kernel mode [ 8.326215] #PF: error_code(0x0002) - not-present page [ 8.326641] PGD 100000067 P4D 100000067 PUD 100139067 PMD 10013a067 PTE 0 [ 8.327206] Oops: 0002 [#1] SMP KASAN NOPTI [ 8.327557] CPU: 0 PID: 95 Comm: modprobe Not tainted 5.11.0-rc7-00090-gdcc0b49040c7 #12 [ 8.328229] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-48-gd9c812dda519-4 [ 8.329145] RIP: 0010:lanai_dev_close+0x4f/0xe5 [lanai] [ 8.329587] Code: 00 48 c7 c7 00 d3 01 c0 e8 49 4e 0a c2 48 8d bd 08 02 00 00 e8 6e 52 14 c1 48 80 [ 8.330917] RSP: 0018:ffff8881029ef680 EFLAGS: 00010246 [ 8.331196] RAX: 000000000003fffe RBX: ffff888102fb4800 RCX: ffffffffc001a98a [ 8.331572] RDX: ffffc90000180000 RSI: 0000000000000246 RDI: ffff888102fb4000 [ 8.331948] RBP: ffff888102fb4000 R08: ffffffff8115da8a R09: ffffed102053deaa [ 8.332326] R10: 0000000000000003 R11: ffffed102053dea9 R12: ffff888102fb48a4 [ 8.332701] R13: ffffffffc00123c0 R14: ffff888102fb4b90 R15: ffff888102fb4b88 [ 8.333077] FS: 00007f08eb9056a0(0000) GS:ffff88815b400000(0000) knlGS:0000000000000000 [ 8.333502] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8.333806] CR2: ffffc90000180024 CR3: 0000000102a28000 CR4: 00000000000006f0 [ 8.334182] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 8.334557] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 8.334932] Call Trace: [ 8.335066] atm_dev_deregister+0x161/0x1a0 [atm] [ 8.335324] lanai_init_one.cold+0x20c/0x96d [lanai] [ 8.335594] ? lanai_send+0x2a0/0x2a0 [lanai] [ 8.335831] local_pci_probe+0x6f/0xb0 [ 8.336039] pci_device_probe+0x171/0x240 [ 8.336255] ? pci_device_remove+0xe0/0xe0 [ 8.336475] ? kernfs_create_link+0xb6/0x110 [ 8.336704] ? sysfs_do_create_link_sd.isra.0+0x76/0xe0 [ 8.336983] really_probe+0x161/0x420 [ 8.337181] driver_probe_device+0x6d/0xd0 [ 8.337401] device_driver_attach+0x82/0x90 [ 8.337626] ? device_driver_attach+0x90/0x90 [ 8.337859] __driver_attach+0x60/0x100 [ 8.338065] ? device_driver_attach+0x90/0x90 [ 8.338298] bus_for_each_dev+0xe1/0x140 [ 8.338511] ? subsys_dev_iter_exit+0x10/0x10 [ 8.338745] ? klist_node_init+0x61/0x80 [ 8.338956] bus_add_driver+0x254/0x2a0 [ 8.339164] driver_register+0xd3/0x150 [ 8.339370] ? 0xffffffffc0028000 [ 8.339550] do_one_initcall+0x84/0x250 [ 8.339755] ? trace_event_raw_event_initcall_finish+0x150/0x150 [ 8.340076] ? free_vmap_area_noflush+0x1a5/0x5c0 [ 8.340329] ? unpoison_range+0xf/0x30 [ 8.340532] ? ____kasan_kmalloc.constprop.0+0x84/0xa0 [ 8.340806] ? unpoison_range+0xf/0x30 [ 8.341014] ? unpoison_range+0xf/0x30 [ 8.341217] do_init_module+0xf8/0x350 [ 8.341419] load_module+0x3fe6/0x4340 [ 8.341621] ? vm_unmap_ram+0x1d0/0x1d0 [ 8.341826] ? ____kasan_kmalloc.constprop.0+0x84/0xa0 [ 8.342101] ? module_frob_arch_sections+0x20/0x20 [ 8.342358] ? __do_sys_finit_module+0x108/0x170 [ 8.342604] __do_sys_finit_module+0x108/0x170 [ 8.342841] ? __ia32_sys_init_module+0x40/0x40 [ 8.343083] ? file_open_root+0x200/0x200 [ 8.343298] ? do_sys_open+0x85/0xe0 [ 8.343491] ? filp_open+0x50/0x50 [ 8.343675] ? exit_to_user_mode_prepare+0xfc/0x130 [ 8.343935] do_syscall_64+0x33/0x40 [ 8.344132] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 8.344401] RIP: 0033:0x7f08eb887cf7 [ 8.344594] Code: 48 89 57 30 48 8b 04 24 48 89 47 38 e9 1d a0 02 00 48 89 f8 48 89 f7 48 89 d6 41 [ 8.345565] RSP: 002b:00007ffcd5c98ad8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 8.345962] RAX: ffffffffffffffda RBX: 00000000008fea70 RCX: 00007f08eb887cf7 [ 8.346336] RDX: 0000000000000000 RSI: 00000000008fd9e0 RDI: 0000000000000003 [ 8.346711] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000001 [ 8.347085] R10: 00007f08eb8eb300 R11: 0000000000000246 R12: 00000000008fd9e0 [ 8.347460] R13: 0000000000000000 R14: 00000000008fddd0 R15: 0000000000000001 [ 8.347836] Modules linked in: lanai(+) atm [ 8.348065] CR2: ffffc90000180024 [ 8.348244] ---[ end trace 7fdc1c668f2003e5 ]--- [ 8.348490] RIP: 0010:lanai_dev_close+0x4f/0xe5 [lanai] [ 8.348772] Code: 00 48 c7 c7 00 d3 01 c0 e8 49 4e 0a c2 48 8d bd 08 02 00 00 e8 6e 52 14 c1 48 80 [ 8.349745] RSP: 0018:ffff8881029ef680 EFLAGS: 00010246 [ 8.350022] RAX: 000000000003fffe RBX: ffff888102fb4800 RCX: ffffffffc001a98a [ 8.350397] RDX: ffffc90000180000 RSI: 0000000000000246 RDI: ffff888102fb4000 [ 8.350772] RBP: ffff888102fb4000 R08: ffffffff8115da8a R09: ffffed102053deaa [ 8.351151] R10: 0000000000000003 R11: ffffed102053dea9 R12: ffff888102fb48a4 [ 8.351525] R13: ffffffffc00123c0 R14: ffff888102fb4b90 R15: ffff888102fb4b88 [ 8.351918] FS: 00007f08eb9056a0(0000) GS:ffff88815b400000(0000) knlGS:0000000000000000 [ 8.352343] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8.352647] CR2: ffffc90000180024 CR3: 0000000102a28000 CR4: 00000000000006f0 [ 8.353022] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 8.353397] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 8.353958] modprobe (95) used greatest stack depth: 26216 bytes left
Signed-off-by: Tong Zhang ztong0001@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/atm/lanai.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/atm/lanai.c b/drivers/atm/lanai.c index d7277c26e423..32d7aa141d96 100644 --- a/drivers/atm/lanai.c +++ b/drivers/atm/lanai.c @@ -2233,6 +2233,7 @@ static int lanai_dev_open(struct atm_dev *atmdev) conf1_write(lanai); #endif iounmap(lanai->base); + lanai->base = NULL; error_pci: pci_disable_device(lanai->pci); error: @@ -2245,6 +2246,8 @@ static int lanai_dev_open(struct atm_dev *atmdev) static void lanai_dev_close(struct atm_dev *atmdev) { struct lanai_dev *lanai = (struct lanai_dev *) atmdev->dev_data; + if (lanai->base==NULL) + return; printk(KERN_INFO DEV_LABEL "(itf %d): shutting down interface\n", lanai->number); lanai_timed_poll_stop(lanai); @@ -2552,7 +2555,7 @@ static int lanai_init_one(struct pci_dev *pci, struct atm_dev *atmdev; int result;
- lanai = kmalloc(sizeof(*lanai), GFP_KERNEL); + lanai = kzalloc(sizeof(*lanai), GFP_KERNEL); if (lanai == NULL) { printk(KERN_ERR DEV_LABEL ": couldn't allocate dev_data structure!\n");
From: Hayes Wang hayeswang@realtek.com
[ Upstream commit 4b5dc1a94d4f92b5845e98bd9ae344b26d933aad ]
This reverts commit 134f98bcf1b898fb9d6f2b91bc85dd2e5478b4b8.
The r8153_mac_clk_spd() is used for RTL8153A only, because the register table of RTL8153B is different from RTL8153A. However, this function would be called when RTL8153B calls r8153_first_init() and r8153_enter_oob(). That causes RTL8153B becomes unstable when suspending and resuming. The worst case may let the device stop working.
Besides, revert this commit to disable MAC clock speed down for RTL8153A. It would avoid the known issue when enabling U1. The data of the first control transfer may be wrong when exiting U1.
Signed-off-by: Hayes Wang hayeswang@realtek.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/r8152.c | 35 ++++++----------------------------- 1 file changed, 6 insertions(+), 29 deletions(-)
diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c index 67cd6986634f..fd5ca11c4cbb 100644 --- a/drivers/net/usb/r8152.c +++ b/drivers/net/usb/r8152.c @@ -3016,29 +3016,6 @@ static void __rtl_set_wol(struct r8152 *tp, u32 wolopts) device_set_wakeup_enable(&tp->udev->dev, false); }
-static void r8153_mac_clk_spd(struct r8152 *tp, bool enable) -{ - /* MAC clock speed down */ - if (enable) { - ocp_write_word(tp, MCU_TYPE_PLA, PLA_MAC_PWR_CTRL, - ALDPS_SPDWN_RATIO); - ocp_write_word(tp, MCU_TYPE_PLA, PLA_MAC_PWR_CTRL2, - EEE_SPDWN_RATIO); - ocp_write_word(tp, MCU_TYPE_PLA, PLA_MAC_PWR_CTRL3, - PKT_AVAIL_SPDWN_EN | SUSPEND_SPDWN_EN | - U1U2_SPDWN_EN | L1_SPDWN_EN); - ocp_write_word(tp, MCU_TYPE_PLA, PLA_MAC_PWR_CTRL4, - PWRSAVE_SPDWN_EN | RXDV_SPDWN_EN | TX10MIDLE_EN | - TP100_SPDWN_EN | TP500_SPDWN_EN | EEE_SPDWN_EN | - TP1000_SPDWN_EN); - } else { - ocp_write_word(tp, MCU_TYPE_PLA, PLA_MAC_PWR_CTRL, 0); - ocp_write_word(tp, MCU_TYPE_PLA, PLA_MAC_PWR_CTRL2, 0); - ocp_write_word(tp, MCU_TYPE_PLA, PLA_MAC_PWR_CTRL3, 0); - ocp_write_word(tp, MCU_TYPE_PLA, PLA_MAC_PWR_CTRL4, 0); - } -} - static void r8153_u1u2en(struct r8152 *tp, bool enable) { u8 u1u2[8]; @@ -3338,11 +3315,9 @@ static void rtl8153_runtime_enable(struct r8152 *tp, bool enable) if (enable) { r8153_u1u2en(tp, false); r8153_u2p3en(tp, false); - r8153_mac_clk_spd(tp, true); rtl_runtime_suspend_enable(tp, true); } else { rtl_runtime_suspend_enable(tp, false); - r8153_mac_clk_spd(tp, false);
switch (tp->version) { case RTL_VER_03: @@ -4678,7 +4653,6 @@ static void r8153_first_init(struct r8152 *tp) { u32 ocp_data;
- r8153_mac_clk_spd(tp, false); rxdy_gated_en(tp, true); r8153_teredo_off(tp);
@@ -4729,8 +4703,6 @@ static void r8153_enter_oob(struct r8152 *tp) { u32 ocp_data;
- r8153_mac_clk_spd(tp, true); - ocp_data = ocp_read_byte(tp, MCU_TYPE_PLA, PLA_OOB_CTRL); ocp_data &= ~NOW_IS_OOB; ocp_write_byte(tp, MCU_TYPE_PLA, PLA_OOB_CTRL, ocp_data); @@ -5456,10 +5428,15 @@ static void r8153_init(struct r8152 *tp)
ocp_write_word(tp, MCU_TYPE_USB, USB_CONNECT_TIMER, 0x0001);
+ /* MAC clock speed down */ + ocp_write_word(tp, MCU_TYPE_PLA, PLA_MAC_PWR_CTRL, 0); + ocp_write_word(tp, MCU_TYPE_PLA, PLA_MAC_PWR_CTRL2, 0); + ocp_write_word(tp, MCU_TYPE_PLA, PLA_MAC_PWR_CTRL3, 0); + ocp_write_word(tp, MCU_TYPE_PLA, PLA_MAC_PWR_CTRL4, 0); + r8153_power_cut_en(tp, false); rtl_runtime_suspend_enable(tp, false); r8153_u1u2en(tp, true); - r8153_mac_clk_spd(tp, false); usb_enable_lpm(tp->udev);
ocp_data = ocp_read_byte(tp, MCU_TYPE_PLA, PLA_CONFIG6);
From: Mark Pearson markpearson@lenovo.com
[ Upstream commit a14a6219996ee6f6e858d83b11affc7907633687 ]
On some Lenovo systems if the microphone is disabled in the BIOS only the NHLT table header is created, with no data. This means the endpoints field is not correctly set to zero - leading to an unintialised variable and hence invalid descriptors are parsed leading to page faults.
The Lenovo firmware team is addressing this, but adding a check preventing invalid tables being parsed is worthwhile.
Tested on a Lenovo T14.
Tested-by: Philipp Leskovitz philipp.leskovitz@secunet.com Reported-by: Philipp Leskovitz philipp.leskovitz@secunet.com Signed-off-by: Mark Pearson markpearson@lenovo.com Reviewed-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Link: https://lore.kernel.org/r/20210302141003.7342-1-markpearson@lenovo.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/hda/intel-nhlt.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/sound/hda/intel-nhlt.c b/sound/hda/intel-nhlt.c index d053beccfaec..e2237239d922 100644 --- a/sound/hda/intel-nhlt.c +++ b/sound/hda/intel-nhlt.c @@ -39,6 +39,11 @@ int intel_nhlt_get_dmic_geo(struct device *dev, struct nhlt_acpi_table *nhlt) if (!nhlt) return 0;
+ if (nhlt->header.length <= sizeof(struct acpi_table_header)) { + dev_warn(dev, "Invalid DMIC description table\n"); + return 0; + } + for (j = 0, epnt = nhlt->desc; j < nhlt->endpoint_count; j++, epnt = (struct nhlt_endpoint *)((u8 *)epnt + epnt->length)) {
From: Dinghao Liu dinghao.liu@zju.edu.cn
[ Upstream commit 7a766381634da19fc837619b0a34590498d9d29a ]
When ixgbe_fdir_write_perfect_filter_82599() fails, input allocated by kzalloc() has not been freed, which leads to memleak.
Signed-off-by: Dinghao Liu dinghao.liu@zju.edu.cn Reviewed-by: Paul Menzel pmenzel@molgen.mpg.de Tested-by: Tony Brelinski tonyx.brelinski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c index 393d1c2cd853..e9c2d28efc81 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -9582,8 +9582,10 @@ static int ixgbe_configure_clsu32(struct ixgbe_adapter *adapter, ixgbe_atr_compute_perfect_hash_82599(&input->filter, mask); err = ixgbe_fdir_write_perfect_filter_82599(hw, &input->filter, input->sw_idx, queue); - if (!err) - ixgbe_update_ethtool_fdir_entry(adapter, input, input->sw_idx); + if (err) + goto err_out_w_lock; + + ixgbe_update_ethtool_fdir_entry(adapter, input, input->sw_idx); spin_unlock(&adapter->fdir_perfect_lock);
if ((uhtid != 0x800) && (adapter->jump_tables[uhtid]))
From: Nitin Rawat nitirawa@codeaurora.org
[ Upstream commit 4a791574a0ccf36eb3a0a46fbd71d2768df3eef9 ]
Disable interrupt in reset path to flush pending IRQ handler in order to avoid possible NoC issues.
Link: https://lore.kernel.org/r/1614145010-36079-3-git-send-email-cang@codeaurora.... Reviewed-by: Avri Altman avri.altman@wdc.com Signed-off-by: Nitin Rawat nitirawa@codeaurora.org Signed-off-by: Can Guo cang@codeaurora.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/ufs/ufs-qcom.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/drivers/scsi/ufs/ufs-qcom.c b/drivers/scsi/ufs/ufs-qcom.c index 2206b1e4b774..e55201f64c10 100644 --- a/drivers/scsi/ufs/ufs-qcom.c +++ b/drivers/scsi/ufs/ufs-qcom.c @@ -253,12 +253,17 @@ static int ufs_qcom_host_reset(struct ufs_hba *hba) { int ret = 0; struct ufs_qcom_host *host = ufshcd_get_variant(hba); + bool reenable_intr = false;
if (!host->core_reset) { dev_warn(hba->dev, "%s: reset control not set\n", __func__); goto out; }
+ reenable_intr = hba->is_irq_enabled; + disable_irq(hba->irq); + hba->is_irq_enabled = false; + ret = reset_control_assert(host->core_reset); if (ret) { dev_err(hba->dev, "%s: core_reset assert failed, err = %d\n", @@ -280,6 +285,11 @@ static int ufs_qcom_host_reset(struct ufs_hba *hba)
usleep_range(1000, 1100);
+ if (reenable_intr) { + enable_irq(hba->irq); + hba->is_irq_enabled = true; + } + out: return ret; }
From: Xunlei Pang xlpang@linux.alibaba.com
[ Upstream commit 4f44657d74873735e93a50eb25014721a66aac19 ]
The current blkio.throttle.io_service_bytes_recursive doesn't work correctly.
As an example, for the following blkcg hierarchy: (Made 1GB READ in test1, 512MB READ in test2) test / \ test1 test2
$ head -n 1 test/test1/blkio.throttle.io_service_bytes_recursive 8:0 Read 1073684480 $ head -n 1 test/test2/blkio.throttle.io_service_bytes_recursive 8:0 Read 537448448 $ head -n 1 test/blkio.throttle.io_service_bytes_recursive 8:0 Read 537448448
Clearly, above data of "test" reflects "test2" not "test1"+"test2".
Do the correct summary in blkg_rwstat_recursive_sum().
Signed-off-by: Xunlei Pang xlpang@linux.alibaba.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-cgroup-rwstat.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/block/blk-cgroup-rwstat.c b/block/blk-cgroup-rwstat.c index 85d5790ac49b..3304e841df7c 100644 --- a/block/blk-cgroup-rwstat.c +++ b/block/blk-cgroup-rwstat.c @@ -109,6 +109,7 @@ void blkg_rwstat_recursive_sum(struct blkcg_gq *blkg, struct blkcg_policy *pol,
lockdep_assert_held(&blkg->q->queue_lock);
+ memset(sum, 0, sizeof(*sum)); rcu_read_lock(); blkg_for_each_descendant_pre(pos_blkg, pos_css, blkg) { struct blkg_rwstat *rwstat; @@ -122,7 +123,7 @@ void blkg_rwstat_recursive_sum(struct blkcg_gq *blkg, struct blkcg_policy *pol, rwstat = (void *)pos_blkg + off;
for (i = 0; i < BLKG_RWSTAT_NR; i++) - sum->cnt[i] = blkg_rwstat_read_counter(rwstat, i); + sum->cnt[i] += blkg_rwstat_read_counter(rwstat, i); } rcu_read_unlock(); }
From: Jia-Ju Bai baijiaju1990@gmail.com
[ Upstream commit 38c26ff3048af50eee3fcd591921357ee5bfd9ee ]
When bdx_read_mac() fails, no error return code of bdx_probe() is assigned. To fix this bug, err is assigned with -EFAULT as error return code.
Reported-by: TOTE Robot oslab@tsinghua.edu.cn Signed-off-by: Jia-Ju Bai baijiaju1990@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/tehuti/tehuti.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/tehuti/tehuti.c b/drivers/net/ethernet/tehuti/tehuti.c index b8f4f419173f..d054c6e83b1c 100644 --- a/drivers/net/ethernet/tehuti/tehuti.c +++ b/drivers/net/ethernet/tehuti/tehuti.c @@ -2044,6 +2044,7 @@ bdx_probe(struct pci_dev *pdev, const struct pci_device_id *ent) /*bdx_hw_reset(priv); */ if (bdx_read_mac(priv)) { pr_err("load MAC address failed\n"); + err = -EFAULT; goto err_out_iomap; } SET_NETDEV_DEV(ndev, &pdev->dev);
From: Jia-Ju Bai baijiaju1990@gmail.com
[ Upstream commit 6650d31f21b8a0043613ae0a4a2e42e49dc20b2d ]
When iavf_process_config() fails, no error return code of iavf_init_get_resources() is assigned. To fix this bug, err is assigned with the return value of iavf_process_config(), and then err is checked.
Reported-by: TOTE Robot oslab@tsinghua.edu.cn Signed-off-by: Jia-Ju Bai baijiaju1990@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/iavf/iavf_main.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c index 0a867d64d467..dc5b3c06d1e0 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_main.c +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c @@ -1776,7 +1776,8 @@ static int iavf_init_get_resources(struct iavf_adapter *adapter) goto err_alloc; }
- if (iavf_process_config(adapter)) + err = iavf_process_config(adapter); + if (err) goto err_alloc; adapter->current_op = VIRTCHNL_OP_UNKNOWN;
From: Denis Efremov efremov@linux.com
[ Upstream commit 155b23e6e53475ca3b8c2a946299b4d4dd6a5a1e ]
RXMAC_BC_FRM_CNT_COUNT added to mp->rx_bcasts twice in a row in niu_xmac_interrupt(). Remove the second addition.
Signed-off-by: Denis Efremov efremov@linux.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/sun/niu.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/drivers/net/ethernet/sun/niu.c b/drivers/net/ethernet/sun/niu.c index 68695d4afacd..707ccdd03b19 100644 --- a/drivers/net/ethernet/sun/niu.c +++ b/drivers/net/ethernet/sun/niu.c @@ -3931,8 +3931,6 @@ static void niu_xmac_interrupt(struct niu *np) mp->rx_mcasts += RXMAC_MC_FRM_CNT_COUNT; if (val & XRXMAC_STATUS_RXBCAST_CNT_EXP) mp->rx_bcasts += RXMAC_BC_FRM_CNT_COUNT; - if (val & XRXMAC_STATUS_RXBCAST_CNT_EXP) - mp->rx_bcasts += RXMAC_BC_FRM_CNT_COUNT; if (val & XRXMAC_STATUS_RXHIST1_CNT_EXP) mp->rx_hist_cnt1 += RXMAC_HIST_CNT1_COUNT; if (val & XRXMAC_STATUS_RXHIST2_CNT_EXP)
From: Michael Braun michael-dev@fami-braun.de
[ Upstream commit d8861bab48b6c1fc3cdbcab8ff9d1eaea43afe7f ]
When using jumbo packets and overrunning rx queue with napi enabled, the following sequence is observed in gfar_add_rx_frag:
| lstatus | | skb | t | lstatus, size, flags | first | len, data_len, *ptr | ---+--------------------------------------+-------+-----------------------+ 13 | 18002348, 9032, INTERRUPT LAST | 0 | 9600, 8000, f554c12e | 12 | 10000640, 1600, INTERRUPT | 0 | 8000, 6400, f554c12e | 11 | 10000640, 1600, INTERRUPT | 0 | 6400, 4800, f554c12e | 10 | 10000640, 1600, INTERRUPT | 0 | 4800, 3200, f554c12e | 09 | 10000640, 1600, INTERRUPT | 0 | 3200, 1600, f554c12e | 08 | 14000640, 1600, INTERRUPT FIRST | 0 | 1600, 0, f554c12e | 07 | 14000640, 1600, INTERRUPT FIRST | 1 | 0, 0, f554c12e | 06 | 1c000080, 128, INTERRUPT LAST FIRST | 1 | 0, 0, abf3bd6e | 05 | 18002348, 9032, INTERRUPT LAST | 0 | 8000, 6400, c5a57780 | 04 | 10000640, 1600, INTERRUPT | 0 | 6400, 4800, c5a57780 | 03 | 10000640, 1600, INTERRUPT | 0 | 4800, 3200, c5a57780 | 02 | 10000640, 1600, INTERRUPT | 0 | 3200, 1600, c5a57780 | 01 | 10000640, 1600, INTERRUPT | 0 | 1600, 0, c5a57780 | 00 | 14000640, 1600, INTERRUPT FIRST | 1 | 0, 0, c5a57780 |
So at t=7 a new packets is started but not finished, probably due to rx overrun - but rx overrun is not indicated in the flags. Instead a new packets starts at t=8. This results in skb->len to exceed size for the LAST fragment at t=13 and thus a negative fragment size added to the skb.
This then crashes:
kernel BUG at include/linux/skbuff.h:2277! Oops: Exception in kernel mode, sig: 5 [#1] ... NIP [c04689f4] skb_pull+0x2c/0x48 LR [c03f62ac] gfar_clean_rx_ring+0x2e4/0x844 Call Trace: [ec4bfd38] [c06a84c4] _raw_spin_unlock_irqrestore+0x60/0x7c (unreliable) [ec4bfda8] [c03f6a44] gfar_poll_rx_sq+0x48/0xe4 [ec4bfdc8] [c048d504] __napi_poll+0x54/0x26c [ec4bfdf8] [c048d908] net_rx_action+0x138/0x2c0 [ec4bfe68] [c06a8f34] __do_softirq+0x3a4/0x4fc [ec4bfed8] [c0040150] run_ksoftirqd+0x58/0x70 [ec4bfee8] [c0066ecc] smpboot_thread_fn+0x184/0x1cc [ec4bff08] [c0062718] kthread+0x140/0x144 [ec4bff38] [c0012350] ret_from_kernel_thread+0x14/0x1c
This patch fixes this by checking for computed LAST fragment size, so a negative sized fragment is never added. In order to prevent the newer rx frame from getting corrupted, the FIRST flag is checked to discard the incomplete older frame.
Signed-off-by: Michael Braun michael-dev@fami-braun.de Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/freescale/gianfar.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c index d391a45cebb6..4fab2ee5bbf5 100644 --- a/drivers/net/ethernet/freescale/gianfar.c +++ b/drivers/net/ethernet/freescale/gianfar.c @@ -2391,6 +2391,10 @@ static bool gfar_add_rx_frag(struct gfar_rx_buff *rxb, u32 lstatus, if (lstatus & BD_LFLAG(RXBD_LAST)) size -= skb->len;
+ WARN(size < 0, "gianfar: rx fragment size underflow"); + if (size < 0) + return false; + skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page, rxb->page_offset + RXBUF_ALIGNMENT, size, GFAR_RXB_TRUESIZE); @@ -2553,6 +2557,17 @@ static int gfar_clean_rx_ring(struct gfar_priv_rx_q *rx_queue, if (lstatus & BD_LFLAG(RXBD_EMPTY)) break;
+ /* lost RXBD_LAST descriptor due to overrun */ + if (skb && + (lstatus & BD_LFLAG(RXBD_FIRST))) { + /* discard faulty buffer */ + dev_kfree_skb(skb); + skb = NULL; + rx_queue->stats.rx_dropped++; + + /* can continue normally */ + } + /* order rx buffer descriptor reads */ rmb();
From: Aurelien Aptel aaptel@suse.com
[ Upstream commit 88fd98a2306755b965e4f4567f84e73db3b6738c ]
When doing a large read or write workload we only very gradually increase the number of credits which can cause problems with parallelizing large i/o (I/O ramps up more slowly than it should for large read/write workloads) especially with multichannel when the number of credits on the secondary channels starts out low (e.g. less than about 130) or when recovering after server throttled back the number of credit.
Signed-off-by: Aurelien Aptel aaptel@suse.com Reviewed-by: Shyam Prasad N sprasad@microsoft.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/smb2pdu.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c index 794fc3b68b4f..6a1af5545f67 100644 --- a/fs/cifs/smb2pdu.c +++ b/fs/cifs/smb2pdu.c @@ -4033,8 +4033,7 @@ smb2_async_readv(struct cifs_readdata *rdata) if (rdata->credits.value > 0) { shdr->CreditCharge = cpu_to_le16(DIV_ROUND_UP(rdata->bytes, SMB2_MAX_BUFFER_SIZE)); - shdr->CreditRequest = - cpu_to_le16(le16_to_cpu(shdr->CreditCharge) + 1); + shdr->CreditRequest = cpu_to_le16(le16_to_cpu(shdr->CreditCharge) + 8);
rc = adjust_credits(server, &rdata->credits, rdata->bytes); if (rc) @@ -4340,8 +4339,7 @@ smb2_async_writev(struct cifs_writedata *wdata, if (wdata->credits.value > 0) { shdr->CreditCharge = cpu_to_le16(DIV_ROUND_UP(wdata->bytes, SMB2_MAX_BUFFER_SIZE)); - shdr->CreditRequest = - cpu_to_le16(le16_to_cpu(shdr->CreditCharge) + 1); + shdr->CreditRequest = cpu_to_le16(le16_to_cpu(shdr->CreditCharge) + 8);
rc = adjust_credits(server, &wdata->credits, wdata->bytes); if (rc)
From: Bob Peterson rpeterso@redhat.com
[ Upstream commit 1a5a2cfd34c17db73c53ef127272c8c1ae220485 ]
This patch adds code to function trans_drain to remove drained bd elements from the ail lists, if queued, before freeing the bd. If we don't remove the bd from the ail, function ail_drain will try to reference the bd after it has been freed by trans_drain.
Thanks to Andy Price for his analysis of the problem.
Reported-by: Andy Price anprice@redhat.com Signed-off-by: Bob Peterson rpeterso@redhat.com Signed-off-by: Andreas Gruenbacher agruenba@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/gfs2/log.c | 4 ++++ fs/gfs2/trans.c | 2 ++ 2 files changed, 6 insertions(+)
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c index 2e9314091c81..1955dea999f7 100644 --- a/fs/gfs2/log.c +++ b/fs/gfs2/log.c @@ -935,12 +935,16 @@ static void trans_drain(struct gfs2_trans *tr) while (!list_empty(head)) { bd = list_first_entry(head, struct gfs2_bufdata, bd_list); list_del_init(&bd->bd_list); + if (!list_empty(&bd->bd_ail_st_list)) + gfs2_remove_from_ail(bd); kmem_cache_free(gfs2_bufdata_cachep, bd); } head = &tr->tr_databuf; while (!list_empty(head)) { bd = list_first_entry(head, struct gfs2_bufdata, bd_list); list_del_init(&bd->bd_list); + if (!list_empty(&bd->bd_ail_st_list)) + gfs2_remove_from_ail(bd); kmem_cache_free(gfs2_bufdata_cachep, bd); } } diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c index 6d4bf7ea7b3b..7f850ff6a05d 100644 --- a/fs/gfs2/trans.c +++ b/fs/gfs2/trans.c @@ -134,6 +134,8 @@ static struct gfs2_bufdata *gfs2_alloc_bufdata(struct gfs2_glock *gl, bd->bd_bh = bh; bd->bd_gl = gl; INIT_LIST_HEAD(&bd->bd_list); + INIT_LIST_HEAD(&bd->bd_ail_st_list); + INIT_LIST_HEAD(&bd->bd_ail_gl_list); bh->b_private = bd; return bd; }
From: Sudeep Holla sudeep.holla@arm.com
[ Upstream commit fbb31cb805fd3574d3be7defc06a7fd2fd9af7d2 ]
Add "arm,vexpress" to cpufreq-dt-platdev blacklist since the actual scaling is handled by the firmware cpufreq drivers(scpi, scmi and vexpress-spc).
Signed-off-by: Sudeep Holla sudeep.holla@arm.com Signed-off-by: Viresh Kumar viresh.kumar@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/cpufreq/cpufreq-dt-platdev.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/cpufreq/cpufreq-dt-platdev.c b/drivers/cpufreq/cpufreq-dt-platdev.c index bd2db0188cbb..91e6a0c10dbf 100644 --- a/drivers/cpufreq/cpufreq-dt-platdev.c +++ b/drivers/cpufreq/cpufreq-dt-platdev.c @@ -103,6 +103,8 @@ static const struct of_device_id whitelist[] __initconst = { static const struct of_device_id blacklist[] __initconst = { { .compatible = "allwinner,sun50i-h6", },
+ { .compatible = "arm,vexpress", }, + { .compatible = "calxeda,highbank", }, { .compatible = "calxeda,ecx-2000", },
From: Yang Li yang.lee@linux.alibaba.com
[ Upstream commit 6e5d5791730b55a1f987e1db84b078b91eb49e99 ]
fixed the following coccicheck: ./drivers/gpio/gpiolib-acpi.c:176:7-27: ERROR: Threaded IRQ with no primary handler requested without IRQF_ONESHOT
Make sure threaded IRQs without a primary handler are always request with IRQF_ONESHOT
Reported-by: Abaci Robot abaci@linux.alibaba.com Signed-off-by: Yang Li yang.lee@linux.alibaba.com Acked-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpiolib-acpi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpio/gpiolib-acpi.c b/drivers/gpio/gpiolib-acpi.c index 495f779b2ab9..1aacd2a5a1fd 100644 --- a/drivers/gpio/gpiolib-acpi.c +++ b/drivers/gpio/gpiolib-acpi.c @@ -174,7 +174,7 @@ static void acpi_gpiochip_request_irq(struct acpi_gpio_chip *acpi_gpio, int ret, value;
ret = request_threaded_irq(event->irq, NULL, event->handler, - event->irqflags, "ACPI:Event", event); + event->irqflags | IRQF_ONESHOT, "ACPI:Event", event); if (ret) { dev_err(acpi_gpio->chip->parent, "Failed to setup interrupt handler for %d\n",
From: Timo Rothenpieler timo@rothenpieler.org
[ Upstream commit a0590473c5e6c4ef17c3132ad08fbad170f72d55 ]
This follows what was done in 8c2fabc6542d9d0f8b16bd1045c2eda59bdcde13. With the default being m, it's impossible to build the module into the kernel.
Signed-off-by: Timo Rothenpieler timo@rothenpieler.org Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/nfs/Kconfig b/fs/nfs/Kconfig index e2a488d403a6..14a72224b657 100644 --- a/fs/nfs/Kconfig +++ b/fs/nfs/Kconfig @@ -127,7 +127,7 @@ config PNFS_BLOCK config PNFS_FLEXFILE_LAYOUT tristate depends on NFS_V4_1 && NFS_V3 - default m + default NFS_V4
config NFS_V4_1_IMPLEMENTATION_ID_DOMAIN string "NFSv4.1 Implementation ID Domain"
From: Frank Sorenson sorenson@redhat.com
[ Upstream commit ad3dbe35c833c2d4d0bbf3f04c785d32f931e7c9 ]
CREATE requests return a post_op_fh3, rather than nfs_fh3. The post_op_fh3 includes an extra word to indicate 'handle_follows'.
Without that additional word, create fails when full 64-byte filehandles are in use.
Add NFS3_post_op_fh_sz, and correct the size calculation for NFS3_createres_sz.
Signed-off-by: Frank Sorenson sorenson@redhat.com Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/nfs3xdr.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/nfs/nfs3xdr.c b/fs/nfs/nfs3xdr.c index ca10072644ff..ed1c83738c30 100644 --- a/fs/nfs/nfs3xdr.c +++ b/fs/nfs/nfs3xdr.c @@ -36,6 +36,7 @@ #define NFS3_pagepad_sz (1) /* Page padding */ #define NFS3_fhandle_sz (1+16) #define NFS3_fh_sz (NFS3_fhandle_sz) /* shorthand */ +#define NFS3_post_op_fh_sz (1+NFS3_fh_sz) #define NFS3_sattr_sz (15) #define NFS3_filename_sz (1+(NFS3_MAXNAMLEN>>2)) #define NFS3_path_sz (1+(NFS3_MAXPATHLEN>>2)) @@ -73,7 +74,7 @@ #define NFS3_readlinkres_sz (1+NFS3_post_op_attr_sz+1+NFS3_pagepad_sz) #define NFS3_readres_sz (1+NFS3_post_op_attr_sz+3+NFS3_pagepad_sz) #define NFS3_writeres_sz (1+NFS3_wcc_data_sz+4) -#define NFS3_createres_sz (1+NFS3_fh_sz+NFS3_post_op_attr_sz+NFS3_wcc_data_sz) +#define NFS3_createres_sz (1+NFS3_post_op_fh_sz+NFS3_post_op_attr_sz+NFS3_wcc_data_sz) #define NFS3_renameres_sz (1+(2 * NFS3_wcc_data_sz)) #define NFS3_linkres_sz (1+NFS3_post_op_attr_sz+NFS3_wcc_data_sz) #define NFS3_readdirres_sz (1+NFS3_post_op_attr_sz+2+NFS3_pagepad_sz)
From: Jia-Ju Bai baijiaju1990@gmail.com
[ Upstream commit 143c253f42bad20357e7e4432087aca747c43384 ]
When hns_assemble_skb() returns NULL to skb, no error return code of hns_nic_clear_all_rx_fetch() is assigned. To fix this bug, ret is assigned with -ENOMEM in this case.
Reported-by: TOTE Robot oslab@tsinghua.edu.cn Signed-off-by: Jia-Ju Bai baijiaju1990@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns/hns_enet.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_enet.c b/drivers/net/ethernet/hisilicon/hns/hns_enet.c index 858cb293152a..8bce5f1510be 100644 --- a/drivers/net/ethernet/hisilicon/hns/hns_enet.c +++ b/drivers/net/ethernet/hisilicon/hns/hns_enet.c @@ -1663,8 +1663,10 @@ static int hns_nic_clear_all_rx_fetch(struct net_device *ndev) for (j = 0; j < fetch_num; j++) { /* alloc one skb and init */ skb = hns_assemble_skb(ndev); - if (!skb) + if (!skb) { + ret = -ENOMEM; goto out; + } rd = &tx_ring_data(priv, skb->queue_mapping); hns_nic_net_xmit_hw(ndev, skb, rd);
From: Jia-Ju Bai baijiaju1990@gmail.com
[ Upstream commit 62765d39553cfd1ad340124fe1e280450e8c89e2 ]
When priv->rx_skbuff or priv->tx_skbuff is NULL, no error return code of uhdlc_init() is assigned. To fix this bug, ret is assigned with -ENOMEM in these cases.
Reported-by: TOTE Robot oslab@tsinghua.edu.cn Signed-off-by: Jia-Ju Bai baijiaju1990@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wan/fsl_ucc_hdlc.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wan/fsl_ucc_hdlc.c b/drivers/net/wan/fsl_ucc_hdlc.c index dca97cd7c4e7..7eac6a3e1cde 100644 --- a/drivers/net/wan/fsl_ucc_hdlc.c +++ b/drivers/net/wan/fsl_ucc_hdlc.c @@ -204,14 +204,18 @@ static int uhdlc_init(struct ucc_hdlc_private *priv) priv->rx_skbuff = kcalloc(priv->rx_ring_size, sizeof(*priv->rx_skbuff), GFP_KERNEL); - if (!priv->rx_skbuff) + if (!priv->rx_skbuff) { + ret = -ENOMEM; goto free_ucc_pram; + }
priv->tx_skbuff = kcalloc(priv->tx_ring_size, sizeof(*priv->tx_skbuff), GFP_KERNEL); - if (!priv->tx_skbuff) + if (!priv->tx_skbuff) { + ret = -ENOMEM; goto free_rx_skbuff; + }
priv->skb_curtx = 0; priv->skb_dirtytx = 0;
From: Paul Cercueil paul@crapouillou.net
[ Upstream commit 2e2696223676d56db1a93acfca722c1b96cd552d ]
The second IRQ line really is optional, so use platform_get_irq_optional() to obtain it.
Signed-off-by: Paul Cercueil paul@crapouillou.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/davicom/dm9000.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/davicom/dm9000.c b/drivers/net/ethernet/davicom/dm9000.c index a95e95ce9438..252adfa5d837 100644 --- a/drivers/net/ethernet/davicom/dm9000.c +++ b/drivers/net/ethernet/davicom/dm9000.c @@ -1507,7 +1507,7 @@ dm9000_probe(struct platform_device *pdev) goto out; }
- db->irq_wake = platform_get_irq(pdev, 1); + db->irq_wake = platform_get_irq_optional(pdev, 1); if (db->irq_wake >= 0) { dev_dbg(db->dev, "wakeup irq %d\n", db->irq_wake);
From: Alex Marginean alexandru.marginean@nxp.com
[ Upstream commit 1b2395dfff5bb40228a187f21f577cd90673d344 ]
On LS1028A, the MAC RX FIFO defaults to the value 2, which is too high and may lead to RX lock-up under traffic at a rate higher than 6 Gbps. Set it to 1 instead, as recommended by the hardware design team and by later versions of the ENETC block guide.
Signed-off-by: Alex Marginean alexandru.marginean@nxp.com Reviewed-by: Claudiu Manoil claudiu.manoil@nxp.com Reviewed-by: Jason Liu jason.hui.liu@nxp.com Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/freescale/enetc/enetc_hw.h | 2 ++ drivers/net/ethernet/freescale/enetc/enetc_pf.c | 6 ++++++ 2 files changed, 8 insertions(+)
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_hw.h b/drivers/net/ethernet/freescale/enetc/enetc_hw.h index de0d20b0f489..00938f7960a4 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc_hw.h +++ b/drivers/net/ethernet/freescale/enetc/enetc_hw.h @@ -234,6 +234,8 @@ enum enetc_bdr_type {TX, RX}; #define ENETC_PM0_MAXFRM 0x8014 #define ENETC_SET_TX_MTU(val) ((val) << 16) #define ENETC_SET_MAXFRM(val) ((val) & 0xffff) +#define ENETC_PM0_RX_FIFO 0x801c +#define ENETC_PM0_RX_FIFO_VAL 1
#define ENETC_PM_IMDIO_BASE 0x8030
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_pf.c b/drivers/net/ethernet/freescale/enetc/enetc_pf.c index ca02f033bea2..224fc37a6757 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc_pf.c +++ b/drivers/net/ethernet/freescale/enetc/enetc_pf.c @@ -490,6 +490,12 @@ static void enetc_configure_port_mac(struct enetc_hw *hw)
enetc_port_wr(hw, ENETC_PM1_CMD_CFG, ENETC_PM0_CMD_PHY_TX_EN | ENETC_PM0_CMD_TXP | ENETC_PM0_PROMISC); + + /* On LS1028A, the MAC RX FIFO defaults to 2, which is too high + * and may lead to RX lock-up under traffic. Set it to 1 instead, + * as recommended by the hardware team. + */ + enetc_port_wr(hw, ENETC_PM0_RX_FIFO, ENETC_PM0_RX_FIFO_VAL); }
static void enetc_mac_config(struct enetc_hw *hw, phy_interface_t phy_mode)
From: Tong Zhang ztong0001@gmail.com
[ Upstream commit 3153724fc084d8ef640c611f269ddfb576d1dcb1 ]
dev->dev_data is set in zatm.c, calling zatm_start() will overwrite this dev->dev_data in uPD98402_start() and a subsequent PRIV(dev)->lock (i.e dev->phy_data->lock) will result in a null-ptr-dereference.
I believe this is a typo and what it actually want to do is to allocate phy_data instead of dev_data.
Signed-off-by: Tong Zhang ztong0001@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/atm/uPD98402.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/atm/uPD98402.c b/drivers/atm/uPD98402.c index 7850758b5bb8..239852d85558 100644 --- a/drivers/atm/uPD98402.c +++ b/drivers/atm/uPD98402.c @@ -211,7 +211,7 @@ static void uPD98402_int(struct atm_dev *dev) static int uPD98402_start(struct atm_dev *dev) { DPRINTK("phy_start\n"); - if (!(dev->dev_data = kmalloc(sizeof(struct uPD98402_priv),GFP_KERNEL))) + if (!(dev->phy_data = kmalloc(sizeof(struct uPD98402_priv),GFP_KERNEL))) return -ENOMEM; spin_lock_init(&PRIV(dev)->lock); memset(&PRIV(dev)->sonet_stats,0,sizeof(struct k_sonet_stats));
From: Tong Zhang ztong0001@gmail.com
[ Upstream commit 4416e98594dc04590ebc498fc4e530009535c511 ]
this one is similar to the phy_data allocation fix in uPD98402, the driver allocate the idt77105_priv and store to dev_data but later dereference using dev->dev_data, which will cause null-ptr-dereference.
fix this issue by changing dev_data to phy_data so that PRIV(dev) can work correctly.
Signed-off-by: Tong Zhang ztong0001@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/atm/idt77105.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/atm/idt77105.c b/drivers/atm/idt77105.c index 3c081b6171a8..bfca7b8a6f31 100644 --- a/drivers/atm/idt77105.c +++ b/drivers/atm/idt77105.c @@ -262,7 +262,7 @@ static int idt77105_start(struct atm_dev *dev) { unsigned long flags;
- if (!(dev->dev_data = kmalloc(sizeof(struct idt77105_priv),GFP_KERNEL))) + if (!(dev->phy_data = kmalloc(sizeof(struct idt77105_priv),GFP_KERNEL))) return -ENOMEM; PRIV(dev)->dev = dev; spin_lock_irqsave(&idt77105_priv_lock, flags); @@ -337,7 +337,7 @@ static int idt77105_stop(struct atm_dev *dev) else idt77105_all = walk->next; dev->phy = NULL; - dev->dev_data = NULL; + dev->phy_data = NULL; kfree(walk); break; }
From: Paulo Alcantara pc@cjr.nz
[ Upstream commit e3d100eae44b42f309c1366efb8397368f1cf8ed ]
A customer has reported that their dmesg were being flooded by
CIFS: VFS: \server Cancelling wait for mid xxx cmd: a CIFS: VFS: \server Cancelling wait for mid yyy cmd: b CIFS: VFS: \server Cancelling wait for mid zzz cmd: c
because some processes that were performing statfs(2) on the share had been interrupted due to their automount setup when certain users logged in and out.
Change it to FYI as they should be mostly informative rather than error messages.
Signed-off-by: Paulo Alcantara (SUSE) pc@cjr.nz Reviewed-by: Aurelien Aptel aaptel@suse.com Reviewed-by: Ronnie Sahlberg lsahlber@redhat.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/transport.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c index 64fccb8809ec..13d685f0ac8e 100644 --- a/fs/cifs/transport.c +++ b/fs/cifs/transport.c @@ -1185,7 +1185,7 @@ compound_send_recv(const unsigned int xid, struct cifs_ses *ses, } if (rc != 0) { for (; i < num_rqst; i++) { - cifs_server_dbg(VFS, "Cancelling wait for mid %llu cmd: %d\n", + cifs_server_dbg(FYI, "Cancelling wait for mid %llu cmd: %d\n", midQ[i]->mid, le16_to_cpu(midQ[i]->command)); send_cancel(server, &rqst[i], midQ[i]); spin_lock(&GlobalMid_Lock);
From: Paul Cercueil paul@crapouillou.net
[ Upstream commit 5fbecd2389f48e1415799c63130d0cdce1cf3f60 ]
Add support for the interrupt controller found in the JZ4760 SoC, which works exactly like the one in the JZ4770.
Signed-off-by: Paul Cercueil paul@crapouillou.net Signed-off-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20210307172014.73481-2-paul@crapouillou.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/irqchip/irq-ingenic-tcu.c | 1 + drivers/irqchip/irq-ingenic.c | 1 + 2 files changed, 2 insertions(+)
diff --git a/drivers/irqchip/irq-ingenic-tcu.c b/drivers/irqchip/irq-ingenic-tcu.c index 7a7222d4c19c..b938d1d04d96 100644 --- a/drivers/irqchip/irq-ingenic-tcu.c +++ b/drivers/irqchip/irq-ingenic-tcu.c @@ -179,5 +179,6 @@ static int __init ingenic_tcu_irq_init(struct device_node *np, } IRQCHIP_DECLARE(jz4740_tcu_irq, "ingenic,jz4740-tcu", ingenic_tcu_irq_init); IRQCHIP_DECLARE(jz4725b_tcu_irq, "ingenic,jz4725b-tcu", ingenic_tcu_irq_init); +IRQCHIP_DECLARE(jz4760_tcu_irq, "ingenic,jz4760-tcu", ingenic_tcu_irq_init); IRQCHIP_DECLARE(jz4770_tcu_irq, "ingenic,jz4770-tcu", ingenic_tcu_irq_init); IRQCHIP_DECLARE(x1000_tcu_irq, "ingenic,x1000-tcu", ingenic_tcu_irq_init); diff --git a/drivers/irqchip/irq-ingenic.c b/drivers/irqchip/irq-ingenic.c index b61a8901ef72..ea36bb00be80 100644 --- a/drivers/irqchip/irq-ingenic.c +++ b/drivers/irqchip/irq-ingenic.c @@ -155,6 +155,7 @@ static int __init intc_2chip_of_init(struct device_node *node, { return ingenic_intc_of_init(node, 2); } +IRQCHIP_DECLARE(jz4760_intc, "ingenic,jz4760-intc", intc_2chip_of_init); IRQCHIP_DECLARE(jz4770_intc, "ingenic,jz4770-intc", intc_2chip_of_init); IRQCHIP_DECLARE(jz4775_intc, "ingenic,jz4775-intc", intc_2chip_of_init); IRQCHIP_DECLARE(jz4780_intc, "ingenic,jz4780-intc", intc_2chip_of_init);
From: Masahiro Yamada masahiroy@kernel.org
[ Upstream commit 993bdde94547887faaad4a97f0b0480a6da271c3 ]
'make image_name' needs include/config/auto.conf to show the correct output because KBUILD_IMAGE depends on CONFIG options, but should not attempt to resync the configuration.
Signed-off-by: Masahiro Yamada masahiroy@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- Makefile | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/Makefile b/Makefile index 824d15c14be0..ace7d6131fa1 100644 --- a/Makefile +++ b/Makefile @@ -265,7 +265,8 @@ no-dot-config-targets := $(clean-targets) \ $(version_h) headers headers_% archheaders archscripts \ %asm-generic kernelversion %src-pkg dt_binding_check \ outputmakefile -no-sync-config-targets := $(no-dot-config-targets) %install kernelrelease +no-sync-config-targets := $(no-dot-config-targets) %install kernelrelease \ + image_name single-targets := %.a %.i %.ko %.lds %.ll %.lst %.mod %.o %.s %.symtypes %/
config-build :=
From: Jiri Slaby jslaby@suse.cz
[ Upstream commit b3d9fc1436808a4ef9927e558b3415e728e710c5 ]
There is a test in Kconfig which takes inverted value of a compiler check: * config CC_HAS_INT128 def_bool !$(cc-option,$(m64-flag) -D__SIZEOF_INT128__=0)
This results in CC_HAS_INT128 not being in super-config generated by dummy-tools. So take this into account in the gcc script.
Signed-off-by: Jiri Slaby jslaby@suse.cz Signed-off-by: Masahiro Yamada masahiroy@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- scripts/dummy-tools/gcc | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/scripts/dummy-tools/gcc b/scripts/dummy-tools/gcc index 5c113cad5601..0d0589cf8184 100755 --- a/scripts/dummy-tools/gcc +++ b/scripts/dummy-tools/gcc @@ -85,3 +85,8 @@ if arg_contain -print-file-name=plugin "$@"; then echo $plugin_dir exit 0 fi + +# inverted return value +if arg_contain -D__SIZEOF_INT128__=0 "$@"; then + exit 1 +fi
From: Wei Yongjun weiyongjun1@huawei.com
[ Upstream commit eeb05595d22c19c8f814ff893dcf88ec277a2365 ]
Fix to return negative error code -ENOMEM from the blk_alloc_queue() and dma_alloc_coherent() error handling cases instead of 0, as done elsewhere in this function.
Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Wei Yongjun weiyongjun1@huawei.com Link: https://lore.kernel.org/r/20210308123501.2573816-1-weiyongjun1@huawei.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/umem.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/block/umem.c b/drivers/block/umem.c index 2b95d7b33b91..5eb44e4a91ee 100644 --- a/drivers/block/umem.c +++ b/drivers/block/umem.c @@ -877,6 +877,7 @@ static int mm_pci_probe(struct pci_dev *dev, const struct pci_device_id *id) if (card->mm_pages[0].desc == NULL || card->mm_pages[1].desc == NULL) { dev_printk(KERN_ERR, &card->dev->dev, "alloc failed\n"); + ret = -ENOMEM; goto failed_alloc; } reset_page(&card->mm_pages[0]); @@ -888,8 +889,10 @@ static int mm_pci_probe(struct pci_dev *dev, const struct pci_device_id *id) spin_lock_init(&card->lock);
card->queue = blk_alloc_queue(NUMA_NO_NODE); - if (!card->queue) + if (!card->queue) { + ret = -ENOMEM; goto failed_alloc; + }
tasklet_init(&card->tasklet, process_page, (unsigned long)card);
From: Rob Gardner rob.gardner@oracle.com
[ Upstream commit e5e8b80d352ec999d2bba3ea584f541c83f4ca3f ]
is_no_fault_exception() has two bugs which were discovered via random opcode testing with stress-ng. Both are caused by improper filtering of opcodes.
The first bug can be triggered by a floating point store with a no-fault ASI, for instance "sta %f0, [%g0] #ASI_PNF", opcode C1A01040.
The code first tests op3[5] (0x1000000), which denotes a floating point instruction, and then tests op3[2] (0x200000), which denotes a store instruction. But these bits are not mutually exclusive, and the above mentioned opcode has both bits set. The intent is to filter out stores, so the test for stores must be done first in order to have any effect.
The second bug can be triggered by a floating point load with one of the invalid ASI values 0x8e or 0x8f, which pass this check in is_no_fault_exception(): if ((asi & 0xf2) == ASI_PNF)
An example instruction is "ldqa [%l7 + %o7] #ASI 0x8f, %f38", opcode CF95D1EF. Asi values greater than 0x8b (ASI_SNFL) are fatal in handle_ldf_stq(), and is_no_fault_exception() must not allow these invalid asi values to make it that far.
In both of these cases, handle_ldf_stq() reacts by calling sun4v_data_access_exception() or spitfire_data_access_exception(), which call is_no_fault_exception() and results in an infinite recursion.
Signed-off-by: Rob Gardner rob.gardner@oracle.com Tested-by: Anatoly Pugachev matorola@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- arch/sparc/kernel/traps_64.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/arch/sparc/kernel/traps_64.c b/arch/sparc/kernel/traps_64.c index d92e5eaa4c1d..a850dccd78ea 100644 --- a/arch/sparc/kernel/traps_64.c +++ b/arch/sparc/kernel/traps_64.c @@ -275,14 +275,13 @@ bool is_no_fault_exception(struct pt_regs *regs) asi = (regs->tstate >> 24); /* saved %asi */ else asi = (insn >> 5); /* immediate asi */ - if ((asi & 0xf2) == ASI_PNF) { - if (insn & 0x1000000) { /* op3[5:4]=3 */ - handle_ldf_stq(insn, regs); - return true; - } else if (insn & 0x200000) { /* op3[2], stores */ + if ((asi & 0xf6) == ASI_PNF) { + if (insn & 0x200000) /* op3[2], stores */ return false; - } - handle_ld_nf(insn, regs); + if (insn & 0x1000000) /* op3[5:4]=3 (fp) */ + handle_ldf_stq(insn, regs); + else + handle_ld_nf(insn, regs); return true; } }
From: Tomer Tayar ttayar@habana.ai
[ Upstream commit 27ac5aada024e0821c86540ad18f37edadd77d5e ]
The refcount of the "hl_fpriv" structure is not used for the control device, and thus hl_hpriv_put() is not called when releasing this device. This results with no call to put_pid(), so add it explicitly in hl_device_release_ctrl().
Signed-off-by: Tomer Tayar ttayar@habana.ai Reviewed-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/misc/habanalabs/common/device.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/misc/habanalabs/common/device.c b/drivers/misc/habanalabs/common/device.c index 69d04eca767f..6785329eee27 100644 --- a/drivers/misc/habanalabs/common/device.c +++ b/drivers/misc/habanalabs/common/device.c @@ -117,6 +117,8 @@ static int hl_device_release_ctrl(struct inode *inode, struct file *filp) list_del(&hpriv->dev_node); mutex_unlock(&hdev->fpriv_list_lock);
+ put_pid(hpriv->taskpid); + kfree(hpriv);
return 0;
From: Tomer Tayar ttayar@habana.ai
[ Upstream commit ffd123fe839700366ea79b19ac3683bf56817372 ]
A device can be removed from the PCI subsystem while a process holds the file descriptor opened. In such a case, the driver attempts to kill the process, but as it is still possible that the process will be alive after this step, the device removal will complete, and we will end up with a process object that points to a device object which was already released.
To prevent the usage of this released device object, disable the following file operations for this process object, and avoid the cleanup steps when the file descriptor is eventually closed. The latter is just a best effort, as memory leak will occur.
Signed-off-by: Tomer Tayar ttayar@habana.ai Reviewed-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/misc/habanalabs/common/device.c | 40 ++++++++++++++++--- .../misc/habanalabs/common/habanalabs_ioctl.c | 12 ++++++ 2 files changed, 46 insertions(+), 6 deletions(-)
diff --git a/drivers/misc/habanalabs/common/device.c b/drivers/misc/habanalabs/common/device.c index 6785329eee27..82c0306a9210 100644 --- a/drivers/misc/habanalabs/common/device.c +++ b/drivers/misc/habanalabs/common/device.c @@ -93,12 +93,19 @@ void hl_hpriv_put(struct hl_fpriv *hpriv) static int hl_device_release(struct inode *inode, struct file *filp) { struct hl_fpriv *hpriv = filp->private_data; + struct hl_device *hdev = hpriv->hdev; + + filp->private_data = NULL; + + if (!hdev) { + pr_crit("Closing FD after device was removed. Memory leak will occur and it is advised to reboot.\n"); + put_pid(hpriv->taskpid); + return 0; + }
hl_cb_mgr_fini(hpriv->hdev, &hpriv->cb_mgr); hl_ctx_mgr_fini(hpriv->hdev, &hpriv->ctx_mgr);
- filp->private_data = NULL; - hl_hpriv_put(hpriv);
return 0; @@ -107,16 +114,19 @@ static int hl_device_release(struct inode *inode, struct file *filp) static int hl_device_release_ctrl(struct inode *inode, struct file *filp) { struct hl_fpriv *hpriv = filp->private_data; - struct hl_device *hdev; + struct hl_device *hdev = hpriv->hdev;
filp->private_data = NULL;
- hdev = hpriv->hdev; + if (!hdev) { + pr_err("Closing FD after device was removed\n"); + goto out; + }
mutex_lock(&hdev->fpriv_list_lock); list_del(&hpriv->dev_node); mutex_unlock(&hdev->fpriv_list_lock); - +out: put_pid(hpriv->taskpid);
kfree(hpriv); @@ -136,8 +146,14 @@ static int hl_device_release_ctrl(struct inode *inode, struct file *filp) static int hl_mmap(struct file *filp, struct vm_area_struct *vma) { struct hl_fpriv *hpriv = filp->private_data; + struct hl_device *hdev = hpriv->hdev; unsigned long vm_pgoff;
+ if (!hdev) { + pr_err_ratelimited("Trying to mmap after device was removed! Please close FD\n"); + return -ENODEV; + } + vm_pgoff = vma->vm_pgoff; vma->vm_pgoff = HL_MMAP_OFFSET_VALUE_GET(vm_pgoff);
@@ -884,6 +900,16 @@ static int device_kill_open_processes(struct hl_device *hdev, u32 timeout) return -EBUSY; }
+static void device_disable_open_processes(struct hl_device *hdev) +{ + struct hl_fpriv *hpriv; + + mutex_lock(&hdev->fpriv_list_lock); + list_for_each_entry(hpriv, &hdev->fpriv_list, dev_node) + hpriv->hdev = NULL; + mutex_unlock(&hdev->fpriv_list_lock); +} + /* * hl_device_reset - reset the device * @@ -1538,8 +1564,10 @@ void hl_device_fini(struct hl_device *hdev) HL_PENDING_RESET_LONG_SEC);
rc = device_kill_open_processes(hdev, HL_PENDING_RESET_LONG_SEC); - if (rc) + if (rc) { dev_crit(hdev->dev, "Failed to kill all open processes\n"); + device_disable_open_processes(hdev); + }
hl_cb_pool_fini(hdev);
diff --git a/drivers/misc/habanalabs/common/habanalabs_ioctl.c b/drivers/misc/habanalabs/common/habanalabs_ioctl.c index d25892d61ec9..0805e1173d54 100644 --- a/drivers/misc/habanalabs/common/habanalabs_ioctl.c +++ b/drivers/misc/habanalabs/common/habanalabs_ioctl.c @@ -5,6 +5,8 @@ * All Rights Reserved. */
+#define pr_fmt(fmt) "habanalabs: " fmt + #include <uapi/misc/habanalabs.h> #include "habanalabs.h"
@@ -667,6 +669,11 @@ long hl_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) const struct hl_ioctl_desc *ioctl = NULL; unsigned int nr = _IOC_NR(cmd);
+ if (!hdev) { + pr_err_ratelimited("Sending ioctl after device was removed! Please close FD\n"); + return -ENODEV; + } + if ((nr >= HL_COMMAND_START) && (nr < HL_COMMAND_END)) { ioctl = &hl_ioctls[nr]; } else { @@ -685,6 +692,11 @@ long hl_ioctl_control(struct file *filep, unsigned int cmd, unsigned long arg) const struct hl_ioctl_desc *ioctl = NULL; unsigned int nr = _IOC_NR(cmd);
+ if (!hdev) { + pr_err_ratelimited("Sending ioctl after device was removed! Please close FD\n"); + return -ENODEV; + } + if (nr == _IOC_NR(HL_IOCTL_INFO)) { ioctl = &hl_ioctls_control[nr]; } else {
From: Julian Braha julianbraha@gmail.com
[ Upstream commit 7c36194558cf49a86a53b5f60db8046c5e3013ae ]
When RTLLIB_CRYPTO_TKIP is enabled and CRYPTO is disabled, Kbuild gives the following warning:
WARNING: unmet direct dependencies detected for CRYPTO_MICHAEL_MIC Depends on [n]: CRYPTO [=n] Selected by [m]: - RTLLIB_CRYPTO_TKIP [=m] && STAGING [=y] && RTLLIB [=m]
WARNING: unmet direct dependencies detected for CRYPTO_LIB_ARC4 Depends on [n]: CRYPTO [=n] Selected by [m]: - RTLLIB_CRYPTO_TKIP [=m] && STAGING [=y] && RTLLIB [=m] - RTLLIB_CRYPTO_WEP [=m] && STAGING [=y] && RTLLIB [=m]
This is because RTLLIB_CRYPTO_TKIP selects CRYPTO_MICHAEL_MIC and CRYPTO_LIB_ARC4, without depending on or selecting CRYPTO, despite those config options being subordinate to CRYPTO.
Acked-by: Randy Dunlap rdunlap@infradead.org Signed-off-by: Julian Braha julianbraha@gmail.com Link: https://lore.kernel.org/r/20210222180607.399753-1-julianbraha@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/staging/rtl8192e/Kconfig | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/staging/rtl8192e/Kconfig b/drivers/staging/rtl8192e/Kconfig index 03fcc23516fd..6e7d84ac06f5 100644 --- a/drivers/staging/rtl8192e/Kconfig +++ b/drivers/staging/rtl8192e/Kconfig @@ -26,6 +26,7 @@ config RTLLIB_CRYPTO_CCMP config RTLLIB_CRYPTO_TKIP tristate "Support for rtllib TKIP crypto" depends on RTLLIB + select CRYPTO select CRYPTO_LIB_ARC4 select CRYPTO_MICHAEL_MIC default y
From: Peter Zijlstra peterz@infradead.org
[ Upstream commit d5b0e0677bfd5efd17c5bbb00156931f0d41cb85 ]
Jakub reported that:
static struct net_device *rtl8139_init_board(struct pci_dev *pdev) { ... u64_stats_init(&tp->rx_stats.syncp); u64_stats_init(&tp->tx_stats.syncp); ... }
results in lockdep getting confused between the RX and TX stats lock. This is because u64_stats_init() is an inline calling seqcount_init(), which is a macro using a static variable to generate a lockdep class.
By wrapping that in an inline, we negate the effect of the macro and fold the static key variable, hence the confusion.
Fix by also making u64_stats_init() a macro for the case where it matters, leaving the other case an inline for argument validation etc.
Reported-by: Jakub Kicinski kuba@kernel.org Debugged-by: "Ahmed S. Darwish" a.darwish@linutronix.de Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Tested-by: "Erhard F." erhard_f@mailbox.org Link: https://lkml.kernel.org/r/YEXicy6+9MksdLZh@hirez.programming.kicks-ass.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/u64_stats_sync.h | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/include/linux/u64_stats_sync.h b/include/linux/u64_stats_sync.h index c6abb79501b3..e81856c0ba13 100644 --- a/include/linux/u64_stats_sync.h +++ b/include/linux/u64_stats_sync.h @@ -115,12 +115,13 @@ static inline void u64_stats_inc(u64_stats_t *p) } #endif
+#if BITS_PER_LONG == 32 && defined(CONFIG_SMP) +#define u64_stats_init(syncp) seqcount_init(&(syncp)->seq) +#else static inline void u64_stats_init(struct u64_stats_sync *syncp) { -#if BITS_PER_LONG == 32 && defined(CONFIG_SMP) - seqcount_init(&syncp->seq); -#endif } +#endif
static inline void u64_stats_update_begin(struct u64_stats_sync *syncp) {
From: Mark Brown broonie@kernel.org
[ Upstream commit 07e644885bf6727a48db109fad053cb43f3c9859 ]
We track if sve-ptrace encountered a failure in a variable but don't actually use that value when we exit the program, do so.
Signed-off-by: Mark Brown broonie@kernel.org Link: https://lore.kernel.org/r/20210309190304.39169-1-broonie@kernel.org Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/arm64/fp/sve-ptrace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/arm64/fp/sve-ptrace.c b/tools/testing/selftests/arm64/fp/sve-ptrace.c index b2282be6f938..612d3899614a 100644 --- a/tools/testing/selftests/arm64/fp/sve-ptrace.c +++ b/tools/testing/selftests/arm64/fp/sve-ptrace.c @@ -332,5 +332,5 @@ int main(void)
ksft_print_cnts();
- return 0; + return ret; }
From: satya priya skakit@codeaurora.org
[ Upstream commit e610e072c87a30658479a7b4c51e1801cb3f450c ]
Correct the REGULATOR_LINEAR_RANGE and n_voltges for pmic5_hfsmps515 buck.
Signed-off-by: satya priya skakit@codeaurora.org Link: https://lore.kernel.org/r/1614155592-14060-4-git-send-email-skakit@codeauror... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/regulator/qcom-rpmh-regulator.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/regulator/qcom-rpmh-regulator.c b/drivers/regulator/qcom-rpmh-regulator.c index 37a2abbe85c7..2351a232d90e 100644 --- a/drivers/regulator/qcom-rpmh-regulator.c +++ b/drivers/regulator/qcom-rpmh-regulator.c @@ -726,8 +726,8 @@ static const struct rpmh_vreg_hw_data pmic5_ftsmps510 = { static const struct rpmh_vreg_hw_data pmic5_hfsmps515 = { .regulator_type = VRM, .ops = &rpmh_regulator_vrm_ops, - .voltage_range = REGULATOR_LINEAR_RANGE(2800000, 0, 4, 16000), - .n_voltages = 5, + .voltage_range = REGULATOR_LINEAR_RANGE(320000, 0, 235, 16000), + .n_voltages = 236, .pmic_mode_map = pmic_mode_map_pmic5_smps, .of_map_mode = rpmh_regulator_pmic4_smps_of_map_mode, };
From: satya priya skakit@codeaurora.org
[ Upstream commit dfe03bca8db4957d4b60614ff7df4d136ba90f37 ]
Use correct buck, that is, pmic5_hfsmps515 for S1C regulator of PM8350C PMIC.
Signed-off-by: satya priya skakit@codeaurora.org Link: https://lore.kernel.org/r/1614155592-14060-7-git-send-email-skakit@codeauror... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/regulator/qcom-rpmh-regulator.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/regulator/qcom-rpmh-regulator.c b/drivers/regulator/qcom-rpmh-regulator.c index 2351a232d90e..0fd3da36f62e 100644 --- a/drivers/regulator/qcom-rpmh-regulator.c +++ b/drivers/regulator/qcom-rpmh-regulator.c @@ -901,7 +901,7 @@ static const struct rpmh_vreg_init_data pm8350_vreg_data[] = { };
static const struct rpmh_vreg_init_data pm8350c_vreg_data[] = { - RPMH_VREG("smps1", "smp%s1", &pmic5_hfsmps510, "vdd-s1"), + RPMH_VREG("smps1", "smp%s1", &pmic5_hfsmps515, "vdd-s1"), RPMH_VREG("smps2", "smp%s2", &pmic5_ftsmps510, "vdd-s2"), RPMH_VREG("smps3", "smp%s3", &pmic5_ftsmps510, "vdd-s3"), RPMH_VREG("smps4", "smp%s4", &pmic5_ftsmps510, "vdd-s4"),
From: Damien Le Moal damien.lemoal@wdc.com
[ Upstream commit faa44c69daf9ccbd5b8a1aee13e0e0d037c0be17 ]
Similarly to a single zone reset operation (REQ_OP_ZONE_RESET), execute REQ_OP_ZONE_RESET_ALL operations with REQ_SYNC set.
Signed-off-by: Damien Le Moal damien.lemoal@wdc.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-zoned.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/blk-zoned.c b/block/blk-zoned.c index df0ecf6790d3..fc925f73d694 100644 --- a/block/blk-zoned.c +++ b/block/blk-zoned.c @@ -240,7 +240,7 @@ int blkdev_zone_mgmt(struct block_device *bdev, enum req_opf op, */ if (op == REQ_OP_ZONE_RESET && blkdev_allow_reset_all_zones(bdev, sector, nr_sectors)) { - bio->bi_opf = REQ_OP_ZONE_RESET_ALL; + bio->bi_opf = REQ_OP_ZONE_RESET_ALL | REQ_SYNC; break; }
From: Qingqing Zhuo qingqing.zhuo@amd.com
[ Upstream commit 7afa0033d6f7fb8a84798ef99d1117661c4e696c ]
[Why] pflip interrupt would not be enabled promptly if a pipe is disabled and re-enabled, causing flip_done timeout error during DP compliance tests
[How] Enable pflip interrupt upon pipe enablement
Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Qingqing Zhuo qingqing.zhuo@amd.com Reviewed-by: Nicholas Kazlauskas Nicholas.Kazlauskas@amd.com Acked-by: Eryk Brol eryk.brol@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 + drivers/gpu/drm/amd/display/dc/dc.h | 1 + drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c | 11 +++++++++++ drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.h | 6 ++++++ .../gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c | 7 +++++++ drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c | 1 + drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 6 ++++++ drivers/gpu/drm/amd/display/dc/dcn21/dcn21_hubp.c | 1 + drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hubp.c | 1 + drivers/gpu/drm/amd/display/dc/inc/hw/hubp.h | 2 ++ 10 files changed, 37 insertions(+)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 1d26e82602f7..ad4afbc37d51 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -4616,6 +4616,7 @@ static int fill_dc_plane_attributes(struct amdgpu_device *adev, dc_plane_state->global_alpha_value = plane_info.global_alpha_value; dc_plane_state->dcc = plane_info.dcc; dc_plane_state->layer_index = plane_info.layer_index; // Always returns 0 + dc_plane_state->flip_int_enabled = true;
/* * Always set input transfer function, since plane state is refreshed diff --git a/drivers/gpu/drm/amd/display/dc/dc.h b/drivers/gpu/drm/amd/display/dc/dc.h index 3aedadb34548..414b44b4ced4 100644 --- a/drivers/gpu/drm/amd/display/dc/dc.h +++ b/drivers/gpu/drm/amd/display/dc/dc.h @@ -889,6 +889,7 @@ struct dc_plane_state { int layer_index;
union surface_update_flags update_flags; + bool flip_int_enabled; /* private to DC core */ struct dc_plane_status status; struct dc_context *ctx; diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c index 9e796dfeac20..714c71a5fbde 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c +++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c @@ -1257,6 +1257,16 @@ void hubp1_soft_reset(struct hubp *hubp, bool reset) REG_UPDATE(DCHUBP_CNTL, HUBP_DISABLE, reset ? 1 : 0); }
+void hubp1_set_flip_int(struct hubp *hubp) +{ + struct dcn10_hubp *hubp1 = TO_DCN10_HUBP(hubp); + + REG_UPDATE(DCSURF_SURFACE_FLIP_INTERRUPT, + SURFACE_FLIP_INT_MASK, 1); + + return; +} + void hubp1_init(struct hubp *hubp) { //do nothing @@ -1290,6 +1300,7 @@ static const struct hubp_funcs dcn10_hubp_funcs = { .dmdata_load = NULL, .hubp_soft_reset = hubp1_soft_reset, .hubp_in_blank = hubp1_in_blank, + .hubp_set_flip_int = hubp1_set_flip_int, };
/*****************************************/ diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.h b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.h index a9a6ed7f4f99..e2f2f6995935 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.h +++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.h @@ -74,6 +74,7 @@ SRI(DCSURF_SURFACE_EARLIEST_INUSE_C, HUBPREQ, id),\ SRI(DCSURF_SURFACE_EARLIEST_INUSE_HIGH_C, HUBPREQ, id),\ SRI(DCSURF_SURFACE_CONTROL, HUBPREQ, id),\ + SRI(DCSURF_SURFACE_FLIP_INTERRUPT, HUBPREQ, id),\ SRI(HUBPRET_CONTROL, HUBPRET, id),\ SRI(DCN_EXPANSION_MODE, HUBPREQ, id),\ SRI(DCHUBP_REQ_SIZE_CONFIG, HUBP, id),\ @@ -183,6 +184,7 @@ uint32_t DCSURF_SURFACE_EARLIEST_INUSE_C; \ uint32_t DCSURF_SURFACE_EARLIEST_INUSE_HIGH_C; \ uint32_t DCSURF_SURFACE_CONTROL; \ + uint32_t DCSURF_SURFACE_FLIP_INTERRUPT; \ uint32_t HUBPRET_CONTROL; \ uint32_t DCN_EXPANSION_MODE; \ uint32_t DCHUBP_REQ_SIZE_CONFIG; \ @@ -332,6 +334,7 @@ HUBP_SF(HUBPREQ0_DCSURF_SURFACE_CONTROL, SECONDARY_META_SURFACE_TMZ_C, mask_sh),\ HUBP_SF(HUBPREQ0_DCSURF_SURFACE_CONTROL, SECONDARY_SURFACE_DCC_EN, mask_sh),\ HUBP_SF(HUBPREQ0_DCSURF_SURFACE_CONTROL, SECONDARY_SURFACE_DCC_IND_64B_BLK, mask_sh),\ + HUBP_SF(HUBPREQ0_DCSURF_SURFACE_FLIP_INTERRUPT, SURFACE_FLIP_INT_MASK, mask_sh),\ HUBP_SF(HUBPRET0_HUBPRET_CONTROL, DET_BUF_PLANE1_BASE_ADDRESS, mask_sh),\ HUBP_SF(HUBPRET0_HUBPRET_CONTROL, CROSSBAR_SRC_CB_B, mask_sh),\ HUBP_SF(HUBPRET0_HUBPRET_CONTROL, CROSSBAR_SRC_CR_R, mask_sh),\ @@ -531,6 +534,7 @@ type PRIMARY_SURFACE_DCC_IND_64B_BLK;\ type SECONDARY_SURFACE_DCC_EN;\ type SECONDARY_SURFACE_DCC_IND_64B_BLK;\ + type SURFACE_FLIP_INT_MASK;\ type DET_BUF_PLANE1_BASE_ADDRESS;\ type CROSSBAR_SRC_CB_B;\ type CROSSBAR_SRC_CR_R;\ @@ -777,4 +781,6 @@ void hubp1_read_state_common(struct hubp *hubp); bool hubp1_in_blank(struct hubp *hubp); void hubp1_soft_reset(struct hubp *hubp, bool reset);
+void hubp1_set_flip_int(struct hubp *hubp); + #endif diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c index 017b67b830e6..3e86e042de0d 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c +++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c @@ -2195,6 +2195,13 @@ static void dcn10_enable_plane( if (dc->debug.sanity_checks) { hws->funcs.verify_allow_pstate_change_high(dc); } + + if (!pipe_ctx->top_pipe + && pipe_ctx->plane_state + && pipe_ctx->plane_state->flip_int_enabled + && pipe_ctx->plane_res.hubp->funcs->hubp_set_flip_int) + pipe_ctx->plane_res.hubp->funcs->hubp_set_flip_int(pipe_ctx->plane_res.hubp); + }
void dcn10_program_gamut_remap(struct pipe_ctx *pipe_ctx) diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c index 0df0da2e6a4d..bec7059f6d5d 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c @@ -1597,6 +1597,7 @@ static struct hubp_funcs dcn20_hubp_funcs = { .validate_dml_output = hubp2_validate_dml_output, .hubp_in_blank = hubp1_in_blank, .hubp_soft_reset = hubp1_soft_reset, + .hubp_set_flip_int = hubp1_set_flip_int, };
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c index 09b9732424e1..077ba9cf69c5 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c @@ -1146,6 +1146,12 @@ void dcn20_enable_plane( pipe_ctx->plane_res.hubp->funcs->hubp_set_vm_system_aperture_settings(pipe_ctx->plane_res.hubp, &apt); }
+ if (!pipe_ctx->top_pipe + && pipe_ctx->plane_state + && pipe_ctx->plane_state->flip_int_enabled + && pipe_ctx->plane_res.hubp->funcs->hubp_set_flip_int) + pipe_ctx->plane_res.hubp->funcs->hubp_set_flip_int(pipe_ctx->plane_res.hubp); + // if (dc->debug.sanity_checks) { // dcn10_verify_allow_pstate_change_high(dc); // } diff --git a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_hubp.c b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_hubp.c index f9045852728f..b0c9180b808f 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_hubp.c +++ b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_hubp.c @@ -838,6 +838,7 @@ static struct hubp_funcs dcn21_hubp_funcs = { .hubp_set_flip_control_surface_gsl = hubp2_set_flip_control_surface_gsl, .hubp_init = hubp21_init, .validate_dml_output = hubp21_validate_dml_output, + .hubp_set_flip_int = hubp1_set_flip_int, };
bool hubp21_construct( diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hubp.c b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hubp.c index 88ffa9ff1ed1..f24612523248 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hubp.c +++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hubp.c @@ -511,6 +511,7 @@ static struct hubp_funcs dcn30_hubp_funcs = { .hubp_init = hubp3_init, .hubp_in_blank = hubp1_in_blank, .hubp_soft_reset = hubp1_soft_reset, + .hubp_set_flip_int = hubp1_set_flip_int, };
bool hubp3_construct( diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/hubp.h b/drivers/gpu/drm/amd/display/dc/inc/hw/hubp.h index 22f3f643ed1b..346dcd87dc10 100644 --- a/drivers/gpu/drm/amd/display/dc/inc/hw/hubp.h +++ b/drivers/gpu/drm/amd/display/dc/inc/hw/hubp.h @@ -191,6 +191,8 @@ struct hubp_funcs { bool (*hubp_in_blank)(struct hubp *hubp); void (*hubp_soft_reset)(struct hubp *hubp, bool reset);
+ void (*hubp_set_flip_int)(struct hubp *hubp); + };
#endif
From: Sung Lee sung.lee@amd.com
[ Upstream commit b0075d114c33580f5c9fa9cee8e13d06db41471b ]
[WHY & HOW] Using values provided by DF for latency may cause hangs in multi display configurations. Revert change to previous value.
Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Sung Lee sung.lee@amd.com Reviewed-by: Haonan Wang Haonan.Wang2@amd.com Acked-by: Eryk Brol eryk.brol@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c index 4caeab6a09b3..4a3df13c9e49 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c @@ -296,7 +296,7 @@ struct _vcs_dpi_soc_bounding_box_st dcn2_1_soc = { .num_banks = 8, .num_chans = 4, .vmm_page_size_bytes = 4096, - .dram_clock_change_latency_us = 11.72, + .dram_clock_change_latency_us = 23.84, .return_bus_width_bytes = 64, .dispclk_dppclk_vco_speed_mhz = 3600, .xfc_bus_transport_time_us = 4,
From: Dillon Varone dillon.varone@amd.com
[ Upstream commit d2c91285958a3e77db99c352c136af4243f8f529 ]
[Why & How] Ported logic from dcn21 for reading in pipe fusing to dcn30. Supported configurations are 1 and 6 pipes. Invalid fusing will revert to 1 pipe being enabled.
Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Dillon Varone dillon.varone@amd.com Reviewed-by: Jun Lei Jun.Lei@amd.com Acked-by: Eryk Brol eryk.brol@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../drm/amd/display/dc/dcn30/dcn30_resource.c | 31 +++++++++++++++++++ 1 file changed, 31 insertions(+)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_resource.c b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_resource.c index 5e126fdf6ec1..7ec8936346b2 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_resource.c @@ -2601,6 +2601,19 @@ static const struct resource_funcs dcn30_res_pool_funcs = { .patch_unknown_plane_state = dcn20_patch_unknown_plane_state, };
+#define CTX ctx + +#define REG(reg_name) \ + (DCN_BASE.instance[0].segment[mm ## reg_name ## _BASE_IDX] + mm ## reg_name) + +static uint32_t read_pipe_fuses(struct dc_context *ctx) +{ + uint32_t value = REG_READ(CC_DC_PIPE_DIS); + /* Support for max 6 pipes */ + value = value & 0x3f; + return value; +} + static bool dcn30_resource_construct( uint8_t num_virtual_links, struct dc *dc, @@ -2610,6 +2623,15 @@ static bool dcn30_resource_construct( struct dc_context *ctx = dc->ctx; struct irq_service_init_data init_data; struct ddc_service_init_data ddc_init_data; + uint32_t pipe_fuses = read_pipe_fuses(ctx); + uint32_t num_pipes = 0; + + if (!(pipe_fuses == 0 || pipe_fuses == 0x3e)) { + BREAK_TO_DEBUGGER(); + dm_error("DC: Unexpected fuse recipe for navi2x !\n"); + /* fault to single pipe */ + pipe_fuses = 0x3e; + }
DC_FP_START();
@@ -2739,6 +2761,15 @@ static bool dcn30_resource_construct( /* PP Lib and SMU interfaces */ init_soc_bounding_box(dc, pool);
+ num_pipes = dcn3_0_ip.max_num_dpp; + + for (i = 0; i < dcn3_0_ip.max_num_dpp; i++) + if (pipe_fuses & 1 << i) + num_pipes--; + + dcn3_0_ip.max_num_dpp = num_pipes; + dcn3_0_ip.max_num_otg = num_pipes; + dml_init_instance(&dc->dml, &dcn3_0_soc, &dcn3_0_ip, DML_PROJECT_DCN30);
/* IRQ */
From: Zhan Liu zhan.liu@amd.com
[ Upstream commit eda29602f1a8b2b32d8c8c354232d9d1ee1c064d ]
[Why] For DGPU Navi, the wm_table.nv_entries are used. These entires are not populated for DCN301 Vangogh APU, but instead wm_table.entries are.
[How] Use DCN21 Renoir style wm calculations.
Signed-off-by: Leo Li sunpeng.li@amd.com Signed-off-by: Zhan Liu zhan.liu@amd.com Reviewed-by: Dmytro Laktyushkin Dmytro.Laktyushkin@amd.com Acked-by: Zhan Liu zhan.liu@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../amd/display/dc/dcn301/dcn301_resource.c | 96 ++++++++++++++++++- 1 file changed, 95 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn301/dcn301_resource.c b/drivers/gpu/drm/amd/display/dc/dcn301/dcn301_resource.c index 35f5bf08ae96..23bc208cbfa4 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn301/dcn301_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn301/dcn301_resource.c @@ -1722,12 +1722,106 @@ static void dcn301_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *b dml_init_instance(&dc->dml, &dcn3_01_soc, &dcn3_01_ip, DML_PROJECT_DCN30); }
+static void calculate_wm_set_for_vlevel( + int vlevel, + struct wm_range_table_entry *table_entry, + struct dcn_watermarks *wm_set, + struct display_mode_lib *dml, + display_e2e_pipe_params_st *pipes, + int pipe_cnt) +{ + double dram_clock_change_latency_cached = dml->soc.dram_clock_change_latency_us; + + ASSERT(vlevel < dml->soc.num_states); + /* only pipe 0 is read for voltage and dcf/soc clocks */ + pipes[0].clks_cfg.voltage = vlevel; + pipes[0].clks_cfg.dcfclk_mhz = dml->soc.clock_limits[vlevel].dcfclk_mhz; + pipes[0].clks_cfg.socclk_mhz = dml->soc.clock_limits[vlevel].socclk_mhz; + + dml->soc.dram_clock_change_latency_us = table_entry->pstate_latency_us; + dml->soc.sr_exit_time_us = table_entry->sr_exit_time_us; + dml->soc.sr_enter_plus_exit_time_us = table_entry->sr_enter_plus_exit_time_us; + + wm_set->urgent_ns = get_wm_urgent(dml, pipes, pipe_cnt) * 1000; + wm_set->cstate_pstate.cstate_enter_plus_exit_ns = get_wm_stutter_enter_exit(dml, pipes, pipe_cnt) * 1000; + wm_set->cstate_pstate.cstate_exit_ns = get_wm_stutter_exit(dml, pipes, pipe_cnt) * 1000; + wm_set->cstate_pstate.pstate_change_ns = get_wm_dram_clock_change(dml, pipes, pipe_cnt) * 1000; + wm_set->pte_meta_urgent_ns = get_wm_memory_trip(dml, pipes, pipe_cnt) * 1000; + wm_set->frac_urg_bw_nom = get_fraction_of_urgent_bandwidth(dml, pipes, pipe_cnt) * 1000; + wm_set->frac_urg_bw_flip = get_fraction_of_urgent_bandwidth_imm_flip(dml, pipes, pipe_cnt) * 1000; + wm_set->urgent_latency_ns = get_urgent_latency(dml, pipes, pipe_cnt) * 1000; + dml->soc.dram_clock_change_latency_us = dram_clock_change_latency_cached; + +} + +static void dcn301_calculate_wm_and_dlg( + struct dc *dc, struct dc_state *context, + display_e2e_pipe_params_st *pipes, + int pipe_cnt, + int vlevel_req) +{ + int i, pipe_idx; + int vlevel, vlevel_max; + struct wm_range_table_entry *table_entry; + struct clk_bw_params *bw_params = dc->clk_mgr->bw_params; + + ASSERT(bw_params); + + vlevel_max = bw_params->clk_table.num_entries - 1; + + /* WM Set D */ + table_entry = &bw_params->wm_table.entries[WM_D]; + if (table_entry->wm_type == WM_TYPE_RETRAINING) + vlevel = 0; + else + vlevel = vlevel_max; + calculate_wm_set_for_vlevel(vlevel, table_entry, &context->bw_ctx.bw.dcn.watermarks.d, + &context->bw_ctx.dml, pipes, pipe_cnt); + /* WM Set C */ + table_entry = &bw_params->wm_table.entries[WM_C]; + vlevel = min(max(vlevel_req, 2), vlevel_max); + calculate_wm_set_for_vlevel(vlevel, table_entry, &context->bw_ctx.bw.dcn.watermarks.c, + &context->bw_ctx.dml, pipes, pipe_cnt); + /* WM Set B */ + table_entry = &bw_params->wm_table.entries[WM_B]; + vlevel = min(max(vlevel_req, 1), vlevel_max); + calculate_wm_set_for_vlevel(vlevel, table_entry, &context->bw_ctx.bw.dcn.watermarks.b, + &context->bw_ctx.dml, pipes, pipe_cnt); + + /* WM Set A */ + table_entry = &bw_params->wm_table.entries[WM_A]; + vlevel = min(vlevel_req, vlevel_max); + calculate_wm_set_for_vlevel(vlevel, table_entry, &context->bw_ctx.bw.dcn.watermarks.a, + &context->bw_ctx.dml, pipes, pipe_cnt); + + for (i = 0, pipe_idx = 0; i < dc->res_pool->pipe_count; i++) { + if (!context->res_ctx.pipe_ctx[i].stream) + continue; + + pipes[pipe_idx].clks_cfg.dispclk_mhz = get_dispclk_calculated(&context->bw_ctx.dml, pipes, pipe_cnt); + pipes[pipe_idx].clks_cfg.dppclk_mhz = get_dppclk_calculated(&context->bw_ctx.dml, pipes, pipe_cnt, pipe_idx); + + if (dc->config.forced_clocks) { + pipes[pipe_idx].clks_cfg.dispclk_mhz = context->bw_ctx.dml.soc.clock_limits[0].dispclk_mhz; + pipes[pipe_idx].clks_cfg.dppclk_mhz = context->bw_ctx.dml.soc.clock_limits[0].dppclk_mhz; + } + if (dc->debug.min_disp_clk_khz > pipes[pipe_idx].clks_cfg.dispclk_mhz * 1000) + pipes[pipe_idx].clks_cfg.dispclk_mhz = dc->debug.min_disp_clk_khz / 1000.0; + if (dc->debug.min_dpp_clk_khz > pipes[pipe_idx].clks_cfg.dppclk_mhz * 1000) + pipes[pipe_idx].clks_cfg.dppclk_mhz = dc->debug.min_dpp_clk_khz / 1000.0; + + pipe_idx++; + } + + dcn20_calculate_dlg_params(dc, context, pipes, pipe_cnt, vlevel); +} + static struct resource_funcs dcn301_res_pool_funcs = { .destroy = dcn301_destroy_resource_pool, .link_enc_create = dcn301_link_encoder_create, .panel_cntl_create = dcn301_panel_cntl_create, .validate_bandwidth = dcn30_validate_bandwidth, - .calculate_wm_and_dlg = dcn30_calculate_wm_and_dlg, + .calculate_wm_and_dlg = dcn301_calculate_wm_and_dlg, .populate_dml_pipes = dcn30_populate_dml_pipes_from_context, .acquire_idle_pipe_for_layer = dcn20_acquire_idle_pipe_for_layer, .add_stream_to_ctx = dcn30_add_stream_to_ctx,
From: Nirmoy Das nirmoy.das@amd.com
[ Upstream commit 521f04f9e3ffc73ef96c776035f8a0a31b4cdd81 ]
FB BO should not be ttm_bo_type_kernel type and amdgpufb_create_pinned_object() pins the FB BO anyway.
Signed-off-by: Nirmoy Das nirmoy.das@amd.com Acked-by: Christian König christian.koenig@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c index 0bf7d36c6686..5b716404eee1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c @@ -146,7 +146,7 @@ static int amdgpufb_create_pinned_object(struct amdgpu_fbdev *rfbdev, size = mode_cmd->pitches[0] * height; aligned_size = ALIGN(size, PAGE_SIZE); ret = amdgpu_gem_object_create(adev, aligned_size, 0, domain, flags, - ttm_bo_type_kernel, NULL, &gobj); + ttm_bo_type_device, NULL, &gobj); if (ret) { pr_err("failed to allocate framebuffer (%d)\n", aligned_size); return -ENOMEM;
From: Christian König christian.koenig@amd.com
[ Upstream commit cba2afb65cb05c3d197d17323fee4e3c9edef9cd ]
When AGP is compiled as module radeon must be compiled as module as well.
Signed-off-by: Christian König christian.koenig@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/Kconfig | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index af6c6d214d91..f0c0ccdc8a10 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -232,6 +232,7 @@ source "drivers/gpu/drm/arm/Kconfig" config DRM_RADEON tristate "ATI Radeon" depends on DRM && PCI && MMU + depends on AGP || !AGP select FW_LOADER select DRM_KMS_HELPER select DRM_TTM
From: Hannes Reinecke hare@suse.de
[ Upstream commit d95c1f4179a7f3ea8aa728ed00252a8ed0f8158f ]
We only should remove namespaces when we get fatal error back from the device or when the namespace IDs have changed. So instead of painfully masking out error numbers which might indicate that the error should be ignored we could use an NVME status code to indicated when the namespace should be removed. That simplifies the final logic and makes it less error-prone.
Signed-off-by: Hannes Reinecke hare@suse.de Reviewed-by: Keith Busch kbusch@kernel.org Reviewed-by: Sagi Grimberg sagi@grimberg.me Reviewed-by: Daniel Wagner dwagner@suse.de Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index f848ba16427e..a0f169a2d96f 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -1425,7 +1425,7 @@ static int nvme_identify_ns(struct nvme_ctrl *ctrl, unsigned nsid, goto out_free_id; }
- error = -ENODEV; + error = NVME_SC_INVALID_NS | NVME_SC_DNR; if ((*id)->ncap == 0) /* namespace not allocated or attached */ goto out_free_id;
@@ -4011,7 +4011,7 @@ static void nvme_ns_remove_by_nsid(struct nvme_ctrl *ctrl, u32 nsid) static void nvme_validate_ns(struct nvme_ns *ns, struct nvme_ns_ids *ids) { struct nvme_id_ns *id; - int ret = -ENODEV; + int ret = NVME_SC_INVALID_NS | NVME_SC_DNR;
if (test_bit(NVME_NS_DEAD, &ns->flags)) goto out; @@ -4020,7 +4020,7 @@ static void nvme_validate_ns(struct nvme_ns *ns, struct nvme_ns_ids *ids) if (ret) goto out;
- ret = -ENODEV; + ret = NVME_SC_INVALID_NS | NVME_SC_DNR; if (!nvme_ns_ids_equal(&ns->head->ids, ids)) { dev_err(ns->ctrl->device, "identifiers changed for nsid %d\n", ns->head->ns_id); @@ -4038,7 +4038,7 @@ static void nvme_validate_ns(struct nvme_ns *ns, struct nvme_ns_ids *ids) * * TODO: we should probably schedule a delayed retry here. */ - if (ret && ret != -ENOMEM && !(ret > 0 && !(ret & NVME_SC_DNR))) + if (ret > 0 && (ret & NVME_SC_DNR)) nvme_ns_remove(ns); }
From: Hannes Reinecke hare@suse.de
[ Upstream commit d3589381987ec879b03f8ce3039df57e87f05901 ]
NVME_REQ_CANCELLED is translated into -EINTR in nvme_submit_sync_cmd(), so we should be setting this flags during nvme_cancel_request() to ensure that the callers to nvme_submit_sync_cmd() will get the correct error code when the controller is reset.
Signed-off-by: Hannes Reinecke hare@suse.de Reviewed-by: Keith Busch kbusch@kernel.org Reviewed-by: Sagi Grimberg sagi@grimberg.me Reviewed-by: Chao Leng lengchao@huawei.com Reviewed-by: Daniel Wagner dwagner@suse.de Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index a0f169a2d96f..206bf0a50487 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -366,6 +366,7 @@ bool nvme_cancel_request(struct request *req, void *data, bool reserved) return true;
nvme_req(req)->status = NVME_SC_HOST_ABORTED_CMD; + nvme_req(req)->flags |= NVME_REQ_CANCELLED; blk_mq_complete_request(req); return true; }
From: Hannes Reinecke hare@suse.de
[ Upstream commit 3c7aafbc8d3d4d90430dfa126847a796c3e4ecfc ]
nvme_fc_terminate_exchange() is being called when exchanges are being deleted, and as such we should be setting the NVME_REQ_CANCELLED flag to have identical behaviour on all transports.
Signed-off-by: Hannes Reinecke hare@suse.de Reviewed-by: Keith Busch kbusch@kernel.org Reviewed-by: Sagi Grimberg sagi@grimberg.me Reviewed-by: James Smart jsmart2021@gmail.com Reviewed-by: Daniel Wagner dwagner@suse.de Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/fc.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c index 7ec6869b3e5b..0ddd2514b401 100644 --- a/drivers/nvme/host/fc.c +++ b/drivers/nvme/host/fc.c @@ -2443,6 +2443,7 @@ nvme_fc_terminate_exchange(struct request *req, void *data, bool reserved) struct nvme_fc_ctrl *ctrl = to_fc_ctrl(nctrl); struct nvme_fc_fcp_op *op = blk_mq_rq_to_pdu(req);
+ op->nreq.flags |= NVME_REQ_CANCELLED; __nvme_fc_abort_op(ctrl, op); return true; }
From: Hannes Reinecke hare@suse.de
[ Upstream commit ae3afe6308b43bbf49953101d4ba2c1c481133a8 ]
When a command has been aborted we should return NVME_SC_HOST_ABORTED_CMD to be consistent with the other transports.
Signed-off-by: Hannes Reinecke hare@suse.de Reviewed-by: Sagi Grimberg sagi@grimberg.me Reviewed-by: James Smart jsmart2021@gmail.com Reviewed-by: Daniel Wagner dwagner@suse.de Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/fc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c index 0ddd2514b401..ca75338f2367 100644 --- a/drivers/nvme/host/fc.c +++ b/drivers/nvme/host/fc.c @@ -1956,7 +1956,7 @@ nvme_fc_fcpio_done(struct nvmefc_fcp_req *req) sizeof(op->rsp_iu), DMA_FROM_DEVICE);
if (opstate == FCPOP_STATE_ABORTED) - status = cpu_to_le16(NVME_SC_HOST_PATH_ERROR << 1); + status = cpu_to_le16(NVME_SC_HOST_ABORTED_CMD << 1); else if (freq->status) { status = cpu_to_le16(NVME_SC_HOST_PATH_ERROR << 1); dev_info(ctrl->ctrl.device,
From: Chaitanya Kulkarni chaitanya.kulkarni@wdc.com
[ Upstream commit 0ec84df4953bd42c6583a555773f1d4996a061eb ]
Ensure multiple Command Sets are supported before starting to setup a ZNS namespace.
Signed-off-by: Chaitanya Kulkarni chaitanya.kulkarni@wdc.com [hch: move the check around a bit] Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 206bf0a50487..c611a17e83f0 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -4069,6 +4069,12 @@ static void nvme_validate_or_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid) nsid); break; } + if (!nvme_multi_css(ctrl)) { + dev_warn(ctrl->device, + "command set not reported for nsid: %d\n", + ns->head->ns_id); + break; + } nvme_alloc_ns(ctrl, nsid, &ids); break; default:
From: Lv Yunlong lyl2019@mail.ustc.edu.cn
[ Upstream commit abec6561fc4e0fbb19591a0b35676d8c783b5493 ]
In nvmet_rdma_write_data_done, rsp is recoverd by wc->wr_cqe and freed by nvmet_rdma_release_rsp(). But after that, pr_info() used the freed chunk's member object and could leak the freed chunk address with wc->wr_cqe by computing the offset.
Signed-off-by: Lv Yunlong lyl2019@mail.ustc.edu.cn Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/target/rdma.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c index 06b6b742bb21..6c1f3ab7649c 100644 --- a/drivers/nvme/target/rdma.c +++ b/drivers/nvme/target/rdma.c @@ -802,9 +802,8 @@ static void nvmet_rdma_write_data_done(struct ib_cq *cq, struct ib_wc *wc) nvmet_req_uninit(&rsp->req); nvmet_rdma_release_rsp(rsp); if (wc->status != IB_WC_WR_FLUSH_ERR) { - pr_info("RDMA WRITE for CQE 0x%p failed with status %s (%d).\n", - wc->wr_cqe, ib_wc_status_msg(wc->status), - wc->status); + pr_info("RDMA WRITE for CQE failed with status %s (%d).\n", + ib_wc_status_msg(wc->status), wc->status); nvmet_rdma_error_comp(queue); } return;
From: Dmitry Monakhov dmtrmonakhov@yandex-team.ru
[ Upstream commit abbb5f5929ec6c52574c430c5475c158a65c2a8c ]
This adds a quirk for Samsung PM1725a drive which fixes timeouts and I/O errors due to the fact that the controller does not properly handle the Write Zeroes command, dmesg log:
nvme nvme0: I/O 528 QID 10 timeout, aborting nvme nvme0: I/O 529 QID 10 timeout, aborting nvme nvme0: I/O 530 QID 10 timeout, aborting nvme nvme0: I/O 531 QID 10 timeout, aborting nvme nvme0: I/O 532 QID 10 timeout, aborting nvme nvme0: I/O 533 QID 10 timeout, aborting nvme nvme0: I/O 534 QID 10 timeout, aborting nvme nvme0: I/O 535 QID 10 timeout, aborting nvme nvme0: Abort status: 0x0 nvme nvme0: Abort status: 0x0 nvme nvme0: Abort status: 0x0 nvme nvme0: Abort status: 0x0 nvme nvme0: Abort status: 0x0 nvme nvme0: Abort status: 0x0 nvme nvme0: Abort status: 0x0 nvme nvme0: Abort status: 0x0 nvme nvme0: I/O 528 QID 10 timeout, reset controller nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10 nvme nvme0: Device not ready; aborting reset, CSTS=0x3 nvme nvme0: Device not ready; aborting reset, CSTS=0x3 nvme nvme0: Removing after probe failure status: -19 nvme0n1: detected capacity change from 6251233968 to 0 blk_update_request: I/O error, dev nvme0n1, sector 32776 op 0x1:(WRITE) flags 0x3000 phys_seg 6 prio class 0 blk_update_request: I/O error, dev nvme0n1, sector 113319936 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 0 Buffer I/O error on dev nvme0n1p2, logical block 1, lost async page write blk_update_request: I/O error, dev nvme0n1, sector 113319680 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0 Buffer I/O error on dev nvme0n1p2, logical block 2, lost async page write blk_update_request: I/O error, dev nvme0n1, sector 113319424 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0 Buffer I/O error on dev nvme0n1p2, logical block 3, lost async page write blk_update_request: I/O error, dev nvme0n1, sector 113319168 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0 Buffer I/O error on dev nvme0n1p2, logical block 4, lost async page write blk_update_request: I/O error, dev nvme0n1, sector 113318912 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0 Buffer I/O error on dev nvme0n1p2, logical block 5, lost async page write blk_update_request: I/O error, dev nvme0n1, sector 113318656 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0 Buffer I/O error on dev nvme0n1p2, logical block 6, lost async page write blk_update_request: I/O error, dev nvme0n1, sector 113318400 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0 blk_update_request: I/O error, dev nvme0n1, sector 113318144 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0 blk_update_request: I/O error, dev nvme0n1, sector 113317888 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
Signed-off-by: Dmitry Monakhov dmtrmonakhov@yandex-team.ru Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/pci.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 806a5d071ef6..514dfd630035 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -3242,6 +3242,7 @@ static const struct pci_device_id nvme_id_table[] = { .driver_data = NVME_QUIRK_DELAY_BEFORE_CHK_RDY, }, { PCI_DEVICE(0x144d, 0xa822), /* Samsung PM1725a */ .driver_data = NVME_QUIRK_DELAY_BEFORE_CHK_RDY | + NVME_QUIRK_DISABLE_WRITE_ZEROES| NVME_QUIRK_IGNORE_DEV_SUBNQN, }, { PCI_DEVICE(0x1987, 0x5016), /* Phison E16 */ .driver_data = NVME_QUIRK_IGNORE_DEV_SUBNQN, },
From: J. Bruce Fields bfields@redhat.com
[ Upstream commit 4f8be1f53bf615102d103c0509ffa9596f65b718 ]
The NFSv4 protocol doesn't have any notion of reomoving an attribute, so removexattr(path,"system.nfs4_acl") doesn't make sense.
There's no documented return value. Arguably it could be EOPNOTSUPP but I'm a little worried an application might take that to mean that we don't support ACLs or xattrs. How about EINVAL?
Signed-off-by: J. Bruce Fields bfields@redhat.com Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/nfs4proc.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 7eb44f37558c..95d3b8540f8e 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -5896,6 +5896,9 @@ static int __nfs4_proc_set_acl(struct inode *inode, const void *buf, size_t bufl unsigned int npages = DIV_ROUND_UP(buflen, PAGE_SIZE); int ret, i;
+ /* You can't remove system.nfs4_acl: */ + if (buflen == 0) + return -EINVAL; if (!nfs4_server_supports_acls(server)) return -EOPNOTSUPP; if (npages > ARRAY_SIZE(pages))
From: Daniel Wagner dwagner@suse.de
[ Upstream commit 9ec491447b90ad6a4056a9656b13f0b3a1e83043 ]
register_disk() suppress uevents for devices with the GENHD_FL_HIDDEN but enables uevents at the end again in order to announce disk after possible partitions are created.
When the device is removed the uevents are still on and user land sees 'remove' messages for devices which were never 'add'ed to the system.
KERNEL[95481.571887] remove /devices/virtual/nvme-fabrics/ctl/nvme5/nvme0c5n1 (block)
Let's suppress the uevents for GENHD_FL_HIDDEN by not enabling the uevents at all.
Signed-off-by: Daniel Wagner dwagner@suse.de Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Martin Wilck mwilck@suse.com Link: https://lore.kernel.org/r/20210311151917.136091-1-dwagner@suse.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/genhd.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/block/genhd.c b/block/genhd.c index 07a0ef741de1..12940cfa68af 100644 --- a/block/genhd.c +++ b/block/genhd.c @@ -658,10 +658,8 @@ static void register_disk(struct device *parent, struct gendisk *disk, kobject_create_and_add("holders", &ddev->kobj); disk->slave_dir = kobject_create_and_add("slaves", &ddev->kobj);
- if (disk->flags & GENHD_FL_HIDDEN) { - dev_set_uevent_suppress(ddev, 0); + if (disk->flags & GENHD_FL_HIDDEN) return; - }
disk_scan_partitions(disk);
From: Pavel Begunkov asml.silence@gmail.com
[ Upstream commit e1915f76a8981f0a750cf56515df42582a37c4b0 ]
As io_uring_cancel_files() and others let SQO to run between io_uring_try_cancel_requests(), SQO may generate new deferred requests, so it's safer to try to cancel them in it.
Signed-off-by: Pavel Begunkov asml.silence@gmail.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/io_uring.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c index ef078182e7ca..c3cfaa367138 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -8861,11 +8861,11 @@ static bool io_cancel_task_cb(struct io_wq_work *work, void *data) return ret; }
-static void io_cancel_defer_files(struct io_ring_ctx *ctx, +static bool io_cancel_defer_files(struct io_ring_ctx *ctx, struct task_struct *task, struct files_struct *files) { - struct io_defer_entry *de = NULL; + struct io_defer_entry *de; LIST_HEAD(list);
spin_lock_irq(&ctx->completion_lock); @@ -8876,6 +8876,8 @@ static void io_cancel_defer_files(struct io_ring_ctx *ctx, } } spin_unlock_irq(&ctx->completion_lock); + if (list_empty(&list)) + return false;
while (!list_empty(&list)) { de = list_first_entry(&list, struct io_defer_entry, list); @@ -8885,6 +8887,7 @@ static void io_cancel_defer_files(struct io_ring_ctx *ctx, io_req_complete(de->req, -ECANCELED); kfree(de); } + return true; }
static void io_uring_try_cancel_requests(struct io_ring_ctx *ctx, @@ -8912,6 +8915,7 @@ static void io_uring_try_cancel_requests(struct io_ring_ctx *ctx, } }
+ ret |= io_cancel_defer_files(ctx, task, files); ret |= io_poll_remove_all(ctx, task, files); ret |= io_kill_timeouts(ctx, task, files); ret |= io_run_task_work(); @@ -8992,8 +8996,6 @@ static void io_uring_cancel_task_requests(struct io_ring_ctx *ctx, io_sq_thread_park(ctx->sq_data); }
- io_cancel_defer_files(ctx, task, files); - io_uring_cancel_files(ctx, task, files); if (!files) io_uring_try_cancel_requests(ctx, task, NULL);
From: Fenghua Yu fenghua.yu@intel.com
[ Upstream commit 82e69a121be4b1597ce758534816a8ee04c8b761 ]
When a new mm is created, its PASID should be cleared, i.e. the PASID is initialized to its init state 0 on both ARM and X86.
This patch was part of the series introducing mm->pasid, but got lost along the way [1]. It still makes sense to have it, because each address space has a different PASID. And the IOMMU code in iommu_sva_alloc_pasid() expects the pasid field of a new mm struct to be cleared.
[1] https://lore.kernel.org/linux-iommu/YDgh53AcQHT+T3L0@otcwcpicx3.sc.intel.com...
Link: https://lkml.kernel.org/r/20210302103837.2562625-1-jean-philippe@linaro.org Signed-off-by: Fenghua Yu fenghua.yu@intel.com Signed-off-by: Jean-Philippe Brucker jean-philippe@linaro.org Reviewed-by: Tony Luck tony.luck@intel.com Cc: Jacob Pan jacob.jun.pan@intel.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/mm_types.h | 1 + kernel/fork.c | 8 ++++++++ 2 files changed, 9 insertions(+)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 07d9acb5b19c..61c77cfff8c2 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -23,6 +23,7 @@ #endif #define AT_VECTOR_SIZE (2*(AT_VECTOR_SIZE_ARCH + AT_VECTOR_SIZE_BASE + 1))
+#define INIT_PASID 0
struct address_space; struct mem_cgroup; diff --git a/kernel/fork.c b/kernel/fork.c index d66cd1014211..808af2cc8ab6 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -994,6 +994,13 @@ static void mm_init_owner(struct mm_struct *mm, struct task_struct *p) #endif }
+static void mm_init_pasid(struct mm_struct *mm) +{ +#ifdef CONFIG_IOMMU_SUPPORT + mm->pasid = INIT_PASID; +#endif +} + static void mm_init_uprobes_state(struct mm_struct *mm) { #ifdef CONFIG_UPROBES @@ -1024,6 +1031,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p, mm_init_cpumask(mm); mm_init_aio(mm); mm_init_owner(mm, p); + mm_init_pasid(mm); RCU_INIT_POINTER(mm->exe_file, NULL); mmu_notifier_subscriptions_init(mm); init_tlb_flush_pending(mm);
From: Sergei Trofimovich slyfox@gentoo.org
[ Upstream commit 0ceb1ace4a2778e34a5414e5349712ae4dc41d85 ]
In https://bugs.gentoo.org/769614 Dmitry noticed that `ptrace(PTRACE_GET_SYSCALL_INFO)` does not work for syscalls called via glibc's syscall() wrapper.
ia64 has two ways to call syscalls from userspace: via `break` and via `eps` instructions.
The difference is in stack layout:
1. `eps` creates simple stack frame: no locals, in{0..7} == out{0..8} 2. `break` uses userspace stack frame: may be locals (glibc provides one), in{0..7} == out{0..8}.
Both work fine in syscall handling cde itself.
But `ptrace(PTRACE_GET_SYSCALL_INFO)` uses unwind mechanism to re-extract syscall arguments but it does not account for locals.
The change always skips locals registers. It should not change `eps` path as kernel's handler already enforces locals=0 and fixes `break`.
Tested on v5.10 on rx3600 machine (ia64 9040 CPU).
Link: https://lkml.kernel.org/r/20210221002554.333076-1-slyfox@gentoo.org Link: https://bugs.gentoo.org/769614 Signed-off-by: Sergei Trofimovich slyfox@gentoo.org Reported-by: Dmitry V. Levin ldv@altlinux.org Cc: Oleg Nesterov oleg@redhat.com Cc: John Paul Adrian Glaubitz glaubitz@physik.fu-berlin.de Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/ia64/kernel/ptrace.c | 24 ++++++++++++++++++------ 1 file changed, 18 insertions(+), 6 deletions(-)
diff --git a/arch/ia64/kernel/ptrace.c b/arch/ia64/kernel/ptrace.c index c3490ee2daa5..e14f5653393a 100644 --- a/arch/ia64/kernel/ptrace.c +++ b/arch/ia64/kernel/ptrace.c @@ -2013,27 +2013,39 @@ static void syscall_get_set_args_cb(struct unw_frame_info *info, void *data) { struct syscall_get_set_args *args = data; struct pt_regs *pt = args->regs; - unsigned long *krbs, cfm, ndirty; + unsigned long *krbs, cfm, ndirty, nlocals, nouts; int i, count;
if (unw_unwind_to_user(info) < 0) return;
+ /* + * We get here via a few paths: + * - break instruction: cfm is shared with caller. + * syscall args are in out= regs, locals are non-empty. + * - epsinstruction: cfm is set by br.call + * locals don't exist. + * + * For both cases argguments are reachable in cfm.sof - cfm.sol. + * CFM: [ ... | sor: 17..14 | sol : 13..7 | sof : 6..0 ] + */ cfm = pt->cr_ifs; + nlocals = (cfm >> 7) & 0x7f; /* aka sol */ + nouts = (cfm & 0x7f) - nlocals; /* aka sof - sol */ krbs = (unsigned long *)info->task + IA64_RBS_OFFSET/8; ndirty = ia64_rse_num_regs(krbs, krbs + (pt->loadrs >> 19));
count = 0; if (in_syscall(pt)) - count = min_t(int, args->n, cfm & 0x7f); + count = min_t(int, args->n, nouts);
+ /* Iterate over outs. */ for (i = 0; i < count; i++) { + int j = ndirty + nlocals + i + args->i; if (args->rw) - *ia64_rse_skip_regs(krbs, ndirty + i + args->i) = - args->args[i]; + *ia64_rse_skip_regs(krbs, j) = args->args[i]; else - args->args[i] = *ia64_rse_skip_regs(krbs, - ndirty + i + args->i); + args->args[i] = *ia64_rse_skip_regs(krbs, j); }
if (!args->rw) {
From: Sergei Trofimovich slyfox@gentoo.org
[ Upstream commit 61bf318eac2c13356f7bd1c6a05421ef504ccc8a ]
In https://bugs.gentoo.org/769614 Dmitry noticed that `ptrace(PTRACE_GET_SYSCALL_INFO)` does not return error sign properly.
The bug is in mismatch between get/set errors:
static inline long syscall_get_error(struct task_struct *task, struct pt_regs *regs) { return regs->r10 == -1 ? regs->r8:0; }
static inline long syscall_get_return_value(struct task_struct *task, struct pt_regs *regs) { return regs->r8; }
static inline void syscall_set_return_value(struct task_struct *task, struct pt_regs *regs, int error, long val) { if (error) { /* error < 0, but ia64 uses > 0 return value */ regs->r8 = -error; regs->r10 = -1; } else { regs->r8 = val; regs->r10 = 0; } }
Tested on v5.10 on rx3600 machine (ia64 9040 CPU).
Link: https://lkml.kernel.org/r/20210221002554.333076-2-slyfox@gentoo.org Link: https://bugs.gentoo.org/769614 Signed-off-by: Sergei Trofimovich slyfox@gentoo.org Reported-by: Dmitry V. Levin ldv@altlinux.org Reviewed-by: Dmitry V. Levin ldv@altlinux.org Cc: John Paul Adrian Glaubitz glaubitz@physik.fu-berlin.de Cc: Oleg Nesterov oleg@redhat.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/ia64/include/asm/syscall.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/ia64/include/asm/syscall.h b/arch/ia64/include/asm/syscall.h index 6c6f16e409a8..0d23c0049301 100644 --- a/arch/ia64/include/asm/syscall.h +++ b/arch/ia64/include/asm/syscall.h @@ -32,7 +32,7 @@ static inline void syscall_rollback(struct task_struct *task, static inline long syscall_get_error(struct task_struct *task, struct pt_regs *regs) { - return regs->r10 == -1 ? regs->r8:0; + return regs->r10 == -1 ? -regs->r8:0; }
static inline long syscall_get_return_value(struct task_struct *task,
From: Peter Zijlstra peterz@infradead.org
[ Upstream commit 880cfed3a012d7863f42251791cea7fe78c39390 ]
Some static call declarations are going to be needed on low level header files. Move the necessary material to the dedicated static call types header to avoid inclusion dependency hell.
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Frederic Weisbecker frederic@kernel.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lkml.kernel.org/r/20210118141223.123667-4-frederic@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/static_call.h | 21 ------------------- include/linux/static_call_types.h | 27 +++++++++++++++++++++++++ tools/include/linux/static_call_types.h | 27 +++++++++++++++++++++++++ 3 files changed, 54 insertions(+), 21 deletions(-)
diff --git a/include/linux/static_call.h b/include/linux/static_call.h index 695da4c9b338..a2c064585c03 100644 --- a/include/linux/static_call.h +++ b/include/linux/static_call.h @@ -107,26 +107,10 @@ extern void arch_static_call_transform(void *site, void *tramp, void *func, bool
#define STATIC_CALL_TRAMP_ADDR(name) &STATIC_CALL_TRAMP(name)
-/* - * __ADDRESSABLE() is used to ensure the key symbol doesn't get stripped from - * the symbol table so that objtool can reference it when it generates the - * .static_call_sites section. - */ -#define __static_call(name) \ -({ \ - __ADDRESSABLE(STATIC_CALL_KEY(name)); \ - &STATIC_CALL_TRAMP(name); \ -}) - #else #define STATIC_CALL_TRAMP_ADDR(name) NULL #endif
- -#define DECLARE_STATIC_CALL(name, func) \ - extern struct static_call_key STATIC_CALL_KEY(name); \ - extern typeof(func) STATIC_CALL_TRAMP(name); - #define static_call_update(name, func) \ ({ \ BUILD_BUG_ON(!__same_type(*(func), STATIC_CALL_TRAMP(name))); \ @@ -174,7 +158,6 @@ extern int static_call_text_reserved(void *start, void *end); }; \ ARCH_DEFINE_STATIC_CALL_NULL_TRAMP(name)
-#define static_call(name) __static_call(name) #define static_call_cond(name) (void)__static_call(name)
#define EXPORT_STATIC_CALL(name) \ @@ -207,7 +190,6 @@ struct static_call_key { }; \ ARCH_DEFINE_STATIC_CALL_NULL_TRAMP(name)
-#define static_call(name) __static_call(name) #define static_call_cond(name) (void)__static_call(name)
static inline @@ -252,9 +234,6 @@ struct static_call_key { .func = NULL, \ }
-#define static_call(name) \ - ((typeof(STATIC_CALL_TRAMP(name))*)(STATIC_CALL_KEY(name).func)) - static inline void __static_call_nop(void) { }
/* diff --git a/include/linux/static_call_types.h b/include/linux/static_call_types.h index 89135bb35bf7..08f78b1b88b4 100644 --- a/include/linux/static_call_types.h +++ b/include/linux/static_call_types.h @@ -4,6 +4,7 @@
#include <linux/types.h> #include <linux/stringify.h> +#include <linux/compiler.h>
#define STATIC_CALL_KEY_PREFIX __SCK__ #define STATIC_CALL_KEY_PREFIX_STR __stringify(STATIC_CALL_KEY_PREFIX) @@ -32,4 +33,30 @@ struct static_call_site { s32 key; };
+#define DECLARE_STATIC_CALL(name, func) \ + extern struct static_call_key STATIC_CALL_KEY(name); \ + extern typeof(func) STATIC_CALL_TRAMP(name); + +#ifdef CONFIG_HAVE_STATIC_CALL + +/* + * __ADDRESSABLE() is used to ensure the key symbol doesn't get stripped from + * the symbol table so that objtool can reference it when it generates the + * .static_call_sites section. + */ +#define __static_call(name) \ +({ \ + __ADDRESSABLE(STATIC_CALL_KEY(name)); \ + &STATIC_CALL_TRAMP(name); \ +}) + +#define static_call(name) __static_call(name) + +#else + +#define static_call(name) \ + ((typeof(STATIC_CALL_TRAMP(name))*)(STATIC_CALL_KEY(name).func)) + +#endif /* CONFIG_HAVE_STATIC_CALL */ + #endif /* _STATIC_CALL_TYPES_H */ diff --git a/tools/include/linux/static_call_types.h b/tools/include/linux/static_call_types.h index 89135bb35bf7..08f78b1b88b4 100644 --- a/tools/include/linux/static_call_types.h +++ b/tools/include/linux/static_call_types.h @@ -4,6 +4,7 @@
#include <linux/types.h> #include <linux/stringify.h> +#include <linux/compiler.h>
#define STATIC_CALL_KEY_PREFIX __SCK__ #define STATIC_CALL_KEY_PREFIX_STR __stringify(STATIC_CALL_KEY_PREFIX) @@ -32,4 +33,30 @@ struct static_call_site { s32 key; };
+#define DECLARE_STATIC_CALL(name, func) \ + extern struct static_call_key STATIC_CALL_KEY(name); \ + extern typeof(func) STATIC_CALL_TRAMP(name); + +#ifdef CONFIG_HAVE_STATIC_CALL + +/* + * __ADDRESSABLE() is used to ensure the key symbol doesn't get stripped from + * the symbol table so that objtool can reference it when it generates the + * .static_call_sites section. + */ +#define __static_call(name) \ +({ \ + __ADDRESSABLE(STATIC_CALL_KEY(name)); \ + &STATIC_CALL_TRAMP(name); \ +}) + +#define static_call(name) __static_call(name) + +#else + +#define static_call(name) \ + ((typeof(STATIC_CALL_TRAMP(name))*)(STATIC_CALL_KEY(name).func)) + +#endif /* CONFIG_HAVE_STATIC_CALL */ + #endif /* _STATIC_CALL_TYPES_H */
From: Josh Poimboeuf jpoimboe@redhat.com
[ Upstream commit 73f44fe19d359635a607e8e8daa0da4001c1cfc2 ]
When exporting static_call_key; with EXPORT_STATIC_CALL*(), the module can use static_call_update() to change the function called. This is not desirable in general.
Not exporting static_call_key however also disallows usage of static_call(), since objtool needs the key to construct the static_call_site.
Solve this by allowing objtool to create the static_call_site using the trampoline address when it builds a module and cannot find the static_call_key symbol. The module loader will then try and map the trampole back to a key before it constructs the normal sites list.
Doing this requires a trampoline -> key associsation, so add another magic section that keeps those.
Originally-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lkml.kernel.org/r/20210127231837.ifddpn7rhwdaepiu@treble Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/static_call.h | 7 ++++ include/asm-generic/vmlinux.lds.h | 5 ++- include/linux/static_call.h | 22 +++++++++- include/linux/static_call_types.h | 27 +++++++++++- kernel/static_call.c | 55 ++++++++++++++++++++++++- tools/include/linux/static_call_types.h | 27 +++++++++++- tools/objtool/check.c | 17 +++++++- 7 files changed, 149 insertions(+), 11 deletions(-)
diff --git a/arch/x86/include/asm/static_call.h b/arch/x86/include/asm/static_call.h index c37f11999d0c..cbb67b6030f9 100644 --- a/arch/x86/include/asm/static_call.h +++ b/arch/x86/include/asm/static_call.h @@ -37,4 +37,11 @@ #define ARCH_DEFINE_STATIC_CALL_NULL_TRAMP(name) \ __ARCH_DEFINE_STATIC_CALL_TRAMP(name, "ret; nop; nop; nop; nop")
+ +#define ARCH_ADD_TRAMP_KEY(name) \ + asm(".pushsection .static_call_tramp_key, "a" \n" \ + ".long " STATIC_CALL_TRAMP_STR(name) " - . \n" \ + ".long " STATIC_CALL_KEY_STR(name) " - . \n" \ + ".popsection \n") + #endif /* _ASM_STATIC_CALL_H */ diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index 34d8287cd774..d7efbc5490e8 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -393,7 +393,10 @@ . = ALIGN(8); \ __start_static_call_sites = .; \ KEEP(*(.static_call_sites)) \ - __stop_static_call_sites = .; + __stop_static_call_sites = .; \ + __start_static_call_tramp_key = .; \ + KEEP(*(.static_call_tramp_key)) \ + __stop_static_call_tramp_key = .;
/* * Allow architectures to handle ro_after_init data on their diff --git a/include/linux/static_call.h b/include/linux/static_call.h index a2c064585c03..04e6042d252d 100644 --- a/include/linux/static_call.h +++ b/include/linux/static_call.h @@ -138,6 +138,12 @@ struct static_call_key { }; };
+/* For finding the key associated with a trampoline */ +struct static_call_tramp_key { + s32 tramp; + s32 key; +}; + extern void __static_call_update(struct static_call_key *key, void *tramp, void *func); extern int static_call_mod_init(struct module *mod); extern int static_call_text_reserved(void *start, void *end); @@ -163,11 +169,18 @@ extern int static_call_text_reserved(void *start, void *end); #define EXPORT_STATIC_CALL(name) \ EXPORT_SYMBOL(STATIC_CALL_KEY(name)); \ EXPORT_SYMBOL(STATIC_CALL_TRAMP(name)) - #define EXPORT_STATIC_CALL_GPL(name) \ EXPORT_SYMBOL_GPL(STATIC_CALL_KEY(name)); \ EXPORT_SYMBOL_GPL(STATIC_CALL_TRAMP(name))
+/* Leave the key unexported, so modules can't change static call targets: */ +#define EXPORT_STATIC_CALL_TRAMP(name) \ + EXPORT_SYMBOL(STATIC_CALL_TRAMP(name)); \ + ARCH_ADD_TRAMP_KEY(name) +#define EXPORT_STATIC_CALL_TRAMP_GPL(name) \ + EXPORT_SYMBOL_GPL(STATIC_CALL_TRAMP(name)); \ + ARCH_ADD_TRAMP_KEY(name) + #elif defined(CONFIG_HAVE_STATIC_CALL)
static inline int static_call_init(void) { return 0; } @@ -209,11 +222,16 @@ static inline int static_call_text_reserved(void *start, void *end) #define EXPORT_STATIC_CALL(name) \ EXPORT_SYMBOL(STATIC_CALL_KEY(name)); \ EXPORT_SYMBOL(STATIC_CALL_TRAMP(name)) - #define EXPORT_STATIC_CALL_GPL(name) \ EXPORT_SYMBOL_GPL(STATIC_CALL_KEY(name)); \ EXPORT_SYMBOL_GPL(STATIC_CALL_TRAMP(name))
+/* Leave the key unexported, so modules can't change static call targets: */ +#define EXPORT_STATIC_CALL_TRAMP(name) \ + EXPORT_SYMBOL(STATIC_CALL_TRAMP(name)) +#define EXPORT_STATIC_CALL_TRAMP_GPL(name) \ + EXPORT_SYMBOL_GPL(STATIC_CALL_TRAMP(name)) + #else /* Generic implementation */
static inline int static_call_init(void) { return 0; } diff --git a/include/linux/static_call_types.h b/include/linux/static_call_types.h index 08f78b1b88b4..ae5662d368b9 100644 --- a/include/linux/static_call_types.h +++ b/include/linux/static_call_types.h @@ -10,6 +10,7 @@ #define STATIC_CALL_KEY_PREFIX_STR __stringify(STATIC_CALL_KEY_PREFIX) #define STATIC_CALL_KEY_PREFIX_LEN (sizeof(STATIC_CALL_KEY_PREFIX_STR) - 1) #define STATIC_CALL_KEY(name) __PASTE(STATIC_CALL_KEY_PREFIX, name) +#define STATIC_CALL_KEY_STR(name) __stringify(STATIC_CALL_KEY(name))
#define STATIC_CALL_TRAMP_PREFIX __SCT__ #define STATIC_CALL_TRAMP_PREFIX_STR __stringify(STATIC_CALL_TRAMP_PREFIX) @@ -39,17 +40,39 @@ struct static_call_site {
#ifdef CONFIG_HAVE_STATIC_CALL
+#define __raw_static_call(name) (&STATIC_CALL_TRAMP(name)) + +#ifdef CONFIG_HAVE_STATIC_CALL_INLINE + /* * __ADDRESSABLE() is used to ensure the key symbol doesn't get stripped from * the symbol table so that objtool can reference it when it generates the * .static_call_sites section. */ +#define __STATIC_CALL_ADDRESSABLE(name) \ + __ADDRESSABLE(STATIC_CALL_KEY(name)) + #define __static_call(name) \ ({ \ - __ADDRESSABLE(STATIC_CALL_KEY(name)); \ - &STATIC_CALL_TRAMP(name); \ + __STATIC_CALL_ADDRESSABLE(name); \ + __raw_static_call(name); \ })
+#else /* !CONFIG_HAVE_STATIC_CALL_INLINE */ + +#define __STATIC_CALL_ADDRESSABLE(name) +#define __static_call(name) __raw_static_call(name) + +#endif /* CONFIG_HAVE_STATIC_CALL_INLINE */ + +#ifdef MODULE +#define __STATIC_CALL_MOD_ADDRESSABLE(name) +#define static_call_mod(name) __raw_static_call(name) +#else +#define __STATIC_CALL_MOD_ADDRESSABLE(name) __STATIC_CALL_ADDRESSABLE(name) +#define static_call_mod(name) __static_call(name) +#endif + #define static_call(name) __static_call(name)
#else diff --git a/kernel/static_call.c b/kernel/static_call.c index db914da6e785..db64c2331a32 100644 --- a/kernel/static_call.c +++ b/kernel/static_call.c @@ -12,6 +12,8 @@
extern struct static_call_site __start_static_call_sites[], __stop_static_call_sites[]; +extern struct static_call_tramp_key __start_static_call_tramp_key[], + __stop_static_call_tramp_key[];
static bool static_call_initialized;
@@ -332,10 +334,59 @@ static int __static_call_mod_text_reserved(void *start, void *end) return ret; }
+static unsigned long tramp_key_lookup(unsigned long addr) +{ + struct static_call_tramp_key *start = __start_static_call_tramp_key; + struct static_call_tramp_key *stop = __stop_static_call_tramp_key; + struct static_call_tramp_key *tramp_key; + + for (tramp_key = start; tramp_key != stop; tramp_key++) { + unsigned long tramp; + + tramp = (long)tramp_key->tramp + (long)&tramp_key->tramp; + if (tramp == addr) + return (long)tramp_key->key + (long)&tramp_key->key; + } + + return 0; +} + static int static_call_add_module(struct module *mod) { - return __static_call_init(mod, mod->static_call_sites, - mod->static_call_sites + mod->num_static_call_sites); + struct static_call_site *start = mod->static_call_sites; + struct static_call_site *stop = start + mod->num_static_call_sites; + struct static_call_site *site; + + for (site = start; site != stop; site++) { + unsigned long addr = (unsigned long)static_call_key(site); + unsigned long key; + + /* + * Is the key is exported, 'addr' points to the key, which + * means modules are allowed to call static_call_update() on + * it. + * + * Otherwise, the key isn't exported, and 'addr' points to the + * trampoline so we need to lookup the key. + * + * We go through this dance to prevent crazy modules from + * abusing sensitive static calls. + */ + if (!kernel_text_address(addr)) + continue; + + key = tramp_key_lookup(addr); + if (!key) { + pr_warn("Failed to fixup __raw_static_call() usage at: %ps\n", + static_call_addr(site)); + return -EINVAL; + } + + site->key = (key - (long)&site->key) | + (site->key & STATIC_CALL_SITE_FLAGS); + } + + return __static_call_init(mod, start, stop); }
static void static_call_del_module(struct module *mod) diff --git a/tools/include/linux/static_call_types.h b/tools/include/linux/static_call_types.h index 08f78b1b88b4..ae5662d368b9 100644 --- a/tools/include/linux/static_call_types.h +++ b/tools/include/linux/static_call_types.h @@ -10,6 +10,7 @@ #define STATIC_CALL_KEY_PREFIX_STR __stringify(STATIC_CALL_KEY_PREFIX) #define STATIC_CALL_KEY_PREFIX_LEN (sizeof(STATIC_CALL_KEY_PREFIX_STR) - 1) #define STATIC_CALL_KEY(name) __PASTE(STATIC_CALL_KEY_PREFIX, name) +#define STATIC_CALL_KEY_STR(name) __stringify(STATIC_CALL_KEY(name))
#define STATIC_CALL_TRAMP_PREFIX __SCT__ #define STATIC_CALL_TRAMP_PREFIX_STR __stringify(STATIC_CALL_TRAMP_PREFIX) @@ -39,17 +40,39 @@ struct static_call_site {
#ifdef CONFIG_HAVE_STATIC_CALL
+#define __raw_static_call(name) (&STATIC_CALL_TRAMP(name)) + +#ifdef CONFIG_HAVE_STATIC_CALL_INLINE + /* * __ADDRESSABLE() is used to ensure the key symbol doesn't get stripped from * the symbol table so that objtool can reference it when it generates the * .static_call_sites section. */ +#define __STATIC_CALL_ADDRESSABLE(name) \ + __ADDRESSABLE(STATIC_CALL_KEY(name)) + #define __static_call(name) \ ({ \ - __ADDRESSABLE(STATIC_CALL_KEY(name)); \ - &STATIC_CALL_TRAMP(name); \ + __STATIC_CALL_ADDRESSABLE(name); \ + __raw_static_call(name); \ })
+#else /* !CONFIG_HAVE_STATIC_CALL_INLINE */ + +#define __STATIC_CALL_ADDRESSABLE(name) +#define __static_call(name) __raw_static_call(name) + +#endif /* CONFIG_HAVE_STATIC_CALL_INLINE */ + +#ifdef MODULE +#define __STATIC_CALL_MOD_ADDRESSABLE(name) +#define static_call_mod(name) __raw_static_call(name) +#else +#define __STATIC_CALL_MOD_ADDRESSABLE(name) __STATIC_CALL_ADDRESSABLE(name) +#define static_call_mod(name) __static_call(name) +#endif + #define static_call(name) __static_call(name)
#else diff --git a/tools/objtool/check.c b/tools/objtool/check.c index dc24aac08edd..5c83f73ad668 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -502,8 +502,21 @@ static int create_static_call_sections(struct objtool_file *file)
key_sym = find_symbol_by_name(file->elf, tmp); if (!key_sym) { - WARN("static_call: can't find static_call_key symbol: %s", tmp); - return -1; + if (!module) { + WARN("static_call: can't find static_call_key symbol: %s", tmp); + return -1; + } + + /* + * For modules(), the key might not be exported, which + * means the module can make static calls but isn't + * allowed to change them. + * + * In that case we temporarily set the key to be the + * trampoline address. This is fixed up in + * static_call_add_module(). + */ + key_sym = insn->call_dest; } free(key_name);
From: Peter Zijlstra peterz@infradead.org
[ Upstream commit 50bf8080a94d171e843fc013abec19d8ab9f50ae ]
Provided the target address of a R_X86_64_PC32 relocation is aligned, the low two bits should be invariant between the relative and absolute value.
Turns out the address is not aligned and things go sideways, ensure we transfer the bits in the absolute form when fixing up the key address.
Fixes: 73f44fe19d35 ("static_call: Allow module use without exposing static_call_key") Reported-by: Steven Rostedt rostedt@goodmis.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Ingo Molnar mingo@kernel.org Tested-by: Steven Rostedt (VMware) rostedt@goodmis.org Link: https://lkml.kernel.org/r/20210225220351.GE4746@worktop.programming.kicks-as... Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/static_call.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/kernel/static_call.c b/kernel/static_call.c index db64c2331a32..5d53c354fbe7 100644 --- a/kernel/static_call.c +++ b/kernel/static_call.c @@ -358,7 +358,8 @@ static int static_call_add_module(struct module *mod) struct static_call_site *site;
for (site = start; site != stop; site++) { - unsigned long addr = (unsigned long)static_call_key(site); + unsigned long s_key = (long)site->key + (long)&site->key; + unsigned long addr = s_key & ~STATIC_CALL_SITE_FLAGS; unsigned long key;
/* @@ -382,8 +383,8 @@ static int static_call_add_module(struct module *mod) return -EINVAL; }
- site->key = (key - (long)&site->key) | - (site->key & STATIC_CALL_SITE_FLAGS); + key |= s_key & STATIC_CALL_SITE_FLAGS; + site->key = key - (long)&site->key; }
return __static_call_init(mod, start, stop);
From: Peter Zijlstra peterz@infradead.org
[ Upstream commit 68b1eddd421d2b16c6655eceb48918a1e896bbbc ]
It turns out that static_call_set_init() does not preserve the other flags; IOW. it clears TAIL if it was set.
Fixes: 9183c3f9ed710 ("static_call: Add inline static call infrastructure") Reported-by: Sumit Garg sumit.garg@linaro.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Jarkko Sakkinen jarkko@kernel.org Tested-by: Sumit Garg sumit.garg@linaro.org Link: https://lkml.kernel.org/r/20210318113610.519406371@infradead.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/static_call.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-)
diff --git a/kernel/static_call.c b/kernel/static_call.c index 5d53c354fbe7..49efbdc5b480 100644 --- a/kernel/static_call.c +++ b/kernel/static_call.c @@ -35,27 +35,30 @@ static inline void *static_call_addr(struct static_call_site *site) return (void *)((long)site->addr + (long)&site->addr); }
+static inline unsigned long __static_call_key(const struct static_call_site *site) +{ + return (long)site->key + (long)&site->key; +}
static inline struct static_call_key *static_call_key(const struct static_call_site *site) { - return (struct static_call_key *) - (((long)site->key + (long)&site->key) & ~STATIC_CALL_SITE_FLAGS); + return (void *)(__static_call_key(site) & ~STATIC_CALL_SITE_FLAGS); }
/* These assume the key is word-aligned. */ static inline bool static_call_is_init(struct static_call_site *site) { - return ((long)site->key + (long)&site->key) & STATIC_CALL_SITE_INIT; + return __static_call_key(site) & STATIC_CALL_SITE_INIT; }
static inline bool static_call_is_tail(struct static_call_site *site) { - return ((long)site->key + (long)&site->key) & STATIC_CALL_SITE_TAIL; + return __static_call_key(site) & STATIC_CALL_SITE_TAIL; }
static inline void static_call_set_init(struct static_call_site *site) { - site->key = ((long)static_call_key(site) | STATIC_CALL_SITE_INIT) - + site->key = (__static_call_key(site) | STATIC_CALL_SITE_INIT) - (long)&site->key; }
@@ -199,7 +202,7 @@ void __static_call_update(struct static_call_key *key, void *tramp, void *func) }
arch_static_call_transform(site_addr, NULL, func, - static_call_is_tail(site)); + static_call_is_tail(site)); } }
@@ -358,7 +361,7 @@ static int static_call_add_module(struct module *mod) struct static_call_site *site;
for (site = start; site != stop; site++) { - unsigned long s_key = (long)site->key + (long)&site->key; + unsigned long s_key = __static_call_key(site); unsigned long addr = s_key & ~STATIC_CALL_SITE_FLAGS; unsigned long key;
From: Sean Christopherson seanjc@google.com
[ Upstream commit b318e8decf6b9ef1bcf4ca06fae6d6a2cb5d5c5c ]
Fix a plethora of issues with MSR filtering by installing the resulting filter as an atomic bundle instead of updating the live filter one range at a time. The KVM_X86_SET_MSR_FILTER ioctl() isn't truly atomic, as the hardware MSR bitmaps won't be updated until the next VM-Enter, but the relevant software struct is atomically updated, which is what KVM really needs.
Similar to the approach used for modifying memslots, make arch.msr_filter a SRCU-protected pointer, do all the work configuring the new filter outside of kvm->lock, and then acquire kvm->lock only when the new filter has been vetted and created. That way vCPU readers either see the old filter or the new filter in their entirety, not some half-baked state.
Yuan Yao pointed out a use-after-free in ksm_msr_allowed() due to a TOCTOU bug, but that's just the tip of the iceberg...
- Nothing is __rcu annotated, making it nigh impossible to audit the code for correctness. - kvm_add_msr_filter() has an unpaired smp_wmb(). Violation of kernel coding style aside, the lack of a smb_rmb() anywhere casts all code into doubt. - kvm_clear_msr_filter() has a double free TOCTOU bug, as it grabs count before taking the lock. - kvm_clear_msr_filter() also has memory leak due to the same TOCTOU bug.
The entire approach of updating the live filter is also flawed. While installing a new filter is inherently racy if vCPUs are running, fixing the above issues also makes it trivial to ensure certain behavior is deterministic, e.g. KVM can provide deterministic behavior for MSRs with identical settings in the old and new filters. An atomic update of the filter also prevents KVM from getting into a half-baked state, e.g. if installing a filter fails, the existing approach would leave the filter in a half-baked state, having already committed whatever bits of the filter were already processed.
[*] https://lkml.kernel.org/r/20210312083157.25403-1-yaoyuan0329os@gmail.com
Fixes: 1a155254ff93 ("KVM: x86: Introduce MSR filtering") Cc: stable@vger.kernel.org Cc: Alexander Graf graf@amazon.com Reported-by: Yuan Yao yaoyuan0329os@gmail.com Signed-off-by: Sean Christopherson seanjc@google.com Message-Id: 20210316184436.2544875-2-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/virt/kvm/api.rst | 6 +- arch/x86/include/asm/kvm_host.h | 14 ++-- arch/x86/kvm/x86.c | 109 +++++++++++++++++++------------- 3 files changed, 78 insertions(+), 51 deletions(-)
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 5570887a2dce..66d38520e65a 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -4831,8 +4831,10 @@ If an MSR access is not permitted through the filtering, it generates a allows user space to deflect and potentially handle various MSR accesses into user space.
-If a vCPU is in running state while this ioctl is invoked, the vCPU may -experience inconsistent filtering behavior on MSR accesses. +Note, invoking this ioctl with a vCPU is running is inherently racy. However, +KVM does guarantee that vCPUs will see either the previous filter or the new +filter, e.g. MSRs with identical settings in both the old and new filter will +have deterministic behavior.
5. The kvm_run structure diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 3d6616f6f6ef..e0cfd620b293 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -894,6 +894,12 @@ enum kvm_irqchip_mode { KVM_IRQCHIP_SPLIT, /* created with KVM_CAP_SPLIT_IRQCHIP */ };
+struct kvm_x86_msr_filter { + u8 count; + bool default_allow:1; + struct msr_bitmap_range ranges[16]; +}; + #define APICV_INHIBIT_REASON_DISABLE 0 #define APICV_INHIBIT_REASON_HYPERV 1 #define APICV_INHIBIT_REASON_NESTED 2 @@ -989,14 +995,12 @@ struct kvm_arch { bool guest_can_read_msr_platform_info; bool exception_payload_enabled;
+ bool bus_lock_detection_enabled; + /* Deflect RDMSR and WRMSR to user space when they trigger a #GP */ u32 user_space_msr_mask;
- struct { - u8 count; - bool default_allow:1; - struct msr_bitmap_range ranges[16]; - } msr_filter; + struct kvm_x86_msr_filter __rcu *msr_filter;
struct kvm_pmu_event_filter *pmu_event_filter; struct task_struct *nx_lpage_recovery_thread; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index b967c1c774a1..f37f5c1430cf 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1523,35 +1523,44 @@ EXPORT_SYMBOL_GPL(kvm_enable_efer_bits);
bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type) { + struct kvm_x86_msr_filter *msr_filter; + struct msr_bitmap_range *ranges; struct kvm *kvm = vcpu->kvm; - struct msr_bitmap_range *ranges = kvm->arch.msr_filter.ranges; - u32 count = kvm->arch.msr_filter.count; - u32 i; - bool r = kvm->arch.msr_filter.default_allow; + bool allowed; int idx; + u32 i;
- /* MSR filtering not set up or x2APIC enabled, allow everything */ - if (!count || (index >= 0x800 && index <= 0x8ff)) + /* x2APIC MSRs do not support filtering. */ + if (index >= 0x800 && index <= 0x8ff) return true;
- /* Prevent collision with set_msr_filter */ idx = srcu_read_lock(&kvm->srcu);
- for (i = 0; i < count; i++) { + msr_filter = srcu_dereference(kvm->arch.msr_filter, &kvm->srcu); + if (!msr_filter) { + allowed = true; + goto out; + } + + allowed = msr_filter->default_allow; + ranges = msr_filter->ranges; + + for (i = 0; i < msr_filter->count; i++) { u32 start = ranges[i].base; u32 end = start + ranges[i].nmsrs; u32 flags = ranges[i].flags; unsigned long *bitmap = ranges[i].bitmap;
if ((index >= start) && (index < end) && (flags & type)) { - r = !!test_bit(index - start, bitmap); + allowed = !!test_bit(index - start, bitmap); break; } }
+out: srcu_read_unlock(&kvm->srcu, idx);
- return r; + return allowed; } EXPORT_SYMBOL_GPL(kvm_msr_allowed);
@@ -5315,25 +5324,34 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, return r; }
-static void kvm_clear_msr_filter(struct kvm *kvm) +static struct kvm_x86_msr_filter *kvm_alloc_msr_filter(bool default_allow) +{ + struct kvm_x86_msr_filter *msr_filter; + + msr_filter = kzalloc(sizeof(*msr_filter), GFP_KERNEL_ACCOUNT); + if (!msr_filter) + return NULL; + + msr_filter->default_allow = default_allow; + return msr_filter; +} + +static void kvm_free_msr_filter(struct kvm_x86_msr_filter *msr_filter) { u32 i; - u32 count = kvm->arch.msr_filter.count; - struct msr_bitmap_range ranges[16];
- mutex_lock(&kvm->lock); - kvm->arch.msr_filter.count = 0; - memcpy(ranges, kvm->arch.msr_filter.ranges, count * sizeof(ranges[0])); - mutex_unlock(&kvm->lock); - synchronize_srcu(&kvm->srcu); + if (!msr_filter) + return; + + for (i = 0; i < msr_filter->count; i++) + kfree(msr_filter->ranges[i].bitmap);
- for (i = 0; i < count; i++) - kfree(ranges[i].bitmap); + kfree(msr_filter); }
-static int kvm_add_msr_filter(struct kvm *kvm, struct kvm_msr_filter_range *user_range) +static int kvm_add_msr_filter(struct kvm_x86_msr_filter *msr_filter, + struct kvm_msr_filter_range *user_range) { - struct msr_bitmap_range *ranges = kvm->arch.msr_filter.ranges; struct msr_bitmap_range range; unsigned long *bitmap = NULL; size_t bitmap_size; @@ -5367,11 +5385,9 @@ static int kvm_add_msr_filter(struct kvm *kvm, struct kvm_msr_filter_range *user goto err; }
- /* Everything ok, add this range identifier to our global pool */ - ranges[kvm->arch.msr_filter.count] = range; - /* Make sure we filled the array before we tell anyone to walk it */ - smp_wmb(); - kvm->arch.msr_filter.count++; + /* Everything ok, add this range identifier. */ + msr_filter->ranges[msr_filter->count] = range; + msr_filter->count++;
return 0; err: @@ -5382,10 +5398,11 @@ static int kvm_add_msr_filter(struct kvm *kvm, struct kvm_msr_filter_range *user static int kvm_vm_ioctl_set_msr_filter(struct kvm *kvm, void __user *argp) { struct kvm_msr_filter __user *user_msr_filter = argp; + struct kvm_x86_msr_filter *new_filter, *old_filter; struct kvm_msr_filter filter; bool default_allow; - int r = 0; bool empty = true; + int r = 0; u32 i;
if (copy_from_user(&filter, user_msr_filter, sizeof(filter))) @@ -5398,25 +5415,32 @@ static int kvm_vm_ioctl_set_msr_filter(struct kvm *kvm, void __user *argp) if (empty && !default_allow) return -EINVAL;
- kvm_clear_msr_filter(kvm); - - kvm->arch.msr_filter.default_allow = default_allow; + new_filter = kvm_alloc_msr_filter(default_allow); + if (!new_filter) + return -ENOMEM;
- /* - * Protect from concurrent calls to this function that could trigger - * a TOCTOU violation on kvm->arch.msr_filter.count. - */ - mutex_lock(&kvm->lock); for (i = 0; i < ARRAY_SIZE(filter.ranges); i++) { - r = kvm_add_msr_filter(kvm, &filter.ranges[i]); - if (r) - break; + r = kvm_add_msr_filter(new_filter, &filter.ranges[i]); + if (r) { + kvm_free_msr_filter(new_filter); + return r; + } }
+ mutex_lock(&kvm->lock); + + /* The per-VM filter is protected by kvm->lock... */ + old_filter = srcu_dereference_check(kvm->arch.msr_filter, &kvm->srcu, 1); + + rcu_assign_pointer(kvm->arch.msr_filter, new_filter); + synchronize_srcu(&kvm->srcu); + + kvm_free_msr_filter(old_filter); + kvm_make_all_cpus_request(kvm, KVM_REQ_MSR_FILTER_CHANGED); mutex_unlock(&kvm->lock);
- return r; + return 0; }
long kvm_arch_vm_ioctl(struct file *filp, @@ -10536,8 +10560,6 @@ void kvm_arch_pre_destroy_vm(struct kvm *kvm)
void kvm_arch_destroy_vm(struct kvm *kvm) { - u32 i; - if (current->mm == kvm->mm) { /* * Free memory regions allocated on behalf of userspace, @@ -10554,8 +10576,7 @@ void kvm_arch_destroy_vm(struct kvm *kvm) } if (kvm_x86_ops.vm_destroy) kvm_x86_ops.vm_destroy(kvm); - for (i = 0; i < kvm->arch.msr_filter.count; i++) - kfree(kvm->arch.msr_filter.ranges[i].bitmap); + kvm_free_msr_filter(srcu_dereference_check(kvm->arch.msr_filter, &kvm->srcu, 1)); kvm_pic_destroy(kvm); kvm_ioapic_destroy(kvm); kvm_free_vcpus(kvm);
From: Josef Bacik josef@toxicpanda.com
commit 82d62d06db404d03836cdabbca41d38646d97cbb upstream.
Neal reported a panic trying to use -o rescue=all
BUG: kernel NULL pointer dereference, address: 0000000000000030 PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 0 PID: 4095 Comm: mount Not tainted 5.11.0-0.rc7.149.fc34.x86_64 #1 RIP: 0010:btrfs_device_init_dev_stats+0x4c/0x1f0 RSP: 0018:ffffa60285fbfb68 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88b88f806498 RCX: ffff88b82e7a2a10 RDX: ffffa60285fbfb97 RSI: ffff88b82e7a2a10 RDI: 0000000000000000 RBP: ffff88b88f806b3c R08: 0000000000000000 R09: 0000000000000000 R10: ffff88b82e7a2a10 R11: 0000000000000000 R12: ffff88b88f806a00 R13: ffff88b88f806478 R14: ffff88b88f806a00 R15: ffff88b82e7a2a10 FS: 00007f698be1ec40(0000) GS:ffff88b937e00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000030 CR3: 0000000092c9c006 CR4: 00000000003706f0 Call Trace: ? btrfs_init_dev_stats+0x1f/0xf0 btrfs_init_dev_stats+0x62/0xf0 open_ctree+0x1019/0x15ff btrfs_mount_root.cold+0x13/0xfa legacy_get_tree+0x27/0x40 vfs_get_tree+0x25/0xb0 vfs_kern_mount.part.0+0x71/0xb0 btrfs_mount+0x131/0x3d0 ? legacy_get_tree+0x27/0x40 ? btrfs_show_options+0x640/0x640 legacy_get_tree+0x27/0x40 vfs_get_tree+0x25/0xb0 path_mount+0x441/0xa80 __x64_sys_mount+0xf4/0x130 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f698c04e52e
This happens because we unconditionally attempt to initialize device stats on mount, but we may not have been able to read the device root. Fix this by skipping initializing the device stats if we do not have a device root.
Reported-by: Neal Gompa ngompa13@gmail.com CC: stable@vger.kernel.org # 5.11+ Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/volumes.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -7282,6 +7282,9 @@ static int btrfs_device_init_dev_stats(s int item_size; int i, ret, slot;
+ if (!device->fs_info->dev_root) + return 0; + key.objectid = BTRFS_DEV_STATS_OBJECTID; key.type = BTRFS_PERSISTENT_ITEM_KEY; key.offset = device->devid;
From: Josef Bacik josef@toxicpanda.com
commit 3cb894972f1809aa8d087c42e5e8b26c64b7d508 upstream.
While helping Neal fix his broken file system I added a debug patch to catch if we were calling btrfs_search_slot with a NULL root, and this stack trace popped:
we tried to search with a NULL root CPU: 0 PID: 1760 Comm: mount Not tainted 5.11.0-155.nealbtrfstest.1.fc34.x86_64 #1 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/22/2020 Call Trace: dump_stack+0x6b/0x83 btrfs_search_slot.cold+0x11/0x1b ? btrfs_init_dev_replace+0x36/0x450 btrfs_init_dev_replace+0x71/0x450 open_ctree+0x1054/0x1610 btrfs_mount_root.cold+0x13/0xfa legacy_get_tree+0x27/0x40 vfs_get_tree+0x25/0xb0 vfs_kern_mount.part.0+0x71/0xb0 btrfs_mount+0x131/0x3d0 ? legacy_get_tree+0x27/0x40 ? btrfs_show_options+0x640/0x640 legacy_get_tree+0x27/0x40 vfs_get_tree+0x25/0xb0 path_mount+0x441/0xa80 __x64_sys_mount+0xf4/0x130 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f644730352e
Fix this by not starting the device replace stuff if we do not have a NULL dev root.
Reported-by: Neal Gompa ngompa13@gmail.com CC: stable@vger.kernel.org # 5.11+ Signed-off-by: Josef Bacik josef@toxicpanda.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/dev-replace.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/fs/btrfs/dev-replace.c +++ b/fs/btrfs/dev-replace.c @@ -80,6 +80,9 @@ int btrfs_init_dev_replace(struct btrfs_ struct btrfs_dev_replace_item *ptr; u64 src_devid;
+ if (!dev_root) + return 0; + path = btrfs_alloc_path(); if (!path) { ret = -ENOMEM;
From: Omar Sandoval osandov@fb.com
commit c1d6abdac46ca8127274bea195d804e3f2cec7ee upstream.
Commit 1dae796aabf6 ("btrfs: inode: sink parameter start and len to check_data_csum()") replaced the start parameter to check_data_csum() with page_offset(), but page_offset() is not meaningful for direct I/O pages. Bring back the start parameter.
Fixes: 265d4ac03fdf ("btrfs: sink parameter start and len to check_data_csum") CC: stable@vger.kernel.org # 5.11+ Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: Omar Sandoval osandov@fb.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/inode.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
--- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -2947,11 +2947,13 @@ void btrfs_writepage_endio_finish_ordere * @bio_offset: offset to the beginning of the bio (in bytes) * @page: page where is the data to be verified * @pgoff: offset inside the page + * @start: logical offset in the file * * The length of such check is always one sector size. */ static int check_data_csum(struct inode *inode, struct btrfs_io_bio *io_bio, - u32 bio_offset, struct page *page, u32 pgoff) + u32 bio_offset, struct page *page, u32 pgoff, + u64 start) { struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); SHASH_DESC_ON_STACK(shash, fs_info->csum_shash); @@ -2978,8 +2980,8 @@ static int check_data_csum(struct inode kunmap_atomic(kaddr); return 0; zeroit: - btrfs_print_data_csum_error(BTRFS_I(inode), page_offset(page) + pgoff, - csum, csum_expected, io_bio->mirror_num); + btrfs_print_data_csum_error(BTRFS_I(inode), start, csum, csum_expected, + io_bio->mirror_num); if (io_bio->device) btrfs_dev_stat_inc_and_print(io_bio->device, BTRFS_DEV_STAT_CORRUPTION_ERRS); @@ -3032,7 +3034,8 @@ int btrfs_verify_data_csum(struct btrfs_ pg_off += sectorsize, bio_offset += sectorsize) { int ret;
- ret = check_data_csum(inode, io_bio, bio_offset, page, pg_off); + ret = check_data_csum(inode, io_bio, bio_offset, page, pg_off, + page_offset(page) + pg_off); if (ret < 0) return -EIO; } @@ -7742,7 +7745,8 @@ static blk_status_t btrfs_check_read_dio ASSERT(pgoff < PAGE_SIZE); if (uptodate && (!csum || !check_data_csum(inode, io_bio, - bio_offset, bvec.bv_page, pgoff))) { + bio_offset, bvec.bv_page, + pgoff, start))) { clean_io_failure(fs_info, failure_tree, io_tree, start, bvec.bv_page, btrfs_ino(BTRFS_I(inode)),
From: Josef Bacik josef@toxicpanda.com
commit 820a49dafc3304de06f296c35c9ff1ebc1666343 upstream.
Neal reported a panic trying to use -o rescue=all
BUG: kernel NULL pointer dereference, address: 0000000000000030 PGD 0 P4D 0 Oops: 0000 [#1] SMP NOPTI CPU: 0 PID: 696 Comm: mount Tainted: G W 5.12.0-rc2+ #296 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 RIP: 0010:btrfs_device_init_dev_stats+0x1d/0x200 RSP: 0018:ffffafaec1483bb8 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff9a5715bcb298 RCX: 0000000000000070 RDX: ffff9a5703248000 RSI: ffff9a57052ea150 RDI: ffff9a5715bca400 RBP: ffff9a57052ea150 R08: 0000000000000070 R09: ffff9a57052ea150 R10: 000130faf0741c10 R11: 0000000000000000 R12: ffff9a5703700000 R13: 0000000000000000 R14: ffff9a5715bcb278 R15: ffff9a57052ea150 FS: 00007f600d122c40(0000) GS:ffff9a577bc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000030 CR3: 0000000112a46005 CR4: 0000000000370ef0 Call Trace: ? btrfs_init_dev_stats+0x1f/0xf0 ? kmem_cache_alloc+0xef/0x1f0 btrfs_init_dev_stats+0x5f/0xf0 open_ctree+0x10cb/0x1720 btrfs_mount_root.cold+0x12/0xea legacy_get_tree+0x27/0x40 vfs_get_tree+0x25/0xb0 vfs_kern_mount.part.0+0x71/0xb0 btrfs_mount+0x10d/0x380 legacy_get_tree+0x27/0x40 vfs_get_tree+0x25/0xb0 path_mount+0x433/0xa00 __x64_sys_mount+0xe3/0x120 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xae
This happens because when we call btrfs_init_dev_stats we do device->fs_info->dev_root. However device->fs_info isn't initialized because we were only calling btrfs_init_devices_late() if we properly read the device root. However we don't actually need the device root to init the devices, this function simply assigns the devices their ->fs_info pointer properly, so this needs to be done unconditionally always so that we can properly dereference device->fs_info in rescue cases.
Reported-by: Neal Gompa ngompa13@gmail.com CC: stable@vger.kernel.org # 5.11+ Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/disk-io.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2300,8 +2300,9 @@ static int btrfs_read_roots(struct btrfs } else { set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state); fs_info->dev_root = root; - btrfs_init_devices_late(fs_info); } + /* Initialize fs_info for all devices in any case */ + btrfs_init_devices_late(fs_info);
/* If IGNOREDATACSUMS is set don't bother reading the csum root. */ if (!btrfs_test_opt(fs_info, IGNOREDATACSUMS)) {
From: Filipe Manana fdmanana@suse.com
commit 0bb788300990d3eb5582d3301a720f846c78925c upstream.
While removing a qgroup's sysfs entry we end up taking the kernfs_mutex, through kobject_del(), while holding the fs_info->qgroup_lock spinlock, producing the following trace:
[821.843637] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:281 [821.843641] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 28214, name: podman [821.843644] CPU: 3 PID: 28214 Comm: podman Tainted: G W 5.11.6 #15 [821.843646] Hardware name: Dell Inc. PowerEdge R330/084XW4, BIOS 2.11.0 12/08/2020 [821.843647] Call Trace: [821.843650] dump_stack+0xa1/0xfb [821.843656] ___might_sleep+0x144/0x160 [821.843659] mutex_lock+0x17/0x40 [821.843662] kernfs_remove_by_name_ns+0x1f/0x80 [821.843666] sysfs_remove_group+0x7d/0xe0 [821.843668] sysfs_remove_groups+0x28/0x40 [821.843670] kobject_del+0x2a/0x80 [821.843672] btrfs_sysfs_del_one_qgroup+0x2b/0x40 [btrfs] [821.843685] __del_qgroup_rb+0x12/0x150 [btrfs] [821.843696] btrfs_remove_qgroup+0x288/0x2a0 [btrfs] [821.843707] btrfs_ioctl+0x3129/0x36a0 [btrfs] [821.843717] ? __mod_lruvec_page_state+0x5e/0xb0 [821.843719] ? page_add_new_anon_rmap+0xbc/0x150 [821.843723] ? kfree+0x1b4/0x300 [821.843725] ? mntput_no_expire+0x55/0x330 [821.843728] __x64_sys_ioctl+0x5a/0xa0 [821.843731] do_syscall_64+0x33/0x70 [821.843733] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [821.843736] RIP: 0033:0x4cd3fb [821.843741] RSP: 002b:000000c000906b20 EFLAGS: 00000206 ORIG_RAX: 0000000000000010 [821.843744] RAX: ffffffffffffffda RBX: 000000c000050000 RCX: 00000000004cd3fb [821.843745] RDX: 000000c000906b98 RSI: 000000004010942a RDI: 000000000000000f [821.843747] RBP: 000000c000907cd0 R08: 000000c000622901 R09: 0000000000000000 [821.843748] R10: 000000c000d992c0 R11: 0000000000000206 R12: 000000000000012d [821.843749] R13: 000000000000012c R14: 0000000000000200 R15: 0000000000000049
Fix this by removing the qgroup sysfs entry while not holding the spinlock, since the spinlock is only meant for protection of the qgroup rbtree.
Reported-by: Stuart Shelton srcshelton@gmail.com Link: https://lore.kernel.org/linux-btrfs/7A5485BB-0628-419D-A4D3-27B1AF47E25A@gma... Fixes: 49e5fb46211de0 ("btrfs: qgroup: export qgroups in sysfs") CC: stable@vger.kernel.org # 5.10+ Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/qgroup.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
--- a/fs/btrfs/qgroup.c +++ b/fs/btrfs/qgroup.c @@ -226,7 +226,6 @@ static void __del_qgroup_rb(struct btrfs { struct btrfs_qgroup_list *list;
- btrfs_sysfs_del_one_qgroup(fs_info, qgroup); list_del(&qgroup->dirty); while (!list_empty(&qgroup->groups)) { list = list_first_entry(&qgroup->groups, @@ -243,7 +242,6 @@ static void __del_qgroup_rb(struct btrfs list_del(&list->next_member); kfree(list); } - kfree(qgroup); }
/* must be called with qgroup_lock held */ @@ -569,6 +567,8 @@ void btrfs_free_qgroup_config(struct btr qgroup = rb_entry(n, struct btrfs_qgroup, node); rb_erase(n, &fs_info->qgroup_tree); __del_qgroup_rb(fs_info, qgroup); + btrfs_sysfs_del_one_qgroup(fs_info, qgroup); + kfree(qgroup); } /* * We call btrfs_free_qgroup_config() when unmounting @@ -1578,6 +1578,14 @@ int btrfs_remove_qgroup(struct btrfs_tra spin_lock(&fs_info->qgroup_lock); del_qgroup_rb(fs_info, qgroupid); spin_unlock(&fs_info->qgroup_lock); + + /* + * Remove the qgroup from sysfs now without holding the qgroup_lock + * spinlock, since the sysfs_remove_group() function needs to take + * the mutex kernfs_mutex through kernfs_remove_by_name_ns(). + */ + btrfs_sysfs_del_one_qgroup(fs_info, qgroup); + kfree(qgroup); out: mutex_unlock(&fs_info->qgroup_ioctl_lock); return ret;
From: Filipe Manana fdmanana@suse.com
commit 8d488a8c7ba22d7112fbf6b0a82beb1cdea1c0d5 upstream.
During the mount procedure we are calling btrfs_orphan_cleanup() against the root tree, which will find all orphans items in this tree. When an orphan item corresponds to a deleted subvolume/snapshot (instead of an inode space cache), it must not delete the orphan item, because that will cause btrfs_find_orphan_roots() to not find the orphan item and therefore not add the corresponding subvolume root to the list of dead roots, which results in the subvolume's tree never being deleted by the cleanup thread.
The same applies to the remount from RO to RW path.
Fix this by making btrfs_find_orphan_roots() run before calling btrfs_orphan_cleanup() against the root tree.
A test case for fstests will follow soon.
Reported-by: Robbie Ko robbieko@synology.com Link: https://lore.kernel.org/linux-btrfs/b19f4310-35e0-606e-1eea-2dd84d28c5da@syn... Fixes: 638331fa56caea ("btrfs: fix transaction leak and crash after cleaning up orphans on RO mount") CC: stable@vger.kernel.org # 5.11+ Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/disk-io.c | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-)
--- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2914,6 +2914,21 @@ int btrfs_start_pre_rw_mount(struct btrf } }
+ /* + * btrfs_find_orphan_roots() is responsible for finding all the dead + * roots (with 0 refs), flag them with BTRFS_ROOT_DEAD_TREE and load + * them into the fs_info->fs_roots_radix tree. This must be done before + * calling btrfs_orphan_cleanup() on the tree root. If we don't do it + * first, then btrfs_orphan_cleanup() will delete a dead root's orphan + * item before the root's tree is deleted - this means that if we unmount + * or crash before the deletion completes, on the next mount we will not + * delete what remains of the tree because the orphan item does not + * exists anymore, which is what tells us we have a pending deletion. + */ + ret = btrfs_find_orphan_roots(fs_info); + if (ret) + goto out; + ret = btrfs_cleanup_fs_roots(fs_info); if (ret) goto out; @@ -2973,7 +2988,6 @@ int btrfs_start_pre_rw_mount(struct btrf } }
- ret = btrfs_find_orphan_roots(fs_info); out: return ret; }
From: Ondrej Mosnacek omosnace@redhat.com
commit 519dad3bcd809dc1523bf80ab0310ddb3bf00ade upstream.
If sel_make_policy_nodes() fails, we should jump to 'out', not 'out1', as the latter would incorrectly log an MAC_POLICY_LOAD audit record, even though the policy hasn't actually been reloaded. The 'out1' jump label now becomes unused and can be removed.
Fixes: 02a52c5c8c3b ("selinux: move policy commit after updating selinuxfs") Cc: stable@vger.kernel.org Signed-off-by: Ondrej Mosnacek omosnace@redhat.com Signed-off-by: Paul Moore paul@paul-moore.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- security/selinux/selinuxfs.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/security/selinux/selinuxfs.c +++ b/security/selinux/selinuxfs.c @@ -651,14 +651,13 @@ static ssize_t sel_write_load(struct fil length = sel_make_policy_nodes(fsi, newpolicy); if (length) { selinux_policy_cancel(fsi->state, newpolicy); - goto out1; + goto out; }
selinux_policy_commit(fsi->state, newpolicy);
length = count;
-out1: audit_log(audit_context(), GFP_KERNEL, AUDIT_MAC_POLICY_LOAD, "auid=%u ses=%u lsm=selinux res=1", from_kuid(&init_user_ns, audit_get_loginuid(current)),
From: Ondrej Mosnacek omosnace@redhat.com
commit 6406887a12ee5dcdaffff1a8508d91113d545559 upstream.
Commit 02a52c5c8c3b ("selinux: move policy commit after updating selinuxfs") moved the selinux_policy_commit() call out of security_load_policy() into sel_write_load(), which caused a subtle yet rather serious bug.
The problem is that security_load_policy() passes a reference to the convert_params local variable to sidtab_convert(), which stores it in the sidtab, where it may be accessed until the policy is swapped over and RCU synchronized. Before 02a52c5c8c3b, selinux_policy_commit() was called directly from security_load_policy(), so the convert_params pointer remained valid all the way until the old sidtab was destroyed, but now that's no longer the case and calls to sidtab_context_to_sid() on the old sidtab after security_load_policy() returns may cause invalid memory accesses.
This can be easily triggered using the stress test from commit ee1a84fdfeed ("selinux: overhaul sidtab to fix bug and improve performance"): ``` function rand_cat() { echo $(( $RANDOM % 1024 )) }
function do_work() { while true; do echo -n "system_u:system_r:kernel_t:s0:c$(rand_cat),c$(rand_cat)" \ >/sys/fs/selinux/context 2>/dev/null || true done }
do_work >/dev/null & do_work >/dev/null & do_work >/dev/null &
while load_policy; do echo -n .; sleep 0.1; done
kill %1 kill %2 kill %3 ```
Fix this by allocating the temporary sidtab convert structures dynamically and passing them among the selinux_policy_{load,cancel,commit} functions.
Fixes: 02a52c5c8c3b ("selinux: move policy commit after updating selinuxfs") Cc: stable@vger.kernel.org Tested-by: Tyler Hicks tyhicks@linux.microsoft.com Reviewed-by: Tyler Hicks tyhicks@linux.microsoft.com Signed-off-by: Ondrej Mosnacek omosnace@redhat.com [PM: merge fuzz in security.h and services.c] Signed-off-by: Paul Moore paul@paul-moore.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- security/selinux/include/security.h | 15 ++++++-- security/selinux/selinuxfs.c | 10 ++--- security/selinux/ss/services.c | 65 ++++++++++++++++++++++-------------- 3 files changed, 56 insertions(+), 34 deletions(-)
--- a/security/selinux/include/security.h +++ b/security/selinux/include/security.h @@ -219,14 +219,21 @@ static inline bool selinux_policycap_gen return READ_ONCE(state->policycap[POLICYDB_CAPABILITY_GENFS_SECLABEL_SYMLINKS]); }
+struct selinux_policy_convert_data; + +struct selinux_load_state { + struct selinux_policy *policy; + struct selinux_policy_convert_data *convert_data; +}; + int security_mls_enabled(struct selinux_state *state); int security_load_policy(struct selinux_state *state, - void *data, size_t len, - struct selinux_policy **newpolicyp); + void *data, size_t len, + struct selinux_load_state *load_state); void selinux_policy_commit(struct selinux_state *state, - struct selinux_policy *newpolicy); + struct selinux_load_state *load_state); void selinux_policy_cancel(struct selinux_state *state, - struct selinux_policy *policy); + struct selinux_load_state *load_state); int security_read_policy(struct selinux_state *state, void **data, size_t *len);
--- a/security/selinux/selinuxfs.c +++ b/security/selinux/selinuxfs.c @@ -616,7 +616,7 @@ static ssize_t sel_write_load(struct fil
{ struct selinux_fs_info *fsi = file_inode(file)->i_sb->s_fs_info; - struct selinux_policy *newpolicy; + struct selinux_load_state load_state; ssize_t length; void *data = NULL;
@@ -642,19 +642,19 @@ static ssize_t sel_write_load(struct fil if (copy_from_user(data, buf, count) != 0) goto out;
- length = security_load_policy(fsi->state, data, count, &newpolicy); + length = security_load_policy(fsi->state, data, count, &load_state); if (length) { pr_warn_ratelimited("SELinux: failed to load policy\n"); goto out; }
- length = sel_make_policy_nodes(fsi, newpolicy); + length = sel_make_policy_nodes(fsi, load_state.policy); if (length) { - selinux_policy_cancel(fsi->state, newpolicy); + selinux_policy_cancel(fsi->state, &load_state); goto out; }
- selinux_policy_commit(fsi->state, newpolicy); + selinux_policy_commit(fsi->state, &load_state);
length = count;
--- a/security/selinux/ss/services.c +++ b/security/selinux/ss/services.c @@ -66,6 +66,17 @@ #include "audit.h" #include "policycap_names.h"
+struct convert_context_args { + struct selinux_state *state; + struct policydb *oldp; + struct policydb *newp; +}; + +struct selinux_policy_convert_data { + struct convert_context_args args; + struct sidtab_convert_params sidtab_params; +}; + /* Forward declaration. */ static int context_struct_to_string(struct policydb *policydb, struct context *context, @@ -1973,12 +1984,6 @@ static inline int convert_context_handle return 0; }
-struct convert_context_args { - struct selinux_state *state; - struct policydb *oldp; - struct policydb *newp; -}; - /* * Convert the values in the security context * structure `oldc' from the values specified @@ -2158,7 +2163,7 @@ static void selinux_policy_cond_free(str }
void selinux_policy_cancel(struct selinux_state *state, - struct selinux_policy *policy) + struct selinux_load_state *load_state) { struct selinux_policy *oldpolicy;
@@ -2166,7 +2171,8 @@ void selinux_policy_cancel(struct selinu lockdep_is_held(&state->policy_mutex));
sidtab_cancel_convert(oldpolicy->sidtab); - selinux_policy_free(policy); + selinux_policy_free(load_state->policy); + kfree(load_state->convert_data); }
static void selinux_notify_policy_change(struct selinux_state *state, @@ -2181,9 +2187,9 @@ static void selinux_notify_policy_change }
void selinux_policy_commit(struct selinux_state *state, - struct selinux_policy *newpolicy) + struct selinux_load_state *load_state) { - struct selinux_policy *oldpolicy; + struct selinux_policy *oldpolicy, *newpolicy = load_state->policy; u32 seqno;
oldpolicy = rcu_dereference_protected(state->policy, @@ -2223,6 +2229,7 @@ void selinux_policy_commit(struct selinu /* Free the old policy */ synchronize_rcu(); selinux_policy_free(oldpolicy); + kfree(load_state->convert_data);
/* Notify others of the policy change */ selinux_notify_policy_change(state, seqno); @@ -2239,11 +2246,10 @@ void selinux_policy_commit(struct selinu * loading the new policy. */ int security_load_policy(struct selinux_state *state, void *data, size_t len, - struct selinux_policy **newpolicyp) + struct selinux_load_state *load_state) { struct selinux_policy *newpolicy, *oldpolicy; - struct sidtab_convert_params convert_params; - struct convert_context_args args; + struct selinux_policy_convert_data *convert_data; int rc = 0; struct policy_file file = { data, len }, *fp = &file;
@@ -2273,10 +2279,10 @@ int security_load_policy(struct selinux_ goto err_mapping; }
- if (!selinux_initialized(state)) { /* First policy load, so no need to preserve state from old policy */ - *newpolicyp = newpolicy; + load_state->policy = newpolicy; + load_state->convert_data = NULL; return 0; }
@@ -2290,29 +2296,38 @@ int security_load_policy(struct selinux_ goto err_free_isids; }
+ convert_data = kmalloc(sizeof(*convert_data), GFP_KERNEL); + if (!convert_data) { + rc = -ENOMEM; + goto err_free_isids; + } + /* * Convert the internal representations of contexts * in the new SID table. */ - args.state = state; - args.oldp = &oldpolicy->policydb; - args.newp = &newpolicy->policydb; - - convert_params.func = convert_context; - convert_params.args = &args; - convert_params.target = newpolicy->sidtab; + convert_data->args.state = state; + convert_data->args.oldp = &oldpolicy->policydb; + convert_data->args.newp = &newpolicy->policydb; + + convert_data->sidtab_params.func = convert_context; + convert_data->sidtab_params.args = &convert_data->args; + convert_data->sidtab_params.target = newpolicy->sidtab;
- rc = sidtab_convert(oldpolicy->sidtab, &convert_params); + rc = sidtab_convert(oldpolicy->sidtab, &convert_data->sidtab_params); if (rc) { pr_err("SELinux: unable to convert the internal" " representation of contexts in the new SID" " table\n"); - goto err_free_isids; + goto err_free_convert_data; }
- *newpolicyp = newpolicy; + load_state->policy = newpolicy; + load_state->convert_data = convert_data; return 0;
+err_free_convert_data: + kfree(convert_data); err_free_isids: sidtab_destroy(newpolicy->sidtab); err_mapping:
From: Mian Yousaf Kaukab ykaukab@suse.de
commit 804741ac7b9f2fdebe3740cb0579cb8d94d49e60 upstream.
Since commit 8e850f25b581 ("net: socionext: Stop PHY before resetting netsec") netsec_netdev_init() power downs phy before resetting the controller. However, the state is not restored once the reset is complete. As a result it is not possible to bring up network on a platform with Broadcom BCM5482 phy.
Fix the issue by restoring phy power state after controller reset is complete.
Fixes: 8e850f25b581 ("net: socionext: Stop PHY before resetting netsec") Cc: stable@vger.kernel.org Signed-off-by: Mian Yousaf Kaukab ykaukab@suse.de Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/socionext/netsec.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
--- a/drivers/net/ethernet/socionext/netsec.c +++ b/drivers/net/ethernet/socionext/netsec.c @@ -1718,14 +1718,17 @@ static int netsec_netdev_init(struct net goto err1;
/* set phy power down */ - data = netsec_phy_read(priv->mii_bus, priv->phy_addr, MII_BMCR) | - BMCR_PDOWN; - netsec_phy_write(priv->mii_bus, priv->phy_addr, MII_BMCR, data); + data = netsec_phy_read(priv->mii_bus, priv->phy_addr, MII_BMCR); + netsec_phy_write(priv->mii_bus, priv->phy_addr, MII_BMCR, + data | BMCR_PDOWN);
ret = netsec_reset_hardware(priv, true); if (ret) goto err2;
+ /* Restore phy power state */ + netsec_phy_write(priv->mii_bus, priv->phy_addr, MII_BMCR, data); + spin_lock_init(&priv->desc_ring[NETSEC_RING_TX].lock); spin_lock_init(&priv->desc_ring[NETSEC_RING_RX].lock);
From: Hans de Goede hdegoede@redhat.com
commit 538d2dd0b9920334e6596977a664e9e7bac73703 upstream.
Stop reporting SW_DOCK events because this breaks suspend-on-lid-close.
SW_DOCK should only be reported for docking stations, but all the DSDTs in my DSDT collection which use the intel-vbtn code, always seem to use this for 2-in-1s / convertibles and set SW_DOCK=1 when in laptop-mode (in tandem with setting SW_TABLET_MODE=0).
This causes userspace to think the laptop is docked to a port-replicator and to disable suspend-on-lid-close, which is undesirable.
Map the dock events to KEY_IGNORE to avoid this broken SW_DOCK reporting.
Note this may theoretically cause us to stop reporting SW_DOCK on some device where the 0xCA and 0xCB intel-vbtn events are actually used for reporting docking to a classic docking-station / port-replicator but I'm not aware of any such devices.
Also the most important thing is that we only report SW_DOCK when it reliably reports being docked to a classic docking-station without any false positives, which clearly is not the case here. If there is a chance of reporting false positives then it is better to not report SW_DOCK at all.
Cc: stable@vger.kernel.org Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20210321163513.72328-1-hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/platform/x86/intel-vbtn.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
--- a/drivers/platform/x86/intel-vbtn.c +++ b/drivers/platform/x86/intel-vbtn.c @@ -47,8 +47,16 @@ static const struct key_entry intel_vbtn };
static const struct key_entry intel_vbtn_switchmap[] = { - { KE_SW, 0xCA, { .sw = { SW_DOCK, 1 } } }, /* Docked */ - { KE_SW, 0xCB, { .sw = { SW_DOCK, 0 } } }, /* Undocked */ + /* + * SW_DOCK should only be reported for docking stations, but DSDTs using the + * intel-vbtn code, always seem to use this for 2-in-1s / convertibles and set + * SW_DOCK=1 when in laptop-mode (in tandem with setting SW_TABLET_MODE=0). + * This causes userspace to think the laptop is docked to a port-replicator + * and to disable suspend-on-lid-close, which is undesirable. + * Map the dock events to KEY_IGNORE to avoid this broken SW_DOCK reporting. + */ + { KE_IGNORE, 0xCA, { .sw = { SW_DOCK, 1 } } }, /* Docked */ + { KE_IGNORE, 0xCB, { .sw = { SW_DOCK, 0 } } }, /* Undocked */ { KE_SW, 0xCC, { .sw = { SW_TABLET_MODE, 1 } } }, /* Tablet */ { KE_SW, 0xCD, { .sw = { SW_TABLET_MODE, 0 } } }, /* Laptop */ };
From: Ido Schimmel idosch@nvidia.com
commit e43accba9b071dcd106b5e7643b1b106a158cbb1 upstream.
Cited commit added a new attribute before the existing group reference count attribute, thereby changing its value and breaking existing applications on new kernels.
Before:
# psample -l libpsample ERROR psample_group_foreach: failed to recv message: Operation not supported
After:
# psample -l Group Num Refcount Group Seq 1 1 0
Fix by restoring the value of the old attribute and remove the misleading comments from the enumerator to avoid future bugs.
Cc: stable@vger.kernel.org Fixes: d8bed686ab96 ("net: psample: Add tunnel support") Signed-off-by: Ido Schimmel idosch@nvidia.com Reported-by: Adiel Bidani adielb@nvidia.com Reviewed-by: Jiri Pirko jiri@nvidia.com Reviewed-by: Petr Machata petrm@nvidia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/uapi/linux/psample.h | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)
--- a/include/uapi/linux/psample.h +++ b/include/uapi/linux/psample.h @@ -3,7 +3,6 @@ #define __UAPI_PSAMPLE_H
enum { - /* sampled packet metadata */ PSAMPLE_ATTR_IIFINDEX, PSAMPLE_ATTR_OIFINDEX, PSAMPLE_ATTR_ORIGSIZE, @@ -11,10 +10,8 @@ enum { PSAMPLE_ATTR_GROUP_SEQ, PSAMPLE_ATTR_SAMPLE_RATE, PSAMPLE_ATTR_DATA, - PSAMPLE_ATTR_TUNNEL, - - /* commands attributes */ PSAMPLE_ATTR_GROUP_REFCOUNT, + PSAMPLE_ATTR_TUNNEL,
__PSAMPLE_ATTR_MAX };
From: Thomas Hebb tommyhebb@gmail.com
commit 6d679578fe9c762c8fbc3d796a067cbba84a7884 upstream.
Commit ca0246bb97c2 ("z3fold: fix possible reclaim races") introduced the PAGE_CLAIMED flag "to avoid racing on a z3fold 'headless' page release." By atomically testing and setting the bit in each of z3fold_free() and z3fold_reclaim_page(), a double-free was avoided.
However, commit dcf5aedb24f8 ("z3fold: stricter locking and more careful reclaim") appears to have unintentionally broken this behavior by moving the PAGE_CLAIMED check in z3fold_reclaim_page() to after the page lock gets taken, which only happens for non-headless pages. For headless pages, the check is now skipped entirely and races can occur again.
I have observed such a race on my system:
page:00000000ffbd76b7 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x165316 flags: 0x2ffff0000000000() raw: 02ffff0000000000 ffffea0004535f48 ffff8881d553a170 0000000000000000 raw: 0000000000000000 0000000000000011 00000000ffffffff 0000000000000000 page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0) ------------[ cut here ]------------ kernel BUG at include/linux/mm.h:707! invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 2 PID: 291928 Comm: kworker/2:0 Tainted: G B 5.10.7-arch1-1-kasan #1 Hardware name: Gigabyte Technology Co., Ltd. H97N-WIFI/H97N-WIFI, BIOS F9b 03/03/2016 Workqueue: zswap-shrink shrink_worker RIP: 0010:__free_pages+0x10a/0x130 Code: c1 e7 06 48 01 ef 45 85 e4 74 d1 44 89 e6 31 d2 41 83 ec 01 e8 e7 b0 ff ff eb da 48 c7 c6 e0 32 91 88 48 89 ef e8 a6 89 f8 ff <0f> 0b 4c 89 e7 e8 fc 79 07 00 e9 33 ff ff ff 48 89 ef e8 ff 79 07 RSP: 0000:ffff88819a2ffb98 EFLAGS: 00010296 RAX: 0000000000000000 RBX: ffffea000594c5a8 RCX: 0000000000000000 RDX: 1ffffd4000b298b7 RSI: 0000000000000000 RDI: ffffea000594c5b8 RBP: ffffea000594c580 R08: 000000000000003e R09: ffff8881d5520bbb R10: ffffed103aaa4177 R11: 0000000000000001 R12: ffffea000594c5b4 R13: 0000000000000000 R14: ffff888165316000 R15: ffffea000594c588 FS: 0000000000000000(0000) GS:ffff8881d5500000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f7c8c3654d8 CR3: 0000000103f42004 CR4: 00000000001706e0 Call Trace: z3fold_zpool_shrink+0x9b6/0x1240 shrink_worker+0x35/0x90 process_one_work+0x70c/0x1210 worker_thread+0x539/0x1200 kthread+0x330/0x400 ret_from_fork+0x22/0x30 Modules linked in: rfcomm ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ccm algif_aead des_generic libdes ecb algif_skcipher cmac bnep md4 algif_hash af_alg vfat fat intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel iwlmvm hid_logitech_hidpp kvm at24 mac80211 snd_hda_codec_realtek iTCO_wdt snd_hda_codec_generic intel_pmc_bxt snd_hda_codec_hdmi ledtrig_audio iTCO_vendor_support mei_wdt mei_hdcp snd_hda_intel snd_intel_dspcfg libarc4 soundwire_intel irqbypass iwlwifi soundwire_generic_allocation rapl soundwire_cadence intel_cstate snd_hda_codec intel_uncore btusb joydev mousedev snd_usb_audio pcspkr btrtl uvcvideo nouveau btbcm i2c_i801 btintel snd_hda_core videobuf2_vmalloc i2c_smbus snd_usbmidi_lib videobuf2_memops bluetooth snd_hwdep soundwire_bus snd_soc_rt5640 videobuf2_v4l2 cfg80211 snd_soc_rl6231 videobuf2_common snd_rawmidi lpc_ich alx videodev mdio snd_seq_device snd_soc_core mc ecdh_generic mxm_wmi mei_me hid_logitech_dj wmi snd_compress e1000e ac97_bus mei ttm rfkill snd_pcm_dmaengine ecc snd_pcm snd_timer snd soundcore mac_hid acpi_pad pkcs8_key_parser it87 hwmon_vid crypto_user fuse ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 dm_crypt cbc encrypted_keys trusted tpm rng_core usbhid dm_mod crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper xhci_pci xhci_pci_renesas i915 video intel_gtt i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm agpgart ---[ end trace 126d646fc3dc0ad8 ]---
To fix the issue, re-add the earlier test and set in the case where we have a headless page.
Link: https://lkml.kernel.org/r/c8106dbe6d8390b290cd1d7f873a2942e805349e.161545204... Fixes: dcf5aedb24f8 ("z3fold: stricter locking and more careful reclaim") Signed-off-by: Thomas Hebb tommyhebb@gmail.com Reviewed-by: Vitaly Wool vitaly.wool@konsulko.com Cc: Jongseok Kim ks77sj@gmail.com Cc: Snild Dolkow snild@sony.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/z3fold.c | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-)
--- a/mm/z3fold.c +++ b/mm/z3fold.c @@ -1353,8 +1353,22 @@ static int z3fold_reclaim_page(struct z3 page = list_entry(pos, struct page, lru);
zhdr = page_address(page); - if (test_bit(PAGE_HEADLESS, &page->private)) + if (test_bit(PAGE_HEADLESS, &page->private)) { + /* + * For non-headless pages, we wait to do this + * until we have the page lock to avoid racing + * with __z3fold_alloc(). Headless pages don't + * have a lock (and __z3fold_alloc() will never + * see them), but we still need to test and set + * PAGE_CLAIMED to avoid racing with + * z3fold_free(), so just do it now before + * leaving the loop. + */ + if (test_and_set_bit(PAGE_CLAIMED, &page->private)) + continue; + break; + }
if (kref_get_unless_zero(&zhdr->refcount) == 0) { zhdr = NULL;
From: Sean Nyekjaer sean@geanix.com
commit c1b2028315c6b15e8d6725e0d5884b15887d3daa upstream.
When mouting a squashfs image created without inode compression it fails with: "unable to read inode lookup table"
It turns out that the BLOCK_OFFSET is missing when checking the SQUASHFS_METADATA_SIZE agaist the actual size.
Link: https://lkml.kernel.org/r/20210226092903.1473545-1-sean@geanix.com Fixes: eabac19e40c0 ("squashfs: add more sanity checks in inode lookup") Signed-off-by: Sean Nyekjaer sean@geanix.com Acked-by: Phillip Lougher phillip@squashfs.org.uk Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/squashfs/export.c | 8 ++++++-- fs/squashfs/squashfs_fs.h | 1 + 2 files changed, 7 insertions(+), 2 deletions(-)
--- a/fs/squashfs/export.c +++ b/fs/squashfs/export.c @@ -152,14 +152,18 @@ __le64 *squashfs_read_inode_lookup_table start = le64_to_cpu(table[n]); end = le64_to_cpu(table[n + 1]);
- if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) { + if (start >= end + || (end - start) > + (SQUASHFS_METADATA_SIZE + SQUASHFS_BLOCK_OFFSET)) { kfree(table); return ERR_PTR(-EINVAL); } }
start = le64_to_cpu(table[indexes - 1]); - if (start >= lookup_table_start || (lookup_table_start - start) > SQUASHFS_METADATA_SIZE) { + if (start >= lookup_table_start || + (lookup_table_start - start) > + (SQUASHFS_METADATA_SIZE + SQUASHFS_BLOCK_OFFSET)) { kfree(table); return ERR_PTR(-EINVAL); } --- a/fs/squashfs/squashfs_fs.h +++ b/fs/squashfs/squashfs_fs.h @@ -17,6 +17,7 @@
/* size of metadata (inode and directory) blocks */ #define SQUASHFS_METADATA_SIZE 8192 +#define SQUASHFS_BLOCK_OFFSET 2
/* default size of block device I/O */ #ifdef CONFIG_SQUASHFS_4K_DEVBLK_SIZE
From: Phillip Lougher phillip@squashfs.org.uk
commit 8b44ca2b634527151af07447a8090a5f3a043321 upstream.
The checks for maximum metadata block size is missing SQUASHFS_BLOCK_OFFSET (the two byte length count).
Link: https://lkml.kernel.org/r/2069685113.2081245.1614583677427@webmail.123-reg.c... Fixes: f37aa4c7366e23f ("squashfs: add more sanity checks in id lookup") Signed-off-by: Phillip Lougher phillip@squashfs.org.uk Cc: Sean Nyekjaer sean@geanix.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/squashfs/id.c | 6 ++++-- fs/squashfs/xattr_id.c | 6 ++++-- 2 files changed, 8 insertions(+), 4 deletions(-)
--- a/fs/squashfs/id.c +++ b/fs/squashfs/id.c @@ -97,14 +97,16 @@ __le64 *squashfs_read_id_index_table(str start = le64_to_cpu(table[n]); end = le64_to_cpu(table[n + 1]);
- if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) { + if (start >= end || (end - start) > + (SQUASHFS_METADATA_SIZE + SQUASHFS_BLOCK_OFFSET)) { kfree(table); return ERR_PTR(-EINVAL); } }
start = le64_to_cpu(table[indexes - 1]); - if (start >= id_table_start || (id_table_start - start) > SQUASHFS_METADATA_SIZE) { + if (start >= id_table_start || (id_table_start - start) > + (SQUASHFS_METADATA_SIZE + SQUASHFS_BLOCK_OFFSET)) { kfree(table); return ERR_PTR(-EINVAL); } --- a/fs/squashfs/xattr_id.c +++ b/fs/squashfs/xattr_id.c @@ -109,14 +109,16 @@ __le64 *squashfs_read_xattr_id_table(str start = le64_to_cpu(table[n]); end = le64_to_cpu(table[n + 1]);
- if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) { + if (start >= end || (end - start) > + (SQUASHFS_METADATA_SIZE + SQUASHFS_BLOCK_OFFSET)) { kfree(table); return ERR_PTR(-EINVAL); } }
start = le64_to_cpu(table[indexes - 1]); - if (start >= table_start || (table_start - start) > SQUASHFS_METADATA_SIZE) { + if (start >= table_start || (table_start - start) > + (SQUASHFS_METADATA_SIZE + SQUASHFS_BLOCK_OFFSET)) { kfree(table); return ERR_PTR(-EINVAL); }
From: Miaohe Lin linmiaohe@huawei.com
commit d85aecf2844ff02a0e5f077252b2461d4f10c9f0 upstream.
The current implementation of hugetlb_cgroup for shared mappings could have different behavior. Consider the following two scenarios:
1.Assume initial css reference count of hugetlb_cgroup is 1: 1.1 Call hugetlb_reserve_pages with from = 1, to = 2. So css reference count is 2 associated with 1 file_region. 1.2 Call hugetlb_reserve_pages with from = 2, to = 3. So css reference count is 3 associated with 2 file_region. 1.3 coalesce_file_region will coalesce these two file_regions into one. So css reference count is 3 associated with 1 file_region now.
2.Assume initial css reference count of hugetlb_cgroup is 1 again: 2.1 Call hugetlb_reserve_pages with from = 1, to = 3. So css reference count is 2 associated with 1 file_region.
Therefore, we might have one file_region while holding one or more css reference counts. This inconsistency could lead to imbalanced css_get() and css_put() pair. If we do css_put one by one (i.g. hole punch case), scenario 2 would put one more css reference. If we do css_put all together (i.g. truncate case), scenario 1 will leak one css reference.
The imbalanced css_get() and css_put() pair would result in a non-zero reference when we try to destroy the hugetlb cgroup. The hugetlb cgroup directory is removed __but__ associated resource is not freed. This might result in OOM or can not create a new hugetlb cgroup in a busy workload ultimately.
In order to fix this, we have to make sure that one file_region must hold exactly one css reference. So in coalesce_file_region case, we should release one css reference before coalescence. Also only put css reference when the entire file_region is removed.
The last thing to note is that the caller of region_add() will only hold one reference to h_cg->css for the whole contiguous reservation region. But this area might be scattered when there are already some file_regions reside in it. As a result, many file_regions may share only one h_cg->css reference. In order to ensure that one file_region must hold exactly one css reference, we should do css_get() for each file_region and release the reference held by caller when they are done.
[linmiaohe@huawei.com: fix imbalanced css_get and css_put pair for shared mappings] Link: https://lkml.kernel.org/r/20210316023002.53921-1-linmiaohe@huawei.com
Link: https://lkml.kernel.org/r/20210301120540.37076-1-linmiaohe@huawei.com Fixes: 075a61d07a8e ("hugetlb_cgroup: add accounting for shared mappings") Reported-by: kernel test robot lkp@intel.com (auto build test ERROR) Signed-off-by: Miaohe Lin linmiaohe@huawei.com Reviewed-by: Mike Kravetz mike.kravetz@oracle.com Cc: Aneesh Kumar K.V aneesh.kumar@linux.vnet.ibm.com Cc: Wanpeng Li liwp.linux@gmail.com Cc: Mina Almasry almasrymina@google.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/hugetlb_cgroup.h | 15 +++++++++++++-- mm/hugetlb.c | 41 +++++++++++++++++++++++++++++++++++++---- mm/hugetlb_cgroup.c | 10 ++++++++-- 3 files changed, 58 insertions(+), 8 deletions(-)
--- a/include/linux/hugetlb_cgroup.h +++ b/include/linux/hugetlb_cgroup.h @@ -113,6 +113,11 @@ static inline bool hugetlb_cgroup_disabl return !cgroup_subsys_enabled(hugetlb_cgrp_subsys); }
+static inline void hugetlb_cgroup_put_rsvd_cgroup(struct hugetlb_cgroup *h_cg) +{ + css_put(&h_cg->css); +} + extern int hugetlb_cgroup_charge_cgroup(int idx, unsigned long nr_pages, struct hugetlb_cgroup **ptr); extern int hugetlb_cgroup_charge_cgroup_rsvd(int idx, unsigned long nr_pages, @@ -138,7 +143,8 @@ extern void hugetlb_cgroup_uncharge_coun
extern void hugetlb_cgroup_uncharge_file_region(struct resv_map *resv, struct file_region *rg, - unsigned long nr_pages); + unsigned long nr_pages, + bool region_del);
extern void hugetlb_cgroup_file_init(void) __init; extern void hugetlb_cgroup_migrate(struct page *oldhpage, @@ -147,7 +153,8 @@ extern void hugetlb_cgroup_migrate(struc #else static inline void hugetlb_cgroup_uncharge_file_region(struct resv_map *resv, struct file_region *rg, - unsigned long nr_pages) + unsigned long nr_pages, + bool region_del) { }
@@ -185,6 +192,10 @@ static inline bool hugetlb_cgroup_disabl return true; }
+static inline void hugetlb_cgroup_put_rsvd_cgroup(struct hugetlb_cgroup *h_cg) +{ +} + static inline int hugetlb_cgroup_charge_cgroup(int idx, unsigned long nr_pages, struct hugetlb_cgroup **ptr) { --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -285,6 +285,17 @@ static void record_hugetlb_cgroup_unchar nrg->reservation_counter = &h_cg->rsvd_hugepage[hstate_index(h)]; nrg->css = &h_cg->css; + /* + * The caller will hold exactly one h_cg->css reference for the + * whole contiguous reservation region. But this area might be + * scattered when there are already some file_regions reside in + * it. As a result, many file_regions may share only one css + * reference. In order to ensure that one file_region must hold + * exactly one h_cg->css reference, we should do css_get for + * each file_region and leave the reference held by caller + * untouched. + */ + css_get(&h_cg->css); if (!resv->pages_per_hpage) resv->pages_per_hpage = pages_per_huge_page(h); /* pages_per_hpage should be the same for all entries in @@ -298,6 +309,14 @@ static void record_hugetlb_cgroup_unchar #endif }
+static void put_uncharge_info(struct file_region *rg) +{ +#ifdef CONFIG_CGROUP_HUGETLB + if (rg->css) + css_put(rg->css); +#endif +} + static bool has_same_uncharge_info(struct file_region *rg, struct file_region *org) { @@ -321,6 +340,7 @@ static void coalesce_file_region(struct prg->to = rg->to;
list_del(&rg->link); + put_uncharge_info(rg); kfree(rg);
rg = prg; @@ -332,6 +352,7 @@ static void coalesce_file_region(struct nrg->from = rg->from;
list_del(&rg->link); + put_uncharge_info(rg); kfree(rg); } } @@ -664,7 +685,7 @@ retry:
del += t - f; hugetlb_cgroup_uncharge_file_region( - resv, rg, t - f); + resv, rg, t - f, false);
/* New entry for end of split region */ nrg->from = t; @@ -685,7 +706,7 @@ retry: if (f <= rg->from && t >= rg->to) { /* Remove entire region */ del += rg->to - rg->from; hugetlb_cgroup_uncharge_file_region(resv, rg, - rg->to - rg->from); + rg->to - rg->from, true); list_del(&rg->link); kfree(rg); continue; @@ -693,13 +714,13 @@ retry:
if (f <= rg->from) { /* Trim beginning of region */ hugetlb_cgroup_uncharge_file_region(resv, rg, - t - rg->from); + t - rg->from, false);
del += t - rg->from; rg->from = t; } else { /* Trim end of region */ hugetlb_cgroup_uncharge_file_region(resv, rg, - rg->to - f); + rg->to - f, false);
del += rg->to - f; rg->to = f; @@ -5191,6 +5212,10 @@ int hugetlb_reserve_pages(struct inode * */ long rsv_adjust;
+ /* + * hugetlb_cgroup_uncharge_cgroup_rsvd() will put the + * reference to h_cg->css. See comment below for detail. + */ hugetlb_cgroup_uncharge_cgroup_rsvd( hstate_index(h), (chg - add) * pages_per_huge_page(h), h_cg); @@ -5198,6 +5223,14 @@ int hugetlb_reserve_pages(struct inode * rsv_adjust = hugepage_subpool_put_pages(spool, chg - add); hugetlb_acct_memory(h, -rsv_adjust); + } else if (h_cg) { + /* + * The file_regions will hold their own reference to + * h_cg->css. So we should release the reference held + * via hugetlb_cgroup_charge_cgroup_rsvd() when we are + * done. + */ + hugetlb_cgroup_put_rsvd_cgroup(h_cg); } } return 0; --- a/mm/hugetlb_cgroup.c +++ b/mm/hugetlb_cgroup.c @@ -391,7 +391,8 @@ void hugetlb_cgroup_uncharge_counter(str
void hugetlb_cgroup_uncharge_file_region(struct resv_map *resv, struct file_region *rg, - unsigned long nr_pages) + unsigned long nr_pages, + bool region_del) { if (hugetlb_cgroup_disabled() || !resv || !rg || !nr_pages) return; @@ -400,7 +401,12 @@ void hugetlb_cgroup_uncharge_file_region !resv->reservation_counter) { page_counter_uncharge(rg->reservation_counter, nr_pages * resv->pages_per_hpage); - css_put(rg->css); + /* + * Only do css_put(rg->css) when we delete the entire region + * because one file_region must hold exactly one css reference. + */ + if (region_del) + css_put(rg->css); } }
From: Andrey Konovalov andreyknvl@google.com
commit cf10bd4c4aff8dd64d1aa7f2a529d0c672bc16af upstream.
To allow performing tag checks on page_alloc addresses obtained via page_address(), tag-based KASAN modes store tags for page_alloc allocations in page->flags.
Currently, the default tag value stored in page->flags is 0x00. Therefore, page_address() returns a 0x00ffff... address for pages that were not allocated via page_alloc.
This might cause problems. A particular case we encountered is a conflict with KFENCE. If a KFENCE-allocated slab object is being freed via kfree(page_address(page) + offset), the address passed to kfree() will get tagged with 0x00 (as slab pages keep the default per-page tags). This leads to is_kfence_address() check failing, and a KFENCE object ending up in normal slab freelist, which causes memory corruptions.
This patch changes the way KASAN stores tag in page-flags: they are now stored xor'ed with 0xff. This way, KASAN doesn't need to initialize per-page flags for every created page, which might be slow.
With this change, page_address() returns natively-tagged (with 0xff) pointers for pages that didn't have tags set explicitly.
This patch fixes the encountered conflict with KFENCE and prevents more similar issues that can occur in the future.
Link: https://lkml.kernel.org/r/1a41abb11c51b264511d9e71c303bb16d5cb367b.161547545... Fixes: 2813b9c02962 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc") Signed-off-by: Andrey Konovalov andreyknvl@google.com Reviewed-by: Marco Elver elver@google.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will.deacon@arm.com Cc: Vincenzo Frascino vincenzo.frascino@arm.com Cc: Dmitry Vyukov dvyukov@google.com Cc: Andrey Ryabinin aryabinin@virtuozzo.com Cc: Alexander Potapenko glider@google.com Cc: Peter Collingbourne pcc@google.com Cc: Evgenii Stepanov eugenis@google.com Cc: Branislav Rankov Branislav.Rankov@arm.com Cc: Kevin Brodsky kevin.brodsky@arm.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/mm.h | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-)
--- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1431,16 +1431,28 @@ static inline bool cpupid_match_pid(stru
#if defined(CONFIG_KASAN_SW_TAGS) || defined(CONFIG_KASAN_HW_TAGS)
+/* + * KASAN per-page tags are stored xor'ed with 0xff. This allows to avoid + * setting tags for all pages to native kernel tag value 0xff, as the default + * value 0x00 maps to 0xff. + */ + static inline u8 page_kasan_tag(const struct page *page) { - if (kasan_enabled()) - return (page->flags >> KASAN_TAG_PGSHIFT) & KASAN_TAG_MASK; - return 0xff; + u8 tag = 0xff; + + if (kasan_enabled()) { + tag = (page->flags >> KASAN_TAG_PGSHIFT) & KASAN_TAG_MASK; + tag ^= 0xff; + } + + return tag; }
static inline void page_kasan_tag_set(struct page *page, u8 tag) { if (kasan_enabled()) { + tag ^= 0xff; page->flags &= ~(KASAN_TAG_MASK << KASAN_TAG_PGSHIFT); page->flags |= (tag & KASAN_TAG_MASK) << KASAN_TAG_PGSHIFT; }
From: Nick Desaulniers ndesaulniers@google.com
commit 60bcf728ee7c60ac2a1f9a0eaceb3a7b3954cd2b upstream.
LLVM changed the expected function signatures for llvm_gcda_start_file() and llvm_gcda_emit_function() in the clang-11 release. Users of clang-11 or newer may have noticed their kernels failing to boot due to a panic when enabling CONFIG_GCOV_KERNEL=y +CONFIG_GCOV_PROFILE_ALL=y. Fix up the function signatures so calling these functions doesn't panic the kernel.
Link: https://reviews.llvm.org/rGcdd683b516d147925212724b09ec6fb792a40041 Link: https://reviews.llvm.org/rG13a633b438b6500ecad9e4f936ebadf3411d0f44 Link: https://lkml.kernel.org/r/20210312224132.3413602-2-ndesaulniers@google.com Signed-off-by: Nick Desaulniers ndesaulniers@google.com Reported-by: Prasad Sodagudi psodagud@quicinc.com Suggested-by: Nathan Chancellor nathan@kernel.org Reviewed-by: Fangrui Song maskray@google.com Tested-by: Nathan Chancellor nathan@kernel.org Acked-by: Peter Oberparleiter oberpar@linux.ibm.com Reviewed-by: Nathan Chancellor nathan@kernel.org Cc: stable@vger.kernel.org [5.4+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/gcov/clang.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 69 insertions(+)
--- a/kernel/gcov/clang.c +++ b/kernel/gcov/clang.c @@ -75,7 +75,9 @@ struct gcov_fn_info {
u32 num_counters; u64 *counters; +#if CONFIG_CLANG_VERSION < 110000 const char *function_name; +#endif };
static struct gcov_info *current_info; @@ -105,6 +107,7 @@ void llvm_gcov_init(llvm_gcov_callback w } EXPORT_SYMBOL(llvm_gcov_init);
+#if CONFIG_CLANG_VERSION < 110000 void llvm_gcda_start_file(const char *orig_filename, const char version[4], u32 checksum) { @@ -113,7 +116,17 @@ void llvm_gcda_start_file(const char *or current_info->checksum = checksum; } EXPORT_SYMBOL(llvm_gcda_start_file); +#else +void llvm_gcda_start_file(const char *orig_filename, u32 version, u32 checksum) +{ + current_info->filename = orig_filename; + current_info->version = version; + current_info->checksum = checksum; +} +EXPORT_SYMBOL(llvm_gcda_start_file); +#endif
+#if CONFIG_CLANG_VERSION < 110000 void llvm_gcda_emit_function(u32 ident, const char *function_name, u32 func_checksum, u8 use_extra_checksum, u32 cfg_checksum) { @@ -133,6 +146,24 @@ void llvm_gcda_emit_function(u32 ident, list_add_tail(&info->head, ¤t_info->functions); } EXPORT_SYMBOL(llvm_gcda_emit_function); +#else +void llvm_gcda_emit_function(u32 ident, u32 func_checksum, + u8 use_extra_checksum, u32 cfg_checksum) +{ + struct gcov_fn_info *info = kzalloc(sizeof(*info), GFP_KERNEL); + + if (!info) + return; + + INIT_LIST_HEAD(&info->head); + info->ident = ident; + info->checksum = func_checksum; + info->use_extra_checksum = use_extra_checksum; + info->cfg_checksum = cfg_checksum; + list_add_tail(&info->head, ¤t_info->functions); +} +EXPORT_SYMBOL(llvm_gcda_emit_function); +#endif
void llvm_gcda_emit_arcs(u32 num_counters, u64 *counters) { @@ -295,6 +326,7 @@ void gcov_info_add(struct gcov_info *dst } }
+#if CONFIG_CLANG_VERSION < 110000 static struct gcov_fn_info *gcov_fn_info_dup(struct gcov_fn_info *fn) { size_t cv_size; /* counter values size */ @@ -322,6 +354,28 @@ err_name: kfree(fn_dup); return NULL; } +#else +static struct gcov_fn_info *gcov_fn_info_dup(struct gcov_fn_info *fn) +{ + size_t cv_size; /* counter values size */ + struct gcov_fn_info *fn_dup = kmemdup(fn, sizeof(*fn), + GFP_KERNEL); + if (!fn_dup) + return NULL; + INIT_LIST_HEAD(&fn_dup->head); + + cv_size = fn->num_counters * sizeof(fn->counters[0]); + fn_dup->counters = vmalloc(cv_size); + if (!fn_dup->counters) { + kfree(fn_dup); + return NULL; + } + + memcpy(fn_dup->counters, fn->counters, cv_size); + + return fn_dup; +} +#endif
/** * gcov_info_dup - duplicate profiling data set @@ -362,6 +416,7 @@ err: * gcov_info_free - release memory for profiling data set duplicate * @info: profiling data set duplicate to free */ +#if CONFIG_CLANG_VERSION < 110000 void gcov_info_free(struct gcov_info *info) { struct gcov_fn_info *fn, *tmp; @@ -375,6 +430,20 @@ void gcov_info_free(struct gcov_info *in kfree(info->filename); kfree(info); } +#else +void gcov_info_free(struct gcov_info *info) +{ + struct gcov_fn_info *fn, *tmp; + + list_for_each_entry_safe(fn, tmp, &info->functions, head) { + vfree(fn->counters); + list_del(&fn->head); + kfree(fn); + } + kfree(info->filename); + kfree(info); +} +#endif
#define ITER_STRIDE PAGE_SIZE
From: Ira Weiny ira.weiny@intel.com
commit 487cfade12fae0eb707bdce71c4d585128238a7d upstream.
The kernel test robot found that __kmap_local_sched_out() was not correctly skipping the guard pages when DEBUG_KMAP_LOCAL_FORCE_MAP was set.[1] This was due to DEBUG_HIGHMEM check being used.
Change the configuration check to be correct.
[1] https://lore.kernel.org/lkml/20210304083825.GB17830@xsang-OptiPlex-9020/
Link: https://lkml.kernel.org/r/20210318230657.1497881-1-ira.weiny@intel.com Fixes: 0e91a0c6984c ("mm/highmem: Provide CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP") Signed-off-by: Ira Weiny ira.weiny@intel.com Reported-by: kernel test robot oliver.sang@intel.com Reviewed-by: Thomas Gleixner tglx@linutronix.de Cc: Thomas Gleixner tglx@linutronix.de Cc: Oliver Sang oliver.sang@intel.com Cc: Chaitanya Kulkarni Chaitanya.Kulkarni@wdc.com Cc: David Sterba dsterba@suse.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/highmem.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/highmem.c b/mm/highmem.c index 86f2b9495f9c..6ef8f5e05e7e 100644 --- a/mm/highmem.c +++ b/mm/highmem.c @@ -618,7 +618,7 @@ void __kmap_local_sched_out(void) int idx;
/* With debug all even slots are unmapped and act as guard */ - if (IS_ENABLED(CONFIG_DEBUG_HIGHMEM) && !(i & 0x01)) { + if (IS_ENABLED(CONFIG_DEBUG_KMAP_LOCAL) && !(i & 0x01)) { WARN_ON_ONCE(!pte_none(pteval)); continue; } @@ -654,7 +654,7 @@ void __kmap_local_sched_in(void) int idx;
/* With debug all even slots are unmapped and act as guard */ - if (IS_ENABLED(CONFIG_DEBUG_HIGHMEM) && !(i & 0x01)) { + if (IS_ENABLED(CONFIG_DEBUG_KMAP_LOCAL) && !(i & 0x01)) { WARN_ON_ONCE(!pte_none(pteval)); continue; }
From: Chris Chiu chris.chiu@canonical.com
commit c1d1e25a8c542816ae8dee41b81a18d30c7519a0 upstream.
The .callback of the quirk for Sony VPCEH3U1E was unintetionally removed by the commit 25417185e9b5 ("ACPI: video: Add DMI quirk for GIGABYTE GB-BXBT-2807"). Add it back to make sure the quirk for Sony VPCEH3U1E works as expected.
Fixes: 25417185e9b5 ("ACPI: video: Add DMI quirk for GIGABYTE GB-BXBT-2807") Signed-off-by: Chris Chiu chris.chiu@canonical.com Reported-by: Pavel Machek pavel@ucw.cz Reviewed-by: Pavel Machek (CIP) pavel@denx.de Cc: 5.11+ stable@vger.kernel.org # 5.11+ Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/video_detect.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/acpi/video_detect.c +++ b/drivers/acpi/video_detect.c @@ -147,6 +147,7 @@ static const struct dmi_system_id video_ }, }, { + .callback = video_detect_force_vendor, .ident = "Sony VPCEH3U1E", .matches = { DMI_MATCH(DMI_SYS_VENDOR, "Sony Corporation"),
From: Vegard Nossum vegard.nossum@oracle.com
commit 25928deeb1e4e2cdae1dccff349320c6841eb5f8 upstream.
ACPICA commit 29da9a2a3f5b2c60420893e5c6309a0586d7a329
ACPI is allocating an object using kmalloc(), but then frees it using kmem_cache_free(<"Acpi-Namespace" kmem_cache>).
This is wrong and can lead to boot failures manifesting like this:
hpet0: 3 comparators, 64-bit 100.000000 MHz counter clocksource: Switched to clocksource tsc-early BUG: unable to handle page fault for address: 000000003ffe0018 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.6.0+ #211 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 RIP: 0010:kmem_cache_alloc+0x70/0x1d0 Code: 00 00 4c 8b 45 00 65 49 8b 50 08 65 4c 03 05 6f cc e7 7e 4d 8b 20 4d 85 e4 0f 84 3d 01 00 00 8b 45 20 48 8b 7d 00 48 8d 4a 01 <49> 8b 1c 04 4c 89 e0 65 48 0f c7 0f 0f 94 c0 84 c0 74 c5 8b 45 20 RSP: 0000:ffffc90000013df8 EFLAGS: 00010206 RAX: 0000000000000018 RBX: ffffffff81c49200 RCX: 0000000000000002 RDX: 0000000000000001 RSI: 0000000000000dc0 RDI: 000000000002b300 RBP: ffff88803e403d00 R08: ffff88803ec2b300 R09: 0000000000000001 R10: 0000000000000dc0 R11: 0000000000000006 R12: 000000003ffe0000 R13: ffffffff8110a583 R14: 0000000000000dc0 R15: ffffffff81c49a80 FS: 0000000000000000(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000003ffe0018 CR3: 0000000001c0a001 CR4: 00000000003606f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __trace_define_field+0x33/0xa0 event_trace_init+0xeb/0x2b4 tracer_init_tracefs+0x60/0x195 ? register_tracer+0x1e7/0x1e7 do_one_initcall+0x74/0x160 kernel_init_freeable+0x190/0x1f0 ? rest_init+0x9a/0x9a kernel_init+0x5/0xf6 ret_from_fork+0x35/0x40 CR2: 000000003ffe0018 ---[ end trace 707efa023f2ee960 ]--- RIP: 0010:kmem_cache_alloc+0x70/0x1d0
Bisection leads to unrelated changes in slab; Vlastimil Babka suggests an unrelated layout or slab merge change merely exposed the underlying bug.
Link: https://lore.kernel.org/lkml/4dc93ff8-f86e-f4c9-ebeb-6d3153a78d03@oracle.com... Link: https://lore.kernel.org/r/a1461e21-c744-767d-6dfc-6641fd3e3ce2@siemens.com Link: https://github.com/acpica/acpica/commit/29da9a2a Fixes: f79c8e4136ea ("ACPICA: Namespace: simplify creation of the initial/default namespace") Reported-by: Jan Kiszka jan.kiszka@siemens.com Diagnosed-by: Vlastimil Babka vbabka@suse.cz Diagnosed-by: Kees Cook keescook@chromium.org Signed-off-by: Vegard Nossum vegard.nossum@oracle.com Signed-off-by: Bob Moore robert.moore@intel.com Signed-off-by: Erik Kaneda erik.kaneda@intel.com Cc: 5.10+ stable@vger.kernel.org # 5.10+ Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/acpica/nsaccess.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/acpi/acpica/nsaccess.c +++ b/drivers/acpi/acpica/nsaccess.c @@ -99,13 +99,12 @@ acpi_status acpi_ns_root_initialize(void * just create and link the new node(s) here. */ new_node = - ACPI_ALLOCATE_ZEROED(sizeof(struct acpi_namespace_node)); + acpi_ns_create_node(*ACPI_CAST_PTR(u32, init_val->name)); if (!new_node) { status = AE_NO_MEMORY; goto unlock_and_exit; }
- ACPI_COPY_NAMESEG(new_node->name.ascii, init_val->name); new_node->descriptor_type = ACPI_DESC_TYPE_NAMED; new_node->type = init_val->type;
From: Mark Rutland mark.rutland@arm.com
commit c607ab4f916d4d5259072eca34055d3f5a795c21 upstream.
We recently converted arm64 to use arch_stack_walk() in commit:
5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK")
The core stacktrace code expects that (when tracing the current task) arch_stack_walk() starts a trace at its caller, and does not include itself in the trace. However, arm64's arch_stack_walk() includes itself, and so traces include one more entry than callers expect. The core stacktrace code which calls arch_stack_walk() tries to skip a number of entries to prevent itself appearing in a trace, and the additional entry prevents skipping one of the core stacktrace functions, leaving this in the trace unexpectedly.
We can fix this by having arm64's arch_stack_walk() begin the trace with its caller. The first value returned by the trace will be __builtin_return_address(0), i.e. the caller of arch_stack_walk(). The first frame record to be unwound will be __builtin_frame_address(1), i.e. the caller's frame record. To prevent surprises, arch_stack_walk() is also marked noinline.
While __builtin_frame_address(1) is not safe in portable code, local GCC developers have confirmed that it is safe on arm64. To find the caller's frame record, the builtin can safely dereference the current function's frame record or (in theory) could stash the original FP into another GPR at function entry time, neither of which are problematic.
Prior to this patch, the tracing code would unexpectedly show up in traces of the current task, e.g.
| # cat /proc/self/stack | [<0>] stack_trace_save_tsk+0x98/0x100 | [<0>] proc_pid_stack+0xb4/0x130 | [<0>] proc_single_show+0x60/0x110 | [<0>] seq_read_iter+0x230/0x4d0 | [<0>] seq_read+0xdc/0x130 | [<0>] vfs_read+0xac/0x1e0 | [<0>] ksys_read+0x6c/0xfc | [<0>] __arm64_sys_read+0x20/0x30 | [<0>] el0_svc_common.constprop.0+0x60/0x120 | [<0>] do_el0_svc+0x24/0x90 | [<0>] el0_svc+0x2c/0x54 | [<0>] el0_sync_handler+0x1a4/0x1b0 | [<0>] el0_sync+0x170/0x180
After this patch, the tracing code will not show up in such traces:
| # cat /proc/self/stack | [<0>] proc_pid_stack+0xb4/0x130 | [<0>] proc_single_show+0x60/0x110 | [<0>] seq_read_iter+0x230/0x4d0 | [<0>] seq_read+0xdc/0x130 | [<0>] vfs_read+0xac/0x1e0 | [<0>] ksys_read+0x6c/0xfc | [<0>] __arm64_sys_read+0x20/0x30 | [<0>] el0_svc_common.constprop.0+0x60/0x120 | [<0>] do_el0_svc+0x24/0x90 | [<0>] el0_svc+0x2c/0x54 | [<0>] el0_sync_handler+0x1a4/0x1b0 | [<0>] el0_sync+0x170/0x180
Erring on the side of caution, I've given this a spin with a bunch of toolchains, verifying the output of /proc/self/stack and checking that the assembly looked sound. For GCC (where we require version 5.1.0 or later) I tested with the kernel.org crosstool binares for versions 5.5.0, 6.4.0, 6.5.0, 7.3.0, 7.5.0, 8.1.0, 8.3.0, 8.4.0, 9.2.0, and 10.1.0. For clang (where we require version 10.0.1 or later) I tested with the llvm.org binary releases of 11.0.0, and 11.0.1.
Fixes: 5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK") Signed-off-by: Mark Rutland mark.rutland@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Chen Jun chenjun102@huawei.com Cc: Marco Elver elver@google.com Cc: Mark Brown broonie@kernel.org Cc: Will Deacon will@kernel.org Cc: stable@vger.kernel.org # 5.10.x Reviewed-by: Catalin Marinas catalin.marinas@arm.com Reviewed-by: Mark Brown broonie@kernel.org Link: https://lore.kernel.org/r/20210319184106.5688-1-mark.rutland@arm.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kernel/stacktrace.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
--- a/arch/arm64/kernel/stacktrace.c +++ b/arch/arm64/kernel/stacktrace.c @@ -199,8 +199,9 @@ void show_stack(struct task_struct *tsk,
#ifdef CONFIG_STACKTRACE
-void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie, - struct task_struct *task, struct pt_regs *regs) +noinline void arch_stack_walk(stack_trace_consume_fn consume_entry, + void *cookie, struct task_struct *task, + struct pt_regs *regs) { struct stackframe frame;
@@ -208,8 +209,8 @@ void arch_stack_walk(stack_trace_consume start_backtrace(&frame, regs->regs[29], regs->pc); else if (task == current) start_backtrace(&frame, - (unsigned long)__builtin_frame_address(0), - (unsigned long)arch_stack_walk); + (unsigned long)__builtin_frame_address(1), + (unsigned long)__builtin_return_address(0)); else start_backtrace(&frame, thread_saved_fp(task), thread_saved_pc(task));
From: Horia Geantă horia.geanta@nxp.com
commit 9c3a16f88385e671b63a0de7b82b85e604a80f42 upstream.
Crypto engine (CAAM) on LS1046A platform is configured HW-coherent, mark accordingly the DT node.
As reported by Greg and Sascha, and explained by Robin, lack of "dma-coherent" property for an IP that is configured HW-coherent can lead to problems, e.g. on v5.11:
kernel BUG at drivers/crypto/caam/jr.c:247! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.11.0-20210225-3-00039-g434215968816-dirty #12 Hardware name: TQ TQMLS1046A SoM on Arkona AT1130 (C300) board (DT) pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--) pc : caam_jr_dequeue+0x98/0x57c lr : caam_jr_dequeue+0x98/0x57c sp : ffff800010003d50 x29: ffff800010003d50 x28: ffff8000118d4000 x27: ffff8000118d4328 x26: 00000000000001f0 x25: ffff0008022be480 x24: ffff0008022c6410 x23: 00000000000001f1 x22: ffff8000118d4329 x21: 0000000000004d80 x20: 00000000000001f1 x19: 0000000000000001 x18: 0000000000000020 x17: 0000000000000000 x16: 0000000000000015 x15: ffff800011690230 x14: 2e2e2e2e2e2e2e2e x13: 2e2e2e2e2e2e2020 x12: 3030303030303030 x11: ffff800011700a38 x10: 00000000fffff000 x9 : ffff8000100ada30 x8 : ffff8000116a8a38 x7 : 0000000000000001 x6 : 0000000000000000 x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000ffffffff x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000001800 Call trace: caam_jr_dequeue+0x98/0x57c tasklet_action_common.constprop.0+0x164/0x18c tasklet_action+0x44/0x54 __do_softirq+0x160/0x454 __irq_exit_rcu+0x164/0x16c irq_exit+0x1c/0x30 __handle_domain_irq+0xc0/0x13c gic_handle_irq+0x5c/0xf0 el1_irq+0xb4/0x180 arch_cpu_idle+0x18/0x30 default_idle_call+0x3c/0x1c0 do_idle+0x23c/0x274 cpu_startup_entry+0x34/0x70 rest_init+0xdc/0xec arch_call_rest_init+0x1c/0x28 start_kernel+0x4ac/0x4e4 Code: 91392021 912c2000 d377d8c6 97f24d96 (d4210000)
Cc: stable@vger.kernel.org # v4.10+ Fixes: 8126d88162a5 ("arm64: dts: add QorIQ LS1046A SoC support") Link: https://lore.kernel.org/linux-crypto/fe6faa24-d8f7-d18f-adfa-44fa0caa1598@ar... Reported-by: Greg Ungerer gerg@kernel.org Reported-by: Sascha Hauer s.hauer@pengutronix.de Tested-by: Sascha Hauer s.hauer@pengutronix.de Signed-off-by: Horia Geantă horia.geanta@nxp.com Acked-by: Greg Ungerer gerg@kernel.org Acked-by: Li Yang leoyang.li@nxp.com Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi | 1 + 1 file changed, 1 insertion(+)
--- a/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi +++ b/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi @@ -325,6 +325,7 @@ ranges = <0x0 0x00 0x1700000 0x100000>; reg = <0x00 0x1700000 0x0 0x100000>; interrupts = <GIC_SPI 75 IRQ_TYPE_LEVEL_HIGH>; + dma-coherent;
sec_jr0: jr@10000 { compatible = "fsl,sec-v5.4-job-ring",
From: Horia Geantă horia.geanta@nxp.com
commit ba8da03fa7dff59d9400250aebd38f94cde3cb0f upstream.
Crypto engine (CAAM) on LS1012A platform is configured HW-coherent, mark accordingly the DT node.
Lack of "dma-coherent" property for an IP that is configured HW-coherent can lead to problems, similar to what has been reported for LS1046A.
Cc: stable@vger.kernel.org # v4.12+ Fixes: 85b85c569507 ("arm64: dts: ls1012a: add crypto node") Signed-off-by: Horia Geantă horia.geanta@nxp.com Acked-by: Li Yang leoyang.li@nxp.com Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/boot/dts/freescale/fsl-ls1012a.dtsi | 1 + 1 file changed, 1 insertion(+)
--- a/arch/arm64/boot/dts/freescale/fsl-ls1012a.dtsi +++ b/arch/arm64/boot/dts/freescale/fsl-ls1012a.dtsi @@ -192,6 +192,7 @@ ranges = <0x0 0x00 0x1700000 0x100000>; reg = <0x00 0x1700000 0x0 0x100000>; interrupts = <GIC_SPI 75 IRQ_TYPE_LEVEL_HIGH>; + dma-coherent;
sec_jr0: jr@10000 { compatible = "fsl,sec-v5.4-job-ring",
From: Horia Geantă horia.geanta@nxp.com
commit 4fb3a074755b7737c4081cffe0ccfa08c2f2d29d upstream.
Crypto engine (CAAM) on LS1043A platform is configured HW-coherent, mark accordingly the DT node.
Lack of "dma-coherent" property for an IP that is configured HW-coherent can lead to problems, similar to what has been reported for LS1046A.
Cc: stable@vger.kernel.org # v4.8+ Fixes: 63dac35b58f4 ("arm64: dts: ls1043a: add crypto node") Link: https://lore.kernel.org/linux-crypto/fe6faa24-d8f7-d18f-adfa-44fa0caa1598@ar... Signed-off-by: Horia Geantă horia.geanta@nxp.com Acked-by: Li Yang leoyang.li@nxp.com Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi | 1 + 1 file changed, 1 insertion(+)
--- a/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi +++ b/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi @@ -322,6 +322,7 @@ ranges = <0x0 0x00 0x1700000 0x100000>; reg = <0x00 0x1700000 0x0 0x100000>; interrupts = <0 75 0x4>; + dma-coherent;
sec_jr0: jr@10000 { compatible = "fsl,sec-v5.4-job-ring",
From: Federico Pellegrin fede@evolware.org
commit 664979bba8169d775959452def968d1a7c03901f upstream.
According to the datasheet PA7 can be set to either function A, B or C (see table 6-2 of DS60001579D). The previous value would permit just configuring with function C.
Signed-off-by: Federico Pellegrin fede@evolware.org Fixes: 1e5f532c2737 ("ARM: dts: at91: sam9x60: add device tree for soc and board") Cc: stable@vger.kernel.org # 5.6+ Cc: Sandeep Sheriker Mallikarjun sandeepsheriker.mallikarjun@microchip.com Signed-off-by: Nicolas Ferre nicolas.ferre@microchip.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/boot/dts/at91-sam9x60ek.dts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm/boot/dts/at91-sam9x60ek.dts +++ b/arch/arm/boot/dts/at91-sam9x60ek.dts @@ -336,7 +336,7 @@ &pinctrl { atmel,mux-mask = < /* A B C */ - 0xFFFFFE7F 0xC0E0397F 0xEF00019D /* pioA */ + 0xFFFFFEFF 0xC0E039FF 0xEF00019D /* pioA */ 0x03FFFFFF 0x02FC7E68 0x00780000 /* pioB */ 0xffffffff 0xF83FFFFF 0xB800F3FC /* pioC */ 0x003FFFFF 0x003F8000 0x00000000 /* pioD */
From: Nicolas Ferre nicolas.ferre@microchip.com
commit 2c69c8a1736eace8de491d480e6e577a27c2087c upstream.
Fix the whole mux-mask table according to datasheet for the sam9x60 product. Too much functions for pins were disabled leading to misunderstandings when enabling more peripherals or taking this table as an example for another board. Take advantage of this fix to move the mux-mask in the SoC file where it belongs and use lower case letters for hex numbers like everywhere in the file.
Signed-off-by: Nicolas Ferre nicolas.ferre@microchip.com Fixes: 1e5f532c2737 ("ARM: dts: at91: sam9x60: add device tree for soc and board") Cc: stable@vger.kernel.org # 5.6+ Cc: Sandeep Sheriker Mallikarjun sandeepsheriker.mallikarjun@microchip.com Reviewed-by: Tudor Ambarus tudor.ambarus@microchip.com Link: https://lore.kernel.org/r/20210310152006.15018-1-nicolas.ferre@microchip.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/boot/dts/at91-sam9x60ek.dts | 8 -------- arch/arm/boot/dts/sam9x60.dtsi | 9 +++++++++ 2 files changed, 9 insertions(+), 8 deletions(-)
--- a/arch/arm/boot/dts/at91-sam9x60ek.dts +++ b/arch/arm/boot/dts/at91-sam9x60ek.dts @@ -334,14 +334,6 @@ };
&pinctrl { - atmel,mux-mask = < - /* A B C */ - 0xFFFFFEFF 0xC0E039FF 0xEF00019D /* pioA */ - 0x03FFFFFF 0x02FC7E68 0x00780000 /* pioB */ - 0xffffffff 0xF83FFFFF 0xB800F3FC /* pioC */ - 0x003FFFFF 0x003F8000 0x00000000 /* pioD */ - >; - adc { pinctrl_adc_default: adc_default { atmel,pins = <AT91_PIOB 15 AT91_PERIPH_A AT91_PINCTRL_NONE>; --- a/arch/arm/boot/dts/sam9x60.dtsi +++ b/arch/arm/boot/dts/sam9x60.dtsi @@ -606,6 +606,15 @@ compatible = "microchip,sam9x60-pinctrl", "atmel,at91sam9x5-pinctrl", "atmel,at91rm9200-pinctrl", "simple-bus"; ranges = <0xfffff400 0xfffff400 0x800>;
+ /* mux-mask corresponding to sam9x60 SoC in TFBGA228L package */ + atmel,mux-mask = < + /* A B C */ + 0xffffffff 0xffe03fff 0xef00019d /* pioA */ + 0x03ffffff 0x02fc7e7f 0x00780000 /* pioB */ + 0xffffffff 0xffffffff 0xf83fffff /* pioC */ + 0x003fffff 0x003f8000 0x00000000 /* pioD */ + >; + pioA: gpio@fffff400 { compatible = "microchip,sam9x60-gpio", "atmel,at91sam9x5-gpio", "atmel,at91rm9200-gpio"; reg = <0xfffff400 0x200>;
From: Claudiu Beznea claudiu.beznea@microchip.com
commit 221c3a09ddf70a0a51715e6c2878d8305e95c558 upstream.
Fix the phy address to 7 for Ethernet PHY on SAMA5D27 SOM1. No connection established if phy address 0 is used.
The board uses the 24 pins version of the KSZ8081RNA part, KSZ8081RNA pin 16 REFCLK as PHYAD bit [2] has weak internal pull-down. But at reset, connected to PD09 of the MPU it's connected with an internal pull-up forming PHYAD[2:0] = 7.
Signed-off-by: Claudiu Beznea claudiu.beznea@microchip.com Fixes: 2f61929eb10a ("ARM: dts: at91: at91-sama5d27_som1: fix PHY ID") Cc: Ludovic Desroches ludovic.desroches@microchip.com Signed-off-by: Nicolas Ferre nicolas.ferre@microchip.com Cc: stable@vger.kernel.org # 4.14+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/boot/dts/at91-sama5d27_som1.dtsi | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/arm/boot/dts/at91-sama5d27_som1.dtsi +++ b/arch/arm/boot/dts/at91-sama5d27_som1.dtsi @@ -84,8 +84,8 @@ pinctrl-0 = <&pinctrl_macb0_default>; phy-mode = "rmii";
- ethernet-phy@0 { - reg = <0x0>; + ethernet-phy@7 { + reg = <0x7>; interrupt-parent = <&pioA>; interrupts = <PIN_PD31 IRQ_TYPE_LEVEL_LOW>; pinctrl-names = "default";
From: Mimi Zohar zohar@linux.ibm.com
commit 92063f3ca73aab794bd5408d3361fd5b5ea33079 upstream.
The kernel may be built with multiple LSMs, but only a subset may be enabled on the boot command line by specifying "lsm=". Not including "integrity" on the ordered LSM list may result in a NULL deref.
As reported by Dmitry Vyukov: in qemu: qemu-system-x86_64 -enable-kvm -machine q35,nvdimm -cpu max,migratable=off -smp 4 -m 4G,slots=4,maxmem=16G -hda wheezy.img -kernel arch/x86/boot/bzImage -nographic -vga std -soundhw all -usb -usbdevice tablet -bt hci -bt device:keyboard -net user,host=10.0.2.10,hostfwd=tcp::10022-:22 -net nic,model=virtio-net-pci -object memory-backend-file,id=pmem1,share=off,mem-path=/dev/zero,size=64M -device nvdimm,id=nvdimm1,memdev=pmem1 -append "console=ttyS0 root=/dev/sda earlyprintk=serial rodata=n oops=panic panic_on_warn=1 panic=86400 lsm=smack numa=fake=2 nopcid dummy_hcd.num=8" -pidfile vm_pid -m 2G -cpu host
But it crashes on NULL deref in integrity_inode_get during boot:
Run /sbin/init as init process BUG: kernel NULL pointer dereference, address: 000000000000001c PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP KASAN CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.12.0-rc2+ #97 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-44-g88ab0c15525c-prebuilt.qemu.org 04/01/2014 RIP: 0010:kmem_cache_alloc+0x2b/0x370 mm/slub.c:2920 Code: 57 41 56 41 55 41 54 41 89 f4 55 48 89 fd 53 48 83 ec 10 44 8b 3d d9 1f 90 0b 65 48 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 <8b> 5f 1c 4cf RSP: 0000:ffffc9000032f9d8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff888017fc4f00 RCX: 0000000000000000 RDX: ffff888040220000 RSI: 0000000000000c40 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000000 R09: ffff888019263627 R10: ffffffff83937cd1 R11: 0000000000000000 R12: 0000000000000c40 R13: ffff888019263538 R14: 0000000000000000 R15: 0000000000ffffff FS: 0000000000000000(0000) GS:ffff88802d180000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000001c CR3: 000000000b48e000 CR4: 0000000000750ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: integrity_inode_get+0x47/0x260 security/integrity/iint.c:105 process_measurement+0x33d/0x17e0 security/integrity/ima/ima_main.c:237 ima_bprm_check+0xde/0x210 security/integrity/ima/ima_main.c:474 security_bprm_check+0x7d/0xa0 security/security.c:845 search_binary_handler fs/exec.c:1708 [inline] exec_binprm fs/exec.c:1761 [inline] bprm_execve fs/exec.c:1830 [inline] bprm_execve+0x764/0x19a0 fs/exec.c:1792 kernel_execve+0x370/0x460 fs/exec.c:1973 try_to_run_init_process+0x14/0x4e init/main.c:1366 kernel_init+0x11d/0x1b8 init/main.c:1477 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294 Modules linked in: CR2: 000000000000001c ---[ end trace 22d601a500de7d79 ]---
Since LSMs and IMA may be configured at build time, but not enabled at run time, panic the system if "integrity" was not initialized before use.
Reported-by: Dmitry Vyukov dvyukov@google.com Fixes: 79f7865d844c ("LSM: Introduce "lsm=" for boottime LSM selection") Cc: stable@vger.kernel.org Signed-off-by: Mimi Zohar zohar@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- security/integrity/iint.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/security/integrity/iint.c +++ b/security/integrity/iint.c @@ -98,6 +98,14 @@ struct integrity_iint_cache *integrity_i struct rb_node *node, *parent = NULL; struct integrity_iint_cache *iint, *test_iint;
+ /* + * The integrity's "iint_cache" is initialized at security_init(), + * unless it is not included in the ordered list of LSMs enabled + * on the boot command line. + */ + if (!iint_cache) + panic("%s: lsm=integrity required.\n", __func__); + iint = integrity_iint_find(inode); if (iint) return iint;
From: Lyude Paul lyude@redhat.com
commit d3999c1f7bbbc100c167d7ad3cd79c1d10446ba2 upstream.
While Kepler does technically support 256x256 cursors, it turns out that Kepler actually has some additional requirements for scanout surfaces that we're not enforcing correctly, which aren't present on Maxwell and later. Cursor surfaces must always use small pages (4K), and overlay surfaces must always use large pages (128K).
Fixing this correctly though will take a bit more work: as we'll need to add some code in prepare_fb() to move cursor FBs in large pages to small pages, and vice-versa for overlay FBs. So until we have the time to do that, just limit cursor surfaces to 128x128 - a size small enough to always default to small pages.
This means small ovlys are still broken on Kepler, but it is extremely unlikely anyone cares about those anyway :).
Signed-off-by: Lyude Paul lyude@redhat.com Fixes: d3b2f0f7921c ("drm/nouveau/kms/nv50-: Report max cursor size to userspace") Cc: stable@vger.kernel.org # v5.11+ Signed-off-by: Ben Skeggs bskeggs@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/nouveau/dispnv50/disp.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c @@ -2663,9 +2663,20 @@ nv50_display_create(struct drm_device *d else nouveau_display(dev)->format_modifiers = disp50xx_modifiers;
- if (disp->disp->object.oclass >= GK104_DISP) { + /* FIXME: 256x256 cursors are supported on Kepler, however unlike Maxwell and later + * generations Kepler requires that we use small pages (4K) for cursor scanout surfaces. The + * proper fix for this is to teach nouveau to migrate fbs being used for the cursor plane to + * small page allocations in prepare_fb(). When this is implemented, we should also force + * large pages (128K) for ovly fbs in order to fix Kepler ovlys. + * But until then, just limit cursors to 128x128 - which is small enough to avoid ever using + * large pages. + */ + if (disp->disp->object.oclass >= GM107_DISP) { dev->mode_config.cursor_width = 256; dev->mode_config.cursor_height = 256; + } else if (disp->disp->object.oclass >= GK104_DISP) { + dev->mode_config.cursor_width = 128; + dev->mode_config.cursor_height = 128; } else { dev->mode_config.cursor_width = 64; dev->mode_config.cursor_height = 64;
From: Daniel Vetter daniel.vetter@ffwll.ch
commit cd5297b0855f17c8b4e3ef1d20c6a3656209c7b3 upstream.
Nothing checks userptr.ro except this call to pup_fast, which means there's nothing actually preventing userspace from writing to this. Which means you can just read-only mmap any file you want, userptr it and then write to it with the gpu. Not good.
The right way to handle this is FOLL_WRITE | FOLL_FORCE, which will break any COW mappings and update tracking for MAY_WRITE mappings so there's no exploit and the vm isn't confused about what's going on. For any legit use case there's no difference from what userspace can observe and do.
Reviewed-by: Lucas Stach l.stach@pengutronix.de Cc: stable@vger.kernel.org Cc: John Hubbard jhubbard@nvidia.com Signed-off-by: Daniel Vetter daniel.vetter@intel.com Cc: Lucas Stach l.stach@pengutronix.de Cc: Russell King linux+etnaviv@armlinux.org.uk Cc: Christian Gmeiner christian.gmeiner@gmail.com Cc: etnaviv@lists.freedesktop.org Link: https://patchwork.freedesktop.org/patch/msgid/20210301095254.1946084-1-danie... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/etnaviv/etnaviv_gem.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c @@ -689,7 +689,7 @@ static int etnaviv_gem_userptr_get_pages struct page **pages = pvec + pinned;
ret = pin_user_pages_fast(ptr, num_pages, - !userptr->ro ? FOLL_WRITE : 0, pages); + FOLL_WRITE | FOLL_FORCE, pages); if (ret < 0) { unpin_user_pages(pvec, pinned); kvfree(pvec);
From: Kenneth Feng kenneth.feng@amd.com
commit 9d03730ecbc5afabfda26d4dbb014310bc4ea4d9 upstream.
On some Intel platforms, audio noise can be detected due to high pcie speed switch latency. This patch leaverages ppfeaturemask to fix to the highest pcie speed then disable pcie switching.
v2: coding style fix
Signed-off-by: Kenneth Feng kenneth.feng@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c | 54 +++++++++++++ drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c | 74 +++++++++++++++--- drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c | 24 +++++ drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c | 25 ++++++ 4 files changed, 166 insertions(+), 11 deletions(-)
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c @@ -587,6 +587,48 @@ static int smu7_force_switch_to_arbf0(st tmp, MC_CG_ARB_FREQ_F0); }
+static uint16_t smu7_override_pcie_speed(struct pp_hwmgr *hwmgr) +{ + struct amdgpu_device *adev = (struct amdgpu_device *)(hwmgr->adev); + uint16_t pcie_gen = 0; + + if (adev->pm.pcie_gen_mask & CAIL_PCIE_LINK_SPEED_SUPPORT_GEN4 && + adev->pm.pcie_gen_mask & CAIL_ASIC_PCIE_LINK_SPEED_SUPPORT_GEN4) + pcie_gen = 3; + else if (adev->pm.pcie_gen_mask & CAIL_PCIE_LINK_SPEED_SUPPORT_GEN3 && + adev->pm.pcie_gen_mask & CAIL_ASIC_PCIE_LINK_SPEED_SUPPORT_GEN3) + pcie_gen = 2; + else if (adev->pm.pcie_gen_mask & CAIL_PCIE_LINK_SPEED_SUPPORT_GEN2 && + adev->pm.pcie_gen_mask & CAIL_ASIC_PCIE_LINK_SPEED_SUPPORT_GEN2) + pcie_gen = 1; + else if (adev->pm.pcie_gen_mask & CAIL_PCIE_LINK_SPEED_SUPPORT_GEN1 && + adev->pm.pcie_gen_mask & CAIL_ASIC_PCIE_LINK_SPEED_SUPPORT_GEN1) + pcie_gen = 0; + + return pcie_gen; +} + +static uint16_t smu7_override_pcie_width(struct pp_hwmgr *hwmgr) +{ + struct amdgpu_device *adev = (struct amdgpu_device *)(hwmgr->adev); + uint16_t pcie_width = 0; + + if (adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X16) + pcie_width = 16; + else if (adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X12) + pcie_width = 12; + else if (adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X8) + pcie_width = 8; + else if (adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X4) + pcie_width = 4; + else if (adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X2) + pcie_width = 2; + else if (adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X1) + pcie_width = 1; + + return pcie_width; +} + static int smu7_setup_default_pcie_table(struct pp_hwmgr *hwmgr) { struct smu7_hwmgr *data = (struct smu7_hwmgr *)(hwmgr->backend); @@ -683,6 +725,11 @@ static int smu7_setup_default_pcie_table PP_Min_PCIEGen), get_pcie_lane_support(data->pcie_lane_cap, PP_Max_PCIELane)); + + if (data->pcie_dpm_key_disabled) + phm_setup_pcie_table_entry(&data->dpm_table.pcie_speed_table, + data->dpm_table.pcie_speed_table.count, + smu7_override_pcie_speed(hwmgr), smu7_override_pcie_width(hwmgr)); } return 0; } @@ -1248,6 +1295,13 @@ static int smu7_start_dpm(struct pp_hwmg NULL)), "Failed to enable pcie DPM during DPM Start Function!", return -EINVAL); + } else { + PP_ASSERT_WITH_CODE( + (0 == smum_send_msg_to_smc(hwmgr, + PPSMC_MSG_PCIeDPM_Disable, + NULL)), + "Failed to disble pcie DPM during DPM Start Function!", + return -EINVAL); }
if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps, --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c @@ -54,6 +54,9 @@ #include "smuio/smuio_9_0_offset.h" #include "smuio/smuio_9_0_sh_mask.h"
+#define smnPCIE_LC_SPEED_CNTL 0x11140290 +#define smnPCIE_LC_LINK_WIDTH_CNTL 0x11140288 + #define HBM_MEMORY_CHANNEL_WIDTH 128
static const uint32_t channel_number[] = {1, 2, 0, 4, 0, 8, 0, 16, 2}; @@ -443,8 +446,7 @@ static void vega10_init_dpm_defaults(str if (PP_CAP(PHM_PlatformCaps_VCEDPM)) data->smu_features[GNLD_DPM_VCE].supported = true;
- if (!data->registry_data.pcie_dpm_key_disabled) - data->smu_features[GNLD_DPM_LINK].supported = true; + data->smu_features[GNLD_DPM_LINK].supported = true;
if (!data->registry_data.dcefclk_dpm_key_disabled) data->smu_features[GNLD_DPM_DCEFCLK].supported = true; @@ -1545,6 +1547,13 @@ static int vega10_override_pcie_paramete pp_table->PcieLaneCount[i] = pcie_width; }
+ if (data->registry_data.pcie_dpm_key_disabled) { + for (i = 0; i < NUM_LINK_LEVELS; i++) { + pp_table->PcieGenSpeed[i] = pcie_gen; + pp_table->PcieLaneCount[i] = pcie_width; + } + } + return 0; }
@@ -2967,6 +2976,14 @@ static int vega10_start_dpm(struct pp_hw } }
+ if (data->registry_data.pcie_dpm_key_disabled) { + PP_ASSERT_WITH_CODE(!vega10_enable_smc_features(hwmgr, + false, data->smu_features[GNLD_DPM_LINK].smu_feature_bitmap), + "Attempt to Disable Link DPM feature Failed!", return -EINVAL); + data->smu_features[GNLD_DPM_LINK].enabled = false; + data->smu_features[GNLD_DPM_LINK].supported = false; + } + return 0; }
@@ -4585,6 +4602,24 @@ static int vega10_set_ppfeature_status(s return 0; }
+static int vega10_get_current_pcie_link_width_level(struct pp_hwmgr *hwmgr) +{ + struct amdgpu_device *adev = hwmgr->adev; + + return (RREG32_PCIE(smnPCIE_LC_LINK_WIDTH_CNTL) & + PCIE_LC_LINK_WIDTH_CNTL__LC_LINK_WIDTH_RD_MASK) + >> PCIE_LC_LINK_WIDTH_CNTL__LC_LINK_WIDTH_RD__SHIFT; +} + +static int vega10_get_current_pcie_link_speed_level(struct pp_hwmgr *hwmgr) +{ + struct amdgpu_device *adev = hwmgr->adev; + + return (RREG32_PCIE(smnPCIE_LC_SPEED_CNTL) & + PSWUSP0_PCIE_LC_SPEED_CNTL__LC_CURRENT_DATA_RATE_MASK) + >> PSWUSP0_PCIE_LC_SPEED_CNTL__LC_CURRENT_DATA_RATE__SHIFT; +} + static int vega10_print_clock_levels(struct pp_hwmgr *hwmgr, enum pp_clock_type type, char *buf) { @@ -4593,8 +4628,9 @@ static int vega10_print_clock_levels(str struct vega10_single_dpm_table *mclk_table = &(data->dpm_table.mem_table); struct vega10_single_dpm_table *soc_table = &(data->dpm_table.soc_table); struct vega10_single_dpm_table *dcef_table = &(data->dpm_table.dcef_table); - struct vega10_pcie_table *pcie_table = &(data->dpm_table.pcie_table); struct vega10_odn_clock_voltage_dependency_table *podn_vdd_dep = NULL; + uint32_t gen_speed, lane_width, current_gen_speed, current_lane_width; + PPTable_t *pptable = &(data->smc_state_table.pp_table);
int i, now, size = 0, count = 0;
@@ -4651,15 +4687,31 @@ static int vega10_print_clock_levels(str "*" : ""); break; case PP_PCIE: - smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetCurrentLinkIndex, &now); - - for (i = 0; i < pcie_table->count; i++) - size += sprintf(buf + size, "%d: %s %s\n", i, - (pcie_table->pcie_gen[i] == 0) ? "2.5GT/s, x1" : - (pcie_table->pcie_gen[i] == 1) ? "5.0GT/s, x16" : - (pcie_table->pcie_gen[i] == 2) ? "8.0GT/s, x16" : "", - (i == now) ? "*" : ""); + current_gen_speed = + vega10_get_current_pcie_link_speed_level(hwmgr); + current_lane_width = + vega10_get_current_pcie_link_width_level(hwmgr); + for (i = 0; i < NUM_LINK_LEVELS; i++) { + gen_speed = pptable->PcieGenSpeed[i]; + lane_width = pptable->PcieLaneCount[i]; + + size += sprintf(buf + size, "%d: %s %s %s\n", i, + (gen_speed == 0) ? "2.5GT/s," : + (gen_speed == 1) ? "5.0GT/s," : + (gen_speed == 2) ? "8.0GT/s," : + (gen_speed == 3) ? "16.0GT/s," : "", + (lane_width == 1) ? "x1" : + (lane_width == 2) ? "x2" : + (lane_width == 3) ? "x4" : + (lane_width == 4) ? "x8" : + (lane_width == 5) ? "x12" : + (lane_width == 6) ? "x16" : "", + (current_gen_speed == gen_speed) && + (current_lane_width == lane_width) ? + "*" : ""); + } break; + case OD_SCLK: if (hwmgr->od_enabled) { size = sprintf(buf, "%s:\n", "OD_SCLK"); --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c @@ -133,6 +133,7 @@ static void vega12_set_default_registry_ data->registry_data.auto_wattman_debug = 0; data->registry_data.auto_wattman_sample_period = 100; data->registry_data.auto_wattman_threshold = 50; + data->registry_data.pcie_dpm_key_disabled = !(hwmgr->feature_mask & PP_PCIE_DPM_MASK); }
static int vega12_set_features_platform_caps(struct pp_hwmgr *hwmgr) @@ -539,6 +540,29 @@ static int vega12_override_pcie_paramete pp_table->PcieLaneCount[i] = pcie_width_arg; }
+ /* override to the highest if it's disabled from ppfeaturmask */ + if (data->registry_data.pcie_dpm_key_disabled) { + for (i = 0; i < NUM_LINK_LEVELS; i++) { + smu_pcie_arg = (i << 16) | (pcie_gen << 8) | pcie_width; + ret = smum_send_msg_to_smc_with_parameter(hwmgr, + PPSMC_MSG_OverridePcieParameters, smu_pcie_arg, + NULL); + PP_ASSERT_WITH_CODE(!ret, + "[OverridePcieParameters] Attempt to override pcie params failed!", + return ret); + + pp_table->PcieGenSpeed[i] = pcie_gen; + pp_table->PcieLaneCount[i] = pcie_width; + } + ret = vega12_enable_smc_features(hwmgr, + false, + data->smu_features[GNLD_DPM_LINK].smu_feature_bitmap); + PP_ASSERT_WITH_CODE(!ret, + "Attempt to Disable DPM LINK Failed!", + return ret); + data->smu_features[GNLD_DPM_LINK].enabled = false; + data->smu_features[GNLD_DPM_LINK].supported = false; + } return 0; }
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c @@ -171,6 +171,7 @@ static void vega20_set_default_registry_ data->registry_data.gfxoff_controlled_by_driver = 1; data->gfxoff_allowed = false; data->counter_gfxoff = 0; + data->registry_data.pcie_dpm_key_disabled = !(hwmgr->feature_mask & PP_PCIE_DPM_MASK); }
static int vega20_set_features_platform_caps(struct pp_hwmgr *hwmgr) @@ -885,6 +886,30 @@ static int vega20_override_pcie_paramete pp_table->PcieLaneCount[i] = pcie_width_arg; }
+ /* override to the highest if it's disabled from ppfeaturmask */ + if (data->registry_data.pcie_dpm_key_disabled) { + for (i = 0; i < NUM_LINK_LEVELS; i++) { + smu_pcie_arg = (i << 16) | (pcie_gen << 8) | pcie_width; + ret = smum_send_msg_to_smc_with_parameter(hwmgr, + PPSMC_MSG_OverridePcieParameters, smu_pcie_arg, + NULL); + PP_ASSERT_WITH_CODE(!ret, + "[OverridePcieParameters] Attempt to override pcie params failed!", + return ret); + + pp_table->PcieGenSpeed[i] = pcie_gen; + pp_table->PcieLaneCount[i] = pcie_width; + } + ret = vega20_enable_smc_features(hwmgr, + false, + data->smu_features[GNLD_DPM_LINK].smu_feature_bitmap); + PP_ASSERT_WITH_CODE(!ret, + "Attempt to Disable DPM LINK Failed!", + return ret); + data->smu_features[GNLD_DPM_LINK].enabled = false; + data->smu_features[GNLD_DPM_LINK].supported = false; + } + return 0; }
From: Alex Deucher alexander.deucher@amd.com
commit 5c458585c0141754cdcbf25feebb547dd671b559 upstream.
Commit 098214999c8f added fetching of the AUX_DPHY register values from the vbios, but it also changed the default values in the case when there are no values in the vbios. This causes problems with displays with high refresh rates. To fix this, switch back to the original default value for AUX_DPHY_TX_CONTROL.
Fixes: 098214999c8f ("drm/amd/display: Read VBIOS Golden Settings Tbl") Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1426 Reviewed-by: Harry Wentland harry.wentland@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: Igor Kravchenko Igor.Kravchenko@amd.com Cc: Aric Cyr Aric.Cyr@amd.com Cc: Aurabindo Pillai aurabindo.pillai@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/dc/dcn20/dcn20_link_encoder.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_link_encoder.c +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_link_encoder.c @@ -341,8 +341,7 @@ void enc2_hw_init(struct link_encoder *e } else { AUX_REG_WRITE(AUX_DPHY_RX_CONTROL0, 0x103d1110);
- AUX_REG_WRITE(AUX_DPHY_TX_CONTROL, 0x21c4d); - + AUX_REG_WRITE(AUX_DPHY_TX_CONTROL, 0x21c7a); }
//AUX_DPHY_TX_REF_CONTROL'AUX_TX_REF_DIV HW default is 0x32;
From: Prike Liang Prike.Liang@amd.com
commit 9aa26019c1a60013ea866d460de6392acb1712ee upstream.
During system hibernation suspend still need un-gate gfx CG/PG firstly to handle HW status check before HW resource destory.
Signed-off-by: Prike Liang Prike.Liang@amd.com Acked-by: Alex Deucher alexander.deucher@amd.com Acked-by: Huang Rui ray.huang@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2666,7 +2666,7 @@ static int amdgpu_device_ip_suspend_phas { int i, r;
- if (adev->in_poweroff_reboot_com || + if (adev->in_poweroff_reboot_com || adev->in_hibernate || !amdgpu_acpi_is_s0ix_supported(adev) || amdgpu_in_reset(adev)) { amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE); amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE); @@ -3727,7 +3727,11 @@ int amdgpu_device_suspend(struct drm_dev
amdgpu_fence_driver_suspend(adev);
- if (adev->in_poweroff_reboot_com || + /* + * TODO: Need figure out the each GNB IP idle off dependency and then + * improve the AMDGPU suspend/resume sequence for system-wide Sx entry/exit. + */ + if (adev->in_poweroff_reboot_com || adev->in_hibernate || !amdgpu_acpi_is_s0ix_supported(adev) || amdgpu_in_reset(adev)) r = amdgpu_device_ip_suspend_phase2(adev); else
From: Alex Deucher alexander.deucher@amd.com
commit c933b111094f2818571fc51b81b98ee0d370c035 upstream.
Add new DID.
Reviewed-by: Guchun Chen guchun.chen@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -1102,6 +1102,7 @@ static const struct pci_device_id pciidl {0x1002, 0x73A3, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID}, {0x1002, 0x73AB, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID}, {0x1002, 0x73AE, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID}, + {0x1002, 0x73AF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID}, {0x1002, 0x73BF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
/* Van Gogh */
From: Jani Nikula jani.nikula@intel.com
commit b61fde1beb6b1847f1743e75f4d9839acebad76a upstream.
Use the correct DSS CTL registers for ICL DSI transcoders.
As a side effect, this also brings back the sanity check for trying to use pipe DSC registers on pipe A on ICL.
Fixes: 8a029c113b17 ("drm/i915/dp: Modify VDSC helpers to configure DSC for Bigjoiner slave") Cc: Manasi Navare manasi.d.navare@intel.com Cc: Animesh Manna animesh.manna@intel.com Cc: Vandita Kulkarni vandita.kulkarni@intel.com Cc: stable@vger.kernel.org # v5.11+ Reviewed-by: Manasi Navare manasi.d.navare@intel.com Signed-off-by: Jani Nikula jani.nikula@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20210319115333.8330-1-jani.nik... (cherry picked from commit 5706d02871240fdba7ddd6ab1cc31672fc95a90f) Signed-off-by: Rodrigo Vivi rodrigo.vivi@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/display/intel_vdsc.c | 10 ++-------- 1 file changed, 2 insertions(+), 8 deletions(-)
--- a/drivers/gpu/drm/i915/display/intel_vdsc.c +++ b/drivers/gpu/drm/i915/display/intel_vdsc.c @@ -1016,20 +1016,14 @@ static i915_reg_t dss_ctl1_reg(const str { enum pipe pipe = to_intel_crtc(crtc_state->uapi.crtc)->pipe;
- if (crtc_state->cpu_transcoder == TRANSCODER_EDP) - return DSS_CTL1; - - return ICL_PIPE_DSS_CTL1(pipe); + return is_pipe_dsc(crtc_state) ? ICL_PIPE_DSS_CTL1(pipe) : DSS_CTL1; }
static i915_reg_t dss_ctl2_reg(const struct intel_crtc_state *crtc_state) { enum pipe pipe = to_intel_crtc(crtc_state->uapi.crtc)->pipe;
- if (crtc_state->cpu_transcoder == TRANSCODER_EDP) - return DSS_CTL2; - - return ICL_PIPE_DSS_CTL2(pipe); + return is_pipe_dsc(crtc_state) ? ICL_PIPE_DSS_CTL2(pipe) : DSS_CTL2; }
void intel_dsc_enable(struct intel_encoder *encoder,
From: Imre Deak imre.deak@intel.com
commit 8840e3bd981f128846b01c12d3966d115e8617c9 upstream.
To optimize some task deferring it until runtime resume unless someone holds a runtime PM reference (because in this case the task can be done w/o the overhead of runtime resume), we have to use the runtime PM get-if-active logic: If the runtime PM usage count is 0 (and so get-if-in-use would return false) the runtime suspend handler is not necessarily called yet (it could be just pending), so the device is not necessarily powered down, and so the runtime resume handler is not guaranteed to be called.
The fence revocation depends on the above deferral, so add a get-if-active helper and use it during fence revocation.
v2: - Add code comment explaining the fence reg programming deferral logic to i915_vma_revoke_fence(). (Chris) - Add Cc: stable and Fixes: tags. (Chris) - Fix the function docbook comment.
Cc: Chris Wilson chris@chris-wilson.co.uk Cc: stable@vger.kernel.org # v4.12+ Fixes: 181df2d458f3 ("drm/i915: Take rpm wakelock for releasing the fence on unbind") Reviewed-by: Chris Wilson chris@chris-wilson.co.uk Signed-off-by: Imre Deak imre.deak@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20210322204223.919936-1-imre.d... (cherry picked from commit 9d58aa46291d4d696bb1eac3436d3118f7bf2573) Signed-off-by: Rodrigo Vivi rodrigo.vivi@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c | 13 +++++++++++- drivers/gpu/drm/i915/intel_runtime_pm.c | 29 ++++++++++++++++++++++----- drivers/gpu/drm/i915/intel_runtime_pm.h | 5 ++++ 3 files changed, 41 insertions(+), 6 deletions(-)
--- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c @@ -316,7 +316,18 @@ void i915_vma_revoke_fence(struct i915_v WRITE_ONCE(fence->vma, NULL); vma->fence = NULL;
- with_intel_runtime_pm_if_in_use(fence_to_uncore(fence)->rpm, wakeref) + /* + * Skip the write to HW if and only if the device is currently + * suspended. + * + * If the driver does not currently hold a wakeref (if_in_use == 0), + * the device may currently be runtime suspended, or it may be woken + * up before the suspend takes place. If the device is not suspended + * (powered down) and we skip clearing the fence register, the HW is + * left in an undefined state where we may end up with multiple + * registers overlapping. + */ + with_intel_runtime_pm_if_active(fence_to_uncore(fence)->rpm, wakeref) fence_write(fence); }
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c @@ -412,12 +412,20 @@ intel_wakeref_t intel_runtime_pm_get(str }
/** - * intel_runtime_pm_get_if_in_use - grab a runtime pm reference if device in use + * __intel_runtime_pm_get_if_active - grab a runtime pm reference if device is active * @rpm: the intel_runtime_pm structure + * @ignore_usecount: get a ref even if dev->power.usage_count is 0 * * This function grabs a device-level runtime pm reference if the device is - * already in use and ensures that it is powered up. It is illegal to try - * and access the HW should intel_runtime_pm_get_if_in_use() report failure. + * already active and ensures that it is powered up. It is illegal to try + * and access the HW should intel_runtime_pm_get_if_active() report failure. + * + * If @ignore_usecount=true, a reference will be acquired even if there is no + * user requiring the device to be powered up (dev->power.usage_count == 0). + * If the function returns false in this case then it's guaranteed that the + * device's runtime suspend hook has been called already or that it will be + * called (and hence it's also guaranteed that the device's runtime resume + * hook will be called eventually). * * Any runtime pm reference obtained by this function must have a symmetric * call to intel_runtime_pm_put() to release the reference again. @@ -425,7 +433,8 @@ intel_wakeref_t intel_runtime_pm_get(str * Returns: the wakeref cookie to pass to intel_runtime_pm_put(), evaluates * as True if the wakeref was acquired, or False otherwise. */ -intel_wakeref_t intel_runtime_pm_get_if_in_use(struct intel_runtime_pm *rpm) +static intel_wakeref_t __intel_runtime_pm_get_if_active(struct intel_runtime_pm *rpm, + bool ignore_usecount) { if (IS_ENABLED(CONFIG_PM)) { /* @@ -434,7 +443,7 @@ intel_wakeref_t intel_runtime_pm_get_if_ * function, since the power state is undefined. This applies * atm to the late/early system suspend/resume handlers. */ - if (pm_runtime_get_if_in_use(rpm->kdev) <= 0) + if (pm_runtime_get_if_active(rpm->kdev, ignore_usecount) <= 0) return 0; }
@@ -443,6 +452,16 @@ intel_wakeref_t intel_runtime_pm_get_if_ return track_intel_runtime_pm_wakeref(rpm); }
+intel_wakeref_t intel_runtime_pm_get_if_in_use(struct intel_runtime_pm *rpm) +{ + return __intel_runtime_pm_get_if_active(rpm, false); +} + +intel_wakeref_t intel_runtime_pm_get_if_active(struct intel_runtime_pm *rpm) +{ + return __intel_runtime_pm_get_if_active(rpm, true); +} + /** * intel_runtime_pm_get_noresume - grab a runtime pm reference * @rpm: the intel_runtime_pm structure --- a/drivers/gpu/drm/i915/intel_runtime_pm.h +++ b/drivers/gpu/drm/i915/intel_runtime_pm.h @@ -177,6 +177,7 @@ void intel_runtime_pm_driver_release(str
intel_wakeref_t intel_runtime_pm_get(struct intel_runtime_pm *rpm); intel_wakeref_t intel_runtime_pm_get_if_in_use(struct intel_runtime_pm *rpm); +intel_wakeref_t intel_runtime_pm_get_if_active(struct intel_runtime_pm *rpm); intel_wakeref_t intel_runtime_pm_get_noresume(struct intel_runtime_pm *rpm); intel_wakeref_t intel_runtime_pm_get_raw(struct intel_runtime_pm *rpm);
@@ -188,6 +189,10 @@ intel_wakeref_t intel_runtime_pm_get_raw for ((wf) = intel_runtime_pm_get_if_in_use(rpm); (wf); \ intel_runtime_pm_put((rpm), (wf)), (wf) = 0)
+#define with_intel_runtime_pm_if_active(rpm, wf) \ + for ((wf) = intel_runtime_pm_get_if_active(rpm); (wf); \ + intel_runtime_pm_put((rpm), (wf)), (wf) = 0) + void intel_runtime_pm_put_unchecked(struct intel_runtime_pm *rpm); #if IS_ENABLED(CONFIG_DRM_I915_DEBUG_RUNTIME_PM) void intel_runtime_pm_put(struct intel_runtime_pm *rpm, intel_wakeref_t wref);
From: JeongHyeon Lee jhs2.lee@samsung.com
commit 160f99db943224e55906dd83880da1a704c6e6b9 upstream.
Three optional parameters must be accepted at once in a DM verity table, e.g.: (verity_error_handling_mode) (ignore_zero_block) (check_at_most_once) Fix this to be possible by incrementing DM_VERITY_OPTS_MAX.
Signed-off-by: JeongHyeon Lee jhs2.lee@samsung.com Fixes: 843f38d382b1 ("dm verity: add 'check_at_most_once' option to only validate hashes once") Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/md/dm-verity-target.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/md/dm-verity-target.c +++ b/drivers/md/dm-verity-target.c @@ -34,7 +34,7 @@ #define DM_VERITY_OPT_IGN_ZEROES "ignore_zero_blocks" #define DM_VERITY_OPT_AT_MOST_ONCE "check_at_most_once"
-#define DM_VERITY_OPTS_MAX (2 + DM_VERITY_OPTS_FEC + \ +#define DM_VERITY_OPTS_MAX (3 + DM_VERITY_OPTS_FEC + \ DM_VERITY_ROOT_HASH_VERIFICATION_OPTS)
static unsigned dm_verity_prefetch_cluster = DM_VERITY_DEFAULT_PREFETCH_SIZE;
From: Mikulas Patocka mpatocka@redhat.com
commit 5424a0b867e65f1ecf34ffe88d091a4fcbb35bc1 upstream.
When a DM device is first created it doesn't yet have an established capacity, therefore the use of set_capacity_and_notify() should be conditional given the potential for needless pr_info "detected capacity change" noise even if capacity is 0.
One could argue that the pr_info() in set_capacity_and_notify() is misplaced, but that position is not held uniformly.
Signed-off-by: Mikulas Patocka mpatocka@redhat.com Fixes: f64d9b2eacb9 ("dm: use set_capacity_and_notify") Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/md/dm.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -2016,7 +2016,10 @@ static struct dm_table *__bind(struct ma if (size != dm_get_size(md)) memset(&md->geometry, 0, sizeof(md->geometry));
- set_capacity_and_notify(md->disk, size); + if (!get_capacity(md->disk)) + set_capacity(md->disk, size); + else + set_capacity_and_notify(md->disk, size);
dm_table_event_callback(t, event_callback, md);
From: Mikulas Patocka mpatocka@redhat.com
commit 4edbe1d7bcffcd6269f3b5eb63f710393ff2ec7a upstream.
If there are not any dm devices, we need to zero the "dev" argument in the first structure dm_name_list. However, this can cause out of bounds write, because the "needed" variable is zero and len may be less than eight.
Fix this bug by reporting DM_BUFFER_FULL_FLAG if the result buffer is too small to hold the "nl->dev" value.
Signed-off-by: Mikulas Patocka mpatocka@redhat.com Reported-by: Dan Carpenter dan.carpenter@oracle.com Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/md/dm-ioctl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/md/dm-ioctl.c +++ b/drivers/md/dm-ioctl.c @@ -529,7 +529,7 @@ static int list_devices(struct file *fil * Grab our output buffer. */ nl = orig_nl = get_result_buffer(param, param_size, &len); - if (len < needed) { + if (len < needed || len < sizeof(nl->dev)) { param->flags |= DM_BUFFER_FULL_FLAG; goto out; }
From: Grygorii Strashko grygorii.strashko@ti.com
[ Upstream commit 7d7275b3e866cf8092bd12553ec53ba26864f7bb ]
The main purpose of l3 IRQs is to catch OCP bus access errors and identify corresponding code places by showing call stack, so it's important to handle L3 interconnect errors as fast as possible. On RT these IRQs will became threaded and will be scheduled much more late from the moment actual error occurred so showing completely useless information.
Hence, mark l3 IRQs as IRQF_NO_THREAD so they will not be forced threaded on RT or if force_irqthreads = true.
Fixes: 0ee7261c9212 ("drivers: bus: Move the OMAP interconnect driver to drivers/bus/") Signed-off-by: Grygorii Strashko grygorii.strashko@ti.com Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bus/omap_l3_noc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/bus/omap_l3_noc.c b/drivers/bus/omap_l3_noc.c index b040447575ad..dcfb32ee5cb6 100644 --- a/drivers/bus/omap_l3_noc.c +++ b/drivers/bus/omap_l3_noc.c @@ -285,7 +285,7 @@ static int omap_l3_probe(struct platform_device *pdev) */ l3->debug_irq = platform_get_irq(pdev, 0); ret = devm_request_irq(l3->dev, l3->debug_irq, l3_interrupt_handler, - 0x0, "l3-dbg-irq", l3); + IRQF_NO_THREAD, "l3-dbg-irq", l3); if (ret) { dev_err(l3->dev, "request_irq failed for %d\n", l3->debug_irq); @@ -294,7 +294,7 @@ static int omap_l3_probe(struct platform_device *pdev)
l3->app_irq = platform_get_irq(pdev, 1); ret = devm_request_irq(l3->dev, l3->app_irq, l3_interrupt_handler, - 0x0, "l3-app-irq", l3); + IRQF_NO_THREAD, "l3-app-irq", l3); if (ret) dev_err(l3->dev, "request_irq failed for %d\n", l3->app_irq);
From: Tony Lindgren tony@atomide.com
[ Upstream commit a249ca66d15fa4b54dc6deaff4155df3db1308e1 ]
Yongqin Liu yongqin.liu@linaro.org reported an issue where reboot hangs on beagleboard-x15. This started happening after commit 7078a5ba7a58 ("soc: ti: omap-prm: Fix boot time errors for rst_map_012 bits 0 and 1").
We now assert any 012 type resets on init to prevent unconfigured accelerator MMUs getting enabled on init depending on the bootloader or kexec configured state.
Turns out that we now also wrongly assert dra7 l3init domain PCIe reset bits causing a hang during reboot. Let's fix the l3init reset bits to use a 01 map instead of 012 map. There are only two rstctrl bits and not three. This is documented in TRM "Table 3-1647. RM_PCIESS_RSTCTRL".
Fixes: 5a68c87afde0 ("soc: ti: omap-prm: dra7: add genpd support for remaining PRM instances") Fixes: 7078a5ba7a58 ("soc: ti: omap-prm: Fix boot time errors for rst_map_012 bits 0 and 1") Cc: Kishon Vijay Abraham I kishon@ti.com Reported-by: Yongqin Liu yongqin.liu@linaro.org Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/soc/ti/omap_prm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/soc/ti/omap_prm.c b/drivers/soc/ti/omap_prm.c index bf1468e5bccb..17ea6a74a988 100644 --- a/drivers/soc/ti/omap_prm.c +++ b/drivers/soc/ti/omap_prm.c @@ -332,7 +332,7 @@ static const struct omap_prm_data dra7_prm_data[] = { { .name = "l3init", .base = 0x4ae07300, .pwrstctrl = 0x0, .pwrstst = 0x4, .dmap = &omap_prm_alwon, - .rstctrl = 0x10, .rstst = 0x14, .rstmap = rst_map_012, + .rstctrl = 0x10, .rstst = 0x14, .rstmap = rst_map_01, .clkdm_name = "pcie" }, {
From: Tony Lindgren tony@atomide.com
[ Upstream commit fbfa463be8dc7957ee4f81556e9e1ea2a951807d ]
When I dropped legacy data for omap4 and dra7 smartreflex in favor of device tree based data, it seems I only testd for the "SmartReflex Class3 initialized" line in dmesg. I missed the fact that there is also omap_devinit_smartreflex() that happens later, and now it produces an error on boot for "No Voltage table for the corresponding vdd. Cannot create debugfs entries for n-values".
This happens as we no longer have the smartreflex instance legacy data, and have not yet moved completely to device tree based booting for the driver. Let's fix the issue by changing the smartreflex init to use names. This should all eventually go away in favor of doing the init in the driver based on devicetree compatible value.
Note that dra7xx_init_early() is not calling any voltage domain init like omap54xx_voltagedomains_init(), or a dra7 specific voltagedomains init. This means that on dra7 smartreflex is still not fully initialized, and also seems to be missing the related devicetree nodes.
Fixes: a6b1e717e942 ("ARM: OMAP2+: Drop legacy platform data for omap4 smartreflex") Fixes: e54740b4afe8 ("ARM: OMAP2+: Drop legacy platform data for dra7 smartreflex") Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/mach-omap2/sr_device.c | 75 +++++++++++++++++++++++++-------- 1 file changed, 58 insertions(+), 17 deletions(-)
diff --git a/arch/arm/mach-omap2/sr_device.c b/arch/arm/mach-omap2/sr_device.c index 62df666c2bd0..17b66f0d0dee 100644 --- a/arch/arm/mach-omap2/sr_device.c +++ b/arch/arm/mach-omap2/sr_device.c @@ -88,34 +88,26 @@ static void __init sr_set_nvalues(struct omap_volt_data *volt_data,
extern struct omap_sr_data omap_sr_pdata[];
-static int __init sr_dev_init(struct omap_hwmod *oh, void *user) +static int __init sr_init_by_name(const char *name, const char *voltdm) { struct omap_sr_data *sr_data = NULL; struct omap_volt_data *volt_data; - struct omap_smartreflex_dev_attr *sr_dev_attr; static int i;
- if (!strncmp(oh->name, "smartreflex_mpu_iva", 20) || - !strncmp(oh->name, "smartreflex_mpu", 16)) + if (!strncmp(name, "smartreflex_mpu_iva", 20) || + !strncmp(name, "smartreflex_mpu", 16)) sr_data = &omap_sr_pdata[OMAP_SR_MPU]; - else if (!strncmp(oh->name, "smartreflex_core", 17)) + else if (!strncmp(name, "smartreflex_core", 17)) sr_data = &omap_sr_pdata[OMAP_SR_CORE]; - else if (!strncmp(oh->name, "smartreflex_iva", 16)) + else if (!strncmp(name, "smartreflex_iva", 16)) sr_data = &omap_sr_pdata[OMAP_SR_IVA];
if (!sr_data) { - pr_err("%s: Unknown instance %s\n", __func__, oh->name); + pr_err("%s: Unknown instance %s\n", __func__, name); return -EINVAL; }
- sr_dev_attr = (struct omap_smartreflex_dev_attr *)oh->dev_attr; - if (!sr_dev_attr || !sr_dev_attr->sensor_voltdm_name) { - pr_err("%s: No voltage domain specified for %s. Cannot initialize\n", - __func__, oh->name); - goto exit; - } - - sr_data->name = oh->name; + sr_data->name = name; if (cpu_is_omap343x()) sr_data->ip_type = 1; else @@ -136,10 +128,10 @@ static int __init sr_dev_init(struct omap_hwmod *oh, void *user) } }
- sr_data->voltdm = voltdm_lookup(sr_dev_attr->sensor_voltdm_name); + sr_data->voltdm = voltdm_lookup(voltdm); if (!sr_data->voltdm) { pr_err("%s: Unable to get voltage domain pointer for VDD %s\n", - __func__, sr_dev_attr->sensor_voltdm_name); + __func__, voltdm); goto exit; }
@@ -160,6 +152,20 @@ exit: return 0; }
+static int __init sr_dev_init(struct omap_hwmod *oh, void *user) +{ + struct omap_smartreflex_dev_attr *sr_dev_attr; + + sr_dev_attr = (struct omap_smartreflex_dev_attr *)oh->dev_attr; + if (!sr_dev_attr || !sr_dev_attr->sensor_voltdm_name) { + pr_err("%s: No voltage domain specified for %s. Cannot initialize\n", + __func__, oh->name); + return 0; + } + + return sr_init_by_name(oh->name, sr_dev_attr->sensor_voltdm_name); +} + /* * API to be called from board files to enable smartreflex * autocompensation at init. @@ -169,7 +175,42 @@ void __init omap_enable_smartreflex_on_init(void) sr_enable_on_init = true; }
+static const char * const omap4_sr_instances[] = { + "mpu", + "iva", + "core", +}; + +static const char * const dra7_sr_instances[] = { + "mpu", + "core", +}; + int __init omap_devinit_smartreflex(void) { + const char * const *sr_inst; + int i, nr_sr = 0; + + if (soc_is_omap44xx()) { + sr_inst = omap4_sr_instances; + nr_sr = ARRAY_SIZE(omap4_sr_instances); + + } else if (soc_is_dra7xx()) { + sr_inst = dra7_sr_instances; + nr_sr = ARRAY_SIZE(dra7_sr_instances); + } + + if (nr_sr) { + const char *name, *voltdm; + + for (i = 0; i < nr_sr; i++) { + name = kasprintf(GFP_KERNEL, "smartreflex_%s", sr_inst[i]); + voltdm = sr_inst[i]; + sr_init_by_name(name, voltdm); + } + + return 0; + } + return omap_hwmod_for_each_by_class("smartreflex", sr_dev_init, NULL); }
From: Tony Lindgren tony@atomide.com
[ Upstream commit effe89e40037038db7711bdab5d3401fe297d72c ]
On reset deassert, we must wait a bit after the rstst bit change before we allow clockdomain autoidle again. Otherwise we get the following oops sometimes on dra7 with iva:
Unhandled fault: imprecise external abort (0x1406) at 0x00000000 44000000.ocp:L3 Standard Error: MASTER MPU TARGET IVA_CONFIG (Read Link): At Address: 0x0005A410 : Data Access in User mode during Functional access Internal error: : 1406 [#1] SMP ARM ... (sysc_write_sysconfig) from [<c0782cb0>] (sysc_enable_module+0xcc/0x260) (sysc_enable_module) from [<c0782f0c>] (sysc_runtime_resume+0xc8/0x174) (sysc_runtime_resume) from [<c0a3e1ac>] (genpd_runtime_resume+0x94/0x224) (genpd_runtime_resume) from [<c0a33f0c>] (__rpm_callback+0xd8/0x180)
It is unclear what all devices this might affect, but presumably other devices with the rstst bit too can be affected. So let's just enable the delay for all the devices with rstst bit for now. Later on we may want to limit the list to the know affected devices if needed.
Fixes: d30cd83f6853 ("soc: ti: omap-prm: add support for denying idle for reset clockdomain") Reported-by: Yongqin Liu yongqin.liu@linaro.org Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/soc/ti/omap_prm.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/soc/ti/omap_prm.c b/drivers/soc/ti/omap_prm.c index 17ea6a74a988..51143a68a889 100644 --- a/drivers/soc/ti/omap_prm.c +++ b/drivers/soc/ti/omap_prm.c @@ -830,8 +830,12 @@ static int omap_reset_deassert(struct reset_controller_dev *rcdev, reset->prm->data->name, id);
exit: - if (reset->clkdm) + if (reset->clkdm) { + /* At least dra7 iva needs a delay before clkdm idle */ + if (has_rstst) + udelay(1); pdata->clkdm_allow_idle(reset->clkdm); + }
return ret; }
From: Maciej Fijalkowski maciej.fijalkowski@intel.com
[ Upstream commit edbea922025169c0e5cdca5ebf7bf5374cc5566c ]
Currently, veth_xmit() would call the skb_record_rx_queue() only when there is XDP program loaded on peer interface in native mode.
If peer has XDP prog in generic mode, then netif_receive_generic_xdp() has a call to netif_get_rxqueue(skb), so for multi-queue veth it will not be possible to grab a correct rxq.
To fix that, store queue_mapping independently of XDP prog presence on peer interface.
Fixes: 638264dc9022 ("veth: Support per queue XDP ring") Signed-off-by: Maciej Fijalkowski maciej.fijalkowski@intel.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Toshiaki Makita toshiaki.makita1@gmail.com Link: https://lore.kernel.org/bpf/20210303152903.11172-1-maciej.fijalkowski@intel.... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/veth.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/veth.c b/drivers/net/veth.c index 02bfcdf50a7a..36abe756282e 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -301,8 +301,7 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev) if (rxq < rcv->real_num_rx_queues) { rq = &rcv_priv->rq[rxq]; rcv_xdp = rcu_access_pointer(rq->xdp_prog); - if (rcv_xdp) - skb_record_rx_queue(skb, rxq); + skb_record_rx_queue(skb, rxq); }
skb_tx_timestamp(skb);
From: Alexei Starovoitov ast@kernel.org
[ Upstream commit 350a5c4dd2452ea999cc5e1d4a8dbf12de2f97ef ]
The syzbot got FD of vmlinux BTF and passed it into map_create which caused crash in btf_type_id_size() when it tried to access resolved_ids. The vmlinux BTF doesn't have 'resolved_ids' and 'resolved_sizes' initialized to save memory. To avoid such issues disallow using vmlinux BTF in prog_load and map_create commands.
Fixes: 5329722057d4 ("bpf: Assign ID to vmlinux BTF and return extra info for BTF in GET_OBJ_INFO") Reported-by: syzbot+8bab8ed346746e7540e8@syzkaller.appspotmail.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210307225248.79031-1-alexei.starovoitov@gmail.... Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/syscall.c | 5 +++++ kernel/bpf/verifier.c | 4 ++++ 2 files changed, 9 insertions(+)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index e5999d86c76e..32ca33539052 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -854,6 +854,11 @@ static int map_create(union bpf_attr *attr) err = PTR_ERR(btf); goto free_map; } + if (btf_is_kernel(btf)) { + btf_put(btf); + err = -EACCES; + goto free_map; + } map->btf = btf;
if (attr->btf_value_type_id) { diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index ab23dfb9df1b..5b233e911c2c 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -8580,6 +8580,10 @@ static int check_btf_info(struct bpf_verifier_env *env, btf = btf_get_by_fd(attr->prog_btf_fd); if (IS_ERR(btf)) return PTR_ERR(btf); + if (btf_is_kernel(btf)) { + btf_put(btf); + return -EACCES; + } env->prog->aux->btf = btf;
err = check_btf_func(env, attr, uattr);
From: Tal Lossos tallossos@gmail.com
[ Upstream commit 769c18b254ca191b45047e1fcb3b2ce56fada0b6 ]
bpf_fd_inode_storage_lookup_elem() returned NULL when getting a bad FD, which caused -ENOENT in bpf_map_copy_value. -EBADF error is better than -ENOENT for a bad FD behaviour.
The patch was partially contributed by CyberArk Software, Inc.
Fixes: 8ea636848aca ("bpf: Implement bpf_local_storage for inodes") Signed-off-by: Tal Lossos tallossos@gmail.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Yonghong Song yhs@fb.com Acked-by: KP Singh kpsingh@kernel.org Link: https://lore.kernel.org/bpf/20210307120948.61414-1-tallossos@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/bpf_inode_storage.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/bpf/bpf_inode_storage.c b/kernel/bpf/bpf_inode_storage.c index 6639640523c0..b58b2efb9b43 100644 --- a/kernel/bpf/bpf_inode_storage.c +++ b/kernel/bpf/bpf_inode_storage.c @@ -109,7 +109,7 @@ static void *bpf_fd_inode_storage_lookup_elem(struct bpf_map *map, void *key) fd = *(int *)key; f = fget_raw(fd); if (!f) - return NULL; + return ERR_PTR(-EBADF);
sdata = inode_storage_lookup(f->f_inode, map, true); fput(f);
From: Georgi Valkov gvalkov@abv.bg
[ Upstream commit e7fb6465d4c8e767e39cbee72464e0060ab3d20c ]
It was reported ([0]) that having optional -m flag between source and destination arguments in install command breaks bpftools cross-build on MacOS. Move -m to the front to fix this issue.
[0] https://github.com/openwrt/openwrt/pull/3959
Fixes: 7110d80d53f4 ("libbpf: Makefile set specified permission mode") Signed-off-by: Georgi Valkov gvalkov@abv.bg Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20210308183038.613432-1-andrii@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/lib/bpf/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile index 55bd78b3496f..310f647c2d5b 100644 --- a/tools/lib/bpf/Makefile +++ b/tools/lib/bpf/Makefile @@ -236,7 +236,7 @@ define do_install if [ ! -d '$(DESTDIR_SQ)$2' ]; then \ $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$2'; \ fi; \ - $(INSTALL) $1 $(if $3,-m $3,) '$(DESTDIR_SQ)$2' + $(INSTALL) $(if $3,-m $3,) $1 '$(DESTDIR_SQ)$2' endef
install_lib: all_cmd
From: Tariq Toukan tariqt@nvidia.com
[ Upstream commit d5dd03b26ba49c4ffe67ee1937add82293c19794 ]
Since cited patch, MLX5E_REQUIRED_WQE_MTTS is not a power of two. Hence, usage of MLX5E_LOG_ALIGNED_MPWQE_PPW should be replaced, as it lost some accuracy. Use the designated macro to calculate the number of required MTTs.
This makes sure the solution in cited patch works properly.
While here, un-inline mlx5e_get_mpwqe_offset(), and remove the unused RQ parameter.
Fixes: c3c9402373fe ("net/mlx5e: Add resiliency in Striding RQ mode for packets larger than MTU") Signed-off-by: Tariq Toukan tariqt@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en.h | 7 ++++--- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 6 +++--- drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 4 ++-- 3 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index 055baf3b6cb1..f258f2f9b8cf 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -90,14 +90,15 @@ struct page_pool; MLX5_MPWRQ_LOG_WQE_SZ - PAGE_SHIFT : 0) #define MLX5_MPWRQ_PAGES_PER_WQE BIT(MLX5_MPWRQ_WQE_PAGE_ORDER)
-#define MLX5_MTT_OCTW(npages) (ALIGN(npages, 8) / 2) +#define MLX5_ALIGN_MTTS(mtts) (ALIGN(mtts, 8)) +#define MLX5_ALIGNED_MTTS_OCTW(mtts) ((mtts) / 2) +#define MLX5_MTT_OCTW(mtts) (MLX5_ALIGNED_MTTS_OCTW(MLX5_ALIGN_MTTS(mtts))) /* Add another page to MLX5E_REQUIRED_WQE_MTTS as a buffer between * WQEs, This page will absorb write overflow by the hardware, when * receiving packets larger than MTU. These oversize packets are * dropped by the driver at a later stage. */ -#define MLX5E_REQUIRED_WQE_MTTS (ALIGN(MLX5_MPWRQ_PAGES_PER_WQE + 1, 8)) -#define MLX5E_LOG_ALIGNED_MPWQE_PPW (ilog2(MLX5E_REQUIRED_WQE_MTTS)) +#define MLX5E_REQUIRED_WQE_MTTS (MLX5_ALIGN_MTTS(MLX5_MPWRQ_PAGES_PER_WQE + 1)) #define MLX5E_REQUIRED_MTTS(wqes) (wqes * MLX5E_REQUIRED_WQE_MTTS) #define MLX5E_MAX_RQ_NUM_MTTS \ ((1 << 16) * 2) /* So that MLX5_MTT_OCTW(num_mtts) fits into u16 */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index a2e0b548bf57..e479cce3e2b1 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -305,9 +305,9 @@ static int mlx5e_create_rq_umr_mkey(struct mlx5_core_dev *mdev, struct mlx5e_rq rq->wqe_overflow.addr); }
-static inline u64 mlx5e_get_mpwqe_offset(struct mlx5e_rq *rq, u16 wqe_ix) +static u64 mlx5e_get_mpwqe_offset(u16 wqe_ix) { - return (wqe_ix << MLX5E_LOG_ALIGNED_MPWQE_PPW) << PAGE_SHIFT; + return MLX5E_REQUIRED_MTTS(wqe_ix) << PAGE_SHIFT; }
static void mlx5e_init_frags_partition(struct mlx5e_rq *rq) @@ -547,7 +547,7 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c, mlx5_wq_ll_get_wqe(&rq->mpwqe.wq, i); u32 byte_count = rq->mpwqe.num_strides << rq->mpwqe.log_stride_sz; - u64 dma_offset = mlx5e_get_mpwqe_offset(rq, i); + u64 dma_offset = mlx5e_get_mpwqe_offset(i);
wqe->data[0].addr = cpu_to_be64(dma_offset + rq->buff.headroom); wqe->data[0].byte_count = cpu_to_be32(byte_count); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c index 4864deed9dc9..b2e71a045df0 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c @@ -505,7 +505,6 @@ static int mlx5e_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix) struct mlx5e_icosq *sq = rq->icosq; struct mlx5_wq_cyc *wq = &sq->wq; struct mlx5e_umr_wqe *umr_wqe; - u16 xlt_offset = ix << (MLX5E_LOG_ALIGNED_MPWQE_PPW - 1); u16 pi; int err; int i; @@ -536,7 +535,8 @@ static int mlx5e_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix) umr_wqe->ctrl.opmod_idx_opcode = cpu_to_be32((sq->pc << MLX5_WQE_CTRL_WQE_INDEX_SHIFT) | MLX5_OPCODE_UMR); - umr_wqe->uctrl.xlt_offset = cpu_to_be16(xlt_offset); + umr_wqe->uctrl.xlt_offset = + cpu_to_be16(MLX5_ALIGNED_MTTS_OCTW(MLX5E_REQUIRED_MTTS(ix)));
sq->db.wqe_info[pi] = (struct mlx5e_icosq_wqe_info) { .wqe_type = MLX5E_ICOSQ_WQE_UMR_RX,
From: Aya Levin ayal@nvidia.com
[ Upstream commit 1c2cdf0b603a3b0c763288ad92e9f3f1555925cf ]
When closing the PTP channel, set its pointer explicitly to NULL. PTP channel is opened on demand, the code verify the pointer validity before access. Nullify it when closing the PTP channel to avoid unexpected behavior.
Fixes: 145e5637d941 ("net/mlx5e: Add TX PTP port object support") Signed-off-by: Aya Levin ayal@nvidia.com Reviewed-by: Tariq Toukan tariqt@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index e479cce3e2b1..3248741af440 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -2443,8 +2443,10 @@ void mlx5e_close_channels(struct mlx5e_channels *chs) { int i;
- if (chs->port_ptp) + if (chs->port_ptp) { mlx5e_port_ptp_close(chs->port_ptp); + chs->port_ptp = NULL; + }
for (i = 0; i < chs->num; i++) mlx5e_close_channel(chs->c[i]);
From: Maxim Mikityanskiy maximmi@mellanox.com
[ Upstream commit e5eb01344e9b09bb9d255b9727449186f7168df8 ]
Each RQ (including XSK RQs) takes a reference to the XDP program. When an XDP program is attached or detached, the channels and queues are recreated, however, there is a special flow for changing an active XDP program to another one. In that flow, channels and queues stay alive, but the refcounts of the old and new XDP programs are adjusted. This flow didn't increment refcount by the number of active XSK RQs, and this commit fixes it.
Fixes: db05815b36cb ("net/mlx5e: Add XSK zero-copy support") Signed-off-by: Maxim Mikityanskiy maximmi@mellanox.com Reviewed-by: Tariq Toukan tariqt@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 3248741af440..1386212ad3f0 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -4550,8 +4550,10 @@ static int mlx5e_xdp_set(struct net_device *netdev, struct bpf_prog *prog) struct mlx5e_channel *c = priv->channels.c[i];
mlx5e_rq_replace_xdp_prog(&c->rq, prog); - if (test_bit(MLX5E_CHANNEL_STATE_XSK, c->state)) + if (test_bit(MLX5E_CHANNEL_STATE_XSK, c->state)) { + bpf_prog_inc(prog); mlx5e_rq_replace_xdp_prog(&c->xskrq, prog); + } }
unlock:
From: Maxim Mikityanskiy maximmi@mellanox.com
[ Upstream commit 74640f09735f935437bd8df9fe61a66f03eabb34 ]
Port timestamping for PTP can be enabled/disabled while the channels are closed. In that case mlx5e_safe_switch_channels is skipped, and the preactivate hook is called directly. However, if that hook returns an error, the channel parameters must be reverted back to their old values. This commit adds missing handling on this case.
Fixes: 145e5637d941 ("net/mlx5e: Add TX PTP port object support") Signed-off-by: Maxim Mikityanskiy maximmi@mellanox.com Reviewed-by: Tariq Toukan tariqt@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c index 8612c388db7d..fdf5afc8b058 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c @@ -1993,8 +1993,13 @@ static int set_pflag_tx_port_ts(struct net_device *netdev, bool enable) */
if (!test_bit(MLX5E_STATE_OPENED, &priv->state)) { + struct mlx5e_params old_params; + + old_params = priv->channels.params; priv->channels.params = new_channels.params; err = mlx5e_num_channels_changed(priv); + if (err) + priv->channels.params = old_params; goto out; }
From: Maor Dickman maord@nvidia.com
[ Upstream commit 385d40b042e60aa0b677d7b400a0fefb44bcbaf4 ]
The cited change added offload support for Geneve options without verifying the validity of the options masks, this caused offload of rules with match on Geneve options with class,type and data masks which are zero to fail.
Fix by ignoring the match on Geneve options in case option masks are all zero.
Fixes: 9272e3df3023 ("net/mlx5e: Geneve, Add support for encap/decap flows offload") Signed-off-by: Maor Dickman maord@nvidia.com Reviewed-by: Roi Dayan roid@nvidia.com Reviewed-by: Oz Shlomo ozsh@nvidia.com Reviewed-by: Yevgeny Kliteynik kliteyn@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_geneve.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_geneve.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_geneve.c index e472ed0eacfb..7ed3f9f79f11 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_geneve.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_geneve.c @@ -227,6 +227,10 @@ static int mlx5e_tc_tun_parse_geneve_options(struct mlx5e_priv *priv, option_key = (struct geneve_opt *)&enc_opts.key->data[0]; option_mask = (struct geneve_opt *)&enc_opts.mask->data[0];
+ if (option_mask->opt_class == 0 && option_mask->type == 0 && + !memchr_inv(option_mask->opt_data, 0, option_mask->length * 4)) + return 0; + if (option_key->length > max_tlv_option_data_len) { NL_SET_ERR_MSG_MOD(extack, "Matching on GENEVE options: unsupported option len");
From: Parav Pandit parav@nvidia.com
[ Upstream commit 8b90d897823b28a51811931f3bdc79f8df79407e ]
do_div() returns reminder, while cited patch wanted to use quotient. Fix it by using quotient.
Fixes: 0e22bfb7c046 ("net/mlx5e: E-switch, Fix rate calculation for overflow") Signed-off-by: Parav Pandit parav@nvidia.com Signed-off-by: Maor Dickman maord@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c index 717fbaa6ce73..e9b7da05f14a 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -5040,7 +5040,8 @@ static int apply_police_params(struct mlx5e_priv *priv, u64 rate, */ if (rate) { rate = (rate * BITS_PER_BYTE) + 500000; - rate_mbps = max_t(u64, do_div(rate, 1000000), 1); + do_div(rate, 1000000); + rate_mbps = max_t(u32, rate, 1); }
err = mlx5_esw_modify_vport_rate(esw, vport_num, rate_mbps);
From: Wei Wang weiwan@google.com
[ Upstream commit 28259bac7f1dde06d8ba324e222bbec9d4e92f2b ]
Syzbot reported the suspecious RCU usage in nexthop_fib6_nh() when called from ipv6_route_seq_show(). The reason is ipv6_route_seq_start() calls rcu_read_lock_bh(), while nexthop_fib6_nh() calls rcu_dereference_rtnl(). The fix proposed is to add a variant of nexthop_fib6_nh() to use rcu_dereference_bh_rtnl() for ipv6_route_seq_show().
The reported trace is as follows: ./include/net/nexthop.h:416 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1 2 locks held by syz-executor.0/17895: at: seq_read+0x71/0x12a0 fs/seq_file.c:169 at: seq_file_net include/linux/seq_file_net.h:19 [inline] at: ipv6_route_seq_start+0xaf/0x300 net/ipv6/ip6_fib.c:2616
stack backtrace: CPU: 1 PID: 17895 Comm: syz-executor.0 Not tainted 4.15.0-syzkaller #0 Call Trace: [<ffffffff849edf9e>] __dump_stack lib/dump_stack.c:17 [inline] [<ffffffff849edf9e>] dump_stack+0xd8/0x147 lib/dump_stack.c:53 [<ffffffff8480b7fa>] lockdep_rcu_suspicious+0x153/0x15d kernel/locking/lockdep.c:5745 [<ffffffff8459ada6>] nexthop_fib6_nh include/net/nexthop.h:416 [inline] [<ffffffff8459ada6>] ipv6_route_native_seq_show net/ipv6/ip6_fib.c:2488 [inline] [<ffffffff8459ada6>] ipv6_route_seq_show+0x436/0x7a0 net/ipv6/ip6_fib.c:2673 [<ffffffff81c556df>] seq_read+0xccf/0x12a0 fs/seq_file.c:276 [<ffffffff81dbc62c>] proc_reg_read+0x10c/0x1d0 fs/proc/inode.c:231 [<ffffffff81bc28ae>] do_loop_readv_writev fs/read_write.c:714 [inline] [<ffffffff81bc28ae>] do_loop_readv_writev fs/read_write.c:701 [inline] [<ffffffff81bc28ae>] do_iter_read+0x49e/0x660 fs/read_write.c:935 [<ffffffff81bc81ab>] vfs_readv+0xfb/0x170 fs/read_write.c:997 [<ffffffff81c88847>] kernel_readv fs/splice.c:361 [inline] [<ffffffff81c88847>] default_file_splice_read+0x487/0x9c0 fs/splice.c:416 [<ffffffff81c86189>] do_splice_to+0x129/0x190 fs/splice.c:879 [<ffffffff81c86f66>] splice_direct_to_actor+0x256/0x890 fs/splice.c:951 [<ffffffff81c8777d>] do_splice_direct+0x1dd/0x2b0 fs/splice.c:1060 [<ffffffff81bc4747>] do_sendfile+0x597/0xce0 fs/read_write.c:1459 [<ffffffff81bca205>] SYSC_sendfile64 fs/read_write.c:1520 [inline] [<ffffffff81bca205>] SyS_sendfile64+0x155/0x170 fs/read_write.c:1506 [<ffffffff81015fcf>] do_syscall_64+0x1ff/0x310 arch/x86/entry/common.c:305 [<ffffffff84a00076>] entry_SYSCALL_64_after_hwframe+0x42/0xb7
Fixes: f88d8ea67fbdb ("ipv6: Plumb support for nexthop object in a fib6_info") Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: Wei Wang weiwan@google.com Cc: David Ahern dsahern@kernel.org Cc: Ido Schimmel idosch@idosch.org Cc: Petr Machata petrm@nvidia.com Cc: Eric Dumazet edumazet@google.com Reviewed-by: Ido Schimmel idosch@nvidia.com Reviewed-by: David Ahern dsahern@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/nexthop.h | 24 ++++++++++++++++++++++++ net/ipv6/ip6_fib.c | 2 +- 2 files changed, 25 insertions(+), 1 deletion(-)
diff --git a/include/net/nexthop.h b/include/net/nexthop.h index 226930d66b63..abd620103cec 100644 --- a/include/net/nexthop.h +++ b/include/net/nexthop.h @@ -400,6 +400,7 @@ static inline struct fib_nh *fib_info_nh(struct fib_info *fi, int nhsel) int fib6_check_nexthop(struct nexthop *nh, struct fib6_config *cfg, struct netlink_ext_ack *extack);
+/* Caller should either hold rcu_read_lock(), or RTNL. */ static inline struct fib6_nh *nexthop_fib6_nh(struct nexthop *nh) { struct nh_info *nhi; @@ -420,6 +421,29 @@ static inline struct fib6_nh *nexthop_fib6_nh(struct nexthop *nh) return NULL; }
+/* Variant of nexthop_fib6_nh(). + * Caller should either hold rcu_read_lock_bh(), or RTNL. + */ +static inline struct fib6_nh *nexthop_fib6_nh_bh(struct nexthop *nh) +{ + struct nh_info *nhi; + + if (nh->is_group) { + struct nh_group *nh_grp; + + nh_grp = rcu_dereference_bh_rtnl(nh->nh_grp); + nh = nexthop_mpath_select(nh_grp, 0); + if (!nh) + return NULL; + } + + nhi = rcu_dereference_bh_rtnl(nh->nh_info); + if (nhi->family == AF_INET6) + return &nhi->fib6_nh; + + return NULL; +} + static inline struct net_device *fib6_info_nh_dev(struct fib6_info *f6i) { struct fib6_nh *fib6_nh; diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index f43e27555725..1fb79dbde0cb 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -2485,7 +2485,7 @@ static int ipv6_route_native_seq_show(struct seq_file *seq, void *v) const struct net_device *dev;
if (rt->nh) - fib6_nh = nexthop_fib6_nh(rt->nh); + fib6_nh = nexthop_fib6_nh_bh(rt->nh);
seq_printf(seq, "%pi6 %02x ", &rt->fib6_dst.addr, rt->fib6_dst.plen);
From: Ido Schimmel idosch@nvidia.com
[ Upstream commit 9398e9c0b1d44eeb700e9e766c02bcc765c82570 ]
In the rare case that drop_monitor fails to register its probe on the 'napi_poll' tracepoint, it will not deactivate its hysteresis timer as part of the error path. If the hysteresis timer was armed by the shortly lived 'kfree_skb' probe and user space retries to initiate tracing, a warning will be emitted for trying to initialize an active object [1].
Fix this by properly undoing all the operations that were done prior to probe registration, in both software and hardware code paths.
Note that syzkaller managed to fail probe registration by injecting a slab allocation failure [2].
[1] ODEBUG: init active (active state 0) object type: timer_list hint: sched_send_work+0x0/0x60 include/linux/list.h:135 WARNING: CPU: 1 PID: 8649 at lib/debugobjects.c:505 debug_print_object+0x16e/0x250 lib/debugobjects.c:505 Modules linked in: CPU: 1 PID: 8649 Comm: syz-executor.0 Not tainted 5.11.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:debug_print_object+0x16e/0x250 lib/debugobjects.c:505 [...] Call Trace: __debug_object_init+0x524/0xd10 lib/debugobjects.c:588 debug_timer_init kernel/time/timer.c:722 [inline] debug_init kernel/time/timer.c:770 [inline] init_timer_key+0x2d/0x340 kernel/time/timer.c:814 net_dm_trace_on_set net/core/drop_monitor.c:1111 [inline] set_all_monitor_traces net/core/drop_monitor.c:1188 [inline] net_dm_monitor_start net/core/drop_monitor.c:1295 [inline] net_dm_cmd_trace+0x720/0x1220 net/core/drop_monitor.c:1339 genl_family_rcv_msg_doit+0x228/0x320 net/netlink/genetlink.c:739 genl_family_rcv_msg net/netlink/genetlink.c:783 [inline] genl_rcv_msg+0x328/0x580 net/netlink/genetlink.c:800 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502 genl_rcv+0x24/0x40 net/netlink/genetlink.c:811 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:672 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2348 ___sys_sendmsg+0xf3/0x170 net/socket.c:2402 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2435 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xae
[2] FAULT_INJECTION: forcing a failure. name failslab, interval 1, probability 0, space 0, times 1 CPU: 1 PID: 8645 Comm: syz-executor.0 Not tainted 5.11.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: dump_stack+0xfa/0x151 should_fail.cold+0x5/0xa should_failslab+0x5/0x10 __kmalloc+0x72/0x3f0 tracepoint_add_func+0x378/0x990 tracepoint_probe_register+0x9c/0xe0 net_dm_cmd_trace+0x7fc/0x1220 genl_family_rcv_msg_doit+0x228/0x320 genl_rcv_msg+0x328/0x580 netlink_rcv_skb+0x153/0x420 genl_rcv+0x24/0x40 netlink_unicast+0x533/0x7d0 netlink_sendmsg+0x856/0xd90 sock_sendmsg+0xcf/0x120 ____sys_sendmsg+0x6e8/0x810 ___sys_sendmsg+0xf3/0x170 __sys_sendmsg+0xe5/0x1b0 do_syscall_64+0x2d/0x70 entry_SYSCALL_64_after_hwframe+0x44/0xae
Fixes: 70c69274f354 ("drop_monitor: Initialize timer and work item upon tracing enable") Fixes: 8ee2267ad33e ("drop_monitor: Convert to using devlink tracepoint") Reported-by: syzbot+779559d6503f3a56213d@syzkaller.appspotmail.com Signed-off-by: Ido Schimmel idosch@nvidia.com Reviewed-by: Jiri Pirko jiri@nvidia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/drop_monitor.c | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+)
diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c index 571f191c06d9..db65ce62b625 100644 --- a/net/core/drop_monitor.c +++ b/net/core/drop_monitor.c @@ -1053,6 +1053,20 @@ static int net_dm_hw_monitor_start(struct netlink_ext_ack *extack) return 0;
err_module_put: + for_each_possible_cpu(cpu) { + struct per_cpu_dm_data *hw_data = &per_cpu(dm_hw_cpu_data, cpu); + struct sk_buff *skb; + + del_timer_sync(&hw_data->send_timer); + cancel_work_sync(&hw_data->dm_alert_work); + while ((skb = __skb_dequeue(&hw_data->drop_queue))) { + struct devlink_trap_metadata *hw_metadata; + + hw_metadata = NET_DM_SKB_CB(skb)->hw_metadata; + net_dm_hw_metadata_free(hw_metadata); + consume_skb(skb); + } + } module_put(THIS_MODULE); return rc; } @@ -1134,6 +1148,15 @@ static int net_dm_trace_on_set(struct netlink_ext_ack *extack) err_unregister_trace: unregister_trace_kfree_skb(ops->kfree_skb_probe, NULL); err_module_put: + for_each_possible_cpu(cpu) { + struct per_cpu_dm_data *data = &per_cpu(dm_cpu_data, cpu); + struct sk_buff *skb; + + del_timer_sync(&data->send_timer); + cancel_work_sync(&data->dm_alert_work); + while ((skb = __skb_dequeue(&data->drop_queue))) + consume_skb(skb); + } module_put(THIS_MODULE); return rc; }
From: Eric Dumazet edumazet@google.com
[ Upstream commit dd4fa1dae9f4847cc1fd78ca468ad69e16e5db3e ]
macvlan_count_rx() can be called from process context, it is thus necessary to disable preemption before calling u64_stats_update_begin()
syzbot was able to spot this on 32bit arch:
WARNING: CPU: 1 PID: 4632 at include/linux/seqlock.h:271 __seqprop_assert include/linux/seqlock.h:271 [inline] WARNING: CPU: 1 PID: 4632 at include/linux/seqlock.h:271 __seqprop_assert.constprop.0+0xf0/0x11c include/linux/seqlock.h:269 Modules linked in: Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 4632 Comm: kworker/1:3 Not tainted 5.12.0-rc2-syzkaller #0 Hardware name: ARM-Versatile Express Workqueue: events macvlan_process_broadcast Backtrace: [<82740468>] (dump_backtrace) from [<827406dc>] (show_stack+0x18/0x1c arch/arm/kernel/traps.c:252) r7:00000080 r6:60000093 r5:00000000 r4:8422a3c4 [<827406c4>] (show_stack) from [<82751b58>] (__dump_stack lib/dump_stack.c:79 [inline]) [<827406c4>] (show_stack) from [<82751b58>] (dump_stack+0xb8/0xe8 lib/dump_stack.c:120) [<82751aa0>] (dump_stack) from [<82741270>] (panic+0x130/0x378 kernel/panic.c:231) r7:830209b4 r6:84069ea4 r5:00000000 r4:844350d0 [<82741140>] (panic) from [<80244924>] (__warn+0xb0/0x164 kernel/panic.c:605) r3:8404ec8c r2:00000000 r1:00000000 r0:830209b4 r7:0000010f [<80244874>] (__warn) from [<82741520>] (warn_slowpath_fmt+0x68/0xd4 kernel/panic.c:628) r7:81363f70 r6:0000010f r5:83018e50 r4:00000000 [<827414bc>] (warn_slowpath_fmt) from [<81363f70>] (__seqprop_assert include/linux/seqlock.h:271 [inline]) [<827414bc>] (warn_slowpath_fmt) from [<81363f70>] (__seqprop_assert.constprop.0+0xf0/0x11c include/linux/seqlock.h:269) r8:5a109000 r7:0000000f r6:a568dac0 r5:89802300 r4:00000001 [<81363e80>] (__seqprop_assert.constprop.0) from [<81364af0>] (u64_stats_update_begin include/linux/u64_stats_sync.h:128 [inline]) [<81363e80>] (__seqprop_assert.constprop.0) from [<81364af0>] (macvlan_count_rx include/linux/if_macvlan.h:47 [inline]) [<81363e80>] (__seqprop_assert.constprop.0) from [<81364af0>] (macvlan_broadcast+0x154/0x26c drivers/net/macvlan.c:291) r5:89802300 r4:8a927740 [<8136499c>] (macvlan_broadcast) from [<81365020>] (macvlan_process_broadcast+0x258/0x2d0 drivers/net/macvlan.c:317) r10:81364f78 r9:8a86d000 r8:8a9c7e7c r7:8413aa5c r6:00000000 r5:00000000 r4:89802840 [<81364dc8>] (macvlan_process_broadcast) from [<802696a4>] (process_one_work+0x2d4/0x998 kernel/workqueue.c:2275) r10:00000008 r9:8404ec98 r8:84367a02 r7:ddfe6400 r6:ddfe2d40 r5:898dac80 r4:8a86d43c [<802693d0>] (process_one_work) from [<80269dcc>] (worker_thread+0x64/0x54c kernel/workqueue.c:2421) r10:00000008 r9:8a9c6000 r8:84006d00 r7:ddfe2d78 r6:898dac94 r5:ddfe2d40 r4:898dac80 [<80269d68>] (worker_thread) from [<80271f40>] (kthread+0x184/0x1a4 kernel/kthread.c:292) r10:85247e64 r9:898dac80 r8:80269d68 r7:00000000 r6:8a9c6000 r5:89a2ee40 r4:8a97bd00 [<80271dbc>] (kthread) from [<80200114>] (ret_from_fork+0x14/0x20 arch/arm/kernel/entry-common.S:158) Exception stack(0x8a9c7fb0 to 0x8a9c7ff8)
Fixes: 412ca1550cbe ("macvlan: Move broadcasts into a work queue") Signed-off-by: Eric Dumazet edumazet@google.com Cc: Herbert Xu herbert@gondor.apana.org.au Reported-by: syzbot syzkaller@googlegroups.com Acked-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/if_macvlan.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/linux/if_macvlan.h b/include/linux/if_macvlan.h index 96556c64c95d..10c94a3936ca 100644 --- a/include/linux/if_macvlan.h +++ b/include/linux/if_macvlan.h @@ -43,13 +43,14 @@ static inline void macvlan_count_rx(const struct macvlan_dev *vlan, if (likely(success)) { struct vlan_pcpu_stats *pcpu_stats;
- pcpu_stats = this_cpu_ptr(vlan->pcpu_stats); + pcpu_stats = get_cpu_ptr(vlan->pcpu_stats); u64_stats_update_begin(&pcpu_stats->syncp); pcpu_stats->rx_packets++; pcpu_stats->rx_bytes += len; if (multicast) pcpu_stats->rx_multicast++; u64_stats_update_end(&pcpu_stats->syncp); + put_cpu_ptr(vlan->pcpu_stats); } else { this_cpu_inc(vlan->pcpu_stats->rx_errors); }
From: Eric Dumazet edumazet@google.com
[ Upstream commit e323d865b36134e8c5c82c834df89109a5c60dab ]
iproute2 package is well behaved, but malicious user space can provide illegal shift values and trigger UBSAN reports.
Add stab parameter to red_check_params() to validate user input.
syzbot reported:
UBSAN: shift-out-of-bounds in ./include/net/red.h:312:18 shift exponent 111 is too large for 64-bit type 'long unsigned int' CPU: 1 PID: 14662 Comm: syz-executor.3 Not tainted 5.12.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x141/0x1d7 lib/dump_stack.c:120 ubsan_epilogue+0xb/0x5a lib/ubsan.c:148 __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:327 red_calc_qavg_from_idle_time include/net/red.h:312 [inline] red_calc_qavg include/net/red.h:353 [inline] choke_enqueue.cold+0x18/0x3dd net/sched/sch_choke.c:221 __dev_xmit_skb net/core/dev.c:3837 [inline] __dev_queue_xmit+0x1943/0x2e00 net/core/dev.c:4150 neigh_hh_output include/net/neighbour.h:499 [inline] neigh_output include/net/neighbour.h:508 [inline] ip6_finish_output2+0x911/0x1700 net/ipv6/ip6_output.c:117 __ip6_finish_output net/ipv6/ip6_output.c:182 [inline] __ip6_finish_output+0x4c1/0xe10 net/ipv6/ip6_output.c:161 ip6_finish_output+0x35/0x200 net/ipv6/ip6_output.c:192 NF_HOOK_COND include/linux/netfilter.h:290 [inline] ip6_output+0x1e4/0x530 net/ipv6/ip6_output.c:215 dst_output include/net/dst.h:448 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] NF_HOOK include/linux/netfilter.h:295 [inline] ip6_xmit+0x127e/0x1eb0 net/ipv6/ip6_output.c:320 inet6_csk_xmit+0x358/0x630 net/ipv6/inet6_connection_sock.c:135 dccp_transmit_skb+0x973/0x12c0 net/dccp/output.c:138 dccp_send_reset+0x21b/0x2b0 net/dccp/output.c:535 dccp_finish_passive_close net/dccp/proto.c:123 [inline] dccp_finish_passive_close+0xed/0x140 net/dccp/proto.c:118 dccp_terminate_connection net/dccp/proto.c:958 [inline] dccp_close+0xb3c/0xe60 net/dccp/proto.c:1028 inet_release+0x12e/0x280 net/ipv4/af_inet.c:431 inet6_release+0x4c/0x70 net/ipv6/af_inet6.c:478 __sock_release+0xcd/0x280 net/socket.c:599 sock_close+0x18/0x20 net/socket.c:1258 __fput+0x288/0x920 fs/file_table.c:280 task_work_run+0xdd/0x1a0 kernel/task_work.c:140 tracehook_notify_resume include/linux/tracehook.h:189 [inline]
Fixes: 8afa10cbe281 ("net_sched: red: Avoid illegal values") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/red.h | 10 +++++++++- net/sched/sch_choke.c | 7 ++++--- net/sched/sch_gred.c | 2 +- net/sched/sch_red.c | 7 +++++-- net/sched/sch_sfq.c | 2 +- 5 files changed, 20 insertions(+), 8 deletions(-)
diff --git a/include/net/red.h b/include/net/red.h index 932f0d79d60c..9e6647c4ccd1 100644 --- a/include/net/red.h +++ b/include/net/red.h @@ -168,7 +168,8 @@ static inline void red_set_vars(struct red_vars *v) v->qcount = -1; }
-static inline bool red_check_params(u32 qth_min, u32 qth_max, u8 Wlog, u8 Scell_log) +static inline bool red_check_params(u32 qth_min, u32 qth_max, u8 Wlog, + u8 Scell_log, u8 *stab) { if (fls(qth_min) + Wlog > 32) return false; @@ -178,6 +179,13 @@ static inline bool red_check_params(u32 qth_min, u32 qth_max, u8 Wlog, u8 Scell_ return false; if (qth_max < qth_min) return false; + if (stab) { + int i; + + for (i = 0; i < RED_STAB_SIZE; i++) + if (stab[i] >= 32) + return false; + } return true; }
diff --git a/net/sched/sch_choke.c b/net/sched/sch_choke.c index 50f680f03a54..2adbd945bf15 100644 --- a/net/sched/sch_choke.c +++ b/net/sched/sch_choke.c @@ -345,6 +345,7 @@ static int choke_change(struct Qdisc *sch, struct nlattr *opt, struct sk_buff **old = NULL; unsigned int mask; u32 max_P; + u8 *stab;
if (opt == NULL) return -EINVAL; @@ -361,8 +362,8 @@ static int choke_change(struct Qdisc *sch, struct nlattr *opt, max_P = tb[TCA_CHOKE_MAX_P] ? nla_get_u32(tb[TCA_CHOKE_MAX_P]) : 0;
ctl = nla_data(tb[TCA_CHOKE_PARMS]); - - if (!red_check_params(ctl->qth_min, ctl->qth_max, ctl->Wlog, ctl->Scell_log)) + stab = nla_data(tb[TCA_CHOKE_STAB]); + if (!red_check_params(ctl->qth_min, ctl->qth_max, ctl->Wlog, ctl->Scell_log, stab)) return -EINVAL;
if (ctl->limit > CHOKE_MAX_QUEUE) @@ -412,7 +413,7 @@ static int choke_change(struct Qdisc *sch, struct nlattr *opt,
red_set_parms(&q->parms, ctl->qth_min, ctl->qth_max, ctl->Wlog, ctl->Plog, ctl->Scell_log, - nla_data(tb[TCA_CHOKE_STAB]), + stab, max_P); red_set_vars(&q->vars);
diff --git a/net/sched/sch_gred.c b/net/sched/sch_gred.c index e0bc77533acc..f4132dc25ac0 100644 --- a/net/sched/sch_gred.c +++ b/net/sched/sch_gred.c @@ -480,7 +480,7 @@ static inline int gred_change_vq(struct Qdisc *sch, int dp, struct gred_sched *table = qdisc_priv(sch); struct gred_sched_data *q = table->tab[dp];
- if (!red_check_params(ctl->qth_min, ctl->qth_max, ctl->Wlog, ctl->Scell_log)) { + if (!red_check_params(ctl->qth_min, ctl->qth_max, ctl->Wlog, ctl->Scell_log, stab)) { NL_SET_ERR_MSG_MOD(extack, "invalid RED parameters"); return -EINVAL; } diff --git a/net/sched/sch_red.c b/net/sched/sch_red.c index b4ae34d7aa96..40adf1f07a82 100644 --- a/net/sched/sch_red.c +++ b/net/sched/sch_red.c @@ -242,6 +242,7 @@ static int __red_change(struct Qdisc *sch, struct nlattr **tb, unsigned char flags; int err; u32 max_P; + u8 *stab;
if (tb[TCA_RED_PARMS] == NULL || tb[TCA_RED_STAB] == NULL) @@ -250,7 +251,9 @@ static int __red_change(struct Qdisc *sch, struct nlattr **tb, max_P = tb[TCA_RED_MAX_P] ? nla_get_u32(tb[TCA_RED_MAX_P]) : 0;
ctl = nla_data(tb[TCA_RED_PARMS]); - if (!red_check_params(ctl->qth_min, ctl->qth_max, ctl->Wlog, ctl->Scell_log)) + stab = nla_data(tb[TCA_RED_STAB]); + if (!red_check_params(ctl->qth_min, ctl->qth_max, ctl->Wlog, + ctl->Scell_log, stab)) return -EINVAL;
err = red_get_flags(ctl->flags, TC_RED_HISTORIC_FLAGS, @@ -288,7 +291,7 @@ static int __red_change(struct Qdisc *sch, struct nlattr **tb, red_set_parms(&q->parms, ctl->qth_min, ctl->qth_max, ctl->Wlog, ctl->Plog, ctl->Scell_log, - nla_data(tb[TCA_RED_STAB]), + stab, max_P); red_set_vars(&q->vars);
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c index b25e51440623..066754a18569 100644 --- a/net/sched/sch_sfq.c +++ b/net/sched/sch_sfq.c @@ -647,7 +647,7 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt) }
if (ctl_v1 && !red_check_params(ctl_v1->qth_min, ctl_v1->qth_max, - ctl_v1->Wlog, ctl_v1->Scell_log)) + ctl_v1->Wlog, ctl_v1->Scell_log, NULL)) return -EINVAL; if (ctl_v1 && ctl_v1->qth_min) { p = kmalloc(sizeof(*p), GFP_KERNEL);
From: Florian Fainelli f.fainelli@gmail.com
[ Upstream commit 47142ed6c34d544ae9f0463e58d482289cbe0d46 ]
Similar to commit 92696286f3bb37ba50e4bd8d1beb24afb759a799 ("net: bcmgenet: Set phydev->dev_flags only for internal PHYs") we need to qualify the phydev->dev_flags based on whether the port is connected to an internal or external PHY otherwise we risk having a flags collision with a completely different interpretation depending on the driver.
Fixes: aa9aef77c761 ("net: dsa: bcm_sf2: communicate integrated PHY revision to PHY driver") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/bcm_sf2.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c index edb0a1027b38..510324916e91 100644 --- a/drivers/net/dsa/bcm_sf2.c +++ b/drivers/net/dsa/bcm_sf2.c @@ -584,8 +584,10 @@ static u32 bcm_sf2_sw_get_phy_flags(struct dsa_switch *ds, int port) * in bits 15:8 and the patch level in bits 7:0 which is exactly what * the REG_PHY_REVISION register layout is. */ - - return priv->hw_params.gphy_rev; + if (priv->int_phy_mask & BIT(port)) + return priv->hw_params.gphy_rev; + else + return 0; }
static void bcm_sf2_sw_validate(struct dsa_switch *ds, int port,
From: Sasha Neftin sasha.neftin@intel.com
[ Upstream commit 6da262378c99b17b1a1ac2e42aa65acc1bd471c7 ]
This commit applies to the igc_reset_task the same changes that were applied to the igb driver in commit 024a8168b749 ("igb: reinit_locked() should be called with rtnl_lock") and fix possible race in reset subtask.
Fixes: 0507ef8a0372 ("igc: Add transmit and receive fastpath and interrupt handlers") Suggested-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Neftin sasha.neftin@intel.com Tested-by: Dvora Fuxbrumer dvorax.fuxbrumer@linux.intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igc/igc_main.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index afd6a62da29d..93874e930abf 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -3847,10 +3847,19 @@ static void igc_reset_task(struct work_struct *work)
adapter = container_of(work, struct igc_adapter, reset_task);
+ rtnl_lock(); + /* If we're already down or resetting, just bail */ + if (test_bit(__IGC_DOWN, &adapter->state) || + test_bit(__IGC_RESETTING, &adapter->state)) { + rtnl_unlock(); + return; + } + igc_rings_dump(adapter); igc_regs_dump(adapter); netdev_err(adapter->netdev, "Reset adapter\n"); igc_reinit_locked(adapter); + rtnl_unlock(); }
/**
From: Muhammad Husaini Zulkifli muhammad.husaini.zulkifli@intel.com
[ Upstream commit 8876529465c368beafd51a70f79d7a738f2aadf4 ]
Fix Pause Frame Advertising when getting the advertisement via ethtool. Remove setting the "advertising" bit in link_ksettings during default case when Tx and Rx are in off state with Auto Negotiate off.
Below is the original output of advertisement link during Tx and Rx off: Advertised pause frame use: Symmetric Receive-only
Expected output: Advertised pause frame use: No
Fixes: 8c5ad0dae93c ("igc: Add ethtool support") Signed-off-by: Muhammad Husaini Zulkifli muhammad.husaini.zulkifli@intel.com Reviewed-by: Malli C mallikarjuna.chilakala@intel.com Acked-by: Sasha Neftin sasha.neftin@intel.com Tested-by: Dvora Fuxbrumer dvorax.fuxbrumer@linux.intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igc/igc_ethtool.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/intel/igc/igc_ethtool.c b/drivers/net/ethernet/intel/igc/igc_ethtool.c index ec8cd69d4992..35c104a02bed 100644 --- a/drivers/net/ethernet/intel/igc/igc_ethtool.c +++ b/drivers/net/ethernet/intel/igc/igc_ethtool.c @@ -1709,9 +1709,7 @@ static int igc_ethtool_get_link_ksettings(struct net_device *netdev, Asym_Pause); break; default: - ethtool_link_ksettings_add_link_mode(cmd, advertising, Pause); - ethtool_link_ksettings_add_link_mode(cmd, advertising, - Asym_Pause); + break; }
status = pm_runtime_suspended(&adapter->pdev->dev) ?
From: Muhammad Husaini Zulkifli muhammad.husaini.zulkifli@intel.com
[ Upstream commit 9a4a1cdc5ab52118c1f2b216f4240830b6528d32 ]
The Supported Pause Frame always display "No" even though the Advertised pause frame showing the correct setting based on the pause parameters via ethtool. Set bit in link_ksettings to "Supported" for Pause Frame.
Before output: Supported pause frame use: No
Expected output: Supported pause frame use: Symmetric
Fixes: 8c5ad0dae93c ("igc: Add ethtool support") Signed-off-by: Muhammad Husaini Zulkifli muhammad.husaini.zulkifli@intel.com Reviewed-by: Malli C mallikarjuna.chilakala@intel.com Tested-by: Dvora Fuxbrumer dvorax.fuxbrumer@linux.intel.com Acked-by: Sasha Neftin sasha.neftin@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igc/igc_ethtool.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/intel/igc/igc_ethtool.c b/drivers/net/ethernet/intel/igc/igc_ethtool.c index 35c104a02bed..da259cd59add 100644 --- a/drivers/net/ethernet/intel/igc/igc_ethtool.c +++ b/drivers/net/ethernet/intel/igc/igc_ethtool.c @@ -1695,6 +1695,9 @@ static int igc_ethtool_get_link_ksettings(struct net_device *netdev, Autoneg); }
+ /* Set pause flow control settings */ + ethtool_link_ksettings_add_link_mode(cmd, supported, Pause); + switch (hw->fc.requested_mode) { case igc_fc_full: ethtool_link_ksettings_add_link_mode(cmd, advertising, Pause);
From: Andre Guedes andre.guedes@intel.com
[ Upstream commit fc9e5020971d57d7d0b3fef9e2ab2108fcb5588b ]
The comment describing the timestamps layout in the packet buffer is wrong and the code is actually retrieving the timestamp in Timer 1 reference instead of Timer 0. This hasn't been a big issue so far because hardware is configured to report both timestamps using Timer 0 (see IGC_SRRCTL register configuration in igc_ptp_enable_rx_timestamp() helper). This patch fixes the comment and the code so we retrieve the timestamp in Timer 0 reference as expected.
This patch also takes the opportunity to get rid of the hw.mac.type check since it is not required.
Fixes: 81b055205e8ba ("igc: Add support for RX timestamping") Signed-off-by: Andre Guedes andre.guedes@intel.com Signed-off-by: Vedang Patel vedang.patel@intel.com Signed-off-by: Jithu Joseph jithu.joseph@intel.com Reviewed-by: Maciej Fijalkowski maciej.fijalkowski@intel.com Tested-by: Dvora Fuxbrumer dvorax.fuxbrumer@linux.intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igc/igc.h | 2 +- drivers/net/ethernet/intel/igc/igc_ptp.c | 72 +++++++++++++----------- 2 files changed, 41 insertions(+), 33 deletions(-)
diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h index 35baae900c1f..6dca67d9c25d 100644 --- a/drivers/net/ethernet/intel/igc/igc.h +++ b/drivers/net/ethernet/intel/igc/igc.h @@ -545,7 +545,7 @@ void igc_ptp_init(struct igc_adapter *adapter); void igc_ptp_reset(struct igc_adapter *adapter); void igc_ptp_suspend(struct igc_adapter *adapter); void igc_ptp_stop(struct igc_adapter *adapter); -void igc_ptp_rx_pktstamp(struct igc_q_vector *q_vector, void *va, +void igc_ptp_rx_pktstamp(struct igc_q_vector *q_vector, __le32 *va, struct sk_buff *skb); int igc_ptp_set_ts_config(struct net_device *netdev, struct ifreq *ifr); int igc_ptp_get_ts_config(struct net_device *netdev, struct ifreq *ifr); diff --git a/drivers/net/ethernet/intel/igc/igc_ptp.c b/drivers/net/ethernet/intel/igc/igc_ptp.c index ac0b9c85da7c..545f4d0e67cf 100644 --- a/drivers/net/ethernet/intel/igc/igc_ptp.c +++ b/drivers/net/ethernet/intel/igc/igc_ptp.c @@ -152,46 +152,54 @@ static void igc_ptp_systim_to_hwtstamp(struct igc_adapter *adapter, }
/** - * igc_ptp_rx_pktstamp - retrieve Rx per packet timestamp + * igc_ptp_rx_pktstamp - Retrieve timestamp from Rx packet buffer * @q_vector: Pointer to interrupt specific structure * @va: Pointer to address containing Rx buffer * @skb: Buffer containing timestamp and packet * - * This function is meant to retrieve the first timestamp from the - * first buffer of an incoming frame. The value is stored in little - * endian format starting on byte 0. There's a second timestamp - * starting on byte 8. - **/ -void igc_ptp_rx_pktstamp(struct igc_q_vector *q_vector, void *va, + * This function retrieves the timestamp saved in the beginning of packet + * buffer. While two timestamps are available, one in timer0 reference and the + * other in timer1 reference, this function considers only the timestamp in + * timer0 reference. + */ +void igc_ptp_rx_pktstamp(struct igc_q_vector *q_vector, __le32 *va, struct sk_buff *skb) { struct igc_adapter *adapter = q_vector->adapter; - __le64 *regval = (__le64 *)va; - int adjust = 0; - - /* The timestamp is recorded in little endian format. - * DWORD: | 0 | 1 | 2 | 3 - * Field: | Timer0 Low | Timer0 High | Timer1 Low | Timer1 High + u64 regval; + int adjust; + + /* Timestamps are saved in little endian at the beginning of the packet + * buffer following the layout: + * + * DWORD: | 0 | 1 | 2 | 3 | + * Field: | Timer1 SYSTIML | Timer1 SYSTIMH | Timer0 SYSTIML | Timer0 SYSTIMH | + * + * SYSTIML holds the nanoseconds part while SYSTIMH holds the seconds + * part of the timestamp. */ - igc_ptp_systim_to_hwtstamp(adapter, skb_hwtstamps(skb), - le64_to_cpu(regval[0])); - - /* adjust timestamp for the RX latency based on link speed */ - if (adapter->hw.mac.type == igc_i225) { - switch (adapter->link_speed) { - case SPEED_10: - adjust = IGC_I225_RX_LATENCY_10; - break; - case SPEED_100: - adjust = IGC_I225_RX_LATENCY_100; - break; - case SPEED_1000: - adjust = IGC_I225_RX_LATENCY_1000; - break; - case SPEED_2500: - adjust = IGC_I225_RX_LATENCY_2500; - break; - } + regval = le32_to_cpu(va[2]); + regval |= (u64)le32_to_cpu(va[3]) << 32; + igc_ptp_systim_to_hwtstamp(adapter, skb_hwtstamps(skb), regval); + + /* Adjust timestamp for the RX latency based on link speed */ + switch (adapter->link_speed) { + case SPEED_10: + adjust = IGC_I225_RX_LATENCY_10; + break; + case SPEED_100: + adjust = IGC_I225_RX_LATENCY_100; + break; + case SPEED_1000: + adjust = IGC_I225_RX_LATENCY_1000; + break; + case SPEED_2500: + adjust = IGC_I225_RX_LATENCY_2500; + break; + default: + adjust = 0; + netdev_warn_once(adapter->netdev, "Imprecise timestamp\n"); + break; } skb_hwtstamps(skb)->hwtstamp = ktime_sub_ns(skb_hwtstamps(skb)->hwtstamp, adjust);
From: Vitaly Lifshits vitaly.lifshits@intel.com
[ Upstream commit 21f857f0321d0d0ea9b1a758bd55dc63d1cb2437 ]
A possible race condition was found in e1000_reset_task, after discovering a similar issue in igb driver via commit 024a8168b749 ("igb: reinit_locked() should be called with rtnl_lock").
Added rtnl_lock() and rtnl_unlock() to avoid this.
Fixes: bc7f75fa9788 ("[E1000E]: New pci-express e1000 driver (currently for ICH9 devices only)") Suggested-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Vitaly Lifshits vitaly.lifshits@intel.com Tested-by: Dvora Fuxbrumer dvorax.fuxbrumer@linux.intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/e1000e/netdev.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c index e9b82c209c2d..a0948002ddf8 100644 --- a/drivers/net/ethernet/intel/e1000e/netdev.c +++ b/drivers/net/ethernet/intel/e1000e/netdev.c @@ -5974,15 +5974,19 @@ static void e1000_reset_task(struct work_struct *work) struct e1000_adapter *adapter; adapter = container_of(work, struct e1000_adapter, reset_task);
+ rtnl_lock(); /* don't run the task if already down */ - if (test_bit(__E1000_DOWN, &adapter->state)) + if (test_bit(__E1000_DOWN, &adapter->state)) { + rtnl_unlock(); return; + }
if (!(adapter->flags & FLAG_RESTART_NOW)) { e1000e_dump(adapter); e_err("Reset adapter unexpectedly\n"); } e1000e_reinit_locked(adapter); + rtnl_unlock(); }
/**
From: Dinghao Liu dinghao.liu@zju.edu.cn
[ Upstream commit b52912b8293f2c496f42583e65599aee606a0c18 ]
There is one e1e_wphy() call in e1000_set_d0_lplu_state_82571 that we have caught its return value but lack further handling. Check and terminate the execution flow just like other e1e_wphy() in this function.
Fixes: bc7f75fa9788 ("[E1000E]: New pci-express e1000 driver (currently for ICH9 devices only)") Signed-off-by: Dinghao Liu dinghao.liu@zju.edu.cn Acked-by: Sasha Neftin sasha.neftin@intel.com Tested-by: Dvora Fuxbrumer dvorax.fuxbrumer@linux.intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/e1000e/82571.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/intel/e1000e/82571.c b/drivers/net/ethernet/intel/e1000e/82571.c index 88faf05e23ba..0b1e890dd583 100644 --- a/drivers/net/ethernet/intel/e1000e/82571.c +++ b/drivers/net/ethernet/intel/e1000e/82571.c @@ -899,6 +899,8 @@ static s32 e1000_set_d0_lplu_state_82571(struct e1000_hw *hw, bool active) } else { data &= ~IGP02E1000_PM_D0_LPLU; ret_val = e1e_wphy(hw, IGP02E1000_PHY_POWER_MGMT, data); + if (ret_val) + return ret_val; /* LPLU and SmartSpeed are mutually exclusive. LPLU is used * during Dx states where the power conservation is most * important. During driver activity we should enable
From: David Gow davidgow@google.com
[ Upstream commit 7fd53f41f771d250eb08db08650940f017e37c26 ]
kunit_tool maintains a list of config options which are broken under UML, which we exclude from an otherwise 'make ARCH=um allyesconfig' build used to run all tests with the --alltests option.
Something in UML allyesconfig is causing segfaults when page poisining is enabled (and is poisoning with a non-zero value). Previously, this didn't occur, as allyesconfig enabled the CONFIG_PAGE_POISONING_ZERO option, which worked around the problem by zeroing memory. This option has since been removed, and memory is now poisoned with 0xAA, which triggers segfaults in many different codepaths, preventing UML from booting.
Note that we have to disable both CONFIG_PAGE_POISONING and CONFIG_DEBUG_PAGEALLOC, as the latter will 'select' the former on architectures (such as UML) which don't implement __kernel_map_pages().
Ideally, we'd fix this properly by tracking down the real root cause, but since this is breaking KUnit's --alltests feature, it's worth disabling there in the meantime so the kernel can boot to the point where tests can actually run.
Fixes: f289041ed4cf ("mm, page_poison: remove CONFIG_PAGE_POISONING_ZERO") Signed-off-by: David Gow davidgow@google.com Acked-by: Vlastimil Babka vbabka@suse.cz Reviewed-by: Brendan Higgins brendanhiggins@google.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/kunit/configs/broken_on_uml.config | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/tools/testing/kunit/configs/broken_on_uml.config b/tools/testing/kunit/configs/broken_on_uml.config index a7f0603d33f6..690870043ac0 100644 --- a/tools/testing/kunit/configs/broken_on_uml.config +++ b/tools/testing/kunit/configs/broken_on_uml.config @@ -40,3 +40,5 @@ # CONFIG_RESET_BRCMSTB_RESCAL is not set # CONFIG_RESET_INTEL_GW is not set # CONFIG_ADI_AXI_ADC is not set +# CONFIG_DEBUG_PAGEALLOC is not set +# CONFIG_PAGE_POISONING is not set
From: Lv Yunlong lyl2019@mail.ustc.edu.cn
[ Upstream commit db74623a3850db99cb9692fda9e836a56b74198d ]
In qlcnic_83xx_get_minidump_template, fw_dump->tmpl_hdr was freed by vfree(). But unfortunately, it is used when extended is true.
Fixes: 7061b2bdd620e ("qlogic: Deletion of unnecessary checks before two function calls") Signed-off-by: Lv Yunlong lyl2019@mail.ustc.edu.cn Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c index 7760a3394e93..7ecb3dfe30bd 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c @@ -1425,6 +1425,7 @@ void qlcnic_83xx_get_minidump_template(struct qlcnic_adapter *adapter)
if (fw_dump->tmpl_hdr == NULL || current_version > prev_version) { vfree(fw_dump->tmpl_hdr); + fw_dump->tmpl_hdr = NULL;
if (qlcnic_83xx_md_check_extended_dump_capability(adapter)) extended = !qlcnic_83xx_extend_md_capab(adapter); @@ -1443,6 +1444,8 @@ void qlcnic_83xx_get_minidump_template(struct qlcnic_adapter *adapter) struct qlcnic_83xx_dump_template_hdr *hdr;
hdr = fw_dump->tmpl_hdr; + if (!hdr) + return; hdr->drv_cap_mask = 0x1f; fw_dump->cap_mask = 0x1f; dev_info(&pdev->dev,
From: Florian Fainelli f.fainelli@gmail.com
[ Upstream commit 7a1468ba0e02eee24ae1353e8933793a27198e20 ]
Per the datasheet, when we clear the power down bit, the PHY remains in an internal reset state for 40us and then resume normal operation. Account for that delay to avoid any issues in the future if genphy_resume() changes.
Fixes: fe26821fa614 ("net: phy: broadcom: Wire suspend/resume for BCM54810") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/broadcom.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/net/phy/broadcom.c b/drivers/net/phy/broadcom.c index 8a4ec3222168..ec45a1608309 100644 --- a/drivers/net/phy/broadcom.c +++ b/drivers/net/phy/broadcom.c @@ -332,6 +332,11 @@ static int bcm54xx_resume(struct phy_device *phydev) if (ret < 0) return ret;
+ /* Upon exiting power down, the PHY remains in an internal reset state + * for 40us + */ + fsleep(40); + return bcm54xx_config_init(phydev); }
From: Magnus Karlsson magnus.karlsson@intel.com
[ Upstream commit ed0907e3bdcfc7fe1c1756a480451e757b207a69 ]
Fix the wrong napi work done reporting in the xsk path of the ice driver. The code in the main Rx processing loop was written to assume that the buffer allocation code returns true if all allocations where successful and false if not. In contrast with all other Intel NIC xsk drivers, the ice_alloc_rx_bufs_zc() has the inverted logic messing up the work done reporting in the napi loop.
This can be fixed either by inverting the return value from ice_alloc_rx_bufs_zc() in the function that uses this in an incorrect way, or by changing the return value of ice_alloc_rx_bufs_zc(). We chose the latter as it makes all the xsk allocation functions for Intel NICs behave in the same way. My guess is that it was this unexpected discrepancy that gave rise to this bug in the first place.
Fixes: 5bb0c4b5eb61 ("ice, xsk: Move Rx allocation out of while-loop") Reported-by: Maciej Fijalkowski maciej.fijalkowski@intel.com Signed-off-by: Magnus Karlsson magnus.karlsson@intel.com Tested-by: Kiran Bhandare kiranx.bhandare@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_base.c | 6 ++++-- drivers/net/ethernet/intel/ice/ice_xsk.c | 10 +++++----- 2 files changed, 9 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c index 3124a3bf519a..952e41a1e001 100644 --- a/drivers/net/ethernet/intel/ice/ice_base.c +++ b/drivers/net/ethernet/intel/ice/ice_base.c @@ -418,6 +418,8 @@ int ice_setup_rx_ctx(struct ice_ring *ring) writel(0, ring->tail);
if (ring->xsk_pool) { + bool ok; + if (!xsk_buff_can_alloc(ring->xsk_pool, num_bufs)) { dev_warn(dev, "XSK buffer pool does not provide enough addresses to fill %d buffers on Rx ring %d\n", num_bufs, ring->q_index); @@ -426,8 +428,8 @@ int ice_setup_rx_ctx(struct ice_ring *ring) return 0; }
- err = ice_alloc_rx_bufs_zc(ring, num_bufs); - if (err) + ok = ice_alloc_rx_bufs_zc(ring, num_bufs); + if (!ok) dev_info(dev, "Failed to allocate some buffers on XSK buffer pool enabled Rx ring %d (pf_q %d)\n", ring->q_index, pf_q); return 0; diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c index 1782146db644..69ee1a8e87ab 100644 --- a/drivers/net/ethernet/intel/ice/ice_xsk.c +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c @@ -408,18 +408,18 @@ xsk_pool_if_up: * This function allocates a number of Rx buffers from the fill ring * or the internal recycle mechanism and places them on the Rx ring. * - * Returns false if all allocations were successful, true if any fail. + * Returns true if all allocations were successful, false if any fail. */ bool ice_alloc_rx_bufs_zc(struct ice_ring *rx_ring, u16 count) { union ice_32b_rx_flex_desc *rx_desc; u16 ntu = rx_ring->next_to_use; struct ice_rx_buf *rx_buf; - bool ret = false; + bool ok = true; dma_addr_t dma;
if (!count) - return false; + return true;
rx_desc = ICE_RX_DESC(rx_ring, ntu); rx_buf = &rx_ring->rx_buf[ntu]; @@ -427,7 +427,7 @@ bool ice_alloc_rx_bufs_zc(struct ice_ring *rx_ring, u16 count) do { rx_buf->xdp = xsk_buff_alloc(rx_ring->xsk_pool); if (!rx_buf->xdp) { - ret = true; + ok = false; break; }
@@ -452,7 +452,7 @@ bool ice_alloc_rx_bufs_zc(struct ice_ring *rx_ring, u16 count) ice_release_rx_desc(rx_ring, ntu); }
- return ret; + return ok; }
/**
From: Dylan Hung dylan_hung@aspeedtech.com
[ Upstream commit 6897087323a2fde46df32917462750c069668b2f ]
The interrupt handler may set the flag to reset the mac in the future, but that flag is not cleared once the reset has occurred.
Fixes: 10cbd6407609 ("ftgmac100: Rework NAPI & interrupts handling") Signed-off-by: Dylan Hung dylan_hung@aspeedtech.com Acked-by: Benjamin Herrenschmidt benh@kernel.crashing.org Reviewed-by: Joel Stanley joel@jms.id.au Signed-off-by: Joel Stanley joel@jms.id.au Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/faraday/ftgmac100.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/faraday/ftgmac100.c b/drivers/net/ethernet/faraday/ftgmac100.c index 88bfe2107938..04421aec2dfd 100644 --- a/drivers/net/ethernet/faraday/ftgmac100.c +++ b/drivers/net/ethernet/faraday/ftgmac100.c @@ -1337,6 +1337,7 @@ static int ftgmac100_poll(struct napi_struct *napi, int budget) */ if (unlikely(priv->need_mac_restart)) { ftgmac100_start_hw(priv); + priv->need_mac_restart = false;
/* Re-enable "bad" interrupts */ iowrite32(FTGMAC100_INT_BAD,
From: Douglas Anderson dianders@chromium.org
[ Upstream commit 148ddaa89d4a0a927c4353398096cc33687755c1 ]
While picking commit a8cd989e1a57 ("mmc: sdhci-msm: Warn about overclocking SD/MMC") back to my tree I was surprised that it was reporting warnings. I thought I fixed those! Looking closer at the fix, I see that I totally bungled it (or at least I halfway bungled it). The SD card clock got fixed (and that was the one I was really focused on fixing), but I totally adjusted the wrong clock for eMMC. Sigh. Let's fix my dumb mistake.
Now both SD and eMMC have floor for the "apps" clock.
This doesn't matter a lot for the final clock rate for HS400 eMMC but could matter if someone happens to put some slower eMMC on a sc7180. We also transition through some of these lower rates sometimes and having them wrong could cause problems during these transitions. These were the messages I was seeing at boot: mmc1: Card appears overclocked; req 52000000 Hz, actual 100000000 Hz mmc1: Card appears overclocked; req 52000000 Hz, actual 100000000 Hz mmc1: Card appears overclocked; req 104000000 Hz, actual 192000000 Hz
Fixes: 6d37a8d19283 ("clk: qcom: gcc-sc7180: Use floor ops for sdcc clks") Signed-off-by: Douglas Anderson dianders@chromium.org Link: https://lore.kernel.org/r/20210224095013.1.I2e2ba4978cfca06520dfb5d757768f9c... Reviewed-by: Taniya Das tdas@codeaurora.org Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/qcom/gcc-sc7180.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/clk/qcom/gcc-sc7180.c b/drivers/clk/qcom/gcc-sc7180.c index 88e896abb663..da8b627ca156 100644 --- a/drivers/clk/qcom/gcc-sc7180.c +++ b/drivers/clk/qcom/gcc-sc7180.c @@ -620,7 +620,7 @@ static struct clk_rcg2 gcc_sdcc1_apps_clk_src = { .name = "gcc_sdcc1_apps_clk_src", .parent_data = gcc_parent_data_1, .num_parents = 5, - .ops = &clk_rcg2_ops, + .ops = &clk_rcg2_floor_ops, }, };
@@ -642,7 +642,7 @@ static struct clk_rcg2 gcc_sdcc1_ice_core_clk_src = { .name = "gcc_sdcc1_ice_core_clk_src", .parent_data = gcc_parent_data_0, .num_parents = 4, - .ops = &clk_rcg2_floor_ops, + .ops = &clk_rcg2_ops, }, };
From: Alex Elder elder@linaro.org
[ Upstream commit 3a9ef3e11c5d33e5cb355b4aad1a4caad2407541 ]
When a QMI handle is initialized, an array of message handler structures is provided, defining how any received message should be handled based on its type and message ID. The QMI core code traverses this array when a message arrives and calls the function associated with the (type, msg_id) found in the array.
The array is supposed to be terminated with an empty (all zero) entry though. Without it, an unsupported message will cause the QMI core code to go past the end of the array.
Fix this bug, by properly terminating the message handler arrays provided when QMI handles are set up by the IPA driver.
Fixes: 530f9216a9537 ("soc: qcom: ipa: AP/modem communications") Reported-by: Sujit Kautkar sujitka@chromium.org Signed-off-by: Alex Elder elder@linaro.org Reviewed-by: Bjorn Andersson bjorn.andersson@linaro.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ipa/ipa_qmi.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/ipa/ipa_qmi.c b/drivers/net/ipa/ipa_qmi.c index 2fc64483f275..e594bf3b600f 100644 --- a/drivers/net/ipa/ipa_qmi.c +++ b/drivers/net/ipa/ipa_qmi.c @@ -249,6 +249,7 @@ static const struct qmi_msg_handler ipa_server_msg_handlers[] = { .decoded_size = IPA_QMI_DRIVER_INIT_COMPLETE_REQ_SZ, .fn = ipa_server_driver_init_complete, }, + { }, };
/* Handle an INIT_DRIVER response message from the modem. */ @@ -269,6 +270,7 @@ static const struct qmi_msg_handler ipa_client_msg_handlers[] = { .decoded_size = IPA_QMI_INIT_DRIVER_RSP_SZ, .fn = ipa_client_init_driver, }, + { }, };
/* Return a pointer to an init modem driver request structure, which contains
From: Eric Dumazet edumazet@google.com
[ Upstream commit 50535249f624d0072cd885bcdce4e4b6fb770160 ]
struct sockaddr_qrtr has a 2-byte hole, and qrtr_recvmsg() currently does not clear it before copying kernel data to user space.
It might be too late to name the hole since sockaddr_qrtr structure is uapi.
BUG: KMSAN: kernel-infoleak in kmsan_copy_to_user+0x9c/0xb0 mm/kmsan/kmsan_hooks.c:249 CPU: 0 PID: 29705 Comm: syz-executor.3 Not tainted 5.11.0-rc7-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x21c/0x280 lib/dump_stack.c:120 kmsan_report+0xfb/0x1e0 mm/kmsan/kmsan_report.c:118 kmsan_internal_check_memory+0x202/0x520 mm/kmsan/kmsan.c:402 kmsan_copy_to_user+0x9c/0xb0 mm/kmsan/kmsan_hooks.c:249 instrument_copy_to_user include/linux/instrumented.h:121 [inline] _copy_to_user+0x1ac/0x270 lib/usercopy.c:33 copy_to_user include/linux/uaccess.h:209 [inline] move_addr_to_user+0x3a2/0x640 net/socket.c:237 ____sys_recvmsg+0x696/0xd50 net/socket.c:2575 ___sys_recvmsg net/socket.c:2610 [inline] do_recvmmsg+0xa97/0x22d0 net/socket.c:2710 __sys_recvmmsg net/socket.c:2789 [inline] __do_sys_recvmmsg net/socket.c:2812 [inline] __se_sys_recvmmsg+0x24a/0x410 net/socket.c:2805 __x64_sys_recvmmsg+0x62/0x80 net/socket.c:2805 do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x465f69 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f43659d6188 EFLAGS: 00000246 ORIG_RAX: 000000000000012b RAX: ffffffffffffffda RBX: 000000000056bf60 RCX: 0000000000465f69 RDX: 0000000000000008 RSI: 0000000020003e40 RDI: 0000000000000003 RBP: 00000000004bfa8f R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000010060 R11: 0000000000000246 R12: 000000000056bf60 R13: 0000000000a9fb1f R14: 00007f43659d6300 R15: 0000000000022000
Local variable ----addr@____sys_recvmsg created at: ____sys_recvmsg+0x168/0xd50 net/socket.c:2550 ____sys_recvmsg+0x168/0xd50 net/socket.c:2550
Bytes 2-3 of 12 are uninitialized Memory access of size 12 starts at ffff88817c627b40 Data copied to user address 0000000020000140
Fixes: bdabad3e363d ("net: Add Qualcomm IPC router") Signed-off-by: Eric Dumazet edumazet@google.com Cc: Courtney Cavin courtney.cavin@sonymobile.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/qrtr/qrtr.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/net/qrtr/qrtr.c b/net/qrtr/qrtr.c index edb6ac17ceca..dfc820ee553a 100644 --- a/net/qrtr/qrtr.c +++ b/net/qrtr/qrtr.c @@ -1058,6 +1058,11 @@ static int qrtr_recvmsg(struct socket *sock, struct msghdr *msg, rc = copied;
if (addr) { + /* There is an anonymous 2-byte hole after sq_family, + * make sure to clear it. + */ + memset(addr, 0, sizeof(*addr)); + addr->sq_family = AF_QIPCRTR; addr->sq_node = cb->src_node; addr->sq_port = cb->src_port;
From: Alexander Lobakin alobakin@pm.me
[ Upstream commit a25f822285420486f5da434efc8d940d42a83bce ]
flow_dissector_key_icmp::id is of type u16 (CPU byteorder), ICMP header has its ID field in network byteorder obviously. Sparse says:
net/core/flow_dissector.c:178:43: warning: restricted __be16 degrades to integer
Convert ID value to CPU byteorder when storing it into flow_dissector_key_icmp.
Fixes: 5dec597e5cd0 ("flow_dissector: extract more ICMP information") Signed-off-by: Alexander Lobakin alobakin@pm.me Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/flow_dissector.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c index 6f1adba6695f..7a06d4301617 100644 --- a/net/core/flow_dissector.c +++ b/net/core/flow_dissector.c @@ -175,7 +175,7 @@ void skb_flow_get_icmp_tci(const struct sk_buff *skb, * avoid confusion with packets without such field */ if (icmp_has_id(ih->type)) - key_icmp->id = ih->un.echo.id ? : 1; + key_icmp->id = ih->un.echo.id ? ntohs(ih->un.echo.id) : 1; else key_icmp->id = 0; }
From: Hangbin Liu liuhangbin@gmail.com
[ Upstream commit 31254dc9566221429d2cfb45fd5737985d70f2b6 ]
When fixing the bpf test_tunnel.sh geneve failure. I only fixed the IPv4 part but forgot the IPv6 issue. Similar with the IPv4 fixes 557c223b643a ("selftests/bpf: No need to drop the packet when there is no geneve opt"), when there is no tunnel option and bpf_skb_get_tunnel_opt() returns error, there is no need to drop the packets and break all geneve rx traffic. Just set opt_class to 0 and keep returning TC_ACT_OK at the end.
Fixes: 557c223b643a ("selftests/bpf: No need to drop the packet when there is no geneve opt") Fixes: 933a741e3b82 ("selftests/bpf: bpf tunnel test.") Signed-off-by: Hangbin Liu liuhangbin@gmail.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: William Tu u9012063@gmail.com Link: https://lore.kernel.org/bpf/20210309032214.2112438-1-liuhangbin@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/bpf/progs/test_tunnel_kern.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c index 9afe947cfae9..ba6eadfec565 100644 --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c @@ -508,10 +508,8 @@ int _ip6geneve_get_tunnel(struct __sk_buff *skb) }
ret = bpf_skb_get_tunnel_opt(skb, &gopt, sizeof(gopt)); - if (ret < 0) { - ERROR(ret); - return TC_ACT_SHOT; - } + if (ret < 0) + gopt.opt_class = 0;
bpf_trace_printk(fmt, sizeof(fmt), key.tunnel_id, key.remote_ipv4, gopt.opt_class);
From: Florian Westphal fw@strlen.de
[ Upstream commit b58f33d49e426dc66e98ed73afb5d97b15a25f2d ]
Before this change, the mask is never included in the netlink message, so "conntrack -E expect" always prints 0.0.0.0.
In older kernels the l3num callback struct was passed as argument, based on tuple->src.l3num. After the l3num indirection got removed, the call chain is based on m.src.l3num, but this value is 0xffff.
Init l3num to the correct value.
Fixes: f957be9d349a3 ("netfilter: conntrack: remove ctnetlink callbacks from l3 protocol trackers") Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_conntrack_netlink.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c index 84caf3316946..e0c566b3df90 100644 --- a/net/netfilter/nf_conntrack_netlink.c +++ b/net/netfilter/nf_conntrack_netlink.c @@ -2969,6 +2969,7 @@ static int ctnetlink_exp_dump_mask(struct sk_buff *skb, memset(&m, 0xFF, sizeof(m)); memcpy(&m.src.u3, &mask->src.u3, sizeof(m.src.u3)); m.src.u.all = mask->src.u.all; + m.src.l3num = tuple->src.l3num; m.dst.protonum = tuple->dst.protonum;
nest_parms = nla_nest_start(skb, CTA_EXPECT_MASK);
From: Xie He xie.he.0141@gmail.com
[ Upstream commit bf0ffea336b493c0a8c8bc27b46683ecf1e8f294 ]
"x25_close" is called by "hdlc_close" in "hdlc.c", which is called by hardware drivers' "ndo_stop" function. "x25_xmit" is called by "hdlc_start_xmit" in "hdlc.c", which is hardware drivers' "ndo_start_xmit" function. "x25_rx" is called by "hdlc_rcv" in "hdlc.c", which receives HDLC frames from "net/core/dev.c".
"x25_close" races with "x25_xmit" and "x25_rx" because their callers race.
However, we need to ensure that the LAPB APIs called in "x25_xmit" and "x25_rx" are called before "lapb_unregister" is called in "x25_close".
This patch adds locking to ensure when "x25_xmit" and "x25_rx" are doing their work, "lapb_unregister" is not yet called in "x25_close".
Reasons for not solving the racing between "x25_close" and "x25_xmit" by calling "netif_tx_disable" in "x25_close": 1. We still need to solve the racing between "x25_close" and "x25_rx"; 2. The design of the HDLC subsystem assumes the HDLC hardware drivers have full control over the TX queue, and the HDLC protocol drivers (like this driver) have no control. Controlling the queue here in the protocol driver may interfere with hardware drivers' control of the queue.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Xie He xie.he.0141@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wan/hdlc_x25.c | 42 +++++++++++++++++++++++++++++++++++++- 1 file changed, 41 insertions(+), 1 deletion(-)
diff --git a/drivers/net/wan/hdlc_x25.c b/drivers/net/wan/hdlc_x25.c index 4aaa6388b9ee..5a6a945f6c81 100644 --- a/drivers/net/wan/hdlc_x25.c +++ b/drivers/net/wan/hdlc_x25.c @@ -23,6 +23,8 @@
struct x25_state { x25_hdlc_proto settings; + bool up; + spinlock_t up_lock; /* Protects "up" */ };
static int x25_ioctl(struct net_device *dev, struct ifreq *ifr); @@ -104,6 +106,8 @@ static void x25_data_transmit(struct net_device *dev, struct sk_buff *skb)
static netdev_tx_t x25_xmit(struct sk_buff *skb, struct net_device *dev) { + hdlc_device *hdlc = dev_to_hdlc(dev); + struct x25_state *x25st = state(hdlc); int result;
/* There should be a pseudo header of 1 byte added by upper layers. @@ -114,11 +118,19 @@ static netdev_tx_t x25_xmit(struct sk_buff *skb, struct net_device *dev) return NETDEV_TX_OK; }
+ spin_lock_bh(&x25st->up_lock); + if (!x25st->up) { + spin_unlock_bh(&x25st->up_lock); + kfree_skb(skb); + return NETDEV_TX_OK; + } + switch (skb->data[0]) { case X25_IFACE_DATA: /* Data to be transmitted */ skb_pull(skb, 1); if ((result = lapb_data_request(dev, skb)) != LAPB_OK) dev_kfree_skb(skb); + spin_unlock_bh(&x25st->up_lock); return NETDEV_TX_OK;
case X25_IFACE_CONNECT: @@ -147,6 +159,7 @@ static netdev_tx_t x25_xmit(struct sk_buff *skb, struct net_device *dev) break; }
+ spin_unlock_bh(&x25st->up_lock); dev_kfree_skb(skb); return NETDEV_TX_OK; } @@ -164,6 +177,7 @@ static int x25_open(struct net_device *dev) .data_transmit = x25_data_transmit, }; hdlc_device *hdlc = dev_to_hdlc(dev); + struct x25_state *x25st = state(hdlc); struct lapb_parms_struct params; int result;
@@ -190,6 +204,10 @@ static int x25_open(struct net_device *dev) if (result != LAPB_OK) return -EINVAL;
+ spin_lock_bh(&x25st->up_lock); + x25st->up = true; + spin_unlock_bh(&x25st->up_lock); + return 0; }
@@ -197,6 +215,13 @@ static int x25_open(struct net_device *dev)
static void x25_close(struct net_device *dev) { + hdlc_device *hdlc = dev_to_hdlc(dev); + struct x25_state *x25st = state(hdlc); + + spin_lock_bh(&x25st->up_lock); + x25st->up = false; + spin_unlock_bh(&x25st->up_lock); + lapb_unregister(dev); }
@@ -205,15 +230,28 @@ static void x25_close(struct net_device *dev) static int x25_rx(struct sk_buff *skb) { struct net_device *dev = skb->dev; + hdlc_device *hdlc = dev_to_hdlc(dev); + struct x25_state *x25st = state(hdlc);
if ((skb = skb_share_check(skb, GFP_ATOMIC)) == NULL) { dev->stats.rx_dropped++; return NET_RX_DROP; }
- if (lapb_data_received(dev, skb) == LAPB_OK) + spin_lock_bh(&x25st->up_lock); + if (!x25st->up) { + spin_unlock_bh(&x25st->up_lock); + kfree_skb(skb); + dev->stats.rx_dropped++; + return NET_RX_DROP; + } + + if (lapb_data_received(dev, skb) == LAPB_OK) { + spin_unlock_bh(&x25st->up_lock); return NET_RX_SUCCESS; + }
+ spin_unlock_bh(&x25st->up_lock); dev->stats.rx_errors++; dev_kfree_skb_any(skb); return NET_RX_DROP; @@ -298,6 +336,8 @@ static int x25_ioctl(struct net_device *dev, struct ifreq *ifr) return result;
memcpy(&state(hdlc)->settings, &new_settings, size); + state(hdlc)->up = false; + spin_lock_init(&state(hdlc)->up_lock);
/* There's no header_ops so hard_header_len should be 0. */ dev->hard_header_len = 0;
From: Ong Boon Leong boon.leong.ong@intel.com
[ Upstream commit d82c6c1aaccd2877b6082cebcb1746a13648a16d ]
if pl->mac_ops->mac_finish() failed, phylink_err should use "mac_finish" instead of "mac_prepare".
Fixes: b7ad14c2fe2d4 ("net: phylink: re-implement interface configuration with PCS") Signed-off-by: Ong Boon Leong boon.leong.ong@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/phylink.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c index 84f6e197f965..add9156601af 100644 --- a/drivers/net/phy/phylink.c +++ b/drivers/net/phy/phylink.c @@ -472,7 +472,7 @@ static void phylink_major_config(struct phylink *pl, bool restart, err = pl->mac_ops->mac_finish(pl->config, pl->cur_link_an_mode, state->interface); if (err < 0) - phylink_err(pl, "mac_prepare failed: %pe\n", + phylink_err(pl, "mac_finish failed: %pe\n", ERR_PTR(err)); } }
From: Eric Dumazet edumazet@google.com
[ Upstream commit 0217ed2848e8538bcf9172d97ed2eeb4a26041bb ]
Before calling tipc_aead_key_size(ptr), we need to ensure we have enough data to dereference ptr->keylen.
We probably also want to make sure tipc_aead_key_size() wont overflow with malicious ptr->keylen values.
Syzbot reported:
BUG: KMSAN: uninit-value in __tipc_nl_node_set_key net/tipc/node.c:2971 [inline] BUG: KMSAN: uninit-value in tipc_nl_node_set_key+0x9bf/0x13b0 net/tipc/node.c:3023 CPU: 0 PID: 21060 Comm: syz-executor.5 Not tainted 5.11.0-rc7-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x21c/0x280 lib/dump_stack.c:120 kmsan_report+0xfb/0x1e0 mm/kmsan/kmsan_report.c:118 __msan_warning+0x5f/0xa0 mm/kmsan/kmsan_instr.c:197 __tipc_nl_node_set_key net/tipc/node.c:2971 [inline] tipc_nl_node_set_key+0x9bf/0x13b0 net/tipc/node.c:3023 genl_family_rcv_msg_doit net/netlink/genetlink.c:739 [inline] genl_family_rcv_msg net/netlink/genetlink.c:783 [inline] genl_rcv_msg+0x1319/0x1610 net/netlink/genetlink.c:800 netlink_rcv_skb+0x6fa/0x810 net/netlink/af_netlink.c:2494 genl_rcv+0x63/0x80 net/netlink/genetlink.c:811 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline] netlink_unicast+0x11d6/0x14a0 net/netlink/af_netlink.c:1330 netlink_sendmsg+0x1740/0x1840 net/netlink/af_netlink.c:1919 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg net/socket.c:672 [inline] ____sys_sendmsg+0xcfc/0x12f0 net/socket.c:2345 ___sys_sendmsg net/socket.c:2399 [inline] __sys_sendmsg+0x714/0x830 net/socket.c:2432 __compat_sys_sendmsg net/compat.c:347 [inline] __do_compat_sys_sendmsg net/compat.c:354 [inline] __se_compat_sys_sendmsg+0xa7/0xc0 net/compat.c:351 __ia32_compat_sys_sendmsg+0x4a/0x70 net/compat.c:351 do_syscall_32_irqs_on arch/x86/entry/common.c:79 [inline] __do_fast_syscall_32+0x102/0x160 arch/x86/entry/common.c:141 do_fast_syscall_32+0x6a/0xc0 arch/x86/entry/common.c:166 do_SYSENTER_32+0x73/0x90 arch/x86/entry/common.c:209 entry_SYSENTER_compat_after_hwframe+0x4d/0x5c RIP: 0023:0xf7f60549 Code: 03 74 c0 01 10 05 03 74 b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00 RSP: 002b:00000000f555a5fc EFLAGS: 00000296 ORIG_RAX: 0000000000000172 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000020000200 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:121 [inline] kmsan_internal_poison_shadow+0x5c/0xf0 mm/kmsan/kmsan.c:104 kmsan_slab_alloc+0x8d/0xe0 mm/kmsan/kmsan_hooks.c:76 slab_alloc_node mm/slub.c:2907 [inline] __kmalloc_node_track_caller+0xa37/0x1430 mm/slub.c:4527 __kmalloc_reserve net/core/skbuff.c:142 [inline] __alloc_skb+0x2f8/0xb30 net/core/skbuff.c:210 alloc_skb include/linux/skbuff.h:1099 [inline] netlink_alloc_large_skb net/netlink/af_netlink.c:1176 [inline] netlink_sendmsg+0xdbc/0x1840 net/netlink/af_netlink.c:1894 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg net/socket.c:672 [inline] ____sys_sendmsg+0xcfc/0x12f0 net/socket.c:2345 ___sys_sendmsg net/socket.c:2399 [inline] __sys_sendmsg+0x714/0x830 net/socket.c:2432 __compat_sys_sendmsg net/compat.c:347 [inline] __do_compat_sys_sendmsg net/compat.c:354 [inline] __se_compat_sys_sendmsg+0xa7/0xc0 net/compat.c:351 __ia32_compat_sys_sendmsg+0x4a/0x70 net/compat.c:351 do_syscall_32_irqs_on arch/x86/entry/common.c:79 [inline] __do_fast_syscall_32+0x102/0x160 arch/x86/entry/common.c:141 do_fast_syscall_32+0x6a/0xc0 arch/x86/entry/common.c:166 do_SYSENTER_32+0x73/0x90 arch/x86/entry/common.c:209 entry_SYSENTER_compat_after_hwframe+0x4d/0x5c
Fixes: e1f32190cf7d ("tipc: add support for AEAD key setting via netlink") Signed-off-by: Eric Dumazet edumazet@google.com Cc: Tuong Lien tuong.t.lien@dektech.com.au Cc: Jon Maloy jmaloy@redhat.com Cc: Ying Xue ying.xue@windriver.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/tipc/node.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/net/tipc/node.c b/net/tipc/node.c index 008670d1f43e..136338b85504 100644 --- a/net/tipc/node.c +++ b/net/tipc/node.c @@ -2895,17 +2895,22 @@ int tipc_nl_node_dump_monitor_peer(struct sk_buff *skb,
#ifdef CONFIG_TIPC_CRYPTO static int tipc_nl_retrieve_key(struct nlattr **attrs, - struct tipc_aead_key **key) + struct tipc_aead_key **pkey) { struct nlattr *attr = attrs[TIPC_NLA_NODE_KEY]; + struct tipc_aead_key *key;
if (!attr) return -ENODATA;
- *key = (struct tipc_aead_key *)nla_data(attr); - if (nla_len(attr) < tipc_aead_key_size(*key)) + if (nla_len(attr) < sizeof(*key)) + return -EINVAL; + key = (struct tipc_aead_key *)nla_data(attr); + if (key->keylen > TIPC_AEAD_KEYLEN_MAX || + nla_len(attr) < tipc_aead_key_size(key)) return -EINVAL;
+ *pkey = key; return 0; }
From: Alexander Ovechkin ovov@yandex-team.ru
[ Upstream commit 7233da86697efef41288f8b713c10c2499cffe85 ]
Currently tcp_check_req can be called with obsolete req socket for which big socket have been already created (because of CPU race or early demux assigning req socket to multiple packets in gro batch).
Commit e0f9759f530bf789e984 ("tcp: try to keep packet if SYN_RCV race is lost") added retry in case when tcp_check_req is called for PSH|ACK packet. But if client sends RST+ACK immediatly after connection being established (it is performing healthcheck, for example) retry does not occur. In that case tcp_check_req tries to close req socket, leaving big socket active.
Fixes: e0f9759f530 ("tcp: try to keep packet if SYN_RCV race is lost") Signed-off-by: Alexander Ovechkin ovov@yandex-team.ru Reported-by: Oleg Senin olegsenin@yandex-team.ru Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/inet_connection_sock.h | 2 +- net/ipv4/inet_connection_sock.c | 7 +++++-- net/ipv4/tcp_minisocks.c | 7 +++++-- 3 files changed, 11 insertions(+), 5 deletions(-)
diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h index 111d7771b208..aa92af3dd444 100644 --- a/include/net/inet_connection_sock.h +++ b/include/net/inet_connection_sock.h @@ -284,7 +284,7 @@ static inline int inet_csk_reqsk_queue_is_full(const struct sock *sk) return inet_csk_reqsk_queue_len(sk) >= sk->sk_max_ack_backlog; }
-void inet_csk_reqsk_queue_drop(struct sock *sk, struct request_sock *req); +bool inet_csk_reqsk_queue_drop(struct sock *sk, struct request_sock *req); void inet_csk_reqsk_queue_drop_and_put(struct sock *sk, struct request_sock *req);
static inline void inet_csk_prepare_for_destroy_sock(struct sock *sk) diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c index 6bd7ca09af03..fd472eae4f5c 100644 --- a/net/ipv4/inet_connection_sock.c +++ b/net/ipv4/inet_connection_sock.c @@ -705,12 +705,15 @@ static bool reqsk_queue_unlink(struct request_sock *req) return found; }
-void inet_csk_reqsk_queue_drop(struct sock *sk, struct request_sock *req) +bool inet_csk_reqsk_queue_drop(struct sock *sk, struct request_sock *req) { - if (reqsk_queue_unlink(req)) { + bool unlinked = reqsk_queue_unlink(req); + + if (unlinked) { reqsk_queue_removed(&inet_csk(sk)->icsk_accept_queue, req); reqsk_put(req); } + return unlinked; } EXPORT_SYMBOL(inet_csk_reqsk_queue_drop);
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index 0055ae0a3bf8..7513ba45553d 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -804,8 +804,11 @@ embryonic_reset: tcp_reset(sk, skb); } if (!fastopen) { - inet_csk_reqsk_queue_drop(sk, req); - __NET_INC_STATS(sock_net(sk), LINUX_MIB_EMBRYONICRSTS); + bool unlinked = inet_csk_reqsk_queue_drop(sk, req); + + if (unlinked) + __NET_INC_STATS(sock_net(sk), LINUX_MIB_EMBRYONICRSTS); + *req_stolen = !unlinked; } return NULL; }
From: Davide Caratti dcaratti@redhat.com
[ Upstream commit 13832ae2755395b2585500c85b64f5109a44227e ]
Currently, Linux computes the HMAC contained in ADD_ADDR sub-option using the Address Id and the IP Address, and hardcodes a destination port equal to zero. This is not ok for ADD_ADDR with port: ensure to account for the endpoint port when computing the HMAC, in compliance with RFC8684 §3.4.1.
Fixes: 22fb85ffaefb ("mptcp: add port support for ADD_ADDR suboption writing") Reviewed-by: Mat Martineau mathew.j.martineau@linux.intel.com Acked-by: Geliang Tang geliangtang@gmail.com Signed-off-by: Davide Caratti dcaratti@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/mptcp/options.c | 24 ++++++++++++++---------- 1 file changed, 14 insertions(+), 10 deletions(-)
diff --git a/net/mptcp/options.c b/net/mptcp/options.c index 2e26e39169b8..37ef0bf098f6 100644 --- a/net/mptcp/options.c +++ b/net/mptcp/options.c @@ -555,15 +555,15 @@ static bool mptcp_established_options_dss(struct sock *sk, struct sk_buff *skb, }
static u64 add_addr_generate_hmac(u64 key1, u64 key2, u8 addr_id, - struct in_addr *addr) + struct in_addr *addr, u16 port) { u8 hmac[SHA256_DIGEST_SIZE]; u8 msg[7];
msg[0] = addr_id; memcpy(&msg[1], &addr->s_addr, 4); - msg[5] = 0; - msg[6] = 0; + msg[5] = port >> 8; + msg[6] = port & 0xFF;
mptcp_crypto_hmac_sha(key1, key2, msg, 7, hmac);
@@ -572,15 +572,15 @@ static u64 add_addr_generate_hmac(u64 key1, u64 key2, u8 addr_id,
#if IS_ENABLED(CONFIG_MPTCP_IPV6) static u64 add_addr6_generate_hmac(u64 key1, u64 key2, u8 addr_id, - struct in6_addr *addr) + struct in6_addr *addr, u16 port) { u8 hmac[SHA256_DIGEST_SIZE]; u8 msg[19];
msg[0] = addr_id; memcpy(&msg[1], &addr->s6_addr, 16); - msg[17] = 0; - msg[18] = 0; + msg[17] = port >> 8; + msg[18] = port & 0xFF;
mptcp_crypto_hmac_sha(key1, key2, msg, 19, hmac);
@@ -634,7 +634,8 @@ static bool mptcp_established_options_add_addr(struct sock *sk, struct sk_buff * opts->ahmac = add_addr_generate_hmac(msk->local_key, msk->remote_key, opts->addr_id, - &opts->addr); + &opts->addr, + opts->port); } } #if IS_ENABLED(CONFIG_MPTCP_IPV6) @@ -645,7 +646,8 @@ static bool mptcp_established_options_add_addr(struct sock *sk, struct sk_buff * opts->ahmac = add_addr6_generate_hmac(msk->local_key, msk->remote_key, opts->addr_id, - &opts->addr6); + &opts->addr6, + opts->port); } } #endif @@ -922,12 +924,14 @@ static bool add_addr_hmac_valid(struct mptcp_sock *msk, if (mp_opt->family == MPTCP_ADDR_IPVERSION_4) hmac = add_addr_generate_hmac(msk->remote_key, msk->local_key, - mp_opt->addr_id, &mp_opt->addr); + mp_opt->addr_id, &mp_opt->addr, + mp_opt->port); #if IS_ENABLED(CONFIG_MPTCP_IPV6) else hmac = add_addr6_generate_hmac(msk->remote_key, msk->local_key, - mp_opt->addr_id, &mp_opt->addr6); + mp_opt->addr_id, &mp_opt->addr6, + mp_opt->port); #endif
pr_debug("msk=%p, ahmac=%llu, mp_opt->ahmac=%llu\n",
From: Marc Kleine-Budde mkl@pengutronix.de
[ Upstream commit e4912459bd5edd493b61bc7c3a5d9b2eb17f5a89 ]
CAN-FD frames have struct canfd_frame::flags, while classic CAN frames don't.
This patch refuses to set TX flags (struct can_isotp_ll_options::tx_flags) on non CAN-FD isotp sockets.
Fixes: e057dd3fc20f ("can: add ISO 15765-2:2016 transport protocol") Link: https://lore.kernel.org/r/20210218215434.1708249-2-mkl@pengutronix.de Cc: Oliver Hartkopp socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/can/isotp.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/can/isotp.c b/net/can/isotp.c index 3ef7f78e553b..e32d446c121e 100644 --- a/net/can/isotp.c +++ b/net/can/isotp.c @@ -1228,7 +1228,8 @@ static int isotp_setsockopt(struct socket *sock, int level, int optname, if (ll.mtu != CAN_MTU && ll.mtu != CANFD_MTU) return -EINVAL;
- if (ll.mtu == CAN_MTU && ll.tx_dl > CAN_MAX_DLEN) + if (ll.mtu == CAN_MTU && + (ll.tx_dl > CAN_MAX_DLEN || ll.tx_flags != 0)) return -EINVAL;
memcpy(&so->ll, &ll, sizeof(ll));
From: Marc Kleine-Budde mkl@pengutronix.de
[ Upstream commit d4eb538e1f48b3cf7bb6cb9eb39fe3e9e8a701f7 ]
The previous patch ensures that the TX flags (struct can_isotp_ll_options::tx_flags) are 0 for classic CAN frames or a user configured value for CAN-FD frames.
This patch sets the CAN frames flags unconditionally to the ISO-TP TX flags, so that they are initialized to a proper value. Otherwise when running "candump -x" on a classical CAN ISO-TP stream shows wrongly set "B" and "E" flags.
| $ candump any,0:0,#FFFFFFFF -extA | [...] | can0 TX B E 713 [8] 2B 0A 0B 0C 0D 0E 0F 00 | can0 TX B E 713 [8] 2C 01 02 03 04 05 06 07 | can0 TX B E 713 [8] 2D 08 09 0A 0B 0C 0D 0E | can0 TX B E 713 [8] 2E 0F 00 01 02 03 04 05
Fixes: e057dd3fc20f ("can: add ISO 15765-2:2016 transport protocol") Link: https://lore.kernel.org/r/20210218215434.1708249-2-mkl@pengutronix.de Cc: Oliver Hartkopp socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/can/isotp.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/net/can/isotp.c b/net/can/isotp.c index e32d446c121e..430976485d95 100644 --- a/net/can/isotp.c +++ b/net/can/isotp.c @@ -215,8 +215,7 @@ static int isotp_send_fc(struct sock *sk, int ae, u8 flowstatus) if (ae) ncf->data[0] = so->opt.ext_address;
- if (so->ll.mtu == CANFD_MTU) - ncf->flags = so->ll.tx_flags; + ncf->flags = so->ll.tx_flags;
can_send_ret = can_send(nskb, 1); if (can_send_ret) @@ -790,8 +789,7 @@ isotp_tx_burst: so->tx.sn %= 16; so->tx.bs++;
- if (so->ll.mtu == CANFD_MTU) - cf->flags = so->ll.tx_flags; + cf->flags = so->ll.tx_flags;
skb->dev = dev; can_skb_set_owner(skb, sk); @@ -939,8 +937,7 @@ static int isotp_sendmsg(struct socket *sock, struct msghdr *msg, size_t size) }
/* send the first or only CAN frame */ - if (so->ll.mtu == CANFD_MTU) - cf->flags = so->ll.tx_flags; + cf->flags = so->ll.tx_flags;
skb->dev = dev; skb->sk = sk;
From: Stephane Grosjean s.grosjean@peak-system.com
[ Upstream commit 59ec7b89ed3e921cd0625a8c83f31a30d485fdf8 ]
Since the peak_usb driver also supports the CAN-USB interfaces "PCAN-USB X6" and "PCAN-Chip USB" from PEAK-System GmbH, this patch adds their names to the list of explicitly supported devices.
Fixes: ea8b65b596d7 ("can: usb: Add support of PCAN-Chip USB stamp module") Fixes: f00b534ded60 ("can: peak: Add support for PCAN-USB X6 USB interface") Link: https://lore.kernel.org/r/20210309082128.23125-3-s.grosjean@peak-system.com Signed-off-by: Stephane Grosjean s.grosjean@peak-system.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/usb/peak_usb/pcan_usb_fd.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/can/usb/peak_usb/pcan_usb_fd.c b/drivers/net/can/usb/peak_usb/pcan_usb_fd.c index f347ecc79aef..f1d018218c93 100644 --- a/drivers/net/can/usb/peak_usb/pcan_usb_fd.c +++ b/drivers/net/can/usb/peak_usb/pcan_usb_fd.c @@ -18,6 +18,8 @@
MODULE_SUPPORTED_DEVICE("PEAK-System PCAN-USB FD adapter"); MODULE_SUPPORTED_DEVICE("PEAK-System PCAN-USB Pro FD adapter"); +MODULE_SUPPORTED_DEVICE("PEAK-System PCAN-Chip USB"); +MODULE_SUPPORTED_DEVICE("PEAK-System PCAN-USB X6 adapter");
#define PCAN_USBPROFD_CHANNEL_COUNT 2 #define PCAN_USBFD_CHANNEL_COUNT 1
From: Angelo Dureghello angelo@kernel-space.org
[ Upstream commit 47c5e474bc1e1061fb037d13b5000b38967eb070 ]
For cases when flexcan is built-in, bitrate is still not set at registering. So flexcan_chip_freeze() generates:
[ 1.860000] *** ZERO DIVIDE *** FORMAT=4 [ 1.860000] Current process id is 1 [ 1.860000] BAD KERNEL TRAP: 00000000 [ 1.860000] PC: [<402e70c8>] flexcan_chip_freeze+0x1a/0xa8
To allow chip freeze, using an hardcoded timeout when bitrate is still not set.
Fixes: ec15e27cc890 ("can: flexcan: enable RX FIFO after FRZ/HALT valid") Link: https://lore.kernel.org/r/20210315231510.650593-1-angelo@kernel-space.org Signed-off-by: Angelo Dureghello angelo@kernel-space.org [mkl: use if instead of ? operator] Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/flexcan.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/net/can/flexcan.c b/drivers/net/can/flexcan.c index 2893297555eb..a9502fbc6dd6 100644 --- a/drivers/net/can/flexcan.c +++ b/drivers/net/can/flexcan.c @@ -697,9 +697,15 @@ static int flexcan_chip_disable(struct flexcan_priv *priv) static int flexcan_chip_freeze(struct flexcan_priv *priv) { struct flexcan_regs __iomem *regs = priv->regs; - unsigned int timeout = 1000 * 1000 * 10 / priv->can.bittiming.bitrate; + unsigned int timeout; + u32 bitrate = priv->can.bittiming.bitrate; u32 reg;
+ if (bitrate) + timeout = 1000 * 1000 * 10 / bitrate; + else + timeout = FLEXCAN_TIMEOUT_US / 10; + reg = priv->read(®s->mcr); reg |= FLEXCAN_MCR_FRZ | FLEXCAN_MCR_HALT; priv->write(reg, ®s->mcr);
From: Jimmy Assarsson extja@kvaser.com
[ Upstream commit 7c6e6bce08f918b64459415f58061d4d6df44994 ]
Under certain circumstances, when switching from Kvaser's linuxcan driver (kvpciefd) to the SocketCAN driver (kvaser_pciefd), the bus load reporting is not disabled. This is flooding the kernel log with prints like: [3485.574677] kvaser_pciefd 0000:02:00.0: Received unexpected packet type 0x00000009
Always put the controller in the expected state, instead of assuming that bus load reporting is inactive.
Note: If bus load reporting is enabled when the driver is loaded, you will still get a number of bus load packages (and printouts), before it is disabled.
Fixes: 26ad340e582d ("can: kvaser_pciefd: Add driver for Kvaser PCIEcan devices") Link: https://lore.kernel.org/r/20210309091724.31262-1-jimmyassarsson@gmail.com Signed-off-by: Jimmy Assarsson extja@kvaser.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/kvaser_pciefd.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/net/can/kvaser_pciefd.c b/drivers/net/can/kvaser_pciefd.c index 969cedb9b0b6..0d77c60f775e 100644 --- a/drivers/net/can/kvaser_pciefd.c +++ b/drivers/net/can/kvaser_pciefd.c @@ -57,6 +57,7 @@ MODULE_DESCRIPTION("CAN driver for Kvaser CAN/PCIe devices"); #define KVASER_PCIEFD_KCAN_STAT_REG 0x418 #define KVASER_PCIEFD_KCAN_MODE_REG 0x41c #define KVASER_PCIEFD_KCAN_BTRN_REG 0x420 +#define KVASER_PCIEFD_KCAN_BUS_LOAD_REG 0x424 #define KVASER_PCIEFD_KCAN_BTRD_REG 0x428 #define KVASER_PCIEFD_KCAN_PWM_REG 0x430 /* Loopback control register */ @@ -949,6 +950,9 @@ static int kvaser_pciefd_setup_can_ctrls(struct kvaser_pciefd *pcie) timer_setup(&can->bec_poll_timer, kvaser_pciefd_bec_poll_timer, 0);
+ /* Disable Bus load reporting */ + iowrite32(0, can->reg_base + KVASER_PCIEFD_KCAN_BUS_LOAD_REG); + tx_npackets = ioread32(can->reg_base + KVASER_PCIEFD_KCAN_TX_NPACKETS_REG); if (((tx_npackets >> KVASER_PCIEFD_KCAN_TX_NPACKETS_MAX_SHIFT) &
From: Tong Zhang ztong0001@gmail.com
[ Upstream commit 0429d6d89f97ebff4f17f13f5b5069c66bde8138 ]
There is a UAF in c_can_pci_remove(). dev is released by free_c_can_dev() and is used by pci_iounmap(pdev, priv->base) later. To fix this issue, save the mmio address before releasing dev.
Fixes: 5b92da0443c2 ("c_can_pci: generic module for C_CAN/D_CAN on PCI") Link: https://lore.kernel.org/r/20210301024512.539039-1-ztong0001@gmail.com Signed-off-by: Tong Zhang ztong0001@gmail.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/c_can/c_can_pci.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/can/c_can/c_can_pci.c b/drivers/net/can/c_can/c_can_pci.c index 406b4847e5dc..7efb60b50876 100644 --- a/drivers/net/can/c_can/c_can_pci.c +++ b/drivers/net/can/c_can/c_can_pci.c @@ -239,12 +239,13 @@ static void c_can_pci_remove(struct pci_dev *pdev) { struct net_device *dev = pci_get_drvdata(pdev); struct c_can_priv *priv = netdev_priv(dev); + void __iomem *addr = priv->base;
unregister_c_can_dev(dev);
free_c_can_dev(dev);
- pci_iounmap(pdev, priv->base); + pci_iounmap(pdev, addr); pci_disable_msi(pdev); pci_clear_master(pdev); pci_release_regions(pdev);
From: Tong Zhang ztong0001@gmail.com
[ Upstream commit 6e2fe01dd6f98da6cae8b07cd5cfa67abc70d97d ]
Currently doing modprobe c_can_pci will make the kernel complain:
Unbalanced pm_runtime_enable!
this is caused by pm_runtime_enable() called before pm is initialized.
This fix is similar to 227619c3ff7c, move those pm_enable/disable code to c_can_platform.
Fixes: 4cdd34b26826 ("can: c_can: Add runtime PM support to Bosch C_CAN/D_CAN controller") Link: http://lore.kernel.org/r/20210302025542.987600-1-ztong0001@gmail.com Signed-off-by: Tong Zhang ztong0001@gmail.com Tested-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/c_can/c_can.c | 24 +----------------------- drivers/net/can/c_can/c_can_platform.c | 6 +++++- 2 files changed, 6 insertions(+), 24 deletions(-)
diff --git a/drivers/net/can/c_can/c_can.c b/drivers/net/can/c_can/c_can.c index 63f48b016ecd..716d1a5bf17b 100644 --- a/drivers/net/can/c_can/c_can.c +++ b/drivers/net/can/c_can/c_can.c @@ -212,18 +212,6 @@ static const struct can_bittiming_const c_can_bittiming_const = { .brp_inc = 1, };
-static inline void c_can_pm_runtime_enable(const struct c_can_priv *priv) -{ - if (priv->device) - pm_runtime_enable(priv->device); -} - -static inline void c_can_pm_runtime_disable(const struct c_can_priv *priv) -{ - if (priv->device) - pm_runtime_disable(priv->device); -} - static inline void c_can_pm_runtime_get_sync(const struct c_can_priv *priv) { if (priv->device) @@ -1335,7 +1323,6 @@ static const struct net_device_ops c_can_netdev_ops = {
int register_c_can_dev(struct net_device *dev) { - struct c_can_priv *priv = netdev_priv(dev); int err;
/* Deactivate pins to prevent DRA7 DCAN IP from being @@ -1345,28 +1332,19 @@ int register_c_can_dev(struct net_device *dev) */ pinctrl_pm_select_sleep_state(dev->dev.parent);
- c_can_pm_runtime_enable(priv); - dev->flags |= IFF_ECHO; /* we support local echo */ dev->netdev_ops = &c_can_netdev_ops;
err = register_candev(dev); - if (err) - c_can_pm_runtime_disable(priv); - else + if (!err) devm_can_led_init(dev); - return err; } EXPORT_SYMBOL_GPL(register_c_can_dev);
void unregister_c_can_dev(struct net_device *dev) { - struct c_can_priv *priv = netdev_priv(dev); - unregister_candev(dev); - - c_can_pm_runtime_disable(priv); } EXPORT_SYMBOL_GPL(unregister_c_can_dev);
diff --git a/drivers/net/can/c_can/c_can_platform.c b/drivers/net/can/c_can/c_can_platform.c index 05f425ceb53a..47b251b1607c 100644 --- a/drivers/net/can/c_can/c_can_platform.c +++ b/drivers/net/can/c_can/c_can_platform.c @@ -29,6 +29,7 @@ #include <linux/list.h> #include <linux/io.h> #include <linux/platform_device.h> +#include <linux/pm_runtime.h> #include <linux/clk.h> #include <linux/of.h> #include <linux/of_device.h> @@ -386,6 +387,7 @@ static int c_can_plat_probe(struct platform_device *pdev) platform_set_drvdata(pdev, dev); SET_NETDEV_DEV(dev, &pdev->dev);
+ pm_runtime_enable(priv->device); ret = register_c_can_dev(dev); if (ret) { dev_err(&pdev->dev, "registering %s failed (err=%d)\n", @@ -398,6 +400,7 @@ static int c_can_plat_probe(struct platform_device *pdev) return 0;
exit_free_device: + pm_runtime_disable(priv->device); free_c_can_dev(dev); exit: dev_err(&pdev->dev, "probe failed\n"); @@ -408,9 +411,10 @@ exit: static int c_can_plat_remove(struct platform_device *pdev) { struct net_device *dev = platform_get_drvdata(pdev); + struct c_can_priv *priv = netdev_priv(dev);
unregister_c_can_dev(dev); - + pm_runtime_disable(priv->device); free_c_can_dev(dev);
return 0;
From: Torin Cooper-Bennun torin@maxiluxsystems.com
[ Upstream commit c0e399f3baf42279f48991554240af8c457535d1 ]
Message loss from RX FIFO 0 is already handled in m_can_handle_lost_msg(), with netdev output included.
Removing this warning also improves driver performance under heavy load, where m_can_do_rx_poll() may be called many times before this interrupt is cleared, causing this message to be output many times (thanks Mariusz Madej for this report).
Fixes: e0d1f4816f2a ("can: m_can: add Bosch M_CAN controller support") Link: https://lore.kernel.org/r/20210303103151.3760532-1-torin@maxiluxsystems.com Reported-by: Mariusz Madej mariusz.madej@xtrack.com Signed-off-by: Torin Cooper-Bennun torin@maxiluxsystems.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/m_can/m_can.c | 3 --- 1 file changed, 3 deletions(-)
diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c index da551fd0f502..678679a8c907 100644 --- a/drivers/net/can/m_can/m_can.c +++ b/drivers/net/can/m_can/m_can.c @@ -501,9 +501,6 @@ static int m_can_do_rx_poll(struct net_device *dev, int quota) }
while ((rxfs & RXFS_FFL_MASK) && (quota > 0)) { - if (rxfs & RXFS_RFL) - netdev_warn(dev, "Rx FIFO 0 Message Lost\n"); - m_can_read_fifo(dev, rxfs);
quota--;
From: Torin Cooper-Bennun torin@maxiluxsystems.com
[ Upstream commit e98d9ee64ee2cc9b1d1a8e26610ec4d0392ebe50 ]
For M_CAN peripherals, m_can_rx_handler() was called with quota = 1, which caused any error handling to block RX from taking place until the next time the IRQ handler is called. This had been observed to cause RX to be blocked indefinitely in some cases.
This is fixed by calling m_can_rx_handler with a sensibly high quota.
Fixes: f524f829b75a ("can: m_can: Create a m_can platform framework") Link: https://lore.kernel.org/r/20210303144350.4093750-1-torin@maxiluxsystems.com Suggested-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Torin Cooper-Bennun torin@maxiluxsystems.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/m_can/m_can.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c index 678679a8c907..44b3f4b3aea5 100644 --- a/drivers/net/can/m_can/m_can.c +++ b/drivers/net/can/m_can/m_can.c @@ -873,7 +873,7 @@ static int m_can_rx_peripheral(struct net_device *dev) { struct m_can_classdev *cdev = netdev_priv(dev);
- m_can_rx_handler(dev, 1); + m_can_rx_handler(dev, M_CAN_NAPI_WEIGHT);
m_can_enable_all_interrupts(cdev);
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit 1944015fe9c1d9fa5e9eb7ffbbb5ef8954d6753b ]
Coverity reported the strange "if (~...)" condition that's always true. It suggested that ! was intended instead of ~, but upon further analysis I'm convinced that what really was intended was a comparison to 0xff/0xffff (in HT/VHT cases respectively), since this indicates that all of the rates are enabled.
Change the comparison accordingly.
I'm guessing this never really mattered because a reset to not having a rate mask is basically equivalent to having a mask that enables all rates.
Reported-by: Colin Ian King colin.king@canonical.com Fixes: 2ffbe6d33366 ("mac80211: fix and optimize MCS mask handling") Fixes: b119ad6e726c ("mac80211: add rate mask logic for vht rates") Reviewed-by: Colin Ian King colin.king@canonical.com Link: https://lore.kernel.org/r/20210212112213.36b38078f569.I8546a20c80bc1669058eb... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/cfg.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c index c4c70e30ad7f..68a0de02b561 100644 --- a/net/mac80211/cfg.c +++ b/net/mac80211/cfg.c @@ -2950,14 +2950,14 @@ static int ieee80211_set_bitrate_mask(struct wiphy *wiphy, continue;
for (j = 0; j < IEEE80211_HT_MCS_MASK_LEN; j++) { - if (~sdata->rc_rateidx_mcs_mask[i][j]) { + if (sdata->rc_rateidx_mcs_mask[i][j] != 0xff) { sdata->rc_has_mcs_mask[i] = true; break; } }
for (j = 0; j < NL80211_VHT_NSS_MAX; j++) { - if (~sdata->rc_rateidx_vht_mcs_mask[i][j]) { + if (sdata->rc_rateidx_vht_mcs_mask[i][j] != 0xffff) { sdata->rc_has_vht_mcs_mask[i] = true; break; }
From: Brian Norris briannorris@chromium.org
[ Upstream commit 0f7e90faddeef53a3568f449a0c3992d77510b66 ]
We observed some Cisco APs sending the following HE Operation IE in associate response:
ff 0a 24 f4 3f 00 01 fc ff 00 00 00
Its HE operation parameter is 0x003ff4, so the expected total length is 7 which does not match the actual length = 10. This causes association failing with "HE AP is missing HE Capability/operation."
According to P802.11ax_D4 Table9-94, HE operation is extensible, and according to 802.11-2016 10.27.8, STA should discard the part beyond the maximum length and parse the truncated element.
Allow HE operation element to be longer than expected to handle this case and future extensions.
Fixes: e4d005b80dee ("mac80211: refactor extended element parsing") Signed-off-by: Brian Norris briannorris@chromium.org Signed-off-by: Yen-lin Lai yenlinlai@chromium.org Link: https://lore.kernel.org/r/20210223051926.2653301-1-yenlinlai@chromium.org Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/mlme.c | 2 +- net/mac80211/util.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c index 0e4d950cf907..9db648a91a4f 100644 --- a/net/mac80211/mlme.c +++ b/net/mac80211/mlme.c @@ -5071,7 +5071,7 @@ static int ieee80211_prep_channel(struct ieee80211_sub_if_data *sdata, he_oper_ie = cfg80211_find_ext_ie(WLAN_EID_EXT_HE_OPERATION, ies->data, ies->len); if (he_oper_ie && - he_oper_ie[1] == ieee80211_he_oper_size(&he_oper_ie[3])) + he_oper_ie[1] >= ieee80211_he_oper_size(&he_oper_ie[3])) he_oper = (void *)(he_oper_ie + 3); else he_oper = NULL; diff --git a/net/mac80211/util.c b/net/mac80211/util.c index 8d3ae6b2f95f..f4507a708965 100644 --- a/net/mac80211/util.c +++ b/net/mac80211/util.c @@ -968,7 +968,7 @@ static void ieee80211_parse_extension_element(u32 *crc, break; case WLAN_EID_EXT_HE_OPERATION: if (len >= sizeof(*elems->he_operation) && - len == ieee80211_he_oper_size(data) - 1) { + len >= ieee80211_he_oper_size(data) - 1) { if (crc) *crc = crc32_be(*crc, (void *)elem, elem->datalen + 2);
From: Carlos Llamas cmllamas@google.com
[ Upstream commit 81f711d67a973bf8a6db9556faf299b4074d536e ]
Fix multiple warnings seen with gcc 10.2.1: reuseaddr_ports_exhausted.c:32:41: warning: missing braces around initializer [-Wmissing-braces] 32 | struct reuse_opts unreusable_opts[12] = { | ^ 33 | {0, 0, 0, 0}, | { } { }
Fixes: 7f204a7de8b0 ("selftests: net: Add SO_REUSEADDR test to check if 4-tuples are fully utilized.") Signed-off-by: Carlos Llamas cmllamas@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../selftests/net/reuseaddr_ports_exhausted.c | 32 +++++++++---------- 1 file changed, 16 insertions(+), 16 deletions(-)
diff --git a/tools/testing/selftests/net/reuseaddr_ports_exhausted.c b/tools/testing/selftests/net/reuseaddr_ports_exhausted.c index 7b01b7c2ec10..066efd30e294 100644 --- a/tools/testing/selftests/net/reuseaddr_ports_exhausted.c +++ b/tools/testing/selftests/net/reuseaddr_ports_exhausted.c @@ -30,25 +30,25 @@ struct reuse_opts { };
struct reuse_opts unreusable_opts[12] = { - {0, 0, 0, 0}, - {0, 0, 0, 1}, - {0, 0, 1, 0}, - {0, 0, 1, 1}, - {0, 1, 0, 0}, - {0, 1, 0, 1}, - {0, 1, 1, 0}, - {0, 1, 1, 1}, - {1, 0, 0, 0}, - {1, 0, 0, 1}, - {1, 0, 1, 0}, - {1, 0, 1, 1}, + {{0, 0}, {0, 0}}, + {{0, 0}, {0, 1}}, + {{0, 0}, {1, 0}}, + {{0, 0}, {1, 1}}, + {{0, 1}, {0, 0}}, + {{0, 1}, {0, 1}}, + {{0, 1}, {1, 0}}, + {{0, 1}, {1, 1}}, + {{1, 0}, {0, 0}}, + {{1, 0}, {0, 1}}, + {{1, 0}, {1, 0}}, + {{1, 0}, {1, 1}}, };
struct reuse_opts reusable_opts[4] = { - {1, 1, 0, 0}, - {1, 1, 0, 1}, - {1, 1, 1, 0}, - {1, 1, 1, 1}, + {{1, 1}, {0, 0}}, + {{1, 1}, {0, 1}}, + {{1, 1}, {1, 0}}, + {{1, 1}, {1, 1}}, };
int bind_port(struct __test_metadata *_metadata, int reuseaddr, int reuseport)
From: Louis Peens louis.peens@corigine.com
[ Upstream commit 982e5ee23d764fe6158f67a7813d416335e978b0 ]
There are some pre_tunnel flows combinations which are incorrectly being offloaded without proper support, fix these.
- Matching on MPLS is not supported for pre_tun. - Match on IPv4/IPv6 layer must be present. - Destination MAC address must match pre_tun.dev MAC
Fixes: 120ffd84a9ec ("nfp: flower: verify pre-tunnel rules") Signed-off-by: Louis Peens louis.peens@corigine.com Signed-off-by: Simon Horman simon.horman@netronome.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/netronome/nfp/flower/offload.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+)
diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/drivers/net/ethernet/netronome/nfp/flower/offload.c index 1c59aff2163c..d72225d64a75 100644 --- a/drivers/net/ethernet/netronome/nfp/flower/offload.c +++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c @@ -1142,6 +1142,12 @@ nfp_flower_validate_pre_tun_rule(struct nfp_app *app, return -EOPNOTSUPP; }
+ if (!(key_layer & NFP_FLOWER_LAYER_IPV4) && + !(key_layer & NFP_FLOWER_LAYER_IPV6)) { + NL_SET_ERR_MSG_MOD(extack, "unsupported pre-tunnel rule: match on ipv4/ipv6 eth_type must be present"); + return -EOPNOTSUPP; + } + /* Skip fields known to exist. */ mask += sizeof(struct nfp_flower_meta_tci); ext += sizeof(struct nfp_flower_meta_tci); @@ -1152,6 +1158,13 @@ nfp_flower_validate_pre_tun_rule(struct nfp_app *app, mask += sizeof(struct nfp_flower_in_port); ext += sizeof(struct nfp_flower_in_port);
+ /* Ensure destination MAC address matches pre_tun_dev. */ + mac = (struct nfp_flower_mac_mpls *)ext; + if (memcmp(&mac->mac_dst[0], flow->pre_tun_rule.dev->dev_addr, 6)) { + NL_SET_ERR_MSG_MOD(extack, "unsupported pre-tunnel rule: dest MAC must match output dev MAC"); + return -EOPNOTSUPP; + } + /* Ensure destination MAC address is fully matched. */ mac = (struct nfp_flower_mac_mpls *)mask; if (!is_broadcast_ether_addr(&mac->mac_dst[0])) { @@ -1159,6 +1172,11 @@ nfp_flower_validate_pre_tun_rule(struct nfp_app *app, return -EOPNOTSUPP; }
+ if (mac->mpls_lse) { + NL_SET_ERR_MSG_MOD(extack, "unsupported pre-tunnel rule: MPLS not supported"); + return -EOPNOTSUPP; + } + mask += sizeof(struct nfp_flower_mac_mpls); ext += sizeof(struct nfp_flower_mac_mpls); if (key_layer & NFP_FLOWER_LAYER_IPV4 ||
From: Louis Peens louis.peens@corigine.com
[ Upstream commit 5c4f5e19d6a8e159127b9d653bb67e0dc7a28047 ]
Differentiate between ipv4 and ipv6 flows when configuring the pre_tunnel table to prevent them trampling each other in the table.
Fixes: 783461604f7e ("nfp: flower: update flow merge code to support IPv6 tunnels") Signed-off-by: Louis Peens louis.peens@corigine.com Signed-off-by: Simon Horman simon.horman@netronome.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/netronome/nfp/flower/tunnel_conf.c | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/flower/tunnel_conf.c b/drivers/net/ethernet/netronome/nfp/flower/tunnel_conf.c index 7248d248f604..d19c02e99114 100644 --- a/drivers/net/ethernet/netronome/nfp/flower/tunnel_conf.c +++ b/drivers/net/ethernet/netronome/nfp/flower/tunnel_conf.c @@ -16,8 +16,9 @@ #define NFP_FL_MAX_ROUTES 32
#define NFP_TUN_PRE_TUN_RULE_LIMIT 32 -#define NFP_TUN_PRE_TUN_RULE_DEL 0x1 -#define NFP_TUN_PRE_TUN_IDX_BIT 0x8 +#define NFP_TUN_PRE_TUN_RULE_DEL BIT(0) +#define NFP_TUN_PRE_TUN_IDX_BIT BIT(3) +#define NFP_TUN_PRE_TUN_IPV6_BIT BIT(7)
/** * struct nfp_tun_pre_run_rule - rule matched before decap @@ -1268,6 +1269,7 @@ int nfp_flower_xmit_pre_tun_flow(struct nfp_app *app, { struct nfp_flower_priv *app_priv = app->priv; struct nfp_tun_offloaded_mac *mac_entry; + struct nfp_flower_meta_tci *key_meta; struct nfp_tun_pre_tun_rule payload; struct net_device *internal_dev; int err; @@ -1290,6 +1292,15 @@ int nfp_flower_xmit_pre_tun_flow(struct nfp_app *app, if (!mac_entry) return -ENOENT;
+ /* Set/clear IPV6 bit. cpu_to_be16() swap will lead to MSB being + * set/clear for port_idx. + */ + key_meta = (struct nfp_flower_meta_tci *)flow->unmasked_data; + if (key_meta->nfp_flow_key_layer & NFP_FLOWER_LAYER_IPV6) + mac_entry->index |= NFP_TUN_PRE_TUN_IPV6_BIT; + else + mac_entry->index &= ~NFP_TUN_PRE_TUN_IPV6_BIT; + payload.port_idx = cpu_to_be16(mac_entry->index);
/* Copy mac id and vlan to flow - dev may not exist at delete time. */
From: Louis Peens louis.peens@corigine.com
[ Upstream commit d8ce0275e45ec809a33f98fc080fe7921b720dfb ]
pre_tun_rule flows does not follow the usual add-flow path, instead they are used to update the pre_tun table on the firmware. This means that if the mask-id gets allocated here the firmware will never see the "NFP_FL_META_FLAG_MANAGE_MASK" flag for the specific mask id, which triggers the allocation on the firmware side. This leads to the firmware mask being corrupted and causing all sorts of strange behaviour.
Fixes: f12725d98cbe ("nfp: flower: offload pre-tunnel rules") Signed-off-by: Louis Peens louis.peens@corigine.com Signed-off-by: Simon Horman simon.horman@netronome.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/netronome/nfp/flower/metadata.c | 24 +++++++++++++------ 1 file changed, 17 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/flower/metadata.c b/drivers/net/ethernet/netronome/nfp/flower/metadata.c index 5defd31d481c..aa06fcb38f8b 100644 --- a/drivers/net/ethernet/netronome/nfp/flower/metadata.c +++ b/drivers/net/ethernet/netronome/nfp/flower/metadata.c @@ -327,8 +327,14 @@ int nfp_compile_flow_metadata(struct nfp_app *app, goto err_free_ctx_entry; }
+ /* Do net allocate a mask-id for pre_tun_rules. These flows are used to + * configure the pre_tun table and are never actually send to the + * firmware as an add-flow message. This causes the mask-id allocation + * on the firmware to get out of sync if allocated here. + */ new_mask_id = 0; - if (!nfp_check_mask_add(app, nfp_flow->mask_data, + if (!nfp_flow->pre_tun_rule.dev && + !nfp_check_mask_add(app, nfp_flow->mask_data, nfp_flow->meta.mask_len, &nfp_flow->meta.flags, &new_mask_id)) { NL_SET_ERR_MSG_MOD(extack, "invalid entry: cannot allocate a new mask id"); @@ -359,7 +365,8 @@ int nfp_compile_flow_metadata(struct nfp_app *app, goto err_remove_mask; }
- if (!nfp_check_mask_remove(app, nfp_flow->mask_data, + if (!nfp_flow->pre_tun_rule.dev && + !nfp_check_mask_remove(app, nfp_flow->mask_data, nfp_flow->meta.mask_len, NULL, &new_mask_id)) { NL_SET_ERR_MSG_MOD(extack, "invalid entry: cannot release mask id"); @@ -374,8 +381,10 @@ int nfp_compile_flow_metadata(struct nfp_app *app, return 0;
err_remove_mask: - nfp_check_mask_remove(app, nfp_flow->mask_data, nfp_flow->meta.mask_len, - NULL, &new_mask_id); + if (!nfp_flow->pre_tun_rule.dev) + nfp_check_mask_remove(app, nfp_flow->mask_data, + nfp_flow->meta.mask_len, + NULL, &new_mask_id); err_remove_rhash: WARN_ON_ONCE(rhashtable_remove_fast(&priv->stats_ctx_table, &ctx_entry->ht_node, @@ -406,9 +415,10 @@ int nfp_modify_flow_metadata(struct nfp_app *app,
__nfp_modify_flow_metadata(priv, nfp_flow);
- nfp_check_mask_remove(app, nfp_flow->mask_data, - nfp_flow->meta.mask_len, &nfp_flow->meta.flags, - &new_mask_id); + if (!nfp_flow->pre_tun_rule.dev) + nfp_check_mask_remove(app, nfp_flow->mask_data, + nfp_flow->meta.mask_len, &nfp_flow->meta.flags, + &new_mask_id);
/* Update flow payload with mask ids. */ nfp_flow->unmasked_data[NFP_FL_MASK_ID_LOCATION] = new_mask_id;
From: Alexei Starovoitov ast@kernel.org
[ Upstream commit 8a141dd7f7060d1e64c14a5257e0babae20ac99b ]
The following sequence of commands: register_ftrace_direct(ip, addr1); modify_ftrace_direct(ip, addr1, addr2); unregister_ftrace_direct(ip, addr2); will cause the kernel to warn: [ 30.179191] WARNING: CPU: 2 PID: 1961 at kernel/trace/ftrace.c:5223 unregister_ftrace_direct+0x130/0x150 [ 30.180556] CPU: 2 PID: 1961 Comm: test_progs W O 5.12.0-rc2-00378-g86bc10a0a711-dirty #3246 [ 30.182453] RIP: 0010:unregister_ftrace_direct+0x130/0x150
When modify_ftrace_direct() changes the addr from old to new it should update the addr stored in ftrace_direct_funcs. Otherwise the final unregister_ftrace_direct() won't find the address and will cause the splat.
Fixes: 0567d6809182 ("ftrace: Add modify_ftrace_direct()") Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Reviewed-by: Steven Rostedt (VMware) rostedt@goodmis.org Link: https://lore.kernel.org/bpf/20210316195815.34714-1-alexei.starovoitov@gmail.... Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/ftrace.c | 43 ++++++++++++++++++++++++++++++++++++++----- 1 file changed, 38 insertions(+), 5 deletions(-)
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index 4d8e35575549..b7e29db127fa 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -5045,6 +5045,20 @@ struct ftrace_direct_func *ftrace_find_direct_func(unsigned long addr) return NULL; }
+static struct ftrace_direct_func *ftrace_alloc_direct_func(unsigned long addr) +{ + struct ftrace_direct_func *direct; + + direct = kmalloc(sizeof(*direct), GFP_KERNEL); + if (!direct) + return NULL; + direct->addr = addr; + direct->count = 0; + list_add_rcu(&direct->next, &ftrace_direct_funcs); + ftrace_direct_func_count++; + return direct; +} + /** * register_ftrace_direct - Call a custom trampoline directly * @ip: The address of the nop at the beginning of a function @@ -5120,15 +5134,11 @@ int register_ftrace_direct(unsigned long ip, unsigned long addr)
direct = ftrace_find_direct_func(addr); if (!direct) { - direct = kmalloc(sizeof(*direct), GFP_KERNEL); + direct = ftrace_alloc_direct_func(addr); if (!direct) { kfree(entry); goto out_unlock; } - direct->addr = addr; - direct->count = 0; - list_add_rcu(&direct->next, &ftrace_direct_funcs); - ftrace_direct_func_count++; }
entry->ip = ip; @@ -5329,6 +5339,7 @@ int __weak ftrace_modify_direct_caller(struct ftrace_func_entry *entry, int modify_ftrace_direct(unsigned long ip, unsigned long old_addr, unsigned long new_addr) { + struct ftrace_direct_func *direct, *new_direct = NULL; struct ftrace_func_entry *entry; struct dyn_ftrace *rec; int ret = -ENODEV; @@ -5344,6 +5355,20 @@ int modify_ftrace_direct(unsigned long ip, if (entry->direct != old_addr) goto out_unlock;
+ direct = ftrace_find_direct_func(old_addr); + if (WARN_ON(!direct)) + goto out_unlock; + if (direct->count > 1) { + ret = -ENOMEM; + new_direct = ftrace_alloc_direct_func(new_addr); + if (!new_direct) + goto out_unlock; + direct->count--; + new_direct->count++; + } else { + direct->addr = new_addr; + } + /* * If there's no other ftrace callback on the rec->ip location, * then it can be changed directly by the architecture. @@ -5357,6 +5382,14 @@ int modify_ftrace_direct(unsigned long ip, ret = 0; }
+ if (unlikely(ret && new_direct)) { + direct->count++; + list_del_rcu(&new_direct->next); + synchronize_rcu_tasks(); + kfree(new_direct); + ftrace_direct_func_count--; + } + out_unlock: mutex_unlock(&ftrace_lock); mutex_unlock(&direct_mutex);
From: Dmitry Baryshkov dmitry.baryshkov@linaro.org
[ Upstream commit 3b24cdfc721a5f1098da22f9f68ff5f4a5efccc9 ]
Fix setting min/max DSI PLL rate for the V4.1 7nm DSI PLL (used on sm8250). Current code checks for pll->type before it is set (as it is set in the msm_dsi_pll_init() after calling device-specific functions.
Cc: Jonathan Marek jonathan@marek.ca Fixes: 1ef7c99d145c ("drm/msm/dsi: add support for 7nm DSI PHY/PLL") Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Signed-off-by: Rob Clark robdclark@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/dsi/pll/dsi_pll.c | 2 +- drivers/gpu/drm/msm/dsi/pll/dsi_pll.h | 6 ++++-- drivers/gpu/drm/msm/dsi/pll/dsi_pll_7nm.c | 5 +++-- 3 files changed, 8 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/msm/dsi/pll/dsi_pll.c b/drivers/gpu/drm/msm/dsi/pll/dsi_pll.c index a45fe95aff49..3dc65877fa10 100644 --- a/drivers/gpu/drm/msm/dsi/pll/dsi_pll.c +++ b/drivers/gpu/drm/msm/dsi/pll/dsi_pll.c @@ -163,7 +163,7 @@ struct msm_dsi_pll *msm_dsi_pll_init(struct platform_device *pdev, break; case MSM_DSI_PHY_7NM: case MSM_DSI_PHY_7NM_V4_1: - pll = msm_dsi_pll_7nm_init(pdev, id); + pll = msm_dsi_pll_7nm_init(pdev, type, id); break; default: pll = ERR_PTR(-ENXIO); diff --git a/drivers/gpu/drm/msm/dsi/pll/dsi_pll.h b/drivers/gpu/drm/msm/dsi/pll/dsi_pll.h index 3405982a092c..bbecb1de5678 100644 --- a/drivers/gpu/drm/msm/dsi/pll/dsi_pll.h +++ b/drivers/gpu/drm/msm/dsi/pll/dsi_pll.h @@ -117,10 +117,12 @@ msm_dsi_pll_10nm_init(struct platform_device *pdev, int id) } #endif #ifdef CONFIG_DRM_MSM_DSI_7NM_PHY -struct msm_dsi_pll *msm_dsi_pll_7nm_init(struct platform_device *pdev, int id); +struct msm_dsi_pll *msm_dsi_pll_7nm_init(struct platform_device *pdev, + enum msm_dsi_phy_type type, int id); #else static inline struct msm_dsi_pll * -msm_dsi_pll_7nm_init(struct platform_device *pdev, int id) +msm_dsi_pll_7nm_init(struct platform_device *pdev, + enum msm_dsi_phy_type type, int id) { return ERR_PTR(-ENODEV); } diff --git a/drivers/gpu/drm/msm/dsi/pll/dsi_pll_7nm.c b/drivers/gpu/drm/msm/dsi/pll/dsi_pll_7nm.c index 93bf142e4a4e..c1f6708367ae 100644 --- a/drivers/gpu/drm/msm/dsi/pll/dsi_pll_7nm.c +++ b/drivers/gpu/drm/msm/dsi/pll/dsi_pll_7nm.c @@ -852,7 +852,8 @@ err_base_clk_hw: return ret; }
-struct msm_dsi_pll *msm_dsi_pll_7nm_init(struct platform_device *pdev, int id) +struct msm_dsi_pll *msm_dsi_pll_7nm_init(struct platform_device *pdev, + enum msm_dsi_phy_type type, int id) { struct dsi_pll_7nm *pll_7nm; struct msm_dsi_pll *pll; @@ -885,7 +886,7 @@ struct msm_dsi_pll *msm_dsi_pll_7nm_init(struct platform_device *pdev, int id) pll = &pll_7nm->base; pll->min_rate = 1000000000UL; pll->max_rate = 3500000000UL; - if (pll->type == MSM_DSI_PHY_7NM_V4_1) { + if (type == MSM_DSI_PHY_7NM_V4_1) { pll->min_rate = 600000000UL; pll->max_rate = (unsigned long)5000000000ULL; /* workaround for max rate overflowing on 32-bit builds: */
From: Shannon Nelson snelson@pensando.io
[ Upstream commit d2c21422323b06938b3c070361dc544f047489d7 ]
We were linearizing non-TSO skbs that had too many frags, but we weren't checking number of frags on TSO skbs. This could lead to a bad page reference when we received a TSO skb with more frags than the Tx descriptor could support.
v2: use gso_segs rather than yet another division don't rework the check on the nr_frags
Fixes: 0f3154e6bcb3 ("ionic: Add Tx and Rx handling") Signed-off-by: Shannon Nelson snelson@pensando.io Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/pensando/ionic/ionic_txrx.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_txrx.c b/drivers/net/ethernet/pensando/ionic/ionic_txrx.c index ac4cd5d82e69..b7601cadcb8c 100644 --- a/drivers/net/ethernet/pensando/ionic/ionic_txrx.c +++ b/drivers/net/ethernet/pensando/ionic/ionic_txrx.c @@ -1079,15 +1079,17 @@ static int ionic_tx_descs_needed(struct ionic_queue *q, struct sk_buff *skb) { int sg_elems = q->lif->qtype_info[IONIC_QTYPE_TXQ].max_sg_elems; struct ionic_tx_stats *stats = q_to_tx_stats(q); + int ndescs; int err;
- /* If TSO, need roundup(skb->len/mss) descs */ + /* Each desc is mss long max, so a descriptor for each gso_seg */ if (skb_is_gso(skb)) - return (skb->len / skb_shinfo(skb)->gso_size) + 1; + ndescs = skb_shinfo(skb)->gso_segs; + else + ndescs = 1;
- /* If non-TSO, just need 1 desc and nr_frags sg elems */ if (skb_shinfo(skb)->nr_frags <= sg_elems) - return 1; + return ndescs;
/* Too many frags, so linearize */ err = skb_linearize(skb); @@ -1096,8 +1098,7 @@ static int ionic_tx_descs_needed(struct ionic_queue *q, struct sk_buff *skb)
stats->linearize++;
- /* Need 1 desc and zero sg elems */ - return 1; + return ndescs; }
static int ionic_maybe_stop_tx(struct ionic_queue *q, int ndescs)
From: wenxu wenxu@ucloud.cn
[ Upstream commit afa536d8405a9ca36e45ba035554afbb8da27b82 ]
The ct_state validate should not only check the mask bit and also check mask_bit & key_bit.. For the +new+est case example, The 'new' and 'est' bits should be set in both state_mask and state flags. Or the -new-est case also will be reject by kernel. When Openvswitch with two flows ct_state=+trk+new,action=commit,forward ct_state=+trk+est,action=forward
A packet go through the kernel and the contrack state is invalid, The ct_state will be +trk-inv. Upcall to the ovs-vswitchd, the finally dp action will be drop with -new-est+trk.
Fixes: 1bcc51ac0731 ("net/sched: cls_flower: Reject invalid ct_state flags rules") Fixes: 3aed8b63336c ("net/sched: cls_flower: validate ct_state for invalid and reply flags") Signed-off-by: wenxu wenxu@ucloud.cn Reviewed-by: Marcelo Ricardo Leitner marcelo.leitner@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/cls_flower.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c index 46c1b3e9f66a..14316ba9b3b3 100644 --- a/net/sched/cls_flower.c +++ b/net/sched/cls_flower.c @@ -1432,7 +1432,7 @@ static int fl_set_key_ct(struct nlattr **tb, &mask->ct_state, TCA_FLOWER_KEY_CT_STATE_MASK, sizeof(key->ct_state));
- err = fl_validate_ct_state(mask->ct_state, + err = fl_validate_ct_state(key->ct_state & mask->ct_state, tb[TCA_FLOWER_KEY_CT_STATE_MASK], extack); if (err)
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit 7e6136f1b7272b2202817cff37ada355eb5e6784 ]
Error was not set accordingly.
Fixes: 8bb69f3b2918 ("netfilter: nf_tables: add flowtable offload control plane") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 8ee9f40cc0ea..2aae0df0d70d 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -6929,8 +6929,10 @@ static int nf_tables_newflowtable(struct net *net, struct sock *nlsk, if (nla[NFTA_FLOWTABLE_FLAGS]) { flowtable->data.flags = ntohl(nla_get_be32(nla[NFTA_FLOWTABLE_FLAGS])); - if (flowtable->data.flags & ~NFT_FLOWTABLE_MASK) + if (flowtable->data.flags & ~NFT_FLOWTABLE_MASK) { + err = -EOPNOTSUPP; goto err3; + } }
write_pnet(&flowtable->data.net, net);
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit 7b35582cd04ace2fd1807c1b624934e465cc939d ]
Honor flowtable flags from the control update path. Disallow disabling to toggle hardware offload support though.
Fixes: 8bb69f3b2918 ("netfilter: nf_tables: add flowtable offload control plane") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/netfilter/nf_tables.h | 3 +++ net/netfilter/nf_tables_api.c | 15 +++++++++++++++ 2 files changed, 18 insertions(+)
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h index 4b6ecf532623..6799f95eea65 100644 --- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -1531,6 +1531,7 @@ struct nft_trans_flowtable { struct nft_flowtable *flowtable; bool update; struct list_head hook_list; + u32 flags; };
#define nft_trans_flowtable(trans) \ @@ -1539,6 +1540,8 @@ struct nft_trans_flowtable { (((struct nft_trans_flowtable *)trans->data)->update) #define nft_trans_flowtable_hooks(trans) \ (((struct nft_trans_flowtable *)trans->data)->hook_list) +#define nft_trans_flowtable_flags(trans) \ + (((struct nft_trans_flowtable *)trans->data)->flags)
int __init nft_chain_filter_init(void); void nft_chain_filter_fini(void); diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 2aae0df0d70d..24a7a6b17268 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -6808,6 +6808,7 @@ static int nft_flowtable_update(struct nft_ctx *ctx, const struct nlmsghdr *nlh, struct nft_hook *hook, *next; struct nft_trans *trans; bool unregister = false; + u32 flags; int err;
err = nft_flowtable_parse_hook(ctx, nla[NFTA_FLOWTABLE_HOOK], @@ -6822,6 +6823,17 @@ static int nft_flowtable_update(struct nft_ctx *ctx, const struct nlmsghdr *nlh, } }
+ if (nla[NFTA_FLOWTABLE_FLAGS]) { + flags = ntohl(nla_get_be32(nla[NFTA_FLOWTABLE_FLAGS])); + if (flags & ~NFT_FLOWTABLE_MASK) + return -EOPNOTSUPP; + if ((flowtable->data.flags & NFT_FLOWTABLE_HW_OFFLOAD) ^ + (flags & NFT_FLOWTABLE_HW_OFFLOAD)) + return -EOPNOTSUPP; + } else { + flags = flowtable->data.flags; + } + err = nft_register_flowtable_net_hooks(ctx->net, ctx->table, &flowtable_hook.list, flowtable); if (err < 0) @@ -6835,6 +6847,7 @@ static int nft_flowtable_update(struct nft_ctx *ctx, const struct nlmsghdr *nlh, goto err_flowtable_update_hook; }
+ nft_trans_flowtable_flags(trans) = flags; nft_trans_flowtable(trans) = flowtable; nft_trans_flowtable_update(trans) = true; INIT_LIST_HEAD(&nft_trans_flowtable_hooks(trans)); @@ -8144,6 +8157,8 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb) break; case NFT_MSG_NEWFLOWTABLE: if (nft_trans_flowtable_update(trans)) { + nft_trans_flowtable(trans)->data.flags = + nft_trans_flowtable_flags(trans); nf_tables_flowtable_notify(&trans->ctx, nft_trans_flowtable(trans), &nft_trans_flowtable_hooks(trans),
From: Yinjun Zhang yinjun.zhang@corigine.com
[ Upstream commit 740b486a8d1f966e68ac0666f1fd57441a7cda94 ]
Currently flowtable's GC work is initialized as deferrable, which means GC cannot work on time when system is idle. So the hardware offloaded flow may be deleted for timeout, since its used time is not timely updated.
Resolve it by initializing the GC work as delayed work instead of deferrable.
Fixes: c29f74e0df7a ("netfilter: nf_flow_table: hardware offload support") Signed-off-by: Yinjun Zhang yinjun.zhang@corigine.com Signed-off-by: Louis Peens louis.peens@corigine.com Signed-off-by: Simon Horman simon.horman@netronome.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_flow_table_core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c index 4a4acbba78ff..b03feb6e1226 100644 --- a/net/netfilter/nf_flow_table_core.c +++ b/net/netfilter/nf_flow_table_core.c @@ -506,7 +506,7 @@ int nf_flow_table_init(struct nf_flowtable *flowtable) { int err;
- INIT_DEFERRABLE_WORK(&flowtable->gc_work, nf_flow_offload_work_gc); + INIT_DELAYED_WORK(&flowtable->gc_work, nf_flow_offload_work_gc); flow_block_init(&flowtable->flow_block); init_rwsem(&flowtable->flow_block_lock);
From: Namhyung Kim namhyung@kernel.org
[ Upstream commit 8f3f5792f2940c16ab63c614b26494c8689c9c1e ]
When it failed to get section names, it should call into bpf_object__elf_finish() like others.
Fixes: 88a82120282b ("libbpf: Factor out common ELF operations and improve logging") Signed-off-by: Namhyung Kim namhyung@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20210317145414.884817-1-namhyung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/lib/bpf/libbpf.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index a0d4fc4de402..8913e5e7bedb 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -1180,7 +1180,8 @@ static int bpf_object__elf_init(struct bpf_object *obj) if (!elf_rawdata(elf_getscn(obj->efile.elf, obj->efile.shstrndx), NULL)) { pr_warn("elf: failed to get section names strings from %s: %s\n", obj->path, elf_errmsg(-1)); - return -LIBBPF_ERRNO__FORMAT; + err = -LIBBPF_ERRNO__FORMAT; + goto errout; }
/* Old LLVM set e_machine to EM_NONE */
From: Kumar Kartikeya Dwivedi memxor@gmail.com
[ Upstream commit 58bfd95b554f1a23d01228672f86bb489bdbf4ba ]
Otherwise, there exists a small window between the opening and closing of the socket fd where it may leak into processes launched by some other thread.
Fixes: 949abbe88436 ("libbpf: add function to setup XDP") Signed-off-by: Kumar Kartikeya Dwivedi memxor@gmail.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Toke Høiland-Jørgensen toke@redhat.com Link: https://lore.kernel.org/bpf/20210317115857.6536-1-memxor@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/lib/bpf/netlink.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/lib/bpf/netlink.c b/tools/lib/bpf/netlink.c index 4dd73de00b6f..d2cb28e9ef52 100644 --- a/tools/lib/bpf/netlink.c +++ b/tools/lib/bpf/netlink.c @@ -40,7 +40,7 @@ static int libbpf_netlink_open(__u32 *nl_pid) memset(&sa, 0, sizeof(sa)); sa.nl_family = AF_NETLINK;
- sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE); + sock = socket(AF_NETLINK, SOCK_RAW | SOCK_CLOEXEC, NETLINK_ROUTE); if (sock < 0) return -errno;
From: dillon min dillon.minfei@gmail.com
[ Upstream commit e4817a1b6b77db538bc0141c3b138f2df803ce87 ]
For NAND Ecc layout, there is a dependency from old kernel's nand driver setting and current. if old kernel use 4 bit ecc , we should use 4 bit in new kernel either. else will run into following error at filesystem mounting.
So, enable fsl,use-minimum-ecc from device tree, to fix this mismatch
[ 9.449265] ubi0: scanning is finished [ 9.463968] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 22528 bytes from PEB 513:4096, read only 22528 bytes, retry [ 9.486940] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 22528 bytes from PEB 513:4096, read only 22528 bytes, retry [ 9.509906] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 22528 bytes from PEB 513:4096, read only 22528 bytes, retry [ 9.532845] ubi0 error: ubi_io_read: error -74 (ECC error) while reading 22528 bytes from PEB 513:4096, read 22528 bytes
Fixes: f9ecf10cb88c ("ARM: dts: imx6ull: add MYiR MYS-6ULX SBC") Signed-off-by: dillon min dillon.minfei@gmail.com Reviewed-by: Fabio Estevam festevam@gmail.com Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/imx6ull-myir-mys-6ulx-eval.dts | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/arm/boot/dts/imx6ull-myir-mys-6ulx-eval.dts b/arch/arm/boot/dts/imx6ull-myir-mys-6ulx-eval.dts index ecbb2cc5b9ab..79cc45728cd2 100644 --- a/arch/arm/boot/dts/imx6ull-myir-mys-6ulx-eval.dts +++ b/arch/arm/boot/dts/imx6ull-myir-mys-6ulx-eval.dts @@ -14,5 +14,6 @@ };
&gpmi { + fsl,use-minimum-ecc; status = "okay"; };
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit dcc32f4f183ab8479041b23a1525d48233df1d43 ]
This reverts commit 6af1799aaf3f1bc8defedddfa00df3192445bbf3.
Commit 6af1799aaf3f ("ipv6: drop incoming packets having a v4mapped source address") introduced an input check against v4mapped addresses. Use of such addresses on the wire is indeed questionable and not allowed on public Internet. As the commit pointed out
https://tools.ietf.org/html/draft-itojun-v6ops-v4mapped-harmful-02
lists potential issues.
Unfortunately there are applications which use v4mapped addresses, and breaking them is a clear regression. For example v4mapped addresses (or any semi-valid addresses, really) may be used for uni-direction event streams or packet export.
Since the issue which sparked the addition of the check was with TCP and request_socks in particular push the check down to TCPv6 and DCCP. This restores the ability to receive UDPv6 packets with v4mapped address as the source.
Keep using the IPSTATS_MIB_INHDRERRORS statistic to minimize the user-visible changes.
Fixes: 6af1799aaf3f ("ipv6: drop incoming packets having a v4mapped source address") Reported-by: Sunyi Shao sunyishao@fb.com Signed-off-by: Jakub Kicinski kuba@kernel.org Acked-by: Mat Martineau mathew.j.martineau@linux.intel.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/dccp/ipv6.c | 5 +++++ net/ipv6/ip6_input.c | 10 ---------- net/ipv6/tcp_ipv6.c | 5 +++++ net/mptcp/subflow.c | 5 +++++ 4 files changed, 15 insertions(+), 10 deletions(-)
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c index 1f73603913f5..2be5c69824f9 100644 --- a/net/dccp/ipv6.c +++ b/net/dccp/ipv6.c @@ -319,6 +319,11 @@ static int dccp_v6_conn_request(struct sock *sk, struct sk_buff *skb) if (!ipv6_unicast_destination(skb)) return 0; /* discard, don't send a reset here */
+ if (ipv6_addr_v4mapped(&ipv6_hdr(skb)->saddr)) { + __IP6_INC_STATS(sock_net(sk), NULL, IPSTATS_MIB_INHDRERRORS); + return 0; + } + if (dccp_bad_service_code(sk, service)) { dcb->dccpd_reset_code = DCCP_RESET_CODE_BAD_SERVICE_CODE; goto drop; diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c index e96304d8a4a7..06d60662717d 100644 --- a/net/ipv6/ip6_input.c +++ b/net/ipv6/ip6_input.c @@ -245,16 +245,6 @@ static struct sk_buff *ip6_rcv_core(struct sk_buff *skb, struct net_device *dev, if (ipv6_addr_is_multicast(&hdr->saddr)) goto err;
- /* While RFC4291 is not explicit about v4mapped addresses - * in IPv6 headers, it seems clear linux dual-stack - * model can not deal properly with these. - * Security models could be fooled by ::ffff:127.0.0.1 for example. - * - * https://tools.ietf.org/html/draft-itojun-v6ops-v4mapped-harmful-02 - */ - if (ipv6_addr_v4mapped(&hdr->saddr)) - goto err; - skb->transport_header = skb->network_header + sizeof(*hdr); IP6CB(skb)->nhoff = offsetof(struct ipv6hdr, nexthdr);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 0e1509b02cb3..c07e5e8d557b 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -1175,6 +1175,11 @@ static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb) if (!ipv6_unicast_destination(skb)) goto drop;
+ if (ipv6_addr_v4mapped(&ipv6_hdr(skb)->saddr)) { + __IP6_INC_STATS(sock_net(sk), NULL, IPSTATS_MIB_INHDRERRORS); + return 0; + } + return tcp_conn_request(&tcp6_request_sock_ops, &tcp_request_sock_ipv6_ops, sk, skb);
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index c3090003a17b..96e040951cd4 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -440,6 +440,11 @@ static int subflow_v6_conn_request(struct sock *sk, struct sk_buff *skb) if (!ipv6_unicast_destination(skb)) goto drop;
+ if (ipv6_addr_v4mapped(&ipv6_hdr(skb)->saddr)) { + __IP6_INC_STATS(sock_net(sk), NULL, IPSTATS_MIB_INHDRERRORS); + return 0; + } + return tcp_conn_request(&mptcp_subflow_request_sock_ops, &subflow_request_sock_ipv6_ops, sk, skb);
From: Rakesh Babu rsaladi2@marvell.com
[ Upstream commit f7884097141b615b6ce89c16f456a53902b4eec3 ]
With the existing rsrc_alloc's format, there is misalignment for the pcifunc entries whose VF's index is a double digit. This patch fixes this.
pcifunc NPA NIX0 NIX1 SSO GROUP SSOWS TIM CPT0 CPT1 REE0 REE1 PF0:VF0 8 5 PF0:VF1 9 3 PF0:VF10 18 10 PF0:VF11 19 8 PF0:VF12 20 11 PF0:VF13 21 9 PF0:VF14 22 12 PF0:VF15 23 10 PF1 0 0
Fixes: 23205e6d06d4 ("octeontx2-af: Dump current resource provisioning status") Signed-off-by: Rakesh Babu rsaladi2@marvell.com Signed-off-by: Hariprasad Kelam hkelam@marvell.com Signed-off-by: Sunil Kovvuri Goutham sgoutham@marvell.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../marvell/octeontx2/af/rvu_debugfs.c | 46 ++++++++++++------- 1 file changed, 29 insertions(+), 17 deletions(-)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c index bb3fdaf33751..ea1e520b6552 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c @@ -150,12 +150,14 @@ static ssize_t rvu_dbg_rsrc_attach_status(struct file *filp, char __user *buffer, size_t count, loff_t *ppos) { - int index, off = 0, flag = 0, go_back = 0, off_prev; + int index, off = 0, flag = 0, go_back = 0, len = 0; struct rvu *rvu = filp->private_data; int lf, pf, vf, pcifunc; struct rvu_block block; int bytes_not_copied; + int lf_str_size = 12; int buf_size = 2048; + char *lfs; char *buf;
/* don't allow partial reads */ @@ -165,12 +167,18 @@ static ssize_t rvu_dbg_rsrc_attach_status(struct file *filp, buf = kzalloc(buf_size, GFP_KERNEL); if (!buf) return -ENOSPC; - off += scnprintf(&buf[off], buf_size - 1 - off, "\npcifunc\t\t"); + + lfs = kzalloc(lf_str_size, GFP_KERNEL); + if (!lfs) + return -ENOMEM; + off += scnprintf(&buf[off], buf_size - 1 - off, "%-*s", lf_str_size, + "pcifunc"); for (index = 0; index < BLK_COUNT; index++) - if (strlen(rvu->hw->block[index].name)) - off += scnprintf(&buf[off], buf_size - 1 - off, - "%*s\t", (index - 1) * 2, - rvu->hw->block[index].name); + if (strlen(rvu->hw->block[index].name)) { + off += scnprintf(&buf[off], buf_size - 1 - off, + "%-*s", lf_str_size, + rvu->hw->block[index].name); + } off += scnprintf(&buf[off], buf_size - 1 - off, "\n"); for (pf = 0; pf < rvu->hw->total_pfs; pf++) { for (vf = 0; vf <= rvu->hw->total_vfs; vf++) { @@ -179,14 +187,15 @@ static ssize_t rvu_dbg_rsrc_attach_status(struct file *filp, continue;
if (vf) { + sprintf(lfs, "PF%d:VF%d", pf, vf - 1); go_back = scnprintf(&buf[off], buf_size - 1 - off, - "PF%d:VF%d\t\t", pf, - vf - 1); + "%-*s", lf_str_size, lfs); } else { + sprintf(lfs, "PF%d", pf); go_back = scnprintf(&buf[off], buf_size - 1 - off, - "PF%d\t\t", pf); + "%-*s", lf_str_size, lfs); }
off += go_back; @@ -194,20 +203,22 @@ static ssize_t rvu_dbg_rsrc_attach_status(struct file *filp, block = rvu->hw->block[index]; if (!strlen(block.name)) continue; - off_prev = off; + len = 0; + lfs[len] = '\0'; for (lf = 0; lf < block.lf.max; lf++) { if (block.fn_map[lf] != pcifunc) continue; flag = 1; - off += scnprintf(&buf[off], buf_size - 1 - - off, "%3d,", lf); + len += sprintf(&lfs[len], "%d,", lf); } - if (flag && off_prev != off) - off--; - else - go_back++; + + if (flag) + len--; + lfs[len] = '\0'; off += scnprintf(&buf[off], buf_size - 1 - off, - "\t"); + "%-*s", lf_str_size, lfs); + if (!strlen(lfs)) + go_back += lf_str_size; } if (!flag) off -= go_back; @@ -219,6 +230,7 @@ static ssize_t rvu_dbg_rsrc_attach_status(struct file *filp, }
bytes_not_copied = copy_to_user(buffer, buf, off); + kfree(lfs); kfree(buf);
if (bytes_not_copied)
From: Subbaraya Sundeep sbhatta@marvell.com
[ Upstream commit ce86c2a531e2f2995ee55ea527c1f39ba1d95f73 ]
The MKEX profile describes what packet fields need to be extracted from the input packet and how to place those packet fields in the output key for MCAM matching. The MKEX profile can be in a way where higher layer packet fields can overwrite lower layer packet fields in output MCAM Key. Hence MKEX profile is always ensured that there are no overlaps between any of the layers. But the commit 42006910b5ea ("octeontx2-af: cleanup KPU config data") introduced TX TOS field which overlaps with DMAC in MCAM key. This led to AF driver returning error when TX rule is installed with DMAC as match criteria since DMAC gets overwritten and cannot be supported. This patch fixes the issue by removing TOS field from MKEX TX profile.
Fixes: 42006910b5ea ("octeontx2-af: cleanup KPU config data") Signed-off-by: Subbaraya Sundeep sbhatta@marvell.com Signed-off-by: Hariprasad Kelam hkelam@marvell.com Signed-off-by: Sunil Kovvuri Goutham sgoutham@marvell.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/marvell/octeontx2/af/npc_profile.h | 2 -- 1 file changed, 2 deletions(-)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/npc_profile.h b/drivers/net/ethernet/marvell/octeontx2/af/npc_profile.h index b192692b4fc4..5c372d2c24a1 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/npc_profile.h +++ b/drivers/net/ethernet/marvell/octeontx2/af/npc_profile.h @@ -13499,8 +13499,6 @@ static struct npc_mcam_kex npc_mkex_default = { [NPC_LT_LC_IP] = { /* SIP+DIP: 8 bytes, KW2[63:0] */ KEX_LD_CFG(0x07, 0xc, 0x1, 0x0, 0x10), - /* TOS: 1 byte, KW1[63:56] */ - KEX_LD_CFG(0x0, 0x1, 0x1, 0x0, 0xf), }, /* Layer C: IPv6 */ [NPC_LT_LC_IP6] = {
From: Geetha sowjanya gakula@marvell.com
[ Upstream commit ae2619dd4fccdad9876aa5f900bd85484179c50f ]
Current devlink code try to free already freed irqs as the irq_allocate flag is not cleared after free leading to kernel crash while removing rvu driver. The patch fixes the irq free sequence and clears the irq_allocate flag on free.
Fixes: 7304ac4567bc ("octeontx2-af: Add mailbox IRQ and msg handlers") Signed-off-by: Geetha sowjanya gakula@marvell.com Signed-off-by: Hariprasad Kelam hkelam@marvell.com Signed-off-by: Sunil Kovvuri Goutham sgoutham@marvell.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/marvell/octeontx2/af/rvu.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c index e8fd712860a1..e3fc6d1c0ec3 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c @@ -2358,8 +2358,10 @@ static void rvu_unregister_interrupts(struct rvu *rvu) INTR_MASK(rvu->hw->total_pfs) & ~1ULL);
for (irq = 0; irq < rvu->num_vec; irq++) { - if (rvu->irq_allocated[irq]) + if (rvu->irq_allocated[irq]) { free_irq(pci_irq_vector(rvu->pdev, irq), rvu); + rvu->irq_allocated[irq] = false; + } }
pci_free_irq_vectors(rvu->pdev); @@ -2873,8 +2875,8 @@ static void rvu_remove(struct pci_dev *pdev) struct rvu *rvu = pci_get_drvdata(pdev);
rvu_dbg_exit(rvu); - rvu_unregister_interrupts(rvu); rvu_unregister_dl(rvu); + rvu_unregister_interrupts(rvu); rvu_flr_wq_destroy(rvu); rvu_cgx_exit(rvu); rvu_fwdata_exit(rvu);
From: Geetha sowjanya gakula@marvell.com
[ Upstream commit f12098ce9b43e1a6fcaa524acbd90f9118a74c0a ]
RSS configuration can not be get/set when interface is in down state as they required mbox communication. RSS enable flag status is used for set/get configuration. Current code do not clear the RSS enable flag on interface down which lead to mbox error while trying to set/get RSS configuration.
Fixes: 85069e95e531 ("octeontx2-pf: Receive side scaling support") Signed-off-by: Geetha sowjanya gakula@marvell.com Signed-off-by: Hariprasad Kelam hkelam@marvell.com Signed-off-by: Sunil Kovvuri Goutham sgoutham@marvell.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c index 634d60655a74..07e841df5678 100644 --- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c +++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c @@ -1625,6 +1625,7 @@ int otx2_stop(struct net_device *netdev) struct otx2_nic *pf = netdev_priv(netdev); struct otx2_cq_poll *cq_poll = NULL; struct otx2_qset *qset = &pf->qset; + struct otx2_rss_info *rss; int qidx, vec, wrk;
netif_carrier_off(netdev); @@ -1637,6 +1638,10 @@ int otx2_stop(struct net_device *netdev) /* First stop packet Rx/Tx */ otx2_rxtx_enable(pf, false);
+ /* Clear RSS enable flag */ + rss = &pf->hw.rss_info; + rss->enable = false; + /* Cleanup Queue IRQ */ vec = pci_irq_vector(pf->pdev, pf->hw.nix_msixoff + NIX_LF_QINT_VEC_START);
From: Hariprasad Kelam hkelam@marvell.com
[ Upstream commit 64451b98306bf1334a62bcd020ec92bdb4cb68db ]
unmapping npc counter works in a way by traversing all mcam entries to find which mcam rule is associated with counter. But loop cursor variable 'entry' is not incremented before checking next mcam entry which resulting in infinite loop.
This in turn hogs the kworker thread forever and no other mbox message is processed by AF driver after that. Fix this by updating entry value before checking next mcam entry.
Fixes: a958dd59f9ce ("octeontx2-af: Map or unmap NPC MCAM entry and counter") Signed-off-by: Hariprasad Kelam hkelam@marvell.com Signed-off-by: Sunil Kovvuri Goutham sgoutham@marvell.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c index 5cf9b7a907ae..b81539f3b2ac 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c @@ -2490,10 +2490,10 @@ int rvu_mbox_handler_npc_mcam_free_counter(struct rvu *rvu, index = find_next_bit(mcam->bmap, mcam->bmap_entries, entry); if (index >= mcam->bmap_entries) break; + entry = index + 1; if (mcam->entry2cntr_map[index] != req->cntr) continue;
- entry = index + 1; npc_unmap_mcam_entry_and_cntr(rvu, mcam, blkaddr, index, req->cntr); }
From: Jiri Bohac jbohac@suse.cz
[ Upstream commit 6c015a2256801597fadcbc11d287774c9c512fa5 ]
__dev_alloc_name(), when supplied with a name containing '%d', will search for the first available device number to generate a unique device name.
Since commit ff92741270bf8b6e78aa885f166b68c7a67ab13a ("net: introduce name_node struct to be used in hashlist") network devices may have alternate names. __dev_alloc_name() does take these alternate names into account, possibly generating a name that is already taken and failing with -ENFILE as a result.
This demonstrates the bug:
# rmmod dummy 2>/dev/null # ip link property add dev lo altname dummy0 # modprobe dummy numdummies=1 modprobe: ERROR: could not insert 'dummy': Too many open files in system
Instead of creating a device named dummy1, modprobe fails.
Fix this by checking all the names in the d->name_node list, not just d->name.
Signed-off-by: Jiri Bohac jbohac@suse.cz Fixes: ff92741270bf ("net: introduce name_node struct to be used in hashlist") Reviewed-by: Jiri Pirko jiri@nvidia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/dev.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/net/core/dev.c b/net/core/dev.c index a5a1dbe66b76..541ee3bc467b 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -1182,6 +1182,18 @@ static int __dev_alloc_name(struct net *net, const char *name, char *buf) return -ENOMEM;
for_each_netdev(net, d) { + struct netdev_name_node *name_node; + list_for_each_entry(name_node, &d->name_node->list, list) { + if (!sscanf(name_node->name, name, &i)) + continue; + if (i < 0 || i >= max_netdevices) + continue; + + /* avoid cases where sscanf is not exact inverse of printf */ + snprintf(buf, IFNAMSIZ, name, i); + if (!strncmp(buf, name_node->name, IFNAMSIZ)) + set_bit(i, inuse); + } if (!sscanf(d->name, name, &i)) continue; if (i < 0 || i >= max_netdevices)
From: Johan Hovold johan@kernel.org
[ Upstream commit c79a707072fe3fea0e3c92edee6ca85c1e53c29f ]
Set the disconnected flag before releasing the data interface in case netdev registration fails to avoid having the disconnect callback try to deregister the never registered netdev (and trigger a WARN_ON()).
Fixes: 87cf65601e17 ("USB host CDC Phonet network interface driver") Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/cdc-phonet.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/usb/cdc-phonet.c b/drivers/net/usb/cdc-phonet.c index 02e6bbb17b15..8d1f69dad603 100644 --- a/drivers/net/usb/cdc-phonet.c +++ b/drivers/net/usb/cdc-phonet.c @@ -387,6 +387,8 @@ static int usbpn_probe(struct usb_interface *intf, const struct usb_device_id *i
err = register_netdev(dev); if (err) { + /* Set disconnected flag so that disconnect() returns early. */ + pnd->disconnected = 1; usb_driver_release_interface(&usbpn_driver, data_intf); goto out; }
From: Jesse Brandeburg jesse.brandeburg@intel.com
[ Upstream commit f0a03a026857d6c7766eb7d5835edbf5523ca15c ]
Add a couple of checks to make sure timestamping is on and that the timestamp value from DMA is valid. This avoids any functional issues that could come from a misinterpreted time stamp.
One of the functions changed doesn't need a return value added because there was no value in checking from the calling locations.
While here, fix a couple of reverse christmas tree issues next to the code being changed.
Fixes: f56e7bba22fa ("igb: Pull timestamp from fragment before adding it to skb") Fixes: 9cbc948b5a20 ("igb: add XDP support") Signed-off-by: Jesse Brandeburg jesse.brandeburg@intel.com Tested-by: Dave Switzer david.switzer@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igb/igb.h | 4 +-- drivers/net/ethernet/intel/igb/igb_main.c | 11 ++++---- drivers/net/ethernet/intel/igb/igb_ptp.c | 31 ++++++++++++++++++----- 3 files changed, 32 insertions(+), 14 deletions(-)
diff --git a/drivers/net/ethernet/intel/igb/igb.h b/drivers/net/ethernet/intel/igb/igb.h index aaa954aae574..7bda8c5edea5 100644 --- a/drivers/net/ethernet/intel/igb/igb.h +++ b/drivers/net/ethernet/intel/igb/igb.h @@ -748,8 +748,8 @@ void igb_ptp_suspend(struct igb_adapter *adapter); void igb_ptp_rx_hang(struct igb_adapter *adapter); void igb_ptp_tx_hang(struct igb_adapter *adapter); void igb_ptp_rx_rgtstamp(struct igb_q_vector *q_vector, struct sk_buff *skb); -void igb_ptp_rx_pktstamp(struct igb_q_vector *q_vector, void *va, - struct sk_buff *skb); +int igb_ptp_rx_pktstamp(struct igb_q_vector *q_vector, void *va, + struct sk_buff *skb); int igb_ptp_set_ts_config(struct net_device *netdev, struct ifreq *ifr); int igb_ptp_get_ts_config(struct net_device *netdev, struct ifreq *ifr); void igb_set_flag_queue_pairs(struct igb_adapter *, const u32); diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index 03f78fdb0dcd..de0fab0e7ce2 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -8319,9 +8319,10 @@ static struct sk_buff *igb_construct_skb(struct igb_ring *rx_ring, return NULL;
if (unlikely(igb_test_staterr(rx_desc, E1000_RXDADV_STAT_TSIP))) { - igb_ptp_rx_pktstamp(rx_ring->q_vector, xdp->data, skb); - xdp->data += IGB_TS_HDR_LEN; - size -= IGB_TS_HDR_LEN; + if (!igb_ptp_rx_pktstamp(rx_ring->q_vector, xdp->data, skb)) { + xdp->data += IGB_TS_HDR_LEN; + size -= IGB_TS_HDR_LEN; + } }
/* Determine available headroom for copy */ @@ -8382,8 +8383,8 @@ static struct sk_buff *igb_build_skb(struct igb_ring *rx_ring,
/* pull timestamp out of packet data */ if (igb_test_staterr(rx_desc, E1000_RXDADV_STAT_TSIP)) { - igb_ptp_rx_pktstamp(rx_ring->q_vector, skb->data, skb); - __skb_pull(skb, IGB_TS_HDR_LEN); + if (!igb_ptp_rx_pktstamp(rx_ring->q_vector, skb->data, skb)) + __skb_pull(skb, IGB_TS_HDR_LEN); }
/* update buffer offset */ diff --git a/drivers/net/ethernet/intel/igb/igb_ptp.c b/drivers/net/ethernet/intel/igb/igb_ptp.c index 7cc5428c3b3d..86a576201f5f 100644 --- a/drivers/net/ethernet/intel/igb/igb_ptp.c +++ b/drivers/net/ethernet/intel/igb/igb_ptp.c @@ -856,6 +856,9 @@ static void igb_ptp_tx_hwtstamp(struct igb_adapter *adapter) dev_kfree_skb_any(skb); }
+#define IGB_RET_PTP_DISABLED 1 +#define IGB_RET_PTP_INVALID 2 + /** * igb_ptp_rx_pktstamp - retrieve Rx per packet timestamp * @q_vector: Pointer to interrupt specific structure @@ -864,19 +867,29 @@ static void igb_ptp_tx_hwtstamp(struct igb_adapter *adapter) * * This function is meant to retrieve a timestamp from the first buffer of an * incoming frame. The value is stored in little endian format starting on - * byte 8. + * byte 8 + * + * Returns: 0 if success, nonzero if failure **/ -void igb_ptp_rx_pktstamp(struct igb_q_vector *q_vector, void *va, - struct sk_buff *skb) +int igb_ptp_rx_pktstamp(struct igb_q_vector *q_vector, void *va, + struct sk_buff *skb) { - __le64 *regval = (__le64 *)va; struct igb_adapter *adapter = q_vector->adapter; + __le64 *regval = (__le64 *)va; int adjust = 0;
+ if (!(adapter->ptp_flags & IGB_PTP_ENABLED)) + return IGB_RET_PTP_DISABLED; + /* The timestamp is recorded in little endian format. * DWORD: 0 1 2 3 * Field: Reserved Reserved SYSTIML SYSTIMH */ + + /* check reserved dwords are zero, be/le doesn't matter for zero */ + if (regval[0]) + return IGB_RET_PTP_INVALID; + igb_ptp_systim_to_hwtstamp(adapter, skb_hwtstamps(skb), le64_to_cpu(regval[1]));
@@ -896,6 +909,8 @@ void igb_ptp_rx_pktstamp(struct igb_q_vector *q_vector, void *va, } skb_hwtstamps(skb)->hwtstamp = ktime_sub_ns(skb_hwtstamps(skb)->hwtstamp, adjust); + + return 0; }
/** @@ -906,13 +921,15 @@ void igb_ptp_rx_pktstamp(struct igb_q_vector *q_vector, void *va, * This function is meant to retrieve a timestamp from the internal registers * of the adapter and store it in the skb. **/ -void igb_ptp_rx_rgtstamp(struct igb_q_vector *q_vector, - struct sk_buff *skb) +void igb_ptp_rx_rgtstamp(struct igb_q_vector *q_vector, struct sk_buff *skb) { struct igb_adapter *adapter = q_vector->adapter; struct e1000_hw *hw = &adapter->hw; - u64 regval; int adjust = 0; + u64 regval; + + if (!(adapter->ptp_flags & IGB_PTP_ENABLED)) + return;
/* If this bit is set, then the RX registers contain the time stamp. No * other packet will be time stamped until we read these registers, so
From: Xin Long lucien.xin@gmail.com
[ Upstream commit 8ff0b1f08ea73e5c08f5addd23481e76a60e741c ]
The sk's sk_route_caps is set in sctp_packet_config, and later it only needs to change when traversing the transport_list in a loop, as the dst might be changed in the tx path.
So move sk_route_caps check and set into sctp_outq_flush_transports from sctp_packet_transmit. This also fixes a dst leak reported by Chen Yi:
https://bugzilla.kernel.org/show_bug.cgi?id=212227
As calling sk_setup_caps() in sctp_packet_transmit may also set the sk_route_caps for the ctrl sock in a netns. When the netns is being deleted, the ctrl sock's releasing is later than dst dev's deleting, which will cause this dev's deleting to hang and dmesg error occurs:
unregister_netdevice: waiting for xxx to become free. Usage count = 1
Reported-by: Chen Yi yiche@redhat.com Fixes: bcd623d8e9fa ("sctp: call sk_setup_caps in sctp_packet_transmit instead") Signed-off-by: Xin Long lucien.xin@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/sctp/output.c | 7 ------- net/sctp/outqueue.c | 7 +++++++ 2 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/net/sctp/output.c b/net/sctp/output.c index 6614c9fdc51e..a6aa17df09ef 100644 --- a/net/sctp/output.c +++ b/net/sctp/output.c @@ -584,13 +584,6 @@ int sctp_packet_transmit(struct sctp_packet *packet, gfp_t gfp) goto out; }
- rcu_read_lock(); - if (__sk_dst_get(sk) != tp->dst) { - dst_hold(tp->dst); - sk_setup_caps(sk, tp->dst); - } - rcu_read_unlock(); - /* pack up chunks */ pkt_count = sctp_packet_pack(packet, head, gso, gfp); if (!pkt_count) { diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c index 3fd06a27105d..5cb1aa5f067b 100644 --- a/net/sctp/outqueue.c +++ b/net/sctp/outqueue.c @@ -1135,6 +1135,7 @@ static void sctp_outq_flush_data(struct sctp_flush_ctx *ctx,
static void sctp_outq_flush_transports(struct sctp_flush_ctx *ctx) { + struct sock *sk = ctx->asoc->base.sk; struct list_head *ltransport; struct sctp_packet *packet; struct sctp_transport *t; @@ -1144,6 +1145,12 @@ static void sctp_outq_flush_transports(struct sctp_flush_ctx *ctx) t = list_entry(ltransport, struct sctp_transport, send_ready); packet = &t->packet; if (!sctp_packet_empty(packet)) { + rcu_read_lock(); + if (t->dst && __sk_dst_get(sk) != t->dst) { + dst_hold(t->dst); + sk_setup_caps(sk, t->dst); + } + rcu_read_unlock(); error = sctp_packet_transmit(packet, ctx->gfp); if (error < 0) ctx->q->asoc->base.sk->sk_err = -error;
From: Hayes Wang hayeswang@realtek.com
[ Upstream commit f91a50d8b51b5c8ef1cfb08115a005bba4250507 ]
If the USB host controller is EHCI, the throughput is reduced from 300Mb/s to 60Mb/s, when the rx buffer size is modified from 16K to 32K.
According to the EHCI spec, the maximum size of the qTD is 20K. Therefore, when the driver uses more than 20K buffer, the latency time of EHCI would be increased. And, it let the RTL8153A get worse throughput.
However, the driver uses alloc_pages() for rx buffer, so I limit the rx buffer to 16K rather than 20K.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=205923 Fixes: ec5791c202ac ("r8152: separate the rx buffer size") Reported-by: Robert Davies robdavies1977@gmail.com Signed-off-by: Hayes Wang hayeswang@realtek.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/r8152.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c index fd5ca11c4cbb..390d9e1fa7fe 100644 --- a/drivers/net/usb/r8152.c +++ b/drivers/net/usb/r8152.c @@ -6502,7 +6502,10 @@ static int rtl_ops_init(struct r8152 *tp) ops->in_nway = rtl8153_in_nway; ops->hw_phy_cfg = r8153_hw_phy_cfg; ops->autosuspend_en = rtl8153_runtime_enable; - tp->rx_buf_sz = 32 * 1024; + if (tp->udev->speed < USB_SPEED_SUPER) + tp->rx_buf_sz = 16 * 1024; + else + tp->rx_buf_sz = 32 * 1024; tp->eee_en = true; tp->eee_adv = MDIO_EEE_1000T | MDIO_EEE_100TX; break;
From: Corentin Labbe clabbe@baylibre.com
[ Upstream commit 014dfa26ce1c647af09bf506285ef67e0e3f0a6b ]
MTU cannot be changed on dwmac-sun8i. (ip link set eth0 mtu xxx returning EINVAL) This is due to tx_fifo_size being 0, since this value is used to compute valid MTU range. Like dwmac-sunxi (with commit 806fd188ce2a ("net: stmmac: dwmac-sunxi: Provide TX and RX fifo sizes")) dwmac-sun8i need to have tx and rx fifo sizes set. I have used values from datasheets. After this patch, setting a non-default MTU (like 1000) value works and network is still useable.
Tested-on: sun8i-h3-orangepi-pc Tested-on: sun8i-r40-bananapi-m2-ultra Tested-on: sun50i-a64-bananapi-m64 Tested-on: sun50i-h5-nanopi-neo-plus2 Tested-on: sun50i-h6-pine-h64 Fixes: 9f93ac8d408 ("net-next: stmmac: Add dwmac-sun8i") Reported-by: Belisko Marek marek.belisko@gmail.com Signed-off-by: Corentin Labbe clabbe@baylibre.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c index a5e0eff4a387..9f5ccf1a0a54 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c @@ -1217,6 +1217,8 @@ static int sun8i_dwmac_probe(struct platform_device *pdev) plat_dat->init = sun8i_dwmac_init; plat_dat->exit = sun8i_dwmac_exit; plat_dat->setup = sun8i_dwmac_setup; + plat_dat->tx_fifo_size = 4096; + plat_dat->rx_fifo_size = 16384;
ret = sun8i_dwmac_set_syscon(&pdev->dev, plat_dat); if (ret)
From: David Brazdil dbrazdil@google.com
[ Upstream commit 1f935e8e72ec28dddb2dc0650b3b6626a293d94b ]
For AF_VSOCK, accept() currently returns sockets that are unlabelled. Other socket families derive the child's SID from the SID of the parent and the SID of the incoming packet. This is typically done as the connected socket is placed in the queue that accept() removes from.
Reuse the existing 'security_sk_clone' hook to copy the SID from the parent (server) socket to the child. There is no packet SID in this case.
Fixes: d021c344051a ("VSOCK: Introduce VM Sockets") Signed-off-by: David Brazdil dbrazdil@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/vmw_vsock/af_vsock.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index 5546710d8ac1..bc7fb9bf3351 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -755,6 +755,7 @@ static struct sock *__vsock_create(struct net *net, vsk->buffer_size = psk->buffer_size; vsk->buffer_min_size = psk->buffer_min_size; vsk->buffer_max_size = psk->buffer_max_size; + security_sk_clone(parent, sk); } else { vsk->trusted = ns_capable_noaudit(&init_user_ns, CAP_NET_ADMIN); vsk->owner = get_current_cred();
From: Hangbin Liu liuhangbin@gmail.com
[ Upstream commit 5aa3c334a449bab24519c4967f5ac2b3304c8dcf ]
The ECN bit defines ECT(1) = 1, ECT(0) = 2. So inner 0x02 + outer 0x01 should be inner ECT(0) + outer ECT(1). Based on the description of __INET_ECN_decapsulate, the final decapsulate value should be ECT(1). So fix the test expect value to 0x01.
Before the fix: TEST: VXLAN: ECN decap: 01/02->0x02 [FAIL] Expected to capture 10 packets, got 0.
After the fix: TEST: VXLAN: ECN decap: 01/02->0x01 [ OK ]
Fixes: a0b61f3d8ebf ("selftests: forwarding: vxlan_bridge_1d: Add an ECN decap test") Signed-off-by: Hangbin Liu liuhangbin@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/net/forwarding/vxlan_bridge_1d.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/forwarding/vxlan_bridge_1d.sh b/tools/testing/selftests/net/forwarding/vxlan_bridge_1d.sh index ce6bea9675c0..0ccb1dda099a 100755 --- a/tools/testing/selftests/net/forwarding/vxlan_bridge_1d.sh +++ b/tools/testing/selftests/net/forwarding/vxlan_bridge_1d.sh @@ -658,7 +658,7 @@ test_ecn_decap() # In accordance with INET_ECN_decapsulate() __test_ecn_decap 00 00 0x00 __test_ecn_decap 01 01 0x01 - __test_ecn_decap 02 01 0x02 + __test_ecn_decap 02 01 0x01 __test_ecn_decap 01 03 0x03 __test_ecn_decap 02 03 0x03 test_ecn_decap_error
From: Jean-Philippe Brucker jean-philippe@linaro.org
[ Upstream commit 901ee1d750f29a335423eeb9463c3ca461ca18c2 ]
The vmlinux.h generated from BTF is invalid when building drivers/phy/ti/phy-gmii-sel.c with clang:
vmlinux.h:61702:27: error: array type has incomplete element type ‘struct reg_field’ 61702 | const struct reg_field (*regfields)[3]; | ^~~~~~~~~
bpftool generates a forward declaration for this struct regfield, which compilers aren't happy about. Here's a simplified reproducer:
struct inner { int val; }; struct outer { struct inner (*ptr_to_array)[2]; } A;
After build with clang -> bpftool btf dump c -> clang/gcc: ./def-clang.h:11:23: error: array has incomplete element type 'struct inner' struct inner (*ptr_to_array)[2];
Member ptr_to_array of struct outer is a pointer to an array of struct inner. In the DWARF generated by clang, struct outer appears before struct inner, so when converting BTF of struct outer into C, bpftool issues a forward declaration to struct inner. With GCC the DWARF info is reversed so struct inner gets fully defined.
That forward declaration is not sufficient when compilers handle an array of the struct, even when it's only used through a pointer. Note that we can trigger the same issue with an intermediate typedef:
struct inner { int val; }; typedef struct inner inner2_t[2]; struct outer { inner2_t *ptr_to_array; } A;
Becomes:
struct inner; typedef struct inner inner2_t[2];
And causes:
./def-clang.h:10:30: error: array has incomplete element type 'struct inner' typedef struct inner inner2_t[2];
To fix this, clear through_ptr whenever we encounter an intermediate array, to make the inner struct part of a strong link and force full declaration.
Fixes: 351131b51c7a ("libbpf: add btf_dump API for BTF-to-C conversion") Signed-off-by: Jean-Philippe Brucker jean-philippe@linaro.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210319112554.794552-2-jean-philippe@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/lib/bpf/btf_dump.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c index 2f9d685bd522..0911aea4cdbe 100644 --- a/tools/lib/bpf/btf_dump.c +++ b/tools/lib/bpf/btf_dump.c @@ -462,7 +462,7 @@ static int btf_dump_order_type(struct btf_dump *d, __u32 id, bool through_ptr) return err;
case BTF_KIND_ARRAY: - return btf_dump_order_type(d, btf_array(t)->type, through_ptr); + return btf_dump_order_type(d, btf_array(t)->type, false);
case BTF_KIND_STRUCT: case BTF_KIND_UNION: {
From: Zqiang qiang.zhang@windriver.com
[ Upstream commit f60a85cad677c4f9bb4cadd764f1d106c38c7cf8 ]
The syzbot reported a memleak as follows:
BUG: memory leak unreferenced object 0xffff888101b41d00 (size 120): comm "kworker/u4:0", pid 8, jiffies 4294944270 (age 12.780s) backtrace: [<ffffffff8125dc56>] alloc_pid+0x66/0x560 [<ffffffff81226405>] copy_process+0x1465/0x25e0 [<ffffffff81227943>] kernel_clone+0xf3/0x670 [<ffffffff812281a1>] kernel_thread+0x61/0x80 [<ffffffff81253464>] call_usermodehelper_exec_work [<ffffffff81253464>] call_usermodehelper_exec_work+0xc4/0x120 [<ffffffff812591c9>] process_one_work+0x2c9/0x600 [<ffffffff81259ab9>] worker_thread+0x59/0x5d0 [<ffffffff812611c8>] kthread+0x178/0x1b0 [<ffffffff8100227f>] ret_from_fork+0x1f/0x30
unreferenced object 0xffff888110ef5c00 (size 232): comm "kworker/u4:0", pid 8414, jiffies 4294944270 (age 12.780s) backtrace: [<ffffffff8154a0cf>] kmem_cache_zalloc [<ffffffff8154a0cf>] __alloc_file+0x1f/0xf0 [<ffffffff8154a809>] alloc_empty_file+0x69/0x120 [<ffffffff8154a8f3>] alloc_file+0x33/0x1b0 [<ffffffff8154ab22>] alloc_file_pseudo+0xb2/0x140 [<ffffffff81559218>] create_pipe_files+0x138/0x2e0 [<ffffffff8126c793>] umd_setup+0x33/0x220 [<ffffffff81253574>] call_usermodehelper_exec_async+0xb4/0x1b0 [<ffffffff8100227f>] ret_from_fork+0x1f/0x30
After the UMD process exits, the pipe_to_umh/pipe_from_umh and tgid need to be released.
Fixes: d71fa5c9763c ("bpf: Add kernel module with user mode driver that populates bpffs.") Reported-by: syzbot+44908bb56d2bfe56b28e@syzkaller.appspotmail.com Signed-off-by: Zqiang qiang.zhang@windriver.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20210317030915.2865-1-qiang.zhang@windriver.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/usermode_driver.h | 1 + kernel/bpf/preload/bpf_preload_kern.c | 19 +++++++++++++++---- kernel/usermode_driver.c | 21 +++++++++++++++------ 3 files changed, 31 insertions(+), 10 deletions(-)
diff --git a/include/linux/usermode_driver.h b/include/linux/usermode_driver.h index 073a9e0ec07d..ad970416260d 100644 --- a/include/linux/usermode_driver.h +++ b/include/linux/usermode_driver.h @@ -14,5 +14,6 @@ struct umd_info { int umd_load_blob(struct umd_info *info, const void *data, size_t len); int umd_unload_blob(struct umd_info *info); int fork_usermode_driver(struct umd_info *info); +void umd_cleanup_helper(struct umd_info *info);
#endif /* __LINUX_USERMODE_DRIVER_H__ */ diff --git a/kernel/bpf/preload/bpf_preload_kern.c b/kernel/bpf/preload/bpf_preload_kern.c index 79c5772465f1..53736e52c1df 100644 --- a/kernel/bpf/preload/bpf_preload_kern.c +++ b/kernel/bpf/preload/bpf_preload_kern.c @@ -60,9 +60,12 @@ static int finish(void) &magic, sizeof(magic), &pos); if (n != sizeof(magic)) return -EPIPE; + tgid = umd_ops.info.tgid; - wait_event(tgid->wait_pidfd, thread_group_exited(tgid)); - umd_ops.info.tgid = NULL; + if (tgid) { + wait_event(tgid->wait_pidfd, thread_group_exited(tgid)); + umd_cleanup_helper(&umd_ops.info); + } return 0; }
@@ -80,10 +83,18 @@ static int __init load_umd(void)
static void __exit fini_umd(void) { + struct pid *tgid; + bpf_preload_ops = NULL; + /* kill UMD in case it's still there due to earlier error */ - kill_pid(umd_ops.info.tgid, SIGKILL, 1); - umd_ops.info.tgid = NULL; + tgid = umd_ops.info.tgid; + if (tgid) { + kill_pid(tgid, SIGKILL, 1); + + wait_event(tgid->wait_pidfd, thread_group_exited(tgid)); + umd_cleanup_helper(&umd_ops.info); + } umd_unload_blob(&umd_ops.info); } late_initcall(load_umd); diff --git a/kernel/usermode_driver.c b/kernel/usermode_driver.c index 0b35212ffc3d..bb7bb3b478ab 100644 --- a/kernel/usermode_driver.c +++ b/kernel/usermode_driver.c @@ -139,13 +139,22 @@ static void umd_cleanup(struct subprocess_info *info) struct umd_info *umd_info = info->data;
/* cleanup if umh_setup() was successful but exec failed */ - if (info->retval) { - fput(umd_info->pipe_to_umh); - fput(umd_info->pipe_from_umh); - put_pid(umd_info->tgid); - umd_info->tgid = NULL; - } + if (info->retval) + umd_cleanup_helper(umd_info); +} + +/** + * umd_cleanup_helper - release the resources which were allocated in umd_setup + * @info: information about usermode driver + */ +void umd_cleanup_helper(struct umd_info *info) +{ + fput(info->pipe_to_umh); + fput(info->pipe_from_umh); + put_pid(info->tgid); + info->tgid = NULL; } +EXPORT_SYMBOL_GPL(umd_cleanup_helper);
/** * fork_usermode_driver - fork a usermode driver
From: Oliver Hartkopp socketcan@hartkopp.net
[ Upstream commit b5f020f82a8e41201c6ede20fa00389d6980b223 ]
Commit d4eb538e1f48 ("can: isotp: TX-path: ensure that CAN frame flags are initialized") ensured the TX flags to be properly set for outgoing CAN frames.
In fact the root cause of the issue results from a missing initialization of outgoing CAN frames created by isotp. This is no problem on the CAN bus as the CAN driver only picks the correctly defined content from the struct can(fd)_frame. But when the outgoing frames are monitored (e.g. with candump) we potentially leak some bytes in the unused content of struct can(fd)_frame.
Fixes: e057dd3fc20f ("can: add ISO 15765-2:2016 transport protocol") Cc: Marc Kleine-Budde mkl@pengutronix.de Link: https://lore.kernel.org/r/20210319100619.10858-1-socketcan@hartkopp.net Signed-off-by: Oliver Hartkopp socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/can/isotp.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/can/isotp.c b/net/can/isotp.c index 430976485d95..15ea1234d457 100644 --- a/net/can/isotp.c +++ b/net/can/isotp.c @@ -196,7 +196,7 @@ static int isotp_send_fc(struct sock *sk, int ae, u8 flowstatus) nskb->dev = dev; can_skb_set_owner(nskb, sk); ncf = (struct canfd_frame *)nskb->data; - skb_put(nskb, so->ll.mtu); + skb_put_zero(nskb, so->ll.mtu);
/* create & send flow control reply */ ncf->can_id = so->txid; @@ -779,7 +779,7 @@ isotp_tx_burst: can_skb_prv(skb)->skbcnt = 0;
cf = (struct canfd_frame *)skb->data; - skb_put(skb, so->ll.mtu); + skb_put_zero(skb, so->ll.mtu);
/* create consecutive frame */ isotp_fill_dataframe(cf, so, ae, 0); @@ -895,7 +895,7 @@ static int isotp_sendmsg(struct socket *sock, struct msghdr *msg, size_t size) so->tx.idx = 0;
cf = (struct canfd_frame *)skb->data; - skb_put(skb, so->ll.mtu); + skb_put_zero(skb, so->ll.mtu);
/* check for single frame transmission depending on TX_DL */ if (size <= so->tx.ll_dl - SF_PCI_SZ4 - ae - off) {
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit d939cd96b9df6dcde1605fab23bbd6307e11f930 ]
On some system the WMI GUIDs used by dell-wmi-sysman are present but there are no enum type attributes, this causes init_bios_attributes() to return -ENODEV, after which sysman_init() does a "goto fail_create_group" and then calls release_attributes_data().
release_attributes_data() calls kset_unregister(wmi_priv.main_dir_kset); but before this commit it was missing a "wmi_priv.main_dir_kset = NULL;" statement; and after calling release_attributes_data() the sysman_init() error handling does this:
if (wmi_priv.main_dir_kset) { kset_unregister(wmi_priv.main_dir_kset); wmi_priv.main_dir_kset = NULL; }
Which causes a second kset_unregister(wmi_priv.main_dir_kset), leading to a double-free, which causes a crash.
Add the missing "wmi_priv.main_dir_kset = NULL;" statement to release_attributes_data() to fix this double-free crash.
Fixes: e8a60aa7404b ("platform/x86: Introduce support for Systems Management Driver over WMI for Dell Systems") Cc: Divya Bharathi Divya_Bharathi@dell.com Cc: Mario Limonciello mario.limonciello@dell.com Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20210321115901.35072-2-hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/dell-wmi-sysman/sysman.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/platform/x86/dell-wmi-sysman/sysman.c b/drivers/platform/x86/dell-wmi-sysman/sysman.c index cb81010ba1a2..c1997db74cca 100644 --- a/drivers/platform/x86/dell-wmi-sysman/sysman.c +++ b/drivers/platform/x86/dell-wmi-sysman/sysman.c @@ -388,6 +388,7 @@ static void release_attributes_data(void) if (wmi_priv.main_dir_kset) { destroy_attribute_objs(wmi_priv.main_dir_kset); kset_unregister(wmi_priv.main_dir_kset); + wmi_priv.main_dir_kset = NULL; } mutex_unlock(&wmi_priv.mutex);
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit c59ab4cedab70a1a117a2dba3c48bb78e66c55ca ]
It is possible for release_attributes_data() to get called when the main_dir_kset has not been created yet, move the removal of the bios-reset sysfs attr to under a if (main_dir_kset) check to avoid a NULL pointer deref.
Fixes: e8a60aa7404b ("platform/x86: Introduce support for Systems Management Driver over WMI for Dell Systems") Cc: Divya Bharathi Divya_Bharathi@dell.com Cc: Mario Limonciello mario.limonciello@dell.com Reported-by: Alexander Naumann alexandernaumann@gmx.de Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20210321115901.35072-3-hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/dell-wmi-sysman/sysman.c | 11 ++--------- 1 file changed, 2 insertions(+), 9 deletions(-)
diff --git a/drivers/platform/x86/dell-wmi-sysman/sysman.c b/drivers/platform/x86/dell-wmi-sysman/sysman.c index c1997db74cca..8b251b2c37a2 100644 --- a/drivers/platform/x86/dell-wmi-sysman/sysman.c +++ b/drivers/platform/x86/dell-wmi-sysman/sysman.c @@ -225,12 +225,6 @@ static int create_attributes_level_sysfs_files(void) return ret; }
-static void release_reset_bios_data(void) -{ - sysfs_remove_file(&wmi_priv.main_dir_kset->kobj, &reset_bios.attr); - sysfs_remove_file(&wmi_priv.main_dir_kset->kobj, &pending_reboot.attr); -} - static ssize_t wmi_sysman_attr_show(struct kobject *kobj, struct attribute *attr, char *buf) { @@ -373,8 +367,6 @@ static void destroy_attribute_objs(struct kset *kset) */ static void release_attributes_data(void) { - release_reset_bios_data(); - mutex_lock(&wmi_priv.mutex); exit_enum_attributes(); exit_int_attributes(); @@ -386,12 +378,13 @@ static void release_attributes_data(void) wmi_priv.authentication_dir_kset = NULL; } if (wmi_priv.main_dir_kset) { + sysfs_remove_file(&wmi_priv.main_dir_kset->kobj, &reset_bios.attr); + sysfs_remove_file(&wmi_priv.main_dir_kset->kobj, &pending_reboot.attr); destroy_attribute_objs(wmi_priv.main_dir_kset); kset_unregister(wmi_priv.main_dir_kset); wmi_priv.main_dir_kset = NULL; } mutex_unlock(&wmi_priv.mutex); - }
/**
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 2d0c418c91d8c86a1b9fb254dda842ada9919513 ]
During some of the error-exit paths it is possible that release_attributes_data() will get called multiple times, which results in exit_foo_attributes() getting called multiple times.
Make it safe to call exit_foo_attributes() multiple times, avoiding double-free()s in this case.
Note that release_attributes_data() really should only be called once during error-exit paths. This will be fixed in a separate patch and it is good to have the exit_foo_attributes() functions modified this way regardless.
Fixes: e8a60aa7404b ("platform/x86: Introduce support for Systems Management Driver over WMI for Dell Systems") Cc: Divya Bharathi Divya_Bharathi@dell.com Cc: Mario Limonciello mario.limonciello@dell.com Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20210321115901.35072-4-hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/dell-wmi-sysman/enum-attributes.c | 3 +++ drivers/platform/x86/dell-wmi-sysman/int-attributes.c | 3 +++ drivers/platform/x86/dell-wmi-sysman/passobj-attributes.c | 3 +++ drivers/platform/x86/dell-wmi-sysman/string-attributes.c | 3 +++ 4 files changed, 12 insertions(+)
diff --git a/drivers/platform/x86/dell-wmi-sysman/enum-attributes.c b/drivers/platform/x86/dell-wmi-sysman/enum-attributes.c index 80f4b7785c6c..091e48c217ed 100644 --- a/drivers/platform/x86/dell-wmi-sysman/enum-attributes.c +++ b/drivers/platform/x86/dell-wmi-sysman/enum-attributes.c @@ -185,5 +185,8 @@ void exit_enum_attributes(void) sysfs_remove_group(wmi_priv.enumeration_data[instance_id].attr_name_kobj, &enumeration_attr_group); } + wmi_priv.enumeration_instances_count = 0; + kfree(wmi_priv.enumeration_data); + wmi_priv.enumeration_data = NULL; } diff --git a/drivers/platform/x86/dell-wmi-sysman/int-attributes.c b/drivers/platform/x86/dell-wmi-sysman/int-attributes.c index 75aedbb733be..8a49ba6e44f9 100644 --- a/drivers/platform/x86/dell-wmi-sysman/int-attributes.c +++ b/drivers/platform/x86/dell-wmi-sysman/int-attributes.c @@ -175,5 +175,8 @@ void exit_int_attributes(void) sysfs_remove_group(wmi_priv.integer_data[instance_id].attr_name_kobj, &integer_attr_group); } + wmi_priv.integer_instances_count = 0; + kfree(wmi_priv.integer_data); + wmi_priv.integer_data = NULL; } diff --git a/drivers/platform/x86/dell-wmi-sysman/passobj-attributes.c b/drivers/platform/x86/dell-wmi-sysman/passobj-attributes.c index 3abcd95477c0..834b3e82ad9f 100644 --- a/drivers/platform/x86/dell-wmi-sysman/passobj-attributes.c +++ b/drivers/platform/x86/dell-wmi-sysman/passobj-attributes.c @@ -183,5 +183,8 @@ void exit_po_attributes(void) sysfs_remove_group(wmi_priv.po_data[instance_id].attr_name_kobj, &po_attr_group); } + wmi_priv.po_instances_count = 0; + kfree(wmi_priv.po_data); + wmi_priv.po_data = NULL; } diff --git a/drivers/platform/x86/dell-wmi-sysman/string-attributes.c b/drivers/platform/x86/dell-wmi-sysman/string-attributes.c index ac75dce88a4c..552537852459 100644 --- a/drivers/platform/x86/dell-wmi-sysman/string-attributes.c +++ b/drivers/platform/x86/dell-wmi-sysman/string-attributes.c @@ -155,5 +155,8 @@ void exit_str_attributes(void) sysfs_remove_group(wmi_priv.str_data[instance_id].attr_name_kobj, &str_attr_group); } + wmi_priv.str_instances_count = 0; + kfree(wmi_priv.str_data); + wmi_priv.str_data = NULL; }
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 59bbbeb9c22cc7c55965cd5ea8c16af7f16e61eb ]
All calls of init_bios_attributes() will result in a goto fail_create_group if they fail, which calls release_attributes_data().
So there is no need to call release_attributes_data() from init_bios_attributes() on failure itself.
Fixes: e8a60aa7404b ("platform/x86: Introduce support for Systems Management Driver over WMI for Dell Systems") Cc: Divya Bharathi Divya_Bharathi@dell.com Cc: Mario Limonciello mario.limonciello@dell.com Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20210321115901.35072-5-hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/dell-wmi-sysman/sysman.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/platform/x86/dell-wmi-sysman/sysman.c b/drivers/platform/x86/dell-wmi-sysman/sysman.c index 8b251b2c37a2..58dc4571f987 100644 --- a/drivers/platform/x86/dell-wmi-sysman/sysman.c +++ b/drivers/platform/x86/dell-wmi-sysman/sysman.c @@ -491,7 +491,6 @@ nextobj:
err_attr_init: mutex_unlock(&wmi_priv.mutex); - release_attributes_data(); kfree(obj); return retval; }
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 9c90cd869747e3492a9306dcd8123c17502ff1fc ]
Cleanup sysman_init() error-exit handling:
1. There is no need for the fail_reset_bios and fail_authentication_kset eror-exit cases, these can be handled by release_attributes_data()
2. Rename all the labels from fail_what_failed, to err_what_to_cleanup this is the usual way to name these and avoids the need to rename them when extra steps are added.
Fixes: e8a60aa7404b ("platform/x86: Introduce support for Systems Management Driver over WMI for Dell Systems") Cc: Divya Bharathi Divya_Bharathi@dell.com Cc: Mario Limonciello mario.limonciello@dell.com Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20210321115901.35072-6-hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/dell-wmi-sysman/sysman.c | 45 +++++++------------ 1 file changed, 16 insertions(+), 29 deletions(-)
diff --git a/drivers/platform/x86/dell-wmi-sysman/sysman.c b/drivers/platform/x86/dell-wmi-sysman/sysman.c index 58dc4571f987..99dc2f3bdf49 100644 --- a/drivers/platform/x86/dell-wmi-sysman/sysman.c +++ b/drivers/platform/x86/dell-wmi-sysman/sysman.c @@ -508,100 +508,87 @@ static int __init sysman_init(void) ret = init_bios_attr_set_interface(); if (ret || !wmi_priv.bios_attr_wdev) { pr_debug("failed to initialize set interface\n"); - goto fail_set_interface; + return ret; }
ret = init_bios_attr_pass_interface(); if (ret || !wmi_priv.password_attr_wdev) { pr_debug("failed to initialize pass interface\n"); - goto fail_pass_interface; + goto err_exit_bios_attr_set_interface; }
ret = class_register(&firmware_attributes_class); if (ret) - goto fail_class; + goto err_exit_bios_attr_pass_interface;
wmi_priv.class_dev = device_create(&firmware_attributes_class, NULL, MKDEV(0, 0), NULL, "%s", DRIVER_NAME); if (IS_ERR(wmi_priv.class_dev)) { ret = PTR_ERR(wmi_priv.class_dev); - goto fail_classdev; + goto err_unregister_class; }
wmi_priv.main_dir_kset = kset_create_and_add("attributes", NULL, &wmi_priv.class_dev->kobj); if (!wmi_priv.main_dir_kset) { ret = -ENOMEM; - goto fail_main_kset; + goto err_destroy_classdev; }
wmi_priv.authentication_dir_kset = kset_create_and_add("authentication", NULL, &wmi_priv.class_dev->kobj); if (!wmi_priv.authentication_dir_kset) { ret = -ENOMEM; - goto fail_authentication_kset; + goto err_release_attributes_data; }
ret = create_attributes_level_sysfs_files(); if (ret) { pr_debug("could not create reset BIOS attribute\n"); - goto fail_reset_bios; + goto err_release_attributes_data; }
ret = init_bios_attributes(ENUM, DELL_WMI_BIOS_ENUMERATION_ATTRIBUTE_GUID); if (ret) { pr_debug("failed to populate enumeration type attributes\n"); - goto fail_create_group; + goto err_release_attributes_data; }
ret = init_bios_attributes(INT, DELL_WMI_BIOS_INTEGER_ATTRIBUTE_GUID); if (ret) { pr_debug("failed to populate integer type attributes\n"); - goto fail_create_group; + goto err_release_attributes_data; }
ret = init_bios_attributes(STR, DELL_WMI_BIOS_STRING_ATTRIBUTE_GUID); if (ret) { pr_debug("failed to populate string type attributes\n"); - goto fail_create_group; + goto err_release_attributes_data; }
ret = init_bios_attributes(PO, DELL_WMI_BIOS_PASSOBJ_ATTRIBUTE_GUID); if (ret) { pr_debug("failed to populate pass object type attributes\n"); - goto fail_create_group; + goto err_release_attributes_data; }
return 0;
-fail_create_group: +err_release_attributes_data: release_attributes_data();
-fail_reset_bios: - if (wmi_priv.authentication_dir_kset) { - kset_unregister(wmi_priv.authentication_dir_kset); - wmi_priv.authentication_dir_kset = NULL; - } - -fail_authentication_kset: - if (wmi_priv.main_dir_kset) { - kset_unregister(wmi_priv.main_dir_kset); - wmi_priv.main_dir_kset = NULL; - } - -fail_main_kset: +err_destroy_classdev: device_destroy(&firmware_attributes_class, MKDEV(0, 0));
-fail_classdev: +err_unregister_class: class_unregister(&firmware_attributes_class);
-fail_class: +err_exit_bios_attr_pass_interface: exit_bios_attr_pass_interface();
-fail_pass_interface: +err_exit_bios_attr_set_interface: exit_bios_attr_set_interface();
-fail_set_interface: return ret; }
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 32418dd58c957f8fef25b97450d00275967604f1 ]
When either the attributes or the password interface is not found, then unregister the 2 wmi drivers again and return -ENODEV from sysman_init().
Fixes: e8a60aa7404b ("platform/x86: Introduce support for Systems Management Driver over WMI for Dell Systems") Cc: Divya Bharathi Divya_Bharathi@dell.com Cc: Mario Limonciello mario.limonciello@dell.com Reported-by: Alexander Naumann alexandernaumann@gmx.de Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20210321115901.35072-7-hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/dell-wmi-sysman/sysman.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/drivers/platform/x86/dell-wmi-sysman/sysman.c b/drivers/platform/x86/dell-wmi-sysman/sysman.c index 99dc2f3bdf49..5dd9b29d939c 100644 --- a/drivers/platform/x86/dell-wmi-sysman/sysman.c +++ b/drivers/platform/x86/dell-wmi-sysman/sysman.c @@ -506,15 +506,17 @@ static int __init sysman_init(void) }
ret = init_bios_attr_set_interface(); - if (ret || !wmi_priv.bios_attr_wdev) { - pr_debug("failed to initialize set interface\n"); + if (ret) return ret; - }
ret = init_bios_attr_pass_interface(); - if (ret || !wmi_priv.password_attr_wdev) { - pr_debug("failed to initialize pass interface\n"); + if (ret) goto err_exit_bios_attr_set_interface; + + if (!wmi_priv.bios_attr_wdev || !wmi_priv.password_attr_wdev) { + pr_debug("failed to find set or pass interface\n"); + ret = -ENODEV; + goto err_exit_bios_attr_pass_interface; }
ret = class_register(&firmware_attributes_class);
From: Dmitry Baryshkov dmitry.baryshkov@linaro.org
[ Upstream commit 623f279c77811475ac8fd5635cc4e4451aa71291 ]
If GPU components have failed to bind, shutdown callback would fail with the following backtrace. Add safeguard check to stop that oops from happening and allow the board to reboot.
[ 66.617046] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 66.626066] Mem abort info: [ 66.628939] ESR = 0x96000006 [ 66.632088] EC = 0x25: DABT (current EL), IL = 32 bits [ 66.637542] SET = 0, FnV = 0 [ 66.640688] EA = 0, S1PTW = 0 [ 66.643924] Data abort info: [ 66.646889] ISV = 0, ISS = 0x00000006 [ 66.650832] CM = 0, WnR = 0 [ 66.653890] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000107f81000 [ 66.660505] [0000000000000000] pgd=0000000100bb2003, p4d=0000000100bb2003, pud=0000000100897003, pmd=0000000000000000 [ 66.671398] Internal error: Oops: 96000006 [#1] PREEMPT SMP [ 66.677115] Modules linked in: [ 66.680261] CPU: 6 PID: 352 Comm: reboot Not tainted 5.11.0-rc2-00309-g79e3faa756b2 #38 [ 66.688473] Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT) [ 66.695347] pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--) [ 66.701507] pc : msm_atomic_commit_tail+0x78/0x4e0 [ 66.706437] lr : commit_tail+0xa4/0x184 [ 66.710381] sp : ffff8000108f3af0 [ 66.713791] x29: ffff8000108f3af0 x28: ffff418c44337000 [ 66.719242] x27: 0000000000000000 x26: ffff418c40a24490 [ 66.724693] x25: ffffd3a842a4f1a0 x24: 0000000000000008 [ 66.730146] x23: ffffd3a84313f030 x22: ffff418c444ce000 [ 66.735598] x21: ffff418c408a4980 x20: 0000000000000000 [ 66.741049] x19: 0000000000000000 x18: ffff800010710fbc [ 66.746500] x17: 000000000000000c x16: 0000000000000001 [ 66.751954] x15: 0000000000010008 x14: 0000000000000068 [ 66.757405] x13: 0000000000000001 x12: 0000000000000000 [ 66.762855] x11: 0000000000000001 x10: 00000000000009b0 [ 66.768306] x9 : ffffd3a843192000 x8 : ffff418c44337000 [ 66.773757] x7 : 0000000000000000 x6 : 00000000a401b34e [ 66.779210] x5 : 00ffffffffffffff x4 : 0000000000000000 [ 66.784660] x3 : 0000000000000000 x2 : ffff418c444ce000 [ 66.790111] x1 : ffffd3a841dce530 x0 : ffff418c444cf000 [ 66.795563] Call trace: [ 66.798075] msm_atomic_commit_tail+0x78/0x4e0 [ 66.802633] commit_tail+0xa4/0x184 [ 66.806217] drm_atomic_helper_commit+0x160/0x390 [ 66.811051] drm_atomic_commit+0x4c/0x60 [ 66.815082] drm_atomic_helper_disable_all+0x1f4/0x210 [ 66.820355] drm_atomic_helper_shutdown+0x80/0x130 [ 66.825276] msm_pdev_shutdown+0x14/0x20 [ 66.829303] platform_shutdown+0x28/0x40 [ 66.833330] device_shutdown+0x158/0x330 [ 66.837357] kernel_restart+0x40/0xa0 [ 66.841122] __do_sys_reboot+0x228/0x250 [ 66.845148] __arm64_sys_reboot+0x28/0x34 [ 66.849264] el0_svc_common.constprop.0+0x74/0x190 [ 66.854187] do_el0_svc+0x24/0x90 [ 66.857595] el0_svc+0x14/0x20 [ 66.860739] el0_sync_handler+0x1a4/0x1b0 [ 66.864858] el0_sync+0x174/0x180 [ 66.868269] Code: 1ac020a0 2a000273 eb02007f 54ffff01 (f9400285) [ 66.874525] ---[ end trace 20dedb2a3229fec8 ]---
Fixes: 9d5cbf5fe46e ("drm/msm: add shutdown support for display platform_driver") Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Signed-off-by: Fabio Estevam festevam@gmail.com Signed-off-by: Rob Clark robdclark@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/msm_drv.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c index 94525ac76d4e..fd2ac54caf9f 100644 --- a/drivers/gpu/drm/msm/msm_drv.c +++ b/drivers/gpu/drm/msm/msm_drv.c @@ -1311,6 +1311,10 @@ static int msm_pdev_remove(struct platform_device *pdev) static void msm_pdev_shutdown(struct platform_device *pdev) { struct drm_device *drm = platform_get_drvdata(pdev); + struct msm_drm_private *priv = drm ? drm->dev_private : NULL; + + if (!priv || !priv->kms) + return;
drm_atomic_helper_shutdown(drm); }
From: Fabio Estevam festevam@gmail.com
[ Upstream commit a9748134ea4aad989e52a6a91479e0acfd306e5b ]
When putting iMX5 into suspend, the following flow is observed:
[ 70.023427] [<c07755f0>] (msm_atomic_commit_tail) from [<c06e7218>] (commit_tail+0x9c/0x18c) [ 70.031890] [<c06e7218>] (commit_tail) from [<c0e2920c>] (drm_atomic_helper_commit+0x1a0/0x1d4) [ 70.040627] [<c0e2920c>] (drm_atomic_helper_commit) from [<c06e74d4>] (drm_atomic_helper_disable_all+0x1c4/0x1d4) [ 70.050913] [<c06e74d4>] (drm_atomic_helper_disable_all) from [<c0e2943c>] (drm_atomic_helper_suspend+0xb8/0x170) [ 70.061198] [<c0e2943c>] (drm_atomic_helper_suspend) from [<c06e84bc>] (drm_mode_config_helper_suspend+0x24/0x58)
In the i.MX5 case, priv->kms is not populated (as i.MX5 does not use any of the Qualcomm display controllers), causing a NULL pointer dereference in msm_atomic_commit_tail():
[ 24.268964] 8<--- cut here --- [ 24.274602] Unable to handle kernel NULL pointer dereference at virtual address 00000000 [ 24.283434] pgd = (ptrval) [ 24.286387] [00000000] *pgd=ca212831 [ 24.290788] Internal error: Oops: 17 [#1] SMP ARM [ 24.295609] Modules linked in: [ 24.298777] CPU: 0 PID: 197 Comm: init Not tainted 5.11.0-rc2-next-20210111 #333 [ 24.306276] Hardware name: Freescale i.MX53 (Device Tree Support) [ 24.312442] PC is at msm_atomic_commit_tail+0x54/0xb9c [ 24.317743] LR is at commit_tail+0xa4/0x1b0
Fix the problem by calling drm_mode_config_helper_suspend/resume() only when priv->kms is available.
Fixes: ca8199f13498 ("drm/msm/dpu: ensure device suspend happens during PM sleep") Signed-off-by: Fabio Estevam festevam@gmail.com Signed-off-by: Rob Clark robdclark@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/msm_drv.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c index fd2ac54caf9f..a5c6b8c23336 100644 --- a/drivers/gpu/drm/msm/msm_drv.c +++ b/drivers/gpu/drm/msm/msm_drv.c @@ -1072,6 +1072,10 @@ static int __maybe_unused msm_pm_resume(struct device *dev) static int __maybe_unused msm_pm_prepare(struct device *dev) { struct drm_device *ddev = dev_get_drvdata(dev); + struct msm_drm_private *priv = ddev ? ddev->dev_private : NULL; + + if (!priv || !priv->kms) + return 0;
return drm_mode_config_helper_suspend(ddev); } @@ -1079,6 +1083,10 @@ static int __maybe_unused msm_pm_prepare(struct device *dev) static void __maybe_unused msm_pm_complete(struct device *dev) { struct drm_device *ddev = dev_get_drvdata(dev); + struct msm_drm_private *priv = ddev ? ddev->dev_private : NULL; + + if (!priv || !priv->kms) + return;
drm_mode_config_helper_resume(ddev); }
From: Pavel Tatashin pasha.tatashin@soleen.com
[ Upstream commit 141f8202cfa4192c3af79b6cbd68e7760bb01b5a ]
The ppos points to a position in the old kernel memory (and in case of arm64 in the crash kernel since elfcorehdr is passed as a segment). The function should update the ppos by the amount that was read. This bug is not exposed by accident, but other platforms update this value properly. So, fix it in ARM64 version of elfcorehdr_read() as well.
Signed-off-by: Pavel Tatashin pasha.tatashin@soleen.com Fixes: e62aaeac426a ("arm64: kdump: provide /proc/vmcore file") Reviewed-by: Tyler Hicks tyhicks@linux.microsoft.com Link: https://lore.kernel.org/r/20210319205054.743368-1-pasha.tatashin@soleen.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/kernel/crash_dump.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/arch/arm64/kernel/crash_dump.c b/arch/arm64/kernel/crash_dump.c index e6e284265f19..58303a9ec32c 100644 --- a/arch/arm64/kernel/crash_dump.c +++ b/arch/arm64/kernel/crash_dump.c @@ -64,5 +64,7 @@ ssize_t copy_oldmem_page(unsigned long pfn, char *buf, ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos) { memcpy(buf, phys_to_virt((phys_addr_t)*ppos), count); + *ppos += count; + return count; }
From: Rafael J. Wysocki rafael.j.wysocki@intel.com
[ Upstream commit 5244f5e2d801259af877ee759e8c22364c607072 ]
Because the PM-runtime status of the device is not updated in __rpm_callback(), attempts to suspend the suppliers of the given device triggered by the rpm_put_suppliers() call in there may cause a supplier to be suspended completely before the status of the consumer is updated to RPM_SUSPENDED, which is confusing.
To avoid that (1) modify __rpm_callback() to only decrease the PM-runtime usage counter of each supplier and (2) make rpm_suspend() try to suspend the suppliers after changing the consumer's status to RPM_SUSPENDED, in analogy with the device's parent.
Link: https://lore.kernel.org/linux-pm/CAPDyKFqm06KDw_p8WXsM4dijDbho4bb6T4k50UqqvR... Fixes: 21d5c57b3726 ("PM / runtime: Use device links") Reported-by: elaine.zhang zhangqing@rock-chips.com Diagnosed-by: Ulf Hansson ulf.hansson@linaro.org Reviewed-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/base/power/runtime.c | 45 +++++++++++++++++++++++++++++++----- 1 file changed, 39 insertions(+), 6 deletions(-)
diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c index bfda153b1a41..5ef67bacb585 100644 --- a/drivers/base/power/runtime.c +++ b/drivers/base/power/runtime.c @@ -305,7 +305,7 @@ static int rpm_get_suppliers(struct device *dev) return 0; }
-static void rpm_put_suppliers(struct device *dev) +static void __rpm_put_suppliers(struct device *dev, bool try_to_suspend) { struct device_link *link;
@@ -313,10 +313,30 @@ static void rpm_put_suppliers(struct device *dev) device_links_read_lock_held()) {
while (refcount_dec_not_one(&link->rpm_active)) - pm_runtime_put(link->supplier); + pm_runtime_put_noidle(link->supplier); + + if (try_to_suspend) + pm_request_idle(link->supplier); } }
+static void rpm_put_suppliers(struct device *dev) +{ + __rpm_put_suppliers(dev, true); +} + +static void rpm_suspend_suppliers(struct device *dev) +{ + struct device_link *link; + int idx = device_links_read_lock(); + + list_for_each_entry_rcu(link, &dev->links.suppliers, c_node, + device_links_read_lock_held()) + pm_request_idle(link->supplier); + + device_links_read_unlock(idx); +} + /** * __rpm_callback - Run a given runtime PM callback for a given device. * @cb: Runtime PM callback to run. @@ -344,8 +364,10 @@ static int __rpm_callback(int (*cb)(struct device *), struct device *dev) idx = device_links_read_lock();
retval = rpm_get_suppliers(dev); - if (retval) + if (retval) { + rpm_put_suppliers(dev); goto fail; + }
device_links_read_unlock(idx); } @@ -368,9 +390,9 @@ static int __rpm_callback(int (*cb)(struct device *), struct device *dev) || (dev->power.runtime_status == RPM_RESUMING && retval))) { idx = device_links_read_lock();
- fail: - rpm_put_suppliers(dev); + __rpm_put_suppliers(dev, false);
+fail: device_links_read_unlock(idx); }
@@ -642,8 +664,11 @@ static int rpm_suspend(struct device *dev, int rpmflags) goto out; }
+ if (dev->power.irq_safe) + goto out; + /* Maybe the parent is now able to suspend. */ - if (parent && !parent->power.ignore_children && !dev->power.irq_safe) { + if (parent && !parent->power.ignore_children) { spin_unlock(&dev->power.lock);
spin_lock(&parent->power.lock); @@ -652,6 +677,14 @@ static int rpm_suspend(struct device *dev, int rpmflags)
spin_lock(&dev->power.lock); } + /* Maybe the suppliers are now able to suspend. */ + if (dev->power.links_count > 0) { + spin_unlock_irq(&dev->power.lock); + + rpm_suspend_suppliers(dev); + + spin_lock_irq(&dev->power.lock); + }
out: trace_rpm_return_int_rcuidle(dev, _THIS_IP_, retval);
From: Huy Nguyen huyn@nvidia.com
[ Upstream commit a07231084da2207629b42244380ae2f1e10bd9b4 ]
The multicast counter got removed from uplink representor due to the cited patch.
Fixes: 47c97e6b10a1 ("net/mlx5e: Fix multicast counter not up-to-date in "ip -s"") Signed-off-by: Huy Nguyen huyn@nvidia.com Reviewed-by: Daniel Jurgens danielj@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 1386212ad3f0..aaa5a56b44c7 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -3703,10 +3703,17 @@ mlx5e_get_stats(struct net_device *dev, struct rtnl_link_stats64 *stats) }
if (mlx5e_is_uplink_rep(priv)) { + struct mlx5e_vport_stats *vstats = &priv->stats.vport; + stats->rx_packets = PPORT_802_3_GET(pstats, a_frames_received_ok); stats->rx_bytes = PPORT_802_3_GET(pstats, a_octets_received_ok); stats->tx_packets = PPORT_802_3_GET(pstats, a_frames_transmitted_ok); stats->tx_bytes = PPORT_802_3_GET(pstats, a_octets_transmitted_ok); + + /* vport multicast also counts packets that are dropped due to steering + * or rx out of buffer + */ + stats->multicast = VPORT_COUNTER_GET(vstats, received_eth_multicast.packets); } else { mlx5e_fold_sw_stats64(priv, stats); }
From: Alaa Hleihel alaa@nvidia.com
[ Upstream commit 7d6c86e3ccb5ceea767df5c7a9a17cdfccd3df9a ]
Currently, we support hardware offload only for MPLS over UDP. However, rules matching on MPLS parameters are now wrongly offloaded for regular MPLS, without actually taking the parameters into consideration when doing the offload. Fix it by rejecting such unsupported rules.
Fixes: 72046a91d134 ("net/mlx5e: Allow to match on mpls parameters") Signed-off-by: Alaa Hleihel alaa@nvidia.com Reviewed-by: Roi Dayan roid@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c index e9b7da05f14a..95cbefed1b32 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -2595,6 +2595,16 @@ static int __parse_cls_flower(struct mlx5e_priv *priv, *match_level = MLX5_MATCH_L4; }
+ /* Currenlty supported only for MPLS over UDP */ + if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_MPLS) && + !netif_is_bareudp(filter_dev)) { + NL_SET_ERR_MSG_MOD(extack, + "Matching on MPLS is supported only for MPLS over UDP"); + netdev_err(priv->netdev, + "Matching on MPLS is supported only for MPLS over UDP\n"); + return -EOPNOTSUPP; + } + return 0; }
From: Dima Chumak dchumak@nvidia.com
[ Upstream commit 96b5b4585843e3c83fb1930e5dfbefd0fb889c55 ]
Setting connection tracking OVS flows and then setting non-CT flows that use tuple rewrite action (e.g. mod_tp_dst), causes the latter flows not being offloaded.
Fix by using a stricter condition in modify_header_match_supported() to check tuple rewrite support only for flows with CT action. The check is factored out into standalone modify_tuple_supported() function to aid readability.
Fixes: 7e36feeb0467 ("net/mlx5e: CT: Don't offload tuple rewrites for established tuples") Signed-off-by: Dima Chumak dchumak@nvidia.com Reviewed-by: Paul Blakey paulb@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/mellanox/mlx5/core/en/tc_ct.c | 3 +- .../net/ethernet/mellanox/mlx5/core/en_tc.c | 44 ++++++++++++++----- 2 files changed, 35 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c index 24e2c0d955b9..b42396df3111 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c @@ -1182,7 +1182,8 @@ int mlx5_tc_ct_add_no_trk_match(struct mlx5_flow_spec *spec)
mlx5e_tc_match_to_reg_get_match(spec, CTSTATE_TO_REG, &ctstate, &ctstate_mask); - if (ctstate_mask) + + if ((ctstate & ctstate_mask) == MLX5_CT_STATE_TRK_BIT) return -EOPNOTSUPP;
ctstate_mask |= MLX5_CT_STATE_TRK_BIT; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c index 95cbefed1b32..24fa399b1577 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -3208,6 +3208,37 @@ static int is_action_keys_supported(const struct flow_action_entry *act, return 0; }
+static bool modify_tuple_supported(bool modify_tuple, bool ct_clear, + bool ct_flow, struct netlink_ext_ack *extack, + struct mlx5e_priv *priv, + struct mlx5_flow_spec *spec) +{ + if (!modify_tuple || ct_clear) + return true; + + if (ct_flow) { + NL_SET_ERR_MSG_MOD(extack, + "can't offload tuple modification with non-clear ct()"); + netdev_info(priv->netdev, + "can't offload tuple modification with non-clear ct()"); + return false; + } + + /* Add ct_state=-trk match so it will be offloaded for non ct flows + * (or after clear action), as otherwise, since the tuple is changed, + * we can't restore ct state + */ + if (mlx5_tc_ct_add_no_trk_match(spec)) { + NL_SET_ERR_MSG_MOD(extack, + "can't offload tuple modification with ct matches and no ct(clear) action"); + netdev_info(priv->netdev, + "can't offload tuple modification with ct matches and no ct(clear) action"); + return false; + } + + return true; +} + static bool modify_header_match_supported(struct mlx5e_priv *priv, struct mlx5_flow_spec *spec, struct flow_action *flow_action, @@ -3246,18 +3277,9 @@ static bool modify_header_match_supported(struct mlx5e_priv *priv, return err; }
- /* Add ct_state=-trk match so it will be offloaded for non ct flows - * (or after clear action), as otherwise, since the tuple is changed, - * we can't restore ct state - */ - if (!ct_clear && modify_tuple && - mlx5_tc_ct_add_no_trk_match(spec)) { - NL_SET_ERR_MSG_MOD(extack, - "can't offload tuple modify header with ct matches"); - netdev_info(priv->netdev, - "can't offload tuple modify header with ct matches"); + if (!modify_tuple_supported(modify_tuple, ct_clear, ct_flow, extack, + priv, spec)) return false; - }
ip_proto = MLX5_GET(fte_match_set_lyr_2_4, headers_v, ip_protocol); if (modify_ip_header && ip_proto != IPPROTO_TCP &&
From: Aya Levin ayal@nvidia.com
[ Upstream commit 4eacfe72e3e037e3fc019113df32c39a705148c2 ]
Expose error value when failing to comply to command: $ ethtool --set-priv-flags eth2 rx_cqe_compress [on/off]
Fixes: be7e87f92b58 ("net/mlx5e: Fail safe cqe compressing/moderation mode setting") Signed-off-by: Aya Levin ayal@nvidia.com Reviewed-by: Tariq Toukan tariqt@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c index fdf5afc8b058..c9d01e705ab2 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c @@ -1876,6 +1876,7 @@ static int set_pflag_rx_cqe_compress(struct net_device *netdev, { struct mlx5e_priv *priv = netdev_priv(netdev); struct mlx5_core_dev *mdev = priv->mdev; + int err;
if (!MLX5_CAP_GEN(mdev, cqe_compression)) return -EOPNOTSUPP; @@ -1885,7 +1886,10 @@ static int set_pflag_rx_cqe_compress(struct net_device *netdev, return -EINVAL; }
- mlx5e_modify_rx_cqe_compression_locked(priv, enable); + err = mlx5e_modify_rx_cqe_compression_locked(priv, enable); + if (err) + return err; + priv->channels.params.rx_cqe_compress_def = enable;
return 0;
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
[ Upstream commit a61f4661fba404418a7c77e86586dc52a58a93c6 ]
The structures are used as place holders, so they are modified at run-time. Obviously they may not be constants.
BUG: unable to handle page fault for address: d0643220 ... CPU: 0 PID: 110 Comm: modprobe Not tainted 5.11.0+ #1 Hardware name: Intel Corp. QUARK/GalileoGen2, BIOS 0x01000200 01/01/2014 EIP: intel_quark_mfd_probe+0x93/0x1c0 [intel_quark_i2c_gpio]
This partially reverts the commit c4a164f41554d2899bed94bdcc499263f41787b4.
While at it, add a comment to avoid similar changes in the future.
Fixes: c4a164f41554 ("mfd: Constify static struct resources") Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Reviewed-by: Rikard Falkeborn rikard.falkeborn@gmail.com Tested-by: Tong Zhang ztong0001@gmail.com Signed-off-by: Lee Jones lee.jones@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mfd/intel_quark_i2c_gpio.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/mfd/intel_quark_i2c_gpio.c b/drivers/mfd/intel_quark_i2c_gpio.c index fe8ca945f367..b67cb0a3ab05 100644 --- a/drivers/mfd/intel_quark_i2c_gpio.c +++ b/drivers/mfd/intel_quark_i2c_gpio.c @@ -72,7 +72,8 @@ static const struct dmi_system_id dmi_platform_info[] = { {} };
-static const struct resource intel_quark_i2c_res[] = { +/* This is used as a place holder and will be modified at run-time */ +static struct resource intel_quark_i2c_res[] = { [INTEL_QUARK_IORES_MEM] = { .flags = IORESOURCE_MEM, }, @@ -85,7 +86,8 @@ static struct mfd_cell_acpi_match intel_quark_acpi_match_i2c = { .adr = MFD_ACPI_MATCH_I2C, };
-static const struct resource intel_quark_gpio_res[] = { +/* This is used as a place holder and will be modified at run-time */ +static struct resource intel_quark_gpio_res[] = { [INTEL_QUARK_IORES_MEM] = { .flags = IORESOURCE_MEM, },
From: Lukasz Luba lukasz.luba@arm.com
[ Upstream commit fb9d62b27ab1e07d625591549c314b7d406d21df ]
The debugfs directory '/sys/kernel/debug/energy_model' is needed before the Energy Model registration can happen. With the recent change in debugfs subsystem it's not allowed to create this directory at early stage (core_initcall). Thus creating this directory would fail.
Postpone the creation of the EM debug dir to later stage: fs_initcall.
It should be safe since all clients: CPUFreq drivers, Devfreq drivers will be initialized in later stages.
The custom debug log below prints the time of creation the EM debug dir at fs_initcall and successful registration of EMs at later stages.
[ 1.505717] energy_model: creating rootdir [ 3.698307] cpu cpu0: EM: created perf domain [ 3.709022] cpu cpu1: EM: created perf domain
Fixes: 56348560d495 ("debugfs: do not attempt to create a new file before the filesystem is initalized") Reported-by: Ionela Voinescu ionela.voinescu@arm.com Signed-off-by: Lukasz Luba lukasz.luba@arm.com Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/power/energy_model.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c index 1358fa4abfa8..0f4530b3a8cd 100644 --- a/kernel/power/energy_model.c +++ b/kernel/power/energy_model.c @@ -98,7 +98,7 @@ static int __init em_debug_init(void)
return 0; } -core_initcall(em_debug_init); +fs_initcall(em_debug_init); #else /* CONFIG_DEBUG_FS */ static void em_debug_create_pd(struct device *dev) {} static void em_debug_remove_pd(struct device *dev) {}
From: David E. Box david.e.box@linux.intel.com
[ Upstream commit 10c931cdfe64ebc38a15a485dd794915044f2111 ]
Fixes off-by-one bugs in the macro assignments for the crashlog control bits. Was initially tested on emulation but bug revealed after testing on silicon.
Fixes: 5ef9998c96b0 ("platform/x86: Intel PMT Crashlog capability driver") Signed-off-by: David E. Box david.e.box@linux.intel.com Link: https://lore.kernel.org/r/20210317024455.3071477-2-david.e.box@linux.intel.c... Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/intel_pmt_crashlog.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/drivers/platform/x86/intel_pmt_crashlog.c b/drivers/platform/x86/intel_pmt_crashlog.c index 97dd749c8290..92d315a16cfd 100644 --- a/drivers/platform/x86/intel_pmt_crashlog.c +++ b/drivers/platform/x86/intel_pmt_crashlog.c @@ -23,18 +23,17 @@ #define CRASH_TYPE_OOBMSM 1
/* Control Flags */ -#define CRASHLOG_FLAG_DISABLE BIT(27) +#define CRASHLOG_FLAG_DISABLE BIT(28)
/* - * Bits 28 and 29 control the state of bit 31. + * Bits 29 and 30 control the state of bit 31. * - * Bit 28 will clear bit 31, if set, allowing a new crashlog to be captured. - * Bit 29 will immediately trigger a crashlog to be generated, setting bit 31. - * Bit 30 is read-only and reserved as 0. + * Bit 29 will clear bit 31, if set, allowing a new crashlog to be captured. + * Bit 30 will immediately trigger a crashlog to be generated, setting bit 31. * Bit 31 is the read-only status with a 1 indicating log is complete. */ -#define CRASHLOG_FLAG_TRIGGER_CLEAR BIT(28) -#define CRASHLOG_FLAG_TRIGGER_EXECUTE BIT(29) +#define CRASHLOG_FLAG_TRIGGER_CLEAR BIT(29) +#define CRASHLOG_FLAG_TRIGGER_EXECUTE BIT(30) #define CRASHLOG_FLAG_TRIGGER_COMPLETE BIT(31) #define CRASHLOG_FLAG_TRIGGER_MASK GENMASK(31, 28)
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit 6ab4c3117aec4e08007d9e971fa4133e1de1082d ]
As explained in this discussion: https://lore.kernel.org/netdev/20210117193009.io3nungdwuzmo5f7@skbuf/
the switchdev notifiers for FDB entries managed to have a zero-day bug. The bridge would not say that this entry is local:
ip link add br0 type bridge ip link set swp0 master br0 bridge fdb add dev swp0 00:01:02:03:04:05 master local
and the switchdev driver would be more than happy to offload it as a normal static FDB entry. This is despite the fact that 'local' and non-'local' entries have completely opposite directions: a local entry is locally terminated and not forwarded, whereas a static entry is forwarded and not locally terminated. So, for example, DSA would install this entry on swp0 instead of installing it on the CPU port as it should.
There is an even sadder part, which is that the 'local' flag is implicit if 'static' is not specified, meaning that this command produces the same result of adding a 'local' entry:
bridge fdb add dev swp0 00:01:02:03:04:05 master
I've updated the man pages for 'bridge', and after reading it now, it should be pretty clear to any user that the commands above were broken and should have never resulted in the 00:01:02:03:04:05 address being forwarded (this behavior is coherent with non-switchdev interfaces): https://patchwork.kernel.org/project/netdevbpf/cover/20210211104502.2081443-... If you're a user reading this and this is what you want, just use:
bridge fdb add dev swp0 00:01:02:03:04:05 master static
Because switchdev should have given drivers the means from day one to classify FDB entries as local/non-local, but didn't, it means that all drivers are currently broken. So we can just as well omit the switchdev notifications for local FDB entries, which is exactly what this patch does to close the bug in stable trees. For further development work where drivers might want to trap the local FDB entries to the host, we can add a 'bool is_local' to br_switchdev_fdb_call_notifiers(), and selectively make drivers act upon that bit, while all the others ignore those entries if the 'is_local' bit is set.
Fixes: 6b26b51b1d13 ("net: bridge: Add support for notifying devices about FDB add/del") Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/bridge/br_switchdev.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c index 015209bf44aa..3c42095fa75f 100644 --- a/net/bridge/br_switchdev.c +++ b/net/bridge/br_switchdev.c @@ -123,6 +123,8 @@ br_switchdev_fdb_notify(const struct net_bridge_fdb_entry *fdb, int type) { if (!fdb->dst) return; + if (test_bit(BR_FDB_LOCAL, &fdb->flags)) + return;
switch (type) { case RTM_DELNEIGH:
From: Colin Ian King colin.king@canonical.com
[ Upstream commit 9e0a537d06fc36861e4f78d0a7df1fe2b3592714 ]
Currently the error return path when lfs fails to allocate is not free'ing the memory allocated to buf. Fix this by adding the missing kfree.
Addresses-Coverity: ("Resource leak") Fixes: f7884097141b ("octeontx2-af: Formatting debugfs entry rsrc_alloc.") Signed-off-by: Colin Ian King colin.king@canonical.com Acked-by: Sunil Goutham sgoutham@marvell.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c index ea1e520b6552..0488651a68d0 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c @@ -169,8 +169,10 @@ static ssize_t rvu_dbg_rsrc_attach_status(struct file *filp, return -ENOSPC;
lfs = kzalloc(lf_str_size, GFP_KERNEL); - if (!lfs) + if (!lfs) { + kfree(buf); return -ENOMEM; + } off += scnprintf(&buf[off], buf_size - 1 - off, "%-*s", lf_str_size, "pcifunc"); for (index = 0; index < BLK_COUNT; index++)
From: Roger Pau Monne roger.pau@citrix.com
[ Upstream commit 2b514ec72706a31bea0c3b97e622b81535b5323a ]
The Xen memory hotplug limit should depend on the memory hotplug generic option, rather than the Xen balloon configuration. It's possible to have a kernel with generic memory hotplug enabled, but without Xen balloon enabled, at which point memory hotplug won't work correctly due to the size limitation of the p2m.
Rename the option to XEN_MEMORY_HOTPLUG_LIMIT since it's no longer tied to ballooning.
Fixes: 9e2369c06c8a18 ("xen: add helpers to allocate unpopulated memory") Signed-off-by: Roger Pau Monné roger.pau@citrix.com Reviewed-by: Boris Ostrovsky boris.ostrovsky@oracle.com Link: https://lore.kernel.org/r/20210324122424.58685-2-roger.pau@citrix.com Signed-off-by: Boris Ostrovsky boris.ostrovsky@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/xen/p2m.c | 4 ++-- drivers/xen/Kconfig | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/x86/xen/p2m.c b/arch/x86/xen/p2m.c index a3cc33091f46..bee51ac92a18 100644 --- a/arch/x86/xen/p2m.c +++ b/arch/x86/xen/p2m.c @@ -98,8 +98,8 @@ EXPORT_SYMBOL_GPL(xen_p2m_size); unsigned long xen_max_p2m_pfn __read_mostly; EXPORT_SYMBOL_GPL(xen_max_p2m_pfn);
-#ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG_LIMIT -#define P2M_LIMIT CONFIG_XEN_BALLOON_MEMORY_HOTPLUG_LIMIT +#ifdef CONFIG_XEN_MEMORY_HOTPLUG_LIMIT +#define P2M_LIMIT CONFIG_XEN_MEMORY_HOTPLUG_LIMIT #else #define P2M_LIMIT 0 #endif diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig index 41645fe6ad48..ea0efd290c37 100644 --- a/drivers/xen/Kconfig +++ b/drivers/xen/Kconfig @@ -50,11 +50,11 @@ config XEN_BALLOON_MEMORY_HOTPLUG
SUBSYSTEM=="memory", ACTION=="add", RUN+="/bin/sh -c '[ -f /sys$devpath/state ] && echo online > /sys$devpath/state'"
-config XEN_BALLOON_MEMORY_HOTPLUG_LIMIT +config XEN_MEMORY_HOTPLUG_LIMIT int "Hotplugged memory limit (in GiB) for a PV guest" default 512 depends on XEN_HAVE_PVMMU - depends on XEN_BALLOON_MEMORY_HOTPLUG + depends on MEMORY_HOTPLUG help Maxmium amount of memory (in GiB) that a PV guest can be expanded to when using memory hotplug.
From: Potnuri Bharat Teja bharat@chelsio.com
[ Upstream commit 3408be145a5d6418ff955fe5badde652be90e700 ]
Not setting the ipv6 bit while destroying ipv6 listening servers may result in potential fatal adapter errors due to lookup engine memory hash errors. Therefore always set ipv6 field while destroying ipv6 listening servers.
Fixes: 830662f6f032 ("RDMA/cxgb4: Add support for active and passive open connection with IPv6 address") Link: https://lore.kernel.org/r/20210324190453.8171-1-bharat@chelsio.com Signed-off-by: Potnuri Bharat Teja bharat@chelsio.com Reviewed-by: Leon Romanovsky leonro@nvidia.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/cxgb4/cm.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/hw/cxgb4/cm.c b/drivers/infiniband/hw/cxgb4/cm.c index 8769e7aa097f..81903749d241 100644 --- a/drivers/infiniband/hw/cxgb4/cm.c +++ b/drivers/infiniband/hw/cxgb4/cm.c @@ -3610,13 +3610,13 @@ int c4iw_destroy_listen(struct iw_cm_id *cm_id) ep->com.local_addr.ss_family == AF_INET) { err = cxgb4_remove_server_filter( ep->com.dev->rdev.lldi.ports[0], ep->stid, - ep->com.dev->rdev.lldi.rxq_ids[0], 0); + ep->com.dev->rdev.lldi.rxq_ids[0], false); } else { struct sockaddr_in6 *sin6; c4iw_init_wr_wait(ep->com.wr_waitp); err = cxgb4_remove_server( ep->com.dev->rdev.lldi.ports[0], ep->stid, - ep->com.dev->rdev.lldi.rxq_ids[0], 0); + ep->com.dev->rdev.lldi.rxq_ids[0], true); if (err) goto done; err = c4iw_wait_for_reply(&ep->com.dev->rdev, ep->com.wr_waitp,
From: Mike Rapoport rppt@linux.ibm.com
[ Upstream commit a024b7c2850dddd01e65b8270f0971deaf272f27 ]
Commit 34dc2efb39a2 ("memblock: fix section mismatch warning") marked memblock_bottom_up() and memblock_set_bottom_up() as __init, but they could be referenced from non-init functions like memblock_find_in_range_node() on architectures that enable CONFIG_ARCH_KEEP_MEMBLOCK.
For such builds kernel test robot reports:
WARNING: modpost: vmlinux.o(.text+0x74fea4): Section mismatch in reference from the function memblock_find_in_range_node() to the function .init.text:memblock_bottom_up() The function memblock_find_in_range_node() references the function __init memblock_bottom_up(). This is often because memblock_find_in_range_node lacks a __init annotation or the annotation of memblock_bottom_up is wrong.
Replace __init annotations with __init_memblock annotations so that the appropriate section will be selected depending on CONFIG_ARCH_KEEP_MEMBLOCK.
Link: https://lore.kernel.org/lkml/202103160133.UzhgY0wt-lkp@intel.com Link: https://lkml.kernel.org/r/20210316171347.14084-1-rppt@kernel.org Fixes: 34dc2efb39a2 ("memblock: fix section mismatch warning") Signed-off-by: Mike Rapoport rppt@linux.ibm.com Reviewed-by: Arnd Bergmann arnd@arndb.de Reported-by: kernel test robot lkp@intel.com Reviewed-by: David Hildenbrand david@redhat.com Acked-by: Nick Desaulniers ndesaulniers@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/memblock.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/memblock.h b/include/linux/memblock.h index 7643d2dfa959..4ce9c8f9e684 100644 --- a/include/linux/memblock.h +++ b/include/linux/memblock.h @@ -460,7 +460,7 @@ static inline void memblock_free_late(phys_addr_t base, phys_addr_t size) /* * Set the allocation direction to bottom-up or top-down. */ -static inline __init void memblock_set_bottom_up(bool enable) +static inline __init_memblock void memblock_set_bottom_up(bool enable) { memblock.bottom_up = enable; } @@ -470,7 +470,7 @@ static inline __init void memblock_set_bottom_up(bool enable) * if this is true, that said, memblock will allocate memory * in bottom-up direction. */ -static inline __init bool memblock_bottom_up(void) +static inline __init_memblock bool memblock_bottom_up(void) { return memblock.bottom_up; }
[ Upstream commit 05a68ce5fa51a83c360381630f823545c5757aa2 ]
For kuprobe and tracepoint bpf programs, kernel calls trace_call_bpf() which calls BPF_PROG_RUN_ARRAY_CHECK() to run the program array. Currently, BPF_PROG_RUN_ARRAY_CHECK() also calls bpf_cgroup_storage_set() to set percpu cgroup local storage with NULL value. This is due to Commit 394e40a29788 ("bpf: extend bpf_prog_array to store pointers to the cgroup storage") which modified __BPF_PROG_RUN_ARRAY() to call bpf_cgroup_storage_set() and this macro is also used by BPF_PROG_RUN_ARRAY_CHECK().
kuprobe and tracepoint programs are not allowed to call bpf_get_local_storage() helper hence does not access percpu cgroup local storage. Let us change BPF_PROG_RUN_ARRAY_CHECK() not to modify percpu cgroup local storage.
The issue is observed when I tried to debug [1] where percpu data is overwritten due to preempt_disable -> migration_disable change. This patch does not completely fix the above issue, which will be addressed separately, e.g., multiple cgroup prog runs may preempt each other. But it does fix any potential issue caused by tracing program overwriting percpu cgroup storage: - in a busy system, a tracing program is to run between bpf_cgroup_storage_set() and the cgroup prog run. - a kprobe program is triggered by a helper in cgroup prog before bpf_get_local_storage() is called.
[1] https://lore.kernel.org/bpf/CAKH8qBuXCfUz=w8L+Fj74OaUpbosO29niYwTki7e3Ag044_...
Fixes: 394e40a29788 ("bpf: extend bpf_prog_array to store pointers to the cgroup storage") Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Roman Gushchin guro@fb.com Link: https://lore.kernel.org/bpf/20210309185028.3763817-1-yhs@fb.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/bpf.h | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 6e585dbc10df..f4242865cc86 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1066,7 +1066,7 @@ int bpf_prog_array_copy(struct bpf_prog_array *old_array, struct bpf_prog *include_prog, struct bpf_prog_array **new_array);
-#define __BPF_PROG_RUN_ARRAY(array, ctx, func, check_non_null) \ +#define __BPF_PROG_RUN_ARRAY(array, ctx, func, check_non_null, set_cg_storage) \ ({ \ struct bpf_prog_array_item *_item; \ struct bpf_prog *_prog; \ @@ -1079,7 +1079,8 @@ int bpf_prog_array_copy(struct bpf_prog_array *old_array, goto _out; \ _item = &_array->items[0]; \ while ((_prog = READ_ONCE(_item->prog))) { \ - bpf_cgroup_storage_set(_item->cgroup_storage); \ + if (set_cg_storage) \ + bpf_cgroup_storage_set(_item->cgroup_storage); \ _ret &= func(_prog, ctx); \ _item++; \ } \ @@ -1140,10 +1141,10 @@ _out: \ })
#define BPF_PROG_RUN_ARRAY(array, ctx, func) \ - __BPF_PROG_RUN_ARRAY(array, ctx, func, false) + __BPF_PROG_RUN_ARRAY(array, ctx, func, false, true)
#define BPF_PROG_RUN_ARRAY_CHECK(array, ctx, func) \ - __BPF_PROG_RUN_ARRAY(array, ctx, func, true) + __BPF_PROG_RUN_ARRAY(array, ctx, func, true, false)
#ifdef CONFIG_BPF_SYSCALL DECLARE_PER_CPU(int, bpf_prog_active);
From: Daniel Borkmann daniel@iogearbox.net
[ Upstream commit c4c877b2732466b4c63217baad05c96f775912c7 ]
Move generic blackhole dst ops to the core and use them from both ipv4_dst_blackhole_ops and ip6_dst_blackhole_ops where possible. No functional change otherwise. We need these also in other locations and having to define them over and over again is not great.
Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/dst.h | 11 +++++++++++ net/core/dst.c | 38 ++++++++++++++++++++++++++++++++++++++ net/ipv4/route.c | 45 ++++++++------------------------------------- net/ipv6/route.c | 36 +++++++++--------------------------- 4 files changed, 66 insertions(+), 64 deletions(-)
diff --git a/include/net/dst.h b/include/net/dst.h index 10f0a8399867..8d7cf51766c4 100644 --- a/include/net/dst.h +++ b/include/net/dst.h @@ -533,4 +533,15 @@ static inline void skb_dst_update_pmtu_no_confirm(struct sk_buff *skb, u32 mtu) dst->ops->update_pmtu(dst, NULL, skb, mtu, false); }
+struct dst_entry *dst_blackhole_check(struct dst_entry *dst, u32 cookie); +void dst_blackhole_update_pmtu(struct dst_entry *dst, struct sock *sk, + struct sk_buff *skb, u32 mtu, bool confirm_neigh); +void dst_blackhole_redirect(struct dst_entry *dst, struct sock *sk, + struct sk_buff *skb); +u32 *dst_blackhole_cow_metrics(struct dst_entry *dst, unsigned long old); +struct neighbour *dst_blackhole_neigh_lookup(const struct dst_entry *dst, + struct sk_buff *skb, + const void *daddr); +unsigned int dst_blackhole_mtu(const struct dst_entry *dst); + #endif /* _NET_DST_H */ diff --git a/net/core/dst.c b/net/core/dst.c index 0c01bd8d9d81..5f6315601776 100644 --- a/net/core/dst.c +++ b/net/core/dst.c @@ -237,6 +237,44 @@ void __dst_destroy_metrics_generic(struct dst_entry *dst, unsigned long old) } EXPORT_SYMBOL(__dst_destroy_metrics_generic);
+struct dst_entry *dst_blackhole_check(struct dst_entry *dst, u32 cookie) +{ + return NULL; +} + +u32 *dst_blackhole_cow_metrics(struct dst_entry *dst, unsigned long old) +{ + return NULL; +} + +struct neighbour *dst_blackhole_neigh_lookup(const struct dst_entry *dst, + struct sk_buff *skb, + const void *daddr) +{ + return NULL; +} + +void dst_blackhole_update_pmtu(struct dst_entry *dst, struct sock *sk, + struct sk_buff *skb, u32 mtu, + bool confirm_neigh) +{ +} +EXPORT_SYMBOL_GPL(dst_blackhole_update_pmtu); + +void dst_blackhole_redirect(struct dst_entry *dst, struct sock *sk, + struct sk_buff *skb) +{ +} +EXPORT_SYMBOL_GPL(dst_blackhole_redirect); + +unsigned int dst_blackhole_mtu(const struct dst_entry *dst) +{ + unsigned int mtu = dst_metric_raw(dst, RTAX_MTU); + + return mtu ? : dst->dev->mtu; +} +EXPORT_SYMBOL_GPL(dst_blackhole_mtu); + static struct dst_ops md_dst_ops = { .family = AF_UNSPEC, }; diff --git a/net/ipv4/route.c b/net/ipv4/route.c index e26652ff7059..983b4db1868f 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -2682,44 +2682,15 @@ out: return rth; }
-static struct dst_entry *ipv4_blackhole_dst_check(struct dst_entry *dst, u32 cookie) -{ - return NULL; -} - -static unsigned int ipv4_blackhole_mtu(const struct dst_entry *dst) -{ - unsigned int mtu = dst_metric_raw(dst, RTAX_MTU); - - return mtu ? : dst->dev->mtu; -} - -static void ipv4_rt_blackhole_update_pmtu(struct dst_entry *dst, struct sock *sk, - struct sk_buff *skb, u32 mtu, - bool confirm_neigh) -{ -} - -static void ipv4_rt_blackhole_redirect(struct dst_entry *dst, struct sock *sk, - struct sk_buff *skb) -{ -} - -static u32 *ipv4_rt_blackhole_cow_metrics(struct dst_entry *dst, - unsigned long old) -{ - return NULL; -} - static struct dst_ops ipv4_dst_blackhole_ops = { - .family = AF_INET, - .check = ipv4_blackhole_dst_check, - .mtu = ipv4_blackhole_mtu, - .default_advmss = ipv4_default_advmss, - .update_pmtu = ipv4_rt_blackhole_update_pmtu, - .redirect = ipv4_rt_blackhole_redirect, - .cow_metrics = ipv4_rt_blackhole_cow_metrics, - .neigh_lookup = ipv4_neigh_lookup, + .family = AF_INET, + .default_advmss = ipv4_default_advmss, + .neigh_lookup = ipv4_neigh_lookup, + .check = dst_blackhole_check, + .cow_metrics = dst_blackhole_cow_metrics, + .update_pmtu = dst_blackhole_update_pmtu, + .redirect = dst_blackhole_redirect, + .mtu = dst_blackhole_mtu, };
struct dst_entry *ipv4_blackhole_route(struct net *net, struct dst_entry *dst_orig) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 188e114b29b4..0bbfaa55e3c8 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -258,34 +258,16 @@ static struct dst_ops ip6_dst_ops_template = { .confirm_neigh = ip6_confirm_neigh, };
-static unsigned int ip6_blackhole_mtu(const struct dst_entry *dst) -{ - unsigned int mtu = dst_metric_raw(dst, RTAX_MTU); - - return mtu ? : dst->dev->mtu; -} - -static void ip6_rt_blackhole_update_pmtu(struct dst_entry *dst, struct sock *sk, - struct sk_buff *skb, u32 mtu, - bool confirm_neigh) -{ -} - -static void ip6_rt_blackhole_redirect(struct dst_entry *dst, struct sock *sk, - struct sk_buff *skb) -{ -} - static struct dst_ops ip6_dst_blackhole_ops = { - .family = AF_INET6, - .destroy = ip6_dst_destroy, - .check = ip6_dst_check, - .mtu = ip6_blackhole_mtu, - .default_advmss = ip6_default_advmss, - .update_pmtu = ip6_rt_blackhole_update_pmtu, - .redirect = ip6_rt_blackhole_redirect, - .cow_metrics = dst_cow_metrics_generic, - .neigh_lookup = ip6_dst_neigh_lookup, + .family = AF_INET6, + .default_advmss = ip6_default_advmss, + .neigh_lookup = ip6_dst_neigh_lookup, + .check = ip6_dst_check, + .destroy = ip6_dst_destroy, + .cow_metrics = dst_cow_metrics_generic, + .update_pmtu = dst_blackhole_update_pmtu, + .redirect = dst_blackhole_redirect, + .mtu = dst_blackhole_mtu, };
static const u32 ip6_template_metrics[RTAX_MAX] = {
From: Daniel Borkmann daniel@iogearbox.net
[ Upstream commit a188bb5638d41aa99090ebf2f85d3505ab13fba5 ]
I ran into a crash where setting up a ip6ip6 tunnel device which was /not/ set to collect_md mode was receiving collect_md populated skbs for xmit.
The BPF prog was populating the skb via bpf_skb_set_tunnel_key() which is assigning special metadata dst entry and then redirecting the skb to the device, taking ip6_tnl_start_xmit() -> ipxip6_tnl_xmit() -> ip6_tnl_xmit() and in the latter it performs a neigh lookup based on skb_dst(skb) where we trigger a NULL pointer dereference on dst->ops->neigh_lookup() since the md_dst_ops do not populate neigh_lookup callback with a fake handler.
Transform the md_dst_ops into generic dst_blackhole_ops that can also be reused elsewhere when needed, and use them for the metadata dst entries as callback ops.
Also, remove the dst_md_discard{,_out}() ops and rely on dst_discard{,_out}() from dst_init() which free the skb the same way modulo the splat. Given we will be able to recover just fine from there, avoid any potential splats iff this gets ever triggered in future (or worse, panic on warns when set).
Fixes: f38a9eb1f77b ("dst: Metadata destinations") Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/dst.c | 31 +++++++++---------------------- 1 file changed, 9 insertions(+), 22 deletions(-)
diff --git a/net/core/dst.c b/net/core/dst.c index 5f6315601776..fb3bcba87744 100644 --- a/net/core/dst.c +++ b/net/core/dst.c @@ -275,37 +275,24 @@ unsigned int dst_blackhole_mtu(const struct dst_entry *dst) } EXPORT_SYMBOL_GPL(dst_blackhole_mtu);
-static struct dst_ops md_dst_ops = { - .family = AF_UNSPEC, +static struct dst_ops dst_blackhole_ops = { + .family = AF_UNSPEC, + .neigh_lookup = dst_blackhole_neigh_lookup, + .check = dst_blackhole_check, + .cow_metrics = dst_blackhole_cow_metrics, + .update_pmtu = dst_blackhole_update_pmtu, + .redirect = dst_blackhole_redirect, + .mtu = dst_blackhole_mtu, };
-static int dst_md_discard_out(struct net *net, struct sock *sk, struct sk_buff *skb) -{ - WARN_ONCE(1, "Attempting to call output on metadata dst\n"); - kfree_skb(skb); - return 0; -} - -static int dst_md_discard(struct sk_buff *skb) -{ - WARN_ONCE(1, "Attempting to call input on metadata dst\n"); - kfree_skb(skb); - return 0; -} - static void __metadata_dst_init(struct metadata_dst *md_dst, enum metadata_type type, u8 optslen) - { struct dst_entry *dst;
dst = &md_dst->dst; - dst_init(dst, &md_dst_ops, NULL, 1, DST_OBSOLETE_NONE, + dst_init(dst, &dst_blackhole_ops, NULL, 1, DST_OBSOLETE_NONE, DST_METADATA | DST_NOCOUNT); - - dst->input = dst_md_discard; - dst->output = dst_md_discard_out; - memset(dst + 1, 0, sizeof(*md_dst) + optslen - sizeof(*dst)); md_dst->type = type; }
From: Li RongQing lirongqing@baidu.com
[ Upstream commit 98dfb02aa22280bd8833836d1b00ab0488fa951f ]
Igb needs a similar fix as commit 75aab4e10ae6a ("i40e: avoid premature Rx buffer reuse")
The page recycle code, incorrectly, relied on that a page fragment could not be freed inside xdp_do_redirect(). This assumption leads to that page fragments that are used by the stack/XDP redirect can be reused and overwritten.
To avoid this, store the page count prior invoking xdp_do_redirect().
Longer explanation:
Intel NICs have a recycle mechanism. The main idea is that a page is split into two parts. One part is owned by the driver, one part might be owned by someone else, such as the stack.
t0: Page is allocated, and put on the Rx ring +--------------- used by NIC ->| upper buffer (rx_buffer) +--------------- | lower buffer +--------------- page count == USHRT_MAX rx_buffer->pagecnt_bias == USHRT_MAX
t1: Buffer is received, and passed to the stack (e.g.) +--------------- | upper buff (skb) +--------------- used by NIC ->| lower buffer (rx_buffer) +--------------- page count == USHRT_MAX rx_buffer->pagecnt_bias == USHRT_MAX - 1
t2: Buffer is received, and redirected +--------------- | upper buff (skb) +--------------- used by NIC ->| lower buffer (rx_buffer) +---------------
Now, prior calling xdp_do_redirect(): page count == USHRT_MAX rx_buffer->pagecnt_bias == USHRT_MAX - 2
This means that buffer *cannot* be flipped/reused, because the skb is still using it.
The problem arises when xdp_do_redirect() actually frees the segment. Then we get: page count == USHRT_MAX - 1 rx_buffer->pagecnt_bias == USHRT_MAX - 2
From a recycle perspective, the buffer can be flipped and reused,
which means that the skb data area is passed to the Rx HW ring!
To work around this, the page count is stored prior calling xdp_do_redirect().
Fixes: 9cbc948b5a20 ("igb: add XDP support") Signed-off-by: Li RongQing lirongqing@baidu.com Reviewed-by: Alexander Duyck alexanderduyck@fb.com Tested-by: Vishakha Jambekar vishakha.jambekar@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igb/igb_main.c | 22 +++++++++++++++------- 1 file changed, 15 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index de0fab0e7ce2..0e8c17f7af28 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -8232,7 +8232,8 @@ static inline bool igb_page_is_reserved(struct page *page) return (page_to_nid(page) != numa_mem_id()) || page_is_pfmemalloc(page); }
-static bool igb_can_reuse_rx_page(struct igb_rx_buffer *rx_buffer) +static bool igb_can_reuse_rx_page(struct igb_rx_buffer *rx_buffer, + int rx_buf_pgcnt) { unsigned int pagecnt_bias = rx_buffer->pagecnt_bias; struct page *page = rx_buffer->page; @@ -8243,7 +8244,7 @@ static bool igb_can_reuse_rx_page(struct igb_rx_buffer *rx_buffer)
#if (PAGE_SIZE < 8192) /* if we are only owner of page we can reuse it */ - if (unlikely((page_ref_count(page) - pagecnt_bias) > 1)) + if (unlikely((rx_buf_pgcnt - pagecnt_bias) > 1)) return false; #else #define IGB_LAST_OFFSET \ @@ -8633,11 +8634,17 @@ static unsigned int igb_rx_offset(struct igb_ring *rx_ring) }
static struct igb_rx_buffer *igb_get_rx_buffer(struct igb_ring *rx_ring, - const unsigned int size) + const unsigned int size, int *rx_buf_pgcnt) { struct igb_rx_buffer *rx_buffer;
rx_buffer = &rx_ring->rx_buffer_info[rx_ring->next_to_clean]; + *rx_buf_pgcnt = +#if (PAGE_SIZE < 8192) + page_count(rx_buffer->page); +#else + 0; +#endif prefetchw(rx_buffer->page);
/* we are reusing so sync this buffer for CPU use */ @@ -8653,9 +8660,9 @@ static struct igb_rx_buffer *igb_get_rx_buffer(struct igb_ring *rx_ring, }
static void igb_put_rx_buffer(struct igb_ring *rx_ring, - struct igb_rx_buffer *rx_buffer) + struct igb_rx_buffer *rx_buffer, int rx_buf_pgcnt) { - if (igb_can_reuse_rx_page(rx_buffer)) { + if (igb_can_reuse_rx_page(rx_buffer, rx_buf_pgcnt)) { /* hand second half of page back to the ring */ igb_reuse_rx_page(rx_ring, rx_buffer); } else { @@ -8682,6 +8689,7 @@ static int igb_clean_rx_irq(struct igb_q_vector *q_vector, const int budget) u16 cleaned_count = igb_desc_unused(rx_ring); unsigned int xdp_xmit = 0; struct xdp_buff xdp; + int rx_buf_pgcnt;
xdp.rxq = &rx_ring->xdp_rxq;
@@ -8712,7 +8720,7 @@ static int igb_clean_rx_irq(struct igb_q_vector *q_vector, const int budget) */ dma_rmb();
- rx_buffer = igb_get_rx_buffer(rx_ring, size); + rx_buffer = igb_get_rx_buffer(rx_ring, size, &rx_buf_pgcnt);
/* retrieve a buffer from the ring */ if (!skb) { @@ -8755,7 +8763,7 @@ static int igb_clean_rx_irq(struct igb_q_vector *q_vector, const int budget) break; }
- igb_put_rx_buffer(rx_ring, rx_buffer); + igb_put_rx_buffer(rx_ring, rx_buffer, rx_buf_pgcnt); cleaned_count++;
/* fetch next buffer in frame if non-eop */
From: Robert Hancock robert.hancock@calian.com
[ Upstream commit 59cd4f19267a0aab87a8c07e4426eb7187ee548d ]
The driver did not always clean up all allocated resources when probe failed. Fix the probe cleanup path to clean up everything that was allocated.
Fixes: 57baf8cc70ea ("net: axienet: Handle deferred probe on clock properly") Signed-off-by: Robert Hancock robert.hancock@calian.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/xilinx/xilinx_axienet_main.c | 35 +++++++++++++------ 1 file changed, 24 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c index b4a0bfce5b76..4cd701a9277d 100644 --- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c +++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c @@ -1835,7 +1835,7 @@ static int axienet_probe(struct platform_device *pdev) if (IS_ERR(lp->regs)) { dev_err(&pdev->dev, "could not map Axi Ethernet regs.\n"); ret = PTR_ERR(lp->regs); - goto free_netdev; + goto cleanup_clk; } lp->regs_start = ethres->start;
@@ -1910,12 +1910,12 @@ static int axienet_probe(struct platform_device *pdev) break; default: ret = -EINVAL; - goto free_netdev; + goto cleanup_clk; } } else { ret = of_get_phy_mode(pdev->dev.of_node, &lp->phy_mode); if (ret) - goto free_netdev; + goto cleanup_clk; }
/* Find the DMA node, map the DMA registers, and decode the DMA IRQs */ @@ -1928,7 +1928,7 @@ static int axienet_probe(struct platform_device *pdev) dev_err(&pdev->dev, "unable to get DMA resource\n"); of_node_put(np); - goto free_netdev; + goto cleanup_clk; } lp->dma_regs = devm_ioremap_resource(&pdev->dev, &dmares); @@ -1948,12 +1948,12 @@ static int axienet_probe(struct platform_device *pdev) if (IS_ERR(lp->dma_regs)) { dev_err(&pdev->dev, "could not map DMA regs\n"); ret = PTR_ERR(lp->dma_regs); - goto free_netdev; + goto cleanup_clk; } if ((lp->rx_irq <= 0) || (lp->tx_irq <= 0)) { dev_err(&pdev->dev, "could not determine irqs\n"); ret = -ENOMEM; - goto free_netdev; + goto cleanup_clk; }
/* Autodetect the need for 64-bit DMA pointers. @@ -1983,7 +1983,7 @@ static int axienet_probe(struct platform_device *pdev) ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(addr_width)); if (ret) { dev_err(&pdev->dev, "No suitable DMA available\n"); - goto free_netdev; + goto cleanup_clk; }
/* Check for Ethernet core IRQ (optional) */ @@ -2014,12 +2014,12 @@ static int axienet_probe(struct platform_device *pdev) if (!lp->phy_node) { dev_err(&pdev->dev, "phy-handle required for 1000BaseX/SGMII\n"); ret = -EINVAL; - goto free_netdev; + goto cleanup_mdio; } lp->pcs_phy = of_mdio_find_device(lp->phy_node); if (!lp->pcs_phy) { ret = -EPROBE_DEFER; - goto free_netdev; + goto cleanup_mdio; } lp->phylink_config.pcs_poll = true; } @@ -2033,17 +2033,30 @@ static int axienet_probe(struct platform_device *pdev) if (IS_ERR(lp->phylink)) { ret = PTR_ERR(lp->phylink); dev_err(&pdev->dev, "phylink_create error (%i)\n", ret); - goto free_netdev; + goto cleanup_mdio; }
ret = register_netdev(lp->ndev); if (ret) { dev_err(lp->dev, "register_netdev() error (%i)\n", ret); - goto free_netdev; + goto cleanup_phylink; }
return 0;
+cleanup_phylink: + phylink_destroy(lp->phylink); + +cleanup_mdio: + if (lp->pcs_phy) + put_device(&lp->pcs_phy->dev); + if (lp->mii_bus) + axienet_mdio_teardown(lp); + of_node_put(lp->phy_node); + +cleanup_clk: + clk_disable_unprepare(lp->clk); + free_netdev: free_netdev(ndev);
From: Michael Walle michael@walle.cc
[ Upstream commit 4217a64e18a1647a0dbc68cb3169a5a06f054ec8 ]
At the moment, PORT_MII is reported in the ethtool ops. This is odd because it is an interface between the MAC and the PHY and no external port. Some network card drivers will overwrite the port to twisted pair or fiber, though. Even worse, the MDI/MDIX setting is only used by ethtool if the port is twisted pair.
Set the port to PORT_TP by default because most PHY drivers are copper ones. If there is fibre support and it is enabled, the PHY driver will set it to PORT_FIBRE.
This will change reporting PORT_MII to either PORT_TP or PORT_FIBRE; except for the genphy fallback driver.
Suggested-by: Andrew Lunn andrew@lunn.ch Signed-off-by: Michael Walle michael@walle.cc Reviewed-by: Florian Fainelli f.fainelli@gmail.com Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/broadcom.c | 2 ++ drivers/net/phy/dp83822.c | 3 +++ drivers/net/phy/dp83869.c | 4 ++++ drivers/net/phy/lxt.c | 1 + drivers/net/phy/marvell.c | 1 + drivers/net/phy/marvell10g.c | 2 ++ drivers/net/phy/micrel.c | 14 +++++++++++--- drivers/net/phy/phy.c | 2 +- drivers/net/phy/phy_device.c | 9 +++++++++ include/linux/phy.h | 2 ++ 10 files changed, 36 insertions(+), 4 deletions(-)
diff --git a/drivers/net/phy/broadcom.c b/drivers/net/phy/broadcom.c index ec45a1608309..48024ac85980 100644 --- a/drivers/net/phy/broadcom.c +++ b/drivers/net/phy/broadcom.c @@ -505,6 +505,8 @@ static int bcm54616s_probe(struct phy_device *phydev) */ if (!(val & BCM54616S_100FX_MODE)) phydev->dev_flags |= PHY_BCM_FLAGS_MODE_1000BX; + + phydev->port = PORT_FIBRE; }
return 0; diff --git a/drivers/net/phy/dp83822.c b/drivers/net/phy/dp83822.c index 423952cb9e1c..f7a2ec150e54 100644 --- a/drivers/net/phy/dp83822.c +++ b/drivers/net/phy/dp83822.c @@ -555,6 +555,9 @@ static int dp83822_probe(struct phy_device *phydev)
dp83822_of_init(phydev);
+ if (dp83822->fx_enabled) + phydev->port = PORT_FIBRE; + return 0; }
diff --git a/drivers/net/phy/dp83869.c b/drivers/net/phy/dp83869.c index b30bc142d82e..755220c6451f 100644 --- a/drivers/net/phy/dp83869.c +++ b/drivers/net/phy/dp83869.c @@ -855,6 +855,10 @@ static int dp83869_probe(struct phy_device *phydev) if (ret) return ret;
+ if (dp83869->mode == DP83869_RGMII_100_BASE || + dp83869->mode == DP83869_RGMII_1000_BASE) + phydev->port = PORT_FIBRE; + return dp83869_config_init(phydev); }
diff --git a/drivers/net/phy/lxt.c b/drivers/net/phy/lxt.c index 0ee23d29c0d4..bde3356a2f86 100644 --- a/drivers/net/phy/lxt.c +++ b/drivers/net/phy/lxt.c @@ -292,6 +292,7 @@ static int lxt973_probe(struct phy_device *phydev) phy_write(phydev, MII_BMCR, val); /* Remember that the port is in fiber mode. */ phydev->priv = lxt973_probe; + phydev->port = PORT_FIBRE; } else { phydev->priv = NULL; } diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c index 620052c023a5..2afef45d15b1 100644 --- a/drivers/net/phy/marvell.c +++ b/drivers/net/phy/marvell.c @@ -1552,6 +1552,7 @@ static int marvell_read_status_page(struct phy_device *phydev, int page) phydev->asym_pause = 0; phydev->speed = SPEED_UNKNOWN; phydev->duplex = DUPLEX_UNKNOWN; + phydev->port = fiber ? PORT_FIBRE : PORT_TP;
if (phydev->autoneg == AUTONEG_ENABLE) err = marvell_read_status_page_an(phydev, fiber, status); diff --git a/drivers/net/phy/marvell10g.c b/drivers/net/phy/marvell10g.c index 1901ba277413..b1bb9b8e1e4e 100644 --- a/drivers/net/phy/marvell10g.c +++ b/drivers/net/phy/marvell10g.c @@ -631,6 +631,7 @@ static int mv3310_read_status_10gbaser(struct phy_device *phydev) phydev->link = 1; phydev->speed = SPEED_10000; phydev->duplex = DUPLEX_FULL; + phydev->port = PORT_FIBRE;
return 0; } @@ -690,6 +691,7 @@ static int mv3310_read_status_copper(struct phy_device *phydev)
phydev->duplex = cssr1 & MV_PCS_CSSR1_DUPLEX_FULL ? DUPLEX_FULL : DUPLEX_HALF; + phydev->port = PORT_TP; phydev->mdix = cssr1 & MV_PCS_CSSR1_MDIX ? ETH_TP_MDI_X : ETH_TP_MDI;
diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c index 57f8021b70af..a6c691938f94 100644 --- a/drivers/net/phy/micrel.c +++ b/drivers/net/phy/micrel.c @@ -341,14 +341,19 @@ static int kszphy_config_init(struct phy_device *phydev) return kszphy_config_reset(phydev); }
+static int ksz8041_fiber_mode(struct phy_device *phydev) +{ + struct device_node *of_node = phydev->mdio.dev.of_node; + + return of_property_read_bool(of_node, "micrel,fiber-mode"); +} + static int ksz8041_config_init(struct phy_device *phydev) { __ETHTOOL_DECLARE_LINK_MODE_MASK(mask) = { 0, };
- struct device_node *of_node = phydev->mdio.dev.of_node; - /* Limit supported and advertised modes in fiber mode */ - if (of_property_read_bool(of_node, "micrel,fiber-mode")) { + if (ksz8041_fiber_mode(phydev)) { phydev->dev_flags |= MICREL_PHY_FXEN; linkmode_set_bit(ETHTOOL_LINK_MODE_100baseT_Full_BIT, mask); linkmode_set_bit(ETHTOOL_LINK_MODE_100baseT_Half_BIT, mask); @@ -1176,6 +1181,9 @@ static int kszphy_probe(struct phy_device *phydev) } }
+ if (ksz8041_fiber_mode(phydev)) + phydev->port = PORT_FIBRE; + /* Support legacy board-file configuration */ if (phydev->dev_flags & MICREL_PHY_50MHZ_CLK) { priv->rmii_ref_clk_sel = true; diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c index b79c4068ee61..c93c295db3dc 100644 --- a/drivers/net/phy/phy.c +++ b/drivers/net/phy/phy.c @@ -310,7 +310,7 @@ void phy_ethtool_ksettings_get(struct phy_device *phydev, if (phydev->interface == PHY_INTERFACE_MODE_MOCA) cmd->base.port = PORT_BNC; else - cmd->base.port = PORT_MII; + cmd->base.port = phydev->port; cmd->base.transceiver = phy_is_internal(phydev) ? XCVR_INTERNAL : XCVR_EXTERNAL; cmd->base.phy_address = phydev->mdio.addr; diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c index 1c6ae845e03f..d2fd54e4c612 100644 --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -576,6 +576,7 @@ struct phy_device *phy_device_create(struct mii_bus *bus, int addr, u32 phy_id, dev->pause = 0; dev->asym_pause = 0; dev->link = 0; + dev->port = PORT_TP; dev->interface = PHY_INTERFACE_MODE_GMII;
dev->autoneg = AUTONEG_ENABLE; @@ -1382,6 +1383,14 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
phydev->state = PHY_READY;
+ /* Port is set to PORT_TP by default and the actual PHY driver will set + * it to different value depending on the PHY configuration. If we have + * the generic PHY driver we can't figure it out, thus set the old + * legacy PORT_MII value. + */ + if (using_genphy) + phydev->port = PORT_MII; + /* Initial carrier state is off as the phy is about to be * (re)initialized. */ diff --git a/include/linux/phy.h b/include/linux/phy.h index 9effb511acde..d0e64f3b53b9 100644 --- a/include/linux/phy.h +++ b/include/linux/phy.h @@ -499,6 +499,7 @@ struct macsec_ops; * * @speed: Current link speed * @duplex: Current duplex + * @port: Current port * @pause: Current pause * @asym_pause: Current asymmetric pause * @supported: Combined MAC/PHY supported linkmodes @@ -577,6 +578,7 @@ struct phy_device { */ int speed; int duplex; + int port; int pause; int asym_pause; u8 master_slave_get;
From: Florian Fainelli f.fainelli@gmail.com
[ Upstream commit 133bf7b4fbbe58cff5492e37e95e75c88161f1b8 ]
Avoid a forward declaration by moving the callers of bcm54xx_config_clock_delay() below its body.
Signed-off-by: Florian Fainelli f.fainelli@gmail.com Reviewed-by: Vladimir Oltean olteanv@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/broadcom.c | 74 +++++++++++++++++++------------------- 1 file changed, 36 insertions(+), 38 deletions(-)
diff --git a/drivers/net/phy/broadcom.c b/drivers/net/phy/broadcom.c index 48024ac85980..407626ddcae7 100644 --- a/drivers/net/phy/broadcom.c +++ b/drivers/net/phy/broadcom.c @@ -26,44 +26,6 @@ MODULE_DESCRIPTION("Broadcom PHY driver"); MODULE_AUTHOR("Maciej W. Rozycki"); MODULE_LICENSE("GPL");
-static int bcm54xx_config_clock_delay(struct phy_device *phydev); - -static int bcm54210e_config_init(struct phy_device *phydev) -{ - int val; - - bcm54xx_config_clock_delay(phydev); - - if (phydev->dev_flags & PHY_BRCM_EN_MASTER_MODE) { - val = phy_read(phydev, MII_CTRL1000); - val |= CTL1000_AS_MASTER | CTL1000_ENABLE_MASTER; - phy_write(phydev, MII_CTRL1000, val); - } - - return 0; -} - -static int bcm54612e_config_init(struct phy_device *phydev) -{ - int reg; - - bcm54xx_config_clock_delay(phydev); - - /* Enable CLK125 MUX on LED4 if ref clock is enabled. */ - if (!(phydev->dev_flags & PHY_BRCM_RX_REFCLK_UNUSED)) { - int err; - - reg = bcm_phy_read_exp(phydev, BCM54612E_EXP_SPARE0); - err = bcm_phy_write_exp(phydev, BCM54612E_EXP_SPARE0, - BCM54612E_LED4_CLK125OUT_EN | reg); - - if (err < 0) - return err; - } - - return 0; -} - static int bcm54xx_config_clock_delay(struct phy_device *phydev) { int rc, val; @@ -105,6 +67,42 @@ static int bcm54xx_config_clock_delay(struct phy_device *phydev) return 0; }
+static int bcm54210e_config_init(struct phy_device *phydev) +{ + int val; + + bcm54xx_config_clock_delay(phydev); + + if (phydev->dev_flags & PHY_BRCM_EN_MASTER_MODE) { + val = phy_read(phydev, MII_CTRL1000); + val |= CTL1000_AS_MASTER | CTL1000_ENABLE_MASTER; + phy_write(phydev, MII_CTRL1000, val); + } + + return 0; +} + +static int bcm54612e_config_init(struct phy_device *phydev) +{ + int reg; + + bcm54xx_config_clock_delay(phydev); + + /* Enable CLK125 MUX on LED4 if ref clock is enabled. */ + if (!(phydev->dev_flags & PHY_BRCM_RX_REFCLK_UNUSED)) { + int err; + + reg = bcm_phy_read_exp(phydev, BCM54612E_EXP_SPARE0); + err = bcm_phy_write_exp(phydev, BCM54612E_EXP_SPARE0, + BCM54612E_LED4_CLK125OUT_EN | reg); + + if (err < 0) + return err; + } + + return 0; +} + /* Needs SMDSP clock enabled via bcm54xx_phydsp_config() */ static int bcm50610_a0_workaround(struct phy_device *phydev) {
From: Robert Hancock robert.hancock@calian.com
[ Upstream commit 3afd0218992a8d1398e9791d6c2edd4c948ae7ee ]
The default configuration for the BCM54616S PHY may not match the desired mode when using 1000BaseX or SGMII interface modes, such as when it is on an SFP module. Add code to explicitly set the correct mode using programming sequences provided by Bel-Fuse:
https://www.belfuse.com/resources/datasheets/powersolutions/ds-bps-sfp-1gbt-... https://www.belfuse.com/resources/datasheets/powersolutions/ds-bps-sfp-1gbt-...
Signed-off-by: Robert Hancock robert.hancock@calian.com Acked-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/broadcom.c | 84 ++++++++++++++++++++++++++++++++------ include/linux/brcmphy.h | 4 ++ 2 files changed, 76 insertions(+), 12 deletions(-)
diff --git a/drivers/net/phy/broadcom.c b/drivers/net/phy/broadcom.c index 407626ddcae7..b160186dc766 100644 --- a/drivers/net/phy/broadcom.c +++ b/drivers/net/phy/broadcom.c @@ -103,6 +103,64 @@ static int bcm54612e_config_init(struct phy_device *phydev) return 0; }
+static int bcm54616s_config_init(struct phy_device *phydev) +{ + int rc, val; + + if (phydev->interface != PHY_INTERFACE_MODE_SGMII && + phydev->interface != PHY_INTERFACE_MODE_1000BASEX) + return 0; + + /* Ensure proper interface mode is selected. */ + /* Disable RGMII mode */ + val = bcm54xx_auxctl_read(phydev, MII_BCM54XX_AUXCTL_SHDWSEL_MISC); + if (val < 0) + return val; + val &= ~MII_BCM54XX_AUXCTL_SHDWSEL_MISC_RGMII_EN; + val |= MII_BCM54XX_AUXCTL_MISC_WREN; + rc = bcm54xx_auxctl_write(phydev, MII_BCM54XX_AUXCTL_SHDWSEL_MISC, + val); + if (rc < 0) + return rc; + + /* Select 1000BASE-X register set (primary SerDes) */ + val = bcm_phy_read_shadow(phydev, BCM54XX_SHD_MODE); + if (val < 0) + return val; + val |= BCM54XX_SHD_MODE_1000BX; + rc = bcm_phy_write_shadow(phydev, BCM54XX_SHD_MODE, val); + if (rc < 0) + return rc; + + /* Power down SerDes interface */ + rc = phy_set_bits(phydev, MII_BMCR, BMCR_PDOWN); + if (rc < 0) + return rc; + + /* Select proper interface mode */ + val &= ~BCM54XX_SHD_INTF_SEL_MASK; + val |= phydev->interface == PHY_INTERFACE_MODE_SGMII ? + BCM54XX_SHD_INTF_SEL_SGMII : + BCM54XX_SHD_INTF_SEL_GBIC; + rc = bcm_phy_write_shadow(phydev, BCM54XX_SHD_MODE, val); + if (rc < 0) + return rc; + + /* Power up SerDes interface */ + rc = phy_clear_bits(phydev, MII_BMCR, BMCR_PDOWN); + if (rc < 0) + return rc; + + /* Select copper register set */ + val &= ~BCM54XX_SHD_MODE_1000BX; + rc = bcm_phy_write_shadow(phydev, BCM54XX_SHD_MODE, val); + if (rc < 0) + return rc; + + /* Power up copper interface */ + return phy_clear_bits(phydev, MII_BMCR, BMCR_PDOWN); +} + /* Needs SMDSP clock enabled via bcm54xx_phydsp_config() */ static int bcm50610_a0_workaround(struct phy_device *phydev) { @@ -281,15 +339,17 @@ static int bcm54xx_config_init(struct phy_device *phydev)
bcm54xx_adjust_rxrefclk(phydev);
- if (BRCM_PHY_MODEL(phydev) == PHY_ID_BCM54210E) { + switch (BRCM_PHY_MODEL(phydev)) { + case PHY_ID_BCM54210E: err = bcm54210e_config_init(phydev); - if (err) - return err; - } else if (BRCM_PHY_MODEL(phydev) == PHY_ID_BCM54612E) { + break; + case PHY_ID_BCM54612E: err = bcm54612e_config_init(phydev); - if (err) - return err; - } else if (BRCM_PHY_MODEL(phydev) == PHY_ID_BCM54810) { + break; + case PHY_ID_BCM54616S: + err = bcm54616s_config_init(phydev); + break; + case PHY_ID_BCM54810: /* For BCM54810, we need to disable BroadR-Reach function */ val = bcm_phy_read_exp(phydev, BCM54810_EXP_BROADREACH_LRE_MISC_CTL); @@ -297,9 +357,10 @@ static int bcm54xx_config_init(struct phy_device *phydev) err = bcm_phy_write_exp(phydev, BCM54810_EXP_BROADREACH_LRE_MISC_CTL, val); - if (err < 0) - return err; + break; } + if (err) + return err;
bcm54xx_phydsp_config(phydev);
@@ -478,7 +539,7 @@ static int bcm5481_config_aneg(struct phy_device *phydev)
static int bcm54616s_probe(struct phy_device *phydev) { - int val, intf_sel; + int val;
val = bcm_phy_read_shadow(phydev, BCM54XX_SHD_MODE); if (val < 0) @@ -490,8 +551,7 @@ static int bcm54616s_probe(struct phy_device *phydev) * RGMII-1000Base-X is properly supported, but RGMII-100Base-FX * support is still missing as of now. */ - intf_sel = (val & BCM54XX_SHD_INTF_SEL_MASK) >> 1; - if (intf_sel == 1) { + if ((val & BCM54XX_SHD_INTF_SEL_MASK) == BCM54XX_SHD_INTF_SEL_RGMII) { val = bcm_phy_read_shadow(phydev, BCM54616S_SHD_100FX_CTRL); if (val < 0) return val; diff --git a/include/linux/brcmphy.h b/include/linux/brcmphy.h index d0bd226d6bd9..54665952d6ad 100644 --- a/include/linux/brcmphy.h +++ b/include/linux/brcmphy.h @@ -136,6 +136,7 @@
#define MII_BCM54XX_AUXCTL_SHDWSEL_MISC 0x07 #define MII_BCM54XX_AUXCTL_SHDWSEL_MISC_WIRESPEED_EN 0x0010 +#define MII_BCM54XX_AUXCTL_SHDWSEL_MISC_RGMII_EN 0x0080 #define MII_BCM54XX_AUXCTL_SHDWSEL_MISC_RGMII_SKEW_EN 0x0100 #define MII_BCM54XX_AUXCTL_MISC_FORCE_AMDIX 0x0200 #define MII_BCM54XX_AUXCTL_MISC_WREN 0x8000 @@ -222,6 +223,9 @@ /* 11111: Mode Control Register */ #define BCM54XX_SHD_MODE 0x1f #define BCM54XX_SHD_INTF_SEL_MASK GENMASK(2, 1) /* INTERF_SEL[1:0] */ +#define BCM54XX_SHD_INTF_SEL_RGMII 0x02 +#define BCM54XX_SHD_INTF_SEL_SGMII 0x04 +#define BCM54XX_SHD_INTF_SEL_GBIC 0x06 #define BCM54XX_SHD_MODE_1000BX BIT(0) /* Enable 1000-X registers */
/*
From: Florian Fainelli f.fainelli@gmail.com
[ Upstream commit b1dd9bf688b0dcc5a34dca660de46c7570bd9243 ]
The PHY driver entry for BCM50160 and BCM50610M calls bcm54xx_config_init() but does not call bcm54xx_config_clock_delay() in order to configuration appropriate clock delays on the PHY, fix that.
Fixes: 733336262b28 ("net: phy: Allow BCM5481x PHYs to setup internal TX/RX clock delay") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/broadcom.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/net/phy/broadcom.c b/drivers/net/phy/broadcom.c index b160186dc766..07a98c324942 100644 --- a/drivers/net/phy/broadcom.c +++ b/drivers/net/phy/broadcom.c @@ -340,6 +340,10 @@ static int bcm54xx_config_init(struct phy_device *phydev) bcm54xx_adjust_rxrefclk(phydev);
switch (BRCM_PHY_MODEL(phydev)) { + case PHY_ID_BCM50610: + case PHY_ID_BCM50610M: + err = bcm54xx_config_clock_delay(phydev); + break; case PHY_ID_BCM54210E: err = bcm54210e_config_init(phydev); break;
From: Mark Tomlinson mark.tomlinson@alliedtelesis.co.nz
[ Upstream commit d3d40f237480abf3268956daf18cdc56edd32834 ]
This reverts commit cc00bcaa589914096edef7fb87ca5cee4a166b5c.
This (and the preceding) patch basically re-implemented the RCU mechanisms of patch 784544739a25. That patch was replaced because of the performance problems that it created when replacing tables. Now, we have the same issue: the call to synchronize_rcu() makes replacing tables slower by as much as an order of magnitude.
Prior to using RCU a script calling "iptables" approx. 200 times was taking 1.16s. With RCU this increased to 11.59s.
Revert these patches and fix the issue in a different way.
Signed-off-by: Mark Tomlinson mark.tomlinson@alliedtelesis.co.nz Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/netfilter/x_tables.h | 5 +-- net/ipv4/netfilter/arp_tables.c | 14 ++++----- net/ipv4/netfilter/ip_tables.c | 14 ++++----- net/ipv6/netfilter/ip6_tables.c | 14 ++++----- net/netfilter/x_tables.c | 49 +++++++++++++++++++++--------- 5 files changed, 56 insertions(+), 40 deletions(-)
diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h index 8ebb64193757..5deb099d156d 100644 --- a/include/linux/netfilter/x_tables.h +++ b/include/linux/netfilter/x_tables.h @@ -227,7 +227,7 @@ struct xt_table { unsigned int valid_hooks;
/* Man behind the curtain... */ - struct xt_table_info __rcu *private; + struct xt_table_info *private;
/* Set this to THIS_MODULE if you are a module, otherwise NULL */ struct module *me; @@ -448,9 +448,6 @@ xt_get_per_cpu_counter(struct xt_counters *cnt, unsigned int cpu)
struct nf_hook_ops *xt_hook_ops_alloc(const struct xt_table *, nf_hookfn *);
-struct xt_table_info -*xt_table_get_private_protected(const struct xt_table *table); - #ifdef CONFIG_COMPAT #include <net/compat.h>
diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c index c576a63d09db..04a2010755a6 100644 --- a/net/ipv4/netfilter/arp_tables.c +++ b/net/ipv4/netfilter/arp_tables.c @@ -203,7 +203,7 @@ unsigned int arpt_do_table(struct sk_buff *skb,
local_bh_disable(); addend = xt_write_recseq_begin(); - private = rcu_access_pointer(table->private); + private = READ_ONCE(table->private); /* Address dependency. */ cpu = smp_processor_id(); table_base = private->entries; jumpstack = (struct arpt_entry **)private->jumpstack[cpu]; @@ -649,7 +649,7 @@ static struct xt_counters *alloc_counters(const struct xt_table *table) { unsigned int countersize; struct xt_counters *counters; - const struct xt_table_info *private = xt_table_get_private_protected(table); + const struct xt_table_info *private = table->private;
/* We need atomic snapshot of counters: rest doesn't change * (other than comefrom, which userspace doesn't care @@ -673,7 +673,7 @@ static int copy_entries_to_user(unsigned int total_size, unsigned int off, num; const struct arpt_entry *e; struct xt_counters *counters; - struct xt_table_info *private = xt_table_get_private_protected(table); + struct xt_table_info *private = table->private; int ret = 0; void *loc_cpu_entry;
@@ -807,7 +807,7 @@ static int get_info(struct net *net, void __user *user, const int *len) t = xt_request_find_table_lock(net, NFPROTO_ARP, name); if (!IS_ERR(t)) { struct arpt_getinfo info; - const struct xt_table_info *private = xt_table_get_private_protected(t); + const struct xt_table_info *private = t->private; #ifdef CONFIG_COMPAT struct xt_table_info tmp;
@@ -860,7 +860,7 @@ static int get_entries(struct net *net, struct arpt_get_entries __user *uptr,
t = xt_find_table_lock(net, NFPROTO_ARP, get.name); if (!IS_ERR(t)) { - const struct xt_table_info *private = xt_table_get_private_protected(t); + const struct xt_table_info *private = t->private;
if (get.size == private->size) ret = copy_entries_to_user(private->size, @@ -1017,7 +1017,7 @@ static int do_add_counters(struct net *net, sockptr_t arg, unsigned int len) }
local_bh_disable(); - private = xt_table_get_private_protected(t); + private = t->private; if (private->number != tmp.num_counters) { ret = -EINVAL; goto unlock_up_free; @@ -1330,7 +1330,7 @@ static int compat_copy_entries_to_user(unsigned int total_size, void __user *userptr) { struct xt_counters *counters; - const struct xt_table_info *private = xt_table_get_private_protected(table); + const struct xt_table_info *private = table->private; void __user *pos; unsigned int size; int ret = 0; diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c index e8f6f9d86237..a5b63f92b7f3 100644 --- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -258,7 +258,7 @@ ipt_do_table(struct sk_buff *skb, WARN_ON(!(table->valid_hooks & (1 << hook))); local_bh_disable(); addend = xt_write_recseq_begin(); - private = rcu_access_pointer(table->private); + private = READ_ONCE(table->private); /* Address dependency. */ cpu = smp_processor_id(); table_base = private->entries; jumpstack = (struct ipt_entry **)private->jumpstack[cpu]; @@ -791,7 +791,7 @@ static struct xt_counters *alloc_counters(const struct xt_table *table) { unsigned int countersize; struct xt_counters *counters; - const struct xt_table_info *private = xt_table_get_private_protected(table); + const struct xt_table_info *private = table->private;
/* We need atomic snapshot of counters: rest doesn't change (other than comefrom, which userspace doesn't care @@ -815,7 +815,7 @@ copy_entries_to_user(unsigned int total_size, unsigned int off, num; const struct ipt_entry *e; struct xt_counters *counters; - const struct xt_table_info *private = xt_table_get_private_protected(table); + const struct xt_table_info *private = table->private; int ret = 0; const void *loc_cpu_entry;
@@ -964,7 +964,7 @@ static int get_info(struct net *net, void __user *user, const int *len) t = xt_request_find_table_lock(net, AF_INET, name); if (!IS_ERR(t)) { struct ipt_getinfo info; - const struct xt_table_info *private = xt_table_get_private_protected(t); + const struct xt_table_info *private = t->private; #ifdef CONFIG_COMPAT struct xt_table_info tmp;
@@ -1018,7 +1018,7 @@ get_entries(struct net *net, struct ipt_get_entries __user *uptr,
t = xt_find_table_lock(net, AF_INET, get.name); if (!IS_ERR(t)) { - const struct xt_table_info *private = xt_table_get_private_protected(t); + const struct xt_table_info *private = t->private; if (get.size == private->size) ret = copy_entries_to_user(private->size, t, uptr->entrytable); @@ -1173,7 +1173,7 @@ do_add_counters(struct net *net, sockptr_t arg, unsigned int len) }
local_bh_disable(); - private = xt_table_get_private_protected(t); + private = t->private; if (private->number != tmp.num_counters) { ret = -EINVAL; goto unlock_up_free; @@ -1543,7 +1543,7 @@ compat_copy_entries_to_user(unsigned int total_size, struct xt_table *table, void __user *userptr) { struct xt_counters *counters; - const struct xt_table_info *private = xt_table_get_private_protected(table); + const struct xt_table_info *private = table->private; void __user *pos; unsigned int size; int ret = 0; diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c index 0d453fa9e327..81c042940b21 100644 --- a/net/ipv6/netfilter/ip6_tables.c +++ b/net/ipv6/netfilter/ip6_tables.c @@ -280,7 +280,7 @@ ip6t_do_table(struct sk_buff *skb,
local_bh_disable(); addend = xt_write_recseq_begin(); - private = rcu_access_pointer(table->private); + private = READ_ONCE(table->private); /* Address dependency. */ cpu = smp_processor_id(); table_base = private->entries; jumpstack = (struct ip6t_entry **)private->jumpstack[cpu]; @@ -807,7 +807,7 @@ static struct xt_counters *alloc_counters(const struct xt_table *table) { unsigned int countersize; struct xt_counters *counters; - const struct xt_table_info *private = xt_table_get_private_protected(table); + const struct xt_table_info *private = table->private;
/* We need atomic snapshot of counters: rest doesn't change (other than comefrom, which userspace doesn't care @@ -831,7 +831,7 @@ copy_entries_to_user(unsigned int total_size, unsigned int off, num; const struct ip6t_entry *e; struct xt_counters *counters; - const struct xt_table_info *private = xt_table_get_private_protected(table); + const struct xt_table_info *private = table->private; int ret = 0; const void *loc_cpu_entry;
@@ -980,7 +980,7 @@ static int get_info(struct net *net, void __user *user, const int *len) t = xt_request_find_table_lock(net, AF_INET6, name); if (!IS_ERR(t)) { struct ip6t_getinfo info; - const struct xt_table_info *private = xt_table_get_private_protected(t); + const struct xt_table_info *private = t->private; #ifdef CONFIG_COMPAT struct xt_table_info tmp;
@@ -1035,7 +1035,7 @@ get_entries(struct net *net, struct ip6t_get_entries __user *uptr,
t = xt_find_table_lock(net, AF_INET6, get.name); if (!IS_ERR(t)) { - struct xt_table_info *private = xt_table_get_private_protected(t); + struct xt_table_info *private = t->private; if (get.size == private->size) ret = copy_entries_to_user(private->size, t, uptr->entrytable); @@ -1189,7 +1189,7 @@ do_add_counters(struct net *net, sockptr_t arg, unsigned int len) }
local_bh_disable(); - private = xt_table_get_private_protected(t); + private = t->private; if (private->number != tmp.num_counters) { ret = -EINVAL; goto unlock_up_free; @@ -1552,7 +1552,7 @@ compat_copy_entries_to_user(unsigned int total_size, struct xt_table *table, void __user *userptr) { struct xt_counters *counters; - const struct xt_table_info *private = xt_table_get_private_protected(table); + const struct xt_table_info *private = table->private; void __user *pos; unsigned int size; int ret = 0; diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c index bce6ca203d46..7df3aef39c5c 100644 --- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -1351,14 +1351,6 @@ struct xt_counters *xt_counters_alloc(unsigned int counters) } EXPORT_SYMBOL(xt_counters_alloc);
-struct xt_table_info -*xt_table_get_private_protected(const struct xt_table *table) -{ - return rcu_dereference_protected(table->private, - mutex_is_locked(&xt[table->af].mutex)); -} -EXPORT_SYMBOL(xt_table_get_private_protected); - struct xt_table_info * xt_replace_table(struct xt_table *table, unsigned int num_counters, @@ -1366,6 +1358,7 @@ xt_replace_table(struct xt_table *table, int *error) { struct xt_table_info *private; + unsigned int cpu; int ret;
ret = xt_jumpstack_alloc(newinfo); @@ -1375,20 +1368,47 @@ xt_replace_table(struct xt_table *table, }
/* Do the substitution. */ - private = xt_table_get_private_protected(table); + local_bh_disable(); + private = table->private;
/* Check inside lock: is the old number correct? */ if (num_counters != private->number) { pr_debug("num_counters != table->private->number (%u/%u)\n", num_counters, private->number); + local_bh_enable(); *error = -EAGAIN; return NULL; }
newinfo->initial_entries = private->initial_entries; + /* + * Ensure contents of newinfo are visible before assigning to + * private. + */ + smp_wmb(); + table->private = newinfo; + + /* make sure all cpus see new ->private value */ + smp_wmb();
- rcu_assign_pointer(table->private, newinfo); - synchronize_rcu(); + /* + * Even though table entries have now been swapped, other CPU's + * may still be using the old entries... + */ + local_bh_enable(); + + /* ... so wait for even xt_recseq on all cpus */ + for_each_possible_cpu(cpu) { + seqcount_t *s = &per_cpu(xt_recseq, cpu); + u32 seq = raw_read_seqcount(s); + + if (seq & 1) { + do { + cond_resched(); + cpu_relax(); + } while (seq == raw_read_seqcount(s)); + } + }
audit_log_nfcfg(table->name, table->af, private->number, !private->number ? AUDIT_XT_OP_REGISTER : @@ -1424,12 +1444,12 @@ struct xt_table *xt_register_table(struct net *net, }
/* Simplifies replace_table code. */ - rcu_assign_pointer(table->private, bootstrap); + table->private = bootstrap;
if (!xt_replace_table(table, 0, newinfo, &ret)) goto unlock;
- private = xt_table_get_private_protected(table); + private = table->private; pr_debug("table->private->number = %u\n", private->number);
/* save number of initial entries */ @@ -1452,8 +1472,7 @@ void *xt_unregister_table(struct xt_table *table) struct xt_table_info *private;
mutex_lock(&xt[table->af].mutex); - private = xt_table_get_private_protected(table); - RCU_INIT_POINTER(table->private, NULL); + private = table->private; list_del(&table->list); mutex_unlock(&xt[table->af].mutex); audit_log_nfcfg(table->name, table->af, private->number,
From: Mark Tomlinson mark.tomlinson@alliedtelesis.co.nz
[ Upstream commit 175e476b8cdf2a4de7432583b49c871345e4f8a1 ]
When a new table value was assigned, it was followed by a write memory barrier. This ensured that all writes before this point would complete before any writes after this point. However, to determine whether the rules are unused, the sequence counter is read. To ensure that all writes have been done before these reads, a full memory barrier is needed, not just a write memory barrier. The same argument applies when incrementing the counter, before the rules are read.
Changing to using smp_mb() instead of smp_wmb() fixes the kernel panic reported in cc00bcaa5899 (which is still present), while still maintaining the same speed of replacing tables.
The smb_mb() barriers potentially slow the packet path, however testing has shown no measurable change in performance on a 4-core MIPS64 platform.
Fixes: 7f5c6d4f665b ("netfilter: get rid of atomic ops in fast path") Signed-off-by: Mark Tomlinson mark.tomlinson@alliedtelesis.co.nz Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/netfilter/x_tables.h | 2 +- net/netfilter/x_tables.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h index 5deb099d156d..8ec48466410a 100644 --- a/include/linux/netfilter/x_tables.h +++ b/include/linux/netfilter/x_tables.h @@ -376,7 +376,7 @@ static inline unsigned int xt_write_recseq_begin(void) * since addend is most likely 1 */ __this_cpu_add(xt_recseq.sequence, addend); - smp_wmb(); + smp_mb();
return addend; } diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c index 7df3aef39c5c..6bd31a7a27fc 100644 --- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -1389,7 +1389,7 @@ xt_replace_table(struct xt_table *table, table->private = newinfo;
/* make sure all cpus see new ->private value */ - smp_wmb(); + smp_mb();
/* * Even though table entries have now been swapped, other CPU's
From: Alexei Starovoitov ast@kernel.org
[ Upstream commit e21aa341785c679dd409c8cb71f864c00fe6c463 ]
The fexit/fmod_ret programs can be attached to kernel functions that can sleep. The synchronize_rcu_tasks() will not wait for such tasks to complete. In such case the trampoline image will be freed and when the task wakes up the return IP will point to freed memory causing the crash. Solve this by adding percpu_ref_get/put for the duration of trampoline and separate trampoline vs its image life times. The "half page" optimization has to be removed, since first_half->second_half->first_half transition cannot be guaranteed to complete in deterministic time. Every trampoline update becomes a new image. The image with fmod_ret or fexit progs will be freed via percpu_ref_kill and call_rcu_tasks. Together they will wait for the original function and trampoline asm to complete. The trampoline is patched from nop to jmp to skip fexit progs. They are freed independently from the trampoline. The image with fentry progs only will be freed via call_rcu_tasks_trace+call_rcu_tasks which will wait for both sleepable and non-sleepable progs to complete.
Fixes: fec56f5890d9 ("bpf: Introduce BPF trampoline") Reported-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Paul E. McKenney paulmck@kernel.org # for RCU Link: https://lore.kernel.org/bpf/20210316210007.38949-1-alexei.starovoitov@gmail.... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/net/bpf_jit_comp.c | 26 ++++- include/linux/bpf.h | 24 +++- kernel/bpf/bpf_struct_ops.c | 2 +- kernel/bpf/core.c | 4 +- kernel/bpf/trampoline.c | 218 +++++++++++++++++++++++++++--------- 5 files changed, 213 insertions(+), 61 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index 796506dcfc42..652bd64e422d 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -1735,7 +1735,7 @@ static int invoke_bpf_mod_ret(const struct btf_func_model *m, u8 **pprog, * add rsp, 8 // skip eth_type_trans's frame * ret // return to its caller */ -int arch_prepare_bpf_trampoline(void *image, void *image_end, +int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *image_end, const struct btf_func_model *m, u32 flags, struct bpf_tramp_progs *tprogs, void *orig_call) @@ -1774,6 +1774,15 @@ int arch_prepare_bpf_trampoline(void *image, void *image_end,
save_regs(m, &prog, nr_args, stack_size);
+ if (flags & BPF_TRAMP_F_CALL_ORIG) { + /* arg1: mov rdi, im */ + emit_mov_imm64(&prog, BPF_REG_1, (long) im >> 32, (u32) (long) im); + if (emit_call(&prog, __bpf_tramp_enter, prog)) { + ret = -EINVAL; + goto cleanup; + } + } + if (fentry->nr_progs) if (invoke_bpf(m, &prog, fentry, stack_size)) return -EINVAL; @@ -1792,8 +1801,7 @@ int arch_prepare_bpf_trampoline(void *image, void *image_end, }
if (flags & BPF_TRAMP_F_CALL_ORIG) { - if (fentry->nr_progs || fmod_ret->nr_progs) - restore_regs(m, &prog, nr_args, stack_size); + restore_regs(m, &prog, nr_args, stack_size);
/* call original function */ if (emit_call(&prog, orig_call, prog)) { @@ -1802,6 +1810,8 @@ int arch_prepare_bpf_trampoline(void *image, void *image_end, } /* remember return value in a stack for bpf prog to access */ emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -8); + im->ip_after_call = prog; + emit_nops(&prog, 5); }
if (fmod_ret->nr_progs) { @@ -1832,9 +1842,17 @@ int arch_prepare_bpf_trampoline(void *image, void *image_end, * the return value is only updated on the stack and still needs to be * restored to R0. */ - if (flags & BPF_TRAMP_F_CALL_ORIG) + if (flags & BPF_TRAMP_F_CALL_ORIG) { + im->ip_epilogue = prog; + /* arg1: mov rdi, im */ + emit_mov_imm64(&prog, BPF_REG_1, (long) im >> 32, (u32) (long) im); + if (emit_call(&prog, __bpf_tramp_exit, prog)) { + ret = -EINVAL; + goto cleanup; + } /* restore original return value back into RAX */ emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, -8); + }
EMIT1(0x5B); /* pop rbx */ EMIT1(0xC9); /* leave */ diff --git a/include/linux/bpf.h b/include/linux/bpf.h index f4242865cc86..564ebf91793e 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -22,6 +22,7 @@ #include <linux/capability.h> #include <linux/sched/mm.h> #include <linux/slab.h> +#include <linux/percpu-refcount.h>
struct bpf_verifier_env; struct bpf_verifier_log; @@ -563,7 +564,8 @@ struct bpf_tramp_progs { * fentry = a set of program to run before calling original function * fexit = a set of program to run after original function */ -int arch_prepare_bpf_trampoline(void *image, void *image_end, +struct bpf_tramp_image; +int arch_prepare_bpf_trampoline(struct bpf_tramp_image *tr, void *image, void *image_end, const struct btf_func_model *m, u32 flags, struct bpf_tramp_progs *tprogs, void *orig_call); @@ -572,6 +574,8 @@ u64 notrace __bpf_prog_enter(void); void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start); void notrace __bpf_prog_enter_sleepable(void); void notrace __bpf_prog_exit_sleepable(void); +void notrace __bpf_tramp_enter(struct bpf_tramp_image *tr); +void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr);
struct bpf_ksym { unsigned long start; @@ -590,6 +594,18 @@ enum bpf_tramp_prog_type { BPF_TRAMP_REPLACE, /* more than MAX */ };
+struct bpf_tramp_image { + void *image; + struct bpf_ksym ksym; + struct percpu_ref pcref; + void *ip_after_call; + void *ip_epilogue; + union { + struct rcu_head rcu; + struct work_struct work; + }; +}; + struct bpf_trampoline { /* hlist for trampoline_table */ struct hlist_node hlist; @@ -612,9 +628,8 @@ struct bpf_trampoline { /* Number of attached programs. A counter per kind. */ int progs_cnt[BPF_TRAMP_MAX]; /* Executable image of trampoline */ - void *image; + struct bpf_tramp_image *cur_image; u64 selector; - struct bpf_ksym ksym; };
struct bpf_attach_target_info { @@ -698,6 +713,8 @@ void bpf_image_ksym_add(void *data, struct bpf_ksym *ksym); void bpf_image_ksym_del(struct bpf_ksym *ksym); void bpf_ksym_add(struct bpf_ksym *ksym); void bpf_ksym_del(struct bpf_ksym *ksym); +int bpf_jit_charge_modmem(u32 pages); +void bpf_jit_uncharge_modmem(u32 pages); #else static inline int bpf_trampoline_link_prog(struct bpf_prog *prog, struct bpf_trampoline *tr) @@ -788,7 +805,6 @@ struct bpf_prog_aux { bool func_proto_unreliable; bool sleepable; bool tail_call_reachable; - enum bpf_tramp_prog_type trampoline_prog_type; struct hlist_node tramp_hlist; /* BTF_KIND_FUNC_PROTO for valid attach_btf_id */ const struct btf_type *attach_func_proto; diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c index 1a666a975416..70f6fd4fa305 100644 --- a/kernel/bpf/bpf_struct_ops.c +++ b/kernel/bpf/bpf_struct_ops.c @@ -430,7 +430,7 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
tprogs[BPF_TRAMP_FENTRY].progs[0] = prog; tprogs[BPF_TRAMP_FENTRY].nr_progs = 1; - err = arch_prepare_bpf_trampoline(image, + err = arch_prepare_bpf_trampoline(NULL, image, st_map->image + PAGE_SIZE, &st_ops->func_models[i], 0, tprogs, NULL); diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 261f8692d0d2..1de87fcaeabd 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -817,7 +817,7 @@ static int __init bpf_jit_charge_init(void) } pure_initcall(bpf_jit_charge_init);
-static int bpf_jit_charge_modmem(u32 pages) +int bpf_jit_charge_modmem(u32 pages) { if (atomic_long_add_return(pages, &bpf_jit_current) > (bpf_jit_limit >> PAGE_SHIFT)) { @@ -830,7 +830,7 @@ static int bpf_jit_charge_modmem(u32 pages) return 0; }
-static void bpf_jit_uncharge_modmem(u32 pages) +void bpf_jit_uncharge_modmem(u32 pages) { atomic_long_sub(pages, &bpf_jit_current); } diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c index 35c5887d82ff..986dabc3d11f 100644 --- a/kernel/bpf/trampoline.c +++ b/kernel/bpf/trampoline.c @@ -57,19 +57,10 @@ void bpf_image_ksym_del(struct bpf_ksym *ksym) PAGE_SIZE, true, ksym->name); }
-static void bpf_trampoline_ksym_add(struct bpf_trampoline *tr) -{ - struct bpf_ksym *ksym = &tr->ksym; - - snprintf(ksym->name, KSYM_NAME_LEN, "bpf_trampoline_%llu", tr->key); - bpf_image_ksym_add(tr->image, ksym); -} - static struct bpf_trampoline *bpf_trampoline_lookup(u64 key) { struct bpf_trampoline *tr; struct hlist_head *head; - void *image; int i;
mutex_lock(&trampoline_mutex); @@ -84,14 +75,6 @@ static struct bpf_trampoline *bpf_trampoline_lookup(u64 key) if (!tr) goto out;
- /* is_root was checked earlier. No need for bpf_jit_charge_modmem() */ - image = bpf_jit_alloc_exec_page(); - if (!image) { - kfree(tr); - tr = NULL; - goto out; - } - tr->key = key; INIT_HLIST_NODE(&tr->hlist); hlist_add_head(&tr->hlist, head); @@ -99,9 +82,6 @@ static struct bpf_trampoline *bpf_trampoline_lookup(u64 key) mutex_init(&tr->mutex); for (i = 0; i < BPF_TRAMP_MAX; i++) INIT_HLIST_HEAD(&tr->progs_hlist[i]); - tr->image = image; - INIT_LIST_HEAD_RCU(&tr->ksym.lnode); - bpf_trampoline_ksym_add(tr); out: mutex_unlock(&trampoline_mutex); return tr; @@ -185,10 +165,142 @@ bpf_trampoline_get_progs(const struct bpf_trampoline *tr, int *total) return tprogs; }
+static void __bpf_tramp_image_put_deferred(struct work_struct *work) +{ + struct bpf_tramp_image *im; + + im = container_of(work, struct bpf_tramp_image, work); + bpf_image_ksym_del(&im->ksym); + bpf_jit_free_exec(im->image); + bpf_jit_uncharge_modmem(1); + percpu_ref_exit(&im->pcref); + kfree_rcu(im, rcu); +} + +/* callback, fexit step 3 or fentry step 2 */ +static void __bpf_tramp_image_put_rcu(struct rcu_head *rcu) +{ + struct bpf_tramp_image *im; + + im = container_of(rcu, struct bpf_tramp_image, rcu); + INIT_WORK(&im->work, __bpf_tramp_image_put_deferred); + schedule_work(&im->work); +} + +/* callback, fexit step 2. Called after percpu_ref_kill confirms. */ +static void __bpf_tramp_image_release(struct percpu_ref *pcref) +{ + struct bpf_tramp_image *im; + + im = container_of(pcref, struct bpf_tramp_image, pcref); + call_rcu_tasks(&im->rcu, __bpf_tramp_image_put_rcu); +} + +/* callback, fexit or fentry step 1 */ +static void __bpf_tramp_image_put_rcu_tasks(struct rcu_head *rcu) +{ + struct bpf_tramp_image *im; + + im = container_of(rcu, struct bpf_tramp_image, rcu); + if (im->ip_after_call) + /* the case of fmod_ret/fexit trampoline and CONFIG_PREEMPTION=y */ + percpu_ref_kill(&im->pcref); + else + /* the case of fentry trampoline */ + call_rcu_tasks(&im->rcu, __bpf_tramp_image_put_rcu); +} + +static void bpf_tramp_image_put(struct bpf_tramp_image *im) +{ + /* The trampoline image that calls original function is using: + * rcu_read_lock_trace to protect sleepable bpf progs + * rcu_read_lock to protect normal bpf progs + * percpu_ref to protect trampoline itself + * rcu tasks to protect trampoline asm not covered by percpu_ref + * (which are few asm insns before __bpf_tramp_enter and + * after __bpf_tramp_exit) + * + * The trampoline is unreachable before bpf_tramp_image_put(). + * + * First, patch the trampoline to avoid calling into fexit progs. + * The progs will be freed even if the original function is still + * executing or sleeping. + * In case of CONFIG_PREEMPT=y use call_rcu_tasks() to wait on + * first few asm instructions to execute and call into + * __bpf_tramp_enter->percpu_ref_get. + * Then use percpu_ref_kill to wait for the trampoline and the original + * function to finish. + * Then use call_rcu_tasks() to make sure few asm insns in + * the trampoline epilogue are done as well. + * + * In !PREEMPT case the task that got interrupted in the first asm + * insns won't go through an RCU quiescent state which the + * percpu_ref_kill will be waiting for. Hence the first + * call_rcu_tasks() is not necessary. + */ + if (im->ip_after_call) { + int err = bpf_arch_text_poke(im->ip_after_call, BPF_MOD_JUMP, + NULL, im->ip_epilogue); + WARN_ON(err); + if (IS_ENABLED(CONFIG_PREEMPTION)) + call_rcu_tasks(&im->rcu, __bpf_tramp_image_put_rcu_tasks); + else + percpu_ref_kill(&im->pcref); + return; + } + + /* The trampoline without fexit and fmod_ret progs doesn't call original + * function and doesn't use percpu_ref. + * Use call_rcu_tasks_trace() to wait for sleepable progs to finish. + * Then use call_rcu_tasks() to wait for the rest of trampoline asm + * and normal progs. + */ + call_rcu_tasks_trace(&im->rcu, __bpf_tramp_image_put_rcu_tasks); +} + +static struct bpf_tramp_image *bpf_tramp_image_alloc(u64 key, u32 idx) +{ + struct bpf_tramp_image *im; + struct bpf_ksym *ksym; + void *image; + int err = -ENOMEM; + + im = kzalloc(sizeof(*im), GFP_KERNEL); + if (!im) + goto out; + + err = bpf_jit_charge_modmem(1); + if (err) + goto out_free_im; + + err = -ENOMEM; + im->image = image = bpf_jit_alloc_exec_page(); + if (!image) + goto out_uncharge; + + err = percpu_ref_init(&im->pcref, __bpf_tramp_image_release, 0, GFP_KERNEL); + if (err) + goto out_free_image; + + ksym = &im->ksym; + INIT_LIST_HEAD_RCU(&ksym->lnode); + snprintf(ksym->name, KSYM_NAME_LEN, "bpf_trampoline_%llu_%u", key, idx); + bpf_image_ksym_add(image, ksym); + return im; + +out_free_image: + bpf_jit_free_exec(im->image); +out_uncharge: + bpf_jit_uncharge_modmem(1); +out_free_im: + kfree(im); +out: + return ERR_PTR(err); +} + static int bpf_trampoline_update(struct bpf_trampoline *tr) { - void *old_image = tr->image + ((tr->selector + 1) & 1) * PAGE_SIZE/2; - void *new_image = tr->image + (tr->selector & 1) * PAGE_SIZE/2; + struct bpf_tramp_image *im; struct bpf_tramp_progs *tprogs; u32 flags = BPF_TRAMP_F_RESTORE_REGS; int err, total; @@ -198,41 +310,42 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr) return PTR_ERR(tprogs);
if (total == 0) { - err = unregister_fentry(tr, old_image); + err = unregister_fentry(tr, tr->cur_image->image); + bpf_tramp_image_put(tr->cur_image); + tr->cur_image = NULL; tr->selector = 0; goto out; }
+ im = bpf_tramp_image_alloc(tr->key, tr->selector); + if (IS_ERR(im)) { + err = PTR_ERR(im); + goto out; + } + if (tprogs[BPF_TRAMP_FEXIT].nr_progs || tprogs[BPF_TRAMP_MODIFY_RETURN].nr_progs) flags = BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_SKIP_FRAME;
- /* Though the second half of trampoline page is unused a task could be - * preempted in the middle of the first half of trampoline and two - * updates to trampoline would change the code from underneath the - * preempted task. Hence wait for tasks to voluntarily schedule or go - * to userspace. - * The same trampoline can hold both sleepable and non-sleepable progs. - * synchronize_rcu_tasks_trace() is needed to make sure all sleepable - * programs finish executing. - * Wait for these two grace periods together. - */ - synchronize_rcu_mult(call_rcu_tasks, call_rcu_tasks_trace); - - err = arch_prepare_bpf_trampoline(new_image, new_image + PAGE_SIZE / 2, + err = arch_prepare_bpf_trampoline(im, im->image, im->image + PAGE_SIZE, &tr->func.model, flags, tprogs, tr->func.addr); if (err < 0) goto out;
- if (tr->selector) + WARN_ON(tr->cur_image && tr->selector == 0); + WARN_ON(!tr->cur_image && tr->selector); + if (tr->cur_image) /* progs already running at this address */ - err = modify_fentry(tr, old_image, new_image); + err = modify_fentry(tr, tr->cur_image->image, im->image); else /* first time registering */ - err = register_fentry(tr, new_image); + err = register_fentry(tr, im->image); if (err) goto out; + if (tr->cur_image) + bpf_tramp_image_put(tr->cur_image); + tr->cur_image = im; tr->selector++; out: kfree(tprogs); @@ -364,17 +477,12 @@ void bpf_trampoline_put(struct bpf_trampoline *tr) goto out; if (WARN_ON_ONCE(!hlist_empty(&tr->progs_hlist[BPF_TRAMP_FEXIT]))) goto out; - bpf_image_ksym_del(&tr->ksym); - /* This code will be executed when all bpf progs (both sleepable and - * non-sleepable) went through - * bpf_prog_put()->call_rcu[_tasks_trace]()->bpf_prog_free_deferred(). - * Hence no need for another synchronize_rcu_tasks_trace() here, - * but synchronize_rcu_tasks() is still needed, since trampoline - * may not have had any sleepable programs and we need to wait - * for tasks to get out of trampoline code before freeing it. + /* This code will be executed even when the last bpf_tramp_image + * is alive. All progs are detached from the trampoline and the + * trampoline image is patched with jmp into epilogue to skip + * fexit progs. The fentry-only trampoline will be freed via + * multiple rcu callbacks. */ - synchronize_rcu_tasks(); - bpf_jit_free_exec(tr->image); hlist_del(&tr->hlist); kfree(tr); out: @@ -433,8 +541,18 @@ void notrace __bpf_prog_exit_sleepable(void) rcu_read_unlock_trace(); }
+void notrace __bpf_tramp_enter(struct bpf_tramp_image *tr) +{ + percpu_ref_get(&tr->pcref); +} + +void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr) +{ + percpu_ref_put(&tr->pcref); +} + int __weak -arch_prepare_bpf_trampoline(void *image, void *image_end, +arch_prepare_bpf_trampoline(struct bpf_tramp_image *tr, void *image, void *image_end, const struct btf_func_model *m, u32 flags, struct bpf_tramp_progs *tprogs, void *orig_call)
From: Stanislav Fomichev sdf@google.com
[ Upstream commit b9082970478009b778aa9b22d5561eef35b53b63 ]
__bpf_arch_text_poke does rewrite only for atomic nop5, emit_nops(xxx, 5) emits non-atomic one which breaks fentry/fexit with k8 atomics:
P6_NOP5 == P6_NOP5_ATOMIC (0f1f440000 == 0f1f440000) K8_NOP5 != K8_NOP5_ATOMIC (6666906690 != 6666666690)
Can be reproduced by doing "ideal_nops = k8_nops" in "arch_init_ideal_nops() and running fexit_bpf2bpf selftest.
Fixes: e21aa341785c ("bpf: Fix fexit trampoline.") Signed-off-by: Stanislav Fomichev sdf@google.com Signed-off-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/bpf/20210320000001.915366-1-sdf@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/net/bpf_jit_comp.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index 652bd64e422d..023ac12f54a2 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -1811,7 +1811,8 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i /* remember return value in a stack for bpf prog to access */ emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -8); im->ip_after_call = prog; - emit_nops(&prog, 5); + memcpy(prog, ideal_nops[NOP_ATOMIC5], X86_PATCH_SIZE); + prog += X86_PATCH_SIZE; }
if (fmod_ret->nr_progs) {
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 35471138a9f7193482a2019e39643f575f8098dc ]
Cleanup create_attributes_level_sysfs_files():
1. There is no need to call sysfs_remove_file() on error, sysman_init() will already call release_attributes_data() on failure which already does this.
2. There is no need for the pr_debug() calls sysfs_create_file() should never fail and if it does it will already complain about the problem itself.
Fixes: e8a60aa7404b ("platform/x86: Introduce support for Systems Management Driver over WMI for Dell Systems") Cc: Divya Bharathi Divya_Bharathi@dell.com Cc: Mario Limonciello mario.limonciello@dell.com Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20210321115901.35072-8-hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/dell-wmi-sysman/sysman.c | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-)
diff --git a/drivers/platform/x86/dell-wmi-sysman/sysman.c b/drivers/platform/x86/dell-wmi-sysman/sysman.c index 5dd9b29d939c..7410ccae650c 100644 --- a/drivers/platform/x86/dell-wmi-sysman/sysman.c +++ b/drivers/platform/x86/dell-wmi-sysman/sysman.c @@ -210,19 +210,17 @@ static struct kobj_attribute pending_reboot = __ATTR_RO(pending_reboot); */ static int create_attributes_level_sysfs_files(void) { - int ret = sysfs_create_file(&wmi_priv.main_dir_kset->kobj, &reset_bios.attr); + int ret;
- if (ret) { - pr_debug("could not create reset_bios file\n"); + ret = sysfs_create_file(&wmi_priv.main_dir_kset->kobj, &reset_bios.attr); + if (ret) return ret; - }
ret = sysfs_create_file(&wmi_priv.main_dir_kset->kobj, &pending_reboot.attr); - if (ret) { - pr_debug("could not create changing_pending_reboot file\n"); - sysfs_remove_file(&wmi_priv.main_dir_kset->kobj, &reset_bios.attr); - } - return ret; + if (ret) + return ret; + + return 0; }
static ssize_t wmi_sysman_attr_show(struct kobject *kobj, struct attribute *attr,
From: Anshuman Khandual anshuman.khandual@arm.com
[ Upstream commit 03aaf83fba6e5af08b5dd174c72edee9b7d9ed9b ]
This overrides arch_get_mappable_range() on arm64 platform which will be used with recently added generic framework. It drops inside_linear_region() and subsequent check in arch_add_memory() which are no longer required. It also adds a VM_BUG_ON() check that would ensure that mhp_range_allowed() has already been called.
Link: https://lkml.kernel.org/r/1612149902-7867-3-git-send-email-anshuman.khandual... Signed-off-by: Anshuman Khandual anshuman.khandual@arm.com Reviewed-by: David Hildenbrand david@redhat.com Reviewed-by: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org Cc: Ard Biesheuvel ardb@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Heiko Carstens hca@linux.ibm.com Cc: Jason Wang jasowang@redhat.com Cc: Jonathan Cameron Jonathan.Cameron@huawei.com Cc: "Michael S. Tsirkin" mst@redhat.com Cc: Michal Hocko mhocko@kernel.org Cc: Oscar Salvador osalvador@suse.de Cc: Pankaj Gupta pankaj.gupta@cloud.ionos.com Cc: Pankaj Gupta pankaj.gupta.linux@gmail.com Cc: teawater teawaterz@linux.alibaba.com Cc: Vasily Gorbik gor@linux.ibm.com Cc: Wei Yang richard.weiyang@linux.alibaba.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/mm/mmu.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 6f0648777d34..92b3be127796 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -1443,16 +1443,19 @@ static void __remove_pgd_mapping(pgd_t *pgdir, unsigned long start, u64 size) free_empty_tables(start, end, PAGE_OFFSET, PAGE_END); }
-static bool inside_linear_region(u64 start, u64 size) +struct range arch_get_mappable_range(void) { + struct range mhp_range; + /* * Linear mapping region is the range [PAGE_OFFSET..(PAGE_END - 1)] * accommodating both its ends but excluding PAGE_END. Max physical * range which can be mapped inside this linear mapping range, must * also be derived from its end points. */ - return start >= __pa(_PAGE_OFFSET(vabits_actual)) && - (start + size - 1) <= __pa(PAGE_END - 1); + mhp_range.start = __pa(_PAGE_OFFSET(vabits_actual)); + mhp_range.end = __pa(PAGE_END - 1); + return mhp_range; }
int arch_add_memory(int nid, u64 start, u64 size, @@ -1460,11 +1463,7 @@ int arch_add_memory(int nid, u64 start, u64 size, { int ret, flags = 0;
- if (!inside_linear_region(start, size)) { - pr_err("[%llx %llx] is outside linear mapping region\n", start, start + size); - return -EINVAL; - } - + VM_BUG_ON(!mhp_range_allowed(start, size, true)); if (rodata_full || debug_pagealloc_enabled()) flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
On Mon, 29 Mar 2021 at 14:10, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
From: Anshuman Khandual anshuman.khandual@arm.com
[ Upstream commit 03aaf83fba6e5af08b5dd174c72edee9b7d9ed9b ]
This overrides arch_get_mappable_range() on arm64 platform which will be used with recently added generic framework. It drops inside_linear_region() and subsequent check in arch_add_memory() which are no longer required. It also adds a VM_BUG_ON() check that would ensure that mhp_range_allowed() has already been called.
Link: https://lkml.kernel.org/r/1612149902-7867-3-git-send-email-anshuman.khandual... Signed-off-by: Anshuman Khandual anshuman.khandual@arm.com Reviewed-by: David Hildenbrand david@redhat.com Reviewed-by: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org Cc: Ard Biesheuvel ardb@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Heiko Carstens hca@linux.ibm.com Cc: Jason Wang jasowang@redhat.com Cc: Jonathan Cameron Jonathan.Cameron@huawei.com Cc: "Michael S. Tsirkin" mst@redhat.com Cc: Michal Hocko mhocko@kernel.org Cc: Oscar Salvador osalvador@suse.de Cc: Pankaj Gupta pankaj.gupta@cloud.ionos.com Cc: Pankaj Gupta pankaj.gupta.linux@gmail.com Cc: teawater teawaterz@linux.alibaba.com Cc: Vasily Gorbik gor@linux.ibm.com Cc: Wei Yang richard.weiyang@linux.alibaba.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org
arch/arm64/mm/mmu.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 6f0648777d34..92b3be127796 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -1443,16 +1443,19 @@ static void __remove_pgd_mapping(pgd_t *pgdir, unsigned long start, u64 size) free_empty_tables(start, end, PAGE_OFFSET, PAGE_END); }
-static bool inside_linear_region(u64 start, u64 size) +struct range arch_get_mappable_range(void) {
struct range mhp_range;
/* * Linear mapping region is the range [PAGE_OFFSET..(PAGE_END - 1)] * accommodating both its ends but excluding PAGE_END. Max physical * range which can be mapped inside this linear mapping range, must * also be derived from its end points. */
return start >= __pa(_PAGE_OFFSET(vabits_actual)) &&
(start + size - 1) <= __pa(PAGE_END - 1);
mhp_range.start = __pa(_PAGE_OFFSET(vabits_actual));
mhp_range.end = __pa(PAGE_END - 1);
return mhp_range;
}
int arch_add_memory(int nid, u64 start, u64 size, @@ -1460,11 +1463,7 @@ int arch_add_memory(int nid, u64 start, u64 size, { int ret, flags = 0;
if (!inside_linear_region(start, size)) {
pr_err("[%llx %llx] is outside linear mapping region\n", start, start + size);
return -EINVAL;
}
VM_BUG_ON(!mhp_range_allowed(start, size, true)); if (rodata_full || debug_pagealloc_enabled()) flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
The stable rc 5.10 and 5.11 builds failed for arm64 architecture due to below warnings / errors,
Anshuman Khandual anshuman.khandual@arm.com arm64/mm: define arch_get_mappable_range()
arch/arm64/mm/mmu.c: In function 'arch_add_memory': arch/arm64/mm/mmu.c:1483:13: error: implicit declaration of function 'mhp_range_allowed'; did you mean 'cpu_map_prog_allowed'? [-Werror=implicit-function-declaration] VM_BUG_ON(!mhp_range_allowed(start, size, true)); ^ include/linux/build_bug.h:30:63: note: in definition of macro 'BUILD_BUG_ON_INVALID' #define BUILD_BUG_ON_INVALID(e) ((void)(sizeof((__force long)(e)))) ^ arch/arm64/mm/mmu.c:1483:2: note: in expansion of macro 'VM_BUG_ON' VM_BUG_ON(!mhp_range_allowed(start, size, true)); ^~~~~~~~~
Build link, https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.11/D... https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.10/D...
-- Linaro LKFT https://lkft.linaro.org
On Mon, Mar 29, 2021 at 03:05:25PM +0530, Naresh Kamboju wrote:
On Mon, 29 Mar 2021 at 14:10, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
From: Anshuman Khandual anshuman.khandual@arm.com
[ Upstream commit 03aaf83fba6e5af08b5dd174c72edee9b7d9ed9b ]
This overrides arch_get_mappable_range() on arm64 platform which will be used with recently added generic framework. It drops inside_linear_region() and subsequent check in arch_add_memory() which are no longer required. It also adds a VM_BUG_ON() check that would ensure that mhp_range_allowed() has already been called.
Link: https://lkml.kernel.org/r/1612149902-7867-3-git-send-email-anshuman.khandual... Signed-off-by: Anshuman Khandual anshuman.khandual@arm.com Reviewed-by: David Hildenbrand david@redhat.com Reviewed-by: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org Cc: Ard Biesheuvel ardb@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Heiko Carstens hca@linux.ibm.com Cc: Jason Wang jasowang@redhat.com Cc: Jonathan Cameron Jonathan.Cameron@huawei.com Cc: "Michael S. Tsirkin" mst@redhat.com Cc: Michal Hocko mhocko@kernel.org Cc: Oscar Salvador osalvador@suse.de Cc: Pankaj Gupta pankaj.gupta@cloud.ionos.com Cc: Pankaj Gupta pankaj.gupta.linux@gmail.com Cc: teawater teawaterz@linux.alibaba.com Cc: Vasily Gorbik gor@linux.ibm.com Cc: Wei Yang richard.weiyang@linux.alibaba.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org
arch/arm64/mm/mmu.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 6f0648777d34..92b3be127796 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -1443,16 +1443,19 @@ static void __remove_pgd_mapping(pgd_t *pgdir, unsigned long start, u64 size) free_empty_tables(start, end, PAGE_OFFSET, PAGE_END); }
-static bool inside_linear_region(u64 start, u64 size) +struct range arch_get_mappable_range(void) {
struct range mhp_range;
/* * Linear mapping region is the range [PAGE_OFFSET..(PAGE_END - 1)] * accommodating both its ends but excluding PAGE_END. Max physical * range which can be mapped inside this linear mapping range, must * also be derived from its end points. */
return start >= __pa(_PAGE_OFFSET(vabits_actual)) &&
(start + size - 1) <= __pa(PAGE_END - 1);
mhp_range.start = __pa(_PAGE_OFFSET(vabits_actual));
mhp_range.end = __pa(PAGE_END - 1);
return mhp_range;
}
int arch_add_memory(int nid, u64 start, u64 size, @@ -1460,11 +1463,7 @@ int arch_add_memory(int nid, u64 start, u64 size, { int ret, flags = 0;
if (!inside_linear_region(start, size)) {
pr_err("[%llx %llx] is outside linear mapping region\n", start, start + size);
return -EINVAL;
}
VM_BUG_ON(!mhp_range_allowed(start, size, true)); if (rodata_full || debug_pagealloc_enabled()) flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
The stable rc 5.10 and 5.11 builds failed for arm64 architecture due to below warnings / errors,
Anshuman Khandual anshuman.khandual@arm.com arm64/mm: define arch_get_mappable_range()
arch/arm64/mm/mmu.c: In function 'arch_add_memory': arch/arm64/mm/mmu.c:1483:13: error: implicit declaration of function 'mhp_range_allowed'; did you mean 'cpu_map_prog_allowed'? [-Werror=implicit-function-declaration] VM_BUG_ON(!mhp_range_allowed(start, size, true)); ^ include/linux/build_bug.h:30:63: note: in definition of macro 'BUILD_BUG_ON_INVALID' #define BUILD_BUG_ON_INVALID(e) ((void)(sizeof((__force long)(e)))) ^ arch/arm64/mm/mmu.c:1483:2: note: in expansion of macro 'VM_BUG_ON' VM_BUG_ON(!mhp_range_allowed(start, size, true)); ^~~~~~~~~
Build link, https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.11/D... https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.10/D...
thanks, will go drop this, and the patch that was after it in the series, from both trees and will push out a -rc2.
Note, I used tuxbuild before doing this release, and it does not show this error in the arm64 defconfigs. What config did you use to trigger this?
thanks,
greg k-h
On Mon, 29 Mar 2021 at 12:13, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Mon, Mar 29, 2021 at 03:05:25PM +0530, Naresh Kamboju wrote:
On Mon, 29 Mar 2021 at 14:10, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
From: Anshuman Khandual anshuman.khandual@arm.com
[ Upstream commit 03aaf83fba6e5af08b5dd174c72edee9b7d9ed9b ]
This overrides arch_get_mappable_range() on arm64 platform which will be used with recently added generic framework. It drops inside_linear_region() and subsequent check in arch_add_memory() which are no longer required. It also adds a VM_BUG_ON() check that would ensure that mhp_range_allowed() has already been called.
Link: https://lkml.kernel.org/r/1612149902-7867-3-git-send-email-anshuman.khandual... Signed-off-by: Anshuman Khandual anshuman.khandual@arm.com Reviewed-by: David Hildenbrand david@redhat.com Reviewed-by: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org Cc: Ard Biesheuvel ardb@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Heiko Carstens hca@linux.ibm.com Cc: Jason Wang jasowang@redhat.com Cc: Jonathan Cameron Jonathan.Cameron@huawei.com Cc: "Michael S. Tsirkin" mst@redhat.com Cc: Michal Hocko mhocko@kernel.org Cc: Oscar Salvador osalvador@suse.de Cc: Pankaj Gupta pankaj.gupta@cloud.ionos.com Cc: Pankaj Gupta pankaj.gupta.linux@gmail.com Cc: teawater teawaterz@linux.alibaba.com Cc: Vasily Gorbik gor@linux.ibm.com Cc: Wei Yang richard.weiyang@linux.alibaba.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org
arch/arm64/mm/mmu.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 6f0648777d34..92b3be127796 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -1443,16 +1443,19 @@ static void __remove_pgd_mapping(pgd_t *pgdir, unsigned long start, u64 size) free_empty_tables(start, end, PAGE_OFFSET, PAGE_END); }
-static bool inside_linear_region(u64 start, u64 size) +struct range arch_get_mappable_range(void) {
struct range mhp_range;
/* * Linear mapping region is the range [PAGE_OFFSET..(PAGE_END - 1)] * accommodating both its ends but excluding PAGE_END. Max physical * range which can be mapped inside this linear mapping range, must * also be derived from its end points. */
return start >= __pa(_PAGE_OFFSET(vabits_actual)) &&
(start + size - 1) <= __pa(PAGE_END - 1);
mhp_range.start = __pa(_PAGE_OFFSET(vabits_actual));
mhp_range.end = __pa(PAGE_END - 1);
return mhp_range;
}
int arch_add_memory(int nid, u64 start, u64 size, @@ -1460,11 +1463,7 @@ int arch_add_memory(int nid, u64 start, u64 size, { int ret, flags = 0;
if (!inside_linear_region(start, size)) {
pr_err("[%llx %llx] is outside linear mapping region\n", start, start + size);
return -EINVAL;
}
VM_BUG_ON(!mhp_range_allowed(start, size, true)); if (rodata_full || debug_pagealloc_enabled()) flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
The stable rc 5.10 and 5.11 builds failed for arm64 architecture due to below warnings / errors,
Anshuman Khandual anshuman.khandual@arm.com arm64/mm: define arch_get_mappable_range()
arch/arm64/mm/mmu.c: In function 'arch_add_memory': arch/arm64/mm/mmu.c:1483:13: error: implicit declaration of function 'mhp_range_allowed'; did you mean 'cpu_map_prog_allowed'? [-Werror=implicit-function-declaration] VM_BUG_ON(!mhp_range_allowed(start, size, true)); ^ include/linux/build_bug.h:30:63: note: in definition of macro 'BUILD_BUG_ON_INVALID' #define BUILD_BUG_ON_INVALID(e) ((void)(sizeof((__force long)(e)))) ^ arch/arm64/mm/mmu.c:1483:2: note: in expansion of macro 'VM_BUG_ON' VM_BUG_ON(!mhp_range_allowed(start, size, true)); ^~~~~~~~~
Build link, https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.11/D... https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.10/D...
thanks, will go drop this, and the patch that was after it in the series, from both trees and will push out a -rc2.
Note, I used tuxbuild before doing this release, and it does not show this error in the arm64 defconfigs. What config did you use to trigger this?
We have a build with CONFIG_MEMORY_HOTPLUG=y enabled too.
This is a way to reproduce it locally: tuxmake --runtime podman --target-arch arm64 --toolchain gcc --kconfig defconfig --kconfig-add CONFIG_MEMORY_HOTPLUG=y
Cheers, Anders
On Mon, Mar 29, 2021 at 01:06:47PM +0200, Anders Roxell wrote:
On Mon, 29 Mar 2021 at 12:13, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Mon, Mar 29, 2021 at 03:05:25PM +0530, Naresh Kamboju wrote:
On Mon, 29 Mar 2021 at 14:10, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
From: Anshuman Khandual anshuman.khandual@arm.com
[ Upstream commit 03aaf83fba6e5af08b5dd174c72edee9b7d9ed9b ]
This overrides arch_get_mappable_range() on arm64 platform which will be used with recently added generic framework. It drops inside_linear_region() and subsequent check in arch_add_memory() which are no longer required. It also adds a VM_BUG_ON() check that would ensure that mhp_range_allowed() has already been called.
Link: https://lkml.kernel.org/r/1612149902-7867-3-git-send-email-anshuman.khandual... Signed-off-by: Anshuman Khandual anshuman.khandual@arm.com Reviewed-by: David Hildenbrand david@redhat.com Reviewed-by: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org Cc: Ard Biesheuvel ardb@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Heiko Carstens hca@linux.ibm.com Cc: Jason Wang jasowang@redhat.com Cc: Jonathan Cameron Jonathan.Cameron@huawei.com Cc: "Michael S. Tsirkin" mst@redhat.com Cc: Michal Hocko mhocko@kernel.org Cc: Oscar Salvador osalvador@suse.de Cc: Pankaj Gupta pankaj.gupta@cloud.ionos.com Cc: Pankaj Gupta pankaj.gupta.linux@gmail.com Cc: teawater teawaterz@linux.alibaba.com Cc: Vasily Gorbik gor@linux.ibm.com Cc: Wei Yang richard.weiyang@linux.alibaba.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org
arch/arm64/mm/mmu.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 6f0648777d34..92b3be127796 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -1443,16 +1443,19 @@ static void __remove_pgd_mapping(pgd_t *pgdir, unsigned long start, u64 size) free_empty_tables(start, end, PAGE_OFFSET, PAGE_END); }
-static bool inside_linear_region(u64 start, u64 size) +struct range arch_get_mappable_range(void) {
struct range mhp_range;
/* * Linear mapping region is the range [PAGE_OFFSET..(PAGE_END - 1)] * accommodating both its ends but excluding PAGE_END. Max physical * range which can be mapped inside this linear mapping range, must * also be derived from its end points. */
return start >= __pa(_PAGE_OFFSET(vabits_actual)) &&
(start + size - 1) <= __pa(PAGE_END - 1);
mhp_range.start = __pa(_PAGE_OFFSET(vabits_actual));
mhp_range.end = __pa(PAGE_END - 1);
return mhp_range;
}
int arch_add_memory(int nid, u64 start, u64 size, @@ -1460,11 +1463,7 @@ int arch_add_memory(int nid, u64 start, u64 size, { int ret, flags = 0;
if (!inside_linear_region(start, size)) {
pr_err("[%llx %llx] is outside linear mapping region\n", start, start + size);
return -EINVAL;
}
VM_BUG_ON(!mhp_range_allowed(start, size, true)); if (rodata_full || debug_pagealloc_enabled()) flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
The stable rc 5.10 and 5.11 builds failed for arm64 architecture due to below warnings / errors,
Anshuman Khandual anshuman.khandual@arm.com arm64/mm: define arch_get_mappable_range()
arch/arm64/mm/mmu.c: In function 'arch_add_memory': arch/arm64/mm/mmu.c:1483:13: error: implicit declaration of function 'mhp_range_allowed'; did you mean 'cpu_map_prog_allowed'? [-Werror=implicit-function-declaration] VM_BUG_ON(!mhp_range_allowed(start, size, true)); ^ include/linux/build_bug.h:30:63: note: in definition of macro 'BUILD_BUG_ON_INVALID' #define BUILD_BUG_ON_INVALID(e) ((void)(sizeof((__force long)(e)))) ^ arch/arm64/mm/mmu.c:1483:2: note: in expansion of macro 'VM_BUG_ON' VM_BUG_ON(!mhp_range_allowed(start, size, true)); ^~~~~~~~~
Build link, https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.11/D... https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.10/D...
thanks, will go drop this, and the patch that was after it in the series, from both trees and will push out a -rc2.
Note, I used tuxbuild before doing this release, and it does not show this error in the arm64 defconfigs. What config did you use to trigger this?
We have a build with CONFIG_MEMORY_HOTPLUG=y enabled too.
This is a way to reproduce it locally: tuxmake --runtime podman --target-arch arm64 --toolchain gcc --kconfig defconfig --kconfig-add CONFIG_MEMORY_HOTPLUG=y
Ah, that wasn't expected, but makes sense, thanks.
Does 'allmodconfig' also trigger that? Maybe I'll go add that to my build tests for arm64...
thanks,
greg k-h
On Mon, 29 Mar 2021 at 13:35, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Mon, Mar 29, 2021 at 01:06:47PM +0200, Anders Roxell wrote:
On Mon, 29 Mar 2021 at 12:13, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Mon, Mar 29, 2021 at 03:05:25PM +0530, Naresh Kamboju wrote:
On Mon, 29 Mar 2021 at 14:10, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
From: Anshuman Khandual anshuman.khandual@arm.com
[ Upstream commit 03aaf83fba6e5af08b5dd174c72edee9b7d9ed9b ]
This overrides arch_get_mappable_range() on arm64 platform which will be used with recently added generic framework. It drops inside_linear_region() and subsequent check in arch_add_memory() which are no longer required. It also adds a VM_BUG_ON() check that would ensure that mhp_range_allowed() has already been called.
Link: https://lkml.kernel.org/r/1612149902-7867-3-git-send-email-anshuman.khandual... Signed-off-by: Anshuman Khandual anshuman.khandual@arm.com Reviewed-by: David Hildenbrand david@redhat.com Reviewed-by: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org Cc: Ard Biesheuvel ardb@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Heiko Carstens hca@linux.ibm.com Cc: Jason Wang jasowang@redhat.com Cc: Jonathan Cameron Jonathan.Cameron@huawei.com Cc: "Michael S. Tsirkin" mst@redhat.com Cc: Michal Hocko mhocko@kernel.org Cc: Oscar Salvador osalvador@suse.de Cc: Pankaj Gupta pankaj.gupta@cloud.ionos.com Cc: Pankaj Gupta pankaj.gupta.linux@gmail.com Cc: teawater teawaterz@linux.alibaba.com Cc: Vasily Gorbik gor@linux.ibm.com Cc: Wei Yang richard.weiyang@linux.alibaba.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org
arch/arm64/mm/mmu.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 6f0648777d34..92b3be127796 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -1443,16 +1443,19 @@ static void __remove_pgd_mapping(pgd_t *pgdir, unsigned long start, u64 size) free_empty_tables(start, end, PAGE_OFFSET, PAGE_END); }
-static bool inside_linear_region(u64 start, u64 size) +struct range arch_get_mappable_range(void) {
struct range mhp_range;
/* * Linear mapping region is the range [PAGE_OFFSET..(PAGE_END - 1)] * accommodating both its ends but excluding PAGE_END. Max physical * range which can be mapped inside this linear mapping range, must * also be derived from its end points. */
return start >= __pa(_PAGE_OFFSET(vabits_actual)) &&
(start + size - 1) <= __pa(PAGE_END - 1);
mhp_range.start = __pa(_PAGE_OFFSET(vabits_actual));
mhp_range.end = __pa(PAGE_END - 1);
return mhp_range;
}
int arch_add_memory(int nid, u64 start, u64 size, @@ -1460,11 +1463,7 @@ int arch_add_memory(int nid, u64 start, u64 size, { int ret, flags = 0;
if (!inside_linear_region(start, size)) {
pr_err("[%llx %llx] is outside linear mapping region\n", start, start + size);
return -EINVAL;
}
VM_BUG_ON(!mhp_range_allowed(start, size, true)); if (rodata_full || debug_pagealloc_enabled()) flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
The stable rc 5.10 and 5.11 builds failed for arm64 architecture due to below warnings / errors,
Anshuman Khandual anshuman.khandual@arm.com arm64/mm: define arch_get_mappable_range()
arch/arm64/mm/mmu.c: In function 'arch_add_memory': arch/arm64/mm/mmu.c:1483:13: error: implicit declaration of function 'mhp_range_allowed'; did you mean 'cpu_map_prog_allowed'? [-Werror=implicit-function-declaration] VM_BUG_ON(!mhp_range_allowed(start, size, true)); ^ include/linux/build_bug.h:30:63: note: in definition of macro 'BUILD_BUG_ON_INVALID' #define BUILD_BUG_ON_INVALID(e) ((void)(sizeof((__force long)(e)))) ^ arch/arm64/mm/mmu.c:1483:2: note: in expansion of macro 'VM_BUG_ON' VM_BUG_ON(!mhp_range_allowed(start, size, true)); ^~~~~~~~~
Build link, https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.11/D... https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.10/D...
thanks, will go drop this, and the patch that was after it in the series, from both trees and will push out a -rc2.
Note, I used tuxbuild before doing this release, and it does not show this error in the arm64 defconfigs. What config did you use to trigger this?
We have a build with CONFIG_MEMORY_HOTPLUG=y enabled too.
This is a way to reproduce it locally: tuxmake --runtime podman --target-arch arm64 --toolchain gcc --kconfig defconfig --kconfig-add CONFIG_MEMORY_HOTPLUG=y
Ah, that wasn't expected, but makes sense, thanks.
Does 'allmodconfig' also trigger that?
Yes that would trigger it. I tried this: tuxmake --runtime docker --target-arch arm64 --toolchain gcc --kconfig allmodconfig --build-dir=$(pwd)/obj/test-allmod-arm64
Maybe I'll go add that to my build tests for arm64...
It will take a few minutes to build an allmodconfig kernel on tuxsuite.
Cheers, Anders
On Mon, 29 Mar 2021 at 12:12, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Mon, Mar 29, 2021 at 03:05:25PM +0530, Naresh Kamboju wrote:
On Mon, 29 Mar 2021 at 14:10, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
From: Anshuman Khandual anshuman.khandual@arm.com
[ Upstream commit 03aaf83fba6e5af08b5dd174c72edee9b7d9ed9b ]
This overrides arch_get_mappable_range() on arm64 platform which will be used with recently added generic framework. It drops inside_linear_region() and subsequent check in arch_add_memory() which are no longer required. It also adds a VM_BUG_ON() check that would ensure that mhp_range_allowed() has already been called.
Link: https://lkml.kernel.org/r/1612149902-7867-3-git-send-email-anshuman.khandual... Signed-off-by: Anshuman Khandual anshuman.khandual@arm.com Reviewed-by: David Hildenbrand david@redhat.com Reviewed-by: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org Cc: Ard Biesheuvel ardb@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Heiko Carstens hca@linux.ibm.com Cc: Jason Wang jasowang@redhat.com Cc: Jonathan Cameron Jonathan.Cameron@huawei.com Cc: "Michael S. Tsirkin" mst@redhat.com Cc: Michal Hocko mhocko@kernel.org Cc: Oscar Salvador osalvador@suse.de Cc: Pankaj Gupta pankaj.gupta@cloud.ionos.com Cc: Pankaj Gupta pankaj.gupta.linux@gmail.com Cc: teawater teawaterz@linux.alibaba.com Cc: Vasily Gorbik gor@linux.ibm.com Cc: Wei Yang richard.weiyang@linux.alibaba.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org
arch/arm64/mm/mmu.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 6f0648777d34..92b3be127796 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -1443,16 +1443,19 @@ static void __remove_pgd_mapping(pgd_t *pgdir, unsigned long start, u64 size) free_empty_tables(start, end, PAGE_OFFSET, PAGE_END); }
-static bool inside_linear_region(u64 start, u64 size) +struct range arch_get_mappable_range(void) {
struct range mhp_range;
/* * Linear mapping region is the range [PAGE_OFFSET..(PAGE_END - 1)] * accommodating both its ends but excluding PAGE_END. Max physical * range which can be mapped inside this linear mapping range, must * also be derived from its end points. */
return start >= __pa(_PAGE_OFFSET(vabits_actual)) &&
(start + size - 1) <= __pa(PAGE_END - 1);
mhp_range.start = __pa(_PAGE_OFFSET(vabits_actual));
mhp_range.end = __pa(PAGE_END - 1);
return mhp_range;
}
int arch_add_memory(int nid, u64 start, u64 size, @@ -1460,11 +1463,7 @@ int arch_add_memory(int nid, u64 start, u64 size, { int ret, flags = 0;
if (!inside_linear_region(start, size)) {
pr_err("[%llx %llx] is outside linear mapping region\n", start, start + size);
return -EINVAL;
}
VM_BUG_ON(!mhp_range_allowed(start, size, true)); if (rodata_full || debug_pagealloc_enabled()) flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
The stable rc 5.10 and 5.11 builds failed for arm64 architecture due to below warnings / errors,
Anshuman Khandual anshuman.khandual@arm.com arm64/mm: define arch_get_mappable_range()
arch/arm64/mm/mmu.c: In function 'arch_add_memory': arch/arm64/mm/mmu.c:1483:13: error: implicit declaration of function 'mhp_range_allowed'; did you mean 'cpu_map_prog_allowed'? [-Werror=implicit-function-declaration] VM_BUG_ON(!mhp_range_allowed(start, size, true)); ^ include/linux/build_bug.h:30:63: note: in definition of macro 'BUILD_BUG_ON_INVALID' #define BUILD_BUG_ON_INVALID(e) ((void)(sizeof((__force long)(e)))) ^ arch/arm64/mm/mmu.c:1483:2: note: in expansion of macro 'VM_BUG_ON' VM_BUG_ON(!mhp_range_allowed(start, size, true)); ^~~~~~~~~
Build link, https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.11/D... https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.10/D...
thanks, will go drop this, and the patch that was after it in the series, from both trees and will push out a -rc2.
Why were these picked up in the first place? I don't see any fixes or cc:stable tags, and the commit log clearly describes that the change is preparatory work for enabling arm64 support into a recently introduced generic framework.
On Mon, Mar 29, 2021 at 03:08:52PM +0200, Ard Biesheuvel wrote:
On Mon, 29 Mar 2021 at 12:12, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Mon, Mar 29, 2021 at 03:05:25PM +0530, Naresh Kamboju wrote:
On Mon, 29 Mar 2021 at 14:10, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
From: Anshuman Khandual anshuman.khandual@arm.com
[ Upstream commit 03aaf83fba6e5af08b5dd174c72edee9b7d9ed9b ]
This overrides arch_get_mappable_range() on arm64 platform which will be used with recently added generic framework. It drops inside_linear_region() and subsequent check in arch_add_memory() which are no longer required. It also adds a VM_BUG_ON() check that would ensure that mhp_range_allowed() has already been called.
Link: https://lkml.kernel.org/r/1612149902-7867-3-git-send-email-anshuman.khandual... Signed-off-by: Anshuman Khandual anshuman.khandual@arm.com Reviewed-by: David Hildenbrand david@redhat.com Reviewed-by: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org Cc: Ard Biesheuvel ardb@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Heiko Carstens hca@linux.ibm.com Cc: Jason Wang jasowang@redhat.com Cc: Jonathan Cameron Jonathan.Cameron@huawei.com Cc: "Michael S. Tsirkin" mst@redhat.com Cc: Michal Hocko mhocko@kernel.org Cc: Oscar Salvador osalvador@suse.de Cc: Pankaj Gupta pankaj.gupta@cloud.ionos.com Cc: Pankaj Gupta pankaj.gupta.linux@gmail.com Cc: teawater teawaterz@linux.alibaba.com Cc: Vasily Gorbik gor@linux.ibm.com Cc: Wei Yang richard.weiyang@linux.alibaba.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org
arch/arm64/mm/mmu.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 6f0648777d34..92b3be127796 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -1443,16 +1443,19 @@ static void __remove_pgd_mapping(pgd_t *pgdir, unsigned long start, u64 size) free_empty_tables(start, end, PAGE_OFFSET, PAGE_END); }
-static bool inside_linear_region(u64 start, u64 size) +struct range arch_get_mappable_range(void) {
struct range mhp_range;
/* * Linear mapping region is the range [PAGE_OFFSET..(PAGE_END - 1)] * accommodating both its ends but excluding PAGE_END. Max physical * range which can be mapped inside this linear mapping range, must * also be derived from its end points. */
return start >= __pa(_PAGE_OFFSET(vabits_actual)) &&
(start + size - 1) <= __pa(PAGE_END - 1);
mhp_range.start = __pa(_PAGE_OFFSET(vabits_actual));
mhp_range.end = __pa(PAGE_END - 1);
return mhp_range;
}
int arch_add_memory(int nid, u64 start, u64 size, @@ -1460,11 +1463,7 @@ int arch_add_memory(int nid, u64 start, u64 size, { int ret, flags = 0;
if (!inside_linear_region(start, size)) {
pr_err("[%llx %llx] is outside linear mapping region\n", start, start + size);
return -EINVAL;
}
VM_BUG_ON(!mhp_range_allowed(start, size, true)); if (rodata_full || debug_pagealloc_enabled()) flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
The stable rc 5.10 and 5.11 builds failed for arm64 architecture due to below warnings / errors,
Anshuman Khandual anshuman.khandual@arm.com arm64/mm: define arch_get_mappable_range()
arch/arm64/mm/mmu.c: In function 'arch_add_memory': arch/arm64/mm/mmu.c:1483:13: error: implicit declaration of function 'mhp_range_allowed'; did you mean 'cpu_map_prog_allowed'? [-Werror=implicit-function-declaration] VM_BUG_ON(!mhp_range_allowed(start, size, true)); ^ include/linux/build_bug.h:30:63: note: in definition of macro 'BUILD_BUG_ON_INVALID' #define BUILD_BUG_ON_INVALID(e) ((void)(sizeof((__force long)(e)))) ^ arch/arm64/mm/mmu.c:1483:2: note: in expansion of macro 'VM_BUG_ON' VM_BUG_ON(!mhp_range_allowed(start, size, true)); ^~~~~~~~~
Build link, https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.11/D... https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.10/D...
thanks, will go drop this, and the patch that was after it in the series, from both trees and will push out a -rc2.
Why were these picked up in the first place? I don't see any fixes or cc:stable tags, and the commit log clearly describes that the change is preparatory work for enabling arm64 support into a recently introduced generic framework.
This was needed for a follow-on patch in the series that fixed an issue. Specifically it was commit ee7febce0519 ("arm64: mm: correct the inside linear map range during hotplug check")
thanks,
greg k-h
(+ Pavel)
On Mon, 29 Mar 2021 at 15:42, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Mon, Mar 29, 2021 at 03:08:52PM +0200, Ard Biesheuvel wrote:
On Mon, 29 Mar 2021 at 12:12, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Mon, Mar 29, 2021 at 03:05:25PM +0530, Naresh Kamboju wrote:
On Mon, 29 Mar 2021 at 14:10, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
From: Anshuman Khandual anshuman.khandual@arm.com
[ Upstream commit 03aaf83fba6e5af08b5dd174c72edee9b7d9ed9b ]
This overrides arch_get_mappable_range() on arm64 platform which will be used with recently added generic framework. It drops inside_linear_region() and subsequent check in arch_add_memory() which are no longer required. It also adds a VM_BUG_ON() check that would ensure that mhp_range_allowed() has already been called.
Link: https://lkml.kernel.org/r/1612149902-7867-3-git-send-email-anshuman.khandual... Signed-off-by: Anshuman Khandual anshuman.khandual@arm.com Reviewed-by: David Hildenbrand david@redhat.com Reviewed-by: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org Cc: Ard Biesheuvel ardb@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Heiko Carstens hca@linux.ibm.com Cc: Jason Wang jasowang@redhat.com Cc: Jonathan Cameron Jonathan.Cameron@huawei.com Cc: "Michael S. Tsirkin" mst@redhat.com Cc: Michal Hocko mhocko@kernel.org Cc: Oscar Salvador osalvador@suse.de Cc: Pankaj Gupta pankaj.gupta@cloud.ionos.com Cc: Pankaj Gupta pankaj.gupta.linux@gmail.com Cc: teawater teawaterz@linux.alibaba.com Cc: Vasily Gorbik gor@linux.ibm.com Cc: Wei Yang richard.weiyang@linux.alibaba.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org
arch/arm64/mm/mmu.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 6f0648777d34..92b3be127796 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -1443,16 +1443,19 @@ static void __remove_pgd_mapping(pgd_t *pgdir, unsigned long start, u64 size) free_empty_tables(start, end, PAGE_OFFSET, PAGE_END); }
-static bool inside_linear_region(u64 start, u64 size) +struct range arch_get_mappable_range(void) {
struct range mhp_range;
/* * Linear mapping region is the range [PAGE_OFFSET..(PAGE_END - 1)] * accommodating both its ends but excluding PAGE_END. Max physical * range which can be mapped inside this linear mapping range, must * also be derived from its end points. */
return start >= __pa(_PAGE_OFFSET(vabits_actual)) &&
(start + size - 1) <= __pa(PAGE_END - 1);
mhp_range.start = __pa(_PAGE_OFFSET(vabits_actual));
mhp_range.end = __pa(PAGE_END - 1);
return mhp_range;
}
int arch_add_memory(int nid, u64 start, u64 size, @@ -1460,11 +1463,7 @@ int arch_add_memory(int nid, u64 start, u64 size, { int ret, flags = 0;
if (!inside_linear_region(start, size)) {
pr_err("[%llx %llx] is outside linear mapping region\n", start, start + size);
return -EINVAL;
}
VM_BUG_ON(!mhp_range_allowed(start, size, true)); if (rodata_full || debug_pagealloc_enabled()) flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
The stable rc 5.10 and 5.11 builds failed for arm64 architecture due to below warnings / errors,
Anshuman Khandual anshuman.khandual@arm.com arm64/mm: define arch_get_mappable_range()
arch/arm64/mm/mmu.c: In function 'arch_add_memory': arch/arm64/mm/mmu.c:1483:13: error: implicit declaration of function 'mhp_range_allowed'; did you mean 'cpu_map_prog_allowed'? [-Werror=implicit-function-declaration] VM_BUG_ON(!mhp_range_allowed(start, size, true)); ^ include/linux/build_bug.h:30:63: note: in definition of macro 'BUILD_BUG_ON_INVALID' #define BUILD_BUG_ON_INVALID(e) ((void)(sizeof((__force long)(e)))) ^ arch/arm64/mm/mmu.c:1483:2: note: in expansion of macro 'VM_BUG_ON' VM_BUG_ON(!mhp_range_allowed(start, size, true)); ^~~~~~~~~
Build link, https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.11/D... https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.10/D...
thanks, will go drop this, and the patch that was after it in the series, from both trees and will push out a -rc2.
Why were these picked up in the first place? I don't see any fixes or cc:stable tags, and the commit log clearly describes that the change is preparatory work for enabling arm64 support into a recently introduced generic framework.
This was needed for a follow-on patch in the series that fixed an issue. Specifically it was commit ee7febce0519 ("arm64: mm: correct the inside linear map range during hotplug check")
Yeah, but during the discussion of that patch [0], we pointed out that it needed to be rebased because of these new changes. So trying to backport this rebased version is obviously not the right approach: Pavel's original patch would be much more suitable for that.
Could we have annotated this patch in a better way to make this more obvious?
[0] https://lore.kernel.org/linux-arm-kernel/20210215192237.362706-2-pasha.tatas...
On Mon, Mar 29, 2021 at 03:49:19PM +0200, Ard Biesheuvel wrote:
(+ Pavel)
On Mon, 29 Mar 2021 at 15:42, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Mon, Mar 29, 2021 at 03:08:52PM +0200, Ard Biesheuvel wrote:
On Mon, 29 Mar 2021 at 12:12, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Mon, Mar 29, 2021 at 03:05:25PM +0530, Naresh Kamboju wrote:
On Mon, 29 Mar 2021 at 14:10, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
From: Anshuman Khandual anshuman.khandual@arm.com
[ Upstream commit 03aaf83fba6e5af08b5dd174c72edee9b7d9ed9b ]
This overrides arch_get_mappable_range() on arm64 platform which will be used with recently added generic framework. It drops inside_linear_region() and subsequent check in arch_add_memory() which are no longer required. It also adds a VM_BUG_ON() check that would ensure that mhp_range_allowed() has already been called.
Link: https://lkml.kernel.org/r/1612149902-7867-3-git-send-email-anshuman.khandual... Signed-off-by: Anshuman Khandual anshuman.khandual@arm.com Reviewed-by: David Hildenbrand david@redhat.com Reviewed-by: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org Cc: Ard Biesheuvel ardb@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Heiko Carstens hca@linux.ibm.com Cc: Jason Wang jasowang@redhat.com Cc: Jonathan Cameron Jonathan.Cameron@huawei.com Cc: "Michael S. Tsirkin" mst@redhat.com Cc: Michal Hocko mhocko@kernel.org Cc: Oscar Salvador osalvador@suse.de Cc: Pankaj Gupta pankaj.gupta@cloud.ionos.com Cc: Pankaj Gupta pankaj.gupta.linux@gmail.com Cc: teawater teawaterz@linux.alibaba.com Cc: Vasily Gorbik gor@linux.ibm.com Cc: Wei Yang richard.weiyang@linux.alibaba.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org
arch/arm64/mm/mmu.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 6f0648777d34..92b3be127796 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -1443,16 +1443,19 @@ static void __remove_pgd_mapping(pgd_t *pgdir, unsigned long start, u64 size) free_empty_tables(start, end, PAGE_OFFSET, PAGE_END); }
-static bool inside_linear_region(u64 start, u64 size) +struct range arch_get_mappable_range(void) {
struct range mhp_range;
/* * Linear mapping region is the range [PAGE_OFFSET..(PAGE_END - 1)] * accommodating both its ends but excluding PAGE_END. Max physical * range which can be mapped inside this linear mapping range, must * also be derived from its end points. */
return start >= __pa(_PAGE_OFFSET(vabits_actual)) &&
(start + size - 1) <= __pa(PAGE_END - 1);
mhp_range.start = __pa(_PAGE_OFFSET(vabits_actual));
mhp_range.end = __pa(PAGE_END - 1);
return mhp_range;
}
int arch_add_memory(int nid, u64 start, u64 size, @@ -1460,11 +1463,7 @@ int arch_add_memory(int nid, u64 start, u64 size, { int ret, flags = 0;
if (!inside_linear_region(start, size)) {
pr_err("[%llx %llx] is outside linear mapping region\n", start, start + size);
return -EINVAL;
}
VM_BUG_ON(!mhp_range_allowed(start, size, true)); if (rodata_full || debug_pagealloc_enabled()) flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
The stable rc 5.10 and 5.11 builds failed for arm64 architecture due to below warnings / errors,
Anshuman Khandual anshuman.khandual@arm.com arm64/mm: define arch_get_mappable_range()
arch/arm64/mm/mmu.c: In function 'arch_add_memory': arch/arm64/mm/mmu.c:1483:13: error: implicit declaration of function 'mhp_range_allowed'; did you mean 'cpu_map_prog_allowed'? [-Werror=implicit-function-declaration] VM_BUG_ON(!mhp_range_allowed(start, size, true)); ^ include/linux/build_bug.h:30:63: note: in definition of macro 'BUILD_BUG_ON_INVALID' #define BUILD_BUG_ON_INVALID(e) ((void)(sizeof((__force long)(e)))) ^ arch/arm64/mm/mmu.c:1483:2: note: in expansion of macro 'VM_BUG_ON' VM_BUG_ON(!mhp_range_allowed(start, size, true)); ^~~~~~~~~
Build link, https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.11/D... https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.10/D...
thanks, will go drop this, and the patch that was after it in the series, from both trees and will push out a -rc2.
Why were these picked up in the first place? I don't see any fixes or cc:stable tags, and the commit log clearly describes that the change is preparatory work for enabling arm64 support into a recently introduced generic framework.
This was needed for a follow-on patch in the series that fixed an issue. Specifically it was commit ee7febce0519 ("arm64: mm: correct the inside linear map range during hotplug check")
Yeah, but during the discussion of that patch [0], we pointed out that it needed to be rebased because of these new changes. So trying to backport this rebased version is obviously not the right approach: Pavel's original patch would be much more suitable for that.
Could we have annotated this patch in a better way to make this more obvious?
Yes, given that there was no annotation on the patch at all to let us know this :)
You can say things like "do not apply to stable trees" or "needs total rework for older kernels" or other fun such things that when we read them, we know to ask for help. As it is, the patch provided nothing so we guessed and got it wrong...
thanks,
greg k-h
On Mon, Mar 29, 2021 at 9:51 AM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Mon, Mar 29, 2021 at 03:49:19PM +0200, Ard Biesheuvel wrote:
(+ Pavel)
On Mon, 29 Mar 2021 at 15:42, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Mon, Mar 29, 2021 at 03:08:52PM +0200, Ard Biesheuvel wrote:
On Mon, 29 Mar 2021 at 12:12, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Mon, Mar 29, 2021 at 03:05:25PM +0530, Naresh Kamboju wrote:
On Mon, 29 Mar 2021 at 14:10, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote: > > From: Anshuman Khandual anshuman.khandual@arm.com > > [ Upstream commit 03aaf83fba6e5af08b5dd174c72edee9b7d9ed9b ] > > This overrides arch_get_mappable_range() on arm64 platform which will be > used with recently added generic framework. It drops > inside_linear_region() and subsequent check in arch_add_memory() which are > no longer required. It also adds a VM_BUG_ON() check that would ensure > that mhp_range_allowed() has already been called. > > Link: https://lkml.kernel.org/r/1612149902-7867-3-git-send-email-anshuman.khandual... > Signed-off-by: Anshuman Khandual anshuman.khandual@arm.com > Reviewed-by: David Hildenbrand david@redhat.com > Reviewed-by: Catalin Marinas catalin.marinas@arm.com > Cc: Will Deacon will@kernel.org > Cc: Ard Biesheuvel ardb@kernel.org > Cc: Mark Rutland mark.rutland@arm.com > Cc: Heiko Carstens hca@linux.ibm.com > Cc: Jason Wang jasowang@redhat.com > Cc: Jonathan Cameron Jonathan.Cameron@huawei.com > Cc: "Michael S. Tsirkin" mst@redhat.com > Cc: Michal Hocko mhocko@kernel.org > Cc: Oscar Salvador osalvador@suse.de > Cc: Pankaj Gupta pankaj.gupta@cloud.ionos.com > Cc: Pankaj Gupta pankaj.gupta.linux@gmail.com > Cc: teawater teawaterz@linux.alibaba.com > Cc: Vasily Gorbik gor@linux.ibm.com > Cc: Wei Yang richard.weiyang@linux.alibaba.com > Signed-off-by: Andrew Morton akpm@linux-foundation.org > Signed-off-by: Linus Torvalds torvalds@linux-foundation.org > Signed-off-by: Sasha Levin sashal@kernel.org > --- > arch/arm64/mm/mmu.c | 15 +++++++-------- > 1 file changed, 7 insertions(+), 8 deletions(-) > > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c > index 6f0648777d34..92b3be127796 100644 > --- a/arch/arm64/mm/mmu.c > +++ b/arch/arm64/mm/mmu.c > @@ -1443,16 +1443,19 @@ static void __remove_pgd_mapping(pgd_t *pgdir, unsigned long start, u64 size) > free_empty_tables(start, end, PAGE_OFFSET, PAGE_END); > } > > -static bool inside_linear_region(u64 start, u64 size) > +struct range arch_get_mappable_range(void) > { > + struct range mhp_range; > + > /* > * Linear mapping region is the range [PAGE_OFFSET..(PAGE_END - 1)] > * accommodating both its ends but excluding PAGE_END. Max physical > * range which can be mapped inside this linear mapping range, must > * also be derived from its end points. > */ > - return start >= __pa(_PAGE_OFFSET(vabits_actual)) && > - (start + size - 1) <= __pa(PAGE_END - 1); > + mhp_range.start = __pa(_PAGE_OFFSET(vabits_actual)); > + mhp_range.end = __pa(PAGE_END - 1); > + return mhp_range; > } > > int arch_add_memory(int nid, u64 start, u64 size, > @@ -1460,11 +1463,7 @@ int arch_add_memory(int nid, u64 start, u64 size, > { > int ret, flags = 0; > > - if (!inside_linear_region(start, size)) { > - pr_err("[%llx %llx] is outside linear mapping region\n", start, start + size); > - return -EINVAL; > - } > - > + VM_BUG_ON(!mhp_range_allowed(start, size, true)); > if (rodata_full || debug_pagealloc_enabled()) > flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
The stable rc 5.10 and 5.11 builds failed for arm64 architecture due to below warnings / errors,
> Anshuman Khandual anshuman.khandual@arm.com > arm64/mm: define arch_get_mappable_range()
arch/arm64/mm/mmu.c: In function 'arch_add_memory': arch/arm64/mm/mmu.c:1483:13: error: implicit declaration of function 'mhp_range_allowed'; did you mean 'cpu_map_prog_allowed'? [-Werror=implicit-function-declaration] VM_BUG_ON(!mhp_range_allowed(start, size, true)); ^ include/linux/build_bug.h:30:63: note: in definition of macro 'BUILD_BUG_ON_INVALID' #define BUILD_BUG_ON_INVALID(e) ((void)(sizeof((__force long)(e)))) ^ arch/arm64/mm/mmu.c:1483:2: note: in expansion of macro 'VM_BUG_ON' VM_BUG_ON(!mhp_range_allowed(start, size, true)); ^~~~~~~~~
Build link, https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.11/D... https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.10/D...
thanks, will go drop this, and the patch that was after it in the series, from both trees and will push out a -rc2.
Why were these picked up in the first place? I don't see any fixes or cc:stable tags, and the commit log clearly describes that the change is preparatory work for enabling arm64 support into a recently introduced generic framework.
This was needed for a follow-on patch in the series that fixed an issue. Specifically it was commit ee7febce0519 ("arm64: mm: correct the inside linear map range during hotplug check")
Yeah, but during the discussion of that patch [0], we pointed out that it needed to be rebased because of these new changes. So trying to backport this rebased version is obviously not the right approach: Pavel's original patch would be much more suitable for that.
Could we have annotated this patch in a better way to make this more obvious?
Yes, given that there was no annotation on the patch at all to let us know this :)
You can say things like "do not apply to stable trees" or "needs total rework for older kernels" or other fun such things that when we read them, we know to ask for help. As it is, the patch provided nothing so we guessed and got it wrong...
I will send the patch for stable trees with the commit id included as requested.
Pasha
thanks,
greg k-h
On Mon, Mar 29, 2021 at 09:53:10AM -0400, Pavel Tatashin wrote:
On Mon, Mar 29, 2021 at 9:51 AM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Mon, Mar 29, 2021 at 03:49:19PM +0200, Ard Biesheuvel wrote:
(+ Pavel)
On Mon, 29 Mar 2021 at 15:42, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Mon, Mar 29, 2021 at 03:08:52PM +0200, Ard Biesheuvel wrote:
On Mon, 29 Mar 2021 at 12:12, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Mon, Mar 29, 2021 at 03:05:25PM +0530, Naresh Kamboju wrote: > On Mon, 29 Mar 2021 at 14:10, Greg Kroah-Hartman > gregkh@linuxfoundation.org wrote: > > > > From: Anshuman Khandual anshuman.khandual@arm.com > > > > [ Upstream commit 03aaf83fba6e5af08b5dd174c72edee9b7d9ed9b ] > > > > This overrides arch_get_mappable_range() on arm64 platform which will be > > used with recently added generic framework. It drops > > inside_linear_region() and subsequent check in arch_add_memory() which are > > no longer required. It also adds a VM_BUG_ON() check that would ensure > > that mhp_range_allowed() has already been called. > > > > Link: https://lkml.kernel.org/r/1612149902-7867-3-git-send-email-anshuman.khandual... > > Signed-off-by: Anshuman Khandual anshuman.khandual@arm.com > > Reviewed-by: David Hildenbrand david@redhat.com > > Reviewed-by: Catalin Marinas catalin.marinas@arm.com > > Cc: Will Deacon will@kernel.org > > Cc: Ard Biesheuvel ardb@kernel.org > > Cc: Mark Rutland mark.rutland@arm.com > > Cc: Heiko Carstens hca@linux.ibm.com > > Cc: Jason Wang jasowang@redhat.com > > Cc: Jonathan Cameron Jonathan.Cameron@huawei.com > > Cc: "Michael S. Tsirkin" mst@redhat.com > > Cc: Michal Hocko mhocko@kernel.org > > Cc: Oscar Salvador osalvador@suse.de > > Cc: Pankaj Gupta pankaj.gupta@cloud.ionos.com > > Cc: Pankaj Gupta pankaj.gupta.linux@gmail.com > > Cc: teawater teawaterz@linux.alibaba.com > > Cc: Vasily Gorbik gor@linux.ibm.com > > Cc: Wei Yang richard.weiyang@linux.alibaba.com > > Signed-off-by: Andrew Morton akpm@linux-foundation.org > > Signed-off-by: Linus Torvalds torvalds@linux-foundation.org > > Signed-off-by: Sasha Levin sashal@kernel.org > > --- > > arch/arm64/mm/mmu.c | 15 +++++++-------- > > 1 file changed, 7 insertions(+), 8 deletions(-) > > > > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c > > index 6f0648777d34..92b3be127796 100644 > > --- a/arch/arm64/mm/mmu.c > > +++ b/arch/arm64/mm/mmu.c > > @@ -1443,16 +1443,19 @@ static void __remove_pgd_mapping(pgd_t *pgdir, unsigned long start, u64 size) > > free_empty_tables(start, end, PAGE_OFFSET, PAGE_END); > > } > > > > -static bool inside_linear_region(u64 start, u64 size) > > +struct range arch_get_mappable_range(void) > > { > > + struct range mhp_range; > > + > > /* > > * Linear mapping region is the range [PAGE_OFFSET..(PAGE_END - 1)] > > * accommodating both its ends but excluding PAGE_END. Max physical > > * range which can be mapped inside this linear mapping range, must > > * also be derived from its end points. > > */ > > - return start >= __pa(_PAGE_OFFSET(vabits_actual)) && > > - (start + size - 1) <= __pa(PAGE_END - 1); > > + mhp_range.start = __pa(_PAGE_OFFSET(vabits_actual)); > > + mhp_range.end = __pa(PAGE_END - 1); > > + return mhp_range; > > } > > > > int arch_add_memory(int nid, u64 start, u64 size, > > @@ -1460,11 +1463,7 @@ int arch_add_memory(int nid, u64 start, u64 size, > > { > > int ret, flags = 0; > > > > - if (!inside_linear_region(start, size)) { > > - pr_err("[%llx %llx] is outside linear mapping region\n", start, start + size); > > - return -EINVAL; > > - } > > - > > + VM_BUG_ON(!mhp_range_allowed(start, size, true)); > > if (rodata_full || debug_pagealloc_enabled()) > > flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS; > > The stable rc 5.10 and 5.11 builds failed for arm64 architecture > due to below warnings / errors, > > > Anshuman Khandual anshuman.khandual@arm.com > > arm64/mm: define arch_get_mappable_range() > > > arch/arm64/mm/mmu.c: In function 'arch_add_memory': > arch/arm64/mm/mmu.c:1483:13: error: implicit declaration of function > 'mhp_range_allowed'; did you mean 'cpu_map_prog_allowed'? > [-Werror=implicit-function-declaration] > VM_BUG_ON(!mhp_range_allowed(start, size, true)); > ^ > include/linux/build_bug.h:30:63: note: in definition of macro > 'BUILD_BUG_ON_INVALID' > #define BUILD_BUG_ON_INVALID(e) ((void)(sizeof((__force long)(e)))) > ^ > arch/arm64/mm/mmu.c:1483:2: note: in expansion of macro 'VM_BUG_ON' > VM_BUG_ON(!mhp_range_allowed(start, size, true)); > ^~~~~~~~~ > > Build link, > https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.11/D... > https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.10/D...
thanks, will go drop this, and the patch that was after it in the series, from both trees and will push out a -rc2.
Why were these picked up in the first place? I don't see any fixes or cc:stable tags, and the commit log clearly describes that the change is preparatory work for enabling arm64 support into a recently introduced generic framework.
This was needed for a follow-on patch in the series that fixed an issue. Specifically it was commit ee7febce0519 ("arm64: mm: correct the inside linear map range during hotplug check")
Yeah, but during the discussion of that patch [0], we pointed out that it needed to be rebased because of these new changes. So trying to backport this rebased version is obviously not the right approach: Pavel's original patch would be much more suitable for that.
Could we have annotated this patch in a better way to make this more obvious?
Yes, given that there was no annotation on the patch at all to let us know this :)
You can say things like "do not apply to stable trees" or "needs total rework for older kernels" or other fun such things that when we read them, we know to ask for help. As it is, the patch provided nothing so we guessed and got it wrong...
I will send the patch for stable trees with the commit id included as requested.
Wonderful, thank you so much.
greg k-h
From: Pavel Tatashin pasha.tatashin@soleen.com
[ Upstream commit ee7febce051945be28ad86d16a15886f878204de ]
Memory hotplug may fail on systems with CONFIG_RANDOMIZE_BASE because the linear map range is not checked correctly.
The start physical address that linear map covers can be actually at the end of the range because of randomization. Check that and if so reduce it to 0.
This can be verified on QEMU with setting kaslr-seed to ~0ul:
memstart_offset_seed = 0xffff START: __pa(_PAGE_OFFSET(vabits_actual)) = ffff9000c0000000 END: __pa(PAGE_END - 1) = 1000bfffffff
Signed-off-by: Pavel Tatashin pasha.tatashin@soleen.com Fixes: 58284a901b42 ("arm64/mm: Validate hotplug range before creating linear mapping") Tested-by: Tyler Hicks tyhicks@linux.microsoft.com Reviewed-by: Anshuman Khandual anshuman.khandual@arm.com Link: https://lore.kernel.org/r/20210216150351.129018-2-pasha.tatashin@soleen.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/mm/mmu.c | 21 +++++++++++++++++++-- 1 file changed, 19 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 92b3be127796..3ca02e917598 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -1446,6 +1446,22 @@ static void __remove_pgd_mapping(pgd_t *pgdir, unsigned long start, u64 size) struct range arch_get_mappable_range(void) { struct range mhp_range; + u64 start_linear_pa = __pa(_PAGE_OFFSET(vabits_actual)); + u64 end_linear_pa = __pa(PAGE_END - 1); + + if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) { + /* + * Check for a wrap, it is possible because of randomized linear + * mapping the start physical address is actually bigger than + * the end physical address. In this case set start to zero + * because [0, end_linear_pa] range must still be able to cover + * all addressable physical addresses. + */ + if (start_linear_pa > end_linear_pa) + start_linear_pa = 0; + } + + WARN_ON(start_linear_pa > end_linear_pa);
/* * Linear mapping region is the range [PAGE_OFFSET..(PAGE_END - 1)] @@ -1453,8 +1469,9 @@ struct range arch_get_mappable_range(void) * range which can be mapped inside this linear mapping range, must * also be derived from its end points. */ - mhp_range.start = __pa(_PAGE_OFFSET(vabits_actual)); - mhp_range.end = __pa(PAGE_END - 1); + mhp_range.start = start_linear_pa; + mhp_range.end = end_linear_pa; + return mhp_range; }
From: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com
[ Upstream commit 2d669ceb69c276f7637cf760287ca4187add082e ]
Commit 24f6b6036c9e ("dm table: fix zoned iterate_devices based device capability checks") triggered dm table load failure when dm-zoned device is set up for zoned block devices and a regular device for cache.
The commit inverted logic of two callback functions for iterate_devices: device_is_zoned_model() and device_matches_zone_sectors(). The logic of device_is_zoned_model() was inverted then all destination devices of all targets in dm table are required to have the expected zoned model. This is fine for dm-linear, dm-flakey and dm-crypt on zoned block devices since each target has only one destination device. However, this results in failure for dm-zoned with regular cache device since that target has both regular block device and zoned block devices.
As for device_matches_zone_sectors(), the commit inverted the logic to require all zoned block devices in each target have the specified zone_sectors. This check also fails for regular block device which does not have zones.
To avoid the check failures, fix the zone model check and the zone sectors check. For zone model check, introduce the new feature flag DM_TARGET_MIXED_ZONED_MODEL, and set it to dm-zoned target. When the target has this flag, allow it to have destination devices with any zoned model. For zone sectors check, skip the check if the destination device is not a zoned block device. Also add comments and improve an error message to clarify expectations to the two checks.
Fixes: 24f6b6036c9e ("dm table: fix zoned iterate_devices based device capability checks") Signed-off-by: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com Signed-off-by: Damien Le Moal damien.lemoal@wdc.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/dm-table.c | 33 +++++++++++++++++++++++++-------- drivers/md/dm-zoned-target.c | 2 +- include/linux/device-mapper.h | 15 ++++++++++++++- 3 files changed, 40 insertions(+), 10 deletions(-)
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c index 77086db8b920..7291fd3106ff 100644 --- a/drivers/md/dm-table.c +++ b/drivers/md/dm-table.c @@ -1380,6 +1380,13 @@ static int device_not_zoned_model(struct dm_target *ti, struct dm_dev *dev, return !q || blk_queue_zoned_model(q) != *zoned_model; }
+/* + * Check the device zoned model based on the target feature flag. If the target + * has the DM_TARGET_ZONED_HM feature flag set, host-managed zoned devices are + * also accepted but all devices must have the same zoned model. If the target + * has the DM_TARGET_MIXED_ZONED_MODEL feature set, the devices can have any + * zoned model with all zoned devices having the same zone size. + */ static bool dm_table_supports_zoned_model(struct dm_table *t, enum blk_zoned_model zoned_model) { @@ -1389,13 +1396,15 @@ static bool dm_table_supports_zoned_model(struct dm_table *t, for (i = 0; i < dm_table_get_num_targets(t); i++) { ti = dm_table_get_target(t, i);
- if (zoned_model == BLK_ZONED_HM && - !dm_target_supports_zoned_hm(ti->type)) - return false; - - if (!ti->type->iterate_devices || - ti->type->iterate_devices(ti, device_not_zoned_model, &zoned_model)) - return false; + if (dm_target_supports_zoned_hm(ti->type)) { + if (!ti->type->iterate_devices || + ti->type->iterate_devices(ti, device_not_zoned_model, + &zoned_model)) + return false; + } else if (!dm_target_supports_mixed_zoned_model(ti->type)) { + if (zoned_model == BLK_ZONED_HM) + return false; + } }
return true; @@ -1407,9 +1416,17 @@ static int device_not_matches_zone_sectors(struct dm_target *ti, struct dm_dev * struct request_queue *q = bdev_get_queue(dev->bdev); unsigned int *zone_sectors = data;
+ if (!blk_queue_is_zoned(q)) + return 0; + return !q || blk_queue_zone_sectors(q) != *zone_sectors; }
+/* + * Check consistency of zoned model and zone sectors across all targets. For + * zone sectors, if the destination device is a zoned block device, it shall + * have the specified zone_sectors. + */ static int validate_hardware_zoned_model(struct dm_table *table, enum blk_zoned_model zoned_model, unsigned int zone_sectors) @@ -1428,7 +1445,7 @@ static int validate_hardware_zoned_model(struct dm_table *table, return -EINVAL;
if (dm_table_any_dev_attr(table, device_not_matches_zone_sectors, &zone_sectors)) { - DMERR("%s: zone sectors is not consistent across all devices", + DMERR("%s: zone sectors is not consistent across all zoned devices", dm_device_name(table->md)); return -EINVAL; } diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c index 697f9de37355..7e88df64d197 100644 --- a/drivers/md/dm-zoned-target.c +++ b/drivers/md/dm-zoned-target.c @@ -1143,7 +1143,7 @@ static int dmz_message(struct dm_target *ti, unsigned int argc, char **argv, static struct target_type dmz_type = { .name = "zoned", .version = {2, 0, 0}, - .features = DM_TARGET_SINGLETON | DM_TARGET_ZONED_HM, + .features = DM_TARGET_SINGLETON | DM_TARGET_MIXED_ZONED_MODEL, .module = THIS_MODULE, .ctr = dmz_ctr, .dtr = dmz_dtr, diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h index d2d7f9b6a276..50cc070cb1f7 100644 --- a/include/linux/device-mapper.h +++ b/include/linux/device-mapper.h @@ -246,7 +246,11 @@ struct target_type { #define dm_target_passes_integrity(type) ((type)->features & DM_TARGET_PASSES_INTEGRITY)
/* - * Indicates that a target supports host-managed zoned block devices. + * Indicates support for zoned block devices: + * - DM_TARGET_ZONED_HM: the target also supports host-managed zoned + * block devices but does not support combining different zoned models. + * - DM_TARGET_MIXED_ZONED_MODEL: the target supports combining multiple + * devices with different zoned models. */ #define DM_TARGET_ZONED_HM 0x00000040 #define dm_target_supports_zoned_hm(type) ((type)->features & DM_TARGET_ZONED_HM) @@ -257,6 +261,15 @@ struct target_type { #define DM_TARGET_NOWAIT 0x00000080 #define dm_target_supports_nowait(type) ((type)->features & DM_TARGET_NOWAIT)
+#ifdef CONFIG_BLK_DEV_ZONED +#define DM_TARGET_MIXED_ZONED_MODEL 0x00000200 +#define dm_target_supports_mixed_zoned_model(type) \ + ((type)->features & DM_TARGET_MIXED_ZONED_MODEL) +#else +#define DM_TARGET_MIXED_ZONED_MODEL 0x00000000 +#define dm_target_supports_mixed_zoned_model(type) (false) +#endif + struct dm_target { struct dm_table *table; struct target_type *type;
From: Sean Christopherson seanjc@google.com
[ Upstream commit c2655835fd8cabdfe7dab737253de3ffb88da126 ]
If one or more notifiers fails .invalidate_range_start(), invoke .invalidate_range_end() for "all" notifiers. If there are multiple notifiers, those that did not fail are expecting _start() and _end() to be paired, e.g. KVM's mmu_notifier_count would become imbalanced. Disallow notifiers that can fail _start() from implementing _end() so that it's unnecessary to either track which notifiers rejected _start(), or had already succeeded prior to a failed _start().
Note, the existing behavior of calling _start() on all notifiers even after a previous notifier failed _start() was an unintented "feature". Make it canon now that the behavior is depended on for correctness.
As of today, the bug is likely benign:
1. The only caller of the non-blocking notifier is OOM kill. 2. The only notifiers that can fail _start() are the i915 and Nouveau drivers. 3. The only notifiers that utilize _end() are the SGI UV GRU driver and KVM. 4. The GRU driver will never coincide with the i195/Nouveau drivers. 5. An imbalanced kvm->mmu_notifier_count only causes soft lockup in the _guest_, and the guest is already doomed due to being an OOM victim.
Fix the bug now to play nice with future usage, e.g. KVM has a potential use case for blocking memslot updates in KVM while an invalidation is in-progress, and failure to unblock would result in said updates being blocked indefinitely and hanging.
Found by inspection. Verified by adding a second notifier in KVM that periodically returns -EAGAIN on non-blockable ranges, triggering OOM, and observing that KVM exits with an elevated notifier count.
Link: https://lkml.kernel.org/r/20210311180057.1582638-1-seanjc@google.com Fixes: 93065ac753e4 ("mm, oom: distinguish blockable mode for mmu notifiers") Signed-off-by: Sean Christopherson seanjc@google.com Suggested-by: Jason Gunthorpe jgg@ziepe.ca Reviewed-by: Jason Gunthorpe jgg@nvidia.com Cc: David Rientjes rientjes@google.com Cc: Ben Gardon bgardon@google.com Cc: Michal Hocko mhocko@suse.com Cc: "Jérôme Glisse" jglisse@redhat.com Cc: Andrea Arcangeli aarcange@redhat.com Cc: Johannes Weiner hannes@cmpxchg.org Cc: Dimitri Sivanich dimitri.sivanich@hpe.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/mmu_notifier.h | 10 +++++----- mm/mmu_notifier.c | 23 +++++++++++++++++++++++ 2 files changed, 28 insertions(+), 5 deletions(-)
diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h index b8200782dede..1a6a9eb6d3fa 100644 --- a/include/linux/mmu_notifier.h +++ b/include/linux/mmu_notifier.h @@ -169,11 +169,11 @@ struct mmu_notifier_ops { * the last refcount is dropped. * * If blockable argument is set to false then the callback cannot - * sleep and has to return with -EAGAIN. 0 should be returned - * otherwise. Please note that if invalidate_range_start approves - * a non-blocking behavior then the same applies to - * invalidate_range_end. - * + * sleep and has to return with -EAGAIN if sleeping would be required. + * 0 should be returned otherwise. Please note that notifiers that can + * fail invalidate_range_start are not allowed to implement + * invalidate_range_end, as there is no mechanism for informing the + * notifier that its start failed. */ int (*invalidate_range_start)(struct mmu_notifier *subscription, const struct mmu_notifier_range *range); diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c index 61ee40ed804e..459d195d2ff6 100644 --- a/mm/mmu_notifier.c +++ b/mm/mmu_notifier.c @@ -501,10 +501,33 @@ static int mn_hlist_invalidate_range_start( ""); WARN_ON(mmu_notifier_range_blockable(range) || _ret != -EAGAIN); + /* + * We call all the notifiers on any EAGAIN, + * there is no way for a notifier to know if + * its start method failed, thus a start that + * does EAGAIN can't also do end. + */ + WARN_ON(ops->invalidate_range_end); ret = _ret; } } } + + if (ret) { + /* + * Must be non-blocking to get here. If there are multiple + * notifiers and one or more failed start, any that succeeded + * start are expecting their end to be called. Do so now. + */ + hlist_for_each_entry_rcu(subscription, &subscriptions->list, + hlist, srcu_read_lock_held(&srcu)) { + if (!subscription->ops->invalidate_range_end) + continue; + + subscription->ops->invalidate_range_end(subscription, + range); + } + } srcu_read_unlock(&srcu, id);
return ret;
From: Mark Tomlinson mark.tomlinson@alliedtelesis.co.nz
[ Upstream commit abe7034b9a8d57737e80cc16d60ed3666990bdbf ]
This reverts commit 443d6e86f821a165fae3fc3fc13086d27ac140b1.
This (and the following) patch basically re-implemented the RCU mechanisms of patch 784544739a25. That patch was replaced because of the performance problems that it created when replacing tables. Now, we have the same issue: the call to synchronize_rcu() makes replacing tables slower by as much as an order of magnitude.
Revert these patches and fix the issue in a different way.
Signed-off-by: Mark Tomlinson mark.tomlinson@alliedtelesis.co.nz Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/netfilter/arp_tables.c | 2 +- net/ipv4/netfilter/ip_tables.c | 2 +- net/ipv6/netfilter/ip6_tables.c | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c index 04a2010755a6..d1e04d2b5170 100644 --- a/net/ipv4/netfilter/arp_tables.c +++ b/net/ipv4/netfilter/arp_tables.c @@ -1379,7 +1379,7 @@ static int compat_get_entries(struct net *net, xt_compat_lock(NFPROTO_ARP); t = xt_find_table_lock(net, NFPROTO_ARP, get.name); if (!IS_ERR(t)) { - const struct xt_table_info *private = xt_table_get_private_protected(t); + const struct xt_table_info *private = t->private; struct xt_table_info info;
ret = compat_table_info(private, &info); diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c index a5b63f92b7f3..f15bc21d7301 100644 --- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -1589,7 +1589,7 @@ compat_get_entries(struct net *net, struct compat_ipt_get_entries __user *uptr, xt_compat_lock(AF_INET); t = xt_find_table_lock(net, AF_INET, get.name); if (!IS_ERR(t)) { - const struct xt_table_info *private = xt_table_get_private_protected(t); + const struct xt_table_info *private = t->private; struct xt_table_info info; ret = compat_table_info(private, &info); if (!ret && get.size == info.size) diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c index 81c042940b21..2e2119bfcf13 100644 --- a/net/ipv6/netfilter/ip6_tables.c +++ b/net/ipv6/netfilter/ip6_tables.c @@ -1598,7 +1598,7 @@ compat_get_entries(struct net *net, struct compat_ip6t_get_entries __user *uptr, xt_compat_lock(AF_INET6); t = xt_find_table_lock(net, AF_INET6, get.name); if (!IS_ERR(t)) { - const struct xt_table_info *private = xt_table_get_private_protected(t); + const struct xt_table_info *private = t->private; struct xt_table_info info; ret = compat_table_info(private, &info); if (!ret && get.size == info.size)
From: Rafael J. Wysocki rafael.j.wysocki@intel.com
[ Upstream commit c1013ff7a5472db637c56bb6237f8343398c03a7 ]
The upfront allocation of new_bus_id is done to avoid allocating memory under acpi_device_lock, but it doesn't really help, because (1) it leads to many unnecessary memory allocations for _ADR devices, (2) kstrdup_const() is run under that lock anyway and (3) it complicates the code.
Rearrange acpi_device_add() to allocate memory for a new struct acpi_device_bus_id instance only when necessary, eliminate a redundant local variable from it and reduce the number of labels in there.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/scan.c | 57 +++++++++++++++++++++------------------------ 1 file changed, 26 insertions(+), 31 deletions(-)
diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c index 22566b4b3150..1efeb8f78ae4 100644 --- a/drivers/acpi/scan.c +++ b/drivers/acpi/scan.c @@ -623,12 +623,23 @@ void acpi_bus_put_acpi_device(struct acpi_device *adev) put_device(&adev->dev); }
+static struct acpi_device_bus_id *acpi_device_bus_id_match(const char *dev_id) +{ + struct acpi_device_bus_id *acpi_device_bus_id; + + /* Find suitable bus_id and instance number in acpi_bus_id_list. */ + list_for_each_entry(acpi_device_bus_id, &acpi_bus_id_list, node) { + if (!strcmp(acpi_device_bus_id->bus_id, dev_id)) + return acpi_device_bus_id; + } + return NULL; +} + int acpi_device_add(struct acpi_device *device, void (*release)(struct device *)) { + struct acpi_device_bus_id *acpi_device_bus_id; int result; - struct acpi_device_bus_id *acpi_device_bus_id, *new_bus_id; - int found = 0;
if (device->handle) { acpi_status status; @@ -654,38 +665,26 @@ int acpi_device_add(struct acpi_device *device, INIT_LIST_HEAD(&device->del_list); mutex_init(&device->physical_node_lock);
- new_bus_id = kzalloc(sizeof(struct acpi_device_bus_id), GFP_KERNEL); - if (!new_bus_id) { - pr_err(PREFIX "Memory allocation error\n"); - result = -ENOMEM; - goto err_detach; - } - mutex_lock(&acpi_device_lock); - /* - * Find suitable bus_id and instance number in acpi_bus_id_list - * If failed, create one and link it into acpi_bus_id_list - */ - list_for_each_entry(acpi_device_bus_id, &acpi_bus_id_list, node) { - if (!strcmp(acpi_device_bus_id->bus_id, - acpi_device_hid(device))) { - acpi_device_bus_id->instance_no++; - found = 1; - kfree(new_bus_id); - break; + + acpi_device_bus_id = acpi_device_bus_id_match(acpi_device_hid(device)); + if (acpi_device_bus_id) { + acpi_device_bus_id->instance_no++; + } else { + acpi_device_bus_id = kzalloc(sizeof(*acpi_device_bus_id), + GFP_KERNEL); + if (!acpi_device_bus_id) { + result = -ENOMEM; + goto err_unlock; } - } - if (!found) { - acpi_device_bus_id = new_bus_id; acpi_device_bus_id->bus_id = kstrdup_const(acpi_device_hid(device), GFP_KERNEL); if (!acpi_device_bus_id->bus_id) { - pr_err(PREFIX "Memory allocation error for bus id\n"); + kfree(acpi_device_bus_id); result = -ENOMEM; - goto err_free_new_bus_id; + goto err_unlock; }
- acpi_device_bus_id->instance_no = 0; list_add_tail(&acpi_device_bus_id->node, &acpi_bus_id_list); } dev_set_name(&device->dev, "%s:%02x", acpi_device_bus_id->bus_id, acpi_device_bus_id->instance_no); @@ -720,13 +719,9 @@ int acpi_device_add(struct acpi_device *device, list_del(&device->node); list_del(&device->wakeup_list);
- err_free_new_bus_id: - if (!found) - kfree(new_bus_id); - + err_unlock: mutex_unlock(&acpi_device_lock);
- err_detach: acpi_detach_data(device->handle, acpi_scan_drop_device); return result; }
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
[ Upstream commit eb50aaf960e3bedfef79063411ffd670da94b84b ]
The decrementation of acpi_device_bus_id->instance_no in acpi_device_del() is incorrect, because it may cause a duplicate instance number to be allocated next time a device with the same acpi_device_bus_id is added.
Replace above mentioned approach by using IDA framework.
While at it, define the instance range to be [0, 4096).
Fixes: e49bd2dd5a50 ("ACPI: use PNPID:instance_no as bus_id of ACPI device") Fixes: ca9dc8d42b30 ("ACPI / scan: Fix acpi_bus_id_list bookkeeping") Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Cc: 4.10+ stable@vger.kernel.org # 4.10+ Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/internal.h | 6 +++++- drivers/acpi/scan.c | 33 ++++++++++++++++++++++++++++----- include/acpi/acpi_bus.h | 1 + 3 files changed, 34 insertions(+), 6 deletions(-)
diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h index e6a5d997241c..cb8f70842249 100644 --- a/drivers/acpi/internal.h +++ b/drivers/acpi/internal.h @@ -9,6 +9,8 @@ #ifndef _ACPI_INTERNAL_H_ #define _ACPI_INTERNAL_H_
+#include <linux/idr.h> + #define PREFIX "ACPI: "
int early_acpi_osi_init(void); @@ -96,9 +98,11 @@ void acpi_scan_table_handler(u32 event, void *table, void *context);
extern struct list_head acpi_bus_id_list;
+#define ACPI_MAX_DEVICE_INSTANCES 4096 + struct acpi_device_bus_id { const char *bus_id; - unsigned int instance_no; + struct ida instance_ida; struct list_head node; };
diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c index 1efeb8f78ae4..a4fdf61b0644 100644 --- a/drivers/acpi/scan.c +++ b/drivers/acpi/scan.c @@ -482,9 +482,8 @@ static void acpi_device_del(struct acpi_device *device) list_for_each_entry(acpi_device_bus_id, &acpi_bus_id_list, node) if (!strcmp(acpi_device_bus_id->bus_id, acpi_device_hid(device))) { - if (acpi_device_bus_id->instance_no > 0) - acpi_device_bus_id->instance_no--; - else { + ida_simple_remove(&acpi_device_bus_id->instance_ida, device->pnp.instance_no); + if (ida_is_empty(&acpi_device_bus_id->instance_ida)) { list_del(&acpi_device_bus_id->node); kfree_const(acpi_device_bus_id->bus_id); kfree(acpi_device_bus_id); @@ -635,6 +634,21 @@ static struct acpi_device_bus_id *acpi_device_bus_id_match(const char *dev_id) return NULL; }
+static int acpi_device_set_name(struct acpi_device *device, + struct acpi_device_bus_id *acpi_device_bus_id) +{ + struct ida *instance_ida = &acpi_device_bus_id->instance_ida; + int result; + + result = ida_simple_get(instance_ida, 0, ACPI_MAX_DEVICE_INSTANCES, GFP_KERNEL); + if (result < 0) + return result; + + device->pnp.instance_no = result; + dev_set_name(&device->dev, "%s:%02x", acpi_device_bus_id->bus_id, result); + return 0; +} + int acpi_device_add(struct acpi_device *device, void (*release)(struct device *)) { @@ -669,7 +683,9 @@ int acpi_device_add(struct acpi_device *device,
acpi_device_bus_id = acpi_device_bus_id_match(acpi_device_hid(device)); if (acpi_device_bus_id) { - acpi_device_bus_id->instance_no++; + result = acpi_device_set_name(device, acpi_device_bus_id); + if (result) + goto err_unlock; } else { acpi_device_bus_id = kzalloc(sizeof(*acpi_device_bus_id), GFP_KERNEL); @@ -685,9 +701,16 @@ int acpi_device_add(struct acpi_device *device, goto err_unlock; }
+ ida_init(&acpi_device_bus_id->instance_ida); + + result = acpi_device_set_name(device, acpi_device_bus_id); + if (result) { + kfree(acpi_device_bus_id); + goto err_unlock; + } + list_add_tail(&acpi_device_bus_id->node, &acpi_bus_id_list); } - dev_set_name(&device->dev, "%s:%02x", acpi_device_bus_id->bus_id, acpi_device_bus_id->instance_no);
if (device->parent) list_add_tail(&device->node, &device->parent->children); diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h index 6d1879bf9440..37dac195adbb 100644 --- a/include/acpi/acpi_bus.h +++ b/include/acpi/acpi_bus.h @@ -233,6 +233,7 @@ struct acpi_pnp_type {
struct acpi_device_pnp { acpi_bus_id bus_id; /* Object name */ + int instance_no; /* Instance number of this object */ struct acpi_pnp_type type; /* ID type */ acpi_bus_address bus_address; /* _ADR */ char *unique_id; /* _UID */
From: Adrian Hunter adrian.hunter@intel.com
[ Upstream commit b410ed2a8572d41c68bd9208555610e4b07d0703 ]
The only requirement of an auxtrace queue is that the buffers are in time order. That is achieved by making separate queues for separate perf buffer or AUX area buffer mmaps.
That generally means a separate queue per cpu for per-cpu contexts, and a separate queue per thread for per-task contexts.
When buffers are added to a queue, perf checks that the buffer cpu and thread id (tid) match the queue cpu and thread id.
However, generally, that need not be true, and perf will queue buffers correctly anyway, so the check is not needed.
In addition, the check gets erroneously hit when using sample mode to trace multiple threads.
Consequently, fix that case by removing the check.
Fixes: e502789302a6 ("perf auxtrace: Add helpers for queuing AUX area tracing data") Reported-by: Andi Kleen ak@linux.intel.com Signed-off-by: Adrian Hunter adrian.hunter@intel.com Reviewed-by: Andi Kleen ak@linux.intel.com Cc: Jiri Olsa jolsa@redhat.com Link: http://lore.kernel.org/lkml/20210308151143.18338-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/auxtrace.c | 4 ---- 1 file changed, 4 deletions(-)
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c index a60878498139..2723082f3817 100644 --- a/tools/perf/util/auxtrace.c +++ b/tools/perf/util/auxtrace.c @@ -298,10 +298,6 @@ static int auxtrace_queues__queue_buffer(struct auxtrace_queues *queues, queue->set = true; queue->tid = buffer->tid; queue->cpu = buffer->cpu; - } else if (buffer->cpu != queue->cpu || buffer->tid != queue->tid) { - pr_err("auxtrace queue conflict: cpu %d, tid %d vs cpu %d, tid %d\n", - queue->cpu, queue->tid, buffer->cpu, buffer->tid); - return -EINVAL; }
buffer->buffer_nr = queues->next_buffer_nr++;
From: Ian Rogers irogers@google.com
[ Upstream commit 2a76f6de07906f0bb5f2a13fb02845db1695cc29 ]
Account for alignment bytes in the zero-ing memset.
Fixes: 1a853e36871b533c ("perf record: Allow specifying a pid to record") Signed-off-by: Ian Rogers irogers@google.com Acked-by: Jiri Olsa jolsa@redhat.com Cc: Ingo Molnar mingo@kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Link: http://lore.kernel.org/lkml/20210309234945.419254-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/synthetic-events.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c index 2947e3f3c6d9..dda0a6a3173d 100644 --- a/tools/perf/util/synthetic-events.c +++ b/tools/perf/util/synthetic-events.c @@ -384,7 +384,7 @@ int perf_event__synthesize_mmap_events(struct perf_tool *tool,
while (!io.eof) { static const char anonstr[] = "//anon"; - size_t size; + size_t size, aligned_size;
/* ensure null termination since stack will be reused. */ event->mmap2.filename[0] = '\0'; @@ -444,11 +444,12 @@ int perf_event__synthesize_mmap_events(struct perf_tool *tool, }
size = strlen(event->mmap2.filename) + 1; - size = PERF_ALIGN(size, sizeof(u64)); + aligned_size = PERF_ALIGN(size, sizeof(u64)); event->mmap2.len -= event->mmap.start; event->mmap2.header.size = (sizeof(event->mmap2) - - (sizeof(event->mmap2.filename) - size)); - memset(event->mmap2.filename + size, 0, machine->id_hdr_size); + (sizeof(event->mmap2.filename) - aligned_size)); + memset(event->mmap2.filename + size, 0, machine->id_hdr_size + + (aligned_size - size)); event->mmap2.header.size += machine->id_hdr_size; event->mmap2.pid = tgid; event->mmap2.tid = pid;
From: Pavel Begunkov asml.silence@gmail.com
[ Upstream commit d81269fecb8ce16eb07efafc9ff5520b2a31c486 ]
io_provide_buffers_prep()'s "p->len * p->nbufs" to sign extension problems. Not a huge problem as it's only used for access_ok() and increases the checked length, but better to keep typing right.
Reported-by: Colin Ian King colin.king@canonical.com Fixes: efe68c1ca8f49 ("io_uring: validate the full range of provided buffers for access") Signed-off-by: Pavel Begunkov asml.silence@gmail.com Reviewed-by: Colin Ian King colin.king@canonical.com Link: https://lore.kernel.org/r/562376a39509e260d8532186a06226e56eb1f594.161614923... Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/io_uring.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c index c3cfaa367138..5c4378694d54 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -4214,6 +4214,7 @@ static int io_remove_buffers(struct io_kiocb *req, bool force_nonblock, static int io_provide_buffers_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) { + unsigned long size; struct io_provide_buf *p = &req->pbuf; u64 tmp;
@@ -4227,7 +4228,8 @@ static int io_provide_buffers_prep(struct io_kiocb *req, p->addr = READ_ONCE(sqe->addr); p->len = READ_ONCE(sqe->len);
- if (!access_ok(u64_to_user_ptr(p->addr), (p->len * p->nbufs))) + size = (unsigned long)p->len * p->nbufs; + if (!access_ok(u64_to_user_ptr(p->addr), size)) return -EFAULT;
p->bgid = READ_ONCE(sqe->buf_group);
From: David Jeffery djeffery@redhat.com
[ Upstream commit a958937ff166fc60d1c3a721036f6ff41bfa2821 ]
When a stacked block device inserts a request into another block device using blk_insert_cloned_request, the request's nr_phys_segments field gets recalculated by a call to blk_recalc_rq_segments in blk_cloned_rq_check_limits. But blk_recalc_rq_segments does not know how to handle multi-segment discards. For disk types which can handle multi-segment discards like nvme, this results in discard requests which claim a single segment when it should report several, triggering a warning in nvme and causing nvme to fail the discard from the invalid state.
WARNING: CPU: 5 PID: 191 at drivers/nvme/host/core.c:700 nvme_setup_discard+0x170/0x1e0 [nvme_core] ... nvme_setup_cmd+0x217/0x270 [nvme_core] nvme_loop_queue_rq+0x51/0x1b0 [nvme_loop] __blk_mq_try_issue_directly+0xe7/0x1b0 blk_mq_request_issue_directly+0x41/0x70 ? blk_account_io_start+0x40/0x50 dm_mq_queue_rq+0x200/0x3e0 blk_mq_dispatch_rq_list+0x10a/0x7d0 ? __sbitmap_queue_get+0x25/0x90 ? elv_rb_del+0x1f/0x30 ? deadline_remove_request+0x55/0xb0 ? dd_dispatch_request+0x181/0x210 __blk_mq_do_dispatch_sched+0x144/0x290 ? bio_attempt_discard_merge+0x134/0x1f0 __blk_mq_sched_dispatch_requests+0x129/0x180 blk_mq_sched_dispatch_requests+0x30/0x60 __blk_mq_run_hw_queue+0x47/0xe0 __blk_mq_delay_run_hw_queue+0x15b/0x170 blk_mq_sched_insert_requests+0x68/0xe0 blk_mq_flush_plug_list+0xf0/0x170 blk_finish_plug+0x36/0x50 xlog_cil_committed+0x19f/0x290 [xfs] xlog_cil_process_committed+0x57/0x80 [xfs] xlog_state_do_callback+0x1e0/0x2a0 [xfs] xlog_ioend_work+0x2f/0x80 [xfs] process_one_work+0x1b6/0x350 worker_thread+0x53/0x3e0 ? process_one_work+0x350/0x350 kthread+0x11b/0x140 ? __kthread_bind_mask+0x60/0x60 ret_from_fork+0x22/0x30
This patch fixes blk_recalc_rq_segments to be aware of devices which can have multi-segment discards. It calculates the correct discard segment count by counting the number of bio as each discard bio is considered its own segment.
Fixes: 1e739730c5b9 ("block: optionally merge discontiguous discard bios into a single request") Signed-off-by: David Jeffery djeffery@redhat.com Reviewed-by: Ming Lei ming.lei@redhat.com Reviewed-by: Laurence Oberman loberman@redhat.com Link: https://lore.kernel.org/r/20210211143807.GA115624@redhat Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-merge.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/block/blk-merge.c b/block/blk-merge.c index 808768f6b174..756473295f19 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -383,6 +383,14 @@ unsigned int blk_recalc_rq_segments(struct request *rq) switch (bio_op(rq->bio)) { case REQ_OP_DISCARD: case REQ_OP_SECURE_ERASE: + if (queue_max_discard_segments(rq->q) > 1) { + struct bio *bio = rq->bio; + + for_each_bio(bio) + nr_phys_segs++; + return nr_phys_segs; + } + return 1; case REQ_OP_WRITE_ZEROES: return 0; case REQ_OP_WRITE_SAME:
From: Bart Van Assche bvanassche@acm.org
[ Upstream commit 39c0c8553bfb5a3d108aa47f1256076d507605e3 ]
Calling vha->hw->tgt.tgt_ops->free_cmd() from qlt_xmit_response() is wrong since the command for which a response is sent must remain valid until the SCSI target core calls .release_cmd(). It has been observed that the following scenario triggers a kernel crash:
- qlt_xmit_response() calls qlt_check_reserve_free_req()
- qlt_check_reserve_free_req() returns -EAGAIN
- qlt_xmit_response() calls vha->hw->tgt.tgt_ops->free_cmd(cmd)
- transport_handle_queue_full() tries to retransmit the response
Fix this crash by reverting the patch that introduced it.
Link: https://lore.kernel.org/r/20210320232359.941-2-bvanassche@acm.org Fixes: 0dcec41acb85 ("scsi: qla2xxx: Make sure that aborted commands are freed") Cc: Quinn Tran qutran@marvell.com Cc: Mike Christie michael.christie@oracle.com Reviewed-by: Daniel Wagner dwagner@suse.de Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com Signed-off-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_target.c | 13 +++++-------- drivers/scsi/qla2xxx/tcm_qla2xxx.c | 4 ---- 2 files changed, 5 insertions(+), 12 deletions(-)
diff --git a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c index 0d09480b66cd..17f541030f5b 100644 --- a/drivers/scsi/qla2xxx/qla_target.c +++ b/drivers/scsi/qla2xxx/qla_target.c @@ -3223,8 +3223,7 @@ int qlt_xmit_response(struct qla_tgt_cmd *cmd, int xmit_type, if (!qpair->fw_started || (cmd->reset_count != qpair->chip_reset) || (cmd->sess && cmd->sess->deleted)) { cmd->state = QLA_TGT_STATE_PROCESSED; - res = 0; - goto free; + return 0; }
ql_dbg_qp(ql_dbg_tgt, qpair, 0xe018, @@ -3235,8 +3234,9 @@ int qlt_xmit_response(struct qla_tgt_cmd *cmd, int xmit_type,
res = qlt_pre_xmit_response(cmd, &prm, xmit_type, scsi_status, &full_req_cnt); - if (unlikely(res != 0)) - goto free; + if (unlikely(res != 0)) { + return res; + }
spin_lock_irqsave(qpair->qp_lock_ptr, flags);
@@ -3256,8 +3256,7 @@ int qlt_xmit_response(struct qla_tgt_cmd *cmd, int xmit_type, vha->flags.online, qla2x00_reset_active(vha), cmd->reset_count, qpair->chip_reset); spin_unlock_irqrestore(qpair->qp_lock_ptr, flags); - res = 0; - goto free; + return 0; }
/* Does F/W have an IOCBs for this request */ @@ -3360,8 +3359,6 @@ int qlt_xmit_response(struct qla_tgt_cmd *cmd, int xmit_type, qlt_unmap_sg(vha, cmd); spin_unlock_irqrestore(qpair->qp_lock_ptr, flags);
-free: - vha->hw->tgt.tgt_ops->free_cmd(cmd); return res; } EXPORT_SYMBOL(qlt_xmit_response); diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c b/drivers/scsi/qla2xxx/tcm_qla2xxx.c index b55fc768a2a7..8b4890cdd4ca 100644 --- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c +++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c @@ -644,7 +644,6 @@ static int tcm_qla2xxx_queue_data_in(struct se_cmd *se_cmd) { struct qla_tgt_cmd *cmd = container_of(se_cmd, struct qla_tgt_cmd, se_cmd); - struct scsi_qla_host *vha = cmd->vha;
if (cmd->aborted) { /* Cmd can loop during Q-full. tcm_qla2xxx_aborted_task @@ -657,7 +656,6 @@ static int tcm_qla2xxx_queue_data_in(struct se_cmd *se_cmd) cmd->se_cmd.transport_state, cmd->se_cmd.t_state, cmd->se_cmd.se_cmd_flags); - vha->hw->tgt.tgt_ops->free_cmd(cmd); return 0; }
@@ -685,7 +683,6 @@ static int tcm_qla2xxx_queue_status(struct se_cmd *se_cmd) { struct qla_tgt_cmd *cmd = container_of(se_cmd, struct qla_tgt_cmd, se_cmd); - struct scsi_qla_host *vha = cmd->vha; int xmit_type = QLA_TGT_XMIT_STATUS;
if (cmd->aborted) { @@ -699,7 +696,6 @@ static int tcm_qla2xxx_queue_status(struct se_cmd *se_cmd) cmd, kref_read(&cmd->se_cmd.cmd_kref), cmd->se_cmd.transport_state, cmd->se_cmd.t_state, cmd->se_cmd.se_cmd_flags); - vha->hw->tgt.tgt_ops->free_cmd(cmd); return 0; } cmd->bufflen = se_cmd->data_length;
From: Jia-Ju Bai baijiaju1990@gmail.com
[ Upstream commit f69953837ca5d98aa983a138dc0b90a411e9c763 ]
When kzalloc() returns NULL to qedi->global_queues[i], no error return code of qedi_alloc_global_queues() is assigned. To fix this bug, status is assigned with -ENOMEM in this case.
Link: https://lore.kernel.org/r/20210308033024.27147-1-baijiaju1990@gmail.com Fixes: ace7f46ba5fd ("scsi: qedi: Add QLogic FastLinQ offload iSCSI driver framework.") Reported-by: TOTE Robot oslab@tsinghua.edu.cn Acked-by: Manish Rangankar mrangankar@marvell.com Signed-off-by: Jia-Ju Bai baijiaju1990@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qedi/qedi_main.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/scsi/qedi/qedi_main.c b/drivers/scsi/qedi/qedi_main.c index 47ad64b06623..69c5b5ee2169 100644 --- a/drivers/scsi/qedi/qedi_main.c +++ b/drivers/scsi/qedi/qedi_main.c @@ -1675,6 +1675,7 @@ static int qedi_alloc_global_queues(struct qedi_ctx *qedi) if (!qedi->global_queues[i]) { QEDI_ERR(&qedi->dbg_ctx, "Unable to allocation global queue %d.\n", i); + status = -ENOMEM; goto mem_alloc_failure; }
From: Jia-Ju Bai baijiaju1990@gmail.com
[ Upstream commit 3401ecf7fc1b9458a19d42c0e26a228f18ac7dda ]
When kzalloc() returns NULL, no error return code of mpt3sas_base_attach() is assigned. To fix this bug, r is assigned with -ENOMEM in this case.
Link: https://lore.kernel.org/r/20210308035241.3288-1-baijiaju1990@gmail.com Fixes: c696f7b83ede ("scsi: mpt3sas: Implement device_remove_in_progress check in IOCTL path") Reported-by: TOTE Robot oslab@tsinghua.edu.cn Signed-off-by: Jia-Ju Bai baijiaju1990@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/mpt3sas/mpt3sas_base.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c index 6e23dc3209fe..340d435ac0ce 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_base.c +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c @@ -7789,14 +7789,18 @@ mpt3sas_base_attach(struct MPT3SAS_ADAPTER *ioc) ioc->pend_os_device_add_sz++; ioc->pend_os_device_add = kzalloc(ioc->pend_os_device_add_sz, GFP_KERNEL); - if (!ioc->pend_os_device_add) + if (!ioc->pend_os_device_add) { + r = -ENOMEM; goto out_free_resources; + }
ioc->device_remove_in_progress_sz = ioc->pend_os_device_add_sz; ioc->device_remove_in_progress = kzalloc(ioc->device_remove_in_progress_sz, GFP_KERNEL); - if (!ioc->device_remove_in_progress) + if (!ioc->device_remove_in_progress) { + r = -ENOMEM; goto out_free_resources; + }
ioc->fwfault_debug = mpt3sas_fwfault_debug;
From: Steve French stfrench@microsoft.com
commit cfc63fc8126a93cbf95379bc4cad79a7b15b6ece upstream.
There were two problems (one of which could cause data corruption) that were noticed with duplicate extents (ie reflink) when debugging why various xfstests were being incorrectly skipped (e.g. generic/138, generic/140, generic/142). First, we were not updating the file size locally in the cache when extending a file due to reflink (it would refresh after actimeo expires) but xfstest was checking the size immediately which was still 0 so caused the test to be skipped. Second, we were setting the target file size (which could shrink the file) in all cases to the end of the reflinked range rather than only setting the target file size when reflink would extend the file.
CC: stable@vger.kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/cifs/smb2ops.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-)
--- a/fs/cifs/smb2ops.c +++ b/fs/cifs/smb2ops.c @@ -2007,6 +2007,7 @@ smb2_duplicate_extents(const unsigned in { int rc; unsigned int ret_data_len; + struct inode *inode; struct duplicate_extents_to_file dup_ext_buf; struct cifs_tcon *tcon = tlink_tcon(trgtfile->tlink);
@@ -2023,10 +2024,21 @@ smb2_duplicate_extents(const unsigned in cifs_dbg(FYI, "Duplicate extents: src off %lld dst off %lld len %lld\n", src_off, dest_off, len);
- rc = smb2_set_file_size(xid, tcon, trgtfile, dest_off + len, false); - if (rc) - goto duplicate_extents_out; + inode = d_inode(trgtfile->dentry); + if (inode->i_size < dest_off + len) { + rc = smb2_set_file_size(xid, tcon, trgtfile, dest_off + len, false); + if (rc) + goto duplicate_extents_out;
+ /* + * Although also could set plausible allocation size (i_blocks) + * here in addition to setting the file size, in reflink + * it is likely that the target file is sparse. Its allocation + * size will be queried on next revalidate, but it is important + * to make sure that file's cached size is updated immediately + */ + cifs_setsize(inode, dest_off + len); + } rc = SMB2_ioctl(xid, tcon, trgtfile->fid.persistent_fid, trgtfile->fid.volatile_fid, FSCTL_DUPLICATE_EXTENTS_TO_FILE,
From: Shyam Prasad N sprasad@microsoft.com
commit 45a4546c6167a2da348a31ca439d8a8ff773b6ea upstream.
For AES256 encryption (GCM and CCM), we need to adjust the size of a few fields to 32 bytes instead of 16 to accommodate the larger keys.
Also, the L value supplied to the key generator needs to be changed from to 256 when these algorithms are used.
Keeping the ioctl struct for dumping keys of the same size for now. Will send out a different patch for that one.
Signed-off-by: Shyam Prasad N sprasad@microsoft.com Reviewed-by: Ronnie Sahlberg lsahlber@redhat.com CC: stable@vger.kernel.org # v5.10+ Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/cifs/cifsglob.h | 4 ++-- fs/cifs/cifspdu.h | 5 +++++ fs/cifs/smb2glob.h | 1 + fs/cifs/smb2ops.c | 9 +++++---- fs/cifs/smb2transport.c | 37 ++++++++++++++++++++++++++++--------- 5 files changed, 41 insertions(+), 15 deletions(-)
--- a/fs/cifs/cifsglob.h +++ b/fs/cifs/cifsglob.h @@ -915,8 +915,8 @@ struct cifs_ses { bool binding:1; /* are we binding the session? */ __u16 session_flags; __u8 smb3signingkey[SMB3_SIGN_KEY_SIZE]; - __u8 smb3encryptionkey[SMB3_SIGN_KEY_SIZE]; - __u8 smb3decryptionkey[SMB3_SIGN_KEY_SIZE]; + __u8 smb3encryptionkey[SMB3_ENC_DEC_KEY_SIZE]; + __u8 smb3decryptionkey[SMB3_ENC_DEC_KEY_SIZE]; __u8 preauth_sha_hash[SMB2_PREAUTH_HASH_SIZE];
__u8 binding_preauth_sha_hash[SMB2_PREAUTH_HASH_SIZE]; --- a/fs/cifs/cifspdu.h +++ b/fs/cifs/cifspdu.h @@ -147,6 +147,11 @@ */ #define SMB3_SIGN_KEY_SIZE (16)
+/* + * Size of the smb3 encryption/decryption keys + */ +#define SMB3_ENC_DEC_KEY_SIZE (32) + #define CIFS_CLIENT_CHALLENGE_SIZE (8) #define CIFS_SERVER_CHALLENGE_SIZE (8) #define CIFS_HMAC_MD5_HASH_SIZE (16) --- a/fs/cifs/smb2glob.h +++ b/fs/cifs/smb2glob.h @@ -58,6 +58,7 @@ #define SMB2_HMACSHA256_SIZE (32) #define SMB2_CMACAES_SIZE (16) #define SMB3_SIGNKEY_SIZE (16) +#define SMB3_GCM128_CRYPTKEY_SIZE (16) #define SMB3_GCM256_CRYPTKEY_SIZE (32)
/* Maximum buffer size value we can send with 1 credit */ --- a/fs/cifs/smb2ops.c +++ b/fs/cifs/smb2ops.c @@ -4109,7 +4109,7 @@ smb2_get_enc_key(struct TCP_Server_Info if (ses->Suid == ses_id) { ses_enc_key = enc ? ses->smb3encryptionkey : ses->smb3decryptionkey; - memcpy(key, ses_enc_key, SMB3_SIGN_KEY_SIZE); + memcpy(key, ses_enc_key, SMB3_ENC_DEC_KEY_SIZE); spin_unlock(&cifs_tcp_ses_lock); return 0; } @@ -4136,7 +4136,7 @@ crypt_message(struct TCP_Server_Info *se int rc = 0; struct scatterlist *sg; u8 sign[SMB2_SIGNATURE_SIZE] = {}; - u8 key[SMB3_SIGN_KEY_SIZE]; + u8 key[SMB3_ENC_DEC_KEY_SIZE]; struct aead_request *req; char *iv; unsigned int iv_len; @@ -4160,10 +4160,11 @@ crypt_message(struct TCP_Server_Info *se tfm = enc ? server->secmech.ccmaesencrypt : server->secmech.ccmaesdecrypt;
- if (server->cipher_type == SMB2_ENCRYPTION_AES256_GCM) + if ((server->cipher_type == SMB2_ENCRYPTION_AES256_CCM) || + (server->cipher_type == SMB2_ENCRYPTION_AES256_GCM)) rc = crypto_aead_setkey(tfm, key, SMB3_GCM256_CRYPTKEY_SIZE); else - rc = crypto_aead_setkey(tfm, key, SMB3_SIGN_KEY_SIZE); + rc = crypto_aead_setkey(tfm, key, SMB3_GCM128_CRYPTKEY_SIZE);
if (rc) { cifs_server_dbg(VFS, "%s: Failed to set aead key %d\n", __func__, rc); --- a/fs/cifs/smb2transport.c +++ b/fs/cifs/smb2transport.c @@ -298,7 +298,8 @@ static int generate_key(struct cifs_ses { unsigned char zero = 0x0; __u8 i[4] = {0, 0, 0, 1}; - __u8 L[4] = {0, 0, 0, 128}; + __u8 L128[4] = {0, 0, 0, 128}; + __u8 L256[4] = {0, 0, 1, 0}; int rc = 0; unsigned char prfhash[SMB2_HMACSHA256_SIZE]; unsigned char *hashptr = prfhash; @@ -354,8 +355,14 @@ static int generate_key(struct cifs_ses goto smb3signkey_ret; }
- rc = crypto_shash_update(&server->secmech.sdeschmacsha256->shash, - L, 4); + if ((server->cipher_type == SMB2_ENCRYPTION_AES256_CCM) || + (server->cipher_type == SMB2_ENCRYPTION_AES256_GCM)) { + rc = crypto_shash_update(&server->secmech.sdeschmacsha256->shash, + L256, 4); + } else { + rc = crypto_shash_update(&server->secmech.sdeschmacsha256->shash, + L128, 4); + } if (rc) { cifs_server_dbg(VFS, "%s: Could not update with L\n", __func__); goto smb3signkey_ret; @@ -390,6 +397,9 @@ generate_smb3signingkey(struct cifs_ses const struct derivation_triplet *ptriplet) { int rc; +#ifdef CONFIG_CIFS_DEBUG_DUMP_KEYS + struct TCP_Server_Info *server = ses->server; +#endif
/* * All channels use the same encryption/decryption keys but @@ -422,11 +432,11 @@ generate_smb3signingkey(struct cifs_ses rc = generate_key(ses, ptriplet->encryption.label, ptriplet->encryption.context, ses->smb3encryptionkey, - SMB3_SIGN_KEY_SIZE); + SMB3_ENC_DEC_KEY_SIZE); rc = generate_key(ses, ptriplet->decryption.label, ptriplet->decryption.context, ses->smb3decryptionkey, - SMB3_SIGN_KEY_SIZE); + SMB3_ENC_DEC_KEY_SIZE); if (rc) return rc; } @@ -442,14 +452,23 @@ generate_smb3signingkey(struct cifs_ses */ cifs_dbg(VFS, "Session Id %*ph\n", (int)sizeof(ses->Suid), &ses->Suid); + cifs_dbg(VFS, "Cipher type %d\n", server->cipher_type); cifs_dbg(VFS, "Session Key %*ph\n", SMB2_NTLMV2_SESSKEY_SIZE, ses->auth_key.response); cifs_dbg(VFS, "Signing Key %*ph\n", SMB3_SIGN_KEY_SIZE, ses->smb3signingkey); - cifs_dbg(VFS, "ServerIn Key %*ph\n", - SMB3_SIGN_KEY_SIZE, ses->smb3encryptionkey); - cifs_dbg(VFS, "ServerOut Key %*ph\n", - SMB3_SIGN_KEY_SIZE, ses->smb3decryptionkey); + if ((server->cipher_type == SMB2_ENCRYPTION_AES256_CCM) || + (server->cipher_type == SMB2_ENCRYPTION_AES256_GCM)) { + cifs_dbg(VFS, "ServerIn Key %*ph\n", + SMB3_GCM256_CRYPTKEY_SIZE, ses->smb3encryptionkey); + cifs_dbg(VFS, "ServerOut Key %*ph\n", + SMB3_GCM256_CRYPTKEY_SIZE, ses->smb3decryptionkey); + } else { + cifs_dbg(VFS, "ServerIn Key %*ph\n", + SMB3_GCM128_CRYPTKEY_SIZE, ses->smb3encryptionkey); + cifs_dbg(VFS, "ServerOut Key %*ph\n", + SMB3_GCM128_CRYPTKEY_SIZE, ses->smb3decryptionkey); + } #endif return rc; }
From: Thomas Gleixner tglx@linutronix.de
commit 291da9d4a9eb3a1cb0610b7f4480f5b52b1825e7 upstream.
If CONFIG_DEBUG_LOCK_ALLOC=n then mutex_lock_io_nested() maps to mutex_lock() which is clearly wrong because mutex_lock() lacks the io_schedule_prepare()/finish() invocations.
Map it to mutex_lock_io().
Fixes: f21860bac05b ("locking/mutex, sched/wait: Fix the mutex_lock_io_nested() define") Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Ingo Molnar mingo@kernel.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/878s6fshii.fsf@nanos.tec.linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/mutex.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/linux/mutex.h +++ b/include/linux/mutex.h @@ -185,7 +185,7 @@ extern void mutex_lock_io(struct mutex * # define mutex_lock_interruptible_nested(lock, subclass) mutex_lock_interruptible(lock) # define mutex_lock_killable_nested(lock, subclass) mutex_lock_killable(lock) # define mutex_lock_nest_lock(lock, nest_lock) mutex_lock(lock) -# define mutex_lock_io_nested(lock, subclass) mutex_lock(lock) +# define mutex_lock_io_nested(lock, subclass) mutex_lock_io(lock) #endif
/*
From: Isaku Yamahata isaku.yamahata@intel.com
commit 8249d17d3194eac064a8ca5bc5ca0abc86feecde upstream.
The pfn variable contains the page frame number as returned by the pXX_pfn() functions, shifted to the right by PAGE_SHIFT to remove the page bits. After page protection computations are done to it, it gets shifted back to the physical address using page_level_shift().
That is wrong, of course, because that function determines the shift length based on the level of the page in the page table but in all the cases, it was shifted by PAGE_SHIFT before.
Therefore, shift it back using PAGE_SHIFT to get the correct physical address.
[ bp: Rewrite commit message. ]
Fixes: dfaaec9033b8 ("x86: Add support for changing memory encryption attribute in early boot") Signed-off-by: Isaku Yamahata isaku.yamahata@intel.com Signed-off-by: Borislav Petkov bp@suse.de Reviewed-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com Reviewed-by: Tom Lendacky thomas.lendacky@amd.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/81abbae1657053eccc535c16151f63cd049dcb97.161609829... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/mm/mem_encrypt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/mm/mem_encrypt.c +++ b/arch/x86/mm/mem_encrypt.c @@ -262,7 +262,7 @@ static void __init __set_clr_pte_enc(pte if (pgprot_val(old_prot) == pgprot_val(new_prot)) return;
- pa = pfn << page_level_shift(level); + pa = pfn << PAGE_SHIFT; size = page_level_size(level);
/*
From: Matthew Wilcox (Oracle) willy@infradead.org
commit 39f985c8f667c80a3d1eb19d31138032fa36b09e upstream.
Cachefiles was relying on wait_page_key and wait_bit_key being the same layout, which is fragile. Now that wait_page_key is exposed in the pagemap.h header, we can remove that fragility
A comment on the need to maintain structure layout equivalence was added by Linus[1] and that is no longer applicable.
Fixes: 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit") Signed-off-by: Matthew Wilcox (Oracle) willy@infradead.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: David Howells dhowells@redhat.com Tested-by: kafs-testing@auristor.com cc: linux-cachefs@redhat.com cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/20210320054104.1300774-2-willy@infradead.org/ Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... [1] Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/cachefiles/rdwr.c | 7 +++---- include/linux/pagemap.h | 1 - 2 files changed, 3 insertions(+), 5 deletions(-)
--- a/fs/cachefiles/rdwr.c +++ b/fs/cachefiles/rdwr.c @@ -24,17 +24,16 @@ static int cachefiles_read_waiter(wait_q container_of(wait, struct cachefiles_one_read, monitor); struct cachefiles_object *object; struct fscache_retrieval *op = monitor->op; - struct wait_bit_key *key = _key; + struct wait_page_key *key = _key; struct page *page = wait->private;
ASSERT(key);
_enter("{%lu},%u,%d,{%p,%u}", monitor->netfs_page->index, mode, sync, - key->flags, key->bit_nr); + key->page, key->bit_nr);
- if (key->flags != &page->flags || - key->bit_nr != PG_locked) + if (key->page != page || key->bit_nr != PG_locked) return 0;
_debug("--- monitor %p %lx ---", page, page->flags); --- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -559,7 +559,6 @@ static inline pgoff_t linear_page_index( return pgoff; }
-/* This has the same layout as wait_bit_key - see fs/cachefiles/rdwr.c */ struct wait_page_key { struct page *page; int bit_nr;
From: Arnd Bergmann arnd@arndb.de
commit 6f235a69e59484e382dc31952025b0308efedc17 upstream.
gcc points out an incorrect enum assignment:
drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c: In function 'chcr_ktls_cpl_set_tcb_rpl': drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c:684:22: warning: implicit conversion from 'enum <anonymous>' to 'enum ch_ktls_open_state' [-Wenum-conversion]
This appears harmless, and should apparently use 'CH_KTLS_OPEN_SUCCESS' instead of 'false', with the same value '0'.
Fixes: efca3878a5fb ("ch_ktls: Issue if connection offload fails") Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c +++ b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c @@ -727,7 +727,7 @@ static int chcr_ktls_cpl_set_tcb_rpl(str kvfree(tx_info); return 0; } - tx_info->open_state = false; + tx_info->open_state = CH_KTLS_OPEN_SUCCESS; spin_unlock(&tx_info->lock);
complete(&tx_info->completion);
From: Martin Willi martin@strongswan.org
commit 3a5ca857079ea022e0b1b17fc154f7ad7dbc150f upstream.
When a non-initial netns is destroyed, the usual policy is to delete all virtual network interfaces contained, but move physical interfaces back to the initial netns. This keeps the physical interface visible on the system.
CAN devices are somewhat special, as they define rtnl_link_ops even if they are physical devices. If a CAN interface is moved into a non-initial netns, destroying that netns lets the interface vanish instead of moving it back to the initial netns. default_device_exit() skips CAN interfaces due to having rtnl_link_ops set. Reproducer:
ip netns add foo ip link set can0 netns foo ip netns delete foo
WARNING: CPU: 1 PID: 84 at net/core/dev.c:11030 ops_exit_list+0x38/0x60 CPU: 1 PID: 84 Comm: kworker/u4:2 Not tainted 5.10.19 #1 Workqueue: netns cleanup_net [<c010e700>] (unwind_backtrace) from [<c010a1d8>] (show_stack+0x10/0x14) [<c010a1d8>] (show_stack) from [<c086dc10>] (dump_stack+0x94/0xa8) [<c086dc10>] (dump_stack) from [<c086b938>] (__warn+0xb8/0x114) [<c086b938>] (__warn) from [<c086ba10>] (warn_slowpath_fmt+0x7c/0xac) [<c086ba10>] (warn_slowpath_fmt) from [<c0629f20>] (ops_exit_list+0x38/0x60) [<c0629f20>] (ops_exit_list) from [<c062a5c4>] (cleanup_net+0x230/0x380) [<c062a5c4>] (cleanup_net) from [<c0142c20>] (process_one_work+0x1d8/0x438) [<c0142c20>] (process_one_work) from [<c0142ee4>] (worker_thread+0x64/0x5a8) [<c0142ee4>] (worker_thread) from [<c0148a98>] (kthread+0x148/0x14c) [<c0148a98>] (kthread) from [<c0100148>] (ret_from_fork+0x14/0x2c)
To properly restore physical CAN devices to the initial netns on owning netns exit, introduce a flag on rtnl_link_ops that can be set by drivers. For CAN devices setting this flag, default_device_exit() considers them non-virtual, applying the usual namespace move.
The issue was introduced in the commit mentioned below, as at that time CAN devices did not have a dellink() operation.
Fixes: e008b5fc8dc7 ("net: Simplfy default_device_exit and improve batching.") Link: https://lore.kernel.org/r/20210302122423.872326-1-martin@strongswan.org Signed-off-by: Martin Willi martin@strongswan.org Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/dev.c | 1 + include/net/rtnetlink.h | 2 ++ net/core/dev.c | 2 +- 3 files changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/net/can/dev.c +++ b/drivers/net/can/dev.c @@ -1255,6 +1255,7 @@ static void can_dellink(struct net_devic
static struct rtnl_link_ops can_link_ops __read_mostly = { .kind = "can", + .netns_refund = true, .maxtype = IFLA_CAN_MAX, .policy = can_policy, .setup = can_setup, --- a/include/net/rtnetlink.h +++ b/include/net/rtnetlink.h @@ -33,6 +33,7 @@ static inline int rtnl_msg_family(const * * @list: Used internally * @kind: Identifier + * @netns_refund: Physical device, move to init_net on netns exit * @maxtype: Highest device specific netlink attribute number * @policy: Netlink policy for device specific attribute validation * @validate: Optional validation function for netlink/changelink parameters @@ -64,6 +65,7 @@ struct rtnl_link_ops { size_t priv_size; void (*setup)(struct net_device *dev);
+ bool netns_refund; unsigned int maxtype; const struct nla_policy *policy; int (*validate)(struct nlattr *tb[], --- a/net/core/dev.c +++ b/net/core/dev.c @@ -11194,7 +11194,7 @@ static void __net_exit default_device_ex continue;
/* Leave virtual devices for the generic cleanup */ - if (dev->rtnl_link_ops) + if (dev->rtnl_link_ops && !dev->rtnl_link_ops->netns_refund) continue;
/* Push remaining network devices to init_net */
From: Heiner Kallweit hkallweit1@gmail.com
commit f658b90977d2e79822a558e48116e059a7e75dec upstream.
IOMMU errors have been reported if WoL is enabled and interface is brought down. It turned out that the network chip triggers DMA transfers after the DMA buffers have been freed. For WoL to work we need to leave rx enabled, therefore simply stop the chip from being a DMA busmaster.
Fixes: 567ca57faa62 ("r8169: add rtl8169_up") Tested-by: Paul Blazejowski paulb@blazebox.homeip.net Signed-off-by: Heiner Kallweit hkallweit1@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/realtek/r8169_main.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -4677,6 +4677,9 @@ static void rtl8169_down(struct rtl8169_
rtl8169_update_counters(tp);
+ pci_clear_master(tp->pci_dev); + rtl_pci_commit(tp); + rtl8169_cleanup(tp, true);
rtl_pll_power_down(tp); @@ -4684,6 +4687,7 @@ static void rtl8169_down(struct rtl8169_
static void rtl8169_up(struct rtl8169_private *tp) { + pci_set_master(tp->pci_dev); rtl_pll_power_up(tp); rtl8169_init_phy(tp); napi_enable(&tp->napi); @@ -5348,8 +5352,6 @@ static int rtl_init_one(struct pci_dev *
rtl_hw_reset(tp);
- pci_set_master(pdev); - rc = rtl_alloc_irq(tp); if (rc < 0) { dev_err(&pdev->dev, "Can't allocate interrupt\n");
From: Florian Fainelli f.fainelli@gmail.com
commit d45c36bafb94e72fdb6dee437279b61b6d97e706 upstream.
The bcm_sf2 driver uses the b53 driver as a library but does not make usre of the b53_setup() function, this made it fail to inherit the vlan_filtering_is_global attribute. Fix this by moving the assignment to b53_switch_alloc() which is used by bcm_sf2.
Fixes: 7228b23e68f7 ("net: dsa: b53: Let DSA handle mismatched VLAN filtering settings") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/b53/b53_common.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
--- a/drivers/net/dsa/b53/b53_common.c +++ b/drivers/net/dsa/b53/b53_common.c @@ -1070,13 +1070,6 @@ static int b53_setup(struct dsa_switch * b53_disable_port(ds, port); }
- /* Let DSA handle the case were multiple bridges span the same switch - * device and different VLAN awareness settings are requested, which - * would be breaking filtering semantics for any of the other bridge - * devices. (not hardware supported) - */ - ds->vlan_filtering_is_global = true; - return b53_setup_devlink_resources(ds); }
@@ -2627,6 +2620,13 @@ struct b53_device *b53_switch_alloc(stru ds->configure_vlan_while_not_filtering = true; ds->untag_bridge_pvid = true; dev->vlan_enabled = ds->configure_vlan_while_not_filtering; + /* Let DSA handle the case were multiple bridges span the same switch + * device and different VLAN awareness settings are requested, which + * would be breaking filtering semantics for any of the other bridge + * devices. (not hardware supported) + */ + ds->vlan_filtering_is_global = true; + mutex_init(&dev->reg_mutex); mutex_init(&dev->stats_mutex);
From: Markus Theil markus.theil@tu-ilmenau.de
commit 3bd801b14e0c5d29eeddc7336558beb3344efaa3 upstream.
Clear beacon ie pointer and ie length after free in order to prevent double free.
================================================================== BUG: KASAN: double-free or invalid-free \ in ieee80211_ibss_leave+0x83/0xe0 net/mac80211/ibss.c:1876
CPU: 0 PID: 8472 Comm: syz-executor100 Not tainted 5.11.0-rc6-syzkaller #0 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x107/0x163 lib/dump_stack.c:120 print_address_description.constprop.0.cold+0x5b/0x2c6 mm/kasan/report.c:230 kasan_report_invalid_free+0x51/0x80 mm/kasan/report.c:355 ____kasan_slab_free+0xcc/0xe0 mm/kasan/common.c:341 kasan_slab_free include/linux/kasan.h:192 [inline] __cache_free mm/slab.c:3424 [inline] kfree+0xed/0x270 mm/slab.c:3760 ieee80211_ibss_leave+0x83/0xe0 net/mac80211/ibss.c:1876 rdev_leave_ibss net/wireless/rdev-ops.h:545 [inline] __cfg80211_leave_ibss+0x19a/0x4c0 net/wireless/ibss.c:212 __cfg80211_leave+0x327/0x430 net/wireless/core.c:1172 cfg80211_leave net/wireless/core.c:1221 [inline] cfg80211_netdev_notifier_call+0x9e8/0x12c0 net/wireless/core.c:1335 notifier_call_chain+0xb5/0x200 kernel/notifier.c:83 call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:2040 call_netdevice_notifiers_extack net/core/dev.c:2052 [inline] call_netdevice_notifiers net/core/dev.c:2066 [inline] __dev_close_many+0xee/0x2e0 net/core/dev.c:1586 __dev_close net/core/dev.c:1624 [inline] __dev_change_flags+0x2cb/0x730 net/core/dev.c:8476 dev_change_flags+0x8a/0x160 net/core/dev.c:8549 dev_ifsioc+0x210/0xa70 net/core/dev_ioctl.c:265 dev_ioctl+0x1b1/0xc40 net/core/dev_ioctl.c:511 sock_do_ioctl+0x148/0x2d0 net/socket.c:1060 sock_ioctl+0x477/0x6a0 net/socket.c:1177 vfs_ioctl fs/ioctl.c:48 [inline] __do_sys_ioctl fs/ioctl.c:753 [inline] __se_sys_ioctl fs/ioctl.c:739 [inline] __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9
Reported-by: syzbot+93976391bf299d425f44@syzkaller.appspotmail.com Signed-off-by: Markus Theil markus.theil@tu-ilmenau.de Link: https://lore.kernel.org/r/20210213133653.367130-1-markus.theil@tu-ilmenau.de Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mac80211/ibss.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/net/mac80211/ibss.c +++ b/net/mac80211/ibss.c @@ -1874,6 +1874,8 @@ int ieee80211_ibss_leave(struct ieee8021
/* remove beacon */ kfree(sdata->u.ibss.ie); + sdata->u.ibss.ie = NULL; + sdata->u.ibss.ie_len = 0;
/* on the next join, re-program HT parameters */ memset(&ifibss->ht_capa, 0, sizeof(ifibss->ht_capa));
From: Jan Kara jack@suse.cz
commit 163f0ec1df33cf468509ff38cbcbb5eb0d7fac60 upstream.
Syzbot is reporting that ext4 can enter fs reclaim from kvmalloc() while the transaction is started like:
fs_reclaim_acquire+0x117/0x150 mm/page_alloc.c:4340 might_alloc include/linux/sched/mm.h:193 [inline] slab_pre_alloc_hook mm/slab.h:493 [inline] slab_alloc_node mm/slub.c:2817 [inline] __kmalloc_node+0x5f/0x430 mm/slub.c:4015 kmalloc_node include/linux/slab.h:575 [inline] kvmalloc_node+0x61/0xf0 mm/util.c:587 kvmalloc include/linux/mm.h:781 [inline] ext4_xattr_inode_cache_find fs/ext4/xattr.c:1465 [inline] ext4_xattr_inode_lookup_create fs/ext4/xattr.c:1508 [inline] ext4_xattr_set_entry+0x1ce6/0x3780 fs/ext4/xattr.c:1649 ext4_xattr_ibody_set+0x78/0x2b0 fs/ext4/xattr.c:2224 ext4_xattr_set_handle+0x8f4/0x13e0 fs/ext4/xattr.c:2380 ext4_xattr_set+0x13a/0x340 fs/ext4/xattr.c:2493
This should be impossible since transaction start sets PF_MEMALLOC_NOFS. Add some assertions to the code to catch if something isn't working as expected early.
Link: https://lore.kernel.org/linux-ext4/000000000000563a0205bafb7970@google.com/ Signed-off-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20210222171626.21884-1-jack@suse.cz Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/xattr.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -1462,6 +1462,9 @@ ext4_xattr_inode_cache_find(struct inode if (!ce) return NULL;
+ WARN_ON_ONCE(ext4_handle_valid(journal_current_handle()) && + !(current->flags & PF_MEMALLOC_NOFS)); + ea_data = kvmalloc(value_len, GFP_KERNEL); if (!ea_data) { mb_cache_entry_put(ea_inode_cache, ce); @@ -2327,6 +2330,7 @@ ext4_xattr_set_handle(handle_t *handle, error = -ENOSPC; goto cleanup; } + WARN_ON_ONCE(!(current->flags & PF_MEMALLOC_NOFS)); }
error = ext4_reserve_inode_write(handle, inode, &is.iloc);
From: Sabyrzhan Tasbolatov snovitoll@gmail.com
commit f91436d55a279f045987e8b8c1385585dca54be9 upstream.
syzbot found UBSAN: shift-out-of-bounds in ext4_mb_init [1], when 1 << sbi->s_es->s_log_groups_per_flex is bigger than UINT_MAX, where sbi->s_mb_prefetch is unsigned integer type.
32 is the maximum allowed power of s_log_groups_per_flex. Following if check will also trigger UBSAN shift-out-of-bound:
if (1 << sbi->s_es->s_log_groups_per_flex >= UINT_MAX) {
So I'm checking it against the raw number, perhaps there is another way to calculate UINT_MAX max power. Also use min_t as to make sure it's uint type.
[1] UBSAN: shift-out-of-bounds in fs/ext4/mballoc.c:2713:24 shift exponent 60 is too large for 32-bit type 'int' Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x137/0x1be lib/dump_stack.c:120 ubsan_epilogue lib/ubsan.c:148 [inline] __ubsan_handle_shift_out_of_bounds+0x432/0x4d0 lib/ubsan.c:395 ext4_mb_init_backend fs/ext4/mballoc.c:2713 [inline] ext4_mb_init+0x19bc/0x19f0 fs/ext4/mballoc.c:2898 ext4_fill_super+0xc2ec/0xfbe0 fs/ext4/super.c:4983
Reported-by: syzbot+a8b4b0c60155e87e9484@syzkaller.appspotmail.com Signed-off-by: Sabyrzhan Tasbolatov snovitoll@gmail.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20210224095800.3350002-1-snovitoll@gmail.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/mballoc.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
--- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -2709,8 +2709,15 @@ static int ext4_mb_init_backend(struct s }
if (ext4_has_feature_flex_bg(sb)) { - /* a single flex group is supposed to be read by a single IO */ - sbi->s_mb_prefetch = min(1 << sbi->s_es->s_log_groups_per_flex, + /* a single flex group is supposed to be read by a single IO. + * 2 ^ s_log_groups_per_flex != UINT_MAX as s_mb_prefetch is + * unsigned integer, so the maximum shift is 32. + */ + if (sbi->s_es->s_log_groups_per_flex >= 32) { + ext4_msg(sb, KERN_ERR, "too many log groups per flexible block group"); + goto err_freesgi; + } + sbi->s_mb_prefetch = min_t(uint, 1 << sbi->s_es->s_log_groups_per_flex, BLK_MAX_SEGMENT_SIZE >> (sb->s_blocksize_bits - 9)); sbi->s_mb_prefetch *= 8; /* 8 prefetch IOs in flight at most */ } else {
From: Roger Pau Monne roger.pau@citrix.com
commit af44a387e743ab7aa39d3fb5e29c0a973cf91bdc upstream.
This partially reverts commit 882213990d32 ("xen: fix p2m size in dom0 for disabled memory hotplug case")
There's no need to special case XEN_UNPOPULATED_ALLOC anymore in order to correctly size the p2m. The generic memory hotplug option has already been tied together with the Xen hotplug limit, so enabling memory hotplug should already trigger a properly sized p2m on Xen PV.
Note that XEN_UNPOPULATED_ALLOC depends on ZONE_DEVICE which pulls in MEMORY_HOTPLUG.
Leave the check added to __set_phys_to_machine and the adjusted comment about EXTRA_MEM_RATIO.
Signed-off-by: Roger Pau Monné roger.pau@citrix.com Reviewed-by: Boris Ostrovsky boris.ostrovsky@oracle.com Link: https://lore.kernel.org/r/20210324122424.58685-3-roger.pau@citrix.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
[boris: fixed formatting issues] Signed-off-by: Boris Ostrovsky boris.ostrovsky@oracle.com --- arch/x86/include/asm/xen/page.h | 12 ------------ arch/x86/xen/p2m.c | 3 --- arch/x86/xen/setup.c | 16 ++++++++++++++-- 3 files changed, 14 insertions(+), 17 deletions(-)
--- a/arch/x86/include/asm/xen/page.h +++ b/arch/x86/include/asm/xen/page.h @@ -87,18 +87,6 @@ clear_foreign_p2m_mapping(struct gnttab_ #endif
/* - * The maximum amount of extra memory compared to the base size. The - * main scaling factor is the size of struct page. At extreme ratios - * of base:extra, all the base memory can be filled with page - * structures for the extra memory, leaving no space for anything - * else. - * - * 10x seems like a reasonable balance between scaling flexibility and - * leaving a practically usable system. - */ -#define XEN_EXTRA_MEM_RATIO (10) - -/* * Helper functions to write or read unsigned long values to/from * memory, when the access may fault. */ --- a/arch/x86/xen/p2m.c +++ b/arch/x86/xen/p2m.c @@ -416,9 +416,6 @@ void __init xen_vmalloc_p2m_tree(void) xen_p2m_last_pfn = xen_max_p2m_pfn;
p2m_limit = (phys_addr_t)P2M_LIMIT * 1024 * 1024 * 1024 / PAGE_SIZE; - if (!p2m_limit && IS_ENABLED(CONFIG_XEN_UNPOPULATED_ALLOC)) - p2m_limit = xen_start_info->nr_pages * XEN_EXTRA_MEM_RATIO; - vm.flags = VM_ALLOC; vm.size = ALIGN(sizeof(unsigned long) * max(xen_max_p2m_pfn, p2m_limit), PMD_SIZE * PMDS_PER_MID_PAGE); --- a/arch/x86/xen/setup.c +++ b/arch/x86/xen/setup.c @@ -59,6 +59,18 @@ static struct { } xen_remap_buf __initdata __aligned(PAGE_SIZE); static unsigned long xen_remap_mfn __initdata = INVALID_P2M_ENTRY;
+/* + * The maximum amount of extra memory compared to the base size. The + * main scaling factor is the size of struct page. At extreme ratios + * of base:extra, all the base memory can be filled with page + * structures for the extra memory, leaving no space for anything + * else. + * + * 10x seems like a reasonable balance between scaling flexibility and + * leaving a practically usable system. + */ +#define EXTRA_MEM_RATIO (10) + static bool xen_512gb_limit __initdata = IS_ENABLED(CONFIG_XEN_512GB);
static void __init xen_parse_512gb(void) @@ -778,13 +790,13 @@ char * __init xen_memory_setup(void) extra_pages += max_pages - max_pfn;
/* - * Clamp the amount of extra memory to a XEN_EXTRA_MEM_RATIO + * Clamp the amount of extra memory to a EXTRA_MEM_RATIO * factor the base size. * * Make sure we have no memory above max_pages, as this area * isn't handled by the p2m management. */ - extra_pages = min3(XEN_EXTRA_MEM_RATIO * min(max_pfn, PFN_DOWN(MAXMEM)), + extra_pages = min3(EXTRA_MEM_RATIO * min(max_pfn, PFN_DOWN(MAXMEM)), extra_pages, max_pages - max_pfn); i = 0; addr = xen_e820_table.entries[0].addr;
From: Christoph Hellwig hch@lst.de
commit f4f9fc29e56b6fa9d7fa65ec51d3c82aff99c99b upstream.
ns can be NULL at this point, and my move of the check from the original patch by Chaitanya broke this.
Fixes: 0ec84df4953b ("nvme-core: check ctrl css before setting up zns") Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/nvme/host/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -4072,7 +4072,7 @@ static void nvme_validate_or_alloc_ns(st if (!nvme_multi_css(ctrl)) { dev_warn(ctrl->device, "command set not reported for nsid: %d\n", - ns->head->ns_id); + nsid); break; } nvme_alloc_ns(ctrl, nsid, &ids);
From: Marc Kleine-Budde mkl@pengutronix.de
commit 5d7047ed6b7214fbabc16d8712a822e256b1aa44 upstream.
In commit 6417f03132a6 ("module: remove never implemented MODULE_SUPPORTED_DEVICE") the MODULE_SUPPORTED_DEVICE macro was removed from the kerne entirely. Shortly before this patch was applied mainline the commit 59ec7b89ed3e ("can: peak_usb: add forgotten supported devices") was added to net/master. As this would result in a merge conflict, let's revert this patch.
Fixes: 59ec7b89ed3e ("can: peak_usb: add forgotten supported devices") Link: https://lore.kernel.org/r/20210320192649.341832-1-mkl@pengutronix.de Suggested-by: Leon Romanovsky leon@kernel.org Cc: Stephane Grosjean s.grosjean@peak-system.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/usb/peak_usb/pcan_usb_fd.c | 2 -- 1 file changed, 2 deletions(-)
--- a/drivers/net/can/usb/peak_usb/pcan_usb_fd.c +++ b/drivers/net/can/usb/peak_usb/pcan_usb_fd.c @@ -18,8 +18,6 @@
MODULE_SUPPORTED_DEVICE("PEAK-System PCAN-USB FD adapter"); MODULE_SUPPORTED_DEVICE("PEAK-System PCAN-USB Pro FD adapter"); -MODULE_SUPPORTED_DEVICE("PEAK-System PCAN-Chip USB"); -MODULE_SUPPORTED_DEVICE("PEAK-System PCAN-USB X6 adapter");
#define PCAN_USBPROFD_CHANNEL_COUNT 2 #define PCAN_USBFD_CHANNEL_COUNT 1
From: Alexei Starovoitov ast@kernel.org
commit eddbe8e6521401003e37e7848ef72e75c10ee2aa upstream.
Add a selftest for commit e21aa341785c ("bpf: Fix fexit trampoline.") to make sure that attaching fexit prog to a sleeping kernel function will trigger appropriate trampoline and program destruction.
Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20210318004523.55908-1-alexei.starovoitov@gmail.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/bpf/prog_tests/fexit_sleep.c | 82 +++++++++++++++++++ tools/testing/selftests/bpf/progs/fexit_sleep.c | 31 +++++++ 2 files changed, 113 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/fexit_sleep.c create mode 100644 tools/testing/selftests/bpf/progs/fexit_sleep.c
--- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/fexit_sleep.c @@ -0,0 +1,82 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2021 Facebook */ +#define _GNU_SOURCE +#include <sched.h> +#include <test_progs.h> +#include <time.h> +#include <sys/mman.h> +#include <sys/syscall.h> +#include "fexit_sleep.skel.h" + +static int do_sleep(void *skel) +{ + struct fexit_sleep *fexit_skel = skel; + struct timespec ts1 = { .tv_nsec = 1 }; + struct timespec ts2 = { .tv_sec = 10 }; + + fexit_skel->bss->pid = getpid(); + (void)syscall(__NR_nanosleep, &ts1, NULL); + (void)syscall(__NR_nanosleep, &ts2, NULL); + return 0; +} + +#define STACK_SIZE (1024 * 1024) +static char child_stack[STACK_SIZE]; + +void test_fexit_sleep(void) +{ + struct fexit_sleep *fexit_skel = NULL; + int wstatus, duration = 0; + pid_t cpid; + int err, fexit_cnt; + + fexit_skel = fexit_sleep__open_and_load(); + if (CHECK(!fexit_skel, "fexit_skel_load", "fexit skeleton failed\n")) + goto cleanup; + + err = fexit_sleep__attach(fexit_skel); + if (CHECK(err, "fexit_attach", "fexit attach failed: %d\n", err)) + goto cleanup; + + cpid = clone(do_sleep, child_stack + STACK_SIZE, CLONE_FILES | SIGCHLD, fexit_skel); + if (CHECK(cpid == -1, "clone", strerror(errno))) + goto cleanup; + + /* wait until first sys_nanosleep ends and second sys_nanosleep starts */ + while (READ_ONCE(fexit_skel->bss->fentry_cnt) != 2); + fexit_cnt = READ_ONCE(fexit_skel->bss->fexit_cnt); + if (CHECK(fexit_cnt != 1, "fexit_cnt", "%d", fexit_cnt)) + goto cleanup; + + /* close progs and detach them. That will trigger two nop5->jmp5 rewrites + * in the trampolines to skip nanosleep_fexit prog. + * The nanosleep_fentry prog will get detached first. + * The nanosleep_fexit prog will get detached second. + * Detaching will trigger freeing of both progs JITed images. + * There will be two dying bpf_tramp_image-s, but only the initial + * bpf_tramp_image (with both _fentry and _fexit progs will be stuck + * waiting for percpu_ref_kill to confirm). The other one + * will be freed quickly. + */ + close(bpf_program__fd(fexit_skel->progs.nanosleep_fentry)); + close(bpf_program__fd(fexit_skel->progs.nanosleep_fexit)); + fexit_sleep__detach(fexit_skel); + + /* kill the thread to unwind sys_nanosleep stack through the trampoline */ + kill(cpid, 9); + + if (CHECK(waitpid(cpid, &wstatus, 0) == -1, "waitpid", strerror(errno))) + goto cleanup; + if (CHECK(WEXITSTATUS(wstatus) != 0, "exitstatus", "failed")) + goto cleanup; + + /* The bypassed nanosleep_fexit prog shouldn't have executed. + * Unlike progs the maps were not freed and directly accessible. + */ + fexit_cnt = READ_ONCE(fexit_skel->bss->fexit_cnt); + if (CHECK(fexit_cnt != 1, "fexit_cnt", "%d", fexit_cnt)) + goto cleanup; + +cleanup: + fexit_sleep__destroy(fexit_skel); +} --- /dev/null +++ b/tools/testing/selftests/bpf/progs/fexit_sleep.c @@ -0,0 +1,31 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2021 Facebook */ +#include "vmlinux.h" +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_tracing.h> + +char LICENSE[] SEC("license") = "GPL"; + +int pid = 0; +int fentry_cnt = 0; +int fexit_cnt = 0; + +SEC("fentry/__x64_sys_nanosleep") +int BPF_PROG(nanosleep_fentry, const struct pt_regs *regs) +{ + if ((int)bpf_get_current_pid_tgid() != pid) + return 0; + + fentry_cnt++; + return 0; +} + +SEC("fexit/__x64_sys_nanosleep") +int BPF_PROG(nanosleep_fexit, const struct pt_regs *regs, int ret) +{ + if ((int)bpf_get_current_pid_tgid() != pid) + return 0; + + fexit_cnt++; + return 0; +}
On Mon, 29 Mar 2021 at 14:00, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.11.11 release. There are 254 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 31 Mar 2021 07:55:56 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.11.11-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.11.y and the diffstat can be found below.
thanks,
greg k-h
The stable rc 5.10 and 5.11 builds failed for arm64 architecture due to below warnings / errors,
Anshuman Khandual anshuman.khandual@arm.com arm64/mm: define arch_get_mappable_range()
arch/arm64/mm/mmu.c: In function 'arch_add_memory': arch/arm64/mm/mmu.c:1483:13: error: implicit declaration of function 'mhp_range_allowed'; did you mean 'cpu_map_prog_allowed'? [-Werror=implicit-function-declaration] VM_BUG_ON(!mhp_range_allowed(start, size, true)); ^ include/linux/build_bug.h:30:63: note: in definition of macro 'BUILD_BUG_ON_INVALID' #define BUILD_BUG_ON_INVALID(e) ((void)(sizeof((__force long)(e)))) ^ arch/arm64/mm/mmu.c:1483:2: note: in expansion of macro 'VM_BUG_ON' VM_BUG_ON(!mhp_range_allowed(start, size, true)); ^~~~~~~~~
Build link, https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.11/D... https://ci.linaro.org/view/lkft/job/openembedded-lkft-linux-stable-rc-5.10/D...
linux-stable-mirror@lists.linaro.org