This is the start of the stable review cycle for the 5.9.9 release. There are 255 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 19 Nov 2020 12:20:51 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.9.9-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.9.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.9.9-rc1
Boris Protopopov pboris@amazon.com Convert trailing spaces and periods in path components
Mike Leach mike.leach@linaro.org coresight: Fix uninitialised pointer bug in etm_setup_aux()
Linu Cherian lcherian@marvell.com coresight: etm: perf: Sink selection using sysfs is deprecated
Arnaldo Carvalho de Melo acme@redhat.com perf scripting python: Avoid declaring function pointers with a visibility attribute
Damien Le Moal damien.lemoal@wdc.com null_blk: Fix scheduling in atomic with zoned mode
Christophe Leroy christophe.leroy@csgroup.eu powerpc/603: Always fault when _PAGE_ACCESSED is not set
Stefano Brivio sbrivio@redhat.com tunnels: Fix off-by-one in lower MTU bounds for ICMP/ICMPv6 replies
Paolo Abeni pabeni@redhat.com mptcp: provide rmem[0] limit
Parav Pandit parav@nvidia.com devlink: Avoid overwriting port attributes of registered port
Wang Hai wanghai38@huawei.com tipc: fix memory leak in tipc_topsrv_start()
Martin Schiller ms@dev.tdt.de net/x25: Fix null-ptr-deref in x25_connect
Mao Wenan wenan.mao@linux.alibaba.com net: Update window_clamp if SOCK_RCVBUF is set
Alexander Lobakin alobakin@pm.me net: udp: fix UDP header access on Fast/frag0 UDP GRO
Alexander Lobakin alobakin@pm.me net: udp: fix IP header access and skb lookup on Fast/frag0 UDP GRO
Ursula Braun ubraun@linux.ibm.com net/af_iucv: fix null pointer dereference on shutdown
Oliver Herms oliver.peter.herms@gmail.com IPv6: Set SIT tunnel hard_header_len to zero
Alexander Lobakin alobakin@pm.me ethtool: netlink: add missing netdev_features_change() call
Rafael J. Wysocki rafael.j.wysocki@intel.com cpufreq: intel_pstate: Take CPUFREQ_GOV_STRICT_TARGET into account
Rafael J. Wysocki rafael.j.wysocki@intel.com cpufreq: Add strict_target to struct cpufreq_policy
Rafael J. Wysocki rafael.j.wysocki@intel.com cpufreq: Introduce CPUFREQ_GOV_STRICT_TARGET
Rafael J. Wysocki rafael.j.wysocki@intel.com cpufreq: Introduce governor flags
Stefano Stabellini stefano.stabellini@xilinx.com swiotlb: fix "x86: Don't panic if can not alloc buffer for swiotlb"
Coiby Xu coiby.xu@gmail.com pinctrl: amd: fix incorrect way to disable debounce filter
Coiby Xu coiby.xu@gmail.com pinctrl: amd: use higher precision for 512 RtcClk
J. Bruce Fields bfields@redhat.com NFSv4.2: fix failure to unregister shrinker
Thomas Zimmermann tzimmermann@suse.de drm/gma500: Fix out-of-bounds access to struct drm_device.vblank[]
Venkata Sandeep Dhanalakota venkata.s.dhanalakota@intel.com drm/i915: Correctly set SFC capability for video engines
Bhawanpreet Lakha Bhawanpreet.Lakha@amd.com drm/amd/display: Add missing pflip irq
Al Viro viro@zeniv.linux.org.uk don't dump the threads that had been already exiting when zapped.
Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com mmc: renesas_sdhi_core: Add missing tmio_mmc_host_free() at remove
Yangbo Lu yangbo.lu@nxp.com mmc: sdhci-of-esdhc: Handle pulse width detection erratum for more SoCs
Arnaud de Turckheim quarium@gmail.com gpio: pcie-idio-24: Enable PEX8311 interrupts
Arnaud de Turckheim quarium@gmail.com gpio: pcie-idio-24: Fix IRQ Enable Register value
Arnaud de Turckheim quarium@gmail.com gpio: pcie-idio-24: Fix irq mask when masking
Damien Le Moal damien.lemoal@wdc.com gpio: sifive: Fix SiFive gpio probe
Jens Axboe axboe@kernel.dk io_uring: round-up cq size before comparing with rounded sq size
Chen Zhou chenzhou10@huawei.com selinux: Fix error return code in sel_ib_pkey_sid_slow()
Naveen Krishna Chatradhi nchatrad@amd.com hwmon: (amd_energy) modify the visibility of the counters
Wengang Wang wen.gang.wang@oracle.com ocfs2: initialize ip_next_orphan
Mike Kravetz mike.kravetz@oracle.com hugetlbfs: fix anon huge page migration race
Matteo Croce mcroce@microsoft.com reboot: fix overflow parsing reboot cpu number
Matteo Croce mcroce@microsoft.com Revert "kernel/reboot.c: convert simple_strtoul to kstrtoint"
Jason Gunthorpe jgg@ziepe.ca mm/gup: use unpin_user_pages() in __gup_longterm_locked()
Nicholas Piggin npiggin@gmail.com mm/vmscan: fix NR_ISOLATED_FILE corruption on 64-bit
Laurent Dufour ldufour@linux.ibm.com mm/slub: fix panic in slab_alloc_node()
Zi Yan ziy@nvidia.com mm/compaction: stop isolation if too many pages are isolated and we have pages to migrate
Zi Yan ziy@nvidia.com mm/compaction: count pages and stop correctly during page isolation
Masami Hiramatsu mhiramat@kernel.org bootconfig: Extend the magic check range to the preceding 3 bytes
Theodore Ts'o tytso@mit.edu jbd2: fix up sparse warnings in checkpoint code
Dan Carpenter dan.carpenter@oracle.com futex: Don't enable IRQs unconditionally in put_pi_state()
Alexander Usyskin alexander.usyskin@intel.com mei: protect mei_cl_mtu from null dereference
Alexander Lobakin alobakin@pm.me virtio: virtio_console: fix DMA memory allocation for rproc serial
Zhang Qilong zhangqilong3@huawei.com xhci: hisilicon: fix refercence leak in xhci_histb_probe
Heikki Krogerus heikki.krogerus@linux.intel.com usb: typec: ucsi: Report power supply changes
Chris Brandt chris.brandt@renesas.com usb: cdc-acm: Add DISABLE_ECHO for Renesas USB Download mode
Geert Uytterhoeven geert+renesas@glider.be Revert "usb: musb: convert to devm_platform_ioremap_resource_byname"
Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com uio: Fix use-after-free in uio_unregister_device()
Petr Vorel pvorel@suse.cz loop: Fix occasional uevent drop
Christoph Hellwig hch@lst.de block: add a return value to set_capacity_revalidate_and_notify
Jing Xiangfeng jingxiangfeng@huawei.com thunderbolt: Add the missed ida_simple_remove() in ring_request_msix()
Mika Westerberg mika.westerberg@linux.intel.com thunderbolt: Fix memory leak if ida_simple_get() fails in enumerate_services()
Samuel Thibault samuel.thibault@ens-lyon.org speakup: Fix clearing selection in safe context
Samuel Thibault samuel.thibault@ens-lyon.org speakup ttyio: Do not schedule() in ttyio_in_nowait
Samuel Thibault samuel.thibault@ens-lyon.org speakup: Fix var_id_t values and thus keymap
Andrew Jones drjones@redhat.com KVM: arm64: Don't hide ID registers from userspace
Anand Jain anand.jain@oracle.com btrfs: dev-replace: fail mount if we don't have replace item with target device
Josef Bacik josef@toxicpanda.com btrfs: fix min reserved size calculation in merge_reloc_root
Dinghao Liu dinghao.liu@zju.edu.cn btrfs: ref-verify: fix memory leak in btrfs_ref_tree_mod
Matthew Wilcox (Oracle) willy@infradead.org btrfs: fix potential overflow in cluster_pages_for_defrag on 32bit arch
Joseph Qi joseph.qi@linux.alibaba.com ext4: unlock xattr_sem properly in ext4_inline_data_truncate()
Kaixu Xia kaixuxia@tencent.com ext4: correctly report "not supported" for {usr,grp}jquota when !CONFIG_QUOTA
Gao Xiang hsiangkao@redhat.com erofs: derive atime instead of leaving it empty
Gao Xiang hsiangkao@redhat.com erofs: fix setting up pcluster for temporary pages
Arnd Bergmann arnd@arndb.de firmware: xilinx: fix out-of-bounds access
Peter Zijlstra peterz@infradead.org perf: Fix event multiplexing for exclusive groups
Peter Zijlstra peterz@infradead.org perf: Simplify group_sched_in()
Sagi Grimberg sagi@grimberg.me nvme: fix incorrect behavior when BLKROSET is called by the user
Sasha Levin sashal@kernel.org nvme: freeze the queue over ->lba_shift updates
Christoph Hellwig hch@lst.de nvme: factor out a nvme_configure_metadata helper
Peter Zijlstra peterz@infradead.org perf: Fix get_recursion_context()
David Howells dhowells@redhat.com afs: Fix afs_write_end() when called with copied == 0 [ver #3]
Muchun Song songmuchun@bytedance.com mm: memcontrol: fix missing wakeup polling thread
Santosh Sivaraj santosh@fossix.org kernel/watchdog: fix watchdog_allowed_mask not used warning
Anshuman Khandual anshuman.khandual@arm.com arm64/mm: Validate hotplug range before creating linear mapping
Sven Van Asbroeck thesven73@gmail.com lan743x: fix use of uninitialized variable
Martin Willi martin@strongswan.org vrf: Fix fast path output packet handling with async Netfilter rules
Chuck Lever chuck.lever@oracle.com NFS: Fix listxattr receive buffer size
Brad Campbell brad@fnarfbargle.com hwmon: (applesmc) Re-work SMC comms
Wang Hai wanghai38@huawei.com cosa: Add missing kfree in error path of cosa_write
Rohit Maheshwari rohitm@chelsio.com ch_ktls: tcb update fails sometimes
Rohit Maheshwari rohitm@chelsio.com ch_ktls: Update cheksum information
Evan Nimmo evan.nimmo@alliedtelesis.co.nz of/address: Fix of_node memory leak in of_dma_is_coherent
Christoph Hellwig hch@lst.de xfs: fix a missing unlock on error in xfs_fs_map_blocks
Sven Van Asbroeck thesven73@gmail.com lan743x: fix "BUG: invalid wait context" when setting rx mode
Darrick J. Wong darrick.wong@oracle.com xfs: fix brainos in the refcount scrubber's rmap fragment processor
Darrick J. Wong darrick.wong@oracle.com xfs: fix rmap key and record comparison functions
Darrick J. Wong darrick.wong@oracle.com xfs: set the unwritten bit in rmap lookup flags in xchk_bmap_get_rmapextents
Darrick J. Wong darrick.wong@oracle.com xfs: fix flags argument to rmap lookup when converting shared file rmaps
Heiner Kallweit hkallweit1@gmail.com net: phy: realtek: support paged operations on RTL8201CP
Sven Van Asbroeck thesven73@gmail.com lan743x: correctly handle chips with internal PHY
Vinicius Costa Gomes vinicius.gomes@intel.com igc: Fix returning wrong statistics
Slawomir Laba slawomirx.laba@intel.com i40e: Fix MAC address setting for a VF via Host/VM
Vlad Buslov vlad@buslov.dev selftest: fix flower terse dump tests
Christoph Hellwig hch@lst.de nbd: fix a block_device refcount leak in nbd_release
Bjorn Andersson bjorn.andersson@linaro.org pinctrl: qcom: sm8250: Specify PDC map
Maulik Shah mkshah@codeaurora.org pinctrl: qcom: Move clearing pending IRQ to .irq_request_resources callback
Heiner Kallweit hkallweit1@gmail.com r8169: disable hw csum for short packets on all chip versions
Heiner Kallweit hkallweit1@gmail.com r8169: fix potential skb double free in an error path
David Verbeiren david.verbeiren@tessares.net bpf: Zero-fill re-used per-cpu map element
Lorenz Bauer lmb@cloudflare.com tools/bpftool: Fix attaching flow dissector
Dai Ngo dai.ngo@oracle.com NFSD: fix missing refcount in nfsd4_copy by nfsd4_do_async_copy
Dai Ngo dai.ngo@oracle.com NFSD: Fix use-after-free warning when doing inter-server copy
Chuck Lever chuck.lever@oracle.com SUNRPC: Fix general protection fault in trace_rpc_xdr_overflow()
Maxim Mikityanskiy maximmi@mellanox.com net/mlx5e: Fix incorrect access of RCU-protected xdp_prog
Aya Levin ayal@nvidia.com net/mlx5e: Fix VXLAN synchronization after function reload
Parav Pandit parav@nvidia.com net/mlx5: E-switch, Avoid extack error log for disabled vport
Maor Gottlieb maorg@nvidia.com net/mlx5: Fix deletion of duplicate rules
Maxim Mikityanskiy maximmi@mellanox.com net/mlx5e: Use spin_lock_bh for async_icosq_lock
Vlad Buslov vladbu@nvidia.com net/mlx5e: Protect encap route dev from concurrent release
Maor Dickman maord@nvidia.com net/mlx5e: Fix modify header actions memory leak
Billy Tsai billy_tsai@aspeedtech.com pinctrl: aspeed: Fix GPI only function problem.
Andy Shevchenko andriy.shevchenko@linux.intel.com pinctrl: mcp23s08: Use full chunk of memory for regmap configuration
Ian Rogers irogers@google.com libbpf, hashmap: Fix undefined behavior in hash_bits
Ard Biesheuvel ardb@kernel.org bpf: Don't rely on GCC __attribute__((optimize)) to disable GCSE
Andrew Jeffery andrew@aj.id.au ARM: 9019/1: kprobes: Avoid fortify_panic() when copying optprobe template
Billy Tsai billy_tsai@aspeedtech.com gpio: aspeed: fix ast2600 bank properties
Andy Shevchenko andriy.shevchenko@linux.intel.com pinctrl: intel: Set default bias in case no particular value given
Andy Shevchenko andriy.shevchenko@linux.intel.com pinctrl: intel: Fix 2 kOhm bias which is 833 Ohm
Baolin Wang baolin.wang7@gmail.com mfd: sprd: Add wakeup capability for PMIC IRQ
Martin Hundebøll martin@geanix.com spi: bcm2835: remove use of uninitialized gpio flags variable
Jerry Snitselaar jsnitsel@redhat.com tpm_tis: Disable interrupts on ThinkPad T490s
Michael Wu michael.wu@vatics.com i2c: designware: slave should do WRITE_REQUESTED before WRITE_RECEIVED
Michael Wu michael.wu@vatics.com i2c: designware: call i2c_dw_read_clear_intrbits_slave() once
Ulrich Hecht uli+renesas@fpond.eu i2c: sh_mobile: implement atomic transfers
Sean Anderson seanga2@gmail.com riscv: Set text_offset correctly for M-Mode
Benjamin Gwin bgwin@google.com arm64: kexec_file: try more regions if loading segments fails
Tommi Rantala tommi.t.rantala@nokia.com selftests: proc: fix warning: _GNU_SOURCE redefined
Brian Foster bfoster@redhat.com iomap: clean up writeback state logic on writepage error
Veerabadhran Gopalakrishnan veerabadhran.gopalakrishnan@amd.com amd/amdgpu: Disable VCN DPG mode for Picasso
Qii Wang qii.wang@mediatek.com i2c: mediatek: move dma reset before i2c reset
Fred Gao fred.gao@intel.com vfio/pci: Bypass IGD init in case of -ENODEV
Zhang Qilong zhangqilong3@huawei.com vfio: platform: fix reference leak in vfio_platform_open
Qian Cai cai@redhat.com s390/smp: move rcu_cpu_starting() earlier
Suravee Suthikulpanit suravee.suthikulpanit@amd.com iommu/amd: Increase interrupt remapping table limit to 512 entries
Sagi Grimberg sagi@grimberg.me nvme-tcp: avoid repeated request completion
Sagi Grimberg sagi@grimberg.me nvme-rdma: avoid repeated request completion
Chao Leng lengchao@huawei.com nvme-tcp: avoid race between time out and tear down
Chao Leng lengchao@huawei.com nvme-rdma: avoid race between time out and tear down
Chao Leng lengchao@huawei.com nvme: introduce nvme_sync_io_queues
Sreekanth Reddy sreekanth.reddy@broadcom.com scsi: mpt3sas: Fix timeouts observed while reenabling IRQ
Hannes Reinecke hare@suse.de scsi: scsi_dh_alua: Avoid crash during alua_bus_detach()
Vineet Gupta vgupta@synopsys.com ARC: [plat-hsdk] Remap CCMs super early in asm boot trampoline
Keith Busch kbusch@kernel.org Revert "nvme-pci: remove last_sq_tail"
Qiujun Huang hqjagain@gmail.com tracing: Fix the checking of stackidx in __ftrace_trace_stack
Jason A. Donenfeld Jason@zx2c4.com wireguard: selftests: check that route_me_harder packets use the right sk
Ye Bin yebin10@huawei.com cfg80211: regulatory: Fix inconsistent format argument
Johannes Berg johannes.berg@intel.com mac80211: always wind down STA state
Johannes Berg johannes.berg@intel.com cfg80211: initialize wdev data earlier
Johannes Berg johannes.berg@intel.com mac80211: fix use of skb payload instead of header
Evan Quan evan.quan@amd.com drm/amd/pm: do not use ixFEATURE_STATUS for checking smc running
Evan Quan evan.quan@amd.com drm/amd/pm: perform SMC reset on suspend/hibernation
Evan Quan evan.quan@amd.com drm/amd/pm: correct the baco reset sequence for CI ASICs
Evan Quan evan.quan@amd.com drm/amdgpu: perform srbm soft reset always on SDMA resume
Keita Suzuki keitasuzuki.park@sslab.ics.keio.ac.jp scsi: hpsa: Fix memory leak in hpsa_init_one()
Bob Peterson rpeterso@redhat.com gfs2: check for live vs. read-only file system in gfs2_fitrim
Bob Peterson rpeterso@redhat.com gfs2: Add missing truncate_inode_pages_final for sd_aspace
Bob Peterson rpeterso@redhat.com gfs2: Free rd_bits later in gfs2_clear_rgrpd to fix use-after-free
Joerg Roedel jroedel@suse.de x86/boot/compressed/64: Introduce sev_status
Kai-Heng Feng kai.heng.feng@canonical.com ALSA: hda: Reinstate runtime_allow() for all hda controllers
Kai-Heng Feng kai.heng.feng@canonical.com ALSA: hda: Separate runtime and system suspend
Tommi Rantala tommi.t.rantala@nokia.com selftests: pidfd: fix compilation errors due to wait.h
Colin Ian King colin.king@canonical.com selftests/ftrace: check for do_sys_openat2 in user-memory test
Zqiang qiang.zhang@windriver.com usb: raw-gadget: fix memory leak in gadget_setup
Evgeny Novikov novikov@ispras.ru usb: gadget: goku_udc: fix potential crashes in probe
Viresh Kumar viresh.kumar@linaro.org opp: Reduce the size of critical section in _opp_table_kref_release()
Heikki Krogerus heikki.krogerus@linux.intel.com usb: dwc3: pci: add support for the Intel Alder Lake-S
Bard Liao yung-chuan.liao@linux.intel.com ASoC: SOF: loader: handle all SOF_IPC_EXT types
Olivier Moysan olivier.moysan@st.com ASoC: cs42l51: manage mclk shutdown delay
Srinivas Kandagatla srinivas.kandagatla@linaro.org ASoC: qcom: sdm845: set driver name correctly
Tzung-Bi Shih tzungbi@google.com ASoC: mediatek: mt8183-da7219: fix DAPM paths for rt1015
Pujin Shi shipujin.t@gmail.com scsi: ufs: Fix missing brace warning for old compilers
Masashi Honma masashi.honma@gmail.com ath9k_htc: Use appropriate rs_datalen type
Stephen Boyd swboyd@chromium.org KVM: arm64: ARM_SMCCC_ARCH_WORKAROUND_1 doesn't return SMCCC_RET_NOT_REQUIRED
Tyler Hicks tyhicks@linux.microsoft.com tpm: efi: Don't create binary_bios_measurements file for an empty log
Zhang Qilong zhangqilong3@huawei.com USB: apple-mfi-fastcharge: fix reference leak in apple_mfi_fc_set_property
Palmer Dabbelt palmerdabbelt@google.com RISC-V: Fix the VDSO symbol generaton for binutils-2.35+
Bill Wendling morbo@google.com kbuild: explicitly specify the build id style
Anand K Mistry amistry@google.com x86/speculation: Allow IBPB to be conditionally enabled on CPUs with always-on STIBP
Tommi Rantala tommi.t.rantala@nokia.com selftests: binderfs: use SKIP instead of XFAIL
Tommi Rantala tommi.t.rantala@nokia.com selftests: clone3: use SKIP instead of XFAIL
Tommi Rantala tommi.t.rantala@nokia.com selftests: core: use SKIP instead of XFAIL in close_range_test.c
Jeff Layton jlayton@kernel.org ceph: check session state after bumping session->s_seq
Rob Herring robh@kernel.org PCI: mvebu: Fix duplicate resource requests
Zhao Qiang qiang.zhao@nxp.com spi: fsl-dspi: fix wrong pointer in suspend/resume
Jens Axboe axboe@kernel.dk io_uring: ensure consistent view of original task ->mm from SQPOLL
Darrick J. Wong darrick.wong@oracle.com xfs: fix scrub flagging rtinherit even if there is no rt device
Darrick J. Wong darrick.wong@oracle.com xfs: fix missing CoW blocks writeback conversion retry
Brian Foster bfoster@redhat.com xfs: flush new eof page on truncate to avoid post-eof corruption
Joakim Zhang qiangqing.zhang@nxp.com can: flexcan: flexcan_remove(): disable wakeup completely
Joakim Zhang qiangqing.zhang@nxp.com can: flexcan: remove FLEXCAN_QUIRK_DISABLE_MECR quirk for LS1021A
Stephane Grosjean s.grosjean@peak-system.com can: peak_canfd: pucan_handle_can_rx(): fix echo management when loopback is on
Stephane Grosjean s.grosjean@peak-system.com can: peak_usb: peak_usb_get_ts_time(): fix timestamp wrapping
Dan Carpenter dan.carpenter@oracle.com can: peak_usb: add range checking in decode operations
Navid Emamdoost navid.emamdoost@gmail.com can: xilinx_can: handle failure cases of pm_runtime_get_sync
Zhang Changzhong zhangchangzhong@huawei.com can: ti_hecc: ti_hecc_probe(): add missed clk_disable_unprepare() in error path
Zhang Changzhong zhangchangzhong@huawei.com can: j1939: j1939_sk_bind(): return failure if netdev is down
Yegor Yefremov yegorslists@googlemail.com can: j1939: swap addr and pgn in the send example
Oleksij Rempel linux@rempel-privat.de can: can_create_echo_skb(): fix echo skb generation: always use skb_clone()
Oliver Hartkopp socketcan@hartkopp.net can: dev: __can_get_echo_skb(): fix real payload length return value for RTR frames
Vincent Mailhol mailhol.vincent@wanadoo.fr can: dev: can_get_echo_skb(): prevent call to kfree_skb() in hard IRQ context
Marc Kleine-Budde mkl@pengutronix.de can: rx-offload: don't call kfree_skb() from IRQ context
Alex Williamson alex.williamson@redhat.com vfio/pci: Implement ioeventfd thread handler for contended memory lock
David Howells dhowells@redhat.com afs: Fix incorrect freeing of the ACL passed to the YFS ACL store op
David Howells dhowells@redhat.com afs: Fix warning due to unadvanced marshalling pointer
Liu, Yi L yi.l.liu@intel.com iommu/vt-d: Fix a bug for PDP check in prq_event_thread
Liu Yi L yi.l.liu@intel.com iommu/vt-d: Fix sid not set issue in intel_svm_bind_gpasid()
Dan Carpenter dan.carpenter@oracle.com ALSA: hda: prevent undefined shift in snd_hdac_ext_bus_get_link()
Namhyung Kim namhyung@kernel.org perf tools: Add missing swap for cgroup events
Jiri Olsa jolsa@kernel.org perf tools: Add missing swap for ino_generation
Stanislav Ivanichkin sivanichkin@yandex-team.ru perf trace: Fix segfault when trying to trace events by cgroup
Steven Price steven.price@arm.com drm/panfrost: Fix module unload
Clément Péron peron.clem@gmail.com drm/panfrost: move devfreq_init()/fini() in device
Clément Péron peron.clem@gmail.com drm/panfrost: rename error labels in device_init
zhongjiang-ali zhongjiang-ali@linux.alibaba.com mm: memcontrol: correct the NR_ANON_THPS counter of hierarchical memcg
Maor Gottlieb maorg@nvidia.com IB/srpt: Fix memory leak in srpt_add_one
Maxime Ripard maxime@cerno.tech drm/vc4: bo: Add a managed action to cleanup the cache
Qian Cai cai@redhat.com powerpc/eeh_cache: Fix a possible debugfs deadlock
Greentime Hu greentime.hu@sifive.com irqchip/sifive-plic: Fix chip_data access within a hierarchy
Stefano Brivio sbrivio@redhat.com netfilter: ipset: Update byte and packet counters regardless of whether they match
Rajat Jain rajatja@google.com PCI: Always enable ACS even if no ACS Capability
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: missing validation from the abort path
Jason A. Donenfeld Jason@zx2c4.com netfilter: use actual socket sk rather than skb sk when routing harder
Pablo Neira Ayuso pablo@netfilter.org netfilter: nftables: fix netlink report logic in flowtable and genid
Johannes Berg johannes.berg@intel.com mac80211: don't require VHT elements for HE on 2.4 GHz
Darrick J. Wong darrick.wong@oracle.com xfs: set xefi_discard when creating a deferred agfl free log intent item
Bert Vermeulen bert@biot.com mtd: spi-nor: Fix address width on flash chips > 16MB
Srinivas Kandagatla srinivas.kandagatla@linaro.org ASoC: codecs: wcd9335: Set digital gain range correctly
Srinivas Kandagatla srinivas.kandagatla@linaro.org ASoC: codecs: wcd934x: Set digital gain range correctly
Tommi Rantala tommi.t.rantala@nokia.com selftests: filter kselftest headers from command in lib.mk
Ran Wang ran.wang_1@nxp.com usb: gadget: fsl: fix null pointer checking
Andy Shevchenko andriy.shevchenko@linux.intel.com kunit: Don't fail test suites if one of them is empty
David Gow davidgow@google.com kunit: Fix kunit.py --raw_output option
Greentime Hu greentime.hu@sifive.com irqchip/sifive-plic: Fix broken irq_set_affinity() callback
Sascha Hauer s.hauer@pengutronix.de spi: imx: fix runtime pm support for !CONFIG_PM
Srinivas Kandagatla srinivas.kandagatla@linaro.org ASoC: codecs: wsa881x: add missing stream rates and format
zhuoliang zhang zhuoliang.zhang@mediatek.com net: xfrm: fix a race condition during allocing spi
Olaf Hering olaf@aepfle.de hv_balloon: disable warning when floor reached
Marc Zyngier maz@kernel.org genirq: Let GENERIC_IRQ_IPI select IRQ_DOMAIN_HIERARCHY
Tomasz Figa tfiga@chromium.org ASoC: Intel: kbl_rt5663_max98927: Fix kabylake_ssp_fixup function
Xin Long lucien.xin@gmail.com xfrm: interface: fix the priorities for ipip and ipv6 tunnels
Santosh Shukla sashukla@nvidia.com KVM: arm64: Force PTE mapping on fault resulting in a device mapping
Ming Lei ming.lei@redhat.com nbd: don't update block size after device is started
Roman Gushchin guro@fb.com mm: memcg: link page counters to root if use_hierarchy is false
Chris Wilson chris@chris-wilson.co.uk drm/i915/gem: Flush coherency domains on first set-domain-ioctl
Chris Wilson chris@chris-wilson.co.uk drm/i915: Hold onto an explicit ref to i915_vma_work.pinned
-------------
Diffstat:
Documentation/networking/j1939.rst | 4 +- Makefile | 8 +- arch/arc/kernel/head.S | 17 +- arch/arc/plat-hsdk/platform.c | 17 -- arch/arm/include/asm/kprobes.h | 22 +-- arch/arm/probes/kprobes/opt-arm.c | 18 +- arch/arm/vdso/Makefile | 2 +- arch/arm64/kernel/kexec_image.c | 41 +++- arch/arm64/kernel/machine_kexec_file.c | 9 +- arch/arm64/kernel/vdso/Makefile | 2 +- arch/arm64/kernel/vdso32/Makefile | 2 +- arch/arm64/kvm/hypercalls.c | 2 +- arch/arm64/kvm/mmu.c | 1 + arch/arm64/kvm/sys_regs.c | 18 +- arch/arm64/mm/mmu.c | 17 ++ arch/mips/vdso/Makefile | 2 +- arch/powerpc/kernel/eeh_cache.c | 5 +- arch/powerpc/kernel/head_32.S | 12 -- arch/riscv/kernel/head.S | 5 + arch/riscv/kernel/vdso/.gitignore | 1 + arch/riscv/kernel/vdso/Makefile | 18 +- arch/riscv/kernel/vdso/so2s.sh | 6 + arch/s390/kernel/smp.c | 3 +- arch/s390/kernel/vdso64/Makefile | 2 +- arch/sparc/vdso/Makefile | 2 +- arch/x86/boot/compressed/mem_encrypt.S | 16 +- arch/x86/entry/vdso/Makefile | 2 +- arch/x86/kernel/cpu/bugs.c | 51 +++-- block/genhd.c | 5 +- drivers/accessibility/speakup/main.c | 1 - drivers/accessibility/speakup/selection.c | 11 +- drivers/accessibility/speakup/speakup.h | 1 - drivers/accessibility/speakup/spk_ttyio.c | 10 +- drivers/accessibility/speakup/spk_types.h | 8 +- drivers/block/loop.c | 3 +- drivers/block/nbd.c | 10 +- drivers/block/null_blk.h | 1 + drivers/block/null_blk_zoned.c | 31 ++- drivers/char/tpm/eventlog/efi.c | 5 + drivers/char/tpm/tpm_tis.c | 29 ++- drivers/char/virtio_console.c | 8 +- drivers/cpufreq/cpufreq.c | 4 +- drivers/cpufreq/cpufreq_governor.h | 2 +- drivers/cpufreq/cpufreq_performance.c | 1 + drivers/cpufreq/cpufreq_powersave.c | 1 + drivers/cpufreq/intel_pstate.c | 16 +- drivers/crypto/chelsio/chcr_ktls.c | 27 ++- drivers/firmware/xilinx/zynqmp.c | 3 + drivers/gpio/gpio-aspeed.c | 1 + drivers/gpio/gpio-pcie-idio-24.c | 62 +++++- drivers/gpio/gpio-sifive.c | 2 +- drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 27 ++- drivers/gpu/drm/amd/amdgpu/soc15.c | 3 +- .../amd/display/dc/irq/dcn30/irq_service_dcn30.c | 4 +- drivers/gpu/drm/amd/powerplay/hwmgr/ci_baco.c | 7 +- drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 4 + drivers/gpu/drm/amd/powerplay/inc/hwmgr.h | 1 + drivers/gpu/drm/amd/powerplay/inc/smumgr.h | 2 + drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c | 29 ++- drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c | 8 + drivers/gpu/drm/gma500/psb_irq.c | 34 ++-- drivers/gpu/drm/i915/gem/i915_gem_domain.c | 28 ++- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 3 +- drivers/gpu/drm/i915/i915_vma.c | 6 +- drivers/gpu/drm/panfrost/panfrost_device.c | 40 ++-- drivers/gpu/drm/panfrost/panfrost_drv.c | 20 +- drivers/gpu/drm/vc4/vc4_bo.c | 5 +- drivers/gpu/drm/vc4/vc4_drv.c | 1 - drivers/gpu/drm/vc4/vc4_drv.h | 2 +- drivers/hv/hv_balloon.c | 2 +- drivers/hwmon/amd_energy.c | 2 +- drivers/hwmon/applesmc.c | 130 ++++++++----- drivers/hwtracing/coresight/coresight-etm-perf.c | 4 +- drivers/i2c/busses/i2c-designware-slave.c | 52 ++--- drivers/i2c/busses/i2c-mt65xx.c | 8 +- drivers/i2c/busses/i2c-sh_mobile.c | 86 +++++++-- drivers/infiniband/ulp/srpt/ib_srpt.c | 13 +- drivers/iommu/amd/amd_iommu_types.h | 6 +- drivers/iommu/intel/svm.c | 8 +- drivers/irqchip/irq-sifive-plic.c | 10 +- drivers/mfd/sprd-sc27xx-spi.c | 28 ++- drivers/misc/mei/client.h | 4 +- drivers/mmc/host/renesas_sdhi_core.c | 1 + drivers/mmc/host/sdhci-of-esdhc.c | 2 + drivers/mtd/spi-nor/core.c | 8 +- drivers/net/can/dev.c | 14 +- drivers/net/can/flexcan.c | 5 +- drivers/net/can/peak_canfd/peak_canfd.c | 11 +- drivers/net/can/rx-offload.c | 4 +- drivers/net/can/ti_hecc.c | 8 +- drivers/net/can/usb/peak_usb/pcan_usb_core.c | 51 ++++- drivers/net/can/usb/peak_usb/pcan_usb_fd.c | 48 +++-- drivers/net/can/xilinx_can.c | 6 +- drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 26 ++- drivers/net/ethernet/intel/igc/igc_main.c | 14 +- .../net/ethernet/mellanox/mlx5/core/en/rep/tc.c | 6 +- .../net/ethernet/mellanox/mlx5/core/en/tc_tun.c | 72 ++++--- .../net/ethernet/mellanox/mlx5/core/en/xsk/setup.c | 4 +- .../net/ethernet/mellanox/mlx5/core/en/xsk/tx.c | 4 +- .../ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c | 14 +- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 1 + drivers/net/ethernet/mellanox/mlx5/core/en_rep.h | 2 +- drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 2 +- drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 2 + drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 2 - drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 7 +- .../net/ethernet/mellanox/mlx5/core/lib/vxlan.c | 23 ++- .../net/ethernet/mellanox/mlx5/core/lib/vxlan.h | 2 + drivers/net/ethernet/microchip/lan743x_main.c | 24 +-- drivers/net/ethernet/microchip/lan743x_main.h | 3 - drivers/net/ethernet/realtek/r8169_main.c | 18 +- drivers/net/phy/realtek.c | 2 + drivers/net/vrf.c | 92 ++++++--- drivers/net/wan/cosa.c | 1 + drivers/net/wireless/ath/ath9k/htc_drv_txrx.c | 2 +- drivers/nvme/host/core.c | 106 ++++++---- drivers/nvme/host/nvme.h | 1 + drivers/nvme/host/pci.c | 23 ++- drivers/nvme/host/rdma.c | 14 +- drivers/nvme/host/tcp.c | 16 +- drivers/of/address.c | 4 +- drivers/opp/core.c | 7 +- drivers/pci/controller/pci-mvebu.c | 23 +-- drivers/pci/pci.c | 9 +- drivers/pinctrl/aspeed/pinctrl-aspeed.c | 7 +- drivers/pinctrl/intel/pinctrl-intel.c | 40 +++- drivers/pinctrl/pinctrl-amd.c | 6 +- drivers/pinctrl/pinctrl-mcp23s08_spi.c | 2 +- drivers/pinctrl/qcom/pinctrl-msm.c | 32 +-- drivers/pinctrl/qcom/pinctrl-sm8250.c | 18 ++ drivers/scsi/device_handler/scsi_dh_alua.c | 9 +- drivers/scsi/hpsa.c | 4 +- drivers/scsi/mpt3sas/mpt3sas_base.c | 7 + drivers/scsi/ufs/ufshcd-crypto.c | 4 +- drivers/spi/spi-bcm2835.c | 3 +- drivers/spi/spi-fsl-dspi.c | 10 +- drivers/spi/spi-imx.c | 23 ++- drivers/thunderbolt/nhi.c | 19 +- drivers/thunderbolt/xdomain.c | 1 + drivers/uio/uio.c | 10 +- drivers/usb/class/cdc-acm.c | 9 + drivers/usb/dwc3/dwc3-pci.c | 4 + drivers/usb/gadget/legacy/raw_gadget.c | 5 +- drivers/usb/gadget/udc/fsl_udc_core.c | 2 +- drivers/usb/gadget/udc/goku_udc.c | 2 +- drivers/usb/host/xhci-histb.c | 2 +- drivers/usb/misc/apple-mfi-fastcharge.c | 4 +- drivers/usb/musb/musb_dsps.c | 4 +- drivers/usb/typec/ucsi/psy.c | 9 + drivers/usb/typec/ucsi/ucsi.c | 7 +- drivers/usb/typec/ucsi/ucsi.h | 2 + drivers/vfio/pci/vfio_pci.c | 2 +- drivers/vfio/pci/vfio_pci_rdwr.c | 43 ++++- drivers/vfio/platform/vfio_platform_common.c | 3 +- fs/afs/write.c | 5 +- fs/afs/xattr.c | 7 +- fs/afs/yfsclient.c | 1 + fs/btrfs/dev-replace.c | 26 ++- fs/btrfs/ioctl.c | 10 +- fs/btrfs/ref-verify.c | 1 + fs/btrfs/relocation.c | 4 +- fs/btrfs/volumes.c | 26 +-- fs/ceph/caps.c | 2 +- fs/ceph/mds_client.c | 50 +++-- fs/ceph/mds_client.h | 1 + fs/ceph/quota.c | 2 +- fs/ceph/snap.c | 2 +- fs/cifs/cifs_unicode.c | 8 +- fs/erofs/inode.c | 21 +- fs/erofs/zdata.c | 7 +- fs/ext4/inline.c | 1 + fs/ext4/super.c | 4 +- fs/gfs2/rgrp.c | 5 +- fs/gfs2/super.c | 1 + fs/io_uring.c | 29 ++- fs/iomap/buffered-io.c | 15 +- fs/jbd2/checkpoint.c | 2 + fs/jbd2/transaction.c | 4 +- fs/nfs/nfs42xattr.c | 2 + fs/nfs/nfs42xdr.c | 4 +- fs/nfsd/nfs4proc.c | 3 +- fs/ocfs2/super.c | 1 + fs/xfs/libxfs/xfs_alloc.c | 1 + fs/xfs/libxfs/xfs_bmap.h | 2 +- fs/xfs/libxfs/xfs_rmap.c | 2 +- fs/xfs/libxfs/xfs_rmap_btree.c | 16 +- fs/xfs/scrub/bmap.c | 2 + fs/xfs/scrub/inode.c | 3 +- fs/xfs/scrub/refcount.c | 8 +- fs/xfs/xfs_aops.c | 6 +- fs/xfs/xfs_iops.c | 10 + fs/xfs/xfs_pnfs.c | 2 +- include/linux/arm-smccc.h | 2 + include/linux/can/skb.h | 20 +- include/linux/compiler-gcc.h | 2 - include/linux/compiler_types.h | 4 - include/linux/cpufreq.h | 18 +- include/linux/genhd.h | 2 +- include/linux/memcontrol.h | 11 +- include/linux/netfilter/nfnetlink.h | 9 +- include/linux/netfilter_ipv4.h | 2 +- include/linux/netfilter_ipv6.h | 10 +- include/trace/events/sunrpc.h | 8 +- init/main.c | 14 +- kernel/bpf/Makefile | 6 +- kernel/bpf/core.c | 2 +- kernel/bpf/hashtab.c | 30 ++- kernel/dma/swiotlb.c | 6 +- kernel/events/core.c | 12 +- kernel/events/internal.h | 2 +- kernel/exit.c | 5 +- kernel/futex.c | 5 +- kernel/irq/Kconfig | 1 + kernel/reboot.c | 28 +-- kernel/sched/cpufreq_schedutil.c | 2 +- kernel/trace/trace.c | 4 +- kernel/watchdog.c | 4 +- mm/compaction.c | 12 +- mm/gup.c | 14 +- mm/hugetlb.c | 90 +-------- mm/memcontrol.c | 28 ++- mm/memory-failure.c | 36 ++-- mm/migrate.c | 44 +++-- mm/rmap.c | 5 +- mm/slub.c | 2 +- mm/vmscan.c | 5 +- net/can/j1939/socket.c | 6 + net/core/devlink.c | 8 +- net/ethtool/features.c | 2 +- net/ipv4/ip_tunnel_core.c | 4 +- net/ipv4/netfilter.c | 8 +- net/ipv4/netfilter/iptable_mangle.c | 2 +- net/ipv4/netfilter/nf_reject_ipv4.c | 2 +- net/ipv4/syncookies.c | 9 +- net/ipv4/udp_offload.c | 19 +- net/ipv4/xfrm4_tunnel.c | 4 +- net/ipv6/netfilter.c | 6 +- net/ipv6/netfilter/ip6table_mangle.c | 2 +- net/ipv6/sit.c | 2 - net/ipv6/syncookies.c | 10 +- net/ipv6/udp_offload.c | 17 +- net/ipv6/xfrm6_tunnel.c | 4 +- net/iucv/af_iucv.c | 3 +- net/mac80211/mlme.c | 3 +- net/mac80211/sta_info.c | 18 ++ net/mac80211/tx.c | 37 ++-- net/mptcp/protocol.c | 1 + net/netfilter/ipset/ip_set_core.c | 3 +- net/netfilter/ipvs/ip_vs_core.c | 4 +- net/netfilter/nf_nat_proto.c | 4 +- net/netfilter/nf_synproxy_core.c | 2 +- net/netfilter/nf_tables_api.c | 19 +- net/netfilter/nfnetlink.c | 22 ++- net/netfilter/nft_chain_route.c | 4 +- net/netfilter/utils.c | 4 +- net/tipc/topsrv.c | 10 +- net/wireless/core.c | 57 +++--- net/wireless/core.h | 5 +- net/wireless/nl80211.c | 3 +- net/wireless/reg.c | 2 +- net/x25/af_x25.c | 2 +- net/xfrm/xfrm_interface.c | 8 +- net/xfrm/xfrm_state.c | 8 +- security/selinux/ibpkey.c | 4 +- sound/hda/ext/hdac_ext_controller.c | 2 + sound/pci/hda/hda_controller.h | 3 +- sound/pci/hda/hda_intel.c | 63 +++--- sound/soc/codecs/cs42l51.c | 22 ++- sound/soc/codecs/wcd9335.c | 2 +- sound/soc/codecs/wcd934x.c | 2 +- sound/soc/codecs/wsa881x.c | 2 + sound/soc/intel/boards/kbl_rt5663_max98927.c | 39 +++- sound/soc/mediatek/mt8183/mt8183-da7219-max98357.c | 31 ++- sound/soc/qcom/sdm845.c | 2 + sound/soc/sof/loader.c | 5 + tools/bpf/bpftool/prog.c | 2 +- tools/lib/bpf/hashmap.h | 15 +- tools/perf/builtin-trace.c | 15 +- .../util/scripting-engines/trace-event-python.c | 7 +- tools/perf/util/session.c | 14 ++ tools/testing/kunit/kunit_parser.py | 3 +- tools/testing/selftests/bpf/Makefile | 2 +- tools/testing/selftests/bpf/prog_tests/map_init.c | 214 +++++++++++++++++++++ tools/testing/selftests/bpf/progs/test_map_init.c | 33 ++++ .../clone3/clone3_cap_checkpoint_restore.c | 2 +- tools/testing/selftests/core/close_range_test.c | 8 +- .../selftests/filesystems/binderfs/binderfs_test.c | 8 +- .../ftrace/test.d/kprobe/kprobe_args_user.tc | 4 + tools/testing/selftests/lib.mk | 2 +- tools/testing/selftests/pidfd/pidfd_open_test.c | 1 - tools/testing/selftests/pidfd/pidfd_poll_test.c | 1 - tools/testing/selftests/proc/proc-loadavg-001.c | 1 - tools/testing/selftests/proc/proc-self-syscall.c | 1 - tools/testing/selftests/proc/proc-uptime-002.c | 1 - .../tc-testing/tc-tests/filters/tests.json | 4 +- tools/testing/selftests/wireguard/netns.sh | 8 + .../testing/selftests/wireguard/qemu/kernel.config | 2 + 297 files changed, 2450 insertions(+), 1241 deletions(-)
From: Chris Wilson chris@chris-wilson.co.uk
[ Upstream commit 537457a979a02a410b555fab289dcb28b588f33b ]
Since __vma_release is run by a kworker after the fence has been signaled, it is no longer protected by the active reference on the vma, and so the alias of vw->pinned to vma->obj is also not protected by a reference on the object. Add an explicit reference for vw->pinned so it will always be safe.
Found by inspection.
Fixes: 54d7195f8c64 ("drm/i915: Unpin vma->obj on early error") Reported-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: Tvrtko Ursulin tvrtko.ursulin@intel.com Cc: stable@vger.kernel.org # v5.6+ Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20201102161931.30031-1-chris@c... (cherry picked from commit bc73e5d33048b7ab5f12b11b5d923700467a8e1d) Signed-off-by: Rodrigo Vivi rodrigo.vivi@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/i915_vma.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index bc64f773dcdb4..034d0a8d24c8c 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -315,8 +315,10 @@ static void __vma_release(struct dma_fence_work *work) { struct i915_vma_work *vw = container_of(work, typeof(*vw), base);
- if (vw->pinned) + if (vw->pinned) { __i915_gem_object_unpin_pages(vw->pinned); + i915_gem_object_put(vw->pinned); + } }
static const struct dma_fence_work_ops bind_ops = { @@ -430,7 +432,7 @@ int i915_vma_bind(struct i915_vma *vma,
if (vma->obj) { __i915_gem_object_pin_pages(vma->obj); - work->pinned = vma->obj; + work->pinned = i915_gem_object_get(vma->obj); } } else { ret = vma->ops->bind_vma(vma->vm, vma, cache_level, bind_flags);
From: Chris Wilson chris@chris-wilson.co.uk
[ Upstream commit 59dd13ad310793757e34afa489dd6fc8544fc3da ]
Avoid skipping what appears to be a no-op set-domain-ioctl if the cache coherency state is inconsistent with our target domain. This also has the utility of using the population of the pages to validate the backing store.
The danger in skipping the first set-domain is leaving the cache inconsistent and submitting stale data, or worse leaving the clean data in the cache and not flushing it to the GPU. The impact should be small as it requires a no-op set-domain as the very first ioctl in a particular sequence not found in typical userspace.
Reported-by: Zbigniew Kempczyński zbigniew.kempczynski@intel.com Fixes: 754a25442705 ("drm/i915: Skip object locking around a no-op set-domain ioctl") Testcase: igt/gem_mmap_offset/blt-coherency Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com Cc: Matthew Auld matthew.william.auld@gmail.com Cc: Zbigniew Kempczyński zbigniew.kempczynski@intel.com Cc: stable@vger.kernel.org # v5.2+ Reviewed-by: Matthew Auld matthew.auld@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20201019203825.10966-1-chris@c... (cherry picked from commit 44c2200afcd59f441b43f27829b4003397cc495d) Signed-off-by: Rodrigo Vivi rodrigo.vivi@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/gem/i915_gem_domain.c | 28 ++++++++++------------ 1 file changed, 13 insertions(+), 15 deletions(-)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c index 7f76fc68f498a..ba8758011e297 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c @@ -484,21 +484,6 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, if (!obj) return -ENOENT;
- /* - * Already in the desired write domain? Nothing for us to do! - * - * We apply a little bit of cunning here to catch a broader set of - * no-ops. If obj->write_domain is set, we must be in the same - * obj->read_domains, and only that domain. Therefore, if that - * obj->write_domain matches the request read_domains, we are - * already in the same read/write domain and can skip the operation, - * without having to further check the requested write_domain. - */ - if (READ_ONCE(obj->write_domain) == read_domains) { - err = 0; - goto out; - } - /* * Try to flush the object off the GPU without holding the lock. * We will repeat the flush holding the lock in the normal manner @@ -536,6 +521,19 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, if (err) goto out;
+ /* + * Already in the desired write domain? Nothing for us to do! + * + * We apply a little bit of cunning here to catch a broader set of + * no-ops. If obj->write_domain is set, we must be in the same + * obj->read_domains, and only that domain. Therefore, if that + * obj->write_domain matches the request read_domains, we are + * already in the same read/write domain and can skip the operation, + * without having to further check the requested write_domain. + */ + if (READ_ONCE(obj->write_domain) == read_domains) + goto out_unpin; + err = i915_gem_object_lock_interruptible(obj); if (err) goto out_unpin;
From: Roman Gushchin guro@fb.com
[ Upstream commit 8de15e920dc85d1705ab9c202c95d56845bc2d48 ]
Richard reported a warning which can be reproduced by running the LTP madvise6 test (cgroup v1 in the non-hierarchical mode should be used):
WARNING: CPU: 0 PID: 12 at mm/page_counter.c:57 page_counter_uncharge (mm/page_counter.c:57 mm/page_counter.c:50 mm/page_counter.c:156) Modules linked in: CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.9.0-rc7-22-default #77 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-48-gd9c812d-rebuilt.opensuse.org 04/01/2014 Workqueue: events drain_local_stock RIP: 0010:page_counter_uncharge (mm/page_counter.c:57 mm/page_counter.c:50 mm/page_counter.c:156) Call Trace: __memcg_kmem_uncharge (mm/memcontrol.c:3022) drain_obj_stock (./include/linux/rcupdate.h:689 mm/memcontrol.c:3114) drain_local_stock (mm/memcontrol.c:2255) process_one_work (./arch/x86/include/asm/jump_label.h:25 ./include/linux/jump_label.h:200 ./include/trace/events/workqueue.h:108 kernel/workqueue.c:2274) worker_thread (./include/linux/list.h:282 kernel/workqueue.c:2416) kthread (kernel/kthread.c:292) ret_from_fork (arch/x86/entry/entry_64.S:300)
The problem occurs because in the non-hierarchical mode non-root page counters are not linked to root page counters, so the charge is not propagated to the root memory cgroup.
After the removal of the original memory cgroup and reparenting of the object cgroup, the root cgroup might be uncharged by draining a objcg stock, for example. It leads to an eventual underflow of the charge and triggers a warning.
Fix it by linking all page counters to corresponding root page counters in the non-hierarchical mode.
Please note, that in the non-hierarchical mode all objcgs are always reparented to the root memory cgroup, even if the hierarchy has more than 1 level. This patch doesn't change it.
The patch also doesn't affect how the hierarchical mode is working, which is the only sane and truly supported mode now.
Thanks to Richard for reporting, debugging and providing an alternative version of the fix!
Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API") Reported-by: ltp@lists.linux.it Signed-off-by: Roman Gushchin guro@fb.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Reviewed-by: Shakeel Butt shakeelb@google.com Reviewed-by: Michal Koutný mkoutny@suse.com Acked-by: Johannes Weiner hannes@cmpxchg.org Cc: Michal Hocko mhocko@kernel.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20201026231326.3212225-1-guro@fb.com Debugged-by: Richard Palethorpe rpalethorpe@suse.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- mm/memcontrol.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 9eefdb9cc2303..de51787831728 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -5298,7 +5298,13 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css) memcg->swappiness = mem_cgroup_swappiness(parent); memcg->oom_kill_disable = parent->oom_kill_disable; } - if (parent && parent->use_hierarchy) { + if (!parent) { + page_counter_init(&memcg->memory, NULL); + page_counter_init(&memcg->swap, NULL); + page_counter_init(&memcg->memsw, NULL); + page_counter_init(&memcg->kmem, NULL); + page_counter_init(&memcg->tcpmem, NULL); + } else if (parent->use_hierarchy) { memcg->use_hierarchy = true; page_counter_init(&memcg->memory, &parent->memory); page_counter_init(&memcg->swap, &parent->swap); @@ -5306,11 +5312,11 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css) page_counter_init(&memcg->kmem, &parent->kmem); page_counter_init(&memcg->tcpmem, &parent->tcpmem); } else { - page_counter_init(&memcg->memory, NULL); - page_counter_init(&memcg->swap, NULL); - page_counter_init(&memcg->memsw, NULL); - page_counter_init(&memcg->kmem, NULL); - page_counter_init(&memcg->tcpmem, NULL); + page_counter_init(&memcg->memory, &root_mem_cgroup->memory); + page_counter_init(&memcg->swap, &root_mem_cgroup->swap); + page_counter_init(&memcg->memsw, &root_mem_cgroup->memsw); + page_counter_init(&memcg->kmem, &root_mem_cgroup->kmem); + page_counter_init(&memcg->tcpmem, &root_mem_cgroup->tcpmem); /* * Deeper hierachy with use_hierarchy == false doesn't make * much sense so let cgroup subsystem know about this
From: Ming Lei ming.lei@redhat.com
[ Upstream commit b40813ddcd6bf9f01d020804e4cb8febc480b9e4 ]
Mounted NBD device can be resized, one use case is rbd-nbd.
Fix the issue by setting up default block size, then not touch it in nbd_size_update() any more. This kind of usage is aligned with loop which has same use case too.
Cc: stable@vger.kernel.org Fixes: c8a83a6b54d0 ("nbd: Use set_blocksize() to set device blocksize") Reported-by: lining lining2020x@163.com Signed-off-by: Ming Lei ming.lei@redhat.com Cc: Josef Bacik josef@toxicpanda.com Cc: Jan Kara jack@suse.cz Tested-by: lining lining2020x@163.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/nbd.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index f46e26c9d9b3c..d76fca629c143 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -296,7 +296,7 @@ static void nbd_size_clear(struct nbd_device *nbd) } }
-static void nbd_size_update(struct nbd_device *nbd) +static void nbd_size_update(struct nbd_device *nbd, bool start) { struct nbd_config *config = nbd->config; struct block_device *bdev = bdget_disk(nbd->disk, 0); @@ -312,7 +312,8 @@ static void nbd_size_update(struct nbd_device *nbd) if (bdev) { if (bdev->bd_disk) { bd_set_size(bdev, config->bytesize); - set_blocksize(bdev, config->blksize); + if (start) + set_blocksize(bdev, config->blksize); } else bdev->bd_invalidated = 1; bdput(bdev); @@ -327,7 +328,7 @@ static void nbd_size_set(struct nbd_device *nbd, loff_t blocksize, config->blksize = blocksize; config->bytesize = blocksize * nr_blocks; if (nbd->task_recv != NULL) - nbd_size_update(nbd); + nbd_size_update(nbd, false); }
static void nbd_complete_rq(struct request *req) @@ -1307,7 +1308,7 @@ static int nbd_start_device(struct nbd_device *nbd) args->index = i; queue_work(nbd->recv_workq, &args->work); } - nbd_size_update(nbd); + nbd_size_update(nbd, true); return error; }
From: Santosh Shukla sashukla@nvidia.com
[ Upstream commit 91a2c34b7d6fadc9c5d9433c620ea4c32ee7cae8 ]
VFIO allows a device driver to resolve a fault by mapping a MMIO range. This can be subsequently result in user_mem_abort() to try and compute a huge mapping based on the MMIO pfn, which is a sure recipe for things to go wrong.
Instead, force a PTE mapping when the pfn faulted in has a device mapping.
Fixes: 6d674e28f642 ("KVM: arm/arm64: Properly handle faulting of device mappings") Suggested-by: Marc Zyngier maz@kernel.org Signed-off-by: Santosh Shukla sashukla@nvidia.com [maz: rewritten commit message] Signed-off-by: Marc Zyngier maz@kernel.org Reviewed-by: Gavin Shan gshan@redhat.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1603711447-11998-2-git-send-email-sashukla@nvidia.... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/kvm/mmu.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 3d26b47a13430..7a4ad984d54e0 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1920,6 +1920,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, if (kvm_is_device_pfn(pfn)) { mem_type = PAGE_S2_DEVICE; flags |= KVM_S2PTE_FLAG_IS_IOMAP; + force_pte = true; } else if (logging_active) { /* * Faults on pages in a memslot with logging enabled
From: Xin Long lucien.xin@gmail.com
[ Upstream commit 7fe94612dd4cfcd35fe0ec87745fb31ad2be71f8 ]
As Nicolas noticed in his case, when xfrm_interface module is installed the standard IP tunnels will break in receiving packets.
This is caused by the IP tunnel handlers with a higher priority in xfrm interface processing incoming packets by xfrm_input(), which would drop the packets and return 0 instead when anything wrong happens.
Rather than changing xfrm_input(), this patch is to adjust the priority for the IP tunnel handlers in xfrm interface, so that the packets would go to xfrmi's later than the others', as the others' would not drop the packets when the handlers couldn't process them.
Note that IPCOMP also defines its own IPIP tunnel handler and it calls xfrm_input() as well, so we must make its priority lower than xfrmi's, which means having xfrmi loaded would still break IPCOMP. We may seek another way to fix it in xfrm_input() in the future.
Reported-by: Nicolas Dichtel nicolas.dichtel@6wind.com Tested-by: Nicolas Dichtel nicolas.dichtel@6wind.com Fixes: da9bbf0598c9 ("xfrm: interface: support IPIP and IPIP6 tunnels processing with .cb_handler") FIxes: d7b360c2869f ("xfrm: interface: support IP6IP6 and IP6IP tunnels processing with .cb_handler") Signed-off-by: Xin Long lucien.xin@gmail.com Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/xfrm4_tunnel.c | 4 ++-- net/ipv6/xfrm6_tunnel.c | 4 ++-- net/xfrm/xfrm_interface.c | 8 ++++---- 3 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/net/ipv4/xfrm4_tunnel.c b/net/ipv4/xfrm4_tunnel.c index dc19aff7c2e00..fb0648e7fb32f 100644 --- a/net/ipv4/xfrm4_tunnel.c +++ b/net/ipv4/xfrm4_tunnel.c @@ -64,14 +64,14 @@ static int xfrm_tunnel_err(struct sk_buff *skb, u32 info) static struct xfrm_tunnel xfrm_tunnel_handler __read_mostly = { .handler = xfrm_tunnel_rcv, .err_handler = xfrm_tunnel_err, - .priority = 3, + .priority = 4, };
#if IS_ENABLED(CONFIG_IPV6) static struct xfrm_tunnel xfrm64_tunnel_handler __read_mostly = { .handler = xfrm_tunnel_rcv, .err_handler = xfrm_tunnel_err, - .priority = 2, + .priority = 3, }; #endif
diff --git a/net/ipv6/xfrm6_tunnel.c b/net/ipv6/xfrm6_tunnel.c index 25b7ebda2fabf..f696d46e69100 100644 --- a/net/ipv6/xfrm6_tunnel.c +++ b/net/ipv6/xfrm6_tunnel.c @@ -303,13 +303,13 @@ static const struct xfrm_type xfrm6_tunnel_type = { static struct xfrm6_tunnel xfrm6_tunnel_handler __read_mostly = { .handler = xfrm6_tunnel_rcv, .err_handler = xfrm6_tunnel_err, - .priority = 2, + .priority = 3, };
static struct xfrm6_tunnel xfrm46_tunnel_handler __read_mostly = { .handler = xfrm6_tunnel_rcv, .err_handler = xfrm6_tunnel_err, - .priority = 2, + .priority = 3, };
static int __net_init xfrm6_tunnel_net_init(struct net *net) diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c index a8f66112c52b4..0bb7963b9f6bc 100644 --- a/net/xfrm/xfrm_interface.c +++ b/net/xfrm/xfrm_interface.c @@ -830,14 +830,14 @@ static struct xfrm6_tunnel xfrmi_ipv6_handler __read_mostly = { .handler = xfrmi6_rcv_tunnel, .cb_handler = xfrmi_rcv_cb, .err_handler = xfrmi6_err, - .priority = -1, + .priority = 2, };
static struct xfrm6_tunnel xfrmi_ip6ip_handler __read_mostly = { .handler = xfrmi6_rcv_tunnel, .cb_handler = xfrmi_rcv_cb, .err_handler = xfrmi6_err, - .priority = -1, + .priority = 2, }; #endif
@@ -875,14 +875,14 @@ static struct xfrm_tunnel xfrmi_ipip_handler __read_mostly = { .handler = xfrmi4_rcv_tunnel, .cb_handler = xfrmi_rcv_cb, .err_handler = xfrmi4_err, - .priority = -1, + .priority = 3, };
static struct xfrm_tunnel xfrmi_ipip6_handler __read_mostly = { .handler = xfrmi4_rcv_tunnel, .cb_handler = xfrmi_rcv_cb, .err_handler = xfrmi4_err, - .priority = -1, + .priority = 2, }; #endif
From: Tomasz Figa tfiga@chromium.org
[ Upstream commit 9fe9efd6924c9a62ebb759025bb8927e398f51f7 ]
This is a copy of commit 5c5f1baee85a ("ASoC: Intel: kbl_rt5663_rt5514_max98927: Fix kabylake_ssp_fixup function") applied to the kbl_rt5663_max98927 board file.
Original explanation of the change:
kabylake_ssp_fixup function uses snd_soc_dpcm to identify the codecs DAIs. The HW parameters are changed based on the codec DAI of the stream. The earlier approach to get snd_soc_dpcm was using container_of() macro on snd_pcm_hw_params.
The structures have been modified over time and snd_soc_dpcm does not have snd_pcm_hw_params as a reference but as a copy. This causes the current driver to crash when used.
This patch changes the way snd_soc_dpcm is extracted. snd_soc_pcm_runtime holds 2 dpcm instances (one for playback and one for capture). 2 codecs on the SSP are dmic (capture) and speakers (playback). Based on the stream direction, snd_soc_dpcm is extracted from snd_soc_pcm_runtime.
Fixes a boot crash on a HP Chromebook x2:
[ 16.582225] BUG: kernel NULL pointer dereference, address: 0000000000000050 [ 16.582231] #PF: supervisor read access in kernel mode [ 16.582233] #PF: error_code(0x0000) - not-present page [ 16.582234] PGD 0 P4D 0 [ 16.582238] Oops: 0000 [#1] PREEMPT SMP PTI [ 16.582241] CPU: 0 PID: 1980 Comm: cras Tainted: G C 5.4.58 #1 [ 16.582243] Hardware name: HP Soraka/Soraka, BIOS Google_Soraka.10431.75.0 08/30/2018 [ 16.582247] RIP: 0010:kabylake_ssp_fixup+0x19/0xbb [snd_soc_kbl_rt5663_max98927] [ 16.582250] Code: c6 6f c5 80 c0 44 89 f2 31 c0 e8 3e c9 4c d6 eb de 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 53 48 89 f3 48 8b 46 c8 48 8b 4e d0 <48> 8b 49 10 4c 8b 78 10 4c 8b 31 4c 89 f7 48 c7 c6 4b c2 80 c0 e8 [ 16.582252] RSP: 0000:ffffaf7e81e0b958 EFLAGS: 00010282 [ 16.582254] RAX: ffffffff96f13e0d RBX: ffffaf7e81e0ba00 RCX: 0000000000000040 [ 16.582256] RDX: ffffaf7e81e0ba00 RSI: ffffaf7e81e0ba00 RDI: ffffa3b208558028 [ 16.582258] RBP: ffffaf7e81e0b970 R08: ffffa3b203b54160 R09: ffffaf7e81e0ba00 [ 16.582259] R10: 0000000000000000 R11: ffffffffc080b345 R12: ffffa3b209fb6e00 [ 16.582261] R13: ffffa3b1b1a47838 R14: ffffa3b1e6197f28 R15: ffffaf7e81e0ba00 [ 16.582263] FS: 00007eb3f25aaf80(0000) GS:ffffa3b236a00000(0000) knlGS:0000000000000000 [ 16.582265] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 16.582267] CR2: 0000000000000050 CR3: 0000000246bc8006 CR4: 00000000003606f0 [ 16.582269] Call Trace: [ 16.582275] snd_soc_link_be_hw_params_fixup+0x21/0x68 [ 16.582278] snd_soc_dai_hw_params+0x25/0x94 [ 16.582282] soc_pcm_hw_params+0x2d8/0x583 [ 16.582288] dpcm_be_dai_hw_params+0x172/0x29e [ 16.582291] dpcm_fe_dai_hw_params+0x9f/0x12f [ 16.582295] snd_pcm_hw_params+0x137/0x41c [ 16.582298] snd_pcm_hw_params_user+0x3c/0x71 [ 16.582301] snd_pcm_common_ioctl+0x2c6/0x565 [ 16.582304] snd_pcm_ioctl+0x32/0x36 [ 16.582307] do_vfs_ioctl+0x506/0x783 [ 16.582311] ksys_ioctl+0x58/0x83 [ 16.582313] __x64_sys_ioctl+0x1a/0x1e [ 16.582316] do_syscall_64+0x54/0x7e [ 16.582319] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 16.582322] RIP: 0033:0x7eb3f1886157 [ 16.582324] Code: 8a 66 90 48 8b 05 11 dd 2b 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e1 dc 2b 00 f7 d8 64 89 01 48 [ 16.582326] RSP: 002b:00007ffff7559818 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 16.582329] RAX: ffffffffffffffda RBX: 00005acc9188b140 RCX: 00007eb3f1886157 [ 16.582330] RDX: 00007ffff7559940 RSI: 00000000c2604111 RDI: 000000000000001e [ 16.582332] RBP: 00007ffff7559840 R08: 0000000000000004 R09: 0000000000000000 [ 16.582333] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000bb80 [ 16.582335] R13: 00005acc91702e80 R14: 00007ffff7559940 R15: 00005acc91702e80 [ 16.582337] Modules linked in: rfcomm cmac algif_hash algif_skcipher af_alg uinput hid_google_hammer snd_soc_kbl_rt5663_max98927 snd_soc_hdac_hdmi snd_soc_dmic snd_soc_skl_ssp_clk snd_soc_skl snd_soc_sst_ipc snd_soc_sst_dsp snd_soc_hdac_hda snd_soc_acpi_intel_match snd_soc_acpi snd_hda_ext_core snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core ipu3_cio2 ipu3_imgu(C) videobuf2_v4l2 videobuf2_common videobuf2_dma_sg videobuf2_memops snd_soc_rt5663 snd_soc_max98927 snd_soc_rl6231 ov5670 ov13858 acpi_als v4l2_fwnode dw9714 fuse xt_MASQUERADE iio_trig_sysfs cros_ec_light_prox cros_ec_sensors cros_ec_sensors_core cros_ec_sensors_ring industrialio_triggered_buffer kfifo_buf industrialio cros_ec_sensorhub cdc_ether usbnet btusb btrtl btintel btbcm bluetooth ecdh_generic ecc lzo_rle lzo_compress iwlmvm zram iwl7000_mac80211 r8152 mii iwlwifi cfg80211 joydev [ 16.584243] gsmi: Log Shutdown Reason 0x03 [ 16.584246] CR2: 0000000000000050 [ 16.584248] ---[ end trace c8511d090c11edff ]---
Suggested-by: Łukasz Majczak lmajczak@google.com Fixes: 2e5894d73789e ("ASoC: pcm: Add support for DAI multicodec") Signed-off-by: Tomasz Figa tfiga@chromium.org Acked-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Link: https://lore.kernel.org/r/20201014141624.4143453-1-tfiga@chromium.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/intel/boards/kbl_rt5663_max98927.c | 39 ++++++++++++++++---- 1 file changed, 31 insertions(+), 8 deletions(-)
diff --git a/sound/soc/intel/boards/kbl_rt5663_max98927.c b/sound/soc/intel/boards/kbl_rt5663_max98927.c index 3ea4602dfb3ee..9a4b3d0973f65 100644 --- a/sound/soc/intel/boards/kbl_rt5663_max98927.c +++ b/sound/soc/intel/boards/kbl_rt5663_max98927.c @@ -401,17 +401,40 @@ static int kabylake_ssp_fixup(struct snd_soc_pcm_runtime *rtd, struct snd_interval *chan = hw_param_interval(params, SNDRV_PCM_HW_PARAM_CHANNELS); struct snd_mask *fmt = hw_param_mask(params, SNDRV_PCM_HW_PARAM_FORMAT); - struct snd_soc_dpcm *dpcm = container_of( - params, struct snd_soc_dpcm, hw_params); - struct snd_soc_dai_link *fe_dai_link = dpcm->fe->dai_link; - struct snd_soc_dai_link *be_dai_link = dpcm->be->dai_link; + struct snd_soc_dpcm *dpcm, *rtd_dpcm = NULL; + + /* + * The following loop will be called only for playback stream + * In this platform, there is only one playback device on every SSP + */ + for_each_dpcm_fe(rtd, SNDRV_PCM_STREAM_PLAYBACK, dpcm) { + rtd_dpcm = dpcm; + break; + } + + /* + * This following loop will be called only for capture stream + * In this platform, there is only one capture device on every SSP + */ + for_each_dpcm_fe(rtd, SNDRV_PCM_STREAM_CAPTURE, dpcm) { + rtd_dpcm = dpcm; + break; + } + + if (!rtd_dpcm) + return -EINVAL; + + /* + * The above 2 loops are mutually exclusive based on the stream direction, + * thus rtd_dpcm variable will never be overwritten + */
/* * The ADSP will convert the FE rate to 48k, stereo, 24 bit */ - if (!strcmp(fe_dai_link->name, "Kbl Audio Port") || - !strcmp(fe_dai_link->name, "Kbl Audio Headset Playback") || - !strcmp(fe_dai_link->name, "Kbl Audio Capture Port")) { + if (!strcmp(rtd_dpcm->fe->dai_link->name, "Kbl Audio Port") || + !strcmp(rtd_dpcm->fe->dai_link->name, "Kbl Audio Headset Playback") || + !strcmp(rtd_dpcm->fe->dai_link->name, "Kbl Audio Capture Port")) { rate->min = rate->max = 48000; chan->min = chan->max = 2; snd_mask_none(fmt); @@ -421,7 +444,7 @@ static int kabylake_ssp_fixup(struct snd_soc_pcm_runtime *rtd, * The speaker on the SSP0 supports S16_LE and not S24_LE. * thus changing the mask here */ - if (!strcmp(be_dai_link->name, "SSP0-Codec")) + if (!strcmp(rtd_dpcm->be->dai_link->name, "SSP0-Codec")) snd_mask_set_format(fmt, SNDRV_PCM_FORMAT_S16_LE);
return 0;
From: Marc Zyngier maz@kernel.org
[ Upstream commit 151a535171be6ff824a0a3875553ea38570f4c05 ]
kernel/irq/ipi.c otherwise fails to compile if nothing else selects it.
Fixes: 379b656446a3 ("genirq: Add GENERIC_IRQ_IPI Kconfig symbol") Reported-by: Pavel Machek pavel@ucw.cz Tested-by: Pavel Machek pavel@ucw.cz Signed-off-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20201015101222.GA32747@amd Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/irq/Kconfig | 1 + 1 file changed, 1 insertion(+)
diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig index 10a5aff4eecc8..164a031cfdb66 100644 --- a/kernel/irq/Kconfig +++ b/kernel/irq/Kconfig @@ -82,6 +82,7 @@ config IRQ_FASTEOI_HIERARCHY_HANDLERS # Generic IRQ IPI support config GENERIC_IRQ_IPI bool + select IRQ_DOMAIN_HIERARCHY
# Generic MSI interrupt support config GENERIC_MSI_IRQ
From: Olaf Hering olaf@aepfle.de
[ Upstream commit 2c3bd2a5c86fe744e8377733c5e511a5ca1e14f5 ]
It is not an error if the host requests to balloon down, but the VM refuses to do so. Without this change a warning is logged in dmesg every five minutes.
Fixes: b3bb97b8a49f3 ("Drivers: hv: balloon: Add logging for dynamic memory operations")
Signed-off-by: Olaf Hering olaf@aepfle.de Reviewed-by: Michael Kelley mikelley@microsoft.com Link: https://lore.kernel.org/r/20201008071216.16554-1-olaf@aepfle.de Signed-off-by: Wei Liu wei.liu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hv/hv_balloon.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c index 32e3bc0aa665a..0f50295d02149 100644 --- a/drivers/hv/hv_balloon.c +++ b/drivers/hv/hv_balloon.c @@ -1275,7 +1275,7 @@ static void balloon_up(struct work_struct *dummy)
/* Refuse to balloon below the floor. */ if (avail_pages < num_pages || avail_pages - num_pages < floor) { - pr_warn("Balloon request will be partially fulfilled. %s\n", + pr_info("Balloon request will be partially fulfilled. %s\n", avail_pages < num_pages ? "Not enough memory." : "Balloon floor reached.");
From: zhuoliang zhang zhuoliang.zhang@mediatek.com
[ Upstream commit a779d91314ca7208b7feb3ad817b62904397c56d ]
we found that the following race condition exists in xfrm_alloc_userspi flow:
user thread state_hash_work thread ---- ---- xfrm_alloc_userspi() __find_acq_core() /*alloc new xfrm_state:x*/ xfrm_state_alloc() /*schedule state_hash_work thread*/ xfrm_hash_grow_check() xfrm_hash_resize() xfrm_alloc_spi /*hold lock*/ x->id.spi = htonl(spi) spin_lock_bh(&net->xfrm.xfrm_state_lock) /*waiting lock release*/ xfrm_hash_transfer() spin_lock_bh(&net->xfrm.xfrm_state_lock) /*add x into hlist:net->xfrm.state_byspi*/ hlist_add_head_rcu(&x->byspi) spin_unlock_bh(&net->xfrm.xfrm_state_lock)
/*add x into hlist:net->xfrm.state_byspi 2 times*/ hlist_add_head_rcu(&x->byspi)
1. a new state x is alloced in xfrm_state_alloc() and added into the bydst hlist in __find_acq_core() on the LHS; 2. on the RHS, state_hash_work thread travels the old bydst and tranfers every xfrm_state (include x) into the new bydst hlist and new byspi hlist; 3. user thread on the LHS gets the lock and adds x into the new byspi hlist again.
So the same xfrm_state (x) is added into the same list_hash (net->xfrm.state_byspi) 2 times that makes the list_hash become an inifite loop.
To fix the race, x->id.spi = htonl(spi) in the xfrm_alloc_spi() is moved to the back of spin_lock_bh, sothat state_hash_work thread no longer add x which id.spi is zero into the hash_list.
Fixes: f034b5d4efdf ("[XFRM]: Dynamic xfrm_state hash table sizing.") Signed-off-by: zhuoliang zhang zhuoliang.zhang@mediatek.com Acked-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/xfrm/xfrm_state.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index efc89a92961df..ee6ac32bb06d7 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -2004,6 +2004,7 @@ int xfrm_alloc_spi(struct xfrm_state *x, u32 low, u32 high) int err = -ENOENT; __be32 minspi = htonl(low); __be32 maxspi = htonl(high); + __be32 newspi = 0; u32 mark = x->mark.v & x->mark.m;
spin_lock_bh(&x->lock); @@ -2022,21 +2023,22 @@ int xfrm_alloc_spi(struct xfrm_state *x, u32 low, u32 high) xfrm_state_put(x0); goto unlock; } - x->id.spi = minspi; + newspi = minspi; } else { u32 spi = 0; for (h = 0; h < high-low+1; h++) { spi = low + prandom_u32()%(high-low+1); x0 = xfrm_state_lookup(net, mark, &x->id.daddr, htonl(spi), x->id.proto, x->props.family); if (x0 == NULL) { - x->id.spi = htonl(spi); + newspi = htonl(spi); break; } xfrm_state_put(x0); } } - if (x->id.spi) { + if (newspi) { spin_lock_bh(&net->xfrm.xfrm_state_lock); + x->id.spi = newspi; h = xfrm_spi_hash(net, &x->id.daddr, x->id.spi, x->id.proto, x->props.family); hlist_add_head_rcu(&x->byspi, net->xfrm.state_byspi + h); spin_unlock_bh(&net->xfrm.xfrm_state_lock);
From: Srinivas Kandagatla srinivas.kandagatla@linaro.org
[ Upstream commit f47d0742515748162d3fc35f04331c5b81c0ed47 ]
Add missing supported rates and formats for the stream, without which attempt to do playback will fail to find any matching rates/format.
Fixes: a0aab9e1404a ("ASoC: codecs: add wsa881x amplifier support") Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20201022130518.31723-1-srinivas.kandagatla@linaro.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/wsa881x.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/sound/soc/codecs/wsa881x.c b/sound/soc/codecs/wsa881x.c index d39d479e23786..5456124457a7c 100644 --- a/sound/soc/codecs/wsa881x.c +++ b/sound/soc/codecs/wsa881x.c @@ -1026,6 +1026,8 @@ static struct snd_soc_dai_driver wsa881x_dais[] = { .id = 0, .playback = { .stream_name = "SPKR Playback", + .rates = SNDRV_PCM_RATE_48000, + .formats = SNDRV_PCM_FMTBIT_S16_LE, .rate_max = 48000, .rate_min = 48000, .channels_min = 1,
From: Sascha Hauer s.hauer@pengutronix.de
[ Upstream commit 43b6bf406cd0319e522638f97c9086b7beebaeaa ]
525c9e5a32bd introduced pm_runtime support for the i.MX SPI driver. With this pm_runtime is used to bring up the clocks initially. When CONFIG_PM is disabled the clocks are no longer enabled and the driver doesn't work anymore. Fix this by enabling the clocks in the probe function and telling pm_runtime that the device is active using pm_runtime_set_active().
Fixes: 525c9e5a32bd spi: imx: enable runtime pm support Tested-by: Christian Eggers ceggers@arri.de [tested for !CONFIG_PM only] Signed-off-by: Sascha Hauer s.hauer@pengutronix.de Link: https://lore.kernel.org/r/20201021104513.21560-1-s.hauer@pengutronix.de Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/spi-imx.c | 23 +++++++++++++++-------- 1 file changed, 15 insertions(+), 8 deletions(-)
diff --git a/drivers/spi/spi-imx.c b/drivers/spi/spi-imx.c index e38e5ad3c7068..9aac515b718c8 100644 --- a/drivers/spi/spi-imx.c +++ b/drivers/spi/spi-imx.c @@ -1674,15 +1674,18 @@ static int spi_imx_probe(struct platform_device *pdev) goto out_master_put; }
- pm_runtime_enable(spi_imx->dev); + ret = clk_prepare_enable(spi_imx->clk_per); + if (ret) + goto out_master_put; + + ret = clk_prepare_enable(spi_imx->clk_ipg); + if (ret) + goto out_put_per; + pm_runtime_set_autosuspend_delay(spi_imx->dev, MXC_RPM_TIMEOUT); pm_runtime_use_autosuspend(spi_imx->dev); - - ret = pm_runtime_get_sync(spi_imx->dev); - if (ret < 0) { - dev_err(spi_imx->dev, "failed to enable clock\n"); - goto out_runtime_pm_put; - } + pm_runtime_set_active(spi_imx->dev); + pm_runtime_enable(spi_imx->dev);
spi_imx->spi_clk = clk_get_rate(spi_imx->clk_per); /* @@ -1722,8 +1725,12 @@ out_bitbang_start: spi_imx_sdma_exit(spi_imx); out_runtime_pm_put: pm_runtime_dont_use_autosuspend(spi_imx->dev); - pm_runtime_put_sync(spi_imx->dev); + pm_runtime_set_suspended(&pdev->dev); pm_runtime_disable(spi_imx->dev); + + clk_disable_unprepare(spi_imx->clk_ipg); +out_put_per: + clk_disable_unprepare(spi_imx->clk_per); out_master_put: spi_master_put(master);
From: Greentime Hu greentime.hu@sifive.com
[ Upstream commit a7480c5d725c4ecfc627e70960f249c34f5d13e8 ]
An interrupt submitted to an affinity change will always be left enabled after plic_set_affinity() has been called, while the expectation is that it should stay in whatever state it was before the call.
Preserving the configuration fixes a PWM hang issue on the Unleashed board.
[ 919.015783] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: [ 919.020922] rcu: 0-...0: (0 ticks this GP) idle=7d2/1/0x4000000000000002 softirq=1424/1424 fqs=105807 [ 919.030295] (detected by 1, t=225825 jiffies, g=1561, q=3496) [ 919.036109] Task dump for CPU 0: [ 919.039321] kworker/0:1 R running task 0 30 2 0x00000008 [ 919.046359] Workqueue: events set_brightness_delayed [ 919.051302] Call Trace: [ 919.053738] [<ffffffe000930d92>] __schedule+0x194/0x4de [ 982.035783] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: [ 982.040923] rcu: 0-...0: (0 ticks this GP) idle=7d2/1/0x4000000000000002 softirq=1424/1424 fqs=113325 [ 982.050294] (detected by 1, t=241580 jiffies, g=1561, q=3509) [ 982.056108] Task dump for CPU 0: [ 982.059321] kworker/0:1 R running task 0 30 2 0x00000008 [ 982.066359] Workqueue: events set_brightness_delayed [ 982.071302] Call Trace: [ 982.073739] [<ffffffe000930d92>] __schedule+0x194/0x4de [..]
Fixes: bb0fed1c60cc ("irqchip/sifive-plic: Switch to fasteoi flow") Signed-off-by: Greentime Hu greentime.hu@sifive.com [maz: tidy-up commit message] Signed-off-by: Marc Zyngier maz@kernel.org Reviewed-by: Anup Patel anup@brainfault.org Link: https://lore.kernel.org/r/20201020081532.2377-1-greentime.hu@sifive.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/irqchip/irq-sifive-plic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c index eaa3e9fe54e91..4048657ece0ac 100644 --- a/drivers/irqchip/irq-sifive-plic.c +++ b/drivers/irqchip/irq-sifive-plic.c @@ -151,7 +151,7 @@ static int plic_set_affinity(struct irq_data *d, return -EINVAL;
plic_irq_toggle(&priv->lmask, d, 0); - plic_irq_toggle(cpumask_of(cpu), d, 1); + plic_irq_toggle(cpumask_of(cpu), d, !irqd_irq_masked(d));
irq_data_update_effective_affinity(d, cpumask_of(cpu));
From: David Gow davidgow@google.com
[ Upstream commit 3023d8ff3fc60e5d32dc1d05f99ad6ffa12b0033 ]
Due to the raw_output() function on kunit_parser.py actually being a generator, it only runs if something reads the lines it returns. Since we no-longer do that (parsing doesn't actually happen if raw_output is enabled), it was not printing anything.
Fixes: 45ba7a893ad8 ("kunit: kunit_tool: Separate out config/build/exec/parse") Signed-off-by: David Gow davidgow@google.com Reviewed-by: Brendan Higgins brendanhiggins@google.com Tested-by: Brendan Higgins brendanhiggins@google.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/kunit/kunit_parser.py | 1 - 1 file changed, 1 deletion(-)
diff --git a/tools/testing/kunit/kunit_parser.py b/tools/testing/kunit/kunit_parser.py index f13e0c0d66639..62a0848699671 100644 --- a/tools/testing/kunit/kunit_parser.py +++ b/tools/testing/kunit/kunit_parser.py @@ -65,7 +65,6 @@ def isolate_kunit_output(kernel_output): def raw_output(kernel_output): for line in kernel_output: print(line) - yield line
DIVIDER = '=' * 60
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
[ Upstream commit 3fc48259d5250f7a3ee021ad0492b604c428c564 ]
Empty test suite is okay test suite.
Don't fail the rest of the test suites if one of them is empty.
Fixes: 6ebf5866f2e8 ("kunit: tool: add Python wrappers for running KUnit tests") Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Reviewed-by: Brendan Higgins brendanhiggins@google.com Tested-by: Brendan Higgins brendanhiggins@google.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/kunit/kunit_parser.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/kunit/kunit_parser.py b/tools/testing/kunit/kunit_parser.py index 62a0848699671..91036d5d51cf6 100644 --- a/tools/testing/kunit/kunit_parser.py +++ b/tools/testing/kunit/kunit_parser.py @@ -232,7 +232,7 @@ def parse_test_suite(lines: List[str]) -> TestSuite: return None test_suite.name = name expected_test_case_num = parse_subtest_plan(lines) - if not expected_test_case_num: + if expected_test_case_num is None: return None while expected_test_case_num > 0: test_case = parse_test_case(lines)
From: Ran Wang ran.wang_1@nxp.com
[ Upstream commit 48e7bbbbb261b007fe78aa14ae62df01d236497e ]
fsl_ep_fifo_status() should return error if _ep->desc is null.
Fixes: 75eaa498c99e (“usb: gadget: Correct NULL pointer checking in fsl gadget”) Reviewed-by: Peter Chen peter.chen@nxp.com Signed-off-by: Ran Wang ran.wang_1@nxp.com Signed-off-by: Felipe Balbi balbi@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/gadget/udc/fsl_udc_core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/gadget/udc/fsl_udc_core.c b/drivers/usb/gadget/udc/fsl_udc_core.c index a6f7b2594c090..c0cb007b749ff 100644 --- a/drivers/usb/gadget/udc/fsl_udc_core.c +++ b/drivers/usb/gadget/udc/fsl_udc_core.c @@ -1051,7 +1051,7 @@ static int fsl_ep_fifo_status(struct usb_ep *_ep) u32 bitmask; struct ep_queue_head *qh;
- if (!_ep || _ep->desc || !(_ep->desc->bEndpointAddress&0xF)) + if (!_ep || !_ep->desc || !(_ep->desc->bEndpointAddress&0xF)) return -ENODEV;
ep = container_of(_ep, struct fsl_ep, ep);
From: Tommi Rantala tommi.t.rantala@nokia.com
[ Upstream commit f825d3f7ed9305e7dd0a3e0a74673a4257d0cc53 ]
Commit 1056d3d2c97e ("selftests: enforce local header dependency in lib.mk") added header dependency to the rule, but as the rule uses $^, the headers are added to the compiler command line.
This can cause unexpected precompiled header files being generated when compilation fails:
$ echo { >> openat2_test.c
$ make gcc -Wall -O2 -g -fsanitize=address -fsanitize=undefined openat2_test.c tools/testing/selftests/kselftest_harness.h tools/testing/selftests/kselftest.h helpers.c -o tools/testing/selftests/openat2/openat2_test openat2_test.c:313:1: error: expected identifier or ‘(’ before ‘{’ token 313 | { | ^ make: *** [../lib.mk:140: tools/testing/selftests/openat2/openat2_test] Error 1
$ file openat2_test* openat2_test: GCC precompiled header (version 014) for C openat2_test.c: C source, ASCII text
Fix it by filtering out the headers, so that we'll only pass the actual *.c files in the compiler command line.
Fixes: 1056d3d2c97e ("selftests: enforce local header dependency in lib.mk") Signed-off-by: Tommi Rantala tommi.t.rantala@nokia.com Acked-by: Kees Cook keescook@chromium.org Reviewed-by: Christian Brauner christian.brauner@ubuntu.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/lib.mk | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/lib.mk b/tools/testing/selftests/lib.mk index 7a17ea8157367..66f3317dc3654 100644 --- a/tools/testing/selftests/lib.mk +++ b/tools/testing/selftests/lib.mk @@ -137,7 +137,7 @@ endif ifeq ($(OVERRIDE_TARGETS),) LOCAL_HDRS := $(selfdir)/kselftest_harness.h $(selfdir)/kselftest.h $(OUTPUT)/%:%.c $(LOCAL_HDRS) - $(LINK.c) $^ $(LDLIBS) -o $@ + $(LINK.c) $(filter-out $(LOCAL_HDRS),$^) $(LDLIBS) -o $@
$(OUTPUT)/%.o:%.S $(COMPILE.S) $^ -o $@
From: Srinivas Kandagatla srinivas.kandagatla@linaro.org
[ Upstream commit fc0522bbe02fa4beb95c0514ace66b585616f111 ]
digital gain range is -84dB min to 40dB max, however this was not correctly specified in the range.
Fix this by with correct range!
Fixes: 1cde8b822332 ("ASoC: wcd934x: add basic controls") Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20201028154340.17090-1-srinivas.kandagatla@linaro.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/wcd934x.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/soc/codecs/wcd934x.c b/sound/soc/codecs/wcd934x.c index 35697b072367a..40f682f5dab8b 100644 --- a/sound/soc/codecs/wcd934x.c +++ b/sound/soc/codecs/wcd934x.c @@ -551,7 +551,7 @@ struct wcd_iir_filter_ctl { struct soc_bytes_ext bytes_ext; };
-static const DECLARE_TLV_DB_SCALE(digital_gain, 0, 1, 0); +static const DECLARE_TLV_DB_SCALE(digital_gain, -8400, 100, -8400); static const DECLARE_TLV_DB_SCALE(line_gain, 0, 7, 1); static const DECLARE_TLV_DB_SCALE(analog_gain, 0, 25, 1); static const DECLARE_TLV_DB_SCALE(ear_pa_gain, 0, 150, 0);
From: Srinivas Kandagatla srinivas.kandagatla@linaro.org
[ Upstream commit 6d6bc54ab4f2404d46078abc04bf4dee4db01def ]
digital gain range is -84dB min to 40dB max, however this was not correctly specified in the range.
Fix this by with correct range!
Fixes: 8c4f021d806a ("ASoC: wcd9335: add basic controls") Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20201028154340.17090-2-srinivas.kandagatla@linaro.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/wcd9335.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/soc/codecs/wcd9335.c b/sound/soc/codecs/wcd9335.c index f2d9d52ee171b..4d2b1ec7c03bb 100644 --- a/sound/soc/codecs/wcd9335.c +++ b/sound/soc/codecs/wcd9335.c @@ -618,7 +618,7 @@ static const char * const sb_tx8_mux_text[] = { "ZERO", "RX_MIX_TX8", "DEC8", "DEC8_192" };
-static const DECLARE_TLV_DB_SCALE(digital_gain, 0, 1, 0); +static const DECLARE_TLV_DB_SCALE(digital_gain, -8400, 100, -8400); static const DECLARE_TLV_DB_SCALE(line_gain, 0, 7, 1); static const DECLARE_TLV_DB_SCALE(analog_gain, 0, 25, 1); static const DECLARE_TLV_DB_SCALE(ear_pa_gain, 0, 150, 0);
From: Bert Vermeulen bert@biot.com
[ Upstream commit 324f78dfb442b82365548b657ec4e6974c677502 ]
If a flash chip has more than 16MB capacity but its BFPT reports BFPT_DWORD1_ADDRESS_BYTES_3_OR_4, the spi-nor framework defaults to 3.
The check in spi_nor_set_addr_width() doesn't catch it because addr_width did get set. This fixes that check.
Fixes: f9acd7fa80be ("mtd: spi-nor: sfdp: default to addr_width of 3 for configurable widths") Signed-off-by: Bert Vermeulen bert@biot.com Signed-off-by: Vignesh Raghavendra vigneshr@ti.com Reviewed-by: Tudor Ambarus tudor.ambarus@microchip.com Reviewed-by: Pratyush Yadav p.yadav@ti.com Reviewed-by: Joel Stanley joel@jms.id.au Reviewed-by: Cédric Le Goater clg@kaod.org Tested-by: Joel Stanley joel@jms.id.au Tested-by: Cédric Le Goater clg@kaod.org Link: https://lore.kernel.org/r/20201006132346.12652-1-bert@biot.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mtd/spi-nor/core.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c index b37d6c1936de1..f0ae7a01703a1 100644 --- a/drivers/mtd/spi-nor/core.c +++ b/drivers/mtd/spi-nor/core.c @@ -3008,13 +3008,15 @@ static int spi_nor_set_addr_width(struct spi_nor *nor) /* already configured from SFDP */ } else if (nor->info->addr_width) { nor->addr_width = nor->info->addr_width; - } else if (nor->mtd.size > 0x1000000) { - /* enable 4-byte addressing if the device exceeds 16MiB */ - nor->addr_width = 4; } else { nor->addr_width = 3; }
+ if (nor->addr_width == 3 && nor->mtd.size > 0x1000000) { + /* enable 4-byte addressing if the device exceeds 16MiB */ + nor->addr_width = 4; + } + if (nor->addr_width > SPI_NOR_MAX_ADDR_WIDTH) { dev_dbg(nor->dev, "address width is too large: %u\n", nor->addr_width);
From: Darrick J. Wong darrick.wong@oracle.com
[ Upstream commit 2c334e12f957cd8c6bb66b4aa3f79848b7c33cab ]
Make sure that we actually initialize xefi_discard when we're scheduling a deferred free of an AGFL block. This was (eventually) found by the UBSAN while I was banging on realtime rmap problems, but it exists in the upstream codebase. While we're at it, rearrange the structure to reduce the struct size from 64 to 56 bytes.
Fixes: fcb762f5de2e ("xfs: add bmapi nodiscard flag") Signed-off-by: Darrick J. Wong darrick.wong@oracle.com Reviewed-by: Brian Foster bfoster@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/libxfs/xfs_alloc.c | 1 + fs/xfs/libxfs/xfs_bmap.h | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index 852b536551b53..15640015be9d2 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -2467,6 +2467,7 @@ xfs_defer_agfl_block( new->xefi_startblock = XFS_AGB_TO_FSB(mp, agno, agbno); new->xefi_blockcount = 1; new->xefi_oinfo = *oinfo; + new->xefi_skip_discard = false;
trace_xfs_agfl_free_defer(mp, agno, 0, agbno, 1);
diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h index e1bd484e55485..6747e97a79490 100644 --- a/fs/xfs/libxfs/xfs_bmap.h +++ b/fs/xfs/libxfs/xfs_bmap.h @@ -52,9 +52,9 @@ struct xfs_extent_free_item { xfs_fsblock_t xefi_startblock;/* starting fs block number */ xfs_extlen_t xefi_blockcount;/* number of blocks in extent */ + bool xefi_skip_discard; struct list_head xefi_list; struct xfs_owner_info xefi_oinfo; /* extent owner */ - bool xefi_skip_discard; };
#define XFS_BMAP_MAX_NMAP 4
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit c2f46814521113f6699a74e0a0424cbc5b305479 ]
After the previous similar bugfix there was another bug here, if no VHT elements were found we also disabled HE. Fix this to disable HE only on the 5 GHz band; on 6 GHz it was already not disabled, and on 2.4 GHz there need (should) not be any VHT.
Fixes: 57fa5e85d53c ("mac80211: determine chandef from HE 6 GHz operation") Link: https://lore.kernel.org/r/20201013140156.535a2fc6192f.Id6e5e525a60ac18d245d8... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/mlme.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c index 2e400b0ff6961..0f30f50c46b1b 100644 --- a/net/mac80211/mlme.c +++ b/net/mac80211/mlme.c @@ -5359,6 +5359,7 @@ int ieee80211_mgd_assoc(struct ieee80211_sub_if_data *sdata, struct cfg80211_assoc_request *req) { bool is_6ghz = req->bss->channel->band == NL80211_BAND_6GHZ; + bool is_5ghz = req->bss->channel->band == NL80211_BAND_5GHZ; struct ieee80211_local *local = sdata->local; struct ieee80211_if_managed *ifmgd = &sdata->u.mgd; struct ieee80211_bss *bss = (void *)req->bss->priv; @@ -5507,7 +5508,7 @@ int ieee80211_mgd_assoc(struct ieee80211_sub_if_data *sdata, if (vht_ie && vht_ie[1] >= sizeof(struct ieee80211_vht_cap)) memcpy(&assoc_data->ap_vht_cap, vht_ie + 2, sizeof(struct ieee80211_vht_cap)); - else if (!is_6ghz) + else if (is_5ghz) ifmgd->flags |= IEEE80211_STA_DISABLE_VHT | IEEE80211_STA_DISABLE_HE; rcu_read_unlock();
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit dceababac29d1c53cbc1f7ddf6f688d2df01da87 ]
The netlink report should be sent regardless the available listeners.
Fixes: 84d7fce69388 ("netfilter: nf_tables: export rule-set generation ID") Fixes: 3b49e2e94e6e ("netfilter: nf_tables: add flow table netlink frontend") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 72f3ee47e478f..1c90bd1fce60c 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -7076,7 +7076,7 @@ static void nf_tables_flowtable_notify(struct nft_ctx *ctx, GFP_KERNEL); kfree(buf);
- if (ctx->report && + if (!ctx->report && !nfnetlink_has_listeners(ctx->net, NFNLGRP_NFTABLES)) return;
@@ -7198,7 +7198,7 @@ static void nf_tables_gen_notify(struct net *net, struct sk_buff *skb, audit_log_nfcfg("?:0;?:0", 0, net->nft.base_seq, AUDIT_NFT_OP_GEN_REGISTER, GFP_KERNEL);
- if (nlmsg_report(nlh) && + if (!nlmsg_report(nlh) && !nfnetlink_has_listeners(net, NFNLGRP_NFTABLES)) return;
From: Jason A. Donenfeld Jason@zx2c4.com
[ Upstream commit 46d6c5ae953cc0be38efd0e469284df7c4328cf8 ]
If netfilter changes the packet mark when mangling, the packet is rerouted using the route_me_harder set of functions. Prior to this commit, there's one big difference between route_me_harder and the ordinary initial routing functions, described in the comment above __ip_queue_xmit():
/* Note: skb->sk can be different from sk, in case of tunnels */ int __ip_queue_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl,
That function goes on to correctly make use of sk->sk_bound_dev_if, rather than skb->sk->sk_bound_dev_if. And indeed the comment is true: a tunnel will receive a packet in ndo_start_xmit with an initial skb->sk. It will make some transformations to that packet, and then it will send the encapsulated packet out of a *new* socket. That new socket will basically always have a different sk_bound_dev_if (otherwise there'd be a routing loop). So for the purposes of routing the encapsulated packet, the routing information as it pertains to the socket should come from that socket's sk, rather than the packet's original skb->sk. For that reason __ip_queue_xmit() and related functions all do the right thing.
One might argue that all tunnels should just call skb_orphan(skb) before transmitting the encapsulated packet into the new socket. But tunnels do *not* do this -- and this is wisely avoided in skb_scrub_packet() too -- because features like TSQ rely on skb->destructor() being called when that buffer space is truely available again. Calling skb_orphan(skb) too early would result in buffers filling up unnecessarily and accounting info being all wrong. Instead, additional routing must take into account the new sk, just as __ip_queue_xmit() notes.
So, this commit addresses the problem by fishing the correct sk out of state->sk -- it's already set properly in the call to nf_hook() in __ip_local_out(), which receives the sk as part of its normal functionality. So we make sure to plumb state->sk through the various route_me_harder functions, and then make correct use of it following the example of __ip_queue_xmit().
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Jason A. Donenfeld Jason@zx2c4.com Reviewed-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/netfilter_ipv4.h | 2 +- include/linux/netfilter_ipv6.h | 10 +++++----- net/ipv4/netfilter.c | 8 +++++--- net/ipv4/netfilter/iptable_mangle.c | 2 +- net/ipv4/netfilter/nf_reject_ipv4.c | 2 +- net/ipv6/netfilter.c | 6 +++--- net/ipv6/netfilter/ip6table_mangle.c | 2 +- net/netfilter/ipvs/ip_vs_core.c | 4 ++-- net/netfilter/nf_nat_proto.c | 4 ++-- net/netfilter/nf_synproxy_core.c | 2 +- net/netfilter/nft_chain_route.c | 4 ++-- net/netfilter/utils.c | 4 ++-- 12 files changed, 26 insertions(+), 24 deletions(-)
diff --git a/include/linux/netfilter_ipv4.h b/include/linux/netfilter_ipv4.h index 082e2c41b7ff9..5b70ca868bb19 100644 --- a/include/linux/netfilter_ipv4.h +++ b/include/linux/netfilter_ipv4.h @@ -16,7 +16,7 @@ struct ip_rt_info { u_int32_t mark; };
-int ip_route_me_harder(struct net *net, struct sk_buff *skb, unsigned addr_type); +int ip_route_me_harder(struct net *net, struct sock *sk, struct sk_buff *skb, unsigned addr_type);
struct nf_queue_entry;
diff --git a/include/linux/netfilter_ipv6.h b/include/linux/netfilter_ipv6.h index 9b67394471e1c..48314ade1506f 100644 --- a/include/linux/netfilter_ipv6.h +++ b/include/linux/netfilter_ipv6.h @@ -42,7 +42,7 @@ struct nf_ipv6_ops { #if IS_MODULE(CONFIG_IPV6) int (*chk_addr)(struct net *net, const struct in6_addr *addr, const struct net_device *dev, int strict); - int (*route_me_harder)(struct net *net, struct sk_buff *skb); + int (*route_me_harder)(struct net *net, struct sock *sk, struct sk_buff *skb); int (*dev_get_saddr)(struct net *net, const struct net_device *dev, const struct in6_addr *daddr, unsigned int srcprefs, struct in6_addr *saddr); @@ -143,9 +143,9 @@ static inline int nf_br_ip6_fragment(struct net *net, struct sock *sk, #endif }
-int ip6_route_me_harder(struct net *net, struct sk_buff *skb); +int ip6_route_me_harder(struct net *net, struct sock *sk, struct sk_buff *skb);
-static inline int nf_ip6_route_me_harder(struct net *net, struct sk_buff *skb) +static inline int nf_ip6_route_me_harder(struct net *net, struct sock *sk, struct sk_buff *skb) { #if IS_MODULE(CONFIG_IPV6) const struct nf_ipv6_ops *v6_ops = nf_get_ipv6_ops(); @@ -153,9 +153,9 @@ static inline int nf_ip6_route_me_harder(struct net *net, struct sk_buff *skb) if (!v6_ops) return -EHOSTUNREACH;
- return v6_ops->route_me_harder(net, skb); + return v6_ops->route_me_harder(net, sk, skb); #elif IS_BUILTIN(CONFIG_IPV6) - return ip6_route_me_harder(net, skb); + return ip6_route_me_harder(net, sk, skb); #else return -EHOSTUNREACH; #endif diff --git a/net/ipv4/netfilter.c b/net/ipv4/netfilter.c index a058213b77a78..7c841037c5334 100644 --- a/net/ipv4/netfilter.c +++ b/net/ipv4/netfilter.c @@ -17,17 +17,19 @@ #include <net/netfilter/nf_queue.h>
/* route_me_harder function, used by iptable_nat, iptable_mangle + ip_queue */ -int ip_route_me_harder(struct net *net, struct sk_buff *skb, unsigned int addr_type) +int ip_route_me_harder(struct net *net, struct sock *sk, struct sk_buff *skb, unsigned int addr_type) { const struct iphdr *iph = ip_hdr(skb); struct rtable *rt; struct flowi4 fl4 = {}; __be32 saddr = iph->saddr; - const struct sock *sk = skb_to_full_sk(skb); - __u8 flags = sk ? inet_sk_flowi_flags(sk) : 0; + __u8 flags; struct net_device *dev = skb_dst(skb)->dev; unsigned int hh_len;
+ sk = sk_to_full_sk(sk); + flags = sk ? inet_sk_flowi_flags(sk) : 0; + if (addr_type == RTN_UNSPEC) addr_type = inet_addr_type_dev_table(net, dev, saddr); if (addr_type == RTN_LOCAL || addr_type == RTN_UNICAST) diff --git a/net/ipv4/netfilter/iptable_mangle.c b/net/ipv4/netfilter/iptable_mangle.c index f703a717ab1d2..8330795892730 100644 --- a/net/ipv4/netfilter/iptable_mangle.c +++ b/net/ipv4/netfilter/iptable_mangle.c @@ -62,7 +62,7 @@ ipt_mangle_out(struct sk_buff *skb, const struct nf_hook_state *state) iph->daddr != daddr || skb->mark != mark || iph->tos != tos) { - err = ip_route_me_harder(state->net, skb, RTN_UNSPEC); + err = ip_route_me_harder(state->net, state->sk, skb, RTN_UNSPEC); if (err < 0) ret = NF_DROP_ERR(err); } diff --git a/net/ipv4/netfilter/nf_reject_ipv4.c b/net/ipv4/netfilter/nf_reject_ipv4.c index 9dcfa4e461b65..93b07739807b2 100644 --- a/net/ipv4/netfilter/nf_reject_ipv4.c +++ b/net/ipv4/netfilter/nf_reject_ipv4.c @@ -145,7 +145,7 @@ void nf_send_reset(struct net *net, struct sk_buff *oldskb, int hook) ip4_dst_hoplimit(skb_dst(nskb))); nf_reject_ip_tcphdr_put(nskb, oldskb, oth);
- if (ip_route_me_harder(net, nskb, RTN_UNSPEC)) + if (ip_route_me_harder(net, nskb->sk, nskb, RTN_UNSPEC)) goto free_nskb;
niph = ip_hdr(nskb); diff --git a/net/ipv6/netfilter.c b/net/ipv6/netfilter.c index 6d0e942d082d4..ab9a279dd6d47 100644 --- a/net/ipv6/netfilter.c +++ b/net/ipv6/netfilter.c @@ -20,10 +20,10 @@ #include <net/netfilter/ipv6/nf_defrag_ipv6.h> #include "../bridge/br_private.h"
-int ip6_route_me_harder(struct net *net, struct sk_buff *skb) +int ip6_route_me_harder(struct net *net, struct sock *sk_partial, struct sk_buff *skb) { const struct ipv6hdr *iph = ipv6_hdr(skb); - struct sock *sk = sk_to_full_sk(skb->sk); + struct sock *sk = sk_to_full_sk(sk_partial); unsigned int hh_len; struct dst_entry *dst; int strict = (ipv6_addr_type(&iph->daddr) & @@ -84,7 +84,7 @@ static int nf_ip6_reroute(struct sk_buff *skb, if (!ipv6_addr_equal(&iph->daddr, &rt_info->daddr) || !ipv6_addr_equal(&iph->saddr, &rt_info->saddr) || skb->mark != rt_info->mark) - return ip6_route_me_harder(entry->state.net, skb); + return ip6_route_me_harder(entry->state.net, entry->state.sk, skb); } return 0; } diff --git a/net/ipv6/netfilter/ip6table_mangle.c b/net/ipv6/netfilter/ip6table_mangle.c index 1a2748611e003..cee74803d7a1c 100644 --- a/net/ipv6/netfilter/ip6table_mangle.c +++ b/net/ipv6/netfilter/ip6table_mangle.c @@ -57,7 +57,7 @@ ip6t_mangle_out(struct sk_buff *skb, const struct nf_hook_state *state) skb->mark != mark || ipv6_hdr(skb)->hop_limit != hop_limit || flowlabel != *((u_int32_t *)ipv6_hdr(skb)))) { - err = ip6_route_me_harder(state->net, skb); + err = ip6_route_me_harder(state->net, state->sk, skb); if (err < 0) ret = NF_DROP_ERR(err); } diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c index e3668a6e54e47..570d8ef6fb8b6 100644 --- a/net/netfilter/ipvs/ip_vs_core.c +++ b/net/netfilter/ipvs/ip_vs_core.c @@ -742,12 +742,12 @@ static int ip_vs_route_me_harder(struct netns_ipvs *ipvs, int af, struct dst_entry *dst = skb_dst(skb);
if (dst->dev && !(dst->dev->flags & IFF_LOOPBACK) && - ip6_route_me_harder(ipvs->net, skb) != 0) + ip6_route_me_harder(ipvs->net, skb->sk, skb) != 0) return 1; } else #endif if (!(skb_rtable(skb)->rt_flags & RTCF_LOCAL) && - ip_route_me_harder(ipvs->net, skb, RTN_LOCAL) != 0) + ip_route_me_harder(ipvs->net, skb->sk, skb, RTN_LOCAL) != 0) return 1;
return 0; diff --git a/net/netfilter/nf_nat_proto.c b/net/netfilter/nf_nat_proto.c index 59151dc07fdc1..e87b6bd6b3cdb 100644 --- a/net/netfilter/nf_nat_proto.c +++ b/net/netfilter/nf_nat_proto.c @@ -715,7 +715,7 @@ nf_nat_ipv4_local_fn(void *priv, struct sk_buff *skb,
if (ct->tuplehash[dir].tuple.dst.u3.ip != ct->tuplehash[!dir].tuple.src.u3.ip) { - err = ip_route_me_harder(state->net, skb, RTN_UNSPEC); + err = ip_route_me_harder(state->net, state->sk, skb, RTN_UNSPEC); if (err < 0) ret = NF_DROP_ERR(err); } @@ -953,7 +953,7 @@ nf_nat_ipv6_local_fn(void *priv, struct sk_buff *skb,
if (!nf_inet_addr_cmp(&ct->tuplehash[dir].tuple.dst.u3, &ct->tuplehash[!dir].tuple.src.u3)) { - err = nf_ip6_route_me_harder(state->net, skb); + err = nf_ip6_route_me_harder(state->net, state->sk, skb); if (err < 0) ret = NF_DROP_ERR(err); } diff --git a/net/netfilter/nf_synproxy_core.c b/net/netfilter/nf_synproxy_core.c index 9cca35d229273..d7d34a62d3bf5 100644 --- a/net/netfilter/nf_synproxy_core.c +++ b/net/netfilter/nf_synproxy_core.c @@ -446,7 +446,7 @@ synproxy_send_tcp(struct net *net,
skb_dst_set_noref(nskb, skb_dst(skb)); nskb->protocol = htons(ETH_P_IP); - if (ip_route_me_harder(net, nskb, RTN_UNSPEC)) + if (ip_route_me_harder(net, nskb->sk, nskb, RTN_UNSPEC)) goto free_nskb;
if (nfct) { diff --git a/net/netfilter/nft_chain_route.c b/net/netfilter/nft_chain_route.c index 8826bbe71136c..edd02cda57fca 100644 --- a/net/netfilter/nft_chain_route.c +++ b/net/netfilter/nft_chain_route.c @@ -42,7 +42,7 @@ static unsigned int nf_route_table_hook4(void *priv, iph->daddr != daddr || skb->mark != mark || iph->tos != tos) { - err = ip_route_me_harder(state->net, skb, RTN_UNSPEC); + err = ip_route_me_harder(state->net, state->sk, skb, RTN_UNSPEC); if (err < 0) ret = NF_DROP_ERR(err); } @@ -92,7 +92,7 @@ static unsigned int nf_route_table_hook6(void *priv, skb->mark != mark || ipv6_hdr(skb)->hop_limit != hop_limit || flowlabel != *((u32 *)ipv6_hdr(skb)))) { - err = nf_ip6_route_me_harder(state->net, skb); + err = nf_ip6_route_me_harder(state->net, state->sk, skb); if (err < 0) ret = NF_DROP_ERR(err); } diff --git a/net/netfilter/utils.c b/net/netfilter/utils.c index cedf47ab3c6f9..2182d361e273f 100644 --- a/net/netfilter/utils.c +++ b/net/netfilter/utils.c @@ -191,8 +191,8 @@ static int nf_ip_reroute(struct sk_buff *skb, const struct nf_queue_entry *entry skb->mark == rt_info->mark && iph->daddr == rt_info->daddr && iph->saddr == rt_info->saddr)) - return ip_route_me_harder(entry->state.net, skb, - RTN_UNSPEC); + return ip_route_me_harder(entry->state.net, entry->state.sk, + skb, RTN_UNSPEC); } #endif return 0;
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit c0391b6ab810381df632677a1dcbbbbd63d05b6d ]
If userspace does not include the trailing end of batch message, then nfnetlink aborts the transaction. This allows to check that ruleset updates trigger no errors.
After this patch, invoking this command from the prerouting chain:
# nft -c add rule x y fib saddr . oif type local
fails since oif is not supported there.
This patch fixes the lack of rule validation from the abort/check path to catch configuration errors such as the one above.
Fixes: a654de8fdc18 ("netfilter: nf_tables: fix chain dependency validation") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/netfilter/nfnetlink.h | 9 ++++++++- net/netfilter/nf_tables_api.c | 15 ++++++++++----- net/netfilter/nfnetlink.c | 22 ++++++++++++++++++---- 3 files changed, 36 insertions(+), 10 deletions(-)
diff --git a/include/linux/netfilter/nfnetlink.h b/include/linux/netfilter/nfnetlink.h index 89016d08f6a27..f6267e2883f26 100644 --- a/include/linux/netfilter/nfnetlink.h +++ b/include/linux/netfilter/nfnetlink.h @@ -24,6 +24,12 @@ struct nfnl_callback { const u_int16_t attr_count; /* number of nlattr's */ };
+enum nfnl_abort_action { + NFNL_ABORT_NONE = 0, + NFNL_ABORT_AUTOLOAD, + NFNL_ABORT_VALIDATE, +}; + struct nfnetlink_subsystem { const char *name; __u8 subsys_id; /* nfnetlink subsystem ID */ @@ -31,7 +37,8 @@ struct nfnetlink_subsystem { const struct nfnl_callback *cb; /* callback for individual types */ struct module *owner; int (*commit)(struct net *net, struct sk_buff *skb); - int (*abort)(struct net *net, struct sk_buff *skb, bool autoload); + int (*abort)(struct net *net, struct sk_buff *skb, + enum nfnl_abort_action action); void (*cleanup)(struct net *net); bool (*valid_genid)(struct net *net, u32 genid); }; diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 1c90bd1fce60c..4305d96334082 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -7992,12 +7992,16 @@ static void nf_tables_abort_release(struct nft_trans *trans) kfree(trans); }
-static int __nf_tables_abort(struct net *net, bool autoload) +static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action) { struct nft_trans *trans, *next; struct nft_trans_elem *te; struct nft_hook *hook;
+ if (action == NFNL_ABORT_VALIDATE && + nf_tables_validate(net) < 0) + return -EAGAIN; + list_for_each_entry_safe_reverse(trans, next, &net->nft.commit_list, list) { switch (trans->msg_type) { @@ -8129,7 +8133,7 @@ static int __nf_tables_abort(struct net *net, bool autoload) nf_tables_abort_release(trans); }
- if (autoload) + if (action == NFNL_ABORT_AUTOLOAD) nf_tables_module_autoload(net); else nf_tables_module_autoload_cleanup(net); @@ -8142,9 +8146,10 @@ static void nf_tables_cleanup(struct net *net) nft_validate_state_update(net, NFT_VALIDATE_SKIP); }
-static int nf_tables_abort(struct net *net, struct sk_buff *skb, bool autoload) +static int nf_tables_abort(struct net *net, struct sk_buff *skb, + enum nfnl_abort_action action) { - int ret = __nf_tables_abort(net, autoload); + int ret = __nf_tables_abort(net, action);
mutex_unlock(&net->nft.commit_mutex);
@@ -8775,7 +8780,7 @@ static void __net_exit nf_tables_exit_net(struct net *net) { mutex_lock(&net->nft.commit_mutex); if (!list_empty(&net->nft.commit_list)) - __nf_tables_abort(net, false); + __nf_tables_abort(net, NFNL_ABORT_NONE); __nft_release_tables(net); mutex_unlock(&net->nft.commit_mutex); WARN_ON_ONCE(!list_empty(&net->nft.tables)); diff --git a/net/netfilter/nfnetlink.c b/net/netfilter/nfnetlink.c index 3a2e64e13b227..212c37f53f5f4 100644 --- a/net/netfilter/nfnetlink.c +++ b/net/netfilter/nfnetlink.c @@ -316,7 +316,7 @@ static void nfnetlink_rcv_batch(struct sk_buff *skb, struct nlmsghdr *nlh, return netlink_ack(skb, nlh, -EINVAL, NULL); replay: status = 0; - +replay_abort: skb = netlink_skb_clone(oskb, GFP_KERNEL); if (!skb) return netlink_ack(oskb, nlh, -ENOMEM, NULL); @@ -482,7 +482,7 @@ ack: } done: if (status & NFNL_BATCH_REPLAY) { - ss->abort(net, oskb, true); + ss->abort(net, oskb, NFNL_ABORT_AUTOLOAD); nfnl_err_reset(&err_list); kfree_skb(skb); module_put(ss->owner); @@ -493,11 +493,25 @@ done: status |= NFNL_BATCH_REPLAY; goto done; } else if (err) { - ss->abort(net, oskb, false); + ss->abort(net, oskb, NFNL_ABORT_NONE); netlink_ack(oskb, nlmsg_hdr(oskb), err, NULL); } } else { - ss->abort(net, oskb, false); + enum nfnl_abort_action abort_action; + + if (status & NFNL_BATCH_FAILURE) + abort_action = NFNL_ABORT_NONE; + else + abort_action = NFNL_ABORT_VALIDATE; + + err = ss->abort(net, oskb, abort_action); + if (err == -EAGAIN) { + nfnl_err_reset(&err_list); + kfree_skb(skb); + module_put(ss->owner); + status |= NFNL_BATCH_FAILURE; + goto replay_abort; + } } if (ss->cleanup) ss->cleanup(net);
From: Rajat Jain rajatja@google.com
[ Upstream commit 462b58fb033996e999cc213ed0b430d4f22a28fe ]
Some devices support ACS functionality even though they don't have a spec-compliant ACS Capability; pci_enable_acs() has a quirk mechanism to handle them.
We want to enable ACS whenever possible, but 52fbf5bdeeef ("PCI: Cache ACS capability offset in device") inadvertently broke this by calling pci_enable_acs() only if we find an ACS Capability.
This resulted in ACS not being enabled for these non-compliant devices, which means devices can't be separated into different IOMMU groups, which in turn means we may not be able to pass those devices through to VMs, as reported by Boris V:
https://lore.kernel.org/r/74aeea93-8a46-5f5a-343c-790d4c655da3@bstnet.org
Fixes: 52fbf5bdeeef ("PCI: Cache ACS capability offset in device") Link: https://lore.kernel.org/r/20201028231545.4116866-1-rajatja@google.com Reported-by: Boris V borisvk@bstnet.org Signed-off-by: Rajat Jain rajatja@google.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Alex Williamson alex.williamson@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pci.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index e39c5499770ff..b2fed944903e2 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -3503,8 +3503,13 @@ void pci_acs_init(struct pci_dev *dev) { dev->acs_cap = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ACS);
- if (dev->acs_cap) - pci_enable_acs(dev); + /* + * Attempt to enable ACS regardless of capability because some Root + * Ports (e.g. those quirked with *_intel_pch_acs_*) do not have + * the standard ACS capability but still support ACS via those + * quirks. + */ + pci_enable_acs(dev); }
/**
From: Stefano Brivio sbrivio@redhat.com
[ Upstream commit 7d10e62c2ff8e084c136c94d32d9a94de4d31248 ]
In ip_set_match_extensions(), for sets with counters, we take care of updating counters themselves by calling ip_set_update_counter(), and of checking if the given comparison and values match, by calling ip_set_match_counter() if needed.
However, if a given comparison on counters doesn't match the configured values, that doesn't mean the set entry itself isn't matching.
This fix restores the behaviour we had before commit 4750005a85f7 ("netfilter: ipset: Fix "don't update counters" mode when counters used at the matching"), without reintroducing the issue fixed there: back then, mtype_data_match() first updated counters in any case, and then took care of matching on counters.
Now, if the IPSET_FLAG_SKIP_COUNTER_UPDATE flag is set, ip_set_update_counter() will anyway skip counter updates if desired.
The issue observed is illustrated by this reproducer:
ipset create c hash:ip counters ipset add c 192.0.2.1 iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP
if we now send packets from 192.0.2.1, bytes and packets counters for the entry as shown by 'ipset list' are always zero, and, no matter how many bytes we send, the rule will never match, because counters themselves are not updated.
Reported-by: Mithil Mhatre mmhatre@redhat.com Fixes: 4750005a85f7 ("netfilter: ipset: Fix "don't update counters" mode when counters used at the matching") Signed-off-by: Stefano Brivio sbrivio@redhat.com Signed-off-by: Jozsef Kadlecsik kadlec@netfilter.org Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/ipset/ip_set_core.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c index 920b7c4331f0c..2643dc982eb4e 100644 --- a/net/netfilter/ipset/ip_set_core.c +++ b/net/netfilter/ipset/ip_set_core.c @@ -652,13 +652,14 @@ ip_set_match_extensions(struct ip_set *set, const struct ip_set_ext *ext, if (SET_WITH_COUNTER(set)) { struct ip_set_counter *counter = ext_counter(data, set);
+ ip_set_update_counter(counter, ext, flags); + if (flags & IPSET_FLAG_MATCH_COUNTERS && !(ip_set_match_counter(ip_set_get_packets(counter), mext->packets, mext->packets_op) && ip_set_match_counter(ip_set_get_bytes(counter), mext->bytes, mext->bytes_op))) return false; - ip_set_update_counter(counter, ext, flags); } if (SET_WITH_SKBINFO(set)) ip_set_get_skbinfo(ext_skbinfo(data, set),
From: Greentime Hu greentime.hu@sifive.com
[ Upstream commit f9ac7bbd6e4540dcc6df621b9c9b6eb2e26ded1d ]
The plic driver crashes in plic_irq_unmask() when the interrupt is within a hierarchy, as it picks the top-level chip_data instead of its local one.
Using irq_data_get_irq_chip_data() instead of irq_get_chip_data() solves the issue for good.
Fixes: f1ad1133b18f ("irqchip/sifive-plic: Add support for multiple PLICs") Signed-off-by: Greentime Hu greentime.hu@sifive.com [maz: rewrote commit message] Signed-off-by: Marc Zyngier maz@kernel.org Reviewed-by: Anup Patel anup@brainfault.org Reviewed-by: Atish Patra atish.patra@wdc.com Link: https://lore.kernel.org/r/20201029023738.127472-1-greentime.hu@sifive.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/irqchip/irq-sifive-plic.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c index 4048657ece0ac..6f432d2a5cebd 100644 --- a/drivers/irqchip/irq-sifive-plic.c +++ b/drivers/irqchip/irq-sifive-plic.c @@ -99,7 +99,7 @@ static inline void plic_irq_toggle(const struct cpumask *mask, struct irq_data *d, int enable) { int cpu; - struct plic_priv *priv = irq_get_chip_data(d->irq); + struct plic_priv *priv = irq_data_get_irq_chip_data(d);
writel(enable, priv->regs + PRIORITY_BASE + d->hwirq * PRIORITY_PER_ID); for_each_cpu(cpu, mask) { @@ -115,7 +115,7 @@ static void plic_irq_unmask(struct irq_data *d) { struct cpumask amask; unsigned int cpu; - struct plic_priv *priv = irq_get_chip_data(d->irq); + struct plic_priv *priv = irq_data_get_irq_chip_data(d);
cpumask_and(&amask, &priv->lmask, cpu_online_mask); cpu = cpumask_any_and(irq_data_get_affinity_mask(d), @@ -127,7 +127,7 @@ static void plic_irq_unmask(struct irq_data *d)
static void plic_irq_mask(struct irq_data *d) { - struct plic_priv *priv = irq_get_chip_data(d->irq); + struct plic_priv *priv = irq_data_get_irq_chip_data(d);
plic_irq_toggle(&priv->lmask, d, 0); } @@ -138,7 +138,7 @@ static int plic_set_affinity(struct irq_data *d, { unsigned int cpu; struct cpumask amask; - struct plic_priv *priv = irq_get_chip_data(d->irq); + struct plic_priv *priv = irq_data_get_irq_chip_data(d);
cpumask_and(&amask, &priv->lmask, mask_val);
From: Qian Cai cai@redhat.com
[ Upstream commit fd552e0542b4532483289cce48fdbd27b692984b ]
Lockdep complains that a possible deadlock below in eeh_addr_cache_show() because it is acquiring a lock with IRQ enabled, but eeh_addr_cache_insert_dev() needs to acquire the same lock with IRQ disabled. Let's just make eeh_addr_cache_show() acquire the lock with IRQ disabled as well.
CPU0 CPU1 ---- ---- lock(&pci_io_addr_cache_root.piar_lock); local_irq_disable(); lock(&tp->lock); lock(&pci_io_addr_cache_root.piar_lock); <Interrupt> lock(&tp->lock);
*** DEADLOCK ***
lock_acquire+0x140/0x5f0 _raw_spin_lock_irqsave+0x64/0xb0 eeh_addr_cache_insert_dev+0x48/0x390 eeh_probe_device+0xb8/0x1a0 pnv_pcibios_bus_add_device+0x3c/0x80 pcibios_bus_add_device+0x118/0x290 pci_bus_add_device+0x28/0xe0 pci_bus_add_devices+0x54/0xb0 pcibios_init+0xc4/0x124 do_one_initcall+0xac/0x528 kernel_init_freeable+0x35c/0x3fc kernel_init+0x24/0x148 ret_from_kernel_thread+0x5c/0x80
lock_acquire+0x140/0x5f0 _raw_spin_lock+0x4c/0x70 eeh_addr_cache_show+0x38/0x110 seq_read+0x1a0/0x660 vfs_read+0xc8/0x1f0 ksys_read+0x74/0x130 system_call_exception+0xf8/0x1d0 system_call_common+0xe8/0x218
Fixes: 5ca85ae6318d ("powerpc/eeh_cache: Add a way to dump the EEH address cache") Signed-off-by: Qian Cai cai@redhat.com Reviewed-by: Oliver O'Halloran oohall@gmail.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20201028152717.8967-1-cai@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/kernel/eeh_cache.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/kernel/eeh_cache.c b/arch/powerpc/kernel/eeh_cache.c index 6b50bf15d8c19..bf3270426d82d 100644 --- a/arch/powerpc/kernel/eeh_cache.c +++ b/arch/powerpc/kernel/eeh_cache.c @@ -264,8 +264,9 @@ static int eeh_addr_cache_show(struct seq_file *s, void *v) { struct pci_io_addr_range *piar; struct rb_node *n; + unsigned long flags;
- spin_lock(&pci_io_addr_cache_root.piar_lock); + spin_lock_irqsave(&pci_io_addr_cache_root.piar_lock, flags); for (n = rb_first(&pci_io_addr_cache_root.rb_root); n; n = rb_next(n)) { piar = rb_entry(n, struct pci_io_addr_range, rb_node);
@@ -273,7 +274,7 @@ static int eeh_addr_cache_show(struct seq_file *s, void *v) (piar->flags & IORESOURCE_IO) ? "i/o" : "mem", &piar->addr_lo, &piar->addr_hi, pci_name(piar->pcidev)); } - spin_unlock(&pci_io_addr_cache_root.piar_lock); + spin_unlock_irqrestore(&pci_io_addr_cache_root.piar_lock, flags);
return 0; }
From: Maxime Ripard maxime@cerno.tech
[ Upstream commit 1c80be48c70a2198f7cf04a546b3805b92293ac6 ]
The BO cache needs to be cleaned up using vc4_bo_cache_destroy, but it's not used consistently (vc4_drv's bind calls it in its error path, but doesn't in unbind), and we can make that automatic through a managed action. Let's remove the requirement to call vc4_bo_cache_destroy.
Fixes: c826a6e10644 ("drm/vc4: Add a BO cache.") Acked-by: Daniel Vetter daniel.vetter@ffwll.ch Signed-off-by: Maxime Ripard maxime@cerno.tech Link: https://patchwork.freedesktop.org/patch/msgid/20201029190104.2181730-1-maxim... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/vc4/vc4_bo.c | 5 +++-- drivers/gpu/drm/vc4/vc4_drv.c | 1 - drivers/gpu/drm/vc4/vc4_drv.h | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c index 74ceebd62fbce..073b528f33337 100644 --- a/drivers/gpu/drm/vc4/vc4_bo.c +++ b/drivers/gpu/drm/vc4/vc4_bo.c @@ -1005,6 +1005,7 @@ int vc4_get_tiling_ioctl(struct drm_device *dev, void *data, return 0; }
+static void vc4_bo_cache_destroy(struct drm_device *dev, void *unused); int vc4_bo_cache_init(struct drm_device *dev) { struct vc4_dev *vc4 = to_vc4_dev(dev); @@ -1033,10 +1034,10 @@ int vc4_bo_cache_init(struct drm_device *dev) INIT_WORK(&vc4->bo_cache.time_work, vc4_bo_cache_time_work); timer_setup(&vc4->bo_cache.time_timer, vc4_bo_cache_time_timer, 0);
- return 0; + return drmm_add_action_or_reset(dev, vc4_bo_cache_destroy, NULL); }
-void vc4_bo_cache_destroy(struct drm_device *dev) +static void vc4_bo_cache_destroy(struct drm_device *dev, void *unused) { struct vc4_dev *vc4 = to_vc4_dev(dev); int i; diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c index f6995e7f6eb6e..c7aeaba3fabe8 100644 --- a/drivers/gpu/drm/vc4/vc4_drv.c +++ b/drivers/gpu/drm/vc4/vc4_drv.c @@ -311,7 +311,6 @@ unbind_all: gem_destroy: vc4_gem_destroy(drm); drm_mode_config_cleanup(drm); - vc4_bo_cache_destroy(drm); dev_put: drm_dev_put(drm); return ret; diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h index fa19160c801f8..528c28895a8e0 100644 --- a/drivers/gpu/drm/vc4/vc4_drv.h +++ b/drivers/gpu/drm/vc4/vc4_drv.h @@ -14,6 +14,7 @@ #include <drm/drm_device.h> #include <drm/drm_encoder.h> #include <drm/drm_gem_cma_helper.h> +#include <drm/drm_managed.h> #include <drm/drm_mm.h> #include <drm/drm_modeset_lock.h>
@@ -786,7 +787,6 @@ struct drm_gem_object *vc4_prime_import_sg_table(struct drm_device *dev, struct sg_table *sgt); void *vc4_prime_vmap(struct drm_gem_object *obj); int vc4_bo_cache_init(struct drm_device *dev); -void vc4_bo_cache_destroy(struct drm_device *dev); int vc4_bo_inc_usecnt(struct vc4_bo *bo); void vc4_bo_dec_usecnt(struct vc4_bo *bo); void vc4_bo_add_to_purgeable_pool(struct vc4_bo *bo);
From: Maor Gottlieb maorg@nvidia.com
[ Upstream commit 372a1786283e50e7cb437ab7fdb1b95597310ad7 ]
Failure in srpt_refresh_port() for the second port will leave MAD registered for the first one, however, the srpt_add_one() will be marked as "failed" and SRPT will leak resources for that registered but not used and released first port.
Unregister the MAD agent for all ports in case of failure.
Fixes: a42d985bd5b2 ("ib_srpt: Initial SRP Target merge for v3.3-rc1") Link: https://lore.kernel.org/r/20201028065051.112430-1-leon@kernel.org Signed-off-by: Maor Gottlieb maorg@nvidia.com Signed-off-by: Leon Romanovsky leonro@nvidia.com Reviewed-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/ulp/srpt/ib_srpt.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c index 0065eb17ae36b..1b096305de1a4 100644 --- a/drivers/infiniband/ulp/srpt/ib_srpt.c +++ b/drivers/infiniband/ulp/srpt/ib_srpt.c @@ -622,10 +622,11 @@ static int srpt_refresh_port(struct srpt_port *sport) /** * srpt_unregister_mad_agent - unregister MAD callback functions * @sdev: SRPT HCA pointer. + * #port_cnt: number of ports with registered MAD * * Note: It is safe to call this function more than once for the same device. */ -static void srpt_unregister_mad_agent(struct srpt_device *sdev) +static void srpt_unregister_mad_agent(struct srpt_device *sdev, int port_cnt) { struct ib_port_modify port_modify = { .clr_port_cap_mask = IB_PORT_DEVICE_MGMT_SUP, @@ -633,7 +634,7 @@ static void srpt_unregister_mad_agent(struct srpt_device *sdev) struct srpt_port *sport; int i;
- for (i = 1; i <= sdev->device->phys_port_cnt; i++) { + for (i = 1; i <= port_cnt; i++) { sport = &sdev->port[i - 1]; WARN_ON(sport->port != i); if (sport->mad_agent) { @@ -3185,7 +3186,8 @@ static int srpt_add_one(struct ib_device *device) if (ret) { pr_err("MAD registration failed for %s-%d.\n", dev_name(&sdev->device->dev), i); - goto err_event; + i--; + goto err_port; } }
@@ -3197,7 +3199,8 @@ static int srpt_add_one(struct ib_device *device) pr_debug("added %s.\n", dev_name(&device->dev)); return 0;
-err_event: +err_port: + srpt_unregister_mad_agent(sdev, i); ib_unregister_event_handler(&sdev->event_handler); err_cm: if (sdev->cm_id) @@ -3221,7 +3224,7 @@ static void srpt_remove_one(struct ib_device *device, void *client_data) struct srpt_device *sdev = client_data; int i;
- srpt_unregister_mad_agent(sdev); + srpt_unregister_mad_agent(sdev, sdev->device->phys_port_cnt);
ib_unregister_event_handler(&sdev->event_handler);
From: zhongjiang-ali zhongjiang-ali@linux.alibaba.com
[ Upstream commit 7de2e9f195b9cb27583c5c64deaaf5e6afcc163e ]
memcg_page_state will get the specified number in hierarchical memcg, It should multiply by HPAGE_PMD_NR rather than an page if the item is NR_ANON_THPS.
[akpm@linux-foundation.org: fix printk warning] [akpm@linux-foundation.org: use u64 cast, per Michal]
Fixes: 468c398233da ("mm: memcontrol: switch to native NR_ANON_THPS counter") Signed-off-by: zhongjiang-ali zhongjiang-ali@linux.alibaba.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Acked-by: Johannes Weiner hannes@cmpxchg.org Acked-by: Michal Hocko mhocko@suse.com Link: https://lkml.kernel.org/r/1603722395-72443-1-git-send-email-zhongjiang-ali@l... Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- mm/memcontrol.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c index de51787831728..51ce5d172855a 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -4068,11 +4068,17 @@ static int memcg_stat_show(struct seq_file *m, void *v) (u64)memsw * PAGE_SIZE);
for (i = 0; i < ARRAY_SIZE(memcg1_stats); i++) { + unsigned long nr; + if (memcg1_stats[i] == MEMCG_SWAP && !do_memsw_account()) continue; + nr = memcg_page_state(memcg, memcg1_stats[i]); +#ifdef CONFIG_TRANSPARENT_HUGEPAGE + if (memcg1_stats[i] == NR_ANON_THPS) + nr *= HPAGE_PMD_NR; +#endif seq_printf(m, "total_%s %llu\n", memcg1_stat_names[i], - (u64)memcg_page_state(memcg, memcg1_stats[i]) * - PAGE_SIZE); + (u64)nr * PAGE_SIZE); }
for (i = 0; i < ARRAY_SIZE(memcg1_events); i++)
From: Clément Péron peron.clem@gmail.com
[ Upstream commit d3c335da0200be9287cdf5755d19f62ce1670a8d ]
Rename goto labels in device_init it will be easier to maintain.
Reviewed-by: Alyssa Rosenzweig alyssa.rosenzweig@collabora.com Reviewed-by: Steven Price steven.price@arm.com Signed-off-by: Clément Péron peron.clem@gmail.com Signed-off-by: Rob Herring robh@kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20200710095409.407087-8-peron.... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/panfrost/panfrost_device.c | 30 +++++++++++----------- 1 file changed, 15 insertions(+), 15 deletions(-)
diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c index b172087eee6ae..9f89984f652a6 100644 --- a/drivers/gpu/drm/panfrost/panfrost_device.c +++ b/drivers/gpu/drm/panfrost/panfrost_device.c @@ -216,56 +216,56 @@ int panfrost_device_init(struct panfrost_device *pfdev)
err = panfrost_regulator_init(pfdev); if (err) - goto err_out0; + goto out_clk;
err = panfrost_reset_init(pfdev); if (err) { dev_err(pfdev->dev, "reset init failed %d\n", err); - goto err_out1; + goto out_regulator; }
err = panfrost_pm_domain_init(pfdev); if (err) - goto err_out2; + goto out_reset;
res = platform_get_resource(pfdev->pdev, IORESOURCE_MEM, 0); pfdev->iomem = devm_ioremap_resource(pfdev->dev, res); if (IS_ERR(pfdev->iomem)) { dev_err(pfdev->dev, "failed to ioremap iomem\n"); err = PTR_ERR(pfdev->iomem); - goto err_out3; + goto out_pm_domain; }
err = panfrost_gpu_init(pfdev); if (err) - goto err_out3; + goto out_pm_domain;
err = panfrost_mmu_init(pfdev); if (err) - goto err_out4; + goto out_gpu;
err = panfrost_job_init(pfdev); if (err) - goto err_out5; + goto out_mmu;
err = panfrost_perfcnt_init(pfdev); if (err) - goto err_out6; + goto out_job;
return 0; -err_out6: +out_job: panfrost_job_fini(pfdev); -err_out5: +out_mmu: panfrost_mmu_fini(pfdev); -err_out4: +out_gpu: panfrost_gpu_fini(pfdev); -err_out3: +out_pm_domain: panfrost_pm_domain_fini(pfdev); -err_out2: +out_reset: panfrost_reset_fini(pfdev); -err_out1: +out_regulator: panfrost_regulator_fini(pfdev); -err_out0: +out_clk: panfrost_clk_fini(pfdev); return err; }
From: Clément Péron peron.clem@gmail.com
[ Upstream commit 25e247bbf85af3ad721dfeb2e2caf405f43b7e66 ]
Later we will introduce devfreq probing regulator if they are present. As regulator should be probe only one time we need to get this logic in the device_init().
panfrost_device is already taking care of devfreq_resume() and devfreq_suspend(), so it's not totally illogic to move the devfreq_init() and devfreq_fini() here.
Reviewed-by: Alyssa Rosenzweig alyssa.rosenzweig@collabora.com Reviewed-by: Steven Price steven.price@arm.com Signed-off-by: Clément Péron peron.clem@gmail.com Signed-off-by: Rob Herring robh@kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20200710095409.407087-9-peron.... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/panfrost/panfrost_device.c | 12 +++++++++++- drivers/gpu/drm/panfrost/panfrost_drv.c | 15 ++------------- 2 files changed, 13 insertions(+), 14 deletions(-)
diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c index 9f89984f652a6..36b5c8fea3eba 100644 --- a/drivers/gpu/drm/panfrost/panfrost_device.c +++ b/drivers/gpu/drm/panfrost/panfrost_device.c @@ -214,9 +214,16 @@ int panfrost_device_init(struct panfrost_device *pfdev) return err; }
+ err = panfrost_devfreq_init(pfdev); + if (err) { + if (err != -EPROBE_DEFER) + dev_err(pfdev->dev, "devfreq init failed %d\n", err); + goto out_clk; + } + err = panfrost_regulator_init(pfdev); if (err) - goto out_clk; + goto out_devfreq;
err = panfrost_reset_init(pfdev); if (err) { @@ -265,6 +272,8 @@ out_reset: panfrost_reset_fini(pfdev); out_regulator: panfrost_regulator_fini(pfdev); +out_devfreq: + panfrost_devfreq_fini(pfdev); out_clk: panfrost_clk_fini(pfdev); return err; @@ -278,6 +287,7 @@ void panfrost_device_fini(struct panfrost_device *pfdev) panfrost_gpu_fini(pfdev); panfrost_pm_domain_fini(pfdev); panfrost_reset_fini(pfdev); + panfrost_devfreq_fini(pfdev); panfrost_regulator_fini(pfdev); panfrost_clk_fini(pfdev); } diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c index f6d5d03201fad..f2dd259f28995 100644 --- a/drivers/gpu/drm/panfrost/panfrost_drv.c +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c @@ -14,7 +14,6 @@ #include <drm/drm_utils.h>
#include "panfrost_device.h" -#include "panfrost_devfreq.h" #include "panfrost_gem.h" #include "panfrost_mmu.h" #include "panfrost_job.h" @@ -606,13 +605,6 @@ static int panfrost_probe(struct platform_device *pdev) goto err_out0; }
- err = panfrost_devfreq_init(pfdev); - if (err) { - if (err != -EPROBE_DEFER) - dev_err(&pdev->dev, "Fatal error during devfreq init\n"); - goto err_out1; - } - pm_runtime_set_active(pfdev->dev); pm_runtime_mark_last_busy(pfdev->dev); pm_runtime_enable(pfdev->dev); @@ -625,16 +617,14 @@ static int panfrost_probe(struct platform_device *pdev) */ err = drm_dev_register(ddev, 0); if (err < 0) - goto err_out2; + goto err_out1;
panfrost_gem_shrinker_init(ddev);
return 0;
-err_out2: - pm_runtime_disable(pfdev->dev); - panfrost_devfreq_fini(pfdev); err_out1: + pm_runtime_disable(pfdev->dev); panfrost_device_fini(pfdev); err_out0: drm_dev_put(ddev); @@ -650,7 +640,6 @@ static int panfrost_remove(struct platform_device *pdev) panfrost_gem_shrinker_cleanup(ddev);
pm_runtime_get_sync(pfdev->dev); - panfrost_devfreq_fini(pfdev); panfrost_device_fini(pfdev); pm_runtime_put_sync_suspend(pfdev->dev); pm_runtime_disable(pfdev->dev);
From: Steven Price steven.price@arm.com
[ Upstream commit 876b15d2c88d8c005f1aebeaa23f1e448d834757 ]
When unloading the call to pm_runtime_put_sync_suspend() will attempt to turn the GPU cores off, however panfrost_device_fini() will have turned the clocks off. This leads to the hardware locking up.
Instead don't call pm_runtime_put_sync_suspend() and instead simply mark the device as suspended using pm_runtime_set_suspended(). And also include this on the error path in panfrost_probe().
Fixes: aebe8c22a912 ("drm/panfrost: Fix possible suspend in panfrost_remove") Signed-off-by: Steven Price steven.price@arm.com Reviewed-by: Tomeu Vizoso tomeu.vizoso@collabora.com Signed-off-by: Boris Brezillon boris.brezillon@collabora.com Link: https://patchwork.freedesktop.org/patch/msgid/20201030145833.29006-1-steven.... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/panfrost/panfrost_drv.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c index f2dd259f28995..5d95917f923a1 100644 --- a/drivers/gpu/drm/panfrost/panfrost_drv.c +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c @@ -626,6 +626,7 @@ static int panfrost_probe(struct platform_device *pdev) err_out1: pm_runtime_disable(pfdev->dev); panfrost_device_fini(pfdev); + pm_runtime_set_suspended(pfdev->dev); err_out0: drm_dev_put(ddev); return err; @@ -640,9 +641,9 @@ static int panfrost_remove(struct platform_device *pdev) panfrost_gem_shrinker_cleanup(ddev);
pm_runtime_get_sync(pfdev->dev); - panfrost_device_fini(pfdev); - pm_runtime_put_sync_suspend(pfdev->dev); pm_runtime_disable(pfdev->dev); + panfrost_device_fini(pfdev); + pm_runtime_set_suspended(pfdev->dev);
drm_dev_put(ddev); return 0;
From: Stanislav Ivanichkin sivanichkin@yandex-team.ru
[ Upstream commit a6293f36ac92ab513771a98efe486477be2f981f ]
# ./perf trace -e sched:sched_switch -G test -a sleep 1 perf: Segmentation fault Obtained 11 stack frames. ./perf(sighandler_dump_stack+0x43) [0x55cfdc636db3] /lib/x86_64-linux-gnu/libc.so.6(+0x3efcf) [0x7fd23eecafcf] ./perf(parse_cgroups+0x36) [0x55cfdc673f36] ./perf(+0x3186ed) [0x55cfdc70d6ed] ./perf(parse_options_subcommand+0x629) [0x55cfdc70e999] ./perf(cmd_trace+0x9c2) [0x55cfdc5ad6d2] ./perf(+0x1e8ae0) [0x55cfdc5ddae0] ./perf(+0x1e8ded) [0x55cfdc5ddded] ./perf(main+0x370) [0x55cfdc556f00] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe6) [0x7fd23eeadb96] ./perf(_start+0x29) [0x55cfdc557389] Segmentation fault #
It happens because "struct trace" in option->value is passed to the parse_cgroups function instead of "struct evlist".
Fixes: 9ea42ba4411ac ("perf trace: Support setting cgroups as targets") Signed-off-by: Stanislav Ivanichkin sivanichkin@yandex-team.ru Tested-by: Arnaldo Carvalho de Melo acme@redhat.com Acked-by: Namhyung Kim namhyung@kernel.org Cc: Dmitry Monakhov dmtrmonakhov@yandex-team.ru Link: http://lore.kernel.org/lkml/20201027094357.94881-1-sivanichkin@yandex-team.r... Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/builtin-trace.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c index 44a75f234db17..de80534473afa 100644 --- a/tools/perf/builtin-trace.c +++ b/tools/perf/builtin-trace.c @@ -4639,9 +4639,9 @@ do_concat: err = 0;
if (lists[0]) { - struct option o = OPT_CALLBACK('e', "event", &trace->evlist, "event", - "event selector. use 'perf list' to list available events", - parse_events_option); + struct option o = { + .value = &trace->evlist, + }; err = parse_events_option(&o, lists[0], 0); } out: @@ -4655,9 +4655,12 @@ static int trace__parse_cgroups(const struct option *opt, const char *str, int u { struct trace *trace = opt->value;
- if (!list_empty(&trace->evlist->core.entries)) - return parse_cgroups(opt, str, unset); - + if (!list_empty(&trace->evlist->core.entries)) { + struct option o = { + .value = &trace->evlist, + }; + return parse_cgroups(&o, str, unset); + } trace->cgroup = evlist__findnew_cgroup(trace->evlist, str);
return 0;
From: Jiri Olsa jolsa@kernel.org
[ Upstream commit fe01adb72356a4e2f8735e4128af85921ca98fa1 ]
We are missing swap for ino_generation field.
Fixes: 5c5e854bc760 ("perf tools: Add attr->mmap2 support") Signed-off-by: Jiri Olsa jolsa@kernel.org Acked-by: Namhyung Kim namhyung@kernel.org Link: https://lore.kernel.org/r/20201101233103.3537427-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/session.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c index 7a5f03764702b..d20b16ee73772 100644 --- a/tools/perf/util/session.c +++ b/tools/perf/util/session.c @@ -595,6 +595,7 @@ static void perf_event__mmap2_swap(union perf_event *event, event->mmap2.maj = bswap_32(event->mmap2.maj); event->mmap2.min = bswap_32(event->mmap2.min); event->mmap2.ino = bswap_64(event->mmap2.ino); + event->mmap2.ino_generation = bswap_64(event->mmap2.ino_generation);
if (sample_id_all) { void *data = &event->mmap2.filename;
From: Namhyung Kim namhyung@kernel.org
[ Upstream commit 2c589d933e54d183ee2a052971b730e423c62031 ]
It was missed to add a swap function for PERF_RECORD_CGROUP.
Fixes: ba78c1c5461c ("perf tools: Basic support for CGROUP event") Signed-off-by: Namhyung Kim namhyung@kernel.org Acked-by: Jiri Olsa jolsa@redhat.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Ian Rogers irogers@google.com Cc: Mark Rutland mark.rutland@arm.com Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Link: http://lore.kernel.org/lkml/20201102140228.303657-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/session.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c index d20b16ee73772..098080287c687 100644 --- a/tools/perf/util/session.c +++ b/tools/perf/util/session.c @@ -711,6 +711,18 @@ static void perf_event__namespaces_swap(union perf_event *event, swap_sample_id_all(event, &event->namespaces.link_info[i]); }
+static void perf_event__cgroup_swap(union perf_event *event, bool sample_id_all) +{ + event->cgroup.id = bswap_64(event->cgroup.id); + + if (sample_id_all) { + void *data = &event->cgroup.path; + + data += PERF_ALIGN(strlen(data) + 1, sizeof(u64)); + swap_sample_id_all(event, data); + } +} + static u8 revbyte(u8 b) { int rev = (b >> 4) | ((b & 0xf) << 4); @@ -953,6 +965,7 @@ static perf_event__swap_op perf_event__swap_ops[] = { [PERF_RECORD_SWITCH] = perf_event__switch_swap, [PERF_RECORD_SWITCH_CPU_WIDE] = perf_event__switch_swap, [PERF_RECORD_NAMESPACES] = perf_event__namespaces_swap, + [PERF_RECORD_CGROUP] = perf_event__cgroup_swap, [PERF_RECORD_TEXT_POKE] = perf_event__text_poke_swap, [PERF_RECORD_HEADER_ATTR] = perf_event__hdr_attr_swap, [PERF_RECORD_HEADER_EVENT_TYPE] = perf_event__event_type_swap,
From: Dan Carpenter dan.carpenter@oracle.com
[ Upstream commit 158e1886b6262c1d1c96a18c85fac5219b8bf804 ]
This is harmless, but the "addr" comes from the user and it could lead to a negative shift or to shift wrapping if it's too high.
Fixes: 0b00a5615dc4 ("ALSA: hdac_ext: add hdac extended controller") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Link: https://lore.kernel.org/r/20201103101807.GC1127762@mwanda Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/hda/ext/hdac_ext_controller.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/sound/hda/ext/hdac_ext_controller.c b/sound/hda/ext/hdac_ext_controller.c index 4d060d5b1db6d..b0c0ef824d7d9 100644 --- a/sound/hda/ext/hdac_ext_controller.c +++ b/sound/hda/ext/hdac_ext_controller.c @@ -148,6 +148,8 @@ struct hdac_ext_link *snd_hdac_ext_bus_get_link(struct hdac_bus *bus, return NULL; if (bus->idx != bus_idx) return NULL; + if (addr < 0 || addr > 31) + return NULL;
list_for_each_entry(hlink, &bus->hlink_list, list) { for (i = 0; i < HDA_MAX_CODECS; i++) {
From: Liu Yi L yi.l.liu@intel.com
[ Upstream commit eea4e29ab8bef254b228d6e1e3de188087b2c7d0 ]
Should get correct sid and set it into sdev. Because we execute 'sdev->sid != req->rid' in the loop of prq_event_thread().
Fixes: eb8d93ea3c1d ("iommu/vt-d: Report page request faults for guest SVA") Signed-off-by: Liu Yi L yi.l.liu@intel.com Signed-off-by: Yi Sun yi.y.sun@linux.intel.com Acked-by: Lu Baolu baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/1604025444-6954-2-git-send-email-yi.y.sun@linux.in... Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/intel/svm.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c index 95c3164a2302f..a71fbbddaa66e 100644 --- a/drivers/iommu/intel/svm.c +++ b/drivers/iommu/intel/svm.c @@ -278,6 +278,7 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev, struct intel_iommu *iommu = device_to_iommu(dev, NULL, NULL); struct intel_svm_dev *sdev = NULL; struct dmar_domain *dmar_domain; + struct device_domain_info *info; struct intel_svm *svm = NULL; int ret = 0;
@@ -302,6 +303,10 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev, if (data->hpasid <= 0 || data->hpasid >= PASID_MAX) return -EINVAL;
+ info = get_domain_info(dev); + if (!info) + return -EINVAL; + dmar_domain = to_dmar_domain(domain);
mutex_lock(&pasid_mutex); @@ -349,6 +354,7 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev, goto out; } sdev->dev = dev; + sdev->sid = PCI_DEVID(info->bus, info->devfn);
/* Only count users if device has aux domains */ if (iommu_dev_feature_enabled(dev, IOMMU_DEV_FEAT_AUX))
From: Liu, Yi L yi.l.liu@intel.com
[ Upstream commit 71cd8e2d16703a9df5c86a9e19f4cba99316cc53 ]
In prq_event_thread(), the QI_PGRP_PDP is wrongly set by 'req->pasid_present' which should be replaced to 'req->priv_data_present'.
Fixes: 5b438f4ba315 ("iommu/vt-d: Support page request in scalable mode") Signed-off-by: Liu, Yi L yi.l.liu@intel.com Signed-off-by: Yi Sun yi.y.sun@linux.intel.com Acked-by: Lu Baolu baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/1604025444-6954-3-git-send-email-yi.y.sun@linux.in... Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/intel/svm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c index a71fbbddaa66e..20fa8c7fcd8e7 100644 --- a/drivers/iommu/intel/svm.c +++ b/drivers/iommu/intel/svm.c @@ -1001,7 +1001,7 @@ no_pasid: resp.qw0 = QI_PGRP_PASID(req->pasid) | QI_PGRP_DID(req->rid) | QI_PGRP_PASID_P(req->pasid_present) | - QI_PGRP_PDP(req->pasid_present) | + QI_PGRP_PDP(req->priv_data_present) | QI_PGRP_RESP_CODE(result) | QI_PGRP_RESP_TYPE; resp.qw1 = QI_PGRP_IDX(req->prg_index) |
From: David Howells dhowells@redhat.com
[ Upstream commit c80afa1d9c3603d5eddeb8d63368823b1982f3f0 ]
When using the afs.yfs.acl xattr to change an AuriStor ACL, a warning can be generated when the request is marshalled because the buffer pointer isn't increased after adding the last element, thereby triggering the check at the end if the ACL wasn't empty. This just causes something like the following warning, but doesn't stop the call from happening successfully:
kAFS: YFS.StoreOpaqueACL2: Request buffer underflow (36<108)
Fix this simply by increasing the count prior to the check.
Fixes: f5e4546347bc ("afs: Implement YFS ACL setting") Signed-off-by: David Howells dhowells@redhat.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/afs/yfsclient.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/afs/yfsclient.c b/fs/afs/yfsclient.c index 3b1239b7e90d8..bd787e71a657f 100644 --- a/fs/afs/yfsclient.c +++ b/fs/afs/yfsclient.c @@ -1990,6 +1990,7 @@ void yfs_fs_store_opaque_acl2(struct afs_operation *op) memcpy(bp, acl->data, acl->size); if (acl->size != size) memset((void *)bp + acl->size, 0, size - acl->size); + bp += size / sizeof(__be32); yfs_check_req(call, bp);
trace_afs_make_fs_call(call, &vp->fid);
From: David Howells dhowells@redhat.com
[ Upstream commit f4c79144edd8a49ffca8fa737a31d606be742a34 ]
The cleanup for the yfs_store_opaque_acl2_operation calls the wrong function to destroy the ACL content buffer. It's an afs_acl struct, not a yfs_acl struct - and the free function for latter may pass invalid pointers to kfree().
Fix this by using the afs_acl_put() function. The yfs_acl_put() function is then no longer used and can be removed.
general protection fault, probably for non-canonical address 0x7ebde00000000: 0000 [#1] SMP PTI ... RIP: 0010:compound_head+0x0/0x11 ... Call Trace: virt_to_cache+0x8/0x51 kfree+0x5d/0x79 yfs_free_opaque_acl+0x16/0x29 afs_put_operation+0x60/0x114 __vfs_setxattr+0x67/0x72 __vfs_setxattr_noperm+0x66/0xe9 vfs_setxattr+0x67/0xce setxattr+0x14e/0x184 __do_sys_fsetxattr+0x66/0x8f do_syscall_64+0x2d/0x3a entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: e49c7b2f6de7 ("afs: Build an abstraction around an "operation" concept") Signed-off-by: David Howells dhowells@redhat.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/afs/xattr.c | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/fs/afs/xattr.c b/fs/afs/xattr.c index 38884d6c57cdc..95c573dcda116 100644 --- a/fs/afs/xattr.c +++ b/fs/afs/xattr.c @@ -148,11 +148,6 @@ static const struct xattr_handler afs_xattr_afs_acl_handler = { .set = afs_xattr_set_acl, };
-static void yfs_acl_put(struct afs_operation *op) -{ - yfs_free_opaque_acl(op->yacl); -} - static const struct afs_operation_ops yfs_fetch_opaque_acl_operation = { .issue_yfs_rpc = yfs_fs_fetch_opaque_acl, .success = afs_acl_success, @@ -246,7 +241,7 @@ error: static const struct afs_operation_ops yfs_store_opaque_acl2_operation = { .issue_yfs_rpc = yfs_fs_store_opaque_acl2, .success = afs_acl_success, - .put = yfs_acl_put, + .put = afs_acl_put, };
/*
From: Alex Williamson alex.williamson@redhat.com
[ Upstream commit 38565c93c8a1306dc5f245572a545fbea908ac41 ]
The ioeventfd is called under spinlock with interrupts disabled, therefore if the memory lock is contended defer code that might sleep to a thread context.
Fixes: bc93b9ae0151 ("vfio-pci: Avoid recursive read-lock usage") Link: https://bugzilla.kernel.org/show_bug.cgi?id=209253#c1 Reported-by: Ian Pilcher arequipeno@gmail.com Tested-by: Ian Pilcher arequipeno@gmail.com Tested-by: Justin Gatzen justin.gatzen@gmail.com Signed-off-by: Alex Williamson alex.williamson@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/vfio/pci/vfio_pci_rdwr.c | 43 ++++++++++++++++++++++++++------ 1 file changed, 35 insertions(+), 8 deletions(-)
diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c index 9e353c484ace2..a0b5fc8e46f4d 100644 --- a/drivers/vfio/pci/vfio_pci_rdwr.c +++ b/drivers/vfio/pci/vfio_pci_rdwr.c @@ -356,34 +356,60 @@ ssize_t vfio_pci_vga_rw(struct vfio_pci_device *vdev, char __user *buf, return done; }
-static int vfio_pci_ioeventfd_handler(void *opaque, void *unused) +static void vfio_pci_ioeventfd_do_write(struct vfio_pci_ioeventfd *ioeventfd, + bool test_mem) { - struct vfio_pci_ioeventfd *ioeventfd = opaque; - switch (ioeventfd->count) { case 1: - vfio_pci_iowrite8(ioeventfd->vdev, ioeventfd->test_mem, + vfio_pci_iowrite8(ioeventfd->vdev, test_mem, ioeventfd->data, ioeventfd->addr); break; case 2: - vfio_pci_iowrite16(ioeventfd->vdev, ioeventfd->test_mem, + vfio_pci_iowrite16(ioeventfd->vdev, test_mem, ioeventfd->data, ioeventfd->addr); break; case 4: - vfio_pci_iowrite32(ioeventfd->vdev, ioeventfd->test_mem, + vfio_pci_iowrite32(ioeventfd->vdev, test_mem, ioeventfd->data, ioeventfd->addr); break; #ifdef iowrite64 case 8: - vfio_pci_iowrite64(ioeventfd->vdev, ioeventfd->test_mem, + vfio_pci_iowrite64(ioeventfd->vdev, test_mem, ioeventfd->data, ioeventfd->addr); break; #endif } +} + +static int vfio_pci_ioeventfd_handler(void *opaque, void *unused) +{ + struct vfio_pci_ioeventfd *ioeventfd = opaque; + struct vfio_pci_device *vdev = ioeventfd->vdev; + + if (ioeventfd->test_mem) { + if (!down_read_trylock(&vdev->memory_lock)) + return 1; /* Lock contended, use thread */ + if (!__vfio_pci_memory_enabled(vdev)) { + up_read(&vdev->memory_lock); + return 0; + } + } + + vfio_pci_ioeventfd_do_write(ioeventfd, false); + + if (ioeventfd->test_mem) + up_read(&vdev->memory_lock);
return 0; }
+static void vfio_pci_ioeventfd_thread(void *opaque, void *unused) +{ + struct vfio_pci_ioeventfd *ioeventfd = opaque; + + vfio_pci_ioeventfd_do_write(ioeventfd, ioeventfd->test_mem); +} + long vfio_pci_ioeventfd(struct vfio_pci_device *vdev, loff_t offset, uint64_t data, int count, int fd) { @@ -457,7 +483,8 @@ long vfio_pci_ioeventfd(struct vfio_pci_device *vdev, loff_t offset, ioeventfd->test_mem = vdev->pdev->resource[bar].flags & IORESOURCE_MEM;
ret = vfio_virqfd_enable(ioeventfd, vfio_pci_ioeventfd_handler, - NULL, NULL, &ioeventfd->virqfd, fd); + vfio_pci_ioeventfd_thread, NULL, + &ioeventfd->virqfd, fd); if (ret) { kfree(ioeventfd); goto out_unlock;
From: Marc Kleine-Budde mkl@pengutronix.de
[ Upstream commit 2ddd6bfe7bdbb6c661835c3ff9cab8e0769940a6 ]
A CAN driver, using the rx-offload infrastructure, is reading CAN frames (usually in IRQ context) from the hardware and placing it into the rx-offload queue to be delivered to the networking stack via NAPI.
In case the rx-offload queue is full, trying to add more skbs results in the skbs being dropped using kfree_skb(). If done from hard-IRQ context this results in the following warning:
[ 682.552693] ------------[ cut here ]------------ [ 682.557360] WARNING: CPU: 0 PID: 3057 at net/core/skbuff.c:650 skb_release_head_state+0x74/0x84 [ 682.566075] Modules linked in: can_raw can coda_vpu flexcan dw_hdmi_ahb_audio v4l2_jpeg imx_vdoa can_dev [ 682.575597] CPU: 0 PID: 3057 Comm: cansend Tainted: G W 5.7.0+ #18 [ 682.583098] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) [ 682.589657] [<c0112628>] (unwind_backtrace) from [<c010c1c4>] (show_stack+0x10/0x14) [ 682.597423] [<c010c1c4>] (show_stack) from [<c06c481c>] (dump_stack+0xe0/0x114) [ 682.604759] [<c06c481c>] (dump_stack) from [<c0128f10>] (__warn+0xc0/0x10c) [ 682.611742] [<c0128f10>] (__warn) from [<c0129314>] (warn_slowpath_fmt+0x5c/0xc0) [ 682.619248] [<c0129314>] (warn_slowpath_fmt) from [<c0b95dec>] (skb_release_head_state+0x74/0x84) [ 682.628143] [<c0b95dec>] (skb_release_head_state) from [<c0b95e08>] (skb_release_all+0xc/0x24) [ 682.636774] [<c0b95e08>] (skb_release_all) from [<c0b95eac>] (kfree_skb+0x74/0x1c8) [ 682.644479] [<c0b95eac>] (kfree_skb) from [<bf001d1c>] (can_rx_offload_queue_sorted+0xe0/0xe8 [can_dev]) [ 682.654051] [<bf001d1c>] (can_rx_offload_queue_sorted [can_dev]) from [<bf001d6c>] (can_rx_offload_get_echo_skb+0x48/0x94 [can_dev]) [ 682.666007] [<bf001d6c>] (can_rx_offload_get_echo_skb [can_dev]) from [<bf01efe4>] (flexcan_irq+0x194/0x5dc [flexcan]) [ 682.676734] [<bf01efe4>] (flexcan_irq [flexcan]) from [<c019c1ec>] (__handle_irq_event_percpu+0x4c/0x3ec) [ 682.686322] [<c019c1ec>] (__handle_irq_event_percpu) from [<c019c5b8>] (handle_irq_event_percpu+0x2c/0x88) [ 682.695993] [<c019c5b8>] (handle_irq_event_percpu) from [<c019c64c>] (handle_irq_event+0x38/0x5c) [ 682.704887] [<c019c64c>] (handle_irq_event) from [<c01a1058>] (handle_fasteoi_irq+0xc8/0x180) [ 682.713432] [<c01a1058>] (handle_fasteoi_irq) from [<c019b2c0>] (generic_handle_irq+0x30/0x44) [ 682.722063] [<c019b2c0>] (generic_handle_irq) from [<c019b8f8>] (__handle_domain_irq+0x64/0xdc) [ 682.730783] [<c019b8f8>] (__handle_domain_irq) from [<c06df4a4>] (gic_handle_irq+0x48/0x9c) [ 682.739158] [<c06df4a4>] (gic_handle_irq) from [<c0100b30>] (__irq_svc+0x70/0x98) [ 682.746656] Exception stack(0xe80e9dd8 to 0xe80e9e20) [ 682.751725] 9dc0: 00000001 e80e8000 [ 682.759922] 9de0: e820cf80 00000000 ffffe000 00000000 eaf08fe4 00000000 600d0013 00000000 [ 682.768117] 9e00: c1732e3c c16093a8 e820d4c0 e80e9e28 c018a57c c018b870 600d0013 ffffffff [ 682.776315] [<c0100b30>] (__irq_svc) from [<c018b870>] (lock_acquire+0x108/0x4e8) [ 682.783821] [<c018b870>] (lock_acquire) from [<c0e938e4>] (down_write+0x48/0xa8) [ 682.791242] [<c0e938e4>] (down_write) from [<c02818dc>] (unlink_file_vma+0x24/0x40) [ 682.798922] [<c02818dc>] (unlink_file_vma) from [<c027a258>] (free_pgtables+0x34/0xb8) [ 682.806858] [<c027a258>] (free_pgtables) from [<c02835a4>] (exit_mmap+0xe4/0x170) [ 682.814361] [<c02835a4>] (exit_mmap) from [<c01248e0>] (mmput+0x5c/0x110) [ 682.821171] [<c01248e0>] (mmput) from [<c012e910>] (do_exit+0x374/0xbe4) [ 682.827892] [<c012e910>] (do_exit) from [<c0130888>] (do_group_exit+0x38/0xb4) [ 682.835132] [<c0130888>] (do_group_exit) from [<c0130914>] (__wake_up_parent+0x0/0x14) [ 682.843063] irq event stamp: 1936 [ 682.846399] hardirqs last enabled at (1935): [<c02938b0>] rmqueue+0xf4/0xc64 [ 682.853553] hardirqs last disabled at (1936): [<c0100b20>] __irq_svc+0x60/0x98 [ 682.860799] softirqs last enabled at (1878): [<bf04cdcc>] raw_release+0x108/0x1f0 [can_raw] [ 682.869256] softirqs last disabled at (1876): [<c0b8f478>] release_sock+0x18/0x98 [ 682.876753] ---[ end trace 7bca4751ce44c444 ]---
This patch fixes the problem by replacing the kfree_skb() by dev_kfree_skb_any(), as rx-offload might be called from threaded IRQ handlers as well.
Fixes: ca913f1ac024 ("can: rx-offload: can_rx_offload_queue_sorted(): fix error handling, avoid skb mem leak") Fixes: 6caf8a6d6586 ("can: rx-offload: can_rx_offload_queue_tail(): fix error handling, avoid skb mem leak") Link: http://lore.kernel.org/r/20201019190524.1285319-3-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/rx-offload.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/can/rx-offload.c b/drivers/net/can/rx-offload.c index e8328910a2349..0283b5cad746a 100644 --- a/drivers/net/can/rx-offload.c +++ b/drivers/net/can/rx-offload.c @@ -245,7 +245,7 @@ int can_rx_offload_queue_sorted(struct can_rx_offload *offload,
if (skb_queue_len(&offload->skb_queue) > offload->skb_queue_len_max) { - kfree_skb(skb); + dev_kfree_skb_any(skb); return -ENOBUFS; }
@@ -290,7 +290,7 @@ int can_rx_offload_queue_tail(struct can_rx_offload *offload, { if (skb_queue_len(&offload->skb_queue) > offload->skb_queue_len_max) { - kfree_skb(skb); + dev_kfree_skb_any(skb); return -ENOBUFS; }
From: Vincent Mailhol mailhol.vincent@wanadoo.fr
[ Upstream commit 2283f79b22684d2812e5c76fc2280aae00390365 ]
If a driver calls can_get_echo_skb() during a hardware IRQ (which is often, but not always, the case), the 'WARN_ON(in_irq)' in net/core/skbuff.c#skb_release_head_state() might be triggered, under network congestion circumstances, together with the potential risk of a NULL pointer dereference.
The root cause of this issue is the call to kfree_skb() instead of dev_kfree_skb_irq() in net/core/dev.c#enqueue_to_backlog().
This patch prevents the skb to be freed within the call to netif_rx() by incrementing its reference count with skb_get(). The skb is finally freed by one of the in-irq-context safe functions: dev_consume_skb_any() or dev_kfree_skb_any(). The "any" version is used because some drivers might call can_get_echo_skb() in a normal context.
The reason for this issue to occur is that initially, in the core network stack, loopback skb were not supposed to be received in hardware IRQ context. The CAN stack is an exeption.
This bug was previously reported back in 2017 in [1] but the proposed patch never got accepted.
While [1] directly modifies net/core/dev.c, we try to propose here a smoother modification local to CAN network stack (the assumption behind is that only CAN devices are affected by this issue).
[1] http://lore.kernel.org/r/57a3ffb6-3309-3ad5-5a34-e93c3fe3614d@cetitec.com
Signed-off-by: Vincent Mailhol mailhol.vincent@wanadoo.fr Link: https://lore.kernel.org/r/20201002154219.4887-2-mailhol.vincent@wanadoo.fr Fixes: 39549eef3587 ("can: CAN Network device driver and Netlink interface") Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/dev.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c index 68834a2853c9d..e291fda395a0f 100644 --- a/drivers/net/can/dev.c +++ b/drivers/net/can/dev.c @@ -512,7 +512,11 @@ unsigned int can_get_echo_skb(struct net_device *dev, unsigned int idx) if (!skb) return 0;
- netif_rx(skb); + skb_get(skb); + if (netif_rx(skb) == NET_RX_SUCCESS) + dev_consume_skb_any(skb); + else + dev_kfree_skb_any(skb);
return len; }
From: Oliver Hartkopp socketcan@hartkopp.net
[ Upstream commit ed3320cec279407a86bc4c72edc4a39eb49165ec ]
The can_get_echo_skb() function returns the number of received bytes to be used for netdev statistics. In the case of RTR frames we get a valid (potential non-zero) data length value which has to be passed for further operations. But on the wire RTR frames have no payload length. Therefore the value to be used in the statistics has to be zero for RTR frames.
Reported-by: Vincent Mailhol mailhol.vincent@wanadoo.fr Signed-off-by: Oliver Hartkopp socketcan@hartkopp.net Link: https://lore.kernel.org/r/20201020064443.80164-1-socketcan@hartkopp.net Fixes: cf5046b309b3 ("can: dev: let can_get_echo_skb() return dlc of CAN frame") Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/dev.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c index e291fda395a0f..d5e52ffc7ed25 100644 --- a/drivers/net/can/dev.c +++ b/drivers/net/can/dev.c @@ -486,9 +486,13 @@ __can_get_echo_skb(struct net_device *dev, unsigned int idx, u8 *len_ptr) */ struct sk_buff *skb = priv->echo_skb[idx]; struct canfd_frame *cf = (struct canfd_frame *)skb->data; - u8 len = cf->len;
- *len_ptr = len; + /* get the real payload length for netdev statistics */ + if (cf->can_id & CAN_RTR_FLAG) + *len_ptr = 0; + else + *len_ptr = cf->len; + priv->echo_skb[idx] = NULL;
return skb;
From: Oleksij Rempel o.rempel@pengutronix.de
[ Upstream commit 286228d382ba6320f04fa2e7c6fc8d4d92e428f4 ]
All user space generated SKBs are owned by a socket (unless injected into the key via AF_PACKET). If a socket is closed, all associated skbs will be cleaned up.
This leads to a problem when a CAN driver calls can_put_echo_skb() on a unshared SKB. If the socket is closed prior to the TX complete handler, can_get_echo_skb() and the subsequent delivering of the echo SKB to all registered callbacks, a SKB with a refcount of 0 is delivered.
To avoid the problem, in can_get_echo_skb() the original SKB is now always cloned, regardless of shared SKB or not. If the process exists it can now safely discard its SKBs, without disturbing the delivery of the echo SKB.
The problem shows up in the j1939 stack, when it clones the incoming skb, which detects the already 0 refcount.
We can easily reproduce this with following example:
testj1939 -B -r can0: & cansend can0 1823ff40#0123
WARNING: CPU: 0 PID: 293 at lib/refcount.c:25 refcount_warn_saturate+0x108/0x174 refcount_t: addition on 0; use-after-free. Modules linked in: coda_vpu imx_vdoa videobuf2_vmalloc dw_hdmi_ahb_audio vcan CPU: 0 PID: 293 Comm: cansend Not tainted 5.5.0-rc6-00376-g9e20dcb7040d #1 Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) Backtrace: [<c010f570>] (dump_backtrace) from [<c010f90c>] (show_stack+0x20/0x24) [<c010f8ec>] (show_stack) from [<c0c3e1a4>] (dump_stack+0x8c/0xa0) [<c0c3e118>] (dump_stack) from [<c0127fec>] (__warn+0xe0/0x108) [<c0127f0c>] (__warn) from [<c01283c8>] (warn_slowpath_fmt+0xa8/0xcc) [<c0128324>] (warn_slowpath_fmt) from [<c0539c0c>] (refcount_warn_saturate+0x108/0x174) [<c0539b04>] (refcount_warn_saturate) from [<c0ad2cac>] (j1939_can_recv+0x20c/0x210) [<c0ad2aa0>] (j1939_can_recv) from [<c0ac9dc8>] (can_rcv_filter+0xb4/0x268) [<c0ac9d14>] (can_rcv_filter) from [<c0aca2cc>] (can_receive+0xb0/0xe4) [<c0aca21c>] (can_receive) from [<c0aca348>] (can_rcv+0x48/0x98) [<c0aca300>] (can_rcv) from [<c09b1fdc>] (__netif_receive_skb_one_core+0x64/0x88) [<c09b1f78>] (__netif_receive_skb_one_core) from [<c09b2070>] (__netif_receive_skb+0x38/0x94) [<c09b2038>] (__netif_receive_skb) from [<c09b2130>] (netif_receive_skb_internal+0x64/0xf8) [<c09b20cc>] (netif_receive_skb_internal) from [<c09b21f8>] (netif_receive_skb+0x34/0x19c) [<c09b21c4>] (netif_receive_skb) from [<c0791278>] (can_rx_offload_napi_poll+0x58/0xb4)
Fixes: 0ae89beb283a ("can: add destructor for self generated skbs") Signed-off-by: Oleksij Rempel o.rempel@pengutronix.de Link: http://lore.kernel.org/r/20200124132656.22156-1-o.rempel@pengutronix.de Acked-by: Oliver Hartkopp socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/can/skb.h | 20 ++++++++------------ 1 file changed, 8 insertions(+), 12 deletions(-)
diff --git a/include/linux/can/skb.h b/include/linux/can/skb.h index 900b9f4e06054..fc61cf4eff1c9 100644 --- a/include/linux/can/skb.h +++ b/include/linux/can/skb.h @@ -61,21 +61,17 @@ static inline void can_skb_set_owner(struct sk_buff *skb, struct sock *sk) */ static inline struct sk_buff *can_create_echo_skb(struct sk_buff *skb) { - if (skb_shared(skb)) { - struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC); + struct sk_buff *nskb;
- if (likely(nskb)) { - can_skb_set_owner(nskb, skb->sk); - consume_skb(skb); - return nskb; - } else { - kfree_skb(skb); - return NULL; - } + nskb = skb_clone(skb, GFP_ATOMIC); + if (unlikely(!nskb)) { + kfree_skb(skb); + return NULL; }
- /* we can assume to have an unshared skb with proper owner */ - return skb; + can_skb_set_owner(nskb, skb->sk); + consume_skb(skb); + return nskb; }
#endif /* !_CAN_SKB_H */
From: Yegor Yefremov yegorslists@googlemail.com
[ Upstream commit ea780d39b1888ed5afc243c29b23d9bdb3828c7a ]
The address was wrongly assigned to the PGN field and vice versa.
Signed-off-by: Yegor Yefremov yegorslists@googlemail.com Link: https://lore.kernel.org/r/20201022083708.8755-1-yegorslists@googlemail.com Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol") Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/networking/j1939.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/Documentation/networking/j1939.rst b/Documentation/networking/j1939.rst index f5be243d250a4..4b0db514b2010 100644 --- a/Documentation/networking/j1939.rst +++ b/Documentation/networking/j1939.rst @@ -414,8 +414,8 @@ Send: .can_family = AF_CAN, .can_addr.j1939 = { .name = J1939_NO_NAME; - .pgn = 0x30, - .addr = 0x12300, + .addr = 0x30, + .pgn = 0x12300, }, };
From: Zhang Changzhong zhangchangzhong@huawei.com
[ Upstream commit 08c487d8d807535f509ed80c6a10ad90e6872139 ]
When a netdev down event occurs after a successful call to j1939_sk_bind(), j1939_netdev_notify() can handle it correctly.
But if the netdev already in down state before calling j1939_sk_bind(), j1939_sk_release() will stay in wait_event_interruptible() blocked forever. Because in this case, j1939_netdev_notify() won't be called and j1939_tp_txtimer() won't call j1939_session_cancel() or other function to clear session for ENETDOWN error, this lead to mismatch of j1939_session_get/put() and jsk->skb_pending will never decrease to zero.
To reproduce it use following commands: 1. ip link add dev vcan0 type vcan 2. j1939acd -r 100,80-120 1122334455667788 vcan0 3. presses ctrl-c and thread will be blocked forever
This patch adds check for ndev->flags in j1939_sk_bind() to avoid this kind of situation and return with -ENETDOWN.
Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol") Signed-off-by: Zhang Changzhong zhangchangzhong@huawei.com Link: https://lore.kernel.org/r/1599460308-18770-1-git-send-email-zhangchangzhong@... Acked-by: Oleksij Rempel o.rempel@pengutronix.de Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/can/j1939/socket.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/net/can/j1939/socket.c b/net/can/j1939/socket.c index 1be4c898b2fa8..f23966526a885 100644 --- a/net/can/j1939/socket.c +++ b/net/can/j1939/socket.c @@ -475,6 +475,12 @@ static int j1939_sk_bind(struct socket *sock, struct sockaddr *uaddr, int len) goto out_release_sock; }
+ if (!(ndev->flags & IFF_UP)) { + dev_put(ndev); + ret = -ENETDOWN; + goto out_release_sock; + } + priv = j1939_netdev_start(ndev); dev_put(ndev); if (IS_ERR(priv)) {
From: Zhang Changzhong zhangchangzhong@huawei.com
[ Upstream commit e002103b36a695f7cb6048b96da73e66c86ddffb ]
The driver forgets to call clk_disable_unprepare() in error path after a success calling for clk_prepare_enable().
Fix it by adding a clk_disable_unprepare() in error path.
Signed-off-by: Zhang Changzhong zhangchangzhong@huawei.com Link: https://lore.kernel.org/r/1594973079-27743-1-git-send-email-zhangchangzhong@... Fixes: befa60113ce7 ("can: ti_hecc: add missing prepare and unprepare of the clock") Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/ti_hecc.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/net/can/ti_hecc.c b/drivers/net/can/ti_hecc.c index 94b1491b569f3..228ecd45ca6c1 100644 --- a/drivers/net/can/ti_hecc.c +++ b/drivers/net/can/ti_hecc.c @@ -950,7 +950,7 @@ static int ti_hecc_probe(struct platform_device *pdev) err = clk_prepare_enable(priv->clk); if (err) { dev_err(&pdev->dev, "clk_prepare_enable() failed\n"); - goto probe_exit_clk; + goto probe_exit_release_clk; }
priv->offload.mailbox_read = ti_hecc_mailbox_read; @@ -959,7 +959,7 @@ static int ti_hecc_probe(struct platform_device *pdev) err = can_rx_offload_add_timestamp(ndev, &priv->offload); if (err) { dev_err(&pdev->dev, "can_rx_offload_add_timestamp() failed\n"); - goto probe_exit_clk; + goto probe_exit_disable_clk; }
err = register_candev(ndev); @@ -977,7 +977,9 @@ static int ti_hecc_probe(struct platform_device *pdev)
probe_exit_offload: can_rx_offload_del(&priv->offload); -probe_exit_clk: +probe_exit_disable_clk: + clk_disable_unprepare(priv->clk); +probe_exit_release_clk: clk_put(priv->clk); probe_exit_candev: free_candev(ndev);
From: Navid Emamdoost navid.emamdoost@gmail.com
[ Upstream commit 79c43333bdd5a7026a5aab606b53053b643585e7 ]
Calling pm_runtime_get_sync increments the counter even in case of failure, causing incorrect ref count. Call pm_runtime_put if pm_runtime_get_sync fails.
Signed-off-by: Navid Emamdoost navid.emamdoost@gmail.com Link: https://lore.kernel.org/r/20200605033239.60664-1-navid.emamdoost@gmail.com Fixes: 4716620d1b62 ("can: xilinx: Convert to runtime_pm") Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/xilinx_can.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c index c1dbab8c896d5..748ff70f6a7bf 100644 --- a/drivers/net/can/xilinx_can.c +++ b/drivers/net/can/xilinx_can.c @@ -1391,7 +1391,7 @@ static int xcan_open(struct net_device *ndev) if (ret < 0) { netdev_err(ndev, "%s: pm_runtime_get failed(%d)\n", __func__, ret); - return ret; + goto err; }
ret = request_irq(ndev->irq, xcan_interrupt, priv->irq_flags, @@ -1475,6 +1475,7 @@ static int xcan_get_berr_counter(const struct net_device *ndev, if (ret < 0) { netdev_err(ndev, "%s: pm_runtime_get failed(%d)\n", __func__, ret); + pm_runtime_put(priv->dev); return ret; }
@@ -1789,7 +1790,7 @@ static int xcan_probe(struct platform_device *pdev) if (ret < 0) { netdev_err(ndev, "%s: pm_runtime_get failed(%d)\n", __func__, ret); - goto err_pmdisable; + goto err_disableclks; }
if (priv->read_reg(priv, XCAN_SR_OFFSET) != XCAN_SR_CONFIG_MASK) { @@ -1824,7 +1825,6 @@ static int xcan_probe(struct platform_device *pdev)
err_disableclks: pm_runtime_put(priv->dev); -err_pmdisable: pm_runtime_disable(&pdev->dev); err_free: free_candev(ndev);
From: Dan Carpenter dan.carpenter@oracle.com
[ Upstream commit a6921dd524fe31d1f460c161d3526a407533b6db ]
These values come from skb->data so Smatch considers them untrusted. I believe Smatch is correct but I don't have a way to test this.
The usb_if->dev[] array has 2 elements but the index is in the 0-15 range without checks. The cfd->len can be up to 255 but the maximum valid size is CANFD_MAX_DLEN (64) so that could lead to memory corruption.
Fixes: 0a25e1f4f185 ("can: peak_usb: add support for PEAK new CANFD USB adapters") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Link: https://lore.kernel.org/r/20200813140604.GA456946@mwanda Acked-by: Stephane Grosjean s.grosjean@peak-system.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/usb/peak_usb/pcan_usb_fd.c | 48 +++++++++++++++++----- 1 file changed, 37 insertions(+), 11 deletions(-)
diff --git a/drivers/net/can/usb/peak_usb/pcan_usb_fd.c b/drivers/net/can/usb/peak_usb/pcan_usb_fd.c index 47cc1ff5b88e8..dee3e689b54da 100644 --- a/drivers/net/can/usb/peak_usb/pcan_usb_fd.c +++ b/drivers/net/can/usb/peak_usb/pcan_usb_fd.c @@ -468,12 +468,18 @@ static int pcan_usb_fd_decode_canmsg(struct pcan_usb_fd_if *usb_if, struct pucan_msg *rx_msg) { struct pucan_rx_msg *rm = (struct pucan_rx_msg *)rx_msg; - struct peak_usb_device *dev = usb_if->dev[pucan_msg_get_channel(rm)]; - struct net_device *netdev = dev->netdev; + struct peak_usb_device *dev; + struct net_device *netdev; struct canfd_frame *cfd; struct sk_buff *skb; const u16 rx_msg_flags = le16_to_cpu(rm->flags);
+ if (pucan_msg_get_channel(rm) >= ARRAY_SIZE(usb_if->dev)) + return -ENOMEM; + + dev = usb_if->dev[pucan_msg_get_channel(rm)]; + netdev = dev->netdev; + if (rx_msg_flags & PUCAN_MSG_EXT_DATA_LEN) { /* CANFD frame case */ skb = alloc_canfd_skb(netdev, &cfd); @@ -519,15 +525,21 @@ static int pcan_usb_fd_decode_status(struct pcan_usb_fd_if *usb_if, struct pucan_msg *rx_msg) { struct pucan_status_msg *sm = (struct pucan_status_msg *)rx_msg; - struct peak_usb_device *dev = usb_if->dev[pucan_stmsg_get_channel(sm)]; - struct pcan_usb_fd_device *pdev = - container_of(dev, struct pcan_usb_fd_device, dev); + struct pcan_usb_fd_device *pdev; enum can_state new_state = CAN_STATE_ERROR_ACTIVE; enum can_state rx_state, tx_state; - struct net_device *netdev = dev->netdev; + struct peak_usb_device *dev; + struct net_device *netdev; struct can_frame *cf; struct sk_buff *skb;
+ if (pucan_stmsg_get_channel(sm) >= ARRAY_SIZE(usb_if->dev)) + return -ENOMEM; + + dev = usb_if->dev[pucan_stmsg_get_channel(sm)]; + pdev = container_of(dev, struct pcan_usb_fd_device, dev); + netdev = dev->netdev; + /* nothing should be sent while in BUS_OFF state */ if (dev->can.state == CAN_STATE_BUS_OFF) return 0; @@ -579,9 +591,14 @@ static int pcan_usb_fd_decode_error(struct pcan_usb_fd_if *usb_if, struct pucan_msg *rx_msg) { struct pucan_error_msg *er = (struct pucan_error_msg *)rx_msg; - struct peak_usb_device *dev = usb_if->dev[pucan_ermsg_get_channel(er)]; - struct pcan_usb_fd_device *pdev = - container_of(dev, struct pcan_usb_fd_device, dev); + struct pcan_usb_fd_device *pdev; + struct peak_usb_device *dev; + + if (pucan_ermsg_get_channel(er) >= ARRAY_SIZE(usb_if->dev)) + return -EINVAL; + + dev = usb_if->dev[pucan_ermsg_get_channel(er)]; + pdev = container_of(dev, struct pcan_usb_fd_device, dev);
/* keep a trace of tx and rx error counters for later use */ pdev->bec.txerr = er->tx_err_cnt; @@ -595,11 +612,17 @@ static int pcan_usb_fd_decode_overrun(struct pcan_usb_fd_if *usb_if, struct pucan_msg *rx_msg) { struct pcan_ufd_ovr_msg *ov = (struct pcan_ufd_ovr_msg *)rx_msg; - struct peak_usb_device *dev = usb_if->dev[pufd_omsg_get_channel(ov)]; - struct net_device *netdev = dev->netdev; + struct peak_usb_device *dev; + struct net_device *netdev; struct can_frame *cf; struct sk_buff *skb;
+ if (pufd_omsg_get_channel(ov) >= ARRAY_SIZE(usb_if->dev)) + return -EINVAL; + + dev = usb_if->dev[pufd_omsg_get_channel(ov)]; + netdev = dev->netdev; + /* allocate an skb to store the error frame */ skb = alloc_can_err_skb(netdev, &cf); if (!skb) @@ -716,6 +739,9 @@ static int pcan_usb_fd_encode_msg(struct peak_usb_device *dev, u16 tx_msg_size, tx_msg_flags; u8 can_dlc;
+ if (cfd->len > CANFD_MAX_DLEN) + return -EINVAL; + tx_msg_size = ALIGN(sizeof(struct pucan_tx_msg) + cfd->len, 4); tx_msg->size = cpu_to_le16(tx_msg_size); tx_msg->type = cpu_to_le16(PUCAN_MSG_CAN_TX);
From: Stephane Grosjean s.grosjean@peak-system.com
[ Upstream commit ecc7b4187dd388549544195fb13a11b4ea8e6a84 ]
Fabian Inostroza fabianinostrozap@gmail.com has discovered a potential problem in the hardware timestamp reporting from the PCAN-USB USB CAN interface (only), related to the fact that a timestamp of an event may precede the timestamp used for synchronization when both records are part of the same USB packet. However, this case was used to detect the wrapping of the time counter.
This patch details and fixes the two identified cases where this problem can occur.
Reported-by: Fabian Inostroza fabianinostrozap@gmail.com Signed-off-by: Stephane Grosjean s.grosjean@peak-system.com Link: https://lore.kernel.org/r/20201014085631.15128-1-s.grosjean@peak-system.com Fixes: bb4785551f64 ("can: usb: PEAK-System Technik USB adapters driver core") Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/usb/peak_usb/pcan_usb_core.c | 51 ++++++++++++++++++-- 1 file changed, 46 insertions(+), 5 deletions(-)
diff --git a/drivers/net/can/usb/peak_usb/pcan_usb_core.c b/drivers/net/can/usb/peak_usb/pcan_usb_core.c index d91df34e7fa88..c2764799f9efb 100644 --- a/drivers/net/can/usb/peak_usb/pcan_usb_core.c +++ b/drivers/net/can/usb/peak_usb/pcan_usb_core.c @@ -130,14 +130,55 @@ void peak_usb_get_ts_time(struct peak_time_ref *time_ref, u32 ts, ktime_t *time) /* protect from getting time before setting now */ if (ktime_to_ns(time_ref->tv_host)) { u64 delta_us; + s64 delta_ts = 0; + + /* General case: dev_ts_1 < dev_ts_2 < ts, with: + * + * - dev_ts_1 = previous sync timestamp + * - dev_ts_2 = last sync timestamp + * - ts = event timestamp + * - ts_period = known sync period (theoretical) + * ~ dev_ts2 - dev_ts1 + * *but*: + * + * - time counters wrap (see adapter->ts_used_bits) + * - sometimes, dev_ts_1 < ts < dev_ts2 + * + * "normal" case (sync time counters increase): + * must take into account case when ts wraps (tsw) + * + * < ts_period > < > + * | | | + * ---+--------+----+-------0-+--+--> + * ts_dev_1 | ts_dev_2 | + * ts tsw + */ + if (time_ref->ts_dev_1 < time_ref->ts_dev_2) { + /* case when event time (tsw) wraps */ + if (ts < time_ref->ts_dev_1) + delta_ts = 1 << time_ref->adapter->ts_used_bits; + + /* Otherwise, sync time counter (ts_dev_2) has wrapped: + * handle case when event time (tsn) hasn't. + * + * < ts_period > < > + * | | | + * ---+--------+--0-+---------+--+--> + * ts_dev_1 | ts_dev_2 | + * tsn ts + */ + } else if (time_ref->ts_dev_1 < ts) { + delta_ts = -(1 << time_ref->adapter->ts_used_bits); + }
- delta_us = ts - time_ref->ts_dev_2; - if (ts < time_ref->ts_dev_2) - delta_us &= (1 << time_ref->adapter->ts_used_bits) - 1; + /* add delay between last sync and event timestamps */ + delta_ts += (signed int)(ts - time_ref->ts_dev_2);
- delta_us += time_ref->ts_total; + /* add time from beginning to last sync */ + delta_ts += time_ref->ts_total;
- delta_us *= time_ref->adapter->us_per_ts_scale; + /* convert ticks number into microseconds */ + delta_us = delta_ts * time_ref->adapter->us_per_ts_scale; delta_us >>= time_ref->adapter->us_per_ts_shift;
*time = ktime_add_us(time_ref->tv_host_0, delta_us);
From: Stephane Grosjean s.grosjean@peak-system.com
[ Upstream commit 93ef65e5a6357cc7381f85fcec9283fe29970045 ]
Echo management is driven by PUCAN_MSG_LOOPED_BACK bit, while loopback frames are identified with PUCAN_MSG_SELF_RECEIVE bit. Those bits are set for each outgoing frame written to the IP core so that a copy of each one will be placed into the rx path. Thus,
- when PUCAN_MSG_LOOPED_BACK is set then the rx frame is an echo of a previously sent frame, - when PUCAN_MSG_LOOPED_BACK+PUCAN_MSG_SELF_RECEIVE are set, then the rx frame is an echo AND a loopback frame. Therefore, this frame must be put into the socket rx path too.
This patch fixes how CAN frames are handled when these are sent while the can interface is configured in "loopback on" mode.
Signed-off-by: Stephane Grosjean s.grosjean@peak-system.com Link: https://lore.kernel.org/r/20201013153947.28012-1-s.grosjean@peak-system.com Fixes: 8ac8321e4a79 ("can: peak: add support for PEAK PCAN-PCIe FD CAN-FD boards") Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/peak_canfd/peak_canfd.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/net/can/peak_canfd/peak_canfd.c b/drivers/net/can/peak_canfd/peak_canfd.c index 10aa3e457c33d..40c33b8a5fda3 100644 --- a/drivers/net/can/peak_canfd/peak_canfd.c +++ b/drivers/net/can/peak_canfd/peak_canfd.c @@ -262,8 +262,7 @@ static int pucan_handle_can_rx(struct peak_canfd_priv *priv, cf_len = get_can_dlc(pucan_msg_get_dlc(msg));
/* if this frame is an echo, */ - if ((rx_msg_flags & PUCAN_MSG_LOOPED_BACK) && - !(rx_msg_flags & PUCAN_MSG_SELF_RECEIVE)) { + if (rx_msg_flags & PUCAN_MSG_LOOPED_BACK) { unsigned long flags;
spin_lock_irqsave(&priv->echo_lock, flags); @@ -277,7 +276,13 @@ static int pucan_handle_can_rx(struct peak_canfd_priv *priv, netif_wake_queue(priv->ndev);
spin_unlock_irqrestore(&priv->echo_lock, flags); - return 0; + + /* if this frame is only an echo, stop here. Otherwise, + * continue to push this application self-received frame into + * its own rx queue. + */ + if (!(rx_msg_flags & PUCAN_MSG_SELF_RECEIVE)) + return 0; }
/* otherwise, it should be pushed into rx fifo */
From: Joakim Zhang qiangqing.zhang@nxp.com
[ Upstream commit 018799649071a1638c0c130526af36747df4355a ]
After double check with Layerscape CAN owner (Pankaj Bansal), confirm that LS1021A doesn't support ECC feature, so remove FLEXCAN_QUIRK_DISABLE_MECR quirk.
Fixes: 99b7668c04b27 ("can: flexcan: adding platform specific details for LS1021A") Cc: Pankaj Bansal pankaj.bansal@nxp.com Signed-off-by: Joakim Zhang qiangqing.zhang@nxp.com Link: https://lore.kernel.org/r/20201020155402.30318-4-qiangqing.zhang@nxp.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/flexcan.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/can/flexcan.c b/drivers/net/can/flexcan.c index bc21a82cf3a76..bc504e09f2259 100644 --- a/drivers/net/can/flexcan.c +++ b/drivers/net/can/flexcan.c @@ -321,8 +321,7 @@ static const struct flexcan_devtype_data fsl_vf610_devtype_data = {
static const struct flexcan_devtype_data fsl_ls1021a_r2_devtype_data = { .quirks = FLEXCAN_QUIRK_DISABLE_RXFG | FLEXCAN_QUIRK_ENABLE_EACEN_RRS | - FLEXCAN_QUIRK_DISABLE_MECR | FLEXCAN_QUIRK_BROKEN_PERR_STATE | - FLEXCAN_QUIRK_USE_OFF_TIMESTAMP, + FLEXCAN_QUIRK_BROKEN_PERR_STATE | FLEXCAN_QUIRK_USE_OFF_TIMESTAMP, };
static const struct can_bittiming_const flexcan_bittiming_const = {
From: Joakim Zhang qiangqing.zhang@nxp.com
[ Upstream commit ab07ff1c92fa60f29438e655a1b4abab860ed0b6 ]
With below sequence, we can see wakeup default is enabled after re-load module, if it was enabled before, so we need disable wakeup in flexcan_remove().
| # cat /sys/bus/platform/drivers/flexcan/5a8e0000.can/power/wakeup | disabled | # echo enabled > /sys/bus/platform/drivers/flexcan/5a8e0000.can/power/wakeup | # cat /sys/bus/platform/drivers/flexcan/5a8e0000.can/power/wakeup | enabled | # rmmod flexcan | # modprobe flexcan | # cat /sys/bus/platform/drivers/flexcan/5a8e0000.can/power/wakeup | enabled
Fixes: de3578c198c6 ("can: flexcan: add self wakeup support") Fixes: 915f9666421c ("can: flexcan: add support for DT property 'wakeup-source'") Signed-off-by: Joakim Zhang qiangqing.zhang@nxp.com Link: https://lore.kernel.org/r/20201020184527.8190-1-qiangqing.zhang@nxp.com [mkl: streamlined commit message] Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/flexcan.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/can/flexcan.c b/drivers/net/can/flexcan.c index bc504e09f2259..a330d6c56242e 100644 --- a/drivers/net/can/flexcan.c +++ b/drivers/net/can/flexcan.c @@ -1695,6 +1695,8 @@ static int flexcan_remove(struct platform_device *pdev) { struct net_device *dev = platform_get_drvdata(pdev);
+ device_set_wakeup_enable(&pdev->dev, false); + device_set_wakeup_capable(&pdev->dev, false); unregister_flexcandev(dev); pm_runtime_disable(&pdev->dev); free_candev(dev);
From: Brian Foster bfoster@redhat.com
[ Upstream commit 869ae85dae64b5540e4362d7fe4cd520e10ec05c ]
It is possible to expose non-zeroed post-EOF data in XFS if the new EOF page is dirty, backed by an unwritten block and the truncate happens to race with writeback. iomap_truncate_page() will not zero the post-EOF portion of the page if the underlying block is unwritten. The subsequent call to truncate_setsize() will, but doesn't dirty the page. Therefore, if writeback happens to complete after iomap_truncate_page() (so it still sees the unwritten block) but before truncate_setsize(), the cached page becomes inconsistent with the on-disk block. A mapped read after the associated page is reclaimed or invalidated exposes non-zero post-EOF data.
For example, consider the following sequence when run on a kernel modified to explicitly flush the new EOF page within the race window:
$ xfs_io -fc "falloc 0 4k" -c fsync /mnt/file $ xfs_io -c "pwrite 0 4k" -c "truncate 1k" /mnt/file ... $ xfs_io -c "mmap 0 4k" -c "mread -v 1k 8" /mnt/file 00000400: 00 00 00 00 00 00 00 00 ........ $ umount /mnt/; mount <dev> /mnt/ $ xfs_io -c "mmap 0 4k" -c "mread -v 1k 8" /mnt/file 00000400: cd cd cd cd cd cd cd cd ........
Update xfs_setattr_size() to explicitly flush the new EOF page prior to the page truncate to ensure iomap has the latest state of the underlying block.
Fixes: 68a9f5e7007c ("xfs: implement iomap based buffered write path") Signed-off-by: Brian Foster bfoster@redhat.com Reviewed-by: Darrick J. Wong darrick.wong@oracle.com Signed-off-by: Darrick J. Wong darrick.wong@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_iops.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c index 80a13c8561d85..bf93a7152181c 100644 --- a/fs/xfs/xfs_iops.c +++ b/fs/xfs/xfs_iops.c @@ -911,6 +911,16 @@ xfs_setattr_size( error = iomap_zero_range(inode, oldsize, newsize - oldsize, &did_zeroing, &xfs_buffered_write_iomap_ops); } else { + /* + * iomap won't detect a dirty page over an unwritten block (or a + * cow block over a hole) and subsequently skips zeroing the + * newly post-EOF portion of the page. Flush the new EOF to + * convert the block before the pagecache truncate. + */ + error = filemap_write_and_wait_range(inode->i_mapping, newsize, + newsize); + if (error) + return error; error = iomap_truncate_page(inode, newsize, &did_zeroing, &xfs_buffered_write_iomap_ops); }
From: Darrick J. Wong darrick.wong@oracle.com
[ Upstream commit c2f09217a4305478c55adc9a98692488dd19cd32 ]
In commit 7588cbeec6df, we tried to fix a race stemming from the lack of coordination between higher level code that wants to allocate and remap CoW fork extents into the data fork. Christoph cites as examples the always_cow mode, and a directio write completion racing with writeback.
According to the comments before the goto retry, we want to restart the lookup to catch the extent in the data fork, but we don't actually reset whichfork or cow_fsb, which means the second try executes using stale information. Up until now I think we've gotten lucky that either there's something left in the CoW fork to cause cow_fsb to be reset, or either data/cow fork sequence numbers have advanced enough to force a fresh lookup from the data fork. However, if we reach the retry with an empty stable CoW fork and a stable data fork, neither of those things happens. The retry foolishly re-calls xfs_convert_blocks on the CoW fork which fails again. This time, we toss the write.
I've recently been working on extending reflink to the realtime device. When the realtime extent size is larger than a single block, we have to force the page cache to CoW the entire rt extent if a write (or fallocate) are not aligned with the rt extent size. The strategy I've chosen to deal with this is derived from Dave's blocksize > pagesize series: dirtying around the write range, and ensuring that writeback always starts mapping on an rt extent boundary. This has brought this race front and center, since generic/522 blows up immediately.
However, I'm pretty sure this is a bug outright, independent of that.
Fixes: 7588cbeec6df ("xfs: retry COW fork delalloc conversion when no extent was found") Signed-off-by: Darrick J. Wong darrick.wong@oracle.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_aops.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index b35611882ff9c..e4210779cd79e 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -346,8 +346,8 @@ xfs_map_blocks( ssize_t count = i_blocksize(inode); xfs_fileoff_t offset_fsb = XFS_B_TO_FSBT(mp, offset); xfs_fileoff_t end_fsb = XFS_B_TO_FSB(mp, offset + count); - xfs_fileoff_t cow_fsb = NULLFILEOFF; - int whichfork = XFS_DATA_FORK; + xfs_fileoff_t cow_fsb; + int whichfork; struct xfs_bmbt_irec imap; struct xfs_iext_cursor icur; int retries = 0; @@ -381,6 +381,8 @@ xfs_map_blocks( * landed in a hole and we skip the block. */ retry: + cow_fsb = NULLFILEOFF; + whichfork = XFS_DATA_FORK; xfs_ilock(ip, XFS_ILOCK_SHARED); ASSERT(ip->i_df.if_format != XFS_DINODE_FMT_BTREE || (ip->i_df.if_flags & XFS_IFEXTENTS));
From: Darrick J. Wong darrick.wong@oracle.com
[ Upstream commit c1f6b1ac00756a7108e5fcb849a2f8230c0b62a5 ]
The kernel has always allowed directories to have the rtinherit flag set, even if there is no rt device, so this check is wrong.
Fixes: 80e4e1268802 ("xfs: scrub inodes") Signed-off-by: Darrick J. Wong darrick.wong@oracle.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/scrub/inode.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/xfs/scrub/inode.c b/fs/xfs/scrub/inode.c index 6d483ab29e639..1bea029b634a6 100644 --- a/fs/xfs/scrub/inode.c +++ b/fs/xfs/scrub/inode.c @@ -121,8 +121,7 @@ xchk_inode_flags( goto bad;
/* rt flags require rt device */ - if ((flags & (XFS_DIFLAG_REALTIME | XFS_DIFLAG_RTINHERIT)) && - !mp->m_rtdev_targp) + if ((flags & XFS_DIFLAG_REALTIME) && !mp->m_rtdev_targp) goto bad;
/* new rt bitmap flag only valid for rbmino */
From: Jens Axboe axboe@kernel.dk
[ Upstream commit 4b70cf9dea4cd239b425f3282fa56ce19e234c8a ]
Ensure we get a valid view of the task mm, by using task_lock() when attempting to grab the original task mm.
Reported-by: syzbot+b57abf7ee60829090495@syzkaller.appspotmail.com Fixes: 2aede0e417db ("io_uring: stash ctx task reference for SQPOLL") Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/io_uring.c | 27 ++++++++++++++++++++------- 1 file changed, 20 insertions(+), 7 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c index 1033e0e18f24f..2d5ca9476814d 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -952,20 +952,33 @@ static void io_sq_thread_drop_mm(void) if (mm) { kthread_unuse_mm(mm); mmput(mm); + current->mm = NULL; } }
static int __io_sq_thread_acquire_mm(struct io_ring_ctx *ctx) { - if (!current->mm) { - if (unlikely(!(ctx->flags & IORING_SETUP_SQPOLL) || - !ctx->sqo_task->mm || - !mmget_not_zero(ctx->sqo_task->mm))) - return -EFAULT; - kthread_use_mm(ctx->sqo_task->mm); + struct mm_struct *mm; + + if (current->mm) + return 0; + + /* Should never happen */ + if (unlikely(!(ctx->flags & IORING_SETUP_SQPOLL))) + return -EFAULT; + + task_lock(ctx->sqo_task); + mm = ctx->sqo_task->mm; + if (unlikely(!mm || !mmget_not_zero(mm))) + mm = NULL; + task_unlock(ctx->sqo_task); + + if (mm) { + kthread_use_mm(mm); + return 0; }
- return 0; + return -EFAULT; }
static int io_sq_thread_acquire_mm(struct io_ring_ctx *ctx,
From: Zhao Qiang qiang.zhao@nxp.com
[ Upstream commit 9bd77a9ce31dd242fece27219d14fbee5068dd85 ]
Since commit 530b5affc675 ("spi: fsl-dspi: fix use-after-free in remove path"), this driver causes a "NULL pointer dereference" in dspi_suspend/resume. This is because since this commit, the drivers private data point to "dspi" instead of "ctlr", the codes in suspend and resume func were not modified correspondly.
Fixes: 530b5affc675 ("spi: fsl-dspi: fix use-after-free in remove path") Signed-off-by: Zhao Qiang qiang.zhao@nxp.com Reviewed-by: Vladimir Oltean olteanv@gmail.com Link: https://lore.kernel.org/r/20201103020546.1822-1-qiang.zhao@nxp.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/spi-fsl-dspi.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c index 108a7d50d2c37..a96762ffb70b6 100644 --- a/drivers/spi/spi-fsl-dspi.c +++ b/drivers/spi/spi-fsl-dspi.c @@ -1106,12 +1106,11 @@ MODULE_DEVICE_TABLE(of, fsl_dspi_dt_ids); #ifdef CONFIG_PM_SLEEP static int dspi_suspend(struct device *dev) { - struct spi_controller *ctlr = dev_get_drvdata(dev); - struct fsl_dspi *dspi = spi_controller_get_devdata(ctlr); + struct fsl_dspi *dspi = dev_get_drvdata(dev);
if (dspi->irq) disable_irq(dspi->irq); - spi_controller_suspend(ctlr); + spi_controller_suspend(dspi->ctlr); clk_disable_unprepare(dspi->clk);
pinctrl_pm_select_sleep_state(dev); @@ -1121,8 +1120,7 @@ static int dspi_suspend(struct device *dev)
static int dspi_resume(struct device *dev) { - struct spi_controller *ctlr = dev_get_drvdata(dev); - struct fsl_dspi *dspi = spi_controller_get_devdata(ctlr); + struct fsl_dspi *dspi = dev_get_drvdata(dev); int ret;
pinctrl_pm_select_default_state(dev); @@ -1130,7 +1128,7 @@ static int dspi_resume(struct device *dev) ret = clk_prepare_enable(dspi->clk); if (ret) return ret; - spi_controller_resume(ctlr); + spi_controller_resume(dspi->ctlr); if (dspi->irq) enable_irq(dspi->irq);
From: Rob Herring robh@kernel.org
[ Upstream commit 832ea234277a2465ec6602fa6a4db5cd9ee87ae3 ]
With commit 669cbc708122 ("PCI: Move DT resource setup into devm_pci_alloc_host_bridge()"), the DT 'ranges' is parsed and populated into resources when the host bridge is allocated. The resources are requested as well, but that happens a second time for the mvebu driver in mvebu_pcie_parse_request_resources(). We should only be requesting the additional resources added in mvebu_pcie_parse_request_resources(). These are not added by default because they use custom properties rather than standard DT address translation.
Also, the bus ranges was also populated by default, so we can remove it from mvebu_pci_host_probe().
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=209729 Fixes: 669cbc708122 ("PCI: Move DT resource setup into devm_pci_alloc_host_bridge()") Link: https://lore.kernel.org/r/20201023145252.2691779-1-robh@kernel.org Reported-by: vtolkm@googlemail.com Tested-by: Jan Kundrát jan.kundrat@cesnet.cz Signed-off-by: Rob Herring robh@kernel.org Signed-off-by: Bjorn Helgaas bhelgaas@google.com Acked-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Cc: Thomas Petazzoni thomas.petazzoni@bootlin.com Cc: Jason Cooper jason@lakedaemon.net Cc: Russell King linux@armlinux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/pci-mvebu.c | 23 ++++++++++------------- 1 file changed, 10 insertions(+), 13 deletions(-)
diff --git a/drivers/pci/controller/pci-mvebu.c b/drivers/pci/controller/pci-mvebu.c index c39978b750ec6..653c0b3d29125 100644 --- a/drivers/pci/controller/pci-mvebu.c +++ b/drivers/pci/controller/pci-mvebu.c @@ -960,25 +960,16 @@ static void mvebu_pcie_powerdown(struct mvebu_pcie_port *port) }
/* - * We can't use devm_of_pci_get_host_bridge_resources() because we - * need to parse our special DT properties encoding the MEM and IO - * apertures. + * devm_of_pci_get_host_bridge_resources() only sets up translateable resources, + * so we need extra resource setup parsing our special DT properties encoding + * the MEM and IO apertures. */ static int mvebu_pcie_parse_request_resources(struct mvebu_pcie *pcie) { struct device *dev = &pcie->pdev->dev; - struct device_node *np = dev->of_node; struct pci_host_bridge *bridge = pci_host_bridge_from_priv(pcie); int ret;
- /* Get the bus range */ - ret = of_pci_parse_bus_range(np, &pcie->busn); - if (ret) { - dev_err(dev, "failed to parse bus-range property: %d\n", ret); - return ret; - } - pci_add_resource(&bridge->windows, &pcie->busn); - /* Get the PCIe memory aperture */ mvebu_mbus_get_pcie_mem_aperture(&pcie->mem); if (resource_size(&pcie->mem) == 0) { @@ -988,6 +979,9 @@ static int mvebu_pcie_parse_request_resources(struct mvebu_pcie *pcie)
pcie->mem.name = "PCI MEM"; pci_add_resource(&bridge->windows, &pcie->mem); + ret = devm_request_resource(dev, &iomem_resource, &pcie->mem); + if (ret) + return ret;
/* Get the PCIe IO aperture */ mvebu_mbus_get_pcie_io_aperture(&pcie->io); @@ -1001,9 +995,12 @@ static int mvebu_pcie_parse_request_resources(struct mvebu_pcie *pcie) pcie->realio.name = "PCI I/O";
pci_add_resource(&bridge->windows, &pcie->realio); + ret = devm_request_resource(dev, &ioport_resource, &pcie->realio); + if (ret) + return ret; }
- return devm_request_pci_bus_resources(dev, &bridge->windows); + return 0; }
/*
From: Jeff Layton jlayton@kernel.org
[ Upstream commit 62575e270f661aba64778cbc5f354511cf9abb21 ]
Some messages sent by the MDS entail a session sequence number increment, and the MDS will drop certain types of requests on the floor when the sequence numbers don't match.
In particular, a REQUEST_CLOSE message can cross with one of the sequence morphing messages from the MDS which can cause the client to stall, waiting for a response that will never come.
Originally, this meant an up to 5s delay before the recurring workqueue job kicked in and resent the request, but a recent change made it so that the client would never resend, causing a 60s stall unmounting and sometimes a blockisting event.
Add a new helper for incrementing the session sequence and then testing to see whether a REQUEST_CLOSE needs to be resent, and move the handling of CEPH_MDS_SESSION_CLOSING into that function. Change all of the bare sequence counter increments to use the new helper.
Reorganize check_session_state with a switch statement. It should no longer be called when the session is CLOSING, so throw a warning if it ever is (but still handle that case sanely).
[ idryomov: whitespace, pr_err() call fixup ]
URL: https://tracker.ceph.com/issues/47563 Fixes: fa9967734227 ("ceph: fix potential mdsc use-after-free crash") Reported-by: Patrick Donnelly pdonnell@redhat.com Signed-off-by: Jeff Layton jlayton@kernel.org Reviewed-by: Ilya Dryomov idryomov@gmail.com Reviewed-by: Xiubo Li xiubli@redhat.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ceph/caps.c | 2 +- fs/ceph/mds_client.c | 50 +++++++++++++++++++++++++++++++------------- fs/ceph/mds_client.h | 1 + fs/ceph/quota.c | 2 +- fs/ceph/snap.c | 2 +- 5 files changed, 39 insertions(+), 18 deletions(-)
diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index 034b3f4fdd3a7..64a64a29f5c79 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -4064,7 +4064,7 @@ void ceph_handle_caps(struct ceph_mds_session *session, vino.snap, inode);
mutex_lock(&session->s_mutex); - session->s_seq++; + inc_session_sequence(session); dout(" mds%d seq %lld cap seq %u\n", session->s_mds, session->s_seq, (unsigned)seq);
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 76d8d9495d1d4..b2214679baf4e 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -4227,7 +4227,7 @@ static void handle_lease(struct ceph_mds_client *mdsc, dname.len, dname.name);
mutex_lock(&session->s_mutex); - session->s_seq++; + inc_session_sequence(session);
if (!inode) { dout("handle_lease no inode %llx\n", vino.ino); @@ -4381,28 +4381,48 @@ static void maybe_recover_session(struct ceph_mds_client *mdsc)
bool check_session_state(struct ceph_mds_session *s) { - if (s->s_state == CEPH_MDS_SESSION_CLOSING) { - dout("resending session close request for mds%d\n", - s->s_mds); - request_close_session(s); - return false; - } - if (s->s_ttl && time_after(jiffies, s->s_ttl)) { - if (s->s_state == CEPH_MDS_SESSION_OPEN) { + switch (s->s_state) { + case CEPH_MDS_SESSION_OPEN: + if (s->s_ttl && time_after(jiffies, s->s_ttl)) { s->s_state = CEPH_MDS_SESSION_HUNG; pr_info("mds%d hung\n", s->s_mds); } - } - if (s->s_state == CEPH_MDS_SESSION_NEW || - s->s_state == CEPH_MDS_SESSION_RESTARTING || - s->s_state == CEPH_MDS_SESSION_CLOSED || - s->s_state == CEPH_MDS_SESSION_REJECTED) - /* this mds is failed or recovering, just wait */ + break; + case CEPH_MDS_SESSION_CLOSING: + /* Should never reach this when we're unmounting */ + WARN_ON_ONCE(true); + fallthrough; + case CEPH_MDS_SESSION_NEW: + case CEPH_MDS_SESSION_RESTARTING: + case CEPH_MDS_SESSION_CLOSED: + case CEPH_MDS_SESSION_REJECTED: return false; + }
return true; }
+/* + * If the sequence is incremented while we're waiting on a REQUEST_CLOSE reply, + * then we need to retransmit that request. + */ +void inc_session_sequence(struct ceph_mds_session *s) +{ + lockdep_assert_held(&s->s_mutex); + + s->s_seq++; + + if (s->s_state == CEPH_MDS_SESSION_CLOSING) { + int ret; + + dout("resending session close request for mds%d\n", s->s_mds); + ret = request_close_session(s); + if (ret < 0) + pr_err("unable to close session to mds%d: %d\n", + s->s_mds, ret); + } +} + /* * delayed work -- periodically trim expired leases, renew caps with mds */ diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h index 658800605bfb4..11f20a4d36bc5 100644 --- a/fs/ceph/mds_client.h +++ b/fs/ceph/mds_client.h @@ -480,6 +480,7 @@ struct ceph_mds_client { extern const char *ceph_mds_op_name(int op);
extern bool check_session_state(struct ceph_mds_session *s); +void inc_session_sequence(struct ceph_mds_session *s);
extern struct ceph_mds_session * __ceph_lookup_mds_session(struct ceph_mds_client *, int mds); diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c index cc2c4d40b0222..2b213f864c564 100644 --- a/fs/ceph/quota.c +++ b/fs/ceph/quota.c @@ -53,7 +53,7 @@ void ceph_handle_quota(struct ceph_mds_client *mdsc,
/* increment msg sequence number */ mutex_lock(&session->s_mutex); - session->s_seq++; + inc_session_sequence(session); mutex_unlock(&session->s_mutex);
/* lookup inode */ diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c index 923be9399b21c..cc9a9bfc790a3 100644 --- a/fs/ceph/snap.c +++ b/fs/ceph/snap.c @@ -873,7 +873,7 @@ void ceph_handle_snap(struct ceph_mds_client *mdsc, ceph_snap_op_name(op), split, trace_len);
mutex_lock(&session->s_mutex); - session->s_seq++; + inc_session_sequence(session); mutex_unlock(&session->s_mutex);
down_write(&mdsc->snap_rwsem);
From: Tommi Rantala tommi.t.rantala@nokia.com
[ Upstream commit 1d44d0dd61b6121b49f25b731f2f7f605cb3c896 ]
XFAIL is gone since commit 9847d24af95c ("selftests/harness: Refactor XFAIL into SKIP"), use SKIP instead.
Fixes: 9847d24af95c ("selftests/harness: Refactor XFAIL into SKIP") Signed-off-by: Tommi Rantala tommi.t.rantala@nokia.com Reviewed-by: Kees Cook keescook@chromium.org Acked-by: Christian Brauner christian.brauner@ubuntu.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/core/close_range_test.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/core/close_range_test.c b/tools/testing/selftests/core/close_range_test.c index c99b98b0d461f..575b391ddc78d 100644 --- a/tools/testing/selftests/core/close_range_test.c +++ b/tools/testing/selftests/core/close_range_test.c @@ -44,7 +44,7 @@ TEST(close_range) fd = open("/dev/null", O_RDONLY | O_CLOEXEC); ASSERT_GE(fd, 0) { if (errno == ENOENT) - XFAIL(return, "Skipping test since /dev/null does not exist"); + SKIP(return, "Skipping test since /dev/null does not exist"); }
open_fds[i] = fd; @@ -52,7 +52,7 @@ TEST(close_range)
EXPECT_EQ(-1, sys_close_range(open_fds[0], open_fds[100], -1)) { if (errno == ENOSYS) - XFAIL(return, "close_range() syscall not supported"); + SKIP(return, "close_range() syscall not supported"); }
EXPECT_EQ(0, sys_close_range(open_fds[0], open_fds[50], 0)); @@ -108,7 +108,7 @@ TEST(close_range_unshare) fd = open("/dev/null", O_RDONLY | O_CLOEXEC); ASSERT_GE(fd, 0) { if (errno == ENOENT) - XFAIL(return, "Skipping test since /dev/null does not exist"); + SKIP(return, "Skipping test since /dev/null does not exist"); }
open_fds[i] = fd; @@ -197,7 +197,7 @@ TEST(close_range_unshare_capped) fd = open("/dev/null", O_RDONLY | O_CLOEXEC); ASSERT_GE(fd, 0) { if (errno == ENOENT) - XFAIL(return, "Skipping test since /dev/null does not exist"); + SKIP(return, "Skipping test since /dev/null does not exist"); }
open_fds[i] = fd;
From: Tommi Rantala tommi.t.rantala@nokia.com
[ Upstream commit afba8b0a2cc532b54eaf4254092f57bba5d7eb65 ]
XFAIL is gone since commit 9847d24af95c ("selftests/harness: Refactor XFAIL into SKIP"), use SKIP instead.
Fixes: 9847d24af95c ("selftests/harness: Refactor XFAIL into SKIP") Signed-off-by: Tommi Rantala tommi.t.rantala@nokia.com Reviewed-by: Kees Cook keescook@chromium.org Acked-by: Christian Brauner christian.brauner@ubuntu.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/clone3/clone3_cap_checkpoint_restore.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/clone3/clone3_cap_checkpoint_restore.c b/tools/testing/selftests/clone3/clone3_cap_checkpoint_restore.c index 9562425aa0a90..614091de4c545 100644 --- a/tools/testing/selftests/clone3/clone3_cap_checkpoint_restore.c +++ b/tools/testing/selftests/clone3/clone3_cap_checkpoint_restore.c @@ -145,7 +145,7 @@ TEST(clone3_cap_checkpoint_restore) test_clone3_supported();
EXPECT_EQ(getuid(), 0) - XFAIL(return, "Skipping all tests as non-root\n"); + SKIP(return, "Skipping all tests as non-root");
memset(&set_tid, 0, sizeof(set_tid));
From: Tommi Rantala tommi.t.rantala@nokia.com
[ Upstream commit 7d764b685ee1bc73a9fa2b6cb4d42fa72b943145 ]
XFAIL is gone since commit 9847d24af95c ("selftests/harness: Refactor XFAIL into SKIP"), use SKIP instead.
Fixes: 9847d24af95c ("selftests/harness: Refactor XFAIL into SKIP") Signed-off-by: Tommi Rantala tommi.t.rantala@nokia.com Reviewed-by: Kees Cook keescook@chromium.org Acked-by: Christian Brauner christian.brauner@ubuntu.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../selftests/filesystems/binderfs/binderfs_test.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/filesystems/binderfs/binderfs_test.c b/tools/testing/selftests/filesystems/binderfs/binderfs_test.c index 1d27f52c61e61..477cbb042f5ba 100644 --- a/tools/testing/selftests/filesystems/binderfs/binderfs_test.c +++ b/tools/testing/selftests/filesystems/binderfs/binderfs_test.c @@ -74,7 +74,7 @@ static int __do_binderfs_test(struct __test_metadata *_metadata) ret = mount(NULL, binderfs_mntpt, "binder", 0, 0); EXPECT_EQ(ret, 0) { if (errno == ENODEV) - XFAIL(goto out, "binderfs missing"); + SKIP(goto out, "binderfs missing"); TH_LOG("%s - Failed to mount binderfs", strerror(errno)); goto rmdir; } @@ -475,10 +475,10 @@ TEST(binderfs_stress) TEST(binderfs_test_privileged) { if (geteuid() != 0) - XFAIL(return, "Tests are not run as root. Skipping privileged tests"); + SKIP(return, "Tests are not run as root. Skipping privileged tests");
if (__do_binderfs_test(_metadata)) - XFAIL(return, "The Android binderfs filesystem is not available"); + SKIP(return, "The Android binderfs filesystem is not available"); }
TEST(binderfs_test_unprivileged) @@ -511,7 +511,7 @@ TEST(binderfs_test_unprivileged) ret = wait_for_pid(pid); if (ret) { if (ret == 2) - XFAIL(return, "The Android binderfs filesystem is not available"); + SKIP(return, "The Android binderfs filesystem is not available"); ASSERT_EQ(ret, 0) { TH_LOG("wait_for_pid() failed"); }
From: Anand K Mistry amistry@google.com
[ Upstream commit 1978b3a53a74e3230cd46932b149c6e62e832e9a ]
On AMD CPUs which have the feature X86_FEATURE_AMD_STIBP_ALWAYS_ON, STIBP is set to on and
spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED
At the same time, IBPB can be set to conditional.
However, this leads to the case where it's impossible to turn on IBPB for a process because in the PR_SPEC_DISABLE case in ib_prctl_set() the
spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED
condition leads to a return before the task flag is set. Similarly, ib_prctl_get() will return PR_SPEC_DISABLE even though IBPB is set to conditional.
More generally, the following cases are possible:
1. STIBP = conditional && IBPB = on for spectre_v2_user=seccomp,ibpb 2. STIBP = on && IBPB = conditional for AMD CPUs with X86_FEATURE_AMD_STIBP_ALWAYS_ON
The first case functions correctly today, but only because spectre_v2_user_ibpb isn't updated to reflect the IBPB mode.
At a high level, this change does one thing. If either STIBP or IBPB is set to conditional, allow the prctl to change the task flag. Also, reflect that capability when querying the state. This isn't perfect since it doesn't take into account if only STIBP or IBPB is unconditionally on. But it allows the conditional feature to work as expected, without affecting the unconditional one.
[ bp: Massage commit message and comment; space out statements for better readability. ]
Fixes: 21998a351512 ("x86/speculation: Avoid force-disabling IBPB based on STIBP and enhanced IBRS.") Signed-off-by: Anand K Mistry amistry@google.com Signed-off-by: Borislav Petkov bp@suse.de Acked-by: Thomas Gleixner tglx@linutronix.de Acked-by: Tom Lendacky thomas.lendacky@amd.com Link: https://lkml.kernel.org/r/20201105163246.v2.1.Ifd7243cd3e2c2206a893ad0a5b9a4... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kernel/cpu/bugs.c | 51 ++++++++++++++++++++++++-------------- 1 file changed, 33 insertions(+), 18 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index d3f0db463f96a..581fb7223ad0e 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -1254,6 +1254,14 @@ static int ssb_prctl_set(struct task_struct *task, unsigned long ctrl) return 0; }
+static bool is_spec_ib_user_controlled(void) +{ + return spectre_v2_user_ibpb == SPECTRE_V2_USER_PRCTL || + spectre_v2_user_ibpb == SPECTRE_V2_USER_SECCOMP || + spectre_v2_user_stibp == SPECTRE_V2_USER_PRCTL || + spectre_v2_user_stibp == SPECTRE_V2_USER_SECCOMP; +} + static int ib_prctl_set(struct task_struct *task, unsigned long ctrl) { switch (ctrl) { @@ -1261,16 +1269,26 @@ static int ib_prctl_set(struct task_struct *task, unsigned long ctrl) if (spectre_v2_user_ibpb == SPECTRE_V2_USER_NONE && spectre_v2_user_stibp == SPECTRE_V2_USER_NONE) return 0; + /* - * Indirect branch speculation is always disabled in strict - * mode. It can neither be enabled if it was force-disabled - * by a previous prctl call. + * With strict mode for both IBPB and STIBP, the instruction + * code paths avoid checking this task flag and instead, + * unconditionally run the instruction. However, STIBP and IBPB + * are independent and either can be set to conditionally + * enabled regardless of the mode of the other. + * + * If either is set to conditional, allow the task flag to be + * updated, unless it was force-disabled by a previous prctl + * call. Currently, this is possible on an AMD CPU which has the + * feature X86_FEATURE_AMD_STIBP_ALWAYS_ON. In this case, if the + * kernel is booted with 'spectre_v2_user=seccomp', then + * spectre_v2_user_ibpb == SPECTRE_V2_USER_SECCOMP and + * spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED. */ - if (spectre_v2_user_ibpb == SPECTRE_V2_USER_STRICT || - spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT || - spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED || + if (!is_spec_ib_user_controlled() || task_spec_ib_force_disable(task)) return -EPERM; + task_clear_spec_ib_disable(task); task_update_spec_tif(task); break; @@ -1283,10 +1301,10 @@ static int ib_prctl_set(struct task_struct *task, unsigned long ctrl) if (spectre_v2_user_ibpb == SPECTRE_V2_USER_NONE && spectre_v2_user_stibp == SPECTRE_V2_USER_NONE) return -EPERM; - if (spectre_v2_user_ibpb == SPECTRE_V2_USER_STRICT || - spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT || - spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED) + + if (!is_spec_ib_user_controlled()) return 0; + task_set_spec_ib_disable(task); if (ctrl == PR_SPEC_FORCE_DISABLE) task_set_spec_ib_force_disable(task); @@ -1351,20 +1369,17 @@ static int ib_prctl_get(struct task_struct *task) if (spectre_v2_user_ibpb == SPECTRE_V2_USER_NONE && spectre_v2_user_stibp == SPECTRE_V2_USER_NONE) return PR_SPEC_ENABLE; - else if (spectre_v2_user_ibpb == SPECTRE_V2_USER_STRICT || - spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT || - spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED) - return PR_SPEC_DISABLE; - else if (spectre_v2_user_ibpb == SPECTRE_V2_USER_PRCTL || - spectre_v2_user_ibpb == SPECTRE_V2_USER_SECCOMP || - spectre_v2_user_stibp == SPECTRE_V2_USER_PRCTL || - spectre_v2_user_stibp == SPECTRE_V2_USER_SECCOMP) { + else if (is_spec_ib_user_controlled()) { if (task_spec_ib_force_disable(task)) return PR_SPEC_PRCTL | PR_SPEC_FORCE_DISABLE; if (task_spec_ib_disable(task)) return PR_SPEC_PRCTL | PR_SPEC_DISABLE; return PR_SPEC_PRCTL | PR_SPEC_ENABLE; - } else + } else if (spectre_v2_user_ibpb == SPECTRE_V2_USER_STRICT || + spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT || + spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED) + return PR_SPEC_DISABLE; + else return PR_SPEC_NOT_AFFECTED; }
From: Bill Wendling morbo@google.com
[ Upstream commit a968433723310f35898b4a2f635a7991aeef66b1 ]
ld's --build-id defaults to "sha1" style, while lld defaults to "fast". The build IDs are very different between the two, which may confuse programs that reference them.
Signed-off-by: Bill Wendling morbo@google.com Acked-by: David S. Miller davem@davemloft.net Signed-off-by: Masahiro Yamada masahiroy@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- Makefile | 4 ++-- arch/arm/vdso/Makefile | 2 +- arch/arm64/kernel/vdso/Makefile | 2 +- arch/arm64/kernel/vdso32/Makefile | 2 +- arch/mips/vdso/Makefile | 2 +- arch/riscv/kernel/vdso/Makefile | 2 +- arch/s390/kernel/vdso64/Makefile | 2 +- arch/sparc/vdso/Makefile | 2 +- arch/x86/entry/vdso/Makefile | 2 +- tools/testing/selftests/bpf/Makefile | 2 +- 10 files changed, 11 insertions(+), 11 deletions(-)
diff --git a/Makefile b/Makefile index 035d86a0d291d..245c66fa8be17 100644 --- a/Makefile +++ b/Makefile @@ -973,8 +973,8 @@ KBUILD_CPPFLAGS += $(KCPPFLAGS) KBUILD_AFLAGS += $(KAFLAGS) KBUILD_CFLAGS += $(KCFLAGS)
-KBUILD_LDFLAGS_MODULE += --build-id -LDFLAGS_vmlinux += --build-id +KBUILD_LDFLAGS_MODULE += --build-id=sha1 +LDFLAGS_vmlinux += --build-id=sha1
ifeq ($(CONFIG_STRIP_ASM_SYMS),y) LDFLAGS_vmlinux += $(call ld-option, -X,) diff --git a/arch/arm/vdso/Makefile b/arch/arm/vdso/Makefile index a54f70731d9f1..150ce6e6a5d31 100644 --- a/arch/arm/vdso/Makefile +++ b/arch/arm/vdso/Makefile @@ -19,7 +19,7 @@ ccflags-y += -DDISABLE_BRANCH_PROFILING -DBUILD_VDSO32 ldflags-$(CONFIG_CPU_ENDIAN_BE8) := --be8 ldflags-y := -Bsymbolic --no-undefined -soname=linux-vdso.so.1 \ -z max-page-size=4096 -nostdlib -shared $(ldflags-y) \ - --hash-style=sysv --build-id \ + --hash-style=sysv --build-id=sha1 \ -T
obj-$(CONFIG_VDSO) += vdso.o diff --git a/arch/arm64/kernel/vdso/Makefile b/arch/arm64/kernel/vdso/Makefile index 45d5cfe464290..871915097f9d1 100644 --- a/arch/arm64/kernel/vdso/Makefile +++ b/arch/arm64/kernel/vdso/Makefile @@ -24,7 +24,7 @@ btildflags-$(CONFIG_ARM64_BTI_KERNEL) += -z force-bti # routines, as x86 does (see 6f121e548f83 ("x86, vdso: Reimplement vdso.so # preparation in build-time C")). ldflags-y := -shared -nostdlib -soname=linux-vdso.so.1 --hash-style=sysv \ - -Bsymbolic $(call ld-option, --no-eh-frame-hdr) --build-id -n \ + -Bsymbolic $(call ld-option, --no-eh-frame-hdr) --build-id=sha1 -n \ $(btildflags-y) -T
ccflags-y := -fno-common -fno-builtin -fno-stack-protector -ffixed-x18 diff --git a/arch/arm64/kernel/vdso32/Makefile b/arch/arm64/kernel/vdso32/Makefile index d6adb4677c25f..4fa4b3fe8efb7 100644 --- a/arch/arm64/kernel/vdso32/Makefile +++ b/arch/arm64/kernel/vdso32/Makefile @@ -128,7 +128,7 @@ VDSO_LDFLAGS += -Wl,-Bsymbolic -Wl,--no-undefined -Wl,-soname=linux-vdso.so.1 VDSO_LDFLAGS += -Wl,-z,max-page-size=4096 -Wl,-z,common-page-size=4096 VDSO_LDFLAGS += -nostdlib -shared -mfloat-abi=soft VDSO_LDFLAGS += -Wl,--hash-style=sysv -VDSO_LDFLAGS += -Wl,--build-id +VDSO_LDFLAGS += -Wl,--build-id=sha1 VDSO_LDFLAGS += $(call cc32-ldoption,-fuse-ld=bfd)
diff --git a/arch/mips/vdso/Makefile b/arch/mips/vdso/Makefile index 57fe832352819..5810cc12bc1d9 100644 --- a/arch/mips/vdso/Makefile +++ b/arch/mips/vdso/Makefile @@ -61,7 +61,7 @@ endif # VDSO linker flags. ldflags-y := -Bsymbolic --no-undefined -soname=linux-vdso.so.1 \ $(filter -E%,$(KBUILD_CFLAGS)) -nostdlib -shared \ - -G 0 --eh-frame-hdr --hash-style=sysv --build-id -T + -G 0 --eh-frame-hdr --hash-style=sysv --build-id=sha1 -T
CFLAGS_REMOVE_vdso.o = -pg
diff --git a/arch/riscv/kernel/vdso/Makefile b/arch/riscv/kernel/vdso/Makefile index 478e7338ddc10..7d6a94d45ec94 100644 --- a/arch/riscv/kernel/vdso/Makefile +++ b/arch/riscv/kernel/vdso/Makefile @@ -49,7 +49,7 @@ $(obj)/vdso.so.dbg: $(src)/vdso.lds $(obj-vdso) FORCE # refer to these symbols in the kernel code rather than hand-coded addresses.
SYSCFLAGS_vdso.so.dbg = -shared -s -Wl,-soname=linux-vdso.so.1 \ - -Wl,--build-id -Wl,--hash-style=both + -Wl,--build-id=sha1 -Wl,--hash-style=both $(obj)/vdso-dummy.o: $(src)/vdso.lds $(obj)/rt_sigreturn.o FORCE $(call if_changed,vdsold)
diff --git a/arch/s390/kernel/vdso64/Makefile b/arch/s390/kernel/vdso64/Makefile index 4a66a1cb919b1..edc473b32e420 100644 --- a/arch/s390/kernel/vdso64/Makefile +++ b/arch/s390/kernel/vdso64/Makefile @@ -19,7 +19,7 @@ KBUILD_AFLAGS_64 += -m64 -s KBUILD_CFLAGS_64 := $(filter-out -m64,$(KBUILD_CFLAGS)) KBUILD_CFLAGS_64 += -m64 -fPIC -shared -fno-common -fno-builtin ldflags-y := -fPIC -shared -nostdlib -soname=linux-vdso64.so.1 \ - --hash-style=both --build-id -T + --hash-style=both --build-id=sha1 -T
$(targets:%=$(obj)/%.dbg): KBUILD_CFLAGS = $(KBUILD_CFLAGS_64) $(targets:%=$(obj)/%.dbg): KBUILD_AFLAGS = $(KBUILD_AFLAGS_64) diff --git a/arch/sparc/vdso/Makefile b/arch/sparc/vdso/Makefile index f44355e46f31f..469dd23887abb 100644 --- a/arch/sparc/vdso/Makefile +++ b/arch/sparc/vdso/Makefile @@ -115,7 +115,7 @@ quiet_cmd_vdso = VDSO $@ -T $(filter %.lds,$^) $(filter %.o,$^) && \ sh $(srctree)/$(src)/checkundef.sh '$(OBJDUMP)' '$@'
-VDSO_LDFLAGS = -shared --hash-style=both --build-id -Bsymbolic +VDSO_LDFLAGS = -shared --hash-style=both --build-id=sha1 -Bsymbolic GCOV_PROFILE := n
# diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile index 215376d975a29..ebba25ed9a386 100644 --- a/arch/x86/entry/vdso/Makefile +++ b/arch/x86/entry/vdso/Makefile @@ -176,7 +176,7 @@ quiet_cmd_vdso = VDSO $@ -T $(filter %.lds,$^) $(filter %.o,$^) && \ sh $(srctree)/$(src)/checkundef.sh '$(NM)' '$@'
-VDSO_LDFLAGS = -shared --hash-style=both --build-id \ +VDSO_LDFLAGS = -shared --hash-style=both --build-id=sha1 \ $(call ld-option, --eh-frame-hdr) -Bsymbolic GCOV_PROFILE := n
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index fc946b7ac288d..daf186f88a636 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -133,7 +133,7 @@ $(OUTPUT)/%:%.c
$(OUTPUT)/urandom_read: urandom_read.c $(call msg,BINARY,,$@) - $(Q)$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS) -Wl,--build-id + $(Q)$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS) -Wl,--build-id=sha1
$(OUTPUT)/test_stub.o: test_stub.c $(BPFOBJ) $(call msg,CC,,$@)
From: Palmer Dabbelt palmerdabbelt@google.com
[ Upstream commit c2c81bb2f69138f902e1a58d3bef6ad97fb8a92c ]
We were relying on GNU ld's ability to re-link executable files in order to extract our VDSO symbols. This behavior was deemed a bug as of binutils-2.35 (specifically the binutils-gdb commit a87e1817a4 ("Have the linker fail if any attempt to link in an executable is made."), but as that has been backported to at least Debian's binutils-2.34 in may manifest in other places.
The previous version of this was a bit of a mess: we were linking a static executable version of the VDSO, containing only a subset of the input symbols, which we then linked into the kernel. This worked, but certainly wasn't a supported path through the toolchain. Instead this new version parses the textual output of nm to produce a symbol table. Both rely on near-zero addresses being linkable, but as we rely on weak undefined symbols being linkable elsewhere I don't view this as a major issue.
Fixes: e2c0cdfba7f6 ("RISC-V: User-facing API") Signed-off-by: Palmer Dabbelt palmerdabbelt@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kernel/vdso/.gitignore | 1 + arch/riscv/kernel/vdso/Makefile | 18 +++++++++--------- arch/riscv/kernel/vdso/so2s.sh | 6 ++++++ 3 files changed, 16 insertions(+), 9 deletions(-) create mode 100755 arch/riscv/kernel/vdso/so2s.sh
diff --git a/arch/riscv/kernel/vdso/.gitignore b/arch/riscv/kernel/vdso/.gitignore index 11ebee9e4c1d6..3a19def868ecc 100644 --- a/arch/riscv/kernel/vdso/.gitignore +++ b/arch/riscv/kernel/vdso/.gitignore @@ -1,3 +1,4 @@ # SPDX-License-Identifier: GPL-2.0-only vdso.lds *.tmp +vdso-syms.S diff --git a/arch/riscv/kernel/vdso/Makefile b/arch/riscv/kernel/vdso/Makefile index 7d6a94d45ec94..cb8f9e4cfcbf8 100644 --- a/arch/riscv/kernel/vdso/Makefile +++ b/arch/riscv/kernel/vdso/Makefile @@ -43,19 +43,14 @@ $(obj)/vdso.o: $(obj)/vdso.so SYSCFLAGS_vdso.so.dbg = $(c_flags) $(obj)/vdso.so.dbg: $(src)/vdso.lds $(obj-vdso) FORCE $(call if_changed,vdsold) +SYSCFLAGS_vdso.so.dbg = -shared -s -Wl,-soname=linux-vdso.so.1 \ + -Wl,--build-id -Wl,--hash-style=both
# We also create a special relocatable object that should mirror the symbol # table and layout of the linked DSO. With ld --just-symbols we can then # refer to these symbols in the kernel code rather than hand-coded addresses. - -SYSCFLAGS_vdso.so.dbg = -shared -s -Wl,-soname=linux-vdso.so.1 \ - -Wl,--build-id=sha1 -Wl,--hash-style=both -$(obj)/vdso-dummy.o: $(src)/vdso.lds $(obj)/rt_sigreturn.o FORCE - $(call if_changed,vdsold) - -LDFLAGS_vdso-syms.o := -r --just-symbols -$(obj)/vdso-syms.o: $(obj)/vdso-dummy.o FORCE - $(call if_changed,ld) +$(obj)/vdso-syms.S: $(obj)/vdso.so FORCE + $(call if_changed,so2s)
# strip rule for the .so file $(obj)/%.so: OBJCOPYFLAGS := -S @@ -73,6 +68,11 @@ quiet_cmd_vdsold = VDSOLD $@ $(patsubst %, -G __vdso_%, $(vdso-syms)) $@.tmp $@ && \ rm $@.tmp
+# Extracts symbol offsets from the VDSO, converting them into an assembly file +# that contains the same symbols at the same offsets. +quiet_cmd_so2s = SO2S $@ + cmd_so2s = $(NM) -D $< | $(srctree)/$(src)/so2s.sh > $@ + # install commands for the unstripped file quiet_cmd_vdso_install = INSTALL $@ cmd_vdso_install = cp $(obj)/$@.dbg $(MODLIB)/vdso/$@ diff --git a/arch/riscv/kernel/vdso/so2s.sh b/arch/riscv/kernel/vdso/so2s.sh new file mode 100755 index 0000000000000..e64cb6d9440e7 --- /dev/null +++ b/arch/riscv/kernel/vdso/so2s.sh @@ -0,0 +1,6 @@ +#!/bin/sh +# SPDX-License-Identifier: GPL-2.0+ +# Copyright 2020 Palmer Dabbelt palmerdabbelt@google.com + +sed 's!([0-9a-f]*) T ([a-z0-9_]*)(@@LINUX_4.15)*!.global \2\n.set \2,0x\1!' \ +| grep '^.'
From: Zhang Qilong zhangqilong3@huawei.com
[ Upstream commit 00bd6bca3fb1e98190a24eda2583062803c9e8b5 ]
pm_runtime_get_sync() will increment pm usage at first and it will resume the device later. If runtime of the device has error or device is in inaccessible state(or other error state), resume operation will fail. If we do not call put operation to decrease the reference, the result is that this device cannot enter the idle state and always stay busy or other non-idle state.
Fixes: 249fa8217b846 ("USB: Add driver to control USB fast charge for iOS devices") Signed-off-by: Zhang Qilong zhangqilong3@huawei.com Link: https://lore.kernel.org/r/20201102022650.67115-1-zhangqilong3@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/misc/apple-mfi-fastcharge.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/misc/apple-mfi-fastcharge.c b/drivers/usb/misc/apple-mfi-fastcharge.c index 579d8c84de42c..9de0171b51776 100644 --- a/drivers/usb/misc/apple-mfi-fastcharge.c +++ b/drivers/usb/misc/apple-mfi-fastcharge.c @@ -120,8 +120,10 @@ static int apple_mfi_fc_set_property(struct power_supply *psy, dev_dbg(&mfi->udev->dev, "prop: %d\n", psp);
ret = pm_runtime_get_sync(&mfi->udev->dev); - if (ret < 0) + if (ret < 0) { + pm_runtime_put_noidle(&mfi->udev->dev); return ret; + }
switch (psp) { case POWER_SUPPLY_PROP_CHARGE_TYPE:
From: Tyler Hicks tyhicks@linux.microsoft.com
[ Upstream commit 8ffd778aff45be760292225049e0141255d4ad6e ]
Mimic the pre-existing ACPI and Device Tree event log behavior by not creating the binary_bios_measurements file when the EFI TPM event log is empty.
This fixes the following NULL pointer dereference that can occur when reading /sys/kernel/security/tpm0/binary_bios_measurements after the kernel received an empty event log from the firmware:
BUG: kernel NULL pointer dereference, address: 000000000000002c #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 2 PID: 3932 Comm: fwupdtpmevlog Not tainted 5.9.0-00003-g629990edad62 #17 Hardware name: LENOVO 20LCS03L00/20LCS03L00, BIOS N27ET38W (1.24 ) 11/28/2019 RIP: 0010:tpm2_bios_measurements_start+0x3a/0x550 Code: 54 53 48 83 ec 68 48 8b 57 70 48 8b 1e 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 48 8b 82 c0 06 00 00 48 8b 8a c8 06 00 00 <44> 8b 60 1c 48 89 4d a0 4c 89 e2 49 83 c4 20 48 83 fb 00 75 2a 49 RSP: 0018:ffffa9c901203db0 EFLAGS: 00010246 RAX: 0000000000000010 RBX: 0000000000000000 RCX: 0000000000000010 RDX: ffff8ba1eb99c000 RSI: ffff8ba1e4ce8280 RDI: ffff8ba1e4ce8258 RBP: ffffa9c901203e40 R08: ffffa9c901203dd8 R09: ffff8ba1ec443300 R10: ffffa9c901203e50 R11: 0000000000000000 R12: ffff8ba1e4ce8280 R13: ffffa9c901203ef0 R14: ffffa9c901203ef0 R15: ffff8ba1e4ce8258 FS: 00007f6595460880(0000) GS:ffff8ba1ef880000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000002c CR3: 00000007d8d18003 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ? __kmalloc_node+0x113/0x320 ? kvmalloc_node+0x31/0x80 seq_read+0x94/0x420 vfs_read+0xa7/0x190 ksys_read+0xa7/0xe0 __x64_sys_read+0x1a/0x20 do_syscall_64+0x37/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xa9
In this situation, the bios_event_log pointer in the tpm_bios_log struct was not NULL but was equal to the ZERO_SIZE_PTR (0x10) value. This was due to the following kmemdup() in tpm_read_log_efi():
int tpm_read_log_efi(struct tpm_chip *chip) { ... /* malloc EventLog space */ log->bios_event_log = kmemdup(log_tbl->log, log_size, GFP_KERNEL); if (!log->bios_event_log) { ret = -ENOMEM; goto out; } ... }
When log_size is zero, due to an empty event log from firmware, ZERO_SIZE_PTR is returned from kmemdup(). Upon a read of the binary_bios_measurements file, the tpm2_bios_measurements_start() function does not perform a ZERO_OR_NULL_PTR() check on the bios_event_log pointer before dereferencing it.
Rather than add a ZERO_OR_NULL_PTR() check in functions that make use of the bios_event_log pointer, simply avoid creating the binary_bios_measurements_file as is done in other event log retrieval backends.
Explicitly ignore all of the events in the final event log when the main event log is empty. The list of events in the final event log cannot be accurately parsed without referring to the first event in the main event log (the event log header) so the final event log is useless in such a situation.
Fixes: 58cc1e4faf10 ("tpm: parse TPM event logs based on EFI table") Link: https://lore.kernel.org/linux-integrity/E1FDCCCB-CA51-4AEE-AC83-9CDE995EAE52... Reported-by: Kai-Heng Feng kai.heng.feng@canonical.com Reported-by: Kenneth R. Crudup kenny@panix.com Reported-by: Mimi Zohar zohar@linux.ibm.com Cc: Thiébaud Weksteen tweek@google.com Cc: Ard Biesheuvel ardb@kernel.org Signed-off-by: Tyler Hicks tyhicks@linux.microsoft.com Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/char/tpm/eventlog/efi.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/char/tpm/eventlog/efi.c b/drivers/char/tpm/eventlog/efi.c index 6bb023de17f1f..35229e5143cac 100644 --- a/drivers/char/tpm/eventlog/efi.c +++ b/drivers/char/tpm/eventlog/efi.c @@ -41,6 +41,11 @@ int tpm_read_log_efi(struct tpm_chip *chip) log_size = log_tbl->size; memunmap(log_tbl);
+ if (!log_size) { + pr_warn("UEFI TPM log area empty\n"); + return -EIO; + } + log_tbl = memremap(efi.tpm_log, sizeof(*log_tbl) + log_size, MEMREMAP_WB); if (!log_tbl) {
From: Stephen Boyd swboyd@chromium.org
commit 1de111b51b829bcf01d2e57971f8fd07a665fa3f upstream.
According to the SMCCC spec[1](7.5.2 Discovery) the ARM_SMCCC_ARCH_WORKAROUND_1 function id only returns 0, 1, and SMCCC_RET_NOT_SUPPORTED.
0 is "workaround required and safe to call this function" 1 is "workaround not required but safe to call this function" SMCCC_RET_NOT_SUPPORTED is "might be vulnerable or might not be, who knows, I give up!"
SMCCC_RET_NOT_SUPPORTED might as well mean "workaround required, except calling this function may not work because it isn't implemented in some cases". Wonderful. We map this SMC call to
0 is SPECTRE_MITIGATED 1 is SPECTRE_UNAFFECTED SMCCC_RET_NOT_SUPPORTED is SPECTRE_VULNERABLE
For KVM hypercalls (hvc), we've implemented this function id to return SMCCC_RET_NOT_SUPPORTED, 0, and SMCCC_RET_NOT_REQUIRED. One of those isn't supposed to be there. Per the code we call arm64_get_spectre_v2_state() to figure out what to return for this feature discovery call.
0 is SPECTRE_MITIGATED SMCCC_RET_NOT_REQUIRED is SPECTRE_UNAFFECTED SMCCC_RET_NOT_SUPPORTED is SPECTRE_VULNERABLE
Let's clean this up so that KVM tells the guest this mapping:
0 is SPECTRE_MITIGATED 1 is SPECTRE_UNAFFECTED SMCCC_RET_NOT_SUPPORTED is SPECTRE_VULNERABLE
Note: SMCCC_RET_NOT_AFFECTED is 1 but isn't part of the SMCCC spec
Fixes: c118bbb52743 ("arm64: KVM: Propagate full Spectre v2 workaround state to KVM guests") Signed-off-by: Stephen Boyd swboyd@chromium.org Acked-by: Marc Zyngier maz@kernel.org Acked-by: Will Deacon will@kernel.org Cc: Andre Przywara andre.przywara@arm.com Cc: Steven Price steven.price@arm.com Cc: Marc Zyngier maz@kernel.org Cc: stable@vger.kernel.org Link: https://developer.arm.com/documentation/den0028/latest [1] Link: https://lore.kernel.org/r/20201023154751.1973872-1-swboyd@chromium.org Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Stephen Boyd swboyd@chromium.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kvm/hypercalls.c | 2 +- include/linux/arm-smccc.h | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-)
--- a/arch/arm64/kvm/hypercalls.c +++ b/arch/arm64/kvm/hypercalls.c @@ -31,7 +31,7 @@ int kvm_hvc_call_handler(struct kvm_vcpu val = SMCCC_RET_SUCCESS; break; case KVM_BP_HARDEN_NOT_REQUIRED: - val = SMCCC_RET_NOT_REQUIRED; + val = SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED; break; } break; --- a/include/linux/arm-smccc.h +++ b/include/linux/arm-smccc.h @@ -86,6 +86,8 @@ ARM_SMCCC_SMC_32, \ 0, 0x7fff)
+#define SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED 1 + /* Paravirtualised time calls (defined by ARM DEN0057A) */ #define ARM_SMCCC_HV_PV_TIME_FEATURES \ ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
From: Masashi Honma masashi.honma@gmail.com
commit 5024f21c159f8c1668f581fff37140741c0b1ba9 upstream.
kernel test robot says: drivers/net/wireless/ath/ath9k/htc_drv_txrx.c:987:20: sparse: warning: incorrect type in assignment (different base types) drivers/net/wireless/ath/ath9k/htc_drv_txrx.c:987:20: sparse: expected restricted __be16 [usertype] rs_datalen drivers/net/wireless/ath/ath9k/htc_drv_txrx.c:987:20: sparse: got unsigned short [usertype] drivers/net/wireless/ath/ath9k/htc_drv_txrx.c:988:13: sparse: warning: restricted __be16 degrades to integer drivers/net/wireless/ath/ath9k/htc_drv_txrx.c:1001:13: sparse: warning: restricted __be16 degrades to integer
Indeed rs_datalen has host byte order, so modify it's own type.
Reported-by: kernel test robot lkp@intel.com Fixes: cd486e627e67 ("ath9k_htc: Discard undersized packets") Signed-off-by: Masashi Honma masashi.honma@gmail.com Signed-off-by: Kalle Valo kvalo@codeaurora.org Link: https://lore.kernel.org/r/20200808233258.4596-1-masashi.honma@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/wireless/ath/ath9k/htc_drv_txrx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/wireless/ath/ath9k/htc_drv_txrx.c +++ b/drivers/net/wireless/ath/ath9k/htc_drv_txrx.c @@ -974,7 +974,7 @@ static bool ath9k_rx_prepare(struct ath9 struct ath_htc_rx_status *rxstatus; struct ath_rx_status rx_stats; bool decrypt_error = false; - __be16 rs_datalen; + u16 rs_datalen; bool is_phyerr;
if (skb->len < HTC_RX_FRAME_HEADER_SIZE) {
From: Pujin Shi shipujin.t@gmail.com
commit 6500251e590657066a227dce897a0392f302af24 upstream.
For older versions of gcc, the array = {0}; will cause warnings:
drivers/scsi/ufs/ufshcd-crypto.c: In function 'ufshcd_crypto_keyslot_program': drivers/scsi/ufs/ufshcd-crypto.c:62:8: warning: missing braces around initializer [-Wmissing-braces] union ufs_crypto_cfg_entry cfg = { 0 }; ^ drivers/scsi/ufs/ufshcd-crypto.c:62:8: warning: (near initialization for 'cfg.reg_val') [-Wmissing-braces] drivers/scsi/ufs/ufshcd-crypto.c: In function 'ufshcd_clear_keyslot': drivers/scsi/ufs/ufshcd-crypto.c:103:8: warning: missing braces around initializer [-Wmissing-braces] union ufs_crypto_cfg_entry cfg = { 0 }; ^ 2 warnings generated
Link: https://lore.kernel.org/r/20201002063538.1250-1-shipujin.t@gmail.com Fixes: 70297a8ac7a7 ("scsi: ufs: UFS crypto API") Reviewed-by: Eric Biggers ebiggers@google.com Signed-off-by: Pujin Shi shipujin.t@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/ufs/ufshcd-crypto.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/scsi/ufs/ufshcd-crypto.c +++ b/drivers/scsi/ufs/ufshcd-crypto.c @@ -59,7 +59,7 @@ static int ufshcd_crypto_keyslot_program u8 data_unit_mask = key->crypto_cfg.data_unit_size / 512; int i; int cap_idx = -1; - union ufs_crypto_cfg_entry cfg = { 0 }; + union ufs_crypto_cfg_entry cfg = {}; int err;
BUILD_BUG_ON(UFS_CRYPTO_KEY_SIZE_INVALID != 0); @@ -100,7 +100,7 @@ static int ufshcd_clear_keyslot(struct u * Clear the crypto cfg on the device. Clearing CFGE * might not be sufficient, so just clear the entire cfg. */ - union ufs_crypto_cfg_entry cfg = { 0 }; + union ufs_crypto_cfg_entry cfg = {};
return ufshcd_program_key(hba, &cfg, slot); }
From: Tzung-Bi Shih tzungbi@google.com
[ Upstream commit eb5a558705c7f63d06b4ddd072898b1ca894e053 ]
RT1015's output widget name is "SPO" instead of "Speaker". Fixes it to use the correct names.
Signed-off-by: Tzung-Bi Shih tzungbi@google.com Link: https://lore.kernel.org/r/20201019044724.1601476-1-tzungbi@google.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../mediatek/mt8183/mt8183-da7219-max98357.c | 31 +++++++++++++++---- 1 file changed, 25 insertions(+), 6 deletions(-)
diff --git a/sound/soc/mediatek/mt8183/mt8183-da7219-max98357.c b/sound/soc/mediatek/mt8183/mt8183-da7219-max98357.c index a6c690c5308d3..58b76e985f7f3 100644 --- a/sound/soc/mediatek/mt8183/mt8183-da7219-max98357.c +++ b/sound/soc/mediatek/mt8183/mt8183-da7219-max98357.c @@ -624,15 +624,34 @@ static struct snd_soc_codec_conf mt8183_da7219_rt1015_codec_conf[] = { }, };
+static const struct snd_kcontrol_new mt8183_da7219_rt1015_snd_controls[] = { + SOC_DAPM_PIN_SWITCH("Left Spk"), + SOC_DAPM_PIN_SWITCH("Right Spk"), +}; + +static const +struct snd_soc_dapm_widget mt8183_da7219_rt1015_dapm_widgets[] = { + SND_SOC_DAPM_SPK("Left Spk", NULL), + SND_SOC_DAPM_SPK("Right Spk", NULL), + SND_SOC_DAPM_PINCTRL("TDM_OUT_PINCTRL", + "aud_tdm_out_on", "aud_tdm_out_off"), +}; + +static const struct snd_soc_dapm_route mt8183_da7219_rt1015_dapm_routes[] = { + {"Left Spk", NULL, "Left SPO"}, + {"Right Spk", NULL, "Right SPO"}, + {"I2S Playback", NULL, "TDM_OUT_PINCTRL"}, +}; + static struct snd_soc_card mt8183_da7219_rt1015_card = { .name = "mt8183_da7219_rt1015", .owner = THIS_MODULE, - .controls = mt8183_da7219_max98357_snd_controls, - .num_controls = ARRAY_SIZE(mt8183_da7219_max98357_snd_controls), - .dapm_widgets = mt8183_da7219_max98357_dapm_widgets, - .num_dapm_widgets = ARRAY_SIZE(mt8183_da7219_max98357_dapm_widgets), - .dapm_routes = mt8183_da7219_max98357_dapm_routes, - .num_dapm_routes = ARRAY_SIZE(mt8183_da7219_max98357_dapm_routes), + .controls = mt8183_da7219_rt1015_snd_controls, + .num_controls = ARRAY_SIZE(mt8183_da7219_rt1015_snd_controls), + .dapm_widgets = mt8183_da7219_rt1015_dapm_widgets, + .num_dapm_widgets = ARRAY_SIZE(mt8183_da7219_rt1015_dapm_widgets), + .dapm_routes = mt8183_da7219_rt1015_dapm_routes, + .num_dapm_routes = ARRAY_SIZE(mt8183_da7219_rt1015_dapm_routes), .dai_link = mt8183_da7219_dai_links, .num_links = ARRAY_SIZE(mt8183_da7219_dai_links), .aux_dev = &mt8183_da7219_max98357_headset_dev,
From: Srinivas Kandagatla srinivas.kandagatla@linaro.org
[ Upstream commit 3f48b6eba15ea342ef4cb420b580f5ed6605669f ]
With the current state of code, we would endup with something like below in /proc/asound/cards for 2 machines based on this driver.
Machine 1: 0 [DB845c ]: DB845c - DB845c DB845c Machine 2: 0 [LenovoYOGAC6301]: Lenovo-YOGA-C63 - Lenovo-YOGA-C630-13Q50 LENOVO-81JL-LenovoYOGAC630_13Q50-LNVNB161216
This is not very UCM friendly both w.r.t to common up configs and card identification, and UCM2 became totally not usefull with just one ucm sdm845.conf for two machines which have different setups w.r.t HDMI and other dais.
Reasons for such thing is partly because Qualcomm machine drivers never cared to set driver_name.
This patch sets up driver name for the this driver to sort out the UCM integration issues!
after this patch contents of /proc/asound/cards:
Machine 1: 0 [DB845c ]: sdm845 - DB845c DB845c Machine 2: 0 [LenovoYOGAC6301]: sdm845 - Lenovo-YOGA-C630-13Q50 LENOVO-81JL-LenovoYOGAC630_13Q50-LNVNB161216
with this its possible to align with what UCM2 expects and we can have sdm845/DB845.conf sdm845/LENOVO-81JL-LenovoYOGAC630_13Q50-LNVNB161216.conf ... for board variants. This should scale much better!
Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20201023095849.22894-1-srinivas.kandagatla@linaro.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/qcom/sdm845.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/sound/soc/qcom/sdm845.c b/sound/soc/qcom/sdm845.c index ab1bf23c21a68..6c2760e27ea6f 100644 --- a/sound/soc/qcom/sdm845.c +++ b/sound/soc/qcom/sdm845.c @@ -17,6 +17,7 @@ #include "qdsp6/q6afe.h" #include "../codecs/rt5663.h"
+#define DRIVER_NAME "sdm845" #define DEFAULT_SAMPLE_RATE_48K 48000 #define DEFAULT_MCLK_RATE 24576000 #define TDM_BCLK_RATE 6144000 @@ -552,6 +553,7 @@ static int sdm845_snd_platform_probe(struct platform_device *pdev) if (!data) return -ENOMEM;
+ card->driver_name = DRIVER_NAME; card->dapm_widgets = sdm845_snd_widgets; card->num_dapm_widgets = ARRAY_SIZE(sdm845_snd_widgets); card->dev = dev;
From: Olivier Moysan olivier.moysan@st.com
[ Upstream commit 20afe581c9b980848ad097c4d54dde9bec7593ef ]
A delay must be introduced before the shutdown down of the mclk, as stated in CS42L51 datasheet. Otherwise the codec may produce some noise after the end of DAPM power down sequence. The delay between DAC and CLOCK_SUPPLY widgets is too short. Add a delay in mclk shutdown request to manage the shutdown delay explicitly. From experiments, at least 10ms delay is necessary. Set delay to 20ms as recommended in Documentation/timers/timers-howto.rst when using msleep().
Signed-off-by: Olivier Moysan olivier.moysan@st.com Link: https://lore.kernel.org/r/20201020150109.482-1-olivier.moysan@st.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/cs42l51.c | 22 +++++++++++++++++++++- 1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/sound/soc/codecs/cs42l51.c b/sound/soc/codecs/cs42l51.c index 764f2ef8f59df..2b617993b0adb 100644 --- a/sound/soc/codecs/cs42l51.c +++ b/sound/soc/codecs/cs42l51.c @@ -245,8 +245,28 @@ static const struct snd_soc_dapm_widget cs42l51_dapm_widgets[] = { &cs42l51_adcr_mux_controls), };
+static int mclk_event(struct snd_soc_dapm_widget *w, + struct snd_kcontrol *kcontrol, int event) +{ + struct snd_soc_component *comp = snd_soc_dapm_to_component(w->dapm); + struct cs42l51_private *cs42l51 = snd_soc_component_get_drvdata(comp); + + switch (event) { + case SND_SOC_DAPM_PRE_PMU: + return clk_prepare_enable(cs42l51->mclk_handle); + case SND_SOC_DAPM_POST_PMD: + /* Delay mclk shutdown to fulfill power-down sequence requirements */ + msleep(20); + clk_disable_unprepare(cs42l51->mclk_handle); + break; + } + + return 0; +} + static const struct snd_soc_dapm_widget cs42l51_dapm_mclk_widgets[] = { - SND_SOC_DAPM_CLOCK_SUPPLY("MCLK") + SND_SOC_DAPM_SUPPLY("MCLK", SND_SOC_NOPM, 0, 0, mclk_event, + SND_SOC_DAPM_PRE_PMU | SND_SOC_DAPM_POST_PMD), };
static const struct snd_soc_dapm_route cs42l51_routes[] = {
From: Bard Liao yung-chuan.liao@linux.intel.com
[ Upstream commit 6e5329c6e6032cd997400b43b8299f607a61883e ]
Do not emit a warning for extended firmware header fields that are not used by kernel. This creates unnecessary noise to kernel logs like:
sof-audio-pci 0000:00:1f.3: warning: unknown ext header type 3 size 0x1c sof-audio-pci 0000:00:1f.3: warning: unknown ext header type 4 size 0x10
Signed-off-by: Bard Liao yung-chuan.liao@linux.intel.com Reviewed-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Reviewed-by: Guennadi Liakhovetski guennadi.liakhovetski@linux.intel.com Signed-off-by: Kai Vehmanen kai.vehmanen@linux.intel.com Link: https://lore.kernel.org/r/20201021182419.1160391-1-kai.vehmanen@linux.intel.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/sof/loader.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/sound/soc/sof/loader.c b/sound/soc/sof/loader.c index b94fa5f5d4808..c90c3f3a3b3ee 100644 --- a/sound/soc/sof/loader.c +++ b/sound/soc/sof/loader.c @@ -118,6 +118,11 @@ int snd_sof_fw_parse_ext_data(struct snd_sof_dev *sdev, u32 bar, u32 offset) case SOF_IPC_EXT_CC_INFO: ret = get_cc_info(sdev, ext_hdr); break; + case SOF_IPC_EXT_UNUSED: + case SOF_IPC_EXT_PROBE_INFO: + case SOF_IPC_EXT_USER_ABI_INFO: + /* They are supported but we don't do anything here */ + break; default: dev_warn(sdev->dev, "warning: unknown ext header type %d size 0x%x\n", ext_hdr->type, ext_hdr->hdr.size);
From: Heikki Krogerus heikki.krogerus@linux.intel.com
[ Upstream commit 1384ab4fee12c4c4f8bd37bc9f8686881587b286 ]
This patch adds the necessary PCI ID for Intel Alder Lake-S devices.
Signed-off-by: Heikki Krogerus heikki.krogerus@linux.intel.com Signed-off-by: Felipe Balbi balbi@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/dwc3/dwc3-pci.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/usb/dwc3/dwc3-pci.c b/drivers/usb/dwc3/dwc3-pci.c index 242b6210380a4..bae6a70664c80 100644 --- a/drivers/usb/dwc3/dwc3-pci.c +++ b/drivers/usb/dwc3/dwc3-pci.c @@ -40,6 +40,7 @@ #define PCI_DEVICE_ID_INTEL_TGPLP 0xa0ee #define PCI_DEVICE_ID_INTEL_TGPH 0x43ee #define PCI_DEVICE_ID_INTEL_JSP 0x4dee +#define PCI_DEVICE_ID_INTEL_ADLS 0x7ae1
#define PCI_INTEL_BXT_DSM_GUID "732b85d5-b7a7-4a1b-9ba0-4bbd00ffd511" #define PCI_INTEL_BXT_FUNC_PMU_PWR 4 @@ -367,6 +368,9 @@ static const struct pci_device_id dwc3_pci_id_table[] = { { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_JSP), (kernel_ulong_t) &dwc3_pci_intel_properties, },
+ { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_ADLS), + (kernel_ulong_t) &dwc3_pci_intel_properties, }, + { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_NL_USB), (kernel_ulong_t) &dwc3_pci_amd_properties, }, { } /* Terminating Entry */
From: Viresh Kumar viresh.kumar@linaro.org
[ Upstream commit e0df59de670b48a923246fae1f972317b84b2764 ]
There is a lot of stuff here which can be done outside of the big opp_table_lock, do that. This helps avoiding few circular dependency lockdeps around debugfs and interconnects.
Reported-by: Rob Clark robdclark@gmail.com Reported-by: Dmitry Osipenko digetx@gmail.com Signed-off-by: Viresh Kumar viresh.kumar@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/opp/core.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/opp/core.c b/drivers/opp/core.c index 1a95ad40795be..a963df7bd2749 100644 --- a/drivers/opp/core.c +++ b/drivers/opp/core.c @@ -1160,6 +1160,10 @@ static void _opp_table_kref_release(struct kref *kref) struct opp_device *opp_dev, *temp; int i;
+ /* Drop the lock as soon as we can */ + list_del(&opp_table->node); + mutex_unlock(&opp_table_lock); + _of_clear_opp_table(opp_table);
/* Release clk */ @@ -1187,10 +1191,7 @@ static void _opp_table_kref_release(struct kref *kref)
mutex_destroy(&opp_table->genpd_virt_dev_lock); mutex_destroy(&opp_table->lock); - list_del(&opp_table->node); kfree(opp_table); - - mutex_unlock(&opp_table_lock); }
void dev_pm_opp_put_opp_table(struct opp_table *opp_table)
From: Evgeny Novikov novikov@ispras.ru
[ Upstream commit 0d66e04875c5aae876cf3d4f4be7978fa2b00523 ]
goku_probe() goes to error label "err" and invokes goku_remove() in case of failures of pci_enable_device(), pci_resource_start() and ioremap(). goku_remove() gets a device from pci_get_drvdata(pdev) and works with it without any checks, in particular it dereferences a corresponding pointer. But goku_probe() did not set this device yet. So, one can expect various crashes. The patch moves setting the device just after allocation of memory for it.
Found by Linux Driver Verification project (linuxtesting.org).
Reported-by: Pavel Andrianov andrianov@ispras.ru Signed-off-by: Evgeny Novikov novikov@ispras.ru Signed-off-by: Felipe Balbi balbi@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/gadget/udc/goku_udc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/gadget/udc/goku_udc.c b/drivers/usb/gadget/udc/goku_udc.c index 25c1d6ab5adb4..3e1267d38774f 100644 --- a/drivers/usb/gadget/udc/goku_udc.c +++ b/drivers/usb/gadget/udc/goku_udc.c @@ -1760,6 +1760,7 @@ static int goku_probe(struct pci_dev *pdev, const struct pci_device_id *id) goto err; }
+ pci_set_drvdata(pdev, dev); spin_lock_init(&dev->lock); dev->pdev = pdev; dev->gadget.ops = &goku_ops; @@ -1793,7 +1794,6 @@ static int goku_probe(struct pci_dev *pdev, const struct pci_device_id *id) } dev->regs = (struct goku_udc_regs __iomem *) base;
- pci_set_drvdata(pdev, dev); INFO(dev, "%s\n", driver_desc); INFO(dev, "version: " DRIVER_VERSION " %s\n", dmastr()); INFO(dev, "irq %d, pci mem %p\n", pdev->irq, base);
From: Zqiang qiang.zhang@windriver.com
[ Upstream commit 129aa9734559a17990ee933351c7b6956f1dba62 ]
When fetch 'event' from event queue, after copy its address space content to user space, the 'event' the memory space pointed to by the 'event' pointer need be freed.
BUG: memory leak unreferenced object 0xffff888110622660 (size 32): comm "softirq", pid 0, jiffies 4294941981 (age 12.480s) hex dump (first 32 bytes): 02 00 00 00 08 00 00 00 80 06 00 01 00 00 40 00 ..............@. 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000efd29abd>] kmalloc include/linux/slab.h:554 [inline] [<00000000efd29abd>] raw_event_queue_add drivers/usb/gadget/legacy/raw_gadget.c:66 [inline] [<00000000efd29abd>] raw_queue_event drivers/usb/gadget/legacy/raw_gadget.c:225 [inline] [<00000000efd29abd>] gadget_setup+0xf6/0x220 drivers/usb/gadget/legacy/raw_gadget.c:343 [<00000000952c4a46>] dummy_timer+0xb9f/0x14c0 drivers/usb/gadget/udc/dummy_hcd.c:1899 [<0000000074ac2c54>] call_timer_fn+0x38/0x200 kernel/time/timer.c:1415 [<00000000560a3a79>] expire_timers kernel/time/timer.c:1460 [inline] [<00000000560a3a79>] __run_timers.part.0+0x319/0x400 kernel/time/timer.c:1757 [<000000009d9503d0>] __run_timers kernel/time/timer.c:1738 [inline] [<000000009d9503d0>] run_timer_softirq+0x3d/0x80 kernel/time/timer.c:1770 [<000000009df27c89>] __do_softirq+0xcc/0x2c2 kernel/softirq.c:298 [<000000007a3f1a47>] asm_call_irq_on_stack+0xf/0x20 [<000000004a62cc2e>] __run_on_irqstack arch/x86/include/asm/irq_stack.h:26 [inline] [<000000004a62cc2e>] run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:77 [inline] [<000000004a62cc2e>] do_softirq_own_stack+0x32/0x40 arch/x86/kernel/irq_64.c:77 [<00000000b0086800>] invoke_softirq kernel/softirq.c:393 [inline] [<00000000b0086800>] __irq_exit_rcu kernel/softirq.c:423 [inline] [<00000000b0086800>] irq_exit_rcu+0x91/0xc0 kernel/softirq.c:435 [<00000000175f9523>] sysvec_apic_timer_interrupt+0x36/0x80 arch/x86/kernel/apic/apic.c:1091 [<00000000a348e847>] asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:631 [<0000000060661100>] native_safe_halt arch/x86/include/asm/irqflags.h:60 [inline] [<0000000060661100>] arch_safe_halt arch/x86/include/asm/irqflags.h:103 [inline] [<0000000060661100>] acpi_safe_halt drivers/acpi/processor_idle.c:111 [inline] [<0000000060661100>] acpi_idle_do_entry+0xc3/0xd0 drivers/acpi/processor_idle.c:517 [<000000003f413b99>] acpi_idle_enter+0x128/0x1f0 drivers/acpi/processor_idle.c:648 [<00000000f5e5afb8>] cpuidle_enter_state+0xc9/0x650 drivers/cpuidle/cpuidle.c:237 [<00000000d50d51fc>] cpuidle_enter+0x29/0x40 drivers/cpuidle/cpuidle.c:351 [<00000000d674baed>] call_cpuidle kernel/sched/idle.c:132 [inline] [<00000000d674baed>] cpuidle_idle_call kernel/sched/idle.c:213 [inline] [<00000000d674baed>] do_idle+0x1c8/0x250 kernel/sched/idle.c:273
Reported-by: syzbot+bd38200f53df6259e6bf@syzkaller.appspotmail.com Signed-off-by: Zqiang qiang.zhang@windriver.com Signed-off-by: Felipe Balbi balbi@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/gadget/legacy/raw_gadget.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/gadget/legacy/raw_gadget.c b/drivers/usb/gadget/legacy/raw_gadget.c index e01e366d89cd5..062dfac303996 100644 --- a/drivers/usb/gadget/legacy/raw_gadget.c +++ b/drivers/usb/gadget/legacy/raw_gadget.c @@ -564,9 +564,12 @@ static int raw_ioctl_event_fetch(struct raw_dev *dev, unsigned long value) return -ENODEV; } length = min(arg.length, event->length); - if (copy_to_user((void __user *)value, event, sizeof(*event) + length)) + if (copy_to_user((void __user *)value, event, sizeof(*event) + length)) { + kfree(event); return -EFAULT; + }
+ kfree(event); return 0; }
From: Colin Ian King colin.king@canonical.com
[ Upstream commit e3e40312567087fbe6880f316cb2b0e1f3d8a82c ]
More recent libc implementations are now using openat/openat2 system calls so also add do_sys_openat2 to the tracing so that the test passes on these systems because do_sys_open may not be called.
Thanks to Masami Hiramatsu for the help on getting this fix to work correctly.
Signed-off-by: Colin Ian King colin.king@canonical.com Acked-by: Masami Hiramatsu mhiramat@kernel.org Acked-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../selftests/ftrace/test.d/kprobe/kprobe_args_user.tc | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_user.tc b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_user.tc index a30a9c07290d0..d25d01a197781 100644 --- a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_user.tc +++ b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_user.tc @@ -9,12 +9,16 @@ grep -A10 "fetcharg:" README | grep -q '[u]<offset>' || exit_unsupported :;: "user-memory access syntax and ustring working on user memory";: echo 'p:myevent do_sys_open path=+0($arg2):ustring path2=+u0($arg2):string' \ > kprobe_events +echo 'p:myevent2 do_sys_openat2 path=+0($arg2):ustring path2=+u0($arg2):string' \ + >> kprobe_events
grep myevent kprobe_events | \ grep -q 'path=+0($arg2):ustring path2=+u0($arg2):string' echo 1 > events/kprobes/myevent/enable +echo 1 > events/kprobes/myevent2/enable echo > /dev/null echo 0 > events/kprobes/myevent/enable +echo 0 > events/kprobes/myevent2/enable
grep myevent trace | grep -q 'path="/dev/null" path2="/dev/null"'
From: Tommi Rantala tommi.t.rantala@nokia.com
[ Upstream commit 1948172fdba5ad643529ddcd00a601c0caa913ed ]
Drop unneeded <linux/wait.h> header inclusion to fix pidfd compilation errors seen in Fedora 32:
In file included from pidfd_open_test.c:9: ../../../../usr/include/linux/wait.h:17:16: error: expected identifier before numeric constant 17 | #define P_ALL 0 | ^
Signed-off-by: Tommi Rantala tommi.t.rantala@nokia.com Reviewed-by: Kees Cook keescook@chromium.org Acked-by: Christian Brauner christian.brauner@ubuntu.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/pidfd/pidfd_open_test.c | 1 - tools/testing/selftests/pidfd/pidfd_poll_test.c | 1 - 2 files changed, 2 deletions(-)
diff --git a/tools/testing/selftests/pidfd/pidfd_open_test.c b/tools/testing/selftests/pidfd/pidfd_open_test.c index b9fe75fc3e517..8a59438ccc78b 100644 --- a/tools/testing/selftests/pidfd/pidfd_open_test.c +++ b/tools/testing/selftests/pidfd/pidfd_open_test.c @@ -6,7 +6,6 @@ #include <inttypes.h> #include <limits.h> #include <linux/types.h> -#include <linux/wait.h> #include <sched.h> #include <signal.h> #include <stdbool.h> diff --git a/tools/testing/selftests/pidfd/pidfd_poll_test.c b/tools/testing/selftests/pidfd/pidfd_poll_test.c index 4b115444dfe90..6108112753573 100644 --- a/tools/testing/selftests/pidfd/pidfd_poll_test.c +++ b/tools/testing/selftests/pidfd/pidfd_poll_test.c @@ -3,7 +3,6 @@ #define _GNU_SOURCE #include <errno.h> #include <linux/types.h> -#include <linux/wait.h> #include <poll.h> #include <signal.h> #include <stdbool.h>
From: Kai-Heng Feng kai.heng.feng@canonical.com
[ Upstream commit f5dac54d9d93826a776dffc848df76746f7135bb ]
Both pm_runtime_force_suspend() and pm_runtime_force_resume() have some implicit checks, so it can make code flow more straightforward if we separate runtime and system suspend callbacks.
High Definition Audio Specification, 4.5.9.3 Codec Wake From System S3 states that codec can wake the system up from S3 if WAKEEN is toggled. Since HDA controller has different wakeup settings for runtime and system susend, we also need to explicitly disable direct-complete which can be enabled automatically by PCI core. In addition to that, avoid waking up codec if runtime resume is for system suspend, to not break direct-complete for codecs.
While at it, also remove AZX_DCAPS_SUSPEND_SPURIOUS_WAKEUP, as the original bug commit a6630529aecb ("ALSA: hda: Workaround for spurious wakeups on some Intel platforms") solves doesn't happen with this patch.
Signed-off-by: Kai-Heng Feng kai.heng.feng@canonical.com Link: https://lore.kernel.org/r/20201027130038.16463-3-kai.heng.feng@canonical.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/hda/hda_controller.h | 3 +- sound/pci/hda/hda_intel.c | 62 +++++++++++++++++++--------------- 2 files changed, 36 insertions(+), 29 deletions(-)
diff --git a/sound/pci/hda/hda_controller.h b/sound/pci/hda/hda_controller.h index be63ead8161f8..68f9668788ea2 100644 --- a/sound/pci/hda/hda_controller.h +++ b/sound/pci/hda/hda_controller.h @@ -41,7 +41,7 @@ /* 24 unused */ #define AZX_DCAPS_COUNT_LPIB_DELAY (1 << 25) /* Take LPIB as delay */ #define AZX_DCAPS_PM_RUNTIME (1 << 26) /* runtime PM support */ -#define AZX_DCAPS_SUSPEND_SPURIOUS_WAKEUP (1 << 27) /* Workaround for spurious wakeups after suspend */ +/* 27 unused */ #define AZX_DCAPS_CORBRP_SELF_CLEAR (1 << 28) /* CORBRP clears itself after reset */ #define AZX_DCAPS_NO_MSI64 (1 << 29) /* Stick to 32-bit MSIs */ #define AZX_DCAPS_SEPARATE_STREAM_TAG (1 << 30) /* capture and playback use separate stream tag */ @@ -143,6 +143,7 @@ struct azx { unsigned int align_buffer_size:1; unsigned int region_requested:1; unsigned int disabled:1; /* disabled by vga_switcheroo */ + unsigned int pm_prepared:1;
/* GTS present */ unsigned int gts_present:1; diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index 476a8b871daa1..268e9ead9795f 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -297,8 +297,7 @@ enum { /* PCH for HSW/BDW; with runtime PM */ /* no i915 binding for this as HSW/BDW has another controller for HDMI */ #define AZX_DCAPS_INTEL_PCH \ - (AZX_DCAPS_INTEL_PCH_BASE | AZX_DCAPS_PM_RUNTIME |\ - AZX_DCAPS_SUSPEND_SPURIOUS_WAKEUP) + (AZX_DCAPS_INTEL_PCH_BASE | AZX_DCAPS_PM_RUNTIME)
/* HSW HDMI */ #define AZX_DCAPS_INTEL_HASWELL \ @@ -984,7 +983,7 @@ static void __azx_runtime_suspend(struct azx *chip) display_power(chip, false); }
-static void __azx_runtime_resume(struct azx *chip, bool from_rt) +static void __azx_runtime_resume(struct azx *chip) { struct hda_intel *hda = container_of(chip, struct hda_intel, chip); struct hdac_bus *bus = azx_bus(chip); @@ -1001,7 +1000,8 @@ static void __azx_runtime_resume(struct azx *chip, bool from_rt) azx_init_pci(chip); hda_intel_init_chip(chip, true);
- if (from_rt) { + /* Avoid codec resume if runtime resume is for system suspend */ + if (!chip->pm_prepared) { list_for_each_codec(codec, &chip->bus) { if (codec->relaxed_resume) continue; @@ -1017,6 +1017,29 @@ static void __azx_runtime_resume(struct azx *chip, bool from_rt) }
#ifdef CONFIG_PM_SLEEP +static int azx_prepare(struct device *dev) +{ + struct snd_card *card = dev_get_drvdata(dev); + struct azx *chip; + + chip = card->private_data; + chip->pm_prepared = 1; + + /* HDA controller always requires different WAKEEN for runtime suspend + * and system suspend, so don't use direct-complete here. + */ + return 0; +} + +static void azx_complete(struct device *dev) +{ + struct snd_card *card = dev_get_drvdata(dev); + struct azx *chip; + + chip = card->private_data; + chip->pm_prepared = 0; +} + static int azx_suspend(struct device *dev) { struct snd_card *card = dev_get_drvdata(dev); @@ -1028,15 +1051,7 @@ static int azx_suspend(struct device *dev)
chip = card->private_data; bus = azx_bus(chip); - snd_power_change_state(card, SNDRV_CTL_POWER_D3hot); - /* An ugly workaround: direct call of __azx_runtime_suspend() and - * __azx_runtime_resume() for old Intel platforms that suffer from - * spurious wakeups after S3 suspend - */ - if (chip->driver_caps & AZX_DCAPS_SUSPEND_SPURIOUS_WAKEUP) - __azx_runtime_suspend(chip); - else - pm_runtime_force_suspend(dev); + __azx_runtime_suspend(chip); if (bus->irq >= 0) { free_irq(bus->irq, chip); bus->irq = -1; @@ -1065,11 +1080,7 @@ static int azx_resume(struct device *dev) if (azx_acquire_irq(chip, 1) < 0) return -EIO;
- if (chip->driver_caps & AZX_DCAPS_SUSPEND_SPURIOUS_WAKEUP) - __azx_runtime_resume(chip, false); - else - pm_runtime_force_resume(dev); - snd_power_change_state(card, SNDRV_CTL_POWER_D0); + __azx_runtime_resume(chip);
trace_azx_resume(chip); return 0; @@ -1117,10 +1128,7 @@ static int azx_runtime_suspend(struct device *dev) chip = card->private_data;
/* enable controller wake up event */ - if (snd_power_get_state(card) == SNDRV_CTL_POWER_D0) { - azx_writew(chip, WAKEEN, azx_readw(chip, WAKEEN) | - STATESTS_INT_MASK); - } + azx_writew(chip, WAKEEN, azx_readw(chip, WAKEEN) | STATESTS_INT_MASK);
__azx_runtime_suspend(chip); trace_azx_runtime_suspend(chip); @@ -1131,18 +1139,14 @@ static int azx_runtime_resume(struct device *dev) { struct snd_card *card = dev_get_drvdata(dev); struct azx *chip; - bool from_rt = snd_power_get_state(card) == SNDRV_CTL_POWER_D0;
if (!azx_is_pm_ready(card)) return 0; chip = card->private_data; - __azx_runtime_resume(chip, from_rt); + __azx_runtime_resume(chip);
/* disable controller Wake Up event*/ - if (from_rt) { - azx_writew(chip, WAKEEN, azx_readw(chip, WAKEEN) & - ~STATESTS_INT_MASK); - } + azx_writew(chip, WAKEEN, azx_readw(chip, WAKEEN) & ~STATESTS_INT_MASK);
trace_azx_runtime_resume(chip); return 0; @@ -1176,6 +1180,8 @@ static int azx_runtime_idle(struct device *dev) static const struct dev_pm_ops azx_pm = { SET_SYSTEM_SLEEP_PM_OPS(azx_suspend, azx_resume) #ifdef CONFIG_PM_SLEEP + .prepare = azx_prepare, + .complete = azx_complete, .freeze_noirq = azx_freeze_noirq, .thaw_noirq = azx_thaw_noirq, #endif
From: Kai-Heng Feng kai.heng.feng@canonical.com
[ Upstream commit 9fc149c3bce7bdbb94948a8e6bd025e3b3538603 ]
The broken jack detection should be fixed by commit a6e7d0a4bdb0 ("ALSA: hda: fix jack detection with Realtek codecs when in D3"), let's try enabling runtime PM by default again.
Signed-off-by: Kai-Heng Feng kai.heng.feng@canonical.com Link: https://lore.kernel.org/r/20201027130038.16463-4-kai.heng.feng@canonical.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/hda/hda_intel.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index 268e9ead9795f..0ae0290eb2bfd 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -2361,6 +2361,7 @@ static int azx_probe_continue(struct azx *chip)
if (azx_has_pm_runtime(chip)) { pm_runtime_use_autosuspend(&pci->dev); + pm_runtime_allow(&pci->dev); pm_runtime_put_autosuspend(&pci->dev); }
From: Joerg Roedel jroedel@suse.de
[ Upstream commit 3ad84246a4097010f3ae3d6944120c0be00e9e7a ]
Introduce sev_status and initialize it together with sme_me_mask to have an indicator which SEV features are enabled.
Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Borislav Petkov bp@suse.de Reviewed-by: Tom Lendacky thomas.lendacky@amd.com Link: https://lkml.kernel.org/r/20201028164659.27002-2-joro@8bytes.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/boot/compressed/mem_encrypt.S | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/arch/x86/boot/compressed/mem_encrypt.S b/arch/x86/boot/compressed/mem_encrypt.S index dd07e7b41b115..3092ae173f94e 100644 --- a/arch/x86/boot/compressed/mem_encrypt.S +++ b/arch/x86/boot/compressed/mem_encrypt.S @@ -81,6 +81,19 @@ SYM_FUNC_START(set_sev_encryption_mask)
bts %rax, sme_me_mask(%rip) /* Create the encryption mask */
+ /* + * Read MSR_AMD64_SEV again and store it to sev_status. Can't do this in + * get_sev_encryption_bit() because this function is 32-bit code and + * shared between 64-bit and 32-bit boot path. + */ + movl $MSR_AMD64_SEV, %ecx /* Read the SEV MSR */ + rdmsr + + /* Store MSR value in sev_status */ + shlq $32, %rdx + orq %rdx, %rax + movq %rax, sev_status(%rip) + .Lno_sev_mask: movq %rbp, %rsp /* Restore original stack pointer */
@@ -96,5 +109,6 @@ SYM_FUNC_END(set_sev_encryption_mask)
#ifdef CONFIG_AMD_MEM_ENCRYPT .balign 8 -SYM_DATA(sme_me_mask, .quad 0) +SYM_DATA(sme_me_mask, .quad 0) +SYM_DATA(sev_status, .quad 0) #endif
From: Bob Peterson rpeterso@redhat.com
[ Upstream commit d0f17d3883f1e3f085d38572c2ea8edbd5150172 ]
Function gfs2_clear_rgrpd calls kfree(rgd->rd_bits) before calling return_all_reservations, but return_all_reservations still dereferences rgd->rd_bits in __rs_deltree. Fix that by moving the call to kfree below the call to return_all_reservations.
Signed-off-by: Bob Peterson rpeterso@redhat.com Signed-off-by: Andreas Gruenbacher agruenba@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/gfs2/rgrp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c index 1bba5a9d45fa3..1d65db1b3914a 100644 --- a/fs/gfs2/rgrp.c +++ b/fs/gfs2/rgrp.c @@ -719,9 +719,9 @@ void gfs2_clear_rgrpd(struct gfs2_sbd *sdp) }
gfs2_free_clones(rgd); + return_all_reservations(rgd); kfree(rgd->rd_bits); rgd->rd_bits = NULL; - return_all_reservations(rgd); kmem_cache_free(gfs2_rgrpd_cachep, rgd); } }
From: Bob Peterson rpeterso@redhat.com
[ Upstream commit a9dd945ccef07a904e412f208f8de708a3d7159e ]
Gfs2 creates an address space for its rgrps called sd_aspace, but it never called truncate_inode_pages_final on it. This confused vfs greatly which tried to reference the address space after gfs2 had freed the superblock that contained it.
This patch adds a call to truncate_inode_pages_final for sd_aspace, thus avoiding the use-after-free.
Signed-off-by: Bob Peterson rpeterso@redhat.com Signed-off-by: Andreas Gruenbacher agruenba@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/gfs2/super.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c index 32ae1a7cdaed8..831f6e31d6821 100644 --- a/fs/gfs2/super.c +++ b/fs/gfs2/super.c @@ -732,6 +732,7 @@ restart: gfs2_jindex_free(sdp); /* Take apart glock structures and buffer lists */ gfs2_gl_hash_clear(sdp); + truncate_inode_pages_final(&sdp->sd_aspace); gfs2_delete_debugfs_file(sdp); /* Unmount the locking protocol */ gfs2_lm_unmount(sdp);
From: Bob Peterson rpeterso@redhat.com
[ Upstream commit c5c68724696e7d2f8db58a5fce3673208d35c485 ]
Before this patch, gfs2_fitrim was not properly checking for a "live" file system. If the file system had something to trim and the file system was read-only (or spectator) it would start the trim, but when it starts the transaction, gfs2_trans_begin returns -EROFS (read-only file system) and it errors out. However, if the file system was already trimmed so there's no work to do, it never called gfs2_trans_begin. That code is bypassed so it never returns the error. Instead, it returns a good return code with 0 work. All this makes for inconsistent behavior: The same fstrim command can return -EROFS in one case and 0 in another. This tripped up xfstests generic/537 which reports the error as:
+fstrim with unrecovered metadata just ate your filesystem
This patch adds a check for a "live" (iow, active journal, iow, RW) file system, and if not, returns the error properly.
Signed-off-by: Bob Peterson rpeterso@redhat.com Signed-off-by: Andreas Gruenbacher agruenba@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/gfs2/rgrp.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c index 1d65db1b3914a..ac306895bbbcc 100644 --- a/fs/gfs2/rgrp.c +++ b/fs/gfs2/rgrp.c @@ -1374,6 +1374,9 @@ int gfs2_fitrim(struct file *filp, void __user *argp) if (!capable(CAP_SYS_ADMIN)) return -EPERM;
+ if (!test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags)) + return -EROFS; + if (!blk_queue_discard(q)) return -EOPNOTSUPP;
From: Keita Suzuki keitasuzuki.park@sslab.ics.keio.ac.jp
[ Upstream commit af61bc1e33d2c0ec22612b46050f5b58ac56a962 ]
When hpsa_scsi_add_host() fails, h->lastlogicals is leaked since it is missing a free() in the error handler.
Fix this by adding free() when hpsa_scsi_add_host() fails.
Link: https://lore.kernel.org/r/20201027073125.14229-1-keitasuzuki.park@sslab.ics.... Tested-by: Don Brace don.brace@microchip.com Acked-by: Don Brace don.brace@microchip.com Signed-off-by: Keita Suzuki keitasuzuki.park@sslab.ics.keio.ac.jp Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/hpsa.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c index 48d5da59262b4..aed59ec20ad9e 100644 --- a/drivers/scsi/hpsa.c +++ b/drivers/scsi/hpsa.c @@ -8854,7 +8854,7 @@ reinit_after_soft_reset: /* hook into SCSI subsystem */ rc = hpsa_scsi_add_host(h); if (rc) - goto clean7; /* perf, sg, cmd, irq, shost, pci, lu, aer/h */ + goto clean8; /* lastlogicals, perf, sg, cmd, irq, shost, pci, lu, aer/h */
/* Monitor the controller for firmware lockups */ h->heartbeat_sample_interval = HEARTBEAT_SAMPLE_INTERVAL; @@ -8869,6 +8869,8 @@ reinit_after_soft_reset: HPSA_EVENT_MONITOR_INTERVAL); return 0;
+clean8: /* lastlogicals, perf, sg, cmd, irq, shost, pci, lu, aer/h */ + kfree(h->lastlogicals); clean7: /* perf, sg, cmd, irq, shost, pci, lu, aer/h */ hpsa_free_performant_mode(h); h->access.set_intr_mask(h, HPSA_INTR_OFF);
From: Evan Quan evan.quan@amd.com
[ Upstream commit 253475c455eb5f8da34faa1af92709e7bb414624 ]
This can address the random SDMA hang after pci config reset seen on Hawaii.
Signed-off-by: Evan Quan evan.quan@amd.com Tested-by: Sandeep Raghuraman sandy.8925@gmail.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 27 ++++++++++++--------------- 1 file changed, 12 insertions(+), 15 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c index 20f108818b2b9..a3c3fe96515f2 100644 --- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c +++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c @@ -1071,22 +1071,19 @@ static int cik_sdma_soft_reset(void *handle) { u32 srbm_soft_reset = 0; struct amdgpu_device *adev = (struct amdgpu_device *)handle; - u32 tmp = RREG32(mmSRBM_STATUS2); + u32 tmp;
- if (tmp & SRBM_STATUS2__SDMA_BUSY_MASK) { - /* sdma0 */ - tmp = RREG32(mmSDMA0_F32_CNTL + SDMA0_REGISTER_OFFSET); - tmp |= SDMA0_F32_CNTL__HALT_MASK; - WREG32(mmSDMA0_F32_CNTL + SDMA0_REGISTER_OFFSET, tmp); - srbm_soft_reset |= SRBM_SOFT_RESET__SOFT_RESET_SDMA_MASK; - } - if (tmp & SRBM_STATUS2__SDMA1_BUSY_MASK) { - /* sdma1 */ - tmp = RREG32(mmSDMA0_F32_CNTL + SDMA1_REGISTER_OFFSET); - tmp |= SDMA0_F32_CNTL__HALT_MASK; - WREG32(mmSDMA0_F32_CNTL + SDMA1_REGISTER_OFFSET, tmp); - srbm_soft_reset |= SRBM_SOFT_RESET__SOFT_RESET_SDMA1_MASK; - } + /* sdma0 */ + tmp = RREG32(mmSDMA0_F32_CNTL + SDMA0_REGISTER_OFFSET); + tmp |= SDMA0_F32_CNTL__HALT_MASK; + WREG32(mmSDMA0_F32_CNTL + SDMA0_REGISTER_OFFSET, tmp); + srbm_soft_reset |= SRBM_SOFT_RESET__SOFT_RESET_SDMA_MASK; + + /* sdma1 */ + tmp = RREG32(mmSDMA0_F32_CNTL + SDMA1_REGISTER_OFFSET); + tmp |= SDMA0_F32_CNTL__HALT_MASK; + WREG32(mmSDMA0_F32_CNTL + SDMA1_REGISTER_OFFSET, tmp); + srbm_soft_reset |= SRBM_SOFT_RESET__SOFT_RESET_SDMA1_MASK;
if (srbm_soft_reset) { tmp = RREG32(mmSRBM_SOFT_RESET);
From: Evan Quan evan.quan@amd.com
[ Upstream commit c108725ef589af462be6b957f63c7925e38213eb ]
Correct some registers bitmasks and add mmBIOS_SCRATCH_7 reset.
Signed-off-by: Evan Quan evan.quan@amd.com Tested-by: Sandeep Raghuraman sandy.8925@gmail.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/powerplay/hwmgr/ci_baco.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/ci_baco.c b/drivers/gpu/drm/amd/powerplay/hwmgr/ci_baco.c index 3be40114e63d2..45f608838f6eb 100644 --- a/drivers/gpu/drm/amd/powerplay/hwmgr/ci_baco.c +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/ci_baco.c @@ -142,12 +142,12 @@ static const struct baco_cmd_entry exit_baco_tbl[] = { CMD_READMODIFYWRITE, mmBACO_CNTL, BACO_CNTL__BACO_BCLK_OFF_MASK, BACO_CNTL__BACO_BCLK_OFF__SHIFT, 0, 0x00 }, { CMD_READMODIFYWRITE, mmBACO_CNTL, BACO_CNTL__BACO_POWER_OFF_MASK, BACO_CNTL__BACO_POWER_OFF__SHIFT, 0, 0x00 }, { CMD_DELAY_MS, 0, 0, 0, 20, 0 }, - { CMD_WAITFOR, mmBACO_CNTL, BACO_CNTL__PWRGOOD_BF_MASK, 0, 0xffffffff, 0x20 }, + { CMD_WAITFOR, mmBACO_CNTL, BACO_CNTL__PWRGOOD_BF_MASK, 0, 0xffffffff, 0x200 }, { CMD_READMODIFYWRITE, mmBACO_CNTL, BACO_CNTL__BACO_ISO_DIS_MASK, BACO_CNTL__BACO_ISO_DIS__SHIFT, 0, 0x01 }, - { CMD_WAITFOR, mmBACO_CNTL, BACO_CNTL__PWRGOOD_MASK, 0, 5, 0x1c }, + { CMD_WAITFOR, mmBACO_CNTL, BACO_CNTL__PWRGOOD_MASK, 0, 5, 0x1c00 }, { CMD_READMODIFYWRITE, mmBACO_CNTL, BACO_CNTL__BACO_ANA_ISO_DIS_MASK, BACO_CNTL__BACO_ANA_ISO_DIS__SHIFT, 0, 0x01 }, { CMD_READMODIFYWRITE, mmBACO_CNTL, BACO_CNTL__BACO_RESET_EN_MASK, BACO_CNTL__BACO_RESET_EN__SHIFT, 0, 0x00 }, - { CMD_WAITFOR, mmBACO_CNTL, BACO_CNTL__RCU_BIF_CONFIG_DONE_MASK, 0, 5, 0x10 }, + { CMD_WAITFOR, mmBACO_CNTL, BACO_CNTL__RCU_BIF_CONFIG_DONE_MASK, 0, 5, 0x100 }, { CMD_READMODIFYWRITE, mmBACO_CNTL, BACO_CNTL__BACO_EN_MASK, BACO_CNTL__BACO_EN__SHIFT, 0, 0x00 }, { CMD_WAITFOR, mmBACO_CNTL, BACO_CNTL__BACO_MODE_MASK, 0, 0xffffffff, 0x00 } }; @@ -155,6 +155,7 @@ static const struct baco_cmd_entry exit_baco_tbl[] = static const struct baco_cmd_entry clean_baco_tbl[] = { { CMD_WRITE, mmBIOS_SCRATCH_6, 0, 0, 0, 0 }, + { CMD_WRITE, mmBIOS_SCRATCH_7, 0, 0, 0, 0 }, { CMD_WRITE, mmCP_PFP_UCODE_ADDR, 0, 0, 0, 0 } };
From: Evan Quan evan.quan@amd.com
[ Upstream commit 277b080f98803cb73a83fb234f0be83a10e63958 ]
So that the succeeding resume can be performed based on a clean state.
Signed-off-by: Evan Quan evan.quan@amd.com Tested-by: Sandeep Raghuraman sandy.8925@gmail.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 4 ++++ drivers/gpu/drm/amd/powerplay/inc/hwmgr.h | 1 + drivers/gpu/drm/amd/powerplay/inc/smumgr.h | 2 ++ .../gpu/drm/amd/powerplay/smumgr/ci_smumgr.c | 24 +++++++++++++++++++ drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c | 8 +++++++ 5 files changed, 39 insertions(+)
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c index fc63d9e32e1f8..c8ee931075e52 100644 --- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c @@ -1541,6 +1541,10 @@ static int smu7_disable_dpm_tasks(struct pp_hwmgr *hwmgr) PP_ASSERT_WITH_CODE((tmp_result == 0), "Failed to reset to default!", result = tmp_result);
+ tmp_result = smum_stop_smc(hwmgr); + PP_ASSERT_WITH_CODE((tmp_result == 0), + "Failed to stop smc!", result = tmp_result); + tmp_result = smu7_force_switch_to_arbf0(hwmgr); PP_ASSERT_WITH_CODE((tmp_result == 0), "Failed to force to switch arbf0!", result = tmp_result); diff --git a/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h b/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h index 15ed6cbdf3660..91cdc53472f01 100644 --- a/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h +++ b/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h @@ -229,6 +229,7 @@ struct pp_smumgr_func { bool (*is_hw_avfs_present)(struct pp_hwmgr *hwmgr); int (*update_dpm_settings)(struct pp_hwmgr *hwmgr, void *profile_setting); int (*smc_table_manager)(struct pp_hwmgr *hwmgr, uint8_t *table, uint16_t table_id, bool rw); /*rw: true for read, false for write */ + int (*stop_smc)(struct pp_hwmgr *hwmgr); };
struct pp_hwmgr_func { diff --git a/drivers/gpu/drm/amd/powerplay/inc/smumgr.h b/drivers/gpu/drm/amd/powerplay/inc/smumgr.h index ad100b533d049..5f46f1a4f38ef 100644 --- a/drivers/gpu/drm/amd/powerplay/inc/smumgr.h +++ b/drivers/gpu/drm/amd/powerplay/inc/smumgr.h @@ -113,4 +113,6 @@ extern int smum_update_dpm_settings(struct pp_hwmgr *hwmgr, void *profile_settin
extern int smum_smc_table_manager(struct pp_hwmgr *hwmgr, uint8_t *table, uint16_t table_id, bool rw);
+extern int smum_stop_smc(struct pp_hwmgr *hwmgr); + #endif diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c b/drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c index e4d1f3d66ef48..09128122b4932 100644 --- a/drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c +++ b/drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c @@ -2939,6 +2939,29 @@ static int ci_update_smc_table(struct pp_hwmgr *hwmgr, uint32_t type) return 0; }
+static void ci_reset_smc(struct pp_hwmgr *hwmgr) +{ + PHM_WRITE_INDIRECT_FIELD(hwmgr->device, CGS_IND_REG__SMC, + SMC_SYSCON_RESET_CNTL, + rst_reg, 1); +} + + +static void ci_stop_smc_clock(struct pp_hwmgr *hwmgr) +{ + PHM_WRITE_INDIRECT_FIELD(hwmgr->device, CGS_IND_REG__SMC, + SMC_SYSCON_CLOCK_CNTL_0, + ck_disable, 1); +} + +static int ci_stop_smc(struct pp_hwmgr *hwmgr) +{ + ci_reset_smc(hwmgr); + ci_stop_smc_clock(hwmgr); + + return 0; +} + const struct pp_smumgr_func ci_smu_funcs = { .name = "ci_smu", .smu_init = ci_smu_init, @@ -2964,4 +2987,5 @@ const struct pp_smumgr_func ci_smu_funcs = { .is_dpm_running = ci_is_dpm_running, .update_dpm_settings = ci_update_dpm_settings, .update_smc_table = ci_update_smc_table, + .stop_smc = ci_stop_smc, }; diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c b/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c index b6fb480668416..b6921db3c1305 100644 --- a/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c +++ b/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c @@ -245,3 +245,11 @@ int smum_smc_table_manager(struct pp_hwmgr *hwmgr, uint8_t *table, uint16_t tabl
return -EINVAL; } + +int smum_stop_smc(struct pp_hwmgr *hwmgr) +{ + if (hwmgr->smumgr_funcs->stop_smc) + return hwmgr->smumgr_funcs->stop_smc(hwmgr); + + return 0; +}
From: Evan Quan evan.quan@amd.com
[ Upstream commit 786436b453001dafe81025389f96bf9dac1e9690 ]
This reverts commit f87812284172a9809820d10143b573d833cd3f75 ("drm/amdgpu: Fix bug where DPM is not enabled after hibernate and resume"). It was intended to fix Hawaii S4(hibernation) issue but break S3. As ixFEATURE_STATUS is filled with garbage data on resume which can be only cleared by reloading smc firmware(but that will involve many changes). So, we will revert this S4 fix and seek a new way.
Signed-off-by: Evan Quan evan.quan@amd.com Tested-by: Sandeep Raghuraman sandy.8925@gmail.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c b/drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c index 09128122b4932..329bf4d44bbce 100644 --- a/drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c +++ b/drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c @@ -2726,10 +2726,7 @@ static int ci_initialize_mc_reg_table(struct pp_hwmgr *hwmgr)
static bool ci_is_dpm_running(struct pp_hwmgr *hwmgr) { - return (1 == PHM_READ_INDIRECT_FIELD(hwmgr->device, - CGS_IND_REG__SMC, FEATURE_STATUS, - VOLTAGE_CONTROLLER_ON)) - ? true : false; + return ci_is_smc_ram_running(hwmgr); }
static int ci_smu_init(struct pp_hwmgr *hwmgr)
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit 14f46c1e5108696ec1e5a129e838ecedf108c7bf ]
When ieee80211_skb_resize() is called from ieee80211_build_hdr() the skb has no 802.11 header yet, in fact it consist only of the payload as the ethernet frame is removed. As such, we're using the payload data for ieee80211_is_mgmt(), which is of course completely wrong. This didn't really hurt us because these are always data frames, so we could only have added more tailroom than we needed if we determined it was a management frame and sdata->crypto_tx_tailroom_needed_cnt was false.
However, syzbot found that of course there need not be any payload, so we're using at best uninitialized memory for the check.
Fix this to pass explicitly the kind of frame that we have instead of checking there, by replacing the "bool may_encrypt" argument with an argument that can carry the three possible states - it's not going to be encrypted, it's a management frame, or it's a data frame (and then we check sdata->crypto_tx_tailroom_needed_cnt).
Reported-by: syzbot+32fd1a1bfe355e93f1e2@syzkaller.appspotmail.com Signed-off-by: Johannes Berg johannes.berg@intel.com Link: https://lore.kernel.org/r/20201009132538.e1fd7f802947.I799b288466ea2815f9d4c... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/tx.c | 37 ++++++++++++++++++++++++------------- 1 file changed, 24 insertions(+), 13 deletions(-)
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c index aa486e202a57c..ca1e8cd75b22b 100644 --- a/net/mac80211/tx.c +++ b/net/mac80211/tx.c @@ -1938,19 +1938,24 @@ static bool ieee80211_tx(struct ieee80211_sub_if_data *sdata,
/* device xmit handlers */
+enum ieee80211_encrypt { + ENCRYPT_NO, + ENCRYPT_MGMT, + ENCRYPT_DATA, +}; + static int ieee80211_skb_resize(struct ieee80211_sub_if_data *sdata, struct sk_buff *skb, - int head_need, bool may_encrypt) + int head_need, + enum ieee80211_encrypt encrypt) { struct ieee80211_local *local = sdata->local; - struct ieee80211_hdr *hdr; bool enc_tailroom; int tail_need = 0;
- hdr = (struct ieee80211_hdr *) skb->data; - enc_tailroom = may_encrypt && - (sdata->crypto_tx_tailroom_needed_cnt || - ieee80211_is_mgmt(hdr->frame_control)); + enc_tailroom = encrypt == ENCRYPT_MGMT || + (encrypt == ENCRYPT_DATA && + sdata->crypto_tx_tailroom_needed_cnt);
if (enc_tailroom) { tail_need = IEEE80211_ENCRYPT_TAILROOM; @@ -1981,23 +1986,29 @@ void ieee80211_xmit(struct ieee80211_sub_if_data *sdata, { struct ieee80211_local *local = sdata->local; struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb); - struct ieee80211_hdr *hdr; + struct ieee80211_hdr *hdr = (struct ieee80211_hdr *) skb->data; int headroom; - bool may_encrypt; + enum ieee80211_encrypt encrypt;
- may_encrypt = !(info->flags & IEEE80211_TX_INTFL_DONT_ENCRYPT); + if (info->flags & IEEE80211_TX_INTFL_DONT_ENCRYPT) + encrypt = ENCRYPT_NO; + else if (ieee80211_is_mgmt(hdr->frame_control)) + encrypt = ENCRYPT_MGMT; + else + encrypt = ENCRYPT_DATA;
headroom = local->tx_headroom; - if (may_encrypt) + if (encrypt != ENCRYPT_NO) headroom += sdata->encrypt_headroom; headroom -= skb_headroom(skb); headroom = max_t(int, 0, headroom);
- if (ieee80211_skb_resize(sdata, skb, headroom, may_encrypt)) { + if (ieee80211_skb_resize(sdata, skb, headroom, encrypt)) { ieee80211_free_txskb(&local->hw, skb); return; }
+ /* reload after potential resize */ hdr = (struct ieee80211_hdr *) skb->data; info->control.vif = &sdata->vif;
@@ -2822,7 +2833,7 @@ static struct sk_buff *ieee80211_build_hdr(struct ieee80211_sub_if_data *sdata, head_need += sdata->encrypt_headroom; head_need += local->tx_headroom; head_need = max_t(int, 0, head_need); - if (ieee80211_skb_resize(sdata, skb, head_need, true)) { + if (ieee80211_skb_resize(sdata, skb, head_need, ENCRYPT_DATA)) { ieee80211_free_txskb(&local->hw, skb); skb = NULL; return ERR_PTR(-ENOMEM); @@ -3496,7 +3507,7 @@ static bool ieee80211_xmit_fast(struct ieee80211_sub_if_data *sdata, if (unlikely(ieee80211_skb_resize(sdata, skb, max_t(int, extra_head + hw_headroom - skb_headroom(skb), 0), - false))) { + ENCRYPT_NO))) { kfree_skb(skb); return true; }
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit 9bdaf3b91efd229dd272b228e13df10310c80d19 ]
There's a race condition in the netdev registration in that NETDEV_REGISTER actually happens after the netdev is available, and so if we initialize things only there, we might get called with an uninitialized wdev through nl80211 - not using a wdev but using a netdev interface index.
I found this while looking into a syzbot report, but it doesn't really seem to be related, and unfortunately there's no repro for it (yet). I can't (yet) explain how it managed to get into cfg80211_release_pmsr() from nl80211_netlink_notify() without the wdev having been initialized, as the latter only iterates the wdevs that are linked into the rdev, which even without the change here happened after init.
However, looking at this, it seems fairly clear that the init needs to be done earlier, otherwise we might even re-init on a netns move, when data might still be pending.
Signed-off-by: Johannes Berg johannes.berg@intel.com Link: https://lore.kernel.org/r/20201009135821.fdcbba3aad65.Ie9201d91dbcb7da323188... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/wireless/core.c | 57 +++++++++++++++++++++++------------------- net/wireless/core.h | 5 ++-- net/wireless/nl80211.c | 3 ++- 3 files changed, 36 insertions(+), 29 deletions(-)
diff --git a/net/wireless/core.c b/net/wireless/core.c index 354b0ccbdc240..e025493171262 100644 --- a/net/wireless/core.c +++ b/net/wireless/core.c @@ -1248,8 +1248,7 @@ void cfg80211_stop_iface(struct wiphy *wiphy, struct wireless_dev *wdev, } EXPORT_SYMBOL(cfg80211_stop_iface);
-void cfg80211_init_wdev(struct cfg80211_registered_device *rdev, - struct wireless_dev *wdev) +void cfg80211_init_wdev(struct wireless_dev *wdev) { mutex_init(&wdev->mtx); INIT_LIST_HEAD(&wdev->event_list); @@ -1260,6 +1259,30 @@ void cfg80211_init_wdev(struct cfg80211_registered_device *rdev, spin_lock_init(&wdev->pmsr_lock); INIT_WORK(&wdev->pmsr_free_wk, cfg80211_pmsr_free_wk);
+#ifdef CONFIG_CFG80211_WEXT + wdev->wext.default_key = -1; + wdev->wext.default_mgmt_key = -1; + wdev->wext.connect.auth_type = NL80211_AUTHTYPE_AUTOMATIC; +#endif + + if (wdev->wiphy->flags & WIPHY_FLAG_PS_ON_BY_DEFAULT) + wdev->ps = true; + else + wdev->ps = false; + /* allow mac80211 to determine the timeout */ + wdev->ps_timeout = -1; + + if ((wdev->iftype == NL80211_IFTYPE_STATION || + wdev->iftype == NL80211_IFTYPE_P2P_CLIENT || + wdev->iftype == NL80211_IFTYPE_ADHOC) && !wdev->use_4addr) + wdev->netdev->priv_flags |= IFF_DONT_BRIDGE; + + INIT_WORK(&wdev->disconnect_wk, cfg80211_autodisconnect_wk); +} + +void cfg80211_register_wdev(struct cfg80211_registered_device *rdev, + struct wireless_dev *wdev) +{ /* * We get here also when the interface changes network namespaces, * as it's registered into the new one, but we don't want it to @@ -1293,6 +1316,11 @@ static int cfg80211_netdev_notifier_call(struct notifier_block *nb, switch (state) { case NETDEV_POST_INIT: SET_NETDEV_DEVTYPE(dev, &wiphy_type); + wdev->netdev = dev; + /* can only change netns with wiphy */ + dev->features |= NETIF_F_NETNS_LOCAL; + + cfg80211_init_wdev(wdev); break; case NETDEV_REGISTER: /* @@ -1300,35 +1328,12 @@ static int cfg80211_netdev_notifier_call(struct notifier_block *nb, * called within code protected by it when interfaces * are added with nl80211. */ - /* can only change netns with wiphy */ - dev->features |= NETIF_F_NETNS_LOCAL; - if (sysfs_create_link(&dev->dev.kobj, &rdev->wiphy.dev.kobj, "phy80211")) { pr_err("failed to add phy80211 symlink to netdev!\n"); } - wdev->netdev = dev; -#ifdef CONFIG_CFG80211_WEXT - wdev->wext.default_key = -1; - wdev->wext.default_mgmt_key = -1; - wdev->wext.connect.auth_type = NL80211_AUTHTYPE_AUTOMATIC; -#endif - - if (wdev->wiphy->flags & WIPHY_FLAG_PS_ON_BY_DEFAULT) - wdev->ps = true; - else - wdev->ps = false; - /* allow mac80211 to determine the timeout */ - wdev->ps_timeout = -1; - - if ((wdev->iftype == NL80211_IFTYPE_STATION || - wdev->iftype == NL80211_IFTYPE_P2P_CLIENT || - wdev->iftype == NL80211_IFTYPE_ADHOC) && !wdev->use_4addr) - dev->priv_flags |= IFF_DONT_BRIDGE; - - INIT_WORK(&wdev->disconnect_wk, cfg80211_autodisconnect_wk);
- cfg80211_init_wdev(rdev, wdev); + cfg80211_register_wdev(rdev, wdev); break; case NETDEV_GOING_DOWN: cfg80211_leave(rdev, wdev); diff --git a/net/wireless/core.h b/net/wireless/core.h index 67b0389fca4dc..8cd4a9793298e 100644 --- a/net/wireless/core.h +++ b/net/wireless/core.h @@ -208,8 +208,9 @@ struct wiphy *wiphy_idx_to_wiphy(int wiphy_idx); int cfg80211_switch_netns(struct cfg80211_registered_device *rdev, struct net *net);
-void cfg80211_init_wdev(struct cfg80211_registered_device *rdev, - struct wireless_dev *wdev); +void cfg80211_init_wdev(struct wireless_dev *wdev); +void cfg80211_register_wdev(struct cfg80211_registered_device *rdev, + struct wireless_dev *wdev);
static inline void wdev_lock(struct wireless_dev *wdev) __acquires(wdev) diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c index e14307f2bddcc..8eb43c47e582a 100644 --- a/net/wireless/nl80211.c +++ b/net/wireless/nl80211.c @@ -3801,7 +3801,8 @@ static int nl80211_new_interface(struct sk_buff *skb, struct genl_info *info) * P2P Device and NAN do not have a netdev, so don't go * through the netdev notifier and must be added here */ - cfg80211_init_wdev(rdev, wdev); + cfg80211_init_wdev(wdev); + cfg80211_register_wdev(rdev, wdev); break; default: break;
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit dcd479e10a0510522a5d88b29b8f79ea3467d501 ]
When (for example) an IBSS station is pre-moved to AUTHORIZED before it's inserted, and then the insertion fails, we don't clean up the fast RX/TX states that might already have been created, since we don't go through all the state transitions again on the way down.
Do that, if it hasn't been done already, when the station is freed. I considered only freeing the fast TX/RX state there, but we might add more state so it's more robust to wind down the state properly.
Note that we warn if the station was ever inserted, it should have been properly cleaned up in that case, and the driver will probably not like things happening out of order.
Reported-by: syzbot+2e293dbd67de2836ba42@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/20201009141710.7223b322a955.I95bd08b9ad0e039c03492... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/sta_info.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+)
diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c index fb4f2b9b294f0..4fe284ff1ea3d 100644 --- a/net/mac80211/sta_info.c +++ b/net/mac80211/sta_info.c @@ -258,6 +258,24 @@ struct sta_info *sta_info_get_by_idx(struct ieee80211_sub_if_data *sdata, */ void sta_info_free(struct ieee80211_local *local, struct sta_info *sta) { + /* + * If we had used sta_info_pre_move_state() then we might not + * have gone through the state transitions down again, so do + * it here now (and warn if it's inserted). + * + * This will clear state such as fast TX/RX that may have been + * allocated during state transitions. + */ + while (sta->sta_state > IEEE80211_STA_NONE) { + int ret; + + WARN_ON_ONCE(test_sta_flag(sta, WLAN_STA_INSERTED)); + + ret = sta_info_move_state(sta, sta->sta_state - 1); + if (WARN_ONCE(ret, "sta_info_move_state() returned %d\n", ret)) + break; + } + if (sta->rate_ctrl) rate_control_free_sta(sta);
From: Ye Bin yebin10@huawei.com
[ Upstream commit db18d20d1cb0fde16d518fb5ccd38679f174bc04 ]
Fix follow warning: [net/wireless/reg.c:3619]: (warning) %d in format string (no. 2) requires 'int' but the argument type is 'unsigned int'.
Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Ye Bin yebin10@huawei.com Link: https://lore.kernel.org/r/20201009070215.63695-1-yebin10@huawei.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/wireless/reg.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/wireless/reg.c b/net/wireless/reg.c index d8a90d3974235..763a45655ac21 100644 --- a/net/wireless/reg.c +++ b/net/wireless/reg.c @@ -3411,7 +3411,7 @@ static void print_rd_rules(const struct ieee80211_regdomain *rd) power_rule = ®_rule->power_rule;
if (reg_rule->flags & NL80211_RRF_AUTO_BW) - snprintf(bw, sizeof(bw), "%d KHz, %d KHz AUTO", + snprintf(bw, sizeof(bw), "%d KHz, %u KHz AUTO", freq_range->max_bandwidth_khz, reg_get_max_bandwidth(rd, reg_rule)); else
From: Jason A. Donenfeld Jason@zx2c4.com
[ Upstream commit af8afcf1fdd5f365f70e2386c2d8c7a1abd853d7 ]
If netfilter changes the packet mark, the packet is rerouted. The ip_route_me_harder family of functions fails to use the right sk, opting to instead use skb->sk, resulting in a routing loop when used with tunnels. With the next change fixing this issue in netfilter, test for the relevant condition inside our test suite, since wireguard was where the bug was discovered.
Reported-by: Chen Minqiang ptpt52@gmail.com Signed-off-by: Jason A. Donenfeld Jason@zx2c4.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/wireguard/netns.sh | 8 ++++++++ tools/testing/selftests/wireguard/qemu/kernel.config | 2 ++ 2 files changed, 10 insertions(+)
diff --git a/tools/testing/selftests/wireguard/netns.sh b/tools/testing/selftests/wireguard/netns.sh index d77f4829f1e07..74c69b75f6f5a 100755 --- a/tools/testing/selftests/wireguard/netns.sh +++ b/tools/testing/selftests/wireguard/netns.sh @@ -316,6 +316,14 @@ pp sleep 3 n2 ping -W 1 -c 1 192.168.241.1 n1 wg set wg0 peer "$pub2" persistent-keepalive 0
+# Test that sk_bound_dev_if works +n1 ping -I wg0 -c 1 -W 1 192.168.241.2 +# What about when the mark changes and the packet must be rerouted? +n1 iptables -t mangle -I OUTPUT -j MARK --set-xmark 1 +n1 ping -c 1 -W 1 192.168.241.2 # First the boring case +n1 ping -I wg0 -c 1 -W 1 192.168.241.2 # Then the sk_bound_dev_if case +n1 iptables -t mangle -D OUTPUT -j MARK --set-xmark 1 + # Test that onion routing works, even when it loops n1 wg set wg0 peer "$pub3" allowed-ips 192.168.242.2/32 endpoint 192.168.241.2:5 ip1 addr add 192.168.242.1/24 dev wg0 diff --git a/tools/testing/selftests/wireguard/qemu/kernel.config b/tools/testing/selftests/wireguard/qemu/kernel.config index d531de13c95b0..4eecb432a66c1 100644 --- a/tools/testing/selftests/wireguard/qemu/kernel.config +++ b/tools/testing/selftests/wireguard/qemu/kernel.config @@ -18,10 +18,12 @@ CONFIG_NF_NAT=y CONFIG_NETFILTER_XTABLES=y CONFIG_NETFILTER_XT_NAT=y CONFIG_NETFILTER_XT_MATCH_LENGTH=y +CONFIG_NETFILTER_XT_MARK=y CONFIG_NF_CONNTRACK_IPV4=y CONFIG_NF_NAT_IPV4=y CONFIG_IP_NF_IPTABLES=y CONFIG_IP_NF_FILTER=y +CONFIG_IP_NF_MANGLE=y CONFIG_IP_NF_NAT=y CONFIG_IP_ADVANCED_ROUTER=y CONFIG_IP_MULTIPLE_TABLES=y
From: Qiujun Huang hqjagain@gmail.com
[ Upstream commit 906695e59324635c62b5ae59df111151a546ca66 ]
The array size is FTRACE_KSTACK_NESTING, so the index FTRACE_KSTACK_NESTING is illegal too. And fix two typos by the way.
Link: https://lkml.kernel.org/r/20201031085714.2147-1-hqjagain@gmail.com
Signed-off-by: Qiujun Huang hqjagain@gmail.com Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/trace.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 6e2fb7dc41bf3..1c76a0faf3cd1 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -2611,7 +2611,7 @@ trace_event_buffer_lock_reserve(struct trace_buffer **current_rb, /* * If tracing is off, but we have triggers enabled * we still need to look at the event data. Use the temp_buffer - * to store the trace event for the tigger to use. It's recusive + * to store the trace event for the trigger to use. It's recursive * safe and will not be recorded anywhere. */ if (!entry && trace_file->flags & EVENT_FILE_FL_TRIGGER_COND) { @@ -2934,7 +2934,7 @@ static void __ftrace_trace_stack(struct trace_buffer *buffer, stackidx = __this_cpu_inc_return(ftrace_stack_reserve) - 1;
/* This should never happen. If it does, yell once and skip */ - if (WARN_ON_ONCE(stackidx > FTRACE_KSTACK_NESTING)) + if (WARN_ON_ONCE(stackidx >= FTRACE_KSTACK_NESTING)) goto out;
/*
From: Keith Busch kbusch@kernel.org
[ Upstream commit 38210800bf66d7302da1bb5b624ad68638da1562 ]
Multiple CPUs may be mapped to the same hctx, allowing mulitple submission contexts to attempt commit_rqs(). We need to verify we're not writing the same doorbell value multiple times since that's a spec violation.
Revert commit 54b2fcee1db041a83b52b51752dade6090cf952f.
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1878596 Reported-by: "B.L. Jones" brandon.gustav@googlemail.com Signed-off-by: Keith Busch kbusch@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/pci.c | 23 +++++++++++++++++++---- 1 file changed, 19 insertions(+), 4 deletions(-)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 8984796db0c80..a6af96aaa0eb7 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -198,6 +198,7 @@ struct nvme_queue { u32 q_depth; u16 cq_vector; u16 sq_tail; + u16 last_sq_tail; u16 cq_head; u16 qid; u8 cq_phase; @@ -455,11 +456,24 @@ static int nvme_pci_map_queues(struct blk_mq_tag_set *set) return 0; }
-static inline void nvme_write_sq_db(struct nvme_queue *nvmeq) +/* + * Write sq tail if we are asked to, or if the next command would wrap. + */ +static inline void nvme_write_sq_db(struct nvme_queue *nvmeq, bool write_sq) { + if (!write_sq) { + u16 next_tail = nvmeq->sq_tail + 1; + + if (next_tail == nvmeq->q_depth) + next_tail = 0; + if (next_tail != nvmeq->last_sq_tail) + return; + } + if (nvme_dbbuf_update_and_check_event(nvmeq->sq_tail, nvmeq->dbbuf_sq_db, nvmeq->dbbuf_sq_ei)) writel(nvmeq->sq_tail, nvmeq->q_db); + nvmeq->last_sq_tail = nvmeq->sq_tail; }
/** @@ -476,8 +490,7 @@ static void nvme_submit_cmd(struct nvme_queue *nvmeq, struct nvme_command *cmd, cmd, sizeof(*cmd)); if (++nvmeq->sq_tail == nvmeq->q_depth) nvmeq->sq_tail = 0; - if (write_sq) - nvme_write_sq_db(nvmeq); + nvme_write_sq_db(nvmeq, write_sq); spin_unlock(&nvmeq->sq_lock); }
@@ -486,7 +499,8 @@ static void nvme_commit_rqs(struct blk_mq_hw_ctx *hctx) struct nvme_queue *nvmeq = hctx->driver_data;
spin_lock(&nvmeq->sq_lock); - nvme_write_sq_db(nvmeq); + if (nvmeq->sq_tail != nvmeq->last_sq_tail) + nvme_write_sq_db(nvmeq, true); spin_unlock(&nvmeq->sq_lock); }
@@ -1496,6 +1510,7 @@ static void nvme_init_queue(struct nvme_queue *nvmeq, u16 qid) struct nvme_dev *dev = nvmeq->dev;
nvmeq->sq_tail = 0; + nvmeq->last_sq_tail = 0; nvmeq->cq_head = 0; nvmeq->cq_phase = 1; nvmeq->q_db = &dev->dbs[qid * 2 * dev->db_stride];
From: Vineet Gupta vgupta@synopsys.com
[ Upstream commit 3b57533b460c8dc22a432684b7e8d22571f34d2e ]
ARC HSDK platform stopped booting on released v5.10-rc1, getting stuck in startup of non master SMP cores.
This was bisected to upstream commit 7fef431be9c9ac25 "(mm/page_alloc: place pages to tail in __free_pages_core())" That commit itself is harmless, it just exposed a subtle assumption in our platform code (hence CC'ing linux-mm just as FYI in case some other arches / platforms trip on it).
The upstream commit is semantically disruptive as it reverses the order of page allocations (actually it can be good test for hardware verification to exercise different memory patterns altogether). For ARC HSDK platform that meant a remapped memory region (pertaining to unused Closely Coupled Memory) started getting used early for dynamice allocations, while not effectively remapped on all the cores, triggering memory error exception on those cores.
The fix is to move the CCM remapping from early platform code to to early core boot code. And while it is undesirable to riddle common boot code with platform quirks, there is no other way to do this since the faltering code involves setting up stack itself so even function calls are not allowed at that point.
If anyone is interested, all the gory details can be found at Link below.
Link: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/32 Cc: David Hildenbrand david@redhat.com Cc: linux-mm@kvack.org Signed-off-by: Vineet Gupta vgupta@synopsys.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arc/kernel/head.S | 17 ++++++++++++++++- arch/arc/plat-hsdk/platform.c | 17 ----------------- 2 files changed, 16 insertions(+), 18 deletions(-)
diff --git a/arch/arc/kernel/head.S b/arch/arc/kernel/head.S index 17fd1ed700cca..9152782444b55 100644 --- a/arch/arc/kernel/head.S +++ b/arch/arc/kernel/head.S @@ -67,7 +67,22 @@ sr r5, [ARC_REG_LPB_CTRL] 1: #endif /* CONFIG_ARC_LPB_DISABLE */ -#endif + + /* On HSDK, CCMs need to remapped super early */ +#ifdef CONFIG_ARC_SOC_HSDK + mov r6, 0x60000000 + lr r5, [ARC_REG_ICCM_BUILD] + breq r5, 0, 1f + sr r6, [ARC_REG_AUX_ICCM] +1: + lr r5, [ARC_REG_DCCM_BUILD] + breq r5, 0, 2f + sr r6, [ARC_REG_AUX_DCCM] +2: +#endif /* CONFIG_ARC_SOC_HSDK */ + +#endif /* CONFIG_ISA_ARCV2 */ + ; Config DSP_CTRL properly, so kernel may use integer multiply, ; multiply-accumulate, and divide operations DSP_EARLY_INIT diff --git a/arch/arc/plat-hsdk/platform.c b/arch/arc/plat-hsdk/platform.c index 0b961a2a10b8e..22c9e2c9c0283 100644 --- a/arch/arc/plat-hsdk/platform.c +++ b/arch/arc/plat-hsdk/platform.c @@ -17,22 +17,6 @@ int arc_hsdk_axi_dmac_coherent __section(.data) = 0;
#define ARC_CCM_UNUSED_ADDR 0x60000000
-static void __init hsdk_init_per_cpu(unsigned int cpu) -{ - /* - * By default ICCM is mapped to 0x7z while this area is used for - * kernel virtual mappings, so move it to currently unused area. - */ - if (cpuinfo_arc700[cpu].iccm.sz) - write_aux_reg(ARC_REG_AUX_ICCM, ARC_CCM_UNUSED_ADDR); - - /* - * By default DCCM is mapped to 0x8z while this area is used by kernel, - * so move it to currently unused area. - */ - if (cpuinfo_arc700[cpu].dccm.sz) - write_aux_reg(ARC_REG_AUX_DCCM, ARC_CCM_UNUSED_ADDR); -}
#define ARC_PERIPHERAL_BASE 0xf0000000 #define CREG_BASE (ARC_PERIPHERAL_BASE + 0x1000) @@ -339,5 +323,4 @@ static const char *hsdk_compat[] __initconst = { MACHINE_START(SIMULATION, "hsdk") .dt_compat = hsdk_compat, .init_early = hsdk_init_early, - .init_per_cpu = hsdk_init_per_cpu, MACHINE_END
From: Hannes Reinecke hare@suse.de
[ Upstream commit 5faf50e9e9fdc2117c61ff7e20da49cd6a29e0ca ]
alua_bus_detach() might be running concurrently with alua_rtpg_work(), so we might trip over h->sdev == NULL and call BUG_ON(). The correct way of handling it is to not set h->sdev to NULL in alua_bus_detach(), and call rcu_synchronize() before the final delete to ensure that all concurrent threads have left the critical section. Then we can get rid of the BUG_ON() and replace it with a simple if condition.
Link: https://lore.kernel.org/r/1600167537-12509-1-git-send-email-jitendra.khasdev... Link: https://lore.kernel.org/r/20200924104559.26753-1-hare@suse.de Cc: Brian Bunker brian@purestorage.com Acked-by: Brian Bunker brian@purestorage.com Tested-by: Jitendra Khasdev jitendra.khasdev@oracle.com Reviewed-by: Jitendra Khasdev jitendra.khasdev@oracle.com Signed-off-by: Hannes Reinecke hare@suse.de Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/device_handler/scsi_dh_alua.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c index f32da0ca529e0..308bda2e9c000 100644 --- a/drivers/scsi/device_handler/scsi_dh_alua.c +++ b/drivers/scsi/device_handler/scsi_dh_alua.c @@ -658,8 +658,8 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg) rcu_read_lock(); list_for_each_entry_rcu(h, &tmp_pg->dh_list, node) { - /* h->sdev should always be valid */ - BUG_ON(!h->sdev); + if (!h->sdev) + continue; h->sdev->access_state = desc[0]; } rcu_read_unlock(); @@ -705,7 +705,8 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg) pg->expiry = 0; rcu_read_lock(); list_for_each_entry_rcu(h, &pg->dh_list, node) { - BUG_ON(!h->sdev); + if (!h->sdev) + continue; h->sdev->access_state = (pg->state & SCSI_ACCESS_STATE_MASK); if (pg->pref) @@ -1147,7 +1148,6 @@ static void alua_bus_detach(struct scsi_device *sdev) spin_lock(&h->pg_lock); pg = rcu_dereference_protected(h->pg, lockdep_is_held(&h->pg_lock)); rcu_assign_pointer(h->pg, NULL); - h->sdev = NULL; spin_unlock(&h->pg_lock); if (pg) { spin_lock_irq(&pg->lock); @@ -1156,6 +1156,7 @@ static void alua_bus_detach(struct scsi_device *sdev) kref_put(&pg->kref, release_port_group); } sdev->handler_data = NULL; + synchronize_rcu(); kfree(h); }
From: Sreekanth Reddy sreekanth.reddy@broadcom.com
[ Upstream commit 5feed64f9199ff90c4239971733f23f30aeb2484 ]
While reenabling the IRQ after irq poll there may be small time window where HBA firmware has posted some replies and raise the interrupts but driver has not received the interrupts. So we may observe I/O timeouts as the driver has not processed the replies as interrupts got missed while reenabling the IRQ.
To fix this issue the driver has to go for one more round of processing the reply descriptors from reply descriptor post queue after enabling the IRQ.
Link: https://lore.kernel.org/r/20201102072746.27410-1-sreekanth.reddy@broadcom.co... Reported-by: Tomas Henzl thenzl@redhat.com Reviewed-by: Tomas Henzl thenzl@redhat.com Signed-off-by: Sreekanth Reddy sreekanth.reddy@broadcom.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/mpt3sas/mpt3sas_base.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c index e86682dc34eca..87d05c1950870 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_base.c +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c @@ -1742,6 +1742,13 @@ _base_irqpoll(struct irq_poll *irqpoll, int budget) reply_q->irq_poll_scheduled = false; reply_q->irq_line_enable = true; enable_irq(reply_q->os_irq); + /* + * Go for one more round of processing the + * reply descriptor post queue incase if HBA + * Firmware has posted some reply descriptors + * while reenabling the IRQ. + */ + _base_process_reply_queue(reply_q); }
return num_entries;
From: Chao Leng lengchao@huawei.com
[ Upstream commit 04800fbff4764ab7b32c49d19628605a5d4cb85c ]
Introduce sync io queues for some scenarios which just only need sync io queues not sync all queues.
Signed-off-by: Chao Leng lengchao@huawei.com Reviewed-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 8 ++++++-- drivers/nvme/host/nvme.h | 1 + 2 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 893e29624c16b..59040bab5d6fa 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -4641,8 +4641,7 @@ void nvme_start_queues(struct nvme_ctrl *ctrl) } EXPORT_SYMBOL_GPL(nvme_start_queues);
- -void nvme_sync_queues(struct nvme_ctrl *ctrl) +void nvme_sync_io_queues(struct nvme_ctrl *ctrl) { struct nvme_ns *ns;
@@ -4650,7 +4649,12 @@ void nvme_sync_queues(struct nvme_ctrl *ctrl) list_for_each_entry(ns, &ctrl->namespaces, list) blk_sync_queue(ns->queue); up_read(&ctrl->namespaces_rwsem); +} +EXPORT_SYMBOL_GPL(nvme_sync_io_queues);
+void nvme_sync_queues(struct nvme_ctrl *ctrl) +{ + nvme_sync_io_queues(ctrl); if (ctrl->admin_q) blk_sync_queue(ctrl->admin_q); } diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h index 2aaedfa43ed86..97fbd61191b33 100644 --- a/drivers/nvme/host/nvme.h +++ b/drivers/nvme/host/nvme.h @@ -602,6 +602,7 @@ void nvme_stop_queues(struct nvme_ctrl *ctrl); void nvme_start_queues(struct nvme_ctrl *ctrl); void nvme_kill_queues(struct nvme_ctrl *ctrl); void nvme_sync_queues(struct nvme_ctrl *ctrl); +void nvme_sync_io_queues(struct nvme_ctrl *ctrl); void nvme_unfreeze(struct nvme_ctrl *ctrl); void nvme_wait_freeze(struct nvme_ctrl *ctrl); int nvme_wait_freeze_timeout(struct nvme_ctrl *ctrl, long timeout);
From: Chao Leng lengchao@huawei.com
[ Upstream commit 3017013dcc82a4862bd1e140f8b762cfc594008d ]
Now use teardown_lock to serialize for time out and tear down. This may cause abnormal: first cancel all request in tear down, then time out may complete the request again, but the request may already be freed or restarted.
To avoid race between time out and tear down, in tear down process, first we quiesce the queue, and then delete the timer and cancel the time out work for the queue. At the same time we need to delete teardown_lock.
Signed-off-by: Chao Leng lengchao@huawei.com Reviewed-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/rdma.c | 12 ++---------- 1 file changed, 2 insertions(+), 10 deletions(-)
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c index 3a598e91e816d..73961cc1e9799 100644 --- a/drivers/nvme/host/rdma.c +++ b/drivers/nvme/host/rdma.c @@ -122,7 +122,6 @@ struct nvme_rdma_ctrl { struct sockaddr_storage src_addr;
struct nvme_ctrl ctrl; - struct mutex teardown_lock; bool use_inline_data; u32 io_queues[HCTX_MAX_TYPES]; }; @@ -1010,8 +1009,8 @@ out_free_io_queues: static void nvme_rdma_teardown_admin_queue(struct nvme_rdma_ctrl *ctrl, bool remove) { - mutex_lock(&ctrl->teardown_lock); blk_mq_quiesce_queue(ctrl->ctrl.admin_q); + blk_sync_queue(ctrl->ctrl.admin_q); nvme_rdma_stop_queue(&ctrl->queues[0]); if (ctrl->ctrl.admin_tagset) { blk_mq_tagset_busy_iter(ctrl->ctrl.admin_tagset, @@ -1021,16 +1020,15 @@ static void nvme_rdma_teardown_admin_queue(struct nvme_rdma_ctrl *ctrl, if (remove) blk_mq_unquiesce_queue(ctrl->ctrl.admin_q); nvme_rdma_destroy_admin_queue(ctrl, remove); - mutex_unlock(&ctrl->teardown_lock); }
static void nvme_rdma_teardown_io_queues(struct nvme_rdma_ctrl *ctrl, bool remove) { - mutex_lock(&ctrl->teardown_lock); if (ctrl->ctrl.queue_count > 1) { nvme_start_freeze(&ctrl->ctrl); nvme_stop_queues(&ctrl->ctrl); + nvme_sync_io_queues(&ctrl->ctrl); nvme_rdma_stop_io_queues(ctrl); if (ctrl->ctrl.tagset) { blk_mq_tagset_busy_iter(ctrl->ctrl.tagset, @@ -1041,7 +1039,6 @@ static void nvme_rdma_teardown_io_queues(struct nvme_rdma_ctrl *ctrl, nvme_start_queues(&ctrl->ctrl); nvme_rdma_destroy_io_queues(ctrl, remove); } - mutex_unlock(&ctrl->teardown_lock); }
static void nvme_rdma_free_ctrl(struct nvme_ctrl *nctrl) @@ -1975,16 +1972,12 @@ static void nvme_rdma_complete_timed_out(struct request *rq) { struct nvme_rdma_request *req = blk_mq_rq_to_pdu(rq); struct nvme_rdma_queue *queue = req->queue; - struct nvme_rdma_ctrl *ctrl = queue->ctrl;
- /* fence other contexts that may complete the command */ - mutex_lock(&ctrl->teardown_lock); nvme_rdma_stop_queue(queue); if (!blk_mq_request_completed(rq)) { nvme_req(rq)->status = NVME_SC_HOST_ABORTED_CMD; blk_mq_complete_request(rq); } - mutex_unlock(&ctrl->teardown_lock); }
static enum blk_eh_timer_return @@ -2319,7 +2312,6 @@ static struct nvme_ctrl *nvme_rdma_create_ctrl(struct device *dev, return ERR_PTR(-ENOMEM); ctrl->ctrl.opts = opts; INIT_LIST_HEAD(&ctrl->list); - mutex_init(&ctrl->teardown_lock);
if (!(opts->mask & NVMF_OPT_TRSVCID)) { opts->trsvcid =
From: Chao Leng lengchao@huawei.com
[ Upstream commit d6f66210f4b1aa2f5944f0e34e0f8db44f499f92 ]
Now use teardown_lock to serialize for time out and tear down. This may cause abnormal: first cancel all request in tear down, then time out may complete the request again, but the request may already be freed or restarted.
To avoid race between time out and tear down, in tear down process, first we quiesce the queue, and then delete the timer and cancel the time out work for the queue. At the same time we need to delete teardown_lock.
Signed-off-by: Chao Leng lengchao@huawei.com Reviewed-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/tcp.c | 14 +++----------- 1 file changed, 3 insertions(+), 11 deletions(-)
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index d6a3e14873542..19f86ea547bbc 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -124,7 +124,6 @@ struct nvme_tcp_ctrl { struct sockaddr_storage src_addr; struct nvme_ctrl ctrl;
- struct mutex teardown_lock; struct work_struct err_work; struct delayed_work connect_work; struct nvme_tcp_request async_req; @@ -1886,8 +1885,8 @@ out_free_queue: static void nvme_tcp_teardown_admin_queue(struct nvme_ctrl *ctrl, bool remove) { - mutex_lock(&to_tcp_ctrl(ctrl)->teardown_lock); blk_mq_quiesce_queue(ctrl->admin_q); + blk_sync_queue(ctrl->admin_q); nvme_tcp_stop_queue(ctrl, 0); if (ctrl->admin_tagset) { blk_mq_tagset_busy_iter(ctrl->admin_tagset, @@ -1897,18 +1896,17 @@ static void nvme_tcp_teardown_admin_queue(struct nvme_ctrl *ctrl, if (remove) blk_mq_unquiesce_queue(ctrl->admin_q); nvme_tcp_destroy_admin_queue(ctrl, remove); - mutex_unlock(&to_tcp_ctrl(ctrl)->teardown_lock); }
static void nvme_tcp_teardown_io_queues(struct nvme_ctrl *ctrl, bool remove) { - mutex_lock(&to_tcp_ctrl(ctrl)->teardown_lock); if (ctrl->queue_count <= 1) - goto out; + return; blk_mq_quiesce_queue(ctrl->admin_q); nvme_start_freeze(ctrl); nvme_stop_queues(ctrl); + nvme_sync_io_queues(ctrl); nvme_tcp_stop_io_queues(ctrl); if (ctrl->tagset) { blk_mq_tagset_busy_iter(ctrl->tagset, @@ -1918,8 +1916,6 @@ static void nvme_tcp_teardown_io_queues(struct nvme_ctrl *ctrl, if (remove) nvme_start_queues(ctrl); nvme_tcp_destroy_io_queues(ctrl, remove); -out: - mutex_unlock(&to_tcp_ctrl(ctrl)->teardown_lock); }
static void nvme_tcp_reconnect_or_remove(struct nvme_ctrl *ctrl) @@ -2171,14 +2167,11 @@ static void nvme_tcp_complete_timed_out(struct request *rq) struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq); struct nvme_ctrl *ctrl = &req->queue->ctrl->ctrl;
- /* fence other contexts that may complete the command */ - mutex_lock(&to_tcp_ctrl(ctrl)->teardown_lock); nvme_tcp_stop_queue(ctrl, nvme_tcp_queue_id(req->queue)); if (!blk_mq_request_completed(rq)) { nvme_req(rq)->status = NVME_SC_HOST_ABORTED_CMD; blk_mq_complete_request(rq); } - mutex_unlock(&to_tcp_ctrl(ctrl)->teardown_lock); }
static enum blk_eh_timer_return @@ -2455,7 +2448,6 @@ static struct nvme_ctrl *nvme_tcp_create_ctrl(struct device *dev, nvme_tcp_reconnect_ctrl_work); INIT_WORK(&ctrl->err_work, nvme_tcp_error_recovery_work); INIT_WORK(&ctrl->ctrl.reset_work, nvme_reset_ctrl_work); - mutex_init(&ctrl->teardown_lock);
if (!(opts->mask & NVMF_OPT_TRSVCID)) { opts->trsvcid =
From: Sagi Grimberg sagi@grimberg.me
[ Upstream commit fdf58e02adecbef4c7cbb2073d8ea225e6fd5f26 ]
The request may be executed asynchronously, and rq->state may be changed to IDLE. To avoid repeated request completion, only MQ_RQ_COMPLETE of rq->state is checked in nvme_rdma_complete_timed_out. It is not safe, so need adding check IDLE for rq->state.
Signed-off-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Chao Leng lengchao@huawei.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/rdma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c index 73961cc1e9799..f91c20e3daf7b 100644 --- a/drivers/nvme/host/rdma.c +++ b/drivers/nvme/host/rdma.c @@ -1974,7 +1974,7 @@ static void nvme_rdma_complete_timed_out(struct request *rq) struct nvme_rdma_queue *queue = req->queue;
nvme_rdma_stop_queue(queue); - if (!blk_mq_request_completed(rq)) { + if (blk_mq_request_started(rq) && !blk_mq_request_completed(rq)) { nvme_req(rq)->status = NVME_SC_HOST_ABORTED_CMD; blk_mq_complete_request(rq); }
From: Sagi Grimberg sagi@grimberg.me
[ Upstream commit 0a8a2c85b83589a5c10bc5564b796836bf4b4984 ]
The request may be executed asynchronously, and rq->state may be changed to IDLE. To avoid repeated request completion, only MQ_RQ_COMPLETE of rq->state is checked in nvme_tcp_complete_timed_out. It is not safe, so need adding check IDLE for rq->state.
Signed-off-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Chao Leng lengchao@huawei.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/tcp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index 19f86ea547bbc..c0c33320fe659 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -2168,7 +2168,7 @@ static void nvme_tcp_complete_timed_out(struct request *rq) struct nvme_ctrl *ctrl = &req->queue->ctrl->ctrl;
nvme_tcp_stop_queue(ctrl, nvme_tcp_queue_id(req->queue)); - if (!blk_mq_request_completed(rq)) { + if (blk_mq_request_started(rq) && !blk_mq_request_completed(rq)) { nvme_req(rq)->status = NVME_SC_HOST_ABORTED_CMD; blk_mq_complete_request(rq); }
From: Suravee Suthikulpanit suravee.suthikulpanit@amd.com
[ Upstream commit 73db2fc595f358460ce32bcaa3be1f0cce4a2db1 ]
Certain device drivers allocate IO queues on a per-cpu basis. On AMD EPYC platform, which can support up-to 256 cpu threads, this can exceed the current MAX_IRQ_PER_TABLE limit of 256, and result in the error message:
AMD-Vi: Failed to allocate IRTE
This has been observed with certain NVME devices.
AMD IOMMU hardware can actually support upto 512 interrupt remapping table entries. Therefore, update the driver to match the hardware limit.
Please note that this also increases the size of interrupt remapping table to 8KB per device when using the 128-bit IRTE format.
Signed-off-by: Suravee Suthikulpanit suravee.suthikulpanit@amd.com Link: https://lore.kernel.org/r/20201015025002.87997-1-suravee.suthikulpanit@amd.c... Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/amd/amd_iommu_types.h | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 30a5d412255a4..427484c455891 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -406,7 +406,11 @@ extern bool amd_iommu_np_cache; /* Only true if all IOMMUs support device IOTLBs */ extern bool amd_iommu_iotlb_sup;
-#define MAX_IRQS_PER_TABLE 256 +/* + * AMD IOMMU hardware only support 512 IRTEs despite + * the architectural limitation of 2048 entries. + */ +#define MAX_IRQS_PER_TABLE 512 #define IRQ_TABLE_ALIGNMENT 128
struct irq_remap_table {
From: Qian Cai cai@redhat.com
[ Upstream commit de5d9dae150ca1c1b5c7676711a9ca139d1a8dec ]
The call to rcu_cpu_starting() in smp_init_secondary() is not early enough in the CPU-hotplug onlining process, which results in lockdep splats as follows:
WARNING: suspicious RCU usage ----------------------------- kernel/locking/lockdep.c:3497 RCU-list traversed in non-reader section!!
other info that might help us debug this:
RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 1 no locks held by swapper/1/0.
Call Trace: show_stack+0x158/0x1f0 dump_stack+0x1f2/0x238 __lock_acquire+0x2640/0x4dd0 lock_acquire+0x3a8/0xd08 _raw_spin_lock_irqsave+0xc0/0xf0 clockevents_register_device+0xa8/0x528 init_cpu_timer+0x33e/0x468 smp_init_secondary+0x11a/0x328 smp_start_secondary+0x82/0x88
This is avoided by moving the call to rcu_cpu_starting up near the beginning of the smp_init_secondary() function. Note that the raw_smp_processor_id() is required in order to avoid calling into lockdep before RCU has declared the CPU to be watched for readers.
Link: https://lore.kernel.org/lkml/160223032121.7002.1269740091547117869.tip-bot2@... Signed-off-by: Qian Cai cai@redhat.com Acked-by: Paul E. McKenney paulmck@kernel.org Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/smp.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c index 85700bd85f98d..3b4c3140c18e7 100644 --- a/arch/s390/kernel/smp.c +++ b/arch/s390/kernel/smp.c @@ -855,13 +855,14 @@ void __init smp_detect_cpus(void)
static void smp_init_secondary(void) { - int cpu = smp_processor_id(); + int cpu = raw_smp_processor_id();
S390_lowcore.last_update_clock = get_tod_clock(); restore_access_regs(S390_lowcore.access_regs_save_area); set_cpu_flag(CIF_ASCE_PRIMARY); set_cpu_flag(CIF_ASCE_SECONDARY); cpu_init(); + rcu_cpu_starting(cpu); preempt_disable(); init_cpu_timer(); vtime_init();
From: Zhang Qilong zhangqilong3@huawei.com
[ Upstream commit bb742ad01961a3b9d1f9d19375487b879668b6b2 ]
pm_runtime_get_sync() will increment pm usage counter even it failed. Forgetting to call pm_runtime_put will result in reference leak in vfio_platform_open, so we should fix it.
Signed-off-by: Zhang Qilong zhangqilong3@huawei.com Acked-by: Eric Auger eric.auger@redhat.com Signed-off-by: Alex Williamson alex.williamson@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/vfio/platform/vfio_platform_common.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c index c0771a9567fb5..fb4b385191f28 100644 --- a/drivers/vfio/platform/vfio_platform_common.c +++ b/drivers/vfio/platform/vfio_platform_common.c @@ -267,7 +267,7 @@ static int vfio_platform_open(void *device_data)
ret = pm_runtime_get_sync(vdev->device); if (ret < 0) - goto err_pm; + goto err_rst;
ret = vfio_platform_call_reset(vdev, &extra_dbg); if (ret && vdev->reset_required) { @@ -284,7 +284,6 @@ static int vfio_platform_open(void *device_data)
err_rst: pm_runtime_put(vdev->device); -err_pm: vfio_platform_irq_cleanup(vdev); err_irq: vfio_platform_regions_cleanup(vdev);
From: Fred Gao fred.gao@intel.com
[ Upstream commit e4eccb853664de7bcf9518fb658f35e748bf1f68 ]
Bypass the IGD initialization when -ENODEV returns, that should be the case if opregion is not available for IGD or within discrete graphics device's option ROM, or host/lpc bridge is not found.
Then use of -ENODEV here means no special device resources found which needs special care for VFIO, but we still allow other normal device resource access.
Cc: Zhenyu Wang zhenyuw@linux.intel.com Cc: Xiong Zhang xiong.y.zhang@intel.com Cc: Hang Yuan hang.yuan@linux.intel.com Cc: Stuart Summers stuart.summers@intel.com Signed-off-by: Fred Gao fred.gao@intel.com Signed-off-by: Alex Williamson alex.williamson@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/vfio/pci/vfio_pci.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c index 1ab1f5cda4ac2..bfdc010a6b043 100644 --- a/drivers/vfio/pci/vfio_pci.c +++ b/drivers/vfio/pci/vfio_pci.c @@ -385,7 +385,7 @@ static int vfio_pci_enable(struct vfio_pci_device *vdev) pdev->vendor == PCI_VENDOR_ID_INTEL && IS_ENABLED(CONFIG_VFIO_PCI_IGD)) { ret = vfio_pci_igd_init(vdev); - if (ret) { + if (ret && ret != -ENODEV) { pci_warn(pdev, "Failed to setup Intel IGD regions\n"); goto disable_exit; }
From: Qii Wang qii.wang@mediatek.com
[ Upstream commit aafced673c06b7c77040c1df42e2e965be5d0376 ]
The i2c driver default do dma reset after i2c reset, but sometimes i2c reset will trigger dma tx2rx, then apdma write data to dram which has been i2c_put_dma_safe_msg_buf(kfree). Move dma reset before i2c reset in mtk_i2c_init_hw to fix it.
Signed-off-by: Qii Wang qii.wang@mediatek.com Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-mt65xx.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/i2c/busses/i2c-mt65xx.c b/drivers/i2c/busses/i2c-mt65xx.c index 0cbdfbe605b55..33de99b7bc20c 100644 --- a/drivers/i2c/busses/i2c-mt65xx.c +++ b/drivers/i2c/busses/i2c-mt65xx.c @@ -475,6 +475,10 @@ static void mtk_i2c_init_hw(struct mtk_i2c *i2c) { u16 control_reg;
+ writel(I2C_DMA_HARD_RST, i2c->pdmabase + OFFSET_RST); + udelay(50); + writel(I2C_DMA_CLR_FLAG, i2c->pdmabase + OFFSET_RST); + mtk_i2c_writew(i2c, I2C_SOFT_RST, OFFSET_SOFTRESET);
/* Set ioconfig */ @@ -529,10 +533,6 @@ static void mtk_i2c_init_hw(struct mtk_i2c *i2c)
mtk_i2c_writew(i2c, control_reg, OFFSET_CONTROL); mtk_i2c_writew(i2c, I2C_DELAY_LEN, OFFSET_DELAY_LEN); - - writel(I2C_DMA_HARD_RST, i2c->pdmabase + OFFSET_RST); - udelay(50); - writel(I2C_DMA_CLR_FLAG, i2c->pdmabase + OFFSET_RST); }
static const struct i2c_spec_values *mtk_i2c_get_spec(unsigned int speed)
From: Veerabadhran Gopalakrishnan veerabadhran.gopalakrishnan@amd.com
[ Upstream commit c6d2b0fbb893d5c7dda405aa0e7bcbecf1c75f98 ]
Concurrent operation of VCN and JPEG decoder in DPG mode is causing ring timeout due to power state.
Signed-off-by: Veerabadhran Gopalakrishnan veerabadhran.gopalakrishnan@amd.com Reviewed-by: Leo Liu leo.liu@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/soc15.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c index c28ebf41530aa..254ab2ada70a0 100644 --- a/drivers/gpu/drm/amd/amdgpu/soc15.c +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c @@ -1220,8 +1220,7 @@ static int soc15_common_early_init(void *handle)
adev->pg_flags = AMD_PG_SUPPORT_SDMA | AMD_PG_SUPPORT_MMHUB | - AMD_PG_SUPPORT_VCN | - AMD_PG_SUPPORT_VCN_DPG; + AMD_PG_SUPPORT_VCN; } else { adev->cg_flags = AMD_CG_SUPPORT_GFX_MGCG | AMD_CG_SUPPORT_GFX_MGLS |
From: Brian Foster bfoster@redhat.com
[ Upstream commit 50e7d6c7a5210063b9a6f0d8799d9d1440907fcf ]
The iomap writepage error handling logic is a mash of old and slightly broken XFS writepage logic. When keepwrite writeback state tracking was introduced in XFS in commit 0d085a529b42 ("xfs: ensure WB_SYNC_ALL writeback handles partial pages correctly"), XFS had an additional cluster writeback context that scanned ahead of ->writepage() to process dirty pages over the current ->writepage() extent mapping. This context expected a dirty page and required retention of the TOWRITE tag on partial page processing so the higher level writeback context would revisit the page (in contrast to ->writepage(), which passes a page with the dirty bit already cleared).
The cluster writeback mechanism was eventually removed and some of the error handling logic folded into the primary writeback path in commit 150d5be09ce4 ("xfs: remove xfs_cancel_ioend"). This patch accidentally conflated the two contexts by using the keepwrite logic in ->writepage() without accounting for the fact that the page is not dirty. Further, the keepwrite logic has no practical effect on the core ->writepage() caller (write_cache_pages()) because it never revisits a page in the current function invocation.
Technically, the page should be redirtied for the keepwrite logic to have any effect. Otherwise, write_cache_pages() may find the tagged page but will skip it since it is clean. Even if the page was redirtied, however, there is still no practical effect to keepwrite since write_cache_pages() does not wrap around within a single invocation of the function. Therefore, the dirty page would simply end up retagged on the next writeback sequence over the associated range.
All that being said, none of this really matters because redirtying a partially processed page introduces a potential infinite redirty -> writeback failure loop that deviates from the current design principle of clearing the dirty state on writepage failure to avoid building up too much dirty, unreclaimable memory on the system. Therefore, drop the spurious keepwrite usage and dirty state clearing logic from iomap_writepage_map(), treat the partially processed page the same as a fully processed page, and let the imminent ioend failure clean up the writeback state.
Signed-off-by: Brian Foster bfoster@redhat.com Reviewed-by: Darrick J. Wong darrick.wong@oracle.com Signed-off-by: Darrick J. Wong darrick.wong@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/iomap/buffered-io.c | 15 ++------------- 1 file changed, 2 insertions(+), 13 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index b115e7d47fcec..238613443bec2 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -1395,6 +1395,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc, WARN_ON_ONCE(!wpc->ioend && !list_empty(&submit_list)); WARN_ON_ONCE(!PageLocked(page)); WARN_ON_ONCE(PageWriteback(page)); + WARN_ON_ONCE(PageDirty(page));
/* * We cannot cancel the ioend directly here on error. We may have @@ -1415,21 +1416,9 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc, unlock_page(page); goto done; } - - /* - * If the page was not fully cleaned, we need to ensure that the - * higher layers come back to it correctly. That means we need - * to keep the page dirty, and for WB_SYNC_ALL writeback we need - * to ensure the PAGECACHE_TAG_TOWRITE index mark is not removed - * so another attempt to write this page in this writeback sweep - * will be made. - */ - set_page_writeback_keepwrite(page); - } else { - clear_page_dirty_for_io(page); - set_page_writeback(page); }
+ set_page_writeback(page); unlock_page(page);
/*
From: Tommi Rantala tommi.t.rantala@nokia.com
[ Upstream commit f3ae6c6e8a3ea49076d826c64e63ea78fbf9db43 ]
Makefile already contains -D_GNU_SOURCE, so we can remove it from the *.c files.
Signed-off-by: Tommi Rantala tommi.t.rantala@nokia.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/proc/proc-loadavg-001.c | 1 - tools/testing/selftests/proc/proc-self-syscall.c | 1 - tools/testing/selftests/proc/proc-uptime-002.c | 1 - 3 files changed, 3 deletions(-)
diff --git a/tools/testing/selftests/proc/proc-loadavg-001.c b/tools/testing/selftests/proc/proc-loadavg-001.c index 471e2aa280776..fb4fe9188806e 100644 --- a/tools/testing/selftests/proc/proc-loadavg-001.c +++ b/tools/testing/selftests/proc/proc-loadavg-001.c @@ -14,7 +14,6 @@ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. */ /* Test that /proc/loadavg correctly reports last pid in pid namespace. */ -#define _GNU_SOURCE #include <errno.h> #include <sched.h> #include <sys/types.h> diff --git a/tools/testing/selftests/proc/proc-self-syscall.c b/tools/testing/selftests/proc/proc-self-syscall.c index 9f6d000c02455..8511dcfe67c75 100644 --- a/tools/testing/selftests/proc/proc-self-syscall.c +++ b/tools/testing/selftests/proc/proc-self-syscall.c @@ -13,7 +13,6 @@ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. */ -#define _GNU_SOURCE #include <unistd.h> #include <sys/syscall.h> #include <sys/types.h> diff --git a/tools/testing/selftests/proc/proc-uptime-002.c b/tools/testing/selftests/proc/proc-uptime-002.c index 30e2b78490898..e7ceabed7f51f 100644 --- a/tools/testing/selftests/proc/proc-uptime-002.c +++ b/tools/testing/selftests/proc/proc-uptime-002.c @@ -15,7 +15,6 @@ */ // Test that values in /proc/uptime increment monotonically // while shifting across CPUs. -#define _GNU_SOURCE #undef NDEBUG #include <assert.h> #include <unistd.h>
From: Benjamin Gwin bgwin@google.com
[ Upstream commit 108aa503657ee2fe8aa071dc620d96372c252ecd ]
It's possible that the first region picked for the new kernel will make it impossible to fit the other segments in the required 32GB window, especially if we have a very large initrd.
Instead of giving up, we can keep testing other regions for the kernel until we find one that works.
Suggested-by: Ryan O'Leary ryanoleary@google.com Signed-off-by: Benjamin Gwin bgwin@google.com Link: https://lore.kernel.org/r/20201103201106.2397844-1-bgwin@google.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/kernel/kexec_image.c | 41 +++++++++++++++++++------- arch/arm64/kernel/machine_kexec_file.c | 9 +++++- 2 files changed, 39 insertions(+), 11 deletions(-)
diff --git a/arch/arm64/kernel/kexec_image.c b/arch/arm64/kernel/kexec_image.c index af9987c154cab..66adee8b5fc81 100644 --- a/arch/arm64/kernel/kexec_image.c +++ b/arch/arm64/kernel/kexec_image.c @@ -43,7 +43,7 @@ static void *image_load(struct kimage *image, u64 flags, value; bool be_image, be_kernel; struct kexec_buf kbuf; - unsigned long text_offset; + unsigned long text_offset, kernel_segment_number; struct kexec_segment *kernel_segment; int ret;
@@ -88,11 +88,37 @@ static void *image_load(struct kimage *image, /* Adjust kernel segment with TEXT_OFFSET */ kbuf.memsz += text_offset;
- ret = kexec_add_buffer(&kbuf); - if (ret) + kernel_segment_number = image->nr_segments; + + /* + * The location of the kernel segment may make it impossible to satisfy + * the other segment requirements, so we try repeatedly to find a + * location that will work. + */ + while ((ret = kexec_add_buffer(&kbuf)) == 0) { + /* Try to load additional data */ + kernel_segment = &image->segment[kernel_segment_number]; + ret = load_other_segments(image, kernel_segment->mem, + kernel_segment->memsz, initrd, + initrd_len, cmdline); + if (!ret) + break; + + /* + * We couldn't find space for the other segments; erase the + * kernel segment and try the next available hole. + */ + image->nr_segments -= 1; + kbuf.buf_min = kernel_segment->mem + kernel_segment->memsz; + kbuf.mem = KEXEC_BUF_MEM_UNKNOWN; + } + + if (ret) { + pr_err("Could not find any suitable kernel location!"); return ERR_PTR(ret); + }
- kernel_segment = &image->segment[image->nr_segments - 1]; + kernel_segment = &image->segment[kernel_segment_number]; kernel_segment->mem += text_offset; kernel_segment->memsz -= text_offset; image->start = kernel_segment->mem; @@ -101,12 +127,7 @@ static void *image_load(struct kimage *image, kernel_segment->mem, kbuf.bufsz, kernel_segment->memsz);
- /* Load additional data */ - ret = load_other_segments(image, - kernel_segment->mem, kernel_segment->memsz, - initrd, initrd_len, cmdline); - - return ERR_PTR(ret); + return 0; }
#ifdef CONFIG_KEXEC_IMAGE_VERIFY_SIG diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c index 361a1143e09ee..e443df8569881 100644 --- a/arch/arm64/kernel/machine_kexec_file.c +++ b/arch/arm64/kernel/machine_kexec_file.c @@ -242,6 +242,11 @@ static int prepare_elf_headers(void **addr, unsigned long *sz) return ret; }
+/* + * Tries to add the initrd and DTB to the image. If it is not possible to find + * valid locations, this function will undo changes to the image and return non + * zero. + */ int load_other_segments(struct kimage *image, unsigned long kernel_load_addr, unsigned long kernel_size, @@ -250,7 +255,8 @@ int load_other_segments(struct kimage *image, { struct kexec_buf kbuf; void *headers, *dtb = NULL; - unsigned long headers_sz, initrd_load_addr = 0, dtb_len; + unsigned long headers_sz, initrd_load_addr = 0, dtb_len, + orig_segments = image->nr_segments; int ret = 0;
kbuf.image = image; @@ -336,6 +342,7 @@ int load_other_segments(struct kimage *image, return 0;
out_err: + image->nr_segments = orig_segments; vfree(dtb); return ret; }
From: Sean Anderson seanga2@gmail.com
[ Upstream commit 79605f1394261995c2b955c906a5a20fb27cdc84 ]
M-Mode Linux is loaded at the start of RAM, not 2MB later. Perhaps this should be calculated based on PAGE_OFFSET somehow? Even better would be to deprecate text_offset and instead introduce something absolute.
Signed-off-by: Sean Anderson seanga2@gmail.com Signed-off-by: Palmer Dabbelt palmerdabbelt@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kernel/head.S | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S index 0a4e81b8dc795..5a0ae2eaf5e2f 100644 --- a/arch/riscv/kernel/head.S +++ b/arch/riscv/kernel/head.S @@ -27,12 +27,17 @@ ENTRY(_start) /* reserved */ .word 0 .balign 8 +#ifdef CONFIG_RISCV_M_MODE + /* Image load offset (0MB) from start of RAM for M-mode */ + .dword 0 +#else #if __riscv_xlen == 64 /* Image load offset(2MB) from start of RAM */ .dword 0x200000 #else /* Image load offset(4MB) from start of RAM */ .dword 0x400000 +#endif #endif /* Effective size of kernel image */ .dword _end - _start
From: Ulrich Hecht uli+renesas@fpond.eu
[ Upstream commit a49cc1fe9d64a2dc4e19b599204f403e5d25f44b ]
Implements atomic transfers to fix reboot/shutdown on r8a7790 Lager and similar boards.
Signed-off-by: Ulrich Hecht uli+renesas@fpond.eu Tested-by: Wolfram Sang wsa+renesas@sang-engineering.com Tested-by: Geert Uytterhoeven geert+renesas@glider.be [wsa: some whitespace fixing] Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-sh_mobile.c | 86 +++++++++++++++++++++++------- 1 file changed, 66 insertions(+), 20 deletions(-)
diff --git a/drivers/i2c/busses/i2c-sh_mobile.c b/drivers/i2c/busses/i2c-sh_mobile.c index cab7255599991..bdd60770779ad 100644 --- a/drivers/i2c/busses/i2c-sh_mobile.c +++ b/drivers/i2c/busses/i2c-sh_mobile.c @@ -129,6 +129,7 @@ struct sh_mobile_i2c_data { int sr; bool send_stop; bool stop_after_dma; + bool atomic_xfer;
struct resource *res; struct dma_chan *dma_tx; @@ -330,13 +331,15 @@ static unsigned char i2c_op(struct sh_mobile_i2c_data *pd, enum sh_mobile_i2c_op ret = iic_rd(pd, ICDR); break; case OP_RX_STOP: /* enable DTE interrupt, issue stop */ - iic_wr(pd, ICIC, - ICIC_DTEE | ICIC_WAITE | ICIC_ALE | ICIC_TACKE); + if (!pd->atomic_xfer) + iic_wr(pd, ICIC, + ICIC_DTEE | ICIC_WAITE | ICIC_ALE | ICIC_TACKE); iic_wr(pd, ICCR, ICCR_ICE | ICCR_RACK); break; case OP_RX_STOP_DATA: /* enable DTE interrupt, read data, issue stop */ - iic_wr(pd, ICIC, - ICIC_DTEE | ICIC_WAITE | ICIC_ALE | ICIC_TACKE); + if (!pd->atomic_xfer) + iic_wr(pd, ICIC, + ICIC_DTEE | ICIC_WAITE | ICIC_ALE | ICIC_TACKE); ret = iic_rd(pd, ICDR); iic_wr(pd, ICCR, ICCR_ICE | ICCR_RACK); break; @@ -429,7 +432,8 @@ static irqreturn_t sh_mobile_i2c_isr(int irq, void *dev_id)
if (wakeup) { pd->sr |= SW_DONE; - wake_up(&pd->wait); + if (!pd->atomic_xfer) + wake_up(&pd->wait); }
/* defeat write posting to avoid spurious WAIT interrupts */ @@ -581,6 +585,9 @@ static void start_ch(struct sh_mobile_i2c_data *pd, struct i2c_msg *usr_msg, pd->pos = -1; pd->sr = 0;
+ if (pd->atomic_xfer) + return; + pd->dma_buf = i2c_get_dma_safe_msg_buf(pd->msg, 8); if (pd->dma_buf) sh_mobile_i2c_xfer_dma(pd); @@ -637,15 +644,13 @@ static int poll_busy(struct sh_mobile_i2c_data *pd) return i ? 0 : -ETIMEDOUT; }
-static int sh_mobile_i2c_xfer(struct i2c_adapter *adapter, - struct i2c_msg *msgs, - int num) +static int sh_mobile_xfer(struct sh_mobile_i2c_data *pd, + struct i2c_msg *msgs, int num) { - struct sh_mobile_i2c_data *pd = i2c_get_adapdata(adapter); struct i2c_msg *msg; int err = 0; int i; - long timeout; + long time_left;
/* Wake up device and enable clock */ pm_runtime_get_sync(pd->dev); @@ -662,15 +667,35 @@ static int sh_mobile_i2c_xfer(struct i2c_adapter *adapter, if (do_start) i2c_op(pd, OP_START);
- /* The interrupt handler takes care of the rest... */ - timeout = wait_event_timeout(pd->wait, - pd->sr & (ICSR_TACK | SW_DONE), - adapter->timeout); - - /* 'stop_after_dma' tells if DMA transfer was complete */ - i2c_put_dma_safe_msg_buf(pd->dma_buf, pd->msg, pd->stop_after_dma); + if (pd->atomic_xfer) { + unsigned long j = jiffies + pd->adap.timeout; + + time_left = time_before_eq(jiffies, j); + while (time_left && + !(pd->sr & (ICSR_TACK | SW_DONE))) { + unsigned char sr = iic_rd(pd, ICSR); + + if (sr & (ICSR_AL | ICSR_TACK | + ICSR_WAIT | ICSR_DTE)) { + sh_mobile_i2c_isr(0, pd); + udelay(150); + } else { + cpu_relax(); + } + time_left = time_before_eq(jiffies, j); + } + } else { + /* The interrupt handler takes care of the rest... */ + time_left = wait_event_timeout(pd->wait, + pd->sr & (ICSR_TACK | SW_DONE), + pd->adap.timeout); + + /* 'stop_after_dma' tells if DMA xfer was complete */ + i2c_put_dma_safe_msg_buf(pd->dma_buf, pd->msg, + pd->stop_after_dma); + }
- if (!timeout) { + if (!time_left) { dev_err(pd->dev, "Transfer request timed out\n"); if (pd->dma_direction != DMA_NONE) sh_mobile_i2c_cleanup_dma(pd); @@ -696,14 +721,35 @@ static int sh_mobile_i2c_xfer(struct i2c_adapter *adapter, return err ?: num; }
+static int sh_mobile_i2c_xfer(struct i2c_adapter *adapter, + struct i2c_msg *msgs, + int num) +{ + struct sh_mobile_i2c_data *pd = i2c_get_adapdata(adapter); + + pd->atomic_xfer = false; + return sh_mobile_xfer(pd, msgs, num); +} + +static int sh_mobile_i2c_xfer_atomic(struct i2c_adapter *adapter, + struct i2c_msg *msgs, + int num) +{ + struct sh_mobile_i2c_data *pd = i2c_get_adapdata(adapter); + + pd->atomic_xfer = true; + return sh_mobile_xfer(pd, msgs, num); +} + static u32 sh_mobile_i2c_func(struct i2c_adapter *adapter) { return I2C_FUNC_I2C | I2C_FUNC_SMBUS_EMUL | I2C_FUNC_PROTOCOL_MANGLING; }
static const struct i2c_algorithm sh_mobile_i2c_algorithm = { - .functionality = sh_mobile_i2c_func, - .master_xfer = sh_mobile_i2c_xfer, + .functionality = sh_mobile_i2c_func, + .master_xfer = sh_mobile_i2c_xfer, + .master_xfer_atomic = sh_mobile_i2c_xfer_atomic, };
static const struct i2c_adapter_quirks sh_mobile_i2c_quirks = {
From: Michael Wu michael.wu@vatics.com
[ Upstream commit 66b92313e2ca9208b5f3ebf5d86e9a818299d8fa ]
If some bits were cleared by i2c_dw_read_clear_intrbits_slave() in i2c_dw_isr_slave() and not handled immediately, those cleared bits would not be shown again by later i2c_dw_read_clear_intrbits_slave(). They therefore were forgotten to be handled.
i2c_dw_read_clear_intrbits_slave() should be called once in an ISR and take its returned state for all later handlings.
Signed-off-by: Michael Wu michael.wu@vatics.com Acked-by: Jarkko Nikula jarkko.nikula@linux.intel.com Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-designware-slave.c | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/drivers/i2c/busses/i2c-designware-slave.c b/drivers/i2c/busses/i2c-designware-slave.c index 44974b53a6268..13de01a0f75f0 100644 --- a/drivers/i2c/busses/i2c-designware-slave.c +++ b/drivers/i2c/busses/i2c-designware-slave.c @@ -159,7 +159,6 @@ static int i2c_dw_irq_handler_slave(struct dw_i2c_dev *dev) u32 raw_stat, stat, enabled, tmp; u8 val = 0, slave_activity;
- regmap_read(dev->map, DW_IC_INTR_STAT, &stat); regmap_read(dev->map, DW_IC_ENABLE, &enabled); regmap_read(dev->map, DW_IC_RAW_INTR_STAT, &raw_stat); regmap_read(dev->map, DW_IC_STATUS, &tmp); @@ -168,6 +167,7 @@ static int i2c_dw_irq_handler_slave(struct dw_i2c_dev *dev) if (!enabled || !(raw_stat & ~DW_IC_INTR_ACTIVITY) || !dev->slave) return 0;
+ stat = i2c_dw_read_clear_intrbits_slave(dev); dev_dbg(dev->dev, "%#x STATUS SLAVE_ACTIVITY=%#x : RAW_INTR_STAT=%#x : INTR_STAT=%#x\n", enabled, slave_activity, raw_stat, stat); @@ -188,11 +188,9 @@ static int i2c_dw_irq_handler_slave(struct dw_i2c_dev *dev) val); } regmap_read(dev->map, DW_IC_CLR_RD_REQ, &tmp); - stat = i2c_dw_read_clear_intrbits_slave(dev); } else { regmap_read(dev->map, DW_IC_CLR_RD_REQ, &tmp); regmap_read(dev->map, DW_IC_CLR_RX_UNDER, &tmp); - stat = i2c_dw_read_clear_intrbits_slave(dev); } if (!i2c_slave_event(dev->slave, I2C_SLAVE_READ_REQUESTED, @@ -207,7 +205,6 @@ static int i2c_dw_irq_handler_slave(struct dw_i2c_dev *dev) regmap_read(dev->map, DW_IC_CLR_RX_DONE, &tmp);
i2c_slave_event(dev->slave, I2C_SLAVE_STOP, &val); - stat = i2c_dw_read_clear_intrbits_slave(dev); return 1; }
@@ -219,7 +216,6 @@ static int i2c_dw_irq_handler_slave(struct dw_i2c_dev *dev) dev_vdbg(dev->dev, "Byte %X acked!", val); } else { i2c_slave_event(dev->slave, I2C_SLAVE_STOP, &val); - stat = i2c_dw_read_clear_intrbits_slave(dev); }
return 1; @@ -230,7 +226,6 @@ static irqreturn_t i2c_dw_isr_slave(int this_irq, void *dev_id) struct dw_i2c_dev *dev = dev_id; int ret;
- i2c_dw_read_clear_intrbits_slave(dev); ret = i2c_dw_irq_handler_slave(dev); if (ret > 0) complete(&dev->cmd_complete);
From: Michael Wu michael.wu@vatics.com
[ Upstream commit 3b5f7f10ff6e6b66f553e12cc50d9bb751ce60ad ]
Sometimes we would get the following flow when doing an i2cset:
0x1 STATUS SLAVE_ACTIVITY=0x1 : RAW_INTR_STAT=0x514 : INTR_STAT=0x4 I2C_SLAVE_WRITE_RECEIVED 0x1 STATUS SLAVE_ACTIVITY=0x0 : RAW_INTR_STAT=0x714 : INTR_STAT=0x204 I2C_SLAVE_WRITE_REQUESTED I2C_SLAVE_WRITE_RECEIVED
Documentation/i2c/slave-interface.rst says that I2C_SLAVE_WRITE_REQUESTED, which is mandatory, should be sent while the data did not arrive yet. It means in a write-request I2C_SLAVE_WRITE_REQUESTED should be reported before any I2C_SLAVE_WRITE_RECEIVED.
By the way, I2C_SLAVE_STOP didn't be reported in the above case because DW_IC_INTR_STAT was not 0x200.
dev->status can be used to record the current state, especially Designware I2C controller has no interrupts to identify a write-request. This patch makes not only I2C_SLAVE_WRITE_REQUESTED been reported first when IC_INTR_RX_FULL is rising and dev->status isn't STATUS_WRITE_IN_PROGRESS but also I2C_SLAVE_STOP been reported when a STOP condition is received.
Signed-off-by: Michael Wu michael.wu@vatics.com Acked-by: Jarkko Nikula jarkko.nikula@linux.intel.com Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-designware-slave.c | 45 +++++++++-------------- 1 file changed, 18 insertions(+), 27 deletions(-)
diff --git a/drivers/i2c/busses/i2c-designware-slave.c b/drivers/i2c/busses/i2c-designware-slave.c index 13de01a0f75f0..0d15f4c1e9f7e 100644 --- a/drivers/i2c/busses/i2c-designware-slave.c +++ b/drivers/i2c/busses/i2c-designware-slave.c @@ -172,26 +172,25 @@ static int i2c_dw_irq_handler_slave(struct dw_i2c_dev *dev) "%#x STATUS SLAVE_ACTIVITY=%#x : RAW_INTR_STAT=%#x : INTR_STAT=%#x\n", enabled, slave_activity, raw_stat, stat);
- if ((stat & DW_IC_INTR_RX_FULL) && (stat & DW_IC_INTR_STOP_DET)) - i2c_slave_event(dev->slave, I2C_SLAVE_WRITE_REQUESTED, &val); + if (stat & DW_IC_INTR_RX_FULL) { + if (dev->status != STATUS_WRITE_IN_PROGRESS) { + dev->status = STATUS_WRITE_IN_PROGRESS; + i2c_slave_event(dev->slave, I2C_SLAVE_WRITE_REQUESTED, + &val); + } + + regmap_read(dev->map, DW_IC_DATA_CMD, &tmp); + val = tmp; + if (!i2c_slave_event(dev->slave, I2C_SLAVE_WRITE_RECEIVED, + &val)) + dev_vdbg(dev->dev, "Byte %X acked!", val); + }
if (stat & DW_IC_INTR_RD_REQ) { if (slave_activity) { - if (stat & DW_IC_INTR_RX_FULL) { - regmap_read(dev->map, DW_IC_DATA_CMD, &tmp); - val = tmp; - - if (!i2c_slave_event(dev->slave, - I2C_SLAVE_WRITE_RECEIVED, - &val)) { - dev_vdbg(dev->dev, "Byte %X acked!", - val); - } - regmap_read(dev->map, DW_IC_CLR_RD_REQ, &tmp); - } else { - regmap_read(dev->map, DW_IC_CLR_RD_REQ, &tmp); - regmap_read(dev->map, DW_IC_CLR_RX_UNDER, &tmp); - } + regmap_read(dev->map, DW_IC_CLR_RD_REQ, &tmp); + + dev->status = STATUS_READ_IN_PROGRESS; if (!i2c_slave_event(dev->slave, I2C_SLAVE_READ_REQUESTED, &val)) @@ -203,18 +202,10 @@ static int i2c_dw_irq_handler_slave(struct dw_i2c_dev *dev) if (!i2c_slave_event(dev->slave, I2C_SLAVE_READ_PROCESSED, &val)) regmap_read(dev->map, DW_IC_CLR_RX_DONE, &tmp); - - i2c_slave_event(dev->slave, I2C_SLAVE_STOP, &val); - return 1; }
- if (stat & DW_IC_INTR_RX_FULL) { - regmap_read(dev->map, DW_IC_DATA_CMD, &tmp); - val = tmp; - if (!i2c_slave_event(dev->slave, I2C_SLAVE_WRITE_RECEIVED, - &val)) - dev_vdbg(dev->dev, "Byte %X acked!", val); - } else { + if (stat & DW_IC_INTR_STOP_DET) { + dev->status = STATUS_IDLE; i2c_slave_event(dev->slave, I2C_SLAVE_STOP, &val); }
From: Jerry Snitselaar jsnitsel@redhat.com
[ Upstream commit b154ce11ead925de6a94feb3b0317fafeefa0ebc ]
There is a misconfiguration in the bios of the gpio pin used for the interrupt in the T490s. When interrupts are enabled in the tpm_tis driver code this results in an interrupt storm. This was initially reported when we attempted to enable the interrupt code in the tpm_tis driver, which previously wasn't setting a flag to enable it. Due to the reports of the interrupt storm that code was reverted and we went back to polling instead of using interrupts. Now that we know the T490s problem is a firmware issue, add code to check if the system is a T490s and disable interrupts if that is the case. This will allow us to enable interrupts for everyone else. If the user has a fixed bios they can force the enabling of interrupts with tpm_tis.interrupts=1 on the kernel command line.
Cc: Peter Huewe peterhuewe@gmx.de Cc: Jason Gunthorpe jgg@ziepe.ca Cc: Hans de Goede hdegoede@redhat.com Signed-off-by: Jerry Snitselaar jsnitsel@redhat.com Reviewed-by: James Bottomley James.Bottomley@HansenPartnership.com Reviewed-by: Hans de Goede hdegoede@redhat.com Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/char/tpm/tpm_tis.c | 29 +++++++++++++++++++++++++++-- 1 file changed, 27 insertions(+), 2 deletions(-)
diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c index 0b214963539de..4ed6e660273a4 100644 --- a/drivers/char/tpm/tpm_tis.c +++ b/drivers/char/tpm/tpm_tis.c @@ -27,6 +27,7 @@ #include <linux/of.h> #include <linux/of_device.h> #include <linux/kernel.h> +#include <linux/dmi.h> #include "tpm.h" #include "tpm_tis_core.h"
@@ -49,8 +50,8 @@ static inline struct tpm_tis_tcg_phy *to_tpm_tis_tcg_phy(struct tpm_tis_data *da return container_of(data, struct tpm_tis_tcg_phy, priv); }
-static bool interrupts = true; -module_param(interrupts, bool, 0444); +static int interrupts = -1; +module_param(interrupts, int, 0444); MODULE_PARM_DESC(interrupts, "Enable interrupts");
static bool itpm; @@ -63,6 +64,28 @@ module_param(force, bool, 0444); MODULE_PARM_DESC(force, "Force device probe rather than using ACPI entry"); #endif
+static int tpm_tis_disable_irq(const struct dmi_system_id *d) +{ + if (interrupts == -1) { + pr_notice("tpm_tis: %s detected: disabling interrupts.\n", d->ident); + interrupts = 0; + } + + return 0; +} + +static const struct dmi_system_id tpm_tis_dmi_table[] = { + { + .callback = tpm_tis_disable_irq, + .ident = "ThinkPad T490s", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"), + DMI_MATCH(DMI_PRODUCT_VERSION, "ThinkPad T490s"), + }, + }, + {} +}; + #if defined(CONFIG_PNP) && defined(CONFIG_ACPI) static int has_hid(struct acpi_device *dev, const char *hid) { @@ -192,6 +215,8 @@ static int tpm_tis_init(struct device *dev, struct tpm_info *tpm_info) int irq = -1; int rc;
+ dmi_check_system(tpm_tis_dmi_table); + rc = check_acpi_tpm2(dev); if (rc) return rc;
From: Martin Hundebøll martin@geanix.com
commit bc7f2cd7559c5595dc38b909ae9a8d43e0215994 upstream.
Removing the duplicate gpio chip select level handling in bcm2835_spi_setup() left the lflags variable uninitialized. Avoid trhe use of such variable by passing default flags to gpiochip_request_own_desc().
Fixes: 5e31ba0c0543 ("spi: bcm2835: fix gpio cs level inversion") Signed-off-by: Martin Hundebøll martin@geanix.com Link: https://lore.kernel.org/r/20201105090615.620315-1-martin@geanix.com Signed-off-by: Mark Brown broonie@kernel.org Cc: Nathan Chancellor natechancellor@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/spi/spi-bcm2835.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/spi/spi-bcm2835.c +++ b/drivers/spi/spi-bcm2835.c @@ -1193,7 +1193,6 @@ static int bcm2835_spi_setup(struct spi_ struct spi_controller *ctlr = spi->controller; struct bcm2835_spi *bs = spi_controller_get_devdata(ctlr); struct gpio_chip *chip; - enum gpio_lookup_flags lflags; u32 cs;
/* @@ -1261,7 +1260,7 @@ static int bcm2835_spi_setup(struct spi_
spi->cs_gpiod = gpiochip_request_own_desc(chip, 8 - spi->chip_select, DRV_NAME, - lflags, + GPIO_LOOKUP_FLAGS_DEFAULT, GPIOD_OUT_LOW); if (IS_ERR(spi->cs_gpiod)) return PTR_ERR(spi->cs_gpiod);
From: Baolin Wang baolin.wang7@gmail.com
commit a75bfc824a2d33f57ebdc003bfe6b7a9e11e9cb9 upstream.
When changing to use suspend-to-idle to save power, the PMIC irq can not wakeup the system due to lack of wakeup capability, which will cause the sub-irqs (such as power key) of the PMIC can not wake up the system. Thus we can add the wakeup capability for PMIC irq to solve this issue, as well as removing the IRQF_NO_SUSPEND flag to allow PMIC irq to be a wakeup source.
Reported-by: Chunyan Zhang zhang.lyra@gmail.com Signed-off-by: Baolin Wang baolin.wang7@gmail.com Tested-by: Chunyan Zhang chunyan.zhang@unisoc.com Signed-off-by: Lee Jones lee.jones@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/mfd/sprd-sc27xx-spi.c | 28 +++++++++++++++++++++++++++- 1 file changed, 27 insertions(+), 1 deletion(-)
--- a/drivers/mfd/sprd-sc27xx-spi.c +++ b/drivers/mfd/sprd-sc27xx-spi.c @@ -189,7 +189,7 @@ static int sprd_pmic_probe(struct spi_de ddata->irqs[i].mask = BIT(i);
ret = devm_regmap_add_irq_chip(&spi->dev, ddata->regmap, ddata->irq, - IRQF_ONESHOT | IRQF_NO_SUSPEND, 0, + IRQF_ONESHOT, 0, &ddata->irq_chip, &ddata->irq_data); if (ret) { dev_err(&spi->dev, "Failed to add PMIC irq chip %d\n", ret); @@ -202,9 +202,34 @@ static int sprd_pmic_probe(struct spi_de return ret; }
+ device_init_wakeup(&spi->dev, true); return 0; }
+#ifdef CONFIG_PM_SLEEP +static int sprd_pmic_suspend(struct device *dev) +{ + struct sprd_pmic *ddata = dev_get_drvdata(dev); + + if (device_may_wakeup(dev)) + enable_irq_wake(ddata->irq); + + return 0; +} + +static int sprd_pmic_resume(struct device *dev) +{ + struct sprd_pmic *ddata = dev_get_drvdata(dev); + + if (device_may_wakeup(dev)) + disable_irq_wake(ddata->irq); + + return 0; +} +#endif + +static SIMPLE_DEV_PM_OPS(sprd_pmic_pm_ops, sprd_pmic_suspend, sprd_pmic_resume); + static const struct of_device_id sprd_pmic_match[] = { { .compatible = "sprd,sc2731", .data = &sc2731_data }, {}, @@ -215,6 +240,7 @@ static struct spi_driver sprd_pmic_drive .driver = { .name = "sc27xx-pmic", .of_match_table = sprd_pmic_match, + .pm = &sprd_pmic_pm_ops, }, .probe = sprd_pmic_probe, };
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
[ Upstream commit dd26209bc56886cacdbd828571e54a6bca251e55 ]
2 kOhm bias was never an option in Intel GPIO hardware, the available matrix is:
000 none 001 1 kOhm (if available) 010 5 kOhm 100 20 kOhm
As easy to get the 3 resistors are gated separately and according to parallel circuits calculations we may get combinations of the above where the result is always strictly less than minimal resistance. Hence, additional values can be:
011 ~833.3 Ohm 101 ~952.4 Ohm 110 ~4 kOhm 111 ~800 Ohm
That said, convert TERM definitions to be the bit masks to reflect the above.
While at it, enable the same setting for pull down case.
Fixes: 7981c0015af2 ("pinctrl: intel: Add Intel Sunrisepoint pin controller and GPIO support") Cc: Jamie McClymont jamie@kwiius.com Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Acked-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pinctrl/intel/pinctrl-intel.c | 32 ++++++++++++++++++--------- 1 file changed, 22 insertions(+), 10 deletions(-)
diff --git a/drivers/pinctrl/intel/pinctrl-intel.c b/drivers/pinctrl/intel/pinctrl-intel.c index b64997b303e0c..b738b28239bd4 100644 --- a/drivers/pinctrl/intel/pinctrl-intel.c +++ b/drivers/pinctrl/intel/pinctrl-intel.c @@ -62,10 +62,10 @@ #define PADCFG1_TERM_UP BIT(13) #define PADCFG1_TERM_SHIFT 10 #define PADCFG1_TERM_MASK GENMASK(12, 10) -#define PADCFG1_TERM_20K 4 -#define PADCFG1_TERM_2K 3 -#define PADCFG1_TERM_5K 2 -#define PADCFG1_TERM_1K 1 +#define PADCFG1_TERM_20K BIT(2) +#define PADCFG1_TERM_5K BIT(1) +#define PADCFG1_TERM_1K BIT(0) +#define PADCFG1_TERM_833 (BIT(1) | BIT(0))
#define PADCFG2 0x008 #define PADCFG2_DEBEN BIT(0) @@ -549,12 +549,12 @@ static int intel_config_get_pull(struct intel_pinctrl *pctrl, unsigned int pin, return -EINVAL;
switch (term) { + case PADCFG1_TERM_833: + *arg = 833; + break; case PADCFG1_TERM_1K: *arg = 1000; break; - case PADCFG1_TERM_2K: - *arg = 2000; - break; case PADCFG1_TERM_5K: *arg = 5000; break; @@ -570,6 +570,11 @@ static int intel_config_get_pull(struct intel_pinctrl *pctrl, unsigned int pin, return -EINVAL;
switch (term) { + case PADCFG1_TERM_833: + if (!(community->features & PINCTRL_FEATURE_1K_PD)) + return -EINVAL; + *arg = 833; + break; case PADCFG1_TERM_1K: if (!(community->features & PINCTRL_FEATURE_1K_PD)) return -EINVAL; @@ -685,12 +690,12 @@ static int intel_config_set_pull(struct intel_pinctrl *pctrl, unsigned int pin, case 5000: value |= PADCFG1_TERM_5K << PADCFG1_TERM_SHIFT; break; - case 2000: - value |= PADCFG1_TERM_2K << PADCFG1_TERM_SHIFT; - break; case 1000: value |= PADCFG1_TERM_1K << PADCFG1_TERM_SHIFT; break; + case 833: + value |= PADCFG1_TERM_833 << PADCFG1_TERM_SHIFT; + break; default: ret = -EINVAL; } @@ -714,6 +719,13 @@ static int intel_config_set_pull(struct intel_pinctrl *pctrl, unsigned int pin, } value |= PADCFG1_TERM_1K << PADCFG1_TERM_SHIFT; break; + case 833: + if (!(community->features & PINCTRL_FEATURE_1K_PD)) { + ret = -EINVAL; + break; + } + value |= PADCFG1_TERM_833 << PADCFG1_TERM_SHIFT; + break; default: ret = -EINVAL; }
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
[ Upstream commit f3c75e7a9349d1d33eb53ddc1b31640994969f73 ]
When GPIO library asks pin control to set the bias, it doesn't pass any value of it and argument is considered boolean (and this is true for ACPI GpioIo() / GpioInt() resources, by the way). Thus, individual drivers must behave well, when they got the resistance value of 1 Ohm, i.e. transforming it to sane default.
In case of Intel pin control hardware the 5 kOhm sounds plausible because on one hand it's a minimum of resistors present in all hardware generations and at the same time it's high enough to minimize leakage current (will be only 200 uA with the above choice).
Fixes: e57725eabf87 ("pinctrl: intel: Add support for hardware debouncer") Reported-by: Jamie McClymont jamie@kwiius.com Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Acked-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pinctrl/intel/pinctrl-intel.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/pinctrl/intel/pinctrl-intel.c b/drivers/pinctrl/intel/pinctrl-intel.c index b738b28239bd4..31e7840bc5e25 100644 --- a/drivers/pinctrl/intel/pinctrl-intel.c +++ b/drivers/pinctrl/intel/pinctrl-intel.c @@ -683,6 +683,10 @@ static int intel_config_set_pull(struct intel_pinctrl *pctrl, unsigned int pin,
value |= PADCFG1_TERM_UP;
+ /* Set default strength value in case none is given */ + if (arg == 1) + arg = 5000; + switch (arg) { case 20000: value |= PADCFG1_TERM_20K << PADCFG1_TERM_SHIFT; @@ -705,6 +709,10 @@ static int intel_config_set_pull(struct intel_pinctrl *pctrl, unsigned int pin, case PIN_CONFIG_BIAS_PULL_DOWN: value &= ~(PADCFG1_TERM_UP | PADCFG1_TERM_MASK);
+ /* Set default strength value in case none is given */ + if (arg == 1) + arg = 5000; + switch (arg) { case 20000: value |= PADCFG1_TERM_20K << PADCFG1_TERM_SHIFT;
From: Billy Tsai billy_tsai@aspeedtech.com
[ Upstream commit 560b6ac37a87fcb78d580437e3e0bc2b6b5b0295 ]
GPIO_T is mapped to the most significant byte of input/output mask, and the byte in "output" mask should be 0 because GPIO_T is input only. All the other bits need to be 1 because GPIO_Q/R/S support both input and output modes.
Fixes: ab4a85534c3e ("gpio: aspeed: Add in ast2600 details to Aspeed driver") Signed-off-by: Billy Tsai billy_tsai@aspeedtech.com Reviewed-by: Tao Ren rentao.bupt@gmail.com Reviewed-by: Joel Stanley joel@jms.id.au Reviewed-by: Andrew Jeffery andrew@aj.id.au Signed-off-by: Bartosz Golaszewski bgolaszewski@baylibre.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpio-aspeed.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/gpio/gpio-aspeed.c b/drivers/gpio/gpio-aspeed.c index e44d5de2a1201..b966f5e28ebff 100644 --- a/drivers/gpio/gpio-aspeed.c +++ b/drivers/gpio/gpio-aspeed.c @@ -1114,6 +1114,7 @@ static const struct aspeed_gpio_config ast2500_config =
static const struct aspeed_bank_props ast2600_bank_props[] = { /* input output */ + {4, 0xffffffff, 0x00ffffff}, /* Q/R/S/T */ {5, 0xffffffff, 0xffffff00}, /* U/V/W/X */ {6, 0x0000ffff, 0x0000ffff}, /* Y/Z */ { },
From: Andrew Jeffery andrew@aj.id.au
[ Upstream commit 9fa2e7af3d53a4b769136eccc32c02e128a4ee51 ]
Setting both CONFIG_KPROBES=y and CONFIG_FORTIFY_SOURCE=y on ARM leads to a panic in memcpy() when injecting a kprobe despite the fixes found in commit e46daee53bb5 ("ARM: 8806/1: kprobes: Fix false positive with FORTIFY_SOURCE") and commit 0ac569bf6a79 ("ARM: 8834/1: Fix: kprobes: optimized kprobes illegal instruction").
arch/arm/include/asm/kprobes.h effectively declares the target type of the optprobe_template_entry assembly label as a u32 which leads memcpy()'s __builtin_object_size() call to determine that the pointed-to object is of size four. However, the symbol is used as a handle for the optimised probe assembly template that is at least 96 bytes in size. The symbol's use despite its type blows up the memcpy() in ARM's arch_prepare_optimized_kprobe() with a false-positive fortify_panic() when it should instead copy the optimised probe template into place:
``` $ sudo perf probe -a aspeed_g6_pinctrl_probe [ 158.457252] detected buffer overflow in memcpy [ 158.458069] ------------[ cut here ]------------ [ 158.458283] kernel BUG at lib/string.c:1153! [ 158.458436] Internal error: Oops - BUG: 0 [#1] SMP ARM [ 158.458768] Modules linked in: [ 158.459043] CPU: 1 PID: 99 Comm: perf Not tainted 5.9.0-rc7-00038-gc53ebf8167e9 #158 [ 158.459296] Hardware name: Generic DT based system [ 158.459529] PC is at fortify_panic+0x18/0x20 [ 158.459658] LR is at __irq_work_queue_local+0x3c/0x74 [ 158.459831] pc : [<8047451c>] lr : [<8020ecd4>] psr: 60000013 [ 158.460032] sp : be2d1d50 ip : be2d1c58 fp : be2d1d5c [ 158.460174] r10: 00000006 r9 : 00000000 r8 : 00000060 [ 158.460348] r7 : 8011e434 r6 : b9e0b800 r5 : 7f000000 r4 : b9fe4f0c [ 158.460557] r3 : 80c04cc8 r2 : 00000000 r1 : be7c03cc r0 : 00000022 [ 158.460801] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none [ 158.461037] Control: 10c5387d Table: b9cd806a DAC: 00000051 [ 158.461251] Process perf (pid: 99, stack limit = 0x81c71a69) [ 158.461472] Stack: (0xbe2d1d50 to 0xbe2d2000) [ 158.461757] 1d40: be2d1d84 be2d1d60 8011e724 80474510 [ 158.462104] 1d60: b9e0b800 b9fe4f0c 00000000 b9fe4f14 80c8ec80 be235000 be2d1d9c be2d1d88 [ 158.462436] 1d80: 801cee44 8011e57c b9fe4f0c 00000000 be2d1dc4 be2d1da0 801d0ad0 801cedec [ 158.462742] 1da0: 00000000 00000000 b9fe4f00 ffffffea 00000000 be235000 be2d1de4 be2d1dc8 [ 158.463087] 1dc0: 80204604 801d0738 00000000 00000000 b9fe4004 ffffffea be2d1e94 be2d1de8 [ 158.463428] 1de0: 80205434 80204570 00385c00 00000000 00000000 00000000 be2d1e14 be2d1e08 [ 158.463880] 1e00: 802ba014 b9fe4f00 b9e718c0 b9fe4f84 b9e71ec8 be2d1e24 00000000 00385c00 [ 158.464365] 1e20: 00000000 626f7270 00000065 802b905c be2d1e94 0000002e 00000000 802b9914 [ 158.464829] 1e40: be2d1e84 be2d1e50 802b9914 8028ff78 804629d0 b9e71ec0 0000002e b9e71ec0 [ 158.465141] 1e60: be2d1ea8 80c04cc8 00000cc0 b9e713c4 00000002 80205834 80205834 0000002e [ 158.465488] 1e80: be235000 be235000 be2d1ea4 be2d1e98 80205854 80204e94 be2d1ecc be2d1ea8 [ 158.465806] 1ea0: 801ee4a0 80205840 00000002 80c04cc8 00000000 0000002e 0000002e 00000000 [ 158.466110] 1ec0: be2d1f0c be2d1ed0 801ee5c8 801ee428 00000000 be2d0000 006b1fd0 00000051 [ 158.466398] 1ee0: 00000000 b9eedf00 0000002e 80204410 006b1fd0 be2d1f60 00000000 00000004 [ 158.466763] 1f00: be2d1f24 be2d1f10 8020442c 801ee4c4 80205834 802c613c be2d1f5c be2d1f28 [ 158.467102] 1f20: 802c60ac 8020441c be2d1fac be2d1f38 8010c764 802e9888 be2d1f5c b9eedf00 [ 158.467447] 1f40: b9eedf00 006b1fd0 0000002e 00000000 be2d1f94 be2d1f60 802c634c 802c5fec [ 158.467812] 1f60: 00000000 00000000 00000000 80c04cc8 006b1fd0 00000003 76f7a610 00000004 [ 158.468155] 1f80: 80100284 be2d0000 be2d1fa4 be2d1f98 802c63ec 802c62e8 00000000 be2d1fa8 [ 158.468508] 1fa0: 80100080 802c63e0 006b1fd0 00000003 00000003 006b1fd0 0000002e 00000000 [ 158.468858] 1fc0: 006b1fd0 00000003 76f7a610 00000004 006b1fb0 0026d348 00000017 7ef2738c [ 158.469202] 1fe0: 76f3431c 7ef272d8 0014ec50 76f34338 60000010 00000003 00000000 00000000 [ 158.469461] Backtrace: [ 158.469683] [<80474504>] (fortify_panic) from [<8011e724>] (arch_prepare_optimized_kprobe+0x1b4/0x1f8) [ 158.470021] [<8011e570>] (arch_prepare_optimized_kprobe) from [<801cee44>] (alloc_aggr_kprobe+0x64/0x70) [ 158.470287] r9:be235000 r8:80c8ec80 r7:b9fe4f14 r6:00000000 r5:b9fe4f0c r4:b9e0b800 [ 158.470478] [<801cede0>] (alloc_aggr_kprobe) from [<801d0ad0>] (register_kprobe+0x3a4/0x5a0) [ 158.470685] r5:00000000 r4:b9fe4f0c [ 158.470790] [<801d072c>] (register_kprobe) from [<80204604>] (__register_trace_kprobe+0xa0/0xa4) [ 158.471001] r9:be235000 r8:00000000 r7:ffffffea r6:b9fe4f00 r5:00000000 r4:00000000 [ 158.471188] [<80204564>] (__register_trace_kprobe) from [<80205434>] (trace_kprobe_create+0x5ac/0x9ac) [ 158.471408] r7:ffffffea r6:b9fe4004 r5:00000000 r4:00000000 [ 158.471553] [<80204e88>] (trace_kprobe_create) from [<80205854>] (create_or_delete_trace_kprobe+0x20/0x3c) [ 158.471766] r10:be235000 r9:be235000 r8:0000002e r7:80205834 r6:80205834 r5:00000002 [ 158.471949] r4:b9e713c4 [ 158.472027] [<80205834>] (create_or_delete_trace_kprobe) from [<801ee4a0>] (trace_run_command+0x84/0x9c) [ 158.472255] [<801ee41c>] (trace_run_command) from [<801ee5c8>] (trace_parse_run_command+0x110/0x1f8) [ 158.472471] r6:00000000 r5:0000002e r4:0000002e [ 158.472594] [<801ee4b8>] (trace_parse_run_command) from [<8020442c>] (probes_write+0x1c/0x28) [ 158.472800] r10:00000004 r9:00000000 r8:be2d1f60 r7:006b1fd0 r6:80204410 r5:0000002e [ 158.472968] r4:b9eedf00 [ 158.473046] [<80204410>] (probes_write) from [<802c60ac>] (vfs_write+0xcc/0x1e8) [ 158.473226] [<802c5fe0>] (vfs_write) from [<802c634c>] (ksys_write+0x70/0xf8) [ 158.473400] r8:00000000 r7:0000002e r6:006b1fd0 r5:b9eedf00 r4:b9eedf00 [ 158.473567] [<802c62dc>] (ksys_write) from [<802c63ec>] (sys_write+0x18/0x1c) [ 158.473745] r9:be2d0000 r8:80100284 r7:00000004 r6:76f7a610 r5:00000003 r4:006b1fd0 [ 158.473932] [<802c63d4>] (sys_write) from [<80100080>] (ret_fast_syscall+0x0/0x54) [ 158.474126] Exception stack(0xbe2d1fa8 to 0xbe2d1ff0) [ 158.474305] 1fa0: 006b1fd0 00000003 00000003 006b1fd0 0000002e 00000000 [ 158.474573] 1fc0: 006b1fd0 00000003 76f7a610 00000004 006b1fb0 0026d348 00000017 7ef2738c [ 158.474811] 1fe0: 76f3431c 7ef272d8 0014ec50 76f34338 [ 158.475171] Code: e24cb004 e1a01000 e59f0004 ebf40dd3 (e7f001f2) [ 158.475847] ---[ end trace 55a5b31c08a29f00 ]--- [ 158.476088] Kernel panic - not syncing: Fatal exception [ 158.476375] CPU0: stopping [ 158.476709] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G D 5.9.0-rc7-00038-gc53ebf8167e9 #158 [ 158.477176] Hardware name: Generic DT based system [ 158.477411] Backtrace: [ 158.477604] [<8010dd28>] (dump_backtrace) from [<8010dfd4>] (show_stack+0x20/0x24) [ 158.477990] r7:00000000 r6:60000193 r5:00000000 r4:80c2f634 [ 158.478323] [<8010dfb4>] (show_stack) from [<8046390c>] (dump_stack+0xcc/0xe8) [ 158.478686] [<80463840>] (dump_stack) from [<80110750>] (handle_IPI+0x334/0x3a0) [ 158.479063] r7:00000000 r6:00000004 r5:80b65cc8 r4:80c78278 [ 158.479352] [<8011041c>] (handle_IPI) from [<801013f8>] (gic_handle_irq+0x88/0x94) [ 158.479757] r10:10c5387d r9:80c01ed8 r8:00000000 r7:c0802000 r6:80c0537c r5:000003ff [ 158.480146] r4:c080200c r3:fffffff4 [ 158.480364] [<80101370>] (gic_handle_irq) from [<80100b6c>] (__irq_svc+0x6c/0x90) [ 158.480748] Exception stack(0x80c01ed8 to 0x80c01f20) [ 158.481031] 1ec0: 000128bc 00000000 [ 158.481499] 1ee0: be7b8174 8011d3a0 80c00000 00000000 80c04cec 80c04d28 80c5d7c2 80a026d4 [ 158.482091] 1f00: 10c5387d 80c01f34 80c01f38 80c01f28 80109554 80109558 60000013 ffffffff [ 158.482621] r9:80c00000 r8:80c5d7c2 r7:80c01f0c r6:ffffffff r5:60000013 r4:80109558 [ 158.482983] [<80109518>] (arch_cpu_idle) from [<80818780>] (default_idle_call+0x38/0x120) [ 158.483360] [<80818748>] (default_idle_call) from [<801585a8>] (do_idle+0xd4/0x158) [ 158.483945] r5:00000000 r4:80c00000 [ 158.484237] [<801584d4>] (do_idle) from [<801588f4>] (cpu_startup_entry+0x28/0x2c) [ 158.484784] r9:80c78000 r8:00000000 r7:80c78000 r6:80c78040 r5:80c04cc0 r4:000000d6 [ 158.485328] [<801588cc>] (cpu_startup_entry) from [<80810a78>] (rest_init+0x9c/0xbc) [ 158.485930] [<808109dc>] (rest_init) from [<80b00ae4>] (arch_call_rest_init+0x18/0x1c) [ 158.486503] r5:80c04cc0 r4:00000001 [ 158.486857] [<80b00acc>] (arch_call_rest_init) from [<80b00fcc>] (start_kernel+0x46c/0x548) [ 158.487589] [<80b00b60>] (start_kernel) from [<00000000>] (0x0) ```
Fixes: e46daee53bb5 ("ARM: 8806/1: kprobes: Fix false positive with FORTIFY_SOURCE") Fixes: 0ac569bf6a79 ("ARM: 8834/1: Fix: kprobes: optimized kprobes illegal instruction") Suggested-by: Kees Cook keescook@chromium.org Signed-off-by: Andrew Jeffery andrew@aj.id.au Tested-by: Luka Oreskovic luka.oreskovic@sartura.hr Tested-by: Joel Stanley joel@jms.id.au Reviewed-by: Joel Stanley joel@jms.id.au Acked-by: Masami Hiramatsu mhiramat@kernel.org Cc: Luka Oreskovic luka.oreskovic@sartura.hr Cc: Juraj Vijtiuk juraj.vijtiuk@sartura.hr Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/include/asm/kprobes.h | 22 +++++++++++----------- arch/arm/probes/kprobes/opt-arm.c | 18 +++++++++--------- 2 files changed, 20 insertions(+), 20 deletions(-)
diff --git a/arch/arm/include/asm/kprobes.h b/arch/arm/include/asm/kprobes.h index 213607a1f45c1..e26a278d301ab 100644 --- a/arch/arm/include/asm/kprobes.h +++ b/arch/arm/include/asm/kprobes.h @@ -44,20 +44,20 @@ int kprobe_exceptions_notify(struct notifier_block *self, unsigned long val, void *data);
/* optinsn template addresses */ -extern __visible kprobe_opcode_t optprobe_template_entry; -extern __visible kprobe_opcode_t optprobe_template_val; -extern __visible kprobe_opcode_t optprobe_template_call; -extern __visible kprobe_opcode_t optprobe_template_end; -extern __visible kprobe_opcode_t optprobe_template_sub_sp; -extern __visible kprobe_opcode_t optprobe_template_add_sp; -extern __visible kprobe_opcode_t optprobe_template_restore_begin; -extern __visible kprobe_opcode_t optprobe_template_restore_orig_insn; -extern __visible kprobe_opcode_t optprobe_template_restore_end; +extern __visible kprobe_opcode_t optprobe_template_entry[]; +extern __visible kprobe_opcode_t optprobe_template_val[]; +extern __visible kprobe_opcode_t optprobe_template_call[]; +extern __visible kprobe_opcode_t optprobe_template_end[]; +extern __visible kprobe_opcode_t optprobe_template_sub_sp[]; +extern __visible kprobe_opcode_t optprobe_template_add_sp[]; +extern __visible kprobe_opcode_t optprobe_template_restore_begin[]; +extern __visible kprobe_opcode_t optprobe_template_restore_orig_insn[]; +extern __visible kprobe_opcode_t optprobe_template_restore_end[];
#define MAX_OPTIMIZED_LENGTH 4 #define MAX_OPTINSN_SIZE \ - ((unsigned long)&optprobe_template_end - \ - (unsigned long)&optprobe_template_entry) + ((unsigned long)optprobe_template_end - \ + (unsigned long)optprobe_template_entry) #define RELATIVEJUMP_SIZE 4
struct arch_optimized_insn { diff --git a/arch/arm/probes/kprobes/opt-arm.c b/arch/arm/probes/kprobes/opt-arm.c index 7a449df0b3591..c78180172120f 100644 --- a/arch/arm/probes/kprobes/opt-arm.c +++ b/arch/arm/probes/kprobes/opt-arm.c @@ -85,21 +85,21 @@ asm ( "optprobe_template_end:\n");
#define TMPL_VAL_IDX \ - ((unsigned long *)&optprobe_template_val - (unsigned long *)&optprobe_template_entry) + ((unsigned long *)optprobe_template_val - (unsigned long *)optprobe_template_entry) #define TMPL_CALL_IDX \ - ((unsigned long *)&optprobe_template_call - (unsigned long *)&optprobe_template_entry) + ((unsigned long *)optprobe_template_call - (unsigned long *)optprobe_template_entry) #define TMPL_END_IDX \ - ((unsigned long *)&optprobe_template_end - (unsigned long *)&optprobe_template_entry) + ((unsigned long *)optprobe_template_end - (unsigned long *)optprobe_template_entry) #define TMPL_ADD_SP \ - ((unsigned long *)&optprobe_template_add_sp - (unsigned long *)&optprobe_template_entry) + ((unsigned long *)optprobe_template_add_sp - (unsigned long *)optprobe_template_entry) #define TMPL_SUB_SP \ - ((unsigned long *)&optprobe_template_sub_sp - (unsigned long *)&optprobe_template_entry) + ((unsigned long *)optprobe_template_sub_sp - (unsigned long *)optprobe_template_entry) #define TMPL_RESTORE_BEGIN \ - ((unsigned long *)&optprobe_template_restore_begin - (unsigned long *)&optprobe_template_entry) + ((unsigned long *)optprobe_template_restore_begin - (unsigned long *)optprobe_template_entry) #define TMPL_RESTORE_ORIGN_INSN \ - ((unsigned long *)&optprobe_template_restore_orig_insn - (unsigned long *)&optprobe_template_entry) + ((unsigned long *)optprobe_template_restore_orig_insn - (unsigned long *)optprobe_template_entry) #define TMPL_RESTORE_END \ - ((unsigned long *)&optprobe_template_restore_end - (unsigned long *)&optprobe_template_entry) + ((unsigned long *)optprobe_template_restore_end - (unsigned long *)optprobe_template_entry)
/* * ARM can always optimize an instruction when using ARM ISA, except @@ -234,7 +234,7 @@ int arch_prepare_optimized_kprobe(struct optimized_kprobe *op, struct kprobe *or }
/* Copy arch-dep-instance from template. */ - memcpy(code, (unsigned long *)&optprobe_template_entry, + memcpy(code, (unsigned long *)optprobe_template_entry, TMPL_END_IDX * sizeof(kprobe_opcode_t));
/* Adjust buffer according to instruction. */
From: Ard Biesheuvel ardb@kernel.org
[ Upstream commit 080b6f40763565f65ebb9540219c71ce885cf568 ]
Commit 3193c0836 ("bpf: Disable GCC -fgcse optimization for ___bpf_prog_run()") introduced a __no_fgcse macro that expands to a function scope __attribute__((optimize("-fno-gcse"))), to disable a GCC specific optimization that was causing trouble on x86 builds, and was not expected to have any positive effect in the first place.
However, as the GCC manual documents, __attribute__((optimize)) is not for production use, and results in all other optimization options to be forgotten for the function in question. This can cause all kinds of trouble, but in one particular reported case, it causes -fno-asynchronous-unwind-tables to be disregarded, resulting in .eh_frame info to be emitted for the function.
This reverts commit 3193c0836, and instead, it disables the -fgcse optimization for the entire source file, but only when building for X86 using GCC with CONFIG_BPF_JIT_ALWAYS_ON disabled. Note that the original commit states that CONFIG_RETPOLINE=n triggers the issue, whereas CONFIG_RETPOLINE=y performs better without the optimization, so it is kept disabled in both cases.
Fixes: 3193c0836f20 ("bpf: Disable GCC -fgcse optimization for ___bpf_prog_run()") Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Tested-by: Geert Uytterhoeven geert+renesas@glider.be Reviewed-by: Nick Desaulniers ndesaulniers@google.com Link: https://lore.kernel.org/lkml/CAMuHMdUg0WJHEcq6to0-eODpXPOywLot6UD2=GFHpzoj_h... Link: https://lore.kernel.org/bpf/20201028171506.15682-2-ardb@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/compiler-gcc.h | 2 -- include/linux/compiler_types.h | 4 ---- kernel/bpf/Makefile | 6 +++++- kernel/bpf/core.c | 2 +- 4 files changed, 6 insertions(+), 8 deletions(-)
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h index 7a3769040d7dc..3017ebd400546 100644 --- a/include/linux/compiler-gcc.h +++ b/include/linux/compiler-gcc.h @@ -175,5 +175,3 @@ #else #define __diag_GCC_8(s) #endif - -#define __no_fgcse __attribute__((optimize("-fno-gcse"))) diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h index 6e390d58a9f8c..ac3fa37a84f94 100644 --- a/include/linux/compiler_types.h +++ b/include/linux/compiler_types.h @@ -247,10 +247,6 @@ struct ftrace_likely_data { #define asm_inline asm #endif
-#ifndef __no_fgcse -# define __no_fgcse -#endif - /* Are two types/vars the same type (ignoring qualifiers)? */ #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile index e6eb9c0402dab..0cc0de72163dc 100644 --- a/kernel/bpf/Makefile +++ b/kernel/bpf/Makefile @@ -1,6 +1,10 @@ # SPDX-License-Identifier: GPL-2.0 obj-y := core.o -CFLAGS_core.o += $(call cc-disable-warning, override-init) +ifneq ($(CONFIG_BPF_JIT_ALWAYS_ON),y) +# ___bpf_prog_run() needs GCSE disabled on x86; see 3193c0836f203 for details +cflags-nogcse-$(CONFIG_X86)$(CONFIG_CC_IS_GCC) := -fno-gcse +endif +CFLAGS_core.o += $(call cc-disable-warning, override-init) $(cflags-nogcse-yy)
obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o bpf_iter.o map_iter.o task_iter.o prog_iter.o obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index ed0b3578867c0..3cb26e82549ac 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -1364,7 +1364,7 @@ u64 __weak bpf_probe_read_kernel(void *dst, u32 size, const void *unsafe_ptr) * * Decode and execute eBPF instructions. */ -static u64 __no_fgcse ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack) +static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack) { #define BPF_INSN_2_LBL(x, y) [BPF_##x | BPF_##y] = &&x##_##y #define BPF_INSN_3_LBL(x, y, z) [BPF_##x | BPF_##y | BPF_##z] = &&x##_##y##_##z
From: Ian Rogers irogers@google.com
[ Upstream commit 7a078d2d18801bba7bde7337a823d7342299acf7 ]
If bits is 0, the case when the map is empty, then the >> is the size of the register which is undefined behavior - on x86 it is the same as a shift by 0.
Fix by handling the 0 case explicitly and guarding calls to hash_bits for empty maps in hashmap__for_each_key_entry and hashmap__for_each_entry_safe.
Fixes: e3b924224028 ("libbpf: add resizable non-thread safe internal hashmap") Suggested-by: Andrii Nakryiko andriin@fb.com, Signed-off-by: Ian Rogers irogers@google.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Andrii Nakryiko andrii@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20201029223707.494059-1-irogers@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/lib/bpf/hashmap.h | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/tools/lib/bpf/hashmap.h b/tools/lib/bpf/hashmap.h index e0af36b0e5d83..6a3c3d8bb4ab8 100644 --- a/tools/lib/bpf/hashmap.h +++ b/tools/lib/bpf/hashmap.h @@ -15,6 +15,9 @@ static inline size_t hash_bits(size_t h, int bits) { /* shuffle bits and return requested number of upper bits */ + if (bits == 0) + return 0; + #if (__SIZEOF_SIZE_T__ == __SIZEOF_LONG_LONG__) /* LP64 case */ return (h * 11400714819323198485llu) >> (__SIZEOF_LONG_LONG__ * 8 - bits); @@ -162,17 +165,17 @@ bool hashmap__find(const struct hashmap *map, const void *key, void **value); * @key: key to iterate entries for */ #define hashmap__for_each_key_entry(map, cur, _key) \ - for (cur = ({ size_t bkt = hash_bits(map->hash_fn((_key), map->ctx),\ - map->cap_bits); \ - map->buckets ? map->buckets[bkt] : NULL; }); \ + for (cur = map->buckets \ + ? map->buckets[hash_bits(map->hash_fn((_key), map->ctx), map->cap_bits)] \ + : NULL; \ cur; \ cur = cur->next) \ if (map->equal_fn(cur->key, (_key), map->ctx))
#define hashmap__for_each_key_entry_safe(map, cur, tmp, _key) \ - for (cur = ({ size_t bkt = hash_bits(map->hash_fn((_key), map->ctx),\ - map->cap_bits); \ - cur = map->buckets ? map->buckets[bkt] : NULL; }); \ + for (cur = map->buckets \ + ? map->buckets[hash_bits(map->hash_fn((_key), map->ctx), map->cap_bits)] \ + : NULL; \ cur && ({ tmp = cur->next; true; }); \ cur = tmp) \ if (map->equal_fn(cur->key, (_key), map->ctx))
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
[ Upstream commit 2b12c13637134897ba320bd8906a8d918ee7069b ]
It appears that simplification of mcp23s08_spi_regmap_init() made a regression due to wrong size calculation for dev_kmemdup() call. It misses the fact that config variable is already a pointer, thus the sizeof() calculation is wrong and only 4 or 8 bytes were copied.
Fix the parameters to devm_kmemdup() to copy a full chunk of memory.
Fixes: 0874758ecb2b ("pinctrl: mcp23s08: Refactor mcp23s08_spi_regmap_init()") Reported-by: Martin Hundebøll martin@geanix.com Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Tested-by: Martin Hundebøll martin@geanix.com Link: https://lore.kernel.org/r/20201009180856.4738-1-andriy.shevchenko@linux.inte... Tested-by: Jan Kundrát jan.kundrat@cesnet.cz Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pinctrl/pinctrl-mcp23s08_spi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pinctrl/pinctrl-mcp23s08_spi.c b/drivers/pinctrl/pinctrl-mcp23s08_spi.c index 1f47a661b0a79..7c72cffe14127 100644 --- a/drivers/pinctrl/pinctrl-mcp23s08_spi.c +++ b/drivers/pinctrl/pinctrl-mcp23s08_spi.c @@ -119,7 +119,7 @@ static int mcp23s08_spi_regmap_init(struct mcp23s08 *mcp, struct device *dev, return -EINVAL; }
- copy = devm_kmemdup(dev, &config, sizeof(config), GFP_KERNEL); + copy = devm_kmemdup(dev, config, sizeof(*config), GFP_KERNEL); if (!copy) return -ENOMEM;
From: Billy Tsai billy_tsai@aspeedtech.com
[ Upstream commit 9b92f5c51e9a41352d665f6f956bd95085a56a83 ]
Some gpio pin at aspeed soc is input only and the prefix name of these pin is "GPI" only. This patch fine-tune the condition of GPIO check from "GPIO" to "GPI" and it will fix the usage error of banks D and E in the AST2400/AST2500 and banks T and U in the AST2600.
Fixes: 4d3d0e4272d8 ("pinctrl: Add core support for Aspeed SoCs") Signed-off-by: Billy Tsai billy_tsai@aspeedtech.com Reviewed-by: Andrew Jeffery andrew@aj.id.au Link: https://lore.kernel.org/r/20201030055450.29613-1-billy_tsai@aspeedtech.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pinctrl/aspeed/pinctrl-aspeed.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/pinctrl/aspeed/pinctrl-aspeed.c b/drivers/pinctrl/aspeed/pinctrl-aspeed.c index 3e6567355d97d..1d603732903fe 100644 --- a/drivers/pinctrl/aspeed/pinctrl-aspeed.c +++ b/drivers/pinctrl/aspeed/pinctrl-aspeed.c @@ -286,13 +286,14 @@ int aspeed_pinmux_set_mux(struct pinctrl_dev *pctldev, unsigned int function, static bool aspeed_expr_is_gpio(const struct aspeed_sig_expr *expr) { /* - * The signal type is GPIO if the signal name has "GPIO" as a prefix. + * The signal type is GPIO if the signal name has "GPI" as a prefix. * strncmp (rather than strcmp) is used to implement the prefix * requirement. * - * expr->signal might look like "GPIOT3" in the GPIO case. + * expr->signal might look like "GPIOB1" in the GPIO case. + * expr->signal might look like "GPIT0" in the GPI case. */ - return strncmp(expr->signal, "GPIO", 4) == 0; + return strncmp(expr->signal, "GPI", 3) == 0; }
static bool aspeed_gpio_in_exprs(const struct aspeed_sig_expr **exprs)
From: Maor Dickman maord@nvidia.com
[ Upstream commit e68e28b4a9d71261e3f8fd05a72d6cf0b443a493 ]
Modify header actions are allocated during parse tc actions and only freed during the flow creation, however, on error flow the allocated memory is wrongly unfreed.
Fix this by calling dealloc_mod_hdr_actions in __mlx5e_add_fdb_flow and mlx5e_add_nic_flow error flow.
Fixes: d7e75a325cb2 ("net/mlx5e: Add offloading of E-Switch TC pedit (header re-write) actions") Fixes: 2f4fe4cab073 ("net/mlx5e: Add offloading of NIC TC pedit (header re-write) actions") Signed-off-by: Maor Dickman maord@nvidia.com Reviewed-by: Paul Blakey paulb@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c index 1c93f92d9210a..44947b054dc4c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -4430,6 +4430,7 @@ __mlx5e_add_fdb_flow(struct mlx5e_priv *priv, return flow;
err_free: + dealloc_mod_hdr_actions(&parse_attr->mod_hdr_acts); mlx5e_flow_put(priv, flow); out: return ERR_PTR(err); @@ -4564,6 +4565,7 @@ mlx5e_add_nic_flow(struct mlx5e_priv *priv, return 0;
err_free: + dealloc_mod_hdr_actions(&parse_attr->mod_hdr_acts); mlx5e_flow_put(priv, flow); kvfree(parse_attr); out:
From: Vlad Buslov vladbu@nvidia.com
[ Upstream commit 78c906e430b13d30a8cfbdef4ccbbe1686841a9e ]
In functions mlx5e_route_lookup_ipv{4|6}() route_dev can be arbitrary net device and not necessary mlx5 eswitch port representor. As such, in order to ensure that route_dev is not destroyed concurrent the code needs either explicitly take reference to the device before releasing reference to rtable instance or ensure that caller holds rtnl lock. First approach is chosen as a fix since rtnl lock dependency was intentionally removed from mlx5 TC layer.
To prevent unprotected usage of route_dev in encap code take a reference to the device before releasing rt. Don't save direct pointer to the device in mlx5_encap_entry structure and use ifindex instead. Modify users of route_dev pointer to properly obtain the net device instance from its ifindex.
Fixes: 61086f391044 ("net/mlx5e: Protect encap hash table with mutex") Fixes: 6707f74be862 ("net/mlx5e: Update hw flows when encap source mac changed") Signed-off-by: Vlad Buslov vladbu@nvidia.com Reviewed-by: Roi Dayan roid@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/mellanox/mlx5/core/en/rep/tc.c | 6 +- .../ethernet/mellanox/mlx5/core/en/tc_tun.c | 72 ++++++++++++------- .../net/ethernet/mellanox/mlx5/core/en_rep.h | 2 +- 3 files changed, 52 insertions(+), 28 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c index 79cc42d88eec6..38ea249159f60 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c @@ -107,12 +107,16 @@ void mlx5e_rep_update_flows(struct mlx5e_priv *priv, mlx5e_tc_encap_flows_del(priv, e, &flow_list);
if (neigh_connected && !(e->flags & MLX5_ENCAP_ENTRY_VALID)) { + struct net_device *route_dev; + ether_addr_copy(e->h_dest, ha); ether_addr_copy(eth->h_dest, ha); /* Update the encap source mac, in case that we delete * the flows when encap source mac changed. */ - ether_addr_copy(eth->h_source, e->route_dev->dev_addr); + route_dev = __dev_get_by_index(dev_net(priv->netdev), e->route_dev_ifindex); + if (route_dev) + ether_addr_copy(eth->h_source, route_dev->dev_addr);
mlx5e_tc_encap_flows_add(priv, e, &flow_list); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c index 7cce85faa16fa..90930e54b6f28 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c @@ -77,13 +77,13 @@ static int get_route_and_out_devs(struct mlx5e_priv *priv, return 0; }
-static int mlx5e_route_lookup_ipv4(struct mlx5e_priv *priv, - struct net_device *mirred_dev, - struct net_device **out_dev, - struct net_device **route_dev, - struct flowi4 *fl4, - struct neighbour **out_n, - u8 *out_ttl) +static int mlx5e_route_lookup_ipv4_get(struct mlx5e_priv *priv, + struct net_device *mirred_dev, + struct net_device **out_dev, + struct net_device **route_dev, + struct flowi4 *fl4, + struct neighbour **out_n, + u8 *out_ttl) { struct neighbour *n; struct rtable *rt; @@ -117,18 +117,28 @@ static int mlx5e_route_lookup_ipv4(struct mlx5e_priv *priv, ip_rt_put(rt); return ret; } + dev_hold(*route_dev);
if (!(*out_ttl)) *out_ttl = ip4_dst_hoplimit(&rt->dst); n = dst_neigh_lookup(&rt->dst, &fl4->daddr); ip_rt_put(rt); - if (!n) + if (!n) { + dev_put(*route_dev); return -ENOMEM; + }
*out_n = n; return 0; }
+static void mlx5e_route_lookup_ipv4_put(struct net_device *route_dev, + struct neighbour *n) +{ + neigh_release(n); + dev_put(route_dev); +} + static const char *mlx5e_netdev_kind(struct net_device *dev) { if (dev->rtnl_link_ops) @@ -193,8 +203,8 @@ int mlx5e_tc_tun_create_header_ipv4(struct mlx5e_priv *priv, fl4.saddr = tun_key->u.ipv4.src; ttl = tun_key->ttl;
- err = mlx5e_route_lookup_ipv4(priv, mirred_dev, &out_dev, &route_dev, - &fl4, &n, &ttl); + err = mlx5e_route_lookup_ipv4_get(priv, mirred_dev, &out_dev, &route_dev, + &fl4, &n, &ttl); if (err) return err;
@@ -223,7 +233,7 @@ int mlx5e_tc_tun_create_header_ipv4(struct mlx5e_priv *priv, e->m_neigh.family = n->ops->family; memcpy(&e->m_neigh.dst_ip, n->primary_key, n->tbl->key_len); e->out_dev = out_dev; - e->route_dev = route_dev; + e->route_dev_ifindex = route_dev->ifindex;
/* It's important to add the neigh to the hash table before checking * the neigh validity state. So if we'll get a notification, in case the @@ -278,7 +288,7 @@ int mlx5e_tc_tun_create_header_ipv4(struct mlx5e_priv *priv,
e->flags |= MLX5_ENCAP_ENTRY_VALID; mlx5e_rep_queue_neigh_stats_work(netdev_priv(out_dev)); - neigh_release(n); + mlx5e_route_lookup_ipv4_put(route_dev, n); return err;
destroy_neigh_entry: @@ -286,18 +296,18 @@ destroy_neigh_entry: free_encap: kfree(encap_header); release_neigh: - neigh_release(n); + mlx5e_route_lookup_ipv4_put(route_dev, n); return err; }
#if IS_ENABLED(CONFIG_INET) && IS_ENABLED(CONFIG_IPV6) -static int mlx5e_route_lookup_ipv6(struct mlx5e_priv *priv, - struct net_device *mirred_dev, - struct net_device **out_dev, - struct net_device **route_dev, - struct flowi6 *fl6, - struct neighbour **out_n, - u8 *out_ttl) +static int mlx5e_route_lookup_ipv6_get(struct mlx5e_priv *priv, + struct net_device *mirred_dev, + struct net_device **out_dev, + struct net_device **route_dev, + struct flowi6 *fl6, + struct neighbour **out_n, + u8 *out_ttl) { struct dst_entry *dst; struct neighbour *n; @@ -318,15 +328,25 @@ static int mlx5e_route_lookup_ipv6(struct mlx5e_priv *priv, return ret; }
+ dev_hold(*route_dev); n = dst_neigh_lookup(dst, &fl6->daddr); dst_release(dst); - if (!n) + if (!n) { + dev_put(*route_dev); return -ENOMEM; + }
*out_n = n; return 0; }
+static void mlx5e_route_lookup_ipv6_put(struct net_device *route_dev, + struct neighbour *n) +{ + neigh_release(n); + dev_put(route_dev); +} + int mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv *priv, struct net_device *mirred_dev, struct mlx5e_encap_entry *e) @@ -348,8 +368,8 @@ int mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv *priv, fl6.daddr = tun_key->u.ipv6.dst; fl6.saddr = tun_key->u.ipv6.src;
- err = mlx5e_route_lookup_ipv6(priv, mirred_dev, &out_dev, &route_dev, - &fl6, &n, &ttl); + err = mlx5e_route_lookup_ipv6_get(priv, mirred_dev, &out_dev, &route_dev, + &fl6, &n, &ttl); if (err) return err;
@@ -378,7 +398,7 @@ int mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv *priv, e->m_neigh.family = n->ops->family; memcpy(&e->m_neigh.dst_ip, n->primary_key, n->tbl->key_len); e->out_dev = out_dev; - e->route_dev = route_dev; + e->route_dev_ifindex = route_dev->ifindex;
/* It's importent to add the neigh to the hash table before checking * the neigh validity state. So if we'll get a notification, in case the @@ -433,7 +453,7 @@ int mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv *priv,
e->flags |= MLX5_ENCAP_ENTRY_VALID; mlx5e_rep_queue_neigh_stats_work(netdev_priv(out_dev)); - neigh_release(n); + mlx5e_route_lookup_ipv6_put(route_dev, n); return err;
destroy_neigh_entry: @@ -441,7 +461,7 @@ destroy_neigh_entry: free_encap: kfree(encap_header); release_neigh: - neigh_release(n); + mlx5e_route_lookup_ipv6_put(route_dev, n); return err; } #endif diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h index 0d1562e20118c..963a6d98840ac 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h @@ -187,7 +187,7 @@ struct mlx5e_encap_entry { unsigned char h_dest[ETH_ALEN]; /* destination eth addr */
struct net_device *out_dev; - struct net_device *route_dev; + int route_dev_ifindex; struct mlx5e_tc_tunnel *tunnel; int reformat_type; u8 flags;
From: Maxim Mikityanskiy maximmi@mellanox.com
[ Upstream commit f42139ba49791ab6b12443c60044872705b74a1e ]
async_icosq_lock may be taken from softirq and non-softirq contexts. It requires protection with spin_lock_bh, otherwise a softirq may be triggered in the middle of the critical section, and it may deadlock if it tries to take the same lock. This patch fixes such a scenario by using spin_lock_bh to disable softirqs on that CPU while inside the critical section.
Fixes: 8d94b590f1e4 ("net/mlx5e: Turn XSK ICOSQ into a general asynchronous one") Signed-off-by: Maxim Mikityanskiy maximmi@mellanox.com Reviewed-by: Tariq Toukan tariqt@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/mellanox/mlx5/core/en/xsk/setup.c | 4 ++-- .../net/ethernet/mellanox/mlx5/core/en/xsk/tx.c | 4 ++-- .../ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c | 14 +++++++------- 3 files changed, 11 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c index 55e65a438de70..fcaeb30778bc7 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c @@ -122,9 +122,9 @@ void mlx5e_activate_xsk(struct mlx5e_channel *c) set_bit(MLX5E_RQ_STATE_ENABLED, &c->xskrq.state); /* TX queue is created active. */
- spin_lock(&c->async_icosq_lock); + spin_lock_bh(&c->async_icosq_lock); mlx5e_trigger_irq(&c->async_icosq); - spin_unlock(&c->async_icosq_lock); + spin_unlock_bh(&c->async_icosq_lock); }
void mlx5e_deactivate_xsk(struct mlx5e_channel *c) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c index 4d892f6cecb3e..4de70cee80c0a 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c @@ -36,9 +36,9 @@ int mlx5e_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags) if (test_and_set_bit(MLX5E_SQ_STATE_PENDING_XSK_TX, &c->async_icosq.state)) return 0;
- spin_lock(&c->async_icosq_lock); + spin_lock_bh(&c->async_icosq_lock); mlx5e_trigger_irq(&c->async_icosq); - spin_unlock(&c->async_icosq_lock); + spin_unlock_bh(&c->async_icosq_lock); }
return 0; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c index 6bbfcf18107d2..979ff5658a3f7 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c @@ -188,7 +188,7 @@ static int post_rx_param_wqes(struct mlx5e_channel *c,
err = 0; sq = &c->async_icosq; - spin_lock(&c->async_icosq_lock); + spin_lock_bh(&c->async_icosq_lock);
cseg = post_static_params(sq, priv_rx); if (IS_ERR(cseg)) @@ -199,7 +199,7 @@ static int post_rx_param_wqes(struct mlx5e_channel *c,
mlx5e_notify_hw(&sq->wq, sq->pc, sq->uar_map, cseg); unlock: - spin_unlock(&c->async_icosq_lock); + spin_unlock_bh(&c->async_icosq_lock);
return err;
@@ -265,10 +265,10 @@ resync_post_get_progress_params(struct mlx5e_icosq *sq,
BUILD_BUG_ON(MLX5E_KTLS_GET_PROGRESS_WQEBBS != 1);
- spin_lock(&sq->channel->async_icosq_lock); + spin_lock_bh(&sq->channel->async_icosq_lock);
if (unlikely(!mlx5e_wqc_has_room_for(&sq->wq, sq->cc, sq->pc, 1))) { - spin_unlock(&sq->channel->async_icosq_lock); + spin_unlock_bh(&sq->channel->async_icosq_lock); err = -ENOSPC; goto err_dma_unmap; } @@ -299,7 +299,7 @@ resync_post_get_progress_params(struct mlx5e_icosq *sq, icosq_fill_wi(sq, pi, &wi); sq->pc++; mlx5e_notify_hw(&sq->wq, sq->pc, sq->uar_map, cseg); - spin_unlock(&sq->channel->async_icosq_lock); + spin_unlock_bh(&sq->channel->async_icosq_lock);
return 0;
@@ -360,7 +360,7 @@ static int resync_handle_seq_match(struct mlx5e_ktls_offload_context_rx *priv_rx err = 0;
sq = &c->async_icosq; - spin_lock(&c->async_icosq_lock); + spin_lock_bh(&c->async_icosq_lock);
cseg = post_static_params(sq, priv_rx); if (IS_ERR(cseg)) { @@ -372,7 +372,7 @@ static int resync_handle_seq_match(struct mlx5e_ktls_offload_context_rx *priv_rx mlx5e_notify_hw(&sq->wq, sq->pc, sq->uar_map, cseg); priv_rx->stats->tls_resync_res_ok++; unlock: - spin_unlock(&c->async_icosq_lock); + spin_unlock_bh(&c->async_icosq_lock);
return err; }
From: Maor Gottlieb maorg@nvidia.com
[ Upstream commit 465e7baab6d93b399344f5868f84c177ab5cd16f ]
When a rule is duplicated, the refcount of the rule is increased so only the second deletion of the rule should cause destruction of the FTE. Currently, the FTE will be destroyed in the first deletion of rule since the modify_mask will be 0. Fix it and call to destroy FTE only if all the rules (FTE's children) have been removed.
Fixes: 718ce4d601db ("net/mlx5: Consolidate update FTE for all removal changes") Signed-off-by: Maor Gottlieb maorg@nvidia.com Reviewed-by: Mark Bloch mbloch@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c index 75fa44eee434d..d4755d61dd740 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c @@ -1994,10 +1994,11 @@ void mlx5_del_flow_rules(struct mlx5_flow_handle *handle) down_write_ref_node(&fte->node, false); for (i = handle->num_rules - 1; i >= 0; i--) tree_remove_node(&handle->rule[i]->node, true); - if (fte->modify_mask && fte->dests_size) { - modify_fte(fte); + if (fte->dests_size) { + if (fte->modify_mask) + modify_fte(fte); up_write_ref_node(&fte->node, false); - } else { + } else if (list_empty(&fte->node.children)) { del_hw_fte(&fte->node); /* Avoid double call to del_hw_fte */ fte->node.del_hw_func = NULL;
From: Parav Pandit parav@nvidia.com
[ Upstream commit ae35859445607f7f18dd4f332749219cd636ed59 ]
When E-switch vport is disabled, querying its hardware address is unsupported. Avoid setting extack error log message in such case.
Fixes: f099fde16db3 ("net/mlx5: E-switch, Support querying port function mac address") Signed-off-by: Parav Pandit parav@nvidia.com Reviewed-by: Roi Dayan roid@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c index 6e6a9a5639928..e8e6294c7ccae 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c @@ -1902,8 +1902,6 @@ int mlx5_devlink_port_function_hw_addr_get(struct devlink *devlink, ether_addr_copy(hw_addr, vport->info.mac); *hw_addr_len = ETH_ALEN; err = 0; - } else { - NL_SET_ERR_MSG_MOD(extack, "Eswitch vport is disabled"); } mutex_unlock(&esw->state_lock); return err;
From: Aya Levin ayal@nvidia.com
[ Upstream commit c5eb51adf06b2644fa28d4af886bfdcc53e288da ]
During driver reload, perform firmware tear-down which results in firmware losing the configured VXLAN ports. These ports are still available in the driver's database. Fix this by cleaning up driver's VXLAN database in the nic unload flow, before firmware tear-down. With that, minimize mlx5_vxlan_destroy() to remove only what was added in mlx5_vxlan_create() and warn on leftover UDP ports.
Fixes: 18a2b7f969c9 ("net/mlx5: convert to new udp_tunnel infrastructure") Signed-off-by: Aya Levin ayal@nvidia.com Reviewed-by: Moshe Shemesh moshe@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/mellanox/mlx5/core/en_main.c | 1 + .../ethernet/mellanox/mlx5/core/lib/vxlan.c | 23 ++++++++++++++----- .../ethernet/mellanox/mlx5/core/lib/vxlan.h | 2 ++ 3 files changed, 20 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 42ec28e298348..f399973a44eb0 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -5226,6 +5226,7 @@ static void mlx5e_nic_disable(struct mlx5e_priv *priv)
mlx5e_disable_async_events(priv); mlx5_lag_remove(mdev); + mlx5_vxlan_reset_to_default(mdev->vxlan); }
int mlx5e_update_nic_rx(struct mlx5e_priv *priv) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/vxlan.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/vxlan.c index 3315afe2f8dce..38084400ee8fa 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/vxlan.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/vxlan.c @@ -167,6 +167,17 @@ struct mlx5_vxlan *mlx5_vxlan_create(struct mlx5_core_dev *mdev) }
void mlx5_vxlan_destroy(struct mlx5_vxlan *vxlan) +{ + if (!mlx5_vxlan_allowed(vxlan)) + return; + + mlx5_vxlan_del_port(vxlan, IANA_VXLAN_UDP_PORT); + WARN_ON(!hash_empty(vxlan->htable)); + + kfree(vxlan); +} + +void mlx5_vxlan_reset_to_default(struct mlx5_vxlan *vxlan) { struct mlx5_vxlan_port *vxlanp; struct hlist_node *tmp; @@ -175,12 +186,12 @@ void mlx5_vxlan_destroy(struct mlx5_vxlan *vxlan) if (!mlx5_vxlan_allowed(vxlan)) return;
- /* Lockless since we are the only hash table consumers*/ hash_for_each_safe(vxlan->htable, bkt, tmp, vxlanp, hlist) { - hash_del(&vxlanp->hlist); - mlx5_vxlan_core_del_port_cmd(vxlan->mdev, vxlanp->udp_port); - kfree(vxlanp); + /* Don't delete default UDP port added by the HW. + * Remove only user configured ports + */ + if (vxlanp->udp_port == IANA_VXLAN_UDP_PORT) + continue; + mlx5_vxlan_del_port(vxlan, vxlanp->udp_port); } - - kfree(vxlan); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/vxlan.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/vxlan.h index ec766529f49b6..34ef662da35ed 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/vxlan.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/vxlan.h @@ -56,6 +56,7 @@ void mlx5_vxlan_destroy(struct mlx5_vxlan *vxlan); int mlx5_vxlan_add_port(struct mlx5_vxlan *vxlan, u16 port); int mlx5_vxlan_del_port(struct mlx5_vxlan *vxlan, u16 port); bool mlx5_vxlan_lookup_port(struct mlx5_vxlan *vxlan, u16 port); +void mlx5_vxlan_reset_to_default(struct mlx5_vxlan *vxlan); #else static inline struct mlx5_vxlan* mlx5_vxlan_create(struct mlx5_core_dev *mdev) { return ERR_PTR(-EOPNOTSUPP); } @@ -63,6 +64,7 @@ static inline void mlx5_vxlan_destroy(struct mlx5_vxlan *vxlan) { return; } static inline int mlx5_vxlan_add_port(struct mlx5_vxlan *vxlan, u16 port) { return -EOPNOTSUPP; } static inline int mlx5_vxlan_del_port(struct mlx5_vxlan *vxlan, u16 port) { return -EOPNOTSUPP; } static inline bool mlx5_vxlan_lookup_port(struct mlx5_vxlan *vxlan, u16 port) { return false; } +static inline void mlx5_vxlan_reset_to_default(struct mlx5_vxlan *vxlan) { return; } #endif
#endif /* __MLX5_VXLAN_H__ */
From: Maxim Mikityanskiy maximmi@mellanox.com
[ Upstream commit 1a50cf9a67ff2241c2949d30bc11c8dd4280eef8 ]
rq->xdp_prog is RCU-protected and should be accessed only with rcu_access_pointer for the NULL check in mlx5e_poll_rx_cq.
rq->xdp_prog may change on the fly only from one non-NULL value to another non-NULL value, so the checks in mlx5e_xdp_handle and mlx5e_poll_rx_cq will have the same result during one NAPI cycle, meaning that no additional synchronization is needed.
Fixes: fe45386a2082 ("net/mlx5e: Use RCU to protect rq->xdp_prog") Signed-off-by: Maxim Mikityanskiy maximmi@mellanox.com Reviewed-by: Tariq Toukan tariqt@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c index 64c8ac5eabf6a..a0a4398408b85 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c @@ -1566,7 +1566,7 @@ int mlx5e_poll_rx_cq(struct mlx5e_cq *cq, int budget) } while ((++work_done < budget) && (cqe = mlx5_cqwq_get_cqe(cqwq)));
out: - if (rq->xdp_prog) + if (rcu_access_pointer(rq->xdp_prog)) mlx5e_xdp_rx_poll_complete(rq);
mlx5_cqwq_update_db_record(cqwq);
From: Chuck Lever chuck.lever@oracle.com
[ Upstream commit d321ff589c16d8c2207485a6d7fbdb14e873d46e ]
The TP_fast_assign() section is careful enough not to dereference xdr->rqst if it's NULL. The TP_STRUCT__entry section is not.
Fixes: 5582863f450c ("SUNRPC: Add XDR overflow trace event") Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: J. Bruce Fields bfields@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/trace/events/sunrpc.h | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h index 65d7dfbbc9cd7..ca2f27b9f919d 100644 --- a/include/trace/events/sunrpc.h +++ b/include/trace/events/sunrpc.h @@ -607,10 +607,10 @@ TRACE_EVENT(rpc_xdr_overflow, __field(size_t, tail_len) __field(unsigned int, page_len) __field(unsigned int, len) - __string(progname, - xdr->rqst->rq_task->tk_client->cl_program->name) - __string(procedure, - xdr->rqst->rq_task->tk_msg.rpc_proc->p_name) + __string(progname, xdr->rqst ? + xdr->rqst->rq_task->tk_client->cl_program->name : "unknown") + __string(procedure, xdr->rqst ? + xdr->rqst->rq_task->tk_msg.rpc_proc->p_name : "unknown") ),
TP_fast_assign(
From: Dai Ngo dai.ngo@oracle.com
[ Upstream commit 36e1e5ba90fb3fba6888fae26e4dfc28bf70aaf1 ]
The source file nfsd_file is not constructed the same as other nfsd_file's via nfsd_file_alloc. nfsd_file_put should not be called to free the object; nfsd_file_put is not the inverse of kzalloc, instead kfree is called by nfsd4_do_async_copy when done.
Fixes: ce0887ac96d3 ("NFSD add nfs4 inter ssc to nfsd4_copy") Signed-off-by: Dai Ngo dai.ngo@oracle.com Signed-off-by: J. Bruce Fields bfields@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfsd/nfs4proc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c index 84e10aef14175..80effaa18b7b2 100644 --- a/fs/nfsd/nfs4proc.c +++ b/fs/nfsd/nfs4proc.c @@ -1299,7 +1299,7 @@ nfsd4_cleanup_inter_ssc(struct vfsmount *ss_mnt, struct nfsd_file *src, struct nfsd_file *dst) { nfs42_ssc_close(src->nf_file); - nfsd_file_put(src); + /* 'src' is freed by nfsd4_do_async_copy */ nfsd_file_put(dst); mntput(ss_mnt); }
From: Dai Ngo dai.ngo@oracle.com
[ Upstream commit 49a361327332c9221438397059067f9b205f690d ]
Need to initialize nfsd4_copy's refcount to 1 to avoid use-after-free warning when nfs4_put_copy is called from nfsd4_cb_offload_release.
Fixes: ce0887ac96d3 ("NFSD add nfs4 inter ssc to nfsd4_copy") Signed-off-by: Dai Ngo dai.ngo@oracle.com Signed-off-by: J. Bruce Fields bfields@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfsd/nfs4proc.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c index 80effaa18b7b2..3ba17b5fc9286 100644 --- a/fs/nfsd/nfs4proc.c +++ b/fs/nfsd/nfs4proc.c @@ -1486,6 +1486,7 @@ do_callback: cb_copy = kzalloc(sizeof(struct nfsd4_copy), GFP_KERNEL); if (!cb_copy) goto out; + refcount_set(&cb_copy->refcount, 1); memcpy(&cb_copy->cp_res, ©->cp_res, sizeof(copy->cp_res)); cb_copy->cp_clp = copy->cp_clp; cb_copy->nfserr = copy->nfserr;
From: Lorenz Bauer lmb@cloudflare.com
[ Upstream commit f9b7ff0d7f7a466a920424246e7ddc2b84c87e52 ]
My earlier patch to reject non-zero arguments to flow dissector attach broke attaching via bpftool. Instead of 0 it uses -1 for target_fd. Fix this by passing a zero argument when attaching the flow dissector.
Fixes: 1b514239e859 ("bpf: flow_dissector: Check value of unused flags to BPF_PROG_ATTACH") Reported-by: Jiri Benc jbenc@redhat.com Signed-off-by: Lorenz Bauer lmb@cloudflare.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20201105115230.296657-1-lmb@cloudflare.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/bpf/bpftool/prog.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c index d393eb8263a60..994506540e564 100644 --- a/tools/bpf/bpftool/prog.c +++ b/tools/bpf/bpftool/prog.c @@ -741,7 +741,7 @@ static int parse_attach_detach_args(int argc, char **argv, int *progfd, }
if (*attach_type == BPF_FLOW_DISSECTOR) { - *mapfd = -1; + *mapfd = 0; return 0; }
From: David Verbeiren david.verbeiren@tessares.net
[ Upstream commit d3bec0138bfbe58606fc1d6f57a4cdc1a20218db ]
Zero-fill element values for all other cpus than current, just as when not using prealloc. This is the only way the bpf program can ensure known initial values for all cpus ('onallcpus' cannot be set when coming from the bpf program).
The scenario is: bpf program inserts some elements in a per-cpu map, then deletes some (or userspace does). When later adding new elements using bpf_map_update_elem(), the bpf program can only set the value of the new elements for the current cpu. When prealloc is enabled, previously deleted elements are re-used. Without the fix, values for other cpus remain whatever they were when the re-used entry was previously freed.
A selftest is added to validate correct operation in above scenario as well as in case of LRU per-cpu map element re-use.
Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements") Signed-off-by: David Verbeiren david.verbeiren@tessares.net Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Matthieu Baerts matthieu.baerts@tessares.net Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20201104112332.15191-1-david.verbeiren@tessares.... Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/hashtab.c | 30 ++- .../selftests/bpf/prog_tests/map_init.c | 214 ++++++++++++++++++ .../selftests/bpf/progs/test_map_init.c | 33 +++ 3 files changed, 275 insertions(+), 2 deletions(-) create mode 100644 tools/testing/selftests/bpf/prog_tests/map_init.c create mode 100644 tools/testing/selftests/bpf/progs/test_map_init.c
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c index 7df28a45c66bf..15364543b2c0f 100644 --- a/kernel/bpf/hashtab.c +++ b/kernel/bpf/hashtab.c @@ -821,6 +821,32 @@ static void pcpu_copy_value(struct bpf_htab *htab, void __percpu *pptr, } }
+static void pcpu_init_value(struct bpf_htab *htab, void __percpu *pptr, + void *value, bool onallcpus) +{ + /* When using prealloc and not setting the initial value on all cpus, + * zero-fill element values for other cpus (just as what happens when + * not using prealloc). Otherwise, bpf program has no way to ensure + * known initial values for cpus other than current one + * (onallcpus=false always when coming from bpf prog). + */ + if (htab_is_prealloc(htab) && !onallcpus) { + u32 size = round_up(htab->map.value_size, 8); + int current_cpu = raw_smp_processor_id(); + int cpu; + + for_each_possible_cpu(cpu) { + if (cpu == current_cpu) + bpf_long_memcpy(per_cpu_ptr(pptr, cpu), value, + size); + else + memset(per_cpu_ptr(pptr, cpu), 0, size); + } + } else { + pcpu_copy_value(htab, pptr, value, onallcpus); + } +} + static bool fd_htab_map_needs_adjust(const struct bpf_htab *htab) { return htab->map.map_type == BPF_MAP_TYPE_HASH_OF_MAPS && @@ -891,7 +917,7 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key, } }
- pcpu_copy_value(htab, pptr, value, onallcpus); + pcpu_init_value(htab, pptr, value, onallcpus);
if (!prealloc) htab_elem_set_ptr(l_new, key_size, pptr); @@ -1183,7 +1209,7 @@ static int __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key, pcpu_copy_value(htab, htab_elem_get_ptr(l_old, key_size), value, onallcpus); } else { - pcpu_copy_value(htab, htab_elem_get_ptr(l_new, key_size), + pcpu_init_value(htab, htab_elem_get_ptr(l_new, key_size), value, onallcpus); hlist_nulls_add_head_rcu(&l_new->hash_node, head); l_new = NULL; diff --git a/tools/testing/selftests/bpf/prog_tests/map_init.c b/tools/testing/selftests/bpf/prog_tests/map_init.c new file mode 100644 index 0000000000000..14a31109dd0e0 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/map_init.c @@ -0,0 +1,214 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright (c) 2020 Tessares SA http://www.tessares.net */ + +#include <test_progs.h> +#include "test_map_init.skel.h" + +#define TEST_VALUE 0x1234 +#define FILL_VALUE 0xdeadbeef + +static int nr_cpus; +static int duration; + +typedef unsigned long long map_key_t; +typedef unsigned long long map_value_t; +typedef struct { + map_value_t v; /* padding */ +} __bpf_percpu_val_align pcpu_map_value_t; + + +static int map_populate(int map_fd, int num) +{ + pcpu_map_value_t value[nr_cpus]; + int i, err; + map_key_t key; + + for (i = 0; i < nr_cpus; i++) + bpf_percpu(value, i) = FILL_VALUE; + + for (key = 1; key <= num; key++) { + err = bpf_map_update_elem(map_fd, &key, value, BPF_NOEXIST); + if (!ASSERT_OK(err, "bpf_map_update_elem")) + return -1; + } + + return 0; +} + +static struct test_map_init *setup(enum bpf_map_type map_type, int map_sz, + int *map_fd, int populate) +{ + struct test_map_init *skel; + int err; + + skel = test_map_init__open(); + if (!ASSERT_OK_PTR(skel, "skel_open")) + return NULL; + + err = bpf_map__set_type(skel->maps.hashmap1, map_type); + if (!ASSERT_OK(err, "bpf_map__set_type")) + goto error; + + err = bpf_map__set_max_entries(skel->maps.hashmap1, map_sz); + if (!ASSERT_OK(err, "bpf_map__set_max_entries")) + goto error; + + err = test_map_init__load(skel); + if (!ASSERT_OK(err, "skel_load")) + goto error; + + *map_fd = bpf_map__fd(skel->maps.hashmap1); + if (CHECK(*map_fd < 0, "bpf_map__fd", "failed\n")) + goto error; + + err = map_populate(*map_fd, populate); + if (!ASSERT_OK(err, "map_populate")) + goto error_map; + + return skel; + +error_map: + close(*map_fd); +error: + test_map_init__destroy(skel); + return NULL; +} + +/* executes bpf program that updates map with key, value */ +static int prog_run_insert_elem(struct test_map_init *skel, map_key_t key, + map_value_t value) +{ + struct test_map_init__bss *bss; + + bss = skel->bss; + + bss->inKey = key; + bss->inValue = value; + bss->inPid = getpid(); + + if (!ASSERT_OK(test_map_init__attach(skel), "skel_attach")) + return -1; + + /* Let tracepoint trigger */ + syscall(__NR_getpgid); + + test_map_init__detach(skel); + + return 0; +} + +static int check_values_one_cpu(pcpu_map_value_t *value, map_value_t expected) +{ + int i, nzCnt = 0; + map_value_t val; + + for (i = 0; i < nr_cpus; i++) { + val = bpf_percpu(value, i); + if (val) { + if (CHECK(val != expected, "map value", + "unexpected for cpu %d: 0x%llx\n", i, val)) + return -1; + nzCnt++; + } + } + + if (CHECK(nzCnt != 1, "map value", "set for %d CPUs instead of 1!\n", + nzCnt)) + return -1; + + return 0; +} + +/* Add key=1 elem with values set for all CPUs + * Delete elem key=1 + * Run bpf prog that inserts new key=1 elem with value=0x1234 + * (bpf prog can only set value for current CPU) + * Lookup Key=1 and check value is as expected for all CPUs: + * value set by bpf prog for one CPU, 0 for all others + */ +static void test_pcpu_map_init(void) +{ + pcpu_map_value_t value[nr_cpus]; + struct test_map_init *skel; + int map_fd, err; + map_key_t key; + + /* max 1 elem in map so insertion is forced to reuse freed entry */ + skel = setup(BPF_MAP_TYPE_PERCPU_HASH, 1, &map_fd, 1); + if (!ASSERT_OK_PTR(skel, "prog_setup")) + return; + + /* delete element so the entry can be re-used*/ + key = 1; + err = bpf_map_delete_elem(map_fd, &key); + if (!ASSERT_OK(err, "bpf_map_delete_elem")) + goto cleanup; + + /* run bpf prog that inserts new elem, re-using the slot just freed */ + err = prog_run_insert_elem(skel, key, TEST_VALUE); + if (!ASSERT_OK(err, "prog_run_insert_elem")) + goto cleanup; + + /* check that key=1 was re-created by bpf prog */ + err = bpf_map_lookup_elem(map_fd, &key, value); + if (!ASSERT_OK(err, "bpf_map_lookup_elem")) + goto cleanup; + + /* and has expected values */ + check_values_one_cpu(value, TEST_VALUE); + +cleanup: + test_map_init__destroy(skel); +} + +/* Add key=1 and key=2 elems with values set for all CPUs + * Run bpf prog that inserts new key=3 elem + * (only for current cpu; other cpus should have initial value = 0) + * Lookup Key=1 and check value is as expected for all CPUs + */ +static void test_pcpu_lru_map_init(void) +{ + pcpu_map_value_t value[nr_cpus]; + struct test_map_init *skel; + int map_fd, err; + map_key_t key; + + /* Set up LRU map with 2 elements, values filled for all CPUs. + * With these 2 elements, the LRU map is full + */ + skel = setup(BPF_MAP_TYPE_LRU_PERCPU_HASH, 2, &map_fd, 2); + if (!ASSERT_OK_PTR(skel, "prog_setup")) + return; + + /* run bpf prog that inserts new key=3 element, re-using LRU slot */ + key = 3; + err = prog_run_insert_elem(skel, key, TEST_VALUE); + if (!ASSERT_OK(err, "prog_run_insert_elem")) + goto cleanup; + + /* check that key=3 replaced one of earlier elements */ + err = bpf_map_lookup_elem(map_fd, &key, value); + if (!ASSERT_OK(err, "bpf_map_lookup_elem")) + goto cleanup; + + /* and has expected values */ + check_values_one_cpu(value, TEST_VALUE); + +cleanup: + test_map_init__destroy(skel); +} + +void test_map_init(void) +{ + nr_cpus = bpf_num_possible_cpus(); + if (nr_cpus <= 1) { + printf("%s:SKIP: >1 cpu needed for this test\n", __func__); + test__skip(); + return; + } + + if (test__start_subtest("pcpu_map_init")) + test_pcpu_map_init(); + if (test__start_subtest("pcpu_lru_map_init")) + test_pcpu_lru_map_init(); +} diff --git a/tools/testing/selftests/bpf/progs/test_map_init.c b/tools/testing/selftests/bpf/progs/test_map_init.c new file mode 100644 index 0000000000000..c89d28ead6737 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_map_init.c @@ -0,0 +1,33 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020 Tessares SA http://www.tessares.net */ + +#include "vmlinux.h" +#include <bpf/bpf_helpers.h> + +__u64 inKey = 0; +__u64 inValue = 0; +__u32 inPid = 0; + +struct { + __uint(type, BPF_MAP_TYPE_PERCPU_HASH); + __uint(max_entries, 2); + __type(key, __u64); + __type(value, __u64); +} hashmap1 SEC(".maps"); + + +SEC("tp/syscalls/sys_enter_getpgid") +int sysenter_getpgid(const void *ctx) +{ + /* Just do it for once, when called from our own test prog. This + * ensures the map value is only updated for a single CPU. + */ + int cur_pid = bpf_get_current_pid_tgid() >> 32; + + if (cur_pid == inPid) + bpf_map_update_elem(&hashmap1, &inKey, &inValue, BPF_NOEXIST); + + return 0; +} + +char _license[] SEC("license") = "GPL";
From: Heiner Kallweit hkallweit1@gmail.com
[ Upstream commit cc6528bc9a0c901c83b8220a2e2617f3354d6dd9 ]
The caller of rtl8169_tso_csum_v2() frees the skb if false is returned. eth_skb_pad() internally frees the skb on error what would result in a double free. Therefore use __skb_put_padto() directly and instruct it to not free the skb on error.
Fixes: b423e9ae49d7 ("r8169: fix offloaded tx checksum for small packets.") Reported-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Heiner Kallweit hkallweit1@gmail.com Link: https://lore.kernel.org/r/f7e68191-acff-9ded-4263-c016428a8762@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/realtek/r8169_main.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c index c74d9c02a805f..ed918c12bc5e9 100644 --- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -4145,7 +4145,8 @@ static bool rtl8169_tso_csum_v2(struct rtl8169_private *tp, opts[1] |= transport_offset << TCPHO_SHIFT; } else { if (unlikely(skb->len < ETH_ZLEN && rtl_test_hw_pad_bug(tp))) - return !eth_skb_pad(skb); + /* eth_skb_pad would free the skb on error */ + return !__skb_put_padto(skb, ETH_ZLEN, false); }
return true;
From: Heiner Kallweit hkallweit1@gmail.com
[ Upstream commit 847f0a2bfd2fe16d6afa537816b313b71f32e139 ]
RTL8125B has same or similar short packet hw padding bug as RTL8168evl. The main workaround has been extended accordingly, however we have to disable also hw checksumming for short packets on affected new chip versions. Instead of checking for an affected chip version let's simply disable hw checksumming for short packets in general.
v2: - remove the version checks and disable short packet hw csum in general - reflect this in commit title and message
Fixes: 0439297be951 ("r8169: add support for RTL8125B") Signed-off-by: Heiner Kallweit hkallweit1@gmail.com Link: https://lore.kernel.org/r/7fbb35f0-e244-ef65-aa55-3872d7d38698@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/realtek/r8169_main.c | 15 +++------------ 1 file changed, 3 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c index ed918c12bc5e9..515d9116dfadf 100644 --- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -4325,18 +4325,9 @@ static netdev_features_t rtl8169_features_check(struct sk_buff *skb, rtl_chip_supports_csum_v2(tp)) features &= ~NETIF_F_ALL_TSO; } else if (skb->ip_summed == CHECKSUM_PARTIAL) { - if (skb->len < ETH_ZLEN) { - switch (tp->mac_version) { - case RTL_GIGA_MAC_VER_11: - case RTL_GIGA_MAC_VER_12: - case RTL_GIGA_MAC_VER_17: - case RTL_GIGA_MAC_VER_34: - features &= ~NETIF_F_CSUM_MASK; - break; - default: - break; - } - } + /* work around hw bug on some chip versions */ + if (skb->len < ETH_ZLEN) + features &= ~NETIF_F_CSUM_MASK;
if (transport_offset > TCPHO_MAX && rtl_chip_supports_csum_v2(tp))
From: Maulik Shah mkshah@codeaurora.org
[ Upstream commit 71266d9d39366c9b24b866d811b3facaf837f13f ]
When GPIOs that are routed to PDC are used as output they can still latch the IRQ pending at GIC. As a result the spurious IRQ was handled when the client driver change the direction to input to starts using it as IRQ.
Currently such erroneous latched IRQ are cleared with .irq_enable callback however if the driver continue to use GPIO as interrupt and invokes disable_irq() followed by enable_irq() then everytime during enable_irq() previously latched interrupt gets cleared.
This can make edge IRQs not seen after enable_irq() if they had arrived after the driver has invoked disable_irq() and were pending at GIC.
Move clearing erroneous IRQ to .irq_request_resources callback as this is the place where GPIO direction is changed as input and its locked as IRQ.
While at this add a missing check to invoke msm_gpio_irq_clear_unmask() from .irq_enable callback only when GPIO is not routed to PDC.
Fixes: e35a6ae0eb3a ("pinctrl/msm: Setup GPIO chip in hierarchy") Signed-off-by: Maulik Shah mkshah@codeaurora.org Link: https://lore.kernel.org/r/1604561884-10166-1-git-send-email-mkshah@codeauror... Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pinctrl/qcom/pinctrl-msm.c | 32 ++++++++++++++++++------------ 1 file changed, 19 insertions(+), 13 deletions(-)
diff --git a/drivers/pinctrl/qcom/pinctrl-msm.c b/drivers/pinctrl/qcom/pinctrl-msm.c index 1df232266f63a..1554f0275067e 100644 --- a/drivers/pinctrl/qcom/pinctrl-msm.c +++ b/drivers/pinctrl/qcom/pinctrl-msm.c @@ -815,21 +815,14 @@ static void msm_gpio_irq_clear_unmask(struct irq_data *d, bool status_clear)
static void msm_gpio_irq_enable(struct irq_data *d) { - /* - * Clear the interrupt that may be pending before we enable - * the line. - * This is especially a problem with the GPIOs routed to the - * PDC. These GPIOs are direct-connect interrupts to the GIC. - * Disabling the interrupt line at the PDC does not prevent - * the interrupt from being latched at the GIC. The state at - * GIC needs to be cleared before enabling. - */ - if (d->parent_data) { - irq_chip_set_parent_state(d, IRQCHIP_STATE_PENDING, 0); + struct gpio_chip *gc = irq_data_get_irq_chip_data(d); + struct msm_pinctrl *pctrl = gpiochip_get_data(gc); + + if (d->parent_data) irq_chip_enable_parent(d); - }
- msm_gpio_irq_clear_unmask(d, true); + if (!test_bit(d->hwirq, pctrl->skip_wake_irqs)) + msm_gpio_irq_clear_unmask(d, true); }
static void msm_gpio_irq_disable(struct irq_data *d) @@ -1104,6 +1097,19 @@ static int msm_gpio_irq_reqres(struct irq_data *d) ret = -EINVAL; goto out; } + + /* + * Clear the interrupt that may be pending before we enable + * the line. + * This is especially a problem with the GPIOs routed to the + * PDC. These GPIOs are direct-connect interrupts to the GIC. + * Disabling the interrupt line at the PDC does not prevent + * the interrupt from being latched at the GIC. The state at + * GIC needs to be cleared before enabling. + */ + if (d->parent_data && test_bit(d->hwirq, pctrl->skip_wake_irqs)) + irq_chip_set_parent_state(d, IRQCHIP_STATE_PENDING, 0); + return 0; out: module_put(gc->owner);
From: Bjorn Andersson bjorn.andersson@linaro.org
[ Upstream commit b41efeed507addecb92e83dd444d86c1fbe38ae0 ]
Specify the PDC mapping for SM8250, so that gpio interrupts are propertly mapped to the wakeup IRQs of the PDC.
Fixes: 4e3ec9e407ad ("pinctrl: qcom: Add sm8250 pinctrl driver.") Signed-off-by: Bjorn Andersson bjorn.andersson@linaro.org Link: https://lore.kernel.org/r/20201028043642.1141723-1-bjorn.andersson@linaro.or... Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pinctrl/qcom/pinctrl-sm8250.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+)
diff --git a/drivers/pinctrl/qcom/pinctrl-sm8250.c b/drivers/pinctrl/qcom/pinctrl-sm8250.c index 826df0d637eaa..af144e724bd9c 100644 --- a/drivers/pinctrl/qcom/pinctrl-sm8250.c +++ b/drivers/pinctrl/qcom/pinctrl-sm8250.c @@ -1313,6 +1313,22 @@ static const struct msm_pingroup sm8250_groups[] = { [183] = SDC_PINGROUP(sdc2_data, 0xb7000, 9, 0), };
+static const struct msm_gpio_wakeirq_map sm8250_pdc_map[] = { + { 0, 79 }, { 1, 84 }, { 2, 80 }, { 3, 82 }, { 4, 107 }, { 7, 43 }, + { 11, 42 }, { 14, 44 }, { 15, 52 }, { 19, 67 }, { 23, 68 }, { 24, 105 }, + { 27, 92 }, { 28, 106 }, { 31, 69 }, { 35, 70 }, { 39, 37 }, + { 40, 108 }, { 43, 71 }, { 45, 72 }, { 47, 83 }, { 51, 74 }, { 55, 77 }, + { 59, 78 }, { 63, 75 }, { 64, 81 }, { 65, 87 }, { 66, 88 }, { 67, 89 }, + { 68, 54 }, { 70, 85 }, { 77, 46 }, { 80, 90 }, { 81, 91 }, { 83, 97 }, + { 84, 98 }, { 86, 99 }, { 87, 100 }, { 88, 101 }, { 89, 102 }, + { 92, 103 }, { 93, 104 }, { 100, 53 }, { 103, 47 }, { 104, 48 }, + { 108, 49 }, { 109, 94 }, { 110, 95 }, { 111, 96 }, { 112, 55 }, + { 113, 56 }, { 118, 50 }, { 121, 51 }, { 122, 57 }, { 123, 58 }, + { 124, 45 }, { 126, 59 }, { 128, 76 }, { 129, 86 }, { 132, 93 }, + { 133, 65 }, { 134, 66 }, { 136, 62 }, { 137, 63 }, { 138, 64 }, + { 142, 60 }, { 143, 61 } +}; + static const struct msm_pinctrl_soc_data sm8250_pinctrl = { .pins = sm8250_pins, .npins = ARRAY_SIZE(sm8250_pins), @@ -1323,6 +1339,8 @@ static const struct msm_pinctrl_soc_data sm8250_pinctrl = { .ngpios = 181, .tiles = sm8250_tiles, .ntiles = ARRAY_SIZE(sm8250_tiles), + .wakeirq_map = sm8250_pdc_map, + .nwakeirq_map = ARRAY_SIZE(sm8250_pdc_map), };
static int sm8250_pinctrl_probe(struct platform_device *pdev)
From: Christoph Hellwig hch@lst.de
[ Upstream commit 2bd645b2d3f0bacadaa6037f067538e1cd4e42ef ]
bdget_disk needs to be paired with bdput to not leak a reference on the block device inode.
Fixes: 08ba91ee6e2c ("nbd: Add the nbd NBD_DISCONNECT_ON_CLOSE config flag.") Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: Josef Bacik josef@toxicpanda.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/nbd.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index d76fca629c143..36c46fe078556 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -1517,6 +1517,7 @@ static void nbd_release(struct gendisk *disk, fmode_t mode) if (test_bit(NBD_RT_DISCONNECT_ON_CLOSE, &nbd->config->runtime_flags) && bdev->bd_openers == 0) nbd_disconnect_and_put(nbd); + bdput(bdev);
nbd_config_put(nbd); nbd_put(nbd);
From: Vlad Buslov vlad@buslov.dev
[ Upstream commit 97adb13dc9ba08ecd4758bc59efc0205f5cbf377 ]
Iproute2 tc classifier terse dump has been accepted with modified syntax. Update the tests accordingly.
Signed-off-by: Vlad Buslov vlad@buslov.dev Fixes: e7534fd42a99 ("selftests: implement flower classifier terse dump tests") Link: https://lore.kernel.org/r/20201107111928.453534-1-vlad@buslov.dev Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../testing/selftests/tc-testing/tc-tests/filters/tests.json | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json b/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json index bb543bf69d694..361235ad574be 100644 --- a/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json +++ b/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json @@ -100,7 +100,7 @@ ], "cmdUnderTest": "$TC filter add dev $DEV2 protocol ip pref 1 ingress flower dst_mac e4:11:22:11:4a:51 action drop", "expExitCode": "0", - "verifyCmd": "$TC filter show terse dev $DEV2 ingress", + "verifyCmd": "$TC -br filter show dev $DEV2 ingress", "matchPattern": "filter protocol ip pref 1 flower.*handle", "matchCount": "1", "teardown": [ @@ -119,7 +119,7 @@ ], "cmdUnderTest": "$TC filter add dev $DEV2 protocol ip pref 1 ingress flower dst_mac e4:11:22:11:4a:51 action drop", "expExitCode": "0", - "verifyCmd": "$TC filter show terse dev $DEV2 ingress", + "verifyCmd": "$TC -br filter show dev $DEV2 ingress", "matchPattern": " dst_mac e4:11:22:11:4a:51", "matchCount": "0", "teardown": [
From: Slawomir Laba slawomirx.laba@intel.com
[ Upstream commit 3a7001788fed0311d6fb77ed0dabe7bed3567bc0 ]
Fix MAC setting flow for the PF driver.
Update the unicast VF's MAC address in VF structure if it is a new setting in i40e_vc_add_mac_addr_msg.
When unicast MAC address gets deleted, record that and set the new unicast MAC address that is already waiting in the filter list. This logic is based on the order of messages arriving to the PF driver.
Without this change the MAC address setting was interpreted incorrectly in the following use cases: 1) Print incorrect VF MAC or zero MAC ip link show dev $pf 2) Don't preserve MAC between driver reload rmmod iavf; modprobe iavf 3) Update VF MAC when macvlan was set ip link add link $vf address $mac $vf.1 type macvlan 4) Failed to update mac address when VF was trusted ip link set dev $vf address $mac
This includes all other configurations including above commands.
Fixes: f657a6e1313b ("i40e: Fix VF driver MAC address configuration") Signed-off-by: Slawomir Laba slawomirx.laba@intel.com Tested-by: Konrad Jankowski konrad0.jankowski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/intel/i40e/i40e_virtchnl_pf.c | 26 +++++++++++++++++-- 1 file changed, 24 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c index 47bfb2e95e2db..343177d71f70a 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c +++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c @@ -2712,6 +2712,10 @@ static int i40e_vc_add_mac_addr_msg(struct i40e_vf *vf, u8 *msg) spin_unlock_bh(&vsi->mac_filter_hash_lock); goto error_param; } + if (is_valid_ether_addr(al->list[i].addr) && + is_zero_ether_addr(vf->default_lan_addr.addr)) + ether_addr_copy(vf->default_lan_addr.addr, + al->list[i].addr); } } spin_unlock_bh(&vsi->mac_filter_hash_lock); @@ -2739,6 +2743,7 @@ static int i40e_vc_del_mac_addr_msg(struct i40e_vf *vf, u8 *msg) { struct virtchnl_ether_addr_list *al = (struct virtchnl_ether_addr_list *)msg; + bool was_unimac_deleted = false; struct i40e_pf *pf = vf->pf; struct i40e_vsi *vsi = NULL; i40e_status ret = 0; @@ -2758,6 +2763,8 @@ static int i40e_vc_del_mac_addr_msg(struct i40e_vf *vf, u8 *msg) ret = I40E_ERR_INVALID_MAC_ADDR; goto error_param; } + if (ether_addr_equal(al->list[i].addr, vf->default_lan_addr.addr)) + was_unimac_deleted = true; } vsi = pf->vsi[vf->lan_vsi_idx];
@@ -2778,10 +2785,25 @@ static int i40e_vc_del_mac_addr_msg(struct i40e_vf *vf, u8 *msg) dev_err(&pf->pdev->dev, "Unable to program VF %d MAC filters, error %d\n", vf->vf_id, ret);
+ if (vf->trusted && was_unimac_deleted) { + struct i40e_mac_filter *f; + struct hlist_node *h; + u8 *macaddr = NULL; + int bkt; + + /* set last unicast mac address as default */ + spin_lock_bh(&vsi->mac_filter_hash_lock); + hash_for_each_safe(vsi->mac_filter_hash, bkt, h, f, hlist) { + if (is_valid_ether_addr(f->macaddr)) + macaddr = f->macaddr; + } + if (macaddr) + ether_addr_copy(vf->default_lan_addr.addr, macaddr); + spin_unlock_bh(&vsi->mac_filter_hash_lock); + } error_param: /* send the response to the VF */ - return i40e_vc_send_resp_to_vf(vf, VIRTCHNL_OP_DEL_ETH_ADDR, - ret); + return i40e_vc_send_resp_to_vf(vf, VIRTCHNL_OP_DEL_ETH_ADDR, ret); }
/**
From: Vinicius Costa Gomes vinicius.gomes@intel.com
[ Upstream commit 6b7ed22ae4c96a415001f0c3116ebee15bb8491a ]
'igc_update_stats()' was not updating 'netdev->stats', so the returned statistics, for example, requested by:
$ ip -s link show dev enp3s0
were not being updated and were always zero.
Fix by returning a set of statistics that are actually being updated (adapter->stats64).
Fixes: c9a11c23ceb6 ("igc: Add netdev") Signed-off-by: Vinicius Costa Gomes vinicius.gomes@intel.com Tested-by: Aaron Brown aaron.f.brown@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igc/igc_main.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index 9593aa4eea369..1358a39c34ad3 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -3890,21 +3890,23 @@ static int igc_change_mtu(struct net_device *netdev, int new_mtu) }
/** - * igc_get_stats - Get System Network Statistics + * igc_get_stats64 - Get System Network Statistics * @netdev: network interface device structure + * @stats: rtnl_link_stats64 pointer * * Returns the address of the device statistics structure. * The statistics are updated here and also from the timer callback. */ -static struct net_device_stats *igc_get_stats(struct net_device *netdev) +static void igc_get_stats64(struct net_device *netdev, + struct rtnl_link_stats64 *stats) { struct igc_adapter *adapter = netdev_priv(netdev);
+ spin_lock(&adapter->stats64_lock); if (!test_bit(__IGC_RESETTING, &adapter->state)) igc_update_stats(adapter); - - /* only return the current stats */ - return &netdev->stats; + memcpy(stats, &adapter->stats64, sizeof(*stats)); + spin_unlock(&adapter->stats64_lock); }
static netdev_features_t igc_fix_features(struct net_device *netdev, @@ -4833,7 +4835,7 @@ static const struct net_device_ops igc_netdev_ops = { .ndo_set_rx_mode = igc_set_rx_mode, .ndo_set_mac_address = igc_set_mac, .ndo_change_mtu = igc_change_mtu, - .ndo_get_stats = igc_get_stats, + .ndo_get_stats64 = igc_get_stats64, .ndo_fix_features = igc_fix_features, .ndo_set_features = igc_set_features, .ndo_features_check = igc_features_check,
From: Sven Van Asbroeck thesven73@gmail.com
[ Upstream commit 902a66e08ceaadb9a7a1ab3a4f3af611cd1d8cba ]
Commit 6f197fb63850 ("lan743x: Added fixed link and RGMII support") assumes that chips with an internal PHY will never have a devicetree entry. This is incorrect: even for these chips, a devicetree entry can be useful e.g. to pass the mac address from bootloader to chip:
&pcie { status = "okay";
host@0 { reg = <0 0 0 0 0>;
#address-cells = <3>; #size-cells = <2>;
lan7430: ethernet@0 { /* LAN7430 with internal PHY */ compatible = "microchip,lan743x"; status = "okay"; reg = <0 0 0 0 0>; /* filled in by bootloader */ local-mac-address = [00 00 00 00 00 00]; }; }; };
If a devicetree entry is present, the driver will not attach the chip to its internal phy, and the chip will be non-operational.
Fix by tweaking the phy connection algorithm: - first try to connect to a phy specified in the devicetree (could be 'real' phy, or just a 'fixed-link') - if that doesn't succeed, try to connect to an internal phy, even if the chip has a devnode
Tested on a LAN7430 with internal PHY. I cannot test a device using fixed-link, as I do not have access to one.
Fixes: 6f197fb63850 ("lan743x: Added fixed link and RGMII support") Tested-by: Sven Van Asbroeck thesven73@gmail.com # lan7430 Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: Sven Van Asbroeck thesven73@gmail.com Link: https://lore.kernel.org/r/20201108171224.23829-1-TheSven73@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/microchip/lan743x_main.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/microchip/lan743x_main.c b/drivers/net/ethernet/microchip/lan743x_main.c index de93cc6ebc1ac..be58a941965b1 100644 --- a/drivers/net/ethernet/microchip/lan743x_main.c +++ b/drivers/net/ethernet/microchip/lan743x_main.c @@ -1027,9 +1027,9 @@ static int lan743x_phy_open(struct lan743x_adapter *adapter)
netdev = adapter->netdev; phynode = of_node_get(adapter->pdev->dev.of_node); - adapter->phy_mode = PHY_INTERFACE_MODE_GMII;
if (phynode) { + /* try devicetree phy, or fixed link */ of_get_phy_mode(phynode, &adapter->phy_mode);
if (of_phy_is_fixed_link(phynode)) { @@ -1045,13 +1045,15 @@ static int lan743x_phy_open(struct lan743x_adapter *adapter) lan743x_phy_link_status_change, 0, adapter->phy_mode); of_node_put(phynode); - if (!phydev) - goto return_error; - } else { + } + + if (!phydev) { + /* try internal phy */ phydev = phy_find_first(adapter->mdiobus); if (!phydev) goto return_error;
+ adapter->phy_mode = PHY_INTERFACE_MODE_GMII; ret = phy_connect_direct(netdev, phydev, lan743x_phy_link_status_change, adapter->phy_mode);
From: Heiner Kallweit hkallweit1@gmail.com
[ Upstream commit f3037c5a31b58a73b32a36e938ad0560085acadd ]
The RTL8401-internal PHY identifies as RTL8201CP, and the init sequence in r8169, copied from vendor driver r8168, uses paged operations. Therefore set the same paged operation callbacks as for the other Realtek PHY's.
Fixes: cdafdc29ef75 ("r8169: sync support for RTL8401 with vendor driver") Signed-off-by: Heiner Kallweit hkallweit1@gmail.com Link: https://lore.kernel.org/r/69882f7a-ca2f-e0c7-ae83-c9b6937282cd@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/realtek.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/phy/realtek.c b/drivers/net/phy/realtek.c index 0f09609718007..81a614f903c4a 100644 --- a/drivers/net/phy/realtek.c +++ b/drivers/net/phy/realtek.c @@ -542,6 +542,8 @@ static struct phy_driver realtek_drvs[] = { { PHY_ID_MATCH_EXACT(0x00008201), .name = "RTL8201CP Ethernet", + .read_page = rtl821x_read_page, + .write_page = rtl821x_write_page, }, { PHY_ID_MATCH_EXACT(0x001cc816), .name = "RTL8201F Fast Ethernet",
From: Darrick J. Wong darrick.wong@oracle.com
[ Upstream commit ea8439899c0b15a176664df62aff928010fad276 ]
Pass the same oldext argument (which contains the existing rmapping's unwritten state) to xfs_rmap_lookup_le_range at the start of xfs_rmap_convert_shared. At this point in the code, flags is zero, which means that we perform lookups using the wrong key.
Fixes: 3f165b334e51 ("xfs: convert unwritten status of reverse mappings for shared files") Signed-off-by: Darrick J. Wong darrick.wong@oracle.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/libxfs/xfs_rmap.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c index 27c39268c31f7..82117b1ee34cb 100644 --- a/fs/xfs/libxfs/xfs_rmap.c +++ b/fs/xfs/libxfs/xfs_rmap.c @@ -1514,7 +1514,7 @@ xfs_rmap_convert_shared( * record for our insertion point. This will also give us the record for * start block contiguity tests. */ - error = xfs_rmap_lookup_le_range(cur, bno, owner, offset, flags, + error = xfs_rmap_lookup_le_range(cur, bno, owner, offset, oldext, &PREV, &i); if (error) goto done;
From: Darrick J. Wong darrick.wong@oracle.com
[ Upstream commit 5dda3897fd90783358c4c6115ef86047d8c8f503 ]
When the bmbt scrubber is looking up rmap extents, we need to set the extent flags from the bmbt record fully. This will matter once we fix the rmap btree comparison functions to check those flags correctly.
Fixes: d852657ccfc0 ("xfs: cross-reference reverse-mapping btree") Signed-off-by: Darrick J. Wong darrick.wong@oracle.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/scrub/bmap.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c index 955302e7cdde9..412e2ec55e388 100644 --- a/fs/xfs/scrub/bmap.c +++ b/fs/xfs/scrub/bmap.c @@ -113,6 +113,8 @@ xchk_bmap_get_rmap(
if (info->whichfork == XFS_ATTR_FORK) rflags |= XFS_RMAP_ATTR_FORK; + if (irec->br_state == XFS_EXT_UNWRITTEN) + rflags |= XFS_RMAP_UNWRITTEN;
/* * CoW staging extents are owned (on disk) by the refcountbt, so
From: Darrick J. Wong darrick.wong@oracle.com
[ Upstream commit 6ff646b2ceb0eec916101877f38da0b73e3a5b7f ]
Keys for extent interval records in the reverse mapping btree are supposed to be computed as follows:
(physical block, owner, fork, is_btree, is_unwritten, offset)
This provides users the ability to look up a reverse mapping from a bmbt record -- start with the physical block; then if there are multiple records for the same block, move on to the owner; then the inode fork type; and so on to the file offset.
However, the key comparison functions incorrectly remove the fork/btree/unwritten information that's encoded in the on-disk offset. This means that lookup comparisons are only done with:
(physical block, owner, offset)
This means that queries can return incorrect results. On consistent filesystems this hasn't been an issue because blocks are never shared between forks or with bmbt blocks; and are never unwritten. However, this bug means that online repair cannot always detect corruption in the key information in internal rmapbt nodes.
Found by fuzzing keys[1].attrfork = ones on xfs/371.
Fixes: 4b8ed67794fe ("xfs: add rmap btree operations") Signed-off-by: Darrick J. Wong darrick.wong@oracle.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/libxfs/xfs_rmap_btree.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_rmap_btree.c b/fs/xfs/libxfs/xfs_rmap_btree.c index beb81c84a9375..577a66381327c 100644 --- a/fs/xfs/libxfs/xfs_rmap_btree.c +++ b/fs/xfs/libxfs/xfs_rmap_btree.c @@ -243,8 +243,8 @@ xfs_rmapbt_key_diff( else if (y > x) return -1;
- x = XFS_RMAP_OFF(be64_to_cpu(kp->rm_offset)); - y = rec->rm_offset; + x = be64_to_cpu(kp->rm_offset); + y = xfs_rmap_irec_offset_pack(rec); if (x > y) return 1; else if (y > x) @@ -275,8 +275,8 @@ xfs_rmapbt_diff_two_keys( else if (y > x) return -1;
- x = XFS_RMAP_OFF(be64_to_cpu(kp1->rm_offset)); - y = XFS_RMAP_OFF(be64_to_cpu(kp2->rm_offset)); + x = be64_to_cpu(kp1->rm_offset); + y = be64_to_cpu(kp2->rm_offset); if (x > y) return 1; else if (y > x) @@ -390,8 +390,8 @@ xfs_rmapbt_keys_inorder( return 1; else if (a > b) return 0; - a = XFS_RMAP_OFF(be64_to_cpu(k1->rmap.rm_offset)); - b = XFS_RMAP_OFF(be64_to_cpu(k2->rmap.rm_offset)); + a = be64_to_cpu(k1->rmap.rm_offset); + b = be64_to_cpu(k2->rmap.rm_offset); if (a <= b) return 1; return 0; @@ -420,8 +420,8 @@ xfs_rmapbt_recs_inorder( return 1; else if (a > b) return 0; - a = XFS_RMAP_OFF(be64_to_cpu(r1->rmap.rm_offset)); - b = XFS_RMAP_OFF(be64_to_cpu(r2->rmap.rm_offset)); + a = be64_to_cpu(r1->rmap.rm_offset); + b = be64_to_cpu(r2->rmap.rm_offset); if (a <= b) return 1; return 0;
From: Darrick J. Wong darrick.wong@oracle.com
[ Upstream commit 54e9b09e153842ab5adb8a460b891e11b39e9c3d ]
Fix some serious WTF in the reference count scrubber's rmap fragment processing. The code comment says that this loop is supposed to move all fragment records starting at or before bno onto the worklist, but there's no obvious reason why nr (the number of items added) should increment starting from 1, and breaking the loop when we've added the target number seems dubious since we could have more rmap fragments that should have been added to the worklist.
This seems to manifest in xfs/411 when adding one to the refcount field.
Fixes: dbde19da9637 ("xfs: cross-reference the rmapbt data with the refcountbt") Signed-off-by: Darrick J. Wong darrick.wong@oracle.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/scrub/refcount.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/fs/xfs/scrub/refcount.c b/fs/xfs/scrub/refcount.c index beaeb6fa31197..dd672e6bbc75c 100644 --- a/fs/xfs/scrub/refcount.c +++ b/fs/xfs/scrub/refcount.c @@ -170,7 +170,6 @@ xchk_refcountbt_process_rmap_fragments( */ INIT_LIST_HEAD(&worklist); rbno = NULLAGBLOCK; - nr = 1;
/* Make sure the fragments actually /are/ in agbno order. */ bno = 0; @@ -184,15 +183,14 @@ xchk_refcountbt_process_rmap_fragments( * Find all the rmaps that start at or before the refc extent, * and put them on the worklist. */ + nr = 0; list_for_each_entry_safe(frag, n, &refchk->fragments, list) { - if (frag->rm.rm_startblock > refchk->bno) - goto done; + if (frag->rm.rm_startblock > refchk->bno || nr > target_nr) + break; bno = frag->rm.rm_startblock + frag->rm.rm_blockcount; if (bno < rbno) rbno = bno; list_move_tail(&frag->list, &worklist); - if (nr == target_nr) - break; nr++; }
From: Sven Van Asbroeck thesven73@gmail.com
[ Upstream commit 2b52a4b65bc8f14520fe6e996ea7fb3f7e400761 ]
In the net core, the struct net_device_ops -> ndo_set_rx_mode() callback is called with the dev->addr_list_lock spinlock held.
However, this driver's ndo_set_rx_mode callback eventually calls lan743x_dp_write(), which acquires a mutex. Mutex acquisition may sleep, and this is not allowed when holding a spinlock.
Fix by removing the dp_lock mutex entirely. Its purpose is to prevent concurrent accesses to the data port. No concurrent accesses are possible, because the dev->addr_list_lock spinlock in the core only lets through one thread at a time.
Fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver") Signed-off-by: Sven Van Asbroeck thesven73@gmail.com Link: https://lore.kernel.org/r/20201109203828.5115-1-TheSven73@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/microchip/lan743x_main.c | 12 +++--------- drivers/net/ethernet/microchip/lan743x_main.h | 3 --- 2 files changed, 3 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/microchip/lan743x_main.c b/drivers/net/ethernet/microchip/lan743x_main.c index be58a941965b1..6c25c7c8b7cf8 100644 --- a/drivers/net/ethernet/microchip/lan743x_main.c +++ b/drivers/net/ethernet/microchip/lan743x_main.c @@ -675,14 +675,12 @@ clean_up: static int lan743x_dp_write(struct lan743x_adapter *adapter, u32 select, u32 addr, u32 length, u32 *buf) { - int ret = -EIO; u32 dp_sel; int i;
- mutex_lock(&adapter->dp_lock); if (lan743x_csr_wait_for_bit(adapter, DP_SEL, DP_SEL_DPRDY_, 1, 40, 100, 100)) - goto unlock; + return -EIO; dp_sel = lan743x_csr_read(adapter, DP_SEL); dp_sel &= ~DP_SEL_MASK_; dp_sel |= select; @@ -694,13 +692,10 @@ static int lan743x_dp_write(struct lan743x_adapter *adapter, lan743x_csr_write(adapter, DP_CMD, DP_CMD_WRITE_); if (lan743x_csr_wait_for_bit(adapter, DP_SEL, DP_SEL_DPRDY_, 1, 40, 100, 100)) - goto unlock; + return -EIO; } - ret = 0;
-unlock: - mutex_unlock(&adapter->dp_lock); - return ret; + return 0; }
static u32 lan743x_mac_mii_access(u16 id, u16 index, int read) @@ -2737,7 +2732,6 @@ static int lan743x_hardware_init(struct lan743x_adapter *adapter,
adapter->intr.irq = adapter->pdev->irq; lan743x_csr_write(adapter, INT_EN_CLR, 0xFFFFFFFF); - mutex_init(&adapter->dp_lock);
ret = lan743x_gpio_init(adapter); if (ret) diff --git a/drivers/net/ethernet/microchip/lan743x_main.h b/drivers/net/ethernet/microchip/lan743x_main.h index c61a404113179..a536f4a4994df 100644 --- a/drivers/net/ethernet/microchip/lan743x_main.h +++ b/drivers/net/ethernet/microchip/lan743x_main.h @@ -712,9 +712,6 @@ struct lan743x_adapter { struct lan743x_csr csr; struct lan743x_intr intr;
- /* lock, used to prevent concurrent access to data port */ - struct mutex dp_lock; - struct lan743x_gpio gpio; struct lan743x_ptp ptp;
From: Christoph Hellwig hch@lst.de
[ Upstream commit 2bd3fa793aaa7e98b74e3653fdcc72fa753913b5 ]
We also need to drop the iolock when invalidate_inode_pages2 fails, not only on all other error or successful cases.
Fixes: 527851124d10 ("xfs: implement pNFS export operations") Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: Darrick J. Wong darrick.wong@oracle.com Signed-off-by: Darrick J. Wong darrick.wong@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_pnfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_pnfs.c b/fs/xfs/xfs_pnfs.c index b101feb2aab45..f3082a957d5e1 100644 --- a/fs/xfs/xfs_pnfs.c +++ b/fs/xfs/xfs_pnfs.c @@ -134,7 +134,7 @@ xfs_fs_map_blocks( goto out_unlock; error = invalidate_inode_pages2(inode->i_mapping); if (WARN_ON_ONCE(error)) - return error; + goto out_unlock;
end_fsb = XFS_B_TO_FSB(mp, (xfs_ufsize_t)offset + length); offset_fsb = XFS_B_TO_FSBT(mp, offset);
From: Evan Nimmo evan.nimmo@alliedtelesis.co.nz
[ Upstream commit a5bea04fcc0b3c0aec71ee1fd58fd4ff7ee36177 ]
Commit dabf6b36b83a ("of: Add OF_DMA_DEFAULT_COHERENT & select it on powerpc") added a check to of_dma_is_coherent which returns early if OF_DMA_DEFAULT_COHERENT is enabled. This results in the of_node_put() being skipped causing a memory leak. Moved the of_node_get() below this check so we now we only get the node if OF_DMA_DEFAULT_COHERENT is not enabled.
Fixes: dabf6b36b83a ("of: Add OF_DMA_DEFAULT_COHERENT & select it on powerpc") Signed-off-by: Evan Nimmo evan.nimmo@alliedtelesis.co.nz Link: https://lore.kernel.org/r/20201110022825.30895-1-evan.nimmo@alliedtelesis.co... Signed-off-by: Rob Herring robh@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/of/address.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/of/address.c b/drivers/of/address.c index da4f7341323f2..37ac311843090 100644 --- a/drivers/of/address.c +++ b/drivers/of/address.c @@ -1043,11 +1043,13 @@ out: */ bool of_dma_is_coherent(struct device_node *np) { - struct device_node *node = of_node_get(np); + struct device_node *node;
if (IS_ENABLED(CONFIG_OF_DMA_DEFAULT_COHERENT)) return true;
+ node = of_node_get(np); + while (node) { if (of_property_read_bool(node, "dma-coherent")) { of_node_put(node);
From: Rohit Maheshwari rohitm@chelsio.com
[ Upstream commit 86716b51d14fc2201938939b323ba3ad99186910 ]
Checksum update was missing in the WR.
Fixes: 429765a149f1 ("chcr: handle partial end part of a record") Signed-off-by: Rohit Maheshwari rohitm@chelsio.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/crypto/chelsio/chcr_ktls.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/drivers/crypto/chelsio/chcr_ktls.c b/drivers/crypto/chelsio/chcr_ktls.c index c5cce024886ac..026689d091102 100644 --- a/drivers/crypto/chelsio/chcr_ktls.c +++ b/drivers/crypto/chelsio/chcr_ktls.c @@ -926,6 +926,7 @@ chcr_ktls_write_tcp_options(struct chcr_ktls_info *tx_info, struct sk_buff *skb, struct iphdr *ip; int credits; u8 buf[150]; + u64 cntrl1; void *pos;
iplen = skb_network_header_len(skb); @@ -964,22 +965,28 @@ chcr_ktls_write_tcp_options(struct chcr_ktls_info *tx_info, struct sk_buff *skb, TXPKT_PF_V(tx_info->adap->pf)); cpl->pack = 0; cpl->len = htons(pktlen); - /* checksum offload */ - cpl->ctrl1 = 0; - - pos = cpl + 1;
memcpy(buf, skb->data, pktlen); if (tx_info->ip_family == AF_INET) { /* we need to correct ip header len */ ip = (struct iphdr *)(buf + maclen); ip->tot_len = htons(pktlen - maclen); + cntrl1 = TXPKT_CSUM_TYPE_V(TX_CSUM_TCPIP); #if IS_ENABLED(CONFIG_IPV6) } else { ip6 = (struct ipv6hdr *)(buf + maclen); ip6->payload_len = htons(pktlen - maclen - iplen); + cntrl1 = TXPKT_CSUM_TYPE_V(TX_CSUM_TCPIP6); #endif } + + cntrl1 |= T6_TXPKT_ETHHDR_LEN_V(maclen - ETH_HLEN) | + TXPKT_IPHDR_LEN_V(iplen); + /* checksum offload */ + cpl->ctrl1 = cpu_to_be64(cntrl1); + + pos = cpl + 1; + /* now take care of the tcp header, if fin is not set then clear push * bit as well, and if fin is set, it will be sent at the last so we * need to update the tcp sequence number as per the last packet.
From: Rohit Maheshwari rohitm@chelsio.com
[ Upstream commit 7d01c428c86b525dc780226924d74df2048cf411 ]
context id and port id should be filled while sending tcb update.
Fixes: 5a4b9fe7fece ("cxgb4/chcr: complete record tx handling") Signed-off-by: Rohit Maheshwari rohitm@chelsio.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/crypto/chelsio/chcr_ktls.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/crypto/chelsio/chcr_ktls.c b/drivers/crypto/chelsio/chcr_ktls.c index 026689d091102..dc5e22bc64b39 100644 --- a/drivers/crypto/chelsio/chcr_ktls.c +++ b/drivers/crypto/chelsio/chcr_ktls.c @@ -659,7 +659,8 @@ int chcr_ktls_cpl_set_tcb_rpl(struct adapter *adap, unsigned char *input) }
static void *__chcr_write_cpl_set_tcb_ulp(struct chcr_ktls_info *tx_info, - u32 tid, void *pos, u16 word, u64 mask, + u32 tid, void *pos, u16 word, + struct sge_eth_txq *q, u64 mask, u64 val, u32 reply) { struct cpl_set_tcb_field_core *cpl; @@ -668,7 +669,10 @@ static void *__chcr_write_cpl_set_tcb_ulp(struct chcr_ktls_info *tx_info,
/* ULP_TXPKT */ txpkt = pos; - txpkt->cmd_dest = htonl(ULPTX_CMD_V(ULP_TX_PKT) | ULP_TXPKT_DEST_V(0)); + txpkt->cmd_dest = htonl(ULPTX_CMD_V(ULP_TX_PKT) | + ULP_TXPKT_CHANNELID_V(tx_info->port_id) | + ULP_TXPKT_FID_V(q->q.cntxt_id) | + ULP_TXPKT_RO_F); txpkt->len = htonl(DIV_ROUND_UP(CHCR_SET_TCB_FIELD_LEN, 16));
/* ULPTX_IDATA sub-command */ @@ -723,7 +727,7 @@ static void *chcr_write_cpl_set_tcb_ulp(struct chcr_ktls_info *tx_info, } else { u8 buf[48] = {0};
- __chcr_write_cpl_set_tcb_ulp(tx_info, tid, buf, word, + __chcr_write_cpl_set_tcb_ulp(tx_info, tid, buf, word, q, mask, val, reply);
return chcr_copy_to_txd(buf, &q->q, pos, @@ -731,7 +735,7 @@ static void *chcr_write_cpl_set_tcb_ulp(struct chcr_ktls_info *tx_info, } }
- pos = __chcr_write_cpl_set_tcb_ulp(tx_info, tid, pos, word, + pos = __chcr_write_cpl_set_tcb_ulp(tx_info, tid, pos, word, q, mask, val, reply);
/* check again if we are at the end of the queue */
From: Wang Hai wanghai38@huawei.com
[ Upstream commit 52755b66ddcef2e897778fac5656df18817b59ab ]
If memory allocation for 'kbuf' succeed, cosa_write() doesn't have a corresponding kfree() in exception handling. Thus add kfree() for this function implementation.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Wang Hai wanghai38@huawei.com Acked-by: Jan "Yenya" Kasprzak kas@fi.muni.cz Link: https://lore.kernel.org/r/20201110144614.43194-1-wanghai38@huawei.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wan/cosa.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/wan/cosa.c b/drivers/net/wan/cosa.c index f8aed0696d775..2369ca250cd65 100644 --- a/drivers/net/wan/cosa.c +++ b/drivers/net/wan/cosa.c @@ -889,6 +889,7 @@ static ssize_t cosa_write(struct file *file, chan->tx_status = 1; spin_unlock_irqrestore(&cosa->lock, flags); up(&chan->wsem); + kfree(kbuf); return -ERESTARTSYS; } }
From: Brad Campbell brad@fnarfbargle.com
[ Upstream commit 4d64bb4ba5ecf4831448cdb2fe16d0ae91b2b40b ]
Commit fff2d0f701e6 ("hwmon: (applesmc) avoid overlong udelay()") introduced an issue whereby communication with the SMC became unreliable with write errors like :
[ 120.378614] applesmc: send_byte(0x00, 0x0300) fail: 0x40 [ 120.378621] applesmc: LKSB: write data fail [ 120.512782] applesmc: send_byte(0x00, 0x0300) fail: 0x40 [ 120.512787] applesmc: LKSB: write data fail
The original code appeared to be timing sensitive and was not reliable with the timing changes in the aforementioned commit.
This patch re-factors the SMC communication to remove the timing dependencies and restore function with the changes previously committed.
Tested on : MacbookAir6,2 MacBookPro11,1 iMac12,2, MacBookAir1,1, MacBookAir3,1
Fixes: fff2d0f701e6 ("hwmon: (applesmc) avoid overlong udelay()") Reported-by: Andreas Kemnade andreas@kemnade.info Tested-by: Andreas Kemnade andreas@kemnade.info # MacBookAir6,2 Acked-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Brad Campbell brad@fnarfbargle.com Signed-off-by: Henrik Rydberg rydberg@bitmath.org Link: https://lore.kernel.org/r/194a7d71-a781-765a-d177-c962ef296b90@fnarfbargle.c... Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/applesmc.c | 130 ++++++++++++++++++++++++--------------- 1 file changed, 82 insertions(+), 48 deletions(-)
diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c index a18887990f4a2..79b498f816fe9 100644 --- a/drivers/hwmon/applesmc.c +++ b/drivers/hwmon/applesmc.c @@ -32,6 +32,7 @@ #include <linux/hwmon.h> #include <linux/workqueue.h> #include <linux/err.h> +#include <linux/bits.h>
/* data port used by Apple SMC */ #define APPLESMC_DATA_PORT 0x300 @@ -42,10 +43,13 @@
#define APPLESMC_MAX_DATA_LENGTH 32
-/* wait up to 128 ms for a status change. */ -#define APPLESMC_MIN_WAIT 0x0010 -#define APPLESMC_RETRY_WAIT 0x0100 -#define APPLESMC_MAX_WAIT 0x20000 +/* Apple SMC status bits */ +#define SMC_STATUS_AWAITING_DATA BIT(0) /* SMC has data waiting to be read */ +#define SMC_STATUS_IB_CLOSED BIT(1) /* Will ignore any input */ +#define SMC_STATUS_BUSY BIT(2) /* Command in progress */ + +/* Initial wait is 8us */ +#define APPLESMC_MIN_WAIT 0x0008
#define APPLESMC_READ_CMD 0x10 #define APPLESMC_WRITE_CMD 0x11 @@ -151,65 +155,84 @@ static unsigned int key_at_index; static struct workqueue_struct *applesmc_led_wq;
/* - * wait_read - Wait for a byte to appear on SMC port. Callers must - * hold applesmc_lock. + * Wait for specific status bits with a mask on the SMC. + * Used before all transactions. + * This does 10 fast loops of 8us then exponentially backs off for a + * minimum total wait of 262ms. Depending on usleep_range this could + * run out past 500ms. */ -static int wait_read(void) + +static int wait_status(u8 val, u8 mask) { - unsigned long end = jiffies + (APPLESMC_MAX_WAIT * HZ) / USEC_PER_SEC; u8 status; int us; + int i;
- for (us = APPLESMC_MIN_WAIT; us < APPLESMC_MAX_WAIT; us <<= 1) { - usleep_range(us, us * 16); + us = APPLESMC_MIN_WAIT; + for (i = 0; i < 24 ; i++) { status = inb(APPLESMC_CMD_PORT); - /* read: wait for smc to settle */ - if (status & 0x01) + if ((status & mask) == val) return 0; - /* timeout: give up */ - if (time_after(jiffies, end)) - break; + usleep_range(us, us * 2); + if (i > 9) + us <<= 1; } - - pr_warn("wait_read() fail: 0x%02x\n", status); return -EIO; }
-/* - * send_byte - Write to SMC port, retrying when necessary. Callers - * must hold applesmc_lock. - */ +/* send_byte - Write to SMC data port. Callers must hold applesmc_lock. */ + static int send_byte(u8 cmd, u16 port) { - u8 status; - int us; - unsigned long end = jiffies + (APPLESMC_MAX_WAIT * HZ) / USEC_PER_SEC; + int status; + + status = wait_status(0, SMC_STATUS_IB_CLOSED); + if (status) + return status; + /* + * This needs to be a separate read looking for bit 0x04 + * after bit 0x02 falls. If consolidated with the wait above + * this extra read may not happen if status returns both + * simultaneously and this would appear to be required. + */ + status = wait_status(SMC_STATUS_BUSY, SMC_STATUS_BUSY); + if (status) + return status;
outb(cmd, port); - for (us = APPLESMC_MIN_WAIT; us < APPLESMC_MAX_WAIT; us <<= 1) { - usleep_range(us, us * 16); - status = inb(APPLESMC_CMD_PORT); - /* write: wait for smc to settle */ - if (status & 0x02) - continue; - /* ready: cmd accepted, return */ - if (status & 0x04) - return 0; - /* timeout: give up */ - if (time_after(jiffies, end)) - break; - /* busy: long wait and resend */ - udelay(APPLESMC_RETRY_WAIT); - outb(cmd, port); - } - - pr_warn("send_byte(0x%02x, 0x%04x) fail: 0x%02x\n", cmd, port, status); - return -EIO; + return 0; }
+/* send_command - Write a command to the SMC. Callers must hold applesmc_lock. */ + static int send_command(u8 cmd) { - return send_byte(cmd, APPLESMC_CMD_PORT); + int ret; + + ret = wait_status(0, SMC_STATUS_IB_CLOSED); + if (ret) + return ret; + outb(cmd, APPLESMC_CMD_PORT); + return 0; +} + +/* + * Based on logic from the Apple driver. This is issued before any interaction + * If busy is stuck high, issue a read command to reset the SMC state machine. + * If busy is stuck high after the command then the SMC is jammed. + */ + +static int smc_sane(void) +{ + int ret; + + ret = wait_status(0, SMC_STATUS_BUSY); + if (!ret) + return ret; + ret = send_command(APPLESMC_READ_CMD); + if (ret) + return ret; + return wait_status(0, SMC_STATUS_BUSY); }
static int send_argument(const char *key) @@ -226,6 +249,11 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) { u8 status, data = 0; int i; + int ret; + + ret = smc_sane(); + if (ret) + return ret;
if (send_command(cmd) || send_argument(key)) { pr_warn("%.4s: read arg fail\n", key); @@ -239,7 +267,8 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) }
for (i = 0; i < len; i++) { - if (wait_read()) { + if (wait_status(SMC_STATUS_AWAITING_DATA | SMC_STATUS_BUSY, + SMC_STATUS_AWAITING_DATA | SMC_STATUS_BUSY)) { pr_warn("%.4s: read data[%d] fail\n", key, i); return -EIO; } @@ -250,19 +279,24 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len) for (i = 0; i < 16; i++) { udelay(APPLESMC_MIN_WAIT); status = inb(APPLESMC_CMD_PORT); - if (!(status & 0x01)) + if (!(status & SMC_STATUS_AWAITING_DATA)) break; data = inb(APPLESMC_DATA_PORT); } if (i) pr_warn("flushed %d bytes, last value is: %d\n", i, data);
- return 0; + return wait_status(0, SMC_STATUS_BUSY); }
static int write_smc(u8 cmd, const char *key, const u8 *buffer, u8 len) { int i; + int ret; + + ret = smc_sane(); + if (ret) + return ret;
if (send_command(cmd) || send_argument(key)) { pr_warn("%s: write arg fail\n", key); @@ -281,7 +315,7 @@ static int write_smc(u8 cmd, const char *key, const u8 *buffer, u8 len) } }
- return 0; + return wait_status(0, SMC_STATUS_BUSY); }
static int read_register_count(unsigned int *count)
From: Chuck Lever chuck.lever@oracle.com
[ Upstream commit 6c2190b3fcbc92cb79e39cc7e7531656b341e463 ]
Certain NFSv4.2/RDMA tests fail with v5.9-rc1.
rpcrdma_convert_kvec() runs off the end of the rl_segments array because rq_rcv_buf.tail[0].iov_len holds a very large positive value. The resultant kernel memory corruption is enough to crash the client system.
Callers of rpc_prepare_reply_pages() must reserve an extra XDR_UNIT in the maximum decode size for a possible XDR pad of the contents of the xdr_buf's pages. That guarantees the allocated receive buffer will be large enough to accommodate the usual contents plus that XDR pad word.
encode_op_hdr() cannot add that extra word. If it does, xdr_inline_pages() underruns the length of the tail iovec.
Fixes: 3e1f02123fba ("NFSv4.2: add client side XDR handling for extended attributes") Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/nfs42xdr.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c index cc50085e151c5..d0ddf90c9be48 100644 --- a/fs/nfs/nfs42xdr.c +++ b/fs/nfs/nfs42xdr.c @@ -179,7 +179,7 @@ 1 + nfs4_xattr_name_maxsz + 1) #define decode_setxattr_maxsz (op_decode_hdr_maxsz + decode_change_info_maxsz) #define encode_listxattrs_maxsz (op_encode_hdr_maxsz + 2 + 1) -#define decode_listxattrs_maxsz (op_decode_hdr_maxsz + 2 + 1 + 1) +#define decode_listxattrs_maxsz (op_decode_hdr_maxsz + 2 + 1 + 1 + 1) #define encode_removexattr_maxsz (op_encode_hdr_maxsz + 1 + \ nfs4_xattr_name_maxsz) #define decode_removexattr_maxsz (op_decode_hdr_maxsz + \ @@ -504,7 +504,7 @@ static void encode_listxattrs(struct xdr_stream *xdr, { __be32 *p;
- encode_op_hdr(xdr, OP_LISTXATTRS, decode_listxattrs_maxsz + 1, hdr); + encode_op_hdr(xdr, OP_LISTXATTRS, decode_listxattrs_maxsz, hdr);
p = reserve_space(xdr, 12); if (unlikely(!p))
From: Martin Willi martin@strongswan.org
[ Upstream commit 9e2b7fa2df4365e99934901da4fb4af52d81e820 ]
VRF devices use an optimized direct path on output if a default qdisc is involved, calling Netfilter hooks directly. This path, however, does not consider Netfilter rules completing asynchronously, such as with NFQUEUE. The Netfilter okfn() is called for asynchronously accepted packets, but the VRF never passes that packet down the stack to send it out over the slave device. Using the slower redirect path for this seems not feasible, as we do not know beforehand if a Netfilter hook has asynchronously completing rules.
Fix the use of asynchronously completing Netfilter rules in OUTPUT and POSTROUTING by using a special completion function that additionally calls dst_output() to pass the packet down the stack. Also, slightly adjust the use of nf_reset_ct() so that is called in the asynchronous case, too.
Fixes: dcdd43c41e60 ("net: vrf: performance improvements for IPv4") Fixes: a9ec54d1b0cd ("net: vrf: performance improvements for IPv6") Signed-off-by: Martin Willi martin@strongswan.org Link: https://lore.kernel.org/r/20201106073030.3974927-1-martin@strongswan.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/vrf.c | 92 +++++++++++++++++++++++++++++++++++------------ 1 file changed, 69 insertions(+), 23 deletions(-)
diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c index 60c1aadece89a..f2793ffde1913 100644 --- a/drivers/net/vrf.c +++ b/drivers/net/vrf.c @@ -608,8 +608,7 @@ static netdev_tx_t vrf_xmit(struct sk_buff *skb, struct net_device *dev) return ret; }
-static int vrf_finish_direct(struct net *net, struct sock *sk, - struct sk_buff *skb) +static void vrf_finish_direct(struct sk_buff *skb) { struct net_device *vrf_dev = skb->dev;
@@ -628,7 +627,8 @@ static int vrf_finish_direct(struct net *net, struct sock *sk, skb_pull(skb, ETH_HLEN); }
- return 1; + /* reset skb device */ + nf_reset_ct(skb); }
#if IS_ENABLED(CONFIG_IPV6) @@ -707,15 +707,41 @@ static struct sk_buff *vrf_ip6_out_redirect(struct net_device *vrf_dev, return skb; }
+static int vrf_output6_direct_finish(struct net *net, struct sock *sk, + struct sk_buff *skb) +{ + vrf_finish_direct(skb); + + return vrf_ip6_local_out(net, sk, skb); +} + static int vrf_output6_direct(struct net *net, struct sock *sk, struct sk_buff *skb) { + int err = 1; + skb->protocol = htons(ETH_P_IPV6);
- return NF_HOOK_COND(NFPROTO_IPV6, NF_INET_POST_ROUTING, - net, sk, skb, NULL, skb->dev, - vrf_finish_direct, - !(IPCB(skb)->flags & IPSKB_REROUTED)); + if (!(IPCB(skb)->flags & IPSKB_REROUTED)) + err = nf_hook(NFPROTO_IPV6, NF_INET_POST_ROUTING, net, sk, skb, + NULL, skb->dev, vrf_output6_direct_finish); + + if (likely(err == 1)) + vrf_finish_direct(skb); + + return err; +} + +static int vrf_ip6_out_direct_finish(struct net *net, struct sock *sk, + struct sk_buff *skb) +{ + int err; + + err = vrf_output6_direct(net, sk, skb); + if (likely(err == 1)) + err = vrf_ip6_local_out(net, sk, skb); + + return err; }
static struct sk_buff *vrf_ip6_out_direct(struct net_device *vrf_dev, @@ -728,18 +754,15 @@ static struct sk_buff *vrf_ip6_out_direct(struct net_device *vrf_dev, skb->dev = vrf_dev;
err = nf_hook(NFPROTO_IPV6, NF_INET_LOCAL_OUT, net, sk, - skb, NULL, vrf_dev, vrf_output6_direct); + skb, NULL, vrf_dev, vrf_ip6_out_direct_finish);
if (likely(err == 1)) err = vrf_output6_direct(net, sk, skb);
- /* reset skb device */ if (likely(err == 1)) - nf_reset_ct(skb); - else - skb = NULL; + return skb;
- return skb; + return NULL; }
static struct sk_buff *vrf_ip6_out(struct net_device *vrf_dev, @@ -919,15 +942,41 @@ static struct sk_buff *vrf_ip_out_redirect(struct net_device *vrf_dev, return skb; }
+static int vrf_output_direct_finish(struct net *net, struct sock *sk, + struct sk_buff *skb) +{ + vrf_finish_direct(skb); + + return vrf_ip_local_out(net, sk, skb); +} + static int vrf_output_direct(struct net *net, struct sock *sk, struct sk_buff *skb) { + int err = 1; + skb->protocol = htons(ETH_P_IP);
- return NF_HOOK_COND(NFPROTO_IPV4, NF_INET_POST_ROUTING, - net, sk, skb, NULL, skb->dev, - vrf_finish_direct, - !(IPCB(skb)->flags & IPSKB_REROUTED)); + if (!(IPCB(skb)->flags & IPSKB_REROUTED)) + err = nf_hook(NFPROTO_IPV4, NF_INET_POST_ROUTING, net, sk, skb, + NULL, skb->dev, vrf_output_direct_finish); + + if (likely(err == 1)) + vrf_finish_direct(skb); + + return err; +} + +static int vrf_ip_out_direct_finish(struct net *net, struct sock *sk, + struct sk_buff *skb) +{ + int err; + + err = vrf_output_direct(net, sk, skb); + if (likely(err == 1)) + err = vrf_ip_local_out(net, sk, skb); + + return err; }
static struct sk_buff *vrf_ip_out_direct(struct net_device *vrf_dev, @@ -940,18 +989,15 @@ static struct sk_buff *vrf_ip_out_direct(struct net_device *vrf_dev, skb->dev = vrf_dev;
err = nf_hook(NFPROTO_IPV4, NF_INET_LOCAL_OUT, net, sk, - skb, NULL, vrf_dev, vrf_output_direct); + skb, NULL, vrf_dev, vrf_ip_out_direct_finish);
if (likely(err == 1)) err = vrf_output_direct(net, sk, skb);
- /* reset skb device */ if (likely(err == 1)) - nf_reset_ct(skb); - else - skb = NULL; + return skb;
- return skb; + return NULL; }
static struct sk_buff *vrf_ip_out(struct net_device *vrf_dev,
From: Sven Van Asbroeck thesven73@gmail.com
[ Upstream commit edbc21113bde13ca3d06eec24b621b1f628583dd ]
When no devicetree is present, the driver will use an uninitialized variable.
Fix by initializing this variable.
Fixes: 902a66e08cea ("lan743x: correctly handle chips with internal PHY") Reported-by: kernel test robot lkp@intel.com Signed-off-by: Sven Van Asbroeck thesven73@gmail.com Link: https://lore.kernel.org/r/20201112152513.1941-1-TheSven73@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/microchip/lan743x_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/microchip/lan743x_main.c b/drivers/net/ethernet/microchip/lan743x_main.c index 6c25c7c8b7cf8..bc368136bccc6 100644 --- a/drivers/net/ethernet/microchip/lan743x_main.c +++ b/drivers/net/ethernet/microchip/lan743x_main.c @@ -1015,8 +1015,8 @@ static void lan743x_phy_close(struct lan743x_adapter *adapter) static int lan743x_phy_open(struct lan743x_adapter *adapter) { struct lan743x_phy *phy = &adapter->phy; + struct phy_device *phydev = NULL; struct device_node *phynode; - struct phy_device *phydev; struct net_device *netdev; int ret = -EIO;
From: Anshuman Khandual anshuman.khandual@arm.com
[ Upstream commit 58284a901b426e6130672e9f14c30dfd5a9dbde0 ]
During memory hotplug process, the linear mapping should not be created for a given memory range if that would fall outside the maximum allowed linear range. Else it might cause memory corruption in the kernel virtual space.
Maximum linear mapping region is [PAGE_OFFSET..(PAGE_END -1)] accommodating both its ends but excluding PAGE_END. Max physical range that can be mapped inside this linear mapping range, must also be derived from its end points.
This ensures that arch_add_memory() validates memory hot add range for its potential linear mapping requirements, before creating it with __create_pgd_mapping().
Fixes: 4ab215061554 ("arm64: Add memory hotplug support") Signed-off-by: Anshuman Khandual anshuman.khandual@arm.com Reviewed-by: Ard Biesheuvel ardb@kernel.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Ard Biesheuvel ardb@kernel.org Cc: Steven Price steven.price@arm.com Cc: Robin Murphy robin.murphy@arm.com Cc: David Hildenbrand david@redhat.com Cc: Andrew Morton akpm@linux-foundation.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Link: https://lore.kernel.org/r/1605252614-761-1-git-send-email-anshuman.khandual@... Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/mm/mmu.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 75df62fea1b68..a834e7fb0e250 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -1433,11 +1433,28 @@ static void __remove_pgd_mapping(pgd_t *pgdir, unsigned long start, u64 size) free_empty_tables(start, end, PAGE_OFFSET, PAGE_END); }
+static bool inside_linear_region(u64 start, u64 size) +{ + /* + * Linear mapping region is the range [PAGE_OFFSET..(PAGE_END - 1)] + * accommodating both its ends but excluding PAGE_END. Max physical + * range which can be mapped inside this linear mapping range, must + * also be derived from its end points. + */ + return start >= __pa(_PAGE_OFFSET(vabits_actual)) && + (start + size - 1) <= __pa(PAGE_END - 1); +} + int arch_add_memory(int nid, u64 start, u64 size, struct mhp_params *params) { int ret, flags = 0;
+ if (!inside_linear_region(start, size)) { + pr_err("[%llx %llx] is outside linear mapping region\n", start, start + size); + return -EINVAL; + } + if (rodata_full || debug_pagealloc_enabled()) flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
From: Santosh Sivaraj santosh@fossix.org
[ Upstream commit e7e046155af04cdca5e1157f28b07e1651eb317b ]
Define watchdog_allowed_mask only when SOFTLOCKUP_DETECTOR is enabled.
Fixes: 7feeb9cd4f5b ("watchdog/sysctl: Clean up sysctl variable name space") Signed-off-by: Santosh Sivaraj santosh@fossix.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Reviewed-by: Petr Mladek pmladek@suse.com Cc: Thomas Gleixner tglx@linutronix.de Link: https://lkml.kernel.org/r/20201106015025.1281561-1-santosh@fossix.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/watchdog.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/watchdog.c b/kernel/watchdog.c index 5abb5b22ad130..71109065bd8eb 100644 --- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -44,8 +44,6 @@ int __read_mostly soft_watchdog_user_enabled = 1; int __read_mostly watchdog_thresh = 10; static int __read_mostly nmi_watchdog_available;
-static struct cpumask watchdog_allowed_mask __read_mostly; - struct cpumask watchdog_cpumask __read_mostly; unsigned long *watchdog_cpumask_bits = cpumask_bits(&watchdog_cpumask);
@@ -162,6 +160,8 @@ static void lockup_detector_update_enable(void) int __read_mostly sysctl_softlockup_all_cpu_backtrace; #endif
+static struct cpumask watchdog_allowed_mask __read_mostly; + /* Global variables, exported for sysctl */ unsigned int __read_mostly softlockup_panic = CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE;
From: Muchun Song songmuchun@bytedance.com
[ Upstream commit 8b21ca0218d29cc6bb7028125c7e5a10dfb4730c ]
When we poll the swap.events, we can miss being woken up when the swap event occurs. Because we didn't notify.
Fixes: f3a53a3a1e5b ("mm, memcontrol: implement memory.swap.events") Signed-off-by: Muchun Song songmuchun@bytedance.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Reviewed-by: Shakeel Butt shakeelb@google.com Acked-by: Johannes Weiner hannes@cmpxchg.org Cc: Roman Gushchin guro@fb.com Cc: Michal Hocko mhocko@suse.com Cc: Yafang Shao laoar.shao@gmail.com Cc: Chris Down chris@chrisdown.name Cc: Tejun Heo tj@kernel.org Link: https://lkml.kernel.org/r/20201105161936.98312-1-songmuchun@bytedance.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/memcontrol.h | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index d0b036123c6ab..fa635207fe96d 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -897,12 +897,19 @@ static inline void count_memcg_event_mm(struct mm_struct *mm, static inline void memcg_memory_event(struct mem_cgroup *memcg, enum memcg_memory_event event) { + bool swap_event = event == MEMCG_SWAP_HIGH || event == MEMCG_SWAP_MAX || + event == MEMCG_SWAP_FAIL; + atomic_long_inc(&memcg->memory_events_local[event]); - cgroup_file_notify(&memcg->events_local_file); + if (!swap_event) + cgroup_file_notify(&memcg->events_local_file);
do { atomic_long_inc(&memcg->memory_events[event]); - cgroup_file_notify(&memcg->events_file); + if (swap_event) + cgroup_file_notify(&memcg->swap_events_file); + else + cgroup_file_notify(&memcg->events_file);
if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) break;
From: David Howells dhowells@redhat.com
[ Upstream commit 3ad216ee73abc554ed8f13f4f8b70845a7bef6da ]
When afs_write_end() is called with copied == 0, it tries to set the dirty region, but there's no way to actually encode a 0-length region in the encoding in page->private.
"0,0", for example, indicates a 1-byte region at offset 0. The maths miscalculates this and sets it incorrectly.
Fix it to just do nothing but unlock and put the page in this case. We don't actually need to mark the page dirty as nothing presumably changed.
Fixes: 65dd2d6072d3 ("afs: Alter dirty range encoding in page->private") Signed-off-by: David Howells dhowells@redhat.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/afs/write.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/afs/write.c b/fs/afs/write.c index 50371207f3273..c9195fc67fd8f 100644 --- a/fs/afs/write.c +++ b/fs/afs/write.c @@ -169,11 +169,14 @@ int afs_write_end(struct file *file, struct address_space *mapping, unsigned int f, from = pos & (PAGE_SIZE - 1); unsigned int t, to = from + copied; loff_t i_size, maybe_i_size; - int ret; + int ret = 0;
_enter("{%llx:%llu},{%lx}", vnode->fid.vid, vnode->fid.vnode, page->index);
+ if (copied == 0) + goto out; + maybe_i_size = pos + copied;
i_size = i_size_read(&vnode->vfs_inode);
From: Peter Zijlstra peterz@infradead.org
[ Upstream commit ce0f17fc93f63ee91428af10b7b2ddef38cd19e5 ]
One should use in_serving_softirq() to detect SoftIRQ context.
Fixes: 96f6d4444302 ("perf_counter: avoid recursion") Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/20201030151955.120572175@infradead.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/events/internal.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/events/internal.h b/kernel/events/internal.h index fcbf5616a4411..402054e755f27 100644 --- a/kernel/events/internal.h +++ b/kernel/events/internal.h @@ -211,7 +211,7 @@ static inline int get_recursion_context(int *recursion) rctx = 3; else if (in_irq()) rctx = 2; - else if (in_softirq()) + else if (in_serving_softirq()) rctx = 1; else rctx = 0;
From: Christoph Hellwig hch@lst.de
[ Upstream commit d4609ea8b3d3fb3423f35805843a82774cb4ef2f ]
Factor out a helper from nvme_update_ns_info that configures the per-namespaces metadata and PI settings. Also make sure the helpers clear the flags explicitly instead of all of ->features to allow for potentially reusing ->features for future non-metadata flags.
Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: Keith Busch kbusch@kernel.org Reviewed-by: Sagi Grimberg sagi@grimberg.me Reviewed-by: Chaitanya Kulkarni chaitanya.kulkarni@wdc.com Reviewed-by: Damien Le Moal damien.lemoal@wdc.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 78 ++++++++++++++++++++++++---------------- 1 file changed, 47 insertions(+), 31 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 59040bab5d6fa..be0cec51f5e6d 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -1946,6 +1946,50 @@ static int nvme_setup_streams_ns(struct nvme_ctrl *ctrl, struct nvme_ns *ns, return 0; }
+static int nvme_configure_metadata(struct nvme_ns *ns, struct nvme_id_ns *id) +{ + struct nvme_ctrl *ctrl = ns->ctrl; + + /* + * The PI implementation requires the metadata size to be equal to the + * t10 pi tuple size. + */ + ns->ms = le16_to_cpu(id->lbaf[id->flbas & NVME_NS_FLBAS_LBA_MASK].ms); + if (ns->ms == sizeof(struct t10_pi_tuple)) + ns->pi_type = id->dps & NVME_NS_DPS_PI_MASK; + else + ns->pi_type = 0; + + ns->features &= ~(NVME_NS_METADATA_SUPPORTED | NVME_NS_EXT_LBAS); + if (!ns->ms || !(ctrl->ops->flags & NVME_F_METADATA_SUPPORTED)) + return 0; + if (ctrl->ops->flags & NVME_F_FABRICS) { + /* + * The NVMe over Fabrics specification only supports metadata as + * part of the extended data LBA. We rely on HCA/HBA support to + * remap the separate metadata buffer from the block layer. + */ + if (WARN_ON_ONCE(!(id->flbas & NVME_NS_FLBAS_META_EXT))) + return -EINVAL; + if (ctrl->max_integrity_segments) + ns->features |= + (NVME_NS_METADATA_SUPPORTED | NVME_NS_EXT_LBAS); + } else { + /* + * For PCIe controllers, we can't easily remap the separate + * metadata buffer from the block layer and thus require a + * separate metadata buffer for block layer metadata/PI support. + * We allow extended LBAs for the passthrough interface, though. + */ + if (id->flbas & NVME_NS_FLBAS_META_EXT) + ns->features |= NVME_NS_EXT_LBAS; + else + ns->features |= NVME_NS_METADATA_SUPPORTED; + } + + return 0; +} + static void nvme_update_disk_info(struct gendisk *disk, struct nvme_ns *ns, struct nvme_id_ns *id) { @@ -2096,37 +2140,9 @@ static int __nvme_revalidate_disk(struct gendisk *disk, struct nvme_id_ns *id) return -ENODEV; }
- ns->features = 0; - ns->ms = le16_to_cpu(id->lbaf[lbaf].ms); - /* the PI implementation requires metadata equal t10 pi tuple size */ - if (ns->ms == sizeof(struct t10_pi_tuple)) - ns->pi_type = id->dps & NVME_NS_DPS_PI_MASK; - else - ns->pi_type = 0; - - if (ns->ms) { - /* - * For PCIe only the separate metadata pointer is supported, - * as the block layer supplies metadata in a separate bio_vec - * chain. For Fabrics, only metadata as part of extended data - * LBA is supported on the wire per the Fabrics specification, - * but the HBA/HCA will do the remapping from the separate - * metadata buffers for us. - */ - if (id->flbas & NVME_NS_FLBAS_META_EXT) { - ns->features |= NVME_NS_EXT_LBAS; - if ((ctrl->ops->flags & NVME_F_FABRICS) && - (ctrl->ops->flags & NVME_F_METADATA_SUPPORTED) && - ctrl->max_integrity_segments) - ns->features |= NVME_NS_METADATA_SUPPORTED; - } else { - if (WARN_ON_ONCE(ctrl->ops->flags & NVME_F_FABRICS)) - return -EINVAL; - if (ctrl->ops->flags & NVME_F_METADATA_SUPPORTED) - ns->features |= NVME_NS_METADATA_SUPPORTED; - } - } - + ret = nvme_configure_metadata(ns, id); + if (ret) + return ret; nvme_set_chunk_sectors(ns, id); nvme_update_disk_info(disk, ns, id); #ifdef CONFIG_NVME_MULTIPATH
[ Upstream commit f9d5f4579feafa721dba2f350fc064a1852c6f8c ]
Ensure that there can't be any I/O in flight went we change the disk geometry in nvme_update_ns_info, most notable the LBA size by lifting the queue free from nvme_update_disk_info into the caller
Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: Keith Busch kbusch@kernel.org Reviewed-by: Damien Le Moal damien.lemoal@wdc.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index be0cec51f5e6d..b130696b00592 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -2001,7 +2001,7 @@ static void nvme_update_disk_info(struct gendisk *disk, /* unsupported block size, set capacity to 0 later */ bs = (1 << 9); } - blk_mq_freeze_queue(disk->queue); + blk_integrity_unregister(disk);
atomic_bs = phys_bs = bs; @@ -2066,8 +2066,6 @@ static void nvme_update_disk_info(struct gendisk *disk, set_disk_ro(disk, true); else set_disk_ro(disk, false); - - blk_mq_unfreeze_queue(disk->queue); }
static inline bool nvme_first_scan(struct gendisk *disk) @@ -2114,6 +2112,7 @@ static int __nvme_revalidate_disk(struct gendisk *disk, struct nvme_id_ns *id) struct nvme_ctrl *ctrl = ns->ctrl; int ret;
+ blk_mq_freeze_queue(ns->disk->queue); /* * If identify namespace failed, use default 512 byte block size so * block layer can use before failing read/write for 0 capacity. @@ -2131,29 +2130,38 @@ static int __nvme_revalidate_disk(struct gendisk *disk, struct nvme_id_ns *id) dev_warn(ctrl->device, "failed to add zoned namespace:%u ret:%d\n", ns->head->ns_id, ret); - return ret; + goto out_unfreeze; } break; default: dev_warn(ctrl->device, "unknown csi:%u ns:%u\n", ns->head->ids.csi, ns->head->ns_id); - return -ENODEV; + ret = -ENODEV; + goto out_unfreeze; }
ret = nvme_configure_metadata(ns, id); if (ret) - return ret; + goto out_unfreeze; nvme_set_chunk_sectors(ns, id); nvme_update_disk_info(disk, ns, id); + blk_mq_unfreeze_queue(ns->disk->queue); + #ifdef CONFIG_NVME_MULTIPATH if (ns->head->disk) { + blk_mq_freeze_queue(ns->head->disk->queue); nvme_update_disk_info(ns->head->disk, ns, id); blk_stack_limits(&ns->head->disk->queue->limits, &ns->queue->limits, 0); nvme_mpath_update_disk_size(ns->head->disk); + blk_mq_unfreeze_queue(ns->head->disk->queue); } #endif return 0; + +out_unfreeze: + blk_mq_unfreeze_queue(ns->disk->queue); + return ret; }
static int _nvme_revalidate_disk(struct gendisk *disk)
From: Sagi Grimberg sagi@grimberg.me
[ Upstream commit 65c5a055b0d567b7e7639d942c0605da9cc54c5e ]
The offending commit breaks BLKROSET ioctl because a device revalidation will blindly override BLKROSET setting. Hence, we remove the disk rw setting in case NVME_NS_ATTR_RO is cleared from by the controller.
Fixes: 1293477f4f32 ("nvme: set gendisk read only based on nsattr") Signed-off-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index b130696b00592..349fba056cb65 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -2064,8 +2064,6 @@ static void nvme_update_disk_info(struct gendisk *disk,
if (id->nsattr & NVME_NS_ATTR_RO) set_disk_ro(disk, true); - else - set_disk_ro(disk, false); }
static inline bool nvme_first_scan(struct gendisk *disk)
From: Peter Zijlstra peterz@infradead.org
[ Upstream commit 251ff2d49347793d348babcff745289b11910e96 ]
Collate the error paths. Code duplication only leads to divergence and extra bugs.
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/20201029162901.972161394@infradead.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/events/core.c | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c index 98a603098f23e..c245ccd426b71 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -2565,11 +2565,8 @@ group_sched_in(struct perf_event *group_event,
pmu->start_txn(pmu, PERF_PMU_TXN_ADD);
- if (event_sched_in(group_event, cpuctx, ctx)) { - pmu->cancel_txn(pmu); - perf_mux_hrtimer_restart(cpuctx); - return -EAGAIN; - } + if (event_sched_in(group_event, cpuctx, ctx)) + goto error;
/* * Schedule in siblings as one group (if any): @@ -2598,10 +2595,9 @@ group_error: } event_sched_out(group_event, cpuctx, ctx);
+error: pmu->cancel_txn(pmu); - perf_mux_hrtimer_restart(cpuctx); - return -EAGAIN; }
From: Peter Zijlstra peterz@infradead.org
[ Upstream commit 2714c3962f304d031d5016c963c4b459337b0749 ]
Commit 9e6302056f80 ("perf: Use hrtimers for event multiplexing") placed the hrtimer (re)start call in the wrong place. Instead of capturing all scheduling failures, it only considered the PMU failure.
The result is that groups using perf_event_attr::exclusive are no longer rotated.
Fixes: 9e6302056f80 ("perf: Use hrtimers for event multiplexing") Reported-by: Andi Kleen ak@linux.intel.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/20201029162902.038667689@infradead.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/events/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c index c245ccd426b71..a06ac60d346f1 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -2597,7 +2597,6 @@ group_error:
error: pmu->cancel_txn(pmu); - perf_mux_hrtimer_restart(cpuctx); return -EAGAIN; }
@@ -3653,6 +3652,7 @@ static int merge_sched_in(struct perf_event *event, void *data)
*can_add_hw = 0; ctx->rotate_necessary = 1; + perf_mux_hrtimer_restart(cpuctx); }
return 0;
From: Arnd Bergmann arnd@arndb.de
commit f3217d6f2f7a76b36a3326ad58c8897f4d5fbe31 upstream.
The zynqmp_pm_set_suspend_mode() and zynqmp_pm_get_trustzone_version() functions pass values as api_id into zynqmp_pm_invoke_fn that are beyond PM_API_MAX, resulting in an out-of-bounds access:
drivers/firmware/xilinx/zynqmp.c: In function 'zynqmp_pm_set_suspend_mode': drivers/firmware/xilinx/zynqmp.c:150:24: warning: array subscript 2562 is above array bounds of 'u32[64]' {aka 'unsigned int[64]'} [-Warray-bounds] 150 | if (zynqmp_pm_features[api_id] != PM_FEATURE_UNCHECKED) | ~~~~~~~~~~~~~~~~~~^~~~~~~~ drivers/firmware/xilinx/zynqmp.c:28:12: note: while referencing 'zynqmp_pm_features' 28 | static u32 zynqmp_pm_features[PM_API_MAX]; | ^~~~~~~~~~~~~~~~~~
Replace the resulting undefined behavior with an error return. This may break some things that happen to work at the moment but seems better than randomly overwriting kernel data.
I assume we need additional fixes for the two functions that now return an error.
Fixes: 76582671eb5d ("firmware: xilinx: Add Zynqmp firmware driver") Fixes: e178df31cf41 ("firmware: xilinx: Implement ZynqMP power management APIs") Signed-off-by: Arnd Bergmann arnd@arndb.de Link: https://lore.kernel.org/r/20201026155449.3703142-1-arnd@kernel.org Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/firmware/xilinx/zynqmp.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/firmware/xilinx/zynqmp.c +++ b/drivers/firmware/xilinx/zynqmp.c @@ -147,6 +147,9 @@ static int zynqmp_pm_feature(u32 api_id) return 0;
/* Return value if feature is already checked */ + if (api_id > ARRAY_SIZE(zynqmp_pm_features)) + return PM_FEATURE_INVALID; + if (zynqmp_pm_features[api_id] != PM_FEATURE_UNCHECKED) return zynqmp_pm_features[api_id];
From: Gao Xiang hsiangkao@redhat.com
commit a30573b3cdc77b8533d004ece1ea7c0146b437a0 upstream.
pcluster should be only set up for all managed pages instead of temporary pages. Since it currently uses page->mapping to identify, the impact is minor for now.
[ Update: Vladimir reported the kernel log becomes polluted because PAGE_FLAGS_CHECK_AT_FREE flag(s) set if the page allocation debug option is enabled. ]
Link: https://lore.kernel.org/r/20201022145724.27284-1-hsiangkao@aol.com Fixes: 5ddcee1f3a1c ("erofs: get rid of __stagingpage_alloc helper") Cc: stable@vger.kernel.org # 5.5+ Tested-by: Vladimir Zapolskiy vladimir@tuxera.com Reviewed-by: Chao Yu yuchao0@huawei.com Signed-off-by: Gao Xiang hsiangkao@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/erofs/zdata.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/fs/erofs/zdata.c +++ b/fs/erofs/zdata.c @@ -1080,8 +1080,11 @@ out_allocpage: cond_resched(); goto repeat; } - set_page_private(page, (unsigned long)pcl); - SetPagePrivate(page); + + if (tocache) { + set_page_private(page, (unsigned long)pcl); + SetPagePrivate(page); + } out: /* the only exit (for tracing and debugging) */ return page; }
From: Gao Xiang hsiangkao@redhat.com
commit d3938ee23e97bfcac2e0eb6b356875da73d700df upstream.
EROFS has _only one_ ondisk timestamp (ctime is currently documented and recorded, we might also record mtime instead with a new compat feature if needed) for each extended inode since EROFS isn't mainly for archival purposes so no need to keep all timestamps on disk especially for Android scenarios due to security concerns. Also, romfs/cramfs don't have their own on-disk timestamp, and squashfs only records mtime instead.
Let's also derive access time from ondisk timestamp rather than leaving it empty, and if mtime/atime for each file are really needed for specific scenarios as well, we can also use xattrs to record them then.
Link: https://lore.kernel.org/r/20201031195102.21221-1-hsiangkao@aol.com [ Gao Xiang: It'd be better to backport for user-friendly concern. ] Fixes: 431339ba9042 ("staging: erofs: add inode operations") Cc: stable stable@vger.kernel.org # 4.19+ Reported-by: nl6720 nl6720@gmail.com Reviewed-by: Chao Yu yuchao0@huawei.com Signed-off-by: Gao Xiang hsiangkao@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/erofs/inode.c | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-)
--- a/fs/erofs/inode.c +++ b/fs/erofs/inode.c @@ -107,11 +107,9 @@ static struct page *erofs_read_inode(str i_gid_write(inode, le32_to_cpu(die->i_gid)); set_nlink(inode, le32_to_cpu(die->i_nlink));
- /* ns timestamp */ - inode->i_mtime.tv_sec = inode->i_ctime.tv_sec = - le64_to_cpu(die->i_ctime); - inode->i_mtime.tv_nsec = inode->i_ctime.tv_nsec = - le32_to_cpu(die->i_ctime_nsec); + /* extended inode has its own timestamp */ + inode->i_ctime.tv_sec = le64_to_cpu(die->i_ctime); + inode->i_ctime.tv_nsec = le32_to_cpu(die->i_ctime_nsec);
inode->i_size = le64_to_cpu(die->i_size);
@@ -149,11 +147,9 @@ static struct page *erofs_read_inode(str i_gid_write(inode, le16_to_cpu(dic->i_gid)); set_nlink(inode, le16_to_cpu(dic->i_nlink));
- /* use build time to derive all file time */ - inode->i_mtime.tv_sec = inode->i_ctime.tv_sec = - sbi->build_time; - inode->i_mtime.tv_nsec = inode->i_ctime.tv_nsec = - sbi->build_time_nsec; + /* use build time for compact inodes */ + inode->i_ctime.tv_sec = sbi->build_time; + inode->i_ctime.tv_nsec = sbi->build_time_nsec;
inode->i_size = le32_to_cpu(dic->i_size); if (erofs_inode_is_data_compressed(vi->datalayout)) @@ -167,6 +163,11 @@ static struct page *erofs_read_inode(str goto err_out; }
+ inode->i_mtime.tv_sec = inode->i_ctime.tv_sec; + inode->i_atime.tv_sec = inode->i_ctime.tv_sec; + inode->i_mtime.tv_nsec = inode->i_ctime.tv_nsec; + inode->i_atime.tv_nsec = inode->i_ctime.tv_nsec; + if (!nblks) /* measure inode.i_blocks as generic filesystems */ inode->i_blocks = roundup(inode->i_size, EROFS_BLKSIZ) >> 9;
From: Kaixu Xia kaixuxia@tencent.com
commit 174fe5ba2d1ea0d6c5ab2a7d4aa058d6d497ae4d upstream.
The macro MOPT_Q is used to indicates the mount option is related to quota stuff and is defined to be MOPT_NOSUPPORT when CONFIG_QUOTA is disabled. Normally the quota options are handled explicitly, so it didn't matter that the MOPT_STRING flag was missing, even though the usrjquota and grpjquota mount options take a string argument. It's important that's present in the !CONFIG_QUOTA case, since without MOPT_STRING, the mount option matcher will match usrjquota= followed by an integer, and will otherwise skip the table entry, and so "mount option not supported" error message is never reported.
[ Fixed up the commit description to better explain why the fix works. --TYT ]
Fixes: 26092bf52478 ("ext4: use a table-driven handler for mount options") Signed-off-by: Kaixu Xia kaixuxia@tencent.com Link: https://lore.kernel.org/r/1603986396-28917-1-git-send-email-kaixuxia@tencent... Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/ext4/super.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -1829,8 +1829,8 @@ static const struct mount_opts { {Opt_noquota, (EXT4_MOUNT_QUOTA | EXT4_MOUNT_USRQUOTA | EXT4_MOUNT_GRPQUOTA | EXT4_MOUNT_PRJQUOTA), MOPT_CLEAR | MOPT_Q}, - {Opt_usrjquota, 0, MOPT_Q}, - {Opt_grpjquota, 0, MOPT_Q}, + {Opt_usrjquota, 0, MOPT_Q | MOPT_STRING}, + {Opt_grpjquota, 0, MOPT_Q | MOPT_STRING}, {Opt_offusrjquota, 0, MOPT_Q}, {Opt_offgrpjquota, 0, MOPT_Q}, {Opt_jqfmt_vfsold, QFMT_VFS_OLD, MOPT_QFMT},
From: Joseph Qi joseph.qi@linux.alibaba.com
commit 7067b2619017d51e71686ca9756b454de0e5826a upstream.
It takes xattr_sem to check inline data again but without unlock it in case not have. So unlock it before return.
Fixes: aef1c8513c1f ("ext4: let ext4_truncate handle inline data correctly") Reported-by: Dan Carpenter dan.carpenter@oracle.com Cc: Tao Ma boyu.mt@taobao.com Signed-off-by: Joseph Qi joseph.qi@linux.alibaba.com Reviewed-by: Andreas Dilger adilger@dilger.ca Link: https://lore.kernel.org/r/1604370542-124630-1-git-send-email-joseph.qi@linux... Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/ext4/inline.c | 1 + 1 file changed, 1 insertion(+)
--- a/fs/ext4/inline.c +++ b/fs/ext4/inline.c @@ -1880,6 +1880,7 @@ int ext4_inline_data_truncate(struct ino
ext4_write_lock_xattr(inode, &no_expand); if (!ext4_has_inline_data(inode)) { + ext4_write_unlock_xattr(inode, &no_expand); *has_inline = 0; ext4_journal_stop(handle); return 0;
From: Matthew Wilcox (Oracle) willy@infradead.org
commit a1fbc6750e212c5675a4e48d7f51d44607eb8756 upstream.
On 32-bit systems, this shift will overflow for files larger than 4GB as start_index is unsigned long while the calls to btrfs_delalloc_*_space expect u64.
CC: stable@vger.kernel.org # 4.4+ Fixes: df480633b891 ("btrfs: extent-tree: Switch to new delalloc space reserve and release") Reviewed-by: Josef Bacik josef@toxicpanda.com Signed-off-by: Matthew Wilcox (Oracle) willy@infradead.org Reviewed-by: David Sterba dsterba@suse.com [ define the variable instead of repeating the shift ] Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/ioctl.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-)
--- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -1261,6 +1261,7 @@ static int cluster_pages_for_defrag(stru u64 page_start; u64 page_end; u64 page_cnt; + u64 start = (u64)start_index << PAGE_SHIFT; int ret; int i; int i_done; @@ -1277,8 +1278,7 @@ static int cluster_pages_for_defrag(stru page_cnt = min_t(u64, (u64)num_pages, (u64)file_end - start_index + 1);
ret = btrfs_delalloc_reserve_space(BTRFS_I(inode), &data_reserved, - start_index << PAGE_SHIFT, - page_cnt << PAGE_SHIFT); + start, page_cnt << PAGE_SHIFT); if (ret) return ret; i_done = 0; @@ -1367,8 +1367,7 @@ again: btrfs_mod_outstanding_extents(BTRFS_I(inode), 1); spin_unlock(&BTRFS_I(inode)->lock); btrfs_delalloc_release_space(BTRFS_I(inode), data_reserved, - start_index << PAGE_SHIFT, - (page_cnt - i_done) << PAGE_SHIFT, true); + start, (page_cnt - i_done) << PAGE_SHIFT, true); }
@@ -1395,8 +1394,7 @@ out: put_page(pages[i]); } btrfs_delalloc_release_space(BTRFS_I(inode), data_reserved, - start_index << PAGE_SHIFT, - page_cnt << PAGE_SHIFT, true); + start, page_cnt << PAGE_SHIFT, true); btrfs_delalloc_release_extents(BTRFS_I(inode), page_cnt << PAGE_SHIFT); extent_changeset_free(data_reserved); return ret;
From: Dinghao Liu dinghao.liu@zju.edu.cn
commit 468600c6ec28613b756193c5f780aac062f1acdf upstream.
There is one error handling path that does not free ref, which may cause a minor memory leak.
CC: stable@vger.kernel.org # 4.19+ Reviewed-by: Josef Bacik josef@toxicpanda.com Signed-off-by: Dinghao Liu dinghao.liu@zju.edu.cn Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/ref-verify.c | 1 + 1 file changed, 1 insertion(+)
--- a/fs/btrfs/ref-verify.c +++ b/fs/btrfs/ref-verify.c @@ -860,6 +860,7 @@ int btrfs_ref_tree_mod(struct btrfs_fs_i "dropping a ref for a root that doesn't have a ref on the block"); dump_block_entry(fs_info, be); dump_ref_action(fs_info, ra); + kfree(ref); kfree(ra); goto out_unlock; }
From: Josef Bacik josef@toxicpanda.com
commit fca3a45d08782a2bb85e048fb8e3128b1388d7b7 upstream.
The minimum reserve size was adjusted to take into account the height of the tree we are merging, however we can have a root with a level == 0. What we want is root_level + 1 to get the number of nodes we may have to cow. This fixes the enospc_debug warning pops with btrfs/101.
Nikolay: this fixes failures on btrfs/060 btrfs/062 btrfs/063 and btrfs/195 That I was seeing, the call trace was:
[ 3680.515564] ------------[ cut here ]------------ [ 3680.515566] BTRFS: block rsv returned -28 [ 3680.515585] WARNING: CPU: 2 PID: 8339 at fs/btrfs/block-rsv.c:521 btrfs_use_block_rsv+0x162/0x180 [ 3680.515587] Modules linked in: [ 3680.515591] CPU: 2 PID: 8339 Comm: btrfs Tainted: G W 5.9.0-rc8-default #95 [ 3680.515593] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1 04/01/2014 [ 3680.515595] RIP: 0010:btrfs_use_block_rsv+0x162/0x180 [ 3680.515600] RSP: 0018:ffffa01ac9753910 EFLAGS: 00010282 [ 3680.515602] RAX: 0000000000000000 RBX: ffff984b34200000 RCX: 0000000000000027 [ 3680.515604] RDX: 0000000000000027 RSI: 0000000000000000 RDI: ffff984b3bd19e28 [ 3680.515606] RBP: 0000000000004000 R08: ffff984b3bd19e20 R09: 0000000000000001 [ 3680.515608] R10: 0000000000000004 R11: 0000000000000046 R12: ffff984b264fdc00 [ 3680.515609] R13: ffff984b13149000 R14: 00000000ffffffe4 R15: ffff984b34200000 [ 3680.515613] FS: 00007f4e2912b8c0(0000) GS:ffff984b3bd00000(0000) knlGS:0000000000000000 [ 3680.515615] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3680.515617] CR2: 00007fab87122150 CR3: 0000000118e42000 CR4: 00000000000006e0 [ 3680.515620] Call Trace: [ 3680.515627] btrfs_alloc_tree_block+0x8b/0x340 [ 3680.515633] ? __lock_acquire+0x51a/0xac0 [ 3680.515646] alloc_tree_block_no_bg_flush+0x4f/0x60 [ 3680.515651] __btrfs_cow_block+0x14e/0x7e0 [ 3680.515662] btrfs_cow_block+0x144/0x2c0 [ 3680.515670] merge_reloc_root+0x4d4/0x610 [ 3680.515675] ? btrfs_lookup_fs_root+0x78/0x90 [ 3680.515686] merge_reloc_roots+0xee/0x280 [ 3680.515695] relocate_block_group+0x2ce/0x5e0 [ 3680.515704] btrfs_relocate_block_group+0x16e/0x310 [ 3680.515711] btrfs_relocate_chunk+0x38/0xf0 [ 3680.515716] btrfs_shrink_device+0x200/0x560 [ 3680.515728] btrfs_rm_device+0x1ae/0x6a6 [ 3680.515744] ? _copy_from_user+0x6e/0xb0 [ 3680.515750] btrfs_ioctl+0x1afe/0x28c0 [ 3680.515755] ? find_held_lock+0x2b/0x80 [ 3680.515760] ? do_user_addr_fault+0x1f8/0x418 [ 3680.515773] ? __x64_sys_ioctl+0x77/0xb0 [ 3680.515775] __x64_sys_ioctl+0x77/0xb0 [ 3680.515781] do_syscall_64+0x31/0x70 [ 3680.515785] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Reported-by: Nikolay Borisov nborisov@suse.com Fixes: 44d354abf33e ("btrfs: relocation: review the call sites which can be interrupted by signal") CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Nikolay Borisov nborisov@suse.com Tested-by: Nikolay Borisov nborisov@suse.com Signed-off-by: Josef Bacik josef@toxicpanda.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/relocation.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -1646,6 +1646,7 @@ static noinline_for_stack int merge_relo struct btrfs_root_item *root_item; struct btrfs_path *path; struct extent_buffer *leaf; + int reserve_level; int level; int max_level; int replaced = 0; @@ -1694,7 +1695,8 @@ static noinline_for_stack int merge_relo * Thus the needed metadata size is at most root_level * nodesize, * and * 2 since we have two trees to COW. */ - min_reserved = fs_info->nodesize * btrfs_root_level(root_item) * 2; + reserve_level = max_t(int, 1, btrfs_root_level(root_item)); + min_reserved = fs_info->nodesize * reserve_level * 2; memset(&next_key, 0, sizeof(next_key));
while (1) {
From: Anand Jain anand.jain@oracle.com
commit cf89af146b7e62af55470cf5f3ec3c56ec144a5e upstream.
If there is a device BTRFS_DEV_REPLACE_DEVID without the device replace item, then it means the filesystem is inconsistent state. This is either corruption or a crafted image. Fail the mount as this needs a closer look what is actually wrong.
As of now if BTRFS_DEV_REPLACE_DEVID is present without the replace item, in __btrfs_free_extra_devids() we determine that there is an extra device, and free those extra devices but continue to mount the device. However, we were wrong in keeping tack of the rw_devices so the syzbot testcase failed:
WARNING: CPU: 1 PID: 3612 at fs/btrfs/volumes.c:1166 close_fs_devices.part.0+0x607/0x800 fs/btrfs/volumes.c:1166 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 3612 Comm: syz-executor.2 Not tainted 5.9.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x198/0x1fd lib/dump_stack.c:118 panic+0x347/0x7c0 kernel/panic.c:231 __warn.cold+0x20/0x46 kernel/panic.c:600 report_bug+0x1bd/0x210 lib/bug.c:198 handle_bug+0x38/0x90 arch/x86/kernel/traps.c:234 exc_invalid_op+0x14/0x40 arch/x86/kernel/traps.c:254 asm_exc_invalid_op+0x12/0x20 arch/x86/include/asm/idtentry.h:536 RIP: 0010:close_fs_devices.part.0+0x607/0x800 fs/btrfs/volumes.c:1166 RSP: 0018:ffffc900091777e0 EFLAGS: 00010246 RAX: 0000000000040000 RBX: ffffffffffffffff RCX: ffffc9000c8b7000 RDX: 0000000000040000 RSI: ffffffff83097f47 RDI: 0000000000000007 RBP: dffffc0000000000 R08: 0000000000000001 R09: ffff8880988a187f R10: 0000000000000000 R11: 0000000000000001 R12: ffff88809593a130 R13: ffff88809593a1ec R14: ffff8880988a1908 R15: ffff88809593a050 close_fs_devices fs/btrfs/volumes.c:1193 [inline] btrfs_close_devices+0x95/0x1f0 fs/btrfs/volumes.c:1179 open_ctree+0x4984/0x4a2d fs/btrfs/disk-io.c:3434 btrfs_fill_super fs/btrfs/super.c:1316 [inline] btrfs_mount_root.cold+0x14/0x165 fs/btrfs/super.c:1672
The fix here is, when we determine that there isn't a replace item then fail the mount if there is a replace target device (devid 0).
CC: stable@vger.kernel.org # 4.19+ Reported-by: syzbot+4cfe71a4da060be47502@syzkaller.appspotmail.com Signed-off-by: Anand Jain anand.jain@oracle.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/dev-replace.c | 26 ++++++++++++++++++++++++-- fs/btrfs/volumes.c | 26 +++++++------------------- 2 files changed, 31 insertions(+), 21 deletions(-)
--- a/fs/btrfs/dev-replace.c +++ b/fs/btrfs/dev-replace.c @@ -95,6 +95,17 @@ int btrfs_init_dev_replace(struct btrfs_ ret = btrfs_search_slot(NULL, dev_root, &key, path, 0, 0); if (ret) { no_valid_dev_replace_entry_found: + /* + * We don't have a replace item or it's corrupted. If there is + * a replace target, fail the mount. + */ + if (btrfs_find_device(fs_info->fs_devices, + BTRFS_DEV_REPLACE_DEVID, NULL, NULL, false)) { + btrfs_err(fs_info, + "found replace target device without a valid replace item"); + ret = -EUCLEAN; + goto out; + } ret = 0; dev_replace->replace_state = BTRFS_IOCTL_DEV_REPLACE_STATE_NEVER_STARTED; @@ -147,8 +158,19 @@ no_valid_dev_replace_entry_found: case BTRFS_IOCTL_DEV_REPLACE_STATE_NEVER_STARTED: case BTRFS_IOCTL_DEV_REPLACE_STATE_FINISHED: case BTRFS_IOCTL_DEV_REPLACE_STATE_CANCELED: - dev_replace->srcdev = NULL; - dev_replace->tgtdev = NULL; + /* + * We don't have an active replace item but if there is a + * replace target, fail the mount. + */ + if (btrfs_find_device(fs_info->fs_devices, + BTRFS_DEV_REPLACE_DEVID, NULL, NULL, false)) { + btrfs_err(fs_info, + "replace devid present without an active replace item"); + ret = -EUCLEAN; + } else { + dev_replace->srcdev = NULL; + dev_replace->tgtdev = NULL; + } break; case BTRFS_IOCTL_DEV_REPLACE_STATE_STARTED: case BTRFS_IOCTL_DEV_REPLACE_STATE_SUSPENDED: --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1064,22 +1064,13 @@ again: continue; }
- if (device->devid == BTRFS_DEV_REPLACE_DEVID) { - /* - * In the first step, keep the device which has - * the correct fsid and the devid that is used - * for the dev_replace procedure. - * In the second step, the dev_replace state is - * read from the device tree and it is known - * whether the procedure is really active or - * not, which means whether this device is - * used or whether it should be removed. - */ - if (step == 0 || test_bit(BTRFS_DEV_STATE_REPLACE_TGT, - &device->dev_state)) { - continue; - } - } + /* + * We have already validated the presence of BTRFS_DEV_REPLACE_DEVID, + * in btrfs_init_dev_replace() so just continue. + */ + if (device->devid == BTRFS_DEV_REPLACE_DEVID) + continue; + if (device->bdev) { blkdev_put(device->bdev, device->mode); device->bdev = NULL; @@ -1088,9 +1079,6 @@ again: if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) { list_del_init(&device->dev_alloc_list); clear_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state); - if (!test_bit(BTRFS_DEV_STATE_REPLACE_TGT, - &device->dev_state)) - fs_devices->rw_devices--; } list_del_init(&device->dev_list); fs_devices->num_devices--;
From: Andrew Jones drjones@redhat.com
commit f81cb2c3ad41ac6d8cb2650e3d72d5f67db1aa28 upstream.
ID registers are RAZ until they've been allocated a purpose, but that doesn't mean they should be removed from the KVM_GET_REG_LIST list. So far we only have one register, SYS_ID_AA64ZFR0_EL1, that is hidden from userspace when its function, SVE, is not present.
Expose SYS_ID_AA64ZFR0_EL1 to userspace as RAZ when SVE is not implemented. Removing the userspace visibility checks is enough to reexpose it, as it will already return zero to userspace when SVE is not present. The register already behaves as RAZ for the guest when SVE is not present.
Fixes: 73433762fcae ("KVM: arm64/sve: System register context switch and access support") Reported-by: 张东旭 xu910121@sina.com Signed-off-by: Andrew Jones drjones@redhat.com Signed-off-by: Marc Zyngier maz@kernel.org Cc: stable@vger.kernel.org#v5.2+ Link: https://lore.kernel.org/r/20201105091022.15373-2-drjones@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/kvm/sys_regs.c | 18 +----------------- 1 file changed, 1 insertion(+), 17 deletions(-)
--- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -1193,16 +1193,6 @@ static unsigned int sve_visibility(const return REG_HIDDEN_USER | REG_HIDDEN_GUEST; }
-/* Visibility overrides for SVE-specific ID registers */ -static unsigned int sve_id_visibility(const struct kvm_vcpu *vcpu, - const struct sys_reg_desc *rd) -{ - if (vcpu_has_sve(vcpu)) - return 0; - - return REG_HIDDEN_USER; -} - /* Generate the emulated ID_AA64ZFR0_EL1 value exposed to the guest */ static u64 guest_id_aa64zfr0_el1(const struct kvm_vcpu *vcpu) { @@ -1229,9 +1219,6 @@ static int get_id_aa64zfr0_el1(struct kv { u64 val;
- if (WARN_ON(!vcpu_has_sve(vcpu))) - return -ENOENT; - val = guest_id_aa64zfr0_el1(vcpu); return reg_to_user(uaddr, &val, reg->id); } @@ -1244,9 +1231,6 @@ static int set_id_aa64zfr0_el1(struct kv int err; u64 val;
- if (WARN_ON(!vcpu_has_sve(vcpu))) - return -ENOENT; - err = reg_from_user(&val, uaddr, id); if (err) return err; @@ -1509,7 +1493,7 @@ static const struct sys_reg_desc sys_reg ID_SANITISED(ID_AA64PFR1_EL1), ID_UNALLOCATED(4,2), ID_UNALLOCATED(4,3), - { SYS_DESC(SYS_ID_AA64ZFR0_EL1), access_id_aa64zfr0_el1, .get_user = get_id_aa64zfr0_el1, .set_user = set_id_aa64zfr0_el1, .visibility = sve_id_visibility }, + { SYS_DESC(SYS_ID_AA64ZFR0_EL1), access_id_aa64zfr0_el1, .get_user = get_id_aa64zfr0_el1, .set_user = set_id_aa64zfr0_el1, }, ID_UNALLOCATED(4,5), ID_UNALLOCATED(4,6), ID_UNALLOCATED(4,7),
From: Samuel Thibault samuel.thibault@ens-lyon.org
commit d7012df3c9aecdcfb50f7a2ebad766952fd1410e upstream.
commit d97a9d7aea04 ("staging/speakup: Add inflection synth parameter") introduced a new "inflection" speakup parameter next to "pitch", but the values of the var_id_t enum are actually used by the keymap tables so we must not renumber them. The effect was that notably the volume control shortcut (speakup-1 or 2) was actually changing the inflection.
This moves the INFLECTION value at the end of the var_id_t enum to fix back the enum values. This also adds a warning about it.
Fixes: d97a9d7aea04 ("staging/speakup: Add inflection synth parameter") Cc: stable@vger.kernel.org Reported-by: Kirk Reiser kirk@reisers.ca Reported-by: Gregory Nowak greg@gregn.net Tested-by: Gregory Nowak greg@gregn.net Signed-off-by: Samuel Thibault samuel.thibault@ens-lyon.org Link: https://lore.kernel.org/r/20201012160646.qmdo4eqtj24hpch4@function Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/accessibility/speakup/spk_types.h | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/accessibility/speakup/spk_types.h +++ b/drivers/accessibility/speakup/spk_types.h @@ -32,6 +32,10 @@ enum { E_NEW_DEFAULT, };
+/* + * Note: add new members at the end, speakupmap.h depends on the values of the + * enum starting from SPELL_DELAY (see inc_dec_var) + */ enum var_id_t { VERSION = 0, SYNTH, SILENT, SYNTH_DIRECT, KEYMAP, CHARS, @@ -42,9 +46,9 @@ enum var_id_t { SAY_CONTROL, SAY_WORD_CTL, NO_INTERRUPT, KEY_ECHO, SPELL_DELAY, PUNC_LEVEL, READING_PUNC, ATTRIB_BLEEP, BLEEPS, - RATE, PITCH, INFLECTION, VOL, TONE, PUNCT, VOICE, FREQUENCY, LANG, + RATE, PITCH, VOL, TONE, PUNCT, VOICE, FREQUENCY, LANG, DIRECT, PAUSE, - CAPS_START, CAPS_STOP, CHARTAB, + CAPS_START, CAPS_STOP, CHARTAB, INFLECTION, MAXVARS };
From: Samuel Thibault samuel.thibault@ens-lyon.org
commit 3ed1cfb2cee4355ddef49489897bfe474daeeaec upstream.
With the ltlk and spkout drivers, the index read function, i.e. in_nowait, is getting called from the read_all_doc mechanism, from the timer softirq:
Call Trace: <IRQ> dump_stack+0x71/0x98 dequeue_task_idle+0x1f/0x28 __schedule+0x167/0x5d6 ? trace_hardirqs_on+0x2e/0x3a ? usleep_range+0x7f/0x7f schedule+0x8a/0xae schedule_timeout+0xb1/0xea ? del_timer_sync+0x31/0x31 do_wait_for_common+0xba/0x12b ? wake_up_q+0x45/0x45 wait_for_common+0x37/0x50 ttyio_in+0x2a/0x6b spk_ttyio_in_nowait+0xc/0x13 spk_get_index_count+0x20/0x93 cursor_done+0x1c6/0x4c6 ? read_all_doc+0xb1/0xb1 call_timer_fn+0x89/0x140 run_timer_softirq+0x164/0x1a5 ? read_all_doc+0xb1/0xb1 ? hrtimer_forward+0x7b/0x87 ? timerqueue_add+0x62/0x68 ? enqueue_hrtimer+0x95/0x9f __do_softirq+0x181/0x31f irq_exit+0x6a/0x86 smp_apic_timer_interrupt+0x15e/0x183 apic_timer_interrupt+0xf/0x20 </IRQ>
We thus should not schedule() at all, even with timeout == 0, this crashes the kernel. We can however use try_wait_for_completion() instead of wait_for_completion_timeout(0).
Cc: stable@vger.kernel.org Reported-by: John Covici covici@ccs.covici.com Tested-by: John Covici covici@ccs.covici.com Signed-off-by: Samuel Thibault samuel.thibault@ens-lyon.org Link: https://lore.kernel.org/r/20201108131233.tadycr73sxlvodgo@function Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/accessibility/speakup/spk_ttyio.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
--- a/drivers/accessibility/speakup/spk_ttyio.c +++ b/drivers/accessibility/speakup/spk_ttyio.c @@ -298,11 +298,13 @@ static unsigned char ttyio_in(int timeou struct spk_ldisc_data *ldisc_data = speakup_tty->disc_data; char rv;
- if (wait_for_completion_timeout(&ldisc_data->completion, + if (!timeout) { + if (!try_wait_for_completion(&ldisc_data->completion)) + return 0xff; + } else if (wait_for_completion_timeout(&ldisc_data->completion, usecs_to_jiffies(timeout)) == 0) { - if (timeout) - pr_warn("spk_ttyio: timeout (%d) while waiting for input\n", - timeout); + pr_warn("spk_ttyio: timeout (%d) while waiting for input\n", + timeout); return 0xff; }
From: Samuel Thibault samuel.thibault@ens-lyon.org
commit 640969a69ca4dd2ac025fe873c6bf25eba8f11b3 upstream.
speakup_cut() calls speakup_clear_selection() which calls console_lock. Problem is: speakup_cut() is called from a keyboard interrupt context. This would hang if speakup_cut is pressed while the console lock is unfortunately already held.
We can however as well just defer calling clear_selection() until the already-deferred set_selection_kernel() call.
This was spotted by the lock hardener:
Possible unsafe locking scenario:\x0a CPU0 ---- lock(console_lock); <Interrupt> lock(console_lock); \x0a *** DEADLOCK ***\x0a [...] Call Trace: <IRQ> dump_stack+0xc2/0x11a print_usage_bug.cold+0x3e0/0x4b1 mark_lock+0xd95/0x1390 ? print_irq_inversion_bug+0xa0/0xa0 __lock_acquire+0x21eb/0x5730 ? __kasan_check_read+0x11/0x20 ? check_chain_key+0x215/0x5e0 ? register_lock_class+0x1580/0x1580 ? lock_downgrade+0x7a0/0x7a0 ? __rwlock_init+0x140/0x140 lock_acquire+0x13f/0x370 ? speakup_clear_selection+0xe/0x20 [speakup] console_lock+0x33/0x50 ? speakup_clear_selection+0xe/0x20 [speakup] speakup_clear_selection+0xe/0x20 [speakup] speakup_cut+0x19e/0x4b0 [speakup] keyboard_notifier_call+0x1f04/0x4a40 [speakup] ? read_all_doc+0x240/0x240 [speakup] notifier_call_chain+0xbf/0x130 __atomic_notifier_call_chain+0x80/0x130 atomic_notifier_call_chain+0x16/0x20 kbd_event+0x7d7/0x3b20 ? k_pad+0x850/0x850 ? sysrq_filter+0x450/0xd40 input_to_handler+0x362/0x4b0 ? rcu_read_lock_sched_held+0xe0/0xe0 input_pass_values+0x408/0x5a0 ? __rwlock_init+0x140/0x140 ? lock_acquire+0x13f/0x370 input_handle_event+0x70e/0x1380 input_event+0x67/0x90 atkbd_interrupt+0xe62/0x1d4e [atkbd] ? __kasan_check_write+0x14/0x20 ? atkbd_event_work+0x130/0x130 [atkbd] ? _raw_spin_lock_irqsave+0x26/0x70 serio_interrupt+0x93/0x120 [serio] i8042_interrupt+0x232/0x510 [i8042] ? rcu_read_lock_bh_held+0xd0/0xd0 ? handle_irq_event+0xa5/0x13a ? i8042_remove+0x1f0/0x1f0 [i8042] __handle_irq_event_percpu+0xe6/0x6c0 handle_irq_event_percpu+0x71/0x150 ? __handle_irq_event_percpu+0x6c0/0x6c0 ? __kasan_check_read+0x11/0x20 ? do_raw_spin_unlock+0x5c/0x240 handle_irq_event+0xad/0x13a handle_edge_irq+0x233/0xa90 do_IRQ+0x10b/0x310 common_interrupt+0xf/0xf </IRQ>
Cc: stable@vger.kernel.org Reported-by: Jookia contact@jookia.org Signed-off-by: Samuel Thibault samuel.thibault@ens-lyon.org Link: https://lore.kernel.org/r/20201107233310.7iisvaozpiqj3yvy@function Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/accessibility/speakup/main.c | 1 - drivers/accessibility/speakup/selection.c | 11 ++++------- drivers/accessibility/speakup/speakup.h | 1 - 3 files changed, 4 insertions(+), 9 deletions(-)
--- a/drivers/accessibility/speakup/main.c +++ b/drivers/accessibility/speakup/main.c @@ -357,7 +357,6 @@ static void speakup_cut(struct vc_data * mark_cut_flag = 0; synth_printf("%s\n", spk_msg_get(MSG_CUT));
- speakup_clear_selection(); ret = speakup_set_selection(tty);
switch (ret) { --- a/drivers/accessibility/speakup/selection.c +++ b/drivers/accessibility/speakup/selection.c @@ -22,13 +22,6 @@ struct speakup_selection_work { struct tty_struct *tty; };
-void speakup_clear_selection(void) -{ - console_lock(); - clear_selection(); - console_unlock(); -} - static void __speakup_set_selection(struct work_struct *work) { struct speakup_selection_work *ssw = @@ -51,6 +44,10 @@ static void __speakup_set_selection(stru goto unref; }
+ console_lock(); + clear_selection(); + console_unlock(); + set_selection_kernel(&sel, tty);
unref: --- a/drivers/accessibility/speakup/speakup.h +++ b/drivers/accessibility/speakup/speakup.h @@ -70,7 +70,6 @@ void spk_do_flush(void); void speakup_start_ttys(void); void synth_buffer_add(u16 ch); void synth_buffer_clear(void); -void speakup_clear_selection(void); int speakup_set_selection(struct tty_struct *tty); void speakup_cancel_selection(void); int speakup_paste_selection(struct tty_struct *tty);
From: Mika Westerberg mika.westerberg@linux.intel.com
commit a663e0df4a374b8537562a44d1cecafb472cd65b upstream.
The svc->key field is not released as it should be if ida_simple_get() fails so fix that.
Fixes: 9aabb68568b4 ("thunderbolt: Fix to check return value of ida_simple_get") Cc: stable@vger.kernel.org Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/thunderbolt/xdomain.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/thunderbolt/xdomain.c +++ b/drivers/thunderbolt/xdomain.c @@ -881,6 +881,7 @@ static void enumerate_services(struct tb
id = ida_simple_get(&xd->service_ids, 0, 0, GFP_KERNEL); if (id < 0) { + kfree(svc->key); kfree(svc); break; }
From: Jing Xiangfeng jingxiangfeng@huawei.com
commit 7342ca34d931a357d408aaa25fadd031e46af137 upstream.
ring_request_msix() misses to call ida_simple_remove() in an error path. Add a label 'err_ida_remove' and jump to it.
Fixes: 046bee1f9ab8 ("thunderbolt: Add MSI-X support") Cc: stable@vger.kernel.org Signed-off-by: Jing Xiangfeng jingxiangfeng@huawei.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/thunderbolt/nhi.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-)
--- a/drivers/thunderbolt/nhi.c +++ b/drivers/thunderbolt/nhi.c @@ -405,12 +405,23 @@ static int ring_request_msix(struct tb_r
ring->vector = ret;
- ring->irq = pci_irq_vector(ring->nhi->pdev, ring->vector); - if (ring->irq < 0) - return ring->irq; + ret = pci_irq_vector(ring->nhi->pdev, ring->vector); + if (ret < 0) + goto err_ida_remove; + + ring->irq = ret;
irqflags = no_suspend ? IRQF_NO_SUSPEND : 0; - return request_irq(ring->irq, ring_msix, irqflags, "thunderbolt", ring); + ret = request_irq(ring->irq, ring_msix, irqflags, "thunderbolt", ring); + if (ret) + goto err_ida_remove; + + return 0; + +err_ida_remove: + ida_simple_remove(&nhi->msix_ida, ring->vector); + + return ret; }
static void ring_release_msix(struct tb_ring *ring)
From: Christoph Hellwig hch@lst.de
commit 7e890c37c25c7cbca37ff0ab292873d8146e713b upstream.
Return if the function ended up sending an uevent or not.
Cc: stable@vger.kernel.org # v5.9 Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: Petr Vorel pvorel@suse.cz Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- block/genhd.c | 5 ++++- include/linux/genhd.h | 2 +- 2 files changed, 5 insertions(+), 2 deletions(-)
--- a/block/genhd.c +++ b/block/genhd.c @@ -49,7 +49,7 @@ static void disk_release_events(struct g * Set disk capacity and notify if the size is not currently * zero and will not be set to zero */ -void set_capacity_revalidate_and_notify(struct gendisk *disk, sector_t size, +bool set_capacity_revalidate_and_notify(struct gendisk *disk, sector_t size, bool revalidate) { sector_t capacity = get_capacity(disk); @@ -63,7 +63,10 @@ void set_capacity_revalidate_and_notify( char *envp[] = { "RESIZE=1", NULL };
kobject_uevent_env(&disk_to_dev(disk)->kobj, KOBJ_CHANGE, envp); + return true; } + + return false; }
EXPORT_SYMBOL_GPL(set_capacity_revalidate_and_notify); --- a/include/linux/genhd.h +++ b/include/linux/genhd.h @@ -315,7 +315,7 @@ static inline int get_disk_ro(struct gen extern void disk_block_events(struct gendisk *disk); extern void disk_unblock_events(struct gendisk *disk); extern void disk_flush_events(struct gendisk *disk, unsigned int mask); -extern void set_capacity_revalidate_and_notify(struct gendisk *disk, +extern bool set_capacity_revalidate_and_notify(struct gendisk *disk, sector_t size, bool revalidate); extern unsigned int disk_clear_events(struct gendisk *disk, unsigned int mask);
From: Petr Vorel pvorel@suse.cz
commit c01a21b77722db0474bbcc4eafc8c4e0d8fed6d8 upstream.
Commit 716ad0986cbd ("loop: Switch to set_capacity_revalidate_and_notify") causes an occasional drop of loop device uevent, which are no longer triggered in loop_set_size() but in a different part of code.
Bug is reproducible with LTP test uevent01 [1]:
i=0; while true; do i=$((i+1)); echo "== $i ==" lsmod |grep -q loop && rmmod -f loop ./uevent01 || break done
Put back triggering through code called in loop_set_size().
Fix required to add yet another parameter to set_capacity_revalidate_and_notify().
[1] https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/ueven...
[hch: rebased on a different change to the prototype of set_capacity_revalidate_and_notify]
Cc: stable@vger.kernel.org # v5.9 Fixes: 716ad0986cbd ("loop: Switch to set_capacity_revalidate_and_notify") Reported-by: ltp@lists.linux.it Signed-off-by: Petr Vorel pvorel@suse.cz Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/loop.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -255,7 +255,8 @@ static void loop_set_size(struct loop_de
bd_set_size(bdev, size << SECTOR_SHIFT);
- set_capacity_revalidate_and_notify(lo->lo_disk, size, false); + if (!set_capacity_revalidate_and_notify(lo->lo_disk, size, false)) + kobject_uevent(&disk_to_dev(bdev->bd_disk)->kobj, KOBJ_CHANGE); }
static inline int
From: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com
commit 092561f06702dd4fdd7fb74dd3a838f1818529b7 upstream.
Commit 8fd0e2a6df26 ("uio: free uio id after uio file node is freed") triggered KASAN use-after-free failure at deletion of TCM-user backstores [1].
In uio_unregister_device(), struct uio_device *idev is passed to uio_free_minor() to refer idev->minor. However, before uio_free_minor() call, idev is already freed by uio_device_release() during call to device_unregister().
To avoid reference to idev->minor after idev free, keep idev->minor value in a local variable. Also modify uio_free_minor() argument to receive the value.
[1] BUG: KASAN: use-after-free in uio_unregister_device+0x166/0x190 Read of size 4 at addr ffff888105196508 by task targetcli/49158
CPU: 3 PID: 49158 Comm: targetcli Not tainted 5.10.0-rc1 #1 Hardware name: Supermicro Super Server/X10SRL-F, BIOS 2.0 12/17/2015 Call Trace: dump_stack+0xae/0xe5 ? uio_unregister_device+0x166/0x190 print_address_description.constprop.0+0x1c/0x210 ? uio_unregister_device+0x166/0x190 ? uio_unregister_device+0x166/0x190 kasan_report.cold+0x37/0x7c ? kobject_put+0x80/0x410 ? uio_unregister_device+0x166/0x190 uio_unregister_device+0x166/0x190 tcmu_destroy_device+0x1c4/0x280 [target_core_user] ? tcmu_release+0x90/0x90 [target_core_user] ? __mutex_unlock_slowpath+0xd6/0x5d0 target_free_device+0xf3/0x2e0 [target_core_mod] config_item_cleanup+0xea/0x210 configfs_rmdir+0x651/0x860 ? detach_groups.isra.0+0x380/0x380 vfs_rmdir.part.0+0xec/0x3a0 ? __lookup_hash+0x20/0x150 do_rmdir+0x252/0x320 ? do_file_open_root+0x420/0x420 ? strncpy_from_user+0xbc/0x2f0 ? getname_flags.part.0+0x8e/0x450 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f9e2bfc91fb Code: 73 01 c3 48 8b 0d 9d ec 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 54 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 6d ec 0c 00 f7 d8 64 89 01 48 RSP: 002b:00007ffdd2baafe8 EFLAGS: 00000246 ORIG_RAX: 0000000000000054 RAX: ffffffffffffffda RBX: 00007f9e2beb44a0 RCX: 00007f9e2bfc91fb RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007f9e1c20be90 RBP: 00007ffdd2bab000 R08: 0000000000000000 R09: 00007f9e2bdf2440 R10: 00007ffdd2baaf37 R11: 0000000000000246 R12: 00000000ffffff9c R13: 000055f9abb7e390 R14: 000055f9abcf9558 R15: 00007f9e2be7a780
Allocated by task 34735: kasan_save_stack+0x1b/0x40 __kasan_kmalloc.constprop.0+0xc2/0xd0 __uio_register_device+0xeb/0xd40 tcmu_configure_device+0x5a0/0xbc0 [target_core_user] target_configure_device+0x12f/0x760 [target_core_mod] target_dev_enable_store+0x32/0x50 [target_core_mod] configfs_write_file+0x2bb/0x450 vfs_write+0x1ce/0x610 ksys_write+0xe9/0x1b0 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9
Freed by task 49158: kasan_save_stack+0x1b/0x40 kasan_set_track+0x1c/0x30 kasan_set_free_info+0x1b/0x30 __kasan_slab_free+0x110/0x150 slab_free_freelist_hook+0x5a/0x170 kfree+0xc6/0x560 device_release+0x9b/0x210 kobject_put+0x13e/0x410 uio_unregister_device+0xf9/0x190 tcmu_destroy_device+0x1c4/0x280 [target_core_user] target_free_device+0xf3/0x2e0 [target_core_mod] config_item_cleanup+0xea/0x210 configfs_rmdir+0x651/0x860 vfs_rmdir.part.0+0xec/0x3a0 do_rmdir+0x252/0x320 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9
The buggy address belongs to the object at ffff888105196000 which belongs to the cache kmalloc-2k of size 2048 The buggy address is located 1288 bytes inside of 2048-byte region [ffff888105196000, ffff888105196800) The buggy address belongs to the page: page:0000000098e6ca81 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x105190 head:0000000098e6ca81 order:3 compound_mapcount:0 compound_pincount:0 flags: 0x17ffffc0010200(slab|head) raw: 0017ffffc0010200 dead000000000100 dead000000000122 ffff888100043040 raw: 0000000000000000 0000000000080008 00000001ffffffff ffff88810eb55c01 page dumped because: kasan: bad access detected page->mem_cgroup:ffff88810eb55c01
Memory state around the buggy address: ffff888105196400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888105196480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff888105196500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^ ffff888105196580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888105196600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Fixes: 8fd0e2a6df26 ("uio: free uio id after uio file node is freed") Cc: stable stable@vger.kernel.org Signed-off-by: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com Link: https://lore.kernel.org/r/20201102122819.2346270-1-shinichiro.kawasaki@wdc.c... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/uio/uio.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
--- a/drivers/uio/uio.c +++ b/drivers/uio/uio.c @@ -413,10 +413,10 @@ static int uio_get_minor(struct uio_devi return retval; }
-static void uio_free_minor(struct uio_device *idev) +static void uio_free_minor(unsigned long minor) { mutex_lock(&minor_lock); - idr_remove(&uio_idr, idev->minor); + idr_remove(&uio_idr, minor); mutex_unlock(&minor_lock); }
@@ -990,7 +990,7 @@ err_request_irq: err_uio_dev_add_attributes: device_del(&idev->dev); err_device_create: - uio_free_minor(idev); + uio_free_minor(idev->minor); put_device(&idev->dev); return ret; } @@ -1042,11 +1042,13 @@ EXPORT_SYMBOL_GPL(__devm_uio_register_de void uio_unregister_device(struct uio_info *info) { struct uio_device *idev; + unsigned long minor;
if (!info || !info->uio_dev) return;
idev = info->uio_dev; + minor = idev->minor;
mutex_lock(&idev->info_lock); uio_dev_del_attributes(idev); @@ -1062,7 +1064,7 @@ void uio_unregister_device(struct uio_in
device_unregister(&idev->dev);
- uio_free_minor(idev); + uio_free_minor(minor);
return; }
From: Geert Uytterhoeven geert+renesas@glider.be
commit ffa13d2d94029882eca22a565551783787f121e5 upstream.
This reverts commit 2d30e408a2a6b3443d3232593e3d472584a3e9f8.
On Beaglebone Black, where each interface has 2 children:
musb-dsps 47401c00.usb: can't request region for resource [mem 0x47401800-0x474019ff] musb-hdrc musb-hdrc.1: musb_init_controller failed with status -16 musb-hdrc: probe of musb-hdrc.1 failed with error -16 musb-dsps 47401400.usb: can't request region for resource [mem 0x47401000-0x474011ff] musb-hdrc musb-hdrc.0: musb_init_controller failed with status -16 musb-hdrc: probe of musb-hdrc.0 failed with error -16
Before, devm_ioremap_resource() was called on "dev" ("musb-hdrc.0" or "musb-hdrc.1"), after it is called on "&pdev->dev" ("47401400.usb" or "47401c00.usb"), leading to a duplicate region request, which fails.
Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Fixes: 2d30e408a2a6 ("usb: musb: convert to devm_platform_ioremap_resource_byname") Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/20201112135900.3822599-1-geert+renesas@glider.be Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/musb/musb_dsps.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/usb/musb/musb_dsps.c +++ b/drivers/usb/musb/musb_dsps.c @@ -429,10 +429,12 @@ static int dsps_musb_init(struct musb *m struct platform_device *parent = to_platform_device(dev->parent); const struct dsps_musb_wrapper *wrp = glue->wrp; void __iomem *reg_base; + struct resource *r; u32 rev, val; int ret;
- reg_base = devm_platform_ioremap_resource_byname(parent, "control"); + r = platform_get_resource_byname(parent, IORESOURCE_MEM, "control"); + reg_base = devm_ioremap_resource(dev, r); if (IS_ERR(reg_base)) return PTR_ERR(reg_base); musb->ctrl_base = reg_base;
From: Chris Brandt chris.brandt@renesas.com
commit 6d853c9e4104b4fc8d55dc9cd3b99712aa347174 upstream.
Renesas R-Car and RZ/G SoCs have a firmware download mode over USB. However, on reset a banner string is transmitted out which is not expected to be echoed back and will corrupt the protocol.
Cc: stable stable@vger.kernel.org Acked-by: Oliver Neukum oneukum@suse.com Signed-off-by: Chris Brandt chris.brandt@renesas.com Link: https://lore.kernel.org/r/20201111131209.3977903-1-chris.brandt@renesas.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/class/cdc-acm.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/drivers/usb/class/cdc-acm.c +++ b/drivers/usb/class/cdc-acm.c @@ -1706,6 +1706,15 @@ static const struct usb_device_id acm_id { USB_DEVICE(0x0870, 0x0001), /* Metricom GS Modem */ .driver_info = NO_UNION_NORMAL, /* has no union descriptor */ }, + { USB_DEVICE(0x045b, 0x023c), /* Renesas USB Download mode */ + .driver_info = DISABLE_ECHO, /* Don't echo banner */ + }, + { USB_DEVICE(0x045b, 0x0248), /* Renesas USB Download mode */ + .driver_info = DISABLE_ECHO, /* Don't echo banner */ + }, + { USB_DEVICE(0x045b, 0x024D), /* Renesas USB Download mode */ + .driver_info = DISABLE_ECHO, /* Don't echo banner */ + }, { USB_DEVICE(0x0e8d, 0x0003), /* FIREFLY, MediaTek Inc; andrey.arapov@gmail.com */ .driver_info = NO_UNION_NORMAL, /* has no union descriptor */ },
From: Heikki Krogerus heikki.krogerus@linux.intel.com
commit 0e6371fbfba3a4f76489e6e97c1c7f8386ad5fd2 upstream.
When the ucsi power supply goes online/offline, and when the power levels change, the power supply class needs to be notified so it can inform the user space.
Fixes: 992a60ed0d5e ("usb: typec: ucsi: register with power_supply class") Cc: stable@vger.kernel.org Reported-and-tested-by: Vladimir Yerilov openmindead@gmail.com Signed-off-by: Heikki Krogerus heikki.krogerus@linux.intel.com Link: https://lore.kernel.org/r/20201110120547.67922-1-heikki.krogerus@linux.intel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/typec/ucsi/psy.c | 9 +++++++++ drivers/usb/typec/ucsi/ucsi.c | 7 ++++++- drivers/usb/typec/ucsi/ucsi.h | 2 ++ 3 files changed, 17 insertions(+), 1 deletion(-)
--- a/drivers/usb/typec/ucsi/psy.c +++ b/drivers/usb/typec/ucsi/psy.c @@ -238,4 +238,13 @@ void ucsi_unregister_port_psy(struct ucs return;
power_supply_unregister(con->psy); + con->psy = NULL; +} + +void ucsi_port_psy_changed(struct ucsi_connector *con) +{ + if (IS_ERR_OR_NULL(con->psy)) + return; + + power_supply_changed(con->psy); } --- a/drivers/usb/typec/ucsi/ucsi.c +++ b/drivers/usb/typec/ucsi/ucsi.c @@ -643,8 +643,10 @@ static void ucsi_handle_connector_change role = !!(con->status.flags & UCSI_CONSTAT_PWR_DIR);
if (con->status.change & UCSI_CONSTAT_POWER_OPMODE_CHANGE || - con->status.change & UCSI_CONSTAT_POWER_LEVEL_CHANGE) + con->status.change & UCSI_CONSTAT_POWER_LEVEL_CHANGE) { ucsi_pwr_opmode_change(con); + ucsi_port_psy_changed(con); + }
if (con->status.change & UCSI_CONSTAT_POWER_DIR_CHANGE) { typec_set_pwr_role(con->port, role); @@ -674,6 +676,8 @@ static void ucsi_handle_connector_change ucsi_register_partner(con); else ucsi_unregister_partner(con); + + ucsi_port_psy_changed(con); }
if (con->status.change & UCSI_CONSTAT_CAM_CHANGE) { @@ -994,6 +998,7 @@ static int ucsi_register_port(struct ucs !!(con->status.flags & UCSI_CONSTAT_PWR_DIR)); ucsi_pwr_opmode_change(con); ucsi_register_partner(con); + ucsi_port_psy_changed(con); }
if (con->partner) { --- a/drivers/usb/typec/ucsi/ucsi.h +++ b/drivers/usb/typec/ucsi/ucsi.h @@ -340,9 +340,11 @@ int ucsi_resume(struct ucsi *ucsi); #if IS_ENABLED(CONFIG_POWER_SUPPLY) int ucsi_register_port_psy(struct ucsi_connector *con); void ucsi_unregister_port_psy(struct ucsi_connector *con); +void ucsi_port_psy_changed(struct ucsi_connector *con); #else static inline int ucsi_register_port_psy(struct ucsi_connector *con) { return 0; } static inline void ucsi_unregister_port_psy(struct ucsi_connector *con) { } +static inline void ucsi_port_psy_changed(struct ucsi_connector *con) { } #endif /* CONFIG_POWER_SUPPLY */
#if IS_ENABLED(CONFIG_TYPEC_DP_ALTMODE)
From: Zhang Qilong zhangqilong3@huawei.com
commit 76255470ffa2795a44032e8b3c1ced11d81aa2db upstream.
pm_runtime_get_sync() will increment pm usage at first and it will resume the device later. We should decrease the usage count whetever it succeeded or failed(maybe runtime of the device has error, or device is in inaccessible state, or other error state). If we do not call put operation to decrease the reference, it will result in reference leak in xhci_histb_probe. Moreover, this device cannot enter the idle state and always stay busy or other non-idle state later. So we fixed it by jumping to error handling branch.
Fixes: c508f41da0788 ("xhci: hisilicon: support HiSilicon STB xHCI host controller") Signed-off-by: Zhang Qilong zhangqilong3@huawei.com Link: https://lore.kernel.org/r/20201106122221.2304528-1-zhangqilong3@huawei.com Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/host/xhci-histb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/usb/host/xhci-histb.c +++ b/drivers/usb/host/xhci-histb.c @@ -240,7 +240,7 @@ static int xhci_histb_probe(struct platf /* Initialize dma_mask and coherent_dma_mask to 32-bits */ ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32)); if (ret) - return ret; + goto disable_pm;
hcd = usb_create_hcd(driver, dev, dev_name(dev)); if (!hcd) {
From: Alexander Lobakin alobakin@pm.me
commit 9d516aa82b7d4fbe7f6303348697960ba03a530b upstream.
Since commit 086d08725d34 ("remoteproc: create vdev subdevice with specific dma memory pool"), every remoteproc has a DMA subdevice ("remoteprocX#vdevYbuffer") for each virtio device, which inherits DMA capabilities from the corresponding platform device. This allowed to associate different DMA pools with each vdev, and required from virtio drivers to perform DMA operations with the parent device (vdev->dev.parent) instead of grandparent (vdev->dev.parent->parent).
virtio_rpmsg_bus was already changed in the same merge cycle with commit d999b622fcfb ("rpmsg: virtio: allocate buffer from parent"), but virtio_console did not. In fact, operations using the grandparent worked fine while the grandparent was the platform device, but since commit c774ad010873 ("remoteproc: Fix and restore the parenting hierarchy for vdev") this was changed, and now the grandparent device is the remoteproc device without any DMA capabilities. So, starting v5.8-rc1 the following warning is observed:
[ 2.483925] ------------[ cut here ]------------ [ 2.489148] WARNING: CPU: 3 PID: 101 at kernel/dma/mapping.c:427 0x80e7eee8 [ 2.489152] Modules linked in: virtio_console(+) [ 2.503737] virtio_rpmsg_bus rpmsg_core [ 2.508903] [ 2.528898] <Other modules, stack and call trace here> [ 2.913043] [ 2.914907] ---[ end trace 93ac8746beab612c ]--- [ 2.920102] virtio-ports vport1p0: Error allocating inbufs
kernel/dma/mapping.c:427 is:
WARN_ON_ONCE(!dev->coherent_dma_mask);
obviously because the grandparent now is remoteproc dev without any DMA caps:
[ 3.104943] Parent: remoteproc0#vdev1buffer, grandparent: remoteproc0
Fix this the same way as it was for virtio_rpmsg_bus, using just the parent device (vdev->dev.parent, "remoteprocX#vdevYbuffer") for DMA operations. This also allows now to reserve DMA pools/buffers for rproc serial via Device Tree.
Fixes: c774ad010873 ("remoteproc: Fix and restore the parenting hierarchy for vdev") Cc: stable@vger.kernel.org # 5.1+ Reviewed-by: Mathieu Poirier mathieu.poirier@linaro.org Acked-by: Jason Wang jasowang@redhat.com Signed-off-by: Alexander Lobakin alobakin@pm.me Date: Thu, 5 Nov 2020 11:10:24 +0800 Link: https://lore.kernel.org/r/AOKowLclCbOCKxyiJ71WeNyuAAj2q8EUtxrXbyky5E@cp7-web... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/char/virtio_console.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/char/virtio_console.c +++ b/drivers/char/virtio_console.c @@ -435,12 +435,12 @@ static struct port_buffer *alloc_buf(str /* * Allocate DMA memory from ancestor. When a virtio * device is created by remoteproc, the DMA memory is - * associated with the grandparent device: - * vdev => rproc => platform-dev. + * associated with the parent device: + * virtioY => remoteprocX#vdevYbuffer. */ - if (!vdev->dev.parent || !vdev->dev.parent->parent) + buf->dev = vdev->dev.parent; + if (!buf->dev) goto free_buf; - buf->dev = vdev->dev.parent->parent;
/* Increase device refcnt to avoid freeing it */ get_device(buf->dev);
From: Alexander Usyskin alexander.usyskin@intel.com
commit bcbc0b2e275f0a797de11a10eff495b4571863fc upstream.
A receive callback is queued while the client is still connected but can still be called after the client was disconnected. Upon disconnect cl->me_cl is set to NULL, hence we need to check that ME client is not-NULL in mei_cl_mtu to avoid null dereference.
Cc: stable@vger.kernel.org Signed-off-by: Alexander Usyskin alexander.usyskin@intel.com Signed-off-by: Tomas Winkler tomas.winkler@intel.com Link: https://lore.kernel.org/r/20201029095444.957924-2-tomas.winkler@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/misc/mei/client.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/misc/mei/client.h +++ b/drivers/misc/mei/client.h @@ -164,11 +164,11 @@ static inline u8 mei_cl_me_id(const stru * * @cl: host client * - * Return: mtu + * Return: mtu or 0 if client is not connected */ static inline size_t mei_cl_mtu(const struct mei_cl *cl) { - return cl->me_cl->props.max_msg_length; + return cl->me_cl ? cl->me_cl->props.max_msg_length : 0; }
/**
From: Dan Carpenter dan.carpenter@oracle.com
commit 1e106aa3509b86738769775969822ffc1ec21bf4 upstream.
The exit_pi_state_list() function calls put_pi_state() with IRQs disabled and is not expecting that IRQs will be enabled inside the function.
Use the _irqsave() variant so that IRQs are restored to the original state instead of being enabled unconditionally.
Fixes: 153fbd1226fb ("futex: Fix more put_pi_state() vs. exit_pi_state_list() races") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20201106085205.GA1159983@mwanda Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/futex.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/kernel/futex.c +++ b/kernel/futex.c @@ -788,8 +788,9 @@ static void put_pi_state(struct futex_pi */ if (pi_state->owner) { struct task_struct *owner; + unsigned long flags;
- raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock); + raw_spin_lock_irqsave(&pi_state->pi_mutex.wait_lock, flags); owner = pi_state->owner; if (owner) { raw_spin_lock(&owner->pi_lock); @@ -797,7 +798,7 @@ static void put_pi_state(struct futex_pi raw_spin_unlock(&owner->pi_lock); } rt_mutex_proxy_unlock(&pi_state->pi_mutex, owner); - raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock); + raw_spin_unlock_irqrestore(&pi_state->pi_mutex.wait_lock, flags); }
if (current->pi_state_cache) {
From: Theodore Ts'o tytso@mit.edu
commit 05d5233df85e9621597c5838e95235107eb624a2 upstream.
Add missing __acquires() and __releases() annotations. Also, in an "this should never happen" WARN_ON check, if it *does* actually happen, we need to release j_state_lock since this function is always supposed to release that lock. Otherwise, things will quickly grind to a halt after the WARN_ON trips.
Fixes: 96f1e0974575 ("jbd2: avoid long hold times of j_state_lock...") Cc: stable@kernel.org Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/jbd2/checkpoint.c | 2 ++ fs/jbd2/transaction.c | 4 +++- 2 files changed, 5 insertions(+), 1 deletion(-)
--- a/fs/jbd2/checkpoint.c +++ b/fs/jbd2/checkpoint.c @@ -106,6 +106,8 @@ static int __try_to_free_cp_buf(struct j * for a checkpoint to free up some space in the log. */ void __jbd2_log_wait_for_space(journal_t *journal) +__acquires(&journal->j_state_lock) +__releases(&journal->j_state_lock) { int nblocks, space_left; /* assert_spin_locked(&journal->j_state_lock); */ --- a/fs/jbd2/transaction.c +++ b/fs/jbd2/transaction.c @@ -195,8 +195,10 @@ static void wait_transaction_switching(j DEFINE_WAIT(wait);
if (WARN_ON(!journal->j_running_transaction || - journal->j_running_transaction->t_state != T_SWITCH)) + journal->j_running_transaction->t_state != T_SWITCH)) { + read_unlock(&journal->j_state_lock); return; + } prepare_to_wait(&journal->j_wait_transaction_locked, &wait, TASK_UNINTERRUPTIBLE); read_unlock(&journal->j_state_lock);
From: Masami Hiramatsu mhiramat@kernel.org
commit 50b8a742850fce7293bed45753152c425f7e931b upstream.
Since Grub may align the size of initrd to 4 if user pass initrd from cpio, we have to check the preceding 3 bytes as well.
Link: https://lkml.kernel.org/r/160520205132.303174.4876760192433315429.stgit@devn...
Cc: stable@vger.kernel.org Fixes: 85c46b78da58 ("bootconfig: Add bootconfig magic word for indicating bootconfig explicitly") Reported-by: Chen Yu yu.chen.surf@gmail.com Tested-by: Chen Yu yu.chen.surf@gmail.com Signed-off-by: Masami Hiramatsu mhiramat@kernel.org Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- init/main.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)
--- a/init/main.c +++ b/init/main.c @@ -267,14 +267,24 @@ static void * __init get_boot_config_fro u32 size, csum; char *data; u32 *hdr; + int i;
if (!initrd_end) return NULL;
data = (char *)initrd_end - BOOTCONFIG_MAGIC_LEN; - if (memcmp(data, BOOTCONFIG_MAGIC, BOOTCONFIG_MAGIC_LEN)) - return NULL; + /* + * Since Grub may align the size of initrd to 4, we must + * check the preceding 3 bytes as well. + */ + for (i = 0; i < 4; i++) { + if (!memcmp(data, BOOTCONFIG_MAGIC, BOOTCONFIG_MAGIC_LEN)) + goto found; + data--; + } + return NULL;
+found: hdr = (u32 *)(data - 8); size = hdr[0]; csum = hdr[1];
From: Zi Yan ziy@nvidia.com
commit 38935861d85a4d9a353d1dd5a156c97700e2765d upstream.
In isolate_migratepages_block, when cc->alloc_contig is true, we are able to isolate compound pages. But nr_migratepages and nr_isolated did not count compound pages correctly, causing us to isolate more pages than we thought.
So count compound pages as the number of base pages they contain. Otherwise, we might be trapped in too_many_isolated while loop, since the actual isolated pages can go up to COMPACT_CLUSTER_MAX*512=16384, where COMPACT_CLUSTER_MAX is 32, since we stop isolation after cc->nr_migratepages reaches to COMPACT_CLUSTER_MAX.
In addition, after we fix the issue above, cc->nr_migratepages could never be equal to COMPACT_CLUSTER_MAX if compound pages are isolated, thus page isolation could not stop as we intended. Change the isolation stop condition to '>='.
The issue can be triggered as follows:
In a system with 16GB memory and an 8GB CMA region reserved by hugetlb_cma, if we first allocate 10GB THPs and mlock them (so some THPs are allocated in the CMA region and mlocked), reserving 6 1GB hugetlb pages via /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages will get stuck (looping in too_many_isolated function) until we kill either task. With the patch applied, oom will kill the application with 10GB THPs and let hugetlb page reservation finish.
[ziy@nvidia.com: v3]
Link: https://lkml.kernel.org/r/20201030183809.3616803-1-zi.yan@sent.com Fixes: 1da2f328fa64 ("cmm,thp,compaction,cma: allow THP migration for CMA allocations") Signed-off-by: Zi Yan ziy@nvidia.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Reviewed-by: Yang Shi shy828301@gmail.com Acked-by: Vlastimil Babka vbabka@suse.cz Cc: Rik van Riel riel@surriel.com Cc: Michal Hocko mhocko@kernel.org Cc: Mel Gorman mgorman@techsingularity.net Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20201029200435.3386066-1-zi.yan@sent.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/compaction.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/mm/compaction.c +++ b/mm/compaction.c @@ -1013,8 +1013,8 @@ isolate_migratepages_block(struct compac
isolate_success: list_add(&page->lru, &cc->migratepages); - cc->nr_migratepages++; - nr_isolated++; + cc->nr_migratepages += compound_nr(page); + nr_isolated += compound_nr(page);
/* * Avoid isolating too much unless this block is being @@ -1022,7 +1022,7 @@ isolate_success: * or a lock is contended. For contention, isolate quickly to * potentially remove one source of contention. */ - if (cc->nr_migratepages == COMPACT_CLUSTER_MAX && + if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX && !cc->rescan && !cc->contended) { ++low_pfn; break; @@ -1133,7 +1133,7 @@ isolate_migratepages_range(struct compac if (!pfn) break;
- if (cc->nr_migratepages == COMPACT_CLUSTER_MAX) + if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX) break; }
From: Zi Yan ziy@nvidia.com
commit d20bdd571ee5c9966191568527ecdb1bd4b52368 upstream.
In isolate_migratepages_block, if we have too many isolated pages and nr_migratepages is not zero, we should try to migrate what we have without wasting time on isolating.
In theory it's possible that multiple parallel compactions will cause too_many_isolated() to become true even if each has isolated less than COMPACT_CLUSTER_MAX, and loop forever in the while loop. Bailing immediately prevents that.
[vbabka@suse.cz: changelog addition]
Fixes: 1da2f328fa64 (“mm,thp,compaction,cma: allow THP migration for CMA allocations”) Suggested-by: Vlastimil Babka vbabka@suse.cz Signed-off-by: Zi Yan ziy@nvidia.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Cc: stable@vger.kernel.org Cc: Mel Gorman mgorman@techsingularity.net Cc: Michal Hocko mhocko@kernel.org Cc: Rik van Riel riel@surriel.com Cc: Yang Shi shy828301@gmail.com Link: https://lkml.kernel.org/r/20201030183809.3616803-2-zi.yan@sent.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/compaction.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/mm/compaction.c +++ b/mm/compaction.c @@ -818,6 +818,10 @@ isolate_migratepages_block(struct compac * delay for some time until fewer pages are isolated */ while (unlikely(too_many_isolated(pgdat))) { + /* stop isolation if there are still pages not migrated */ + if (cc->nr_migratepages) + return 0; + /* async migration should just abort */ if (cc->mode == MIGRATE_ASYNC) return 0;
From: Laurent Dufour ldufour@linux.ibm.com
commit 22e4663e916321b72972c69ca0c6b962f529bd78 upstream.
While doing memory hot-unplug operation on a PowerPC VM running 1024 CPUs with 11TB of ram, I hit the following panic:
BUG: Kernel NULL pointer dereference on read at 0x00000007 Faulting instruction address: 0xc000000000456048 Oops: Kernel access of bad area, sig: 11 [#2] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS= 2048 NUMA pSeries Modules linked in: rpadlpar_io rpaphp CPU: 160 PID: 1 Comm: systemd Tainted: G D 5.9.0 #1 NIP: c000000000456048 LR: c000000000455fd4 CTR: c00000000047b350 REGS: c00006028d1b77a0 TRAP: 0300 Tainted: G D (5.9.0) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24004228 XER: 00000000 CFAR: c00000000000f1b0 DAR: 0000000000000007 DSISR: 40000000 IRQMASK: 0 GPR00: c000000000455fd4 c00006028d1b7a30 c000000001bec800 0000000000000000 GPR04: 0000000000000dc0 0000000000000000 00000000000374ef c00007c53df99320 GPR08: 000007c53c980000 0000000000000000 000007c53c980000 0000000000000000 GPR12: 0000000000004400 c00000001e8e4400 0000000000000000 0000000000000f6a GPR16: 0000000000000000 c000000001c25930 c000000001d62528 00000000000000c1 GPR20: c000000001d62538 c00006be469e9000 0000000fffffffe0 c0000000003c0ff8 GPR24: 0000000000000018 0000000000000000 0000000000000dc0 0000000000000000 GPR28: c00007c513755700 c000000001c236a4 c00007bc4001f800 0000000000000001 NIP [c000000000456048] __kmalloc_node+0x108/0x790 LR [c000000000455fd4] __kmalloc_node+0x94/0x790 Call Trace: kvmalloc_node+0x58/0x110 mem_cgroup_css_online+0x10c/0x270 online_css+0x48/0xd0 cgroup_apply_control_enable+0x2c4/0x470 cgroup_mkdir+0x408/0x5f0 kernfs_iop_mkdir+0x90/0x100 vfs_mkdir+0x138/0x250 do_mkdirat+0x154/0x1c0 system_call_exception+0xf8/0x200 system_call_common+0xf0/0x27c Instruction dump: e93e0000 e90d0030 39290008 7cc9402a e94d0030 e93e0000 7ce95214 7f89502a 2fbc0000 419e0018 41920230 e9270010 <89290007> 7f994800 419e0220 7ee6bb78
This pointing to the following code:
mm/slub.c:2851 if (unlikely(!object || !node_match(page, node))) { c000000000456038: 00 00 bc 2f cmpdi cr7,r28,0 c00000000045603c: 18 00 9e 41 beq cr7,c000000000456054 <__kmalloc_node+0x114> node_match(): mm/slub.c:2491 if (node != NUMA_NO_NODE && page_to_nid(page) != node) c000000000456040: 30 02 92 41 beq cr4,c000000000456270 <__kmalloc_node+0x330> page_to_nid(): include/linux/mm.h:1294 c000000000456044: 10 00 27 e9 ld r9,16(r7) c000000000456048: 07 00 29 89 lbz r9,7(r9) <<<< r9 = NULL node_match(): mm/slub.c:2491 c00000000045604c: 00 48 99 7f cmpw cr7,r25,r9 c000000000456050: 20 02 9e 41 beq cr7,c000000000456270 <__kmalloc_node+0x330>
The panic occurred in slab_alloc_node() when checking for the page's node:
object = c->freelist; page = c->page; if (unlikely(!object || !node_match(page, node))) { object = __slab_alloc(s, gfpflags, node, addr, c); stat(s, ALLOC_SLOWPATH);
The issue is that object is not NULL while page is NULL which is odd but may happen if the cache flush happened after loading object but before loading page. Thus checking for the page pointer is required too.
The cache flush is done through an inter processor interrupt when a piece of memory is off-lined. That interrupt is triggered when a memory hot-unplug operation is initiated and offline_pages() is calling the slub's MEM_GOING_OFFLINE callback slab_mem_going_offline_callback() which is calling flush_cpu_slab(). If that interrupt is caught between the reading of c->freelist and the reading of c->page, this could lead to such a situation. That situation is expected and the later call to this_cpu_cmpxchg_double() will detect the change to c->freelist and redo the whole operation.
In commit 6159d0f5c03e ("mm/slub.c: page is always non-NULL in node_match()") check on the page pointer has been removed assuming that page is always valid when it is called. It happens that this is not true in that particular case, so check for page before calling node_match() here.
Fixes: 6159d0f5c03e ("mm/slub.c: page is always non-NULL in node_match()") Signed-off-by: Laurent Dufour ldufour@linux.ibm.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Acked-by: Vlastimil Babka vbabka@suse.cz Acked-by: Christoph Lameter cl@linux.com Cc: Wei Yang richard.weiyang@gmail.com Cc: Pekka Enberg penberg@kernel.org Cc: David Rientjes rientjes@google.com Cc: Joonsoo Kim iamjoonsoo.kim@lge.com Cc: Nathan Lynch nathanl@linux.ibm.com Cc: Scott Cheloha cheloha@linux.ibm.com Cc: Michal Hocko mhocko@suse.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20201027190406.33283-1-ldufour@linux.ibm.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/slub.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/slub.c +++ b/mm/slub.c @@ -2848,7 +2848,7 @@ redo:
object = c->freelist; page = c->page; - if (unlikely(!object || !node_match(page, node))) { + if (unlikely(!object || !page || !node_match(page, node))) { object = __slab_alloc(s, gfpflags, node, addr, c); stat(s, ALLOC_SLOWPATH); } else {
From: Nicholas Piggin npiggin@gmail.com
commit 2da9f6305f306ffbbb44790675799328fb73119d upstream.
Previously the negated unsigned long would be cast back to signed long which would have the correct negative value. After commit 730ec8c01a2b ("mm/vmscan.c: change prototype for shrink_page_list"), the large unsigned int converts to a large positive signed long.
Symptoms include CMA allocations hanging forever holding the cma_mutex due to alloc_contig_range->...->isolate_migratepages_block waiting forever in "while (unlikely(too_many_isolated(pgdat)))".
[akpm@linux-foundation.org: fix -stat.nr_lazyfree_fail as well, per Michal]
Fixes: 730ec8c01a2b ("mm/vmscan.c: change prototype for shrink_page_list") Signed-off-by: Nicholas Piggin npiggin@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Acked-by: Michal Hocko mhocko@suse.com Cc: Vaneet Narang v.narang@samsung.com Cc: Maninder Singh maninder1.s@samsung.com Cc: Amit Sahrawat a.sahrawat@samsung.com Cc: Mel Gorman mgorman@suse.de Cc: Vlastimil Babka vbabka@suse.cz Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20201029032320.1448441-1-npiggin@gmail.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/vmscan.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1514,7 +1514,8 @@ unsigned int reclaim_clean_pages_from_li nr_reclaimed = shrink_page_list(&clean_pages, zone->zone_pgdat, &sc, TTU_IGNORE_ACCESS, &stat, true); list_splice(&clean_pages, page_list); - mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE, -nr_reclaimed); + mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE, + -(long)nr_reclaimed); /* * Since lazyfree pages are isolated from file LRU from the beginning, * they will rotate back to anonymous LRU in the end if it failed to @@ -1524,7 +1525,7 @@ unsigned int reclaim_clean_pages_from_li mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_ANON, stat.nr_lazyfree_fail); mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE, - -stat.nr_lazyfree_fail); + -(long)stat.nr_lazyfree_fail); return nr_reclaimed; }
From: Jason Gunthorpe jgg@nvidia.com
commit 96e1fac162cc0086c50b2b14062112adb2ba640e upstream.
When FOLL_PIN is passed to __get_user_pages() the page list must be put back using unpin_user_pages() otherwise the page pin reference persists in a corrupted state.
There are two places in the unwind of __gup_longterm_locked() that put the pages back without checking. Normally on error this function would return the partial page list making this the caller's responsibility, but in these two cases the caller is not allowed to see these pages at all.
Fixes: 3faa52c03f44 ("mm/gup: track FOLL_PIN pages") Reported-by: Ira Weiny ira.weiny@intel.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Reviewed-by: Ira Weiny ira.weiny@intel.com Reviewed-by: John Hubbard jhubbard@nvidia.com Cc: Aneesh Kumar K.V aneesh.kumar@linux.ibm.com Cc: Dan Williams dan.j.williams@intel.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/0-v2-3ae7d9d162e2+2a7-gup_cma_fix_jgg@nvidia.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/gup.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
--- a/mm/gup.c +++ b/mm/gup.c @@ -1637,8 +1637,11 @@ check_again: /* * drop the above get_user_pages reference. */ - for (i = 0; i < nr_pages; i++) - put_page(pages[i]); + if (gup_flags & FOLL_PIN) + unpin_user_pages(pages, nr_pages); + else + for (i = 0; i < nr_pages; i++) + put_page(pages[i]);
if (migrate_pages(&cma_page_list, alloc_migration_target, NULL, (unsigned long)&mtc, MIGRATE_SYNC, MR_CONTIG_RANGE)) { @@ -1718,8 +1721,11 @@ static long __gup_longterm_locked(struct goto out;
if (check_dax_vmas(vmas_tmp, rc)) { - for (i = 0; i < rc; i++) - put_page(pages[i]); + if (gup_flags & FOLL_PIN) + unpin_user_pages(pages, rc); + else + for (i = 0; i < rc; i++) + put_page(pages[i]); rc = -EOPNOTSUPP; goto out; }
From: Matteo Croce mcroce@microsoft.com
commit 8b92c4ff4423aa9900cf838d3294fcade4dbda35 upstream.
Patch series "fix parsing of reboot= cmdline", v3.
The parsing of the reboot= cmdline has two major errors:
- a missing bound check can crash the system on reboot
- parsing of the cpu number only works if specified last
Fix both.
This patch (of 2):
This reverts commit 616feab753972b97.
kstrtoint() and simple_strtoul() have a subtle difference which makes them non interchangeable: if a non digit character is found amid the parsing, the former will return an error, while the latter will just stop parsing, e.g. simple_strtoul("123xyx") = 123.
The kernel cmdline reboot= argument allows to specify the CPU used for rebooting, with the syntax `s####` among the other flags, e.g. "reboot=warm,s31,force", so if this flag is not the last given, it's silently ignored as well as the subsequent ones.
Fixes: 616feab75397 ("kernel/reboot.c: convert simple_strtoul to kstrtoint") Signed-off-by: Matteo Croce mcroce@microsoft.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Cc: Guenter Roeck linux@roeck-us.net Cc: Petr Mladek pmladek@suse.com Cc: Arnd Bergmann arnd@arndb.de Cc: Mike Rapoport rppt@kernel.org Cc: Kees Cook keescook@chromium.org Cc: Pavel Tatashin pasha.tatashin@soleen.com Cc: Robin Holt robinmholt@gmail.com Cc: Fabian Frederick fabf@skynet.be Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20201103214025.116799-2-mcroce@linux.microsoft.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/reboot.c | 21 +++++++-------------- 1 file changed, 7 insertions(+), 14 deletions(-)
--- a/kernel/reboot.c +++ b/kernel/reboot.c @@ -551,22 +551,15 @@ static int __init reboot_setup(char *str break;
case 's': - { - int rc; - - if (isdigit(*(str+1))) { - rc = kstrtoint(str+1, 0, &reboot_cpu); - if (rc) - return rc; - } else if (str[1] == 'm' && str[2] == 'p' && - isdigit(*(str+3))) { - rc = kstrtoint(str+3, 0, &reboot_cpu); - if (rc) - return rc; - } else + if (isdigit(*(str+1))) + reboot_cpu = simple_strtoul(str+1, NULL, 0); + else if (str[1] == 'm' && str[2] == 'p' && + isdigit(*(str+3))) + reboot_cpu = simple_strtoul(str+3, NULL, 0); + else *mode = REBOOT_SOFT; break; - } + case 'g': *mode = REBOOT_GPIO; break;
From: Matteo Croce mcroce@microsoft.com
commit df5b0ab3e08a156701b537809914b339b0daa526 upstream.
Limit the CPU number to num_possible_cpus(), because setting it to a value lower than INT_MAX but higher than NR_CPUS produces the following error on reboot and shutdown:
BUG: unable to handle page fault for address: ffffffff90ab1bb0 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 1c09067 P4D 1c09067 PUD 1c0a063 PMD 0 Oops: 0000 [#1] SMP CPU: 1 PID: 1 Comm: systemd-shutdow Not tainted 5.9.0-rc8-kvm #110 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 RIP: 0010:migrate_to_reboot_cpu+0xe/0x60 Code: ea ea 00 48 89 fa 48 c7 c7 30 57 f1 81 e9 fa ef ff ff 66 2e 0f 1f 84 00 00 00 00 00 53 8b 1d d5 ea ea 00 e8 14 33 fe ff 89 da <48> 0f a3 15 ea fc bd 00 48 89 d0 73 29 89 c2 c1 e8 06 65 48 8b 3c RSP: 0018:ffffc90000013e08 EFLAGS: 00010246 RAX: ffff88801f0a0000 RBX: 0000000077359400 RCX: 0000000000000000 RDX: 0000000077359400 RSI: 0000000000000002 RDI: ffffffff81c199e0 RBP: ffffffff81c1e3c0 R08: ffff88801f41f000 R09: ffffffff81c1e348 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 00007f32bedf8830 R14: 00000000fee1dead R15: 0000000000000000 FS: 00007f32bedf8980(0000) GS:ffff88801f480000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffff90ab1bb0 CR3: 000000001d057000 CR4: 00000000000006a0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __do_sys_reboot.cold+0x34/0x5b do_syscall_64+0x2d/0x40
Fixes: 1b3a5d02ee07 ("reboot: move arch/x86 reboot= handling to generic kernel") Signed-off-by: Matteo Croce mcroce@microsoft.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Cc: Arnd Bergmann arnd@arndb.de Cc: Fabian Frederick fabf@skynet.be Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Guenter Roeck linux@roeck-us.net Cc: Kees Cook keescook@chromium.org Cc: Mike Rapoport rppt@kernel.org Cc: Pavel Tatashin pasha.tatashin@soleen.com Cc: Petr Mladek pmladek@suse.com Cc: Robin Holt robinmholt@gmail.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20201103214025.116799-3-mcroce@linux.microsoft.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/reboot.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/kernel/reboot.c +++ b/kernel/reboot.c @@ -558,6 +558,13 @@ static int __init reboot_setup(char *str reboot_cpu = simple_strtoul(str+3, NULL, 0); else *mode = REBOOT_SOFT; + if (reboot_cpu >= num_possible_cpus()) { + pr_err("Ignoring the CPU number in reboot= option. " + "CPU %d exceeds possible cpu number %d\n", + reboot_cpu, num_possible_cpus()); + reboot_cpu = 0; + break; + } break;
case 'g':
From: Mike Kravetz mike.kravetz@oracle.com
commit 336bf30eb76580b579dc711ded5d599d905c0217 upstream.
Qian Cai reported the following BUG in [1]
LTP: starting move_pages12 BUG: unable to handle page fault for address: ffffffffffffffe0 ... RIP: 0010:anon_vma_interval_tree_iter_first+0xa2/0x170 avc_start_pgoff at mm/interval_tree.c:63 Call Trace: rmap_walk_anon+0x141/0xa30 rmap_walk_anon at mm/rmap.c:1864 try_to_unmap+0x209/0x2d0 try_to_unmap at mm/rmap.c:1763 migrate_pages+0x1005/0x1fb0 move_pages_and_store_status.isra.47+0xd7/0x1a0 __x64_sys_move_pages+0xa5c/0x1100 do_syscall_64+0x5f/0x310 entry_SYSCALL_64_after_hwframe+0x44/0xa9
Hugh Dickins diagnosed this as a migration bug caused by code introduced to use i_mmap_rwsem for pmd sharing synchronization. Specifically, the routine unmap_and_move_huge_page() is always passing the TTU_RMAP_LOCKED flag to try_to_unmap() while holding i_mmap_rwsem. This is wrong for anon pages as the anon_vma_lock should be held in this case. Further analysis suggested that i_mmap_rwsem was not required to he held at all when calling try_to_unmap for anon pages as an anon page could never be part of a shared pmd mapping.
Discussion also revealed that the hack in hugetlb_page_mapping_lock_write to drop page lock and acquire i_mmap_rwsem is wrong. There is no way to keep mapping valid while dropping page lock.
This patch does the following:
- Do not take i_mmap_rwsem and set TTU_RMAP_LOCKED for anon pages when calling try_to_unmap.
- Remove the hacky code in hugetlb_page_mapping_lock_write. The routine will now simply do a 'trylock' while still holding the page lock. If the trylock fails, it will return NULL. This could impact the callers:
- migration calling code will receive -EAGAIN and retry up to the hard coded limit (10).
- memory error code will treat the page as BUSY. This will force killing (SIGKILL) instead of SIGBUS any mapping tasks.
Do note that this change in behavior only happens when there is a race. None of the standard kernel testing suites actually hit this race, but it is possible.
[1] https://lore.kernel.org/lkml/20200708012044.GC992@lca.pw/ [2] https://lore.kernel.org/linux-mm/alpine.LSU.2.11.2010071833100.2214@eggly.an...
Fixes: c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization") Reported-by: Qian Cai cai@lca.pw Suggested-by: Hugh Dickins hughd@google.com Signed-off-by: Mike Kravetz mike.kravetz@oracle.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Acked-by: Naoya Horiguchi naoya.horiguchi@nec.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20201105195058.78401-1-mike.kravetz@oracle.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/hugetlb.c | 90 ++-------------------------------------------------- mm/memory-failure.c | 36 +++++++++----------- mm/migrate.c | 46 ++++++++++++++------------ mm/rmap.c | 5 -- 4 files changed, 48 insertions(+), 129 deletions(-)
--- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -1579,103 +1579,23 @@ int PageHeadHuge(struct page *page_head) }
/* - * Find address_space associated with hugetlbfs page. - * Upon entry page is locked and page 'was' mapped although mapped state - * could change. If necessary, use anon_vma to find vma and associated - * address space. The returned mapping may be stale, but it can not be - * invalid as page lock (which is held) is required to destroy mapping. - */ -static struct address_space *_get_hugetlb_page_mapping(struct page *hpage) -{ - struct anon_vma *anon_vma; - pgoff_t pgoff_start, pgoff_end; - struct anon_vma_chain *avc; - struct address_space *mapping = page_mapping(hpage); - - /* Simple file based mapping */ - if (mapping) - return mapping; - - /* - * Even anonymous hugetlbfs mappings are associated with an - * underlying hugetlbfs file (see hugetlb_file_setup in mmap - * code). Find a vma associated with the anonymous vma, and - * use the file pointer to get address_space. - */ - anon_vma = page_lock_anon_vma_read(hpage); - if (!anon_vma) - return mapping; /* NULL */ - - /* Use first found vma */ - pgoff_start = page_to_pgoff(hpage); - pgoff_end = pgoff_start + pages_per_huge_page(page_hstate(hpage)) - 1; - anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root, - pgoff_start, pgoff_end) { - struct vm_area_struct *vma = avc->vma; - - mapping = vma->vm_file->f_mapping; - break; - } - - anon_vma_unlock_read(anon_vma); - return mapping; -} - -/* * Find and lock address space (mapping) in write mode. * - * Upon entry, the page is locked which allows us to find the mapping - * even in the case of an anon page. However, locking order dictates - * the i_mmap_rwsem be acquired BEFORE the page lock. This is hugetlbfs - * specific. So, we first try to lock the sema while still holding the - * page lock. If this works, great! If not, then we need to drop the - * page lock and then acquire i_mmap_rwsem and reacquire page lock. Of - * course, need to revalidate state along the way. + * Upon entry, the page is locked which means that page_mapping() is + * stable. Due to locking order, we can only trylock_write. If we can + * not get the lock, simply return NULL to caller. */ struct address_space *hugetlb_page_mapping_lock_write(struct page *hpage) { - struct address_space *mapping, *mapping2; + struct address_space *mapping = page_mapping(hpage);
- mapping = _get_hugetlb_page_mapping(hpage); -retry: if (!mapping) return mapping;
- /* - * If no contention, take lock and return - */ if (i_mmap_trylock_write(mapping)) return mapping;
- /* - * Must drop page lock and wait on mapping sema. - * Note: Once page lock is dropped, mapping could become invalid. - * As a hack, increase map count until we lock page again. - */ - atomic_inc(&hpage->_mapcount); - unlock_page(hpage); - i_mmap_lock_write(mapping); - lock_page(hpage); - atomic_add_negative(-1, &hpage->_mapcount); - - /* verify page is still mapped */ - if (!page_mapped(hpage)) { - i_mmap_unlock_write(mapping); - return NULL; - } - - /* - * Get address space again and verify it is the same one - * we locked. If not, drop lock and retry. - */ - mapping2 = _get_hugetlb_page_mapping(hpage); - if (mapping2 != mapping) { - i_mmap_unlock_write(mapping); - mapping = mapping2; - goto retry; - } - - return mapping; + return NULL; }
pgoff_t __basepage_index(struct page *page) --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -1031,27 +1031,25 @@ static bool hwpoison_user_mappings(struc if (!PageHuge(hpage)) { unmap_success = try_to_unmap(hpage, ttu); } else { - /* - * For hugetlb pages, try_to_unmap could potentially call - * huge_pmd_unshare. Because of this, take semaphore in - * write mode here and set TTU_RMAP_LOCKED to indicate we - * have taken the lock at this higer level. - * - * Note that the call to hugetlb_page_mapping_lock_write - * is necessary even if mapping is already set. It handles - * ugliness of potentially having to drop page lock to obtain - * i_mmap_rwsem. - */ - mapping = hugetlb_page_mapping_lock_write(hpage); - - if (mapping) { - unmap_success = try_to_unmap(hpage, + if (!PageAnon(hpage)) { + /* + * For hugetlb pages in shared mappings, try_to_unmap + * could potentially call huge_pmd_unshare. Because of + * this, take semaphore in write mode here and set + * TTU_RMAP_LOCKED to indicate we have taken the lock + * at this higer level. + */ + mapping = hugetlb_page_mapping_lock_write(hpage); + if (mapping) { + unmap_success = try_to_unmap(hpage, ttu|TTU_RMAP_LOCKED); - i_mmap_unlock_write(mapping); + i_mmap_unlock_write(mapping); + } else { + pr_info("Memory failure: %#lx: could not lock mapping for mapped huge page\n", pfn); + unmap_success = false; + } } else { - pr_info("Memory failure: %#lx: could not find mapping for mapped huge page\n", - pfn); - unmap_success = false; + unmap_success = try_to_unmap(hpage, ttu); } } if (!unmap_success) --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1333,34 +1333,38 @@ static int unmap_and_move_huge_page(new_ goto put_anon;
if (page_mapped(hpage)) { - /* - * try_to_unmap could potentially call huge_pmd_unshare. - * Because of this, take semaphore in write mode here and - * set TTU_RMAP_LOCKED to let lower levels know we have - * taken the lock. - */ - mapping = hugetlb_page_mapping_lock_write(hpage); - if (unlikely(!mapping)) - goto unlock_put_anon; - - try_to_unmap(hpage, - TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS| - TTU_RMAP_LOCKED); + bool mapping_locked = false; + enum ttu_flags ttu = TTU_MIGRATION|TTU_IGNORE_MLOCK| + TTU_IGNORE_ACCESS; + + if (!PageAnon(hpage)) { + /* + * In shared mappings, try_to_unmap could potentially + * call huge_pmd_unshare. Because of this, take + * semaphore in write mode here and set TTU_RMAP_LOCKED + * to let lower levels know we have taken the lock. + */ + mapping = hugetlb_page_mapping_lock_write(hpage); + if (unlikely(!mapping)) + goto unlock_put_anon; + + mapping_locked = true; + ttu |= TTU_RMAP_LOCKED; + } + + try_to_unmap(hpage, ttu); page_was_mapped = 1; - /* - * Leave mapping locked until after subsequent call to - * remove_migration_ptes() - */ + + if (mapping_locked) + i_mmap_unlock_write(mapping); }
if (!page_mapped(hpage)) rc = move_to_new_page(new_hpage, hpage, mode);
- if (page_was_mapped) { + if (page_was_mapped) remove_migration_ptes(hpage, - rc == MIGRATEPAGE_SUCCESS ? new_hpage : hpage, true); - i_mmap_unlock_write(mapping); - } + rc == MIGRATEPAGE_SUCCESS ? new_hpage : hpage, false);
unlock_put_anon: unlock_page(new_hpage); --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1413,9 +1413,6 @@ static bool try_to_unmap_one(struct page /* * If sharing is possible, start and end will be adjusted * accordingly. - * - * If called for a huge page, caller must hold i_mmap_rwsem - * in write mode as it is possible to call huge_pmd_unshare. */ adjust_range_if_pmd_sharing_possible(vma, &range.start, &range.end); @@ -1462,7 +1459,7 @@ static bool try_to_unmap_one(struct page subpage = page - page_to_pfn(page) + pte_pfn(*pvmw.pte); address = pvmw.address;
- if (PageHuge(page)) { + if (PageHuge(page) && !PageAnon(page)) { /* * To call huge_pmd_unshare, i_mmap_rwsem must be * held in write mode. Caller needs to explicitly
From: Wengang Wang wen.gang.wang@oracle.com
commit f5785283dd64867a711ca1fb1f5bb172f252ecdf upstream.
Though problem if found on a lower 4.1.12 kernel, I think upstream has same issue.
In one node in the cluster, there is the following callback trace:
# cat /proc/21473/stack __ocfs2_cluster_lock.isra.36+0x336/0x9e0 [ocfs2] ocfs2_inode_lock_full_nested+0x121/0x520 [ocfs2] ocfs2_evict_inode+0x152/0x820 [ocfs2] evict+0xae/0x1a0 iput+0x1c6/0x230 ocfs2_orphan_filldir+0x5d/0x100 [ocfs2] ocfs2_dir_foreach_blk+0x490/0x4f0 [ocfs2] ocfs2_dir_foreach+0x29/0x30 [ocfs2] ocfs2_recover_orphans+0x1b6/0x9a0 [ocfs2] ocfs2_complete_recovery+0x1de/0x5c0 [ocfs2] process_one_work+0x169/0x4a0 worker_thread+0x5b/0x560 kthread+0xcb/0xf0 ret_from_fork+0x61/0x90
The above stack is not reasonable, the final iput shouldn't happen in ocfs2_orphan_filldir() function. Looking at the code,
2067 /* Skip inodes which are already added to recover list, since dio may 2068 * happen concurrently with unlink/rename */ 2069 if (OCFS2_I(iter)->ip_next_orphan) { 2070 iput(iter); 2071 return 0; 2072 } 2073
The logic thinks the inode is already in recover list on seeing ip_next_orphan is non-NULL, so it skip this inode after dropping a reference which incremented in ocfs2_iget().
While, if the inode is already in recover list, it should have another reference and the iput() at line 2070 should not be the final iput (dropping the last reference). So I don't think the inode is really in the recover list (no vmcore to confirm).
Note that ocfs2_queue_orphans(), though not shown up in the call back trace, is holding cluster lock on the orphan directory when looking up for unlinked inodes. The on disk inode eviction could involve a lot of IOs which may need long time to finish. That means this node could hold the cluster lock for very long time, that can lead to the lock requests (from other nodes) to the orhpan directory hang for long time.
Looking at more on ip_next_orphan, I found it's not initialized when allocating a new ocfs2_inode_info structure.
This causes te reflink operations from some nodes hang for very long time waiting for the cluster lock on the orphan directory.
Fix: initialize ip_next_orphan as NULL.
Signed-off-by: Wengang Wang wen.gang.wang@oracle.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Reviewed-by: Joseph Qi joseph.qi@linux.alibaba.com Cc: Mark Fasheh mark@fasheh.com Cc: Joel Becker jlbec@evilplan.org Cc: Junxiao Bi junxiao.bi@oracle.com Cc: Changwei Ge gechangwei@live.cn Cc: Gang He ghe@suse.com Cc: Jun Piao piaojun@huawei.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20201109171746.27884-1-wen.gang.wang@oracle.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/ocfs2/super.c | 1 + 1 file changed, 1 insertion(+)
--- a/fs/ocfs2/super.c +++ b/fs/ocfs2/super.c @@ -1713,6 +1713,7 @@ static void ocfs2_inode_init_once(void *
oi->ip_blkno = 0ULL; oi->ip_clusters = 0; + oi->ip_next_orphan = NULL;
ocfs2_resv_init_once(&oi->ip_la_data_resv);
From: Naveen Krishna Chatradhi nchatrad@amd.com
commit 60268b0e8258fdea9a3c9f4b51e161c123571db3 upstream.
This patch limits the visibility to owner and groups only for the energy counters exposed through the hwmon based amd_energy driver.
Cc: stable@vger.kernel.org Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Naveen Krishna Chatradhi nchatrad@amd.com Link: https://lore.kernel.org/r/20201112172159.8781-1-nchatrad@amd.com Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hwmon/amd_energy.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/hwmon/amd_energy.c +++ b/drivers/hwmon/amd_energy.c @@ -209,7 +209,7 @@ static umode_t amd_energy_is_visible(con enum hwmon_sensor_types type, u32 attr, int channel) { - return 0444; + return 0440; }
static int energy_accumulator(void *p)
From: Chen Zhou chenzhou10@huawei.com
commit c350f8bea271782e2733419bd2ab9bf4ec2051ef upstream.
Fix to return a negative error code from the error handling case instead of 0 in function sel_ib_pkey_sid_slow(), as done elsewhere in this function.
Cc: stable@vger.kernel.org Fixes: 409dcf31538a ("selinux: Add a cache for quicker retreival of PKey SIDs") Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Chen Zhou chenzhou10@huawei.com Signed-off-by: Paul Moore paul@paul-moore.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- security/selinux/ibpkey.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/security/selinux/ibpkey.c +++ b/security/selinux/ibpkey.c @@ -151,8 +151,10 @@ static int sel_ib_pkey_sid_slow(u64 subn * is valid, it just won't be added to the cache. */ new = kzalloc(sizeof(*new), GFP_ATOMIC); - if (!new) + if (!new) { + ret = -ENOMEM; goto out; + }
new->psec.subnet_prefix = subnet_prefix; new->psec.pkey = pkey_num;
From: Jens Axboe axboe@kernel.dk
commit 88ec3211e46344a7d10cf6cb5045f839f7785f8e upstream.
If an application specifies IORING_SETUP_CQSIZE to set the CQ ring size to a specific size, we ensure that the CQ size is at least that of the SQ ring size. But in doing so, we compare the already rounded up to power of two SQ size to the as-of yet unrounded CQ size. This means that if an application passes in non power of two sizes, we can return -EINVAL when the final value would've been fine. As an example, an application passing in 100/100 for sq/cq size should end up with 128 for both. But since we round the SQ size first, we compare the CQ size of 100 to 128, and return -EINVAL as that is too small.
Cc: stable@vger.kernel.org Fixes: 33a107f0a1b8 ("io_uring: allow application controlled CQ ring size") Reported-by: Dan Melnic dmm@fb.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/io_uring.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -8878,6 +8878,7 @@ static int io_uring_create(unsigned entr * to a power-of-two, if it isn't already. We do NOT impose * any cq vs sq ring sizing. */ + p->cq_entries = roundup_pow_of_two(p->cq_entries); if (p->cq_entries < p->sq_entries) return -EINVAL; if (p->cq_entries > IORING_MAX_CQ_ENTRIES) { @@ -8885,7 +8886,6 @@ static int io_uring_create(unsigned entr return -EINVAL; p->cq_entries = IORING_MAX_CQ_ENTRIES; } - p->cq_entries = roundup_pow_of_two(p->cq_entries); } else { p->cq_entries = 2 * p->sq_entries; }
From: Damien Le Moal damien.lemoal@wdc.com
commit b72de3ff19fdc4bbe4d4bb3f4483c7e46e00bac3 upstream.
Fix the check on the number of IRQs to allow up to the maximum (32) instead of only the maximum minus one.
Fixes: 96868dce644d ("gpio/sifive: Add GPIO driver for SiFive SoCs") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal damien.lemoal@wdc.com Link: https://lore.kernel.org/r/20201107081420.60325-10-damien.lemoal@wdc.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpio/gpio-sifive.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpio/gpio-sifive.c +++ b/drivers/gpio/gpio-sifive.c @@ -183,7 +183,7 @@ static int sifive_gpio_probe(struct plat return PTR_ERR(chip->regs);
ngpio = of_irq_count(node); - if (ngpio >= SIFIVE_GPIO_MAX) { + if (ngpio > SIFIVE_GPIO_MAX) { dev_err(dev, "Too many GPIO interrupts (max=%d)\n", SIFIVE_GPIO_MAX); return -ENXIO;
From: Arnaud de Turckheim quarium@gmail.com
commit d8f270efeac850c569c305dc0baa42ac3d607988 upstream.
Fix the bitwise operation to remove only the corresponding bit from the mask.
Fixes: 585562046628 ("gpio: Add GPIO support for the ACCES PCIe-IDIO-24 family") Cc: stable@vger.kernel.org Signed-off-by: Arnaud de Turckheim quarium@gmail.com Reviewed-by: William Breathitt Gray vilhelm.gray@gmail.com Signed-off-by: Bartosz Golaszewski bgolaszewski@baylibre.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpio/gpio-pcie-idio-24.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpio/gpio-pcie-idio-24.c +++ b/drivers/gpio/gpio-pcie-idio-24.c @@ -339,7 +339,7 @@ static void idio_24_irq_mask(struct irq_
raw_spin_lock_irqsave(&idio24gpio->lock, flags);
- idio24gpio->irq_mask &= BIT(bit_offset); + idio24gpio->irq_mask &= ~BIT(bit_offset); new_irq_mask = idio24gpio->irq_mask >> bank_offset;
if (!new_irq_mask) {
From: Arnaud de Turckheim quarium@gmail.com
commit 23a7fdc06ebcc334fa667f0550676b035510b70b upstream.
This fixes the COS Enable Register value for enabling/disabling the corresponding IRQs bank.
Fixes: 585562046628 ("gpio: Add GPIO support for the ACCES PCIe-IDIO-24 family") Cc: stable@vger.kernel.org Signed-off-by: Arnaud de Turckheim quarium@gmail.com Reviewed-by: William Breathitt Gray vilhelm.gray@gmail.com Signed-off-by: Bartosz Golaszewski bgolaszewski@baylibre.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpio/gpio-pcie-idio-24.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/gpio/gpio-pcie-idio-24.c +++ b/drivers/gpio/gpio-pcie-idio-24.c @@ -334,13 +334,13 @@ static void idio_24_irq_mask(struct irq_ unsigned long flags; const unsigned long bit_offset = irqd_to_hwirq(data) - 24; unsigned char new_irq_mask; - const unsigned long bank_offset = bit_offset/8 * 8; + const unsigned long bank_offset = bit_offset / 8; unsigned char cos_enable_state;
raw_spin_lock_irqsave(&idio24gpio->lock, flags);
idio24gpio->irq_mask &= ~BIT(bit_offset); - new_irq_mask = idio24gpio->irq_mask >> bank_offset; + new_irq_mask = idio24gpio->irq_mask >> bank_offset * 8;
if (!new_irq_mask) { cos_enable_state = ioread8(&idio24gpio->reg->cos_enable); @@ -363,12 +363,12 @@ static void idio_24_irq_unmask(struct ir unsigned long flags; unsigned char prev_irq_mask; const unsigned long bit_offset = irqd_to_hwirq(data) - 24; - const unsigned long bank_offset = bit_offset/8 * 8; + const unsigned long bank_offset = bit_offset / 8; unsigned char cos_enable_state;
raw_spin_lock_irqsave(&idio24gpio->lock, flags);
- prev_irq_mask = idio24gpio->irq_mask >> bank_offset; + prev_irq_mask = idio24gpio->irq_mask >> bank_offset * 8; idio24gpio->irq_mask |= BIT(bit_offset);
if (!prev_irq_mask) {
From: Arnaud de Turckheim quarium@gmail.com
commit 10a2f11d3c9e48363c729419e0f0530dea76e4fe upstream.
This enables the PEX8311 internal PCI wire interrupt and the PEX8311 local interrupt input so the local interrupts are forwarded to the PCI.
Fixes: 585562046628 ("gpio: Add GPIO support for the ACCES PCIe-IDIO-24 family") Cc: stable@vger.kernel.org Signed-off-by: Arnaud de Turckheim quarium@gmail.com Reviewed-by: William Breathitt Gray vilhelm.gray@gmail.com Signed-off-by: Bartosz Golaszewski bgolaszewski@baylibre.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpio/gpio-pcie-idio-24.c | 52 ++++++++++++++++++++++++++++++++++++++- 1 file changed, 51 insertions(+), 1 deletion(-)
--- a/drivers/gpio/gpio-pcie-idio-24.c +++ b/drivers/gpio/gpio-pcie-idio-24.c @@ -28,6 +28,47 @@ #include <linux/spinlock.h> #include <linux/types.h>
+/* + * PLX PEX8311 PCI LCS_INTCSR Interrupt Control/Status + * + * Bit: Description + * 0: Enable Interrupt Sources (Bit 0) + * 1: Enable Interrupt Sources (Bit 1) + * 2: Generate Internal PCI Bus Internal SERR# Interrupt + * 3: Mailbox Interrupt Enable + * 4: Power Management Interrupt Enable + * 5: Power Management Interrupt + * 6: Slave Read Local Data Parity Check Error Enable + * 7: Slave Read Local Data Parity Check Error Status + * 8: Internal PCI Wire Interrupt Enable + * 9: PCI Express Doorbell Interrupt Enable + * 10: PCI Abort Interrupt Enable + * 11: Local Interrupt Input Enable + * 12: Retry Abort Enable + * 13: PCI Express Doorbell Interrupt Active + * 14: PCI Abort Interrupt Active + * 15: Local Interrupt Input Active + * 16: Local Interrupt Output Enable + * 17: Local Doorbell Interrupt Enable + * 18: DMA Channel 0 Interrupt Enable + * 19: DMA Channel 1 Interrupt Enable + * 20: Local Doorbell Interrupt Active + * 21: DMA Channel 0 Interrupt Active + * 22: DMA Channel 1 Interrupt Active + * 23: Built-In Self-Test (BIST) Interrupt Active + * 24: Direct Master was the Bus Master during a Master or Target Abort + * 25: DMA Channel 0 was the Bus Master during a Master or Target Abort + * 26: DMA Channel 1 was the Bus Master during a Master or Target Abort + * 27: Target Abort after internal 256 consecutive Master Retrys + * 28: PCI Bus wrote data to LCS_MBOX0 + * 29: PCI Bus wrote data to LCS_MBOX1 + * 30: PCI Bus wrote data to LCS_MBOX2 + * 31: PCI Bus wrote data to LCS_MBOX3 + */ +#define PLX_PEX8311_PCI_LCS_INTCSR 0x68 +#define INTCSR_INTERNAL_PCI_WIRE BIT(8) +#define INTCSR_LOCAL_INPUT BIT(11) + /** * struct idio_24_gpio_reg - GPIO device registers structure * @out0_7: Read: FET Outputs 0-7 @@ -92,6 +133,7 @@ struct idio_24_gpio_reg { struct idio_24_gpio { struct gpio_chip chip; raw_spinlock_t lock; + __u8 __iomem *plx; struct idio_24_gpio_reg __iomem *reg; unsigned long irq_mask; }; @@ -455,6 +497,7 @@ static int idio_24_probe(struct pci_dev struct device *const dev = &pdev->dev; struct idio_24_gpio *idio24gpio; int err; + const size_t pci_plx_bar_index = 1; const size_t pci_bar_index = 2; const char *const name = pci_name(pdev); struct gpio_irq_chip *girq; @@ -469,12 +512,13 @@ static int idio_24_probe(struct pci_dev return err; }
- err = pcim_iomap_regions(pdev, BIT(pci_bar_index), name); + err = pcim_iomap_regions(pdev, BIT(pci_plx_bar_index) | BIT(pci_bar_index), name); if (err) { dev_err(dev, "Unable to map PCI I/O addresses (%d)\n", err); return err; }
+ idio24gpio->plx = pcim_iomap_table(pdev)[pci_plx_bar_index]; idio24gpio->reg = pcim_iomap_table(pdev)[pci_bar_index];
idio24gpio->chip.label = name; @@ -504,6 +548,12 @@ static int idio_24_probe(struct pci_dev
/* Software board reset */ iowrite8(0, &idio24gpio->reg->soft_reset); + /* + * enable PLX PEX8311 internal PCI wire interrupt and local interrupt + * input + */ + iowrite8((INTCSR_INTERNAL_PCI_WIRE | INTCSR_LOCAL_INPUT) >> 8, + idio24gpio->plx + PLX_PEX8311_PCI_LCS_INTCSR + 1);
err = devm_gpiochip_add_data(dev, &idio24gpio->chip, idio24gpio); if (err) {
From: Yangbo Lu yangbo.lu@nxp.com
commit 71b053276a87ddfa40c8f236315d81543219bfb9 upstream.
Apply erratum workaround of unreliable pulse width detection to more affected platforms (LX2160A Rev2.0 and LS1028A Rev1.0).
Signed-off-by: Yangbo Lu yangbo.lu@nxp.com Fixes: 48e304cc1970 ("mmc: sdhci-of-esdhc: workaround for unreliable pulse width detection") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20201110071314.3868-1-yangbo.lu@nxp.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/mmc/host/sdhci-of-esdhc.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/mmc/host/sdhci-of-esdhc.c +++ b/drivers/mmc/host/sdhci-of-esdhc.c @@ -1324,6 +1324,8 @@ static struct soc_device_attribute soc_f
static struct soc_device_attribute soc_unreliable_pulse_detection[] = { { .family = "QorIQ LX2160A", .revision = "1.0", }, + { .family = "QorIQ LX2160A", .revision = "2.0", }, + { .family = "QorIQ LS1028A", .revision = "1.0", }, { }, };
From: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com
commit e8973201d9b281375b5a8c66093de5679423021a upstream.
The commit 94b110aff867 ("mmc: tmio: add tmio_mmc_host_alloc/free()") added tmio_mmc_host_free(), but missed the function calling in the sh_mobile_sdhi_remove() at that time. So, fix it. Otherwise, we cannot rebind the sdhi/mmc devices when we use aliases of mmc.
Fixes: 94b110aff867 ("mmc: tmio: add tmio_mmc_host_alloc/free()") Signed-off-by: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com Reviewed-by: Wolfram Sang wsa+renesas@sang-engineering.com Tested-by: Wolfram Sang wsa+renesas@sang-engineering.com Reviewed-by: Niklas Söderlund niklas.soderlund+renesas@ragnatech.se Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1604654730-29914-1-git-send-email-yoshihiro.shimod... Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/mmc/host/renesas_sdhi_core.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/mmc/host/renesas_sdhi_core.c +++ b/drivers/mmc/host/renesas_sdhi_core.c @@ -997,6 +997,7 @@ int renesas_sdhi_remove(struct platform_
tmio_mmc_host_remove(host); renesas_sdhi_clk_disable(host); + tmio_mmc_host_free(host);
return 0; }
From: Al Viro viro@zeniv.linux.org.uk
commit 77f6ab8b7768cf5e6bdd0e72499270a0671506ee upstream.
Coredump logics needs to report not only the registers of the dumping thread, but (since 2.5.43) those of other threads getting killed.
Doing that might require extra state saved on the stack in asm glue at kernel entry; signal delivery logics does that (we need to be able to save sigcontext there, at the very least) and so does seccomp.
That covers all callers of do_coredump(). Secondary threads get hit with SIGKILL and caught as soon as they reach exit_mm(), which normally happens in signal delivery, so those are also fine most of the time. Unfortunately, it is possible to end up with secondary zapped when it has already entered exit(2) (or, worse yet, is oopsing). In those cases we reach exit_mm() when mm->core_state is already set, but the stack contents is not what we would have in signal delivery.
At least on two architectures (alpha and m68k) it leads to infoleaks - we end up with a chunk of kernel stack written into coredump, with the contents consisting of normal C stack frames of the call chain leading to exit_mm() instead of the expected copy of userland registers. In case of alpha we leak 312 bytes of stack. Other architectures (including the regset-using ones) might have similar problems - the normal user of regsets is ptrace and the state of tracee at the time of such calls is special in the same way signal delivery is.
Note that had the zapper gotten to the exiting thread slightly later, it wouldn't have been included into coredump anyway - we skip the threads that have already cleared their ->mm. So let's pretend that zapper always loses the race. IOW, have exit_mm() only insert into the dumper list if we'd gotten there from handling a fatal signal[*]
As the result, the callers of do_exit() that have *not* gone through get_signal() are not seen by coredump logics as secondary threads. Which excludes voluntary exit()/oopsen/traps/etc. The dumper thread itself is unaffected by that, so seccomp is fine.
[*] originally I intended to add a new flag in tsk->flags, but ebiederman pointed out that PF_SIGNALED is already doing just what we need.
Cc: stable@vger.kernel.org Fixes: d89f3847def4 ("[PATCH] thread-aware coredumps, 2.5.43-C3") History-tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Acked-by: "Eric W. Biederman" ebiederm@xmission.com Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/exit.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/kernel/exit.c +++ b/kernel/exit.c @@ -454,7 +454,10 @@ static void exit_mm(void) mmap_read_unlock(mm);
self.task = current; - self.next = xchg(&core_state->dumper.next, &self); + if (self.task->flags & PF_SIGNALED) + self.next = xchg(&core_state->dumper.next, &self); + else + self.task = NULL; /* * Implies mb(), the result of xchg() must be visible * to core_state->dumper.
From: Bhawanpreet Lakha Bhawanpreet.Lakha@amd.com
commit a422490a595600659664901b609aacccdbba4a5f upstream.
If we have more than 4 displays we will run into dummy irq calls or flip timout issues.
Signed-off-by: Bhawanpreet Lakha Bhawanpreet.Lakha@amd.com Reviewed-by: Charlene Liu Charlene.Liu@amd.com Acked-by: Qingqing Zhuo qingqing.zhuo@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org # 5.9.x Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/amd/display/dc/irq/dcn30/irq_service_dcn30.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/amd/display/dc/irq/dcn30/irq_service_dcn30.c +++ b/drivers/gpu/drm/amd/display/dc/irq/dcn30/irq_service_dcn30.c @@ -306,8 +306,8 @@ irq_source_info_dcn30[DAL_IRQ_SOURCES_NU pflip_int_entry(1), pflip_int_entry(2), pflip_int_entry(3), - [DC_IRQ_SOURCE_PFLIP5] = dummy_irq_entry(), - [DC_IRQ_SOURCE_PFLIP6] = dummy_irq_entry(), + pflip_int_entry(4), + pflip_int_entry(5), [DC_IRQ_SOURCE_PFLIP_UNDERLAY0] = dummy_irq_entry(), gpio_pad_int_entry(0), gpio_pad_int_entry(1),
From: Venkata Sandeep Dhanalakota venkata.s.dhanalakota@intel.com
commit 5ce6861d36ed5207aff9e5eead4c7cc38a986586 upstream.
SFC capability of video engines is not set correctly because i915 is testing for incorrect bits.
Fixes: c5d3e39caa45 ("drm/i915: Engine discovery query") Cc: Matt Roper matthew.d.roper@intel.com Cc: Tvrtko Ursulin tvrtko.ursulin@intel.com Signed-off-by: Venkata Sandeep Dhanalakota venkata.s.dhanalakota@intel.com Signed-off-by: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Cc: stable@vger.kernel.org # v5.3+ Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Link: https://patchwork.freedesktop.org/patch/msgid/20201106011842.36203-1-daniele... (cherry picked from commit ad18fa0f5f052046cad96fee762b5c64f42dd86a) Signed-off-by: Rodrigo Vivi rodrigo.vivi@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -370,7 +370,8 @@ static void __setup_engine_capabilities( * instances. */ if ((INTEL_GEN(i915) >= 11 && - engine->gt->info.vdbox_sfc_access & engine->mask) || + (engine->gt->info.vdbox_sfc_access & + BIT(engine->instance))) || (INTEL_GEN(i915) >= 9 && engine->instance == 0)) engine->uabi_capabilities |= I915_VIDEO_AND_ENHANCE_CLASS_CAPABILITY_SFC;
From: Thomas Zimmermann tzimmermann@suse.de
commit 06ad8d339524bf94b89859047822c31df6ace239 upstream.
The gma500 driver expects 3 pipelines in several it's IRQ functions. Accessing struct drm_device.vblank[], this fails with devices that only have 2 pipelines. An example KASAN report is shown below.
[ 62.267688] ================================================================== [ 62.268856] BUG: KASAN: slab-out-of-bounds in psb_irq_postinstall+0x250/0x3c0 [gma500_gfx] [ 62.269450] Read of size 1 at addr ffff8880012bc6d0 by task systemd-udevd/285 [ 62.269949] [ 62.270192] CPU: 0 PID: 285 Comm: systemd-udevd Tainted: G E 5.10.0-rc1-1-default+ #572 [ 62.270807] Hardware name: /DN2800MT, BIOS MTCDT10N.86A.0164.2012.1213.1024 12/13/2012 [ 62.271366] Call Trace: [ 62.271705] dump_stack+0xae/0xe5 [ 62.272180] print_address_description.constprop.0+0x17/0xf0 [ 62.272987] ? psb_irq_postinstall+0x250/0x3c0 [gma500_gfx] [ 62.273474] __kasan_report.cold+0x20/0x38 [ 62.273989] ? psb_irq_postinstall+0x250/0x3c0 [gma500_gfx] [ 62.274460] kasan_report+0x3a/0x50 [ 62.274891] psb_irq_postinstall+0x250/0x3c0 [gma500_gfx] [ 62.275380] drm_irq_install+0x131/0x1f0 <...> [ 62.300751] Allocated by task 285: [ 62.301223] kasan_save_stack+0x1b/0x40 [ 62.301731] __kasan_kmalloc.constprop.0+0xbf/0xd0 [ 62.302293] drmm_kmalloc+0x55/0x100 [ 62.302773] drm_vblank_init+0x77/0x210
Resolve the issue by only handling vblank entries up to the number of CRTCs.
I'm adding a Fixes tag for reference, although the bug has been present since the driver's initial commit.
Signed-off-by: Thomas Zimmermann tzimmermann@suse.de Reviewed-by: Daniel Vetter daniel.vetter@ffwll.ch Fixes: 5c49fd3aa0ab ("gma500: Add the core DRM files and headers") Cc: Alan Cox alan@linux.intel.com Cc: Dave Airlie airlied@redhat.com Cc: Patrik Jakobsson patrik.r.jakobsson@gmail.com Cc: dri-devel@lists.freedesktop.org Cc: stable@vger.kernel.org#v3.3+ Link: https://patchwork.freedesktop.org/patch/msgid/20201105190256.3893-1-tzimmerm... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/gma500/psb_irq.c | 34 ++++++++++++---------------------- 1 file changed, 12 insertions(+), 22 deletions(-)
--- a/drivers/gpu/drm/gma500/psb_irq.c +++ b/drivers/gpu/drm/gma500/psb_irq.c @@ -347,6 +347,7 @@ int psb_irq_postinstall(struct drm_devic { struct drm_psb_private *dev_priv = dev->dev_private; unsigned long irqflags; + unsigned int i;
spin_lock_irqsave(&dev_priv->irqmask_lock, irqflags);
@@ -359,20 +360,12 @@ int psb_irq_postinstall(struct drm_devic PSB_WVDC32(dev_priv->vdc_irq_mask, PSB_INT_ENABLE_R); PSB_WVDC32(0xFFFFFFFF, PSB_HWSTAM);
- if (dev->vblank[0].enabled) - psb_enable_pipestat(dev_priv, 0, PIPE_VBLANK_INTERRUPT_ENABLE); - else - psb_disable_pipestat(dev_priv, 0, PIPE_VBLANK_INTERRUPT_ENABLE); - - if (dev->vblank[1].enabled) - psb_enable_pipestat(dev_priv, 1, PIPE_VBLANK_INTERRUPT_ENABLE); - else - psb_disable_pipestat(dev_priv, 1, PIPE_VBLANK_INTERRUPT_ENABLE); - - if (dev->vblank[2].enabled) - psb_enable_pipestat(dev_priv, 2, PIPE_VBLANK_INTERRUPT_ENABLE); - else - psb_disable_pipestat(dev_priv, 2, PIPE_VBLANK_INTERRUPT_ENABLE); + for (i = 0; i < dev->num_crtcs; ++i) { + if (dev->vblank[i].enabled) + psb_enable_pipestat(dev_priv, i, PIPE_VBLANK_INTERRUPT_ENABLE); + else + psb_disable_pipestat(dev_priv, i, PIPE_VBLANK_INTERRUPT_ENABLE); + }
if (dev_priv->ops->hotplug_enable) dev_priv->ops->hotplug_enable(dev, true); @@ -385,6 +378,7 @@ void psb_irq_uninstall(struct drm_device { struct drm_psb_private *dev_priv = dev->dev_private; unsigned long irqflags; + unsigned int i;
spin_lock_irqsave(&dev_priv->irqmask_lock, irqflags);
@@ -393,14 +387,10 @@ void psb_irq_uninstall(struct drm_device
PSB_WVDC32(0xFFFFFFFF, PSB_HWSTAM);
- if (dev->vblank[0].enabled) - psb_disable_pipestat(dev_priv, 0, PIPE_VBLANK_INTERRUPT_ENABLE); - - if (dev->vblank[1].enabled) - psb_disable_pipestat(dev_priv, 1, PIPE_VBLANK_INTERRUPT_ENABLE); - - if (dev->vblank[2].enabled) - psb_disable_pipestat(dev_priv, 2, PIPE_VBLANK_INTERRUPT_ENABLE); + for (i = 0; i < dev->num_crtcs; ++i) { + if (dev->vblank[i].enabled) + psb_disable_pipestat(dev_priv, i, PIPE_VBLANK_INTERRUPT_ENABLE); + }
dev_priv->vdc_irq_mask &= _PSB_IRQ_SGX_FLAG | _PSB_IRQ_MSVDX_FLAG |
From: J. Bruce Fields bfields@redhat.com
commit 70438afbf17e5194dd607dd17759560a363b7bb4 upstream.
We forgot to unregister the nfs4_xattr_large_entry_shrinker.
That leaves the global list of shrinkers corrupted after unload of the nfs module, after which possibly unrelated code that calls register_shrinker() or unregister_shrinker() gets a BUG() with "supervisor write access in kernel mode".
And similarly for the nfs4_xattr_large_entry_lru.
Reported-by: Kris Karas bugs-a17@moonlit-rail.com Tested-By: Kris Karas bugs-a17@moonlit-rail.com Fixes: 95ad37f90c33 "NFSv4.2: add client side xattr caching." Signed-off-by: J. Bruce Fields bfields@redhat.com CC: stable@vger.kernel.org Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/nfs/nfs42xattr.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/fs/nfs/nfs42xattr.c +++ b/fs/nfs/nfs42xattr.c @@ -1048,8 +1048,10 @@ out4:
void nfs4_xattr_cache_exit(void) { + unregister_shrinker(&nfs4_xattr_large_entry_shrinker); unregister_shrinker(&nfs4_xattr_entry_shrinker); unregister_shrinker(&nfs4_xattr_cache_shrinker); + list_lru_destroy(&nfs4_xattr_large_entry_lru); list_lru_destroy(&nfs4_xattr_entry_lru); list_lru_destroy(&nfs4_xattr_cache_lru); kmem_cache_destroy(nfs4_xattr_cache_cachep);
From: Coiby Xu coiby.xu@gmail.com
commit c64a6a0d4a928c63e5bc3b485552a8903a506c36 upstream.
RTC is 32.768kHz thus 512 RtcClk equals 15625 usec. The documentation likely has dropped precision and that's why the driver mistakenly took the slightly deviated value.
Cc: stable@vger.kernel.org Reported-by: Andy Shevchenko andy.shevchenko@gmail.com Suggested-by: Andy Shevchenko andy.shevchenko@gmail.com Suggested-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Coiby Xu coiby.xu@gmail.com Reviewed-by: Andy Shevchenko andy.shevchenko@gmail.com Reviewed-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/linux-gpio/2f4706a1-502f-75f0-9596-cc25b4933b6c@redh... Link: https://lore.kernel.org/r/20201105231912.69527-3-coiby.xu@gmail.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pinctrl/pinctrl-amd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/pinctrl/pinctrl-amd.c +++ b/drivers/pinctrl/pinctrl-amd.c @@ -156,7 +156,7 @@ static int amd_gpio_set_debounce(struct pin_reg |= BIT(DB_TMR_OUT_UNIT_OFF); pin_reg &= ~BIT(DB_TMR_LARGE_OFF); } else if (debounce < 250000) { - time = debounce / 15600; + time = debounce / 15625; pin_reg |= time & DB_TMR_OUT_MASK; pin_reg &= ~BIT(DB_TMR_OUT_UNIT_OFF); pin_reg |= BIT(DB_TMR_LARGE_OFF);
From: Coiby Xu coiby.xu@gmail.com
commit 06abe8291bc31839950f7d0362d9979edc88a666 upstream.
The correct way to disable debounce filter is to clear bit 5 and 6 of the register.
Cc: stable@vger.kerne.org Signed-off-by: Coiby Xu coiby.xu@gmail.com Reviewed-by: Hans de Goede hdegoede@redhat.com Cc: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/linux-gpio/df2c008b-e7b5-4fdd-42ea-4d1c62b52139@redh... Link: https://lore.kernel.org/r/20201105231912.69527-2-coiby.xu@gmail.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pinctrl/pinctrl-amd.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/pinctrl/pinctrl-amd.c +++ b/drivers/pinctrl/pinctrl-amd.c @@ -166,14 +166,14 @@ static int amd_gpio_set_debounce(struct pin_reg |= BIT(DB_TMR_OUT_UNIT_OFF); pin_reg |= BIT(DB_TMR_LARGE_OFF); } else { - pin_reg &= ~DB_CNTRl_MASK; + pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF); ret = -EINVAL; } } else { pin_reg &= ~BIT(DB_TMR_OUT_UNIT_OFF); pin_reg &= ~BIT(DB_TMR_LARGE_OFF); pin_reg &= ~DB_TMR_OUT_MASK; - pin_reg &= ~DB_CNTRl_MASK; + pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF); } writel(pin_reg, gpio_dev->base + offset * 4); raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
From: Stefano Stabellini stefano.stabellini@xilinx.com
commit e9696d259d0fb5d239e8c28ca41089838ea76d13 upstream.
kernel/dma/swiotlb.c:swiotlb_init gets called first and tries to allocate a buffer for the swiotlb. It does so by calling
memblock_alloc_low(PAGE_ALIGN(bytes), PAGE_SIZE);
If the allocation must fail, no_iotlb_memory is set.
Later during initialization swiotlb-xen comes in (drivers/xen/swiotlb-xen.c:xen_swiotlb_init) and given that io_tlb_start is != 0, it thinks the memory is ready to use when actually it is not.
When the swiotlb is actually needed, swiotlb_tbl_map_single gets called and since no_iotlb_memory is set the kernel panics.
Instead, if swiotlb-xen.c:xen_swiotlb_init knew the swiotlb hadn't been initialized, it would do the initialization itself, which might still succeed.
Fix the panic by setting io_tlb_start to 0 on swiotlb initialization failure, and also by setting no_iotlb_memory to false on swiotlb initialization success.
Fixes: ac2cbab21f31 ("x86: Don't panic if can not alloc buffer for swiotlb")
Reported-by: Elliott Mitchell ehem+xen@m5p.com Tested-by: Elliott Mitchell ehem+xen@m5p.com Signed-off-by: Stefano Stabellini stefano.stabellini@xilinx.com Reviewed-by: Christoph Hellwig hch@lst.de Cc: stable@vger.kernel.org Signed-off-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/dma/swiotlb.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -231,6 +231,7 @@ int __init swiotlb_init_with_tbl(char *t io_tlb_orig_addr[i] = INVALID_PHYS_ADDR; } io_tlb_index = 0; + no_iotlb_memory = false;
if (verbose) swiotlb_print_info(); @@ -262,9 +263,11 @@ swiotlb_init(int verbose) if (vstart && !swiotlb_init_with_tbl(vstart, io_tlb_nslabs, verbose)) return;
- if (io_tlb_start) + if (io_tlb_start) { memblock_free_early(io_tlb_start, PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT)); + io_tlb_start = 0; + } pr_warn("Cannot allocate buffer"); no_iotlb_memory = true; } @@ -362,6 +365,7 @@ swiotlb_late_init_with_tbl(char *tlb, un io_tlb_orig_addr[i] = INVALID_PHYS_ADDR; } io_tlb_index = 0; + no_iotlb_memory = false;
swiotlb_print_info();
From: Rafael J. Wysocki rafael.j.wysocki@intel.com
commit 9a2a9ebc0a758d887ee06e067e9f7f0b36ff7574 upstream.
A new cpufreq governor flag will be added subsequently, so replace the bool dynamic_switching fleid in struct cpufreq_governor with a flags field and introduce CPUFREQ_GOV_DYNAMIC_SWITCHING to set for the "dynamic switching" governors instead of it.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Acked-by: Viresh Kumar viresh.kumar@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/cpufreq/cpufreq.c | 2 +- drivers/cpufreq/cpufreq_governor.h | 2 +- include/linux/cpufreq.h | 9 +++++++-- kernel/sched/cpufreq_schedutil.c | 2 +- 4 files changed, 10 insertions(+), 5 deletions(-)
--- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -2233,7 +2233,7 @@ static int cpufreq_init_governor(struct return -EINVAL;
/* Platform doesn't want dynamic frequency switching ? */ - if (policy->governor->dynamic_switching && + if (policy->governor->flags & CPUFREQ_GOV_DYNAMIC_SWITCHING && cpufreq_driver->flags & CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING) { struct cpufreq_governor *gov = cpufreq_fallback_governor();
--- a/drivers/cpufreq/cpufreq_governor.h +++ b/drivers/cpufreq/cpufreq_governor.h @@ -156,7 +156,7 @@ void cpufreq_dbs_governor_limits(struct #define CPUFREQ_DBS_GOVERNOR_INITIALIZER(_name_) \ { \ .name = _name_, \ - .dynamic_switching = true, \ + .flags = CPUFREQ_GOV_DYNAMIC_SWITCHING, \ .owner = THIS_MODULE, \ .init = cpufreq_dbs_governor_init, \ .exit = cpufreq_dbs_governor_exit, \ --- a/include/linux/cpufreq.h +++ b/include/linux/cpufreq.h @@ -565,12 +565,17 @@ struct cpufreq_governor { char *buf); int (*store_setspeed) (struct cpufreq_policy *policy, unsigned int freq); - /* For governors which change frequency dynamically by themselves */ - bool dynamic_switching; struct list_head governor_list; struct module *owner; + u8 flags; };
+/* Governor flags */ + +/* For governors which change frequency dynamically by themselves */ +#define CPUFREQ_GOV_DYNAMIC_SWITCHING BIT(0) + + /* Pass a target to the cpufreq driver */ unsigned int cpufreq_driver_fast_switch(struct cpufreq_policy *policy, unsigned int target_freq); --- a/kernel/sched/cpufreq_schedutil.c +++ b/kernel/sched/cpufreq_schedutil.c @@ -896,7 +896,7 @@ static void sugov_limits(struct cpufreq_ struct cpufreq_governor schedutil_gov = { .name = "schedutil", .owner = THIS_MODULE, - .dynamic_switching = true, + .flags = CPUFREQ_GOV_DYNAMIC_SWITCHING, .init = sugov_init, .exit = sugov_exit, .start = sugov_start,
From: Rafael J. Wysocki rafael.j.wysocki@intel.com
commit 218f66870181bec7aaa6e3c72f346039c590c3c2 upstream.
Introduce a new governor flag, CPUFREQ_GOV_STRICT_TARGET, for the governors that want the target frequency to be set exactly to the given value without leaving any room for adjustments on the hardware side and set this flag for the powersave and performance governors.
Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Acked-by: Viresh Kumar viresh.kumar@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/cpufreq/cpufreq_performance.c | 1 + drivers/cpufreq/cpufreq_powersave.c | 1 + include/linux/cpufreq.h | 3 +++ 3 files changed, 5 insertions(+)
--- a/drivers/cpufreq/cpufreq_performance.c +++ b/drivers/cpufreq/cpufreq_performance.c @@ -20,6 +20,7 @@ static void cpufreq_gov_performance_limi static struct cpufreq_governor cpufreq_gov_performance = { .name = "performance", .owner = THIS_MODULE, + .flags = CPUFREQ_GOV_STRICT_TARGET, .limits = cpufreq_gov_performance_limits, };
--- a/drivers/cpufreq/cpufreq_powersave.c +++ b/drivers/cpufreq/cpufreq_powersave.c @@ -21,6 +21,7 @@ static struct cpufreq_governor cpufreq_g .name = "powersave", .limits = cpufreq_gov_powersave_limits, .owner = THIS_MODULE, + .flags = CPUFREQ_GOV_STRICT_TARGET, };
MODULE_AUTHOR("Dominik Brodowski linux@brodo.de"); --- a/include/linux/cpufreq.h +++ b/include/linux/cpufreq.h @@ -575,6 +575,9 @@ struct cpufreq_governor { /* For governors which change frequency dynamically by themselves */ #define CPUFREQ_GOV_DYNAMIC_SWITCHING BIT(0)
+/* For governors wanting the target frequency to be set exactly */ +#define CPUFREQ_GOV_STRICT_TARGET BIT(1) +
/* Pass a target to the cpufreq driver */ unsigned int cpufreq_driver_fast_switch(struct cpufreq_policy *policy,
From: Rafael J. Wysocki rafael.j.wysocki@intel.com
commit ea9364bbadf11f0c55802cf11387d74f524cee84 upstream.
Add a new field to be set when the CPUFREQ_GOV_STRICT_TARGET flag is set for the current governor to struct cpufreq_policy, so that the drivers needing to check CPUFREQ_GOV_STRICT_TARGET do not have to access the governor object during every frequency transition.
Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Acked-by: Viresh Kumar viresh.kumar@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/cpufreq/cpufreq.c | 2 ++ include/linux/cpufreq.h | 6 ++++++ 2 files changed, 8 insertions(+)
--- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -2259,6 +2259,8 @@ static int cpufreq_init_governor(struct } }
+ policy->strict_target = !!(policy->governor->flags & CPUFREQ_GOV_STRICT_TARGET); + return 0; }
--- a/include/linux/cpufreq.h +++ b/include/linux/cpufreq.h @@ -110,6 +110,12 @@ struct cpufreq_policy { bool fast_switch_enabled;
/* + * Set if the CPUFREQ_GOV_STRICT_TARGET flag is set for the current + * governor. + */ + bool strict_target; + + /* * Preferred average time interval between consecutive invocations of * the driver to set the frequency for this policy. To be set by the * scaling driver (0, which is the default, means no preference).
From: Rafael J. Wysocki rafael.j.wysocki@intel.com
commit fcb3a1ab79904d54499db77017793ccca665eb7e upstream.
Make intel_pstate take the new CPUFREQ_GOV_STRICT_TARGET governor flag into account when it operates in the passive mode with HWP enabled, so as to fix the "powersave" governor behavior in that case (currently, HWP is allowed to scale the performance all the way up to the policy max limit when the "powersave" governor is used, but it should be constrained to the policy min limit then).
Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled") Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Acked-by: Viresh Kumar viresh.kumar@linaro.org Cc: 5.9+ stable@vger.kernel.org # 5.9+: 9a2a9ebc0a75 cpufreq: Introduce governor flags Cc: 5.9+ stable@vger.kernel.org # 5.9+: 218f66870181 cpufreq: Introduce CPUFREQ_GOV_STRICT_TARGET Cc: 5.9+ stable@vger.kernel.org # 5.9+: ea9364bbadf1 cpufreq: Add strict_target to struct cpufreq_policy Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/cpufreq/intel_pstate.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-)
--- a/drivers/cpufreq/intel_pstate.c +++ b/drivers/cpufreq/intel_pstate.c @@ -2509,7 +2509,7 @@ static void intel_cpufreq_trace(struct c }
static void intel_cpufreq_adjust_hwp(struct cpudata *cpu, u32 target_pstate, - bool fast_switch) + bool strict, bool fast_switch) { u64 prev = READ_ONCE(cpu->hwp_req_cached), value = prev;
@@ -2521,7 +2521,7 @@ static void intel_cpufreq_adjust_hwp(str * field in it, so opportunistically update the max too if needed. */ value &= ~HWP_MAX_PERF(~0L); - value |= HWP_MAX_PERF(cpu->max_perf_ratio); + value |= HWP_MAX_PERF(strict ? target_pstate : cpu->max_perf_ratio);
if (value == prev) return; @@ -2544,14 +2544,16 @@ static void intel_cpufreq_adjust_perf_ct pstate_funcs.get_val(cpu, target_pstate)); }
-static int intel_cpufreq_update_pstate(struct cpudata *cpu, int target_pstate, - bool fast_switch) +static int intel_cpufreq_update_pstate(struct cpufreq_policy *policy, + int target_pstate, bool fast_switch) { + struct cpudata *cpu = all_cpu_data[policy->cpu]; int old_pstate = cpu->pstate.current_pstate;
target_pstate = intel_pstate_prepare_request(cpu, target_pstate); if (hwp_active) { - intel_cpufreq_adjust_hwp(cpu, target_pstate, fast_switch); + intel_cpufreq_adjust_hwp(cpu, target_pstate, + policy->strict_target, fast_switch); cpu->pstate.current_pstate = target_pstate; } else if (target_pstate != old_pstate) { intel_cpufreq_adjust_perf_ctl(cpu, target_pstate, fast_switch); @@ -2591,7 +2593,7 @@ static int intel_cpufreq_target(struct c break; }
- target_pstate = intel_cpufreq_update_pstate(cpu, target_pstate, false); + target_pstate = intel_cpufreq_update_pstate(policy, target_pstate, false);
freqs.new = target_pstate * cpu->pstate.scaling;
@@ -2610,7 +2612,7 @@ static unsigned int intel_cpufreq_fast_s
target_pstate = DIV_ROUND_UP(target_freq, cpu->pstate.scaling);
- target_pstate = intel_cpufreq_update_pstate(cpu, target_pstate, true); + target_pstate = intel_cpufreq_update_pstate(policy, target_pstate, true);
return target_pstate * cpu->pstate.scaling; }
From: Alexander Lobakin alobakin@pm.me
[ Upstream commit 413691384a37fe27f43460226c4160e33140e638 ]
After updating userspace Ethtool from 5.7 to 5.9, I noticed that NETDEV_FEAT_CHANGE is no more raised when changing netdev features through Ethtool. That's because the old Ethtool ioctl interface always calls netdev_features_change() at the end of user request processing to inform the kernel that our netdevice has some features changed, but the new Netlink interface does not. Instead, it just notifies itself with ETHTOOL_MSG_FEATURES_NTF. Replace this ethtool_notify() call with netdev_features_change(), so the kernel will be aware of any features changes, just like in case with the ioctl interface. This does not omit Ethtool notifications, as Ethtool itself listens to NETDEV_FEAT_CHANGE and drops ETHTOOL_MSG_FEATURES_NTF on it (net/ethtool/netlink.c:ethnl_netdev_event()).
From v1 [1]:
- dropped extra new line as advised by Jakub; - no functional changes.
[1] https://lore.kernel.org/netdev/AlZXQ2o5uuTVHCfNGOiGgJ8vJ3KgO5YIWAnQjH0cDE@cp...
Fixes: 0980bfcd6954 ("ethtool: set netdev features with FEATURES_SET request") Signed-off-by: Alexander Lobakin alobakin@pm.me Reviewed-by: Michal Kubecek mkubecek@suse.cz Link: https://lore.kernel.org/r/ahA2YWXYICz5rbUSQqNG4roJ8OlJzzYQX7PTiG80@cp4-web-0... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ethtool/features.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/ethtool/features.c +++ b/net/ethtool/features.c @@ -296,7 +296,7 @@ int ethnl_set_features(struct sk_buff *s active_diff_mask, compact); } if (mod) - ethtool_notify(dev, ETHTOOL_MSG_FEATURES_NTF, NULL); + netdev_features_change(dev);
out_rtnl: rtnl_unlock();
From: Oliver Herms oliver.peter.herms@gmail.com
[ Upstream commit 8ef9ba4d666614497a057d09b0a6eafc1e34eadf ]
Due to the legacy usage of hard_header_len for SIT tunnels while already using infrastructure from net/ipv4/ip_tunnel.c the calculation of the path MTU in tnl_update_pmtu is incorrect. This leads to unnecessary creation of MTU exceptions for any flow going over a SIT tunnel.
As SIT tunnels do not have a header themsevles other than their transport (L3, L2) headers we're leaving hard_header_len set to zero as tnl_update_pmtu is already taking care of the transport headers sizes.
This will also help avoiding unnecessary IPv6 GC runs and spinlock contention seen when using SIT tunnels and for more than net.ipv6.route.gc_thresh flows.
Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.") Signed-off-by: Oliver Herms oliver.peter.herms@gmail.com Acked-by: Willem de Bruijn willemb@google.com Link: https://lore.kernel.org/r/20201103104133.GA1573211@tws Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/sit.c | 2 -- 1 file changed, 2 deletions(-)
--- a/net/ipv6/sit.c +++ b/net/ipv6/sit.c @@ -1128,7 +1128,6 @@ static void ipip6_tunnel_bind_dev(struct if (tdev && !netif_is_l3_master(tdev)) { int t_hlen = tunnel->hlen + sizeof(struct iphdr);
- dev->hard_header_len = tdev->hard_header_len + sizeof(struct iphdr); dev->mtu = tdev->mtu - t_hlen; if (dev->mtu < IPV6_MIN_MTU) dev->mtu = IPV6_MIN_MTU; @@ -1426,7 +1425,6 @@ static void ipip6_tunnel_setup(struct ne dev->priv_destructor = ipip6_dev_free;
dev->type = ARPHRD_SIT; - dev->hard_header_len = LL_MAX_HEADER + t_hlen; dev->mtu = ETH_DATA_LEN - t_hlen; dev->min_mtu = IPV6_MIN_MTU; dev->max_mtu = IP6_MAX_MTU - t_hlen;
From: Ursula Braun ubraun@linux.ibm.com
[ Upstream commit 4031eeafa71eaf22ae40a15606a134ae86345daf ]
syzbot reported the following KASAN finding:
BUG: KASAN: nullptr-dereference in iucv_send_ctrl+0x390/0x3f0 net/iucv/af_iucv.c:385 Read of size 2 at addr 000000000000021e by task syz-executor907/519
CPU: 0 PID: 519 Comm: syz-executor907 Not tainted 5.9.0-syzkaller-07043-gbcf9877ad213 #0 Hardware name: IBM 3906 M04 701 (KVM/Linux) Call Trace: [<00000000c576af60>] unwind_start arch/s390/include/asm/unwind.h:65 [inline] [<00000000c576af60>] show_stack+0x180/0x228 arch/s390/kernel/dumpstack.c:135 [<00000000c9dcd1f8>] __dump_stack lib/dump_stack.c:77 [inline] [<00000000c9dcd1f8>] dump_stack+0x268/0x2f0 lib/dump_stack.c:118 [<00000000c5fed016>] print_address_description.constprop.0+0x5e/0x218 mm/kasan/report.c:383 [<00000000c5fec82a>] __kasan_report mm/kasan/report.c:517 [inline] [<00000000c5fec82a>] kasan_report+0x11a/0x168 mm/kasan/report.c:534 [<00000000c98b5b60>] iucv_send_ctrl+0x390/0x3f0 net/iucv/af_iucv.c:385 [<00000000c98b6262>] iucv_sock_shutdown+0x44a/0x4c0 net/iucv/af_iucv.c:1457 [<00000000c89d3a54>] __sys_shutdown+0x12c/0x1c8 net/socket.c:2204 [<00000000c89d3b70>] __do_sys_shutdown net/socket.c:2212 [inline] [<00000000c89d3b70>] __s390x_sys_shutdown+0x38/0x48 net/socket.c:2210 [<00000000c9e36eac>] system_call+0xe0/0x28c arch/s390/kernel/entry.S:415
There is nothing to shutdown if a connection has never been established. Besides that iucv->hs_dev is not yet initialized if a socket is in IUCV_OPEN state and iucv->path is not yet initialized if socket is in IUCV_BOUND state. So, just skip the shutdown calls for a socket in these states.
Fixes: eac3731bd04c ("[S390]: Add AF_IUCV socket support") Fixes: 82492a355fac ("af_iucv: add shutdown for HS transport") Reviewed-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Ursula Braun ubraun@linux.ibm.com [jwi: correct one Fixes tag] Signed-off-by: Julian Wiedmann jwi@linux.ibm.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/iucv/af_iucv.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/iucv/af_iucv.c +++ b/net/iucv/af_iucv.c @@ -1434,7 +1434,8 @@ static int iucv_sock_shutdown(struct soc break; }
- if (how == SEND_SHUTDOWN || how == SHUTDOWN_MASK) { + if ((how == SEND_SHUTDOWN || how == SHUTDOWN_MASK) && + sk->sk_state == IUCV_CONNECTED) { if (iucv->transport == AF_IUCV_TRANS_IUCV) { txmsg.class = 0; txmsg.tag = 0;
From: Alexander Lobakin alobakin@pm.me
[ Upstream commit 55e729889bb07d68ab071660ce3f5e7a7872ebe8 ]
udp{4,6}_lib_lookup_skb() use ip{,v6}_hdr() to get IP header of the packet. While it's probably OK for non-frag0 paths, this helpers will also point to junk on Fast/frag0 GRO when all headers are located in frags. As a result, sk/skb lookup may fail or give wrong results. To support both GRO modes, skb_gro_network_header() might be used. To not modify original functions, add private versions of udp{4,6}_lib_lookup_skb() only to perform correct sk lookups on GRO.
Present since the introduction of "application-level" UDP GRO in 4.7-rc1.
Misc: replace totally unneeded ternaries with plain ifs.
Fixes: a6024562ffd7 ("udp: Add GRO functions to UDP socket") Suggested-by: Willem de Bruijn willemb@google.com Cc: Eric Dumazet edumazet@google.com Signed-off-by: Alexander Lobakin alobakin@pm.me Acked-by: Willem de Bruijn willemb@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/udp_offload.c | 17 +++++++++++++++-- net/ipv6/udp_offload.c | 17 +++++++++++++++-- 2 files changed, 30 insertions(+), 4 deletions(-)
--- a/net/ipv4/udp_offload.c +++ b/net/ipv4/udp_offload.c @@ -500,12 +500,22 @@ out: } EXPORT_SYMBOL(udp_gro_receive);
+static struct sock *udp4_gro_lookup_skb(struct sk_buff *skb, __be16 sport, + __be16 dport) +{ + const struct iphdr *iph = skb_gro_network_header(skb); + + return __udp4_lib_lookup(dev_net(skb->dev), iph->saddr, sport, + iph->daddr, dport, inet_iif(skb), + inet_sdif(skb), &udp_table, NULL); +} + INDIRECT_CALLABLE_SCOPE struct sk_buff *udp4_gro_receive(struct list_head *head, struct sk_buff *skb) { struct udphdr *uh = udp_gro_udphdr(skb); + struct sock *sk = NULL; struct sk_buff *pp; - struct sock *sk;
if (unlikely(!uh)) goto flush; @@ -523,7 +533,10 @@ struct sk_buff *udp4_gro_receive(struct skip: NAPI_GRO_CB(skb)->is_ipv6 = 0; rcu_read_lock(); - sk = static_branch_unlikely(&udp_encap_needed_key) ? udp4_lib_lookup_skb(skb, uh->source, uh->dest) : NULL; + + if (static_branch_unlikely(&udp_encap_needed_key)) + sk = udp4_gro_lookup_skb(skb, uh->source, uh->dest); + pp = udp_gro_receive(head, skb, uh, sk); rcu_read_unlock(); return pp; --- a/net/ipv6/udp_offload.c +++ b/net/ipv6/udp_offload.c @@ -111,12 +111,22 @@ out: return segs; }
+static struct sock *udp6_gro_lookup_skb(struct sk_buff *skb, __be16 sport, + __be16 dport) +{ + const struct ipv6hdr *iph = skb_gro_network_header(skb); + + return __udp6_lib_lookup(dev_net(skb->dev), &iph->saddr, sport, + &iph->daddr, dport, inet6_iif(skb), + inet6_sdif(skb), &udp_table, NULL); +} + INDIRECT_CALLABLE_SCOPE struct sk_buff *udp6_gro_receive(struct list_head *head, struct sk_buff *skb) { struct udphdr *uh = udp_gro_udphdr(skb); + struct sock *sk = NULL; struct sk_buff *pp; - struct sock *sk;
if (unlikely(!uh)) goto flush; @@ -135,7 +145,10 @@ struct sk_buff *udp6_gro_receive(struct skip: NAPI_GRO_CB(skb)->is_ipv6 = 1; rcu_read_lock(); - sk = static_branch_unlikely(&udpv6_encap_needed_key) ? udp6_lib_lookup_skb(skb, uh->source, uh->dest) : NULL; + + if (static_branch_unlikely(&udpv6_encap_needed_key)) + sk = udp6_gro_lookup_skb(skb, uh->source, uh->dest); + pp = udp_gro_receive(head, skb, uh, sk); rcu_read_unlock(); return pp;
From: Alexander Lobakin alobakin@pm.me
[ Upstream commit 4b1a86281cc1d0de46df3ad2cb8c1f86ac07681c ]
UDP GRO uses udp_hdr(skb) in its .gro_receive() callback. While it's probably OK for non-frag0 paths (when all headers or even the entire frame are already in skb head), this inline points to junk when using Fast GRO (napi_gro_frags() or napi_gro_receive() with only Ethernet header in skb head and all the rest in the frags) and breaks GRO packet compilation and the packet flow itself. To support both modes, skb_gro_header_fast() + skb_gro_header_slow() are typically used. UDP even has an inline helper that makes use of them, udp_gro_udphdr(). Use that instead of troublemaking udp_hdr() to get rid of the out-of-order delivers.
Present since the introduction of plain UDP GRO in 5.0-rc1.
Fixes: e20cf8d3f1f7 ("udp: implement GRO for plain UDP sockets.") Cc: Eric Dumazet edumazet@google.com Signed-off-by: Alexander Lobakin alobakin@pm.me Acked-by: Willem de Bruijn willemb@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/udp_offload.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/ipv4/udp_offload.c +++ b/net/ipv4/udp_offload.c @@ -366,7 +366,7 @@ out: static struct sk_buff *udp_gro_receive_segment(struct list_head *head, struct sk_buff *skb) { - struct udphdr *uh = udp_hdr(skb); + struct udphdr *uh = udp_gro_udphdr(skb); struct sk_buff *pp = NULL; struct udphdr *uh2; struct sk_buff *p;
From: Mao Wenan wenan.mao@linux.alibaba.com
[ Upstream commit 909172a149749242990a6e64cb55d55460d4e417 ]
When net.ipv4.tcp_syncookies=1 and syn flood is happened, cookie_v4_check or cookie_v6_check tries to redo what tcp_v4_send_synack or tcp_v6_send_synack did, rsk_window_clamp will be changed if SOCK_RCVBUF is set, which will make rcv_wscale is different, the client still operates with initial window scale and can overshot granted window, the client use the initial scale but local server use new scale to advertise window value, and session work abnormally.
Fixes: e88c64f0a425 ("tcp: allow effective reduction of TCP's rcv-buffer via setsockopt") Signed-off-by: Mao Wenan wenan.mao@linux.alibaba.com Signed-off-by: Eric Dumazet edumazet@google.com Link: https://lore.kernel.org/r/1604967391-123737-1-git-send-email-wenan.mao@linux... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/syncookies.c | 9 +++++++-- net/ipv6/syncookies.c | 10 ++++++++-- 2 files changed, 15 insertions(+), 4 deletions(-)
--- a/net/ipv4/syncookies.c +++ b/net/ipv4/syncookies.c @@ -331,7 +331,7 @@ struct sock *cookie_v4_check(struct sock __u32 cookie = ntohl(th->ack_seq) - 1; struct sock *ret = sk; struct request_sock *req; - int mss; + int full_space, mss; struct rtable *rt; __u8 rcv_wscale; struct flowi4 fl4; @@ -427,8 +427,13 @@ struct sock *cookie_v4_check(struct sock
/* Try to redo what tcp_v4_send_synack did. */ req->rsk_window_clamp = tp->window_clamp ? :dst_metric(&rt->dst, RTAX_WINDOW); + /* limit the window selection if the user enforce a smaller rx buffer */ + full_space = tcp_full_space(sk); + if (sk->sk_userlocks & SOCK_RCVBUF_LOCK && + (req->rsk_window_clamp > full_space || req->rsk_window_clamp == 0)) + req->rsk_window_clamp = full_space;
- tcp_select_initial_window(sk, tcp_full_space(sk), req->mss, + tcp_select_initial_window(sk, full_space, req->mss, &req->rsk_rcv_wnd, &req->rsk_window_clamp, ireq->wscale_ok, &rcv_wscale, dst_metric(&rt->dst, RTAX_INITRWND)); --- a/net/ipv6/syncookies.c +++ b/net/ipv6/syncookies.c @@ -136,7 +136,7 @@ struct sock *cookie_v6_check(struct sock __u32 cookie = ntohl(th->ack_seq) - 1; struct sock *ret = sk; struct request_sock *req; - int mss; + int full_space, mss; struct dst_entry *dst; __u8 rcv_wscale; u32 tsoff = 0; @@ -241,7 +241,13 @@ struct sock *cookie_v6_check(struct sock }
req->rsk_window_clamp = tp->window_clamp ? :dst_metric(dst, RTAX_WINDOW); - tcp_select_initial_window(sk, tcp_full_space(sk), req->mss, + /* limit the window selection if the user enforce a smaller rx buffer */ + full_space = tcp_full_space(sk); + if (sk->sk_userlocks & SOCK_RCVBUF_LOCK && + (req->rsk_window_clamp > full_space || req->rsk_window_clamp == 0)) + req->rsk_window_clamp = full_space; + + tcp_select_initial_window(sk, full_space, req->mss, &req->rsk_rcv_wnd, &req->rsk_window_clamp, ireq->wscale_ok, &rcv_wscale, dst_metric(dst, RTAX_INITRWND));
From: Martin Schiller ms@dev.tdt.de
[ Upstream commit 361182308766a265b6c521879b34302617a8c209 ]
This fixes a regression for blocking connects introduced by commit 4becb7ee5b3d ("net/x25: Fix x25_neigh refcnt leak when x25 disconnect").
The x25->neighbour is already set to "NULL" by x25_disconnect() now, while a blocking connect is waiting in x25_wait_for_connection_establishment(). Therefore x25->neighbour must not be accessed here again and x25->state is also already set to X25_STATE_0 by x25_disconnect().
Fixes: 4becb7ee5b3d ("net/x25: Fix x25_neigh refcnt leak when x25 disconnect") Signed-off-by: Martin Schiller ms@dev.tdt.de Reviewed-by: Xie He xie.he.0141@gmail.com Link: https://lore.kernel.org/r/20201109065449.9014-1-ms@dev.tdt.de Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/x25/af_x25.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/x25/af_x25.c +++ b/net/x25/af_x25.c @@ -825,7 +825,7 @@ static int x25_connect(struct socket *so sock->state = SS_CONNECTED; rc = 0; out_put_neigh: - if (rc) { + if (rc && x25->neighbour) { read_lock_bh(&x25_list_lock); x25_neigh_put(x25->neighbour); x25->neighbour = NULL;
From: Wang Hai wanghai38@huawei.com
[ Upstream commit fa6882c63621821f73cc806f291208e1c6ea6187 ]
kmemleak report a memory leak as follows:
unreferenced object 0xffff88810a596800 (size 512): comm "ip", pid 21558, jiffies 4297568990 (age 112.120s) hex dump (first 32 bytes): 00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00 .....N.......... ff ff ff ff ff ff ff ff 00 83 60 b0 ff ff ff ff ..........`..... backtrace: [<0000000022bbe21f>] tipc_topsrv_init_net+0x1f3/0xa70 [<00000000fe15ddf7>] ops_init+0xa8/0x3c0 [<00000000138af6f2>] setup_net+0x2de/0x7e0 [<000000008c6807a3>] copy_net_ns+0x27d/0x530 [<000000006b21adbd>] create_new_namespaces+0x382/0xa30 [<00000000bb169746>] unshare_nsproxy_namespaces+0xa1/0x1d0 [<00000000fe2e42bc>] ksys_unshare+0x39c/0x780 [<0000000009ba3b19>] __x64_sys_unshare+0x2d/0x40 [<00000000614ad866>] do_syscall_64+0x56/0xa0 [<00000000a1b5ca3c>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
'srv' is malloced in tipc_topsrv_start() but not free before leaving from the error handling cases. We need to free it.
Fixes: 5c45ab24ac77 ("tipc: make struct tipc_server private for server.c") Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Wang Hai wanghai38@huawei.com Link: https://lore.kernel.org/r/20201109140913.47370-1-wanghai38@huawei.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/tipc/topsrv.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/net/tipc/topsrv.c +++ b/net/tipc/topsrv.c @@ -665,12 +665,18 @@ static int tipc_topsrv_start(struct net
ret = tipc_topsrv_work_start(srv); if (ret < 0) - return ret; + goto err_start;
ret = tipc_topsrv_create_listener(srv); if (ret < 0) - tipc_topsrv_work_stop(srv); + goto err_create;
+ return 0; + +err_create: + tipc_topsrv_work_stop(srv); +err_start: + kfree(srv); return ret; }
From: Parav Pandit parav@nvidia.com
[ Upstream commit 9f73bd1c2c4c304b238051fc92b3f807326f0a89 ]
Cited commit in fixes tag overwrites the port attributes for the registered port.
Avoid such error by checking registered flag before setting attributes.
Fixes: 71ad8d55f8e5 ("devlink: Replace devlink_port_attrs_set parameters with a struct") Signed-off-by: Parav Pandit parav@nvidia.com Reviewed-by: Jiri Pirko jiri@nvidia.com Link: https://lore.kernel.org/r/20201111034744.35554-1-parav@nvidia.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/devlink.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/net/core/devlink.c +++ b/net/core/devlink.c @@ -7675,8 +7675,6 @@ static int __devlink_port_attrs_set(stru { struct devlink_port_attrs *attrs = &devlink_port->attrs;
- if (WARN_ON(devlink_port->registered)) - return -EEXIST; devlink_port->attrs_set = true; attrs->flavour = flavour; if (attrs->switch_id.id_len) { @@ -7700,6 +7698,8 @@ void devlink_port_attrs_set(struct devli { int ret;
+ if (WARN_ON(devlink_port->registered)) + return; devlink_port->attrs = *attrs; ret = __devlink_port_attrs_set(devlink_port, attrs->flavour); if (ret) @@ -7719,6 +7719,8 @@ void devlink_port_attrs_pci_pf_set(struc struct devlink_port_attrs *attrs = &devlink_port->attrs; int ret;
+ if (WARN_ON(devlink_port->registered)) + return; ret = __devlink_port_attrs_set(devlink_port, DEVLINK_PORT_FLAVOUR_PCI_PF); if (ret) @@ -7741,6 +7743,8 @@ void devlink_port_attrs_pci_vf_set(struc struct devlink_port_attrs *attrs = &devlink_port->attrs; int ret;
+ if (WARN_ON(devlink_port->registered)) + return; ret = __devlink_port_attrs_set(devlink_port, DEVLINK_PORT_FLAVOUR_PCI_VF); if (ret)
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit 989ef49bdf100cc772b3a8737089df36b1ab1e30 ]
The mptcp proto struct currently does not provide the required limit for forward memory scheduling. Under pressure sk_rmem_schedule() will unconditionally try to use such field and will oops.
Address the issue inheriting the tcp limit, as we already do for the wmem one.
Fixes: 9c3f94e1681b ("mptcp: add missing memory scheduling in the rx path") Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Matthieu Baerts matthieu.baerts@tessares.net Link: https://lore.kernel.org/r/37af798bd46f402fb7c79f57ebbdd00614f5d7fa.160486109... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/protocol.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2122,6 +2122,7 @@ static struct proto mptcp_prot = { .memory_pressure = &tcp_memory_pressure, .stream_memory_free = mptcp_memory_free, .sysctl_wmem_offset = offsetof(struct net, ipv4.sysctl_tcp_wmem), + .sysctl_rmem_offset = offsetof(struct net, ipv4.sysctl_tcp_rmem), .sysctl_mem = sysctl_tcp_mem, .obj_size = sizeof(struct mptcp_sock), .slab_flags = SLAB_TYPESAFE_BY_RCU,
From: Stefano Brivio sbrivio@redhat.com
[ Upstream commit 77a2d673d5c9d1d359b5652ff75043273c5dea28 ]
Jianlin reports that a bridged IPv6 VXLAN endpoint, carrying IPv6 packets over a link with a PMTU estimation of exactly 1350 bytes, won't trigger ICMPv6 Packet Too Big replies when the encapsulated datagrams exceed said PMTU value. VXLAN over IPv6 adds 70 bytes of overhead, so an ICMPv6 reply indicating 1280 bytes as inner MTU would be legitimate and expected.
This comes from an off-by-one error I introduced in checks added as part of commit 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets"), whose purpose was to prevent sending ICMPv6 Packet Too Big messages with an MTU lower than the smallest permissible IPv6 link MTU, i.e. 1280 bytes.
In iptunnel_pmtud_check_icmpv6(), avoid triggering a reply only if the advertised MTU would be less than, and not equal to, 1280 bytes.
Also fix the analogous comparison for IPv4, that is, skip the ICMP reply only if the resulting MTU is strictly less than 576 bytes.
This becomes apparent while running the net/pmtu.sh bridged VXLAN or GENEVE selftests with adjusted lower-link MTU values. Using e.g. GENEVE, setting ll_mtu to the values reported below, in the test_pmtu_ipvX_over_bridged_vxlanY_or_geneveY_exception() test function, we can see failures on the following tests:
test | ll_mtu -------------------------------|-------- pmtu_ipv4_br_geneve4_exception | 626 pmtu_ipv6_br_geneve4_exception | 1330 pmtu_ipv6_br_geneve6_exception | 1350
owing to the different tunneling overheads implied by the corresponding configurations.
Reported-by: Jianlin Shi jishi@redhat.com Fixes: 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets") Signed-off-by: Stefano Brivio sbrivio@redhat.com Link: https://lore.kernel.org/r/4f5fc2f33bfdf8409549fafd4f952b008bf04d63.160468170... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/ip_tunnel_core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/net/ipv4/ip_tunnel_core.c +++ b/net/ipv4/ip_tunnel_core.c @@ -263,7 +263,7 @@ static int iptunnel_pmtud_check_icmp(str const struct icmphdr *icmph = icmp_hdr(skb); const struct iphdr *iph = ip_hdr(skb);
- if (mtu <= 576 || iph->frag_off != htons(IP_DF)) + if (mtu < 576 || iph->frag_off != htons(IP_DF)) return 0;
if (ipv4_is_lbcast(iph->daddr) || ipv4_is_multicast(iph->daddr) || @@ -359,7 +359,7 @@ static int iptunnel_pmtud_check_icmpv6(s __be16 frag_off; int offset;
- if (mtu <= IPV6_MIN_MTU) + if (mtu < IPV6_MIN_MTU) return 0;
if (stype == IPV6_ADDR_ANY || stype == IPV6_ADDR_MULTICAST ||
From: Christophe Leroy christophe.leroy@csgroup.eu
commit 11522448e641e8f1690c9db06e01985e8e19b401 upstream.
The kernel expects pte_young() to work regardless of CONFIG_SWAP.
Make sure a minor fault is taken to set _PAGE_ACCESSED when it is not already set, regardless of the selection of CONFIG_SWAP.
Fixes: 84de6ab0e904 ("powerpc/603: don't handle PAGE_ACCESSED in TLB miss handlers.") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/a44367744de54e2315b2f1a8cbbd7f88488072e0.160234280... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/kernel/head_32.S | 12 ------------ 1 file changed, 12 deletions(-)
--- a/arch/powerpc/kernel/head_32.S +++ b/arch/powerpc/kernel/head_32.S @@ -472,11 +472,7 @@ InstructionTLBMiss: cmplw 0,r1,r3 #endif mfspr r2, SPRN_SPRG_PGDIR -#ifdef CONFIG_SWAP li r1,_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_EXEC -#else - li r1,_PAGE_PRESENT | _PAGE_EXEC -#endif #if defined(CONFIG_MODULES) || defined(CONFIG_DEBUG_PAGEALLOC) bgt- 112f lis r2, (swapper_pg_dir - PAGE_OFFSET)@ha /* if kernel address, use */ @@ -538,11 +534,7 @@ DataLoadTLBMiss: lis r1, TASK_SIZE@h /* check if kernel address */ cmplw 0,r1,r3 mfspr r2, SPRN_SPRG_PGDIR -#ifdef CONFIG_SWAP li r1, _PAGE_PRESENT | _PAGE_ACCESSED -#else - li r1, _PAGE_PRESENT -#endif bgt- 112f lis r2, (swapper_pg_dir - PAGE_OFFSET)@ha /* if kernel address, use */ addi r2, r2, (swapper_pg_dir - PAGE_OFFSET)@l /* kernel page table */ @@ -618,11 +610,7 @@ DataStoreTLBMiss: lis r1, TASK_SIZE@h /* check if kernel address */ cmplw 0,r1,r3 mfspr r2, SPRN_SPRG_PGDIR -#ifdef CONFIG_SWAP li r1, _PAGE_RW | _PAGE_DIRTY | _PAGE_PRESENT | _PAGE_ACCESSED -#else - li r1, _PAGE_RW | _PAGE_DIRTY | _PAGE_PRESENT -#endif bgt- 112f lis r2, (swapper_pg_dir - PAGE_OFFSET)@ha /* if kernel address, use */ addi r2, r2, (swapper_pg_dir - PAGE_OFFSET)@l /* kernel page table */
From: Damien Le Moal damien.lemoal@wdc.com
commit e1777d099728a76a8f8090f89649aac961e7e530 upstream.
Commit aa1c09cb65e2 ("null_blk: Fix locking in zoned mode") changed zone locking to using the potentially sleeping wait_on_bit_io() function. This is acceptable when memory backing is enabled as the device queue is in that case marked as blocking, but this triggers a scheduling while in atomic context with memory backing disabled.
Fix this by relying solely on the device zone spinlock for zone information protection without temporarily releasing this lock around null_process_cmd() execution in null_zone_write(). This is OK to do since when memory backing is disabled, command processing does not block and the memory backing lock nullb->lock is unused. This solution avoids the overhead of having to mark a zoned null_blk device queue as blocking when memory backing is unused.
This patch also adds comments to the zone locking code to explain the unusual locking scheme.
Fixes: aa1c09cb65e2 ("null_blk: Fix locking in zoned mode") Reported-by: kernel test robot lkp@intel.com Signed-off-by: Damien Le Moal damien.lemoal@wdc.com Reviewed-by: Christoph Hellwig hch@lst.de Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/null_blk.h | 1 + drivers/block/null_blk_zoned.c | 31 +++++++++++++++++++++++++------ 2 files changed, 26 insertions(+), 6 deletions(-)
--- a/drivers/block/null_blk.h +++ b/drivers/block/null_blk.h @@ -44,6 +44,7 @@ struct nullb_device { unsigned int nr_zones; struct blk_zone *zones; sector_t zone_size_sects; + spinlock_t zone_lock; unsigned long *zone_locks;
unsigned long size; /* device size in MB */ --- a/drivers/block/null_blk_zoned.c +++ b/drivers/block/null_blk_zoned.c @@ -46,10 +46,20 @@ int null_init_zoned_dev(struct nullb_dev if (!dev->zones) return -ENOMEM;
- dev->zone_locks = bitmap_zalloc(dev->nr_zones, GFP_KERNEL); - if (!dev->zone_locks) { - kvfree(dev->zones); - return -ENOMEM; + /* + * With memory backing, the zone_lock spinlock needs to be temporarily + * released to avoid scheduling in atomic context. To guarantee zone + * information protection, use a bitmap to lock zones with + * wait_on_bit_lock_io(). Sleeping on the lock is OK as memory backing + * implies that the queue is marked with BLK_MQ_F_BLOCKING. + */ + spin_lock_init(&dev->zone_lock); + if (dev->memory_backed) { + dev->zone_locks = bitmap_zalloc(dev->nr_zones, GFP_KERNEL); + if (!dev->zone_locks) { + kvfree(dev->zones); + return -ENOMEM; + } }
if (dev->zone_nr_conv >= dev->nr_zones) { @@ -118,12 +128,16 @@ void null_free_zoned_dev(struct nullb_de
static inline void null_lock_zone(struct nullb_device *dev, unsigned int zno) { - wait_on_bit_lock_io(dev->zone_locks, zno, TASK_UNINTERRUPTIBLE); + if (dev->memory_backed) + wait_on_bit_lock_io(dev->zone_locks, zno, TASK_UNINTERRUPTIBLE); + spin_lock_irq(&dev->zone_lock); }
static inline void null_unlock_zone(struct nullb_device *dev, unsigned int zno) { - clear_and_wake_up_bit(zno, dev->zone_locks); + spin_unlock_irq(&dev->zone_lock); + if (dev->memory_backed) + clear_and_wake_up_bit(zno, dev->zone_locks); }
int null_report_zones(struct gendisk *disk, sector_t sector, @@ -233,7 +247,12 @@ static blk_status_t null_zone_write(stru if (zone->cond != BLK_ZONE_COND_EXP_OPEN) zone->cond = BLK_ZONE_COND_IMP_OPEN;
+ if (dev->memory_backed) + spin_unlock_irq(&dev->zone_lock); ret = null_process_cmd(cmd, REQ_OP_WRITE, sector, nr_sectors); + if (dev->memory_backed) + spin_lock_irq(&dev->zone_lock); + if (ret != BLK_STS_OK) break;
From: Arnaldo Carvalho de Melo acme@redhat.com
commit d0e7b0c71fbb653de90a7163ef46912a96f0bdaf upstream.
To avoid this:
util/scripting-engines/trace-event-python.c: In function 'python_start_script': util/scripting-engines/trace-event-python.c:1595:2: error: 'visibility' attribute ignored [-Werror=attributes] 1595 | PyMODINIT_FUNC (*initfunc)(void); | ^~~~~~~~~~~~~~
That started breaking when building with PYTHON=python3 and these gcc versions (I haven't checked with the clang ones, maybe it breaks there as well):
# export PERF_TARBALL=http://192.168.86.5/perf/perf-5.9.0.tar.xz # dm fedora:33 fedora:rawhide 1 107.80 fedora:33 : Ok gcc (GCC) 10.2.1 20201005 (Red Hat 10.2.1-5), clang version 11.0.0 (Fedora 11.0.0-1.fc33) 2 92.47 fedora:rawhide : Ok gcc (GCC) 10.2.1 20201016 (Red Hat 10.2.1-6), clang version 11.0.0 (Fedora 11.0.0-1.fc34) #
Avoid that by ditching that 'initfunc' function pointer with its:
#define Py_EXPORTED_SYMBOL _attribute_ ((visibility ("default"))) #define PyMODINIT_FUNC Py_EXPORTED_SYMBOL PyObject*
And just call PyImport_AppendInittab() at the end of the ifdef python3 block with the functions that were being attributed to that initfunc.
Cc: Adrian Hunter adrian.hunter@intel.com Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@kernel.org Cc: Namhyung Kim namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Tapas Kundu tkundu@vmware.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- tools/perf/util/scripting-engines/trace-event-python.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-)
--- a/tools/perf/util/scripting-engines/trace-event-python.c +++ b/tools/perf/util/scripting-engines/trace-event-python.c @@ -1592,7 +1592,6 @@ static void _free_command_line(wchar_t * static int python_start_script(const char *script, int argc, const char **argv) { struct tables *tables = &tables_global; - PyMODINIT_FUNC (*initfunc)(void); #if PY_MAJOR_VERSION < 3 const char **command_line; #else @@ -1607,20 +1606,18 @@ static int python_start_script(const cha FILE *fp;
#if PY_MAJOR_VERSION < 3 - initfunc = initperf_trace_context; command_line = malloc((argc + 1) * sizeof(const char *)); command_line[0] = script; for (i = 1; i < argc + 1; i++) command_line[i] = argv[i - 1]; + PyImport_AppendInittab(name, initperf_trace_context); #else - initfunc = PyInit_perf_trace_context; command_line = malloc((argc + 1) * sizeof(wchar_t *)); command_line[0] = Py_DecodeLocale(script, NULL); for (i = 1; i < argc + 1; i++) command_line[i] = Py_DecodeLocale(argv[i - 1], NULL); + PyImport_AppendInittab(name, PyInit_perf_trace_context); #endif - - PyImport_AppendInittab(name, initfunc); Py_Initialize();
#if PY_MAJOR_VERSION < 3
From: Linu Cherian lcherian@marvell.com
commit bb1860efc817c18fce4112f25f51043e44346d1b upstream.
When using the perf interface, sink selection using sysfs is deprecated.
Signed-off-by: Linu Cherian lcherian@marvell.com Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org Link: https://lore.kernel.org/r/20200916191737.4001561-14-mathieu.poirier@linaro.o... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hwtracing/coresight/coresight-etm-perf.c | 2 -- 1 file changed, 2 deletions(-)
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c @@ -222,8 +222,6 @@ static void *etm_setup_aux(struct perf_e if (event->attr.config2) { id = (u32)event->attr.config2; sink = coresight_get_sink_by_id(id); - } else { - sink = coresight_get_enabled_sink(true); }
mask = &event_data->mask;
From: Mike Leach mike.leach@linaro.org
commit 39a7661dcf655c8198fd5d72412f5030a8e58444 upstream.
Commit [bb1860efc817] changed the sink handling code introducing an uninitialised pointer bug. This results in the default sink selection failing.
Prior to commit:
static void etm_setup_aux(...)
<snip> struct coresight_device *sink; <snip>
/* First get the selected sink from user space. */ if (event->attr.config2) { id = (u32)event->attr.config2; sink = coresight_get_sink_by_id(id); } else { sink = coresight_get_enabled_sink(true); } <ctd>
*sink always initialised - possibly to NULL which triggers the automatic sink selection.
After commit:
static void etm_setup_aux(...)
<snip> struct coresight_device *sink; <snip>
/* First get the selected sink from user space. */ if (event->attr.config2) { id = (u32)event->attr.config2; sink = coresight_get_sink_by_id(id); } <ctd>
*sink pointer uninitialised when not providing a sink on the perf command line. This breaks later checks to enable automatic sink selection.
Fixes: bb1860efc817 ("coresight: etm: perf: Sink selection using sysfs is deprecated") Signed-off-by: Mike Leach mike.leach@linaro.org Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org Link: https://lore.kernel.org/r/20201029164559.1268531-3-mathieu.poirier@linaro.or... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hwtracing/coresight/coresight-etm-perf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c @@ -210,7 +210,7 @@ static void *etm_setup_aux(struct perf_e u32 id; int cpu = event->cpu; cpumask_t *mask; - struct coresight_device *sink; + struct coresight_device *sink = NULL; struct etm_event_data *event_data = NULL;
event_data = alloc_event_data(cpu);
From: Boris Protopopov pboris@amazon.com
commit 57c176074057531b249cf522d90c22313fa74b0b upstream.
When converting trailing spaces and periods in paths, do so for every component of the path, not just the last component. If the conversion is not done for every path component, then subsequent operations in directories with trailing spaces or periods (e.g. create(), mkdir()) will fail with ENOENT. This is because on the server, the directory will have a special symbol in its name, and the client needs to provide the same.
Signed-off-by: Boris Protopopov pboris@amazon.com Acked-by: Ronnie Sahlberg lsahlber@redhat.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/cifs_unicode.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/fs/cifs/cifs_unicode.c +++ b/fs/cifs/cifs_unicode.c @@ -488,7 +488,13 @@ cifsConvertToUTF16(__le16 *target, const else if (map_chars == SFM_MAP_UNI_RSVD) { bool end_of_string;
- if (i == srclen - 1) + /** + * Remap spaces and periods found at the end of every + * component of the path. The special cases of '.' and + * '..' do not need to be dealt with explicitly because + * they are addressed in namei.c:link_path_walk(). + **/ + if ((i == srclen - 1) || (source[i+1] == '\')) end_of_string = true; else end_of_string = false;
On Tue, 17 Nov 2020 14:02:20 +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.9.9 release. There are 255 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 19 Nov 2020 12:20:51 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.9.9-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.9.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v5.9: 15 builds: 15 pass, 0 fail 26 boots: 26 pass, 0 fail 64 tests: 64 pass, 0 fail
Linux version: 5.9.9-rc1-gfb1622495321 Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
On Tue, Nov 17, 2020 at 07:09:30PM +0000, Jon Hunter wrote:
On Tue, 17 Nov 2020 14:02:20 +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.9.9 release. There are 255 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 19 Nov 2020 12:20:51 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.9.9-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.9.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v5.9: 15 builds: 15 pass, 0 fail 26 boots: 26 pass, 0 fail 64 tests: 64 pass, 0 fail
Linux version: 5.9.9-rc1-gfb1622495321 Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Thanks for testing all of them and letting me know.
greg k-h
On Tue, 2020-11-17 at 14:02 +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.9.9 release. There are 255 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 19 Nov 2020 12:20:51 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.9.9-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux- stable-rc.git linux-5.9.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted 5.9.9-rc1+. No typical dmesg regression or regressions.
Tested-by: Jeffrin Jose T jeffrin@rajagiritech.edu.in
On 11/17/20 6:02 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.9.9 release. There are 255 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 19 Nov 2020 12:20:51 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.9.9-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.9.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
On Tue, Nov 17, 2020 at 03:04:01PM -0700, Shuah Khan wrote:
On 11/17/20 6:02 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.9.9 release. There are 255 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 19 Nov 2020 12:20:51 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.9.9-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.9.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
Thanks for testing them all and letting me know.
greg k-h
On Tue, 17 Nov 2020 at 19:02, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.9.9 release. There are 255 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 19 Nov 2020 12:20:51 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.9.9-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.9.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
NOTE: 1) BUG: Invalid wait context on arm64 db410c device while booting. This issue has not reproduced after several testing loops. https://lore.kernel.org/stable/CA+G9fYsk54r9Re4E9BWpqsoxLjpCvxRKFWRgdiKVcPoY...
2) kselftest test suite version upgrade to v5.9
3) While running kselftest netfilter on x86, i386, arm64 and arm devices the following kernel warning was noticed. WARNING: at net/netfilter/nf_tables_api.c:622 lockdep_nfnl_nft_mutex_not_held+0x19/0x20 [nf_tables] https://lore.kernel.org/linux-kselftest/CA+G9fYvFUpODs+NkSYcnwKnXm62tmP=ksLe...
4)
From this release we have started building kernels with clang-10 toolchain
and testing LTP testsuite on qemu_arm64, qemu_arm, qemu_x86_64 and qemu_i386.
Summary ------------------------------------------------------------------------
kernel: 5.9.9-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-5.9.y git commit: fb1622495321923cbb1ae2c6cf2da1e9ca286800 git describe: v5.9.8-256-gfb1622495321 Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.9.y/build/v5.9.8-...
No regressions (compared to build v5.9.8)
No fixes (compared to build v5.9.8)
Ran 52946 total tests in the following environments and test suites.
Environments -------------- - dragonboard-410c - hi6220-hikey - i386 - juno-r2 - juno-r2-compat - juno-r2-kasan - nxp-ls2088 - qemu-arm-clang - qemu-arm64-clang - qemu-arm64-kasan - qemu-i386-clang - qemu-x86_64-clang - qemu-x86_64-kasan - qemu_arm - qemu_arm64 - qemu_arm64-compat - qemu_i386 - qemu_x86_64 - qemu_x86_64-compat - x15 - x86 - x86-kasan
Test Suites ----------- * build * install-android-platform-tools-r2600 * libhugetlbfs * linux-log-parser * ltp-cap_bounds-tests * ltp-commands-tests * ltp-containers-tests * ltp-cpuhotplug-tests * ltp-crypto-tests * ltp-dio-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-tracing-tests * perf * v4l2-compliance * ltp-controllers-tests * ltp-cve-tests * network-basic-tests * kselftest * ltp-open-posix-tests * kvm-unit-tests * kunit * kselftest-vsyscall-mode-native * kselftest-vsyscall-mode-none
On Wed, Nov 18, 2020 at 11:03:55AM +0530, Naresh Kamboju wrote:
On Tue, 17 Nov 2020 at 19:02, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.9.9 release. There are 255 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 19 Nov 2020 12:20:51 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.9.9-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.9.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
Thanks for testing all of these and letting me know.
greg k-h
On Tue, Nov 17, 2020 at 02:02:20PM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.9.9 release. There are 255 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 19 Nov 2020 12:20:51 +0000. Anything received after that time might be too late.
Build results: total: 154 pass: 154 fail: 0 Qemu test results: total: 426 pass: 426 fail: 0
Reviewed-by: Guenter Roeck linux@roeck-us.net
Guenter
On Wed, Nov 18, 2020 at 07:25:46AM -0800, Guenter Roeck wrote:
On Tue, Nov 17, 2020 at 02:02:20PM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.9.9 release. There are 255 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 19 Nov 2020 12:20:51 +0000. Anything received after that time might be too late.
Build results: total: 154 pass: 154 fail: 0 Qemu test results: total: 426 pass: 426 fail: 0
Reviewed-by: Guenter Roeck linux@roeck-us.net
Thanks for testing all of these and letting me know.
gre gk-h
On Tue, Nov 17, 2020 at 02:02:20PM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.9.9 release. There are 255 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 19 Nov 2020 12:20:51 +0000. Anything received after that time might be too late.
Build results: total: 154 pass: 154 fail: 0 Qemu test results: total: 426 pass: 426 fail: 0
Reviewed-by: Guenter Roeck linux@roeck-us.net
Guenter
linux-stable-mirror@lists.linaro.org