From: Greg Kroah-Hartman gregkh@linuxfoundation.org
This is the start of the stable review cycle for the 5.11.7 release. There are 306 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 17 Mar 2021 13:54:26 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.11.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.11.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.11.7-rc1
Andrew Scull ascull@google.com KVM: arm64: Fix nVHE hyp panic host context restore
Mike Rapoport rppt@kernel.org mm/page_alloc.c: refactor initialization of struct page for holes in memory layout
Zhou Guanghui zhouguanghui1@huawei.com mm/memcg: rename mem_cgroup_split_huge_fixup to split_page_memcg and add nr_pages argument
Zhou Guanghui zhouguanghui1@huawei.com mm/memcg: set memcg when splitting page
Suren Baghdasaryan surenb@google.com mm/madvise: replace ptrace attach requirement for process_madvise
Nadav Amit namit@vmware.com mm/userfaultfd: fix memory corruption due to writeprotect
OGAWA Hirofumi hirofumi@mail.parknet.co.jp mm/highmem.c: fix zero_user_segments() with start > end
Marc Zyngier maz@kernel.org KVM: arm64: Fix exclusive limit for IPA size
Marc Zyngier maz@kernel.org KVM: arm64: Reject VM creation when the default IPA size is unsupported
Suzuki K Poulose suzuki.poulose@arm.com KVM: arm64: nvhe: Save the SPE context early
Will Deacon will@kernel.org KVM: arm64: Avoid corrupting vCPU context register in guest exit
Jia He justin.he@arm.com KVM: arm64: Fix range alignment when walking page tables
Marc Zyngier maz@kernel.org KVM: arm64: Ensure I-cache isolation between vcpus of a same VM
Wanpeng Li wanpengli@tencent.com KVM: kvmclock: Fix vCPUs > 64 can't be online/hotpluged
Sean Christopherson seanjc@google.com KVM: x86: Ensure deadline timer has truly expired before posting its IRQ
Andy Lutomirski luto@kernel.org x86/entry: Fix entry/exit mismatch on failed fast 32-bit syscalls
Joerg Roedel jroedel@suse.de x86/sev-es: Use __copy_from_user_inatomic()
Joerg Roedel jroedel@suse.de x86/sev-es: Correctly track IRQ states in runtime #VC handler
Joerg Roedel jroedel@suse.de x86/sev-es: Check regs->sp is trusted before adjusting #VC IST stack
Joerg Roedel jroedel@suse.de x86/sev-es: Introduce ip_within_syscall_gap() helper
Josh Poimboeuf jpoimboe@redhat.com x86/unwind/orc: Disable KASAN checking in the ORC unwinder, part 2
Andrey Konovalov andreyknvl@google.com kasan: fix KASAN_STACK dependency for HW_TAGS
Andrey Konovalov andreyknvl@google.com kasan, mm: fix crash with HW_TAGS and DEBUG_PAGEALLOC
Lior Ribak liorribak@gmail.com binfmt_misc: fix possible deadlock in bm_register_write
Christophe Leroy christophe.leroy@csgroup.eu powerpc: Fix missing declaration of [en/dis]able_kernel_vsx()
Nicholas Piggin npiggin@gmail.com powerpc: Fix inverted SET_FULL_REGS bitop
Naveen N. Rao naveen.n.rao@linux.vnet.ibm.com powerpc/64s: Fix instruction encoding for lis in ppc_function_entry()
Ard Biesheuvel ardb@kernel.org efi: stub: omit SetVirtualAddressMap() if marked unsupported in RT_PROP table
Peter Zijlstra peterz@infradead.org sched: Simplify set_affinity_pending refcounts
Peter Zijlstra peterz@infradead.org sched: Fix affine_move_task() self-concurrency
Peter Zijlstra peterz@infradead.org sched: Optimize migration_cpu_stop()
Peter Zijlstra peterz@infradead.org sched: Simplify migration_cpu_stop()
Peter Zijlstra peterz@infradead.org sched: Collate affine_move_task() stoppers
Mathieu Desnoyers mathieu.desnoyers@efficios.com sched/membarrier: fix missing local execution of ipi_sync_rq_state()
Peter Zijlstra peterz@infradead.org sched: Fix migration_cpu_stop() requeueing
Arnd Bergmann arnd@arndb.de linux/compiler-clang.h: define HAVE_BUILTIN_BSWAP*
Minchan Kim minchan@kernel.org zram: fix broken page writeback
Minchan Kim minchan@kernel.org zram: fix return value on writeback_store
Alexey Dobriyan adobriyan@gmail.com prctl: fix PR_SET_MM_AUXV kernel stack leak
Matthew Wilcox (Oracle) willy@infradead.org include/linux/sched/mm.h: use rcu_dereference in in_vfork()
Arnd Bergmann arnd@arndb.de stop_machine: mark helpers __always_inline
Arnd Bergmann arnd@arndb.de memblock: fix section mismatch warning
Peter Zijlstra peterz@infradead.org seqlock,lockdep: Fix seqcount_latch_init()
Daniel Axtens dja@axtens.net powerpc/64s/exception: Clean up a missed SRR specifier
Anna-Maria Behnsen anna-maria@linutronix.de hrtimer: Update softirq_expires_next correctly after __hrtimer_get_next_event()
Kan Liang kan.liang@linux.intel.com perf/x86/intel: Set PERF_ATTACH_SCHED_CB for large PEBS and LBR
Kan Liang kan.liang@linux.intel.com perf/core: Flush PMU internal buffers for per-CPU events
Paolo Abeni pabeni@redhat.com mptcp: fix memory accounting on allocation error
Florian Westphal fw@strlen.de mptcp: put subflow sock on connect error
Willem de Bruijn willemb@google.com net: expand textsearch ts_state to fit skb_seq_state
Wei Yongjun weiyongjun1@huawei.com perf/arm_dmc620_pmu: Fix error return code in dmc620_pmu_device_probe()
Dave Airlie airlied@redhat.com drm/nouveau: fix dma syncing for loops (v2)
Jens Axboe axboe@kernel.dk io_uring: perform IOPOLL reaping if canceler is thread itself
Ard Biesheuvel ardb@kernel.org arm64: mm: use a 48-bit ID map when possible on 52-bit VA builds
Daiyue Zhang zhangdaiyue1@huawei.com configfs: fix a use-after-free in __configfs_open_file
James Smart jsmart2021@gmail.com nvme-fc: fix racing controller reset and create association
Anthony DeRossi ajderossi@gmail.com drm/ttm: Fix TTM page pool accounting
Jia-Ju Bai baijiaju1990@gmail.com block: rsxx: fix error return code of rsxx_pci_probe()
Ondrej Mosnacek omosnace@redhat.com NFSv4.2: fix return value of _nfs4_get_security_label()
Trond Myklebust trond.myklebust@hammerspace.com NFS: Don't gratuitously clear the inode cache when lookup failed
Trond Myklebust trond.myklebust@hammerspace.com NFS: Don't revalidate the directory permissions on a lookup failure
Benjamin Coddington bcodding@redhat.com SUNRPC: Set memalloc_nofs_save() for sync tasks
Anshuman Khandual anshuman.khandual@arm.com arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
Wei Yongjun weiyongjun1@huawei.com cpufreq: qcom-hw: Fix return value check in qcom_cpufreq_hw_cpu_init()
Shawn Guo shawn.guo@linaro.org cpufreq: qcom-hw: fix dereferencing freed memory 'data'
Atish Patra atish.patra@wdc.com net: macb: Add default usrio config to default gem config
Jordan Niethe jniethe5@gmail.com powerpc/sstep: Fix VSX instruction emulation
Sergey Shtylyov s.shtylyov@omprussia.ru sh_eth: fix TRSCER mask for R7S72100
Ioana Ciornei ioana.ciornei@nxp.com net: phy: ti: take into account all possible interrupt sources
Ido Schimmel idosch@nvidia.com mlxsw: spectrum_router: Ignore routes using a deleted nexthop object
Ian Abbott abbotti@mev.co.uk staging: comedi: pcl818: Fix endian problem for AI command data
Ian Abbott abbotti@mev.co.uk staging: comedi: pcl711: Fix endian problem for AI command data
Ian Abbott abbotti@mev.co.uk staging: comedi: me4000: Fix endian problem for AI command data
Ian Abbott abbotti@mev.co.uk staging: comedi: dmm32at: Fix endian problem for AI command data
Ian Abbott abbotti@mev.co.uk staging: comedi: das800: Fix endian problem for AI command data
Ian Abbott abbotti@mev.co.uk staging: comedi: das6402: Fix endian problem for AI command data
Ian Abbott abbotti@mev.co.uk staging: comedi: adv_pci1710: Fix endian problem for AI command data
Ian Abbott abbotti@mev.co.uk staging: comedi: addi_apci_1500: Fix endian problem for command sample
Ian Abbott abbotti@mev.co.uk staging: comedi: addi_apci_1032: Fix endian problem for COS sample
Lee Gibson leegib@gmail.com staging: rtl8192e: Fix possible buffer overflow in _rtl92e_wx_set_scan
Lee Gibson leegib@gmail.com staging: rtl8712: Fix possible buffer overflow in r8712_sitesurvey_cmd
Dan Carpenter dan.carpenter@oracle.com staging: ks7010: prevent buffer overflow in ks_wlan_set_scan()
Dan Carpenter dan.carpenter@oracle.com staging: rtl8188eu: fix potential memory corruption in rtw_check_beacon_data()
Dan Carpenter dan.carpenter@oracle.com staging: rtl8712: unterminated string leads to read overflow
Dan Carpenter dan.carpenter@oracle.com staging: rtl8188eu: prevent ->ssid overflow in rtw_wx_set_scan()
Dan Carpenter dan.carpenter@oracle.com staging: rtl8192u: fix ->ssid overflow in r8192_wx_set_scan()
Dmitry Baryshkov dmitry.baryshkov@linaro.org misc: fastrpc: restrict user apps from sending kernel RPC messages
Shile Zhang shile.zhang@linux.alibaba.com misc/pvpanic: Export module FDT device table
Alexander Shiyan shc_work@mail.ru Revert "serial: max310x: rework RX interrupt handling"
Shuah Khan skhan@linuxfoundation.org usbip: fix vudc usbip_sockfd_store races leading to gpf
Shuah Khan skhan@linuxfoundation.org usbip: fix vhci_hcd attach_store() races leading to gpf
Shuah Khan skhan@linuxfoundation.org usbip: fix stub_dev usbip_sockfd_store() races leading to gpf
Shuah Khan skhan@linuxfoundation.org usbip: fix vudc to check for stream socket
Shuah Khan skhan@linuxfoundation.org usbip: fix vhci_hcd to check for stream socket
Shuah Khan skhan@linuxfoundation.org usbip: fix stub_dev to check for stream socket
Sebastian Reichel sebastian.reichel@collabora.com USB: serial: cp210x: add some more GE USB IDs
Karan Singhal karan.singhal@acuitybrands.com USB: serial: cp210x: add ID for Acuity Brands nLight Air Adapter
Niv Sardi xaiki@evilgiggle.com USB: serial: ch341: add new Product ID
Pavel Skripkin paskripkin@gmail.com USB: serial: io_edgeport: fix memory leak in edge_startup
Mathias Nyman mathias.nyman@linux.intel.com xhci: Fix repeated xhci wake after suspend due to uncleared internal wake state
Forest Crossman cyrozap@gmail.com usb: xhci: Fix ASMedia ASM1042A and ASM3242 DMA addressing
Mathias Nyman mathias.nyman@linux.intel.com xhci: Improve detection of device initiated wake signal.
Stanislaw Gruszka stf_xl@wp.pl usb: xhci: do not perform Soft Retry for some xHCI hosts
Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com usb: renesas_usbhs: Clear PIPECFG for re-enabling pipe with other EPNUM
Pete Zaitcev zaitcev@redhat.com USB: usblp: fix a hang in poll() if disconnected
Matthias Kaehlcke mka@chromium.org usb: dwc3: qcom: Honor wakeup enabled/disabled state
Shawn Guo shawn.guo@linaro.org usb: dwc3: qcom: add ACPI device id for sc8180x
Shawn Guo shawn.guo@linaro.org usb: dwc3: qcom: add URS Host support for sdm845 ACPI boot
Serge Semin Sergey.Semin@baikalelectronics.ru usb: dwc3: qcom: Add missing DWC3 OF node refcount decrement
Ruslan Bilovol ruslan.bilovol@gmail.com usb: gadget: f_uac1: stop playback on function disable
Ruslan Bilovol ruslan.bilovol@gmail.com usb: gadget: f_uac2: always increase endpoint max_packet_size by one audio slot
Dan Carpenter dan.carpenter@oracle.com USB: gadget: u_ether: Fix a configfs return code
Wei Yongjun weiyongjun1@huawei.com USB: gadget: udc: s3c2410_udc: fix return value check in s3c2410_udc_probe()
Yorick de Wid ydewid@gmail.com Goodix Fingerprint device is not a modem
Paulo Alcantara pc@cjr.nz cifs: do not send close in compound create+close requests
Frank Li lznuaa@gmail.com mmc: cqhci: Fix random crash when remove mmc module/card
Adrian Hunter adrian.hunter@intel.com mmc: core: Fix partition switch time for eMMC
Yann Gautier yann.gautier@foss.st.com mmc: mmci: Add MMC_CAP_NEED_RSP_BUSY for the stm32 variants
Juergen Gross jgross@suse.com xen/events: avoid handling the same event on two cpus at the same time
Juergen Gross jgross@suse.com xen/events: don't unmask an event channel when an eoi is pending
Juergen Gross jgross@suse.com xen/events: reset affinity of 2-level event when tearing it down
Heikki Krogerus heikki.krogerus@linux.intel.com software node: Fix node registration
Stefan Haberland sth@linux.ibm.com s390/dasd: fix hanging IO request during DASD driver unbind
Stefan Haberland sth@linux.ibm.com s390/dasd: fix hanging DASD driver unbind
Rob Herring robh@kernel.org arm64: perf: Fix 64-bit event counter read truncation
Catalin Marinas catalin.marinas@arm.com arm64: mte: Map hotplugged memory as Normal Tagged
Andrey Konovalov andreyknvl@google.com arm64: kasan: fix page_alloc tagging with DEBUG_VIRTUAL
Jan Kara jack@suse.cz block: Try to handle busy underlying device on discard
Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com block: Discard page cache of zone reset target range
Eric W. Biederman ebiederm@xmission.com Revert 95ebabde382c ("capabilities: Don't allow writing ambiguous v3 file capabilities")
Beata Michalska beata.michalska@arm.com opp: Don't drop extra references to OPPs accidentally
Pavel Skripkin paskripkin@gmail.com ALSA: usb-audio: fix use after free in usb_audio_disconnect
Pavel Skripkin paskripkin@gmail.com ALSA: usb-audio: fix NULL ptr dereference in usb_audio_probe
Kai-Heng Feng kai.heng.feng@canonical.com ALSA: usb-audio: Disable USB autosuspend properly in setup_disable_autosuspend()
Takashi Iwai tiwai@suse.de ALSA: usb-audio: Apply the control quirk to Plantronics headsets
Takashi Iwai tiwai@suse.de ALSA: usb-audio: Fix "cannot get freq eq" errors on Dell AE515 sound bar
Takashi Iwai tiwai@suse.de ALSA: hda: Avoid spurious unsol event handling during S3/S4
Takashi Iwai tiwai@suse.de ALSA: hda: Flush pending unsolicited events before suspend
Takashi Iwai tiwai@suse.de ALSA: hda: Drop the BATCH workaround for AMD controllers
Simeon Simeonoff sim.simeonoff@gmail.com ALSA: hda/ca0132: Add Sound BlasterX AE-5 Plus support
Takashi Iwai tiwai@suse.de ALSA: hda/conexant: Add quirk for mute LED control on HP ZBook G5
Takashi Iwai tiwai@suse.de ALSA: hda/hdmi: Cancel pending works before suspend
John Ernberg john.ernberg@actia.se ALSA: usb: Add Plantronics C320-M USB ctrl msg delay quirk
AngeloGioacchino Del Regno angelogioacchino.delregno@somainline.org clk: qcom: gpucc-msm8998: Add resets, cxc, fix flags on gpu_gx_gdsc
Aleksandr Miloserdov a.miloserdov@yadro.com scsi: target: core: Prevent underflow for service actions
Aleksandr Miloserdov a.miloserdov@yadro.com scsi: target: core: Add cmd length set before cmd complete
Mike Christie michael.christie@oracle.com scsi: libiscsi: Fix iscsi_prep_scsi_cmd_pdu() error handling
Lin Feng linf@wangsu.com sysctl.c: fix underflow value setting risk in vm_table
David Hildenbrand david@redhat.com drivers/base/memory: don't store phys_device in memory blocks
Heiko Carstens hca@linux.ibm.com s390/smp: __smp_rescan_cpus() - move cpumask away from stack
Andrey Konovalov andreyknvl@google.com kasan: fix memory corruption in kasan_bitops_tags test
Keith Busch kbusch@kernel.org PCI/ERR: Retain status from error notification
Keita Suzuki keitasuzuki.park@sslab.ics.keio.ac.jp i40e: Fix memory leak in i40e_probe
Geert Uytterhoeven geert+renesas@glider.be PCI: Fix pci_register_io_range() memory leak
Sasha Levin sashal@kernel.org kbuild: clamp SUBLEVEL to 255
Theodore Ts'o tytso@mit.edu ext4: don't try to processed freed blocks until mballoc is initialized
Bjorn Helgaas bhelgaas@google.com PCI/LINK: Remove bandwidth notification
Arnd Bergmann arnd@arndb.de drivers/base: build kunit tests without structleak plugin
Krzysztof Wilczyński kw@linux.com PCI: mediatek: Add missing of_node_put() to fix reference leak
Martin Kaiser martin@kaiser.cx PCI: xgene-msi: Fix race in installing chained irq handler
Ronald Tschalär ronald@innovation.ch Input: applespi - don't wait for responses to commands indefinitely.
Khalid Aziz khalid.aziz@oracle.com sparc64: Use arch_validate_flags() to validate ADI flag
Andreas Larsson andreas@gaisler.com sparc32: Limit memblock allocation to low memory
AngeloGioacchino Del Regno angelogioacchino.delregno@somainline.org clk: qcom: gdsc: Implement NO_RET_PERIPH flag
Suravee Suthikulpanit suravee.suthikulpanit@amd.com iommu/amd: Fix performance counter initialization
Michael Ellerman mpe@ellerman.id.au powerpc/64: Fix stack trace not displaying final frame
Filipe Laíns lains@riseup.net HID: logitech-dj: add support for the new lightspeed connection iteration
Athira Rajeev atrajeev@linux.vnet.ibm.com powerpc/perf: Record counter overflow always if SAMPLE_IP is unset
Nicholas Piggin npiggin@gmail.com powerpc: improve handling of unrecoverable system reset
Alain Volmat alain.volmat@foss.st.com spi: stm32: make spurious and overrun interrupts visible
Oliver O'Halloran oohall@gmail.com powerpc/pci: Add ppc_md.discover_phbs()
Lubomir Rintel lkundrak@v3.sk Platform: OLPC: Fix probe error handling
Pan Bian bianpan2016@163.com platform/x86: amd-pmc: put device on error paths
Jeremy Linton jeremy.linton@arm.com mmc: sdhci-iproc: Add ACPI bindings for the RPi
Chaotian Jing chaotian.jing@mediatek.com mmc: mediatek: fix race condition between msdc_request_timeout and irq
Christophe JAILLET christophe.jaillet@wanadoo.fr mmc: mxs-mmc: Fix a resource leak in an error handling path in 'mxs_mmc_probe()'
Lu Baolu baolu.lu@linux.intel.com iommu/vt-d: Clear PRQ overflow only when PRQ is empty
Steven J. Magnani magnani@ieee.org udf: fix silent AED tagLocation corruption
Can Guo cang@codeaurora.org scsi: ufs: Protect some contexts from unexpected clock scaling
Jaegeuk Kim jaegeuk@kernel.org scsi: ufs: WB is only available on LUN #0 to #7
akshatzen akshatzen@google.com scsi: pm80xx: Fix missing tag_free in NVMD DATA req
Wolfram Sang wsa+renesas@sang-engineering.com i2c: rcar: optimize cacheline to minimize HW race condition
Wolfram Sang wsa+renesas@sang-engineering.com i2c: rcar: faster irq code to minimize HW race condition
Florian Westphal fw@strlen.de mptcp: reset last_snd on subflow close
Paolo Abeni pabeni@redhat.com mptcp: always graft subflow socket to parent
Thomas Bogendoerfer tsbogend@alpha.franken.de MIPS: kernel: Reserve exception base early to prevent corruption
Hans Verkuil hverkuil@xs4all.nl media: rc: compile rc-cec.c into rc-core
Biju Das biju.das.jz@bp.renesas.com media: v4l: vsp1: Fix bru null pointer access
Biju Das biju.das.jz@bp.renesas.com media: v4l: vsp1: Fix uif null pointer access
Dafna Hirschfeld dafna.hirschfeld@collabora.com media: rkisp1: params: fix wrong bits settings
Maxim Mikityanskiy maxtram95@gmail.com media: usbtv: Fix deadlock on suspend
Sergey Shtylyov s.shtylyov@omprussia.ru sh_eth: fix TRSCER mask for R7S9210
Colin Ian King colin.king@canonical.com qxl: Fix uninitialised struct field head.surface_id
Wang Qing wangqing@vivo.com s390/crypto: return -EFAULT if copy_to_user() fails
Eric Farman farman@linux.ibm.com s390/cio: return -EFAULT if copy_to_user() fails
Tvrtko Ursulin tvrtko.ursulin@intel.com drm/i915: Wedge the GPU if command parser setup fails
Noralf Trønnes noralf@tronnes.org drm/shmem-helpers: vunmap: Don't put pages for dma-buf
Artem Lapkin art@khadas.com drm: meson_drv add shutdown function
Alex Deucher alexander.deucher@amd.com drm/amdgpu: fix S0ix handling when the CONFIG_AMD_PMC=m
Thomas Zimmermann tzimmermann@suse.de drm: Use USB controller's DMA mask when importing dmabufs
Neil Roberts nroberts@igalia.com drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff
Neil Roberts nroberts@igalia.com drm/shmem-helper: Check for purged buffers in fault handler
Alex Deucher alexander.deucher@amd.com drm/amdgpu/display: handle aux backlight in backlight_get_brightness
Alex Deucher alexander.deucher@amd.com drm/amdgpu/display: don't assert in set backlight function
Alex Deucher alexander.deucher@amd.com drm/amdgpu/display: simplify backlight setting
Kenneth Feng kenneth.feng@amd.com drm/amd/pm: bug fix for pcie dpm
Evan Quan evan.quan@amd.com drm/amd/pm: correct the watermark settings for Polaris
Holger Hoffstätte holger@applied-asynchrony.com drm/amd/display: Fix nested FPU context in dcn21_validate_bandwidth()
Holger Hoffstätte holger@applied-asynchrony.com drm/amdgpu/display: use GFP_ATOMIC in dcn21_validate_bandwidth_fp()
Takashi Iwai tiwai@suse.de drm/amd/display: Add a backlight module option
Christian König christian.koenig@amd.com drm/radeon: also init GEM funcs in radeon_gem_prime_import_sg_table
Daniel Vetter daniel.vetter@ffwll.ch drm/compat: Clear bounce structures
Tong Zhang ztong0001@gmail.com drm/fb-helper: only unmap if buffer not null
Edwin Peer edwin.peer@broadcom.com bnxt_en: reliably allocate IRQ table on reset to avoid crash
Wang Qing wangqing@vivo.com s390/cio: return -EFAULT if copy_to_user() fails again
Jian Shen shenjian15@huawei.com net: hns3: fix bug when calculating the TCAM table info
Jian Shen shenjian15@huawei.com net: hns3: fix query vlan mask value error for flow director
Jian Shen shenjian15@huawei.com net: hns3: fix error mask definition of flow director
Ravi Bangoria ravi.bangoria@linux.ibm.com perf report: Fix -F for branch & mem modes
Ian Rogers irogers@google.com perf traceevent: Ensure read cmdlines are null terminated.
Danielle Ratson danieller@nvidia.com mlxsw: spectrum_ethtool: Add an external speed to PTYS register
Danielle Ratson danieller@nvidia.com selftests: forwarding: Fix race condition in mirror installation
Arnd Bergmann arnd@arndb.de net: phy: make mdio_bus_phy_suspend/resume as __maybe_unused
Yinjun Zhang yinjun.zhang@corigine.com ethtool: fix the check logic of at least one channel for RX/TX
Joakim Zhang qiangqing.zhang@nxp.com net: stmmac: fix wrongly set buffer2 valid when sph unsupport
Joakim Zhang qiangqing.zhang@nxp.com net: stmmac: fix watchdog timeout during suspend/resume stress test
Joakim Zhang qiangqing.zhang@nxp.com net: stmmac: stop each tx channel independently
Antonio Terceiro antonio.terceiro@linaro.org perf build: Fix ccache usage in $(CC) when generating arch errno table
Kun-Chuan Hsieh jetswayss@gmail.com tools/resolve_btfids: Fix build error with older host toolchains
Antony Antony antony@phenome.org ixgbe: fail to create xfrm offload of IPsec tunnel mode SA
Hayes Wang hayeswang@realtek.com r8169: fix r8168fp_adjust_ocp_cmd function
Julian Wiedmann jwi@linux.ibm.com s390/qeth: fix notification for pending buffers during teardown
Julian Wiedmann jwi@linux.ibm.com s390/qeth: schedule TX NAPI on QAOB completion
Julian Wiedmann jwi@linux.ibm.com s390/qeth: improve completion of pending TX buffers
Julian Wiedmann jwi@linux.ibm.com s390/qeth: fix memory leak after failed TX Buffer allocation
Jia-Ju Bai baijiaju1990@gmail.com net: qrtr: fix error return code of qrtr_sendmsg()
Vladimir Oltean vladimir.oltean@nxp.com net: enetc: allow hardware timestamping on TX queues with tc-etf enabled
Paul Cercueil paul@crapouillou.net net: davicom: Fix regulator not turned off on driver removal
Paul Cercueil paul@crapouillou.net net: davicom: Fix regulator not turned off on failed probe
Xie He xie.he.0141@gmail.com net: lapbether: Remove netif_start_queue / netif_stop_queue
Wong Vee Khee vee.khee.wong@intel.com stmmac: intel: Fixes clock registration error seen for multiple interfaces
Ong Boon Leong boon.leong.ong@intel.com net: stmmac: Fix VLAN filter delete timeout issue in Intel mGBE SGMII
Paul Moore paul@paul-moore.com cipso,calipso: resolve a number of problems with the DOI refcounts
Hillf Danton hdanton@sina.com netdevsim: init u64 stats for 32bit hardware
Daniele Palmas dnlplm@gmail.com net: usb: qmi_wwan: allow qmimux add/del with master up
Vladimir Oltean vladimir.oltean@nxp.com net: dsa: sja1105: fix SGMII PCS being forced to SPEED_UNKNOWN instead of SPEED_10
Vladimir Oltean vladimir.oltean@nxp.com net: mscc: ocelot: properly reject destination IP keys in VCAP IS1
Maximilian Heyne mheyne@amazon.de net: sched: avoid duplicates in classes dump
Ido Schimmel idosch@nvidia.com nexthop: Do not flush blackhole nexthops when loopback goes down
Ong Boon Leong boon.leong.ong@intel.com net: stmmac: fix incorrect DMA channel intr enable setting of EQoS v4.10
Kevin(Yudong) Yang yyd@google.com net/mlx4_en: update moderation when config reset
Biao Huang biao.huang@mediatek.com net: ethernet: mtk-star-emac: fix wrong unmap in RX handling
DENG Qingfang dqfext@gmail.com net: dsa: tag_mtk: fix 802.1ad VLAN egress
Vladimir Oltean vladimir.oltean@nxp.com net: enetc: keep RX ring consumer index in sync with hardware
Vladimir Oltean vladimir.oltean@nxp.com net: enetc: remove bogus write to SIRXIDR from enetc_setup_rxbdr
Vladimir Oltean vladimir.oltean@nxp.com net: enetc: force the RGMII speed and duplex instead of operating in inband mode
Vladimir Oltean vladimir.oltean@nxp.com net: enetc: don't disable VLAN filtering in IFF_PROMISC mode
Vladimir Oltean vladimir.oltean@nxp.com net: enetc: fix incorrect TPID when receiving 802.1ad tagged packets
Vladimir Oltean vladimir.oltean@nxp.com net: enetc: take the MDIO lock only once per NAPI poll cycle
Vladimir Oltean vladimir.oltean@nxp.com net: enetc: initialize RFS/RSS memories for unused ports too
Vladimir Oltean vladimir.oltean@nxp.com net: enetc: don't overwrite the RSS indirection table when initializing
Sergey Shtylyov s.shtylyov@omprussia.ru sh_eth: fix TRSCER mask for SH771x
DENG Qingfang dqfext@gmail.com net: dsa: tag_rtl4_a: fix egress tags
Jakub Kicinski kuba@kernel.org docs: networking: drop special stable handling
Linus Torvalds torvalds@linux-foundation.org Revert "mm, slub: consider rest of partial list if acquire_slab() fails"
Paulo Alcantara pc@cjr.nz cifs: return proper error code in statfs(2)
Aurelien Aptel aaptel@suse.com cifs: fix credit accounting for extra channel
Christian Brauner christian.brauner@ubuntu.com mount: fix mounting of detached mounts onto targets that reside on shared mounts
Johan Hovold johan@kernel.org gpio: fix gpio-device list corruption
Lorenzo Bianconi lorenzo@kernel.org mt76: dma: do not report truncated frames to mac80211
Junlin Yang yangjunlin@yulong.com ibmvnic: remove excessive irqsave
Jiri Wiesner jwiesner@suse.com ibmvnic: always store valid MAC address
Michal Suchanek msuchanek@suse.de ibmvnic: Fix possibly uninitialized old_num_tx_queues variable warning.
Maciej Fijalkowski maciej.fijalkowski@intel.com libbpf: Clear map_info before each bpf_obj_get_info_by_fd
Maciej Fijalkowski maciej.fijalkowski@intel.com samples, bpf: Add missing munmap in xdpsock
Yauheni Kaliuta yauheni.kaliuta@redhat.com selftests/bpf: Mask bpf_csum_diff() return value to 16 bits in test_verifier
Hangbin Liu liuhangbin@gmail.com selftests/bpf: No need to drop the packet when there is no geneve opt
Ilya Leoshkevich iii@linux.ibm.com selftests/bpf: Use the last page in test_snprintf_btf on s390
Guangbin Huang huangguangbin2@huawei.com net: phy: fix save wrong speed and duplex problem if autoneg is on
Jason A. Donenfeld Jason@zx2c4.com net: always use icmp{,v6}_ndo_send from ndo_start_xmit
Vasily Averin vvs@virtuozzo.com netfilter: x_tables: gpf inside xt_find_revision()
Florian Westphal fw@strlen.de netfilter: nf_nat: undo erroneous tcp edemux lookup
Eric Dumazet edumazet@google.com tcp: add sanity tests to TCP_QUEUE_SEQ
Arjun Roy arjunroy@google.com tcp: Fix sign comparison bug in getsockopt(TCP_ZEROCOPY_RECEIVE)
Torin Cooper-Bennun torin@maxiluxsystems.com can: tcan4x5x: tcan4x5x_init(): fix initialization - clear MRAM before entering Normal Mode
Joakim Zhang qiangqing.zhang@nxp.com can: flexcan: invoke flexcan_chip_freeze() to enter freeze mode
Joakim Zhang qiangqing.zhang@nxp.com can: flexcan: enable RX FIFO after FRZ/HALT valid
Joakim Zhang qiangqing.zhang@nxp.com can: flexcan: assert FRZ bit in flexcan_chip_freeze()
Andy Shevchenko andriy.shevchenko@linux.intel.com gpio: pca953x: Set IRQ type when handle Intel Galileo Gen 2
Oleksij Rempel linux@rempel-privat.de can: skb: can_skb_set_owner(): fix ref counting if socket was closed before setting skb ownership
Andy Shevchenko andriy.shevchenko@linux.intel.com gpiolib: Read "gpio-line-names" from a firmware node
Andy Shevchenko andriy.shevchenko@linux.intel.com gpiolib: acpi: Allow to find GpioInt() resource by name and index
Andy Shevchenko andriy.shevchenko@linux.intel.com gpiolib: acpi: Add ACPI_GPIO_QUIRK_ABSOLUTE_NUMBER quirk
Matthias Schiffer mschiffer@universe-factory.net net: l2tp: reduce log level of messages in receive path, add counter instead
Kalle Valo kvalo@codeaurora.org ath11k: fix AP mode for QCA6390
Balazs Nemeth bnemeth@redhat.com net: avoid infinite loop in mpls_gso_segment when mpls_hlen == 0
Balazs Nemeth bnemeth@redhat.com net: check if protocol extracted by virtio_net_hdr_set_proto is correct
Daniel Borkmann daniel@iogearbox.net net: Fix gro aggregation for udp encaps with zero csum
Felix Fietkau nbd@nbd.name ath9k: fix transmitting to stations in dynamic SMPS mode
Davide Caratti dcaratti@redhat.com mptcp: fix length of ADD_ADDR with port sub-option
Maciej W. Rozycki macro@orcam.me.uk crypto: mips/poly1305 - enable for all MIPS processors
Jakub Kicinski kuba@kernel.org ethernet: alx: fix order of calls on resume
Greg Kurz groug@kaod.org powerpc/pseries: Don't enforce MSI affinity with kdump
Athira Rajeev atrajeev@linux.vnet.ibm.com powerpc/perf: Fix handling of privilege level checks in perf interrupt context
Christophe Leroy christophe.leroy@csgroup.eu powerpc/603: Fix protection of user pages mapped with PROT_NONE
Dmitry V. Levin ldv@altlinux.org uapi: nfnetlink_cthelper.h: fix userspace compilation error
-------------
Diffstat:
Documentation/ABI/testing/sysfs-devices-memory | 5 +- Documentation/admin-guide/mm/memory-hotplug.rst | 4 +- Documentation/gpu/todo.rst | 21 +++ Documentation/networking/netdev-FAQ.rst | 72 +-------- Documentation/process/stable-kernel-rules.rst | 6 - Documentation/process/submitting-patches.rst | 5 - Documentation/virt/kvm/api.rst | 3 + Makefile | 16 +- arch/arm64/include/asm/kvm_asm.h | 4 +- arch/arm64/include/asm/kvm_hyp.h | 8 +- arch/arm64/include/asm/memory.h | 5 + arch/arm64/include/asm/mmu_context.h | 5 +- arch/arm64/include/asm/pgtable-prot.h | 1 - arch/arm64/include/asm/pgtable.h | 3 + arch/arm64/kernel/head.S | 2 +- arch/arm64/kernel/perf_event.c | 2 +- arch/arm64/kvm/arm.c | 7 +- arch/arm64/kvm/hyp/entry.S | 2 +- arch/arm64/kvm/hyp/nvhe/debug-sr.c | 12 +- arch/arm64/kvm/hyp/nvhe/host.S | 20 +-- arch/arm64/kvm/hyp/nvhe/hyp-main.c | 6 +- arch/arm64/kvm/hyp/nvhe/switch.c | 14 +- arch/arm64/kvm/hyp/nvhe/tlb.c | 3 +- arch/arm64/kvm/hyp/pgtable.c | 1 + arch/arm64/kvm/hyp/vhe/tlb.c | 3 +- arch/arm64/kvm/mmu.c | 3 +- arch/arm64/kvm/reset.c | 12 +- arch/arm64/mm/init.c | 12 ++ arch/arm64/mm/mmu.c | 5 +- arch/mips/crypto/Makefile | 4 +- arch/mips/include/asm/traps.h | 3 + arch/mips/kernel/cpu-probe.c | 6 + arch/mips/kernel/cpu-r3k-probe.c | 3 + arch/mips/kernel/traps.c | 10 +- arch/powerpc/include/asm/code-patching.h | 2 +- arch/powerpc/include/asm/machdep.h | 3 + arch/powerpc/include/asm/ptrace.h | 7 +- arch/powerpc/include/asm/switch_to.h | 10 ++ arch/powerpc/kernel/asm-offsets.c | 2 +- arch/powerpc/kernel/exceptions-64s.S | 2 +- arch/powerpc/kernel/head_book3s_32.S | 9 +- arch/powerpc/kernel/pci-common.c | 10 ++ arch/powerpc/kernel/process.c | 2 +- arch/powerpc/kernel/traps.c | 5 +- arch/powerpc/lib/sstep.c | 4 +- arch/powerpc/perf/core-book3s.c | 23 ++- arch/powerpc/platforms/pseries/msi.c | 25 ++- arch/s390/kernel/smp.c | 2 +- arch/sparc/include/asm/mman.h | 54 ++++--- arch/sparc/mm/init_32.c | 3 + arch/x86/entry/common.c | 3 +- arch/x86/entry/entry_64_compat.S | 2 + arch/x86/events/intel/core.c | 5 +- arch/x86/include/asm/insn-eval.h | 2 + arch/x86/include/asm/proto.h | 1 + arch/x86/include/asm/ptrace.h | 15 ++ arch/x86/kernel/kvmclock.c | 19 ++- arch/x86/kernel/sev-es.c | 22 ++- arch/x86/kernel/traps.c | 3 +- arch/x86/kernel/unwind_orc.c | 12 +- arch/x86/kvm/lapic.c | 11 +- arch/x86/lib/insn-eval.c | 66 ++++++-- block/blk-zoned.c | 38 ++++- crypto/Kconfig | 2 +- drivers/base/memory.c | 25 ++- drivers/base/swnode.c | 3 + drivers/base/test/Makefile | 1 + drivers/block/rsxx/core.c | 1 + drivers/block/zram/zram_drv.c | 17 ++- drivers/clk/qcom/gdsc.c | 10 +- drivers/clk/qcom/gdsc.h | 3 +- drivers/clk/qcom/gpucc-msm8998.c | 8 +- drivers/cpufreq/qcom-cpufreq-hw.c | 6 +- drivers/firmware/efi/libstub/efi-stub.c | 16 ++ drivers/gpio/gpio-pca953x.c | 78 +++------- drivers/gpio/gpiolib-acpi.c | 19 ++- drivers/gpio/gpiolib.c | 16 +- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 + drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 49 +++--- drivers/gpu/drm/amd/display/dc/core/dc_link.c | 1 - .../gpu/drm/amd/display/dc/dcn21/dcn21_resource.c | 6 +- .../gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c | 8 +- .../gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c | 48 ++++++ .../gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c | 66 ++++++++ .../gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c | 48 +++--- drivers/gpu/drm/drm_fb_helper.c | 2 +- drivers/gpu/drm/drm_gem_shmem_helper.c | 32 ++-- drivers/gpu/drm/drm_ioc32.c | 11 ++ drivers/gpu/drm/i915/gt/intel_engine_cs.c | 7 +- drivers/gpu/drm/i915/i915_cmd_parser.c | 19 ++- drivers/gpu/drm/i915/i915_drv.h | 2 +- drivers/gpu/drm/meson/meson_drv.c | 11 ++ drivers/gpu/drm/nouveau/nouveau_bo.c | 6 +- drivers/gpu/drm/qxl/qxl_display.c | 1 + drivers/gpu/drm/radeon/radeon.h | 2 + drivers/gpu/drm/radeon/radeon_gem.c | 4 +- drivers/gpu/drm/radeon/radeon_prime.c | 2 + drivers/gpu/drm/tiny/gm12u320.c | 44 +++++- drivers/gpu/drm/ttm/ttm_pool.c | 4 +- drivers/gpu/drm/udl/udl_drv.c | 17 +++ drivers/gpu/drm/udl/udl_drv.h | 1 + drivers/gpu/drm/udl/udl_main.c | 10 ++ drivers/hid/hid-logitech-dj.c | 7 +- drivers/i2c/busses/i2c-rcar.c | 13 +- drivers/input/keyboard/applespi.c | 21 ++- drivers/iommu/amd/init.c | 45 ++++-- drivers/iommu/intel/svm.c | 13 +- .../media/platform/rockchip/rkisp1/rkisp1-params.c | 1 - drivers/media/platform/vsp1/vsp1_drm.c | 6 +- drivers/media/rc/Makefile | 1 + drivers/media/rc/keymaps/Makefile | 1 - drivers/media/rc/keymaps/rc-cec.c | 28 ++-- drivers/media/rc/rc-main.c | 6 + drivers/media/usb/usbtv/usbtv-audio.c | 2 +- drivers/misc/fastrpc.c | 5 + drivers/misc/pvpanic.c | 1 + drivers/mmc/core/bus.c | 11 +- drivers/mmc/core/mmc.c | 15 +- drivers/mmc/host/mmci.c | 10 +- drivers/mmc/host/mtk-sd.c | 18 ++- drivers/mmc/host/mxs-mmc.c | 2 +- drivers/mmc/host/sdhci-iproc.c | 18 +++ drivers/net/Kconfig | 2 +- drivers/net/can/flexcan.c | 24 +-- drivers/net/can/m_can/tcan4x5x.c | 6 +- drivers/net/dsa/sja1105/sja1105_main.c | 2 +- drivers/net/ethernet/atheros/alx/main.c | 7 +- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 14 +- drivers/net/ethernet/cadence/macb_main.c | 15 +- drivers/net/ethernet/davicom/dm9000.c | 21 ++- drivers/net/ethernet/freescale/enetc/enetc.c | 93 ++++++------ drivers/net/ethernet/freescale/enetc/enetc.h | 5 + drivers/net/ethernet/freescale/enetc/enetc_hw.h | 18 ++- drivers/net/ethernet/freescale/enetc/enetc_pf.c | 98 +++++++++--- drivers/net/ethernet/freescale/enetc/enetc_vf.c | 7 + .../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h | 6 +- .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 7 +- drivers/net/ethernet/ibm/ibmvnic.c | 17 +-- drivers/net/ethernet/intel/i40e/i40e_main.c | 2 + drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 5 + drivers/net/ethernet/intel/ixgbevf/ipsec.c | 5 + drivers/net/ethernet/mediatek/mtk_star_emac.c | 5 +- drivers/net/ethernet/mellanox/mlx4/en_ethtool.c | 2 +- drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 2 + drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 1 + drivers/net/ethernet/mellanox/mlxsw/reg.h | 1 + .../net/ethernet/mellanox/mlxsw/spectrum_ethtool.c | 5 + .../net/ethernet/mellanox/mlxsw/spectrum_router.c | 7 + drivers/net/ethernet/mellanox/mlxsw/switchx2.c | 3 +- drivers/net/ethernet/mscc/ocelot_flower.c | 3 +- drivers/net/ethernet/realtek/r8169_main.c | 2 +- drivers/net/ethernet/renesas/sh_eth.c | 7 + drivers/net/ethernet/stmicro/stmmac/dwmac-intel.c | 5 +- drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c | 9 +- drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c | 19 ++- drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c | 4 - .../net/ethernet/stmicro/stmmac/dwxgmac2_descs.c | 2 +- drivers/net/ethernet/stmicro/stmmac/hwif.h | 2 +- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 19 ++- drivers/net/netdevsim/netdev.c | 1 + drivers/net/phy/dp83822.c | 9 +- drivers/net/phy/dp83tc811.c | 11 +- drivers/net/phy/phy.c | 6 +- drivers/net/phy/phy_device.c | 6 +- drivers/net/usb/qmi_wwan.c | 14 -- drivers/net/wan/lapbether.c | 3 - drivers/net/wireless/ath/ath11k/mac.c | 4 +- drivers/net/wireless/ath/ath9k/ath9k.h | 3 +- drivers/net/wireless/ath/ath9k/xmit.c | 6 + drivers/net/wireless/mediatek/mt76/dma.c | 11 +- drivers/nvme/host/fc.c | 2 +- drivers/opp/core.c | 48 +++--- drivers/opp/opp.h | 2 + drivers/pci/controller/pci-xgene-msi.c | 10 +- drivers/pci/controller/pcie-mediatek.c | 7 +- drivers/pci/pci.c | 4 + drivers/pci/pcie/Kconfig | 8 - drivers/pci/pcie/Makefile | 1 - drivers/pci/pcie/bw_notification.c | 138 ----------------- drivers/pci/pcie/err.c | 3 +- drivers/pci/pcie/portdrv.h | 6 - drivers/pci/pcie/portdrv_pci.c | 1 - drivers/perf/arm_dmc620_pmu.c | 1 + drivers/platform/olpc/olpc-ec.c | 15 +- drivers/platform/x86/amd-pmc.c | 14 +- drivers/s390/block/dasd.c | 6 +- drivers/s390/cio/vfio_ccw_ops.c | 6 +- drivers/s390/crypto/vfio_ap_ops.c | 2 +- drivers/s390/net/qeth_core.h | 3 +- drivers/s390/net/qeth_core_main.c | 128 ++++++++-------- drivers/scsi/libiscsi.c | 11 +- drivers/scsi/pm8001/pm8001_hwi.c | 14 +- drivers/scsi/ufs/ufs-sysfs.c | 3 +- drivers/scsi/ufs/ufs.h | 6 +- drivers/scsi/ufs/ufshcd.c | 82 ++++++---- drivers/scsi/ufs/ufshcd.h | 6 +- drivers/spi/spi-stm32.c | 15 +- drivers/staging/comedi/drivers/addi_apci_1032.c | 4 +- drivers/staging/comedi/drivers/addi_apci_1500.c | 18 +-- drivers/staging/comedi/drivers/adv_pci1710.c | 10 +- drivers/staging/comedi/drivers/das6402.c | 2 +- drivers/staging/comedi/drivers/das800.c | 2 +- drivers/staging/comedi/drivers/dmm32at.c | 2 +- drivers/staging/comedi/drivers/me4000.c | 2 +- drivers/staging/comedi/drivers/pcl711.c | 2 +- drivers/staging/comedi/drivers/pcl818.c | 2 +- drivers/staging/ks7010/ks_wlan_net.c | 6 +- drivers/staging/rtl8188eu/core/rtw_ap.c | 5 + drivers/staging/rtl8188eu/os_dep/ioctl_linux.c | 6 +- drivers/staging/rtl8192e/rtl8192e/rtl_wx.c | 7 +- drivers/staging/rtl8192u/r8192U_wx.c | 6 +- drivers/staging/rtl8712/rtl871x_cmd.c | 6 +- drivers/staging/rtl8712/rtl871x_ioctl_linux.c | 2 +- drivers/target/target_core_pr.c | 15 +- drivers/target/target_core_transport.c | 15 +- drivers/tty/serial/max310x.c | 29 +--- drivers/usb/class/cdc-acm.c | 5 + drivers/usb/class/usblp.c | 16 +- drivers/usb/core/usb.c | 32 ++++ drivers/usb/dwc3/dwc3-qcom.c | 77 +++++++++- drivers/usb/gadget/function/f_uac1.c | 1 + drivers/usb/gadget/function/f_uac2.c | 2 +- drivers/usb/gadget/function/u_ether_configfs.h | 5 +- drivers/usb/gadget/udc/s3c2410_udc.c | 4 +- drivers/usb/host/xhci-pci.c | 13 +- drivers/usb/host/xhci-ring.c | 3 +- drivers/usb/host/xhci.c | 78 +++++----- drivers/usb/host/xhci.h | 1 + drivers/usb/renesas_usbhs/pipe.c | 2 + drivers/usb/serial/ch341.c | 1 + drivers/usb/serial/cp210x.c | 3 + drivers/usb/serial/io_edgeport.c | 26 ++-- drivers/usb/usbip/stub_dev.c | 42 +++++- drivers/usb/usbip/vhci_sysfs.c | 39 ++++- drivers/usb/usbip/vudc_sysfs.c | 49 +++++- drivers/xen/events/events_2l.c | 22 ++- drivers/xen/events/events_base.c | 130 ++++++++++++---- drivers/xen/events/events_fifo.c | 7 - drivers/xen/events/events_internal.h | 14 +- fs/binfmt_misc.c | 29 ++-- fs/block_dev.c | 11 +- fs/cifs/cifsfs.c | 2 +- fs/cifs/cifsglob.h | 11 +- fs/cifs/connect.c | 10 +- fs/cifs/sess.c | 1 + fs/cifs/smb2inode.c | 1 + fs/cifs/smb2misc.c | 8 +- fs/cifs/smb2ops.c | 10 +- fs/cifs/smb2proto.h | 3 +- fs/cifs/transport.c | 2 +- fs/configfs/file.c | 6 +- fs/ext4/super.c | 9 +- fs/io_uring.c | 3 +- fs/nfs/dir.c | 40 +++-- fs/nfs/nfs4proc.c | 2 +- fs/pnode.h | 2 +- fs/udf/inode.c | 9 +- include/linux/acpi.h | 10 +- include/linux/can/skb.h | 8 +- include/linux/compiler-clang.h | 6 + include/linux/gpio/consumer.h | 2 + include/linux/memblock.h | 4 +- include/linux/memcontrol.h | 6 +- include/linux/memory.h | 3 +- include/linux/perf_event.h | 2 + include/linux/pgtable.h | 4 + include/linux/sched/mm.h | 3 +- include/linux/seqlock.h | 5 +- include/linux/stop_machine.h | 11 +- include/linux/textsearch.h | 2 +- include/linux/usb.h | 2 + include/linux/virtio_net.h | 7 +- include/media/rc-map.h | 7 + include/target/target_core_backend.h | 1 + include/uapi/linux/l2tp.h | 1 + include/uapi/linux/netfilter/nfnetlink_cthelper.h | 2 +- kernel/events/core.c | 42 +++++- kernel/sched/core.c | 126 ++++++++-------- kernel/sched/membarrier.c | 4 +- kernel/sys.c | 2 +- kernel/sysctl.c | 8 +- kernel/time/hrtimer.c | 60 +++++--- lib/Kconfig.kasan | 1 + lib/logic_pio.c | 3 + lib/test_kasan.c | 10 +- mm/highmem.c | 17 ++- mm/huge_memory.c | 2 +- mm/madvise.c | 13 +- mm/memcontrol.c | 15 +- mm/memory.c | 8 + mm/memory_hotplug.c | 2 +- mm/page_alloc.c | 167 ++++++++++----------- mm/slub.c | 2 +- net/core/skbuff.c | 2 + net/dsa/tag_mtk.c | 19 ++- net/dsa/tag_rtl4_a.c | 12 +- net/ethtool/channels.c | 26 ++-- net/ipv4/cipso_ipv4.c | 11 +- net/ipv4/ip_tunnel.c | 5 +- net/ipv4/ip_vti.c | 6 +- net/ipv4/nexthop.c | 10 +- net/ipv4/tcp.c | 26 ++-- net/ipv4/udp_offload.c | 2 +- net/ipv6/calipso.c | 14 +- net/ipv6/ip6_gre.c | 16 +- net/ipv6/ip6_tunnel.c | 10 +- net/ipv6/ip6_vti.c | 6 +- net/ipv6/sit.c | 2 +- net/l2tp/l2tp_core.c | 41 ++--- net/l2tp/l2tp_core.h | 1 + net/l2tp/l2tp_netlink.c | 6 + net/mpls/mpls_gso.c | 3 + net/mptcp/protocol.c | 40 ++--- net/mptcp/protocol.h | 15 +- net/mptcp/subflow.c | 4 + net/netfilter/nf_nat_proto.c | 25 ++- net/netfilter/x_tables.c | 6 +- net/netlabel/netlabel_cipso_v4.c | 3 + net/qrtr/qrtr.c | 4 +- net/sched/sch_api.c | 8 +- net/sunrpc/sched.c | 5 +- samples/bpf/xdpsock_user.c | 2 + security/commoncap.c | 12 +- sound/pci/hda/hda_bind.c | 4 + sound/pci/hda/hda_controller.c | 7 - sound/pci/hda/hda_intel.c | 2 + sound/pci/hda/patch_ca0132.c | 1 + sound/pci/hda/patch_conexant.c | 62 +++++--- sound/pci/hda/patch_hdmi.c | 13 ++ sound/usb/card.c | 6 + sound/usb/quirks.c | 11 +- sound/usb/usbaudio.h | 1 + tools/bpf/resolve_btfids/main.c | 5 + tools/lib/bpf/xsk.c | 5 +- tools/perf/Makefile.perf | 2 +- tools/perf/util/sort.c | 4 +- tools/perf/util/trace-event-read.c | 1 + .../selftests/bpf/progs/netif_receive_skb.c | 13 +- .../testing/selftests/bpf/progs/test_tunnel_kern.c | 6 +- .../testing/selftests/bpf/verifier/array_access.c | 3 +- .../net/forwarding/mirror_gre_bridge_1d_vlan.sh | 9 ++ 343 files changed, 2826 insertions(+), 1666 deletions(-)
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Dmitry V. Levin ldv@altlinux.org
commit c33cb0020ee6dd96cc9976d6085a7d8422f6dbed upstream.
Apparently, <linux/netfilter/nfnetlink_cthelper.h> and <linux/netfilter/nfnetlink_acct.h> could not be included into the same compilation unit because of a cut-and-paste typo in the former header.
Fixes: 12f7a505331e6 ("netfilter: add user-space connection tracking helper infrastructure") Cc: stable@vger.kernel.org # v3.6 Signed-off-by: Dmitry V. Levin ldv@altlinux.org Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/uapi/linux/netfilter/nfnetlink_cthelper.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/uapi/linux/netfilter/nfnetlink_cthelper.h +++ b/include/uapi/linux/netfilter/nfnetlink_cthelper.h @@ -5,7 +5,7 @@ #define NFCT_HELPER_STATUS_DISABLED 0 #define NFCT_HELPER_STATUS_ENABLED 1
-enum nfnl_acct_msg_types { +enum nfnl_cthelper_msg_types { NFNL_MSG_CTHELPER_NEW, NFNL_MSG_CTHELPER_GET, NFNL_MSG_CTHELPER_DEL,
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Christophe Leroy christophe.leroy@csgroup.eu
commit c119565a15a628efdfa51352f9f6c5186e506a1c upstream.
On book3s/32, page protection is defined by the PP bits in the PTE which provide the following protection depending on the access keys defined in the matching segment register: - PP 00 means RW with key 0 and N/A with key 1. - PP 01 means RW with key 0 and RO with key 1. - PP 10 means RW with both key 0 and key 1. - PP 11 means RO with both key 0 and key 1.
Since the implementation of kernel userspace access protection, PP bits have been set as follows: - PP00 for pages without _PAGE_USER - PP01 for pages with _PAGE_USER and _PAGE_RW - PP11 for pages with _PAGE_USER and without _PAGE_RW
For kernelspace segments, kernel accesses are performed with key 0 and user accesses are performed with key 1. As PP00 is used for non _PAGE_USER pages, user can't access kernel pages not flagged _PAGE_USER while kernel can.
For userspace segments, both kernel and user accesses are performed with key 0, therefore pages not flagged _PAGE_USER are still accessible to the user.
This shouldn't be an issue, because userspace is expected to be accessible to the user. But unlike most other architectures, powerpc implements PROT_NONE protection by removing _PAGE_USER flag instead of flagging the page as not valid. This means that pages in userspace that are not flagged _PAGE_USER shall remain inaccessible.
To get the expected behaviour, just mimic other architectures in the TLB miss handler by checking _PAGE_USER permission on userspace accesses as if it was the _PAGE_PRESENT bit.
Note that this problem only is only for 603 cores. The 604+ have an hash table, and hash_page() function already implement the verification of _PAGE_USER permission on userspace pages.
Fixes: f342adca3afc ("powerpc/32s: Prepare Kernel Userspace Access Protection") Cc: stable@vger.kernel.org # v5.2+ Reported-by: Christoph Plattner christoph.plattner@thalesgroup.com Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/4a0c6e3bb8f0c162457bf54d9bc6fd8d7b55129f.161216090... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/kernel/head_book3s_32.S | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
--- a/arch/powerpc/kernel/head_book3s_32.S +++ b/arch/powerpc/kernel/head_book3s_32.S @@ -447,11 +447,12 @@ InstructionTLBMiss: cmplw 0,r1,r3 #endif mfspr r2, SPRN_SDR1 - li r1,_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_EXEC + li r1,_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_EXEC | _PAGE_USER rlwinm r2, r2, 28, 0xfffff000 #ifdef CONFIG_MODULES bgt- 112f lis r2, (swapper_pg_dir - PAGE_OFFSET)@ha /* if kernel address, use */ + li r1,_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_EXEC addi r2, r2, (swapper_pg_dir - PAGE_OFFSET)@l /* kernel page table */ #endif 112: rlwimi r2,r3,12,20,29 /* insert top 10 bits of address */ @@ -510,10 +511,11 @@ DataLoadTLBMiss: lis r1, TASK_SIZE@h /* check if kernel address */ cmplw 0,r1,r3 mfspr r2, SPRN_SDR1 - li r1, _PAGE_PRESENT | _PAGE_ACCESSED + li r1, _PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_USER rlwinm r2, r2, 28, 0xfffff000 bgt- 112f lis r2, (swapper_pg_dir - PAGE_OFFSET)@ha /* if kernel address, use */ + li r1, _PAGE_PRESENT | _PAGE_ACCESSED addi r2, r2, (swapper_pg_dir - PAGE_OFFSET)@l /* kernel page table */ 112: rlwimi r2,r3,12,20,29 /* insert top 10 bits of address */ lwz r2,0(r2) /* get pmd entry */ @@ -587,10 +589,11 @@ DataStoreTLBMiss: lis r1, TASK_SIZE@h /* check if kernel address */ cmplw 0,r1,r3 mfspr r2, SPRN_SDR1 - li r1, _PAGE_RW | _PAGE_DIRTY | _PAGE_PRESENT | _PAGE_ACCESSED + li r1, _PAGE_RW | _PAGE_DIRTY | _PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_USER rlwinm r2, r2, 28, 0xfffff000 bgt- 112f lis r2, (swapper_pg_dir - PAGE_OFFSET)@ha /* if kernel address, use */ + li r1, _PAGE_RW | _PAGE_DIRTY | _PAGE_PRESENT | _PAGE_ACCESSED addi r2, r2, (swapper_pg_dir - PAGE_OFFSET)@l /* kernel page table */ 112: rlwimi r2,r3,12,20,29 /* insert top 10 bits of address */ lwz r2,0(r2) /* get pmd entry */
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Athira Rajeev atrajeev@linux.vnet.ibm.com
commit 5ae5fbd2107959b68ac69a8b75412208663aea88 upstream.
Running "perf mem record" in powerpc platforms with selinux enabled resulted in soft lockup's. Below call-trace was seen in the logs:
CPU: 58 PID: 3751 Comm: sssd_nss Not tainted 5.11.0-rc7+ #2 NIP: c000000000dff3d4 LR: c000000000dff3d0 CTR: 0000000000000000 REGS: c000007fffab7d60 TRAP: 0100 Not tainted (5.11.0-rc7+) ... NIP _raw_spin_lock_irqsave+0x94/0x120 LR _raw_spin_lock_irqsave+0x90/0x120 Call Trace: 0xc00000000fd47260 (unreliable) skb_queue_tail+0x3c/0x90 audit_log_end+0x6c/0x180 common_lsm_audit+0xb0/0xe0 slow_avc_audit+0xa4/0x110 avc_has_perm+0x1c4/0x260 selinux_perf_event_open+0x74/0xd0 security_perf_event_open+0x68/0xc0 record_and_restart+0x6e8/0x7f0 perf_event_interrupt+0x22c/0x560 performance_monitor_exception0x4c/0x60 performance_monitor_common_virt+0x1c8/0x1d0 interrupt: f00 at _raw_spin_lock_irqsave+0x38/0x120 NIP: c000000000dff378 LR: c000000000b5fbbc CTR: c0000000007d47f0 REGS: c00000000fd47860 TRAP: 0f00 Not tainted (5.11.0-rc7+) ... NIP _raw_spin_lock_irqsave+0x38/0x120 LR skb_queue_tail+0x3c/0x90 interrupt: f00 0x38 (unreliable) 0xc00000000aae6200 audit_log_end+0x6c/0x180 audit_log_exit+0x344/0xf80 __audit_syscall_exit+0x2c0/0x320 do_syscall_trace_leave+0x148/0x200 syscall_exit_prepare+0x324/0x390 system_call_common+0xfc/0x27c
The above trace shows that while the CPU was handling a performance monitor exception, there was a call to security_perf_event_open() function. In powerpc core-book3s, this function is called from perf_allow_kernel() check during recording of data address in the sample via perf_get_data_addr().
Commit da97e18458fb ("perf_event: Add support for LSM and SELinux checks") introduced security enhancements to perf. As part of this commit, the new security hook for perf_event_open() was added in all places where perf paranoid check was previously used. In powerpc core-book3s code, originally had paranoid checks in perf_get_data_addr() and power_pmu_bhrb_read(). So perf_paranoid_kernel() checks were replaced with perf_allow_kernel() in these PMU helper functions as well.
The intention of paranoid checks in core-book3s was to verify privilege access before capturing some of the sample data. Along with paranoid checks, perf_allow_kernel() also does a security_perf_event_open(). Since these functions are accessed while recording a sample, we end up calling selinux_perf_event_open() in PMI context. Some of the security functions use spinlock like sidtab_sid2str_put(). If a perf interrupt hits under a spin lock and if we end up in calling selinux hook functions in PMI handler, this could cause a dead lock.
Since the purpose of this security hook is to control access to perf_event_open(), it is not right to call this in interrupt context.
The paranoid checks in powerpc core-book3s were done at interrupt time which is also not correct.
Reference commits: Commit cd1231d7035f ("powerpc/perf: Prevent kernel address leak via perf_get_data_addr()") Commit bb19af816025 ("powerpc/perf: Prevent kernel address leak to userspace via BHRB buffer")
We only allow creation of events that have already passed the privilege checks in perf_event_open(). So these paranoid checks are not needed at event time. As a fix, patch uses 'event->attr.exclude_kernel' check to prevent exposing kernel address for userspace only sampling.
Fixes: cd1231d7035f ("powerpc/perf: Prevent kernel address leak via perf_get_data_addr()") Cc: stable@vger.kernel.org # v4.17+ Suggested-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Athira Rajeev atrajeev@linux.vnet.ibm.com Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/1614247839-1428-1-git-send-email-atrajeev@linux.vn... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/perf/core-book3s.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/powerpc/perf/core-book3s.c +++ b/arch/powerpc/perf/core-book3s.c @@ -212,7 +212,7 @@ static inline void perf_get_data_addr(st if (!(mmcra & MMCRA_SAMPLE_ENABLE) || sdar_valid) *addrp = mfspr(SPRN_SDAR);
- if (is_kernel_addr(mfspr(SPRN_SDAR)) && perf_allow_kernel(&event->attr) != 0) + if (is_kernel_addr(mfspr(SPRN_SDAR)) && event->attr.exclude_kernel) *addrp = 0; }
@@ -506,7 +506,7 @@ static void power_pmu_bhrb_read(struct p * addresses, hence include a check before filtering code */ if (!(ppmu->flags & PPMU_ARCH_31) && - is_kernel_addr(addr) && perf_allow_kernel(&event->attr) != 0) + is_kernel_addr(addr) && event->attr.exclude_kernel) continue;
/* Branches are read most recent first (ie. mfbhrb 0 is
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Greg Kurz groug@kaod.org
commit f9619d5e5174867536b7e558683bc4408eab833f upstream.
Depending on the number of online CPUs in the original kernel, it is likely for CPU #0 to be offline in a kdump kernel. The associated IRQs in the affinity mappings provided by irq_create_affinity_masks() are thus not started by irq_startup(), as per-design with managed IRQs.
This can be a problem with multi-queue block devices driven by blk-mq : such a non-started IRQ is very likely paired with the single queue enforced by blk-mq during kdump (see blk_mq_alloc_tag_set()). This causes the device to remain silent and likely hangs the guest at some point.
This is a regression caused by commit 9ea69a55b3b9 ("powerpc/pseries: Pass MSI affinity to irq_create_mapping()"). Note that this only happens with the XIVE interrupt controller because XICS has a workaround to bypass affinity, which is activated during kdump with the "noirqdistrib" kernel parameter.
The issue comes from a combination of factors: - discrepancy between the number of queues detected by the multi-queue block driver, that was used to create the MSI vectors, and the single queue mode enforced later on by blk-mq because of kdump (i.e. keeping all queues fixes the issue) - CPU#0 offline (i.e. kdump always succeed with CPU#0)
Given that I couldn't reproduce on x86, which seems to always have CPU#0 online even during kdump, I'm not sure where this should be fixed. Hence going for another approach : fine-grained affinity is for performance and we don't really care about that during kdump. Simply revert to the previous working behavior of ignoring affinity masks in this case only.
Fixes: 9ea69a55b3b9 ("powerpc/pseries: Pass MSI affinity to irq_create_mapping()") Cc: stable@vger.kernel.org # v5.10+ Signed-off-by: Greg Kurz groug@kaod.org Reviewed-by: Laurent Vivier lvivier@redhat.com Reviewed-by: Cédric Le Goater clg@kaod.org Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20210215094506.1196119-1-groug@kaod.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/platforms/pseries/msi.c | 25 +++++++++++++++++++++++-- 1 file changed, 23 insertions(+), 2 deletions(-)
--- a/arch/powerpc/platforms/pseries/msi.c +++ b/arch/powerpc/platforms/pseries/msi.c @@ -4,6 +4,7 @@ * Copyright 2006-2007 Michael Ellerman, IBM Corp. */
+#include <linux/crash_dump.h> #include <linux/device.h> #include <linux/irq.h> #include <linux/msi.h> @@ -458,8 +459,28 @@ again: return hwirq; }
- virq = irq_create_mapping_affinity(NULL, hwirq, - entry->affinity); + /* + * Depending on the number of online CPUs in the original + * kernel, it is likely for CPU #0 to be offline in a kdump + * kernel. The associated IRQs in the affinity mappings + * provided by irq_create_affinity_masks() are thus not + * started by irq_startup(), as per-design for managed IRQs. + * This can be a problem with multi-queue block devices driven + * by blk-mq : such a non-started IRQ is very likely paired + * with the single queue enforced by blk-mq during kdump (see + * blk_mq_alloc_tag_set()). This causes the device to remain + * silent and likely hangs the guest at some point. + * + * We don't really care for fine-grained affinity when doing + * kdump actually : simply ignore the pre-computed affinity + * masks in this case and let the default mask with all CPUs + * be used when creating the IRQ mappings. + */ + if (is_kdump_kernel()) + virq = irq_create_mapping(NULL, hwirq); + else + virq = irq_create_mapping_affinity(NULL, hwirq, + entry->affinity);
if (!virq) { pr_debug("rtas_msi: Failed mapping hwirq %d\n", hwirq);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Jakub Kicinski kuba@kernel.org
commit a4dcfbc4ee2218abd567d81d795082d8d4afcdf6 upstream.
netif_device_attach() will unpause the queues so we can't call it before __alx_open(). This went undetected until commit b0999223f224 ("alx: add ability to allocate and free alx_napi structures") but now if stack tries to xmit immediately on resume before __alx_open() we'll crash on the NAPI being null:
BUG: kernel NULL pointer dereference, address: 0000000000000198 CPU: 0 PID: 12 Comm: ksoftirqd/0 Tainted: G OE 5.10.0-3-amd64 #1 Debian 5.10.13-1 Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77-D3H, BIOS F15 11/14/2013 RIP: 0010:alx_start_xmit+0x34/0x650 [alx] Code: 41 56 41 55 41 54 55 53 48 83 ec 20 0f b7 57 7c 8b 8e b0 0b 00 00 39 ca 72 06 89 d0 31 d2 f7 f1 89 d2 48 8b 84 df RSP: 0018:ffffb09240083d28 EFLAGS: 00010297 RAX: 0000000000000000 RBX: ffffa04d80ae7800 RCX: 0000000000000004 RDX: 0000000000000000 RSI: ffffa04d80afa000 RDI: ffffa04e92e92a00 RBP: 0000000000000042 R08: 0000000000000100 R09: ffffa04ea3146700 R10: 0000000000000014 R11: 0000000000000000 R12: ffffa04e92e92100 R13: 0000000000000001 R14: ffffa04e92e92a00 R15: ffffa04e92e92a00 FS: 0000000000000000(0000) GS:ffffa0508f600000(0000) knlGS:0000000000000000 i915 0000:00:02.0: vblank wait timed out on crtc 0 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000198 CR3: 000000004460a001 CR4: 00000000001706f0 Call Trace: dev_hard_start_xmit+0xc7/0x1e0 sch_direct_xmit+0x10f/0x310
Cc: stable@vger.kernel.org # 4.9+ Fixes: bc2bebe8de8e ("alx: remove WoL support") Reported-by: Zbynek Michl zbynek.michl@gmail.com Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=983595 Signed-off-by: Jakub Kicinski kuba@kernel.org Tested-by: Zbynek Michl zbynek.michl@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/atheros/alx/main.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/atheros/alx/main.c +++ b/drivers/net/ethernet/atheros/alx/main.c @@ -1894,13 +1894,16 @@ static int alx_resume(struct device *dev
if (!netif_running(alx->dev)) return 0; - netif_device_attach(alx->dev);
rtnl_lock(); err = __alx_open(alx, true); rtnl_unlock(); + if (err) + return err; + + netif_device_attach(alx->dev);
- return err; + return 0; }
static SIMPLE_DEV_PM_OPS(alx_pm_ops, alx_suspend, alx_resume);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Maciej W. Rozycki macro@orcam.me.uk
commit 6c810cf20feef0d4338e9b424ab7f2644a8b353e upstream.
The MIPS Poly1305 implementation is generic MIPS code written such as to support down to the original MIPS I and MIPS III ISA for the 32-bit and 64-bit variant respectively. Lift the current limitation then to enable code for MIPSr1 ISA or newer processors only and have it available for all MIPS processors.
Signed-off-by: Maciej W. Rozycki macro@orcam.me.uk Fixes: a11d055e7a64 ("crypto: mips/poly1305 - incorporate OpenSSL/CRYPTOGAMS optimized implementation") Cc: stable@vger.kernel.org # v5.5+ Acked-by: Jason A. Donenfeld Jason@zx2c4.com Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/mips/crypto/Makefile | 4 ++-- crypto/Kconfig | 2 +- drivers/net/Kconfig | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-)
--- a/arch/mips/crypto/Makefile +++ b/arch/mips/crypto/Makefile @@ -12,8 +12,8 @@ AFLAGS_chacha-core.o += -O2 # needed to obj-$(CONFIG_CRYPTO_POLY1305_MIPS) += poly1305-mips.o poly1305-mips-y := poly1305-core.o poly1305-glue.o
-perlasm-flavour-$(CONFIG_CPU_MIPS32) := o32 -perlasm-flavour-$(CONFIG_CPU_MIPS64) := 64 +perlasm-flavour-$(CONFIG_32BIT) := o32 +perlasm-flavour-$(CONFIG_64BIT) := 64
quiet_cmd_perlasm = PERLASM $@ cmd_perlasm = $(PERL) $(<) $(perlasm-flavour-y) $(@) --- a/crypto/Kconfig +++ b/crypto/Kconfig @@ -772,7 +772,7 @@ config CRYPTO_POLY1305_X86_64
config CRYPTO_POLY1305_MIPS tristate "Poly1305 authenticator algorithm (MIPS optimized)" - depends on CPU_MIPS32 || (CPU_MIPS64 && 64BIT) + depends on MIPS select CRYPTO_ARCH_HAVE_LIB_POLY1305
config CRYPTO_MD4 --- a/drivers/net/Kconfig +++ b/drivers/net/Kconfig @@ -92,7 +92,7 @@ config WIREGUARD select CRYPTO_POLY1305_ARM if ARM select CRYPTO_CURVE25519_NEON if ARM && KERNEL_MODE_NEON select CRYPTO_CHACHA_MIPS if CPU_MIPS32_R2 - select CRYPTO_POLY1305_MIPS if CPU_MIPS32 || (CPU_MIPS64 && 64BIT) + select CRYPTO_POLY1305_MIPS if MIPS help WireGuard is a secure, fast, and easy to use replacement for IPSec that uses modern cryptography and clever networking tricks. It's
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Davide Caratti dcaratti@redhat.com
commit 27ab92d9996e4e003a726d22c56d780a1655d6b4 upstream.
in current Linux, MPTCP peers advertising endpoints with port numbers use a sub-option length that wrongly accounts for the trailing TCP NOP. Also, receivers will only process incoming ADD_ADDR with port having such wrong sub-option length. Fix this, making ADD_ADDR compliant to RFC8684 §3.4.1.
this can be verified running tcpdump on the kselftests artifacts:
unpatched kernel: [root@bottarga mptcp]# tcpdump -tnnr unpatched.pcap | grep add-addr reading from file unpatched.pcap, link-type LINUX_SLL (Linux cooked v1), snapshot length 65535 IP 10.0.1.1.10000 > 10.0.1.2.53078: Flags [.], ack 101, win 509, options [nop,nop,TS val 214459678 ecr 521312851,mptcp add-addr v1 id 1 a00:201:2774:2d88:7436:85c3:17fd:101], length 0 IP 10.0.1.2.53078 > 10.0.1.1.10000: Flags [.], ack 101, win 502, options [nop,nop,TS val 521312852 ecr 214459678,mptcp add-addr[bad opt]]
patched kernel: [root@bottarga mptcp]# tcpdump -tnnr patched.pcap | grep add-addr reading from file patched.pcap, link-type LINUX_SLL (Linux cooked v1), snapshot length 65535 IP 10.0.1.1.10000 > 10.0.1.2.38178: Flags [.], ack 101, win 509, options [nop,nop,TS val 3728873902 ecr 2732713192,mptcp add-addr v1 id 1 10.0.2.1:10100 hmac 0xbccdfcbe59292a1f,nop,nop], length 0 IP 10.0.1.2.38178 > 10.0.1.1.10000: Flags [.], ack 101, win 502, options [nop,nop,TS val 2732713195 ecr 3728873902,mptcp add-addr v1-echo id 1 10.0.2.1:10100,nop,nop], length 0
Fixes: 22fb85ffaefb ("mptcp: add port support for ADD_ADDR suboption writing") CC: stable@vger.kernel.org # 5.11+ Reviewed-by: Mat Martineau mathew.j.martineau@linux.intel.com Acked-and-tested-by: Geliang Tang geliangtang@gmail.com Signed-off-by: Davide Caratti dcaratti@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/protocol.h | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
--- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -50,14 +50,15 @@ #define TCPOLEN_MPTCP_DSS_MAP64 14 #define TCPOLEN_MPTCP_DSS_CHECKSUM 2 #define TCPOLEN_MPTCP_ADD_ADDR 16 -#define TCPOLEN_MPTCP_ADD_ADDR_PORT 20 +#define TCPOLEN_MPTCP_ADD_ADDR_PORT 18 #define TCPOLEN_MPTCP_ADD_ADDR_BASE 8 -#define TCPOLEN_MPTCP_ADD_ADDR_BASE_PORT 12 +#define TCPOLEN_MPTCP_ADD_ADDR_BASE_PORT 10 #define TCPOLEN_MPTCP_ADD_ADDR6 28 -#define TCPOLEN_MPTCP_ADD_ADDR6_PORT 32 +#define TCPOLEN_MPTCP_ADD_ADDR6_PORT 30 #define TCPOLEN_MPTCP_ADD_ADDR6_BASE 20 -#define TCPOLEN_MPTCP_ADD_ADDR6_BASE_PORT 24 -#define TCPOLEN_MPTCP_PORT_LEN 4 +#define TCPOLEN_MPTCP_ADD_ADDR6_BASE_PORT 22 +#define TCPOLEN_MPTCP_PORT_LEN 2 +#define TCPOLEN_MPTCP_PORT_ALIGN 2 #define TCPOLEN_MPTCP_RM_ADDR_BASE 4 #define TCPOLEN_MPTCP_FASTCLOSE 12
@@ -587,8 +588,9 @@ static inline unsigned int mptcp_add_add len = TCPOLEN_MPTCP_ADD_ADDR6_BASE; if (!echo) len += MPTCPOPT_THMAC_LEN; + /* account for 2 trailing 'nop' options */ if (port) - len += TCPOLEN_MPTCP_PORT_LEN; + len += TCPOLEN_MPTCP_PORT_LEN + TCPOLEN_MPTCP_PORT_ALIGN;
return len; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Felix Fietkau nbd@nbd.name
commit 3b9ea7206d7e1fdd7419cbd10badd3b2c80d04b4 upstream.
When transmitting to a receiver in dynamic SMPS mode, all transmissions that use multiple spatial streams need to be sent using CTS-to-self or RTS/CTS to give the receiver's extra chains some time to wake up. This fixes the tx rate getting stuck at <= MCS7 for some clients, especially Intel ones, which make aggressive use of SMPS.
Cc: stable@vger.kernel.org Reported-by: Martin Kennedy hurricos@gmail.com Signed-off-by: Felix Fietkau nbd@nbd.name Signed-off-by: Kalle Valo kvalo@codeaurora.org Link: https://lore.kernel.org/r/20210214184911.96702-1-nbd@nbd.name Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/ath/ath9k/ath9k.h | 3 ++- drivers/net/wireless/ath/ath9k/xmit.c | 6 ++++++ 2 files changed, 8 insertions(+), 1 deletion(-)
--- a/drivers/net/wireless/ath/ath9k/ath9k.h +++ b/drivers/net/wireless/ath/ath9k/ath9k.h @@ -177,7 +177,8 @@ struct ath_frame_info { s8 txq; u8 keyix; u8 rtscts_rate; - u8 retries : 7; + u8 retries : 6; + u8 dyn_smps : 1; u8 baw_tracked : 1; u8 tx_power; enum ath9k_key_type keytype:2; --- a/drivers/net/wireless/ath/ath9k/xmit.c +++ b/drivers/net/wireless/ath/ath9k/xmit.c @@ -1271,6 +1271,11 @@ static void ath_buf_set_rate(struct ath_ is_40, is_sgi, is_sp); if (rix < 8 && (tx_info->flags & IEEE80211_TX_CTL_STBC)) info->rates[i].RateFlags |= ATH9K_RATESERIES_STBC; + if (rix >= 8 && fi->dyn_smps) { + info->rates[i].RateFlags |= + ATH9K_RATESERIES_RTS_CTS; + info->flags |= ATH9K_TXDESC_CTSENA; + }
info->txpower[i] = ath_get_rate_txpower(sc, bf, rix, is_40, false); @@ -2114,6 +2119,7 @@ static void setup_frame_info(struct ieee fi->keyix = an->ps_key; else fi->keyix = ATH9K_TXKEYIX_INVALID; + fi->dyn_smps = sta && sta->smps_mode == IEEE80211_SMPS_DYNAMIC; fi->keytype = keytype; fi->framelen = framelen; fi->tx_power = txpower;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Daniel Borkmann daniel@iogearbox.net
commit 89e5c58fc1e2857ccdaae506fb8bc5fed57ee063 upstream.
We noticed a GRO issue for UDP-based encaps such as vxlan/geneve when the csum for the UDP header itself is 0. In that case, GRO aggregation does not take place on the phys dev, but instead is deferred to the vxlan/geneve driver (see trace below).
The reason is essentially that GRO aggregation bails out in udp_gro_receive() for such case when drivers marked the skb with CHECKSUM_UNNECESSARY (ice, i40e, others) where for non-zero csums 2abb7cdc0dc8 ("udp: Add support for doing checksum unnecessary conversion") promotes those skbs to CHECKSUM_COMPLETE and napi context has csum_valid set. This is however not the case for zero UDP csum (here: csum_cnt is still 0 and csum_valid continues to be false).
At the same time 57c67ff4bd92 ("udp: additional GRO support") added matches on !uh->check ^ !uh2->check as part to determine candidates for aggregation, so it certainly is expected to handle zero csums in udp_gro_receive(). The purpose of the check added via 662880f44203 ("net: Allow GRO to use and set levels of checksum unnecessary") seems to catch bad csum and stop aggregation right away.
One way to fix aggregation in the zero case is to only perform the !csum_valid check in udp_gro_receive() if uh->check is infact non-zero.
Before:
[...] swapper 0 [008] 731.946506: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100400 len=1500 (1) swapper 0 [008] 731.946507: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100200 len=1500 swapper 0 [008] 731.946507: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101100 len=1500 swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101700 len=1500 swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101b00 len=1500 swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100600 len=1500 swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100f00 len=1500 swapper 0 [008] 731.946509: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100a00 len=1500 swapper 0 [008] 731.946516: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100500 len=1500 swapper 0 [008] 731.946516: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100700 len=1500 swapper 0 [008] 731.946516: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101d00 len=1500 (2) swapper 0 [008] 731.946517: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101000 len=1500 swapper 0 [008] 731.946517: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101c00 len=1500 swapper 0 [008] 731.946517: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101400 len=1500 swapper 0 [008] 731.946518: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100e00 len=1500 swapper 0 [008] 731.946518: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101600 len=1500 swapper 0 [008] 731.946521: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100800 len=774 swapper 0 [008] 731.946530: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff966497100400 len=14032 (1) swapper 0 [008] 731.946530: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff966497101d00 len=9112 (2) [...]
# netperf -H 10.55.10.4 -t TCP_STREAM -l 20 MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.55.10.4 () port 0 AF_INET : demo Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec
87380 16384 16384 20.01 13129.24
After:
[...] swapper 0 [026] 521.862641: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff93ab0d479000 len=11286 (1) swapper 0 [026] 521.862643: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d479000 len=11236 (1) swapper 0 [026] 521.862650: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff93ab0d478500 len=2898 (2) swapper 0 [026] 521.862650: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff93ab0d479f00 len=8490 (3) swapper 0 [026] 521.862653: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d478500 len=2848 (2) swapper 0 [026] 521.862653: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d479f00 len=8440 (3) [...]
# netperf -H 10.55.10.4 -t TCP_STREAM -l 20 MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.55.10.4 () port 0 AF_INET : demo Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec
87380 16384 16384 20.01 24576.53
Fixes: 57c67ff4bd92 ("udp: additional GRO support") Fixes: 662880f44203 ("net: Allow GRO to use and set levels of checksum unnecessary") Signed-off-by: Daniel Borkmann daniel@iogearbox.net Cc: Eric Dumazet edumazet@google.com Cc: Jesse Brandeburg jesse.brandeburg@intel.com Cc: Tom Herbert tom@herbertland.com Acked-by: Willem de Bruijn willemb@google.com Acked-by: John Fastabend john.fastabend@gmail.com Link: https://lore.kernel.org/r/20210226212248.8300-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/udp_offload.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/ipv4/udp_offload.c +++ b/net/ipv4/udp_offload.c @@ -525,7 +525,7 @@ struct sk_buff *udp_gro_receive(struct l }
if (!sk || NAPI_GRO_CB(skb)->encap_mark || - (skb->ip_summed != CHECKSUM_PARTIAL && + (uh->check && skb->ip_summed != CHECKSUM_PARTIAL && NAPI_GRO_CB(skb)->csum_cnt == 0 && !NAPI_GRO_CB(skb)->csum_valid) || !udp_sk(sk)->gro_receive)
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Balazs Nemeth bnemeth@redhat.com
commit 924a9bc362a5223cd448ca08c3dde21235adc310 upstream.
For gso packets, virtio_net_hdr_set_proto sets the protocol (if it isn't set) based on the type in the virtio net hdr, but the skb could contain anything since it could come from packet_snd through a raw socket. If there is a mismatch between what virtio_net_hdr_set_proto sets and the actual protocol, then the skb could be handled incorrectly later on.
An example where this poses an issue is with the subsequent call to skb_flow_dissect_flow_keys_basic which relies on skb->protocol being set correctly. A specially crafted packet could fool skb_flow_dissect_flow_keys_basic preventing EINVAL to be returned.
Avoid blindly trusting the information provided by the virtio net header by checking that the protocol in the packet actually matches the protocol set by virtio_net_hdr_set_proto. Note that since the protocol is only checked if skb->dev implements header_ops->parse_protocol, packets from devices without the implementation are not checked at this stage.
Fixes: 9274124f023b ("net: stricter validation of untrusted gso packets") Signed-off-by: Balazs Nemeth bnemeth@redhat.com Acked-by: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/virtio_net.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/include/linux/virtio_net.h +++ b/include/linux/virtio_net.h @@ -79,8 +79,13 @@ static inline int virtio_net_hdr_to_skb( if (gso_type && skb->network_header) { struct flow_keys_basic keys;
- if (!skb->protocol) + if (!skb->protocol) { + __be16 protocol = dev_parse_header_protocol(skb); + virtio_net_hdr_set_proto(skb, hdr); + if (protocol && protocol != skb->protocol) + return -EINVAL; + } retry: if (!skb_flow_dissect_flow_keys_basic(NULL, skb, &keys, NULL, 0, 0, 0,
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Balazs Nemeth bnemeth@redhat.com
commit d348ede32e99d3a04863e9f9b28d224456118c27 upstream.
A packet with skb_inner_network_header(skb) == skb_network_header(skb) and ETH_P_MPLS_UC will prevent mpls_gso_segment from pulling any headers from the packet. Subsequently, the call to skb_mac_gso_segment will again call mpls_gso_segment with the same packet leading to an infinite loop. In addition, ensure that the header length is a multiple of four, which should hold irrespective of the number of stacked labels.
Signed-off-by: Balazs Nemeth bnemeth@redhat.com Acked-by: Willem de Bruijn willemb@google.com Reviewed-by: David Ahern dsahern@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mpls/mpls_gso.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/net/mpls/mpls_gso.c +++ b/net/mpls/mpls_gso.c @@ -14,6 +14,7 @@ #include <linux/netdev_features.h> #include <linux/netdevice.h> #include <linux/skbuff.h> +#include <net/mpls.h>
static struct sk_buff *mpls_gso_segment(struct sk_buff *skb, netdev_features_t features) @@ -27,6 +28,8 @@ static struct sk_buff *mpls_gso_segment(
skb_reset_network_header(skb); mpls_hlen = skb_inner_network_header(skb) - skb_network_header(skb); + if (unlikely(!mpls_hlen || mpls_hlen % MPLS_HLEN)) + goto out; if (unlikely(!pskb_may_pull(skb, mpls_hlen))) goto out;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Kalle Valo kvalo@codeaurora.org
commit 77d7e87128d4dfb400df4208b2812160e999c165 upstream.
Commit c134d1f8c436 ("ath11k: Handle errors if peer creation fails") completely broke AP mode on QCA6390:
kernel: [ 151.230734] ath11k_pci 0000:06:00.0: failed to create peer after vdev start delay: -22 wpa_supplicant[2307]: Failed to set beacon parameters wpa_supplicant[2307]: Interface initialization failed wpa_supplicant[2307]: wlan0: interface state UNINITIALIZED->DISABLED wpa_supplicant[2307]: wlan0: AP-DISABLED wpa_supplicant[2307]: wlan0: Unable to setup interface. wpa_supplicant[2307]: Failed to initialize AP interface
This was because commit c134d1f8c436 ("ath11k: Handle errors if peer creation fails") added error handling for ath11k_peer_create(), which had been failing all along but was unnoticed due to the missing error handling. The actual bug was introduced already in commit aa44b2f3ecd4 ("ath11k: start vdev if a bss peer is already created").
ath11k_peer_create() was failing because for AP mode the peer is created already earlier op_add_interface() and we should skip creation here, but the check for modes was wrong. Fixing that makes AP mode work again.
This shouldn't affect IPQ8074 nor QCN9074 as they have hw_params.vdev_start_delay disabled.
Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-01740-QCAHSTSWPLZ_V2_TO_X86-1
Fixes: c134d1f8c436 ("ath11k: Handle errors if peer creation fails") Signed-off-by: Kalle Valo kvalo@codeaurora.org Link: https://lore.kernel.org/r/1614006849-25764-1-git-send-email-kvalo@codeaurora... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/ath/ath11k/mac.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/wireless/ath/ath11k/mac.c +++ b/drivers/net/wireless/ath/ath11k/mac.c @@ -5299,8 +5299,8 @@ ath11k_mac_op_assign_vif_chanctx(struct }
if (ab->hw_params.vdev_start_delay && - (arvif->vdev_type == WMI_VDEV_TYPE_AP || - arvif->vdev_type == WMI_VDEV_TYPE_MONITOR)) { + arvif->vdev_type != WMI_VDEV_TYPE_AP && + arvif->vdev_type != WMI_VDEV_TYPE_MONITOR) { param.vdev_id = arvif->vdev_id; param.peer_type = WMI_PEER_TYPE_DEFAULT; param.peer_addr = ar->mac_addr;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Matthias Schiffer mschiffer@universe-factory.net
commit 3e59e8856758eb5a2dfe1f831ef53b168fd58105 upstream.
Commit 5ee759cda51b ("l2tp: use standard API for warning log messages") changed a number of warnings about invalid packets in the receive path so that they are always shown, instead of only when a special L2TP debug flag is set. Even with rate limiting these warnings can easily cause significant log spam - potentially triggered by a malicious party sending invalid packets on purpose.
In addition these warnings were noticed by projects like Tunneldigger [1], which uses L2TP for its data path, but implements its own control protocol (which is sufficiently different from L2TP data packets that it would always be passed up to userspace even with future extensions of L2TP).
Some of the warnings were already redundant, as l2tp_stats has a counter for these packets. This commit adds one additional counter for invalid packets that are passed up to userspace. Packets with unknown session are not counted as invalid, as there is nothing wrong with the format of these packets.
With the additional counter, all of these messages are either redundant or benign, so we reduce them to pr_debug_ratelimited().
[1] https://github.com/wlanslovenija/tunneldigger/issues/160
Fixes: 5ee759cda51b ("l2tp: use standard API for warning log messages") Signed-off-by: Matthias Schiffer mschiffer@universe-factory.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/uapi/linux/l2tp.h | 1 + net/l2tp/l2tp_core.c | 41 ++++++++++++++++++++++------------------- net/l2tp/l2tp_core.h | 1 + net/l2tp/l2tp_netlink.c | 6 ++++++ 4 files changed, 30 insertions(+), 19 deletions(-)
--- a/include/uapi/linux/l2tp.h +++ b/include/uapi/linux/l2tp.h @@ -145,6 +145,7 @@ enum { L2TP_ATTR_RX_ERRORS, /* u64 */ L2TP_ATTR_STATS_PAD, L2TP_ATTR_RX_COOKIE_DISCARDS, /* u64 */ + L2TP_ATTR_RX_INVALID, /* u64 */ __L2TP_ATTR_STATS_MAX, };
--- a/net/l2tp/l2tp_core.c +++ b/net/l2tp/l2tp_core.c @@ -649,9 +649,9 @@ void l2tp_recv_common(struct l2tp_sessio /* Parse and check optional cookie */ if (session->peer_cookie_len > 0) { if (memcmp(ptr, &session->peer_cookie[0], session->peer_cookie_len)) { - pr_warn_ratelimited("%s: cookie mismatch (%u/%u). Discarding.\n", - tunnel->name, tunnel->tunnel_id, - session->session_id); + pr_debug_ratelimited("%s: cookie mismatch (%u/%u). Discarding.\n", + tunnel->name, tunnel->tunnel_id, + session->session_id); atomic_long_inc(&session->stats.rx_cookie_discards); goto discard; } @@ -702,8 +702,8 @@ void l2tp_recv_common(struct l2tp_sessio * If user has configured mandatory sequence numbers, discard. */ if (session->recv_seq) { - pr_warn_ratelimited("%s: recv data has no seq numbers when required. Discarding.\n", - session->name); + pr_debug_ratelimited("%s: recv data has no seq numbers when required. Discarding.\n", + session->name); atomic_long_inc(&session->stats.rx_seq_discards); goto discard; } @@ -718,8 +718,8 @@ void l2tp_recv_common(struct l2tp_sessio session->send_seq = 0; l2tp_session_set_header_len(session, tunnel->version); } else if (session->send_seq) { - pr_warn_ratelimited("%s: recv data has no seq numbers when required. Discarding.\n", - session->name); + pr_debug_ratelimited("%s: recv data has no seq numbers when required. Discarding.\n", + session->name); atomic_long_inc(&session->stats.rx_seq_discards); goto discard; } @@ -809,9 +809,9 @@ static int l2tp_udp_recv_core(struct l2t
/* Short packet? */ if (!pskb_may_pull(skb, L2TP_HDR_SIZE_MAX)) { - pr_warn_ratelimited("%s: recv short packet (len=%d)\n", - tunnel->name, skb->len); - goto error; + pr_debug_ratelimited("%s: recv short packet (len=%d)\n", + tunnel->name, skb->len); + goto invalid; }
/* Point to L2TP header */ @@ -824,9 +824,9 @@ static int l2tp_udp_recv_core(struct l2t /* Check protocol version */ version = hdrflags & L2TP_HDR_VER_MASK; if (version != tunnel->version) { - pr_warn_ratelimited("%s: recv protocol version mismatch: got %d expected %d\n", - tunnel->name, version, tunnel->version); - goto error; + pr_debug_ratelimited("%s: recv protocol version mismatch: got %d expected %d\n", + tunnel->name, version, tunnel->version); + goto invalid; }
/* Get length of L2TP packet */ @@ -834,7 +834,7 @@ static int l2tp_udp_recv_core(struct l2t
/* If type is control packet, it is handled by userspace. */ if (hdrflags & L2TP_HDRFLAG_T) - goto error; + goto pass;
/* Skip flags */ ptr += 2; @@ -863,21 +863,24 @@ static int l2tp_udp_recv_core(struct l2t l2tp_session_dec_refcount(session);
/* Not found? Pass to userspace to deal with */ - pr_warn_ratelimited("%s: no session found (%u/%u). Passing up.\n", - tunnel->name, tunnel_id, session_id); - goto error; + pr_debug_ratelimited("%s: no session found (%u/%u). Passing up.\n", + tunnel->name, tunnel_id, session_id); + goto pass; }
if (tunnel->version == L2TP_HDR_VER_3 && l2tp_v3_ensure_opt_in_linear(session, skb, &ptr, &optr)) - goto error; + goto invalid;
l2tp_recv_common(session, skb, ptr, optr, hdrflags, length); l2tp_session_dec_refcount(session);
return 0;
-error: +invalid: + atomic_long_inc(&tunnel->stats.rx_invalid); + +pass: /* Put UDP header back */ __skb_push(skb, sizeof(struct udphdr));
--- a/net/l2tp/l2tp_core.h +++ b/net/l2tp/l2tp_core.h @@ -39,6 +39,7 @@ struct l2tp_stats { atomic_long_t rx_oos_packets; atomic_long_t rx_errors; atomic_long_t rx_cookie_discards; + atomic_long_t rx_invalid; };
struct l2tp_tunnel; --- a/net/l2tp/l2tp_netlink.c +++ b/net/l2tp/l2tp_netlink.c @@ -428,6 +428,9 @@ static int l2tp_nl_tunnel_send(struct sk L2TP_ATTR_STATS_PAD) || nla_put_u64_64bit(skb, L2TP_ATTR_RX_ERRORS, atomic_long_read(&tunnel->stats.rx_errors), + L2TP_ATTR_STATS_PAD) || + nla_put_u64_64bit(skb, L2TP_ATTR_RX_INVALID, + atomic_long_read(&tunnel->stats.rx_invalid), L2TP_ATTR_STATS_PAD)) goto nla_put_failure; nla_nest_end(skb, nest); @@ -771,6 +774,9 @@ static int l2tp_nl_session_send(struct s L2TP_ATTR_STATS_PAD) || nla_put_u64_64bit(skb, L2TP_ATTR_RX_ERRORS, atomic_long_read(&session->stats.rx_errors), + L2TP_ATTR_STATS_PAD) || + nla_put_u64_64bit(skb, L2TP_ATTR_RX_INVALID, + atomic_long_read(&session->stats.rx_invalid), L2TP_ATTR_STATS_PAD)) goto nla_put_failure; nla_nest_end(skb, nest);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
commit 62d5247d239d4b48762192a251c647d7c997616a upstream.
On some systems the ACPI tables has wrong pin number and instead of having a relative one it provides an absolute one in the global GPIO number space.
Add ACPI_GPIO_QUIRK_ABSOLUTE_NUMBER quirk to cope with such cases.
Fixes: ba8c90c61847 ("gpio: pca953x: Override IRQ for one of the expanders on Galileo Gen 2") Depends-on: 0ea683931adb ("gpio: dwapb: Convert driver to using the GPIO-lib-based IRQ-chip") Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Acked-by: Mika Westerberg mika.westerberg@linux.intel.com Acked-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpio/gpiolib-acpi.c | 7 ++++++- include/linux/gpio/consumer.h | 2 ++ 2 files changed, 8 insertions(+), 1 deletion(-)
--- a/drivers/gpio/gpiolib-acpi.c +++ b/drivers/gpio/gpiolib-acpi.c @@ -677,6 +677,7 @@ static int acpi_populate_gpio_lookup(str if (!lookup->desc) { const struct acpi_resource_gpio *agpio = &ares->data.gpio; bool gpioint = agpio->connection_type == ACPI_RESOURCE_GPIO_TYPE_INT; + struct gpio_desc *desc; u16 pin_index;
if (lookup->info.quirks & ACPI_GPIO_QUIRK_ONLY_GPIOIO && gpioint) @@ -689,8 +690,12 @@ static int acpi_populate_gpio_lookup(str if (pin_index >= agpio->pin_table_length) return 1;
- lookup->desc = acpi_get_gpiod(agpio->resource_source.string_ptr, + if (lookup->info.quirks & ACPI_GPIO_QUIRK_ABSOLUTE_NUMBER) + desc = gpio_to_desc(agpio->pin_table[pin_index]); + else + desc = acpi_get_gpiod(agpio->resource_source.string_ptr, agpio->pin_table[pin_index]); + lookup->desc = desc; lookup->info.pin_config = agpio->pin_config; lookup->info.debounce = agpio->debounce_timeout; lookup->info.gpioint = gpioint; --- a/include/linux/gpio/consumer.h +++ b/include/linux/gpio/consumer.h @@ -674,6 +674,8 @@ struct acpi_gpio_mapping { * get GpioIo type explicitly, this quirk may be used. */ #define ACPI_GPIO_QUIRK_ONLY_GPIOIO BIT(1) +/* Use given pin as an absolute GPIO number in the system */ +#define ACPI_GPIO_QUIRK_ABSOLUTE_NUMBER BIT(2)
unsigned int quirks; };
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
commit 809390219fb9c2421239afe5c9eb862d73978ba0 upstream.
Currently only search by index is supported. However, in some cases we might need to pass the quirks to the acpi_dev_gpio_irq_get().
For this, split out acpi_dev_gpio_irq_get_by() and replace acpi_dev_gpio_irq_get() by calling above with NULL for name parameter.
Fixes: ba8c90c61847 ("gpio: pca953x: Override IRQ for one of the expanders on Galileo Gen 2") Depends-on: 0ea683931adb ("gpio: dwapb: Convert driver to using the GPIO-lib-based IRQ-chip") Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Acked-by: Mika Westerberg mika.westerberg@linux.intel.com Acked-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpio/gpiolib-acpi.c | 12 ++++++++---- include/linux/acpi.h | 10 ++++++++-- 2 files changed, 16 insertions(+), 6 deletions(-)
--- a/drivers/gpio/gpiolib-acpi.c +++ b/drivers/gpio/gpiolib-acpi.c @@ -945,8 +945,9 @@ struct gpio_desc *acpi_node_get_gpiod(st }
/** - * acpi_dev_gpio_irq_get() - Find GpioInt and translate it to Linux IRQ number + * acpi_dev_gpio_irq_get_by() - Find GpioInt and translate it to Linux IRQ number * @adev: pointer to a ACPI device to get IRQ from + * @name: optional name of GpioInt resource * @index: index of GpioInt resource (starting from %0) * * If the device has one or more GpioInt resources, this function can be @@ -956,9 +957,12 @@ struct gpio_desc *acpi_node_get_gpiod(st * The function is idempotent, though each time it runs it will configure GPIO * pin direction according to the flags in GpioInt resource. * + * The function takes optional @name parameter. If the resource has a property + * name, then only those will be taken into account. + * * Return: Linux IRQ number (> %0) on success, negative errno on failure. */ -int acpi_dev_gpio_irq_get(struct acpi_device *adev, int index) +int acpi_dev_gpio_irq_get_by(struct acpi_device *adev, const char *name, int index) { int idx, i; unsigned int irq_flags; @@ -968,7 +972,7 @@ int acpi_dev_gpio_irq_get(struct acpi_de struct acpi_gpio_info info; struct gpio_desc *desc;
- desc = acpi_get_gpiod_by_index(adev, NULL, i, &info); + desc = acpi_get_gpiod_by_index(adev, name, i, &info);
/* Ignore -EPROBE_DEFER, it only matters if idx matches */ if (IS_ERR(desc) && PTR_ERR(desc) != -EPROBE_DEFER) @@ -1013,7 +1017,7 @@ int acpi_dev_gpio_irq_get(struct acpi_de } return -ENOENT; } -EXPORT_SYMBOL_GPL(acpi_dev_gpio_irq_get); +EXPORT_SYMBOL_GPL(acpi_dev_gpio_irq_get_by);
static acpi_status acpi_gpio_adr_space_handler(u32 function, acpi_physical_address address, --- a/include/linux/acpi.h +++ b/include/linux/acpi.h @@ -1072,19 +1072,25 @@ void __acpi_handle_debug(struct _ddebug #if defined(CONFIG_ACPI) && defined(CONFIG_GPIOLIB) bool acpi_gpio_get_irq_resource(struct acpi_resource *ares, struct acpi_resource_gpio **agpio); -int acpi_dev_gpio_irq_get(struct acpi_device *adev, int index); +int acpi_dev_gpio_irq_get_by(struct acpi_device *adev, const char *name, int index); #else static inline bool acpi_gpio_get_irq_resource(struct acpi_resource *ares, struct acpi_resource_gpio **agpio) { return false; } -static inline int acpi_dev_gpio_irq_get(struct acpi_device *adev, int index) +static inline int acpi_dev_gpio_irq_get_by(struct acpi_device *adev, + const char *name, int index) { return -ENXIO; } #endif
+static inline int acpi_dev_gpio_irq_get(struct acpi_device *adev, int index) +{ + return acpi_dev_gpio_irq_get_by(adev, NULL, index); +} + /* Device properties */
#ifdef CONFIG_ACPI
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
commit b41ba2ec54a70908067034f139aa23d0dd2985ce upstream.
On STM32MP1, the GPIO banks are subnodes of pin-controller@50002000, see arch/arm/boot/dts/stm32mp151.dtsi. The driver for pin-controller@50002000 is in drivers/pinctrl/stm32/pinctrl-stm32.c and iterates over all of its DT subnodes when registering each GPIO bank gpiochip. Each gpiochip has:
- gpio_chip.parent = dev, where dev is the device node of the pin controller - gpio_chip.of_node = np, which is the OF node of the GPIO bank
Therefore, dev_fwnode(chip->parent) != of_fwnode_handle(chip.of_node), i.e. pin-controller@50002000 != pin-controller@50002000/gpio@5000*000.
The original code behaved correctly, as it extracted the "gpio-line-names" from of_fwnode_handle(chip.of_node) = pin-controller@50002000/gpio@5000*000.
To achieve the same behaviour, read property from the firmware node.
Fixes: 7cba1a4d5e162 ("gpiolib: generalize devprop_gpiochip_set_names() for device properties") Reported-by: Marek Vasut marex@denx.de Reported-by: Roman Guskov rguskov@dh-electronics.com Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Tested-by: Marek Vasut marex@denx.de Reviewed-by: Marek Vasut marex@denx.de Signed-off-by: Bartosz Golaszewski bgolaszewski@baylibre.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpio/gpiolib.c | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-)
--- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -365,22 +365,18 @@ static int gpiochip_set_desc_names(struc * * Looks for device property "gpio-line-names" and if it exists assigns * GPIO line names for the chip. The memory allocated for the assigned - * names belong to the underlying software node and should not be released + * names belong to the underlying firmware node and should not be released * by the caller. */ static int devprop_gpiochip_set_names(struct gpio_chip *chip) { struct gpio_device *gdev = chip->gpiodev; - struct device *dev = chip->parent; + struct fwnode_handle *fwnode = dev_fwnode(&gdev->dev); const char **names; int ret, i; int count;
- /* GPIO chip may not have a parent device whose properties we inspect. */ - if (!dev) - return 0; - - count = device_property_string_array_count(dev, "gpio-line-names"); + count = fwnode_property_string_array_count(fwnode, "gpio-line-names"); if (count < 0) return 0;
@@ -394,7 +390,7 @@ static int devprop_gpiochip_set_names(st if (!names) return -ENOMEM;
- ret = device_property_read_string_array(dev, "gpio-line-names", + ret = fwnode_property_read_string_array(fwnode, "gpio-line-names", names, count); if (ret < 0) { dev_warn(&gdev->dev, "failed to read GPIO line names\n");
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Oleksij Rempel o.rempel@pengutronix.de
commit e940e0895a82c6fbaa259f2615eb52b57ee91a7e upstream.
There are two ref count variables controlling the free()ing of a socket: - struct sock::sk_refcnt - which is changed by sock_hold()/sock_put() - struct sock::sk_wmem_alloc - which accounts the memory allocated by the skbs in the send path.
In case there are still TX skbs on the fly and the socket() is closed, the struct sock::sk_refcnt reaches 0. In the TX-path the CAN stack clones an "echo" skb, calls sock_hold() on the original socket and references it. This produces the following back trace:
| WARNING: CPU: 0 PID: 280 at lib/refcount.c:25 refcount_warn_saturate+0x114/0x134 | refcount_t: addition on 0; use-after-free. | Modules linked in: coda_vpu(E) v4l2_jpeg(E) videobuf2_vmalloc(E) imx_vdoa(E) | CPU: 0 PID: 280 Comm: test_can.sh Tainted: G E 5.11.0-04577-gf8ff6603c617 #203 | Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) | Backtrace: | [<80bafea4>] (dump_backtrace) from [<80bb0280>] (show_stack+0x20/0x24) r7:00000000 r6:600f0113 r5:00000000 r4:81441220 | [<80bb0260>] (show_stack) from [<80bb593c>] (dump_stack+0xa0/0xc8) | [<80bb589c>] (dump_stack) from [<8012b268>] (__warn+0xd4/0x114) r9:00000019 r8:80f4a8c2 r7:83e4150c r6:00000000 r5:00000009 r4:80528f90 | [<8012b194>] (__warn) from [<80bb09c4>] (warn_slowpath_fmt+0x88/0xc8) r9:83f26400 r8:80f4a8d1 r7:00000009 r6:80528f90 r5:00000019 r4:80f4a8c2 | [<80bb0940>] (warn_slowpath_fmt) from [<80528f90>] (refcount_warn_saturate+0x114/0x134) r8:00000000 r7:00000000 r6:82b44000 r5:834e5600 r4:83f4d540 | [<80528e7c>] (refcount_warn_saturate) from [<8079a4c8>] (__refcount_add.constprop.0+0x4c/0x50) | [<8079a47c>] (__refcount_add.constprop.0) from [<8079a57c>] (can_put_echo_skb+0xb0/0x13c) | [<8079a4cc>] (can_put_echo_skb) from [<8079ba98>] (flexcan_start_xmit+0x1c4/0x230) r9:00000010 r8:83f48610 r7:0fdc0000 r6:0c080000 r5:82b44000 r4:834e5600 | [<8079b8d4>] (flexcan_start_xmit) from [<80969078>] (netdev_start_xmit+0x44/0x70) r9:814c0ba0 r8:80c8790c r7:00000000 r6:834e5600 r5:82b44000 r4:82ab1f00 | [<80969034>] (netdev_start_xmit) from [<809725a4>] (dev_hard_start_xmit+0x19c/0x318) r9:814c0ba0 r8:00000000 r7:82ab1f00 r6:82b44000 r5:00000000 r4:834e5600 | [<80972408>] (dev_hard_start_xmit) from [<809c6584>] (sch_direct_xmit+0xcc/0x264) r10:834e5600 r9:00000000 r8:00000000 r7:82b44000 r6:82ab1f00 r5:834e5600 r4:83f27400 | [<809c64b8>] (sch_direct_xmit) from [<809c6c0c>] (__qdisc_run+0x4f0/0x534)
To fix this problem, only set skb ownership to sockets which have still a ref count > 0.
Fixes: 0ae89beb283a ("can: add destructor for self generated skbs") Cc: Oliver Hartkopp socketcan@hartkopp.net Cc: Andre Naujoks nautsch2@gmail.com Link: https://lore.kernel.org/r/20210226092456.27126-1-o.rempel@pengutronix.de Suggested-by: Eric Dumazet edumazet@google.com Signed-off-by: Oleksij Rempel o.rempel@pengutronix.de Reviewed-by: Oliver Hartkopp socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/can/skb.h | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/include/linux/can/skb.h +++ b/include/linux/can/skb.h @@ -49,8 +49,12 @@ static inline void can_skb_reserve(struc
static inline void can_skb_set_owner(struct sk_buff *skb, struct sock *sk) { - if (sk) { - sock_hold(sk); + /* If the socket has already been closed by user space, the + * refcount may already be 0 (and the socket will be freed + * after the last TX skb has been freed). So only increase + * socket refcount if the refcount is > 0. + */ + if (sk && refcount_inc_not_zero(&sk->sk_refcnt)) { skb->destructor = sock_efree; skb->sk = sk; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
commit eb441337c7147514ab45036cadf09c3a71e4ce31 upstream.
The commit 0ea683931adb ("gpio: dwapb: Convert driver to using the GPIO-lib-based IRQ-chip") indeliberately made a regression on how IRQ line from GPIO I²C expander is handled. I.e. it reveals that the quirk for Intel Galileo Gen 2 misses the part of setting IRQ type which previously was predefined by gpio-dwapb driver. Now, we have to reorganize the approach to call necessary parts, which can be done via ACPI_GPIO_QUIRK_ABSOLUTE_NUMBER quirk.
Without this fix and with above mentioned change the kernel hangs on the first IRQ event with:
gpio gpiochip3: Persistence not supported for GPIO 1 irq 32, desc: 62f8fb50, depth: 0, count: 0, unhandled: 0 ->handle_irq(): 41c7b0ab, handle_bad_irq+0x0/0x40 ->irq_data.chip(): e03f1e72, 0xc2539218 ->action(): 0ecc7e6f ->action->handler(): 8a3db21e, irq_default_primary_handler+0x0/0x10 IRQ_NOPROBE set unexpected IRQ trap at vector 20
Fixes: ba8c90c61847 ("gpio: pca953x: Override IRQ for one of the expanders on Galileo Gen 2") Depends-on: 0ea683931adb ("gpio: dwapb: Convert driver to using the GPIO-lib-based IRQ-chip") Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Acked-by: Mika Westerberg mika.westerberg@linux.intel.com Reviewed-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpio/gpio-pca953x.c | 78 ++++++++++++-------------------------------- 1 file changed, 23 insertions(+), 55 deletions(-)
--- a/drivers/gpio/gpio-pca953x.c +++ b/drivers/gpio/gpio-pca953x.c @@ -112,8 +112,29 @@ MODULE_DEVICE_TABLE(i2c, pca953x_id); #ifdef CONFIG_GPIO_PCA953X_IRQ
#include <linux/dmi.h> -#include <linux/gpio.h> -#include <linux/list.h> + +static const struct acpi_gpio_params pca953x_irq_gpios = { 0, 0, true }; + +static const struct acpi_gpio_mapping pca953x_acpi_irq_gpios[] = { + { "irq-gpios", &pca953x_irq_gpios, 1, ACPI_GPIO_QUIRK_ABSOLUTE_NUMBER }, + { } +}; + +static int pca953x_acpi_get_irq(struct device *dev) +{ + int ret; + + ret = devm_acpi_dev_add_driver_gpios(dev, pca953x_acpi_irq_gpios); + if (ret) + dev_warn(dev, "can't add GPIO ACPI mapping\n"); + + ret = acpi_dev_gpio_irq_get_by(ACPI_COMPANION(dev), "irq-gpios", 0); + if (ret < 0) + return ret; + + dev_info(dev, "ACPI interrupt quirk (IRQ %d)\n", ret); + return ret; +}
static const struct dmi_system_id pca953x_dmi_acpi_irq_info[] = { { @@ -132,59 +153,6 @@ static const struct dmi_system_id pca953 }, {} }; - -#ifdef CONFIG_ACPI -static int pca953x_acpi_get_pin(struct acpi_resource *ares, void *data) -{ - struct acpi_resource_gpio *agpio; - int *pin = data; - - if (acpi_gpio_get_irq_resource(ares, &agpio)) - *pin = agpio->pin_table[0]; - return 1; -} - -static int pca953x_acpi_find_pin(struct device *dev) -{ - struct acpi_device *adev = ACPI_COMPANION(dev); - int pin = -ENOENT, ret; - LIST_HEAD(r); - - ret = acpi_dev_get_resources(adev, &r, pca953x_acpi_get_pin, &pin); - acpi_dev_free_resource_list(&r); - if (ret < 0) - return ret; - - return pin; -} -#else -static inline int pca953x_acpi_find_pin(struct device *dev) { return -ENXIO; } -#endif - -static int pca953x_acpi_get_irq(struct device *dev) -{ - int pin, ret; - - pin = pca953x_acpi_find_pin(dev); - if (pin < 0) - return pin; - - dev_info(dev, "Applying ACPI interrupt quirk (GPIO %d)\n", pin); - - if (!gpio_is_valid(pin)) - return -EINVAL; - - ret = gpio_request(pin, "pca953x interrupt"); - if (ret) - return ret; - - ret = gpio_to_irq(pin); - - /* When pin is used as an IRQ, no need to keep it requested */ - gpio_free(pin); - - return ret; -} #endif
static const struct acpi_device_id pca953x_acpi_ids[] = {
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Joakim Zhang qiangqing.zhang@nxp.com
commit 449052cfebf624b670faa040245d3feed770d22f upstream.
Assert HALT bit to enter freeze mode, there is a premise that FRZ bit is asserted. This patch asserts FRZ bit in flexcan_chip_freeze, although the reset value is 1b'1. This is a prepare patch, later patch will invoke flexcan_chip_freeze() to enter freeze mode, which polling freeze mode acknowledge.
Fixes: b1aa1c7a2165b ("can: flexcan: fix transition from and to freeze mode in chip_{,un}freeze") Link: https://lore.kernel.org/r/20210218110037.16591-2-qiangqing.zhang@nxp.com Signed-off-by: Joakim Zhang qiangqing.zhang@nxp.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/flexcan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/can/flexcan.c +++ b/drivers/net/can/flexcan.c @@ -701,7 +701,7 @@ static int flexcan_chip_freeze(struct fl u32 reg;
reg = priv->read(®s->mcr); - reg |= FLEXCAN_MCR_HALT; + reg |= FLEXCAN_MCR_FRZ | FLEXCAN_MCR_HALT; priv->write(reg, ®s->mcr);
while (timeout-- && !(priv->read(®s->mcr) & FLEXCAN_MCR_FRZ_ACK))
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Joakim Zhang qiangqing.zhang@nxp.com
commit ec15e27cc8904605846a354bb1f808ea1432f853 upstream.
RX FIFO enable failed could happen when do system reboot stress test:
[ 0.303958] flexcan 5a8d0000.can: 5a8d0000.can supply xceiver not found, using dummy regulator [ 0.304281] flexcan 5a8d0000.can (unnamed net_device) (uninitialized): Could not enable RX FIFO, unsupported core [ 0.314640] flexcan 5a8d0000.can: registering netdev failed [ 0.320728] flexcan 5a8e0000.can: 5a8e0000.can supply xceiver not found, using dummy regulator [ 0.320991] flexcan 5a8e0000.can (unnamed net_device) (uninitialized): Could not enable RX FIFO, unsupported core [ 0.331360] flexcan 5a8e0000.can: registering netdev failed [ 0.337444] flexcan 5a8f0000.can: 5a8f0000.can supply xceiver not found, using dummy regulator [ 0.337716] flexcan 5a8f0000.can (unnamed net_device) (uninitialized): Could not enable RX FIFO, unsupported core [ 0.348117] flexcan 5a8f0000.can: registering netdev failed
RX FIFO should be enabled after the FRZ/HALT are valid. But the current code enable RX FIFO and FRZ/HALT at the same time.
Fixes: e955cead03117 ("CAN: Add Flexcan CAN controller driver") Link: https://lore.kernel.org/r/20210218110037.16591-3-qiangqing.zhang@nxp.com Signed-off-by: Joakim Zhang qiangqing.zhang@nxp.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/flexcan.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
--- a/drivers/net/can/flexcan.c +++ b/drivers/net/can/flexcan.c @@ -1864,10 +1864,14 @@ static int register_flexcandev(struct ne if (err) goto out_chip_disable;
- /* set freeze, halt and activate FIFO, restrict register access */ + /* set freeze, halt */ + err = flexcan_chip_freeze(priv); + if (err) + goto out_chip_disable; + + /* activate FIFO, restrict register access */ reg = priv->read(®s->mcr); - reg |= FLEXCAN_MCR_FRZ | FLEXCAN_MCR_HALT | - FLEXCAN_MCR_FEN | FLEXCAN_MCR_SUPV; + reg |= FLEXCAN_MCR_FEN | FLEXCAN_MCR_SUPV; priv->write(reg, ®s->mcr);
/* Currently we only support newer versions of this core
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Joakim Zhang qiangqing.zhang@nxp.com
commit c63820045e2000f05657467a08715c18c9f490d9 upstream.
Invoke flexcan_chip_freeze() to enter freeze mode, since need poll freeze mode acknowledge.
Fixes: e955cead03117 ("CAN: Add Flexcan CAN controller driver") Link: https://lore.kernel.org/r/20210218110037.16591-4-qiangqing.zhang@nxp.com Signed-off-by: Joakim Zhang qiangqing.zhang@nxp.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/flexcan.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
--- a/drivers/net/can/flexcan.c +++ b/drivers/net/can/flexcan.c @@ -1479,10 +1479,13 @@ static int flexcan_chip_start(struct net
flexcan_set_bittiming(dev);
+ /* set freeze, halt */ + err = flexcan_chip_freeze(priv); + if (err) + goto out_chip_disable; + /* MCR * - * enable freeze - * halt now * only supervisor access * enable warning int * enable individual RX masking @@ -1491,9 +1494,8 @@ static int flexcan_chip_start(struct net */ reg_mcr = priv->read(®s->mcr); reg_mcr &= ~FLEXCAN_MCR_MAXMB(0xff); - reg_mcr |= FLEXCAN_MCR_FRZ | FLEXCAN_MCR_HALT | FLEXCAN_MCR_SUPV | - FLEXCAN_MCR_WRN_EN | FLEXCAN_MCR_IRMQ | FLEXCAN_MCR_IDAM_C | - FLEXCAN_MCR_MAXMB(priv->tx_mb_idx); + reg_mcr |= FLEXCAN_MCR_SUPV | FLEXCAN_MCR_WRN_EN | FLEXCAN_MCR_IRMQ | + FLEXCAN_MCR_IDAM_C | FLEXCAN_MCR_MAXMB(priv->tx_mb_idx);
/* MCR *
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Torin Cooper-Bennun torin@maxiluxsystems.com
commit 2712625200ed69c642b9abc3a403830c4643364c upstream.
This patch prevents a potentially destructive race condition. The device is fully operational on the bus after entering Normal Mode, so zeroing the MRAM after entering this mode may lead to loss of information, e.g. new received messages.
This patch fixes the problem by first initializing the MRAM, then bringing the device into Normale Mode.
Fixes: 5443c226ba91 ("can: tcan4x5x: Add tcan4x5x driver to the kernel") Link: https://lore.kernel.org/r/20210226163440.313628-1-torin@maxiluxsystems.com Suggested-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Torin Cooper-Bennun torin@maxiluxsystems.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/m_can/tcan4x5x.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/net/can/m_can/tcan4x5x.c +++ b/drivers/net/can/m_can/tcan4x5x.c @@ -326,14 +326,14 @@ static int tcan4x5x_init(struct m_can_cl if (ret) return ret;
+ /* Zero out the MCAN buffers */ + m_can_init_ram(cdev); + ret = regmap_update_bits(tcan4x5x->regmap, TCAN4X5X_CONFIG, TCAN4X5X_MODE_SEL_MASK, TCAN4X5X_MODE_NORMAL); if (ret) return ret;
- /* Zero out the MCAN buffers */ - m_can_init_ram(cdev); - return ret; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Arjun Roy arjunroy@google.com
commit 2107d45f17bedd7dbf4178462da0ac223835a2a7 upstream.
getsockopt(TCP_ZEROCOPY_RECEIVE) has a bug where we read a user-provided "len" field of type signed int, and then compare the value to the result of an "offsetofend" operation, which is unsigned.
Negative values provided by the user will be promoted to large positive numbers; thus checking that len < offsetofend() will return false when the intention was that it return true.
Note that while len is originally checked for negative values earlier on in do_tcp_getsockopt(), subsequent calls to get_user() re-read the value from userspace which may have changed in the meantime.
Therefore, re-add the check for negative values after the call to get_user in the handler code for TCP_ZEROCOPY_RECEIVE.
Fixes: c8856c051454 ("tcp-zerocopy: Return inq along with tcp receive zerocopy.") Reported-by: kernel test robot lkp@intel.com Reported-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Arjun Roy arjunroy@google.com Link: https://lore.kernel.org/r/20210225232628.4033281-1-arjunroy.kdev@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/tcp.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -4088,7 +4088,8 @@ static int do_tcp_getsockopt(struct sock
if (get_user(len, optlen)) return -EFAULT; - if (len < offsetofend(struct tcp_zerocopy_receive, length)) + if (len < 0 || + len < offsetofend(struct tcp_zerocopy_receive, length)) return -EINVAL; if (len > sizeof(zc)) { len = sizeof(zc);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Eric Dumazet edumazet@google.com
commit 8811f4a9836e31c14ecdf79d9f3cb7c5d463265d upstream.
Qingyu Li reported a syzkaller bug where the repro changes RCV SEQ _after_ restoring data in the receive queue.
mprotect(0x4aa000, 12288, PROT_READ) = 0 mmap(0x1ffff000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x1ffff000 mmap(0x20000000, 16777216, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x20000000 mmap(0x21000000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x21000000 socket(AF_INET6, SOCK_STREAM, IPPROTO_IP) = 3 setsockopt(3, SOL_TCP, TCP_REPAIR, [1], 4) = 0 connect(3, {sa_family=AF_INET6, sin6_port=htons(0), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = 0 setsockopt(3, SOL_TCP, TCP_REPAIR_QUEUE, [1], 4) = 0 sendmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="0x0000000000000003\0\0", iov_len=20}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 20 setsockopt(3, SOL_TCP, TCP_REPAIR, [0], 4) = 0 setsockopt(3, SOL_TCP, TCP_QUEUE_SEQ, [128], 4) = 0 recvfrom(3, NULL, 20, 0, NULL, NULL) = -1 ECONNRESET (Connection reset by peer)
syslog shows: [ 111.205099] TCP recvmsg seq # bug 2: copied 80, seq 0, rcvnxt 80, fl 0 [ 111.207894] WARNING: CPU: 1 PID: 356 at net/ipv4/tcp.c:2343 tcp_recvmsg_locked+0x90e/0x29a0
This should not be allowed. TCP_QUEUE_SEQ should only be used when queues are empty.
This patch fixes this case, and the tx path as well.
Fixes: ee9952831cfd ("tcp: Initial repair mode") Signed-off-by: Eric Dumazet edumazet@google.com Cc: Pavel Emelyanov xemul@parallels.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=212005 Reported-by: Qingyu Li ieatmuttonchuan@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/tcp.c | 23 +++++++++++++++-------- 1 file changed, 15 insertions(+), 8 deletions(-)
--- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3431,16 +3431,23 @@ static int do_tcp_setsockopt(struct sock break;
case TCP_QUEUE_SEQ: - if (sk->sk_state != TCP_CLOSE) + if (sk->sk_state != TCP_CLOSE) { err = -EPERM; - else if (tp->repair_queue == TCP_SEND_QUEUE) - WRITE_ONCE(tp->write_seq, val); - else if (tp->repair_queue == TCP_RECV_QUEUE) { - WRITE_ONCE(tp->rcv_nxt, val); - WRITE_ONCE(tp->copied_seq, val); - } - else + } else if (tp->repair_queue == TCP_SEND_QUEUE) { + if (!tcp_rtx_queue_empty(sk)) + err = -EPERM; + else + WRITE_ONCE(tp->write_seq, val); + } else if (tp->repair_queue == TCP_RECV_QUEUE) { + if (tp->rcv_nxt != tp->copied_seq) { + err = -EPERM; + } else { + WRITE_ONCE(tp->rcv_nxt, val); + WRITE_ONCE(tp->copied_seq, val); + } + } else { err = -EINVAL; + } break;
case TCP_REPAIR_OPTIONS:
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Florian Westphal fw@strlen.de
commit 03a3ca37e4c6478e3a84f04c8429dd5889e107fd upstream.
Under extremely rare conditions TCP early demux will retrieve the wrong socket.
1. local machine establishes a connection to a remote server, S, on port p.
This gives: laddr:lport -> S:p ... both in tcp and conntrack.
2. local machine establishes a connection to host H, on port p2. 2a. TCP stack choses same laddr:lport, so we have laddr:lport -> H:p2 from TCP point of view. 2b). There is a destination NAT rewrite in place, translating H:p2 to S:p. This results in following conntrack entries:
I) laddr:lport -> S:p (origin) S:p -> laddr:lport (reply) II) laddr:lport -> H:p2 (origin) S:p -> laddr:lport2 (reply)
NAT engine has rewritten laddr:lport to laddr:lport2 to map the reply packet to the correct origin.
When server sends SYN/ACK to laddr:lport2, the PREROUTING hook will undo-the SNAT transformation, rewriting IP header to S:p -> laddr:lport
This causes TCP early demux to associate the skb with the TCP socket of the first connection.
The INPUT hook will then reverse the DNAT transformation, rewriting the IP header to H:p2 -> laddr:lport.
Because packet ends up with the wrong socket, the new connection never completes: originator stays in SYN_SENT and conntrack entry remains in SYN_RECV until timeout, and responder retransmits SYN/ACK until it gives up.
To resolve this, orphan the skb after the input rewrite: Because the source IP address changed, the socket must be incorrect. We can't move the DNAT undo to prerouting due to backwards compatibility, doing so will make iptables/nftables rules to no longer match the way they did.
After orphan, the packet will be handed to the next protocol layer (tcp, udp, ...) and that will repeat the socket lookup just like as if early demux was disabled.
Fixes: 41063e9dd1195 ("ipv4: Early TCP socket demux.") Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1427 Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/netfilter/nf_nat_proto.c | 25 +++++++++++++++++++++---- 1 file changed, 21 insertions(+), 4 deletions(-)
--- a/net/netfilter/nf_nat_proto.c +++ b/net/netfilter/nf_nat_proto.c @@ -646,8 +646,8 @@ nf_nat_ipv4_fn(void *priv, struct sk_buf }
static unsigned int -nf_nat_ipv4_in(void *priv, struct sk_buff *skb, - const struct nf_hook_state *state) +nf_nat_ipv4_pre_routing(void *priv, struct sk_buff *skb, + const struct nf_hook_state *state) { unsigned int ret; __be32 daddr = ip_hdr(skb)->daddr; @@ -660,6 +660,23 @@ nf_nat_ipv4_in(void *priv, struct sk_buf }
static unsigned int +nf_nat_ipv4_local_in(void *priv, struct sk_buff *skb, + const struct nf_hook_state *state) +{ + __be32 saddr = ip_hdr(skb)->saddr; + struct sock *sk = skb->sk; + unsigned int ret; + + ret = nf_nat_ipv4_fn(priv, skb, state); + + if (ret == NF_ACCEPT && sk && saddr != ip_hdr(skb)->saddr && + !inet_sk_transparent(sk)) + skb_orphan(skb); /* TCP edemux obtained wrong socket */ + + return ret; +} + +static unsigned int nf_nat_ipv4_out(void *priv, struct sk_buff *skb, const struct nf_hook_state *state) { @@ -736,7 +753,7 @@ nf_nat_ipv4_local_fn(void *priv, struct static const struct nf_hook_ops nf_nat_ipv4_ops[] = { /* Before packet filtering, change destination */ { - .hook = nf_nat_ipv4_in, + .hook = nf_nat_ipv4_pre_routing, .pf = NFPROTO_IPV4, .hooknum = NF_INET_PRE_ROUTING, .priority = NF_IP_PRI_NAT_DST, @@ -757,7 +774,7 @@ static const struct nf_hook_ops nf_nat_i }, /* After packet filtering, change source */ { - .hook = nf_nat_ipv4_fn, + .hook = nf_nat_ipv4_local_in, .pf = NFPROTO_IPV4, .hooknum = NF_INET_LOCAL_IN, .priority = NF_IP_PRI_NAT_SRC,
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Vasily Averin vvs@virtuozzo.com
commit 8e24edddad152b998b37a7f583175137ed2e04a5 upstream.
nested target/match_revfn() calls work with xt[NFPROTO_UNSPEC] lists without taking xt[NFPROTO_UNSPEC].mutex. This can race with module unload and cause host to crash:
general protection fault: 0000 [#1] Modules linked in: ... [last unloaded: xt_cluster] CPU: 0 PID: 542455 Comm: iptables RIP: 0010:[<ffffffff8ffbd518>] [<ffffffff8ffbd518>] strcmp+0x18/0x40 RDX: 0000000000000003 RSI: ffff9a5a5d9abe10 RDI: dead000000000111 R13: ffff9a5a5d9abe10 R14: ffff9a5a5d9abd8c R15: dead000000000100 (VvS: %R15 -- &xt_match, %RDI -- &xt_match.name, xt_cluster unregister match in xt[NFPROTO_UNSPEC].match list) Call Trace: [<ffffffff902ccf44>] match_revfn+0x54/0xc0 [<ffffffff902ccf9f>] match_revfn+0xaf/0xc0 [<ffffffff902cd01e>] xt_find_revision+0x6e/0xf0 [<ffffffffc05a5be0>] do_ipt_get_ctl+0x100/0x420 [ip_tables] [<ffffffff902cc6bf>] nf_getsockopt+0x4f/0x70 [<ffffffff902dd99e>] ip_getsockopt+0xde/0x100 [<ffffffff903039b5>] raw_getsockopt+0x25/0x50 [<ffffffff9026c5da>] sock_common_getsockopt+0x1a/0x20 [<ffffffff9026b89d>] SyS_getsockopt+0x7d/0xf0 [<ffffffff903cbf92>] system_call_fastpath+0x25/0x2a
Fixes: 656caff20e1 ("netfilter 04/09: x_tables: fix match/target revision lookup") Signed-off-by: Vasily Averin vvs@virtuozzo.com Reviewed-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/netfilter/x_tables.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -330,6 +330,7 @@ static int match_revfn(u8 af, const char const struct xt_match *m; int have_rev = 0;
+ mutex_lock(&xt[af].mutex); list_for_each_entry(m, &xt[af].match, list) { if (strcmp(m->name, name) == 0) { if (m->revision > *bestp) @@ -338,6 +339,7 @@ static int match_revfn(u8 af, const char have_rev = 1; } } + mutex_unlock(&xt[af].mutex);
if (af != NFPROTO_UNSPEC && !have_rev) return match_revfn(NFPROTO_UNSPEC, name, revision, bestp); @@ -350,6 +352,7 @@ static int target_revfn(u8 af, const cha const struct xt_target *t; int have_rev = 0;
+ mutex_lock(&xt[af].mutex); list_for_each_entry(t, &xt[af].target, list) { if (strcmp(t->name, name) == 0) { if (t->revision > *bestp) @@ -358,6 +361,7 @@ static int target_revfn(u8 af, const cha have_rev = 1; } } + mutex_unlock(&xt[af].mutex);
if (af != NFPROTO_UNSPEC && !have_rev) return target_revfn(NFPROTO_UNSPEC, name, revision, bestp); @@ -371,12 +375,10 @@ int xt_find_revision(u8 af, const char * { int have_rev, best = -1;
- mutex_lock(&xt[af].mutex); if (target == 1) have_rev = target_revfn(af, name, revision, &best); else have_rev = match_revfn(af, name, revision, &best); - mutex_unlock(&xt[af].mutex);
/* Nothing at all? Return 0 to try loading module. */ if (best == -1) {
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Jason A. Donenfeld Jason@zx2c4.com
commit 4372339efc06bc2a796f4cc9d0a7a929dfda4967 upstream.
There were a few remaining tunnel drivers that didn't receive the prior conversion to icmp{,v6}_ndo_send. Knowing now that this could lead to memory corrution (see ee576c47db60 ("net: icmp: pass zeroed opts from icmp{,v6}_ndo_send before sending") for details), there's even more imperative to have these all converted. So this commit goes through the remaining cases that I could find and does a boring translation to the ndo variety.
The Fixes: line below is the merge that originally added icmp{,v6}_ ndo_send and converted the first batch of icmp{,v6}_send users. The rationale then for the change applies equally to this patch. It's just that these drivers were left out of the initial conversion because these network devices are hiding in net/ rather than in drivers/net/.
Cc: Florian Westphal fw@strlen.de Cc: Willem de Bruijn willemb@google.com Cc: David S. Miller davem@davemloft.net Cc: Hideaki YOSHIFUJI yoshfuji@linux-ipv6.org Cc: David Ahern dsahern@kernel.org Cc: Jakub Kicinski kuba@kernel.org Cc: Steffen Klassert steffen.klassert@secunet.com Fixes: 803381f9f117 ("Merge branch 'icmp-account-for-NAT-when-sending-icmps-from-ndo-layer'") Signed-off-by: Jason A. Donenfeld Jason@zx2c4.com Acked-by: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/ip_tunnel.c | 5 ++--- net/ipv4/ip_vti.c | 6 +++--- net/ipv6/ip6_gre.c | 16 ++++++++-------- net/ipv6/ip6_tunnel.c | 10 +++++----- net/ipv6/ip6_vti.c | 6 +++--- net/ipv6/sit.c | 2 +- 6 files changed, 22 insertions(+), 23 deletions(-)
--- a/net/ipv4/ip_tunnel.c +++ b/net/ipv4/ip_tunnel.c @@ -502,8 +502,7 @@ static int tnl_update_pmtu(struct net_de if (!skb_is_gso(skb) && (inner_iph->frag_off & htons(IP_DF)) && mtu < pkt_size) { - memset(IPCB(skb), 0, sizeof(*IPCB(skb))); - icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu)); + icmp_ndo_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu)); return -E2BIG; } } @@ -527,7 +526,7 @@ static int tnl_update_pmtu(struct net_de
if (!skb_is_gso(skb) && mtu >= IPV6_MIN_MTU && mtu < pkt_size) { - icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); + icmpv6_ndo_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); return -E2BIG; } } --- a/net/ipv4/ip_vti.c +++ b/net/ipv4/ip_vti.c @@ -238,13 +238,13 @@ static netdev_tx_t vti_xmit(struct sk_bu if (skb->len > mtu) { skb_dst_update_pmtu_no_confirm(skb, mtu); if (skb->protocol == htons(ETH_P_IP)) { - icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, - htonl(mtu)); + icmp_ndo_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, + htonl(mtu)); } else { if (mtu < IPV6_MIN_MTU) mtu = IPV6_MIN_MTU;
- icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); + icmpv6_ndo_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); }
dst_release(dst); --- a/net/ipv6/ip6_gre.c +++ b/net/ipv6/ip6_gre.c @@ -678,8 +678,8 @@ static int prepare_ip6gre_xmit_ipv6(stru
tel = (struct ipv6_tlv_tnl_enc_lim *)&skb_network_header(skb)[offset]; if (tel->encap_limit == 0) { - icmpv6_send(skb, ICMPV6_PARAMPROB, - ICMPV6_HDR_FIELD, offset + 2); + icmpv6_ndo_send(skb, ICMPV6_PARAMPROB, + ICMPV6_HDR_FIELD, offset + 2); return -1; } *encap_limit = tel->encap_limit - 1; @@ -805,8 +805,8 @@ static inline int ip6gre_xmit_ipv4(struc if (err != 0) { /* XXX: send ICMP error even if DF is not set. */ if (err == -EMSGSIZE) - icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, - htonl(mtu)); + icmp_ndo_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, + htonl(mtu)); return -1; }
@@ -837,7 +837,7 @@ static inline int ip6gre_xmit_ipv6(struc &mtu, skb->protocol); if (err != 0) { if (err == -EMSGSIZE) - icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); + icmpv6_ndo_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); return -1; }
@@ -1063,10 +1063,10 @@ static netdev_tx_t ip6erspan_tunnel_xmit /* XXX: send ICMP error even if DF is not set. */ if (err == -EMSGSIZE) { if (skb->protocol == htons(ETH_P_IP)) - icmp_send(skb, ICMP_DEST_UNREACH, - ICMP_FRAG_NEEDED, htonl(mtu)); + icmp_ndo_send(skb, ICMP_DEST_UNREACH, + ICMP_FRAG_NEEDED, htonl(mtu)); else - icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); + icmpv6_ndo_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); }
goto tx_err; --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -1332,8 +1332,8 @@ ipxip6_tnl_xmit(struct sk_buff *skb, str
tel = (void *)&skb_network_header(skb)[offset]; if (tel->encap_limit == 0) { - icmpv6_send(skb, ICMPV6_PARAMPROB, - ICMPV6_HDR_FIELD, offset + 2); + icmpv6_ndo_send(skb, ICMPV6_PARAMPROB, + ICMPV6_HDR_FIELD, offset + 2); return -1; } encap_limit = tel->encap_limit - 1; @@ -1385,11 +1385,11 @@ ipxip6_tnl_xmit(struct sk_buff *skb, str if (err == -EMSGSIZE) switch (protocol) { case IPPROTO_IPIP: - icmp_send(skb, ICMP_DEST_UNREACH, - ICMP_FRAG_NEEDED, htonl(mtu)); + icmp_ndo_send(skb, ICMP_DEST_UNREACH, + ICMP_FRAG_NEEDED, htonl(mtu)); break; case IPPROTO_IPV6: - icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); + icmpv6_ndo_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); break; default: break; --- a/net/ipv6/ip6_vti.c +++ b/net/ipv6/ip6_vti.c @@ -521,10 +521,10 @@ vti6_xmit(struct sk_buff *skb, struct ne if (mtu < IPV6_MIN_MTU) mtu = IPV6_MIN_MTU;
- icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); + icmpv6_ndo_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); } else { - icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, - htonl(mtu)); + icmp_ndo_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, + htonl(mtu)); }
err = -EMSGSIZE; --- a/net/ipv6/sit.c +++ b/net/ipv6/sit.c @@ -987,7 +987,7 @@ static netdev_tx_t ipip6_tunnel_xmit(str skb_dst_update_pmtu_no_confirm(skb, mtu);
if (skb->len > mtu && !skb_is_gso(skb)) { - icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); + icmpv6_ndo_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu); ip_rt_put(rt); goto tx_error; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Guangbin Huang huangguangbin2@huawei.com
commit d9032dba5a2b2bbf0fdce67c8795300ec9923b43 upstream.
If phy uses generic driver and autoneg is on, enter command "ethtool -s eth0 speed 50" will not change phy speed actually, but command "ethtool eth0" shows speed is 50Mb/s because phydev->speed has been set to 50 and no update later.
And duplex setting has same problem too.
However, if autoneg is on, phy only changes speed and duplex according to phydev->advertising, but not phydev->speed and phydev->duplex. So in this case, phydev->speed and phydev->duplex don't need to be set in function phy_ethtool_ksettings_set() if autoneg is on.
Fixes: 51e2a3846eab ("PHY: Avoid unnecessary aneg restarts") Signed-off-by: Guangbin Huang huangguangbin2@huawei.com Signed-off-by: Huazhong Tan tanhuazhong@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/phy/phy.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/net/phy/phy.c +++ b/drivers/net/phy/phy.c @@ -276,14 +276,16 @@ int phy_ethtool_ksettings_set(struct phy
phydev->autoneg = autoneg;
- phydev->speed = speed; + if (autoneg == AUTONEG_DISABLE) { + phydev->speed = speed; + phydev->duplex = duplex; + }
linkmode_copy(phydev->advertising, advertising);
linkmode_mod_bit(ETHTOOL_LINK_MODE_Autoneg_BIT, phydev->advertising, autoneg == AUTONEG_ENABLE);
- phydev->duplex = duplex; phydev->master_slave_set = cmd->base.master_slave_cfg; phydev->mdix_ctrl = cmd->base.eth_tp_mdix_ctrl;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Ilya Leoshkevich iii@linux.ibm.com
commit 42a382a466a967dc053c73b969cd2ac2fec502cf upstream.
test_snprintf_btf fails on s390, because NULL points to a readable struct lowcore there. Fix by using the last page instead.
Error message example:
printing fffffffffffff000 should generate error, got (361)
Fixes: 076a95f5aff2 ("selftests/bpf: Add bpf_snprintf_btf helper tests") Signed-off-by: Ilya Leoshkevich iii@linux.ibm.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Heiko Carstens hca@linux.ibm.com Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20210227051726.121256-1-iii@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/bpf/progs/netif_receive_skb.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-)
--- a/tools/testing/selftests/bpf/progs/netif_receive_skb.c +++ b/tools/testing/selftests/bpf/progs/netif_receive_skb.c @@ -16,6 +16,13 @@ bool skip = false; #define STRSIZE 2048 #define EXPECTED_STRSIZE 256
+#if defined(bpf_target_s390) +/* NULL points to a readable struct lowcore on s390, so take the last page */ +#define BADPTR ((void *)0xFFFFFFFFFFFFF000ULL) +#else +#define BADPTR 0 +#endif + #ifndef ARRAY_SIZE #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) #endif @@ -113,11 +120,11 @@ int BPF_PROG(trace_netif_receive_skb, st }
/* Check invalid ptr value */ - p.ptr = 0; + p.ptr = BADPTR; __ret = bpf_snprintf_btf(str, STRSIZE, &p, sizeof(p), 0); if (__ret >= 0) { - bpf_printk("printing NULL should generate error, got (%d)", - __ret); + bpf_printk("printing %llx should generate error, got (%d)", + (unsigned long long)BADPTR, __ret); ret = -ERANGE; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Hangbin Liu liuhangbin@gmail.com
commit 557c223b643a35effec9654958d8edc62fd2603a upstream.
In bpf geneve tunnel test we set geneve option on tx side. On rx side we only call bpf_skb_get_tunnel_opt(). Since commit 9c2e14b48119 ("ip_tunnels: Set tunnel option flag when tunnel metadata is present") geneve_rx() will not add TUNNEL_GENEVE_OPT flag if there is no geneve option, which cause bpf_skb_get_tunnel_opt() return ENOENT and _geneve_get_tunnel() in test_tunnel_kern.c drop the packet.
As it should be valid that bpf_skb_get_tunnel_opt() return error when there is not tunnel option, there is no need to drop the packet and break all geneve rx traffic. Just set opt_class to 0 in this test and keep returning TC_ACT_OK.
Fixes: 933a741e3b82 ("selftests/bpf: bpf tunnel test.") Signed-off-by: Hangbin Liu liuhangbin@gmail.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: William Tu u9012063@gmail.com Link: https://lore.kernel.org/bpf/20210224081403.1425474-1-liuhangbin@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/bpf/progs/test_tunnel_kern.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
--- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c @@ -446,10 +446,8 @@ int _geneve_get_tunnel(struct __sk_buff }
ret = bpf_skb_get_tunnel_opt(skb, &gopt, sizeof(gopt)); - if (ret < 0) { - ERROR(ret); - return TC_ACT_SHOT; - } + if (ret < 0) + gopt.opt_class = 0;
bpf_trace_printk(fmt, sizeof(fmt), key.tunnel_id, key.remote_ipv4, gopt.opt_class);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Yauheni Kaliuta yauheni.kaliuta@redhat.com
commit 6185266c5a853bb0f2a459e3ff594546f277609b upstream.
The verifier test labelled "valid read map access into a read-only array 2" calls the bpf_csum_diff() helper and checks its return value. However, architecture implementations of csum_partial() (which is what the helper uses) differ in whether they fold the return value to 16 bit or not. For example, x86 version has ...
if (unlikely(odd)) { result = from32to16(result); result = ((result >> 8) & 0xff) | ((result & 0xff) << 8); }
... while generic lib/checksum.c does:
result = from32to16(result); if (odd) result = ((result >> 8) & 0xff) | ((result & 0xff) << 8);
This makes the helper return different values on different architectures, breaking the test on non-x86. To fix this, add an additional instruction to always mask the return value to 16 bits, and update the expected return value accordingly.
Fixes: fb2abb73e575 ("bpf, selftest: test {rd, wr}only flags and direct value access") Signed-off-by: Yauheni Kaliuta yauheni.kaliuta@redhat.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20210228103017.320240-1-yauheni.kaliuta@redhat.c... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/bpf/verifier/array_access.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/tools/testing/selftests/bpf/verifier/array_access.c +++ b/tools/testing/selftests/bpf/verifier/array_access.c @@ -250,12 +250,13 @@ BPF_MOV64_IMM(BPF_REG_5, 0), BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_csum_diff), + BPF_ALU64_IMM(BPF_AND, BPF_REG_0, 0xffff), BPF_EXIT_INSN(), }, .prog_type = BPF_PROG_TYPE_SCHED_CLS, .fixup_map_array_ro = { 3 }, .result = ACCEPT, - .retval = -29, + .retval = 65507, }, { "invalid write map access into a read-only array 1",
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Maciej Fijalkowski maciej.fijalkowski@intel.com
commit 6bc6699881012b5bd5d49fa861a69a37fc01b49c upstream.
We mmap the umem region, but we never munmap it. Add the missing call at the end of the cleanup.
Fixes: 3945b37a975d ("samples/bpf: use hugepages in xdpsock app") Signed-off-by: Maciej Fijalkowski maciej.fijalkowski@intel.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Björn Töpel bjorn.topel@intel.com Link: https://lore.kernel.org/bpf/20210303185636.18070-3-maciej.fijalkowski@intel.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- samples/bpf/xdpsock_user.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/samples/bpf/xdpsock_user.c +++ b/samples/bpf/xdpsock_user.c @@ -1699,5 +1699,7 @@ int main(int argc, char **argv)
xdpsock_cleanup();
+ munmap(bufs, NUM_FRAMES * opt_xsk_frame_size); + return 0; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Maciej Fijalkowski maciej.fijalkowski@intel.com
commit 2b2aedabc44e9660f90ccf7ba1ca2706d75f411f upstream.
xsk_lookup_bpf_maps, based on prog_fd, looks whether current prog has a reference to XSKMAP. BPF prog can include insns that work on various BPF maps and this is covered by iterating through map_ids.
The bpf_map_info that is passed to bpf_obj_get_info_by_fd for filling needs to be cleared at each iteration, so that it doesn't contain any outdated fields and that is currently missing in the function of interest.
To fix that, zero-init map_info via memset before each bpf_obj_get_info_by_fd call.
Also, since the area of this code is touched, in general strcmp is considered harmful, so let's convert it to strncmp and provide the size of the array name for current map_info.
While at it, do s/continue/break/ once we have found the xsks_map to terminate the search.
Fixes: 5750902a6e9b ("libbpf: proper XSKMAP cleanup") Signed-off-by: Maciej Fijalkowski maciej.fijalkowski@intel.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Björn Töpel bjorn.topel@intel.com Link: https://lore.kernel.org/bpf/20210303185636.18070-4-maciej.fijalkowski@intel.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/lib/bpf/xsk.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/tools/lib/bpf/xsk.c +++ b/tools/lib/bpf/xsk.c @@ -535,15 +535,16 @@ static int xsk_lookup_bpf_maps(struct xs if (fd < 0) continue;
+ memset(&map_info, 0, map_len); err = bpf_obj_get_info_by_fd(fd, &map_info, &map_len); if (err) { close(fd); continue; }
- if (!strcmp(map_info.name, "xsks_map")) { + if (!strncmp(map_info.name, "xsks_map", sizeof(map_info.name))) { ctx->xsks_map_fd = fd; - continue; + break; }
close(fd);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Michal Suchanek msuchanek@suse.de
commit 6881b07fdd24850def1f03761c66042b983ff86e upstream.
GCC 7.5 reports: ../drivers/net/ethernet/ibm/ibmvnic.c: In function 'ibmvnic_reset_init': ../drivers/net/ethernet/ibm/ibmvnic.c:5373:51: warning: 'old_num_tx_queues' may be used uninitialized in this function [-Wmaybe-uninitialized] ../drivers/net/ethernet/ibm/ibmvnic.c:5373:6: warning: 'old_num_rx_queues' may be used uninitialized in this function [-Wmaybe-uninitialized]
The variable is initialized only if(reset) and used only if(reset && something) so this is a false positive. However, there is no reason to not initialize the variables unconditionally avoiding the warning.
Fixes: 635e442f4a48 ("ibmvnic: merge ibmvnic_reset_init and ibmvnic_init") Signed-off-by: Michal Suchanek msuchanek@suse.de Reviewed-by: Sukadev Bhattiprolu sukadev@linux.ibm.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/ibm/ibmvnic.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)
--- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -5283,16 +5283,14 @@ static int ibmvnic_reset_init(struct ibm { struct device *dev = &adapter->vdev->dev; unsigned long timeout = msecs_to_jiffies(20000); - u64 old_num_rx_queues, old_num_tx_queues; + u64 old_num_rx_queues = adapter->req_rx_queues; + u64 old_num_tx_queues = adapter->req_tx_queues; int rc;
adapter->from_passive_init = false;
- if (reset) { - old_num_rx_queues = adapter->req_rx_queues; - old_num_tx_queues = adapter->req_tx_queues; + if (reset) reinit_completion(&adapter->init_done); - }
adapter->init_done_rc = 0; rc = ibmvnic_send_crq_init(adapter);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Jiri Wiesner jwiesner@suse.com
commit 67eb211487f0c993d9f402d1c196ef159fd6a3b5 upstream.
The last change to ibmvnic_set_mac(), 8fc3672a8ad3, meant to prevent users from setting an invalid MAC address on an ibmvnic interface that has not been brought up yet. The change also prevented the requested MAC address from being stored by the adapter object for an ibmvnic interface when the state of the ibmvnic interface is VNIC_PROBED - that is after probing has finished but before the ibmvnic interface is brought up. The MAC address stored by the adapter object is used and sent to the hypervisor for checking when an ibmvnic interface is brought up.
The ibmvnic driver ignoring the requested MAC address when in VNIC_PROBED state caused LACP bonds (bonds in 802.3ad mode) with more than one slave to malfunction. The bonding code must be able to change the MAC address of its slaves before they are brought up during enslaving. The inability of kernels with 8fc3672a8ad3 to set the MAC addresses of bonding slaves is observable in the output of "ip address show". The MAC addresses of the slaves are the same as the MAC address of the bond on a working system whereas the slaves retain their original MAC addresses on a system with a malfunctioning LACP bond.
Fixes: 8fc3672a8ad3 ("ibmvnic: fix ibmvnic_set_mac") Signed-off-by: Jiri Wiesner jwiesner@suse.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/ibm/ibmvnic.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -1923,10 +1923,9 @@ static int ibmvnic_set_mac(struct net_de if (!is_valid_ether_addr(addr->sa_data)) return -EADDRNOTAVAIL;
- if (adapter->state != VNIC_PROBED) { - ether_addr_copy(adapter->mac_addr, addr->sa_data); + ether_addr_copy(adapter->mac_addr, addr->sa_data); + if (adapter->state != VNIC_PROBED) rc = __ibmvnic_set_mac(netdev, addr->sa_data); - }
return rc; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Junlin Yang yangjunlin@yulong.com
commit 69cdb7947adb816fc9325b4ec02a6dddd5070b82 upstream.
ibmvnic_remove locks multiple spinlocks while disabling interrupts: spin_lock_irqsave(&adapter->state_lock, flags); spin_lock_irqsave(&adapter->rwi_lock, flags);
As reported by coccinelle, the second _irqsave() overwrites the value saved in 'flags' by the first _irqsave(), therefore when the second _irqrestore() comes,the value in 'flags' is not valid,the value saved by the first _irqsave() has been lost. This likely leads to IRQs remaining disabled. So remove the second _irqsave(): spin_lock_irqsave(&adapter->state_lock, flags); spin_lock(&adapter->rwi_lock);
Generated by: ./scripts/coccinelle/locks/flags.cocci ./drivers/net/ethernet/ibm/ibmvnic.c:5413:1-18: ERROR: nested lock+irqsave that reuses flags from line 5404.
Fixes: 4a41c421f367 ("ibmvnic: serialize access to work queue on remove") Signed-off-by: Junlin Yang yangjunlin@yulong.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/ibm/ibmvnic.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -5474,9 +5474,9 @@ static int ibmvnic_remove(struct vio_dev * after setting state, so __ibmvnic_reset() which is called * from the flush_work() below, can make progress. */ - spin_lock_irqsave(&adapter->rwi_lock, flags); + spin_lock(&adapter->rwi_lock); adapter->state = VNIC_REMOVING; - spin_unlock_irqrestore(&adapter->rwi_lock, flags); + spin_unlock(&adapter->rwi_lock);
spin_unlock_irqrestore(&adapter->state_lock, flags);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Lorenzo Bianconi lorenzo@kernel.org
commit d0bd52c591a1070c54dc428e926660eb4f981099 upstream.
Commit b102f0c522cf6 ("mt76: fix array overflow on receiving too many fragments for a packet") fixes a possible OOB access but it introduces a memory leak since the pending frame is not released to page_frag_cache if the frag array of skb_shared_info is full. Commit 93a1d4791c10 ("mt76: dma: fix a possible memory leak in mt76_add_fragment()") fixes the issue but does not free the truncated skb that is forwarded to mac80211 layer. Fix the leftover issue discarding even truncated skbs.
Fixes: 93a1d4791c10 ("mt76: dma: fix a possible memory leak in mt76_add_fragment()") Signed-off-by: Lorenzo Bianconi lorenzo@kernel.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Link: https://lore.kernel.org/r/a03166fcc8214644333c68674a781836e0f57576.161269721... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/mediatek/mt76/dma.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
--- a/drivers/net/wireless/mediatek/mt76/dma.c +++ b/drivers/net/wireless/mediatek/mt76/dma.c @@ -511,13 +511,13 @@ mt76_add_fragment(struct mt76_dev *dev, { struct sk_buff *skb = q->rx_head; struct skb_shared_info *shinfo = skb_shinfo(skb); + int nr_frags = shinfo->nr_frags;
- if (shinfo->nr_frags < ARRAY_SIZE(shinfo->frags)) { + if (nr_frags < ARRAY_SIZE(shinfo->frags)) { struct page *page = virt_to_head_page(data); int offset = data - page_address(page) + q->buf_offset;
- skb_add_rx_frag(skb, shinfo->nr_frags, page, offset, len, - q->buf_size); + skb_add_rx_frag(skb, nr_frags, page, offset, len, q->buf_size); } else { skb_free_frag(data); } @@ -526,7 +526,10 @@ mt76_add_fragment(struct mt76_dev *dev, return;
q->rx_head = NULL; - dev->drv->rx_skb(dev, q - dev->q_rx, skb); + if (nr_frags < ARRAY_SIZE(shinfo->frags)) + dev->drv->rx_skb(dev, q - dev->q_rx, skb); + else + dev_kfree_skb(skb); }
static int
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Johan Hovold johan@kernel.org
commit cf25ef6b631c6fc6c0435fc91eba8734cca20511 upstream.
Make sure to hold the gpio_lock when removing the gpio device from the gpio_devices list (when dropping the last reference) to avoid corrupting the list when there are concurrent accesses.
Fixes: ff2b13592299 ("gpio: make the gpiochip a real device") Cc: stable@vger.kernel.org # 4.6 Reviewed-by: Saravana Kannan saravanak@google.com Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Bartosz Golaszewski bgolaszewski@baylibre.com [ johan: adjust context to 5.11 ] Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpio/gpiolib.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -469,8 +469,12 @@ EXPORT_SYMBOL_GPL(gpiochip_line_is_valid static void gpiodevice_release(struct device *dev) { struct gpio_device *gdev = dev_get_drvdata(dev); + unsigned long flags;
+ spin_lock_irqsave(&gpio_lock, flags); list_del(&gdev->list); + spin_unlock_irqrestore(&gpio_lock, flags); + ida_free(&gpio_ida, gdev->id); kfree_const(gdev->label); kfree(gdev->descs);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Christian Brauner christian.brauner@ubuntu.com
commit ee2e3f50629f17b0752b55b2566c15ce8dafb557 upstream.
Creating a series of detached mounts, attaching them to the filesystem, and unmounting them can be used to trigger an integer overflow in ns->mounts causing the kernel to block any new mounts in count_mounts() and returning ENOSPC because it falsely assumes that the maximum number of mounts in the mount namespace has been reached, i.e. it thinks it can't fit the new mounts into the mount namespace anymore.
Depending on the number of mounts in your system, this can be reproduced on any kernel that supportes open_tree() and move_mount() by compiling and running the following program:
/* SPDX-License-Identifier: LGPL-2.1+ */
#define _GNU_SOURCE #include <errno.h> #include <fcntl.h> #include <getopt.h> #include <limits.h> #include <stdbool.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <sys/mount.h> #include <sys/stat.h> #include <sys/syscall.h> #include <sys/types.h> #include <unistd.h>
/* open_tree() */ #ifndef OPEN_TREE_CLONE #define OPEN_TREE_CLONE 1 #endif
#ifndef OPEN_TREE_CLOEXEC #define OPEN_TREE_CLOEXEC O_CLOEXEC #endif
#ifndef __NR_open_tree #if defined __alpha__ #define __NR_open_tree 538 #elif defined _MIPS_SIM #if _MIPS_SIM == _MIPS_SIM_ABI32 /* o32 */ #define __NR_open_tree 4428 #endif #if _MIPS_SIM == _MIPS_SIM_NABI32 /* n32 */ #define __NR_open_tree 6428 #endif #if _MIPS_SIM == _MIPS_SIM_ABI64 /* n64 */ #define __NR_open_tree 5428 #endif #elif defined __ia64__ #define __NR_open_tree (428 + 1024) #else #define __NR_open_tree 428 #endif #endif
/* move_mount() */ #ifndef MOVE_MOUNT_F_EMPTY_PATH #define MOVE_MOUNT_F_EMPTY_PATH 0x00000004 /* Empty from path permitted */ #endif
#ifndef __NR_move_mount #if defined __alpha__ #define __NR_move_mount 539 #elif defined _MIPS_SIM #if _MIPS_SIM == _MIPS_SIM_ABI32 /* o32 */ #define __NR_move_mount 4429 #endif #if _MIPS_SIM == _MIPS_SIM_NABI32 /* n32 */ #define __NR_move_mount 6429 #endif #if _MIPS_SIM == _MIPS_SIM_ABI64 /* n64 */ #define __NR_move_mount 5429 #endif #elif defined __ia64__ #define __NR_move_mount (428 + 1024) #else #define __NR_move_mount 429 #endif #endif
static inline int sys_open_tree(int dfd, const char *filename, unsigned int flags) { return syscall(__NR_open_tree, dfd, filename, flags); }
static inline int sys_move_mount(int from_dfd, const char *from_pathname, int to_dfd, const char *to_pathname, unsigned int flags) { return syscall(__NR_move_mount, from_dfd, from_pathname, to_dfd, to_pathname, flags); }
static bool is_shared_mountpoint(const char *path) { bool shared = false; FILE *f = NULL; char *line = NULL; int i; size_t len = 0;
f = fopen("/proc/self/mountinfo", "re"); if (!f) return 0;
while (getline(&line, &len, f) > 0) { char *slider1, *slider2;
for (slider1 = line, i = 0; slider1 && i < 4; i++) slider1 = strchr(slider1 + 1, ' ');
if (!slider1) continue;
slider2 = strchr(slider1 + 1, ' '); if (!slider2) continue;
*slider2 = '\0'; if (strcmp(slider1 + 1, path) == 0) { /* This is the path. Is it shared? */ slider1 = strchr(slider2 + 1, ' '); if (slider1 && strstr(slider1, "shared:")) { shared = true; break; } } } fclose(f); free(line);
return shared; }
static void usage(void) { const char *text = "mount-new [--recursive] <base-dir>\n"; fprintf(stderr, "%s", text); _exit(EXIT_SUCCESS); }
#define exit_usage(format, ...) \ ({ \ fprintf(stderr, format "\n", ##__VA_ARGS__); \ usage(); \ })
#define exit_log(format, ...) \ ({ \ fprintf(stderr, format "\n", ##__VA_ARGS__); \ exit(EXIT_FAILURE); \ })
static const struct option longopts[] = { {"help", no_argument, 0, 'a'}, { NULL, no_argument, 0, 0 }, };
int main(int argc, char *argv[]) { int exit_code = EXIT_SUCCESS, index = 0; int dfd, fd_tree, new_argc, ret; char *base_dir; char *const *new_argv; char target[PATH_MAX];
while ((ret = getopt_long_only(argc, argv, "", longopts, &index)) != -1) { switch (ret) { case 'a': /* fallthrough */ default: usage(); } }
new_argv = &argv[optind]; new_argc = argc - optind; if (new_argc < 1) exit_usage("Missing base directory\n"); base_dir = new_argv[0];
if (*base_dir != '/') exit_log("Please specify an absolute path");
/* Ensure that target is a shared mountpoint. */ if (!is_shared_mountpoint(base_dir)) exit_log("Please ensure that "%s" is a shared mountpoint", base_dir);
dfd = open(base_dir, O_RDONLY | O_DIRECTORY | O_CLOEXEC); if (dfd < 0) exit_log("%m - Failed to open base directory "%s"", base_dir);
ret = mkdirat(dfd, "detached-move-mount", 0755); if (ret < 0) exit_log("%m - Failed to create required temporary directories");
ret = snprintf(target, sizeof(target), "%s/detached-move-mount", base_dir); if (ret < 0 || (size_t)ret >= sizeof(target)) exit_log("%m - Failed to assemble target path");
/* * Having a mount table with 10000 mounts is already quite excessive * and shoult account even for weird test systems. */ for (size_t i = 0; i < 10000; i++) { fd_tree = sys_open_tree(dfd, "detached-move-mount", OPEN_TREE_CLONE | OPEN_TREE_CLOEXEC | AT_EMPTY_PATH); if (fd_tree < 0) { fprintf(stderr, "%m - Failed to open %d(detached-move-mount)", dfd); exit_code = EXIT_FAILURE; break; }
ret = sys_move_mount(fd_tree, "", dfd, "detached-move-mount", MOVE_MOUNT_F_EMPTY_PATH); if (ret < 0) { if (errno == ENOSPC) fprintf(stderr, "%m - Buggy mount counting"); else fprintf(stderr, "%m - Failed to attach mount to %d(detached-move-mount)", dfd); exit_code = EXIT_FAILURE; break; } close(fd_tree);
ret = umount2(target, MNT_DETACH); if (ret < 0) { fprintf(stderr, "%m - Failed to unmount %s", target); exit_code = EXIT_FAILURE; break; } }
(void)unlinkat(dfd, "detached-move-mount", AT_REMOVEDIR); close(dfd);
exit(exit_code); }
and wait for the kernel to refuse any new mounts by returning ENOSPC. How many iterations are needed depends on the number of mounts in your system. Assuming you have something like 50 mounts on a standard system it should be almost instantaneous.
The root cause of this is that detached mounts aren't handled correctly when source and target mount are identical and reside on a shared mount causing a broken mount tree where the detached source itself is propagated which propagation prevents for regular bind-mounts and new mounts. This ultimately leads to a miscalculation of the number of mounts in the mount namespace.
Detached mounts created via open_tree(fd, path, OPEN_TREE_CLONE) are essentially like an unattached new mount, or an unattached bind-mount. They can then later on be attached to the filesystem via move_mount() which calls into attach_recursive_mount(). Part of attaching it to the filesystem is making sure that mounts get correctly propagated in case the destination mountpoint is MS_SHARED, i.e. is a shared mountpoint. This is done by calling into propagate_mnt() which walks the list of peers calling propagate_one() on each mount in this list making sure it receives the propagation event. The propagate_one() functions thereby skips both new mounts and bind mounts to not propagate them "into themselves". Both are identified by checking whether the mount is already attached to any mount namespace in mnt->mnt_ns. The is what the IS_MNT_NEW() helper is responsible for.
However, detached mounts have an anonymous mount namespace attached to them stashed in mnt->mnt_ns which means that IS_MNT_NEW() doesn't realize they need to be skipped causing the mount to propagate "into itself" breaking the mount table and causing a disconnect between the number of mounts recorded as being beneath or reachable from the target mountpoint and the number of mounts actually recorded/counted in ns->mounts ultimately causing an overflow which in turn prevents any new mounts via the ENOSPC issue.
So teach propagation to handle detached mounts by making it aware of them. I've been tracking this issue down for the last couple of days and then verifying that the fix is correct by unmounting everything in my current mount table leaving only /proc and /sys mounted and running the reproducer above overnight verifying the number of mounts counted in ns->mounts. With this fix the counts are correct and the ENOSPC issue can't be reproduced.
This change will only have an effect on mounts created with the new mount API since detached mounts cannot be created with the old mount API so regressions are extremely unlikely.
Link: https://lore.kernel.org/r/20210306101010.243666-1-christian.brauner@ubuntu.c... Fixes: 2db154b3ea8e ("vfs: syscall: Add move_mount(2) to move mounts around") Cc: David Howells dhowells@redhat.com Cc: Al Viro viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org Cc: stable@vger.kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Christian Brauner christian.brauner@ubuntu.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/pnode.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/pnode.h +++ b/fs/pnode.h @@ -12,7 +12,7 @@
#define IS_MNT_SHARED(m) ((m)->mnt.mnt_flags & MNT_SHARED) #define IS_MNT_SLAVE(m) ((m)->mnt_master) -#define IS_MNT_NEW(m) (!(m)->mnt_ns) +#define IS_MNT_NEW(m) (!(m)->mnt_ns || is_anon_ns((m)->mnt_ns)) #define CLEAR_MNT_SHARED(m) ((m)->mnt.mnt_flags &= ~MNT_SHARED) #define IS_MNT_UNBINDABLE(m) ((m)->mnt.mnt_flags & MNT_UNBINDABLE) #define IS_MNT_MARKED(m) ((m)->mnt.mnt_flags & MNT_MARKED)
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Aurelien Aptel aaptel@suse.com
commit a249cc8bc2e2fed680047d326eb9a50756724198 upstream.
With multichannel, operations like the queries from "ls -lR" can cause all credits to be used and errors to be returned since max_credits was not being set correctly on the secondary channels and thus the client was requesting 0 credits incorrectly in some cases (which can lead to not having enough credits to perform any operation on that channel).
Signed-off-by: Aurelien Aptel aaptel@suse.com CC: stable@vger.kernel.org # v5.8+ Reviewed-by: Shyam Prasad N sprasad@microsoft.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/cifs/connect.c | 10 +++++----- fs/cifs/sess.c | 1 + 2 files changed, 6 insertions(+), 5 deletions(-)
--- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -1405,6 +1405,11 @@ smbd_connected: tcp_ses->min_offload = ctx->min_offload; tcp_ses->tcpStatus = CifsNeedNegotiate;
+ if ((ctx->max_credits < 20) || (ctx->max_credits > 60000)) + tcp_ses->max_credits = SMB2_MAX_CREDITS_AVAILABLE; + else + tcp_ses->max_credits = ctx->max_credits; + tcp_ses->nr_targets = 1; tcp_ses->ignore_signature = ctx->ignore_signature; /* thread spawned, put it on the list */ @@ -2806,11 +2811,6 @@ static int mount_get_conns(struct smb3_f
*nserver = server;
- if ((ctx->max_credits < 20) || (ctx->max_credits > 60000)) - server->max_credits = SMB2_MAX_CREDITS_AVAILABLE; - else - server->max_credits = ctx->max_credits; - /* get a reference to a SMB session */ ses = cifs_get_smb_ses(server, ctx); if (IS_ERR(ses)) { --- a/fs/cifs/sess.c +++ b/fs/cifs/sess.c @@ -230,6 +230,7 @@ cifs_ses_add_channel(struct cifs_sb_info ctx.noautotune = ses->server->noautotune; ctx.sockopt_tcp_nodelay = ses->server->tcp_nodelay; ctx.echo_interval = ses->server->echo_interval / HZ; + ctx.max_credits = ses->server->max_credits;
/* * This will be used for encoding/decoding user/domain/pw
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Paulo Alcantara pc@cjr.nz
commit 14302ee3301b3a77b331cc14efb95bf7184c73cc upstream.
In cifs_statfs(), if server->ops->queryfs is not NULL, then we should use its return value rather than always returning 0. Instead, use rc variable as it is properly set to 0 in case there is no server->ops->queryfs.
Signed-off-by: Paulo Alcantara (SUSE) pc@cjr.nz Reviewed-by: Aurelien Aptel aaptel@suse.com Reviewed-by: Ronnie Sahlberg lsahlber@redhat.com CC: stable@vger.kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/cifs/cifsfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/cifs/cifsfs.c +++ b/fs/cifs/cifsfs.c @@ -290,7 +290,7 @@ cifs_statfs(struct dentry *dentry, struc rc = server->ops->queryfs(xid, tcon, cifs_sb, buf);
free_xid(xid); - return 0; + return rc; }
static long cifs_fallocate(struct file *file, int mode, loff_t off, loff_t len)
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Linus Torvalds torvalds@linux-foundation.org
commit 9b1ea29bc0d7b94d420f96a0f4121403efc3dd85 upstream.
This reverts commit 8ff60eb052eeba95cfb3efe16b08c9199f8121cf.
The kernel test robot reports a huge performance regression due to the commit, and the reason seems fairly straightforward: when there is contention on the page list (which is what causes acquire_slab() to fail), we do _not_ want to just loop and try again, because that will transfer the contention to the 'n->list_lock' spinlock we hold, and just make things even worse.
This is admittedly likely a problem only on big machines - the kernel test robot report comes from a 96-thread dual socket Intel Xeon Gold 6252 setup, but the regression there really is quite noticeable:
-47.9% regression of stress-ng.rawpkt.ops_per_sec
and the commit that was marked as being fixed (7ced37197196: "slub: Acquire_slab() avoid loop") actually did the loop exit early very intentionally (the hint being that "avoid loop" part of that commit message), exactly to avoid this issue.
The correct thing to do may be to pick some kind of reasonable middle ground: instead of breaking out of the loop on the very first sign of contention, or trying over and over and over again, the right thing may be to re-try _once_, and then give up on the second failure (or pick your favorite value for "once"..).
Reported-by: kernel test robot oliver.sang@intel.com Link: https://lore.kernel.org/lkml/20210301080404.GF12822@xsang-OptiPlex-9020/ Cc: Jann Horn jannh@google.com Cc: David Rientjes rientjes@google.com Cc: Joonsoo Kim iamjoonsoo.kim@lge.com Acked-by: Christoph Lameter cl@linux.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/slub.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/slub.c +++ b/mm/slub.c @@ -1973,7 +1973,7 @@ static void *get_partial_node(struct kme
t = acquire_slab(s, n, page, object == NULL, &objects); if (!t) - continue; /* cmpxchg raced */ + break;
available += objects; if (!object) {
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Jakub Kicinski kuba@kernel.org
commit dbbe7c962c3a8163bf724dbc3c9fdfc9b16d3117 upstream.
Leave it to Greg.
Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/networking/netdev-FAQ.rst | 72 ++------------------------ Documentation/process/stable-kernel-rules.rst | 6 -- Documentation/process/submitting-patches.rst | 5 - 3 files changed, 6 insertions(+), 77 deletions(-)
--- a/Documentation/networking/netdev-FAQ.rst +++ b/Documentation/networking/netdev-FAQ.rst @@ -142,73 +142,13 @@ Please send incremental versions on top the patches the way they would look like if your latest patch series was to be merged.
-How can I tell what patches are queued up for backporting to the various stable releases? ------------------------------------------------------------------------------------------ -Normally Greg Kroah-Hartman collects stable commits himself, but for -networking, Dave collects up patches he deems critical for the -networking subsystem, and then hands them off to Greg. - -There is a patchworks queue that you can see here: - - https://patchwork.kernel.org/bundle/netdev/stable/?state=* - -It contains the patches which Dave has selected, but not yet handed off -to Greg. If Greg already has the patch, then it will be here: - - https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git - -A quick way to find whether the patch is in this stable-queue is to -simply clone the repo, and then git grep the mainline commit ID, e.g. -:: - - stable-queue$ git grep -l 284041ef21fdf2e - releases/3.0.84/ipv6-fix-possible-crashes-in-ip6_cork_release.patch - releases/3.4.51/ipv6-fix-possible-crashes-in-ip6_cork_release.patch - releases/3.9.8/ipv6-fix-possible-crashes-in-ip6_cork_release.patch - stable/stable-queue$ - -I see a network patch and I think it should be backported to stable. Should I request it via stable@vger.kernel.org like the references in the kernel's Documentation/process/stable-kernel-rules.rst file say? ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- -No, not for networking. Check the stable queues as per above first -to see if it is already queued. If not, then send a mail to netdev, -listing the upstream commit ID and why you think it should be a stable -candidate. - -Before you jump to go do the above, do note that the normal stable rules -in :ref:`Documentation/process/stable-kernel-rules.rst <stable_kernel_rules>` -still apply. So you need to explicitly indicate why it is a critical -fix and exactly what users are impacted. In addition, you need to -convince yourself that you *really* think it has been overlooked, -vs. having been considered and rejected. - -Generally speaking, the longer it has had a chance to "soak" in -mainline, the better the odds that it is an OK candidate for stable. So -scrambling to request a commit be added the day after it appears should -be avoided. - -I have created a network patch and I think it should be backported to stable. Should I add a Cc: stable@vger.kernel.org like the references in the kernel's Documentation/ directory say? ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ -No. See above answer. In short, if you think it really belongs in -stable, then ensure you write a decent commit log that describes who -gets impacted by the bug fix and how it manifests itself, and when the -bug was introduced. If you do that properly, then the commit will get -handled appropriately and most likely get put in the patchworks stable -queue if it really warrants it. - -If you think there is some valid information relating to it being in -stable that does *not* belong in the commit log, then use the three dash -marker line as described in -:ref:`Documentation/process/submitting-patches.rst <the_canonical_patch_format>` -to temporarily embed that information into the patch that you send. - -Are all networking bug fixes backported to all stable releases? +Are there special rules regarding stable submissions on netdev? --------------------------------------------------------------- -Due to capacity, Dave could only take care of the backports for the -last two stable releases. For earlier stable releases, each stable -branch maintainer is supposed to take care of them. If you find any -patch is missing from an earlier stable branch, please notify -stable@vger.kernel.org with either a commit ID or a formal patch -backported, and CC Dave and other relevant networking developers. +While it used to be the case that netdev submissions were not supposed +to carry explicit ``CC: stable@vger.kernel.org`` tags that is no longer +the case today. Please follow the standard stable rules in +:ref:`Documentation/process/stable-kernel-rules.rst <stable_kernel_rules>`, +and make sure you include appropriate Fixes tags!
Is the comment style convention different for the networking content? --------------------------------------------------------------------- --- a/Documentation/process/stable-kernel-rules.rst +++ b/Documentation/process/stable-kernel-rules.rst @@ -35,12 +35,6 @@ Rules on what kind of patches are accept Procedure for submitting patches to the -stable tree ----------------------------------------------------
- - If the patch covers files in net/ or drivers/net please follow netdev stable - submission guidelines as described in - :ref:`Documentation/networking/netdev-FAQ.rst <netdev-FAQ>` - after first checking the stable networking queue at - https://patchwork.kernel.org/bundle/netdev/stable/?state=* - to ensure the requested patch is not already queued up. - Security patches should not be handled (solely) by the -stable review process but should follow the procedures in :ref:`Documentation/admin-guide/security-bugs.rst <securitybugs>`. --- a/Documentation/process/submitting-patches.rst +++ b/Documentation/process/submitting-patches.rst @@ -250,11 +250,6 @@ should also read :ref:`Documentation/process/stable-kernel-rules.rst <stable_kernel_rules>` in addition to this file.
-Note, however, that some subsystem maintainers want to come to their own -conclusions on which patches should go to the stable trees. The networking -maintainer, in particular, would rather not see individual developers -adding lines like the above to their patches. - If changes affect userland-kernel interfaces, please send the MAN-PAGES maintainer (as listed in the MAINTAINERS file) a man-pages patch, or at least a notification of the change, so that some information makes its way
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: DENG Qingfang dqfext@gmail.com
commit 9eb8bc593a5eed167dac2029abef343854c5ba75 upstream.
Commit 86dd9868b878 has several issues, but was accepted too soon before anyone could take a look.
- Double free. dsa_slave_xmit() will free the skb if the xmit function returns NULL, but the skb is already freed by eth_skb_pad(). Use __skb_put_padto() to avoid that. - Unnecessary allocation. It has been done by DSA core since commit a3b0b6479700. - A u16 pointer points to skb data. It should be __be16 for network byte order. - Typo in comments. "numer" -> "number".
Fixes: 86dd9868b878 ("net: dsa: tag_rtl4_a: Support also egress tags") Signed-off-by: DENG Qingfang dqfext@gmail.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Reviewed-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/dsa/tag_rtl4_a.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-)
--- a/net/dsa/tag_rtl4_a.c +++ b/net/dsa/tag_rtl4_a.c @@ -35,14 +35,12 @@ static struct sk_buff *rtl4a_tag_xmit(st struct net_device *dev) { struct dsa_port *dp = dsa_slave_to_port(dev); + __be16 *p; u8 *tag; - u16 *p; u16 out;
/* Pad out to at least 60 bytes */ - if (unlikely(eth_skb_pad(skb))) - return NULL; - if (skb_cow_head(skb, RTL4_A_HDR_LEN) < 0) + if (unlikely(__skb_put_padto(skb, ETH_ZLEN, false))) return NULL;
netdev_dbg(dev, "add realtek tag to package to port %d\n", @@ -53,13 +51,13 @@ static struct sk_buff *rtl4a_tag_xmit(st tag = skb->data + 2 * ETH_ALEN;
/* Set Ethertype */ - p = (u16 *)tag; + p = (__be16 *)tag; *p = htons(RTL4_A_ETHERTYPE);
out = (RTL4_A_PROTOCOL_RTL8366RB << 12) | (2 << 8); - /* The lower bits is the port numer */ + /* The lower bits is the port number */ out |= (u8)dp->index; - p = (u16 *)(tag + 2); + p = (__be16 *)(tag + 2); *p = htons(out);
return skb;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Sergey Shtylyov s.shtylyov@omprussia.ru
commit 8c91bc3d44dfef8284af384877fbe61117e8b7d1 upstream.
According to the SH7710, SH7712, SH7713 Group User's Manual: Hardware, Rev. 3.00, the TRSCER register actually has only bit 7 valid (and named differently), with all the other bits reserved. Apparently, this was not the case with some early revisions of the manual as we have the other bits declared (and set) in the original driver. Follow the suit and add the explicit sh_eth_cpu_data::trscer_err_mask initializer for SH771x...
Fixes: 86a74ff21a7a ("net: sh_eth: add support for Renesas SuperH Ethernet") Signed-off-by: Sergey Shtylyov s.shtylyov@omprussia.ru Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/renesas/sh_eth.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/net/ethernet/renesas/sh_eth.c +++ b/drivers/net/ethernet/renesas/sh_eth.c @@ -1089,6 +1089,9 @@ static struct sh_eth_cpu_data sh771x_dat EESIPR_CEEFIP | EESIPR_CELFIP | EESIPR_RRFIP | EESIPR_RTLFIP | EESIPR_RTSFIP | EESIPR_PREIP | EESIPR_CERFIP, + + .trscer_err_mask = DESC_I_RINT8, + .tsu = 1, .dual_port = 1, };
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Vladimir Oltean vladimir.oltean@nxp.com
commit c646d10dda2dcde82c6ce5a474522621ab2b8b19 upstream.
After the blamed patch, all RX traffic gets hashed to CPU 0 because the hashing indirection table set up in:
enetc_pf_probe -> enetc_alloc_si_resources -> enetc_configure_si -> enetc_setup_default_rss_table
is overwritten later in:
enetc_pf_probe -> enetc_init_port_rss_memory
which zero-initializes the entire port RSS table in order to avoid ECC errors.
The trouble really is that enetc_init_port_rss_memory really neads enetc_alloc_si_resources to be called, because it depends upon enetc_alloc_cbdr and enetc_setup_cbdr. But that whole enetc_configure_si thing could have been better thought out, it has nothing to do in a function called "alloc_si_resources", especially since its counterpart, "free_si_resources", does nothing to unwind the configuration of the SI.
The point is, we need to pull out enetc_configure_si out of enetc_alloc_resources, and move it after enetc_init_port_rss_memory. This allows us to set up the default RSS indirection table after initializing the memory.
Fixes: 07bf34a50e32 ("net: enetc: initialize the RFS and RSS memories") Cc: Jesse Brandeburg jesse.brandeburg@intel.com Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/freescale/enetc/enetc.c | 11 +++-------- drivers/net/ethernet/freescale/enetc/enetc.h | 1 + drivers/net/ethernet/freescale/enetc/enetc_pf.c | 7 +++++++ drivers/net/ethernet/freescale/enetc/enetc_vf.c | 7 +++++++ 4 files changed, 18 insertions(+), 8 deletions(-)
--- a/drivers/net/ethernet/freescale/enetc/enetc.c +++ b/drivers/net/ethernet/freescale/enetc/enetc.c @@ -1058,13 +1058,12 @@ static int enetc_setup_default_rss_table return 0; }
-static int enetc_configure_si(struct enetc_ndev_priv *priv) +int enetc_configure_si(struct enetc_ndev_priv *priv) { struct enetc_si *si = priv->si; struct enetc_hw *hw = &si->hw; int err;
- enetc_setup_cbdr(hw, &si->cbd_ring); /* set SI cache attributes */ enetc_wr(hw, ENETC_SICAR0, ENETC_SICAR_RD_COHERENT | ENETC_SICAR_WR_COHERENT); @@ -1112,6 +1111,8 @@ int enetc_alloc_si_resources(struct enet if (err) return err;
+ enetc_setup_cbdr(&si->hw, &si->cbd_ring); + priv->cls_rules = kcalloc(si->num_fs_entries, sizeof(*priv->cls_rules), GFP_KERNEL); if (!priv->cls_rules) { @@ -1119,14 +1120,8 @@ int enetc_alloc_si_resources(struct enet goto err_alloc_cls; }
- err = enetc_configure_si(priv); - if (err) - goto err_config_si; - return 0;
-err_config_si: - kfree(priv->cls_rules); err_alloc_cls: enetc_clear_cbdr(&si->hw); enetc_free_cbdr(priv->dev, &si->cbd_ring); --- a/drivers/net/ethernet/freescale/enetc/enetc.h +++ b/drivers/net/ethernet/freescale/enetc/enetc.h @@ -292,6 +292,7 @@ void enetc_get_si_caps(struct enetc_si * void enetc_init_si_rings_params(struct enetc_ndev_priv *priv); int enetc_alloc_si_resources(struct enetc_ndev_priv *priv); void enetc_free_si_resources(struct enetc_ndev_priv *priv); +int enetc_configure_si(struct enetc_ndev_priv *priv);
int enetc_open(struct net_device *ndev); int enetc_close(struct net_device *ndev); --- a/drivers/net/ethernet/freescale/enetc/enetc_pf.c +++ b/drivers/net/ethernet/freescale/enetc/enetc_pf.c @@ -1108,6 +1108,12 @@ static int enetc_pf_probe(struct pci_dev goto err_init_port_rss; }
+ err = enetc_configure_si(priv); + if (err) { + dev_err(&pdev->dev, "Failed to configure SI\n"); + goto err_config_si; + } + err = enetc_alloc_msix(priv); if (err) { dev_err(&pdev->dev, "MSIX alloc failed\n"); @@ -1136,6 +1142,7 @@ err_phylink_create: enetc_mdiobus_destroy(pf); err_mdiobus_create: enetc_free_msix(priv); +err_config_si: err_init_port_rss: err_init_port_rfs: err_alloc_msix: --- a/drivers/net/ethernet/freescale/enetc/enetc_vf.c +++ b/drivers/net/ethernet/freescale/enetc/enetc_vf.c @@ -171,6 +171,12 @@ static int enetc_vf_probe(struct pci_dev goto err_alloc_si_res; }
+ err = enetc_configure_si(priv); + if (err) { + dev_err(&pdev->dev, "Failed to configure SI\n"); + goto err_config_si; + } + err = enetc_alloc_msix(priv); if (err) { dev_err(&pdev->dev, "MSIX alloc failed\n"); @@ -187,6 +193,7 @@ static int enetc_vf_probe(struct pci_dev
err_reg_netdev: enetc_free_msix(priv); +err_config_si: err_alloc_msix: enetc_free_si_resources(priv); err_alloc_si_res:
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Vladimir Oltean vladimir.oltean@nxp.com
commit 3222b5b613db558e9a494bbf53f3c984d90f71ea upstream.
Michael reports that since linux-next-20210211, the AER messages for ECC errors have started reappearing, and this time they can be reliably reproduced with the first ping on one of his LS1028A boards.
$ ping 1[ 33.258069] pcieport 0000:00:1f.0: AER: Multiple Corrected error received: 0000:00:00.0 72.16.0.1 PING [ 33.267050] pcieport 0000:00:1f.0: AER: can't find device of ID0000 172.16.0.1 (172.16.0.1): 56 data bytes 64 bytes from 172.16.0.1: seq=0 ttl=64 time=17.124 ms 64 bytes from 172.16.0.1: seq=1 ttl=64 time=0.273 ms
$ devmem 0x1f8010e10 32 0xC0000006
It isn't clear why this is necessary, but it seems that for the errors to go away, we must clear the entire RFS and RSS memory, not just for the ports in use.
Sadly the code is structured in such a way that we can't have unified logic for the used and unused ports. For the minimal initialization of an unused port, we need just to enable and ioremap the PF memory space, and a control buffer descriptor ring. Unused ports must then free the CBDR because the driver will exit, but used ports can not pick up from where that code path left, since the CBDR API does not reinitialize a ring when setting it up, so its producer and consumer indices are out of sync between the software and hardware state. So a separate enetc_init_unused_port function was created, and it gets called right after the PF memory space is enabled.
Fixes: 07bf34a50e32 ("net: enetc: initialize the RFS and RSS memories") Reported-by: Michael Walle michael@walle.cc Cc: Jesse Brandeburg jesse.brandeburg@intel.com Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Tested-by: Michael Walle michael@walle.cc Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/freescale/enetc/enetc.c | 8 ++--- drivers/net/ethernet/freescale/enetc/enetc.h | 4 ++ drivers/net/ethernet/freescale/enetc/enetc_pf.c | 33 ++++++++++++++++++++---- 3 files changed, 36 insertions(+), 9 deletions(-)
--- a/drivers/net/ethernet/freescale/enetc/enetc.c +++ b/drivers/net/ethernet/freescale/enetc/enetc.c @@ -984,7 +984,7 @@ static void enetc_free_rxtx_rings(struct enetc_free_tx_ring(priv->tx_ring[i]); }
-static int enetc_alloc_cbdr(struct device *dev, struct enetc_cbdr *cbdr) +int enetc_alloc_cbdr(struct device *dev, struct enetc_cbdr *cbdr) { int size = cbdr->bd_count * sizeof(struct enetc_cbd);
@@ -1005,7 +1005,7 @@ static int enetc_alloc_cbdr(struct devic return 0; }
-static void enetc_free_cbdr(struct device *dev, struct enetc_cbdr *cbdr) +void enetc_free_cbdr(struct device *dev, struct enetc_cbdr *cbdr) { int size = cbdr->bd_count * sizeof(struct enetc_cbd);
@@ -1013,7 +1013,7 @@ static void enetc_free_cbdr(struct devic cbdr->bd_base = NULL; }
-static void enetc_setup_cbdr(struct enetc_hw *hw, struct enetc_cbdr *cbdr) +void enetc_setup_cbdr(struct enetc_hw *hw, struct enetc_cbdr *cbdr) { /* set CBDR cache attributes */ enetc_wr(hw, ENETC_SICAR2, @@ -1033,7 +1033,7 @@ static void enetc_setup_cbdr(struct enet cbdr->cir = hw->reg + ENETC_SICBDRCIR; }
-static void enetc_clear_cbdr(struct enetc_hw *hw) +void enetc_clear_cbdr(struct enetc_hw *hw) { enetc_wr(hw, ENETC_SICBDRMR, 0); } --- a/drivers/net/ethernet/freescale/enetc/enetc.h +++ b/drivers/net/ethernet/freescale/enetc/enetc.h @@ -310,6 +310,10 @@ int enetc_setup_tc(struct net_device *nd void enetc_set_ethtool_ops(struct net_device *ndev);
/* control buffer descriptor ring (CBDR) */ +int enetc_alloc_cbdr(struct device *dev, struct enetc_cbdr *cbdr); +void enetc_free_cbdr(struct device *dev, struct enetc_cbdr *cbdr); +void enetc_setup_cbdr(struct enetc_hw *hw, struct enetc_cbdr *cbdr); +void enetc_clear_cbdr(struct enetc_hw *hw); int enetc_set_mac_flt_entry(struct enetc_si *si, int index, char *mac_addr, int si_map); int enetc_clear_mac_flt_entry(struct enetc_si *si, int index); --- a/drivers/net/ethernet/freescale/enetc/enetc_pf.c +++ b/drivers/net/ethernet/freescale/enetc/enetc_pf.c @@ -1041,6 +1041,26 @@ static int enetc_init_port_rss_memory(st return err; }
+static void enetc_init_unused_port(struct enetc_si *si) +{ + struct device *dev = &si->pdev->dev; + struct enetc_hw *hw = &si->hw; + int err; + + si->cbd_ring.bd_count = ENETC_CBDR_DEFAULT_SIZE; + err = enetc_alloc_cbdr(dev, &si->cbd_ring); + if (err) + return; + + enetc_setup_cbdr(hw, &si->cbd_ring); + + enetc_init_port_rfs_memory(si); + enetc_init_port_rss_memory(si); + + enetc_clear_cbdr(hw); + enetc_free_cbdr(dev, &si->cbd_ring); +} + static int enetc_pf_probe(struct pci_dev *pdev, const struct pci_device_id *ent) { @@ -1051,11 +1071,6 @@ static int enetc_pf_probe(struct pci_dev struct enetc_pf *pf; int err;
- if (node && !of_device_is_available(node)) { - dev_info(&pdev->dev, "device is disabled, skipping\n"); - return -ENODEV; - } - err = enetc_pci_probe(pdev, KBUILD_MODNAME, sizeof(*pf)); if (err) { dev_err(&pdev->dev, "PCI probing failed\n"); @@ -1069,6 +1084,13 @@ static int enetc_pf_probe(struct pci_dev goto err_map_pf_space; }
+ if (node && !of_device_is_available(node)) { + enetc_init_unused_port(si); + dev_info(&pdev->dev, "device is disabled, skipping\n"); + err = -ENODEV; + goto err_device_disabled; + } + pf = enetc_si_priv(si); pf->si = si; pf->total_vfs = pci_sriov_get_totalvfs(pdev); @@ -1151,6 +1173,7 @@ err_alloc_si_res: si->ndev = NULL; free_netdev(ndev); err_alloc_netdev: +err_device_disabled: err_map_pf_space: enetc_pci_remove(pdev);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Vladimir Oltean vladimir.oltean@nxp.com
commit 6d36ecdbc4410e61a0e02adc5d3abeee22a8ffd3 upstream.
The workaround for the ENETC MDIO erratum caused a performance degradation of 82 Kpps (seen with IP forwarding of two 1Gbps streams of 64B packets). This is due to excessive locking and unlocking in the fast path, which can be avoided.
By taking the MDIO read-side lock only once per NAPI poll cycle, we are able to regain 54 Kpps (65%) of the performance hit. The rest of the performance degradation comes from the TX data path, but unfortunately it doesn't look like we can optimize that away easily, even with netdev_xmit_more(), there just isn't any skb batching done, to help with taking the MDIO lock less often than once per packet.
We need to change the register accessor type for enetc_get_tx_tstamp, because it now runs under the enetc_lock_mdio as per the new call path detailed below:
enetc_msix -> napi_schedule -> enetc_poll -> enetc_lock_mdio -> enetc_clean_tx_ring -> enetc_get_tx_tstamp -> enetc_clean_rx_ring -> enetc_unlock_mdio
Fixes: fd5736bf9f23 ("enetc: Workaround for MDIO register access issue") Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/freescale/enetc/enetc.c | 31 ++++++------------------ drivers/net/ethernet/freescale/enetc/enetc_hw.h | 2 + 2 files changed, 11 insertions(+), 22 deletions(-)
--- a/drivers/net/ethernet/freescale/enetc/enetc.c +++ b/drivers/net/ethernet/freescale/enetc/enetc.c @@ -281,6 +281,8 @@ static int enetc_poll(struct napi_struct int work_done; int i;
+ enetc_lock_mdio(); + for (i = 0; i < v->count_tx_rings; i++) if (!enetc_clean_tx_ring(&v->tx_ring[i], budget)) complete = false; @@ -291,8 +293,10 @@ static int enetc_poll(struct napi_struct if (work_done) v->rx_napi_work = true;
- if (!complete) + if (!complete) { + enetc_unlock_mdio(); return budget; + }
napi_complete_done(napi, work_done);
@@ -301,8 +305,6 @@ static int enetc_poll(struct napi_struct
v->rx_napi_work = false;
- enetc_lock_mdio(); - /* enable interrupts */ enetc_wr_reg_hot(v->rbier, ENETC_RBIER_RXTIE);
@@ -327,8 +329,8 @@ static void enetc_get_tx_tstamp(struct e { u32 lo, hi, tstamp_lo;
- lo = enetc_rd(hw, ENETC_SICTR0); - hi = enetc_rd(hw, ENETC_SICTR1); + lo = enetc_rd_hot(hw, ENETC_SICTR0); + hi = enetc_rd_hot(hw, ENETC_SICTR1); tstamp_lo = le32_to_cpu(txbd->wb.tstamp); if (lo <= tstamp_lo) hi -= 1; @@ -358,9 +360,7 @@ static bool enetc_clean_tx_ring(struct e i = tx_ring->next_to_clean; tx_swbd = &tx_ring->tx_swbd[i];
- enetc_lock_mdio(); bds_to_clean = enetc_bd_ready_count(tx_ring, i); - enetc_unlock_mdio();
do_tstamp = false;
@@ -403,8 +403,6 @@ static bool enetc_clean_tx_ring(struct e tx_swbd = tx_ring->tx_swbd; }
- enetc_lock_mdio(); - /* BD iteration loop end */ if (is_eof) { tx_frm_cnt++; @@ -415,8 +413,6 @@ static bool enetc_clean_tx_ring(struct e
if (unlikely(!bds_to_clean)) bds_to_clean = enetc_bd_ready_count(tx_ring, i); - - enetc_unlock_mdio(); }
tx_ring->next_to_clean = i; @@ -660,8 +656,6 @@ static int enetc_clean_rx_ring(struct en u32 bd_status; u16 size;
- enetc_lock_mdio(); - if (cleaned_cnt >= ENETC_RXBD_BUNDLE) { int count = enetc_refill_rx_ring(rx_ring, cleaned_cnt);
@@ -672,19 +666,15 @@ static int enetc_clean_rx_ring(struct en
rxbd = enetc_rxbd(rx_ring, i); bd_status = le32_to_cpu(rxbd->r.lstatus); - if (!bd_status) { - enetc_unlock_mdio(); + if (!bd_status) break; - }
enetc_wr_reg_hot(rx_ring->idr, BIT(rx_ring->index)); dma_rmb(); /* for reading other rxbd fields */ size = le16_to_cpu(rxbd->r.buf_len); skb = enetc_map_rx_buff_to_skb(rx_ring, i, size); - if (!skb) { - enetc_unlock_mdio(); + if (!skb) break; - }
enetc_get_offloads(rx_ring, rxbd, skb);
@@ -696,7 +686,6 @@ static int enetc_clean_rx_ring(struct en
if (unlikely(bd_status & ENETC_RXBD_LSTATUS(ENETC_RXBD_ERR_MASK))) { - enetc_unlock_mdio(); dev_kfree_skb(skb); while (!(bd_status & ENETC_RXBD_LSTATUS_F)) { dma_rmb(); @@ -736,8 +725,6 @@ static int enetc_clean_rx_ring(struct en
enetc_process_skb(rx_ring, skb);
- enetc_unlock_mdio(); - napi_gro_receive(napi, skb);
rx_frm_cnt++; --- a/drivers/net/ethernet/freescale/enetc/enetc_hw.h +++ b/drivers/net/ethernet/freescale/enetc/enetc_hw.h @@ -453,6 +453,8 @@ static inline u64 _enetc_rd_reg64_wa(voi #define enetc_wr_reg(reg, val) _enetc_wr_reg_wa((reg), (val)) #define enetc_rd(hw, off) enetc_rd_reg((hw)->reg + (off)) #define enetc_wr(hw, off, val) enetc_wr_reg((hw)->reg + (off), val) +#define enetc_rd_hot(hw, off) enetc_rd_reg_hot((hw)->reg + (off)) +#define enetc_wr_hot(hw, off, val) enetc_wr_reg_hot((hw)->reg + (off), val) #define enetc_rd64(hw, off) _enetc_rd_reg64_wa((hw)->reg + (off)) /* port register accessors - PF only */ #define enetc_port_rd(hw, off) enetc_rd_reg((hw)->port + (off))
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Vladimir Oltean vladimir.oltean@nxp.com
commit 827b6fd046516af605e190c872949f22208b5d41 upstream.
When the enetc ports have rx-vlan-offload enabled, they report a TPID of ETH_P_8021Q regardless of what was actually in the packet. When rx-vlan-offload is disabled, packets have the proper TPID. Fix this inconsistency by finishing the TODO left in the code.
Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers") Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/freescale/enetc/enetc.c | 34 ++++++++++++++++++------ drivers/net/ethernet/freescale/enetc/enetc_hw.h | 3 ++ 2 files changed, 29 insertions(+), 8 deletions(-)
--- a/drivers/net/ethernet/freescale/enetc/enetc.c +++ b/drivers/net/ethernet/freescale/enetc/enetc.c @@ -523,9 +523,8 @@ static void enetc_get_rx_tstamp(struct n static void enetc_get_offloads(struct enetc_bdr *rx_ring, union enetc_rx_bd *rxbd, struct sk_buff *skb) { -#ifdef CONFIG_FSL_ENETC_PTP_CLOCK struct enetc_ndev_priv *priv = netdev_priv(rx_ring->ndev); -#endif + /* TODO: hashing */ if (rx_ring->ndev->features & NETIF_F_RXCSUM) { u16 inet_csum = le16_to_cpu(rxbd->r.inet_csum); @@ -534,12 +533,31 @@ static void enetc_get_offloads(struct en skb->ip_summed = CHECKSUM_COMPLETE; }
- /* copy VLAN to skb, if one is extracted, for now we assume it's a - * standard TPID, but HW also supports custom values - */ - if (le16_to_cpu(rxbd->r.flags) & ENETC_RXBD_FLAG_VLAN) - __vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), - le16_to_cpu(rxbd->r.vlan_opt)); + if (le16_to_cpu(rxbd->r.flags) & ENETC_RXBD_FLAG_VLAN) { + __be16 tpid = 0; + + switch (le16_to_cpu(rxbd->r.flags) & ENETC_RXBD_FLAG_TPID) { + case 0: + tpid = htons(ETH_P_8021Q); + break; + case 1: + tpid = htons(ETH_P_8021AD); + break; + case 2: + tpid = htons(enetc_port_rd(&priv->si->hw, + ENETC_PCVLANR1)); + break; + case 3: + tpid = htons(enetc_port_rd(&priv->si->hw, + ENETC_PCVLANR2)); + break; + default: + break; + } + + __vlan_hwaccel_put_tag(skb, tpid, le16_to_cpu(rxbd->r.vlan_opt)); + } + #ifdef CONFIG_FSL_ENETC_PTP_CLOCK if (priv->active_offloads & ENETC_F_RX_TSTAMP) enetc_get_rx_tstamp(rx_ring->ndev, rxbd, skb); --- a/drivers/net/ethernet/freescale/enetc/enetc_hw.h +++ b/drivers/net/ethernet/freescale/enetc/enetc_hw.h @@ -172,6 +172,8 @@ enum enetc_bdr_type {TX, RX}; #define ENETC_PSIPMAR0(n) (0x0100 + (n) * 0x8) /* n = SI index */ #define ENETC_PSIPMAR1(n) (0x0104 + (n) * 0x8) #define ENETC_PVCLCTR 0x0208 +#define ENETC_PCVLANR1 0x0210 +#define ENETC_PCVLANR2 0x0214 #define ENETC_VLAN_TYPE_C BIT(0) #define ENETC_VLAN_TYPE_S BIT(1) #define ENETC_PVCLCTR_OVTPIDL(bmp) ((bmp) & 0xff) /* VLAN_TYPE */ @@ -570,6 +572,7 @@ union enetc_rx_bd { #define ENETC_RXBD_LSTATUS(flags) ((flags) << 16) #define ENETC_RXBD_FLAG_VLAN BIT(9) #define ENETC_RXBD_FLAG_TSTMP BIT(10) +#define ENETC_RXBD_FLAG_TPID GENMASK(1, 0)
#define ENETC_MAC_ADDR_FILT_CNT 8 /* # of supported entries per port */ #define EMETC_MAC_ADDR_FILT_RES 3 /* # of reserved entries at the beginning */
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Vladimir Oltean vladimir.oltean@nxp.com
commit a74dbce9d4541888fe0d39afe69a3a95004669b4 upstream.
Quoting from the blamed commit:
In promiscuous mode, it is more intuitive that all traffic is received, including VLAN tagged traffic. It appears that it is necessary to set the flag in PSIPVMR for that to be the case, so VLAN promiscuous mode is also temporarily enabled. On exit from promiscuous mode, the setting made by ethtool is restored.
Intuitive or not, there isn't any definition issued by a standards body which says that promiscuity has anything to do with VLAN filtering - it only has to do with accepting packets regardless of destination MAC address.
In fact people are already trying to use this misunderstanding/bug of the enetc driver as a justification to transform promiscuity into something it never was about: accepting every packet (maybe that would be the "rx-all" netdev feature?): https://lore.kernel.org/netdev/20201110153958.ci5ekor3o2ekg3ky@ipetronik.com...
This is relevant because there are use cases in the kernel (such as tc-flower rules with the protocol 802.1Q and a vlan_id key) which do not (yet) use the vlan_vid_add API to be compatible with VLAN-filtering NICs such as enetc, so for those, disabling rx-vlan-filter is currently the only right solution to make these setups work: https://lore.kernel.org/netdev/CA+h21hoxwRdhq4y+w8Kwgm74d4cA0xLeiHTrmT-VpSaM... The blamed patch has unintentionally introduced one more way for this to work, which is to enable IFF_PROMISC, however this is non-portable because port promiscuity is not meant to disable VLAN filtering. Therefore, it could invite people to write broken scripts for enetc, and then wonder why they are broken when migrating to other drivers that don't handle promiscuity in the same way.
Fixes: 7070eea5e95a ("enetc: permit configuration of rx-vlan-filter with ethtool") Cc: Markus Blöchl Markus.Bloechl@ipetronik.com Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/freescale/enetc/enetc_pf.c | 5 ----- 1 file changed, 5 deletions(-)
--- a/drivers/net/ethernet/freescale/enetc/enetc_pf.c +++ b/drivers/net/ethernet/freescale/enetc/enetc_pf.c @@ -190,7 +190,6 @@ static void enetc_pf_set_rx_mode(struct { struct enetc_ndev_priv *priv = netdev_priv(ndev); struct enetc_pf *pf = enetc_si_priv(priv->si); - char vlan_promisc_simap = pf->vlan_promisc_simap; struct enetc_hw *hw = &priv->si->hw; bool uprom = false, mprom = false; struct enetc_mac_filter *filter; @@ -203,16 +202,12 @@ static void enetc_pf_set_rx_mode(struct psipmr = ENETC_PSIPMR_SET_UP(0) | ENETC_PSIPMR_SET_MP(0); uprom = true; mprom = true; - /* Enable VLAN promiscuous mode for SI0 (PF) */ - vlan_promisc_simap |= BIT(0); } else if (ndev->flags & IFF_ALLMULTI) { /* enable multi cast promisc mode for SI0 (PF) */ psipmr = ENETC_PSIPMR_SET_MP(0); mprom = true; }
- enetc_set_vlan_promisc(&pf->si->hw, vlan_promisc_simap); - /* first 2 filter entries belong to PF */ if (!uprom) { /* Update unicast filters */
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Vladimir Oltean vladimir.oltean@nxp.com
commit c76a97218dcbb2cb7cec1404ace43ef96c87d874 upstream.
The ENETC port 0 MAC supports in-band status signaling coming from a PHY when operating in RGMII mode, and this feature is enabled by default.
It has been reported that RGMII is broken in fixed-link, and that is not surprising considering the fact that no PHY is attached to the MAC in that case, but a switch.
This brings us to the topic of the patch: the enetc driver should have not enabled the optional in-band status signaling for RGMII unconditionally, but should have forced the speed and duplex to what was resolved by phylink.
Note that phylink does not accept the RGMII modes as valid for in-band signaling, and these operate a bit differently than 1000base-x and SGMII (notably there is no clause 37 state machine so no ACK required from the MAC, instead the PHY sends extra code words on RXD[3:0] whenever it is not transmitting something else, so it should be safe to leave a PHY with this option unconditionally enabled even if we ignore it). The spec talks about this here: https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-fil...
Fixes: 71b77a7a27a3 ("enetc: Migrate to PHYLINK and PCS_LYNX") Cc: Florian Fainelli f.fainelli@gmail.com Cc: Andrew Lunn andrew@lunn.ch Cc: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Acked-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/freescale/enetc/enetc_hw.h | 13 ++++- drivers/net/ethernet/freescale/enetc/enetc_pf.c | 53 ++++++++++++++++++++---- 2 files changed, 56 insertions(+), 10 deletions(-)
--- a/drivers/net/ethernet/freescale/enetc/enetc_hw.h +++ b/drivers/net/ethernet/freescale/enetc/enetc_hw.h @@ -238,10 +238,17 @@ enum enetc_bdr_type {TX, RX}; #define ENETC_PM_IMDIO_BASE 0x8030
#define ENETC_PM0_IF_MODE 0x8300 -#define ENETC_PMO_IFM_RG BIT(2) +#define ENETC_PM0_IFM_RG BIT(2) #define ENETC_PM0_IFM_RLP (BIT(5) | BIT(11)) -#define ENETC_PM0_IFM_RGAUTO (BIT(15) | ENETC_PMO_IFM_RG | BIT(1)) -#define ENETC_PM0_IFM_XGMII BIT(12) +#define ENETC_PM0_IFM_EN_AUTO BIT(15) +#define ENETC_PM0_IFM_SSP_MASK GENMASK(14, 13) +#define ENETC_PM0_IFM_SSP_1000 (2 << 13) +#define ENETC_PM0_IFM_SSP_100 (0 << 13) +#define ENETC_PM0_IFM_SSP_10 (1 << 13) +#define ENETC_PM0_IFM_FULL_DPX BIT(12) +#define ENETC_PM0_IFM_IFMODE_MASK GENMASK(1, 0) +#define ENETC_PM0_IFM_IFMODE_XGMII 0 +#define ENETC_PM0_IFM_IFMODE_GMII 2 #define ENETC_PSIDCAPR 0x1b08 #define ENETC_PSIDCAPR_MSK GENMASK(15, 0) #define ENETC_PSFCAPR 0x1b18 --- a/drivers/net/ethernet/freescale/enetc/enetc_pf.c +++ b/drivers/net/ethernet/freescale/enetc/enetc_pf.c @@ -315,7 +315,7 @@ static void enetc_set_loopback(struct ne u32 reg;
reg = enetc_port_rd(hw, ENETC_PM0_IF_MODE); - if (reg & ENETC_PMO_IFM_RG) { + if (reg & ENETC_PM0_IFM_RG) { /* RGMII mode */ reg = (reg & ~ENETC_PM0_IFM_RLP) | (en ? ENETC_PM0_IFM_RLP : 0); @@ -494,13 +494,20 @@ static void enetc_configure_port_mac(str
static void enetc_mac_config(struct enetc_hw *hw, phy_interface_t phy_mode) { - /* set auto-speed for RGMII */ - if (enetc_port_rd(hw, ENETC_PM0_IF_MODE) & ENETC_PMO_IFM_RG || - phy_interface_mode_is_rgmii(phy_mode)) - enetc_port_wr(hw, ENETC_PM0_IF_MODE, ENETC_PM0_IFM_RGAUTO); + u32 val; + + if (phy_interface_mode_is_rgmii(phy_mode)) { + val = enetc_port_rd(hw, ENETC_PM0_IF_MODE); + val &= ~ENETC_PM0_IFM_EN_AUTO; + val &= ENETC_PM0_IFM_IFMODE_MASK; + val |= ENETC_PM0_IFM_IFMODE_GMII | ENETC_PM0_IFM_RG; + enetc_port_wr(hw, ENETC_PM0_IF_MODE, val); + }
- if (phy_mode == PHY_INTERFACE_MODE_USXGMII) - enetc_port_wr(hw, ENETC_PM0_IF_MODE, ENETC_PM0_IFM_XGMII); + if (phy_mode == PHY_INTERFACE_MODE_USXGMII) { + val = ENETC_PM0_IFM_FULL_DPX | ENETC_PM0_IFM_IFMODE_XGMII; + enetc_port_wr(hw, ENETC_PM0_IF_MODE, val); + } }
static void enetc_mac_enable(struct enetc_hw *hw, bool en) @@ -932,6 +939,34 @@ static void enetc_pl_mac_config(struct p phylink_set_pcs(priv->phylink, &pf->pcs->pcs); }
+static void enetc_force_rgmii_mac(struct enetc_hw *hw, int speed, int duplex) +{ + u32 old_val, val; + + old_val = val = enetc_port_rd(hw, ENETC_PM0_IF_MODE); + + if (speed == SPEED_1000) { + val &= ~ENETC_PM0_IFM_SSP_MASK; + val |= ENETC_PM0_IFM_SSP_1000; + } else if (speed == SPEED_100) { + val &= ~ENETC_PM0_IFM_SSP_MASK; + val |= ENETC_PM0_IFM_SSP_100; + } else if (speed == SPEED_10) { + val &= ~ENETC_PM0_IFM_SSP_MASK; + val |= ENETC_PM0_IFM_SSP_10; + } + + if (duplex == DUPLEX_FULL) + val |= ENETC_PM0_IFM_FULL_DPX; + else + val &= ~ENETC_PM0_IFM_FULL_DPX; + + if (val == old_val) + return; + + enetc_port_wr(hw, ENETC_PM0_IF_MODE, val); +} + static void enetc_pl_mac_link_up(struct phylink_config *config, struct phy_device *phy, unsigned int mode, phy_interface_t interface, int speed, @@ -944,6 +979,10 @@ static void enetc_pl_mac_link_up(struct if (priv->active_offloads & ENETC_F_QBV) enetc_sched_speed_set(priv, speed);
+ if (!phylink_autoneg_inband(mode) && + phy_interface_mode_is_rgmii(interface)) + enetc_force_rgmii_mac(&pf->si->hw, speed, duplex); + enetc_mac_enable(&pf->si->hw, true); }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Vladimir Oltean vladimir.oltean@nxp.com
commit 96a5223b918c8b79270fc0fec235a7ebad459098 upstream.
The Station Interface Receive Interrupt Detect Register (SIRXIDR) contains a 16-bit wide mask of 'interrupt detected' events for each ring associated with a port. Bit i is write-1-to-clean for RX ring i.
I have no explanation whatsoever how this line of code came to be inserted in the blamed commit. I checked the downstream versions of that patch and none of them have it.
The somewhat comical aspect of it is that we're writing a binary number to the SIRXIDR register, which is derived from enetc_bd_unused(rx_ring). Since the RX rings have 512 buffer descriptors, we end up writing 511 to this register, which is 0x1ff, so we are effectively clearing the 'interrupt detected' event for rings 0-8.
This register is not what is used for interrupt handling though - it only provides a summary for the entire SI. The hardware provides one separate Interrupt Detect Register per RX ring, which auto-clears upon read. So there doesn't seem to be any adverse effect caused by this bogus write.
There is, however, one reason why this should be handled as a bugfix: next_to_clean _should_ be committed to hardware, just not to that register, and this was obscuring the fact that it wasn't. This is fixed in the next patch, and removing the bogus line now allows the fix patch to be backported beyond that point.
Fixes: fd5736bf9f23 ("enetc: Workaround for MDIO register access issue") Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/freescale/enetc/enetc.c | 1 - 1 file changed, 1 deletion(-)
--- a/drivers/net/ethernet/freescale/enetc/enetc.c +++ b/drivers/net/ethernet/freescale/enetc/enetc.c @@ -1212,7 +1212,6 @@ static void enetc_setup_rxbdr(struct ene rx_ring->idr = hw->reg + ENETC_SIRXIDR;
enetc_refill_rx_ring(rx_ring, enetc_bd_unused(rx_ring)); - enetc_wr(hw, ENETC_SIRXIDR, rx_ring->next_to_use);
/* enable ring */ enetc_rxbdr_wr(hw, idx, ENETC_RBMR, rbmr);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Vladimir Oltean vladimir.oltean@nxp.com
commit 3a5d12c9be6f30080600c8bacaf310194e37d029 upstream.
The RX rings have a producer index owned by hardware, where newly received frame buffers are placed, and a consumer index owned by software, where newly allocated buffers are placed, in expectation of hardware being able to place frame data in them.
Hardware increments the producer index when a frame is received, however it is not allowed to increment the producer index to match the consumer index (RBCIR) since the ring can hold at most RBLENR[LENGTH]-1 received BDs. Whenever the producer index matches the value of the consumer index, the ring has no unprocessed received frames and all BDs in the ring have been initialized/prepared by software, i.e. hardware owns all BDs in the ring.
The code uses the next_to_clean variable to keep track of the producer index, and the next_to_use variable to keep track of the consumer index.
The RX rings are seeded from enetc_refill_rx_ring, which is called from two places:
1. initially the ring is seeded until full with enetc_bd_unused(rx_ring), i.e. with 511 buffers. This will make next_to_clean=0 and next_to_use=511:
.ndo_open -> enetc_open -> enetc_setup_bdrs -> enetc_setup_rxbdr -> enetc_refill_rx_ring
2. then during the data path processing, it is refilled with 16 buffers at a time:
enetc_msix -> napi_schedule -> enetc_poll -> enetc_clean_rx_ring -> enetc_refill_rx_ring
There is just one problem: the initial seeding done during .ndo_open updates just the producer index (ENETC_RBPIR) with 0, and the software next_to_clean and next_to_use variables. Notably, it will not update the consumer index to make the hardware aware of the newly added buffers.
Wait, what? So how does it work?
Well, the reset values of the producer index and of the consumer index of a ring are both zero. As per the description in the second paragraph, it means that the ring is full of buffers waiting for hardware to put frames in them, which by coincidence is almost true, because we have in fact seeded 511 buffers into the ring.
But will the hardware attempt to access the 512th entry of the ring, which has an invalid BD in it? Well, no, because in order to do that, it would have to first populate the first 511 entries, and the NAPI enetc_poll will kick in by then. Eventually, after 16 processed slots have become available in the RX ring, enetc_clean_rx_ring will call enetc_refill_rx_ring and then will [ finally ] update the consumer index with the new software next_to_use variable. From now on, the next_to_clean and next_to_use variables are in sync with the producer and consumer ring indices.
So the day is saved, right? Well, not quite. Freeing the memory allocated for the rings is done in:
enetc_close -> enetc_clear_bdrs -> enetc_clear_rxbdr -> this just disables the ring -> enetc_free_rxtx_rings -> enetc_free_rx_ring -> sets next_to_clean and next_to_use to 0
but again, nothing is committed to the hardware producer and consumer indices (yay!). The assumption is that the ring is disabled, so the indices don't matter anyway, and it's the responsibility of the "open" code path to set those up.
.. Except that the "open" code path does not set those up properly.
While initially, things almost work, during subsequent enetc_close -> enetc_open sequences, we have problems. To be precise, the enetc_open that is subsequent to enetc_close will again refill the ring with 511 entries, but it will leave the consumer index untouched. Untouched means, of course, equal to the value it had before disabling the ring and draining the old buffers in enetc_close.
But as mentioned, enetc_setup_rxbdr will at least update the producer index though, through this line of code:
enetc_rxbdr_wr(hw, idx, ENETC_RBPIR, 0);
so at this stage we'll have:
next_to_clean=0 (in hardware 0) next_to_use=511 (in hardware we'll have the refill index prior to enetc_close)
Again, the next_to_clean and producer index are in sync and set to correct values, so the driver manages to limp on. Eventually, 16 ring entries will be consumed by enetc_poll, and the savior enetc_clean_rx_ring will come and call enetc_refill_rx_ring, and then update the hardware consumer ring based upon the new next_to_use.
So.. it works? Well, by coincidence, it almost does, but there's a circumstance where enetc_clean_rx_ring won't be there to save us. If the previous value of the consumer index was 15, there's a problem, because the NAPI poll sequence will only issue a refill when 16 or more buffers have been consumed.
It's easiest to illustrate this with an example:
ip link set eno0 up ip addr add 192.168.100.1/24 dev eno0 ping 192.168.100.1 -c 20 # ping this port from another board ip link set eno0 down ip link set eno0 up ping 192.168.100.1 -c 20 # ping it again from the same other board
One by one:
1. ip link set eno0 up -> calls enetc_setup_rxbdr: -> calls enetc_refill_rx_ring(511 buffers) -> next_to_clean=0 (in hw 0) -> next_to_use=511 (in hw 0)
2. ping 192.168.100.1 -c 20 # ping this port from another board enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=1 next_to_clean 0 (in hw 1) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=2 next_to_clean 1 (in hw 2) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=3 next_to_clean 2 (in hw 3) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=4 next_to_clean 3 (in hw 4) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=5 next_to_clean 4 (in hw 5) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=6 next_to_clean 5 (in hw 6) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=7 next_to_clean 6 (in hw 7) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=8 next_to_clean 7 (in hw 8) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=9 next_to_clean 8 (in hw 9) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=10 next_to_clean 9 (in hw 10) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=11 next_to_clean 10 (in hw 11) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=12 next_to_clean 11 (in hw 12) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=13 next_to_clean 12 (in hw 13) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=14 next_to_clean 13 (in hw 14) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=15 next_to_clean 14 (in hw 15) next_to_use 511 (in hw 0) enetc_clean_rx_ring: enetc_refill_rx_ring(16) increments next_to_use by 16 (mod 512) and writes it to hw enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=0 next_to_clean 15 (in hw 16) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=1 next_to_clean 16 (in hw 17) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=2 next_to_clean 17 (in hw 18) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=3 next_to_clean 18 (in hw 19) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=4 next_to_clean 19 (in hw 20) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=5 next_to_clean 20 (in hw 21) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=6 next_to_clean 21 (in hw 22) next_to_use 15 (in hw 15)
20 packets transmitted, 20 packets received, 0% packet loss
3. ip link set eno0 down enetc_free_rx_ring: next_to_clean 0 (in hw 22), next_to_use 0 (in hw 15)
4. ip link set eno0 up -> calls enetc_setup_rxbdr: -> calls enetc_refill_rx_ring(511 buffers) -> next_to_clean=0 (in hw 0) -> next_to_use=511 (in hw 15)
5. ping 192.168.100.1 -c 20 # ping it again from the same other board enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=1 next_to_clean 0 (in hw 1) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=2 next_to_clean 1 (in hw 2) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=3 next_to_clean 2 (in hw 3) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=4 next_to_clean 3 (in hw 4) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=5 next_to_clean 4 (in hw 5) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=6 next_to_clean 5 (in hw 6) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=7 next_to_clean 6 (in hw 7) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=8 next_to_clean 7 (in hw 8) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=9 next_to_clean 8 (in hw 9) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=10 next_to_clean 9 (in hw 10) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=11 next_to_clean 10 (in hw 11) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=12 next_to_clean 11 (in hw 12) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=13 next_to_clean 12 (in hw 13) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=14 next_to_clean 13 (in hw 14) next_to_use 511 (in hw 15)
20 packets transmitted, 12 packets received, 40% packet loss
And there it dies. No enetc_refill_rx_ring (because cleaned_cnt must be equal to 15 for that to happen), no nothing. The hardware enters the condition where the producer (14) + 1 is equal to the consumer (15) index, which makes it believe it has no more free buffers to put packets in, so it starts discarding them:
ip netns exec ns0 ethtool -S eno0 | grep -v ': 0' NIC statistics: Rx ring 0 discarded frames: 8
Summarized, if the interface receives between 16 and 32 (mod 512) frames and then there is a link flap, then the port will eventually die with no way to recover. If it receives less than 16 (mod 512) frames, then the initial NAPI poll [ before the link flap ] will not update the consumer index in hardware (it will remain zero) which will be ok when the buffers are later reinitialized. If more than 32 (mod 512) frames are received, the initial NAPI poll has the chance to refill the ring twice, updating the consumer index to at least 32. So after the link flap, the consumer index is still wrong, but the post-flap NAPI poll gets a chance to refill the ring once (because it passes through cleaned_cnt=15) and makes the consumer index be again back in sync with next_to_use.
The solution to this problem is actually simple, we just need to write next_to_use into the hardware consumer index at enetc_open time, which always brings it back in sync after an initial buffer seeding process.
The simpler thing would be to put the write to the consumer index into enetc_refill_rx_ring directly, but there are issues with the MDIO locking: in the NAPI poll code we have the enetc_lock_mdio() taken from top-level and we use the unlocked enetc_wr_reg_hot, whereas in enetc_open, the enetc_lock_mdio() is not taken at the top level, but instead by each individual enetc_wr_reg, so we are forced to put an additional enetc_wr_reg in enetc_setup_rxbdr. Better organization of the code is left as a refactoring exercise.
Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers") Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/freescale/enetc/enetc.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/net/ethernet/freescale/enetc/enetc.c +++ b/drivers/net/ethernet/freescale/enetc/enetc.c @@ -1212,6 +1212,8 @@ static void enetc_setup_rxbdr(struct ene rx_ring->idr = hw->reg + ENETC_SIRXIDR;
enetc_refill_rx_ring(rx_ring, enetc_bd_unused(rx_ring)); + /* update ENETC's consumer index */ + enetc_rxbdr_wr(hw, idx, ENETC_RBCIR, rx_ring->next_to_use);
/* enable ring */ enetc_rxbdr_wr(hw, idx, ENETC_RBMR, rbmr);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: DENG Qingfang dqfext@gmail.com
commit 9200f515c41f4cbaeffd8fdd1d8b6373a18b1b67 upstream.
A different TPID bit is used for 802.1ad VLAN frames.
Reported-by: Ilario Gelmetti iochesonome@gmail.com Fixes: f0af34317f4b ("net: dsa: mediatek: combine MediaTek tag with VLAN tag") Signed-off-by: DENG Qingfang dqfext@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/dsa/tag_mtk.c | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-)
--- a/net/dsa/tag_mtk.c +++ b/net/dsa/tag_mtk.c @@ -13,6 +13,7 @@ #define MTK_HDR_LEN 4 #define MTK_HDR_XMIT_UNTAGGED 0 #define MTK_HDR_XMIT_TAGGED_TPID_8100 1 +#define MTK_HDR_XMIT_TAGGED_TPID_88A8 2 #define MTK_HDR_RECV_SOURCE_PORT_MASK GENMASK(2, 0) #define MTK_HDR_XMIT_DP_BIT_MASK GENMASK(5, 0) #define MTK_HDR_XMIT_SA_DIS BIT(6) @@ -21,8 +22,8 @@ static struct sk_buff *mtk_tag_xmit(stru struct net_device *dev) { struct dsa_port *dp = dsa_slave_to_port(dev); + u8 xmit_tpid; u8 *mtk_tag; - bool is_vlan_skb = true; unsigned char *dest = eth_hdr(skb)->h_dest; bool is_multicast_skb = is_multicast_ether_addr(dest) && !is_broadcast_ether_addr(dest); @@ -33,10 +34,17 @@ static struct sk_buff *mtk_tag_xmit(stru * the both special and VLAN tag at the same time and then look up VLAN * table with VID. */ - if (!skb_vlan_tagged(skb)) { + switch (skb->protocol) { + case htons(ETH_P_8021Q): + xmit_tpid = MTK_HDR_XMIT_TAGGED_TPID_8100; + break; + case htons(ETH_P_8021AD): + xmit_tpid = MTK_HDR_XMIT_TAGGED_TPID_88A8; + break; + default: + xmit_tpid = MTK_HDR_XMIT_UNTAGGED; skb_push(skb, MTK_HDR_LEN); memmove(skb->data, skb->data + MTK_HDR_LEN, 2 * ETH_ALEN); - is_vlan_skb = false; }
mtk_tag = skb->data + 2 * ETH_ALEN; @@ -44,8 +52,7 @@ static struct sk_buff *mtk_tag_xmit(stru /* Mark tag attribute on special tag insertion to notify hardware * whether that's a combined special tag with 802.1Q header. */ - mtk_tag[0] = is_vlan_skb ? MTK_HDR_XMIT_TAGGED_TPID_8100 : - MTK_HDR_XMIT_UNTAGGED; + mtk_tag[0] = xmit_tpid; mtk_tag[1] = (1 << dp->index) & MTK_HDR_XMIT_DP_BIT_MASK;
/* Disable SA learning for multicast frames */ @@ -53,7 +60,7 @@ static struct sk_buff *mtk_tag_xmit(stru mtk_tag[1] |= MTK_HDR_XMIT_SA_DIS;
/* Tag control information is kept for 802.1Q */ - if (!is_vlan_skb) { + if (xmit_tpid == MTK_HDR_XMIT_UNTAGGED) { mtk_tag[2] = 0; mtk_tag[3] = 0; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Biao Huang biao.huang@mediatek.com
commit 95b39f07a17faef3a9b225248ba449b976e529c8 upstream.
mtk_star_dma_unmap_rx() should unmap the dma_addr of old skb rather than that of new skb. Assign new_dma_addr to desc_data.dma_addr after all handling of old skb ends to avoid unexpected receive side error.
Fixes: f96e9641e92b ("net: ethernet: mtk-star-emac: fix error path in RX handling") Signed-off-by: Biao Huang biao.huang@mediatek.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mediatek/mtk_star_emac.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/mediatek/mtk_star_emac.c +++ b/drivers/net/ethernet/mediatek/mtk_star_emac.c @@ -1225,8 +1225,6 @@ static int mtk_star_receive_packet(struc goto push_new_skb; }
- desc_data.dma_addr = new_dma_addr; - /* We can't fail anymore at this point: it's safe to unmap the skb. */ mtk_star_dma_unmap_rx(priv, &desc_data);
@@ -1236,6 +1234,9 @@ static int mtk_star_receive_packet(struc desc_data.skb->dev = ndev; netif_receive_skb(desc_data.skb);
+ /* update dma_addr for new skb */ + desc_data.dma_addr = new_dma_addr; + push_new_skb: desc_data.len = skb_tailroom(new_skb); desc_data.skb = new_skb;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Kevin(Yudong) Yang yyd@google.com
commit 00ff801bb8ce6711e919af4530b6ffa14a22390a upstream.
This patch fixes a bug that the moderation config will not be applied when calling mlx4_en_reset_config. For example, when turning on rx timestamping, mlx4_en_reset_config() will be called, causing the NIC to forget previous moderation config.
This fix is in phase with a previous fix: commit 79c54b6bbf06 ("net/mlx4_en: Fix TX moderation info loss after set_ringparam is called")
Tested: Before this patch, on a host with NIC using mlx4, run netserver and stream TCP to the host at full utilization. $ sar -I SUM 1 INTR intr/s 14:03:56 sum 48758.00
After rx hwtstamp is enabled: $ sar -I SUM 1 14:10:38 sum 317771.00 We see the moderation is not working properly and issued 7x more interrupts.
After the patch, and turned on rx hwtstamp, the rate of interrupts is as expected: $ sar -I SUM 1 14:52:11 sum 49332.00
Fixes: 79c54b6bbf06 ("net/mlx4_en: Fix TX moderation info loss after set_ringparam is called") Signed-off-by: Kevin(Yudong) Yang yyd@google.com Reviewed-by: Eric Dumazet edumazet@google.com Reviewed-by: Neal Cardwell ncardwell@google.com CC: Tariq Toukan tariqt@nvidia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx4/en_ethtool.c | 2 +- drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 2 ++ drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 1 + 3 files changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c @@ -47,7 +47,7 @@ #define EN_ETHTOOL_SHORT_MASK cpu_to_be16(0xffff) #define EN_ETHTOOL_WORD_MASK cpu_to_be32(0xffffffff)
-static int mlx4_en_moderation_update(struct mlx4_en_priv *priv) +int mlx4_en_moderation_update(struct mlx4_en_priv *priv) { int i, t; int err = 0; --- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c @@ -3558,6 +3558,8 @@ int mlx4_en_reset_config(struct net_devi en_err(priv, "Failed starting port\n"); }
+ if (!err) + err = mlx4_en_moderation_update(priv); out: mutex_unlock(&mdev->state_lock); kfree(tmp); --- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h +++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h @@ -775,6 +775,7 @@ void mlx4_en_ptp_overflow_check(struct m #define DEV_FEATURE_CHANGED(dev, new_features, feature) \ ((dev->features & feature) ^ (new_features & feature))
+int mlx4_en_moderation_update(struct mlx4_en_priv *priv); int mlx4_en_reset_config(struct net_device *dev, struct hwtstamp_config ts_config, netdev_features_t new_features);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Ong Boon Leong boon.leong.ong@intel.com
commit 879c348c35bb5fb758dd881d8a97409c1862dae8 upstream.
We introduce dwmac410_dma_init_channel() here for both EQoS v4.10 and above which use different DMA_CH(n)_Interrupt_Enable bit definitions for NIE and AIE.
Fixes: 48863ce5940f ("stmmac: add DMA support for GMAC 4.xx") Signed-off-by: Ong Boon Leong boon.leong.ong@intel.com Signed-off-by: Ramesh Babu B ramesh.babu.b@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c @@ -124,6 +124,23 @@ static void dwmac4_dma_init_channel(void ioaddr + DMA_CHAN_INTR_ENA(chan)); }
+static void dwmac410_dma_init_channel(void __iomem *ioaddr, + struct stmmac_dma_cfg *dma_cfg, u32 chan) +{ + u32 value; + + /* common channel control register config */ + value = readl(ioaddr + DMA_CHAN_CONTROL(chan)); + if (dma_cfg->pblx8) + value = value | DMA_BUS_MODE_PBL; + + writel(value, ioaddr + DMA_CHAN_CONTROL(chan)); + + /* Mask interrupts by writing to CSR7 */ + writel(DMA_CHAN_INTR_DEFAULT_MASK_4_10, + ioaddr + DMA_CHAN_INTR_ENA(chan)); +} + static void dwmac4_dma_init(void __iomem *ioaddr, struct stmmac_dma_cfg *dma_cfg, int atds) { @@ -523,7 +540,7 @@ const struct stmmac_dma_ops dwmac4_dma_o const struct stmmac_dma_ops dwmac410_dma_ops = { .reset = dwmac4_dma_reset, .init = dwmac4_dma_init, - .init_chan = dwmac4_dma_init_channel, + .init_chan = dwmac410_dma_init_channel, .init_rx_chan = dwmac4_dma_init_rx_chan, .init_tx_chan = dwmac4_dma_init_tx_chan, .axi = dwmac4_dma_axi,
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Ido Schimmel idosch@nvidia.com
commit 76c03bf8e2624076b88d93542d78e22d5345c88e upstream.
As far as user space is concerned, blackhole nexthops do not have a nexthop device and therefore should not be affected by the administrative or carrier state of any netdev.
However, when the loopback netdev goes down all the blackhole nexthops are flushed. This happens because internally the kernel associates blackhole nexthops with the loopback netdev.
This behavior is both confusing to those not familiar with kernel internals and also diverges from the legacy API where blackhole IPv4 routes are not flushed when the loopback netdev goes down:
# ip route add blackhole 198.51.100.0/24 # ip link set dev lo down # ip route show 198.51.100.0/24 blackhole 198.51.100.0/24
Blackhole IPv6 routes are flushed, but at least user space knows that they are associated with the loopback netdev:
# ip -6 route show 2001:db8:1::/64 blackhole 2001:db8:1::/64 dev lo metric 1024 pref medium
Fix this by only flushing blackhole nexthops when the loopback netdev is unregistered.
Fixes: ab84be7e54fc ("net: Initial nexthop code") Signed-off-by: Ido Schimmel idosch@nvidia.com Reported-by: Donald Sharp sharpd@nvidia.com Reviewed-by: David Ahern dsahern@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/nexthop.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
--- a/net/ipv4/nexthop.c +++ b/net/ipv4/nexthop.c @@ -1364,7 +1364,7 @@ out:
/* rtnl */ /* remove all nexthops tied to a device being deleted */ -static void nexthop_flush_dev(struct net_device *dev) +static void nexthop_flush_dev(struct net_device *dev, unsigned long event) { unsigned int hash = nh_dev_hashfn(dev->ifindex); struct net *net = dev_net(dev); @@ -1376,6 +1376,10 @@ static void nexthop_flush_dev(struct net if (nhi->fib_nhc.nhc_dev != dev) continue;
+ if (nhi->reject_nh && + (event == NETDEV_DOWN || event == NETDEV_CHANGE)) + continue; + remove_nexthop(net, nhi->nh_parent, NULL); } } @@ -2122,11 +2126,11 @@ static int nh_netdev_event(struct notifi switch (event) { case NETDEV_DOWN: case NETDEV_UNREGISTER: - nexthop_flush_dev(dev); + nexthop_flush_dev(dev, event); break; case NETDEV_CHANGE: if (!(dev_get_flags(dev) & (IFF_RUNNING | IFF_LOWER_UP))) - nexthop_flush_dev(dev); + nexthop_flush_dev(dev, event); break; case NETDEV_CHANGEMTU: info_ext = ptr;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Maximilian Heyne mheyne@amazon.de
commit bfc2560563586372212b0a8aeca7428975fa91fe upstream.
This is a follow up of commit ea3274695353 ("net: sched: avoid duplicates in qdisc dump") which has fixed the issue only for the qdisc dump.
The duplicate printing also occurs when dumping the classes via tc class show dev eth0
Fixes: 59cc1f61f09c ("net: sched: convert qdisc linked list to hashtable") Signed-off-by: Maximilian Heyne mheyne@amazon.de Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_api.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/net/sched/sch_api.c +++ b/net/sched/sch_api.c @@ -2167,7 +2167,7 @@ static int tc_dump_tclass_qdisc(struct Q
static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb, struct tcmsg *tcm, struct netlink_callback *cb, - int *t_p, int s_t) + int *t_p, int s_t, bool recur) { struct Qdisc *q; int b; @@ -2178,7 +2178,7 @@ static int tc_dump_tclass_root(struct Qd if (tc_dump_tclass_qdisc(root, skb, tcm, cb, t_p, s_t) < 0) return -1;
- if (!qdisc_dev(root)) + if (!qdisc_dev(root) || !recur) return 0;
if (tcm->tcm_parent) { @@ -2213,13 +2213,13 @@ static int tc_dump_tclass(struct sk_buff s_t = cb->args[0]; t = 0;
- if (tc_dump_tclass_root(dev->qdisc, skb, tcm, cb, &t, s_t) < 0) + if (tc_dump_tclass_root(dev->qdisc, skb, tcm, cb, &t, s_t, true) < 0) goto done;
dev_queue = dev_ingress_queue(dev); if (dev_queue && tc_dump_tclass_root(dev_queue->qdisc_sleeping, skb, tcm, cb, - &t, s_t) < 0) + &t, s_t, false) < 0) goto done;
done:
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Vladimir Oltean vladimir.oltean@nxp.com
commit f1becbed411c6fa29d7ce3def3a1dcd4f63f2d74 upstream.
An attempt is made to warn the user about the fact that VCAP IS1 cannot offload keys matching on destination IP (at least given the current half key format), but sadly that warning fails miserably in practice, due to the fact that it operates on an uninitialized "match" variable. We must first decode the keys from the flow rule.
Fixes: 75944fda1dfe ("net: mscc: ocelot: offload ingress skbedit and vlan actions to VCAP IS1") Reported-by: Colin Ian King colin.king@canonical.com Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Reviewed-by: Colin Ian King colin.king@canonical.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mscc/ocelot_flower.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/mscc/ocelot_flower.c +++ b/drivers/net/ethernet/mscc/ocelot_flower.c @@ -540,13 +540,14 @@ ocelot_flower_parse_key(struct ocelot *o return -EOPNOTSUPP; }
+ flow_rule_match_ipv4_addrs(rule, &match); + if (filter->block_id == VCAP_IS1 && *(u32 *)&match.mask->dst) { NL_SET_ERR_MSG_MOD(extack, "Key type S1_NORMAL cannot match on destination IP"); return -EOPNOTSUPP; }
- flow_rule_match_ipv4_addrs(rule, &match); tmp = &filter->key.ipv4.sip.value.addr[0]; memcpy(tmp, &match.key->src, 4);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Vladimir Oltean vladimir.oltean@nxp.com
commit 053d8ad10d585adf9891fcd049637536e2fe9ea7 upstream.
When using MLO_AN_PHY or MLO_AN_FIXED, the MII_BMCR of the SGMII PCS is read before resetting the switch so it can be reprogrammed afterwards. This works for the speeds of 1Gbps and 100Mbps, but not for 10Mbps, because SPEED_10 is actually 0, so AND-ing anything with 0 is false, therefore that last branch is dead code.
Do what others do (genphy_read_status_fixed, phy_mii_ioctl) and just remove the check for SPEED_10, let it fall into the default case.
Fixes: ffe10e679cec ("net: dsa: sja1105: Add support for the SGMII port") Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/sja1105/sja1105_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/dsa/sja1105/sja1105_main.c +++ b/drivers/net/dsa/sja1105/sja1105_main.c @@ -1834,7 +1834,7 @@ out_unlock_ptp: speed = SPEED_1000; else if (bmcr & BMCR_SPEED100) speed = SPEED_100; - else if (bmcr & BMCR_SPEED10) + else speed = SPEED_10;
sja1105_sgmii_pcs_force_speed(priv, speed);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Daniele Palmas dnlplm@gmail.com
commit 6c59cff38e66584ae3ac6c2f0cbd8d039c710ba7 upstream.
There's no reason for preventing the creation and removal of qmimux network interfaces when the underlying interface is up.
This makes qmi_wwan mux implementation more similar to the rmnet one, simplifying userspace management of the same logical interfaces.
Fixes: c6adf77953bc ("net: usb: qmi_wwan: add qmap mux protocol support") Reported-by: Aleksander Morgado aleksander@aleksander.es Signed-off-by: Daniele Palmas dnlplm@gmail.com Acked-by: Bjørn Mork bjorn@mork.no Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/usb/qmi_wwan.c | 14 -------------- 1 file changed, 14 deletions(-)
--- a/drivers/net/usb/qmi_wwan.c +++ b/drivers/net/usb/qmi_wwan.c @@ -396,13 +396,6 @@ static ssize_t add_mux_store(struct devi goto err; }
- /* we don't want to modify a running netdev */ - if (netif_running(dev->net)) { - netdev_err(dev->net, "Cannot change a running device\n"); - ret = -EBUSY; - goto err; - } - ret = qmimux_register_device(dev->net, mux_id); if (!ret) { info->flags |= QMI_WWAN_FLAG_MUX; @@ -432,13 +425,6 @@ static ssize_t del_mux_store(struct devi if (!rtnl_trylock()) return restart_syscall();
- /* we don't want to modify a running netdev */ - if (netif_running(dev->net)) { - netdev_err(dev->net, "Cannot change a running device\n"); - ret = -EBUSY; - goto err; - } - del_dev = qmimux_find_dev(dev, mux_id); if (!del_dev) { netdev_err(dev->net, "mux_id not present\n");
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Hillf Danton hdanton@sina.com
commit 863a42b289c22df63db62b10fc2c2ffc237e2125 upstream.
Init the u64 stats in order to avoid the lockdep prints on the 32bit hardware like
INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 0 PID: 4695 Comm: syz-executor.0 Not tainted 5.11.0-rc5-syzkaller #0 Hardware name: ARM-Versatile Express Backtrace: [<826fc5b8>] (dump_backtrace) from [<826fc82c>] (show_stack+0x18/0x1c arch/arm/kernel/traps.c:252) [<826fc814>] (show_stack) from [<8270d1f8>] (__dump_stack lib/dump_stack.c:79 [inline]) [<826fc814>] (show_stack) from [<8270d1f8>] (dump_stack+0xa8/0xc8 lib/dump_stack.c:120) [<8270d150>] (dump_stack) from [<802bf9c0>] (assign_lock_key kernel/locking/lockdep.c:935 [inline]) [<8270d150>] (dump_stack) from [<802bf9c0>] (register_lock_class+0xabc/0xb68 kernel/locking/lockdep.c:1247) [<802bef04>] (register_lock_class) from [<802baa2c>] (__lock_acquire+0x84/0x32d4 kernel/locking/lockdep.c:4711) [<802ba9a8>] (__lock_acquire) from [<802be840>] (lock_acquire.part.0+0xf0/0x554 kernel/locking/lockdep.c:5442) [<802be750>] (lock_acquire.part.0) from [<802bed10>] (lock_acquire+0x6c/0x74 kernel/locking/lockdep.c:5415) [<802beca4>] (lock_acquire) from [<81560548>] (seqcount_lockdep_reader_access include/linux/seqlock.h:103 [inline]) [<802beca4>] (lock_acquire) from [<81560548>] (__u64_stats_fetch_begin include/linux/u64_stats_sync.h:164 [inline]) [<802beca4>] (lock_acquire) from [<81560548>] (u64_stats_fetch_begin include/linux/u64_stats_sync.h:175 [inline]) [<802beca4>] (lock_acquire) from [<81560548>] (nsim_get_stats64+0xdc/0xf0 drivers/net/netdevsim/netdev.c:70) [<8156046c>] (nsim_get_stats64) from [<81e2efa0>] (dev_get_stats+0x44/0xd0 net/core/dev.c:10405) [<81e2ef5c>] (dev_get_stats) from [<81e53204>] (rtnl_fill_stats+0x38/0x120 net/core/rtnetlink.c:1211) [<81e531cc>] (rtnl_fill_stats) from [<81e59d58>] (rtnl_fill_ifinfo+0x6d4/0x148c net/core/rtnetlink.c:1783) [<81e59684>] (rtnl_fill_ifinfo) from [<81e5ceb4>] (rtmsg_ifinfo_build_skb+0x9c/0x108 net/core/rtnetlink.c:3798) [<81e5ce18>] (rtmsg_ifinfo_build_skb) from [<81e5d0ac>] (rtmsg_ifinfo_event net/core/rtnetlink.c:3830 [inline]) [<81e5ce18>] (rtmsg_ifinfo_build_skb) from [<81e5d0ac>] (rtmsg_ifinfo_event net/core/rtnetlink.c:3821 [inline]) [<81e5ce18>] (rtmsg_ifinfo_build_skb) from [<81e5d0ac>] (rtmsg_ifinfo+0x44/0x70 net/core/rtnetlink.c:3839) [<81e5d068>] (rtmsg_ifinfo) from [<81e45c2c>] (register_netdevice+0x664/0x68c net/core/dev.c:10103) [<81e455c8>] (register_netdevice) from [<815608bc>] (nsim_create+0xf8/0x124 drivers/net/netdevsim/netdev.c:317) [<815607c4>] (nsim_create) from [<81561184>] (__nsim_dev_port_add+0x108/0x188 drivers/net/netdevsim/dev.c:941) [<8156107c>] (__nsim_dev_port_add) from [<815620d8>] (nsim_dev_port_add_all drivers/net/netdevsim/dev.c:990 [inline]) [<8156107c>] (__nsim_dev_port_add) from [<815620d8>] (nsim_dev_probe+0x5cc/0x750 drivers/net/netdevsim/dev.c:1119) [<81561b0c>] (nsim_dev_probe) from [<815661dc>] (nsim_bus_probe+0x10/0x14 drivers/net/netdevsim/bus.c:287) [<815661cc>] (nsim_bus_probe) from [<811724c0>] (really_probe+0x100/0x50c drivers/base/dd.c:554) [<811723c0>] (really_probe) from [<811729c4>] (driver_probe_device+0xf8/0x1c8 drivers/base/dd.c:740) [<811728cc>] (driver_probe_device) from [<81172fe4>] (__device_attach_driver+0x8c/0xf0 drivers/base/dd.c:846) [<81172f58>] (__device_attach_driver) from [<8116fee0>] (bus_for_each_drv+0x88/0xd8 drivers/base/bus.c:431) [<8116fe58>] (bus_for_each_drv) from [<81172c6c>] (__device_attach+0xdc/0x1d0 drivers/base/dd.c:914) [<81172b90>] (__device_attach) from [<8117305c>] (device_initial_probe+0x14/0x18 drivers/base/dd.c:961) [<81173048>] (device_initial_probe) from [<81171358>] (bus_probe_device+0x90/0x98 drivers/base/bus.c:491) [<811712c8>] (bus_probe_device) from [<8116e77c>] (device_add+0x320/0x824 drivers/base/core.c:3109) [<8116e45c>] (device_add) from [<8116ec9c>] (device_register+0x1c/0x20 drivers/base/core.c:3182) [<8116ec80>] (device_register) from [<81566710>] (nsim_bus_dev_new drivers/net/netdevsim/bus.c:336 [inline]) [<8116ec80>] (device_register) from [<81566710>] (new_device_store+0x178/0x208 drivers/net/netdevsim/bus.c:215) [<81566598>] (new_device_store) from [<8116fcb4>] (bus_attr_store+0x2c/0x38 drivers/base/bus.c:122) [<8116fc88>] (bus_attr_store) from [<805b4b8c>] (sysfs_kf_write+0x48/0x54 fs/sysfs/file.c:139) [<805b4b44>] (sysfs_kf_write) from [<805b3c90>] (kernfs_fop_write_iter+0x128/0x1ec fs/kernfs/file.c:296) [<805b3b68>] (kernfs_fop_write_iter) from [<804d22fc>] (call_write_iter include/linux/fs.h:1901 [inline]) [<805b3b68>] (kernfs_fop_write_iter) from [<804d22fc>] (new_sync_write fs/read_write.c:518 [inline]) [<805b3b68>] (kernfs_fop_write_iter) from [<804d22fc>] (vfs_write+0x3dc/0x57c fs/read_write.c:605) [<804d1f20>] (vfs_write) from [<804d2604>] (ksys_write+0x68/0xec fs/read_write.c:658) [<804d259c>] (ksys_write) from [<804d2698>] (__do_sys_write fs/read_write.c:670 [inline]) [<804d259c>] (ksys_write) from [<804d2698>] (sys_write+0x10/0x14 fs/read_write.c:667) [<804d2688>] (sys_write) from [<80200060>] (ret_fast_syscall+0x0/0x2c arch/arm/mm/proc-v7.S:64)
Fixes: 83c9e13aa39a ("netdevsim: add software driver for testing offloads") Reported-by: syzbot+e74a6857f2d0efe3ad81@syzkaller.appspotmail.com Tested-by: Dmitry Vyukov dvyukov@google.com Signed-off-by: Hillf Danton hdanton@sina.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/netdevsim/netdev.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/netdevsim/netdev.c +++ b/drivers/net/netdevsim/netdev.c @@ -296,6 +296,7 @@ nsim_create(struct nsim_dev *nsim_dev, s dev_net_set(dev, nsim_dev_net(nsim_dev)); ns = netdev_priv(dev); ns->netdev = dev; + u64_stats_init(&ns->syncp); ns->nsim_dev = nsim_dev; ns->nsim_dev_port = nsim_dev_port; ns->nsim_bus_dev = nsim_dev->nsim_bus_dev;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Paul Moore paul@paul-moore.com
commit ad5d07f4a9cd671233ae20983848874731102c08 upstream.
The current CIPSO and CALIPSO refcounting scheme for the DOI definitions is a bit flawed in that we:
1. Don't correctly match gets/puts in netlbl_cipsov4_list(). 2. Decrement the refcount on each attempt to remove the DOI from the DOI list, only removing it from the list once the refcount drops to zero.
This patch fixes these problems by adding the missing "puts" to netlbl_cipsov4_list() and introduces a more conventional, i.e. not-buggy, refcounting mechanism to the DOI definitions. Upon the addition of a DOI to the DOI list, it is initialized with a refcount of one, removing a DOI from the list removes it from the list and drops the refcount by one; "gets" and "puts" behave as expected with respect to refcounts, increasing and decreasing the DOI's refcount by one.
Fixes: b1edeb102397 ("netlabel: Replace protocol/NetLabel linking with refrerence counts") Fixes: d7cce01504a0 ("netlabel: Add support for removing a CALIPSO DOI.") Reported-by: syzbot+9ec037722d2603a9f52e@syzkaller.appspotmail.com Signed-off-by: Paul Moore paul@paul-moore.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/cipso_ipv4.c | 11 +---------- net/ipv6/calipso.c | 14 +++++--------- net/netlabel/netlabel_cipso_v4.c | 3 +++ 3 files changed, 9 insertions(+), 19 deletions(-)
--- a/net/ipv4/cipso_ipv4.c +++ b/net/ipv4/cipso_ipv4.c @@ -519,16 +519,10 @@ int cipso_v4_doi_remove(u32 doi, struct ret_val = -ENOENT; goto doi_remove_return; } - if (!refcount_dec_and_test(&doi_def->refcount)) { - spin_unlock(&cipso_v4_doi_list_lock); - ret_val = -EBUSY; - goto doi_remove_return; - } list_del_rcu(&doi_def->list); spin_unlock(&cipso_v4_doi_list_lock);
- cipso_v4_cache_invalidate(); - call_rcu(&doi_def->rcu, cipso_v4_doi_free_rcu); + cipso_v4_doi_putdef(doi_def); ret_val = 0;
doi_remove_return: @@ -585,9 +579,6 @@ void cipso_v4_doi_putdef(struct cipso_v4
if (!refcount_dec_and_test(&doi_def->refcount)) return; - spin_lock(&cipso_v4_doi_list_lock); - list_del_rcu(&doi_def->list); - spin_unlock(&cipso_v4_doi_list_lock);
cipso_v4_cache_invalidate(); call_rcu(&doi_def->rcu, cipso_v4_doi_free_rcu); --- a/net/ipv6/calipso.c +++ b/net/ipv6/calipso.c @@ -83,6 +83,9 @@ struct calipso_map_cache_entry {
static struct calipso_map_cache_bkt *calipso_cache;
+static void calipso_cache_invalidate(void); +static void calipso_doi_putdef(struct calipso_doi *doi_def); + /* Label Mapping Cache Functions */
@@ -444,15 +447,10 @@ static int calipso_doi_remove(u32 doi, s ret_val = -ENOENT; goto doi_remove_return; } - if (!refcount_dec_and_test(&doi_def->refcount)) { - spin_unlock(&calipso_doi_list_lock); - ret_val = -EBUSY; - goto doi_remove_return; - } list_del_rcu(&doi_def->list); spin_unlock(&calipso_doi_list_lock);
- call_rcu(&doi_def->rcu, calipso_doi_free_rcu); + calipso_doi_putdef(doi_def); ret_val = 0;
doi_remove_return: @@ -508,10 +506,8 @@ static void calipso_doi_putdef(struct ca
if (!refcount_dec_and_test(&doi_def->refcount)) return; - spin_lock(&calipso_doi_list_lock); - list_del_rcu(&doi_def->list); - spin_unlock(&calipso_doi_list_lock);
+ calipso_cache_invalidate(); call_rcu(&doi_def->rcu, calipso_doi_free_rcu); }
--- a/net/netlabel/netlabel_cipso_v4.c +++ b/net/netlabel/netlabel_cipso_v4.c @@ -575,6 +575,7 @@ list_start:
break; } + cipso_v4_doi_putdef(doi_def); rcu_read_unlock();
genlmsg_end(ans_skb, data); @@ -583,12 +584,14 @@ list_start: list_retry: /* XXX - this limit is a guesstimate */ if (nlsze_mult < 4) { + cipso_v4_doi_putdef(doi_def); rcu_read_unlock(); kfree_skb(ans_skb); nlsze_mult *= 2; goto list_start; } list_failure_lock: + cipso_v4_doi_putdef(doi_def); rcu_read_unlock(); list_failure: kfree_skb(ans_skb);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Ong Boon Leong boon.leong.ong@intel.com
commit 9a7b3950c7e15968e23d83be215e95ccc7c92a53 upstream.
For Intel mGbE controller, MAC VLAN filter delete operation will time-out if serdes power-down sequence happened first during driver remove() with below message.
[82294.764958] intel-eth-pci 0000:00:1e.4 eth2: stmmac_dvr_remove: removing driver [82294.778677] intel-eth-pci 0000:00:1e.4 eth2: Timeout accessing MAC_VLAN_Tag_Filter [82294.779997] intel-eth-pci 0000:00:1e.4 eth2: failed to kill vid 0081/0 [82294.947053] intel-eth-pci 0000:00:1d.2 eth1: stmmac_dvr_remove: removing driver [82295.002091] intel-eth-pci 0000:00:1d.1 eth0: stmmac_dvr_remove: removing driver
Therefore, we delay the serdes power-down to be after unregister_netdev() which triggers the VLAN filter delete.
Fixes: b9663b7ca6ff ("net: stmmac: Enable SERDES power up/down sequence") Signed-off-by: Ong Boon Leong boon.leong.ong@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -5144,13 +5144,16 @@ int stmmac_dvr_remove(struct device *dev netdev_info(priv->dev, "%s: removing driver", __func__);
stmmac_stop_all_dma(priv); + stmmac_mac_set(priv, priv->ioaddr, false); + netif_carrier_off(ndev); + unregister_netdev(ndev);
+ /* Serdes power down needs to happen after VLAN filter + * is deleted that is triggered by unregister_netdev(). + */ if (priv->plat->serdes_powerdown) priv->plat->serdes_powerdown(ndev, priv->plat->bsp_priv);
- stmmac_mac_set(priv, priv->ioaddr, false); - netif_carrier_off(ndev); - unregister_netdev(ndev); #ifdef CONFIG_DEBUG_FS stmmac_exit_fs(ndev); #endif
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Wong Vee Khee vee.khee.wong@intel.com
commit 8eb37ab7cc045ec6305a6a1a9c32374695a1a977 upstream.
Issue seen when enumerating multiple Intel mGbE interfaces in EHL.
[ 6.898141] intel-eth-pci 0000:00:1d.2: enabling device (0000 -> 0002) [ 6.900971] intel-eth-pci 0000:00:1d.2: Fail to register stmmac-clk [ 6.906434] intel-eth-pci 0000:00:1d.2: User ID: 0x51, Synopsys ID: 0x52
We fix it by making the clock name to be unique following the format of stmmac-pci_name(pci_dev) so that we can differentiate the clock for these Intel mGbE interfaces in EHL platform as follow:
/sys/kernel/debug/clk/stmmac-0000:00:1d.1 /sys/kernel/debug/clk/stmmac-0000:00:1d.2 /sys/kernel/debug/clk/stmmac-0000:00:1e.4
Fixes: 58da0cfa6cf1 ("net: stmmac: create dwmac-intel.c to contain all Intel platform") Signed-off-by: Wong Vee Khee vee.khee.wong@intel.com Signed-off-by: Voon Weifeng weifeng.voon@intel.com Co-developed-by: Ong Boon Leong boon.leong.ong@intel.com Signed-off-by: Ong Boon Leong boon.leong.ong@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/stmicro/stmmac/dwmac-intel.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-intel.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-intel.c @@ -233,6 +233,7 @@ static void common_default_data(struct p static int intel_mgbe_common_data(struct pci_dev *pdev, struct plat_stmmacenet_data *plat) { + char clk_name[20]; int ret; int i;
@@ -300,8 +301,10 @@ static int intel_mgbe_common_data(struct plat->eee_usecs_rate = plat->clk_ptp_rate;
/* Set system clock */ + sprintf(clk_name, "%s-%s", "stmmac", pci_name(pdev)); + plat->stmmac_clk = clk_register_fixed_rate(&pdev->dev, - "stmmac-clk", NULL, 0, + clk_name, NULL, 0, plat->clk_ptp_rate);
if (IS_ERR(plat->stmmac_clk)) {
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Xie He xie.he.0141@gmail.com
commit f7d9d4854519fdf4d45c70a4d953438cd88e7e58 upstream.
For the devices in this driver, the default qdisc is "noqueue", because their "tx_queue_len" is 0.
In function "__dev_queue_xmit" in "net/core/dev.c", devices with the "noqueue" qdisc are specially handled. Packets are transmitted without being queued after a "dev->flags & IFF_UP" check. However, it's possible that even if this check succeeds, "ops->ndo_stop" may still have already been called. This is because in "__dev_close_many", "ops->ndo_stop" is called before clearing the "IFF_UP" flag.
If we call "netif_stop_queue" in "ops->ndo_stop", then it's possible in "__dev_queue_xmit", it sees the "IFF_UP" flag is present, and then it checks "netif_xmit_stopped" and finds that the queue is already stopped. In this case, it will complain that: "Virtual device ... asks to queue packet!"
To prevent "__dev_queue_xmit" from generating this complaint, we should not call "netif_stop_queue" in "ops->ndo_stop".
We also don't need to call "netif_start_queue" in "ops->ndo_open", because after a netdev is allocated and registered, the "__QUEUE_STATE_DRV_XOFF" flag is initially not set, so there is no need to call "netif_start_queue" to clear it.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Xie He xie.he.0141@gmail.com Acked-by: Martin Schiller ms@dev.tdt.de Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wan/lapbether.c | 3 --- 1 file changed, 3 deletions(-)
--- a/drivers/net/wan/lapbether.c +++ b/drivers/net/wan/lapbether.c @@ -292,7 +292,6 @@ static int lapbeth_open(struct net_devic return -ENODEV; }
- netif_start_queue(dev); return 0; }
@@ -300,8 +299,6 @@ static int lapbeth_close(struct net_devi { int err;
- netif_stop_queue(dev); - if ((err = lapb_unregister(dev)) != LAPB_OK) pr_err("lapb_unregister error: %d\n", err);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Paul Cercueil paul@crapouillou.net
commit ac88c531a5b38877eba2365a3f28f0c8b513dc33 upstream.
When the probe fails or requests to be defered, we must disable the regulator that was previously enabled.
Fixes: 7994fe55a4a2 ("dm9000: Add regulator and reset support to dm9000") Signed-off-by: Paul Cercueil paul@crapouillou.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/davicom/dm9000.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
--- a/drivers/net/ethernet/davicom/dm9000.c +++ b/drivers/net/ethernet/davicom/dm9000.c @@ -1449,7 +1449,7 @@ dm9000_probe(struct platform_device *pde if (ret) { dev_err(dev, "failed to request reset gpio %d: %d\n", reset_gpios, ret); - return -ENODEV; + goto out_regulator_disable; }
/* According to manual PWRST# Low Period Min 1ms */ @@ -1461,8 +1461,10 @@ dm9000_probe(struct platform_device *pde
if (!pdata) { pdata = dm9000_parse_dt(&pdev->dev); - if (IS_ERR(pdata)) - return PTR_ERR(pdata); + if (IS_ERR(pdata)) { + ret = PTR_ERR(pdata); + goto out_regulator_disable; + } }
/* Init network device */ @@ -1703,6 +1705,10 @@ out: dm9000_release_board(pdev, db); free_netdev(ndev);
+out_regulator_disable: + if (!IS_ERR(power)) + regulator_disable(power); + return ret; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Paul Cercueil paul@crapouillou.net
commit cf9e60aa69ae6c40d3e3e4c94dd6c8de31674e9b upstream.
We must disable the regulator that was enabled in the probe function.
Fixes: 7994fe55a4a2 ("dm9000: Add regulator and reset support to dm9000") Signed-off-by: Paul Cercueil paul@crapouillou.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/davicom/dm9000.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/davicom/dm9000.c +++ b/drivers/net/ethernet/davicom/dm9000.c @@ -133,6 +133,8 @@ struct board_info { u32 wake_state;
int ip_summed; + + struct regulator *power_supply; };
/* debug code */ @@ -1481,6 +1483,8 @@ dm9000_probe(struct platform_device *pde
db->dev = &pdev->dev; db->ndev = ndev; + if (!IS_ERR(power)) + db->power_supply = power;
spin_lock_init(&db->lock); mutex_init(&db->addr_lock); @@ -1766,10 +1770,13 @@ static int dm9000_drv_remove(struct platform_device *pdev) { struct net_device *ndev = platform_get_drvdata(pdev); + struct board_info *dm = to_dm9000_board(ndev);
unregister_netdev(ndev); - dm9000_release_board(pdev, netdev_priv(ndev)); + dm9000_release_board(pdev, dm); free_netdev(ndev); /* free device structure */ + if (dm->power_supply) + regulator_disable(dm->power_supply);
dev_dbg(&pdev->dev, "released and freed device\n"); return 0;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Vladimir Oltean vladimir.oltean@nxp.com
commit 29d98f54a4fe1b6a9089bec8715a1b89ff9ad59c upstream.
The txtime is passed to the driver in skb->skb_mstamp_ns, which is actually in a union with skb->tstamp (the place where software timestamps are kept).
Since commit b50a5c70ffa4 ("net: allow simultaneous SW and HW transmit timestamping"), __sock_recv_timestamp has some logic for making sure that the two calls to skb_tstamp_tx:
skb_tx_timestamp(skb) # Software timestamp in the driver -> skb_tstamp_tx(skb, NULL)
and
skb_tstamp_tx(skb, &shhwtstamps) # Hardware timestamp in the driver
will both do the right thing and in a race-free manner, meaning that skb_tx_timestamp will deliver a cmsg with the software timestamp only, and skb_tstamp_tx with a non-NULL hwtstamps argument will deliver a cmsg with the hardware timestamp only.
Why are races even possible? Well, because although the software timestamp skb->tstamp is private per skb, the hardware timestamp skb_hwtstamps(skb) lives in skb_shinfo(skb), an area which is shared between skbs and their clones. And skb_tstamp_tx works by cloning the packets when timestamping them, therefore attempting to perform hardware timestamping on an skb's clone will also change the hardware timestamp of the original skb. And the original skb might have been yet again cloned for software timestamping, at an earlier stage.
So the logic in __sock_recv_timestamp can't be as simple as saying "does this skb have a hardware timestamp? if yes I'll send the hardware timestamp to the socket, otherwise I'll send the software timestamp", precisely because the hardware timestamp is shared. Instead, it's quite the other way around: __sock_recv_timestamp says "does this skb have a software timestamp? if yes, I'll send the software timestamp, otherwise the hardware one". This works because the software timestamp is not shared with clones.
But that means we have a problem when we attempt hardware timestamping with skbs that don't have the skb->tstamp == 0. __sock_recv_timestamp will say "oh, yeah, this must be some sort of odd clone" and will not deliver the hardware timestamp to the socket. And this is exactly what is happening when we have txtime enabled on the socket: as mentioned, that is put in a union with skb->tstamp, so it is quite easy to mistake it.
Do what other drivers do (intel igb/igc) and write zero to skb->tstamp before taking the hardware timestamp. It's of no use to us now (we're already on the TX confirmation path).
Fixes: 0d08c9ec7d6e ("enetc: add support time specific departure base on the qos etf") Cc: Vinicius Costa Gomes vinicius.gomes@intel.com Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Acked-by: Vinicius Costa Gomes vinicius.gomes@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/freescale/enetc/enetc.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/drivers/net/ethernet/freescale/enetc/enetc.c +++ b/drivers/net/ethernet/freescale/enetc/enetc.c @@ -344,6 +344,12 @@ static void enetc_tstamp_tx(struct sk_bu if (skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS) { memset(&shhwtstamps, 0, sizeof(shhwtstamps)); shhwtstamps.hwtstamp = ns_to_ktime(tstamp); + /* Ensure skb_mstamp_ns, which might have been populated with + * the txtime, is not mistaken for a software timestamp, + * because this will prevent the dispatch of our hardware + * timestamp to the socket. + */ + skb->tstamp = ktime_set(0, 0); skb_tstamp_tx(skb, &shhwtstamps); } }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Jia-Ju Bai baijiaju1990@gmail.com
commit 179d0ba0c454057a65929c46af0d6ad986754781 upstream.
When sock_alloc_send_skb() returns NULL to skb, no error return code of qrtr_sendmsg() is assigned. To fix this bug, rc is assigned with -ENOMEM in this case.
Fixes: 194ccc88297a ("net: qrtr: Support decoding incoming v2 packets") Reported-by: TOTE Robot oslab@tsinghua.edu.cn Signed-off-by: Jia-Ju Bai baijiaju1990@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/qrtr/qrtr.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/net/qrtr/qrtr.c +++ b/net/qrtr/qrtr.c @@ -958,8 +958,10 @@ static int qrtr_sendmsg(struct socket *s plen = (len + 3) & ~3; skb = sock_alloc_send_skb(sk, plen + QRTR_HDR_MAX_SIZE, msg->msg_flags & MSG_DONTWAIT, &rc); - if (!skb) + if (!skb) { + rc = -ENOMEM; goto out_node; + }
skb_reserve(skb, QRTR_HDR_MAX_SIZE);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Julian Wiedmann jwi@linux.ibm.com
commit e7a36d27f6b9f389e41d8189a8a08919c6835732 upstream.
When qeth_alloc_qdio_queues() fails to allocate one of the buffers that back an Output Queue, the 'out_freeoutqbufs' path will free all previously allocated buffers for this queue. But it misses to free the half-finished queue struct itself.
Move the buffer allocation into qeth_alloc_output_queue(), and deal with such errors internally.
Fixes: 0da9581ddb0f ("qeth: exploit asynchronous delivery of storage blocks") Signed-off-by: Julian Wiedmann jwi@linux.ibm.com Reviewed-by: Alexandra Winter wintera@linux.ibm.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/s390/net/qeth_core_main.c | 35 +++++++++++++++++------------------ 1 file changed, 17 insertions(+), 18 deletions(-)
--- a/drivers/s390/net/qeth_core_main.c +++ b/drivers/s390/net/qeth_core_main.c @@ -2630,15 +2630,28 @@ static void qeth_free_output_queue(struc static struct qeth_qdio_out_q *qeth_alloc_output_queue(void) { struct qeth_qdio_out_q *q = kzalloc(sizeof(*q), GFP_KERNEL); + unsigned int i;
if (!q) return NULL;
- if (qdio_alloc_buffers(q->qdio_bufs, QDIO_MAX_BUFFERS_PER_Q)) { - kfree(q); - return NULL; + if (qdio_alloc_buffers(q->qdio_bufs, QDIO_MAX_BUFFERS_PER_Q)) + goto err_qdio_bufs; + + for (i = 0; i < QDIO_MAX_BUFFERS_PER_Q; i++) { + if (qeth_init_qdio_out_buf(q, i)) + goto err_out_bufs; } + return q; + +err_out_bufs: + while (i > 0) + kmem_cache_free(qeth_qdio_outbuf_cache, q->bufs[--i]); + qdio_free_buffers(q->qdio_bufs, QDIO_MAX_BUFFERS_PER_Q); +err_qdio_bufs: + kfree(q); + return NULL; }
static void qeth_tx_completion_timer(struct timer_list *timer) @@ -2651,7 +2664,7 @@ static void qeth_tx_completion_timer(str
static int qeth_alloc_qdio_queues(struct qeth_card *card) { - int i, j; + unsigned int i;
QETH_CARD_TEXT(card, 2, "allcqdbf");
@@ -2685,13 +2698,6 @@ static int qeth_alloc_qdio_queues(struct queue->coalesce_usecs = QETH_TX_COALESCE_USECS; queue->max_coalesced_frames = QETH_TX_MAX_COALESCED_FRAMES; queue->priority = QETH_QIB_PQUE_PRIO_DEFAULT; - - /* give outbound qeth_qdio_buffers their qdio_buffers */ - for (j = 0; j < QDIO_MAX_BUFFERS_PER_Q; ++j) { - WARN_ON(queue->bufs[j]); - if (qeth_init_qdio_out_buf(queue, j)) - goto out_freeoutqbufs; - } }
/* completion */ @@ -2700,13 +2706,6 @@ static int qeth_alloc_qdio_queues(struct
return 0;
-out_freeoutqbufs: - while (j > 0) { - --j; - kmem_cache_free(qeth_qdio_outbuf_cache, - card->qdio.out_qs[i]->bufs[j]); - card->qdio.out_qs[i]->bufs[j] = NULL; - } out_freeoutq: while (i > 0) { qeth_free_output_queue(card->qdio.out_qs[--i]);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Julian Wiedmann jwi@linux.ibm.com
commit c20383ad1656b0f6354dd50e4acd894f9d94090d upstream.
The current design attaches a pending TX buffer to a custom single-linked list, which is anchored at the buffer's slot on the TX ring. The buffer is then checked for final completion whenever this slot is processed during a subsequent TX NAPI poll cycle.
But if there's insufficient traffic on the ring, we might never make enough progress to get back to this ring slot and discover the pending buffer's final TX completion. In particular if this missing TX completion blocks the application from sending further traffic.
So convert the custom single-linked list code to a per-queue list_head, and scan this list on every TX NAPI cycle.
Fixes: 0da9581ddb0f ("qeth: exploit asynchronous delivery of storage blocks") Signed-off-by: Julian Wiedmann jwi@linux.ibm.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/s390/net/qeth_core.h | 3 + drivers/s390/net/qeth_core_main.c | 69 +++++++++++++++----------------------- 2 files changed, 30 insertions(+), 42 deletions(-)
--- a/drivers/s390/net/qeth_core.h +++ b/drivers/s390/net/qeth_core.h @@ -436,7 +436,7 @@ struct qeth_qdio_out_buffer { int is_header[QDIO_MAX_ELEMENTS_PER_BUFFER];
struct qeth_qdio_out_q *q; - struct qeth_qdio_out_buffer *next_pending; + struct list_head list_entry; };
struct qeth_card; @@ -500,6 +500,7 @@ struct qeth_qdio_out_q { struct qdio_buffer *qdio_bufs[QDIO_MAX_BUFFERS_PER_Q]; struct qeth_qdio_out_buffer *bufs[QDIO_MAX_BUFFERS_PER_Q]; struct qdio_outbuf_state *bufstates; /* convenience pointer */ + struct list_head pending_bufs; struct qeth_out_q_stats stats; spinlock_t lock; unsigned int priority; --- a/drivers/s390/net/qeth_core_main.c +++ b/drivers/s390/net/qeth_core_main.c @@ -73,8 +73,6 @@ static void qeth_free_qdio_queues(struct static void qeth_notify_skbs(struct qeth_qdio_out_q *queue, struct qeth_qdio_out_buffer *buf, enum iucv_tx_notify notification); -static void qeth_tx_complete_buf(struct qeth_qdio_out_buffer *buf, bool error, - int budget);
static void qeth_close_dev_handler(struct work_struct *work) { @@ -465,41 +463,6 @@ static enum iucv_tx_notify qeth_compute_ return n; }
-static void qeth_cleanup_handled_pending(struct qeth_qdio_out_q *q, int bidx, - int forced_cleanup) -{ - if (q->card->options.cq != QETH_CQ_ENABLED) - return; - - if (q->bufs[bidx]->next_pending != NULL) { - struct qeth_qdio_out_buffer *head = q->bufs[bidx]; - struct qeth_qdio_out_buffer *c = q->bufs[bidx]->next_pending; - - while (c) { - if (forced_cleanup || - atomic_read(&c->state) == QETH_QDIO_BUF_EMPTY) { - struct qeth_qdio_out_buffer *f = c; - - QETH_CARD_TEXT(f->q->card, 5, "fp"); - QETH_CARD_TEXT_(f->q->card, 5, "%lx", (long) f); - /* release here to avoid interleaving between - outbound tasklet and inbound tasklet - regarding notifications and lifecycle */ - qeth_tx_complete_buf(c, forced_cleanup, 0); - - c = f->next_pending; - WARN_ON_ONCE(head->next_pending != f); - head->next_pending = c; - kmem_cache_free(qeth_qdio_outbuf_cache, f); - } else { - head = c; - c = c->next_pending; - } - - } - } -} - static void qeth_qdio_handle_aob(struct qeth_card *card, unsigned long phys_aob_addr) { @@ -537,7 +500,7 @@ static void qeth_qdio_handle_aob(struct qeth_notify_skbs(buffer->q, buffer, notification);
/* Free dangling allocations. The attached skbs are handled by - * qeth_cleanup_handled_pending(). + * qeth_tx_complete_pending_bufs(). */ for (i = 0; i < aob->sb_count && i < QETH_MAX_BUFFER_ELEMENTS(card); @@ -1484,14 +1447,35 @@ static void qeth_clear_output_buffer(str atomic_set(&buf->state, QETH_QDIO_BUF_EMPTY); }
+static void qeth_tx_complete_pending_bufs(struct qeth_card *card, + struct qeth_qdio_out_q *queue, + bool drain) +{ + struct qeth_qdio_out_buffer *buf, *tmp; + + list_for_each_entry_safe(buf, tmp, &queue->pending_bufs, list_entry) { + if (drain || atomic_read(&buf->state) == QETH_QDIO_BUF_EMPTY) { + QETH_CARD_TEXT(card, 5, "fp"); + QETH_CARD_TEXT_(card, 5, "%lx", (long) buf); + + qeth_tx_complete_buf(buf, drain, 0); + + list_del(&buf->list_entry); + kmem_cache_free(qeth_qdio_outbuf_cache, buf); + } + } +} + static void qeth_drain_output_queue(struct qeth_qdio_out_q *q, bool free) { int j;
+ qeth_tx_complete_pending_bufs(q->card, q, true); + for (j = 0; j < QDIO_MAX_BUFFERS_PER_Q; ++j) { if (!q->bufs[j]) continue; - qeth_cleanup_handled_pending(q, j, 1); + qeth_clear_output_buffer(q, q->bufs[j], true, 0); if (free) { kmem_cache_free(qeth_qdio_outbuf_cache, q->bufs[j]); @@ -2611,7 +2595,6 @@ static int qeth_init_qdio_out_buf(struct skb_queue_head_init(&newbuf->skb_list); lockdep_set_class(&newbuf->skb_list.lock, &qdio_out_skb_queue_key); newbuf->q = q; - newbuf->next_pending = q->bufs[bidx]; atomic_set(&newbuf->state, QETH_QDIO_BUF_EMPTY); q->bufs[bidx] = newbuf; return 0; @@ -2693,6 +2676,7 @@ static int qeth_alloc_qdio_queues(struct card->qdio.out_qs[i] = queue; queue->card = card; queue->queue_no = i; + INIT_LIST_HEAD(&queue->pending_bufs); spin_lock_init(&queue->lock); timer_setup(&queue->timer, qeth_tx_completion_timer, 0); queue->coalesce_usecs = QETH_TX_COALESCE_USECS; @@ -6099,6 +6083,8 @@ static void qeth_iqd_tx_complete(struct qeth_schedule_recovery(card); }
+ list_add(&buffer->list_entry, + &queue->pending_bufs); /* Skip clearing the buffer: */ return; case QETH_QDIO_BUF_QAOB_OK: @@ -6154,6 +6140,8 @@ static int qeth_tx_poll(struct napi_stru unsigned int bytes = 0; int completed;
+ qeth_tx_complete_pending_bufs(card, queue, false); + if (qeth_out_queue_is_empty(queue)) { napi_complete(napi); return 0; @@ -6186,7 +6174,6 @@ static int qeth_tx_poll(struct napi_stru
qeth_handle_send_error(card, buffer, error); qeth_iqd_tx_complete(queue, bidx, error, budget); - qeth_cleanup_handled_pending(queue, bidx, false); }
netdev_tx_completed_queue(txq, packets, bytes);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Julian Wiedmann jwi@linux.ibm.com
commit 3e83d467a08e25b27c44c885f511624a71c84f7c upstream.
When a QAOB notifies us that a pending TX buffer has been delivered, the actual TX completion processing by qeth_tx_complete_pending_bufs() is done within the context of a TX NAPI instance. We shouldn't rely on this instance being scheduled by some other TX event, but just do it ourselves.
qeth_qdio_handle_aob() is called from qeth_poll(), ie. our main NAPI instance. To avoid touching the TX queue's NAPI instance before/after it is (un-)registered, reorder the code in qeth_open() and qeth_stop() accordingly.
Fixes: 0da9581ddb0f ("qeth: exploit asynchronous delivery of storage blocks") Signed-off-by: Julian Wiedmann jwi@linux.ibm.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/s390/net/qeth_core_main.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-)
--- a/drivers/s390/net/qeth_core_main.c +++ b/drivers/s390/net/qeth_core_main.c @@ -470,6 +470,7 @@ static void qeth_qdio_handle_aob(struct struct qaob *aob; struct qeth_qdio_out_buffer *buffer; enum iucv_tx_notify notification; + struct qeth_qdio_out_q *queue; unsigned int i;
aob = (struct qaob *) phys_to_virt(phys_aob_addr); @@ -512,7 +513,9 @@ static void qeth_qdio_handle_aob(struct buffer->is_header[i] = 0; }
+ queue = buffer->q; atomic_set(&buffer->state, QETH_QDIO_BUF_EMPTY); + napi_schedule(&queue->napi); break; default: WARN_ON_ONCE(1); @@ -7225,9 +7228,7 @@ int qeth_open(struct net_device *dev) card->data.state = CH_STATE_UP; netif_tx_start_all_queues(dev);
- napi_enable(&card->napi); local_bh_disable(); - napi_schedule(&card->napi); if (IS_IQD(card)) { struct qeth_qdio_out_q *queue; unsigned int i; @@ -7239,8 +7240,12 @@ int qeth_open(struct net_device *dev) napi_schedule(&queue->napi); } } + + napi_enable(&card->napi); + napi_schedule(&card->napi); /* kick-start the NAPI softirq: */ local_bh_enable(); + return 0; } EXPORT_SYMBOL_GPL(qeth_open); @@ -7250,6 +7255,11 @@ int qeth_stop(struct net_device *dev) struct qeth_card *card = dev->ml_priv;
QETH_CARD_TEXT(card, 4, "qethstop"); + + napi_disable(&card->napi); + cancel_delayed_work_sync(&card->buffer_reclaim_work); + qdio_stop_irq(CARD_DDEV(card)); + if (IS_IQD(card)) { struct qeth_qdio_out_q *queue; unsigned int i; @@ -7270,10 +7280,6 @@ int qeth_stop(struct net_device *dev) netif_tx_disable(dev); }
- napi_disable(&card->napi); - cancel_delayed_work_sync(&card->buffer_reclaim_work); - qdio_stop_irq(CARD_DDEV(card)); - return 0; } EXPORT_SYMBOL_GPL(qeth_stop);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Julian Wiedmann jwi@linux.ibm.com
commit 7eefda7f353ef86ad82a2dc8329e8a3538c08ab6 upstream.
The cited commit reworked the state machine for pending TX buffers. In qeth_iqd_tx_complete() it turned PENDING into a transient state, and uses NEED_QAOB for buffers that get parked while waiting for their QAOB completion.
But it missed to adjust the check in qeth_tx_complete_buf(). So if qeth_tx_complete_pending_bufs() is called during teardown to drain the parked TX buffers, we no longer raise a notification for af_iucv.
Instead of updating the checked state, just move this code into qeth_tx_complete_pending_bufs() itself. This also gets rid of the special-case in the common TX completion path.
Fixes: 8908f36d20d8 ("s390/qeth: fix af_iucv notification race") Signed-off-by: Julian Wiedmann jwi@linux.ibm.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/s390/net/qeth_core_main.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/s390/net/qeth_core_main.c +++ b/drivers/s390/net/qeth_core_main.c @@ -1386,9 +1386,6 @@ static void qeth_tx_complete_buf(struct struct qeth_qdio_out_q *queue = buf->q; struct sk_buff *skb;
- if (atomic_read(&buf->state) == QETH_QDIO_BUF_PENDING) - qeth_notify_skbs(queue, buf, TX_NOTIFY_GENERALERROR); - /* Empty buffer? */ if (buf->next_element_to_fill == 0) return; @@ -1461,6 +1458,9 @@ static void qeth_tx_complete_pending_buf QETH_CARD_TEXT(card, 5, "fp"); QETH_CARD_TEXT_(card, 5, "%lx", (long) buf);
+ if (drain) + qeth_notify_skbs(queue, buf, + TX_NOTIFY_GENERALERROR); qeth_tx_complete_buf(buf, drain, 0);
list_del(&buf->list_entry);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Hayes Wang hayeswang@realtek.com
commit abbf9a0ef8848dca58c5b97750c1c59bbee45637 upstream.
The (0xBAF70000 & 0x00FFF000) << 6 should be (0xf70 << 18).
Fixes: 561535b0f239 ("r8169: fix OCP access on RTL8117") Signed-off-by: Hayes Wang hayeswang@realtek.com Acked-by: Heiner Kallweit hkallweit1@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/realtek/r8169_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -1013,7 +1013,7 @@ static void r8168fp_adjust_ocp_cmd(struc { /* based on RTL8168FP_OOBMAC_BASE in vendor driver */ if (tp->mac_version == RTL_GIGA_MAC_VER_52 && type == ERIAR_OOB) - *cmd |= 0x7f0 << 18; + *cmd |= 0xf70 << 18; }
DECLARE_RTL_COND(rtl_eriar_cond)
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Antony Antony antony@phenome.org
commit d785e1fec60179f534fbe8d006c890e5ad186e51 upstream.
Based on talks and indirect references ixgbe IPsec offlod do not support IPsec tunnel mode offload. It can only support IPsec transport mode offload. Now explicitly fail when creating non transport mode SA with offload to avoid false performance expectations.
Fixes: 63a67fe229ea ("ixgbe: add ipsec offload add and remove SA") Signed-off-by: Antony Antony antony@phenome.org Acked-by: Shannon Nelson snelson@pensando.io Tested-by: Tony Brelinski tonyx.brelinski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 5 +++++ drivers/net/ethernet/intel/ixgbevf/ipsec.c | 5 +++++ 2 files changed, 10 insertions(+)
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c @@ -575,6 +575,11 @@ static int ixgbe_ipsec_add_sa(struct xfr return -EINVAL; }
+ if (xs->props.mode != XFRM_MODE_TRANSPORT) { + netdev_err(dev, "Unsupported mode for ipsec offload\n"); + return -EINVAL; + } + if (ixgbe_ipsec_check_mgmt_ip(xs)) { netdev_err(dev, "IPsec IP addr clash with mgmt filters\n"); return -EINVAL; --- a/drivers/net/ethernet/intel/ixgbevf/ipsec.c +++ b/drivers/net/ethernet/intel/ixgbevf/ipsec.c @@ -272,6 +272,11 @@ static int ixgbevf_ipsec_add_sa(struct x return -EINVAL; }
+ if (xs->props.mode != XFRM_MODE_TRANSPORT) { + netdev_err(dev, "Unsupported mode for ipsec offload\n"); + return -EINVAL; + } + if (xs->xso.flags & XFRM_OFFLOAD_INBOUND) { struct rx_sa rsa;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Kun-Chuan Hsieh jetswayss@gmail.com
commit 41462c6e730ca0e63f5fed5a517052385d980c54 upstream.
Older libelf.h and glibc elf.h might not yet define the ELF compression types.
Checking and defining SHF_COMPRESSED fix the build error when compiling with older toolchains. Also, the tool resolve_btfids is compiled with host toolchain. The host toolchain is more likely to be older than the cross compile toolchain.
Fixes: 51f6463aacfb ("tools/resolve_btfids: Fix sections with wrong alignment") Signed-off-by: Kun-Chuan Hsieh jetswayss@gmail.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Jiri Olsa jolsa@redhat.com Link: https://lore.kernel.org/bpf/20210224052752.5284-1-jetswayss@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/bpf/resolve_btfids/main.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/tools/bpf/resolve_btfids/main.c +++ b/tools/bpf/resolve_btfids/main.c @@ -260,6 +260,11 @@ static struct btf_id *add_symbol(struct return btf_id__add(root, id, false); }
+/* Older libelf.h and glibc elf.h might not yet define the ELF compression types. */ +#ifndef SHF_COMPRESSED +#define SHF_COMPRESSED (1 << 11) /* Section with compressed data. */ +#endif + /* * The data of compressed section should be aligned to 4 * (for 32bit) or 8 (for 64 bit) bytes. The binutils ld
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Antonio Terceiro antonio.terceiro@linaro.org
commit dacfc08dcafa7d443ab339592999e37bbb8a3ef0 upstream.
This was introduced by commit e4ffd066ff440a57 ("perf: Normalize gcc parameter when generating arch errno table").
Assuming the first word of $(CC) is the actual compiler breaks usage like CC="ccache gcc": the script ends up calling ccache directly with gcc arguments, what fails. Instead of getting the first word, just remove from $(CC) any word that starts with a "-". This maintains the spirit of the original patch, while not breaking ccache users.
Fixes: e4ffd066ff440a57 ("perf: Normalize gcc parameter when generating arch errno table") Signed-off-by: Antonio Terceiro antonio.terceiro@linaro.org Tested-by: Arnaldo Carvalho de Melo acme@redhat.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: He Zhe zhe.he@windriver.com Cc: Jiri Olsa jolsa@redhat.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: stable@vger.kernel.org Link: http://lore.kernel.org/lkml/20210224130046.346977-1-antonio.terceiro@linaro.... Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/perf/Makefile.perf | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/perf/Makefile.perf +++ b/tools/perf/Makefile.perf @@ -600,7 +600,7 @@ arch_errno_hdr_dir := $(srctree)/tools arch_errno_tbl := $(srctree)/tools/perf/trace/beauty/arch_errno_names.sh
$(arch_errno_name_array): $(arch_errno_tbl) - $(Q)$(SHELL) '$(arch_errno_tbl)' $(firstword $(CC)) $(arch_errno_hdr_dir) > $@ + $(Q)$(SHELL) '$(arch_errno_tbl)' '$(patsubst -%,,$(CC))' $(arch_errno_hdr_dir) > $@
sync_file_range_arrays := $(beauty_outdir)/sync_file_range_arrays.c sync_file_range_tbls := $(srctree)/tools/perf/trace/beauty/sync_file_range.sh
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Joakim Zhang qiangqing.zhang@nxp.com
commit a3e860a83397bf761ec1128a3f0ba186445992c6 upstream.
If clear GMAC_CONFIG_TE bit, it would stop all tx channels, but users may only want to stop specific tx channel.
Fixes: 48863ce5940f ("stmmac: add DMA support for GMAC 4.xx") Signed-off-by: Joakim Zhang qiangqing.zhang@nxp.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c | 4 ---- 1 file changed, 4 deletions(-)
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c @@ -53,10 +53,6 @@ void dwmac4_dma_stop_tx(void __iomem *io
value &= ~DMA_CONTROL_ST; writel(value, ioaddr + DMA_CHAN_TX_CONTROL(chan)); - - value = readl(ioaddr + GMAC_CONFIG); - value &= ~GMAC_CONFIG_TE; - writel(value, ioaddr + GMAC_CONFIG); }
void dwmac4_dma_start_rx(void __iomem *ioaddr, u32 chan)
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Joakim Zhang qiangqing.zhang@nxp.com
commit c511819d138de38e1637eedb645c207e09680d0f upstream.
stmmac_xmit() call stmmac_tx_timer_arm() at the end to modify tx timer to do the transmission cleanup work. Imagine such a situation, stmmac enters suspend immediately after tx timer modified, it's expire callback stmmac_tx_clean() would not be invoked. This could affect BQL, since netdev_tx_sent_queue() has been called, but netdev_tx_completed_queue() have not been involved, as a result, dql_avail(&dev_queue->dql) finally always return a negative value.
__dev_queue_xmit->__dev_xmit_skb->qdisc_run->__qdisc_run->qdisc_restart->dequeue_skb: if ((q->flags & TCQ_F_ONETXQUEUE) && netif_xmit_frozen_or_stopped(txq)) // __QUEUE_STATE_STACK_XOFF is set
Net core will stop transmitting any more. Finillay, net watchdong would timeout. To fix this issue, we should call netdev_tx_reset_queue() in stmmac_resume().
Fixes: 54139cf3bb33 ("net: stmmac: adding multiple buffers for rx") Signed-off-by: Joakim Zhang qiangqing.zhang@nxp.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -5260,6 +5260,8 @@ static void stmmac_reset_queues_param(st tx_q->cur_tx = 0; tx_q->dirty_tx = 0; tx_q->mss = 0; + + netdev_tx_reset_queue(netdev_get_tx_queue(priv->dev, queue)); } }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Joakim Zhang qiangqing.zhang@nxp.com
commit 396e13e11577b614db77db0bbb6fca935b94eb1b upstream.
In current driver, buffer2 available only when hardware supports split header. Wrongly set buffer2 valid in stmmac_rx_refill when refill buffer address. You can see that desc3 is 0x81000000 after initialization, but turn out to be 0x83000000 after refill.
Fixes: 67afd6d1cfdf ("net: stmmac: Add Split Header support and enable it in XGMAC cores") Signed-off-by: Joakim Zhang qiangqing.zhang@nxp.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c | 9 +++++++-- drivers/net/ethernet/stmicro/stmmac/dwxgmac2_descs.c | 2 +- drivers/net/ethernet/stmicro/stmmac/hwif.h | 2 +- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 8 ++++++-- 4 files changed, 15 insertions(+), 6 deletions(-)
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c @@ -499,10 +499,15 @@ static void dwmac4_get_rx_header_len(str *len = le32_to_cpu(p->des2) & RDES2_HL; }
-static void dwmac4_set_sec_addr(struct dma_desc *p, dma_addr_t addr) +static void dwmac4_set_sec_addr(struct dma_desc *p, dma_addr_t addr, bool buf2_valid) { p->des2 = cpu_to_le32(lower_32_bits(addr)); - p->des3 = cpu_to_le32(upper_32_bits(addr) | RDES3_BUFFER2_VALID_ADDR); + p->des3 = cpu_to_le32(upper_32_bits(addr)); + + if (buf2_valid) + p->des3 |= cpu_to_le32(RDES3_BUFFER2_VALID_ADDR); + else + p->des3 &= cpu_to_le32(~RDES3_BUFFER2_VALID_ADDR); }
static void dwmac4_set_tbs(struct dma_edesc *p, u32 sec, u32 nsec) --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_descs.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_descs.c @@ -292,7 +292,7 @@ static void dwxgmac2_get_rx_header_len(s *len = le32_to_cpu(p->des2) & XGMAC_RDES2_HL; }
-static void dwxgmac2_set_sec_addr(struct dma_desc *p, dma_addr_t addr) +static void dwxgmac2_set_sec_addr(struct dma_desc *p, dma_addr_t addr, bool is_valid) { p->des2 = cpu_to_le32(lower_32_bits(addr)); p->des3 = cpu_to_le32(upper_32_bits(addr)); --- a/drivers/net/ethernet/stmicro/stmmac/hwif.h +++ b/drivers/net/ethernet/stmicro/stmmac/hwif.h @@ -91,7 +91,7 @@ struct stmmac_desc_ops { int (*get_rx_hash)(struct dma_desc *p, u32 *hash, enum pkt_hash_types *type); void (*get_rx_header_len)(struct dma_desc *p, unsigned int *len); - void (*set_sec_addr)(struct dma_desc *p, dma_addr_t addr); + void (*set_sec_addr)(struct dma_desc *p, dma_addr_t addr, bool buf2_valid); void (*set_sarc)(struct dma_desc *p, u32 sarc_type); void (*set_vlan_tag)(struct dma_desc *p, u16 tag, u16 inner_tag, u32 inner_type); --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -1303,9 +1303,10 @@ static int stmmac_init_rx_buffers(struct return -ENOMEM;
buf->sec_addr = page_pool_get_dma_addr(buf->sec_page); - stmmac_set_desc_sec_addr(priv, p, buf->sec_addr); + stmmac_set_desc_sec_addr(priv, p, buf->sec_addr, true); } else { buf->sec_page = NULL; + stmmac_set_desc_sec_addr(priv, p, buf->sec_addr, false); }
buf->addr = page_pool_get_dma_addr(buf->page); @@ -3648,7 +3649,10 @@ static inline void stmmac_rx_refill(stru DMA_FROM_DEVICE);
stmmac_set_desc_addr(priv, p, buf->addr); - stmmac_set_desc_sec_addr(priv, p, buf->sec_addr); + if (priv->sph) + stmmac_set_desc_sec_addr(priv, p, buf->sec_addr, true); + else + stmmac_set_desc_sec_addr(priv, p, buf->sec_addr, false); stmmac_refill_desc3(priv, rx_q, p);
rx_q->rx_count_frames++;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Yinjun Zhang yinjun.zhang@corigine.com
commit a4fc088ad4ff4a99d01978aa41065132b574b4b2 upstream.
The command "ethtool -L <intf> combined 0" may clean the RX/TX channel count and skip the error path, since the attrs tb[ETHTOOL_A_CHANNELS_RX_COUNT] and tb[ETHTOOL_A_CHANNELS_TX_COUNT] are NULL in this case when recent ethtool is used.
Tested using ethtool v5.10.
Fixes: 7be92514b99c ("ethtool: check if there is at least one channel for TX/RX in the core") Signed-off-by: Yinjun Zhang yinjun.zhang@corigine.com Signed-off-by: Simon Horman simon.horman@netronome.com Signed-off-by: Louis Peens louis.peens@netronome.com Link: https://lore.kernel.org/r/20210225125102.23989-1-simon.horman@netronome.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ethtool/channels.c | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-)
--- a/net/ethtool/channels.c +++ b/net/ethtool/channels.c @@ -116,10 +116,9 @@ int ethnl_set_channels(struct sk_buff *s struct ethtool_channels channels = {}; struct ethnl_req_info req_info = {}; struct nlattr **tb = info->attrs; - const struct nlattr *err_attr; + u32 err_attr, max_rx_in_use = 0; const struct ethtool_ops *ops; struct net_device *dev; - u32 max_rx_in_use = 0; int ret;
ret = ethnl_parse_header_dev_get(&req_info, @@ -157,34 +156,35 @@ int ethnl_set_channels(struct sk_buff *s
/* ensure new channel counts are within limits */ if (channels.rx_count > channels.max_rx) - err_attr = tb[ETHTOOL_A_CHANNELS_RX_COUNT]; + err_attr = ETHTOOL_A_CHANNELS_RX_COUNT; else if (channels.tx_count > channels.max_tx) - err_attr = tb[ETHTOOL_A_CHANNELS_TX_COUNT]; + err_attr = ETHTOOL_A_CHANNELS_TX_COUNT; else if (channels.other_count > channels.max_other) - err_attr = tb[ETHTOOL_A_CHANNELS_OTHER_COUNT]; + err_attr = ETHTOOL_A_CHANNELS_OTHER_COUNT; else if (channels.combined_count > channels.max_combined) - err_attr = tb[ETHTOOL_A_CHANNELS_COMBINED_COUNT]; + err_attr = ETHTOOL_A_CHANNELS_COMBINED_COUNT; else - err_attr = NULL; + err_attr = 0; if (err_attr) { ret = -EINVAL; - NL_SET_ERR_MSG_ATTR(info->extack, err_attr, + NL_SET_ERR_MSG_ATTR(info->extack, tb[err_attr], "requested channel count exceeds maximum"); goto out_ops; }
/* ensure there is at least one RX and one TX channel */ if (!channels.combined_count && !channels.rx_count) - err_attr = tb[ETHTOOL_A_CHANNELS_RX_COUNT]; + err_attr = ETHTOOL_A_CHANNELS_RX_COUNT; else if (!channels.combined_count && !channels.tx_count) - err_attr = tb[ETHTOOL_A_CHANNELS_TX_COUNT]; + err_attr = ETHTOOL_A_CHANNELS_TX_COUNT; else - err_attr = NULL; + err_attr = 0; if (err_attr) { if (mod_combined) - err_attr = tb[ETHTOOL_A_CHANNELS_COMBINED_COUNT]; + err_attr = ETHTOOL_A_CHANNELS_COMBINED_COUNT; ret = -EINVAL; - NL_SET_ERR_MSG_ATTR(info->extack, err_attr, "requested channel counts would result in no RX or TX channel being configured"); + NL_SET_ERR_MSG_ATTR(info->extack, tb[err_attr], + "requested channel counts would result in no RX or TX channel being configured"); goto out_ops; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Arnd Bergmann arnd@arndb.de
commit 7f654157f0aefba04cd7f6297351c87b76b47b89 upstream.
When CONFIG_PM_SLEEP is disabled, the compiler warns about unused functions:
drivers/net/phy/phy_device.c:273:12: error: unused function 'mdio_bus_phy_suspend' [-Werror,-Wunused-function] static int mdio_bus_phy_suspend(struct device *dev) drivers/net/phy/phy_device.c:293:12: error: unused function 'mdio_bus_phy_resume' [-Werror,-Wunused-function] static int mdio_bus_phy_resume(struct device *dev)
The logic is intentional, so just mark these two as __maybe_unused and remove the incorrect #ifdef.
Fixes: 4c0d2e96ba05 ("net: phy: consider that suspend2ram may cut off PHY power") Signed-off-by: Arnd Bergmann arnd@arndb.de Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://lore.kernel.org/r/20210225145748.404410-1-arnd@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/phy/phy_device.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
--- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -230,7 +230,6 @@ static struct phy_driver genphy_driver; static LIST_HEAD(phy_fixup_list); static DEFINE_MUTEX(phy_fixup_lock);
-#ifdef CONFIG_PM static bool mdio_bus_phy_may_suspend(struct phy_device *phydev) { struct device_driver *drv = phydev->mdio.dev.driver; @@ -270,7 +269,7 @@ out: return !phydev->suspended; }
-static int mdio_bus_phy_suspend(struct device *dev) +static __maybe_unused int mdio_bus_phy_suspend(struct device *dev) { struct phy_device *phydev = to_phy_device(dev);
@@ -290,7 +289,7 @@ static int mdio_bus_phy_suspend(struct d return phy_suspend(phydev); }
-static int mdio_bus_phy_resume(struct device *dev) +static __maybe_unused int mdio_bus_phy_resume(struct device *dev) { struct phy_device *phydev = to_phy_device(dev); int ret; @@ -316,7 +315,6 @@ no_resume:
static SIMPLE_DEV_PM_OPS(mdio_bus_phy_pm_ops, mdio_bus_phy_suspend, mdio_bus_phy_resume); -#endif /* CONFIG_PM */
/** * phy_register_fixup - creates a new phy_fixup and adds it to the list
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Danielle Ratson danieller@nvidia.com
commit edcbf5137f093b5502f5f6b97cce3cbadbde27aa upstream.
When mirroring to a gretap in hardware the device expects to be programmed with the egress port and all the encapsulating headers. This requires the driver to resolve the path the packet will take in the software data path and program the device accordingly.
If the path cannot be resolved (in this case because of an unresolved neighbor), then mirror installation fails until the path is resolved. This results in a race that causes the test to sometimes fail.
Fix this by setting the neighbor's state to permanent, so that it is always valid.
Fixes: b5b029399fa6d ("selftests: forwarding: mirror_gre_bridge_1d_vlan: Add STP test") Signed-off-by: Danielle Ratson danieller@nvidia.com Reviewed-by: Petr Machata petrm@nvidia.com Signed-off-by: Ido Schimmel idosch@nvidia.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/net/forwarding/mirror_gre_bridge_1d_vlan.sh | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/tools/testing/selftests/net/forwarding/mirror_gre_bridge_1d_vlan.sh +++ b/tools/testing/selftests/net/forwarding/mirror_gre_bridge_1d_vlan.sh @@ -86,11 +86,20 @@ test_ip6gretap()
test_gretap_stp() { + # Sometimes after mirror installation, the neighbor's state is not valid. + # The reason is that there is no SW datapath activity related to the + # neighbor for the remote GRE address. Therefore whether the corresponding + # neighbor will be valid is a matter of luck, and the test is thus racy. + # Set the neighbor's state to permanent, so it would be always valid. + ip neigh replace 192.0.2.130 lladdr $(mac_get $h3) \ + nud permanent dev br2 full_test_span_gre_stp gt4 $swp3.555 "mirror to gretap" }
test_ip6gretap_stp() { + ip neigh replace 2001:db8:2::2 lladdr $(mac_get $h3) \ + nud permanent dev br2 full_test_span_gre_stp gt6 $swp3.555 "mirror to ip6gretap" }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Danielle Ratson danieller@nvidia.com
commit ae9b24ddb69b4e31cda1b5e267a5a08a1db11717 upstream.
Currently, only external bits are added to the PTYS register, whereas there is one external bit that is wrongly marked as internal, and so was recently removed from the register.
Add that bit to the PTYS register again, as this bit is no longer internal.
Its removal resulted in '100000baseLR4_ER4/Full' link mode no longer being supported, causing a regression on some setups.
Fixes: 5bf01b571cf4 ("mlxsw: spectrum_ethtool: Remove internal speeds from PTYS register") Signed-off-by: Danielle Ratson danieller@nvidia.com Reported-by: Eddie Shklaer eddies@nvidia.com Tested-by: Eddie Shklaer eddies@nvidia.com Reviewed-by: Jiri Pirko jiri@nvidia.com Signed-off-by: Ido Schimmel idosch@nvidia.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlxsw/reg.h | 1 + drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c | 5 +++++ drivers/net/ethernet/mellanox/mlxsw/switchx2.c | 3 ++- 3 files changed, 8 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/mellanox/mlxsw/reg.h +++ b/drivers/net/ethernet/mellanox/mlxsw/reg.h @@ -4430,6 +4430,7 @@ MLXSW_ITEM32(reg, ptys, ext_eth_proto_ca #define MLXSW_REG_PTYS_ETH_SPEED_100GBASE_CR4 BIT(20) #define MLXSW_REG_PTYS_ETH_SPEED_100GBASE_SR4 BIT(21) #define MLXSW_REG_PTYS_ETH_SPEED_100GBASE_KR4 BIT(22) +#define MLXSW_REG_PTYS_ETH_SPEED_100GBASE_LR4_ER4 BIT(23) #define MLXSW_REG_PTYS_ETH_SPEED_25GBASE_CR BIT(27) #define MLXSW_REG_PTYS_ETH_SPEED_25GBASE_KR BIT(28) #define MLXSW_REG_PTYS_ETH_SPEED_25GBASE_SR BIT(29) --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c @@ -1171,6 +1171,11 @@ static const struct mlxsw_sp1_port_link_ .mask_ethtool = ETHTOOL_LINK_MODE_100000baseKR4_Full_BIT, .speed = SPEED_100000, }, + { + .mask = MLXSW_REG_PTYS_ETH_SPEED_100GBASE_LR4_ER4, + .mask_ethtool = ETHTOOL_LINK_MODE_100000baseLR4_ER4_Full_BIT, + .speed = SPEED_100000, + }, };
#define MLXSW_SP1_PORT_LINK_MODE_LEN ARRAY_SIZE(mlxsw_sp1_port_link_mode) --- a/drivers/net/ethernet/mellanox/mlxsw/switchx2.c +++ b/drivers/net/ethernet/mellanox/mlxsw/switchx2.c @@ -613,7 +613,8 @@ static const struct mlxsw_sx_port_link_m { .mask = MLXSW_REG_PTYS_ETH_SPEED_100GBASE_CR4 | MLXSW_REG_PTYS_ETH_SPEED_100GBASE_SR4 | - MLXSW_REG_PTYS_ETH_SPEED_100GBASE_KR4, + MLXSW_REG_PTYS_ETH_SPEED_100GBASE_KR4 | + MLXSW_REG_PTYS_ETH_SPEED_100GBASE_LR4_ER4, .speed = 100000, }, };
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Ian Rogers irogers@google.com
commit 137a5258939aca56558f3a23eb229b9c4b293917 upstream.
Issue detected by address sanitizer.
Fixes: cd4ceb63438e9e28 ("perf util: Save pid-cmdline mapping into tracing header") Signed-off-by: Ian Rogers irogers@google.com Acked-by: Namhyung Kim namhyung@kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Jiri Olsa jolsa@redhat.com Cc: Mark Rutland mark.rutland@arm.com Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Link: http://lore.kernel.org/lkml/20210226221431.1985458-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/perf/util/trace-event-read.c | 1 + 1 file changed, 1 insertion(+)
--- a/tools/perf/util/trace-event-read.c +++ b/tools/perf/util/trace-event-read.c @@ -361,6 +361,7 @@ static int read_saved_cmdline(struct tep pr_debug("error reading saved cmdlines\n"); goto out; } + buf[ret] = '\0';
parse_saved_cmdline(pevent, buf, size); ret = 0;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Ravi Bangoria ravi.bangoria@linux.ibm.com
commit 6740a4e70e5d1b9d8e7fe41fd46dd5656d65dadf upstream.
perf report fails to add valid additional fields with -F when used with branch or mem modes. Fix it.
Before patch:
$ perf record -b $ perf report -b -F +srcline_from --stdio Error: Invalid --fields key: `srcline_from'
After patch:
$ perf report -b -F +srcline_from --stdio # Samples: 8K of event 'cycles' # Event count (approx.): 8784 ...
Committer notes:
There was an inversion: when looking at branch stack dimensions (keys) it was checking if the sort mode was 'mem', not 'branch'.
Fixes: aa6b3c99236b ("perf report: Make -F more strict like -s") Reported-by: Athira Jajeev atrajeev@linux.vnet.ibm.com Signed-off-by: Ravi Bangoria ravi.bangoria@linux.ibm.com Reviewed-by: Athira Jajeev atrajeev@linux.vnet.ibm.com Tested-by: Arnaldo Carvalho de Melo acme@redhat.com Tested-by: Athira Jajeev atrajeev@linux.vnet.ibm.com Cc: Jiri Olsa jolsa@redhat.com Cc: Kan Liang kan.liang@linux.intel.com Cc: Namhyung Kim namhyung@kernel.org Link: http://lore.kernel.org/lkml/20210304062958.85465-1-ravi.bangoria@linux.ibm.c... Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/perf/util/sort.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/tools/perf/util/sort.c +++ b/tools/perf/util/sort.c @@ -3033,7 +3033,7 @@ int output_field_add(struct perf_hpp_lis if (strncasecmp(tok, sd->name, strlen(tok))) continue;
- if (sort__mode != SORT_MODE__MEMORY) + if (sort__mode != SORT_MODE__BRANCH) return -EINVAL;
return __sort_dimension__add_output(list, sd); @@ -3045,7 +3045,7 @@ int output_field_add(struct perf_hpp_lis if (strncasecmp(tok, sd->name, strlen(tok))) continue;
- if (sort__mode != SORT_MODE__BRANCH) + if (sort__mode != SORT_MODE__MEMORY) return -EINVAL;
return __sort_dimension__add_output(list, sd);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Jian Shen shenjian15@huawei.com
commit ae85ddda0f1b341b2d25f5a5e0eff1d42b6ef3df upstream.
Currently, some bit filed definitions of flow director TCAM configuration command are incorrect. Since the wrong MSB is always 0, and these fields are assgined in order, so it still works.
Fix it by redefine them.
Fixes: 117328680288 ("net: hns3: Add input key and action config support for flow director") Signed-off-by: Jian Shen shenjian15@huawei.com Signed-off-by: Huazhong Tan tanhuazhong@huawei.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h @@ -1048,16 +1048,16 @@ struct hclge_fd_tcam_config_3_cmd { #define HCLGE_FD_AD_DROP_B 0 #define HCLGE_FD_AD_DIRECT_QID_B 1 #define HCLGE_FD_AD_QID_S 2 -#define HCLGE_FD_AD_QID_M GENMASK(12, 2) +#define HCLGE_FD_AD_QID_M GENMASK(11, 2) #define HCLGE_FD_AD_USE_COUNTER_B 12 #define HCLGE_FD_AD_COUNTER_NUM_S 13 #define HCLGE_FD_AD_COUNTER_NUM_M GENMASK(20, 13) #define HCLGE_FD_AD_NXT_STEP_B 20 #define HCLGE_FD_AD_NXT_KEY_S 21 -#define HCLGE_FD_AD_NXT_KEY_M GENMASK(26, 21) +#define HCLGE_FD_AD_NXT_KEY_M GENMASK(25, 21) #define HCLGE_FD_AD_WR_RULE_ID_B 0 #define HCLGE_FD_AD_RULE_ID_S 1 -#define HCLGE_FD_AD_RULE_ID_M GENMASK(13, 1) +#define HCLGE_FD_AD_RULE_ID_M GENMASK(12, 1) #define HCLGE_FD_AD_TC_OVRD_B 16 #define HCLGE_FD_AD_TC_SIZE_S 17 #define HCLGE_FD_AD_TC_SIZE_M GENMASK(20, 17)
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Jian Shen shenjian15@huawei.com
commit c75ec148a316e8cf52274d16b9b422703b96f5ce upstream.
Currently, the driver returns VLAN_VID_MASK for vlan mask field, when get flow director rule information for rule doesn't use vlan. It may cause the vlan mask value display as 0xf000 in this case, like below:
estuary:/$ ethtool -u eth1 50 RX rings available Total 1 rules
Filter: 2 Rule Type: TCP over IPv4 Src IP addr: 0.0.0.0 mask: 255.255.255.255 Dest IP addr: 0.0.0.0 mask: 255.255.255.255 TOS: 0x0 mask: 0xff Src port: 0 mask: 0xffff Dest port: 0 mask: 0xffff VLAN EtherType: 0x0 mask: 0xffff VLAN: 0x0 mask: 0xf000 User-defined: 0x1234 mask: 0x0 Action: Direct to queue 3
Fix it by return 0.
Fixes: 05c2314fe6a8 ("net: hns3: Add support for rule query of flow director") Signed-off-by: Jian Shen shenjian15@huawei.com Signed-off-by: Huazhong Tan tanhuazhong@huawei.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c @@ -6283,8 +6283,7 @@ static void hclge_fd_get_ext_info(struct fs->h_ext.vlan_tci = cpu_to_be16(rule->tuples.vlan_tag1); fs->m_ext.vlan_tci = rule->unused_tuple & BIT(INNER_VLAN_TAG_FST) ? - cpu_to_be16(VLAN_VID_MASK) : - cpu_to_be16(rule->tuples_mask.vlan_tag1); + 0 : cpu_to_be16(rule->tuples_mask.vlan_tag1); }
if (fs->flow_type & FLOW_MAC_EXT) {
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Jian Shen shenjian15@huawei.com
commit b36fc875bcdee56865c444a2cdae17d354a6d5f5 upstream.
The function hclge_fd_convert_tuple() is used to convert tuples and tuples mask to TCAM x and y. But it misuses the source mac as source mac mask when convert INNER_SRC_MAC, which may cause the flow director rule works unexpectedly. So fix it.
Fixes: 117328680288 ("net: hns3: Add input key and action config support for flow director") Signed-off-by: Jian Shen shenjian15@huawei.com Signed-off-by: Huazhong Tan tanhuazhong@huawei.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c @@ -5194,9 +5194,9 @@ static bool hclge_fd_convert_tuple(u32 t case BIT(INNER_SRC_MAC): for (i = 0; i < ETH_ALEN; i++) { calc_x(key_x[ETH_ALEN - 1 - i], rule->tuples.src_mac[i], - rule->tuples.src_mac[i]); + rule->tuples_mask.src_mac[i]); calc_y(key_y[ETH_ALEN - 1 - i], rule->tuples.src_mac[i], - rule->tuples.src_mac[i]); + rule->tuples_mask.src_mac[i]); }
return true;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Wang Qing wangqing@vivo.com
commit 51c44babdc19aaf882e1213325a0ba291573308f upstream.
The copy_to_user() function returns the number of bytes remaining to be copied, but we want to return -EFAULT if the copy doesn't complete.
Fixes: e01bcdd61320 ("vfio: ccw: realize VFIO_DEVICE_GET_REGION_INFO ioctl") Signed-off-by: Wang Qing wangqing@vivo.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Link: https://lore.kernel.org/r/1614600093-13992-1-git-send-email-wangqing@vivo.co... Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/s390/cio/vfio_ccw_ops.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/s390/cio/vfio_ccw_ops.c +++ b/drivers/s390/cio/vfio_ccw_ops.c @@ -543,7 +543,7 @@ static ssize_t vfio_ccw_mdev_ioctl(struc if (ret) return ret;
- return copy_to_user((void __user *)arg, &info, minsz); + return copy_to_user((void __user *)arg, &info, minsz) ? -EFAULT : 0; } case VFIO_DEVICE_GET_REGION_INFO: { @@ -561,7 +561,7 @@ static ssize_t vfio_ccw_mdev_ioctl(struc if (ret) return ret;
- return copy_to_user((void __user *)arg, &info, minsz); + return copy_to_user((void __user *)arg, &info, minsz) ? -EFAULT : 0; } case VFIO_DEVICE_GET_IRQ_INFO: {
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Edwin Peer edwin.peer@broadcom.com
commit 20d7d1c5c9b11e9f538ed4a2289be106de970d3e upstream.
The following trace excerpt corresponds with a NULL pointer dereference of 'bp->irq_tbl' in bnxt_setup_inta() on an Aarch64 system after many device resets:
Unable to handle kernel NULL pointer dereference at ... 000000d ... pc : string+0x3c/0x80 lr : vsnprintf+0x294/0x7e0 sp : ffff00000f61ba70 pstate : 20000145 x29: ffff00000f61ba70 x28: 000000000000000d x27: ffff0000009c8b5a x26: ffff00000f61bb80 x25: ffff0000009c8b5a x24: 0000000000000012 x23: 00000000ffffffe0 x22: ffff000008990428 x21: ffff00000f61bb80 x20: 000000000000000d x19: 000000000000001f x18: 0000000000000000 x17: 0000000000000000 x16: ffff800b6d0fb400 x15: 0000000000000000 x14: ffff800b7fe31ae8 x13: 00001ed16472c920 x12: ffff000008c6b1c9 x11: ffff000008cf0580 x10: ffff00000f61bb80 x9 : 00000000ffffffd8 x8 : 000000000000000c x7 : ffff800b684b8000 x6 : 0000000000000000 x5 : 0000000000000065 x4 : 0000000000000001 x3 : ffff0a00ffffff04 x2 : 000000000000001f x1 : 0000000000000000 x0 : 000000000000000d Call trace: string+0x3c/0x80 vsnprintf+0x294/0x7e0 snprintf+0x44/0x50 __bnxt_open_nic+0x34c/0x928 [bnxt_en] bnxt_open+0xe8/0x238 [bnxt_en] __dev_open+0xbc/0x130 __dev_change_flags+0x12c/0x168 dev_change_flags+0x20/0x60 ...
Ordinarily, a call to bnxt_setup_inta() (not in trace due to inlining) would not be expected on a system supporting MSIX at all. However, if bnxt_init_int_mode() does not end up being called after the call to bnxt_clear_int_mode() in bnxt_fw_reset_close(), then the driver will think that only INTA is supported and bp->irq_tbl will be NULL, causing the above crash.
In the error recovery scenario, we call bnxt_clear_int_mode() in bnxt_fw_reset_close() early in the sequence. Ordinarily, we will call bnxt_init_int_mode() in bnxt_hwrm_if_change() after we reestablish communication with the firmware after reset. However, if the sequence has to abort before we call bnxt_init_int_mode() and if the user later attempts to re-open the device, then it will cause the crash above.
We fix it in 2 ways:
1. Check for bp->irq_tbl in bnxt_setup_int_mode(). If it is NULL, call bnxt_init_init_mode().
2. If we need to abort in bnxt_hwrm_if_change() and cannot complete the error recovery sequence, set the BNXT_STATE_ABORT_ERR flag. This will cause more drastic recovery at the next attempt to re-open the device, including a call to bnxt_init_int_mode().
Fixes: 3bc7d4a352ef ("bnxt_en: Add BNXT_STATE_IN_FW_RESET state.") Reviewed-by: Scott Branden scott.branden@broadcom.com Signed-off-by: Edwin Peer edwin.peer@broadcom.com Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -8430,10 +8430,18 @@ static void bnxt_setup_inta(struct bnxt bp->irq_tbl[0].handler = bnxt_inta; }
+static int bnxt_init_int_mode(struct bnxt *bp); + static int bnxt_setup_int_mode(struct bnxt *bp) { int rc;
+ if (!bp->irq_tbl) { + rc = bnxt_init_int_mode(bp); + if (rc || !bp->irq_tbl) + return rc ?: -ENODEV; + } + if (bp->flags & BNXT_FLAG_USING_MSIX) bnxt_setup_msix(bp); else @@ -8618,7 +8626,7 @@ static int bnxt_init_inta(struct bnxt *b
static int bnxt_init_int_mode(struct bnxt *bp) { - int rc = 0; + int rc = -ENODEV;
if (bp->flags & BNXT_FLAG_MSIX_CAP) rc = bnxt_init_msix(bp); @@ -9339,7 +9347,8 @@ static int bnxt_hwrm_if_change(struct bn { struct hwrm_func_drv_if_change_output *resp = bp->hwrm_cmd_resp_addr; struct hwrm_func_drv_if_change_input req = {0}; - bool resc_reinit = false, fw_reset = false; + bool fw_reset = !bp->irq_tbl; + bool resc_reinit = false; u32 flags = 0; int rc;
@@ -9367,6 +9376,7 @@ static int bnxt_hwrm_if_change(struct bn
if (test_bit(BNXT_STATE_IN_FW_RESET, &bp->state) && !fw_reset) { netdev_err(bp->dev, "RESET_DONE not set during FW reset.\n"); + set_bit(BNXT_STATE_ABORT_ERR, &bp->state); return -ENODEV; } if (resc_reinit || fw_reset) {
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Tong Zhang ztong0001@gmail.com
commit 874a52f9b693ed8bf7a92b3592a547ce8a684e6f upstream.
drm_fbdev_cleanup() can be called when fb_helper->buffer is null, hence fb_helper->buffer should be checked before calling drm_client_buffer_vunmap(). This buffer is also checked in drm_client_framebuffer_delete(), so we should also do the same thing for drm_client_buffer_vunmap().
[ 199.128742] RIP: 0010:drm_client_buffer_vunmap+0xd/0x20 [ 199.129031] Code: 43 18 48 8b 53 20 49 89 45 00 49 89 55 08 5b 44 89 e0 41 5c 41 5d 41 5e 5d c3 0f 1f 00 53 48 89 fb 48 8d 7f 10 e8 73 7d a1 ff <48> 8b 7b 10 48 8d 73 18 5b e9 75 53 fc ff 0 f 1f 44 00 00 48 b8 00 [ 199.130041] RSP: 0018:ffff888103f3fc88 EFLAGS: 00010282 [ 199.130329] RAX: 0000000000000001 RBX: 0000000000000000 RCX: ffffffff8214d46d [ 199.130733] RDX: 1ffffffff079c6b9 RSI: 0000000000000246 RDI: ffffffff83ce35c8 [ 199.131119] RBP: ffff888103d25458 R08: 0000000000000001 R09: fffffbfff0791761 [ 199.131505] R10: ffffffff83c8bb07 R11: fffffbfff0791760 R12: 0000000000000000 [ 199.131891] R13: ffff888103d25468 R14: ffff888103d25418 R15: ffff888103f18120 [ 199.132277] FS: 00007f36fdcbb6a0(0000) GS:ffff88815b400000(0000) knlGS:0000000000000000 [ 199.132721] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 199.133033] CR2: 0000000000000010 CR3: 0000000103d26000 CR4: 00000000000006f0 [ 199.133420] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 199.133807] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 199.134195] Call Trace: [ 199.134333] drm_fbdev_cleanup+0x179/0x1a0 [ 199.134562] drm_fbdev_client_unregister+0x2b/0x40 [ 199.134828] drm_client_dev_unregister+0xa8/0x180 [ 199.135088] drm_dev_unregister+0x61/0x110 [ 199.135315] mgag200_pci_remove+0x38/0x52 [mgag200] [ 199.135586] pci_device_remove+0x62/0xe0 [ 199.135806] device_release_driver_internal+0x148/0x270 [ 199.136094] driver_detach+0x76/0xe0 [ 199.136294] bus_remove_driver+0x7e/0x100 [ 199.136521] pci_unregister_driver+0x28/0xf0 [ 199.136759] __x64_sys_delete_module+0x268/0x300 [ 199.137016] ? __ia32_sys_delete_module+0x300/0x300 [ 199.137285] ? call_rcu+0x3e4/0x580 [ 199.137481] ? fpregs_assert_state_consistent+0x4d/0x60 [ 199.137767] ? exit_to_user_mode_prepare+0x2f/0x130 [ 199.138037] do_syscall_64+0x33/0x40 [ 199.138237] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 199.138517] RIP: 0033:0x7f36fdc3dcf7
Signed-off-by: Tong Zhang ztong0001@gmail.com Fixes: 763aea17bf57 ("drm/fb-helper: Unmap client buffer during shutdown") Cc: Thomas Zimmermann tzimmermann@suse.de Cc: Sam Ravnborg sam@ravnborg.org Cc: Maxime Ripard mripard@kernel.org Cc: Maarten Lankhorst maarten.lankhorst@linux.intel.com Cc: David Airlie airlied@linux.ie Cc: Daniel Vetter daniel@ffwll.ch Cc: dri-devel@lists.freedesktop.org Cc: stable@vger.kernel.org # v5.11+ Signed-off-by: Thomas Zimmermann tzimmermann@suse.de Link: https://patchwork.freedesktop.org/patch/msgid/20210228044625.171151-1-ztong0... Signed-off-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/drm_fb_helper.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/drm_fb_helper.c +++ b/drivers/gpu/drm/drm_fb_helper.c @@ -2043,7 +2043,7 @@ static void drm_fbdev_cleanup(struct drm
if (shadow) vfree(shadow); - else + else if (fb_helper->buffer) drm_client_buffer_vunmap(fb_helper->buffer);
drm_client_framebuffer_delete(fb_helper->buffer);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Daniel Vetter daniel.vetter@ffwll.ch
commit de066e116306baf3a6a62691ac63cfc0b1dabddb upstream.
Some of them have gaps, or fields we don't clear. Native ioctl code does full copies plus zero-extends on size mismatch, so nothing can leak. But compat is more hand-rolled so need to be careful.
None of these matter for performance, so just memset.
Also I didn't fix up the CONFIG_DRM_LEGACY or CONFIG_DRM_AGP ioctl, those are security holes anyway.
Acked-by: Maxime Ripard mripard@kernel.org Reported-by: syzbot+620cf21140fc7e772a5d@syzkaller.appspotmail.com # vblank ioctl Cc: syzbot+620cf21140fc7e772a5d@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter daniel.vetter@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20210222100643.400935-1-daniel... (cherry picked from commit e926c474ebee404441c838d18224cd6f246a71b7) Signed-off-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/drm_ioc32.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
--- a/drivers/gpu/drm/drm_ioc32.c +++ b/drivers/gpu/drm/drm_ioc32.c @@ -99,6 +99,8 @@ static int compat_drm_version(struct fil if (copy_from_user(&v32, (void __user *)arg, sizeof(v32))) return -EFAULT;
+ memset(&v, 0, sizeof(v)); + v = (struct drm_version) { .name_len = v32.name_len, .name = compat_ptr(v32.name), @@ -137,6 +139,9 @@ static int compat_drm_getunique(struct f
if (copy_from_user(&uq32, (void __user *)arg, sizeof(uq32))) return -EFAULT; + + memset(&uq, 0, sizeof(uq)); + uq = (struct drm_unique){ .unique_len = uq32.unique_len, .unique = compat_ptr(uq32.unique), @@ -265,6 +270,8 @@ static int compat_drm_getclient(struct f if (copy_from_user(&c32, argp, sizeof(c32))) return -EFAULT;
+ memset(&client, 0, sizeof(client)); + client.idx = c32.idx;
err = drm_ioctl_kernel(file, drm_getclient, &client, 0); @@ -852,6 +859,8 @@ static int compat_drm_wait_vblank(struct if (copy_from_user(&req32, argp, sizeof(req32))) return -EFAULT;
+ memset(&req, 0, sizeof(req)); + req.request.type = req32.request.type; req.request.sequence = req32.request.sequence; req.request.signal = req32.request.signal; @@ -889,6 +898,8 @@ static int compat_drm_mode_addfb2(struct struct drm_mode_fb_cmd2 req64; int err;
+ memset(&req64, 0, sizeof(req64)); + if (copy_from_user(&req64, argp, offsetof(drm_mode_fb_cmd232_t, modifier))) return -EFAULT;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Christian König christian.koenig@amd.com
commit a25955ba123499d7db520175c6be59c29f9215e3 upstream.
Otherwise we will run into a NULL ptr deref.
Signed-off-by: Christian König christian.koenig@amd.com Bug: https://bugzilla.kernel.org/show_bug.cgi?id=212137 Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org # 5.11.x Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/radeon/radeon.h | 2 ++ drivers/gpu/drm/radeon/radeon_gem.c | 4 ++-- drivers/gpu/drm/radeon/radeon_prime.c | 2 ++ 3 files changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -575,6 +575,8 @@ struct radeon_gem { struct list_head objects; };
+extern const struct drm_gem_object_funcs radeon_gem_object_funcs; + int radeon_gem_init(struct radeon_device *rdev); void radeon_gem_fini(struct radeon_device *rdev); int radeon_gem_object_create(struct radeon_device *rdev, unsigned long size, --- a/drivers/gpu/drm/radeon/radeon_gem.c +++ b/drivers/gpu/drm/radeon/radeon_gem.c @@ -43,7 +43,7 @@ struct sg_table *radeon_gem_prime_get_sg int radeon_gem_prime_pin(struct drm_gem_object *obj); void radeon_gem_prime_unpin(struct drm_gem_object *obj);
-static const struct drm_gem_object_funcs radeon_gem_object_funcs; +const struct drm_gem_object_funcs radeon_gem_object_funcs;
static void radeon_gem_object_free(struct drm_gem_object *gobj) { @@ -227,7 +227,7 @@ static int radeon_gem_handle_lockup(stru return r; }
-static const struct drm_gem_object_funcs radeon_gem_object_funcs = { +const struct drm_gem_object_funcs radeon_gem_object_funcs = { .free = radeon_gem_object_free, .open = radeon_gem_object_open, .close = radeon_gem_object_close, --- a/drivers/gpu/drm/radeon/radeon_prime.c +++ b/drivers/gpu/drm/radeon/radeon_prime.c @@ -56,6 +56,8 @@ struct drm_gem_object *radeon_gem_prime_ if (ret) return ERR_PTR(ret);
+ bo->tbo.base.funcs = &radeon_gem_object_funcs; + mutex_lock(&rdev->gem.mutex); list_add_tail(&bo->list, &rdev->gem.objects); mutex_unlock(&rdev->gem.mutex);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Takashi Iwai tiwai@suse.de
commit 7a46f05e5e163c00e41892e671294286e53fe15c upstream.
There seem devices that don't work with the aux channel backlight control. For allowing such users to test with the other backlight control method, provide a new module option, aux_backlight, to specify enabling or disabling the aux backport support explicitly. As default, the aux support is detected by the hardware capability.
v2: make the backlight option generic in case we add future backlight types (Alex)
BugLink: https://bugzilla.opensuse.org/show_bug.cgi?id=1180749 BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1438 Reviewed-by: Nicholas Kazlauskas nicholas.kazlauskas@amd.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 ++++ drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 5 +++++ 3 files changed, 10 insertions(+)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -179,6 +179,7 @@ extern uint amdgpu_smu_memory_pool_size; extern uint amdgpu_dc_feature_mask; extern uint amdgpu_dc_debug_mask; extern uint amdgpu_dm_abm_level; +extern int amdgpu_backlight; extern struct amdgpu_mgpu_info mgpu_info; extern int amdgpu_ras_enable; extern uint amdgpu_ras_mask; --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -777,6 +777,10 @@ uint amdgpu_dm_abm_level; MODULE_PARM_DESC(abmlevel, "ABM level (0 = off (default), 1-4 = backlight reduction level) "); module_param_named(abmlevel, amdgpu_dm_abm_level, uint, 0444);
+int amdgpu_backlight = -1; +MODULE_PARM_DESC(backlight, "Backlight control (0 = pwm, 1 = aux, -1 auto (default))"); +module_param_named(backlight, amdgpu_backlight, bint, 0444); + /** * DOC: tmz (int) * Trusted Memory Zone (TMZ) is a method to protect data being written --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -2209,6 +2209,11 @@ static void update_connector_ext_caps(st caps->ext_caps->bits.hdr_aux_backlight_control == 1) caps->aux_support = true;
+ if (amdgpu_backlight == 0) + caps->aux_support = false; + else if (amdgpu_backlight == 1) + caps->aux_support = true; + /* From the specification (CTA-861-G), for calculating the maximum * luminance we need to use: * Luminance = 50*2**(CV/32)
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Holger Hoffstätte holger@applied-asynchrony.com
commit 680174cfd1e1cea70a8f30ccb44d8fbdf996018e upstream.
After fixing nested FPU contexts caused by 41401ac67791 we're still seeing complaints about spurious kernel_fpu_end(). As it turns out this was already fixed for dcn20 in commit f41ed88cbd ("drm/amdgpu/display: use GFP_ATOMIC in dcn20_validate_bandwidth_internal") but never moved forward to dcn21.
Signed-off-by: Holger Hoffstätte holger@applied-asynchrony.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c @@ -1339,7 +1339,7 @@ static noinline bool dcn21_validate_band int vlevel = 0; int pipe_split_from[MAX_PIPES]; int pipe_cnt = 0; - display_e2e_pipe_params_st *pipes = kzalloc(dc->res_pool->pipe_count * sizeof(display_e2e_pipe_params_st), GFP_KERNEL); + display_e2e_pipe_params_st *pipes = kzalloc(dc->res_pool->pipe_count * sizeof(display_e2e_pipe_params_st), GFP_ATOMIC); DC_LOGGER_INIT(dc->ctx->logger);
BW_VAL_TRACE_COUNT();
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Holger Hoffstätte holger@applied-asynchrony.com
commit 15e8b95d5f7509e0b09289be8c422c459c9f0412 upstream.
Commit 41401ac67791 added FPU wrappers to dcn21_validate_bandwidth(), which was correct. Unfortunately a nested function alredy contained DC_FP_START()/DC_FP_END() calls, which results in nested FPU context enter/exit and complaints by kernel_fpu_begin_mask(). This can be observed e.g. with 5.10.20, which backported 41401ac67791 and now emits the following warning on boot:
WARNING: CPU: 6 PID: 858 at arch/x86/kernel/fpu/core.c:129 kernel_fpu_begin_mask+0xa5/0xc0 Call Trace: dcn21_calculate_wm+0x47/0xa90 [amdgpu] dcn21_validate_bandwidth_fp+0x15d/0x2b0 [amdgpu] dcn21_validate_bandwidth+0x29/0x40 [amdgpu] dc_validate_global_state+0x3c7/0x4c0 [amdgpu]
The warning is emitted due to the additional DC_FP_START/END calls in patch_bounding_box(), which is inlined into dcn21_calculate_wm(), its only caller. Removing the calls brings the code in line with dcn20 and makes the warning disappear.
Fixes: 41401ac67791 ("drm/amd/display: Add FPU wrappers to dcn21_validate_bandwidth()") Signed-off-by: Holger Hoffstätte holger@applied-asynchrony.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c | 4 ---- 1 file changed, 4 deletions(-)
--- a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c @@ -1062,8 +1062,6 @@ static void patch_bounding_box(struct dc { int i;
- DC_FP_START(); - if (dc->bb_overrides.sr_exit_time_ns) { for (i = 0; i < WM_SET_COUNT; i++) { dc->clk_mgr->bw_params->wm_table.entries[i].sr_exit_time_us = @@ -1088,8 +1086,6 @@ static void patch_bounding_box(struct dc dc->bb_overrides.dram_clock_change_latency_ns / 1000.0; } } - - DC_FP_END(); }
void dcn21_calculate_wm(
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Evan Quan evan.quan@amd.com
commit 48123d068fcb584838ce29912660c5e9490bad0e upstream.
The "/ 10" should be applied to the right-hand operand instead of the left-hand one.
Signed-off-by: Evan Quan evan.quan@amd.com Noticed-by: Georgios Toptsidis gtoptsid@gmail.com Reviewed-by: Feifei Xu Feifei.Xu@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c @@ -5216,10 +5216,10 @@ static int smu7_set_watermarks_for_clock for (j = 0; j < dep_sclk_table->count; j++) { valid_entry = false; for (k = 0; k < watermarks->num_wm_sets; k++) { - if (dep_sclk_table->entries[i].clk / 10 >= watermarks->wm_clk_ranges[k].wm_min_eng_clk_in_khz && - dep_sclk_table->entries[i].clk / 10 < watermarks->wm_clk_ranges[k].wm_max_eng_clk_in_khz && - dep_mclk_table->entries[i].clk / 10 >= watermarks->wm_clk_ranges[k].wm_min_mem_clk_in_khz && - dep_mclk_table->entries[i].clk / 10 < watermarks->wm_clk_ranges[k].wm_max_mem_clk_in_khz) { + if (dep_sclk_table->entries[i].clk >= watermarks->wm_clk_ranges[k].wm_min_eng_clk_in_khz / 10 && + dep_sclk_table->entries[i].clk < watermarks->wm_clk_ranges[k].wm_max_eng_clk_in_khz / 10 && + dep_mclk_table->entries[i].clk >= watermarks->wm_clk_ranges[k].wm_min_mem_clk_in_khz / 10 && + dep_mclk_table->entries[i].clk < watermarks->wm_clk_ranges[k].wm_max_mem_clk_in_khz / 10) { valid_entry = true; table->DisplayWatermark[i][j] = watermarks->wm_clk_ranges[k].wm_set_id; break;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Kenneth Feng kenneth.feng@amd.com
commit 50ceb1fe7acd50831180f4b5597bf7b39e8059c8 upstream.
Currently the pcie dpm has two problems. 1. Only the high dpm level speed/width can be overrided if the requested values are out of the pcie capability. 2. The high dpm level is always overrided though sometimes it's not necesarry.
Signed-off-by: Kenneth Feng kenneth.feng@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c | 48 +++++++++++++ drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c | 66 ++++++++++++++++++ drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c | 48 +++++++------ 3 files changed, 141 insertions(+), 21 deletions(-)
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c @@ -1506,6 +1506,48 @@ static int vega10_populate_single_lclk_l return 0; }
+static int vega10_override_pcie_parameters(struct pp_hwmgr *hwmgr) +{ + struct amdgpu_device *adev = (struct amdgpu_device *)(hwmgr->adev); + struct vega10_hwmgr *data = + (struct vega10_hwmgr *)(hwmgr->backend); + uint32_t pcie_gen = 0, pcie_width = 0; + PPTable_t *pp_table = &(data->smc_state_table.pp_table); + int i; + + if (adev->pm.pcie_gen_mask & CAIL_PCIE_LINK_SPEED_SUPPORT_GEN4) + pcie_gen = 3; + else if (adev->pm.pcie_gen_mask & CAIL_PCIE_LINK_SPEED_SUPPORT_GEN3) + pcie_gen = 2; + else if (adev->pm.pcie_gen_mask & CAIL_PCIE_LINK_SPEED_SUPPORT_GEN2) + pcie_gen = 1; + else if (adev->pm.pcie_gen_mask & CAIL_PCIE_LINK_SPEED_SUPPORT_GEN1) + pcie_gen = 0; + + if (adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X16) + pcie_width = 6; + else if (adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X12) + pcie_width = 5; + else if (adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X8) + pcie_width = 4; + else if (adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X4) + pcie_width = 3; + else if (adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X2) + pcie_width = 2; + else if (adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X1) + pcie_width = 1; + + for (i = 0; i < NUM_LINK_LEVELS; i++) { + if (pp_table->PcieGenSpeed[i] > pcie_gen) + pp_table->PcieGenSpeed[i] = pcie_gen; + + if (pp_table->PcieLaneCount[i] > pcie_width) + pp_table->PcieLaneCount[i] = pcie_width; + } + + return 0; +} + static int vega10_populate_smc_link_levels(struct pp_hwmgr *hwmgr) { int result = -1; @@ -2557,6 +2599,11 @@ static int vega10_init_smc_table(struct "Failed to initialize Link Level!", return result);
+ result = vega10_override_pcie_parameters(hwmgr); + PP_ASSERT_WITH_CODE(!result, + "Failed to override pcie parameters!", + return result); + result = vega10_populate_all_graphic_levels(hwmgr); PP_ASSERT_WITH_CODE(!result, "Failed to initialize Graphics Level!", @@ -2923,6 +2970,7 @@ static int vega10_start_dpm(struct pp_hw return 0; }
+ static int vega10_enable_disable_PCC_limit_feature(struct pp_hwmgr *hwmgr, bool enable) { struct vega10_hwmgr *data = hwmgr->backend; --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c @@ -481,6 +481,67 @@ static void vega12_init_dpm_state(struct dpm_state->hard_max_level = 0xffff; }
+static int vega12_override_pcie_parameters(struct pp_hwmgr *hwmgr) +{ + struct amdgpu_device *adev = (struct amdgpu_device *)(hwmgr->adev); + struct vega12_hwmgr *data = + (struct vega12_hwmgr *)(hwmgr->backend); + uint32_t pcie_gen = 0, pcie_width = 0, smu_pcie_arg, pcie_gen_arg, pcie_width_arg; + PPTable_t *pp_table = &(data->smc_state_table.pp_table); + int i; + int ret; + + if (adev->pm.pcie_gen_mask & CAIL_PCIE_LINK_SPEED_SUPPORT_GEN4) + pcie_gen = 3; + else if (adev->pm.pcie_gen_mask & CAIL_PCIE_LINK_SPEED_SUPPORT_GEN3) + pcie_gen = 2; + else if (adev->pm.pcie_gen_mask & CAIL_PCIE_LINK_SPEED_SUPPORT_GEN2) + pcie_gen = 1; + else if (adev->pm.pcie_gen_mask & CAIL_PCIE_LINK_SPEED_SUPPORT_GEN1) + pcie_gen = 0; + + if (adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X16) + pcie_width = 6; + else if (adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X12) + pcie_width = 5; + else if (adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X8) + pcie_width = 4; + else if (adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X4) + pcie_width = 3; + else if (adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X2) + pcie_width = 2; + else if (adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X1) + pcie_width = 1; + + /* Bit 31:16: LCLK DPM level. 0 is DPM0, and 1 is DPM1 + * Bit 15:8: PCIE GEN, 0 to 3 corresponds to GEN1 to GEN4 + * Bit 7:0: PCIE lane width, 1 to 7 corresponds is x1 to x32 + */ + for (i = 0; i < NUM_LINK_LEVELS; i++) { + pcie_gen_arg = (pp_table->PcieGenSpeed[i] > pcie_gen) ? pcie_gen : + pp_table->PcieGenSpeed[i]; + pcie_width_arg = (pp_table->PcieLaneCount[i] > pcie_width) ? pcie_width : + pp_table->PcieLaneCount[i]; + + if (pcie_gen_arg != pp_table->PcieGenSpeed[i] || pcie_width_arg != + pp_table->PcieLaneCount[i]) { + smu_pcie_arg = (i << 16) | (pcie_gen_arg << 8) | pcie_width_arg; + ret = smum_send_msg_to_smc_with_parameter(hwmgr, + PPSMC_MSG_OverridePcieParameters, smu_pcie_arg, + NULL); + PP_ASSERT_WITH_CODE(!ret, + "[OverridePcieParameters] Attempt to override pcie params failed!", + return ret); + } + + /* update the pptable */ + pp_table->PcieGenSpeed[i] = pcie_gen_arg; + pp_table->PcieLaneCount[i] = pcie_width_arg; + } + + return 0; +} + static int vega12_get_number_of_dpm_level(struct pp_hwmgr *hwmgr, PPCLK_e clk_id, uint32_t *num_of_levels) { @@ -969,6 +1030,11 @@ static int vega12_enable_dpm_tasks(struc "Failed to enable all smu features!", return result);
+ result = vega12_override_pcie_parameters(hwmgr); + PP_ASSERT_WITH_CODE(!result, + "[EnableDPMTasks] Failed to override pcie parameters!", + return result); + tmp_result = vega12_power_control_set_level(hwmgr); PP_ASSERT_WITH_CODE(!tmp_result, "Failed to power control set level!", --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c @@ -832,7 +832,9 @@ static int vega20_override_pcie_paramete struct amdgpu_device *adev = (struct amdgpu_device *)(hwmgr->adev); struct vega20_hwmgr *data = (struct vega20_hwmgr *)(hwmgr->backend); - uint32_t pcie_gen = 0, pcie_width = 0, smu_pcie_arg; + uint32_t pcie_gen = 0, pcie_width = 0, smu_pcie_arg, pcie_gen_arg, pcie_width_arg; + PPTable_t *pp_table = &(data->smc_state_table.pp_table); + int i; int ret;
if (adev->pm.pcie_gen_mask & CAIL_PCIE_LINK_SPEED_SUPPORT_GEN4) @@ -861,17 +863,27 @@ static int vega20_override_pcie_paramete * Bit 15:8: PCIE GEN, 0 to 3 corresponds to GEN1 to GEN4 * Bit 7:0: PCIE lane width, 1 to 7 corresponds is x1 to x32 */ - smu_pcie_arg = (1 << 16) | (pcie_gen << 8) | pcie_width; - ret = smum_send_msg_to_smc_with_parameter(hwmgr, - PPSMC_MSG_OverridePcieParameters, smu_pcie_arg, - NULL); - PP_ASSERT_WITH_CODE(!ret, - "[OverridePcieParameters] Attempt to override pcie params failed!", - return ret); + for (i = 0; i < NUM_LINK_LEVELS; i++) { + pcie_gen_arg = (pp_table->PcieGenSpeed[i] > pcie_gen) ? pcie_gen : + pp_table->PcieGenSpeed[i]; + pcie_width_arg = (pp_table->PcieLaneCount[i] > pcie_width) ? pcie_width : + pp_table->PcieLaneCount[i]; + + if (pcie_gen_arg != pp_table->PcieGenSpeed[i] || pcie_width_arg != + pp_table->PcieLaneCount[i]) { + smu_pcie_arg = (i << 16) | (pcie_gen_arg << 8) | pcie_width_arg; + ret = smum_send_msg_to_smc_with_parameter(hwmgr, + PPSMC_MSG_OverridePcieParameters, smu_pcie_arg, + NULL); + PP_ASSERT_WITH_CODE(!ret, + "[OverridePcieParameters] Attempt to override pcie params failed!", + return ret); + }
- data->pcie_parameters_override = true; - data->pcie_gen_level1 = pcie_gen; - data->pcie_width_level1 = pcie_width; + /* update the pptable */ + pp_table->PcieGenSpeed[i] = pcie_gen_arg; + pp_table->PcieLaneCount[i] = pcie_width_arg; + }
return 0; } @@ -3320,9 +3332,7 @@ static int vega20_print_clock_levels(str data->od8_settings.od8_settings_array; OverDriveTable_t *od_table = &(data->smc_state_table.overdrive_table); - struct phm_ppt_v3_information *pptable_information = - (struct phm_ppt_v3_information *)hwmgr->pptable; - PPTable_t *pptable = (PPTable_t *)pptable_information->smc_pptable; + PPTable_t *pptable = &(data->smc_state_table.pp_table); struct pp_clock_levels_with_latency clocks; struct vega20_single_dpm_table *fclk_dpm_table = &(data->dpm_table.fclk_table); @@ -3421,13 +3431,9 @@ static int vega20_print_clock_levels(str current_lane_width = vega20_get_current_pcie_link_width_level(hwmgr); for (i = 0; i < NUM_LINK_LEVELS; i++) { - if (i == 1 && data->pcie_parameters_override) { - gen_speed = data->pcie_gen_level1; - lane_width = data->pcie_width_level1; - } else { - gen_speed = pptable->PcieGenSpeed[i]; - lane_width = pptable->PcieLaneCount[i]; - } + gen_speed = pptable->PcieGenSpeed[i]; + lane_width = pptable->PcieLaneCount[i]; + size += sprintf(buf + size, "%d: %s %s %dMhz %s\n", i, (gen_speed == 0) ? "2.5GT/s," : (gen_speed == 1) ? "5.0GT/s," :
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Alex Deucher alexander.deucher@amd.com
commit a2f8d988698d7d3645b045f4940415b045140b81 upstream.
Avoid the extra wrapper function.
Reviewed-by: Nicholas Kazlauskas nicholas.kazlauskas@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 20 ++++---------------- 1 file changed, 4 insertions(+), 16 deletions(-)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -3132,19 +3132,6 @@ static void amdgpu_dm_update_backlight_c #endif }
-static int set_backlight_via_aux(struct dc_link *link, uint32_t brightness) -{ - bool rc; - - if (!link) - return 1; - - rc = dc_link_set_backlight_level_nits(link, true, brightness, - AUX_BL_DEFAULT_TRANSITION_TIME_MS); - - return rc ? 0 : 1; -} - static int get_brightness_range(const struct amdgpu_dm_backlight_caps *caps, unsigned *min, unsigned *max) { @@ -3207,9 +3194,10 @@ static int amdgpu_dm_backlight_update_st brightness = convert_brightness_from_user(&caps, bd->props.brightness); // Change brightness based on AUX property if (caps.aux_support) - return set_backlight_via_aux(link, brightness); - - rc = dc_link_set_backlight_level(dm->backlight_link, brightness, 0); + rc = dc_link_set_backlight_level_nits(link, true, brightness, + AUX_BL_DEFAULT_TRANSITION_TIME_MS); + else + rc = dc_link_set_backlight_level(dm->backlight_link, brightness, 0);
return rc ? 0 : 1; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Alex Deucher alexander.deucher@amd.com
commit dfd8b7fbd985ec1cf76fe10f2875a50b10833740 upstream.
It just spams the logs.
Reviewed-by: Nicholas Kazlauskas nicholas.kazlauskas@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/dc/core/dc_link.c | 1 - 1 file changed, 1 deletion(-)
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c @@ -2571,7 +2571,6 @@ bool dc_link_set_backlight_level(const s if (pipe_ctx->plane_state == NULL) frame_ramp = 0; } else { - ASSERT(false); return false; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Alex Deucher alexander.deucher@amd.com
commit 0ad3e64eb46d8c47de3af552e282894e3893e973 upstream.
Need to fetch it via aux.
Reviewed-by: Nicholas Kazlauskas nicholas.kazlauskas@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 24 ++++++++++++++++++---- 1 file changed, 20 insertions(+), 4 deletions(-)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -3205,11 +3205,27 @@ static int amdgpu_dm_backlight_update_st static int amdgpu_dm_backlight_get_brightness(struct backlight_device *bd) { struct amdgpu_display_manager *dm = bl_get_data(bd); - int ret = dc_link_get_backlight_level(dm->backlight_link); + struct amdgpu_dm_backlight_caps caps;
- if (ret == DC_ERROR_UNEXPECTED) - return bd->props.brightness; - return convert_brightness_to_user(&dm->backlight_caps, ret); + amdgpu_dm_update_backlight_caps(dm); + caps = dm->backlight_caps; + + if (caps.aux_support) { + struct dc_link *link = (struct dc_link *)dm->backlight_link; + u32 avg, peak; + bool rc; + + rc = dc_link_get_backlight_level_nits(link, &avg, &peak); + if (!rc) + return bd->props.brightness; + return convert_brightness_to_user(&caps, avg); + } else { + int ret = dc_link_get_backlight_level(dm->backlight_link); + + if (ret == DC_ERROR_UNEXPECTED) + return bd->props.brightness; + return convert_brightness_to_user(&caps, ret); + } }
static const struct backlight_ops amdgpu_dm_backlight_ops = {
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Neil Roberts nroberts@igalia.com
commit d611b4a0907cece060699f2fd347c492451cd2aa upstream.
When a buffer is madvised as not needed and then purged, any attempts to access the buffer from user-space should cause a bus fault. This patch adds a check for that.
Cc: stable@vger.kernel.org Fixes: 17acb9f35ed7 ("drm/shmem: Add madvise state and purge helpers") Signed-off-by: Neil Roberts nroberts@igalia.com Reviewed-by: Steven Price steven.price@arm.com Signed-off-by: Steven Price steven.price@arm.com Link: https://patchwork.freedesktop.org/patch/msgid/20210223155125.199577-2-nrober... Signed-off-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/drm_gem_shmem_helper.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-)
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c @@ -525,14 +525,24 @@ static vm_fault_t drm_gem_shmem_fault(st struct drm_gem_object *obj = vma->vm_private_data; struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj); loff_t num_pages = obj->size >> PAGE_SHIFT; + vm_fault_t ret; struct page *page;
- if (vmf->pgoff >= num_pages || WARN_ON_ONCE(!shmem->pages)) - return VM_FAULT_SIGBUS; + mutex_lock(&shmem->pages_lock);
- page = shmem->pages[vmf->pgoff]; + if (vmf->pgoff >= num_pages || + WARN_ON_ONCE(!shmem->pages) || + shmem->madv < 0) { + ret = VM_FAULT_SIGBUS; + } else { + page = shmem->pages[vmf->pgoff];
- return vmf_insert_page(vma, vmf->address, page); + ret = vmf_insert_page(vma, vmf->address, page); + } + + mutex_unlock(&shmem->pages_lock); + + return ret; }
static void drm_gem_shmem_vm_open(struct vm_area_struct *vma)
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Neil Roberts nroberts@igalia.com
commit 11d5a4745e00e73745774671dbf2fb07bd6e2363 upstream.
When mmapping the shmem, it would previously adjust the pgoff in the vm_area_struct to remove the fake offset that is added to be able to identify the buffer. This patch removes the adjustment and makes the fault handler use the vm_fault address to calculate the page offset instead. Although using this address is apparently discouraged, several DRM drivers seem to be doing it anyway.
The problem with removing the pgoff is that it prevents drm_vma_node_unmap from working because that searches the mapping tree by address. That doesn't work because all of the mappings are at offset 0. drm_vma_node_unmap is being used by the shmem helpers when purging the buffer.
This fixes a bug in Panfrost which is using drm_gem_shmem_purge. Without this the mapping for the purged buffer can still be accessed which might mean it would access random pages from other buffers
v2: Don't check whether the unsigned page_offset is less than 0.
Cc: stable@vger.kernel.org Fixes: 17acb9f35ed7 ("drm/shmem: Add madvise state and purge helpers") Signed-off-by: Neil Roberts nroberts@igalia.com Reviewed-by: Steven Price steven.price@arm.com Signed-off-by: Steven Price steven.price@arm.com Link: https://patchwork.freedesktop.org/patch/msgid/20210223155125.199577-3-nrober... Signed-off-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/drm_gem_shmem_helper.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c @@ -527,15 +527,19 @@ static vm_fault_t drm_gem_shmem_fault(st loff_t num_pages = obj->size >> PAGE_SHIFT; vm_fault_t ret; struct page *page; + pgoff_t page_offset; + + /* We don't use vmf->pgoff since that has the fake offset */ + page_offset = (vmf->address - vma->vm_start) >> PAGE_SHIFT;
mutex_lock(&shmem->pages_lock);
- if (vmf->pgoff >= num_pages || + if (page_offset >= num_pages || WARN_ON_ONCE(!shmem->pages) || shmem->madv < 0) { ret = VM_FAULT_SIGBUS; } else { - page = shmem->pages[vmf->pgoff]; + page = shmem->pages[page_offset];
ret = vmf_insert_page(vma, vmf->address, page); } @@ -591,9 +595,6 @@ int drm_gem_shmem_mmap(struct drm_gem_ob struct drm_gem_shmem_object *shmem; int ret;
- /* Remove the fake offset */ - vma->vm_pgoff -= drm_vma_node_start(&obj->vma_node); - if (obj->import_attach) { /* Drop the reference drm_gem_mmap_obj() acquired.*/ drm_gem_object_put(obj);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Thomas Zimmermann tzimmermann@suse.de
commit 659ab7a49cbebe0deffcbe1f9560e82006b21817 upstream.
USB devices cannot perform DMA and hence have no dma_mask set in their device structure. Therefore importing dmabuf into a USB-based driver fails, which breaks joining and mirroring of display in X11.
For USB devices, pick the associated USB controller as attachment device. This allows the DRM import helpers to perform the DMA setup. If the DMA controller does not support DMA transfers, we're out of luck and cannot import. Our current USB-based DRM drivers don't use DMA, so the actual DMA device is not important.
Tested by joining/mirroring displays of udl and radeon under Gnome/X11.
v8: * release dmadev if device initialization fails (Noralf) * fix commit description (Noralf) v7: * fix use-before-init bug in gm12u320 (Dan) v6: * implement workaround in DRM drivers and hold reference to DMA device while USB device is in use * remove dev_is_usb() (Greg) * collapse USB helper into usb_intf_get_dma_device() (Alan) * integrate Daniel's TODO statement (Daniel) * fix typos (Greg) v5: * provide a helper for USB interfaces (Alan) * add FIXME item to documentation and TODO list (Daniel) v4: * implement workaround with USB helper functions (Greg) * use struct usb_device->bus->sysdev as DMA device (Takashi) v3: * drop gem_create_object * use DMA mask of USB controller, if any (Daniel, Christian, Noralf) v2: * move fix to importer side (Christian, Daniel) * update SHMEM and CMA helpers for new PRIME callbacks
Signed-off-by: Thomas Zimmermann tzimmermann@suse.de Fixes: 6eb0233ec2d0 ("usb: don't inherity DMA properties for USB devices") Tested-by: Pavel Machek pavel@ucw.cz Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Acked-by: Christian König christian.koenig@amd.com Acked-by: Daniel Vetter daniel.vetter@ffwll.ch Acked-by: Noralf Trønnes noralf@tronnes.org Cc: Christoph Hellwig hch@lst.de Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: stable@vger.kernel.org # v5.10+ Signed-off-by: Thomas Zimmermann tzimmermann@suse.de Link: https://patchwork.freedesktop.org/patch/msgid/20210303133229.3288-1-tzimmerm... Signed-off-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/gpu/todo.rst | 21 +++++++++++++++++++ drivers/gpu/drm/tiny/gm12u320.c | 44 ++++++++++++++++++++++++++++++++-------- drivers/gpu/drm/udl/udl_drv.c | 17 +++++++++++++++ drivers/gpu/drm/udl/udl_drv.h | 1 drivers/gpu/drm/udl/udl_main.c | 10 +++++++++ drivers/usb/core/usb.c | 32 +++++++++++++++++++++++++++++ include/linux/usb.h | 2 + 7 files changed, 119 insertions(+), 8 deletions(-)
--- a/Documentation/gpu/todo.rst +++ b/Documentation/gpu/todo.rst @@ -594,6 +594,27 @@ Some of these date from the very introdu
Level: Intermediate
+Remove automatic page mapping from dma-buf importing +---------------------------------------------------- + +When importing dma-bufs, the dma-buf and PRIME frameworks automatically map +imported pages into the importer's DMA area. drm_gem_prime_fd_to_handle() and +drm_gem_prime_handle_to_fd() require that importers call dma_buf_attach() +even if they never do actual device DMA, but only CPU access through +dma_buf_vmap(). This is a problem for USB devices, which do not support DMA +operations. + +To fix the issue, automatic page mappings should be removed from the +buffer-sharing code. Fixing this is a bit more involved, since the import/export +cache is also tied to &drm_gem_object.import_attach. Meanwhile we paper over +this problem for USB devices by fishing out the USB host controller device, as +long as that supports DMA. Otherwise importing can still needlessly fail. + +Contact: Thomas Zimmermann tzimmermann@suse.de, Daniel Vetter + +Level: Advanced + + Better Testing ==============
--- a/drivers/gpu/drm/tiny/gm12u320.c +++ b/drivers/gpu/drm/tiny/gm12u320.c @@ -83,6 +83,7 @@ MODULE_PARM_DESC(eco_mode, "Turn on Eco
struct gm12u320_device { struct drm_device dev; + struct device *dmadev; struct drm_simple_display_pipe pipe; struct drm_connector conn; unsigned char *cmd_buf; @@ -601,6 +602,22 @@ static const uint64_t gm12u320_pipe_modi DRM_FORMAT_MOD_INVALID };
+/* + * FIXME: Dma-buf sharing requires DMA support by the importing device. + * This function is a workaround to make USB devices work as well. + * See todo.rst for how to fix the issue in the dma-buf framework. + */ +static struct drm_gem_object *gm12u320_gem_prime_import(struct drm_device *dev, + struct dma_buf *dma_buf) +{ + struct gm12u320_device *gm12u320 = to_gm12u320(dev); + + if (!gm12u320->dmadev) + return ERR_PTR(-ENODEV); + + return drm_gem_prime_import_dev(dev, dma_buf, gm12u320->dmadev); +} + DEFINE_DRM_GEM_FOPS(gm12u320_fops);
static const struct drm_driver gm12u320_drm_driver = { @@ -614,6 +631,7 @@ static const struct drm_driver gm12u320_
.fops = &gm12u320_fops, DRM_GEM_SHMEM_DRIVER_OPS, + .gem_prime_import = gm12u320_gem_prime_import, };
static const struct drm_mode_config_funcs gm12u320_mode_config_funcs = { @@ -640,15 +658,18 @@ static int gm12u320_usb_probe(struct usb struct gm12u320_device, dev); if (IS_ERR(gm12u320)) return PTR_ERR(gm12u320); + dev = &gm12u320->dev; + + gm12u320->dmadev = usb_intf_get_dma_device(to_usb_interface(dev->dev)); + if (!gm12u320->dmadev) + drm_warn(dev, "buffer sharing not supported"); /* not an error */
INIT_DELAYED_WORK(&gm12u320->fb_update.work, gm12u320_fb_update_work); mutex_init(&gm12u320->fb_update.lock);
- dev = &gm12u320->dev; - ret = drmm_mode_config_init(dev); if (ret) - return ret; + goto err_put_device;
dev->mode_config.min_width = GM12U320_USER_WIDTH; dev->mode_config.max_width = GM12U320_USER_WIDTH; @@ -658,15 +679,15 @@ static int gm12u320_usb_probe(struct usb
ret = gm12u320_usb_alloc(gm12u320); if (ret) - return ret; + goto err_put_device;
ret = gm12u320_set_ecomode(gm12u320); if (ret) - return ret; + goto err_put_device;
ret = gm12u320_conn_init(gm12u320); if (ret) - return ret; + goto err_put_device;
ret = drm_simple_display_pipe_init(&gm12u320->dev, &gm12u320->pipe, @@ -676,24 +697,31 @@ static int gm12u320_usb_probe(struct usb gm12u320_pipe_modifiers, &gm12u320->conn); if (ret) - return ret; + goto err_put_device;
drm_mode_config_reset(dev);
usb_set_intfdata(interface, dev); ret = drm_dev_register(dev, 0); if (ret) - return ret; + goto err_put_device;
drm_fbdev_generic_setup(dev, 0);
return 0; + +err_put_device: + put_device(gm12u320->dmadev); + return ret; }
static void gm12u320_usb_disconnect(struct usb_interface *interface) { struct drm_device *dev = usb_get_intfdata(interface); + struct gm12u320_device *gm12u320 = to_gm12u320(dev);
+ put_device(gm12u320->dmadev); + gm12u320->dmadev = NULL; drm_dev_unplug(dev); drm_atomic_helper_shutdown(dev); } --- a/drivers/gpu/drm/udl/udl_drv.c +++ b/drivers/gpu/drm/udl/udl_drv.c @@ -32,6 +32,22 @@ static int udl_usb_resume(struct usb_int return drm_mode_config_helper_resume(dev); }
+/* + * FIXME: Dma-buf sharing requires DMA support by the importing device. + * This function is a workaround to make USB devices work as well. + * See todo.rst for how to fix the issue in the dma-buf framework. + */ +static struct drm_gem_object *udl_driver_gem_prime_import(struct drm_device *dev, + struct dma_buf *dma_buf) +{ + struct udl_device *udl = to_udl(dev); + + if (!udl->dmadev) + return ERR_PTR(-ENODEV); + + return drm_gem_prime_import_dev(dev, dma_buf, udl->dmadev); +} + DEFINE_DRM_GEM_FOPS(udl_driver_fops);
static const struct drm_driver driver = { @@ -40,6 +56,7 @@ static const struct drm_driver driver = /* GEM hooks */ .fops = &udl_driver_fops, DRM_GEM_SHMEM_DRIVER_OPS, + .gem_prime_import = udl_driver_gem_prime_import,
.name = DRIVER_NAME, .desc = DRIVER_DESC, --- a/drivers/gpu/drm/udl/udl_drv.h +++ b/drivers/gpu/drm/udl/udl_drv.h @@ -50,6 +50,7 @@ struct urb_list { struct udl_device { struct drm_device drm; struct device *dev; + struct device *dmadev;
struct drm_simple_display_pipe display_pipe;
--- a/drivers/gpu/drm/udl/udl_main.c +++ b/drivers/gpu/drm/udl/udl_main.c @@ -315,6 +315,10 @@ int udl_init(struct udl_device *udl)
DRM_DEBUG("\n");
+ udl->dmadev = usb_intf_get_dma_device(to_usb_interface(dev->dev)); + if (!udl->dmadev) + drm_warn(dev, "buffer sharing not supported"); /* not an error */ + mutex_init(&udl->gem_lock);
if (!udl_parse_vendor_descriptor(udl)) { @@ -343,12 +347,18 @@ int udl_init(struct udl_device *udl) err: if (udl->urbs.count) udl_free_urb_list(dev); + put_device(udl->dmadev); DRM_ERROR("%d\n", ret); return ret; }
int udl_drop_usb(struct drm_device *dev) { + struct udl_device *udl = to_udl(dev); + udl_free_urb_list(dev); + put_device(udl->dmadev); + udl->dmadev = NULL; + return 0; } --- a/drivers/usb/core/usb.c +++ b/drivers/usb/core/usb.c @@ -748,6 +748,38 @@ void usb_put_intf(struct usb_interface * } EXPORT_SYMBOL_GPL(usb_put_intf);
+/** + * usb_intf_get_dma_device - acquire a reference on the usb interface's DMA endpoint + * @intf: the usb interface + * + * While a USB device cannot perform DMA operations by itself, many USB + * controllers can. A call to usb_intf_get_dma_device() returns the DMA endpoint + * for the given USB interface, if any. The returned device structure must be + * released with put_device(). + * + * See also usb_get_dma_device(). + * + * Returns: A reference to the usb interface's DMA endpoint; or NULL if none + * exists. + */ +struct device *usb_intf_get_dma_device(struct usb_interface *intf) +{ + struct usb_device *udev = interface_to_usbdev(intf); + struct device *dmadev; + + if (!udev->bus) + return NULL; + + dmadev = get_device(udev->bus->sysdev); + if (!dmadev || !dmadev->dma_mask) { + put_device(dmadev); + return NULL; + } + + return dmadev; +} +EXPORT_SYMBOL_GPL(usb_intf_get_dma_device); + /* USB device locking * * USB devices and interfaces are locked using the semaphore in their --- a/include/linux/usb.h +++ b/include/linux/usb.h @@ -746,6 +746,8 @@ extern int usb_lock_device_for_reset(str extern int usb_reset_device(struct usb_device *dev); extern void usb_queue_reset_device(struct usb_interface *dev);
+extern struct device *usb_intf_get_dma_device(struct usb_interface *intf); + #ifdef CONFIG_ACPI extern int usb_acpi_set_power_state(struct usb_device *hdev, int index, bool enable);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Alex Deucher alexander.deucher@amd.com
commit a5cb3c1a36376c25cd25fd3e99918dc48ac420bb upstream.
Need to check the module variant as well.
Acked-by: Prike Liang Prike.Liang@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c @@ -903,7 +903,7 @@ void amdgpu_acpi_fini(struct amdgpu_devi */ bool amdgpu_acpi_is_s0ix_supported(struct amdgpu_device *adev) { -#if defined(CONFIG_AMD_PMC) +#if defined(CONFIG_AMD_PMC) || defined(CONFIG_AMD_PMC_MODULE) if (acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0) { if (adev->flags & AMD_IS_APU) return true;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Artem Lapkin art@khadas.com
commit fa0c16caf3d73ab4d2e5d6fa2ef2394dbec91791 upstream.
Problem: random stucks on reboot stage about 1/20 stuck/reboots // debug kernel log [ 4.496660] reboot: kernel restart prepare CMD:(null) [ 4.498114] meson_ee_pwrc c883c000.system-controller:power-controller: shutdown begin [ 4.503949] meson_ee_pwrc c883c000.system-controller:power-controller: shutdown domain 0:VPU... ...STUCK...
Solution: add shutdown function to meson_drm driver // debug kernel log [ 5.231896] reboot: kernel restart prepare CMD:(null) [ 5.246135] [drm:meson_drv_shutdown] ... [ 5.259271] meson_ee_pwrc c883c000.system-controller:power-controller: shutdown begin [ 5.274688] meson_ee_pwrc c883c000.system-controller:power-controller: shutdown domain 0:VPU... [ 5.338331] reboot: Restarting system [ 5.358293] psci: PSCI_0_2_FN_SYSTEM_RESET reboot_mode:0 cmd:(null) bl31 reboot reason: 0xd bl31 reboot reason: 0x0 system cmd 1. ...REBOOT...
Tested: on VIM1 VIM2 VIM3 VIM3L khadas sbcs - 1000+ successful reboots and Odroid boards, WeTek Play2 (GXBB)
Fixes: bbbe775ec5b5 ("drm: Add support for Amlogic Meson Graphic Controller") Signed-off-by: Artem Lapkin art@khadas.com Tested-by: Christian Hewitt christianshewitt@gmail.com Acked-by: Neil Armstrong narmstrong@baylibre.com Acked-by: Kevin Hilman khilman@baylibre.com Signed-off-by: Neil Armstrong narmstrong@baylibre.com Link: https://patchwork.freedesktop.org/patch/msgid/20210302042202.3728113-1-art@k... Signed-off-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/meson/meson_drv.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
--- a/drivers/gpu/drm/meson/meson_drv.c +++ b/drivers/gpu/drm/meson/meson_drv.c @@ -482,6 +482,16 @@ static int meson_probe_remote(struct pla return count; }
+static void meson_drv_shutdown(struct platform_device *pdev) +{ + struct meson_drm *priv = dev_get_drvdata(&pdev->dev); + struct drm_device *drm = priv->drm; + + DRM_DEBUG_DRIVER("\n"); + drm_kms_helper_poll_fini(drm); + drm_atomic_helper_shutdown(drm); +} + static int meson_drv_probe(struct platform_device *pdev) { struct component_match *match = NULL; @@ -553,6 +563,7 @@ static const struct dev_pm_ops meson_drv
static struct platform_driver meson_drm_platform_driver = { .probe = meson_drv_probe, + .shutdown = meson_drv_shutdown, .driver = { .name = "meson-drm", .of_match_table = dt_match,
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Noralf Trønnes noralf@tronnes.org
commit 64e194e278673bceb68fb2dde7dbc3d812bfceb3 upstream.
dma-buf importing was reworked in commit 7d2cd72a9aa3 ("drm/shmem-helpers: Simplify dma-buf importing"). Before that commit drm_gem_shmem_prime_import_sg_table() did set ->pages_use_count=1 and drm_gem_shmem_vunmap_locked() could call drm_gem_shmem_put_pages() unconditionally. Now without the use count set, put pages is called also on dma-bufs. Fix this by only putting pages if it's not imported.
Signed-off-by: Noralf Trønnes noralf@tronnes.org Fixes: 7d2cd72a9aa3 ("drm/shmem-helpers: Simplify dma-buf importing") Cc: Daniel Vetter daniel.vetter@ffwll.ch Cc: Thomas Zimmermann tzimmermann@suse.de Acked-by: Thomas Zimmermann tzimmermann@suse.de Tested-by: Thomas Zimmermann tzimmermann@suse.de Link: https://patchwork.freedesktop.org/patch/msgid/20210219122203.51130-1-noralf@... (cherry picked from commit cdea72518a2b38207146e92e1c9e2fac15975679) Signed-off-by: Thomas Zimmermann tzimmermann@suse.de Signed-off-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/drm_gem_shmem_helper.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c @@ -357,13 +357,14 @@ static void drm_gem_shmem_vunmap_locked( if (--shmem->vmap_use_count > 0) return;
- if (obj->import_attach) + if (obj->import_attach) { dma_buf_vunmap(obj->import_attach->dmabuf, map); - else + } else { vunmap(shmem->vaddr); + drm_gem_shmem_put_pages(shmem); + }
shmem->vaddr = NULL; - drm_gem_shmem_put_pages(shmem); }
/*
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Tvrtko Ursulin tvrtko.ursulin@intel.com
commit a829f033e966d5e4aa27c3ef2b381f51734e4a7f upstream.
Commit 311a50e76a33 ("drm/i915: Add support for mandatory cmdparsing") introduced mandatory command parsing but setup failures were not translated into wedging the GPU which was probably the intent.
Possible errors come in two categories. Either the sanity check on internal tables has failed, which should be caught in CI unless an affected platform would be missed in testing; or memory allocation failure happened during driver load, which should be extremely unlikely but for correctness should still be handled.
v2: * Tidy coding style. (Chris)
[airlied: cherry-picked to avoid rc1 base] Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Fixes: 311a50e76a33 ("drm/i915: Add support for mandatory cmdparsing") Cc: Jon Bloomfield jon.bloomfield@intel.com Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com Cc: Chris Wilson chris.p.wilson@intel.com Reviewed-by: Chris Wilson chris.p.wilson@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20210302114213.1102223-1-tvrtk... (cherry picked from commit 5a1a659762d35a6dc51047c9127c011303c77b7f) Signed-off-by: Rodrigo Vivi rodrigo.vivi@intel.com Signed-off-by: Dave Airlie airlied@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 7 ++++++- drivers/gpu/drm/i915/i915_cmd_parser.c | 19 +++++++++++++------ drivers/gpu/drm/i915/i915_drv.h | 2 +- 3 files changed, 20 insertions(+), 8 deletions(-)
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -709,9 +709,12 @@ static int engine_setup_common(struct in goto err_status; }
+ err = intel_engine_init_cmd_parser(engine); + if (err) + goto err_cmd_parser; + intel_engine_init_active(engine, ENGINE_PHYSICAL); intel_engine_init_execlists(engine); - intel_engine_init_cmd_parser(engine); intel_engine_init__pm(engine); intel_engine_init_retire(engine);
@@ -725,6 +728,8 @@ static int engine_setup_common(struct in
return 0;
+err_cmd_parser: + intel_breadcrumbs_free(engine->breadcrumbs); err_status: cleanup_status_page(engine); return err; --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -939,7 +939,7 @@ static void fini_hash_table(struct intel * struct intel_engine_cs based on whether the platform requires software * command parsing. */ -void intel_engine_init_cmd_parser(struct intel_engine_cs *engine) +int intel_engine_init_cmd_parser(struct intel_engine_cs *engine) { const struct drm_i915_cmd_table *cmd_tables; int cmd_table_count; @@ -947,7 +947,7 @@ void intel_engine_init_cmd_parser(struct
if (!IS_GEN(engine->i915, 7) && !(IS_GEN(engine->i915, 9) && engine->class == COPY_ENGINE_CLASS)) - return; + return 0;
switch (engine->class) { case RENDER_CLASS: @@ -1012,19 +1012,19 @@ void intel_engine_init_cmd_parser(struct break; default: MISSING_CASE(engine->class); - return; + goto out; }
if (!validate_cmds_sorted(engine, cmd_tables, cmd_table_count)) { drm_err(&engine->i915->drm, "%s: command descriptions are not sorted\n", engine->name); - return; + goto out; } if (!validate_regs_sorted(engine)) { drm_err(&engine->i915->drm, "%s: registers are not sorted\n", engine->name); - return; + goto out; }
ret = init_hash_table(engine, cmd_tables, cmd_table_count); @@ -1032,10 +1032,17 @@ void intel_engine_init_cmd_parser(struct drm_err(&engine->i915->drm, "%s: initialised failed!\n", engine->name); fini_hash_table(engine); - return; + goto out; }
engine->flags |= I915_ENGINE_USING_CMD_PARSER; + +out: + if (intel_engine_requires_cmd_parser(engine) && + !intel_engine_using_cmd_parser(engine)) + return -EINVAL; + + return 0; }
/** --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1947,7 +1947,7 @@ const char *i915_cache_level_str(struct
/* i915_cmd_parser.c */ int i915_cmd_parser_get_version(struct drm_i915_private *dev_priv); -void intel_engine_init_cmd_parser(struct intel_engine_cs *engine); +int intel_engine_init_cmd_parser(struct intel_engine_cs *engine); void intel_engine_cleanup_cmd_parser(struct intel_engine_cs *engine); int intel_engine_cmd_parser(struct intel_engine_cs *engine, struct i915_vma *batch,
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Eric Farman farman@linux.ibm.com
commit d9c48a948d29bcb22f4fe61a81b718ef6de561a0 upstream.
Fixes: 120e214e504f ("vfio: ccw: realize VFIO_DEVICE_G(S)ET_IRQ_INFO ioctls") Signed-off-by: Eric Farman farman@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/s390/cio/vfio_ccw_ops.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/s390/cio/vfio_ccw_ops.c +++ b/drivers/s390/cio/vfio_ccw_ops.c @@ -582,7 +582,7 @@ static ssize_t vfio_ccw_mdev_ioctl(struc if (info.count == -1) return -EINVAL;
- return copy_to_user((void __user *)arg, &info, minsz); + return copy_to_user((void __user *)arg, &info, minsz) ? -EFAULT : 0; } case VFIO_DEVICE_SET_IRQS: {
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Wang Qing wangqing@vivo.com
commit 942df4be7ab40195e2a839e9de81951a5862bc5b upstream.
The copy_to_user() function returns the number of bytes remaining to be copied, but we want to return -EFAULT if the copy doesn't complete.
Fixes: e06670c5fe3b ("s390: vfio-ap: implement VFIO_DEVICE_GET_INFO ioctl") Signed-off-by: Wang Qing wangqing@vivo.com Reviewed-by: Tony Krowiak akrowiak@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Link: https://lore.kernel.org/r/1614600502-16714-1-git-send-email-wangqing@vivo.co... Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/s390/crypto/vfio_ap_ops.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -1286,7 +1286,7 @@ static int vfio_ap_mdev_get_device_info( info.num_regions = 0; info.num_irqs = 0;
- return copy_to_user((void __user *)arg, &info, minsz); + return copy_to_user((void __user *)arg, &info, minsz) ? -EFAULT : 0; }
static ssize_t vfio_ap_mdev_ioctl(struct mdev_device *mdev,
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Colin Ian King colin.king@canonical.com
commit 738acd49eb018feb873e0fac8f9517493f6ce2c7 upstream.
The surface_id struct field in head is not being initialized and static analysis warns that this is being passed through to dev->monitors_config->heads[i] on an assignment. Clear up this warning by initializing it to zero.
Addresses-Coverity: ("Uninitialized scalar variable") Fixes: a6d3c4d79822 ("qxl: hook monitors_config updates into crtc, not encoder.") Signed-off-by: Colin Ian King colin.king@canonical.com Link: http://patchwork.freedesktop.org/patch/msgid/20210304094928.2280722-1-colin.... Signed-off-by: Gerd Hoffmann kraxel@redhat.com Signed-off-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/qxl/qxl_display.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/gpu/drm/qxl/qxl_display.c +++ b/drivers/gpu/drm/qxl/qxl_display.c @@ -328,6 +328,7 @@ static void qxl_crtc_update_monitors_con
head.id = i; head.flags = 0; + head.surface_id = 0; oldcount = qdev->monitors_config->count; if (crtc->state->active) { struct drm_display_mode *mode = &crtc->mode;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Sergey Shtylyov s.shtylyov@omprussia.ru
commit 165bc5a4f30eee4735845aa7dbd6b738643f2603 upstream.
According to the RZ/A2M Group User's Manual: Hardware, Rev. 2.00, the TRSCER register has bit 9 reserved, hence we can't use the driver's default TRSCER mask. Add the explicit initializer for sh_eth_cpu_data:: trscer_err_mask for R7S9210.
Fixes: 6e0bb04d0e4f ("sh_eth: Add R7S9210 support") Signed-off-by: Sergey Shtylyov s.shtylyov@omprussia.ru Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/renesas/sh_eth.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/net/ethernet/renesas/sh_eth.c +++ b/drivers/net/ethernet/renesas/sh_eth.c @@ -780,6 +780,8 @@ static struct sh_eth_cpu_data r7s9210_da
.fdr_value = 0x0000070f,
+ .trscer_err_mask = DESC_I_RINT8 | DESC_I_RINT5, + .apr = 1, .mpr = 1, .tpauser = 1,
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Maxim Mikityanskiy maxtram95@gmail.com
commit 8a7e27fd5cd696ba564a3f62cedef7269cfd0723 upstream.
usbtv doesn't support power management, so on system suspend the .disconnect callback of the driver is called. The teardown sequence includes a call to snd_card_free. Its implementation waits until the refcount of the sound card device drops to zero, however, if its file is open, snd_card_file_add takes a reference, which can't be dropped during the suspend, because the userspace processes are already frozen at this point. snd_card_free waits for completion forever, leading to a hang on suspend.
This commit fixes this deadlock condition by replacing snd_card_free with snd_card_free_when_closed, that doesn't wait until all references are released, allowing suspend to progress.
Fixes: 63ddf68de52e ("[media] usbtv: add audio support") Signed-off-by: Maxim Mikityanskiy maxtram95@gmail.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+huawei@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/usb/usbtv/usbtv-audio.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/media/usb/usbtv/usbtv-audio.c +++ b/drivers/media/usb/usbtv/usbtv-audio.c @@ -371,7 +371,7 @@ void usbtv_audio_free(struct usbtv *usbt cancel_work_sync(&usbtv->snd_trigger);
if (usbtv->snd && usbtv->udev) { - snd_card_free(usbtv->snd); + snd_card_free_when_closed(usbtv->snd); usbtv->snd = NULL; } }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Dafna Hirschfeld dafna.hirschfeld@collabora.com
commit 2025a48cfd92d541c5ee47deee97f8a46d00c4ac upstream.
The histogram mode is set using 'rkisp1_params_set_bits'. Only the bits of the mode should be the value argument for that function. Otherwise bits outside the mode mask are turned on which is not what was intended.
Fixes: bae1155cf579 ("media: staging: rkisp1: add output device for parameters") Signed-off-by: Dafna Hirschfeld dafna.hirschfeld@collabora.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+huawei@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/rockchip/rkisp1/rkisp1-params.c | 1 - 1 file changed, 1 deletion(-)
--- a/drivers/media/platform/rockchip/rkisp1/rkisp1-params.c +++ b/drivers/media/platform/rockchip/rkisp1/rkisp1-params.c @@ -1288,7 +1288,6 @@ static void rkisp1_params_config_paramet memset(hst.hist_weight, 0x01, sizeof(hst.hist_weight)); rkisp1_hst_config(params, &hst); rkisp1_param_set_bits(params, RKISP1_CIF_ISP_HIST_PROP, - ~RKISP1_CIF_ISP_HIST_PROP_MODE_MASK | rkisp1_hst_params_default_config.mode);
/* set the range */
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Biju Das biju.das.jz@bp.renesas.com
commit 6732f313938027a910e1f7351951ff52c0329e70 upstream.
RZ/G2L SoC has no UIF. This patch fixes null pointer access, when UIF module is not used.
Fixes: 5e824f989e6e8("media: v4l: vsp1: Integrate DISCOM in display pipeline") Signed-off-by: Biju Das biju.das.jz@bp.renesas.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+huawei@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/vsp1/vsp1_drm.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/media/platform/vsp1/vsp1_drm.c +++ b/drivers/media/platform/vsp1/vsp1_drm.c @@ -462,9 +462,9 @@ static int vsp1_du_pipeline_setup_inputs * make sure it is present in the pipeline's list of entities if it * wasn't already. */ - if (!use_uif) { + if (drm_pipe->uif && !use_uif) { drm_pipe->uif->pipe = NULL; - } else if (!drm_pipe->uif->pipe) { + } else if (drm_pipe->uif && !drm_pipe->uif->pipe) { drm_pipe->uif->pipe = pipe; list_add_tail(&drm_pipe->uif->list_pipe, &pipe->entities); }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Biju Das biju.das.jz@bp.renesas.com
commit ac8d82f586c8692b501cb974604a71ef0e22a04c upstream.
RZ/G2L SoC has only BRS. This patch fixes null pointer access,when only BRS is enabled.
Fixes: cbb7fa49c7466("media: v4l: vsp1: Rename BRU to BRx") Signed-off-by: Biju Das biju.das.jz@bp.renesas.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+huawei@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/vsp1/vsp1_drm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/media/platform/vsp1/vsp1_drm.c +++ b/drivers/media/platform/vsp1/vsp1_drm.c @@ -245,7 +245,7 @@ static int vsp1_du_pipeline_setup_brx(st brx = &vsp1->bru->entity; else if (pipe->brx && !drm_pipe->force_brx_release) brx = pipe->brx; - else if (!vsp1->bru->entity.pipe) + else if (vsp1_feature(vsp1, VSP1_HAS_BRU) && !vsp1->bru->entity.pipe) brx = &vsp1->bru->entity; else brx = &vsp1->brs->entity;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Hans Verkuil hverkuil@xs4all.nl
commit f09f9f93afad770a04b35235a0aa465fcc8d6e3d upstream.
The rc-cec keymap is unusual in that it can't be built as a module, instead it is registered directly in rc-main.c if CONFIG_MEDIA_CEC_RC is set. This is because it can be called from drm_dp_cec_set_edid() via cec_register_adapter() in an asynchronous context, and it is not allowed to use request_module() to load rc-cec.ko in that case. Trying to do so results in a 'WARN_ON_ONCE(wait && current_is_async())'.
Since this keymap is only used if CONFIG_MEDIA_CEC_RC is set, we just compile this keymap into the rc-core module and never as a separate module.
Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Fixes: 2c6d1fffa1d9 (drm: add support for DisplayPort CEC-Tunneling-over-AUX) Reported-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sean Young sean@mess.org Signed-off-by: Mauro Carvalho Chehab mchehab+huawei@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/rc/Makefile | 1 + drivers/media/rc/keymaps/Makefile | 1 - drivers/media/rc/keymaps/rc-cec.c | 28 +++++++++++----------------- drivers/media/rc/rc-main.c | 6 ++++++ include/media/rc-map.h | 7 +++++++ 5 files changed, 25 insertions(+), 18 deletions(-)
--- a/drivers/media/rc/Makefile +++ b/drivers/media/rc/Makefile @@ -5,6 +5,7 @@ obj-y += keymaps/ obj-$(CONFIG_RC_CORE) += rc-core.o rc-core-y := rc-main.o rc-ir-raw.o rc-core-$(CONFIG_LIRC) += lirc_dev.o +rc-core-$(CONFIG_MEDIA_CEC_RC) += keymaps/rc-cec.o rc-core-$(CONFIG_BPF_LIRC_MODE2) += bpf-lirc.o obj-$(CONFIG_IR_NEC_DECODER) += ir-nec-decoder.o obj-$(CONFIG_IR_RC5_DECODER) += ir-rc5-decoder.o --- a/drivers/media/rc/keymaps/Makefile +++ b/drivers/media/rc/keymaps/Makefile @@ -21,7 +21,6 @@ obj-$(CONFIG_RC_MAP) += rc-adstech-dvb-t rc-behold.o \ rc-behold-columbus.o \ rc-budget-ci-old.o \ - rc-cec.o \ rc-cinergy-1400.o \ rc-cinergy.o \ rc-d680-dmb.o \ --- a/drivers/media/rc/keymaps/rc-cec.c +++ b/drivers/media/rc/keymaps/rc-cec.c @@ -1,6 +1,16 @@ // SPDX-License-Identifier: GPL-2.0-or-later /* Keytable for the CEC remote control * + * This keymap is unusual in that it can't be built as a module, + * instead it is registered directly in rc-main.c if CONFIG_MEDIA_CEC_RC + * is set. This is because it can be called from drm_dp_cec_set_edid() via + * cec_register_adapter() in an asynchronous context, and it is not + * allowed to use request_module() to load rc-cec.ko in that case. + * + * Since this keymap is only used if CONFIG_MEDIA_CEC_RC is set, we + * just compile this keymap into the rc-core module and never as a + * separate module. + * * Copyright (c) 2015 by Kamil Debski */
@@ -152,7 +162,7 @@ static struct rc_map_table cec[] = { /* 0x77-0xff: Reserved */ };
-static struct rc_map_list cec_map = { +struct rc_map_list cec_map = { .map = { .scan = cec, .size = ARRAY_SIZE(cec), @@ -160,19 +170,3 @@ static struct rc_map_list cec_map = { .name = RC_MAP_CEC, } }; - -static int __init init_rc_map_cec(void) -{ - return rc_map_register(&cec_map); -} - -static void __exit exit_rc_map_cec(void) -{ - rc_map_unregister(&cec_map); -} - -module_init(init_rc_map_cec); -module_exit(exit_rc_map_cec); - -MODULE_LICENSE("GPL"); -MODULE_AUTHOR("Kamil Debski"); --- a/drivers/media/rc/rc-main.c +++ b/drivers/media/rc/rc-main.c @@ -2069,6 +2069,9 @@ static int __init rc_core_init(void)
led_trigger_register_simple("rc-feedback", &led_feedback); rc_map_register(&empty_map); +#ifdef CONFIG_MEDIA_CEC_RC + rc_map_register(&cec_map); +#endif
return 0; } @@ -2078,6 +2081,9 @@ static void __exit rc_core_exit(void) lirc_dev_exit(); class_unregister(&rc_class); led_trigger_unregister_simple(led_feedback); +#ifdef CONFIG_MEDIA_CEC_RC + rc_map_unregister(&cec_map); +#endif rc_map_unregister(&empty_map); }
--- a/include/media/rc-map.h +++ b/include/media/rc-map.h @@ -175,6 +175,13 @@ struct rc_map_list { struct rc_map map; };
+#ifdef CONFIG_MEDIA_CEC_RC +/* + * rc_map_list from rc-cec.c + */ +extern struct rc_map_list cec_map; +#endif + /* Routines from rc-map.c */
/**
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Thomas Bogendoerfer tsbogend@alpha.franken.de
[ Upstream commit bd67b711bfaa02cf19e88aa2d9edae5c1c1d2739 ]
BMIPS is one of the few platforms that do change the exception base. After commit 2dcb39645441 ("memblock: do not start bottom-up allocations with kernel_end") we started seeing BMIPS boards fail to boot with the built-in FDT being corrupted.
Before the cited commit, early allocations would be in the [kernel_end, RAM_END] range, but after commit they would be within [RAM_START + PAGE_SIZE, RAM_END].
The custom exception base handler that is installed by bmips_ebase_setup() done for BMIPS5000 CPUs ends-up trampling on the memory region allocated by unflatten_and_copy_device_tree() thus corrupting the FDT used by the kernel.
To fix this, we need to perform an early reservation of the custom exception space. Additional we reserve the first 4k (1k for R3k) for either normal exception vector space (legacy CPUs) or special vectors like cache exceptions.
Huge thanks to Serge for analysing and proposing a solution to this issue.
Fixes: 2dcb39645441 ("memblock: do not start bottom-up allocations with kernel_end") Reported-by: Kamal Dasu kdasu.kdev@gmail.com Debugged-by: Serge Semin Sergey.Semin@baikalelectronics.ru Acked-by: Mike Rapoport rppt@linux.ibm.com Tested-by: Florian Fainelli f.fainelli@gmail.com Reviewed-by: Serge Semin fancer.lancer@gmail.com Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/mips/include/asm/traps.h | 3 +++ arch/mips/kernel/cpu-probe.c | 6 ++++++ arch/mips/kernel/cpu-r3k-probe.c | 3 +++ arch/mips/kernel/traps.c | 10 +++++----- 4 files changed, 17 insertions(+), 5 deletions(-)
diff --git a/arch/mips/include/asm/traps.h b/arch/mips/include/asm/traps.h index 6a0864bb604d..9038b91e2d8c 100644 --- a/arch/mips/include/asm/traps.h +++ b/arch/mips/include/asm/traps.h @@ -24,6 +24,9 @@ extern void (*board_ebase_setup)(void); extern void (*board_cache_error_setup)(void);
extern int register_nmi_notifier(struct notifier_block *nb); +extern void reserve_exception_space(phys_addr_t addr, unsigned long size); + +#define VECTORSPACING 0x100 /* for EI/VI mode */
#define nmi_notifier(fn, pri) \ ({ \ diff --git a/arch/mips/kernel/cpu-probe.c b/arch/mips/kernel/cpu-probe.c index 31cb9199197c..21794db53c05 100644 --- a/arch/mips/kernel/cpu-probe.c +++ b/arch/mips/kernel/cpu-probe.c @@ -26,6 +26,7 @@ #include <asm/elf.h> #include <asm/pgtable-bits.h> #include <asm/spram.h> +#include <asm/traps.h> #include <linux/uaccess.h>
#include "fpu-probe.h" @@ -1619,6 +1620,7 @@ static inline void cpu_probe_broadcom(struct cpuinfo_mips *c, unsigned int cpu) c->cputype = CPU_BMIPS3300; __cpu_name[cpu] = "Broadcom BMIPS3300"; set_elf_platform(cpu, "bmips3300"); + reserve_exception_space(0x400, VECTORSPACING * 64); break; case PRID_IMP_BMIPS43XX: { int rev = c->processor_id & PRID_REV_MASK; @@ -1629,6 +1631,7 @@ static inline void cpu_probe_broadcom(struct cpuinfo_mips *c, unsigned int cpu) __cpu_name[cpu] = "Broadcom BMIPS4380"; set_elf_platform(cpu, "bmips4380"); c->options |= MIPS_CPU_RIXI; + reserve_exception_space(0x400, VECTORSPACING * 64); } else { c->cputype = CPU_BMIPS4350; __cpu_name[cpu] = "Broadcom BMIPS4350"; @@ -1645,6 +1648,7 @@ static inline void cpu_probe_broadcom(struct cpuinfo_mips *c, unsigned int cpu) __cpu_name[cpu] = "Broadcom BMIPS5000"; set_elf_platform(cpu, "bmips5000"); c->options |= MIPS_CPU_ULRI | MIPS_CPU_RIXI; + reserve_exception_space(0x1000, VECTORSPACING * 64); break; } } @@ -2124,6 +2128,8 @@ void cpu_probe(void) if (cpu == 0) __ua_limit = ~((1ull << cpu_vmbits) - 1); #endif + + reserve_exception_space(0, 0x1000); }
void cpu_report(void) diff --git a/arch/mips/kernel/cpu-r3k-probe.c b/arch/mips/kernel/cpu-r3k-probe.c index abdbbe8c5a43..af654771918c 100644 --- a/arch/mips/kernel/cpu-r3k-probe.c +++ b/arch/mips/kernel/cpu-r3k-probe.c @@ -21,6 +21,7 @@ #include <asm/fpu.h> #include <asm/mipsregs.h> #include <asm/elf.h> +#include <asm/traps.h>
#include "fpu-probe.h"
@@ -158,6 +159,8 @@ void cpu_probe(void) cpu_set_fpu_opts(c); else cpu_set_nofpu_opts(c); + + reserve_exception_space(0, 0x400); }
void cpu_report(void) diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c index e0352958e2f7..808b8b61ded1 100644 --- a/arch/mips/kernel/traps.c +++ b/arch/mips/kernel/traps.c @@ -2009,13 +2009,16 @@ void __noreturn nmi_exception_handler(struct pt_regs *regs) nmi_exit(); }
-#define VECTORSPACING 0x100 /* for EI/VI mode */ - unsigned long ebase; EXPORT_SYMBOL_GPL(ebase); unsigned long exception_handlers[32]; unsigned long vi_handlers[64];
+void reserve_exception_space(phys_addr_t addr, unsigned long size) +{ + memblock_reserve(addr, size); +} + void __init *set_except_vector(int n, void *addr) { unsigned long handler = (unsigned long) addr; @@ -2367,10 +2370,7 @@ void __init trap_init(void)
if (!cpu_has_mips_r2_r6) { ebase = CAC_BASE; - ebase_pa = virt_to_phys((void *)ebase); vec_size = 0x400; - - memblock_reserve(ebase_pa, vec_size); } else { if (cpu_has_veic || cpu_has_vint) vec_size = 0x200 + VECTORSPACING*64;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit 866f26f2a9c33bc70eb0f07ffc37fd9424ffe501 ]
Currently, incoming subflows link to the parent socket, while outgoing ones link to a per subflow socket. The latter is not really needed, except at the initial connect() time and for the first subflow.
Always graft the outgoing subflow to the parent socket and free the unneeded ones early.
This allows some code cleanup, reduces the amount of memory used and will simplify the next patch
Reviewed-by: Mat Martineau mathew.j.martineau@linux.intel.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/mptcp/protocol.c | 36 ++++++++++-------------------------- net/mptcp/protocol.h | 1 + net/mptcp/subflow.c | 3 +++ 3 files changed, 14 insertions(+), 26 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index b51872b9dd61..3cc7be259396 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -114,11 +114,7 @@ static int __mptcp_socket_create(struct mptcp_sock *msk) list_add(&subflow->node, &msk->conn_list); sock_hold(ssock->sk); subflow->request_mptcp = 1; - - /* accept() will wait on first subflow sk_wq, and we always wakes up - * via msk->sk_socket - */ - RCU_INIT_POINTER(msk->first->sk_wq, &sk->sk_socket->wq); + mptcp_sock_graft(msk->first, sk->sk_socket);
return 0; } @@ -2114,9 +2110,6 @@ static struct sock *mptcp_subflow_get_retrans(const struct mptcp_sock *msk) void __mptcp_close_ssk(struct sock *sk, struct sock *ssk, struct mptcp_subflow_context *subflow) { - bool dispose_socket = false; - struct socket *sock; - list_del(&subflow->node);
lock_sock_nested(ssk, SINGLE_DEPTH_NESTING); @@ -2124,11 +2117,8 @@ void __mptcp_close_ssk(struct sock *sk, struct sock *ssk, /* if we are invoked by the msk cleanup code, the subflow is * already orphaned */ - sock = ssk->sk_socket; - if (sock) { - dispose_socket = sock != sk->sk_socket; + if (ssk->sk_socket) sock_orphan(ssk); - }
subflow->disposable = 1;
@@ -2146,8 +2136,6 @@ void __mptcp_close_ssk(struct sock *sk, struct sock *ssk, __sock_put(ssk); } release_sock(ssk); - if (dispose_socket) - iput(SOCK_INODE(sock));
sock_put(ssk); } @@ -2535,6 +2523,12 @@ static void __mptcp_destroy_sock(struct sock *sk)
pr_debug("msk=%p", msk);
+ /* dispose the ancillatory tcp socket, if any */ + if (msk->subflow) { + iput(SOCK_INODE(msk->subflow)); + msk->subflow = NULL; + } + /* be sure to always acquire the join list lock, to sync vs * mptcp_finish_join(). */ @@ -2585,20 +2579,10 @@ cleanup: inet_csk(sk)->icsk_mtup.probe_timestamp = tcp_jiffies32; list_for_each_entry(subflow, &mptcp_sk(sk)->conn_list, node) { struct sock *ssk = mptcp_subflow_tcp_sock(subflow); - bool slow, dispose_socket; - struct socket *sock; + bool slow = lock_sock_fast(ssk);
- slow = lock_sock_fast(ssk); - sock = ssk->sk_socket; - dispose_socket = sock && sock != sk->sk_socket; sock_orphan(ssk); unlock_sock_fast(ssk, slow); - - /* for the outgoing subflows we additionally need to free - * the associated socket - */ - if (dispose_socket) - iput(SOCK_INODE(sock)); } sock_orphan(sk);
@@ -3040,7 +3024,7 @@ void mptcp_finish_connect(struct sock *ssk) mptcp_rcv_space_init(msk, ssk); }
-static void mptcp_sock_graft(struct sock *sk, struct socket *parent) +void mptcp_sock_graft(struct sock *sk, struct socket *parent) { write_lock_bh(&sk->sk_callback_lock); rcu_assign_pointer(sk->sk_wq, &parent->wq); diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index dbf62e74fcc1..18fef4273bdc 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -460,6 +460,7 @@ void mptcp_subflow_shutdown(struct sock *sk, struct sock *ssk, int how); void __mptcp_close_ssk(struct sock *sk, struct sock *ssk, struct mptcp_subflow_context *subflow); void mptcp_subflow_reset(struct sock *ssk); +void mptcp_sock_graft(struct sock *sk, struct socket *parent);
/* called with sk socket lock held */ int __mptcp_subflow_connect(struct sock *sk, const struct mptcp_addr_info *loc, diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index 9d28f6e3dc49..81b7be67d288 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -1165,6 +1165,9 @@ int __mptcp_subflow_connect(struct sock *sk, const struct mptcp_addr_info *loc, if (err && err != -EINPROGRESS) goto failed_unlink;
+ /* discard the subflow socket */ + mptcp_sock_graft(ssk, sk->sk_socket); + iput(SOCK_INODE(sf)); return err;
failed_unlink:
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Florian Westphal fw@strlen.de
[ Upstream commit e0be4931f3fee2e04dec4013ea4f27ec2db8556f ]
Send logic caches last active subflow in the msk, so it needs to be cleared when the cached subflow is closed.
Fixes: d5f49190def61c ("mptcp: allow picking different xmit subflows") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/155 Reported-by: Christoph Paasch cpaasch@apple.com Acked-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Mat Martineau mathew.j.martineau@linux.intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/mptcp/protocol.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 3cc7be259396..de89824a2a36 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2110,6 +2110,8 @@ static struct sock *mptcp_subflow_get_retrans(const struct mptcp_sock *msk) void __mptcp_close_ssk(struct sock *sk, struct sock *ssk, struct mptcp_subflow_context *subflow) { + struct mptcp_sock *msk = mptcp_sk(sk); + list_del(&subflow->node);
lock_sock_nested(ssk, SINGLE_DEPTH_NESTING); @@ -2138,6 +2140,9 @@ void __mptcp_close_ssk(struct sock *sk, struct sock *ssk, release_sock(ssk);
sock_put(ssk); + + if (ssk == msk->last_snd) + msk->last_snd = NULL; }
static unsigned int mptcp_sync_mss(struct sock *sk, u32 pmtu)
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Wolfram Sang wsa+renesas@sang-engineering.com
[ Upstream commit c7b514ec979e23a08c411f3d8ed39c7922751422 ]
To avoid the HW race condition on R-Car Gen2 and earlier, we need to write to ICMCR as soon as possible in the interrupt handler. We can improve this by writing a static value instead of masking out bits.
Signed-off-by: Wolfram Sang wsa+renesas@sang-engineering.com Reviewed-by: Niklas Söderlund niklas.soderlund+renesas@ragnatech.se Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-rcar.c | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/drivers/i2c/busses/i2c-rcar.c b/drivers/i2c/busses/i2c-rcar.c index 217def2d7cb4..824586d7ee56 100644 --- a/drivers/i2c/busses/i2c-rcar.c +++ b/drivers/i2c/busses/i2c-rcar.c @@ -91,7 +91,6 @@
#define RCAR_BUS_PHASE_START (MDBS | MIE | ESG) #define RCAR_BUS_PHASE_DATA (MDBS | MIE) -#define RCAR_BUS_MASK_DATA (~(ESG | FSB) & 0xFF) #define RCAR_BUS_PHASE_STOP (MDBS | MIE | FSB)
#define RCAR_IRQ_SEND (MNR | MAL | MST | MAT | MDE) @@ -621,7 +620,7 @@ static bool rcar_i2c_slave_irq(struct rcar_i2c_priv *priv) /* * This driver has a lock-free design because there are IP cores (at least * R-Car Gen2) which have an inherent race condition in their hardware design. - * There, we need to clear RCAR_BUS_MASK_DATA bits as soon as possible after + * There, we need to switch to RCAR_BUS_PHASE_DATA as soon as possible after * the interrupt was generated, otherwise an unwanted repeated message gets * generated. It turned out that taking a spinlock at the beginning of the ISR * was already causing repeated messages. Thus, this driver was converted to @@ -630,13 +629,11 @@ static bool rcar_i2c_slave_irq(struct rcar_i2c_priv *priv) static irqreturn_t rcar_i2c_irq(int irq, void *ptr) { struct rcar_i2c_priv *priv = ptr; - u32 msr, val; + u32 msr;
/* Clear START or STOP immediately, except for REPSTART after read */ - if (likely(!(priv->flags & ID_P_REP_AFTER_RD))) { - val = rcar_i2c_read(priv, ICMCR); - rcar_i2c_write(priv, ICMCR, val & RCAR_BUS_MASK_DATA); - } + if (likely(!(priv->flags & ID_P_REP_AFTER_RD))) + rcar_i2c_write(priv, ICMCR, RCAR_BUS_PHASE_DATA);
msr = rcar_i2c_read(priv, ICMSR);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Wolfram Sang wsa+renesas@sang-engineering.com
[ Upstream commit 25c2e0fb5fefb8d7847214cf114d94c7aad8e9ce ]
'flags' and 'io' are needed first, so they should be at the beginning of the private struct.
Signed-off-by: Wolfram Sang wsa+renesas@sang-engineering.com Reviewed-by: Niklas Söderlund niklas.soderlund+renesas@ragnatech.se Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-rcar.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/i2c/busses/i2c-rcar.c b/drivers/i2c/busses/i2c-rcar.c index 824586d7ee56..ad6630e3cc77 100644 --- a/drivers/i2c/busses/i2c-rcar.c +++ b/drivers/i2c/busses/i2c-rcar.c @@ -119,6 +119,7 @@ enum rcar_i2c_type { };
struct rcar_i2c_priv { + u32 flags; void __iomem *io; struct i2c_adapter adap; struct i2c_msg *msg; @@ -129,7 +130,6 @@ struct rcar_i2c_priv {
int pos; u32 icccr; - u32 flags; u8 recovery_icmcr; /* protected by adapter lock */ enum rcar_i2c_type devtype; struct i2c_client *slave;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: akshatzen akshatzen@google.com
[ Upstream commit 5d28026891c7041deec08cc5ddd8f3abd90195e1 ]
Tag was not freed in NVMD get/set data request failure scenario. This caused a tag leak each time a request failed.
Link: https://lore.kernel.org/r/20210109123849.17098-5-Viswas.G@microchip.com Acked-by: Jack Wang jinpu.wang@cloud.ionos.com Signed-off-by: akshatzen akshatzen@google.com Signed-off-by: Viswas G Viswas.G@microchip.com Signed-off-by: Ruksar Devadi Ruksar.devadi@microchip.com Signed-off-by: Radha Ramachandran radha@google.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/pm8001/pm8001_hwi.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/drivers/scsi/pm8001/pm8001_hwi.c b/drivers/scsi/pm8001/pm8001_hwi.c index dd15246d5b03..ea43dff40a85 100644 --- a/drivers/scsi/pm8001/pm8001_hwi.c +++ b/drivers/scsi/pm8001/pm8001_hwi.c @@ -3038,8 +3038,8 @@ void pm8001_mpi_set_nvmd_resp(struct pm8001_hba_info *pm8001_ha, void *piomb) complete(pm8001_ha->nvmd_completion); pm8001_dbg(pm8001_ha, MSG, "Set nvm data complete!\n"); if ((dlen_status & NVMD_STAT) != 0) { - pm8001_dbg(pm8001_ha, FAIL, "Set nvm data error!\n"); - return; + pm8001_dbg(pm8001_ha, FAIL, "Set nvm data error %x\n", + dlen_status); } ccb->task = NULL; ccb->ccb_tag = 0xFFFFFFFF; @@ -3062,11 +3062,17 @@ pm8001_mpi_get_nvmd_resp(struct pm8001_hba_info *pm8001_ha, void *piomb)
pm8001_dbg(pm8001_ha, MSG, "Get nvm data complete!\n"); if ((dlen_status & NVMD_STAT) != 0) { - pm8001_dbg(pm8001_ha, FAIL, "Get nvm data error!\n"); + pm8001_dbg(pm8001_ha, FAIL, "Get nvm data error %x\n", + dlen_status); complete(pm8001_ha->nvmd_completion); + /* We should free tag during failure also, the tag is not being + * freed by requesting path anywhere. + */ + ccb->task = NULL; + ccb->ccb_tag = 0xFFFFFFFF; + pm8001_tag_free(pm8001_ha, tag); return; } - if (ir_tds_bn_dps_das_nvm & IPMode) { /* indirect mode - IR bit set */ pm8001_dbg(pm8001_ha, MSG, "Get NVMD success, IR=1\n");
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Jaegeuk Kim jaegeuk@kernel.org
[ Upstream commit a2fca52ee640a04112ed9d9a137c940ea6ad288e ]
Kernel stack violation when getting unit_descriptor/wb_buf_alloc_units from rpmb LUN. The reason is that the unit descriptor length is different per LU.
The length of Normal LU is 45 while the one of rpmb LU is 35.
int ufshcd_read_desc_param(struct ufs_hba *hba, ...) { param_offset=41; param_size=4; buff_len=45; ... buff_len=35 by rpmb LU;
if (is_kmalloc) { /* Make sure we don't copy more data than available */ if (param_offset + param_size > buff_len) param_size = buff_len - param_offset; --> param_size = 250; memcpy(param_read_buf, &desc_buf[param_offset], param_size); --> memcpy(param_read_buf, desc_buf+41, 250);
[ 141.868974][ T9174] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: wb_buf_alloc_units_show+0x11c/0x11c } }
Link: https://lore.kernel.org/r/20210111095927.1830311-1-jaegeuk@kernel.org Reviewed-by: Avri Altman avri.altman@wdc.com Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/ufs/ufs-sysfs.c | 3 ++- drivers/scsi/ufs/ufs.h | 6 ++++-- drivers/scsi/ufs/ufshcd.c | 2 +- 3 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/scsi/ufs/ufs-sysfs.c b/drivers/scsi/ufs/ufs-sysfs.c index 08e72b7eef6a..50e90416262b 100644 --- a/drivers/scsi/ufs/ufs-sysfs.c +++ b/drivers/scsi/ufs/ufs-sysfs.c @@ -792,7 +792,8 @@ static ssize_t _pname##_show(struct device *dev, \ struct scsi_device *sdev = to_scsi_device(dev); \ struct ufs_hba *hba = shost_priv(sdev->host); \ u8 lun = ufshcd_scsi_to_upiu_lun(sdev->lun); \ - if (!ufs_is_valid_unit_desc_lun(&hba->dev_info, lun)) \ + if (!ufs_is_valid_unit_desc_lun(&hba->dev_info, lun, \ + _duname##_DESC_PARAM##_puname)) \ return -EINVAL; \ return ufs_sysfs_read_desc_param(hba, QUERY_DESC_IDN_##_duname, \ lun, _duname##_DESC_PARAM##_puname, buf, _size); \ diff --git a/drivers/scsi/ufs/ufs.h b/drivers/scsi/ufs/ufs.h index 14dfda735adf..580aa56965d0 100644 --- a/drivers/scsi/ufs/ufs.h +++ b/drivers/scsi/ufs/ufs.h @@ -552,13 +552,15 @@ struct ufs_dev_info { * @return: true if the lun has a matching unit descriptor, false otherwise */ static inline bool ufs_is_valid_unit_desc_lun(struct ufs_dev_info *dev_info, - u8 lun) + u8 lun, u8 param_offset) { if (!dev_info || !dev_info->max_lu_supported) { pr_err("Max General LU supported by UFS isn't initialized\n"); return false; } - + /* WB is available only for the logical unit from 0 to 7 */ + if (param_offset == UNIT_DESC_PARAM_WB_BUF_ALLOC_UNITS) + return lun < UFS_UPIU_MAX_WB_LUN_ID; return lun == UFS_UPIU_RPMB_WLUN || (lun < dev_info->max_lu_supported); }
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 428b9e0ac47e..a568f7ae0566 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -3427,7 +3427,7 @@ static inline int ufshcd_read_unit_desc_param(struct ufs_hba *hba, * Unit descriptors are only available for general purpose LUs (LUN id * from 0 to 7) and RPMB Well known LU. */ - if (!ufs_is_valid_unit_desc_lun(&hba->dev_info, lun)) + if (!ufs_is_valid_unit_desc_lun(&hba->dev_info, lun, param_offset)) return -EOPNOTSUPP;
return ufshcd_read_desc_param(hba, QUERY_DESC_IDN_UNIT, lun,
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Can Guo cang@codeaurora.org
[ Upstream commit 0e9d4ca43ba8112821397f56a26d20682001c011 ]
In contexts like suspend, shutdown, and error handling we need to suspend devfreq to make sure these contexts won't be disturbed by clock scaling. However, suspending devfreq is not enough since users can still trigger a clock scaling by manipulating the devfreq sysfs nodes like min/max_freq and governor even after devfreq is suspended. Moreover, mere suspending devfreq cannot synchroinze a clock scaling which has already been invoked through these sysfs nodes. Add one more flag in struct clk_scaling and wrap the entire func ufshcd_devfreq_scale() with the clk_scaling_lock, so that we can use this flag and clk_scaling_lock to control and synchronize clock scaling invoked through devfreq sysfs nodes.
Link: https://lore.kernel.org/r/1611137065-14266-2-git-send-email-cang@codeaurora.... Reviewed-by: Stanley Chu stanley.chu@mediatek.com Signed-off-by: Can Guo cang@codeaurora.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/ufs/ufshcd.c | 80 ++++++++++++++++++++++++--------------- drivers/scsi/ufs/ufshcd.h | 6 ++- 2 files changed, 54 insertions(+), 32 deletions(-)
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index a568f7ae0566..16e1bd1aa49d 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -1184,19 +1184,30 @@ static int ufshcd_clock_scaling_prepare(struct ufs_hba *hba) */ ufshcd_scsi_block_requests(hba); down_write(&hba->clk_scaling_lock); - if (ufshcd_wait_for_doorbell_clr(hba, DOORBELL_CLR_TOUT_US)) { + + if (!hba->clk_scaling.is_allowed || + ufshcd_wait_for_doorbell_clr(hba, DOORBELL_CLR_TOUT_US)) { ret = -EBUSY; up_write(&hba->clk_scaling_lock); ufshcd_scsi_unblock_requests(hba); + goto out; }
+ /* let's not get into low power until clock scaling is completed */ + ufshcd_hold(hba, false); + +out: return ret; }
-static void ufshcd_clock_scaling_unprepare(struct ufs_hba *hba) +static void ufshcd_clock_scaling_unprepare(struct ufs_hba *hba, bool writelock) { - up_write(&hba->clk_scaling_lock); + if (writelock) + up_write(&hba->clk_scaling_lock); + else + up_read(&hba->clk_scaling_lock); ufshcd_scsi_unblock_requests(hba); + ufshcd_release(hba); }
/** @@ -1211,13 +1222,11 @@ static void ufshcd_clock_scaling_unprepare(struct ufs_hba *hba) static int ufshcd_devfreq_scale(struct ufs_hba *hba, bool scale_up) { int ret = 0; - - /* let's not get into low power until clock scaling is completed */ - ufshcd_hold(hba, false); + bool is_writelock = true;
ret = ufshcd_clock_scaling_prepare(hba); if (ret) - goto out; + return ret;
/* scale down the gear before scaling down clocks */ if (!scale_up) { @@ -1243,14 +1252,12 @@ static int ufshcd_devfreq_scale(struct ufs_hba *hba, bool scale_up) }
/* Enable Write Booster if we have scaled up else disable it */ - up_write(&hba->clk_scaling_lock); + downgrade_write(&hba->clk_scaling_lock); + is_writelock = false; ufshcd_wb_ctrl(hba, scale_up); - down_write(&hba->clk_scaling_lock);
out_unprepare: - ufshcd_clock_scaling_unprepare(hba); -out: - ufshcd_release(hba); + ufshcd_clock_scaling_unprepare(hba, is_writelock); return ret; }
@@ -1524,7 +1531,7 @@ static ssize_t ufshcd_clkscale_enable_show(struct device *dev, { struct ufs_hba *hba = dev_get_drvdata(dev);
- return snprintf(buf, PAGE_SIZE, "%d\n", hba->clk_scaling.is_allowed); + return snprintf(buf, PAGE_SIZE, "%d\n", hba->clk_scaling.is_enabled); }
static ssize_t ufshcd_clkscale_enable_store(struct device *dev, @@ -1538,7 +1545,7 @@ static ssize_t ufshcd_clkscale_enable_store(struct device *dev, return -EINVAL;
value = !!value; - if (value == hba->clk_scaling.is_allowed) + if (value == hba->clk_scaling.is_enabled) goto out;
pm_runtime_get_sync(hba->dev); @@ -1547,7 +1554,7 @@ static ssize_t ufshcd_clkscale_enable_store(struct device *dev, cancel_work_sync(&hba->clk_scaling.suspend_work); cancel_work_sync(&hba->clk_scaling.resume_work);
- hba->clk_scaling.is_allowed = value; + hba->clk_scaling.is_enabled = value;
if (value) { ufshcd_resume_clkscaling(hba); @@ -1885,8 +1892,6 @@ static void ufshcd_init_clk_scaling(struct ufs_hba *hba) snprintf(wq_name, sizeof(wq_name), "ufs_clkscaling_%d", hba->host->host_no); hba->clk_scaling.workq = create_singlethread_workqueue(wq_name); - - ufshcd_clkscaling_init_sysfs(hba); }
static void ufshcd_exit_clk_scaling(struct ufs_hba *hba) @@ -1894,6 +1899,8 @@ static void ufshcd_exit_clk_scaling(struct ufs_hba *hba) if (!ufshcd_is_clkscaling_supported(hba)) return;
+ if (hba->clk_scaling.enable_attr.attr.name) + device_remove_file(hba->dev, &hba->clk_scaling.enable_attr); destroy_workqueue(hba->clk_scaling.workq); ufshcd_devfreq_remove(hba); } @@ -1958,7 +1965,7 @@ static void ufshcd_clk_scaling_start_busy(struct ufs_hba *hba) if (!hba->clk_scaling.active_reqs++) queue_resume_work = true;
- if (!hba->clk_scaling.is_allowed || hba->pm_op_in_progress) + if (!hba->clk_scaling.is_enabled || hba->pm_op_in_progress) return;
if (queue_resume_work) @@ -5744,18 +5751,24 @@ static void ufshcd_err_handling_prepare(struct ufs_hba *hba) ufshcd_vops_resume(hba, pm_op); } else { ufshcd_hold(hba, false); - if (hba->clk_scaling.is_allowed) { + if (hba->clk_scaling.is_enabled) { cancel_work_sync(&hba->clk_scaling.suspend_work); cancel_work_sync(&hba->clk_scaling.resume_work); ufshcd_suspend_clkscaling(hba); } + down_write(&hba->clk_scaling_lock); + hba->clk_scaling.is_allowed = false; + up_write(&hba->clk_scaling_lock); } }
static void ufshcd_err_handling_unprepare(struct ufs_hba *hba) { ufshcd_release(hba); - if (hba->clk_scaling.is_allowed) + down_write(&hba->clk_scaling_lock); + hba->clk_scaling.is_allowed = true; + up_write(&hba->clk_scaling_lock); + if (hba->clk_scaling.is_enabled) ufshcd_resume_clkscaling(hba); pm_runtime_put(hba->dev); } @@ -7741,12 +7754,14 @@ static int ufshcd_add_lus(struct ufs_hba *hba) sizeof(struct ufs_pa_layer_attr)); hba->clk_scaling.saved_pwr_info.is_valid = true; if (!hba->devfreq) { + hba->clk_scaling.is_allowed = true; ret = ufshcd_devfreq_init(hba); if (ret) goto out; - }
- hba->clk_scaling.is_allowed = true; + hba->clk_scaling.is_enabled = true; + ufshcd_clkscaling_init_sysfs(hba); + } }
ufs_bsg_probe(hba); @@ -8661,11 +8676,14 @@ static int ufshcd_suspend(struct ufs_hba *hba, enum ufs_pm_op pm_op) ufshcd_hold(hba, false); hba->clk_gating.is_suspended = true;
- if (hba->clk_scaling.is_allowed) { + if (hba->clk_scaling.is_enabled) { cancel_work_sync(&hba->clk_scaling.suspend_work); cancel_work_sync(&hba->clk_scaling.resume_work); ufshcd_suspend_clkscaling(hba); } + down_write(&hba->clk_scaling_lock); + hba->clk_scaling.is_allowed = false; + up_write(&hba->clk_scaling_lock);
if (req_dev_pwr_mode == UFS_ACTIVE_PWR_MODE && req_link_state == UIC_LINK_ACTIVE_STATE) { @@ -8762,8 +8780,6 @@ static int ufshcd_suspend(struct ufs_hba *hba, enum ufs_pm_op pm_op) goto out;
set_link_active: - if (hba->clk_scaling.is_allowed) - ufshcd_resume_clkscaling(hba); ufshcd_vreg_set_hpm(hba); /* * Device hardware reset is required to exit DeepSleep. Also, for @@ -8787,7 +8803,10 @@ static int ufshcd_suspend(struct ufs_hba *hba, enum ufs_pm_op pm_op) if (!ufshcd_set_dev_pwr_mode(hba, UFS_ACTIVE_PWR_MODE)) ufshcd_disable_auto_bkops(hba); enable_gating: - if (hba->clk_scaling.is_allowed) + down_write(&hba->clk_scaling_lock); + hba->clk_scaling.is_allowed = true; + up_write(&hba->clk_scaling_lock); + if (hba->clk_scaling.is_enabled) ufshcd_resume_clkscaling(hba); hba->clk_gating.is_suspended = false; hba->dev_info.b_rpm_dev_flush_capable = false; @@ -8891,7 +8910,10 @@ static int ufshcd_resume(struct ufs_hba *hba, enum ufs_pm_op pm_op)
hba->clk_gating.is_suspended = false;
- if (hba->clk_scaling.is_allowed) + down_write(&hba->clk_scaling_lock); + hba->clk_scaling.is_allowed = true; + up_write(&hba->clk_scaling_lock); + if (hba->clk_scaling.is_enabled) ufshcd_resume_clkscaling(hba);
/* Enable Auto-Hibernate if configured */ @@ -8917,8 +8939,6 @@ static int ufshcd_resume(struct ufs_hba *hba, enum ufs_pm_op pm_op) ufshcd_vreg_set_lpm(hba); disable_irq_and_vops_clks: ufshcd_disable_irq(hba); - if (hba->clk_scaling.is_allowed) - ufshcd_suspend_clkscaling(hba); ufshcd_setup_clocks(hba, false); if (ufshcd_is_clkgating_allowed(hba)) { hba->clk_gating.state = CLKS_OFF; @@ -9155,8 +9175,6 @@ void ufshcd_remove(struct ufs_hba *hba)
ufshcd_exit_clk_scaling(hba); ufshcd_exit_clk_gating(hba); - if (ufshcd_is_clkscaling_supported(hba)) - device_remove_file(hba->dev, &hba->clk_scaling.enable_attr); ufshcd_hba_exit(hba); } EXPORT_SYMBOL_GPL(ufshcd_remove); diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h index 1885ec9126c4..7d0b00f23761 100644 --- a/drivers/scsi/ufs/ufshcd.h +++ b/drivers/scsi/ufs/ufshcd.h @@ -419,7 +419,10 @@ struct ufs_saved_pwr_info { * @suspend_work: worker to suspend devfreq * @resume_work: worker to resume devfreq * @min_gear: lowest HS gear to scale down to - * @is_allowed: tracks if scaling is currently allowed or not + * @is_enabled: tracks if scaling is currently enabled or not, controlled by + clkscale_enable sysfs node + * @is_allowed: tracks if scaling is currently allowed or not, used to block + clock scaling which is not invoked from devfreq governor * @is_busy_started: tracks if busy period has started or not * @is_suspended: tracks if devfreq is suspended or not */ @@ -434,6 +437,7 @@ struct ufs_clk_scaling { struct work_struct suspend_work; struct work_struct resume_work; u32 min_gear; + bool is_enabled; bool is_allowed; bool is_busy_started; bool is_suspended;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Steven J. Magnani magnani@ieee.org
[ Upstream commit 63c9e47a1642fc817654a1bc18a6ec4bbcc0f056 ]
When extending a file, udf_do_extend_file() may enter following empty indirect extent. At the end of udf_do_extend_file() we revert prev_epos to point to the last written extent. However if we end up not adding any further extent in udf_do_extend_file(), the reverting points prev_epos into the header area of the AED and following updates of the extents (in udf_update_extents()) will corrupt the header.
Make sure that we do not follow indirect extent if we are not going to add any more extents so that returning back to the last written extent works correctly.
Link: https://lore.kernel.org/r/20210107234116.6190-2-magnani@ieee.org Signed-off-by: Steven J. Magnani magnani@ieee.org Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- fs/udf/inode.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/fs/udf/inode.c b/fs/udf/inode.c index bb89c3e43212..0dd2f93ac048 100644 --- a/fs/udf/inode.c +++ b/fs/udf/inode.c @@ -544,11 +544,14 @@ static int udf_do_extend_file(struct inode *inode,
udf_write_aext(inode, last_pos, &last_ext->extLocation, last_ext->extLength, 1); + /* - * We've rewritten the last extent but there may be empty - * indirect extent after it - enter it. + * We've rewritten the last extent. If we are going to add + * more extents, we may need to enter possible following + * empty indirect extent. */ - udf_next_aext(inode, last_pos, &tmploc, &tmplen, 0); + if (new_block_bytes || prealloc_len) + udf_next_aext(inode, last_pos, &tmploc, &tmplen, 0); }
/* Managed to do everything necessary? */
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Lu Baolu baolu.lu@linux.intel.com
[ Upstream commit 28a77185f1cd0650b664f54614143aaaa3a7a615 ]
It is incorrect to always clear PRO when it's set w/o first checking whether the overflow condition has been cleared. Current code assumes that if an overflow condition occurs it must have been cleared by earlier loop. However since the code runs in a threaded context, the overflow condition could occur even after setting the head to the tail under some extreme condition. To be sane, we should read both head/tail again when seeing a pending PRO and only clear PRO after all pending PRs have been handled.
Suggested-by: Kevin Tian kevin.tian@intel.com Signed-off-by: Lu Baolu baolu.lu@linux.intel.com Link: https://lore.kernel.org/linux-iommu/MWHPR11MB18862D2EA5BD432BF22D99A48CA09@M... Link: https://lore.kernel.org/r/20210126080730.2232859-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/intel/svm.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c index 18a9f05df407..b3bcd6dec93e 100644 --- a/drivers/iommu/intel/svm.c +++ b/drivers/iommu/intel/svm.c @@ -1079,8 +1079,17 @@ static irqreturn_t prq_event_thread(int irq, void *d) * Clear the page request overflow bit and wake up all threads that * are waiting for the completion of this handling. */ - if (readl(iommu->reg + DMAR_PRS_REG) & DMA_PRS_PRO) - writel(DMA_PRS_PRO, iommu->reg + DMAR_PRS_REG); + if (readl(iommu->reg + DMAR_PRS_REG) & DMA_PRS_PRO) { + pr_info_ratelimited("IOMMU: %s: PRQ overflow detected\n", + iommu->name); + head = dmar_readq(iommu->reg + DMAR_PQH_REG) & PRQ_RING_MASK; + tail = dmar_readq(iommu->reg + DMAR_PQT_REG) & PRQ_RING_MASK; + if (head == tail) { + writel(DMA_PRS_PRO, iommu->reg + DMAR_PRS_REG); + pr_info_ratelimited("IOMMU: %s: PRQ overflow cleared", + iommu->name); + } + }
if (!completion_done(&iommu->prq_complete)) complete(&iommu->prq_complete);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit 0bb7e560f821c7770973a94e346654c4bdccd42c ]
If 'mmc_of_parse()' fails, we must undo the previous 'dma_request_chan()' call.
Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Link: https://lore.kernel.org/r/20201208203527.49262-1-christophe.jaillet@wanadoo.... Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mmc/host/mxs-mmc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/mmc/host/mxs-mmc.c b/drivers/mmc/host/mxs-mmc.c index 56bbc6cd9c84..947581de7860 100644 --- a/drivers/mmc/host/mxs-mmc.c +++ b/drivers/mmc/host/mxs-mmc.c @@ -628,7 +628,7 @@ static int mxs_mmc_probe(struct platform_device *pdev)
ret = mmc_of_parse(mmc); if (ret) - goto out_clk_disable; + goto out_free_dma;
mmc->ocr_avail = MMC_VDD_32_33 | MMC_VDD_33_34;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Chaotian Jing chaotian.jing@mediatek.com
[ Upstream commit 0354ca6edd464a2cf332f390581977b8699ed081 ]
when get request SW timeout, if CMD/DAT xfer done irq coming right now, then there is race between the msdc_request_timeout work and irq handler, and the host->cmd and host->data may set to NULL in irq handler. also, current flow ensure that only one path can go to msdc_request_done(), so no need check the return value of cancel_delayed_work().
Signed-off-by: Chaotian Jing chaotian.jing@mediatek.com Link: https://lore.kernel.org/r/20201218071611.12276-1-chaotian.jing@mediatek.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mmc/host/mtk-sd.c | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/drivers/mmc/host/mtk-sd.c b/drivers/mmc/host/mtk-sd.c index de09c6347524..898ed1b023df 100644 --- a/drivers/mmc/host/mtk-sd.c +++ b/drivers/mmc/host/mtk-sd.c @@ -1127,13 +1127,13 @@ static void msdc_track_cmd_data(struct msdc_host *host, static void msdc_request_done(struct msdc_host *host, struct mmc_request *mrq) { unsigned long flags; - bool ret;
- ret = cancel_delayed_work(&host->req_timeout); - if (!ret) { - /* delay work already running */ - return; - } + /* + * No need check the return value of cancel_delayed_work, as only ONE + * path will go here! + */ + cancel_delayed_work(&host->req_timeout); + spin_lock_irqsave(&host->lock, flags); host->mrq = NULL; spin_unlock_irqrestore(&host->lock, flags); @@ -1155,7 +1155,7 @@ static bool msdc_cmd_done(struct msdc_host *host, int events, bool done = false; bool sbc_error; unsigned long flags; - u32 *rsp = cmd->resp; + u32 *rsp;
if (mrq->sbc && cmd == mrq->cmd && (events & (MSDC_INT_ACMDRDY | MSDC_INT_ACMDCRCERR @@ -1176,6 +1176,7 @@ static bool msdc_cmd_done(struct msdc_host *host, int events,
if (done) return true; + rsp = cmd->resp;
sdr_clr_bits(host->base + MSDC_INTEN, cmd_ints_mask);
@@ -1363,7 +1364,7 @@ static void msdc_data_xfer_next(struct msdc_host *host, static bool msdc_data_xfer_done(struct msdc_host *host, u32 events, struct mmc_request *mrq, struct mmc_data *data) { - struct mmc_command *stop = data->stop; + struct mmc_command *stop; unsigned long flags; bool done; unsigned int check_data = events & @@ -1379,6 +1380,7 @@ static bool msdc_data_xfer_done(struct msdc_host *host, u32 events,
if (done) return true; + stop = data->stop;
if (check_data || (stop && stop->error)) { dev_dbg(host->dev, "DMA status: 0x%8X\n",
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Jeremy Linton jeremy.linton@arm.com
[ Upstream commit 4f9833d3ec8da34861cd0680b00c73e653877eb9 ]
The RPi4 has an Arasan controller it carries over from the RPi3 and a newer eMMC2 controller. Because of a couple of quirks, it seems wiser to bind these controllers to the same driver that DT is using on this platform rather than the generic sdhci_acpi driver with PNP0D40.
So, BCM2847 describes the older Arasan and BRCME88C describes the newer eMMC2. The older Arasan is reusing an existing ACPI _HID used by other OSes booting these tables on the RPi.
With this change, Linux is capable of utilizing the SD card slot, and the Wi-Fi when booted with UEFI+ACPI on the RPi4.
Signed-off-by: Jeremy Linton jeremy.linton@arm.com Acked-by: Florian Fainelli f.fainelli@gmail.com Link: https://lore.kernel.org/r/20210120000406.1843400-2-jeremy.linton@arm.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mmc/host/sdhci-iproc.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+)
diff --git a/drivers/mmc/host/sdhci-iproc.c b/drivers/mmc/host/sdhci-iproc.c index c9434b461aab..ddeaf8e1f72f 100644 --- a/drivers/mmc/host/sdhci-iproc.c +++ b/drivers/mmc/host/sdhci-iproc.c @@ -296,9 +296,27 @@ static const struct of_device_id sdhci_iproc_of_match[] = { MODULE_DEVICE_TABLE(of, sdhci_iproc_of_match);
#ifdef CONFIG_ACPI +/* + * This is a duplicate of bcm2835_(pltfrm_)data without caps quirks + * which are provided by the ACPI table. + */ +static const struct sdhci_pltfm_data sdhci_bcm_arasan_data = { + .quirks = SDHCI_QUIRK_BROKEN_CARD_DETECTION | + SDHCI_QUIRK_DATA_TIMEOUT_USES_SDCLK | + SDHCI_QUIRK_NO_HISPD_BIT, + .quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN, + .ops = &sdhci_iproc_32only_ops, +}; + +static const struct sdhci_iproc_data bcm_arasan_data = { + .pdata = &sdhci_bcm_arasan_data, +}; + static const struct acpi_device_id sdhci_iproc_acpi_ids[] = { { .id = "BRCM5871", .driver_data = (kernel_ulong_t)&iproc_cygnus_data }, { .id = "BRCM5872", .driver_data = (kernel_ulong_t)&iproc_data }, + { .id = "BCM2847", .driver_data = (kernel_ulong_t)&bcm_arasan_data }, + { .id = "BRCME88C", .driver_data = (kernel_ulong_t)&bcm2711_data }, { /* sentinel */ } }; MODULE_DEVICE_TABLE(acpi, sdhci_iproc_acpi_ids);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Pan Bian bianpan2016@163.com
[ Upstream commit 745ed17a04f966406c8c27c8f992544336c06013 ]
Put the PCI device rdev on error paths to fix potential reference count leaks.
Signed-off-by: Pan Bian bianpan2016@163.com Link: https://lore.kernel.org/r/20210121045005.73342-1-bianpan2016@163.com Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/amd-pmc.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/drivers/platform/x86/amd-pmc.c b/drivers/platform/x86/amd-pmc.c index ef8342572463..b9da58ee9b1e 100644 --- a/drivers/platform/x86/amd-pmc.c +++ b/drivers/platform/x86/amd-pmc.c @@ -210,31 +210,39 @@ static int amd_pmc_probe(struct platform_device *pdev) dev->dev = &pdev->dev;
rdev = pci_get_domain_bus_and_slot(0, 0, PCI_DEVFN(0, 0)); - if (!rdev || !pci_match_id(pmc_pci_ids, rdev)) + if (!rdev || !pci_match_id(pmc_pci_ids, rdev)) { + pci_dev_put(rdev); return -ENODEV; + }
dev->cpu_id = rdev->device; err = pci_write_config_dword(rdev, AMD_PMC_SMU_INDEX_ADDRESS, AMD_PMC_BASE_ADDR_LO); if (err) { dev_err(dev->dev, "error writing to 0x%x\n", AMD_PMC_SMU_INDEX_ADDRESS); + pci_dev_put(rdev); return pcibios_err_to_errno(err); }
err = pci_read_config_dword(rdev, AMD_PMC_SMU_INDEX_DATA, &val); - if (err) + if (err) { + pci_dev_put(rdev); return pcibios_err_to_errno(err); + }
base_addr_lo = val & AMD_PMC_BASE_ADDR_HI_MASK;
err = pci_write_config_dword(rdev, AMD_PMC_SMU_INDEX_ADDRESS, AMD_PMC_BASE_ADDR_HI); if (err) { dev_err(dev->dev, "error writing to 0x%x\n", AMD_PMC_SMU_INDEX_ADDRESS); + pci_dev_put(rdev); return pcibios_err_to_errno(err); }
err = pci_read_config_dword(rdev, AMD_PMC_SMU_INDEX_DATA, &val); - if (err) + if (err) { + pci_dev_put(rdev); return pcibios_err_to_errno(err); + }
base_addr_hi = val & AMD_PMC_BASE_ADDR_LO_MASK; pci_dev_put(rdev);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Lubomir Rintel lkundrak@v3.sk
[ Upstream commit cec551ea0d41c679ed11d758e1a386e20285b29d ]
Reset ec_priv if probe ends unsuccessfully.
Signed-off-by: Lubomir Rintel lkundrak@v3.sk Link: https://lore.kernel.org/r/20210126073740.10232-2-lkundrak@v3.sk Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/olpc/olpc-ec.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/platform/olpc/olpc-ec.c b/drivers/platform/olpc/olpc-ec.c index f64b82824db2..2db7113383fd 100644 --- a/drivers/platform/olpc/olpc-ec.c +++ b/drivers/platform/olpc/olpc-ec.c @@ -426,11 +426,8 @@ static int olpc_ec_probe(struct platform_device *pdev)
/* get the EC revision */ err = olpc_ec_cmd(EC_FIRMWARE_REV, NULL, 0, &ec->version, 1); - if (err) { - ec_priv = NULL; - kfree(ec); - return err; - } + if (err) + goto error;
config.dev = pdev->dev.parent; config.driver_data = ec; @@ -440,12 +437,16 @@ static int olpc_ec_probe(struct platform_device *pdev) if (IS_ERR(ec->dcon_rdev)) { dev_err(&pdev->dev, "failed to register DCON regulator\n"); err = PTR_ERR(ec->dcon_rdev); - kfree(ec); - return err; + goto error; }
ec->dbgfs_dir = olpc_ec_setup_debugfs();
+ return 0; + +error: + ec_priv = NULL; + kfree(ec); return err; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Oliver O'Halloran oohall@gmail.com
[ Upstream commit 5537fcb319d016ce387f818dd774179bc03217f5 ]
On many powerpc platforms the discovery and initalisation of pci_controllers (PHBs) happens inside of setup_arch(). This is very early in boot (pre-initcalls) and means that we're initialising the PHB long before many basic kernel services (slab allocator, debugfs, a real ioremap) are available.
On PowerNV this causes an additional problem since we map the PHB registers with ioremap(). As of commit d538aadc2718 ("powerpc/ioremap: warn on early use of ioremap()") a warning is printed because we're using the "incorrect" API to setup and MMIO mapping in searly boot. The kernel does provide early_ioremap(), but that is not intended to create long-lived MMIO mappings and a seperate warning is printed by generic code if early_ioremap() mappings are "leaked."
This is all fixable with dumb hacks like using early_ioremap() to setup the initial mapping then replacing it with a real ioremap later on in boot, but it does raise the question: Why the hell are we setting up the PHB's this early in boot?
The old and wise claim it's due to "hysterical rasins." Aside from amused grapes there doesn't appear to be any real reason to maintain the current behaviour. Already most of the newer embedded platforms perform PHB discovery in an arch_initcall and between the end of setup_arch() and the start of initcalls none of the generic kernel code does anything PCI related. On powerpc scanning PHBs occurs in a subsys_initcall so it should be possible to move the PHB discovery to a core, postcore or arch initcall.
This patch adds the ppc_md.discover_phbs hook and a core_initcall stub that calls it. The core_initcalls are the earliest to be called so this will any possibly issues with dependency between initcalls. This isn't just an academic issue either since on pseries and PowerNV EEH init occurs in an arch_initcall and depends on the pci_controllers being available, similarly the creation of pci_dns occurs at core_initcall_sync (i.e. between core and postcore initcalls). These problems need to be addressed seperately.
Reported-by: kernel test robot lkp@intel.com Signed-off-by: Oliver O'Halloran oohall@gmail.com [mpe: Make discover_phbs() static] Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20201103043523.916109-1-oohall@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/include/asm/machdep.h | 3 +++ arch/powerpc/kernel/pci-common.c | 10 ++++++++++ 2 files changed, 13 insertions(+)
diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h index cf6ebbc16cb4..764f2732a821 100644 --- a/arch/powerpc/include/asm/machdep.h +++ b/arch/powerpc/include/asm/machdep.h @@ -59,6 +59,9 @@ struct machdep_calls { int (*pcibios_root_bridge_prepare)(struct pci_host_bridge *bridge);
+ /* finds all the pci_controllers present at boot */ + void (*discover_phbs)(void); + /* To setup PHBs when using automatic OF platform driver for PCI */ int (*pci_setup_phb)(struct pci_controller *host);
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c index 2b555997b295..001e90cd8948 100644 --- a/arch/powerpc/kernel/pci-common.c +++ b/arch/powerpc/kernel/pci-common.c @@ -1699,3 +1699,13 @@ static void fixup_hide_host_resource_fsl(struct pci_dev *dev) } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MOTOROLA, PCI_ANY_ID, fixup_hide_host_resource_fsl); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_FREESCALE, PCI_ANY_ID, fixup_hide_host_resource_fsl); + + +static int __init discover_phbs(void) +{ + if (ppc_md.discover_phbs) + ppc_md.discover_phbs(); + + return 0; +} +core_initcall(discover_phbs);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Alain Volmat alain.volmat@foss.st.com
[ Upstream commit c64e7efe46b7de21937ef4b3594d9b1fc74f07df ]
We do not expect to receive spurious interrupts so rise a warning if it happens.
RX overrun is an error condition that signals a corrupted RX stream both in dma and in irq modes. Report the error and abort the transfer in either cases.
Signed-off-by: Alain Volmat alain.volmat@foss.st.com Link: https://lore.kernel.org/r/1612551572-495-9-git-send-email-alain.volmat@foss.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/spi-stm32.c | 15 ++++----------- 1 file changed, 4 insertions(+), 11 deletions(-)
diff --git a/drivers/spi/spi-stm32.c b/drivers/spi/spi-stm32.c index 6eeb39669a86..53c4311cc6ab 100644 --- a/drivers/spi/spi-stm32.c +++ b/drivers/spi/spi-stm32.c @@ -928,8 +928,8 @@ static irqreturn_t stm32h7_spi_irq_thread(int irq, void *dev_id) mask |= STM32H7_SPI_SR_RXP;
if (!(sr & mask)) { - dev_dbg(spi->dev, "spurious IT (sr=0x%08x, ier=0x%08x)\n", - sr, ier); + dev_warn(spi->dev, "spurious IT (sr=0x%08x, ier=0x%08x)\n", + sr, ier); spin_unlock_irqrestore(&spi->lock, flags); return IRQ_NONE; } @@ -956,15 +956,8 @@ static irqreturn_t stm32h7_spi_irq_thread(int irq, void *dev_id) }
if (sr & STM32H7_SPI_SR_OVR) { - dev_warn(spi->dev, "Overrun: received value discarded\n"); - if (!spi->cur_usedma && (spi->rx_buf && (spi->rx_len > 0))) - stm32h7_spi_read_rxfifo(spi, false); - /* - * If overrun is detected while using DMA, it means that - * something went wrong, so stop the current transfer - */ - if (spi->cur_usedma) - end = true; + dev_err(spi->dev, "Overrun: RX data lost\n"); + end = true; }
if (sr & STM32H7_SPI_SR_EOT) {
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Nicholas Piggin npiggin@gmail.com
[ Upstream commit 11cb0a25f71818ca7ab4856548ecfd83c169aa4d ]
If an unrecoverable system reset hits in process context, the system does not have to panic. Similar to machine check, call nmi_exit() before die().
Signed-off-by: Nicholas Piggin npiggin@gmail.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20210130130852.2952424-26-npiggin@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/kernel/traps.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c index 3ec7b443fe6b..4be05517f2db 100644 --- a/arch/powerpc/kernel/traps.c +++ b/arch/powerpc/kernel/traps.c @@ -503,8 +503,11 @@ void system_reset_exception(struct pt_regs *regs) die("Unrecoverable nested System Reset", regs, SIGABRT); #endif /* Must die if the interrupt is not recoverable */ - if (!(regs->msr & MSR_RI)) + if (!(regs->msr & MSR_RI)) { + /* For the reason explained in die_mce, nmi_exit before die */ + nmi_exit(); die("Unrecoverable System Reset", regs, SIGABRT); + }
if (saved_hsrrs) { mtspr(SPRN_HSRR0, hsrr0);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Athira Rajeev atrajeev@linux.vnet.ibm.com
[ Upstream commit d137845c973147a22622cc76c7b0bc16f6206323 ]
While sampling for marked events, currently we record the sample only if the SIAR valid bit of Sampled Instruction Event Register (SIER) is set. SIAR_VALID bit is used for fetching the instruction address from Sampled Instruction Address Register(SIAR). But there are some usecases, where the user is interested only in the PMU stats at each counter overflow and the exact IP of the overflow event is not required. Dropping SIAR invalid samples will fail to record some of the counter overflows in such cases.
Example of such usecase is dumping the PMU stats (event counts) after some regular amount of instructions/events from the userspace (ex: via ptrace). Here counter overflow is indicated to userspace via signal handler, and captured by monitoring and enabling I/O signaling on the event file descriptor. In these cases, we expect to get sample/overflow indication after each specified sample_period.
Perf event attribute will not have PERF_SAMPLE_IP set in the sample_type if exact IP of the overflow event is not requested. So while profiling if SAMPLE_IP is not set, just record the counter overflow irrespective of SIAR_VALID check.
Suggested-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Athira Rajeev atrajeev@linux.vnet.ibm.com [mpe: Reflow comment and if formatting] Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/1612516492-1428-1-git-send-email-atrajeev@linux.vn... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/perf/core-book3s.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c index a4908652242e..51f413521fde 100644 --- a/arch/powerpc/perf/core-book3s.c +++ b/arch/powerpc/perf/core-book3s.c @@ -2149,7 +2149,17 @@ static void record_and_restart(struct perf_event *event, unsigned long val, left += period; if (left <= 0) left = period; - record = siar_valid(regs); + + /* + * If address is not requested in the sample via + * PERF_SAMPLE_IP, just record that sample irrespective + * of SIAR valid check. + */ + if (event->attr.sample_type & PERF_SAMPLE_IP) + record = siar_valid(regs); + else + record = 1; + event->hw.last_period = event->hw.sample_period; } if (left < 0x80000000LL) @@ -2167,9 +2177,10 @@ static void record_and_restart(struct perf_event *event, unsigned long val, * MMCR2. Check attr.exclude_kernel and address to drop the sample in * these cases. */ - if (event->attr.exclude_kernel && record) - if (is_kernel_addr(mfspr(SPRN_SIAR))) - record = 0; + if (event->attr.exclude_kernel && + (event->attr.sample_type & PERF_SAMPLE_IP) && + is_kernel_addr(mfspr(SPRN_SIAR))) + record = 0;
/* * Finally record data if requested.
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Filipe Laíns lains@riseup.net
[ Upstream commit fab3a95654eea01d6b0204995be8b7492a00d001 ]
This new connection type is the new iteration of the Lightspeed connection and will probably be used in some of the newer gaming devices. It is currently use in the G Pro X Superlight.
This patch should be backported to older versions, as currently the driver will panic when seing the unsupported connection. This isn't an issue when using the receiver that came with the device, as Logitech has been using different PIDs when they change the connection type, but is an issue when using a generic receiver (well, generic Lightspeed receiver), which is the case of the one in the Powerplay mat. Currently, the only generic Ligthspeed receiver we support, and the only one that exists AFAIK, is ther Powerplay.
As it stands, the driver will panic when seeing a G Pro X Superlight connected to the Powerplay receiver and won't send any input events to userspace! The kernel will warn about this so the issue should be easy to identify, but it is still very worrying how hard it will fail :(
[915977.398471] logitech-djreceiver 0003:046D:C53A.0107: unusable device of type UNKNOWN (0x0f) connected on slot 1
Signed-off-by: Filipe Laíns lains@riseup.net Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/hid-logitech-dj.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/hid/hid-logitech-dj.c b/drivers/hid/hid-logitech-dj.c index fcdc922bc973..271bd8d24339 100644 --- a/drivers/hid/hid-logitech-dj.c +++ b/drivers/hid/hid-logitech-dj.c @@ -995,7 +995,12 @@ static void logi_hidpp_recv_queue_notif(struct hid_device *hdev, workitem.reports_supported |= STD_KEYBOARD; break; case 0x0d: - device_type = "eQUAD Lightspeed 1_1"; + device_type = "eQUAD Lightspeed 1.1"; + logi_hidpp_dev_conn_notif_equad(hdev, hidpp_report, &workitem); + workitem.reports_supported |= STD_KEYBOARD; + break; + case 0x0f: + device_type = "eQUAD Lightspeed 1.2"; logi_hidpp_dev_conn_notif_equad(hdev, hidpp_report, &workitem); workitem.reports_supported |= STD_KEYBOARD; break;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Michael Ellerman mpe@ellerman.id.au
[ Upstream commit e3de1e291fa58a1ab0f471a4b458eff2514e4b5f ]
In commit bf13718bc57a ("powerpc: show registers when unwinding interrupt frames") we changed our stack dumping logic to show the full registers whenever we find an interrupt frame on the stack.
However we didn't notice that on 64-bit this doesn't show the final frame, ie. the interrupt that brought us in from userspace, whereas on 32-bit it does.
That is due to confusion about the size of that last frame. The code in show_stack() calls validate_sp(), passing it STACK_INT_FRAME_SIZE to check the sp is at least that far below the top of the stack.
However on 64-bit that size is too large for the final frame, because it includes the red zone, but we don't allocate a red zone for the first frame.
So add a new define that encodes the correct size for 32-bit and 64-bit, and use it in show_stack().
This results in the full trace being shown on 64-bit, eg:
sysrq: Trigger a crash Kernel panic - not syncing: sysrq triggered crash CPU: 0 PID: 83 Comm: sh Not tainted 5.11.0-rc2-gcc-8.2.0-00188-g571abcb96b10-dirty #649 Call Trace: [c00000000a1c3ac0] [c000000000897b70] dump_stack+0xc4/0x114 (unreliable) [c00000000a1c3b00] [c00000000014334c] panic+0x178/0x41c [c00000000a1c3ba0] [c00000000094e600] sysrq_handle_crash+0x40/0x50 [c00000000a1c3c00] [c00000000094ef98] __handle_sysrq+0xd8/0x210 [c00000000a1c3ca0] [c00000000094f820] write_sysrq_trigger+0x100/0x188 [c00000000a1c3ce0] [c0000000005559dc] proc_reg_write+0x10c/0x1b0 [c00000000a1c3d10] [c000000000479950] vfs_write+0xf0/0x360 [c00000000a1c3d60] [c000000000479d9c] ksys_write+0x7c/0x140 [c00000000a1c3db0] [c00000000002bf5c] system_call_exception+0x19c/0x2c0 [c00000000a1c3e10] [c00000000000d35c] system_call_common+0xec/0x278 --- interrupt: c00 at 0x7fff9fbab428 NIP: 00007fff9fbab428 LR: 000000001000b724 CTR: 0000000000000000 REGS: c00000000a1c3e80 TRAP: 0c00 Not tainted (5.11.0-rc2-gcc-8.2.0-00188-g571abcb96b10-dirty) MSR: 900000000280f033 <SF,HV,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 22002884 XER: 00000000 IRQMASK: 0 GPR00: 0000000000000004 00007fffc3cb8960 00007fff9fc59900 0000000000000001 GPR04: 000000002a4b32d0 0000000000000002 0000000000000063 0000000000000063 GPR08: 000000002a4b32d0 0000000000000000 0000000000000000 0000000000000000 GPR12: 0000000000000000 00007fff9fcca9a0 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 00000000100b8fd0 GPR20: 000000002a4b3485 00000000100b8f90 0000000000000000 0000000000000000 GPR24: 000000002a4b0440 00000000100e77b8 0000000000000020 000000002a4b32d0 GPR28: 0000000000000001 0000000000000002 000000002a4b32d0 0000000000000001 NIP [00007fff9fbab428] 0x7fff9fbab428 LR [000000001000b724] 0x1000b724 --- interrupt: c00
Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20210209141627.2898485-1-mpe@ellerman.id.au Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/include/asm/ptrace.h | 3 +++ arch/powerpc/kernel/asm-offsets.c | 2 +- arch/powerpc/kernel/process.c | 2 +- 3 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h index 58f9dc060a7b..8236c5e749e4 100644 --- a/arch/powerpc/include/asm/ptrace.h +++ b/arch/powerpc/include/asm/ptrace.h @@ -70,6 +70,9 @@ struct pt_regs }; #endif
+ +#define STACK_FRAME_WITH_PT_REGS (STACK_FRAME_OVERHEAD + sizeof(struct pt_regs)) + #ifdef __powerpc64__
/* diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c index b12d7c049bfe..989006b5ad0f 100644 --- a/arch/powerpc/kernel/asm-offsets.c +++ b/arch/powerpc/kernel/asm-offsets.c @@ -309,7 +309,7 @@ int main(void)
/* Interrupt register frame */ DEFINE(INT_FRAME_SIZE, STACK_INT_FRAME_SIZE); - DEFINE(SWITCH_FRAME_SIZE, STACK_FRAME_OVERHEAD + sizeof(struct pt_regs)); + DEFINE(SWITCH_FRAME_SIZE, STACK_FRAME_WITH_PT_REGS); STACK_PT_REGS_OFFSET(GPR0, gpr[0]); STACK_PT_REGS_OFFSET(GPR1, gpr[1]); STACK_PT_REGS_OFFSET(GPR2, gpr[2]); diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c index a66f435dabbf..b65a73e4d642 100644 --- a/arch/powerpc/kernel/process.c +++ b/arch/powerpc/kernel/process.c @@ -2176,7 +2176,7 @@ void show_stack(struct task_struct *tsk, unsigned long *stack, * See if this is an exception frame. * We look for the "regshere" marker in the current frame. */ - if (validate_sp(sp, tsk, STACK_INT_FRAME_SIZE) + if (validate_sp(sp, tsk, STACK_FRAME_WITH_PT_REGS) && stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER) { struct pt_regs *regs = (struct pt_regs *) (sp + STACK_FRAME_OVERHEAD);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Suravee Suthikulpanit suravee.suthikulpanit@amd.com
[ Upstream commit 6778ff5b21bd8e78c8bd547fd66437cf2657fd9b ]
Certain AMD platforms enable power gating feature for IOMMU PMC, which prevents the IOMMU driver from updating the counter while trying to validate the PMC functionality in the init_iommu_perf_ctr(). This results in disabling PMC support and the following error message:
"AMD-Vi: Unable to read/write to IOMMU perf counter"
To workaround this issue, disable power gating temporarily by programming the counter source to non-zero value while validating the counter, and restore the prior state afterward.
Signed-off-by: Suravee Suthikulpanit suravee.suthikulpanit@amd.com Tested-by: Tj (Elloe Linux) ml.linux@elloe.vision Link: https://lore.kernel.org/r/20210208122712.5048-1-suravee.suthikulpanit@amd.co... Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753 Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/amd/init.c | 45 ++++++++++++++++++++++++++++++---------- 1 file changed, 34 insertions(+), 11 deletions(-)
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 83d8ab2aed9f..01da76dc1caa 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -12,6 +12,7 @@ #include <linux/acpi.h> #include <linux/list.h> #include <linux/bitmap.h> +#include <linux/delay.h> #include <linux/slab.h> #include <linux/syscore_ops.h> #include <linux/interrupt.h> @@ -254,6 +255,8 @@ static enum iommu_init_state init_state = IOMMU_START_STATE; static int amd_iommu_enable_interrupts(void); static int __init iommu_go_to_state(enum iommu_init_state state); static void init_device_table_dma(void); +static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr, + u8 fxn, u64 *value, bool is_write);
static bool amd_iommu_pre_enabled = true;
@@ -1712,13 +1715,11 @@ static int __init init_iommu_all(struct acpi_table_header *table) return 0; }
-static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr, - u8 fxn, u64 *value, bool is_write); - -static void init_iommu_perf_ctr(struct amd_iommu *iommu) +static void __init init_iommu_perf_ctr(struct amd_iommu *iommu) { + int retry; struct pci_dev *pdev = iommu->dev; - u64 val = 0xabcd, val2 = 0, save_reg = 0; + u64 val = 0xabcd, val2 = 0, save_reg, save_src;
if (!iommu_feature(iommu, FEATURE_PC)) return; @@ -1726,17 +1727,39 @@ static void init_iommu_perf_ctr(struct amd_iommu *iommu) amd_iommu_pc_present = true;
/* save the value to restore, if writable */ - if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false)) + if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false) || + iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, false)) goto pc_false;
- /* Check if the performance counters can be written to */ - if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) || - (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) || - (val != val2)) + /* + * Disable power gating by programing the performance counter + * source to 20 (i.e. counts the reads and writes from/to IOMMU + * Reserved Register [MMIO Offset 1FF8h] that are ignored.), + * which never get incremented during this init phase. + * (Note: The event is also deprecated.) + */ + val = 20; + if (iommu_pc_get_set_reg(iommu, 0, 0, 8, &val, true)) goto pc_false;
+ /* Check if the performance counters can be written to */ + val = 0xabcd; + for (retry = 5; retry; retry--) { + if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true) || + iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false) || + val2) + break; + + /* Wait about 20 msec for power gating to disable and retry. */ + msleep(20); + } + /* restore */ - if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true)) + if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true) || + iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, true)) + goto pc_false; + + if (val != val2) goto pc_false;
pci_info(pdev, "IOMMU performance counters supported\n");
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: AngeloGioacchino Del Regno angelogioacchino.delregno@somainline.org
[ Upstream commit 785c02eb35009a4be6dbc68f4f7d916e90b7177d ]
In some rare occasions, we want to only set the RETAIN_MEM bit, but not the RETAIN_PERIPH one: this is seen on at least SDM630/636/660's GPU-GX GDSC, where unsetting and setting back the RETAIN_PERIPH bit will generate chaos and panics during GPU suspend time (mainly, the chaos is unaligned access).
For this reason, introduce a new NO_RET_PERIPH flag to the GDSC driver to address this corner case.
Signed-off-by: AngeloGioacchino Del Regno angelogioacchino.delregno@somainline.org Link: https://lore.kernel.org/r/20210113183817.447866-8-angelogioacchino.delregno@... Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/qcom/gdsc.c | 10 ++++++++-- drivers/clk/qcom/gdsc.h | 3 ++- 2 files changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/clk/qcom/gdsc.c b/drivers/clk/qcom/gdsc.c index af26e0695b86..51ed640e527b 100644 --- a/drivers/clk/qcom/gdsc.c +++ b/drivers/clk/qcom/gdsc.c @@ -183,7 +183,10 @@ static inline int gdsc_assert_reset(struct gdsc *sc) static inline void gdsc_force_mem_on(struct gdsc *sc) { int i; - u32 mask = RETAIN_MEM | RETAIN_PERIPH; + u32 mask = RETAIN_MEM; + + if (!(sc->flags & NO_RET_PERIPH)) + mask |= RETAIN_PERIPH;
for (i = 0; i < sc->cxc_count; i++) regmap_update_bits(sc->regmap, sc->cxcs[i], mask, mask); @@ -192,7 +195,10 @@ static inline void gdsc_force_mem_on(struct gdsc *sc) static inline void gdsc_clear_mem_on(struct gdsc *sc) { int i; - u32 mask = RETAIN_MEM | RETAIN_PERIPH; + u32 mask = RETAIN_MEM; + + if (!(sc->flags & NO_RET_PERIPH)) + mask |= RETAIN_PERIPH;
for (i = 0; i < sc->cxc_count; i++) regmap_update_bits(sc->regmap, sc->cxcs[i], mask, 0); diff --git a/drivers/clk/qcom/gdsc.h b/drivers/clk/qcom/gdsc.h index bd537438c793..5bb396b344d1 100644 --- a/drivers/clk/qcom/gdsc.h +++ b/drivers/clk/qcom/gdsc.h @@ -42,7 +42,7 @@ struct gdsc { #define PWRSTS_ON BIT(2) #define PWRSTS_OFF_ON (PWRSTS_OFF | PWRSTS_ON) #define PWRSTS_RET_ON (PWRSTS_RET | PWRSTS_ON) - const u8 flags; + const u16 flags; #define VOTABLE BIT(0) #define CLAMP_IO BIT(1) #define HW_CTRL BIT(2) @@ -51,6 +51,7 @@ struct gdsc { #define POLL_CFG_GDSCR BIT(5) #define ALWAYS_ON BIT(6) #define RETAIN_FF_ENABLE BIT(7) +#define NO_RET_PERIPH BIT(8) struct reset_controller_dev *rcdev; unsigned int *resets; unsigned int reset_count;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Andreas Larsson andreas@gaisler.com
[ Upstream commit bda166930c37604ffa93f2425426af6921ec575a ]
Commit cca079ef8ac29a7c02192d2bad2ffe4c0c5ffdd0 changed sparc32 to use memblocks instead of bootmem, but also made high memory available via memblock allocation which does not work together with e.g. phys_to_virt and can lead to kernel panic.
This changes back to only low memory being allocatable in the early stages, now using memblock allocation.
Signed-off-by: Andreas Larsson andreas@gaisler.com Acked-by: Mike Rapoport rppt@linux.ibm.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- arch/sparc/mm/init_32.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/arch/sparc/mm/init_32.c b/arch/sparc/mm/init_32.c index eb2946b1df8a..6139c5700ccc 100644 --- a/arch/sparc/mm/init_32.c +++ b/arch/sparc/mm/init_32.c @@ -197,6 +197,9 @@ unsigned long __init bootmem_init(unsigned long *pages_avail) size = memblock_phys_mem_size() - memblock_reserved_size(); *pages_avail = (size >> PAGE_SHIFT) - high_pages;
+ /* Only allow low memory to be allocated via memblock allocation */ + memblock_set_current_limit(max_low_pfn << PAGE_SHIFT); + return max_pfn; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Khalid Aziz khalid.aziz@oracle.com
[ Upstream commit 147d8622f2a26ef34beacc60e1ed8b66c2fa457f ]
When userspace calls mprotect() to enable ADI on an address range, do_mprotect_pkey() calls arch_validate_prot() to validate new protection flags. arch_validate_prot() for sparc looks at the first VMA associated with address range to verify if ADI can indeed be enabled on this address range. This has two issues - (1) Address range might cover multiple VMAs while arch_validate_prot() looks at only the first VMA, (2) arch_validate_prot() peeks at VMA without holding mmap lock which can result in race condition.
arch_validate_flags() from commit c462ac288f2c ("mm: Introduce arch_validate_flags()") allows for VMA flags to be validated for all VMAs that cover the address range given by user while holding mmap lock. This patch updates sparc code to move the VMA check from arch_validate_prot() to arch_validate_flags() to fix above two issues.
Suggested-by: Jann Horn jannh@google.com Suggested-by: Christoph Hellwig hch@infradead.org Suggested-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Khalid Aziz khalid.aziz@oracle.com Reviewed-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- arch/sparc/include/asm/mman.h | 54 +++++++++++++++++++---------------- 1 file changed, 29 insertions(+), 25 deletions(-)
diff --git a/arch/sparc/include/asm/mman.h b/arch/sparc/include/asm/mman.h index f94532f25db1..274217e7ed70 100644 --- a/arch/sparc/include/asm/mman.h +++ b/arch/sparc/include/asm/mman.h @@ -57,35 +57,39 @@ static inline int sparc_validate_prot(unsigned long prot, unsigned long addr) { if (prot & ~(PROT_READ | PROT_WRITE | PROT_EXEC | PROT_SEM | PROT_ADI)) return 0; - if (prot & PROT_ADI) { - if (!adi_capable()) - return 0; + return 1; +}
- if (addr) { - struct vm_area_struct *vma; +#define arch_validate_flags(vm_flags) arch_validate_flags(vm_flags) +/* arch_validate_flags() - Ensure combination of flags is valid for a + * VMA. + */ +static inline bool arch_validate_flags(unsigned long vm_flags) +{ + /* If ADI is being enabled on this VMA, check for ADI + * capability on the platform and ensure VMA is suitable + * for ADI + */ + if (vm_flags & VM_SPARC_ADI) { + if (!adi_capable()) + return false;
- vma = find_vma(current->mm, addr); - if (vma) { - /* ADI can not be enabled on PFN - * mapped pages - */ - if (vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP)) - return 0; + /* ADI can not be enabled on PFN mapped pages */ + if (vm_flags & (VM_PFNMAP | VM_MIXEDMAP)) + return false;
- /* Mergeable pages can become unmergeable - * if ADI is enabled on them even if they - * have identical data on them. This can be - * because ADI enabled pages with identical - * data may still not have identical ADI - * tags on them. Disallow ADI on mergeable - * pages. - */ - if (vma->vm_flags & VM_MERGEABLE) - return 0; - } - } + /* Mergeable pages can become unmergeable + * if ADI is enabled on them even if they + * have identical data on them. This can be + * because ADI enabled pages with identical + * data may still not have identical ADI + * tags on them. Disallow ADI on mergeable + * pages. + */ + if (vm_flags & VM_MERGEABLE) + return false; } - return 1; + return true; } #endif /* CONFIG_SPARC64 */
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Ronald Tschalär ronald@innovation.ch
[ Upstream commit 0ce1ac23149c6da939a5926c098c270c58c317a0 ]
The response to a command may never arrive or it may be corrupted (and hence dropped) for some reason. While exceedingly rare, when it did happen it blocked all further commands. One way to fix this was to do a suspend/resume. However, recovering automatically seems like a nicer option. Hence this puts a time limit (1 sec) on how long we're willing to wait for a response, after which we assume it got lost.
Signed-off-by: Ronald Tschalär ronald@innovation.ch Link: https://lore.kernel.org/r/20210217190718.11035-1-ronald@innovation.ch Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/input/keyboard/applespi.c | 21 +++++++++++++++------ 1 file changed, 15 insertions(+), 6 deletions(-)
diff --git a/drivers/input/keyboard/applespi.c b/drivers/input/keyboard/applespi.c index d22223154177..27e87c45edf2 100644 --- a/drivers/input/keyboard/applespi.c +++ b/drivers/input/keyboard/applespi.c @@ -48,6 +48,7 @@ #include <linux/efi.h> #include <linux/input.h> #include <linux/input/mt.h> +#include <linux/ktime.h> #include <linux/leds.h> #include <linux/module.h> #include <linux/spinlock.h> @@ -409,7 +410,7 @@ struct applespi_data { unsigned int cmd_msg_cntr; /* lock to protect the above parameters and flags below */ spinlock_t cmd_msg_lock; - bool cmd_msg_queued; + ktime_t cmd_msg_queued; enum applespi_evt_type cmd_evt_type;
struct led_classdev backlight_info; @@ -729,7 +730,7 @@ static void applespi_msg_complete(struct applespi_data *applespi, wake_up_all(&applespi->drain_complete);
if (is_write_msg) { - applespi->cmd_msg_queued = false; + applespi->cmd_msg_queued = 0; applespi_send_cmd_msg(applespi); }
@@ -771,8 +772,16 @@ static int applespi_send_cmd_msg(struct applespi_data *applespi) return 0;
/* check whether send is in progress */ - if (applespi->cmd_msg_queued) - return 0; + if (applespi->cmd_msg_queued) { + if (ktime_ms_delta(ktime_get(), applespi->cmd_msg_queued) < 1000) + return 0; + + dev_warn(&applespi->spi->dev, "Command %d timed out\n", + applespi->cmd_evt_type); + + applespi->cmd_msg_queued = 0; + applespi->write_active = false; + }
/* set up packet */ memset(packet, 0, APPLESPI_PACKET_SIZE); @@ -869,7 +878,7 @@ static int applespi_send_cmd_msg(struct applespi_data *applespi) return sts; }
- applespi->cmd_msg_queued = true; + applespi->cmd_msg_queued = ktime_get_coarse(); applespi->write_active = true;
return 0; @@ -1921,7 +1930,7 @@ static int __maybe_unused applespi_resume(struct device *dev) applespi->drain = false; applespi->have_cl_led_on = false; applespi->have_bl_level = 0; - applespi->cmd_msg_queued = false; + applespi->cmd_msg_queued = 0; applespi->read_active = false; applespi->write_active = false;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Martin Kaiser martin@kaiser.cx
[ Upstream commit a93c00e5f975f23592895b7e83f35de2d36b7633 ]
Fix a race where a pending interrupt could be received and the handler called before the handler's data has been setup, by converting to irq_set_chained_handler_and_data().
See also 2cf5a03cb29d ("PCI/keystone: Fix race in installing chained IRQ handler").
Based on the mail discussion, it seems ok to drop the error handling.
Link: https://lore.kernel.org/r/20210115212435.19940-3-martin@kaiser.cx Signed-off-by: Martin Kaiser martin@kaiser.cx Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/pci-xgene-msi.c | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/drivers/pci/controller/pci-xgene-msi.c b/drivers/pci/controller/pci-xgene-msi.c index 2470782cb01a..1c34c897a7e2 100644 --- a/drivers/pci/controller/pci-xgene-msi.c +++ b/drivers/pci/controller/pci-xgene-msi.c @@ -384,13 +384,9 @@ static int xgene_msi_hwirq_alloc(unsigned int cpu) if (!msi_group->gic_irq) continue;
- irq_set_chained_handler(msi_group->gic_irq, - xgene_msi_isr); - err = irq_set_handler_data(msi_group->gic_irq, msi_group); - if (err) { - pr_err("failed to register GIC IRQ handler\n"); - return -EINVAL; - } + irq_set_chained_handler_and_data(msi_group->gic_irq, + xgene_msi_isr, msi_group); + /* * Statically allocate MSI GIC IRQs to each CPU core. * With 8-core X-Gene v1, 2 MSI GIC IRQs are allocated
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Krzysztof Wilczyński kw@linux.com
[ Upstream commit 42814c438aac79746d310f413a27d5b0b959c5de ]
The for_each_available_child_of_node helper internally makes use of the of_get_next_available_child() which performs an of_node_get() on each iteration when searching for next available child node.
Should an available child node be found, then it would return a device node pointer with reference count incremented, thus early return from the middle of the loop requires an explicit of_node_put() to prevent reference count leak.
To stop the reference leak, explicitly call of_node_put() before returning after an error occurred.
Link: https://lore.kernel.org/r/20210120184810.3068794-1-kw@linux.com Signed-off-by: Krzysztof Wilczyński kw@linux.com Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/pcie-mediatek.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/controller/pcie-mediatek.c b/drivers/pci/controller/pcie-mediatek.c index cf4c18f0c25a..23548b517e4b 100644 --- a/drivers/pci/controller/pcie-mediatek.c +++ b/drivers/pci/controller/pcie-mediatek.c @@ -1035,14 +1035,14 @@ static int mtk_pcie_setup(struct mtk_pcie *pcie) err = of_pci_get_devfn(child); if (err < 0) { dev_err(dev, "failed to parse devfn: %d\n", err); - return err; + goto error_put_node; }
slot = PCI_SLOT(err);
err = mtk_pcie_parse_port(pcie, child, slot); if (err) - return err; + goto error_put_node; }
err = mtk_pcie_subsys_powerup(pcie); @@ -1058,6 +1058,9 @@ static int mtk_pcie_setup(struct mtk_pcie *pcie) mtk_pcie_subsys_powerdown(pcie);
return 0; +error_put_node: + of_node_put(child); + return err; }
static int mtk_pcie_probe(struct platform_device *pdev)
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 38009c766725a9877ea8866fc813a5460011817f ]
The structleak plugin causes the stack frame size to grow immensely:
drivers/base/test/property-entry-test.c: In function 'pe_test_reference': drivers/base/test/property-entry-test.c:481:1: error: the frame size of 2640 bytes is larger than 2048 bytes [-Werror=frame-larger-than=] 481 | } | ^ drivers/base/test/property-entry-test.c: In function 'pe_test_uints': drivers/base/test/property-entry-test.c:99:1: error: the frame size of 2592 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
Turn it off in this file.
Signed-off-by: Arnd Bergmann arnd@arndb.de Link: https://lore.kernel.org/r/20210125124533.101339-3-arnd@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/base/test/Makefile | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/base/test/Makefile b/drivers/base/test/Makefile index 3ca56367c84b..2f15fae8625f 100644 --- a/drivers/base/test/Makefile +++ b/drivers/base/test/Makefile @@ -2,3 +2,4 @@ obj-$(CONFIG_TEST_ASYNC_DRIVER_PROBE) += test_async_driver_probe.o
obj-$(CONFIG_KUNIT_DRIVER_PE_TEST) += property-entry-test.o +CFLAGS_REMOVE_property-entry-test.o += -fplugin-arg-structleak_plugin-byref -fplugin-arg-structleak_plugin-byref-all
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Bjorn Helgaas bhelgaas@google.com
[ Upstream commit b4c7d2076b4e767dd2e075a2b3a9e57753fc67f5 ]
The PCIe Bandwidth Change Notification feature logs messages when the link bandwidth changes. Some users have reported that these messages occur often enough to significantly reduce NVMe performance. GPUs also seem to generate these messages.
We don't know why the link bandwidth changes, but in the reported cases there's no indication that it's caused by hardware failures.
Remove the bandwidth change notifications for now. Hopefully we can add this back when we have a better understanding of why this happens and how we can make the messages useful instead of overwhelming.
Link: https://lore.kernel.org/r/20200115221008.GA191037@google.com/ Link: https://lore.kernel.org/r/155605909349.3575.13433421148215616375.stgit@gimli... Link: https://bugzilla.kernel.org/show_bug.cgi?id=206197 Signed-off-by: Bjorn Helgaas bhelgaas@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pcie/Kconfig | 8 -- drivers/pci/pcie/Makefile | 1 - drivers/pci/pcie/bw_notification.c | 138 ----------------------------- drivers/pci/pcie/portdrv.h | 6 -- drivers/pci/pcie/portdrv_pci.c | 1 - 5 files changed, 154 deletions(-) delete mode 100644 drivers/pci/pcie/bw_notification.c
diff --git a/drivers/pci/pcie/Kconfig b/drivers/pci/pcie/Kconfig index 3946555a6042..45a2ef702b45 100644 --- a/drivers/pci/pcie/Kconfig +++ b/drivers/pci/pcie/Kconfig @@ -133,14 +133,6 @@ config PCIE_PTM This is only useful if you have devices that support PTM, but it is safe to enable even if you don't.
-config PCIE_BW - bool "PCI Express Bandwidth Change Notification" - depends on PCIEPORTBUS - help - This enables PCI Express Bandwidth Change Notification. If - you know link width or rate changes occur only to correct - unreliable links, you may answer Y. - config PCIE_EDR bool "PCI Express Error Disconnect Recover support" depends on PCIE_DPC && ACPI diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile index d9697892fa3e..b2980db88cc0 100644 --- a/drivers/pci/pcie/Makefile +++ b/drivers/pci/pcie/Makefile @@ -12,5 +12,4 @@ obj-$(CONFIG_PCIEAER_INJECT) += aer_inject.o obj-$(CONFIG_PCIE_PME) += pme.o obj-$(CONFIG_PCIE_DPC) += dpc.o obj-$(CONFIG_PCIE_PTM) += ptm.o -obj-$(CONFIG_PCIE_BW) += bw_notification.o obj-$(CONFIG_PCIE_EDR) += edr.o diff --git a/drivers/pci/pcie/bw_notification.c b/drivers/pci/pcie/bw_notification.c deleted file mode 100644 index 565d23cccb8b..000000000000 --- a/drivers/pci/pcie/bw_notification.c +++ /dev/null @@ -1,138 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0+ -/* - * PCI Express Link Bandwidth Notification services driver - * Author: Alexandru Gagniuc mr.nuke.me@gmail.com - * - * Copyright (C) 2019, Dell Inc - * - * The PCIe Link Bandwidth Notification provides a way to notify the - * operating system when the link width or data rate changes. This - * capability is required for all root ports and downstream ports - * supporting links wider than x1 and/or multiple link speeds. - * - * This service port driver hooks into the bandwidth notification interrupt - * and warns when links become degraded in operation. - */ - -#define dev_fmt(fmt) "bw_notification: " fmt - -#include "../pci.h" -#include "portdrv.h" - -static bool pcie_link_bandwidth_notification_supported(struct pci_dev *dev) -{ - int ret; - u32 lnk_cap; - - ret = pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnk_cap); - return (ret == PCIBIOS_SUCCESSFUL) && (lnk_cap & PCI_EXP_LNKCAP_LBNC); -} - -static void pcie_enable_link_bandwidth_notification(struct pci_dev *dev) -{ - u16 lnk_ctl; - - pcie_capability_write_word(dev, PCI_EXP_LNKSTA, PCI_EXP_LNKSTA_LBMS); - - pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &lnk_ctl); - lnk_ctl |= PCI_EXP_LNKCTL_LBMIE; - pcie_capability_write_word(dev, PCI_EXP_LNKCTL, lnk_ctl); -} - -static void pcie_disable_link_bandwidth_notification(struct pci_dev *dev) -{ - u16 lnk_ctl; - - pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &lnk_ctl); - lnk_ctl &= ~PCI_EXP_LNKCTL_LBMIE; - pcie_capability_write_word(dev, PCI_EXP_LNKCTL, lnk_ctl); -} - -static irqreturn_t pcie_bw_notification_irq(int irq, void *context) -{ - struct pcie_device *srv = context; - struct pci_dev *port = srv->port; - u16 link_status, events; - int ret; - - ret = pcie_capability_read_word(port, PCI_EXP_LNKSTA, &link_status); - events = link_status & PCI_EXP_LNKSTA_LBMS; - - if (ret != PCIBIOS_SUCCESSFUL || !events) - return IRQ_NONE; - - pcie_capability_write_word(port, PCI_EXP_LNKSTA, events); - pcie_update_link_speed(port->subordinate, link_status); - return IRQ_WAKE_THREAD; -} - -static irqreturn_t pcie_bw_notification_handler(int irq, void *context) -{ - struct pcie_device *srv = context; - struct pci_dev *port = srv->port; - struct pci_dev *dev; - - /* - * Print status from downstream devices, not this root port or - * downstream switch port. - */ - down_read(&pci_bus_sem); - list_for_each_entry(dev, &port->subordinate->devices, bus_list) - pcie_report_downtraining(dev); - up_read(&pci_bus_sem); - - return IRQ_HANDLED; -} - -static int pcie_bandwidth_notification_probe(struct pcie_device *srv) -{ - int ret; - - /* Single-width or single-speed ports do not have to support this. */ - if (!pcie_link_bandwidth_notification_supported(srv->port)) - return -ENODEV; - - ret = request_threaded_irq(srv->irq, pcie_bw_notification_irq, - pcie_bw_notification_handler, - IRQF_SHARED, "PCIe BW notif", srv); - if (ret) - return ret; - - pcie_enable_link_bandwidth_notification(srv->port); - pci_info(srv->port, "enabled with IRQ %d\n", srv->irq); - - return 0; -} - -static void pcie_bandwidth_notification_remove(struct pcie_device *srv) -{ - pcie_disable_link_bandwidth_notification(srv->port); - free_irq(srv->irq, srv); -} - -static int pcie_bandwidth_notification_suspend(struct pcie_device *srv) -{ - pcie_disable_link_bandwidth_notification(srv->port); - return 0; -} - -static int pcie_bandwidth_notification_resume(struct pcie_device *srv) -{ - pcie_enable_link_bandwidth_notification(srv->port); - return 0; -} - -static struct pcie_port_service_driver pcie_bandwidth_notification_driver = { - .name = "pcie_bw_notification", - .port_type = PCIE_ANY_PORT, - .service = PCIE_PORT_SERVICE_BWNOTIF, - .probe = pcie_bandwidth_notification_probe, - .suspend = pcie_bandwidth_notification_suspend, - .resume = pcie_bandwidth_notification_resume, - .remove = pcie_bandwidth_notification_remove, -}; - -int __init pcie_bandwidth_notification_init(void) -{ - return pcie_port_service_register(&pcie_bandwidth_notification_driver); -} diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h index af7cf237432a..2ff5724b8f13 100644 --- a/drivers/pci/pcie/portdrv.h +++ b/drivers/pci/pcie/portdrv.h @@ -53,12 +53,6 @@ int pcie_dpc_init(void); static inline int pcie_dpc_init(void) { return 0; } #endif
-#ifdef CONFIG_PCIE_BW -int pcie_bandwidth_notification_init(void); -#else -static inline int pcie_bandwidth_notification_init(void) { return 0; } -#endif - /* Port Type */ #define PCIE_ANY_PORT (~0)
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c index 0b250bc5f405..8bd4992a4f32 100644 --- a/drivers/pci/pcie/portdrv_pci.c +++ b/drivers/pci/pcie/portdrv_pci.c @@ -255,7 +255,6 @@ static void __init pcie_init_services(void) pcie_pme_init(); pcie_dpc_init(); pcie_hp_init(); - pcie_bandwidth_notification_init(); }
static int __init pcie_portdrv_init(void)
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Theodore Ts'o tytso@mit.edu
[ Upstream commit 027f14f5357279655c3ebc6d14daff8368d4f53f ]
If we try to make any changes via the journal between when the journal is initialized, but before the multi-block allocated is initialized, we will end up deferencing a NULL pointer when the journal commit callback function calls ext4_process_freed_data().
The proximate cause of this failure was commit 2d01ddc86606 ("ext4: save error info to sb through journal if available") since file system corruption problems detected before the call to ext4_mb_init() would result in a journal commit before we aborted the mount of the file system.... and we would then trigger the NULL pointer deref.
Link: https://lore.kernel.org/r/YAm8qH/0oo2ofSMR@mit.edu Reported-by: Murphy Zhou jencce.kernel@gmail.com Reviewed-by: Jan Kara jack@suse.cz Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext4/super.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 9a6f9875aa34..2ae0af1c88c7 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -4875,7 +4875,6 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
set_task_ioprio(sbi->s_journal->j_task, journal_ioprio);
- sbi->s_journal->j_commit_callback = ext4_journal_commit_callback; sbi->s_journal->j_submit_inode_data_buffers = ext4_journal_submit_inode_data_buffers; sbi->s_journal->j_finish_inode_data_buffers = @@ -4987,6 +4986,14 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) goto failed_mount5; }
+ /* + * We can only set up the journal commit callback once + * mballoc is initialized + */ + if (sbi->s_journal) + sbi->s_journal->j_commit_callback = + ext4_journal_commit_callback; + block = ext4_count_free_clusters(sb); ext4_free_blocks_count_set(sbi->s_es, EXT4_C2B(sbi, block));
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
[ Upstream commit 9b82f13e7ef316cdc0a8858f1349f4defce3f9e0 ]
Right now if SUBLEVEL becomes larger than 255 it will overflow into the territory of PATCHLEVEL, causing havoc in userspace that tests for specific kernel version.
While userspace code tests for MAJOR and PATCHLEVEL, it doesn't test SUBLEVEL at any point as ABI changes don't happen in the context of stable tree.
Thus, to avoid overflows, simply clamp SUBLEVEL to it's maximum value in the context of LINUX_VERSION_CODE. This does not affect "make kernelversion" and such.
Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Masahiro Yamada masahiroy@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- Makefile | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/Makefile b/Makefile index 472136a7881e..80d166cadaa3 100644 --- a/Makefile +++ b/Makefile @@ -1246,9 +1246,15 @@ define filechk_utsrelease.h endef
define filechk_version.h - echo #define LINUX_VERSION_CODE $(shell \ - expr $(VERSION) * 65536 + 0$(PATCHLEVEL) * 256 + 0$(SUBLEVEL)); \ - echo '#define KERNEL_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + (c))' + if [ $(SUBLEVEL) -gt 255 ]; then \ + echo #define LINUX_VERSION_CODE $(shell \ + expr $(VERSION) * 65536 + 0$(PATCHLEVEL) * 256 + 255); \ + else \ + echo #define LINUX_VERSION_CODE $(shell \ + expr $(VERSION) * 65536 + 0$(PATCHLEVEL) * 256 + $(SUBLEVEL)); \ + fi; \ + echo '#define KERNEL_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + \ + ((c) > 255 ? 255 : (c)))' endef
$(version_h): FORCE
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Geert Uytterhoeven geert+renesas@glider.be
[ Upstream commit f6bda644fa3a7070621c3bf12cd657f69a42f170 ]
Kmemleak reports:
unreferenced object 0xc328de40 (size 64): comm "kworker/1:1", pid 21, jiffies 4294938212 (age 1484.670s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 e0 d8 fc eb 00 00 00 00 ................ 00 00 10 fe 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace: [<ad758d10>] pci_register_io_range+0x3c/0x80 [<2c7f139e>] of_pci_range_to_resource+0x48/0xc0 [<f079ecc8>] devm_of_pci_get_host_bridge_resources.constprop.0+0x2ac/0x3ac [<e999753b>] devm_of_pci_bridge_init+0x60/0x1b8 [<a895b229>] devm_pci_alloc_host_bridge+0x54/0x64 [<e451ddb0>] rcar_pcie_probe+0x2c/0x644
In case a PCI host driver's probe is deferred, the same I/O range may be allocated again, and be ignored, causing a memory leak.
Fix this by (a) letting logic_pio_register_range() return -EEXIST if the passed range already exists, so pci_register_io_range() will free it, and by (b) making pci_register_io_range() not consider -EEXIST an error condition.
Link: https://lore.kernel.org/r/20210202100332.829047-1-geert+renesas@glider.be Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Signed-off-by: Bjorn Helgaas bhelgaas@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pci.c | 4 ++++ lib/logic_pio.c | 3 +++ 2 files changed, 7 insertions(+)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index ba791165ed19..9449dfde2841 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -4029,6 +4029,10 @@ int pci_register_io_range(struct fwnode_handle *fwnode, phys_addr_t addr, ret = logic_pio_register_range(range); if (ret) kfree(range); + + /* Ignore duplicates due to deferred probing */ + if (ret == -EEXIST) + ret = 0; #endif
return ret; diff --git a/lib/logic_pio.c b/lib/logic_pio.c index f32fe481b492..07b4b9a1f54b 100644 --- a/lib/logic_pio.c +++ b/lib/logic_pio.c @@ -28,6 +28,8 @@ static DEFINE_MUTEX(io_range_mutex); * @new_range: pointer to the IO range to be registered. * * Returns 0 on success, the error code in case of failure. + * If the range already exists, -EEXIST will be returned, which should be + * considered a success. * * Register a new IO range node in the IO range list. */ @@ -51,6 +53,7 @@ int logic_pio_register_range(struct logic_pio_hwaddr *new_range) list_for_each_entry(range, &io_range_list, list) { if (range->fwnode == new_range->fwnode) { /* range already there */ + ret = -EEXIST; goto end_register; } if (range->flags == LOGIC_PIO_CPU_MMIO &&
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Keita Suzuki keitasuzuki.park@sslab.ics.keio.ac.jp
[ Upstream commit 58cab46c622d6324e47bd1c533693c94498e4172 ]
Struct i40e_veb is allocated in function i40e_setup_pf_switch, and stored to an array field veb inside struct i40e_pf. However when i40e_setup_misc_vector fails, this memory leaks.
Fix this by calling exit and teardown functions.
Signed-off-by: Keita Suzuki keitasuzuki.park@sslab.ics.keio.ac.jp Tested-by: Tony Brelinski tonyx.brelinski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/i40e/i40e_main.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index fcd6f623f2fd..4a2d03cada01 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -15100,6 +15100,8 @@ static int i40e_probe(struct pci_dev *pdev, const struct pci_device_id *ent) if (err) { dev_info(&pdev->dev, "setup of misc vector failed: %d\n", err); + i40e_cloud_filter_exit(pf); + i40e_fdir_teardown(pf); goto err_vsis; } }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Keith Busch kbusch@kernel.org
[ Upstream commit 387c72cdd7fb6bef650fb078d0f6ae9682abf631 ]
Overwriting the frozen detected status with the result of the link reset loses the NEED_RESET result that drivers are depending on for error handling to report the .slot_reset() callback. Retain this status so that subsequent error handling has the correct flow.
Link: https://lore.kernel.org/r/20210104230300.1277180-4-kbusch@kernel.org Reported-by: Hinko Kocevar hinko.kocevar@ess.eu Tested-by: Hedi Berriche hedi.berriche@hpe.com Signed-off-by: Keith Busch kbusch@kernel.org Signed-off-by: Bjorn Helgaas bhelgaas@google.com Acked-by: Sean V Kelley sean.v.kelley@intel.com Acked-by: Hedi Berriche hedi.berriche@hpe.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pcie/err.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c index 510f31f0ef6d..4798bd6de97d 100644 --- a/drivers/pci/pcie/err.c +++ b/drivers/pci/pcie/err.c @@ -198,8 +198,7 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev, pci_dbg(bridge, "broadcast error_detected message\n"); if (state == pci_channel_io_frozen) { pci_walk_bridge(bridge, report_frozen_detected, &status); - status = reset_subordinates(bridge); - if (status != PCI_ERS_RESULT_RECOVERED) { + if (reset_subordinates(bridge) != PCI_ERS_RESULT_RECOVERED) { pci_warn(bridge, "subordinate device reset failed\n"); goto failed; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Andrey Konovalov andreyknvl@google.com
[ Upstream commit e66e1799a76621003e5b04c9c057826a2152e103 ]
Since the hardware tag-based KASAN mode might not have a redzone that comes after an allocated object (when kasan.mode=prod is enabled), the kasan_bitops_tags() test ends up corrupting the next object in memory.
Change the test so it always accesses the redzone that lies within the allocated object's boundaries.
Link: https://linux-review.googlesource.com/id/I67f51d1ee48f0a8d0fe2658c2a39e4879f... Link: https://lkml.kernel.org/r/7d452ce4ae35bb1988d2c9244dfea56cf2cc9315.161073311... Signed-off-by: Andrey Konovalov andreyknvl@google.com Reviewed-by: Marco Elver elver@google.com Reviewed-by: Alexander Potapenko glider@google.com Cc: Andrey Ryabinin aryabinin@virtuozzo.com Cc: Branislav Rankov Branislav.Rankov@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Dmitry Vyukov dvyukov@google.com Cc: Evgenii Stepanov eugenis@google.com Cc: Kevin Brodsky kevin.brodsky@arm.com Cc: Peter Collingbourne pcc@google.com Cc: Vincenzo Frascino vincenzo.frascino@arm.com Cc: Will Deacon will.deacon@arm.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- lib/test_kasan.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/lib/test_kasan.c b/lib/test_kasan.c index 2947274cc2d3..5a2f104ca13f 100644 --- a/lib/test_kasan.c +++ b/lib/test_kasan.c @@ -737,13 +737,13 @@ static void kasan_bitops_tags(struct kunit *test) return; }
- /* Allocation size will be rounded to up granule size, which is 16. */ - bits = kzalloc(sizeof(*bits), GFP_KERNEL); + /* kmalloc-64 cache will be used and the last 16 bytes will be the redzone. */ + bits = kzalloc(48, GFP_KERNEL); KUNIT_ASSERT_NOT_ERR_OR_NULL(test, bits);
- /* Do the accesses past the 16 allocated bytes. */ - kasan_bitops_modify(test, BITS_PER_LONG, &bits[1]); - kasan_bitops_test_and_modify(test, BITS_PER_LONG + BITS_PER_BYTE, &bits[1]); + /* Do the accesses past the 48 allocated bytes, but within the redone. */ + kasan_bitops_modify(test, BITS_PER_LONG, (void *)bits + 48); + kasan_bitops_test_and_modify(test, BITS_PER_LONG + BITS_PER_BYTE, (void *)bits + 48);
kfree(bits); }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Heiko Carstens hca@linux.ibm.com
[ Upstream commit 62c8dca9e194326802b43c60763f856d782b225c ]
Avoid a potentially large stack frame and overflow by making "cpumask_t avail" a static variable. There is no concurrent access due to the existing locking.
Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/smp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c index 27c763014114..1bae4a65416b 100644 --- a/arch/s390/kernel/smp.c +++ b/arch/s390/kernel/smp.c @@ -770,7 +770,7 @@ static int smp_add_core(struct sclp_core_entry *core, cpumask_t *avail, static int __smp_rescan_cpus(struct sclp_core_info *info, bool early) { struct sclp_core_entry *core; - cpumask_t avail; + static cpumask_t avail; bool configured; u16 core_id; int nr, i;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: David Hildenbrand david@redhat.com
[ Upstream commit e9a2e48e8704c9d20a625c6f2357147d03ea7b97 ]
No need to store the value for each and every memory block, as we can easily query the value at runtime. Reshuffle the members to optimize the memory layout. Also, let's clarify what the interface once was used for and why it's legacy nowadays.
"phys_device" was used on s390x in older versions of lsmem[2]/chmem[3], back when they were still part of s390x-tools. They were later replaced by the variants in linux-utils. For example, RHEL6 and RHEL7 contain lsmem/chmem from s390-utils. RHEL8 switched to versions from util-linux on s390x [4].
"phys_device" was added with sysfs support for memory hotplug in commit 3947be1969a9 ("[PATCH] memory hotplug: sysfs and add/remove functions") in 2005. It always returned 0.
s390x started returning something != 0 on some setups (if sclp.rzm is set by HW) in 2010 via commit 57b552ba0b2f ("memory hotplug/s390: set phys_device").
For s390x, it allowed for identifying which memory block devices belong to the same storage increment (RZM). Only if all memory block devices comprising a single storage increment were offline, the memory could actually be removed in the hypervisor.
Since commit e5d709bb5fb7 ("s390/memory hotplug: provide memory_block_size_bytes() function") in 2013 a memory block device spans at least one storage increment - which is why the interface isn't really helpful/used anymore (except by old lsmem/chmem tools).
There were once RFC patches to make use of "phys_device" in ACPI context; however, the underlying problem could be solved using different interfaces [1].
[1] https://patchwork.kernel.org/patch/2163871/ [2] https://github.com/ibm-s390-tools/s390-tools/blob/v2.1.0/zconf/lsmem [3] https://github.com/ibm-s390-tools/s390-tools/blob/v2.1.0/zconf/chmem [4] https://bugzilla.redhat.com/show_bug.cgi?id=1504134
Link: https://lkml.kernel.org/r/20210201181347.13262-2-david@redhat.com Signed-off-by: David Hildenbrand david@redhat.com Acked-by: Michal Hocko mhocko@suse.com Reviewed-by: Oscar Salvador osalvador@suse.de Cc: Dave Hansen dave.hansen@intel.com Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Gerald Schaefer gerald.schaefer@linux.ibm.com Cc: Jonathan Corbet corbet@lwn.net Cc: "Rafael J. Wysocki" rafael@kernel.org Cc: Mauro Carvalho Chehab mchehab+huawei@kernel.org Cc: Ilya Dryomov idryomov@gmail.com Cc: Vaibhav Jain vaibhav@linux.ibm.com Cc: Tom Rix trix@redhat.com Cc: Geert Uytterhoeven geert+renesas@glider.be Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../ABI/testing/sysfs-devices-memory | 5 ++-- .../admin-guide/mm/memory-hotplug.rst | 4 +-- drivers/base/memory.c | 25 +++++++------------ include/linux/memory.h | 3 +-- 4 files changed, 15 insertions(+), 22 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-devices-memory b/Documentation/ABI/testing/sysfs-devices-memory index 246a45b96d22..58dbc592bc57 100644 --- a/Documentation/ABI/testing/sysfs-devices-memory +++ b/Documentation/ABI/testing/sysfs-devices-memory @@ -26,8 +26,9 @@ Date: September 2008 Contact: Badari Pulavarty pbadari@us.ibm.com Description: The file /sys/devices/system/memory/memoryX/phys_device - is read-only and is designed to show the name of physical - memory device. Implementation is currently incomplete. + is read-only; it is a legacy interface only ever used on s390x + to expose the covered storage increment. +Users: Legacy s390-tools lsmem/chmem
What: /sys/devices/system/memory/memoryX/phys_index Date: September 2008 diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst index 5c4432c96c4b..245739f55ac7 100644 --- a/Documentation/admin-guide/mm/memory-hotplug.rst +++ b/Documentation/admin-guide/mm/memory-hotplug.rst @@ -160,8 +160,8 @@ Under each memory block, you can see 5 files:
"online_movable", "online", "offline" command which will be performed on all sections in the block. -``phys_device`` read-only: designed to show the name of physical memory - device. This is not well implemented now. +``phys_device`` read-only: legacy interface only ever used on s390x to + expose the covered storage increment. ``removable`` read-only: contains an integer value indicating whether the memory block is removable or not removable. A value of 1 indicates that the memory diff --git a/drivers/base/memory.c b/drivers/base/memory.c index eef4ffb6122c..de058d15b33e 100644 --- a/drivers/base/memory.c +++ b/drivers/base/memory.c @@ -290,20 +290,20 @@ static ssize_t state_store(struct device *dev, struct device_attribute *attr, }
/* - * phys_device is a bad name for this. What I really want - * is a way to differentiate between memory ranges that - * are part of physical devices that constitute - * a complete removable unit or fru. - * i.e. do these ranges belong to the same physical device, - * s.t. if I offline all of these sections I can then - * remove the physical device? + * Legacy interface that we cannot remove: s390x exposes the storage increment + * covered by a memory block, allowing for identifying which memory blocks + * comprise a storage increment. Since a memory block spans complete + * storage increments nowadays, this interface is basically unused. Other + * archs never exposed != 0. */ static ssize_t phys_device_show(struct device *dev, struct device_attribute *attr, char *buf) { struct memory_block *mem = to_memory_block(dev); + unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr);
- return sysfs_emit(buf, "%d\n", mem->phys_device); + return sysfs_emit(buf, "%d\n", + arch_get_memory_phys_device(start_pfn)); }
#ifdef CONFIG_MEMORY_HOTREMOVE @@ -488,11 +488,7 @@ static DEVICE_ATTR_WO(soft_offline_page); static DEVICE_ATTR_WO(hard_offline_page); #endif
-/* - * Note that phys_device is optional. It is here to allow for - * differentiation between which *physical* devices each - * section belongs to... - */ +/* See phys_device_show(). */ int __weak arch_get_memory_phys_device(unsigned long start_pfn) { return 0; @@ -574,7 +570,6 @@ int register_memory(struct memory_block *memory) static int init_memory_block(unsigned long block_id, unsigned long state) { struct memory_block *mem; - unsigned long start_pfn; int ret = 0;
mem = find_memory_block_by_id(block_id); @@ -588,8 +583,6 @@ static int init_memory_block(unsigned long block_id, unsigned long state)
mem->start_section_nr = block_id * sections_per_block; mem->state = state; - start_pfn = section_nr_to_pfn(mem->start_section_nr); - mem->phys_device = arch_get_memory_phys_device(start_pfn); mem->nid = NUMA_NO_NODE;
ret = register_memory(mem); diff --git a/include/linux/memory.h b/include/linux/memory.h index 439a89e758d8..4da95e684e20 100644 --- a/include/linux/memory.h +++ b/include/linux/memory.h @@ -27,9 +27,8 @@ struct memory_block { unsigned long start_section_nr; unsigned long state; /* serialized by the dev->lock */ int online_type; /* for passing data to online routine */ - int phys_device; /* to which fru does this belong? */ - struct device dev; int nid; /* NID for this memory block */ + struct device dev; };
int arch_get_memory_phys_device(unsigned long start_pfn);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Lin Feng linf@wangsu.com
[ Upstream commit 3b3376f222e3ab58367d9dd405cafd09d5e37b7c ]
Apart from subsystem specific .proc_handler handler, all ctl_tables with extra1 and extra2 members set should use proc_dointvec_minmax instead of proc_dointvec, or the limit set in extra* never work and potentially echo underflow values(negative numbers) is likely make system unstable.
Especially vfs_cache_pressure and zone_reclaim_mode, -1 is apparently not a valid value, but we can set to them. And then kernel may crash.
# echo -1 > /proc/sys/vm/vfs_cache_pressure
Link: https://lkml.kernel.org/r/20201223105535.2875-1-linf@wangsu.com Signed-off-by: Lin Feng linf@wangsu.com Cc: Alexey Dobriyan adobriyan@gmail.com Cc: "Eric W. Biederman" ebiederm@xmission.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sysctl.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/kernel/sysctl.c b/kernel/sysctl.c index c9fbdd848138..62fbd09b5dc1 100644 --- a/kernel/sysctl.c +++ b/kernel/sysctl.c @@ -2962,7 +2962,7 @@ static struct ctl_table vm_table[] = { .data = &block_dump, .maxlen = sizeof(block_dump), .mode = 0644, - .proc_handler = proc_dointvec, + .proc_handler = proc_dointvec_minmax, .extra1 = SYSCTL_ZERO, }, { @@ -2970,7 +2970,7 @@ static struct ctl_table vm_table[] = { .data = &sysctl_vfs_cache_pressure, .maxlen = sizeof(sysctl_vfs_cache_pressure), .mode = 0644, - .proc_handler = proc_dointvec, + .proc_handler = proc_dointvec_minmax, .extra1 = SYSCTL_ZERO, }, #if defined(HAVE_ARCH_PICK_MMAP_LAYOUT) || \ @@ -2980,7 +2980,7 @@ static struct ctl_table vm_table[] = { .data = &sysctl_legacy_va_layout, .maxlen = sizeof(sysctl_legacy_va_layout), .mode = 0644, - .proc_handler = proc_dointvec, + .proc_handler = proc_dointvec_minmax, .extra1 = SYSCTL_ZERO, }, #endif @@ -2990,7 +2990,7 @@ static struct ctl_table vm_table[] = { .data = &node_reclaim_mode, .maxlen = sizeof(node_reclaim_mode), .mode = 0644, - .proc_handler = proc_dointvec, + .proc_handler = proc_dointvec_minmax, .extra1 = SYSCTL_ZERO, }, {
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Mike Christie michael.christie@oracle.com
[ Upstream commit d28d48c699779973ab9a3bd0e5acfa112bd4fdef ]
If iscsi_prep_scsi_cmd_pdu() fails we try to add it back to the cmdqueue, but we leave it partially setup. We don't have functions that can undo the pdu and init task setup. We only have cleanup_task which can clean up both parts. So this has us just fail the cmd and go through the standard cleanup routine and then have the SCSI midlayer retry it like is done when it fails in the queuecommand path.
Link: https://lore.kernel.org/r/20210207044608.27585-2-michael.christie@oracle.com Reviewed-by: Lee Duncan lduncan@suse.com Signed-off-by: Mike Christie michael.christie@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/libiscsi.c | 11 +++-------- 1 file changed, 3 insertions(+), 8 deletions(-)
diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c index 1851015299b3..af40de7e51e7 100644 --- a/drivers/scsi/libiscsi.c +++ b/drivers/scsi/libiscsi.c @@ -1532,14 +1532,9 @@ static int iscsi_data_xmit(struct iscsi_conn *conn) } rc = iscsi_prep_scsi_cmd_pdu(conn->task); if (rc) { - if (rc == -ENOMEM || rc == -EACCES) { - spin_lock_bh(&conn->taskqueuelock); - list_add_tail(&conn->task->running, - &conn->cmdqueue); - conn->task = NULL; - spin_unlock_bh(&conn->taskqueuelock); - goto done; - } else + if (rc == -ENOMEM || rc == -EACCES) + fail_scsi_task(conn->task, DID_IMM_RETRY); + else fail_scsi_task(conn->task, DID_ABORT); spin_lock_bh(&conn->taskqueuelock); continue;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Aleksandr Miloserdov a.miloserdov@yadro.com
[ Upstream commit 1c73e0c5e54d5f7d77f422a10b03ebe61eaed5ad ]
TCM doesn't properly handle underflow case for service actions. One way to prevent it is to always complete command with target_complete_cmd_with_length(), however it requires access to data_sg, which is not always available.
This change introduces target_set_cmd_data_length() function which allows to set command data length before completing it.
Link: https://lore.kernel.org/r/20210209072202.41154-2-a.miloserdov@yadro.com Reviewed-by: Roman Bolshakov r.bolshakov@yadro.com Reviewed-by: Bodo Stroesser bostroesser@gmail.com Signed-off-by: Aleksandr Miloserdov a.miloserdov@yadro.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/target/target_core_transport.c | 15 +++++++++++---- include/target/target_core_backend.h | 1 + 2 files changed, 12 insertions(+), 4 deletions(-)
diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c index fca4bd079d02..8a4d58fdc9fe 100644 --- a/drivers/target/target_core_transport.c +++ b/drivers/target/target_core_transport.c @@ -879,11 +879,9 @@ void target_complete_cmd(struct se_cmd *cmd, u8 scsi_status) } EXPORT_SYMBOL(target_complete_cmd);
-void target_complete_cmd_with_length(struct se_cmd *cmd, u8 scsi_status, int length) +void target_set_cmd_data_length(struct se_cmd *cmd, int length) { - if ((scsi_status == SAM_STAT_GOOD || - cmd->se_cmd_flags & SCF_TREAT_READ_AS_NORMAL) && - length < cmd->data_length) { + if (length < cmd->data_length) { if (cmd->se_cmd_flags & SCF_UNDERFLOW_BIT) { cmd->residual_count += cmd->data_length - length; } else { @@ -893,6 +891,15 @@ void target_complete_cmd_with_length(struct se_cmd *cmd, u8 scsi_status, int len
cmd->data_length = length; } +} +EXPORT_SYMBOL(target_set_cmd_data_length); + +void target_complete_cmd_with_length(struct se_cmd *cmd, u8 scsi_status, int length) +{ + if (scsi_status == SAM_STAT_GOOD || + cmd->se_cmd_flags & SCF_TREAT_READ_AS_NORMAL) { + target_set_cmd_data_length(cmd, length); + }
target_complete_cmd(cmd, scsi_status); } diff --git a/include/target/target_core_backend.h b/include/target/target_core_backend.h index 6336780d83a7..ce2fba49c95d 100644 --- a/include/target/target_core_backend.h +++ b/include/target/target_core_backend.h @@ -72,6 +72,7 @@ int transport_backend_register(const struct target_backend_ops *); void target_backend_unregister(const struct target_backend_ops *);
void target_complete_cmd(struct se_cmd *, u8); +void target_set_cmd_data_length(struct se_cmd *, int); void target_complete_cmd_with_length(struct se_cmd *, u8, int);
void transport_copy_sense_to_cmd(struct se_cmd *, unsigned char *);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Aleksandr Miloserdov a.miloserdov@yadro.com
[ Upstream commit 14d24e2cc77411301e906a8cf41884739de192de ]
TCM buffer length doesn't necessarily equal 8 + ADDITIONAL LENGTH which might be considered an underflow in case of Data-In size being greater than 8 + ADDITIONAL LENGTH. So truncate buffer length to prevent underflow.
Link: https://lore.kernel.org/r/20210209072202.41154-3-a.miloserdov@yadro.com Reviewed-by: Roman Bolshakov r.bolshakov@yadro.com Reviewed-by: Bodo Stroesser bostroesser@gmail.com Signed-off-by: Aleksandr Miloserdov a.miloserdov@yadro.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/target/target_core_pr.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/drivers/target/target_core_pr.c b/drivers/target/target_core_pr.c index 14db5e568f22..d4cc43afe05b 100644 --- a/drivers/target/target_core_pr.c +++ b/drivers/target/target_core_pr.c @@ -3739,6 +3739,7 @@ core_scsi3_pri_read_keys(struct se_cmd *cmd) spin_unlock(&dev->t10_pr.registration_lock);
put_unaligned_be32(add_len, &buf[4]); + target_set_cmd_data_length(cmd, 8 + add_len);
transport_kunmap_data_sg(cmd);
@@ -3757,7 +3758,7 @@ core_scsi3_pri_read_reservation(struct se_cmd *cmd) struct t10_pr_registration *pr_reg; unsigned char *buf; u64 pr_res_key; - u32 add_len = 16; /* Hardcoded to 16 when a reservation is held. */ + u32 add_len = 0;
if (cmd->data_length < 8) { pr_err("PRIN SA READ_RESERVATIONS SCSI Data Length: %u" @@ -3775,8 +3776,9 @@ core_scsi3_pri_read_reservation(struct se_cmd *cmd) pr_reg = dev->dev_pr_res_holder; if (pr_reg) { /* - * Set the hardcoded Additional Length + * Set the Additional Length to 16 when a reservation is held */ + add_len = 16; put_unaligned_be32(add_len, &buf[4]);
if (cmd->data_length < 22) @@ -3812,6 +3814,8 @@ core_scsi3_pri_read_reservation(struct se_cmd *cmd) (pr_reg->pr_res_type & 0x0f); }
+ target_set_cmd_data_length(cmd, 8 + add_len); + err: spin_unlock(&dev->dev_reservation_lock); transport_kunmap_data_sg(cmd); @@ -3830,7 +3834,7 @@ core_scsi3_pri_report_capabilities(struct se_cmd *cmd) struct se_device *dev = cmd->se_dev; struct t10_reservation *pr_tmpl = &dev->t10_pr; unsigned char *buf; - u16 add_len = 8; /* Hardcoded to 8. */ + u16 len = 8; /* Hardcoded to 8. */
if (cmd->data_length < 6) { pr_err("PRIN SA REPORT_CAPABILITIES SCSI Data Length:" @@ -3842,7 +3846,7 @@ core_scsi3_pri_report_capabilities(struct se_cmd *cmd) if (!buf) return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
- put_unaligned_be16(add_len, &buf[0]); + put_unaligned_be16(len, &buf[0]); buf[2] |= 0x10; /* CRH: Compatible Reservation Hanlding bit. */ buf[2] |= 0x08; /* SIP_C: Specify Initiator Ports Capable bit */ buf[2] |= 0x04; /* ATP_C: All Target Ports Capable bit */ @@ -3871,6 +3875,8 @@ core_scsi3_pri_report_capabilities(struct se_cmd *cmd) buf[4] |= 0x02; /* PR_TYPE_WRITE_EXCLUSIVE */ buf[5] |= 0x01; /* PR_TYPE_EXCLUSIVE_ACCESS_ALLREG */
+ target_set_cmd_data_length(cmd, len); + transport_kunmap_data_sg(cmd);
return 0; @@ -4031,6 +4037,7 @@ core_scsi3_pri_read_full_status(struct se_cmd *cmd) * Set ADDITIONAL_LENGTH */ put_unaligned_be32(add_len, &buf[4]); + target_set_cmd_data_length(cmd, 8 + add_len);
transport_kunmap_data_sg(cmd);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: AngeloGioacchino Del Regno angelogioacchino.delregno@somainline.org
[ Upstream commit a59c16c80bd791878cf81d1d5aae508eeb2e73f1 ]
The GPU GX GDSC has GPU_GX_BCR reset and gfx3d_clk CXC, as stated on downstream kernels (and as verified upstream, because otherwise random lockups happen). Also, add PWRSTS_RET and NO_RET_PERIPH: also as found downstream, and also as verified here, to avoid GPU related lockups it is necessary to force retain mem, but *not* peripheral when enabling this GDSC (and, of course, the inverse on disablement).
With this change, the GPU finally works flawlessly on my four different MSM8998 devices from two different manufacturers.
Signed-off-by: AngeloGioacchino Del Regno angelogioacchino.delregno@somainline.org Link: https://lore.kernel.org/r/20210114221059.483390-11-angelogioacchino.delregno... Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/qcom/gpucc-msm8998.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/clk/qcom/gpucc-msm8998.c b/drivers/clk/qcom/gpucc-msm8998.c index 9b3923af02a1..1a518c4915b4 100644 --- a/drivers/clk/qcom/gpucc-msm8998.c +++ b/drivers/clk/qcom/gpucc-msm8998.c @@ -253,12 +253,16 @@ static struct gdsc gpu_cx_gdsc = { static struct gdsc gpu_gx_gdsc = { .gdscr = 0x1094, .clamp_io_ctrl = 0x130, + .resets = (unsigned int []){ GPU_GX_BCR }, + .reset_count = 1, + .cxcs = (unsigned int []){ 0x1098 }, + .cxc_count = 1, .pd = { .name = "gpu_gx", }, .parent = &gpu_cx_gdsc.pd, - .pwrsts = PWRSTS_OFF_ON, - .flags = CLAMP_IO | AON_RESET, + .pwrsts = PWRSTS_OFF_ON | PWRSTS_RET, + .flags = CLAMP_IO | SW_RESET | AON_RESET | NO_RET_PERIPH, };
static struct clk_regmap *gpucc_msm8998_clocks[] = {
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: John Ernberg john.ernberg@actia.se
commit fc7c5c208eb7bc2df3a9f4234f14eca250001cb6 upstream.
The microphone in the Plantronics C320-M headset will randomly fail to initialize properly, at least when using Microsoft Teams. Introducing a 20ms delay on the control messages appears to resolve the issue.
Link: https://gitlab.freedesktop.org/pulseaudio/pulseaudio/-/issues/1065 Tested-by: Andreas Kempe kempe@lysator.liu.se Signed-off-by: John Ernberg john.ernberg@actia.se Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210303181405.39835-1-john.ernberg@actia.se Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/usb/quirks.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -1670,6 +1670,14 @@ void snd_usb_ctl_msg_quirk(struct usb_de && (requesttype & USB_TYPE_MASK) == USB_TYPE_CLASS) msleep(20);
+ /* + * Plantronics C320-M needs a delay to avoid random + * microhpone failures. + */ + if (chip->usb_id == USB_ID(0x047f, 0xc025) && + (requesttype & USB_TYPE_MASK) == USB_TYPE_CLASS) + msleep(20); + /* Zoom R16/24, many Logitech(at least H650e/H570e/BCC950), * Jabra 550a, Kingston HyperX needs a tiny delay here, * otherwise requests like get/set frequency return
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Takashi Iwai tiwai@suse.de
commit eea46a0879bcca23e15071f9968c0f6e6596e470 upstream.
The per_pin->work might be still floating at the suspend, and this may hit the access to the hardware at an unexpected timing. Cancel the work properly at the suspend callback for avoiding the buggy access.
Note that the bug doesn't trigger easily in the recent kernels since the work is queued only when the repoll count is set, and usually it's only at the resume callback, but it's still possible to hit in theory.
BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1182377 Reported-and-tested-by: Abhishek Sahu abhsahu@nvidia.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210310112809.9215-4-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_hdmi.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
--- a/sound/pci/hda/patch_hdmi.c +++ b/sound/pci/hda/patch_hdmi.c @@ -2472,6 +2472,18 @@ static void generic_hdmi_free(struct hda }
#ifdef CONFIG_PM +static int generic_hdmi_suspend(struct hda_codec *codec) +{ + struct hdmi_spec *spec = codec->spec; + int pin_idx; + + for (pin_idx = 0; pin_idx < spec->num_pins; pin_idx++) { + struct hdmi_spec_per_pin *per_pin = get_pin(spec, pin_idx); + cancel_delayed_work_sync(&per_pin->work); + } + return 0; +} + static int generic_hdmi_resume(struct hda_codec *codec) { struct hdmi_spec *spec = codec->spec; @@ -2495,6 +2507,7 @@ static const struct hda_codec_ops generi .build_controls = generic_hdmi_build_controls, .unsol_event = hdmi_unsol_event, #ifdef CONFIG_PM + .suspend = generic_hdmi_suspend, .resume = generic_hdmi_resume, #endif };
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Takashi Iwai tiwai@suse.de
commit 56b26497bb4b7ff970612dc25a8a008c34463f7b upstream.
The mute and mic-mute LEDs on HP ZBook Studio G5 are controlled via GPIO bits 0x10 and 0x20, respectively, and we need the extra setup for those.
As the similar code is already present for other HP models but with different GPIO pins, this patch factors out the common helper code and applies those GPIO values for each model.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=211893 Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210306095018.11746-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_conexant.c | 62 +++++++++++++++++++++++++++++------------ 1 file changed, 45 insertions(+), 17 deletions(-)
--- a/sound/pci/hda/patch_conexant.c +++ b/sound/pci/hda/patch_conexant.c @@ -149,6 +149,21 @@ static int cx_auto_vmaster_mute_led(stru return 0; }
+static void cxt_init_gpio_led(struct hda_codec *codec) +{ + struct conexant_spec *spec = codec->spec; + unsigned int mask = spec->gpio_mute_led_mask | spec->gpio_mic_led_mask; + + if (mask) { + snd_hda_codec_write(codec, 0x01, 0, AC_VERB_SET_GPIO_MASK, + mask); + snd_hda_codec_write(codec, 0x01, 0, AC_VERB_SET_GPIO_DIRECTION, + mask); + snd_hda_codec_write(codec, 0x01, 0, AC_VERB_SET_GPIO_DATA, + spec->gpio_led); + } +} + static int cx_auto_init(struct hda_codec *codec) { struct conexant_spec *spec = codec->spec; @@ -156,6 +171,7 @@ static int cx_auto_init(struct hda_codec if (!spec->dynamic_eapd) cx_auto_turn_eapd(codec, spec->num_eapds, spec->eapds, true);
+ cxt_init_gpio_led(codec); snd_hda_apply_fixup(codec, HDA_FIXUP_ACT_INIT);
return 0; @@ -215,6 +231,7 @@ enum { CXT_FIXUP_HP_SPECTRE, CXT_FIXUP_HP_GATE_MIC, CXT_FIXUP_MUTE_LED_GPIO, + CXT_FIXUP_HP_ZBOOK_MUTE_LED, CXT_FIXUP_HEADSET_MIC, CXT_FIXUP_HP_MIC_NO_PRESENCE, }; @@ -654,31 +671,36 @@ static int cxt_gpio_micmute_update(struc return 0; }
- -static void cxt_fixup_mute_led_gpio(struct hda_codec *codec, - const struct hda_fixup *fix, int action) +static void cxt_setup_mute_led(struct hda_codec *codec, + unsigned int mute, unsigned int mic_mute) { struct conexant_spec *spec = codec->spec; - static const struct hda_verb gpio_init[] = { - { 0x01, AC_VERB_SET_GPIO_MASK, 0x03 }, - { 0x01, AC_VERB_SET_GPIO_DIRECTION, 0x03 }, - {} - };
- if (action == HDA_FIXUP_ACT_PRE_PROBE) { + spec->gpio_led = 0; + spec->mute_led_polarity = 0; + if (mute) { snd_hda_gen_add_mute_led_cdev(codec, cxt_gpio_mute_update); - spec->gpio_led = 0; - spec->mute_led_polarity = 0; - spec->gpio_mute_led_mask = 0x01; - spec->gpio_mic_led_mask = 0x02; + spec->gpio_mute_led_mask = mute; + } + if (mic_mute) { snd_hda_gen_add_micmute_led_cdev(codec, cxt_gpio_micmute_update); + spec->gpio_mic_led_mask = mic_mute; } - snd_hda_add_verbs(codec, gpio_init); - if (spec->gpio_led) - snd_hda_codec_write(codec, 0x01, 0, AC_VERB_SET_GPIO_DATA, - spec->gpio_led); }
+static void cxt_fixup_mute_led_gpio(struct hda_codec *codec, + const struct hda_fixup *fix, int action) +{ + if (action == HDA_FIXUP_ACT_PRE_PROBE) + cxt_setup_mute_led(codec, 0x01, 0x02); +} + +static void cxt_fixup_hp_zbook_mute_led(struct hda_codec *codec, + const struct hda_fixup *fix, int action) +{ + if (action == HDA_FIXUP_ACT_PRE_PROBE) + cxt_setup_mute_led(codec, 0x10, 0x20); +}
/* ThinkPad X200 & co with cxt5051 */ static const struct hda_pintbl cxt_pincfg_lenovo_x200[] = { @@ -839,6 +861,10 @@ static const struct hda_fixup cxt_fixups .type = HDA_FIXUP_FUNC, .v.func = cxt_fixup_mute_led_gpio, }, + [CXT_FIXUP_HP_ZBOOK_MUTE_LED] = { + .type = HDA_FIXUP_FUNC, + .v.func = cxt_fixup_hp_zbook_mute_led, + }, [CXT_FIXUP_HEADSET_MIC] = { .type = HDA_FIXUP_FUNC, .v.func = cxt_fixup_headset_mic, @@ -917,6 +943,7 @@ static const struct snd_pci_quirk cxt506 SND_PCI_QUIRK(0x103c, 0x8299, "HP 800 G3 SFF", CXT_FIXUP_HP_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x103c, 0x829a, "HP 800 G3 DM", CXT_FIXUP_HP_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x103c, 0x8402, "HP ProBook 645 G4", CXT_FIXUP_MUTE_LED_GPIO), + SND_PCI_QUIRK(0x103c, 0x8427, "HP ZBook Studio G5", CXT_FIXUP_HP_ZBOOK_MUTE_LED), SND_PCI_QUIRK(0x103c, 0x8455, "HP Z2 G4", CXT_FIXUP_HP_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x103c, 0x8456, "HP Z2 G4 SFF", CXT_FIXUP_HP_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x103c, 0x8457, "HP Z2 G4 mini", CXT_FIXUP_HP_MIC_NO_PRESENCE), @@ -956,6 +983,7 @@ static const struct hda_model_fixup cxt5 { .id = CXT_FIXUP_MUTE_LED_EAPD, .name = "mute-led-eapd" }, { .id = CXT_FIXUP_HP_DOCK, .name = "hp-dock" }, { .id = CXT_FIXUP_MUTE_LED_GPIO, .name = "mute-led-gpio" }, + { .id = CXT_FIXUP_HP_ZBOOK_MUTE_LED, .name = "hp-zbook-mute-led" }, { .id = CXT_FIXUP_HP_MIC_NO_PRESENCE, .name = "hp-mic-fix" }, {} };
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Simeon Simeonoff sim.simeonoff@gmail.com
commit f15c5c11abfbf8909eb30598315ecbec2311cfdc upstream.
The new AE-5 Plus model has a different Subsystem ID compared to the non-plus model. Adding the new id to the list of quirks.
Signed-off-by: Simeon Simeonoff sim.simeonoff@gmail.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/998cafbe10b648f724ee33570553f2d780a38963.camel@gma... Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_ca0132.c | 1 + 1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_ca0132.c +++ b/sound/pci/hda/patch_ca0132.c @@ -1309,6 +1309,7 @@ static const struct snd_pci_quirk ca0132 SND_PCI_QUIRK(0x1102, 0x0013, "Recon3D", QUIRK_R3D), SND_PCI_QUIRK(0x1102, 0x0018, "Recon3D", QUIRK_R3D), SND_PCI_QUIRK(0x1102, 0x0051, "Sound Blaster AE-5", QUIRK_AE5), + SND_PCI_QUIRK(0x1102, 0x0191, "Sound Blaster AE-5 Plus", QUIRK_AE5), SND_PCI_QUIRK(0x1102, 0x0081, "Sound Blaster AE-7", QUIRK_AE7), {} };
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Takashi Iwai tiwai@suse.de
commit 28e96c1693ec1cdc963807611f8b5ad400431e82 upstream.
The commit c02f77d32d2c ("ALSA: hda - Workaround for crackled sound on AMD controller (1022:1457)") introduced a few workarounds for the recent AMD HD-audio controller, and one of them is the forced BATCH PCM mode so that PulseAudio avoids the timer-based scheduling. This was thought to cover for some badly working applications, but this actually worsens for more others. In total, this wasn't a good idea to enforce it.
This is a partial revert of the commit above for dropping the PCM BATCH enforcement part to recover from the regression again.
Fixes: c02f77d32d2c ("ALSA: hda - Workaround for crackled sound on AMD controller (1022:1457)") BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=195303 Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210308160726.22930-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/hda_controller.c | 7 ------- 1 file changed, 7 deletions(-)
--- a/sound/pci/hda/hda_controller.c +++ b/sound/pci/hda/hda_controller.c @@ -609,13 +609,6 @@ static int azx_pcm_open(struct snd_pcm_s 20, 178000000);
- /* by some reason, the playback stream stalls on PulseAudio with - * tsched=1 when a capture stream triggers. Until we figure out the - * real cause, disable tsched mode by telling the PCM info flag. - */ - if (chip->driver_caps & AZX_DCAPS_AMD_WORKAROUND) - runtime->hw.info |= SNDRV_PCM_INFO_BATCH; - if (chip->align_buffer_size) /* constrain buffer sizes to be multiple of 128 bytes. This is more efficient in terms of memory
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Takashi Iwai tiwai@suse.de
commit 13661fc48461282e43fe8f76bf5bf449b3d40687 upstream.
The HD-audio controller driver processes the unsolicited events via its work asynchronously, and this might be pending when the system goes to suspend. When a lengthy event handling like ELD byte reads is running, this might trigger unexpected accesses among suspend/resume procedure, typically seen with Nvidia driver that still requires the handling via unsolicited event verbs for ELD updates.
This patch adds the flush of unsol_work to assure that pending events are processed before going into suspend.
Buglink: https://bugzilla.suse.com/show_bug.cgi?id=1182377 Reported-and-tested-by: Abhishek Sahu abhsahu@nvidia.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210310112809.9215-2-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/hda_intel.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -1026,6 +1026,8 @@ static int azx_prepare(struct device *de chip = card->private_data; chip->pm_prepared = 1;
+ flush_work(&azx_bus(chip)->unsol_work); + /* HDA controller always requires different WAKEEN for runtime suspend * and system suspend, so don't use direct-complete here. */
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Takashi Iwai tiwai@suse.de
commit 5ff9dde42e8c72ed8102eb8cb62e03f9dc2103ab upstream.
When HD-audio bus receives unsolicited events during its system suspend/resume (S3 and S4) phase, the controller driver may still try to process events although the codec chips are already (or yet) powered down. This might screw up the codec communication, resulting in CORB/RIRB errors. Such events should be rather skipped, as the codec chip status such as the jack status will be fully refreshed at the system resume time.
Since we're tracking the system suspend/resume state in codec power.power_state field, let's add the check in the common unsol event handler entry point to filter out such events.
BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1182377 Tested-by: Abhishek Sahu abhsahu@nvidia.com Cc: stable@vger.kernel.org # 183ab39eb0ea: ALSA: hda: Initialize power_state Link: https://lore.kernel.org/r/20210310112809.9215-3-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/hda_bind.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/sound/pci/hda/hda_bind.c +++ b/sound/pci/hda/hda_bind.c @@ -47,6 +47,10 @@ static void hda_codec_unsol_event(struct if (codec->bus->shutdown) return;
+ /* ignore unsol events during system suspend/resume */ + if (codec->core.dev.power.power_state.event != PM_EVENT_ON) + return; + if (codec->patch_ops.unsol_event) codec->patch_ops.unsol_event(codec, ev); }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Takashi Iwai tiwai@suse.de
commit fec60c3bc5d1713db2727cdffc638d48f9c07dc3 upstream.
Dell AE515 sound bar (413c:a506) spews the error messages when the driver tries to read the current sample frequency, hence it needs to be on the list in snd_usb_get_sample_rate_quirk().
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=211551 Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210304083021.2152-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/usb/quirks.c | 1 + 1 file changed, 1 insertion(+)
--- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -1520,6 +1520,7 @@ bool snd_usb_get_sample_rate_quirk(struc case USB_ID(0x1901, 0x0191): /* GE B850V3 CP2114 audio interface */ case USB_ID(0x21b4, 0x0081): /* AudioQuest DragonFly */ case USB_ID(0x2912, 0x30c8): /* Audioengine D1 */ + case USB_ID(0x413c, 0xa506): /* Dell AE515 sound bar */ return true; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Takashi Iwai tiwai@suse.de
commit 06abcb18b3a021ba1a3f2020cbefb3ed04e59e72 upstream.
Other Plantronics headset models seem requiring the same workaround as C320-M to add the 20ms delay for the control messages, too. Apply the workaround generically for devices with the vendor ID 0x047f.
Note that the problem didn't surface before 5.11 just with luck. Since 5.11 got a big code rewrite about the stream handling, the parameter setup procedure has changed, and this seemed triggering the problem more often.
BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1182552 Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210304085009.4770-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/usb/quirks.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -1672,10 +1672,10 @@ void snd_usb_ctl_msg_quirk(struct usb_de msleep(20);
/* - * Plantronics C320-M needs a delay to avoid random - * microhpone failures. + * Plantronics headsets (C320, C320-M, etc) need a delay to avoid + * random microhpone failures. */ - if (chip->usb_id == USB_ID(0x047f, 0xc025) && + if (USB_ID_VENDOR(chip->usb_id) == 0x047f && (requesttype & USB_TYPE_MASK) == USB_TYPE_CLASS) msleep(20);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Kai-Heng Feng kai.heng.feng@canonical.com
commit 9799110825dba087c2bdce886977cf84dada2005 upstream.
Rear audio on Lenovo ThinkStation P620 stops working after commit 1965c4364bdd ("ALSA: usb-audio: Disable autosuspend for Lenovo ThinkStation P620"): [ 6.013526] usbcore: registered new interface driver snd-usb-audio [ 6.023064] usb 3-6: cannot get ctl value: req = 0x81, wValue = 0x100, wIndex = 0x0, type = 1 [ 6.023083] usb 3-6: cannot get ctl value: req = 0x81, wValue = 0x202, wIndex = 0x0, type = 4 [ 6.023090] usb 3-6: cannot get ctl value: req = 0x81, wValue = 0x100, wIndex = 0x0, type = 1 [ 6.023098] usb 3-6: cannot get ctl value: req = 0x81, wValue = 0x202, wIndex = 0x0, type = 4 [ 6.023103] usb 3-6: cannot get ctl value: req = 0x81, wValue = 0x100, wIndex = 0x0, type = 1 [ 6.023110] usb 3-6: cannot get ctl value: req = 0x81, wValue = 0x202, wIndex = 0x0, type = 4 [ 6.045846] usb 3-6: cannot get ctl value: req = 0x81, wValue = 0x100, wIndex = 0x0, type = 1 [ 6.045866] usb 3-6: cannot get ctl value: req = 0x81, wValue = 0x202, wIndex = 0x0, type = 4 [ 6.045877] usb 3-6: cannot get ctl value: req = 0x81, wValue = 0x100, wIndex = 0x0, type = 1 [ 6.045886] usb 3-6: cannot get ctl value: req = 0x81, wValue = 0x202, wIndex = 0x0, type = 4 [ 6.045894] usb 3-6: cannot get ctl value: req = 0x81, wValue = 0x100, wIndex = 0x0, type = 1 [ 6.045908] usb 3-6: cannot get ctl value: req = 0x81, wValue = 0x202, wIndex = 0x0, type = 4
I overlooked the issue because when I was working on the said commit, only the front audio is tested. Apology for that.
Changing supports_autosuspend in driver is too late for disabling autosuspend, because it was already used by USB probe routine, so it can break the balance on the following code that depends on supports_autosuspend.
Fix it by using usb_disable_autosuspend() helper, and balance the suspend count in disconnect callback.
Fixes: 1965c4364bdd ("ALSA: usb-audio: Disable autosuspend for Lenovo ThinkStation P620") Signed-off-by: Kai-Heng Feng kai.heng.feng@canonical.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210304043419.287191-1-kai.heng.feng@canonical.co... Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/usb/card.c | 5 +++++ sound/usb/quirks.c | 2 +- sound/usb/usbaudio.h | 1 + 3 files changed, 7 insertions(+), 1 deletion(-)
--- a/sound/usb/card.c +++ b/sound/usb/card.c @@ -831,6 +831,8 @@ static int usb_audio_probe(struct usb_in snd_media_device_create(chip, intf); }
+ chip->quirk_type = quirk->type; + usb_chip[chip->index] = chip; chip->intf[chip->num_interfaces] = intf; chip->num_interfaces++; @@ -913,6 +915,9 @@ static void usb_audio_disconnect(struct } else { mutex_unlock(®ister_mutex); } + + if (chip->quirk_type & QUIRK_SETUP_DISABLE_AUTOSUSPEND) + usb_enable_autosuspend(interface_to_usbdev(intf)); }
/* lock the shutdown (disconnect) task and autoresume */ --- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -547,7 +547,7 @@ static int setup_disable_autosuspend(str struct usb_driver *driver, const struct snd_usb_audio_quirk *quirk) { - driver->supports_autosuspend = 0; + usb_disable_autosuspend(interface_to_usbdev(iface)); return 1; /* Continue with creating streams and mixer */ }
--- a/sound/usb/usbaudio.h +++ b/sound/usb/usbaudio.h @@ -27,6 +27,7 @@ struct snd_usb_audio { struct snd_card *card; struct usb_interface *intf[MAX_CARD_INTERFACES]; u32 usb_id; + uint16_t quirk_type; struct mutex mutex; unsigned int system_suspend; atomic_t active;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Pavel Skripkin paskripkin@gmail.com
commit 30dea07180de3aa0ad613af88431ef4e34b5ef68 upstream.
syzbot reported null pointer dereference in usb_audio_probe. The problem was in case, when quirk == NULL. It's not an error condition, so quirk must be checked before dereferencing.
Call Trace: usb_probe_interface+0x315/0x7f0 drivers/usb/core/driver.c:396 really_probe+0x291/0xe60 drivers/base/dd.c:554 driver_probe_device+0x26b/0x3d0 drivers/base/dd.c:740 __device_attach_driver+0x1d1/0x290 drivers/base/dd.c:846 bus_for_each_drv+0x15f/0x1e0 drivers/base/bus.c:431 __device_attach+0x228/0x4a0 drivers/base/dd.c:914 bus_probe_device+0x1e4/0x290 drivers/base/bus.c:491 device_add+0xbdb/0x1db0 drivers/base/core.c:3242 usb_set_configuration+0x113f/0x1910 drivers/usb/core/message.c:2164 usb_generic_driver_probe+0xba/0x100 drivers/usb/core/generic.c:238 usb_probe_device+0xd9/0x2c0 drivers/usb/core/driver.c:293 really_probe+0x291/0xe60 drivers/base/dd.c:554 driver_probe_device+0x26b/0x3d0 drivers/base/dd.c:740 __device_attach_driver+0x1d1/0x290 drivers/base/dd.c:846 bus_for_each_drv+0x15f/0x1e0 drivers/base/bus.c:431 __device_attach+0x228/0x4a0 drivers/base/dd.c:914 bus_probe_device+0x1e4/0x290 drivers/base/bus.c:491 device_add+0xbdb/0x1db0 drivers/base/core.c:3242 usb_new_device.cold+0x721/0x1058 drivers/usb/core/hub.c:2555 hub_port_connect drivers/usb/core/hub.c:5223 [inline] hub_port_connect_change drivers/usb/core/hub.c:5363 [inline] port_event drivers/usb/core/hub.c:5509 [inline] hub_event+0x2357/0x4320 drivers/usb/core/hub.c:5591 process_one_work+0x98d/0x1600 kernel/workqueue.c:2275 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421 kthread+0x3b1/0x4a0 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
Reported-by: syzbot+719da9b149a931f5143f@syzkaller.appspotmail.com Fixes: 9799110825db ("ALSA: usb-audio: Disable USB autosuspend properly in setup_disable_autosuspend()") Signed-off-by: Pavel Skripkin paskripkin@gmail.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/f1ebad6e721412843bd1b12584444c0a63c6b2fb.161524218... Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/usb/card.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/sound/usb/card.c +++ b/sound/usb/card.c @@ -831,7 +831,8 @@ static int usb_audio_probe(struct usb_in snd_media_device_create(chip, intf); }
- chip->quirk_type = quirk->type; + if (quirk) + chip->quirk_type = quirk->type;
usb_chip[chip->index] = chip; chip->intf[chip->num_interfaces] = intf;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Pavel Skripkin paskripkin@gmail.com
commit c5aa956eaeb05fe87e33433d7fd9f5e4d23c7416 upstream.
The problem was in wrong "if" placement. chip->quirk_type is freed in snd_card_free_when_closed(), but inside if statement it's accesed.
Fixes: 9799110825db ("ALSA: usb-audio: Disable USB autosuspend properly in setup_disable_autosuspend()") Signed-off-by: Pavel Skripkin paskripkin@gmail.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/16da19126ff461e5e64a9aec648cce28fb8ed73e.161524218... Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/usb/card.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/sound/usb/card.c +++ b/sound/usb/card.c @@ -908,6 +908,9 @@ static void usb_audio_disconnect(struct } }
+ if (chip->quirk_type & QUIRK_SETUP_DISABLE_AUTOSUSPEND) + usb_enable_autosuspend(interface_to_usbdev(intf)); + chip->num_interfaces--; if (chip->num_interfaces <= 0) { usb_chip[chip->index] = NULL; @@ -916,9 +919,6 @@ static void usb_audio_disconnect(struct } else { mutex_unlock(®ister_mutex); } - - if (chip->quirk_type & QUIRK_SETUP_DISABLE_AUTOSUSPEND) - usb_enable_autosuspend(interface_to_usbdev(intf)); }
/* lock the shutdown (disconnect) task and autoresume */
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Beata Michalska beata.michalska@arm.com
commit 606a5d4227e4610399c61086ac55c46068a90b03 upstream.
We are required to call dev_pm_opp_put() from outside of the opp_table->lock as debugfs removal needs to happen lock-less to avoid circular dependency issues.
commit cf1fac943c63 ("opp: Reduce the size of critical section in _opp_kref_release()") tried to fix that introducing a new routine _opp_get_next() which keeps returning OPPs that can be freed by the callers and this routine shall be called without holding the opp_table->lock.
Though the commit overlooked the fact that the OPPs can be referenced by other users as well and this routine will end up dropping references which were taken by other users and hence freeing the OPPs prematurely.
In effect, other users of the OPPs will end up having invalid pointers at hand. We didn't see any crash reports earlier as the exact situation never happened, though it is certainly possible.
We need a way to mark which OPPs are no longer referenced by the OPP core, so we don't drop extra references to them accidentally.
This commit adds another OPP flag, "removed", which is used to track this. And now we should never end up dropping extra references to the OPPs.
Cc: v5.11+ stable@vger.kernel.org # v5.11+ Fixes: cf1fac943c63 ("opp: Reduce the size of critical section in _opp_kref_release()") Signed-off-by: Beata Michalska beata.michalska@arm.com [ Viresh: Almost rewrote entire patch, added new "removed" field, rewrote commit log and added the correct Fixes tag. ] Co-developed-by: Viresh Kumar viresh.kumar@linaro.org Signed-off-by: Viresh Kumar viresh.kumar@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/opp/core.c | 48 +++++++++++++++++++++++++----------------------- drivers/opp/opp.h | 2 ++ 2 files changed, 27 insertions(+), 23 deletions(-)
--- a/drivers/opp/core.c +++ b/drivers/opp/core.c @@ -1335,7 +1335,11 @@ static struct dev_pm_opp *_opp_get_next(
mutex_lock(&opp_table->lock); list_for_each_entry(temp, &opp_table->opp_list, node) { - if (dynamic == temp->dynamic) { + /* + * Refcount must be dropped only once for each OPP by OPP core, + * do that with help of "removed" flag. + */ + if (!temp->removed && dynamic == temp->dynamic) { opp = temp; break; } @@ -1345,10 +1349,27 @@ static struct dev_pm_opp *_opp_get_next( return opp; }
-bool _opp_remove_all_static(struct opp_table *opp_table) +/* + * Can't call dev_pm_opp_put() from under the lock as debugfs removal needs to + * happen lock less to avoid circular dependency issues. This routine must be + * called without the opp_table->lock held. + */ +static void _opp_remove_all(struct opp_table *opp_table, bool dynamic) { struct dev_pm_opp *opp;
+ while ((opp = _opp_get_next(opp_table, dynamic))) { + opp->removed = true; + dev_pm_opp_put(opp); + + /* Drop the references taken by dev_pm_opp_add() */ + if (dynamic) + dev_pm_opp_put_opp_table(opp_table); + } +} + +bool _opp_remove_all_static(struct opp_table *opp_table) +{ mutex_lock(&opp_table->lock);
if (!opp_table->parsed_static_opps) { @@ -1363,13 +1384,7 @@ bool _opp_remove_all_static(struct opp_t
mutex_unlock(&opp_table->lock);
- /* - * Can't remove the OPP from under the lock, debugfs removal needs to - * happen lock less to avoid circular dependency issues. - */ - while ((opp = _opp_get_next(opp_table, false))) - dev_pm_opp_put(opp); - + _opp_remove_all(opp_table, false); return true; }
@@ -1382,25 +1397,12 @@ bool _opp_remove_all_static(struct opp_t void dev_pm_opp_remove_all_dynamic(struct device *dev) { struct opp_table *opp_table; - struct dev_pm_opp *opp; - int count = 0;
opp_table = _find_opp_table(dev); if (IS_ERR(opp_table)) return;
- /* - * Can't remove the OPP from under the lock, debugfs removal needs to - * happen lock less to avoid circular dependency issues. - */ - while ((opp = _opp_get_next(opp_table, true))) { - dev_pm_opp_put(opp); - count++; - } - - /* Drop the references taken by dev_pm_opp_add() */ - while (count--) - dev_pm_opp_put_opp_table(opp_table); + _opp_remove_all(opp_table, true);
/* Drop the reference taken by _find_opp_table() */ dev_pm_opp_put_opp_table(opp_table); --- a/drivers/opp/opp.h +++ b/drivers/opp/opp.h @@ -56,6 +56,7 @@ extern struct list_head opp_tables; * @dynamic: not-created from static DT entries. * @turbo: true if turbo (boost) OPP * @suspend: true if suspend OPP + * @removed: flag indicating that OPP's reference is dropped by OPP core. * @pstate: Device's power domain's performance state. * @rate: Frequency in hertz * @level: Performance level @@ -78,6 +79,7 @@ struct dev_pm_opp { bool dynamic; bool turbo; bool suspend; + bool removed; unsigned int pstate; unsigned long rate; unsigned int level;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Eric W. Biederman ebiederm@xmission.com
commit 3b0c2d3eaa83da259d7726192cf55a137769012f upstream.
It turns out that there are in fact userspace implementations that care and this recent change caused a regression.
https://github.com/containers/buildah/issues/3071
As the motivation for the original change was future development, and the impact is existing real world code just revert this change and allow the ambiguity in v3 file caps.
Cc: stable@vger.kernel.org Fixes: 95ebabde382c ("capabilities: Don't allow writing ambiguous v3 file capabilities") Signed-off-by: Eric W. Biederman ebiederm@xmission.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- security/commoncap.c | 12 +----------- 1 file changed, 1 insertion(+), 11 deletions(-)
--- a/security/commoncap.c +++ b/security/commoncap.c @@ -500,8 +500,7 @@ int cap_convert_nscap(struct dentry *den __u32 magic, nsmagic; struct inode *inode = d_backing_inode(dentry); struct user_namespace *task_ns = current_user_ns(), - *fs_ns = inode->i_sb->s_user_ns, - *ancestor; + *fs_ns = inode->i_sb->s_user_ns; kuid_t rootid; size_t newsize;
@@ -524,15 +523,6 @@ int cap_convert_nscap(struct dentry *den if (nsrootid == -1) return -EINVAL;
- /* - * Do not allow allow adding a v3 filesystem capability xattr - * if the rootid field is ambiguous. - */ - for (ancestor = task_ns->parent; ancestor; ancestor = ancestor->parent) { - if (from_kuid(ancestor, rootid) == 0) - return -EINVAL; - } - newsize = sizeof(struct vfs_ns_cap_data); nscap = kmalloc(newsize, GFP_ATOMIC); if (!nscap)
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com
commit e5113505904ea1c1c0e1f92c1cfa91fbf4da1694 upstream.
When zone reset ioctl and data read race for a same zone on zoned block devices, the data read leaves stale page cache even though the zone reset ioctl zero clears all the zone data on the device. To avoid non-zero data read from the stale page cache after zone reset, discard page cache of reset target zones in blkdev_zone_mgmt_ioctl(). Introduce the helper function blkdev_truncate_zone_range() to discard the page cache. Ensure the page cache discarded by calling the helper function before and after zone reset in same manner as fallocate does.
This patch can be applied back to the stable kernel version v5.10.y. Rework is needed for older stable kernels.
Signed-off-by: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com Fixes: 3ed05a987e0f ("blk-zoned: implement ioctls") Cc: stable@vger.kernel.org # 5.10+ Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Link: https://lore.kernel.org/r/20210311072546.678999-1-shinichiro.kawasaki@wdc.co... Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- block/blk-zoned.c | 38 ++++++++++++++++++++++++++++++++++++-- 1 file changed, 36 insertions(+), 2 deletions(-)
--- a/block/blk-zoned.c +++ b/block/blk-zoned.c @@ -318,6 +318,22 @@ int blkdev_report_zones_ioctl(struct blo return 0; }
+static int blkdev_truncate_zone_range(struct block_device *bdev, fmode_t mode, + const struct blk_zone_range *zrange) +{ + loff_t start, end; + + if (zrange->sector + zrange->nr_sectors <= zrange->sector || + zrange->sector + zrange->nr_sectors > get_capacity(bdev->bd_disk)) + /* Out of range */ + return -EINVAL; + + start = zrange->sector << SECTOR_SHIFT; + end = ((zrange->sector + zrange->nr_sectors) << SECTOR_SHIFT) - 1; + + return truncate_bdev_range(bdev, mode, start, end); +} + /* * BLKRESETZONE, BLKOPENZONE, BLKCLOSEZONE and BLKFINISHZONE ioctl processing. * Called from blkdev_ioctl. @@ -329,6 +345,7 @@ int blkdev_zone_mgmt_ioctl(struct block_ struct request_queue *q; struct blk_zone_range zrange; enum req_opf op; + int ret;
if (!argp) return -EINVAL; @@ -352,6 +369,11 @@ int blkdev_zone_mgmt_ioctl(struct block_ switch (cmd) { case BLKRESETZONE: op = REQ_OP_ZONE_RESET; + + /* Invalidate the page cache, including dirty pages. */ + ret = blkdev_truncate_zone_range(bdev, mode, &zrange); + if (ret) + return ret; break; case BLKOPENZONE: op = REQ_OP_ZONE_OPEN; @@ -366,8 +388,20 @@ int blkdev_zone_mgmt_ioctl(struct block_ return -ENOTTY; }
- return blkdev_zone_mgmt(bdev, op, zrange.sector, zrange.nr_sectors, - GFP_KERNEL); + ret = blkdev_zone_mgmt(bdev, op, zrange.sector, zrange.nr_sectors, + GFP_KERNEL); + + /* + * Invalidate the page cache again for zone reset: writes can only be + * direct for zoned devices so concurrent writes would not add any page + * to the page cache after/during reset. The page cache may be filled + * again due to concurrent reads though and dropping the pages for + * these is fine. + */ + if (!ret && cmd == BLKRESETZONE) + ret = blkdev_truncate_zone_range(bdev, mode, &zrange); + + return ret; }
static inline unsigned long *blk_alloc_zone_bitmap(int node,
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Jan Kara jack@suse.cz
commit 56887cffe946bb0a90c74429fa94d6110a73119d upstream.
Commit 384d87ef2c95 ("block: Do not discard buffers under a mounted filesystem") made paths issuing discard or zeroout requests to the underlying device try to grab block device in exclusive mode. If that failed we returned EBUSY to userspace. This however caused unexpected fallout in userspace where e.g. FUSE filesystems issue discard requests from userspace daemons although the device is open exclusively by the kernel. Also shrinking of logical volume by LVM issues discard requests to a device which may be claimed exclusively because there's another LV on the same PV. So to avoid these userspace regressions, fall back to invalidate_inode_pages2_range() instead of returning EBUSY to userspace and return EBUSY only of that call fails as well (meaning that there's indeed someone using the particular device range we are trying to discard).
Link: https://bugzilla.kernel.org/show_bug.cgi?id=211167 Fixes: 384d87ef2c95 ("block: Do not discard buffers under a mounted filesystem") CC: stable@vger.kernel.org Signed-off-by: Jan Kara jack@suse.cz Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/block_dev.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
--- a/fs/block_dev.c +++ b/fs/block_dev.c @@ -118,13 +118,22 @@ int truncate_bdev_range(struct block_dev if (!(mode & FMODE_EXCL)) { int err = bd_prepare_to_claim(bdev, truncate_bdev_range); if (err) - return err; + goto invalidate; }
truncate_inode_pages_range(bdev->bd_inode->i_mapping, lstart, lend); if (!(mode & FMODE_EXCL)) bd_abort_claiming(bdev, truncate_bdev_range); return 0; + +invalidate: + /* + * Someone else has handle exclusively open. Try invalidating instead. + * The 'end' argument is inclusive so the rounding is safe. + */ + return invalidate_inode_pages2_range(bdev->bd_inode->i_mapping, + lstart >> PAGE_SHIFT, + lend >> PAGE_SHIFT); } EXPORT_SYMBOL(truncate_bdev_range);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Andrey Konovalov andreyknvl@google.com
commit 86c83365ab76e4b43cedd3ce07a07d32a4dc79ba upstream.
When CONFIG_DEBUG_VIRTUAL is enabled, the default page_to_virt() macro implementation from include/linux/mm.h is used. That definition doesn't account for KASAN tags, which leads to no tags on page_alloc allocations.
Provide an arm64-specific definition for page_to_virt() when CONFIG_DEBUG_VIRTUAL is enabled that takes care of KASAN tags.
Fixes: 2813b9c02962 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc") Cc: stable@vger.kernel.org Signed-off-by: Andrey Konovalov andreyknvl@google.com Reviewed-by: Catalin Marinas catalin.marinas@arm.com Link: https://lore.kernel.org/r/4b55b35202706223d3118230701c6a59749d9b72.161521950... Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/include/asm/memory.h | 5 +++++ 1 file changed, 5 insertions(+)
--- a/arch/arm64/include/asm/memory.h +++ b/arch/arm64/include/asm/memory.h @@ -315,6 +315,11 @@ static inline void *phys_to_virt(phys_ad #define ARCH_PFN_OFFSET ((unsigned long)PHYS_PFN_OFFSET)
#if !defined(CONFIG_SPARSEMEM_VMEMMAP) || defined(CONFIG_DEBUG_VIRTUAL) +#define page_to_virt(x) ({ \ + __typeof__(x) __page = x; \ + void *__addr = __va(page_to_phys(__page)); \ + (void *)__tag_set((const void *)__addr, page_kasan_tag(__page));\ +}) #define virt_to_page(x) pfn_to_page(virt_to_pfn(x)) #else #define page_to_virt(x) ({ \
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Catalin Marinas catalin.marinas@arm.com
commit d15dfd31384ba3cb93150e5f87661a76fa419f74 upstream.
In a system supporting MTE, the linear map must allow reading/writing allocation tags by setting the memory type as Normal Tagged. Currently, this is only handled for memory present at boot. Hotplugged memory uses Normal non-Tagged memory.
Introduce pgprot_mhp() for hotplugged memory and use it in add_memory_resource(). The arm64 code maps pgprot_mhp() to pgprot_tagged().
Note that ZONE_DEVICE memory should not be mapped as Tagged and therefore setting the memory type in arch_add_memory() is not feasible.
Signed-off-by: Catalin Marinas catalin.marinas@arm.com Fixes: 0178dc761368 ("arm64: mte: Use Normal Tagged attributes for the linear map") Reported-by: Patrick Daly pdaly@codeaurora.org Tested-by: Patrick Daly pdaly@codeaurora.org Link: https://lore.kernel.org/r/1614745263-27827-1-git-send-email-pdaly@codeaurora... Cc: stable@vger.kernel.org # 5.10.x Cc: Will Deacon will@kernel.org Cc: Andrew Morton akpm@linux-foundation.org Cc: Vincenzo Frascino vincenzo.frascino@arm.com Cc: David Hildenbrand david@redhat.com Reviewed-by: David Hildenbrand david@redhat.com Reviewed-by: Vincenzo Frascino vincenzo.frascino@arm.com Reviewed-by: Anshuman Khandual anshuman.khandual@arm.com Link: https://lore.kernel.org/r/20210309122601.5543-1-catalin.marinas@arm.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/include/asm/pgtable-prot.h | 1 - arch/arm64/include/asm/pgtable.h | 3 +++ arch/arm64/mm/mmu.c | 3 ++- include/linux/pgtable.h | 4 ++++ mm/memory_hotplug.c | 2 +- 5 files changed, 10 insertions(+), 3 deletions(-)
--- a/arch/arm64/include/asm/pgtable-prot.h +++ b/arch/arm64/include/asm/pgtable-prot.h @@ -66,7 +66,6 @@ extern bool arm64_use_ng_mappings; #define _PAGE_DEFAULT (_PROT_DEFAULT | PTE_ATTRINDX(MT_NORMAL))
#define PAGE_KERNEL __pgprot(PROT_NORMAL) -#define PAGE_KERNEL_TAGGED __pgprot(PROT_NORMAL_TAGGED) #define PAGE_KERNEL_RO __pgprot((PROT_NORMAL & ~PTE_WRITE) | PTE_RDONLY) #define PAGE_KERNEL_ROX __pgprot((PROT_NORMAL & ~(PTE_WRITE | PTE_PXN)) | PTE_RDONLY) #define PAGE_KERNEL_EXEC __pgprot(PROT_NORMAL & ~PTE_PXN) --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -486,6 +486,9 @@ static inline pmd_t pmd_mkdevmap(pmd_t p __pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_NORMAL_NC) | PTE_PXN | PTE_UXN) #define pgprot_device(prot) \ __pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_DEVICE_nGnRE) | PTE_PXN | PTE_UXN) +#define pgprot_tagged(prot) \ + __pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_NORMAL_TAGGED)) +#define pgprot_mhp pgprot_tagged /* * DMA allocations for non-coherent devices use what the Arm architecture calls * "Normal non-cacheable" memory, which permits speculation, unaligned accesses --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -512,7 +512,8 @@ static void __init map_mem(pgd_t *pgdp) * if MTE is present. Otherwise, it has the same attributes as * PAGE_KERNEL. */ - __map_memblock(pgdp, start, end, PAGE_KERNEL_TAGGED, flags); + __map_memblock(pgdp, start, end, pgprot_tagged(PAGE_KERNEL), + flags); }
/* --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -912,6 +912,10 @@ static inline void ptep_modify_prot_comm #define pgprot_device pgprot_noncached #endif
+#ifndef pgprot_mhp +#define pgprot_mhp(prot) (prot) +#endif + #ifdef CONFIG_MMU #ifndef pgprot_modify #define pgprot_modify pgprot_modify --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1019,7 +1019,7 @@ static int online_memory_block(struct me */ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags) { - struct mhp_params params = { .pgprot = PAGE_KERNEL }; + struct mhp_params params = { .pgprot = pgprot_mhp(PAGE_KERNEL) }; u64 start, size; bool new_node = false; int ret;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Rob Herring robh@kernel.org
commit 7bb8bc6eb550116c504fb25af8678b9d7ca2abc5 upstream.
Commit 0fdf1bb75953 ("arm64: perf: Avoid PMXEV* indirection") changed armv8pmu_read_evcntr() to return a u32 instead of u64. The result is silent truncation of the event counter when using 64-bit counters. Given the offending commit appears to have passed thru several folks, it seems likely this was a bad rebase after v8.5 PMU 64-bit counters landed.
Cc: Alexandru Elisei alexandru.elisei@arm.com Cc: Julien Thierry julien.thierry.kdev@gmail.com Cc: Mark Rutland mark.rutland@arm.com Cc: Will Deacon will@kernel.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Peter Zijlstra peterz@infradead.org Cc: Ingo Molnar mingo@redhat.com Cc: Arnaldo Carvalho de Melo acme@kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Jiri Olsa jolsa@redhat.com Cc: Namhyung Kim namhyung@kernel.org Cc: stable@vger.kernel.org Fixes: 0fdf1bb75953 ("arm64: perf: Avoid PMXEV* indirection") Signed-off-by: Rob Herring robh@kernel.org Acked-by: Mark Rutland mark.rutland@arm.com Reviewed-by: Alexandru Elisei alexandru.elisei@arm.com Link: https://lore.kernel.org/r/20210310004412.1450128-1-robh@kernel.org Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kernel/perf_event.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm64/kernel/perf_event.c +++ b/arch/arm64/kernel/perf_event.c @@ -460,7 +460,7 @@ static inline int armv8pmu_counter_has_o return pmnc & BIT(ARMV8_IDX_TO_COUNTER(idx)); }
-static inline u32 armv8pmu_read_evcntr(int idx) +static inline u64 armv8pmu_read_evcntr(int idx) { u32 counter = ARMV8_IDX_TO_COUNTER(idx);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Stefan Haberland sth@linux.ibm.com
commit 7d365bd0bff3c0310c39ebaffc9a8458e036d666 upstream.
In case of an unbind of the DASD device driver the function dasd_generic_remove() is called which shuts down the device. Among others this functions removes the int_handler from the cdev. During shutdown the device cancels all outstanding IO requests and waits for completion of the clear request. Unfortunately the clear interrupt will never be received when there is no interrupt handler connected.
Fix by moving the int_handler removal after the call to the state machine where no request or interrupt is outstanding.
Cc: stable@vger.kernel.org Signed-off-by: Stefan Haberland sth@linux.ibm.com Tested-by: Bjoern Walk bwalk@linux.ibm.com Reviewed-by: Jan Hoeppner hoeppner@linux.ibm.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/s390/block/dasd.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/s390/block/dasd.c +++ b/drivers/s390/block/dasd.c @@ -3503,8 +3503,6 @@ void dasd_generic_remove(struct ccw_devi struct dasd_device *device; struct dasd_block *block;
- cdev->handler = NULL; - device = dasd_device_from_cdev(cdev); if (IS_ERR(device)) { dasd_remove_sysfs_files(cdev); @@ -3523,6 +3521,7 @@ void dasd_generic_remove(struct ccw_devi * no quite down yet. */ dasd_set_target_state(device, DASD_STATE_NEW); + cdev->handler = NULL; /* dasd_delete_device destroys the device reference. */ block = device->block; dasd_delete_device(device);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Stefan Haberland sth@linux.ibm.com
commit 66f669a272898feb1c69b770e1504aa2ec7723d1 upstream.
Prevent that an IO request is build during device shutdown initiated by a driver unbind. This request will never be able to be processed or canceled and will hang forever. This will lead also to a hanging unbind.
Fix by checking not only if the device is in READY state but also check that there is no device offline initiated before building a new IO request.
Fixes: e443343e509a ("s390/dasd: blk-mq conversion")
Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: Stefan Haberland sth@linux.ibm.com Tested-by: Bjoern Walk bwalk@linux.ibm.com Reviewed-by: Jan Hoeppner hoeppner@linux.ibm.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/s390/block/dasd.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/s390/block/dasd.c +++ b/drivers/s390/block/dasd.c @@ -3068,7 +3068,8 @@ static blk_status_t do_dasd_request(stru
basedev = block->base; spin_lock_irq(&dq->lock); - if (basedev->state < DASD_STATE_READY) { + if (basedev->state < DASD_STATE_READY || + test_bit(DASD_FLAG_OFFLINE, &basedev->flags)) { DBF_DEV_EVENT(DBF_ERR, basedev, "device not ready for request %p", req); rc = BLK_STS_IOERR;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Heikki Krogerus heikki.krogerus@linux.intel.com
commit 8891123f9cbb9c1ee531e5a87fa116f0af685c48 upstream.
Software node can not be registered before its parent.
Fixes: 80488a6b1d3c ("software node: Add support for static node descriptors") Cc: 5.10+ stable@vger.kernel.org # 5.10+ Signed-off-by: Heikki Krogerus heikki.krogerus@linux.intel.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Tested-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/base/swnode.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/base/swnode.c +++ b/drivers/base/swnode.c @@ -786,6 +786,9 @@ int software_node_register(const struct if (software_node_to_swnode(node)) return -EEXIST;
+ if (node->parent && !parent) + return -EINVAL; + return PTR_ERR_OR_ZERO(swnode_register(node, parent, 0)); } EXPORT_SYMBOL_GPL(software_node_register);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Juergen Gross jgross@suse.com
commit 9e77d96b8e2724ed00380189f7b0ded61113b39f upstream.
When creating a new event channel with 2-level events the affinity needs to be reset initially in order to avoid using an old affinity from earlier usage of the event channel port. So when tearing an event channel down reset all affinity bits.
The same applies to the affinity when onlining a vcpu: all old affinity settings for this vcpu must be reset. As percpu events get initialized before the percpu event channel hook is called, resetting of the affinities happens after offlining a vcpu (this is working, as initial percpu memory is zeroed out).
Cc: stable@vger.kernel.org Reported-by: Julien Grall julien@xen.org Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Julien Grall jgrall@amazon.com Link: https://lore.kernel.org/r/20210306161833.4552-2-jgross@suse.com Signed-off-by: Boris Ostrovsky boris.ostrovsky@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/xen/events/events_2l.c | 15 +++++++++++++++ drivers/xen/events/events_base.c | 1 + drivers/xen/events/events_internal.h | 8 ++++++++ 3 files changed, 24 insertions(+)
--- a/drivers/xen/events/events_2l.c +++ b/drivers/xen/events/events_2l.c @@ -47,6 +47,11 @@ static unsigned evtchn_2l_max_channels(v return EVTCHN_2L_NR_CHANNELS; }
+static void evtchn_2l_remove(evtchn_port_t evtchn, unsigned int cpu) +{ + clear_bit(evtchn, BM(per_cpu(cpu_evtchn_mask, cpu))); +} + static void evtchn_2l_bind_to_cpu(evtchn_port_t evtchn, unsigned int cpu, unsigned int old_cpu) { @@ -355,9 +360,18 @@ static void evtchn_2l_resume(void) EVTCHN_2L_NR_CHANNELS/BITS_PER_EVTCHN_WORD); }
+static int evtchn_2l_percpu_deinit(unsigned int cpu) +{ + memset(per_cpu(cpu_evtchn_mask, cpu), 0, sizeof(xen_ulong_t) * + EVTCHN_2L_NR_CHANNELS/BITS_PER_EVTCHN_WORD); + + return 0; +} + static const struct evtchn_ops evtchn_ops_2l = { .max_channels = evtchn_2l_max_channels, .nr_channels = evtchn_2l_max_channels, + .remove = evtchn_2l_remove, .bind_to_cpu = evtchn_2l_bind_to_cpu, .clear_pending = evtchn_2l_clear_pending, .set_pending = evtchn_2l_set_pending, @@ -367,6 +381,7 @@ static const struct evtchn_ops evtchn_op .unmask = evtchn_2l_unmask, .handle_events = evtchn_2l_handle_events, .resume = evtchn_2l_resume, + .percpu_deinit = evtchn_2l_percpu_deinit, };
void __init xen_evtchn_2l_init(void) --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -368,6 +368,7 @@ static int xen_irq_info_pirq_setup(unsig static void xen_irq_info_cleanup(struct irq_info *info) { set_evtchn_to_irq(info->evtchn, -1); + xen_evtchn_port_remove(info->evtchn, info->cpu); info->evtchn = 0; channels_on_cpu_dec(info); } --- a/drivers/xen/events/events_internal.h +++ b/drivers/xen/events/events_internal.h @@ -14,6 +14,7 @@ struct evtchn_ops { unsigned (*nr_channels)(void);
int (*setup)(evtchn_port_t port); + void (*remove)(evtchn_port_t port, unsigned int cpu); void (*bind_to_cpu)(evtchn_port_t evtchn, unsigned int cpu, unsigned int old_cpu);
@@ -54,6 +55,13 @@ static inline int xen_evtchn_port_setup( return 0; }
+static inline void xen_evtchn_port_remove(evtchn_port_t evtchn, + unsigned int cpu) +{ + if (evtchn_ops->remove) + evtchn_ops->remove(evtchn, cpu); +} + static inline void xen_evtchn_port_bind_to_cpu(evtchn_port_t evtchn, unsigned int cpu, unsigned int old_cpu)
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Juergen Gross jgross@suse.com
commit 25da4618af240fbec6112401498301a6f2bc9702 upstream.
An event channel should be kept masked when an eoi is pending for it. When being migrated to another cpu it might be unmasked, though.
In order to avoid this keep three different flags for each event channel to be able to distinguish "normal" masking/unmasking from eoi related masking/unmasking and temporary masking. The event channel should only be able to generate an interrupt if all flags are cleared.
Cc: stable@vger.kernel.org Fixes: 54c9de89895e ("xen/events: add a new "late EOI" evtchn framework") Reported-by: Julien Grall julien@xen.org Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Julien Grall jgrall@amazon.com Reviewed-by: Boris Ostrovsky boris.ostrovsky@oracle.com Tested-by: Ross Lagerwall ross.lagerwall@citrix.com Link: https://lore.kernel.org/r/20210306161833.4552-3-jgross@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
[boris -- corrected Fixed tag format]
Signed-off-by: Boris Ostrovsky boris.ostrovsky@oracle.com --- drivers/xen/events/events_2l.c | 7 -- drivers/xen/events/events_base.c | 101 +++++++++++++++++++++++++++-------- drivers/xen/events/events_fifo.c | 7 -- drivers/xen/events/events_internal.h | 6 -- 4 files changed, 80 insertions(+), 41 deletions(-)
--- a/drivers/xen/events/events_2l.c +++ b/drivers/xen/events/events_2l.c @@ -77,12 +77,6 @@ static bool evtchn_2l_is_pending(evtchn_ return sync_test_bit(port, BM(&s->evtchn_pending[0])); }
-static bool evtchn_2l_test_and_set_mask(evtchn_port_t port) -{ - struct shared_info *s = HYPERVISOR_shared_info; - return sync_test_and_set_bit(port, BM(&s->evtchn_mask[0])); -} - static void evtchn_2l_mask(evtchn_port_t port) { struct shared_info *s = HYPERVISOR_shared_info; @@ -376,7 +370,6 @@ static const struct evtchn_ops evtchn_op .clear_pending = evtchn_2l_clear_pending, .set_pending = evtchn_2l_set_pending, .is_pending = evtchn_2l_is_pending, - .test_and_set_mask = evtchn_2l_test_and_set_mask, .mask = evtchn_2l_mask, .unmask = evtchn_2l_unmask, .handle_events = evtchn_2l_handle_events, --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -97,13 +97,18 @@ struct irq_info { short refcnt; u8 spurious_cnt; u8 is_accounted; - enum xen_irq_type type; /* type */ + short type; /* type: IRQT_* */ + u8 mask_reason; /* Why is event channel masked */ +#define EVT_MASK_REASON_EXPLICIT 0x01 +#define EVT_MASK_REASON_TEMPORARY 0x02 +#define EVT_MASK_REASON_EOI_PENDING 0x04 unsigned irq; evtchn_port_t evtchn; /* event channel */ unsigned short cpu; /* cpu bound */ unsigned short eoi_cpu; /* EOI must happen on this cpu-1 */ unsigned int irq_epoch; /* If eoi_cpu valid: irq_epoch of event */ u64 eoi_time; /* Time in jiffies when to EOI. */ + spinlock_t lock;
union { unsigned short virq; @@ -152,6 +157,7 @@ static DEFINE_RWLOCK(evtchn_rwlock); * evtchn_rwlock * IRQ-desc lock * percpu eoi_list_lock + * irq_info->lock */
static LIST_HEAD(xen_irq_list_head); @@ -302,6 +308,8 @@ static int xen_irq_info_common_setup(str info->irq = irq; info->evtchn = evtchn; info->cpu = cpu; + info->mask_reason = EVT_MASK_REASON_EXPLICIT; + spin_lock_init(&info->lock);
ret = set_evtchn_to_irq(evtchn, irq); if (ret < 0) @@ -450,6 +458,34 @@ unsigned int cpu_from_evtchn(evtchn_port return ret; }
+static void do_mask(struct irq_info *info, u8 reason) +{ + unsigned long flags; + + spin_lock_irqsave(&info->lock, flags); + + if (!info->mask_reason) + mask_evtchn(info->evtchn); + + info->mask_reason |= reason; + + spin_unlock_irqrestore(&info->lock, flags); +} + +static void do_unmask(struct irq_info *info, u8 reason) +{ + unsigned long flags; + + spin_lock_irqsave(&info->lock, flags); + + info->mask_reason &= ~reason; + + if (!info->mask_reason) + unmask_evtchn(info->evtchn); + + spin_unlock_irqrestore(&info->lock, flags); +} + #ifdef CONFIG_X86 static bool pirq_check_eoi_map(unsigned irq) { @@ -586,7 +622,7 @@ static void xen_irq_lateeoi_locked(struc }
info->eoi_time = 0; - unmask_evtchn(evtchn); + do_unmask(info, EVT_MASK_REASON_EOI_PENDING); }
static void xen_irq_lateeoi_worker(struct work_struct *work) @@ -831,7 +867,8 @@ static unsigned int __startup_pirq(unsig goto err;
out: - unmask_evtchn(evtchn); + do_unmask(info, EVT_MASK_REASON_EXPLICIT); + eoi_pirq(irq_get_irq_data(irq));
return 0; @@ -858,7 +895,7 @@ static void shutdown_pirq(struct irq_dat if (!VALID_EVTCHN(evtchn)) return;
- mask_evtchn(evtchn); + do_mask(info, EVT_MASK_REASON_EXPLICIT); xen_evtchn_close(evtchn); xen_irq_info_cleanup(info); } @@ -1691,10 +1728,10 @@ void rebind_evtchn_irq(evtchn_port_t evt }
/* Rebind an evtchn so that it gets delivered to a specific cpu */ -static int xen_rebind_evtchn_to_cpu(evtchn_port_t evtchn, unsigned int tcpu) +static int xen_rebind_evtchn_to_cpu(struct irq_info *info, unsigned int tcpu) { struct evtchn_bind_vcpu bind_vcpu; - int masked; + evtchn_port_t evtchn = info ? info->evtchn : 0;
if (!VALID_EVTCHN(evtchn)) return -1; @@ -1710,7 +1747,7 @@ static int xen_rebind_evtchn_to_cpu(evtc * Mask the event while changing the VCPU binding to prevent * it being delivered on an unexpected VCPU. */ - masked = test_and_set_mask(evtchn); + do_mask(info, EVT_MASK_REASON_TEMPORARY);
/* * If this fails, it usually just indicates that we're dealing with a @@ -1720,8 +1757,7 @@ static int xen_rebind_evtchn_to_cpu(evtc if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0) bind_evtchn_to_cpu(evtchn, tcpu, false);
- if (!masked) - unmask_evtchn(evtchn); + do_unmask(info, EVT_MASK_REASON_TEMPORARY);
return 0; } @@ -1760,7 +1796,7 @@ static int set_affinity_irq(struct irq_d unsigned int tcpu = select_target_cpu(dest); int ret;
- ret = xen_rebind_evtchn_to_cpu(evtchn_from_irq(data->irq), tcpu); + ret = xen_rebind_evtchn_to_cpu(info_for_irq(data->irq), tcpu); if (!ret) irq_data_update_effective_affinity(data, cpumask_of(tcpu));
@@ -1769,18 +1805,20 @@ static int set_affinity_irq(struct irq_d
static void enable_dynirq(struct irq_data *data) { - evtchn_port_t evtchn = evtchn_from_irq(data->irq); + struct irq_info *info = info_for_irq(data->irq); + evtchn_port_t evtchn = info ? info->evtchn : 0;
if (VALID_EVTCHN(evtchn)) - unmask_evtchn(evtchn); + do_unmask(info, EVT_MASK_REASON_EXPLICIT); }
static void disable_dynirq(struct irq_data *data) { - evtchn_port_t evtchn = evtchn_from_irq(data->irq); + struct irq_info *info = info_for_irq(data->irq); + evtchn_port_t evtchn = info ? info->evtchn : 0;
if (VALID_EVTCHN(evtchn)) - mask_evtchn(evtchn); + do_mask(info, EVT_MASK_REASON_EXPLICIT); }
static void ack_dynirq(struct irq_data *data) @@ -1799,18 +1837,39 @@ static void mask_ack_dynirq(struct irq_d ack_dynirq(data); }
+static void lateeoi_ack_dynirq(struct irq_data *data) +{ + struct irq_info *info = info_for_irq(data->irq); + evtchn_port_t evtchn = info ? info->evtchn : 0; + + if (VALID_EVTCHN(evtchn)) { + do_mask(info, EVT_MASK_REASON_EOI_PENDING); + clear_evtchn(evtchn); + } +} + +static void lateeoi_mask_ack_dynirq(struct irq_data *data) +{ + struct irq_info *info = info_for_irq(data->irq); + evtchn_port_t evtchn = info ? info->evtchn : 0; + + if (VALID_EVTCHN(evtchn)) { + do_mask(info, EVT_MASK_REASON_EXPLICIT); + clear_evtchn(evtchn); + } +} + static int retrigger_dynirq(struct irq_data *data) { - evtchn_port_t evtchn = evtchn_from_irq(data->irq); - int masked; + struct irq_info *info = info_for_irq(data->irq); + evtchn_port_t evtchn = info ? info->evtchn : 0;
if (!VALID_EVTCHN(evtchn)) return 0;
- masked = test_and_set_mask(evtchn); + do_mask(info, EVT_MASK_REASON_TEMPORARY); set_evtchn(evtchn); - if (!masked) - unmask_evtchn(evtchn); + do_unmask(info, EVT_MASK_REASON_TEMPORARY);
return 1; } @@ -2024,8 +2083,8 @@ static struct irq_chip xen_lateeoi_chip .irq_mask = disable_dynirq, .irq_unmask = enable_dynirq,
- .irq_ack = mask_ack_dynirq, - .irq_mask_ack = mask_ack_dynirq, + .irq_ack = lateeoi_ack_dynirq, + .irq_mask_ack = lateeoi_mask_ack_dynirq,
.irq_set_affinity = set_affinity_irq, .irq_retrigger = retrigger_dynirq, --- a/drivers/xen/events/events_fifo.c +++ b/drivers/xen/events/events_fifo.c @@ -209,12 +209,6 @@ static bool evtchn_fifo_is_pending(evtch return sync_test_bit(EVTCHN_FIFO_BIT(PENDING, word), BM(word)); }
-static bool evtchn_fifo_test_and_set_mask(evtchn_port_t port) -{ - event_word_t *word = event_word_from_port(port); - return sync_test_and_set_bit(EVTCHN_FIFO_BIT(MASKED, word), BM(word)); -} - static void evtchn_fifo_mask(evtchn_port_t port) { event_word_t *word = event_word_from_port(port); @@ -423,7 +417,6 @@ static const struct evtchn_ops evtchn_op .clear_pending = evtchn_fifo_clear_pending, .set_pending = evtchn_fifo_set_pending, .is_pending = evtchn_fifo_is_pending, - .test_and_set_mask = evtchn_fifo_test_and_set_mask, .mask = evtchn_fifo_mask, .unmask = evtchn_fifo_unmask, .handle_events = evtchn_fifo_handle_events, --- a/drivers/xen/events/events_internal.h +++ b/drivers/xen/events/events_internal.h @@ -21,7 +21,6 @@ struct evtchn_ops { void (*clear_pending)(evtchn_port_t port); void (*set_pending)(evtchn_port_t port); bool (*is_pending)(evtchn_port_t port); - bool (*test_and_set_mask)(evtchn_port_t port); void (*mask)(evtchn_port_t port); void (*unmask)(evtchn_port_t port);
@@ -84,11 +83,6 @@ static inline bool test_evtchn(evtchn_po return evtchn_ops->is_pending(port); }
-static inline bool test_and_set_mask(evtchn_port_t port) -{ - return evtchn_ops->test_and_set_mask(port); -} - static inline void mask_evtchn(evtchn_port_t port) { return evtchn_ops->mask(port);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Juergen Gross jgross@suse.com
commit b6622798bc50b625a1e62f82c7190df40c1f5b21 upstream.
When changing the cpu affinity of an event it can happen today that (with some unlucky timing) the same event will be handled on the old and the new cpu at the same time.
Avoid that by adding an "event active" flag to the per-event data and call the handler only if this flag isn't set.
Cc: stable@vger.kernel.org Reported-by: Julien Grall julien@xen.org Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Julien Grall jgrall@amazon.com Link: https://lore.kernel.org/r/20210306161833.4552-4-jgross@suse.com Signed-off-by: Boris Ostrovsky boris.ostrovsky@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/xen/events/events_base.c | 32 +++++++++++++++++++++----------- 1 file changed, 21 insertions(+), 11 deletions(-)
--- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -102,6 +102,7 @@ struct irq_info { #define EVT_MASK_REASON_EXPLICIT 0x01 #define EVT_MASK_REASON_TEMPORARY 0x02 #define EVT_MASK_REASON_EOI_PENDING 0x04 + u8 is_active; /* Is event just being handled? */ unsigned irq; evtchn_port_t evtchn; /* event channel */ unsigned short cpu; /* cpu bound */ @@ -791,6 +792,12 @@ static void xen_evtchn_close(evtchn_port BUG(); }
+static void event_handler_exit(struct irq_info *info) +{ + smp_store_release(&info->is_active, 0); + clear_evtchn(info->evtchn); +} + static void pirq_query_unmask(int irq) { struct physdev_irq_status_query irq_status; @@ -809,14 +816,15 @@ static void pirq_query_unmask(int irq)
static void eoi_pirq(struct irq_data *data) { - evtchn_port_t evtchn = evtchn_from_irq(data->irq); + struct irq_info *info = info_for_irq(data->irq); + evtchn_port_t evtchn = info ? info->evtchn : 0; struct physdev_eoi eoi = { .irq = pirq_from_irq(data->irq) }; int rc = 0;
if (!VALID_EVTCHN(evtchn)) return;
- clear_evtchn(evtchn); + event_handler_exit(info);
if (pirq_needs_eoi(data->irq)) { rc = HYPERVISOR_physdev_op(PHYSDEVOP_eoi, &eoi); @@ -1640,6 +1648,8 @@ void handle_irq_for_port(evtchn_port_t p }
info = info_for_irq(irq); + if (xchg_acquire(&info->is_active, 1)) + return;
if (ctrl->defer_eoi) { info->eoi_cpu = smp_processor_id(); @@ -1823,12 +1833,11 @@ static void disable_dynirq(struct irq_da
static void ack_dynirq(struct irq_data *data) { - evtchn_port_t evtchn = evtchn_from_irq(data->irq); - - if (!VALID_EVTCHN(evtchn)) - return; + struct irq_info *info = info_for_irq(data->irq); + evtchn_port_t evtchn = info ? info->evtchn : 0;
- clear_evtchn(evtchn); + if (VALID_EVTCHN(evtchn)) + event_handler_exit(info); }
static void mask_ack_dynirq(struct irq_data *data) @@ -1844,7 +1853,7 @@ static void lateeoi_ack_dynirq(struct ir
if (VALID_EVTCHN(evtchn)) { do_mask(info, EVT_MASK_REASON_EOI_PENDING); - clear_evtchn(evtchn); + event_handler_exit(info); } }
@@ -1855,7 +1864,7 @@ static void lateeoi_mask_ack_dynirq(stru
if (VALID_EVTCHN(evtchn)) { do_mask(info, EVT_MASK_REASON_EXPLICIT); - clear_evtchn(evtchn); + event_handler_exit(info); } }
@@ -1968,10 +1977,11 @@ static void restore_cpu_ipis(unsigned in /* Clear an irq's pending state, in preparation for polling on it */ void xen_clear_irq_pending(int irq) { - evtchn_port_t evtchn = evtchn_from_irq(irq); + struct irq_info *info = info_for_irq(irq); + evtchn_port_t evtchn = info ? info->evtchn : 0;
if (VALID_EVTCHN(evtchn)) - clear_evtchn(evtchn); + event_handler_exit(info); } EXPORT_SYMBOL(xen_clear_irq_pending); void xen_set_irq_pending(int irq)
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Yann Gautier yann.gautier@foss.st.com
commit 774514bf977377c9137640a0310bd64eed0f7323 upstream.
An issue has been observed on STM32MP157C-EV1 board, with an erase command with secure erase argument, ending up waiting for ~4 hours before timeout.
The requested busy timeout from the mmc core ends up with 14784000ms (~4 hours), but the supported host->max_busy_timeout is 86767ms, which leads to that the core switch to use an R1 response in favor of the R1B and polls for busy with the host->card_busy() ops. In this case the polling doesn't work as expected, as we never detects that the card stops signaling busy, which leads to the following message:
mmc1: Card stuck being busy! __mmc_poll_for_busy
The problem boils done to that the stm32 variants can't use R1 responses in favor of R1B responses, as it leads to an internal state machine in the controller to get stuck. To continue to process requests, it would need to be reset.
To fix this problem, let's set MMC_CAP_NEED_RSP_BUSY for the stm32 variant, which prevent the mmc core from switching to R1 responses. Additionally, let's cap the cmd->busy_timeout to the host->max_busy_timeout, thus rely on 86767ms to be sufficient (~66 seconds was need for this test case).
Fixes: 94fe2580a2f3 ("mmc: core: Enable erase/discard/trim support for all mmc hosts") Signed-off-by: Yann Gautier yann.gautier@foss.st.com Link: https://lore.kernel.org/r/20210225145454.12780-1-yann.gautier@foss.st.com Cc: stable@vger.kernel.org [Ulf: Simplified the code and extended the commit message] Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/mmci.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
--- a/drivers/mmc/host/mmci.c +++ b/drivers/mmc/host/mmci.c @@ -1241,7 +1241,11 @@ mmci_start_command(struct mmci_host *hos if (!cmd->busy_timeout) cmd->busy_timeout = 10 * MSEC_PER_SEC;
- clks = (unsigned long long)cmd->busy_timeout * host->cclk; + if (cmd->busy_timeout > host->mmc->max_busy_timeout) + clks = (unsigned long long)host->mmc->max_busy_timeout * host->cclk; + else + clks = (unsigned long long)cmd->busy_timeout * host->cclk; + do_div(clks, MSEC_PER_SEC); writel_relaxed(clks, host->base + MMCIDATATIMER); } @@ -2091,6 +2095,10 @@ static int mmci_probe(struct amba_device mmc->caps |= MMC_CAP_WAIT_WHILE_BUSY; }
+ /* Variants with mandatory busy timeout in HW needs R1B responses. */ + if (variant->busy_timeout) + mmc->caps |= MMC_CAP_NEED_RSP_BUSY; + /* Prepare a CMD12 - needed to clear the DPSM on some variants. */ host->stop_abort.opcode = MMC_STOP_TRANSMISSION; host->stop_abort.arg = 0;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Adrian Hunter adrian.hunter@intel.com
commit 66fbacccbab91e6e55d9c8f1fc0910a8eb6c81f7 upstream.
Avoid the following warning by always defining partition switch time:
[ 3.209874] mmc1: unspecified timeout for CMD6 - use generic [ 3.222780] ------------[ cut here ]------------ [ 3.233363] WARNING: CPU: 1 PID: 111 at drivers/mmc/core/mmc_ops.c:575 __mmc_switch+0x200/0x204
Reported-by: Paul Fertser fercerpav@gmail.com Fixes: 1c447116d017 ("mmc: mmc: Fix partition switch timeout for some eMMCs") Signed-off-by: Adrian Hunter adrian.hunter@intel.com Link: https://lore.kernel.org/r/168bbfd6-0c5b-5ace-ab41-402e7937c46e@intel.com Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/core/mmc.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-)
--- a/drivers/mmc/core/mmc.c +++ b/drivers/mmc/core/mmc.c @@ -423,10 +423,6 @@ static int mmc_decode_ext_csd(struct mmc
/* EXT_CSD value is in units of 10ms, but we store in ms */ card->ext_csd.part_time = 10 * ext_csd[EXT_CSD_PART_SWITCH_TIME]; - /* Some eMMC set the value too low so set a minimum */ - if (card->ext_csd.part_time && - card->ext_csd.part_time < MMC_MIN_PART_SWITCH_TIME) - card->ext_csd.part_time = MMC_MIN_PART_SWITCH_TIME;
/* Sleep / awake timeout in 100ns units */ if (sa_shift > 0 && sa_shift <= 0x17) @@ -616,6 +612,17 @@ static int mmc_decode_ext_csd(struct mmc card->ext_csd.data_sector_size = 512; }
+ /* + * GENERIC_CMD6_TIME is to be used "unless a specific timeout is defined + * when accessing a specific field", so use it here if there is no + * PARTITION_SWITCH_TIME. + */ + if (!card->ext_csd.part_time) + card->ext_csd.part_time = card->ext_csd.generic_cmd6_time; + /* Some eMMC set the value too low so set a minimum */ + if (card->ext_csd.part_time < MMC_MIN_PART_SWITCH_TIME) + card->ext_csd.part_time = MMC_MIN_PART_SWITCH_TIME; + /* eMMC v5 or later */ if (card->ext_csd.rev >= 7) { memcpy(card->ext_csd.fwrev, &ext_csd[EXT_CSD_FIRMWARE_VERSION],
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Frank Li lznuaa@gmail.com
commit f06391c45e83f9a731045deb23df7cc3814fd795 upstream.
[ 6684.493350] Unable to handle kernel paging request at virtual address ffff800011c5b0f0 [ 6684.498531] mmc0: card 0001 removed [ 6684.501556] Mem abort info: [ 6684.509681] ESR = 0x96000047 [ 6684.512786] EC = 0x25: DABT (current EL), IL = 32 bits [ 6684.518394] SET = 0, FnV = 0 [ 6684.521707] EA = 0, S1PTW = 0 [ 6684.524998] Data abort info: [ 6684.528236] ISV = 0, ISS = 0x00000047 [ 6684.532986] CM = 0, WnR = 1 [ 6684.536129] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000081b22000 [ 6684.543923] [ffff800011c5b0f0] pgd=00000000bffff003, p4d=00000000bffff003, pud=00000000bfffe003, pmd=00000000900e1003, pte=0000000000000000 [ 6684.557915] Internal error: Oops: 96000047 [#1] PREEMPT SMP [ 6684.564240] Modules linked in: sdhci_esdhc_imx(-) sdhci_pltfm sdhci cqhci mmc_block mmc_core fsl_jr_uio caam_jr caamkeyblob_desc caamhash_desc caamalg_desc crypto_engine rng_core authenc libdes crct10dif_ce flexcan can_dev caam error [last unloaded: mmc_core] [ 6684.587281] CPU: 0 PID: 79138 Comm: kworker/0:3H Not tainted 5.10.9-01410-g3ba33182767b-dirty #10 [ 6684.596160] Hardware name: Freescale i.MX8DXL EVK (DT) [ 6684.601320] Workqueue: kblockd blk_mq_run_work_fn
[ 6684.606094] pstate: 40000005 (nZcv daif -PAN -UAO -TCO BTYPE=--) [ 6684.612286] pc : cqhci_request+0x148/0x4e8 [cqhci] ^GMessage from syslogd@ at Thu Jan 1 01:51:24 1970 ...[ 6684.617085] lr : cqhci_request+0x314/0x4e8 [cqhci] [ 6684.626734] sp : ffff80001243b9f0 [ 6684.630049] x29: ffff80001243b9f0 x28: ffff00002c3dd000 [ 6684.635367] x27: 0000000000000001 x26: 0000000000000001 [ 6684.640690] x25: ffff00002c451000 x24: 000000000000000f [ 6684.646007] x23: ffff000017e71c80 x22: ffff00002c451000 [ 6684.651326] x21: ffff00002c0f3550 x20: ffff00002c0f3550 [ 6684.656651] x19: ffff000017d46880 x18: ffff00002cea1500 [ 6684.661977] x17: 0000000000000000 x16: 0000000000000000 [ 6684.667294] x15: 000001ee628e3ed1 x14: 0000000000000278 [ 6684.672610] x13: 0000000000000001 x12: 0000000000000001 [ 6684.677927] x11: 0000000000000000 x10: 0000000000000000 [ 6684.683243] x9 : 000000000000002b x8 : 0000000000001000 [ 6684.688560] x7 : 0000000000000010 x6 : ffff00002c0f3678 [ 6684.693886] x5 : 000000000000000f x4 : ffff800011c5b000 [ 6684.699211] x3 : 000000000002d988 x2 : 0000000000000008 [ 6684.704537] x1 : 00000000000000f0 x0 : 0002d9880008102f [ 6684.709854] Call trace: [ 6684.712313] cqhci_request+0x148/0x4e8 [cqhci] [ 6684.716803] mmc_cqe_start_req+0x58/0x68 [mmc_core] [ 6684.721698] mmc_blk_mq_issue_rq+0x460/0x810 [mmc_block] [ 6684.727018] mmc_mq_queue_rq+0x118/0x2b0 [mmc_block]
The problem occurs when cqhci_request() get called after cqhci_disable() as it leads to access of allocated memory that has already been freed. Let's fix the problem by calling cqhci_disable() a bit later in the remove path.
Signed-off-by: Frank Li Frank.Li@nxp.com Diagnosed-by: Adrian Hunter adrian.hunter@intel.com Acked-by: Adrian Hunter adrian.hunter@intel.com Link: https://lore.kernel.org/r/20210303174248.542175-1-Frank.Li@nxp.com Fixes: f690f4409ddd ("mmc: mmc: Enable CQE's") Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/core/bus.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-)
--- a/drivers/mmc/core/bus.c +++ b/drivers/mmc/core/bus.c @@ -399,11 +399,6 @@ void mmc_remove_card(struct mmc_card *ca mmc_remove_card_debugfs(card); #endif
- if (host->cqe_enabled) { - host->cqe_ops->cqe_disable(host); - host->cqe_enabled = false; - } - if (mmc_card_present(card)) { if (mmc_host_is_spi(card->host)) { pr_info("%s: SPI card removed\n", @@ -416,6 +411,10 @@ void mmc_remove_card(struct mmc_card *ca of_node_put(card->dev.of_node); }
+ if (host->cqe_enabled) { + host->cqe_ops->cqe_disable(host); + host->cqe_enabled = false; + } + put_device(&card->dev); } -
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Paulo Alcantara pc@cjr.nz
commit 04ad69c342fc4de5bd23be9ef15ea7574fb1a87e upstream.
In case of interrupted syscalls, prevent sending CLOSE commands for compound CREATE+CLOSE requests by introducing an CIFS_CP_CREATE_CLOSE_OP flag to indicate lower layers that it should not send a CLOSE command to the MIDs corresponding the compound CREATE+CLOSE request.
A simple reproducer:
#!/bin/bash
mount //server/share /mnt -o username=foo,password=*** tc qdisc add dev eth0 root netem delay 450ms stat -f /mnt &>/dev/null & pid=$! sleep 0.01 kill $pid tc qdisc del dev eth0 root umount /mnt
Before patch:
... 6 0.256893470 192.168.122.2 → 192.168.122.15 SMB2 402 Create Request File: ;GetInfo Request FS_INFO/FileFsFullSizeInformation;Close Request 7 0.257144491 192.168.122.15 → 192.168.122.2 SMB2 498 Create Response File: ;GetInfo Response;Close Response 9 0.260798209 192.168.122.2 → 192.168.122.15 SMB2 146 Close Request File: 10 0.260841089 192.168.122.15 → 192.168.122.2 SMB2 130 Close Response, Error: STATUS_FILE_CLOSED
Signed-off-by: Paulo Alcantara (SUSE) pc@cjr.nz Reviewed-by: Ronnie Sahlberg lsahlber@redhat.com Reviewed-by: Aurelien Aptel aaptel@suse.com CC: stable@vger.kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/cifsglob.h | 11 ++++++----- fs/cifs/smb2inode.c | 1 + fs/cifs/smb2misc.c | 8 ++++---- fs/cifs/smb2ops.c | 10 +++++----- fs/cifs/smb2proto.h | 3 +-- fs/cifs/transport.c | 2 +- 6 files changed, 18 insertions(+), 17 deletions(-)
--- a/fs/cifs/cifsglob.h +++ b/fs/cifs/cifsglob.h @@ -256,7 +256,7 @@ struct smb_version_operations { /* verify the message */ int (*check_message)(char *, unsigned int, struct TCP_Server_Info *); bool (*is_oplock_break)(char *, struct TCP_Server_Info *); - int (*handle_cancelled_mid)(char *, struct TCP_Server_Info *); + int (*handle_cancelled_mid)(struct mid_q_entry *, struct TCP_Server_Info *); void (*downgrade_oplock)(struct TCP_Server_Info *server, struct cifsInodeInfo *cinode, __u32 oplock, unsigned int epoch, bool *purge_cache); @@ -1701,10 +1701,11 @@ static inline bool is_retryable_error(in #define CIFS_NO_RSP_BUF 0x040 /* no response buffer required */
/* Type of request operation */ -#define CIFS_ECHO_OP 0x080 /* echo request */ -#define CIFS_OBREAK_OP 0x0100 /* oplock break request */ -#define CIFS_NEG_OP 0x0200 /* negotiate request */ -#define CIFS_OP_MASK 0x0380 /* mask request type */ +#define CIFS_ECHO_OP 0x080 /* echo request */ +#define CIFS_OBREAK_OP 0x0100 /* oplock break request */ +#define CIFS_NEG_OP 0x0200 /* negotiate request */ +#define CIFS_CP_CREATE_CLOSE_OP 0x0400 /* compound create+close request */ +#define CIFS_OP_MASK 0x0780 /* mask request type */
#define CIFS_HAS_CREDITS 0x0400 /* already has credits */ #define CIFS_TRANSFORM_REQ 0x0800 /* transform request before sending */ --- a/fs/cifs/smb2inode.c +++ b/fs/cifs/smb2inode.c @@ -358,6 +358,7 @@ smb2_compound_op(const unsigned int xid, if (cfile) goto after_close; /* Close */ + flags |= CIFS_CP_CREATE_CLOSE_OP; rqst[num_rqst].rq_iov = &vars->close_iov[0]; rqst[num_rqst].rq_nvec = 1; rc = SMB2_close_init(tcon, server, --- a/fs/cifs/smb2misc.c +++ b/fs/cifs/smb2misc.c @@ -844,14 +844,14 @@ smb2_handle_cancelled_close(struct cifs_ }
int -smb2_handle_cancelled_mid(char *buffer, struct TCP_Server_Info *server) +smb2_handle_cancelled_mid(struct mid_q_entry *mid, struct TCP_Server_Info *server) { - struct smb2_sync_hdr *sync_hdr = (struct smb2_sync_hdr *)buffer; - struct smb2_create_rsp *rsp = (struct smb2_create_rsp *)buffer; + struct smb2_sync_hdr *sync_hdr = mid->resp_buf; + struct smb2_create_rsp *rsp = mid->resp_buf; struct cifs_tcon *tcon; int rc;
- if (sync_hdr->Command != SMB2_CREATE || + if ((mid->optype & CIFS_CP_CREATE_CLOSE_OP) || sync_hdr->Command != SMB2_CREATE || sync_hdr->Status != STATUS_SUCCESS) return 0;
--- a/fs/cifs/smb2ops.c +++ b/fs/cifs/smb2ops.c @@ -1164,7 +1164,7 @@ smb2_set_ea(const unsigned int xid, stru struct TCP_Server_Info *server = cifs_pick_channel(ses); __le16 *utf16_path = NULL; int ea_name_len = strlen(ea_name); - int flags = 0; + int flags = CIFS_CP_CREATE_CLOSE_OP; int len; struct smb_rqst rqst[3]; int resp_buftype[3]; @@ -1542,7 +1542,7 @@ smb2_ioctl_query_info(const unsigned int struct smb_query_info qi; struct smb_query_info __user *pqi; int rc = 0; - int flags = 0; + int flags = CIFS_CP_CREATE_CLOSE_OP; struct smb2_query_info_rsp *qi_rsp = NULL; struct smb2_ioctl_rsp *io_rsp = NULL; void *buffer = NULL; @@ -2516,7 +2516,7 @@ smb2_query_info_compound(const unsigned { struct cifs_ses *ses = tcon->ses; struct TCP_Server_Info *server = cifs_pick_channel(ses); - int flags = 0; + int flags = CIFS_CP_CREATE_CLOSE_OP; struct smb_rqst rqst[3]; int resp_buftype[3]; struct kvec rsp_iov[3]; @@ -2914,7 +2914,7 @@ smb2_query_symlink(const unsigned int xi unsigned int sub_offset; unsigned int print_len; unsigned int print_offset; - int flags = 0; + int flags = CIFS_CP_CREATE_CLOSE_OP; struct smb_rqst rqst[3]; int resp_buftype[3]; struct kvec rsp_iov[3]; @@ -3096,7 +3096,7 @@ smb2_query_reparse_tag(const unsigned in struct cifs_open_parms oparms; struct cifs_fid fid; struct TCP_Server_Info *server = cifs_pick_channel(tcon->ses); - int flags = 0; + int flags = CIFS_CP_CREATE_CLOSE_OP; struct smb_rqst rqst[3]; int resp_buftype[3]; struct kvec rsp_iov[3]; --- a/fs/cifs/smb2proto.h +++ b/fs/cifs/smb2proto.h @@ -246,8 +246,7 @@ extern int SMB2_oplock_break(const unsig extern int smb2_handle_cancelled_close(struct cifs_tcon *tcon, __u64 persistent_fid, __u64 volatile_fid); -extern int smb2_handle_cancelled_mid(char *buffer, - struct TCP_Server_Info *server); +extern int smb2_handle_cancelled_mid(struct mid_q_entry *mid, struct TCP_Server_Info *server); void smb2_cancelled_close_fid(struct work_struct *work); extern int SMB2_QFS_info(const unsigned int xid, struct cifs_tcon *tcon, u64 persistent_file_id, u64 volatile_file_id, --- a/fs/cifs/transport.c +++ b/fs/cifs/transport.c @@ -101,7 +101,7 @@ static void _cifs_mid_q_entry_release(st if (midEntry->resp_buf && (midEntry->mid_flags & MID_WAIT_CANCELLED) && midEntry->mid_state == MID_RESPONSE_RECEIVED && server->ops->handle_cancelled_mid) - server->ops->handle_cancelled_mid(midEntry->resp_buf, server); + server->ops->handle_cancelled_mid(midEntry, server);
midEntry->mid_state = MID_FREE; atomic_dec(&midCount);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Yorick de Wid ydewid@gmail.com
commit 4d8654e81db7346f915eca9f1aff18f385cab621 upstream.
The CDC ACM driver is false matching the Goodix Fingerprint device against the USB_CDC_ACM_PROTO_AT_V25TER.
The Goodix Fingerprint device is a biometrics sensor that should be handled in user-space. libfprint has some support for Goodix fingerprint sensors, although not for this particular one. It is possible that the vendor allocates a PID per OEM (Lenovo, Dell etc). If this happens to be the case then more devices from the same vendor could potentially match the ACM modem module table.
Signed-off-by: Yorick de Wid ydewid@gmail.com Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/20210213144901.53199-1-ydewid@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/class/cdc-acm.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/usb/class/cdc-acm.c +++ b/drivers/usb/class/cdc-acm.c @@ -1929,6 +1929,11 @@ static const struct usb_device_id acm_id .driver_info = SEND_ZERO_PACKET, },
+ /* Exclude Goodix Fingerprint Reader */ + { USB_DEVICE(0x27c6, 0x5395), + .driver_info = IGNORE_DEVICE, + }, + /* control interfaces without any protocol set */ { USB_INTERFACE_INFO(USB_CLASS_COMM, USB_CDC_SUBCLASS_ACM, USB_CDC_PROTO_NONE) },
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Wei Yongjun weiyongjun1@huawei.com
commit 414c20df7d401bcf1cb6c13d2dd944fb53ae4acf upstream.
In case of error, the function devm_platform_ioremap_resource() returns ERR_PTR() and never returns NULL. The NULL test in the return value check should be replaced with IS_ERR().
Fixes: 188db4435ac6 ("usb: gadget: s3c: use platform resources") Cc: stable stable@vger.kernel.org Reported-by: Hulk Robot hulkci@huawei.com Reviewed-by: Arnd Bergmann arnd@arndb.de Reviewed-by: Krzysztof Kozlowski krzysztof.kozlowski@canonical.com Signed-off-by: Wei Yongjun weiyongjun1@huawei.com Link: https://lore.kernel.org/r/20210305034927.3232386-1-weiyongjun1@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/udc/s3c2410_udc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/usb/gadget/udc/s3c2410_udc.c +++ b/drivers/usb/gadget/udc/s3c2410_udc.c @@ -1773,8 +1773,8 @@ static int s3c2410_udc_probe(struct plat udc_info = dev_get_platdata(&pdev->dev);
base_addr = devm_platform_ioremap_resource(pdev, 0); - if (!base_addr) { - retval = -ENOMEM; + if (IS_ERR(base_addr)) { + retval = PTR_ERR(base_addr); goto err_mem; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Dan Carpenter dan.carpenter@oracle.com
commit 650bf52208d804ad5ee449c58102f8dc43175573 upstream.
If the string is invalid, this should return -EINVAL instead of 0.
Fixes: 73517cf49bd4 ("usb: gadget: add RNDIS configfs options for class/subclass/protocol") Cc: stable stable@vger.kernel.org Acked-by: Lorenzo Colitti lorenzo@google.com Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Link: https://lore.kernel.org/r/YCqZ3P53yyIg5cn7@mwanda Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/function/u_ether_configfs.h | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/drivers/usb/gadget/function/u_ether_configfs.h +++ b/drivers/usb/gadget/function/u_ether_configfs.h @@ -169,12 +169,11 @@ out: \ size_t len) \ { \ struct f_##_f_##_opts *opts = to_f_##_f_##_opts(item); \ - int ret; \ + int ret = -EINVAL; \ u8 val; \ \ mutex_lock(&opts->lock); \ - ret = sscanf(page, "%02hhx", &val); \ - if (ret > 0) { \ + if (sscanf(page, "%02hhx", &val) > 0) { \ opts->_n_ = val; \ ret = len; \ } \
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Ruslan Bilovol ruslan.bilovol@gmail.com
commit 789ea77310f0200c84002884ffd628e2baf3ad8a upstream.
As per UAC2 Audio Data Formats spec (2.3.1.1 USB Packets), if the sampling rate is a constant, the allowable variation of number of audio slots per virtual frame is +/- 1 audio slot.
It means that endpoint should be able to accept/send +1 audio slot.
Previous endpoint max_packet_size calculation code was adding sometimes +1 audio slot due to DIV_ROUND_UP behaviour which was rounding up to closest integer. However this doesn't work if the numbers are divisible.
It had no any impact with Linux hosts which ignore this issue, but in case of more strict Windows it caused rejected enumeration
Thus always add +1 audio slot to endpoint's max packet size
Fixes: 913e4a90b6f9 ("usb: gadget: f_uac2: finalize wMaxPacketSize according to bandwidth") Cc: Peter Chen peter.chen@freescale.com Cc: stable@vger.kernel.org #v4.3+ Signed-off-by: Ruslan Bilovol ruslan.bilovol@gmail.com Link: https://lore.kernel.org/r/1614599375-8803-2-git-send-email-ruslan.bilovol@gm... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/function/f_uac2.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/usb/gadget/function/f_uac2.c +++ b/drivers/usb/gadget/function/f_uac2.c @@ -478,7 +478,7 @@ static int set_ep_max_packet_size(const }
max_size_bw = num_channels(chmask) * ssize * - DIV_ROUND_UP(srate, factor / (1 << (ep_desc->bInterval - 1))); + ((srate / (factor / (1 << (ep_desc->bInterval - 1)))) + 1); ep_desc->wMaxPacketSize = cpu_to_le16(min_t(u16, max_size_bw, max_size_ep));
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Ruslan Bilovol ruslan.bilovol@gmail.com
commit cc2ac63d4cf72104e0e7f58bb846121f0f51bb19 upstream.
There is missing playback stop/cleanup in case of gadget's ->disable callback that happens on events like USB host resetting or gadget disconnection
Fixes: 0591bc236015 ("usb: gadget: add f_uac1 variant based on a new u_audio api") Cc: stable@vger.kernel.org # 4.13+ Signed-off-by: Ruslan Bilovol ruslan.bilovol@gmail.com Link: https://lore.kernel.org/r/1614599375-8803-3-git-send-email-ruslan.bilovol@gm... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/function/f_uac1.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/usb/gadget/function/f_uac1.c +++ b/drivers/usb/gadget/function/f_uac1.c @@ -499,6 +499,7 @@ static void f_audio_disable(struct usb_f uac1->as_out_alt = 0; uac1->as_in_alt = 0;
+ u_audio_stop_playback(&uac1->g_audio); u_audio_stop_capture(&uac1->g_audio); }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Serge Semin Sergey.Semin@baikalelectronics.ru
commit 1cffb1c66499a9db9a735473778abf8427d16287 upstream.
of_get_child_by_name() increments the reference counter of the OF node it managed to find. So after the code is done using the device node, the refcount must be decremented. Add missing of_node_put() invocation then to the dwc3_qcom_of_register_core() method, since DWC3 OF node is being used only there.
Fixes: a4333c3a6ba9 ("usb: dwc3: Add Qualcomm DWC3 glue driver") Signed-off-by: Serge Semin Sergey.Semin@baikalelectronics.ru Link: https://lore.kernel.org/r/20210212205521.14280-1-Sergey.Semin@baikalelectron... Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc3/dwc3-qcom.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
--- a/drivers/usb/dwc3/dwc3-qcom.c +++ b/drivers/usb/dwc3/dwc3-qcom.c @@ -639,16 +639,19 @@ static int dwc3_qcom_of_register_core(st ret = of_platform_populate(np, NULL, NULL, dev); if (ret) { dev_err(dev, "failed to register dwc3 core - %d\n", ret); - return ret; + goto node_put; }
qcom->dwc3 = of_find_device_by_node(dwc3_np); if (!qcom->dwc3) { + ret = -ENODEV; dev_err(dev, "failed to get dwc3 platform device\n"); - return -ENODEV; }
- return 0; +node_put: + of_node_put(dwc3_np); + + return ret; }
static int dwc3_qcom_probe(struct platform_device *pdev)
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Shawn Guo shawn.guo@linaro.org
commit c25c210f590e7a37eecd865d84f97d1f40e39786 upstream.
For sdm845 ACPI boot, the URS (USB Role Switch) node in ACPI DSDT table holds the memory resource, while interrupt resources reside in the child nodes USB0 and UFN0. It adds USB0 host support by probing URS node, creating platform device for USB0 node, and then retrieve interrupt resources from USB0 platform device.
Reviewed-by: Bjorn Andersson bjorn.andersson@linaro.org Signed-off-by: Shawn Guo shawn.guo@linaro.org Link: https://lore.kernel.org/r/20210115035057.10994-1-shawn.guo@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc3/dwc3-qcom.c | 59 ++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 56 insertions(+), 3 deletions(-)
--- a/drivers/usb/dwc3/dwc3-qcom.c +++ b/drivers/usb/dwc3/dwc3-qcom.c @@ -60,12 +60,14 @@ struct dwc3_acpi_pdata { int dp_hs_phy_irq_index; int dm_hs_phy_irq_index; int ss_phy_irq_index; + bool is_urs; };
struct dwc3_qcom { struct device *dev; void __iomem *qscratch_base; struct platform_device *dwc3; + struct platform_device *urs_usb; struct clk **clks; int num_clocks; struct reset_control *resets; @@ -429,13 +431,15 @@ static void dwc3_qcom_select_utmi_clk(st static int dwc3_qcom_get_irq(struct platform_device *pdev, const char *name, int num) { + struct dwc3_qcom *qcom = platform_get_drvdata(pdev); + struct platform_device *pdev_irq = qcom->urs_usb ? qcom->urs_usb : pdev; struct device_node *np = pdev->dev.of_node; int ret;
if (np) - ret = platform_get_irq_byname(pdev, name); + ret = platform_get_irq_byname(pdev_irq, name); else - ret = platform_get_irq(pdev, num); + ret = platform_get_irq(pdev_irq, num);
return ret; } @@ -568,6 +572,8 @@ static int dwc3_qcom_acpi_register_core( struct dwc3_qcom *qcom = platform_get_drvdata(pdev); struct device *dev = &pdev->dev; struct resource *res, *child_res = NULL; + struct platform_device *pdev_irq = qcom->urs_usb ? qcom->urs_usb : + pdev; int irq; int ret;
@@ -597,7 +603,7 @@ static int dwc3_qcom_acpi_register_core( child_res[0].end = child_res[0].start + qcom->acpi_pdata->dwc3_core_base_size;
- irq = platform_get_irq(pdev, 0); + irq = platform_get_irq(pdev_irq, 0); child_res[1].flags = IORESOURCE_IRQ; child_res[1].start = child_res[1].end = irq;
@@ -654,6 +660,33 @@ node_put: return ret; }
+static struct platform_device * +dwc3_qcom_create_urs_usb_platdev(struct device *dev) +{ + struct fwnode_handle *fwh; + struct acpi_device *adev; + char name[8]; + int ret; + int id; + + /* Figure out device id */ + ret = sscanf(fwnode_get_name(dev->fwnode), "URS%d", &id); + if (!ret) + return NULL; + + /* Find the child using name */ + snprintf(name, sizeof(name), "USB%d", id); + fwh = fwnode_get_named_child_node(dev->fwnode, name); + if (!fwh) + return NULL; + + adev = to_acpi_device_node(fwh); + if (!adev) + return NULL; + + return acpi_create_platform_device(adev, NULL); +} + static int dwc3_qcom_probe(struct platform_device *pdev) { struct device_node *np = pdev->dev.of_node; @@ -718,6 +751,14 @@ static int dwc3_qcom_probe(struct platfo qcom->acpi_pdata->qscratch_base_offset; parent_res->end = parent_res->start + qcom->acpi_pdata->qscratch_base_size; + + if (qcom->acpi_pdata->is_urs) { + qcom->urs_usb = dwc3_qcom_create_urs_usb_platdev(dev); + if (!qcom->urs_usb) { + dev_err(dev, "failed to create URS USB platdev\n"); + return -ENODEV; + } + } }
qcom->qscratch_base = devm_ioremap_resource(dev, parent_res); @@ -880,8 +921,20 @@ static const struct dwc3_acpi_pdata sdm8 .ss_phy_irq_index = 2 };
+static const struct dwc3_acpi_pdata sdm845_acpi_urs_pdata = { + .qscratch_base_offset = SDM845_QSCRATCH_BASE_OFFSET, + .qscratch_base_size = SDM845_QSCRATCH_SIZE, + .dwc3_core_base_size = SDM845_DWC3_CORE_SIZE, + .hs_phy_irq_index = 1, + .dp_hs_phy_irq_index = 4, + .dm_hs_phy_irq_index = 3, + .ss_phy_irq_index = 2, + .is_urs = true, +}; + static const struct acpi_device_id dwc3_qcom_acpi_match[] = { { "QCOM2430", (unsigned long)&sdm845_acpi_pdata }, + { "QCOM0304", (unsigned long)&sdm845_acpi_urs_pdata }, { }, }; MODULE_DEVICE_TABLE(acpi, dwc3_qcom_acpi_match);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Shawn Guo shawn.guo@linaro.org
commit 1edbff9c80ed32071fffa7dbaaea507fdb21ff2d upstream.
It enables USB Host support for sc8180x ACPI boot, both the standalone one and the one behind URS (USB Role Switch). And they share the the same dwc3_acpi_pdata with sdm845.
Signed-off-by: Shawn Guo shawn.guo@linaro.org Link: https://lore.kernel.org/r/20210301075745.20544-1-shawn.guo@linaro.org Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc3/dwc3-qcom.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/usb/dwc3/dwc3-qcom.c +++ b/drivers/usb/dwc3/dwc3-qcom.c @@ -935,6 +935,8 @@ static const struct dwc3_acpi_pdata sdm8 static const struct acpi_device_id dwc3_qcom_acpi_match[] = { { "QCOM2430", (unsigned long)&sdm845_acpi_pdata }, { "QCOM0304", (unsigned long)&sdm845_acpi_urs_pdata }, + { "QCOM0497", (unsigned long)&sdm845_acpi_urs_pdata }, + { "QCOM04A6", (unsigned long)&sdm845_acpi_pdata }, { }, }; MODULE_DEVICE_TABLE(acpi, dwc3_qcom_acpi_match);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Matthias Kaehlcke mka@chromium.org
commit 2664deb0930643149d61cddbb66ada527ae180bd upstream.
The dwc3-qcom currently enables wakeup interrupts unconditionally when suspending, however this should not be done when wakeup is disabled (e.g. through the sysfs attribute power/wakeup). Only enable wakeup interrupts when device_may_wakeup() returns true.
Fixes: a4333c3a6ba9 ("usb: dwc3: Add Qualcomm DWC3 glue driver") Reviewed-by: Bjorn Andersson bjorn.andersson@linaro.org Signed-off-by: Matthias Kaehlcke mka@chromium.org Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/20210302103659.v2.1.I44954d9e1169f2cf5c44e6454d357... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc3/dwc3-qcom.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/drivers/usb/dwc3/dwc3-qcom.c +++ b/drivers/usb/dwc3/dwc3-qcom.c @@ -358,8 +358,10 @@ static int dwc3_qcom_suspend(struct dwc3 if (ret) dev_warn(qcom->dev, "failed to disable interconnect: %d\n", ret);
+ if (device_may_wakeup(qcom->dev)) + dwc3_qcom_enable_interrupts(qcom); + qcom->is_suspended = true; - dwc3_qcom_enable_interrupts(qcom);
return 0; } @@ -372,7 +374,8 @@ static int dwc3_qcom_resume(struct dwc3_ if (!qcom->is_suspended) return 0;
- dwc3_qcom_disable_interrupts(qcom); + if (device_may_wakeup(qcom->dev)) + dwc3_qcom_disable_interrupts(qcom);
for (i = 0; i < qcom->num_clocks; i++) { ret = clk_prepare_enable(qcom->clks[i]);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Pete Zaitcev zaitcev@redhat.com
commit 9de2c43acf37a17dc4c69ff78bb099b80fb74325 upstream.
Apparently an application that opens a device and calls select() on it, will hang if the decice is disconnected. It's a little surprising that we had this bug for 15 years, but apparently nobody ever uses select() with a printer: only write() and read(), and those work fine. Well, you can also select() with a timeout.
The fix is modeled after devio.c. A few other drivers check the condition first, then do not add the wait queue in case the device is disconnected. We doubt that's completely race-free. So, this patch adds the process first, then locks properly and checks for the disconnect.
Reviewed-by: Zqiang qiang.zhang@windriver.com Signed-off-by: Pete Zaitcev zaitcev@redhat.com Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/20210303221053.1cf3313e@suzdal.zaitcev.lan Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/class/usblp.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-)
--- a/drivers/usb/class/usblp.c +++ b/drivers/usb/class/usblp.c @@ -494,16 +494,24 @@ static int usblp_release(struct inode *i /* No kernel lock - fine */ static __poll_t usblp_poll(struct file *file, struct poll_table_struct *wait) { - __poll_t ret; + struct usblp *usblp = file->private_data; + __poll_t ret = 0; unsigned long flags;
- struct usblp *usblp = file->private_data; /* Should we check file->f_mode & FMODE_WRITE before poll_wait()? */ poll_wait(file, &usblp->rwait, wait); poll_wait(file, &usblp->wwait, wait); + + mutex_lock(&usblp->mut); + if (!usblp->present) + ret |= EPOLLHUP; + mutex_unlock(&usblp->mut); + spin_lock_irqsave(&usblp->lock, flags); - ret = ((usblp->bidir && usblp->rcomplete) ? EPOLLIN | EPOLLRDNORM : 0) | - ((usblp->no_paper || usblp->wcomplete) ? EPOLLOUT | EPOLLWRNORM : 0); + if (usblp->bidir && usblp->rcomplete) + ret |= EPOLLIN | EPOLLRDNORM; + if (usblp->no_paper || usblp->wcomplete) + ret |= EPOLLOUT | EPOLLWRNORM; spin_unlock_irqrestore(&usblp->lock, flags); return ret; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com
commit b1d25e6ee57c2605845595b6c61340d734253eb3 upstream.
According to the datasheet, this controller has a restriction which "set an endpoint number so that combinations of the DIR bit and the EPNUM bits do not overlap.". However, since the udc core driver is possible to assign a bulk pipe as an interrupt endpoint, an endpoint number may not match the pipe number. After that, when user rebinds another gadget driver, this driver broke the restriction because the driver didn't clear any configuration in usb_ep_disable().
Example: # modprobe g_ncm Then, EP3 = pipe 3, EP4 = pipe 4, EP5 = pipe 6 # rmmod g_ncm # modprobe g_hid Then, EP3 = pipe 6, EP4 = pipe 7. So, pipe 3 and pipe 6 are set as EP3.
So, clear PIPECFG register in usbhs_pipe_free().
Fixes: dfb87b8bfe09 ("usb: renesas_usbhs: gadget: fix re-enabling pipe without re-connecting") Cc: stable stable@vger.kernel.org Signed-off-by: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com Link: https://lore.kernel.org/r/1615168538-26101-1-git-send-email-yoshihiro.shimod... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/renesas_usbhs/pipe.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/usb/renesas_usbhs/pipe.c +++ b/drivers/usb/renesas_usbhs/pipe.c @@ -746,6 +746,8 @@ struct usbhs_pipe *usbhs_pipe_malloc(str
void usbhs_pipe_free(struct usbhs_pipe *pipe) { + usbhsp_pipe_select(pipe); + usbhsp_pipe_cfg_set(pipe, 0xFFFF, 0); usbhsp_put_pipe(pipe); }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Stanislaw Gruszka stf_xl@wp.pl
commit a4a251f8c23518899d2078c320cf9ce2fa459c9f upstream.
On some systems rt2800usb and mt7601u devices are unable to operate since commit f8f80be501aa ("xhci: Use soft retry to recover faster from transaction errors")
Seems that some xHCI controllers can not perform Soft Retry correctly, affecting those devices.
To avoid the problem add xhci->quirks flag that restore pre soft retry xhci behaviour for affected xHCI controllers. Currently those are AMD_PROMONTORYA_4 and AMD_PROMONTORYA_2, since it was confirmed by the users: on those xHCI hosts issue happen and is gone after disabling Soft Retry.
[minor commit message rewording for checkpatch -Mathias]
Fixes: f8f80be501aa ("xhci: Use soft retry to recover faster from transaction errors") Cc: stable@vger.kernel.org # 4.20+ Reported-by: Bernhard bernhard.gebetsberger@gmx.at Tested-by: Bernhard bernhard.gebetsberger@gmx.at Signed-off-by: Stanislaw Gruszka stf_xl@wp.pl Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=202541 Link: https://lore.kernel.org/r/20210311115353.2137560-2-mathias.nyman@linux.intel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-pci.c | 5 +++++ drivers/usb/host/xhci-ring.c | 3 ++- drivers/usb/host/xhci.h | 1 + 3 files changed, 8 insertions(+), 1 deletion(-)
--- a/drivers/usb/host/xhci-pci.c +++ b/drivers/usb/host/xhci-pci.c @@ -295,6 +295,11 @@ static void xhci_pci_quirks(struct devic pdev->device == 0x9026) xhci->quirks |= XHCI_RESET_PLL_ON_DISCONNECT;
+ if (pdev->vendor == PCI_VENDOR_ID_AMD && + (pdev->device == PCI_DEVICE_ID_AMD_PROMONTORYA_2 || + pdev->device == PCI_DEVICE_ID_AMD_PROMONTORYA_4)) + xhci->quirks |= XHCI_NO_SOFT_RETRY; + if (xhci->quirks & XHCI_RESET_ON_RESUME) xhci_dbg_trace(xhci, trace_xhci_dbg_quirks, "QUIRK: Resetting on resume"); --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -2307,7 +2307,8 @@ static int process_bulk_intr_td(struct x remaining = 0; break; case COMP_USB_TRANSACTION_ERROR: - if ((ep_ring->err_count++ > MAX_SOFT_RETRY) || + if (xhci->quirks & XHCI_NO_SOFT_RETRY || + (ep_ring->err_count++ > MAX_SOFT_RETRY) || le32_to_cpu(slot_ctx->tt_info) & TT_SLOT) break; *status = 0; --- a/drivers/usb/host/xhci.h +++ b/drivers/usb/host/xhci.h @@ -1883,6 +1883,7 @@ struct xhci_hcd { #define XHCI_SKIP_PHY_INIT BIT_ULL(37) #define XHCI_DISABLE_SPARSE BIT_ULL(38) #define XHCI_SG_TRB_CACHE_SIZE_QUIRK BIT_ULL(39) +#define XHCI_NO_SOFT_RETRY BIT_ULL(40)
unsigned int num_active_eps; unsigned int limit_active_eps;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Mathias Nyman mathias.nyman@linux.intel.com
commit 253f588c70f66184b1f3a9bbb428b49bbda73e80 upstream.
A xHC USB 3 port might miss the first wake signal from a USB 3 device if the port LFPS reveiver isn't enabled fast enough after xHC resume.
xHC host will anyway be resumed by a PME# signal, but will go back to suspend if no port activity is seen. The device resends the U3 LFPS wake signal after a 100ms delay, but by then host is already suspended, starting all over from the beginning of this issue.
USB 3 specs say U3 wake LFPS signal is sent for max 10ms, then device needs to delay 100ms before resending the wake.
Don't suspend immediately if port activity isn't detected in resume. Instead add a retry. If there is no port activity then delay for 120ms, and re-check for port activity.
Cc: stable@vger.kernel.org Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20210311115353.2137560-3-mathias.nyman@linux.intel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci.c | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-)
--- a/drivers/usb/host/xhci.c +++ b/drivers/usb/host/xhci.c @@ -1088,6 +1088,7 @@ int xhci_resume(struct xhci_hcd *xhci, b struct usb_hcd *secondary_hcd; int retval = 0; bool comp_timer_running = false; + bool pending_portevent = false;
if (!hcd->state) return 0; @@ -1226,13 +1227,22 @@ int xhci_resume(struct xhci_hcd *xhci, b
done: if (retval == 0) { - /* Resume root hubs only when have pending events. */ - if (xhci_pending_portevent(xhci)) { + /* + * Resume roothubs only if there are pending events. + * USB 3 devices resend U3 LFPS wake after a 100ms delay if + * the first wake signalling failed, give it that chance. + */ + pending_portevent = xhci_pending_portevent(xhci); + if (!pending_portevent) { + msleep(120); + pending_portevent = xhci_pending_portevent(xhci); + } + + if (pending_portevent) { usb_hcd_resume_root_hub(xhci->shared_hcd); usb_hcd_resume_root_hub(hcd); } } - /* * If system is subject to the Quirk, Compliance Mode Timer needs to * be re-initialized Always after a system resume. Ports are subject
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Forest Crossman cyrozap@gmail.com
commit b71c669ad8390dd1c866298319ff89fe68b45653 upstream.
I've confirmed that both the ASMedia ASM1042A and ASM3242 have the same problem as the ASM1142 and ASM2142/ASM3142, where they lose some of the upper bits of 64-bit DMA addresses. As with the other chips, this can cause problems on systems where the upper bits matter, and adding the XHCI_NO_64BIT_SUPPORT quirk completely fixes the issue.
Cc: stable@vger.kernel.org Signed-off-by: Forest Crossman cyrozap@gmail.com Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20210311115353.2137560-4-mathias.nyman@linux.intel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-pci.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/usb/host/xhci-pci.c +++ b/drivers/usb/host/xhci-pci.c @@ -66,6 +66,7 @@ #define PCI_DEVICE_ID_ASMEDIA_1042A_XHCI 0x1142 #define PCI_DEVICE_ID_ASMEDIA_1142_XHCI 0x1242 #define PCI_DEVICE_ID_ASMEDIA_2142_XHCI 0x2142 +#define PCI_DEVICE_ID_ASMEDIA_3242_XHCI 0x3242
static const char hcd_name[] = "xhci_hcd";
@@ -276,11 +277,14 @@ static void xhci_pci_quirks(struct devic pdev->device == PCI_DEVICE_ID_ASMEDIA_1042_XHCI) xhci->quirks |= XHCI_BROKEN_STREAMS; if (pdev->vendor == PCI_VENDOR_ID_ASMEDIA && - pdev->device == PCI_DEVICE_ID_ASMEDIA_1042A_XHCI) + pdev->device == PCI_DEVICE_ID_ASMEDIA_1042A_XHCI) { xhci->quirks |= XHCI_TRUST_TX_LENGTH; + xhci->quirks |= XHCI_NO_64BIT_SUPPORT; + } if (pdev->vendor == PCI_VENDOR_ID_ASMEDIA && (pdev->device == PCI_DEVICE_ID_ASMEDIA_1142_XHCI || - pdev->device == PCI_DEVICE_ID_ASMEDIA_2142_XHCI)) + pdev->device == PCI_DEVICE_ID_ASMEDIA_2142_XHCI || + pdev->device == PCI_DEVICE_ID_ASMEDIA_3242_XHCI)) xhci->quirks |= XHCI_NO_64BIT_SUPPORT;
if (pdev->vendor == PCI_VENDOR_ID_ASMEDIA &&
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Mathias Nyman mathias.nyman@linux.intel.com
commit d26c00e7276fc92b18c253d69e872f6b03832bad upstream.
If port terminations are detected in suspend, but link never reaches U0 then xHCI may have an internal uncleared wake state that will cause an immediate wake after suspend.
This wake state is normally cleared when driver clears the PORT_CSC bit, which is set after a device is enabled and in U0.
Write 1 to clear PORT_CSC for ports that don't have anything connected when suspending. This makes sure any pending internal wake states in xHCI are cleared.
Cc: stable@vger.kernel.org Tested-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20210311115353.2137560-5-mathias.nyman@linux.intel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci.c | 62 +++++++++++++++++++++++------------------------- 1 file changed, 30 insertions(+), 32 deletions(-)
--- a/drivers/usb/host/xhci.c +++ b/drivers/usb/host/xhci.c @@ -883,44 +883,42 @@ static void xhci_clear_command_ring(stru xhci_set_cmd_ring_deq(xhci); }
-static void xhci_disable_port_wake_on_bits(struct xhci_hcd *xhci) +/* + * Disable port wake bits if do_wakeup is not set. + * + * Also clear a possible internal port wake state left hanging for ports that + * detected termination but never successfully enumerated (trained to 0U). + * Internal wake causes immediate xHCI wake after suspend. PORT_CSC write done + * at enumeration clears this wake, force one here as well for unconnected ports + */ + +static void xhci_disable_hub_port_wake(struct xhci_hcd *xhci, + struct xhci_hub *rhub, + bool do_wakeup) { - struct xhci_port **ports; - int port_index; unsigned long flags; u32 t1, t2, portsc; + int i;
spin_lock_irqsave(&xhci->lock, flags);
- /* disable usb3 ports Wake bits */ - port_index = xhci->usb3_rhub.num_ports; - ports = xhci->usb3_rhub.ports; - while (port_index--) { - t1 = readl(ports[port_index]->addr); - portsc = t1; - t1 = xhci_port_state_to_neutral(t1); - t2 = t1 & ~PORT_WAKE_BITS; - if (t1 != t2) { - writel(t2, ports[port_index]->addr); - xhci_dbg(xhci, "disable wake bits port %d-%d, portsc: 0x%x, write: 0x%x\n", - xhci->usb3_rhub.hcd->self.busnum, - port_index + 1, portsc, t2); - } - } + for (i = 0; i < rhub->num_ports; i++) { + portsc = readl(rhub->ports[i]->addr); + t1 = xhci_port_state_to_neutral(portsc); + t2 = t1; + + /* clear wake bits if do_wake is not set */ + if (!do_wakeup) + t2 &= ~PORT_WAKE_BITS; + + /* Don't touch csc bit if connected or connect change is set */ + if (!(portsc & (PORT_CSC | PORT_CONNECT))) + t2 |= PORT_CSC;
- /* disable usb2 ports Wake bits */ - port_index = xhci->usb2_rhub.num_ports; - ports = xhci->usb2_rhub.ports; - while (port_index--) { - t1 = readl(ports[port_index]->addr); - portsc = t1; - t1 = xhci_port_state_to_neutral(t1); - t2 = t1 & ~PORT_WAKE_BITS; if (t1 != t2) { - writel(t2, ports[port_index]->addr); - xhci_dbg(xhci, "disable wake bits port %d-%d, portsc: 0x%x, write: 0x%x\n", - xhci->usb2_rhub.hcd->self.busnum, - port_index + 1, portsc, t2); + writel(t2, rhub->ports[i]->addr); + xhci_dbg(xhci, "config port %d-%d wake bits, portsc: 0x%x, write: 0x%x\n", + rhub->hcd->self.busnum, i + 1, portsc, t2); } } spin_unlock_irqrestore(&xhci->lock, flags); @@ -983,8 +981,8 @@ int xhci_suspend(struct xhci_hcd *xhci, return -EINVAL;
/* Clear root port wake on bits if wakeup not allowed. */ - if (!do_wakeup) - xhci_disable_port_wake_on_bits(xhci); + xhci_disable_hub_port_wake(xhci, &xhci->usb3_rhub, do_wakeup); + xhci_disable_hub_port_wake(xhci, &xhci->usb2_rhub, do_wakeup);
if (!HCD_HW_ACCESSIBLE(hcd)) return 0;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Pavel Skripkin paskripkin@gmail.com
commit cfdc67acc785e01a8719eeb7012709d245564701 upstream.
sysbot found memory leak in edge_startup(). The problem was that when an error was received from the usb_submit_urb(), nothing was cleaned up.
Reported-by: syzbot+59f777bdcbdd7eea5305@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin paskripkin@gmail.com Fixes: 6e8cf7751f9f ("USB: add EPIC support to the io_edgeport driver") Cc: stable@vger.kernel.org # 2.6.21: c5c0c55598ce Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/io_edgeport.c | 26 ++++++++++++++++---------- 1 file changed, 16 insertions(+), 10 deletions(-)
--- a/drivers/usb/serial/io_edgeport.c +++ b/drivers/usb/serial/io_edgeport.c @@ -3003,26 +3003,32 @@ static int edge_startup(struct usb_seria response = -ENODEV; }
- usb_free_urb(edge_serial->interrupt_read_urb); - kfree(edge_serial->interrupt_in_buffer); - - usb_free_urb(edge_serial->read_urb); - kfree(edge_serial->bulk_in_buffer); - - kfree(edge_serial); - - return response; + goto error; }
/* start interrupt read for this edgeport this interrupt will * continue as long as the edgeport is connected */ response = usb_submit_urb(edge_serial->interrupt_read_urb, GFP_KERNEL); - if (response) + if (response) { dev_err(ddev, "%s - Error %d submitting control urb\n", __func__, response); + + goto error; + } } return response; + +error: + usb_free_urb(edge_serial->interrupt_read_urb); + kfree(edge_serial->interrupt_in_buffer); + + usb_free_urb(edge_serial->read_urb); + kfree(edge_serial->bulk_in_buffer); + + kfree(edge_serial); + + return response; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Niv Sardi xaiki@evilgiggle.com
commit 5563b3b6420362c8a1f468ca04afe6d5f0a8d0a3 upstream.
Add PID for CH340 that's found on cheap programmers.
The driver works flawlessly as soon as the new PID (0x9986) is added to it. These look like ANU232MI but ship with a ch341 inside. They have no special identifiers (mine only has the string "DB9D20130716" printed on the PCB and nothing identifiable on the packaging. The merchant i bought it from doesn't sell these anymore).
the lsusb -v output is: Bus 001 Device 009: ID 9986:7523 Device Descriptor: bLength 18 bDescriptorType 1 bcdUSB 1.10 bDeviceClass 255 Vendor Specific Class bDeviceSubClass 0 bDeviceProtocol 0 bMaxPacketSize0 8 idVendor 0x9986 idProduct 0x7523 bcdDevice 2.54 iManufacturer 0 iProduct 0 iSerial 0 bNumConfigurations 1 Configuration Descriptor: bLength 9 bDescriptorType 2 wTotalLength 0x0027 bNumInterfaces 1 bConfigurationValue 1 iConfiguration 0 bmAttributes 0x80 (Bus Powered) MaxPower 96mA Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 0 bAlternateSetting 0 bNumEndpoints 3 bInterfaceClass 255 Vendor Specific Class bInterfaceSubClass 1 bInterfaceProtocol 2 iInterface 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x82 EP 2 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0020 1x 32 bytes bInterval 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x02 EP 2 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0020 1x 32 bytes bInterval 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x81 EP 1 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x0008 1x 8 bytes bInterval 1
Signed-off-by: Niv Sardi xaiki@evilgiggle.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/ch341.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/usb/serial/ch341.c +++ b/drivers/usb/serial/ch341.c @@ -86,6 +86,7 @@ static const struct usb_device_id id_tab { USB_DEVICE(0x1a86, 0x7522) }, { USB_DEVICE(0x1a86, 0x7523) }, { USB_DEVICE(0x4348, 0x5523) }, + { USB_DEVICE(0x9986, 0x7523) }, { }, }; MODULE_DEVICE_TABLE(usb, id_table);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Karan Singhal karan.singhal@acuitybrands.com
commit ca667a33207daeaf9c62b106815728718def60ec upstream.
IDs of nLight Air Adapter, Acuity Brands, Inc.: vid: 10c4 pid: 88d8
Signed-off-by: Karan Singhal karan.singhal@acuitybrands.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/cp210x.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/usb/serial/cp210x.c +++ b/drivers/usb/serial/cp210x.c @@ -146,6 +146,7 @@ static const struct usb_device_id id_tab { USB_DEVICE(0x10C4, 0x8857) }, /* CEL EM357 ZigBee USB Stick */ { USB_DEVICE(0x10C4, 0x88A4) }, /* MMB Networks ZigBee USB Device */ { USB_DEVICE(0x10C4, 0x88A5) }, /* Planet Innovation Ingeni ZigBee USB Device */ + { USB_DEVICE(0x10C4, 0x88D8) }, /* Acuity Brands nLight Air Adapter */ { USB_DEVICE(0x10C4, 0x88FB) }, /* CESINEL MEDCAL STII Network Analyzer */ { USB_DEVICE(0x10C4, 0x8938) }, /* CESINEL MEDCAL S II Network Analyzer */ { USB_DEVICE(0x10C4, 0x8946) }, /* Ketra N1 Wireless Interface */
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Sebastian Reichel sebastian.reichel@collabora.com
commit 42213a0190b535093a604945db05a4225bf43885 upstream.
GE CS1000 has some more custom USB IDs for CP2102N; add them to the driver to have working auto-probing.
Signed-off-by: Sebastian Reichel sebastian.reichel@collabora.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/cp210x.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/usb/serial/cp210x.c +++ b/drivers/usb/serial/cp210x.c @@ -203,6 +203,8 @@ static const struct usb_device_id id_tab { USB_DEVICE(0x1901, 0x0194) }, /* GE Healthcare Remote Alarm Box */ { USB_DEVICE(0x1901, 0x0195) }, /* GE B850/B650/B450 CP2104 DP UART interface */ { USB_DEVICE(0x1901, 0x0196) }, /* GE B850 CP2105 DP UART interface */ + { USB_DEVICE(0x1901, 0x0197) }, /* GE CS1000 Display serial interface */ + { USB_DEVICE(0x1901, 0x0198) }, /* GE CS1000 M.2 Key E serial interface */ { USB_DEVICE(0x199B, 0xBA30) }, /* LORD WSDA-200-USB */ { USB_DEVICE(0x19CF, 0x3000) }, /* Parrot NMEA GPS Flight Recorder */ { USB_DEVICE(0x1ADB, 0x0001) }, /* Schweitzer Engineering C662 Cable */
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Shuah Khan skhan@linuxfoundation.org
commit 47ccc8fc2c9c94558b27b6f9e2582df32d29e6e8 upstream.
Fix usbip_sockfd_store() to validate the passed in file descriptor is a stream socket. If the file descriptor passed was a SOCK_DGRAM socket, sock_recvmsg() can't detect end of stream.
Cc: stable@vger.kernel.org Suggested-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Shuah Khan skhan@linuxfoundation.org Link: https://lore.kernel.org/r/e942d2bd03afb8e8552bd2a5d84e18d17670d521.161517120... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/usbip/stub_dev.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
--- a/drivers/usb/usbip/stub_dev.c +++ b/drivers/usb/usbip/stub_dev.c @@ -69,8 +69,16 @@ static ssize_t usbip_sockfd_store(struct }
socket = sockfd_lookup(sockfd, &err); - if (!socket) + if (!socket) { + dev_err(dev, "failed to lookup sock"); goto err; + } + + if (socket->type != SOCK_STREAM) { + dev_err(dev, "Expecting SOCK_STREAM - found %d", + socket->type); + goto sock_err; + }
sdev->ud.tcp_socket = socket; sdev->ud.sockfd = sockfd; @@ -100,6 +108,8 @@ static ssize_t usbip_sockfd_store(struct
return count;
+sock_err: + sockfd_put(socket); err: spin_unlock_irq(&sdev->ud.lock); return -EINVAL;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Shuah Khan skhan@linuxfoundation.org
commit f55a0571690c4aae03180e001522538c0927432f upstream.
Fix attach_store() to validate the passed in file descriptor is a stream socket. If the file descriptor passed was a SOCK_DGRAM socket, sock_recvmsg() can't detect end of stream.
Cc: stable@vger.kernel.org Suggested-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Shuah Khan skhan@linuxfoundation.org Link: https://lore.kernel.org/r/52712aa308915bda02cece1589e04ee8b401d1f3.161517120... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/usbip/vhci_sysfs.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
--- a/drivers/usb/usbip/vhci_sysfs.c +++ b/drivers/usb/usbip/vhci_sysfs.c @@ -349,8 +349,16 @@ static ssize_t attach_store(struct devic
/* Extract socket from fd. */ socket = sockfd_lookup(sockfd, &err); - if (!socket) + if (!socket) { + dev_err(dev, "failed to lookup sock"); return -EINVAL; + } + if (socket->type != SOCK_STREAM) { + dev_err(dev, "Expecting SOCK_STREAM - found %d", + socket->type); + sockfd_put(socket); + return -EINVAL; + }
/* now need lock until setting vdev status as used */
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Shuah Khan skhan@linuxfoundation.org
commit 6801854be94fe8819b3894979875ea31482f5658 upstream.
Fix usbip_sockfd_store() to validate the passed in file descriptor is a stream socket. If the file descriptor passed was a SOCK_DGRAM socket, sock_recvmsg() can't detect end of stream.
Cc: stable@vger.kernel.org Suggested-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Shuah Khan skhan@linuxfoundation.org Link: https://lore.kernel.org/r/387a670316002324113ac7ea1e8b53f4085d0c95.161517120... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/usbip/vudc_sysfs.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/drivers/usb/usbip/vudc_sysfs.c +++ b/drivers/usb/usbip/vudc_sysfs.c @@ -138,6 +138,13 @@ static ssize_t usbip_sockfd_store(struct goto unlock_ud; }
+ if (socket->type != SOCK_STREAM) { + dev_err(dev, "Expecting SOCK_STREAM - found %d", + socket->type); + ret = -EINVAL; + goto sock_err; + } + udc->ud.tcp_socket = socket;
spin_unlock_irq(&udc->ud.lock); @@ -177,6 +184,8 @@ static ssize_t usbip_sockfd_store(struct
return count;
+sock_err: + sockfd_put(socket); unlock_ud: spin_unlock_irq(&udc->ud.lock); unlock:
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Shuah Khan skhan@linuxfoundation.org
commit 9380afd6df70e24eacbdbde33afc6a3950965d22 upstream.
usbip_sockfd_store() is invoked when user requests attach (import) detach (unimport) usb device from usbip host. vhci_hcd sends import request and usbip_sockfd_store() exports the device if it is free for export.
Export and unexport are governed by local state and shared state - Shared state (usbip device status, sockfd) - sockfd and Device status are used to determine if stub should be brought up or shut down. - Local state (tcp_socket, rx and tx thread task_struct ptrs) A valid tcp_socket controls rx and tx thread operations while the device is in exported state. - While the device is exported, device status is marked used and socket, sockfd, and thread pointers are valid.
Export sequence (stub-up) includes validating the socket and creating receive (rx) and transmit (tx) threads to talk to the client to provide access to the exported device. rx and tx threads depends on local and shared state to be correct and in sync.
Unexport (stub-down) sequence shuts the socket down and stops the rx and tx threads. Stub-down sequence relies on local and shared states to be in sync.
There are races in updating the local and shared status in the current stub-up sequence resulting in crashes. These stem from starting rx and tx threads before local and global state is updated correctly to be in sync.
1. Doesn't handle kthread_create() error and saves invalid ptr in local state that drives rx and tx threads. 2. Updates tcp_socket and sockfd, starts stub_rx and stub_tx threads before updating usbip_device status to SDEV_ST_USED. This opens up a race condition between the threads and usbip_sockfd_store() stub up and down handling.
Fix the above problems: - Stop using kthread_get_run() macro to create/start threads. - Create threads and get task struct reference. - Add kthread_create() failure handling and bail out. - Hold usbip_device lock to update local and shared states after creating rx and tx threads. - Update usbip_device status to SDEV_ST_USED. - Update usbip_device tcp_socket, sockfd, tcp_rx, and tcp_tx - Start threads after usbip_device (tcp_socket, sockfd, tcp_rx, tcp_tx, and status) is complete.
Credit goes to syzbot and Tetsuo Handa for finding and root-causing the kthread_get_run() improper error handling problem and others. This is a hard problem to find and debug since the races aren't seen in a normal case. Fuzzing forces the race window to be small enough for the kthread_get_run() error path bug and starting threads before updating the local and shared state bug in the stub-up sequence.
Tested with syzbot reproducer: - https://syzkaller.appspot.com/text?tag=ReproC&x=14801034d00000
Fixes: 9720b4bc76a83807 ("staging/usbip: convert to kthread") Cc: stable@vger.kernel.org Reported-by: syzbot syzbot+a93fba6d384346a761e3@syzkaller.appspotmail.com Reported-by: syzbot syzbot+bf1a360e305ee719e364@syzkaller.appspotmail.com Reported-by: syzbot syzbot+95ce4b142579611ef0a9@syzkaller.appspotmail.com Reported-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Shuah Khan skhan@linuxfoundation.org Link: https://lore.kernel.org/r/268a0668144d5ff36ec7d87fdfa90faf583b7ccc.161517120... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/usbip/stub_dev.c | 32 +++++++++++++++++++++++++------- 1 file changed, 25 insertions(+), 7 deletions(-)
--- a/drivers/usb/usbip/stub_dev.c +++ b/drivers/usb/usbip/stub_dev.c @@ -46,6 +46,8 @@ static ssize_t usbip_sockfd_store(struct int sockfd = 0; struct socket *socket; int rv; + struct task_struct *tcp_rx = NULL; + struct task_struct *tcp_tx = NULL;
if (!sdev) { dev_err(dev, "sdev is null\n"); @@ -80,20 +82,36 @@ static ssize_t usbip_sockfd_store(struct goto sock_err; }
- sdev->ud.tcp_socket = socket; - sdev->ud.sockfd = sockfd; - + /* unlock and create threads and get tasks */ spin_unlock_irq(&sdev->ud.lock); + tcp_rx = kthread_create(stub_rx_loop, &sdev->ud, "stub_rx"); + if (IS_ERR(tcp_rx)) { + sockfd_put(socket); + return -EINVAL; + } + tcp_tx = kthread_create(stub_tx_loop, &sdev->ud, "stub_tx"); + if (IS_ERR(tcp_tx)) { + kthread_stop(tcp_rx); + sockfd_put(socket); + return -EINVAL; + }
- sdev->ud.tcp_rx = kthread_get_run(stub_rx_loop, &sdev->ud, - "stub_rx"); - sdev->ud.tcp_tx = kthread_get_run(stub_tx_loop, &sdev->ud, - "stub_tx"); + /* get task structs now */ + get_task_struct(tcp_rx); + get_task_struct(tcp_tx);
+ /* lock and update sdev->ud state */ spin_lock_irq(&sdev->ud.lock); + sdev->ud.tcp_socket = socket; + sdev->ud.sockfd = sockfd; + sdev->ud.tcp_rx = tcp_rx; + sdev->ud.tcp_tx = tcp_tx; sdev->ud.status = SDEV_ST_USED; spin_unlock_irq(&sdev->ud.lock);
+ wake_up_process(sdev->ud.tcp_rx); + wake_up_process(sdev->ud.tcp_tx); + } else { dev_info(dev, "stub down\n");
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Shuah Khan skhan@linuxfoundation.org
commit 718ad9693e3656120064b715fe931f43a6201e67 upstream.
attach_store() is invoked when user requests import (attach) a device from usbip host.
Attach and detach are governed by local state and shared state - Shared state (usbip device status) - Device status is used to manage the attach and detach operations on import-able devices. - Local state (tcp_socket, rx and tx thread task_struct ptrs) A valid tcp_socket controls rx and tx thread operations while the device is in exported state. - Device has to be in the right state to be attached and detached.
Attach sequence includes validating the socket and creating receive (rx) and transmit (tx) threads to talk to the host to get access to the imported device. rx and tx threads depends on local and shared state to be correct and in sync.
Detach sequence shuts the socket down and stops the rx and tx threads. Detach sequence relies on local and shared states to be in sync.
There are races in updating the local and shared status in the current attach sequence resulting in crashes. These stem from starting rx and tx threads before local and global state is updated correctly to be in sync.
1. Doesn't handle kthread_create() error and saves invalid ptr in local state that drives rx and tx threads. 2. Updates tcp_socket and sockfd, starts stub_rx and stub_tx threads before updating usbip_device status to VDEV_ST_NOTASSIGNED. This opens up a race condition between the threads, port connect, and detach handling.
Fix the above problems: - Stop using kthread_get_run() macro to create/start threads. - Create threads and get task struct reference. - Add kthread_create() failure handling and bail out. - Hold vhci and usbip_device locks to update local and shared states after creating rx and tx threads. - Update usbip_device status to VDEV_ST_NOTASSIGNED. - Update usbip_device tcp_socket, sockfd, tcp_rx, and tcp_tx - Start threads after usbip_device (tcp_socket, sockfd, tcp_rx, tcp_tx, and status) is complete.
Credit goes to syzbot and Tetsuo Handa for finding and root-causing the kthread_get_run() improper error handling problem and others. This is hard problem to find and debug since the races aren't seen in a normal case. Fuzzing forces the race window to be small enough for the kthread_get_run() error path bug and starting threads before updating the local and shared state bug in the attach sequence. - Update usbip_device tcp_rx and tcp_tx pointers holding vhci and usbip_device locks.
Tested with syzbot reproducer: - https://syzkaller.appspot.com/text?tag=ReproC&x=14801034d00000
Fixes: 9720b4bc76a83807 ("staging/usbip: convert to kthread") Cc: stable@vger.kernel.org Reported-by: syzbot syzbot+a93fba6d384346a761e3@syzkaller.appspotmail.com Reported-by: syzbot syzbot+bf1a360e305ee719e364@syzkaller.appspotmail.com Reported-by: syzbot syzbot+95ce4b142579611ef0a9@syzkaller.appspotmail.com Reported-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Shuah Khan skhan@linuxfoundation.org Link: https://lore.kernel.org/r/bb434bd5d7a64fbec38b5ecfb838a6baef6eb12b.161517120... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/usbip/vhci_sysfs.c | 29 +++++++++++++++++++++++++---- 1 file changed, 25 insertions(+), 4 deletions(-)
--- a/drivers/usb/usbip/vhci_sysfs.c +++ b/drivers/usb/usbip/vhci_sysfs.c @@ -312,6 +312,8 @@ static ssize_t attach_store(struct devic struct vhci *vhci; int err; unsigned long flags; + struct task_struct *tcp_rx = NULL; + struct task_struct *tcp_tx = NULL;
/* * @rhport: port number of vhci_hcd @@ -360,9 +362,24 @@ static ssize_t attach_store(struct devic return -EINVAL; }
- /* now need lock until setting vdev status as used */ + /* create threads before locking */ + tcp_rx = kthread_create(vhci_rx_loop, &vdev->ud, "vhci_rx"); + if (IS_ERR(tcp_rx)) { + sockfd_put(socket); + return -EINVAL; + } + tcp_tx = kthread_create(vhci_tx_loop, &vdev->ud, "vhci_tx"); + if (IS_ERR(tcp_tx)) { + kthread_stop(tcp_rx); + sockfd_put(socket); + return -EINVAL; + } + + /* get task structs now */ + get_task_struct(tcp_rx); + get_task_struct(tcp_tx);
- /* begin a lock */ + /* now begin lock until setting vdev status set */ spin_lock_irqsave(&vhci->lock, flags); spin_lock(&vdev->ud.lock);
@@ -372,6 +389,8 @@ static ssize_t attach_store(struct devic spin_unlock_irqrestore(&vhci->lock, flags);
sockfd_put(socket); + kthread_stop_put(tcp_rx); + kthread_stop_put(tcp_tx);
dev_err(dev, "port %d already used\n", rhport); /* @@ -390,14 +409,16 @@ static ssize_t attach_store(struct devic vdev->speed = speed; vdev->ud.sockfd = sockfd; vdev->ud.tcp_socket = socket; + vdev->ud.tcp_rx = tcp_rx; + vdev->ud.tcp_tx = tcp_tx; vdev->ud.status = VDEV_ST_NOTASSIGNED;
spin_unlock(&vdev->ud.lock); spin_unlock_irqrestore(&vhci->lock, flags); /* end the lock */
- vdev->ud.tcp_rx = kthread_get_run(vhci_rx_loop, &vdev->ud, "vhci_rx"); - vdev->ud.tcp_tx = kthread_get_run(vhci_tx_loop, &vdev->ud, "vhci_tx"); + wake_up_process(vdev->ud.tcp_rx); + wake_up_process(vdev->ud.tcp_tx);
rh_port_connect(vdev, speed);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Shuah Khan skhan@linuxfoundation.org
commit 46613c9dfa964c0c60b5385dbdf5aaa18be52a9c upstream.
usbip_sockfd_store() is invoked when user requests attach (import) detach (unimport) usb gadget device from usbip host. vhci_hcd sends import request and usbip_sockfd_store() exports the device if it is free for export.
Export and unexport are governed by local state and shared state - Shared state (usbip device status, sockfd) - sockfd and Device status are used to determine if stub should be brought up or shut down. Device status is shared between host and client. - Local state (tcp_socket, rx and tx thread task_struct ptrs) A valid tcp_socket controls rx and tx thread operations while the device is in exported state. - While the device is exported, device status is marked used and socket, sockfd, and thread pointers are valid.
Export sequence (stub-up) includes validating the socket and creating receive (rx) and transmit (tx) threads to talk to the client to provide access to the exported device. rx and tx threads depends on local and shared state to be correct and in sync.
Unexport (stub-down) sequence shuts the socket down and stops the rx and tx threads. Stub-down sequence relies on local and shared states to be in sync.
There are races in updating the local and shared status in the current stub-up sequence resulting in crashes. These stem from starting rx and tx threads before local and global state is updated correctly to be in sync.
1. Doesn't handle kthread_create() error and saves invalid ptr in local state that drives rx and tx threads. 2. Updates tcp_socket and sockfd, starts stub_rx and stub_tx threads before updating usbip_device status to SDEV_ST_USED. This opens up a race condition between the threads and usbip_sockfd_store() stub up and down handling.
Fix the above problems: - Stop using kthread_get_run() macro to create/start threads. - Create threads and get task struct reference. - Add kthread_create() failure handling and bail out. - Hold usbip_device lock to update local and shared states after creating rx and tx threads. - Update usbip_device status to SDEV_ST_USED. - Update usbip_device tcp_socket, sockfd, tcp_rx, and tcp_tx - Start threads after usbip_device (tcp_socket, sockfd, tcp_rx, tcp_tx, and status) is complete.
Credit goes to syzbot and Tetsuo Handa for finding and root-causing the kthread_get_run() improper error handling problem and others. This is a hard problem to find and debug since the races aren't seen in a normal case. Fuzzing forces the race window to be small enough for the kthread_get_run() error path bug and starting threads before updating the local and shared state bug in the stub-up sequence.
Fixes: 9720b4bc76a83807 ("staging/usbip: convert to kthread") Cc: stable@vger.kernel.org Reported-by: syzbot syzbot+a93fba6d384346a761e3@syzkaller.appspotmail.com Reported-by: syzbot syzbot+bf1a360e305ee719e364@syzkaller.appspotmail.com Reported-by: syzbot syzbot+95ce4b142579611ef0a9@syzkaller.appspotmail.com Reported-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Shuah Khan skhan@linuxfoundation.org Link: https://lore.kernel.org/r/b1c08b983ffa185449c9f0f7d1021dc8c8454b60.161517120... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/usbip/vudc_sysfs.c | 42 +++++++++++++++++++++++++++++++++-------- 1 file changed, 34 insertions(+), 8 deletions(-)
--- a/drivers/usb/usbip/vudc_sysfs.c +++ b/drivers/usb/usbip/vudc_sysfs.c @@ -90,8 +90,9 @@ unlock: } static BIN_ATTR_RO(dev_desc, sizeof(struct usb_device_descriptor));
-static ssize_t usbip_sockfd_store(struct device *dev, struct device_attribute *attr, - const char *in, size_t count) +static ssize_t usbip_sockfd_store(struct device *dev, + struct device_attribute *attr, + const char *in, size_t count) { struct vudc *udc = (struct vudc *) dev_get_drvdata(dev); int rv; @@ -100,6 +101,8 @@ static ssize_t usbip_sockfd_store(struct struct socket *socket; unsigned long flags; int ret; + struct task_struct *tcp_rx = NULL; + struct task_struct *tcp_tx = NULL;
rv = kstrtoint(in, 0, &sockfd); if (rv != 0) @@ -145,24 +148,47 @@ static ssize_t usbip_sockfd_store(struct goto sock_err; }
- udc->ud.tcp_socket = socket; - + /* unlock and create threads and get tasks */ spin_unlock_irq(&udc->ud.lock); spin_unlock_irqrestore(&udc->lock, flags);
- udc->ud.tcp_rx = kthread_get_run(&v_rx_loop, - &udc->ud, "vudc_rx"); - udc->ud.tcp_tx = kthread_get_run(&v_tx_loop, - &udc->ud, "vudc_tx"); + tcp_rx = kthread_create(&v_rx_loop, &udc->ud, "vudc_rx"); + if (IS_ERR(tcp_rx)) { + sockfd_put(socket); + return -EINVAL; + } + tcp_tx = kthread_create(&v_tx_loop, &udc->ud, "vudc_tx"); + if (IS_ERR(tcp_tx)) { + kthread_stop(tcp_rx); + sockfd_put(socket); + return -EINVAL; + } + + /* get task structs now */ + get_task_struct(tcp_rx); + get_task_struct(tcp_tx);
+ /* lock and update udc->ud state */ spin_lock_irqsave(&udc->lock, flags); spin_lock_irq(&udc->ud.lock); + + udc->ud.tcp_socket = socket; + udc->ud.tcp_rx = tcp_rx; + udc->ud.tcp_rx = tcp_tx; udc->ud.status = SDEV_ST_USED; + spin_unlock_irq(&udc->ud.lock);
ktime_get_ts64(&udc->start_time); v_start_timer(udc); udc->connected = 1; + + spin_unlock_irqrestore(&udc->lock, flags); + + wake_up_process(udc->ud.tcp_rx); + wake_up_process(udc->ud.tcp_tx); + return count; + } else { if (!udc->connected) { dev_err(dev, "Device not connected");
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Alexander Shiyan shc_work@mail.ru
commit 2334de198fed3da72e9785ecdd691d101aa96e77 upstream.
This reverts commit fce3c5c1a2d9cd888f2987662ce17c0c651916b2.
FIFO is triggered 4 intervals after receiving a byte, it's good when we don't care about the time of reception, but are only interested in the presence of any activity on the line. Unfortunately, this method is not suitable for all tasks, for example, the RS-485 protocol will not work properly, since the state machine must track the request-response time and after the timeout expires, a decision is made that the device on the line is not responding.
Signed-off-by: Alexander Shiyan shc_work@mail.ru Link: https://lore.kernel.org/r/20210217080608.31192-1-shc_work@mail.ru Fixes: fce3c5c1a2d9 ("serial: max310x: rework RX interrupt handling") Cc: Thomas Petazzoni thomas.petazzoni@bootlin.com Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/max310x.c | 29 +++++------------------------ 1 file changed, 5 insertions(+), 24 deletions(-)
--- a/drivers/tty/serial/max310x.c +++ b/drivers/tty/serial/max310x.c @@ -1056,9 +1056,9 @@ static int max310x_startup(struct uart_p max310x_port_update(port, MAX310X_MODE1_REG, MAX310X_MODE1_TRNSCVCTRL_BIT, 0);
- /* Reset FIFOs */ - max310x_port_write(port, MAX310X_MODE2_REG, - MAX310X_MODE2_FIFORST_BIT); + /* Configure MODE2 register & Reset FIFOs*/ + val = MAX310X_MODE2_RXEMPTINV_BIT | MAX310X_MODE2_FIFORST_BIT; + max310x_port_write(port, MAX310X_MODE2_REG, val); max310x_port_update(port, MAX310X_MODE2_REG, MAX310X_MODE2_FIFORST_BIT, 0);
@@ -1086,27 +1086,8 @@ static int max310x_startup(struct uart_p /* Clear IRQ status register */ max310x_port_read(port, MAX310X_IRQSTS_REG);
- /* - * Let's ask for an interrupt after a timeout equivalent to - * the receiving time of 4 characters after the last character - * has been received. - */ - max310x_port_write(port, MAX310X_RXTO_REG, 4); - - /* - * Make sure we also get RX interrupts when the RX FIFO is - * filling up quickly, so get an interrupt when half of the RX - * FIFO has been filled in. - */ - max310x_port_write(port, MAX310X_FIFOTRIGLVL_REG, - MAX310X_FIFOTRIGLVL_RX(MAX310X_FIFO_SIZE / 2)); - - /* Enable RX timeout interrupt in LSR */ - max310x_port_write(port, MAX310X_LSR_IRQEN_REG, - MAX310X_LSR_RXTO_BIT); - - /* Enable LSR, RX FIFO trigger, CTS change interrupts */ - val = MAX310X_IRQ_LSR_BIT | MAX310X_IRQ_RXFIFO_BIT | MAX310X_IRQ_TXEMPTY_BIT; + /* Enable RX, TX, CTS change interrupts */ + val = MAX310X_IRQ_RXEMPTY_BIT | MAX310X_IRQ_TXEMPTY_BIT; max310x_port_write(port, MAX310X_IRQEN_REG, val | MAX310X_IRQ_CTS_BIT);
return 0;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Shile Zhang shile.zhang@linux.alibaba.com
commit 65527a51c66f4edfa28602643d7dd4fa366eb826 upstream.
Export the module FDT device table to ensure the FDT compatible strings are listed in the module alias. This help the pvpanic driver can be loaded on boot automatically not only the ACPI device, but also the FDT device.
Fixes: 46f934c9a12fc ("misc/pvpanic: add support to get pvpanic device info FDT") Signed-off-by: Shile Zhang shile.zhang@linux.alibaba.com Link: https://lore.kernel.org/r/20210218123116.207751-1-shile.zhang@linux.alibaba.... Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/misc/pvpanic.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/misc/pvpanic.c +++ b/drivers/misc/pvpanic.c @@ -92,6 +92,7 @@ static const struct of_device_id pvpanic { .compatible = "qemu,pvpanic-mmio", }, {} }; +MODULE_DEVICE_TABLE(of, pvpanic_mmio_match);
static const struct acpi_device_id pvpanic_device_ids[] = { { "QEMU0001", 0 },
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Dmitry Baryshkov dmitry.baryshkov@linaro.org
commit 20c40794eb85ea29852d7bc37c55713802a543d6 upstream.
Verify that user applications are not using the kernel RPC message handle to restrict them from directly attaching to guest OS on the remote subsystem. This is a port of CVE-2019-2308 fix.
Fixes: c68cfb718c8f ("misc: fastrpc: Add support for context Invoke method") Cc: Srinivas Kandagatla srinivas.kandagatla@linaro.org Cc: Jonathan Marek jonathan@marek.ca Cc: stable@vger.kernel.org Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Link: https://lore.kernel.org/r/20210212192658.3476137-1-dmitry.baryshkov@linaro.o... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/misc/fastrpc.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/misc/fastrpc.c +++ b/drivers/misc/fastrpc.c @@ -950,6 +950,11 @@ static int fastrpc_internal_invoke(struc if (!fl->cctx->rpdev) return -EPIPE;
+ if (handle == FASTRPC_INIT_HANDLE && !kernel) { + dev_warn_ratelimited(fl->sctx->dev, "user app trying to send a kernel RPC message (%d)\n", handle); + return -EPERM; + } + ctx = fastrpc_context_alloc(fl, kernel, sc, args); if (IS_ERR(ctx)) return PTR_ERR(ctx);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Dan Carpenter dan.carpenter@oracle.com
commit 87107518d7a93fec6cdb2559588862afeee800fb upstream.
We need to cap len at IW_ESSID_MAX_SIZE (32) to avoid memory corruption. This can be controlled by the user via the ioctl.
Fixes: 5f53d8ca3d5d ("Staging: add rtl8192SU wireless usb driver") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/YEHoAWMOSZBUw91F@mwanda Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/rtl8192u/r8192U_wx.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/staging/rtl8192u/r8192U_wx.c +++ b/drivers/staging/rtl8192u/r8192U_wx.c @@ -331,8 +331,10 @@ static int r8192_wx_set_scan(struct net_ struct iw_scan_req *req = (struct iw_scan_req *)b;
if (req->essid_len) { - ieee->current_network.ssid_len = req->essid_len; - memcpy(ieee->current_network.ssid, req->essid, req->essid_len); + int len = min_t(int, req->essid_len, IW_ESSID_MAX_SIZE); + + ieee->current_network.ssid_len = len; + memcpy(ieee->current_network.ssid, req->essid, len); } }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Dan Carpenter dan.carpenter@oracle.com
commit 74b6b20df8cfe90ada777d621b54c32e69e27cd7 upstream.
This code has a check to prevent read overflow but it needs another check to prevent writing beyond the end of the ->ssid[] array.
Fixes: a2c60d42d97c ("staging: r8188eu: Add files for new driver - part 16") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/YEHymwsnHewzoam7@mwanda Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/rtl8188eu/os_dep/ioctl_linux.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/staging/rtl8188eu/os_dep/ioctl_linux.c +++ b/drivers/staging/rtl8188eu/os_dep/ioctl_linux.c @@ -1133,9 +1133,11 @@ static int rtw_wx_set_scan(struct net_de break; } sec_len = *(pos++); len -= 1; - if (sec_len > 0 && sec_len <= len) { + if (sec_len > 0 && + sec_len <= len && + sec_len <= 32) { ssid[ssid_index].ssid_length = sec_len; - memcpy(ssid[ssid_index].ssid, pos, ssid[ssid_index].ssid_length); + memcpy(ssid[ssid_index].ssid, pos, sec_len); ssid_index++; } pos += sec_len;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Dan Carpenter dan.carpenter@oracle.com
commit d660f4f42ccea50262c6ee90c8e7ad19a69fb225 upstream.
The memdup_user() function does not necessarily return a NUL terminated string so this can lead to a read overflow. Switch from memdup_user() to strndup_user() to fix this bug.
Fixes: c6dc001f2add ("staging: r8712u: Merging Realtek's latest (v2.6.6). Various fixes.") Cc: stable stable@vger.kernel.org Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Link: https://lore.kernel.org/r/YDYSR+1rj26NRhvb@mwanda Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/rtl8712/rtl871x_ioctl_linux.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/staging/rtl8712/rtl871x_ioctl_linux.c +++ b/drivers/staging/rtl8712/rtl871x_ioctl_linux.c @@ -924,7 +924,7 @@ static int r871x_wx_set_priv(struct net_ struct iw_point *dwrq = (struct iw_point *)awrq;
len = dwrq->length; - ext = memdup_user(dwrq->pointer, len); + ext = strndup_user(dwrq->pointer, len); if (IS_ERR(ext)) return PTR_ERR(ext);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Dan Carpenter dan.carpenter@oracle.com
commit d4ac640322b06095128a5c45ba4a1e80929fe7f3 upstream.
The "ie_len" is a value in the 1-255 range that comes from the user. We have to cap it to ensure that it's not too large or it could lead to memory corruption.
Fixes: 9a7fe54ddc3a ("staging: r8188eu: Add source files for new driver - part 1") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/YEHyQCrFZKTXyT7J@mwanda Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/rtl8188eu/core/rtw_ap.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/staging/rtl8188eu/core/rtw_ap.c +++ b/drivers/staging/rtl8188eu/core/rtw_ap.c @@ -791,6 +791,7 @@ int rtw_check_beacon_data(struct adapter p = rtw_get_ie(ie + _BEACON_IE_OFFSET_, WLAN_EID_SSID, &ie_len, pbss_network->ie_length - _BEACON_IE_OFFSET_); if (p && ie_len > 0) { + ie_len = min_t(int, ie_len, sizeof(pbss_network->ssid.ssid)); memset(&pbss_network->ssid, 0, sizeof(struct ndis_802_11_ssid)); memcpy(pbss_network->ssid.ssid, p + 2, ie_len); pbss_network->ssid.ssid_length = ie_len; @@ -811,6 +812,7 @@ int rtw_check_beacon_data(struct adapter p = rtw_get_ie(ie + _BEACON_IE_OFFSET_, WLAN_EID_SUPP_RATES, &ie_len, pbss_network->ie_length - _BEACON_IE_OFFSET_); if (p) { + ie_len = min_t(int, ie_len, NDIS_802_11_LENGTH_RATES_EX); memcpy(supportRate, p + 2, ie_len); supportRateNum = ie_len; } @@ -819,6 +821,8 @@ int rtw_check_beacon_data(struct adapter p = rtw_get_ie(ie + _BEACON_IE_OFFSET_, WLAN_EID_EXT_SUPP_RATES, &ie_len, pbss_network->ie_length - _BEACON_IE_OFFSET_); if (p) { + ie_len = min_t(int, ie_len, + NDIS_802_11_LENGTH_RATES_EX - supportRateNum); memcpy(supportRate + supportRateNum, p + 2, ie_len); supportRateNum += ie_len; } @@ -934,6 +938,7 @@ int rtw_check_beacon_data(struct adapter
pht_cap->mcs.rx_mask[0] = 0xff; pht_cap->mcs.rx_mask[1] = 0x0; + ie_len = min_t(int, ie_len, sizeof(pmlmepriv->htpriv.ht_cap)); memcpy(&pmlmepriv->htpriv.ht_cap, p + 2, ie_len); }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Dan Carpenter dan.carpenter@oracle.com
commit e163b9823a0b08c3bb8dc4f5b4b5c221c24ec3e5 upstream.
The user can specify a "req->essid_len" of up to 255 but if it's over IW_ESSID_MAX_SIZE (32) that can lead to memory corruption.
Fixes: 13a9930d15b4 ("staging: ks7010: add driver from Nanonote extra-repository") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/YD4fS8+HmM/Qmrw6@mwanda Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/ks7010/ks_wlan_net.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/staging/ks7010/ks_wlan_net.c +++ b/drivers/staging/ks7010/ks_wlan_net.c @@ -1120,6 +1120,7 @@ static int ks_wlan_set_scan(struct net_d { struct ks_wlan_private *priv = netdev_priv(dev); struct iw_scan_req *req = NULL; + int len;
if (priv->sleep_mode == SLP_SLEEP) return -EPERM; @@ -1129,8 +1130,9 @@ static int ks_wlan_set_scan(struct net_d if (wrqu->data.length == sizeof(struct iw_scan_req) && wrqu->data.flags & IW_SCAN_THIS_ESSID) { req = (struct iw_scan_req *)extra; - priv->scan_ssid_len = req->essid_len; - memcpy(priv->scan_ssid, req->essid, priv->scan_ssid_len); + len = min_t(int, req->essid_len, IW_ESSID_MAX_SIZE); + priv->scan_ssid_len = len; + memcpy(priv->scan_ssid, req->essid, len); } else { priv->scan_ssid_len = 0; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Lee Gibson leegib@gmail.com
commit b93c1e3981af19527beee1c10a2bef67a228c48c upstream.
Function r8712_sitesurvey_cmd calls memcpy without checking the length. A user could control that length and trigger a buffer overflow. Fix by checking the length is within the maximum allowed size.
Signed-off-by: Lee Gibson leegib@gmail.com Link: https://lore.kernel.org/r/20210301132648.420296-1-leegib@gmail.com Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/rtl8712/rtl871x_cmd.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/staging/rtl8712/rtl871x_cmd.c +++ b/drivers/staging/rtl8712/rtl871x_cmd.c @@ -192,8 +192,10 @@ u8 r8712_sitesurvey_cmd(struct _adapter psurveyPara->ss_ssidlen = 0; memset(psurveyPara->ss_ssid, 0, IW_ESSID_MAX_SIZE + 1); if (pssid && pssid->SsidLength) { - memcpy(psurveyPara->ss_ssid, pssid->Ssid, pssid->SsidLength); - psurveyPara->ss_ssidlen = cpu_to_le32(pssid->SsidLength); + int len = min_t(int, pssid->SsidLength, IW_ESSID_MAX_SIZE); + + memcpy(psurveyPara->ss_ssid, pssid->Ssid, len); + psurveyPara->ss_ssidlen = cpu_to_le32(len); } set_fwstate(pmlmepriv, _FW_UNDER_SURVEY); r8712_enqueue_cmd(pcmdpriv, ph2c);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Lee Gibson leegib@gmail.com
commit 8687bf9ef9551bcf93897e33364d121667b1aadf upstream.
Function _rtl92e_wx_set_scan calls memcpy without checking the length. A user could control that length and trigger a buffer overflow. Fix by checking the length is within the maximum allowed size.
Reviewed-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Lee Gibson leegib@gmail.com Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/20210226145157.424065-1-leegib@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/rtl8192e/rtl8192e/rtl_wx.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/drivers/staging/rtl8192e/rtl8192e/rtl_wx.c +++ b/drivers/staging/rtl8192e/rtl8192e/rtl_wx.c @@ -406,9 +406,10 @@ static int _rtl92e_wx_set_scan(struct ne struct iw_scan_req *req = (struct iw_scan_req *)b;
if (req->essid_len) { - ieee->current_network.ssid_len = req->essid_len; - memcpy(ieee->current_network.ssid, req->essid, - req->essid_len); + int len = min_t(int, req->essid_len, IW_ESSID_MAX_SIZE); + + ieee->current_network.ssid_len = len; + memcpy(ieee->current_network.ssid, req->essid, len); } }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Ian Abbott abbotti@mev.co.uk
commit 25317f428a78fde71b2bf3f24d05850f08a73a52 upstream.
The Change-Of-State (COS) subdevice supports Comedi asynchronous commands to read 16-bit change-of-state values. However, the interrupt handler is calling `comedi_buf_write_samples()` with the address of a 32-bit integer `&s->state`. On bigendian architectures, it will copy 2 bytes from the wrong end of the 32-bit integer. Fix it by transferring the value via a 16-bit integer.
Fixes: 6bb45f2b0c86 ("staging: comedi: addi_apci_1032: use comedi_buf_write_samples()") Cc: stable@vger.kernel.org # 3.19+ Signed-off-by: Ian Abbott abbotti@mev.co.uk Link: https://lore.kernel.org/r/20210223143055.257402-2-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/comedi/drivers/addi_apci_1032.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/staging/comedi/drivers/addi_apci_1032.c +++ b/drivers/staging/comedi/drivers/addi_apci_1032.c @@ -260,6 +260,7 @@ static irqreturn_t apci1032_interrupt(in struct apci1032_private *devpriv = dev->private; struct comedi_subdevice *s = dev->read_subdev; unsigned int ctrl; + unsigned short val;
/* check interrupt is from this device */ if ((inl(devpriv->amcc_iobase + AMCC_OP_REG_INTCSR) & @@ -275,7 +276,8 @@ static irqreturn_t apci1032_interrupt(in outl(ctrl & ~APCI1032_CTRL_INT_ENA, dev->iobase + APCI1032_CTRL_REG);
s->state = inl(dev->iobase + APCI1032_STATUS_REG) & 0xffff; - comedi_buf_write_samples(s, &s->state, 1); + val = s->state; + comedi_buf_write_samples(s, &val, 1); comedi_handle_events(dev, s);
/* enable the interrupt */
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Ian Abbott abbotti@mev.co.uk
commit ac0bbf55ed3be75fde1f8907e91ecd2fd589bde3 upstream.
The digital input subdevice supports Comedi asynchronous commands that read interrupt status information. This uses 16-bit Comedi samples (of which only the bottom 8 bits contain status information). However, the interrupt handler is calling `comedi_buf_write_samples()` with the address of a 32-bit variable `unsigned int status`. On a bigendian machine, this will copy 2 bytes from the wrong end of the variable. Fix it by changing the type of the variable to `unsigned short`.
Fixes: a8c66b684efa ("staging: comedi: addi_apci_1500: rewrite the subdevice support functions") Cc: stable@vger.kernel.org #4.0+ Signed-off-by: Ian Abbott abbotti@mev.co.uk Link: https://lore.kernel.org/r/20210223143055.257402-3-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/comedi/drivers/addi_apci_1500.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
--- a/drivers/staging/comedi/drivers/addi_apci_1500.c +++ b/drivers/staging/comedi/drivers/addi_apci_1500.c @@ -208,7 +208,7 @@ static irqreturn_t apci1500_interrupt(in struct comedi_device *dev = d; struct apci1500_private *devpriv = dev->private; struct comedi_subdevice *s = dev->read_subdev; - unsigned int status = 0; + unsigned short status = 0; unsigned int val;
val = inl(devpriv->amcc + AMCC_OP_REG_INTCSR); @@ -238,14 +238,14 @@ static irqreturn_t apci1500_interrupt(in * * Mask Meaning * ---------- ------------------------------------------ - * 0x00000001 Event 1 has occurred - * 0x00000010 Event 2 has occurred - * 0x00000100 Counter/timer 1 has run down (not implemented) - * 0x00001000 Counter/timer 2 has run down (not implemented) - * 0x00010000 Counter 3 has run down (not implemented) - * 0x00100000 Watchdog has run down (not implemented) - * 0x01000000 Voltage error - * 0x10000000 Short-circuit error + * 0b00000001 Event 1 has occurred + * 0b00000010 Event 2 has occurred + * 0b00000100 Counter/timer 1 has run down (not implemented) + * 0b00001000 Counter/timer 2 has run down (not implemented) + * 0b00010000 Counter 3 has run down (not implemented) + * 0b00100000 Watchdog has run down (not implemented) + * 0b01000000 Voltage error + * 0b10000000 Short-circuit error */ comedi_buf_write_samples(s, &status, 1); comedi_handle_events(dev, s);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Ian Abbott abbotti@mev.co.uk
commit b2e78630f733a76508b53ba680528ca39c890e82 upstream.
The analog input subdevice supports Comedi asynchronous commands that use Comedi's 16-bit sample format. However, the calls to `comedi_buf_write_samples()` are passing the address of a 32-bit integer variable. On bigendian machines, this will copy 2 bytes from the wrong end of the 32-bit value. Fix it by changing the type of the variables holding the sample value to `unsigned short`. The type of the `val` parameter of `pci1710_ai_read_sample()` is changed to `unsigned short *` accordingly. The type of the `val` variable in `pci1710_ai_insn_read()` is also changed to `unsigned short` since its address is passed to `pci1710_ai_read_sample()`.
Fixes: a9c3a015c12f ("staging: comedi: adv_pci1710: use comedi_buf_write_samples()") Cc: stable@vger.kernel.org # 4.0+ Signed-off-by: Ian Abbott abbotti@mev.co.uk Link: https://lore.kernel.org/r/20210223143055.257402-4-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/comedi/drivers/adv_pci1710.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
--- a/drivers/staging/comedi/drivers/adv_pci1710.c +++ b/drivers/staging/comedi/drivers/adv_pci1710.c @@ -300,11 +300,11 @@ static int pci1710_ai_eoc(struct comedi_ static int pci1710_ai_read_sample(struct comedi_device *dev, struct comedi_subdevice *s, unsigned int cur_chan, - unsigned int *val) + unsigned short *val) { const struct boardtype *board = dev->board_ptr; struct pci1710_private *devpriv = dev->private; - unsigned int sample; + unsigned short sample; unsigned int chan;
sample = inw(dev->iobase + PCI171X_AD_DATA_REG); @@ -345,7 +345,7 @@ static int pci1710_ai_insn_read(struct c pci1710_ai_setup_chanlist(dev, s, &insn->chanspec, 1, 1);
for (i = 0; i < insn->n; i++) { - unsigned int val; + unsigned short val;
/* start conversion */ outw(0, dev->iobase + PCI171X_SOFTTRG_REG); @@ -395,7 +395,7 @@ static void pci1710_handle_every_sample( { struct comedi_cmd *cmd = &s->async->cmd; unsigned int status; - unsigned int val; + unsigned short val; int ret;
status = inw(dev->iobase + PCI171X_STATUS_REG); @@ -455,7 +455,7 @@ static void pci1710_handle_fifo(struct c }
for (i = 0; i < devpriv->max_samples; i++) { - unsigned int val; + unsigned short val; int ret;
ret = pci1710_ai_read_sample(dev, s, s->async->cur_chan, &val);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Ian Abbott abbotti@mev.co.uk
commit 1c0f20b78781b9ca50dc3ecfd396d0db5b141890 upstream.
The analog input subdevice supports Comedi asynchronous commands that use Comedi's 16-bit sample format. However, the call to `comedi_buf_write_samples()` is passing the address of a 32-bit integer variable. On bigendian machines, this will copy 2 bytes from the wrong end of the 32-bit value. Fix it by changing the type of the variable holding the sample value to `unsigned short`.
Fixes: d1d24cb65ee3 ("staging: comedi: das6402: read analog input samples in interrupt handler") Cc: stable@vger.kernel.org # 3.19+ Signed-off-by: Ian Abbott abbotti@mev.co.uk Link: https://lore.kernel.org/r/20210223143055.257402-5-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/comedi/drivers/das6402.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/staging/comedi/drivers/das6402.c +++ b/drivers/staging/comedi/drivers/das6402.c @@ -186,7 +186,7 @@ static irqreturn_t das6402_interrupt(int if (status & DAS6402_STATUS_FFULL) { async->events |= COMEDI_CB_OVERFLOW; } else if (status & DAS6402_STATUS_FFNE) { - unsigned int val; + unsigned short val;
val = das6402_ai_read_sample(dev, s); comedi_buf_write_samples(s, &val, 1);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Ian Abbott abbotti@mev.co.uk
commit 459b1e8c8fe97fcba0bd1b623471713dce2c5eaf upstream.
The analog input subdevice supports Comedi asynchronous commands that use Comedi's 16-bit sample format. However, the call to `comedi_buf_write_samples()` is passing the address of a 32-bit integer variable. On bigendian machines, this will copy 2 bytes from the wrong end of the 32-bit value. Fix it by changing the type of the variable holding the sample value to `unsigned short`.
Fixes: ad9eb43c93d8 ("staging: comedi: das800: use comedi_buf_write_samples()") Cc: stable@vger.kernel.org # 3.19+ Signed-off-by: Ian Abbott abbotti@mev.co.uk Link: https://lore.kernel.org/r/20210223143055.257402-6-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/comedi/drivers/das800.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/staging/comedi/drivers/das800.c +++ b/drivers/staging/comedi/drivers/das800.c @@ -427,7 +427,7 @@ static irqreturn_t das800_interrupt(int struct comedi_cmd *cmd; unsigned long irq_flags; unsigned int status; - unsigned int val; + unsigned short val; bool fifo_empty; bool fifo_overflow; int i;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Ian Abbott abbotti@mev.co.uk
commit 54999c0d94b3c26625f896f8e3460bc029821578 upstream.
The analog input subdevice supports Comedi asynchronous commands that use Comedi's 16-bit sample format. However, the call to `comedi_buf_write_samples()` is passing the address of a 32-bit integer variable. On bigendian machines, this will copy 2 bytes from the wrong end of the 32-bit value. Fix it by changing the type of the variable holding the sample value to `unsigned short`.
[Note: the bug was introduced in commit 1700529b24cc ("staging: comedi: dmm32at: use comedi_buf_write_samples()") but the patch applies better to the later (but in the same kernel release) commit 0c0eadadcbe6e ("staging: comedi: dmm32at: introduce dmm32_ai_get_sample()").]
Fixes: 0c0eadadcbe6e ("staging: comedi: dmm32at: introduce dmm32_ai_get_sample()") Cc: stable@vger.kernel.org # 3.19+ Signed-off-by: Ian Abbott abbotti@mev.co.uk Link: https://lore.kernel.org/r/20210223143055.257402-7-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/comedi/drivers/dmm32at.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/staging/comedi/drivers/dmm32at.c +++ b/drivers/staging/comedi/drivers/dmm32at.c @@ -404,7 +404,7 @@ static irqreturn_t dmm32at_isr(int irq, { struct comedi_device *dev = d; unsigned char intstat; - unsigned int val; + unsigned short val; int i;
if (!dev->attached) {
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Ian Abbott abbotti@mev.co.uk
commit b39dfcced399d31e7c4b7341693b18e01c8f655e upstream.
The analog input subdevice supports Comedi asynchronous commands that use Comedi's 16-bit sample format. However, the calls to `comedi_buf_write_samples()` are passing the address of a 32-bit integer variable. On bigendian machines, this will copy 2 bytes from the wrong end of the 32-bit value. Fix it by changing the type of the variable holding the sample value to `unsigned short`.
Fixes: de88924f67d1 ("staging: comedi: me4000: use comedi_buf_write_samples()") Cc: stable@vger.kernel.org # 3.19+ Signed-off-by: Ian Abbott abbotti@mev.co.uk Link: https://lore.kernel.org/r/20210223143055.257402-8-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/comedi/drivers/me4000.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/staging/comedi/drivers/me4000.c +++ b/drivers/staging/comedi/drivers/me4000.c @@ -924,7 +924,7 @@ static irqreturn_t me4000_ai_isr(int irq struct comedi_subdevice *s = dev->read_subdev; int i; int c = 0; - unsigned int lval; + unsigned short lval;
if (!dev->attached) return IRQ_NONE;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Ian Abbott abbotti@mev.co.uk
commit a084303a645896e834883f2c5170d044410dfdb3 upstream.
The analog input subdevice supports Comedi asynchronous commands that use Comedi's 16-bit sample format. However, the call to `comedi_buf_write_samples()` is passing the address of a 32-bit integer variable. On bigendian machines, this will copy 2 bytes from the wrong end of the 32-bit value. Fix it by changing the type of the variable holding the sample value to `unsigned short`.
Fixes: 1f44c034de2e ("staging: comedi: pcl711: use comedi_buf_write_samples()") Cc: stable@vger.kernel.org # 3.19+ Signed-off-by: Ian Abbott abbotti@mev.co.uk Link: https://lore.kernel.org/r/20210223143055.257402-9-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/comedi/drivers/pcl711.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/staging/comedi/drivers/pcl711.c +++ b/drivers/staging/comedi/drivers/pcl711.c @@ -184,7 +184,7 @@ static irqreturn_t pcl711_interrupt(int struct comedi_device *dev = d; struct comedi_subdevice *s = dev->read_subdev; struct comedi_cmd *cmd = &s->async->cmd; - unsigned int data; + unsigned short data;
if (!dev->attached) { dev_err(dev->class_dev, "spurious interrupt\n");
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Ian Abbott abbotti@mev.co.uk
commit 148e34fd33d53740642db523724226de14ee5281 upstream.
The analog input subdevice supports Comedi asynchronous commands that use Comedi's 16-bit sample format. However, the call to `comedi_buf_write_samples()` is passing the address of a 32-bit integer parameter. On bigendian machines, this will copy 2 bytes from the wrong end of the 32-bit value. Fix it by changing the type of the parameter holding the sample value to `unsigned short`.
[Note: the bug was introduced in commit edf4537bcbf5 ("staging: comedi: pcl818: use comedi_buf_write_samples()") but the patch applies better to commit d615416de615 ("staging: comedi: pcl818: introduce pcl818_ai_write_sample()").]
Fixes: d615416de615 ("staging: comedi: pcl818: introduce pcl818_ai_write_sample()") Cc: stable@vger.kernel.org # 4.0+ Signed-off-by: Ian Abbott abbotti@mev.co.uk Link: https://lore.kernel.org/r/20210223143055.257402-10-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/comedi/drivers/pcl818.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/staging/comedi/drivers/pcl818.c +++ b/drivers/staging/comedi/drivers/pcl818.c @@ -423,7 +423,7 @@ static int pcl818_ai_eoc(struct comedi_d
static bool pcl818_ai_write_sample(struct comedi_device *dev, struct comedi_subdevice *s, - unsigned int chan, unsigned int val) + unsigned int chan, unsigned short val) { struct pcl818_private *devpriv = dev->private; struct comedi_cmd *cmd = &s->async->cmd;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Ido Schimmel idosch@nvidia.com
[ Upstream commit dc860b88ce0a7ed9a048d5042cbb175daf60b657 ]
Routes are currently processed from a workqueue whereas nexthop objects are processed in system call context. This can result in the driver not finding a suitable nexthop group for a route and issuing a warning [1].
Fix this by ignoring such routes earlier in the process. The subsequent deletion notification will be ignored as well.
[1] WARNING: CPU: 2 PID: 7754 at drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:4853 mlxsw_sp_router_fib_event_work+0x1112/0x1e00 [mlxsw_spectrum] [...] CPU: 2 PID: 7754 Comm: kworker/u8:0 Not tainted 5.11.0-rc6-cq-20210207-1 #16 Hardware name: Mellanox Technologies Ltd. MSN2100/SA001390, BIOS 5.6.5 05/24/2018 Workqueue: mlxsw_core_ordered mlxsw_sp_router_fib_event_work [mlxsw_spectrum] RIP: 0010:mlxsw_sp_router_fib_event_work+0x1112/0x1e00 [mlxsw_spectrum]
Fixes: cdd6cfc54c64 ("mlxsw: spectrum_router: Allow programming routes with nexthop objects") Signed-off-by: Ido Schimmel idosch@nvidia.com Reported-by: Alex Veber alexve@nvidia.com Tested-by: Alex Veber alexve@nvidia.com Reviewed-by: Petr Machata petrm@nvidia.com Reviewed-by: Jiri Pirko jiri@nvidia.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c index 41424ee909a0..23d9fe18adba 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c @@ -5861,6 +5861,10 @@ mlxsw_sp_router_fib4_replace(struct mlxsw_sp *mlxsw_sp, if (mlxsw_sp->router->aborted) return 0;
+ if (fen_info->fi->nh && + !mlxsw_sp_nexthop_obj_group_lookup(mlxsw_sp, fen_info->fi->nh->id)) + return 0; + fib_node = mlxsw_sp_fib_node_get(mlxsw_sp, fen_info->tb_id, &fen_info->dst, sizeof(fen_info->dst), fen_info->dst_len, @@ -6511,6 +6515,9 @@ static int mlxsw_sp_router_fib6_replace(struct mlxsw_sp *mlxsw_sp, if (mlxsw_sp_fib6_rt_should_ignore(rt)) return 0;
+ if (rt->nh && !mlxsw_sp_nexthop_obj_group_lookup(mlxsw_sp, rt->nh->id)) + return 0; + fib_node = mlxsw_sp_fib_node_get(mlxsw_sp, rt->fib6_table->tb6_id, &rt->fib6_dst.addr, sizeof(rt->fib6_dst.addr),
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Ioana Ciornei ioana.ciornei@nxp.com
[ Upstream commit 73f476aa1975bae6a792b340f5b26ffcfba869a6 ]
The previous implementation of .handle_interrupt() did not take into account the fact that all the interrupt status registers should be acknowledged since multiple interrupt sources could be asserted.
Fix this by reading all the status registers before exiting with IRQ_NONE or triggering the PHY state machine.
Fixes: 1d1ae3c6ca3f ("net: phy: ti: implement generic .handle_interrupt() callback") Reported-by: Sven Schuchmann schuchmann@schleissheimer.de Signed-off-by: Ioana Ciornei ioana.ciornei@nxp.com Link: https://lore.kernel.org/r/20210226153020.867852-1-ciorneiioana@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/dp83822.c | 9 +++++---- drivers/net/phy/dp83tc811.c | 11 ++++++----- 2 files changed, 11 insertions(+), 9 deletions(-)
diff --git a/drivers/net/phy/dp83822.c b/drivers/net/phy/dp83822.c index fff371ca1086..423952cb9e1c 100644 --- a/drivers/net/phy/dp83822.c +++ b/drivers/net/phy/dp83822.c @@ -290,6 +290,7 @@ static int dp83822_config_intr(struct phy_device *phydev)
static irqreturn_t dp83822_handle_interrupt(struct phy_device *phydev) { + bool trigger_machine = false; int irq_status;
/* The MISR1 and MISR2 registers are holding the interrupt status in @@ -305,7 +306,7 @@ static irqreturn_t dp83822_handle_interrupt(struct phy_device *phydev) return IRQ_NONE; } if (irq_status & ((irq_status & GENMASK(7, 0)) << 8)) - goto trigger_machine; + trigger_machine = true;
irq_status = phy_read(phydev, MII_DP83822_MISR2); if (irq_status < 0) { @@ -313,11 +314,11 @@ static irqreturn_t dp83822_handle_interrupt(struct phy_device *phydev) return IRQ_NONE; } if (irq_status & ((irq_status & GENMASK(7, 0)) << 8)) - goto trigger_machine; + trigger_machine = true;
- return IRQ_NONE; + if (!trigger_machine) + return IRQ_NONE;
-trigger_machine: phy_trigger_machine(phydev);
return IRQ_HANDLED; diff --git a/drivers/net/phy/dp83tc811.c b/drivers/net/phy/dp83tc811.c index 688fadffb249..7ea32fb77190 100644 --- a/drivers/net/phy/dp83tc811.c +++ b/drivers/net/phy/dp83tc811.c @@ -264,6 +264,7 @@ static int dp83811_config_intr(struct phy_device *phydev)
static irqreturn_t dp83811_handle_interrupt(struct phy_device *phydev) { + bool trigger_machine = false; int irq_status;
/* The INT_STAT registers 1, 2 and 3 are holding the interrupt status @@ -279,7 +280,7 @@ static irqreturn_t dp83811_handle_interrupt(struct phy_device *phydev) return IRQ_NONE; } if (irq_status & ((irq_status & GENMASK(7, 0)) << 8)) - goto trigger_machine; + trigger_machine = true;
irq_status = phy_read(phydev, MII_DP83811_INT_STAT2); if (irq_status < 0) { @@ -287,7 +288,7 @@ static irqreturn_t dp83811_handle_interrupt(struct phy_device *phydev) return IRQ_NONE; } if (irq_status & ((irq_status & GENMASK(7, 0)) << 8)) - goto trigger_machine; + trigger_machine = true;
irq_status = phy_read(phydev, MII_DP83811_INT_STAT3); if (irq_status < 0) { @@ -295,11 +296,11 @@ static irqreturn_t dp83811_handle_interrupt(struct phy_device *phydev) return IRQ_NONE; } if (irq_status & ((irq_status & GENMASK(7, 0)) << 8)) - goto trigger_machine; + trigger_machine = true;
- return IRQ_NONE; + if (!trigger_machine) + return IRQ_NONE;
-trigger_machine: phy_trigger_machine(phydev);
return IRQ_HANDLED;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Sergey Shtylyov s.shtylyov@omprussia.ru
[ Upstream commit 75be7fb7f978202c4c3a1a713af4485afb2ff5f6 ]
According to the RZ/A1H Group, RZ/A1M Group User's Manual: Hardware, Rev. 4.00, the TRSCER register has bit 9 reserved, hence we can't use the driver's default TRSCER mask. Add the explicit initializer for sh_eth_cpu_data::trscer_err_mask for R7S72100.
Fixes: db893473d313 ("sh_eth: Add support for r7s72100") Signed-off-by: Sergey Shtylyov s.shtylyov@omprussia.ru Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/renesas/sh_eth.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c index 1dfecfd938cf..f029c7c03804 100644 --- a/drivers/net/ethernet/renesas/sh_eth.c +++ b/drivers/net/ethernet/renesas/sh_eth.c @@ -560,6 +560,8 @@ static struct sh_eth_cpu_data r7s72100_data = { EESR_TDE, .fdr_value = 0x0000070f,
+ .trscer_err_mask = DESC_I_RINT8 | DESC_I_RINT5, + .no_psr = 1, .apr = 1, .mpr = 1,
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Jordan Niethe jniethe5@gmail.com
[ Upstream commit 5c88a17e15795226b56d83f579cbb9b7a4864f79 ]
Commit af99da74333b ("powerpc/sstep: Support VSX vector paired storage access instructions") added loading and storing 32 word long data into adjacent VSRs. However the calculation used to determine if two VSRs needed to be loaded/stored inadvertently prevented the load/storing taking place for instructions with a data length less than 16 words.
This causes the emulation to not function correctly, which can be seen by the alignment_handler selftest:
$ ./alignment_handler [snip] test: test_alignment_handler_vsx_207 tags: git_version:powerpc-5.12-1-0-g82d2c16b350f VSX: 2.07B Doing lxsspx: PASSED Doing lxsiwax: FAILED: Wrong Data Doing lxsiwzx: PASSED Doing stxsspx: PASSED Doing stxsiwx: PASSED failure: test_alignment_handler_vsx_207 test: test_alignment_handler_vsx_300 tags: git_version:powerpc-5.12-1-0-g82d2c16b350f VSX: 3.00B Doing lxsd: PASSED Doing lxsibzx: PASSED Doing lxsihzx: PASSED Doing lxssp: FAILED: Wrong Data Doing lxv: PASSED Doing lxvb16x: PASSED Doing lxvh8x: PASSED Doing lxvx: PASSED Doing lxvwsx: FAILED: Wrong Data Doing lxvl: PASSED Doing lxvll: PASSED Doing stxsd: PASSED Doing stxsibx: PASSED Doing stxsihx: PASSED Doing stxssp: PASSED Doing stxv: PASSED Doing stxvb16x: PASSED Doing stxvh8x: PASSED Doing stxvx: PASSED Doing stxvl: PASSED Doing stxvll: PASSED failure: test_alignment_handler_vsx_300 [snip]
Fix this by making sure all VSX instruction emulation correctly load/store from the VSRs.
Fixes: af99da74333b ("powerpc/sstep: Support VSX vector paired storage access instructions") Signed-off-by: Jordan Niethe jniethe5@gmail.com Reviewed-by: Ravi Bangoria ravi.bangoria@linux.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20210225031946.1458206-1-jniethe5@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/lib/sstep.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c index bb5c20d4ca91..c6aebc149d14 100644 --- a/arch/powerpc/lib/sstep.c +++ b/arch/powerpc/lib/sstep.c @@ -904,7 +904,7 @@ static nokprobe_inline int do_vsx_load(struct instruction_op *op, if (!address_ok(regs, ea, size) || copy_mem_in(mem, ea, size, regs)) return -EFAULT;
- nr_vsx_regs = size / sizeof(__vector128); + nr_vsx_regs = max(1ul, size / sizeof(__vector128)); emulate_vsx_load(op, buf, mem, cross_endian); preempt_disable(); if (reg < 32) { @@ -951,7 +951,7 @@ static nokprobe_inline int do_vsx_store(struct instruction_op *op, if (!address_ok(regs, ea, size)) return -EFAULT;
- nr_vsx_regs = size / sizeof(__vector128); + nr_vsx_regs = max(1ul, size / sizeof(__vector128)); preempt_disable(); if (reg < 32) { /* FP regs + extensions */
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Atish Patra atish.patra@wdc.com
[ Upstream commit b12422362ce947098ac420ac3c975fc006af4c02 ]
There is no usrio config defined for default gem config leading to a kernel panic devices that don't define a data. This issue can be reprdouced with microchip polar fire soc where compatible string is defined as "cdns,macb".
Fixes: edac63861db7 ("add userio bits as platform configuration")
Signed-off-by: Atish Patra atish.patra@wdc.com Acked-by: Nicolas Ferre nicolas.ferre@microchip.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/cadence/macb_main.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c index 814a5b10141d..07cdb38e7d11 100644 --- a/drivers/net/ethernet/cadence/macb_main.c +++ b/drivers/net/ethernet/cadence/macb_main.c @@ -3950,6 +3950,13 @@ static int macb_init(struct platform_device *pdev) return 0; }
+static const struct macb_usrio_config macb_default_usrio = { + .mii = MACB_BIT(MII), + .rmii = MACB_BIT(RMII), + .rgmii = GEM_BIT(RGMII), + .refclk = MACB_BIT(CLKEN), +}; + #if defined(CONFIG_OF) /* 1518 rounded up */ #define AT91ETHER_MAX_RBUFF_SZ 0x600 @@ -4435,13 +4442,6 @@ static int fu540_c000_init(struct platform_device *pdev) return macb_init(pdev); }
-static const struct macb_usrio_config macb_default_usrio = { - .mii = MACB_BIT(MII), - .rmii = MACB_BIT(RMII), - .rgmii = GEM_BIT(RGMII), - .refclk = MACB_BIT(CLKEN), -}; - static const struct macb_usrio_config sama7g5_usrio = { .mii = 0, .rmii = 1, @@ -4590,6 +4590,7 @@ static const struct macb_config default_gem_config = { .dma_burst_length = 16, .clk_init = macb_clk_init, .init = macb_init, + .usrio = &macb_default_usrio, .jumbo_max_len = 10240, };
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Shawn Guo shawn.guo@linaro.org
[ Upstream commit 02fc409540303801994d076fcdb7064bd634dbf3 ]
Commit 67fc209b527d ("cpufreq: qcom-hw: drop devm_xxx() calls from init/exit hooks") introduces an issue of dereferencing freed memory 'data'. Fix it.
Fixes: 67fc209b527d ("cpufreq: qcom-hw: drop devm_xxx() calls from init/exit hooks") Reported-by: kernel test robot lkp@intel.com Reported-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Shawn Guo shawn.guo@linaro.org Signed-off-by: Viresh Kumar viresh.kumar@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/cpufreq/qcom-cpufreq-hw.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c index 2726e77c9e5a..5cdd20e38771 100644 --- a/drivers/cpufreq/qcom-cpufreq-hw.c +++ b/drivers/cpufreq/qcom-cpufreq-hw.c @@ -368,7 +368,7 @@ static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy) error: kfree(data); unmap_base: - iounmap(data->base); + iounmap(base); release_region: release_mem_region(res->start, resource_size(res)); return ret;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Wei Yongjun weiyongjun1@huawei.com
[ Upstream commit 536eb97abeba857126ad055de5923fa592acef25 ]
In case of error, the function ioremap() returns NULL pointer not ERR_PTR(). The IS_ERR() test in the return value check should be replaced with NULL test.
Fixes: 67fc209b527d ("cpufreq: qcom-hw: drop devm_xxx() calls from init/exit hooks") Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Wei Yongjun weiyongjun1@huawei.com Acked-by: Shawn Guo shawn.guo@linaro.org Signed-off-by: Viresh Kumar viresh.kumar@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/cpufreq/qcom-cpufreq-hw.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c index 5cdd20e38771..6de07556665b 100644 --- a/drivers/cpufreq/qcom-cpufreq-hw.c +++ b/drivers/cpufreq/qcom-cpufreq-hw.c @@ -317,9 +317,9 @@ static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy) }
base = ioremap(res->start, resource_size(res)); - if (IS_ERR(base)) { + if (!base) { dev_err(dev, "failed to map resource %pR\n", res); - ret = PTR_ERR(base); + ret = -ENOMEM; goto release_region; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Anshuman Khandual anshuman.khandual@arm.com
[ Upstream commit eeb0753ba27b26f609e61f9950b14f1b934fe429 ]
pfn_valid() validates a pfn but basically it checks for a valid struct page backing for that pfn. It should always return positive for memory ranges backed with struct page mapping. But currently pfn_valid() fails for all ZONE_DEVICE based memory types even though they have struct page mapping.
pfn_valid() asserts that there is a memblock entry for a given pfn without MEMBLOCK_NOMAP flag being set. The problem with ZONE_DEVICE based memory is that they do not have memblock entries. Hence memblock_is_map_memory() will invariably fail via memblock_search() for a ZONE_DEVICE based address. This eventually fails pfn_valid() which is wrong. memblock_is_map_memory() needs to be skipped for such memory ranges. As ZONE_DEVICE memory gets hotplugged into the system via memremap_pages() called from a driver, their respective memory sections will not have SECTION_IS_EARLY set.
Normal hotplug memory will never have MEMBLOCK_NOMAP set in their memblock regions. Because the flag MEMBLOCK_NOMAP was specifically designed and set for firmware reserved memory regions. memblock_is_map_memory() can just be skipped as its always going to be positive and that will be an optimization for the normal hotplug memory. Like ZONE_DEVICE based memory, all normal hotplugged memory too will not have SECTION_IS_EARLY set for their sections
Skipping memblock_is_map_memory() for all non early memory sections would fix pfn_valid() problem for ZONE_DEVICE based memory and also improve its performance for normal hotplug memory as well.
Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org Cc: Ard Biesheuvel ardb@kernel.org Cc: Robin Murphy robin.murphy@arm.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Acked-by: David Hildenbrand david@redhat.com Fixes: 73b20c84d42d ("arm64: mm: implement pte_devmap support") Signed-off-by: Anshuman Khandual anshuman.khandual@arm.com Acked-by: Catalin Marinas catalin.marinas@arm.com Link: https://lore.kernel.org/r/1614921898-4099-2-git-send-email-anshuman.khandual... Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/mm/init.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index 709d98fea90c..1141075e4d53 100644 --- a/arch/arm64/mm/init.c +++ b/arch/arm64/mm/init.c @@ -230,6 +230,18 @@ int pfn_valid(unsigned long pfn)
if (!valid_section(__pfn_to_section(pfn))) return 0; + + /* + * ZONE_DEVICE memory does not have the memblock entries. + * memblock_is_map_memory() check for ZONE_DEVICE based + * addresses will always fail. Even the normal hotplugged + * memory will never have MEMBLOCK_NOMAP flag set in their + * memblock entries. Skip memblock search for all non early + * memory sections covering all of hotplug memory including + * both normal and ZONE_DEVICE based. + */ + if (!early_section(__pfn_to_section(pfn))) + return pfn_section_valid(__pfn_to_section(pfn), pfn); #endif return memblock_is_map_memory(addr); }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Benjamin Coddington bcodding@redhat.com
[ Upstream commit f0940f4b3284a00f38a5d42e6067c2aaa20e1f2e ]
We could recurse into NFS doing memory reclaim while sending a sync task, which might result in a deadlock. Set memalloc_nofs_save for sync task execution.
Fixes: a1231fda7e94 ("SUNRPC: Set memalloc_nofs_save() on all rpciod/xprtiod jobs") Signed-off-by: Benjamin Coddington bcodding@redhat.com Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/sunrpc/sched.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c index cf702a5f7fe5..39ed0e0afe6d 100644 --- a/net/sunrpc/sched.c +++ b/net/sunrpc/sched.c @@ -963,8 +963,11 @@ void rpc_execute(struct rpc_task *task)
rpc_set_active(task); rpc_make_runnable(rpciod_workqueue, task); - if (!is_async) + if (!is_async) { + unsigned int pflags = memalloc_nofs_save(); __rpc_execute(task); + memalloc_nofs_restore(pflags); + } }
static void rpc_async_schedule(struct work_struct *work)
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Trond Myklebust trond.myklebust@hammerspace.com
[ Upstream commit 82e7ca1334ab16e2e04fafded1cab9dfcdc11b40 ]
There should be no reason to expect the directory permissions to change just because the directory contents changed or a negative lookup timed out. So let's avoid doing a full call to nfs_mark_for_revalidate() in that case. Furthermore, if this is a negative dentry, and we haven't actually done a new lookup, then we have no reason yet to believe the directory has changed at all. So let's remove the gratuitous directory inode invalidation altogether when called from nfs_lookup_revalidate_negative().
Reported-by: Geert Jansen gerardu@amazon.com Fixes: 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()") Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/dir.c | 20 +++++++++++++++++--- 1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index ef827ae193d2..7bcc6fcf1096 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -1401,6 +1401,15 @@ int nfs_lookup_verify_inode(struct inode *inode, unsigned int flags) goto out; }
+static void nfs_mark_dir_for_revalidate(struct inode *inode) +{ + struct nfs_inode *nfsi = NFS_I(inode); + + spin_lock(&inode->i_lock); + nfsi->cache_validity |= NFS_INO_REVAL_PAGECACHE; + spin_unlock(&inode->i_lock); +} + /* * We judge how long we want to trust negative * dentries by looking at the parent inode mtime. @@ -1435,7 +1444,6 @@ nfs_lookup_revalidate_done(struct inode *dir, struct dentry *dentry, __func__, dentry); return 1; case 0: - nfs_mark_for_revalidate(dir); if (inode && S_ISDIR(inode->i_mode)) { /* Purge readdir caches. */ nfs_zap_caches(inode); @@ -1525,6 +1533,13 @@ nfs_lookup_revalidate_dentry(struct inode *dir, struct dentry *dentry, nfs_free_fattr(fattr); nfs_free_fhandle(fhandle); nfs4_label_free(label); + + /* + * If the lookup failed despite the dentry change attribute being + * a match, then we should revalidate the directory cache. + */ + if (!ret && nfs_verify_change_attribute(dir, dentry->d_time)) + nfs_mark_dir_for_revalidate(dir); return nfs_lookup_revalidate_done(dir, dentry, inode, ret); }
@@ -1567,7 +1582,7 @@ nfs_do_lookup_revalidate(struct inode *dir, struct dentry *dentry, error = nfs_lookup_verify_inode(inode, flags); if (error) { if (error == -ESTALE) - nfs_zap_caches(dir); + nfs_mark_dir_for_revalidate(dir); goto out_bad; } nfs_advise_use_readdirplus(dir); @@ -2064,7 +2079,6 @@ nfs_add_or_obtain(struct dentry *dentry, struct nfs_fh *fhandle, dput(parent); return d; out_error: - nfs_mark_for_revalidate(dir); d = ERR_PTR(error); goto out; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Trond Myklebust trond.myklebust@hammerspace.com
[ Upstream commit 47397915ede0192235474b145ebcd81b37b03624 ]
The fact that the lookup revalidation failed, does not mean that the inode contents have changed.
Fixes: 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()") Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/dir.c | 20 ++++++++------------ 1 file changed, 8 insertions(+), 12 deletions(-)
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index 7bcc6fcf1096..4db3018776f6 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -1444,18 +1444,14 @@ nfs_lookup_revalidate_done(struct inode *dir, struct dentry *dentry, __func__, dentry); return 1; case 0: - if (inode && S_ISDIR(inode->i_mode)) { - /* Purge readdir caches. */ - nfs_zap_caches(inode); - /* - * We can't d_drop the root of a disconnected tree: - * its d_hash is on the s_anon list and d_drop() would hide - * it from shrink_dcache_for_unmount(), leading to busy - * inodes on unmount and further oopses. - */ - if (IS_ROOT(dentry)) - return 1; - } + /* + * We can't d_drop the root of a disconnected tree: + * its d_hash is on the s_anon list and d_drop() would hide + * it from shrink_dcache_for_unmount(), leading to busy + * inodes on unmount and further oopses. + */ + if (inode && IS_ROOT(dentry)) + return 1; dfprintk(LOOKUPCACHE, "NFS: %s(%pd2) is invalid\n", __func__, dentry); return 0;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Ondrej Mosnacek omosnace@redhat.com
[ Upstream commit 53cb245454df5b13d7063162afd7a785aed6ebf2 ]
An xattr 'get' handler is expected to return the length of the value on success, yet _nfs4_get_security_label() (and consequently also nfs4_xattr_get_nfs4_label(), which is used as an xattr handler) returns just 0 on success.
Fix this by returning label.len instead, which contains the length of the result.
Fixes: aa9c2669626c ("NFS: Client implementation of Labeled-NFS") Signed-off-by: Ondrej Mosnacek omosnace@redhat.com Reviewed-by: James Morris jamorris@linux.microsoft.com Reviewed-by: Paul Moore paul@paul-moore.com Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/nfs4proc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index fc8bbfd9beb3..7eb44f37558c 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -5972,7 +5972,7 @@ static int _nfs4_get_security_label(struct inode *inode, void *buf, return ret; if (!(fattr.valid & NFS_ATTR_FATTR_V4_SECURITY_LABEL)) return -ENOENT; - return 0; + return label.len; }
static int nfs4_get_security_label(struct inode *inode, void *buf,
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Jia-Ju Bai baijiaju1990@gmail.com
[ Upstream commit df66617bfe87487190a60783d26175b65d2502ce ]
When create_singlethread_workqueue returns NULL to card->event_wq, no error return code of rsxx_pci_probe() is assigned.
To fix this bug, st is assigned with -ENOMEM in this case.
Fixes: 8722ff8cdbfa ("block: IBM RamSan 70/80 device driver") Reported-by: TOTE Robot oslab@tsinghua.edu.cn Signed-off-by: Jia-Ju Bai baijiaju1990@gmail.com Link: https://lore.kernel.org/r/20210310033017.4023-1-baijiaju1990@gmail.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/rsxx/core.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/block/rsxx/core.c b/drivers/block/rsxx/core.c index 5ac1881396af..227e1be4c6f9 100644 --- a/drivers/block/rsxx/core.c +++ b/drivers/block/rsxx/core.c @@ -871,6 +871,7 @@ static int rsxx_pci_probe(struct pci_dev *dev, card->event_wq = create_singlethread_workqueue(DRIVER_NAME"_event"); if (!card->event_wq) { dev_err(CARD_TO_DEV(card), "Failed card event setup.\n"); + st = -ENOMEM; goto failed_event_handler; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Anthony DeRossi ajderossi@gmail.com
[ Upstream commit ca63d76fd2319db984f2875992643f900caf2c72 ]
Freed pages are not subtracted from the allocated_pages counter in ttm_pool_type_fini(), causing a leak in the count on device removal. The next shrinker invocation loops forever trying to free pages that are no longer in the pool:
rcu: INFO: rcu_sched self-detected stall on CPU rcu: 3-....: (9998 ticks this GP) idle=54e/1/0x4000000000000000 softirq=434857/434857 fqs=2237 (t=10001 jiffies g=2194533 q=49211) NMI backtrace for cpu 3 CPU: 3 PID: 1034 Comm: kswapd0 Tainted: P O 5.11.0-com #1 Hardware name: System manufacturer System Product Name/PRIME X570-PRO, BIOS 1405 11/19/2019 Call Trace: <IRQ> ... </IRQ> sysvec_apic_timer_interrupt+0x77/0x80 asm_sysvec_apic_timer_interrupt+0x12/0x20 RIP: 0010:mutex_unlock+0x16/0x20 Code: e7 48 8b 70 10 e8 7a 53 77 ff eb aa e8 43 6c ff ff 0f 1f 00 65 48 8b 14 25 00 6d 01 00 31 c9 48 89 d0 f0 48 0f b1 0f 48 39 c2 <74> 05 e9 e3 fe ff ff c3 66 90 48 8b 47 20 48 85 c0 74 0f 8b 50 10 RSP: 0018:ffffbdb840797be8 EFLAGS: 00000246 RAX: ffff9ff445a41c00 RBX: ffffffffc02a9ef8 RCX: 0000000000000000 RDX: ffff9ff445a41c00 RSI: ffffbdb840797c78 RDI: ffffffffc02a9ac0 RBP: 0000000000000080 R08: 0000000000000000 R09: ffffbdb840797c80 R10: 0000000000000000 R11: fffffffffffffff5 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000084 R15: ffffffffc02a9a60 ttm_pool_shrink+0x7d/0x90 [ttm] ttm_pool_shrinker_scan+0x5/0x20 [ttm] do_shrink_slab+0x13a/0x1a0 ...
debugfs shows the incorrect total:
$ cat /sys/kernel/debug/dri/0/ttm_page_pool --- 0--- --- 1--- --- 2--- --- 3--- --- 4--- --- 5--- --- 6--- --- 7--- --- 8--- --- 9--- ---10--- wc : 0 0 0 0 0 0 0 0 0 0 0 uc : 0 0 0 0 0 0 0 0 0 0 0 wc 32 : 0 0 0 0 0 0 0 0 0 0 0 uc 32 : 0 0 0 0 0 0 0 0 0 0 0 DMA uc : 0 0 0 0 0 0 0 0 0 0 0 DMA wc : 0 0 0 0 0 0 0 0 0 0 0 DMA : 0 0 0 0 0 0 0 0 0 0 0
total : 3029 of 8244261
Using ttm_pool_type_take() to remove pages from the pool before freeing them correctly accounts for the freed pages.
Fixes: d099fc8f540a ("drm/ttm: new TT backend allocation pool v3") Signed-off-by: Anthony DeRossi ajderossi@gmail.com Link: https://patchwork.freedesktop.org/patch/msgid/20210303011723.22512-1-ajderos... Reviewed-by: Christian König christian.koenig@amd.com Signed-off-by: Christian König christian.koenig@amd.com Signed-off-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/ttm/ttm_pool.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c index 6e27cb1bf48b..4eb6efb8b8c0 100644 --- a/drivers/gpu/drm/ttm/ttm_pool.c +++ b/drivers/gpu/drm/ttm/ttm_pool.c @@ -268,13 +268,13 @@ static void ttm_pool_type_init(struct ttm_pool_type *pt, struct ttm_pool *pool, /* Remove a pool_type from the global shrinker list and free all pages */ static void ttm_pool_type_fini(struct ttm_pool_type *pt) { - struct page *p, *tmp; + struct page *p;
mutex_lock(&shrinker_lock); list_del(&pt->shrinker_list); mutex_unlock(&shrinker_lock);
- list_for_each_entry_safe(p, tmp, &pt->pages, lru) + while ((p = ttm_pool_type_take(pt))) ttm_pool_free_page(pt->pool, pt->caching, pt->order, p); }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: James Smart jsmart2021@gmail.com
[ Upstream commit f20ef34d71abc1fc56b322aaa251f90f94320140 ]
Recent patch to prevent calling __nvme_fc_abort_outstanding_ios in interrupt context results in a possible race condition. A controller reset results in errored io completions, which schedules error work. The change of error work to a work element allows it to fire after the ctrl state transition to NVME_CTRL_CONNECTING, causing any outstanding io (used to initialize the controller) to fail and cause problems for connect_work.
Add a state check to only schedule error work if not in the RESETTING state.
Fixes: 19fce0470f05 ("nvme-fc: avoid calling _nvme_fc_abort_outstanding_ios from interrupt context") Signed-off-by: Nigel Kirkland nkirkland2304@gmail.com Signed-off-by: James Smart jsmart2021@gmail.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/fc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c index 5f36cfa8136c..7ec6869b3e5b 100644 --- a/drivers/nvme/host/fc.c +++ b/drivers/nvme/host/fc.c @@ -2055,7 +2055,7 @@ nvme_fc_fcpio_done(struct nvmefc_fcp_req *req) nvme_fc_complete_rq(rq);
check_error: - if (terminate_assoc) + if (terminate_assoc && ctrl->ctrl.state != NVME_CTRL_RESETTING) queue_work(nvme_reset_wq, &ctrl->ioerr_work); }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Daiyue Zhang zhangdaiyue1@huawei.com
[ Upstream commit 14fbbc8297728e880070f7b077b3301a8c698ef9 ]
Commit b0841eefd969 ("configfs: provide exclusion between IO and removals") uses ->frag_dead to mark the fragment state, thus no bothering with extra refcount on config_item when opening a file. The configfs_get_config_item was removed in __configfs_open_file, but not with config_item_put. So the refcount on config_item will lost its balance, causing use-after-free issues in some occasions like this:
Test: 1. Mount configfs on /config with read-only items: drwxrwx--- 289 root root 0 2021-04-01 11:55 /config drwxr-xr-x 2 root root 0 2021-04-01 11:54 /config/a --w--w--w- 1 root root 4096 2021-04-01 11:53 /config/a/1.txt ......
2. Then run: for file in /config do echo $file grep -R 'key' $file done
3. __configfs_open_file will be called in parallel, the first one got called will do: if (file->f_mode & FMODE_READ) { if (!(inode->i_mode & S_IRUGO)) goto out_put_module; config_item_put(buffer->item); kref_put() package_details_release() kfree()
the other one will run into use-after-free issues like this: BUG: KASAN: use-after-free in __configfs_open_file+0x1bc/0x3b0 Read of size 8 at addr fffffff155f02480 by task grep/13096 CPU: 0 PID: 13096 Comm: grep VIP: 00 Tainted: G W 4.14.116-kasan #1 TGID: 13096 Comm: grep Call trace: dump_stack+0x118/0x160 kasan_report+0x22c/0x294 __asan_load8+0x80/0x88 __configfs_open_file+0x1bc/0x3b0 configfs_open_file+0x28/0x34 do_dentry_open+0x2cc/0x5c0 vfs_open+0x80/0xe0 path_openat+0xd8c/0x2988 do_filp_open+0x1c4/0x2fc do_sys_open+0x23c/0x404 SyS_openat+0x38/0x48
Allocated by task 2138: kasan_kmalloc+0xe0/0x1ac kmem_cache_alloc_trace+0x334/0x394 packages_make_item+0x4c/0x180 configfs_mkdir+0x358/0x740 vfs_mkdir2+0x1bc/0x2e8 SyS_mkdirat+0x154/0x23c el0_svc_naked+0x34/0x38
Freed by task 13096: kasan_slab_free+0xb8/0x194 kfree+0x13c/0x910 package_details_release+0x524/0x56c kref_put+0xc4/0x104 config_item_put+0x24/0x34 __configfs_open_file+0x35c/0x3b0 configfs_open_file+0x28/0x34 do_dentry_open+0x2cc/0x5c0 vfs_open+0x80/0xe0 path_openat+0xd8c/0x2988 do_filp_open+0x1c4/0x2fc do_sys_open+0x23c/0x404 SyS_openat+0x38/0x48 el0_svc_naked+0x34/0x38
To fix this issue, remove the config_item_put in __configfs_open_file to balance the refcount of config_item.
Fixes: b0841eefd969 ("configfs: provide exclusion between IO and removals") Signed-off-by: Daiyue Zhang zhangdaiyue1@huawei.com Signed-off-by: Yi Chen chenyi77@huawei.com Signed-off-by: Ge Qiu qiuge@huawei.com Reviewed-by: Chao Yu yuchao0@huawei.com Acked-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- fs/configfs/file.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/fs/configfs/file.c b/fs/configfs/file.c index 1f0270229d7b..da8351d1e455 100644 --- a/fs/configfs/file.c +++ b/fs/configfs/file.c @@ -378,7 +378,7 @@ static int __configfs_open_file(struct inode *inode, struct file *file, int type
attr = to_attr(dentry); if (!attr) - goto out_put_item; + goto out_free_buffer;
if (type & CONFIGFS_ITEM_BIN_ATTR) { buffer->bin_attr = to_bin_attr(dentry); @@ -391,7 +391,7 @@ static int __configfs_open_file(struct inode *inode, struct file *file, int type /* Grab the module reference for this attribute if we have one */ error = -ENODEV; if (!try_module_get(buffer->owner)) - goto out_put_item; + goto out_free_buffer;
error = -EACCES; if (!buffer->item->ci_type) @@ -435,8 +435,6 @@ static int __configfs_open_file(struct inode *inode, struct file *file, int type
out_put_module: module_put(buffer->owner); -out_put_item: - config_item_put(buffer->item); out_free_buffer: up_read(&frag->frag_sem); kfree(buffer);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Ard Biesheuvel ardb@kernel.org
[ Upstream commit 7ba8f2b2d652cd8d8a2ab61f4be66973e70f9f88 ]
52-bit VA kernels can run on hardware that is only 48-bit capable, but configure the ID map as 52-bit by default. This was not a problem until recently, because the special T0SZ value for a 52-bit VA space was never programmed into the TCR register anwyay, and because a 52-bit ID map happens to use the same number of translation levels as a 48-bit one.
This behavior was changed by commit 1401bef703a4 ("arm64: mm: Always update TCR_EL1 from __cpu_set_tcr_t0sz()"), which causes the unsupported T0SZ value for a 52-bit VA to be programmed into TCR_EL1. While some hardware simply ignores this, Mark reports that Amberwing systems choke on this, resulting in a broken boot. But even before that commit, the unsupported idmap_t0sz value was exposed to KVM and used to program TCR_EL2 incorrectly as well.
Given that we already have to deal with address spaces being either 48-bit or 52-bit in size, the cleanest approach seems to be to simply default to a 48-bit VA ID map, and only switch to a 52-bit one if the placement of the kernel in DRAM requires it. This is guaranteed not to happen unless the system is actually 52-bit VA capable.
Fixes: 90ec95cda91a ("arm64: mm: Introduce VA_BITS_MIN") Reported-by: Mark Salter msalter@redhat.com Link: http://lore.kernel.org/r/20210310003216.410037-1-msalter@redhat.com Signed-off-by: Ard Biesheuvel ardb@kernel.org Link: https://lore.kernel.org/r/20210310171515.416643-2-ardb@kernel.org Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/include/asm/mmu_context.h | 5 +---- arch/arm64/kernel/head.S | 2 +- arch/arm64/mm/mmu.c | 2 +- 3 files changed, 3 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h index 0b3079fd28eb..1c364ec0ad31 100644 --- a/arch/arm64/include/asm/mmu_context.h +++ b/arch/arm64/include/asm/mmu_context.h @@ -65,10 +65,7 @@ extern u64 idmap_ptrs_per_pgd;
static inline bool __cpu_uses_extended_idmap(void) { - if (IS_ENABLED(CONFIG_ARM64_VA_BITS_52)) - return false; - - return unlikely(idmap_t0sz != TCR_T0SZ(VA_BITS)); + return unlikely(idmap_t0sz != TCR_T0SZ(vabits_actual)); }
/* diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S index 7ec430e18f95..a0b3bfe67609 100644 --- a/arch/arm64/kernel/head.S +++ b/arch/arm64/kernel/head.S @@ -319,7 +319,7 @@ SYM_FUNC_START_LOCAL(__create_page_tables) */ adrp x5, __idmap_text_end clz x5, x5 - cmp x5, TCR_T0SZ(VA_BITS) // default T0SZ small enough? + cmp x5, TCR_T0SZ(VA_BITS_MIN) // default T0SZ small enough? b.ge 1f // .. then skip VA range extension
adr_l x6, idmap_t0sz diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index cb78343181db..6f0648777d34 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -40,7 +40,7 @@ #define NO_BLOCK_MAPPINGS BIT(0) #define NO_CONT_MAPPINGS BIT(1)
-u64 idmap_t0sz = TCR_T0SZ(VA_BITS); +u64 idmap_t0sz = TCR_T0SZ(VA_BITS_MIN); u64 idmap_ptrs_per_pgd = PTRS_PER_PGD;
u64 __section(".mmuoff.data.write") vabits_actual;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Jens Axboe axboe@kernel.dk
[ Upstream commit d052d1d685f5125249ab4ff887562c88ba959638 ]
We bypass IOPOLL completion polling (and reaping) for the SQPOLL thread, but if it's the thread itself invoking cancelations, then we still need to perform it or no one will.
Fixes: 9936c7c2bc76 ("io_uring: deduplicate core cancellations sequence") Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/io_uring.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c index 241313278e5a..00ef0b90d149 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -8891,7 +8891,8 @@ static void io_uring_try_cancel_requests(struct io_ring_ctx *ctx, }
/* SQPOLL thread does its own polling */ - if (!(ctx->flags & IORING_SETUP_SQPOLL) && !files) { + if ((!(ctx->flags & IORING_SETUP_SQPOLL) && !files) || + (ctx->sq_data && ctx->sq_data->thread == current)) { while (!list_empty_careful(&ctx->iopoll_list)) { io_iopoll_try_reap_events(ctx); ret = true;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Dave Airlie airlied@redhat.com
[ Upstream commit 4042160c2e5433e0759782c402292a90b5bf458d ]
The index variable should only be increased in one place.
Noticed this while trying to track down another oops.
v2: use while loop.
Fixes: f295c8cfec83 ("drm/nouveau: fix dma syncing warning with debugging on.") Signed-off-by: Dave Airlie airlied@redhat.com Reviewed-by: Michael J. Ruhl michael.j.ruhl@intel.com Signed-off-by: Dave Airlie airlied@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/20210311043527.5376-1-airlied@... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/nouveau/nouveau_bo.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c index 7ea367a5444d..f1c9a22083be 100644 --- a/drivers/gpu/drm/nouveau/nouveau_bo.c +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c @@ -556,7 +556,8 @@ nouveau_bo_sync_for_device(struct nouveau_bo *nvbo) if (nvbo->force_coherent) return;
- for (i = 0; i < ttm_dma->num_pages; ++i) { + i = 0; + while (i < ttm_dma->num_pages) { struct page *p = ttm_dma->pages[i]; size_t num_pages = 1;
@@ -587,7 +588,8 @@ nouveau_bo_sync_for_cpu(struct nouveau_bo *nvbo) if (nvbo->force_coherent) return;
- for (i = 0; i < ttm_dma->num_pages; ++i) { + i = 0; + while (i < ttm_dma->num_pages) { struct page *p = ttm_dma->pages[i]; size_t num_pages = 1;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Wei Yongjun weiyongjun1@huawei.com
[ Upstream commit c8e3866836528a4ba3b0535834f03768d74f7d8e ]
Fix to return negative error code -ENOMEM from the error handling case instead of 0, as done elsewhere in this function.
Fixes: 53c218da220c ("driver/perf: Add PMU driver for the ARM DMC-620 memory controller") Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Wei Yongjun weiyongjun1@huawei.com Link: https://lore.kernel.org/r/20210312080421.277562-1-weiyongjun1@huawei.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/perf/arm_dmc620_pmu.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/perf/arm_dmc620_pmu.c b/drivers/perf/arm_dmc620_pmu.c index 004930eb4bbb..b50b47f1a0d9 100644 --- a/drivers/perf/arm_dmc620_pmu.c +++ b/drivers/perf/arm_dmc620_pmu.c @@ -681,6 +681,7 @@ static int dmc620_pmu_device_probe(struct platform_device *pdev) if (!name) { dev_err(&pdev->dev, "Create name failed, PMU @%pa\n", &res->start); + ret = -ENOMEM; goto out_teardown_dev; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Willem de Bruijn willemb@google.com
[ Upstream commit b228c9b058760500fda5edb3134527f629fc2dc3 ]
The referenced commit expands the skb_seq_state used by skb_find_text with a 4B frag_off field, growing it to 48B.
This exceeds container ts_state->cb, causing a stack corruption:
[ 73.238353] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: skb_find_text+0xc5/0xd0 [ 73.247384] CPU: 1 PID: 376 Comm: nping Not tainted 5.11.0+ #4 [ 73.252613] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 [ 73.260078] Call Trace: [ 73.264677] dump_stack+0x57/0x6a [ 73.267866] panic+0xf6/0x2b7 [ 73.270578] ? skb_find_text+0xc5/0xd0 [ 73.273964] __stack_chk_fail+0x10/0x10 [ 73.277491] skb_find_text+0xc5/0xd0 [ 73.280727] string_mt+0x1f/0x30 [ 73.283639] ipt_do_table+0x214/0x410
The struct is passed between skb_find_text and its callbacks skb_prepare_seq_read, skb_seq_read and skb_abort_seq read through the textsearch interface using TS_SKB_CB.
I assumed that this mapped to skb->cb like other .._SKB_CB wrappers. skb->cb is 48B. But it maps to ts_state->cb, which is only 40B.
skb->cb was increased from 40B to 48B after ts_state was introduced, in commit 3e3850e989c5 ("[NETFILTER]: Fix xfrm lookup in ip_route_me_harder/ip6_route_me_harder").
Increase ts_state.cb[] to 48 to fit the struct.
Also add a BUILD_BUG_ON to avoid a repeat.
The alternative is to directly add a dependency from textsearch onto linux/skbuff.h, but I think the intent is textsearch to have no such dependencies on its callers.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=211911 Fixes: 97550f6fa592 ("net: compound page support in skb_seq_read") Reported-by: Kris Karas bugs-a17@moonlit-rail.com Signed-off-by: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/textsearch.h | 2 +- net/core/skbuff.c | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/include/linux/textsearch.h b/include/linux/textsearch.h index 13770cfe33ad..6673e4d4ac2e 100644 --- a/include/linux/textsearch.h +++ b/include/linux/textsearch.h @@ -23,7 +23,7 @@ struct ts_config; struct ts_state { unsigned int offset; - char cb[40]; + char cb[48]; };
/** diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 28b8242f18d7..2b784d62a9fe 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -3622,6 +3622,8 @@ unsigned int skb_find_text(struct sk_buff *skb, unsigned int from, struct ts_state state; unsigned int ret;
+ BUILD_BUG_ON(sizeof(struct skb_seq_state) > sizeof(state.cb)); + config->get_next_block = skb_ts_get_next_block; config->finish = skb_ts_finish;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Florian Westphal fw@strlen.de
[ Upstream commit f07157792c633b528de5fc1dbe2e4ea54f8e09d4 ]
mptcp_add_pending_subflow() performs a sock_hold() on the subflow, then adds the subflow to the join list.
Without a sock_put the subflow sk won't be freed in case connect() fails.
unreferenced object 0xffff88810c03b100 (size 3000): [..] sk_prot_alloc.isra.0+0x2f/0x110 sk_alloc+0x5d/0xc20 inet6_create+0x2b7/0xd30 __sock_create+0x17f/0x410 mptcp_subflow_create_socket+0xff/0x9c0 __mptcp_subflow_connect+0x1da/0xaf0 mptcp_pm_nl_work+0x6e0/0x1120 mptcp_worker+0x508/0x9a0
Fixes: 5b950ff4331ddda ("mptcp: link MPC subflow into msk only after accept") Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Mat Martineau mathew.j.martineau@linux.intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/mptcp/subflow.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index 81b7be67d288..c3090003a17b 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -1174,6 +1174,7 @@ int __mptcp_subflow_connect(struct sock *sk, const struct mptcp_addr_info *loc, spin_lock_bh(&msk->join_list_lock); list_del(&subflow->node); spin_unlock_bh(&msk->join_list_lock); + sock_put(mptcp_subflow_tcp_sock(subflow));
failed: subflow->disposable = 1;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit eaeef1ce55ec9161e0c44ff27017777b1644b421 ]
In case of memory pressure the MPTCP xmit path keeps at most a single skb in the tx cache, eventually freeing additional ones.
The associated counter for forward memory is not update accordingly, and that causes the following splat:
WARNING: CPU: 0 PID: 12 at net/core/stream.c:208 sk_stream_kill_queues+0x3ca/0x530 net/core/stream.c:208 Modules linked in: CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.11.0-rc2 #59 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Workqueue: events mptcp_worker RIP: 0010:sk_stream_kill_queues+0x3ca/0x530 net/core/stream.c:208 Code: 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e 63 01 00 00 8b ab 00 01 00 00 e9 60 ff ff ff e8 2f 24 d3 fe 0f 0b eb 97 e8 26 24 d3 fe <0f> 0b eb a0 e8 1d 24 d3 fe 0f 0b e9 a5 fe ff ff 4c 89 e7 e8 0e d0 RSP: 0018:ffffc900000c7bc8 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff88810030ac40 RSI: ffffffff8262ca4a RDI: 0000000000000003 RBP: 0000000000000d00 R08: 0000000000000000 R09: ffffffff85095aa7 R10: ffffffff8262c9ea R11: 0000000000000001 R12: ffff888108908100 R13: ffffffff85095aa0 R14: ffffc900000c7c48 R15: 1ffff92000018f85 FS: 0000000000000000(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fa7444baef8 CR3: 0000000035ee9005 CR4: 0000000000170ef0 Call Trace: __mptcp_destroy_sock+0x4a7/0x6c0 net/mptcp/protocol.c:2547 mptcp_worker+0x7dd/0x1610 net/mptcp/protocol.c:2272 process_one_work+0x896/0x1170 kernel/workqueue.c:2275 worker_thread+0x605/0x1350 kernel/workqueue.c:2421 kthread+0x344/0x410 kernel/kthread.c:292 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:296
At close time, as reported by syzkaller/Christoph.
This change address the issue properly updating the fwd allocated memory counter in the error path.
Reported-by: Christoph Paasch cpaasch@apple.com Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/136 Fixes: 724cfd2ee8aa ("mptcp: allocate TX skbs in msk context") Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Mat Martineau mathew.j.martineau@linux.intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/mptcp/protocol.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index de89824a2a36..056846eb2e5b 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -1176,6 +1176,7 @@ static bool mptcp_tx_cache_refill(struct sock *sk, int size, */ while (skbs->qlen > 1) { skb = __skb_dequeue_tail(skbs); + *total_ts -= skb->truesize; __kfree_skb(skb); } return skbs->qlen > 0;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Kan Liang kan.liang@linux.intel.com
[ Upstream commit a5398bffc01fe044848c5024e5e867e407f239b8 ]
Sometimes the PMU internal buffers have to be flushed for per-CPU events during a context switch, e.g., large PEBS. Otherwise, the perf tool may report samples in locations that do not belong to the process where the samples are processed in, because PEBS does not tag samples with PID/TID.
The current code only flush the buffers for a per-task event. It doesn't check a per-CPU event.
Add a new event state flag, PERF_ATTACH_SCHED_CB, to indicate that the PMU internal buffers have to be flushed for this event during a context switch.
Add sched_cb_entry and perf_sched_cb_usages back to track the PMU/cpuctx which is required to be flushed.
Only need to invoke the sched_task() for per-CPU events in this patch. The per-task events have been handled in perf_event_context_sched_in/out already.
Fixes: 9c964efa4330 ("perf/x86/intel: Drain the PEBS buffer during context switches") Reported-by: Gabriel Marin gmx@google.com Originally-by: Namhyung Kim namhyung@kernel.org Signed-off-by: Kan Liang kan.liang@linux.intel.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lkml.kernel.org/r/20201130193842.10569-1-kan.liang@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/perf_event.h | 2 ++ kernel/events/core.c | 42 ++++++++++++++++++++++++++++++++++---- 2 files changed, 40 insertions(+), 4 deletions(-)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index 9a38f579bc76..419a4d77de00 100644 --- a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@ -606,6 +606,7 @@ struct swevent_hlist { #define PERF_ATTACH_TASK 0x04 #define PERF_ATTACH_TASK_DATA 0x08 #define PERF_ATTACH_ITRACE 0x10 +#define PERF_ATTACH_SCHED_CB 0x20
struct perf_cgroup; struct perf_buffer; @@ -872,6 +873,7 @@ struct perf_cpu_context { struct list_head cgrp_cpuctx_entry; #endif
+ struct list_head sched_cb_entry; int sched_cb_usage;
int online; diff --git a/kernel/events/core.c b/kernel/events/core.c index 55d18791a72d..8425dbc1d239 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -385,6 +385,7 @@ static DEFINE_MUTEX(perf_sched_mutex); static atomic_t perf_sched_count;
static DEFINE_PER_CPU(atomic_t, perf_cgroup_events); +static DEFINE_PER_CPU(int, perf_sched_cb_usages); static DEFINE_PER_CPU(struct pmu_event_list, pmu_sb_events);
static atomic_t nr_mmap_events __read_mostly; @@ -3474,11 +3475,16 @@ static void perf_event_context_sched_out(struct task_struct *task, int ctxn, } }
+static DEFINE_PER_CPU(struct list_head, sched_cb_list); + void perf_sched_cb_dec(struct pmu *pmu) { struct perf_cpu_context *cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
- --cpuctx->sched_cb_usage; + this_cpu_dec(perf_sched_cb_usages); + + if (!--cpuctx->sched_cb_usage) + list_del(&cpuctx->sched_cb_entry); }
@@ -3486,7 +3492,10 @@ void perf_sched_cb_inc(struct pmu *pmu) { struct perf_cpu_context *cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
- cpuctx->sched_cb_usage++; + if (!cpuctx->sched_cb_usage++) + list_add(&cpuctx->sched_cb_entry, this_cpu_ptr(&sched_cb_list)); + + this_cpu_inc(perf_sched_cb_usages); }
/* @@ -3515,6 +3524,24 @@ static void __perf_pmu_sched_task(struct perf_cpu_context *cpuctx, bool sched_in perf_ctx_unlock(cpuctx, cpuctx->task_ctx); }
+static void perf_pmu_sched_task(struct task_struct *prev, + struct task_struct *next, + bool sched_in) +{ + struct perf_cpu_context *cpuctx; + + if (prev == next) + return; + + list_for_each_entry(cpuctx, this_cpu_ptr(&sched_cb_list), sched_cb_entry) { + /* will be handled in perf_event_context_sched_in/out */ + if (cpuctx->task_ctx) + continue; + + __perf_pmu_sched_task(cpuctx, sched_in); + } +} + static void perf_event_switch(struct task_struct *task, struct task_struct *next_prev, bool sched_in);
@@ -3537,6 +3564,9 @@ void __perf_event_task_sched_out(struct task_struct *task, { int ctxn;
+ if (__this_cpu_read(perf_sched_cb_usages)) + perf_pmu_sched_task(task, next, false); + if (atomic_read(&nr_switch_events)) perf_event_switch(task, next, false);
@@ -3845,6 +3875,9 @@ void __perf_event_task_sched_in(struct task_struct *prev,
if (atomic_read(&nr_switch_events)) perf_event_switch(task, prev, true); + + if (__this_cpu_read(perf_sched_cb_usages)) + perf_pmu_sched_task(prev, task, true); }
static u64 perf_calculate_period(struct perf_event *event, u64 nsec, u64 count) @@ -4669,7 +4702,7 @@ static void unaccount_event(struct perf_event *event) if (event->parent) return;
- if (event->attach_state & PERF_ATTACH_TASK) + if (event->attach_state & (PERF_ATTACH_TASK | PERF_ATTACH_SCHED_CB)) dec = true; if (event->attr.mmap || event->attr.mmap_data) atomic_dec(&nr_mmap_events); @@ -11168,7 +11201,7 @@ static void account_event(struct perf_event *event) if (event->parent) return;
- if (event->attach_state & PERF_ATTACH_TASK) + if (event->attach_state & (PERF_ATTACH_TASK | PERF_ATTACH_SCHED_CB)) inc = true; if (event->attr.mmap || event->attr.mmap_data) atomic_inc(&nr_mmap_events); @@ -12960,6 +12993,7 @@ static void __init perf_event_init_all_cpus(void) #ifdef CONFIG_CGROUP_PERF INIT_LIST_HEAD(&per_cpu(cgrp_cpuctx_list, cpu)); #endif + INIT_LIST_HEAD(&per_cpu(sched_cb_list, cpu)); } }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Kan Liang kan.liang@linux.intel.com
[ Upstream commit afbef30149587ad46f4780b1e0cc5e219745ce90 ]
To supply a PID/TID for large PEBS, it requires flushing the PEBS buffer in a context switch.
For normal LBRs, a context switch can flip the address space and LBR entries are not tagged with an identifier, we need to wipe the LBR, even for per-cpu events.
For LBR callstack, save/restore the stack is required during a context switch.
Set PERF_ATTACH_SCHED_CB for the event with large PEBS & LBR.
Fixes: 9c964efa4330 ("perf/x86/intel: Drain the PEBS buffer during context switches") Signed-off-by: Kan Liang kan.liang@linux.intel.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lkml.kernel.org/r/20201130193842.10569-2-kan.liang@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/events/intel/core.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index 4faaef3a8f6c..d3f5cf70c1a0 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -3578,8 +3578,10 @@ static int intel_pmu_hw_config(struct perf_event *event) if (!(event->attr.freq || (event->attr.wakeup_events && !event->attr.watermark))) { event->hw.flags |= PERF_X86_EVENT_AUTO_RELOAD; if (!(event->attr.sample_type & - ~intel_pmu_large_pebs_flags(event))) + ~intel_pmu_large_pebs_flags(event))) { event->hw.flags |= PERF_X86_EVENT_LARGE_PEBS; + event->attach_state |= PERF_ATTACH_SCHED_CB; + } } if (x86_pmu.pebs_aliases) x86_pmu.pebs_aliases(event); @@ -3592,6 +3594,7 @@ static int intel_pmu_hw_config(struct perf_event *event) ret = intel_pmu_setup_lbr_filter(event); if (ret) return ret; + event->attach_state |= PERF_ATTACH_SCHED_CB;
/* * BTS is set up earlier in this path, so don't account twice
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Anna-Maria Behnsen anna-maria@linutronix.de
[ Upstream commit 46eb1701c046cc18c032fa68f3c8ccbf24483ee4 ]
hrtimer_force_reprogram() and hrtimer_interrupt() invokes __hrtimer_get_next_event() to find the earliest expiry time of hrtimer bases. __hrtimer_get_next_event() does not update cpu_base::[softirq_]_expires_next to preserve reprogramming logic. That needs to be done at the callsites.
hrtimer_force_reprogram() updates cpu_base::softirq_expires_next only when the first expiring timer is a softirq timer and the soft interrupt is not activated. That's wrong because cpu_base::softirq_expires_next is left stale when the first expiring timer of all bases is a timer which expires in hard interrupt context. hrtimer_interrupt() does never update cpu_base::softirq_expires_next which is wrong too.
That becomes a problem when clock_settime() sets CLOCK_REALTIME forward and the first soft expiring timer is in the CLOCK_REALTIME_SOFT base. Setting CLOCK_REALTIME forward moves the clock MONOTONIC based expiry time of that timer before the stale cpu_base::softirq_expires_next.
cpu_base::softirq_expires_next is cached to make the check for raising the soft interrupt fast. In the above case the soft interrupt won't be raised until clock monotonic reaches the stale cpu_base::softirq_expires_next value. That's incorrect, but what's worse it that if the softirq timer becomes the first expiring timer of all clock bases after the hard expiry timer has been handled the reprogramming of the clockevent from hrtimer_interrupt() will result in an interrupt storm. That happens because the reprogramming does not use cpu_base::softirq_expires_next, it uses __hrtimer_get_next_event() which returns the actual expiry time. Once clock MONOTONIC reaches cpu_base::softirq_expires_next the soft interrupt is raised and the storm subsides.
Change the logic in hrtimer_force_reprogram() to evaluate the soft and hard bases seperately, update softirq_expires_next and handle the case when a soft expiring timer is the first of all bases by comparing the expiry times and updating the required cpu base fields. Split this functionality into a separate function to be able to use it in hrtimer_interrupt() as well without copy paste.
Fixes: 5da70160462e ("hrtimer: Implement support for softirq based hrtimers") Reported-by: Mikael Beckius mikael.beckius@windriver.com Suggested-by: Thomas Gleixner tglx@linutronix.de Tested-by: Mikael Beckius mikael.beckius@windriver.com Signed-off-by: Anna-Maria Behnsen anna-maria@linutronix.de Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/20210223160240.27518-1-anna-maria@linutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/time/hrtimer.c | 60 ++++++++++++++++++++++++++++--------------- 1 file changed, 39 insertions(+), 21 deletions(-)
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c index 743c852e10f2..788b9d137de4 100644 --- a/kernel/time/hrtimer.c +++ b/kernel/time/hrtimer.c @@ -546,8 +546,11 @@ static ktime_t __hrtimer_next_event_base(struct hrtimer_cpu_base *cpu_base, }
/* - * Recomputes cpu_base::*next_timer and returns the earliest expires_next but - * does not set cpu_base::*expires_next, that is done by hrtimer_reprogram. + * Recomputes cpu_base::*next_timer and returns the earliest expires_next + * but does not set cpu_base::*expires_next, that is done by + * hrtimer[_force]_reprogram and hrtimer_interrupt only. When updating + * cpu_base::*expires_next right away, reprogramming logic would no longer + * work. * * When a softirq is pending, we can ignore the HRTIMER_ACTIVE_SOFT bases, * those timers will get run whenever the softirq gets handled, at the end of @@ -588,6 +591,37 @@ __hrtimer_get_next_event(struct hrtimer_cpu_base *cpu_base, unsigned int active_ return expires_next; }
+static ktime_t hrtimer_update_next_event(struct hrtimer_cpu_base *cpu_base) +{ + ktime_t expires_next, soft = KTIME_MAX; + + /* + * If the soft interrupt has already been activated, ignore the + * soft bases. They will be handled in the already raised soft + * interrupt. + */ + if (!cpu_base->softirq_activated) { + soft = __hrtimer_get_next_event(cpu_base, HRTIMER_ACTIVE_SOFT); + /* + * Update the soft expiry time. clock_settime() might have + * affected it. + */ + cpu_base->softirq_expires_next = soft; + } + + expires_next = __hrtimer_get_next_event(cpu_base, HRTIMER_ACTIVE_HARD); + /* + * If a softirq timer is expiring first, update cpu_base->next_timer + * and program the hardware with the soft expiry time. + */ + if (expires_next > soft) { + cpu_base->next_timer = cpu_base->softirq_next_timer; + expires_next = soft; + } + + return expires_next; +} + static inline ktime_t hrtimer_update_base(struct hrtimer_cpu_base *base) { ktime_t *offs_real = &base->clock_base[HRTIMER_BASE_REALTIME].offset; @@ -628,23 +662,7 @@ hrtimer_force_reprogram(struct hrtimer_cpu_base *cpu_base, int skip_equal) { ktime_t expires_next;
- /* - * Find the current next expiration time. - */ - expires_next = __hrtimer_get_next_event(cpu_base, HRTIMER_ACTIVE_ALL); - - if (cpu_base->next_timer && cpu_base->next_timer->is_soft) { - /* - * When the softirq is activated, hrtimer has to be - * programmed with the first hard hrtimer because soft - * timer interrupt could occur too late. - */ - if (cpu_base->softirq_activated) - expires_next = __hrtimer_get_next_event(cpu_base, - HRTIMER_ACTIVE_HARD); - else - cpu_base->softirq_expires_next = expires_next; - } + expires_next = hrtimer_update_next_event(cpu_base);
if (skip_equal && expires_next == cpu_base->expires_next) return; @@ -1644,8 +1662,8 @@ void hrtimer_interrupt(struct clock_event_device *dev)
__hrtimer_run_queues(cpu_base, now, flags, HRTIMER_ACTIVE_HARD);
- /* Reevaluate the clock bases for the next expiry */ - expires_next = __hrtimer_get_next_event(cpu_base, HRTIMER_ACTIVE_ALL); + /* Reevaluate the clock bases for the [soft] next expiry */ + expires_next = hrtimer_update_next_event(cpu_base); /* * Store the new expiry value so the migration code can verify * against it.
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Daniel Axtens dja@axtens.net
[ Upstream commit c080a173301ffc62cb6c76308c803c7fee05517a ]
Nick's patch cleaning up the SRR specifiers in exception-64s.S missed a single instance of EXC_HV_OR_STD. Clean that up.
Caught by clang's integrated assembler.
Fixes: 3f7fbd97d07d ("powerpc/64s/exception: Clean up SRR specifiers") Signed-off-by: Daniel Axtens dja@axtens.net Acked-by: Nicholas Piggin npiggin@gmail.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20210225031006.1204774-2-dja@axtens.net Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/kernel/exceptions-64s.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 6e53f7638737..de988770a7e4 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -470,7 +470,7 @@ DEFINE_FIXED_SYMBOL(\name()_common_real)
ld r10,PACAKMSR(r13) /* get MSR value for kernel */ /* MSR[RI] is clear iff using SRR regs */ - .if IHSRR == EXC_HV_OR_STD + .if IHSRR_IF_HVMODE BEGIN_FTR_SECTION xori r10,r10,MSR_RI END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Peter Zijlstra peterz@infradead.org
[ Upstream commit 4817a52b306136c8b2b2271d8770401441e4cf79 ]
seqcount_init() must be a macro in order to preserve the static variable that is used for the lockdep key. Don't then wrap it in an inline function, which destroys that.
Luckily there aren't many users of this function, but fix it before it becomes a problem.
Fixes: 80793c3471d9 ("seqlock: Introduce seqcount_latch_t") Reported-by: Eric Dumazet eric.dumazet@gmail.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/YEeFEbNUVkZaXDp4@hirez.programming.kicks-ass.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/seqlock.h | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/include/linux/seqlock.h b/include/linux/seqlock.h index 2f7bb92b4c9e..f61e34fbaaea 100644 --- a/include/linux/seqlock.h +++ b/include/linux/seqlock.h @@ -664,10 +664,7 @@ typedef struct { * seqcount_latch_init() - runtime initializer for seqcount_latch_t * @s: Pointer to the seqcount_latch_t instance */ -static inline void seqcount_latch_init(seqcount_latch_t *s) -{ - seqcount_init(&s->seqcount); -} +#define seqcount_latch_init(s) seqcount_init(&(s)->seqcount)
/** * raw_read_seqcount_latch() - pick even/odd latch data copy
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 34dc2efb39a231280fd6696a59bbe712bf3c5c4a ]
The inlining logic in clang-13 is rewritten to often not inline some functions that were inlined by all earlier compilers.
In case of the memblock interfaces, this exposed a harmless bug of a missing __init annotation:
WARNING: modpost: vmlinux.o(.text+0x507c0a): Section mismatch in reference from the function memblock_bottom_up() to the variable .meminit.data:memblock The function memblock_bottom_up() references the variable __meminitdata memblock. This is often because memblock_bottom_up lacks a __meminitdata annotation or the annotation of memblock is wrong.
Interestingly, these annotations were present originally, but got removed with the explanation that the __init annotation prevents the function from getting inlined. I checked this again and found that while this is the case with clang, gcc (version 7 through 10, did not test others) does inline the functions regardless.
As the previous change was apparently intended to help the clang builds, reverting it to help the newer clang versions seems appropriate as well. gcc builds don't seem to care either way.
Link: https://lkml.kernel.org/r/20210225133808.2188581-1-arnd@kernel.org Fixes: 5bdba520c1b3 ("mm: memblock: drop __init from memblock functions to make it inline") Reference: 2cfb3665e864 ("include/linux/memblock.h: add __init to memblock_set_bottom_up()") Signed-off-by: Arnd Bergmann arnd@arndb.de Reviewed-by: David Hildenbrand david@redhat.com Reviewed-by: Mike Rapoport rppt@linux.ibm.com Cc: Nathan Chancellor nathan@kernel.org Cc: Nick Desaulniers ndesaulniers@google.com Cc: Faiyaz Mohammed faiyazm@codeaurora.org Cc: Baoquan He bhe@redhat.com Cc: Thomas Bogendoerfer tsbogend@alpha.franken.de Cc: Aslan Bakirov aslan@fb.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/memblock.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/memblock.h b/include/linux/memblock.h index b93c44b9121e..7643d2dfa959 100644 --- a/include/linux/memblock.h +++ b/include/linux/memblock.h @@ -460,7 +460,7 @@ static inline void memblock_free_late(phys_addr_t base, phys_addr_t size) /* * Set the allocation direction to bottom-up or top-down. */ -static inline void memblock_set_bottom_up(bool enable) +static inline __init void memblock_set_bottom_up(bool enable) { memblock.bottom_up = enable; } @@ -470,7 +470,7 @@ static inline void memblock_set_bottom_up(bool enable) * if this is true, that said, memblock will allocate memory * in bottom-up direction. */ -static inline bool memblock_bottom_up(void) +static inline __init bool memblock_bottom_up(void) { return memblock.bottom_up; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit cbf78d85079cee662c45749ef4f744d41be85d48 ]
With clang-13, some functions only get partially inlined, with a specialized version referring to a global variable. This triggers a harmless build-time check for the intel-rng driver:
WARNING: modpost: drivers/char/hw_random/intel-rng.o(.text+0xe): Section mismatch in reference from the function stop_machine() to the function .init.text:intel_rng_hw_init() The function stop_machine() references the function __init intel_rng_hw_init(). This is often because stop_machine lacks a __init annotation or the annotation of intel_rng_hw_init is wrong.
In this instance, an easy workaround is to force the stop_machine() function to be inline, along with related interfaces that did not show the same behavior at the moment, but theoretically could.
The combination of the two patches listed below triggers the behavior in clang-13, but individually these commits are correct.
Link: https://lkml.kernel.org/r/20210225130153.1956990-1-arnd@kernel.org Fixes: fe5595c07400 ("stop_machine: Provide stop_machine_cpuslocked()") Fixes: ee527cd3a20c ("Use stop_machine_run in the Intel RNG driver") Signed-off-by: Arnd Bergmann arnd@arndb.de Cc: Nathan Chancellor nathan@kernel.org Cc: Nick Desaulniers ndesaulniers@google.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Sebastian Andrzej Siewior bigeasy@linutronix.de Cc: "Paul E. McKenney" paulmck@kernel.org Cc: Ingo Molnar mingo@kernel.org Cc: Prarit Bhargava prarit@redhat.com Cc: Daniel Bristot de Oliveira bristot@redhat.com Cc: Peter Zijlstra peterz@infradead.org Cc: Valentin Schneider valentin.schneider@arm.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/stop_machine.h | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/include/linux/stop_machine.h b/include/linux/stop_machine.h index 30577c3aecf8..46fb3ebdd16e 100644 --- a/include/linux/stop_machine.h +++ b/include/linux/stop_machine.h @@ -128,7 +128,7 @@ int stop_machine_from_inactive_cpu(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus); #else /* CONFIG_SMP || CONFIG_HOTPLUG_CPU */
-static inline int stop_machine_cpuslocked(cpu_stop_fn_t fn, void *data, +static __always_inline int stop_machine_cpuslocked(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus) { unsigned long flags; @@ -139,14 +139,15 @@ static inline int stop_machine_cpuslocked(cpu_stop_fn_t fn, void *data, return ret; }
-static inline int stop_machine(cpu_stop_fn_t fn, void *data, - const struct cpumask *cpus) +static __always_inline int +stop_machine(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus) { return stop_machine_cpuslocked(fn, data, cpus); }
-static inline int stop_machine_from_inactive_cpu(cpu_stop_fn_t fn, void *data, - const struct cpumask *cpus) +static __always_inline int +stop_machine_from_inactive_cpu(cpu_stop_fn_t fn, void *data, + const struct cpumask *cpus) { return stop_machine(fn, data, cpus); }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Matthew Wilcox (Oracle) willy@infradead.org
[ Upstream commit 149fc787353f65b7e72e05e7b75d34863266c3e2 ]
Fix a sparse warning by using rcu_dereference(). Technically this is a bug and a sufficiently aggressive compiler could reload the `real_parent' pointer outside the protection of the rcu lock (and access freed memory), but I think it's pretty unlikely to happen.
Link: https://lkml.kernel.org/r/20210221194207.1351703-1-willy@infradead.org Fixes: b18dc5f291c0 ("mm, oom: skip vforked tasks from being selected") Signed-off-by: Matthew Wilcox (Oracle) willy@infradead.org Reviewed-by: Miaohe Lin linmiaohe@huawei.com Acked-by: Michal Hocko mhocko@suse.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/sched/mm.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h index 1ae08b8462a4..90b2a0bce11c 100644 --- a/include/linux/sched/mm.h +++ b/include/linux/sched/mm.h @@ -140,7 +140,8 @@ static inline bool in_vfork(struct task_struct *tsk) * another oom-unkillable task does this it should blame itself. */ rcu_read_lock(); - ret = tsk->vfork_done && tsk->real_parent->mm == tsk->mm; + ret = tsk->vfork_done && + rcu_dereference(tsk->real_parent)->mm == tsk->mm; rcu_read_unlock();
return ret;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Alexey Dobriyan adobriyan@gmail.com
[ Upstream commit c995f12ad8842dbf5cfed113fb52cdd083f5afd1 ]
Doing a
prctl(PR_SET_MM, PR_SET_MM_AUXV, addr, 1);
will copy 1 byte from userspace to (quite big) on-stack array and then stash everything to mm->saved_auxv. AT_NULL terminator will be inserted at the very end.
/proc/*/auxv handler will find that AT_NULL terminator and copy original stack contents to userspace.
This devious scheme requires CAP_SYS_RESOURCE.
Signed-off-by: Alexey Dobriyan adobriyan@gmail.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sys.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sys.c b/kernel/sys.c index 51f00fe20e4d..7cf21c947649 100644 --- a/kernel/sys.c +++ b/kernel/sys.c @@ -2080,7 +2080,7 @@ static int prctl_set_auxv(struct mm_struct *mm, unsigned long addr, * up to the caller to provide sane values here, otherwise userspace * tools which use this vector might be unhappy. */ - unsigned long user_auxv[AT_VECTOR_SIZE]; + unsigned long user_auxv[AT_VECTOR_SIZE] = {};
if (len > sizeof(user_auxv)) return -EINVAL;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Minchan Kim minchan@kernel.org
commit 57e0076e6575a7b7cef620a0bd2ee2549ef77818 upstream.
writeback_store's return value is overwritten by submit_bio_wait's return value. Thus, writeback_store will return zero since there was no IO error. In the end, write syscall from userspace will see the zero as return value, which could make the process stall to keep trying the write until it will succeed.
Link: https://lkml.kernel.org/r/20210312173949.2197662-1-minchan@kernel.org Fixes: 3b82a051c101("drivers/block/zram/zram_drv.c: fix error return codes not being returned in writeback_store") Signed-off-by: Minchan Kim minchan@kernel.org Cc: Sergey Senozhatsky sergey.senozhatsky@gmail.com Cc: Colin Ian King colin.king@canonical.com Cc: John Dias joaodias@google.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/block/zram/zram_drv.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
--- a/drivers/block/zram/zram_drv.c +++ b/drivers/block/zram/zram_drv.c @@ -628,7 +628,7 @@ static ssize_t writeback_store(struct de struct bio_vec bio_vec; struct page *page; ssize_t ret = len; - int mode; + int mode, err; unsigned long blk_idx = 0;
if (sysfs_streq(buf, "idle")) @@ -729,12 +729,17 @@ static ssize_t writeback_store(struct de * XXX: A single page IO would be inefficient for write * but it would be not bad as starter. */ - ret = submit_bio_wait(&bio); - if (ret) { + err = submit_bio_wait(&bio); + if (err) { zram_slot_lock(zram, index); zram_clear_flag(zram, index, ZRAM_UNDER_WB); zram_clear_flag(zram, index, ZRAM_IDLE); zram_slot_unlock(zram, index); + /* + * Return last IO error unless every IO were + * not suceeded. + */ + ret = err; continue; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Minchan Kim minchan@kernel.org
commit 2766f1821600cc7562bae2128ad0b163f744c5d9 upstream.
commit 0d8359620d9b ("zram: support page writeback") introduced two problems. It overwrites writeback_store's return value as kstrtol's return value, which makes return value zero so user could see zero as return value of write syscall even though it wrote data successfully.
It also breaks index value in the loop in that it doesn't increase the index any longer. It means it can write only first starting block index so user couldn't write all idle pages in the zram so lose memory saving chance.
This patch fixes those issues.
Link: https://lkml.kernel.org/r/20210312173949.2197662-2-minchan@kernel.org Fixes: 0d8359620d9b("zram: support page writeback") Signed-off-by: Minchan Kim minchan@kernel.org Reported-by: Amos Bianchi amosbianchi@google.com Cc: Sergey Senozhatsky sergey.senozhatsky@gmail.com Cc: John Dias joaodias@google.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/block/zram/zram_drv.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/block/zram/zram_drv.c +++ b/drivers/block/zram/zram_drv.c @@ -639,8 +639,8 @@ static ssize_t writeback_store(struct de if (strncmp(buf, PAGE_WB_SIG, sizeof(PAGE_WB_SIG) - 1)) return -EINVAL;
- ret = kstrtol(buf + sizeof(PAGE_WB_SIG) - 1, 10, &index); - if (ret || index >= nr_pages) + if (kstrtol(buf + sizeof(PAGE_WB_SIG) - 1, 10, &index) || + index >= nr_pages) return -EINVAL;
nr_pages = 1; @@ -664,7 +664,7 @@ static ssize_t writeback_store(struct de goto release_init_lock; }
- while (nr_pages--) { + for (; nr_pages != 0; index++, nr_pages--) { struct bio_vec bvec;
bvec.bv_page = page;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Arnd Bergmann arnd@arndb.de
commit 97e4910232fa1f81e806aa60c25a0450276d99a2 upstream.
Separating compiler-clang.h from compiler-gcc.h inadventently dropped the definitions of the three HAVE_BUILTIN_BSWAP macros, which requires falling back to the open-coded version and hoping that the compiler detects it.
Since all versions of clang support the __builtin_bswap interfaces, add back the flags and have the headers pick these up automatically.
This results in a 4% improvement of compilation speed for arm defconfig.
Note: it might also be worth revisiting which architectures set CONFIG_ARCH_USE_BUILTIN_BSWAP for one compiler or the other, today this is set on six architectures (arm32, csky, mips, powerpc, s390, x86), while another ten architectures define custom helpers (alpha, arc, ia64, m68k, mips, nios2, parisc, sh, sparc, xtensa), and the rest (arm64, h8300, hexagon, microblaze, nds32, openrisc, riscv) just get the unoptimized version and rely on the compiler to detect it.
A long time ago, the compiler builtins were architecture specific, but nowadays, all compilers that are able to build the kernel have correct implementations of them, though some may not be as optimized as the inline asm versions.
The patch that dropped the optimization landed in v4.19, so as discussed it would be fairly safe to backport this revert to stable kernels to the 4.19/5.4/5.10 stable kernels, but there is a remaining risk for regressions, and it has no known side-effects besides compile speed.
Link: https://lkml.kernel.org/r/20210226161151.2629097-1-arnd@kernel.org Link: https://lore.kernel.org/lkml/20210225164513.3667778-1-arnd@kernel.org/ Fixes: 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive") Signed-off-by: Arnd Bergmann arnd@arndb.de Reviewed-by: Nathan Chancellor nathan@kernel.org Reviewed-by: Kees Cook keescook@chromium.org Acked-by: Miguel Ojeda ojeda@kernel.org Acked-by: Nick Desaulniers ndesaulniers@google.com Acked-by: Luc Van Oostenryck luc.vanoostenryck@gmail.com Cc: Masahiro Yamada masahiroy@kernel.org Cc: Nick Hu nickhu@andestech.com Cc: Greentime Hu green.hu@gmail.com Cc: Vincent Chen deanbo422@gmail.com Cc: Paul Walmsley paul.walmsley@sifive.com Cc: Palmer Dabbelt palmer@dabbelt.com Cc: Albert Ou aou@eecs.berkeley.edu Cc: Guo Ren guoren@kernel.org Cc: Randy Dunlap rdunlap@infradead.org Cc: Sami Tolvanen samitolvanen@google.com Cc: Marco Elver elver@google.com Cc: Arvind Sankar nivedita@alum.mit.edu Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/compiler-clang.h | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/include/linux/compiler-clang.h +++ b/include/linux/compiler-clang.h @@ -41,6 +41,12 @@ #define __no_sanitize_thread #endif
+#if defined(CONFIG_ARCH_USE_BUILTIN_BSWAP) +#define __HAVE_BUILTIN_BSWAP32__ +#define __HAVE_BUILTIN_BSWAP64__ +#define __HAVE_BUILTIN_BSWAP16__ +#endif /* CONFIG_ARCH_USE_BUILTIN_BSWAP */ + #if __has_feature(undefined_behavior_sanitizer) /* GCC does not have __SANITIZE_UNDEFINED__ */ #define __no_sanitize_undefined \
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Peter Zijlstra peterz@infradead.org
commit 8a6edb5257e2a84720fe78cb179eca58ba76126f upstream.
When affine_move_task(p) is called on a running task @p, which is not otherwise already changing affinity, we'll first set p->migration_pending and then do:
stop_one_cpu(cpu_of_rq(rq), migration_cpu_stop, &arg);
This then gets us to migration_cpu_stop() running on the CPU that was previously running our victim task @p.
If we find that our task is no longer on that runqueue (this can happen because of a concurrent migration due to load-balance etc.), then we'll end up at the:
} else if (dest_cpu < 1 || pending) {
branch. Which we'll take because we set pending earlier. Here we first check if the task @p has already satisfied the affinity constraints, if so we bail early [A]. Otherwise we'll reissue migration_cpu_stop() onto the CPU that is now hosting our task @p:
stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop, &pending->arg, &pending->stop_work);
Except, we've never initialized pending->arg, which will be all 0s.
This then results in running migration_cpu_stop() on the next CPU with arg->p == NULL, which gives the by now obvious result of fireworks.
The cure is to change affine_move_task() to always use pending->arg, furthermore we can use the exact same pattern as the SCA_MIGRATE_ENABLE case, since we'll block on the pending->done completion anyway, no point in adding yet another completion in stop_one_cpu().
This then gives a clear distinction between the two migration_cpu_stop() use cases:
- sched_exec() / migrate_task_to() : arg->pending == NULL - affine_move_task() : arg->pending != NULL;
And we can have it ignore p->migration_pending when !arg->pending. Any stop work from sched_exec() / migrate_task_to() is in addition to stop works from affine_move_task(), which will be sufficient to issue the completion.
Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()") Cc: stable@kernel.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Ingo Molnar mingo@kernel.org Reviewed-by: Valentin Schneider valentin.schneider@arm.com Link: https://lkml.kernel.org/r/20210224131355.357743989@infradead.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/core.c | 39 ++++++++++++++++++++++++++++----------- 1 file changed, 28 insertions(+), 11 deletions(-)
--- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -1922,6 +1922,24 @@ static int migration_cpu_stop(void *data rq_lock(rq, &rf);
pending = p->migration_pending; + if (pending && !arg->pending) { + /* + * This happens from sched_exec() and migrate_task_to(), + * neither of them care about pending and just want a task to + * maybe move about. + * + * Even if there is a pending, we can ignore it, since + * affine_move_task() will have it's own stop_work's in flight + * which will manage the completion. + * + * Notably, pending doesn't need to match arg->pending. This can + * happen when tripple concurrent affine_move_task() first sets + * pending, then clears pending and eventually sets another + * pending. + */ + pending = NULL; + } + /* * If task_rq(p) != rq, it cannot be migrated here, because we're * holding rq->lock, if p->on_rq == 0 it cannot get enqueued because @@ -2194,10 +2212,6 @@ static int affine_move_task(struct rq *r int dest_cpu, unsigned int flags) { struct set_affinity_pending my_pending = { }, *pending = NULL; - struct migration_arg arg = { - .task = p, - .dest_cpu = dest_cpu, - }; bool complete = false;
/* Can the task run on the task's current CPU? If so, we're done */ @@ -2235,6 +2249,12 @@ static int affine_move_task(struct rq *r /* Install the request */ refcount_set(&my_pending.refs, 1); init_completion(&my_pending.done); + my_pending.arg = (struct migration_arg) { + .task = p, + .dest_cpu = -1, /* any */ + .pending = &my_pending, + }; + p->migration_pending = &my_pending; } else { pending = p->migration_pending; @@ -2265,12 +2285,6 @@ static int affine_move_task(struct rq *r p->migration_flags &= ~MDF_PUSH; task_rq_unlock(rq, p, rf);
- pending->arg = (struct migration_arg) { - .task = p, - .dest_cpu = -1, - .pending = pending, - }; - stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop, &pending->arg, &pending->stop_work);
@@ -2283,8 +2297,11 @@ static int affine_move_task(struct rq *r * is_migration_disabled(p) checks to the stopper, which will * run on the same CPU as said p. */ + refcount_inc(&pending->refs); /* pending->{arg,stop_work} */ task_rq_unlock(rq, p, rf); - stop_one_cpu(cpu_of(rq), migration_cpu_stop, &arg); + + stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop, + &pending->arg, &pending->stop_work);
} else {
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Mathieu Desnoyers mathieu.desnoyers@efficios.com
commit ce29ddc47b91f97e7f69a0fb7cbb5845f52a9825 upstream.
The function sync_runqueues_membarrier_state() should copy the membarrier state from the @mm received as parameter to each runqueue currently running tasks using that mm.
However, the use of smp_call_function_many() skips the current runqueue, which is unintended. Replace by a call to on_each_cpu_mask().
Fixes: 227a4aadc75b ("sched/membarrier: Fix p->mm->membarrier_state racy load") Reported-by: Nadav Amit nadav.amit@gmail.com Signed-off-by: Mathieu Desnoyers mathieu.desnoyers@efficios.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Ingo Molnar mingo@kernel.org Cc: stable@vger.kernel.org # 5.4.x+ Link: https://lore.kernel.org/r/74F1E842-4A84-47BF-B6C2-5407DFDD4A4A@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/membarrier.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/kernel/sched/membarrier.c +++ b/kernel/sched/membarrier.c @@ -471,9 +471,7 @@ static int sync_runqueues_membarrier_sta } rcu_read_unlock();
- preempt_disable(); - smp_call_function_many(tmpmask, ipi_sync_rq_state, mm, 1); - preempt_enable(); + on_each_cpu_mask(tmpmask, ipi_sync_rq_state, mm, true);
free_cpumask_var(tmpmask); cpus_read_unlock();
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Peter Zijlstra peterz@infradead.org
commit 58b1a45086b5f80f2b2842aa7ed0da51a64a302b upstream.
The SCA_MIGRATE_ENABLE and task_running() cases are almost identical, collapse them to avoid further duplication.
Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()") Cc: stable@kernel.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Ingo Molnar mingo@kernel.org Reviewed-by: Valentin Schneider valentin.schneider@arm.com Link: https://lkml.kernel.org/r/20210224131355.500108964@infradead.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/core.c | 23 ++++++++--------------- 1 file changed, 8 insertions(+), 15 deletions(-)
--- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -2279,30 +2279,23 @@ static int affine_move_task(struct rq *r return -EINVAL; }
- if (flags & SCA_MIGRATE_ENABLE) { - - refcount_inc(&pending->refs); /* pending->{arg,stop_work} */ - p->migration_flags &= ~MDF_PUSH; - task_rq_unlock(rq, p, rf); - - stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop, - &pending->arg, &pending->stop_work); - - return 0; - } - if (task_running(rq, p) || p->state == TASK_WAKING) { /* - * Lessen races (and headaches) by delegating - * is_migration_disabled(p) checks to the stopper, which will - * run on the same CPU as said p. + * MIGRATE_ENABLE gets here because 'p == current', but for + * anything else we cannot do is_migration_disabled(), punt + * and have the stopper function handle it all race-free. */ + refcount_inc(&pending->refs); /* pending->{arg,stop_work} */ + if (flags & SCA_MIGRATE_ENABLE) + p->migration_flags &= ~MDF_PUSH; task_rq_unlock(rq, p, rf);
stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop, &pending->arg, &pending->stop_work);
+ if (flags & SCA_MIGRATE_ENABLE) + return 0; } else {
if (!is_migration_disabled(p)) {
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Peter Zijlstra peterz@infradead.org
commit c20cf065d4a619d394d23290093b1002e27dff86 upstream.
When affine_move_task() issues a migration_cpu_stop(), the purpose of that function is to complete that @pending, not any random other p->migration_pending that might have gotten installed since.
This realization much simplifies migration_cpu_stop() and allows further necessary steps to fix all this as it provides the guarantee that @pending's stopper will complete @pending (and not some random other @pending).
Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()") Cc: stable@kernel.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Ingo Molnar mingo@kernel.org Reviewed-by: Valentin Schneider valentin.schneider@arm.com Link: https://lkml.kernel.org/r/20210224131355.430014682@infradead.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/core.c | 56 +++++++--------------------------------------------- 1 file changed, 8 insertions(+), 48 deletions(-)
--- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -1898,8 +1898,8 @@ static struct rq *__migrate_task(struct */ static int migration_cpu_stop(void *data) { - struct set_affinity_pending *pending; struct migration_arg *arg = data; + struct set_affinity_pending *pending = arg->pending; struct task_struct *p = arg->task; int dest_cpu = arg->dest_cpu; struct rq *rq = this_rq(); @@ -1921,25 +1921,6 @@ static int migration_cpu_stop(void *data raw_spin_lock(&p->pi_lock); rq_lock(rq, &rf);
- pending = p->migration_pending; - if (pending && !arg->pending) { - /* - * This happens from sched_exec() and migrate_task_to(), - * neither of them care about pending and just want a task to - * maybe move about. - * - * Even if there is a pending, we can ignore it, since - * affine_move_task() will have it's own stop_work's in flight - * which will manage the completion. - * - * Notably, pending doesn't need to match arg->pending. This can - * happen when tripple concurrent affine_move_task() first sets - * pending, then clears pending and eventually sets another - * pending. - */ - pending = NULL; - } - /* * If task_rq(p) != rq, it cannot be migrated here, because we're * holding rq->lock, if p->on_rq == 0 it cannot get enqueued because @@ -1950,31 +1931,20 @@ static int migration_cpu_stop(void *data goto out;
if (pending) { - p->migration_pending = NULL; + if (p->migration_pending == pending) + p->migration_pending = NULL; complete = true; }
- /* migrate_enable() -- we must not race against SCA */ - if (dest_cpu < 0) { - /* - * When this was migrate_enable() but we no longer - * have a @pending, a concurrent SCA 'fixed' things - * and we should be valid again. Nothing to do. - */ - if (!pending) { - WARN_ON_ONCE(!cpumask_test_cpu(task_cpu(p), &p->cpus_mask)); - goto out; - } - + if (dest_cpu < 0) dest_cpu = cpumask_any_distribute(&p->cpus_mask); - }
if (task_on_rq_queued(p)) rq = __migrate_task(rq, &rf, p, dest_cpu); else p->wake_cpu = dest_cpu;
- } else if (dest_cpu < 0 || pending) { + } else if (pending) { /* * This happens when we get migrated between migrate_enable()'s * preempt_enable() and scheduling the stopper task. At that @@ -1989,23 +1959,14 @@ static int migration_cpu_stop(void *data * ->pi_lock, so the allowed mask is stable - if it got * somewhere allowed, we're done. */ - if (pending && cpumask_test_cpu(task_cpu(p), p->cpus_ptr)) { - p->migration_pending = NULL; + if (cpumask_test_cpu(task_cpu(p), p->cpus_ptr)) { + if (p->migration_pending == pending) + p->migration_pending = NULL; complete = true; goto out; }
/* - * When this was migrate_enable() but we no longer have an - * @pending, a concurrent SCA 'fixed' things and we should be - * valid again. Nothing to do. - */ - if (!pending) { - WARN_ON_ONCE(!cpumask_test_cpu(task_cpu(p), &p->cpus_mask)); - goto out; - } - - /* * When migrate_enable() hits a rq mis-match we can't reliably * determine is_migration_disabled() and so have to chase after * it. @@ -2022,7 +1983,6 @@ out: complete_all(&pending->done);
/* For pending->{arg,stop_work} */ - pending = arg->pending; if (pending && refcount_dec_and_test(&pending->refs)) wake_up_var(&pending->refs);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Peter Zijlstra peterz@infradead.org
commit 3f1bc119cd7fc987c8ed25ffb717f99403bb308c upstream.
When the purpose of migration_cpu_stop() is to migrate the task to 'any' valid CPU, don't migrate the task when it's already running on a valid CPU.
Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()") Cc: stable@kernel.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Ingo Molnar mingo@kernel.org Reviewed-by: Valentin Schneider valentin.schneider@arm.com Link: https://lkml.kernel.org/r/20210224131355.569238629@infradead.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/core.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)
--- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -1936,14 +1936,25 @@ static int migration_cpu_stop(void *data complete = true; }
- if (dest_cpu < 0) + if (dest_cpu < 0) { + if (cpumask_test_cpu(task_cpu(p), &p->cpus_mask)) + goto out; + dest_cpu = cpumask_any_distribute(&p->cpus_mask); + }
if (task_on_rq_queued(p)) rq = __migrate_task(rq, &rf, p, dest_cpu); else p->wake_cpu = dest_cpu;
+ /* + * XXX __migrate_task() can fail, at which point we might end + * up running on a dodgy CPU, AFAICT this can only happen + * during CPU hotplug, at which point we'll get pushed out + * anyway, so it's probably not a big deal. + */ + } else if (pending) { /* * This happens when we get migrated between migrate_enable()'s
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Peter Zijlstra peterz@infradead.org
commit 9e81889c7648d48dd5fe13f41cbc99f3c362484a upstream.
Consider:
sched_setaffinity(p, X); sched_setaffinity(p, Y);
Then the first will install p->migration_pending = &my_pending; and issue stop_one_cpu_nowait(pending); and the second one will read p->migration_pending and _also_ issue: stop_one_cpu_nowait(pending), the _SAME_ @pending.
This causes stopper list corruption.
Add set_affinity_pending::stop_pending, to indicate if a stopper is in progress.
Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()") Cc: stable@kernel.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Ingo Molnar mingo@kernel.org Reviewed-by: Valentin Schneider valentin.schneider@arm.com Link: https://lkml.kernel.org/r/20210224131355.649146419@infradead.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/core.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-)
--- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -1864,6 +1864,7 @@ struct migration_arg {
struct set_affinity_pending { refcount_t refs; + unsigned int stop_pending; struct completion done; struct cpu_stop_work stop_work; struct migration_arg arg; @@ -1982,12 +1983,15 @@ static int migration_cpu_stop(void *data * determine is_migration_disabled() and so have to chase after * it. */ + WARN_ON_ONCE(!pending->stop_pending); task_rq_unlock(rq, p, &rf); stop_one_cpu_nowait(task_cpu(p), migration_cpu_stop, &pending->arg, &pending->stop_work); return 0; } out: + if (pending) + pending->stop_pending = false; task_rq_unlock(rq, p, &rf);
if (complete) @@ -2183,7 +2187,7 @@ static int affine_move_task(struct rq *r int dest_cpu, unsigned int flags) { struct set_affinity_pending my_pending = { }, *pending = NULL; - bool complete = false; + bool stop_pending, complete = false;
/* Can the task run on the task's current CPU? If so, we're done */ if (cpumask_test_cpu(task_cpu(p), &p->cpus_mask)) { @@ -2256,14 +2260,19 @@ static int affine_move_task(struct rq *r * anything else we cannot do is_migration_disabled(), punt * and have the stopper function handle it all race-free. */ + stop_pending = pending->stop_pending; + if (!stop_pending) + pending->stop_pending = true;
refcount_inc(&pending->refs); /* pending->{arg,stop_work} */ if (flags & SCA_MIGRATE_ENABLE) p->migration_flags &= ~MDF_PUSH; task_rq_unlock(rq, p, rf);
- stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop, - &pending->arg, &pending->stop_work); + if (!stop_pending) { + stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop, + &pending->arg, &pending->stop_work); + }
if (flags & SCA_MIGRATE_ENABLE) return 0;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Peter Zijlstra peterz@infradead.org
commit 50caf9c14b1498c90cf808dbba2ca29bd32ccba4 upstream.
Now that we have set_affinity_pending::stop_pending to indicate if a stopper is in progress, and we have the guarantee that if that stopper exists, it will (eventually) complete our @pending we can simplify the refcount scheme by no longer counting the stopper thread.
Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()") Cc: stable@kernel.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Ingo Molnar mingo@kernel.org Reviewed-by: Valentin Schneider valentin.schneider@arm.com Link: https://lkml.kernel.org/r/20210224131355.724130207@infradead.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/core.c | 32 ++++++++++++++++++++------------ 1 file changed, 20 insertions(+), 12 deletions(-)
--- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -1862,6 +1862,10 @@ struct migration_arg { struct set_affinity_pending *pending; };
+/* + * @refs: number of wait_for_completion() + * @stop_pending: is @stop_work in use + */ struct set_affinity_pending { refcount_t refs; unsigned int stop_pending; @@ -1997,10 +2001,6 @@ out: if (complete) complete_all(&pending->done);
- /* For pending->{arg,stop_work} */ - if (pending && refcount_dec_and_test(&pending->refs)) - wake_up_var(&pending->refs); - return 0; }
@@ -2199,12 +2199,16 @@ static int affine_move_task(struct rq *r push_task = get_task_struct(p); }
+ /* + * If there are pending waiters, but no pending stop_work, + * then complete now. + */ pending = p->migration_pending; - if (pending) { - refcount_inc(&pending->refs); + if (pending && !pending->stop_pending) { p->migration_pending = NULL; complete = true; } + task_rq_unlock(rq, p, rf);
if (push_task) { @@ -2213,7 +2217,7 @@ static int affine_move_task(struct rq *r }
if (complete) - goto do_complete; + complete_all(&pending->done);
return 0; } @@ -2264,9 +2268,9 @@ static int affine_move_task(struct rq *r if (!stop_pending) pending->stop_pending = true;
- refcount_inc(&pending->refs); /* pending->{arg,stop_work} */ if (flags & SCA_MIGRATE_ENABLE) p->migration_flags &= ~MDF_PUSH; + task_rq_unlock(rq, p, rf);
if (!stop_pending) { @@ -2282,12 +2286,13 @@ static int affine_move_task(struct rq *r if (task_on_rq_queued(p)) rq = move_queued_task(rq, rf, p, dest_cpu);
- p->migration_pending = NULL; - complete = true; + if (!pending->stop_pending) { + p->migration_pending = NULL; + complete = true; + } } task_rq_unlock(rq, p, rf);
-do_complete: if (complete) complete_all(&pending->done); } @@ -2295,7 +2300,7 @@ do_complete: wait_for_completion(&pending->done);
if (refcount_dec_and_test(&pending->refs)) - wake_up_var(&pending->refs); + wake_up_var(&pending->refs); /* No UaF, just an address */
/* * Block the original owner of &pending until all subsequent callers @@ -2303,6 +2308,9 @@ do_complete: */ wait_var_event(&my_pending.refs, !refcount_read(&my_pending.refs));
+ /* ARGH */ + WARN_ON_ONCE(my_pending.stop_pending); + return 0; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Ard Biesheuvel ardb@kernel.org
commit 9e9888a0fe97b9501a40f717225d2bef7100a2c1 upstream.
The EFI_RT_PROPERTIES_TABLE contains a mask of runtime services that are available after ExitBootServices(). This mostly does not concern the EFI stub at all, given that it runs before that. However, there is one call that is made at runtime, which is the call to SetVirtualAddressMap() (which is not even callable at boot time to begin with)
So add the missing handling of the RT_PROP table to ensure that we only call SetVirtualAddressMap() if it is not being advertised as unsupported by the firmware.
Cc: stable@vger.kernel.org # v5.10+ Tested-by: Shawn Guo shawn.guo@linaro.org Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/firmware/efi/libstub/efi-stub.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
--- a/drivers/firmware/efi/libstub/efi-stub.c +++ b/drivers/firmware/efi/libstub/efi-stub.c @@ -96,6 +96,18 @@ static void install_memreserve_table(voi efi_err("Failed to install memreserve config table!\n"); }
+static u32 get_supported_rt_services(void) +{ + const efi_rt_properties_table_t *rt_prop_table; + u32 supported = EFI_RT_SUPPORTED_ALL; + + rt_prop_table = get_efi_config_table(EFI_RT_PROPERTIES_TABLE_GUID); + if (rt_prop_table) + supported &= rt_prop_table->runtime_services_supported; + + return supported; +} + /* * EFI entry point for the arm/arm64 EFI stubs. This is the entrypoint * that is described in the PE/COFF header. Most of the code is the same @@ -250,6 +262,10 @@ efi_status_t __efiapi efi_pe_entry(efi_h (prop_tbl->memory_protection_attribute & EFI_PROPERTIES_RUNTIME_MEMORY_PROTECTION_NON_EXECUTABLE_PE_DATA);
+ /* force efi_novamap if SetVirtualAddressMap() is unsupported */ + efi_novamap |= !(get_supported_rt_services() & + EFI_RT_SUPPORTED_SET_VIRTUAL_ADDRESS_MAP); + /* hibernation expects the runtime regions to stay in the same place */ if (!IS_ENABLED(CONFIG_HIBERNATION) && !efi_nokaslr && !flat_va_mapping) { /*
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Naveen N. Rao naveen.n.rao@linux.vnet.ibm.com
commit cea15316ceee2d4a51dfdecd79e08a438135416c upstream.
'lis r2,N' is 'addis r2,0,N' and the instruction encoding in the macro LIS_R2 is incorrect (it currently maps to 'addis r0,r2,N'). Fix the same.
Fixes: c71b7eff426f ("powerpc: Add ABIv2 support to ppc_function_entry") Cc: stable@vger.kernel.org # v3.16+ Reported-by: Jiri Olsa jolsa@redhat.com Signed-off-by: Naveen N. Rao naveen.n.rao@linux.vnet.ibm.com Acked-by: Segher Boessenkool segher@kernel.crashing.org Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20210304020411.16796-1-naveen.n.rao@linux.vnet.ibm... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/include/asm/code-patching.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/powerpc/include/asm/code-patching.h +++ b/arch/powerpc/include/asm/code-patching.h @@ -73,7 +73,7 @@ void __patch_exception(int exc, unsigned #endif
#define OP_RT_RA_MASK 0xffff0000UL -#define LIS_R2 0x3c020000UL +#define LIS_R2 0x3c400000UL #define ADDIS_R2_R12 0x3c4c0000UL #define ADDI_R2_R2 0x38420000UL
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Nicholas Piggin npiggin@gmail.com
commit 73ac79881804eed2e9d76ecdd1018037f8510cb1 upstream.
This bit operation was inverted and set the low bit rather than cleared it, breaking the ability to ptrace non-volatile GPRs after exec. Fix.
Only affects 64e and 32-bit.
Fixes: feb9df3462e6 ("powerpc/64s: Always has full regs, so remove remnant checks") Cc: stable@vger.kernel.org # v5.8+ Signed-off-by: Nicholas Piggin npiggin@gmail.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20210308085530.3191843-1-npiggin@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/include/asm/ptrace.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/powerpc/include/asm/ptrace.h +++ b/arch/powerpc/include/asm/ptrace.h @@ -195,7 +195,7 @@ static inline void regs_set_return_value #define TRAP_FLAGS_MASK 0x11 #define TRAP(regs) ((regs)->trap & ~TRAP_FLAGS_MASK) #define FULL_REGS(regs) (((regs)->trap & 1) == 0) -#define SET_FULL_REGS(regs) ((regs)->trap |= 1) +#define SET_FULL_REGS(regs) ((regs)->trap &= ~1) #endif #define CHECK_FULL_REGS(regs) BUG_ON(!FULL_REGS(regs)) #define NV_REG_POISON 0xdeadbeefdeadbeefUL @@ -210,7 +210,7 @@ static inline void regs_set_return_value #define TRAP_FLAGS_MASK 0x1F #define TRAP(regs) ((regs)->trap & ~TRAP_FLAGS_MASK) #define FULL_REGS(regs) (((regs)->trap & 1) == 0) -#define SET_FULL_REGS(regs) ((regs)->trap |= 1) +#define SET_FULL_REGS(regs) ((regs)->trap &= ~1) #define IS_CRITICAL_EXC(regs) (((regs)->trap & 2) != 0) #define IS_MCHECK_EXC(regs) (((regs)->trap & 4) != 0) #define IS_DEBUG_EXC(regs) (((regs)->trap & 8) != 0)
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Christophe Leroy christophe.leroy@csgroup.eu
commit bd73758803c2eedc037c2268b65a19542a832594 upstream.
Add stub instances of enable_kernel_vsx() and disable_kernel_vsx() when CONFIG_VSX is not set, to avoid following build failure.
CC [M] drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.o In file included from ./drivers/gpu/drm/amd/amdgpu/../display/dc/dm_services_types.h:29, from ./drivers/gpu/drm/amd/amdgpu/../display/dc/dm_services.h:37, from drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:27: drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c: In function 'dcn_bw_apply_registry_override': ./drivers/gpu/drm/amd/amdgpu/../display/dc/os_types.h:64:3: error: implicit declaration of function 'enable_kernel_vsx'; did you mean 'enable_kernel_fp'? [-Werror=implicit-function-declaration] 64 | enable_kernel_vsx(); \ | ^~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:640:2: note: in expansion of macro 'DC_FP_START' 640 | DC_FP_START(); | ^~~~~~~~~~~ ./drivers/gpu/drm/amd/amdgpu/../display/dc/os_types.h:75:3: error: implicit declaration of function 'disable_kernel_vsx'; did you mean 'disable_kernel_fp'? [-Werror=implicit-function-declaration] 75 | disable_kernel_vsx(); \ | ^~~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:676:2: note: in expansion of macro 'DC_FP_END' 676 | DC_FP_END(); | ^~~~~~~~~ cc1: some warnings being treated as errors make[5]: *** [drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.o] Error 1
This works because the caller is checking if VSX is available using cpu_has_feature():
#define DC_FP_START() { \ if (cpu_has_feature(CPU_FTR_VSX_COMP)) { \ preempt_disable(); \ enable_kernel_vsx(); \ } else if (cpu_has_feature(CPU_FTR_ALTIVEC_COMP)) { \ preempt_disable(); \ enable_kernel_altivec(); \ } else if (!cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE)) { \ preempt_disable(); \ enable_kernel_fp(); \ } \
When CONFIG_VSX is not selected, cpu_has_feature(CPU_FTR_VSX_COMP) constant folds to 'false' so the call to enable_kernel_vsx() is discarded and the build succeeds.
Fixes: 16a9dea110a6 ("amdgpu: Enable initial DCN support on POWER") Cc: stable@vger.kernel.org # v5.6+ Reported-by: Geert Uytterhoeven geert@linux-m68k.org Reported-by: kernel test robot lkp@intel.com Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu [mpe: Incorporate some discussion comments into the change log] Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/8d7d285a027e9d21f5ff7f850fa71a2655b0c4af.161527917... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/include/asm/switch_to.h | 10 ++++++++++ 1 file changed, 10 insertions(+)
--- a/arch/powerpc/include/asm/switch_to.h +++ b/arch/powerpc/include/asm/switch_to.h @@ -71,6 +71,16 @@ static inline void disable_kernel_vsx(vo { msr_check_and_clear(MSR_FP|MSR_VEC|MSR_VSX); } +#else +static inline void enable_kernel_vsx(void) +{ + BUILD_BUG(); +} + +static inline void disable_kernel_vsx(void) +{ + BUILD_BUG(); +} #endif
#ifdef CONFIG_SPE
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Lior Ribak liorribak@gmail.com
commit e7850f4d844e0acfac7e570af611d89deade3146 upstream.
There is a deadlock in bm_register_write:
First, in the begining of the function, a lock is taken on the binfmt_misc root inode with inode_lock(d_inode(root)).
Then, if the user used the MISC_FMT_OPEN_FILE flag, the function will call open_exec on the user-provided interpreter.
open_exec will call a path lookup, and if the path lookup process includes the root of binfmt_misc, it will try to take a shared lock on its inode again, but it is already locked, and the code will get stuck in a deadlock
To reproduce the bug: $ echo ":iiiii:E::ii::/proc/sys/fs/binfmt_misc/bla:F" > /proc/sys/fs/binfmt_misc/register
backtrace of where the lock occurs (#5): 0 schedule () at ./arch/x86/include/asm/current.h:15 1 0xffffffff81b51237 in rwsem_down_read_slowpath (sem=0xffff888003b202e0, count=<optimized out>, state=state@entry=2) at kernel/locking/rwsem.c:992 2 0xffffffff81b5150a in __down_read_common (state=2, sem=<optimized out>) at kernel/locking/rwsem.c:1213 3 __down_read (sem=<optimized out>) at kernel/locking/rwsem.c:1222 4 down_read (sem=<optimized out>) at kernel/locking/rwsem.c:1355 5 0xffffffff811ee22a in inode_lock_shared (inode=<optimized out>) at ./include/linux/fs.h:783 6 open_last_lookups (op=0xffffc9000022fe34, file=0xffff888004098600, nd=0xffffc9000022fd10) at fs/namei.c:3177 7 path_openat (nd=nd@entry=0xffffc9000022fd10, op=op@entry=0xffffc9000022fe34, flags=flags@entry=65) at fs/namei.c:3366 8 0xffffffff811efe1c in do_filp_open (dfd=<optimized out>, pathname=pathname@entry=0xffff8880031b9000, op=op@entry=0xffffc9000022fe34) at fs/namei.c:3396 9 0xffffffff811e493f in do_open_execat (fd=fd@entry=-100, name=name@entry=0xffff8880031b9000, flags=<optimized out>, flags@entry=0) at fs/exec.c:913 10 0xffffffff811e4a92 in open_exec (name=<optimized out>) at fs/exec.c:948 11 0xffffffff8124aa84 in bm_register_write (file=<optimized out>, buffer=<optimized out>, count=19, ppos=<optimized out>) at fs/binfmt_misc.c:682 12 0xffffffff811decd2 in vfs_write (file=file@entry=0xffff888004098500, buf=buf@entry=0xa758d0 ":iiiii:E::ii::i:CF ", count=count@entry=19, pos=pos@entry=0xffffc9000022ff10) at fs/read_write.c:603 13 0xffffffff811defda in ksys_write (fd=<optimized out>, buf=0xa758d0 ":iiiii:E::ii::i:CF ", count=19) at fs/read_write.c:658 14 0xffffffff81b49813 in do_syscall_64 (nr=<optimized out>, regs=0xffffc9000022ff58) at arch/x86/entry/common.c:46 15 0xffffffff81c0007c in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:120
To solve the issue, the open_exec call is moved to before the write lock is taken by bm_register_write
Link: https://lkml.kernel.org/r/20210228224414.95962-1-liorribak@gmail.com Fixes: 948b701a607f1 ("binfmt_misc: add persistent opened binary handler for containers") Signed-off-by: Lior Ribak liorribak@gmail.com Acked-by: Helge Deller deller@gmx.de Cc: Al Viro viro@zeniv.linux.org.uk Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/binfmt_misc.c | 29 ++++++++++++++--------------- 1 file changed, 14 insertions(+), 15 deletions(-)
--- a/fs/binfmt_misc.c +++ b/fs/binfmt_misc.c @@ -647,12 +647,24 @@ static ssize_t bm_register_write(struct struct super_block *sb = file_inode(file)->i_sb; struct dentry *root = sb->s_root, *dentry; int err = 0; + struct file *f = NULL;
e = create_entry(buffer, count);
if (IS_ERR(e)) return PTR_ERR(e);
+ if (e->flags & MISC_FMT_OPEN_FILE) { + f = open_exec(e->interpreter); + if (IS_ERR(f)) { + pr_notice("register: failed to install interpreter file %s\n", + e->interpreter); + kfree(e); + return PTR_ERR(f); + } + e->interp_file = f; + } + inode_lock(d_inode(root)); dentry = lookup_one_len(e->name, root, strlen(e->name)); err = PTR_ERR(dentry); @@ -676,21 +688,6 @@ static ssize_t bm_register_write(struct goto out2; }
- if (e->flags & MISC_FMT_OPEN_FILE) { - struct file *f; - - f = open_exec(e->interpreter); - if (IS_ERR(f)) { - err = PTR_ERR(f); - pr_notice("register: failed to install interpreter file %s\n", e->interpreter); - simple_release_fs(&bm_mnt, &entry_count); - iput(inode); - inode = NULL; - goto out2; - } - e->interp_file = f; - } - e->dentry = dget(dentry); inode->i_private = e; inode->i_fop = &bm_entry_operations; @@ -707,6 +704,8 @@ out: inode_unlock(d_inode(root));
if (err) { + if (f) + filp_close(f, NULL); kfree(e); return err; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Andrey Konovalov andreyknvl@google.com
commit f9d79e8dce4077d3c6ab739c808169dfa99af9ef upstream.
Currently, kasan_free_nondeferred_pages()->kasan_free_pages() is called after debug_pagealloc_unmap_pages(). This causes a crash when debug_pagealloc is enabled, as HW_TAGS KASAN can't set tags on an unmapped page.
This patch puts kasan_free_nondeferred_pages() before debug_pagealloc_unmap_pages() and arch_free_page(), which can also make the page unavailable.
Link: https://lkml.kernel.org/r/24cd7db274090f0e5bc3adcdc7399243668e3171.161498731... Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS") Signed-off-by: Andrey Konovalov andreyknvl@google.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will.deacon@arm.com Cc: Vincenzo Frascino vincenzo.frascino@arm.com Cc: Dmitry Vyukov dvyukov@google.com Cc: Andrey Ryabinin aryabinin@virtuozzo.com Cc: Alexander Potapenko glider@google.com Cc: Marco Elver elver@google.com Cc: Peter Collingbourne pcc@google.com Cc: Evgenii Stepanov eugenis@google.com Cc: Branislav Rankov Branislav.Rankov@arm.com Cc: Kevin Brodsky kevin.brodsky@arm.com Cc: Christoph Hellwig hch@infradead.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/page_alloc.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1282,6 +1282,12 @@ static __always_inline bool free_pages_p kernel_poison_pages(page, 1 << order);
/* + * With hardware tag-based KASAN, memory tags must be set before the + * page becomes unavailable via debug_pagealloc or arch_free_page. + */ + kasan_free_nondeferred_pages(page, order); + + /* * arch_free_page() can make the page's contents inaccessible. s390 * does this. So nothing which can access the page's contents should * happen after this. @@ -1290,8 +1296,6 @@ static __always_inline bool free_pages_p
debug_pagealloc_unmap_pages(page, 1 << order);
- kasan_free_nondeferred_pages(page, order); - return true; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Andrey Konovalov andreyknvl@google.com
commit d9b571c885a8974fbb7d4ee639dbc643fd000f9e upstream.
There's a runtime failure when running HW_TAGS-enabled kernel built with GCC on hardware that doesn't support MTE. GCC-built kernels always have CONFIG_KASAN_STACK enabled, even though stack instrumentation isn't supported by HW_TAGS. Having that config enabled causes KASAN to issue MTE-only instructions to unpoison kernel stacks, which causes the failure.
Fix the issue by disallowing CONFIG_KASAN_STACK when HW_TAGS is used.
(The commit that introduced CONFIG_KASAN_HW_TAGS specified proper dependency for CONFIG_KASAN_STACK_ENABLE but not for CONFIG_KASAN_STACK.)
Link: https://lkml.kernel.org/r/59e75426241dbb5611277758c8d4d6f5f9298dac.161521544... Fixes: 6a63a63ff1ac ("kasan: introduce CONFIG_KASAN_HW_TAGS") Signed-off-by: Andrey Konovalov andreyknvl@google.com Reported-by: Catalin Marinas catalin.marinas@arm.com Cc: stable@vger.kernel.org Cc: Will Deacon will.deacon@arm.com Cc: Vincenzo Frascino vincenzo.frascino@arm.com Cc: Dmitry Vyukov dvyukov@google.com Cc: Andrey Ryabinin aryabinin@virtuozzo.com Cc: Alexander Potapenko glider@google.com Cc: Marco Elver elver@google.com Cc: Peter Collingbourne pcc@google.com Cc: Evgenii Stepanov eugenis@google.com Cc: Branislav Rankov Branislav.Rankov@arm.com Cc: Kevin Brodsky kevin.brodsky@arm.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- lib/Kconfig.kasan | 1 + 1 file changed, 1 insertion(+)
--- a/lib/Kconfig.kasan +++ b/lib/Kconfig.kasan @@ -156,6 +156,7 @@ config KASAN_STACK_ENABLE
config KASAN_STACK int + depends on KASAN_GENERIC || KASAN_SW_TAGS default 1 if KASAN_STACK_ENABLE || CC_IS_GCC default 0
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Josh Poimboeuf jpoimboe@redhat.com
commit e504e74cc3a2c092b05577ce3e8e013fae7d94e6 upstream.
KASAN reserves "redzone" areas between stack frames in order to detect stack overruns. A read or write to such an area triggers a KASAN "stack-out-of-bounds" BUG.
Normally, the ORC unwinder stays in-bounds and doesn't access the redzone. But sometimes it can't find ORC metadata for a given instruction. This can happen for code which is missing ORC metadata, or for generated code. In such cases, the unwinder attempts to fall back to frame pointers, as a best-effort type thing.
This fallback often works, but when it doesn't, the unwinder can get confused and go off into the weeds into the KASAN redzone, triggering the aforementioned KASAN BUG.
But in this case, the unwinder's confusion is actually harmless and working as designed. It already has checks in place to prevent off-stack accesses, but those checks get short-circuited by the KASAN BUG. And a BUG is a lot more disruptive than a harmless unwinder warning.
Disable the KASAN checks by using READ_ONCE_NOCHECK() for all stack accesses. This finishes the job started by commit 881125bfe65b ("x86/unwind: Disable KASAN checking in the ORC unwinder"), which only partially fixed the issue.
Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder") Reported-by: Ivan Babrou ivan@cloudflare.com Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Borislav Petkov bp@suse.de Reviewed-by: Steven Rostedt (VMware) rostedt@goodmis.org Tested-by: Ivan Babrou ivan@cloudflare.com Cc: stable@kernel.org Link: https://lkml.kernel.org/r/9583327904ebbbeda399eca9c56d6c7085ac20fe.161253464... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/unwind_orc.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
--- a/arch/x86/kernel/unwind_orc.c +++ b/arch/x86/kernel/unwind_orc.c @@ -367,8 +367,8 @@ static bool deref_stack_regs(struct unwi if (!stack_access_ok(state, addr, sizeof(struct pt_regs))) return false;
- *ip = regs->ip; - *sp = regs->sp; + *ip = READ_ONCE_NOCHECK(regs->ip); + *sp = READ_ONCE_NOCHECK(regs->sp); return true; }
@@ -380,8 +380,8 @@ static bool deref_stack_iret_regs(struct if (!stack_access_ok(state, addr, IRET_FRAME_SIZE)) return false;
- *ip = regs->ip; - *sp = regs->sp; + *ip = READ_ONCE_NOCHECK(regs->ip); + *sp = READ_ONCE_NOCHECK(regs->sp); return true; }
@@ -402,12 +402,12 @@ static bool get_reg(struct unwind_state return false;
if (state->full_regs) { - *val = ((unsigned long *)state->regs)[reg]; + *val = READ_ONCE_NOCHECK(((unsigned long *)state->regs)[reg]); return true; }
if (state->prev_regs) { - *val = ((unsigned long *)state->prev_regs)[reg]; + *val = READ_ONCE_NOCHECK(((unsigned long *)state->prev_regs)[reg]); return true; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Joerg Roedel jroedel@suse.de
commit 78a81d88f60ba773cbe890205e1ee67f00502948 upstream.
Introduce a helper to check whether an exception came from the syscall gap and use it in the SEV-ES code. Extend the check to also cover the compatibility SYSCALL entry path.
Fixes: 315562c9af3d5 ("x86/sev-es: Adjust #VC IST Stack on entering NMI handler") Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Borislav Petkov bp@suse.de Cc: stable@vger.kernel.org # 5.10+ Link: https://lkml.kernel.org/r/20210303141716.29223-2-joro@8bytes.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/entry/entry_64_compat.S | 2 ++ arch/x86/include/asm/proto.h | 1 + arch/x86/include/asm/ptrace.h | 15 +++++++++++++++ arch/x86/kernel/traps.c | 3 +-- 4 files changed, 19 insertions(+), 2 deletions(-)
--- a/arch/x86/entry/entry_64_compat.S +++ b/arch/x86/entry/entry_64_compat.S @@ -210,6 +210,8 @@ SYM_CODE_START(entry_SYSCALL_compat) /* Switch to the kernel stack */ movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
+SYM_INNER_LABEL(entry_SYSCALL_compat_safe_stack, SYM_L_GLOBAL) + /* Construct struct pt_regs on stack */ pushq $__USER32_DS /* pt_regs->ss */ pushq %r8 /* pt_regs->sp */ --- a/arch/x86/include/asm/proto.h +++ b/arch/x86/include/asm/proto.h @@ -25,6 +25,7 @@ void __end_SYSENTER_singlestep_region(vo void entry_SYSENTER_compat(void); void __end_entry_SYSENTER_compat(void); void entry_SYSCALL_compat(void); +void entry_SYSCALL_compat_safe_stack(void); void entry_INT80_compat(void); #ifdef CONFIG_XEN_PV void xen_entry_INT80_compat(void); --- a/arch/x86/include/asm/ptrace.h +++ b/arch/x86/include/asm/ptrace.h @@ -94,6 +94,8 @@ struct pt_regs { #include <asm/paravirt_types.h> #endif
+#include <asm/proto.h> + struct cpuinfo_x86; struct task_struct;
@@ -175,6 +177,19 @@ static inline bool any_64bit_mode(struct #ifdef CONFIG_X86_64 #define current_user_stack_pointer() current_pt_regs()->sp #define compat_user_stack_pointer() current_pt_regs()->sp + +static inline bool ip_within_syscall_gap(struct pt_regs *regs) +{ + bool ret = (regs->ip >= (unsigned long)entry_SYSCALL_64 && + regs->ip < (unsigned long)entry_SYSCALL_64_safe_stack); + +#ifdef CONFIG_IA32_EMULATION + ret = ret || (regs->ip >= (unsigned long)entry_SYSCALL_compat && + regs->ip < (unsigned long)entry_SYSCALL_compat_safe_stack); +#endif + + return ret; +} #endif
static inline unsigned long kernel_stack_pointer(struct pt_regs *regs) --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -694,8 +694,7 @@ asmlinkage __visible noinstr struct pt_r * In the SYSCALL entry path the RSP value comes from user-space - don't * trust it and switch to the current kernel stack */ - if (regs->ip >= (unsigned long)entry_SYSCALL_64 && - regs->ip < (unsigned long)entry_SYSCALL_64_safe_stack) { + if (ip_within_syscall_gap(regs)) { sp = this_cpu_read(cpu_current_top_of_stack); goto sync; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Joerg Roedel jroedel@suse.de
commit 545ac14c16b5dbd909d5a90ddf5b5a629a40fa94 upstream.
The code in the NMI handler to adjust the #VC handler IST stack is needed in case an NMI hits when the #VC handler is still using its IST stack.
But the check for this condition also needs to look if the regs->sp value is trusted, meaning it was not set by user-space. Extend the check to not use regs->sp when the NMI interrupted user-space code or the SYSCALL gap.
Fixes: 315562c9af3d5 ("x86/sev-es: Adjust #VC IST Stack on entering NMI handler") Reported-by: Andy Lutomirski luto@kernel.org Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Borislav Petkov bp@suse.de Cc: stable@vger.kernel.org # 5.10+ Link: https://lkml.kernel.org/r/20210303141716.29223-3-joro@8bytes.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/sev-es.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)
--- a/arch/x86/kernel/sev-es.c +++ b/arch/x86/kernel/sev-es.c @@ -121,8 +121,18 @@ static void __init setup_vc_stacks(int c cea_set_pte((void *)vaddr, pa, PAGE_KERNEL); }
-static __always_inline bool on_vc_stack(unsigned long sp) +static __always_inline bool on_vc_stack(struct pt_regs *regs) { + unsigned long sp = regs->sp; + + /* User-mode RSP is not trusted */ + if (user_mode(regs)) + return false; + + /* SYSCALL gap still has user-mode RSP */ + if (ip_within_syscall_gap(regs)) + return false; + return ((sp >= __this_cpu_ist_bottom_va(VC)) && (sp < __this_cpu_ist_top_va(VC))); }
@@ -144,7 +154,7 @@ void noinstr __sev_es_ist_enter(struct p old_ist = __this_cpu_read(cpu_tss_rw.x86_tss.ist[IST_INDEX_VC]);
/* Make room on the IST stack */ - if (on_vc_stack(regs->sp)) + if (on_vc_stack(regs)) new_ist = ALIGN_DOWN(regs->sp, 8) - sizeof(old_ist); else new_ist = old_ist - sizeof(old_ist);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Joerg Roedel jroedel@suse.de
commit 62441a1fb53263bda349b6e5997c3cc5c120d89e upstream.
Call irqentry_nmi_enter()/irqentry_nmi_exit() in the #VC handler to correctly track the IRQ state during its execution.
Fixes: 0786138c78e79 ("x86/sev-es: Add a Runtime #VC Exception Handler") Reported-by: Andy Lutomirski luto@kernel.org Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Borislav Petkov bp@suse.de Cc: stable@vger.kernel.org # v5.10+ Link: https://lkml.kernel.org/r/20210303141716.29223-5-joro@8bytes.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/sev-es.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/arch/x86/kernel/sev-es.c +++ b/arch/x86/kernel/sev-es.c @@ -1258,13 +1258,12 @@ static __always_inline bool on_vc_fallba DEFINE_IDTENTRY_VC_SAFE_STACK(exc_vmm_communication) { struct sev_es_runtime_data *data = this_cpu_read(runtime_data); + irqentry_state_t irq_state; struct ghcb_state state; struct es_em_ctxt ctxt; enum es_result result; struct ghcb *ghcb;
- lockdep_assert_irqs_disabled(); - /* * Handle #DB before calling into !noinstr code to avoid recursive #DB. */ @@ -1273,6 +1272,8 @@ DEFINE_IDTENTRY_VC_SAFE_STACK(exc_vmm_co return; }
+ irq_state = irqentry_nmi_enter(regs); + lockdep_assert_irqs_disabled(); instrumentation_begin();
/* @@ -1335,6 +1336,7 @@ DEFINE_IDTENTRY_VC_SAFE_STACK(exc_vmm_co
out: instrumentation_end(); + irqentry_nmi_exit(regs, irq_state);
return;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Joerg Roedel jroedel@suse.de
commit bffe30dd9f1f3b2608a87ac909a224d6be472485 upstream.
The #VC handler must run in atomic context and cannot sleep. This is a problem when it tries to fetch instruction bytes from user-space via copy_from_user().
Introduce a insn_fetch_from_user_inatomic() helper which uses __copy_from_user_inatomic() to safely copy the instruction bytes to kernel memory in the #VC handler.
Fixes: 5e3427a7bc432 ("x86/sev-es: Handle instruction fetches from user-space") Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Borislav Petkov bp@suse.de Cc: stable@vger.kernel.org # v5.10+ Link: https://lkml.kernel.org/r/20210303141716.29223-6-joro@8bytes.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/insn-eval.h | 2 + arch/x86/kernel/sev-es.c | 2 - arch/x86/lib/insn-eval.c | 66 ++++++++++++++++++++++++++++++--------- 3 files changed, 55 insertions(+), 15 deletions(-)
--- a/arch/x86/include/asm/insn-eval.h +++ b/arch/x86/include/asm/insn-eval.h @@ -23,6 +23,8 @@ unsigned long insn_get_seg_base(struct p int insn_get_code_seg_params(struct pt_regs *regs); int insn_fetch_from_user(struct pt_regs *regs, unsigned char buf[MAX_INSN_SIZE]); +int insn_fetch_from_user_inatomic(struct pt_regs *regs, + unsigned char buf[MAX_INSN_SIZE]); bool insn_decode(struct insn *insn, struct pt_regs *regs, unsigned char buf[MAX_INSN_SIZE], int buf_size);
--- a/arch/x86/kernel/sev-es.c +++ b/arch/x86/kernel/sev-es.c @@ -258,7 +258,7 @@ static enum es_result vc_decode_insn(str int res;
if (user_mode(ctxt->regs)) { - res = insn_fetch_from_user(ctxt->regs, buffer); + res = insn_fetch_from_user_inatomic(ctxt->regs, buffer); if (!res) { ctxt->fi.vector = X86_TRAP_PF; ctxt->fi.error_code = X86_PF_INSTR | X86_PF_USER; --- a/arch/x86/lib/insn-eval.c +++ b/arch/x86/lib/insn-eval.c @@ -1415,6 +1415,25 @@ void __user *insn_get_addr_ref(struct in } }
+static unsigned long insn_get_effective_ip(struct pt_regs *regs) +{ + unsigned long seg_base = 0; + + /* + * If not in user-space long mode, a custom code segment could be in + * use. This is true in protected mode (if the process defined a local + * descriptor table), or virtual-8086 mode. In most of the cases + * seg_base will be zero as in USER_CS. + */ + if (!user_64bit_mode(regs)) { + seg_base = insn_get_seg_base(regs, INAT_SEG_REG_CS); + if (seg_base == -1L) + return 0; + } + + return seg_base + regs->ip; +} + /** * insn_fetch_from_user() - Copy instruction bytes from user-space memory * @regs: Structure with register values as seen when entering kernel mode @@ -1431,24 +1450,43 @@ void __user *insn_get_addr_ref(struct in */ int insn_fetch_from_user(struct pt_regs *regs, unsigned char buf[MAX_INSN_SIZE]) { - unsigned long seg_base = 0; + unsigned long ip; int not_copied;
- /* - * If not in user-space long mode, a custom code segment could be in - * use. This is true in protected mode (if the process defined a local - * descriptor table), or virtual-8086 mode. In most of the cases - * seg_base will be zero as in USER_CS. - */ - if (!user_64bit_mode(regs)) { - seg_base = insn_get_seg_base(regs, INAT_SEG_REG_CS); - if (seg_base == -1L) - return 0; - } + ip = insn_get_effective_ip(regs); + if (!ip) + return 0; + + not_copied = copy_from_user(buf, (void __user *)ip, MAX_INSN_SIZE); + + return MAX_INSN_SIZE - not_copied; +} + +/** + * insn_fetch_from_user_inatomic() - Copy instruction bytes from user-space memory + * while in atomic code + * @regs: Structure with register values as seen when entering kernel mode + * @buf: Array to store the fetched instruction + * + * Gets the linear address of the instruction and copies the instruction bytes + * to the buf. This function must be used in atomic context. + * + * Returns: + * + * Number of instruction bytes copied. + * + * 0 if nothing was copied. + */ +int insn_fetch_from_user_inatomic(struct pt_regs *regs, unsigned char buf[MAX_INSN_SIZE]) +{ + unsigned long ip; + int not_copied;
+ ip = insn_get_effective_ip(regs); + if (!ip) + return 0;
- not_copied = copy_from_user(buf, (void __user *)(seg_base + regs->ip), - MAX_INSN_SIZE); + not_copied = __copy_from_user_inatomic(buf, (void __user *)ip, MAX_INSN_SIZE);
return MAX_INSN_SIZE - not_copied; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Andy Lutomirski luto@kernel.org
commit 5d5675df792ff67e74a500c4c94db0f99e6a10ef upstream.
On a 32-bit fast syscall that fails to read its arguments from user memory, the kernel currently does syscall exit work but not syscall entry work. This confuses audit and ptrace. For example:
$ ./tools/testing/selftests/x86/syscall_arg_fault_32 ... strace: pid 264258: entering, ptrace_syscall_info.op == 2 ...
This is a minimal fix intended for ease of backporting. A more complete cleanup is coming.
Fixes: 0b085e68f407 ("x86/entry: Consolidate 32/64 bit syscall entry") Signed-off-by: Andy Lutomirski luto@kernel.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Borislav Petkov bp@suse.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/8c82296ddf803b91f8d1e5eac89e5803ba54ab0e.161488467... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/entry/common.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/arch/x86/entry/common.c +++ b/arch/x86/entry/common.c @@ -128,7 +128,8 @@ static noinstr bool __do_fast_syscall_32 regs->ax = -EFAULT;
instrumentation_end(); - syscall_exit_to_user_mode(regs); + local_irq_disable(); + irqentry_exit_to_user_mode(regs); return false; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Sean Christopherson seanjc@google.com
commit beda430177f56656e7980dcce93456ffaa35676b upstream.
When posting a deadline timer interrupt, open code the checks guarding __kvm_wait_lapic_expire() in order to skip the lapic_timer_int_injected() check in kvm_wait_lapic_expire(). The injection check will always fail since the interrupt has not yet be injected. Moving the call after injection would also be wrong as that wouldn't actually delay delivery of the IRQ if it is indeed sent via posted interrupt.
Fixes: 010fd37fddf6 ("KVM: LAPIC: Reduce world switch latency caused by timer_advance_ns") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson seanjc@google.com Message-Id: 20210305021808.3769732-1-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/lapic.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
--- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -1641,7 +1641,16 @@ static void apic_timer_expired(struct kv }
if (kvm_use_posted_timer_interrupt(apic->vcpu)) { - kvm_wait_lapic_expire(vcpu); + /* + * Ensure the guest's timer has truly expired before posting an + * interrupt. Open code the relevant checks to avoid querying + * lapic_timer_int_injected(), which will be false since the + * interrupt isn't yet injected. Waiting until after injecting + * is not an option since that won't help a posted interrupt. + */ + if (vcpu->arch.apic->lapic_timer.expired_tscdeadline && + vcpu->arch.apic->lapic_timer.timer_advance_ns) + __kvm_wait_lapic_expire(vcpu); kvm_apic_inject_pending_timer_irqs(apic); return; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Wanpeng Li wanpengli@tencent.com
commit d7eb79c6290c7ae4561418544072e0a3266e7384 upstream.
# lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian CPU(s): 88 On-line CPU(s) list: 0-63 Off-line CPU(s) list: 64-87
# cat /proc/cmdline BOOT_IMAGE=/vmlinuz-5.10.0-rc3-tlinux2-0050+ root=/dev/mapper/cl-root ro rd.lvm.lv=cl/root rhgb quiet console=ttyS0 LANG=en_US .UTF-8 no-kvmclock-vsyscall
# echo 1 > /sys/devices/system/cpu/cpu76/online -bash: echo: write error: Cannot allocate memory
The per-cpu vsyscall pvclock data pointer assigns either an element of the static array hv_clock_boot (#vCPU <= 64) or dynamically allocated memory hvclock_mem (vCPU > 64), the dynamically memory will not be allocated if kvmclock vsyscall is disabled, this can result in cpu hotpluged fails in kvmclock_setup_percpu() which returns -ENOMEM. It's broken for no-vsyscall and sometimes you end up with vsyscall disabled if the host does something strange. This patch fixes it by allocating this dynamically memory unconditionally even if vsyscall is disabled.
Fixes: 6a1cac56f4 ("x86/kvm: Use __bss_decrypted attribute in shared variables") Reported-by: Zelin Deng zelin.deng@linux.alibaba.com Cc: Brijesh Singh brijesh.singh@amd.com Cc: stable@vger.kernel.org Signed-off-by: Wanpeng Li wanpengli@tencent.com Message-Id: 1614130683-24137-1-git-send-email-wanpengli@tencent.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/kvmclock.c | 19 +++++++++---------- 1 file changed, 9 insertions(+), 10 deletions(-)
--- a/arch/x86/kernel/kvmclock.c +++ b/arch/x86/kernel/kvmclock.c @@ -268,21 +268,20 @@ static void __init kvmclock_init_mem(voi
static int __init kvm_setup_vsyscall_timeinfo(void) { -#ifdef CONFIG_X86_64 - u8 flags; + kvmclock_init_mem();
- if (!per_cpu(hv_clock_per_cpu, 0) || !kvmclock_vsyscall) - return 0; +#ifdef CONFIG_X86_64 + if (per_cpu(hv_clock_per_cpu, 0) && kvmclock_vsyscall) { + u8 flags;
- flags = pvclock_read_flags(&hv_clock_boot[0].pvti); - if (!(flags & PVCLOCK_TSC_STABLE_BIT)) - return 0; + flags = pvclock_read_flags(&hv_clock_boot[0].pvti); + if (!(flags & PVCLOCK_TSC_STABLE_BIT)) + return 0;
- kvm_clock.vdso_clock_mode = VDSO_CLOCKMODE_PVCLOCK; + kvm_clock.vdso_clock_mode = VDSO_CLOCKMODE_PVCLOCK; + } #endif
- kvmclock_init_mem(); - return 0; } early_initcall(kvm_setup_vsyscall_timeinfo);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Marc Zyngier maz@kernel.org
commit 01dc9262ff5797b675c32c0c6bc682777d23de05 upstream.
It recently became apparent that the ARMv8 architecture has interesting rules regarding attributes being used when fetching instructions if the MMU is off at Stage-1.
In this situation, the CPU is allowed to fetch from the PoC and allocate into the I-cache (unless the memory is mapped with the XN attribute at Stage-2).
If we transpose this to vcpus sharing a single physical CPU, it is possible for a vcpu running with its MMU off to influence another vcpu running with its MMU on, as the latter is expected to fetch from the PoU (and self-patching code doesn't flush below that level).
In order to solve this, reuse the vcpu-private TLB invalidation code to apply the same policy to the I-cache, nuking it every time the vcpu runs on a physical CPU that ran another vcpu of the same VM in the past.
This involve renaming __kvm_tlb_flush_local_vmid() to __kvm_flush_cpu_context(), and inserting a local i-cache invalidation there.
Cc: stable@vger.kernel.org Signed-off-by: Marc Zyngier maz@kernel.org Acked-by: Will Deacon will@kernel.org Acked-by: Catalin Marinas catalin.marinas@arm.com Link: https://lore.kernel.org/r/20210303164505.68492-1-maz@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/include/asm/kvm_asm.h | 4 ++-- arch/arm64/kvm/arm.c | 7 ++++++- arch/arm64/kvm/hyp/nvhe/hyp-main.c | 6 +++--- arch/arm64/kvm/hyp/nvhe/tlb.c | 3 ++- arch/arm64/kvm/hyp/vhe/tlb.c | 3 ++- 5 files changed, 15 insertions(+), 8 deletions(-)
--- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -47,7 +47,7 @@ #define __KVM_HOST_SMCCC_FUNC___kvm_flush_vm_context 2 #define __KVM_HOST_SMCCC_FUNC___kvm_tlb_flush_vmid_ipa 3 #define __KVM_HOST_SMCCC_FUNC___kvm_tlb_flush_vmid 4 -#define __KVM_HOST_SMCCC_FUNC___kvm_tlb_flush_local_vmid 5 +#define __KVM_HOST_SMCCC_FUNC___kvm_flush_cpu_context 5 #define __KVM_HOST_SMCCC_FUNC___kvm_timer_set_cntvoff 6 #define __KVM_HOST_SMCCC_FUNC___kvm_enable_ssbs 7 #define __KVM_HOST_SMCCC_FUNC___vgic_v3_get_ich_vtr_el2 8 @@ -183,10 +183,10 @@ DECLARE_KVM_HYP_SYM(__bp_harden_hyp_vecs #define __bp_harden_hyp_vecs CHOOSE_HYP_SYM(__bp_harden_hyp_vecs)
extern void __kvm_flush_vm_context(void); +extern void __kvm_flush_cpu_context(struct kvm_s2_mmu *mmu); extern void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, phys_addr_t ipa, int level); extern void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu); -extern void __kvm_tlb_flush_local_vmid(struct kvm_s2_mmu *mmu);
extern void __kvm_timer_set_cntvoff(u64 cntvoff);
--- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -385,11 +385,16 @@ void kvm_arch_vcpu_load(struct kvm_vcpu last_ran = this_cpu_ptr(mmu->last_vcpu_ran);
/* + * We guarantee that both TLBs and I-cache are private to each + * vcpu. If detecting that a vcpu from the same VM has + * previously run on the same physical CPU, call into the + * hypervisor code to nuke the relevant contexts. + * * We might get preempted before the vCPU actually runs, but * over-invalidation doesn't affect correctness. */ if (*last_ran != vcpu->vcpu_id) { - kvm_call_hyp(__kvm_tlb_flush_local_vmid, mmu); + kvm_call_hyp(__kvm_flush_cpu_context, mmu); *last_ran = vcpu->vcpu_id; }
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c @@ -46,11 +46,11 @@ static void handle___kvm_tlb_flush_vmid( __kvm_tlb_flush_vmid(kern_hyp_va(mmu)); }
-static void handle___kvm_tlb_flush_local_vmid(struct kvm_cpu_context *host_ctxt) +static void handle___kvm_flush_cpu_context(struct kvm_cpu_context *host_ctxt) { DECLARE_REG(struct kvm_s2_mmu *, mmu, host_ctxt, 1);
- __kvm_tlb_flush_local_vmid(kern_hyp_va(mmu)); + __kvm_flush_cpu_context(kern_hyp_va(mmu)); }
static void handle___kvm_timer_set_cntvoff(struct kvm_cpu_context *host_ctxt) @@ -115,7 +115,7 @@ static const hcall_t *host_hcall[] = { HANDLE_FUNC(__kvm_flush_vm_context), HANDLE_FUNC(__kvm_tlb_flush_vmid_ipa), HANDLE_FUNC(__kvm_tlb_flush_vmid), - HANDLE_FUNC(__kvm_tlb_flush_local_vmid), + HANDLE_FUNC(__kvm_flush_cpu_context), HANDLE_FUNC(__kvm_timer_set_cntvoff), HANDLE_FUNC(__kvm_enable_ssbs), HANDLE_FUNC(__vgic_v3_get_ich_vtr_el2), --- a/arch/arm64/kvm/hyp/nvhe/tlb.c +++ b/arch/arm64/kvm/hyp/nvhe/tlb.c @@ -123,7 +123,7 @@ void __kvm_tlb_flush_vmid(struct kvm_s2_ __tlb_switch_to_host(&cxt); }
-void __kvm_tlb_flush_local_vmid(struct kvm_s2_mmu *mmu) +void __kvm_flush_cpu_context(struct kvm_s2_mmu *mmu) { struct tlb_inv_context cxt;
@@ -131,6 +131,7 @@ void __kvm_tlb_flush_local_vmid(struct k __tlb_switch_to_guest(mmu, &cxt);
__tlbi(vmalle1); + asm volatile("ic iallu"); dsb(nsh); isb();
--- a/arch/arm64/kvm/hyp/vhe/tlb.c +++ b/arch/arm64/kvm/hyp/vhe/tlb.c @@ -127,7 +127,7 @@ void __kvm_tlb_flush_vmid(struct kvm_s2_ __tlb_switch_to_host(&cxt); }
-void __kvm_tlb_flush_local_vmid(struct kvm_s2_mmu *mmu) +void __kvm_flush_cpu_context(struct kvm_s2_mmu *mmu) { struct tlb_inv_context cxt;
@@ -135,6 +135,7 @@ void __kvm_tlb_flush_local_vmid(struct k __tlb_switch_to_guest(mmu, &cxt);
__tlbi(vmalle1); + asm volatile("ic iallu"); dsb(nsh); isb();
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Jia He justin.he@arm.com
commit 357ad203d45c0f9d76a8feadbd5a1c5d460c638b upstream.
When walking the page tables at a given level, and if the start address for the range isn't aligned for that level, we propagate the misalignment on each iteration at that level.
This results in the walker ignoring a number of entries (depending on the original misalignment) on each subsequent iteration.
Properly aligning the address before the next iteration addresses this issue.
Cc: stable@vger.kernel.org Reported-by: Howard Zhang Howard.Zhang@arm.com Acked-by: Will Deacon will@kernel.org Signed-off-by: Jia He justin.he@arm.com Fixes: b1e57de62cfb ("KVM: arm64: Add stand-alone page-table walker infrastructure") [maz: rewrite commit message] Signed-off-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20210303024225.2591-1-justin.he@arm.com Message-Id: 20210305185254.3730990-9-maz@kernel.org Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kvm/hyp/pgtable.c | 1 + 1 file changed, 1 insertion(+)
--- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -225,6 +225,7 @@ static inline int __kvm_pgtable_visit(st goto out;
if (!table) { + data->addr = ALIGN_DOWN(data->addr, kvm_granule_size(level)); data->addr += kvm_granule_size(level); goto out; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Will Deacon will@kernel.org
commit 31948332d5fa392ad933f4a6a10026850649ed76 upstream.
Commit 7db21530479f ("KVM: arm64: Restore hyp when panicking in guest context") tracks the currently running vCPU, clearing the pointer to NULL on exit from a guest.
Unfortunately, the use of 'set_loaded_vcpu' clobbers x1 to point at the kvm_hyp_ctxt instead of the vCPU context, causing the subsequent RAS code to go off into the weeds when it saves the DISR assuming that the CPU context is embedded in a struct vCPU.
Leave x1 alone and use x3 as a temporary register instead when clearing the vCPU on the guest exit path.
Cc: Marc Zyngier maz@kernel.org Cc: Andrew Scull ascull@google.com Cc: stable@vger.kernel.org Fixes: 7db21530479f ("KVM: arm64: Restore hyp when panicking in guest context") Suggested-by: Quentin Perret qperret@google.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20210226181211.14542-1-will@kernel.org Message-Id: 20210305185254.3730990-3-maz@kernel.org Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kvm/hyp/entry.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm64/kvm/hyp/entry.S +++ b/arch/arm64/kvm/hyp/entry.S @@ -146,7 +146,7 @@ SYM_INNER_LABEL(__guest_exit, SYM_L_GLOB // Now restore the hyp regs restore_callee_saved_regs x2
- set_loaded_vcpu xzr, x1, x2 + set_loaded_vcpu xzr, x2, x3
alternative_if ARM64_HAS_RAS_EXTN // If we have the RAS extensions we can consume a pending error
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Suzuki K Poulose suzuki.poulose@arm.com
commit b96b0c5de685df82019e16826a282d53d86d112c upstream.
The nVHE KVM hyp drains and disables the SPE buffer, before entering the guest, as the EL1&0 translation regime is going to be loaded with that of the guest.
But this operation is performed way too late, because : - The owning translation regime of the SPE buffer is transferred to EL2. (MDCR_EL2_E2PB == 0) - The guest Stage1 is loaded.
Thus the flush could use the host EL1 virtual address, but use the EL2 translations instead of host EL1, for writing out any cached data.
Fix this by moving the SPE buffer handling early enough. The restore path is doing the right thing.
Fixes: 014c4c77aad7 ("KVM: arm64: Improve debug register save/restore flow") Cc: stable@vger.kernel.org Cc: Christoffer Dall christoffer.dall@arm.com Cc: Marc Zyngier maz@kernel.org Cc: Will Deacon will@kernel.org Cc: Catalin Marinas catalin.marinas@arm.com Cc: Mark Rutland mark.rutland@arm.com Cc: Alexandru Elisei alexandru.elisei@arm.com Reviewed-by: Alexandru Elisei alexandru.elisei@arm.com Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com Signed-off-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20210302120345.3102874-1-suzuki.poulose@arm.com Message-Id: 20210305185254.3730990-2-maz@kernel.org Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/include/asm/kvm_hyp.h | 5 +++++ arch/arm64/kvm/hyp/nvhe/debug-sr.c | 12 ++++++++++-- arch/arm64/kvm/hyp/nvhe/switch.c | 11 ++++++++++- 3 files changed, 25 insertions(+), 3 deletions(-)
--- a/arch/arm64/include/asm/kvm_hyp.h +++ b/arch/arm64/include/asm/kvm_hyp.h @@ -83,6 +83,11 @@ void sysreg_restore_guest_state_vhe(stru void __debug_switch_to_guest(struct kvm_vcpu *vcpu); void __debug_switch_to_host(struct kvm_vcpu *vcpu);
+#ifdef __KVM_NVHE_HYPERVISOR__ +void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu); +void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu); +#endif + void __fpsimd_save_state(struct user_fpsimd_state *fp_regs); void __fpsimd_restore_state(struct user_fpsimd_state *fp_regs);
--- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c +++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c @@ -58,16 +58,24 @@ static void __debug_restore_spe(u64 pmsc write_sysreg_s(pmscr_el1, SYS_PMSCR_EL1); }
-void __debug_switch_to_guest(struct kvm_vcpu *vcpu) +void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu) { /* Disable and flush SPE data generation */ __debug_save_spe(&vcpu->arch.host_debug_state.pmscr_el1); +} + +void __debug_switch_to_guest(struct kvm_vcpu *vcpu) +{ __debug_switch_to_guest_common(vcpu); }
-void __debug_switch_to_host(struct kvm_vcpu *vcpu) +void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu) { __debug_restore_spe(vcpu->arch.host_debug_state.pmscr_el1); +} + +void __debug_switch_to_host(struct kvm_vcpu *vcpu) +{ __debug_switch_to_host_common(vcpu); }
--- a/arch/arm64/kvm/hyp/nvhe/switch.c +++ b/arch/arm64/kvm/hyp/nvhe/switch.c @@ -192,6 +192,14 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu pmu_switch_needed = __pmu_switch_to_guest(host_ctxt);
__sysreg_save_state_nvhe(host_ctxt); + /* + * We must flush and disable the SPE buffer for nVHE, as + * the translation regime(EL1&0) is going to be loaded with + * that of the guest. And we must do this before we change the + * translation regime to EL2 (via MDCR_EL2_E2PB == 0) and + * before we load guest Stage1. + */ + __debug_save_host_buffers_nvhe(vcpu);
__adjust_pc(vcpu);
@@ -234,11 +242,12 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) __fpsimd_save_fpexc32(vcpu);
+ __debug_switch_to_host(vcpu); /* * This must come after restoring the host sysregs, since a non-VHE * system may enable SPE here and make use of the TTBRs. */ - __debug_switch_to_host(vcpu); + __debug_restore_host_buffers_nvhe(vcpu);
if (pmu_switch_needed) __pmu_switch_to_host(host_ctxt);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Marc Zyngier maz@kernel.org
commit 7d717558dd5ef10d28866750d5c24ff892ea3778 upstream.
KVM/arm64 has forever used a 40bit default IPA space, partially due to its 32bit heritage (where the only choice is 40bit).
However, there are implementations in the wild that have a *cough* much smaller *cough* IPA space, which leads to a misprogramming of VTCR_EL2, and a guest that is stuck on its first memory access if userspace dares to ask for the default IPA setting (which most VMMs do).
Instead, blundly reject the creation of such VM, as we can't satisfy the requirements from userspace (with a one-off warning). Also clarify the boot warning, and document that the VM creation will fail when an unsupported IPA size is provided.
Although this is an ABI change, it doesn't really change much for userspace:
- the guest couldn't run before this change, but no error was returned. At least userspace knows what is happening.
- a memory slot that was accepted because it did fit the default IPA space now doesn't even get a chance to be registered.
The other thing that is left doing is to convince userspace to actually use the IPA space setting instead of relying on the antiquated default.
Fixes: 233a7cb23531 ("kvm: arm64: Allow tuning the physical address size for VM") Signed-off-by: Marc Zyngier maz@kernel.org Cc: stable@vger.kernel.org Reviewed-by: Andrew Jones drjones@redhat.com Reviewed-by: Eric Auger eric.auger@redhat.com Link: https://lore.kernel.org/r/20210311100016.3830038-2-maz@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/virt/kvm/api.rst | 3 +++ arch/arm64/kvm/reset.c | 12 ++++++++---- 2 files changed, 11 insertions(+), 4 deletions(-)
--- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -182,6 +182,9 @@ is dependent on the CPU capability and t be retrieved using KVM_CAP_ARM_VM_IPA_SIZE of the KVM_CHECK_EXTENSION ioctl() at run-time.
+Creation of the VM will fail if the requested IPA size (whether it is +implicit or explicit) is unsupported on the host. + Please note that configuring the IPA size does not affect the capability exposed by the guest CPUs in ID_AA64MMFR0_EL1[PARange]. It only affects size of the address translated by the stage2 level (guest physical to --- a/arch/arm64/kvm/reset.c +++ b/arch/arm64/kvm/reset.c @@ -324,10 +324,9 @@ int kvm_set_ipa_limit(void) }
kvm_ipa_limit = id_aa64mmfr0_parange_to_phys_shift(parange); - WARN(kvm_ipa_limit < KVM_PHYS_SHIFT, - "KVM IPA Size Limit (%d bits) is smaller than default size\n", - kvm_ipa_limit); - kvm_info("IPA Size Limit: %d bits\n", kvm_ipa_limit); + kvm_info("IPA Size Limit: %d bits%s\n", kvm_ipa_limit, + ((kvm_ipa_limit < KVM_PHYS_SHIFT) ? + " (Reduced IPA size, limited VM/VMM compatibility)" : ""));
return 0; } @@ -356,6 +355,11 @@ int kvm_arm_setup_stage2(struct kvm *kvm return -EINVAL; } else { phys_shift = KVM_PHYS_SHIFT; + if (phys_shift > kvm_ipa_limit) { + pr_warn_once("%s using unsupported default IPA limit, upgrade your VMM\n", + current->comm); + return -EINVAL; + } }
mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Marc Zyngier maz@kernel.org
commit 262b003d059c6671601a19057e9fe1a5e7f23722 upstream.
When registering a memslot, we check the size and location of that memslot against the IPA size to ensure that we can provide guest access to the whole of the memory.
Unfortunately, this check rejects memslot that end-up at the exact limit of the addressing capability for a given IPA size. For example, it refuses the creation of a 2GB memslot at 0x8000000 with a 32bit IPA space.
Fix it by relaxing the check to accept a memslot reaching the limit of the IPA space.
Fixes: c3058d5da222 ("arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE") Reviewed-by: Eric Auger eric.auger@redhat.com Signed-off-by: Marc Zyngier maz@kernel.org Cc: stable@vger.kernel.org Reviewed-by: Andrew Jones drjones@redhat.com Link: https://lore.kernel.org/r/20210311100016.3830038-3-maz@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kvm/mmu.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1309,8 +1309,7 @@ int kvm_arch_prepare_memory_region(struc * Prevent userspace from creating a memory region outside of the IPA * space addressable by the KVM guest IPA space. */ - if (memslot->base_gfn + memslot->npages >= - (kvm_phys_size(kvm) >> PAGE_SHIFT)) + if ((memslot->base_gfn + memslot->npages) > (kvm_phys_size(kvm) >> PAGE_SHIFT)) return -EFAULT;
mmap_read_lock(current->mm);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: OGAWA Hirofumi hirofumi@mail.parknet.co.jp
commit 184cee516f3e24019a08ac8eb5c7cf04c00933cb upstream.
zero_user_segments() is used from __block_write_begin_int(), for example like the following
zero_user_segments(page, 4096, 1024, 512, 918)
But new the zero_user_segments() implementation for for HIGHMEM + TRANSPARENT_HUGEPAGE doesn't handle "start > end" case correctly, and hits BUG_ON(). (we can fix __block_write_begin_int() instead though, it is the old and multiple usage)
Also it calls kmap_atomic() unnecessarily while start == end == 0.
Link: https://lkml.kernel.org/r/87v9ab60r4.fsf@mail.parknet.co.jp Fixes: 0060ef3b4e6d ("mm: support THPs in zero_user_segments") Signed-off-by: OGAWA Hirofumi hirofumi@mail.parknet.co.jp Cc: Matthew Wilcox willy@infradead.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/highmem.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-)
--- a/mm/highmem.c +++ b/mm/highmem.c @@ -368,20 +368,24 @@ void zero_user_segments(struct page *pag
BUG_ON(end1 > page_size(page) || end2 > page_size(page));
+ if (start1 >= end1) + start1 = end1 = 0; + if (start2 >= end2) + start2 = end2 = 0; + for (i = 0; i < compound_nr(page); i++) { void *kaddr = NULL;
- if (start1 < PAGE_SIZE || start2 < PAGE_SIZE) - kaddr = kmap_atomic(page + i); - if (start1 >= PAGE_SIZE) { start1 -= PAGE_SIZE; end1 -= PAGE_SIZE; } else { unsigned this_end = min_t(unsigned, end1, PAGE_SIZE);
- if (end1 > start1) + if (end1 > start1) { + kaddr = kmap_atomic(page + i); memset(kaddr + start1, 0, this_end - start1); + } end1 -= this_end; start1 = 0; } @@ -392,8 +396,11 @@ void zero_user_segments(struct page *pag } else { unsigned this_end = min_t(unsigned, end2, PAGE_SIZE);
- if (end2 > start2) + if (end2 > start2) { + if (!kaddr) + kaddr = kmap_atomic(page + i); memset(kaddr + start2, 0, this_end - start2); + } end2 -= this_end; start2 = 0; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Nadav Amit namit@vmware.com
commit 6ce64428d62026a10cb5d80138ff2f90cc21d367 upstream.
Userfaultfd self-test fails occasionally, indicating a memory corruption.
Analyzing this problem indicates that there is a real bug since mmap_lock is only taken for read in mwriteprotect_range() and defers flushes, and since there is insufficient consideration of concurrent deferred TLB flushes in wp_page_copy(). Although the PTE is flushed from the TLBs in wp_page_copy(), this flush takes place after the copy has already been performed, and therefore changes of the page are possible between the time of the copy and the time in which the PTE is flushed.
To make matters worse, memory-unprotection using userfaultfd also poses a problem. Although memory unprotection is logically a promotion of PTE permissions, and therefore should not require a TLB flush, the current userrfaultfd code might actually cause a demotion of the architectural PTE permission: when userfaultfd_writeprotect() unprotects memory region, it unintentionally *clears* the RW-bit if it was already set. Note that this unprotecting a PTE that is not write-protected is a valid use-case: the userfaultfd monitor might ask to unprotect a region that holds both write-protected and write-unprotected PTEs.
The scenario that happens in selftests/vm/userfaultfd is as follows:
cpu0 cpu1 cpu2 ---- ---- ---- [ Writable PTE cached in TLB ] userfaultfd_writeprotect() [ write-*unprotect* ] mwriteprotect_range() mmap_read_lock() change_protection()
change_protection_range() ... change_pte_range() [ *clear* “write”-bit ] [ defer TLB flushes ] [ page-fault ] ... wp_page_copy() cow_user_page() [ copy page ] [ write to old page ] ... set_pte_at_notify()
A similar scenario can happen:
cpu0 cpu1 cpu2 cpu3 ---- ---- ---- ---- [ Writable PTE cached in TLB ] userfaultfd_writeprotect() [ write-protect ] [ deferred TLB flush ] userfaultfd_writeprotect() [ write-unprotect ] [ deferred TLB flush] [ page-fault ] wp_page_copy() cow_user_page() [ copy page ] ... [ write to page ] set_pte_at_notify()
This race exists since commit 292924b26024 ("userfaultfd: wp: apply _PAGE_UFFD_WP bit"). Yet, as Yu Zhao pointed, these races became apparent since commit 09854ba94c6a ("mm: do_wp_page() simplification") which made wp_page_copy() more likely to take place, specifically if page_count(page)
To resolve the aforementioned races, check whether there are pending flushes on uffd-write-protected VMAs, and if there are, perform a flush before doing the COW.
Further optimizations will follow to avoid during uffd-write-unprotect unnecassary PTE write-protection and TLB flushes.
Link: https://lkml.kernel.org/r/20210304095423.3825684-1-namit@vmware.com Fixes: 09854ba94c6a ("mm: do_wp_page() simplification") Signed-off-by: Nadav Amit namit@vmware.com Suggested-by: Yu Zhao yuzhao@google.com Reviewed-by: Peter Xu peterx@redhat.com Tested-by: Peter Xu peterx@redhat.com Cc: Andrea Arcangeli aarcange@redhat.com Cc: Andy Lutomirski luto@kernel.org Cc: Pavel Emelyanov xemul@openvz.org Cc: Mike Kravetz mike.kravetz@oracle.com Cc: Mike Rapoport rppt@linux.vnet.ibm.com Cc: Minchan Kim minchan@kernel.org Cc: Will Deacon will@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: stable@vger.kernel.org [5.9+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/memory.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/mm/memory.c +++ b/mm/memory.c @@ -3092,6 +3092,14 @@ static vm_fault_t do_wp_page(struct vm_f return handle_userfault(vmf, VM_UFFD_WP); }
+ /* + * Userfaultfd write-protect can defer flushes. Ensure the TLB + * is flushed in this case before copying. + */ + if (unlikely(userfaultfd_wp(vmf->vma) && + mm_tlb_flush_pending(vmf->vma->vm_mm))) + flush_tlb_page(vmf->vma, vmf->address); + vmf->page = vm_normal_page(vma, vmf->address, vmf->orig_pte); if (!vmf->page) { /*
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Suren Baghdasaryan surenb@google.com
commit 96cfe2c0fd23ea7c2368d14f769d287e7ae1082e upstream.
process_madvise currently requires ptrace attach capability. PTRACE_MODE_ATTACH gives one process complete control over another process. It effectively removes the security boundary between the two processes (in one direction). Granting ptrace attach capability even to a system process is considered dangerous since it creates an attack surface. This severely limits the usage of this API.
The operations process_madvise can perform do not affect the correctness of the operation of the target process; they only affect where the data is physically located (and therefore, how fast it can be accessed). What we want is the ability for one process to influence another process in order to optimize performance across the entire system while leaving the security boundary intact.
Replace PTRACE_MODE_ATTACH with a combination of PTRACE_MODE_READ and CAP_SYS_NICE. PTRACE_MODE_READ to prevent leaking ASLR metadata and CAP_SYS_NICE for influencing process performance.
Link: https://lkml.kernel.org/r/20210303185807.2160264-1-surenb@google.com Signed-off-by: Suren Baghdasaryan surenb@google.com Reviewed-by: Kees Cook keescook@chromium.org Acked-by: Minchan Kim minchan@kernel.org Acked-by: David Rientjes rientjes@google.com Cc: Jann Horn jannh@google.com Cc: Jeff Vander Stoep jeffv@google.com Cc: Michal Hocko mhocko@suse.com Cc: Shakeel Butt shakeelb@google.com Cc: Tim Murray timmurray@google.com Cc: Florian Weimer fweimer@redhat.com Cc: Oleg Nesterov oleg@redhat.com Cc: James Morris jmorris@namei.org Cc: stable@vger.kernel.org [5.10+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/madvise.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)
--- a/mm/madvise.c +++ b/mm/madvise.c @@ -1197,12 +1197,22 @@ SYSCALL_DEFINE5(process_madvise, int, pi goto release_task; }
- mm = mm_access(task, PTRACE_MODE_ATTACH_FSCREDS); + /* Require PTRACE_MODE_READ to avoid leaking ASLR metadata. */ + mm = mm_access(task, PTRACE_MODE_READ_FSCREDS); if (IS_ERR_OR_NULL(mm)) { ret = IS_ERR(mm) ? PTR_ERR(mm) : -ESRCH; goto release_task; }
+ /* + * Require CAP_SYS_NICE for influencing process performance. Note that + * only non-destructive hints are currently supported. + */ + if (!capable(CAP_SYS_NICE)) { + ret = -EPERM; + goto release_mm; + } + total_len = iov_iter_count(&iter);
while (iov_iter_count(&iter)) { @@ -1217,6 +1227,7 @@ SYSCALL_DEFINE5(process_madvise, int, pi if (ret == 0) ret = total_len - iov_iter_count(&iter);
+release_mm: mmput(mm); release_task: put_task_struct(task);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Zhou Guanghui zhouguanghui1@huawei.com
commit e1baddf8475b06cc56f4bafecf9a32a124343d9f upstream.
As described in the split_page() comment, for the non-compound high order page, the sub-pages must be freed individually. If the memcg of the first page is valid, the tail pages cannot be uncharged when be freed.
For example, when alloc_pages_exact is used to allocate 1MB continuous physical memory, 2MB is charged(kmemcg is enabled and __GFP_ACCOUNT is set). When make_alloc_exact free the unused 1MB and free_pages_exact free the applied 1MB, actually, only 4KB(one page) is uncharged.
Therefore, the memcg of the tail page needs to be set when splitting a page.
Michel:
There are at least two explicit users of __GFP_ACCOUNT with alloc_exact_pages added recently. See 7efe8ef274024 ("KVM: arm64: Allocate stage-2 pgd pages with GFP_KERNEL_ACCOUNT") and c419621873713 ("KVM: s390: Add memcg accounting to KVM allocations"), so this is not just a theoretical issue.
Link: https://lkml.kernel.org/r/20210304074053.65527-3-zhouguanghui1@huawei.com Signed-off-by: Zhou Guanghui zhouguanghui1@huawei.com Acked-by: Johannes Weiner hannes@cmpxchg.org Reviewed-by: Zi Yan ziy@nvidia.com Reviewed-by: Shakeel Butt shakeelb@google.com Acked-by: Michal Hocko mhocko@suse.com Cc: Hanjun Guo guohanjun@huawei.com Cc: Hugh Dickins hughd@google.com Cc: Kefeng Wang wangkefeng.wang@huawei.com Cc: "Kirill A. Shutemov" kirill.shutemov@linux.intel.com Cc: Nicholas Piggin npiggin@gmail.com Cc: Rui Xiang rui.xiang@huawei.com Cc: Tianhong Ding dingtianhong@huawei.com Cc: Weilong Chen chenweilong@huawei.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/page_alloc.c | 1 + 1 file changed, 1 insertion(+)
--- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -3313,6 +3313,7 @@ void split_page(struct page *page, unsig for (i = 1; i < (1 << order); i++) set_page_refcounted(page + i); split_page_owner(page, 1 << order); + split_page_memcg(page, 1 << order); } EXPORT_SYMBOL_GPL(split_page);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Zhou Guanghui zhouguanghui1@huawei.com
commit be6c8982e4ab9a41907555f601b711a7e2a17d4c upstream.
Rename mem_cgroup_split_huge_fixup to split_page_memcg and explicitly pass in page number argument.
In this way, the interface name is more common and can be used by potential users. In addition, the complete info(memcg and flag) of the memcg needs to be set to the tail pages.
Link: https://lkml.kernel.org/r/20210304074053.65527-2-zhouguanghui1@huawei.com Signed-off-by: Zhou Guanghui zhouguanghui1@huawei.com Acked-by: Johannes Weiner hannes@cmpxchg.org Reviewed-by: Zi Yan ziy@nvidia.com Reviewed-by: Shakeel Butt shakeelb@google.com Acked-by: Michal Hocko mhocko@suse.com Cc: Hugh Dickins hughd@google.com Cc: "Kirill A. Shutemov" kirill.shutemov@linux.intel.com Cc: Nicholas Piggin npiggin@gmail.com Cc: Kefeng Wang wangkefeng.wang@huawei.com Cc: Hanjun Guo guohanjun@huawei.com Cc: Tianhong Ding dingtianhong@huawei.com Cc: Weilong Chen chenweilong@huawei.com Cc: Rui Xiang rui.xiang@huawei.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/memcontrol.h | 6 ++---- mm/huge_memory.c | 2 +- mm/memcontrol.c | 15 ++++++--------- 3 files changed, 9 insertions(+), 14 deletions(-)
--- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -1072,9 +1072,7 @@ static inline void memcg_memory_event_mm rcu_read_unlock(); }
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE -void mem_cgroup_split_huge_fixup(struct page *head); -#endif +void split_page_memcg(struct page *head, unsigned int nr);
#else /* CONFIG_MEMCG */
@@ -1416,7 +1414,7 @@ unsigned long mem_cgroup_soft_limit_recl return 0; }
-static inline void mem_cgroup_split_huge_fixup(struct page *head) +static inline void split_page_memcg(struct page *head, unsigned int nr) { }
--- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2465,7 +2465,7 @@ static void __split_huge_page(struct pag int i;
/* complete memcg works before add pages to LRU */ - mem_cgroup_split_huge_fixup(head); + split_page_memcg(head, nr);
if (PageAnon(head) && PageSwapCache(head)) { swp_entry_t entry = { .val = page_private(head) }; --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -3296,24 +3296,21 @@ void obj_cgroup_uncharge(struct obj_cgro
#endif /* CONFIG_MEMCG_KMEM */
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE /* - * Because page_memcg(head) is not set on compound tails, set it now. + * Because page_memcg(head) is not set on tails, set it now. */ -void mem_cgroup_split_huge_fixup(struct page *head) +void split_page_memcg(struct page *head, unsigned int nr) { struct mem_cgroup *memcg = page_memcg(head); int i;
- if (mem_cgroup_disabled()) + if (mem_cgroup_disabled() || !memcg) return;
- for (i = 1; i < HPAGE_PMD_NR; i++) { - css_get(&memcg->css); - head[i].memcg_data = (unsigned long)memcg; - } + for (i = 1; i < nr; i++) + head[i].memcg_data = head->memcg_data; + css_get_many(&memcg->css, nr - 1); } -#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
#ifdef CONFIG_MEMCG_SWAP /**
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Mike Rapoport rppt@linux.ibm.com
commit 0740a50b9baa4472cfb12442df4b39e2712a64a4 upstream.
There could be struct pages that are not backed by actual physical memory. This can happen when the actual memory bank is not a multiple of SECTION_SIZE or when an architecture does not register memory holes reserved by the firmware as memblock.memory.
Such pages are currently initialized using init_unavailable_mem() function that iterates through PFNs in holes in memblock.memory and if there is a struct page corresponding to a PFN, the fields of this page are set to default values and it is marked as Reserved.
init_unavailable_mem() does not take into account zone and node the page belongs to and sets both zone and node links in struct page to zero.
Before commit 73a6e474cb37 ("mm: memmap_init: iterate over memblock regions rather that check each PFN") the holes inside a zone were re-initialized during memmap_init() and got their zone/node links right. However, after that commit nothing updates the struct pages representing such holes.
On a system that has firmware reserved holes in a zone above ZONE_DMA, for instance in a configuration below:
# grep -A1 E820 /proc/iomem 7a17b000-7a216fff : Unknown E820 type 7a217000-7bffffff : System RAM
unset zone link in struct page will trigger
VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page), pfn), page);
in set_pfnblock_flags_mask() when called with a struct page from a range other than E820_TYPE_RAM because there are pages in the range of ZONE_DMA32 but the unset zone link in struct page makes them appear as a part of ZONE_DMA.
Interleave initialization of the unavailable pages with the normal initialization of memory map, so that zone and node information will be properly set on struct pages that are not backed by the actual memory.
With this change the pages for holes inside a zone will get proper zone/node links and the pages that are not spanned by any node will get links to the adjacent zone/node. The holes between nodes will be prepended to the zone/node above the hole and the trailing pages in the last section that will be appended to the zone/node below.
[akpm@linux-foundation.org: don't initialize static to zero, use %llu for u64]
Link: https://lkml.kernel.org/r/20210225224351.7356-2-rppt@kernel.org Fixes: 73a6e474cb37 ("mm: memmap_init: iterate over memblock regions rather that check each PFN") Signed-off-by: Mike Rapoport rppt@linux.ibm.com Reported-by: Qian Cai cai@lca.pw Reported-by: Andrea Arcangeli aarcange@redhat.com Reviewed-by: Baoquan He bhe@redhat.com Acked-by: Vlastimil Babka vbabka@suse.cz Reviewed-by: David Hildenbrand david@redhat.com Cc: Borislav Petkov bp@alien8.de Cc: Chris Wilson chris@chris-wilson.co.uk Cc: "H. Peter Anvin" hpa@zytor.com Cc: Łukasz Majczak lma@semihalf.com Cc: Ingo Molnar mingo@redhat.com Cc: Mel Gorman mgorman@suse.de Cc: Michal Hocko mhocko@kernel.org Cc: "Sarvela, Tomi P" tomi.p.sarvela@intel.com Cc: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Mike Rapoport rppt@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/page_alloc.c | 158 ++++++++++++++++++++++++++------------------------------ 1 file changed, 75 insertions(+), 83 deletions(-)
--- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -6262,13 +6262,66 @@ static void __meminit zone_init_free_lis } }
+#if !defined(CONFIG_FLAT_NODE_MEM_MAP) +/* + * Only struct pages that correspond to ranges defined by memblock.memory + * are zeroed and initialized by going through __init_single_page() during + * memmap_init_zone(). + * + * But, there could be struct pages that correspond to holes in + * memblock.memory. This can happen because of the following reasons: + * - physical memory bank size is not necessarily the exact multiple of the + * arbitrary section size + * - early reserved memory may not be listed in memblock.memory + * - memory layouts defined with memmap= kernel parameter may not align + * nicely with memmap sections + * + * Explicitly initialize those struct pages so that: + * - PG_Reserved is set + * - zone and node links point to zone and node that span the page if the + * hole is in the middle of a zone + * - zone and node links point to adjacent zone/node if the hole falls on + * the zone boundary; the pages in such holes will be prepended to the + * zone/node above the hole except for the trailing pages in the last + * section that will be appended to the zone/node below. + */ +static u64 __meminit init_unavailable_range(unsigned long spfn, + unsigned long epfn, + int zone, int node) +{ + unsigned long pfn; + u64 pgcnt = 0; + + for (pfn = spfn; pfn < epfn; pfn++) { + if (!pfn_valid(ALIGN_DOWN(pfn, pageblock_nr_pages))) { + pfn = ALIGN_DOWN(pfn, pageblock_nr_pages) + + pageblock_nr_pages - 1; + continue; + } + __init_single_page(pfn_to_page(pfn), pfn, zone, node); + __SetPageReserved(pfn_to_page(pfn)); + pgcnt++; + } + + return pgcnt; +} +#else +static inline u64 init_unavailable_range(unsigned long spfn, unsigned long epfn, + int zone, int node) +{ + return 0; +} +#endif + void __meminit __weak memmap_init(unsigned long size, int nid, unsigned long zone, unsigned long range_start_pfn) { + static unsigned long hole_pfn; unsigned long start_pfn, end_pfn; unsigned long range_end_pfn = range_start_pfn + size; int i; + u64 pgcnt = 0;
for_each_mem_pfn_range(i, nid, &start_pfn, &end_pfn, NULL) { start_pfn = clamp(start_pfn, range_start_pfn, range_end_pfn); @@ -6279,7 +6332,29 @@ void __meminit __weak memmap_init(unsign memmap_init_zone(size, nid, zone, start_pfn, range_end_pfn, MEMINIT_EARLY, NULL, MIGRATE_MOVABLE); } + + if (hole_pfn < start_pfn) + pgcnt += init_unavailable_range(hole_pfn, start_pfn, + zone, nid); + hole_pfn = end_pfn; } + +#ifdef CONFIG_SPARSEMEM + /* + * Initialize the hole in the range [zone_end_pfn, section_end]. + * If zone boundary falls in the middle of a section, this hole + * will be re-initialized during the call to this function for the + * higher zone. + */ + end_pfn = round_up(range_end_pfn, PAGES_PER_SECTION); + if (hole_pfn < end_pfn) + pgcnt += init_unavailable_range(hole_pfn, end_pfn, + zone, nid); +#endif + + if (pgcnt) + pr_info(" %s zone: %llu pages in unavailable ranges\n", + zone_names[zone], pgcnt); }
static int zone_batchsize(struct zone *zone) @@ -7080,88 +7155,6 @@ void __init free_area_init_memoryless_no free_area_init_node(nid); }
-#if !defined(CONFIG_FLAT_NODE_MEM_MAP) -/* - * Initialize all valid struct pages in the range [spfn, epfn) and mark them - * PageReserved(). Return the number of struct pages that were initialized. - */ -static u64 __init init_unavailable_range(unsigned long spfn, unsigned long epfn) -{ - unsigned long pfn; - u64 pgcnt = 0; - - for (pfn = spfn; pfn < epfn; pfn++) { - if (!pfn_valid(ALIGN_DOWN(pfn, pageblock_nr_pages))) { - pfn = ALIGN_DOWN(pfn, pageblock_nr_pages) - + pageblock_nr_pages - 1; - continue; - } - /* - * Use a fake node/zone (0) for now. Some of these pages - * (in memblock.reserved but not in memblock.memory) will - * get re-initialized via reserve_bootmem_region() later. - */ - __init_single_page(pfn_to_page(pfn), pfn, 0, 0); - __SetPageReserved(pfn_to_page(pfn)); - pgcnt++; - } - - return pgcnt; -} - -/* - * Only struct pages that are backed by physical memory are zeroed and - * initialized by going through __init_single_page(). But, there are some - * struct pages which are reserved in memblock allocator and their fields - * may be accessed (for example page_to_pfn() on some configuration accesses - * flags). We must explicitly initialize those struct pages. - * - * This function also addresses a similar issue where struct pages are left - * uninitialized because the physical address range is not covered by - * memblock.memory or memblock.reserved. That could happen when memblock - * layout is manually configured via memmap=, or when the highest physical - * address (max_pfn) does not end on a section boundary. - */ -static void __init init_unavailable_mem(void) -{ - phys_addr_t start, end; - u64 i, pgcnt; - phys_addr_t next = 0; - - /* - * Loop through unavailable ranges not covered by memblock.memory. - */ - pgcnt = 0; - for_each_mem_range(i, &start, &end) { - if (next < start) - pgcnt += init_unavailable_range(PFN_DOWN(next), - PFN_UP(start)); - next = end; - } - - /* - * Early sections always have a fully populated memmap for the whole - * section - see pfn_valid(). If the last section has holes at the - * end and that section is marked "online", the memmap will be - * considered initialized. Make sure that memmap has a well defined - * state. - */ - pgcnt += init_unavailable_range(PFN_DOWN(next), - round_up(max_pfn, PAGES_PER_SECTION)); - - /* - * Struct pages that do not have backing memory. This could be because - * firmware is using some of this memory, or for some other reasons. - */ - if (pgcnt) - pr_info("Zeroed struct page in unavailable ranges: %lld pages", pgcnt); -} -#else -static inline void __init init_unavailable_mem(void) -{ -} -#endif /* !CONFIG_FLAT_NODE_MEM_MAP */ - #if MAX_NUMNODES > 1 /* * Figure out the number of possible node ids. @@ -7585,7 +7578,6 @@ void __init free_area_init(unsigned long /* Initialise every node */ mminit_verify_pageflags_layout(); setup_nr_node_ids(); - init_unavailable_mem(); for_each_online_node(nid) { pg_data_t *pgdat = NODE_DATA(nid); free_area_init_node(nid);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
From: Andrew Scull ascull@google.com
Commit c4b000c3928d4f20acef79dccf3a65ae3795e0b0 upstream.
When panicking from the nVHE hyp and restoring the host context, x29 is expected to hold a pointer to the host context. This wasn't being done so fix it to make sure there's a valid pointer the host context being used.
Rather than passing a boolean indicating whether or not the host context should be restored, instead pass the pointer to the host context. NULL is passed to indicate that no context should be restored.
Fixes: a2e102e20fd6 ("KVM: arm64: nVHE: Handle hyp panics") Cc: stable@vger.kernel.org # 5.11.y only Signed-off-by: Andrew Scull ascull@google.com Signed-off-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20210219122406.1337626-1-ascull@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/include/asm/kvm_hyp.h | 3 ++- arch/arm64/kvm/hyp/nvhe/host.S | 20 ++++++++++---------- arch/arm64/kvm/hyp/nvhe/switch.c | 3 +-- 3 files changed, 13 insertions(+), 13 deletions(-)
--- a/arch/arm64/include/asm/kvm_hyp.h +++ b/arch/arm64/include/asm/kvm_hyp.h @@ -102,7 +102,8 @@ bool kvm_host_psci_handler(struct kvm_cp
void __noreturn hyp_panic(void); #ifdef __KVM_NVHE_HYPERVISOR__ -void __noreturn __hyp_do_panic(bool restore_host, u64 spsr, u64 elr, u64 par); +void __noreturn __hyp_do_panic(struct kvm_cpu_context *host_ctxt, u64 spsr, + u64 elr, u64 par); #endif
#endif /* __ARM64_KVM_HYP_H__ */ --- a/arch/arm64/kvm/hyp/nvhe/host.S +++ b/arch/arm64/kvm/hyp/nvhe/host.S @@ -71,10 +71,15 @@ SYM_FUNC_START(__host_enter) SYM_FUNC_END(__host_enter)
/* - * void __noreturn __hyp_do_panic(bool restore_host, u64 spsr, u64 elr, u64 par); + * void __noreturn __hyp_do_panic(struct kvm_cpu_context *host_ctxt, u64 spsr, + * u64 elr, u64 par); */ SYM_FUNC_START(__hyp_do_panic) - /* Load the format arguments into x1-7 */ + mov x29, x0 + + /* Load the format string into x0 and arguments into x1-7 */ + ldr x0, =__hyp_panic_string + mov x6, x3 get_vcpu_ptr x7, x3
@@ -89,13 +94,8 @@ SYM_FUNC_START(__hyp_do_panic) ldr lr, =panic msr elr_el2, lr
- /* - * Set the panic format string and enter the host, conditionally - * restoring the host context. - */ - cmp x0, xzr - ldr x0, =__hyp_panic_string - b.eq __host_enter_without_restoring + /* Enter the host, conditionally restoring the host context. */ + cbz x29, __host_enter_without_restoring b __host_enter_for_panic SYM_FUNC_END(__hyp_do_panic)
@@ -150,7 +150,7 @@ SYM_FUNC_END(__hyp_do_panic)
.macro invalid_host_el1_vect .align 7 - mov x0, xzr /* restore_host = false */ + mov x0, xzr /* host_ctxt = NULL */ mrs x1, spsr_el2 mrs x2, elr_el2 mrs x3, par_el1 --- a/arch/arm64/kvm/hyp/nvhe/switch.c +++ b/arch/arm64/kvm/hyp/nvhe/switch.c @@ -266,7 +266,6 @@ void __noreturn hyp_panic(void) u64 spsr = read_sysreg_el2(SYS_SPSR); u64 elr = read_sysreg_el2(SYS_ELR); u64 par = read_sysreg_par(); - bool restore_host = true; struct kvm_cpu_context *host_ctxt; struct kvm_vcpu *vcpu;
@@ -280,7 +279,7 @@ void __noreturn hyp_panic(void) __sysreg_restore_state_nvhe(host_ctxt); }
- __hyp_do_panic(restore_host, spsr, elr, par); + __hyp_do_panic(host_ctxt, spsr, elr, par); unreachable(); }
On Mon, 15 Mar 2021 14:51:03 +0100, gregkh@linuxfoundation.org wrote:
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
This is the start of the stable review cycle for the 5.11.7 release. There are 306 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 17 Mar 2021 13:54:26 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.11.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.11.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v5.11: 12 builds: 12 pass, 0 fail 28 boots: 28 pass, 0 fail 65 tests: 65 pass, 0 fail
Linux version: 5.11.7-rc1-gb87703cf6c9b Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
On Mon, 15 Mar 2021 14:51:03 +0100 gregkh@linuxfoundation.org wrote:
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
This is the start of the stable review cycle for the 5.11.7 release.
Tested on amd64, arm64, armhf, i386, m68k, or1k, powerpc, ppc64, ppc64el, riscv64, s390x, sparc64. No problems detected.
Tested-by: Jason Self jason@bluehome.net
On Mon, 15 Mar 2021 at 19:27, gregkh@linuxfoundation.org wrote:
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
This is the start of the stable review cycle for the 5.11.7 release. There are 306 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 17 Mar 2021 13:54:26 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.11.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.11.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
Summary ------------------------------------------------------------------------
kernel: 5.11.7-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-5.11.y git commit: f109baa2424b2523fbb975de996fedc7a53ef531 git describe: v5.11.6-307-gf109baa2424b Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.11.y/build/v5.11....
No regressions (compared to build v5.11.6)
No fixes (compared to build v5.11.6)
Ran 55999 total tests in the following environments and test suites.
Environments -------------- - arc - arm - arm64 - dragonboard-410c - hi6220-hikey - i386 - juno-64k_page_size - juno-r2 - juno-r2-compat - juno-r2-kasan - mips - nxp-ls2088 - nxp-ls2088-64k_page_size - parisc - powerpc - qemu-arm-clang - qemu-arm64-clang - qemu-arm64-kasan - qemu-i386-clang - qemu-x86_64-clang - qemu-x86_64-kasan - qemu-x86_64-kcsan - qemu_arm - qemu_arm64 - qemu_arm64-compat - qemu_i386 - qemu_x86_64 - qemu_x86_64-compat - riscv - s390 - sh - sparc - x15 - x86 - x86-kasan - x86_64
Test Suites ----------- * build * linux-log-parser * install-android-platform-tools-r2600 * kselftest- * kselftest-efivarfs * kselftest-filesystems * kselftest-firmware * kselftest-fpu * kselftest-futex * kselftest-gpio * kselftest-ipc * kselftest-ir * kselftest-kcmp * kselftest-lib * kselftest-lkdtm * kselftest-membarrier * kselftest-memfd * kselftest-memory-hotplug * kselftest-mincore * kselftest-mount * kselftest-mqueue * kselftest-net * kselftest-netfilter * kselftest-nsfs * kselftest-openat2 * kselftest-pid_namespace * kselftest-pidfd * kselftest-proc * kselftest-pstore * kselftest-rseq * kselftest-rtc * kselftest-seccomp * kselftest-sigaltstack * kselftest-size * kselftest-splice * kselftest-static_keys * kselftest-sync * kselftest-sysctl * kselftest-tc-testing * ltp-cap_bounds-tests * ltp-containers-tests * ltp-cpuhotplug-tests * ltp-crypto-tests * ltp-dio-tests * ltp-fs-tests * ltp-io-tests * ltp-ipc-tests * ltp-syscalls-tests * perf * v4l2-compliance * fwts * kselftest-android * kselftest-bpf * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-timens * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user * kselftest-zram * ltp-commands-tests * ltp-cve-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-math-tests * ltp-nptl-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-tracing-tests * network-basic-tests * kselftest-intel_pstate * kselftest-kexec * kselftest-kvm * kselftest-livepatch * kselftest-ptrace * kselftest-vm * kselftest-x86 * libhugetlbfs * ltp-controllers-tests * ltp-mm-tests * ltp-open-posix-tests * kvm-unit-tests * rcutorture * kunit * kselftest-vsyscall-mode-native- * kselftest-vsyscall-mode-none- * ssuite
On Tue, Mar 16, 2021 at 10:45:40AM +0530, Naresh Kamboju wrote:
On Mon, 15 Mar 2021 at 19:27, gregkh@linuxfoundation.org wrote:
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
This is the start of the stable review cycle for the 5.11.7 release. There are 306 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 17 Mar 2021 13:54:26 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.11.7-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.11.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
Thanks for testing them all and letting me know.
greg k-h
On Mon, Mar 15, 2021 at 02:51:03PM +0100, gregkh@linuxfoundation.org wrote:
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
This is the start of the stable review cycle for the 5.11.7 release. There are 306 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 17 Mar 2021 13:54:26 +0000. Anything received after that time might be too late.
Build results: total: 155 pass: 155 fail: 0 Qemu test results: total: 436 pass: 436 fail: 0
Tested-by: Guenter Roeck linux@roeck-us.net
Guenter
On Tue, Mar 16, 2021 at 02:14:43PM -0700, Guenter Roeck wrote:
On Mon, Mar 15, 2021 at 02:51:03PM +0100, gregkh@linuxfoundation.org wrote:
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
This is the start of the stable review cycle for the 5.11.7 release. There are 306 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 17 Mar 2021 13:54:26 +0000. Anything received after that time might be too late.
Build results: total: 155 pass: 155 fail: 0 Qemu test results: total: 436 pass: 436 fail: 0
Tested-by: Guenter Roeck linux@roeck-us.net
Thanks for testing them all and letting me know.
greg k-h
On Mon, Mar 15, 2021 at 02:51:03PM +0100, gregkh@linuxfoundation.org wrote:
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
This is the start of the stable review cycle for the 5.11.7 release. There are 306 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Compiled and booted with no regressions on x86_64.
Tested-by: Ross Schmidt ross.schm.dev@gmail.com
thanks,
Ross
linux-stable-mirror@lists.linaro.org