This is the start of the stable review cycle for the 5.0.4 release. There are 238 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun Mar 24 11:11:13 UTC 2019. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.0.4-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.0.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.0.4-rc1
Trond Myklebust trond.myklebust@hammerspace.com SUNRPC: Respect RPC call timeouts when retrying transmission
Trond Myklebust trond.myklebust@hammerspace.com SUNRPC: Fix up RPC back channel transmission
Trond Myklebust trond.myklebust@hammerspace.com SUNRPC: Prevent thundering herd when the socket is not connected
Martin Schwidefsky schwidefsky@de.ibm.com s390/setup: fix boot crash for machine without EDAT-1
Johan Hovold johan@kernel.org net: dsa: lantiq_gswip: fix OF child-node lookups
Johan Hovold johan@kernel.org net: dsa: lantiq_gswip: fix use-after-free on failed probe
Sean Christopherson sean.j.christopherson@intel.com KVM: nVMX: Check a single byte for VMCS "launched" in nested early checks
Sean Christopherson sean.j.christopherson@intel.com KVM: nVMX: Ignore limit checks on VMX instructions using flat segments
Sean Christopherson sean.j.christopherson@intel.com KVM: nVMX: Apply addr size mask to effective address for VMX instructions
Sean Christopherson sean.j.christopherson@intel.com KVM: nVMX: Sign extend displacements of VMX instr's mem operands
Sean Christopherson sean.j.christopherson@intel.com KVM: x86/mmu: Do not cache MMIO accesses while memslots are in flux
Sean Christopherson sean.j.christopherson@intel.com KVM: x86/mmu: Detect MMIO generation wrap in any address space
Sean Christopherson sean.j.christopherson@intel.com KVM: VMX: Zero out *all* general purpose registers after VM-Exit
Sean Christopherson sean.j.christopherson@intel.com KVM: VMX: Compare only a single byte for VMCS' "launched" in vCPU-run
Sean Christopherson sean.j.christopherson@intel.com KVM: Call kvm_arch_memslots_updated() before updating memslots
Harry Wentland harry.wentland@amd.com drm/amd/display: don't call dm_pp_ function from an fpu block
Evan Quan evan.quan@amd.com drm/amd/powerplay: correct power reading on fiji
Gustavo A. R. Silva gustavo@embeddedor.com drm/radeon/evergreen_cs: fix missing break in switch statement
Noralf Trønnes noralf@tronnes.org drm/fb-helper: generic: Fix drm_fbdev_client_restore()
Steve Longerbeam slongerbeam@gmail.com media: imx: csi: Stop upstream before disabling IDMA channel
Steve Longerbeam slongerbeam@gmail.com media: imx: csi: Disable CSI immediately after last EOF
Steve Longerbeam slongerbeam@gmail.com media: imx-csi: Input connections to CSI should be optional
Lucas A. M. Magalhães lucmaga@gmail.com media: vimc: Add vimc-streamer for stream control
Sakari Ailus sakari.ailus@linux.intel.com media: uvcvideo: Avoid NULL pointer dereference at the end of streaming
Chen-Yu Tsai wens@csie.org media: sun6i: Fix CSI regmap's max_register
French, Nicholas A naf@ou.edu media: lgdt330x: fix lock status reporting
Steve Longerbeam slongerbeam@gmail.com media: imx: prpencvf: Stop upstream before disabling IDMA channel
Zhang, Jun jun.zhang@intel.com rcu: Do RCU GP kthread self-wakeup from softirq and interrupt
Jarkko Sakkinen jarkko.sakkinen@linux.intel.com tpm: Unify the send callback behaviour
Jarkko Sakkinen jarkko.sakkinen@linux.intel.com tpm/tpm_crb: Avoid unaligned reads in crb_recv()
Steven Rostedt (VMware) rostedt@goodmis.org x86/ftrace: Fix warning and considate ftrace_jmp_replace() and ftrace_call_replace()
Pavel Tatashin pasha.tatashin@soleen.com x86/kvmclock: set offset for kvm unstable clock
Aditya Pakki pakki001@umn.edu md: Fix failed allocation of md_register_thread
Adrian Hunter adrian.hunter@intel.com perf intel-pt: Fix divide by zero when TSC is not available
Kan Liang kan.liang@linux.intel.com perf/x86/intel/uncore: Fix client IMC events return huge result
Adrian Hunter adrian.hunter@intel.com perf intel-pt: Fix overlap calculation for padding
Adrian Hunter adrian.hunter@intel.com perf auxtrace: Define auxtrace record alignment
Adrian Hunter adrian.hunter@intel.com perf tools: Fix split_kallsyms_for_kcore() for trampoline symbols
Adrian Hunter adrian.hunter@intel.com perf intel-pt: Fix CYC timestamp calculation after OVF
Josh Poimboeuf jpoimboe@redhat.com x86/unwind/orc: Fix ORC unwind table alignment
Nicolas Pitre nicolas.pitre@linaro.org vt: perform safe console erase in the right order
Greg Kroah-Hartman gregkh@linuxfoundation.org stable-kernel-rules.rst: add link to networking patch queue
Coly Li colyli@suse.de bcache: use (REQ_META|REQ_PRIO) to indicate bio for metadata
Tang Junhui tang.junhui.linux@gmail.com bcache: treat stale && dirty keys as bad keys
Daniel Axtens dja@axtens.net bcache: never writeback a discard operation
Viresh Kumar viresh.kumar@linaro.org PM / OPP: Update performance state when freq == old_freq
Viresh Kumar viresh.kumar@linaro.org PM / wakeup: Rework wakeup source timer cancellation
J. Bruce Fields bfields@redhat.com svcrpc: fix UDP on servers with lots of threads
Trond Myklebust trond.myklebust@hammerspace.com NFSv4.1: Reinitialise sequence results before retransmitting a request
Yihao Wu wuyihao@linux.alibaba.com nfsd: fix wrong check in write_v4_end_grace()
NeilBrown neilb@suse.com nfsd: fix memory corruption caused by readdir
J. Bruce Fields bfields@redhat.com nfsd: fix performance-limiting session calculation
Trond Myklebust trond.myklebust@hammerspace.com NFS: Don't recoalesce on error in nfs_pageio_complete_mirror()
Trond Myklebust trond.myklebust@hammerspace.com NFS: Fix an I/O request leakage in nfs_do_recoalesce
Trond Myklebust trond.myklebust@hammerspace.com NFS: Fix I/O request leakages
Rafael J. Wysocki rafael.j.wysocki@intel.com cpuidle: governor: Add new governors to cpuidle_governors again
Pavel Machek pavel@ucw.cz cpcap-charger: generate events for userspace
Gustavo A. R. Silva gustavo@embeddedor.com mfd: sm501: Fix potential NULL pointer dereference
Cody P Schafer dev@codyps.com media: cx25840: mark pad sig_types to fix cx231xx init
Mikulas Patocka mpatocka@redhat.com dm integrity: limit the rate of error messages
NeilBrown neil@brown.name dm: fix to_sector() for 32bit
Yang Yingliang yangyingliang@huawei.com ipmi_si: fix use-after-free of resource->name
Corey Minyard cminyard@mvista.com ipmi_si: Fix crash when using hard-coded device
Ben Gardon bgardon@google.com Revert "KVM/MMU: Flush tlb directly in the kvm_zap_gfn_range()"
Dave Martin Dave.Martin@arm.com arm64: KVM: Fix architecturally invalid reset value for FPEXC32_EL2
Will Deacon will.deacon@arm.com arm64: debug: Ensure debug handlers check triggering exception level
Will Deacon will.deacon@arm.com arm64: debug: Don't propagate UNKNOWN FAR into si_code for debug signals
Julien Thierry julien.thierry@arm.com arm64: Fix HCR.TGE status for NMI contexts
Gustavo A. R. Silva gustavo@embeddedor.com ARM: s3c24xx: Fix boolean expressions in osiris_dvs_notify
Christophe Leroy christophe.leroy@c-s.fr powerpc/traps: Fix the message printed when stack overflows
Christophe Leroy christophe.leroy@c-s.fr powerpc/traps: fix recoverability of machine check handling on book3s/32
Nicholas Piggin npiggin@gmail.com powerpc/smp: Fix NMI IPI xmon timeout
Nicholas Piggin npiggin@gmail.com powerpc/smp: Fix NMI IPI timeout
Aneesh Kumar K.V aneesh.kumar@linux.ibm.com powerpc/hugetlb: Don't do runtime allocation of 16G pages in LPAR configuration
Michael Ellerman mpe@ellerman.id.au powerpc/ptrace: Simplify vr_get/set() to avoid GCC warning
Mark Cave-Ayland mark.cave-ayland@ilande.co.uk powerpc: Fix 32-bit KVM-PR lockup and host crash with MacOS guest
Nicholas Piggin npiggin@gmail.com powerpc/64s/hash: Fix assert_slb_presence() use of the slbfee. instruction
Paul Mackerras paulus@ozlabs.org powerpc/powernv: Don't reprogram SLW image on every KVM guest entry/exit
Michael Ellerman mpe@ellerman.id.au powerpc/kvm: Save and restore host AMR/IAMR/UAMOR
Christophe Leroy christophe.leroy@c-s.fr powerpc/83xx: Also save/restore SPRG4-7 during suspend
Jordan Niethe jniethe5@gmail.com powerpc/powernv: Make opal log only readable by root
Christophe Leroy christophe.leroy@c-s.fr powerpc/wii: properly disable use of BATs when requested.
Christophe Leroy christophe.leroy@c-s.fr powerpc/32: Clear on-stack exception marker upon exception return
J. Bruce Fields bfields@redhat.com security/selinux: fix SECURITY_LSM_NATIVE_LABELS on reused superblock
Xin Long lucien.xin@gmail.com selinux: add the missing walk_size + len check in selinux_sctp_bind_connect
zhangyi (F) yi.zhang@huawei.com jbd2: fix compile warning when using JBUFFER_TRACE
zhangyi (F) yi.zhang@huawei.com jbd2: clear dirty flag when revoking a buffer from an older transaction
Jay Dolan jay.dolan@accesio.com serial: 8250_pci: Have ACCES cards that use the four port Pericom PI7C9X7954 chip use the pci_pericom_setup()
Jay Dolan jay.dolan@accesio.com serial: 8250_pci: Fix number of ports for ACCES serial cards
Lubomir Rintel lkundrak@v3.sk serial: 8250_of: assume reg-shift of 2 for mrvl,mmp-uart
Anssi Hannula anssi.hannula@bitwise.fi serial: uartps: Fix stuck ISR if RX disabled with non-empty FIFO
Phuong Nguyen phuong.nguyen.xw@renesas.com dmaengine: usb-dmac: Make DMAC system sleep callbacks explicit
Nikolaus Voss nikolaus.voss@loewensteinmedical.de usb: typec: tps6598x: handle block writes separately with plain-I2C adapters
Dmitry Osipenko digetx@gmail.com usb: chipidea: tegra: Fix missed ci_hdrc_remove_device()
Paul Cercueil paul@crapouillou.net clk: ingenic: Fix doc of ingenic_cgu_div_info
Paul Cercueil paul@crapouillou.net clk: ingenic: Fix round_rate misbehaving with non-integer dividers
Krzysztof Kozlowski krzk@kernel.org clk: samsung: exynos5: Fix kfree() of const memory on setting driver_override
Krzysztof Kozlowski krzk@kernel.org clk: samsung: exynos5: Fix possible NULL pointer exception on platform_device_alloc() failure
Tony Lindgren tony@atomide.com clk: clk-twl6040: Fix imprecise external abort for pdmclk
Kunihiko Hayashi hayashi.kunihiko@socionext.com clk: uniphier: Fix update register for CPU-gear
Jan Kara jack@suse.cz ext2: Fix underflow in ext2_max_size()
Vaibhav Jain vaibhav@linux.ibm.com cxl: Wrap iterations over afu slices inside 'afu_list_lock'
Michael J. Ruhl michael.j.ruhl@intel.com IB/rdmavt: Fix concurrency panics in QP post_send and modify to error
Mike Marciniszyn mike.marciniszyn@intel.com IB/rdmavt: Fix loopback send with invalidate ordering
Michael J. Ruhl michael.j.ruhl@intel.com IB/hfi1: Close race condition on user context disable and close
Thomas Petazzoni thomas.petazzoni@bootlin.com PCI: pci-bridge-emul: Extend pci_bridge_emul_init() with flags
Thomas Petazzoni thomas.petazzoni@bootlin.com PCI: pci-bridge-emul: Create per-bridge copy of register behavior
Mika Westerberg mika.westerberg@linux.intel.com PCI: pciehp: Disable Data Link Layer State Changed event on suspend
Lucas Stach l.stach@pengutronix.de PCI: dwc: skip MSI init if MSIs have been explicitly disabled
Bjorn Andersson bjorn.andersson@linaro.org PCI: qcom: Don't deassert reset GPIO during probe
Dongdong Liu liudongdong3@huawei.com PCI/DPC: Fix print AER status in DPC event handling
Bjorn Helgaas bhelgaas@google.com PCI/ASPM: Use LTR if already enabled by platform
Joerg Roedel jroedel@suse.de swiotlb: Add is_swiotlb_active() function
Joerg Roedel jroedel@suse.de swiotlb: Introduce swiotlb_max_mapping_size()
Joerg Roedel jroedel@suse.de dma: Introduce dma_max_mapping_size()
Jan Kara jack@suse.cz ext4: fix crash during online resizing
yangerkun yangerkun@huawei.com ext4: add mask of ext4 flags to swap
yangerkun yangerkun@huawei.com ext4: update quota information while swapping boot loader inode
Mark Walton mark.walton@serialtek.com gpio: pca953x: Fix dereference of irq data in shutdown
Loic Poulain loic.poulain@linaro.org media: i2c: ov5640: Fix post-reset delay
Sowjanya Komatineni skomatineni@nvidia.com i2c: tegra: update maximum transfer size
Sowjanya Komatineni skomatineni@nvidia.com i2c: tegra: fix maximum transfer size
QiaoChong qiaochong@loongson.cn parport_pc: fix find_superio io compare code, should use equal test.
Alexander Shishkin alexander.shishkin@linux.intel.com intel_th: Don't reference unassigned outputs
Heikki Krogerus heikki.krogerus@linux.intel.com device property: Fix the length used in PROPERTY_ENTRY_STRING()
Bartosz Golaszewski bgolaszewski@baylibre.com nvmem: core: don't check the return value of notifier chain call
Zev Weiss zev@bewilderbeest.net kernel/sysctl.c: add missing range check in do_proc_dointvec_minmax_conv
Jan Stancek jstancek@redhat.com mm/memory.c: do_fault: avoid usage of stale vm_area_struct
Roman Penyaev rpenyaev@suse.de mm/vmalloc: fix size check for remap_vmalloc_range_partial()
zhongjiang zhongjiang@huawei.com mm: hwpoison: fix thp split handing in soft_offline_in_use_page()
yangerkun yangerkun@huawei.com ext4: cleanup pagecache before swap i_data
yangerkun yangerkun@huawei.com ext4: fix check of inode in swap_inode_boot_loader
Arnd Bergmann arnd@arndb.de cpufreq: pxa2xx: remove incorrect __init annotation
Yangtao Li tiny.windzz@gmail.com cpufreq: tegra124: add missing of_node_put()
Viresh Kumar viresh.kumar@linaro.org cpufreq: kryo: Release OPP tables on module removal
Masami Hiramatsu mhiramat@kernel.org x86/kprobes: Prohibit probing on optprobe template code
Doug Berger opendmb@gmail.com irqchip/brcmstb-l2: Use _irqsave locking variants in non-interrupt code
Zenghui Yu yuzenghui@huawei.com irqchip/gic-v3-its: Avoid parsing _indirect_ twice for Device table
Lubomir Rintel lkundrak@v3.sk libertas_tf: don't set URB_ZERO_PACKET on IN USB transfer
Stephen Boyd swboyd@chromium.org soc: qcom: rpmh: Avoid accessing freed memory from batch API
Filipe Manana fdmanana@suse.com Btrfs: fix deadlock between clone/dedupe and rename
Filipe Manana fdmanana@suse.com Btrfs: fix corruption reading shared and compressed extents after hole punching
Dan Robertson dan@dlrobertson.com btrfs: init csum_list before possible free
Johannes Thumshirn jthumshirn@suse.de btrfs: ensure that a DUP or RAID1 block group has exactly two stripes
Dan Carpenter dan.carpenter@oracle.com btrfs: drop the lock on error in btrfs_dev_replace_cancel
Anand Jain anand.jain@oracle.com btrfs: scrub: fix circular locking dependency warning
Filipe Manana fdmanana@suse.com Btrfs: setup a nofs context for memory allocation at __btrfs_set_acl
Filipe Manana fdmanana@suse.com Btrfs: setup a nofs context for memory allocation at btrfs_create_tree()
Finn Thain fthain@telegraphics.com.au m68k: Add -ffreestanding to CFLAGS
Vivek Goyal vgoyal@redhat.com ovl: Do not lose security.capability xattr over metadata file copy-up
Vivek Goyal vgoyal@redhat.com ovl: During copy up, first copy up data and then xattrs
Jann Horn jannh@google.com splice: don't merge into linked buffers
Varad Gautam vrd@amazon.de fs/devpts: always delete dcache dentry-s in dput()
Quinn Tran qtran@marvell.com scsi: qla2xxx: Use complete switch scan for RSCN events
Giridhar Malavali gmalavali@marvell.com scsi: qla2xxx: Avoid PCI IRQ affinity mapping when multiqueue is not supported
Himanshu Madhani hmadhani@marvell.com scsi: qla2xxx: Fix LUN discovery if loop id is not assigned yet by firmware
Bart Van Assche bvanassche@acm.org scsi: target/iscsi: Avoid iscsit_release_commands_from_conn() deadlock
Martin K. Petersen martin.petersen@oracle.com scsi: sd: Optimal I/O size should be a multiple of physical block size
Sagar Biradar sagar.biradar@microchip.com scsi: aacraid: Fix performance issue on logical drives
Felipe Franciosi felipe@nutanix.com scsi: virtio_scsi: don't send sc payload with tmfs
Halil Pasic pasic@linux.ibm.com s390/virtio: handle find on invalid queue gracefully
Martin Schwidefsky schwidefsky@de.ibm.com s390/setup: fix early warning messages
Pierre Morel pmorel@linux.ibm.com s390: vfio_ap: link the vfio_ap devices to the vfio_ap bus subsystem
Samuel Holland samuel@sholland.org clocksource/drivers/arch_timer: Workaround for Allwinner A64 timer instability
Stuart Menefy stuart.menefy@mathembedded.com clocksource/drivers/exynos_mct: Clear timer interrupt when shutdown
Stuart Menefy stuart.menefy@mathembedded.com clocksource/drivers/exynos_mct: Move one-shot check from tick clear to ISR
Felix Fietkau nbd@nbd.name mt76: fix corrupted software generated tx CCMP PN
Stuart Menefy stuart.menefy@mathembedded.com regulator: s2mpa01: Fix step values for some LDOs
Mark Zhang markz@nvidia.com regulator: max77620: Initialize values for DT properties
Krzysztof Kozlowski krzk@kernel.org regulator: s2mps11: Fix steps for buck7, buck8 and LDO35
Russell King rmk+kernel@armlinux.org.uk spi: spi-gpio: fix SPI_CS_HIGH capability
Vignesh R vigneshr@ti.com spi: omap2-mcspi: Fix DMA and FIFO event trigger size mismatch
Andy Shevchenko andriy.shevchenko@linux.intel.com spi: pxa2xx: Setup maximum supported DMA transfer length
Vignesh R vigneshr@ti.com spi: ti-qspi: Fix mmap read when more than one CS in use
Jiong Wu lohengrin1024@gmail.com mmc:fix a bug when max_discard is 0
Takeshi Saito takeshi.saito.xv@renesas.com mmc: renesas_sdhi: Fix card initialization failure in high speed mode
BOUGH CHEN haibo.chen@nxp.com mmc: sdhci-esdhc-imx: fix HS400 timing issue
Andy Shevchenko andriy.shevchenko@linux.intel.com ACPI / device_sysfs: Avoid OF modalias creation for removed device
Juergen Gross jgross@suse.com xen: fix dom0 boot on huge systems
Dan Carpenter dan.carpenter@oracle.com vmw_balloon: release lock on error in vmballoon_reset()
Jann Horn jannh@google.com tracing/perf: Use strndup_user() instead of buggy open-coded version
zhangyi (F) yi.zhang@huawei.com tracing: Do not free iter->trace in fail path of tracing_open_pipe()
Tom Zanussi tom.zanussi@linux.intel.com tracing: Use strncpy instead of memcpy for string keys in hist triggers
Steve French stfrench@microsoft.com smb3: make default i/o size for smb3 mounts larger
Pavel Shilovsky piastryyy@gmail.com CIFS: Fix read after write for files with read caching
Pavel Shilovsky piastryyy@gmail.com CIFS: Do not skip SMB2 message IDs on send failures
Pavel Shilovsky piastryyy@gmail.com CIFS: Do not reset lease state to NONE on lease break
Pavel Shilovsky pshilov@microsoft.com CIFS: Fix leaking locked VFS cache pages in writeback retry
Ard Biesheuvel ard.biesheuvel@linaro.org crypto: arm64/aes-ccm - fix bugs in non-NEON fallback routine
Ard Biesheuvel ard.biesheuvel@linaro.org crypto: arm64/aes-ccm - fix logical bug in AAD MAC handling
Eric Biggers ebiggers@google.com crypto: x86/morus - fix handling chunked inputs and MAY_SLEEP
Eric Biggers ebiggers@google.com crypto: x86/aesni-gcm - fix crash on empty plaintext
Eric Biggers ebiggers@google.com crypto: x86/aegis - fix handling chunked inputs and MAY_SLEEP
Eric Biggers ebiggers@google.com crypto: testmgr - skip crc32c context test for ahash algorithms
Eric Biggers ebiggers@google.com crypto: skcipher - set CRYPTO_TFM_NEED_KEY if ->setkey() fails
Eric Biggers ebiggers@google.com crypto: pcbc - remove bogus memcpy()s with src == dest
Eric Biggers ebiggers@google.com crypto: morus - fix handling chunked inputs
Eric Biggers ebiggers@google.com crypto: hash - set CRYPTO_TFM_NEED_KEY if ->setkey() fails
Ard Biesheuvel ard.biesheuvel@linaro.org crypto: arm64/crct10dif - revert to C code for short inputs
Eric Biggers ebiggers@google.com crypto: arm64/aes-neonbs - fix returning final keystream block
Ard Biesheuvel ard.biesheuvel@linaro.org crypto: arm/crct10dif - revert to C code for short inputs
Eric Biggers ebiggers@google.com crypto: aegis - fix handling chunked inputs
Eric Biggers ebiggers@google.com crypto: aead - set CRYPTO_TFM_NEED_KEY if ->setkey() fails
Al Viro viro@zeniv.linux.org.uk fix cgroup_do_mount() handling of failure exits
Oliver O'Halloran oohall@gmail.com libnvdimm: Fix altmap reservation size calculation
Dan Williams dan.j.williams@intel.com libnvdimm/pmem: Honor force_raw for legacy pmem regions
Wei Yang richardw.yang@linux.intel.com libnvdimm, pfn: Fix over-trim in trim_pfn_device()
Dan Williams dan.j.williams@intel.com libnvdimm/label: Clear 'updating' flag after label-set update
Dan Williams dan.j.williams@intel.com nfit/ars: Attempt short-ARS even in the no_init_ars case
Dan Williams dan.j.williams@intel.com nfit/ars: Attempt a short-ARS whenever the ARS state is idle at boot
Dan Williams dan.j.williams@intel.com acpi/nfit: Fix bus command validation
Dexuan Cui decui@microsoft.com nfit: acpi_nfit_ctl(): Check out_obj->type in the right place
Dan Williams dan.j.williams@intel.com nfit: Fix nfit_intel_shutdown_status() command submission
Matthew Wilcox willy@infradead.org dax: Flush partial PMDs correctly
Zhang Zhijie zhangzj@rock-chips.com crypto: rockchip - update new iv to device in multiple operations
Zhang Zhijie zhangzj@rock-chips.com crypto: rockchip - fix scatterlist nents error
Eric Biggers ebiggers@google.com crypto: ahash - fix another early termination in hash walk
Eric Biggers ebiggers@google.com crypto: ofb - fix handling partial blocks and make thread-safe
Eric Biggers ebiggers@google.com crypto: cfb - remove bogus memcpy() with src == dest
Eric Biggers ebiggers@google.com crypto: cfb - add missing 'chunksize' property
Gilad Ben-Yossef gilad@benyossef.com crypto: ccree - don't copy zero size ciphertext
Gilad Ben-Yossef gilad@benyossef.com crypto: ccree - unmap buffer before copying IV
Hadar Gat hadar.gat@arm.com crypto: ccree - fix free of unallocated mlli buffer
Horia Geantă horia.geanta@nxp.com crypto: caam - fix DMA mapping of stack memory
Pankaj Gupta pankaj.gupta@nxp.com crypto: caam - fixed handling of sg list
Gustavo A. R. Silva gustavo@embeddedor.com crypto: ccree - fix missing break in switch statement
Franck LENORMAND franck.lenormand@nxp.com crypto: caam - fix hash context DMA unmap size
Zhi Jin zhi.jin@intel.com stm class: Fix an endless loop in channel allocation
Alexander Shishkin alexander.shishkin@linux.intel.com stm class: Prevent division by zero
Alexander Usyskin alexander.usyskin@intel.com mei: bus: move hw module get/put to probe/release
Alexander Usyskin alexander.usyskin@intel.com mei: hbm: clean the feature flags on link reset
Krzysztof Kozlowski krzk@kernel.org iio: adc: exynos-adc: Use proper number of channels for Exynos4x12
Krzysztof Kozlowski krzk@kernel.org iio: adc: exynos-adc: Fix NULL pointer exception on unbind
Codrin Ciubotariu codrin.ciubotariu@microchip.com ASoC: codecs: pcm186x: Fix energysense SLEEP bit
Codrin Ciubotariu codrin.ciubotariu@microchip.com ASoC: codecs: pcm186x: fix wrong usage of DECLARE_TLV_DB_SCALE()
S.j. Wang shengjiu.wang@nxp.com ASoC: fsl_esai: fix register setting issue in RIGHT_J mode
zhengbin zhengbin13@huawei.com 9p/net: fix memory leak in p9_client_create
Hou Tao houtao1@huawei.com 9p: use inode->i_lock to protect i_size_write() under 32-bit
-------------
Diffstat:
Documentation/DMA-API.txt | 8 + Documentation/arm64/silicon-errata.txt | 2 + .../bindings/iio/adc/samsung,exynos-adc.txt | 4 +- Documentation/process/stable-kernel-rules.rst | 3 + Makefile | 4 +- arch/arm/crypto/crct10dif-ce-core.S | 14 +- arch/arm/crypto/crct10dif-ce-glue.c | 23 +- arch/arm/mach-s3c24xx/mach-osiris-dvs.c | 8 +- arch/arm64/crypto/aes-ce-ccm-core.S | 5 +- arch/arm64/crypto/aes-ce-ccm-glue.c | 4 +- arch/arm64/crypto/aes-neonbs-core.S | 8 +- arch/arm64/crypto/crct10dif-ce-glue.c | 25 +-- arch/arm64/include/asm/hardirq.h | 31 +++ arch/arm64/kernel/irq.c | 3 + arch/arm64/kernel/kgdb.c | 14 +- arch/arm64/kernel/probes/kprobes.c | 6 + arch/arm64/kvm/sys_regs.c | 2 +- arch/arm64/mm/fault.c | 9 +- arch/m68k/Makefile | 5 +- arch/mips/include/asm/kvm_host.h | 2 +- arch/powerpc/include/asm/book3s/64/hugetlb.h | 8 + arch/powerpc/include/asm/kvm_host.h | 2 +- arch/powerpc/include/asm/powernv.h | 2 + arch/powerpc/kernel/entry_32.S | 9 + arch/powerpc/kernel/process.c | 2 +- arch/powerpc/kernel/ptrace.c | 10 +- arch/powerpc/kernel/smp.c | 90 +++----- arch/powerpc/kernel/traps.c | 12 +- arch/powerpc/kvm/book3s_hv_rmhandlers.S | 26 ++- arch/powerpc/mm/slb.c | 5 + arch/powerpc/platforms/83xx/suspend-asm.S | 34 ++- arch/powerpc/platforms/embedded6xx/wii.c | 4 + arch/powerpc/platforms/powernv/idle.c | 27 +-- arch/powerpc/platforms/powernv/opal-msglog.c | 2 +- arch/powerpc/platforms/powernv/smp.c | 25 +++ arch/s390/include/asm/kvm_host.h | 2 +- arch/s390/kernel/setup.c | 31 ++- arch/x86/crypto/aegis128-aesni-glue.c | 38 ++-- arch/x86/crypto/aegis128l-aesni-glue.c | 38 ++-- arch/x86/crypto/aegis256-aesni-glue.c | 38 ++-- arch/x86/crypto/aesni-intel_glue.c | 13 +- arch/x86/crypto/morus1280_glue.c | 40 ++-- arch/x86/crypto/morus640_glue.c | 39 ++-- arch/x86/events/intel/uncore.c | 1 + arch/x86/events/intel/uncore.h | 12 +- arch/x86/events/intel/uncore_snb.c | 4 +- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/kernel/ftrace.c | 42 ++-- arch/x86/kernel/kprobes/opt.c | 5 + arch/x86/kernel/kvmclock.c | 6 +- arch/x86/kvm/mmu.c | 39 ++-- arch/x86/kvm/vmx/nested.c | 43 +++- arch/x86/kvm/vmx/vmx.c | 16 +- arch/x86/kvm/x86.c | 4 +- arch/x86/kvm/x86.h | 7 +- arch/x86/xen/mmu_pv.c | 13 +- crypto/aead.c | 4 +- crypto/aegis128.c | 14 +- crypto/aegis128l.c | 14 +- crypto/aegis256.c | 14 +- crypto/ahash.c | 42 ++-- crypto/cfb.c | 14 +- crypto/morus1280.c | 13 +- crypto/morus640.c | 13 +- crypto/ofb.c | 91 ++++---- crypto/pcbc.c | 14 +- crypto/shash.c | 18 +- crypto/skcipher.c | 27 ++- crypto/testmgr.c | 14 +- crypto/testmgr.h | 53 ++++- drivers/acpi/device_sysfs.c | 6 +- drivers/acpi/nfit/core.c | 84 ++++---- drivers/base/power/wakeup.c | 8 +- drivers/char/ipmi/ipmi_si.h | 4 +- drivers/char/ipmi/ipmi_si_hardcode.c | 236 +++++++++++++++------ drivers/char/ipmi/ipmi_si_intf.c | 28 ++- drivers/char/ipmi/ipmi_si_mem_io.c | 5 +- drivers/char/ipmi/ipmi_si_platform.c | 29 ++- drivers/char/ipmi/ipmi_si_port_io.c | 5 +- drivers/char/tpm/st33zp24/st33zp24.c | 2 +- drivers/char/tpm/tpm-interface.c | 11 +- drivers/char/tpm/tpm_atmel.c | 2 +- drivers/char/tpm/tpm_crb.c | 22 +- drivers/char/tpm/tpm_i2c_atmel.c | 6 +- drivers/char/tpm/tpm_i2c_infineon.c | 2 +- drivers/char/tpm/tpm_i2c_nuvoton.c | 2 +- drivers/char/tpm/tpm_ibmvtpm.c | 8 +- drivers/char/tpm/tpm_infineon.c | 2 +- drivers/char/tpm/tpm_nsc.c | 2 +- drivers/char/tpm/tpm_tis_core.c | 2 +- drivers/char/tpm/tpm_vtpm_proxy.c | 3 +- drivers/char/tpm/xen-tpmfront.c | 2 +- drivers/clk/clk-twl6040.c | 53 ++++- drivers/clk/ingenic/cgu.c | 10 +- drivers/clk/ingenic/cgu.h | 2 +- drivers/clk/samsung/clk-exynos5-subcmu.c | 13 +- drivers/clk/uniphier/clk-uniphier-cpugear.c | 2 +- drivers/clocksource/Kconfig | 10 + drivers/clocksource/arm_arch_timer.c | 55 +++++ drivers/clocksource/exynos_mct.c | 23 +- drivers/cpufreq/pxa2xx-cpufreq.c | 4 +- drivers/cpufreq/qcom-cpufreq-kryo.c | 20 +- drivers/cpufreq/tegra124-cpufreq.c | 2 + drivers/cpuidle/governor.c | 1 + drivers/crypto/caam/caamalg.c | 1 + drivers/crypto/caam/caamhash.c | 93 +++----- drivers/crypto/ccree/cc_buffer_mgr.c | 8 +- drivers/crypto/ccree/cc_cipher.c | 7 +- drivers/crypto/rockchip/rk3288_crypto.c | 2 +- drivers/crypto/rockchip/rk3288_crypto.h | 4 +- drivers/crypto/rockchip/rk3288_crypto_ablkcipher.c | 39 +++- drivers/crypto/rockchip/rk3288_crypto_ahash.c | 2 +- drivers/dma/sh/usb-dmac.c | 2 + drivers/gpio/gpio-pca953x.c | 3 +- drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c | 8 +- drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 6 +- drivers/gpu/drm/drm_fb_helper.c | 4 +- drivers/gpu/drm/radeon/evergreen_cs.c | 1 + drivers/hwtracing/intel_th/gth.c | 4 + drivers/hwtracing/stm/core.c | 11 +- drivers/i2c/busses/i2c-tegra.c | 8 +- drivers/iio/adc/exynos_adc.c | 19 +- drivers/infiniband/hw/hfi1/hfi.h | 2 +- drivers/infiniband/hw/hfi1/init.c | 14 +- drivers/infiniband/sw/rdmavt/qp.c | 59 ++++-- drivers/irqchip/irq-brcmstb-l2.c | 10 +- drivers/irqchip/irq-gic-v3-its.c | 2 + drivers/md/bcache/extents.c | 13 +- drivers/md/bcache/request.c | 7 +- drivers/md/bcache/writeback.h | 3 + drivers/md/dm-integrity.c | 8 +- drivers/md/raid10.c | 2 + drivers/md/raid5.c | 2 + drivers/media/dvb-frontends/lgdt330x.c | 2 +- drivers/media/i2c/cx25840/cx25840-core.c | 3 +- drivers/media/i2c/cx25840/cx25840-core.h | 1 - drivers/media/i2c/ov5640.c | 2 +- drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c | 2 +- drivers/media/platform/vimc/Makefile | 3 +- drivers/media/platform/vimc/vimc-capture.c | 18 +- drivers/media/platform/vimc/vimc-common.c | 35 --- drivers/media/platform/vimc/vimc-common.h | 15 +- drivers/media/platform/vimc/vimc-debayer.c | 26 +-- drivers/media/platform/vimc/vimc-scaler.c | 28 +-- drivers/media/platform/vimc/vimc-sensor.c | 56 ++--- drivers/media/platform/vimc/vimc-streamer.c | 188 ++++++++++++++++ drivers/media/platform/vimc/vimc-streamer.h | 38 ++++ drivers/media/usb/uvc/uvc_video.c | 8 + drivers/mfd/sm501.c | 3 + drivers/misc/cxl/guest.c | 2 + drivers/misc/cxl/pci.c | 39 +++- drivers/misc/mei/bus.c | 21 +- drivers/misc/mei/hbm.c | 7 + drivers/misc/vmw_balloon.c | 5 +- drivers/mmc/core/core.c | 4 +- drivers/mmc/host/renesas_sdhi_core.c | 11 +- drivers/mmc/host/sdhci-esdhc-imx.c | 1 + drivers/net/dsa/lantiq_gswip.c | 21 +- drivers/net/wireless/marvell/libertas_tf/if_usb.c | 2 - drivers/net/wireless/mediatek/mt76/mt76x02_mac.c | 2 +- drivers/nvdimm/label.c | 23 +- drivers/nvdimm/namespace_devs.c | 4 + drivers/nvdimm/pfn_devs.c | 4 +- drivers/nvmem/core.c | 4 +- drivers/opp/core.c | 2 +- drivers/parport/parport_pc.c | 2 +- drivers/pci/controller/dwc/pcie-designware-host.c | 2 +- drivers/pci/controller/dwc/pcie-qcom.c | 2 +- drivers/pci/controller/pci-aardvark.c | 2 +- drivers/pci/controller/pci-mvebu.c | 2 +- drivers/pci/hotplug/pciehp_hpc.c | 17 +- drivers/pci/pci-bridge-emul.c | 86 +++++--- drivers/pci/pci-bridge-emul.h | 13 +- drivers/pci/pcie/dpc.c | 27 ++- drivers/pci/probe.c | 36 ++-- drivers/power/supply/cpcap-charger.c | 1 + drivers/regulator/max77620-regulator.c | 10 +- drivers/regulator/s2mpa01.c | 10 +- drivers/regulator/s2mps11.c | 6 +- drivers/s390/crypto/vfio_ap_drv.c | 44 +++- drivers/s390/crypto/vfio_ap_ops.c | 4 +- drivers/s390/crypto/vfio_ap_private.h | 1 + drivers/s390/virtio/virtio_ccw.c | 4 +- drivers/scsi/aacraid/linit.c | 13 +- drivers/scsi/qla2xxx/qla_init.c | 99 +-------- drivers/scsi/qla2xxx/qla_isr.c | 2 +- drivers/scsi/qla2xxx/qla_os.c | 2 +- drivers/scsi/sd.c | 59 +++++- drivers/scsi/virtio_scsi.c | 2 - drivers/soc/qcom/rpmh.c | 34 +-- drivers/spi/spi-gpio.c | 4 +- drivers/spi/spi-omap2-mcspi.c | 4 +- drivers/spi/spi-pxa2xx.c | 1 + drivers/spi/spi-ti-qspi.c | 6 +- drivers/staging/media/imx/imx-ic-prpencvf.c | 26 ++- drivers/staging/media/imx/imx-media-csi.c | 44 ++-- drivers/target/iscsi/iscsi_target.c | 4 +- drivers/tty/serial/8250/8250_of.c | 4 + drivers/tty/serial/8250/8250_pci.c | 141 ++++++++++-- drivers/tty/serial/xilinx_uartps.c | 8 +- drivers/tty/vt/vt.c | 15 +- drivers/usb/chipidea/ci_hdrc_tegra.c | 1 + drivers/usb/typec/tps6598x.c | 26 ++- fs/9p/v9fs_vfs.h | 23 +- fs/9p/vfs_file.c | 6 +- fs/9p/vfs_inode.c | 23 +- fs/9p/vfs_inode_dotl.c | 27 +-- fs/9p/vfs_super.c | 4 +- fs/btrfs/acl.c | 9 + fs/btrfs/dev-replace.c | 1 + fs/btrfs/disk-io.c | 8 + fs/btrfs/extent_io.c | 4 +- fs/btrfs/ioctl.c | 21 +- fs/btrfs/scrub.c | 24 +-- fs/btrfs/volumes.c | 4 +- fs/cifs/cifs_fs_sb.h | 1 + fs/cifs/cifsfs.c | 1 + fs/cifs/cifsglob.h | 20 ++ fs/cifs/cifssmb.c | 17 +- fs/cifs/connect.c | 26 ++- fs/cifs/file.c | 12 +- fs/cifs/inode.c | 2 +- fs/cifs/smb2misc.c | 17 +- fs/cifs/smb2ops.c | 28 ++- fs/cifs/smb2transport.c | 14 +- fs/cifs/transport.c | 6 +- fs/dax.c | 19 +- fs/devpts/inode.c | 1 + fs/ext2/super.c | 39 ++-- fs/ext4/ext4.h | 3 + fs/ext4/ioctl.c | 100 ++++++--- fs/ext4/resize.c | 3 +- fs/jbd2/transaction.c | 33 +-- fs/kernfs/mount.c | 8 +- fs/nfs/nfs4proc.c | 12 +- fs/nfs/pagelist.c | 29 ++- fs/nfsd/nfs3proc.c | 16 +- fs/nfsd/nfs3xdr.c | 1 + fs/nfsd/nfs4state.c | 8 +- fs/nfsd/nfsctl.c | 2 +- fs/overlayfs/copy_up.c | 59 ++++-- fs/overlayfs/overlayfs.h | 2 + fs/overlayfs/util.c | 55 +++-- fs/pipe.c | 14 ++ fs/splice.c | 4 + include/asm-generic/vmlinux.lds.h | 2 +- include/linux/device-mapper.h | 2 +- include/linux/dma-mapping.h | 8 + include/linux/hardirq.h | 7 + include/linux/kvm_host.h | 2 +- include/linux/pipe_fs_i.h | 1 + include/linux/property.h | 2 +- include/linux/swiotlb.h | 11 + kernel/cgroup/cgroup.c | 9 +- kernel/dma/direct.c | 11 + kernel/dma/mapping.c | 14 ++ kernel/dma/swiotlb.c | 14 ++ kernel/rcu/tree.c | 19 +- kernel/sysctl.c | 11 +- kernel/trace/trace.c | 1 - kernel/trace/trace_event_perf.c | 16 +- kernel/trace/trace_events_hist.c | 5 +- mm/memory-failure.c | 14 +- mm/memory.c | 5 +- mm/vmalloc.c | 2 +- net/9p/client.c | 2 +- net/sunrpc/clnt.c | 124 ++++++----- net/sunrpc/svcsock.c | 20 +- security/selinux/hooks.c | 8 +- sound/soc/codecs/pcm186x.c | 8 +- sound/soc/fsl/fsl_esai.c | 7 +- tools/perf/util/auxtrace.c | 4 +- tools/perf/util/auxtrace.h | 3 + .../perf/util/intel-pt-decoder/intel-pt-decoder.c | 37 +++- tools/perf/util/intel-pt.c | 2 + tools/perf/util/symbol.c | 2 + virt/kvm/arm/mmu.c | 2 +- virt/kvm/kvm_main.c | 7 +- 278 files changed, 3049 insertions(+), 1609 deletions(-)
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hou Tao houtao1@huawei.com
commit 5e3cc1ee1405a7eb3487ed24f786dec01b4cbe1f upstream.
Use inode->i_lock to protect i_size_write(), else i_size_read() in generic_fillattr() may loop infinitely in read_seqcount_begin() when multiple processes invoke v9fs_vfs_getattr() or v9fs_vfs_getattr_dotl() simultaneously under 32-bit SMP environment, and a soft lockup will be triggered as show below:
watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [stat:2217] Modules linked in: CPU: 5 PID: 2217 Comm: stat Not tainted 5.0.0-rc1-00005-g7f702faf5a9e #4 Hardware name: Generic DT based system PC is at generic_fillattr+0x104/0x108 LR is at 0xec497f00 pc : [<802b8898>] lr : [<ec497f00>] psr: 200c0013 sp : ec497e20 ip : ed608030 fp : ec497e3c r10: 00000000 r9 : ec497f00 r8 : ed608030 r7 : ec497ebc r6 : ec497f00 r5 : ee5c1550 r4 : ee005780 r3 : 0000052d r2 : 00000000 r1 : ec497f00 r0 : ed608030 Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none Control: 10c5387d Table: ac48006a DAC: 00000051 CPU: 5 PID: 2217 Comm: stat Not tainted 5.0.0-rc1-00005-g7f702faf5a9e #4 Hardware name: Generic DT based system Backtrace: [<8010d974>] (dump_backtrace) from [<8010dc88>] (show_stack+0x20/0x24) [<8010dc68>] (show_stack) from [<80a1d194>] (dump_stack+0xb0/0xdc) [<80a1d0e4>] (dump_stack) from [<80109f34>] (show_regs+0x1c/0x20) [<80109f18>] (show_regs) from [<801d0a80>] (watchdog_timer_fn+0x280/0x2f8) [<801d0800>] (watchdog_timer_fn) from [<80198658>] (__hrtimer_run_queues+0x18c/0x380) [<801984cc>] (__hrtimer_run_queues) from [<80198e60>] (hrtimer_run_queues+0xb8/0xf0) [<80198da8>] (hrtimer_run_queues) from [<801973e8>] (run_local_timers+0x28/0x64) [<801973c0>] (run_local_timers) from [<80197460>] (update_process_times+0x3c/0x6c) [<80197424>] (update_process_times) from [<801ab2b8>] (tick_nohz_handler+0xe0/0x1bc) [<801ab1d8>] (tick_nohz_handler) from [<80843050>] (arch_timer_handler_virt+0x38/0x48) [<80843018>] (arch_timer_handler_virt) from [<80180a64>] (handle_percpu_devid_irq+0x8c/0x240) [<801809d8>] (handle_percpu_devid_irq) from [<8017ac20>] (generic_handle_irq+0x34/0x44) [<8017abec>] (generic_handle_irq) from [<8017b344>] (__handle_domain_irq+0x6c/0xc4) [<8017b2d8>] (__handle_domain_irq) from [<801022e0>] (gic_handle_irq+0x4c/0x88) [<80102294>] (gic_handle_irq) from [<80101a30>] (__irq_svc+0x70/0x98) [<802b8794>] (generic_fillattr) from [<8056b284>] (v9fs_vfs_getattr_dotl+0x74/0xa4) [<8056b210>] (v9fs_vfs_getattr_dotl) from [<802b8904>] (vfs_getattr_nosec+0x68/0x7c) [<802b889c>] (vfs_getattr_nosec) from [<802b895c>] (vfs_getattr+0x44/0x48) [<802b8918>] (vfs_getattr) from [<802b8a74>] (vfs_statx+0x9c/0xec) [<802b89d8>] (vfs_statx) from [<802b9428>] (sys_lstat64+0x48/0x78) [<802b93e0>] (sys_lstat64) from [<80101000>] (ret_fast_syscall+0x0/0x28)
[dominique.martinet@cea.fr: updated comment to not refer to a function in another subsystem] Link: http://lkml.kernel.org/r/20190124063514.8571-2-houtao1@huawei.com Cc: stable@vger.kernel.org Fixes: 7549ae3e81cc ("9p: Use the i_size_[read, write]() macros instead of using inode->i_size directly.") Reported-by: Xing Gaopeng xingaopeng@huawei.com Signed-off-by: Hou Tao houtao1@huawei.com Signed-off-by: Dominique Martinet dominique.martinet@cea.fr Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/9p/v9fs_vfs.h | 23 +++++++++++++++++++++-- fs/9p/vfs_file.c | 6 +++++- fs/9p/vfs_inode.c | 23 +++++++++++------------ fs/9p/vfs_inode_dotl.c | 27 ++++++++++++++------------- fs/9p/vfs_super.c | 4 ++-- 5 files changed, 53 insertions(+), 30 deletions(-)
--- a/fs/9p/v9fs_vfs.h +++ b/fs/9p/v9fs_vfs.h @@ -40,6 +40,9 @@ */ #define P9_LOCK_TIMEOUT (30*HZ)
+/* flags for v9fs_stat2inode() & v9fs_stat2inode_dotl() */ +#define V9FS_STAT2INODE_KEEP_ISIZE 1 + extern struct file_system_type v9fs_fs_type; extern const struct address_space_operations v9fs_addr_operations; extern const struct file_operations v9fs_file_operations; @@ -61,8 +64,10 @@ int v9fs_init_inode(struct v9fs_session_ struct inode *inode, umode_t mode, dev_t); void v9fs_evict_inode(struct inode *inode); ino_t v9fs_qid2ino(struct p9_qid *qid); -void v9fs_stat2inode(struct p9_wstat *, struct inode *, struct super_block *); -void v9fs_stat2inode_dotl(struct p9_stat_dotl *, struct inode *); +void v9fs_stat2inode(struct p9_wstat *stat, struct inode *inode, + struct super_block *sb, unsigned int flags); +void v9fs_stat2inode_dotl(struct p9_stat_dotl *stat, struct inode *inode, + unsigned int flags); int v9fs_dir_release(struct inode *inode, struct file *filp); int v9fs_file_open(struct inode *inode, struct file *file); void v9fs_inode2stat(struct inode *inode, struct p9_wstat *stat); @@ -83,4 +88,18 @@ static inline void v9fs_invalidate_inode }
int v9fs_open_to_dotl_flags(int flags); + +static inline void v9fs_i_size_write(struct inode *inode, loff_t i_size) +{ + /* + * 32-bit need the lock, concurrent updates could break the + * sequences and make i_size_read() loop forever. + * 64-bit updates are atomic and can skip the locking. + */ + if (sizeof(i_size) > sizeof(long)) + spin_lock(&inode->i_lock); + i_size_write(inode, i_size); + if (sizeof(i_size) > sizeof(long)) + spin_unlock(&inode->i_lock); +} #endif --- a/fs/9p/vfs_file.c +++ b/fs/9p/vfs_file.c @@ -446,7 +446,11 @@ v9fs_file_write_iter(struct kiocb *iocb, i_size = i_size_read(inode); if (iocb->ki_pos > i_size) { inode_add_bytes(inode, iocb->ki_pos - i_size); - i_size_write(inode, iocb->ki_pos); + /* + * Need to serialize against i_size_write() in + * v9fs_stat2inode() + */ + v9fs_i_size_write(inode, iocb->ki_pos); } return retval; } --- a/fs/9p/vfs_inode.c +++ b/fs/9p/vfs_inode.c @@ -538,7 +538,7 @@ static struct inode *v9fs_qid_iget(struc if (retval) goto error;
- v9fs_stat2inode(st, inode, sb); + v9fs_stat2inode(st, inode, sb, 0); v9fs_cache_inode_get_cookie(inode); unlock_new_inode(inode); return inode; @@ -1092,7 +1092,7 @@ v9fs_vfs_getattr(const struct path *path if (IS_ERR(st)) return PTR_ERR(st);
- v9fs_stat2inode(st, d_inode(dentry), dentry->d_sb); + v9fs_stat2inode(st, d_inode(dentry), dentry->d_sb, 0); generic_fillattr(d_inode(dentry), stat);
p9stat_free(st); @@ -1170,12 +1170,13 @@ static int v9fs_vfs_setattr(struct dentr * @stat: Plan 9 metadata (mistat) structure * @inode: inode to populate * @sb: superblock of filesystem + * @flags: control flags (e.g. V9FS_STAT2INODE_KEEP_ISIZE) * */
void v9fs_stat2inode(struct p9_wstat *stat, struct inode *inode, - struct super_block *sb) + struct super_block *sb, unsigned int flags) { umode_t mode; char ext[32]; @@ -1216,10 +1217,11 @@ v9fs_stat2inode(struct p9_wstat *stat, s mode = p9mode2perm(v9ses, stat); mode |= inode->i_mode & ~S_IALLUGO; inode->i_mode = mode; - i_size_write(inode, stat->length);
+ if (!(flags & V9FS_STAT2INODE_KEEP_ISIZE)) + v9fs_i_size_write(inode, stat->length); /* not real number of blocks, but 512 byte ones ... */ - inode->i_blocks = (i_size_read(inode) + 512 - 1) >> 9; + inode->i_blocks = (stat->length + 512 - 1) >> 9; v9inode->cache_validity &= ~V9FS_INO_INVALID_ATTR; }
@@ -1416,9 +1418,9 @@ int v9fs_refresh_inode(struct p9_fid *fi { int umode; dev_t rdev; - loff_t i_size; struct p9_wstat *st; struct v9fs_session_info *v9ses; + unsigned int flags;
v9ses = v9fs_inode2v9ses(inode); st = p9_client_stat(fid); @@ -1431,16 +1433,13 @@ int v9fs_refresh_inode(struct p9_fid *fi if ((inode->i_mode & S_IFMT) != (umode & S_IFMT)) goto out;
- spin_lock(&inode->i_lock); /* * We don't want to refresh inode->i_size, * because we may have cached data */ - i_size = inode->i_size; - v9fs_stat2inode(st, inode, inode->i_sb); - if (v9ses->cache == CACHE_LOOSE || v9ses->cache == CACHE_FSCACHE) - inode->i_size = i_size; - spin_unlock(&inode->i_lock); + flags = (v9ses->cache == CACHE_LOOSE || v9ses->cache == CACHE_FSCACHE) ? + V9FS_STAT2INODE_KEEP_ISIZE : 0; + v9fs_stat2inode(st, inode, inode->i_sb, flags); out: p9stat_free(st); kfree(st); --- a/fs/9p/vfs_inode_dotl.c +++ b/fs/9p/vfs_inode_dotl.c @@ -143,7 +143,7 @@ static struct inode *v9fs_qid_iget_dotl( if (retval) goto error;
- v9fs_stat2inode_dotl(st, inode); + v9fs_stat2inode_dotl(st, inode, 0); v9fs_cache_inode_get_cookie(inode); retval = v9fs_get_acl(inode, fid); if (retval) @@ -496,7 +496,7 @@ v9fs_vfs_getattr_dotl(const struct path if (IS_ERR(st)) return PTR_ERR(st);
- v9fs_stat2inode_dotl(st, d_inode(dentry)); + v9fs_stat2inode_dotl(st, d_inode(dentry), 0); generic_fillattr(d_inode(dentry), stat); /* Change block size to what the server returned */ stat->blksize = st->st_blksize; @@ -607,11 +607,13 @@ int v9fs_vfs_setattr_dotl(struct dentry * v9fs_stat2inode_dotl - populate an inode structure with stat info * @stat: stat structure * @inode: inode to populate + * @flags: ctrl flags (e.g. V9FS_STAT2INODE_KEEP_ISIZE) * */
void -v9fs_stat2inode_dotl(struct p9_stat_dotl *stat, struct inode *inode) +v9fs_stat2inode_dotl(struct p9_stat_dotl *stat, struct inode *inode, + unsigned int flags) { umode_t mode; struct v9fs_inode *v9inode = V9FS_I(inode); @@ -631,7 +633,8 @@ v9fs_stat2inode_dotl(struct p9_stat_dotl mode |= inode->i_mode & ~S_IALLUGO; inode->i_mode = mode;
- i_size_write(inode, stat->st_size); + if (!(flags & V9FS_STAT2INODE_KEEP_ISIZE)) + v9fs_i_size_write(inode, stat->st_size); inode->i_blocks = stat->st_blocks; } else { if (stat->st_result_mask & P9_STATS_ATIME) { @@ -661,8 +664,9 @@ v9fs_stat2inode_dotl(struct p9_stat_dotl } if (stat->st_result_mask & P9_STATS_RDEV) inode->i_rdev = new_decode_dev(stat->st_rdev); - if (stat->st_result_mask & P9_STATS_SIZE) - i_size_write(inode, stat->st_size); + if (!(flags & V9FS_STAT2INODE_KEEP_ISIZE) && + stat->st_result_mask & P9_STATS_SIZE) + v9fs_i_size_write(inode, stat->st_size); if (stat->st_result_mask & P9_STATS_BLOCKS) inode->i_blocks = stat->st_blocks; } @@ -928,9 +932,9 @@ v9fs_vfs_get_link_dotl(struct dentry *de
int v9fs_refresh_inode_dotl(struct p9_fid *fid, struct inode *inode) { - loff_t i_size; struct p9_stat_dotl *st; struct v9fs_session_info *v9ses; + unsigned int flags;
v9ses = v9fs_inode2v9ses(inode); st = p9_client_getattr_dotl(fid, P9_STATS_ALL); @@ -942,16 +946,13 @@ int v9fs_refresh_inode_dotl(struct p9_fi if ((inode->i_mode & S_IFMT) != (st->st_mode & S_IFMT)) goto out;
- spin_lock(&inode->i_lock); /* * We don't want to refresh inode->i_size, * because we may have cached data */ - i_size = inode->i_size; - v9fs_stat2inode_dotl(st, inode); - if (v9ses->cache == CACHE_LOOSE || v9ses->cache == CACHE_FSCACHE) - inode->i_size = i_size; - spin_unlock(&inode->i_lock); + flags = (v9ses->cache == CACHE_LOOSE || v9ses->cache == CACHE_FSCACHE) ? + V9FS_STAT2INODE_KEEP_ISIZE : 0; + v9fs_stat2inode_dotl(st, inode, flags); out: kfree(st); return 0; --- a/fs/9p/vfs_super.c +++ b/fs/9p/vfs_super.c @@ -172,7 +172,7 @@ static struct dentry *v9fs_mount(struct goto release_sb; } d_inode(root)->i_ino = v9fs_qid2ino(&st->qid); - v9fs_stat2inode_dotl(st, d_inode(root)); + v9fs_stat2inode_dotl(st, d_inode(root), 0); kfree(st); } else { struct p9_wstat *st = NULL; @@ -183,7 +183,7 @@ static struct dentry *v9fs_mount(struct }
d_inode(root)->i_ino = v9fs_qid2ino(&st->qid); - v9fs_stat2inode(st, d_inode(root), sb); + v9fs_stat2inode(st, d_inode(root), sb, 0);
p9stat_free(st); kfree(st);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: zhengbin zhengbin13@huawei.com
commit bb06c388fa20ae24cfe80c52488de718a7e3a53f upstream.
If msize is less than 4096, we should close and put trans, destroy tagpool, not just free client. This patch fixes that.
Link: http://lkml.kernel.org/m/1552464097-142659-1-git-send-email-zhengbin13@huawe... Cc: stable@vger.kernel.org Fixes: 574d356b7a02 ("9p/net: put a lower bound on msize") Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: zhengbin zhengbin13@huawei.com Signed-off-by: Dominique Martinet dominique.martinet@cea.fr Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/9p/client.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/9p/client.c +++ b/net/9p/client.c @@ -1061,7 +1061,7 @@ struct p9_client *p9_client_create(const p9_debug(P9_DEBUG_ERROR, "Please specify a msize of at least 4k\n"); err = -EINVAL; - goto free_client; + goto close_trans; }
err = p9_client_version(clnt);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: S.j. Wang shengjiu.wang@nxp.com
commit cc29ea007347f39f4c5a4d27b0b555955a0277f9 upstream.
The ESAI_xCR_xWA is xCR's bit, not the xCCR's bit, driver set it to wrong register, correct it.
Fixes 43d24e76b698 ("ASoC: fsl_esai: Add ESAI CPU DAI driver") Cc: stable@vger.kernel.org Signed-off-by: Shengjiu Wang shengjiu.wang@nxp.com Reviewed-by: Fabio Estevam festevam@gmail.com Ackedy-by: Nicolin Chen nicoleotsuka@gmail.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/soc/fsl/fsl_esai.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/sound/soc/fsl/fsl_esai.c +++ b/sound/soc/fsl/fsl_esai.c @@ -398,7 +398,8 @@ static int fsl_esai_set_dai_fmt(struct s break; case SND_SOC_DAIFMT_RIGHT_J: /* Data on rising edge of bclk, frame high, right aligned */ - xccr |= ESAI_xCCR_xCKP | ESAI_xCCR_xHCKP | ESAI_xCR_xWA; + xccr |= ESAI_xCCR_xCKP | ESAI_xCCR_xHCKP; + xcr |= ESAI_xCR_xWA; break; case SND_SOC_DAIFMT_DSP_A: /* Data on rising edge of bclk, frame high, 1clk before data */ @@ -455,12 +456,12 @@ static int fsl_esai_set_dai_fmt(struct s return -EINVAL; }
- mask = ESAI_xCR_xFSL | ESAI_xCR_xFSR; + mask = ESAI_xCR_xFSL | ESAI_xCR_xFSR | ESAI_xCR_xWA; regmap_update_bits(esai_priv->regmap, REG_ESAI_TCR, mask, xcr); regmap_update_bits(esai_priv->regmap, REG_ESAI_RCR, mask, xcr);
mask = ESAI_xCCR_xCKP | ESAI_xCCR_xHCKP | ESAI_xCCR_xFSP | - ESAI_xCCR_xFSD | ESAI_xCCR_xCKD | ESAI_xCR_xWA; + ESAI_xCCR_xFSD | ESAI_xCCR_xCKD; regmap_update_bits(esai_priv->regmap, REG_ESAI_TCCR, mask, xccr); regmap_update_bits(esai_priv->regmap, REG_ESAI_RCCR, mask, xccr);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Codrin Ciubotariu codrin.ciubotariu@microchip.com
commit fcf4daabf08079e6d09958a2992e7446ef8d0438 upstream.
According to DS, the gain is between -12 dB and 40 dB, with a 0.5 dB step. Tested on pcm1863.
Signed-off-by: Codrin Ciubotariu codrin.ciubotariu@microchip.com Acked-by: Andrew F. Davis afd@ti.com Signed-off-by: Mark Brown broonie@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/soc/codecs/pcm186x.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/sound/soc/codecs/pcm186x.c +++ b/sound/soc/codecs/pcm186x.c @@ -42,7 +42,7 @@ struct pcm186x_priv { bool is_master_mode; };
-static const DECLARE_TLV_DB_SCALE(pcm186x_pga_tlv, -1200, 4000, 50); +static const DECLARE_TLV_DB_SCALE(pcm186x_pga_tlv, -1200, 50, 0);
static const struct snd_kcontrol_new pcm1863_snd_controls[] = { SOC_DOUBLE_R_S_TLV("ADC Capture Volume", PCM186X_PGA_VAL_CH1_L,
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Codrin Ciubotariu codrin.ciubotariu@microchip.com
commit 05bd7fcdd06b19a10f069af1bea3ad9abac038d7 upstream.
The ADCs are sleeping when the SLEEP bit is set and running when it's cleared, so the bit should be inverted. Tested on pcm1863.
Signed-off-by: Codrin Ciubotariu codrin.ciubotariu@microchip.com Acked-by: Andrew F. Davis afd@ti.com Signed-off-by: Mark Brown broonie@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/soc/codecs/pcm186x.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/sound/soc/codecs/pcm186x.c +++ b/sound/soc/codecs/pcm186x.c @@ -158,7 +158,7 @@ static const struct snd_soc_dapm_widget * Put the codec into SLEEP mode when not in use, allowing the * Energysense mechanism to operate. */ - SND_SOC_DAPM_ADC("ADC", "HiFi Capture", PCM186X_POWER_CTRL, 1, 0), + SND_SOC_DAPM_ADC("ADC", "HiFi Capture", PCM186X_POWER_CTRL, 1, 1), };
static const struct snd_soc_dapm_widget pcm1865_dapm_widgets[] = { @@ -184,8 +184,8 @@ static const struct snd_soc_dapm_widget * Put the codec into SLEEP mode when not in use, allowing the * Energysense mechanism to operate. */ - SND_SOC_DAPM_ADC("ADC1", "HiFi Capture 1", PCM186X_POWER_CTRL, 1, 0), - SND_SOC_DAPM_ADC("ADC2", "HiFi Capture 2", PCM186X_POWER_CTRL, 1, 0), + SND_SOC_DAPM_ADC("ADC1", "HiFi Capture 1", PCM186X_POWER_CTRL, 1, 1), + SND_SOC_DAPM_ADC("ADC2", "HiFi Capture 2", PCM186X_POWER_CTRL, 1, 1), };
static const struct snd_soc_dapm_route pcm1863_dapm_routes[] = {
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski krzk@kernel.org
commit 2ea8bab4dd2a9014e723b28091831fa850b82d83 upstream.
Fix NULL pointer exception on device unbind when device tree does not contain "has-touchscreen" property. In such case the input device is not registered so it should not be unregistered.
$ echo "12d10000.adc" > /sys/bus/platform/drivers/exynos-adc/unbind
Unable to handle kernel NULL pointer dereference at virtual address 00000474 ... (input_unregister_device) from [<c0772060>] (exynos_adc_remove+0x20/0x80) (exynos_adc_remove) from [<c0587d5c>] (platform_drv_remove+0x20/0x40) (platform_drv_remove) from [<c05860f0>] (device_release_driver_internal+0xdc/0x1ac) (device_release_driver_internal) from [<c0583ecc>] (unbind_store+0x60/0xd4) (unbind_store) from [<c031b89c>] (kernfs_fop_write+0x100/0x1e0) (kernfs_fop_write) from [<c029709c>] (__vfs_write+0x2c/0x17c) (__vfs_write) from [<c0297374>] (vfs_write+0xa4/0x184) (vfs_write) from [<c0297594>] (ksys_write+0x4c/0xac) (ksys_write) from [<c0101000>] (ret_fast_syscall+0x0/0x28)
Fixes: 2bb8ad9b44c5 ("iio: exynos-adc: add experimental touchscreen support") Cc: stable@vger.kernel.org Signed-off-by: Krzysztof Kozlowski krzk@kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iio/adc/exynos_adc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/iio/adc/exynos_adc.c +++ b/drivers/iio/adc/exynos_adc.c @@ -929,7 +929,7 @@ static int exynos_adc_remove(struct plat struct iio_dev *indio_dev = platform_get_drvdata(pdev); struct exynos_adc *info = iio_priv(indio_dev);
- if (IS_REACHABLE(CONFIG_INPUT)) { + if (IS_REACHABLE(CONFIG_INPUT) && info->input) { free_irq(info->tsirq, info); input_unregister_device(info->input); }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski krzk@kernel.org
commit 103cda6a3b8d2c10d5f8cd7abad118e9db8f4776 upstream.
Exynos4212 and Exynos4412 have only four ADC channels so using "samsung,exynos-adc-v1" compatible (for eight channels ADCv1) on them is wrong. Add a new compatible for Exynos4x12.
Signed-off-by: Krzysztof Kozlowski krzk@kernel.org Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Documentation/devicetree/bindings/iio/adc/samsung,exynos-adc.txt | 4 +- drivers/iio/adc/exynos_adc.c | 17 ++++++++++ 2 files changed, 20 insertions(+), 1 deletion(-)
--- a/Documentation/devicetree/bindings/iio/adc/samsung,exynos-adc.txt +++ b/Documentation/devicetree/bindings/iio/adc/samsung,exynos-adc.txt @@ -11,11 +11,13 @@ New driver handles the following
Required properties: - compatible: Must be "samsung,exynos-adc-v1" - for exynos4412/5250 controllers. + for Exynos5250 controllers. Must be "samsung,exynos-adc-v2" for future controllers. Must be "samsung,exynos3250-adc" for controllers compatible with ADC of Exynos3250. + Must be "samsung,exynos4212-adc" for + controllers compatible with ADC of Exynos4212 and Exynos4412. Must be "samsung,exynos7-adc" for the ADC in Exynos7 and compatibles Must be "samsung,s3c2410-adc" for --- a/drivers/iio/adc/exynos_adc.c +++ b/drivers/iio/adc/exynos_adc.c @@ -115,6 +115,7 @@ #define MAX_ADC_V2_CHANNELS 10 #define MAX_ADC_V1_CHANNELS 8 #define MAX_EXYNOS3250_ADC_CHANNELS 2 +#define MAX_EXYNOS4212_ADC_CHANNELS 4 #define MAX_S5PV210_ADC_CHANNELS 10
/* Bit definitions common for ADC_V1 and ADC_V2 */ @@ -271,6 +272,19 @@ static void exynos_adc_v1_start_conv(str writel(con1 | ADC_CON_EN_START, ADC_V1_CON(info->regs)); }
+/* Exynos4212 and 4412 is like ADCv1 but with four channels only */ +static const struct exynos_adc_data exynos4212_adc_data = { + .num_channels = MAX_EXYNOS4212_ADC_CHANNELS, + .mask = ADC_DATX_MASK, /* 12 bit ADC resolution */ + .needs_adc_phy = true, + .phy_offset = EXYNOS_ADCV1_PHY_OFFSET, + + .init_hw = exynos_adc_v1_init_hw, + .exit_hw = exynos_adc_v1_exit_hw, + .clear_irq = exynos_adc_v1_clear_irq, + .start_conv = exynos_adc_v1_start_conv, +}; + static const struct exynos_adc_data exynos_adc_v1_data = { .num_channels = MAX_ADC_V1_CHANNELS, .mask = ADC_DATX_MASK, /* 12 bit ADC resolution */ @@ -493,6 +507,9 @@ static const struct of_device_id exynos_ .compatible = "samsung,s5pv210-adc", .data = &exynos_adc_s5pv210_data, }, { + .compatible = "samsung,exynos4212-adc", + .data = &exynos4212_adc_data, + }, { .compatible = "samsung,exynos-adc-v1", .data = &exynos_adc_v1_data, }, {
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Usyskin alexander.usyskin@intel.com
commit 37fd0b623023484ef6df79ed46f21f06ecc611ff upstream.
The list of supported functions can be altered upon link reset, clean the flags to allow correct selections of supported features.
Cc: stable@vger.kernel.org v4.19+ Signed-off-by: Alexander Usyskin alexander.usyskin@intel.com Signed-off-by: Tomas Winkler tomas.winkler@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/misc/mei/hbm.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/misc/mei/hbm.c +++ b/drivers/misc/mei/hbm.c @@ -1029,29 +1029,36 @@ static void mei_hbm_config_features(stru dev->version.minor_version >= HBM_MINOR_VERSION_PGI) dev->hbm_f_pg_supported = 1;
+ dev->hbm_f_dc_supported = 0; if (dev->version.major_version >= HBM_MAJOR_VERSION_DC) dev->hbm_f_dc_supported = 1;
+ dev->hbm_f_ie_supported = 0; if (dev->version.major_version >= HBM_MAJOR_VERSION_IE) dev->hbm_f_ie_supported = 1;
/* disconnect on connect timeout instead of link reset */ + dev->hbm_f_dot_supported = 0; if (dev->version.major_version >= HBM_MAJOR_VERSION_DOT) dev->hbm_f_dot_supported = 1;
/* Notification Event Support */ + dev->hbm_f_ev_supported = 0; if (dev->version.major_version >= HBM_MAJOR_VERSION_EV) dev->hbm_f_ev_supported = 1;
/* Fixed Address Client Support */ + dev->hbm_f_fa_supported = 0; if (dev->version.major_version >= HBM_MAJOR_VERSION_FA) dev->hbm_f_fa_supported = 1;
/* OS ver message Support */ + dev->hbm_f_os_supported = 0; if (dev->version.major_version >= HBM_MAJOR_VERSION_OS) dev->hbm_f_os_supported = 1;
/* DMA Ring Support */ + dev->hbm_f_dr_supported = 0; if (dev->version.major_version > HBM_MAJOR_VERSION_DR || (dev->version.major_version == HBM_MAJOR_VERSION_DR && dev->version.minor_version >= HBM_MINOR_VERSION_DR))
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Usyskin alexander.usyskin@intel.com
commit b5958faa34e2f99f3475ad89c52d98dfea079d33 upstream.
Fix unbalanced module reference counting during internal reset, which prevents the drivers unloading. Tracking mei_me/txe modules on mei client bus via mei_cldev_enable/disable is error prone due to possible internal reset flow, where clients are disconnected underneath. Moving reference counting to probe and release of mei bus client driver solves this issue in simplest way, as each client provides only a single connection to a client bus driver.
Cc: stable@vger.kernel.org Signed-off-by: Alexander Usyskin alexander.usyskin@intel.com Signed-off-by: Tomas Winkler tomas.winkler@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/misc/mei/bus.c | 21 ++++++++++----------- 1 file changed, 10 insertions(+), 11 deletions(-)
--- a/drivers/misc/mei/bus.c +++ b/drivers/misc/mei/bus.c @@ -541,17 +541,9 @@ int mei_cldev_enable(struct mei_cl_devic goto out; }
- if (!mei_cl_bus_module_get(cldev)) { - dev_err(&cldev->dev, "get hw module failed"); - ret = -ENODEV; - goto out; - } - ret = mei_cl_connect(cl, cldev->me_cl, NULL); - if (ret < 0) { + if (ret < 0) dev_err(&cldev->dev, "cannot connect\n"); - mei_cl_bus_module_put(cldev); - }
out: mutex_unlock(&bus->device_lock); @@ -614,7 +606,6 @@ int mei_cldev_disable(struct mei_cl_devi if (err < 0) dev_err(bus->dev, "Could not disconnect from the ME client\n");
- mei_cl_bus_module_put(cldev); out: /* Flush queues and remove any pending read */ mei_cl_flush_queues(cl, NULL); @@ -725,9 +716,16 @@ static int mei_cl_device_probe(struct de if (!id) return -ENODEV;
+ if (!mei_cl_bus_module_get(cldev)) { + dev_err(&cldev->dev, "get hw module failed"); + return -ENODEV; + } + ret = cldrv->probe(cldev, id); - if (ret) + if (ret) { + mei_cl_bus_module_put(cldev); return ret; + }
__module_get(THIS_MODULE); return 0; @@ -755,6 +753,7 @@ static int mei_cl_device_remove(struct d
mei_cldev_unregister_callbacks(cldev);
+ mei_cl_bus_module_put(cldev); module_put(THIS_MODULE); dev->driver = NULL; return ret;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Shishkin alexander.shishkin@linux.intel.com
commit bf7cbaae0831252b416f375ca9b1027ecd4642dd upstream.
Using STP_POLICY_ID_SET ioctl command with dummy_stm device, or any STM device that supplies zero mmio channel size, will trigger a division by zero bug in the kernel.
Prevent this by disallowing channel widths other than 1 for such devices.
Signed-off-by: Alexander Shishkin alexander.shishkin@linux.intel.com Fixes: 7bd1d4093c2f ("stm class: Introduce an abstraction for System Trace Module devices") CC: stable@vger.kernel.org # v4.4+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hwtracing/stm/core.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
--- a/drivers/hwtracing/stm/core.c +++ b/drivers/hwtracing/stm/core.c @@ -732,7 +732,7 @@ static int stm_char_policy_set_ioctl(str struct stm_device *stm = stmf->stm; struct stp_policy_id *id; char *ids[] = { NULL, NULL }; - int ret = -EINVAL; + int ret = -EINVAL, wlimit = 1; u32 size;
if (stmf->output.nr_chans) @@ -760,8 +760,10 @@ static int stm_char_policy_set_ioctl(str if (id->__reserved_0 || id->__reserved_1) goto err_free;
- if (id->width < 1 || - id->width > PAGE_SIZE / stm->data->sw_mmiosz) + if (stm->data->sw_mmiosz) + wlimit = PAGE_SIZE / stm->data->sw_mmiosz; + + if (id->width < 1 || id->width > wlimit) goto err_free;
ids[0] = id->id;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhi Jin zhi.jin@intel.com
commit a1d75dad3a2c689e70a1c4e0214cca9de741d0aa upstream.
There is a bug in the channel allocation logic that leads to an endless loop when looking for a contiguous range of channels in a range with a mixture of free and occupied channels. For example, opening three consequtive channels, closing the first two and requesting 4 channels in a row will trigger this soft lockup. The bug is that the search loop forgets to skip over the range once it detects that one channel in that range is occupied.
Restore the original intent to the logic by fixing the omission.
Signed-off-by: Zhi Jin zhi.jin@intel.com Signed-off-by: Alexander Shishkin alexander.shishkin@linux.intel.com Fixes: 7bd1d4093c2f ("stm class: Introduce an abstraction for System Trace Module devices") CC: stable@vger.kernel.org # v4.4+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hwtracing/stm/core.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/hwtracing/stm/core.c +++ b/drivers/hwtracing/stm/core.c @@ -244,6 +244,9 @@ static int find_free_channels(unsigned l ; if (i == width) return pos; + + /* step over [pos..pos+i) to continue search */ + pos += i; }
return -1;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Franck LENORMAND franck.lenormand@nxp.com
commit 65055e2108847af5e577cc7ce6bde45ea136d29a upstream.
When driver started using state->caam_ctxt for storing both running hash and final hash, it was not updated to handle different DMA unmap lengths.
Cc: stable@vger.kernel.org # v4.19+ Fixes: c19650d6ea99 ("crypto: caam - fix DMA mapping of stack memory") Signed-off-by: Franck LENORMAND franck.lenormand@nxp.com Signed-off-by: Horia Geantă horia.geanta@nxp.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/caam/caamhash.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
--- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -113,6 +113,7 @@ struct caam_hash_ctx { struct caam_hash_state { dma_addr_t buf_dma; dma_addr_t ctx_dma; + int ctx_dma_len; u8 buf_0[CAAM_MAX_HASH_BLOCK_SIZE] ____cacheline_aligned; int buflen_0; u8 buf_1[CAAM_MAX_HASH_BLOCK_SIZE] ____cacheline_aligned; @@ -165,6 +166,7 @@ static inline int map_seq_out_ptr_ctx(u3 struct caam_hash_state *state, int ctx_len) { + state->ctx_dma_len = ctx_len; state->ctx_dma = dma_map_single(jrdev, state->caam_ctx, ctx_len, DMA_FROM_DEVICE); if (dma_mapping_error(jrdev, state->ctx_dma)) { @@ -218,6 +220,7 @@ static inline int ctx_map_to_sec4_sg(str struct caam_hash_state *state, int ctx_len, struct sec4_sg_entry *sec4_sg, u32 flag) { + state->ctx_dma_len = ctx_len; state->ctx_dma = dma_map_single(jrdev, state->caam_ctx, ctx_len, flag); if (dma_mapping_error(jrdev, state->ctx_dma)) { dev_err(jrdev, "unable to map ctx\n"); @@ -468,12 +471,10 @@ static inline void ahash_unmap_ctx(struc struct ahash_edesc *edesc, struct ahash_request *req, int dst_len, u32 flag) { - struct crypto_ahash *ahash = crypto_ahash_reqtfm(req); - struct caam_hash_ctx *ctx = crypto_ahash_ctx(ahash); struct caam_hash_state *state = ahash_request_ctx(req);
if (state->ctx_dma) { - dma_unmap_single(dev, state->ctx_dma, ctx->ctx_len, flag); + dma_unmap_single(dev, state->ctx_dma, state->ctx_dma_len, flag); state->ctx_dma = 0; } ahash_unmap(dev, edesc, req, dst_len); @@ -1446,6 +1447,7 @@ static int ahash_init(struct ahash_reque state->final = ahash_final_no_ctx;
state->ctx_dma = 0; + state->ctx_dma_len = 0; state->current_buf = 0; state->buf_dma = 0; state->buflen_0 = 0;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gustavo A. R. Silva gustavo@embeddedor.com
commit b5be853181a8d4a6e20f2073ccd273d6280cad88 upstream.
Add missing break statement in order to prevent the code from falling through to case S_DIN_to_DES.
This bug was found thanks to the ongoing efforts to enable -Wimplicit-fallthrough.
Fixes: 63ee04c8b491 ("crypto: ccree - add skcipher support") Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva gustavo@embeddedor.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/ccree/cc_cipher.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/crypto/ccree/cc_cipher.c +++ b/drivers/crypto/ccree/cc_cipher.c @@ -80,6 +80,7 @@ static int validate_keys_sizes(struct cc default: break; } + break; case S_DIN_to_DES: if (size == DES3_EDE_KEY_SIZE || size == DES_KEY_SIZE) return 0;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pankaj Gupta pankaj.gupta@nxp.com
commit 42e95d1f10dcf8b18b1d7f52f7068985b3dc5b79 upstream.
when the source sg contains more than 1 fragment and destination sg contains 1 fragment, the caam driver mishandle the buffers to be sent to caam.
Fixes: f2147b88b2b1 ("crypto: caam - Convert GCM to new AEAD interface") Cc: stable@vger.kernel.org # 4.2+ Signed-off-by: Pankaj Gupta pankaj.gupta@nxp.com Signed-off-by: Arun Pathak arun.pathak@nxp.com Reviewed-by: Horia Geanta horia.geanta@nxp.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/caam/caamalg.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/crypto/caam/caamalg.c +++ b/drivers/crypto/caam/caamalg.c @@ -1040,6 +1040,7 @@ static void init_aead_job(struct aead_re if (unlikely(req->src != req->dst)) { if (edesc->dst_nents == 1) { dst_dma = sg_dma_address(req->dst); + out_options = 0; } else { dst_dma = edesc->sec4_sg_dma + sec4_sg_index *
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Horia Geantă horia.geanta@nxp.com
commit c19650d6ea99bcd903d3e55dd61860026c701339 upstream.
Roland reports the following issue and provides a root cause analysis:
"On a v4.19 i.MX6 system with IMA and CONFIG_DMA_API_DEBUG enabled, a warning is generated when accessing files on a filesystem for which IMA measurement is enabled:
------------[ cut here ]------------ WARNING: CPU: 0 PID: 1 at kernel/dma/debug.c:1181 check_for_stack.part.9+0xd0/0x120 caam_jr 2101000.jr0: DMA-API: device driver maps memory from stack [addr=b668049e] Modules linked in: CPU: 0 PID: 1 Comm: switch_root Not tainted 4.19.0-20181214-1 #2 Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) Backtrace: [<c010efb8>] (dump_backtrace) from [<c010f2d0>] (show_stack+0x20/0x24) [<c010f2b0>] (show_stack) from [<c08b04f4>] (dump_stack+0xa0/0xcc) [<c08b0454>] (dump_stack) from [<c012b610>] (__warn+0xf0/0x108) [<c012b520>] (__warn) from [<c012b680>] (warn_slowpath_fmt+0x58/0x74) [<c012b62c>] (warn_slowpath_fmt) from [<c0199acc>] (check_for_stack.part.9+0xd0/0x120) [<c01999fc>] (check_for_stack.part.9) from [<c019a040>] (debug_dma_map_page+0x144/0x174) [<c0199efc>] (debug_dma_map_page) from [<c065f7f4>] (ahash_final_ctx+0x5b4/0xcf0) [<c065f240>] (ahash_final_ctx) from [<c065b3c4>] (ahash_final+0x1c/0x20) [<c065b3a8>] (ahash_final) from [<c03fe278>] (crypto_ahash_op+0x38/0x80) [<c03fe240>] (crypto_ahash_op) from [<c03fe2e0>] (crypto_ahash_final+0x20/0x24) [<c03fe2c0>] (crypto_ahash_final) from [<c03f19a8>] (ima_calc_file_hash+0x29c/0xa40) [<c03f170c>] (ima_calc_file_hash) from [<c03f2b24>] (ima_collect_measurement+0x1dc/0x240) [<c03f2948>] (ima_collect_measurement) from [<c03f0a60>] (process_measurement+0x4c4/0x6b8) [<c03f059c>] (process_measurement) from [<c03f0cdc>] (ima_file_check+0x88/0xa4) [<c03f0c54>] (ima_file_check) from [<c02d8adc>] (path_openat+0x5d8/0x1364) [<c02d8504>] (path_openat) from [<c02dad24>] (do_filp_open+0x84/0xf0) [<c02daca0>] (do_filp_open) from [<c02cf50c>] (do_open_execat+0x84/0x1b0) [<c02cf488>] (do_open_execat) from [<c02d1058>] (__do_execve_file+0x43c/0x890) [<c02d0c1c>] (__do_execve_file) from [<c02d1770>] (sys_execve+0x44/0x4c) [<c02d172c>] (sys_execve) from [<c0101000>] (ret_fast_syscall+0x0/0x28) ---[ end trace 3455789a10e3aefd ]---
The cause is that the struct ahash_request *req is created as a stack-local variable up in the stack (presumably somewhere in the IMA implementation), then passed down into the CAAM driver, which tries to dma_single_map the req->result (indirectly via map_seq_out_ptr_result) in order to make that buffer available for the CAAM to store the result of the following hash operation.
The calling code doesn't know how req will be used by the CAAM driver, and there could be other such occurrences where stack memory is passed down to the CAAM driver. Therefore we should rather fix this issue in the CAAM driver where the requirements are known."
Fix this problem by: -instructing the crypto engine to write the final hash in state->caam_ctx -subsequently memcpy-ing the final hash into req->result
Cc: stable@vger.kernel.org # v4.19+ Reported-by: Roland Hieber rhi@pengutronix.de Signed-off-by: Horia Geantă horia.geanta@nxp.com Tested-by: Roland Hieber rhi@pengutronix.de Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/caam/caamhash.c | 85 ++++++++++------------------------------- 1 file changed, 21 insertions(+), 64 deletions(-)
--- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -180,18 +180,6 @@ static inline int map_seq_out_ptr_ctx(u3 return 0; }
-/* Map req->result, and append seq_out_ptr command that points to it */ -static inline dma_addr_t map_seq_out_ptr_result(u32 *desc, struct device *jrdev, - u8 *result, int digestsize) -{ - dma_addr_t dst_dma; - - dst_dma = dma_map_single(jrdev, result, digestsize, DMA_FROM_DEVICE); - append_seq_out_ptr(desc, dst_dma, digestsize, 0); - - return dst_dma; -} - /* Map current buffer in state (if length > 0) and put it in link table */ static inline int buf_map_to_sec4_sg(struct device *jrdev, struct sec4_sg_entry *sec4_sg, @@ -429,7 +417,6 @@ static int ahash_setkey(struct crypto_ah
/* * ahash_edesc - s/w-extended ahash descriptor - * @dst_dma: physical mapped address of req->result * @sec4_sg_dma: physical mapped address of h/w link table * @src_nents: number of segments in input scatterlist * @sec4_sg_bytes: length of dma mapped sec4_sg space @@ -437,7 +424,6 @@ static int ahash_setkey(struct crypto_ah * @sec4_sg: h/w link table */ struct ahash_edesc { - dma_addr_t dst_dma; dma_addr_t sec4_sg_dma; int src_nents; int sec4_sg_bytes; @@ -453,8 +439,6 @@ static inline void ahash_unmap(struct de
if (edesc->src_nents) dma_unmap_sg(dev, req->src, edesc->src_nents, DMA_TO_DEVICE); - if (edesc->dst_dma) - dma_unmap_single(dev, edesc->dst_dma, dst_len, DMA_FROM_DEVICE);
if (edesc->sec4_sg_bytes) dma_unmap_single(dev, edesc->sec4_sg_dma, @@ -487,9 +471,9 @@ static void ahash_done(struct device *jr struct ahash_edesc *edesc; struct crypto_ahash *ahash = crypto_ahash_reqtfm(req); int digestsize = crypto_ahash_digestsize(ahash); + struct caam_hash_state *state = ahash_request_ctx(req); #ifdef DEBUG struct caam_hash_ctx *ctx = crypto_ahash_ctx(ahash); - struct caam_hash_state *state = ahash_request_ctx(req);
dev_err(jrdev, "%s %d: err 0x%x\n", __func__, __LINE__, err); #endif @@ -498,17 +482,14 @@ static void ahash_done(struct device *jr if (err) caam_jr_strstatus(jrdev, err);
- ahash_unmap(jrdev, edesc, req, digestsize); + ahash_unmap_ctx(jrdev, edesc, req, digestsize, DMA_FROM_DEVICE); + memcpy(req->result, state->caam_ctx, digestsize); kfree(edesc);
#ifdef DEBUG print_hex_dump(KERN_ERR, "ctx@"__stringify(__LINE__)": ", DUMP_PREFIX_ADDRESS, 16, 4, state->caam_ctx, ctx->ctx_len, 1); - if (req->result) - print_hex_dump(KERN_ERR, "result@"__stringify(__LINE__)": ", - DUMP_PREFIX_ADDRESS, 16, 4, req->result, - digestsize, 1); #endif
req->base.complete(&req->base, err); @@ -556,9 +537,9 @@ static void ahash_done_ctx_src(struct de struct ahash_edesc *edesc; struct crypto_ahash *ahash = crypto_ahash_reqtfm(req); int digestsize = crypto_ahash_digestsize(ahash); + struct caam_hash_state *state = ahash_request_ctx(req); #ifdef DEBUG struct caam_hash_ctx *ctx = crypto_ahash_ctx(ahash); - struct caam_hash_state *state = ahash_request_ctx(req);
dev_err(jrdev, "%s %d: err 0x%x\n", __func__, __LINE__, err); #endif @@ -567,17 +548,14 @@ static void ahash_done_ctx_src(struct de if (err) caam_jr_strstatus(jrdev, err);
- ahash_unmap_ctx(jrdev, edesc, req, digestsize, DMA_TO_DEVICE); + ahash_unmap_ctx(jrdev, edesc, req, digestsize, DMA_BIDIRECTIONAL); + memcpy(req->result, state->caam_ctx, digestsize); kfree(edesc);
#ifdef DEBUG print_hex_dump(KERN_ERR, "ctx@"__stringify(__LINE__)": ", DUMP_PREFIX_ADDRESS, 16, 4, state->caam_ctx, ctx->ctx_len, 1); - if (req->result) - print_hex_dump(KERN_ERR, "result@"__stringify(__LINE__)": ", - DUMP_PREFIX_ADDRESS, 16, 4, req->result, - digestsize, 1); #endif
req->base.complete(&req->base, err); @@ -838,7 +816,7 @@ static int ahash_final_ctx(struct ahash_ edesc->sec4_sg_bytes = sec4_sg_bytes;
ret = ctx_map_to_sec4_sg(jrdev, state, ctx->ctx_len, - edesc->sec4_sg, DMA_TO_DEVICE); + edesc->sec4_sg, DMA_BIDIRECTIONAL); if (ret) goto unmap_ctx;
@@ -858,14 +836,7 @@ static int ahash_final_ctx(struct ahash_
append_seq_in_ptr(desc, edesc->sec4_sg_dma, ctx->ctx_len + buflen, LDST_SGF); - - edesc->dst_dma = map_seq_out_ptr_result(desc, jrdev, req->result, - digestsize); - if (dma_mapping_error(jrdev, edesc->dst_dma)) { - dev_err(jrdev, "unable to map dst\n"); - ret = -ENOMEM; - goto unmap_ctx; - } + append_seq_out_ptr(desc, state->ctx_dma, digestsize, 0);
#ifdef DEBUG print_hex_dump(KERN_ERR, "jobdesc@"__stringify(__LINE__)": ", @@ -878,7 +849,7 @@ static int ahash_final_ctx(struct ahash_
return -EINPROGRESS; unmap_ctx: - ahash_unmap_ctx(jrdev, edesc, req, digestsize, DMA_FROM_DEVICE); + ahash_unmap_ctx(jrdev, edesc, req, digestsize, DMA_BIDIRECTIONAL); kfree(edesc); return ret; } @@ -932,7 +903,7 @@ static int ahash_finup_ctx(struct ahash_ edesc->src_nents = src_nents;
ret = ctx_map_to_sec4_sg(jrdev, state, ctx->ctx_len, - edesc->sec4_sg, DMA_TO_DEVICE); + edesc->sec4_sg, DMA_BIDIRECTIONAL); if (ret) goto unmap_ctx;
@@ -946,13 +917,7 @@ static int ahash_finup_ctx(struct ahash_ if (ret) goto unmap_ctx;
- edesc->dst_dma = map_seq_out_ptr_result(desc, jrdev, req->result, - digestsize); - if (dma_mapping_error(jrdev, edesc->dst_dma)) { - dev_err(jrdev, "unable to map dst\n"); - ret = -ENOMEM; - goto unmap_ctx; - } + append_seq_out_ptr(desc, state->ctx_dma, digestsize, 0);
#ifdef DEBUG print_hex_dump(KERN_ERR, "jobdesc@"__stringify(__LINE__)": ", @@ -965,7 +930,7 @@ static int ahash_finup_ctx(struct ahash_
return -EINPROGRESS; unmap_ctx: - ahash_unmap_ctx(jrdev, edesc, req, digestsize, DMA_FROM_DEVICE); + ahash_unmap_ctx(jrdev, edesc, req, digestsize, DMA_BIDIRECTIONAL); kfree(edesc); return ret; } @@ -1024,10 +989,8 @@ static int ahash_digest(struct ahash_req
desc = edesc->hw_desc;
- edesc->dst_dma = map_seq_out_ptr_result(desc, jrdev, req->result, - digestsize); - if (dma_mapping_error(jrdev, edesc->dst_dma)) { - dev_err(jrdev, "unable to map dst\n"); + ret = map_seq_out_ptr_ctx(desc, jrdev, state, digestsize); + if (ret) { ahash_unmap(jrdev, edesc, req, digestsize); kfree(edesc); return -ENOMEM; @@ -1042,7 +1005,7 @@ static int ahash_digest(struct ahash_req if (!ret) { ret = -EINPROGRESS; } else { - ahash_unmap(jrdev, edesc, req, digestsize); + ahash_unmap_ctx(jrdev, edesc, req, digestsize, DMA_FROM_DEVICE); kfree(edesc); }
@@ -1084,12 +1047,9 @@ static int ahash_final_no_ctx(struct aha append_seq_in_ptr(desc, state->buf_dma, buflen, 0); }
- edesc->dst_dma = map_seq_out_ptr_result(desc, jrdev, req->result, - digestsize); - if (dma_mapping_error(jrdev, edesc->dst_dma)) { - dev_err(jrdev, "unable to map dst\n"); + ret = map_seq_out_ptr_ctx(desc, jrdev, state, digestsize); + if (ret) goto unmap; - }
#ifdef DEBUG print_hex_dump(KERN_ERR, "jobdesc@"__stringify(__LINE__)": ", @@ -1100,7 +1060,7 @@ static int ahash_final_no_ctx(struct aha if (!ret) { ret = -EINPROGRESS; } else { - ahash_unmap(jrdev, edesc, req, digestsize); + ahash_unmap_ctx(jrdev, edesc, req, digestsize, DMA_FROM_DEVICE); kfree(edesc); }
@@ -1299,12 +1259,9 @@ static int ahash_finup_no_ctx(struct aha goto unmap; }
- edesc->dst_dma = map_seq_out_ptr_result(desc, jrdev, req->result, - digestsize); - if (dma_mapping_error(jrdev, edesc->dst_dma)) { - dev_err(jrdev, "unable to map dst\n"); + ret = map_seq_out_ptr_ctx(desc, jrdev, state, digestsize); + if (ret) goto unmap; - }
#ifdef DEBUG print_hex_dump(KERN_ERR, "jobdesc@"__stringify(__LINE__)": ", @@ -1315,7 +1272,7 @@ static int ahash_finup_no_ctx(struct aha if (!ret) { ret = -EINPROGRESS; } else { - ahash_unmap(jrdev, edesc, req, digestsize); + ahash_unmap_ctx(jrdev, edesc, req, digestsize, DMA_FROM_DEVICE); kfree(edesc); }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hadar Gat hadar.gat@arm.com
commit a49411959ea6d4915a9fd2a7eb5ba220e6284e9a upstream.
In cc_unmap_aead_request(), call dma_pool_free() for mlli buffer only if an item is allocated from the pool and not always if there is a pool allocated. This fixes a kernel panic when trying to free a non-allocated item.
Cc: stable@vger.kernel.org Signed-off-by: Hadar Gat hadar.gat@arm.com Signed-off-by: Gilad Ben-Yossef gilad@benyossef.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/ccree/cc_buffer_mgr.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/crypto/ccree/cc_buffer_mgr.c +++ b/drivers/crypto/ccree/cc_buffer_mgr.c @@ -614,10 +614,10 @@ void cc_unmap_aead_request(struct device hw_iv_size, DMA_BIDIRECTIONAL); }
- /*In case a pool was set, a table was - *allocated and should be released - */ - if (areq_ctx->mlli_params.curr_pool) { + /* Release pool */ + if ((areq_ctx->assoc_buff_type == CC_DMA_BUF_MLLI || + areq_ctx->data_buff_type == CC_DMA_BUF_MLLI) && + (areq_ctx->mlli_params.mlli_virt_addr)) { dev_dbg(dev, "free MLLI buffer: dma=%pad virt=%pK\n", &areq_ctx->mlli_params.mlli_dma_addr, areq_ctx->mlli_params.mlli_virt_addr);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gilad Ben-Yossef gilad@benyossef.com
commit c139c72e2beb3e3db5148910b3962b7322e24374 upstream.
We were copying the last ciphertext block into the IV field for CBC before removing the DMA mapping of the output buffer with the result of the buffer sometime being out-of-sync cache wise and were getting intermittent cases of bad output IV.
Fix it by moving the DMA buffer unmapping before the copy.
Signed-off-by: Gilad Ben-Yossef gilad@benyossef.com Fixes: 00904aa0cd59 ("crypto: ccree - fix iv handling") Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/ccree/cc_cipher.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/crypto/ccree/cc_cipher.c +++ b/drivers/crypto/ccree/cc_cipher.c @@ -653,6 +653,8 @@ static void cc_cipher_complete(struct de unsigned int ivsize = crypto_skcipher_ivsize(sk_tfm); unsigned int len;
+ cc_unmap_cipher_request(dev, req_ctx, ivsize, src, dst); + switch (ctx_p->cipher_mode) { case DRV_CIPHER_CBC: /* @@ -682,7 +684,6 @@ static void cc_cipher_complete(struct de break; }
- cc_unmap_cipher_request(dev, req_ctx, ivsize, src, dst); kzfree(req_ctx->iv);
skcipher_request_complete(req, err);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gilad Ben-Yossef gilad@benyossef.com
commit 2b5ac17463dcb2411fed506edcf259a89bb538ba upstream.
For decryption in CBC mode we need to save the last ciphertext block for use as the next IV. However, we were trying to do this also with zero sized ciphertext resulting in a panic.
Fix this by only doing the copy if the ciphertext length is at least of IV size.
Signed-off-by: Gilad Ben-Yossef gilad@benyossef.com Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/ccree/cc_cipher.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/crypto/ccree/cc_cipher.c +++ b/drivers/crypto/ccree/cc_cipher.c @@ -801,7 +801,8 @@ static int cc_cipher_decrypt(struct skci
memset(req_ctx, 0, sizeof(*req_ctx));
- if (ctx_p->cipher_mode == DRV_CIPHER_CBC) { + if ((ctx_p->cipher_mode == DRV_CIPHER_CBC) && + (req->cryptlen >= ivsize)) {
/* Allocate and save the last IV sized bytes of the source, * which will be lost in case of in-place decryption.
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit 394a9e044702e6a8958a5e89d2a291605a587a2a upstream.
Like some other block cipher mode implementations, the CFB implementation assumes that while walking through the scatterlist, a partial block does not occur until the end. But the walk is incorrectly being done with a blocksize of 1, as 'cra_blocksize' is set to 1 (since CFB is a stream cipher) but no 'chunksize' is set. This bug causes incorrect encryption/decryption for some scatterlist layouts.
Fix it by setting the 'chunksize'. Also extend the CFB test vectors to cover this bug as well as cases where the message length is not a multiple of the block size.
Fixes: a7d85e06ed80 ("crypto: cfb - add support for Cipher FeedBack mode") Cc: stable@vger.kernel.org # v4.17+ Cc: James Bottomley James.Bottomley@HansenPartnership.com Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- crypto/cfb.c | 6 ++++++ crypto/testmgr.h | 25 +++++++++++++++++++++++++ 2 files changed, 31 insertions(+)
--- a/crypto/cfb.c +++ b/crypto/cfb.c @@ -298,6 +298,12 @@ static int crypto_cfb_create(struct cryp inst->alg.base.cra_blocksize = 1; inst->alg.base.cra_alignmask = alg->cra_alignmask;
+ /* + * To simplify the implementation, configure the skcipher walk to only + * give a partial block at the very end, never earlier. + */ + inst->alg.chunksize = alg->cra_blocksize; + inst->alg.ivsize = alg->cra_blocksize; inst->alg.min_keysize = alg->cra_cipher.cia_min_keysize; inst->alg.max_keysize = alg->cra_cipher.cia_max_keysize; --- a/crypto/testmgr.h +++ b/crypto/testmgr.h @@ -12870,6 +12870,31 @@ static const struct cipher_testvec aes_c "\x75\xa3\x85\x74\x1a\xb9\xce\xf8" "\x20\x31\x62\x3d\x55\xb1\xe4\x71", .len = 64, + .also_non_np = 1, + .np = 2, + .tap = { 31, 33 }, + }, { /* > 16 bytes, not a multiple of 16 bytes */ + .key = "\x2b\x7e\x15\x16\x28\xae\xd2\xa6" + "\xab\xf7\x15\x88\x09\xcf\x4f\x3c", + .klen = 16, + .iv = "\x00\x01\x02\x03\x04\x05\x06\x07" + "\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f", + .ptext = "\x6b\xc1\xbe\xe2\x2e\x40\x9f\x96" + "\xe9\x3d\x7e\x11\x73\x93\x17\x2a" + "\xae", + .ctext = "\x3b\x3f\xd9\x2e\xb7\x2d\xad\x20" + "\x33\x34\x49\xf8\xe8\x3c\xfb\x4a" + "\xc8", + .len = 17, + }, { /* < 16 bytes */ + .key = "\x2b\x7e\x15\x16\x28\xae\xd2\xa6" + "\xab\xf7\x15\x88\x09\xcf\x4f\x3c", + .klen = 16, + .iv = "\x00\x01\x02\x03\x04\x05\x06\x07" + "\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f", + .ptext = "\x6b\xc1\xbe\xe2\x2e\x40\x9f", + .ctext = "\x3b\x3f\xd9\x2e\xb7\x2d\xad", + .len = 7, }, };
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit 6c2e322b3621dc8be72e5c86d4fdb587434ba625 upstream.
The memcpy() in crypto_cfb_decrypt_inplace() uses walk->iv as both the source and destination, which has undefined behavior. It is unneeded because walk->iv is already used to hold the previous ciphertext block; thus, walk->iv is already updated to its final value. So, remove it.
Also, note that in-place decryption is the only case where the previous ciphertext block is not directly available. Therefore, as a related cleanup I also updated crypto_cfb_encrypt_segment() to directly use the previous ciphertext block rather than save it into walk->iv. This makes it consistent with in-place encryption and out-of-place decryption; now only in-place decryption is different, because it has to be.
Fixes: a7d85e06ed80 ("crypto: cfb - add support for Cipher FeedBack mode") Cc: stable@vger.kernel.org # v4.17+ Cc: James Bottomley James.Bottomley@HansenPartnership.com Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- crypto/cfb.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/crypto/cfb.c +++ b/crypto/cfb.c @@ -77,12 +77,14 @@ static int crypto_cfb_encrypt_segment(st do { crypto_cfb_encrypt_one(tfm, iv, dst); crypto_xor(dst, src, bsize); - memcpy(iv, dst, bsize); + iv = dst;
src += bsize; dst += bsize; } while ((nbytes -= bsize) >= bsize);
+ memcpy(walk->iv, iv, bsize); + return nbytes; }
@@ -162,7 +164,7 @@ static int crypto_cfb_decrypt_inplace(st const unsigned int bsize = crypto_cfb_bsize(tfm); unsigned int nbytes = walk->nbytes; u8 *src = walk->src.virt.addr; - u8 *iv = walk->iv; + u8 * const iv = walk->iv; u8 tmp[MAX_CIPHER_BLOCKSIZE];
do { @@ -172,8 +174,6 @@ static int crypto_cfb_decrypt_inplace(st src += bsize; } while ((nbytes -= bsize) >= bsize);
- memcpy(walk->iv, iv, bsize); - return nbytes; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit b3e3e2db7de4a1ffe8845876c3520b866cd48de1 upstream.
Fix multiple bugs in the OFB implementation:
1. It stored the per-request state 'cnt' in the tfm context, which can be used by multiple threads concurrently (e.g. via AF_ALG). 2. It didn't support messages not a multiple of the block cipher size, despite being a stream cipher. 3. It didn't set cra_blocksize to 1 to indicate it is a stream cipher.
To fix these, set the 'chunksize' property to the cipher block size to guarantee that when walking through the scatterlist, a partial block can only occur at the end. Then change the implementation to XOR a block at a time at first, then XOR the partial block at the end if needed. This is the same way CTR and CFB are implemented. As a bonus, this also improves performance in most cases over the current approach.
Fixes: e497c51896b3 ("crypto: ofb - add output feedback mode") Cc: stable@vger.kernel.org # v4.20+ Cc: Gilad Ben-Yossef gilad@benyossef.com Signed-off-by: Eric Biggers ebiggers@google.com Reviewed-by: Gilad Ben-Yossef gilad@benyossef.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- crypto/ofb.c | 91 ++++++++++++++++++++++--------------------------------- crypto/testmgr.h | 28 +++++++++++++++- 2 files changed, 63 insertions(+), 56 deletions(-)
--- a/crypto/ofb.c +++ b/crypto/ofb.c @@ -5,9 +5,6 @@ * * Copyright (C) 2018 ARM Limited or its affiliates. * All rights reserved. - * - * Based loosely on public domain code gleaned from libtomcrypt - * (https://github.com/libtom/libtomcrypt). */
#include <crypto/algapi.h> @@ -21,7 +18,6 @@
struct crypto_ofb_ctx { struct crypto_cipher *child; - int cnt; };
@@ -41,58 +37,40 @@ static int crypto_ofb_setkey(struct cryp return err; }
-static int crypto_ofb_encrypt_segment(struct crypto_ofb_ctx *ctx, - struct skcipher_walk *walk, - struct crypto_cipher *tfm) -{ - int bsize = crypto_cipher_blocksize(tfm); - int nbytes = walk->nbytes; - - u8 *src = walk->src.virt.addr; - u8 *dst = walk->dst.virt.addr; - u8 *iv = walk->iv; - - do { - if (ctx->cnt == bsize) { - if (nbytes < bsize) - break; - crypto_cipher_encrypt_one(tfm, iv, iv); - ctx->cnt = 0; - } - *dst = *src ^ iv[ctx->cnt]; - src++; - dst++; - ctx->cnt++; - } while (--nbytes); - return nbytes; -} - -static int crypto_ofb_encrypt(struct skcipher_request *req) +static int crypto_ofb_crypt(struct skcipher_request *req) { - struct skcipher_walk walk; struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); - unsigned int bsize; struct crypto_ofb_ctx *ctx = crypto_skcipher_ctx(tfm); - struct crypto_cipher *child = ctx->child; - int ret = 0; + struct crypto_cipher *cipher = ctx->child; + const unsigned int bsize = crypto_cipher_blocksize(cipher); + struct skcipher_walk walk; + int err;
- bsize = crypto_cipher_blocksize(child); - ctx->cnt = bsize; + err = skcipher_walk_virt(&walk, req, false);
- ret = skcipher_walk_virt(&walk, req, false); + while (walk.nbytes >= bsize) { + const u8 *src = walk.src.virt.addr; + u8 *dst = walk.dst.virt.addr; + u8 * const iv = walk.iv; + unsigned int nbytes = walk.nbytes; + + do { + crypto_cipher_encrypt_one(cipher, iv, iv); + crypto_xor_cpy(dst, src, iv, bsize); + dst += bsize; + src += bsize; + } while ((nbytes -= bsize) >= bsize);
- while (walk.nbytes) { - ret = crypto_ofb_encrypt_segment(ctx, &walk, child); - ret = skcipher_walk_done(&walk, ret); + err = skcipher_walk_done(&walk, nbytes); }
- return ret; -} - -/* OFB encrypt and decrypt are identical */ -static int crypto_ofb_decrypt(struct skcipher_request *req) -{ - return crypto_ofb_encrypt(req); + if (walk.nbytes) { + crypto_cipher_encrypt_one(cipher, walk.iv, walk.iv); + crypto_xor_cpy(walk.dst.virt.addr, walk.src.virt.addr, walk.iv, + walk.nbytes); + err = skcipher_walk_done(&walk, 0); + } + return err; }
static int crypto_ofb_init_tfm(struct crypto_skcipher *tfm) @@ -165,13 +143,18 @@ static int crypto_ofb_create(struct cryp if (err) goto err_drop_spawn;
+ /* OFB mode is a stream cipher. */ + inst->alg.base.cra_blocksize = 1; + + /* + * To simplify the implementation, configure the skcipher walk to only + * give a partial block at the very end, never earlier. + */ + inst->alg.chunksize = alg->cra_blocksize; + inst->alg.base.cra_priority = alg->cra_priority; - inst->alg.base.cra_blocksize = alg->cra_blocksize; inst->alg.base.cra_alignmask = alg->cra_alignmask;
- /* We access the data as u32s when xoring. */ - inst->alg.base.cra_alignmask |= __alignof__(u32) - 1; - inst->alg.ivsize = alg->cra_blocksize; inst->alg.min_keysize = alg->cra_cipher.cia_min_keysize; inst->alg.max_keysize = alg->cra_cipher.cia_max_keysize; @@ -182,8 +165,8 @@ static int crypto_ofb_create(struct cryp inst->alg.exit = crypto_ofb_exit_tfm;
inst->alg.setkey = crypto_ofb_setkey; - inst->alg.encrypt = crypto_ofb_encrypt; - inst->alg.decrypt = crypto_ofb_decrypt; + inst->alg.encrypt = crypto_ofb_crypt; + inst->alg.decrypt = crypto_ofb_crypt;
inst->free = crypto_ofb_free;
--- a/crypto/testmgr.h +++ b/crypto/testmgr.h @@ -16681,8 +16681,7 @@ static const struct cipher_testvec aes_c };
static const struct cipher_testvec aes_ofb_tv_template[] = { - /* From NIST Special Publication 800-38A, Appendix F.5 */ - { + { /* From NIST Special Publication 800-38A, Appendix F.5 */ .key = "\x2b\x7e\x15\x16\x28\xae\xd2\xa6" "\xab\xf7\x15\x88\x09\xcf\x4f\x3c", .klen = 16, @@ -16705,6 +16704,31 @@ static const struct cipher_testvec aes_o "\x30\x4c\x65\x28\xf6\x59\xc7\x78" "\x66\xa5\x10\xd9\xc1\xd6\xae\x5e", .len = 64, + .also_non_np = 1, + .np = 2, + .tap = { 31, 33 }, + }, { /* > 16 bytes, not a multiple of 16 bytes */ + .key = "\x2b\x7e\x15\x16\x28\xae\xd2\xa6" + "\xab\xf7\x15\x88\x09\xcf\x4f\x3c", + .klen = 16, + .iv = "\x00\x01\x02\x03\x04\x05\x06\x07" + "\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f", + .ptext = "\x6b\xc1\xbe\xe2\x2e\x40\x9f\x96" + "\xe9\x3d\x7e\x11\x73\x93\x17\x2a" + "\xae", + .ctext = "\x3b\x3f\xd9\x2e\xb7\x2d\xad\x20" + "\x33\x34\x49\xf8\xe8\x3c\xfb\x4a" + "\x77", + .len = 17, + }, { /* < 16 bytes */ + .key = "\x2b\x7e\x15\x16\x28\xae\xd2\xa6" + "\xab\xf7\x15\x88\x09\xcf\x4f\x3c", + .klen = 16, + .iv = "\x00\x01\x02\x03\x04\x05\x06\x07" + "\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f", + .ptext = "\x6b\xc1\xbe\xe2\x2e\x40\x9f", + .ctext = "\x3b\x3f\xd9\x2e\xb7\x2d\xad", + .len = 7, } };
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit 77568e535af7c4f97eaef1e555bf0af83772456c upstream.
Hash algorithms with an alignmask set, e.g. "xcbc(aes-aesni)" and "michael_mic", fail the improved hash tests because they sometimes produce the wrong digest. The bug is that in the case where a scatterlist element crosses pages, not all the data is actually hashed because the scatterlist walk terminates too early. This happens because the 'nbytes' variable in crypto_hash_walk_done() is assigned the number of bytes remaining in the page, then later interpreted as the number of bytes remaining in the scatterlist element. Fix it.
Fixes: 900a081f6912 ("crypto: ahash - Fix early termination in hash walk") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- crypto/ahash.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
--- a/crypto/ahash.c +++ b/crypto/ahash.c @@ -86,17 +86,17 @@ static int hash_walk_new_entry(struct cr int crypto_hash_walk_done(struct crypto_hash_walk *walk, int err) { unsigned int alignmask = walk->alignmask; - unsigned int nbytes = walk->entrylen;
walk->data -= walk->offset;
- if (nbytes && walk->offset & alignmask && !err) { - walk->offset = ALIGN(walk->offset, alignmask + 1); - nbytes = min(nbytes, - ((unsigned int)(PAGE_SIZE)) - walk->offset); - walk->entrylen -= nbytes; + if (walk->entrylen && (walk->offset & alignmask) && !err) { + unsigned int nbytes;
+ walk->offset = ALIGN(walk->offset, alignmask + 1); + nbytes = min(walk->entrylen, + (unsigned int)(PAGE_SIZE - walk->offset)); if (nbytes) { + walk->entrylen -= nbytes; walk->data += walk->offset; return nbytes; } @@ -116,7 +116,7 @@ int crypto_hash_walk_done(struct crypto_ if (err) return err;
- if (nbytes) { + if (walk->entrylen) { walk->offset = 0; walk->pg++; return hash_walk_next(walk);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Zhijie zhangzj@rock-chips.com
commit 4359669a087633132203c52d67dd8c31e09e7b2e upstream.
In some cases, the nents of src scatterlist is different from dst scatterlist. So two variables are used to handle the nents of src&dst scatterlist.
Reported-by: Eric Biggers ebiggers@google.com Fixes: 433cd2c617bf ("crypto: rockchip - add crypto driver for rk3288") Cc: stable@vger.kernel.org # v4.5+ Signed-off-by: Zhang Zhijie zhangzj@rock-chips.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/rockchip/rk3288_crypto.c | 2 +- drivers/crypto/rockchip/rk3288_crypto.h | 3 ++- drivers/crypto/rockchip/rk3288_crypto_ablkcipher.c | 5 +++-- drivers/crypto/rockchip/rk3288_crypto_ahash.c | 2 +- 4 files changed, 7 insertions(+), 5 deletions(-)
--- a/drivers/crypto/rockchip/rk3288_crypto.c +++ b/drivers/crypto/rockchip/rk3288_crypto.c @@ -119,7 +119,7 @@ static int rk_load_data(struct rk_crypto count = (dev->left_bytes > PAGE_SIZE) ? PAGE_SIZE : dev->left_bytes;
- if (!sg_pcopy_to_buffer(dev->first, dev->nents, + if (!sg_pcopy_to_buffer(dev->first, dev->src_nents, dev->addr_vir, count, dev->total - dev->left_bytes)) { dev_err(dev->dev, "[%s:%d] pcopy err\n", --- a/drivers/crypto/rockchip/rk3288_crypto.h +++ b/drivers/crypto/rockchip/rk3288_crypto.h @@ -207,7 +207,8 @@ struct rk_crypto_info { void *addr_vir; int aligned; int align_size; - size_t nents; + size_t src_nents; + size_t dst_nents; unsigned int total; unsigned int count; dma_addr_t addr_in; --- a/drivers/crypto/rockchip/rk3288_crypto_ablkcipher.c +++ b/drivers/crypto/rockchip/rk3288_crypto_ablkcipher.c @@ -260,8 +260,9 @@ static int rk_ablk_start(struct rk_crypt dev->total = req->nbytes; dev->sg_src = req->src; dev->first = req->src; - dev->nents = sg_nents(req->src); + dev->src_nents = sg_nents(req->src); dev->sg_dst = req->dst; + dev->dst_nents = sg_nents(req->dst); dev->aligned = 1;
spin_lock_irqsave(&dev->lock, flags); @@ -297,7 +298,7 @@ static int rk_ablk_rx(struct rk_crypto_i
dev->unload_data(dev); if (!dev->aligned) { - if (!sg_pcopy_from_buffer(req->dst, dev->nents, + if (!sg_pcopy_from_buffer(req->dst, dev->dst_nents, dev->addr_vir, dev->count, dev->total - dev->left_bytes - dev->count)) { --- a/drivers/crypto/rockchip/rk3288_crypto_ahash.c +++ b/drivers/crypto/rockchip/rk3288_crypto_ahash.c @@ -206,7 +206,7 @@ static int rk_ahash_start(struct rk_cryp dev->sg_dst = NULL; dev->sg_src = req->src; dev->first = req->src; - dev->nents = sg_nents(req->src); + dev->src_nents = sg_nents(req->src); rctx = ahash_request_ctx(req); rctx->mode = 0;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Zhijie zhangzj@rock-chips.com
commit c1c214adcb56d36433480c8fedf772498e7e539c upstream.
For chain mode in cipher(eg. AES-CBC/DES-CBC), the iv is continuously updated in the operation. The new iv value should be written to device register by software.
Reported-by: Eric Biggers ebiggers@google.com Fixes: 433cd2c617bf ("crypto: rockchip - add crypto driver for rk3288") Cc: stable@vger.kernel.org # v4.5+ Signed-off-by: Zhang Zhijie zhangzj@rock-chips.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/rockchip/rk3288_crypto.h | 1 drivers/crypto/rockchip/rk3288_crypto_ablkcipher.c | 34 +++++++++++++++++++++ 2 files changed, 35 insertions(+)
--- a/drivers/crypto/rockchip/rk3288_crypto.h +++ b/drivers/crypto/rockchip/rk3288_crypto.h @@ -245,6 +245,7 @@ struct rk_cipher_ctx { struct rk_crypto_info *dev; unsigned int keylen; u32 mode; + u8 iv[AES_BLOCK_SIZE]; };
enum alg_type { --- a/drivers/crypto/rockchip/rk3288_crypto_ablkcipher.c +++ b/drivers/crypto/rockchip/rk3288_crypto_ablkcipher.c @@ -242,6 +242,17 @@ static void crypto_dma_start(struct rk_c static int rk_set_data_start(struct rk_crypto_info *dev) { int err; + struct ablkcipher_request *req = + ablkcipher_request_cast(dev->async_req); + struct crypto_ablkcipher *tfm = crypto_ablkcipher_reqtfm(req); + struct rk_cipher_ctx *ctx = crypto_ablkcipher_ctx(tfm); + u32 ivsize = crypto_ablkcipher_ivsize(tfm); + u8 *src_last_blk = page_address(sg_page(dev->sg_src)) + + dev->sg_src->offset + dev->sg_src->length - ivsize; + + /* store the iv that need to be updated in chain mode */ + if (ctx->mode & RK_CRYPTO_DEC) + memcpy(ctx->iv, src_last_blk, ivsize);
err = dev->load_data(dev, dev->sg_src, dev->sg_dst); if (!err) @@ -286,6 +297,28 @@ static void rk_iv_copyback(struct rk_cry memcpy_fromio(req->info, dev->reg + RK_CRYPTO_AES_IV_0, ivsize); }
+static void rk_update_iv(struct rk_crypto_info *dev) +{ + struct ablkcipher_request *req = + ablkcipher_request_cast(dev->async_req); + struct crypto_ablkcipher *tfm = crypto_ablkcipher_reqtfm(req); + struct rk_cipher_ctx *ctx = crypto_ablkcipher_ctx(tfm); + u32 ivsize = crypto_ablkcipher_ivsize(tfm); + u8 *new_iv = NULL; + + if (ctx->mode & RK_CRYPTO_DEC) { + new_iv = ctx->iv; + } else { + new_iv = page_address(sg_page(dev->sg_dst)) + + dev->sg_dst->offset + dev->sg_dst->length - ivsize; + } + + if (ivsize == DES_BLOCK_SIZE) + memcpy_toio(dev->reg + RK_CRYPTO_TDES_IV_0, new_iv, ivsize); + else if (ivsize == AES_BLOCK_SIZE) + memcpy_toio(dev->reg + RK_CRYPTO_AES_IV_0, new_iv, ivsize); +} + /* return: * true some err was occurred * fault no err, continue @@ -307,6 +340,7 @@ static int rk_ablk_rx(struct rk_crypto_i } } if (dev->left_bytes) { + rk_update_iv(dev); if (dev->aligned) { if (sg_is_last(dev->sg_src)) { dev_err(dev->dev, "[%s:%d] Lack of data\n",
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthew Wilcox willy@infradead.org
commit e4b3448bc346fedf36db64124a664a959995b085 upstream.
The radix tree would rewind the index in an iterator to the lowest index of a multi-slot entry. The XArray iterators instead leave the index unchanged, but I overlooked that when converting DAX from the radix tree to the XArray. Adjust the index that we use for flushing to the start of the PMD range.
Fixes: c1901cd33cf4 ("page cache: Convert find_get_entries_tag to XArray") Cc: stable@vger.kernel.org Reported-by: Piotr Balcer piotr.balcer@intel.com Tested-by: Dan Williams dan.j.williams@intel.com Reviewed-by: Jan Kara jack@suse.cz Signed-off-by: Matthew Wilcox willy@infradead.org Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/dax.c | 19 +++++++++---------- 1 file changed, 9 insertions(+), 10 deletions(-)
--- a/fs/dax.c +++ b/fs/dax.c @@ -843,9 +843,8 @@ unlock_pte: static int dax_writeback_one(struct xa_state *xas, struct dax_device *dax_dev, struct address_space *mapping, void *entry) { - unsigned long pfn; + unsigned long pfn, index, count; long ret = 0; - size_t size;
/* * A page got tagged dirty in DAX mapping? Something is seriously @@ -894,17 +893,18 @@ static int dax_writeback_one(struct xa_s xas_unlock_irq(xas);
/* - * Even if dax_writeback_mapping_range() was given a wbc->range_start - * in the middle of a PMD, the 'index' we are given will be aligned to - * the start index of the PMD, as will the pfn we pull from 'entry'. + * If dax_writeback_mapping_range() was given a wbc->range_start + * in the middle of a PMD, the 'index' we use needs to be + * aligned to the start of the PMD. * This allows us to flush for PMD_SIZE and not have to worry about * partial PMD writebacks. */ pfn = dax_to_pfn(entry); - size = PAGE_SIZE << dax_entry_order(entry); + count = 1UL << dax_entry_order(entry); + index = xas->xa_index & ~(count - 1);
- dax_entry_mkclean(mapping, xas->xa_index, pfn); - dax_flush(dax_dev, page_address(pfn_to_page(pfn)), size); + dax_entry_mkclean(mapping, index, pfn); + dax_flush(dax_dev, page_address(pfn_to_page(pfn)), count * PAGE_SIZE); /* * After we have flushed the cache, we can clear the dirty tag. There * cannot be new dirty data in the pfn after the flush has completed as @@ -917,8 +917,7 @@ static int dax_writeback_one(struct xa_s xas_clear_mark(xas, PAGECACHE_TAG_DIRTY); dax_wake_entry(xas, entry, false);
- trace_dax_writeback_one(mapping->host, xas->xa_index, - size >> PAGE_SHIFT); + trace_dax_writeback_one(mapping->host, index, count); return ret;
put_unlocked:
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Williams dan.j.williams@intel.com
commit f596c8844fe1d0022007ae6c7a377361fb653eff upstream.
The implementation is broken in all the ways the unit test did not touch:
1/ The local definition of in_buf and in_obj violated C99 initializer expectations for zeroing. By only initializing 2 out of the three struct members the compiler was free to zero-initialize the remaining entry even though the aliased location in the union was initialized.
2/ The implementation made assumptions about the state of the 'smart' payload after command execution that are satisfied by acpi_nfit_ctl(), but not acpi_evaluate_dsm().
3/ populate_shutdown_status() is skipped on Intel NVDIMMs due to the early return for skipping the common _LS{I,R,W} enabling.
4/ The input length should be zero.
This breakage was missed due to the unit test implementation only testing the case where nfit_intel_shutdown_status() returns a valid payload.
Much of this complexity would be saved if acpi_nfit_ctl() could be used, but that currently requires a 'struct nvdimm *' argument and one is not created until later in the init process. The health result is needed before the device is created because the payload gates whether the nmemX/nfit/dirty_shutdown property is visible in sysfs.
Cc: stable@vger.kernel.org Fixes: 0ead11181fe0 ("acpi, nfit: Collect shutdown status") Reported-by: Dexuan Cui decui@microsoft.com Reviewed-by: Dexuan Cui decui@microsoft.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/acpi/nfit/core.c | 43 +++++++++++++++++++++++++------------------ 1 file changed, 25 insertions(+), 18 deletions(-)
--- a/drivers/acpi/nfit/core.c +++ b/drivers/acpi/nfit/core.c @@ -1759,14 +1759,14 @@ static bool acpi_nvdimm_has_method(struc
__weak void nfit_intel_shutdown_status(struct nfit_mem *nfit_mem) { + struct device *dev = &nfit_mem->adev->dev; struct nd_intel_smart smart = { 0 }; union acpi_object in_buf = { - .type = ACPI_TYPE_BUFFER, - .buffer.pointer = (char *) &smart, - .buffer.length = sizeof(smart), + .buffer.type = ACPI_TYPE_BUFFER, + .buffer.length = 0, }; union acpi_object in_obj = { - .type = ACPI_TYPE_PACKAGE, + .package.type = ACPI_TYPE_PACKAGE, .package.count = 1, .package.elements = &in_buf, }; @@ -1781,8 +1781,15 @@ __weak void nfit_intel_shutdown_status(s return;
out_obj = acpi_evaluate_dsm(handle, guid, revid, func, &in_obj); - if (!out_obj) + if (!out_obj || out_obj->type != ACPI_TYPE_BUFFER + || out_obj->buffer.length < sizeof(smart)) { + dev_dbg(dev->parent, "%s: failed to retrieve initial health\n", + dev_name(dev)); + ACPI_FREE(out_obj); return; + } + memcpy(&smart, out_obj->buffer.pointer, sizeof(smart)); + ACPI_FREE(out_obj);
if (smart.flags & ND_INTEL_SMART_SHUTDOWN_VALID) { if (smart.shutdown_state) @@ -1793,7 +1800,6 @@ __weak void nfit_intel_shutdown_status(s set_bit(NFIT_MEM_DIRTY_COUNT, &nfit_mem->flags); nfit_mem->dirty_shutdown = smart.shutdown_count; } - ACPI_FREE(out_obj); }
static void populate_shutdown_status(struct nfit_mem *nfit_mem) @@ -1915,18 +1921,19 @@ static int acpi_nfit_add_dimm(struct acp | 1 << ND_CMD_SET_CONFIG_DATA; if (family == NVDIMM_FAMILY_INTEL && (dsm_mask & label_mask) == label_mask) - return 0; - - if (acpi_nvdimm_has_method(adev_dimm, "_LSI") - && acpi_nvdimm_has_method(adev_dimm, "_LSR")) { - dev_dbg(dev, "%s: has _LSR\n", dev_name(&adev_dimm->dev)); - set_bit(NFIT_MEM_LSR, &nfit_mem->flags); - } - - if (test_bit(NFIT_MEM_LSR, &nfit_mem->flags) - && acpi_nvdimm_has_method(adev_dimm, "_LSW")) { - dev_dbg(dev, "%s: has _LSW\n", dev_name(&adev_dimm->dev)); - set_bit(NFIT_MEM_LSW, &nfit_mem->flags); + /* skip _LS{I,R,W} enabling */; + else { + if (acpi_nvdimm_has_method(adev_dimm, "_LSI") + && acpi_nvdimm_has_method(adev_dimm, "_LSR")) { + dev_dbg(dev, "%s: has _LSR\n", dev_name(&adev_dimm->dev)); + set_bit(NFIT_MEM_LSR, &nfit_mem->flags); + } + + if (test_bit(NFIT_MEM_LSR, &nfit_mem->flags) + && acpi_nvdimm_has_method(adev_dimm, "_LSW")) { + dev_dbg(dev, "%s: has _LSW\n", dev_name(&adev_dimm->dev)); + set_bit(NFIT_MEM_LSW, &nfit_mem->flags); + } }
populate_shutdown_status(nfit_mem);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dexuan Cui decui@microsoft.com
commit 43f89877f26671c6309cd87d7364b1a3e66e71cf upstream.
In the case of ND_CMD_CALL, we should also check out_obj->type.
The patch uses out_obj->type, which is a short alias to out_obj->package.type.
Fixes: 31eca76ba2fc ("nfit, libnvdimm: limited/whitelisted dimm command marshaling mechanism") Cc: stable@vger.kernel.org Signed-off-by: Dexuan Cui decui@microsoft.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/acpi/nfit/core.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
--- a/drivers/acpi/nfit/core.c +++ b/drivers/acpi/nfit/core.c @@ -554,6 +554,13 @@ int acpi_nfit_ctl(struct nvdimm_bus_desc return -EINVAL; }
+ if (out_obj->type != ACPI_TYPE_BUFFER) { + dev_dbg(dev, "%s unexpected output object type cmd: %s type: %d\n", + dimm_name, cmd_name, out_obj->type); + rc = -EINVAL; + goto out; + } + if (call_pkg) { call_pkg->nd_fw_size = out_obj->buffer.length; memcpy(call_pkg->nd_payload + call_pkg->nd_size_in, @@ -572,13 +579,6 @@ int acpi_nfit_ctl(struct nvdimm_bus_desc return 0; }
- if (out_obj->package.type != ACPI_TYPE_BUFFER) { - dev_dbg(dev, "%s unexpected output object type cmd: %s type: %d\n", - dimm_name, cmd_name, out_obj->type); - rc = -EINVAL; - goto out; - } - dev_dbg(dev, "%s cmd: %s output length: %d\n", dimm_name, cmd_name, out_obj->buffer.length); print_hex_dump_debug(cmd_name, DUMP_PREFIX_OFFSET, 4, 4,
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Williams dan.j.williams@intel.com
commit ebe9f6f19d80d8978d16078dff3d5bd93ad8d102 upstream.
Commit 11189c1089da "acpi/nfit: Fix command-supported detection" broke ND_CMD_CALL for bus-level commands. The "func = cmd" assumption is only valid for:
ND_CMD_ARS_CAP ND_CMD_ARS_START ND_CMD_ARS_STATUS ND_CMD_CLEAR_ERROR
The function number otherwise needs to be pulled from the command payload for:
NFIT_CMD_TRANSLATE_SPA NFIT_CMD_ARS_INJECT_SET NFIT_CMD_ARS_INJECT_CLEAR NFIT_CMD_ARS_INJECT_GET
Update cmd_to_func() for the bus case and call it in the common path.
Fixes: 11189c1089da ("acpi/nfit: Fix command-supported detection") Cc: stable@vger.kernel.org Reviewed-by: Vishal Verma vishal.l.verma@intel.com Reported-by: Grzegorz Burzynski grzegorz.burzynski@intel.com Tested-by: Jeff Moyer jmoyer@redhat.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/acpi/nfit/core.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-)
--- a/drivers/acpi/nfit/core.c +++ b/drivers/acpi/nfit/core.c @@ -415,7 +415,7 @@ static int cmd_to_func(struct nfit_mem * if (call_pkg) { int i;
- if (nfit_mem->family != call_pkg->nd_family) + if (nfit_mem && nfit_mem->family != call_pkg->nd_family) return -ENOTTY;
for (i = 0; i < ARRAY_SIZE(call_pkg->nd_reserved2); i++) @@ -424,6 +424,10 @@ static int cmd_to_func(struct nfit_mem * return call_pkg->nd_command; }
+ /* In the !call_pkg case, bus commands == bus functions */ + if (!nfit_mem) + return cmd; + /* Linux ND commands == NVDIMM_FAMILY_INTEL function numbers */ if (nfit_mem->family == NVDIMM_FAMILY_INTEL) return cmd; @@ -454,17 +458,18 @@ int acpi_nfit_ctl(struct nvdimm_bus_desc if (cmd_rc) *cmd_rc = -EINVAL;
+ if (cmd == ND_CMD_CALL) + call_pkg = buf; + func = cmd_to_func(nfit_mem, cmd, call_pkg); + if (func < 0) + return func; + if (nvdimm) { struct acpi_device *adev = nfit_mem->adev;
if (!adev) return -ENOTTY;
- if (cmd == ND_CMD_CALL) - call_pkg = buf; - func = cmd_to_func(nfit_mem, cmd, call_pkg); - if (func < 0) - return func; dimm_name = nvdimm_name(nvdimm); cmd_name = nvdimm_cmd_name(cmd); cmd_mask = nvdimm_cmd_mask(nvdimm); @@ -475,12 +480,9 @@ int acpi_nfit_ctl(struct nvdimm_bus_desc } else { struct acpi_device *adev = to_acpi_dev(acpi_desc);
- func = cmd; cmd_name = nvdimm_bus_cmd_name(cmd); cmd_mask = nd_desc->cmd_mask; - dsm_mask = cmd_mask; - if (cmd == ND_CMD_CALL) - dsm_mask = nd_desc->bus_dsm_mask; + dsm_mask = nd_desc->bus_dsm_mask; desc = nd_cmd_bus_desc(cmd); guid = to_nfit_uuid(NFIT_DEV_BUS); handle = adev->handle;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Williams dan.j.williams@intel.com
commit c6c5df293bf1b488cf8459aac658aecfdccb13a9 upstream.
If query-ARS reports that ARS has stopped and requires continuation attempt to retrieve short-ARS results before continuing the long operation.
Fixes: bc6ba8085842 ("nfit, address-range-scrub: rework and simplify ARS...") Cc: stable@vger.kernel.org Reported-by: Krzysztof Rusocki krzysztof.rusocki@intel.com Reviewed-by: Toshi Kani toshi.kani@hpe.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/acpi/nfit/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/acpi/nfit/core.c +++ b/drivers/acpi/nfit/core.c @@ -3021,6 +3021,7 @@ static int ars_register(struct acpi_nfit
switch (acpi_nfit_query_poison(acpi_desc)) { case 0: + case -ENOSPC: case -EAGAIN: rc = ars_start(acpi_desc, nfit_spa, ARS_REQ_SHORT); /* shouldn't happen, try again later */ @@ -3045,7 +3046,6 @@ static int ars_register(struct acpi_nfit break; case -EBUSY: case -ENOMEM: - case -ENOSPC: /* * BIOS was using ARS, wait for it to complete (or * resources to become available) and then perform our
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Williams dan.j.williams@intel.com
commit fa3ed4d981b1fc19acdd07fcb152a4bd3706892b upstream.
The no_init_ars option is meant to prevent long-ARS, but short-ARS should be allowed to grab any immediate results.
Fixes: bc6ba8085842 ("nfit, address-range-scrub: rework and simplify ARS...") Cc: stable@vger.kernel.org Reported-by: Erwin Tsaur erwin.tsaur@oracle.com Reviewed-by: Toshi Kani toshi.kani@hpe.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/acpi/nfit/core.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/acpi/nfit/core.c +++ b/drivers/acpi/nfit/core.c @@ -3013,11 +3013,12 @@ static int ars_register(struct acpi_nfit { int rc;
- if (no_init_ars || test_bit(ARS_FAILED, &nfit_spa->ars_state)) + if (test_bit(ARS_FAILED, &nfit_spa->ars_state)) return acpi_nfit_register_region(acpi_desc, nfit_spa);
set_bit(ARS_REQ_SHORT, &nfit_spa->ars_state); - set_bit(ARS_REQ_LONG, &nfit_spa->ars_state); + if (!no_init_ars) + set_bit(ARS_REQ_LONG, &nfit_spa->ars_state);
switch (acpi_nfit_query_poison(acpi_desc)) { case 0:
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Williams dan.j.williams@intel.com
commit 966d23a006ca7b44ac8cf4d0c96b19785e0c3da0 upstream.
The UEFI 2.7 specification sets expectations that the 'updating' flag is eventually cleared. To date, the libnvdimm core has never adhered to that protocol. The policy of the core matches the policy of other multi-device info-block formats like MD-Software-RAID that expect administrator intervention on inconsistent info-blocks, not automatic invalidation.
However, some pre-boot environments may unfortunately attempt to "clean up" the labels and invalidate a set when it fails to find at least one "non-updating" label in the set. Clear the updating flag after set updates to minimize the window of vulnerability to aggressive pre-boot environments.
Ideally implementations would not write to the label area outside of creating namespaces.
Note that this only minimizes the window, it does not close it as the system can still crash while clearing the flag and the set can be subsequently deleted / invalidated by the pre-boot environment.
Fixes: f524bf271a5c ("libnvdimm: write pmem label set") Cc: stable@vger.kernel.org Cc: Kelly Couch kelly.j.couch@intel.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/nvdimm/label.c | 23 ++++++++++++++++++----- 1 file changed, 18 insertions(+), 5 deletions(-)
--- a/drivers/nvdimm/label.c +++ b/drivers/nvdimm/label.c @@ -755,7 +755,7 @@ static const guid_t *to_abstraction_guid
static int __pmem_label_update(struct nd_region *nd_region, struct nd_mapping *nd_mapping, struct nd_namespace_pmem *nspm, - int pos) + int pos, unsigned long flags) { struct nd_namespace_common *ndns = &nspm->nsio.common; struct nd_interleave_set *nd_set = nd_region->nd_set; @@ -796,7 +796,7 @@ static int __pmem_label_update(struct nd memcpy(nd_label->uuid, nspm->uuid, NSLABEL_UUID_LEN); if (nspm->alt_name) memcpy(nd_label->name, nspm->alt_name, NSLABEL_NAME_LEN); - nd_label->flags = __cpu_to_le32(NSLABEL_FLAG_UPDATING); + nd_label->flags = __cpu_to_le32(flags); nd_label->nlabel = __cpu_to_le16(nd_region->ndr_mappings); nd_label->position = __cpu_to_le16(pos); nd_label->isetcookie = __cpu_to_le64(cookie); @@ -1249,13 +1249,13 @@ static int del_labels(struct nd_mapping int nd_pmem_namespace_label_update(struct nd_region *nd_region, struct nd_namespace_pmem *nspm, resource_size_t size) { - int i; + int i, rc;
for (i = 0; i < nd_region->ndr_mappings; i++) { struct nd_mapping *nd_mapping = &nd_region->mapping[i]; struct nvdimm_drvdata *ndd = to_ndd(nd_mapping); struct resource *res; - int rc, count = 0; + int count = 0;
if (size == 0) { rc = del_labels(nd_mapping, nspm->uuid); @@ -1273,7 +1273,20 @@ int nd_pmem_namespace_label_update(struc if (rc < 0) return rc;
- rc = __pmem_label_update(nd_region, nd_mapping, nspm, i); + rc = __pmem_label_update(nd_region, nd_mapping, nspm, i, + NSLABEL_FLAG_UPDATING); + if (rc) + return rc; + } + + if (size == 0) + return 0; + + /* Clear the UPDATING flag per UEFI 2.7 expectations */ + for (i = 0; i < nd_region->ndr_mappings; i++) { + struct nd_mapping *nd_mapping = &nd_region->mapping[i]; + + rc = __pmem_label_update(nd_region, nd_mapping, nspm, i, 0); if (rc) return rc; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wei Yang richardw.yang@linux.intel.com
commit f101ada7da6551127d192c2f1742c1e9e0f62799 upstream.
When trying to see whether current nd_region intersects with others, trim_pfn_device() has already calculated the *size* to be expanded to SECTION size.
Do not double append 'adjust' to 'size' when calculating whether the end of a region collides with the next pmem region.
Fixes: ae86cbfef381 "libnvdimm, pfn: Pad pfn namespaces relative to other regions" Cc: stable@vger.kernel.org Signed-off-by: Wei Yang richardw.yang@linux.intel.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/nvdimm/pfn_devs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/nvdimm/pfn_devs.c +++ b/drivers/nvdimm/pfn_devs.c @@ -678,7 +678,7 @@ static void trim_pfn_device(struct nd_pf if (region_intersects(start, size, IORESOURCE_SYSTEM_RAM, IORES_DESC_NONE) == REGION_MIXED || !IS_ALIGNED(end, nd_pfn->align) - || nd_region_conflict(nd_region, start, size + adjust)) + || nd_region_conflict(nd_region, start, size)) *end_trunc = end - phys_pmem_align_down(nd_pfn, end); }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Williams dan.j.williams@intel.com
commit fa7d2e639cd90442d868dfc6ca1d4cc9d8bf206e upstream.
For recovery, where non-dax access is needed to a given physical address range, and testing, allow the 'force_raw' attribute to override the default establishment of a dev_pagemap.
Otherwise without this capability it is possible to end up with a namespace that can not be activated due to corrupted info-block, and one that can not be repaired due to a section collision.
Cc: stable@vger.kernel.org Fixes: 004f1afbe199 ("libnvdimm, pmem: direct map legacy pmem by default") Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/nvdimm/namespace_devs.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/nvdimm/namespace_devs.c +++ b/drivers/nvdimm/namespace_devs.c @@ -138,6 +138,7 @@ bool nd_is_uuid_unique(struct device *de bool pmem_should_map_pages(struct device *dev) { struct nd_region *nd_region = to_nd_region(dev->parent); + struct nd_namespace_common *ndns = to_ndns(dev); struct nd_namespace_io *nsio;
if (!IS_ENABLED(CONFIG_ZONE_DEVICE)) @@ -149,6 +150,9 @@ bool pmem_should_map_pages(struct device if (is_nd_pfn(dev) || is_nd_btt(dev)) return false;
+ if (ndns->force_raw) + return false; + nsio = to_nd_namespace_io(dev); if (region_intersects(nsio->res.start, resource_size(&nsio->res), IORESOURCE_SYSTEM_RAM,
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Oliver O'Halloran oohall@gmail.com
commit 07464e88365e9236febaca9ed1a2e2006d8bc952 upstream.
Libnvdimm reserves the first 8K of pfn and devicedax namespaces to store a superblock describing the namespace. This 8K reservation is contained within the altmap area which the kernel uses for the vmemmap backing for the pages within the namespace. The altmap allows for some pages at the start of the altmap area to be reserved and that mechanism is used to protect the superblock from being re-used as vmemmap backing.
The number of PFNs to reserve is calculated using:
PHYS_PFN(SZ_8K)
Which is implemented as:
#define PHYS_PFN(x) ((unsigned long)((x) >> PAGE_SHIFT))
So on systems where PAGE_SIZE is greater than 8K the reservation size is truncated to zero and the superblock area is re-used as vmemmap backing. As a result all the namespace information stored in the superblock (i.e. if it's a PFN or DAX namespace) is lost and the namespace needs to be re-created to get access to the contents.
This patch fixes this by using PFN_UP() rather than PHYS_PFN() to ensure that at least one page is reserved. On systems with a 4K pages size this patch should have no effect.
Cc: stable@vger.kernel.org Cc: Dan Williams dan.j.williams@intel.com Fixes: ac515c084be9 ("libnvdimm, pmem, pfn: move pfn setup to the core") Signed-off-by: Oliver O'Halloran oohall@gmail.com Reviewed-by: Vishal Verma vishal.l.verma@intel.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/nvdimm/pfn_devs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/nvdimm/pfn_devs.c +++ b/drivers/nvdimm/pfn_devs.c @@ -593,7 +593,7 @@ static unsigned long init_altmap_base(re
static unsigned long init_altmap_reserve(resource_size_t base) { - unsigned long reserve = PHYS_PFN(SZ_8K); + unsigned long reserve = PFN_UP(SZ_8K); unsigned long base_pfn = PHYS_PFN(base);
reserve += base_pfn - PFN_SECTION_ALIGN_DOWN(base_pfn);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Al Viro viro@zeniv.linux.org.uk
commit 399504e21a10be16dd1408ba0147367d9d82a10c upstream.
same story as with last May fixes in sysfs (7b745a4e4051 "unfuck sysfs_mount()"); new_sb is left uninitialized in case of early errors in kernfs_mount_ns() and papering over it by treating any error from kernfs_mount_ns() as equivalent to !new_ns ends up conflating the cases when objects had never been transferred to a superblock with ones when that has happened and resulting new superblock had been dropped. Easily fixed (same way as in sysfs case). Additionally, there's a superblock leak on kernfs_node_dentry() failure *and* a dentry leak inside kernfs_node_dentry() itself - the latter on probably impossible errors, but the former not impossible to trigger (as the matter of fact, injecting allocation failures at that point *does* trigger it).
Cc: stable@kernel.org Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/kernfs/mount.c | 8 ++++++-- kernel/cgroup/cgroup.c | 9 ++++++--- 2 files changed, 12 insertions(+), 5 deletions(-)
--- a/fs/kernfs/mount.c +++ b/fs/kernfs/mount.c @@ -196,8 +196,10 @@ struct dentry *kernfs_node_dentry(struct return dentry;
knparent = find_next_ancestor(kn, NULL); - if (WARN_ON(!knparent)) + if (WARN_ON(!knparent)) { + dput(dentry); return ERR_PTR(-EINVAL); + }
do { struct dentry *dtmp; @@ -206,8 +208,10 @@ struct dentry *kernfs_node_dentry(struct if (kn == knparent) return dentry; kntmp = find_next_ancestor(kn, knparent); - if (WARN_ON(!kntmp)) + if (WARN_ON(!kntmp)) { + dput(dentry); return ERR_PTR(-EINVAL); + } dtmp = lookup_one_len_unlocked(kntmp->name, dentry, strlen(kntmp->name)); dput(dentry); --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -2033,7 +2033,7 @@ struct dentry *cgroup_do_mount(struct fi struct cgroup_namespace *ns) { struct dentry *dentry; - bool new_sb; + bool new_sb = false;
dentry = kernfs_mount(fs_type, flags, root->kf_root, magic, &new_sb);
@@ -2043,6 +2043,7 @@ struct dentry *cgroup_do_mount(struct fi */ if (!IS_ERR(dentry) && ns != &init_cgroup_ns) { struct dentry *nsdentry; + struct super_block *sb = dentry->d_sb; struct cgroup *cgrp;
mutex_lock(&cgroup_mutex); @@ -2053,12 +2054,14 @@ struct dentry *cgroup_do_mount(struct fi spin_unlock_irq(&css_set_lock); mutex_unlock(&cgroup_mutex);
- nsdentry = kernfs_node_dentry(cgrp->kn, dentry->d_sb); + nsdentry = kernfs_node_dentry(cgrp->kn, sb); dput(dentry); + if (IS_ERR(nsdentry)) + deactivate_locked_super(sb); dentry = nsdentry; }
- if (IS_ERR(dentry) || !new_sb) + if (!new_sb) cgroup_put(&root->cgrp);
return dentry;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit 6ebc97006b196aafa9df0497fdfa866cf26f259b upstream.
Some algorithms have a ->setkey() method that is not atomic, in the sense that setting a key can fail after changes were already made to the tfm context. In this case, if a key was already set the tfm can end up in a state that corresponds to neither the old key nor the new key.
For example, in gcm.c, if the kzalloc() fails due to lack of memory, then the CTR part of GCM will have the new key but GHASH will not.
It's not feasible to make all ->setkey() methods atomic, especially ones that have to key multiple sub-tfms. Therefore, make the crypto API set CRYPTO_TFM_NEED_KEY if ->setkey() fails, to prevent the tfm from being used until a new key is set.
[Cc stable mainly because when introducing the NEED_KEY flag I changed AF_ALG to rely on it; and unlike in-kernel crypto API users, AF_ALG previously didn't have this problem. So these "incompletely keyed" states became theoretically accessible via AF_ALG -- though, the opportunities for causing real mischief seem pretty limited.]
Fixes: dc26c17f743a ("crypto: aead - prevent using AEADs without setting key") Cc: stable@vger.kernel.org # v4.16+ Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- crypto/aead.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/crypto/aead.c +++ b/crypto/aead.c @@ -61,8 +61,10 @@ int crypto_aead_setkey(struct crypto_aea else err = crypto_aead_alg(tfm)->setkey(tfm, key, keylen);
- if (err) + if (unlikely(err)) { + crypto_aead_set_flags(tfm, CRYPTO_TFM_NEED_KEY); return err; + }
crypto_aead_clear_flags(tfm, CRYPTO_TFM_NEED_KEY); return 0;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit 0f533e67d26f228ea5dfdacc8a4bdeb487af5208 upstream.
The generic AEGIS implementations all fail the improved AEAD tests because they produce the wrong result with some data layouts. The issue is that they assume that if the skcipher_walk API gives 'nbytes' not aligned to the walksize (a.k.a. walk.stride), then it is the end of the data. In fact, this can happen before the end. Fix them.
Fixes: f606a88e5823 ("crypto: aegis - Add generic AEGIS AEAD implementations") Cc: stable@vger.kernel.org # v4.18+ Cc: Ondrej Mosnacek omosnace@redhat.com Signed-off-by: Eric Biggers ebiggers@google.com Reviewed-by: Ondrej Mosnacek omosnace@redhat.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- crypto/aegis128.c | 14 +++++++------- crypto/aegis128l.c | 14 +++++++------- crypto/aegis256.c | 14 +++++++------- 3 files changed, 21 insertions(+), 21 deletions(-)
--- a/crypto/aegis128.c +++ b/crypto/aegis128.c @@ -290,19 +290,19 @@ static void crypto_aegis128_process_cryp const struct aegis128_ops *ops) { struct skcipher_walk walk; - u8 *src, *dst; - unsigned int chunksize;
ops->skcipher_walk_init(&walk, req, false);
while (walk.nbytes) { - src = walk.src.virt.addr; - dst = walk.dst.virt.addr; - chunksize = walk.nbytes; + unsigned int nbytes = walk.nbytes;
- ops->crypt_chunk(state, dst, src, chunksize); + if (nbytes < walk.total) + nbytes = round_down(nbytes, walk.stride);
- skcipher_walk_done(&walk, 0); + ops->crypt_chunk(state, walk.dst.virt.addr, walk.src.virt.addr, + nbytes); + + skcipher_walk_done(&walk, walk.nbytes - nbytes); } }
--- a/crypto/aegis128l.c +++ b/crypto/aegis128l.c @@ -353,19 +353,19 @@ static void crypto_aegis128l_process_cry const struct aegis128l_ops *ops) { struct skcipher_walk walk; - u8 *src, *dst; - unsigned int chunksize;
ops->skcipher_walk_init(&walk, req, false);
while (walk.nbytes) { - src = walk.src.virt.addr; - dst = walk.dst.virt.addr; - chunksize = walk.nbytes; + unsigned int nbytes = walk.nbytes;
- ops->crypt_chunk(state, dst, src, chunksize); + if (nbytes < walk.total) + nbytes = round_down(nbytes, walk.stride);
- skcipher_walk_done(&walk, 0); + ops->crypt_chunk(state, walk.dst.virt.addr, walk.src.virt.addr, + nbytes); + + skcipher_walk_done(&walk, walk.nbytes - nbytes); } }
--- a/crypto/aegis256.c +++ b/crypto/aegis256.c @@ -303,19 +303,19 @@ static void crypto_aegis256_process_cryp const struct aegis256_ops *ops) { struct skcipher_walk walk; - u8 *src, *dst; - unsigned int chunksize;
ops->skcipher_walk_init(&walk, req, false);
while (walk.nbytes) { - src = walk.src.virt.addr; - dst = walk.dst.virt.addr; - chunksize = walk.nbytes; + unsigned int nbytes = walk.nbytes;
- ops->crypt_chunk(state, dst, src, chunksize); + if (nbytes < walk.total) + nbytes = round_down(nbytes, walk.stride);
- skcipher_walk_done(&walk, 0); + ops->crypt_chunk(state, walk.dst.virt.addr, walk.src.virt.addr, + nbytes); + + skcipher_walk_done(&walk, walk.nbytes - nbytes); } }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ard Biesheuvel ard.biesheuvel@linaro.org
commit 62fecf295e3c48be1b5f17c440b93875b9adb4d6 upstream.
The SIMD routine ported from x86 used to have a special code path for inputs < 16 bytes, which got lost somewhere along the way. Instead, the current glue code aligns the input pointer to permit the NEON routine to use special versions of the vld1 instructions that assume 16 byte alignment, but this could result in inputs of less than 16 bytes to be passed in. This not only fails the new extended tests that Eric has implemented, it also results in the code reading past the end of the input, which could potentially result in crashes when dealing with less than 16 bytes of input at the end of a page which is followed by an unmapped page.
So update the glue code to only invoke the NEON routine if the input is at least 16 bytes.
Reported-by: Eric Biggers ebiggers@kernel.org Reviewed-by: Eric Biggers ebiggers@kernel.org Fixes: 1d481f1cd892 ("crypto: arm/crct10dif - port x86 SSE implementation to ARM") Cc: stable@vger.kernel.org # v4.10+ Signed-off-by: Ard Biesheuvel ard.biesheuvel@linaro.org Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm/crypto/crct10dif-ce-core.S | 14 +++++++------- arch/arm/crypto/crct10dif-ce-glue.c | 23 ++++++----------------- 2 files changed, 13 insertions(+), 24 deletions(-)
--- a/arch/arm/crypto/crct10dif-ce-core.S +++ b/arch/arm/crypto/crct10dif-ce-core.S @@ -124,10 +124,10 @@ ENTRY(crc_t10dif_pmull) vext.8 q10, qzr, q0, #4
// receive the initial 64B data, xor the initial crc value - vld1.64 {q0-q1}, [arg2, :128]! - vld1.64 {q2-q3}, [arg2, :128]! - vld1.64 {q4-q5}, [arg2, :128]! - vld1.64 {q6-q7}, [arg2, :128]! + vld1.64 {q0-q1}, [arg2]! + vld1.64 {q2-q3}, [arg2]! + vld1.64 {q4-q5}, [arg2]! + vld1.64 {q6-q7}, [arg2]! CPU_LE( vrev64.8 q0, q0 ) CPU_LE( vrev64.8 q1, q1 ) CPU_LE( vrev64.8 q2, q2 ) @@ -167,7 +167,7 @@ CPU_LE( vrev64.8 q7, q7 ) _fold_64_B_loop:
.macro fold64, reg1, reg2 - vld1.64 {q11-q12}, [arg2, :128]! + vld1.64 {q11-q12}, [arg2]!
vmull.p64 q8, \reg1()h, d21 vmull.p64 \reg1, \reg1()l, d20 @@ -238,7 +238,7 @@ _16B_reduction_loop: vmull.p64 q7, d15, d21 veor.8 q7, q7, q8
- vld1.64 {q0}, [arg2, :128]! + vld1.64 {q0}, [arg2]! CPU_LE( vrev64.8 q0, q0 ) vswp d0, d1 veor.8 q7, q7, q0 @@ -335,7 +335,7 @@ _less_than_128: vmov.i8 q0, #0 vmov s3, arg1_low32 // get the initial crc value
- vld1.64 {q7}, [arg2, :128]! + vld1.64 {q7}, [arg2]! CPU_LE( vrev64.8 q7, q7 ) vswp d14, d15 veor.8 q7, q7, q0 --- a/arch/arm/crypto/crct10dif-ce-glue.c +++ b/arch/arm/crypto/crct10dif-ce-glue.c @@ -35,26 +35,15 @@ static int crct10dif_update(struct shash unsigned int length) { u16 *crc = shash_desc_ctx(desc); - unsigned int l;
- if (!may_use_simd()) { - *crc = crc_t10dif_generic(*crc, data, length); + if (length >= CRC_T10DIF_PMULL_CHUNK_SIZE && may_use_simd()) { + kernel_neon_begin(); + *crc = crc_t10dif_pmull(*crc, data, length); + kernel_neon_end(); } else { - if (unlikely((u32)data % CRC_T10DIF_PMULL_CHUNK_SIZE)) { - l = min_t(u32, length, CRC_T10DIF_PMULL_CHUNK_SIZE - - ((u32)data % CRC_T10DIF_PMULL_CHUNK_SIZE)); - - *crc = crc_t10dif_generic(*crc, data, l); - - length -= l; - data += l; - } - if (length > 0) { - kernel_neon_begin(); - *crc = crc_t10dif_pmull(*crc, data, length); - kernel_neon_end(); - } + *crc = crc_t10dif_generic(*crc, data, length); } + return 0; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit 12455e320e19e9cc7ad97f4ab89c280fe297387c upstream.
The arm64 NEON bit-sliced implementation of AES-CTR fails the improved skcipher tests because it sometimes produces the wrong ciphertext. The bug is that the final keystream block isn't returned from the assembly code when the number of non-final blocks is zero. This can happen if the input data ends a few bytes after a page boundary. In this case the last bytes get "encrypted" by XOR'ing them with uninitialized memory.
Fix the assembly code to return the final keystream block when needed.
Fixes: 88a3f582bea9 ("crypto: arm64/aes - don't use IV buffer to return final keystream block") Cc: stable@vger.kernel.org # v4.11+ Reviewed-by: Ard Biesheuvel ard.biesheuvel@linaro.org Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/crypto/aes-neonbs-core.S | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/arch/arm64/crypto/aes-neonbs-core.S +++ b/arch/arm64/crypto/aes-neonbs-core.S @@ -971,18 +971,22 @@ CPU_LE( rev x8, x8 )
8: next_ctr v0 st1 {v0.16b}, [x24] - cbz x23, 0f + cbz x23, .Lctr_done
cond_yield_neon 98b b 99b
-0: frame_pop +.Lctr_done: + frame_pop ret
/* * If we are handling the tail of the input (x6 != NULL), return the * final keystream block back to the caller. */ +0: cbz x25, 8b + st1 {v0.16b}, [x25] + b 8b 1: cbz x25, 8b st1 {v1.16b}, [x25] b 8b
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ard Biesheuvel ard.biesheuvel@linaro.org
commit d72b9d4acd548251f55b16843fc7a05dc5c80de8 upstream.
The SIMD routine ported from x86 used to have a special code path for inputs < 16 bytes, which got lost somewhere along the way. Instead, the current glue code aligns the input pointer to 16 bytes, which is not really necessary on this architecture (although it could be beneficial to performance to expose aligned data to the the NEON routine), but this could result in inputs of less than 16 bytes to be passed in. This not only fails the new extended tests that Eric has implemented, it also results in the code reading past the end of the input, which could potentially result in crashes when dealing with less than 16 bytes of input at the end of a page which is followed by an unmapped page.
So update the glue code to only invoke the NEON routine if the input is at least 16 bytes.
Reported-by: Eric Biggers ebiggers@kernel.org Reviewed-by: Eric Biggers ebiggers@kernel.org Fixes: 6ef5737f3931 ("crypto: arm64/crct10dif - port x86 SSE implementation to arm64") Cc: stable@vger.kernel.org # v4.10+ Signed-off-by: Ard Biesheuvel ard.biesheuvel@linaro.org Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/crypto/crct10dif-ce-glue.c | 25 ++++++------------------- 1 file changed, 6 insertions(+), 19 deletions(-)
--- a/arch/arm64/crypto/crct10dif-ce-glue.c +++ b/arch/arm64/crypto/crct10dif-ce-glue.c @@ -39,26 +39,13 @@ static int crct10dif_update(struct shash unsigned int length) { u16 *crc = shash_desc_ctx(desc); - unsigned int l;
- if (unlikely((u64)data % CRC_T10DIF_PMULL_CHUNK_SIZE)) { - l = min_t(u32, length, CRC_T10DIF_PMULL_CHUNK_SIZE - - ((u64)data % CRC_T10DIF_PMULL_CHUNK_SIZE)); - - *crc = crc_t10dif_generic(*crc, data, l); - - length -= l; - data += l; - } - - if (length > 0) { - if (may_use_simd()) { - kernel_neon_begin(); - *crc = crc_t10dif_pmull(*crc, data, length); - kernel_neon_end(); - } else { - *crc = crc_t10dif_generic(*crc, data, length); - } + if (length >= CRC_T10DIF_PMULL_CHUNK_SIZE && may_use_simd()) { + kernel_neon_begin(); + *crc = crc_t10dif_pmull(*crc, data, length); + kernel_neon_end(); + } else { + *crc = crc_t10dif_generic(*crc, data, length); }
return 0;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit ba7d7433a0e998c902132bd47330e355a1eaa894 upstream.
Some algorithms have a ->setkey() method that is not atomic, in the sense that setting a key can fail after changes were already made to the tfm context. In this case, if a key was already set the tfm can end up in a state that corresponds to neither the old key nor the new key.
It's not feasible to make all ->setkey() methods atomic, especially ones that have to key multiple sub-tfms. Therefore, make the crypto API set CRYPTO_TFM_NEED_KEY if ->setkey() fails and the algorithm requires a key, to prevent the tfm from being used until a new key is set.
Note: we can't set CRYPTO_TFM_NEED_KEY for OPTIONAL_KEY algorithms, so ->setkey() for those must nevertheless be atomic. That's fine for now since only the crc32 and crc32c algorithms set OPTIONAL_KEY, and it's not intended that OPTIONAL_KEY be used much.
[Cc stable mainly because when introducing the NEED_KEY flag I changed AF_ALG to rely on it; and unlike in-kernel crypto API users, AF_ALG previously didn't have this problem. So these "incompletely keyed" states became theoretically accessible via AF_ALG -- though, the opportunities for causing real mischief seem pretty limited.]
Fixes: 9fa68f620041 ("crypto: hash - prevent using keyed hashes without setting key") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- crypto/ahash.c | 28 +++++++++++++++++++--------- crypto/shash.c | 18 +++++++++++++----- 2 files changed, 32 insertions(+), 14 deletions(-)
--- a/crypto/ahash.c +++ b/crypto/ahash.c @@ -190,6 +190,21 @@ static int ahash_setkey_unaligned(struct return ret; }
+static int ahash_nosetkey(struct crypto_ahash *tfm, const u8 *key, + unsigned int keylen) +{ + return -ENOSYS; +} + +static void ahash_set_needkey(struct crypto_ahash *tfm) +{ + const struct hash_alg_common *alg = crypto_hash_alg_common(tfm); + + if (tfm->setkey != ahash_nosetkey && + !(alg->base.cra_flags & CRYPTO_ALG_OPTIONAL_KEY)) + crypto_ahash_set_flags(tfm, CRYPTO_TFM_NEED_KEY); +} + int crypto_ahash_setkey(struct crypto_ahash *tfm, const u8 *key, unsigned int keylen) { @@ -201,20 +216,16 @@ int crypto_ahash_setkey(struct crypto_ah else err = tfm->setkey(tfm, key, keylen);
- if (err) + if (unlikely(err)) { + ahash_set_needkey(tfm); return err; + }
crypto_ahash_clear_flags(tfm, CRYPTO_TFM_NEED_KEY); return 0; } EXPORT_SYMBOL_GPL(crypto_ahash_setkey);
-static int ahash_nosetkey(struct crypto_ahash *tfm, const u8 *key, - unsigned int keylen) -{ - return -ENOSYS; -} - static inline unsigned int ahash_align_buffer_size(unsigned len, unsigned long mask) { @@ -489,8 +500,7 @@ static int crypto_ahash_init_tfm(struct
if (alg->setkey) { hash->setkey = alg->setkey; - if (!(alg->halg.base.cra_flags & CRYPTO_ALG_OPTIONAL_KEY)) - crypto_ahash_set_flags(hash, CRYPTO_TFM_NEED_KEY); + ahash_set_needkey(hash); }
return 0; --- a/crypto/shash.c +++ b/crypto/shash.c @@ -53,6 +53,13 @@ static int shash_setkey_unaligned(struct return err; }
+static void shash_set_needkey(struct crypto_shash *tfm, struct shash_alg *alg) +{ + if (crypto_shash_alg_has_setkey(alg) && + !(alg->base.cra_flags & CRYPTO_ALG_OPTIONAL_KEY)) + crypto_shash_set_flags(tfm, CRYPTO_TFM_NEED_KEY); +} + int crypto_shash_setkey(struct crypto_shash *tfm, const u8 *key, unsigned int keylen) { @@ -65,8 +72,10 @@ int crypto_shash_setkey(struct crypto_sh else err = shash->setkey(tfm, key, keylen);
- if (err) + if (unlikely(err)) { + shash_set_needkey(tfm, shash); return err; + }
crypto_shash_clear_flags(tfm, CRYPTO_TFM_NEED_KEY); return 0; @@ -373,7 +382,8 @@ int crypto_init_shash_ops_async(struct c crt->final = shash_async_final; crt->finup = shash_async_finup; crt->digest = shash_async_digest; - crt->setkey = shash_async_setkey; + if (crypto_shash_alg_has_setkey(alg)) + crt->setkey = shash_async_setkey;
crypto_ahash_set_flags(crt, crypto_shash_get_flags(shash) & CRYPTO_TFM_NEED_KEY); @@ -395,9 +405,7 @@ static int crypto_shash_init_tfm(struct
hash->descsize = alg->descsize;
- if (crypto_shash_alg_has_setkey(alg) && - !(alg->base.cra_flags & CRYPTO_ALG_OPTIONAL_KEY)) - crypto_shash_set_flags(hash, CRYPTO_TFM_NEED_KEY); + shash_set_needkey(hash, alg);
return 0; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit d644f1c8746ed24f81075480f9e9cb3777ae8d65 upstream.
The generic MORUS implementations all fail the improved AEAD tests because they produce the wrong result with some data layouts. The issue is that they assume that if the skcipher_walk API gives 'nbytes' not aligned to the walksize (a.k.a. walk.stride), then it is the end of the data. In fact, this can happen before the end. Fix them.
Fixes: 396be41f16fd ("crypto: morus - Add generic MORUS AEAD implementations") Cc: stable@vger.kernel.org # v4.18+ Cc: Ondrej Mosnacek omosnace@redhat.com Signed-off-by: Eric Biggers ebiggers@google.com Reviewed-by: Ondrej Mosnacek omosnace@redhat.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- crypto/morus1280.c | 13 +++++++------ crypto/morus640.c | 13 +++++++------ 2 files changed, 14 insertions(+), 12 deletions(-)
--- a/crypto/morus1280.c +++ b/crypto/morus1280.c @@ -366,18 +366,19 @@ static void crypto_morus1280_process_cry const struct morus1280_ops *ops) { struct skcipher_walk walk; - u8 *dst; - const u8 *src;
ops->skcipher_walk_init(&walk, req, false);
while (walk.nbytes) { - src = walk.src.virt.addr; - dst = walk.dst.virt.addr; + unsigned int nbytes = walk.nbytes;
- ops->crypt_chunk(state, dst, src, walk.nbytes); + if (nbytes < walk.total) + nbytes = round_down(nbytes, walk.stride);
- skcipher_walk_done(&walk, 0); + ops->crypt_chunk(state, walk.dst.virt.addr, walk.src.virt.addr, + nbytes); + + skcipher_walk_done(&walk, walk.nbytes - nbytes); } }
--- a/crypto/morus640.c +++ b/crypto/morus640.c @@ -365,18 +365,19 @@ static void crypto_morus640_process_cryp const struct morus640_ops *ops) { struct skcipher_walk walk; - u8 *dst; - const u8 *src;
ops->skcipher_walk_init(&walk, req, false);
while (walk.nbytes) { - src = walk.src.virt.addr; - dst = walk.dst.virt.addr; + unsigned int nbytes = walk.nbytes;
- ops->crypt_chunk(state, dst, src, walk.nbytes); + if (nbytes < walk.total) + nbytes = round_down(nbytes, walk.stride);
- skcipher_walk_done(&walk, 0); + ops->crypt_chunk(state, walk.dst.virt.addr, walk.src.virt.addr, + nbytes); + + skcipher_walk_done(&walk, walk.nbytes - nbytes); } }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit 251b7aea34ba3c4d4fdfa9447695642eb8b8b098 upstream.
The memcpy()s in the PCBC implementation use walk->iv as both the source and destination, which has undefined behavior. These memcpy()'s are actually unneeded, because walk->iv is already used to hold the previous plaintext block XOR'd with the previous ciphertext block. Thus, walk->iv is already updated to its final value.
So remove the broken and unnecessary memcpy()s.
Fixes: 91652be5d1b9 ("[CRYPTO] pcbc: Add Propagated CBC template") Cc: stable@vger.kernel.org # v2.6.21+ Cc: David Howells dhowells@redhat.com Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- crypto/pcbc.c | 14 ++++---------- 1 file changed, 4 insertions(+), 10 deletions(-)
--- a/crypto/pcbc.c +++ b/crypto/pcbc.c @@ -51,7 +51,7 @@ static int crypto_pcbc_encrypt_segment(s unsigned int nbytes = walk->nbytes; u8 *src = walk->src.virt.addr; u8 *dst = walk->dst.virt.addr; - u8 *iv = walk->iv; + u8 * const iv = walk->iv;
do { crypto_xor(iv, src, bsize); @@ -72,7 +72,7 @@ static int crypto_pcbc_encrypt_inplace(s int bsize = crypto_cipher_blocksize(tfm); unsigned int nbytes = walk->nbytes; u8 *src = walk->src.virt.addr; - u8 *iv = walk->iv; + u8 * const iv = walk->iv; u8 tmpbuf[MAX_CIPHER_BLOCKSIZE];
do { @@ -84,8 +84,6 @@ static int crypto_pcbc_encrypt_inplace(s src += bsize; } while ((nbytes -= bsize) >= bsize);
- memcpy(walk->iv, iv, bsize); - return nbytes; }
@@ -121,7 +119,7 @@ static int crypto_pcbc_decrypt_segment(s unsigned int nbytes = walk->nbytes; u8 *src = walk->src.virt.addr; u8 *dst = walk->dst.virt.addr; - u8 *iv = walk->iv; + u8 * const iv = walk->iv;
do { crypto_cipher_decrypt_one(tfm, dst, src); @@ -132,8 +130,6 @@ static int crypto_pcbc_decrypt_segment(s dst += bsize; } while ((nbytes -= bsize) >= bsize);
- memcpy(walk->iv, iv, bsize); - return nbytes; }
@@ -144,7 +140,7 @@ static int crypto_pcbc_decrypt_inplace(s int bsize = crypto_cipher_blocksize(tfm); unsigned int nbytes = walk->nbytes; u8 *src = walk->src.virt.addr; - u8 *iv = walk->iv; + u8 * const iv = walk->iv; u8 tmpbuf[MAX_CIPHER_BLOCKSIZE] __aligned(__alignof__(u32));
do { @@ -156,8 +152,6 @@ static int crypto_pcbc_decrypt_inplace(s src += bsize; } while ((nbytes -= bsize) >= bsize);
- memcpy(walk->iv, iv, bsize); - return nbytes; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit b1f6b4bf416b49f00f3abc49c639371cdecaaad1 upstream.
Some algorithms have a ->setkey() method that is not atomic, in the sense that setting a key can fail after changes were already made to the tfm context. In this case, if a key was already set the tfm can end up in a state that corresponds to neither the old key nor the new key.
For example, in lrw.c, if gf128mul_init_64k_bbe() fails due to lack of memory, then priv::table will be left NULL. After that, encryption with that tfm will cause a NULL pointer dereference.
It's not feasible to make all ->setkey() methods atomic, especially ones that have to key multiple sub-tfms. Therefore, make the crypto API set CRYPTO_TFM_NEED_KEY if ->setkey() fails and the algorithm requires a key, to prevent the tfm from being used until a new key is set.
[Cc stable mainly because when introducing the NEED_KEY flag I changed AF_ALG to rely on it; and unlike in-kernel crypto API users, AF_ALG previously didn't have this problem. So these "incompletely keyed" states became theoretically accessible via AF_ALG -- though, the opportunities for causing real mischief seem pretty limited.]
Fixes: f8d33fac8480 ("crypto: skcipher - prevent using skciphers without setting key") Cc: stable@vger.kernel.org # v4.16+ Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- crypto/skcipher.c | 27 ++++++++++++++++++--------- 1 file changed, 18 insertions(+), 9 deletions(-)
--- a/crypto/skcipher.c +++ b/crypto/skcipher.c @@ -585,6 +585,12 @@ static unsigned int crypto_skcipher_exts return crypto_alg_extsize(alg); }
+static void skcipher_set_needkey(struct crypto_skcipher *tfm) +{ + if (tfm->keysize) + crypto_skcipher_set_flags(tfm, CRYPTO_TFM_NEED_KEY); +} + static int skcipher_setkey_blkcipher(struct crypto_skcipher *tfm, const u8 *key, unsigned int keylen) { @@ -598,8 +604,10 @@ static int skcipher_setkey_blkcipher(str err = crypto_blkcipher_setkey(blkcipher, key, keylen); crypto_skcipher_set_flags(tfm, crypto_blkcipher_get_flags(blkcipher) & CRYPTO_TFM_RES_MASK); - if (err) + if (unlikely(err)) { + skcipher_set_needkey(tfm); return err; + }
crypto_skcipher_clear_flags(tfm, CRYPTO_TFM_NEED_KEY); return 0; @@ -677,8 +685,7 @@ static int crypto_init_skcipher_ops_blkc skcipher->ivsize = crypto_blkcipher_ivsize(blkcipher); skcipher->keysize = calg->cra_blkcipher.max_keysize;
- if (skcipher->keysize) - crypto_skcipher_set_flags(skcipher, CRYPTO_TFM_NEED_KEY); + skcipher_set_needkey(skcipher);
return 0; } @@ -698,8 +705,10 @@ static int skcipher_setkey_ablkcipher(st crypto_skcipher_set_flags(tfm, crypto_ablkcipher_get_flags(ablkcipher) & CRYPTO_TFM_RES_MASK); - if (err) + if (unlikely(err)) { + skcipher_set_needkey(tfm); return err; + }
crypto_skcipher_clear_flags(tfm, CRYPTO_TFM_NEED_KEY); return 0; @@ -776,8 +785,7 @@ static int crypto_init_skcipher_ops_ablk sizeof(struct ablkcipher_request); skcipher->keysize = calg->cra_ablkcipher.max_keysize;
- if (skcipher->keysize) - crypto_skcipher_set_flags(skcipher, CRYPTO_TFM_NEED_KEY); + skcipher_set_needkey(skcipher);
return 0; } @@ -820,8 +828,10 @@ static int skcipher_setkey(struct crypto else err = cipher->setkey(tfm, key, keylen);
- if (err) + if (unlikely(err)) { + skcipher_set_needkey(tfm); return err; + }
crypto_skcipher_clear_flags(tfm, CRYPTO_TFM_NEED_KEY); return 0; @@ -852,8 +862,7 @@ static int crypto_skcipher_init_tfm(stru skcipher->ivsize = alg->ivsize; skcipher->keysize = alg->max_keysize;
- if (skcipher->keysize) - crypto_skcipher_set_flags(skcipher, CRYPTO_TFM_NEED_KEY); + skcipher_set_needkey(skcipher);
if (alg->exit) skcipher->base.exit = crypto_skcipher_exit_tfm;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit eb5e6730db98fcc4b51148b4a819fa4bf864ae54 upstream.
Instantiating "cryptd(crc32c)" causes a crypto self-test failure because the crypto_alloc_shash() in alg_test_crc32c() fails. This is because cryptd(crc32c) is an ahash algorithm, not a shash algorithm; so it can only be accessed through the ahash API, unlike shash algorithms which can be accessed through both the ahash and shash APIs.
As the test is testing the shash descriptor format which is only applicable to shash algorithms, skip it for ahash algorithms.
(Note that it's still important to fix crypto self-test failures even for weird algorithm instantiations like cryptd(crc32c) that no one would really use; in fips_enabled mode unprivileged users can use them to panic the kernel, and also they prevent treating a crypto self-test failure as a bug when fuzzing the kernel.)
Fixes: 8e3ee85e68c5 ("crypto: crc32c - Test descriptor context format") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- crypto/testmgr.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
--- a/crypto/testmgr.c +++ b/crypto/testmgr.c @@ -1894,14 +1894,21 @@ static int alg_test_crc32c(const struct
err = alg_test_hash(desc, driver, type, mask); if (err) - goto out; + return err;
tfm = crypto_alloc_shash(driver, type, mask); if (IS_ERR(tfm)) { + if (PTR_ERR(tfm) == -ENOENT) { + /* + * This crc32c implementation is only available through + * ahash API, not the shash API, so the remaining part + * of the test is not applicable to it. + */ + return 0; + } printk(KERN_ERR "alg: crc32c: Failed to load transform for %s: " "%ld\n", driver, PTR_ERR(tfm)); - err = PTR_ERR(tfm); - goto out; + return PTR_ERR(tfm); }
do { @@ -1928,7 +1935,6 @@ static int alg_test_crc32c(const struct
crypto_free_shash(tfm);
-out: return err; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit ba6771c0a0bc2fac9d6a8759bab8493bd1cffe3b upstream.
The x86 AEGIS implementations all fail the improved AEAD tests because they produce the wrong result with some data layouts. The issue is that they assume that if the skcipher_walk API gives 'nbytes' not aligned to the walksize (a.k.a. walk.stride), then it is the end of the data. In fact, this can happen before the end.
Also, when the CRYPTO_TFM_REQ_MAY_SLEEP flag is given, they can incorrectly sleep in the skcipher_walk_*() functions while preemption has been disabled by kernel_fpu_begin().
Fix these bugs.
Fixes: 1d373d4e8e15 ("crypto: x86 - Add optimized AEGIS implementations") Cc: stable@vger.kernel.org # v4.18+ Cc: Ondrej Mosnacek omosnace@redhat.com Signed-off-by: Eric Biggers ebiggers@google.com Reviewed-by: Ondrej Mosnacek omosnace@redhat.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/crypto/aegis128-aesni-glue.c | 38 +++++++++++++-------------------- arch/x86/crypto/aegis128l-aesni-glue.c | 38 +++++++++++++-------------------- arch/x86/crypto/aegis256-aesni-glue.c | 38 +++++++++++++-------------------- 3 files changed, 45 insertions(+), 69 deletions(-)
--- a/arch/x86/crypto/aegis128-aesni-glue.c +++ b/arch/x86/crypto/aegis128-aesni-glue.c @@ -119,31 +119,20 @@ static void crypto_aegis128_aesni_proces }
static void crypto_aegis128_aesni_process_crypt( - struct aegis_state *state, struct aead_request *req, + struct aegis_state *state, struct skcipher_walk *walk, const struct aegis_crypt_ops *ops) { - struct skcipher_walk walk; - u8 *src, *dst; - unsigned int chunksize, base; - - ops->skcipher_walk_init(&walk, req, false); - - while (walk.nbytes) { - src = walk.src.virt.addr; - dst = walk.dst.virt.addr; - chunksize = walk.nbytes; - - ops->crypt_blocks(state, chunksize, src, dst); - - base = chunksize & ~(AEGIS128_BLOCK_SIZE - 1); - src += base; - dst += base; - chunksize &= AEGIS128_BLOCK_SIZE - 1; - - if (chunksize > 0) - ops->crypt_tail(state, chunksize, src, dst); + while (walk->nbytes >= AEGIS128_BLOCK_SIZE) { + ops->crypt_blocks(state, + round_down(walk->nbytes, AEGIS128_BLOCK_SIZE), + walk->src.virt.addr, walk->dst.virt.addr); + skcipher_walk_done(walk, walk->nbytes % AEGIS128_BLOCK_SIZE); + }
- skcipher_walk_done(&walk, 0); + if (walk->nbytes) { + ops->crypt_tail(state, walk->nbytes, walk->src.virt.addr, + walk->dst.virt.addr); + skcipher_walk_done(walk, 0); } }
@@ -186,13 +175,16 @@ static void crypto_aegis128_aesni_crypt( { struct crypto_aead *tfm = crypto_aead_reqtfm(req); struct aegis_ctx *ctx = crypto_aegis128_aesni_ctx(tfm); + struct skcipher_walk walk; struct aegis_state state;
+ ops->skcipher_walk_init(&walk, req, true); + kernel_fpu_begin();
crypto_aegis128_aesni_init(&state, ctx->key.bytes, req->iv); crypto_aegis128_aesni_process_ad(&state, req->src, req->assoclen); - crypto_aegis128_aesni_process_crypt(&state, req, ops); + crypto_aegis128_aesni_process_crypt(&state, &walk, ops); crypto_aegis128_aesni_final(&state, tag_xor, req->assoclen, cryptlen);
kernel_fpu_end(); --- a/arch/x86/crypto/aegis128l-aesni-glue.c +++ b/arch/x86/crypto/aegis128l-aesni-glue.c @@ -119,31 +119,20 @@ static void crypto_aegis128l_aesni_proce }
static void crypto_aegis128l_aesni_process_crypt( - struct aegis_state *state, struct aead_request *req, + struct aegis_state *state, struct skcipher_walk *walk, const struct aegis_crypt_ops *ops) { - struct skcipher_walk walk; - u8 *src, *dst; - unsigned int chunksize, base; - - ops->skcipher_walk_init(&walk, req, false); - - while (walk.nbytes) { - src = walk.src.virt.addr; - dst = walk.dst.virt.addr; - chunksize = walk.nbytes; - - ops->crypt_blocks(state, chunksize, src, dst); - - base = chunksize & ~(AEGIS128L_BLOCK_SIZE - 1); - src += base; - dst += base; - chunksize &= AEGIS128L_BLOCK_SIZE - 1; - - if (chunksize > 0) - ops->crypt_tail(state, chunksize, src, dst); + while (walk->nbytes >= AEGIS128L_BLOCK_SIZE) { + ops->crypt_blocks(state, round_down(walk->nbytes, + AEGIS128L_BLOCK_SIZE), + walk->src.virt.addr, walk->dst.virt.addr); + skcipher_walk_done(walk, walk->nbytes % AEGIS128L_BLOCK_SIZE); + }
- skcipher_walk_done(&walk, 0); + if (walk->nbytes) { + ops->crypt_tail(state, walk->nbytes, walk->src.virt.addr, + walk->dst.virt.addr); + skcipher_walk_done(walk, 0); } }
@@ -186,13 +175,16 @@ static void crypto_aegis128l_aesni_crypt { struct crypto_aead *tfm = crypto_aead_reqtfm(req); struct aegis_ctx *ctx = crypto_aegis128l_aesni_ctx(tfm); + struct skcipher_walk walk; struct aegis_state state;
+ ops->skcipher_walk_init(&walk, req, true); + kernel_fpu_begin();
crypto_aegis128l_aesni_init(&state, ctx->key.bytes, req->iv); crypto_aegis128l_aesni_process_ad(&state, req->src, req->assoclen); - crypto_aegis128l_aesni_process_crypt(&state, req, ops); + crypto_aegis128l_aesni_process_crypt(&state, &walk, ops); crypto_aegis128l_aesni_final(&state, tag_xor, req->assoclen, cryptlen);
kernel_fpu_end(); --- a/arch/x86/crypto/aegis256-aesni-glue.c +++ b/arch/x86/crypto/aegis256-aesni-glue.c @@ -119,31 +119,20 @@ static void crypto_aegis256_aesni_proces }
static void crypto_aegis256_aesni_process_crypt( - struct aegis_state *state, struct aead_request *req, + struct aegis_state *state, struct skcipher_walk *walk, const struct aegis_crypt_ops *ops) { - struct skcipher_walk walk; - u8 *src, *dst; - unsigned int chunksize, base; - - ops->skcipher_walk_init(&walk, req, false); - - while (walk.nbytes) { - src = walk.src.virt.addr; - dst = walk.dst.virt.addr; - chunksize = walk.nbytes; - - ops->crypt_blocks(state, chunksize, src, dst); - - base = chunksize & ~(AEGIS256_BLOCK_SIZE - 1); - src += base; - dst += base; - chunksize &= AEGIS256_BLOCK_SIZE - 1; - - if (chunksize > 0) - ops->crypt_tail(state, chunksize, src, dst); + while (walk->nbytes >= AEGIS256_BLOCK_SIZE) { + ops->crypt_blocks(state, + round_down(walk->nbytes, AEGIS256_BLOCK_SIZE), + walk->src.virt.addr, walk->dst.virt.addr); + skcipher_walk_done(walk, walk->nbytes % AEGIS256_BLOCK_SIZE); + }
- skcipher_walk_done(&walk, 0); + if (walk->nbytes) { + ops->crypt_tail(state, walk->nbytes, walk->src.virt.addr, + walk->dst.virt.addr); + skcipher_walk_done(walk, 0); } }
@@ -186,13 +175,16 @@ static void crypto_aegis256_aesni_crypt( { struct crypto_aead *tfm = crypto_aead_reqtfm(req); struct aegis_ctx *ctx = crypto_aegis256_aesni_ctx(tfm); + struct skcipher_walk walk; struct aegis_state state;
+ ops->skcipher_walk_init(&walk, req, true); + kernel_fpu_begin();
crypto_aegis256_aesni_init(&state, ctx->key, req->iv); crypto_aegis256_aesni_process_ad(&state, req->src, req->assoclen); - crypto_aegis256_aesni_process_crypt(&state, req, ops); + crypto_aegis256_aesni_process_crypt(&state, &walk, ops); crypto_aegis256_aesni_final(&state, tag_xor, req->assoclen, cryptlen);
kernel_fpu_end();
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit 3af349639597fea582a93604734d717e59a0e223 upstream.
gcmaes_crypt_by_sg() dereferences the NULL pointer returned by scatterwalk_ffwd() when encrypting an empty plaintext and the source scatterlist ends immediately after the associated data.
Fix it by only fast-forwarding to the src/dst data scatterlists if the data length is nonzero.
This bug is reproduced by the "rfc4543(gcm(aes))" test vectors when run with the new AEAD test manager.
Fixes: e845520707f8 ("crypto: aesni - Update aesni-intel_glue to use scatter/gather") Cc: stable@vger.kernel.org # v4.17+ Cc: Dave Watson davejwatson@fb.com Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/crypto/aesni-intel_glue.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
--- a/arch/x86/crypto/aesni-intel_glue.c +++ b/arch/x86/crypto/aesni-intel_glue.c @@ -821,11 +821,14 @@ static int gcmaes_crypt_by_sg(bool enc, scatterwalk_map_and_copy(assoc, req->src, 0, assoclen, 0); }
- src_sg = scatterwalk_ffwd(src_start, req->src, req->assoclen); - scatterwalk_start(&src_sg_walk, src_sg); - if (req->src != req->dst) { - dst_sg = scatterwalk_ffwd(dst_start, req->dst, req->assoclen); - scatterwalk_start(&dst_sg_walk, dst_sg); + if (left) { + src_sg = scatterwalk_ffwd(src_start, req->src, req->assoclen); + scatterwalk_start(&src_sg_walk, src_sg); + if (req->src != req->dst) { + dst_sg = scatterwalk_ffwd(dst_start, req->dst, + req->assoclen); + scatterwalk_start(&dst_sg_walk, dst_sg); + } }
kernel_fpu_begin();
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit 2060e284e9595fc3baed6e035903c05b93266555 upstream.
The x86 MORUS implementations all fail the improved AEAD tests because they produce the wrong result with some data layouts. The issue is that they assume that if the skcipher_walk API gives 'nbytes' not aligned to the walksize (a.k.a. walk.stride), then it is the end of the data. In fact, this can happen before the end.
Also, when the CRYPTO_TFM_REQ_MAY_SLEEP flag is given, they can incorrectly sleep in the skcipher_walk_*() functions while preemption has been disabled by kernel_fpu_begin().
Fix these bugs.
Fixes: 56e8e57fc3a7 ("crypto: morus - Add common SIMD glue code for MORUS") Cc: stable@vger.kernel.org # v4.18+ Cc: Ondrej Mosnacek omosnace@redhat.com Signed-off-by: Eric Biggers ebiggers@google.com Reviewed-by: Ondrej Mosnacek omosnace@redhat.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/crypto/morus1280_glue.c | 40 +++++++++++++++------------------------ arch/x86/crypto/morus640_glue.c | 39 ++++++++++++++------------------------ 2 files changed, 31 insertions(+), 48 deletions(-)
--- a/arch/x86/crypto/morus1280_glue.c +++ b/arch/x86/crypto/morus1280_glue.c @@ -85,31 +85,20 @@ static void crypto_morus1280_glue_proces
static void crypto_morus1280_glue_process_crypt(struct morus1280_state *state, struct morus1280_ops ops, - struct aead_request *req) + struct skcipher_walk *walk) { - struct skcipher_walk walk; - u8 *cursor_src, *cursor_dst; - unsigned int chunksize, base; - - ops.skcipher_walk_init(&walk, req, false); - - while (walk.nbytes) { - cursor_src = walk.src.virt.addr; - cursor_dst = walk.dst.virt.addr; - chunksize = walk.nbytes; - - ops.crypt_blocks(state, cursor_src, cursor_dst, chunksize); - - base = chunksize & ~(MORUS1280_BLOCK_SIZE - 1); - cursor_src += base; - cursor_dst += base; - chunksize &= MORUS1280_BLOCK_SIZE - 1; - - if (chunksize > 0) - ops.crypt_tail(state, cursor_src, cursor_dst, - chunksize); + while (walk->nbytes >= MORUS1280_BLOCK_SIZE) { + ops.crypt_blocks(state, walk->src.virt.addr, + walk->dst.virt.addr, + round_down(walk->nbytes, + MORUS1280_BLOCK_SIZE)); + skcipher_walk_done(walk, walk->nbytes % MORUS1280_BLOCK_SIZE); + }
- skcipher_walk_done(&walk, 0); + if (walk->nbytes) { + ops.crypt_tail(state, walk->src.virt.addr, walk->dst.virt.addr, + walk->nbytes); + skcipher_walk_done(walk, 0); } }
@@ -147,12 +136,15 @@ static void crypto_morus1280_glue_crypt( struct crypto_aead *tfm = crypto_aead_reqtfm(req); struct morus1280_ctx *ctx = crypto_aead_ctx(tfm); struct morus1280_state state; + struct skcipher_walk walk; + + ops.skcipher_walk_init(&walk, req, true);
kernel_fpu_begin();
ctx->ops->init(&state, &ctx->key, req->iv); crypto_morus1280_glue_process_ad(&state, ctx->ops, req->src, req->assoclen); - crypto_morus1280_glue_process_crypt(&state, ops, req); + crypto_morus1280_glue_process_crypt(&state, ops, &walk); ctx->ops->final(&state, tag_xor, req->assoclen, cryptlen);
kernel_fpu_end(); --- a/arch/x86/crypto/morus640_glue.c +++ b/arch/x86/crypto/morus640_glue.c @@ -85,31 +85,19 @@ static void crypto_morus640_glue_process
static void crypto_morus640_glue_process_crypt(struct morus640_state *state, struct morus640_ops ops, - struct aead_request *req) + struct skcipher_walk *walk) { - struct skcipher_walk walk; - u8 *cursor_src, *cursor_dst; - unsigned int chunksize, base; - - ops.skcipher_walk_init(&walk, req, false); - - while (walk.nbytes) { - cursor_src = walk.src.virt.addr; - cursor_dst = walk.dst.virt.addr; - chunksize = walk.nbytes; - - ops.crypt_blocks(state, cursor_src, cursor_dst, chunksize); - - base = chunksize & ~(MORUS640_BLOCK_SIZE - 1); - cursor_src += base; - cursor_dst += base; - chunksize &= MORUS640_BLOCK_SIZE - 1; - - if (chunksize > 0) - ops.crypt_tail(state, cursor_src, cursor_dst, - chunksize); + while (walk->nbytes >= MORUS640_BLOCK_SIZE) { + ops.crypt_blocks(state, walk->src.virt.addr, + walk->dst.virt.addr, + round_down(walk->nbytes, MORUS640_BLOCK_SIZE)); + skcipher_walk_done(walk, walk->nbytes % MORUS640_BLOCK_SIZE); + }
- skcipher_walk_done(&walk, 0); + if (walk->nbytes) { + ops.crypt_tail(state, walk->src.virt.addr, walk->dst.virt.addr, + walk->nbytes); + skcipher_walk_done(walk, 0); } }
@@ -143,12 +131,15 @@ static void crypto_morus640_glue_crypt(s struct crypto_aead *tfm = crypto_aead_reqtfm(req); struct morus640_ctx *ctx = crypto_aead_ctx(tfm); struct morus640_state state; + struct skcipher_walk walk; + + ops.skcipher_walk_init(&walk, req, true);
kernel_fpu_begin();
ctx->ops->init(&state, &ctx->key, req->iv); crypto_morus640_glue_process_ad(&state, ctx->ops, req->src, req->assoclen); - crypto_morus640_glue_process_crypt(&state, ops, req); + crypto_morus640_glue_process_crypt(&state, ops, &walk); ctx->ops->final(&state, tag_xor, req->assoclen, cryptlen);
kernel_fpu_end();
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ard Biesheuvel ard.biesheuvel@linaro.org
commit eaf46edf6ea89675bd36245369c8de5063a0272c upstream.
The NEON MAC calculation routine fails to handle the case correctly where there is some data in the buffer, and the input fills it up exactly. In this case, we enter the loop at the end with w8 == 0, while a negative value is assumed, and so the loop carries on until the increment of the 32-bit counter wraps around, which is quite obviously wrong.
So omit the loop altogether in this case, and exit right away.
Reported-by: Eric Biggers ebiggers@kernel.org Fixes: a3fd82105b9d1 ("arm64/crypto: AES in CCM mode using ARMv8 Crypto ...") Cc: stable@vger.kernel.org Signed-off-by: Ard Biesheuvel ard.biesheuvel@linaro.org Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/crypto/aes-ce-ccm-core.S | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/arch/arm64/crypto/aes-ce-ccm-core.S +++ b/arch/arm64/crypto/aes-ce-ccm-core.S @@ -74,12 +74,13 @@ ENTRY(ce_aes_ccm_auth_data) beq 10f ext v0.16b, v0.16b, v0.16b, #1 /* rotate out the mac bytes */ b 7b -8: mov w7, w8 +8: cbz w8, 91f + mov w7, w8 add w8, w8, #16 9: ext v1.16b, v1.16b, v1.16b, #1 adds w7, w7, #1 bne 9b - eor v0.16b, v0.16b, v1.16b +91: eor v0.16b, v0.16b, v1.16b st1 {v0.16b}, [x0] 10: str w8, [x3] ret
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ard Biesheuvel ard.biesheuvel@linaro.org
commit 969e2f59d589c15f6aaf306e590dde16f12ea4b3 upstream.
Commit 5092fcf34908 ("crypto: arm64/aes-ce-ccm: add non-SIMD generic fallback") introduced C fallback code to replace the NEON routines when invoked from a context where the NEON is not available (i.e., from the context of a softirq taken while the NEON is already being used in kernel process context)
Fix two logical flaws in the MAC calculation of the associated data.
Reported-by: Eric Biggers ebiggers@kernel.org Fixes: 5092fcf34908 ("crypto: arm64/aes-ce-ccm: add non-SIMD generic fallback") Cc: stable@vger.kernel.org Signed-off-by: Ard Biesheuvel ard.biesheuvel@linaro.org Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/crypto/aes-ce-ccm-glue.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/arch/arm64/crypto/aes-ce-ccm-glue.c +++ b/arch/arm64/crypto/aes-ce-ccm-glue.c @@ -125,7 +125,7 @@ static void ccm_update_mac(struct crypto abytes -= added; }
- while (abytes > AES_BLOCK_SIZE) { + while (abytes >= AES_BLOCK_SIZE) { __aes_arm64_encrypt(key->key_enc, mac, mac, num_rounds(key)); crypto_xor(mac, in, AES_BLOCK_SIZE); @@ -139,8 +139,6 @@ static void ccm_update_mac(struct crypto num_rounds(key)); crypto_xor(mac, in, abytes); *macp = abytes; - } else { - *macp = 0; } } }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pavel Shilovsky pshilov@microsoft.com
commit 165df9a080b6863ae286fa01780c13d87cd81076 upstream.
If we don't find a writable file handle when retrying writepages we break of the loop and do not unlock and put pages neither from wdata2 nor from the original wdata. Fix this by walking through all the remaining pages and cleanup them properly.
Cc: stable@vger.kernel.org Signed-off-by: Pavel Shilovsky pshilov@microsoft.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/cifssmb.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-)
--- a/fs/cifs/cifssmb.c +++ b/fs/cifs/cifssmb.c @@ -2125,12 +2125,13 @@ cifs_writev_requeue(struct cifs_writedat
wdata2->cfile = find_writable_file(CIFS_I(inode), false); if (!wdata2->cfile) { - cifs_dbg(VFS, "No writable handles for inode\n"); + cifs_dbg(VFS, "No writable handle to retry writepages\n"); rc = -EBADF; - break; + } else { + wdata2->pid = wdata2->cfile->pid; + rc = server->ops->async_writev(wdata2, + cifs_writedata_release); } - wdata2->pid = wdata2->cfile->pid; - rc = server->ops->async_writev(wdata2, cifs_writedata_release);
for (j = 0; j < nr_pages; j++) { unlock_page(wdata2->pages[j]); @@ -2145,6 +2146,7 @@ cifs_writev_requeue(struct cifs_writedat kref_put(&wdata2->refcount, cifs_writedata_release); if (is_retryable_error(rc)) continue; + i += nr_pages; break; }
@@ -2152,6 +2154,13 @@ cifs_writev_requeue(struct cifs_writedat i += nr_pages; } while (i < wdata->nr_pages);
+ /* cleanup remaining pages from the original wdata */ + for (; i < wdata->nr_pages; i++) { + SetPageError(wdata->pages[i]); + end_page_writeback(wdata->pages[i]); + put_page(wdata->pages[i]); + } + if (rc != 0 && !is_retryable_error(rc)) mapping_set_error(inode->i_mapping, rc); kref_put(&wdata->refcount, cifs_writedata_release);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pavel Shilovsky piastryyy@gmail.com
commit 7b9b9edb49ad377b1e06abf14354c227e9ac4b06 upstream.
Currently on lease break the client sets a caching level twice: when oplock is detected and when oplock is processed. While the 1st attempt sets the level to the value provided by the server, the 2nd one resets the level to None unconditionally. This happens because the oplock/lease processing code was changed to avoid races between page cache flushes and oplock breaks. The commit c11f1df5003d534 ("cifs: Wait for writebacks to complete before attempting write.") fixed the races for oplocks but didn't apply the same changes for leases resulting in overwriting the server granted value to None. Fix this by properly processing lease breaks.
Signed-off-by: Pavel Shilovsky pshilov@microsoft.com Signed-off-by: Steve French stfrench@microsoft.com CC: Stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/smb2misc.c | 17 ++++++++++++++--- fs/cifs/smb2ops.c | 15 ++++++++++++--- 2 files changed, 26 insertions(+), 6 deletions(-)
--- a/fs/cifs/smb2misc.c +++ b/fs/cifs/smb2misc.c @@ -517,7 +517,6 @@ smb2_tcon_has_lease(struct cifs_tcon *tc __u8 lease_state; struct list_head *tmp; struct cifsFileInfo *cfile; - struct TCP_Server_Info *server = tcon->ses->server; struct cifs_pending_open *open; struct cifsInodeInfo *cinode; int ack_req = le32_to_cpu(rsp->Flags & @@ -537,13 +536,25 @@ smb2_tcon_has_lease(struct cifs_tcon *tc cifs_dbg(FYI, "lease key match, lease break 0x%x\n", le32_to_cpu(rsp->NewLeaseState));
- server->ops->set_oplock_level(cinode, lease_state, 0, NULL); - if (ack_req) cfile->oplock_break_cancelled = false; else cfile->oplock_break_cancelled = true;
+ set_bit(CIFS_INODE_PENDING_OPLOCK_BREAK, &cinode->flags); + + /* + * Set or clear flags depending on the lease state being READ. + * HANDLE caching flag should be added when the client starts + * to defer closing remote file handles with HANDLE leases. + */ + if (lease_state & SMB2_LEASE_READ_CACHING_HE) + set_bit(CIFS_INODE_DOWNGRADE_OPLOCK_TO_L2, + &cinode->flags); + else + clear_bit(CIFS_INODE_DOWNGRADE_OPLOCK_TO_L2, + &cinode->flags); + queue_work(cifsoplockd_wq, &cfile->oplock_break); kfree(lw); return true; --- a/fs/cifs/smb2ops.c +++ b/fs/cifs/smb2ops.c @@ -2595,6 +2595,15 @@ smb2_downgrade_oplock(struct TCP_Server_ }
static void +smb21_downgrade_oplock(struct TCP_Server_Info *server, + struct cifsInodeInfo *cinode, bool set_level2) +{ + server->ops->set_oplock_level(cinode, + set_level2 ? SMB2_LEASE_READ_CACHING_HE : + 0, 0, NULL); +} + +static void smb2_set_oplock_level(struct cifsInodeInfo *cinode, __u32 oplock, unsigned int epoch, bool *purge_cache) { @@ -3646,7 +3655,7 @@ struct smb_version_operations smb21_oper .print_stats = smb2_print_stats, .is_oplock_break = smb2_is_valid_oplock_break, .handle_cancelled_mid = smb2_handle_cancelled_mid, - .downgrade_oplock = smb2_downgrade_oplock, + .downgrade_oplock = smb21_downgrade_oplock, .need_neg = smb2_need_neg, .negotiate = smb2_negotiate, .negotiate_wsize = smb2_negotiate_wsize, @@ -3743,7 +3752,7 @@ struct smb_version_operations smb30_oper .dump_share_caps = smb2_dump_share_caps, .is_oplock_break = smb2_is_valid_oplock_break, .handle_cancelled_mid = smb2_handle_cancelled_mid, - .downgrade_oplock = smb2_downgrade_oplock, + .downgrade_oplock = smb21_downgrade_oplock, .need_neg = smb2_need_neg, .negotiate = smb2_negotiate, .negotiate_wsize = smb3_negotiate_wsize, @@ -3848,7 +3857,7 @@ struct smb_version_operations smb311_ope .dump_share_caps = smb2_dump_share_caps, .is_oplock_break = smb2_is_valid_oplock_break, .handle_cancelled_mid = smb2_handle_cancelled_mid, - .downgrade_oplock = smb2_downgrade_oplock, + .downgrade_oplock = smb21_downgrade_oplock, .need_neg = smb2_need_neg, .negotiate = smb2_negotiate, .negotiate_wsize = smb3_negotiate_wsize,
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pavel Shilovsky piastryyy@gmail.com
commit c781af7e0c1fed9f1d0e0ec31b86f5b21a8dca17 upstream.
When we hit failures during constructing MIDs or sending PDUs through the network, we end up not using message IDs assigned to the packet. The next SMB packet will skip those message IDs and continue with the next one. This behavior may lead to a server not granting us credits until we use the skipped IDs. Fix this by reverting the current ID to the original value if any errors occur before we push the packet through the network stack.
This patch fixes the generic/310 test from the xfs-tests.
Cc: stable@vger.kernel.org # 4.19.x Signed-off-by: Pavel Shilovsky pshilov@microsoft.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/cifsglob.h | 19 +++++++++++++++++++ fs/cifs/smb2ops.c | 13 +++++++++++++ fs/cifs/smb2transport.c | 14 ++++++++++++-- fs/cifs/transport.c | 6 +++++- 4 files changed, 49 insertions(+), 3 deletions(-)
--- a/fs/cifs/cifsglob.h +++ b/fs/cifs/cifsglob.h @@ -236,6 +236,8 @@ struct smb_version_operations { int * (*get_credits_field)(struct TCP_Server_Info *, const int); unsigned int (*get_credits)(struct mid_q_entry *); __u64 (*get_next_mid)(struct TCP_Server_Info *); + void (*revert_current_mid)(struct TCP_Server_Info *server, + const unsigned int val); /* data offset from read response message */ unsigned int (*read_data_offset)(char *); /* @@ -770,6 +772,22 @@ get_next_mid(struct TCP_Server_Info *ser return cpu_to_le16(mid); }
+static inline void +revert_current_mid(struct TCP_Server_Info *server, const unsigned int val) +{ + if (server->ops->revert_current_mid) + server->ops->revert_current_mid(server, val); +} + +static inline void +revert_current_mid_from_hdr(struct TCP_Server_Info *server, + const struct smb2_sync_hdr *shdr) +{ + unsigned int num = le16_to_cpu(shdr->CreditCharge); + + return revert_current_mid(server, num > 0 ? num : 1); +} + static inline __u16 get_mid(const struct smb_hdr *smb) { @@ -1422,6 +1440,7 @@ struct mid_q_entry { struct kref refcount; struct TCP_Server_Info *server; /* server corresponding to this mid */ __u64 mid; /* multiplex id */ + __u16 credits; /* number of credits consumed by this mid */ __u32 pid; /* process id */ __u32 sequence_number; /* for CIFS signing */ unsigned long when_alloc; /* when mid was created */ --- a/fs/cifs/smb2ops.c +++ b/fs/cifs/smb2ops.c @@ -219,6 +219,15 @@ smb2_get_next_mid(struct TCP_Server_Info return mid; }
+static void +smb2_revert_current_mid(struct TCP_Server_Info *server, const unsigned int val) +{ + spin_lock(&GlobalMid_Lock); + if (server->CurrentMid >= val) + server->CurrentMid -= val; + spin_unlock(&GlobalMid_Lock); +} + static struct mid_q_entry * smb2_find_mid(struct TCP_Server_Info *server, char *buf) { @@ -3550,6 +3559,7 @@ struct smb_version_operations smb20_oper .get_credits = smb2_get_credits, .wait_mtu_credits = cifs_wait_mtu_credits, .get_next_mid = smb2_get_next_mid, + .revert_current_mid = smb2_revert_current_mid, .read_data_offset = smb2_read_data_offset, .read_data_length = smb2_read_data_length, .map_error = map_smb2_to_linux_error, @@ -3645,6 +3655,7 @@ struct smb_version_operations smb21_oper .get_credits = smb2_get_credits, .wait_mtu_credits = smb2_wait_mtu_credits, .get_next_mid = smb2_get_next_mid, + .revert_current_mid = smb2_revert_current_mid, .read_data_offset = smb2_read_data_offset, .read_data_length = smb2_read_data_length, .map_error = map_smb2_to_linux_error, @@ -3741,6 +3752,7 @@ struct smb_version_operations smb30_oper .get_credits = smb2_get_credits, .wait_mtu_credits = smb2_wait_mtu_credits, .get_next_mid = smb2_get_next_mid, + .revert_current_mid = smb2_revert_current_mid, .read_data_offset = smb2_read_data_offset, .read_data_length = smb2_read_data_length, .map_error = map_smb2_to_linux_error, @@ -3846,6 +3858,7 @@ struct smb_version_operations smb311_ope .get_credits = smb2_get_credits, .wait_mtu_credits = smb2_wait_mtu_credits, .get_next_mid = smb2_get_next_mid, + .revert_current_mid = smb2_revert_current_mid, .read_data_offset = smb2_read_data_offset, .read_data_length = smb2_read_data_length, .map_error = map_smb2_to_linux_error, --- a/fs/cifs/smb2transport.c +++ b/fs/cifs/smb2transport.c @@ -576,6 +576,7 @@ smb2_mid_entry_alloc(const struct smb2_s struct TCP_Server_Info *server) { struct mid_q_entry *temp; + unsigned int credits = le16_to_cpu(shdr->CreditCharge);
if (server == NULL) { cifs_dbg(VFS, "Null TCP session in smb2_mid_entry_alloc\n"); @@ -586,6 +587,7 @@ smb2_mid_entry_alloc(const struct smb2_s memset(temp, 0, sizeof(struct mid_q_entry)); kref_init(&temp->refcount); temp->mid = le64_to_cpu(shdr->MessageId); + temp->credits = credits > 0 ? credits : 1; temp->pid = current->pid; temp->command = shdr->Command; /* Always LE */ temp->when_alloc = jiffies; @@ -674,13 +676,18 @@ smb2_setup_request(struct cifs_ses *ses, smb2_seq_num_into_buf(ses->server, shdr);
rc = smb2_get_mid_entry(ses, shdr, &mid); - if (rc) + if (rc) { + revert_current_mid_from_hdr(ses->server, shdr); return ERR_PTR(rc); + } + rc = smb2_sign_rqst(rqst, ses->server); if (rc) { + revert_current_mid_from_hdr(ses->server, shdr); cifs_delete_mid(mid); return ERR_PTR(rc); } + return mid; }
@@ -695,11 +702,14 @@ smb2_setup_async_request(struct TCP_Serv smb2_seq_num_into_buf(server, shdr);
mid = smb2_mid_entry_alloc(shdr, server); - if (mid == NULL) + if (mid == NULL) { + revert_current_mid_from_hdr(server, shdr); return ERR_PTR(-ENOMEM); + }
rc = smb2_sign_rqst(rqst, server); if (rc) { + revert_current_mid_from_hdr(server, shdr); DeleteMidQEntry(mid); return ERR_PTR(rc); } --- a/fs/cifs/transport.c +++ b/fs/cifs/transport.c @@ -647,6 +647,7 @@ cifs_call_async(struct TCP_Server_Info * cifs_in_send_dec(server);
if (rc < 0) { + revert_current_mid(server, mid->credits); server->sequence_number -= 2; cifs_delete_mid(mid); } @@ -868,6 +869,7 @@ compound_send_recv(const unsigned int xi for (i = 0; i < num_rqst; i++) { midQ[i] = ses->server->ops->setup_request(ses, &rqst[i]); if (IS_ERR(midQ[i])) { + revert_current_mid(ses->server, i); for (j = 0; j < i; j++) cifs_delete_mid(midQ[j]); mutex_unlock(&ses->server->srv_mutex); @@ -897,8 +899,10 @@ compound_send_recv(const unsigned int xi for (i = 0; i < num_rqst; i++) cifs_save_when_sent(midQ[i]);
- if (rc < 0) + if (rc < 0) { + revert_current_mid(ses->server, num_rqst); ses->server->sequence_number -= 2; + }
mutex_unlock(&ses->server->srv_mutex);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pavel Shilovsky piastryyy@gmail.com
commit 6dfbd84684700cb58b34e8602c01c12f3d2595c8 upstream.
When we have a READ lease for a file and have just issued a write operation to the server we need to purge the cache and set oplock/lease level to NONE to avoid reading stale data. Currently we do that only if a write operation succedeed thus not covering cases when a request was sent to the server but a negative error code was returned later for some other reasons (e.g. -EIOCBQUEUED or -EINTR). Fix this by turning off caching regardless of the error code being returned.
The patches fixes generic tests 075 and 112 from the xfs-tests.
Cc: stable@vger.kernel.org Signed-off-by: Pavel Shilovsky pshilov@microsoft.com Signed-off-by: Steve French stfrench@microsoft.com Reviewed-by: Ronnie Sahlberg lsahlber@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/file.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
--- a/fs/cifs/file.c +++ b/fs/cifs/file.c @@ -3028,14 +3028,16 @@ cifs_strict_writev(struct kiocb *iocb, s * these pages but not on the region from pos to ppos+len-1. */ written = cifs_user_writev(iocb, from); - if (written > 0 && CIFS_CACHE_READ(cinode)) { + if (CIFS_CACHE_READ(cinode)) { /* - * Windows 7 server can delay breaking level2 oplock if a write - * request comes - break it on the client to prevent reading - * an old data. + * We have read level caching and we have just sent a write + * request to the server thus making data in the cache stale. + * Zap the cache and set oplock/lease level to NONE to avoid + * reading stale data from the cache. All subsequent read + * operations will read new data from the server. */ cifs_zap_mapping(inode); - cifs_dbg(FYI, "Set no oplock for inode=%p after a write operation\n", + cifs_dbg(FYI, "Set Oplock/Lease to NONE for inode=%p after write\n", inode); cinode->oplock = 0; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steve French stfrench@microsoft.com
commit e8506d25f740fd058791cc12a6dfa9386ada6b96 upstream.
We negotiate rsize mounts (and it can be overridden by user) to typically 4MB, so using larger default I/O sizes from userspace (changing to 1MB default i/o size returned by stat) the performance is much better (and not just for long latency network connections) in most use cases for SMB3 than the default I/O size (which ends up being 128K for cp and can be even smaller for cp). This can be 4x slower or worse depending on network latency.
By changing inode->blocksize from 32K (which was perhaps ok for very old SMB1/CIFS) to a larger value, 1MB (but still less than max size negotiated with the server which is 4MB, in order to minimize risk) it significantly increases performance for the noncached case, and slightly increases it for the cached case. This can be changed by the user on mount (specifying bsize= values from 16K to 16MB) to tune better for performance for applications that depend on blocksize.
Signed-off-by: Steve French stfrench@microsoft.com Reviewed-by: Ronnie Sahlberg lsahlber@redhat.com CC: Stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/cifs_fs_sb.h | 1 + fs/cifs/cifsfs.c | 1 + fs/cifs/cifsglob.h | 1 + fs/cifs/connect.c | 26 ++++++++++++++++++++++++-- fs/cifs/inode.c | 2 +- 5 files changed, 28 insertions(+), 3 deletions(-)
--- a/fs/cifs/cifs_fs_sb.h +++ b/fs/cifs/cifs_fs_sb.h @@ -58,6 +58,7 @@ struct cifs_sb_info { spinlock_t tlink_tree_lock; struct tcon_link *master_tlink; struct nls_table *local_nls; + unsigned int bsize; unsigned int rsize; unsigned int wsize; unsigned long actimeo; /* attribute cache timeout (jiffies) */ --- a/fs/cifs/cifsfs.c +++ b/fs/cifs/cifsfs.c @@ -554,6 +554,7 @@ cifs_show_options(struct seq_file *s, st
seq_printf(s, ",rsize=%u", cifs_sb->rsize); seq_printf(s, ",wsize=%u", cifs_sb->wsize); + seq_printf(s, ",bsize=%u", cifs_sb->bsize); seq_printf(s, ",echo_interval=%lu", tcon->ses->server->echo_interval / HZ); if (tcon->snapshot_time) --- a/fs/cifs/cifsglob.h +++ b/fs/cifs/cifsglob.h @@ -559,6 +559,7 @@ struct smb_vol { bool resilient:1; /* noresilient not required since not fored for CA */ bool domainauto:1; bool rdma:1; + unsigned int bsize; unsigned int rsize; unsigned int wsize; bool sockopt_tcp_nodelay:1; --- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -102,7 +102,7 @@ enum { Opt_backupuid, Opt_backupgid, Opt_uid, Opt_cruid, Opt_gid, Opt_file_mode, Opt_dirmode, Opt_port, - Opt_rsize, Opt_wsize, Opt_actimeo, + Opt_blocksize, Opt_rsize, Opt_wsize, Opt_actimeo, Opt_echo_interval, Opt_max_credits, Opt_snapshot,
@@ -204,6 +204,7 @@ static const match_table_t cifs_mount_op { Opt_dirmode, "dirmode=%s" }, { Opt_dirmode, "dir_mode=%s" }, { Opt_port, "port=%s" }, + { Opt_blocksize, "bsize=%s" }, { Opt_rsize, "rsize=%s" }, { Opt_wsize, "wsize=%s" }, { Opt_actimeo, "actimeo=%s" }, @@ -1571,7 +1572,7 @@ cifs_parse_mount_options(const char *mou vol->cred_uid = current_uid(); vol->linux_uid = current_uid(); vol->linux_gid = current_gid(); - + vol->bsize = 1024 * 1024; /* can improve cp performance significantly */ /* * default to SFM style remapping of seven reserved characters * unless user overrides it or we negotiate CIFS POSIX where @@ -1944,6 +1945,26 @@ cifs_parse_mount_options(const char *mou } port = (unsigned short)option; break; + case Opt_blocksize: + if (get_option_ul(args, &option)) { + cifs_dbg(VFS, "%s: Invalid blocksize value\n", + __func__); + goto cifs_parse_mount_err; + } + /* + * inode blocksize realistically should never need to be + * less than 16K or greater than 16M and default is 1MB. + * Note that small inode block sizes (e.g. 64K) can lead + * to very poor performance of common tools like cp and scp + */ + if ((option < CIFS_MAX_MSGSIZE) || + (option > (4 * SMB3_DEFAULT_IOSIZE))) { + cifs_dbg(VFS, "%s: Invalid blocksize\n", + __func__); + goto cifs_parse_mount_err; + } + vol->bsize = option; + break; case Opt_rsize: if (get_option_ul(args, &option)) { cifs_dbg(VFS, "%s: Invalid rsize value\n", @@ -3839,6 +3860,7 @@ int cifs_setup_cifs_sb(struct smb_vol *p spin_lock_init(&cifs_sb->tlink_tree_lock); cifs_sb->tlink_tree = RB_ROOT;
+ cifs_sb->bsize = pvolume_info->bsize; /* * Temporarily set r/wsize for matching superblock. If we end up using * new sb then client will later negotiate it downward if needed. --- a/fs/cifs/inode.c +++ b/fs/cifs/inode.c @@ -2080,7 +2080,7 @@ int cifs_getattr(const struct path *path return rc;
generic_fillattr(inode, stat); - stat->blksize = CIFS_MAX_MSGSIZE; + stat->blksize = cifs_sb->bsize; stat->ino = CIFS_I(inode)->uniqueid;
/* old CIFS Unix Extensions doesn't return create time */
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tom Zanussi tom.zanussi@linux.intel.com
commit 9f0bbf3115ca9f91f43b7c74e9ac7d79f47fc6c2 upstream.
Because there may be random garbage beyond a string's null terminator, it's not correct to copy the the complete character array for use as a hist trigger key. This results in multiple histogram entries for the 'same' string key.
So, in the case of a string key, use strncpy instead of memcpy to avoid copying in the extra bytes.
Before, using the gdbus entries in the following hist trigger as an example:
# echo 'hist:key=comm' > /sys/kernel/debug/tracing/events/sched/sched_waking/trigger # cat /sys/kernel/debug/tracing/events/sched/sched_waking/hist
...
{ comm: ImgDecoder #4 } hitcount: 203 { comm: gmain } hitcount: 213 { comm: gmain } hitcount: 216 { comm: StreamTrans #73 } hitcount: 221 { comm: mozStorage #3 } hitcount: 230 { comm: gdbus } hitcount: 233 { comm: StyleThread#5 } hitcount: 253 { comm: gdbus } hitcount: 256 { comm: gdbus } hitcount: 260 { comm: StyleThread#4 } hitcount: 271
...
# cat /sys/kernel/debug/tracing/events/sched/sched_waking/hist | egrep gdbus | wc -l 51
After:
# cat /sys/kernel/debug/tracing/events/sched/sched_waking/hist | egrep gdbus | wc -l 1
Link: http://lkml.kernel.org/r/50c35ae1267d64eee975b8125e151e600071d4dc.1549309756...
Cc: Namhyung Kim namhyung@kernel.org Cc: stable@vger.kernel.org Fixes: 79e577cbce4c4 ("tracing: Support string type key properly") Signed-off-by: Tom Zanussi tom.zanussi@linux.intel.com Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/trace/trace_events_hist.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/kernel/trace/trace_events_hist.c +++ b/kernel/trace/trace_events_hist.c @@ -4695,9 +4695,10 @@ static inline void add_to_key(char *comp /* ensure NULL-termination */ if (size > key_field->size - 1) size = key_field->size - 1; - }
- memcpy(compound_key + key_field->offset, key, size); + strncpy(compound_key + key_field->offset, (char *)key, size); + } else + memcpy(compound_key + key_field->offset, key, size); }
static void
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: zhangyi (F) yi.zhang@huawei.com
commit e7f0c424d0806b05d6f47be9f202b037eb701707 upstream.
Commit d716ff71dd12 ("tracing: Remove taking of trace_types_lock in pipe files") use the current tracer instead of the copy in tracing_open_pipe(), but it forget to remove the freeing sentence in the error path.
There's an error path that can call kfree(iter->trace) after the iter->trace was assigned to tr->current_trace, which would be bad to free.
Link: http://lkml.kernel.org/r/1550060946-45984-1-git-send-email-yi.zhang@huawei.c...
Cc: stable@vger.kernel.org Fixes: d716ff71dd12 ("tracing: Remove taking of trace_types_lock in pipe files") Signed-off-by: zhangyi (F) yi.zhang@huawei.com Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/trace/trace.c | 1 - 1 file changed, 1 deletion(-)
--- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -5626,7 +5626,6 @@ out: return ret;
fail: - kfree(iter->trace); kfree(iter); __trace_array_put(tr); mutex_unlock(&trace_types_lock);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jann Horn jannh@google.com
commit 83540fbc8812a580b6ad8f93f4c29e62e417687e upstream.
The first version of this method was missing the check for `ret == PATH_MAX`; then such a check was added, but it didn't call kfree() on error, so there was still a small memory leak in the error case. Fix it by using strndup_user() instead of open-coding it.
Link: http://lkml.kernel.org/r/20190220165443.152385-1-jannh@google.com
Cc: Ingo Molnar mingo@kernel.org Cc: stable@vger.kernel.org Fixes: 0eadcc7a7bc0 ("perf/core: Fix perf_uprobe_init()") Reviewed-by: Masami Hiramatsu mhiramat@kernel.org Acked-by: Song Liu songliubraving@fb.com Signed-off-by: Jann Horn jannh@google.com Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/trace/trace_event_perf.c | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-)
--- a/kernel/trace/trace_event_perf.c +++ b/kernel/trace/trace_event_perf.c @@ -299,15 +299,13 @@ int perf_uprobe_init(struct perf_event *
if (!p_event->attr.uprobe_path) return -EINVAL; - path = kzalloc(PATH_MAX, GFP_KERNEL); - if (!path) - return -ENOMEM; - ret = strncpy_from_user( - path, u64_to_user_ptr(p_event->attr.uprobe_path), PATH_MAX); - if (ret == PATH_MAX) - return -E2BIG; - if (ret < 0) - goto out; + + path = strndup_user(u64_to_user_ptr(p_event->attr.uprobe_path), + PATH_MAX); + if (IS_ERR(path)) { + ret = PTR_ERR(path); + return (ret == -EINVAL) ? -E2BIG : ret; + } if (path[0] == '\0') { ret = -EINVAL; goto out;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@oracle.com
commit d04071a5d6413b65f17f7bd6e2bdb22e22e4ace7 upstream.
We added some locking to this function but forgot to drop the lock on these two error paths. This bug would lead to an immediate deadlock.
Fixes: c7b3690fb152 ("vmw_balloon: stats rework") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Cc: stable@vger.kernel.org Reviewed-by: Nadav Amit namit@vmware.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/misc/vmw_balloon.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/misc/vmw_balloon.c +++ b/drivers/misc/vmw_balloon.c @@ -1287,7 +1287,7 @@ static void vmballoon_reset(struct vmbal vmballoon_pop(b);
if (vmballoon_send_start(b, VMW_BALLOON_CAPABILITIES)) - return; + goto unlock;
if ((b->capabilities & VMW_BALLOON_BATCHED_CMDS) != 0) { if (vmballoon_init_batching(b)) { @@ -1298,7 +1298,7 @@ static void vmballoon_reset(struct vmbal * The guest will retry in one second. */ vmballoon_send_start(b, 0); - return; + goto unlock; } } else if ((b->capabilities & VMW_BALLOON_BASIC_CMDS) != 0) { vmballoon_deinit_batching(b); @@ -1314,6 +1314,7 @@ static void vmballoon_reset(struct vmbal if (vmballoon_send_guest_id(b)) pr_err("failed to send guest ID to the host\n");
+unlock: up_write(&b->conf_sem); }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Juergen Gross jgross@suse.com
commit 01bd2ac2f55a1916d81dace12fa8d7ae1c79b5ea upstream.
Commit f7c90c2aa40048 ("x86/xen: don't write ptes directly in 32-bit PV guests") introduced a regression for booting dom0 on huge systems with lots of RAM (in the TB range).
Reason is that on those hosts the p2m list needs to be moved early in the boot process and this requires temporary page tables to be created. Said commit modified xen_set_pte_init() to use a hypercall for writing a PTE, but this requires the page table being in the direct mapped area, which is not the case for the temporary page tables used in xen_relocate_p2m().
As the page tables are completely written before being linked to the actual address space instead of set_pte() a plain write to memory can be used in xen_relocate_p2m().
Fixes: f7c90c2aa40048 ("x86/xen: don't write ptes directly in 32-bit PV guests") Cc: stable@vger.kernel.org Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Jan Beulich jbeulich@suse.com Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/xen/mmu_pv.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-)
--- a/arch/x86/xen/mmu_pv.c +++ b/arch/x86/xen/mmu_pv.c @@ -2114,10 +2114,10 @@ void __init xen_relocate_p2m(void) pt = early_memremap(pt_phys, PAGE_SIZE); clear_page(pt); for (idx_pte = 0; - idx_pte < min(n_pte, PTRS_PER_PTE); - idx_pte++) { - set_pte(pt + idx_pte, - pfn_pte(p2m_pfn, PAGE_KERNEL)); + idx_pte < min(n_pte, PTRS_PER_PTE); + idx_pte++) { + pt[idx_pte] = pfn_pte(p2m_pfn, + PAGE_KERNEL); p2m_pfn++; } n_pte -= PTRS_PER_PTE; @@ -2125,8 +2125,7 @@ void __init xen_relocate_p2m(void) make_lowmem_page_readonly(__va(pt_phys)); pin_pagetable_pfn(MMUEXT_PIN_L1_TABLE, PFN_DOWN(pt_phys)); - set_pmd(pmd + idx_pt, - __pmd(_PAGE_TABLE | pt_phys)); + pmd[idx_pt] = __pmd(_PAGE_TABLE | pt_phys); pt_phys += PAGE_SIZE; } n_pt -= PTRS_PER_PMD; @@ -2134,7 +2133,7 @@ void __init xen_relocate_p2m(void) make_lowmem_page_readonly(__va(pmd_phys)); pin_pagetable_pfn(MMUEXT_PIN_L2_TABLE, PFN_DOWN(pmd_phys)); - set_pud(pud + idx_pmd, __pud(_PAGE_TABLE | pmd_phys)); + pud[idx_pmd] = __pud(_PAGE_TABLE | pmd_phys); pmd_phys += PAGE_SIZE; } n_pmd -= PTRS_PER_PUD;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
commit f16eb8a4b096514ac06fb25bf599dcc792899b3d upstream.
If SSDT overlay is loaded via ConfigFS and then unloaded the device, we would like to have OF modalias for, already gone. Thus, acpi_get_name() returns no allocated buffer for such case and kernel crashes afterwards:
ACPI: Host-directed Dynamic ACPI Table Unload ads7950 spi-PRP0001:00: Dropping the link to regulator.0 BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 #PF error: [normal kernel read fault] PGD 80000000070d6067 P4D 80000000070d6067 PUD 70d0067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 0 PID: 40 Comm: kworker/u4:2 Not tainted 5.0.0+ #96 Hardware name: Intel Corporation Merrifield/BODEGA BAY, BIOS 542 2015.01.21:18.19.48 Workqueue: kacpi_hotplug acpi_device_del_work_fn RIP: 0010:create_of_modalias.isra.1+0x4c/0x150 Code: 00 00 48 89 44 24 18 31 c0 48 8d 54 24 08 48 c7 44 24 10 00 00 00 00 48 c7 44 24 08 ff ff ff ff e8 7a b0 03 00 48 8b 4c 24 10 <0f> b6 01 84 c0 74 27 48 c7 c7 00 09 f4 a5 0f b6 f0 8d 50 20 f6 04 RSP: 0000:ffffa51040297c10 EFLAGS: 00010246 RAX: 0000000000001001 RBX: 0000000000000785 RCX: 0000000000000000 RDX: 0000000000001001 RSI: 0000000000000286 RDI: ffffa2163dc042e0 RBP: ffffa216062b1196 R08: 0000000000001001 R09: ffffa21639873000 R10: ffffffffa606761d R11: 0000000000000001 R12: ffffa21639873218 R13: ffffa2163deb5060 R14: ffffa216063d1010 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffffa2163e000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000007114000 CR4: 00000000001006f0 Call Trace: __acpi_device_uevent_modalias+0xb0/0x100 spi_uevent+0xd/0x40
...
In order to fix above let create_of_modalias() check the status returned by acpi_get_name() and bail out in case of failure.
Fixes: 8765c5ba1949 ("ACPI / scan: Rework modalias creation when "compatible" is present") Link: https://bugzilla.kernel.org/show_bug.cgi?id=201381 Reported-by: Ferry Toth fntoth@gmail.com Tested-by: Ferry Tothfntoth@gmail.com Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Reviewed-by: Mika Westerberg mika.westerberg@linux.intel.com Cc: 4.1+ stable@vger.kernel.org # 4.1+ Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/acpi/device_sysfs.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/acpi/device_sysfs.c +++ b/drivers/acpi/device_sysfs.c @@ -202,11 +202,15 @@ static int create_of_modalias(struct acp { struct acpi_buffer buf = { ACPI_ALLOCATE_BUFFER }; const union acpi_object *of_compatible, *obj; + acpi_status status; int len, count; int i, nval; char *c;
- acpi_get_name(acpi_dev->handle, ACPI_SINGLE_NAME, &buf); + status = acpi_get_name(acpi_dev->handle, ACPI_SINGLE_NAME, &buf); + if (ACPI_FAILURE(status)) + return -ENODEV; + /* DT strings are all in lower case */ for (c = buf.pointer; *c != '\0'; c++) *c = tolower(*c);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: BOUGH CHEN haibo.chen@nxp.com
commit de0a0decf2edfc5b0c782915f4120cf990a9bd13 upstream.
Now tuning reset will be done when the timing is MMC_TIMING_LEGACY/ MMC_TIMING_MMC_HS/MMC_TIMING_SD_HS. But for timing MMC_TIMING_MMC_HS, we can not do tuning reset, otherwise HS400 timing is not right.
Here is the process of init HS400, first finish tuning in HS200 mode, then switch to HS mode and 8 bit DDR mode, finally switch to HS400 mode. If we do tuning reset in HS mode, this will cause HS400 mode lost the tuning setting, which will cause CRC error.
Signed-off-by: Haibo Chen haibo.chen@nxp.com Cc: stable@vger.kernel.org # v4.12+ Acked-by: Adrian Hunter adrian.hunter@intel.com Fixes: d9370424c948 ("mmc: sdhci-esdhc-imx: reset tuning circuit when power on mmc card") Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/mmc/host/sdhci-esdhc-imx.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/mmc/host/sdhci-esdhc-imx.c +++ b/drivers/mmc/host/sdhci-esdhc-imx.c @@ -979,6 +979,7 @@ static void esdhc_set_uhs_signaling(stru case MMC_TIMING_UHS_SDR25: case MMC_TIMING_UHS_SDR50: case MMC_TIMING_UHS_SDR104: + case MMC_TIMING_MMC_HS: case MMC_TIMING_MMC_HS200: writel(m, host->ioaddr + ESDHC_MIX_CTRL); break;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takeshi Saito takeshi.saito.xv@renesas.com
commit d30ae056adb81e1d2b8b953efa74735a020b8e3b upstream.
This fixes card initialization failure in high speed mode.
If U-Boot uses SDR or HS200/400 mode before starting Linux and Linux DT does not enable SDR/HS200/HS400 mode, card initialization fails in high speed mode.
It is necessary to initialize SCC registers during card initialization phase. HW reset function is registered only for a port with either of SDR/HS200/HS400 properties in device tree. If SDR/HS200/HS400 properties are not present in device tree, SCC registers will not be reset. In SoC that support SCC registers, HW reset function should be registered regardless of the configuration of device tree.
Reproduction procedure: - Use U-Boot that support MMC HS200/400 mode. - Delete HS200/HS400 properties in device tree. (Delete mmc-hs200-1_8v and mmc-hs400-1_8v) - MMC port works high speed mode and all commands fail.
Signed-off-by: Takeshi Saito takeshi.saito.xv@renesas.com Signed-off-by: Marek Vasut marek.vasut+renesas@gmail.com Cc: Niklas Söderlund niklas.soderlund+renesas@ragnatech.se Cc: Simon Horman horms+renesas@verge.net.au Reviewed-by: Wolfram Sang wsa+renesas@sang-engineering.com Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/mmc/host/renesas_sdhi_core.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
--- a/drivers/mmc/host/renesas_sdhi_core.c +++ b/drivers/mmc/host/renesas_sdhi_core.c @@ -723,6 +723,13 @@ int renesas_sdhi_probe(struct platform_d host->ops.start_signal_voltage_switch = renesas_sdhi_start_signal_voltage_switch; host->sdcard_irq_setbit_mask = TMIO_STAT_ALWAYS_SET_27; + + /* SDR and HS200/400 registers requires HW reset */ + if (of_data && of_data->scc_offset) { + priv->scc_ctl = host->ctl + of_data->scc_offset; + host->mmc->caps |= MMC_CAP_HW_RESET; + host->hw_reset = renesas_sdhi_hw_reset; + } }
/* Orginally registers were 16 bit apart, could be 32 or 64 nowadays */ @@ -775,8 +782,6 @@ int renesas_sdhi_probe(struct platform_d const struct renesas_sdhi_scc *taps = of_data->taps; bool hit = false;
- host->mmc->caps |= MMC_CAP_HW_RESET; - for (i = 0; i < of_data->taps_num; i++) { if (taps[i].clk_rate == 0 || taps[i].clk_rate == host->mmc->f_max) { @@ -789,12 +794,10 @@ int renesas_sdhi_probe(struct platform_d if (!hit) dev_warn(&host->pdev->dev, "Unknown clock rate for SDR104\n");
- priv->scc_ctl = host->ctl + of_data->scc_offset; host->init_tuning = renesas_sdhi_init_tuning; host->prepare_tuning = renesas_sdhi_prepare_tuning; host->select_tuning = renesas_sdhi_select_tuning; host->check_scc_error = renesas_sdhi_check_scc_error; - host->hw_reset = renesas_sdhi_hw_reset; host->prepare_hs400_tuning = renesas_sdhi_prepare_hs400_tuning; host->hs400_downgrade = renesas_sdhi_disable_scc;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiong Wu lohengrin1024@gmail.com
commit d4721339dcca7def04909a8e60da43c19a24d8bf upstream.
The original purpose of the code I fix is to replace max_discard with max_trim if max_trim is less than max_discard. When max_discard is 0 we should replace max_discard with max_trim as well, because max_discard equals 0 happens only when the max_do_calc_max_discard process is overflowed, so if mmc_can_trim(card) is true, max_discard should be replaced by an available max_trim. However, in the original code, there are two lines of code interfere the right process. 1) if (max_discard && mmc_can_trim(card)) when max_discard is 0, it skips the process checking if max_discard needs to be replaced with max_trim. 2) if (max_trim < max_discard) the condition is false when max_discard is 0. it also skips the process that replaces max_discard with max_trim, in fact, we should replace the 0-valued max_discard with max_trim.
Signed-off-by: Jiong Wu Lohengrin1024@gmail.com Fixes: b305882fbc87 (mmc: core: optimize mmc_calc_max_discard) Cc: stable@vger.kernel.org # v4.17+ Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/mmc/core/core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/mmc/core/core.c +++ b/drivers/mmc/core/core.c @@ -2381,9 +2381,9 @@ unsigned int mmc_calc_max_discard(struct return card->pref_erase;
max_discard = mmc_do_calc_max_discard(card, MMC_ERASE_ARG); - if (max_discard && mmc_can_trim(card)) { + if (mmc_can_trim(card)) { max_trim = mmc_do_calc_max_discard(card, MMC_TRIM_ARG); - if (max_trim < max_discard) + if (max_trim < max_discard || max_discard == 0) max_discard = max_trim; } else if (max_discard < card->erase_size) { max_discard = 0;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vignesh R vigneshr@ti.com
commit 673c865efbdc5fec3cc525c46d71844d42c60072 upstream.
Commit 4dea6c9b0b64 ("spi: spi-ti-qspi: add mmap mode read support") has has got order of parameter wrong when calling regmap_update_bits() to select CS for mmap access. Mask and value arguments are interchanged. Code will work on a system with single slave, but fails when more than one CS is in use. Fix this by correcting the order of parameters when calling regmap_update_bits().
Fixes: 4dea6c9b0b64 ("spi: spi-ti-qspi: add mmap mode read support") Cc: stable@vger.kernel.org Signed-off-by: Vignesh R vigneshr@ti.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/spi/spi-ti-qspi.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/spi/spi-ti-qspi.c +++ b/drivers/spi/spi-ti-qspi.c @@ -490,8 +490,8 @@ static void ti_qspi_enable_memory_map(st ti_qspi_write(qspi, MM_SWITCH, QSPI_SPI_SWITCH_REG); if (qspi->ctrl_base) { regmap_update_bits(qspi->ctrl_base, qspi->ctrl_reg, - MEM_CS_EN(spi->chip_select), - MEM_CS_MASK); + MEM_CS_MASK, + MEM_CS_EN(spi->chip_select)); } qspi->mmap_enabled = true; } @@ -503,7 +503,7 @@ static void ti_qspi_disable_memory_map(s ti_qspi_write(qspi, 0, QSPI_SPI_SWITCH_REG); if (qspi->ctrl_base) regmap_update_bits(qspi->ctrl_base, qspi->ctrl_reg, - 0, MEM_CS_MASK); + MEM_CS_MASK, 0); qspi->mmap_enabled = false; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
commit ef070b4e4aa25bb5f8632ad196644026c11903bf upstream.
When the commit b6ced294fb61
("spi: pxa2xx: Switch to SPI core DMA mapping functionality")
switches to SPI core provided DMA helpers, it missed to setup maximum supported DMA transfer length for the controller and thus users mistakenly try to send more data than supported with the following warning:
ili9341 spi-PRP0001:01: DMA disabled for transfer length 153600 greater than 65536
Setup maximum supported DMA transfer length in order to make users know the limit.
Fixes: b6ced294fb61 ("spi: pxa2xx: Switch to SPI core DMA mapping functionality") Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown broonie@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/spi/spi-pxa2xx.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/spi/spi-pxa2xx.c +++ b/drivers/spi/spi-pxa2xx.c @@ -1696,6 +1696,7 @@ static int pxa2xx_spi_probe(struct platf platform_info->enable_dma = false; } else { master->can_dma = pxa2xx_spi_can_dma; + master->max_dma_len = MAX_DMA_LEN; } }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vignesh R vigneshr@ti.com
commit baf8b9f8d260c55a86405f70a384c29cda888476 upstream.
Commit b682cffa3ac6 ("spi: omap2-mcspi: Set FIFO DMA trigger level to word length") broke SPI transfers where bits_per_word != 8. This is because of mimsatch between McSPI FIFO level event trigger size (SPI word length) and DMA request size(word length * maxburst). This leads to data corruption, lockup and errors like:
spi1.0: EOW timed out
Fix this by setting DMA maxburst size to 1 so that McSPI FIFO level event trigger size matches DMA request size.
Fixes: b682cffa3ac6 ("spi: omap2-mcspi: Set FIFO DMA trigger level to word length") Cc: stable@vger.kernel.org Reported-by: David Lechner david@lechnology.com Tested-by: David Lechner david@lechnology.com Signed-off-by: Vignesh R vigneshr@ti.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/spi/spi-omap2-mcspi.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/spi/spi-omap2-mcspi.c +++ b/drivers/spi/spi-omap2-mcspi.c @@ -623,8 +623,8 @@ omap2_mcspi_txrx_dma(struct spi_device * cfg.dst_addr = cs->phys + OMAP2_MCSPI_TX0; cfg.src_addr_width = width; cfg.dst_addr_width = width; - cfg.src_maxburst = es; - cfg.dst_maxburst = es; + cfg.src_maxburst = 1; + cfg.dst_maxburst = 1;
rx = xfer->rx_buf; tx = xfer->tx_buf;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King rmk+kernel@armlinux.org.uk
commit b89fefda7d4e3a649129584d855be233c7465264 upstream.
spi-gpio is capable of dealing with active-high chip-selects. Unfortunately, commit 4b859db2c606 ("spi: spi-gpio: add SPI_3WIRE support") broke this by setting master->mode_bits, which overrides the setting in the spi-bitbang code. Fix this.
[Fixed a trivial conflict with SPI_3WIRE_HIZ support -- broonie]
Fixes: 4b859db2c606 ("spi: spi-gpio: add SPI_3WIRE support") Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: Mark Brown broonie@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/spi/spi-gpio.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/spi/spi-gpio.c +++ b/drivers/spi/spi-gpio.c @@ -428,7 +428,8 @@ static int spi_gpio_probe(struct platfor return status;
master->bits_per_word_mask = SPI_BPW_RANGE_MASK(1, 32); - master->mode_bits = SPI_3WIRE | SPI_3WIRE_HIZ | SPI_CPHA | SPI_CPOL; + master->mode_bits = SPI_3WIRE | SPI_3WIRE_HIZ | SPI_CPHA | SPI_CPOL | + SPI_CS_HIGH; master->flags = master_flags; master->bus_num = pdev->id; /* The master needs to think there is a chipselect even if not connected */ @@ -455,7 +456,6 @@ static int spi_gpio_probe(struct platfor spi_gpio->bitbang.txrx_word[SPI_MODE_3] = spi_gpio_spec_txrx_word_mode3; } spi_gpio->bitbang.setup_transfer = spi_bitbang_setup_transfer; - spi_gpio->bitbang.flags = SPI_CS_HIGH;
status = spi_bitbang_start(&spi_gpio->bitbang); if (status)
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski krzk@kernel.org
commit 56b5d4ea778c1b0989c5cdb5406d4a488144c416 upstream.
LDO35 uses 25 mV step, not 50 mV. Bucks 7 and 8 use 12.5 mV step instead of 6.25 mV. Wrong step caused over-voltage (LDO35) or under-voltage (buck7 and 8) if regulators were used (e.g. on Exynos5420 Arndale Octa board).
Cc: stable@vger.kernel.org Fixes: cb74685ecb39 ("regulator: s2mps11: Add samsung s2mps11 regulator driver") Signed-off-by: Krzysztof Kozlowski krzk@kernel.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/regulator/s2mps11.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/regulator/s2mps11.c +++ b/drivers/regulator/s2mps11.c @@ -362,7 +362,7 @@ static const struct regulator_desc s2mps regulator_desc_s2mps11_ldo(32, STEP_50_MV), regulator_desc_s2mps11_ldo(33, STEP_50_MV), regulator_desc_s2mps11_ldo(34, STEP_50_MV), - regulator_desc_s2mps11_ldo(35, STEP_50_MV), + regulator_desc_s2mps11_ldo(35, STEP_25_MV), regulator_desc_s2mps11_ldo(36, STEP_50_MV), regulator_desc_s2mps11_ldo(37, STEP_50_MV), regulator_desc_s2mps11_ldo(38, STEP_50_MV), @@ -372,8 +372,8 @@ static const struct regulator_desc s2mps regulator_desc_s2mps11_buck1_4(4), regulator_desc_s2mps11_buck5, regulator_desc_s2mps11_buck67810(6, MIN_600_MV, STEP_6_25_MV), - regulator_desc_s2mps11_buck67810(7, MIN_600_MV, STEP_6_25_MV), - regulator_desc_s2mps11_buck67810(8, MIN_600_MV, STEP_6_25_MV), + regulator_desc_s2mps11_buck67810(7, MIN_600_MV, STEP_12_5_MV), + regulator_desc_s2mps11_buck67810(8, MIN_600_MV, STEP_12_5_MV), regulator_desc_s2mps11_buck9, regulator_desc_s2mps11_buck67810(10, MIN_750_MV, STEP_12_5_MV), };
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mark Zhang markz@nvidia.com
commit 0ab66b3c326ef8f77dae9f528118966365757c0c upstream.
If regulator DT node doesn't exist, its of_parse_cb callback function isn't called. Then all values for DT properties are filled with zero. This leads to wrong register update for FPS and POK settings.
Signed-off-by: Jinyoung Park jinyoungp@nvidia.com Signed-off-by: Mark Zhang markz@nvidia.com Signed-off-by: Mark Brown broonie@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/regulator/max77620-regulator.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
--- a/drivers/regulator/max77620-regulator.c +++ b/drivers/regulator/max77620-regulator.c @@ -1,7 +1,7 @@ /* * Maxim MAX77620 Regulator driver * - * Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved. + * Copyright (c) 2016-2018, NVIDIA CORPORATION. All rights reserved. * * Author: Mallikarjun Kasoju mkasoju@nvidia.com * Laxman Dewangan ldewangan@nvidia.com @@ -803,6 +803,14 @@ static int max77620_regulator_probe(stru rdesc = &rinfo[id].desc; pmic->rinfo[id] = &max77620_regs_info[id]; pmic->enable_power_mode[id] = MAX77620_POWER_MODE_NORMAL; + pmic->reg_pdata[id].active_fps_src = -1; + pmic->reg_pdata[id].active_fps_pd_slot = -1; + pmic->reg_pdata[id].active_fps_pu_slot = -1; + pmic->reg_pdata[id].suspend_fps_src = -1; + pmic->reg_pdata[id].suspend_fps_pd_slot = -1; + pmic->reg_pdata[id].suspend_fps_pu_slot = -1; + pmic->reg_pdata[id].power_ok = -1; + pmic->reg_pdata[id].ramp_rate_setting = -1;
ret = max77620_read_slew_rate(pmic, id); if (ret < 0)
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stuart Menefy stuart.menefy@mathembedded.com
commit 28c4f730d2a44f2591cb104091da29a38dac49fe upstream.
The step values for some of the LDOs appears to be incorrect, resulting in incorrect voltages (or at least, ones which are different from the Samsung 3.4 vendor kernel).
Signed-off-by: Stuart Menefy stuart.menefy@mathembedded.com Reviewed-by: Krzysztof Kozlowski krzk@kernel.org Signed-off-by: Mark Brown broonie@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/regulator/s2mpa01.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
--- a/drivers/regulator/s2mpa01.c +++ b/drivers/regulator/s2mpa01.c @@ -298,13 +298,13 @@ static const struct regulator_desc regul regulator_desc_ldo(2, STEP_50_MV), regulator_desc_ldo(3, STEP_50_MV), regulator_desc_ldo(4, STEP_50_MV), - regulator_desc_ldo(5, STEP_50_MV), + regulator_desc_ldo(5, STEP_25_MV), regulator_desc_ldo(6, STEP_25_MV), regulator_desc_ldo(7, STEP_50_MV), regulator_desc_ldo(8, STEP_50_MV), regulator_desc_ldo(9, STEP_50_MV), regulator_desc_ldo(10, STEP_50_MV), - regulator_desc_ldo(11, STEP_25_MV), + regulator_desc_ldo(11, STEP_50_MV), regulator_desc_ldo(12, STEP_50_MV), regulator_desc_ldo(13, STEP_50_MV), regulator_desc_ldo(14, STEP_50_MV), @@ -315,11 +315,11 @@ static const struct regulator_desc regul regulator_desc_ldo(19, STEP_50_MV), regulator_desc_ldo(20, STEP_50_MV), regulator_desc_ldo(21, STEP_50_MV), - regulator_desc_ldo(22, STEP_25_MV), - regulator_desc_ldo(23, STEP_25_MV), + regulator_desc_ldo(22, STEP_50_MV), + regulator_desc_ldo(23, STEP_50_MV), regulator_desc_ldo(24, STEP_50_MV), regulator_desc_ldo(25, STEP_50_MV), - regulator_desc_ldo(26, STEP_50_MV), + regulator_desc_ldo(26, STEP_25_MV), regulator_desc_buck1_4(1), regulator_desc_buck1_4(2), regulator_desc_buck1_4(3),
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Felix Fietkau nbd@nbd.name
commit 906d2d3f874a54183df5a609fda180adf0462428 upstream.
Since ccmp_pn is u8 *, the second half needs to start at array index 4 instead of 0. Fixes a connection stall after a certain amount of traffic
Fixes: 23405236460b9 ("mt76: fix transmission of encrypted management frames") Cc: stable@vger.kernel.org Signed-off-by: Felix Fietkau nbd@nbd.name Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/wireless/mediatek/mt76/mt76x02_mac.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/wireless/mediatek/mt76/mt76x02_mac.c +++ b/drivers/net/wireless/mediatek/mt76/mt76x02_mac.c @@ -309,7 +309,7 @@ void mt76x02_mac_write_txwi(struct mt76x ccmp_pn[6] = pn >> 32; ccmp_pn[7] = pn >> 40; txwi->iv = *((__le32 *)&ccmp_pn[0]); - txwi->eiv = *((__le32 *)&ccmp_pn[1]); + txwi->eiv = *((__le32 *)&ccmp_pn[4]); }
spin_lock_bh(&dev->mt76.lock);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stuart Menefy stuart.menefy@mathembedded.com
commit a5719a40aef956ba704f2aa1c7b977224d60fa96 upstream.
When a timer tick occurs and the clock is in one-shot mode, the timer needs to be stopped to prevent it triggering subsequent interrupts. Currently this code is in exynos4_mct_tick_clear(), but as it is only needed when an ISR occurs move it into exynos4_mct_tick_isr(), leaving exynos4_mct_tick_clear() just doing what its name suggests it should.
Signed-off-by: Stuart Menefy stuart.menefy@mathembedded.com Reviewed-by: Krzysztof Kozlowski krzk@kernel.org Tested-by: Marek Szyprowski m.szyprowski@samsung.com Cc: stable@vger.kernel.org # v4.3+ Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/clocksource/exynos_mct.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-)
--- a/drivers/clocksource/exynos_mct.c +++ b/drivers/clocksource/exynos_mct.c @@ -388,6 +388,13 @@ static void exynos4_mct_tick_start(unsig exynos4_mct_write(tmp, mevt->base + MCT_L_TCON_OFFSET); }
+static void exynos4_mct_tick_clear(struct mct_clock_event_device *mevt) +{ + /* Clear the MCT tick interrupt */ + if (readl_relaxed(reg_base + mevt->base + MCT_L_INT_CSTAT_OFFSET) & 1) + exynos4_mct_write(0x1, mevt->base + MCT_L_INT_CSTAT_OFFSET); +} + static int exynos4_tick_set_next_event(unsigned long cycles, struct clock_event_device *evt) { @@ -420,8 +427,11 @@ static int set_state_periodic(struct clo return 0; }
-static void exynos4_mct_tick_clear(struct mct_clock_event_device *mevt) +static irqreturn_t exynos4_mct_tick_isr(int irq, void *dev_id) { + struct mct_clock_event_device *mevt = dev_id; + struct clock_event_device *evt = &mevt->evt; + /* * This is for supporting oneshot mode. * Mct would generate interrupt periodically @@ -430,16 +440,6 @@ static void exynos4_mct_tick_clear(struc if (!clockevent_state_periodic(&mevt->evt)) exynos4_mct_tick_stop(mevt);
- /* Clear the MCT tick interrupt */ - if (readl_relaxed(reg_base + mevt->base + MCT_L_INT_CSTAT_OFFSET) & 1) - exynos4_mct_write(0x1, mevt->base + MCT_L_INT_CSTAT_OFFSET); -} - -static irqreturn_t exynos4_mct_tick_isr(int irq, void *dev_id) -{ - struct mct_clock_event_device *mevt = dev_id; - struct clock_event_device *evt = &mevt->evt; - exynos4_mct_tick_clear(mevt);
evt->event_handler(evt);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stuart Menefy stuart.menefy@mathembedded.com
commit d2f276c8d3c224d5b493c42b6cf006ae4e64fb1c upstream.
When shutting down the timer, ensure that after we have stopped the timer any pending interrupts are cleared. This fixes a problem when suspending, as interrupts are disabled before the timer is stopped, so the timer interrupt may still be asserted, preventing the system entering a low power state when the wfi is executed.
Signed-off-by: Stuart Menefy stuart.menefy@mathembedded.com Reviewed-by: Krzysztof Kozlowski krzk@kernel.org Tested-by: Marek Szyprowski m.szyprowski@samsung.com Cc: stable@vger.kernel.org # v4.3+ Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/clocksource/exynos_mct.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/clocksource/exynos_mct.c +++ b/drivers/clocksource/exynos_mct.c @@ -411,6 +411,7 @@ static int set_state_shutdown(struct clo
mevt = container_of(evt, struct mct_clock_event_device, evt); exynos4_mct_tick_stop(mevt); + exynos4_mct_tick_clear(mevt); return 0; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Samuel Holland samuel@sholland.org
commit c950ca8c35eeb32224a63adc47e12f9e226da241 upstream.
The Allwinner A64 SoC is known[1] to have an unstable architectural timer, which manifests itself most obviously in the time jumping forward a multiple of 95 years[2][3]. This coincides with 2^56 cycles at a timer frequency of 24 MHz, implying that the time went slightly backward (and this was interpreted by the kernel as it jumping forward and wrapping around past the epoch).
Investigation revealed instability in the low bits of CNTVCT at the point a high bit rolls over. This leads to power-of-two cycle forward and backward jumps. (Testing shows that forward jumps are about twice as likely as backward jumps.) Since the counter value returns to normal after an indeterminate read, each "jump" really consists of both a forward and backward jump from the software perspective.
Unless the kernel is trapping CNTVCT reads, a userspace program is able to read the register in a loop faster than it changes. A test program running on all 4 CPU cores that reported jumps larger than 100 ms was run for 13.6 hours and reported the following:
Count | Event -------+--------------------------- 9940 | jumped backward 699ms 268 | jumped backward 1398ms 1 | jumped backward 2097ms 16020 | jumped forward 175ms 6443 | jumped forward 699ms 2976 | jumped forward 1398ms 9 | jumped forward 356516ms 9 | jumped forward 357215ms 4 | jumped forward 714430ms 1 | jumped forward 3578440ms
This works out to a jump larger than 100 ms about every 5.5 seconds on each CPU core.
The largest jump (almost an hour!) was the following sequence of reads: 0x0000007fffffffff → 0x00000093feffffff → 0x0000008000000000
Note that the middle bits don't necessarily all read as all zeroes or all ones during the anomalous behavior; however the low 10 bits checked by the function in this patch have never been observed with any other value.
Also note that smaller jumps are much more common, with backward jumps of 2048 (2^11) cycles observed over 400 times per second on each core. (Of course, this is partially explained by lower bits rolling over more frequently.) Any one of these could have caused the 95 year time skip.
Similar anomalies were observed while reading CNTPCT (after patching the kernel to allow reads from userspace). However, the CNTPCT jumps are much less frequent, and only small jumps were observed. The same program as before (except now reading CNTPCT) observed after 72 hours:
Count | Event -------+--------------------------- 17 | jumped backward 699ms 52 | jumped forward 175ms 2831 | jumped forward 699ms 5 | jumped forward 1398ms
Further investigation showed that the instability in CNTPCT/CNTVCT also affected the respective timer's TVAL register. The following values were observed immediately after writing CNVT_TVAL to 0x10000000:
CNTVCT | CNTV_TVAL | CNTV_CVAL | CNTV_TVAL Error --------------------+------------+--------------------+----------------- 0x000000d4a2d8bfff | 0x10003fff | 0x000000d4b2d8bfff | +0x00004000 0x000000d4a2d94000 | 0x0fffffff | 0x000000d4b2d97fff | -0x00004000 0x000000d4a2d97fff | 0x10003fff | 0x000000d4b2d97fff | +0x00004000 0x000000d4a2d9c000 | 0x0fffffff | 0x000000d4b2d9ffff | -0x00004000
The pattern of errors in CNTV_TVAL seemed to depend on exactly which value was written to it. For example, after writing 0x10101010:
CNTVCT | CNTV_TVAL | CNTV_CVAL | CNTV_TVAL Error --------------------+------------+--------------------+----------------- 0x000001ac3effffff | 0x1110100f | 0x000001ac4f10100f | +0x1000000 0x000001ac40000000 | 0x1010100f | 0x000001ac5110100f | -0x1000000 0x000001ac58ffffff | 0x1110100f | 0x000001ac6910100f | +0x1000000 0x000001ac66000000 | 0x1010100f | 0x000001ac7710100f | -0x1000000 0x000001ac6affffff | 0x1110100f | 0x000001ac7b10100f | +0x1000000 0x000001ac6e000000 | 0x1010100f | 0x000001ac7f10100f | -0x1000000
I was also twice able to reproduce the issue covered by Allwinner's workaround[4], that writing to TVAL sometimes fails, and both CVAL and TVAL are left with entirely bogus values. One was the following values:
CNTVCT | CNTV_TVAL | CNTV_CVAL --------------------+------------+-------------------------------------- 0x000000d4a2d6014c | 0x8fbd5721 | 0x000000d132935fff (615s in the past) Reviewed-by: Marc Zyngier marc.zyngier@arm.com
========================================================================
Because the CPU can read the CNTPCT/CNTVCT registers faster than they change, performing two reads of the register and comparing the high bits (like other workarounds) is not a workable solution. And because the timer can jump both forward and backward, no pair of reads can distinguish a good value from a bad one. The only way to guarantee a good value from consecutive reads would be to read _three_ times, and take the middle value only if the three values are 1) each unique and 2) increasing. This takes at minimum 3 counter cycles (125 ns), or more if an anomaly is detected.
However, since there is a distinct pattern to the bad values, we can optimize the common case (1022/1024 of the time) to a single read by simply ignoring values that match the error pattern. This still takes no more than 3 cycles in the worst case, and requires much less code. As an additional safety check, we still limit the loop iteration to the number of max-frequency (1.2 GHz) CPU cycles in three 24 MHz counter periods.
For the TVAL registers, the simple solution is to not use them. Instead, read or write the CVAL and calculate the TVAL value in software.
Although the manufacturer is aware of at least part of the erratum[4], there is no official name for it. For now, use the kernel-internal name "UNKNOWN1".
[1]: https://github.com/armbian/build/commit/a08cd6fe7ae9 [2]: https://forum.armbian.com/topic/3458-a64-datetime-clock-issue/ [3]: https://irclog.whitequark.org/linux-sunxi/2018-01-26 [4]: https://github.com/Allwinner-Homlet/H6-BSP4.9-linux/blob/master/drivers/cloc...
Acked-by: Maxime Ripard maxime.ripard@bootlin.com Tested-by: Andre Przywara andre.przywara@arm.com Signed-off-by: Samuel Holland samuel@sholland.org Cc: stable@vger.kernel.org Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Documentation/arm64/silicon-errata.txt | 2 + drivers/clocksource/Kconfig | 10 ++++++ drivers/clocksource/arm_arch_timer.c | 55 +++++++++++++++++++++++++++++++++ 3 files changed, 67 insertions(+)
--- a/Documentation/arm64/silicon-errata.txt +++ b/Documentation/arm64/silicon-errata.txt @@ -44,6 +44,8 @@ stable kernels.
| Implementor | Component | Erratum ID | Kconfig | +----------------+-----------------+-----------------+-----------------------------+ +| Allwinner | A64/R18 | UNKNOWN1 | SUN50I_ERRATUM_UNKNOWN1 | +| | | | | | ARM | Cortex-A53 | #826319 | ARM64_ERRATUM_826319 | | ARM | Cortex-A53 | #827319 | ARM64_ERRATUM_827319 | | ARM | Cortex-A53 | #824069 | ARM64_ERRATUM_824069 | --- a/drivers/clocksource/Kconfig +++ b/drivers/clocksource/Kconfig @@ -360,6 +360,16 @@ config ARM64_ERRATUM_858921 The workaround will be dynamically enabled when an affected core is detected.
+config SUN50I_ERRATUM_UNKNOWN1 + bool "Workaround for Allwinner A64 erratum UNKNOWN1" + default y + depends on ARM_ARCH_TIMER && ARM64 && ARCH_SUNXI + select ARM_ARCH_TIMER_OOL_WORKAROUND + help + This option enables a workaround for instability in the timer on + the Allwinner A64 SoC. The workaround will only be active if the + allwinner,erratum-unknown1 property is found in the timer node. + config ARM_GLOBAL_TIMER bool "Support for the ARM global timer" if COMPILE_TEST select TIMER_OF if OF --- a/drivers/clocksource/arm_arch_timer.c +++ b/drivers/clocksource/arm_arch_timer.c @@ -326,6 +326,48 @@ static u64 notrace arm64_1188873_read_cn } #endif
+#ifdef CONFIG_SUN50I_ERRATUM_UNKNOWN1 +/* + * The low bits of the counter registers are indeterminate while bit 10 or + * greater is rolling over. Since the counter value can jump both backward + * (7ff -> 000 -> 800) and forward (7ff -> fff -> 800), ignore register values + * with all ones or all zeros in the low bits. Bound the loop by the maximum + * number of CPU cycles in 3 consecutive 24 MHz counter periods. + */ +#define __sun50i_a64_read_reg(reg) ({ \ + u64 _val; \ + int _retries = 150; \ + \ + do { \ + _val = read_sysreg(reg); \ + _retries--; \ + } while (((_val + 1) & GENMASK(9, 0)) <= 1 && _retries); \ + \ + WARN_ON_ONCE(!_retries); \ + _val; \ +}) + +static u64 notrace sun50i_a64_read_cntpct_el0(void) +{ + return __sun50i_a64_read_reg(cntpct_el0); +} + +static u64 notrace sun50i_a64_read_cntvct_el0(void) +{ + return __sun50i_a64_read_reg(cntvct_el0); +} + +static u32 notrace sun50i_a64_read_cntp_tval_el0(void) +{ + return read_sysreg(cntp_cval_el0) - sun50i_a64_read_cntpct_el0(); +} + +static u32 notrace sun50i_a64_read_cntv_tval_el0(void) +{ + return read_sysreg(cntv_cval_el0) - sun50i_a64_read_cntvct_el0(); +} +#endif + #ifdef CONFIG_ARM_ARCH_TIMER_OOL_WORKAROUND DEFINE_PER_CPU(const struct arch_timer_erratum_workaround *, timer_unstable_counter_workaround); EXPORT_SYMBOL_GPL(timer_unstable_counter_workaround); @@ -423,6 +465,19 @@ static const struct arch_timer_erratum_w .read_cntvct_el0 = arm64_1188873_read_cntvct_el0, }, #endif +#ifdef CONFIG_SUN50I_ERRATUM_UNKNOWN1 + { + .match_type = ate_match_dt, + .id = "allwinner,erratum-unknown1", + .desc = "Allwinner erratum UNKNOWN1", + .read_cntp_tval_el0 = sun50i_a64_read_cntp_tval_el0, + .read_cntv_tval_el0 = sun50i_a64_read_cntv_tval_el0, + .read_cntpct_el0 = sun50i_a64_read_cntpct_el0, + .read_cntvct_el0 = sun50i_a64_read_cntvct_el0, + .set_next_event_phys = erratum_set_next_event_tval_phys, + .set_next_event_virt = erratum_set_next_event_tval_virt, + }, +#endif };
typedef bool (*ate_match_fn_t)(const struct arch_timer_erratum_workaround *,
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pierre Morel pmorel@linux.ibm.com
commit 36360658eb5a6cf04bb9f2704d1e4ce54037ec99 upstream.
Libudev relies on having a subsystem link for non-root devices. To avoid libudev (and potentially other userspace tools) choking on the matrix device let us introduce a matrix bus and with it the matrix bus subsytem. Also make the matrix device reside within the matrix bus.
Doing this we remove the forced link from the matrix device to the vfio_ap driver and the device_type we do not need anymore.
Since the associated matrix driver is not the vfio_ap driver any more, we have to change the search for the devices on the vfio_ap driver in the function vfio_ap_verify_queue_reserved. Fixes: 1fde573413b5 ("s390: vfio-ap: base implementation of VFIO AP device driver") Cc: stable@vger.kernel.org
Reported-by: Marc Hartmayer mhartmay@linux.ibm.com Reported-by: Christian Borntraeger borntraeger@de.ibm.com Signed-off-by: Pierre Morel pmorel@linux.ibm.com Tested-by: Christian Borntraeger borntraeger@de.ibm.com Reviewed-by: Cornelia Huck cohuck@redhat.com Reviewed-by: Tony Krowiak akrowiak@linux.ibm.com Acked-by: Halil Pasic pasic@linux.ibm.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/s390/crypto/vfio_ap_drv.c | 44 +++++++++++++++++++++++++++------- drivers/s390/crypto/vfio_ap_ops.c | 4 +-- drivers/s390/crypto/vfio_ap_private.h | 1 3 files changed, 38 insertions(+), 11 deletions(-)
--- a/drivers/s390/crypto/vfio_ap_drv.c +++ b/drivers/s390/crypto/vfio_ap_drv.c @@ -15,7 +15,6 @@ #include "vfio_ap_private.h"
#define VFIO_AP_ROOT_NAME "vfio_ap" -#define VFIO_AP_DEV_TYPE_NAME "ap_matrix" #define VFIO_AP_DEV_NAME "matrix"
MODULE_AUTHOR("IBM Corporation"); @@ -24,10 +23,6 @@ MODULE_LICENSE("GPL v2");
static struct ap_driver vfio_ap_drv;
-static struct device_type vfio_ap_dev_type = { - .name = VFIO_AP_DEV_TYPE_NAME, -}; - struct ap_matrix_dev *matrix_dev;
/* Only type 10 adapters (CEX4 and later) are supported @@ -62,6 +57,22 @@ static void vfio_ap_matrix_dev_release(s kfree(matrix_dev); }
+static int matrix_bus_match(struct device *dev, struct device_driver *drv) +{ + return 1; +} + +static struct bus_type matrix_bus = { + .name = "matrix", + .match = &matrix_bus_match, +}; + +static struct device_driver matrix_driver = { + .name = "vfio_ap", + .bus = &matrix_bus, + .suppress_bind_attrs = true, +}; + static int vfio_ap_matrix_dev_create(void) { int ret; @@ -71,6 +82,10 @@ static int vfio_ap_matrix_dev_create(voi if (IS_ERR(root_device)) return PTR_ERR(root_device);
+ ret = bus_register(&matrix_bus); + if (ret) + goto bus_register_err; + matrix_dev = kzalloc(sizeof(*matrix_dev), GFP_KERNEL); if (!matrix_dev) { ret = -ENOMEM; @@ -87,30 +102,41 @@ static int vfio_ap_matrix_dev_create(voi mutex_init(&matrix_dev->lock); INIT_LIST_HEAD(&matrix_dev->mdev_list);
- matrix_dev->device.type = &vfio_ap_dev_type; dev_set_name(&matrix_dev->device, "%s", VFIO_AP_DEV_NAME); matrix_dev->device.parent = root_device; + matrix_dev->device.bus = &matrix_bus; matrix_dev->device.release = vfio_ap_matrix_dev_release; - matrix_dev->device.driver = &vfio_ap_drv.driver; + matrix_dev->vfio_ap_drv = &vfio_ap_drv;
ret = device_register(&matrix_dev->device); if (ret) goto matrix_reg_err;
+ ret = driver_register(&matrix_driver); + if (ret) + goto matrix_drv_err; + return 0;
+matrix_drv_err: + device_unregister(&matrix_dev->device); matrix_reg_err: put_device(&matrix_dev->device); matrix_alloc_err: + bus_unregister(&matrix_bus); +bus_register_err: root_device_unregister(root_device); - return ret; }
static void vfio_ap_matrix_dev_destroy(void) { + struct device *root_device = matrix_dev->device.parent; + + driver_unregister(&matrix_driver); device_unregister(&matrix_dev->device); - root_device_unregister(matrix_dev->device.parent); + bus_unregister(&matrix_bus); + root_device_unregister(root_device); }
static int __init vfio_ap_init(void) --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -198,8 +198,8 @@ static int vfio_ap_verify_queue_reserved qres.apqi = apqi; qres.reserved = false;
- ret = driver_for_each_device(matrix_dev->device.driver, NULL, &qres, - vfio_ap_has_queue); + ret = driver_for_each_device(&matrix_dev->vfio_ap_drv->driver, NULL, + &qres, vfio_ap_has_queue); if (ret) return ret;
--- a/drivers/s390/crypto/vfio_ap_private.h +++ b/drivers/s390/crypto/vfio_ap_private.h @@ -40,6 +40,7 @@ struct ap_matrix_dev { struct ap_config_info info; struct list_head mdev_list; struct mutex lock; + struct ap_driver *vfio_ap_drv; };
extern struct ap_matrix_dev *matrix_dev;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Martin Schwidefsky schwidefsky@de.ibm.com
commit 8727638426b0aea59d7f904ad8ddf483f9234f88 upstream.
The setup_lowcore() function creates a new prefix page for the boot CPU. The PSW mask for the system_call, external interrupt, i/o interrupt and the program check handler have the DAT bit set in this new prefix page.
At the time setup_lowcore is called the system still runs without virtual address translation, the paging_init() function creates the kernel page table and loads the CR13 with the kernel ASCE.
Any code between setup_lowcore() and the end of paging_init() that has a BUG or WARN statement will create a program check that can not be handled correctly as there is no kernel page table yet.
To allow early WARN statements initially setup the lowcore with DAT off and set the DAT bit only after paging_init() has completed.
Cc: stable@vger.kernel.org Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/s390/kernel/setup.c | 32 +++++++++++++++++++++++--------- 1 file changed, 23 insertions(+), 9 deletions(-)
--- a/arch/s390/kernel/setup.c +++ b/arch/s390/kernel/setup.c @@ -369,7 +369,7 @@ void __init arch_call_rest_init(void) : : [_frame] "a" (frame)); }
-static void __init setup_lowcore(void) +static void __init setup_lowcore_dat_off(void) { struct lowcore *lc;
@@ -380,19 +380,16 @@ static void __init setup_lowcore(void) lc = memblock_alloc_low(sizeof(*lc), sizeof(*lc)); lc->restart_psw.mask = PSW_KERNEL_BITS; lc->restart_psw.addr = (unsigned long) restart_int_handler; - lc->external_new_psw.mask = PSW_KERNEL_BITS | - PSW_MASK_DAT | PSW_MASK_MCHECK; + lc->external_new_psw.mask = PSW_KERNEL_BITS | PSW_MASK_MCHECK; lc->external_new_psw.addr = (unsigned long) ext_int_handler; lc->svc_new_psw.mask = PSW_KERNEL_BITS | - PSW_MASK_DAT | PSW_MASK_IO | PSW_MASK_EXT | PSW_MASK_MCHECK; + PSW_MASK_IO | PSW_MASK_EXT | PSW_MASK_MCHECK; lc->svc_new_psw.addr = (unsigned long) system_call; - lc->program_new_psw.mask = PSW_KERNEL_BITS | - PSW_MASK_DAT | PSW_MASK_MCHECK; + lc->program_new_psw.mask = PSW_KERNEL_BITS | PSW_MASK_MCHECK; lc->program_new_psw.addr = (unsigned long) pgm_check_handler; lc->mcck_new_psw.mask = PSW_KERNEL_BITS; lc->mcck_new_psw.addr = (unsigned long) mcck_int_handler; - lc->io_new_psw.mask = PSW_KERNEL_BITS | - PSW_MASK_DAT | PSW_MASK_MCHECK; + lc->io_new_psw.mask = PSW_KERNEL_BITS | PSW_MASK_MCHECK; lc->io_new_psw.addr = (unsigned long) io_int_handler; lc->clock_comparator = clock_comparator_max; lc->nodat_stack = ((unsigned long) &init_thread_union) @@ -452,6 +449,17 @@ static void __init setup_lowcore(void) lowcore_ptr[0] = lc; }
+static void __init setup_lowcore_dat_on(void) +{ + struct lowcore *lc; + + lc = lowcore_ptr[0]; + lc->external_new_psw.mask |= PSW_MASK_DAT; + lc->svc_new_psw.mask |= PSW_MASK_DAT; + lc->program_new_psw.mask |= PSW_MASK_DAT; + lc->io_new_psw.mask |= PSW_MASK_DAT; +} + static struct resource code_resource = { .name = "Kernel code", .flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM, @@ -1072,7 +1080,7 @@ void __init setup_arch(char **cmdline_p) #endif
setup_resources(); - setup_lowcore(); + setup_lowcore_dat_off(); smp_fill_possible_mask(); cpu_detect_mhz_feature(); cpu_init(); @@ -1085,6 +1093,12 @@ void __init setup_arch(char **cmdline_p) */ paging_init();
+ /* + * After paging_init created the kernel page table, the new PSWs + * in lowcore can now run with DAT enabled. + */ + setup_lowcore_dat_on(); + /* Setup default console */ conmode_default(); set_preferred_console();
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Halil Pasic pasic@linux.ibm.com
commit 3438b2c039b4bf26881786a1f3450f016d66ad11 upstream.
A queue with a capacity of zero is clearly not a valid virtio queue. Some emulators report zero queue size if queried with an invalid queue index. Instead of crashing in this case let us just return -ENOENT. To make that work properly, let us fix the notifier cleanup logic as well.
Cc: stable@vger.kernel.org Signed-off-by: Halil Pasic pasic@linux.ibm.com Signed-off-by: Cornelia Huck cohuck@redhat.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/s390/virtio/virtio_ccw.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/s390/virtio/virtio_ccw.c +++ b/drivers/s390/virtio/virtio_ccw.c @@ -272,6 +272,8 @@ static void virtio_ccw_drop_indicators(s { struct virtio_ccw_vq_info *info;
+ if (!vcdev->airq_info) + return; list_for_each_entry(info, &vcdev->virtqueues, node) drop_airq_indicator(info->vq, vcdev->airq_info); } @@ -413,7 +415,7 @@ static int virtio_ccw_read_vq_conf(struc ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_READ_VQ_CONF); if (ret) return ret; - return vcdev->config_block->num; + return vcdev->config_block->num ?: -ENOENT; }
static void virtio_ccw_del_vq(struct virtqueue *vq, struct ccw1 *ccw)
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Felipe Franciosi felipe@nutanix.com
commit 3722e6a52174d7c3a00e6f5efd006ca093f346c1 upstream.
The virtio scsi spec defines struct virtio_scsi_ctrl_tmf as a set of device-readable records and a single device-writable response entry:
struct virtio_scsi_ctrl_tmf { // Device-readable part le32 type; le32 subtype; u8 lun[8]; le64 id; // Device-writable part u8 response; }
The above should be organised as two descriptor entries (or potentially more if using VIRTIO_F_ANY_LAYOUT), but without any extra data after "le64 id" or after "u8 response".
The Linux driver doesn't respect that, with virtscsi_abort() and virtscsi_device_reset() setting cmd->sc before calling virtscsi_tmf(). It results in the original scsi command payload (or writable buffers) added to the tmf.
This fixes the problem by leaving cmd->sc zeroed out, which makes virtscsi_kick_cmd() add the tmf to the control vq without any payload.
Cc: stable@vger.kernel.org Signed-off-by: Felipe Franciosi felipe@nutanix.com Reviewed-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/virtio_scsi.c | 2 -- 1 file changed, 2 deletions(-)
--- a/drivers/scsi/virtio_scsi.c +++ b/drivers/scsi/virtio_scsi.c @@ -594,7 +594,6 @@ static int virtscsi_device_reset(struct return FAILED;
memset(cmd, 0, sizeof(*cmd)); - cmd->sc = sc; cmd->req.tmf = (struct virtio_scsi_ctrl_tmf_req){ .type = VIRTIO_SCSI_T_TMF, .subtype = cpu_to_virtio32(vscsi->vdev, @@ -653,7 +652,6 @@ static int virtscsi_abort(struct scsi_cm return FAILED;
memset(cmd, 0, sizeof(*cmd)); - cmd->sc = sc; cmd->req.tmf = (struct virtio_scsi_ctrl_tmf_req){ .type = VIRTIO_SCSI_T_TMF, .subtype = VIRTIO_SCSI_T_TMF_ABORT_TASK,
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sagar Biradar sagar.biradar@microchip.com
commit 0015437cc046e5ec2b57b00ff8312b8d432eac7c upstream.
Fix performance issue where the queue depth for SmartIOC logical volumes is set to 1, and allow the usual logical volume code to be executed
Fixes: a052865fe287 (aacraid: Set correct Queue Depth for HBA1000 RAW disks) Cc: stable@vger.kernel.org Signed-off-by: Sagar Biradar Sagar.Biradar@microchip.com Reviewed-by: Dave Carroll david.carroll@microsemi.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/aacraid/linit.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
--- a/drivers/scsi/aacraid/linit.c +++ b/drivers/scsi/aacraid/linit.c @@ -413,13 +413,16 @@ static int aac_slave_configure(struct sc if (chn < AAC_MAX_BUSES && tid < AAC_MAX_TARGETS && aac->sa_firmware) { devtype = aac->hba_map[chn][tid].devtype;
- if (devtype == AAC_DEVTYPE_NATIVE_RAW) + if (devtype == AAC_DEVTYPE_NATIVE_RAW) { depth = aac->hba_map[chn][tid].qd_limit; - else if (devtype == AAC_DEVTYPE_ARC_RAW) + set_timeout = 1; + goto common_config; + } + if (devtype == AAC_DEVTYPE_ARC_RAW) { set_qd_dev_type = true; - - set_timeout = 1; - goto common_config; + set_timeout = 1; + goto common_config; + } }
if (aac->jbod && (sdev->type == TYPE_DISK))
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Martin K. Petersen martin.petersen@oracle.com
commit a83da8a4509d3ebfe03bb7fffce022e4d5d4764f upstream.
It was reported that some devices report an OPTIMAL TRANSFER LENGTH of 0xFFFF blocks. That looks bogus, especially for a device with a 4096-byte physical block size.
Ignore OPTIMAL TRANSFER LENGTH if it is not a multiple of the device's reported physical block size.
To make the sanity checking conditionals more readable--and to facilitate printing warnings--relocate the checking to a helper function. No functional change aside from the printks.
Cc: stable@vger.kernel.org Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199759 Reported-by: Christoph Anton Mitterer calestyo@scientia.net Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/sd.c | 59 +++++++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 50 insertions(+), 9 deletions(-)
--- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -3047,6 +3047,55 @@ static void sd_read_security(struct scsi sdkp->security = 1; }
+/* + * Determine the device's preferred I/O size for reads and writes + * unless the reported value is unreasonably small, large, not a + * multiple of the physical block size, or simply garbage. + */ +static bool sd_validate_opt_xfer_size(struct scsi_disk *sdkp, + unsigned int dev_max) +{ + struct scsi_device *sdp = sdkp->device; + unsigned int opt_xfer_bytes = + logical_to_bytes(sdp, sdkp->opt_xfer_blocks); + + if (sdkp->opt_xfer_blocks > dev_max) { + sd_first_printk(KERN_WARNING, sdkp, + "Optimal transfer size %u logical blocks " \ + "> dev_max (%u logical blocks)\n", + sdkp->opt_xfer_blocks, dev_max); + return false; + } + + if (sdkp->opt_xfer_blocks > SD_DEF_XFER_BLOCKS) { + sd_first_printk(KERN_WARNING, sdkp, + "Optimal transfer size %u logical blocks " \ + "> sd driver limit (%u logical blocks)\n", + sdkp->opt_xfer_blocks, SD_DEF_XFER_BLOCKS); + return false; + } + + if (opt_xfer_bytes < PAGE_SIZE) { + sd_first_printk(KERN_WARNING, sdkp, + "Optimal transfer size %u bytes < " \ + "PAGE_SIZE (%u bytes)\n", + opt_xfer_bytes, (unsigned int)PAGE_SIZE); + return false; + } + + if (opt_xfer_bytes & (sdkp->physical_block_size - 1)) { + sd_first_printk(KERN_WARNING, sdkp, + "Optimal transfer size %u bytes not a " \ + "multiple of physical block size (%u bytes)\n", + opt_xfer_bytes, sdkp->physical_block_size); + return false; + } + + sd_first_printk(KERN_INFO, sdkp, "Optimal transfer size %u bytes\n", + opt_xfer_bytes); + return true; +} + /** * sd_revalidate_disk - called the first time a new disk is seen, * performs disk spin up, read_capacity, etc. @@ -3125,15 +3174,7 @@ static int sd_revalidate_disk(struct gen dev_max = min_not_zero(dev_max, sdkp->max_xfer_blocks); q->limits.max_dev_sectors = logical_to_sectors(sdp, dev_max);
- /* - * Determine the device's preferred I/O size for reads and writes - * unless the reported value is unreasonably small, large, or - * garbage. - */ - if (sdkp->opt_xfer_blocks && - sdkp->opt_xfer_blocks <= dev_max && - sdkp->opt_xfer_blocks <= SD_DEF_XFER_BLOCKS && - logical_to_bytes(sdp, sdkp->opt_xfer_blocks) >= PAGE_SIZE) { + if (sd_validate_opt_xfer_size(sdkp, dev_max)) { q->limits.io_opt = logical_to_bytes(sdp, sdkp->opt_xfer_blocks); rw_max = logical_to_sectors(sdp, sdkp->opt_xfer_blocks); } else
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bart Van Assche bvanassche@acm.org
commit 32e36bfbcf31452a854263e7c7f32fbefc4b44d8 upstream.
When using SCSI passthrough in combination with the iSCSI target driver then cmd->t_state_lock may be obtained from interrupt context. Hence, all code that obtains cmd->t_state_lock from thread context must disable interrupts first. This patch avoids that lockdep reports the following:
WARNING: inconsistent lock state 4.18.0-dbg+ #1 Not tainted -------------------------------- inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage. iscsi_ttx/1800 [HC1[1]:SC0[2]:HE0:SE0] takes: 000000006e7b0ceb (&(&cmd->t_state_lock)->rlock){?...}, at: target_complete_cmd+0x47/0x2c0 [target_core_mod] {HARDIRQ-ON-W} state was registered at: lock_acquire+0xd2/0x260 _raw_spin_lock+0x32/0x50 iscsit_close_connection+0x97e/0x1020 [iscsi_target_mod] iscsit_take_action_for_connection_exit+0x108/0x200 [iscsi_target_mod] iscsi_target_rx_thread+0x180/0x190 [iscsi_target_mod] kthread+0x1cf/0x1f0 ret_from_fork+0x24/0x30 irq event stamp: 1281 hardirqs last enabled at (1279): [<ffffffff970ade79>] __local_bh_enable_ip+0xa9/0x160 hardirqs last disabled at (1281): [<ffffffff97a008a5>] interrupt_entry+0xb5/0xd0 softirqs last enabled at (1278): [<ffffffff977cd9a1>] lock_sock_nested+0x51/0xc0 softirqs last disabled at (1280): [<ffffffffc07a6e04>] ip6_finish_output2+0x124/0xe40 [ipv6]
other info that might help us debug this: Possible unsafe locking scenario:
CPU0 ---- lock(&(&cmd->t_state_lock)->rlock); <Interrupt> lock(&(&cmd->t_state_lock)->rlock);
--- drivers/target/iscsi/iscsi_target.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/target/iscsi/iscsi_target.c +++ b/drivers/target/iscsi/iscsi_target.c @@ -4040,9 +4040,9 @@ static void iscsit_release_commands_from struct se_cmd *se_cmd = &cmd->se_cmd;
if (se_cmd->se_tfo != NULL) { - spin_lock(&se_cmd->t_state_lock); + spin_lock_irq(&se_cmd->t_state_lock); se_cmd->transport_state |= CMD_T_FABRIC_STOP; - spin_unlock(&se_cmd->t_state_lock); + spin_unlock_irq(&se_cmd->t_state_lock); } } spin_unlock_bh(&conn->cmd_lock);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Himanshu Madhani hmadhani@marvell.com
commit ec322937a7f152d68755dc8316523bf6f831b48f upstream.
This patch fixes LUN discovery when loop ID is not yet assigned by the firmware during driver load/sg_reset operations. Driver will now search for new loop id before retrying login.
Fixes: 48acad099074 ("scsi: qla2xxx: Fix N2N link re-connect") Cc: stable@vger.kernel.org #4.19 Signed-off-by: Himanshu Madhani hmadhani@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/qla2xxx/qla_init.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
--- a/drivers/scsi/qla2xxx/qla_init.c +++ b/drivers/scsi/qla2xxx/qla_init.c @@ -644,11 +644,14 @@ static void qla24xx_handle_gnl_done_even break; case DSC_LS_PORT_UNAVAIL: default: - if (fcport->loop_id != FC_NO_LOOP_ID) - qla2x00_clear_loop_id(fcport); - - fcport->loop_id = loop_id; - fcport->fw_login_state = DSC_LS_PORT_UNAVAIL; + if (fcport->loop_id == FC_NO_LOOP_ID) { + qla2x00_find_new_loop_id(vha, fcport); + fcport->fw_login_state = + DSC_LS_PORT_UNAVAIL; + } + ql_dbg(ql_dbg_disc, vha, 0x20e5, + "%s %d %8phC\n", __func__, __LINE__, + fcport->port_name); qla24xx_fcport_handle_login(vha, fcport); break; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Giridhar Malavali gmalavali@marvell.com
commit f3e026951771bceb17319a4d0d6121ca58746c88 upstream.
This patch fixes warning seen when BLK-MQ is enabled and hardware does not support MQ. This will result into driver requesting MSIx vectors which are equal or less than pre_desc via PCI IRQ Affinity infrastructure.
[ 19.746300] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.12-k. [ 19.746599] qla2xxx [0000:02:00.0]-001d: : Found an ISP2432 irq 18 iobase 0x(____ptrval____). [ 20.203186] ------------[ cut here ]------------ [ 20.203306] WARNING: CPU: 8 PID: 268 at drivers/pci/msi.c:1273 pci_irq_get_affinity+0xf4/0x120 [ 20.203481] Modules linked in: tg3 ptp qla2xxx(+) pps_core sg libphy scsi_transport_fc flash loop autofs4 [ 20.203700] CPU: 8 PID: 268 Comm: systemd-udevd Not tainted 5.0.0-rc5-00358-gdf3865f #113 [ 20.203830] Call Trace: [ 20.203933] [0000000000461bb0] __warn+0xb0/0xe0 [ 20.204090] [00000000006c8f34] pci_irq_get_affinity+0xf4/0x120 [ 20.204219] [000000000068c764] blk_mq_pci_map_queues+0x24/0x120 [ 20.204396] [00000000007162f4] scsi_map_queues+0x14/0x40 [ 20.204626] [0000000000673654] blk_mq_update_queue_map+0x94/0xe0 [ 20.204698] [0000000000676ce0] blk_mq_alloc_tag_set+0x120/0x300 [ 20.204869] [000000000071077c] scsi_add_host_with_dma+0x7c/0x300 [ 20.205419] [00000000100ead54] qla2x00_probe_one+0x19d4/0x2640 [qla2xxx] [ 20.205621] [00000000006b3c88] pci_device_probe+0xc8/0x160 [ 20.205697] [0000000000701c0c] really_probe+0x1ac/0x2e0 [ 20.205770] [0000000000701f90] driver_probe_device+0x50/0x100 [ 20.205843] [0000000000702134] __driver_attach+0xf4/0x120 [ 20.205913] [0000000000700644] bus_for_each_dev+0x44/0x80 [ 20.206081] [0000000000700c98] bus_add_driver+0x198/0x220 [ 20.206300] [0000000000702950] driver_register+0x70/0x120 [ 20.206582] [0000000010248224] qla2x00_module_init+0x224/0x284 [qla2xxx] [ 20.206857] ---[ end trace b1de7a3f79fab2c2 ]---
The fix is to check if the hardware does not have Multi Queue capabiltiy, use pci_alloc_irq_vectors() call instead of pci_alloc_irq_affinity().
Fixes: f664a3cc17b7d ("scsi: kill off the legacy IO path") Cc: stable@vger.kernel.org #4.19 Signed-off-by: Giridhar Malavali gmalavali@marvell.com Signed-off-by: Himanshu Madhani hmadhani@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/qla2xxx/qla_isr.c | 2 +- drivers/scsi/qla2xxx/qla_os.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/scsi/qla2xxx/qla_isr.c +++ b/drivers/scsi/qla2xxx/qla_isr.c @@ -3410,7 +3410,7 @@ qla24xx_enable_msix(struct qla_hw_data * min_vecs++; }
- if (USER_CTRL_IRQ(ha)) { + if (USER_CTRL_IRQ(ha) || !ha->mqiobase) { /* user wants to control IRQ setting for target mode */ ret = pci_alloc_irq_vectors(ha->pdev, min_vecs, ha->msix_count, PCI_IRQ_MSIX); --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -6936,7 +6936,7 @@ static int qla2xxx_map_queues(struct Scs scsi_qla_host_t *vha = (scsi_qla_host_t *)shost->hostdata; struct blk_mq_queue_map *qmap = &shost->tag_set.map[0];
- if (USER_CTRL_IRQ(vha->hw)) + if (USER_CTRL_IRQ(vha->hw) || !vha->hw->mqiobase) rc = blk_mq_map_queues(qmap); else rc = blk_mq_pci_map_queues(qmap, vha->hw->pdev, vha->irq_offset);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Quinn Tran qtran@marvell.com
commit 1560bafdff9ed54857ac3a03c4c8d8f10d791ba6 upstream.
This patch removes unnecessary code to handle RSCN, instead performs full scan everytime driver receives RSCN
Fixes: d4f7a16aeca6f ("scsi: qla2xxx: Remove ASYNC GIDPN switch command") Cc: stable@vger.kernel.org #4.19 Signed-off-by: Quinn Tran qtran@marvell.com Signed-off-by: Himanshu Madhani hmadhani@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/qla2xxx/qla_init.c | 86 ---------------------------------------- 1 file changed, 86 deletions(-)
--- a/drivers/scsi/qla2xxx/qla_init.c +++ b/drivers/scsi/qla2xxx/qla_init.c @@ -1474,29 +1474,6 @@ int qla24xx_fcport_handle_login(struct s return 0; }
-static -void qla24xx_handle_rscn_event(fc_port_t *fcport, struct event_arg *ea) -{ - fcport->rscn_gen++; - - ql_dbg(ql_dbg_disc, fcport->vha, 0x210c, - "%s %8phC DS %d LS %d\n", - __func__, fcport->port_name, fcport->disc_state, - fcport->fw_login_state); - - if (fcport->flags & FCF_ASYNC_SENT) - return; - - switch (fcport->disc_state) { - case DSC_DELETED: - case DSC_LOGIN_COMPLETE: - qla24xx_post_gpnid_work(fcport->vha, &ea->id); - break; - default: - break; - } -} - int qla24xx_post_newsess_work(struct scsi_qla_host *vha, port_id_t *id, u8 *port_name, u8 *node_name, void *pla, u8 fc4_type) { @@ -1563,8 +1540,6 @@ static void qla_handle_els_plogi_done(sc
void qla2x00_fcport_event_handler(scsi_qla_host_t *vha, struct event_arg *ea) { - fc_port_t *f, *tf; - uint32_t id = 0, mask, rid; fc_port_t *fcport;
switch (ea->event) { @@ -1577,10 +1552,6 @@ void qla2x00_fcport_event_handler(scsi_q case FCME_RSCN: if (test_bit(UNLOADING, &vha->dpc_flags)) return; - switch (ea->id.b.rsvd_1) { - case RSCN_PORT_ADDR: -#define BIGSCAN 1 -#if defined BIGSCAN & BIGSCAN > 0 { unsigned long flags; fcport = qla2x00_find_fcport_by_nportid @@ -1599,59 +1570,6 @@ void qla2x00_fcport_event_handler(scsi_q } spin_unlock_irqrestore(&vha->work_lock, flags); } -#else - { - int rc; - fcport = qla2x00_find_fcport_by_nportid(vha, &ea->id, 1); - if (!fcport) { - /* cable moved */ - rc = qla24xx_post_gpnid_work(vha, &ea->id); - if (rc) { - ql_log(ql_log_warn, vha, 0xd044, - "RSCN GPNID work failed %06x\n", - ea->id.b24); - } - } else { - ea->fcport = fcport; - fcport->scan_needed = 1; - qla24xx_handle_rscn_event(fcport, ea); - } - } -#endif - break; - case RSCN_AREA_ADDR: - case RSCN_DOM_ADDR: - if (ea->id.b.rsvd_1 == RSCN_AREA_ADDR) { - mask = 0xffff00; - ql_dbg(ql_dbg_async, vha, 0x5044, - "RSCN: Area 0x%06x was affected\n", - ea->id.b24); - } else { - mask = 0xff0000; - ql_dbg(ql_dbg_async, vha, 0x507a, - "RSCN: Domain 0x%06x was affected\n", - ea->id.b24); - } - - rid = ea->id.b24 & mask; - list_for_each_entry_safe(f, tf, &vha->vp_fcports, - list) { - id = f->d_id.b24 & mask; - if (rid == id) { - ea->fcport = f; - qla24xx_handle_rscn_event(f, ea); - } - } - break; - case RSCN_FAB_ADDR: - default: - ql_log(ql_log_warn, vha, 0xd045, - "RSCN: Fabric was affected. Addr format %d\n", - ea->id.b.rsvd_1); - qla2x00_mark_all_devices_lost(vha, 1); - set_bit(LOOP_RESYNC_NEEDED, &vha->dpc_flags); - set_bit(LOCAL_LOOP_UPDATE, &vha->dpc_flags); - } break; case FCME_GNL_DONE: qla24xx_handle_gnl_done_event(vha, ea); @@ -1712,11 +1630,7 @@ void qla_rscn_replay(fc_port_t *fcport) ea.event = FCME_RSCN; ea.id = fcport->d_id; ea.id.b.rsvd_1 = RSCN_PORT_ADDR; -#if defined BIGSCAN & BIGSCAN > 0 qla2x00_fcport_event_handler(fcport->vha, &ea); -#else - qla24xx_post_gpnid_work(fcport->vha, &ea.id); -#endif } }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Varad Gautam vrd@amazon.de
commit 73052b0daee0b750b39af18460dfec683e4f5887 upstream.
d_delete only unhashes an entry if it is reached with dentry->d_lockref.count != 1. Prior to commit 8ead9dd54716 ("devpts: more pty driver interface cleanups"), d_delete was called on a dentry from devpts_pty_kill with two references held, which would trigger the unhashing, and the subsequent dputs would release it.
Commit 8ead9dd54716 reworked devpts_pty_kill to stop acquiring the second reference from d_find_alias, and the d_delete call left the dentries still on the hashed list without actually ever being dropped from dcache before explicit cleanup. This causes the number of negative dentries for devpts to pile up, and an `ls /dev/pts` invocation can take seconds to return.
Provide always_delete_dentry() from simple_dentry_operations as .d_delete for devpts, to make the dentry be dropped from dcache.
Without this cleanup, the number of dentries in /dev/pts/ can be grown arbitrarily as:
`python -c 'import pty; pty.spawn(["ls", "/dev/pts"])'`
A systemtap probe on dcache_readdir to count d_subdirs shows this count to increase with each pty spawn invocation above:
probe kernel.function("dcache_readdir") { subdirs = &@cast($file->f_path->dentry, "dentry")->d_subdirs; p = subdirs; p = @cast(p, "list_head")->next; i = 0 while (p != subdirs) { p = @cast(p, "list_head")->next; i = i+1; } printf("number of dentries: %d\n", i); }
Fixes: 8ead9dd54716 ("devpts: more pty driver interface cleanups") Signed-off-by: Varad Gautam vrd@amazon.de Reported-by: Zheng Wang wanz@amazon.de Reported-by: Brandon Schwartz bsschwar@amazon.de Root-caused-by: Maximilian Heyne mheyne@amazon.de Root-caused-by: Nicolas Pernas Maradei npernas@amazon.de CC: David Woodhouse dwmw@amazon.co.uk CC: Maximilian Heyne mheyne@amazon.de CC: Stefan Nuernberger snu@amazon.de CC: Amit Shah aams@amazon.de CC: Linus Torvalds torvalds@linux-foundation.org CC: Greg Kroah-Hartman gregkh@linuxfoundation.org CC: Al Viro viro@ZenIV.linux.org.uk CC: Christian Brauner christian.brauner@ubuntu.com CC: Eric W. Biederman ebiederm@xmission.com CC: Matthew Wilcox willy@infradead.org CC: Eric Biggers ebiggers@google.com CC: stable@vger.kernel.org # 4.9+ Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/devpts/inode.c | 1 + 1 file changed, 1 insertion(+)
--- a/fs/devpts/inode.c +++ b/fs/devpts/inode.c @@ -455,6 +455,7 @@ devpts_fill_super(struct super_block *s, s->s_blocksize_bits = 10; s->s_magic = DEVPTS_SUPER_MAGIC; s->s_op = &devpts_sops; + s->s_d_op = &simple_dentry_operations; s->s_time_gran = 1;
error = -ENOMEM;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jann Horn jannh@google.com
commit a0ce2f0aa6ad97c3d4927bf2ca54bcebdf062d55 upstream.
Before this patch, it was possible for two pipes to affect each other after data had been transferred between them with tee():
============ $ cat tee_test.c
int main(void) { int pipe_a[2]; if (pipe(pipe_a)) err(1, "pipe"); int pipe_b[2]; if (pipe(pipe_b)) err(1, "pipe"); if (write(pipe_a[1], "abcd", 4) != 4) err(1, "write"); if (tee(pipe_a[0], pipe_b[1], 2, 0) != 2) err(1, "tee"); if (write(pipe_b[1], "xx", 2) != 2) err(1, "write");
char buf[5]; if (read(pipe_a[0], buf, 4) != 4) err(1, "read"); buf[4] = 0; printf("got back: '%s'\n", buf); } $ gcc -o tee_test tee_test.c $ ./tee_test got back: 'abxx' $ ============
As suggested by Al Viro, fix it by creating a separate type for non-mergeable pipe buffers, then changing the types of buffers in splice_pipe_to_pipe() and link_pipe().
Cc: stable@vger.kernel.org Fixes: 7c77f0b3f920 ("splice: implement pipe to pipe splicing") Fixes: 70524490ee2e ("[PATCH] splice: add support for sys_tee()") Suggested-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Jann Horn jannh@google.com Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/pipe.c | 14 ++++++++++++++ fs/splice.c | 4 ++++ include/linux/pipe_fs_i.h | 1 + 3 files changed, 19 insertions(+)
--- a/fs/pipe.c +++ b/fs/pipe.c @@ -234,6 +234,14 @@ static const struct pipe_buf_operations .get = generic_pipe_buf_get, };
+static const struct pipe_buf_operations anon_pipe_buf_nomerge_ops = { + .can_merge = 0, + .confirm = generic_pipe_buf_confirm, + .release = anon_pipe_buf_release, + .steal = anon_pipe_buf_steal, + .get = generic_pipe_buf_get, +}; + static const struct pipe_buf_operations packet_pipe_buf_ops = { .can_merge = 0, .confirm = generic_pipe_buf_confirm, @@ -242,6 +250,12 @@ static const struct pipe_buf_operations .get = generic_pipe_buf_get, };
+void pipe_buf_mark_unmergeable(struct pipe_buffer *buf) +{ + if (buf->ops == &anon_pipe_buf_ops) + buf->ops = &anon_pipe_buf_nomerge_ops; +} + static ssize_t pipe_read(struct kiocb *iocb, struct iov_iter *to) { --- a/fs/splice.c +++ b/fs/splice.c @@ -1597,6 +1597,8 @@ retry: */ obuf->flags &= ~PIPE_BUF_FLAG_GIFT;
+ pipe_buf_mark_unmergeable(obuf); + obuf->len = len; opipe->nrbufs++; ibuf->offset += obuf->len; @@ -1671,6 +1673,8 @@ static int link_pipe(struct pipe_inode_i */ obuf->flags &= ~PIPE_BUF_FLAG_GIFT;
+ pipe_buf_mark_unmergeable(obuf); + if (obuf->len > len) obuf->len = len;
--- a/include/linux/pipe_fs_i.h +++ b/include/linux/pipe_fs_i.h @@ -182,6 +182,7 @@ void generic_pipe_buf_get(struct pipe_in int generic_pipe_buf_confirm(struct pipe_inode_info *, struct pipe_buffer *); int generic_pipe_buf_steal(struct pipe_inode_info *, struct pipe_buffer *); void generic_pipe_buf_release(struct pipe_inode_info *, struct pipe_buffer *); +void pipe_buf_mark_unmergeable(struct pipe_buffer *buf);
extern const struct pipe_buf_operations nosteal_pipe_buf_ops;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vivek Goyal vgoyal@redhat.com
commit 5f32879ea35523b9842bdbdc0065e13635caada2 upstream.
If a file with capability set (and hence security.capability xattr) is written kernel clears security.capability xattr. For overlay, during file copy up if xattrs are copied up first and then data is, copied up. This means data copy up will result in clearing of security.capability xattr file on lower has. And this can result into surprises. If a lower file has CAP_SETUID, then it should not be cleared over copy up (if nothing was actually written to file).
This also creates problems with chown logic where it first copies up file and then tries to clear setuid bit. But by that time security.capability xattr is already gone (due to data copy up), and caller gets -ENODATA. This has been reported by Giuseppe here.
https://github.com/containers/libpod/issues/2015#issuecomment-447824842
Fix this by copying up data first and then metadta. This is a regression which has been introduced by my commit as part of metadata only copy up patches.
TODO: There will be some corner cases where a file is copied up metadata only and later data copy up happens and that will clear security.capability xattr. Something needs to be done about that too.
Fixes: bd64e57586d3 ("ovl: During copy up, first copy up metadata and then data") Cc: stable@vger.kernel.org # v4.19+ Reported-by: Giuseppe Scrivano gscrivan@redhat.com Signed-off-by: Vivek Goyal vgoyal@redhat.com Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/overlayfs/copy_up.c | 31 ++++++++++++++++++------------- 1 file changed, 18 insertions(+), 13 deletions(-)
--- a/fs/overlayfs/copy_up.c +++ b/fs/overlayfs/copy_up.c @@ -443,6 +443,24 @@ static int ovl_copy_up_inode(struct ovl_ { int err;
+ /* + * Copy up data first and then xattrs. Writing data after + * xattrs will remove security.capability xattr automatically. + */ + if (S_ISREG(c->stat.mode) && !c->metacopy) { + struct path upperpath, datapath; + + ovl_path_upper(c->dentry, &upperpath); + if (WARN_ON(upperpath.dentry != NULL)) + return -EIO; + upperpath.dentry = temp; + + ovl_path_lowerdata(c->dentry, &datapath); + err = ovl_copy_up_data(&datapath, &upperpath, c->stat.size); + if (err) + return err; + } + err = ovl_copy_xattr(c->lowerpath.dentry, temp); if (err) return err; @@ -459,19 +477,6 @@ static int ovl_copy_up_inode(struct ovl_ if (err) return err; } - - if (S_ISREG(c->stat.mode) && !c->metacopy) { - struct path upperpath, datapath; - - ovl_path_upper(c->dentry, &upperpath); - BUG_ON(upperpath.dentry != NULL); - upperpath.dentry = temp; - - ovl_path_lowerdata(c->dentry, &datapath); - err = ovl_copy_up_data(&datapath, &upperpath, c->stat.size); - if (err) - return err; - }
if (c->metacopy) { err = ovl_check_setxattr(c->dentry, temp, OVL_XATTR_METACOPY,
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vivek Goyal vgoyal@redhat.com
commit 993a0b2aec52754f0897b1dab4c453be8217cae5 upstream.
If a file has been copied up metadata only, and later data is copied up, upper loses any security.capability xattr it has (underlying filesystem clears it as upon file write).
From a user's point of view, this is just a file copy-up and that should
not result in losing security.capability xattr. Hence, before data copy up, save security.capability xattr (if any) and restore it on upper after data copy up is complete.
Signed-off-by: Vivek Goyal vgoyal@redhat.com Reviewed-by: Amir Goldstein amir73il@gmail.com Fixes: 0c2888749363 ("ovl: A new xattr OVL_XATTR_METACOPY for file on upper") Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/overlayfs/copy_up.c | 28 ++++++++++++++++++++++- fs/overlayfs/overlayfs.h | 2 + fs/overlayfs/util.c | 55 +++++++++++++++++++++++++++++------------------ 3 files changed, 63 insertions(+), 22 deletions(-)
--- a/fs/overlayfs/copy_up.c +++ b/fs/overlayfs/copy_up.c @@ -742,6 +742,8 @@ static int ovl_copy_up_meta_inode_data(s { struct path upperpath, datapath; int err; + char *capability = NULL; + ssize_t uninitialized_var(cap_size);
ovl_path_upper(c->dentry, &upperpath); if (WARN_ON(upperpath.dentry == NULL)) @@ -751,15 +753,37 @@ static int ovl_copy_up_meta_inode_data(s if (WARN_ON(datapath.dentry == NULL)) return -EIO;
+ if (c->stat.size) { + err = cap_size = ovl_getxattr(upperpath.dentry, XATTR_NAME_CAPS, + &capability, 0); + if (err < 0 && err != -ENODATA) + goto out; + } + err = ovl_copy_up_data(&datapath, &upperpath, c->stat.size); if (err) - return err; + goto out_free; + + /* + * Writing to upper file will clear security.capability xattr. We + * don't want that to happen for normal copy-up operation. + */ + if (capability) { + err = ovl_do_setxattr(upperpath.dentry, XATTR_NAME_CAPS, + capability, cap_size, 0); + if (err) + goto out_free; + } +
err = vfs_removexattr(upperpath.dentry, OVL_XATTR_METACOPY); if (err) - return err; + goto out_free;
ovl_set_upperdata(d_inode(c->dentry)); +out_free: + kfree(capability); +out: return err; }
--- a/fs/overlayfs/overlayfs.h +++ b/fs/overlayfs/overlayfs.h @@ -277,6 +277,8 @@ int ovl_lock_rename_workdir(struct dentr int ovl_check_metacopy_xattr(struct dentry *dentry); bool ovl_is_metacopy_dentry(struct dentry *dentry); char *ovl_get_redirect_xattr(struct dentry *dentry, int padding); +ssize_t ovl_getxattr(struct dentry *dentry, char *name, char **value, + size_t padding);
static inline bool ovl_is_impuredir(struct dentry *dentry) { --- a/fs/overlayfs/util.c +++ b/fs/overlayfs/util.c @@ -863,28 +863,49 @@ bool ovl_is_metacopy_dentry(struct dentr return (oe->numlower > 1); }
-char *ovl_get_redirect_xattr(struct dentry *dentry, int padding) +ssize_t ovl_getxattr(struct dentry *dentry, char *name, char **value, + size_t padding) { - int res; - char *s, *next, *buf = NULL; + ssize_t res; + char *buf = NULL;
- res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, NULL, 0); + res = vfs_getxattr(dentry, name, NULL, 0); if (res < 0) { if (res == -ENODATA || res == -EOPNOTSUPP) - return NULL; + return -ENODATA; goto fail; }
- buf = kzalloc(res + padding + 1, GFP_KERNEL); - if (!buf) - return ERR_PTR(-ENOMEM); + if (res != 0) { + buf = kzalloc(res + padding, GFP_KERNEL); + if (!buf) + return -ENOMEM; + + res = vfs_getxattr(dentry, name, buf, res); + if (res < 0) + goto fail; + } + *value = buf;
- if (res == 0) - goto invalid; + return res; + +fail: + pr_warn_ratelimited("overlayfs: failed to get xattr %s: err=%zi)\n", + name, res); + kfree(buf); + return res; +} + +char *ovl_get_redirect_xattr(struct dentry *dentry, int padding) +{ + int res; + char *s, *next, *buf = NULL;
- res = vfs_getxattr(dentry, OVL_XATTR_REDIRECT, buf, res); + res = ovl_getxattr(dentry, OVL_XATTR_REDIRECT, &buf, padding + 1); + if (res == -ENODATA) + return NULL; if (res < 0) - goto fail; + return ERR_PTR(res); if (res == 0) goto invalid;
@@ -900,15 +921,9 @@ char *ovl_get_redirect_xattr(struct dent }
return buf; - -err_free: - kfree(buf); - return ERR_PTR(res); -fail: - pr_warn_ratelimited("overlayfs: failed to get redirect (%i)\n", res); - goto err_free; invalid: pr_warn_ratelimited("overlayfs: invalid redirect (%s)\n", buf); res = -EINVAL; - goto err_free; + kfree(buf); + return ERR_PTR(res); }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Finn Thain fthain@telegraphics.com.au
commit 28713169d879b67be2ef2f84dcf54905de238294 upstream.
This patch fixes a build failure when using GCC 8.1:
/usr/bin/ld: block/partitions/ldm.o: in function `ldm_parse_tocblock': block/partitions/ldm.c:153: undefined reference to `strcmp'
This is caused by a new optimization which effectively replaces a strncmp() call with a strcmp() call. This affects a number of strncmp() call sites in the kernel.
The entire class of optimizations is avoided with -fno-builtin, which gets enabled by -ffreestanding. This may avoid possible future build failures in case new optimizations appear in future compilers.
I haven't done any performance measurements with this patch but I did count the function calls in a defconfig build. For example, there are now 23 more sprintf() calls and 39 fewer strcpy() calls. The effect on the other libc functions is smaller.
If this harms performance we can tackle that regression by optimizing the call sites, ideally using semantic patches. That way, clang and ICC builds might benfit too.
Cc: stable@vger.kernel.org Reference: https://marc.info/?l=linux-m68k&m=154514816222244&w=2 Signed-off-by: Finn Thain fthain@telegraphics.com.au Signed-off-by: Geert Uytterhoeven geert@linux-m68k.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/m68k/Makefile | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/arch/m68k/Makefile +++ b/arch/m68k/Makefile @@ -58,7 +58,10 @@ cpuflags-$(CONFIG_M5206e) := $(call cc-o cpuflags-$(CONFIG_M5206) := $(call cc-option,-mcpu=5206,-m5200)
KBUILD_AFLAGS += $(cpuflags-y) -KBUILD_CFLAGS += $(cpuflags-y) -pipe +KBUILD_CFLAGS += $(cpuflags-y) + +KBUILD_CFLAGS += -pipe -ffreestanding + ifdef CONFIG_MMU # without -fno-strength-reduce the 53c7xx.c driver fails ;-( KBUILD_CFLAGS += -fno-strength-reduce -ffixed-a2
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
commit b89f6d1fcb30a8cbdc18ce00c7d93792076af453 upstream.
We are holding a transaction handle when creating a tree, therefore we can not allocate the root using GFP_KERNEL, as we could deadlock if reclaim is triggered by the allocation, therefore setup a nofs context.
Fixes: 74e4d82757f74 ("btrfs: let callers of btrfs_alloc_root pass gfp flags") CC: stable@vger.kernel.org # 4.9+ Reviewed-by: Nikolay Borisov nborisov@suse.com Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/disk-io.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -17,6 +17,7 @@ #include <linux/semaphore.h> #include <linux/error-injection.h> #include <linux/crc32c.h> +#include <linux/sched/mm.h> #include <asm/unaligned.h> #include "ctree.h" #include "disk-io.h" @@ -1258,10 +1259,17 @@ struct btrfs_root *btrfs_create_tree(str struct btrfs_root *tree_root = fs_info->tree_root; struct btrfs_root *root; struct btrfs_key key; + unsigned int nofs_flag; int ret = 0; uuid_le uuid = NULL_UUID_LE;
+ /* + * We're holding a transaction handle, so use a NOFS memory allocation + * context to avoid deadlock if reclaim happens. + */ + nofs_flag = memalloc_nofs_save(); root = btrfs_alloc_root(fs_info, GFP_KERNEL); + memalloc_nofs_restore(nofs_flag); if (!root) return ERR_PTR(-ENOMEM);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
commit a0873490660246db587849a9e172f2b7b21fa88a upstream.
We are holding a transaction handle when setting an acl, therefore we can not allocate the xattr value buffer using GFP_KERNEL, as we could deadlock if reclaim is triggered by the allocation, therefore setup a nofs context.
Fixes: 39a27ec1004e8 ("btrfs: use GFP_KERNEL for xattr and acl allocations") CC: stable@vger.kernel.org # 4.9+ Reviewed-by: Nikolay Borisov nborisov@suse.com Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/acl.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/fs/btrfs/acl.c +++ b/fs/btrfs/acl.c @@ -9,6 +9,7 @@ #include <linux/posix_acl_xattr.h> #include <linux/posix_acl.h> #include <linux/sched.h> +#include <linux/sched/mm.h> #include <linux/slab.h>
#include "ctree.h" @@ -72,8 +73,16 @@ static int __btrfs_set_acl(struct btrfs_ }
if (acl) { + unsigned int nofs_flag; + size = posix_acl_xattr_size(acl->a_count); + /* + * We're holding a transaction handle, so use a NOFS memory + * allocation context to avoid deadlock if reclaim happens. + */ + nofs_flag = memalloc_nofs_save(); value = kmalloc(size, GFP_KERNEL); + memalloc_nofs_restore(nofs_flag); if (!value) { ret = -ENOMEM; goto out;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Anand Jain anand.jain@oracle.com
commit 1cec3f27168d7835ff3d23ab371cd548440131bb upstream.
This fixes a longstanding lockdep warning triggered by fstests/btrfs/011.
Circular locking dependency check reports warning[1], that's because the btrfs_scrub_dev() calls the stack #0 below with, the fs_info::scrub_lock held. The test case leading to this warning:
$ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /btrfs $ btrfs scrub start -B /btrfs
In fact we have fs_info::scrub_workers_refcnt to track if the init and destroy of the scrub workers are needed. So once we have incremented and decremented the fs_info::scrub_workers_refcnt value in the thread, its ok to drop the scrub_lock, and then actually do the btrfs_destroy_workqueue() part. So this patch drops the scrub_lock before calling btrfs_destroy_workqueue().
[359.258534] ====================================================== [359.260305] WARNING: possible circular locking dependency detected [359.261938] 5.0.0-rc6-default #461 Not tainted [359.263135] ------------------------------------------------------ [359.264672] btrfs/20975 is trying to acquire lock: [359.265927] 00000000d4d32bea ((wq_completion)"%s-%s""btrfs", name){+.+.}, at: flush_workqueue+0x87/0x540 [359.268416] [359.268416] but task is already holding lock: [359.270061] 0000000053ea26a6 (&fs_info->scrub_lock){+.+.}, at: btrfs_scrub_dev+0x322/0x590 [btrfs] [359.272418] [359.272418] which lock already depends on the new lock. [359.272418] [359.274692] [359.274692] the existing dependency chain (in reverse order) is: [359.276671] [359.276671] -> #3 (&fs_info->scrub_lock){+.+.}: [359.278187] __mutex_lock+0x86/0x9c0 [359.279086] btrfs_scrub_pause+0x31/0x100 [btrfs] [359.280421] btrfs_commit_transaction+0x1e4/0x9e0 [btrfs] [359.281931] close_ctree+0x30b/0x350 [btrfs] [359.283208] generic_shutdown_super+0x64/0x100 [359.284516] kill_anon_super+0x14/0x30 [359.285658] btrfs_kill_super+0x12/0xa0 [btrfs] [359.286964] deactivate_locked_super+0x29/0x60 [359.288242] cleanup_mnt+0x3b/0x70 [359.289310] task_work_run+0x98/0xc0 [359.290428] exit_to_usermode_loop+0x83/0x90 [359.291445] do_syscall_64+0x15b/0x180 [359.292598] entry_SYSCALL_64_after_hwframe+0x49/0xbe [359.294011] [359.294011] -> #2 (sb_internal#2){.+.+}: [359.295432] __sb_start_write+0x113/0x1d0 [359.296394] start_transaction+0x369/0x500 [btrfs] [359.297471] btrfs_finish_ordered_io+0x2aa/0x7c0 [btrfs] [359.298629] normal_work_helper+0xcd/0x530 [btrfs] [359.299698] process_one_work+0x246/0x610 [359.300898] worker_thread+0x3c/0x390 [359.302020] kthread+0x116/0x130 [359.303053] ret_from_fork+0x24/0x30 [359.304152] [359.304152] -> #1 ((work_completion)(&work->normal_work)){+.+.}: [359.306100] process_one_work+0x21f/0x610 [359.307302] worker_thread+0x3c/0x390 [359.308465] kthread+0x116/0x130 [359.309357] ret_from_fork+0x24/0x30 [359.310229] [359.310229] -> #0 ((wq_completion)"%s-%s""btrfs", name){+.+.}: [359.311812] lock_acquire+0x90/0x180 [359.312929] flush_workqueue+0xaa/0x540 [359.313845] drain_workqueue+0xa1/0x180 [359.314761] destroy_workqueue+0x17/0x240 [359.315754] btrfs_destroy_workqueue+0x57/0x200 [btrfs] [359.317245] scrub_workers_put+0x2c/0x60 [btrfs] [359.318585] btrfs_scrub_dev+0x336/0x590 [btrfs] [359.319944] btrfs_dev_replace_by_ioctl.cold.19+0x179/0x1bb [btrfs] [359.321622] btrfs_ioctl+0x28a4/0x2e40 [btrfs] [359.322908] do_vfs_ioctl+0xa2/0x6d0 [359.324021] ksys_ioctl+0x3a/0x70 [359.325066] __x64_sys_ioctl+0x16/0x20 [359.326236] do_syscall_64+0x54/0x180 [359.327379] entry_SYSCALL_64_after_hwframe+0x49/0xbe [359.328772] [359.328772] other info that might help us debug this: [359.328772] [359.330990] Chain exists of: [359.330990] (wq_completion)"%s-%s""btrfs", name --> sb_internal#2 --> &fs_info->scrub_lock [359.330990] [359.334376] Possible unsafe locking scenario: [359.334376] [359.336020] CPU0 CPU1 [359.337070] ---- ---- [359.337821] lock(&fs_info->scrub_lock); [359.338506] lock(sb_internal#2); [359.339506] lock(&fs_info->scrub_lock); [359.341461] lock((wq_completion)"%s-%s""btrfs", name); [359.342437] [359.342437] *** DEADLOCK *** [359.342437] [359.343745] 1 lock held by btrfs/20975: [359.344788] #0: 0000000053ea26a6 (&fs_info->scrub_lock){+.+.}, at: btrfs_scrub_dev+0x322/0x590 [btrfs] [359.346778] [359.346778] stack backtrace: [359.347897] CPU: 0 PID: 20975 Comm: btrfs Not tainted 5.0.0-rc6-default #461 [359.348983] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626cc-prebuilt.qemu-project.org 04/01/2014 [359.350501] Call Trace: [359.350931] dump_stack+0x67/0x90 [359.351676] print_circular_bug.isra.37.cold.56+0x15c/0x195 [359.353569] check_prev_add.constprop.44+0x4f9/0x750 [359.354849] ? check_prev_add.constprop.44+0x286/0x750 [359.356505] __lock_acquire+0xb84/0xf10 [359.357505] lock_acquire+0x90/0x180 [359.358271] ? flush_workqueue+0x87/0x540 [359.359098] flush_workqueue+0xaa/0x540 [359.359912] ? flush_workqueue+0x87/0x540 [359.360740] ? drain_workqueue+0x1e/0x180 [359.361565] ? drain_workqueue+0xa1/0x180 [359.362391] drain_workqueue+0xa1/0x180 [359.363193] destroy_workqueue+0x17/0x240 [359.364539] btrfs_destroy_workqueue+0x57/0x200 [btrfs] [359.365673] scrub_workers_put+0x2c/0x60 [btrfs] [359.366618] btrfs_scrub_dev+0x336/0x590 [btrfs] [359.367594] ? start_transaction+0xa1/0x500 [btrfs] [359.368679] btrfs_dev_replace_by_ioctl.cold.19+0x179/0x1bb [btrfs] [359.369545] btrfs_ioctl+0x28a4/0x2e40 [btrfs] [359.370186] ? __lock_acquire+0x263/0xf10 [359.370777] ? kvm_clock_read+0x14/0x30 [359.371392] ? kvm_sched_clock_read+0x5/0x10 [359.372248] ? sched_clock+0x5/0x10 [359.372786] ? sched_clock_cpu+0xc/0xc0 [359.373662] ? do_vfs_ioctl+0xa2/0x6d0 [359.374552] do_vfs_ioctl+0xa2/0x6d0 [359.375378] ? do_sigaction+0xff/0x250 [359.376233] ksys_ioctl+0x3a/0x70 [359.376954] __x64_sys_ioctl+0x16/0x20 [359.377772] do_syscall_64+0x54/0x180 [359.378841] entry_SYSCALL_64_after_hwframe+0x49/0xbe [359.380422] RIP: 0033:0x7f5429296a97
Backporting to older kernels: scrub_nocow_workers must be freed the same way as the others.
CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Anand Jain anand.jain@oracle.com [ update changelog ] Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/scrub.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-)
--- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -3770,16 +3770,6 @@ fail_scrub_workers: return -ENOMEM; }
-static noinline_for_stack void scrub_workers_put(struct btrfs_fs_info *fs_info) -{ - if (--fs_info->scrub_workers_refcnt == 0) { - btrfs_destroy_workqueue(fs_info->scrub_workers); - btrfs_destroy_workqueue(fs_info->scrub_wr_completion_workers); - btrfs_destroy_workqueue(fs_info->scrub_parity_workers); - } - WARN_ON(fs_info->scrub_workers_refcnt < 0); -} - int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start, u64 end, struct btrfs_scrub_progress *progress, int readonly, int is_dev_replace) @@ -3788,6 +3778,9 @@ int btrfs_scrub_dev(struct btrfs_fs_info int ret; struct btrfs_device *dev; unsigned int nofs_flag; + struct btrfs_workqueue *scrub_workers = NULL; + struct btrfs_workqueue *scrub_wr_comp = NULL; + struct btrfs_workqueue *scrub_parity = NULL;
if (btrfs_fs_closing(fs_info)) return -EINVAL; @@ -3927,9 +3920,16 @@ int btrfs_scrub_dev(struct btrfs_fs_info
mutex_lock(&fs_info->scrub_lock); dev->scrub_ctx = NULL; - scrub_workers_put(fs_info); + if (--fs_info->scrub_workers_refcnt == 0) { + scrub_workers = fs_info->scrub_workers; + scrub_wr_comp = fs_info->scrub_wr_completion_workers; + scrub_parity = fs_info->scrub_parity_workers; + } mutex_unlock(&fs_info->scrub_lock);
+ btrfs_destroy_workqueue(scrub_workers); + btrfs_destroy_workqueue(scrub_wr_comp); + btrfs_destroy_workqueue(scrub_parity); scrub_put_ctx(sctx);
return ret;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@oracle.com
commit 669e859b5ea7c6f4fce0149d3907c64e550c294b upstream.
We should drop the lock on this error path. This has been found by a static tool.
The lock needs to be released, it's there to protect access to the dev_replace members and is not supposed to be left locked. The value of state that's being switched would need to be artifically changed to an invalid value so the default: branch is taken.
Fixes: d189dd70e255 ("btrfs: fix use-after-free due to race between replace start and cancel") CC: stable@vger.kernel.org # 5.0+ Reviewed-by: Anand Jain anand.jain@oracle.com Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/dev-replace.c | 1 + 1 file changed, 1 insertion(+)
--- a/fs/btrfs/dev-replace.c +++ b/fs/btrfs/dev-replace.c @@ -862,6 +862,7 @@ int btrfs_dev_replace_cancel(struct btrf btrfs_destroy_dev_replace_tgtdev(tgt_device); break; default: + up_write(&dev_replace->rwsem); result = -EINVAL; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johannes Thumshirn jthumshirn@suse.de
commit 349ae63f40638a28c6fce52e8447c2d14b84cc0c upstream.
We recently had a customer issue with a corrupted filesystem. When trying to mount this image btrfs panicked with a division by zero in calc_stripe_length().
The corrupt chunk had a 'num_stripes' value of 1. calc_stripe_length() takes this value and divides it by the number of copies the RAID profile is expected to have to calculate the amount of data stripes. As a DUP profile is expected to have 2 copies this division resulted in 1/2 = 0. Later then the 'data_stripes' variable is used as a divisor in the stripe length calculation which results in a division by 0 and thus a kernel panic.
When encountering a filesystem with a DUP block group and a 'num_stripes' value unequal to 2, refuse mounting as the image is corrupted and will lead to unexpected behaviour.
Code inspection showed a RAID1 block group has the same issues.
Fixes: e06cd3dd7cea ("Btrfs: add validadtion checks for chunk loading") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Qu Wenruo wqu@suse.com Reviewed-by: Nikolay Borisov nborisov@suse.com Signed-off-by: Johannes Thumshirn jthumshirn@suse.de Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/volumes.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -6782,10 +6782,10 @@ static int btrfs_check_chunk_valid(struc }
if ((type & BTRFS_BLOCK_GROUP_RAID10 && sub_stripes != 2) || - (type & BTRFS_BLOCK_GROUP_RAID1 && num_stripes < 1) || + (type & BTRFS_BLOCK_GROUP_RAID1 && num_stripes != 2) || (type & BTRFS_BLOCK_GROUP_RAID5 && num_stripes < 2) || (type & BTRFS_BLOCK_GROUP_RAID6 && num_stripes < 3) || - (type & BTRFS_BLOCK_GROUP_DUP && num_stripes > 2) || + (type & BTRFS_BLOCK_GROUP_DUP && num_stripes != 2) || ((type & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0 && num_stripes != 1)) { btrfs_err(fs_info,
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Robertson dan@dlrobertson.com
commit e49be14b8d80e23bb7c53d78c21717a474ade76b upstream.
The scrub_ctx csum_list member must be initialized before scrub_free_ctx is called. If the csum_list is not initialized beforehand, the list_empty call in scrub_free_csums will result in a null deref if the allocation fails in the for loop.
Fixes: a2de733c78fa ("btrfs: scrub") CC: stable@vger.kernel.org # 3.0+ Reviewed-by: Nikolay Borisov nborisov@suse.com Signed-off-by: Dan Robertson dan@dlrobertson.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/scrub.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -584,6 +584,7 @@ static noinline_for_stack struct scrub_c sctx->pages_per_rd_bio = SCRUB_PAGES_PER_RD_BIO; sctx->curr = -1; sctx->fs_info = fs_info; + INIT_LIST_HEAD(&sctx->csum_list); for (i = 0; i < SCRUB_BIOS_PER_SCTX; ++i) { struct scrub_bio *sbio;
@@ -608,7 +609,6 @@ static noinline_for_stack struct scrub_c atomic_set(&sctx->workers_pending, 0); atomic_set(&sctx->cancel_req, 0); sctx->csum_size = btrfs_super_csum_size(fs_info->super_copy); - INIT_LIST_HEAD(&sctx->csum_list);
spin_lock_init(&sctx->list_lock); spin_lock_init(&sctx->stat_lock);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
commit 8e928218780e2f1cf2f5891c7575e8f0b284fcce upstream.
In the past we had data corruption when reading compressed extents that are shared within the same file and they are consecutive, this got fixed by commit 005efedf2c7d0 ("Btrfs: fix read corruption of compressed and shared extents") and by commit 808f80b46790f ("Btrfs: update fix for read corruption of compressed and shared extents"). However there was a case that was missing in those fixes, which is when the shared and compressed extents are referenced with a non-zero offset. The following shell script creates a reproducer for this issue:
#!/bin/bash
mkfs.btrfs -f /dev/sdc &> /dev/null mount -o compress /dev/sdc /mnt/sdc
# Create a file with 3 consecutive compressed extents, each has an # uncompressed size of 128Kb and a compressed size of 4Kb. for ((i = 1; i <= 3; i++)); do head -c 4096 /dev/zero for ((j = 1; j <= 31; j++)); do head -c 4096 /dev/zero | tr '\0' "\377" done done > /mnt/sdc/foobar sync
echo "Digest after file creation: $(md5sum /mnt/sdc/foobar)"
# Clone the first extent into offsets 128K and 256K. xfs_io -c "reflink /mnt/sdc/foobar 0 128K 128K" /mnt/sdc/foobar xfs_io -c "reflink /mnt/sdc/foobar 0 256K 128K" /mnt/sdc/foobar sync
echo "Digest after cloning: $(md5sum /mnt/sdc/foobar)"
# Punch holes into the regions that are already full of zeroes. xfs_io -c "fpunch 0 4K" /mnt/sdc/foobar xfs_io -c "fpunch 128K 4K" /mnt/sdc/foobar xfs_io -c "fpunch 256K 4K" /mnt/sdc/foobar sync
echo "Digest after hole punching: $(md5sum /mnt/sdc/foobar)"
echo "Dropping page cache..." sysctl -q vm.drop_caches=1 echo "Digest after hole punching: $(md5sum /mnt/sdc/foobar)"
umount /dev/sdc
When running the script we get the following output:
Digest after file creation: 5a0888d80d7ab1fd31c229f83a3bbcc8 /mnt/sdc/foobar linked 131072/131072 bytes at offset 131072 128 KiB, 1 ops; 0.0033 sec (36.960 MiB/sec and 295.6830 ops/sec) linked 131072/131072 bytes at offset 262144 128 KiB, 1 ops; 0.0015 sec (78.567 MiB/sec and 628.5355 ops/sec) Digest after cloning: 5a0888d80d7ab1fd31c229f83a3bbcc8 /mnt/sdc/foobar Digest after hole punching: 5a0888d80d7ab1fd31c229f83a3bbcc8 /mnt/sdc/foobar Dropping page cache... Digest after hole punching: fba694ae8664ed0c2e9ff8937e7f1484 /mnt/sdc/foobar
This happens because after reading all the pages of the extent in the range from 128K to 256K for example, we read the hole at offset 256K and then when reading the page at offset 260K we don't submit the existing bio, which is responsible for filling all the page in the range 128K to 256K only, therefore adding the pages from range 260K to 384K to the existing bio and submitting it after iterating over the entire range. Once the bio completes, the uncompressed data fills only the pages in the range 128K to 256K because there's no more data read from disk, leaving the pages in the range 260K to 384K unfilled. It is just a slightly different variant of what was solved by commit 005efedf2c7d0 ("Btrfs: fix read corruption of compressed and shared extents").
Fix this by forcing a bio submit, during readpages(), whenever we find a compressed extent map for a page that is different from the extent map for the previous page or has a different starting offset (in case it's the same compressed extent), instead of the extent map's original start offset.
A test case for fstests follows soon.
Reported-by: Zygo Blaxell ce3g8jdj@umail.furryterror.org Fixes: 808f80b46790f ("Btrfs: update fix for read corruption of compressed and shared extents") Fixes: 005efedf2c7d0 ("Btrfs: fix read corruption of compressed and shared extents") Cc: stable@vger.kernel.org # 4.3+ Tested-by: Zygo Blaxell ce3g8jdj@umail.furryterror.org Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/extent_io.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -2985,11 +2985,11 @@ static int __do_readpage(struct extent_i */ if (test_bit(EXTENT_FLAG_COMPRESSED, &em->flags) && prev_em_start && *prev_em_start != (u64)-1 && - *prev_em_start != em->orig_start) + *prev_em_start != em->start) force_bio_submit = true;
if (prev_em_start) - *prev_em_start = em->orig_start; + *prev_em_start = em->start;
free_extent_map(em); em = NULL;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
commit 4ea748e1d2c9f8a27332b949e8210dbbf392987e upstream.
Reflinking (clone/dedupe) and rename are operations that operate on two inodes and therefore need to lock them in the same order to avoid ABBA deadlocks. It happens that Btrfs' reflink implementation always locked them in a different order from VFS's lock_two_nondirectories() helper, which is used by the rename code in VFS, resulting in ABBA type deadlocks.
Btrfs' locking order:
static void btrfs_double_inode_lock(struct inode *inode1, struct inode *inode2) { if (inode1 < inode2) swap(inode1, inode2);
inode_lock_nested(inode1, I_MUTEX_PARENT); inode_lock_nested(inode2, I_MUTEX_CHILD); }
VFS's locking order:
void lock_two_nondirectories(struct inode *inode1, struct inode *inode2) { if (inode1 > inode2) swap(inode1, inode2);
if (inode1 && !S_ISDIR(inode1->i_mode)) inode_lock(inode1); if (inode2 && !S_ISDIR(inode2->i_mode) && inode2 != inode1) inode_lock_nested(inode2, I_MUTEX_NONDIR2); }
Fix this by killing the btrfs helper function that does the double inode locking and replace it with VFS's helper lock_two_nondirectories().
Reported-by: Zygo Blaxell ce3g8jdj@umail.furryterror.org Fixes: 416161db9b63e3 ("btrfs: offline dedupe") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/ioctl.c | 21 +++------------------ 1 file changed, 3 insertions(+), 18 deletions(-)
--- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -3206,21 +3206,6 @@ out: return ret; }
-static void btrfs_double_inode_unlock(struct inode *inode1, struct inode *inode2) -{ - inode_unlock(inode1); - inode_unlock(inode2); -} - -static void btrfs_double_inode_lock(struct inode *inode1, struct inode *inode2) -{ - if (inode1 < inode2) - swap(inode1, inode2); - - inode_lock_nested(inode1, I_MUTEX_PARENT); - inode_lock_nested(inode2, I_MUTEX_CHILD); -} - static void btrfs_double_extent_unlock(struct inode *inode1, u64 loff1, struct inode *inode2, u64 loff2, u64 len) { @@ -3989,7 +3974,7 @@ static int btrfs_remap_file_range_prep(s if (same_inode) inode_lock(inode_in); else - btrfs_double_inode_lock(inode_in, inode_out); + lock_two_nondirectories(inode_in, inode_out);
/* * Now that the inodes are locked, we need to start writeback ourselves @@ -4039,7 +4024,7 @@ static int btrfs_remap_file_range_prep(s if (same_inode) inode_unlock(inode_in); else - btrfs_double_inode_unlock(inode_in, inode_out); + unlock_two_nondirectories(inode_in, inode_out);
return ret; } @@ -4069,7 +4054,7 @@ loff_t btrfs_remap_file_range(struct fil if (same_inode) inode_unlock(src_inode); else - btrfs_double_inode_unlock(src_inode, dst_inode); + unlock_two_nondirectories(src_inode, dst_inode);
return ret < 0 ? ret : len; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stephen Boyd swboyd@chromium.org
commit baef1c90aac7e5bf13f0360a3b334825a23d31a1 upstream.
Using the batch API from the interconnect driver sometimes leads to a KASAN error due to an access to freed memory. This is easier to trigger with threadirqs on the kernel commandline.
BUG: KASAN: use-after-free in rpmh_tx_done+0x114/0x12c Read of size 1 at addr fffffff51414ad84 by task irq/110-apps_rs/57
CPU: 0 PID: 57 Comm: irq/110-apps_rs Tainted: G W 4.19.10 #72 Call trace: dump_backtrace+0x0/0x2f8 show_stack+0x20/0x2c __dump_stack+0x20/0x28 dump_stack+0xcc/0x10c print_address_description+0x74/0x240 kasan_report+0x250/0x26c __asan_report_load1_noabort+0x20/0x2c rpmh_tx_done+0x114/0x12c tcs_tx_done+0x450/0x768 irq_forced_thread_fn+0x58/0x9c irq_thread+0x120/0x1dc kthread+0x248/0x260 ret_from_fork+0x10/0x18
Allocated by task 385: kasan_kmalloc+0xac/0x148 __kmalloc+0x170/0x1e4 rpmh_write_batch+0x174/0x540 qcom_icc_set+0x8dc/0x9ac icc_set+0x288/0x2e8 a6xx_gmu_stop+0x320/0x3c0 a6xx_pm_suspend+0x108/0x124 adreno_suspend+0x50/0x60 pm_generic_runtime_suspend+0x60/0x78 __rpm_callback+0x214/0x32c rpm_callback+0x54/0x184 rpm_suspend+0x3f8/0xa90 pm_runtime_work+0xb4/0x178 process_one_work+0x544/0xbc0 worker_thread+0x514/0x7d0 kthread+0x248/0x260 ret_from_fork+0x10/0x18
Freed by task 385: __kasan_slab_free+0x12c/0x1e0 kasan_slab_free+0x10/0x1c kfree+0x134/0x588 rpmh_write_batch+0x49c/0x540 qcom_icc_set+0x8dc/0x9ac icc_set+0x288/0x2e8 a6xx_gmu_stop+0x320/0x3c0 a6xx_pm_suspend+0x108/0x124 adreno_suspend+0x50/0x60 cr50_spi spi5.0: SPI transfer timed out pm_generic_runtime_suspend+0x60/0x78 __rpm_callback+0x214/0x32c rpm_callback+0x54/0x184 rpm_suspend+0x3f8/0xa90 pm_runtime_work+0xb4/0x178 process_one_work+0x544/0xbc0 worker_thread+0x514/0x7d0 kthread+0x248/0x260 ret_from_fork+0x10/0x18
The buggy address belongs to the object at fffffff51414ac80 which belongs to the cache kmalloc-512 of size 512 The buggy address is located 260 bytes inside of 512-byte region [fffffff51414ac80, fffffff51414ae80) The buggy address belongs to the page: page:ffffffbfd4505200 count:1 mapcount:0 mapping:fffffff51e00c680 index:0x0 compound_mapcount: 0 flags: 0x4000000000008100(slab|head) raw: 4000000000008100 ffffffbfd4529008 ffffffbfd44f9208 fffffff51e00c680 raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected
Memory state around the buggy address: fffffff51414ac80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fffffff51414ad00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
fffffff51414ad80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^ fffffff51414ae00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fffffff51414ae80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
The batch API sets the same completion for each rpmh message that's sent and then loops through all the messages and waits for that single completion declared on the stack to be completed before returning from the function and freeing the message structures. Unfortunately, some messages may still be in process and 'stuck' in the TCS. At some later point, the tcs_tx_done() interrupt will run and try to process messages that have already been freed at the end of rpmh_write_batch(). This will in turn access the 'needs_free' member of the rpmh_request structure and cause KASAN to complain. Furthermore, if there's a message that's completed in rpmh_tx_done() and freed immediately after the complete() call is made we'll be racing with potentially freed memory when accessing the 'needs_free' member:
CPU0 CPU1 ---- ---- rpmh_tx_done() complete(&compl) wait_for_completion(&compl) kfree(rpm_msg) if (rpm_msg->needs_free) <KASAN warning splat>
Let's fix this by allocating a chunk of completions for each message and waiting for all of them to be completed before returning from the batch API. Alternatively, we could wait for the last message in the batch, but that may be a more complicated change because it looks like tcs_tx_done() just iterates through the indices of the queue and completes each message instead of tracking the last inserted message and completing that first.
Fixes: c8790cb6da58 ("drivers: qcom: rpmh: add support for batch RPMH request") Cc: Lina Iyer ilina@codeaurora.org Cc: "Raju P.L.S.S.S.N" rplsssn@codeaurora.org Cc: Matthias Kaehlcke mka@chromium.org Cc: Evan Green evgreen@chromium.org Cc: stable@vger.kernel.org Reviewed-by: Lina Iyer ilina@codeaurora.org Reviewed-by: Evan Green evgreen@chromium.org Signed-off-by: Stephen Boyd swboyd@chromium.org Signed-off-by: Bjorn Andersson bjorn.andersson@linaro.org Signed-off-by: Andy Gross andy.gross@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/soc/qcom/rpmh.c | 34 +++++++++++++++++++++------------- 1 file changed, 21 insertions(+), 13 deletions(-)
--- a/drivers/soc/qcom/rpmh.c +++ b/drivers/soc/qcom/rpmh.c @@ -80,6 +80,7 @@ void rpmh_tx_done(const struct tcs_reque struct rpmh_request *rpm_msg = container_of(msg, struct rpmh_request, msg); struct completion *compl = rpm_msg->completion; + bool free = rpm_msg->needs_free;
rpm_msg->err = r;
@@ -94,7 +95,7 @@ void rpmh_tx_done(const struct tcs_reque complete(compl);
exit: - if (rpm_msg->needs_free) + if (free) kfree(rpm_msg); }
@@ -348,11 +349,12 @@ int rpmh_write_batch(const struct device { struct batch_cache_req *req; struct rpmh_request *rpm_msgs; - DECLARE_COMPLETION_ONSTACK(compl); + struct completion *compls; struct rpmh_ctrlr *ctrlr = get_rpmh_ctrlr(dev); unsigned long time_left; int count = 0; - int ret, i, j; + int ret, i; + void *ptr;
if (!cmd || !n) return -EINVAL; @@ -362,10 +364,15 @@ int rpmh_write_batch(const struct device if (!count) return -EINVAL;
- req = kzalloc(sizeof(*req) + count * sizeof(req->rpm_msgs[0]), + ptr = kzalloc(sizeof(*req) + + count * (sizeof(req->rpm_msgs[0]) + sizeof(*compls)), GFP_ATOMIC); - if (!req) + if (!ptr) return -ENOMEM; + + req = ptr; + compls = ptr + sizeof(*req) + count * sizeof(*rpm_msgs); + req->count = count; rpm_msgs = req->rpm_msgs;
@@ -380,25 +387,26 @@ int rpmh_write_batch(const struct device }
for (i = 0; i < count; i++) { - rpm_msgs[i].completion = &compl; + struct completion *compl = &compls[i]; + + init_completion(compl); + rpm_msgs[i].completion = compl; ret = rpmh_rsc_send_data(ctrlr_to_drv(ctrlr), &rpm_msgs[i].msg); if (ret) { pr_err("Error(%d) sending RPMH message addr=%#x\n", ret, rpm_msgs[i].msg.cmds[0].addr); - for (j = i; j < count; j++) - rpmh_tx_done(&rpm_msgs[j].msg, ret); break; } }
time_left = RPMH_TIMEOUT_MS; - for (i = 0; i < count; i++) { - time_left = wait_for_completion_timeout(&compl, time_left); + while (i--) { + time_left = wait_for_completion_timeout(&compls[i], time_left); if (!time_left) { /* * Better hope they never finish because they'll signal - * the completion on our stack and that's bad once - * we've returned from the function. + * the completion that we're going to free once + * we've returned from this function. */ WARN_ON(1); ret = -ETIMEDOUT; @@ -407,7 +415,7 @@ int rpmh_write_batch(const struct device }
exit: - kfree(req); + kfree(ptr);
return ret; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lubomir Rintel lkundrak@v3.sk
commit 607076a904c435f2677fadaadd4af546279db68b upstream.
It doesn't make sense and the USB core warns on each submit of such URB, easily flooding the message buffer with tracebacks.
Analogous issue was fixed in regular libertas driver in commit 6528d8804780 ("libertas: don't set URB_ZERO_PACKET on IN USB transfer").
Cc: stable@vger.kernel.org Signed-off-by: Lubomir Rintel lkundrak@v3.sk Reviewed-by: Steve deRosier derosier@cal-sierra.com Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/wireless/marvell/libertas_tf/if_usb.c | 2 -- 1 file changed, 2 deletions(-)
--- a/drivers/net/wireless/marvell/libertas_tf/if_usb.c +++ b/drivers/net/wireless/marvell/libertas_tf/if_usb.c @@ -433,8 +433,6 @@ static int __if_usb_submit_rx_urb(struct skb_tail_pointer(skb), MRVDRV_ETH_RX_PACKET_BUFFER_SIZE, callbackfn, cardp);
- cardp->rx_urb->transfer_flags |= URB_ZERO_PACKET; - lbtf_deb_usb2(&cardp->udev->dev, "Pointer for rx_urb %p\n", cardp->rx_urb); ret = usb_submit_urb(cardp->rx_urb, GFP_ATOMIC);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zenghui Yu yuzenghui@huawei.com
commit 8d565748b6035eeda18895c213396a4c9fac6a4c upstream.
In current logic, its_parse_indirect_baser() will be invoked twice when allocating Device tables. Add a *break* to omit the unnecessary and annoying (might be ...) invoking.
Fixes: 32bd44dc19de ("irqchip/gic-v3-its: Fix the incorrect parsing of VCPU table size") Cc: stable@vger.kernel.org Signed-off-by: Zenghui Yu yuzenghui@huawei.com Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/irqchip/irq-gic-v3-its.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -1955,6 +1955,8 @@ static int its_alloc_tables(struct its_n indirect = its_parse_indirect_baser(its, baser, psz, &order, its->device_ids); + break; + case GITS_BASER_TYPE_VCPU: indirect = its_parse_indirect_baser(its, baser, psz, &order,
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Doug Berger opendmb@gmail.com
commit 33517881ede742107f416533b8c3e4abc56763da upstream.
Using the irq_gc_lock/irq_gc_unlock functions in the suspend and resume functions creates the opportunity for a deadlock during suspend, resume, and shutdown. Using the irq_gc_lock_irqsave/ irq_gc_unlock_irqrestore variants prevents this possible deadlock.
Cc: stable@vger.kernel.org Fixes: 7f646e92766e2 ("irqchip: brcmstb-l2: Add Broadcom Set Top Box Level-2 interrupt controller") Signed-off-by: Doug Berger opendmb@gmail.com Signed-off-by: Florian Fainelli f.fainelli@gmail.com [maz: tidied up $SUBJECT] Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/irqchip/irq-brcmstb-l2.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
--- a/drivers/irqchip/irq-brcmstb-l2.c +++ b/drivers/irqchip/irq-brcmstb-l2.c @@ -129,8 +129,9 @@ static void brcmstb_l2_intc_suspend(stru struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d); struct irq_chip_type *ct = irq_data_get_chip_type(d); struct brcmstb_l2_intc_data *b = gc->private; + unsigned long flags;
- irq_gc_lock(gc); + irq_gc_lock_irqsave(gc, flags); /* Save the current mask */ b->saved_mask = irq_reg_readl(gc, ct->regs.mask);
@@ -139,7 +140,7 @@ static void brcmstb_l2_intc_suspend(stru irq_reg_writel(gc, ~gc->wake_active, ct->regs.disable); irq_reg_writel(gc, gc->wake_active, ct->regs.enable); } - irq_gc_unlock(gc); + irq_gc_unlock_irqrestore(gc, flags); }
static void brcmstb_l2_intc_resume(struct irq_data *d) @@ -147,8 +148,9 @@ static void brcmstb_l2_intc_resume(struc struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d); struct irq_chip_type *ct = irq_data_get_chip_type(d); struct brcmstb_l2_intc_data *b = gc->private; + unsigned long flags;
- irq_gc_lock(gc); + irq_gc_lock_irqsave(gc, flags); if (ct->chip.irq_ack) { /* Clear unmasked non-wakeup interrupts */ irq_reg_writel(gc, ~b->saved_mask & ~gc->wake_active, @@ -158,7 +160,7 @@ static void brcmstb_l2_intc_resume(struc /* Restore the saved mask */ irq_reg_writel(gc, b->saved_mask, ct->regs.disable); irq_reg_writel(gc, ~b->saved_mask, ct->regs.enable); - irq_gc_unlock(gc); + irq_gc_unlock_irqrestore(gc, flags); }
static int __init brcmstb_l2_intc_of_init(struct device_node *np,
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Masami Hiramatsu mhiramat@kernel.org
commit 0192e6535ebe9af68614198ced4fd6d37b778ebf upstream.
Prohibit probing on optprobe template code, since it is not a code but a template instruction sequence. If we modify this template, copied template must be broken.
Signed-off-by: Masami Hiramatsu mhiramat@kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Andrea Righi righi.andrea@gmail.com Cc: Arnaldo Carvalho de Melo acme@redhat.com Cc: Jiri Olsa jolsa@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: Peter Zijlstra peterz@infradead.org Cc: Steven Rostedt rostedt@goodmis.org Cc: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Fixes: 9326638cbee2 ("kprobes, x86: Use NOKPROBE_SYMBOL() instead of __kprobes annotation") Link: http://lkml.kernel.org/r/154998787911.31052.15274376330136234452.stgit@devbo... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/kprobes/opt.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/arch/x86/kernel/kprobes/opt.c +++ b/arch/x86/kernel/kprobes/opt.c @@ -141,6 +141,11 @@ asm (
void optprobe_template_func(void); STACK_FRAME_NON_STANDARD(optprobe_template_func); +NOKPROBE_SYMBOL(optprobe_template_func); +NOKPROBE_SYMBOL(optprobe_template_entry); +NOKPROBE_SYMBOL(optprobe_template_val); +NOKPROBE_SYMBOL(optprobe_template_call); +NOKPROBE_SYMBOL(optprobe_template_end);
#define TMPL_MOVE_IDX \ ((long)optprobe_template_val - (long)optprobe_template_entry)
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Viresh Kumar viresh.kumar@linaro.org
commit 0334906c06967142c8805fbe88acf787f65d3d26 upstream.
Commit 5ad7346b4ae2 ("cpufreq: kryo: Add module remove and exit") made it possible to build the kryo cpufreq driver as a module, but it failed to release all the resources, i.e. OPP tables, when the module is unloaded.
This patch fixes it by releasing the OPP tables, by calling dev_pm_opp_put_supported_hw() for them, from the qcom_cpufreq_kryo_remove() routine. The array of pointers to the OPP tables is also allocated dynamically now in qcom_cpufreq_kryo_probe(), as the pointers will be required while releasing the resources.
Compile tested only.
Cc: 4.18+ stable@vger.kernel.org # v4.18+ Fixes: 5ad7346b4ae2 ("cpufreq: kryo: Add module remove and exit") Reviewed-by: Georgi Djakov georgi.djakov@linaro.org Signed-off-by: Viresh Kumar viresh.kumar@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/cpufreq/qcom-cpufreq-kryo.c | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-)
--- a/drivers/cpufreq/qcom-cpufreq-kryo.c +++ b/drivers/cpufreq/qcom-cpufreq-kryo.c @@ -75,7 +75,7 @@ static enum _msm8996_version qcom_cpufre
static int qcom_cpufreq_kryo_probe(struct platform_device *pdev) { - struct opp_table *opp_tables[NR_CPUS] = {0}; + struct opp_table **opp_tables; enum _msm8996_version msm8996_version; struct nvmem_cell *speedbin_nvmem; struct device_node *np; @@ -133,6 +133,10 @@ static int qcom_cpufreq_kryo_probe(struc } kfree(speedbin);
+ opp_tables = kcalloc(num_possible_cpus(), sizeof(*opp_tables), GFP_KERNEL); + if (!opp_tables) + return -ENOMEM; + for_each_possible_cpu(cpu) { cpu_dev = get_cpu_device(cpu); if (NULL == cpu_dev) { @@ -151,8 +155,10 @@ static int qcom_cpufreq_kryo_probe(struc
cpufreq_dt_pdev = platform_device_register_simple("cpufreq-dt", -1, NULL, 0); - if (!IS_ERR(cpufreq_dt_pdev)) + if (!IS_ERR(cpufreq_dt_pdev)) { + platform_set_drvdata(pdev, opp_tables); return 0; + }
ret = PTR_ERR(cpufreq_dt_pdev); dev_err(cpu_dev, "Failed to register platform device\n"); @@ -163,13 +169,23 @@ free_opp: break; dev_pm_opp_put_supported_hw(opp_tables[cpu]); } + kfree(opp_tables);
return ret; }
static int qcom_cpufreq_kryo_remove(struct platform_device *pdev) { + struct opp_table **opp_tables = platform_get_drvdata(pdev); + unsigned int cpu; + platform_device_unregister(cpufreq_dt_pdev); + + for_each_possible_cpu(cpu) + dev_pm_opp_put_supported_hw(opp_tables[cpu]); + + kfree(opp_tables); + return 0; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yangtao Li tiny.windzz@gmail.com
commit 446fae2bb5395f3028d8e3aae1508737e5a72ea1 upstream.
of_cpu_device_node_get() will increase the refcount of device_node, it is necessary to call of_node_put() at the end to release the refcount.
Fixes: 9eb15dbbfa1a2 ("cpufreq: Add cpufreq driver for Tegra124") Cc: stable@vger.kernel.org # 4.4+ Signed-off-by: Yangtao Li tiny.windzz@gmail.com Acked-by: Thierry Reding treding@nvidia.com Signed-off-by: Viresh Kumar viresh.kumar@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/cpufreq/tegra124-cpufreq.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/cpufreq/tegra124-cpufreq.c +++ b/drivers/cpufreq/tegra124-cpufreq.c @@ -134,6 +134,8 @@ static int tegra124_cpufreq_probe(struct
platform_set_drvdata(pdev, priv);
+ of_node_put(np); + return 0;
out_switch_to_pllx:
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
commit 9505b98ccddc454008ca7efff90044e3e857c827 upstream.
pxa_cpufreq_init_voltages() is marked __init but usually inlined into the non-__init pxa_cpufreq_init() function. When building with clang, it can stay as a standalone function in a discarded section, and produce this warning:
WARNING: vmlinux.o(.text+0x616a00): Section mismatch in reference from the function pxa_cpufreq_init() to the function .init.text:pxa_cpufreq_init_voltages() The function pxa_cpufreq_init() references the function __init pxa_cpufreq_init_voltages(). This is often because pxa_cpufreq_init lacks a __init annotation or the annotation of pxa_cpufreq_init_voltages is wrong.
Fixes: 50e77fcd790e ("ARM: pxa: remove __init from cpufreq_driver->init()") Signed-off-by: Arnd Bergmann arnd@arndb.de Acked-by: Viresh Kumar viresh.kumar@linaro.org Reviewed-by: Nathan Chancellor natechancellor@gmail.com Acked-by: Robert Jarzmik robert.jarzmik@free.fr Cc: All applicable stable@vger.kernel.org Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/cpufreq/pxa2xx-cpufreq.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/cpufreq/pxa2xx-cpufreq.c +++ b/drivers/cpufreq/pxa2xx-cpufreq.c @@ -143,7 +143,7 @@ static int pxa_cpufreq_change_voltage(co return ret; }
-static void __init pxa_cpufreq_init_voltages(void) +static void pxa_cpufreq_init_voltages(void) { vcc_core = regulator_get(NULL, "vcc_core"); if (IS_ERR(vcc_core)) { @@ -159,7 +159,7 @@ static int pxa_cpufreq_change_voltage(co return 0; }
-static void __init pxa_cpufreq_init_voltages(void) { } +static void pxa_cpufreq_init_voltages(void) { } #endif
static void find_freq_tables(struct cpufreq_frequency_table **freq_table,
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: yangerkun yangerkun@huawei.com
commit 67a11611e1a5211f6569044fbf8150875764d1d0 upstream.
Before really do swap between inode and boot inode, something need to check to avoid invalid or not permitted operation, like does this inode has inline data. But the condition check should be protected by inode lock to avoid change while swapping. Also some other condition will not change between swapping, but there has no problem to do this under inode lock.
Signed-off-by: yangerkun yangerkun@huawei.com Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/ext4/ioctl.c | 22 +++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-)
--- a/fs/ext4/ioctl.c +++ b/fs/ext4/ioctl.c @@ -116,15 +116,6 @@ static long swap_inode_boot_loader(struc struct inode *inode_bl; struct ext4_inode_info *ei_bl;
- if (inode->i_nlink != 1 || !S_ISREG(inode->i_mode) || - IS_SWAPFILE(inode) || IS_ENCRYPTED(inode) || - ext4_has_inline_data(inode)) - return -EINVAL; - - if (IS_RDONLY(inode) || IS_APPEND(inode) || IS_IMMUTABLE(inode) || - !inode_owner_or_capable(inode) || !capable(CAP_SYS_ADMIN)) - return -EPERM; - inode_bl = ext4_iget(sb, EXT4_BOOT_LOADER_INO, EXT4_IGET_SPECIAL); if (IS_ERR(inode_bl)) return PTR_ERR(inode_bl); @@ -137,6 +128,19 @@ static long swap_inode_boot_loader(struc * that only 1 swap_inode_boot_loader is running. */ lock_two_nondirectories(inode, inode_bl);
+ if (inode->i_nlink != 1 || !S_ISREG(inode->i_mode) || + IS_SWAPFILE(inode) || IS_ENCRYPTED(inode) || + ext4_has_inline_data(inode)) { + err = -EINVAL; + goto journal_err_out; + } + + if (IS_RDONLY(inode) || IS_APPEND(inode) || IS_IMMUTABLE(inode) || + !inode_owner_or_capable(inode) || !capable(CAP_SYS_ADMIN)) { + err = -EPERM; + goto journal_err_out; + } + /* Wait for all existing dio workers */ inode_dio_wait(inode); inode_dio_wait(inode_bl);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: yangerkun yangerkun@huawei.com
commit a46c68a318b08f819047843abf349aeee5d10ac2 upstream.
While do swap, we should make sure there has no new dirty page since we should swap i_data between two inode: 1.We should lock i_mmap_sem with write to avoid new pagecache from mmap read/write; 2.Change filemap_flush to filemap_write_and_wait and move them to the space protected by inode lock to avoid new pagecache from buffer read/write.
Signed-off-by: yangerkun yangerkun@huawei.com Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/ext4/ioctl.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-)
--- a/fs/ext4/ioctl.c +++ b/fs/ext4/ioctl.c @@ -121,9 +121,6 @@ static long swap_inode_boot_loader(struc return PTR_ERR(inode_bl); ei_bl = EXT4_I(inode_bl);
- filemap_flush(inode->i_mapping); - filemap_flush(inode_bl->i_mapping); - /* Protect orig inodes against a truncate and make sure, * that only 1 swap_inode_boot_loader is running. */ lock_two_nondirectories(inode, inode_bl); @@ -141,6 +138,15 @@ static long swap_inode_boot_loader(struc goto journal_err_out; }
+ down_write(&EXT4_I(inode)->i_mmap_sem); + err = filemap_write_and_wait(inode->i_mapping); + if (err) + goto err_out; + + err = filemap_write_and_wait(inode_bl->i_mapping); + if (err) + goto err_out; + /* Wait for all existing dio workers */ inode_dio_wait(inode); inode_dio_wait(inode_bl); @@ -151,7 +157,7 @@ static long swap_inode_boot_loader(struc handle = ext4_journal_start(inode_bl, EXT4_HT_MOVE_EXTENTS, 2); if (IS_ERR(handle)) { err = -EINVAL; - goto journal_err_out; + goto err_out; }
/* Protect extent tree against block allocations via delalloc */ @@ -208,6 +214,8 @@ static long swap_inode_boot_loader(struc ext4_journal_stop(handle); ext4_double_up_write_data_sem(inode, inode_bl);
+err_out: + up_write(&EXT4_I(inode)->i_mmap_sem); journal_err_out: unlock_two_nondirectories(inode, inode_bl); iput(inode_bl);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: zhongjiang zhongjiang@huawei.com
commit 46612b751c4941c5c0472ddf04027e877ae5990f upstream.
When soft_offline_in_use_page() runs on a thp tail page after pmd is split, we trigger the following VM_BUG_ON_PAGE():
Memory failure: 0x3755ff: non anonymous thp __get_any_page: 0x3755ff: unknown zero refcount page type 2fffff80000000 Soft offlining pfn 0x34d805 at process virtual address 0x20fff000 page:ffffea000d360140 count:0 mapcount:0 mapping:0000000000000000 index:0x1 flags: 0x2fffff80000000() raw: 002fffff80000000 ffffea000d360108 ffffea000d360188 0000000000000000 raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0) ------------[ cut here ]------------ kernel BUG at ./include/linux/mm.h:519!
soft_offline_in_use_page() passed refcount and page lock from tail page to head page, which is not needed because we can pass any subpage to split_huge_page().
Naoya had fixed a similar issue in c3901e722b29 ("mm: hwpoison: fix thp split handling in memory_failure()"). But he missed fixing soft offline.
Link: http://lkml.kernel.org/r/1551452476-24000-1-git-send-email-zhongjiang@huawei... Fixes: 61f5d698cc97 ("mm: re-enable THP") Signed-off-by: zhongjiang zhongjiang@huawei.com Acked-by: Naoya Horiguchi n-horiguchi@ah.jp.nec.com Cc: Michal Hocko mhocko@suse.com Cc: Hugh Dickins hughd@google.com Cc: Kirill A. Shutemov kirill@shutemov.name Cc: Andrea Arcangeli aarcange@redhat.com Cc: stable@vger.kernel.org [4.5+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/memory-failure.c | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-)
--- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -1825,19 +1825,17 @@ static int soft_offline_in_use_page(stru struct page *hpage = compound_head(page);
if (!PageHuge(page) && PageTransHuge(hpage)) { - lock_page(hpage); - if (!PageAnon(hpage) || unlikely(split_huge_page(hpage))) { - unlock_page(hpage); - if (!PageAnon(hpage)) + lock_page(page); + if (!PageAnon(page) || unlikely(split_huge_page(page))) { + unlock_page(page); + if (!PageAnon(page)) pr_info("soft offline: %#lx: non anonymous thp\n", page_to_pfn(page)); else pr_info("soft offline: %#lx: thp split failed\n", page_to_pfn(page)); - put_hwpoison_page(hpage); + put_hwpoison_page(page); return -EBUSY; } - unlock_page(hpage); - get_hwpoison_page(page); - put_hwpoison_page(hpage); + unlock_page(page); }
/*
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Roman Penyaev rpenyaev@suse.de
commit 401592d2e095947344e10ec0623adbcd58934dd4 upstream.
When VM_NO_GUARD is not set area->size includes adjacent guard page, thus for correct size checking get_vm_area_size() should be used, but not area->size.
This fixes possible kernel oops when userspace tries to mmap an area on 1 page bigger than was allocated by vmalloc_user() call: the size check inside remap_vmalloc_range_partial() accounts non-existing guard page also, so check successfully passes but vmalloc_to_page() returns NULL (guard page does not physically exist).
The following code pattern example should trigger an oops:
static int oops_mmap(struct file *file, struct vm_area_struct *vma) { void *mem;
mem = vmalloc_user(4096); BUG_ON(!mem); /* Do not care about mem leak */
return remap_vmalloc_range(vma, mem, 0); }
And userspace simply mmaps size + PAGE_SIZE:
mmap(NULL, 8192, PROT_WRITE|PROT_READ, MAP_PRIVATE, fd, 0);
Possible candidates for oops which do not have any explicit size checks:
*** drivers/media/usb/stkwebcam/stk-webcam.c: v4l_stk_mmap[789] ret = remap_vmalloc_range(vma, sbuf->buffer, 0);
Or the following one:
*** drivers/video/fbdev/core/fbmem.c static int fb_mmap(struct file *file, struct vm_area_struct * vma) ... res = fb->fb_mmap(info, vma);
Where fb_mmap callback calls remap_vmalloc_range() directly without any explicit checks:
*** drivers/video/fbdev/vfb.c static int vfb_mmap(struct fb_info *info, struct vm_area_struct *vma) { return remap_vmalloc_range(vma, (void *)info->fix.smem_start, vma->vm_pgoff); }
Link: http://lkml.kernel.org/r/20190103145954.16942-2-rpenyaev@suse.de Signed-off-by: Roman Penyaev rpenyaev@suse.de Acked-by: Michal Hocko mhocko@suse.com Cc: Andrey Ryabinin aryabinin@virtuozzo.com Cc: Joe Perches joe@perches.com Cc: "Luis R. Rodriguez" mcgrof@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/vmalloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -2248,7 +2248,7 @@ int remap_vmalloc_range_partial(struct v if (!(area->flags & VM_USERMAP)) return -EINVAL;
- if (kaddr + size > area->addr + area->size) + if (kaddr + size > area->addr + get_vm_area_size(area)) return -EINVAL;
do {
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Stancek jstancek@redhat.com
commit fc8efd2ddfed3f343c11b693e87140ff358d7ff5 upstream.
LTP testcase mtest06 [1] can trigger a crash on s390x running 5.0.0-rc8. This is a stress test, where one thread mmaps/writes/munmaps memory area and other thread is trying to read from it:
CPU: 0 PID: 2611 Comm: mmap1 Not tainted 5.0.0-rc8+ #51 Hardware name: IBM 2964 N63 400 (z/VM 6.4.0) Krnl PSW : 0404e00180000000 00000000001ac8d8 (__lock_acquire+0x7/0x7a8) Call Trace: ([<0000000000000000>] (null)) [<00000000001adae4>] lock_acquire+0xec/0x258 [<000000000080d1ac>] _raw_spin_lock_bh+0x5c/0x98 [<000000000012a780>] page_table_free+0x48/0x1a8 [<00000000002f6e54>] do_fault+0xdc/0x670 [<00000000002fadae>] __handle_mm_fault+0x416/0x5f0 [<00000000002fb138>] handle_mm_fault+0x1b0/0x320 [<00000000001248cc>] do_dat_exception+0x19c/0x2c8 [<000000000080e5ee>] pgm_check_handler+0x19e/0x200
page_table_free() is called with NULL mm parameter, but because "0" is a valid address on s390 (see S390_lowcore), it keeps going until it eventually crashes in lockdep's lock_acquire. This crash is reproducible at least since 4.14.
Problem is that "vmf->vma" used in do_fault() can become stale. Because mmap_sem may be released, other threads can come in, call munmap() and cause "vma" be returned to kmem cache, and get zeroed/re-initialized and re-used:
handle_mm_fault | __handle_mm_fault | do_fault | vma = vmf->vma | do_read_fault | __do_fault | vma->vm_ops->fault(vmf); | mmap_sem is released | | | do_munmap() | remove_vma_list() | remove_vma() | vm_area_free() | # vma is released | ... | # same vma is allocated | # from kmem cache | do_mmap() | vm_area_alloc() | memset(vma, 0, ...) | pte_free(vma->vm_mm, ...); | page_table_free | spin_lock_bh(&mm->context.lock);| <crash> |
Cache mm_struct to avoid using potentially stale "vma".
[1] https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/m...
Link: http://lkml.kernel.org/r/5b3fdf19e2a5be460a384b936f5b56e13733f1b8.1551595137... Signed-off-by: Jan Stancek jstancek@redhat.com Reviewed-by: Andrea Arcangeli aarcange@redhat.com Reviewed-by: Matthew Wilcox willy@infradead.org Acked-by: Rafael Aquini aquini@redhat.com Reviewed-by: Minchan Kim minchan@kernel.org Acked-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com Cc: Rik van Riel riel@surriel.com Cc: Michal Hocko mhocko@suse.com Cc: Huang Ying ying.huang@intel.com Cc: Souptick Joarder jrdr.linux@gmail.com Cc: Jerome Glisse jglisse@redhat.com Cc: Aneesh Kumar K.V aneesh.kumar@linux.ibm.com Cc: David Hildenbrand david@redhat.com Cc: Andrea Arcangeli aarcange@redhat.com Cc: David Rientjes rientjes@google.com Cc: Mel Gorman mgorman@techsingularity.net Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/memory.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/mm/memory.c +++ b/mm/memory.c @@ -3517,10 +3517,13 @@ static vm_fault_t do_shared_fault(struct * but allow concurrent faults). * The mmap_sem may have been released depending on flags and our * return value. See filemap_fault() and __lock_page_or_retry(). + * If mmap_sem is released, vma may become invalid (for example + * by other thread calling munmap()). */ static vm_fault_t do_fault(struct vm_fault *vmf) { struct vm_area_struct *vma = vmf->vma; + struct mm_struct *vm_mm = vma->vm_mm; vm_fault_t ret;
/* @@ -3561,7 +3564,7 @@ static vm_fault_t do_fault(struct vm_fau
/* preallocated pagetable is unused: free it */ if (vmf->prealloc_pte) { - pte_free(vma->vm_mm, vmf->prealloc_pte); + pte_free(vm_mm, vmf->prealloc_pte); vmf->prealloc_pte = NULL; } return ret;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zev Weiss zev@bewilderbeest.net
commit 8cf7630b29701d364f8df4a50e4f1f5e752b2778 upstream.
This bug has apparently existed since the introduction of this function in the pre-git era (4500e91754d3 in Thomas Gleixner's history.git, "[NET]: Add proc_dointvec_userhz_jiffies, use it for proper handling of neighbour sysctls.").
As a minimal fix we can simply duplicate the corresponding check in do_proc_dointvec_conv().
Link: http://lkml.kernel.org/r/20190207123426.9202-3-zev@bewilderbeest.net Signed-off-by: Zev Weiss zev@bewilderbeest.net Cc: Brendan Higgins brendanhiggins@google.com Cc: Iurii Zaikin yzaikin@google.com Cc: Kees Cook keescook@chromium.org Cc: Luis Chamberlain mcgrof@kernel.org Cc: stable@vger.kernel.org [2.6.2+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/sysctl.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
--- a/kernel/sysctl.c +++ b/kernel/sysctl.c @@ -2579,7 +2579,16 @@ static int do_proc_dointvec_minmax_conv( { struct do_proc_dointvec_minmax_conv_param *param = data; if (write) { - int val = *negp ? -*lvalp : *lvalp; + int val; + if (*negp) { + if (*lvalp > (unsigned long) INT_MAX + 1) + return -EINVAL; + val = -*lvalp; + } else { + if (*lvalp > (unsigned long) INT_MAX) + return -EINVAL; + val = *lvalp; + } if ((param->min && *param->min > val) || (param->max && *param->max < val)) return -EINVAL;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bartosz Golaszewski bgolaszewski@baylibre.com
commit f4853e1c321edb48af229ad5ac85076790d34968 upstream.
blocking_notifier_call_chain() returns the value returned by the last registered callback. A positive return value doesn't indicate an error and an nvmem device should correctly register irrespective of any notifier callback failures. Drop the retval check.
Fixes: bee1138bea15 ("nvmem: add a notifier chain") Cc: stable@vger.kernel.org Signed-off-by: Bartosz Golaszewski bgolaszewski@baylibre.com Acked-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/nvmem/core.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/drivers/nvmem/core.c +++ b/drivers/nvmem/core.c @@ -686,9 +686,7 @@ struct nvmem_device *nvmem_register(cons if (rval) goto err_remove_cells;
- rval = blocking_notifier_call_chain(&nvmem_notifier, NVMEM_ADD, nvmem); - if (rval) - goto err_remove_cells; + blocking_notifier_call_chain(&nvmem_notifier, NVMEM_ADD, nvmem);
return nvmem;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heikki Krogerus heikki.krogerus@linux.intel.com
commit 2b6e492467c78183bb629bb0a100ea3509b615a5 upstream.
With string type property entries we need to use sizeof(const char *) instead of the number of characters as the length of the entry.
If the string was shorter then sizeof(const char *), attempts to read it would have failed with -EOVERFLOW. The problem has been hidden because all build-in string properties have had a string longer then 8 characters until now.
Fixes: a85f42047533 ("device property: helper macros for property entry creation") Cc: 4.5+ stable@vger.kernel.org # 4.5+ Signed-off-by: Heikki Krogerus heikki.krogerus@linux.intel.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/property.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/linux/property.h +++ b/include/linux/property.h @@ -258,7 +258,7 @@ struct property_entry { #define PROPERTY_ENTRY_STRING(_name_, _val_) \ (struct property_entry) { \ .name = _name_, \ - .length = sizeof(_val_), \ + .length = sizeof(const char *), \ .type = DEV_PROP_STRING, \ { .value = { .str = _val_ } }, \ }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Shishkin alexander.shishkin@linux.intel.com
commit 9ed3f22223c33347ed963e7c7019cf2956dd4e37 upstream.
When an output port driver is removed, also remove references to it from any masters. Failing to do this causes a NULL ptr dereference when configuring another output port:
BUG: unable to handle kernel NULL pointer dereference at 000000000000000d RIP: 0010:master_attr_store+0x9d/0x160 [intel_th_gth] Call Trace: dev_attr_store+0x1b/0x30 sysfs_kf_write+0x3c/0x50 kernfs_fop_write+0x125/0x1a0 __vfs_write+0x3a/0x190 ? __vfs_write+0x5/0x190 ? _cond_resched+0x1a/0x50 ? rcu_all_qs+0x5/0xb0 ? __vfs_write+0x5/0x190 vfs_write+0xb8/0x1b0 ksys_write+0x55/0xc0 __x64_sys_write+0x1a/0x20 do_syscall_64+0x5a/0x140 entry_SYSCALL_64_after_hwframe+0x44/0xa9
Signed-off-by: Alexander Shishkin alexander.shishkin@linux.intel.com Fixes: b27a6a3f97b9 ("intel_th: Add Global Trace Hub driver") CC: stable@vger.kernel.org # v4.4+ Reported-by: Ammy Yi ammy.yi@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hwtracing/intel_th/gth.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/hwtracing/intel_th/gth.c +++ b/drivers/hwtracing/intel_th/gth.c @@ -607,6 +607,7 @@ static void intel_th_gth_unassign(struct { struct gth_device *gth = dev_get_drvdata(&thdev->dev); int port = othdev->output.port; + int master;
if (thdev->host_mode) return; @@ -615,6 +616,9 @@ static void intel_th_gth_unassign(struct othdev->output.port = -1; othdev->output.active = false; gth->output[port].output = NULL; + for (master = 0; master < TH_CONFIGURABLE_MASTERS; master++) + if (gth->master[master] == port) + gth->master[master] = -1; spin_unlock(>h->gth_lock); }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: QiaoChong qiaochong@loongson.cn
commit 21698fd57984cd28207d841dbdaa026d6061bceb upstream.
In the original code before 181bf1e815a2 the loop was continuing until it finds the first matching superios[i].io and p->base. But after 181bf1e815a2 the logic changed and the loop now returns the pointer to the first mismatched array element which is then used in get_superio_dma() and get_superio_irq() and thus returning the wrong value. Fix the condition so that it now returns the correct pointer.
Fixes: 181bf1e815a2 ("parport_pc: clean up the modified while loops using for") Cc: Alan Cox alan@linux.intel.com Cc: stable@vger.kernel.org Signed-off-by: QiaoChong qiaochong@loongson.cn [rewrite the commit message] Signed-off-by: Sudip Mukherjee sudipm.mukherjee@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/parport/parport_pc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/parport/parport_pc.c +++ b/drivers/parport/parport_pc.c @@ -1377,7 +1377,7 @@ static struct superio_struct *find_super { int i; for (i = 0; i < NR_SUPERIOS; i++) - if (superios[i].io != p->base) + if (superios[i].io == p->base) return &superios[i]; return NULL; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sowjanya Komatineni skomatineni@nvidia.com
commit f4e3f4ae1d9c9330de355f432b69952e8cef650c upstream.
Tegra186 and prior supports maximum 4K bytes per packet transfer including 12 bytes of packet header.
This patch fixes max write length limit to account packet header size for transfers.
Cc: stable@vger.kernel.org # 4.4+
Reviewed-by: Dmitry Osipenko digetx@gmail.com Signed-off-by: Sowjanya Komatineni skomatineni@nvidia.com Signed-off-by: Wolfram Sang wsa@the-dreams.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/i2c/busses/i2c-tegra.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/i2c/busses/i2c-tegra.c +++ b/drivers/i2c/busses/i2c-tegra.c @@ -837,7 +837,7 @@ static const struct i2c_algorithm tegra_ static const struct i2c_adapter_quirks tegra_i2c_quirks = { .flags = I2C_AQ_NO_ZERO_LEN, .max_read_len = 4096, - .max_write_len = 4096, + .max_write_len = 4096 - 12, };
static const struct i2c_adapter_quirks tegra194_i2c_quirks = {
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sowjanya Komatineni skomatineni@nvidia.com
commit b03ff2a23359d0dd6f0a1516c6a9e9c4760ed230 upstream.
Tegra194 supports maximum 64K bytes per packet including 12 bytes of packet header irrespective of PIO or DMA mode transfer.
This patch updates Tegra194 max write length to account for packet header size for transfers.
Cc: stable@vger.kernel.org # 4.20+
Reviewed-by: Dmitry Osipenko digetx@gmail.com Signed-off-by: Sowjanya Komatineni skomatineni@nvidia.com Signed-off-by: Wolfram Sang wsa@the-dreams.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/i2c/busses/i2c-tegra.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/i2c/busses/i2c-tegra.c +++ b/drivers/i2c/busses/i2c-tegra.c @@ -118,6 +118,9 @@ #define I2C_MST_FIFO_STATUS_TX_MASK 0xff0000 #define I2C_MST_FIFO_STATUS_TX_SHIFT 16
+/* Packet header size in bytes */ +#define I2C_PACKET_HEADER_SIZE 12 + /* * msg_end_type: The bus control which need to be send at end of transfer. * @MSG_END_STOP: Send stop pulse at end of transfer. @@ -836,12 +839,13 @@ static const struct i2c_algorithm tegra_ /* payload size is only 12 bit */ static const struct i2c_adapter_quirks tegra_i2c_quirks = { .flags = I2C_AQ_NO_ZERO_LEN, - .max_read_len = 4096, - .max_write_len = 4096 - 12, + .max_read_len = SZ_4K, + .max_write_len = SZ_4K - I2C_PACKET_HEADER_SIZE, };
static const struct i2c_adapter_quirks tegra194_i2c_quirks = { .flags = I2C_AQ_NO_ZERO_LEN, + .max_write_len = SZ_64K - I2C_PACKET_HEADER_SIZE, };
static const struct tegra_i2c_hw_feature tegra20_i2c_hw = {
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Loic Poulain loic.poulain@linaro.org
commit 1d4c41f3d887bcd66e82cb2fda124533dad8808a upstream.
According to the ov5640 specification (2.7 power up sequence), host can access the sensor's registers 20ms after reset. Trying to access them before leads to undefined behavior and result in sporadic initialization errors.
Signed-off-by: Loic Poulain loic.poulain@linaro.org Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Cc: Adam Ford aford173@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/i2c/ov5640.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/media/i2c/ov5640.c +++ b/drivers/media/i2c/ov5640.c @@ -1893,7 +1893,7 @@ static void ov5640_reset(struct ov5640_d usleep_range(1000, 2000);
gpiod_set_value_cansleep(sensor->reset_gpio, 0); - usleep_range(5000, 10000); + usleep_range(20000, 25000); }
static int ov5640_set_power_on(struct ov5640_dev *sensor)
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mark Walton mark.walton@serialtek.com
commit c378b3aa015931a46c91d6ccc2fe04d97801d060 upstream.
If a PCA953x gpio was used as an interrupt and then released, the shutdown function was trying to extract the pca953x_chip pointer directly from the irq_data, but in reality was getting the gpio_chip structure.
The net effect was that the subsequent writes to the data structure corrupted data in the gpio_chip structure, which wasn't immediately obvious until attempting to use the GPIO again in the future, at which point the kernel panics.
This fix correctly extracts the pca953x_chip structure via the gpio_chip structure, as is correctly done in the other irq functions.
Fixes: 0a70fe00efea ("gpio: pca953x: Clear irq trigger type on irq shutdown") Cc: stable@vger.kernel.org Signed-off-by: Mark Walton mark.walton@serialtek.com Reviewed-by: Bartosz Golaszewski bgolaszewski@baylibre.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpio/gpio-pca953x.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/gpio/gpio-pca953x.c +++ b/drivers/gpio/gpio-pca953x.c @@ -587,7 +587,8 @@ static int pca953x_irq_set_type(struct i
static void pca953x_irq_shutdown(struct irq_data *d) { - struct pca953x_chip *chip = irq_data_get_irq_chip_data(d); + struct gpio_chip *gc = irq_data_get_irq_chip_data(d); + struct pca953x_chip *chip = gpiochip_get_data(gc); u8 mask = 1 << (d->hwirq % BANK_SZ);
chip->irq_trig_raise[d->hwirq / BANK_SZ] &= ~mask;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: yangerkun yangerkun@huawei.com
commit aa507b5faf38784defe49f5e64605ac3c4425e26 upstream.
While do swap between two inode, they swap i_data without update quota information. Also, swap_inode_boot_loader can do "revert" somtimes, so update the quota while all operations has been finished.
Signed-off-by: yangerkun yangerkun@huawei.com Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/ext4/ioctl.c | 56 +++++++++++++++++++++++++++++++++++++++++++------------- 1 file changed, 43 insertions(+), 13 deletions(-)
--- a/fs/ext4/ioctl.c +++ b/fs/ext4/ioctl.c @@ -68,8 +68,6 @@ static void swap_inode_data(struct inode ei2 = EXT4_I(inode2);
swap(inode1->i_version, inode2->i_version); - swap(inode1->i_blocks, inode2->i_blocks); - swap(inode1->i_bytes, inode2->i_bytes); swap(inode1->i_atime, inode2->i_atime); swap(inode1->i_mtime, inode2->i_mtime);
@@ -115,6 +113,9 @@ static long swap_inode_boot_loader(struc int err; struct inode *inode_bl; struct ext4_inode_info *ei_bl; + qsize_t size, size_bl, diff; + blkcnt_t blocks; + unsigned short bytes;
inode_bl = ext4_iget(sb, EXT4_BOOT_LOADER_INO, EXT4_IGET_SPECIAL); if (IS_ERR(inode_bl)) @@ -180,6 +181,13 @@ static long swap_inode_boot_loader(struc memset(ei_bl->i_data, 0, sizeof(ei_bl->i_data)); }
+ err = dquot_initialize(inode); + if (err) + goto err_out1; + + size = (qsize_t)(inode->i_blocks) * (1 << 9) + inode->i_bytes; + size_bl = (qsize_t)(inode_bl->i_blocks) * (1 << 9) + inode_bl->i_bytes; + diff = size - size_bl; swap_inode_data(inode, inode_bl);
inode->i_ctime = inode_bl->i_ctime = current_time(inode); @@ -193,24 +201,46 @@ static long swap_inode_boot_loader(struc
err = ext4_mark_inode_dirty(handle, inode); if (err < 0) { + /* No need to update quota information. */ ext4_warning(inode->i_sb, "couldn't mark inode #%lu dirty (err %d)", inode->i_ino, err); /* Revert all changes: */ swap_inode_data(inode, inode_bl); ext4_mark_inode_dirty(handle, inode); - } else { - err = ext4_mark_inode_dirty(handle, inode_bl); - if (err < 0) { - ext4_warning(inode_bl->i_sb, - "couldn't mark inode #%lu dirty (err %d)", - inode_bl->i_ino, err); - /* Revert all changes: */ - swap_inode_data(inode, inode_bl); - ext4_mark_inode_dirty(handle, inode); - ext4_mark_inode_dirty(handle, inode_bl); - } + goto err_out1; + } + + blocks = inode_bl->i_blocks; + bytes = inode_bl->i_bytes; + inode_bl->i_blocks = inode->i_blocks; + inode_bl->i_bytes = inode->i_bytes; + err = ext4_mark_inode_dirty(handle, inode_bl); + if (err < 0) { + /* No need to update quota information. */ + ext4_warning(inode_bl->i_sb, + "couldn't mark inode #%lu dirty (err %d)", + inode_bl->i_ino, err); + goto revert; + } + + /* Bootloader inode should not be counted into quota information. */ + if (diff > 0) + dquot_free_space(inode, diff); + else + err = dquot_alloc_space(inode, -1 * diff); + + if (err < 0) { +revert: + /* Revert all changes: */ + inode_bl->i_blocks = blocks; + inode_bl->i_bytes = bytes; + swap_inode_data(inode, inode_bl); + ext4_mark_inode_dirty(handle, inode); + ext4_mark_inode_dirty(handle, inode_bl); } + +err_out1: ext4_journal_stop(handle); ext4_double_up_write_data_sem(inode, inode_bl);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: yangerkun yangerkun@huawei.com
commit abdc644e8cbac2e9b19763680e5a7cf9bab2bee7 upstream.
The reason is that while swapping two inode, we swap the flags too. Some flags such as EXT4_JOURNAL_DATA_FL can really confuse the things since we're not resetting the address operations structure. The simplest way to keep things sane is to restrict the flags that can be swapped.
Signed-off-by: yangerkun yangerkun@huawei.com Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/ext4/ext4.h | 3 +++ fs/ext4/ioctl.c | 6 +++++- 2 files changed, 8 insertions(+), 1 deletion(-)
--- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -426,6 +426,9 @@ struct flex_groups { /* Flags that are appropriate for non-directories/regular files. */ #define EXT4_OTHER_FLMASK (EXT4_NODUMP_FL | EXT4_NOATIME_FL)
+/* The only flags that should be swapped */ +#define EXT4_FL_SHOULD_SWAP (EXT4_HUGE_FILE_FL | EXT4_EXTENTS_FL) + /* Mask out flags that are inappropriate for the given type of inode. */ static inline __u32 ext4_mask_flags(umode_t mode, __u32 flags) { --- a/fs/ext4/ioctl.c +++ b/fs/ext4/ioctl.c @@ -63,6 +63,7 @@ static void swap_inode_data(struct inode loff_t isize; struct ext4_inode_info *ei1; struct ext4_inode_info *ei2; + unsigned long tmp;
ei1 = EXT4_I(inode1); ei2 = EXT4_I(inode2); @@ -72,7 +73,10 @@ static void swap_inode_data(struct inode swap(inode1->i_mtime, inode2->i_mtime);
memswap(ei1->i_data, ei2->i_data, sizeof(ei1->i_data)); - swap(ei1->i_flags, ei2->i_flags); + tmp = ei1->i_flags & EXT4_FL_SHOULD_SWAP; + ei1->i_flags = (ei2->i_flags & EXT4_FL_SHOULD_SWAP) | + (ei1->i_flags & ~EXT4_FL_SHOULD_SWAP); + ei2->i_flags = tmp | (ei2->i_flags & ~EXT4_FL_SHOULD_SWAP); swap(ei1->i_disksize, ei2->i_disksize); ext4_es_remove_extent(inode1, 0, EXT_MAX_BLOCKS); ext4_es_remove_extent(inode2, 0, EXT_MAX_BLOCKS);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit f96c3ac8dfc24b4e38fc4c2eba5fea2107b929d1 upstream.
When computing maximum size of filesystem possible with given number of group descriptor blocks, we forget to include s_first_data_block into the number of blocks. Thus for filesystems with non-zero s_first_data_block it can happen that computed maximum filesystem size is actually lower than current filesystem size which confuses the code and eventually leads to a BUG_ON in ext4_alloc_group_tables() hitting on flex_gd->count == 0. The problem can be reproduced like:
truncate -s 100g /tmp/image mkfs.ext4 -b 1024 -E resize=262144 /tmp/image 32768 mount -t ext4 -o loop /tmp/image /mnt resize2fs /dev/loop0 262145 resize2fs /dev/loop0 300000
Fix the problem by properly including s_first_data_block into the computed number of filesystem blocks.
Fixes: 1c6bd7173d66 "ext4: convert file system to meta_bg if needed..." Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/ext4/resize.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/ext4/resize.c +++ b/fs/ext4/resize.c @@ -1960,7 +1960,8 @@ retry: le16_to_cpu(es->s_reserved_gdt_blocks); n_group = n_desc_blocks * EXT4_DESC_PER_BLOCK(sb); n_blocks_count = (ext4_fsblk_t)n_group * - EXT4_BLOCKS_PER_GROUP(sb); + EXT4_BLOCKS_PER_GROUP(sb) + + le32_to_cpu(es->s_first_data_block); n_group--; /* set to last group number */ }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joerg Roedel jroedel@suse.de
commit 133d624b1cee16906134e92d5befb843b58bcf31 upstream.
The function returns the maximum size that can be mapped using DMA-API functions. The patch also adds the implementation for direct DMA and a new dma_map_ops pointer so that other implementations can expose their limit.
Cc: stable@vger.kernel.org Reviewed-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Documentation/DMA-API.txt | 8 ++++++++ include/linux/dma-mapping.h | 8 ++++++++ kernel/dma/direct.c | 11 +++++++++++ kernel/dma/mapping.c | 14 ++++++++++++++ 4 files changed, 41 insertions(+)
--- a/Documentation/DMA-API.txt +++ b/Documentation/DMA-API.txt @@ -195,6 +195,14 @@ Requesting the required mask does not al wish to take advantage of it, you should issue a dma_set_mask() call to set the mask to the value returned.
+:: + + size_t + dma_direct_max_mapping_size(struct device *dev); + +Returns the maximum size of a mapping for the device. The size parameter +of the mapping functions like dma_map_single(), dma_map_page() and +others should not be larger than the returned value.
Part Id - Streaming DMA mappings -------------------------------- --- a/include/linux/dma-mapping.h +++ b/include/linux/dma-mapping.h @@ -130,6 +130,7 @@ struct dma_map_ops { enum dma_data_direction direction); int (*dma_supported)(struct device *dev, u64 mask); u64 (*get_required_mask)(struct device *dev); + size_t (*max_mapping_size)(struct device *dev); };
#define DMA_MAPPING_ERROR (~(dma_addr_t)0) @@ -257,6 +258,8 @@ static inline void dma_direct_sync_sg_fo } #endif
+size_t dma_direct_max_mapping_size(struct device *dev); + #ifdef CONFIG_HAS_DMA #include <asm/dma-mapping.h>
@@ -460,6 +463,7 @@ int dma_supported(struct device *dev, u6 int dma_set_mask(struct device *dev, u64 mask); int dma_set_coherent_mask(struct device *dev, u64 mask); u64 dma_get_required_mask(struct device *dev); +size_t dma_max_mapping_size(struct device *dev); #else /* CONFIG_HAS_DMA */ static inline dma_addr_t dma_map_page_attrs(struct device *dev, struct page *page, size_t offset, size_t size, @@ -561,6 +565,10 @@ static inline u64 dma_get_required_mask( { return 0; } +static inline size_t dma_max_mapping_size(struct device *dev) +{ + return 0; +} #endif /* CONFIG_HAS_DMA */
static inline dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr, --- a/kernel/dma/direct.c +++ b/kernel/dma/direct.c @@ -380,3 +380,14 @@ int dma_direct_supported(struct device * */ return mask >= __phys_to_dma(dev, min_mask); } + +size_t dma_direct_max_mapping_size(struct device *dev) +{ + size_t size = SIZE_MAX; + + /* If SWIOTLB is active, use its maximum mapping size */ + if (is_swiotlb_active()) + size = swiotlb_max_mapping_size(dev); + + return size; +} --- a/kernel/dma/mapping.c +++ b/kernel/dma/mapping.c @@ -357,3 +357,17 @@ void dma_cache_sync(struct device *dev, ops->cache_sync(dev, vaddr, size, dir); } EXPORT_SYMBOL(dma_cache_sync); + +size_t dma_max_mapping_size(struct device *dev) +{ + const struct dma_map_ops *ops = get_dma_ops(dev); + size_t size = SIZE_MAX; + + if (dma_is_direct(ops)) + size = dma_direct_max_mapping_size(dev); + else if (ops && ops->max_mapping_size) + size = ops->max_mapping_size(dev); + + return size; +} +EXPORT_SYMBOL_GPL(dma_max_mapping_size);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joerg Roedel jroedel@suse.de
commit abe420bfae528c92bd8cc5ecb62dc95672b1fd6f upstream.
The function returns the maximum size that can be remapped by the SWIOTLB implementation. This function will be later exposed to users through the DMA-API.
Cc: stable@vger.kernel.org Reviewed-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/swiotlb.h | 5 +++++ kernel/dma/swiotlb.c | 5 +++++ 2 files changed, 10 insertions(+)
--- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -76,6 +76,7 @@ bool swiotlb_map(struct device *dev, phy size_t size, enum dma_data_direction dir, unsigned long attrs); void __init swiotlb_exit(void); unsigned int swiotlb_max_segment(void); +size_t swiotlb_max_mapping_size(struct device *dev); #else #define swiotlb_force SWIOTLB_NO_FORCE static inline bool is_swiotlb_buffer(phys_addr_t paddr) @@ -95,6 +96,10 @@ static inline unsigned int swiotlb_max_s { return 0; } +static inline size_t swiotlb_max_mapping_size(struct device *dev) +{ + return SIZE_MAX; +} #endif /* CONFIG_SWIOTLB */
extern void swiotlb_print_info(void); --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -662,3 +662,8 @@ swiotlb_dma_supported(struct device *hwd { return __phys_to_dma(hwdev, io_tlb_end - 1) <= mask; } + +size_t swiotlb_max_mapping_size(struct device *dev) +{ + return ((size_t)1 << IO_TLB_SHIFT) * IO_TLB_SEGSIZE; +}
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joerg Roedel jroedel@suse.de
commit 492366f7b4237257ef50ca9c431a6a0d50225aca upstream.
This function will be used from dma_direct code to determine the maximum segment size of a dma mapping.
Cc: stable@vger.kernel.org Reviewed-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/swiotlb.h | 6 ++++++ kernel/dma/swiotlb.c | 9 +++++++++ 2 files changed, 15 insertions(+)
--- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -77,6 +77,7 @@ bool swiotlb_map(struct device *dev, phy void __init swiotlb_exit(void); unsigned int swiotlb_max_segment(void); size_t swiotlb_max_mapping_size(struct device *dev); +bool is_swiotlb_active(void); #else #define swiotlb_force SWIOTLB_NO_FORCE static inline bool is_swiotlb_buffer(phys_addr_t paddr) @@ -100,6 +101,11 @@ static inline size_t swiotlb_max_mapping { return SIZE_MAX; } + +static inline bool is_swiotlb_active(void) +{ + return false; +} #endif /* CONFIG_SWIOTLB */
extern void swiotlb_print_info(void); --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -667,3 +667,12 @@ size_t swiotlb_max_mapping_size(struct d { return ((size_t)1 << IO_TLB_SHIFT) * IO_TLB_SEGSIZE; } + +bool is_swiotlb_active(void) +{ + /* + * When SWIOTLB is initialized, even if io_tlb_start points to physical + * address zero, io_tlb_end surely doesn't. + */ + return io_tlb_end != 0; +}
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bjorn Helgaas bhelgaas@google.com
commit 10ecc818ea7319b5d0d2b4e1aa6a77323e776f76 upstream.
RussianNeuroMancer reported that the Intel 7265 wifi on a Dell Venue 11 Pro 7140 table stopped working after wakeup from suspend and bisected the problem to 9ab105deb60f ("PCI/ASPM: Disable ASPM L1.2 Substate if we don't have LTR"). David Ward reported the same problem on a Dell Latitude 7350.
After af8bb9f89838 ("PCI/ACPI: Request LTR control from platform before using it"), we don't enable LTR unless the platform has granted LTR control to us. In addition, we don't notice if the platform had already enabled LTR itself.
After 9ab105deb60f ("PCI/ASPM: Disable ASPM L1.2 Substate if we don't have LTR"), we avoid using LTR if we don't think the path to the device has LTR enabled.
The combination means that if the platform itself enables LTR but declines to give the OS control over LTR, we unnecessarily avoided using ASPM L1.2.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201469 Fixes: 9ab105deb60f ("PCI/ASPM: Disable ASPM L1.2 Substate if we don't have LTR") Fixes: af8bb9f89838 ("PCI/ACPI: Request LTR control from platform before using it") Reported-by: RussianNeuroMancer russianneuromancer@ya.ru Reported-by: David Ward david.ward@ll.mit.edu Signed-off-by: Bjorn Helgaas bhelgaas@google.com CC: stable@vger.kernel.org # v4.18+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pci/probe.c | 36 +++++++++++++++++++++++------------- 1 file changed, 23 insertions(+), 13 deletions(-)
--- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -2071,11 +2071,8 @@ static void pci_configure_ltr(struct pci { #ifdef CONFIG_PCIEASPM struct pci_host_bridge *host = pci_find_host_bridge(dev->bus); - u32 cap; struct pci_dev *bridge; - - if (!host->native_ltr) - return; + u32 cap, ctl;
if (!pci_is_pcie(dev)) return; @@ -2084,22 +2081,35 @@ static void pci_configure_ltr(struct pci if (!(cap & PCI_EXP_DEVCAP2_LTR)) return;
- /* - * Software must not enable LTR in an Endpoint unless the Root - * Complex and all intermediate Switches indicate support for LTR. - * PCIe r3.1, sec 6.18. - */ - if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) - dev->ltr_path = 1; - else { + pcie_capability_read_dword(dev, PCI_EXP_DEVCTL2, &ctl); + if (ctl & PCI_EXP_DEVCTL2_LTR_EN) { + if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) { + dev->ltr_path = 1; + return; + } + bridge = pci_upstream_bridge(dev); if (bridge && bridge->ltr_path) dev->ltr_path = 1; + + return; }
- if (dev->ltr_path) + if (!host->native_ltr) + return; + + /* + * Software must not enable LTR in an Endpoint unless the Root + * Complex and all intermediate Switches indicate support for LTR. + * PCIe r4.0, sec 6.18. + */ + if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT || + ((bridge = pci_upstream_bridge(dev)) && + bridge->ltr_path)) { pcie_capability_set_word(dev, PCI_EXP_DEVCTL2, PCI_EXP_DEVCTL2_LTR_EN); + dev->ltr_path = 1; + } #endif }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dongdong Liu liudongdong3@huawei.com
commit 9f08a5d896ce43380314c34ed3f264c8e6075b80 upstream.
Previously dpc_handler() called aer_get_device_error_info() without initializing info->severity, so aer_get_device_error_info() relied on uninitialized data.
Add dpc_get_aer_uncorrect_severity() to read the port's AER status, mask, and severity registers and set info->severity.
Also, clear the port's AER fatal error status bits.
Fixes: 8aefa9b0d910 ("PCI/DPC: Print AER status in DPC event handling") Signed-off-by: Dongdong Liu liudongdong3@huawei.com [bhelgaas: changelog] Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Keith Busch keith.busch@intel.com Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pci/pcie/dpc.c | 27 ++++++++++++++++++++++++++- 1 file changed, 26 insertions(+), 1 deletion(-)
--- a/drivers/pci/pcie/dpc.c +++ b/drivers/pci/pcie/dpc.c @@ -202,6 +202,28 @@ static void dpc_process_rp_pio_error(str pci_write_config_dword(pdev, cap + PCI_EXP_DPC_RP_PIO_STATUS, status); }
+static int dpc_get_aer_uncorrect_severity(struct pci_dev *dev, + struct aer_err_info *info) +{ + int pos = dev->aer_cap; + u32 status, mask, sev; + + pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, &status); + pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_MASK, &mask); + status &= ~mask; + if (!status) + return 0; + + pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_SEVER, &sev); + status &= sev; + if (status) + info->severity = AER_FATAL; + else + info->severity = AER_NONFATAL; + + return 1; +} + static irqreturn_t dpc_handler(int irq, void *context) { struct aer_err_info info; @@ -229,9 +251,12 @@ static irqreturn_t dpc_handler(int irq, /* show RP PIO error detail information */ if (dpc->rp_extensions && reason == 3 && ext_reason == 0) dpc_process_rp_pio_error(dpc); - else if (reason == 0 && aer_get_device_error_info(pdev, &info)) { + else if (reason == 0 && + dpc_get_aer_uncorrect_severity(pdev, &info) && + aer_get_device_error_info(pdev, &info)) { aer_print_error(pdev, &info); pci_cleanup_aer_uncorrect_error_status(pdev); + pci_aer_clear_fatal_status(pdev); }
/* We configure DPC so it only triggers on ERR_FATAL */
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bjorn Andersson bjorn.andersson@linaro.org
commit 02b485e31d98265189b91f3e69c43df2ed50610c upstream.
Acquiring the reset GPIO low means that reset is being deasserted, this is followed almost immediately with qcom_pcie_host_init() asserting it, initializing it and then finally deasserting it again, for the link to come up.
Some PCIe devices requires a minimum time between the initial deassert and subsequent reset cycles. In a platform that boots with the reset GPIO asserted this requirement is being violated by this deassert/assert pulse.
Acquire the reset GPIO high to prevent this situation by matching the state to the subsequent asserted state.
Fixes: 82a823833f4e ("PCI: qcom: Add Qualcomm PCIe controller driver") Signed-off-by: Bjorn Andersson bjorn.andersson@linaro.org [lorenzo.pieralisi@arm.com: updated commit log] Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Acked-by: Stanimir Varbanov svarbanov@mm-sol.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pci/controller/dwc/pcie-qcom.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/pci/controller/dwc/pcie-qcom.c +++ b/drivers/pci/controller/dwc/pcie-qcom.c @@ -1228,7 +1228,7 @@ static int qcom_pcie_probe(struct platfo
pcie->ops = of_device_get_match_data(dev);
- pcie->reset = devm_gpiod_get_optional(dev, "perst", GPIOD_OUT_LOW); + pcie->reset = devm_gpiod_get_optional(dev, "perst", GPIOD_OUT_HIGH); if (IS_ERR(pcie->reset)) { ret = PTR_ERR(pcie->reset); goto err_pm_runtime_put;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lucas Stach l.stach@pengutronix.de
commit 3afc8299f39a27b60e1519a28e18878ce878e7dd upstream.
Since 7c5925afbc58 (PCI: dwc: Move MSI IRQs allocation to IRQ domains hierarchical API) the MSI init claims one of the controller IRQs as a chained IRQ line for the MSI controller. On some designs, like the i.MX6, this line is shared with a PCIe legacy IRQ. When the line is claimed for the MSI domain, any device trying to use this legacy IRQs will fail to request this IRQ line.
As MSI and legacy IRQs are already mutually exclusive on the DWC core, as the core won't forward any legacy IRQs once any MSI has been enabled, users wishing to use legacy IRQs already need to explictly disable MSI support (usually via the pci=nomsi kernel commandline option). To avoid any issues with MSI conflicting with legacy IRQs, just skip all of the DWC MSI initalization, including the IRQ line claim, when MSI is disabled.
Fixes: 7c5925afbc58 ("PCI: dwc: Move MSI IRQs allocation to IRQ domains hierarchical API") Tested-by: Tim Harvey tharvey@gateworks.com Signed-off-by: Lucas Stach l.stach@pengutronix.de Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Acked-by: Gustavo Pimentel gustavo.pimentel@synopsys.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pci/controller/dwc/pcie-designware-host.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/pci/controller/dwc/pcie-designware-host.c +++ b/drivers/pci/controller/dwc/pcie-designware-host.c @@ -439,7 +439,7 @@ int dw_pcie_host_init(struct pcie_port * if (ret) pci->num_viewport = 2;
- if (IS_ENABLED(CONFIG_PCI_MSI)) { + if (IS_ENABLED(CONFIG_PCI_MSI) && pci_msi_enabled()) { /* * If a specific SoC driver needs to change the * default number of vectors, it needs to implement
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mika Westerberg mika.westerberg@linux.intel.com
commit bbe54ea5330d828cc396d451c0e1e5c3f9764c1e upstream.
Commit 0e157e528604 ("PCI/PME: Implement runtime PM callbacks") tried to solve an issue where the hierarchy immediately wakes up when it is transitioned into D3cold. However, it turns out to prevent PME propagation on some systems that do not support D3cold.
I looked more closely at what might cause the immediate wakeup. It happens when the ACPI power resource of the root port is turned off. The AML code associated with the _OFF() method of the ACPI power resource starts a PCIe L2/L3 Ready transition and waits for it to complete. Right after the L2/L3 Ready transition is started the root port receives a PME from the downstream port.
The simplest hierarchy where this happens looks like this:
00:1d.0 PCIe Root Port ^ | v 05:00.0 PCIe switch #1 upstream port 06:01.0 PCIe switch #1 downstream hotplug port ^ | v 08:00.0 PCIe switch #2 upstream port
It seems that the PCIe link between the two switches, before PME_Turn_Off/PME_TO_Ack is complete for the whole hierarchy, goes inactive and triggers PME towards the root port bringing it back to D0. The L2/L3 Ready sequence is described in PCIe r4.0 spec sections 5.2 and 5.3.3 but unfortunately they do not state what happens if DLLSCE is enabled during the sequence.
Disabling Data Link Layer State Changed event (DLLSCE) seems to prevent the issue and still allows the downstream hotplug port to notice when a device is plugged/unplugged.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=202593 Fixes: 0e157e528604 ("PCI/PME: Implement runtime PM callbacks") Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Rafael J. Wysocki rafael.j.wysocki@intel.com CC: stable@vger.kernel.org # v4.20+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pci/hotplug/pciehp_hpc.c | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-)
--- a/drivers/pci/hotplug/pciehp_hpc.c +++ b/drivers/pci/hotplug/pciehp_hpc.c @@ -736,12 +736,25 @@ void pcie_clear_hotplug_events(struct co
void pcie_enable_interrupt(struct controller *ctrl) { - pcie_write_cmd(ctrl, PCI_EXP_SLTCTL_HPIE, PCI_EXP_SLTCTL_HPIE); + u16 mask; + + mask = PCI_EXP_SLTCTL_HPIE | PCI_EXP_SLTCTL_DLLSCE; + pcie_write_cmd(ctrl, mask, mask); }
void pcie_disable_interrupt(struct controller *ctrl) { - pcie_write_cmd(ctrl, 0, PCI_EXP_SLTCTL_HPIE); + u16 mask; + + /* + * Mask hot-plug interrupt to prevent it triggering immediately + * when the link goes inactive (we still get PME when any of the + * enabled events is detected). Same goes with Link Layer State + * changed event which generates PME immediately when the link goes + * inactive so mask it as well. + */ + mask = PCI_EXP_SLTCTL_HPIE | PCI_EXP_SLTCTL_DLLSCE; + pcie_write_cmd(ctrl, 0, mask); }
/*
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Petazzoni thomas.petazzoni@bootlin.com
commit 59f81c35e0df840f7112cb296dde48df84a67c79 upstream.
The behavior of the different registers of the PCI-to-PCI bridge is currently encoded in two global arrays, shared by all instances of PCI-to-PCI bridge emulation.
However, we will need to tweak the behavior on a per-bridge basis, to accommodate for different capabilities of the platforms where this code is used. In preparation for this, create a per-bridge copy of the register behavior arrays, so that they can later be tweaked on a per-bridge basis.
Fixes: 1f08673eef123 ("PCI: mvebu: Convert to PCI emulated bridge config space") Reported-by: Luís Mendes luis.p.mendes@gmail.com Reported-by: Leigh Brown leigh@solinno.co.uk Tested-by: Leigh Brown leigh@solinno.co.uk Tested-by: Luis Mendes luis.p.mendes@gmail.com Signed-off-by: Thomas Petazzoni thomas.petazzoni@bootlin.com Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Cc: stable@vger.kernel.org Cc: Luís Mendes luis.p.mendes@gmail.com Cc: Leigh Brown leigh@solinno.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pci/pci-bridge-emul.c | 80 +++++++++++++++++++++++++++--------------- drivers/pci/pci-bridge-emul.h | 8 +++- 2 files changed, 60 insertions(+), 28 deletions(-)
--- a/drivers/pci/pci-bridge-emul.c +++ b/drivers/pci/pci-bridge-emul.c @@ -24,29 +24,6 @@ #define PCI_CAP_PCIE_START PCI_BRIDGE_CONF_END #define PCI_CAP_PCIE_END (PCI_CAP_PCIE_START + PCI_EXP_SLTSTA2 + 2)
-/* - * Initialize a pci_bridge_emul structure to represent a fake PCI - * bridge configuration space. The caller needs to have initialized - * the PCI configuration space with whatever values make sense - * (typically at least vendor, device, revision), the ->ops pointer, - * and optionally ->data and ->has_pcie. - */ -void pci_bridge_emul_init(struct pci_bridge_emul *bridge) -{ - bridge->conf.class_revision |= PCI_CLASS_BRIDGE_PCI << 16; - bridge->conf.header_type = PCI_HEADER_TYPE_BRIDGE; - bridge->conf.cache_line_size = 0x10; - bridge->conf.status = PCI_STATUS_CAP_LIST; - - if (bridge->has_pcie) { - bridge->conf.capabilities_pointer = PCI_CAP_PCIE_START; - bridge->pcie_conf.cap_id = PCI_CAP_ID_EXP; - /* Set PCIe v2, root port, slot support */ - bridge->pcie_conf.cap = PCI_EXP_TYPE_ROOT_PORT << 4 | 2 | - PCI_EXP_FLAGS_SLOT; - } -} - struct pci_bridge_reg_behavior { /* Read-only bits */ u32 ro; @@ -284,6 +261,55 @@ const static struct pci_bridge_reg_behav };
/* + * Initialize a pci_bridge_emul structure to represent a fake PCI + * bridge configuration space. The caller needs to have initialized + * the PCI configuration space with whatever values make sense + * (typically at least vendor, device, revision), the ->ops pointer, + * and optionally ->data and ->has_pcie. + */ +int pci_bridge_emul_init(struct pci_bridge_emul *bridge) +{ + bridge->conf.class_revision |= PCI_CLASS_BRIDGE_PCI << 16; + bridge->conf.header_type = PCI_HEADER_TYPE_BRIDGE; + bridge->conf.cache_line_size = 0x10; + bridge->conf.status = PCI_STATUS_CAP_LIST; + bridge->pci_regs_behavior = kmemdup(pci_regs_behavior, + sizeof(pci_regs_behavior), + GFP_KERNEL); + if (!bridge->pci_regs_behavior) + return -ENOMEM; + + if (bridge->has_pcie) { + bridge->conf.capabilities_pointer = PCI_CAP_PCIE_START; + bridge->pcie_conf.cap_id = PCI_CAP_ID_EXP; + /* Set PCIe v2, root port, slot support */ + bridge->pcie_conf.cap = PCI_EXP_TYPE_ROOT_PORT << 4 | 2 | + PCI_EXP_FLAGS_SLOT; + bridge->pcie_cap_regs_behavior = + kmemdup(pcie_cap_regs_behavior, + sizeof(pcie_cap_regs_behavior), + GFP_KERNEL); + if (!bridge->pcie_cap_regs_behavior) { + kfree(bridge->pci_regs_behavior); + return -ENOMEM; + } + } + + return 0; +} + +/* + * Cleanup a pci_bridge_emul structure that was previously initilized + * using pci_bridge_emul_init(). + */ +void pci_bridge_emul_cleanup(struct pci_bridge_emul *bridge) +{ + if (bridge->has_pcie) + kfree(bridge->pcie_cap_regs_behavior); + kfree(bridge->pci_regs_behavior); +} + +/* * Should be called by the PCI controller driver when reading the PCI * configuration space of the fake bridge. It will call back the * ->ops->read_base or ->ops->read_pcie operations. @@ -312,11 +338,11 @@ int pci_bridge_emul_conf_read(struct pci reg -= PCI_CAP_PCIE_START; read_op = bridge->ops->read_pcie; cfgspace = (u32 *) &bridge->pcie_conf; - behavior = pcie_cap_regs_behavior; + behavior = bridge->pcie_cap_regs_behavior; } else { read_op = bridge->ops->read_base; cfgspace = (u32 *) &bridge->conf; - behavior = pci_regs_behavior; + behavior = bridge->pci_regs_behavior; }
if (read_op) @@ -383,11 +409,11 @@ int pci_bridge_emul_conf_write(struct pc reg -= PCI_CAP_PCIE_START; write_op = bridge->ops->write_pcie; cfgspace = (u32 *) &bridge->pcie_conf; - behavior = pcie_cap_regs_behavior; + behavior = bridge->pcie_cap_regs_behavior; } else { write_op = bridge->ops->write_base; cfgspace = (u32 *) &bridge->conf; - behavior = pci_regs_behavior; + behavior = bridge->pci_regs_behavior; }
/* Keep all bits, except the RW bits */ --- a/drivers/pci/pci-bridge-emul.h +++ b/drivers/pci/pci-bridge-emul.h @@ -107,15 +107,21 @@ struct pci_bridge_emul_ops { u32 old, u32 new, u32 mask); };
+struct pci_bridge_reg_behavior; + struct pci_bridge_emul { struct pci_bridge_emul_conf conf; struct pci_bridge_emul_pcie_conf pcie_conf; struct pci_bridge_emul_ops *ops; + struct pci_bridge_reg_behavior *pci_regs_behavior; + struct pci_bridge_reg_behavior *pcie_cap_regs_behavior; void *data; bool has_pcie; };
-void pci_bridge_emul_init(struct pci_bridge_emul *bridge); +int pci_bridge_emul_init(struct pci_bridge_emul *bridge); +void pci_bridge_emul_cleanup(struct pci_bridge_emul *bridge); + int pci_bridge_emul_conf_read(struct pci_bridge_emul *bridge, int where, int size, u32 *value); int pci_bridge_emul_conf_write(struct pci_bridge_emul *bridge, int where,
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Petazzoni thomas.petazzoni@bootlin.com
commit 33776d059630e5045ea9ccf756c74de8f9cc86de upstream.
Depending on the capabilities of the PCI controller/platform, the PCI-to-PCI bridge emulation behavior might need to be different. For example, on platforms that use the pci-mvebu code, we currently don't support prefetchable memory BARs, so the corresponding fields in the PCI-to-PCI bridge configuration space should be read-only.
To implement this, extend pci_bridge_emul_init() to take a "flags" argument, with currently one flag supported:
PCI_BRIDGE_EMUL_NO_PREFETCHABLE_BAR
that will make the prefetchable memory base and limit registers read-only.
The pci-mvebu and pci-aardvark drivers are updated accordingly.
Fixes: 1f08673eef123 ("PCI: mvebu: Convert to PCI emulated bridge config space") Reported-by: Luís Mendes luis.p.mendes@gmail.com Reported-by: Leigh Brown leigh@solinno.co.uk Tested-by: Leigh Brown leigh@solinno.co.uk Tested-by: Luis Mendes luis.p.mendes@gmail.com Signed-off-by: Thomas Petazzoni thomas.petazzoni@bootlin.com Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Cc: stable@vger.kernel.org Cc: Luís Mendes luis.p.mendes@gmail.com Cc: Leigh Brown leigh@solinno.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pci/controller/pci-aardvark.c | 2 +- drivers/pci/controller/pci-mvebu.c | 2 +- drivers/pci/pci-bridge-emul.c | 8 +++++++- drivers/pci/pci-bridge-emul.h | 7 ++++++- 4 files changed, 15 insertions(+), 4 deletions(-)
--- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -499,7 +499,7 @@ static void advk_sw_pci_bridge_init(stru bridge->data = pcie; bridge->ops = &advk_pci_bridge_emul_ops;
- pci_bridge_emul_init(bridge); + pci_bridge_emul_init(bridge, 0);
}
--- a/drivers/pci/controller/pci-mvebu.c +++ b/drivers/pci/controller/pci-mvebu.c @@ -583,7 +583,7 @@ static void mvebu_pci_bridge_emul_init(s bridge->data = port; bridge->ops = &mvebu_pci_bridge_emul_ops;
- pci_bridge_emul_init(bridge); + pci_bridge_emul_init(bridge, PCI_BRIDGE_EMUL_NO_PREFETCHABLE_BAR); }
static inline struct mvebu_pcie *sys_to_pcie(struct pci_sys_data *sys) --- a/drivers/pci/pci-bridge-emul.c +++ b/drivers/pci/pci-bridge-emul.c @@ -267,7 +267,8 @@ const static struct pci_bridge_reg_behav * (typically at least vendor, device, revision), the ->ops pointer, * and optionally ->data and ->has_pcie. */ -int pci_bridge_emul_init(struct pci_bridge_emul *bridge) +int pci_bridge_emul_init(struct pci_bridge_emul *bridge, + unsigned int flags) { bridge->conf.class_revision |= PCI_CLASS_BRIDGE_PCI << 16; bridge->conf.header_type = PCI_HEADER_TYPE_BRIDGE; @@ -295,6 +296,11 @@ int pci_bridge_emul_init(struct pci_brid } }
+ if (flags & PCI_BRIDGE_EMUL_NO_PREFETCHABLE_BAR) { + bridge->pci_regs_behavior[PCI_PREF_MEMORY_BASE / 4].ro = ~0; + bridge->pci_regs_behavior[PCI_PREF_MEMORY_BASE / 4].rw = 0; + } + return 0; }
--- a/drivers/pci/pci-bridge-emul.h +++ b/drivers/pci/pci-bridge-emul.h @@ -119,7 +119,12 @@ struct pci_bridge_emul { bool has_pcie; };
-int pci_bridge_emul_init(struct pci_bridge_emul *bridge); +enum { + PCI_BRIDGE_EMUL_NO_PREFETCHABLE_BAR = BIT(0), +}; + +int pci_bridge_emul_init(struct pci_bridge_emul *bridge, + unsigned int flags); void pci_bridge_emul_cleanup(struct pci_bridge_emul *bridge);
int pci_bridge_emul_conf_read(struct pci_bridge_emul *bridge, int where,
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael J. Ruhl michael.j.ruhl@intel.com
commit bc5add09764c123f58942a37c8335247e683d234 upstream.
When disabling and removing a receive context, it is possible for an asynchronous event (i.e IRQ) to occur. Because of this, there is a race between cleaning up the context, and the context being used by the asynchronous event.
cpu 0 (context cleanup) rc->ref_count-- (ref_count == 0) hfi1_rcd_free() cpu 1 (IRQ (with rcd index)) rcd_get_by_index() lock ref_count+++ <-- reference count race (WARNING) return rcd unlock cpu 0 hfi1_free_ctxtdata() <-- incorrect free location lock remove rcd from array unlock free rcd
This race will cause the following WARNING trace:
WARNING: CPU: 0 PID: 175027 at include/linux/kref.h:52 hfi1_rcd_get_by_index+0x84/0xa0 [hfi1] CPU: 0 PID: 175027 Comm: IMB-MPI1 Kdump: loaded Tainted: G OE ------------ 3.10.0-957.el7.x86_64 #1 Hardware name: Intel Corporation S2600KP/S2600KP, BIOS SE5C610.86B.11.01.0076.C4.111920150602 11/19/2015 Call Trace: dump_stack+0x19/0x1b __warn+0xd8/0x100 warn_slowpath_null+0x1d/0x20 hfi1_rcd_get_by_index+0x84/0xa0 [hfi1] is_rcv_urgent_int+0x24/0x90 [hfi1] general_interrupt+0x1b6/0x210 [hfi1] __handle_irq_event_percpu+0x44/0x1c0 handle_irq_event_percpu+0x32/0x80 handle_irq_event+0x3c/0x60 handle_edge_irq+0x7f/0x150 handle_irq+0xe4/0x1a0 do_IRQ+0x4d/0xf0 common_interrupt+0x162/0x162
The race can also lead to a use after free which could be similar to:
general protection fault: 0000 1 SMP CPU: 71 PID: 177147 Comm: IMB-MPI1 Kdump: loaded Tainted: G W OE ------------ 3.10.0-957.el7.x86_64 #1 Hardware name: Intel Corporation S2600KP/S2600KP, BIOS SE5C610.86B.11.01.0076.C4.111920150602 11/19/2015 task: ffff9962a8098000 ti: ffff99717a508000 task.ti: ffff99717a508000 __kmalloc+0x94/0x230 Call Trace: ? hfi1_user_sdma_process_request+0x9c8/0x1250 [hfi1] hfi1_user_sdma_process_request+0x9c8/0x1250 [hfi1] hfi1_aio_write+0xba/0x110 [hfi1] do_sync_readv_writev+0x7b/0xd0 do_readv_writev+0xce/0x260 ? handle_mm_fault+0x39d/0x9b0 ? pick_next_task_fair+0x5f/0x1b0 ? sched_clock_cpu+0x85/0xc0 ? __schedule+0x13a/0x890 vfs_writev+0x35/0x60 SyS_writev+0x7f/0x110 system_call_fastpath+0x22/0x27
Use the appropriate kref API to verify access.
Reorder context cleanup to ensure context removal before cleanup occurs correctly.
Cc: stable@vger.kernel.org # v4.14.0+ Fixes: f683c80ca68e ("IB/hfi1: Resolve kernel panics by reference counting receive contexts") Reviewed-by: Mike Marciniszyn mike.marciniszyn@intel.com Signed-off-by: Michael J. Ruhl michael.j.ruhl@intel.com Signed-off-by: Dennis Dalessandro dennis.dalessandro@intel.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/infiniband/hw/hfi1/hfi.h | 2 +- drivers/infiniband/hw/hfi1/init.c | 14 +++++++++----- 2 files changed, 10 insertions(+), 6 deletions(-)
--- a/drivers/infiniband/hw/hfi1/hfi.h +++ b/drivers/infiniband/hw/hfi1/hfi.h @@ -1435,7 +1435,7 @@ void hfi1_init_pportdata(struct pci_dev struct hfi1_devdata *dd, u8 hw_pidx, u8 port); void hfi1_free_ctxtdata(struct hfi1_devdata *dd, struct hfi1_ctxtdata *rcd); int hfi1_rcd_put(struct hfi1_ctxtdata *rcd); -void hfi1_rcd_get(struct hfi1_ctxtdata *rcd); +int hfi1_rcd_get(struct hfi1_ctxtdata *rcd); struct hfi1_ctxtdata *hfi1_rcd_get_by_index_safe(struct hfi1_devdata *dd, u16 ctxt); struct hfi1_ctxtdata *hfi1_rcd_get_by_index(struct hfi1_devdata *dd, u16 ctxt); --- a/drivers/infiniband/hw/hfi1/init.c +++ b/drivers/infiniband/hw/hfi1/init.c @@ -215,12 +215,12 @@ static void hfi1_rcd_free(struct kref *k struct hfi1_ctxtdata *rcd = container_of(kref, struct hfi1_ctxtdata, kref);
- hfi1_free_ctxtdata(rcd->dd, rcd); - spin_lock_irqsave(&rcd->dd->uctxt_lock, flags); rcd->dd->rcd[rcd->ctxt] = NULL; spin_unlock_irqrestore(&rcd->dd->uctxt_lock, flags);
+ hfi1_free_ctxtdata(rcd->dd, rcd); + kfree(rcd); }
@@ -243,10 +243,13 @@ int hfi1_rcd_put(struct hfi1_ctxtdata *r * @rcd: pointer to an initialized rcd data structure * * Use this to get a reference after the init. + * + * Return : reflect kref_get_unless_zero(), which returns non-zero on + * increment, otherwise 0. */ -void hfi1_rcd_get(struct hfi1_ctxtdata *rcd) +int hfi1_rcd_get(struct hfi1_ctxtdata *rcd) { - kref_get(&rcd->kref); + return kref_get_unless_zero(&rcd->kref); }
/** @@ -326,7 +329,8 @@ struct hfi1_ctxtdata *hfi1_rcd_get_by_in spin_lock_irqsave(&dd->uctxt_lock, flags); if (dd->rcd[ctxt]) { rcd = dd->rcd[ctxt]; - hfi1_rcd_get(rcd); + if (!hfi1_rcd_get(rcd)) + rcd = NULL; } spin_unlock_irqrestore(&dd->uctxt_lock, flags);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mike Marciniszyn mike.marciniszyn@intel.com
commit 38bbc9f0381550d1d227fc57afa08436e36b32fc upstream.
The IBTA spec notes:
o9-5.2.1: For any HCA which supports SEND with Invalidate, upon receiving an IETH, the Invalidate operation must not take place until after the normal transport header validation checks have been successfully completed.
The rdmavt loopback code does the validation after the invalidate.
Fix by relocating the operation specific logic for all SEND variants until after the validity checks.
Cc: stable@vger.kernel.org #v4.20+ Reviewed-by: Michael J. Ruhl michael.j.ruhl@intel.com Signed-off-by: Mike Marciniszyn mike.marciniszyn@intel.com Signed-off-by: Dennis Dalessandro dennis.dalessandro@intel.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/infiniband/sw/rdmavt/qp.c | 26 ++++++++++++++++---------- 1 file changed, 16 insertions(+), 10 deletions(-)
--- a/drivers/infiniband/sw/rdmavt/qp.c +++ b/drivers/infiniband/sw/rdmavt/qp.c @@ -2893,18 +2893,8 @@ again: goto send_comp;
case IB_WR_SEND_WITH_INV: - if (!rvt_invalidate_rkey(qp, wqe->wr.ex.invalidate_rkey)) { - wc.wc_flags = IB_WC_WITH_INVALIDATE; - wc.ex.invalidate_rkey = wqe->wr.ex.invalidate_rkey; - } - goto send; - case IB_WR_SEND_WITH_IMM: - wc.wc_flags = IB_WC_WITH_IMM; - wc.ex.imm_data = wqe->wr.ex.imm_data; - /* FALLTHROUGH */ case IB_WR_SEND: -send: ret = rvt_get_rwqe(qp, false); if (ret < 0) goto op_err; @@ -2912,6 +2902,22 @@ send: goto rnr_nak; if (wqe->length > qp->r_len) goto inv_err; + switch (wqe->wr.opcode) { + case IB_WR_SEND_WITH_INV: + if (!rvt_invalidate_rkey(qp, + wqe->wr.ex.invalidate_rkey)) { + wc.wc_flags = IB_WC_WITH_INVALIDATE; + wc.ex.invalidate_rkey = + wqe->wr.ex.invalidate_rkey; + } + break; + case IB_WR_SEND_WITH_IMM: + wc.wc_flags = IB_WC_WITH_IMM; + wc.ex.imm_data = wqe->wr.ex.imm_data; + break; + default: + break; + } break;
case IB_WR_RDMA_WRITE_WITH_IMM:
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael J. Ruhl michael.j.ruhl@intel.com
commit d757c60eca9b22f4d108929a24401e0fdecda0b1 upstream.
The RC/UC code path can go through a software loopback. In this code path the receive side QP is manipulated.
If two threads are working on the QP receive side (i.e. post_send, and modify_qp to an error state), QP information can be corrupted.
(post_send via loopback) set r_sge loop update r_sge (modify_qp) take r_lock update r_sge <---- r_sge is now incorrect (post_send) update r_sge <---- crash, etc. ...
This can lead to one of the two following crashes:
BUG: unable to handle kernel NULL pointer dereference at (null) IP: hfi1_copy_sge+0xf1/0x2e0 [hfi1] PGD 8000001fe6a57067 PUD 1fd9e0c067 PMD 0 Call Trace: ruc_loopback+0x49b/0xbc0 [hfi1] hfi1_do_send+0x38e/0x3e0 [hfi1] _hfi1_do_send+0x1e/0x20 [hfi1] process_one_work+0x17f/0x440 worker_thread+0x126/0x3c0 kthread+0xd1/0xe0 ret_from_fork_nospec_begin+0x21/0x21
or:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000048 IP: rvt_clear_mr_refs+0x45/0x370 [rdmavt] PGD 80000006ae5eb067 PUD ef15d0067 PMD 0 Call Trace: rvt_error_qp+0xaa/0x240 [rdmavt] rvt_modify_qp+0x47f/0xaa0 [rdmavt] ib_security_modify_qp+0x8f/0x400 [ib_core] ib_modify_qp_with_udata+0x44/0x70 [ib_core] modify_qp.isra.23+0x1eb/0x2b0 [ib_uverbs] ib_uverbs_modify_qp+0xaa/0xf0 [ib_uverbs] ib_uverbs_write+0x272/0x430 [ib_uverbs] vfs_write+0xc0/0x1f0 SyS_write+0x7f/0xf0 system_call_fastpath+0x1c/0x21
Fix by using the appropriate locking on the receiving QP.
Fixes: 15703461533a ("IB/{hfi1, qib, rdmavt}: Move ruc_loopback to rdmavt") Cc: stable@vger.kernel.org #v4.9+ Reviewed-by: Mike Marciniszyn mike.marciniszyn@intel.com Signed-off-by: Michael J. Ruhl michael.j.ruhl@intel.com Signed-off-by: Dennis Dalessandro dennis.dalessandro@intel.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/infiniband/sw/rdmavt/qp.c | 33 +++++++++++++++++++++++---------- 1 file changed, 23 insertions(+), 10 deletions(-)
--- a/drivers/infiniband/sw/rdmavt/qp.c +++ b/drivers/infiniband/sw/rdmavt/qp.c @@ -2785,6 +2785,18 @@ again: } EXPORT_SYMBOL(rvt_copy_sge);
+static enum ib_wc_status loopback_qp_drop(struct rvt_ibport *rvp, + struct rvt_qp *sqp) +{ + rvp->n_pkt_drops++; + /* + * For RC, the requester would timeout and retry so + * shortcut the timeouts and just signal too many retries. + */ + return sqp->ibqp.qp_type == IB_QPT_RC ? + IB_WC_RETRY_EXC_ERR : IB_WC_SUCCESS; +} + /** * ruc_loopback - handle UC and RC loopback requests * @sqp: the sending QP @@ -2857,17 +2869,14 @@ again: } spin_unlock_irqrestore(&sqp->s_lock, flags);
- if (!qp || !(ib_rvt_state_ops[qp->state] & RVT_PROCESS_RECV_OK) || + if (!qp) { + send_status = loopback_qp_drop(rvp, sqp); + goto serr_no_r_lock; + } + spin_lock_irqsave(&qp->r_lock, flags); + if (!(ib_rvt_state_ops[qp->state] & RVT_PROCESS_RECV_OK) || qp->ibqp.qp_type != sqp->ibqp.qp_type) { - rvp->n_pkt_drops++; - /* - * For RC, the requester would timeout and retry so - * shortcut the timeouts and just signal too many retries. - */ - if (sqp->ibqp.qp_type == IB_QPT_RC) - send_status = IB_WC_RETRY_EXC_ERR; - else - send_status = IB_WC_SUCCESS; + send_status = loopback_qp_drop(rvp, sqp); goto serr; }
@@ -3047,6 +3056,7 @@ do_write: wqe->wr.send_flags & IB_SEND_SOLICITED);
send_comp: + spin_unlock_irqrestore(&qp->r_lock, flags); spin_lock_irqsave(&sqp->s_lock, flags); rvp->n_loop_pkts++; flush_send: @@ -3073,6 +3083,7 @@ rnr_nak: } if (sqp->s_rnr_retry_cnt < 7) sqp->s_rnr_retry--; + spin_unlock_irqrestore(&qp->r_lock, flags); spin_lock_irqsave(&sqp->s_lock, flags); if (!(ib_rvt_state_ops[sqp->state] & RVT_PROCESS_RECV_OK)) goto clr_busy; @@ -3101,6 +3112,8 @@ err: rvt_rc_error(qp, wc.status);
serr: + spin_unlock_irqrestore(&qp->r_lock, flags); +serr_no_r_lock: spin_lock_irqsave(&sqp->s_lock, flags); rvt_send_complete(sqp, wqe, send_status); if (sqp->ibqp.qp_type == IB_QPT_RC) {
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vaibhav Jain vaibhav@linux.ibm.com
commit edeb304f659792fb5bab90d7d6f3408b4c7301fb upstream.
Within cxl module, iteration over array 'adapter->afu' may be racy at few points as it might be simultaneously read during an EEH and its contents being set to NULL while driver is being unloaded or unbound from the adapter. This might result in a NULL pointer to 'struct afu' being de-referenced during an EEH thereby causing a kernel oops.
This patch fixes this by making sure that all access to the array 'adapter->afu' is wrapped within the context of spin-lock 'adapter->afu_list_lock'.
Fixes: 9e8df8a21963 ("cxl: EEH support") Cc: stable@vger.kernel.org # v4.3+ Acked-by: Andrew Donnellan andrew.donnellan@au1.ibm.com Acked-by: Frederic Barrat fbarrat@linux.ibm.com Acked-by: Christophe Lombard clombard@linux.vnet.ibm.com Signed-off-by: Vaibhav Jain vaibhav@linux.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/misc/cxl/guest.c | 2 ++ drivers/misc/cxl/pci.c | 39 ++++++++++++++++++++++++++++++--------- 2 files changed, 32 insertions(+), 9 deletions(-)
--- a/drivers/misc/cxl/guest.c +++ b/drivers/misc/cxl/guest.c @@ -267,6 +267,7 @@ static int guest_reset(struct cxl *adapt int i, rc;
pr_devel("Adapter reset request\n"); + spin_lock(&adapter->afu_list_lock); for (i = 0; i < adapter->slices; i++) { if ((afu = adapter->afu[i])) { pci_error_handlers(afu, CXL_ERROR_DETECTED_EVENT, @@ -283,6 +284,7 @@ static int guest_reset(struct cxl *adapt pci_error_handlers(afu, CXL_RESUME_EVENT, 0); } } + spin_unlock(&adapter->afu_list_lock); return rc; }
--- a/drivers/misc/cxl/pci.c +++ b/drivers/misc/cxl/pci.c @@ -1805,7 +1805,7 @@ static pci_ers_result_t cxl_vphb_error_d /* There should only be one entry, but go through the list * anyway */ - if (afu->phb == NULL) + if (afu == NULL || afu->phb == NULL) return result;
list_for_each_entry(afu_dev, &afu->phb->bus->devices, bus_list) { @@ -1832,7 +1832,8 @@ static pci_ers_result_t cxl_pci_error_de { struct cxl *adapter = pci_get_drvdata(pdev); struct cxl_afu *afu; - pci_ers_result_t result = PCI_ERS_RESULT_NEED_RESET, afu_result; + pci_ers_result_t result = PCI_ERS_RESULT_NEED_RESET; + pci_ers_result_t afu_result = PCI_ERS_RESULT_NEED_RESET; int i;
/* At this point, we could still have an interrupt pending. @@ -1843,6 +1844,7 @@ static pci_ers_result_t cxl_pci_error_de
/* If we're permanently dead, give up. */ if (state == pci_channel_io_perm_failure) { + spin_lock(&adapter->afu_list_lock); for (i = 0; i < adapter->slices; i++) { afu = adapter->afu[i]; /* @@ -1851,6 +1853,7 @@ static pci_ers_result_t cxl_pci_error_de */ cxl_vphb_error_detected(afu, state); } + spin_unlock(&adapter->afu_list_lock); return PCI_ERS_RESULT_DISCONNECT; }
@@ -1932,11 +1935,17 @@ static pci_ers_result_t cxl_pci_error_de * * In slot_reset, free the old resources and allocate new ones. * * In resume, clear the flag to allow things to start. */ + + /* Make sure no one else changes the afu list */ + spin_lock(&adapter->afu_list_lock); + for (i = 0; i < adapter->slices; i++) { afu = adapter->afu[i];
- afu_result = cxl_vphb_error_detected(afu, state); + if (afu == NULL) + continue;
+ afu_result = cxl_vphb_error_detected(afu, state); cxl_context_detach_all(afu); cxl_ops->afu_deactivate_mode(afu, afu->current_mode); pci_deconfigure_afu(afu); @@ -1948,6 +1957,7 @@ static pci_ers_result_t cxl_pci_error_de (result == PCI_ERS_RESULT_NEED_RESET)) result = PCI_ERS_RESULT_NONE; } + spin_unlock(&adapter->afu_list_lock);
/* should take the context lock here */ if (cxl_adapter_context_lock(adapter) != 0) @@ -1980,14 +1990,18 @@ static pci_ers_result_t cxl_pci_slot_res */ cxl_adapter_context_unlock(adapter);
+ spin_lock(&adapter->afu_list_lock); for (i = 0; i < adapter->slices; i++) { afu = adapter->afu[i];
+ if (afu == NULL) + continue; + if (pci_configure_afu(afu, adapter, pdev)) - goto err; + goto err_unlock;
if (cxl_afu_select_best_mode(afu)) - goto err; + goto err_unlock;
if (afu->phb == NULL) continue; @@ -1999,16 +2013,16 @@ static pci_ers_result_t cxl_pci_slot_res ctx = cxl_get_context(afu_dev);
if (ctx && cxl_release_context(ctx)) - goto err; + goto err_unlock;
ctx = cxl_dev_context_init(afu_dev); if (IS_ERR(ctx)) - goto err; + goto err_unlock;
afu_dev->dev.archdata.cxl_ctx = ctx;
if (cxl_ops->afu_check_and_enable(afu)) - goto err; + goto err_unlock;
afu_dev->error_state = pci_channel_io_normal;
@@ -2029,8 +2043,13 @@ static pci_ers_result_t cxl_pci_slot_res result = PCI_ERS_RESULT_DISCONNECT; } } + + spin_unlock(&adapter->afu_list_lock); return result;
+err_unlock: + spin_unlock(&adapter->afu_list_lock); + err: /* All the bits that happen in both error_detected and cxl_remove * should be idempotent, so we don't need to worry about leaving a mix @@ -2051,10 +2070,11 @@ static void cxl_pci_resume(struct pci_de * This is not the place to be checking if everything came back up * properly, because there's no return value: do that in slot_reset. */ + spin_lock(&adapter->afu_list_lock); for (i = 0; i < adapter->slices; i++) { afu = adapter->afu[i];
- if (afu->phb == NULL) + if (afu == NULL || afu->phb == NULL) continue;
list_for_each_entry(afu_dev, &afu->phb->bus->devices, bus_list) { @@ -2063,6 +2083,7 @@ static void cxl_pci_resume(struct pci_de afu_dev->driver->err_handler->resume(afu_dev); } } + spin_unlock(&adapter->afu_list_lock); }
static const struct pci_error_handlers cxl_err_handler = {
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit 1c2d14212b15a60300a2d4f6364753e87394c521 upstream.
When ext2 filesystem is created with 64k block size, ext2_max_size() will return value less than 0. Also, we cannot write any file in this fs since the sb->maxbytes is less than 0. The core of the problem is that the size of block index tree for such large block size is more than i_blocks can carry. So fix the computation to count with this possibility.
File size limits computed with the new function for the full range of possible block sizes look like:
bits file_size 10 17247252480 11 275415851008 12 2196873666560 13 2197948973056 14 2198486220800 15 2198754754560 16 2198888906752
CC: stable@vger.kernel.org Reported-by: yangerkun yangerkun@huawei.com Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/ext2/super.c | 41 ++++++++++++++++++++++++++--------------- 1 file changed, 26 insertions(+), 15 deletions(-)
--- a/fs/ext2/super.c +++ b/fs/ext2/super.c @@ -757,7 +757,8 @@ static loff_t ext2_max_size(int bits) { loff_t res = EXT2_NDIR_BLOCKS; int meta_blocks; - loff_t upper_limit; + unsigned int upper_limit; + unsigned int ppb = 1 << (bits-2);
/* This is calculated to be the largest file size for a * dense, file such that the total number of @@ -771,24 +772,34 @@ static loff_t ext2_max_size(int bits) /* total blocks in file system block size */ upper_limit >>= (bits - 9);
- - /* indirect blocks */ - meta_blocks = 1; - /* double indirect blocks */ - meta_blocks += 1 + (1LL << (bits-2)); - /* tripple indirect blocks */ - meta_blocks += 1 + (1LL << (bits-2)) + (1LL << (2*(bits-2))); - - upper_limit -= meta_blocks; - upper_limit <<= bits; - + /* Compute how many blocks we can address by block tree */ res += 1LL << (bits-2); res += 1LL << (2*(bits-2)); res += 1LL << (3*(bits-2)); + /* Does block tree limit file size? */ + if (res < upper_limit) + goto check_lfs; + + res = upper_limit; + /* How many metadata blocks are needed for addressing upper_limit? */ + upper_limit -= EXT2_NDIR_BLOCKS; + /* indirect blocks */ + meta_blocks = 1; + upper_limit -= ppb; + /* double indirect blocks */ + if (upper_limit < ppb * ppb) { + meta_blocks += 1 + DIV_ROUND_UP(upper_limit, ppb); + res -= meta_blocks; + goto check_lfs; + } + meta_blocks += 1 + ppb; + upper_limit -= ppb * ppb; + /* tripple indirect blocks for the rest */ + meta_blocks += 1 + DIV_ROUND_UP(upper_limit, ppb) + + DIV_ROUND_UP(upper_limit, ppb*ppb); + res -= meta_blocks; +check_lfs: res <<= bits; - if (res > upper_limit) - res = upper_limit; - if (res > MAX_LFS_FILESIZE) res = MAX_LFS_FILESIZE;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kunihiko Hayashi hayashi.kunihiko@socionext.com
commit 521282237b9d78b9bff423ec818becd4c95841c2 upstream.
Need to set the update bit in UNIPHIER_CLK_CPUGEAR_UPD to update the CPU-gear value.
Fixes: d08f1f0d596c ("clk: uniphier: add CPU-gear change (cpufreq) support") Cc: linux-stable@vger.kernel.org Signed-off-by: Kunihiko Hayashi hayashi.kunihiko@socionext.com Acked-by: Masahiro Yamada yamada.masahiro@socionext.com Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/clk/uniphier/clk-uniphier-cpugear.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/clk/uniphier/clk-uniphier-cpugear.c +++ b/drivers/clk/uniphier/clk-uniphier-cpugear.c @@ -47,7 +47,7 @@ static int uniphier_clk_cpugear_set_pare return ret;
ret = regmap_write_bits(gear->regmap, - gear->regbase + UNIPHIER_CLK_CPUGEAR_SET, + gear->regbase + UNIPHIER_CLK_CPUGEAR_UPD, UNIPHIER_CLK_CPUGEAR_UPD_BIT, UNIPHIER_CLK_CPUGEAR_UPD_BIT); if (ret)
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tony Lindgren tony@atomide.com
commit 5ae51d67aec95f6f9386aa8dd5db424964895575 upstream.
I noticed that modprobe clk-twl6040 can fail after a cold boot with: abe_cm:clk:0010:0: failed to enable ... Unhandled fault: imprecise external abort (0x1406) at 0xbe896b20
WARNING: CPU: 1 PID: 29 at drivers/clk/clk.c:828 clk_core_disable_lock+0x18/0x24 ... (clk_core_disable_lock) from [<c0123534>] (_disable_clocks+0x18/0x90) (_disable_clocks) from [<c0124040>] (_idle+0x17c/0x244) (_idle) from [<c0125ad4>] (omap_hwmod_idle+0x24/0x44) (omap_hwmod_idle) from [<c053a038>] (sysc_runtime_suspend+0x48/0x108) (sysc_runtime_suspend) from [<c06084c4>] (__rpm_callback+0x144/0x1d8) (__rpm_callback) from [<c0608578>] (rpm_callback+0x20/0x80) (rpm_callback) from [<c0607034>] (rpm_suspend+0x120/0x694) (rpm_suspend) from [<c0607a78>] (__pm_runtime_idle+0x60/0x84) (__pm_runtime_idle) from [<c053aaf0>] (sysc_probe+0x874/0xf2c) (sysc_probe) from [<c05fecd4>] (platform_drv_probe+0x48/0x98)
After searching around for a similar issue, I came across an earlier fix that never got merged upstream in the Android tree for glass-omap-xrr02. There is patch "MFD: twl6040-codec: Implement PDMCLK cold temp errata" by Misael Lopez Cruz misael.lopez@ti.com.
Based on my observations, this fix is also needed when cold booting devices, and not just for deeper idle modes. Since we now have a clock driver for pdmclk, let's fix the issue in twl6040_pdmclk_prepare().
Cc: Misael Lopez Cruz misael.lopez@ti.com Cc: Peter Ujfalusi peter.ujfalusi@ti.com Signed-off-by: Tony Lindgren tony@atomide.com Acked-by: Peter Ujfalusi peter.ujfalusi@ti.com Cc: stable@vger.kernel.org Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/clk/clk-twl6040.c | 53 ++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 51 insertions(+), 2 deletions(-)
--- a/drivers/clk/clk-twl6040.c +++ b/drivers/clk/clk-twl6040.c @@ -41,6 +41,43 @@ static int twl6040_pdmclk_is_prepared(st return pdmclk->enabled; }
+static int twl6040_pdmclk_reset_one_clock(struct twl6040_pdmclk *pdmclk, + unsigned int reg) +{ + const u8 reset_mask = TWL6040_HPLLRST; /* Same for HPPLL and LPPLL */ + int ret; + + ret = twl6040_set_bits(pdmclk->twl6040, reg, reset_mask); + if (ret < 0) + return ret; + + ret = twl6040_clear_bits(pdmclk->twl6040, reg, reset_mask); + if (ret < 0) + return ret; + + return 0; +} + +/* + * TWL6040A2 Phoenix Audio IC erratum #6: "PDM Clock Generation Issue At + * Cold Temperature". This affects cold boot and deeper idle states it + * seems. The workaround consists of resetting HPPLL and LPPLL. + */ +static int twl6040_pdmclk_quirk_reset_clocks(struct twl6040_pdmclk *pdmclk) +{ + int ret; + + ret = twl6040_pdmclk_reset_one_clock(pdmclk, TWL6040_REG_HPPLLCTL); + if (ret) + return ret; + + ret = twl6040_pdmclk_reset_one_clock(pdmclk, TWL6040_REG_LPPLLCTL); + if (ret) + return ret; + + return 0; +} + static int twl6040_pdmclk_prepare(struct clk_hw *hw) { struct twl6040_pdmclk *pdmclk = container_of(hw, struct twl6040_pdmclk, @@ -48,8 +85,20 @@ static int twl6040_pdmclk_prepare(struct int ret;
ret = twl6040_power(pdmclk->twl6040, 1); - if (!ret) - pdmclk->enabled = 1; + if (ret) + return ret; + + ret = twl6040_pdmclk_quirk_reset_clocks(pdmclk); + if (ret) + goto out_err; + + pdmclk->enabled = 1; + + return 0; + +out_err: + dev_err(pdmclk->dev, "%s: error %i\n", __func__, ret); + twl6040_power(pdmclk->twl6040, 0);
return ret; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski krzk@kernel.org
commit 5f0b6216ea381b43c0dff88702d6cc5673d63922 upstream.
During initialization of subdevices if platform_device_alloc() failed, returned NULL pointer will be later dereferenced. Add proper error paths to exynos5_clk_register_subcmu(). The return value of this function is still ignored because at this stage of init there is nothing we can do.
Fixes: b06a532bf1fa ("clk: samsung: Add Exynos5 sub-CMU clock driver") Cc: stable@vger.kernel.org Signed-off-by: Krzysztof Kozlowski krzk@kernel.org Reviewed-by: Geert Uytterhoeven geert+renesas@glider.be Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/clk/samsung/clk-exynos5-subcmu.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/drivers/clk/samsung/clk-exynos5-subcmu.c +++ b/drivers/clk/samsung/clk-exynos5-subcmu.c @@ -136,15 +136,21 @@ static int __init exynos5_clk_register_s { struct of_phandle_args genpdspec = { .np = pd_node }; struct platform_device *pdev; + int ret;
pdev = platform_device_alloc(info->pd_name, -1); + if (!pdev) + return -ENOMEM; + pdev->dev.parent = parent; pdev->driver_override = "exynos5-subcmu"; platform_set_drvdata(pdev, (void *)info); of_genpd_add_device(&genpdspec, &pdev->dev); - platform_device_add(pdev); + ret = platform_device_add(pdev); + if (ret) + platform_device_put(pdev);
- return 0; + return ret; }
static int __init exynos5_clk_probe(struct platform_device *pdev)
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski krzk@kernel.org
commit 785c9f411eb2d9a6076d3511c631587d5e676bf3 upstream.
Platform driver driver_override field should not be initialized from const memory because the core later kfree() it. If driver_override is manually set later through sysfs, kfree() of old value leads to:
$ echo "new_value" > /sys/bus/platform/drivers/.../driver_override
kernel BUG at ../mm/slub.c:3960! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM ... (kfree) from [<c058e8c0>] (platform_set_driver_override+0x84/0xac) (platform_set_driver_override) from [<c058e908>] (driver_override_store+0x20/0x34) (driver_override_store) from [<c031f778>] (kernfs_fop_write+0x100/0x1dc) (kernfs_fop_write) from [<c0296de8>] (__vfs_write+0x2c/0x17c) (__vfs_write) from [<c02970c4>] (vfs_write+0xa4/0x188) (vfs_write) from [<c02972e8>] (ksys_write+0x4c/0xac) (ksys_write) from [<c0101000>] (ret_fast_syscall+0x0/0x28)
The clk-exynos5-subcmu driver uses override only for the purpose of creating meaningful names for children devices (matching names of power domains, e.g. DISP, MFC). The driver_override was not developed for this purpose so just switch to default names of devices to fix the issue.
Fixes: b06a532bf1fa ("clk: samsung: Add Exynos5 sub-CMU clock driver") Cc: stable@vger.kernel.org Signed-off-by: Krzysztof Kozlowski krzk@kernel.org Reviewed-by: Geert Uytterhoeven geert+renesas@glider.be Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/clk/samsung/clk-exynos5-subcmu.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/clk/samsung/clk-exynos5-subcmu.c +++ b/drivers/clk/samsung/clk-exynos5-subcmu.c @@ -138,12 +138,11 @@ static int __init exynos5_clk_register_s struct platform_device *pdev; int ret;
- pdev = platform_device_alloc(info->pd_name, -1); + pdev = platform_device_alloc("exynos5-subcmu", PLATFORM_DEVID_AUTO); if (!pdev) return -ENOMEM;
pdev->dev.parent = parent; - pdev->driver_override = "exynos5-subcmu"; platform_set_drvdata(pdev, (void *)info); of_genpd_add_device(&genpdspec, &pdev->dev); ret = platform_device_add(pdev);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul Cercueil paul@crapouillou.net
commit bc5d922c93491878c44c9216e9d227c7eeb81d7f upstream.
Take a parent rate of 180 MHz, and a requested rate of 4.285715 MHz. This results in a theorical divider of 41.999993 which is then rounded up to 42. The .round_rate function would then return (180 MHz / 42) as the clock, rounded down, so 4.285714 MHz.
Calling clk_set_rate on 4.285714 MHz would round the rate again, and give a theorical divider of 42,0000028, now rounded up to 43, and the rate returned would be (180 MHz / 43) which is 4.186046 MHz, aka. not what we requested.
Fix this by rounding up the divisions.
Signed-off-by: Paul Cercueil paul@crapouillou.net Tested-by: Maarten ter Huurne maarten@treewalker.org Cc: stable@vger.kernel.org Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/clk/ingenic/cgu.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
--- a/drivers/clk/ingenic/cgu.c +++ b/drivers/clk/ingenic/cgu.c @@ -426,16 +426,16 @@ ingenic_clk_round_rate(struct clk_hw *hw struct ingenic_clk *ingenic_clk = to_ingenic_clk(hw); struct ingenic_cgu *cgu = ingenic_clk->cgu; const struct ingenic_cgu_clk_info *clk_info; - long rate = *parent_rate; + unsigned int div = 1;
clk_info = &cgu->clock_info[ingenic_clk->idx];
if (clk_info->type & CGU_CLK_DIV) - rate /= ingenic_clk_calc_div(clk_info, *parent_rate, req_rate); + div = ingenic_clk_calc_div(clk_info, *parent_rate, req_rate); else if (clk_info->type & CGU_CLK_FIXDIV) - rate /= clk_info->fixdiv.div; + div = clk_info->fixdiv.div;
- return rate; + return DIV_ROUND_UP(*parent_rate, div); }
static int @@ -455,7 +455,7 @@ ingenic_clk_set_rate(struct clk_hw *hw,
if (clk_info->type & CGU_CLK_DIV) { div = ingenic_clk_calc_div(clk_info, parent_rate, req_rate); - rate = parent_rate / div; + rate = DIV_ROUND_UP(parent_rate, div);
if (rate != req_rate) return -EINVAL;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul Cercueil paul@crapouillou.net
commit 7ca4c922aad2e3c46767a12f80d01c6b25337b59 upstream.
The 'div' field does not represent a number of bits used to divide (understand: right-shift) the divider, but a number itself used to divide the divider.
Signed-off-by: Paul Cercueil paul@crapouillou.net Signed-off-by: Maarten ter Huurne maarten@treewalker.org Cc: stable@vger.kernel.org Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/clk/ingenic/cgu.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/clk/ingenic/cgu.h +++ b/drivers/clk/ingenic/cgu.h @@ -80,7 +80,7 @@ struct ingenic_cgu_mux_info { * @reg: offset of the divider control register within the CGU * @shift: number of bits to left shift the divide value by (ie. the index of * the lowest bit of the divide value within its control register) - * @div: number of bits to divide the divider value by (i.e. if the + * @div: number to divide the divider value by (i.e. if the * effective divider value is the value written to the register * multiplied by some constant) * @bits: the size of the divide value in bits
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Osipenko digetx@gmail.com
commit 563b9372f7ec57e44e8f9a8600c5107d7ffdd166 upstream.
The ChipIdea's platform device need to be unregistered on Tegra's driver module removal.
Fixes: dfebb5f43a78827a ("usb: chipidea: Add support for Tegra20/30/114/124") Signed-off-by: Dmitry Osipenko digetx@gmail.com Acked-by: Peter Chen peter.chen@nxp.com Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/chipidea/ci_hdrc_tegra.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/usb/chipidea/ci_hdrc_tegra.c +++ b/drivers/usb/chipidea/ci_hdrc_tegra.c @@ -130,6 +130,7 @@ static int tegra_udc_remove(struct platf { struct tegra_udc *udc = platform_get_drvdata(pdev);
+ ci_hdrc_remove_device(udc->dev); usb_phy_set_suspend(udc->phy, 1); clk_disable_unprepare(udc->clk);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nikolaus Voss nikolaus.voss@loewensteinmedical.de
commit 8a863a608d47fa5d9dd15cf841817f73f804cf91 upstream.
Commit 1a2f474d328f handles block _reads_ separately with plain-I2C adapters, but the problem described with regmap-i2c not handling SMBus block transfers (i.e. read and writes) correctly also exists with writes.
As workaround, this patch adds a block write function the same way 1a2f474d328f adds a block read function.
Fixes: 1a2f474d328f ("usb: typec: tps6598x: handle block reads separately with plain-I2C adapters") Fixes: 0a4c005bd171 ("usb: typec: driver for TI TPS6598x USB Power Delivery controllers") Signed-off-by: Nikolaus Voss nikolaus.voss@loewensteinmedical.de Cc: stable stable@vger.kernel.org Reviewed-by: Guenter Roeck linux@roeck-us.net Acked-by: Heikki Krogerus heikki.krogerus@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/typec/tps6598x.c | 26 ++++++++++++++++++++------ 1 file changed, 20 insertions(+), 6 deletions(-)
--- a/drivers/usb/typec/tps6598x.c +++ b/drivers/usb/typec/tps6598x.c @@ -110,6 +110,20 @@ tps6598x_block_read(struct tps6598x *tps return 0; }
+static int tps6598x_block_write(struct tps6598x *tps, u8 reg, + void *val, size_t len) +{ + u8 data[TPS_MAX_LEN + 1]; + + if (!tps->i2c_protocol) + return regmap_raw_write(tps->regmap, reg, val, len); + + data[0] = len; + memcpy(&data[1], val, len); + + return regmap_raw_write(tps->regmap, reg, data, sizeof(data)); +} + static inline int tps6598x_read16(struct tps6598x *tps, u8 reg, u16 *val) { return tps6598x_block_read(tps, reg, val, sizeof(u16)); @@ -127,23 +141,23 @@ static inline int tps6598x_read64(struct
static inline int tps6598x_write16(struct tps6598x *tps, u8 reg, u16 val) { - return regmap_raw_write(tps->regmap, reg, &val, sizeof(u16)); + return tps6598x_block_write(tps, reg, &val, sizeof(u16)); }
static inline int tps6598x_write32(struct tps6598x *tps, u8 reg, u32 val) { - return regmap_raw_write(tps->regmap, reg, &val, sizeof(u32)); + return tps6598x_block_write(tps, reg, &val, sizeof(u32)); }
static inline int tps6598x_write64(struct tps6598x *tps, u8 reg, u64 val) { - return regmap_raw_write(tps->regmap, reg, &val, sizeof(u64)); + return tps6598x_block_write(tps, reg, &val, sizeof(u64)); }
static inline int tps6598x_write_4cc(struct tps6598x *tps, u8 reg, const char *val) { - return regmap_raw_write(tps->regmap, reg, &val, sizeof(u32)); + return tps6598x_block_write(tps, reg, &val, sizeof(u32)); }
static int tps6598x_read_partner_identity(struct tps6598x *tps) @@ -229,8 +243,8 @@ static int tps6598x_exec_cmd(struct tps6 return -EBUSY;
if (in_len) { - ret = regmap_raw_write(tps->regmap, TPS_REG_DATA1, - in_data, in_len); + ret = tps6598x_block_write(tps, TPS_REG_DATA1, + in_data, in_len); if (ret) return ret; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Phuong Nguyen phuong.nguyen.xw@renesas.com
commit d9140a0da4a230a03426d175145989667758aa6a upstream.
This commit fixes the issue that USB-DMAC hangs silently after system resumes on R-Car Gen3 hence renesas_usbhs will not work correctly when using USB-DMAC for bulk transfer e.g. ethernet or serial gadgets.
The issue can be reproduced by these steps: 1. modprobe g_serial 2. Suspend and resume system. 3. connect a usb cable to host side 4. Transfer data from Host to Target 5. cat /dev/ttyGS0 (Target side) 6. echo "test" > /dev/ttyACM0 (Host side)
The 'cat' will not result anything. However, system still can work normally.
Currently, USB-DMAC driver does not have system sleep callbacks hence this driver relies on the PM core to force runtime suspend/resume to suspend and reinitialize USB-DMAC during system resume. After the commit 17218e0092f8 ("PM / genpd: Stop/start devices without pm_runtime_force_suspend/resume()"), PM core will not force runtime suspend/resume anymore so this issue happens.
To solve this, make system suspend resume explicit by using pm_runtime_force_{suspend,resume}() as the system sleep callbacks. SET_NOIRQ_SYSTEM_SLEEP_PM_OPS() is used to make sure USB-DMAC suspended after and initialized before renesas_usbhs."
Signed-off-by: Phuong Nguyen phuong.nguyen.xw@renesas.com Signed-off-by: Hiroyuki Yokoyama hiroyuki.yokoyama.vx@renesas.com Cc: stable@vger.kernel.org # v4.16+ [shimoda: revise the commit log and add Cc tag] Signed-off-by: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/dma/sh/usb-dmac.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/dma/sh/usb-dmac.c +++ b/drivers/dma/sh/usb-dmac.c @@ -694,6 +694,8 @@ static int usb_dmac_runtime_resume(struc #endif /* CONFIG_PM */
static const struct dev_pm_ops usb_dmac_pm = { + SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, + pm_runtime_force_resume) SET_RUNTIME_PM_OPS(usb_dmac_runtime_suspend, usb_dmac_runtime_resume, NULL) };
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Anssi Hannula anssi.hannula@bitwise.fi
commit 7abab1605139bc41442864c18f9573440f7ca105 upstream.
If RX is disabled while there are still unprocessed bytes in RX FIFO, cdns_uart_handle_rx() called from interrupt handler will get stuck in the receive loop as read bytes will not get removed from the RX FIFO and CDNS_UART_SR_RXEMPTY bit will never get set.
Avoid the stuck handler by checking first if RX is disabled. port->lock protects against race with RX-disabling functions.
This HW behavior was mentioned by Nathan Rossi in 43e98facc4a3 ("tty: xuartps: Fix RX hang, and TX corruption in termios call") which fixed a similar issue in cdns_uart_set_termios(). The behavior can also be easily verified by e.g. setting CDNS_UART_CR_RX_DIS at the beginning of cdns_uart_handle_rx() - the following loop will then get stuck.
Resetting the FIFO using RXRST would not set RXEMPTY either so simply issuing a reset after RX-disable would not work.
I observe this frequently on a ZynqMP board during heavy RX load at 1M baudrate when the reader process exits and thus RX gets disabled.
Fixes: 61ec9016988f ("tty/serial: add support for Xilinx PS UART") Signed-off-by: Anssi Hannula anssi.hannula@bitwise.fi Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/tty/serial/xilinx_uartps.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/drivers/tty/serial/xilinx_uartps.c +++ b/drivers/tty/serial/xilinx_uartps.c @@ -364,7 +364,13 @@ static irqreturn_t cdns_uart_isr(int irq cdns_uart_handle_tx(dev_id); isrstatus &= ~CDNS_UART_IXR_TXEMPTY; } - if (isrstatus & CDNS_UART_IXR_RXMASK) + + /* + * Skip RX processing if RX is disabled as RXEMPTY will never be set + * as read bytes will not be removed from the FIFO. + */ + if (isrstatus & CDNS_UART_IXR_RXMASK && + !(readl(port->membase + CDNS_UART_CR) & CDNS_UART_CR_RX_DIS)) cdns_uart_handle_rx(dev_id, isrstatus);
spin_unlock(&port->lock);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lubomir Rintel lkundrak@v3.sk
commit f4817843e39ce78aace0195a57d4e8500a65a898 upstream.
There are two other drivers that bind to mrvl,mmp-uart and both of them assume register shift of 2 bits. There are device trees that lack the property and rely on that assumption.
If this driver wins the race to bind to those devices, it should behave the same as the older deprecated driver.
Signed-off-by: Lubomir Rintel lkundrak@v3.sk Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/tty/serial/8250/8250_of.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/tty/serial/8250/8250_of.c +++ b/drivers/tty/serial/8250/8250_of.c @@ -130,6 +130,10 @@ static int of_platform_serial_setup(stru port->flags |= UPF_IOREMAP; }
+ /* Compatibility with the deprecated pxa driver and 8250_pxa drivers. */ + if (of_device_is_compatible(np, "mrvl,mmp-uart")) + port->regshift = 2; + /* Check for registers offset within the devices address range */ if (of_property_read_u32(np, "reg-shift", &prop) == 0) port->regshift = prop;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jay Dolan jay.dolan@accesio.com
commit b896b03bc7fce43a07012cc6bf5e2ab2fddf3364 upstream.
Have the correct number of ports created for ACCES serial cards. Two port cards show up as four ports, and four port cards show up as eight.
Fixes: c8d192428f52 ("serial: 8250: added acces i/o products quad and octal serial cards") Signed-off-by: Jay Dolan jay.dolan@accesio.com Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/tty/serial/8250/8250_pci.c | 36 ++++++++++++++++++------------------ 1 file changed, 18 insertions(+), 18 deletions(-)
--- a/drivers/tty/serial/8250/8250_pci.c +++ b/drivers/tty/serial/8250/8250_pci.c @@ -4575,10 +4575,10 @@ static const struct pci_device_id serial */ { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM_2SDB, PCI_ANY_ID, PCI_ANY_ID, 0, 0, - pbn_pericom_PI7C9X7954 }, + pbn_pericom_PI7C9X7952 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_MPCIE_COM_2S, PCI_ANY_ID, PCI_ANY_ID, 0, 0, - pbn_pericom_PI7C9X7954 }, + pbn_pericom_PI7C9X7952 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM_4SDB, PCI_ANY_ID, PCI_ANY_ID, 0, 0, pbn_pericom_PI7C9X7954 }, @@ -4587,10 +4587,10 @@ static const struct pci_device_id serial pbn_pericom_PI7C9X7954 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM232_2DB, PCI_ANY_ID, PCI_ANY_ID, 0, 0, - pbn_pericom_PI7C9X7954 }, + pbn_pericom_PI7C9X7952 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_MPCIE_COM232_2, PCI_ANY_ID, PCI_ANY_ID, 0, 0, - pbn_pericom_PI7C9X7954 }, + pbn_pericom_PI7C9X7952 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM232_4DB, PCI_ANY_ID, PCI_ANY_ID, 0, 0, pbn_pericom_PI7C9X7954 }, @@ -4599,10 +4599,10 @@ static const struct pci_device_id serial pbn_pericom_PI7C9X7954 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM_2SMDB, PCI_ANY_ID, PCI_ANY_ID, 0, 0, - pbn_pericom_PI7C9X7954 }, + pbn_pericom_PI7C9X7952 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_MPCIE_COM_2SM, PCI_ANY_ID, PCI_ANY_ID, 0, 0, - pbn_pericom_PI7C9X7954 }, + pbn_pericom_PI7C9X7952 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM_4SMDB, PCI_ANY_ID, PCI_ANY_ID, 0, 0, pbn_pericom_PI7C9X7954 }, @@ -4611,13 +4611,13 @@ static const struct pci_device_id serial pbn_pericom_PI7C9X7954 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_MPCIE_ICM485_1, PCI_ANY_ID, PCI_ANY_ID, 0, 0, - pbn_pericom_PI7C9X7954 }, + pbn_pericom_PI7C9X7951 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_MPCIE_ICM422_2, PCI_ANY_ID, PCI_ANY_ID, 0, 0, - pbn_pericom_PI7C9X7954 }, + pbn_pericom_PI7C9X7952 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_MPCIE_ICM485_2, PCI_ANY_ID, PCI_ANY_ID, 0, 0, - pbn_pericom_PI7C9X7954 }, + pbn_pericom_PI7C9X7952 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_MPCIE_ICM422_4, PCI_ANY_ID, PCI_ANY_ID, 0, 0, pbn_pericom_PI7C9X7954 }, @@ -4626,16 +4626,16 @@ static const struct pci_device_id serial pbn_pericom_PI7C9X7954 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_ICM_2S, PCI_ANY_ID, PCI_ANY_ID, 0, 0, - pbn_pericom_PI7C9X7954 }, + pbn_pericom_PI7C9X7952 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_ICM_4S, PCI_ANY_ID, PCI_ANY_ID, 0, 0, pbn_pericom_PI7C9X7954 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_ICM232_2, PCI_ANY_ID, PCI_ANY_ID, 0, 0, - pbn_pericom_PI7C9X7954 }, + pbn_pericom_PI7C9X7952 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_MPCIE_ICM232_2, PCI_ANY_ID, PCI_ANY_ID, 0, 0, - pbn_pericom_PI7C9X7954 }, + pbn_pericom_PI7C9X7952 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_ICM232_4, PCI_ANY_ID, PCI_ANY_ID, 0, 0, pbn_pericom_PI7C9X7954 }, @@ -4644,13 +4644,13 @@ static const struct pci_device_id serial pbn_pericom_PI7C9X7954 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_ICM_2SM, PCI_ANY_ID, PCI_ANY_ID, 0, 0, - pbn_pericom_PI7C9X7954 }, + pbn_pericom_PI7C9X7952 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM422_4, PCI_ANY_ID, PCI_ANY_ID, 0, 0, - pbn_pericom_PI7C9X7958 }, + pbn_pericom_PI7C9X7954 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM485_4, PCI_ANY_ID, PCI_ANY_ID, 0, 0, - pbn_pericom_PI7C9X7958 }, + pbn_pericom_PI7C9X7954 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM422_8, PCI_ANY_ID, PCI_ANY_ID, 0, 0, pbn_pericom_PI7C9X7958 }, @@ -4659,19 +4659,19 @@ static const struct pci_device_id serial pbn_pericom_PI7C9X7958 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM232_4, PCI_ANY_ID, PCI_ANY_ID, 0, 0, - pbn_pericom_PI7C9X7958 }, + pbn_pericom_PI7C9X7954 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM232_8, PCI_ANY_ID, PCI_ANY_ID, 0, 0, pbn_pericom_PI7C9X7958 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM_4SM, PCI_ANY_ID, PCI_ANY_ID, 0, 0, - pbn_pericom_PI7C9X7958 }, + pbn_pericom_PI7C9X7954 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM_8SM, PCI_ANY_ID, PCI_ANY_ID, 0, 0, pbn_pericom_PI7C9X7958 }, { PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_ICM_4SM, PCI_ANY_ID, PCI_ANY_ID, 0, 0, - pbn_pericom_PI7C9X7958 }, + pbn_pericom_PI7C9X7954 }, /* * Topic TP560 Data/Fax/Voice 56k modem (reported by Evan Clarke) */
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jay Dolan jay.dolan@accesio.com
commit 78d3820b9bd39028727c6aab7297b63c093db343 upstream.
The four port Pericom chips have the fourth port at the wrong address. Make use of quirk to fix it.
Fixes: c8d192428f52 ("serial: 8250: added acces i/o products quad and octal serial cards") Cc: stable stable@vger.kernel.org Signed-off-by: Jay Dolan jay.dolan@accesio.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/tty/serial/8250/8250_pci.c | 105 +++++++++++++++++++++++++++++++++++++ 1 file changed, 105 insertions(+)
--- a/drivers/tty/serial/8250/8250_pci.c +++ b/drivers/tty/serial/8250/8250_pci.c @@ -2027,6 +2027,111 @@ static struct pci_serial_quirk pci_seria .setup = pci_default_setup, .exit = pci_plx9050_exit, }, + { + .vendor = PCI_VENDOR_ID_ACCESIO, + .device = PCI_DEVICE_ID_ACCESIO_PCIE_COM_4SDB, + .subvendor = PCI_ANY_ID, + .subdevice = PCI_ANY_ID, + .setup = pci_pericom_setup, + }, + { + .vendor = PCI_VENDOR_ID_ACCESIO, + .device = PCI_DEVICE_ID_ACCESIO_MPCIE_COM_4S, + .subvendor = PCI_ANY_ID, + .subdevice = PCI_ANY_ID, + .setup = pci_pericom_setup, + }, + { + .vendor = PCI_VENDOR_ID_ACCESIO, + .device = PCI_DEVICE_ID_ACCESIO_PCIE_COM232_4DB, + .subvendor = PCI_ANY_ID, + .subdevice = PCI_ANY_ID, + .setup = pci_pericom_setup, + }, + { + .vendor = PCI_VENDOR_ID_ACCESIO, + .device = PCI_DEVICE_ID_ACCESIO_MPCIE_COM232_4, + .subvendor = PCI_ANY_ID, + .subdevice = PCI_ANY_ID, + .setup = pci_pericom_setup, + }, + { + .vendor = PCI_VENDOR_ID_ACCESIO, + .device = PCI_DEVICE_ID_ACCESIO_PCIE_COM_4SMDB, + .subvendor = PCI_ANY_ID, + .subdevice = PCI_ANY_ID, + .setup = pci_pericom_setup, + }, + { + .vendor = PCI_VENDOR_ID_ACCESIO, + .device = PCI_DEVICE_ID_ACCESIO_MPCIE_COM_4SM, + .subvendor = PCI_ANY_ID, + .subdevice = PCI_ANY_ID, + .setup = pci_pericom_setup, + }, + { + .vendor = PCI_VENDOR_ID_ACCESIO, + .device = PCI_DEVICE_ID_ACCESIO_MPCIE_ICM422_4, + .subvendor = PCI_ANY_ID, + .subdevice = PCI_ANY_ID, + .setup = pci_pericom_setup, + }, + { + .vendor = PCI_VENDOR_ID_ACCESIO, + .device = PCI_DEVICE_ID_ACCESIO_MPCIE_ICM485_4, + .subvendor = PCI_ANY_ID, + .subdevice = PCI_ANY_ID, + .setup = pci_pericom_setup, + }, + { + .vendor = PCI_DEVICE_ID_ACCESIO_PCIE_ICM_4S, + .device = PCI_DEVICE_ID_ACCESIO_PCIE_ICM232_4, + .subvendor = PCI_ANY_ID, + .subdevice = PCI_ANY_ID, + .setup = pci_pericom_setup, + }, + { + .vendor = PCI_VENDOR_ID_ACCESIO, + .device = PCI_DEVICE_ID_ACCESIO_MPCIE_ICM232_4, + .subvendor = PCI_ANY_ID, + .subdevice = PCI_ANY_ID, + .setup = pci_pericom_setup, + }, + { + .vendor = PCI_VENDOR_ID_ACCESIO, + .device = PCI_DEVICE_ID_ACCESIO_PCIE_COM422_4, + .subvendor = PCI_ANY_ID, + .subdevice = PCI_ANY_ID, + .setup = pci_pericom_setup, + }, + { + .vendor = PCI_VENDOR_ID_ACCESIO, + .device = PCI_DEVICE_ID_ACCESIO_PCIE_COM485_4, + .subvendor = PCI_ANY_ID, + .subdevice = PCI_ANY_ID, + .setup = pci_pericom_setup, + }, + { + .vendor = PCI_VENDOR_ID_ACCESIO, + .device = PCI_DEVICE_ID_ACCESIO_PCIE_COM232_4, + .subvendor = PCI_ANY_ID, + .subdevice = PCI_ANY_ID, + .setup = pci_pericom_setup, + }, + { + .vendor = PCI_VENDOR_ID_ACCESIO, + .device = PCI_DEVICE_ID_ACCESIO_PCIE_COM_4SM, + .subvendor = PCI_ANY_ID, + .subdevice = PCI_ANY_ID, + .setup = pci_pericom_setup, + }, + { + .vendor = PCI_VENDOR_ID_ACCESIO, + .device = PCI_DEVICE_ID_ACCESIO_PCIE_ICM_4SM, + .subvendor = PCI_ANY_ID, + .subdevice = PCI_ANY_ID, + .setup = pci_pericom_setup, + }, /* * SBS Technologies, Inc., PMC-OCTALPRO 232 */
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: zhangyi (F) yi.zhang@huawei.com
commit 904cdbd41d749a476863a0ca41f6f396774f26e4 upstream.
Now, we capture a data corruption problem on ext4 while we're truncating an extent index block. Imaging that if we are revoking a buffer which has been journaled by the committing transaction, the buffer's jbddirty flag will not be cleared in jbd2_journal_forget(), so the commit code will set the buffer dirty flag again after refile the buffer.
fsx kjournald2 jbd2_journal_commit_transaction jbd2_journal_revoke commit phase 1~5... jbd2_journal_forget belongs to older transaction commit phase 6 jbddirty not clear __jbd2_journal_refile_buffer __jbd2_journal_unfile_buffer test_clear_buffer_jbddirty mark_buffer_dirty
Finally, if the freed extent index block was allocated again as data block by some other files, it may corrupt the file data after writing cached pages later, such as during unmount time. (In general, clean_bdev_aliases() related helpers should be invoked after re-allocation to prevent the above corruption, but unfortunately we missed it when zeroout the head of extra extent blocks in ext4_ext_handle_unwritten_extents()).
This patch mark buffer as freed and set j_next_transaction to the new transaction when it already belongs to the committing transaction in jbd2_journal_forget(), so that commit code knows it should clear dirty bits when it is done with the buffer.
This problem can be reproduced by xfstests generic/455 easily with seeds (3246 3247 3248 3249).
Signed-off-by: zhangyi (F) yi.zhang@huawei.com Signed-off-by: Theodore Ts'o tytso@mit.edu Reviewed-by: Jan Kara jack@suse.cz Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/jbd2/transaction.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-)
--- a/fs/jbd2/transaction.c +++ b/fs/jbd2/transaction.c @@ -1609,14 +1609,21 @@ int jbd2_journal_forget (handle_t *handl /* However, if the buffer is still owned by a prior * (committing) transaction, we can't drop it yet... */ JBUFFER_TRACE(jh, "belongs to older transaction"); - /* ... but we CAN drop it from the new transaction if we - * have also modified it since the original commit. */ + /* ... but we CAN drop it from the new transaction through + * marking the buffer as freed and set j_next_transaction to + * the new transaction, so that not only the commit code + * knows it should clear dirty bits when it is done with the + * buffer, but also the buffer can be checkpointed only + * after the new transaction commits. */
- if (jh->b_next_transaction) { - J_ASSERT(jh->b_next_transaction == transaction); + set_buffer_freed(bh); + + if (!jh->b_next_transaction) { spin_lock(&journal->j_list_lock); - jh->b_next_transaction = NULL; + jh->b_next_transaction = transaction; spin_unlock(&journal->j_list_lock); + } else { + J_ASSERT(jh->b_next_transaction == transaction);
/* * only drop a reference if this transaction modified
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: zhangyi (F) yi.zhang@huawei.com
commit 01215d3edb0f384ddeaa5e4a22c1ae5ff634149f upstream.
The jh pointer may be used uninitialized in the two cases below and the compiler complain about it when enabling JBUFFER_TRACE macro, fix them.
In file included from fs/jbd2/transaction.c:19:0: fs/jbd2/transaction.c: In function ‘jbd2_journal_get_undo_access’: ./include/linux/jbd2.h:1637:38: warning: ‘jh’ is used uninitialized in this function [-Wuninitialized] #define JBUFFER_TRACE(jh, info) do { printk("%s: %d\n", __func__, jh->b_jcount);} while (0) ^ fs/jbd2/transaction.c:1219:23: note: ‘jh’ was declared here struct journal_head *jh; ^ In file included from fs/jbd2/transaction.c:19:0: fs/jbd2/transaction.c: In function ‘jbd2_journal_dirty_metadata’: ./include/linux/jbd2.h:1637:38: warning: ‘jh’ may be used uninitialized in this function [-Wmaybe-uninitialized] #define JBUFFER_TRACE(jh, info) do { printk("%s: %d\n", __func__, jh->b_jcount);} while (0) ^ fs/jbd2/transaction.c:1332:23: note: ‘jh’ was declared here struct journal_head *jh; ^
Signed-off-by: zhangyi (F) yi.zhang@huawei.com Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@vger.kernel.org Reviewed-by: Jan Kara jack@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/jbd2/transaction.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-)
--- a/fs/jbd2/transaction.c +++ b/fs/jbd2/transaction.c @@ -1252,11 +1252,12 @@ int jbd2_journal_get_undo_access(handle_ struct journal_head *jh; char *committed_data = NULL;
- JBUFFER_TRACE(jh, "entry"); if (jbd2_write_access_granted(handle, bh, true)) return 0;
jh = jbd2_journal_add_journal_head(bh); + JBUFFER_TRACE(jh, "entry"); + /* * Do this first --- it can drop the journal lock, so we want to * make sure that obtaining the committed_data is done @@ -1367,15 +1368,17 @@ int jbd2_journal_dirty_metadata(handle_t
if (is_handle_aborted(handle)) return -EROFS; - if (!buffer_jbd(bh)) { - ret = -EUCLEAN; - goto out; - } + if (!buffer_jbd(bh)) + return -EUCLEAN; + /* * We don't grab jh reference here since the buffer must be part * of the running transaction. */ jh = bh2jh(bh); + jbd_debug(5, "journal_head %p\n", jh); + JBUFFER_TRACE(jh, "entry"); + /* * This and the following assertions are unreliable since we may see jh * in inconsistent state unless we grab bh_state lock. But this is @@ -1409,9 +1412,6 @@ int jbd2_journal_dirty_metadata(handle_t }
journal = transaction->t_journal; - jbd_debug(5, "journal_head %p\n", jh); - JBUFFER_TRACE(jh, "entry"); - jbd_lock_bh_state(bh);
if (jh->b_modified == 0) {
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xin Long lucien.xin@gmail.com
commit 292c997a1970f8d1e1dfa354ed770a22f7b5a434 upstream.
As does in __sctp_connect(), when checking addrs in a while loop, after get the addr len according to sa_family, it's necessary to do the check walk_size + af->sockaddr_len > addrs_size to make sure it won't access an out-of-bounds addr.
The same thing is needed in selinux_sctp_bind_connect(), otherwise an out-of-bounds issue can be triggered:
[14548.772313] BUG: KASAN: slab-out-of-bounds in selinux_sctp_bind_connect+0x1aa/0x1f0 [14548.927083] Call Trace: [14548.938072] dump_stack+0x9a/0xe9 [14548.953015] print_address_description+0x65/0x22e [14548.996524] kasan_report.cold.6+0x92/0x1a6 [14549.015335] selinux_sctp_bind_connect+0x1aa/0x1f0 [14549.036947] security_sctp_bind_connect+0x58/0x90 [14549.058142] __sctp_setsockopt_connectx+0x5a/0x150 [sctp] [14549.081650] sctp_setsockopt.part.24+0x1322/0x3ce0 [sctp]
Cc: stable@vger.kernel.org Fixes: d452930fd3b9 ("selinux: Add SCTP support") Reported-by: Chunyu Hu chuhu@redhat.com Signed-off-by: Xin Long lucien.xin@gmail.com Reviewed-by: Marcelo Ricardo Leitner marcelo.leitner@gmail.com Signed-off-by: Paul Moore paul@paul-moore.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- security/selinux/hooks.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -5120,6 +5120,9 @@ static int selinux_sctp_bind_connect(str return -EINVAL; }
+ if (walk_size + len > addrlen) + return -EINVAL; + err = -EINVAL; switch (optname) { /* Bind checks */
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: J. Bruce Fields bfields@redhat.com
commit 3815a245b50124f0865415dcb606a034e97494d4 upstream.
In the case when we're reusing a superblock, selinux_sb_clone_mnt_opts() fails to set set_kern_flags, with the result that nfs_clone_sb_security() incorrectly clears NFS_CAP_SECURITY_LABEL.
The result is that if you mount the same NFS filesystem twice, NFS security labels are turned off, even if they would work fine if you mounted the filesystem only once.
("fixes" may be not exactly the right tag, it may be more like "fixed-other-cases-but-missed-this-one".)
Cc: Scott Mayhew smayhew@redhat.com Cc: stable@vger.kernel.org Fixes: 0b4d3452b8b4 "security/selinux: allow security_sb_clone_mnt_opts..." Signed-off-by: J. Bruce Fields bfields@redhat.com Acked-by: Stephen Smalley sds@tycho.nsa.gov Signed-off-by: Paul Moore paul@paul-moore.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- security/selinux/hooks.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -959,8 +959,11 @@ static int selinux_sb_clone_mnt_opts(con BUG_ON(!(oldsbsec->flags & SE_SBINITIALIZED));
/* if fs is reusing a sb, make sure that the contexts match */ - if (newsbsec->flags & SE_SBINITIALIZED) + if (newsbsec->flags & SE_SBINITIALIZED) { + if ((kern_flags & SECURITY_LSM_NATIVE_LABELS) && !set_context) + *set_kern_flags |= SECURITY_LSM_NATIVE_LABELS; return selinux_cmp_sb_context(oldsb, newsb); + }
mutex_lock(&newsbsec->lock);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christophe Leroy christophe.leroy@c-s.fr
commit 9580b71b5a7863c24a9bd18bcd2ad759b86b1eff upstream.
Clear the on-stack STACK_FRAME_REGS_MARKER on exception exit in order to avoid confusing stacktrace like the one below.
Call Trace: [c0e9dca0] [c01c42a0] print_address_description+0x64/0x2bc (unreliable) [c0e9dcd0] [c01c4684] kasan_report+0xfc/0x180 [c0e9dd10] [c0895130] memchr+0x24/0x74 [c0e9dd30] [c00a9e38] msg_print_text+0x124/0x574 [c0e9dde0] [c00ab710] console_unlock+0x114/0x4f8 [c0e9de40] [c00adc60] vprintk_emit+0x188/0x1c4 --- interrupt: c0e9df00 at 0x400f330 LR = init_stack+0x1f00/0x2000 [c0e9de80] [c00ae3c4] printk+0xa8/0xcc (unreliable) [c0e9df20] [c0c27e44] early_irq_init+0x38/0x108 [c0e9df50] [c0c15434] start_kernel+0x310/0x488 [c0e9dff0] [00003484] 0x3484
With this patch the trace becomes:
Call Trace: [c0e9dca0] [c01c42c0] print_address_description+0x64/0x2bc (unreliable) [c0e9dcd0] [c01c46a4] kasan_report+0xfc/0x180 [c0e9dd10] [c0895150] memchr+0x24/0x74 [c0e9dd30] [c00a9e58] msg_print_text+0x124/0x574 [c0e9dde0] [c00ab730] console_unlock+0x114/0x4f8 [c0e9de40] [c00adc80] vprintk_emit+0x188/0x1c4 [c0e9de80] [c00ae3e4] printk+0xa8/0xcc [c0e9df20] [c0c27e44] early_irq_init+0x38/0x108 [c0e9df50] [c0c15434] start_kernel+0x310/0x488 [c0e9dff0] [00003484] 0x3484
Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy christophe.leroy@c-s.fr Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/kernel/entry_32.S | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/arch/powerpc/kernel/entry_32.S +++ b/arch/powerpc/kernel/entry_32.S @@ -745,6 +745,9 @@ fast_exception_return: mtcr r10 lwz r10,_LINK(r11) mtlr r10 + /* Clear the exception_marker on the stack to avoid confusing stacktrace */ + li r10, 0 + stw r10, 8(r11) REST_GPR(10, r11) #if defined(CONFIG_PPC_8xx) && defined(CONFIG_PERF_EVENTS) mtspr SPRN_NRI, r0 @@ -982,6 +985,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_NEED_PAIRE mtcrf 0xFF,r10 mtlr r11
+ /* Clear the exception_marker on the stack to avoid confusing stacktrace */ + li r10, 0 + stw r10, 8(r1) /* * Once we put values in SRR0 and SRR1, we are in a state * where exceptions are not recoverable, since taking an @@ -1021,6 +1027,9 @@ exc_exit_restart_end: mtlr r11 lwz r10,_CCR(r1) mtcrf 0xff,r10 + /* Clear the exception_marker on the stack to avoid confusing stacktrace */ + li r10, 0 + stw r10, 8(r1) REST_2GPRS(9, r1) .globl exc_exit_restart exc_exit_restart:
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christophe Leroy christophe.leroy@c-s.fr
commit 6d183ca8baec983dc4208ca45ece3c36763df912 upstream.
'nobats' kernel parameter or some options like CONFIG_DEBUG_PAGEALLOC deny the use of BATS for mapping memory.
This patch makes sure that the specific wii RAM mapping function takes it into account as well.
Fixes: de32400dd26e ("wii: use both mem1 and mem2 as ram") Cc: stable@vger.kernel.org Reviewed-by: Jonathan Neuschafer j.neuschaefer@gmx.net Signed-off-by: Christophe Leroy christophe.leroy@c-s.fr Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/platforms/embedded6xx/wii.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/arch/powerpc/platforms/embedded6xx/wii.c +++ b/arch/powerpc/platforms/embedded6xx/wii.c @@ -83,6 +83,10 @@ unsigned long __init wii_mmu_mapin_mem2( /* MEM2 64MB@0x10000000 */ delta = wii_hole_start + wii_hole_size; size = top - delta; + + if (__map_without_bats) + return delta; + for (bl = 128<<10; bl < max_size; bl <<= 1) { if (bl * 2 > size) break;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jordan Niethe jniethe5@gmail.com
commit 7b62f9bd2246b7d3d086e571397c14ba52645ef1 upstream.
Currently the opal log is globally readable. It is kernel policy to limit the visibility of physical addresses / kernel pointers to root. Given this and the fact the opal log may contain this information it would be better to limit the readability to root.
Fixes: bfc36894a48b ("powerpc/powernv: Add OPAL message log interface") Cc: stable@vger.kernel.org # v3.15+ Signed-off-by: Jordan Niethe jniethe5@gmail.com Reviewed-by: Stewart Smith stewart@linux.ibm.com Reviewed-by: Andrew Donnellan andrew.donnellan@au1.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/platforms/powernv/opal-msglog.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/powerpc/platforms/powernv/opal-msglog.c +++ b/arch/powerpc/platforms/powernv/opal-msglog.c @@ -98,7 +98,7 @@ static ssize_t opal_msglog_read(struct f }
static struct bin_attribute opal_msglog_attr = { - .attr = {.name = "msglog", .mode = 0444}, + .attr = {.name = "msglog", .mode = 0400}, .read = opal_msglog_read };
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christophe Leroy christophe.leroy@c-s.fr
commit 36da5ff0bea2dc67298150ead8d8471575c54c7d upstream.
The 83xx has 8 SPRG registers and uses at least SPRG4 for DTLB handling LRU.
Fixes: 2319f1239592 ("powerpc/mm: e300c2/c3/c4 TLB errata workaround") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy christophe.leroy@c-s.fr Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/platforms/83xx/suspend-asm.S | 34 +++++++++++++++++++++++------- 1 file changed, 27 insertions(+), 7 deletions(-)
--- a/arch/powerpc/platforms/83xx/suspend-asm.S +++ b/arch/powerpc/platforms/83xx/suspend-asm.S @@ -26,13 +26,13 @@ #define SS_MSR 0x74 #define SS_SDR1 0x78 #define SS_LR 0x7c -#define SS_SPRG 0x80 /* 4 SPRGs */ -#define SS_DBAT 0x90 /* 8 DBATs */ -#define SS_IBAT 0xd0 /* 8 IBATs */ -#define SS_TB 0x110 -#define SS_CR 0x118 -#define SS_GPREG 0x11c /* r12-r31 */ -#define STATE_SAVE_SIZE 0x16c +#define SS_SPRG 0x80 /* 8 SPRGs */ +#define SS_DBAT 0xa0 /* 8 DBATs */ +#define SS_IBAT 0xe0 /* 8 IBATs */ +#define SS_TB 0x120 +#define SS_CR 0x128 +#define SS_GPREG 0x12c /* r12-r31 */ +#define STATE_SAVE_SIZE 0x17c
.section .data .align 5 @@ -103,6 +103,16 @@ _GLOBAL(mpc83xx_enter_deep_sleep) stw r7, SS_SPRG+12(r3) stw r8, SS_SDR1(r3)
+ mfspr r4, SPRN_SPRG4 + mfspr r5, SPRN_SPRG5 + mfspr r6, SPRN_SPRG6 + mfspr r7, SPRN_SPRG7 + + stw r4, SS_SPRG+16(r3) + stw r5, SS_SPRG+20(r3) + stw r6, SS_SPRG+24(r3) + stw r7, SS_SPRG+28(r3) + mfspr r4, SPRN_DBAT0U mfspr r5, SPRN_DBAT0L mfspr r6, SPRN_DBAT1U @@ -493,6 +503,16 @@ mpc83xx_deep_resume: mtspr SPRN_IBAT7U, r6 mtspr SPRN_IBAT7L, r7
+ lwz r4, SS_SPRG+16(r3) + lwz r5, SS_SPRG+20(r3) + lwz r6, SS_SPRG+24(r3) + lwz r7, SS_SPRG+28(r3) + + mtspr SPRN_SPRG4, r4 + mtspr SPRN_SPRG5, r5 + mtspr SPRN_SPRG6, r6 + mtspr SPRN_SPRG7, r7 + lwz r4, SS_SPRG+0(r3) lwz r5, SS_SPRG+4(r3) lwz r6, SS_SPRG+8(r3)
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Ellerman mpe@ellerman.id.au
commit c3c7470c75566a077c8dc71dcf8f1948b8ddfab4 upstream.
When the hash MMU is active the AMR, IAMR and UAMOR are used for pkeys. The AMR is directly writable by user space, and the UAMOR masks those writes, meaning both registers are effectively user register state. The IAMR is used to create an execute only key.
Also we must maintain the value of at least the AMR when running in process context, so that any memory accesses done by the kernel on behalf of the process are correctly controlled by the AMR.
Although we are correctly switching all registers when going into a guest, on returning to the host we just write 0 into all regs, except on Power9 where we restore the IAMR correctly.
This could be observed by a user process if it writes the AMR, then runs a guest and we then return immediately to it without rescheduling. Because we have written 0 to the AMR that would have the effect of granting read/write permission to pages that the process was trying to protect.
In addition, when using the Radix MMU, the AMR can prevent inadvertent kernel access to userspace data, writing 0 to the AMR disables that protection.
So save and restore AMR, IAMR and UAMOR.
Fixes: cf43d3b26452 ("powerpc: Enable pkey subsystem") Cc: stable@vger.kernel.org # v4.16+ Signed-off-by: Russell Currey ruscur@russell.cc Signed-off-by: Michael Ellerman mpe@ellerman.id.au Acked-by: Paul Mackerras paulus@ozlabs.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/kvm/book3s_hv_rmhandlers.S | 26 +++++++++++++++++--------- 1 file changed, 17 insertions(+), 9 deletions(-)
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S @@ -58,6 +58,8 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_300) #define STACK_SLOT_DAWR (SFS-56) #define STACK_SLOT_DAWRX (SFS-64) #define STACK_SLOT_HFSCR (SFS-72) +#define STACK_SLOT_AMR (SFS-80) +#define STACK_SLOT_UAMOR (SFS-88) /* the following is used by the P9 short path */ #define STACK_SLOT_NVGPRS (SFS-152) /* 18 gprs */
@@ -726,11 +728,9 @@ BEGIN_FTR_SECTION mfspr r5, SPRN_TIDR mfspr r6, SPRN_PSSCR mfspr r7, SPRN_PID - mfspr r8, SPRN_IAMR std r5, STACK_SLOT_TID(r1) std r6, STACK_SLOT_PSSCR(r1) std r7, STACK_SLOT_PID(r1) - std r8, STACK_SLOT_IAMR(r1) mfspr r5, SPRN_HFSCR std r5, STACK_SLOT_HFSCR(r1) END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300) @@ -738,11 +738,18 @@ BEGIN_FTR_SECTION mfspr r5, SPRN_CIABR mfspr r6, SPRN_DAWR mfspr r7, SPRN_DAWRX + mfspr r8, SPRN_IAMR std r5, STACK_SLOT_CIABR(r1) std r6, STACK_SLOT_DAWR(r1) std r7, STACK_SLOT_DAWRX(r1) + std r8, STACK_SLOT_IAMR(r1) END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
+ mfspr r5, SPRN_AMR + std r5, STACK_SLOT_AMR(r1) + mfspr r6, SPRN_UAMOR + std r6, STACK_SLOT_UAMOR(r1) + BEGIN_FTR_SECTION /* Set partition DABR */ /* Do this before re-enabling PMU to avoid P7 DABR corruption bug */ @@ -1631,22 +1638,25 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_3 mtspr SPRN_PSPB, r0 mtspr SPRN_WORT, r0 BEGIN_FTR_SECTION - mtspr SPRN_IAMR, r0 mtspr SPRN_TCSCR, r0 /* Set MMCRS to 1<<31 to freeze and disable the SPMC counters */ li r0, 1 sldi r0, r0, 31 mtspr SPRN_MMCRS, r0 END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_300) -8:
- /* Save and reset AMR and UAMOR before turning on the MMU */ + /* Save and restore AMR, IAMR and UAMOR before turning on the MMU */ + ld r8, STACK_SLOT_IAMR(r1) + mtspr SPRN_IAMR, r8 + +8: /* Power7 jumps back in here */ mfspr r5,SPRN_AMR mfspr r6,SPRN_UAMOR std r5,VCPU_AMR(r9) std r6,VCPU_UAMOR(r9) - li r6,0 - mtspr SPRN_AMR,r6 + ld r5,STACK_SLOT_AMR(r1) + ld r6,STACK_SLOT_UAMOR(r1) + mtspr SPRN_AMR, r5 mtspr SPRN_UAMOR, r6
/* Switch DSCR back to host value */ @@ -1746,11 +1756,9 @@ BEGIN_FTR_SECTION ld r5, STACK_SLOT_TID(r1) ld r6, STACK_SLOT_PSSCR(r1) ld r7, STACK_SLOT_PID(r1) - ld r8, STACK_SLOT_IAMR(r1) mtspr SPRN_TIDR, r5 mtspr SPRN_PSSCR, r6 mtspr SPRN_PID, r7 - mtspr SPRN_IAMR, r8 END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
#ifdef CONFIG_PPC_RADIX_MMU
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul Mackerras paulus@ozlabs.org
commit 19f8a5b5be2898573a5e1dc1db93e8d40117606a upstream.
Commit 24be85a23d1f ("powerpc/powernv: Clear PECE1 in LPCR via stop-api only on Hotplug", 2017-07-21) added two calls to opal_slw_set_reg() inside pnv_cpu_offline(), with the aim of changing the LPCR value in the SLW image to disable wakeups from the decrementer while a CPU is offline. However, pnv_cpu_offline() gets called each time a secondary CPU thread is woken up to participate in running a KVM guest, that is, not just when a CPU is offlined.
Since opal_slw_set_reg() is a very slow operation (with observed execution times around 20 milliseconds), this means that an offline secondary CPU can often be busy doing the opal_slw_set_reg() call when the primary CPU wants to grab all the secondary threads so that it can run a KVM guest. This leads to messages like "KVM: couldn't grab CPU n" being printed and guest execution failing.
There is no need to reprogram the SLW image on every KVM guest entry and exit. So that we do it only when a CPU is really transitioning between online and offline, this moves the calls to pnv_program_cpu_hotplug_lpcr() into pnv_smp_cpu_kill_self().
Fixes: 24be85a23d1f ("powerpc/powernv: Clear PECE1 in LPCR via stop-api only on Hotplug") Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: Paul Mackerras paulus@ozlabs.org Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/include/asm/powernv.h | 2 ++ arch/powerpc/platforms/powernv/idle.c | 27 ++------------------------- arch/powerpc/platforms/powernv/smp.c | 25 +++++++++++++++++++++++++ 3 files changed, 29 insertions(+), 25 deletions(-)
--- a/arch/powerpc/include/asm/powernv.h +++ b/arch/powerpc/include/asm/powernv.h @@ -23,6 +23,8 @@ extern int pnv_npu2_handle_fault(struct unsigned long *flags, unsigned long *status, int count);
+void pnv_program_cpu_hotplug_lpcr(unsigned int cpu, u64 lpcr_val); + void pnv_tm_init(void); #else static inline void powernv_set_nmmu_ptcr(unsigned long ptcr) { } --- a/arch/powerpc/platforms/powernv/idle.c +++ b/arch/powerpc/platforms/powernv/idle.c @@ -458,7 +458,8 @@ EXPORT_SYMBOL_GPL(pnv_power9_force_smt4_ #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
#ifdef CONFIG_HOTPLUG_CPU -static void pnv_program_cpu_hotplug_lpcr(unsigned int cpu, u64 lpcr_val) + +void pnv_program_cpu_hotplug_lpcr(unsigned int cpu, u64 lpcr_val) { u64 pir = get_hard_smp_processor_id(cpu);
@@ -481,20 +482,6 @@ unsigned long pnv_cpu_offline(unsigned i { unsigned long srr1; u32 idle_states = pnv_get_supported_cpuidle_states(); - u64 lpcr_val; - - /* - * We don't want to take decrementer interrupts while we are - * offline, so clear LPCR:PECE1. We keep PECE2 (and - * LPCR_PECE_HVEE on P9) enabled as to let IPIs in. - * - * If the CPU gets woken up by a special wakeup, ensure that - * the SLW engine sets LPCR with decrementer bit cleared, else - * the CPU will come back to the kernel due to a spurious - * wakeup. - */ - lpcr_val = mfspr(SPRN_LPCR) & ~(u64)LPCR_PECE1; - pnv_program_cpu_hotplug_lpcr(cpu, lpcr_val);
__ppc64_runlatch_off();
@@ -526,16 +513,6 @@ unsigned long pnv_cpu_offline(unsigned i
__ppc64_runlatch_on();
- /* - * Re-enable decrementer interrupts in LPCR. - * - * Further, we want stop states to be woken up by decrementer - * for non-hotplug cases. So program the LPCR via stop api as - * well. - */ - lpcr_val = mfspr(SPRN_LPCR) | (u64)LPCR_PECE1; - pnv_program_cpu_hotplug_lpcr(cpu, lpcr_val); - return srr1; } #endif --- a/arch/powerpc/platforms/powernv/smp.c +++ b/arch/powerpc/platforms/powernv/smp.c @@ -39,6 +39,7 @@ #include <asm/cpuidle.h> #include <asm/kexec.h> #include <asm/reg.h> +#include <asm/powernv.h>
#include "powernv.h"
@@ -153,6 +154,7 @@ static void pnv_smp_cpu_kill_self(void) { unsigned int cpu; unsigned long srr1, wmask; + u64 lpcr_val;
/* Standard hot unplug procedure */ /* @@ -174,6 +176,19 @@ static void pnv_smp_cpu_kill_self(void) if (cpu_has_feature(CPU_FTR_ARCH_207S)) wmask = SRR1_WAKEMASK_P8;
+ /* + * We don't want to take decrementer interrupts while we are + * offline, so clear LPCR:PECE1. We keep PECE2 (and + * LPCR_PECE_HVEE on P9) enabled so as to let IPIs in. + * + * If the CPU gets woken up by a special wakeup, ensure that + * the SLW engine sets LPCR with decrementer bit cleared, else + * the CPU will come back to the kernel due to a spurious + * wakeup. + */ + lpcr_val = mfspr(SPRN_LPCR) & ~(u64)LPCR_PECE1; + pnv_program_cpu_hotplug_lpcr(cpu, lpcr_val); + while (!generic_check_cpu_restart(cpu)) { /* * Clear IPI flag, since we don't handle IPIs while @@ -246,6 +261,16 @@ static void pnv_smp_cpu_kill_self(void)
}
+ /* + * Re-enable decrementer interrupts in LPCR. + * + * Further, we want stop states to be woken up by decrementer + * for non-hotplug cases. So program the LPCR via stop api as + * well. + */ + lpcr_val = mfspr(SPRN_LPCR) | (u64)LPCR_PECE1; + pnv_program_cpu_hotplug_lpcr(cpu, lpcr_val); + DBG("CPU%d coming online...\n", cpu); }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicholas Piggin npiggin@gmail.com
commit 7104dccfd052fde51eecc9972dad9c40bd3e0d11 upstream.
The slbfee. instruction must have bit 24 of RB clear, failure to do so can result in false negatives that result in incorrect assertions.
This is not obvious from the ISA v3.0B document, which only says:
The hardware ignores the contents of RB 36:38 40:63 -- p.1032
This patch fixes the bug and also clears all other bits from PPC bit 36-63, which is good practice when dealing with reserved or ignored bits.
Fixes: e15a4fea4dee ("powerpc/64s/hash: Add some SLB debugging tests") Cc: stable@vger.kernel.org # v4.20+ Reported-by: Aneesh Kumar K.V aneesh.kumar@linux.ibm.com Tested-by: Aneesh Kumar K.V aneesh.kumar@linux.ibm.com Signed-off-by: Nicholas Piggin npiggin@gmail.com Reviewed-by: Aneesh Kumar K.V aneesh.kumar@linux.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/mm/slb.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/arch/powerpc/mm/slb.c +++ b/arch/powerpc/mm/slb.c @@ -69,6 +69,11 @@ static void assert_slb_presence(bool pre if (!cpu_has_feature(CPU_FTR_ARCH_206)) return;
+ /* + * slbfee. requires bit 24 (PPC bit 39) be clear in RB. Hardware + * ignores all other bits from 0-27, so just clear them all. + */ + ea &= ~((1UL << 28) - 1); asm volatile(__PPC_SLBFEE_DOT(%0, %1) : "=r"(tmp) : "r"(ea) : "cr0");
WARN_ON(present == (tmp == 0));
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mark Cave-Ayland mark.cave-ayland@ilande.co.uk
commit fe1ef6bcdb4fca33434256a802a3ed6aacf0bd2f upstream.
Commit 8792468da5e1 "powerpc: Add the ability to save FPU without giving it up" unexpectedly removed the MSR_FE0 and MSR_FE1 bits from the bitmask used to update the MSR of the previous thread in __giveup_fpu() causing a KVM-PR MacOS guest to lockup and panic the host kernel.
Leaving FE0/1 enabled means unrelated processes might receive FPEs when they're not expecting them and crash. In particular if this happens to init the host will then panic.
eg (transcribed): qemu-system-ppc[837]: unhandled signal 8 at 12cc9ce4 nip 12cc9ce4 lr 12cc9ca4 code 0 systemd[1]: unhandled signal 8 at 202f02e0 nip 202f02e0 lr 001003d4 code 0 Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
Reinstate these bits to the MSR bitmask to enable MacOS guests to run under 32-bit KVM-PR once again without issue.
Fixes: 8792468da5e1 ("powerpc: Add the ability to save FPU without giving it up") Cc: stable@vger.kernel.org # v4.6+ Signed-off-by: Mark Cave-Ayland mark.cave-ayland@ilande.co.uk Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/kernel/process.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/powerpc/kernel/process.c +++ b/arch/powerpc/kernel/process.c @@ -176,7 +176,7 @@ static void __giveup_fpu(struct task_str
save_fpu(tsk); msr = tsk->thread.regs->msr; - msr &= ~MSR_FP; + msr &= ~(MSR_FP|MSR_FE0|MSR_FE1); #ifdef CONFIG_VSX if (cpu_has_feature(CPU_FTR_VSX)) msr &= ~MSR_VSX;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Ellerman mpe@ellerman.id.au
commit ca6d5149d2ad0a8d2f9c28cbe379802260a0a5e0 upstream.
GCC 8 warns about the logic in vr_get/set(), which with -Werror breaks the build:
In function ‘user_regset_copyin’, inlined from ‘vr_set’ at arch/powerpc/kernel/ptrace.c:628:9: include/linux/regset.h:295:4: error: ‘memcpy’ offset [-527, -529] is out of the bounds [0, 16] of object ‘vrsave’ with type ‘union <anonymous>’ [-Werror=array-bounds] arch/powerpc/kernel/ptrace.c: In function ‘vr_set’: arch/powerpc/kernel/ptrace.c:623:5: note: ‘vrsave’ declared here } vrsave;
This has been identified as a regression in GCC, see GCC bug 88273.
However we can avoid the warning and also simplify the logic and make it more robust.
Currently we pass -1 as end_pos to user_regset_copyout(). This says "copy up to the end of the regset".
The definition of the regset is: [REGSET_VMX] = { .core_note_type = NT_PPC_VMX, .n = 34, .size = sizeof(vector128), .align = sizeof(vector128), .active = vr_active, .get = vr_get, .set = vr_set },
The end is calculated as (n * size), ie. 34 * sizeof(vector128).
In vr_get/set() we pass start_pos as 33 * sizeof(vector128), meaning we can copy up to sizeof(vector128) into/out-of vrsave.
The on-stack vrsave is defined as: union { elf_vrreg_t reg; u32 word; } vrsave;
And elf_vrreg_t is: typedef __vector128 elf_vrreg_t;
So there is no bug, but we rely on all those sizes lining up, otherwise we would have a kernel stack exposure/overwrite on our hands.
Rather than relying on that we can pass an explict end_pos based on the sizeof(vrsave). The result should be exactly the same but it's more obviously not over-reading/writing the stack and it avoids the compiler warning.
Reported-by: Meelis Roos mroos@linux.ee Reported-by: Mathieu Malaterre malat@debian.org Cc: stable@vger.kernel.org Tested-by: Mathieu Malaterre malat@debian.org Tested-by: Meelis Roos mroos@linux.ee Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/kernel/ptrace.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/arch/powerpc/kernel/ptrace.c +++ b/arch/powerpc/kernel/ptrace.c @@ -561,6 +561,7 @@ static int vr_get(struct task_struct *ta /* * Copy out only the low-order word of vrsave. */ + int start, end; union { elf_vrreg_t reg; u32 word; @@ -569,8 +570,10 @@ static int vr_get(struct task_struct *ta
vrsave.word = target->thread.vrsave;
+ start = 33 * sizeof(vector128); + end = start + sizeof(vrsave); ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf, &vrsave, - 33 * sizeof(vector128), -1); + start, end); }
return ret; @@ -608,6 +611,7 @@ static int vr_set(struct task_struct *ta /* * We use only the first word of vrsave. */ + int start, end; union { elf_vrreg_t reg; u32 word; @@ -616,8 +620,10 @@ static int vr_set(struct task_struct *ta
vrsave.word = target->thread.vrsave;
+ start = 33 * sizeof(vector128); + end = start + sizeof(vrsave); ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &vrsave, - 33 * sizeof(vector128), -1); + start, end); if (!ret) target->thread.vrsave = vrsave.word; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aneesh Kumar K.V aneesh.kumar@linux.ibm.com
commit 35f2806b481f5b9207f25e1886cba5d1c4d12cc7 upstream.
We added runtime allocation of 16G pages in commit 4ae279c2c96a ("powerpc/mm/hugetlb: Allow runtime allocation of 16G.") That was done to enable 16G allocation on PowerNV and KVM config. In case of KVM config, we mostly would have the entire guest RAM backed by 16G hugetlb pages for this to work. PAPR do support partial backing of guest RAM with hugepages via ibm,expected#pages node of memory node in the device tree. This means rest of the guest RAM won't be backed by 16G contiguous pages in the host and hence a hash page table insertion can fail in such case.
An example error message will look like
hash-mmu: mm: Hashing failure ! EA=0x7efc00000000 access=0x8000000000000006 current=readback hash-mmu: trap=0x300 vsid=0x67af789 ssize=1 base psize=14 psize 14 pte=0xc000000400000386 readback[12260]: unhandled signal 7 at 00007efc00000000 nip 00000000100012d0 lr 000000001000127c code 2
This patch address that by preventing runtime allocation of 16G hugepages in LPAR config. To allocate 16G hugetlb one need to kernel command line hugepagesz=16G hugepages=<number of 16G pages>
With radix translation mode we don't run into this issue.
This change will prevent runtime allocation of 16G hugetlb pages on kvm with hash translation mode. However, with the current upstream it was observed that 16G hugetlbfs backed guest doesn't boot at all.
We observe boot failure with the below message: [131354.647546] KVM: map_vrma at 0 failed, ret=-4
That means this patch is not resulting in an observable regression. Once we fix the boot issue with 16G hugetlb backed memory, we need to use ibm,expected#pages memory node attribute to indicate 16G page reservation to the guest. This will also enable partial backing of guest RAM with 16G pages.
Fixes: 4ae279c2c96a ("powerpc/mm/hugetlb: Allow runtime allocation of 16G.") Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: Aneesh Kumar K.V aneesh.kumar@linux.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/include/asm/book3s/64/hugetlb.h | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/arch/powerpc/include/asm/book3s/64/hugetlb.h +++ b/arch/powerpc/include/asm/book3s/64/hugetlb.h @@ -35,6 +35,14 @@ static inline int hstate_get_psize(struc #ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE static inline bool gigantic_page_supported(void) { + /* + * We used gigantic page reservation with hypervisor assist in some case. + * We cannot use runtime allocation of gigantic pages in those platforms + * This is hash translation mode LPARs. + */ + if (firmware_has_feature(FW_FEATURE_LPAR) && !radix_enabled()) + return false; + return true; } #endif
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicholas Piggin npiggin@gmail.com
commit 1b5fc84aba170bdfe3533396ca9662ceea1609b7 upstream.
The NMI IPI timeout logic is broken, if __smp_send_nmi_ipi() times out on the first condition, delay_us will be zero which will send it into the second spin loop with no timeout so it will spin forever.
Fixes: 5b73151fff63 ("powerpc: NMI IPI make NMI IPIs fully sychronous") Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Nicholas Piggin npiggin@gmail.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/kernel/smp.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/arch/powerpc/kernel/smp.c +++ b/arch/powerpc/kernel/smp.c @@ -519,7 +519,7 @@ int __smp_send_nmi_ipi(int cpu, void (*f if (delay_us) { delay_us--; if (!delay_us) - break; + goto timeout; } }
@@ -530,10 +530,11 @@ int __smp_send_nmi_ipi(int cpu, void (*f if (delay_us) { delay_us--; if (!delay_us) - break; + goto timeout; } }
+timeout: if (!cpumask_empty(&nmi_ipi_pending_mask)) { /* Timeout waiting for CPUs to call smp_handle_nmi_ipi */ ret = 0;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicholas Piggin npiggin@gmail.com
commit 88b9a3d1425a436e95c41f09986fdae2daee437a upstream.
The xmon debugger IPI handler waits in the callback function while xmon is still active. This means they don't complete the IPI, and the initiator always times out waiting for them.
Things manage to work after the timeout because there is some fallback logic to keep NMI IPI state sane in case of the timeout, but this is a bit ugly.
This patch changes NMI IPI back to half-asynchronous (i.e., wait for everyone to call in, do not wait for IPI function to complete), but the complexity is avoided by going one step further and allowing new IPIs to be issued before the IPI functions to all complete.
If synchronization against that is required, it is left up to the caller, but current callers don't require that. In fact with the timeout handling, callers must be able to cope with this already.
Fixes: 5b73151fff63 ("powerpc: NMI IPI make NMI IPIs fully sychronous") Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Nicholas Piggin npiggin@gmail.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/kernel/smp.c | 93 ++++++++++++++-------------------------------- 1 file changed, 29 insertions(+), 64 deletions(-)
--- a/arch/powerpc/kernel/smp.c +++ b/arch/powerpc/kernel/smp.c @@ -358,13 +358,12 @@ void arch_send_call_function_ipi_mask(co * NMI IPIs may not be recoverable, so should not be used as ongoing part of * a running system. They can be used for crash, debug, halt/reboot, etc. * - * NMI IPIs are globally single threaded. No more than one in progress at - * any time. - * * The IPI call waits with interrupts disabled until all targets enter the - * NMI handler, then the call returns. + * NMI handler, then returns. Subsequent IPIs can be issued before targets + * have returned from their handlers, so there is no guarantee about + * concurrency or re-entrancy. * - * No new NMI can be initiated until targets exit the handler. + * A new NMI can be issued before all targets exit the handler. * * The IPI call may time out without all targets entering the NMI handler. * In that case, there is some logic to recover (and ignore subsequent @@ -375,7 +374,7 @@ void arch_send_call_function_ipi_mask(co
static atomic_t __nmi_ipi_lock = ATOMIC_INIT(0); static struct cpumask nmi_ipi_pending_mask; -static int nmi_ipi_busy_count = 0; +static bool nmi_ipi_busy = false; static void (*nmi_ipi_function)(struct pt_regs *) = NULL;
static void nmi_ipi_lock_start(unsigned long *flags) @@ -414,7 +413,7 @@ static void nmi_ipi_unlock_end(unsigned */ int smp_handle_nmi_ipi(struct pt_regs *regs) { - void (*fn)(struct pt_regs *); + void (*fn)(struct pt_regs *) = NULL; unsigned long flags; int me = raw_smp_processor_id(); int ret = 0; @@ -425,29 +424,17 @@ int smp_handle_nmi_ipi(struct pt_regs *r * because the caller may have timed out. */ nmi_ipi_lock_start(&flags); - if (!nmi_ipi_busy_count) - goto out; - if (!cpumask_test_cpu(me, &nmi_ipi_pending_mask)) - goto out; - - fn = nmi_ipi_function; - if (!fn) - goto out; - - cpumask_clear_cpu(me, &nmi_ipi_pending_mask); - nmi_ipi_busy_count++; - nmi_ipi_unlock(); - - ret = 1; - - fn(regs); - - nmi_ipi_lock(); - if (nmi_ipi_busy_count > 1) /* Can race with caller time-out */ - nmi_ipi_busy_count--; -out: + if (cpumask_test_cpu(me, &nmi_ipi_pending_mask)) { + cpumask_clear_cpu(me, &nmi_ipi_pending_mask); + fn = READ_ONCE(nmi_ipi_function); + WARN_ON_ONCE(!fn); + ret = 1; + } nmi_ipi_unlock_end(&flags);
+ if (fn) + fn(regs); + return ret; }
@@ -473,7 +460,7 @@ static void do_smp_send_nmi_ipi(int cpu, * - cpu is the target CPU (must not be this CPU), or NMI_IPI_ALL_OTHERS. * - fn is the target callback function. * - delay_us > 0 is the delay before giving up waiting for targets to - * complete executing the handler, == 0 specifies indefinite delay. + * begin executing the handler, == 0 specifies indefinite delay. */ int __smp_send_nmi_ipi(int cpu, void (*fn)(struct pt_regs *), u64 delay_us, bool safe) { @@ -487,31 +474,33 @@ int __smp_send_nmi_ipi(int cpu, void (*f if (unlikely(!smp_ops)) return 0;
- /* Take the nmi_ipi_busy count/lock with interrupts hard disabled */ nmi_ipi_lock_start(&flags); - while (nmi_ipi_busy_count) { + while (nmi_ipi_busy) { nmi_ipi_unlock_end(&flags); - spin_until_cond(nmi_ipi_busy_count == 0); + spin_until_cond(!nmi_ipi_busy); nmi_ipi_lock_start(&flags); } - + nmi_ipi_busy = true; nmi_ipi_function = fn;
+ WARN_ON_ONCE(!cpumask_empty(&nmi_ipi_pending_mask)); + if (cpu < 0) { /* ALL_OTHERS */ cpumask_copy(&nmi_ipi_pending_mask, cpu_online_mask); cpumask_clear_cpu(me, &nmi_ipi_pending_mask); } else { - /* cpumask starts clear */ cpumask_set_cpu(cpu, &nmi_ipi_pending_mask); } - nmi_ipi_busy_count++; + nmi_ipi_unlock();
+ /* Interrupts remain hard disabled */ + do_smp_send_nmi_ipi(cpu, safe);
nmi_ipi_lock(); - /* nmi_ipi_busy_count is held here, so unlock/lock is okay */ + /* nmi_ipi_busy is set here, so unlock/lock is okay */ while (!cpumask_empty(&nmi_ipi_pending_mask)) { nmi_ipi_unlock(); udelay(1); @@ -519,34 +508,19 @@ int __smp_send_nmi_ipi(int cpu, void (*f if (delay_us) { delay_us--; if (!delay_us) - goto timeout; + break; } }
- while (nmi_ipi_busy_count > 1) { - nmi_ipi_unlock(); - udelay(1); - nmi_ipi_lock(); - if (delay_us) { - delay_us--; - if (!delay_us) - goto timeout; - } - } - -timeout: if (!cpumask_empty(&nmi_ipi_pending_mask)) { /* Timeout waiting for CPUs to call smp_handle_nmi_ipi */ ret = 0; cpumask_clear(&nmi_ipi_pending_mask); } - if (nmi_ipi_busy_count > 1) { - /* Timeout waiting for CPUs to execute fn */ - ret = 0; - nmi_ipi_busy_count = 1; - }
- nmi_ipi_busy_count--; + nmi_ipi_function = NULL; + nmi_ipi_busy = false; + nmi_ipi_unlock_end(&flags);
return ret; @@ -614,17 +588,8 @@ void crash_send_ipi(void (*crash_ipi_cal static void nmi_stop_this_cpu(struct pt_regs *regs) { /* - * This is a special case because it never returns, so the NMI IPI - * handling would never mark it as done, which makes any later - * smp_send_nmi_ipi() call spin forever. Mark it done now. - * * IRQs are already hard disabled by the smp_handle_nmi_ipi. */ - nmi_ipi_lock(); - if (nmi_ipi_busy_count > 1) - nmi_ipi_busy_count--; - nmi_ipi_unlock(); - spin_begin(); while (1) spin_cpu_relax();
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christophe Leroy christophe.leroy@c-s.fr
commit 0bbea75c476b77fa7d7811d6be911cc7583e640f upstream.
Looks like book3s/32 doesn't set RI on machine check, so checking RI before calling die() will always be fatal allthought this is not an issue in most cases.
Fixes: b96672dd840f ("powerpc: Machine check interrupt is a non-maskable interrupt") Fixes: daf00ae71dad ("powerpc/traps: restore recoverability of machine_check interrupts") Signed-off-by: Christophe Leroy christophe.leroy@c-s.fr Cc: stable@vger.kernel.org Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/kernel/traps.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/arch/powerpc/kernel/traps.c +++ b/arch/powerpc/kernel/traps.c @@ -763,15 +763,15 @@ void machine_check_exception(struct pt_r if (check_io_access(regs)) goto bail;
- /* Must die if the interrupt is not recoverable */ - if (!(regs->msr & MSR_RI)) - nmi_panic(regs, "Unrecoverable Machine check"); - if (!nested) nmi_exit();
die("Machine check", regs, SIGBUS);
+ /* Must die if the interrupt is not recoverable */ + if (!(regs->msr & MSR_RI)) + nmi_panic(regs, "Unrecoverable Machine check"); + return;
bail:
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christophe Leroy christophe.leroy@c-s.fr
commit 9bf3d3c4e4fd82c7174f4856df372ab2a71005b9 upstream.
Today's message is useless:
[ 42.253267] Kernel stack overflow in process (ptrval), r1=c65500b0
This patch fixes it:
[ 66.905235] Kernel stack overflow in process sh[356], r1=c65560b0
Fixes: ad67b74d2469 ("printk: hash addresses printed with %p") Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Christophe Leroy christophe.leroy@c-s.fr [mpe: Use task_pid_nr()] Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/kernel/traps.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/powerpc/kernel/traps.c +++ b/arch/powerpc/kernel/traps.c @@ -1542,8 +1542,8 @@ bail:
void StackOverflow(struct pt_regs *regs) { - printk(KERN_CRIT "Kernel stack overflow in process %p, r1=%lx\n", - current, regs->gpr[1]); + pr_crit("Kernel stack overflow in process %s[%d], r1=%lx\n", + current->comm, task_pid_nr(current), regs->gpr[1]); debugger(regs); show_regs(regs); panic("kernel stack overflow");
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gustavo A. R. Silva gustavo@embeddedor.com
commit e2477233145f2156434afb799583bccd878f3e9f upstream.
Fix boolean expressions by using logical AND operator '&&' instead of bitwise operator '&'.
This issue was detected with the help of Coccinelle.
Fixes: 4fa084af28ca ("ARM: OSIRIS: DVS (Dynamic Voltage Scaling) supoort.") Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva gustavo@embeddedor.com [krzk: Fix -Wparentheses warning] Signed-off-by: Krzysztof Kozlowski krzk@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm/mach-s3c24xx/mach-osiris-dvs.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/arch/arm/mach-s3c24xx/mach-osiris-dvs.c +++ b/arch/arm/mach-s3c24xx/mach-osiris-dvs.c @@ -65,16 +65,16 @@ static int osiris_dvs_notify(struct noti
switch (val) { case CPUFREQ_PRECHANGE: - if (old_dvs & !new_dvs || - cur_dvs & !new_dvs) { + if ((old_dvs && !new_dvs) || + (cur_dvs && !new_dvs)) { pr_debug("%s: exiting dvs\n", __func__); cur_dvs = false; gpio_set_value(OSIRIS_GPIO_DVS, 1); } break; case CPUFREQ_POSTCHANGE: - if (!old_dvs & new_dvs || - !cur_dvs & new_dvs) { + if ((!old_dvs && new_dvs) || + (!cur_dvs && new_dvs)) { pr_debug("entering dvs\n"); cur_dvs = true; gpio_set_value(OSIRIS_GPIO_DVS, 0);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Julien Thierry julien.thierry@arm.com
commit 5870970b9a828d8693aa6d15742573289d7dbcd0 upstream.
When using VHE, the host needs to clear HCR_EL2.TGE bit in order to interact with guest TLBs, switching from EL2&0 translation regime to EL1&0.
However, some non-maskable asynchronous event could happen while TGE is cleared like SDEI. Because of this address translation operations relying on EL2&0 translation regime could fail (tlb invalidation, userspace access, ...).
Fix this by properly setting HCR_EL2.TGE when entering NMI context and clear it if necessary when returning to the interrupted context.
Signed-off-by: Julien Thierry julien.thierry@arm.com Suggested-by: Marc Zyngier marc.zyngier@arm.com Reviewed-by: Marc Zyngier marc.zyngier@arm.com Reviewed-by: James Morse james.morse@arm.com Cc: Arnd Bergmann arnd@arndb.de Cc: Will Deacon will.deacon@arm.com Cc: Marc Zyngier marc.zyngier@arm.com Cc: James Morse james.morse@arm.com Cc: linux-arch@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/include/asm/hardirq.h | 31 +++++++++++++++++++++++++++++++ arch/arm64/kernel/irq.c | 3 +++ include/linux/hardirq.h | 7 +++++++ 3 files changed, 41 insertions(+)
--- a/arch/arm64/include/asm/hardirq.h +++ b/arch/arm64/include/asm/hardirq.h @@ -17,8 +17,12 @@ #define __ASM_HARDIRQ_H
#include <linux/cache.h> +#include <linux/percpu.h> #include <linux/threads.h> +#include <asm/barrier.h> #include <asm/irq.h> +#include <asm/kvm_arm.h> +#include <asm/sysreg.h>
#define NR_IPI 7
@@ -37,6 +41,33 @@ u64 smp_irq_stat_cpu(unsigned int cpu);
#define __ARCH_IRQ_EXIT_IRQS_DISABLED 1
+struct nmi_ctx { + u64 hcr; +}; + +DECLARE_PER_CPU(struct nmi_ctx, nmi_contexts); + +#define arch_nmi_enter() \ + do { \ + if (is_kernel_in_hyp_mode()) { \ + struct nmi_ctx *nmi_ctx = this_cpu_ptr(&nmi_contexts); \ + nmi_ctx->hcr = read_sysreg(hcr_el2); \ + if (!(nmi_ctx->hcr & HCR_TGE)) { \ + write_sysreg(nmi_ctx->hcr | HCR_TGE, hcr_el2); \ + isb(); \ + } \ + } \ + } while (0) + +#define arch_nmi_exit() \ + do { \ + if (is_kernel_in_hyp_mode()) { \ + struct nmi_ctx *nmi_ctx = this_cpu_ptr(&nmi_contexts); \ + if (!(nmi_ctx->hcr & HCR_TGE)) \ + write_sysreg(nmi_ctx->hcr, hcr_el2); \ + } \ + } while (0) + static inline void ack_bad_irq(unsigned int irq) { extern unsigned long irq_err_count; --- a/arch/arm64/kernel/irq.c +++ b/arch/arm64/kernel/irq.c @@ -33,6 +33,9 @@
unsigned long irq_err_count;
+/* Only access this in an NMI enter/exit */ +DEFINE_PER_CPU(struct nmi_ctx, nmi_contexts); + DEFINE_PER_CPU(unsigned long *, irq_stack_ptr);
int arch_show_interrupts(struct seq_file *p, int prec) --- a/include/linux/hardirq.h +++ b/include/linux/hardirq.h @@ -60,8 +60,14 @@ extern void irq_enter(void); */ extern void irq_exit(void);
+#ifndef arch_nmi_enter +#define arch_nmi_enter() do { } while (0) +#define arch_nmi_exit() do { } while (0) +#endif + #define nmi_enter() \ do { \ + arch_nmi_enter(); \ printk_nmi_enter(); \ lockdep_off(); \ ftrace_nmi_enter(); \ @@ -80,6 +86,7 @@ extern void irq_exit(void); ftrace_nmi_exit(); \ lockdep_on(); \ printk_nmi_exit(); \ + arch_nmi_exit(); \ } while (0)
#endif /* LINUX_HARDIRQ_H */
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Will Deacon will.deacon@arm.com
commit b9a4b9d084d978f80eb9210727c81804588b42ff upstream.
FAR_EL1 is UNKNOWN for all debug exceptions other than those caused by taking a hardware watchpoint. Unfortunately, if a debug handler returns a non-zero value, then we will propagate the UNKNOWN FAR value to userspace via the si_addr field of the SIGTRAP siginfo_t.
Instead, let's set si_addr to take on the PC of the faulting instruction, which we have available in the current pt_regs.
Cc: stable@vger.kernel.org Reviewed-by: Mark Rutland mark.rutland@arm.com Signed-off-by: Will Deacon will.deacon@arm.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/mm/fault.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
--- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -824,11 +824,12 @@ void __init hook_debug_fault_code(int nr debug_fault_info[nr].name = name; }
-asmlinkage int __exception do_debug_exception(unsigned long addr, +asmlinkage int __exception do_debug_exception(unsigned long addr_if_watchpoint, unsigned int esr, struct pt_regs *regs) { const struct fault_info *inf = esr_to_debug_fault_info(esr); + unsigned long pc = instruction_pointer(regs); int rv;
/* @@ -838,14 +839,14 @@ asmlinkage int __exception do_debug_exce if (interrupts_enabled(regs)) trace_hardirqs_off();
- if (user_mode(regs) && !is_ttbr0_addr(instruction_pointer(regs))) + if (user_mode(regs) && !is_ttbr0_addr(pc)) arm64_apply_bp_hardening();
- if (!inf->fn(addr, esr, regs)) { + if (!inf->fn(addr_if_watchpoint, esr, regs)) { rv = 1; } else { arm64_notify_die(inf->name, regs, - inf->sig, inf->code, (void __user *)addr, esr); + inf->sig, inf->code, (void __user *)pc, esr); rv = 0; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Will Deacon will.deacon@arm.com
commit 6bd288569b50bc89fa5513031086746968f585cb upstream.
Debug exception handlers may be called for exceptions generated both by user and kernel code. In many cases, this is checked explicitly, but in other cases things either happen to work by happy accident or they go slightly wrong. For example, executing 'brk #4' from userspace will enter the kprobes code and be ignored, but the instruction will be retried forever in userspace instead of delivering a SIGTRAP.
Fix this issue in the most stable-friendly fashion by simply adding explicit checks of the triggering exception level to all of our debug exception handlers.
Cc: stable@vger.kernel.org Reviewed-by: Mark Rutland mark.rutland@arm.com Signed-off-by: Will Deacon will.deacon@arm.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/kernel/kgdb.c | 14 ++++++++++---- arch/arm64/kernel/probes/kprobes.c | 6 ++++++ 2 files changed, 16 insertions(+), 4 deletions(-)
--- a/arch/arm64/kernel/kgdb.c +++ b/arch/arm64/kernel/kgdb.c @@ -244,27 +244,33 @@ int kgdb_arch_handle_exception(int excep
static int kgdb_brk_fn(struct pt_regs *regs, unsigned int esr) { + if (user_mode(regs)) + return DBG_HOOK_ERROR; + kgdb_handle_exception(1, SIGTRAP, 0, regs); - return 0; + return DBG_HOOK_HANDLED; } NOKPROBE_SYMBOL(kgdb_brk_fn)
static int kgdb_compiled_brk_fn(struct pt_regs *regs, unsigned int esr) { + if (user_mode(regs)) + return DBG_HOOK_ERROR; + compiled_break = 1; kgdb_handle_exception(1, SIGTRAP, 0, regs);
- return 0; + return DBG_HOOK_HANDLED; } NOKPROBE_SYMBOL(kgdb_compiled_brk_fn);
static int kgdb_step_brk_fn(struct pt_regs *regs, unsigned int esr) { - if (!kgdb_single_step) + if (user_mode(regs) || !kgdb_single_step) return DBG_HOOK_ERROR;
kgdb_handle_exception(1, SIGTRAP, 0, regs); - return 0; + return DBG_HOOK_HANDLED; } NOKPROBE_SYMBOL(kgdb_step_brk_fn);
--- a/arch/arm64/kernel/probes/kprobes.c +++ b/arch/arm64/kernel/probes/kprobes.c @@ -450,6 +450,9 @@ kprobe_single_step_handler(struct pt_reg struct kprobe_ctlblk *kcb = get_kprobe_ctlblk(); int retval;
+ if (user_mode(regs)) + return DBG_HOOK_ERROR; + /* return error if this is not our step */ retval = kprobe_ss_hit(kcb, instruction_pointer(regs));
@@ -466,6 +469,9 @@ kprobe_single_step_handler(struct pt_reg int __kprobes kprobe_breakpoint_handler(struct pt_regs *regs, unsigned int esr) { + if (user_mode(regs)) + return DBG_HOOK_ERROR; + kprobe_handler(regs); return DBG_HOOK_HANDLED; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dave Martin Dave.Martin@arm.com
commit c88b093693ccbe41991ef2e9b1d251945e6e54ed upstream.
Due to what looks like a typo dating back to the original addition of FPEXC32_EL2 handling, KVM currently initialises this register to an architecturally invalid value.
As a result, the VECITR field (RES1) in bits [10:8] is initialised with 0, and the two reserved (RES0) bits [6:5] are initialised with 1. (In the Common VFP Subarchitecture as specified by ARMv7-A, these two bits were IMP DEF. ARMv8-A removes them.)
This patch changes the reset value from 0x70 to 0x700, which reflects the architectural constraints and is presumably what was originally intended.
Cc: stable@vger.kernel.org # 4.12.x- Cc: Christoffer Dall christoffer.dall@arm.com Fixes: 62a89c44954f ("arm64: KVM: 32bit handling of coprocessor traps") Signed-off-by: Dave Martin Dave.Martin@arm.com Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/kvm/sys_regs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -1476,7 +1476,7 @@ static const struct sys_reg_desc sys_reg
{ SYS_DESC(SYS_DACR32_EL2), NULL, reset_unknown, DACR32_EL2 }, { SYS_DESC(SYS_IFSR32_EL2), NULL, reset_unknown, IFSR32_EL2 }, - { SYS_DESC(SYS_FPEXC32_EL2), NULL, reset_val, FPEXC32_EL2, 0x70 }, + { SYS_DESC(SYS_FPEXC32_EL2), NULL, reset_val, FPEXC32_EL2, 0x700 }, };
static bool trap_dbgidr(struct kvm_vcpu *vcpu,
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ben Gardon bgardon@google.com
commit 92da008fa21034c369cdb8ca2b629fe5c196826b upstream.
This reverts commit 71883a62fcd6c70639fa12cda733378b4d997409.
The above commit contains an optimization to kvm_zap_gfn_range which uses gfn-limited TLB flushes, if enabled. If using these limited flushes, kvm_zap_gfn_range passes lock_flush_tlb=false to slot_handle_level_range which creates a race when the function unlocks to call cond_resched. See an example of this race below:
CPU 0 CPU 1 CPU 3 // zap_direct_gfn_range mmu_lock() // *ptep == pte_1 *ptep = 0 if (lock_flush_tlb) flush_tlbs() mmu_unlock() // In invalidate range // MMU notifier mmu_lock() if (pte != 0) *ptep = 0 flush = true if (flush) flush_remote_tlbs() mmu_unlock() return // Host MM reallocates // page previously // backing guest memory. // Guest accesses // invalid page // through pte_1 // in its TLB!!
Tested: Ran all kvm-unit-tests on a Intel Haswell machine with and without this patch. The patch introduced no new failures.
Signed-off-by: Ben Gardon bgardon@google.com Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/mmu.c | 16 +++------------- 1 file changed, 3 insertions(+), 13 deletions(-)
--- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -5635,13 +5635,8 @@ void kvm_zap_gfn_range(struct kvm *kvm, { struct kvm_memslots *slots; struct kvm_memory_slot *memslot; - bool flush_tlb = true; - bool flush = false; int i;
- if (kvm_available_flush_tlb_with_range()) - flush_tlb = false; - spin_lock(&kvm->mmu_lock); for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) { slots = __kvm_memslots(kvm, i); @@ -5653,17 +5648,12 @@ void kvm_zap_gfn_range(struct kvm *kvm, if (start >= end) continue;
- flush |= slot_handle_level_range(kvm, memslot, - kvm_zap_rmapp, PT_PAGE_TABLE_LEVEL, - PT_MAX_HUGEPAGE_LEVEL, start, - end - 1, flush_tlb); + slot_handle_level_range(kvm, memslot, kvm_zap_rmapp, + PT_PAGE_TABLE_LEVEL, PT_MAX_HUGEPAGE_LEVEL, + start, end - 1, true); } }
- if (flush) - kvm_flush_remote_tlbs_with_address(kvm, gfn_start, - gfn_end - gfn_start + 1); - spin_unlock(&kvm->mmu_lock); }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Corey Minyard cminyard@mvista.com
commit 41b766d661bf94a364960862cfc248a78313dbd3 upstream.
When excuting a command like: modprobe ipmi_si ports=0xffc0e3 type=bt The system would get an oops.
The trouble here is that ipmi_si_hardcode_find_bmc() is called before ipmi_si_platform_init(), but initialization of the hard-coded device creates an IPMI platform device, which won't be initialized yet.
The real trouble is that hard-coded devices aren't created with any device, and the fixup is done later. So do it right, create the hard-coded devices as normal platform devices.
This required adding some new resource types to the IPMI platform code for passing information required by the hard-coded device and adding some code to remove the hard-coded platform devices on module removal.
To enforce the "hard-coded devices passed by the user take priority over firmware devices" rule, some special code was added to check and see if a hard-coded device already exists.
Reported-by: Yang Yingliang yangyingliang@huawei.com Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Corey Minyard cminyard@mvista.com Tested-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/char/ipmi/ipmi_si.h | 4 drivers/char/ipmi/ipmi_si_hardcode.c | 232 +++++++++++++++++++++++++---------- drivers/char/ipmi/ipmi_si_intf.c | 23 ++- drivers/char/ipmi/ipmi_si_platform.c | 29 +++- 4 files changed, 214 insertions(+), 74 deletions(-)
--- a/drivers/char/ipmi/ipmi_si.h +++ b/drivers/char/ipmi/ipmi_si.h @@ -25,7 +25,9 @@ void ipmi_irq_finish_setup(struct si_sm_ int ipmi_si_remove_by_dev(struct device *dev); void ipmi_si_remove_by_data(int addr_space, enum si_type si_type, unsigned long addr); -int ipmi_si_hardcode_find_bmc(void); +void ipmi_hardcode_init(void); +void ipmi_si_hardcode_exit(void); +int ipmi_si_hardcode_match(int addr_type, unsigned long addr); void ipmi_si_platform_init(void); void ipmi_si_platform_shutdown(void);
--- a/drivers/char/ipmi/ipmi_si_hardcode.c +++ b/drivers/char/ipmi/ipmi_si_hardcode.c @@ -3,6 +3,7 @@ #define pr_fmt(fmt) "ipmi_hardcode: " fmt
#include <linux/moduleparam.h> +#include <linux/platform_device.h> #include "ipmi_si.h"
/* @@ -12,23 +13,22 @@
#define SI_MAX_PARMS 4
-static char *si_type[SI_MAX_PARMS]; #define MAX_SI_TYPE_STR 30 -static char si_type_str[MAX_SI_TYPE_STR]; +static char si_type_str[MAX_SI_TYPE_STR] __initdata; static unsigned long addrs[SI_MAX_PARMS]; static unsigned int num_addrs; static unsigned int ports[SI_MAX_PARMS]; static unsigned int num_ports; -static int irqs[SI_MAX_PARMS]; -static unsigned int num_irqs; -static int regspacings[SI_MAX_PARMS]; -static unsigned int num_regspacings; -static int regsizes[SI_MAX_PARMS]; -static unsigned int num_regsizes; -static int regshifts[SI_MAX_PARMS]; -static unsigned int num_regshifts; -static int slave_addrs[SI_MAX_PARMS]; /* Leaving 0 chooses the default value */ -static unsigned int num_slave_addrs; +static int irqs[SI_MAX_PARMS] __initdata; +static unsigned int num_irqs __initdata; +static int regspacings[SI_MAX_PARMS] __initdata; +static unsigned int num_regspacings __initdata; +static int regsizes[SI_MAX_PARMS] __initdata; +static unsigned int num_regsizes __initdata; +static int regshifts[SI_MAX_PARMS] __initdata; +static unsigned int num_regshifts __initdata; +static int slave_addrs[SI_MAX_PARMS] __initdata; +static unsigned int num_slave_addrs __initdata;
module_param_string(type, si_type_str, MAX_SI_TYPE_STR, 0); MODULE_PARM_DESC(type, "Defines the type of each interface, each" @@ -73,12 +73,133 @@ MODULE_PARM_DESC(slave_addrs, "Set the d " overridden by this parm. This is an array indexed" " by interface number.");
-int ipmi_si_hardcode_find_bmc(void) +static struct platform_device *ipmi_hc_pdevs[SI_MAX_PARMS]; + +static void __init ipmi_hardcode_init_one(const char *si_type_str, + unsigned int i, + unsigned long addr, + unsigned int flags) +{ + struct platform_device *pdev; + unsigned int num_r = 1, size; + struct resource r[4]; + struct property_entry p[6]; + enum si_type si_type; + unsigned int regspacing, regsize; + int rv; + + memset(p, 0, sizeof(p)); + memset(r, 0, sizeof(r)); + + if (!si_type_str || !*si_type_str || strcmp(si_type_str, "kcs") == 0) { + size = 2; + si_type = SI_KCS; + } else if (strcmp(si_type_str, "smic") == 0) { + size = 2; + si_type = SI_SMIC; + } else if (strcmp(si_type_str, "bt") == 0) { + size = 3; + si_type = SI_BT; + } else if (strcmp(si_type_str, "invalid") == 0) { + /* + * Allow a firmware-specified interface to be + * disabled. + */ + size = 1; + si_type = SI_TYPE_INVALID; + } else { + pr_warn("Interface type specified for interface %d, was invalid: %s\n", + i, si_type_str); + return; + } + + regsize = regsizes[i]; + if (regsize == 0) + regsize = DEFAULT_REGSIZE; + + p[0] = PROPERTY_ENTRY_U8("ipmi-type", si_type); + p[1] = PROPERTY_ENTRY_U8("slave-addr", slave_addrs[i]); + p[2] = PROPERTY_ENTRY_U8("addr-source", SI_HARDCODED); + p[3] = PROPERTY_ENTRY_U8("reg-shift", regshifts[i]); + p[4] = PROPERTY_ENTRY_U8("reg-size", regsize); + /* Last entry must be left NULL to terminate it. */ + + /* + * Register spacing is derived from the resources in + * the IPMI platform code. + */ + regspacing = regspacings[i]; + if (regspacing == 0) + regspacing = regsize; + + r[0].start = addr; + r[0].end = r[0].start + regsize - 1; + r[0].name = "IPMI Address 1"; + r[0].flags = flags; + + if (size > 1) { + r[1].start = r[0].start + regspacing; + r[1].end = r[1].start + regsize - 1; + r[1].name = "IPMI Address 2"; + r[1].flags = flags; + num_r++; + } + + if (size > 2) { + r[2].start = r[1].start + regspacing; + r[2].end = r[2].start + regsize - 1; + r[2].name = "IPMI Address 3"; + r[2].flags = flags; + num_r++; + } + + if (irqs[i]) { + r[num_r].start = irqs[i]; + r[num_r].end = irqs[i]; + r[num_r].name = "IPMI IRQ"; + r[num_r].flags = IORESOURCE_IRQ; + num_r++; + } + + pdev = platform_device_alloc("hardcode-ipmi-si", i); + if (!pdev) { + pr_err("Error allocating IPMI platform device %d\n", i); + return; + } + + rv = platform_device_add_resources(pdev, r, num_r); + if (rv) { + dev_err(&pdev->dev, + "Unable to add hard-code resources: %d\n", rv); + goto err; + } + + rv = platform_device_add_properties(pdev, p); + if (rv) { + dev_err(&pdev->dev, + "Unable to add hard-code properties: %d\n", rv); + goto err; + } + + rv = platform_device_add(pdev); + if (rv) { + dev_err(&pdev->dev, + "Unable to add hard-code device: %d\n", rv); + goto err; + } + + ipmi_hc_pdevs[i] = pdev; + return; + +err: + platform_device_put(pdev); +} + +void __init ipmi_hardcode_init(void) { - int ret = -ENODEV; - int i; - struct si_sm_io io; + unsigned int i; char *str; + char *si_type[SI_MAX_PARMS];
/* Parse out the si_type string into its components. */ str = si_type_str; @@ -95,54 +216,45 @@ int ipmi_si_hardcode_find_bmc(void) } }
- memset(&io, 0, sizeof(io)); for (i = 0; i < SI_MAX_PARMS; i++) { - if (!ports[i] && !addrs[i]) - continue; - - io.addr_source = SI_HARDCODED; - pr_info("probing via hardcoded address\n"); + if (i < num_ports && ports[i]) + ipmi_hardcode_init_one(si_type[i], i, ports[i], + IORESOURCE_IO); + if (i < num_addrs && addrs[i]) + ipmi_hardcode_init_one(si_type[i], i, addrs[i], + IORESOURCE_MEM); + } +}
- if (!si_type[i] || strcmp(si_type[i], "kcs") == 0) { - io.si_type = SI_KCS; - } else if (strcmp(si_type[i], "smic") == 0) { - io.si_type = SI_SMIC; - } else if (strcmp(si_type[i], "bt") == 0) { - io.si_type = SI_BT; - } else { - pr_warn("Interface type specified for interface %d, was invalid: %s\n", - i, si_type[i]); - continue; - } +void ipmi_si_hardcode_exit(void) +{ + unsigned int i;
- if (ports[i]) { - /* An I/O port */ - io.addr_data = ports[i]; - io.addr_type = IPMI_IO_ADDR_SPACE; - } else if (addrs[i]) { - /* A memory port */ - io.addr_data = addrs[i]; - io.addr_type = IPMI_MEM_ADDR_SPACE; - } else { - pr_warn("Interface type specified for interface %d, but port and address were not set or set to zero\n", - i); - continue; - } + for (i = 0; i < SI_MAX_PARMS; i++) { + if (ipmi_hc_pdevs[i]) + platform_device_unregister(ipmi_hc_pdevs[i]); + } +}
- io.addr = NULL; - io.regspacing = regspacings[i]; - if (!io.regspacing) - io.regspacing = DEFAULT_REGSPACING; - io.regsize = regsizes[i]; - if (!io.regsize) - io.regsize = DEFAULT_REGSIZE; - io.regshift = regshifts[i]; - io.irq = irqs[i]; - if (io.irq) - io.irq_setup = ipmi_std_irq_setup; - io.slave_addr = slave_addrs[i]; +/* + * Returns true of the given address exists as a hardcoded address, + * false if not. + */ +int ipmi_si_hardcode_match(int addr_type, unsigned long addr) +{ + unsigned int i;
- ret = ipmi_si_add_smi(&io); + if (addr_type == IPMI_IO_ADDR_SPACE) { + for (i = 0; i < num_ports; i++) { + if (ports[i] == addr) + return 1; + } + } else { + for (i = 0; i < num_addrs; i++) { + if (addrs[i] == addr) + return 1; + } } - return ret; + + return 0; } --- a/drivers/char/ipmi/ipmi_si_intf.c +++ b/drivers/char/ipmi/ipmi_si_intf.c @@ -1862,6 +1862,18 @@ int ipmi_si_add_smi(struct si_sm_io *io) int rv = 0; struct smi_info *new_smi, *dup;
+ /* + * If the user gave us a hard-coded device at the same + * address, they presumably want us to use it and not what is + * in the firmware. + */ + if (io->addr_source != SI_HARDCODED && + ipmi_si_hardcode_match(io->addr_type, io->addr_data)) { + dev_info(io->dev, + "Hard-coded device at this address already exists"); + return -ENODEV; + } + if (!io->io_setup) { if (io->addr_type == IPMI_IO_ADDR_SPACE) { io->io_setup = ipmi_si_port_setup; @@ -2089,7 +2101,7 @@ static int try_smi_init(struct smi_info return rv; }
-static int init_ipmi_si(void) +static int __init init_ipmi_si(void) { struct smi_info *e; enum ipmi_addr_src type = SI_INVALID; @@ -2097,11 +2109,9 @@ static int init_ipmi_si(void) if (initialized) return 0;
- pr_info("IPMI System Interface driver\n"); + ipmi_hardcode_init();
- /* If the user gave us a device, they presumably want us to use it */ - if (!ipmi_si_hardcode_find_bmc()) - goto do_scan; + pr_info("IPMI System Interface driver\n");
ipmi_si_platform_init();
@@ -2113,7 +2123,6 @@ static int init_ipmi_si(void) with multiple BMCs we assume that there will be several instances of a given type so if we succeed in registering a type then also try to register everything else of the same type */ -do_scan: mutex_lock(&smi_infos_lock); list_for_each_entry(e, &smi_infos, link) { /* Try to register a device if it has an IRQ and we either @@ -2299,6 +2308,8 @@ static void cleanup_ipmi_si(void) list_for_each_entry_safe(e, tmp_e, &smi_infos, link) cleanup_one_si(e); mutex_unlock(&smi_infos_lock); + + ipmi_si_hardcode_exit(); } module_exit(cleanup_ipmi_si);
--- a/drivers/char/ipmi/ipmi_si_platform.c +++ b/drivers/char/ipmi/ipmi_si_platform.c @@ -128,8 +128,6 @@ ipmi_get_info_from_resources(struct plat if (res_second->start > io->addr_data) io->regspacing = res_second->start - io->addr_data; } - io->regsize = DEFAULT_REGSIZE; - io->regshift = 0;
return res; } @@ -137,7 +135,7 @@ ipmi_get_info_from_resources(struct plat static int platform_ipmi_probe(struct platform_device *pdev) { struct si_sm_io io; - u8 type, slave_addr, addr_source; + u8 type, slave_addr, addr_source, regsize, regshift; int rv;
rv = device_property_read_u8(&pdev->dev, "addr-source", &addr_source); @@ -149,7 +147,7 @@ static int platform_ipmi_probe(struct pl if (addr_source == SI_SMBIOS) { if (!si_trydmi) return -ENODEV; - } else { + } else if (addr_source != SI_HARDCODED) { if (!si_tryplatform) return -ENODEV; } @@ -169,11 +167,23 @@ static int platform_ipmi_probe(struct pl case SI_BT: io.si_type = type; break; + case SI_TYPE_INVALID: /* User disabled this in hardcode. */ + return -ENODEV; default: dev_err(&pdev->dev, "ipmi-type property is invalid\n"); return -EINVAL; }
+ io.regsize = DEFAULT_REGSIZE; + rv = device_property_read_u8(&pdev->dev, "reg-size", ®size); + if (!rv) + io.regsize = regsize; + + io.regshift = 0; + rv = device_property_read_u8(&pdev->dev, "reg-shift", ®shift); + if (!rv) + io.regshift = regshift; + if (!ipmi_get_info_from_resources(pdev, &io)) return -EINVAL;
@@ -193,7 +203,8 @@ static int platform_ipmi_probe(struct pl
io.dev = &pdev->dev;
- pr_info("ipmi_si: SMBIOS: %s %#lx regsize %d spacing %d irq %d\n", + pr_info("ipmi_si: %s: %s %#lx regsize %d spacing %d irq %d\n", + ipmi_addr_src_to_str(addr_source), (io.addr_type == IPMI_IO_ADDR_SPACE) ? "io" : "mem", io.addr_data, io.regsize, io.regspacing, io.irq);
@@ -358,6 +369,9 @@ static int acpi_ipmi_probe(struct platfo goto err_free; }
+ io.regsize = DEFAULT_REGSIZE; + io.regshift = 0; + res = ipmi_get_info_from_resources(pdev, &io); if (!res) { rv = -EINVAL; @@ -420,8 +434,9 @@ static int ipmi_remove(struct platform_d }
static const struct platform_device_id si_plat_ids[] = { - { "dmi-ipmi-si", 0 }, - { } + { "dmi-ipmi-si", 0 }, + { "hardcode-ipmi-si", 0 }, + { } };
struct platform_driver ipmi_platform_driver = {
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yang Yingliang yangyingliang@huawei.com
commit 401e7e88d4ef80188ffa07095ac00456f901b8c4 upstream.
When we excute the following commands, we got oops rmmod ipmi_si cat /proc/ioports
[ 1623.482380] Unable to handle kernel paging request at virtual address ffff00000901d478 [ 1623.482382] Mem abort info: [ 1623.482383] ESR = 0x96000007 [ 1623.482385] Exception class = DABT (current EL), IL = 32 bits [ 1623.482386] SET = 0, FnV = 0 [ 1623.482387] EA = 0, S1PTW = 0 [ 1623.482388] Data abort info: [ 1623.482389] ISV = 0, ISS = 0x00000007 [ 1623.482390] CM = 0, WnR = 0 [ 1623.482393] swapper pgtable: 4k pages, 48-bit VAs, pgdp = 00000000d7d94a66 [ 1623.482395] [ffff00000901d478] pgd=000000dffbfff003, pud=000000dffbffe003, pmd=0000003f5d06e003, pte=0000000000000000 [ 1623.482399] Internal error: Oops: 96000007 [#1] SMP [ 1623.487407] Modules linked in: ipmi_si(E) nls_utf8 isofs rpcrdma ib_iser ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_umad rdma_cm ib_cm dm_mirror dm_region_hash dm_log iw_cm dm_mod aes_ce_blk crypto_simd cryptd aes_ce_cipher ses ghash_ce sha2_ce enclosure sha256_arm64 sg sha1_ce hisi_sas_v2_hw hibmc_drm sbsa_gwdt hisi_sas_main ip_tables mlx5_ib ib_uverbs marvell ib_core mlx5_core ixgbe mdio hns_dsaf ipmi_devintf hns_enet_drv ipmi_msghandler hns_mdio [last unloaded: ipmi_si] [ 1623.532410] CPU: 30 PID: 11438 Comm: cat Kdump: loaded Tainted: G E 5.0.0-rc3+ #168 [ 1623.541498] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.37 11/21/2017 [ 1623.548822] pstate: a0000005 (NzCv daif -PAN -UAO) [ 1623.553684] pc : string+0x28/0x98 [ 1623.557040] lr : vsnprintf+0x368/0x5e8 [ 1623.560837] sp : ffff000013213a80 [ 1623.564191] x29: ffff000013213a80 x28: ffff00001138abb5 [ 1623.569577] x27: ffff000013213c18 x26: ffff805f67d06049 [ 1623.574963] x25: 0000000000000000 x24: ffff00001138abb5 [ 1623.580349] x23: 0000000000000fb7 x22: ffff0000117ed000 [ 1623.585734] x21: ffff000011188fd8 x20: ffff805f67d07000 [ 1623.591119] x19: ffff805f67d06061 x18: ffffffffffffffff [ 1623.596505] x17: 0000000000000200 x16: 0000000000000000 [ 1623.601890] x15: ffff0000117ed748 x14: ffff805f67d07000 [ 1623.607276] x13: ffff805f67d0605e x12: 0000000000000000 [ 1623.612661] x11: 0000000000000000 x10: 0000000000000000 [ 1623.618046] x9 : 0000000000000000 x8 : 000000000000000f [ 1623.623432] x7 : ffff805f67d06061 x6 : fffffffffffffffe [ 1623.628817] x5 : 0000000000000012 x4 : ffff00000901d478 [ 1623.634203] x3 : ffff0a00ffffff04 x2 : ffff805f67d07000 [ 1623.639588] x1 : ffff805f67d07000 x0 : ffffffffffffffff [ 1623.644974] Process cat (pid: 11438, stack limit = 0x000000008d4cbc10) [ 1623.651592] Call trace: [ 1623.654068] string+0x28/0x98 [ 1623.657071] vsnprintf+0x368/0x5e8 [ 1623.660517] seq_vprintf+0x70/0x98 [ 1623.668009] seq_printf+0x7c/0xa0 [ 1623.675530] r_show+0xc8/0xf8 [ 1623.682558] seq_read+0x330/0x440 [ 1623.689877] proc_reg_read+0x78/0xd0 [ 1623.697346] __vfs_read+0x60/0x1a0 [ 1623.704564] vfs_read+0x94/0x150 [ 1623.711339] ksys_read+0x6c/0xd8 [ 1623.717939] __arm64_sys_read+0x24/0x30 [ 1623.725077] el0_svc_common+0x120/0x148 [ 1623.732035] el0_svc_handler+0x30/0x40 [ 1623.738757] el0_svc+0x8/0xc [ 1623.744520] Code: d1000406 aa0103e2 54000149 b4000080 (39400085) [ 1623.753441] ---[ end trace f91b6a4937de9835 ]--- [ 1623.760871] Kernel panic - not syncing: Fatal exception [ 1623.768935] SMP: stopping secondary CPUs [ 1623.775718] Kernel Offset: disabled [ 1623.781998] CPU features: 0x002,21006008 [ 1623.788777] Memory Limit: none [ 1623.798329] Starting crashdump kernel... [ 1623.805202] Bye!
If io_setup is called successful in try_smi_init() but try_smi_init() goes out_err before calling ipmi_register_smi(), so ipmi_unregister_smi() will not be called while removing module. It leads to the resource that allocated in io_setup() can not be freed, but the name(DEVICE_NAME) of resource is freed while removing the module. It causes use-after-free when cat /proc/ioports.
Fix this by calling io_cleanup() while try_smi_init() goes to out_err. and don't call io_cleanup() until io_setup() returns successful to avoid warning prints.
Fixes: 93c303d2045b ("ipmi_si: Clean up shutdown a bit") Cc: stable@vger.kernel.org Reported-by: NuoHan Qiao qiaonuohan@huawei.com Suggested-by: Corey Minyard cminyard@mvista.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Corey Minyard cminyard@mvista.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/char/ipmi/ipmi_si_intf.c | 5 +++++ drivers/char/ipmi/ipmi_si_mem_io.c | 5 +++-- drivers/char/ipmi/ipmi_si_port_io.c | 5 +++-- 3 files changed, 11 insertions(+), 4 deletions(-)
--- a/drivers/char/ipmi/ipmi_si_intf.c +++ b/drivers/char/ipmi/ipmi_si_intf.c @@ -2097,6 +2097,11 @@ static int try_smi_init(struct smi_info WARN_ON(new_smi->io.dev->init_name != NULL);
out_err: + if (rv && new_smi->io.io_cleanup) { + new_smi->io.io_cleanup(&new_smi->io); + new_smi->io.io_cleanup = NULL; + } + kfree(init_name); return rv; } --- a/drivers/char/ipmi/ipmi_si_mem_io.c +++ b/drivers/char/ipmi/ipmi_si_mem_io.c @@ -81,8 +81,6 @@ int ipmi_si_mem_setup(struct si_sm_io *i if (!addr) return -ENODEV;
- io->io_cleanup = mem_cleanup; - /* * Figure out the actual readb/readw/readl/etc routine to use based * upon the register size. @@ -141,5 +139,8 @@ int ipmi_si_mem_setup(struct si_sm_io *i mem_region_cleanup(io, io->io_size); return -EIO; } + + io->io_cleanup = mem_cleanup; + return 0; } --- a/drivers/char/ipmi/ipmi_si_port_io.c +++ b/drivers/char/ipmi/ipmi_si_port_io.c @@ -68,8 +68,6 @@ int ipmi_si_port_setup(struct si_sm_io * if (!addr) return -ENODEV;
- io->io_cleanup = port_cleanup; - /* * Figure out the actual inb/inw/inl/etc routine to use based * upon the register size. @@ -109,5 +107,8 @@ int ipmi_si_port_setup(struct si_sm_io * return -EIO; } } + + io->io_cleanup = port_cleanup; + return 0; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: NeilBrown neil@brown.name
commit 0bdb50c531f7377a9da80d3ce2d61f389c84cb30 upstream.
A dm-raid array with devices larger than 4GB won't assemble on a 32 bit host since _check_data_dev_sectors() was added in 4.16. This is because to_sector() treats its argument as an "unsigned long" which is 32bits (4GB) on a 32bit host. Using "unsigned long long" is more correct.
Kernels as early as 4.2 can have other problems due to to_sector() being used on the size of a device.
Fixes: 0cf4503174c1 ("dm raid: add support for the MD RAID0 personality") cc: stable@vger.kernel.org (v4.2+) Reported-and-tested-by: Guillaume Perréal gperreal@free.fr Signed-off-by: NeilBrown neil@brown.name Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/device-mapper.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/linux/device-mapper.h +++ b/include/linux/device-mapper.h @@ -609,7 +609,7 @@ do { \ */ #define dm_target_offset(ti, sector) ((sector) - (ti)->begin)
-static inline sector_t to_sector(unsigned long n) +static inline sector_t to_sector(unsigned long long n) { return (n >> SECTOR_SHIFT); }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mikulas Patocka mpatocka@redhat.com
commit 225557446856448039a9e495da37b72c20071ef2 upstream.
When using dm-integrity underneath md-raid, some tests with raid auto-correction trigger large amounts of integrity failures - and all these failures print an error message. These messages can bring the system to a halt if the system is using serial console.
Fix this by limiting the rate of error messages - it improves the speed of raid recovery and avoids the hang.
Fixes: 7eada909bfd7a ("dm: add integrity target") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Mikulas Patocka mpatocka@redhat.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/dm-integrity.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/md/dm-integrity.c +++ b/drivers/md/dm-integrity.c @@ -1368,8 +1368,8 @@ again: checksums_ptr - checksums, !dio->write ? TAG_CMP : TAG_WRITE); if (unlikely(r)) { if (r > 0) { - DMERR("Checksum failed at sector 0x%llx", - (unsigned long long)(sector - ((r + ic->tag_size - 1) / ic->tag_size))); + DMERR_LIMIT("Checksum failed at sector 0x%llx", + (unsigned long long)(sector - ((r + ic->tag_size - 1) / ic->tag_size))); r = -EILSEQ; atomic64_inc(&ic->number_of_mismatches); } @@ -1561,8 +1561,8 @@ retry_kmap:
integrity_sector_checksum(ic, logical_sector, mem + bv.bv_offset, checksums_onstack); if (unlikely(memcmp(checksums_onstack, journal_entry_tag(ic, je), ic->tag_size))) { - DMERR("Checksum failed when reading from journal, at sector 0x%llx", - (unsigned long long)logical_sector); + DMERR_LIMIT("Checksum failed when reading from journal, at sector 0x%llx", + (unsigned long long)logical_sector); } } #endif
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cody P Schafer dev@codyps.com
commit 46c039d06b6ecabb94bd16c3a999b28dc83b79ce upstream.
Without this, we get failures like this when the kernel attempts to initialize a cx231xx device:
[16046.153653] cx231xx 3-1.2:1.1: New device Hauppauge Hauppauge Device @ 480 Mbps (2040:c200) with 6 interfaces [16046.153900] cx231xx 3-1.2:1.1: can't change interface 3 alt no. to 3: Max. Pkt size = 0 [16046.153907] cx231xx 3-1.2:1.1: Identified as Hauppauge USB Live 2 (card=9) [16046.154350] i2c i2c-11: Added multiplexed i2c bus 13 [16046.154379] i2c i2c-11: Added multiplexed i2c bus 14 [16046.267194] cx25840 10-0044: cx23102 A/V decoder found @ 0x88 (cx231xx #0-0) [16048.424551] cx25840 10-0044: loaded v4l-cx231xx-avcore-01.fw firmware (16382 bytes) [16048.463224] cx231xx 3-1.2:1.1: v4l2 driver version 0.0.3 [16048.567878] cx231xx 3-1.2:1.1: Registered video device video2 [v4l2] [16048.568001] cx231xx 3-1.2:1.1: Registered VBI device vbi0 [16048.568419] cx231xx 3-1.2:1.1: audio EndPoint Addr 0x83, Alternate settings: 3 [16048.568425] cx231xx 3-1.2:1.1: video EndPoint Addr 0x84, Alternate settings: 5 [16048.568431] cx231xx 3-1.2:1.1: VBI EndPoint Addr 0x85, Alternate settings: 2 [16048.568436] cx231xx 3-1.2:1.1: sliced CC EndPoint Addr 0x86, Alternate settings: 2 [16048.568448] usb 3-1.2: couldn't get decoder output pad for V4L I/O [16048.568453] cx231xx 3-1.2:1.1: V4L2 device vbi0 deregistered [16048.568579] cx231xx 3-1.2:1.1: V4L2 device video2 deregistered [16048.569001] cx231xx: probe of 3-1.2:1.1 failed with error -22
Likely a regession since Commit 9d6d20e652c0 ("media: v4l2-mc: switch it to use the new approach to setup pipelines") (v4.19-rc1-100-g9d6d20e652c0), which introduced the use of PAD_SIGNAL_DV within v4l2_mc_create_media_graph().
This also modifies cx25840 to remove the VBI pad, matching the action taken in Commit 092a37875a22 ("media: v4l2: remove VBI output pad").
Fixes: 9d6d20e652c0 ("media: v4l2-mc: switch it to use the new approach to setup pipelines")
Cc: stable@vger.kernel.org Signed-off-by: Cody P Schafer dev@codyps.com Tested-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/i2c/cx25840/cx25840-core.c | 3 ++- drivers/media/i2c/cx25840/cx25840-core.h | 1 - 2 files changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/media/i2c/cx25840/cx25840-core.c +++ b/drivers/media/i2c/cx25840/cx25840-core.c @@ -5216,8 +5216,9 @@ static int cx25840_probe(struct i2c_clie * those extra inputs. So, let's add it only when needed. */ state->pads[CX25840_PAD_INPUT].flags = MEDIA_PAD_FL_SINK; + state->pads[CX25840_PAD_INPUT].sig_type = PAD_SIGNAL_ANALOG; state->pads[CX25840_PAD_VID_OUT].flags = MEDIA_PAD_FL_SOURCE; - state->pads[CX25840_PAD_VBI_OUT].flags = MEDIA_PAD_FL_SOURCE; + state->pads[CX25840_PAD_VID_OUT].sig_type = PAD_SIGNAL_DV; sd->entity.function = MEDIA_ENT_F_ATV_DECODER;
ret = media_entity_pads_init(&sd->entity, ARRAY_SIZE(state->pads), --- a/drivers/media/i2c/cx25840/cx25840-core.h +++ b/drivers/media/i2c/cx25840/cx25840-core.h @@ -40,7 +40,6 @@ enum cx25840_model { enum cx25840_media_pads { CX25840_PAD_INPUT, CX25840_PAD_VID_OUT, - CX25840_PAD_VBI_OUT,
CX25840_NUM_PADS };
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gustavo A. R. Silva gustavo@embeddedor.com
commit ae7b8eda27b33b1f688dfdebe4d46f690a8f9162 upstream.
There is a potential NULL pointer dereference in case devm_kzalloc() fails and returns NULL.
Fix this by adding a NULL check on *lookup*
This bug was detected with the help of Coccinelle.
Fixes: b2e63555592f ("i2c: gpio: Convert to use descriptors") Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva gustavo@embeddedor.com Signed-off-by: Lee Jones lee.jones@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/mfd/sm501.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/mfd/sm501.c +++ b/drivers/mfd/sm501.c @@ -1145,6 +1145,9 @@ static int sm501_register_gpio_i2c_insta lookup = devm_kzalloc(&pdev->dev, sizeof(*lookup) + 3 * sizeof(struct gpiod_lookup), GFP_KERNEL); + if (!lookup) + return -ENOMEM; + lookup->dev_id = "i2c-gpio"; if (iic->pin_sda < 32) lookup->table[0].chip_label = "SM501-LOW";
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pavel Machek pavel@ucw.cz
commit fd10606f93a149a9f3d37574e5385b083b4a7b32 upstream.
The driver doesn't generate uevents on charger connect/disconnect. This leads to UPower not detecting when AC is on or off... and that is bad.
Reported by Arthur D. on github ( https://github.com/maemo-leste/bugtracker/issues/206 ), thanks to Merlijn Wajer for suggesting a fix.
Cc: stable@kernel.org Signed-off-by: Pavel Machek pavel@ucw.cz Acked-by: Tony Lindgren tony@atomide.com Signed-off-by: Sebastian Reichel sebastian.reichel@collabora.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/power/supply/cpcap-charger.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/power/supply/cpcap-charger.c +++ b/drivers/power/supply/cpcap-charger.c @@ -458,6 +458,7 @@ static void cpcap_usb_detect(struct work goto out_err; }
+ power_supply_changed(ddata->usb); return;
out_err:
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rafael J. Wysocki rafael.j.wysocki@intel.com
commit 22782b3f9bb8ae21c710e2880db21bc729771e92 upstream.
After commit 61cb5758d3c4 ("cpuidle: Add cpuidle.governor= command line parameter") new cpuidle governors are not added to the list of available governors, so governor selection via sysfs doesn't work as expected (even though it is rarely used anyway).
Fix that by making cpuidle_register_governor() add new governors to cpuidle_governors again.
Fixes: 61cb5758d3c4 ("cpuidle: Add cpuidle.governor= command line parameter") Reported-by: Kees Cook keescook@chromium.org Cc: 5.0+ stable@vger.kernel.org # 5.0+ Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/cpuidle/governor.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/cpuidle/governor.c +++ b/drivers/cpuidle/governor.c @@ -89,6 +89,7 @@ int cpuidle_register_governor(struct cpu mutex_lock(&cpuidle_lock); if (__cpuidle_find_governor(gov->name) == NULL) { ret = 0; + list_add_tail(&gov->governor_list, &cpuidle_governors); if (!cpuidle_curr_governor || !strncasecmp(param_governor, gov->name, CPUIDLE_NAME_LEN) || (cpuidle_curr_governor->rating < gov->rating &&
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust trond.myklebust@hammerspace.com
commit f57dcf4c72113c745d83f1c65f7291299f65c14f upstream.
When we fail to add the request to the I/O queue, we currently leave it to the caller to free the failed request. However since some of the requests that fail are actually created by nfs_pageio_add_request() itself, and are not passed back the caller, this leads to a leakage issue, which can again cause page locks to leak.
This commit addresses the leakage by freeing the created requests on error, using desc->pg_completion_ops->error_cleanup()
Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Fixes: a7d42ddb30997 ("nfs: add mirroring support to pgio layer") Cc: stable@vger.kernel.org # v4.0: c18b96a1b862: nfs: clean up rest of reqs Cc: stable@vger.kernel.org # v4.0: d600ad1f2bdb: NFS41: pop some layoutget Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/nfs/pagelist.c | 26 +++++++++++++++++++++----- 1 file changed, 21 insertions(+), 5 deletions(-)
--- a/fs/nfs/pagelist.c +++ b/fs/nfs/pagelist.c @@ -988,6 +988,17 @@ static void nfs_pageio_doio(struct nfs_p } }
+static void +nfs_pageio_cleanup_request(struct nfs_pageio_descriptor *desc, + struct nfs_page *req) +{ + LIST_HEAD(head); + + nfs_list_remove_request(req); + nfs_list_add_request(req, &head); + desc->pg_completion_ops->error_cleanup(&head); +} + /** * nfs_pageio_add_request - Attempt to coalesce a request into a page list. * @desc: destination io descriptor @@ -1025,10 +1036,8 @@ static int __nfs_pageio_add_request(stru nfs_page_group_unlock(req); desc->pg_moreio = 1; nfs_pageio_doio(desc); - if (desc->pg_error < 0) - return 0; - if (mirror->pg_recoalesce) - return 0; + if (desc->pg_error < 0 || mirror->pg_recoalesce) + goto out_cleanup_subreq; /* retry add_request for this subreq */ nfs_page_group_lock(req); continue; @@ -1061,6 +1070,10 @@ err_ptr: desc->pg_error = PTR_ERR(subreq); nfs_page_group_unlock(req); return 0; +out_cleanup_subreq: + if (req != subreq) + nfs_pageio_cleanup_request(desc, subreq); + return 0; }
static int nfs_do_recoalesce(struct nfs_pageio_descriptor *desc) @@ -1168,11 +1181,14 @@ int nfs_pageio_add_request(struct nfs_pa if (nfs_pgio_has_mirroring(desc)) desc->pg_mirror_idx = midx; if (!nfs_pageio_add_request_mirror(desc, dupreq)) - goto out_failed; + goto out_cleanup_subreq; }
return 1;
+out_cleanup_subreq: + if (req != dupreq) + nfs_pageio_cleanup_request(desc, dupreq); out_failed: nfs_pageio_error_cleanup(desc); return 0;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust trond.myklebust@hammerspace.com
commit 4d91969ed4dbcefd0e78f77494f0cb8fada9048a upstream.
Whether we need to exit early, or just reprocess the list, we must not lost track of the request which failed to get recoalesced.
Fixes: 03d5eb65b538 ("NFS: Fix a memory leak in nfs_do_recoalesce") Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/nfs/pagelist.c | 1 - 1 file changed, 1 deletion(-)
--- a/fs/nfs/pagelist.c +++ b/fs/nfs/pagelist.c @@ -1092,7 +1092,6 @@ static int nfs_do_recoalesce(struct nfs_ struct nfs_page *req;
req = list_first_entry(&head, struct nfs_page, wb_list); - nfs_list_remove_request(req); if (__nfs_pageio_add_request(desc, req)) continue; if (desc->pg_error < 0) {
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust trond.myklebust@hammerspace.com
commit 8127d82705998568b52ac724e28e00941538083d upstream.
If the I/O completion failed with a fatal error, then we should just exit nfs_pageio_complete_mirror() rather than try to recoalesce.
Fixes: a7d42ddb3099 ("nfs: add mirroring support to pgio layer") Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/nfs/pagelist.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/nfs/pagelist.c +++ b/fs/nfs/pagelist.c @@ -1209,7 +1209,7 @@ static void nfs_pageio_complete_mirror(s desc->pg_mirror_idx = mirror_idx; for (;;) { nfs_pageio_doio(desc); - if (!mirror->pg_recoalesce) + if (desc->pg_error < 0 || !mirror->pg_recoalesce) break; if (!nfs_do_recoalesce(desc)) break;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: J. Bruce Fields bfields@redhat.com
commit c54f24e338ed2a35218f117a4a1afb5f9e2b4e64 upstream.
We're unintentionally limiting the number of slots per nfsv4.1 session to 10. Often more than 10 simultaneous RPCs are needed for the best performance.
This calculation was meant to prevent any one client from using up more than a third of the limit we set for total memory use across all clients and sessions. Instead, it's limiting the client to a third of the maximum for a single session.
Fix this.
Reported-by: Chris Tracy ctracy@engr.scu.edu Cc: stable@vger.kernel.org Fixes: de766e570413 "nfsd: give out fewer session slots as limit approaches" Signed-off-by: J. Bruce Fields bfields@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/nfsd/nfs4state.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -1544,16 +1544,16 @@ static u32 nfsd4_get_drc_mem(struct nfsd { u32 slotsize = slot_bytes(ca); u32 num = ca->maxreqs; - int avail; + unsigned long avail, total_avail;
spin_lock(&nfsd_drc_lock); - avail = min((unsigned long)NFSD_MAX_MEM_PER_SESSION, - nfsd_drc_max_mem - nfsd_drc_mem_used); + total_avail = nfsd_drc_max_mem - nfsd_drc_mem_used; + avail = min((unsigned long)NFSD_MAX_MEM_PER_SESSION, total_avail); /* * Never use more than a third of the remaining memory, * unless it's the only way to give this client a slot: */ - avail = clamp_t(int, avail, slotsize, avail/3); + avail = clamp_t(int, avail, slotsize, total_avail/3); num = min_t(int, num, avail / slotsize); nfsd_drc_mem_used += num * slotsize; spin_unlock(&nfsd_drc_lock);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: NeilBrown neilb@suse.com
commit b602345da6cbb135ba68cf042df8ec9a73da7981 upstream.
If the result of an NFSv3 readdir{,plus} request results in the "offset" on one entry having to be split across 2 pages, and is sized so that the next directory entry doesn't fit in the requested size, then memory corruption can happen.
When encode_entry() is called after encoding the last entry that fits, it notices that ->offset and ->offset1 are set, and so stores the offset value in the two pages as required. It clears ->offset1 but *does not* clear ->offset.
Normally this omission doesn't matter as encode_entry_baggage() will be called, and will set ->offset to a suitable value (not on a page boundary). But in the case where cd->buflen < elen and nfserr_toosmall is returned, ->offset is not reset.
This means that nfsd3proc_readdirplus will see ->offset with a value 4 bytes before the end of a page, and ->offset1 set to NULL. It will try to write 8bytes to ->offset. If we are lucky, the next page will be read-only, and the system will BUG: unable to handle kernel paging request at...
If we are unlucky, some innocent page will have the first 4 bytes corrupted.
nfsd3proc_readdir() doesn't even check for ->offset1, it just blindly writes 8 bytes to the offset wherever it is.
Fix this by clearing ->offset after it is used, and copying the ->offset handling code from nfsd3_proc_readdirplus into nfsd3_proc_readdir.
(Note that the commit hash in the Fixes tag is from the 'history' tree - this bug predates git).
Fixes: 0b1d57cf7654 ("[PATCH] kNFSd: Fix nfs3 dentry encoding") Fixes-URL: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?... Cc: stable@vger.kernel.org (v2.6.12+) Signed-off-by: NeilBrown neilb@suse.com Signed-off-by: J. Bruce Fields bfields@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/nfsd/nfs3proc.c | 16 ++++++++++++++-- fs/nfsd/nfs3xdr.c | 1 + 2 files changed, 15 insertions(+), 2 deletions(-)
--- a/fs/nfsd/nfs3proc.c +++ b/fs/nfsd/nfs3proc.c @@ -463,8 +463,19 @@ nfsd3_proc_readdir(struct svc_rqst *rqst &resp->common, nfs3svc_encode_entry); memcpy(resp->verf, argp->verf, 8); resp->count = resp->buffer - argp->buffer; - if (resp->offset) - xdr_encode_hyper(resp->offset, argp->cookie); + if (resp->offset) { + loff_t offset = argp->cookie; + + if (unlikely(resp->offset1)) { + /* we ended up with offset on a page boundary */ + *resp->offset = htonl(offset >> 32); + *resp->offset1 = htonl(offset & 0xffffffff); + resp->offset1 = NULL; + } else { + xdr_encode_hyper(resp->offset, offset); + } + resp->offset = NULL; + }
RETURN_STATUS(nfserr); } @@ -533,6 +544,7 @@ nfsd3_proc_readdirplus(struct svc_rqst * } else { xdr_encode_hyper(resp->offset, offset); } + resp->offset = NULL; }
RETURN_STATUS(nfserr); --- a/fs/nfsd/nfs3xdr.c +++ b/fs/nfsd/nfs3xdr.c @@ -921,6 +921,7 @@ encode_entry(struct readdir_cd *ccd, con } else { xdr_encode_hyper(cd->offset, offset64); } + cd->offset = NULL; }
/*
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yihao Wu wuyihao@linux.alibaba.com
commit dd838821f0a29781b185cd8fb8e48d5c177bd838 upstream.
Commit 62a063b8e7d1 "nfsd4: fix crash on writing v4_end_grace before nfsd startup" is trying to fix a NULL dereference issue, but it mistakenly checks if the nfsd server is started. So fix it.
Fixes: 62a063b8e7d1 "nfsd4: fix crash on writing v4_end_grace before nfsd startup" Cc: stable@vger.kernel.org Reviewed-by: Joseph Qi joseph.qi@linux.alibaba.com Signed-off-by: Yihao Wu wuyihao@linux.alibaba.com Signed-off-by: J. Bruce Fields bfields@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/nfsd/nfsctl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/nfsd/nfsctl.c +++ b/fs/nfsd/nfsctl.c @@ -1126,7 +1126,7 @@ static ssize_t write_v4_end_grace(struct case 'Y': case 'y': case '1': - if (nn->nfsd_serv) + if (!nn->nfsd_serv) return -EBUSY; nfsd4_end_grace(nn); break;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust trond.myklebust@hammerspace.com
commit c1dffe0bf7f9c3d57d9f237a7cb2a81e62babd2b upstream.
If we have to retransmit a request, we should ensure that we reinitialise the sequence results structure, since in the event of a signal we need to treat the request as if it had not been sent.
Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/nfs/nfs4proc.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
--- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -947,6 +947,13 @@ nfs4_sequence_process_interrupted(struct
#endif /* !CONFIG_NFS_V4_1 */
+static void nfs41_sequence_res_init(struct nfs4_sequence_res *res) +{ + res->sr_timestamp = jiffies; + res->sr_status_flags = 0; + res->sr_status = 1; +} + static void nfs4_sequence_attach_slot(struct nfs4_sequence_args *args, struct nfs4_sequence_res *res, @@ -958,10 +965,6 @@ void nfs4_sequence_attach_slot(struct nf args->sa_slot = slot;
res->sr_slot = slot; - res->sr_timestamp = jiffies; - res->sr_status_flags = 0; - res->sr_status = 1; - }
int nfs4_setup_sequence(struct nfs_client *client, @@ -1007,6 +1010,7 @@ int nfs4_setup_sequence(struct nfs_clien
trace_nfs4_setup_sequence(session, args); out_start: + nfs41_sequence_res_init(res); rpc_call_start(task); return 0;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: J. Bruce Fields bfields@redhat.com
commit b7e5034cbecf5a65b7bfdc2b20a8378039577706 upstream.
James Pearson found that an NFS server stopped responding to UDP requests if started with more than 1017 threads.
sv_max_mesg is about 2^20, so that is probably where the calculation performed by
svc_sock_setbufsize(svsk->sk_sock, (serv->sv_nrthreads+3) * serv->sv_max_mesg, (serv->sv_nrthreads+3) * serv->sv_max_mesg);
starts to overflow an int.
Reported-by: James Pearson jcpearson@gmail.com Tested-by: James Pearson jcpearson@gmail.com Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields bfields@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/sunrpc/svcsock.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-)
--- a/net/sunrpc/svcsock.c +++ b/net/sunrpc/svcsock.c @@ -349,12 +349,16 @@ static ssize_t svc_recvfrom(struct svc_r /* * Set socket snd and rcv buffer lengths */ -static void svc_sock_setbufsize(struct socket *sock, unsigned int snd, - unsigned int rcv) +static void svc_sock_setbufsize(struct svc_sock *svsk, unsigned int nreqs) { + unsigned int max_mesg = svsk->sk_xprt.xpt_server->sv_max_mesg; + struct socket *sock = svsk->sk_sock; + + nreqs = min(nreqs, INT_MAX / 2 / max_mesg); + lock_sock(sock->sk); - sock->sk->sk_sndbuf = snd * 2; - sock->sk->sk_rcvbuf = rcv * 2; + sock->sk->sk_sndbuf = nreqs * max_mesg * 2; + sock->sk->sk_rcvbuf = nreqs * max_mesg * 2; sock->sk->sk_write_space(sock->sk); release_sock(sock->sk); } @@ -516,9 +520,7 @@ static int svc_udp_recvfrom(struct svc_r * provides an upper bound on the number of threads * which will access the socket. */ - svc_sock_setbufsize(svsk->sk_sock, - (serv->sv_nrthreads+3) * serv->sv_max_mesg, - (serv->sv_nrthreads+3) * serv->sv_max_mesg); + svc_sock_setbufsize(svsk, serv->sv_nrthreads + 3);
clear_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags); skb = NULL; @@ -681,9 +683,7 @@ static void svc_udp_init(struct svc_sock * receive and respond to one request. * svc_udp_recvfrom will re-adjust if necessary */ - svc_sock_setbufsize(svsk->sk_sock, - 3 * svsk->sk_xprt.xpt_server->sv_max_mesg, - 3 * svsk->sk_xprt.xpt_server->sv_max_mesg); + svc_sock_setbufsize(svsk, 3);
/* data might have come in before data_ready set up */ set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Viresh Kumar viresh.kumar@linaro.org
commit 1fad17fb1bbcd73159c2b992668a6957ecc5af8a upstream.
If wakeup_source_add() is called right after wakeup_source_remove() for the same wakeup source, timer_setup() may be called for a potentially scheduled timer which is incorrect.
To avoid that, move the wakeup source timer cancellation from wakeup_source_drop() to wakeup_source_remove().
Moreover, make wakeup_source_remove() clear the timer function after canceling the timer to let wakeup_source_not_registered() treat unregistered wakeup sources in the same way as the ones that have never been registered.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org Cc: 4.4+ stable@vger.kernel.org # 4.4+ [ rjw: Subject, changelog, merged two patches together ] Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/base/power/wakeup.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/drivers/base/power/wakeup.c +++ b/drivers/base/power/wakeup.c @@ -118,7 +118,6 @@ void wakeup_source_drop(struct wakeup_so if (!ws) return;
- del_timer_sync(&ws->timer); __pm_relax(ws); } EXPORT_SYMBOL_GPL(wakeup_source_drop); @@ -205,6 +204,13 @@ void wakeup_source_remove(struct wakeup_ list_del_rcu(&ws->entry); raw_spin_unlock_irqrestore(&events_lock, flags); synchronize_srcu(&wakeup_srcu); + + del_timer_sync(&ws->timer); + /* + * Clear timer.function to make wakeup_source_not_registered() treat + * this wakeup source as not registered. + */ + ws->timer.function = NULL; } EXPORT_SYMBOL_GPL(wakeup_source_remove);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Viresh Kumar viresh.kumar@linaro.org
commit faef080f6db5320011862f7baf1aa66d0851559f upstream.
At boot up, CPUFreq core performs a sanity check to see if the system is running at a frequency defined in the frequency table of the CPU. If so, we try to find a valid frequency (lowest frequency greater than the currently programmed frequency) from the table and set it. When the call reaches dev_pm_opp_set_rate(), it calls _find_freq_ceil(opp_table, &old_freq) to find the previously configured OPP and this call also updates the old_freq. This eventually sets the old_freq == freq (new target requested by cpufreq core) and we skip updating the performance state in this case.
Fix this by also updating the performance state when the old_freq == freq.
Fixes: ca1b5d77b1c6 ("OPP: Configure all required OPPs") Cc: v5.0 stable@vger.kernel.org # v5.0 Reported-by: Niklas Cassel niklas.cassel@linaro.org Tested-by: Jorge Ramirez-Ortiz jorge.ramirez-ortiz@linaro.org Signed-off-by: Viresh Kumar viresh.kumar@linaro.org Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/opp/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/opp/core.c +++ b/drivers/opp/core.c @@ -743,7 +743,7 @@ int dev_pm_opp_set_rate(struct device *d old_freq, freq);
/* Scaling up? Configure required OPPs before frequency */ - if (freq > old_freq) { + if (freq >= old_freq) { ret = _set_required_opps(dev, opp_table, opp); if (ret) goto put_opp;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Axtens dja@axtens.net
commit 9951379b0ca88c95876ad9778b9099e19a95d566 upstream.
Some users see panics like the following when performing fstrim on a bcached volume:
[ 529.803060] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 [ 530.183928] #PF error: [normal kernel read fault] [ 530.412392] PGD 8000001f42163067 P4D 8000001f42163067 PUD 1f42168067 PMD 0 [ 530.750887] Oops: 0000 [#1] SMP PTI [ 530.920869] CPU: 10 PID: 4167 Comm: fstrim Kdump: loaded Not tainted 5.0.0-rc1+ #3 [ 531.290204] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 12/27/2015 [ 531.693137] RIP: 0010:blk_queue_split+0x148/0x620 [ 531.922205] Code: 60 38 89 55 a0 45 31 db 45 31 f6 45 31 c9 31 ff 89 4d 98 85 db 0f 84 7f 04 00 00 44 8b 6d 98 4c 89 ee 48 c1 e6 04 49 03 70 78 <8b> 46 08 44 8b 56 0c 48 8b 16 44 29 e0 39 d8 48 89 55 a8 0f 47 c3 [ 532.838634] RSP: 0018:ffffb9b708df39b0 EFLAGS: 00010246 [ 533.093571] RAX: 00000000ffffffff RBX: 0000000000046000 RCX: 0000000000000000 [ 533.441865] RDX: 0000000000000200 RSI: 0000000000000000 RDI: 0000000000000000 [ 533.789922] RBP: ffffb9b708df3a48 R08: ffff940d3b3fdd20 R09: 0000000000000000 [ 534.137512] R10: ffffb9b708df3958 R11: 0000000000000000 R12: 0000000000000000 [ 534.485329] R13: 0000000000000000 R14: 0000000000000000 R15: ffff940d39212020 [ 534.833319] FS: 00007efec26e3840(0000) GS:ffff940d1f480000(0000) knlGS:0000000000000000 [ 535.224098] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 535.504318] CR2: 0000000000000008 CR3: 0000001f4e256004 CR4: 00000000001606e0 [ 535.851759] Call Trace: [ 535.970308] ? mempool_alloc_slab+0x15/0x20 [ 536.174152] ? bch_data_insert+0x42/0xd0 [bcache] [ 536.403399] blk_mq_make_request+0x97/0x4f0 [ 536.607036] generic_make_request+0x1e2/0x410 [ 536.819164] submit_bio+0x73/0x150 [ 536.980168] ? submit_bio+0x73/0x150 [ 537.149731] ? bio_associate_blkg_from_css+0x3b/0x60 [ 537.391595] ? _cond_resched+0x1a/0x50 [ 537.573774] submit_bio_wait+0x59/0x90 [ 537.756105] blkdev_issue_discard+0x80/0xd0 [ 537.959590] ext4_trim_fs+0x4a9/0x9e0 [ 538.137636] ? ext4_trim_fs+0x4a9/0x9e0 [ 538.324087] ext4_ioctl+0xea4/0x1530 [ 538.497712] ? _copy_to_user+0x2a/0x40 [ 538.679632] do_vfs_ioctl+0xa6/0x600 [ 538.853127] ? __do_sys_newfstat+0x44/0x70 [ 539.051951] ksys_ioctl+0x6d/0x80 [ 539.212785] __x64_sys_ioctl+0x1a/0x20 [ 539.394918] do_syscall_64+0x5a/0x110 [ 539.568674] entry_SYSCALL_64_after_hwframe+0x44/0xa9
We have observed it where both: 1) LVM/devmapper is involved (bcache backing device is LVM volume) and 2) writeback cache is involved (bcache cache_mode is writeback)
On one machine, we can reliably reproduce it with:
# echo writeback > /sys/block/bcache0/bcache/cache_mode (not sure whether above line is required) # mount /dev/bcache0 /test # for i in {0..10}; do file="$(mktemp /test/zero.XXX)" dd if=/dev/zero of="$file" bs=1M count=256 sync rm $file done # fstrim -v /test
Observing this with tracepoints on, we see the following writes:
fstrim-18019 [022] .... 91107.302026: bcache_write: 73f95583-561c-408f-a93a-4cbd2498f5c8 inode 0 DS 4260112 + 196352 hit 0 bypass 1 fstrim-18019 [022] .... 91107.302050: bcache_write: 73f95583-561c-408f-a93a-4cbd2498f5c8 inode 0 DS 4456464 + 262144 hit 0 bypass 1 fstrim-18019 [022] .... 91107.302075: bcache_write: 73f95583-561c-408f-a93a-4cbd2498f5c8 inode 0 DS 4718608 + 81920 hit 0 bypass 1 fstrim-18019 [022] .... 91107.302094: bcache_write: 73f95583-561c-408f-a93a-4cbd2498f5c8 inode 0 DS 5324816 + 180224 hit 0 bypass 1 fstrim-18019 [022] .... 91107.302121: bcache_write: 73f95583-561c-408f-a93a-4cbd2498f5c8 inode 0 DS 5505040 + 262144 hit 0 bypass 1 fstrim-18019 [022] .... 91107.302145: bcache_write: 73f95583-561c-408f-a93a-4cbd2498f5c8 inode 0 DS 5767184 + 81920 hit 0 bypass 1 fstrim-18019 [022] .... 91107.308777: bcache_write: 73f95583-561c-408f-a93a-4cbd2498f5c8 inode 0 DS 6373392 + 180224 hit 1 bypass 0 <crash>
Note the final one has different hit/bypass flags.
This is because in should_writeback(), we were hitting a case where the partial stripe condition was returning true and so should_writeback() was returning true early.
If that hadn't been the case, it would have hit the would_skip test, and as would_skip == s->iop.bypass == true, should_writeback() would have returned false.
Looking at the git history from 'commit 72c270612bd3 ("bcache: Write out full stripes")', it looks like the idea was to optimise for raid5/6:
* If a stripe is already dirty, force writes to that stripe to writeback mode - to help build up full stripes of dirty data
To fix this issue, make sure that should_writeback() on a discard op never returns true.
More details of debugging: https://www.spinics.net/lists/linux-bcache/msg06996.html
Previous reports: - https://bugzilla.kernel.org/show_bug.cgi?id=201051 - https://bugzilla.kernel.org/show_bug.cgi?id=196103 - https://www.spinics.net/lists/linux-bcache/msg06885.html
(Coly Li: minor modification to follow maximum 75 chars per line rule)
Cc: Kent Overstreet koverstreet@google.com Cc: stable@vger.kernel.org Fixes: 72c270612bd3 ("bcache: Write out full stripes") Signed-off-by: Daniel Axtens dja@axtens.net Signed-off-by: Coly Li colyli@suse.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/bcache/writeback.h | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/md/bcache/writeback.h +++ b/drivers/md/bcache/writeback.h @@ -71,6 +71,9 @@ static inline bool should_writeback(stru in_use > bch_cutoff_writeback_sync) return false;
+ if (bio_op(bio) == REQ_OP_DISCARD) + return false; + if (dc->partial_stripes_expensive && bcache_dev_stripe_dirty(dc, bio->bi_iter.bi_sector, bio_sectors(bio)))
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tang Junhui tang.junhui.linux@gmail.com
commit 58ac323084ebf44f8470eeb8b82660f9d0ee3689 upstream.
Stale && dirty keys can be produced in the follow way: After writeback in write_dirty_finish(), dirty keys k1 will replace by clean keys k2 ==>ret = bch_btree_insert(dc->disk.c, &keys, NULL, &w->key); ==>btree_insert_fn(struct btree_op *b_op, struct btree *b) ==>static int bch_btree_insert_node(struct btree *b, struct btree_op *op, struct keylist *insert_keys, atomic_t *journal_ref, Then two steps: A) update k1 to k2 in btree node memory; bch_btree_insert_keys(b, op, insert_keys, replace_key) B) Write the bset(contains k2) to cache disk by a 30s delay work bch_btree_leaf_dirty(b, journal_ref). But before the 30s delay work write the bset to cache device, these things happened: A) GC works, and reclaim the bucket k2 point to; B) Allocator works, and invalidate the bucket k2 point to, and increase the gen of the bucket, and place it into free_inc fifo; C) Until now, the 30s delay work still does not finish work, so in the disk, the key still is k1, it is dirty and stale (its gen is smaller than the gen of the bucket). and then the machine power off suddenly happens; D) When the machine power on again, after the btree reconstruction, the stale dirty key appear.
In bch_extent_bad(), when expensive_debug_checks is off, it would treat the dirty key as good even it is stale keys, and it would cause bellow probelms: A) In read_dirty() it would cause machine crash: BUG_ON(ptr_stale(dc->disk.c, &w->key, 0)); B) It could be worse when reads hits stale dirty keys, it would read old incorrect data.
This patch tolerate the existence of these stale && dirty keys, and treat them as bad key in bch_extent_bad().
(Coly Li: fix indent which was modified by sender's email client)
Signed-off-by: Tang Junhui tang.junhui.linux@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Coly Li colyli@suse.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/bcache/extents.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-)
--- a/drivers/md/bcache/extents.c +++ b/drivers/md/bcache/extents.c @@ -538,6 +538,7 @@ static bool bch_extent_bad(struct btree_ { struct btree *b = container_of(bk, struct btree, keys); unsigned int i, stale; + char buf[80];
if (!KEY_PTRS(k) || bch_extent_invalid(bk, k)) @@ -547,19 +548,19 @@ static bool bch_extent_bad(struct btree_ if (!ptr_available(b->c, k, i)) return true;
- if (!expensive_debug_checks(b->c) && KEY_DIRTY(k)) - return false; - for (i = 0; i < KEY_PTRS(k); i++) { stale = ptr_stale(b->c, k, i);
+ if (stale && KEY_DIRTY(k)) { + bch_extent_to_text(buf, sizeof(buf), k); + pr_info("stale dirty pointer, stale %u, key: %s", + stale, buf); + } + btree_bug_on(stale > BUCKET_GC_GEN_MAX, b, "key too stale: %i, need_gc %u", stale, b->c->need_gc);
- btree_bug_on(stale && KEY_DIRTY(k) && KEY_SIZE(k), - b, "stale dirty pointer"); - if (stale) return true;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Coly Li colyli@suse.de
commit dc7292a5bcb4c878b076fca2ac3fc22f81b8f8df upstream.
In 'commit 752f66a75aba ("bcache: use REQ_PRIO to indicate bio for metadata")' REQ_META is replaced by REQ_PRIO to indicate metadata bio. This assumption is not always correct, e.g. XFS uses REQ_META to mark metadata bio other than REQ_PRIO. This is why Nix noticed that bcache does not cache metadata for XFS after the above commit.
Thanks to Dave Chinner, he explains the difference between REQ_META and REQ_PRIO from view of file system developer. Here I quote part of his explanation from mailing list, REQ_META is used for metadata. REQ_PRIO is used to communicate to the lower layers that the submitter considers this IO to be more important that non REQ_PRIO IO and so dispatch should be expedited.
IOWs, if the filesystem considers metadata IO to be more important that user data IO, then it will use REQ_PRIO | REQ_META rather than just REQ_META.
Then it seems bios with REQ_META or REQ_PRIO should both be cached for performance optimation, because they are all probably low I/O latency demand by upper layer (e.g. file system).
So in this patch, when we want to decide whether to bypass the cache, REQ_META and REQ_PRIO are both checked. Then both metadata and high priority I/O requests will be handled properly.
Reported-by: Nix nix@esperi.org.uk Signed-off-by: Coly Li colyli@suse.de Reviewed-by: Andre Noll maan@tuebingen.mpg.de Tested-by: Nix nix@esperi.org.uk Cc: stable@vger.kernel.org Cc: Dave Chinner david@fromorbit.com Cc: Christoph Hellwig hch@lst.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/bcache/request.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/drivers/md/bcache/request.c +++ b/drivers/md/bcache/request.c @@ -392,10 +392,11 @@ static bool check_should_bypass(struct c
/* * Flag for bypass if the IO is for read-ahead or background, - * unless the read-ahead request is for metadata (eg, for gfs2). + * unless the read-ahead request is for metadata + * (eg, for gfs2 or xfs). */ if (bio->bi_opf & (REQ_RAHEAD|REQ_BACKGROUND) && - !(bio->bi_opf & REQ_PRIO)) + !(bio->bi_opf & (REQ_META|REQ_PRIO))) goto skip;
if (bio->bi_iter.bi_sector & (c->sb.block_size - 1) || @@ -877,7 +878,7 @@ static int cached_dev_cache_miss(struct }
if (!(bio->bi_opf & REQ_RAHEAD) && - !(bio->bi_opf & REQ_PRIO) && + !(bio->bi_opf & (REQ_META|REQ_PRIO)) && s->iop.c->gc_stats.in_use < CUTOFF_CACHE_READA) reada = min_t(sector_t, dc->readahead >> 9, get_capacity(bio->bi_disk) - bio_end_sector(bio));
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
commit a41e8f25fa8f8f67360d88eb0eebbabe95a64bdf upstream.
The networking maintainer keeps a public list of the patches being queued up for the next round of stable releases. Be sure to check there before asking for a patch to be applied so that you do not waste people's time.
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Acked-by: David S. Miller davem@davemloft.net Signed-off-by: Jonathan Corbet corbet@lwn.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Documentation/process/stable-kernel-rules.rst | 3 +++ 1 file changed, 3 insertions(+)
--- a/Documentation/process/stable-kernel-rules.rst +++ b/Documentation/process/stable-kernel-rules.rst @@ -38,6 +38,9 @@ Procedure for submitting patches to the - If the patch covers files in net/ or drivers/net please follow netdev stable submission guidelines as described in :ref:`Documentation/networking/netdev-FAQ.rst <netdev-FAQ>` + after first checking the stable networking queue at + https://patchwork.ozlabs.org/bundle/davem/stable/?series=&submitter=&... + to ensure the requested patch is not already queued up. - Security patches should not be handled (solely) by the -stable review process but should follow the procedures in :ref:`Documentation/admin-guide/security-bugs.rst <securitybugs>`.
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicolas Pitre nicolas.pitre@linaro.org
commit a6dbe442755999960ca54a9b8ecfd9606be0ea75 upstream.
Commit 4b4ecd9cb853 ("vt: Perform safe console erase only once") removed what appeared to be an extra call to scr_memsetw(). This missed the fact that set_origin() must be called before clearing the screen otherwise old screen content gets restored on the screen when using vgacon. Let's fix that by moving all the scrollback handling to flush_scrollback() where it logically belongs, and invoking it before the actual screen clearing in csi_J(), making the code simpler in the end.
Reported-by: Matthew Whitehead tedheadster@gmail.com Signed-off-by: Nicolas Pitre nico@linaro.org Tested-by: Matthew Whitehead tedheadster@gmail.com Fixes: 4b4ecd9cb853 ("vt: Perform safe console erase only once") Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/tty/vt/vt.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-)
--- a/drivers/tty/vt/vt.c +++ b/drivers/tty/vt/vt.c @@ -935,8 +935,11 @@ static void flush_scrollback(struct vc_d { WARN_CONSOLE_UNLOCKED();
+ set_origin(vc); if (vc->vc_sw->con_flush_scrollback) vc->vc_sw->con_flush_scrollback(vc); + else + vc->vc_sw->con_switch(vc); }
/* @@ -1503,8 +1506,10 @@ static void csi_J(struct vc_data *vc, in count = ((vc->vc_pos - vc->vc_origin) >> 1) + 1; start = (unsigned short *)vc->vc_origin; break; + case 3: /* include scrollback */ + flush_scrollback(vc); + /* fallthrough */ case 2: /* erase whole display */ - case 3: /* (and scrollback buffer later) */ vc_uniscr_clear_lines(vc, 0, vc->vc_rows); count = vc->vc_cols * vc->vc_rows; start = (unsigned short *)vc->vc_origin; @@ -1513,13 +1518,7 @@ static void csi_J(struct vc_data *vc, in return; } scr_memsetw(start, vc->vc_video_erase_char, 2 * count); - if (vpar == 3) { - set_origin(vc); - flush_scrollback(vc); - if (con_is_visible(vc)) - update_screen(vc); - } else if (con_should_update(vc)) - do_update_region(vc, (unsigned long) start, count); + update_region(vc, (unsigned long) start, count); vc->vc_need_wrap = 0; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Josh Poimboeuf jpoimboe@redhat.com
commit f76a16adc485699f95bb71fce114f97c832fe664 upstream.
The .orc_unwind section is a packed array of 6-byte structs. It's currently aligned to 6 bytes, which is causing warnings in the LLD linker.
Six isn't a power of two, so it's not a valid alignment value. The actual alignment doesn't matter much because it's an array of packed structs. An alignment of two is sufficient. In reality it always gets aligned to four bytes because it comes immediately after the 4-byte-aligned .orc_unwind_ip section.
Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder") Reported-by: Nick Desaulniers ndesaulniers@google.com Reported-by: Dmitry Golovin dima@golovin.in Reported-by: Sedat Dilek sedat.dilek@gmail.com Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Sedat Dilek sedat.dilek@gmail.com Cc: Peter Zijlstra peterz@infradead.org Cc: stable@vger.kernel.org Link: https://github.com/ClangBuiltLinux/linux/issues/218 Link: https://lkml.kernel.org/r/d55027ee95fe73e952dcd8be90aebd31b0095c45.155189204... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/asm-generic/vmlinux.lds.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -733,7 +733,7 @@ KEEP(*(.orc_unwind_ip)) \ __stop_orc_unwind_ip = .; \ } \ - . = ALIGN(6); \ + . = ALIGN(2); \ .orc_unwind : AT(ADDR(.orc_unwind) - LOAD_OFFSET) { \ __start_orc_unwind = .; \ KEEP(*(.orc_unwind)) \
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Adrian Hunter adrian.hunter@intel.com
commit 03997612904866abe7cdcc992784ef65cb3a4b81 upstream.
CYC packet timestamp calculation depends upon CBR which was being cleared upon overflow (OVF). That can cause errors due to failing to synchronize with sideband events. Even if a CBR change has been lost, the old CBR is still a better estimate than zero. So remove the clearing of CBR.
Signed-off-by: Adrian Hunter adrian.hunter@intel.com Cc: Jiri Olsa jolsa@redhat.com Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20190206103947.15750-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- tools/perf/util/intel-pt-decoder/intel-pt-decoder.c | 1 - 1 file changed, 1 deletion(-)
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c +++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c @@ -1394,7 +1394,6 @@ static int intel_pt_overflow(struct inte { intel_pt_log("ERROR: Buffer overflow\n"); intel_pt_clear_tx_flags(decoder); - decoder->cbr = 0; decoder->timestamp_insn_cnt = 0; decoder->pkt_state = INTEL_PT_STATE_ERR_RESYNC; decoder->overflow = true;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Adrian Hunter adrian.hunter@intel.com
commit d6d457451eb94fa747dc202765592eb8885a7352 upstream.
Kallsyms symbols do not have a size, so the size becomes the distance to the next symbol.
Consequently the recently added trampoline symbols end up with large sizes because the trampolines are some distance from one another and the main kernel map.
However, symbols that end outside their map can disrupt the symbol tree because, after mapping, it can appear incorrectly that they overlap other symbols.
Add logic to truncate symbol size to the end of the corresponding map.
Signed-off-by: Adrian Hunter adrian.hunter@intel.com Acked-by: Jiri Olsa jolsa@kernel.org Cc: stable@vger.kernel.org Fixes: d83212d5dd67 ("kallsyms, x86: Export addresses of PTI entry trampolines") Link: http://lkml.kernel.org/r/20190109091835.5570-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- tools/perf/util/symbol.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/tools/perf/util/symbol.c +++ b/tools/perf/util/symbol.c @@ -710,6 +710,8 @@ static int map_groups__split_kallsyms_fo }
pos->start -= curr_map->start - curr_map->pgoff; + if (pos->end > curr_map->end) + pos->end = curr_map->end; if (pos->end) pos->end -= curr_map->start - curr_map->pgoff; symbols__insert(&curr_map->dso->symbols, pos);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Adrian Hunter adrian.hunter@intel.com
commit c3fcadf0bb765faf45d6d562246e1d08885466df upstream.
Define auxtrace record alignment so that it can be referenced elsewhere.
Note this is preparation for patch "perf intel-pt: Fix overlap calculation for padding"
Signed-off-by: Adrian Hunter adrian.hunter@intel.com Cc: Jiri Olsa jolsa@redhat.com Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20190206103947.15750-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- tools/perf/util/auxtrace.c | 4 ++-- tools/perf/util/auxtrace.h | 3 +++ 2 files changed, 5 insertions(+), 2 deletions(-)
--- a/tools/perf/util/auxtrace.c +++ b/tools/perf/util/auxtrace.c @@ -1278,9 +1278,9 @@ static int __auxtrace_mmap__read(struct }
/* padding must be written by fn() e.g. record__process_auxtrace() */ - padding = size & 7; + padding = size & (PERF_AUXTRACE_RECORD_ALIGNMENT - 1); if (padding) - padding = 8 - padding; + padding = PERF_AUXTRACE_RECORD_ALIGNMENT - padding;
memset(&ev, 0, sizeof(ev)); ev.auxtrace.header.type = PERF_RECORD_AUXTRACE; --- a/tools/perf/util/auxtrace.h +++ b/tools/perf/util/auxtrace.h @@ -40,6 +40,9 @@ struct record_opts; struct auxtrace_info_event; struct events_stats;
+/* Auxtrace records must have the same alignment as perf event records */ +#define PERF_AUXTRACE_RECORD_ALIGNMENT 8 + enum auxtrace_type { PERF_AUXTRACE_UNKNOWN, PERF_AUXTRACE_INTEL_PT,
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Adrian Hunter adrian.hunter@intel.com
commit 5a99d99e3310a565b0cf63f785b347be9ee0da45 upstream.
Auxtrace records might have up to 7 bytes of padding appended. Adjust the overlap accordingly.
Signed-off-by: Adrian Hunter adrian.hunter@intel.com Cc: Jiri Olsa jolsa@redhat.com Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20190206103947.15750-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- tools/perf/util/intel-pt-decoder/intel-pt-decoder.c | 36 ++++++++++++++++++-- 1 file changed, 34 insertions(+), 2 deletions(-)
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c +++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c @@ -26,6 +26,7 @@
#include "../cache.h" #include "../util.h" +#include "../auxtrace.h"
#include "intel-pt-insn-decoder.h" #include "intel-pt-pkt-decoder.h" @@ -2574,6 +2575,34 @@ static int intel_pt_tsc_cmp(uint64_t tsc } }
+#define MAX_PADDING (PERF_AUXTRACE_RECORD_ALIGNMENT - 1) + +/** + * adj_for_padding - adjust overlap to account for padding. + * @buf_b: second buffer + * @buf_a: first buffer + * @len_a: size of first buffer + * + * @buf_a might have up to 7 bytes of padding appended. Adjust the overlap + * accordingly. + * + * Return: A pointer into @buf_b from where non-overlapped data starts + */ +static unsigned char *adj_for_padding(unsigned char *buf_b, + unsigned char *buf_a, size_t len_a) +{ + unsigned char *p = buf_b - MAX_PADDING; + unsigned char *q = buf_a + len_a - MAX_PADDING; + int i; + + for (i = MAX_PADDING; i; i--, p++, q++) { + if (*p != *q) + break; + } + + return p; +} + /** * intel_pt_find_overlap_tsc - determine start of non-overlapped trace data * using TSC. @@ -2624,8 +2653,11 @@ static unsigned char *intel_pt_find_over
/* Same TSC, so buffers are consecutive */ if (!cmp && rem_b >= rem_a) { + unsigned char *start; + *consecutive = true; - return buf_b + len_b - (rem_b - rem_a); + start = buf_b + len_b - (rem_b - rem_a); + return adj_for_padding(start, buf_a, len_a); } if (cmp < 0) return buf_b; /* tsc_a < tsc_b => no overlap */ @@ -2688,7 +2720,7 @@ unsigned char *intel_pt_find_overlap(uns found = memmem(buf_a, len_a, buf_b, len_a); if (found) { *consecutive = true; - return buf_b + len_a; + return adj_for_padding(buf_b + len_a, buf_a, len_a); }
/* Try again at next PSB in buffer 'a' */
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kan Liang kan.liang@linux.intel.com
commit 8041ffd36f42d8521d66dd1e236feb58cecd68bc upstream.
The client IMC bandwidth events currently return very large values:
$ perf stat -e uncore_imc/data_reads/ -e uncore_imc/data_writes/ -I 10000 -a
10.000117222 34,788.76 MiB uncore_imc/data_reads/ 10.000117222 8.26 MiB uncore_imc/data_writes/ 20.000374584 34,842.89 MiB uncore_imc/data_reads/ 20.000374584 10.45 MiB uncore_imc/data_writes/ 30.000633299 37,965.29 MiB uncore_imc/data_reads/ 30.000633299 323.62 MiB uncore_imc/data_writes/ 40.000891548 41,012.88 MiB uncore_imc/data_reads/ 40.000891548 6.98 MiB uncore_imc/data_writes/ 50.001142480 1,125,899,906,621,494.75 MiB uncore_imc/data_reads/ 50.001142480 6.97 MiB uncore_imc/data_writes/
The client IMC events are freerunning counters. They still use the old event encoding format (0x1 for data_read and 0x2 for data write). The counter bit width is calculated by common code, which assume that the standard encoding format is used for the freerunning counters. Error bit width information is calculated.
The patch intends to convert the old client IMC event encoding to the standard encoding format.
Current common code uses event->attr.config which directly copy from user space. We should not implicitly modify it for a converted event. The event->hw.config is used to replace the event->attr.config in common code.
For client IMC events, the event->attr.config is used to calculate a converted event with standard encoding format in the custom event_init(). The converted event is stored in event->hw.config. For other events of freerunning counters, they already use the standard encoding format. The same value as event->attr.config is assigned to event->hw.config in common event_init().
Reported-by: Jin Yao yao.jin@linux.intel.com Tested-by: Jin Yao yao.jin@linux.intel.com Signed-off-by: Kan Liang kan.liang@linux.intel.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Andy Lutomirski luto@kernel.org Cc: Arnaldo Carvalho de Melo acme@redhat.com Cc: Borislav Petkov bp@alien8.de Cc: Dave Hansen dave.hansen@linux.intel.com Cc: H. Peter Anvin hpa@zytor.com Cc: Jiri Olsa jolsa@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Rik van Riel riel@surriel.com Cc: Stephane Eranian eranian@google.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Vince Weaver vincent.weaver@maine.edu Cc: stable@kernel.org # v4.18+ Fixes: 9aae1780e7e8 ("perf/x86/intel/uncore: Clean up client IMC uncore") Link: https://lkml.kernel.org/r/20190227165729.1861-1-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/events/intel/uncore.c | 1 + arch/x86/events/intel/uncore.h | 12 ++++++------ arch/x86/events/intel/uncore_snb.c | 4 +++- 3 files changed, 10 insertions(+), 7 deletions(-)
--- a/arch/x86/events/intel/uncore.c +++ b/arch/x86/events/intel/uncore.c @@ -740,6 +740,7 @@ static int uncore_pmu_event_init(struct /* fixed counters have event field hardcoded to zero */ hwc->config = 0ULL; } else if (is_freerunning_event(event)) { + hwc->config = event->attr.config; if (!check_valid_freerunning_event(box, event)) return -EINVAL; event->hw.idx = UNCORE_PMC_IDX_FREERUNNING; --- a/arch/x86/events/intel/uncore.h +++ b/arch/x86/events/intel/uncore.h @@ -292,8 +292,8 @@ static inline unsigned int uncore_freerunning_counter(struct intel_uncore_box *box, struct perf_event *event) { - unsigned int type = uncore_freerunning_type(event->attr.config); - unsigned int idx = uncore_freerunning_idx(event->attr.config); + unsigned int type = uncore_freerunning_type(event->hw.config); + unsigned int idx = uncore_freerunning_idx(event->hw.config); struct intel_uncore_pmu *pmu = box->pmu;
return pmu->type->freerunning[type].counter_base + @@ -377,7 +377,7 @@ static inline unsigned int uncore_freerunning_bits(struct intel_uncore_box *box, struct perf_event *event) { - unsigned int type = uncore_freerunning_type(event->attr.config); + unsigned int type = uncore_freerunning_type(event->hw.config);
return box->pmu->type->freerunning[type].bits; } @@ -385,7 +385,7 @@ unsigned int uncore_freerunning_bits(str static inline int uncore_num_freerunning(struct intel_uncore_box *box, struct perf_event *event) { - unsigned int type = uncore_freerunning_type(event->attr.config); + unsigned int type = uncore_freerunning_type(event->hw.config);
return box->pmu->type->freerunning[type].num_counters; } @@ -399,8 +399,8 @@ static inline int uncore_num_freerunning static inline bool check_valid_freerunning_event(struct intel_uncore_box *box, struct perf_event *event) { - unsigned int type = uncore_freerunning_type(event->attr.config); - unsigned int idx = uncore_freerunning_idx(event->attr.config); + unsigned int type = uncore_freerunning_type(event->hw.config); + unsigned int idx = uncore_freerunning_idx(event->hw.config);
return (type < uncore_num_freerunning_types(box, event)) && (idx < uncore_num_freerunning(box, event)); --- a/arch/x86/events/intel/uncore_snb.c +++ b/arch/x86/events/intel/uncore_snb.c @@ -448,9 +448,11 @@ static int snb_uncore_imc_event_init(str
/* must be done before validate_group */ event->hw.event_base = base; - event->hw.config = cfg; event->hw.idx = idx;
+ /* Convert to standard encoding format for freerunning counters */ + event->hw.config = ((cfg - 1) << 8) | 0x10ff; + /* no group validation needed, we have free running counters */
return 0;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Adrian Hunter adrian.hunter@intel.com
commit 076333870c2f5bdd9b6d31e7ca1909cf0c84cbfa upstream.
When TSC is not available, "timeless" decoding is used but a divide by zero occurs if perf_time_to_tsc() is called.
Ensure the divisor is not zero.
Signed-off-by: Adrian Hunter adrian.hunter@intel.com Cc: Jiri Olsa jolsa@redhat.com Cc: stable@vger.kernel.org # v4.9+ Link: https://lkml.kernel.org/n/tip-1i4j0wqoc8vlbkcizqqxpsf4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- tools/perf/util/intel-pt.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/tools/perf/util/intel-pt.c +++ b/tools/perf/util/intel-pt.c @@ -2522,6 +2522,8 @@ int intel_pt_process_auxtrace_info(union }
pt->timeless_decoding = intel_pt_timeless_decoding(pt); + if (pt->timeless_decoding && !pt->tc.time_mult) + pt->tc.time_mult = 1; pt->have_tsc = intel_pt_have_tsc(pt); pt->sampling_mode = false; pt->est_tsc = !pt->timeless_decoding;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aditya Pakki pakki001@umn.edu
commit e406f12dde1a8375d77ea02d91f313fb1a9c6aec upstream.
mddev->sync_thread can be set to NULL on kzalloc failure downstream. The patch checks for such a scenario and frees allocated resources.
Committer node:
Added similar fix to raid5.c, as suggested by Guoqing.
Cc: stable@vger.kernel.org # v3.16+ Acked-by: Guoqing Jiang gqjiang@suse.com Signed-off-by: Aditya Pakki pakki001@umn.edu Signed-off-by: Song Liu songliubraving@fb.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/raid10.c | 2 ++ drivers/md/raid5.c | 2 ++ 2 files changed, 4 insertions(+)
--- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -3939,6 +3939,8 @@ static int raid10_run(struct mddev *mdde set_bit(MD_RECOVERY_RUNNING, &mddev->recovery); mddev->sync_thread = md_register_thread(md_do_sync, mddev, "reshape"); + if (!mddev->sync_thread) + goto out_free_conf; }
return 0; --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -7402,6 +7402,8 @@ static int raid5_run(struct mddev *mddev set_bit(MD_RECOVERY_RUNNING, &mddev->recovery); mddev->sync_thread = md_register_thread(md_do_sync, mddev, "reshape"); + if (!mddev->sync_thread) + goto abort; }
/* Ok, everything is just fine now */
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pavel Tatashin pasha.tatashin@soleen.com
commit b5179ec4187251a751832193693d6e474d3445ac upstream.
VMs may show incorrect uptime and dmesg printk offsets on hypervisors with unstable clock. The problem is produced when VM is rebooted without exiting from qemu.
The fix is to calculate clock offset not only for stable clock but for unstable clock as well, and use kvm_sched_clock_read() which substracts the offset for both clocks.
This is safe, because pvclock_clocksource_read() does the right thing and makes sure that clock always goes forward, so once offset is calculated with unstable clock, we won't get new reads that are smaller than offset, and thus won't get negative results.
Thank you Jon DeVree for helping to reproduce this issue.
Fixes: 857baa87b642 ("sched/clock: Enable sched clock early") Cc: stable@vger.kernel.org Reported-by: Dominique Martinet asmadeus@codewreck.org Signed-off-by: Pavel Tatashin pasha.tatashin@soleen.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/kvmclock.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-)
--- a/arch/x86/kernel/kvmclock.c +++ b/arch/x86/kernel/kvmclock.c @@ -104,12 +104,8 @@ static u64 kvm_sched_clock_read(void)
static inline void kvm_sched_clock_init(bool stable) { - if (!stable) { - pv_ops.time.sched_clock = kvm_clock_read; + if (!stable) clear_sched_clock_stable(); - return; - } - kvm_sched_clock_offset = kvm_clock_read(); pv_ops.time.sched_clock = kvm_sched_clock_read;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt (VMware) rostedt@goodmis.org
commit 745cfeaac09ce359130a5451d90cb0bd4094c290 upstream.
Arnd reported the following compiler warning:
arch/x86/kernel/ftrace.c:669:23: error: 'ftrace_jmp_replace' defined but not used [-Werror=unused-function]
The ftrace_jmp_replace() function now only has a single user and should be simply moved by that user. But looking at the code, it shows that ftrace_jmp_replace() is similar to ftrace_call_replace() except that instead of using the opcode of 0xe8 it uses 0xe9. It makes more sense to consolidate that function into one implementation that both ftrace_jmp_replace() and ftrace_call_replace() use by passing in the op code separate.
The structure in ftrace_code_union is also modified to replace the "e8" field with the more appropriate name "op".
Cc: stable@vger.kernel.org Reported-by: Arnd Bergmann arnd@arndb.de Acked-by: Arnd Bergmann arnd@arndb.de Link: http://lkml.kernel.org/r/20190304200748.1418790-1-arnd@arndb.de Fixes: d2a68c4effd8 ("x86/ftrace: Do not call function graph from dynamic trampolines") Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/ftrace.c | 42 +++++++++++++++++------------------------- 1 file changed, 17 insertions(+), 25 deletions(-)
--- a/arch/x86/kernel/ftrace.c +++ b/arch/x86/kernel/ftrace.c @@ -49,7 +49,7 @@ int ftrace_arch_code_modify_post_process union ftrace_code_union { char code[MCOUNT_INSN_SIZE]; struct { - unsigned char e8; + unsigned char op; int offset; } __attribute__((packed)); }; @@ -59,20 +59,23 @@ static int ftrace_calc_offset(long ip, l return (int)(addr - ip); }
-static unsigned char *ftrace_call_replace(unsigned long ip, unsigned long addr) +static unsigned char * +ftrace_text_replace(unsigned char op, unsigned long ip, unsigned long addr) { static union ftrace_code_union calc;
- calc.e8 = 0xe8; + calc.op = op; calc.offset = ftrace_calc_offset(ip + MCOUNT_INSN_SIZE, addr);
- /* - * No locking needed, this must be called via kstop_machine - * which in essence is like running on a uniprocessor machine. - */ return calc.code; }
+static unsigned char * +ftrace_call_replace(unsigned long ip, unsigned long addr) +{ + return ftrace_text_replace(0xe8, ip, addr); +} + static inline int within(unsigned long addr, unsigned long start, unsigned long end) { @@ -664,22 +667,6 @@ int __init ftrace_dyn_arch_init(void) return 0; }
-#if defined(CONFIG_X86_64) || defined(CONFIG_FUNCTION_GRAPH_TRACER) -static unsigned char *ftrace_jmp_replace(unsigned long ip, unsigned long addr) -{ - static union ftrace_code_union calc; - - /* Jmp not a call (ignore the .e8) */ - calc.e8 = 0xe9; - calc.offset = ftrace_calc_offset(ip + MCOUNT_INSN_SIZE, addr); - - /* - * ftrace external locks synchronize the access to the static variable. - */ - return calc.code; -} -#endif - /* Currently only x86_64 supports dynamic trampolines */ #ifdef CONFIG_X86_64
@@ -891,8 +878,8 @@ static void *addr_from_call(void *ptr) return NULL;
/* Make sure this is a call */ - if (WARN_ON_ONCE(calc.e8 != 0xe8)) { - pr_warn("Expected e8, got %x\n", calc.e8); + if (WARN_ON_ONCE(calc.op != 0xe8)) { + pr_warn("Expected e8, got %x\n", calc.op); return NULL; }
@@ -963,6 +950,11 @@ void arch_ftrace_trampoline_free(struct #ifdef CONFIG_DYNAMIC_FTRACE extern void ftrace_graph_call(void);
+static unsigned char *ftrace_jmp_replace(unsigned long ip, unsigned long addr) +{ + return ftrace_text_replace(0xe9, ip, addr); +} + static int ftrace_mod_jmp(unsigned long ip, void *func) { unsigned char *new;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jarkko Sakkinen jarkko.sakkinen@linux.intel.com
commit 3d7a850fdc1a2e4d2adbc95cc0fc962974725e88 upstream.
The current approach to read first 6 bytes from the response and then tail of the response, can cause the 2nd memcpy_fromio() to do an unaligned read (e.g. read 32-bit word from address aligned to a 16-bits), depending on how memcpy_fromio() is implemented. If this happens, the read will fail and the memory controller will fill the read with 1's.
This was triggered by 170d13ca3a2f, which should be probably refined to check and react to the address alignment. Before that commit, on x86 memcpy_fromio() turned out to be memcpy(). By a luck GCC has done the right thing (from tpm_crb's perspective) for us so far, but we should not rely on that. Thus, it makes sense to fix this also in tpm_crb, not least because the fix can be then backported to stable kernels and make them more robust when compiled in differing environments.
Cc: stable@vger.kernel.org Cc: James Morris jmorris@namei.org Cc: Tomas Winkler tomas.winkler@intel.com Cc: Jerry Snitselaar jsnitsel@redhat.com Fixes: 30fc8d138e91 ("tpm: TPM 2.0 CRB Interface") Signed-off-by: Jarkko Sakkinen jarkko.sakkinen@linux.intel.com Reviewed-by: Jerry Snitselaar jsnitsel@redhat.com Acked-by: Tomas Winkler tomas.winkler@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/char/tpm/tpm_crb.c | 22 ++++++++++++++++------ 1 file changed, 16 insertions(+), 6 deletions(-)
--- a/drivers/char/tpm/tpm_crb.c +++ b/drivers/char/tpm/tpm_crb.c @@ -287,19 +287,29 @@ static int crb_recv(struct tpm_chip *chi struct crb_priv *priv = dev_get_drvdata(&chip->dev); unsigned int expected;
- /* sanity check */ - if (count < 6) + /* A sanity check that the upper layer wants to get at least the header + * as that is the minimum size for any TPM response. + */ + if (count < TPM_HEADER_SIZE) return -EIO;
+ /* If this bit is set, according to the spec, the TPM is in + * unrecoverable condition. + */ if (ioread32(&priv->regs_t->ctrl_sts) & CRB_CTRL_STS_ERROR) return -EIO;
- memcpy_fromio(buf, priv->rsp, 6); - expected = be32_to_cpup((__be32 *) &buf[2]); - if (expected > count || expected < 6) + /* Read the first 8 bytes in order to get the length of the response. + * We read exactly a quad word in order to make sure that the remaining + * reads will be aligned. + */ + memcpy_fromio(buf, priv->rsp, 8); + + expected = be32_to_cpup((__be32 *)&buf[2]); + if (expected > count || expected < TPM_HEADER_SIZE) return -EIO;
- memcpy_fromio(&buf[6], &priv->rsp[6], expected - 6); + memcpy_fromio(&buf[8], &priv->rsp[8], expected - 8);
return expected; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jarkko Sakkinen jarkko.sakkinen@linux.intel.com
commit f5595f5baa30e009bf54d0d7653a9a0cc465be60 upstream.
The send() callback should never return length as it does not in every driver except tpm_crb in the success case. The reason is that the main transmit functionality only cares about whether the transmit was successful or not and ignores the count completely.
Suggested-by: Stefan Berger stefanb@linux.ibm.com Cc: stable@vger.kernel.org Signed-off-by: Jarkko Sakkinen jarkko.sakkinen@linux.intel.com Reviewed-by: Stefan Berger stefanb@linux.ibm.com Reviewed-by: Jerry Snitselaar jsnitsel@redhat.com Tested-by: Alexander Steffen Alexander.Steffen@infineon.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/char/tpm/st33zp24/st33zp24.c | 2 +- drivers/char/tpm/tpm-interface.c | 11 ++++++++++- drivers/char/tpm/tpm_atmel.c | 2 +- drivers/char/tpm/tpm_i2c_atmel.c | 6 +++++- drivers/char/tpm/tpm_i2c_infineon.c | 2 +- drivers/char/tpm/tpm_i2c_nuvoton.c | 2 +- drivers/char/tpm/tpm_ibmvtpm.c | 8 ++++---- drivers/char/tpm/tpm_infineon.c | 2 +- drivers/char/tpm/tpm_nsc.c | 2 +- drivers/char/tpm/tpm_tis_core.c | 2 +- drivers/char/tpm/tpm_vtpm_proxy.c | 3 +-- drivers/char/tpm/xen-tpmfront.c | 2 +- 12 files changed, 28 insertions(+), 16 deletions(-)
--- a/drivers/char/tpm/st33zp24/st33zp24.c +++ b/drivers/char/tpm/st33zp24/st33zp24.c @@ -436,7 +436,7 @@ static int st33zp24_send(struct tpm_chip goto out_err; }
- return len; + return 0; out_err: st33zp24_cancel(chip); release_locality(chip); --- a/drivers/char/tpm/tpm-interface.c +++ b/drivers/char/tpm/tpm-interface.c @@ -230,10 +230,19 @@ static ssize_t tpm_try_transmit(struct t if (rc < 0) { if (rc != -EPIPE) dev_err(&chip->dev, - "%s: tpm_send: error %d\n", __func__, rc); + "%s: send(): error %d\n", __func__, rc); goto out; }
+ /* A sanity check. send() should just return zero on success e.g. + * not the command length. + */ + if (rc > 0) { + dev_warn(&chip->dev, + "%s: send(): invalid value %d\n", __func__, rc); + rc = 0; + } + if (chip->flags & TPM_CHIP_FLAG_IRQ) goto out_recv;
--- a/drivers/char/tpm/tpm_atmel.c +++ b/drivers/char/tpm/tpm_atmel.c @@ -105,7 +105,7 @@ static int tpm_atml_send(struct tpm_chip iowrite8(buf[i], priv->iobase); }
- return count; + return 0; }
static void tpm_atml_cancel(struct tpm_chip *chip) --- a/drivers/char/tpm/tpm_i2c_atmel.c +++ b/drivers/char/tpm/tpm_i2c_atmel.c @@ -65,7 +65,11 @@ static int i2c_atmel_send(struct tpm_chi dev_dbg(&chip->dev, "%s(buf=%*ph len=%0zx) -> sts=%d\n", __func__, (int)min_t(size_t, 64, len), buf, len, status); - return status; + + if (status < 0) + return status; + + return 0; }
static int i2c_atmel_recv(struct tpm_chip *chip, u8 *buf, size_t count) --- a/drivers/char/tpm/tpm_i2c_infineon.c +++ b/drivers/char/tpm/tpm_i2c_infineon.c @@ -587,7 +587,7 @@ static int tpm_tis_i2c_send(struct tpm_c /* go and do it */ iic_tpm_write(TPM_STS(tpm_dev.locality), &sts, 1);
- return len; + return 0; out_err: tpm_tis_i2c_ready(chip); /* The TPM needs some time to clean up here, --- a/drivers/char/tpm/tpm_i2c_nuvoton.c +++ b/drivers/char/tpm/tpm_i2c_nuvoton.c @@ -467,7 +467,7 @@ static int i2c_nuvoton_send(struct tpm_c }
dev_dbg(dev, "%s() -> %zd\n", __func__, len); - return len; + return 0; }
static bool i2c_nuvoton_req_canceled(struct tpm_chip *chip, u8 status) --- a/drivers/char/tpm/tpm_ibmvtpm.c +++ b/drivers/char/tpm/tpm_ibmvtpm.c @@ -139,14 +139,14 @@ static int tpm_ibmvtpm_recv(struct tpm_c }
/** - * tpm_ibmvtpm_send - Send tpm request - * + * tpm_ibmvtpm_send() - Send a TPM command * @chip: tpm chip struct * @buf: buffer contains data to send * @count: size of buffer * * Return: - * Number of bytes sent or < 0 on error. + * 0 on success, + * -errno on error */ static int tpm_ibmvtpm_send(struct tpm_chip *chip, u8 *buf, size_t count) { @@ -192,7 +192,7 @@ static int tpm_ibmvtpm_send(struct tpm_c rc = 0; ibmvtpm->tpm_processing_cmd = false; } else - rc = count; + rc = 0;
spin_unlock(&ibmvtpm->rtce_lock); return rc; --- a/drivers/char/tpm/tpm_infineon.c +++ b/drivers/char/tpm/tpm_infineon.c @@ -354,7 +354,7 @@ static int tpm_inf_send(struct tpm_chip for (i = 0; i < count; i++) { wait_and_send(chip, buf[i]); } - return count; + return 0; }
static void tpm_inf_cancel(struct tpm_chip *chip) --- a/drivers/char/tpm/tpm_nsc.c +++ b/drivers/char/tpm/tpm_nsc.c @@ -226,7 +226,7 @@ static int tpm_nsc_send(struct tpm_chip } outb(NSC_COMMAND_EOC, priv->base + NSC_COMMAND);
- return count; + return 0; }
static void tpm_nsc_cancel(struct tpm_chip *chip) --- a/drivers/char/tpm/tpm_tis_core.c +++ b/drivers/char/tpm/tpm_tis_core.c @@ -481,7 +481,7 @@ static int tpm_tis_send_main(struct tpm_ goto out_err; } } - return len; + return 0; out_err: tpm_tis_ready(chip); return rc; --- a/drivers/char/tpm/tpm_vtpm_proxy.c +++ b/drivers/char/tpm/tpm_vtpm_proxy.c @@ -335,7 +335,6 @@ static int vtpm_proxy_is_driver_command( static int vtpm_proxy_tpm_op_send(struct tpm_chip *chip, u8 *buf, size_t count) { struct proxy_dev *proxy_dev = dev_get_drvdata(&chip->dev); - int rc = 0;
if (count > sizeof(proxy_dev->buffer)) { dev_err(&chip->dev, @@ -366,7 +365,7 @@ static int vtpm_proxy_tpm_op_send(struct
wake_up_interruptible(&proxy_dev->wq);
- return rc; + return 0; }
static void vtpm_proxy_tpm_op_cancel(struct tpm_chip *chip) --- a/drivers/char/tpm/xen-tpmfront.c +++ b/drivers/char/tpm/xen-tpmfront.c @@ -173,7 +173,7 @@ static int vtpm_send(struct tpm_chip *ch return -ETIME; }
- return count; + return 0; }
static int vtpm_recv(struct tpm_chip *chip, u8 *buf, size_t count)
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang, Jun jun.zhang@intel.com
commit 1d1f898df6586c5ea9aeaf349f13089c6fa37903 upstream.
The rcu_gp_kthread_wake() function is invoked when it might be necessary to wake the RCU grace-period kthread. Because self-wakeups are normally a useless waste of CPU cycles, if rcu_gp_kthread_wake() is invoked from this kthread, it naturally refuses to do the wakeup.
Unfortunately, natural though it might be, this heuristic fails when rcu_gp_kthread_wake() is invoked from an interrupt or softirq handler that interrupted the grace-period kthread just after the final check of the wait-event condition but just before the schedule() call. In this case, a wakeup is required, even though the call to rcu_gp_kthread_wake() is within the RCU grace-period kthread's context. Failing to provide this wakeup can result in grace periods failing to start, which in turn results in out-of-memory conditions.
This race window is quite narrow, but it actually did happen during real testing. It would of course need to be fixed even if it was strictly theoretical in nature.
This patch does not Cc stable because it does not apply cleanly to earlier kernel versions.
Fixes: 48a7639ce80c ("rcu: Make callers awaken grace-period kthread") Reported-by: "He, Bo" bo.he@intel.com Co-developed-by: "Zhang, Jun" jun.zhang@intel.com Co-developed-by: "He, Bo" bo.he@intel.com Co-developed-by: "xiao, jin" jin.xiao@intel.com Co-developed-by: Bai, Jie A jie.a.bai@intel.com Signed-off: "Zhang, Jun" jun.zhang@intel.com Signed-off: "He, Bo" bo.he@intel.com Signed-off: "xiao, jin" jin.xiao@intel.com Signed-off: Bai, Jie A jie.a.bai@intel.com Signed-off-by: "Zhang, Jun" jun.zhang@intel.com [ paulmck: Switch from !in_softirq() to "!in_interrupt() && !in_serving_softirq() to avoid redundant wakeups and to also handle the interrupt-handler scenario as well as the softirq-handler scenario that actually occurred in testing. ] Signed-off-by: Paul E. McKenney paulmck@linux.ibm.com Link: https://lkml.kernel.org/r/CD6925E8781EFD4D8E11882D20FC406D52A11F61@SHSMSX104... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/rcu/tree.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-)
--- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1557,14 +1557,23 @@ static bool rcu_future_gp_cleanup(struct }
/* - * Awaken the grace-period kthread. Don't do a self-awaken, and don't - * bother awakening when there is nothing for the grace-period kthread - * to do (as in several CPUs raced to awaken, and we lost), and finally - * don't try to awaken a kthread that has not yet been created. + * Awaken the grace-period kthread. Don't do a self-awaken (unless in + * an interrupt or softirq handler), and don't bother awakening when there + * is nothing for the grace-period kthread to do (as in several CPUs raced + * to awaken, and we lost), and finally don't try to awaken a kthread that + * has not yet been created. If all those checks are passed, track some + * debug information and awaken. + * + * So why do the self-wakeup when in an interrupt or softirq handler + * in the grace-period kthread's context? Because the kthread might have + * been interrupted just as it was going to sleep, and just after the final + * pre-sleep check of the awaken condition. In this case, a wakeup really + * is required, and is therefore supplied. */ static void rcu_gp_kthread_wake(void) { - if (current == rcu_state.gp_kthread || + if ((current == rcu_state.gp_kthread && + !in_interrupt() && !in_serving_softirq()) || !READ_ONCE(rcu_state.gp_flags) || !rcu_state.gp_kthread) return;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steve Longerbeam slongerbeam@gmail.com
commit a19c22677377b87e4354f7306f46ad99bc982a9f upstream.
Upstream must be stopped immediately after receiving the last EOF and before disabling the IDMA channel. This can be accomplished by moving upstream stream off to just after receiving the last EOF completion in prp_stop(). For symmetry also move upstream stream on to end of prp_start().
This fixes a complete system hard lockup on the SabreAuto when streaming from the ADV7180, by repeatedly sending a stream off immediately followed by stream on:
while true; do v4l2-ctl -d1 --stream-mmap --stream-count=3; done
Eventually this either causes the system lockup or EOF timeouts at all subsequent stream on, until a system reset.
The lockup occurs when disabling the IDMA channel at stream off. Stopping the video data stream entering the IDMA channel before disabling the channel itself appears to be a reliable fix for the hard lockup.
Fixes: f0d9c8924e2c3 ("[media] media: imx: Add IC subdev drivers")
Reported-by: Gaël PORTAY gael.portay@collabora.com Tested-by: Gaël PORTAY gael.portay@collabora.com Signed-off-by: Steve Longerbeam slongerbeam@gmail.com Cc: stable@vger.kernel.org # for 4.13 and up Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/staging/media/imx/imx-ic-prpencvf.c | 26 +++++++++++++++++--------- 1 file changed, 17 insertions(+), 9 deletions(-)
--- a/drivers/staging/media/imx/imx-ic-prpencvf.c +++ b/drivers/staging/media/imx/imx-ic-prpencvf.c @@ -680,12 +680,23 @@ static int prp_start(struct prp_priv *pr goto out_free_nfb4eof_irq; }
+ /* start upstream */ + ret = v4l2_subdev_call(priv->src_sd, video, s_stream, 1); + ret = (ret && ret != -ENOIOCTLCMD) ? ret : 0; + if (ret) { + v4l2_err(&ic_priv->sd, + "upstream stream on failed: %d\n", ret); + goto out_free_eof_irq; + } + /* start the EOF timeout timer */ mod_timer(&priv->eof_timeout_timer, jiffies + msecs_to_jiffies(IMX_MEDIA_EOF_TIMEOUT));
return 0;
+out_free_eof_irq: + devm_free_irq(ic_priv->dev, priv->eof_irq, priv); out_free_nfb4eof_irq: devm_free_irq(ic_priv->dev, priv->nfb4eof_irq, priv); out_unsetup: @@ -717,6 +728,12 @@ static void prp_stop(struct prp_priv *pr if (ret == 0) v4l2_warn(&ic_priv->sd, "wait last EOF timeout\n");
+ /* stop upstream */ + ret = v4l2_subdev_call(priv->src_sd, video, s_stream, 0); + if (ret && ret != -ENOIOCTLCMD) + v4l2_warn(&ic_priv->sd, + "upstream stream off failed: %d\n", ret); + devm_free_irq(ic_priv->dev, priv->eof_irq, priv); devm_free_irq(ic_priv->dev, priv->nfb4eof_irq, priv);
@@ -1148,15 +1165,6 @@ static int prp_s_stream(struct v4l2_subd if (ret) goto out;
- /* start/stop upstream */ - ret = v4l2_subdev_call(priv->src_sd, video, s_stream, enable); - ret = (ret && ret != -ENOIOCTLCMD) ? ret : 0; - if (ret) { - if (enable) - prp_stop(priv); - goto out; - } - update_count: priv->stream_count += enable ? 1 : -1; if (priv->stream_count < 0)
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: French, Nicholas A naf@ou.edu
commit 1b4fd9de6ec7f3722c2b3e08cc5ad171c11f93be upstream.
A typo in code cleanup commit db9c1007bc07 ("media: lgdt330x: do some cleanups at status logic") broke the FE_HAS_LOCK reporting for 3303 chips by inadvertently modifying the register mask.
The broken lock status is critial as it prevents video capture cards from reporting signal strength, scanning for channels, and capturing video.
Fix regression by reverting mask change.
Cc: stable@vger.kernel.org # Kernel 4.17+ Fixes: db9c1007bc07 ("media: lgdt330x: do some cleanups at status logic") Signed-off-by: Nick French naf@ou.edu Reviewed-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Tested-by: Adam Stylinski kungfujesus06@gmail.com Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/dvb-frontends/lgdt330x.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/media/dvb-frontends/lgdt330x.c +++ b/drivers/media/dvb-frontends/lgdt330x.c @@ -783,7 +783,7 @@ static int lgdt3303_read_status(struct d
if ((buf[0] & 0x02) == 0x00) *status |= FE_HAS_SYNC; - if ((buf[0] & 0xfd) == 0x01) + if ((buf[0] & 0x01) == 0x01) *status |= FE_HAS_VITERBI | FE_HAS_LOCK; break; default:
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chen-Yu Tsai wens@csie.org
commit d31b282e2c0de9c7fb113516820340251f03a625 upstream.
max_register is currently set to 0x1000. This is beyond the mapped address range of the hardware, so attempts to dump the regmap from debugfs would trigger a kernel exception.
Furthermore, the useful registers only occupy a small section at the beginning of the full range. Change the value to 0x9c, the last known register on the V3s and H3.
On the A31, the register range is extended to support additional capture channels. Since this is not yet supported, ignore it for now.
Fixes: 5cc7522d8965 ("media: sun6i: Add support for Allwinner CSI V3s")
Cc: stable@vger.kernel.org Signed-off-by: Chen-Yu Tsai wens@csie.org Acked-by: Maxime Ripard maxime.ripard@bootlin.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c @@ -793,7 +793,7 @@ static const struct regmap_config sun6i_ .reg_bits = 32, .reg_stride = 4, .val_bits = 32, - .max_register = 0x1000, + .max_register = 0x9c, };
static int sun6i_csi_resource_request(struct sun6i_csi_dev *sdev,
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sakari Ailus sakari.ailus@linux.intel.com
commit 9dd0627d8d62a7ddb001a75f63942d92b5336561 upstream.
The UVC video driver converts the timestamp from hardware specific unit to one known by the kernel at the time when the buffer is dequeued. This is fine in general, but the streamoff operation consists of the following steps (among other things):
1. uvc_video_clock_cleanup --- the hardware clock sample array is released and the pointer to the array is set to NULL,
2. buffers in active state are returned to the user and
3. buf_finish callback is called on buffers that are prepared. buf_finish includes calling uvc_video_clock_update that accesses the hardware clock sample array.
The above is serialised by a queue specific mutex. Address the problem by skipping the clock conversion if the hardware clock sample array is already released.
Fixes: 9c0863b1cc48 ("[media] vb2: call buf_finish from __queue_cancel")
Reported-by: Chiranjeevi Rapolu chiranjeevi.rapolu@intel.com Tested-by: Chiranjeevi Rapolu chiranjeevi.rapolu@intel.com Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Cc: stable@vger.kernel.org Signed-off-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/usb/uvc/uvc_video.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/drivers/media/usb/uvc/uvc_video.c +++ b/drivers/media/usb/uvc/uvc_video.c @@ -676,6 +676,14 @@ void uvc_video_clock_update(struct uvc_s if (!uvc_hw_timestamps_param) return;
+ /* + * We will get called from __vb2_queue_cancel() if there are buffers + * done but not dequeued by the user, but the sample array has already + * been released at that time. Just bail out in that case. + */ + if (!clock->samples) + return; + spin_lock_irqsave(&clock->lock, flags);
if (clock->count < clock->size)
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lucas A. M. Magalhães lucmaga@gmail.com
commit adc589d2a20808fb99d46a78175cd023f2040338 upstream.
Add a linear pipeline logic for the stream control. It's created by walking backwards on the entity graph. When the stream starts it will simply loop through the pipeline calling the respective process_frame function of each entity.
Fixes: f2fe89061d797 ("vimc: Virtual Media Controller core, capture and sensor")
Cc: stable@vger.kernel.org # for v4.20 Signed-off-by: Lucas A. M. Magalhães lucmaga@gmail.com Acked-by: Helen Koike helen.koike@collabora.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl [hverkuil-cisco@xs4all.nl: fixed small space-after-tab issue in the patch] Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/platform/vimc/Makefile | 3 drivers/media/platform/vimc/vimc-capture.c | 18 +- drivers/media/platform/vimc/vimc-common.c | 35 ----- drivers/media/platform/vimc/vimc-common.h | 15 -- drivers/media/platform/vimc/vimc-debayer.c | 26 --- drivers/media/platform/vimc/vimc-scaler.c | 28 ---- drivers/media/platform/vimc/vimc-sensor.c | 56 +------- drivers/media/platform/vimc/vimc-streamer.c | 188 ++++++++++++++++++++++++++++ drivers/media/platform/vimc/vimc-streamer.h | 38 +++++ 9 files changed, 260 insertions(+), 147 deletions(-)
--- a/drivers/media/platform/vimc/Makefile +++ b/drivers/media/platform/vimc/Makefile @@ -5,6 +5,7 @@ vimc_common-objs := vimc-common.o vimc_debayer-objs := vimc-debayer.o vimc_scaler-objs := vimc-scaler.o vimc_sensor-objs := vimc-sensor.o +vimc_streamer-objs := vimc-streamer.o
obj-$(CONFIG_VIDEO_VIMC) += vimc.o vimc_capture.o vimc_common.o vimc-debayer.o \ - vimc_scaler.o vimc_sensor.o + vimc_scaler.o vimc_sensor.o vimc_streamer.o --- a/drivers/media/platform/vimc/vimc-capture.c +++ b/drivers/media/platform/vimc/vimc-capture.c @@ -24,6 +24,7 @@ #include <media/videobuf2-vmalloc.h>
#include "vimc-common.h" +#include "vimc-streamer.h"
#define VIMC_CAP_DRV_NAME "vimc-capture"
@@ -44,7 +45,7 @@ struct vimc_cap_device { spinlock_t qlock; struct mutex lock; u32 sequence; - struct media_pipeline pipe; + struct vimc_stream stream; };
static const struct v4l2_pix_format fmt_default = { @@ -248,14 +249,13 @@ static int vimc_cap_start_streaming(stru vcap->sequence = 0;
/* Start the media pipeline */ - ret = media_pipeline_start(entity, &vcap->pipe); + ret = media_pipeline_start(entity, &vcap->stream.pipe); if (ret) { vimc_cap_return_all_buffers(vcap, VB2_BUF_STATE_QUEUED); return ret; }
- /* Enable streaming from the pipe */ - ret = vimc_pipeline_s_stream(&vcap->vdev.entity, 1); + ret = vimc_streamer_s_stream(&vcap->stream, &vcap->ved, 1); if (ret) { media_pipeline_stop(entity); vimc_cap_return_all_buffers(vcap, VB2_BUF_STATE_QUEUED); @@ -273,8 +273,7 @@ static void vimc_cap_stop_streaming(stru { struct vimc_cap_device *vcap = vb2_get_drv_priv(vq);
- /* Disable streaming from the pipe */ - vimc_pipeline_s_stream(&vcap->vdev.entity, 0); + vimc_streamer_s_stream(&vcap->stream, &vcap->ved, 0);
/* Stop the media pipeline */ media_pipeline_stop(&vcap->vdev.entity); @@ -355,8 +354,8 @@ static void vimc_cap_comp_unbind(struct kfree(vcap); }
-static void vimc_cap_process_frame(struct vimc_ent_device *ved, - struct media_pad *sink, const void *frame) +static void *vimc_cap_process_frame(struct vimc_ent_device *ved, + const void *frame) { struct vimc_cap_device *vcap = container_of(ved, struct vimc_cap_device, ved); @@ -370,7 +369,7 @@ static void vimc_cap_process_frame(struc typeof(*vimc_buf), list); if (!vimc_buf) { spin_unlock(&vcap->qlock); - return; + return ERR_PTR(-EAGAIN); }
/* Remove this entry from the list */ @@ -391,6 +390,7 @@ static void vimc_cap_process_frame(struc vb2_set_plane_payload(&vimc_buf->vb2.vb2_buf, 0, vcap->format.sizeimage); vb2_buffer_done(&vimc_buf->vb2.vb2_buf, VB2_BUF_STATE_DONE); + return NULL; }
static int vimc_cap_comp_bind(struct device *comp, struct device *master, --- a/drivers/media/platform/vimc/vimc-common.c +++ b/drivers/media/platform/vimc/vimc-common.c @@ -207,41 +207,6 @@ const struct vimc_pix_map *vimc_pix_map_ } EXPORT_SYMBOL_GPL(vimc_pix_map_by_pixelformat);
-int vimc_propagate_frame(struct media_pad *src, const void *frame) -{ - struct media_link *link; - - if (!(src->flags & MEDIA_PAD_FL_SOURCE)) - return -EINVAL; - - /* Send this frame to all sink pads that are direct linked */ - list_for_each_entry(link, &src->entity->links, list) { - if (link->source == src && - (link->flags & MEDIA_LNK_FL_ENABLED)) { - struct vimc_ent_device *ved = NULL; - struct media_entity *entity = link->sink->entity; - - if (is_media_entity_v4l2_subdev(entity)) { - struct v4l2_subdev *sd = - container_of(entity, struct v4l2_subdev, - entity); - ved = v4l2_get_subdevdata(sd); - } else if (is_media_entity_v4l2_video_device(entity)) { - struct video_device *vdev = - container_of(entity, - struct video_device, - entity); - ved = video_get_drvdata(vdev); - } - if (ved && ved->process_frame) - ved->process_frame(ved, link->sink, frame); - } - } - - return 0; -} -EXPORT_SYMBOL_GPL(vimc_propagate_frame); - /* Helper function to allocate and initialize pads */ struct media_pad *vimc_pads_init(u16 num_pads, const unsigned long *pads_flag) { --- a/drivers/media/platform/vimc/vimc-common.h +++ b/drivers/media/platform/vimc/vimc-common.h @@ -113,24 +113,13 @@ struct vimc_pix_map { struct vimc_ent_device { struct media_entity *ent; struct media_pad *pads; - void (*process_frame)(struct vimc_ent_device *ved, - struct media_pad *sink, const void *frame); + void * (*process_frame)(struct vimc_ent_device *ved, + const void *frame); void (*vdev_get_format)(struct vimc_ent_device *ved, struct v4l2_pix_format *fmt); };
/** - * vimc_propagate_frame - propagate a frame through the topology - * - * @src: the source pad where the frame is being originated - * @frame: the frame to be propagated - * - * This function will call the process_frame callback from the vimc_ent_device - * struct of the nodes directly connected to the @src pad - */ -int vimc_propagate_frame(struct media_pad *src, const void *frame); - -/** * vimc_pads_init - initialize pads * * @num_pads: number of pads to initialize --- a/drivers/media/platform/vimc/vimc-debayer.c +++ b/drivers/media/platform/vimc/vimc-debayer.c @@ -321,7 +321,6 @@ static void vimc_deb_set_rgb_mbus_fmt_rg static int vimc_deb_s_stream(struct v4l2_subdev *sd, int enable) { struct vimc_deb_device *vdeb = v4l2_get_subdevdata(sd); - int ret;
if (enable) { const struct vimc_pix_map *vpix; @@ -351,22 +350,10 @@ static int vimc_deb_s_stream(struct v4l2 if (!vdeb->src_frame) return -ENOMEM;
- /* Turn the stream on in the subdevices directly connected */ - ret = vimc_pipeline_s_stream(&vdeb->sd.entity, 1); - if (ret) { - vfree(vdeb->src_frame); - vdeb->src_frame = NULL; - return ret; - } } else { if (!vdeb->src_frame) return 0;
- /* Disable streaming from the pipe */ - ret = vimc_pipeline_s_stream(&vdeb->sd.entity, 0); - if (ret) - return ret; - vfree(vdeb->src_frame); vdeb->src_frame = NULL; } @@ -480,9 +467,8 @@ static void vimc_deb_calc_rgb_sink(struc } }
-static void vimc_deb_process_frame(struct vimc_ent_device *ved, - struct media_pad *sink, - const void *sink_frame) +static void *vimc_deb_process_frame(struct vimc_ent_device *ved, + const void *sink_frame) { struct vimc_deb_device *vdeb = container_of(ved, struct vimc_deb_device, ved); @@ -491,7 +477,7 @@ static void vimc_deb_process_frame(struc
/* If the stream in this node is not active, just return */ if (!vdeb->src_frame) - return; + return ERR_PTR(-EINVAL);
for (i = 0; i < vdeb->sink_fmt.height; i++) for (j = 0; j < vdeb->sink_fmt.width; j++) { @@ -499,12 +485,8 @@ static void vimc_deb_process_frame(struc vdeb->set_rgb_src(vdeb, i, j, rgb); }
- /* Propagate the frame through all source pads */ - for (i = 1; i < vdeb->sd.entity.num_pads; i++) { - struct media_pad *pad = &vdeb->sd.entity.pads[i]; + return vdeb->src_frame;
- vimc_propagate_frame(pad, vdeb->src_frame); - } }
static void vimc_deb_comp_unbind(struct device *comp, struct device *master, --- a/drivers/media/platform/vimc/vimc-scaler.c +++ b/drivers/media/platform/vimc/vimc-scaler.c @@ -217,7 +217,6 @@ static const struct v4l2_subdev_pad_ops static int vimc_sca_s_stream(struct v4l2_subdev *sd, int enable) { struct vimc_sca_device *vsca = v4l2_get_subdevdata(sd); - int ret;
if (enable) { const struct vimc_pix_map *vpix; @@ -245,22 +244,10 @@ static int vimc_sca_s_stream(struct v4l2 if (!vsca->src_frame) return -ENOMEM;
- /* Turn the stream on in the subdevices directly connected */ - ret = vimc_pipeline_s_stream(&vsca->sd.entity, 1); - if (ret) { - vfree(vsca->src_frame); - vsca->src_frame = NULL; - return ret; - } } else { if (!vsca->src_frame) return 0;
- /* Disable streaming from the pipe */ - ret = vimc_pipeline_s_stream(&vsca->sd.entity, 0); - if (ret) - return ret; - vfree(vsca->src_frame); vsca->src_frame = NULL; } @@ -346,26 +333,19 @@ static void vimc_sca_fill_src_frame(cons vimc_sca_scale_pix(vsca, i, j, sink_frame); }
-static void vimc_sca_process_frame(struct vimc_ent_device *ved, - struct media_pad *sink, - const void *sink_frame) +static void *vimc_sca_process_frame(struct vimc_ent_device *ved, + const void *sink_frame) { struct vimc_sca_device *vsca = container_of(ved, struct vimc_sca_device, ved); - unsigned int i;
/* If the stream in this node is not active, just return */ if (!vsca->src_frame) - return; + return ERR_PTR(-EINVAL);
vimc_sca_fill_src_frame(vsca, sink_frame);
- /* Propagate the frame through all source pads */ - for (i = 1; i < vsca->sd.entity.num_pads; i++) { - struct media_pad *pad = &vsca->sd.entity.pads[i]; - - vimc_propagate_frame(pad, vsca->src_frame); - } + return vsca->src_frame; };
static void vimc_sca_comp_unbind(struct device *comp, struct device *master, --- a/drivers/media/platform/vimc/vimc-sensor.c +++ b/drivers/media/platform/vimc/vimc-sensor.c @@ -16,8 +16,6 @@ */
#include <linux/component.h> -#include <linux/freezer.h> -#include <linux/kthread.h> #include <linux/module.h> #include <linux/mod_devicetable.h> #include <linux/platform_device.h> @@ -201,38 +199,27 @@ static const struct v4l2_subdev_pad_ops .set_fmt = vimc_sen_set_fmt, };
-static int vimc_sen_tpg_thread(void *data) +static void *vimc_sen_process_frame(struct vimc_ent_device *ved, + const void *sink_frame) { - struct vimc_sen_device *vsen = data; - unsigned int i; - - set_freezable(); - set_current_state(TASK_UNINTERRUPTIBLE); - - for (;;) { - try_to_freeze(); - if (kthread_should_stop()) - break; - - tpg_fill_plane_buffer(&vsen->tpg, 0, 0, vsen->frame); - - /* Send the frame to all source pads */ - for (i = 0; i < vsen->sd.entity.num_pads; i++) - vimc_propagate_frame(&vsen->sd.entity.pads[i], - vsen->frame); + struct vimc_sen_device *vsen = container_of(ved, struct vimc_sen_device, + ved); + const struct vimc_pix_map *vpix; + unsigned int frame_size;
- /* 60 frames per second */ - schedule_timeout(HZ/60); - } + /* Calculate the frame size */ + vpix = vimc_pix_map_by_code(vsen->mbus_format.code); + frame_size = vsen->mbus_format.width * vpix->bpp * + vsen->mbus_format.height;
- return 0; + tpg_fill_plane_buffer(&vsen->tpg, 0, 0, vsen->frame); + return vsen->frame; }
static int vimc_sen_s_stream(struct v4l2_subdev *sd, int enable) { struct vimc_sen_device *vsen = container_of(sd, struct vimc_sen_device, sd); - int ret;
if (enable) { const struct vimc_pix_map *vpix; @@ -258,26 +245,8 @@ static int vimc_sen_s_stream(struct v4l2 /* configure the test pattern generator */ vimc_sen_tpg_s_format(vsen);
- /* Initialize the image generator thread */ - vsen->kthread_sen = kthread_run(vimc_sen_tpg_thread, vsen, - "%s-sen", vsen->sd.v4l2_dev->name); - if (IS_ERR(vsen->kthread_sen)) { - dev_err(vsen->dev, "%s: kernel_thread() failed\n", - vsen->sd.name); - vfree(vsen->frame); - vsen->frame = NULL; - return PTR_ERR(vsen->kthread_sen); - } } else { - if (!vsen->kthread_sen) - return 0; - - /* Stop image generator */ - ret = kthread_stop(vsen->kthread_sen); - if (ret) - return ret;
- vsen->kthread_sen = NULL; vfree(vsen->frame); vsen->frame = NULL; return 0; @@ -413,6 +382,7 @@ static int vimc_sen_comp_bind(struct dev if (ret) goto err_free_hdl;
+ vsen->ved.process_frame = vimc_sen_process_frame; dev_set_drvdata(comp, &vsen->ved); vsen->dev = comp;
--- /dev/null +++ b/drivers/media/platform/vimc/vimc-streamer.c @@ -0,0 +1,188 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * vimc-streamer.c Virtual Media Controller Driver + * + * Copyright (C) 2018 Lucas A. M. Magalhães lucmaga@gmail.com + * + */ + +#include <linux/init.h> +#include <linux/module.h> +#include <linux/freezer.h> +#include <linux/kthread.h> + +#include "vimc-streamer.h" + +/** + * vimc_get_source_entity - get the entity connected with the first sink pad + * + * @ent: reference media_entity + * + * Helper function that returns the media entity containing the source pad + * linked with the first sink pad from the given media entity pad list. + */ +static struct media_entity *vimc_get_source_entity(struct media_entity *ent) +{ + struct media_pad *pad; + int i; + + for (i = 0; i < ent->num_pads; i++) { + if (ent->pads[i].flags & MEDIA_PAD_FL_SOURCE) + continue; + pad = media_entity_remote_pad(&ent->pads[i]); + return pad ? pad->entity : NULL; + } + return NULL; +} + +/* + * vimc_streamer_pipeline_terminate - Disable stream in all ved in stream + * + * @stream: the pointer to the stream structure with the pipeline to be + * disabled. + * + * Calls s_stream to disable the stream in each entity of the pipeline + * + */ +static void vimc_streamer_pipeline_terminate(struct vimc_stream *stream) +{ + struct media_entity *entity; + struct v4l2_subdev *sd; + + while (stream->pipe_size) { + stream->pipe_size--; + entity = stream->ved_pipeline[stream->pipe_size]->ent; + entity = vimc_get_source_entity(entity); + stream->ved_pipeline[stream->pipe_size] = NULL; + + if (!is_media_entity_v4l2_subdev(entity)) + continue; + + sd = media_entity_to_v4l2_subdev(entity); + v4l2_subdev_call(sd, video, s_stream, 0); + } +} + +/* + * vimc_streamer_pipeline_init - initializes the stream structure + * + * @stream: the pointer to the stream structure to be initialized + * @ved: the pointer to the vimc entity initializing the stream + * + * Initializes the stream structure. Walks through the entity graph to + * construct the pipeline used later on the streamer thread. + * Calls s_stream to enable stream in all entities of the pipeline. + */ +static int vimc_streamer_pipeline_init(struct vimc_stream *stream, + struct vimc_ent_device *ved) +{ + struct media_entity *entity; + struct video_device *vdev; + struct v4l2_subdev *sd; + int ret = 0; + + stream->pipe_size = 0; + while (stream->pipe_size < VIMC_STREAMER_PIPELINE_MAX_SIZE) { + if (!ved) { + vimc_streamer_pipeline_terminate(stream); + return -EINVAL; + } + stream->ved_pipeline[stream->pipe_size++] = ved; + + entity = vimc_get_source_entity(ved->ent); + /* Check if the end of the pipeline was reached*/ + if (!entity) + return 0; + + if (is_media_entity_v4l2_subdev(entity)) { + sd = media_entity_to_v4l2_subdev(entity); + ret = v4l2_subdev_call(sd, video, s_stream, 1); + if (ret && ret != -ENOIOCTLCMD) { + vimc_streamer_pipeline_terminate(stream); + return ret; + } + ved = v4l2_get_subdevdata(sd); + } else { + vdev = container_of(entity, + struct video_device, + entity); + ved = video_get_drvdata(vdev); + } + } + + vimc_streamer_pipeline_terminate(stream); + return -EINVAL; +} + +static int vimc_streamer_thread(void *data) +{ + struct vimc_stream *stream = data; + int i; + + set_freezable(); + set_current_state(TASK_UNINTERRUPTIBLE); + + for (;;) { + try_to_freeze(); + if (kthread_should_stop()) + break; + + for (i = stream->pipe_size - 1; i >= 0; i--) { + stream->frame = stream->ved_pipeline[i]->process_frame( + stream->ved_pipeline[i], + stream->frame); + if (!stream->frame) + break; + if (IS_ERR(stream->frame)) + break; + } + //wait for 60hz + schedule_timeout(HZ / 60); + } + + return 0; +} + +int vimc_streamer_s_stream(struct vimc_stream *stream, + struct vimc_ent_device *ved, + int enable) +{ + int ret; + + if (!stream || !ved) + return -EINVAL; + + if (enable) { + if (stream->kthread) + return 0; + + ret = vimc_streamer_pipeline_init(stream, ved); + if (ret) + return ret; + + stream->kthread = kthread_run(vimc_streamer_thread, stream, + "vimc-streamer thread"); + + if (IS_ERR(stream->kthread)) + return PTR_ERR(stream->kthread); + + } else { + if (!stream->kthread) + return 0; + + ret = kthread_stop(stream->kthread); + if (ret) + return ret; + + stream->kthread = NULL; + + vimc_streamer_pipeline_terminate(stream); + } + + return 0; +} +EXPORT_SYMBOL_GPL(vimc_streamer_s_stream); + +MODULE_DESCRIPTION("Virtual Media Controller Driver (VIMC) Streamer"); +MODULE_AUTHOR("Lucas A. M. Magalhães lucmaga@gmail.com"); +MODULE_LICENSE("GPL"); --- /dev/null +++ b/drivers/media/platform/vimc/vimc-streamer.h @@ -0,0 +1,38 @@ +/* SPDX-License-Identifier: GPL-2.0+ */ +/* + * vimc-streamer.h Virtual Media Controller Driver + * + * Copyright (C) 2018 Lucas A. M. Magalhães lucmaga@gmail.com + * + */ + +#ifndef _VIMC_STREAMER_H_ +#define _VIMC_STREAMER_H_ + +#include <media/media-device.h> + +#include "vimc-common.h" + +#define VIMC_STREAMER_PIPELINE_MAX_SIZE 16 + +struct vimc_stream { + struct media_pipeline pipe; + struct vimc_ent_device *ved_pipeline[VIMC_STREAMER_PIPELINE_MAX_SIZE]; + unsigned int pipe_size; + u8 *frame; + struct task_struct *kthread; +}; + +/** + * vimc_streamer_s_streamer - start/stop the stream + * + * @stream: the pointer to the stream to start or stop + * @ved: The last entity of the streamer pipeline + * @enable: any non-zero number start the stream, zero stop + * + */ +int vimc_streamer_s_stream(struct vimc_stream *stream, + struct vimc_ent_device *ved, + int enable); + +#endif //_VIMC_STREAMER_H_
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steve Longerbeam slongerbeam@gmail.com
commit 337e90ed028643c7acdfd0d31e3224d05ca03d66 upstream.
Some imx platforms do not have fwnode connections to all CSI input ports, and should not be treated as an error. This includes the imx6q SabreAuto, which has no connections to ipu1_csi1 and ipu2_csi0. Return -ENOTCONN in imx_csi_parse_endpoint() so that v4l2-fwnode endpoint parsing will not treat an unconnected CSI input port as an error.
Fixes: c893500a16baf ("media: imx: csi: Register a subdev notifier")
Signed-off-by: Steve Longerbeam slongerbeam@gmail.com Reviewed-by: Philipp Zabel p.zabel@pengutronix.de Acked-by: Tim Harvey tharvey@gateworks.com Cc: stable@vger.kernel.org Tested-by: Fabio Estevam festevam@gmail.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/staging/media/imx/imx-media-csi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/staging/media/imx/imx-media-csi.c +++ b/drivers/staging/media/imx/imx-media-csi.c @@ -1787,7 +1787,7 @@ static int imx_csi_parse_endpoint(struct struct v4l2_fwnode_endpoint *vep, struct v4l2_async_subdev *asd) { - return fwnode_device_is_available(asd->match.fwnode) ? 0 : -EINVAL; + return fwnode_device_is_available(asd->match.fwnode) ? 0 : -ENOTCONN; }
static int imx_csi_async_register(struct csi_priv *priv)
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steve Longerbeam slongerbeam@gmail.com
commit 2e0fe66e0a136252f4d89dbbccdcb26deb867eb8 upstream.
Disable the CSI immediately after receiving the last EOF before stream off (and thus before disabling the IDMA channel). Do this by moving the wait for EOF completion into a new function csi_idmac_wait_last_eof().
This fixes a complete system hard lockup on the SabreAuto when streaming from the ADV7180, by repeatedly sending a stream off immediately followed by stream on:
while true; do v4l2-ctl -d4 --stream-mmap --stream-count=3; done
Eventually this either causes the system lockup or EOF timeouts at all subsequent stream on, until a system reset.
The lockup occurs when disabling the IDMA channel at stream off. Disabling the CSI before disabling the IDMA channel appears to be a reliable fix for the hard lockup.
Fixes: 4a34ec8e470cb ("[media] media: imx: Add CSI subdev driver")
Reported-by: Gaël PORTAY gael.portay@collabora.com Signed-off-by: Steve Longerbeam slongerbeam@gmail.com Cc: stable@vger.kernel.org # for 4.13 and up Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/staging/media/imx/imx-media-csi.c | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-)
--- a/drivers/staging/media/imx/imx-media-csi.c +++ b/drivers/staging/media/imx/imx-media-csi.c @@ -629,7 +629,7 @@ out_put_ipu: return ret; }
-static void csi_idmac_stop(struct csi_priv *priv) +static void csi_idmac_wait_last_eof(struct csi_priv *priv) { unsigned long flags; int ret; @@ -646,7 +646,10 @@ static void csi_idmac_stop(struct csi_pr &priv->last_eof_comp, msecs_to_jiffies(IMX_MEDIA_EOF_TIMEOUT)); if (ret == 0) v4l2_warn(&priv->sd, "wait last EOF timeout\n"); +}
+static void csi_idmac_stop(struct csi_priv *priv) +{ devm_free_irq(priv->dev, priv->eof_irq, priv); devm_free_irq(priv->dev, priv->nfb4eof_irq, priv);
@@ -758,6 +761,16 @@ idmac_stop:
static void csi_stop(struct csi_priv *priv) { + if (priv->dest == IPU_CSI_DEST_IDMAC) + csi_idmac_wait_last_eof(priv); + + /* + * Disable the CSI asap, after syncing with the last EOF. + * Doing so after the IDMA channel is disabled has shown to + * create hard system-wide hangs. + */ + ipu_csi_disable(priv->csi); + if (priv->dest == IPU_CSI_DEST_IDMAC) { csi_idmac_stop(priv);
@@ -765,8 +778,6 @@ static void csi_stop(struct csi_priv *pr if (priv->fim) imx_media_fim_set_stream(priv->fim, NULL, false); } - - ipu_csi_disable(priv->csi); }
static const struct csi_skip_desc csi_skip[12] = {
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steve Longerbeam slongerbeam@gmail.com
commit 4bc1ab41eee9d02ad2483bf8f51a7b72e3504eba upstream.
Move upstream stream off to just after receiving the last EOF completion and disabling the CSI (and thus before disabling the IDMA channel) in csi_stop(). For symmetry also move upstream stream on to beginning of csi_start().
Doing this makes csi_s_stream() more symmetric with prp_s_stream() which will require the same change to fix a hard lockup.
Signed-off-by: Steve Longerbeam slongerbeam@gmail.com Cc: stable@vger.kernel.org # for 4.13 and up Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/staging/media/imx/imx-media-csi.c | 25 +++++++++++++------------ 1 file changed, 13 insertions(+), 12 deletions(-)
--- a/drivers/staging/media/imx/imx-media-csi.c +++ b/drivers/staging/media/imx/imx-media-csi.c @@ -725,10 +725,16 @@ static int csi_start(struct csi_priv *pr
output_fi = &priv->frame_interval[priv->active_output_pad];
+ /* start upstream */ + ret = v4l2_subdev_call(priv->src_sd, video, s_stream, 1); + ret = (ret && ret != -ENOIOCTLCMD) ? ret : 0; + if (ret) + return ret; + if (priv->dest == IPU_CSI_DEST_IDMAC) { ret = csi_idmac_start(priv); if (ret) - return ret; + goto stop_upstream; }
ret = csi_setup(priv); @@ -756,6 +762,8 @@ fim_off: idmac_stop: if (priv->dest == IPU_CSI_DEST_IDMAC) csi_idmac_stop(priv); +stop_upstream: + v4l2_subdev_call(priv->src_sd, video, s_stream, 0); return ret; }
@@ -771,6 +779,9 @@ static void csi_stop(struct csi_priv *pr */ ipu_csi_disable(priv->csi);
+ /* stop upstream */ + v4l2_subdev_call(priv->src_sd, video, s_stream, 0); + if (priv->dest == IPU_CSI_DEST_IDMAC) { csi_idmac_stop(priv);
@@ -938,23 +949,13 @@ static int csi_s_stream(struct v4l2_subd goto update_count;
if (enable) { - /* upstream must be started first, before starting CSI */ - ret = v4l2_subdev_call(priv->src_sd, video, s_stream, 1); - ret = (ret && ret != -ENOIOCTLCMD) ? ret : 0; - if (ret) - goto out; - dev_dbg(priv->dev, "stream ON\n"); ret = csi_start(priv); - if (ret) { - v4l2_subdev_call(priv->src_sd, video, s_stream, 0); + if (ret) goto out; - } } else { dev_dbg(priv->dev, "stream OFF\n"); - /* CSI must be stopped first, then stop upstream */ csi_stop(priv); - v4l2_subdev_call(priv->src_sd, video, s_stream, 0); }
update_count:
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Noralf Trønnes noralf@tronnes.org
commit 78de14c23e031420aa5f61973583635eccd6cd2a upstream.
If fbdev setup has failed, lastclose will give a NULL pointer deref:
[ 77.794295] [drm:drm_lastclose] [ 77.794414] [drm:drm_lastclose] driver lastclose completed [ 77.794660] Unable to handle kernel NULL pointer dereference at virtual address 00000014 [ 77.809460] pgd = b376b71b [ 77.818275] [00000014] *pgd=175ba831, *pte=00000000, *ppte=00000000 [ 77.830813] Internal error: Oops: 17 [#1] ARM [ 77.840963] Modules linked in: mi0283qt mipi_dbi tinydrm raspberrypi_hwmon gpio_backlight backlight snd_bcm2835(C) bcm2835_rng rng_core [ 77.865203] CPU: 0 PID: 527 Comm: lt-modetest Tainted: G C 5.0.0-rc1+ #1 [ 77.879525] Hardware name: BCM2835 [ 77.889185] PC is at restore_fbdev_mode+0x20/0x164 [ 77.900261] LR is at drm_fb_helper_restore_fbdev_mode_unlocked+0x54/0x9c [ 78.002446] Process lt-modetest (pid: 527, stack limit = 0x7a3d5c14) [ 78.291030] Backtrace: [ 78.300815] [<c04f2d0c>] (restore_fbdev_mode) from [<c04f4708>] (drm_fb_helper_restore_fbdev_mode_unlocked+0x54/0x9c) [ 78.319095] r9:d8a8a288 r8:d891acf0 r7:d7697910 r6:00000000 r5:d891ac00 r4:d891ac00 [ 78.334432] [<c04f46b4>] (drm_fb_helper_restore_fbdev_mode_unlocked) from [<c04f47e8>] (drm_fbdev_client_restore+0x18/0x20) [ 78.353296] r8:d76978c0 r7:d7697910 r6:d7697950 r5:d7697800 r4:d891ac00 r3:c04f47d0 [ 78.368689] [<c04f47d0>] (drm_fbdev_client_restore) from [<c051b6b4>] (drm_client_dev_restore+0x7c/0xc0) [ 78.385982] [<c051b638>] (drm_client_dev_restore) from [<c04f8fd0>] (drm_lastclose+0xc4/0xd4) [ 78.402332] r8:d76978c0 r7:d7471080 r6:c0e0c088 r5:d8a85e00 r4:d7697800 [ 78.416688] [<c04f8f0c>] (drm_lastclose) from [<c04f9088>] (drm_release+0xa8/0x10c) [ 78.431929] r5:d8a85e00 r4:d7697800 [ 78.442989] [<c04f8fe0>] (drm_release) from [<c02640c4>] (__fput+0x104/0x1c8) [ 78.457740] r8:d5ccea10 r7:d96cfb10 r6:00000008 r5:d74c1b90 r4:d8a8a280 [ 78.472043] [<c0263fc0>] (__fput) from [<c02641ec>] (____fput+0x18/0x1c) [ 78.486363] r10:00000006 r9:d7722000 r8:c01011c4 r7:00000000 r6:c0ebac6c r5:d892a340 [ 78.501869] r4:d8a8a280 [ 78.512002] [<c02641d4>] (____fput) from [<c013ef1c>] (task_work_run+0x98/0xac) [ 78.527186] [<c013ee84>] (task_work_run) from [<c010cc54>] (do_work_pending+0x4f8/0x570) [ 78.543238] r7:d7722030 r6:00000004 r5:d7723fb0 r4:00000000 [ 78.556825] [<c010c75c>] (do_work_pending) from [<c0101034>] (slow_work_pending+0xc/0x20) [ 78.674256] ---[ end trace 70d3a60cf739be3b ]---
Fix by using drm_fb_helper_lastclose() which checks if fbdev is in use.
Fixes: 9060d7f49376 ("drm/fb-helper: Finish the generic fbdev emulation") Cc: stable@vger.kernel.org Signed-off-by: Noralf Trønnes noralf@tronnes.org Reviewed-by: Gerd Hoffmann kraxel@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/20190125150300.33268-1-noralf@... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/drm_fb_helper.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/drivers/gpu/drm/drm_fb_helper.c +++ b/drivers/gpu/drm/drm_fb_helper.c @@ -3170,9 +3170,7 @@ static void drm_fbdev_client_unregister(
static int drm_fbdev_client_restore(struct drm_client_dev *client) { - struct drm_fb_helper *fb_helper = drm_fb_helper_from_client(client); - - drm_fb_helper_restore_fbdev_mode_unlocked(fb_helper); + drm_fb_helper_lastclose(client->dev);
return 0; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gustavo A. R. Silva gustavo@embeddedor.com
commit cc5034a5d293dd620484d1d836aa16c6764a1c8c upstream.
Add missing break statement in order to prevent the code from falling through to case CB_TARGET_MASK.
This bug was found thanks to the ongoing efforts to enable -Wimplicit-fallthrough.
Fixes: dd220a00e8bd ("drm/radeon/kms: add support for streamout v7") Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva gustavo@embeddedor.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/radeon/evergreen_cs.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/gpu/drm/radeon/evergreen_cs.c +++ b/drivers/gpu/drm/radeon/evergreen_cs.c @@ -1299,6 +1299,7 @@ static int evergreen_cs_handle_reg(struc return -EINVAL; } ib[idx] += (u32)((reloc->gpu_offset >> 8) & 0xffffffff); + break; case CB_TARGET_MASK: track->cb_target_mask = radeon_get_ib_value(p, idx); track->cb_dirty = true;
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Evan Quan evan.quan@amd.com
commit f5742ec36422a39b57f0256e4847f61b3c432f8c upstream.
Set sampling period as 500ms to provide a smooth power reading output. Also, correct the register for power reading.
Signed-off-by: Evan Quan evan.quan@amd.com Reviewed-by: Feifei Xu Feifei.Xu@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c @@ -3491,14 +3491,14 @@ static int smu7_get_gpu_power(struct pp_
smum_send_msg_to_smc(hwmgr, PPSMC_MSG_PmStatusLogStart); cgs_write_ind_register(hwmgr->device, CGS_IND_REG__SMC, - ixSMU_PM_STATUS_94, 0); + ixSMU_PM_STATUS_95, 0);
for (i = 0; i < 10; i++) { - mdelay(1); + mdelay(500); smum_send_msg_to_smc(hwmgr, PPSMC_MSG_PmStatusLogSample); tmp = cgs_read_ind_register(hwmgr->device, CGS_IND_REG__SMC, - ixSMU_PM_STATUS_94); + ixSMU_PM_STATUS_95); if (tmp != 0) break; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Harry Wentland harry.wentland@amd.com
commit 59d3191f14dc18881fec1172c7096b7863622803 upstream.
Powerplay functions called from dm_pp_* functions tend to do a mutex_lock which isn't safe to do inside a kernel_fpu_begin/end block as those will disable/enable preemption.
Rearrange the dm_pp_get_clock_levels_by_type_with_voltage calls to make sure they happen outside of kernel_fpu_begin/end.
Cc: stable@vger.kernel.org Acked-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Harry Wentland harry.wentland@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c +++ b/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c @@ -1355,12 +1355,12 @@ void dcn_bw_update_from_pplib(struct dc struct dm_pp_clock_levels_with_voltage fclks = {0}, dcfclks = {0}; bool res;
- kernel_fpu_begin(); - /* TODO: This is not the proper way to obtain fabric_and_dram_bandwidth, should be min(fclk, memclk) */ res = dm_pp_get_clock_levels_by_type_with_voltage( ctx, DM_PP_CLOCK_TYPE_FCLK, &fclks);
+ kernel_fpu_begin(); + if (res) res = verify_clock_values(&fclks);
@@ -1379,9 +1379,13 @@ void dcn_bw_update_from_pplib(struct dc } else BREAK_TO_DEBUGGER();
+ kernel_fpu_end(); + res = dm_pp_get_clock_levels_by_type_with_voltage( ctx, DM_PP_CLOCK_TYPE_DCFCLK, &dcfclks);
+ kernel_fpu_begin(); + if (res) res = verify_clock_values(&dcfclks);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson sean.j.christopherson@intel.com
commit 152482580a1b0accb60676063a1ac57b2d12daf6 upstream.
kvm_arch_memslots_updated() is at this point in time an x86-specific hook for handling MMIO generation wraparound. x86 stashes 19 bits of the memslots generation number in its MMIO sptes in order to avoid full page fault walks for repeat faults on emulated MMIO addresses. Because only 19 bits are used, wrapping the MMIO generation number is possible, if unlikely. kvm_arch_memslots_updated() alerts x86 that the generation has changed so that it can invalidate all MMIO sptes in case the effective MMIO generation has wrapped so as to avoid using a stale spte, e.g. a (very) old spte that was created with generation==0.
Given that the purpose of kvm_arch_memslots_updated() is to prevent consuming stale entries, it needs to be called before the new generation is propagated to memslots. Invalidating the MMIO sptes after updating memslots means that there is a window where a vCPU could dereference the new memslots generation, e.g. 0, and incorrectly reuse an old MMIO spte that was created with (pre-wrap) generation==0.
Fixes: e59dbe09f8e6 ("KVM: Introduce kvm_arch_memslots_updated()") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/mips/include/asm/kvm_host.h | 2 +- arch/powerpc/include/asm/kvm_host.h | 2 +- arch/s390/include/asm/kvm_host.h | 2 +- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/kvm/mmu.c | 4 ++-- arch/x86/kvm/x86.c | 4 ++-- include/linux/kvm_host.h | 2 +- virt/kvm/arm/mmu.c | 2 +- virt/kvm/kvm_main.c | 7 +++++-- 9 files changed, 15 insertions(+), 12 deletions(-)
--- a/arch/mips/include/asm/kvm_host.h +++ b/arch/mips/include/asm/kvm_host.h @@ -1134,7 +1134,7 @@ static inline void kvm_arch_hardware_uns static inline void kvm_arch_sync_events(struct kvm *kvm) {} static inline void kvm_arch_free_memslot(struct kvm *kvm, struct kvm_memory_slot *free, struct kvm_memory_slot *dont) {} -static inline void kvm_arch_memslots_updated(struct kvm *kvm, struct kvm_memslots *slots) {} +static inline void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen) {} static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {} static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {} static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {} --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -837,7 +837,7 @@ struct kvm_vcpu_arch { static inline void kvm_arch_hardware_disable(void) {} static inline void kvm_arch_hardware_unsetup(void) {} static inline void kvm_arch_sync_events(struct kvm *kvm) {} -static inline void kvm_arch_memslots_updated(struct kvm *kvm, struct kvm_memslots *slots) {} +static inline void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen) {} static inline void kvm_arch_flush_shadow_all(struct kvm *kvm) {} static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {} static inline void kvm_arch_exit(void) {} --- a/arch/s390/include/asm/kvm_host.h +++ b/arch/s390/include/asm/kvm_host.h @@ -878,7 +878,7 @@ static inline void kvm_arch_vcpu_uninit( static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {} static inline void kvm_arch_free_memslot(struct kvm *kvm, struct kvm_memory_slot *free, struct kvm_memory_slot *dont) {} -static inline void kvm_arch_memslots_updated(struct kvm *kvm, struct kvm_memslots *slots) {} +static inline void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen) {} static inline void kvm_arch_flush_shadow_all(struct kvm *kvm) {} static inline void kvm_arch_flush_shadow_memslot(struct kvm *kvm, struct kvm_memory_slot *slot) {} --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1255,7 +1255,7 @@ void kvm_mmu_clear_dirty_pt_masked(struc struct kvm_memory_slot *slot, gfn_t gfn_offset, unsigned long mask); void kvm_mmu_zap_all(struct kvm *kvm); -void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, struct kvm_memslots *slots); +void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, u64 gen); unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm); void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages);
--- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -5891,13 +5891,13 @@ static bool kvm_has_zapped_obsolete_page return unlikely(!list_empty_careful(&kvm->arch.zapped_obsolete_pages)); }
-void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, struct kvm_memslots *slots) +void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, u64 gen) { /* * The very rare case: if the generation-number is round, * zap all shadow pages. */ - if (unlikely((slots->generation & MMIO_GEN_MASK) == 0)) { + if (unlikely((gen & MMIO_GEN_MASK) == 0)) { kvm_debug_ratelimited("kvm: zapping shadow pages for mmio generation wraparound\n"); kvm_mmu_invalidate_zap_all_pages(kvm); } --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9348,13 +9348,13 @@ out_free: return -ENOMEM; }
-void kvm_arch_memslots_updated(struct kvm *kvm, struct kvm_memslots *slots) +void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen) { /* * memslots->generation has been incremented. * mmio generation may have reached its maximum value. */ - kvm_mmu_invalidate_mmio_sptes(kvm, slots); + kvm_mmu_invalidate_mmio_sptes(kvm, gen); }
int kvm_arch_prepare_memory_region(struct kvm *kvm, --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -634,7 +634,7 @@ void kvm_arch_free_memslot(struct kvm *k struct kvm_memory_slot *dont); int kvm_arch_create_memslot(struct kvm *kvm, struct kvm_memory_slot *slot, unsigned long npages); -void kvm_arch_memslots_updated(struct kvm *kvm, struct kvm_memslots *slots); +void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen); int kvm_arch_prepare_memory_region(struct kvm *kvm, struct kvm_memory_slot *memslot, const struct kvm_userspace_memory_region *mem, --- a/virt/kvm/arm/mmu.c +++ b/virt/kvm/arm/mmu.c @@ -2353,7 +2353,7 @@ int kvm_arch_create_memslot(struct kvm * return 0; }
-void kvm_arch_memslots_updated(struct kvm *kvm, struct kvm_memslots *slots) +void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen) { }
--- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -874,6 +874,7 @@ static struct kvm_memslots *install_new_ int as_id, struct kvm_memslots *slots) { struct kvm_memslots *old_memslots = __kvm_memslots(kvm, as_id); + u64 gen;
/* * Set the low bit in the generation, which disables SPTE caching @@ -896,9 +897,11 @@ static struct kvm_memslots *install_new_ * space 0 will use generations 0, 4, 8, ... while * address space 1 will * use generations 2, 6, 10, 14, ... */ - slots->generation += KVM_ADDRESS_SPACE_NUM * 2 - 1; + gen = slots->generation + KVM_ADDRESS_SPACE_NUM * 2 - 1;
- kvm_arch_memslots_updated(kvm, slots); + kvm_arch_memslots_updated(kvm, gen); + + slots->generation = gen;
return old_memslots; }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson sean.j.christopherson@intel.com
commit 61c08aa9606d4e48a8a50639c956448a720174c3 upstream.
The vCPU-run asm blob does a manual comparison of a VMCS' launched status to execute the correct VM-Enter instruction, i.e. VMLAUNCH vs. VMRESUME. The launched flag is a bool, which is a typedef of _Bool. C99 does not define an exact size for _Bool, stating only that is must be large enough to hold '0' and '1'. Most, if not all, compilers use a single byte for _Bool, including gcc[1].
Originally, 'launched' was of type 'int' and so the asm blob used 'cmpl' to check the launch status. When 'launched' was moved to be stored on a per-VMCS basis, struct vcpu_vmx's "temporary" __launched flag was added in order to avoid having to pass the current VMCS into the asm blob. The new '__launched' was defined as a 'bool' and not an 'int', but the 'cmp' instruction was not updated.
This has not caused any known problems, likely due to compilers aligning variables to 4-byte or 8-byte boundaries and KVM zeroing out struct vcpu_vmx during allocation. I.e. vCPU-run accesses "junk" data, it just happens to always be zero and so doesn't affect the result.
[1] https://gcc.gnu.org/ml/gcc-patches/2000-10/msg01127.html
Fixes: d462b8192368 ("KVM: VMX: Keep list of loaded VMCSs, instead of vcpus") Cc: stable@vger.kernel.org Reviewed-by: Jim Mattson jmattson@google.com Reviewed-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/vmx/vmx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -6399,7 +6399,7 @@ static void __vmx_vcpu_run(struct kvm_vc "mov %%" _ASM_AX", %%cr2 \n\t" "3: \n\t" /* Check if vmlaunch or vmresume is needed */ - "cmpl $0, %c[launched](%%" _ASM_CX ") \n\t" + "cmpb $0, %c[launched](%%" _ASM_CX ") \n\t" /* Load guest registers. Don't clobber flags. */ "mov %c[rax](%%" _ASM_CX "), %%" _ASM_AX " \n\t" "mov %c[rbx](%%" _ASM_CX "), %%" _ASM_BX " \n\t"
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson sean.j.christopherson@intel.com
commit 0e0ab73c9a0243736bcd779b30b717e23ba9a56d upstream.
...except RSP, which is restored by hardware as part of VM-Exit.
Paolo theorized that restoring registers from the stack after a VM-Exit in lieu of zeroing them could lead to speculative execution with the guest's values, e.g. if the stack accesses miss the L1 cache[1]. Zeroing XORs are dirt cheap, so just be ultra-paranoid.
Note that the scratch register (currently RCX) used to save/restore the guest state is also zeroed as its host-defined value is loaded via the stack, just with a MOV instead of a POP.
[1] https://patchwork.kernel.org/patch/10771539/#22441255
Fixes: 0cb5b30698fd ("kvm: vmx: Scrub hardware GPRs at VM-exit") Cc: stable@vger.kernel.org Cc: Jim Mattson jmattson@google.com Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/vmx/vmx.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)
--- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -6449,10 +6449,15 @@ static void __vmx_vcpu_run(struct kvm_vc "mov %%r13, %c[r13](%%" _ASM_CX ") \n\t" "mov %%r14, %c[r14](%%" _ASM_CX ") \n\t" "mov %%r15, %c[r15](%%" _ASM_CX ") \n\t" + /* - * Clear host registers marked as clobbered to prevent - * speculative use. - */ + * Clear all general purpose registers (except RSP, which is loaded by + * the CPU during VM-Exit) to prevent speculative use of the guest's + * values, even those that are saved/loaded via the stack. In theory, + * an L1 cache miss when restoring registers could lead to speculative + * execution with the guest's values. Zeroing XORs are dirt cheap, + * i.e. the extra paranoia is essentially free. + */ "xor %%r8d, %%r8d \n\t" "xor %%r9d, %%r9d \n\t" "xor %%r10d, %%r10d \n\t" @@ -6467,8 +6472,11 @@ static void __vmx_vcpu_run(struct kvm_vc
"xor %%eax, %%eax \n\t" "xor %%ebx, %%ebx \n\t" + "xor %%ecx, %%ecx \n\t" + "xor %%edx, %%edx \n\t" "xor %%esi, %%esi \n\t" "xor %%edi, %%edi \n\t" + "xor %%ebp, %%ebp \n\t" "pop %%" _ASM_BP "; pop %%" _ASM_DX " \n\t" : ASM_CALL_CONSTRAINT : "c"(vmx), "d"((unsigned long)HOST_RSP), "S"(evmcs_rsp),
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson sean.j.christopherson@intel.com
commit e1359e2beb8b0a1188abc997273acbaedc8ee791 upstream.
The check to detect a wrap of the MMIO generation explicitly looks for a generation number of zero. Now that unique memslots generation numbers are assigned to each address space, only address space 0 will get a generation number of exactly zero when wrapping. E.g. when address space 1 goes from 0x7fffe to 0x80002, the MMIO generation number will wrap to 0x2. Adjust the MMIO generation to strip the address space modifier prior to checking for a wrap.
Fixes: 4bd518f1598d ("KVM: use separate generations for each address space") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/mmu.c | 21 +++++++++++++++++++-- 1 file changed, 19 insertions(+), 2 deletions(-)
--- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -5893,11 +5893,28 @@ static bool kvm_has_zapped_obsolete_page
void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, u64 gen) { + gen &= MMIO_GEN_MASK; + + /* + * Shift to eliminate the "update in-progress" flag, which isn't + * included in the spte's generation number. + */ + gen >>= 1; + + /* + * Generation numbers are incremented in multiples of the number of + * address spaces in order to provide unique generations across all + * address spaces. Strip what is effectively the address space + * modifier prior to checking for a wrap of the MMIO generation so + * that a wrap in any address space is detected. + */ + gen &= ~((u64)KVM_ADDRESS_SPACE_NUM - 1); + /* - * The very rare case: if the generation-number is round, + * The very rare case: if the MMIO generation number has wrapped, * zap all shadow pages. */ - if (unlikely((gen & MMIO_GEN_MASK) == 0)) { + if (unlikely(gen == 0)) { kvm_debug_ratelimited("kvm: zapping shadow pages for mmio generation wraparound\n"); kvm_mmu_invalidate_zap_all_pages(kvm); }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson sean.j.christopherson@intel.com
commit ddfd1730fd829743e41213e32ccc8b4aa6dc8325 upstream.
When installing new memslots, KVM sets bit 0 of the generation number to indicate that an update is in-progress. Until the update is complete, there are no guarantees as to whether a vCPU will see the old or the new memslots. Explicity prevent caching MMIO accesses so as to avoid using an access cached from the old memslots after the new memslots have been installed.
Note that it is unclear whether or not disabling caching during the update window is strictly necessary as there is no definitive documentation as to what ordering guarantees KVM provides with respect to updating memslots. That being said, the MMIO spte code does not allow reusing sptes created while an update is in-progress, and the associated documentation explicitly states:
We do not want to use an MMIO sptes created with an odd generation number, ... If KVM is unlucky and creates an MMIO spte while the low bit is 1, the next access to the spte will always be a cache miss.
At the very least, disabling the per-vCPU MMIO cache during updates will make its behavior consistent with the MMIO spte behavior and documentation.
Fixes: 56f17dd3fbc4 ("kvm: x86: fix stale mmio cache bug") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/x86.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -181,6 +181,11 @@ static inline bool emul_is_noncanonical_ static inline void vcpu_cache_mmio_info(struct kvm_vcpu *vcpu, gva_t gva, gfn_t gfn, unsigned access) { + u64 gen = kvm_memslots(vcpu->kvm)->generation; + + if (unlikely(gen & 1)) + return; + /* * If this is a shadow nested page table, the "GVA" is * actually a nGPA. @@ -188,7 +193,7 @@ static inline void vcpu_cache_mmio_info( vcpu->arch.mmio_gva = mmu_is_nested(vcpu) ? 0 : gva & PAGE_MASK; vcpu->arch.access = access; vcpu->arch.mmio_gfn = gfn; - vcpu->arch.mmio_gen = kvm_memslots(vcpu->kvm)->generation; + vcpu->arch.mmio_gen = gen; }
static inline bool vcpu_match_mmio_gen(struct kvm_vcpu *vcpu)
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson sean.j.christopherson@intel.com
commit 946c522b603f281195af1df91837a1d4d1eb3bc9 upstream.
The VMCS.EXIT_QUALIFCATION field reports the displacements of memory operands for various instructions, including VMX instructions, as a naturally sized unsigned value, but masks the value by the addr size, e.g. given a ModRM encoded as -0x28(%ebp), the -0x28 displacement is reported as 0xffffffd8 for a 32-bit address size. Despite some weird wording regarding sign extension, the SDM explicitly states that bits beyond the instructions address size are undefined:
In all cases, bits of this field beyond the instruction’s address size are undefined.
Failure to sign extend the displacement results in KVM incorrectly treating a negative displacement as a large positive displacement when the address size of the VMX instruction is smaller than KVM's native size, e.g. a 32-bit address size on a 64-bit KVM.
The very original decoding, added by commit 064aea774768 ("KVM: nVMX: Decoding memory operands of VMX instructions"), sort of modeled sign extension by truncating the final virtual/linear address for a 32-bit address size. I.e. it messed up the effective address but made it work by adjusting the final address.
When segmentation checks were added, the truncation logic was kept as-is and no sign extension logic was introduced. In other words, it kept calculating the wrong effective address while mostly generating the correct virtual/linear address. As the effective address is what's used in the segment limit checks, this results in KVM incorreclty injecting #GP/#SS faults due to non-existent segment violations when a nested VMM uses negative displacements with an address size smaller than KVM's native address size.
Using the -0x28(%ebp) example, an EBP value of 0x1000 will result in KVM using 0x100000fd8 as the effective address when checking for a segment limit violation. This causes a 100% failure rate when running a 32-bit KVM build as L1 on top of a 64-bit KVM L0.
Fixes: f9eb4af67c9d ("KVM: nVMX: VMX instructions: add checks for #GP/#SS exceptions") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/vmx/nested.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -4035,6 +4035,10 @@ int get_vmx_mem_address(struct kvm_vcpu /* Addr = segment_base + offset */ /* offset = base + [index * scale] + displacement */ off = exit_qualification; /* holds the displacement */ + if (addr_size == 1) + off = (gva_t)sign_extend64(off, 31); + else if (addr_size == 0) + off = (gva_t)sign_extend64(off, 15); if (base_is_valid) off += kvm_register_read(vcpu, base_reg); if (index_is_valid)
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson sean.j.christopherson@intel.com
commit 8570f9e881e3fde98801bb3a47eef84dd934d405 upstream.
The address size of an instruction affects the effective address, not the virtual/linear address. The final address may still be truncated, e.g. to 32-bits outside of long mode, but that happens irrespective of the address size, e.g. a 32-bit address size can yield a 64-bit virtual address when using FS/GS with a non-zero base.
Fixes: 064aea774768 ("KVM: nVMX: Decoding memory operands of VMX instructions") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/vmx/nested.c | 25 +++++++++++++++++++++++-- 1 file changed, 23 insertions(+), 2 deletions(-)
--- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -4044,20 +4044,41 @@ int get_vmx_mem_address(struct kvm_vcpu if (index_is_valid) off += kvm_register_read(vcpu, index_reg)<<scaling; vmx_get_segment(vcpu, &s, seg_reg); - *ret = s.base + off;
+ /* + * The effective address, i.e. @off, of a memory operand is truncated + * based on the address size of the instruction. Note that this is + * the *effective address*, i.e. the address prior to accounting for + * the segment's base. + */ if (addr_size == 1) /* 32 bit */ - *ret &= 0xffffffff; + off &= 0xffffffff; + else if (addr_size == 0) /* 16 bit */ + off &= 0xffff;
/* Checks for #GP/#SS exceptions. */ exn = false; if (is_long_mode(vcpu)) { + /* + * The virtual/linear address is never truncated in 64-bit + * mode, e.g. a 32-bit address size can yield a 64-bit virtual + * address when using FS/GS with a non-zero base. + */ + *ret = s.base + off; + /* Long mode: #GP(0)/#SS(0) if the memory address is in a * non-canonical form. This is the only check on the memory * destination for long mode! */ exn = is_noncanonical_address(*ret, vcpu); } else if (is_protmode(vcpu)) { + /* + * When not in long mode, the virtual/linear address is + * unconditionally truncated to 32 bits regardless of the + * address size. + */ + *ret = (s.base + off) & 0xffffffff; + /* Protected mode: apply checks for segment validity in the * following order: * - segment type check (#GP(0) may be thrown)
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson sean.j.christopherson@intel.com
commit 34333cc6c2cb021662fd32e24e618d1b86de95bf upstream.
Regarding segments with a limit==0xffffffff, the SDM officially states:
When the effective limit is FFFFFFFFH (4 GBytes), these accesses may or may not cause the indicated exceptions. Behavior is implementation-specific and may vary from one execution to another.
In practice, all CPUs that support VMX ignore limit checks for "flat segments", i.e. an expand-up data or code segment with base=0 and limit=0xffffffff. This is subtly different than wrapping the effective address calculation based on the address size, as the flat segment behavior also applies to accesses that would wrap the 4g boundary, e.g. a 4-byte access starting at 0xffffffff will access linear addresses 0xffffffff, 0x0, 0x1 and 0x2.
Fixes: f9eb4af67c9d ("KVM: nVMX: VMX instructions: add checks for #GP/#SS exceptions") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/vmx/nested.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
--- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -4102,10 +4102,16 @@ int get_vmx_mem_address(struct kvm_vcpu /* Protected mode: #GP(0)/#SS(0) if the segment is unusable. */ exn = (s.unusable != 0); - /* Protected mode: #GP(0)/#SS(0) if the memory - * operand is outside the segment limit. + + /* + * Protected mode: #GP(0)/#SS(0) if the memory operand is + * outside the segment limit. All CPUs that support VMX ignore + * limit checks for flat segments, i.e. segments with base==0, + * limit==0xffffffff and of type expand-up data or code. */ - exn = exn || (off + sizeof(u64) > s.limit); + if (!(s.base == 0 && s.limit == 0xffffffff && + ((s.type & 8) || !(s.type & 4)))) + exn = exn || (off + sizeof(u64) > s.limit); } if (exn) { kvm_queue_exception_e(vcpu,
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson sean.j.christopherson@intel.com
commit 1ce072cbfd8dba46f117804850398e0b3040a541 upstream.
Nested early checks does a manual comparison of a VMCS' launched status in its asm blob to execute the correct VM-Enter instruction, i.e. VMLAUNCH vs. VMRESUME. The launched flag is a bool, which is a typedef of _Bool. C99 does not define an exact size for _Bool, stating only that is must be large enough to hold '0' and '1'. Most, if not all, compilers use a single byte for _Bool, including gcc[1].
The use of 'cmpl' instead of 'cmpb' was not deliberate, but rather the result of a copy-paste as the asm blob was directly derived from the asm blob for vCPU-run.
This has not caused any known problems, likely due to compilers aligning variables to 4-byte or 8-byte boundaries and KVM zeroing out struct vcpu_vmx during allocation. I.e. vCPU-run accesses "junk" data, it just happens to always be zero and so doesn't affect the result.
[1] https://gcc.gnu.org/ml/gcc-patches/2000-10/msg01127.html
Fixes: 52017608da33 ("KVM: nVMX: add option to perform early consistency checks via H/W") Cc: stable@vger.kernel.org Reviewed-by: Jim Mattson jmattson@google.com Reviewed-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/vmx/nested.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -2765,7 +2765,7 @@ static int nested_vmx_check_vmentry_hw(s "add $%c[wordsize], %%" _ASM_SP "\n\t" /* un-adjust RSP */
/* Check if vmlaunch or vmresume is needed */ - "cmpl $0, %c[launched](%% " _ASM_CX")\n\t" + "cmpb $0, %c[launched](%% " _ASM_CX")\n\t"
"call vmx_vmenter\n\t"
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit aed13f2e00ce278f039b76e7ac84d419aff48ef6 upstream.
Make sure to disable and deregister the switch on late probe errors to avoid use-after-free when the device-resource-managed switch is freed.
Fixes: 14fceff4771e ("net: dsa: Add Lantiq / Intel DSA driver for vrx200") Cc: stable stable@vger.kernel.org # 4.20 Cc: Hauke Mehrtens hauke@hauke-m.de Signed-off-by: Johan Hovold johan@kernel.org Reviewed-by: Andrew Lunn andrew@lunn.ch Acked-by: Hauke Mehrtens hauke@hauke-m.de Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/dsa/lantiq_gswip.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/net/dsa/lantiq_gswip.c +++ b/drivers/net/dsa/lantiq_gswip.c @@ -1099,7 +1099,7 @@ static int gswip_probe(struct platform_d dev_err(dev, "wrong CPU port defined, HW only supports port: %i", priv->hw_info->cpu_port); err = -EINVAL; - goto mdio_bus; + goto disable_switch; }
platform_set_drvdata(pdev, priv); @@ -1109,6 +1109,9 @@ static int gswip_probe(struct platform_d (version & GSWIP_VERSION_MOD_MASK) >> GSWIP_VERSION_MOD_SHIFT); return 0;
+disable_switch: + gswip_mdio_mask(priv, GSWIP_MDIO_GLOB_ENABLE, 0, GSWIP_MDIO_GLOB); + dsa_unregister_switch(priv->ds); mdio_bus: if (mdio_np) mdiobus_unregister(priv->ds->slave_mii_bus);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit c8cbcb0d8bd72d44fad1a5ddc348ac10e0fb1b37 upstream.
Use the new of_get_compatible_child() helper to look up child nodes to avoid ever matching non-child nodes elsewhere in the tree.
Also fix up the related struct device_node leaks.
Fixes: 14fceff4771e ("net: dsa: Add Lantiq / Intel DSA driver for vrx200") Cc: stable stable@vger.kernel.org # 4.20 Cc: Hauke Mehrtens hauke@hauke-m.de Signed-off-by: Johan Hovold johan@kernel.org Reviewed-by: Andrew Lunn andrew@lunn.ch Acked-by: Hauke Mehrtens hauke@hauke-m.de Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/dsa/lantiq_gswip.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-)
--- a/drivers/net/dsa/lantiq_gswip.c +++ b/drivers/net/dsa/lantiq_gswip.c @@ -1069,10 +1069,10 @@ static int gswip_probe(struct platform_d version = gswip_switch_r(priv, GSWIP_VERSION);
/* bring up the mdio bus */ - gphy_fw_np = of_find_compatible_node(pdev->dev.of_node, NULL, - "lantiq,gphy-fw"); + gphy_fw_np = of_get_compatible_child(dev->of_node, "lantiq,gphy-fw"); if (gphy_fw_np) { err = gswip_gphy_fw_list(priv, gphy_fw_np, version); + of_node_put(gphy_fw_np); if (err) { dev_err(dev, "gphy fw probe failed\n"); return err; @@ -1080,13 +1080,12 @@ static int gswip_probe(struct platform_d }
/* bring up the mdio bus */ - mdio_np = of_find_compatible_node(pdev->dev.of_node, NULL, - "lantiq,xrx200-mdio"); + mdio_np = of_get_compatible_child(dev->of_node, "lantiq,xrx200-mdio"); if (mdio_np) { err = gswip_mdio(priv, mdio_np); if (err) { dev_err(dev, "mdio probe failed\n"); - goto gphy_fw; + goto put_mdio_node; } }
@@ -1115,7 +1114,8 @@ disable_switch: mdio_bus: if (mdio_np) mdiobus_unregister(priv->ds->slave_mii_bus); -gphy_fw: +put_mdio_node: + of_node_put(mdio_np); for (i = 0; i < priv->num_gphy_fw; i++) gswip_gphy_fw_remove(priv, &priv->gphy_fw[i]); return err; @@ -1134,8 +1134,10 @@ static int gswip_remove(struct platform_
dsa_unregister_switch(priv->ds);
- if (priv->ds->slave_mii_bus) + if (priv->ds->slave_mii_bus) { mdiobus_unregister(priv->ds->slave_mii_bus); + of_node_put(priv->ds->slave_mii_bus->dev.of_node); + }
for (i = 0; i < priv->num_gphy_fw; i++) gswip_gphy_fw_remove(priv, &priv->gphy_fw[i]);
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Martin Schwidefsky schwidefsky@de.ibm.com
commit 86a86804e4f18fc3880541b3d5a07f4df0fe29cb upstream.
The fix to make WARN work in the early boot code created a problem on older machines without EDAT-1. The setup_lowcore_dat_on function uses the pointer from lowcore_ptr[0] to set the DAT bit in the new PSWs. That does not work if the kernel page table is set up with 4K pages as the prefix address maps to absolute zero.
To make this work the PSWs need to be changed with via address 0 in form of the S390_lowcore definition.
Reported-by: Guenter Roeck linux@roeck-us.net Tested-by: Cornelia Huck cohuck@redhat.com Fixes: 94f85ed3e2f8 ("s390/setup: fix early warning messages") Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/s390/kernel/setup.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-)
--- a/arch/s390/kernel/setup.c +++ b/arch/s390/kernel/setup.c @@ -451,13 +451,12 @@ static void __init setup_lowcore_dat_off
static void __init setup_lowcore_dat_on(void) { - struct lowcore *lc; - - lc = lowcore_ptr[0]; - lc->external_new_psw.mask |= PSW_MASK_DAT; - lc->svc_new_psw.mask |= PSW_MASK_DAT; - lc->program_new_psw.mask |= PSW_MASK_DAT; - lc->io_new_psw.mask |= PSW_MASK_DAT; + __ctl_clear_bit(0, 28); + S390_lowcore.external_new_psw.mask |= PSW_MASK_DAT; + S390_lowcore.svc_new_psw.mask |= PSW_MASK_DAT; + S390_lowcore.program_new_psw.mask |= PSW_MASK_DAT; + S390_lowcore.io_new_psw.mask |= PSW_MASK_DAT; + __ctl_set_bit(0, 28); }
static struct resource code_resource = {
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust trond.myklebust@hammerspace.com
commit ed7dc973bd91da234d93aff6d033a5206a6c9885 upstream.
If the socket is not connected, then we want to initiate a reconnect rather that trying to transmit requests. If there is a large number of requests queued and waiting for the lock in call_transmit(), then it can take a while for one of the to loop back and retake the lock in call_connect.
Fixes: 89f90fe1ad8b ("SUNRPC: Allow calls to xprt_transmit() to drain...") Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/sunrpc/clnt.c | 21 +++++++++++++++++---- 1 file changed, 17 insertions(+), 4 deletions(-)
--- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -1786,7 +1786,12 @@ call_encode(struct rpc_task *task) xprt_request_enqueue_receive(task); xprt_request_enqueue_transmit(task); out: - task->tk_action = call_bind; + task->tk_action = call_transmit; + /* Check that the connection is OK */ + if (!xprt_bound(task->tk_xprt)) + task->tk_action = call_bind; + else if (!xprt_connected(task->tk_xprt)) + task->tk_action = call_connect; }
/* @@ -1978,13 +1983,19 @@ call_transmit(struct rpc_task *task) { dprint_status(task);
- task->tk_status = 0; + task->tk_action = call_transmit_status; if (test_bit(RPC_TASK_NEED_XMIT, &task->tk_runstate)) { if (!xprt_prepare_transmit(task)) return; - xprt_transmit(task); + task->tk_status = 0; + if (test_bit(RPC_TASK_NEED_XMIT, &task->tk_runstate)) { + if (!xprt_connected(task->tk_xprt)) { + task->tk_status = -ENOTCONN; + return; + } + xprt_transmit(task); + } } - task->tk_action = call_transmit_status; xprt_end_transmit(task); }
@@ -2046,6 +2057,8 @@ call_transmit_status(struct rpc_task *ta case -EADDRINUSE: case -ENOTCONN: case -EPIPE: + task->tk_action = call_bind; + task->tk_status = 0; break; } }
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust trond.myklebust@hammerspace.com
commit 477687e1116ad16180caf8633dd830b296a5ce73 upstream.
Now that transmissions happen through a queue, we require the RPC tasks to handle error conditions that may have been set while they were sleeping. The back channel does not currently do this, but assumes that any error condition happens during its own call to xprt_transmit().
The solution is to ensure that the back channel splits out the error handling just like the forward channel does.
Fixes: 89f90fe1ad8b ("SUNRPC: Allow calls to xprt_transmit() to drain...") Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/sunrpc/clnt.c | 61 +++++++++++++++++++++++++++++------------------------- 1 file changed, 33 insertions(+), 28 deletions(-)
--- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -66,9 +66,6 @@ static void call_decode(struct rpc_task static void call_bind(struct rpc_task *task); static void call_bind_status(struct rpc_task *task); static void call_transmit(struct rpc_task *task); -#if defined(CONFIG_SUNRPC_BACKCHANNEL) -static void call_bc_transmit(struct rpc_task *task); -#endif /* CONFIG_SUNRPC_BACKCHANNEL */ static void call_status(struct rpc_task *task); static void call_transmit_status(struct rpc_task *task); static void call_refresh(struct rpc_task *task); @@ -1131,6 +1128,8 @@ rpc_call_async(struct rpc_clnt *clnt, co EXPORT_SYMBOL_GPL(rpc_call_async);
#if defined(CONFIG_SUNRPC_BACKCHANNEL) +static void call_bc_encode(struct rpc_task *task); + /** * rpc_run_bc_task - Allocate a new RPC task for backchannel use, then run * rpc_execute against it @@ -1152,7 +1151,7 @@ struct rpc_task *rpc_run_bc_task(struct task = rpc_new_task(&task_setup_data); xprt_init_bc_request(req, task);
- task->tk_action = call_bc_transmit; + task->tk_action = call_bc_encode; atomic_inc(&task->tk_count); WARN_ON_ONCE(atomic_read(&task->tk_count) != 2); rpc_execute(task); @@ -2064,6 +2063,16 @@ call_transmit_status(struct rpc_task *ta }
#if defined(CONFIG_SUNRPC_BACKCHANNEL) +static void call_bc_transmit(struct rpc_task *task); +static void call_bc_transmit_status(struct rpc_task *task); + +static void +call_bc_encode(struct rpc_task *task) +{ + xprt_request_enqueue_transmit(task); + task->tk_action = call_bc_transmit; +} + /* * 5b. Send the backchannel RPC reply. On error, drop the reply. In * addition, disconnect on connectivity errors. @@ -2071,26 +2080,23 @@ call_transmit_status(struct rpc_task *ta static void call_bc_transmit(struct rpc_task *task) { - struct rpc_rqst *req = task->tk_rqstp; - - if (rpc_task_need_encode(task)) - xprt_request_enqueue_transmit(task); - if (!test_bit(RPC_TASK_NEED_XMIT, &task->tk_runstate)) - goto out_wakeup; - - if (!xprt_prepare_transmit(task)) - goto out_retry; - - if (task->tk_status < 0) { - printk(KERN_NOTICE "RPC: Could not send backchannel reply " - "error: %d\n", task->tk_status); - goto out_done; + task->tk_action = call_bc_transmit_status; + if (test_bit(RPC_TASK_NEED_XMIT, &task->tk_runstate)) { + if (!xprt_prepare_transmit(task)) + return; + task->tk_status = 0; + xprt_transmit(task); } + xprt_end_transmit(task); +}
- xprt_transmit(task); +static void +call_bc_transmit_status(struct rpc_task *task) +{ + struct rpc_rqst *req = task->tk_rqstp;
- xprt_end_transmit(task); dprint_status(task); + switch (task->tk_status) { case 0: /* Success */ @@ -2104,8 +2110,14 @@ call_bc_transmit(struct rpc_task *task) case -ENOTCONN: case -EPIPE: break; + case -ENOBUFS: + rpc_delay(task, HZ>>2); + /* fall through */ + case -EBADSLT: case -EAGAIN: - goto out_retry; + task->tk_status = 0; + task->tk_action = call_bc_transmit; + return; case -ETIMEDOUT: /* * Problem reaching the server. Disconnect and let the @@ -2124,18 +2136,11 @@ call_bc_transmit(struct rpc_task *task) * We were unable to reply and will have to drop the * request. The server should reconnect and retransmit. */ - WARN_ON_ONCE(task->tk_status == -EAGAIN); printk(KERN_NOTICE "RPC: Could not send backchannel reply " "error: %d\n", task->tk_status); break; } -out_wakeup: - rpc_wake_up_queued_task(&req->rq_xprt->pending, task); -out_done: task->tk_action = rpc_exit_task; - return; -out_retry: - task->tk_status = 0; } #endif /* CONFIG_SUNRPC_BACKCHANNEL */
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust trond.myklebust@hammerspace.com
commit 7b3fef8e4157ed424bcde039a60a730aa0dfb0eb upstream.
Fix a regression where soft and softconn requests are not timing out as expected.
Fixes: 89f90fe1ad8b ("SUNRPC: Allow calls to xprt_transmit() to drain...") Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/sunrpc/clnt.c | 42 ++++++++++++++++++++++++------------------ 1 file changed, 24 insertions(+), 18 deletions(-)
--- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -77,6 +77,7 @@ static void call_connect_status(struct r static __be32 *rpc_encode_header(struct rpc_task *task); static __be32 *rpc_verify_header(struct rpc_task *task); static int rpc_ping(struct rpc_clnt *clnt); +static void rpc_check_timeout(struct rpc_task *task);
static void rpc_register_client(struct rpc_clnt *clnt) { @@ -1941,8 +1942,7 @@ call_connect_status(struct rpc_task *tas break; if (clnt->cl_autobind) { rpc_force_rebind(clnt); - task->tk_action = call_bind; - return; + goto out_retry; } /* fall through */ case -ECONNRESET: @@ -1962,16 +1962,19 @@ call_connect_status(struct rpc_task *tas /* fall through */ case -ENOTCONN: case -EAGAIN: - /* Check for timeouts before looping back to call_bind */ case -ETIMEDOUT: - task->tk_action = call_timeout; - return; + goto out_retry; case 0: clnt->cl_stats->netreconn++; task->tk_action = call_transmit; return; } rpc_exit(task, status); + return; +out_retry: + /* Check for timeouts before looping back to call_bind */ + task->tk_action = call_bind; + rpc_check_timeout(task); }
/* @@ -2048,7 +2051,7 @@ call_transmit_status(struct rpc_task *ta trace_xprt_ping(task->tk_xprt, task->tk_status); rpc_exit(task, task->tk_status); - break; + return; } /* fall through */ case -ECONNRESET: @@ -2060,6 +2063,7 @@ call_transmit_status(struct rpc_task *ta task->tk_status = 0; break; } + rpc_check_timeout(task); }
#if defined(CONFIG_SUNRPC_BACKCHANNEL) @@ -2196,7 +2200,7 @@ call_status(struct rpc_task *task) case -EPIPE: case -ENOTCONN: case -EAGAIN: - task->tk_action = call_encode; + task->tk_action = call_timeout; break; case -EIO: /* shutdown or soft timeout */ @@ -2210,20 +2214,13 @@ call_status(struct rpc_task *task) } }
-/* - * 6a. Handle RPC timeout - * We do not release the request slot, so we keep using the - * same XID for all retransmits. - */ static void -call_timeout(struct rpc_task *task) +rpc_check_timeout(struct rpc_task *task) { struct rpc_clnt *clnt = task->tk_client;
- if (xprt_adjust_timeout(task->tk_rqstp) == 0) { - dprintk("RPC: %5u call_timeout (minor)\n", task->tk_pid); - goto retry; - } + if (xprt_adjust_timeout(task->tk_rqstp) == 0) + return;
dprintk("RPC: %5u call_timeout (major)\n", task->tk_pid); task->tk_timeouts++; @@ -2259,10 +2256,19 @@ call_timeout(struct rpc_task *task) * event? RFC2203 requires the server to drop all such requests. */ rpcauth_invalcred(task); +}
-retry: +/* + * 6a. Handle RPC timeout + * We do not release the request slot, so we keep using the + * same XID for all retransmits. + */ +static void +call_timeout(struct rpc_task *task) +{ task->tk_action = call_encode; task->tk_status = 0; + rpc_check_timeout(task); }
/*
stable-rc/linux-5.0.y boot: 55 boots: 1 failed, 54 passed (v5.0.3-239-g6a3b25ca9720)
Full Boot Summary: https://kernelci.org/boot/all/job/stable-rc/branch/linux-5.0.y/kernel/v5.0.3... Full Build Summary: https://kernelci.org/build/stable-rc/branch/linux-5.0.y/kernel/v5.0.3-239-g6...
Tree: stable-rc Branch: linux-5.0.y Git Describe: v5.0.3-239-g6a3b25ca9720 Git Commit: 6a3b25ca97204a3891527d88b6691b362fee82c8 Git URL: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git Tested: 32 unique boards, 16 SoC families, 10 builds out of 208
Boot Failure Detected:
arm64:
defconfig: gcc-7: meson-gxbb-p200: 1 failed lab
--- For more info write to info@kernelci.org
On 3/22/19 4:13 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.0.4 release. There are 238 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun Mar 24 11:11:13 UTC 2019. Anything received after that time might be too late.
Build results: total: 159 pass: 159 fail: 0 Qemu test results: total: 345 pass: 345 fail: 0
Guenter
On Fri, Mar 22, 2019 at 09:47:45PM -0700, Guenter Roeck wrote:
On 3/22/19 4:13 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.0.4 release. There are 238 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun Mar 24 11:11:13 UTC 2019. Anything received after that time might be too late.
Build results: total: 159 pass: 159 fail: 0 Qemu test results: total: 345 pass: 345 fail: 0
Wonderful, thanks for testing all 6 of these and letting me know.
greg k-h
On Fri, 22 Mar 2019 at 17:42, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.0.4 release. There are 238 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun Mar 24 11:11:13 UTC 2019. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.0.4-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.0.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Summary ------------------------------------------------------------------------
kernel: 5.0.4-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-5.0.y git commit: 6a3b25ca97204a3891527d88b6691b362fee82c8 git describe: v5.0.3-239-g6a3b25ca9720 Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-5.0-oe/build/v5.0.3-239-g...
No regressions (compared to build v5.0.3)
No fixes (compared to build v5.0.3)
Ran 23161 total tests in the following environments and test suites.
Environments -------------- - dragonboard-410c - hi6220-hikey - i386 - juno-r2 - qemu_arm - qemu_arm64 - qemu_i386 - qemu_x86_64 - x15 - x86
Test Suites ----------- * boot * install-android-platform-tools-r2600 * kselftest * libhugetlbfs * ltp-cap_bounds-tests * ltp-commands-tests * ltp-containers-tests * ltp-cpuhotplug-tests * ltp-cve-tests * ltp-dio-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-timers-tests * spectre-meltdown-checker-test * ltp-fs-tests * ltp-open-posix-tests * kselftest-vsyscall-mode-native * kselftest-vsyscall-mode-none
On Sat, Mar 23, 2019 at 11:17:52AM +0530, Naresh Kamboju wrote:
On Fri, 22 Mar 2019 at 17:42, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.0.4 release. There are 238 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun Mar 24 11:11:13 UTC 2019. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.0.4-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.0.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Great! Thanks for testing 5 of these and letting me know.
greg k-h
On 22/03/2019 11:13, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.0.4 release. There are 238 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun Mar 24 11:11:13 UTC 2019. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.0.4-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.0.y and the diffstat can be found below.
thanks,
greg k-h
All tests are passing for Tegra ...
Test results for stable-v5.0: 11 builds: 11 pass, 0 fail 22 boots: 22 pass, 0 fail 28 tests: 28 pass, 0 fail
Linux version: 5.0.4-rc1-g6a3b25c Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra20-ventana, tegra210-p2371-2180, tegra30-cardhu-a04
Cheers Jon
On Sun, Mar 24, 2019 at 11:58:56AM +0000, Jon Hunter wrote:
On 22/03/2019 11:13, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.0.4 release. There are 238 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun Mar 24 11:11:13 UTC 2019. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.0.4-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.0.y and the diffstat can be found below.
thanks,
greg k-h
All tests are passing for Tegra ...
Test results for stable-v5.0: 11 builds: 11 pass, 0 fail 22 boots: 22 pass, 0 fail 28 tests: 28 pass, 0 fail
Thank you for testing all of these and letting me know.
greg k-h
linux-stable-mirror@lists.linaro.org