This is the start of the stable review cycle for the 4.19.205 release. There are 84 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu 26 Aug 2021 05:02:47 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/p... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y and the diffstat can be found below.
Thanks, Sasha
------------- Pseudo-Shortlog of commits:
Adrian Larumbe (1): dmaengine: xilinx_dma: Fix read-after-free bug when terminating transfers
Andy Shevchenko (1): ptp_pch: Restore dependency on PCI
Babu Moger (1): x86/resctrl: Fix default monitoring groups reporting
Bixuan Cui (1): genirq/msi: Ensure deactivation on teardown
Chris Lesiak (1): iio: humidity: hdc100x: Add margin to the conversion time
Colin Ian King (1): iio: adc: Fix incorrect exit of for-loop
DENG Qingfang (1): net: dsa: mt7530: add the missing RxUnicast MIB counter
Dan Williams (1): ACPI: NFIT: Fix support for virtual SPA ranges
Dave Gerlach (1): ARM: dts: am43x-epos-evm: Reduce i2c0 bus speed for tps65218
Dinghao Liu (1): net: qlcnic: add missed unlock in qlcnic_83xx_flash_read32
Dongliang Mu (4): ieee802154: hwsim: fix GPF in hwsim_set_edge_lqi ieee802154: hwsim: fix GPF in hwsim_new_edge_nl ipack: tpci200: fix many double free issues in tpci200_pci_probe ipack: tpci200: fix memory leak in the tpci200_register
Eric Dumazet (2): net: igmp: fix data-race in igmp_ifc_timer_expire() net: igmp: increase size of mr_ifc_count
Greg Kroah-Hartman (1): i2c: dev: zero out array used for i2c reads from userspace
Harshvardhan Jha (1): scsi: megaraid_mm: Fix end of loop tests for list_for_each_entry()
Ivan T. Ivanov (1): net: usb: lan78xx: don't modify phy_device state concurrently
Jakub Kicinski (2): bnxt: don't lock the tx queue from napi poll bnxt: disable napi before canceling DIM
Jaroslav Kysela (1): ALSA: hda - fix the 'Capture Switch' value change notifications
Jeff Layton (2): locks: print a warning when mount fails due to lack of "mand" support fs: warn about impending deprecation of mandatory locks
Johannes Berg (1): mac80211: drop data frames without key on encrypted links
Jouni Malinen (5): ath: Use safer key clearing with key cache entries ath9k: Clear key cache explicitly on disabling hardware ath: Export ath_hw_keysetmac() ath: Modify ath_key_delete() to not need full key entry ath9k: Postpone key cache entry deletion for TXQ frames reference it
Longpeng(Mike) (1): vsock/virtio: avoid potential deadlock when vsock device remove
Marcin Bachry (1): PCI: Increase D3 delay for AMD Renoir/Cezanne XHCI
Marek Behún (1): cpufreq: armada-37xx: forbid cpufreq for 1.2 GHz variant
Maxim Levitsky (2): KVM: nSVM: always intercept VMLOAD/VMSAVE when nested (CVE-2021-3656) KVM: nSVM: avoid picking up unsupported bits from L2 in int_ctl (CVE-2021-3653)
Maximilian Heyne (1): xen/events: Fix race in set_evtchn_to_irq
Nathan Chancellor (1): vmlinux.lds.h: Handle clang's module.{c,d}tor sections
Neal Cardwell (1): tcp_bbr: fix u32 wrap bug in round logic if bbr_init() called after 2B packets
NeilBrown (1): btrfs: prevent rename2 from exchanging a subvol with a directory from different parents
Ole Bjørn Midtbø (1): Bluetooth: hidp: use correct wait queue when removing ctrl_wait
Pali Rohár (1): ppp: Fix generating ifname when empty IFLA_IFNAME is specified
Pavel Skripkin (1): net: 6pack: fix slab-out-of-bounds in decode_data
Peter Ujfalusi (1): dmaengine: of-dma: router_xlate to return -EPROBE_DEFER if controller is not yet available
Pu Lehui (1): powerpc/kprobes: Fix kprobe Oops happens in booke
Randy Dunlap (2): x86/tools: Fix objdump version check again dccp: add do-while-0 stubs for dccp_pr_debug macros
Richard Fitzgerald (5): ASoC: cs42l42: Correct definition of ADC Volume control ASoC: cs42l42: Don't allow SND_SOC_DAIFMT_LEFT_J ASoC: cs42l42: Fix inversion of ADC Notch Switch control ASoC: cs42l42: Remove duplicate control for WNF filter frequency ASoC: cs42l42: Fix LRCLK frame start edge
Roi Dayan (1): psample: Add a fwd declaration for skbuff
Saeed Mirzamohammadi (1): iommu/vt-d: Fix agaw for a supported 48 bit guest address width
Saravana Kannan (2): net: mdio-mux: Don't ignore memory allocation errors net: mdio-mux: Handle -EPROBE_DEFER correctly
Sasha Levin (1): Linux 4.19.205-rc1
Sergey Marinkevich (1): netfilter: nft_exthdr: fix endianness of tcp option cast
Sreekanth Reddy (1): scsi: core: Avoid printing an error if target_alloc() returns -ENXIO
Srinivas Kandagatla (3): slimbus: messaging: start transaction ids from 1 instead of zero slimbus: messaging: check for valid transaction id slimbus: ngd: reset dma setup during runtime pm
Steven Rostedt (VMware) (1): tracing / histogram: Fix NULL pointer dereference on strcmp() on NULL event name
Sudeep Holla (1): ARM: dts: nomadik: Fix up interrupt controller node names
Takashi Iwai (2): ASoC: intel: atom: Fix reference to PCM buffer address ASoC: intel: atom: Fix breakage for PCM buffer address setup
Takeshi Misawa (1): net: Fix memory leak in ieee802154_raw_deliver
Thomas Gleixner (12): genirq: Provide IRQCHIP_AFFINITY_PRE_STARTUP x86/msi: Force affinity setup before startup x86/ioapic: Force affinity setup before startup PCI/MSI: Enable and mask MSI-X early PCI/MSI: Do not set invalid bits in MSI mask PCI/MSI: Correct misleading comments PCI/MSI: Use msi_mask_irq() in pci_msi_shutdown() PCI/MSI: Protect msi_desc::masked for multi-MSI PCI/MSI: Mask all unused MSI-X entries PCI/MSI: Enforce that MSI-X table entry is masked for update PCI/MSI: Enforce MSI[X] entry updates to be visible x86/fpu: Make init_fpstate correct with optimized XSAVE
Vincent Whitchurch (1): mmc: dw_mmc: Fix hang on data CRC error
Vladimir Oltean (1): net: dsa: lan9303: fix broken backpressure in .port_fdb_dump
Xie Yongji (1): vhost: Fix the calculation in vhost_overflow()
Yang Yingliang (1): net: bridge: fix memleak in br_add_if()
Ye Bin (1): scsi: scsi_dh_rdac: Avoid crash during rdac_bus_attach()
Yu Kuai (1): dmaengine: usb-dmac: Fix PM reference leak in usb_dmac_probe()
.../filesystems/mandatory-locking.txt | 10 ++ Makefile | 4 +- arch/arm/boot/dts/am43x-epos-evm.dts | 2 +- arch/arm/boot/dts/ste-nomadik-stn8815.dtsi | 4 +- arch/powerpc/kernel/kprobes.c | 3 +- arch/x86/include/asm/fpu/internal.h | 30 ++--- arch/x86/include/asm/svm.h | 2 + arch/x86/kernel/apic/io_apic.c | 6 +- arch/x86/kernel/apic/msi.c | 13 +- arch/x86/kernel/cpu/intel_rdt_monitor.c | 27 ++-- arch/x86/kernel/fpu/xstate.c | 38 +++++- arch/x86/kvm/svm.c | 18 ++- arch/x86/tools/chkobjdump.awk | 1 + drivers/acpi/nfit/core.c | 3 + drivers/base/core.c | 1 + drivers/cpufreq/armada-37xx-cpufreq.c | 6 +- drivers/dma/of-dma.c | 9 +- drivers/dma/sh/usb-dmac.c | 2 +- drivers/dma/xilinx/xilinx_dma.c | 12 ++ drivers/i2c/i2c-dev.c | 5 +- drivers/iio/adc/palmas_gpadc.c | 4 +- drivers/iio/humidity/hdc100x.c | 6 +- drivers/iommu/intel-iommu.c | 7 +- drivers/ipack/carriers/tpci200.c | 60 +++++---- drivers/mmc/host/dw_mmc.c | 6 +- drivers/net/dsa/lan9303-core.c | 34 ++--- drivers/net/dsa/mt7530.c | 1 + drivers/net/ethernet/broadcom/bnxt/bnxt.c | 57 +++++---- .../ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c | 4 +- drivers/net/hamradio/6pack.c | 6 + drivers/net/ieee802154/mac802154_hwsim.c | 6 +- drivers/net/phy/mdio-mux.c | 36 ++++-- drivers/net/ppp/ppp_generic.c | 2 +- drivers/net/usb/lan78xx.c | 16 ++- drivers/net/wireless/ath/ath.h | 3 +- drivers/net/wireless/ath/ath5k/mac80211-ops.c | 2 +- drivers/net/wireless/ath/ath9k/htc_drv_main.c | 2 +- drivers/net/wireless/ath/ath9k/hw.h | 1 + drivers/net/wireless/ath/ath9k/main.c | 95 +++++++++++++- drivers/net/wireless/ath/key.c | 41 +++--- drivers/pci/msi.c | 120 ++++++++++++------ drivers/pci/quirks.c | 1 + drivers/ptp/Kconfig | 3 +- drivers/scsi/device_handler/scsi_dh_rdac.c | 4 +- drivers/scsi/megaraid/megaraid_mm.c | 21 ++- drivers/scsi/scsi_scan.c | 3 +- drivers/slimbus/messaging.c | 7 +- drivers/slimbus/qcom-ngd-ctrl.c | 5 +- drivers/vhost/vhost.c | 10 +- drivers/xen/events/events_base.c | 20 ++- fs/btrfs/inode.c | 10 +- fs/namespace.c | 15 ++- include/asm-generic/vmlinux.lds.h | 1 + include/linux/device.h | 1 + include/linux/inetdevice.h | 2 +- include/linux/irq.h | 2 + include/linux/msi.h | 2 +- include/net/psample.h | 2 + kernel/irq/chip.c | 5 +- kernel/irq/msi.c | 13 +- kernel/trace/trace_events_hist.c | 2 + net/bluetooth/hidp/core.c | 2 +- net/bridge/br_if.c | 2 + net/dccp/dccp.h | 6 +- net/ieee802154/socket.c | 7 +- net/ipv4/igmp.c | 21 ++- net/ipv4/tcp_bbr.c | 2 +- net/mac80211/debugfs_sta.c | 1 + net/mac80211/key.c | 1 + net/mac80211/sta_info.h | 1 + net/mac80211/tx.c | 12 +- net/netfilter/nft_exthdr.c | 8 +- net/vmw_vsock/virtio_transport.c | 7 +- sound/pci/hda/hda_generic.c | 10 +- sound/soc/codecs/cs42l42.c | 39 +++--- sound/soc/intel/atom/sst-mfld-platform-pcm.c | 3 +- 76 files changed, 643 insertions(+), 313 deletions(-)
From: Chris Lesiak chris.lesiak@licor.com
commit 84edec86f449adea9ee0b4912a79ab8d9d65abb7 upstream.
The datasheets have the following note for the conversion time specification: "This parameter is specified by design and/or characterization and it is not tested in production."
Parts have been seen that require more time to do 14-bit conversions for the relative humidity channel. The result is ENXIO due to the address phase of a transfer not getting an ACK.
Delay an additional 1 ms per conversion to allow for additional margin.
Fixes: 4839367d99e3 ("iio: humidity: add HDC100x support") Signed-off-by: Chris Lesiak chris.lesiak@licor.com Acked-by: Matt Ranostay matt.ranostay@konsulko.com Link: https://lore.kernel.org/r/20210614141820.2034827-1-chris.lesiak@licor.com Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/humidity/hdc100x.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/iio/humidity/hdc100x.c b/drivers/iio/humidity/hdc100x.c index 0fcaa2c0b2f4..51ad5a9ed085 100644 --- a/drivers/iio/humidity/hdc100x.c +++ b/drivers/iio/humidity/hdc100x.c @@ -24,6 +24,8 @@ #include <linux/iio/trigger_consumer.h> #include <linux/iio/triggered_buffer.h>
+#include <linux/time.h> + #define HDC100X_REG_TEMP 0x00 #define HDC100X_REG_HUMIDITY 0x01
@@ -165,7 +167,7 @@ static int hdc100x_get_measurement(struct hdc100x_data *data, struct iio_chan_spec const *chan) { struct i2c_client *client = data->client; - int delay = data->adc_int_us[chan->address]; + int delay = data->adc_int_us[chan->address] + 1*USEC_PER_MSEC; int ret; __be16 val;
@@ -322,7 +324,7 @@ static irqreturn_t hdc100x_trigger_handler(int irq, void *p) struct iio_dev *indio_dev = pf->indio_dev; struct hdc100x_data *data = iio_priv(indio_dev); struct i2c_client *client = data->client; - int delay = data->adc_int_us[0] + data->adc_int_us[1]; + int delay = data->adc_int_us[0] + data->adc_int_us[1] + 2*USEC_PER_MSEC; int ret;
/* dual read starts at temp register */
From: Colin Ian King colin.king@canonical.com
commit 5afc1540f13804a31bb704b763308e17688369c5 upstream.
Currently the for-loop that scans for the optimial adc_period iterates through all the possible adc_period levels because the exit logic in the loop is inverted. I believe the comparison should be swapped and the continue replaced with a break to exit the loop at the correct point.
Addresses-Coverity: ("Continue has no effect") Fixes: e08e19c331fb ("iio:adc: add iio driver for Palmas (twl6035/7) gpadc") Signed-off-by: Colin Ian King colin.king@canonical.com Link: https://lore.kernel.org/r/20210730071651.17394-1-colin.king@canonical.com Cc: stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/adc/palmas_gpadc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/iio/adc/palmas_gpadc.c b/drivers/iio/adc/palmas_gpadc.c index 69b9affeef1e..7dcd4213d38a 100644 --- a/drivers/iio/adc/palmas_gpadc.c +++ b/drivers/iio/adc/palmas_gpadc.c @@ -659,8 +659,8 @@ static int palmas_adc_wakeup_configure(struct palmas_gpadc *adc)
adc_period = adc->auto_conversion_period; for (i = 0; i < 16; ++i) { - if (((1000 * (1 << i)) / 32) < adc_period) - continue; + if (((1000 * (1 << i)) / 32) >= adc_period) + break; } if (i > 0) i--;
From: Takashi Iwai tiwai@suse.de
commit 2e6b836312a477d647a7920b56810a5a25f6c856 upstream.
PCM buffers might be allocated dynamically when the buffer preallocation failed or a larger buffer is requested, and it's not guaranteed that substream->dma_buffer points to the actually used buffer. The address should be retrieved from runtime->dma_addr, instead of substream->dma_buffer (and shouldn't use virt_to_phys).
Also, remove the line overriding runtime->dma_area superfluously, which was already set up at the PCM buffer allocation.
Cc: Cezary Rojewski cezary.rojewski@intel.com Cc: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Cc: stable@vger.kernel.org Signed-off-by: Takashi Iwai tiwai@suse.de Link: https://lore.kernel.org/r/20210728112353.6675-3-tiwai@suse.de Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/intel/atom/sst-mfld-platform-pcm.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/sound/soc/intel/atom/sst-mfld-platform-pcm.c b/sound/soc/intel/atom/sst-mfld-platform-pcm.c index be773101d876..501ac836777a 100644 --- a/sound/soc/intel/atom/sst-mfld-platform-pcm.c +++ b/sound/soc/intel/atom/sst-mfld-platform-pcm.c @@ -135,7 +135,7 @@ static void sst_fill_alloc_params(struct snd_pcm_substream *substream, snd_pcm_uframes_t period_size; ssize_t periodbytes; ssize_t buffer_bytes = snd_pcm_lib_buffer_bytes(substream); - u32 buffer_addr = virt_to_phys(substream->dma_buffer.area); + u32 buffer_addr = substream->runtime->dma_addr;
channels = substream->runtime->channels; period_size = substream->runtime->period_size; @@ -241,7 +241,6 @@ static int sst_platform_alloc_stream(struct snd_pcm_substream *substream, /* set codec params and inform SST driver the same */ sst_fill_pcm_params(substream, ¶m); sst_fill_alloc_params(substream, &alloc_params); - substream->runtime->dma_area = substream->dma_buffer.area; str_params.sparams = param; str_params.aparams = alloc_params; str_params.codec = SST_CODEC_TYPE_PCM;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
commit 86ff25ed6cd8240d18df58930bd8848b19fce308 upstream.
If an i2c driver happens to not provide the full amount of data that a user asks for, it is possible that some uninitialized data could be sent to userspace. While all in-kernel drivers look to be safe, just be sure by initializing the buffer to zero before it is passed to the i2c driver so that any future drivers will not have this issue.
Also properly copy the amount of data recvieved to the userspace buffer, as pointed out by Dan Carpenter.
Reported-by: Eric Dumazet edumazet@google.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i2c/i2c-dev.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/i2c/i2c-dev.c b/drivers/i2c/i2c-dev.c index 1d10ee86299d..57aece809841 100644 --- a/drivers/i2c/i2c-dev.c +++ b/drivers/i2c/i2c-dev.c @@ -149,7 +149,7 @@ static ssize_t i2cdev_read(struct file *file, char __user *buf, size_t count, if (count > 8192) count = 8192;
- tmp = kmalloc(count, GFP_KERNEL); + tmp = kzalloc(count, GFP_KERNEL); if (tmp == NULL) return -ENOMEM;
@@ -158,7 +158,8 @@ static ssize_t i2cdev_read(struct file *file, char __user *buf, size_t count,
ret = i2c_master_recv(client, tmp, count); if (ret >= 0) - ret = copy_to_user(buf, tmp, count) ? -EFAULT : ret; + if (copy_to_user(buf, tmp, ret)) + ret = -EFAULT; kfree(tmp); return ret; }
From: Dan Williams dan.j.williams@intel.com
commit b93dfa6bda4d4e88e5386490f2b277a26958f9d3 upstream.
Fix the NFIT parsing code to treat a 0 index in a SPA Range Structure as a special case and not match Region Mapping Structures that use 0 to indicate that they are not mapped. Without this fix some platform BIOS descriptions of "virtual disk" ranges do not result in the pmem driver attaching to the range.
Details: In addition to typical persistent memory ranges, the ACPI NFIT may also convey "virtual" ranges. These ranges are indicated by a UUID in the SPA Range Structure of UUID_VOLATILE_VIRTUAL_DISK, UUID_VOLATILE_VIRTUAL_CD, UUID_PERSISTENT_VIRTUAL_DISK, or UUID_PERSISTENT_VIRTUAL_CD. The critical difference between virtual ranges and UUID_PERSISTENT_MEMORY, is that virtual do not support associations with Region Mapping Structures. For this reason the "index" value of virtual SPA Range Structures is allowed to be 0. If a platform BIOS decides to represent NVDIMMs with disconnected "Region Mapping Structures" (range-index == 0), the kernel may falsely associate them with standalone ranges where the "SPA Range Structure Index" is also zero. When this happens the driver may falsely require labels where "virtual disks" are expected to be label-less. I.e. "label-less" is where the namespace-range == region-range and the pmem driver attaches with no user action to create a namespace.
Cc: Jacek Zloch jacek.zloch@intel.com Cc: Lukasz Sobieraj lukasz.sobieraj@intel.com Cc: "Lee, Chun-Yi" jlee@suse.com Cc: stable@vger.kernel.org Fixes: c2f32acdf848 ("acpi, nfit: treat virtual ramdisk SPA as pmem region") Reported-by: Krzysztof Rusocki krzysztof.rusocki@intel.com Reported-by: Damian Bassa damian.bassa@intel.com Reviewed-by: Jeff Moyer jmoyer@redhat.com Link: https://lore.kernel.org/r/162870796589.2521182.1240403310175570220.stgit@dwi... Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/nfit/core.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c index cb88f3b43a94..58a756ca14d8 100644 --- a/drivers/acpi/nfit/core.c +++ b/drivers/acpi/nfit/core.c @@ -2834,6 +2834,9 @@ static int acpi_nfit_register_region(struct acpi_nfit_desc *acpi_desc, struct acpi_nfit_memory_map *memdev = nfit_memdev->memdev; struct nd_mapping_desc *mapping;
+ /* range index 0 == unmapped in SPA or invalid-SPA */ + if (memdev->range_index == 0 || spa->range_index == 0) + continue; if (memdev->range_index != spa->range_index) continue; if (count >= ND_MAX_MAPPINGS) {
From: Dongliang Mu mudongliangabcd@gmail.com
[ Upstream commit e9faf53c5a5d01f6f2a09ae28ec63a3bbd6f64fd ]
Both MAC802154_HWSIM_ATTR_RADIO_ID and MAC802154_HWSIM_ATTR_RADIO_EDGE, MAC802154_HWSIM_EDGE_ATTR_ENDPOINT_ID and MAC802154_HWSIM_EDGE_ATTR_LQI must be present to fix GPF.
Fixes: f25da51fdc38 ("ieee802154: hwsim: add replacement for fakelb") Signed-off-by: Dongliang Mu mudongliangabcd@gmail.com Acked-by: Alexander Aring aahringo@redhat.com Link: https://lore.kernel.org/r/20210705131321.217111-1-mudongliangabcd@gmail.com Signed-off-by: Stefan Schmidt stefan@datenfreihafen.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ieee802154/mac802154_hwsim.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ieee802154/mac802154_hwsim.c b/drivers/net/ieee802154/mac802154_hwsim.c index 06aadebc2d5b..82f3fbda7dfe 100644 --- a/drivers/net/ieee802154/mac802154_hwsim.c +++ b/drivers/net/ieee802154/mac802154_hwsim.c @@ -546,7 +546,7 @@ static int hwsim_set_edge_lqi(struct sk_buff *msg, struct genl_info *info) u32 v0, v1; u8 lqi;
- if (!info->attrs[MAC802154_HWSIM_ATTR_RADIO_ID] && + if (!info->attrs[MAC802154_HWSIM_ATTR_RADIO_ID] || !info->attrs[MAC802154_HWSIM_ATTR_RADIO_EDGE]) return -EINVAL;
@@ -555,7 +555,7 @@ static int hwsim_set_edge_lqi(struct sk_buff *msg, struct genl_info *info) hwsim_edge_policy, NULL)) return -EINVAL;
- if (!edge_attrs[MAC802154_HWSIM_EDGE_ATTR_ENDPOINT_ID] && + if (!edge_attrs[MAC802154_HWSIM_EDGE_ATTR_ENDPOINT_ID] || !edge_attrs[MAC802154_HWSIM_EDGE_ATTR_LQI]) return -EINVAL;
From: Dongliang Mu mudongliangabcd@gmail.com
[ Upstream commit 889d0e7dc68314a273627d89cbb60c09e1cc1c25 ]
Both MAC802154_HWSIM_ATTR_RADIO_ID and MAC802154_HWSIM_ATTR_RADIO_EDGE must be present to fix GPF.
Fixes: f25da51fdc38 ("ieee802154: hwsim: add replacement for fakelb") Signed-off-by: Dongliang Mu mudongliangabcd@gmail.com Acked-by: Alexander Aring aahringo@redhat.com Link: https://lore.kernel.org/r/20210707155633.1486603-1-mudongliangabcd@gmail.com Signed-off-by: Stefan Schmidt stefan@datenfreihafen.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ieee802154/mac802154_hwsim.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ieee802154/mac802154_hwsim.c b/drivers/net/ieee802154/mac802154_hwsim.c index 82f3fbda7dfe..ed60e691cc2b 100644 --- a/drivers/net/ieee802154/mac802154_hwsim.c +++ b/drivers/net/ieee802154/mac802154_hwsim.c @@ -432,7 +432,7 @@ static int hwsim_new_edge_nl(struct sk_buff *msg, struct genl_info *info) struct hwsim_edge *e; u32 v0, v1;
- if (!info->attrs[MAC802154_HWSIM_ATTR_RADIO_ID] && + if (!info->attrs[MAC802154_HWSIM_ATTR_RADIO_ID] || !info->attrs[MAC802154_HWSIM_ATTR_RADIO_EDGE]) return -EINVAL;
From: Richard Fitzgerald rf@opensource.cirrus.com
[ Upstream commit ee86f680ff4c9b406d49d4e22ddf10805b8a2137 ]
The ADC volume is a signed 8-bit number with range -97 to +12, with -97 being mute. Use a SOC_SINGLE_S8_TLV() to define this and fix the DECLARE_TLV_DB_SCALE() to have the correct start and mute flag.
Fixes: 2c394ca79604 ("ASoC: Add support for CS42L42 codec") Signed-off-by: Richard Fitzgerald rf@opensource.cirrus.com Link: https://lore.kernel.org/r/20210729170929.6589-1-rf@opensource.cirrus.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/cs42l42.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/sound/soc/codecs/cs42l42.c b/sound/soc/codecs/cs42l42.c index fddfd227a9c0..6a58c666776a 100644 --- a/sound/soc/codecs/cs42l42.c +++ b/sound/soc/codecs/cs42l42.c @@ -404,7 +404,7 @@ static const struct regmap_config cs42l42_regmap = { .cache_type = REGCACHE_RBTREE, };
-static DECLARE_TLV_DB_SCALE(adc_tlv, -9600, 100, false); +static DECLARE_TLV_DB_SCALE(adc_tlv, -9700, 100, true); static DECLARE_TLV_DB_SCALE(mixer_tlv, -6300, 100, true);
static const char * const cs42l42_hpf_freq_text[] = { @@ -443,8 +443,7 @@ static const struct snd_kcontrol_new cs42l42_snd_controls[] = { CS42L42_ADC_INV_SHIFT, true, false), SOC_SINGLE("ADC Boost Switch", CS42L42_ADC_CTL, CS42L42_ADC_DIG_BOOST_SHIFT, true, false), - SOC_SINGLE_SX_TLV("ADC Volume", CS42L42_ADC_VOLUME, - CS42L42_ADC_VOL_SHIFT, 0xA0, 0x6C, adc_tlv), + SOC_SINGLE_S8_TLV("ADC Volume", CS42L42_ADC_VOLUME, -97, 12, adc_tlv), SOC_SINGLE("ADC WNF Switch", CS42L42_ADC_WNF_HPF_CTL, CS42L42_ADC_WNF_EN_SHIFT, true, false), SOC_SINGLE("ADC HPF Switch", CS42L42_ADC_WNF_HPF_CTL,
From: Richard Fitzgerald rf@opensource.cirrus.com
[ Upstream commit 64324bac750b84ca54711fb7d332132fcdb87293 ]
The driver has no support for left-justified protocol so it should not have been allowing this to be passed to cs42l42_set_dai_fmt().
Signed-off-by: Richard Fitzgerald rf@opensource.cirrus.com Fixes: 2c394ca79604 ("ASoC: Add support for CS42L42 codec") Link: https://lore.kernel.org/r/20210729170929.6589-2-rf@opensource.cirrus.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/cs42l42.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/sound/soc/codecs/cs42l42.c b/sound/soc/codecs/cs42l42.c index 6a58c666776a..ca6541ac59e1 100644 --- a/sound/soc/codecs/cs42l42.c +++ b/sound/soc/codecs/cs42l42.c @@ -773,7 +773,6 @@ static int cs42l42_set_dai_fmt(struct snd_soc_dai *codec_dai, unsigned int fmt) /* interface format */ switch (fmt & SND_SOC_DAIFMT_FORMAT_MASK) { case SND_SOC_DAIFMT_I2S: - case SND_SOC_DAIFMT_LEFT_J: break; default: return -EINVAL;
From: Richard Fitzgerald rf@opensource.cirrus.com
[ Upstream commit 30615bd21b4cc3c3bb5ae8bd70e2a915cc5f75c7 ]
The underlying register field has inverted sense (0 = enabled) so the control definition must be marked as inverted.
Signed-off-by: Richard Fitzgerald rf@opensource.cirrus.com Fixes: 2c394ca79604 ("ASoC: Add support for CS42L42 codec") Link: https://lore.kernel.org/r/20210803160834.9005-1-rf@opensource.cirrus.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/cs42l42.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/soc/codecs/cs42l42.c b/sound/soc/codecs/cs42l42.c index ca6541ac59e1..c11e60e9fe4e 100644 --- a/sound/soc/codecs/cs42l42.c +++ b/sound/soc/codecs/cs42l42.c @@ -436,7 +436,7 @@ static SOC_ENUM_SINGLE_DECL(cs42l42_wnf05_freq_enum, CS42L42_ADC_WNF_HPF_CTL, static const struct snd_kcontrol_new cs42l42_snd_controls[] = { /* ADC Volume and Filter Controls */ SOC_SINGLE("ADC Notch Switch", CS42L42_ADC_CTL, - CS42L42_ADC_NOTCH_DIS_SHIFT, true, false), + CS42L42_ADC_NOTCH_DIS_SHIFT, true, true), SOC_SINGLE("ADC Weak Force Switch", CS42L42_ADC_CTL, CS42L42_ADC_FORCE_WEAK_VCM_SHIFT, true, false), SOC_SINGLE("ADC Invert Switch", CS42L42_ADC_CTL,
From: Richard Fitzgerald rf@opensource.cirrus.com
[ Upstream commit 8b353bbeae20e2214c9d9d88bcb2fda4ba145d83 ]
The driver was defining two ALSA controls that both change the same register field for the wind noise filter corner frequency. The filter response has two corners, at different frequencies, and the duplicate controls most likely were an attempt to be able to set the value using either of the frequencies.
However, having two controls changing the same field can be problematic and it is unnecessary. Both frequencies are related to each other so setting one implies exactly what the other would be.
Removing a control affects user-side code, but there is currently no known use of the removed control so it would be best to remove it now before it becomes a problem.
Signed-off-by: Richard Fitzgerald rf@opensource.cirrus.com Fixes: 2c394ca79604 ("ASoC: Add support for CS42L42 codec") Link: https://lore.kernel.org/r/20210803160834.9005-2-rf@opensource.cirrus.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/cs42l42.c | 10 ---------- 1 file changed, 10 deletions(-)
diff --git a/sound/soc/codecs/cs42l42.c b/sound/soc/codecs/cs42l42.c index c11e60e9fe4e..fb12fcf88878 100644 --- a/sound/soc/codecs/cs42l42.c +++ b/sound/soc/codecs/cs42l42.c @@ -424,15 +424,6 @@ static SOC_ENUM_SINGLE_DECL(cs42l42_wnf3_freq_enum, CS42L42_ADC_WNF_HPF_CTL, CS42L42_ADC_WNF_CF_SHIFT, cs42l42_wnf3_freq_text);
-static const char * const cs42l42_wnf05_freq_text[] = { - "280Hz", "315Hz", "350Hz", "385Hz", - "420Hz", "455Hz", "490Hz", "525Hz" -}; - -static SOC_ENUM_SINGLE_DECL(cs42l42_wnf05_freq_enum, CS42L42_ADC_WNF_HPF_CTL, - CS42L42_ADC_WNF_CF_SHIFT, - cs42l42_wnf05_freq_text); - static const struct snd_kcontrol_new cs42l42_snd_controls[] = { /* ADC Volume and Filter Controls */ SOC_SINGLE("ADC Notch Switch", CS42L42_ADC_CTL, @@ -450,7 +441,6 @@ static const struct snd_kcontrol_new cs42l42_snd_controls[] = { CS42L42_ADC_HPF_EN_SHIFT, true, false), SOC_ENUM("HPF Corner Freq", cs42l42_hpf_freq_enum), SOC_ENUM("WNF 3dB Freq", cs42l42_wnf3_freq_enum), - SOC_ENUM("WNF 05dB Freq", cs42l42_wnf05_freq_enum),
/* DAC Volume and Filter Controls */ SOC_SINGLE("DACA Invert Switch", CS42L42_DAC_CTL1,
From: Richard Fitzgerald rf@opensource.cirrus.com
[ Upstream commit 0c2f2ad4f16a58879463d0979a54293f8f296d6f ]
An I2S frame starts on the falling edge of LRCLK so ASP_STP must be 0.
At the same time, move other format settings in the same register from cs42l42_pll_config() to cs42l42_set_dai_fmt() where you'd expect to find them, and merge into a single write.
Signed-off-by: Richard Fitzgerald rf@opensource.cirrus.com Fixes: 2c394ca79604 ("ASoC: Add support for CS42L42 codec") Link: https://lore.kernel.org/r/20210805161111.10410-2-rf@opensource.cirrus.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/cs42l42.c | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/sound/soc/codecs/cs42l42.c b/sound/soc/codecs/cs42l42.c index fb12fcf88878..4cb3e11c66af 100644 --- a/sound/soc/codecs/cs42l42.c +++ b/sound/soc/codecs/cs42l42.c @@ -659,15 +659,6 @@ static int cs42l42_pll_config(struct snd_soc_component *component) CS42L42_FSYNC_PULSE_WIDTH_MASK, CS42L42_FRAC1_VAL(fsync - 1) << CS42L42_FSYNC_PULSE_WIDTH_SHIFT); - snd_soc_component_update_bits(component, - CS42L42_ASP_FRM_CFG, - CS42L42_ASP_5050_MASK, - CS42L42_ASP_5050_MASK); - /* Set the frame delay to 1.0 SCLK clocks */ - snd_soc_component_update_bits(component, CS42L42_ASP_FRM_CFG, - CS42L42_ASP_FSD_MASK, - CS42L42_ASP_FSD_1_0 << - CS42L42_ASP_FSD_SHIFT); /* Set the sample rates (96k or lower) */ snd_soc_component_update_bits(component, CS42L42_FS_RATE_EN, CS42L42_FS_EN_MASK, @@ -763,6 +754,18 @@ static int cs42l42_set_dai_fmt(struct snd_soc_dai *codec_dai, unsigned int fmt) /* interface format */ switch (fmt & SND_SOC_DAIFMT_FORMAT_MASK) { case SND_SOC_DAIFMT_I2S: + /* + * 5050 mode, frame starts on falling edge of LRCLK, + * frame delayed by 1.0 SCLKs + */ + snd_soc_component_update_bits(component, + CS42L42_ASP_FRM_CFG, + CS42L42_ASP_STP_MASK | + CS42L42_ASP_5050_MASK | + CS42L42_ASP_FSD_MASK, + CS42L42_ASP_5050_MASK | + (CS42L42_ASP_FSD_1_0 << + CS42L42_ASP_FSD_SHIFT)); break; default: return -EINVAL;
From: DENG Qingfang dqfext@gmail.com
[ Upstream commit aff51c5da3208bd164381e1488998667269c6cf4 ]
Add the missing RxUnicast counter.
Fixes: b8f126a8d543 ("net-next: dsa: add dsa support for Mediatek MT7530 switch") Signed-off-by: DENG Qingfang dqfext@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/mt7530.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c index 6335c4ea0957..2ff6a0be97de 100644 --- a/drivers/net/dsa/mt7530.c +++ b/drivers/net/dsa/mt7530.c @@ -54,6 +54,7 @@ static const struct mt7530_mib_desc mt7530_mib[] = { MIB_DESC(2, 0x48, "TxBytes"), MIB_DESC(1, 0x60, "RxDrop"), MIB_DESC(1, 0x64, "RxFiltering"), + MIB_DESC(1, 0x68, "RxUnicast"), MIB_DESC(1, 0x6c, "RxMulticast"), MIB_DESC(1, 0x70, "RxBroadcast"), MIB_DESC(1, 0x74, "RxAlignErr"),
From: Pali Rohár pali@kernel.org
[ Upstream commit 2459dcb96bcba94c08d6861f8a050185ff301672 ]
IFLA_IFNAME is nul-term string which means that IFLA_IFNAME buffer can be larger than length of string which contains.
Function __rtnl_newlink() generates new own ifname if either IFLA_IFNAME was not specified at all or userspace passed empty nul-term string.
It is expected that if userspace does not specify ifname for new ppp netdev then kernel generates one in format "ppp<id>" where id matches to the ppp unit id which can be later obtained by PPPIOCGUNIT ioctl.
And it works in this way if IFLA_IFNAME is not specified at all. But it does not work when IFLA_IFNAME is specified with empty string.
So fix this logic also for empty IFLA_IFNAME in ppp_nl_newlink() function and correctly generates ifname based on ppp unit identifier if userspace did not provided preferred ifname.
Without this patch when IFLA_IFNAME was specified with empty string then kernel created a new ppp interface in format "ppp<id>" but id did not match ppp unit id returned by PPPIOCGUNIT ioctl. In this case id was some number generated by __rtnl_newlink() function.
Signed-off-by: Pali Rohár pali@kernel.org Fixes: bb8082f69138 ("ppp: build ifname using unit identifier for rtnl based devices") Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ppp/ppp_generic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c index 1af47aaa7ba5..dc9de8731c56 100644 --- a/drivers/net/ppp/ppp_generic.c +++ b/drivers/net/ppp/ppp_generic.c @@ -1125,7 +1125,7 @@ static int ppp_nl_newlink(struct net *src_net, struct net_device *dev, * the PPP unit identifer as suffix (i.e. ppp<unit_id>). This allows * userspace to infer the device name using to the PPPIOCGUNIT ioctl. */ - if (!tb[IFLA_IFNAME]) + if (!tb[IFLA_IFNAME] || !nla_len(tb[IFLA_IFNAME]) || !*(char *)nla_data(tb[IFLA_IFNAME])) conf.ifname_is_set = false;
err = ppp_dev_configure(src_net, dev, &conf);
From: Roi Dayan roid@nvidia.com
[ Upstream commit beb7f2de5728b0bd2140a652fa51f6ad85d159f7 ]
Without this there is a warning if source files include psample.h before skbuff.h or doesn't include it at all.
Fixes: 6ae0a6286171 ("net: Introduce psample, a new genetlink channel for packet sampling") Signed-off-by: Roi Dayan roid@nvidia.com Link: https://lore.kernel.org/r/20210808065242.1522535-1-roid@nvidia.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/psample.h | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/include/net/psample.h b/include/net/psample.h index 94cb37a7bf75..796f01e5635d 100644 --- a/include/net/psample.h +++ b/include/net/psample.h @@ -18,6 +18,8 @@ struct psample_group { struct psample_group *psample_group_get(struct net *net, u32 group_num); void psample_group_put(struct psample_group *group);
+struct sk_buff; + #if IS_ENABLED(CONFIG_PSAMPLE)
void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
From: Takeshi Misawa jeliantsurux@gmail.com
[ Upstream commit 1090340f7ee53e824fd4eef66a4855d548110c5b ]
If IEEE-802.15.4-RAW is closed before receive skb, skb is leaked. Fix this, by freeing sk_receive_queue in sk->sk_destruct().
syzbot report: BUG: memory leak unreferenced object 0xffff88810f644600 (size 232): comm "softirq", pid 0, jiffies 4294967032 (age 81.270s) hex dump (first 32 bytes): 10 7d 4b 12 81 88 ff ff 10 7d 4b 12 81 88 ff ff .}K......}K..... 00 00 00 00 00 00 00 00 40 7c 4b 12 81 88 ff ff ........@|K..... backtrace: [<ffffffff83651d4a>] skb_clone+0xaa/0x2b0 net/core/skbuff.c:1496 [<ffffffff83fe1b80>] ieee802154_raw_deliver net/ieee802154/socket.c:369 [inline] [<ffffffff83fe1b80>] ieee802154_rcv+0x100/0x340 net/ieee802154/socket.c:1070 [<ffffffff8367cc7a>] __netif_receive_skb_one_core+0x6a/0xa0 net/core/dev.c:5384 [<ffffffff8367cd07>] __netif_receive_skb+0x27/0xa0 net/core/dev.c:5498 [<ffffffff8367cdd9>] netif_receive_skb_internal net/core/dev.c:5603 [inline] [<ffffffff8367cdd9>] netif_receive_skb+0x59/0x260 net/core/dev.c:5662 [<ffffffff83fe6302>] ieee802154_deliver_skb net/mac802154/rx.c:29 [inline] [<ffffffff83fe6302>] ieee802154_subif_frame net/mac802154/rx.c:102 [inline] [<ffffffff83fe6302>] __ieee802154_rx_handle_packet net/mac802154/rx.c:212 [inline] [<ffffffff83fe6302>] ieee802154_rx+0x612/0x620 net/mac802154/rx.c:284 [<ffffffff83fe59a6>] ieee802154_tasklet_handler+0x86/0xa0 net/mac802154/main.c:35 [<ffffffff81232aab>] tasklet_action_common.constprop.0+0x5b/0x100 kernel/softirq.c:557 [<ffffffff846000bf>] __do_softirq+0xbf/0x2ab kernel/softirq.c:345 [<ffffffff81232f4c>] do_softirq kernel/softirq.c:248 [inline] [<ffffffff81232f4c>] do_softirq+0x5c/0x80 kernel/softirq.c:235 [<ffffffff81232fc1>] __local_bh_enable_ip+0x51/0x60 kernel/softirq.c:198 [<ffffffff8367a9a4>] local_bh_enable include/linux/bottom_half.h:32 [inline] [<ffffffff8367a9a4>] rcu_read_unlock_bh include/linux/rcupdate.h:745 [inline] [<ffffffff8367a9a4>] __dev_queue_xmit+0x7f4/0xf60 net/core/dev.c:4221 [<ffffffff83fe2db4>] raw_sendmsg+0x1f4/0x2b0 net/ieee802154/socket.c:295 [<ffffffff8363af16>] sock_sendmsg_nosec net/socket.c:654 [inline] [<ffffffff8363af16>] sock_sendmsg+0x56/0x80 net/socket.c:674 [<ffffffff8363deec>] __sys_sendto+0x15c/0x200 net/socket.c:1977 [<ffffffff8363dfb6>] __do_sys_sendto net/socket.c:1989 [inline] [<ffffffff8363dfb6>] __se_sys_sendto net/socket.c:1985 [inline] [<ffffffff8363dfb6>] __x64_sys_sendto+0x26/0x30 net/socket.c:1985
Fixes: 9ec767160357 ("net: add IEEE 802.15.4 socket family implementation") Reported-and-tested-by: syzbot+1f68113fa907bf0695a8@syzkaller.appspotmail.com Signed-off-by: Takeshi Misawa jeliantsurux@gmail.com Acked-by: Alexander Aring aahringo@redhat.com Link: https://lore.kernel.org/r/20210805075414.GA15796@DESKTOP Signed-off-by: Stefan Schmidt stefan@datenfreihafen.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ieee802154/socket.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/ieee802154/socket.c b/net/ieee802154/socket.c index 89819745e482..14c6fac039f9 100644 --- a/net/ieee802154/socket.c +++ b/net/ieee802154/socket.c @@ -1002,6 +1002,11 @@ static const struct proto_ops ieee802154_dgram_ops = { #endif };
+static void ieee802154_sock_destruct(struct sock *sk) +{ + skb_queue_purge(&sk->sk_receive_queue); +} + /* Create a socket. Initialise the socket, blank the addresses * set the state. */ @@ -1042,7 +1047,7 @@ static int ieee802154_create(struct net *net, struct socket *sock, sock->ops = ops;
sock_init_data(sock, sk); - /* FIXME: sk->sk_destruct */ + sk->sk_destruct = ieee802154_sock_destruct; sk->sk_family = PF_IEEE802154;
/* Checksums on by default */
From: Eric Dumazet edumazet@google.com
[ Upstream commit 4a2b285e7e103d4d6c6ed3e5052a0ff74a5d7f15 ]
Fix the data-race reported by syzbot [1] Issue here is that igmp_ifc_timer_expire() can update in_dev->mr_ifc_count while another change just occured from another context.
in_dev->mr_ifc_count is only 8bit wide, so the race had little consequences.
[1] BUG: KCSAN: data-race in igmp_ifc_event / igmp_ifc_timer_expire
write to 0xffff8881051e3062 of 1 bytes by task 12547 on cpu 0: igmp_ifc_event+0x1d5/0x290 net/ipv4/igmp.c:821 igmp_group_added+0x462/0x490 net/ipv4/igmp.c:1356 ____ip_mc_inc_group+0x3ff/0x500 net/ipv4/igmp.c:1461 __ip_mc_join_group+0x24d/0x2c0 net/ipv4/igmp.c:2199 ip_mc_join_group_ssm+0x20/0x30 net/ipv4/igmp.c:2218 do_ip_setsockopt net/ipv4/ip_sockglue.c:1285 [inline] ip_setsockopt+0x1827/0x2a80 net/ipv4/ip_sockglue.c:1423 tcp_setsockopt+0x8c/0xa0 net/ipv4/tcp.c:3657 sock_common_setsockopt+0x5d/0x70 net/core/sock.c:3362 __sys_setsockopt+0x18f/0x200 net/socket.c:2159 __do_sys_setsockopt net/socket.c:2170 [inline] __se_sys_setsockopt net/socket.c:2167 [inline] __x64_sys_setsockopt+0x62/0x70 net/socket.c:2167 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae
read to 0xffff8881051e3062 of 1 bytes by interrupt on cpu 1: igmp_ifc_timer_expire+0x706/0xa30 net/ipv4/igmp.c:808 call_timer_fn+0x2e/0x1d0 kernel/time/timer.c:1419 expire_timers+0x135/0x250 kernel/time/timer.c:1464 __run_timers+0x358/0x420 kernel/time/timer.c:1732 run_timer_softirq+0x19/0x30 kernel/time/timer.c:1745 __do_softirq+0x12c/0x26e kernel/softirq.c:558 invoke_softirq kernel/softirq.c:432 [inline] __irq_exit_rcu+0x9a/0xb0 kernel/softirq.c:636 sysvec_apic_timer_interrupt+0x69/0x80 arch/x86/kernel/apic/apic.c:1100 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638 console_unlock+0x8e8/0xb30 kernel/printk/printk.c:2646 vprintk_emit+0x125/0x3d0 kernel/printk/printk.c:2174 vprintk_default+0x22/0x30 kernel/printk/printk.c:2185 vprintk+0x15a/0x170 kernel/printk/printk_safe.c:392 printk+0x62/0x87 kernel/printk/printk.c:2216 selinux_netlink_send+0x399/0x400 security/selinux/hooks.c:6041 security_netlink_send+0x42/0x90 security/security.c:2070 netlink_sendmsg+0x59e/0x7c0 net/netlink/af_netlink.c:1919 sock_sendmsg_nosec net/socket.c:703 [inline] sock_sendmsg net/socket.c:723 [inline] ____sys_sendmsg+0x360/0x4d0 net/socket.c:2392 ___sys_sendmsg net/socket.c:2446 [inline] __sys_sendmsg+0x1ed/0x270 net/socket.c:2475 __do_sys_sendmsg net/socket.c:2484 [inline] __se_sys_sendmsg net/socket.c:2482 [inline] __x64_sys_sendmsg+0x42/0x50 net/socket.c:2482 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae
value changed: 0x01 -> 0x02
Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 12539 Comm: syz-executor.1 Not tainted 5.14.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/igmp.c | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-)
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c index ffa847fc9619..95ec3923083f 100644 --- a/net/ipv4/igmp.c +++ b/net/ipv4/igmp.c @@ -807,10 +807,17 @@ static void igmp_gq_timer_expire(struct timer_list *t) static void igmp_ifc_timer_expire(struct timer_list *t) { struct in_device *in_dev = from_timer(in_dev, t, mr_ifc_timer); + u8 mr_ifc_count;
igmpv3_send_cr(in_dev); - if (in_dev->mr_ifc_count) { - in_dev->mr_ifc_count--; +restart: + mr_ifc_count = READ_ONCE(in_dev->mr_ifc_count); + + if (mr_ifc_count) { + if (cmpxchg(&in_dev->mr_ifc_count, + mr_ifc_count, + mr_ifc_count - 1) != mr_ifc_count) + goto restart; igmp_ifc_start_timer(in_dev, unsolicited_report_interval(in_dev)); } @@ -822,7 +829,7 @@ static void igmp_ifc_event(struct in_device *in_dev) struct net *net = dev_net(in_dev->dev); if (IGMP_V1_SEEN(in_dev) || IGMP_V2_SEEN(in_dev)) return; - in_dev->mr_ifc_count = in_dev->mr_qrv ?: net->ipv4.sysctl_igmp_qrv; + WRITE_ONCE(in_dev->mr_ifc_count, in_dev->mr_qrv ?: net->ipv4.sysctl_igmp_qrv); igmp_ifc_start_timer(in_dev, 1); }
@@ -961,7 +968,7 @@ static bool igmp_heard_query(struct in_device *in_dev, struct sk_buff *skb, in_dev->mr_qri; } /* cancel the interface change timer */ - in_dev->mr_ifc_count = 0; + WRITE_ONCE(in_dev->mr_ifc_count, 0); if (del_timer(&in_dev->mr_ifc_timer)) __in_dev_put(in_dev); /* clear deleted report items */ @@ -1739,7 +1746,7 @@ void ip_mc_down(struct in_device *in_dev) igmp_group_dropped(pmc);
#ifdef CONFIG_IP_MULTICAST - in_dev->mr_ifc_count = 0; + WRITE_ONCE(in_dev->mr_ifc_count, 0); if (del_timer(&in_dev->mr_ifc_timer)) __in_dev_put(in_dev); in_dev->mr_gq_running = 0; @@ -1956,7 +1963,7 @@ static int ip_mc_del_src(struct in_device *in_dev, __be32 *pmca, int sfmode, pmc->sfmode = MCAST_INCLUDE; #ifdef CONFIG_IP_MULTICAST pmc->crcount = in_dev->mr_qrv ?: net->ipv4.sysctl_igmp_qrv; - in_dev->mr_ifc_count = pmc->crcount; + WRITE_ONCE(in_dev->mr_ifc_count, pmc->crcount); for (psf = pmc->sources; psf; psf = psf->sf_next) psf->sf_crcount = 0; igmp_ifc_event(pmc->interface); @@ -2135,7 +2142,7 @@ static int ip_mc_add_src(struct in_device *in_dev, __be32 *pmca, int sfmode, /* else no filters; keep old mode for reports */
pmc->crcount = in_dev->mr_qrv ?: net->ipv4.sysctl_igmp_qrv; - in_dev->mr_ifc_count = pmc->crcount; + WRITE_ONCE(in_dev->mr_ifc_count, pmc->crcount); for (psf = pmc->sources; psf; psf = psf->sf_next) psf->sf_crcount = 0; igmp_ifc_event(in_dev);
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit ada2fee185d8145afb89056558bb59545b9dbdd0 ]
rtnl_fdb_dump() has logic to split a dump of PF_BRIDGE neighbors into multiple netlink skbs if the buffer provided by user space is too small (one buffer will typically handle a few hundred FDB entries).
When the current buffer becomes full, nlmsg_put() in dsa_slave_port_fdb_do_dump() returns -EMSGSIZE and DSA saves the index of the last dumped FDB entry, returns to rtnl_fdb_dump() up to that point, and then the dump resumes on the same port with a new skb, and FDB entries up to the saved index are simply skipped.
Since dsa_slave_port_fdb_do_dump() is pointed to by the "cb" passed to drivers, then drivers must check for the -EMSGSIZE error code returned by it. Otherwise, when a netlink skb becomes full, DSA will no longer save newly dumped FDB entries to it, but the driver will continue dumping. So FDB entries will be missing from the dump.
Fix the broken backpressure by propagating the "cb" return code and allow rtnl_fdb_dump() to restart the FDB dump with a new skb.
Fixes: ab335349b852 ("net: dsa: lan9303: Add port_fast_age and port_fdb_dump methods") Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/lan9303-core.c | 34 +++++++++++++++++++--------------- 1 file changed, 19 insertions(+), 15 deletions(-)
diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c index b4f6e1a67dd9..b89c474e6b6b 100644 --- a/drivers/net/dsa/lan9303-core.c +++ b/drivers/net/dsa/lan9303-core.c @@ -566,12 +566,12 @@ static int lan9303_alr_make_entry_raw(struct lan9303 *chip, u32 dat0, u32 dat1) return 0; }
-typedef void alr_loop_cb_t(struct lan9303 *chip, u32 dat0, u32 dat1, - int portmap, void *ctx); +typedef int alr_loop_cb_t(struct lan9303 *chip, u32 dat0, u32 dat1, + int portmap, void *ctx);
-static void lan9303_alr_loop(struct lan9303 *chip, alr_loop_cb_t *cb, void *ctx) +static int lan9303_alr_loop(struct lan9303 *chip, alr_loop_cb_t *cb, void *ctx) { - int i; + int ret = 0, i;
mutex_lock(&chip->alr_mutex); lan9303_write_switch_reg(chip, LAN9303_SWE_ALR_CMD, @@ -591,13 +591,17 @@ static void lan9303_alr_loop(struct lan9303 *chip, alr_loop_cb_t *cb, void *ctx) LAN9303_ALR_DAT1_PORT_BITOFFS; portmap = alrport_2_portmap[alrport];
- cb(chip, dat0, dat1, portmap, ctx); + ret = cb(chip, dat0, dat1, portmap, ctx); + if (ret) + break;
lan9303_write_switch_reg(chip, LAN9303_SWE_ALR_CMD, LAN9303_ALR_CMD_GET_NEXT); lan9303_write_switch_reg(chip, LAN9303_SWE_ALR_CMD, 0); } mutex_unlock(&chip->alr_mutex); + + return ret; }
static void alr_reg_to_mac(u32 dat0, u32 dat1, u8 mac[6]) @@ -615,18 +619,20 @@ struct del_port_learned_ctx { };
/* Clear learned (non-static) entry on given port */ -static void alr_loop_cb_del_port_learned(struct lan9303 *chip, u32 dat0, - u32 dat1, int portmap, void *ctx) +static int alr_loop_cb_del_port_learned(struct lan9303 *chip, u32 dat0, + u32 dat1, int portmap, void *ctx) { struct del_port_learned_ctx *del_ctx = ctx; int port = del_ctx->port;
if (((BIT(port) & portmap) == 0) || (dat1 & LAN9303_ALR_DAT1_STATIC)) - return; + return 0;
/* learned entries has only one port, we can just delete */ dat1 &= ~LAN9303_ALR_DAT1_VALID; /* delete entry */ lan9303_alr_make_entry_raw(chip, dat0, dat1); + + return 0; }
struct port_fdb_dump_ctx { @@ -635,19 +641,19 @@ struct port_fdb_dump_ctx { dsa_fdb_dump_cb_t *cb; };
-static void alr_loop_cb_fdb_port_dump(struct lan9303 *chip, u32 dat0, - u32 dat1, int portmap, void *ctx) +static int alr_loop_cb_fdb_port_dump(struct lan9303 *chip, u32 dat0, + u32 dat1, int portmap, void *ctx) { struct port_fdb_dump_ctx *dump_ctx = ctx; u8 mac[ETH_ALEN]; bool is_static;
if ((BIT(dump_ctx->port) & portmap) == 0) - return; + return 0;
alr_reg_to_mac(dat0, dat1, mac); is_static = !!(dat1 & LAN9303_ALR_DAT1_STATIC); - dump_ctx->cb(mac, 0, is_static, dump_ctx->data); + return dump_ctx->cb(mac, 0, is_static, dump_ctx->data); }
/* Set a static ALR entry. Delete entry if port_map is zero */ @@ -1214,9 +1220,7 @@ static int lan9303_port_fdb_dump(struct dsa_switch *ds, int port, };
dev_dbg(chip->dev, "%s(%d)\n", __func__, port); - lan9303_alr_loop(chip, alr_loop_cb_fdb_port_dump, &dump_ctx); - - return 0; + return lan9303_alr_loop(chip, alr_loop_cb_fdb_port_dump, &dump_ctx); }
static int lan9303_port_mdb_prepare(struct dsa_switch *ds, int port,
From: Yang Yingliang yangyingliang@huawei.com
[ Upstream commit 519133debcc19f5c834e7e28480b60bdc234fe02 ]
I got a memleak report:
BUG: memory leak unreferenced object 0x607ee521a658 (size 240): comm "syz-executor.0", pid 955, jiffies 4294780569 (age 16.449s) hex dump (first 32 bytes, cpu 1): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000d830ea5a>] br_multicast_add_port+0x1c2/0x300 net/bridge/br_multicast.c:1693 [<00000000274d9a71>] new_nbp net/bridge/br_if.c:435 [inline] [<00000000274d9a71>] br_add_if+0x670/0x1740 net/bridge/br_if.c:611 [<0000000012ce888e>] do_set_master net/core/rtnetlink.c:2513 [inline] [<0000000012ce888e>] do_set_master+0x1aa/0x210 net/core/rtnetlink.c:2487 [<0000000099d1cafc>] __rtnl_newlink+0x1095/0x13e0 net/core/rtnetlink.c:3457 [<00000000a01facc0>] rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3488 [<00000000acc9186c>] rtnetlink_rcv_msg+0x369/0xa10 net/core/rtnetlink.c:5550 [<00000000d4aabb9c>] netlink_rcv_skb+0x134/0x3d0 net/netlink/af_netlink.c:2504 [<00000000bc2e12a3>] netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline] [<00000000bc2e12a3>] netlink_unicast+0x4a0/0x6a0 net/netlink/af_netlink.c:1340 [<00000000e4dc2d0e>] netlink_sendmsg+0x789/0xc70 net/netlink/af_netlink.c:1929 [<000000000d22c8b3>] sock_sendmsg_nosec net/socket.c:654 [inline] [<000000000d22c8b3>] sock_sendmsg+0x139/0x170 net/socket.c:674 [<00000000e281417a>] ____sys_sendmsg+0x658/0x7d0 net/socket.c:2350 [<00000000237aa2ab>] ___sys_sendmsg+0xf8/0x170 net/socket.c:2404 [<000000004f2dc381>] __sys_sendmsg+0xd3/0x190 net/socket.c:2433 [<0000000005feca6c>] do_syscall_64+0x37/0x90 arch/x86/entry/common.c:47 [<000000007304477d>] entry_SYSCALL_64_after_hwframe+0x44/0xae
On error path of br_add_if(), p->mcast_stats allocated in new_nbp() need be freed, or it will be leaked.
Fixes: 1080ab95e3c7 ("net: bridge: add support for IGMP/MLD stats and export them via netlink") Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Acked-by: Nikolay Aleksandrov nikolay@nvidia.com Link: https://lore.kernel.org/r/20210809132023.978546-1-yangyingliang@huawei.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/bridge/br_if.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c index 5aa508a08a69..b5fb2b682e19 100644 --- a/net/bridge/br_if.c +++ b/net/bridge/br_if.c @@ -604,6 +604,7 @@ int br_add_if(struct net_bridge *br, struct net_device *dev,
err = dev_set_allmulti(dev, 1); if (err) { + br_multicast_del_port(p); kfree(p); /* kobject not yet init'd, manually free */ goto err1; } @@ -708,6 +709,7 @@ err4: err3: sysfs_remove_link(br->ifobj, p->dev->name); err2: + br_multicast_del_port(p); kobject_put(&p->kobj); dev_set_allmulti(dev, -1); err1:
From: Neal Cardwell ncardwell@google.com
[ Upstream commit 6de035fec045f8ae5ee5f3a02373a18b939e91fb ]
Currently if BBR congestion control is initialized after more than 2B packets have been delivered, depending on the phase of the tp->delivered counter the tracking of BBR round trips can get stuck.
The bug arises because if tp->delivered is between 2^31 and 2^32 at the time the BBR congestion control module is initialized, then the initialization of bbr->next_rtt_delivered to 0 will cause the logic to believe that the end of the round trip is still billions of packets in the future. More specifically, the following check will fail repeatedly:
!before(rs->prior_delivered, bbr->next_rtt_delivered)
and thus the connection will take up to 2B packets delivered before that check will pass and the connection will set:
bbr->round_start = 1;
This could cause many mechanisms in BBR to fail to trigger, for example bbr_check_full_bw_reached() would likely never exit STARTUP.
This bug is 5 years old and has not been observed, and as a practical matter this would likely rarely trigger, since it would require transferring at least 2B packets, or likely more than 3 terabytes of data, before switching congestion control algorithms to BBR.
This patch is a stable candidate for kernels as far back as v4.9, when tcp_bbr.c was added.
Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control") Signed-off-by: Neal Cardwell ncardwell@google.com Reviewed-by: Yuchung Cheng ycheng@google.com Reviewed-by: Kevin Yang yyd@google.com Reviewed-by: Eric Dumazet edumazet@google.com Link: https://lore.kernel.org/r/20210811024056.235161-1-ncardwell@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp_bbr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c index b70c9365e131..1740de053072 100644 --- a/net/ipv4/tcp_bbr.c +++ b/net/ipv4/tcp_bbr.c @@ -985,7 +985,7 @@ static void bbr_init(struct sock *sk) bbr->prior_cwnd = 0; tp->snd_ssthresh = TCP_INFINITE_SSTHRESH; bbr->rtt_cnt = 0; - bbr->next_rtt_delivered = 0; + bbr->next_rtt_delivered = tp->delivered; bbr->prev_ca_state = TCP_CA_Open; bbr->packet_conservation = 0;
From: Eric Dumazet edumazet@google.com
[ Upstream commit b69dd5b3780a7298bd893816a09da751bc0636f7 ]
Some arches support cmpxchg() on 4-byte and 8-byte only. Increase mr_ifc_count width to 32bit to fix this problem.
Fixes: 4a2b285e7e10 ("net: igmp: fix data-race in igmp_ifc_timer_expire()") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: Guenter Roeck linux@roeck-us.net Link: https://lore.kernel.org/r/20210811195715.3684218-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/inetdevice.h | 2 +- net/ipv4/igmp.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/inetdevice.h b/include/linux/inetdevice.h index a64f21a97369..131f93f8d587 100644 --- a/include/linux/inetdevice.h +++ b/include/linux/inetdevice.h @@ -41,7 +41,7 @@ struct in_device { unsigned long mr_qri; /* Query Response Interval */ unsigned char mr_qrv; /* Query Robustness Variable */ unsigned char mr_gq_running; - unsigned char mr_ifc_count; + u32 mr_ifc_count; struct timer_list mr_gq_timer; /* general query timer */ struct timer_list mr_ifc_timer; /* interface change timer */
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c index 95ec3923083f..dca7fe0ae24a 100644 --- a/net/ipv4/igmp.c +++ b/net/ipv4/igmp.c @@ -807,7 +807,7 @@ static void igmp_gq_timer_expire(struct timer_list *t) static void igmp_ifc_timer_expire(struct timer_list *t) { struct in_device *in_dev = from_timer(in_dev, t, mr_ifc_timer); - u8 mr_ifc_count; + u32 mr_ifc_count;
igmpv3_send_cr(in_dev); restart:
From: Maximilian Heyne mheyne@amazon.de
[ Upstream commit 88ca2521bd5b4e8b83743c01a2d4cb09325b51e9 ]
There is a TOCTOU issue in set_evtchn_to_irq. Rows in the evtchn_to_irq mapping are lazily allocated in this function. The check whether the row is already present and the row initialization is not synchronized. Two threads can at the same time allocate a new row for evtchn_to_irq and add the irq mapping to the their newly allocated row. One thread will overwrite what the other has set for evtchn_to_irq[row] and therefore the irq mapping is lost. This will trigger a BUG_ON later in bind_evtchn_to_cpu:
INFO: pci 0000:1a:15.4: [1d0f:8061] type 00 class 0x010802 INFO: nvme 0000:1a:12.1: enabling device (0000 -> 0002) INFO: nvme nvme77: 1/0/0 default/read/poll queues CRIT: kernel BUG at drivers/xen/events/events_base.c:427! WARN: invalid opcode: 0000 [#1] SMP NOPTI WARN: Workqueue: nvme-reset-wq nvme_reset_work [nvme] WARN: RIP: e030:bind_evtchn_to_cpu+0xc2/0xd0 WARN: Call Trace: WARN: set_affinity_irq+0x121/0x150 WARN: irq_do_set_affinity+0x37/0xe0 WARN: irq_setup_affinity+0xf6/0x170 WARN: irq_startup+0x64/0xe0 WARN: __setup_irq+0x69e/0x740 WARN: ? request_threaded_irq+0xad/0x160 WARN: request_threaded_irq+0xf5/0x160 WARN: ? nvme_timeout+0x2f0/0x2f0 [nvme] WARN: pci_request_irq+0xa9/0xf0 WARN: ? pci_alloc_irq_vectors_affinity+0xbb/0x130 WARN: queue_request_irq+0x4c/0x70 [nvme] WARN: nvme_reset_work+0x82d/0x1550 [nvme] WARN: ? check_preempt_wakeup+0x14f/0x230 WARN: ? check_preempt_curr+0x29/0x80 WARN: ? nvme_irq_check+0x30/0x30 [nvme] WARN: process_one_work+0x18e/0x3c0 WARN: worker_thread+0x30/0x3a0 WARN: ? process_one_work+0x3c0/0x3c0 WARN: kthread+0x113/0x130 WARN: ? kthread_park+0x90/0x90 WARN: ret_from_fork+0x3a/0x50
This patch sets evtchn_to_irq rows via a cmpxchg operation so that they will be set only once. The row is now cleared before writing it to evtchn_to_irq in order to not create a race once the row is visible for other threads.
While at it, do not require the page to be zeroed, because it will be overwritten with -1's in clear_evtchn_to_irq_row anyway.
Signed-off-by: Maximilian Heyne mheyne@amazon.de Fixes: d0b075ffeede ("xen/events: Refactor evtchn_to_irq array to be dynamically allocated") Link: https://lore.kernel.org/r/20210812130930.127134-1-mheyne@amazon.de Reviewed-by: Boris Ostrovsky boris.ostrovsky@oracle.com Signed-off-by: Boris Ostrovsky boris.ostrovsky@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/xen/events/events_base.c | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index a2f8130e18fe..d138027034fd 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -133,12 +133,12 @@ static void disable_dynirq(struct irq_data *data);
static DEFINE_PER_CPU(unsigned int, irq_epoch);
-static void clear_evtchn_to_irq_row(unsigned row) +static void clear_evtchn_to_irq_row(int *evtchn_row) { unsigned col;
for (col = 0; col < EVTCHN_PER_ROW; col++) - WRITE_ONCE(evtchn_to_irq[row][col], -1); + WRITE_ONCE(evtchn_row[col], -1); }
static void clear_evtchn_to_irq_all(void) @@ -148,7 +148,7 @@ static void clear_evtchn_to_irq_all(void) for (row = 0; row < EVTCHN_ROW(xen_evtchn_max_channels()); row++) { if (evtchn_to_irq[row] == NULL) continue; - clear_evtchn_to_irq_row(row); + clear_evtchn_to_irq_row(evtchn_to_irq[row]); } }
@@ -156,6 +156,7 @@ static int set_evtchn_to_irq(unsigned evtchn, unsigned irq) { unsigned row; unsigned col; + int *evtchn_row;
if (evtchn >= xen_evtchn_max_channels()) return -EINVAL; @@ -168,11 +169,18 @@ static int set_evtchn_to_irq(unsigned evtchn, unsigned irq) if (irq == -1) return 0;
- evtchn_to_irq[row] = (int *)get_zeroed_page(GFP_KERNEL); - if (evtchn_to_irq[row] == NULL) + evtchn_row = (int *) __get_free_pages(GFP_KERNEL, 0); + if (evtchn_row == NULL) return -ENOMEM;
- clear_evtchn_to_irq_row(row); + clear_evtchn_to_irq_row(evtchn_row); + + /* + * We've prepared an empty row for the mapping. If a different + * thread was faster inserting it, we can drop ours. + */ + if (cmpxchg(&evtchn_to_irq[row], NULL, evtchn_row) != NULL) + free_page((unsigned long) evtchn_row); }
WRITE_ONCE(evtchn_to_irq[row][col], irq);
From: "Longpeng(Mike)" longpeng2@huawei.com
[ Upstream commit 49b0b6ffe20c5344f4173f3436298782a08da4f2 ]
There's a potential deadlock case when remove the vsock device or process the RESET event:
vsock_for_each_connected_socket: spin_lock_bh(&vsock_table_lock) ----------- (1) ... virtio_vsock_reset_sock: lock_sock(sk) --------------------- (2) ... spin_unlock_bh(&vsock_table_lock)
lock_sock() may do initiative schedule when the 'sk' is owned by other thread at the same time, we would receivce a warning message that "scheduling while atomic".
Even worse, if the next task (selected by the scheduler) try to release a 'sk', it need to request vsock_table_lock and the deadlock occur, cause the system into softlockup state. Call trace: queued_spin_lock_slowpath vsock_remove_bound vsock_remove_sock virtio_transport_release __vsock_release vsock_release __sock_release sock_close __fput ____fput
So we should not require sk_lock in this case, just like the behavior in vhost_vsock or vmci.
Fixes: 0ea9e1d3a9e3 ("VSOCK: Introduce virtio_transport.ko") Cc: Stefan Hajnoczi stefanha@redhat.com Signed-off-by: Longpeng(Mike) longpeng2@huawei.com Reviewed-by: Stefano Garzarella sgarzare@redhat.com Link: https://lore.kernel.org/r/20210812053056.1699-1-longpeng2@huawei.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/vmw_vsock/virtio_transport.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c index cc70d651d13e..e34979fcefd2 100644 --- a/net/vmw_vsock/virtio_transport.c +++ b/net/vmw_vsock/virtio_transport.c @@ -373,11 +373,14 @@ static void virtio_vsock_event_fill(struct virtio_vsock *vsock)
static void virtio_vsock_reset_sock(struct sock *sk) { - lock_sock(sk); + /* vmci_transport.c doesn't take sk_lock here either. At least we're + * under vsock_table_lock so the sock cannot disappear while we're + * executing. + */ + sk->sk_state = TCP_CLOSE; sk->sk_err = ECONNRESET; sk->sk_error_report(sk); - release_sock(sk); }
static void virtio_vsock_update_guest_cid(struct virtio_vsock *vsock)
From: Pu Lehui pulehui@huawei.com
[ Upstream commit 43e8f76006592cb1573a959aa287c45421066f9c ]
When using kprobe on powerpc booke series processor, Oops happens as show bellow:
/ # echo "p:myprobe do_nanosleep" > /sys/kernel/debug/tracing/kprobe_events / # echo 1 > /sys/kernel/debug/tracing/events/kprobes/myprobe/enable / # sleep 1 [ 50.076730] Oops: Exception in kernel mode, sig: 5 [#1] [ 50.077017] BE PAGE_SIZE=4K SMP NR_CPUS=24 QEMU e500 [ 50.077221] Modules linked in: [ 50.077462] CPU: 0 PID: 77 Comm: sleep Not tainted 5.14.0-rc4-00022-g251a1524293d #21 [ 50.077887] NIP: c0b9c4e0 LR: c00ebecc CTR: 00000000 [ 50.078067] REGS: c3883de0 TRAP: 0700 Not tainted (5.14.0-rc4-00022-g251a1524293d) [ 50.078349] MSR: 00029000 <CE,EE,ME> CR: 24000228 XER: 20000000 [ 50.078675] [ 50.078675] GPR00: c00ebdf0 c3883e90 c313e300 c3883ea0 00000001 00000000 c3883ecc 00000001 [ 50.078675] GPR08: c100598c c00ea250 00000004 00000000 24000222 102490c2 bff4180c 101e60d4 [ 50.078675] GPR16: 00000000 102454ac 00000040 10240000 10241100 102410f8 10240000 00500000 [ 50.078675] GPR24: 00000002 00000000 c3883ea0 00000001 00000000 0000c350 3b9b8d50 00000000 [ 50.080151] NIP [c0b9c4e0] do_nanosleep+0x0/0x190 [ 50.080352] LR [c00ebecc] hrtimer_nanosleep+0x14c/0x1e0 [ 50.080638] Call Trace: [ 50.080801] [c3883e90] [c00ebdf0] hrtimer_nanosleep+0x70/0x1e0 (unreliable) [ 50.081110] [c3883f00] [c00ec004] sys_nanosleep_time32+0xa4/0x110 [ 50.081336] [c3883f40] [c001509c] ret_from_syscall+0x0/0x28 [ 50.081541] --- interrupt: c00 at 0x100a4d08 [ 50.081749] NIP: 100a4d08 LR: 101b5234 CTR: 00000003 [ 50.081931] REGS: c3883f50 TRAP: 0c00 Not tainted (5.14.0-rc4-00022-g251a1524293d) [ 50.082183] MSR: 0002f902 <CE,EE,PR,FP,ME> CR: 24000222 XER: 00000000 [ 50.082457] [ 50.082457] GPR00: 000000a2 bf980040 1024b4d0 bf980084 bf980084 64000000 00555345 fefefeff [ 50.082457] GPR08: 7f7f7f7f 101e0000 00000069 00000003 28000422 102490c2 bff4180c 101e60d4 [ 50.082457] GPR16: 00000000 102454ac 00000040 10240000 10241100 102410f8 10240000 00500000 [ 50.082457] GPR24: 00000002 bf9803f4 10240000 00000000 00000000 100039e0 00000000 102444e8 [ 50.083789] NIP [100a4d08] 0x100a4d08 [ 50.083917] LR [101b5234] 0x101b5234 [ 50.084042] --- interrupt: c00 [ 50.084238] Instruction dump: [ 50.084483] 4bfffc40 60000000 60000000 60000000 9421fff0 39400402 914200c0 38210010 [ 50.084841] 4bfffc20 00000000 00000000 00000000 <7fe00008> 7c0802a6 7c892378 93c10048 [ 50.085487] ---[ end trace f6fffe98e2fa8f3e ]--- [ 50.085678] Trace/breakpoint trap
There is no real mode for booke arch and the MMU translation is always on. The corresponding MSR_IS/MSR_DS bit in booke is used to switch the address space, but not for real mode judgment.
Fixes: 21f8b2fa3ca5 ("powerpc/kprobes: Ignore traps that happened in real mode") Signed-off-by: Pu Lehui pulehui@huawei.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20210809023658.218915-1-pulehui@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/kernel/kprobes.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c index 53a39661eb13..ccf16bccc2bc 100644 --- a/arch/powerpc/kernel/kprobes.c +++ b/arch/powerpc/kernel/kprobes.c @@ -277,7 +277,8 @@ int kprobe_handler(struct pt_regs *regs) if (user_mode(regs)) return 0;
- if (!(regs->msr & MSR_IR) || !(regs->msr & MSR_DR)) + if (!IS_ENABLED(CONFIG_BOOKE) && + (!(regs->msr & MSR_IR) || !(regs->msr & MSR_DR))) return 0;
/*
From: Randy Dunlap rdunlap@infradead.org
[ Upstream commit 839ad22f755132838f406751439363c07272ad87 ]
Skip (omit) any version string info that is parenthesized.
Warning: objdump version 15) is older than 2.19 Warning: Skipping posttest.
where 'objdump -v' says: GNU objdump (GNU Binutils; SUSE Linux Enterprise 15) 2.35.1.20201123-7.18
Fixes: 8bee738bb1979 ("x86: Fix objdump version check in chkobjdump.awk for different formats.") Signed-off-by: Randy Dunlap rdunlap@infradead.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Masami Hiramatsu mhiramat@kernel.org Link: https://lore.kernel.org/r/20210731000146.2720-1-rdunlap@infradead.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/tools/chkobjdump.awk | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/x86/tools/chkobjdump.awk b/arch/x86/tools/chkobjdump.awk index fd1ab80be0de..a4cf678cf5c8 100644 --- a/arch/x86/tools/chkobjdump.awk +++ b/arch/x86/tools/chkobjdump.awk @@ -10,6 +10,7 @@ BEGIN {
/^GNU objdump/ { verstr = "" + gsub(/(.*)/, ""); for (i = 3; i <= NF; i++) if (match($(i), "^[0-9]")) { verstr = $(i);
From: Thomas Gleixner tglx@linutronix.de
commit 826da771291fc25a428e871f9e7fb465e390f852 upstream.
X86 IO/APIC and MSI interrupts (when used without interrupts remapping) require that the affinity setup on startup is done before the interrupt is enabled for the first time as the non-remapped operation mode cannot safely migrate enabled interrupts from arbitrary contexts. Provide a new irq chip flag which allows affected hardware to request this.
This has to be opt-in because there have been reports in the past that some interrupt chips cannot handle affinity setting before startup.
Fixes: 18404756765c ("genirq: Expose default irq affinity mask (take 3)") Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Marc Zyngier maz@kernel.org Reviewed-by: Marc Zyngier maz@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.779791738@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/irq.h | 2 ++ kernel/irq/chip.c | 5 ++++- 2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/include/linux/irq.h b/include/linux/irq.h index a042faefb9b7..9504267414a4 100644 --- a/include/linux/irq.h +++ b/include/linux/irq.h @@ -535,6 +535,7 @@ struct irq_chip { * IRQCHIP_ONESHOT_SAFE: One shot does not require mask/unmask * IRQCHIP_EOI_THREADED: Chip requires eoi() on unmask in threaded mode * IRQCHIP_SUPPORTS_LEVEL_MSI Chip can provide two doorbells for Level MSIs + * IRQCHIP_AFFINITY_PRE_STARTUP: Default affinity update before startup */ enum { IRQCHIP_SET_TYPE_MASKED = (1 << 0), @@ -545,6 +546,7 @@ enum { IRQCHIP_ONESHOT_SAFE = (1 << 5), IRQCHIP_EOI_THREADED = (1 << 6), IRQCHIP_SUPPORTS_LEVEL_MSI = (1 << 7), + IRQCHIP_AFFINITY_PRE_STARTUP = (1 << 10), };
#include <linux/irqdesc.h> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c index 09d914e486a2..9afbd89b6096 100644 --- a/kernel/irq/chip.c +++ b/kernel/irq/chip.c @@ -265,8 +265,11 @@ int irq_startup(struct irq_desc *desc, bool resend, bool force) } else { switch (__irq_startup_managed(desc, aff, force)) { case IRQ_STARTUP_NORMAL: + if (d->chip->flags & IRQCHIP_AFFINITY_PRE_STARTUP) + irq_setup_affinity(desc); ret = __irq_startup(desc); - irq_setup_affinity(desc); + if (!(d->chip->flags & IRQCHIP_AFFINITY_PRE_STARTUP)) + irq_setup_affinity(desc); break; case IRQ_STARTUP_MANAGED: irq_do_set_affinity(d, aff, false);
From: Thomas Gleixner tglx@linutronix.de
commit ff363f480e5997051dd1de949121ffda3b753741 upstream.
The X86 MSI mechanism cannot handle interrupt affinity changes safely after startup other than from an interrupt handler, unless interrupt remapping is enabled. The startup sequence in the generic interrupt code violates that assumption.
Mark the irq chips with the new IRQCHIP_AFFINITY_PRE_STARTUP flag so that the default interrupt setting happens before the interrupt is started up for the first time.
While the interrupt remapping MSI chip does not require this, there is no point in treating it differently as this might spare an interrupt to a CPU which is not in the default affinity mask.
For the non-remapping case go to the direct write path when the interrupt is not yet started similar to the not yet activated case.
Fixes: 18404756765c ("genirq: Expose default irq affinity mask (take 3)") Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Marc Zyngier maz@kernel.org Reviewed-by: Marc Zyngier maz@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.886722080@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/apic/msi.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c index fb26c956c442..ca17a3848834 100644 --- a/arch/x86/kernel/apic/msi.c +++ b/arch/x86/kernel/apic/msi.c @@ -89,11 +89,13 @@ msi_set_affinity(struct irq_data *irqd, const struct cpumask *mask, bool force) * The quirk bit is not set in this case. * - The new vector is the same as the old vector * - The old vector is MANAGED_IRQ_SHUTDOWN_VECTOR (interrupt starts up) + * - The interrupt is not yet started up * - The new destination CPU is the same as the old destination CPU */ if (!irqd_msi_nomask_quirk(irqd) || cfg->vector == old_cfg.vector || old_cfg.vector == MANAGED_IRQ_SHUTDOWN_VECTOR || + !irqd_is_started(irqd) || cfg->dest_apicid == old_cfg.dest_apicid) { irq_msi_update_msg(irqd, cfg); return ret; @@ -181,7 +183,8 @@ static struct irq_chip pci_msi_controller = { .irq_retrigger = irq_chip_retrigger_hierarchy, .irq_compose_msi_msg = irq_msi_compose_msg, .irq_set_affinity = msi_set_affinity, - .flags = IRQCHIP_SKIP_SET_WAKE, + .flags = IRQCHIP_SKIP_SET_WAKE | + IRQCHIP_AFFINITY_PRE_STARTUP, };
int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type) @@ -282,7 +285,8 @@ static struct irq_chip pci_msi_ir_controller = { .irq_ack = irq_chip_ack_parent, .irq_retrigger = irq_chip_retrigger_hierarchy, .irq_set_vcpu_affinity = irq_chip_set_vcpu_affinity_parent, - .flags = IRQCHIP_SKIP_SET_WAKE, + .flags = IRQCHIP_SKIP_SET_WAKE | + IRQCHIP_AFFINITY_PRE_STARTUP, };
static struct msi_domain_info pci_msi_ir_domain_info = { @@ -325,7 +329,8 @@ static struct irq_chip dmar_msi_controller = { .irq_retrigger = irq_chip_retrigger_hierarchy, .irq_compose_msi_msg = irq_msi_compose_msg, .irq_write_msi_msg = dmar_msi_write_msg, - .flags = IRQCHIP_SKIP_SET_WAKE, + .flags = IRQCHIP_SKIP_SET_WAKE | + IRQCHIP_AFFINITY_PRE_STARTUP, };
static irq_hw_number_t dmar_msi_get_hwirq(struct msi_domain_info *info, @@ -423,7 +428,7 @@ static struct irq_chip hpet_msi_controller __ro_after_init = { .irq_retrigger = irq_chip_retrigger_hierarchy, .irq_compose_msi_msg = irq_msi_compose_msg, .irq_write_msi_msg = hpet_msi_write_msg, - .flags = IRQCHIP_SKIP_SET_WAKE, + .flags = IRQCHIP_SKIP_SET_WAKE | IRQCHIP_AFFINITY_PRE_STARTUP, };
static irq_hw_number_t hpet_msi_get_hwirq(struct msi_domain_info *info,
From: Thomas Gleixner tglx@linutronix.de
commit 0c0e37dc11671384e53ba6ede53a4d91162a2cc5 upstream.
The IO/APIC cannot handle interrupt affinity changes safely after startup other than from an interrupt handler. The startup sequence in the generic interrupt code violates that assumption.
Mark the irq chip with the new IRQCHIP_AFFINITY_PRE_STARTUP flag so that the default interrupt setting happens before the interrupt is started up for the first time.
Fixes: 18404756765c ("genirq: Expose default irq affinity mask (take 3)") Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Marc Zyngier maz@kernel.org Reviewed-by: Marc Zyngier maz@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.832143400@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/apic/io_apic.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c index a89dac380243..677508baf95a 100644 --- a/arch/x86/kernel/apic/io_apic.c +++ b/arch/x86/kernel/apic/io_apic.c @@ -1958,7 +1958,8 @@ static struct irq_chip ioapic_chip __read_mostly = { .irq_set_affinity = ioapic_set_affinity, .irq_retrigger = irq_chip_retrigger_hierarchy, .irq_get_irqchip_state = ioapic_irq_get_chip_state, - .flags = IRQCHIP_SKIP_SET_WAKE, + .flags = IRQCHIP_SKIP_SET_WAKE | + IRQCHIP_AFFINITY_PRE_STARTUP, };
static struct irq_chip ioapic_ir_chip __read_mostly = { @@ -1971,7 +1972,8 @@ static struct irq_chip ioapic_ir_chip __read_mostly = { .irq_set_affinity = ioapic_set_affinity, .irq_retrigger = irq_chip_retrigger_hierarchy, .irq_get_irqchip_state = ioapic_irq_get_chip_state, - .flags = IRQCHIP_SKIP_SET_WAKE, + .flags = IRQCHIP_SKIP_SET_WAKE | + IRQCHIP_AFFINITY_PRE_STARTUP, };
static inline void init_IO_APIC_traps(void)
From: Babu Moger Babu.Moger@amd.com
commit 064855a69003c24bd6b473b367d364e418c57625 upstream.
Creating a new sub monitoring group in the root /sys/fs/resctrl leads to getting the "Unavailable" value for mbm_total_bytes and mbm_local_bytes on the entire filesystem.
Steps to reproduce:
1. mount -t resctrl resctrl /sys/fs/resctrl/
2. cd /sys/fs/resctrl/
3. cat mon_data/mon_L3_00/mbm_total_bytes 23189832
4. Create sub monitor group: mkdir mon_groups/test1
5. cat mon_data/mon_L3_00/mbm_total_bytes Unavailable
When a new monitoring group is created, a new RMID is assigned to the new group. But the RMID is not active yet. When the events are read on the new RMID, it is expected to report the status as "Unavailable".
When the user reads the events on the default monitoring group with multiple subgroups, the events on all subgroups are consolidated together. Currently, if any of the RMID reads report as "Unavailable", then everything will be reported as "Unavailable".
Fix the issue by discarding the "Unavailable" reads and reporting all the successful RMID reads. This is not a problem on Intel systems as Intel reports 0 on Inactive RMIDs.
Fixes: d89b7379015f ("x86/intel_rdt/cqm: Add mon_data") Reported-by: Paweł Szulik pawel.szulik@intel.com Signed-off-by: Babu Moger Babu.Moger@amd.com Signed-off-by: Borislav Petkov bp@suse.de Acked-by: Reinette Chatre reinette.chatre@intel.com Cc: stable@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=213311 Link: https://lkml.kernel.org/r/162793309296.9224.15871659871696482080.stgit@bmoge... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/cpu/intel_rdt_monitor.c | 27 ++++++++++++------------- 1 file changed, 13 insertions(+), 14 deletions(-)
diff --git a/arch/x86/kernel/cpu/intel_rdt_monitor.c b/arch/x86/kernel/cpu/intel_rdt_monitor.c index 5dfa5ab9a5ae..6eeb17dfde48 100644 --- a/arch/x86/kernel/cpu/intel_rdt_monitor.c +++ b/arch/x86/kernel/cpu/intel_rdt_monitor.c @@ -233,15 +233,14 @@ static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr) return chunks >>= shift; }
-static int __mon_event_count(u32 rmid, struct rmid_read *rr) +static u64 __mon_event_count(u32 rmid, struct rmid_read *rr) { struct mbm_state *m; u64 chunks, tval;
tval = __rmid_read(rmid, rr->evtid); if (tval & (RMID_VAL_ERROR | RMID_VAL_UNAVAIL)) { - rr->val = tval; - return -EINVAL; + return tval; } switch (rr->evtid) { case QOS_L3_OCCUP_EVENT_ID: @@ -253,12 +252,6 @@ static int __mon_event_count(u32 rmid, struct rmid_read *rr) case QOS_L3_MBM_LOCAL_EVENT_ID: m = &rr->d->mbm_local[rmid]; break; - default: - /* - * Code would never reach here because - * an invalid event id would fail the __rmid_read. - */ - return -EINVAL; }
if (rr->first) { @@ -308,23 +301,29 @@ void mon_event_count(void *info) struct rdtgroup *rdtgrp, *entry; struct rmid_read *rr = info; struct list_head *head; + u64 ret_val;
rdtgrp = rr->rgrp;
- if (__mon_event_count(rdtgrp->mon.rmid, rr)) - return; + ret_val = __mon_event_count(rdtgrp->mon.rmid, rr);
/* - * For Ctrl groups read data from child monitor groups. + * For Ctrl groups read data from child monitor groups and + * add them together. Count events which are read successfully. + * Discard the rmid_read's reporting errors. */ head = &rdtgrp->mon.crdtgrp_list;
if (rdtgrp->type == RDTCTRL_GROUP) { list_for_each_entry(entry, head, mon.crdtgrp_list) { - if (__mon_event_count(entry->mon.rmid, rr)) - return; + if (__mon_event_count(entry->mon.rmid, rr) == 0) + ret_val = 0; } } + + /* Report error if none of rmid_reads are successful */ + if (ret_val) + rr->val = ret_val; }
/*
From: Bixuan Cui cuibixuan@huawei.com
commit dbbc93576e03fbe24b365fab0e901eb442237a8a upstream.
msi_domain_alloc_irqs() invokes irq_domain_activate_irq(), but msi_domain_free_irqs() does not enforce deactivation before tearing down the interrupts.
This happens when PCI/MSI interrupts are set up and never used before being torn down again, e.g. in error handling pathes. The only place which cleans that up is the error handling path in msi_domain_alloc_irqs().
Move the cleanup from msi_domain_alloc_irqs() into msi_domain_free_irqs() to cure that.
Fixes: f3b0946d629c ("genirq/msi: Make sure PCI MSIs are activated early") Signed-off-by: Bixuan Cui cuibixuan@huawei.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210518033117.78104-1-cuibixuan@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/irq/msi.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c index 604974f2afb1..88269dd5a8ca 100644 --- a/kernel/irq/msi.c +++ b/kernel/irq/msi.c @@ -477,11 +477,6 @@ skip_activate: return 0;
cleanup: - for_each_msi_vector(desc, i, dev) { - irq_data = irq_domain_get_irq_data(domain, i); - if (irqd_is_activated(irq_data)) - irq_domain_deactivate_irq(irq_data); - } msi_domain_free_irqs(domain, dev); return ret; } @@ -494,7 +489,15 @@ cleanup: */ void msi_domain_free_irqs(struct irq_domain *domain, struct device *dev) { + struct irq_data *irq_data; struct msi_desc *desc; + int i; + + for_each_msi_vector(desc, i, dev) { + irq_data = irq_domain_get_irq_data(domain, i); + if (irqd_is_activated(irq_data)) + irq_domain_deactivate_irq(irq_data); + }
for_each_msi_entry(desc, dev) { /*
From: Thomas Gleixner tglx@linutronix.de
commit 438553958ba19296663c6d6583d208dfb6792830 upstream.
The ordering of MSI-X enable in hardware is dysfunctional:
1) MSI-X is disabled in the control register 2) Various setup functions 3) pci_msi_setup_msi_irqs() is invoked which ends up accessing the MSI-X table entries 4) MSI-X is enabled and masked in the control register with the comment that enabling is required for some hardware to access the MSI-X table
Step #4 obviously contradicts #3. The history of this is an issue with the NIU hardware. When #4 was introduced the table access actually happened in msix_program_entries() which was invoked after enabling and masking MSI-X.
This was changed in commit d71d6432e105 ("PCI/MSI: Kill redundant call of irq_set_msi_desc() for MSI-X interrupts") which removed the table write from msix_program_entries().
Interestingly enough nobody noticed and either NIU still works or it did not get any testing with a kernel 3.19 or later.
Nevertheless this is inconsistent and there is no reason why MSI-X can't be enabled and masked in the control register early on, i.e. move step #4 above to step #1. This preserves the NIU workaround and has no side effects on other hardware.
Fixes: d71d6432e105 ("PCI/MSI: Kill redundant call of irq_set_msi_desc() for MSI-X interrupts") Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Marc Zyngier maz@kernel.org Reviewed-by: Ashok Raj ashok.raj@intel.com Reviewed-by: Marc Zyngier maz@kernel.org Acked-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.344136412@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/msi.c | 28 +++++++++++++++------------- 1 file changed, 15 insertions(+), 13 deletions(-)
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index 23a363fd4c59..949dc342c16a 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -743,18 +743,25 @@ static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries, u16 control; void __iomem *base;
- /* Ensure MSI-X is disabled while it is set up */ - pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0); + /* + * Some devices require MSI-X to be enabled before the MSI-X + * registers can be accessed. Mask all the vectors to prevent + * interrupts coming in before they're fully set up. + */ + pci_msix_clear_and_set_ctrl(dev, 0, PCI_MSIX_FLAGS_MASKALL | + PCI_MSIX_FLAGS_ENABLE);
pci_read_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, &control); /* Request & Map MSI-X table region */ base = msix_map_region(dev, msix_table_size(control)); - if (!base) - return -ENOMEM; + if (!base) { + ret = -ENOMEM; + goto out_disable; + }
ret = msix_setup_entries(dev, base, entries, nvec, affd); if (ret) - return ret; + goto out_disable;
ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX); if (ret) @@ -765,14 +772,6 @@ static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries, if (ret) goto out_free;
- /* - * Some devices require MSI-X to be enabled before we can touch the - * MSI-X registers. We need to mask all the vectors to prevent - * interrupts coming in before they're fully set up. - */ - pci_msix_clear_and_set_ctrl(dev, 0, - PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE); - msix_program_entries(dev, entries);
ret = populate_msi_sysfs(dev); @@ -807,6 +806,9 @@ out_avail: out_free: free_msi_irqs(dev);
+out_disable: + pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0); + return ret; }
From: Thomas Gleixner tglx@linutronix.de
commit 361fd37397f77578735907341579397d5bed0a2d upstream.
msi_mask_irq() takes a mask and a flags argument. The mask argument is used to mask out bits from the cached mask and the flags argument to set bits.
Some places invoke it with a flags argument which sets bits which are not used by the device, i.e. when the device supports up to 8 vectors a full unmask in some places sets the mask to 0xFFFFFF00. While devices probably do not care, it's still bad practice.
Fixes: 7ba1930db02f ("PCI MSI: Unmask MSI if setup failed") Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Marc Zyngier maz@kernel.org Reviewed-by: Marc Zyngier maz@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.568173099@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/msi.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index 949dc342c16a..77e096c942ec 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -619,21 +619,21 @@ static int msi_capability_init(struct pci_dev *dev, int nvec, /* Configure MSI capability structure */ ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSI); if (ret) { - msi_mask_irq(entry, mask, ~mask); + msi_mask_irq(entry, mask, 0); free_msi_irqs(dev); return ret; }
ret = msi_verify_entries(dev); if (ret) { - msi_mask_irq(entry, mask, ~mask); + msi_mask_irq(entry, mask, 0); free_msi_irqs(dev); return ret; }
ret = populate_msi_sysfs(dev); if (ret) { - msi_mask_irq(entry, mask, ~mask); + msi_mask_irq(entry, mask, 0); free_msi_irqs(dev); return ret; } @@ -897,7 +897,7 @@ static void pci_msi_shutdown(struct pci_dev *dev) /* Return the device with MSI unmasked as initial states */ mask = msi_mask(desc->msi_attrib.multi_cap); /* Keep cached state to be restored */ - __pci_msi_desc_mask_irq(desc, mask, ~mask); + __pci_msi_desc_mask_irq(desc, mask, 0);
/* Restore dev->irq to its default pin-assertion irq */ dev->irq = desc->msi_attrib.default_irq;
From: Thomas Gleixner tglx@linutronix.de
commit 689e6b5351573c38ccf92a0dd8b3e2c2241e4aff upstream.
The comments about preserving the cached state in pci_msi[x]_shutdown() are misleading as the MSI descriptors are freed right after those functions return. So there is nothing to restore. Preparatory change.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Marc Zyngier maz@kernel.org Reviewed-by: Marc Zyngier maz@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.621609423@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/msi.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index 77e096c942ec..758223252cd4 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -896,7 +896,6 @@ static void pci_msi_shutdown(struct pci_dev *dev)
/* Return the device with MSI unmasked as initial states */ mask = msi_mask(desc->msi_attrib.multi_cap); - /* Keep cached state to be restored */ __pci_msi_desc_mask_irq(desc, mask, 0);
/* Restore dev->irq to its default pin-assertion irq */ @@ -982,10 +981,8 @@ static void pci_msix_shutdown(struct pci_dev *dev) }
/* Return the device with MSI-X masked as initial states */ - for_each_pci_msi_entry(entry, dev) { - /* Keep cached states to be restored */ + for_each_pci_msi_entry(entry, dev) __pci_msix_desc_mask_irq(entry, 1); - }
pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0); pci_intx_for_msi(dev, 1);
From: Thomas Gleixner tglx@linutronix.de
commit d28d4ad2a1aef27458b3383725bb179beb8d015c upstream.
No point in using the raw write function from shutdown. Preparatory change to introduce proper serialization for the msi_desc::masked cache.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Marc Zyngier maz@kernel.org Reviewed-by: Marc Zyngier maz@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.674391354@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/msi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index 758223252cd4..677b58670011 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -896,7 +896,7 @@ static void pci_msi_shutdown(struct pci_dev *dev)
/* Return the device with MSI unmasked as initial states */ mask = msi_mask(desc->msi_attrib.multi_cap); - __pci_msi_desc_mask_irq(desc, mask, 0); + msi_mask_irq(desc, mask, 0);
/* Restore dev->irq to its default pin-assertion irq */ dev->irq = desc->msi_attrib.default_irq;
From: Thomas Gleixner tglx@linutronix.de
commit 77e89afc25f30abd56e76a809ee2884d7c1b63ce upstream.
Multi-MSI uses a single MSI descriptor and there is a single mask register when the device supports per vector masking. To avoid reading back the mask register the value is cached in the MSI descriptor and updates are done by clearing and setting bits in the cache and writing it to the device.
But nothing protects msi_desc::masked and the mask register from being modified concurrently on two different CPUs for two different Linux interrupts which belong to the same multi-MSI descriptor.
Add a lock to struct device and protect any operation on the mask and the mask register with it.
This makes the update of msi_desc::masked unconditional, but there is no place which requires a modification of the hardware register without updating the masked cache.
msi_mask_irq() is now an empty wrapper which will be cleaned up in follow up changes.
The problem goes way back to the initial support of multi-MSI, but picking the commit which introduced the mask cache is a valid cut off point (2.6.30).
Fixes: f2440d9acbe8 ("PCI MSI: Refactor interrupt masking code") Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Marc Zyngier maz@kernel.org Reviewed-by: Marc Zyngier maz@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.726833414@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/base/core.c | 1 + drivers/pci/msi.c | 19 ++++++++++--------- include/linux/device.h | 1 + include/linux/msi.h | 2 +- 4 files changed, 13 insertions(+), 10 deletions(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c index f7f601858f10..6e380ad9d08a 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -1682,6 +1682,7 @@ void device_initialize(struct device *dev) device_pm_init(dev); set_dev_node(dev, -1); #ifdef CONFIG_GENERIC_MSI_IRQ + raw_spin_lock_init(&dev->msi_lock); INIT_LIST_HEAD(&dev->msi_list); #endif INIT_LIST_HEAD(&dev->links.consumers); diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index 677b58670011..a9cbc301a8a6 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -170,24 +170,25 @@ static inline __attribute_const__ u32 msi_mask(unsigned x) * reliably as devices without an INTx disable bit will then generate a * level IRQ which will never be cleared. */ -u32 __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag) +void __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag) { - u32 mask_bits = desc->masked; + raw_spinlock_t *lock = &desc->dev->msi_lock; + unsigned long flags;
if (pci_msi_ignore_mask || !desc->msi_attrib.maskbit) - return 0; + return;
- mask_bits &= ~mask; - mask_bits |= flag; + raw_spin_lock_irqsave(lock, flags); + desc->masked &= ~mask; + desc->masked |= flag; pci_write_config_dword(msi_desc_to_pci_dev(desc), desc->mask_pos, - mask_bits); - - return mask_bits; + desc->masked); + raw_spin_unlock_irqrestore(lock, flags); }
static void msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag) { - desc->masked = __pci_msi_desc_mask_irq(desc, mask, flag); + __pci_msi_desc_mask_irq(desc, mask, flag); }
static void __iomem *pci_msix_desc_addr(struct msi_desc *desc) diff --git a/include/linux/device.h b/include/linux/device.h index b1c8150e9ea5..37e359d81a86 100644 --- a/include/linux/device.h +++ b/include/linux/device.h @@ -998,6 +998,7 @@ struct device { struct dev_pin_info *pins; #endif #ifdef CONFIG_GENERIC_MSI_IRQ + raw_spinlock_t msi_lock; struct list_head msi_list; #endif
diff --git a/include/linux/msi.h b/include/linux/msi.h index 5dd171849a27..62982e6afddf 100644 --- a/include/linux/msi.h +++ b/include/linux/msi.h @@ -150,7 +150,7 @@ void __pci_read_msi_msg(struct msi_desc *entry, struct msi_msg *msg); void __pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg);
u32 __pci_msix_desc_mask_irq(struct msi_desc *desc, u32 flag); -u32 __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag); +void __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag); void pci_msi_mask_irq(struct irq_data *data); void pci_msi_unmask_irq(struct irq_data *data);
From: Thomas Gleixner tglx@linutronix.de
commit 7d5ec3d3612396dc6d4b76366d20ab9fc06f399f upstream.
When MSI-X is enabled the ordering of calls is:
msix_map_region(); msix_setup_entries(); pci_msi_setup_msi_irqs(); msix_program_entries();
This has a few interesting issues:
1) msix_setup_entries() allocates the MSI descriptors and initializes them except for the msi_desc:masked member which is left zero initialized.
2) pci_msi_setup_msi_irqs() allocates the interrupt descriptors and sets up the MSI interrupts which ends up in pci_write_msi_msg() unless the interrupt chip provides its own irq_write_msi_msg() function.
3) msix_program_entries() does not do what the name suggests. It solely updates the entries array (if not NULL) and initializes the masked member for each MSI descriptor by reading the hardware state and then masks the entry.
Obviously this has some issues:
1) The uninitialized masked member of msi_desc prevents the enforcement of masking the entry in pci_write_msi_msg() depending on the cached masked bit. Aside of that half initialized data is a NONO in general
2) msix_program_entries() only ensures that the actually allocated entries are masked. This is wrong as experimentation with crash testing and crash kernel kexec has shown.
This limited testing unearthed that when the production kernel had more entries in use and unmasked when it crashed and the crash kernel allocated a smaller amount of entries, then a full scan of all entries found unmasked entries which were in use in the production kernel.
This is obviously a device or emulation issue as the device reset should mask all MSI-X table entries, but obviously that's just part of the paper specification.
Cure this by:
1) Masking all table entries in hardware 2) Initializing msi_desc::masked in msix_setup_entries() 3) Removing the mask dance in msix_program_entries() 4) Renaming msix_program_entries() to msix_update_entries() to reflect the purpose of that function.
As the masking of unused entries has never been done the Fixes tag refers to a commit in: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Fixes: f036d4ea5fa7 ("[PATCH] ia32 Message Signalled Interrupt support") Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Marc Zyngier maz@kernel.org Reviewed-by: Marc Zyngier maz@kernel.org Acked-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.403833459@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/msi.c | 40 ++++++++++++++++++++++++++++------------ 1 file changed, 28 insertions(+), 12 deletions(-)
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index a9cbc301a8a6..d13b8b608891 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -675,6 +675,7 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base, { struct cpumask *curmsk, *masks = NULL; struct msi_desc *entry; + void __iomem *addr; int ret, i;
if (affd) @@ -694,6 +695,7 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
entry->msi_attrib.is_msix = 1; entry->msi_attrib.is_64 = 1; + if (entries) entry->msi_attrib.entry_nr = entries[i].entry; else @@ -701,6 +703,10 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base, entry->msi_attrib.default_irq = dev->irq; entry->mask_base = base;
+ addr = pci_msix_desc_addr(entry); + if (addr) + entry->masked = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL); + list_add_tail(&entry->list, dev_to_msi_list(&dev->dev)); if (masks) curmsk++; @@ -711,21 +717,27 @@ out: return ret; }
-static void msix_program_entries(struct pci_dev *dev, - struct msix_entry *entries) +static void msix_update_entries(struct pci_dev *dev, struct msix_entry *entries) { struct msi_desc *entry; - int i = 0;
for_each_pci_msi_entry(entry, dev) { - if (entries) - entries[i++].vector = entry->irq; - entry->masked = readl(pci_msix_desc_addr(entry) + - PCI_MSIX_ENTRY_VECTOR_CTRL); - msix_mask_irq(entry, 1); + if (entries) { + entries->vector = entry->irq; + entries++; + } } }
+static void msix_mask_all(void __iomem *base, int tsize) +{ + u32 ctrl = PCI_MSIX_ENTRY_CTRL_MASKBIT; + int i; + + for (i = 0; i < tsize; i++, base += PCI_MSIX_ENTRY_SIZE) + writel(ctrl, base + PCI_MSIX_ENTRY_VECTOR_CTRL); +} + /** * msix_capability_init - configure device's MSI-X capability * @dev: pointer to the pci_dev data structure of MSI-X device function @@ -740,9 +752,9 @@ static void msix_program_entries(struct pci_dev *dev, static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries, int nvec, const struct irq_affinity *affd) { - int ret; - u16 control; void __iomem *base; + int ret, tsize; + u16 control;
/* * Some devices require MSI-X to be enabled before the MSI-X @@ -754,12 +766,16 @@ static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries,
pci_read_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, &control); /* Request & Map MSI-X table region */ - base = msix_map_region(dev, msix_table_size(control)); + tsize = msix_table_size(control); + base = msix_map_region(dev, tsize); if (!base) { ret = -ENOMEM; goto out_disable; }
+ /* Ensure that all table entries are masked. */ + msix_mask_all(base, tsize); + ret = msix_setup_entries(dev, base, entries, nvec, affd); if (ret) goto out_disable; @@ -773,7 +789,7 @@ static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries, if (ret) goto out_free;
- msix_program_entries(dev, entries); + msix_update_entries(dev, entries);
ret = populate_msi_sysfs(dev); if (ret)
From: Thomas Gleixner tglx@linutronix.de
commit da181dc974ad667579baece33c2c8d2d1e4558d5 upstream.
The specification (PCIe r5.0, sec 6.1.4.5) states:
For MSI-X, a function is permitted to cache Address and Data values from unmasked MSI-X Table entries. However, anytime software unmasks a currently masked MSI-X Table entry either by clearing its Mask bit or by clearing the Function Mask bit, the function must update any Address or Data values that it cached from that entry. If software changes the Address or Data value of an entry while the entry is unmasked, the result is undefined.
The Linux kernel's MSI-X support never enforced that the entry is masked before the entry is modified hence the Fixes tag refers to a commit in: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Enforce the entry to be masked across the update.
There is no point in enforcing this to be handled at all possible call sites as this is just pointless code duplication and the common update function is the obvious place to enforce this.
Fixes: f036d4ea5fa7 ("[PATCH] ia32 Message Signalled Interrupt support") Reported-by: Kevin Tian kevin.tian@intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Marc Zyngier maz@kernel.org Reviewed-by: Marc Zyngier maz@kernel.org Acked-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.462096385@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/msi.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index d13b8b608891..5a28f7e81f0c 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -303,10 +303,25 @@ void __pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg) /* Don't touch the hardware now */ } else if (entry->msi_attrib.is_msix) { void __iomem *base = pci_msix_desc_addr(entry); + bool unmasked = !(entry->masked & PCI_MSIX_ENTRY_CTRL_MASKBIT); + + /* + * The specification mandates that the entry is masked + * when the message is modified: + * + * "If software changes the Address or Data value of an + * entry while the entry is unmasked, the result is + * undefined." + */ + if (unmasked) + __pci_msix_desc_mask_irq(entry, PCI_MSIX_ENTRY_CTRL_MASKBIT);
writel(msg->address_lo, base + PCI_MSIX_ENTRY_LOWER_ADDR); writel(msg->address_hi, base + PCI_MSIX_ENTRY_UPPER_ADDR); writel(msg->data, base + PCI_MSIX_ENTRY_DATA); + + if (unmasked) + __pci_msix_desc_mask_irq(entry, 0); } else { int pos = dev->msi_cap; u16 msgctl;
From: Thomas Gleixner tglx@linutronix.de
commit b9255a7cb51754e8d2645b65dd31805e282b4f3e upstream.
Nothing enforces the posted writes to be visible when the function returns. Flush them even if the flush might be redundant when the entry is masked already as the unmask will flush as well. This is either setup or a rare affinity change event so the extra flush is not the end of the world.
While this is more a theoretical issue especially the logic in the X86 specific msi_set_affinity() function relies on the assumption that the update has reached the hardware when the function returns.
Again, as this never has been enforced the Fixes tag refers to a commit in: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Fixes: f036d4ea5fa7 ("[PATCH] ia32 Message Signalled Interrupt support") Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Marc Zyngier maz@kernel.org Reviewed-by: Marc Zyngier maz@kernel.org Acked-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.515188147@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/msi.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index 5a28f7e81f0c..bc80b0f0ea1b 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -322,6 +322,9 @@ void __pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
if (unmasked) __pci_msix_desc_mask_irq(entry, 0); + + /* Ensure that the writes are visible in the device */ + readl(base + PCI_MSIX_ENTRY_DATA); } else { int pos = dev->msi_cap; u16 msgctl; @@ -342,6 +345,8 @@ void __pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg) pci_write_config_word(dev, pos + PCI_MSI_DATA_32, msg->data); } + /* Ensure that the writes are visible in the device */ + pci_read_config_word(dev, pos + PCI_MSI_FLAGS, &msgctl); } entry->msg = *msg; }
From: Nathan Chancellor nathan@kernel.org
commit 848378812e40152abe9b9baf58ce2004f76fb988 upstream.
A recent change in LLVM causes module_{c,d}tor sections to appear when CONFIG_K{A,C}SAN are enabled, which results in orphan section warnings because these are not handled anywhere:
ld.lld: warning: arch/x86/pci/built-in.a(legacy.o):(.text.asan.module_ctor) is being placed in '.text.asan.module_ctor' ld.lld: warning: arch/x86/pci/built-in.a(legacy.o):(.text.asan.module_dtor) is being placed in '.text.asan.module_dtor' ld.lld: warning: arch/x86/pci/built-in.a(legacy.o):(.text.tsan.module_ctor) is being placed in '.text.tsan.module_ctor'
Fangrui explains: "the function asan.module_ctor has the SHF_GNU_RETAIN flag, so it is in a separate section even with -fno-function-sections (default)".
Place them in the TEXT_TEXT section so that these technologies continue to work with the newer compiler versions. All of the KASAN and KCSAN KUnit tests continue to pass after this change.
Cc: stable@vger.kernel.org Link: https://github.com/ClangBuiltLinux/linux/issues/1432 Link: https://github.com/llvm/llvm-project/commit/7b789562244ee941b7bf2cefeb3fc08a... Signed-off-by: Nathan Chancellor nathan@kernel.org Reviewed-by: Nick Desaulniers ndesaulniers@google.com Reviewed-by: Fangrui Song maskray@google.com Acked-by: Marco Elver elver@google.com Signed-off-by: Kees Cook keescook@chromium.org Link: https://lore.kernel.org/r/20210731023107.1932981-1-nathan@kernel.org [nc: Resolve conflict due to lack of cf68fffb66d60] Signed-off-by: Nathan Chancellor nathan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/asm-generic/vmlinux.lds.h | 1 + 1 file changed, 1 insertion(+)
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index ad8766e1635e..a26e6f5034a6 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -508,6 +508,7 @@ NOINSTR_TEXT \ *(.text..refcount) \ *(.ref.text) \ + *(.text.asan.* .text.tsan.*) \ MEM_KEEP(init.text*) \ MEM_KEEP(exit.text*) \
From: Saeed Mirzamohammadi saeed.mirzamohammadi@oracle.com
[ Upstream commit 327d5b2fee91c404a3956c324193892cf2cc9528 ]
The IOMMU driver calculates the guest addressability for a DMA request based on the value of the mgaw reported from the IOMMU. However, this is a fused value and as mentioned in the spec, the guest width should be calculated based on the minimum of supported adjusted guest address width (SAGAW) and MGAW.
This is from specification: "Guest addressability for a given DMA request is limited to the minimum of the value reported through this field and the adjusted guest address width of the corresponding page-table structure. (Adjusted guest address widths supported by hardware are reported through the SAGAW field)."
This causes domain initialization to fail and following errors appear for EHCI PCI driver:
[ 2.486393] ehci-pci 0000:01:00.4: EHCI Host Controller [ 2.486624] ehci-pci 0000:01:00.4: new USB bus registered, assigned bus number 1 [ 2.489127] ehci-pci 0000:01:00.4: DMAR: Allocating domain failed [ 2.489350] ehci-pci 0000:01:00.4: DMAR: 32bit DMA uses non-identity mapping [ 2.489359] ehci-pci 0000:01:00.4: can't setup: -12 [ 2.489531] ehci-pci 0000:01:00.4: USB bus 1 deregistered [ 2.490023] ehci-pci 0000:01:00.4: init 0000:01:00.4 fail, -12 [ 2.490358] ehci-pci: probe of 0000:01:00.4 failed with error -12
This issue happens when the value of the sagaw corresponds to a 48-bit agaw. This fix updates the calculation of the agaw based on the minimum of IOMMU's sagaw value and MGAW.
This issue happens on the code path of getting a private domain for a device. A private domain was needed when the domain of an iommu group couldn't meet the requirement of a device. The IOMMU core has been evolved to eliminate the need for private domain, hence this code path has alreay been removed from the upstream since commit 327d5b2fee91c ("iommu/vt-d: Allow 32bit devices to uses DMA domain"). Instead of back porting all patches that are required for removing the private domain, this simply fixes it in the affected stable kernel between v4.16 and v5.7.
[baolu: The orignal patch could be found here https://lore.kernel.org/linux-iommu/20210412202736.70765-1-saeed.mirzamohamm.... I added commit message according to Greg's comments at https://lore.kernel.org/linux-iommu/YHZ%2FT9x7Xjf1r6fI@kroah.com/.]
Cc: Joerg Roedel joro@8bytes.org Cc: Ashok Raj ashok.raj@intel.com Cc: stable@vger.kernel.org #v4.16+ Signed-off-by: Saeed Mirzamohammadi saeed.mirzamohammadi@oracle.com Tested-by: Camille Lu camille.lu@hpe.com Signed-off-by: Lu Baolu baolu.lu@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iommu/intel-iommu.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index d2166dfc8b3f..dcb865d19309 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -1928,7 +1928,7 @@ static inline int guestwidth_to_adjustwidth(int gaw) static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu, int guest_width) { - int adjust_width, agaw; + int adjust_width, agaw, cap_width; unsigned long sagaw; int err;
@@ -1942,8 +1942,9 @@ static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu, domain_reserve_special_ranges(domain);
/* calculate AGAW */ - if (guest_width > cap_mgaw(iommu->cap)) - guest_width = cap_mgaw(iommu->cap); + cap_width = min_t(int, cap_mgaw(iommu->cap), agaw_to_width(iommu->agaw)); + if (guest_width > cap_width) + guest_width = cap_width; domain->gaw = guest_width; adjust_width = guestwidth_to_adjustwidth(guest_width); agaw = width_to_agaw(adjust_width);
From: Johannes Berg johannes.berg@intel.com
commit a0761a301746ec2d92d7fcb82af69c0a6a4339aa upstream.
If we know that we have an encrypted link (based on having had a key configured for TX in the past) then drop all data frames in the key selection handler if there's no key anymore.
This fixes an issue with mac80211 internal TXQs - there we can buffer frames for an encrypted link, but then if the key is no longer there when they're dequeued, the frames are sent without encryption. This happens if a station is disconnected while the frames are still on the TXQ.
Detecting that a link should be encrypted based on a first key having been configured for TX is fine as there are no use cases for a connection going from with encryption to no encryption. With extended key IDs, however, there is a case of having a key configured for only decryption, so we can't just trigger this behaviour on a key being configured.
Cc: stable@vger.kernel.org Reported-by: Jouni Malinen j@w1.fi Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Luca Coelho luciano.coelho@intel.com Link: https://lore.kernel.org/r/iwlwifi.20200326150855.6865c7f28a14.I9fb1d911b0642... Signed-off-by: Johannes Berg johannes.berg@intel.com [pali: Backported to 4.19 and older versions] Signed-off-by: Pali Rohár pali@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mac80211/debugfs_sta.c | 1 + net/mac80211/key.c | 1 + net/mac80211/sta_info.h | 1 + net/mac80211/tx.c | 12 +++++++++--- 4 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/net/mac80211/debugfs_sta.c b/net/mac80211/debugfs_sta.c index 4105081dc1df..6f390c2e4c8e 100644 --- a/net/mac80211/debugfs_sta.c +++ b/net/mac80211/debugfs_sta.c @@ -80,6 +80,7 @@ static const char * const sta_flag_names[] = { FLAG(MPSP_OWNER), FLAG(MPSP_RECIPIENT), FLAG(PS_DELIVER), + FLAG(USES_ENCRYPTION), #undef FLAG };
diff --git a/net/mac80211/key.c b/net/mac80211/key.c index 6775d6cb7d3d..7fc55177db84 100644 --- a/net/mac80211/key.c +++ b/net/mac80211/key.c @@ -341,6 +341,7 @@ static void ieee80211_key_replace(struct ieee80211_sub_if_data *sdata, if (sta) { if (pairwise) { rcu_assign_pointer(sta->ptk[idx], new); + set_sta_flag(sta, WLAN_STA_USES_ENCRYPTION); sta->ptk_idx = idx; ieee80211_check_fast_xmit(sta); } else { diff --git a/net/mac80211/sta_info.h b/net/mac80211/sta_info.h index c33bc5fc0f2d..75d982ff7f3d 100644 --- a/net/mac80211/sta_info.h +++ b/net/mac80211/sta_info.h @@ -102,6 +102,7 @@ enum ieee80211_sta_info_flags { WLAN_STA_MPSP_OWNER, WLAN_STA_MPSP_RECIPIENT, WLAN_STA_PS_DELIVER, + WLAN_STA_USES_ENCRYPTION,
NUM_WLAN_STA_FLAGS, }; diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c index 98d048630ad2..3530d1a5fc98 100644 --- a/net/mac80211/tx.c +++ b/net/mac80211/tx.c @@ -593,10 +593,13 @@ ieee80211_tx_h_select_key(struct ieee80211_tx_data *tx) struct ieee80211_tx_info *info = IEEE80211_SKB_CB(tx->skb); struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)tx->skb->data;
- if (unlikely(info->flags & IEEE80211_TX_INTFL_DONT_ENCRYPT)) + if (unlikely(info->flags & IEEE80211_TX_INTFL_DONT_ENCRYPT)) { tx->key = NULL; - else if (tx->sta && - (key = rcu_dereference(tx->sta->ptk[tx->sta->ptk_idx]))) + return TX_CONTINUE; + } + + if (tx->sta && + (key = rcu_dereference(tx->sta->ptk[tx->sta->ptk_idx]))) tx->key = key; else if (ieee80211_is_group_privacy_action(tx->skb) && (key = rcu_dereference(tx->sdata->default_multicast_key))) @@ -657,6 +660,9 @@ ieee80211_tx_h_select_key(struct ieee80211_tx_data *tx) if (!skip_hw && tx->key && tx->key->flags & KEY_FLAG_UPLOADED_TO_HARDWARE) info->control.hw_key = &tx->key->conf; + } else if (!ieee80211_is_mgmt(hdr->frame_control) && tx->sta && + test_sta_flag(tx->sta, WLAN_STA_USES_ENCRYPTION)) { + return TX_DROP; }
return TX_CONTINUE;
From: Maxim Levitsky mlevitsk@redhat.com
[ upstream commit c7dfa4009965a9b2d7b329ee970eb8da0d32f0bc ]
If L1 disables VMLOAD/VMSAVE intercepts, and doesn't enable Virtual VMLOAD/VMSAVE (currently not supported for the nested hypervisor), then VMLOAD/VMSAVE must operate on the L1 physical memory, which is only possible by making L0 intercept these instructions.
Failure to do so allowed the nested guest to run VMLOAD/VMSAVE unintercepted, and thus read/write portions of the host physical memory.
Fixes: 89c8a4984fc9 ("KVM: SVM: Enable Virtual VMLOAD VMSAVE feature")
Suggested-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Maxim Levitsky mlevitsk@redhat.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/svm.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 72d729f34437..9673ddb3d7a0 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -513,6 +513,9 @@ static void recalc_intercepts(struct vcpu_svm *svm) c->intercept_dr = h->intercept_dr | g->intercept_dr; c->intercept_exceptions = h->intercept_exceptions | g->intercept_exceptions; c->intercept = h->intercept | g->intercept; + + c->intercept |= (1ULL << INTERCEPT_VMLOAD); + c->intercept |= (1ULL << INTERCEPT_VMSAVE); }
static inline struct vmcb *get_host_vmcb(struct vcpu_svm *svm)
From: Maxim Levitsky mlevitsk@redhat.com
[ upstream commit 0f923e07124df069ba68d8bb12324398f4b6b709 ]
* Invert the mask of bits that we pick from L2 in nested_vmcb02_prepare_control
* Invert and explicitly use VIRQ related bits bitmask in svm_clear_vintr
This fixes a security issue that allowed a malicious L1 to run L2 with AVIC enabled, which allowed the L2 to exploit the uninitialized and enabled AVIC to read/write the host physical memory at some offsets.
Fixes: 3d6368ef580a ("KVM: SVM: Add VMRUN handler") Signed-off-by: Maxim Levitsky mlevitsk@redhat.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/svm.h | 2 ++ arch/x86/kvm/svm.c | 15 ++++++++------- 2 files changed, 10 insertions(+), 7 deletions(-)
diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h index 93b462e48067..b6dedf6c835c 100644 --- a/arch/x86/include/asm/svm.h +++ b/arch/x86/include/asm/svm.h @@ -118,6 +118,8 @@ struct __attribute__ ((__packed__)) vmcb_control_area { #define V_IGN_TPR_SHIFT 20 #define V_IGN_TPR_MASK (1 << V_IGN_TPR_SHIFT)
+#define V_IRQ_INJECTION_BITS_MASK (V_IRQ_MASK | V_INTR_PRIO_MASK | V_IGN_TPR_MASK) + #define V_INTR_MASKING_SHIFT 24 #define V_INTR_MASKING_MASK (1 << V_INTR_MASKING_SHIFT)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 9673ddb3d7a0..85181457413e 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -1444,12 +1444,7 @@ static __init int svm_hardware_setup(void) } }
- if (vgif) { - if (!boot_cpu_has(X86_FEATURE_VGIF)) - vgif = false; - else - pr_info("Virtual GIF supported\n"); - } + vgif = false; /* Disabled for CVE-2021-3653 */
return 0;
@@ -3593,7 +3588,13 @@ static void enter_svm_guest_mode(struct vcpu_svm *svm, u64 vmcb_gpa, svm->nested.intercept = nested_vmcb->control.intercept;
svm_flush_tlb(&svm->vcpu, true); - svm->vmcb->control.int_ctl = nested_vmcb->control.int_ctl | V_INTR_MASKING_MASK; + + svm->vmcb->control.int_ctl &= + V_INTR_MASKING_MASK | V_GIF_ENABLE_MASK | V_GIF_MASK; + + svm->vmcb->control.int_ctl |= nested_vmcb->control.int_ctl & + (V_TPR_MASK | V_IRQ_INJECTION_BITS_MASK); + if (nested_vmcb->control.int_ctl & V_INTR_MASKING_MASK) svm->vcpu.arch.hflags |= HF_VINTR_MASK; else
From: Thomas Gleixner tglx@linutronix.de
commit f9dfb5e390fab2df9f7944bb91e7705aba14cd26 upstream.
The XSAVE init code initializes all enabled and supported components with XRSTOR(S) to init state. Then it XSAVEs the state of the components back into init_fpstate which is used in several places to fill in the init state of components.
This works correctly with XSAVE, but not with XSAVEOPT and XSAVES because those use the init optimization and skip writing state of components which are in init state. So init_fpstate.xsave still contains all zeroes after this operation.
There are two ways to solve that:
1) Use XSAVE unconditionally, but that requires to reshuffle the buffer when XSAVES is enabled because XSAVES uses compacted format.
2) Save the components which are known to have a non-zero init state by other means.
Looking deeper, #2 is the right thing to do because all components the kernel supports have all-zeroes init state except the legacy features (FP, SSE). Those cannot be hard coded because the states are not identical on all CPUs, but they can be saved with FXSAVE which avoids all conditionals.
Use FXSAVE to save the legacy FP/SSE components in init_fpstate along with a BUILD_BUG_ON() which reminds developers to validate that a newly added component has all zeroes init state. As a bonus remove the now unused copy_xregs_to_kernel_booting() crutch.
The XSAVE and reshuffle method can still be implemented in the unlikely case that components are added which have a non-zero init state and no other means to save them. For now, FXSAVE is just simple and good enough.
[ bp: Fix a typo or two in the text. ]
Fixes: 6bad06b76892 ("x86, xsave: Use xsaveopt in context-switch path when supported") Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Borislav Petkov bp@suse.de Reviewed-by: Borislav Petkov bp@suse.de Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20210618143444.587311343@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/fpu/internal.h | 30 ++++++----------------- arch/x86/kernel/fpu/xstate.c | 38 ++++++++++++++++++++++++++--- 2 files changed, 43 insertions(+), 25 deletions(-)
diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h index b8c935033d21..4f274d851986 100644 --- a/arch/x86/include/asm/fpu/internal.h +++ b/arch/x86/include/asm/fpu/internal.h @@ -215,6 +215,14 @@ static inline void copy_fxregs_to_kernel(struct fpu *fpu) } }
+static inline void fxsave(struct fxregs_state *fx) +{ + if (IS_ENABLED(CONFIG_X86_32)) + asm volatile( "fxsave %[fx]" : [fx] "=m" (*fx)); + else + asm volatile("fxsaveq %[fx]" : [fx] "=m" (*fx)); +} + /* These macros all use (%edi)/(%rdi) as the single memory argument. */ #define XSAVE ".byte " REX_PREFIX "0x0f,0xae,0x27" #define XSAVEOPT ".byte " REX_PREFIX "0x0f,0xae,0x37" @@ -283,28 +291,6 @@ static inline void copy_fxregs_to_kernel(struct fpu *fpu) : "D" (st), "m" (*st), "a" (lmask), "d" (hmask) \ : "memory")
-/* - * This function is called only during boot time when x86 caps are not set - * up and alternative can not be used yet. - */ -static inline void copy_xregs_to_kernel_booting(struct xregs_state *xstate) -{ - u64 mask = -1; - u32 lmask = mask; - u32 hmask = mask >> 32; - int err; - - WARN_ON(system_state != SYSTEM_BOOTING); - - if (static_cpu_has(X86_FEATURE_XSAVES)) - XSTATE_OP(XSAVES, xstate, lmask, hmask, err); - else - XSTATE_OP(XSAVE, xstate, lmask, hmask, err); - - /* We should never fault when copying to a kernel buffer: */ - WARN_ON_FPU(err); -} - /* * This function is called only during boot time when x86 caps are not set * up and alternative can not be used yet. diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index 601a5da1d196..7d372db8bee1 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -404,6 +404,24 @@ static void __init print_xstate_offset_size(void) } }
+/* + * All supported features have either init state all zeros or are + * handled in setup_init_fpu() individually. This is an explicit + * feature list and does not use XFEATURE_MASK*SUPPORTED to catch + * newly added supported features at build time and make people + * actually look at the init state for the new feature. + */ +#define XFEATURES_INIT_FPSTATE_HANDLED \ + (XFEATURE_MASK_FP | \ + XFEATURE_MASK_SSE | \ + XFEATURE_MASK_YMM | \ + XFEATURE_MASK_OPMASK | \ + XFEATURE_MASK_ZMM_Hi256 | \ + XFEATURE_MASK_Hi16_ZMM | \ + XFEATURE_MASK_PKRU | \ + XFEATURE_MASK_BNDREGS | \ + XFEATURE_MASK_BNDCSR) + /* * setup the xstate image representing the init state */ @@ -411,6 +429,8 @@ static void __init setup_init_fpu_buf(void) { static int on_boot_cpu __initdata = 1;
+ BUILD_BUG_ON(XCNTXT_MASK != XFEATURES_INIT_FPSTATE_HANDLED); + WARN_ON_FPU(!on_boot_cpu); on_boot_cpu = 0;
@@ -429,10 +449,22 @@ static void __init setup_init_fpu_buf(void) copy_kernel_to_xregs_booting(&init_fpstate.xsave);
/* - * Dump the init state again. This is to identify the init state - * of any feature which is not represented by all zero's. + * All components are now in init state. Read the state back so + * that init_fpstate contains all non-zero init state. This only + * works with XSAVE, but not with XSAVEOPT and XSAVES because + * those use the init optimization which skips writing data for + * components in init state. + * + * XSAVE could be used, but that would require to reshuffle the + * data when XSAVES is available because XSAVES uses xstate + * compaction. But doing so is a pointless exercise because most + * components have an all zeros init state except for the legacy + * ones (FP and SSE). Those can be saved with FXSAVE into the + * legacy area. Adding new features requires to ensure that init + * state is all zeroes or if not to add the necessary handling + * here. */ - copy_xregs_to_kernel_booting(&init_fpstate.xsave); + fxsave(&init_fpstate.fxsave); }
static int xfeature_uncompacted_offset(int xfeature_nr)
From: Jouni Malinen jouni@codeaurora.org
commit 56c5485c9e444c2e85e11694b6c44f1338fc20fd upstream.
It is possible for there to be pending frames in TXQs with a reference to the key cache entry that is being deleted. If such a key cache entry is cleared, those pending frame in TXQ might get transmitted without proper encryption. It is safer to leave the previously used key into the key cache in such cases. Instead, only clear the MAC address to prevent RX processing from using this key cache entry.
This is needed in particularly in AP mode where the TXQs cannot be flushed on station disconnection. This change alone may not be able to address all cases where the key cache entry might get reused for other purposes immediately (the key cache entry should be released for reuse only once the TXQs do not have any remaining references to them), but this makes it less likely to get unprotected frames and the more complete changes may end up being significantly more complex.
Signed-off-by: Jouni Malinen jouni@codeaurora.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Link: https://lore.kernel.org/r/20201214172118.18100-2-jouni@codeaurora.org Cc: Pali Rohár pali@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/ath/key.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/net/wireless/ath/key.c b/drivers/net/wireless/ath/key.c index 1816b4e7dc26..59618bb41f6c 100644 --- a/drivers/net/wireless/ath/key.c +++ b/drivers/net/wireless/ath/key.c @@ -583,7 +583,16 @@ EXPORT_SYMBOL(ath_key_config); */ void ath_key_delete(struct ath_common *common, struct ieee80211_key_conf *key) { - ath_hw_keyreset(common, key->hw_key_idx); + /* Leave CCMP and TKIP (main key) configured to avoid disabling + * encryption for potentially pending frames already in a TXQ with the + * keyix pointing to this key entry. Instead, only clear the MAC address + * to prevent RX processing from using this key cache entry. + */ + if (test_bit(key->hw_key_idx, common->ccmp_keymap) || + test_bit(key->hw_key_idx, common->tkip_keymap)) + ath_hw_keysetmac(common, key->hw_key_idx, NULL); + else + ath_hw_keyreset(common, key->hw_key_idx); if (key->hw_key_idx < IEEE80211_WEP_NKID) return;
From: Jouni Malinen jouni@codeaurora.org
commit 73488cb2fa3bb1ef9f6cf0d757f76958bd4deaca upstream.
Now that ath/key.c may not be explicitly clearing keys from the key cache, clear all key cache entries when disabling hardware to make sure no keys are left behind beyond this point.
Signed-off-by: Jouni Malinen jouni@codeaurora.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Link: https://lore.kernel.org/r/20201214172118.18100-3-jouni@codeaurora.org Cc: Pali Rohár pali@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/ath/ath9k/main.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c index e929020d7c9c..4b0a3f042ca3 100644 --- a/drivers/net/wireless/ath/ath9k/main.c +++ b/drivers/net/wireless/ath/ath9k/main.c @@ -896,6 +896,11 @@ static void ath9k_stop(struct ieee80211_hw *hw)
spin_unlock_bh(&sc->sc_pcu_lock);
+ /* Clear key cache entries explicitly to get rid of any potentially + * remaining keys. + */ + ath9k_cmn_init_crypto(sc->sc_ah); + ath9k_ps_restore(sc);
sc->ps_idle = prev_idle;
From: Jouni Malinen jouni@codeaurora.org
commit d2d3e36498dd8e0c83ea99861fac5cf9e8671226 upstream.
ath9k is going to use this for safer management of key cache entries.
Signed-off-by: Jouni Malinen jouni@codeaurora.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Link: https://lore.kernel.org/r/20201214172118.18100-4-jouni@codeaurora.org Cc: Pali Rohár pali@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/ath/ath.h | 1 + drivers/net/wireless/ath/key.c | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/ath/ath.h b/drivers/net/wireless/ath/ath.h index 7a364eca46d6..9d18105c449f 100644 --- a/drivers/net/wireless/ath/ath.h +++ b/drivers/net/wireless/ath/ath.h @@ -203,6 +203,7 @@ int ath_key_config(struct ath_common *common, struct ieee80211_sta *sta, struct ieee80211_key_conf *key); bool ath_hw_keyreset(struct ath_common *common, u16 entry); +bool ath_hw_keysetmac(struct ath_common *common, u16 entry, const u8 *mac); void ath_hw_cycle_counters_update(struct ath_common *common); int32_t ath_hw_get_listen_time(struct ath_common *common);
diff --git a/drivers/net/wireless/ath/key.c b/drivers/net/wireless/ath/key.c index 59618bb41f6c..cb266cf3c77c 100644 --- a/drivers/net/wireless/ath/key.c +++ b/drivers/net/wireless/ath/key.c @@ -84,8 +84,7 @@ bool ath_hw_keyreset(struct ath_common *common, u16 entry) } EXPORT_SYMBOL(ath_hw_keyreset);
-static bool ath_hw_keysetmac(struct ath_common *common, - u16 entry, const u8 *mac) +bool ath_hw_keysetmac(struct ath_common *common, u16 entry, const u8 *mac) { u32 macHi, macLo; u32 unicast_flag = AR_KEYTABLE_VALID; @@ -125,6 +124,7 @@ static bool ath_hw_keysetmac(struct ath_common *common,
return true; } +EXPORT_SYMBOL(ath_hw_keysetmac);
static bool ath_hw_set_keycache_entry(struct ath_common *common, u16 entry, const struct ath_keyval *k,
From: Jouni Malinen jouni@codeaurora.org
commit 144cd24dbc36650a51f7fe3bf1424a1432f1f480 upstream.
tkip_keymap can be used internally to avoid the reference to key->cipher and with this, only the key index value itself is needed. This allows ath_key_delete() call to be postponed to be handled after the upper layer STA and key entry have already been removed. This is needed to make ath9k key cache management safer.
Signed-off-by: Jouni Malinen jouni@codeaurora.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Link: https://lore.kernel.org/r/20201214172118.18100-5-jouni@codeaurora.org Cc: Pali Rohár pali@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/ath/ath.h | 2 +- drivers/net/wireless/ath/ath5k/mac80211-ops.c | 2 +- drivers/net/wireless/ath/ath9k/htc_drv_main.c | 2 +- drivers/net/wireless/ath/ath9k/main.c | 5 ++- drivers/net/wireless/ath/key.c | 34 +++++++++---------- 5 files changed, 22 insertions(+), 23 deletions(-)
diff --git a/drivers/net/wireless/ath/ath.h b/drivers/net/wireless/ath/ath.h index 9d18105c449f..f083fb9038c3 100644 --- a/drivers/net/wireless/ath/ath.h +++ b/drivers/net/wireless/ath/ath.h @@ -197,7 +197,7 @@ struct sk_buff *ath_rxbuf_alloc(struct ath_common *common, bool ath_is_mybeacon(struct ath_common *common, struct ieee80211_hdr *hdr);
void ath_hw_setbssidmask(struct ath_common *common); -void ath_key_delete(struct ath_common *common, struct ieee80211_key_conf *key); +void ath_key_delete(struct ath_common *common, u8 hw_key_idx); int ath_key_config(struct ath_common *common, struct ieee80211_vif *vif, struct ieee80211_sta *sta, diff --git a/drivers/net/wireless/ath/ath5k/mac80211-ops.c b/drivers/net/wireless/ath/ath5k/mac80211-ops.c index 16e052d02c94..0f4836fc3b7c 100644 --- a/drivers/net/wireless/ath/ath5k/mac80211-ops.c +++ b/drivers/net/wireless/ath/ath5k/mac80211-ops.c @@ -522,7 +522,7 @@ ath5k_set_key(struct ieee80211_hw *hw, enum set_key_cmd cmd, } break; case DISABLE_KEY: - ath_key_delete(common, key); + ath_key_delete(common, key->hw_key_idx); break; default: ret = -EINVAL; diff --git a/drivers/net/wireless/ath/ath9k/htc_drv_main.c b/drivers/net/wireless/ath/ath9k/htc_drv_main.c index a82ad739ab80..16a7bae62b7d 100644 --- a/drivers/net/wireless/ath/ath9k/htc_drv_main.c +++ b/drivers/net/wireless/ath/ath9k/htc_drv_main.c @@ -1460,7 +1460,7 @@ static int ath9k_htc_set_key(struct ieee80211_hw *hw, } break; case DISABLE_KEY: - ath_key_delete(common, key); + ath_key_delete(common, key->hw_key_idx); break; default: ret = -EINVAL; diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c index 4b0a3f042ca3..95cc581e3761 100644 --- a/drivers/net/wireless/ath/ath9k/main.c +++ b/drivers/net/wireless/ath/ath9k/main.c @@ -1546,12 +1546,11 @@ static void ath9k_del_ps_key(struct ath_softc *sc, { struct ath_common *common = ath9k_hw_common(sc->sc_ah); struct ath_node *an = (struct ath_node *) sta->drv_priv; - struct ieee80211_key_conf ps_key = { .hw_key_idx = an->ps_key };
if (!an->ps_key) return;
- ath_key_delete(common, &ps_key); + ath_key_delete(common, an->ps_key); an->ps_key = 0; an->key_idx[0] = 0; } @@ -1742,7 +1741,7 @@ static int ath9k_set_key(struct ieee80211_hw *hw, } break; case DISABLE_KEY: - ath_key_delete(common, key); + ath_key_delete(common, key->hw_key_idx); if (an) { for (i = 0; i < ARRAY_SIZE(an->key_idx); i++) { if (an->key_idx[i] != key->hw_key_idx) diff --git a/drivers/net/wireless/ath/key.c b/drivers/net/wireless/ath/key.c index cb266cf3c77c..61b59a804e30 100644 --- a/drivers/net/wireless/ath/key.c +++ b/drivers/net/wireless/ath/key.c @@ -581,38 +581,38 @@ EXPORT_SYMBOL(ath_key_config); /* * Delete Key. */ -void ath_key_delete(struct ath_common *common, struct ieee80211_key_conf *key) +void ath_key_delete(struct ath_common *common, u8 hw_key_idx) { /* Leave CCMP and TKIP (main key) configured to avoid disabling * encryption for potentially pending frames already in a TXQ with the * keyix pointing to this key entry. Instead, only clear the MAC address * to prevent RX processing from using this key cache entry. */ - if (test_bit(key->hw_key_idx, common->ccmp_keymap) || - test_bit(key->hw_key_idx, common->tkip_keymap)) - ath_hw_keysetmac(common, key->hw_key_idx, NULL); + if (test_bit(hw_key_idx, common->ccmp_keymap) || + test_bit(hw_key_idx, common->tkip_keymap)) + ath_hw_keysetmac(common, hw_key_idx, NULL); else - ath_hw_keyreset(common, key->hw_key_idx); - if (key->hw_key_idx < IEEE80211_WEP_NKID) + ath_hw_keyreset(common, hw_key_idx); + if (hw_key_idx < IEEE80211_WEP_NKID) return;
- clear_bit(key->hw_key_idx, common->keymap); - clear_bit(key->hw_key_idx, common->ccmp_keymap); - if (key->cipher != WLAN_CIPHER_SUITE_TKIP) + clear_bit(hw_key_idx, common->keymap); + clear_bit(hw_key_idx, common->ccmp_keymap); + if (!test_bit(hw_key_idx, common->tkip_keymap)) return;
- clear_bit(key->hw_key_idx + 64, common->keymap); + clear_bit(hw_key_idx + 64, common->keymap);
- clear_bit(key->hw_key_idx, common->tkip_keymap); - clear_bit(key->hw_key_idx + 64, common->tkip_keymap); + clear_bit(hw_key_idx, common->tkip_keymap); + clear_bit(hw_key_idx + 64, common->tkip_keymap);
if (!(common->crypt_caps & ATH_CRYPT_CAP_MIC_COMBINED)) { - ath_hw_keyreset(common, key->hw_key_idx + 32); - clear_bit(key->hw_key_idx + 32, common->keymap); - clear_bit(key->hw_key_idx + 64 + 32, common->keymap); + ath_hw_keyreset(common, hw_key_idx + 32); + clear_bit(hw_key_idx + 32, common->keymap); + clear_bit(hw_key_idx + 64 + 32, common->keymap);
- clear_bit(key->hw_key_idx + 32, common->tkip_keymap); - clear_bit(key->hw_key_idx + 64 + 32, common->tkip_keymap); + clear_bit(hw_key_idx + 32, common->tkip_keymap); + clear_bit(hw_key_idx + 64 + 32, common->tkip_keymap); } } EXPORT_SYMBOL(ath_key_delete);
From: Jouni Malinen jouni@codeaurora.org
commit ca2848022c12789685d3fab3227df02b863f9696 upstream.
Do not delete a key cache entry that is still being referenced by pending frames in TXQs. This avoids reuse of the key cache entry while a frame might still be transmitted using it.
To avoid having to do any additional operations during the main TX path operations, track pending key cache entries in a new bitmap and check whether any pending entries can be deleted before every new key add/remove operation. Also clear any remaining entries when stopping the interface.
Signed-off-by: Jouni Malinen jouni@codeaurora.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Link: https://lore.kernel.org/r/20201214172118.18100-6-jouni@codeaurora.org Cc: Pali Rohár pali@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/ath/ath9k/hw.h | 1 + drivers/net/wireless/ath/ath9k/main.c | 87 ++++++++++++++++++++++++++- 2 files changed, 87 insertions(+), 1 deletion(-)
diff --git a/drivers/net/wireless/ath/ath9k/hw.h b/drivers/net/wireless/ath/ath9k/hw.h index 68956cdc8c9a..4b5687b6c0c9 100644 --- a/drivers/net/wireless/ath/ath9k/hw.h +++ b/drivers/net/wireless/ath/ath9k/hw.h @@ -818,6 +818,7 @@ struct ath_hw { struct ath9k_pacal_info pacal_info; struct ar5416Stats stats; struct ath9k_tx_queue_info txq[ATH9K_NUM_TX_QUEUES]; + DECLARE_BITMAP(pending_del_keymap, ATH_KEYMAX);
enum ath9k_int imask; u32 imrs2_reg; diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c index 95cc581e3761..a0097bebcba3 100644 --- a/drivers/net/wireless/ath/ath9k/main.c +++ b/drivers/net/wireless/ath/ath9k/main.c @@ -823,12 +823,80 @@ exit: ieee80211_free_txskb(hw, skb); }
+static bool ath9k_txq_list_has_key(struct list_head *txq_list, u32 keyix) +{ + struct ath_buf *bf; + struct ieee80211_tx_info *txinfo; + struct ath_frame_info *fi; + + list_for_each_entry(bf, txq_list, list) { + if (bf->bf_state.stale || !bf->bf_mpdu) + continue; + + txinfo = IEEE80211_SKB_CB(bf->bf_mpdu); + fi = (struct ath_frame_info *)&txinfo->rate_driver_data[0]; + if (fi->keyix == keyix) + return true; + } + + return false; +} + +static bool ath9k_txq_has_key(struct ath_softc *sc, u32 keyix) +{ + struct ath_hw *ah = sc->sc_ah; + int i; + struct ath_txq *txq; + bool key_in_use = false; + + for (i = 0; !key_in_use && i < ATH9K_NUM_TX_QUEUES; i++) { + if (!ATH_TXQ_SETUP(sc, i)) + continue; + txq = &sc->tx.txq[i]; + if (!txq->axq_depth) + continue; + if (!ath9k_hw_numtxpending(ah, txq->axq_qnum)) + continue; + + ath_txq_lock(sc, txq); + key_in_use = ath9k_txq_list_has_key(&txq->axq_q, keyix); + if (sc->sc_ah->caps.hw_caps & ATH9K_HW_CAP_EDMA) { + int idx = txq->txq_tailidx; + + while (!key_in_use && + !list_empty(&txq->txq_fifo[idx])) { + key_in_use = ath9k_txq_list_has_key( + &txq->txq_fifo[idx], keyix); + INCR(idx, ATH_TXFIFO_DEPTH); + } + } + ath_txq_unlock(sc, txq); + } + + return key_in_use; +} + +static void ath9k_pending_key_del(struct ath_softc *sc, u8 keyix) +{ + struct ath_hw *ah = sc->sc_ah; + struct ath_common *common = ath9k_hw_common(ah); + + if (!test_bit(keyix, ah->pending_del_keymap) || + ath9k_txq_has_key(sc, keyix)) + return; + + /* No more TXQ frames point to this key cache entry, so delete it. */ + clear_bit(keyix, ah->pending_del_keymap); + ath_key_delete(common, keyix); +} + static void ath9k_stop(struct ieee80211_hw *hw) { struct ath_softc *sc = hw->priv; struct ath_hw *ah = sc->sc_ah; struct ath_common *common = ath9k_hw_common(ah); bool prev_idle; + int i;
ath9k_deinit_channel_context(sc);
@@ -896,6 +964,9 @@ static void ath9k_stop(struct ieee80211_hw *hw)
spin_unlock_bh(&sc->sc_pcu_lock);
+ for (i = 0; i < ATH_KEYMAX; i++) + ath9k_pending_key_del(sc, i); + /* Clear key cache entries explicitly to get rid of any potentially * remaining keys. */ @@ -1712,6 +1783,12 @@ static int ath9k_set_key(struct ieee80211_hw *hw, if (sta) an = (struct ath_node *)sta->drv_priv;
+ /* Delete pending key cache entries if no more frames are pointing to + * them in TXQs. + */ + for (i = 0; i < ATH_KEYMAX; i++) + ath9k_pending_key_del(sc, i); + switch (cmd) { case SET_KEY: if (sta) @@ -1741,7 +1818,15 @@ static int ath9k_set_key(struct ieee80211_hw *hw, } break; case DISABLE_KEY: - ath_key_delete(common, key->hw_key_idx); + if (ath9k_txq_has_key(sc, key->hw_key_idx)) { + /* Delay key cache entry deletion until there are no + * remaining TXQ frames pointing to this entry. + */ + set_bit(key->hw_key_idx, sc->sc_ah->pending_del_keymap); + ath_hw_keysetmac(common, key->hw_key_idx, NULL); + } else { + ath_key_delete(common, key->hw_key_idx); + } if (an) { for (i = 0; i < ARRAY_SIZE(an->key_idx); i++) { if (an->key_idx[i] != key->hw_key_idx)
From: Adrian Larumbe adrian.martinezlarumbe@imgtec.com
[ Upstream commit 7dd2dd4ff9f3abda601f22b9d01441a0869d20d7 ]
When user calls dmaengine_terminate_sync, the driver will clean up any remaining descriptors for all the pending or active transfers that had previously been submitted. However, this might happen whilst the tasklet is invoking the DMA callback for the last finished transfer, so by the time it returns and takes over the channel's spinlock, the list of completed descriptors it was traversing is no longer valid. This leads to a read-after-free situation.
Fix it by signalling whether a user-triggered termination has happened by means of a boolean variable.
Signed-off-by: Adrian Larumbe adrian.martinezlarumbe@imgtec.com Link: https://lore.kernel.org/r/20210706234338.7696-3-adrian.martinezlarumbe@imgte... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/xilinx/xilinx_dma.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c index 0c5668e897fe..d891ec05bc48 100644 --- a/drivers/dma/xilinx/xilinx_dma.c +++ b/drivers/dma/xilinx/xilinx_dma.c @@ -332,6 +332,7 @@ struct xilinx_dma_tx_descriptor { * @genlock: Support genlock mode * @err: Channel has errors * @idle: Check for channel idle + * @terminating: Check for channel being synchronized by user * @tasklet: Cleanup work after irq * @config: Device configuration info * @flush_on_fsync: Flush on Frame sync @@ -369,6 +370,7 @@ struct xilinx_dma_chan { bool genlock; bool err; bool idle; + bool terminating; struct tasklet_struct tasklet; struct xilinx_vdma_config config; bool flush_on_fsync; @@ -843,6 +845,13 @@ static void xilinx_dma_chan_desc_cleanup(struct xilinx_dma_chan *chan) /* Run any dependencies, then free the descriptor */ dma_run_dependencies(&desc->async_tx); xilinx_dma_free_tx_descriptor(chan, desc); + + /* + * While we ran a callback the user called a terminate function, + * which takes care of cleaning up any remaining descriptors + */ + if (chan->terminating) + break; }
spin_unlock_irqrestore(&chan->lock, flags); @@ -1612,6 +1621,8 @@ static dma_cookie_t xilinx_dma_tx_submit(struct dma_async_tx_descriptor *tx) if (desc->cyclic) chan->cyclic = true;
+ chan->terminating = false; + spin_unlock_irqrestore(&chan->lock, flags);
return cookie; @@ -2068,6 +2079,7 @@ static int xilinx_dma_terminate_all(struct dma_chan *dchan) }
/* Remove and free all of the descriptors in the lists */ + chan->terminating = true; xilinx_dma_free_descriptors(chan); chan->idle = true;
From: Yu Kuai yukuai3@huawei.com
[ Upstream commit 1da569fa7ec8cb0591c74aa3050d4ea1397778b4 ]
pm_runtime_get_sync will increment pm usage counter even it failed. Forgetting to putting operation will result in reference leak here. Fix it by moving the error_pm label above the pm_runtime_put() in the error path.
Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Yu Kuai yukuai3@huawei.com Link: https://lore.kernel.org/r/20210706124521.1371901-1-yukuai3@huawei.com Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/sh/usb-dmac.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/dma/sh/usb-dmac.c b/drivers/dma/sh/usb-dmac.c index 6c94ed750049..d77bf325f038 100644 --- a/drivers/dma/sh/usb-dmac.c +++ b/drivers/dma/sh/usb-dmac.c @@ -860,8 +860,8 @@ static int usb_dmac_probe(struct platform_device *pdev)
error: of_dma_controller_free(pdev->dev.of_node); - pm_runtime_put(&pdev->dev); error_pm: + pm_runtime_put(&pdev->dev); pm_runtime_disable(&pdev->dev); return ret; }
From: Dave Gerlach d-gerlach@ti.com
[ Upstream commit 20a6b3fd8e2e2c063b25fbf2ee74d86b898e5087 ]
Based on the latest timing specifications for the TPS65218 from the data sheet, http://www.ti.com/lit/ds/symlink/tps65218.pdf, document SLDS206 from November 2014, we must change the i2c bus speed to better fit within the minimum high SCL time required for proper i2c transfer.
When running at 400khz, measurements show that SCL spends 0.8125 uS/1.666 uS high/low which violates the requirement for minimum high period of SCL provided in datasheet Table 7.6 which is 1 uS. Switching to 100khz gives us 5 uS/5 uS high/low which both fall above the minimum given values for 100 khz, 4.0 uS/4.7 uS high/low.
Without this patch occasionally a voltage set operation from the kernel will appear to have worked but the actual voltage reflected on the PMIC will not have updated, causing problems especially with cpufreq that may update to a higher OPP without actually raising the voltage on DCDC2, leading to a hang.
Signed-off-by: Dave Gerlach d-gerlach@ti.com Signed-off-by: Kevin Hilman khilman@baylibre.com Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/am43x-epos-evm.dts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/am43x-epos-evm.dts b/arch/arm/boot/dts/am43x-epos-evm.dts index 02bbdfb3f258..0cc3ac6566c6 100644 --- a/arch/arm/boot/dts/am43x-epos-evm.dts +++ b/arch/arm/boot/dts/am43x-epos-evm.dts @@ -590,7 +590,7 @@ status = "okay"; pinctrl-names = "default"; pinctrl-0 = <&i2c0_pins>; - clock-frequency = <400000>; + clock-frequency = <100000>;
tps65218: tps65218@24 { reg = <0x24>;
From: Peter Ujfalusi peter.ujfalusi@gmail.com
[ Upstream commit eda97cb095f2958bbad55684a6ca3e7d7af0176a ]
If the router_xlate can not find the controller in the available DMA devices then it should return with -EPORBE_DEFER in a same way as the of_dma_request_slave_channel() does.
The issue can be reproduced if the event router is registered before the DMA controller itself and a driver would request for a channel before the controller is registered. In of_dma_request_slave_channel(): 1. of_dma_find_controller() would find the dma_router 2. ofdma->of_dma_xlate() would fail and returned NULL 3. -ENODEV is returned as error code
with this patch we would return in this case the correct -EPROBE_DEFER and the client can try to request the channel later.
Signed-off-by: Peter Ujfalusi peter.ujfalusi@gmail.com Link: https://lore.kernel.org/r/20210717190021.21897-1-peter.ujfalusi@gmail.com Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/of-dma.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/dma/of-dma.c b/drivers/dma/of-dma.c index 8344a60c2131..a9d3ab94749b 100644 --- a/drivers/dma/of-dma.c +++ b/drivers/dma/of-dma.c @@ -68,8 +68,12 @@ static struct dma_chan *of_dma_router_xlate(struct of_phandle_args *dma_spec, return NULL;
ofdma_target = of_dma_find_controller(&dma_spec_target); - if (!ofdma_target) - return NULL; + if (!ofdma_target) { + ofdma->dma_router->route_free(ofdma->dma_router->dev, + route_data); + chan = ERR_PTR(-EPROBE_DEFER); + goto err; + }
chan = ofdma_target->of_dma_xlate(&dma_spec_target, ofdma_target); if (IS_ERR_OR_NULL(chan)) { @@ -80,6 +84,7 @@ static struct dma_chan *of_dma_router_xlate(struct of_phandle_args *dma_spec, chan->route_data = route_data; }
+err: /* * Need to put the node back since the ofdma->of_dma_route_allocate * has taken it for generating the new, translated dma_spec
From: Harshvardhan Jha harshvardhan.jha@oracle.com
[ Upstream commit 77541f78eadfe9fdb018a7b8b69f0f2af2cf4b82 ]
The list_for_each_entry() iterator, "adapter" in this code, can never be NULL. If we exit the loop without finding the correct adapter then "adapter" points invalid memory that is an offset from the list head. This will eventually lead to memory corruption and presumably a kernel crash.
Link: https://lore.kernel.org/r/20210708074642.23599-1-harshvardhan.jha@oracle.com Acked-by: Sumit Saxena sumit.saxena@broadcom.com Signed-off-by: Harshvardhan Jha harshvardhan.jha@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/megaraid/megaraid_mm.c | 21 +++++++++++++++------ 1 file changed, 15 insertions(+), 6 deletions(-)
diff --git a/drivers/scsi/megaraid/megaraid_mm.c b/drivers/scsi/megaraid/megaraid_mm.c index 8428247015db..81df2c94b747 100644 --- a/drivers/scsi/megaraid/megaraid_mm.c +++ b/drivers/scsi/megaraid/megaraid_mm.c @@ -250,7 +250,7 @@ mraid_mm_get_adapter(mimd_t __user *umimd, int *rval) mimd_t mimd; uint32_t adapno; int iterator; - + bool is_found;
if (copy_from_user(&mimd, umimd, sizeof(mimd_t))) { *rval = -EFAULT; @@ -266,12 +266,16 @@ mraid_mm_get_adapter(mimd_t __user *umimd, int *rval)
adapter = NULL; iterator = 0; + is_found = false;
list_for_each_entry(adapter, &adapters_list_g, list) { - if (iterator++ == adapno) break; + if (iterator++ == adapno) { + is_found = true; + break; + } }
- if (!adapter) { + if (!is_found) { *rval = -ENODEV; return NULL; } @@ -737,6 +741,7 @@ ioctl_done(uioc_t *kioc) uint32_t adapno; int iterator; mraid_mmadp_t* adapter; + bool is_found;
/* * When the kioc returns from driver, make sure it still doesn't @@ -759,19 +764,23 @@ ioctl_done(uioc_t *kioc) iterator = 0; adapter = NULL; adapno = kioc->adapno; + is_found = false;
con_log(CL_ANN, ( KERN_WARNING "megaraid cmm: completed " "ioctl that was timedout before\n"));
list_for_each_entry(adapter, &adapters_list_g, list) { - if (iterator++ == adapno) break; + if (iterator++ == adapno) { + is_found = true; + break; + } }
kioc->timedout = 0;
- if (adapter) { + if (is_found) mraid_mm_dealloc_kioc( adapter, kioc ); - } + } else { wake_up(&wait_q);
From: Ye Bin yebin10@huawei.com
[ Upstream commit bc546c0c9abb3bb2fb46866b3d1e6ade9695a5f6 ]
The following BUG_ON() was observed during RDAC scan:
[595952.944297] kernel BUG at drivers/scsi/device_handler/scsi_dh_rdac.c:427! [595952.951143] Internal error: Oops - BUG: 0 [#1] SMP ...... [595953.251065] Call trace: [595953.259054] check_ownership+0xb0/0x118 [595953.269794] rdac_bus_attach+0x1f0/0x4b0 [595953.273787] scsi_dh_handler_attach+0x3c/0xe8 [595953.278211] scsi_dh_add_device+0xc4/0xe8 [595953.282291] scsi_sysfs_add_sdev+0x8c/0x2a8 [595953.286544] scsi_probe_and_add_lun+0x9fc/0xd00 [595953.291142] __scsi_scan_target+0x598/0x630 [595953.295395] scsi_scan_target+0x120/0x130 [595953.299481] fc_user_scan+0x1a0/0x1c0 [scsi_transport_fc] [595953.304944] store_scan+0xb0/0x108 [595953.308420] dev_attr_store+0x44/0x60 [595953.312160] sysfs_kf_write+0x58/0x80 [595953.315893] kernfs_fop_write+0xe8/0x1f0 [595953.319888] __vfs_write+0x60/0x190 [595953.323448] vfs_write+0xac/0x1c0 [595953.326836] ksys_write+0x74/0xf0 [595953.330221] __arm64_sys_write+0x24/0x30
Code is in check_ownership:
list_for_each_entry_rcu(tmp, &h->ctlr->dh_list, node) { /* h->sdev should always be valid */ BUG_ON(!tmp->sdev); tmp->sdev->access_state = access_state; }
rdac_bus_attach initialize_controller list_add_rcu(&h->node, &h->ctlr->dh_list); h->sdev = sdev;
rdac_bus_detach list_del_rcu(&h->node); h->sdev = NULL;
Fix the race between rdac_bus_attach() and rdac_bus_detach() where h->sdev is NULL when processing the RDAC attach.
Link: https://lore.kernel.org/r/20210113063103.2698953-1-yebin10@huawei.com Reviewed-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Ye Bin yebin10@huawei.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/device_handler/scsi_dh_rdac.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/device_handler/scsi_dh_rdac.c b/drivers/scsi/device_handler/scsi_dh_rdac.c index 6c629ef1bc4e..b3c23edd4b6c 100644 --- a/drivers/scsi/device_handler/scsi_dh_rdac.c +++ b/drivers/scsi/device_handler/scsi_dh_rdac.c @@ -453,8 +453,8 @@ static int initialize_controller(struct scsi_device *sdev, if (!h->ctlr) err = SCSI_DH_RES_TEMP_UNAVAIL; else { - list_add_rcu(&h->node, &h->ctlr->dh_list); h->sdev = sdev; + list_add_rcu(&h->node, &h->ctlr->dh_list); } spin_unlock(&list_lock); err = SCSI_DH_OK; @@ -779,11 +779,11 @@ static void rdac_bus_detach( struct scsi_device *sdev ) spin_lock(&list_lock); if (h->ctlr) { list_del_rcu(&h->node); - h->sdev = NULL; kref_put(&h->ctlr->kref, release_controller); } spin_unlock(&list_lock); sdev->handler_data = NULL; + synchronize_rcu(); kfree(h); }
From: Sreekanth Reddy sreekanth.reddy@broadcom.com
[ Upstream commit 70edd2e6f652f67d854981fd67f9ad0f1deaea92 ]
Avoid printing a 'target allocation failed' error if the driver target_alloc() callback function returns -ENXIO. This return value indicates that the corresponding H:C:T:L entry is empty.
Removing this error reduces the scan time if the user issues SCAN_WILD_CARD scan operation through sysfs parameter on a host with a lot of empty H:C:T:L entries.
Avoiding the printk on -ENXIO matches the behavior of the other callback functions during scanning.
Link: https://lore.kernel.org/r/20210726115402.1936-1-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy sreekanth.reddy@broadcom.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/scsi_scan.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c index 009a5b2aa3d0..149465de35b2 100644 --- a/drivers/scsi/scsi_scan.c +++ b/drivers/scsi/scsi_scan.c @@ -462,7 +462,8 @@ static struct scsi_target *scsi_alloc_target(struct device *parent, error = shost->hostt->target_alloc(starget);
if(error) { - dev_printk(KERN_ERR, dev, "target allocation failed, error %d\n", error); + if (error != -ENXIO) + dev_err(dev, "target allocation failed, error %d\n", error); /* don't want scsi_target_reap to do the final * put because it will be under the host lock */ scsi_target_destroy(starget);
From: Sudeep Holla sudeep.holla@arm.com
[ Upstream commit 47091f473b364c98207c4def197a0ae386fc9af1 ]
Once the new schema interrupt-controller/arm,vic.yaml is added, we get the below warnings:
arch/arm/boot/dts/ste-nomadik-nhk15.dt.yaml: intc@10140000: $nodename:0: 'intc@10140000' does not match '^interrupt-controller(@[0-9a-f,]+)*$'
Fix the node names for the interrupt controller to conform to the standard node name interrupt-controller@..
Signed-off-by: Sudeep Holla sudeep.holla@arm.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Cc: Linus Walleij linus.walleij@linaro.org Link: https://lore.kernel.org/r/20210617210825.3064367-2-sudeep.holla@arm.com Link: https://lore.kernel.org/r/20210626000103.830184-1-linus.walleij@linaro.org' Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/ste-nomadik-stn8815.dtsi | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm/boot/dts/ste-nomadik-stn8815.dtsi b/arch/arm/boot/dts/ste-nomadik-stn8815.dtsi index fca76a696d9d..9ba4d1630ca3 100644 --- a/arch/arm/boot/dts/ste-nomadik-stn8815.dtsi +++ b/arch/arm/boot/dts/ste-nomadik-stn8815.dtsi @@ -755,14 +755,14 @@ status = "disabled"; };
- vica: intc@10140000 { + vica: interrupt-controller@10140000 { compatible = "arm,versatile-vic"; interrupt-controller; #interrupt-cells = <1>; reg = <0x10140000 0x20>; };
- vicb: intc@10140020 { + vicb: interrupt-controller@10140020 { compatible = "arm,versatile-vic"; interrupt-controller; #interrupt-cells = <1>;
From: "Ivan T. Ivanov" iivanov@suse.de
[ Upstream commit 6b67d4d63edece1033972214704c04f36c5be89a ]
Currently phy_device state could be left in inconsistent state shown by following alert message[1]. This is because phy_read_status could be called concurrently from lan78xx_delayedwork, phy_state_machine and __ethtool_get_link. Fix this by making sure that phy_device state is updated atomically.
[1] lan78xx 1-1.1.1:1.0 eth0: No phy led trigger registered for speed(-1)
Signed-off-by: Ivan T. Ivanov iivanov@suse.de Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/lan78xx.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c index 5bd07cdb3e6e..ac5f72077b26 100644 --- a/drivers/net/usb/lan78xx.c +++ b/drivers/net/usb/lan78xx.c @@ -1172,7 +1172,7 @@ static int lan78xx_link_reset(struct lan78xx_net *dev) { struct phy_device *phydev = dev->net->phydev; struct ethtool_link_ksettings ecmd; - int ladv, radv, ret; + int ladv, radv, ret, link; u32 buf;
/* clear LAN78xx interrupt status */ @@ -1180,9 +1180,12 @@ static int lan78xx_link_reset(struct lan78xx_net *dev) if (unlikely(ret < 0)) return -EIO;
+ mutex_lock(&phydev->lock); phy_read_status(phydev); + link = phydev->link; + mutex_unlock(&phydev->lock);
- if (!phydev->link && dev->link_on) { + if (!link && dev->link_on) { dev->link_on = false;
/* reset MAC */ @@ -1195,7 +1198,7 @@ static int lan78xx_link_reset(struct lan78xx_net *dev) return -EIO;
del_timer(&dev->stat_monitor); - } else if (phydev->link && !dev->link_on) { + } else if (link && !dev->link_on) { dev->link_on = true;
phy_ethtool_ksettings_get(phydev, &ecmd); @@ -1485,9 +1488,14 @@ static int lan78xx_set_eee(struct net_device *net, struct ethtool_eee *edata)
static u32 lan78xx_get_link(struct net_device *net) { + u32 link; + + mutex_lock(&net->phydev->lock); phy_read_status(net->phydev); + link = net->phydev->link; + mutex_unlock(&net->phydev->lock);
- return net->phydev->link; + return link; }
static void lan78xx_get_drvinfo(struct net_device *net,
From: Ole Bjørn Midtbø omidtbo@cisco.com
[ Upstream commit cca342d98bef68151a80b024f7bf5f388d1fbdea ]
A different wait queue was used when removing ctrl_wait than when adding it. This effectively made the remove operation without locking compared to other operations on the wait queue ctrl_wait was part of. This caused issues like below where dead000000000100 is LIST_POISON1 and dead000000000200 is LIST_POISON2.
list_add corruption. next->prev should be prev (ffffffc1b0a33a08), \ but was dead000000000200. (next=ffffffc03ac77de0). ------------[ cut here ]------------ CPU: 3 PID: 2138 Comm: bluetoothd Tainted: G O 4.4.238+ #9 ... ---[ end trace 0adc2158f0646eac ]--- Call trace: [<ffffffc000443f78>] __list_add+0x38/0xb0 [<ffffffc0000f0d04>] add_wait_queue+0x4c/0x68 [<ffffffc00020eecc>] __pollwait+0xec/0x100 [<ffffffc000d1556c>] bt_sock_poll+0x74/0x200 [<ffffffc000bdb8a8>] sock_poll+0x110/0x128 [<ffffffc000210378>] do_sys_poll+0x220/0x480 [<ffffffc0002106f0>] SyS_poll+0x80/0x138 [<ffffffc00008510c>] __sys_trace_return+0x0/0x4
Unable to handle kernel paging request at virtual address dead000000000100 ... CPU: 4 PID: 5387 Comm: kworker/u15:3 Tainted: G W O 4.4.238+ #9 ... Call trace: [<ffffffc0000f079c>] __wake_up_common+0x7c/0xa8 [<ffffffc0000f0818>] __wake_up+0x50/0x70 [<ffffffc000be11b0>] sock_def_wakeup+0x58/0x60 [<ffffffc000de5e10>] l2cap_sock_teardown_cb+0x200/0x224 [<ffffffc000d3f2ac>] l2cap_chan_del+0xa4/0x298 [<ffffffc000d45ea0>] l2cap_conn_del+0x118/0x198 [<ffffffc000d45f8c>] l2cap_disconn_cfm+0x6c/0x78 [<ffffffc000d29934>] hci_event_packet+0x564/0x2e30 [<ffffffc000d19b0c>] hci_rx_work+0x10c/0x360 [<ffffffc0000c2218>] process_one_work+0x268/0x460 [<ffffffc0000c2678>] worker_thread+0x268/0x480 [<ffffffc0000c94e0>] kthread+0x118/0x128 [<ffffffc000085070>] ret_from_fork+0x10/0x20 ---[ end trace 0adc2158f0646ead ]---
Signed-off-by: Ole Bjørn Midtbø omidtbo@cisco.com Signed-off-by: Marcel Holtmann marcel@holtmann.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/bluetooth/hidp/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c index 253975cce943..0cbd0bca971f 100644 --- a/net/bluetooth/hidp/core.c +++ b/net/bluetooth/hidp/core.c @@ -1282,7 +1282,7 @@ static int hidp_session_thread(void *arg)
/* cleanup runtime environment */ remove_wait_queue(sk_sleep(session->intr_sock->sk), &intr_wait); - remove_wait_queue(sk_sleep(session->intr_sock->sk), &ctrl_wait); + remove_wait_queue(sk_sleep(session->ctrl_sock->sk), &ctrl_wait); wake_up_interruptible(&session->report_queue); hidp_del_timer(session);
From: Marek Behún kabel@kernel.org
[ Upstream commit 484f2b7c61b9ae58cc00c5127bcbcd9177af8dfe ]
The 1.2 GHz variant of the Armada 3720 SOC is unstable with DVFS: when the SOC boots, the WTMI firmware sets clocks and AVS values that work correctly with 1.2 GHz CPU frequency, but random crashes occur once cpufreq driver starts scaling.
We do not know currently what is the reason: - it may be that the voltage value for L0 for 1.2 GHz variant provided by the vendor in the OTP is simply incorrect when scaling is used, - it may be that some delay is needed somewhere, - it may be something else.
The most sane solution now seems to be to simply forbid the cpufreq driver on 1.2 GHz variant.
Signed-off-by: Marek Behún kabel@kernel.org Fixes: 92ce45fb875d ("cpufreq: Add DVFS support for Armada 37xx") Signed-off-by: Viresh Kumar viresh.kumar@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/cpufreq/armada-37xx-cpufreq.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/cpufreq/armada-37xx-cpufreq.c b/drivers/cpufreq/armada-37xx-cpufreq.c index a36452bd9612..31b5655419b4 100644 --- a/drivers/cpufreq/armada-37xx-cpufreq.c +++ b/drivers/cpufreq/armada-37xx-cpufreq.c @@ -102,7 +102,11 @@ struct armada_37xx_dvfs { };
static struct armada_37xx_dvfs armada_37xx_dvfs[] = { - {.cpu_freq_max = 1200*1000*1000, .divider = {1, 2, 4, 6} }, + /* + * The cpufreq scaling for 1.2 GHz variant of the SOC is currently + * unstable because we do not know how to configure it properly. + */ + /* {.cpu_freq_max = 1200*1000*1000, .divider = {1, 2, 4, 6} }, */ {.cpu_freq_max = 1000*1000*1000, .divider = {1, 2, 4, 5} }, {.cpu_freq_max = 800*1000*1000, .divider = {1, 2, 3, 4} }, {.cpu_freq_max = 600*1000*1000, .divider = {2, 4, 5, 6} },
From: Randy Dunlap rdunlap@infradead.org
[ Upstream commit 86aab09a4870bb8346c9579864588c3d7f555299 ]
GCC complains about empty macros in an 'if' statement, so convert them to 'do {} while (0)' macros.
Fixes these build warnings:
net/dccp/output.c: In function 'dccp_xmit_packet': ../net/dccp/output.c:283:71: warning: suggest braces around empty body in an 'if' statement [-Wempty-body] 283 | dccp_pr_debug("transmit_skb() returned err=%d\n", err); net/dccp/ackvec.c: In function 'dccp_ackvec_update_old': ../net/dccp/ackvec.c:163:80: warning: suggest braces around empty body in an 'else' statement [-Wempty-body] 163 | (unsigned long long)seqno, state);
Fixes: dc841e30eaea ("dccp: Extend CCID packet dequeueing interface") Fixes: 380240864451 ("dccp ccid-2: Update code for the Ack Vector input/registration routine") Signed-off-by: Randy Dunlap rdunlap@infradead.org Cc: dccp@vger.kernel.org Cc: "David S. Miller" davem@davemloft.net Cc: Jakub Kicinski kuba@kernel.org Cc: Gerrit Renker gerrit@erg.abdn.ac.uk Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/dccp/dccp.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/dccp/dccp.h b/net/dccp/dccp.h index f91e3816806b..aec3c724665f 100644 --- a/net/dccp/dccp.h +++ b/net/dccp/dccp.h @@ -44,9 +44,9 @@ extern bool dccp_debug; #define dccp_pr_debug_cat(format, a...) DCCP_PRINTK(dccp_debug, format, ##a) #define dccp_debug(fmt, a...) dccp_pr_debug_cat(KERN_DEBUG fmt, ##a) #else -#define dccp_pr_debug(format, a...) -#define dccp_pr_debug_cat(format, a...) -#define dccp_debug(format, a...) +#define dccp_pr_debug(format, a...) do {} while (0) +#define dccp_pr_debug_cat(format, a...) do {} while (0) +#define dccp_debug(format, a...) do {} while (0) #endif
extern struct inet_hashinfo dccp_hashinfo;
From: Xie Yongji xieyongji@bytedance.com
[ Upstream commit f7ad318ea0ad58ebe0e595e59aed270bb643b29b ]
This fixes the incorrect calculation for integer overflow when the last address of iova range is 0xffffffff.
Fixes: ec33d031a14b ("vhost: detect 32 bit integer wrap around") Reported-by: Jason Wang jasowang@redhat.com Signed-off-by: Xie Yongji xieyongji@bytedance.com Acked-by: Jason Wang jasowang@redhat.com Link: https://lore.kernel.org/r/20210728130756.97-2-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/vhost/vhost.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index 732327756ee1..7a58f629155d 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -678,10 +678,16 @@ static bool log_access_ok(void __user *log_base, u64 addr, unsigned long sz) (sz + VHOST_PAGE_SIZE * 8 - 1) / VHOST_PAGE_SIZE / 8); }
+/* Make sure 64 bit math will not overflow. */ static bool vhost_overflow(u64 uaddr, u64 size) { - /* Make sure 64 bit math will not overflow. */ - return uaddr > ULONG_MAX || size > ULONG_MAX || uaddr > ULONG_MAX - size; + if (uaddr > ULONG_MAX || size > ULONG_MAX) + return true; + + if (!size) + return false; + + return uaddr > ULONG_MAX - size + 1; }
/* Caller should have vq mutex and device mutex. */
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit 3c603136c9f82833813af77185618de5af67676c ]
We can't take the tx lock from the napi poll routine, because netpoll can poll napi at any moment, including with the tx lock already held.
The tx lock is protecting against two paths - the disable path, and (as Michael points out) the NETDEV_TX_BUSY case which may occur if NAPI completions race with start_xmit and both decide to re-enable the queue.
For the disable/ifdown path use synchronize_net() to make sure closing the device does not race we restarting the queues. Annotate accesses to dev_state against data races.
For the NAPI cleanup vs start_xmit path - appropriate barriers are already in place in the main spot where Tx queue is stopped but we need to do the same careful dance in the TX_BUSY case.
Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.") Reviewed-by: Michael Chan michael.chan@broadcom.com Reviewed-by: Edwin Peer edwin.peer@broadcom.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 54 ++++++++++++++--------- 1 file changed, 32 insertions(+), 22 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index ebcf4ea66385..c4ddd8f71b93 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -282,6 +282,26 @@ static u16 bnxt_xmit_get_cfa_action(struct sk_buff *skb) return md_dst->u.port_info.port_id; }
+static bool bnxt_txr_netif_try_stop_queue(struct bnxt *bp, + struct bnxt_tx_ring_info *txr, + struct netdev_queue *txq) +{ + netif_tx_stop_queue(txq); + + /* netif_tx_stop_queue() must be done before checking + * tx index in bnxt_tx_avail() below, because in + * bnxt_tx_int(), we update tx index before checking for + * netif_tx_queue_stopped(). + */ + smp_mb(); + if (bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh) { + netif_tx_wake_queue(txq); + return false; + } + + return true; +} + static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev) { struct bnxt *bp = netdev_priv(dev); @@ -309,8 +329,8 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
free_size = bnxt_tx_avail(bp, txr); if (unlikely(free_size < skb_shinfo(skb)->nr_frags + 2)) { - netif_tx_stop_queue(txq); - return NETDEV_TX_BUSY; + if (bnxt_txr_netif_try_stop_queue(bp, txr, txq)) + return NETDEV_TX_BUSY; }
length = skb->len; @@ -521,16 +541,7 @@ tx_done: if (skb->xmit_more && !tx_buf->is_push) bnxt_db_write(bp, txr->tx_doorbell, DB_KEY_TX | prod);
- netif_tx_stop_queue(txq); - - /* netif_tx_stop_queue() must be done before checking - * tx index in bnxt_tx_avail() below, because in - * bnxt_tx_int(), we update tx index before checking for - * netif_tx_queue_stopped(). - */ - smp_mb(); - if (bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh) - netif_tx_wake_queue(txq); + bnxt_txr_netif_try_stop_queue(bp, txr, txq); } return NETDEV_TX_OK;
@@ -614,14 +625,9 @@ next_tx_int: smp_mb();
if (unlikely(netif_tx_queue_stopped(txq)) && - (bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh)) { - __netif_tx_lock(txq, smp_processor_id()); - if (netif_tx_queue_stopped(txq) && - bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh && - txr->dev_state != BNXT_DEV_STATE_CLOSING) - netif_tx_wake_queue(txq); - __netif_tx_unlock(txq); - } + bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh && + READ_ONCE(txr->dev_state) != BNXT_DEV_STATE_CLOSING) + netif_tx_wake_queue(txq); }
static struct page *__bnxt_alloc_rx_page(struct bnxt *bp, dma_addr_t *mapping, @@ -6294,9 +6300,11 @@ void bnxt_tx_disable(struct bnxt *bp) if (bp->tx_ring) { for (i = 0; i < bp->tx_nr_rings; i++) { txr = &bp->tx_ring[i]; - txr->dev_state = BNXT_DEV_STATE_CLOSING; + WRITE_ONCE(txr->dev_state, BNXT_DEV_STATE_CLOSING); } } + /* Make sure napi polls see @dev_state change */ + synchronize_net(); /* Drop carrier first to prevent TX timeout */ netif_carrier_off(bp->dev); /* Stop all TX queues */ @@ -6310,8 +6318,10 @@ void bnxt_tx_enable(struct bnxt *bp)
for (i = 0; i < bp->tx_nr_rings; i++) { txr = &bp->tx_ring[i]; - txr->dev_state = 0; + WRITE_ONCE(txr->dev_state, 0); } + /* Make sure napi polls see @dev_state change */ + synchronize_net(); netif_tx_wake_all_queues(bp->dev); if (bp->link_info.link_up) netif_carrier_on(bp->dev);
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit 01cca6b9330ac7460de44eeeb3a0607f8aae69ff ]
napi schedules DIM, napi has to be disabled first, then DIM canceled.
Noticed while reading the code.
Fixes: 0bc0b97fca73 ("bnxt_en: cleanup DIM work on device shutdown") Fixes: 6a8788f25625 ("bnxt_en: add support for software dynamic interrupt moderation") Reviewed-by: Michael Chan michael.chan@broadcom.com Reviewed-by: Edwin Peer edwin.peer@broadcom.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index c4ddd8f71b93..55827ac65a15 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -6269,10 +6269,9 @@ static void bnxt_disable_napi(struct bnxt *bp) for (i = 0; i < bp->cp_nr_rings; i++) { struct bnxt_cp_ring_info *cpr = &bp->bnapi[i]->cp_ring;
+ napi_disable(&bp->bnapi[i]->napi); if (bp->bnapi[i]->rx_ring) cancel_work_sync(&cpr->dim.work); - - napi_disable(&bp->bnapi[i]->napi); } }
From: Pavel Skripkin paskripkin@gmail.com
[ Upstream commit 19d1532a187669ce86d5a2696eb7275310070793 ]
Syzbot reported slab-out-of bounds write in decode_data(). The problem was in missing validation checks.
Syzbot's reproducer generated malicious input, which caused decode_data() to be called a lot in sixpack_decode(). Since rx_count_cooked is only 400 bytes and noone reported before, that 400 bytes is not enough, let's just check if input is malicious and complain about buffer overrun.
Fail log: ================================================================== BUG: KASAN: slab-out-of-bounds in drivers/net/hamradio/6pack.c:843 Write of size 1 at addr ffff888087c5544e by task kworker/u4:0/7
CPU: 0 PID: 7 Comm: kworker/u4:0 Not tainted 5.6.0-rc3-syzkaller #0 ... Workqueue: events_unbound flush_to_ldisc Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x197/0x210 lib/dump_stack.c:118 print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374 __kasan_report.cold+0x1b/0x32 mm/kasan/report.c:506 kasan_report+0x12/0x20 mm/kasan/common.c:641 __asan_report_store1_noabort+0x17/0x20 mm/kasan/generic_report.c:137 decode_data.part.0+0x23b/0x270 drivers/net/hamradio/6pack.c:843 decode_data drivers/net/hamradio/6pack.c:965 [inline] sixpack_decode drivers/net/hamradio/6pack.c:968 [inline]
Reported-and-tested-by: syzbot+fc8cd9a673d4577fb2e4@syzkaller.appspotmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Pavel Skripkin paskripkin@gmail.com Reviewed-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/hamradio/6pack.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/net/hamradio/6pack.c b/drivers/net/hamradio/6pack.c index 8c636c493227..1001e9a2edd4 100644 --- a/drivers/net/hamradio/6pack.c +++ b/drivers/net/hamradio/6pack.c @@ -859,6 +859,12 @@ static void decode_data(struct sixpack *sp, unsigned char inbyte) return; }
+ if (sp->rx_count_cooked + 2 >= sizeof(sp->cooked_buf)) { + pr_err("6pack: cooked buffer overrun, data loss\n"); + sp->rx_count = 0; + return; + } + buf = sp->raw_buf; sp->cooked_buf[sp->rx_count_cooked++] = buf[0] | ((buf[1] << 2) & 0xc0);
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
[ Upstream commit 55c8fca1dae1fb0d11deaa21b65a647dedb1bc50 ]
During the swap dependency on PCH_GBE to selection PTP_1588_CLOCK_PCH incidentally dropped the implicit dependency on the PCI. Restore it.
Fixes: 18d359ceb044 ("pch_gbe, ptp_pch: Fix the dependency direction between these drivers") Reported-by: kernel test robot lkp@intel.com Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ptp/Kconfig | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/ptp/Kconfig b/drivers/ptp/Kconfig index d137c480db46..dd04aedd76e0 100644 --- a/drivers/ptp/Kconfig +++ b/drivers/ptp/Kconfig @@ -91,7 +91,8 @@ config DP83640_PHY config PTP_1588_CLOCK_PCH tristate "Intel PCH EG20T as PTP clock" depends on X86_32 || COMPILE_TEST - depends on HAS_IOMEM && NET + depends on HAS_IOMEM && PCI + depends on NET imply PTP_1588_CLOCK help This driver adds support for using the PCH EG20T as a PTP
From: Dinghao Liu dinghao.liu@zju.edu.cn
[ Upstream commit 0a298d133893c72c96e2156ed7cb0f0c4a306a3e ]
qlcnic_83xx_unlock_flash() is called on all paths after we call qlcnic_83xx_lock_flash(), except for one error path on failure of QLCRD32(), which may cause a deadlock. This bug is suggested by a static analysis tool, please advise.
Fixes: 81d0aeb0a4fff ("qlcnic: flash template based firmware reset recovery") Signed-off-by: Dinghao Liu dinghao.liu@zju.edu.cn Link: https://lore.kernel.org/r/20210816131405.24024-1-dinghao.liu@zju.edu.cn Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c index 6ed8294f7df8..a15845e511b2 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c @@ -3158,8 +3158,10 @@ int qlcnic_83xx_flash_read32(struct qlcnic_adapter *adapter, u32 flash_addr,
indirect_addr = QLC_83XX_FLASH_DIRECT_DATA(addr); ret = QLCRD32(adapter, indirect_addr, &err); - if (err == -EIO) + if (err == -EIO) { + qlcnic_83xx_unlock_flash(adapter); return err; + }
word = ret; *(u32 *)p_data = word;
From: Saravana Kannan saravanak@google.com
[ Upstream commit 99d81e942474cc7677d12f673f42a7ea699e2589 ]
If we are seeing memory allocation errors, don't try to continue registering child mdiobus devices. It's unlikely they'll succeed.
Fixes: 342fa1964439 ("mdio: mux: make child bus walking more permissive and errors more verbose") Signed-off-by: Saravana Kannan saravanak@google.com Reviewed-by: Andrew Lunn andrew@lunn.ch Acked-by: Marc Zyngier maz@kernel.org Tested-by: Marc Zyngier maz@kernel.org Acked-by: Kevin Hilman khilman@baylibre.com Tested-by: Kevin Hilman khilman@baylibre.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/mdio-mux.c | 28 ++++++++++++++++++---------- 1 file changed, 18 insertions(+), 10 deletions(-)
diff --git a/drivers/net/phy/mdio-mux.c b/drivers/net/phy/mdio-mux.c index 0a86f1e4c02f..bb7e3f12a003 100644 --- a/drivers/net/phy/mdio-mux.c +++ b/drivers/net/phy/mdio-mux.c @@ -85,6 +85,17 @@ out:
static int parent_count;
+static void mdio_mux_uninit_children(struct mdio_mux_parent_bus *pb) +{ + struct mdio_mux_child_bus *cb = pb->children; + + while (cb) { + mdiobus_unregister(cb->mii_bus); + mdiobus_free(cb->mii_bus); + cb = cb->next; + } +} + int mdio_mux_init(struct device *dev, struct device_node *mux_node, int (*switch_fn)(int cur, int desired, void *data), @@ -147,7 +158,7 @@ int mdio_mux_init(struct device *dev, cb = devm_kzalloc(dev, sizeof(*cb), GFP_KERNEL); if (!cb) { ret_val = -ENOMEM; - continue; + goto err_loop; } cb->bus_number = v; cb->parent = pb; @@ -155,8 +166,7 @@ int mdio_mux_init(struct device *dev, cb->mii_bus = mdiobus_alloc(); if (!cb->mii_bus) { ret_val = -ENOMEM; - devm_kfree(dev, cb); - continue; + goto err_loop; } cb->mii_bus->priv = cb;
@@ -185,6 +195,10 @@ int mdio_mux_init(struct device *dev,
dev_err(dev, "Error: No acceptable child buses found\n"); devm_kfree(dev, pb); + +err_loop: + mdio_mux_uninit_children(pb); + of_node_put(child_bus_node); err_pb_kz: put_device(&parent_bus->dev); err_parent_bus: @@ -196,14 +210,8 @@ EXPORT_SYMBOL_GPL(mdio_mux_init); void mdio_mux_uninit(void *mux_handle) { struct mdio_mux_parent_bus *pb = mux_handle; - struct mdio_mux_child_bus *cb = pb->children; - - while (cb) { - mdiobus_unregister(cb->mii_bus); - mdiobus_free(cb->mii_bus); - cb = cb->next; - }
+ mdio_mux_uninit_children(pb); put_device(&pb->mii_bus->dev); } EXPORT_SYMBOL_GPL(mdio_mux_uninit);
From: Saravana Kannan saravanak@google.com
[ Upstream commit 7bd0cef5dac685f09ef8b0b2a7748ff42d284dc7 ]
When registering mdiobus children, if we get an -EPROBE_DEFER, we shouldn't ignore it and continue registering the rest of the mdiobus children. This would permanently prevent the deferring child mdiobus from working instead of reattempting it in the future. So, if a child mdiobus needs to be reattempted in the future, defer the entire mdio-mux initialization.
This fixes the issue where PHYs sitting under the mdio-mux aren't initialized correctly if the PHY's interrupt controller is not yet ready when the mdio-mux is being probed. Additional context in the link below.
Fixes: 0ca2997d1452 ("netdev/of/phy: Add MDIO bus multiplexer support.") Link: https://lore.kernel.org/lkml/CAGETcx95kHrv8wA-O+-JtfH7H9biJEGJtijuPVN0V5dUKU... Signed-off-by: Saravana Kannan saravanak@google.com Reviewed-by: Andrew Lunn andrew@lunn.ch Acked-by: Marc Zyngier maz@kernel.org Tested-by: Marc Zyngier maz@kernel.org Acked-by: Kevin Hilman khilman@baylibre.com Tested-by: Kevin Hilman khilman@baylibre.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/mdio-mux.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/net/phy/mdio-mux.c b/drivers/net/phy/mdio-mux.c index bb7e3f12a003..c16f875ed9ea 100644 --- a/drivers/net/phy/mdio-mux.c +++ b/drivers/net/phy/mdio-mux.c @@ -178,11 +178,15 @@ int mdio_mux_init(struct device *dev, cb->mii_bus->write = mdio_mux_write; r = of_mdiobus_register(cb->mii_bus, child_bus_node); if (r) { + mdiobus_free(cb->mii_bus); + if (r == -EPROBE_DEFER) { + ret_val = r; + goto err_loop; + } + devm_kfree(dev, cb); dev_err(dev, "Error: Failed to register MDIO bus for child %pOF\n", child_bus_node); - mdiobus_free(cb->mii_bus); - devm_kfree(dev, cb); } else { cb->next = pb->children; pb->children = cb;
From: Vincent Whitchurch vincent.whitchurch@axis.com
[ Upstream commit 25f8203b4be1937c4939bb98623e67dcfd7da4d1 ]
When a Data CRC interrupt is received, the driver disables the DMA, then sends the stop/abort command and then waits for Data Transfer Over.
However, sometimes, when a data CRC error is received in the middle of a multi-block write transfer, the Data Transfer Over interrupt is never received, and the driver hangs and never completes the request.
The driver sets the BMOD.SWR bit (SDMMC_IDMAC_SWRESET) when stopping the DMA, but according to the manual CMD.STOP_ABORT_CMD should be programmed "before assertion of SWR". Do these operations in the recommended order. With this change the Data Transfer Over is always received correctly in my tests.
Signed-off-by: Vincent Whitchurch vincent.whitchurch@axis.com Reviewed-by: Jaehoon Chung jh80.chung@samsung.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210630102232.16011-1-vincent.whitchurch@axis.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mmc/host/dw_mmc.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c index 22c454c7aaca..8e09586f880f 100644 --- a/drivers/mmc/host/dw_mmc.c +++ b/drivers/mmc/host/dw_mmc.c @@ -2043,8 +2043,8 @@ static void dw_mci_tasklet_func(unsigned long priv) continue; }
- dw_mci_stop_dma(host); send_stop_abort(host, data); + dw_mci_stop_dma(host); state = STATE_SENDING_STOP; break; } @@ -2068,10 +2068,10 @@ static void dw_mci_tasklet_func(unsigned long priv) */ if (test_and_clear_bit(EVENT_DATA_ERROR, &host->pending_events)) { - dw_mci_stop_dma(host); if (!(host->data_status & (SDMMC_INT_DRTO | SDMMC_INT_EBE))) send_stop_abort(host, data); + dw_mci_stop_dma(host); state = STATE_DATA_ERROR; break; } @@ -2104,10 +2104,10 @@ static void dw_mci_tasklet_func(unsigned long priv) */ if (test_and_clear_bit(EVENT_DATA_ERROR, &host->pending_events)) { - dw_mci_stop_dma(host); if (!(host->data_status & (SDMMC_INT_DRTO | SDMMC_INT_EBE))) send_stop_abort(host, data); + dw_mci_stop_dma(host); state = STATE_DATA_ERROR; break; }
From: Jaroslav Kysela perex@perex.cz
[ Upstream commit a2befe9380dd04ee76c871568deca00eedf89134 ]
The original code in the cap_put_caller() function does not handle correctly the positive values returned from the passed function for multiple iterations. It means that the change notifications may be lost.
Fixes: 352f7f914ebb ("ALSA: hda - Merge Realtek parser code to generic parser") BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=213851 Cc: stable@kernel.org Signed-off-by: Jaroslav Kysela perex@perex.cz Link: https://lore.kernel.org/r/20210811161441.1325250-1-perex@perex.cz Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/hda/hda_generic.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/sound/pci/hda/hda_generic.c b/sound/pci/hda/hda_generic.c index 6099a9f1cb3d..ff263ad19230 100644 --- a/sound/pci/hda/hda_generic.c +++ b/sound/pci/hda/hda_generic.c @@ -3470,7 +3470,7 @@ static int cap_put_caller(struct snd_kcontrol *kcontrol, struct hda_gen_spec *spec = codec->spec; const struct hda_input_mux *imux; struct nid_path *path; - int i, adc_idx, err = 0; + int i, adc_idx, ret, err = 0;
imux = &spec->input_mux; adc_idx = kcontrol->id.index; @@ -3480,9 +3480,13 @@ static int cap_put_caller(struct snd_kcontrol *kcontrol, if (!path || !path->ctls[type]) continue; kcontrol->private_value = path->ctls[type]; - err = func(kcontrol, ucontrol); - if (err < 0) + ret = func(kcontrol, ucontrol); + if (ret < 0) { + err = ret; break; + } + if (ret > 0) + err = 1; } mutex_unlock(&codec->control_mutex); if (err >= 0 && spec->cap_sync_hook)
From: "Steven Rostedt (VMware)" rostedt@goodmis.org
[ Upstream commit 5acce0bff2a0420ce87d4591daeb867f47d552c2 ]
The following commands:
# echo 'read_max u64 size;' > synthetic_events # echo 'hist:keys=common_pid:count=count:onmax($count).trace(read_max,count)' > events/syscalls/sys_enter_read/trigger
Causes:
BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP CPU: 4 PID: 1763 Comm: bash Not tainted 5.14.0-rc2-test+ #155 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016 RIP: 0010:strcmp+0xc/0x20 Code: 75 f7 31 c0 0f b6 0c 06 88 0c 02 48 83 c0 01 84 c9 75 f1 4c 89 c0 c3 0f 1f 80 00 00 00 00 31 c0 eb 08 48 83 c0 01 84 d2 74 0f <0f> b6 14 07 3a 14 06 74 ef 19 c0 83 c8 01 c3 31 c0 c3 66 90 48 89 RSP: 0018:ffffb5fdc0963ca8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffffffffb3a4e040 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff9714c0d0b640 RDI: 0000000000000000 RBP: 0000000000000000 R08: 00000022986b7cde R09: ffffffffb3a4dff8 R10: 0000000000000000 R11: 0000000000000000 R12: ffff9714c50603c8 R13: 0000000000000000 R14: ffff97143fdf9e48 R15: ffff9714c01a2210 FS: 00007f1fa6785740(0000) GS:ffff9714da400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000002d863004 CR4: 00000000001706e0 Call Trace: __find_event_file+0x4e/0x80 action_create+0x6b7/0xeb0 ? kstrdup+0x44/0x60 event_hist_trigger_func+0x1a07/0x2130 trigger_process_regex+0xbd/0x110 event_trigger_write+0x71/0xd0 vfs_write+0xe9/0x310 ksys_write+0x68/0xe0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f1fa6879e87
The problem was the "trace(read_max,count)" where the "count" should be "$count" as "onmax()" only handles variables (although it really should be able to figure out that "count" is a field of sys_enter_read). But there's a path that does not find the variable and ends up passing a NULL for the event, which ends up getting passed to "strcmp()".
Add a check for NULL to return and error on the command with:
# cat error_log hist:syscalls:sys_enter_read: error: Couldn't create or find variable Command: hist:keys=common_pid:count=count:onmax($count).trace(read_max,count) ^ Link: https://lkml.kernel.org/r/20210808003011.4037f8d0@oasis.local.home
Cc: Masami Hiramatsu mhiramat@kernel.org Cc: stable@vger.kernel.org Fixes: 50450603ec9cb tracing: Add 'onmax' hist trigger action support Reviewed-by: Tom Zanussi zanussi@kernel.org Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/trace_events_hist.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c index bbde8d3d6c8a..44d1340634f6 100644 --- a/kernel/trace/trace_events_hist.c +++ b/kernel/trace/trace_events_hist.c @@ -3786,6 +3786,8 @@ onmatch_create_field_var(struct hist_trigger_data *hist_data, event = data->onmatch.match_event; }
+ if (!event) + goto free; /* * At this point, we're looking at a field on another * event. Because we can't modify a hist trigger on
From: Srinivas Kandagatla srinivas.kandagatla@linaro.org
[ Upstream commit 9659281ce78de0f15a4aa124da8f7450b1399c09 ]
As tid is unsigned its hard to figure out if the tid is valid or invalid. So Start the transaction ids from 1 instead of zero so that we could differentiate between a valid tid and invalid tids
This is useful in cases where controller would add a tid for controller specific transfers.
Fixes: d3062a210930 ("slimbus: messaging: add slim_alloc/free_txn_tid()") Cc: stable@vger.kernel.org Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20210809082428.11236-2-srinivas.kandagatla@linaro.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/slimbus/messaging.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/slimbus/messaging.c b/drivers/slimbus/messaging.c index d5879142dbef..3b77713f1e3f 100644 --- a/drivers/slimbus/messaging.c +++ b/drivers/slimbus/messaging.c @@ -66,7 +66,7 @@ int slim_alloc_txn_tid(struct slim_controller *ctrl, struct slim_msg_txn *txn) int ret = 0;
spin_lock_irqsave(&ctrl->txn_lock, flags); - ret = idr_alloc_cyclic(&ctrl->tid_idr, txn, 0, + ret = idr_alloc_cyclic(&ctrl->tid_idr, txn, 1, SLIM_MAX_TIDS, GFP_ATOMIC); if (ret < 0) { spin_unlock_irqrestore(&ctrl->txn_lock, flags);
From: Srinivas Kandagatla srinivas.kandagatla@linaro.org
[ Upstream commit a263c1ff6abe0e66712f40d595bbddc7a35907f8 ]
In some usecases transaction ids are dynamically allocated inside the controller driver after sending the messages which have generic acknowledge responses. So check for this before refcounting pm_runtime.
Without this we would end up imbalancing runtime pm count by doing pm_runtime_put() in both slim_do_transfer() and slim_msg_response() for a single pm_runtime_get() in slim_do_transfer()
Fixes: d3062a210930 ("slimbus: messaging: add slim_alloc/free_txn_tid()") Cc: stable@vger.kernel.org Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20210809082428.11236-3-srinivas.kandagatla@linaro.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/slimbus/messaging.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/slimbus/messaging.c b/drivers/slimbus/messaging.c index 3b77713f1e3f..ddf0371ad52b 100644 --- a/drivers/slimbus/messaging.c +++ b/drivers/slimbus/messaging.c @@ -131,7 +131,8 @@ int slim_do_transfer(struct slim_controller *ctrl, struct slim_msg_txn *txn) goto slim_xfer_err; } } - + /* Initialize tid to invalid value */ + txn->tid = 0; need_tid = slim_tid_txn(txn->mt, txn->mc);
if (need_tid) { @@ -163,7 +164,7 @@ int slim_do_transfer(struct slim_controller *ctrl, struct slim_msg_txn *txn) txn->mt, txn->mc, txn->la, ret);
slim_xfer_err: - if (!clk_pause_msg && (!need_tid || ret == -ETIMEDOUT)) { + if (!clk_pause_msg && (txn->tid == 0 || ret == -ETIMEDOUT)) { /* * remove runtime-pm vote if this was TX only, or * if there was error during this transaction
From: Srinivas Kandagatla srinivas.kandagatla@linaro.org
[ Upstream commit d77772538f00b7265deace6e77e555ee18365ad0 ]
During suspend/resume NGD remote instance is power cycled along with remotely controlled bam dma engine. So Reset the dma configuration during this suspend resume path so that we are not dealing with any stale dma setup.
Without this transactions timeout after first suspend resume path.
Fixes: 917809e2280b ("slimbus: ngd: Add qcom SLIMBus NGD driver") Cc: stable@vger.kernel.org Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20210809082428.11236-5-srinivas.kandagatla@linaro.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/slimbus/qcom-ngd-ctrl.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/slimbus/qcom-ngd-ctrl.c b/drivers/slimbus/qcom-ngd-ctrl.c index 44021620d101..1a5311fb45a5 100644 --- a/drivers/slimbus/qcom-ngd-ctrl.c +++ b/drivers/slimbus/qcom-ngd-ctrl.c @@ -1060,7 +1060,8 @@ static void qcom_slim_ngd_setup(struct qcom_slim_ngd_ctrl *ctrl) { u32 cfg = readl_relaxed(ctrl->ngd->base);
- if (ctrl->state == QCOM_SLIM_NGD_CTRL_DOWN) + if (ctrl->state == QCOM_SLIM_NGD_CTRL_DOWN || + ctrl->state == QCOM_SLIM_NGD_CTRL_ASLEEP) qcom_slim_ngd_init_dma(ctrl);
/* By default enable message queues */ @@ -1111,6 +1112,7 @@ static int qcom_slim_ngd_power_up(struct qcom_slim_ngd_ctrl *ctrl) dev_info(ctrl->dev, "Subsys restart: ADSP active framer\n"); return 0; } + qcom_slim_ngd_setup(ctrl); return 0; }
@@ -1496,6 +1498,7 @@ static int __maybe_unused qcom_slim_ngd_runtime_suspend(struct device *dev) struct qcom_slim_ngd_ctrl *ctrl = dev_get_drvdata(dev); int ret = 0;
+ qcom_slim_ngd_exit_dma(ctrl); if (!ctrl->qmi.handle) return 0;
From: Dongliang Mu mudongliangabcd@gmail.com
[ Upstream commit 57a1681095f912239c7fb4d66683ab0425973838 ]
The function tpci200_register called by tpci200_install and tpci200_unregister called by tpci200_uninstall are in pair. However, tpci200_unregister has some cleanup operations not in the tpci200_register. So the error handling code of tpci200_pci_probe has many different double free issues.
Fix this problem by moving those cleanup operations out of tpci200_unregister, into tpci200_pci_remove and reverting the previous commit 9272e5d0028d ("ipack/carriers/tpci200: Fix a double free in tpci200_pci_probe").
Fixes: 9272e5d0028d ("ipack/carriers/tpci200: Fix a double free in tpci200_pci_probe") Cc: stable@vger.kernel.org Reported-by: Dongliang Mu mudongliangabcd@gmail.com Signed-off-by: Dongliang Mu mudongliangabcd@gmail.com Link: https://lore.kernel.org/r/20210810100323.3938492-1-mudongliangabcd@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ipack/carriers/tpci200.c | 36 ++++++++++++++++---------------- 1 file changed, 18 insertions(+), 18 deletions(-)
diff --git a/drivers/ipack/carriers/tpci200.c b/drivers/ipack/carriers/tpci200.c index 7895320e50c1..2172d1efa71e 100644 --- a/drivers/ipack/carriers/tpci200.c +++ b/drivers/ipack/carriers/tpci200.c @@ -94,16 +94,13 @@ static void tpci200_unregister(struct tpci200_board *tpci200) free_irq(tpci200->info->pdev->irq, (void *) tpci200);
pci_iounmap(tpci200->info->pdev, tpci200->info->interface_regs); - pci_iounmap(tpci200->info->pdev, tpci200->info->cfg_regs);
pci_release_region(tpci200->info->pdev, TPCI200_IP_INTERFACE_BAR); pci_release_region(tpci200->info->pdev, TPCI200_IO_ID_INT_SPACES_BAR); pci_release_region(tpci200->info->pdev, TPCI200_MEM16_SPACE_BAR); pci_release_region(tpci200->info->pdev, TPCI200_MEM8_SPACE_BAR); - pci_release_region(tpci200->info->pdev, TPCI200_CFG_MEM_BAR);
pci_disable_device(tpci200->info->pdev); - pci_dev_put(tpci200->info->pdev); }
static void tpci200_enable_irq(struct tpci200_board *tpci200, @@ -532,7 +529,7 @@ static int tpci200_pci_probe(struct pci_dev *pdev, tpci200->info = kzalloc(sizeof(struct tpci200_infos), GFP_KERNEL); if (!tpci200->info) { ret = -ENOMEM; - goto out_err_info; + goto err_tpci200; }
pci_dev_get(pdev); @@ -543,7 +540,7 @@ static int tpci200_pci_probe(struct pci_dev *pdev, if (ret) { dev_err(&pdev->dev, "Failed to allocate PCI Configuration Memory"); ret = -EBUSY; - goto out_err_pci_request; + goto err_tpci200_info; } tpci200->info->cfg_regs = ioremap_nocache( pci_resource_start(pdev, TPCI200_CFG_MEM_BAR), @@ -551,7 +548,7 @@ static int tpci200_pci_probe(struct pci_dev *pdev, if (!tpci200->info->cfg_regs) { dev_err(&pdev->dev, "Failed to map PCI Configuration Memory"); ret = -EFAULT; - goto out_err_ioremap; + goto err_request_region; }
/* Disable byte swapping for 16 bit IP module access. This will ensure @@ -574,7 +571,7 @@ static int tpci200_pci_probe(struct pci_dev *pdev, if (ret) { dev_err(&pdev->dev, "error during tpci200 install\n"); ret = -ENODEV; - goto out_err_install; + goto err_cfg_regs; }
/* Register the carrier in the industry pack bus driver */ @@ -586,7 +583,7 @@ static int tpci200_pci_probe(struct pci_dev *pdev, dev_err(&pdev->dev, "error registering the carrier on ipack driver\n"); ret = -EFAULT; - goto out_err_bus_register; + goto err_tpci200_install; }
/* save the bus number given by ipack to logging purpose */ @@ -597,19 +594,16 @@ static int tpci200_pci_probe(struct pci_dev *pdev, tpci200_create_device(tpci200, i); return 0;
-out_err_bus_register: +err_tpci200_install: tpci200_uninstall(tpci200); - /* tpci200->info->cfg_regs is unmapped in tpci200_uninstall */ - tpci200->info->cfg_regs = NULL; -out_err_install: - if (tpci200->info->cfg_regs) - iounmap(tpci200->info->cfg_regs); -out_err_ioremap: +err_cfg_regs: + pci_iounmap(tpci200->info->pdev, tpci200->info->cfg_regs); +err_request_region: pci_release_region(pdev, TPCI200_CFG_MEM_BAR); -out_err_pci_request: - pci_dev_put(pdev); +err_tpci200_info: kfree(tpci200->info); -out_err_info: + pci_dev_put(pdev); +err_tpci200: kfree(tpci200); return ret; } @@ -619,6 +613,12 @@ static void __tpci200_pci_remove(struct tpci200_board *tpci200) ipack_bus_unregister(tpci200->info->ipack_bus); tpci200_uninstall(tpci200);
+ pci_iounmap(tpci200->info->pdev, tpci200->info->cfg_regs); + + pci_release_region(tpci200->info->pdev, TPCI200_CFG_MEM_BAR); + + pci_dev_put(tpci200->info->pdev); + kfree(tpci200->info); kfree(tpci200); }
From: Dongliang Mu mudongliangabcd@gmail.com
[ Upstream commit 50f05bd114a46a74726e432bf81079d3f13a55b7 ]
The error handling code in tpci200_register does not free interface_regs allocated by ioremap and the current version of error handling code is problematic.
Fix this by refactoring the error handling code and free interface_regs when necessary.
Fixes: 43986798fd50 ("ipack: add error handling for ioremap_nocache") Cc: stable@vger.kernel.org Reported-by: Dongliang Mu mudongliangabcd@gmail.com Signed-off-by: Dongliang Mu mudongliangabcd@gmail.com Link: https://lore.kernel.org/r/20210810100323.3938492-2-mudongliangabcd@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ipack/carriers/tpci200.c | 24 ++++++++++++++---------- 1 file changed, 14 insertions(+), 10 deletions(-)
diff --git a/drivers/ipack/carriers/tpci200.c b/drivers/ipack/carriers/tpci200.c index 2172d1efa71e..4c8da6af2516 100644 --- a/drivers/ipack/carriers/tpci200.c +++ b/drivers/ipack/carriers/tpci200.c @@ -259,7 +259,7 @@ static int tpci200_register(struct tpci200_board *tpci200) "(bn 0x%X, sn 0x%X) failed to allocate PCI resource for BAR 2 !", tpci200->info->pdev->bus->number, tpci200->info->pdev->devfn); - goto out_disable_pci; + goto err_disable_device; }
/* Request IO ID INT space (Bar 3) */ @@ -271,7 +271,7 @@ static int tpci200_register(struct tpci200_board *tpci200) "(bn 0x%X, sn 0x%X) failed to allocate PCI resource for BAR 3 !", tpci200->info->pdev->bus->number, tpci200->info->pdev->devfn); - goto out_release_ip_space; + goto err_ip_interface_bar; }
/* Request MEM8 space (Bar 5) */ @@ -282,7 +282,7 @@ static int tpci200_register(struct tpci200_board *tpci200) "(bn 0x%X, sn 0x%X) failed to allocate PCI resource for BAR 5!", tpci200->info->pdev->bus->number, tpci200->info->pdev->devfn); - goto out_release_ioid_int_space; + goto err_io_id_int_spaces_bar; }
/* Request MEM16 space (Bar 4) */ @@ -293,7 +293,7 @@ static int tpci200_register(struct tpci200_board *tpci200) "(bn 0x%X, sn 0x%X) failed to allocate PCI resource for BAR 4!", tpci200->info->pdev->bus->number, tpci200->info->pdev->devfn); - goto out_release_mem8_space; + goto err_mem8_space_bar; }
/* Map internal tpci200 driver user space */ @@ -307,7 +307,7 @@ static int tpci200_register(struct tpci200_board *tpci200) tpci200->info->pdev->bus->number, tpci200->info->pdev->devfn); res = -ENOMEM; - goto out_release_mem8_space; + goto err_mem16_space_bar; }
/* Initialize lock that protects interface_regs */ @@ -346,18 +346,22 @@ static int tpci200_register(struct tpci200_board *tpci200) "(bn 0x%X, sn 0x%X) unable to register IRQ !", tpci200->info->pdev->bus->number, tpci200->info->pdev->devfn); - goto out_release_ioid_int_space; + goto err_interface_regs; }
return 0;
-out_release_mem8_space: +err_interface_regs: + pci_iounmap(tpci200->info->pdev, tpci200->info->interface_regs); +err_mem16_space_bar: + pci_release_region(tpci200->info->pdev, TPCI200_MEM16_SPACE_BAR); +err_mem8_space_bar: pci_release_region(tpci200->info->pdev, TPCI200_MEM8_SPACE_BAR); -out_release_ioid_int_space: +err_io_id_int_spaces_bar: pci_release_region(tpci200->info->pdev, TPCI200_IO_ID_INT_SPACES_BAR); -out_release_ip_space: +err_ip_interface_bar: pci_release_region(tpci200->info->pdev, TPCI200_IP_INTERFACE_BAR); -out_disable_pci: +err_disable_device: pci_disable_device(tpci200->info->pdev); return res; }
From: NeilBrown neilb@suse.de
[ Upstream commit 3f79f6f6247c83f448c8026c3ee16d4636ef8d4f ]
Cross-rename lacks a check when that would prevent exchanging a directory and subvolume from different parent subvolume. This causes data inconsistencies and is caught before commit by tree-checker, turning the filesystem to read-only.
Calling the renameat2 with RENAME_EXCHANGE flags like
renameat2(AT_FDCWD, namesrc, AT_FDCWD, namedest, (1 << 1))
on two paths:
namesrc = dir1/subvol1/dir2 namedest = subvol2/subvol3
will cause key order problem with following write time tree-checker report:
[1194842.307890] BTRFS critical (device loop1): corrupt leaf: root=5 block=27574272 slot=10 ino=258, invalid previous key objectid, have 257 expect 258 [1194842.322221] BTRFS info (device loop1): leaf 27574272 gen 8 total ptrs 11 free space 15444 owner 5 [1194842.331562] BTRFS info (device loop1): refs 2 lock_owner 0 current 26561 [1194842.338772] item 0 key (256 1 0) itemoff 16123 itemsize 160 [1194842.338793] inode generation 3 size 16 mode 40755 [1194842.338801] item 1 key (256 12 256) itemoff 16111 itemsize 12 [1194842.338809] item 2 key (256 84 2248503653) itemoff 16077 itemsize 34 [1194842.338817] dir oid 258 type 2 [1194842.338823] item 3 key (256 84 2363071922) itemoff 16043 itemsize 34 [1194842.338830] dir oid 257 type 2 [1194842.338836] item 4 key (256 96 2) itemoff 16009 itemsize 34 [1194842.338843] item 5 key (256 96 3) itemoff 15975 itemsize 34 [1194842.338852] item 6 key (257 1 0) itemoff 15815 itemsize 160 [1194842.338863] inode generation 6 size 8 mode 40755 [1194842.338869] item 7 key (257 12 256) itemoff 15801 itemsize 14 [1194842.338876] item 8 key (257 84 2505409169) itemoff 15767 itemsize 34 [1194842.338883] dir oid 256 type 2 [1194842.338888] item 9 key (257 96 2) itemoff 15733 itemsize 34 [1194842.338895] item 10 key (258 12 256) itemoff 15719 itemsize 14 [1194842.339163] BTRFS error (device loop1): block=27574272 write time tree block corruption detected [1194842.339245] ------------[ cut here ]------------ [1194842.443422] WARNING: CPU: 6 PID: 26561 at fs/btrfs/disk-io.c:449 csum_one_extent_buffer+0xed/0x100 [btrfs] [1194842.511863] CPU: 6 PID: 26561 Comm: kworker/u17:2 Not tainted 5.14.0-rc3-git+ #793 [1194842.511870] Hardware name: empty empty/S3993, BIOS PAQEX0-3 02/24/2008 [1194842.511876] Workqueue: btrfs-worker-high btrfs_work_helper [btrfs] [1194842.511976] RIP: 0010:csum_one_extent_buffer+0xed/0x100 [btrfs] [1194842.512068] RSP: 0018:ffffa2c284d77da0 EFLAGS: 00010282 [1194842.512074] RAX: 0000000000000000 RBX: 0000000000001000 RCX: ffff928867bd9978 [1194842.512078] RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff928867bd9970 [1194842.512081] RBP: ffff92876b958000 R08: 0000000000000001 R09: 00000000000c0003 [1194842.512085] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 [1194842.512088] R13: ffff92875f989f98 R14: 0000000000000000 R15: 0000000000000000 [1194842.512092] FS: 0000000000000000(0000) GS:ffff928867a00000(0000) knlGS:0000000000000000 [1194842.512095] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [1194842.512099] CR2: 000055f5384da1f0 CR3: 0000000102fe4000 CR4: 00000000000006e0 [1194842.512103] Call Trace: [1194842.512128] ? run_one_async_free+0x10/0x10 [btrfs] [1194842.631729] btree_csum_one_bio+0x1ac/0x1d0 [btrfs] [1194842.631837] run_one_async_start+0x18/0x30 [btrfs] [1194842.631938] btrfs_work_helper+0xd5/0x1d0 [btrfs] [1194842.647482] process_one_work+0x262/0x5e0 [1194842.647520] worker_thread+0x4c/0x320 [1194842.655935] ? process_one_work+0x5e0/0x5e0 [1194842.655946] kthread+0x135/0x160 [1194842.655953] ? set_kthread_struct+0x40/0x40 [1194842.655965] ret_from_fork+0x1f/0x30 [1194842.672465] irq event stamp: 1729 [1194842.672469] hardirqs last enabled at (1735): [<ffffffffbd1104f5>] console_trylock_spinning+0x185/0x1a0 [1194842.672477] hardirqs last disabled at (1740): [<ffffffffbd1104cc>] console_trylock_spinning+0x15c/0x1a0 [1194842.672482] softirqs last enabled at (1666): [<ffffffffbdc002e1>] __do_softirq+0x2e1/0x50a [1194842.672491] softirqs last disabled at (1651): [<ffffffffbd08aab7>] __irq_exit_rcu+0xa7/0xd0
The corrupted data will not be written, and filesystem can be unmounted and mounted again (all changes since the last commit will be lost).
Add the missing check for new_ino so that all non-subvolumes must reside under the same parent subvolume. There's an exception allowing to exchange two subvolumes from any parents as the directory representing a subvolume is only a logical link and does not have any other structures related to the parent subvolume, unlike files, directories etc, that are always in the inode namespace of the parent subvolume.
Fixes: cdd1fedf8261 ("btrfs: add support for RENAME_EXCHANGE and RENAME_WHITEOUT") CC: stable@vger.kernel.org # 4.7+ Reviewed-by: Nikolay Borisov nborisov@suse.com Signed-off-by: NeilBrown neilb@suse.de Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/inode.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index d29f4cf125d2..6f02a3f77fa8 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -9556,8 +9556,14 @@ static int btrfs_rename_exchange(struct inode *old_dir, bool sync_log_dest = false; bool commit_transaction = false;
- /* we only allow rename subvolume link between subvolumes */ - if (old_ino != BTRFS_FIRST_FREE_OBJECTID && root != dest) + /* + * For non-subvolumes allow exchange only within one subvolume, in the + * same inode namespace. Two subvolumes (represented as directory) can + * be exchanged as they're a logical link and have a fixed inode number. + */ + if (root != dest && + (old_ino != BTRFS_FIRST_FREE_OBJECTID || + new_ino != BTRFS_FIRST_FREE_OBJECTID)) return -EXDEV;
btrfs_init_log_ctx(&ctx_root, old_inode);
From: Marcin Bachry hegel666@gmail.com
[ Upstream commit e0bff43220925b7e527f9d3bc9f5c624177c959e ]
The Renoir XHCI controller apparently doesn't resume reliably with the standard D3hot-to-D0 delay. Increase it to 20ms.
[Alex: I talked to the AMD USB hardware team and the AMD Windows team and they are not aware of any HW errata or specific issues. The HW works fine in Windows. I was told Windows uses a rather generous default delay of 100ms for PCI state transitions.]
Link: https://lore.kernel.org/r/20210722025858.220064-1-alexander.deucher@amd.com Signed-off-by: Marcin Bachry hegel666@gmail.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org Cc: Mario Limonciello mario.limonciello@amd.com Cc: Prike Liang prike.liang@amd.com Cc: Shyam Sundar S K shyam-sundar.s-k@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/quirks.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index f287a9f919da..7e873b6b7d55 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -1882,6 +1882,7 @@ static void quirk_ryzen_xhci_d3hot(struct pci_dev *dev) } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x15e0, quirk_ryzen_xhci_d3hot); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x15e1, quirk_ryzen_xhci_d3hot); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x1639, quirk_ryzen_xhci_d3hot);
#ifdef CONFIG_X86_IO_APIC static int dmi_disable_ioapicreroute(const struct dmi_system_id *d)
From: Takashi Iwai tiwai@suse.de
[ Upstream commit 65ca89c2b12cca0d473f3dd54267568ad3af55cc ]
The commit 2e6b836312a4 ("ASoC: intel: atom: Fix reference to PCM buffer address") changed the reference of PCM buffer address to substream->runtime->dma_addr as the buffer address may change dynamically. However, I forgot that the dma_addr field is still not set up for the CONTINUOUS buffer type (that this driver uses) yet in 5.14 and earlier kernels, and it resulted in garbage I/O. The problem will be fixed in 5.15, but we need to address it quickly for now.
The fix is to deduce the address again from the DMA pointer with virt_to_phys(), but from the right one, substream->runtime->dma_area.
Fixes: 2e6b836312a4 ("ASoC: intel: atom: Fix reference to PCM buffer address") Reported-and-tested-by: Hans de Goede hdegoede@redhat.com Cc: stable@vger.kernel.org Acked-by: Mark Brown broonie@kernel.org Link: https://lore.kernel.org/r/2048c6aa-2187-46bd-6772-36a4fb3c5aeb@redhat.com Link: https://lore.kernel.org/r/20210819152945.8510-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/intel/atom/sst-mfld-platform-pcm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/soc/intel/atom/sst-mfld-platform-pcm.c b/sound/soc/intel/atom/sst-mfld-platform-pcm.c index 501ac836777a..682ee41ec75c 100644 --- a/sound/soc/intel/atom/sst-mfld-platform-pcm.c +++ b/sound/soc/intel/atom/sst-mfld-platform-pcm.c @@ -135,7 +135,7 @@ static void sst_fill_alloc_params(struct snd_pcm_substream *substream, snd_pcm_uframes_t period_size; ssize_t periodbytes; ssize_t buffer_bytes = snd_pcm_lib_buffer_bytes(substream); - u32 buffer_addr = substream->runtime->dma_addr; + u32 buffer_addr = virt_to_phys(substream->runtime->dma_area);
channels = substream->runtime->channels; period_size = substream->runtime->period_size;
From: Jeff Layton jlayton@kernel.org
[ Upstream commit df2474a22c42ce419b67067c52d71da06c385501 ]
Since 9e8925b67a ("locks: Allow disabling mandatory locking at compile time"), attempts to mount filesystems with "-o mand" will fail. Unfortunately, there is no other indiciation of the reason for the failure.
Change how the function is defined for better readability. When CONFIG_MANDATORY_FILE_LOCKING is disabled, printk a warning when someone attempts to mount with -o mand.
Also, add a blurb to the mandatory-locking.txt file to explain about the "mand" option, and the behavior one should expect when it is disabled.
Reported-by: Jan Kara jack@suse.cz Reviewed-by: Jan Kara jack@suse.cz Signed-off-by: Jeff Layton jlayton@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/filesystems/mandatory-locking.txt | 10 ++++++++++ fs/namespace.c | 11 ++++++++--- 2 files changed, 18 insertions(+), 3 deletions(-)
diff --git a/Documentation/filesystems/mandatory-locking.txt b/Documentation/filesystems/mandatory-locking.txt index 0979d1d2ca8b..a251ca33164a 100644 --- a/Documentation/filesystems/mandatory-locking.txt +++ b/Documentation/filesystems/mandatory-locking.txt @@ -169,3 +169,13 @@ havoc if they lock crucial files. The way around it is to change the file permissions (remove the setgid bit) before trying to read or write to it. Of course, that might be a bit tricky if the system is hung :-(
+7. The "mand" mount option +-------------------------- +Mandatory locking is disabled on all filesystems by default, and must be +administratively enabled by mounting with "-o mand". That mount option +is only allowed if the mounting task has the CAP_SYS_ADMIN capability. + +Since kernel v4.5, it is possible to disable mandatory locking +altogether by setting CONFIG_MANDATORY_FILE_LOCKING to "n". A kernel +with this disabled will reject attempts to mount filesystems with the +"mand" mount option with the error status EPERM. diff --git a/fs/namespace.c b/fs/namespace.c index edd397fa2991..8d2bf350e7c6 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -1610,13 +1610,18 @@ static inline bool may_mount(void) return ns_capable(current->nsproxy->mnt_ns->user_ns, CAP_SYS_ADMIN); }
+#ifdef CONFIG_MANDATORY_FILE_LOCKING static inline bool may_mandlock(void) { -#ifndef CONFIG_MANDATORY_FILE_LOCKING - return false; -#endif return capable(CAP_SYS_ADMIN); } +#else +static inline bool may_mandlock(void) +{ + pr_warn("VFS: "mand" mount option not supported"); + return false; +} +#endif
/* * Now umount can handle mount points as well as block devices.
From: Jeff Layton jlayton@kernel.org
[ Upstream commit fdd92b64d15bc4aec973caa25899afd782402e68 ]
We've had CONFIG_MANDATORY_FILE_LOCKING since 2015 and a lot of distros have disabled it. Warn the stragglers that still use "-o mand" that we'll be dropping support for that mount option.
Cc: stable@vger.kernel.org Signed-off-by: Jeff Layton jlayton@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/namespace.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/fs/namespace.c b/fs/namespace.c index 8d2bf350e7c6..2f3c6a0350a8 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -1611,8 +1611,12 @@ static inline bool may_mount(void) }
#ifdef CONFIG_MANDATORY_FILE_LOCKING -static inline bool may_mandlock(void) +static bool may_mandlock(void) { + pr_warn_once("======================================================\n" + "WARNING: the mand mount option is being deprecated and\n" + " will be removed in v5.15!\n" + "======================================================\n"); return capable(CAP_SYS_ADMIN); } #else
From: Sergey Marinkevich sergey.marinkevich@eltex-co.ru
[ Upstream commit 2e34328b396a69b73661ba38d47d92b7cf21c2c4 ]
I got a problem on MIPS with Big-Endian is turned on: every time when NF trying to change TCP MSS it returns because of new.v16 was greater than old.v16. But real MSS was 1460 and my rule was like this:
add rule table chain tcp option maxseg size set 1400
And 1400 is lesser that 1460, not greater.
Later I founded that main causer is cast from u32 to __be16.
Debugging:
In example MSS = 1400(HEX: 0x578). Here is representation of each byte like it is in memory by addresses from left to right(e.g. [0x0 0x1 0x2 0x3]). LE — Little-Endian system, BE — Big-Endian, left column is type.
LE BE u32: [78 05 00 00] [00 00 05 78]
As you can see, u32 representation will be casted to u16 from different half of 4-byte address range. But actually nf_tables uses registers and store data of various size. Actually TCP MSS stored in 2 bytes. But registers are still u32 in definition:
struct nft_regs { union { u32 data[20]; struct nft_verdict verdict; }; };
So, access like regs->data[priv->sreg] exactly u32. So, according to table presents above, per-byte representation of stored TCP MSS in register will be:
LE BE (u32)regs->data[]: [78 05 00 00] [05 78 00 00] ^^ ^^
We see that register uses just half of u32 and other 2 bytes may be used for some another data. But in nft_exthdr_tcp_set_eval() it casted just like u32 -> __be16:
new.v16 = src
But u32 overfill __be16, so it get 2 low bytes. For clarity draw one more table(<xx xx> means that bytes will be used for cast).
LE BE u32: [<78 05> 00 00] [00 00 <05 78>] (u32)regs->data[]: [<78 05> 00 00] [05 78 <00 00>]
As you can see, for Little-Endian nothing changes, but for Big-endian we take the wrong half. In my case there is some other data instead of zeros, so new MSS was wrongly greater.
For shooting this bug I used solution for ports ranges. Applying of this patch does not affect Little-Endian systems.
Signed-off-by: Sergey Marinkevich sergey.marinkevich@eltex-co.ru Acked-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nft_exthdr.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/net/netfilter/nft_exthdr.c b/net/netfilter/nft_exthdr.c index 64e69d6683ca..93fee4106019 100644 --- a/net/netfilter/nft_exthdr.c +++ b/net/netfilter/nft_exthdr.c @@ -137,7 +137,6 @@ static void nft_exthdr_tcp_set_eval(const struct nft_expr *expr, unsigned int i, optl, tcphdr_len, offset; struct tcphdr *tcph; u8 *opt; - u32 src;
tcph = nft_tcp_header_pointer(pkt, sizeof(buff), buff, &tcphdr_len); if (!tcph) @@ -146,7 +145,6 @@ static void nft_exthdr_tcp_set_eval(const struct nft_expr *expr, opt = (u8 *)tcph; for (i = sizeof(*tcph); i < tcphdr_len - 1; i += optl) { union { - u8 octet; __be16 v16; __be32 v32; } old, new; @@ -167,13 +165,13 @@ static void nft_exthdr_tcp_set_eval(const struct nft_expr *expr, if (!tcph) return;
- src = regs->data[priv->sreg]; offset = i + priv->offset;
switch (priv->len) { case 2: old.v16 = get_unaligned((u16 *)(opt + offset)); - new.v16 = src; + new.v16 = (__force __be16)nft_reg_load16( + ®s->data[priv->sreg]);
switch (priv->type) { case TCPOPT_MSS: @@ -191,7 +189,7 @@ static void nft_exthdr_tcp_set_eval(const struct nft_expr *expr, old.v16, new.v16, false); break; case 4: - new.v32 = src; + new.v32 = regs->data[priv->sreg]; old.v32 = get_unaligned((u32 *)(opt + offset));
if (old.v32 == new.v32)
Signed-off-by: Sasha Levin sashal@kernel.org --- Makefile | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/Makefile b/Makefile index d4ffcafb8efa..c0fd3cd96338 100644 --- a/Makefile +++ b/Makefile @@ -1,8 +1,8 @@ # SPDX-License-Identifier: GPL-2.0 VERSION = 4 PATCHLEVEL = 19 -SUBLEVEL = 204 -EXTRAVERSION = +SUBLEVEL = 205 +EXTRAVERSION = -rc1 NAME = "People's Front"
# *DOCUMENTATION*
Hi!
This is the start of the stable review cycle for the 4.19.205 release. There are 84 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
CIP testing did not find any problems here:
https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/tree/linux-4...
Tested-by: Pavel Machek (CIP) pavel@denx.de
Best regards, Pavel
Hi Sasha,
On Tue, Aug 24, 2021 at 01:01:26PM -0400, Sasha Levin wrote:
This is the start of the stable review cycle for the 4.19.205 release. There are 84 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu 26 Aug 2021 05:02:47 PM UTC. Anything received after that time might be too late.
Build test: mips (gcc version 11.1.1 20210816): 63 configs -> no failure arm (gcc version 11.1.1 20210816): 116 configs -> no new failure arm64 (gcc version 11.1.1 20210816): 2 configs -> no failure x86_64 (gcc version 10.2.1 20210110): 4 configs -> no failure
Boot test: x86_64: Booted on my test laptop. No regression. x86_64: Booted on qemu. No regression. [1]
[1]. https://openqa.qa.codethink.co.uk/tests/53
Tested-by: Sudip Mukherjee sudip.mukherjee@codethink.co.uk
-- Regards Sudip
On Tue, Aug 24, 2021 at 01:01:26PM -0400, Sasha Levin wrote:
This is the start of the stable review cycle for the 4.19.205 release. There are 84 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu 26 Aug 2021 05:02:47 PM UTC. Anything received after that time might be too late.
Build results: total: 155 pass: 155 fail: 0 Qemu test results: total: 438 pass: 438 fail: 0
Tested-by: Guenter Roeck linux@roeck-us.net
Guenter
Hello!
On 8/24/21 12:01 PM, Sasha Levin wrote:
This is the start of the stable review cycle for the 4.19.205 release. There are 84 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu 26 Aug 2021 05:02:47 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/p... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y and the diffstat can be found below.
Thanks, Sasha
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
## Build * kernel: 4.19.205-rc1 * git: ['https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git', 'https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc'] * git branch: linux-4.19.y * git commit: c1eea862e3bb2aec599f5b1b2aaaa1ee48e709b8 * git describe: v4.19.204-84-gc1eea862e3bb * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-4.19.y/build/v4.19....
## No regressions (compared to v4.19.204)
## No fixes (compared to v4.19.204)
## Test result summary total: 69147, pass: 56533, fail: 531, skip: 10703, xfail: 1380
## Build Summary * arm: 98 total, 98 passed, 0 failed * arm64: 30 total, 30 passed, 0 failed * dragonboard-410c: 1 total, 1 passed, 0 failed * hi6220-hikey: 1 total, 1 passed, 0 failed * i386: 15 total, 15 passed, 0 failed * juno-r2: 1 total, 1 passed, 0 failed * mips: 40 total, 40 passed, 0 failed * s390: 9 total, 9 passed, 0 failed * sparc: 9 total, 9 passed, 0 failed * x15: 1 total, 1 passed, 0 failed * x86: 1 total, 1 passed, 0 failed * x86_64: 17 total, 17 passed, 0 failed
## Test suites summary * fwts * igt-gpu-tools * install-android-platform-tools-r2600 * kselftest-android * kselftest-arm64 * kselftest-bpf * kselftest-breakpoints * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-drivers * kselftest-efivarfs * kselftest-filesystems * kselftest-firmware * kselftest-fpu * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-ir * kselftest-kcmp * kselftest-kexec * kselftest-kvm * kselftest-lib * kselftest-livepatch * kselftest-membarrier * kselftest-memfd * kselftest-memory-hotplug * kselftest-mincore * kselftest-mount * kselftest-mqueue * kselftest-net * kselftest-netfilter * kselftest-nsfs * kselftest-openat2 * kselftest-pid_namespace * kselftest-pidfd * kselftest-proc * kselftest-pstore * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-seccomp * kselftest-sigaltstack * kselftest-size * kselftest-splice * kselftest-static_keys * kselftest-sync * kselftest-sysctl * kselftest-timens * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user * kselftest-vm * kselftest-x86 * kselftest-zram * kvm-unit-tests * libhugetlbfs * linux-log-parser * ltp-cap_bounds-tests * ltp-commands-tests * ltp-containers-tests * ltp-controllers-tests * ltp-cpuhotplug-tests * ltp-crypto-tests * ltp-cve-tests * ltp-dio-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-open-posix-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-tracing-tests * network-basic-tests * packetdrill * perf * rcutorture * ssuite * v4l2-compliance
Greetings!
Daniel Díaz daniel.diaz@linaro.org
On 8/24/21 11:01 AM, Sasha Levin wrote:
This is the start of the stable review cycle for the 4.19.205 release. There are 84 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu 26 Aug 2021 05:02:47 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/p... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y and the diffstat can be found below.
Thanks, Sasha
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
On 2021/8/25 1:01, Sasha Levin wrote:
This is the start of the stable review cycle for the 4.19.205 release. There are 84 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu 26 Aug 2021 05:02:47 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/p... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y and the diffstat can be found below.
Thanks, Sasha
Tested on arm64 and x86 for 4.19.205-rc1,
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git Branch: linux-4.19.y Version: 4.19.205-rc1 Commit: c1eea862e3bb2aec599f5b1b2aaaa1ee48e709b8 Compiler: gcc version 7.3.0 (GCC)
arm64: -------------------------------------------------------------------- Testcase Result Summary: total: 8859 passed: 8859 failed: 0 timeout: 0 --------------------------------------------------------------------
x86: -------------------------------------------------------------------- Testcase Result Summary: total: 8859 passed: 8859 failed: 0 timeout: 0 --------------------------------------------------------------------
Tested-by: Hulk Robot hulkrobot@huawei.com
linux-stable-mirror@lists.linaro.org