This is the start of the stable review cycle for the 5.4.26 release. There are 123 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 19 Mar 2020 10:31:16 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.26-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.4.26-rc1
Karsten Graul kgraul@linux.ibm.com net/smc: cancel event worker during device removal
Karsten Graul kgraul@linux.ibm.com net/smc: check for valid ib_client_data
Eric Dumazet edumazet@google.com ipv6: restrict IPV6_ADDRFORM operation
Suravee Suthikulpanit suravee.suthikulpanit@amd.com iommu/amd: Fix IOMMU AVIC not properly update the is_run bit in IRTE
Wolfram Sang wsa+renesas@sang-engineering.com i2c: acpi: put device when verifying client fails
Daniel Drake drake@endlessm.com iommu/vt-d: Ignore devices with out-of-spec domain number
Zhenzhong Duan zhenzhong.duan@gmail.com iommu/vt-d: Fix the wrong printing in RHSA parsing
Pablo Neira Ayuso pablo@netfilter.org netfilter: nft_chain_nat: inet family is missing module ownership
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: dump NFTA_CHAIN_FLAGS attribute
Jakub Kicinski kuba@kernel.org netfilter: nft_tunnel: add missing attribute validation for tunnels
Jakub Kicinski kuba@kernel.org netfilter: nft_payload: add missing attribute validation for payload csum flags
Jakub Kicinski kuba@kernel.org netfilter: cthelper: add missing attribute validation for cthelper
Tommi Rantala tommi.t.rantala@nokia.com perf bench futex-wake: Restore thread count default to online CPU count
Jakub Kicinski kuba@kernel.org nl80211: add missing attribute validation for channel switch
Jakub Kicinski kuba@kernel.org nl80211: add missing attribute validation for beacon report scanning
Jakub Kicinski kuba@kernel.org nl80211: add missing attribute validation for critical protocol indication
Hamish Martin hamish.martin@alliedtelesis.co.nz i2c: gpio: suppress error on probe defer
Qian Cai cai@lca.pw iommu/vt-d: Fix RCU-list bugs in intel_iommu_init()
Christoph Hellwig hch@lst.de driver code: clarify and fix platform device DMA mask allocation
Zhenyu Wang zhenyuw@linux.intel.com drm/i915/gvt: Fix unnecessary schedule timer when no vGPU exits
Charles Keepax ckeepax@opensource.cirrus.com pinctrl: core: Remove extra kref_get which blocks hogs being freed
Tina Zhang tina.zhang@intel.com drm/i915/gvt: Fix dma-buf display blur issue on CFL
Suman Anna s-anna@ti.com virtio_ring: Fix mem leak with vring_new_virtqueue()
Leonard Crestez leonard.crestez@nxp.com pinctrl: imx: scu: Align imx sc msg structs to 4
Nicolas Belin nbelin@baylibre.com pinctrl: meson-gxl: fix GPIOX sdio pins
Anson Huang Anson.Huang@nxp.com clk: imx8mn: Fix incorrect clock defines
Sven Eckelmann sven@narfation.org batman-adv: Don't schedule OGM for disabled interface
Yonghyun Hwang yonghyun@google.com iommu/vt-d: Fix a bug in intel_iommu_iova_to_phys() for huge page
Amol Grover frextrite@gmail.com iommu/vt-d: Fix RCU list debugging warnings
Hans de Goede hdegoede@redhat.com iommu/vt-d: dmar: replace WARN_TAINT with pr_warn + add_taint
Marc Zyngier maz@kernel.org iommu/dma: Fix MSI reservation allocation
Tony Luck tony.luck@intel.com x86/mce: Fix logic and comments around MSR_PPIN_CTL
Kim Phillips kim.phillips@amd.com perf/amd/uncore: Replace manual sampling check with CAP_NO_INTERRUPT flag
Felix Fietkau nbd@nbd.name mt76: fix array overflow on receiving too many fragments for a packet
Jarkko Nikula jarkko.nikula@linux.intel.com i2c: designware-pci: Fix BUG_ON during device removal
Vladis Dronov vdronov@redhat.com efi: Add a sanity check to efivar_store_raw()
Vladis Dronov vdronov@redhat.com efi: Fix a race and a buffer overflow while reading efivars via sysfs
Tom Lendacky thomas.lendacky@amd.com x86/ioremap: Map EFI runtime services data as encrypted for SEV
Wolfram Sang wsa@the-dreams.de macintosh: windfarm: fix MODINFO regression
Eric Biggers ebiggers@google.com fscrypt: don't evict dirty inodes after removing key
Tejun Heo tj@kernel.org blk-iocost: fix incorrect vtime comparison in iocg_is_idle()
Takashi Iwai tiwai@suse.de ipmi_si: Avoid spurious errors for optional IRQs
Stefan Haberland sth@linux.ibm.com s390/dasd: fix data corruption for thin provisioned devices
Miklos Szeredi mszeredi@redhat.com fuse: fix stack use after return
Eugeniy Paltsev Eugeniy.Paltsev@synopsys.com ARC: define __ALIGN_STR and __ALIGN symbols for ARC
Vitaly Kuznetsov vkuznets@redhat.com KVM: nVMX: avoid NULL pointer dereference with incorrect EVMCS GPAs
Vitaly Kuznetsov vkuznets@redhat.com KVM: x86: clear stale x86_emulate_ctxt->intercept value
Al Viro viro@zeniv.linux.org.uk gfs2_atomic_open(): fix O_EXCL|O_CREAT handling on cold dcache
Al Viro viro@zeniv.linux.org.uk cifs_atomic_open(): fix double-put on late allocation failure
Steven Rostedt (VMware) rostedt@goodmis.org ktest: Add timeout for ssh sync testing
Mathias Kresin dev@kresin.me pinctrl: falcon: fix syntax error
Ben Chuang ben.chuang@genesyslogic.com.tw mmc: sdhci-pci-gli: Enable MSI interrupt for GL975x
Chris Wilson chris@chris-wilson.co.uk drm/i915: Defer semaphore priority bumping to a workqueue
Matthew Auld matthew.auld@intel.com drm/i915: be more solid in checking the alignment
Colin Ian King colin.king@canonical.com drm/amd/display: remove duplicated assignment to grph_obj_type
Hillf Danton hdanton@sina.com workqueue: don't use wq_select_unbound_cpu() for bound works
Vasily Averin vvs@virtuozzo.com netfilter: x_tables: xt_mttg_seq_next should increase position index
Vasily Averin vvs@virtuozzo.com netfilter: xt_recent: recent_seq_next should increase position index
Vasily Averin vvs@virtuozzo.com netfilter: synproxy: synproxy_cpu_seq_next should increase position index
Vasily Averin vvs@virtuozzo.com netfilter: nf_conntrack: ct_cpu_seq_next should increase position index
Hans de Goede hdegoede@redhat.com iommu/vt-d: quirk_ioat_snb_local_iommu: replace WARN_TAINT with pr_warn + add_taint
Halil Pasic pasic@linux.ibm.com virtio-blk: fix hw_queue stopped on arbitrary error
Dan Moulding dmoulding@me.com iwlwifi: mvm: Do not require PHY_SKU NVM section for 3168 devices
Florian Westphal fw@strlen.de netfilter: nf_tables: fix infinite loop when expr is not available
Michal Koutný mkoutny@suse.com cgroup: Iterate tasks that did not finish do_exit()
Vasily Averin vvs@virtuozzo.com cgroup: cgroup_procs_next should increase position index
Florian Fainelli f.fainelli@gmail.com net: phy: Avoid multiple suspends
Andrew Lunn andrew@lunn.ch net: dsa: Don't instantiate phylink for CPU/DSA ports unless needed
Hangbin Liu liuhangbin@gmail.com selftests/net/fib_tests: update addr_metric_test for peer route testing
Hangbin Liu liuhangbin@gmail.com net/ipv6: remove the old peer route if change it to a new one
Hangbin Liu liuhangbin@gmail.com net/ipv6: need update peer route when modify metric
Heiner Kallweit hkallweit1@gmail.com net: phy: fix MDIO bus PM PHY resuming
Heiner Kallweit hkallweit1@gmail.com net: phy: avoid clearing PHY interrupts twice in irq handler
Jakub Kicinski kuba@kernel.org nfc: add missing attribute validation for vendor subcommand
Jakub Kicinski kuba@kernel.org nfc: add missing attribute validation for deactivate target
Jakub Kicinski kuba@kernel.org nfc: add missing attribute validation for SE API
Jakub Kicinski kuba@kernel.org tipc: add missing attribute validation for MTU property
Jakub Kicinski kuba@kernel.org team: add missing attribute validation for array index
Jakub Kicinski kuba@kernel.org team: add missing attribute validation for port ifindex
Jakub Kicinski kuba@kernel.org net: taprio: add missing attribute validation for txtime delay
Jakub Kicinski kuba@kernel.org net: fq: add missing attribute validation for orphan mask
Jakub Kicinski kuba@kernel.org macsec: add missing attribute validation for port
Jakub Kicinski kuba@kernel.org can: add missing attribute validation for termination
Jakub Kicinski kuba@kernel.org nl802154: add missing attribute validation for dev_type
Jakub Kicinski kuba@kernel.org nl802154: add missing attribute validation
Jakub Kicinski kuba@kernel.org fib: add missing attribute validation for tun_id
Jakub Kicinski kuba@kernel.org devlink: validate length of region addr/len
Jakub Kicinski kuba@kernel.org devlink: validate length of param values
Eric Dumazet edumazet@google.com net: memcg: fix lockdep splat in inet_csk_accept()
Shakeel Butt shakeelb@google.com net: memcg: late association of sock to memcg
Shakeel Butt shakeelb@google.com cgroup: memcg: net: do not associate sock with unrelated cgroup
Edwin Peer edwin.peer@broadcom.com bnxt_en: fix error handling when flashing from file
Vasundhara Volam vasundhara-v.volam@broadcom.com bnxt_en: reinitialize IRQs when MTU is modified
Eric Dumazet edumazet@google.com bonding/alb: make sure arp header is pulled before accessing it
Vinicius Costa Gomes vinicius.gomes@intel.com taprio: Fix sending packets without dequeueing them
Eric Dumazet edumazet@google.com slip: make slhc_compress() more robust against malicious packets
Edward Cree ecree@solarflare.com sfc: detach from cb_page in efx_copy_channel()
You-Sheng Yang vicamo.yang@canonical.com r8152: check disconnect status after long sleep
Colin Ian King colin.king@canonical.com net: systemport: fix index check to avoid an array out of bounds access
Remi Pommarel repk@triplefau.lt net: stmmac: dwmac1000: Disable ACS if enhanced descs are not used
Jonas Gorski jonas.gorski@gmail.com net: phy: bcm63xx: fix OOPS due to missing driver name
Willem de Bruijn willemb@google.com net/packet: tpacket_rcv: do not increment ring index on drop
Dan Carpenter dan.carpenter@oracle.com net: nfc: fix bounds checking bugs on "pipe"
Dmitry Bogdanov dbogdanov@marvell.com net: macsec: update SCI upon MAC address change.
Pablo Neira Ayuso pablo@netfilter.org netlink: Use netlink header as base to calculate bad attribute offset
Hangbin Liu liuhangbin@gmail.com net/ipv6: use configured metric when add peer route
Jian Shen shenjian15@huawei.com net: hns3: fix a not link up issue when fibre port supports autoneg
Jakub Kicinski kuba@kernel.org net: fec: validate the new settings in fec_enet_set_coalesce()
Russell King rmk+kernel@armlinux.org.uk net: dsa: mv88e6xxx: fix lockup on warm boot
Russell King rmk+kernel@armlinux.org.uk net: dsa: fix phylink_start()/phylink_stop() calls
Mahesh Bandewar maheshb@google.com macvlan: add cond_resched() during multicast processing
Mahesh Bandewar maheshb@google.com ipvlan: don't deref eth hdr before checking it's set
Eric Dumazet edumazet@google.com ipvlan: do not use cond_resched_rcu() in ipvlan_process_multicast()
Jiri Wiesner jwiesner@suse.com ipvlan: do not add hardware address of master to its unicast filter list
Mahesh Bandewar maheshb@google.com ipvlan: add cond_resched_rcu() while processing muticast backlog
Hangbin Liu liuhangbin@gmail.com ipv6/addrconf: call ipv6_mc_up() for non-Ethernet interface
Dmitry Yakunin zeil@yandex-team.ru inet_diag: return classid for all socket types
Eric Dumazet edumazet@google.com gre: fix uninit-value in __iptunnel_pull_header
Dmitry Yakunin zeil@yandex-team.ru cgroup, netclassid: periodically release file_lock on classid updating
Kailang Yang kailang@realtek.com ALSA: hda/realtek - Fixed one of HP ALC671 platform Headset Mic supported
Kailang Yang kailang@realtek.com ALSA: hda/realtek - Add Headset Mic supported for HP cPC
Takashi Iwai tiwai@suse.de ALSA: hda/realtek - More constifications
Nathan Chancellor natechancellor@gmail.com virtio_balloon: Adjust label in virtballoon_probe
-------------
Diffstat:
Documentation/filesystems/porting.rst | 8 + Makefile | 4 +- arch/arc/include/asm/linkage.h | 2 + arch/x86/events/amd/uncore.c | 17 +-- arch/x86/kernel/cpu/mce/intel.c | 9 +- arch/x86/kvm/emulate.c | 1 + arch/x86/kvm/vmx/nested.c | 5 +- arch/x86/mm/ioremap.c | 18 +++ block/blk-iocost.c | 2 +- drivers/base/platform.c | 25 +--- drivers/block/virtio_blk.c | 8 +- drivers/char/ipmi/ipmi_si_platform.c | 4 +- drivers/firmware/efi/efivars.c | 32 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c | 3 +- drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 3 +- drivers/gpu/drm/i915/gvt/display.c | 3 +- drivers/gpu/drm/i915/gvt/vgpu.c | 12 +- drivers/gpu/drm/i915/i915_request.c | 22 ++- drivers/gpu/drm/i915/i915_request.h | 2 + drivers/gpu/drm/i915/i915_utils.h | 5 + drivers/i2c/busses/i2c-designware-pcidrv.c | 1 + drivers/i2c/busses/i2c-gpio.c | 2 +- drivers/i2c/i2c-core-acpi.c | 10 +- drivers/iommu/amd_iommu.c | 4 +- drivers/iommu/dma-iommu.c | 16 +- drivers/iommu/dmar.c | 21 ++- drivers/iommu/intel-iommu.c | 18 ++- drivers/macintosh/windfarm_ad7417_sensor.c | 7 + drivers/macintosh/windfarm_fcu_controls.c | 7 + drivers/macintosh/windfarm_lm75_sensor.c | 16 +- drivers/macintosh/windfarm_lm87_sensor.c | 7 + drivers/macintosh/windfarm_max6690_sensor.c | 7 + drivers/macintosh/windfarm_smu_sat.c | 7 + drivers/mmc/host/sdhci-pci-gli.c | 17 +++ drivers/net/bonding/bond_alb.c | 20 +-- drivers/net/can/dev.c | 1 + drivers/net/dsa/mv88e6xxx/global2.c | 8 +- drivers/net/ethernet/broadcom/bcmsysport.c | 2 +- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 4 +- drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 24 ++- drivers/net/ethernet/freescale/fec_main.c | 6 +- .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 4 +- drivers/net/ethernet/sfc/efx.c | 1 + .../net/ethernet/stmicro/stmmac/dwmac1000_core.c | 3 +- drivers/net/ipvlan/ipvlan_core.c | 19 ++- drivers/net/ipvlan/ipvlan_main.c | 5 +- drivers/net/macsec.c | 12 +- drivers/net/macvlan.c | 2 + drivers/net/phy/bcm63xx.c | 1 + drivers/net/phy/phy.c | 3 +- drivers/net/phy/phy_device.c | 11 +- drivers/net/slip/slhc.c | 14 +- drivers/net/team/team.c | 2 + drivers/net/usb/r8152.c | 8 + drivers/net/wireless/intel/iwlwifi/mvm/nvm.c | 3 +- drivers/net/wireless/mediatek/mt76/dma.c | 9 +- drivers/pinctrl/core.c | 1 - drivers/pinctrl/freescale/pinctrl-scu.c | 4 +- drivers/pinctrl/meson/pinctrl-meson-gxl.c | 4 +- drivers/pinctrl/pinctrl-falcon.c | 2 +- drivers/s390/block/dasd.c | 27 +++- drivers/s390/block/dasd_eckd.c | 163 ++++++++++++++++++++- drivers/s390/block/dasd_int.h | 15 +- drivers/virtio/virtio_balloon.c | 2 +- drivers/virtio/virtio_ring.c | 4 +- fs/cifs/dir.c | 1 - fs/crypto/keysetup.c | 9 ++ fs/fuse/dev.c | 6 +- fs/fuse/fuse_i.h | 2 + fs/gfs2/inode.c | 2 +- fs/open.c | 3 - include/dt-bindings/clock/imx8mn-clock.h | 4 +- include/linux/cgroup.h | 1 + include/linux/dmar.h | 8 +- include/linux/inet_diag.h | 18 ++- include/linux/phy.h | 3 + include/linux/platform_device.h | 2 +- include/net/fib_rules.h | 1 + kernel/cgroup/cgroup.c | 37 +++-- kernel/workqueue.c | 14 +- mm/memcontrol.c | 14 +- net/batman-adv/bat_iv_ogm.c | 4 + net/core/devlink.c | 33 +++-- net/core/netclassid_cgroup.c | 47 ++++-- net/core/sock.c | 5 +- net/dsa/dsa_priv.h | 2 + net/dsa/port.c | 44 ++++-- net/dsa/slave.c | 8 +- net/ieee802154/nl_policy.c | 6 + net/ipv4/gre_demux.c | 12 +- net/ipv4/inet_connection_sock.c | 20 +++ net/ipv4/inet_diag.c | 44 +++--- net/ipv4/raw_diag.c | 5 +- net/ipv4/udp_diag.c | 5 +- net/ipv6/addrconf.c | 51 +++++-- net/ipv6/ipv6_sockglue.c | 10 +- net/netfilter/nf_conntrack_standalone.c | 2 +- net/netfilter/nf_synproxy_core.c | 2 +- net/netfilter/nf_tables_api.c | 15 +- net/netfilter/nfnetlink_cthelper.c | 2 + net/netfilter/nft_chain_nat.c | 1 + net/netfilter/nft_payload.c | 1 + net/netfilter/nft_tunnel.c | 2 + net/netfilter/x_tables.c | 6 +- net/netfilter/xt_recent.c | 2 +- net/netlink/af_netlink.c | 2 +- net/nfc/hci/core.c | 19 ++- net/nfc/netlink.c | 4 + net/packet/af_packet.c | 13 +- net/sched/sch_fq.c | 1 + net/sched/sch_taprio.c | 13 +- net/sctp/diag.c | 8 +- net/smc/smc_ib.c | 3 + net/tipc/netlink.c | 1 + net/wireless/nl80211.c | 5 + sound/pci/hda/patch_realtek.c | 163 +++++++++++++-------- tools/perf/bench/futex-wake.c | 4 +- tools/testing/ktest/ktest.pl | 2 +- tools/testing/selftests/net/fib_tests.sh | 34 ++++- 119 files changed, 1038 insertions(+), 397 deletions(-)
From: Nathan Chancellor natechancellor@gmail.com
commit 6ae4edab2fbf86ec92fbf0a8f0c60b857d90d50f upstream.
Clang warns when CONFIG_BALLOON_COMPACTION is unset:
../drivers/virtio/virtio_balloon.c:963:1: warning: unused label 'out_del_vqs' [-Wunused-label] out_del_vqs: ^~~~~~~~~~~~ 1 warning generated.
Move the label within the preprocessor block since it is only used when CONFIG_BALLOON_COMPACTION is set.
Fixes: 1ad6f58ea936 ("virtio_balloon: Fix memory leaks on errors in virtballoon_probe()") Link: https://github.com/ClangBuiltLinux/linux/issues/886 Signed-off-by: Nathan Chancellor natechancellor@gmail.com Link: https://lore.kernel.org/r/20200216004039.23464-1-natechancellor@gmail.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Reviewed-by: David Hildenbrand david@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/virtio/virtio_balloon.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/virtio/virtio_balloon.c +++ b/drivers/virtio/virtio_balloon.c @@ -958,8 +958,8 @@ out_iput: iput(vb->vb_dev_info.inode); out_kern_unmount: kern_unmount(balloon_mnt); -#endif out_del_vqs: +#endif vdev->config->del_vqs(vdev); out_free_vb: kfree(vb);
From: Takashi Iwai tiwai@suse.de
commit 6b0f95c49d890440c01a759c767dfe40e2acdbf2 upstream.
Apply const prefix to each coef table array.
Just for minor optimization and no functional changes.
Link: https://lore.kernel.org/r/20200105144823.29547-4-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/pci/hda/patch_realtek.c | 118 +++++++++++++++++++++--------------------- 1 file changed, 59 insertions(+), 59 deletions(-)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -949,7 +949,7 @@ struct alc_codec_rename_pci_table { const char *name; };
-static struct alc_codec_rename_table rename_tbl[] = { +static const struct alc_codec_rename_table rename_tbl[] = { { 0x10ec0221, 0xf00f, 0x1003, "ALC231" }, { 0x10ec0269, 0xfff0, 0x3010, "ALC277" }, { 0x10ec0269, 0xf0f0, 0x2010, "ALC259" }, @@ -970,7 +970,7 @@ static struct alc_codec_rename_table ren { } /* terminator */ };
-static struct alc_codec_rename_pci_table rename_pci_tbl[] = { +static const struct alc_codec_rename_pci_table rename_pci_tbl[] = { { 0x10ec0280, 0x1028, 0, "ALC3220" }, { 0x10ec0282, 0x1028, 0, "ALC3221" }, { 0x10ec0283, 0x1028, 0, "ALC3223" }, @@ -3000,7 +3000,7 @@ static void alc269_shutup(struct hda_cod alc_shutup_pins(codec); }
-static struct coef_fw alc282_coefs[] = { +static const struct coef_fw alc282_coefs[] = { WRITE_COEF(0x03, 0x0002), /* Power Down Control */ UPDATE_COEF(0x05, 0xff3f, 0x0700), /* FIFO and filter clock */ WRITE_COEF(0x07, 0x0200), /* DMIC control */ @@ -3112,7 +3112,7 @@ static void alc282_shutup(struct hda_cod alc_write_coef_idx(codec, 0x78, coef78); }
-static struct coef_fw alc283_coefs[] = { +static const struct coef_fw alc283_coefs[] = { WRITE_COEF(0x03, 0x0002), /* Power Down Control */ UPDATE_COEF(0x05, 0xff3f, 0x0700), /* FIFO and filter clock */ WRITE_COEF(0x07, 0x0200), /* DMIC control */ @@ -4188,7 +4188,7 @@ static void alc269_fixup_hp_line1_mic1_l } }
-static struct coef_fw alc225_pre_hsmode[] = { +static const struct coef_fw alc225_pre_hsmode[] = { UPDATE_COEF(0x4a, 1<<8, 0), UPDATE_COEFEX(0x57, 0x05, 1<<14, 0), UPDATE_COEF(0x63, 3<<14, 3<<14), @@ -4201,7 +4201,7 @@ static struct coef_fw alc225_pre_hsmode[
static void alc_headset_mode_unplugged(struct hda_codec *codec) { - static struct coef_fw coef0255[] = { + static const struct coef_fw coef0255[] = { WRITE_COEF(0x1b, 0x0c0b), /* LDO and MISC control */ WRITE_COEF(0x45, 0xd089), /* UAJ function set to menual mode */ UPDATE_COEFEX(0x57, 0x05, 1<<14, 0), /* Direct Drive HP Amp control(Set to verb control)*/ @@ -4209,7 +4209,7 @@ static void alc_headset_mode_unplugged(s WRITE_COEFEX(0x57, 0x03, 0x8aa6), /* Direct Drive HP Amp control */ {} }; - static struct coef_fw coef0256[] = { + static const struct coef_fw coef0256[] = { WRITE_COEF(0x1b, 0x0c4b), /* LDO and MISC control */ WRITE_COEF(0x45, 0xd089), /* UAJ function set to menual mode */ WRITE_COEF(0x06, 0x6104), /* Set MIC2 Vref gate with HP */ @@ -4217,7 +4217,7 @@ static void alc_headset_mode_unplugged(s UPDATE_COEFEX(0x57, 0x05, 1<<14, 0), /* Direct Drive HP Amp control(Set to verb control)*/ {} }; - static struct coef_fw coef0233[] = { + static const struct coef_fw coef0233[] = { WRITE_COEF(0x1b, 0x0c0b), WRITE_COEF(0x45, 0xc429), UPDATE_COEF(0x35, 0x4000, 0), @@ -4227,7 +4227,7 @@ static void alc_headset_mode_unplugged(s WRITE_COEF(0x32, 0x42a3), {} }; - static struct coef_fw coef0288[] = { + static const struct coef_fw coef0288[] = { UPDATE_COEF(0x4f, 0xfcc0, 0xc400), UPDATE_COEF(0x50, 0x2000, 0x2000), UPDATE_COEF(0x56, 0x0006, 0x0006), @@ -4235,18 +4235,18 @@ static void alc_headset_mode_unplugged(s UPDATE_COEF(0x67, 0x2000, 0), {} }; - static struct coef_fw coef0298[] = { + static const struct coef_fw coef0298[] = { UPDATE_COEF(0x19, 0x1300, 0x0300), {} }; - static struct coef_fw coef0292[] = { + static const struct coef_fw coef0292[] = { WRITE_COEF(0x76, 0x000e), WRITE_COEF(0x6c, 0x2400), WRITE_COEF(0x18, 0x7308), WRITE_COEF(0x6b, 0xc429), {} }; - static struct coef_fw coef0293[] = { + static const struct coef_fw coef0293[] = { UPDATE_COEF(0x10, 7<<8, 6<<8), /* SET Line1 JD to 0 */ UPDATE_COEFEX(0x57, 0x05, 1<<15|1<<13, 0x0), /* SET charge pump by verb */ UPDATE_COEFEX(0x57, 0x03, 1<<10, 1<<10), /* SET EN_OSW to 1 */ @@ -4255,16 +4255,16 @@ static void alc_headset_mode_unplugged(s UPDATE_COEF(0x4a, 0x000f, 0x000e), /* Combo Jack auto detect */ {} }; - static struct coef_fw coef0668[] = { + static const struct coef_fw coef0668[] = { WRITE_COEF(0x15, 0x0d40), WRITE_COEF(0xb7, 0x802b), {} }; - static struct coef_fw coef0225[] = { + static const struct coef_fw coef0225[] = { UPDATE_COEF(0x63, 3<<14, 0), {} }; - static struct coef_fw coef0274[] = { + static const struct coef_fw coef0274[] = { UPDATE_COEF(0x4a, 0x0100, 0), UPDATE_COEFEX(0x57, 0x05, 0x4000, 0), UPDATE_COEF(0x6b, 0xf000, 0x5000), @@ -4329,25 +4329,25 @@ static void alc_headset_mode_unplugged(s static void alc_headset_mode_mic_in(struct hda_codec *codec, hda_nid_t hp_pin, hda_nid_t mic_pin) { - static struct coef_fw coef0255[] = { + static const struct coef_fw coef0255[] = { WRITE_COEFEX(0x57, 0x03, 0x8aa6), WRITE_COEF(0x06, 0x6100), /* Set MIC2 Vref gate to normal */ {} }; - static struct coef_fw coef0256[] = { + static const struct coef_fw coef0256[] = { UPDATE_COEFEX(0x57, 0x05, 1<<14, 1<<14), /* Direct Drive HP Amp control(Set to verb control)*/ WRITE_COEFEX(0x57, 0x03, 0x09a3), WRITE_COEF(0x06, 0x6100), /* Set MIC2 Vref gate to normal */ {} }; - static struct coef_fw coef0233[] = { + static const struct coef_fw coef0233[] = { UPDATE_COEF(0x35, 0, 1<<14), WRITE_COEF(0x06, 0x2100), WRITE_COEF(0x1a, 0x0021), WRITE_COEF(0x26, 0x008c), {} }; - static struct coef_fw coef0288[] = { + static const struct coef_fw coef0288[] = { UPDATE_COEF(0x4f, 0x00c0, 0), UPDATE_COEF(0x50, 0x2000, 0), UPDATE_COEF(0x56, 0x0006, 0), @@ -4356,30 +4356,30 @@ static void alc_headset_mode_mic_in(stru UPDATE_COEF(0x67, 0x2000, 0x2000), {} }; - static struct coef_fw coef0292[] = { + static const struct coef_fw coef0292[] = { WRITE_COEF(0x19, 0xa208), WRITE_COEF(0x2e, 0xacf0), {} }; - static struct coef_fw coef0293[] = { + static const struct coef_fw coef0293[] = { UPDATE_COEFEX(0x57, 0x05, 0, 1<<15|1<<13), /* SET charge pump by verb */ UPDATE_COEFEX(0x57, 0x03, 1<<10, 0), /* SET EN_OSW to 0 */ UPDATE_COEF(0x1a, 1<<3, 0), /* Combo JD gating without LINE1-VREFO */ {} }; - static struct coef_fw coef0688[] = { + static const struct coef_fw coef0688[] = { WRITE_COEF(0xb7, 0x802b), WRITE_COEF(0xb5, 0x1040), UPDATE_COEF(0xc3, 0, 1<<12), {} }; - static struct coef_fw coef0225[] = { + static const struct coef_fw coef0225[] = { UPDATE_COEFEX(0x57, 0x05, 1<<14, 1<<14), UPDATE_COEF(0x4a, 3<<4, 2<<4), UPDATE_COEF(0x63, 3<<14, 0), {} }; - static struct coef_fw coef0274[] = { + static const struct coef_fw coef0274[] = { UPDATE_COEFEX(0x57, 0x05, 0x4000, 0x4000), UPDATE_COEF(0x4a, 0x0010, 0), UPDATE_COEF(0x6b, 0xf000, 0), @@ -4465,7 +4465,7 @@ static void alc_headset_mode_mic_in(stru
static void alc_headset_mode_default(struct hda_codec *codec) { - static struct coef_fw coef0225[] = { + static const struct coef_fw coef0225[] = { UPDATE_COEF(0x45, 0x3f<<10, 0x30<<10), UPDATE_COEF(0x45, 0x3f<<10, 0x31<<10), UPDATE_COEF(0x49, 3<<8, 0<<8), @@ -4474,14 +4474,14 @@ static void alc_headset_mode_default(str UPDATE_COEF(0x67, 0xf000, 0x3000), {} }; - static struct coef_fw coef0255[] = { + static const struct coef_fw coef0255[] = { WRITE_COEF(0x45, 0xc089), WRITE_COEF(0x45, 0xc489), WRITE_COEFEX(0x57, 0x03, 0x8ea6), WRITE_COEF(0x49, 0x0049), {} }; - static struct coef_fw coef0256[] = { + static const struct coef_fw coef0256[] = { WRITE_COEF(0x45, 0xc489), WRITE_COEFEX(0x57, 0x03, 0x0da3), WRITE_COEF(0x49, 0x0049), @@ -4489,12 +4489,12 @@ static void alc_headset_mode_default(str WRITE_COEF(0x06, 0x6100), {} }; - static struct coef_fw coef0233[] = { + static const struct coef_fw coef0233[] = { WRITE_COEF(0x06, 0x2100), WRITE_COEF(0x32, 0x4ea3), {} }; - static struct coef_fw coef0288[] = { + static const struct coef_fw coef0288[] = { UPDATE_COEF(0x4f, 0xfcc0, 0xc400), /* Set to TRS type */ UPDATE_COEF(0x50, 0x2000, 0x2000), UPDATE_COEF(0x56, 0x0006, 0x0006), @@ -4502,26 +4502,26 @@ static void alc_headset_mode_default(str UPDATE_COEF(0x67, 0x2000, 0), {} }; - static struct coef_fw coef0292[] = { + static const struct coef_fw coef0292[] = { WRITE_COEF(0x76, 0x000e), WRITE_COEF(0x6c, 0x2400), WRITE_COEF(0x6b, 0xc429), WRITE_COEF(0x18, 0x7308), {} }; - static struct coef_fw coef0293[] = { + static const struct coef_fw coef0293[] = { UPDATE_COEF(0x4a, 0x000f, 0x000e), /* Combo Jack auto detect */ WRITE_COEF(0x45, 0xC429), /* Set to TRS type */ UPDATE_COEF(0x1a, 1<<3, 0), /* Combo JD gating without LINE1-VREFO */ {} }; - static struct coef_fw coef0688[] = { + static const struct coef_fw coef0688[] = { WRITE_COEF(0x11, 0x0041), WRITE_COEF(0x15, 0x0d40), WRITE_COEF(0xb7, 0x802b), {} }; - static struct coef_fw coef0274[] = { + static const struct coef_fw coef0274[] = { WRITE_COEF(0x45, 0x4289), UPDATE_COEF(0x4a, 0x0010, 0x0010), UPDATE_COEF(0x6b, 0x0f00, 0), @@ -4584,53 +4584,53 @@ static void alc_headset_mode_ctia(struct { int val;
- static struct coef_fw coef0255[] = { + static const struct coef_fw coef0255[] = { WRITE_COEF(0x45, 0xd489), /* Set to CTIA type */ WRITE_COEF(0x1b, 0x0c2b), WRITE_COEFEX(0x57, 0x03, 0x8ea6), {} }; - static struct coef_fw coef0256[] = { + static const struct coef_fw coef0256[] = { WRITE_COEF(0x45, 0xd489), /* Set to CTIA type */ WRITE_COEF(0x1b, 0x0e6b), {} }; - static struct coef_fw coef0233[] = { + static const struct coef_fw coef0233[] = { WRITE_COEF(0x45, 0xd429), WRITE_COEF(0x1b, 0x0c2b), WRITE_COEF(0x32, 0x4ea3), {} }; - static struct coef_fw coef0288[] = { + static const struct coef_fw coef0288[] = { UPDATE_COEF(0x50, 0x2000, 0x2000), UPDATE_COEF(0x56, 0x0006, 0x0006), UPDATE_COEF(0x66, 0x0008, 0), UPDATE_COEF(0x67, 0x2000, 0), {} }; - static struct coef_fw coef0292[] = { + static const struct coef_fw coef0292[] = { WRITE_COEF(0x6b, 0xd429), WRITE_COEF(0x76, 0x0008), WRITE_COEF(0x18, 0x7388), {} }; - static struct coef_fw coef0293[] = { + static const struct coef_fw coef0293[] = { WRITE_COEF(0x45, 0xd429), /* Set to ctia type */ UPDATE_COEF(0x10, 7<<8, 7<<8), /* SET Line1 JD to 1 */ {} }; - static struct coef_fw coef0688[] = { + static const struct coef_fw coef0688[] = { WRITE_COEF(0x11, 0x0001), WRITE_COEF(0x15, 0x0d60), WRITE_COEF(0xc3, 0x0000), {} }; - static struct coef_fw coef0225_1[] = { + static const struct coef_fw coef0225_1[] = { UPDATE_COEF(0x45, 0x3f<<10, 0x35<<10), UPDATE_COEF(0x63, 3<<14, 2<<14), {} }; - static struct coef_fw coef0225_2[] = { + static const struct coef_fw coef0225_2[] = { UPDATE_COEF(0x45, 0x3f<<10, 0x35<<10), UPDATE_COEF(0x63, 3<<14, 1<<14), {} @@ -4702,48 +4702,48 @@ static void alc_headset_mode_ctia(struct /* Nokia type */ static void alc_headset_mode_omtp(struct hda_codec *codec) { - static struct coef_fw coef0255[] = { + static const struct coef_fw coef0255[] = { WRITE_COEF(0x45, 0xe489), /* Set to OMTP Type */ WRITE_COEF(0x1b, 0x0c2b), WRITE_COEFEX(0x57, 0x03, 0x8ea6), {} }; - static struct coef_fw coef0256[] = { + static const struct coef_fw coef0256[] = { WRITE_COEF(0x45, 0xe489), /* Set to OMTP Type */ WRITE_COEF(0x1b, 0x0e6b), {} }; - static struct coef_fw coef0233[] = { + static const struct coef_fw coef0233[] = { WRITE_COEF(0x45, 0xe429), WRITE_COEF(0x1b, 0x0c2b), WRITE_COEF(0x32, 0x4ea3), {} }; - static struct coef_fw coef0288[] = { + static const struct coef_fw coef0288[] = { UPDATE_COEF(0x50, 0x2000, 0x2000), UPDATE_COEF(0x56, 0x0006, 0x0006), UPDATE_COEF(0x66, 0x0008, 0), UPDATE_COEF(0x67, 0x2000, 0), {} }; - static struct coef_fw coef0292[] = { + static const struct coef_fw coef0292[] = { WRITE_COEF(0x6b, 0xe429), WRITE_COEF(0x76, 0x0008), WRITE_COEF(0x18, 0x7388), {} }; - static struct coef_fw coef0293[] = { + static const struct coef_fw coef0293[] = { WRITE_COEF(0x45, 0xe429), /* Set to omtp type */ UPDATE_COEF(0x10, 7<<8, 7<<8), /* SET Line1 JD to 1 */ {} }; - static struct coef_fw coef0688[] = { + static const struct coef_fw coef0688[] = { WRITE_COEF(0x11, 0x0001), WRITE_COEF(0x15, 0x0d50), WRITE_COEF(0xc3, 0x0000), {} }; - static struct coef_fw coef0225[] = { + static const struct coef_fw coef0225[] = { UPDATE_COEF(0x45, 0x3f<<10, 0x39<<10), UPDATE_COEF(0x63, 3<<14, 2<<14), {} @@ -4803,17 +4803,17 @@ static void alc_determine_headset_type(s int val; bool is_ctia = false; struct alc_spec *spec = codec->spec; - static struct coef_fw coef0255[] = { + static const struct coef_fw coef0255[] = { WRITE_COEF(0x45, 0xd089), /* combo jack auto switch control(Check type)*/ WRITE_COEF(0x49, 0x0149), /* combo jack auto switch control(Vref conteol) */ {} }; - static struct coef_fw coef0288[] = { + static const struct coef_fw coef0288[] = { UPDATE_COEF(0x4f, 0xfcc0, 0xd400), /* Check Type */ {} }; - static struct coef_fw coef0298[] = { + static const struct coef_fw coef0298[] = { UPDATE_COEF(0x50, 0x2000, 0x2000), UPDATE_COEF(0x56, 0x0006, 0x0006), UPDATE_COEF(0x66, 0x0008, 0), @@ -4821,19 +4821,19 @@ static void alc_determine_headset_type(s UPDATE_COEF(0x19, 0x1300, 0x1300), {} }; - static struct coef_fw coef0293[] = { + static const struct coef_fw coef0293[] = { UPDATE_COEF(0x4a, 0x000f, 0x0008), /* Combo Jack auto detect */ WRITE_COEF(0x45, 0xD429), /* Set to ctia type */ {} }; - static struct coef_fw coef0688[] = { + static const struct coef_fw coef0688[] = { WRITE_COEF(0x11, 0x0001), WRITE_COEF(0xb7, 0x802b), WRITE_COEF(0x15, 0x0d60), WRITE_COEF(0xc3, 0x0c00), {} }; - static struct coef_fw coef0274[] = { + static const struct coef_fw coef0274[] = { UPDATE_COEF(0x4a, 0x0010, 0), UPDATE_COEF(0x4a, 0x8000, 0), WRITE_COEF(0x45, 0xd289), @@ -5120,7 +5120,7 @@ static void alc_fixup_headset_mode_no_hp static void alc255_set_default_jack_type(struct hda_codec *codec) { /* Set to iphone type */ - static struct coef_fw alc255fw[] = { + static const struct coef_fw alc255fw[] = { WRITE_COEF(0x1b, 0x880b), WRITE_COEF(0x45, 0xd089), WRITE_COEF(0x1b, 0x080b), @@ -5128,7 +5128,7 @@ static void alc255_set_default_jack_type WRITE_COEF(0x1b, 0x0c0b), {} }; - static struct coef_fw alc256fw[] = { + static const struct coef_fw alc256fw[] = { WRITE_COEF(0x1b, 0x884b), WRITE_COEF(0x45, 0xd089), WRITE_COEF(0x1b, 0x084b), @@ -8542,7 +8542,7 @@ static void alc662_fixup_aspire_ethos_hp } }
-static struct coef_fw alc668_coefs[] = { +static const struct coef_fw alc668_coefs[] = { WRITE_COEF(0x01, 0xbebe), WRITE_COEF(0x02, 0xaaaa), WRITE_COEF(0x03, 0x0), WRITE_COEF(0x04, 0x0180), WRITE_COEF(0x06, 0x0), WRITE_COEF(0x07, 0x0f80), WRITE_COEF(0x08, 0x0031), WRITE_COEF(0x0a, 0x0060), WRITE_COEF(0x0b, 0x0),
From: Kailang Yang kailang@realtek.com
commit 5af29028fd6db9438b5584ab7179710a0a22569d upstream.
HP ALC671 need to support Headset Mic.
Signed-off-by: Kailang Yang kailang@realtek.com Link: https://lore.kernel.org/r/06a9d2b176e14706976d6584cbe2d92a@realtek.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/pci/hda/patch_realtek.c | 44 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 44 insertions(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -8542,6 +8542,29 @@ static void alc662_fixup_aspire_ethos_hp } }
+static void alc671_fixup_hp_headset_mic2(struct hda_codec *codec, + const struct hda_fixup *fix, int action) +{ + struct alc_spec *spec = codec->spec; + + static const struct hda_pintbl pincfgs[] = { + { 0x19, 0x02a11040 }, /* use as headset mic, with its own jack detect */ + { 0x1b, 0x0181304f }, + { } + }; + + switch (action) { + case HDA_FIXUP_ACT_PRE_PROBE: + spec->gen.mixer_nid = 0; + spec->parse_flags |= HDA_PINCFG_HEADSET_MIC; + snd_hda_apply_pincfgs(codec, pincfgs); + break; + case HDA_FIXUP_ACT_INIT: + alc_write_coef_idx(codec, 0x19, 0xa054); + break; + } +} + static const struct coef_fw alc668_coefs[] = { WRITE_COEF(0x01, 0xbebe), WRITE_COEF(0x02, 0xaaaa), WRITE_COEF(0x03, 0x0), WRITE_COEF(0x04, 0x0180), WRITE_COEF(0x06, 0x0), WRITE_COEF(0x07, 0x0f80), @@ -8615,6 +8638,7 @@ enum { ALC662_FIXUP_LENOVO_MULTI_CODECS, ALC669_FIXUP_ACER_ASPIRE_ETHOS, ALC669_FIXUP_ACER_ASPIRE_ETHOS_HEADSET, + ALC671_FIXUP_HP_HEADSET_MIC2, };
static const struct hda_fixup alc662_fixups[] = { @@ -8956,6 +8980,10 @@ static const struct hda_fixup alc662_fix .chained = true, .chain_id = ALC669_FIXUP_ACER_ASPIRE_ETHOS_HEADSET }, + [ALC671_FIXUP_HP_HEADSET_MIC2] = { + .type = HDA_FIXUP_FUNC, + .v.func = alc671_fixup_hp_headset_mic2, + }, };
static const struct snd_pci_quirk alc662_fixup_tbl[] = { @@ -9138,6 +9166,22 @@ static const struct snd_hda_pin_quirk al {0x12, 0x90a60130}, {0x14, 0x90170110}, {0x15, 0x0321101f}), + SND_HDA_PIN_QUIRK(0x10ec0671, 0x103c, "HP cPC", ALC671_FIXUP_HP_HEADSET_MIC2, + {0x14, 0x01014010}, + {0x17, 0x90170150}, + {0x1b, 0x01813030}, + {0x21, 0x02211020}), + SND_HDA_PIN_QUIRK(0x10ec0671, 0x103c, "HP cPC", ALC671_FIXUP_HP_HEADSET_MIC2, + {0x14, 0x01014010}, + {0x18, 0x01a19040}, + {0x1b, 0x01813030}, + {0x21, 0x02211020}), + SND_HDA_PIN_QUIRK(0x10ec0671, 0x103c, "HP cPC", ALC671_FIXUP_HP_HEADSET_MIC2, + {0x14, 0x01014020}, + {0x17, 0x90170110}, + {0x18, 0x01a19050}, + {0x1b, 0x01813040}, + {0x21, 0x02211030}), {} };
From: Kailang Yang kailang@realtek.com
commit f2adbae0cb20c8eaf06914b2187043ea944b0aff upstream.
HP want to keep BIOS verb table for release platform. So, it need to add 0x19 pin for quirk.
Fixes: 5af29028fd6d ("ALSA: hda/realtek - Add Headset Mic supported for HP cPC") Signed-off-by: Kailang Yang kailang@realtek.com Link: https://lore.kernel.org/r/74636ccb700a4cbda24c58a99dc430ce@realtek.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -9169,6 +9169,7 @@ static const struct snd_hda_pin_quirk al SND_HDA_PIN_QUIRK(0x10ec0671, 0x103c, "HP cPC", ALC671_FIXUP_HP_HEADSET_MIC2, {0x14, 0x01014010}, {0x17, 0x90170150}, + {0x19, 0x02a11060}, {0x1b, 0x01813030}, {0x21, 0x02211020}), SND_HDA_PIN_QUIRK(0x10ec0671, 0x103c, "HP cPC", ALC671_FIXUP_HP_HEADSET_MIC2,
From: Dmitry Yakunin zeil@yandex-team.ru
[ Upstream commit 018d26fcd12a75fb9b5fe233762aa3f2f0854b88 ]
In our production environment we have faced with problem that updating classid in cgroup with heavy tasks cause long freeze of the file tables in this tasks. By heavy tasks we understand tasks with many threads and opened sockets (e.g. balancers). This freeze leads to an increase number of client timeouts.
This patch implements following logic to fix this issue: аfter iterating 1000 file descriptors file table lock will be released thus providing a time gap for socket creation/deletion.
Now update is non atomic and socket may be skipped using calls:
dup2(oldfd, newfd); close(oldfd);
But this case is not typical. Moreover before this patch skip is possible too by hiding socket fd in unix socket buffer.
New sockets will be allocated with updated classid because cgroup state is updated before start of the file descriptors iteration.
So in common cases this patch has no side effects.
Signed-off-by: Dmitry Yakunin zeil@yandex-team.ru Reviewed-by: Konstantin Khlebnikov khlebnikov@yandex-team.ru Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/netclassid_cgroup.c | 47 +++++++++++++++++++++++++++++++++---------- 1 file changed, 37 insertions(+), 10 deletions(-)
--- a/net/core/netclassid_cgroup.c +++ b/net/core/netclassid_cgroup.c @@ -53,30 +53,60 @@ static void cgrp_css_free(struct cgroup_ kfree(css_cls_state(css)); }
+/* + * To avoid freezing of sockets creation for tasks with big number of threads + * and opened sockets lets release file_lock every 1000 iterated descriptors. + * New sockets will already have been created with new classid. + */ + +struct update_classid_context { + u32 classid; + unsigned int batch; +}; + +#define UPDATE_CLASSID_BATCH 1000 + static int update_classid_sock(const void *v, struct file *file, unsigned n) { int err; + struct update_classid_context *ctx = (void *)v; struct socket *sock = sock_from_file(file, &err);
if (sock) { spin_lock(&cgroup_sk_update_lock); - sock_cgroup_set_classid(&sock->sk->sk_cgrp_data, - (unsigned long)v); + sock_cgroup_set_classid(&sock->sk->sk_cgrp_data, ctx->classid); spin_unlock(&cgroup_sk_update_lock); } + if (--ctx->batch == 0) { + ctx->batch = UPDATE_CLASSID_BATCH; + return n + 1; + } return 0; }
+static void update_classid_task(struct task_struct *p, u32 classid) +{ + struct update_classid_context ctx = { + .classid = classid, + .batch = UPDATE_CLASSID_BATCH + }; + unsigned int fd = 0; + + do { + task_lock(p); + fd = iterate_fd(p->files, fd, update_classid_sock, &ctx); + task_unlock(p); + cond_resched(); + } while (fd); +} + static void cgrp_attach(struct cgroup_taskset *tset) { struct cgroup_subsys_state *css; struct task_struct *p;
cgroup_taskset_for_each(p, css, tset) { - task_lock(p); - iterate_fd(p->files, 0, update_classid_sock, - (void *)(unsigned long)css_cls_state(css)->classid); - task_unlock(p); + update_classid_task(p, css_cls_state(css)->classid); } }
@@ -98,10 +128,7 @@ static int write_classid(struct cgroup_s
css_task_iter_start(css, 0, &it); while ((p = css_task_iter_next(&it))) { - task_lock(p); - iterate_fd(p->files, 0, update_classid_sock, - (void *)(unsigned long)cs->classid); - task_unlock(p); + update_classid_task(p, cs->classid); cond_resched(); } css_task_iter_end(&it);
From: Eric Dumazet edumazet@google.com
[ Upstream commit 17c25cafd4d3e74c83dce56b158843b19c40b414 ]
syzbot found an interesting case of the kernel reading an uninit-value [1]
Problem is in the handling of ETH_P_WCCP in gre_parse_header()
We look at the byte following GRE options to eventually decide if the options are four bytes longer.
Use skb_header_pointer() to not pull bytes if we found that no more bytes were needed.
All callers of gre_parse_header() are properly using pskb_may_pull() anyway before proceeding to next header.
[1] BUG: KMSAN: uninit-value in pskb_may_pull include/linux/skbuff.h:2303 [inline] BUG: KMSAN: uninit-value in __iptunnel_pull_header+0x30c/0xbd0 net/ipv4/ip_tunnel_core.c:94 CPU: 1 PID: 11784 Comm: syz-executor940 Not tainted 5.6.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x220 lib/dump_stack.c:118 kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118 __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215 pskb_may_pull include/linux/skbuff.h:2303 [inline] __iptunnel_pull_header+0x30c/0xbd0 net/ipv4/ip_tunnel_core.c:94 iptunnel_pull_header include/net/ip_tunnels.h:411 [inline] gre_rcv+0x15e/0x19c0 net/ipv6/ip6_gre.c:606 ip6_protocol_deliver_rcu+0x181b/0x22c0 net/ipv6/ip6_input.c:432 ip6_input_finish net/ipv6/ip6_input.c:473 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] ip6_input net/ipv6/ip6_input.c:482 [inline] ip6_mc_input+0xdf2/0x1460 net/ipv6/ip6_input.c:576 dst_input include/net/dst.h:442 [inline] ip6_rcv_finish net/ipv6/ip6_input.c:76 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] ipv6_rcv+0x683/0x710 net/ipv6/ip6_input.c:306 __netif_receive_skb_one_core net/core/dev.c:5198 [inline] __netif_receive_skb net/core/dev.c:5312 [inline] netif_receive_skb_internal net/core/dev.c:5402 [inline] netif_receive_skb+0x66b/0xf20 net/core/dev.c:5461 tun_rx_batched include/linux/skbuff.h:4321 [inline] tun_get_user+0x6aef/0x6f60 drivers/net/tun.c:1997 tun_chr_write_iter+0x1f2/0x360 drivers/net/tun.c:2026 call_write_iter include/linux/fs.h:1901 [inline] new_sync_write fs/read_write.c:483 [inline] __vfs_write+0xa5a/0xca0 fs/read_write.c:496 vfs_write+0x44a/0x8f0 fs/read_write.c:558 ksys_write+0x267/0x450 fs/read_write.c:611 __do_sys_write fs/read_write.c:623 [inline] __se_sys_write fs/read_write.c:620 [inline] __ia32_sys_write+0xdb/0x120 fs/read_write.c:620 do_syscall_32_irqs_on arch/x86/entry/common.c:339 [inline] do_fast_syscall_32+0x3c7/0x6e0 arch/x86/entry/common.c:410 entry_SYSENTER_compat+0x68/0x77 arch/x86/entry/entry_64_compat.S:139 RIP: 0023:0xf7f62d99 Code: 90 e8 0b 00 00 00 f3 90 0f ae e8 eb f9 8d 74 26 00 89 3c 24 c3 90 90 90 90 90 90 90 90 90 90 90 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90 RSP: 002b:00000000fffedb2c EFLAGS: 00000217 ORIG_RAX: 0000000000000004 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000020002580 RDX: 0000000000000fca RSI: 0000000000000036 RDI: 0000000000000004 RBP: 0000000000008914 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:144 [inline] kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:127 kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:82 slab_alloc_node mm/slub.c:2793 [inline] __kmalloc_node_track_caller+0xb40/0x1200 mm/slub.c:4401 __kmalloc_reserve net/core/skbuff.c:142 [inline] __alloc_skb+0x2fd/0xac0 net/core/skbuff.c:210 alloc_skb include/linux/skbuff.h:1051 [inline] alloc_skb_with_frags+0x18c/0xa70 net/core/skbuff.c:5766 sock_alloc_send_pskb+0xada/0xc60 net/core/sock.c:2242 tun_alloc_skb drivers/net/tun.c:1529 [inline] tun_get_user+0x10ae/0x6f60 drivers/net/tun.c:1843 tun_chr_write_iter+0x1f2/0x360 drivers/net/tun.c:2026 call_write_iter include/linux/fs.h:1901 [inline] new_sync_write fs/read_write.c:483 [inline] __vfs_write+0xa5a/0xca0 fs/read_write.c:496 vfs_write+0x44a/0x8f0 fs/read_write.c:558 ksys_write+0x267/0x450 fs/read_write.c:611 __do_sys_write fs/read_write.c:623 [inline] __se_sys_write fs/read_write.c:620 [inline] __ia32_sys_write+0xdb/0x120 fs/read_write.c:620 do_syscall_32_irqs_on arch/x86/entry/common.c:339 [inline] do_fast_syscall_32+0x3c7/0x6e0 arch/x86/entry/common.c:410 entry_SYSENTER_compat+0x68/0x77 arch/x86/entry/entry_64_compat.S:139
Fixes: 95f5c64c3c13 ("gre: Move utility functions to common headers") Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/gre_demux.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
--- a/net/ipv4/gre_demux.c +++ b/net/ipv4/gre_demux.c @@ -56,7 +56,9 @@ int gre_del_protocol(const struct gre_pr } EXPORT_SYMBOL_GPL(gre_del_protocol);
-/* Fills in tpi and returns header length to be pulled. */ +/* Fills in tpi and returns header length to be pulled. + * Note that caller must use pskb_may_pull() before pulling GRE header. + */ int gre_parse_header(struct sk_buff *skb, struct tnl_ptk_info *tpi, bool *csum_err, __be16 proto, int nhs) { @@ -110,8 +112,14 @@ int gre_parse_header(struct sk_buff *skb * - When dealing with WCCPv2, Skip extra 4 bytes in GRE header */ if (greh->flags == 0 && tpi->proto == htons(ETH_P_WCCP)) { + u8 _val, *val; + + val = skb_header_pointer(skb, nhs + hdr_len, + sizeof(_val), &_val); + if (!val) + return -EINVAL; tpi->proto = proto; - if ((*(u8 *)options & 0xF0) != 0x40) + if ((*val & 0xF0) != 0x40) hdr_len += 4; } tpi->hdr_len = hdr_len;
From: Dmitry Yakunin zeil@yandex-team.ru
[ Upstream commit 83f73c5bb7b9a9135173f0ba2b1aa00c06664ff9 ]
In commit 1ec17dbd90f8 ("inet_diag: fix reporting cgroup classid and fallback to priority") croup classid reporting was fixed. But this works only for TCP sockets because for other socket types icsk parameter can be NULL and classid code path is skipped. This change moves classid handling to inet_diag_msg_attrs_fill() function.
Also inet_diag_msg_attrs_size() helper was added and addends in nlmsg_new() were reordered to save order from inet_sk_diag_fill().
Fixes: 1ec17dbd90f8 ("inet_diag: fix reporting cgroup classid and fallback to priority") Signed-off-by: Dmitry Yakunin zeil@yandex-team.ru Reviewed-by: Konstantin Khlebnikov khlebnikov@yandex-team.ru Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/inet_diag.h | 18 ++++++++++++------ net/ipv4/inet_diag.c | 44 ++++++++++++++++++++------------------------ net/ipv4/raw_diag.c | 5 +++-- net/ipv4/udp_diag.c | 5 +++-- net/sctp/diag.c | 8 ++------ 5 files changed, 40 insertions(+), 40 deletions(-)
--- a/include/linux/inet_diag.h +++ b/include/linux/inet_diag.h @@ -2,15 +2,10 @@ #ifndef _INET_DIAG_H_ #define _INET_DIAG_H_ 1
+#include <net/netlink.h> #include <uapi/linux/inet_diag.h>
-struct net; -struct sock; struct inet_hashinfo; -struct nlattr; -struct nlmsghdr; -struct sk_buff; -struct netlink_callback;
struct inet_diag_handler { void (*dump)(struct sk_buff *skb, @@ -62,6 +57,17 @@ int inet_diag_bc_sk(const struct nlattr
void inet_diag_msg_common_fill(struct inet_diag_msg *r, struct sock *sk);
+static inline size_t inet_diag_msg_attrs_size(void) +{ + return nla_total_size(1) /* INET_DIAG_SHUTDOWN */ + + nla_total_size(1) /* INET_DIAG_TOS */ +#if IS_ENABLED(CONFIG_IPV6) + + nla_total_size(1) /* INET_DIAG_TCLASS */ + + nla_total_size(1) /* INET_DIAG_SKV6ONLY */ +#endif + + nla_total_size(4) /* INET_DIAG_MARK */ + + nla_total_size(4); /* INET_DIAG_CLASS_ID */ +} int inet_diag_msg_attrs_fill(struct sock *sk, struct sk_buff *skb, struct inet_diag_msg *r, int ext, struct user_namespace *user_ns, bool net_admin); --- a/net/ipv4/inet_diag.c +++ b/net/ipv4/inet_diag.c @@ -100,13 +100,9 @@ static size_t inet_sk_attr_size(struct s aux = handler->idiag_get_aux_size(sk, net_admin);
return nla_total_size(sizeof(struct tcp_info)) - + nla_total_size(1) /* INET_DIAG_SHUTDOWN */ - + nla_total_size(1) /* INET_DIAG_TOS */ - + nla_total_size(1) /* INET_DIAG_TCLASS */ - + nla_total_size(4) /* INET_DIAG_MARK */ - + nla_total_size(4) /* INET_DIAG_CLASS_ID */ - + nla_total_size(sizeof(struct inet_diag_meminfo)) + nla_total_size(sizeof(struct inet_diag_msg)) + + inet_diag_msg_attrs_size() + + nla_total_size(sizeof(struct inet_diag_meminfo)) + nla_total_size(SK_MEMINFO_VARS * sizeof(u32)) + nla_total_size(TCP_CA_NAME_MAX) + nla_total_size(sizeof(struct tcpvegas_info)) @@ -147,6 +143,24 @@ int inet_diag_msg_attrs_fill(struct sock if (net_admin && nla_put_u32(skb, INET_DIAG_MARK, sk->sk_mark)) goto errout;
+ if (ext & (1 << (INET_DIAG_CLASS_ID - 1)) || + ext & (1 << (INET_DIAG_TCLASS - 1))) { + u32 classid = 0; + +#ifdef CONFIG_SOCK_CGROUP_DATA + classid = sock_cgroup_classid(&sk->sk_cgrp_data); +#endif + /* Fallback to socket priority if class id isn't set. + * Classful qdiscs use it as direct reference to class. + * For cgroup2 classid is always zero. + */ + if (!classid) + classid = sk->sk_priority; + + if (nla_put_u32(skb, INET_DIAG_CLASS_ID, classid)) + goto errout; + } + r->idiag_uid = from_kuid_munged(user_ns, sock_i_uid(sk)); r->idiag_inode = sock_i_ino(sk);
@@ -284,24 +298,6 @@ int inet_sk_diag_fill(struct sock *sk, s goto errout; }
- if (ext & (1 << (INET_DIAG_CLASS_ID - 1)) || - ext & (1 << (INET_DIAG_TCLASS - 1))) { - u32 classid = 0; - -#ifdef CONFIG_SOCK_CGROUP_DATA - classid = sock_cgroup_classid(&sk->sk_cgrp_data); -#endif - /* Fallback to socket priority if class id isn't set. - * Classful qdiscs use it as direct reference to class. - * For cgroup2 classid is always zero. - */ - if (!classid) - classid = sk->sk_priority; - - if (nla_put_u32(skb, INET_DIAG_CLASS_ID, classid)) - goto errout; - } - out: nlmsg_end(skb, nlh); return 0; --- a/net/ipv4/raw_diag.c +++ b/net/ipv4/raw_diag.c @@ -100,8 +100,9 @@ static int raw_diag_dump_one(struct sk_b if (IS_ERR(sk)) return PTR_ERR(sk);
- rep = nlmsg_new(sizeof(struct inet_diag_msg) + - sizeof(struct inet_diag_meminfo) + 64, + rep = nlmsg_new(nla_total_size(sizeof(struct inet_diag_msg)) + + inet_diag_msg_attrs_size() + + nla_total_size(sizeof(struct inet_diag_meminfo)) + 64, GFP_KERNEL); if (!rep) { sock_put(sk); --- a/net/ipv4/udp_diag.c +++ b/net/ipv4/udp_diag.c @@ -64,8 +64,9 @@ static int udp_dump_one(struct udp_table goto out;
err = -ENOMEM; - rep = nlmsg_new(sizeof(struct inet_diag_msg) + - sizeof(struct inet_diag_meminfo) + 64, + rep = nlmsg_new(nla_total_size(sizeof(struct inet_diag_msg)) + + inet_diag_msg_attrs_size() + + nla_total_size(sizeof(struct inet_diag_meminfo)) + 64, GFP_KERNEL); if (!rep) goto out; --- a/net/sctp/diag.c +++ b/net/sctp/diag.c @@ -237,15 +237,11 @@ static size_t inet_assoc_attr_size(struc addrcnt++;
return nla_total_size(sizeof(struct sctp_info)) - + nla_total_size(1) /* INET_DIAG_SHUTDOWN */ - + nla_total_size(1) /* INET_DIAG_TOS */ - + nla_total_size(1) /* INET_DIAG_TCLASS */ - + nla_total_size(4) /* INET_DIAG_MARK */ - + nla_total_size(4) /* INET_DIAG_CLASS_ID */ + nla_total_size(addrlen * asoc->peer.transport_count) + nla_total_size(addrlen * addrcnt) - + nla_total_size(sizeof(struct inet_diag_meminfo)) + nla_total_size(sizeof(struct inet_diag_msg)) + + inet_diag_msg_attrs_size() + + nla_total_size(sizeof(struct inet_diag_meminfo)) + 64; }
From: Hangbin Liu liuhangbin@gmail.com
[ Upstream commit 60380488e4e0b95e9e82aa68aa9705baa86de84c ]
Rafał found an issue that for non-Ethernet interface, if we down and up frequently, the memory will be consumed slowly.
The reason is we add allnodes/allrouters addressed in multicast list in ipv6_add_dev(). When link down, we call ipv6_mc_down(), store all multicast addresses via mld_add_delrec(). But when link up, we don't call ipv6_mc_up() for non-Ethernet interface to remove the addresses. This makes idev->mc_tomb getting bigger and bigger. The call stack looks like:
addrconf_notify(NETDEV_REGISTER) ipv6_add_dev ipv6_dev_mc_inc(ff01::1) ipv6_dev_mc_inc(ff02::1) ipv6_dev_mc_inc(ff02::2)
addrconf_notify(NETDEV_UP) addrconf_dev_config /* Alas, we support only Ethernet autoconfiguration. */ return;
addrconf_notify(NETDEV_DOWN) addrconf_ifdown ipv6_mc_down igmp6_group_dropped(ff02::2) mld_add_delrec(ff02::2) igmp6_group_dropped(ff02::1) igmp6_group_dropped(ff01::1)
After investigating, I can't found a rule to disable multicast on non-Ethernet interface. In RFC2460, the link could be Ethernet, PPP, ATM, tunnels, etc. In IPv4, it doesn't check the dev type when calls ip_mc_up() in inetdev_event(). Even for IPv6, we don't check the dev type and call ipv6_add_dev(), ipv6_dev_mc_inc() after register device.
So I think it's OK to fix this memory consumer by calling ipv6_mc_up() for non-Ethernet interface.
v2: Also check IFF_MULTICAST flag to make sure the interface supports multicast
Reported-by: Rafał Miłecki zajec5@gmail.com Tested-by: Rafał Miłecki zajec5@gmail.com Fixes: 74235a25c673 ("[IPV6] addrconf: Fix IPv6 on tuntap tunnels") Fixes: 1666d49e1d41 ("mld: do not remove mld souce list info when set link down") Signed-off-by: Hangbin Liu liuhangbin@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/addrconf.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -3345,6 +3345,10 @@ static void addrconf_dev_config(struct n (dev->type != ARPHRD_NONE) && (dev->type != ARPHRD_RAWIP)) { /* Alas, we support only Ethernet autoconfiguration. */ + idev = __in6_dev_get(dev); + if (!IS_ERR_OR_NULL(idev) && dev->flags & IFF_UP && + dev->flags & IFF_MULTICAST) + ipv6_mc_up(idev); return; }
From: Mahesh Bandewar maheshb@google.com
[ Upstream commit e18b353f102e371580f3f01dd47567a25acc3c1d ]
If there are substantial number of slaves created as simulated by Syzbot, the backlog processing could take much longer and result into the issue found in the Syzbot report.
INFO: rcu_sched detected stalls on CPUs/tasks: (detected by 1, t=10502 jiffies, g=5049, c=5048, q=752) All QSes seen, last rcu_sched kthread activity 10502 (4294965563-4294955061), jiffies_till_next_fqs=1, root ->qsmask 0x0 syz-executor.1 R running task on cpu 1 10984 11210 3866 0x30020008 179034491270 Call Trace: <IRQ> [<ffffffff81497163>] _sched_show_task kernel/sched/core.c:8063 [inline] [<ffffffff81497163>] _sched_show_task.cold+0x2fd/0x392 kernel/sched/core.c:8030 [<ffffffff8146a91b>] sched_show_task+0xb/0x10 kernel/sched/core.c:8073 [<ffffffff815c931b>] print_other_cpu_stall kernel/rcu/tree.c:1577 [inline] [<ffffffff815c931b>] check_cpu_stall kernel/rcu/tree.c:1695 [inline] [<ffffffff815c931b>] __rcu_pending kernel/rcu/tree.c:3478 [inline] [<ffffffff815c931b>] rcu_pending kernel/rcu/tree.c:3540 [inline] [<ffffffff815c931b>] rcu_check_callbacks.cold+0xbb4/0xc29 kernel/rcu/tree.c:2876 [<ffffffff815e3962>] update_process_times+0x32/0x80 kernel/time/timer.c:1635 [<ffffffff816164f0>] tick_sched_handle+0xa0/0x180 kernel/time/tick-sched.c:161 [<ffffffff81616ae4>] tick_sched_timer+0x44/0x130 kernel/time/tick-sched.c:1193 [<ffffffff815e75f7>] __run_hrtimer kernel/time/hrtimer.c:1393 [inline] [<ffffffff815e75f7>] __hrtimer_run_queues+0x307/0xd90 kernel/time/hrtimer.c:1455 [<ffffffff815e90ea>] hrtimer_interrupt+0x2ea/0x730 kernel/time/hrtimer.c:1513 [<ffffffff844050f4>] local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1031 [inline] [<ffffffff844050f4>] smp_apic_timer_interrupt+0x144/0x5e0 arch/x86/kernel/apic/apic.c:1056 [<ffffffff84401cbe>] apic_timer_interrupt+0x8e/0xa0 arch/x86/entry/entry_64.S:778 RIP: 0010:do_raw_read_lock+0x22/0x80 kernel/locking/spinlock_debug.c:153 RSP: 0018:ffff8801dad07ab8 EFLAGS: 00000a02 ORIG_RAX: ffffffffffffff12 RAX: 0000000000000000 RBX: ffff8801c4135680 RCX: 0000000000000000 RDX: 1ffff10038826afe RSI: ffff88019d816bb8 RDI: ffff8801c41357f0 RBP: ffff8801dad07ac0 R08: 0000000000004b15 R09: 0000000000310273 R10: ffff88019d816bb8 R11: 0000000000000001 R12: ffff8801c41357e8 R13: 0000000000000000 R14: ffff8801cfb19850 R15: ffff8801cfb198b0 [<ffffffff8101460e>] __raw_read_lock_bh include/linux/rwlock_api_smp.h:177 [inline] [<ffffffff8101460e>] _raw_read_lock_bh+0x3e/0x50 kernel/locking/spinlock.c:240 [<ffffffff840d78ca>] ipv6_chk_mcast_addr+0x11a/0x6f0 net/ipv6/mcast.c:1006 [<ffffffff84023439>] ip6_mc_input+0x319/0x8e0 net/ipv6/ip6_input.c:482 [<ffffffff840211c8>] dst_input include/net/dst.h:449 [inline] [<ffffffff840211c8>] ip6_rcv_finish+0x408/0x610 net/ipv6/ip6_input.c:78 [<ffffffff840214de>] NF_HOOK include/linux/netfilter.h:292 [inline] [<ffffffff840214de>] NF_HOOK include/linux/netfilter.h:286 [inline] [<ffffffff840214de>] ipv6_rcv+0x10e/0x420 net/ipv6/ip6_input.c:278 [<ffffffff83a29efa>] __netif_receive_skb_one_core+0x12a/0x1f0 net/core/dev.c:5303 [<ffffffff83a2a15c>] __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:5417 [<ffffffff83a2f536>] process_backlog+0x216/0x6c0 net/core/dev.c:6243 [<ffffffff83a30d1b>] napi_poll net/core/dev.c:6680 [inline] [<ffffffff83a30d1b>] net_rx_action+0x47b/0xfb0 net/core/dev.c:6748 [<ffffffff846002c8>] __do_softirq+0x2c8/0x99a kernel/softirq.c:317 [<ffffffff813e656a>] invoke_softirq kernel/softirq.c:399 [inline] [<ffffffff813e656a>] irq_exit+0x16a/0x1a0 kernel/softirq.c:439 [<ffffffff84405115>] exiting_irq arch/x86/include/asm/apic.h:561 [inline] [<ffffffff84405115>] smp_apic_timer_interrupt+0x165/0x5e0 arch/x86/kernel/apic/apic.c:1058 [<ffffffff84401cbe>] apic_timer_interrupt+0x8e/0xa0 arch/x86/entry/entry_64.S:778 </IRQ> RIP: 0010:__sanitizer_cov_trace_pc+0x26/0x50 kernel/kcov.c:102 RSP: 0018:ffff880196033bd8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff12 RAX: ffff88019d8161c0 RBX: 00000000ffffffff RCX: ffffc90003501000 RDX: 0000000000000002 RSI: ffffffff816236d1 RDI: 0000000000000005 RBP: ffff880196033bd8 R08: ffff88019d8161c0 R09: 0000000000000000 R10: 1ffff10032c067f0 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000080 R14: 0000000000000000 R15: 0000000000000000 [<ffffffff816236d1>] do_futex+0x151/0x1d50 kernel/futex.c:3548 [<ffffffff816260f0>] C_SYSC_futex kernel/futex_compat.c:201 [inline] [<ffffffff816260f0>] compat_SyS_futex+0x270/0x3b0 kernel/futex_compat.c:175 [<ffffffff8101da17>] do_syscall_32_irqs_on arch/x86/entry/common.c:353 [inline] [<ffffffff8101da17>] do_fast_syscall_32+0x357/0xe1c arch/x86/entry/common.c:415 [<ffffffff84401a9b>] entry_SYSENTER_compat+0x8b/0x9d arch/x86/entry/entry_64_compat.S:139 RIP: 0023:0xf7f23c69 RSP: 002b:00000000f5d1f12c EFLAGS: 00000282 ORIG_RAX: 00000000000000f0 RAX: ffffffffffffffda RBX: 000000000816af88 RCX: 0000000000000080 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000816af8c RBP: 00000000f5d1f228 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 rcu_sched kthread starved for 10502 jiffies! g5049 c5048 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=1 rcu_sched R running task on cpu 1 13048 8 2 0x90000000 179099587640 Call Trace: [<ffffffff8147321f>] context_switch+0x60f/0xa60 kernel/sched/core.c:3209 [<ffffffff8100095a>] __schedule+0x5aa/0x1da0 kernel/sched/core.c:3934 [<ffffffff810021df>] schedule+0x8f/0x1b0 kernel/sched/core.c:4011 [<ffffffff8101116d>] schedule_timeout+0x50d/0xee0 kernel/time/timer.c:1803 [<ffffffff815c13f1>] rcu_gp_kthread+0xda1/0x3b50 kernel/rcu/tree.c:2327 [<ffffffff8144b318>] kthread+0x348/0x420 kernel/kthread.c:246 [<ffffffff84400266>] ret_from_fork+0x56/0x70 arch/x86/entry/entry_64.S:393
Fixes: ba35f8588f47 (“ipvlan: Defer multicast / broadcast processing to a work-queue”) Signed-off-by: Mahesh Bandewar maheshb@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ipvlan/ipvlan_core.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/ipvlan/ipvlan_core.c +++ b/drivers/net/ipvlan/ipvlan_core.c @@ -277,6 +277,7 @@ void ipvlan_process_multicast(struct wor } ipvlan_count_rx(ipvlan, len, ret == NET_RX_SUCCESS, true); local_bh_enable(); + cond_resched_rcu(); } rcu_read_unlock();
From: Jiri Wiesner jwiesner@suse.com
[ Upstream commit 63aae7b17344d4b08a7d05cb07044de4c0f9dcc6 ]
There is a problem when ipvlan slaves are created on a master device that is a vmxnet3 device (ipvlan in VMware guests). The vmxnet3 driver does not support unicast address filtering. When an ipvlan device is brought up in ipvlan_open(), the ipvlan driver calls dev_uc_add() to add the hardware address of the vmxnet3 master device to the unicast address list of the master device, phy_dev->uc. This inevitably leads to the vmxnet3 master device being forced into promiscuous mode by __dev_set_rx_mode().
Promiscuous mode is switched on the master despite the fact that there is still only one hardware address that the master device should use for filtering in order for the ipvlan device to be able to receive packets. The comment above struct net_device describes the uc_promisc member as a "counter, that indicates, that promiscuous mode has been enabled due to the need to listen to additional unicast addresses in a device that does not implement ndo_set_rx_mode()". Moreover, the design of ipvlan guarantees that only the hardware address of a master device, phy_dev->dev_addr, will be used to transmit and receive all packets from its ipvlan slaves. Thus, the unicast address list of the master device should not be modified by ipvlan_open() and ipvlan_stop() in order to make ipvlan a workable option on masters that do not support unicast address filtering.
Fixes: 2ad7bf3638411 ("ipvlan: Initial check-in of the IPVLAN driver") Reported-by: Per Sundstrom per.sundstrom@redqube.se Signed-off-by: Jiri Wiesner jwiesner@suse.com Reviewed-by: Eric Dumazet edumazet@google.com Acked-by: Mahesh Bandewar maheshb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ipvlan/ipvlan_main.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)
--- a/drivers/net/ipvlan/ipvlan_main.c +++ b/drivers/net/ipvlan/ipvlan_main.c @@ -164,7 +164,6 @@ static void ipvlan_uninit(struct net_dev static int ipvlan_open(struct net_device *dev) { struct ipvl_dev *ipvlan = netdev_priv(dev); - struct net_device *phy_dev = ipvlan->phy_dev; struct ipvl_addr *addr;
if (ipvlan->port->mode == IPVLAN_MODE_L3 || @@ -178,7 +177,7 @@ static int ipvlan_open(struct net_device ipvlan_ht_addr_add(ipvlan, addr); rcu_read_unlock();
- return dev_uc_add(phy_dev, phy_dev->dev_addr); + return 0; }
static int ipvlan_stop(struct net_device *dev) @@ -190,8 +189,6 @@ static int ipvlan_stop(struct net_device dev_uc_unsync(phy_dev, dev); dev_mc_unsync(phy_dev, dev);
- dev_uc_del(phy_dev, phy_dev->dev_addr); - rcu_read_lock(); list_for_each_entry_rcu(addr, &ipvlan->addrs, anode) ipvlan_ht_addr_del(addr);
From: Eric Dumazet edumazet@google.com
[ Upstream commit afe207d80a61e4d6e7cfa0611a4af46d0ba95628 ]
Commit e18b353f102e ("ipvlan: add cond_resched_rcu() while processing muticast backlog") added a cond_resched_rcu() in a loop using rcu protection to iterate over slaves.
This is breaking rcu rules, so lets instead use cond_resched() at a point we can reschedule
Fixes: e18b353f102e ("ipvlan: add cond_resched_rcu() while processing muticast backlog") Signed-off-by: Eric Dumazet edumazet@google.com Cc: Mahesh Bandewar maheshb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ipvlan/ipvlan_core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ipvlan/ipvlan_core.c +++ b/drivers/net/ipvlan/ipvlan_core.c @@ -277,7 +277,6 @@ void ipvlan_process_multicast(struct wor } ipvlan_count_rx(ipvlan, len, ret == NET_RX_SUCCESS, true); local_bh_enable(); - cond_resched_rcu(); } rcu_read_unlock();
@@ -294,6 +293,7 @@ void ipvlan_process_multicast(struct wor } if (dev) dev_put(dev); + cond_resched(); } }
From: Mahesh Bandewar maheshb@google.com
[ Upstream commit ad8192767c9f9cf97da57b9ffcea70fb100febef ]
IPvlan in L3 mode discards outbound multicast packets but performs the check before ensuring the ether-header is set or not. This is an error that Eric found through code browsing.
Fixes: 2ad7bf363841 (“ipvlan: Initial check-in of the IPVLAN driver.”) Signed-off-by: Mahesh Bandewar maheshb@google.com Reported-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ipvlan/ipvlan_core.c | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-)
--- a/drivers/net/ipvlan/ipvlan_core.c +++ b/drivers/net/ipvlan/ipvlan_core.c @@ -499,19 +499,21 @@ static int ipvlan_process_outbound(struc struct ethhdr *ethh = eth_hdr(skb); int ret = NET_XMIT_DROP;
- /* In this mode we dont care about multicast and broadcast traffic */ - if (is_multicast_ether_addr(ethh->h_dest)) { - pr_debug_ratelimited("Dropped {multi|broad}cast of type=[%x]\n", - ntohs(skb->protocol)); - kfree_skb(skb); - goto out; - } - /* The ipvlan is a pseudo-L2 device, so the packets that we receive * will have L2; which need to discarded and processed further * in the net-ns of the main-device. */ if (skb_mac_header_was_set(skb)) { + /* In this mode we dont care about + * multicast and broadcast traffic */ + if (is_multicast_ether_addr(ethh->h_dest)) { + pr_debug_ratelimited( + "Dropped {multi|broad}cast of type=[%x]\n", + ntohs(skb->protocol)); + kfree_skb(skb); + goto out; + } + skb_pull(skb, sizeof(*ethh)); skb->mac_header = (typeof(skb->mac_header))~0U; skb_reset_network_header(skb);
From: Mahesh Bandewar maheshb@google.com
[ Upstream commit ce9a4186f9ac475c415ffd20348176a4ea366670 ]
The Rx bound multicast packets are deferred to a workqueue and macvlan can also suffer from the same attack that was discovered by Syzbot for IPvlan. This solution is not as effective as in IPvlan. IPvlan defers all (Tx and Rx) multicast packet processing to a workqueue while macvlan does this way only for the Rx. This fix should address the Rx codition to certain extent.
Tx is still suseptible. Tx multicast processing happens when .ndo_start_xmit is called, hence we cannot add cond_resched(). However, it's not that severe since the user which is generating / flooding will be affected the most.
Fixes: 412ca1550cbe ("macvlan: Move broadcasts into a work queue") Signed-off-by: Mahesh Bandewar maheshb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/macvlan.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/net/macvlan.c +++ b/drivers/net/macvlan.c @@ -334,6 +334,8 @@ static void macvlan_process_broadcast(st if (src) dev_put(src->dev); consume_skb(skb); + + cond_resched(); } }
From: Russell King rmk+kernel@armlinux.org.uk
[ Upstream commit 8640f8dc6d657ebfb4e67c202ad32c5457858a13 ]
Place phylink_start()/phylink_stop() inside dsa_port_enable() and dsa_port_disable(), which ensures that we call phylink_stop() before tearing down phylink - which is a documented requirement. Failure to do so can cause use-after-free bugs.
Fixes: 0e27921816ad ("net: dsa: Use PHYLINK for the CPU/DSA ports") Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/dsa/dsa_priv.h | 2 ++ net/dsa/port.c | 32 ++++++++++++++++++++++++++------ net/dsa/slave.c | 8 ++------ 3 files changed, 30 insertions(+), 12 deletions(-)
--- a/net/dsa/dsa_priv.h +++ b/net/dsa/dsa_priv.h @@ -128,7 +128,9 @@ static inline struct net_device *dsa_mas /* port.c */ int dsa_port_set_state(struct dsa_port *dp, u8 state, struct switchdev_trans *trans); +int dsa_port_enable_rt(struct dsa_port *dp, struct phy_device *phy); int dsa_port_enable(struct dsa_port *dp, struct phy_device *phy); +void dsa_port_disable_rt(struct dsa_port *dp); void dsa_port_disable(struct dsa_port *dp); int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br); void dsa_port_bridge_leave(struct dsa_port *dp, struct net_device *br); --- a/net/dsa/port.c +++ b/net/dsa/port.c @@ -63,7 +63,7 @@ static void dsa_port_set_state_now(struc pr_err("DSA: failed to set STP state %u (%d)\n", state, err); }
-int dsa_port_enable(struct dsa_port *dp, struct phy_device *phy) +int dsa_port_enable_rt(struct dsa_port *dp, struct phy_device *phy) { struct dsa_switch *ds = dp->ds; int port = dp->index; @@ -78,14 +78,31 @@ int dsa_port_enable(struct dsa_port *dp, if (!dp->bridge_dev) dsa_port_set_state_now(dp, BR_STATE_FORWARDING);
+ if (dp->pl) + phylink_start(dp->pl); + return 0; }
-void dsa_port_disable(struct dsa_port *dp) +int dsa_port_enable(struct dsa_port *dp, struct phy_device *phy) +{ + int err; + + rtnl_lock(); + err = dsa_port_enable_rt(dp, phy); + rtnl_unlock(); + + return err; +} + +void dsa_port_disable_rt(struct dsa_port *dp) { struct dsa_switch *ds = dp->ds; int port = dp->index;
+ if (dp->pl) + phylink_stop(dp->pl); + if (!dp->bridge_dev) dsa_port_set_state_now(dp, BR_STATE_DISABLED);
@@ -93,6 +110,13 @@ void dsa_port_disable(struct dsa_port *d ds->ops->port_disable(ds, port); }
+void dsa_port_disable(struct dsa_port *dp) +{ + rtnl_lock(); + dsa_port_disable_rt(dp); + rtnl_unlock(); +} + int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br) { struct dsa_notifier_bridge_info info = { @@ -615,10 +639,6 @@ static int dsa_port_phylink_register(str goto err_phy_connect; }
- rtnl_lock(); - phylink_start(dp->pl); - rtnl_unlock(); - return 0;
err_phy_connect: --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -90,12 +90,10 @@ static int dsa_slave_open(struct net_dev goto clear_allmulti; }
- err = dsa_port_enable(dp, dev->phydev); + err = dsa_port_enable_rt(dp, dev->phydev); if (err) goto clear_promisc;
- phylink_start(dp->pl); - return 0;
clear_promisc: @@ -119,9 +117,7 @@ static int dsa_slave_close(struct net_de cancel_work_sync(&dp->xmit_work); skb_queue_purge(&dp->xmit_queue);
- phylink_stop(dp->pl); - - dsa_port_disable(dp); + dsa_port_disable_rt(dp);
dev_mc_unsync(master, dev); dev_uc_unsync(master, dev);
From: Russell King rmk+kernel@armlinux.org.uk
[ Upstream commit 0395823b8d9a4d87bd1bf74359123461c2ae801b ]
If the switch is not hardware reset on a warm boot, interrupts can be left enabled, and possibly pending. This will cause us to enter an infinite loop trying to service an interrupt we are unable to handle, thereby preventing the kernel from booting.
Ensure that the global 2 interrupt sources are disabled before we claim the parent interrupt.
Observed on the ZII development revision B and C platforms with reworked serdes support, and using reboot -f to reboot the platform.
Fixes: dc30c35be720 ("net: dsa: mv88e6xxx: Implement interrupt support.") Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/mv88e6xxx/global2.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/drivers/net/dsa/mv88e6xxx/global2.c +++ b/drivers/net/dsa/mv88e6xxx/global2.c @@ -1083,6 +1083,13 @@ int mv88e6xxx_g2_irq_setup(struct mv88e6 { int err, irq, virq;
+ chip->g2_irq.masked = ~0; + mv88e6xxx_reg_lock(chip); + err = mv88e6xxx_g2_int_mask(chip, ~chip->g2_irq.masked); + mv88e6xxx_reg_unlock(chip); + if (err) + return err; + chip->g2_irq.domain = irq_domain_add_simple( chip->dev->of_node, 16, 0, &mv88e6xxx_g2_irq_domain_ops, chip); if (!chip->g2_irq.domain) @@ -1092,7 +1099,6 @@ int mv88e6xxx_g2_irq_setup(struct mv88e6 irq_create_mapping(chip->g2_irq.domain, irq);
chip->g2_irq.chip = mv88e6xxx_g2_irq_chip; - chip->g2_irq.masked = ~0;
chip->device_irq = irq_find_mapping(chip->g1_irq.domain, MV88E6XXX_G1_STS_IRQ_DEVICE);
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit ab14961d10d02d20767612c78ce148f6eb85bd58 ]
fec_enet_set_coalesce() validates the previously set params and if they are within range proceeds to apply the new ones. The new ones, however, are not validated. This seems backwards, probably a copy-paste error?
Compile tested only.
Fixes: d851b47b22fc ("net: fec: add interrupt coalescence feature support") Signed-off-by: Jakub Kicinski kuba@kernel.org Acked-by: Fugang Duan fugang.duan@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/freescale/fec_main.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -2529,15 +2529,15 @@ fec_enet_set_coalesce(struct net_device return -EINVAL; }
- cycle = fec_enet_us_to_itr_clock(ndev, fep->rx_time_itr); + cycle = fec_enet_us_to_itr_clock(ndev, ec->rx_coalesce_usecs); if (cycle > 0xFFFF) { dev_err(dev, "Rx coalesced usec exceed hardware limitation\n"); return -EINVAL; }
- cycle = fec_enet_us_to_itr_clock(ndev, fep->tx_time_itr); + cycle = fec_enet_us_to_itr_clock(ndev, ec->tx_coalesce_usecs); if (cycle > 0xFFFF) { - dev_err(dev, "Rx coalesced usec exceed hardware limitation\n"); + dev_err(dev, "Tx coalesced usec exceed hardware limitation\n"); return -EINVAL; }
From: Jian Shen shenjian15@huawei.com
[ Upstream commit 68e1006f618e509fc7869259fe83ceec4a95dac3 ]
When fibre port supports auto-negotiation, the IMP(Intelligent Management Process) processes the speed of auto-negotiation and the user's speed separately. For below case, the port will get a not link up problem. step 1: disables auto-negotiation and sets speed to A, then the driver's MAC speed will be updated to A. step 2: enables auto-negotiation and MAC gets negotiated speed B, then the driver's MAC speed will be updated to B through querying in periodical task. step 3: MAC gets new negotiated speed A. step 4: disables auto-negotiation and sets speed to B before periodical task query new MAC speed A, the driver will ignore the speed configuration.
This patch fixes it by skipping speed and duplex checking when fibre port supports auto-negotiation.
Fixes: 22f48e24a23d ("net: hns3: add autoneg and change speed support for fibre port") Signed-off-by: Jian Shen shenjian15@huawei.com Signed-off-by: Huazhong Tan tanhuazhong@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c @@ -2417,10 +2417,12 @@ static int hclge_cfg_mac_speed_dup_hw(st
int hclge_cfg_mac_speed_dup(struct hclge_dev *hdev, int speed, u8 duplex) { + struct hclge_mac *mac = &hdev->hw.mac; int ret;
duplex = hclge_check_speed_dup(duplex, speed); - if (hdev->hw.mac.speed == speed && hdev->hw.mac.duplex == duplex) + if (!mac->support_autoneg && mac->speed == speed && + mac->duplex == duplex) return 0;
ret = hclge_cfg_mac_speed_dup_hw(hdev, speed, duplex);
From: Hangbin Liu liuhangbin@gmail.com
[ Upstream commit 07758eb9ff52794fba15d03aa88d92dbd1b7d125 ]
When we add peer address with metric configured, IPv4 could set the dest metric correctly, but IPv6 do not. e.g.
]# ip addr add 192.0.2.1 peer 192.0.2.2/32 dev eth1 metric 20 ]# ip route show dev eth1 192.0.2.2 proto kernel scope link src 192.0.2.1 metric 20 ]# ip addr add 2001:db8::1 peer 2001:db8::2/128 dev eth1 metric 20 ]# ip -6 route show dev eth1 2001:db8::1 proto kernel metric 20 pref medium 2001:db8::2 proto kernel metric 256 pref medium
Fix this by using configured metric instead of default one.
Reported-by: Jianlin Shi jishi@redhat.com Fixes: 8308f3ff1753 ("net/ipv6: Add support for specifying metric of connected routes") Reviewed-by: David Ahern dsahern@gmail.com Signed-off-by: Hangbin Liu liuhangbin@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/addrconf.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -5988,9 +5988,9 @@ static void __ipv6_ifa_notify(int event, if (ifp->idev->cnf.forwarding) addrconf_join_anycast(ifp); if (!ipv6_addr_any(&ifp->peer_addr)) - addrconf_prefix_route(&ifp->peer_addr, 128, 0, - ifp->idev->dev, 0, 0, - GFP_ATOMIC); + addrconf_prefix_route(&ifp->peer_addr, 128, + ifp->rt_priority, ifp->idev->dev, + 0, 0, GFP_ATOMIC); break; case RTM_DELADDR: if (ifp->idev->cnf.forwarding)
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit 84b3268027641401bb8ad4427a90a3cce2eb86f5 ]
Userspace might send a batch that is composed of several netlink messages. The netlink_ack() function must use the pointer to the netlink header as base to calculate the bad attribute offset.
Fixes: 2d4bc93368f5 ("netlink: extended ACK reporting") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/netlink/af_netlink.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -2434,7 +2434,7 @@ void netlink_ack(struct sk_buff *in_skb, in_skb->len)) WARN_ON(nla_put_u32(skb, NLMSGERR_ATTR_OFFS, (u8 *)extack->bad_attr - - in_skb->data)); + (u8 *)nlh)); } else { if (extack->cookie_len) WARN_ON(nla_put(skb, NLMSGERR_ATTR_COOKIE,
From: Dmitry Bogdanov dbogdanov@marvell.com
[ Upstream commit 6fc498bc82929ee23aa2f35a828c6178dfd3f823 ]
SCI should be updated, because it contains MAC in its first 6 octets.
Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver") Signed-off-by: Dmitry Bogdanov dbogdanov@marvell.com Signed-off-by: Mark Starovoytov mstarovoitov@marvell.com Signed-off-by: Igor Russkikh irusskikh@marvell.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/macsec.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
--- a/drivers/net/macsec.c +++ b/drivers/net/macsec.c @@ -2882,6 +2882,11 @@ static void macsec_dev_set_rx_mode(struc dev_uc_sync(real_dev, dev); }
+static sci_t dev_to_sci(struct net_device *dev, __be16 port) +{ + return make_sci(dev->dev_addr, port); +} + static int macsec_set_mac_address(struct net_device *dev, void *p) { struct macsec_dev *macsec = macsec_priv(dev); @@ -2903,6 +2908,7 @@ static int macsec_set_mac_address(struct
out: ether_addr_copy(dev->dev_addr, addr->sa_data); + macsec->secy.sci = dev_to_sci(dev, MACSEC_PORT_ES); return 0; }
@@ -3176,11 +3182,6 @@ static bool sci_exists(struct net_device return false; }
-static sci_t dev_to_sci(struct net_device *dev, __be16 port) -{ - return make_sci(dev->dev_addr, port); -} - static int macsec_add_dev(struct net_device *dev, sci_t sci, u8 icv_len) { struct macsec_dev *macsec = macsec_priv(dev);
From: Dan Carpenter dan.carpenter@oracle.com
[ Upstream commit a3aefbfe45751bf7b338c181b97608e276b5bb73 ]
This is similar to commit 674d9de02aa7 ("NFC: Fix possible memory corruption when handling SHDLC I-Frame commands") and commit d7ee81ad09f0 ("NFC: nci: Add some bounds checking in nci_hci_cmd_received()") which added range checks on "pipe".
The "pipe" variable comes skb->data[0] in nfc_hci_msg_rx_work(). It's in the 0-255 range. We're using it as the array index into the hdev->pipes[] array which has NFC_HCI_MAX_PIPES (128) members.
Fixes: 118278f20aa8 ("NFC: hci: Add pipes table to reference them with a tuple {gate, host}") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/nfc/hci/core.c | 19 ++++++++++++++++--- 1 file changed, 16 insertions(+), 3 deletions(-)
--- a/net/nfc/hci/core.c +++ b/net/nfc/hci/core.c @@ -181,13 +181,20 @@ exit: void nfc_hci_cmd_received(struct nfc_hci_dev *hdev, u8 pipe, u8 cmd, struct sk_buff *skb) { - u8 gate = hdev->pipes[pipe].gate; u8 status = NFC_HCI_ANY_OK; struct hci_create_pipe_resp *create_info; struct hci_delete_pipe_noti *delete_info; struct hci_all_pipe_cleared_noti *cleared_info; + u8 gate;
- pr_debug("from gate %x pipe %x cmd %x\n", gate, pipe, cmd); + pr_debug("from pipe %x cmd %x\n", pipe, cmd); + + if (pipe >= NFC_HCI_MAX_PIPES) { + status = NFC_HCI_ANY_E_NOK; + goto exit; + } + + gate = hdev->pipes[pipe].gate;
switch (cmd) { case NFC_HCI_ADM_NOTIFY_PIPE_CREATED: @@ -375,8 +382,14 @@ void nfc_hci_event_received(struct nfc_h struct sk_buff *skb) { int r = 0; - u8 gate = hdev->pipes[pipe].gate; + u8 gate; + + if (pipe >= NFC_HCI_MAX_PIPES) { + pr_err("Discarded event %x to invalid pipe %x\n", event, pipe); + goto exit; + }
+ gate = hdev->pipes[pipe].gate; if (gate == NFC_HCI_INVALID_GATE) { pr_err("Discarded event %x to unopened pipe %x\n", event, pipe); goto exit;
From: Willem de Bruijn willemb@google.com
[ Upstream commit 46e4c421a053c36bf7a33dda2272481bcaf3eed3 ]
In one error case, tpacket_rcv drops packets after incrementing the ring producer index.
If this happens, it does not update tp_status to TP_STATUS_USER and thus the reader is stalled for an iteration of the ring, causing out of order arrival.
The only such error path is when virtio_net_hdr_from_skb fails due to encountering an unknown GSO type.
Signed-off-by: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/packet/af_packet.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-)
--- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -2273,6 +2273,13 @@ static int tpacket_rcv(struct sk_buff *s TP_STATUS_KERNEL, (macoff+snaplen)); if (!h.raw) goto drop_n_account; + + if (do_vnet && + virtio_net_hdr_from_skb(skb, h.raw + macoff - + sizeof(struct virtio_net_hdr), + vio_le(), true, 0)) + goto drop_n_account; + if (po->tp_version <= TPACKET_V2) { packet_increment_rx_head(po, &po->rx_ring); /* @@ -2285,12 +2292,6 @@ static int tpacket_rcv(struct sk_buff *s status |= TP_STATUS_LOSING; }
- if (do_vnet && - virtio_net_hdr_from_skb(skb, h.raw + macoff - - sizeof(struct virtio_net_hdr), - vio_le(), true, 0)) - goto drop_n_account; - po->stats.stats1.tp_packets++; if (copy_skb) { status |= TP_STATUS_COPY;
From: Jonas Gorski jonas.gorski@gmail.com
[ Upstream commit 43de81b0601df7d7988d3f5617ee0987df65c883 ]
719655a14971 ("net: phy: Replace phy driver features u32 with link_mode bitmap") was a bit over-eager and also removed the second phy driver's name, resulting in a nasty OOPS on registration:
[ 1.319854] CPU 0 Unable to handle kernel paging request at virtual address 00000000, epc == 804dd50c, ra == 804dd4f0 [ 1.330859] Oops[#1]: [ 1.333138] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.22 #0 [ 1.339217] $ 0 : 00000000 00000001 87ca7f00 805c1874 [ 1.344590] $ 4 : 00000000 00000047 00585000 8701f800 [ 1.349965] $ 8 : 8701f800 804f4a5c 00000003 64726976 [ 1.355341] $12 : 00000001 00000000 00000000 00000114 [ 1.360718] $16 : 87ca7f80 00000000 00000000 80639fe4 [ 1.366093] $20 : 00000002 00000000 806441d0 80b90000 [ 1.371470] $24 : 00000000 00000000 [ 1.376847] $28 : 87c1e000 87c1fda0 80b90000 804dd4f0 [ 1.382224] Hi : d1c8f8da [ 1.385180] Lo : 5518a480 [ 1.388182] epc : 804dd50c kset_find_obj+0x3c/0x114 [ 1.393345] ra : 804dd4f0 kset_find_obj+0x20/0x114 [ 1.398530] Status: 10008703 KERNEL EXL IE [ 1.402833] Cause : 00800008 (ExcCode 02) [ 1.406952] BadVA : 00000000 [ 1.409913] PrId : 0002a075 (Broadcom BMIPS4350) [ 1.414745] Modules linked in: [ 1.417895] Process swapper/0 (pid: 1, threadinfo=(ptrval), task=(ptrval), tls=00000000) [ 1.426214] Stack : 87cec000 80630000 80639370 80640658 80640000 80049af4 80639fe4 8063a0d8 [ 1.434816] 8063a0d8 802ef078 00000002 00000000 806441d0 80b90000 8063a0d8 802ef114 [ 1.443417] 87cea0de 87c1fde0 00000000 804de488 87cea000 8063a0d8 8063a0d8 80334e48 [ 1.452018] 80640000 8063984c 80639bf4 00000000 8065de48 00000001 8063a0d8 80334ed0 [ 1.460620] 806441d0 80b90000 80b90000 802ef164 8065dd70 80620000 80b90000 8065de58 [ 1.469222] ... [ 1.471734] Call Trace: [ 1.474255] [<804dd50c>] kset_find_obj+0x3c/0x114 [ 1.479141] [<802ef078>] driver_find+0x1c/0x44 [ 1.483665] [<802ef114>] driver_register+0x74/0x148 [ 1.488719] [<80334e48>] phy_driver_register+0x9c/0xd0 [ 1.493968] [<80334ed0>] phy_drivers_register+0x54/0xe8 [ 1.499345] [<8001061c>] do_one_initcall+0x7c/0x1f4 [ 1.504374] [<80644ed8>] kernel_init_freeable+0x1d4/0x2b4 [ 1.509940] [<804f4e24>] kernel_init+0x10/0xf8 [ 1.514502] [<80018e68>] ret_from_kernel_thread+0x14/0x1c [ 1.520040] Code: 1060000c 02202025 90650000 <90810000> 24630001 14250004 24840001 14a0fffb 90650000 [ 1.530061] [ 1.531698] ---[ end trace d52f1717cd29bdc8 ]---
Fix it by readding the name.
Fixes: 719655a14971 ("net: phy: Replace phy driver features u32 with link_mode bitmap") Signed-off-by: Jonas Gorski jonas.gorski@gmail.com Acked-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/phy/bcm63xx.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/phy/bcm63xx.c +++ b/drivers/net/phy/bcm63xx.c @@ -73,6 +73,7 @@ static struct phy_driver bcm63xx_driver[ /* same phy as above, with just a different OUI */ .phy_id = 0x002bdc00, .phy_id_mask = 0xfffffc00, + .name = "Broadcom BCM63XX (2)", /* PHY_BASIC_FEATURES */ .flags = PHY_IS_INTERNAL, .config_init = bcm63xx_config_init,
From: Remi Pommarel repk@triplefau.lt
[ Upstream commit b723bd933980f4956dabc8a8d84b3e83be8d094c ]
ACS (auto PAD/FCS stripping) removes FCS off 802.3 packets (LLC) so that there is no need to manually strip it for such packets. The enhanced DMA descriptors allow to flag LLC packets so that the receiving callback can use that to strip FCS manually or not. On the other hand, normal descriptors do not support that.
Thus in order to not truncate LLC packet ACS should be disabled when using normal DMA descriptors.
Fixes: 47dd7a540b8a0 ("net: add support for STMicroelectronics Ethernet controllers.") Signed-off-by: Remi Pommarel repk@triplefau.lt Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c @@ -24,6 +24,7 @@ static void dwmac1000_core_init(struct mac_device_info *hw, struct net_device *dev) { + struct stmmac_priv *priv = netdev_priv(dev); void __iomem *ioaddr = hw->pcsr; u32 value = readl(ioaddr + GMAC_CONTROL); int mtu = dev->mtu; @@ -35,7 +36,7 @@ static void dwmac1000_core_init(struct m * Broadcom tags can look like invalid LLC/SNAP packets and cause the * hardware to truncate packets on reception. */ - if (netdev_uses_dsa(dev)) + if (netdev_uses_dsa(dev) || !priv->plat->enh_desc) value &= ~GMAC_CONTROL_ACS;
if (mtu > 1500)
From: Colin Ian King colin.king@canonical.com
[ Upstream commit c0368595c1639947839c0db8294ee96aca0b3b86 ]
Currently the bounds check on index is off by one and can lead to an out of bounds access on array priv->filters_loc when index is RXCHK_BRCM_TAG_MAX.
Fixes: bb9051a2b230 ("net: systemport: Add support for WAKE_FILTER") Signed-off-by: Colin Ian King colin.king@canonical.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/bcmsysport.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/broadcom/bcmsysport.c +++ b/drivers/net/ethernet/broadcom/bcmsysport.c @@ -2135,7 +2135,7 @@ static int bcm_sysport_rule_set(struct b return -ENOSPC;
index = find_first_zero_bit(priv->filters, RXCHK_BRCM_TAG_MAX); - if (index > RXCHK_BRCM_TAG_MAX) + if (index >= RXCHK_BRCM_TAG_MAX) return -ENOSPC;
/* Location is the classification ID, and index is the position
From: You-Sheng Yang vicamo.yang@canonical.com
[ Upstream commit d64c7a08034b32c285e576208ae44fc3ba3fa7df ]
Dell USB Type C docking WD19/WD19DC attaches additional peripherals as:
/: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/6p, 5000M |__ Port 1: Dev 11, If 0, Class=Hub, Driver=hub/4p, 5000M |__ Port 3: Dev 12, If 0, Class=Hub, Driver=hub/4p, 5000M |__ Port 4: Dev 13, If 0, Class=Vendor Specific Class, Driver=r8152, 5000M
where usb 2-1-3 is a hub connecting all USB Type-A/C ports on the dock.
When hotplugging such dock with additional usb devices already attached on it, the probing process may reset usb 2.1 port, therefore r8152 ethernet device is also reset. However, during r8152 device init there are several for-loops that, when it's unable to retrieve hardware registers due to being disconnected from USB, may take up to 14 seconds each in practice, and that has to be completed before USB may re-enumerate devices on the bus. As a result, devices attached to the dock will only be available after nearly 1 minute after the dock was plugged in:
[ 216.388290] [250] r8152 2-1.4:1.0: usb_probe_interface [ 216.388292] [250] r8152 2-1.4:1.0: usb_probe_interface - got id [ 258.830410] r8152 2-1.4:1.0 (unnamed net_device) (uninitialized): PHY not ready [ 258.830460] r8152 2-1.4:1.0 (unnamed net_device) (uninitialized): Invalid header when reading pass-thru MAC addr [ 258.830464] r8152 2-1.4:1.0 (unnamed net_device) (uninitialized): Get ether addr fail
This happens in, for example, r8153_init:
static int generic_ocp_read(struct r8152 *tp, u16 index, u16 size, void *data, u16 type) { if (test_bit(RTL8152_UNPLUG, &tp->flags)) return -ENODEV; ... }
static u16 ocp_read_word(struct r8152 *tp, u16 type, u16 index) { u32 data; ... generic_ocp_read(tp, index, sizeof(tmp), &tmp, type | byen);
data = __le32_to_cpu(tmp); ... return (u16)data; }
static void r8153_init(struct r8152 *tp) { ... if (test_bit(RTL8152_UNPLUG, &tp->flags)) return;
for (i = 0; i < 500; i++) { if (ocp_read_word(tp, MCU_TYPE_PLA, PLA_BOOT_CTRL) & AUTOLOAD_DONE) break; msleep(20); } ... }
Since ocp_read_word() doesn't check the return status of generic_ocp_read(), and the only exit condition for the loop is to have a match in the returned value, such loops will only ends after exceeding its maximum runs when the device has been marked as disconnected, which takes 500 * 20ms = 10 seconds in theory, 14 in practice.
To solve this long latency another test to RTL8152_UNPLUG flag should be added after those 20ms sleep to skip unnecessary loops, so that the device probe can complete early and proceed to parent port reset/reprobe process.
This can be reproduced on all kernel versions up to latest v5.6-rc2, but after v5.5-rc7 the reproduce rate is dramatically lowered to 1/30 or less while it was around 1/2.
Signed-off-by: You-Sheng Yang vicamo.yang@canonical.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/usb/r8152.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/drivers/net/usb/r8152.c +++ b/drivers/net/usb/r8152.c @@ -3006,6 +3006,8 @@ static u16 r8153_phy_status(struct r8152 }
msleep(20); + if (test_bit(RTL8152_UNPLUG, &tp->flags)) + break; }
return data; @@ -4419,7 +4421,10 @@ static void r8153_init(struct r8152 *tp) if (ocp_read_word(tp, MCU_TYPE_PLA, PLA_BOOT_CTRL) & AUTOLOAD_DONE) break; + msleep(20); + if (test_bit(RTL8152_UNPLUG, &tp->flags)) + break; }
data = r8153_phy_status(tp, 0); @@ -4545,7 +4550,10 @@ static void r8153b_init(struct r8152 *tp if (ocp_read_word(tp, MCU_TYPE_PLA, PLA_BOOT_CTRL) & AUTOLOAD_DONE) break; + msleep(20); + if (test_bit(RTL8152_UNPLUG, &tp->flags)) + break; }
data = r8153_phy_status(tp, 0);
From: Edward Cree ecree@solarflare.com
[ Upstream commit 4b1bd9db078f7d5332c8601a2f5bd43cf0458fd4 ]
It's a resource, not a parameter, so we can't copy it into the new channel's TX queues, otherwise aliasing will lead to resource- management bugs if the channel is subsequently torn down without being initialised.
Before the Fixes:-tagged commit there was a similar bug with tsoh_page, but I'm not sure it's worth doing another fix for such old kernels.
Fixes: e9117e5099ea ("sfc: Firmware-Assisted TSO version 2") Suggested-by: Derek Shute Derek.Shute@stratus.com Signed-off-by: Edward Cree ecree@solarflare.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/sfc/efx.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/ethernet/sfc/efx.c +++ b/drivers/net/ethernet/sfc/efx.c @@ -519,6 +519,7 @@ efx_copy_channel(const struct efx_channe if (tx_queue->channel) tx_queue->channel = channel; tx_queue->buffer = NULL; + tx_queue->cb_page = NULL; memset(&tx_queue->txd, 0, sizeof(tx_queue->txd)); }
From: Eric Dumazet edumazet@google.com
[ Upstream commit 110a40dfb708fe940a3f3704d470e431c368d256 ]
Before accessing various fields in IPV4 network header and TCP header, make sure the packet :
- Has IP version 4 (ip->version == 4) - Has not a silly network length (ip->ihl >= 5) - Is big enough to hold network and transport headers - Has not a silly TCP header size (th->doff >= sizeof(struct tcphdr) / 4)
syzbot reported :
BUG: KMSAN: uninit-value in slhc_compress+0x5b9/0x2e60 drivers/net/slip/slhc.c:270 CPU: 0 PID: 11728 Comm: syz-executor231 Not tainted 5.6.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x220 lib/dump_stack.c:118 kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118 __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215 slhc_compress+0x5b9/0x2e60 drivers/net/slip/slhc.c:270 ppp_send_frame drivers/net/ppp/ppp_generic.c:1637 [inline] __ppp_xmit_process+0x1902/0x2970 drivers/net/ppp/ppp_generic.c:1495 ppp_xmit_process+0x147/0x2f0 drivers/net/ppp/ppp_generic.c:1516 ppp_write+0x6bb/0x790 drivers/net/ppp/ppp_generic.c:512 do_loop_readv_writev fs/read_write.c:717 [inline] do_iter_write+0x812/0xdc0 fs/read_write.c:1000 compat_writev+0x2df/0x5a0 fs/read_write.c:1351 do_compat_pwritev64 fs/read_write.c:1400 [inline] __do_compat_sys_pwritev fs/read_write.c:1420 [inline] __se_compat_sys_pwritev fs/read_write.c:1414 [inline] __ia32_compat_sys_pwritev+0x349/0x3f0 fs/read_write.c:1414 do_syscall_32_irqs_on arch/x86/entry/common.c:339 [inline] do_fast_syscall_32+0x3c7/0x6e0 arch/x86/entry/common.c:410 entry_SYSENTER_compat+0x68/0x77 arch/x86/entry/entry_64_compat.S:139 RIP: 0023:0xf7f7cd99 Code: 90 e8 0b 00 00 00 f3 90 0f ae e8 eb f9 8d 74 26 00 89 3c 24 c3 90 90 90 90 90 90 90 90 90 90 90 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90 RSP: 002b:00000000ffdb84ac EFLAGS: 00000217 ORIG_RAX: 000000000000014e RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000200001c0 RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000003 RBP: 0000000040047459 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:144 [inline] kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:127 kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:82 slab_alloc_node mm/slub.c:2793 [inline] __kmalloc_node_track_caller+0xb40/0x1200 mm/slub.c:4401 __kmalloc_reserve net/core/skbuff.c:142 [inline] __alloc_skb+0x2fd/0xac0 net/core/skbuff.c:210 alloc_skb include/linux/skbuff.h:1051 [inline] ppp_write+0x115/0x790 drivers/net/ppp/ppp_generic.c:500 do_loop_readv_writev fs/read_write.c:717 [inline] do_iter_write+0x812/0xdc0 fs/read_write.c:1000 compat_writev+0x2df/0x5a0 fs/read_write.c:1351 do_compat_pwritev64 fs/read_write.c:1400 [inline] __do_compat_sys_pwritev fs/read_write.c:1420 [inline] __se_compat_sys_pwritev fs/read_write.c:1414 [inline] __ia32_compat_sys_pwritev+0x349/0x3f0 fs/read_write.c:1414 do_syscall_32_irqs_on arch/x86/entry/common.c:339 [inline] do_fast_syscall_32+0x3c7/0x6e0 arch/x86/entry/common.c:410 entry_SYSENTER_compat+0x68/0x77 arch/x86/entry/entry_64_compat.S:139
Fixes: b5451d783ade ("slip: Move the SLIP drivers") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/slip/slhc.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
--- a/drivers/net/slip/slhc.c +++ b/drivers/net/slip/slhc.c @@ -232,7 +232,7 @@ slhc_compress(struct slcompress *comp, u struct cstate *cs = lcs->next; unsigned long deltaS, deltaA; short changes = 0; - int hlen; + int nlen, hlen; unsigned char new_seq[16]; unsigned char *cp = new_seq; struct iphdr *ip; @@ -248,6 +248,8 @@ slhc_compress(struct slcompress *comp, u return isize;
ip = (struct iphdr *) icp; + if (ip->version != 4 || ip->ihl < 5) + return isize;
/* Bail if this packet isn't TCP, or is an IP fragment */ if (ip->protocol != IPPROTO_TCP || (ntohs(ip->frag_off) & 0x3fff)) { @@ -258,10 +260,14 @@ slhc_compress(struct slcompress *comp, u comp->sls_o_tcp++; return isize; } - /* Extract TCP header */ + nlen = ip->ihl * 4; + if (isize < nlen + sizeof(*th)) + return isize;
- th = (struct tcphdr *)(((unsigned char *)ip) + ip->ihl*4); - hlen = ip->ihl*4 + th->doff*4; + th = (struct tcphdr *)(icp + nlen); + if (th->doff < sizeof(struct tcphdr) / 4) + return isize; + hlen = nlen + th->doff * 4;
/* Bail if the TCP packet isn't `compressible' (i.e., ACK isn't set or * some other control bit is set). Also uncompressible if
From: Vinicius Costa Gomes vinicius.gomes@intel.com
[ Upstream commit b09fe70ef520e011ba4a64f4b93f948a8f14717b ]
There was a bug that was causing packets to be sent to the driver without first calling dequeue() on the "child" qdisc. And the KASAN report below shows that sending a packet without calling dequeue() leads to bad results.
The problem is that when checking the last qdisc "child" we do not set the returned skb to NULL, which can cause it to be sent to the driver, and so after the skb is sent, it may be freed, and in some situations a reference to it may still be in the child qdisc, because it was never dequeued.
The crash log looks like this:
[ 19.937538] ================================================================== [ 19.938300] BUG: KASAN: use-after-free in taprio_dequeue_soft+0x620/0x780 [ 19.938968] Read of size 4 at addr ffff8881128628cc by task swapper/1/0 [ 19.939612] [ 19.939772] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc3+ #97 [ 19.940397] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qe4 [ 19.941523] Call Trace: [ 19.941774] <IRQ> [ 19.941985] dump_stack+0x97/0xe0 [ 19.942323] print_address_description.constprop.0+0x3b/0x60 [ 19.942884] ? taprio_dequeue_soft+0x620/0x780 [ 19.943325] ? taprio_dequeue_soft+0x620/0x780 [ 19.943767] __kasan_report.cold+0x1a/0x32 [ 19.944173] ? taprio_dequeue_soft+0x620/0x780 [ 19.944612] kasan_report+0xe/0x20 [ 19.944954] taprio_dequeue_soft+0x620/0x780 [ 19.945380] __qdisc_run+0x164/0x18d0 [ 19.945749] net_tx_action+0x2c4/0x730 [ 19.946124] __do_softirq+0x268/0x7bc [ 19.946491] irq_exit+0x17d/0x1b0 [ 19.946824] smp_apic_timer_interrupt+0xeb/0x380 [ 19.947280] apic_timer_interrupt+0xf/0x20 [ 19.947687] </IRQ> [ 19.947912] RIP: 0010:default_idle+0x2d/0x2d0 [ 19.948345] Code: 00 00 41 56 41 55 65 44 8b 2d 3f 8d 7c 7c 41 54 55 53 0f 1f 44 00 00 e8 b1 b2 c5 fd e9 07 00 3 [ 19.950166] RSP: 0018:ffff88811a3efda0 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13 [ 19.950909] RAX: 0000000080000000 RBX: ffff88811a3a9600 RCX: ffffffff8385327e [ 19.951608] RDX: 1ffff110234752c0 RSI: 0000000000000000 RDI: ffffffff8385262f [ 19.952309] RBP: ffffed10234752c0 R08: 0000000000000001 R09: ffffed10234752c1 [ 19.953009] R10: ffffed10234752c0 R11: ffff88811a3a9607 R12: 0000000000000001 [ 19.953709] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000 [ 19.954408] ? default_idle_call+0x2e/0x70 [ 19.954816] ? default_idle+0x1f/0x2d0 [ 19.955192] default_idle_call+0x5e/0x70 [ 19.955584] do_idle+0x3d4/0x500 [ 19.955909] ? arch_cpu_idle_exit+0x40/0x40 [ 19.956325] ? _raw_spin_unlock_irqrestore+0x23/0x30 [ 19.956829] ? trace_hardirqs_on+0x30/0x160 [ 19.957242] cpu_startup_entry+0x19/0x20 [ 19.957633] start_secondary+0x2a6/0x380 [ 19.958026] ? set_cpu_sibling_map+0x18b0/0x18b0 [ 19.958486] secondary_startup_64+0xa4/0xb0 [ 19.958921] [ 19.959078] Allocated by task 33: [ 19.959412] save_stack+0x1b/0x80 [ 19.959747] __kasan_kmalloc.constprop.0+0xc2/0xd0 [ 19.960222] kmem_cache_alloc+0xe4/0x230 [ 19.960617] __alloc_skb+0x91/0x510 [ 19.960967] ndisc_alloc_skb+0x133/0x330 [ 19.961358] ndisc_send_ns+0x134/0x810 [ 19.961735] addrconf_dad_work+0xad5/0xf80 [ 19.962144] process_one_work+0x78e/0x13a0 [ 19.962551] worker_thread+0x8f/0xfa0 [ 19.962919] kthread+0x2ba/0x3b0 [ 19.963242] ret_from_fork+0x3a/0x50 [ 19.963596] [ 19.963753] Freed by task 33: [ 19.964055] save_stack+0x1b/0x80 [ 19.964386] __kasan_slab_free+0x12f/0x180 [ 19.964830] kmem_cache_free+0x80/0x290 [ 19.965231] ip6_mc_input+0x38a/0x4d0 [ 19.965617] ipv6_rcv+0x1a4/0x1d0 [ 19.965948] __netif_receive_skb_one_core+0xf2/0x180 [ 19.966437] netif_receive_skb+0x8c/0x3c0 [ 19.966846] br_handle_frame_finish+0x779/0x1310 [ 19.967302] br_handle_frame+0x42a/0x830 [ 19.967694] __netif_receive_skb_core+0xf0e/0x2a90 [ 19.968167] __netif_receive_skb_one_core+0x96/0x180 [ 19.968658] process_backlog+0x198/0x650 [ 19.969047] net_rx_action+0x2fa/0xaa0 [ 19.969420] __do_softirq+0x268/0x7bc [ 19.969785] [ 19.969940] The buggy address belongs to the object at ffff888112862840 [ 19.969940] which belongs to the cache skbuff_head_cache of size 224 [ 19.971202] The buggy address is located 140 bytes inside of [ 19.971202] 224-byte region [ffff888112862840, ffff888112862920) [ 19.972344] The buggy address belongs to the page: [ 19.972820] page:ffffea00044a1800 refcount:1 mapcount:0 mapping:ffff88811a2bd1c0 index:0xffff8881128625c0 compo0 [ 19.973930] flags: 0x8000000000010200(slab|head) [ 19.974388] raw: 8000000000010200 ffff88811a2ed650 ffff88811a2ed650 ffff88811a2bd1c0 [ 19.975151] raw: ffff8881128625c0 0000000000190013 00000001ffffffff 0000000000000000 [ 19.975915] page dumped because: kasan: bad access detected [ 19.976461] page_owner tracks the page as allocated [ 19.976946] page last allocated via order 2, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NO) [ 19.978332] prep_new_page+0x24b/0x330 [ 19.978707] get_page_from_freelist+0x2057/0x2c90 [ 19.979170] __alloc_pages_nodemask+0x218/0x590 [ 19.979619] new_slab+0x9d/0x300 [ 19.979948] ___slab_alloc.constprop.0+0x2f9/0x6f0 [ 19.980421] __slab_alloc.constprop.0+0x30/0x60 [ 19.980870] kmem_cache_alloc+0x201/0x230 [ 19.981269] __alloc_skb+0x91/0x510 [ 19.981620] alloc_skb_with_frags+0x78/0x4a0 [ 19.982043] sock_alloc_send_pskb+0x5eb/0x750 [ 19.982476] unix_stream_sendmsg+0x399/0x7f0 [ 19.982904] sock_sendmsg+0xe2/0x110 [ 19.983262] ____sys_sendmsg+0x4de/0x6d0 [ 19.983660] ___sys_sendmsg+0xe4/0x160 [ 19.984032] __sys_sendmsg+0xab/0x130 [ 19.984396] do_syscall_64+0xe7/0xae0 [ 19.984761] page last free stack trace: [ 19.985142] __free_pages_ok+0x432/0xbc0 [ 19.985533] qlist_free_all+0x56/0xc0 [ 19.985907] quarantine_reduce+0x149/0x170 [ 19.986315] __kasan_kmalloc.constprop.0+0x9e/0xd0 [ 19.986791] kmem_cache_alloc+0xe4/0x230 [ 19.987182] prepare_creds+0x24/0x440 [ 19.987548] do_faccessat+0x80/0x590 [ 19.987906] do_syscall_64+0xe7/0xae0 [ 19.988276] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 19.988775] [ 19.988930] Memory state around the buggy address: [ 19.989402] ffff888112862780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 19.990111] ffff888112862800: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb [ 19.990822] >ffff888112862880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 19.991529] ^ [ 19.992081] ffff888112862900: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc [ 19.992796] ffff888112862980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
Fixes: 5a781ccbd19e ("tc: Add support for configuring the taprio scheduler") Reported-by: Michael Schmidt michael.schmidt@eti.uni-siegen.de Signed-off-by: Vinicius Costa Gomes vinicius.gomes@intel.com Acked-by: Andre Guedes andre.guedes@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_taprio.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
--- a/net/sched/sch_taprio.c +++ b/net/sched/sch_taprio.c @@ -564,8 +564,10 @@ static struct sk_buff *taprio_dequeue_so prio = skb->priority; tc = netdev_get_prio_tc_map(dev, prio);
- if (!(gate_mask & BIT(tc))) + if (!(gate_mask & BIT(tc))) { + skb = NULL; continue; + }
len = qdisc_pkt_len(skb); guard = ktime_add_ns(taprio_get_time(q), @@ -575,13 +577,17 @@ static struct sk_buff *taprio_dequeue_so * guard band ... */ if (gate_mask != TAPRIO_ALL_GATES_OPEN && - ktime_after(guard, entry->close_time)) + ktime_after(guard, entry->close_time)) { + skb = NULL; continue; + }
/* ... and no budget. */ if (gate_mask != TAPRIO_ALL_GATES_OPEN && - atomic_sub_return(len, &entry->budget) < 0) + atomic_sub_return(len, &entry->budget) < 0) { + skb = NULL; continue; + }
skb = child->ops->dequeue(child); if (unlikely(!skb))
From: Eric Dumazet edumazet@google.com
commit b7469e83d2add567e4e0b063963db185f3167cea upstream.
Similar to commit 38f88c454042 ("bonding/alb: properly access headers in bond_alb_xmit()"), we need to make sure arp header was pulled in skb->head before blindly accessing it in rlb_arp_xmit().
Remove arp_pkt() private helper, since it is more readable/obvious to have the following construct back to back :
if (!pskb_network_may_pull(skb, sizeof(*arp))) return NULL; arp = (struct arp_pkt *)skb_network_header(skb);
syzbot reported :
BUG: KMSAN: uninit-value in bond_slave_has_mac_rx include/net/bonding.h:704 [inline] BUG: KMSAN: uninit-value in rlb_arp_xmit drivers/net/bonding/bond_alb.c:662 [inline] BUG: KMSAN: uninit-value in bond_alb_xmit+0x575/0x25e0 drivers/net/bonding/bond_alb.c:1477 CPU: 0 PID: 12743 Comm: syz-executor.4 Not tainted 5.6.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x220 lib/dump_stack.c:118 kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118 __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215 bond_slave_has_mac_rx include/net/bonding.h:704 [inline] rlb_arp_xmit drivers/net/bonding/bond_alb.c:662 [inline] bond_alb_xmit+0x575/0x25e0 drivers/net/bonding/bond_alb.c:1477 __bond_start_xmit drivers/net/bonding/bond_main.c:4257 [inline] bond_start_xmit+0x85d/0x2f70 drivers/net/bonding/bond_main.c:4282 __netdev_start_xmit include/linux/netdevice.h:4524 [inline] netdev_start_xmit include/linux/netdevice.h:4538 [inline] xmit_one net/core/dev.c:3470 [inline] dev_hard_start_xmit+0x531/0xab0 net/core/dev.c:3486 __dev_queue_xmit+0x37de/0x4220 net/core/dev.c:4063 dev_queue_xmit+0x4b/0x60 net/core/dev.c:4096 packet_snd net/packet/af_packet.c:2967 [inline] packet_sendmsg+0x8347/0x93b0 net/packet/af_packet.c:2992 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg net/socket.c:672 [inline] __sys_sendto+0xc1b/0xc50 net/socket.c:1998 __do_sys_sendto net/socket.c:2010 [inline] __se_sys_sendto+0x107/0x130 net/socket.c:2006 __x64_sys_sendto+0x6e/0x90 net/socket.c:2006 do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x45c479 Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fc77ffbbc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00007fc77ffbc6d4 RCX: 000000000045c479 RDX: 000000000000000e RSI: 00000000200004c0 RDI: 0000000000000003 RBP: 000000000076bf20 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 0000000000000a04 R14: 00000000004cc7b0 R15: 000000000076bf2c
Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:144 [inline] kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:127 kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:82 slab_alloc_node mm/slub.c:2793 [inline] __kmalloc_node_track_caller+0xb40/0x1200 mm/slub.c:4401 __kmalloc_reserve net/core/skbuff.c:142 [inline] __alloc_skb+0x2fd/0xac0 net/core/skbuff.c:210 alloc_skb include/linux/skbuff.h:1051 [inline] alloc_skb_with_frags+0x18c/0xa70 net/core/skbuff.c:5766 sock_alloc_send_pskb+0xada/0xc60 net/core/sock.c:2242 packet_alloc_skb net/packet/af_packet.c:2815 [inline] packet_snd net/packet/af_packet.c:2910 [inline] packet_sendmsg+0x66a0/0x93b0 net/packet/af_packet.c:2992 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg net/socket.c:672 [inline] __sys_sendto+0xc1b/0xc50 net/socket.c:1998 __do_sys_sendto net/socket.c:2010 [inline] __se_sys_sendto+0x107/0x130 net/socket.c:2006 __x64_sys_sendto+0x6e/0x90 net/socket.c:2006 do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Cc: Jay Vosburgh j.vosburgh@gmail.com Cc: Veaceslav Falico vfalico@gmail.com Cc: Andy Gospodarek andy@greyhouse.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/bonding/bond_alb.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-)
--- a/drivers/net/bonding/bond_alb.c +++ b/drivers/net/bonding/bond_alb.c @@ -50,11 +50,6 @@ struct arp_pkt { }; #pragma pack()
-static inline struct arp_pkt *arp_pkt(const struct sk_buff *skb) -{ - return (struct arp_pkt *)skb_network_header(skb); -} - /* Forward declaration */ static void alb_send_learning_packets(struct slave *slave, u8 mac_addr[], bool strict_match); @@ -553,10 +548,11 @@ static void rlb_req_update_subnet_client spin_unlock(&bond->mode_lock); }
-static struct slave *rlb_choose_channel(struct sk_buff *skb, struct bonding *bond) +static struct slave *rlb_choose_channel(struct sk_buff *skb, + struct bonding *bond, + const struct arp_pkt *arp) { struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond)); - struct arp_pkt *arp = arp_pkt(skb); struct slave *assigned_slave, *curr_active_slave; struct rlb_client_info *client_info; u32 hash_index = 0; @@ -653,8 +649,12 @@ static struct slave *rlb_choose_channel( */ static struct slave *rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond) { - struct arp_pkt *arp = arp_pkt(skb); struct slave *tx_slave = NULL; + struct arp_pkt *arp; + + if (!pskb_network_may_pull(skb, sizeof(*arp))) + return NULL; + arp = (struct arp_pkt *)skb_network_header(skb);
/* Don't modify or load balance ARPs that do not originate locally * (e.g.,arrive via a bridge). @@ -664,7 +664,7 @@ static struct slave *rlb_arp_xmit(struct
if (arp->op_code == htons(ARPOP_REPLY)) { /* the arp must be sent on the selected rx channel */ - tx_slave = rlb_choose_channel(skb, bond); + tx_slave = rlb_choose_channel(skb, bond, arp); if (tx_slave) bond_hw_addr_copy(arp->mac_src, tx_slave->dev->dev_addr, tx_slave->dev->addr_len); @@ -676,7 +676,7 @@ static struct slave *rlb_arp_xmit(struct * When the arp reply is received the entry will be updated * with the correct unicast address of the client. */ - tx_slave = rlb_choose_channel(skb, bond); + tx_slave = rlb_choose_channel(skb, bond, arp);
/* The ARP reply packets must be delayed so that * they can cancel out the influence of the ARP request.
From: Vasundhara Volam vasundhara-v.volam@broadcom.com
[ Upstream commit a9b952d267e59a3b405e644930f46d252cea7122 ]
MTU changes may affect the number of IRQs so we must call bnxt_close_nic()/bnxt_open_nic() with the irq_re_init parameter set to true. The reason is that a larger MTU may require aggregation rings not needed with smaller MTU. We may not be able to allocate the required number of aggregation rings and so we reduce the number of channels which will change the number of IRQs. Without this patch, it may crash eventually in pci_disable_msix() when the IRQs are not properly unwound.
Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.") Signed-off-by: Vasundhara Volam vasundhara-v.volam@broadcom.com Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -10891,13 +10891,13 @@ static int bnxt_change_mtu(struct net_de struct bnxt *bp = netdev_priv(dev);
if (netif_running(dev)) - bnxt_close_nic(bp, false, false); + bnxt_close_nic(bp, true, false);
dev->mtu = new_mtu; bnxt_set_ring_params(bp);
if (netif_running(dev)) - return bnxt_open_nic(bp, false, false); + return bnxt_open_nic(bp, true, false);
return 0; }
From: Edwin Peer edwin.peer@broadcom.com
[ Upstream commit 22630e28f9c2b55abd217869cc0696def89f2284 ]
After bnxt_hwrm_do_send_message() was updated to return standard error codes in a recent commit, a regression in bnxt_flash_package_from_file() was introduced. The return value does not properly reflect all possible firmware errors when calling firmware to flash the package.
Fix it by consolidating all errors in one local variable rc instead of having 2 variables for different errors.
Fixes: d4f1420d3656 ("bnxt_en: Convert error code in firmware message response to standard code.") Signed-off-by: Edwin Peer edwin.peer@broadcom.com Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 24 ++++++++++------------ 1 file changed, 11 insertions(+), 13 deletions(-)
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c @@ -2005,8 +2005,8 @@ static int bnxt_flash_package_from_file( struct hwrm_nvm_install_update_output *resp = bp->hwrm_cmd_resp_addr; struct hwrm_nvm_install_update_input install = {0}; const struct firmware *fw; - int rc, hwrm_err = 0; u32 item_len; + int rc = 0; u16 index;
bnxt_hwrm_fw_set_time(bp); @@ -2050,15 +2050,14 @@ static int bnxt_flash_package_from_file( memcpy(kmem, fw->data, fw->size); modify.host_src_addr = cpu_to_le64(dma_handle);
- hwrm_err = hwrm_send_message(bp, &modify, - sizeof(modify), - FLASH_PACKAGE_TIMEOUT); + rc = hwrm_send_message(bp, &modify, sizeof(modify), + FLASH_PACKAGE_TIMEOUT); dma_free_coherent(&bp->pdev->dev, fw->size, kmem, dma_handle); } } release_firmware(fw); - if (rc || hwrm_err) + if (rc) goto err_exit;
if ((install_type & 0xffff) == 0) @@ -2067,20 +2066,19 @@ static int bnxt_flash_package_from_file( install.install_type = cpu_to_le32(install_type);
mutex_lock(&bp->hwrm_cmd_lock); - hwrm_err = _hwrm_send_message(bp, &install, sizeof(install), - INSTALL_PACKAGE_TIMEOUT); - if (hwrm_err) { + rc = _hwrm_send_message(bp, &install, sizeof(install), + INSTALL_PACKAGE_TIMEOUT); + if (rc) { u8 error_code = ((struct hwrm_err_output *)resp)->cmd_err;
if (resp->error_code && error_code == NVM_INSTALL_UPDATE_CMD_ERR_CODE_FRAG_ERR) { install.flags |= cpu_to_le16( NVM_INSTALL_UPDATE_REQ_FLAGS_ALLOWED_TO_DEFRAG); - hwrm_err = _hwrm_send_message(bp, &install, - sizeof(install), - INSTALL_PACKAGE_TIMEOUT); + rc = _hwrm_send_message(bp, &install, sizeof(install), + INSTALL_PACKAGE_TIMEOUT); } - if (hwrm_err) + if (rc) goto flash_pkg_exit; }
@@ -2092,7 +2090,7 @@ static int bnxt_flash_package_from_file( flash_pkg_exit: mutex_unlock(&bp->hwrm_cmd_lock); err_exit: - if (hwrm_err == -EACCES) + if (rc == -EACCES) bnxt_print_admin_err(bp); return rc; }
From: Shakeel Butt shakeelb@google.com
[ Upstream commit e876ecc67db80dfdb8e237f71e5b43bb88ae549c ]
We are testing network memory accounting in our setup and noticed inconsistent network memory usage and often unrelated cgroups network usage correlates with testing workload. On further inspection, it seems like mem_cgroup_sk_alloc() and cgroup_sk_alloc() are broken in irq context specially for cgroup v1.
mem_cgroup_sk_alloc() and cgroup_sk_alloc() can be called in irq context and kind of assumes that this can only happen from sk_clone_lock() and the source sock object has already associated cgroup. However in cgroup v1, where network memory accounting is opt-in, the source sock can be unassociated with any cgroup and the new cloned sock can get associated with unrelated interrupted cgroup.
Cgroup v2 can also suffer if the source sock object was created by process in the root cgroup or if sk_alloc() is called in irq context. The fix is to just do nothing in interrupt.
WARNING: Please note that about half of the TCP sockets are allocated from the IRQ context, so, memory used by such sockets will not be accouted by the memcg.
The stack trace of mem_cgroup_sk_alloc() from IRQ-context:
CPU: 70 PID: 12720 Comm: ssh Tainted: 5.6.0-smp-DEV #1 Hardware name: ... Call Trace: <IRQ> dump_stack+0x57/0x75 mem_cgroup_sk_alloc+0xe9/0xf0 sk_clone_lock+0x2a7/0x420 inet_csk_clone_lock+0x1b/0x110 tcp_create_openreq_child+0x23/0x3b0 tcp_v6_syn_recv_sock+0x88/0x730 tcp_check_req+0x429/0x560 tcp_v6_rcv+0x72d/0xa40 ip6_protocol_deliver_rcu+0xc9/0x400 ip6_input+0x44/0xd0 ? ip6_protocol_deliver_rcu+0x400/0x400 ip6_rcv_finish+0x71/0x80 ipv6_rcv+0x5b/0xe0 ? ip6_sublist_rcv+0x2e0/0x2e0 process_backlog+0x108/0x1e0 net_rx_action+0x26b/0x460 __do_softirq+0x104/0x2a6 do_softirq_own_stack+0x2a/0x40 </IRQ> do_softirq.part.19+0x40/0x50 __local_bh_enable_ip+0x51/0x60 ip6_finish_output2+0x23d/0x520 ? ip6table_mangle_hook+0x55/0x160 __ip6_finish_output+0xa1/0x100 ip6_finish_output+0x30/0xd0 ip6_output+0x73/0x120 ? __ip6_finish_output+0x100/0x100 ip6_xmit+0x2e3/0x600 ? ipv6_anycast_cleanup+0x50/0x50 ? inet6_csk_route_socket+0x136/0x1e0 ? skb_free_head+0x1e/0x30 inet6_csk_xmit+0x95/0xf0 __tcp_transmit_skb+0x5b4/0xb20 __tcp_send_ack.part.60+0xa3/0x110 tcp_send_ack+0x1d/0x20 tcp_rcv_state_process+0xe64/0xe80 ? tcp_v6_connect+0x5d1/0x5f0 tcp_v6_do_rcv+0x1b1/0x3f0 ? tcp_v6_do_rcv+0x1b1/0x3f0 __release_sock+0x7f/0xd0 release_sock+0x30/0xa0 __inet_stream_connect+0x1c3/0x3b0 ? prepare_to_wait+0xb0/0xb0 inet_stream_connect+0x3b/0x60 __sys_connect+0x101/0x120 ? __sys_getsockopt+0x11b/0x140 __x64_sys_connect+0x1a/0x20 do_syscall_64+0x51/0x200 entry_SYSCALL_64_after_hwframe+0x44/0xa9
The stack trace of mem_cgroup_sk_alloc() from IRQ-context: Fixes: 2d7580738345 ("mm: memcontrol: consolidate cgroup socket tracking") Fixes: d979a39d7242 ("cgroup: duplicate cgroup reference when cloning sockets") Signed-off-by: Shakeel Butt shakeelb@google.com Reviewed-by: Roman Gushchin guro@fb.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/cgroup/cgroup.c | 4 ++++ mm/memcontrol.c | 4 ++++ 2 files changed, 8 insertions(+)
--- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -6381,6 +6381,10 @@ void cgroup_sk_alloc(struct sock_cgroup_ return; }
+ /* Don't associate the sock with unrelated interrupted task's cgroup. */ + if (in_interrupt()) + return; + rcu_read_lock();
while (true) { --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -6806,6 +6806,10 @@ void mem_cgroup_sk_alloc(struct sock *sk return; }
+ /* Do not associate the sock with unrelated interrupted task's memcg. */ + if (in_interrupt()) + return; + rcu_read_lock(); memcg = mem_cgroup_from_task(current); if (memcg == root_mem_cgroup)
From: Shakeel Butt shakeelb@google.com
[ Upstream commit d752a4986532cb6305dfd5290a614cde8072769d ]
If a TCP socket is allocated in IRQ context or cloned from unassociated (i.e. not associated to a memcg) in IRQ context then it will remain unassociated for its whole life. Almost half of the TCPs created on the system are created in IRQ context, so, memory used by such sockets will not be accounted by the memcg.
This issue is more widespread in cgroup v1 where network memory accounting is opt-in but it can happen in cgroup v2 if the source socket for the cloning was created in root memcg.
To fix the issue, just do the association of the sockets at the accept() time in the process context and then force charge the memory buffer already used and reserved by the socket.
Signed-off-by: Shakeel Butt shakeelb@google.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/memcontrol.c | 14 -------------- net/core/sock.c | 5 ++++- net/ipv4/inet_connection_sock.c | 20 ++++++++++++++++++++ 3 files changed, 24 insertions(+), 15 deletions(-)
--- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -6792,20 +6792,6 @@ void mem_cgroup_sk_alloc(struct sock *sk if (!mem_cgroup_sockets_enabled) return;
- /* - * Socket cloning can throw us here with sk_memcg already - * filled. It won't however, necessarily happen from - * process context. So the test for root memcg given - * the current task's memcg won't help us in this case. - * - * Respecting the original socket's memcg is a better - * decision in this case. - */ - if (sk->sk_memcg) { - css_get(&sk->sk_memcg->css); - return; - } - /* Do not associate the sock with unrelated interrupted task's memcg. */ if (in_interrupt()) return; --- a/net/core/sock.c +++ b/net/core/sock.c @@ -1832,7 +1832,10 @@ struct sock *sk_clone_lock(const struct atomic_set(&newsk->sk_zckey, 0);
sock_reset_flag(newsk, SOCK_DONE); - mem_cgroup_sk_alloc(newsk); + + /* sk->sk_memcg will be populated at accept() time */ + newsk->sk_memcg = NULL; + cgroup_sk_alloc(&newsk->sk_cgrp_data);
rcu_read_lock(); --- a/net/ipv4/inet_connection_sock.c +++ b/net/ipv4/inet_connection_sock.c @@ -482,6 +482,26 @@ struct sock *inet_csk_accept(struct sock } spin_unlock_bh(&queue->fastopenq.lock); } + + if (mem_cgroup_sockets_enabled) { + int amt; + + /* atomically get the memory usage, set and charge the + * sk->sk_memcg. + */ + lock_sock(newsk); + + /* The sk has not been accepted yet, no need to look at + * sk->sk_wmem_queued. + */ + amt = sk_mem_pages(newsk->sk_forward_alloc + + atomic_read(&sk->sk_rmem_alloc)); + mem_cgroup_sk_alloc(newsk); + if (newsk->sk_memcg && amt) + mem_cgroup_charge_skmem(newsk->sk_memcg, amt); + + release_sock(newsk); + } out: release_sock(sk); if (req)
From: Eric Dumazet edumazet@google.com
commit 06669ea346e476a5339033d77ef175566a40efbb upstream.
Locking newsk while still holding the listener lock triggered a lockdep splat [1]
We can simply move the memcg code after we release the listener lock, as this can also help if multiple threads are sharing a common listener.
Also fix a typo while reading socket sk_rmem_alloc.
[1] WARNING: possible recursive locking detected 5.6.0-rc3-syzkaller #0 Not tainted -------------------------------------------- syz-executor598/9524 is trying to acquire lock: ffff88808b5b8b90 (sk_lock-AF_INET6){+.+.}, at: lock_sock include/net/sock.h:1541 [inline] ffff88808b5b8b90 (sk_lock-AF_INET6){+.+.}, at: inet_csk_accept+0x69f/0xd30 net/ipv4/inet_connection_sock.c:492
but task is already holding lock: ffff88808b5b9590 (sk_lock-AF_INET6){+.+.}, at: lock_sock include/net/sock.h:1541 [inline] ffff88808b5b9590 (sk_lock-AF_INET6){+.+.}, at: inet_csk_accept+0x8d/0xd30 net/ipv4/inet_connection_sock.c:445
other info that might help us debug this: Possible unsafe locking scenario:
CPU0 ---- lock(sk_lock-AF_INET6); lock(sk_lock-AF_INET6);
*** DEADLOCK ***
May be due to missing lock nesting notation
1 lock held by syz-executor598/9524: #0: ffff88808b5b9590 (sk_lock-AF_INET6){+.+.}, at: lock_sock include/net/sock.h:1541 [inline] #0: ffff88808b5b9590 (sk_lock-AF_INET6){+.+.}, at: inet_csk_accept+0x8d/0xd30 net/ipv4/inet_connection_sock.c:445
stack backtrace: CPU: 0 PID: 9524 Comm: syz-executor598 Not tainted 5.6.0-rc3-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x188/0x20d lib/dump_stack.c:118 print_deadlock_bug kernel/locking/lockdep.c:2370 [inline] check_deadlock kernel/locking/lockdep.c:2411 [inline] validate_chain kernel/locking/lockdep.c:2954 [inline] __lock_acquire.cold+0x114/0x288 kernel/locking/lockdep.c:3954 lock_acquire+0x197/0x420 kernel/locking/lockdep.c:4484 lock_sock_nested+0xc5/0x110 net/core/sock.c:2947 lock_sock include/net/sock.h:1541 [inline] inet_csk_accept+0x69f/0xd30 net/ipv4/inet_connection_sock.c:492 inet_accept+0xe9/0x7c0 net/ipv4/af_inet.c:734 __sys_accept4_file+0x3ac/0x5b0 net/socket.c:1758 __sys_accept4+0x53/0x90 net/socket.c:1809 __do_sys_accept4 net/socket.c:1821 [inline] __se_sys_accept4 net/socket.c:1818 [inline] __x64_sys_accept4+0x93/0xf0 net/socket.c:1818 do_syscall_64+0xf6/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x4445c9 Code: e8 0c 0d 03 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007ffc35b37608 EFLAGS: 00000246 ORIG_RAX: 0000000000000120 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004445c9 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003 RBP: 0000000000000000 R08: 0000000000306777 R09: 0000000000306777 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00000000004053d0 R14: 0000000000000000 R15: 0000000000000000
Fixes: d752a4986532 ("net: memcg: late association of sock to memcg") Signed-off-by: Eric Dumazet edumazet@google.com Cc: Shakeel Butt shakeelb@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/inet_connection_sock.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
--- a/net/ipv4/inet_connection_sock.c +++ b/net/ipv4/inet_connection_sock.c @@ -483,27 +483,27 @@ struct sock *inet_csk_accept(struct sock spin_unlock_bh(&queue->fastopenq.lock); }
- if (mem_cgroup_sockets_enabled) { +out: + release_sock(sk); + if (newsk && mem_cgroup_sockets_enabled) { int amt;
/* atomically get the memory usage, set and charge the - * sk->sk_memcg. + * newsk->sk_memcg. */ lock_sock(newsk);
- /* The sk has not been accepted yet, no need to look at - * sk->sk_wmem_queued. + /* The socket has not been accepted yet, no need to look at + * newsk->sk_wmem_queued. */ amt = sk_mem_pages(newsk->sk_forward_alloc + - atomic_read(&sk->sk_rmem_alloc)); + atomic_read(&newsk->sk_rmem_alloc)); mem_cgroup_sk_alloc(newsk); if (newsk->sk_memcg && amt) mem_cgroup_charge_skmem(newsk->sk_memcg, amt);
release_sock(newsk); } -out: - release_sock(sk); if (req) reqsk_put(req); return newsk;
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit 8750939b6ad86abc3f53ec8a9683a1cded4a5654 ]
DEVLINK_ATTR_PARAM_VALUE_DATA may have different types so it's not checked by the normal netlink policy. Make sure the attribute length is what we expect.
Fixes: e3b7ca18ad7b ("devlink: Add param set command") Signed-off-by: Jakub Kicinski kuba@kernel.org Reviewed-by: Jiri Pirko jiri@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/devlink.c | 31 +++++++++++++++++++------------ 1 file changed, 19 insertions(+), 12 deletions(-)
--- a/net/core/devlink.c +++ b/net/core/devlink.c @@ -3222,34 +3222,41 @@ devlink_param_value_get_from_info(const struct genl_info *info, union devlink_param_value *value) { + struct nlattr *param_data; int len;
- if (param->type != DEVLINK_PARAM_TYPE_BOOL && - !info->attrs[DEVLINK_ATTR_PARAM_VALUE_DATA]) + param_data = info->attrs[DEVLINK_ATTR_PARAM_VALUE_DATA]; + + if (param->type != DEVLINK_PARAM_TYPE_BOOL && !param_data) return -EINVAL;
switch (param->type) { case DEVLINK_PARAM_TYPE_U8: - value->vu8 = nla_get_u8(info->attrs[DEVLINK_ATTR_PARAM_VALUE_DATA]); + if (nla_len(param_data) != sizeof(u8)) + return -EINVAL; + value->vu8 = nla_get_u8(param_data); break; case DEVLINK_PARAM_TYPE_U16: - value->vu16 = nla_get_u16(info->attrs[DEVLINK_ATTR_PARAM_VALUE_DATA]); + if (nla_len(param_data) != sizeof(u16)) + return -EINVAL; + value->vu16 = nla_get_u16(param_data); break; case DEVLINK_PARAM_TYPE_U32: - value->vu32 = nla_get_u32(info->attrs[DEVLINK_ATTR_PARAM_VALUE_DATA]); + if (nla_len(param_data) != sizeof(u32)) + return -EINVAL; + value->vu32 = nla_get_u32(param_data); break; case DEVLINK_PARAM_TYPE_STRING: - len = strnlen(nla_data(info->attrs[DEVLINK_ATTR_PARAM_VALUE_DATA]), - nla_len(info->attrs[DEVLINK_ATTR_PARAM_VALUE_DATA])); - if (len == nla_len(info->attrs[DEVLINK_ATTR_PARAM_VALUE_DATA]) || + len = strnlen(nla_data(param_data), nla_len(param_data)); + if (len == nla_len(param_data) || len >= __DEVLINK_PARAM_MAX_STRING_VALUE) return -EINVAL; - strcpy(value->vstr, - nla_data(info->attrs[DEVLINK_ATTR_PARAM_VALUE_DATA])); + strcpy(value->vstr, nla_data(param_data)); break; case DEVLINK_PARAM_TYPE_BOOL: - value->vbool = info->attrs[DEVLINK_ATTR_PARAM_VALUE_DATA] ? - true : false; + if (param_data && nla_len(param_data)) + return -EINVAL; + value->vbool = nla_get_flag(param_data); break; } return 0;
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit ff3b63b8c299b73ac599b120653b47e275407656 ]
DEVLINK_ATTR_REGION_CHUNK_ADDR and DEVLINK_ATTR_REGION_CHUNK_LEN lack entries in the netlink policy. Corresponding nla_get_u64()s may read beyond the end of the message.
Fixes: 4e54795a27f5 ("devlink: Add support for region snapshot read command") Signed-off-by: Jakub Kicinski kuba@kernel.org Reviewed-by: Jiri Pirko jiri@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/devlink.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/net/core/devlink.c +++ b/net/core/devlink.c @@ -5804,6 +5804,8 @@ static const struct nla_policy devlink_n [DEVLINK_ATTR_PARAM_VALUE_CMODE] = { .type = NLA_U8 }, [DEVLINK_ATTR_REGION_NAME] = { .type = NLA_NUL_STRING }, [DEVLINK_ATTR_REGION_SNAPSHOT_ID] = { .type = NLA_U32 }, + [DEVLINK_ATTR_REGION_CHUNK_ADDR] = { .type = NLA_U64 }, + [DEVLINK_ATTR_REGION_CHUNK_LEN] = { .type = NLA_U64 }, [DEVLINK_ATTR_HEALTH_REPORTER_NAME] = { .type = NLA_NUL_STRING }, [DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD] = { .type = NLA_U64 }, [DEVLINK_ATTR_HEALTH_REPORTER_AUTO_RECOVER] = { .type = NLA_U8 },
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit 4c16d64ea04056f1b1b324ab6916019f6a064114 ]
Add missing netlink policy entry for FRA_TUN_ID.
Fixes: e7030878fc84 ("fib: Add fib rule match on tunnel id") Signed-off-by: Jakub Kicinski kuba@kernel.org Reviewed-by: David Ahern dsahern@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/fib_rules.h | 1 + 1 file changed, 1 insertion(+)
--- a/include/net/fib_rules.h +++ b/include/net/fib_rules.h @@ -108,6 +108,7 @@ struct fib_rule_notifier_info { [FRA_OIFNAME] = { .type = NLA_STRING, .len = IFNAMSIZ - 1 }, \ [FRA_PRIORITY] = { .type = NLA_U32 }, \ [FRA_FWMARK] = { .type = NLA_U32 }, \ + [FRA_TUN_ID] = { .type = NLA_U64 }, \ [FRA_FWMASK] = { .type = NLA_U32 }, \ [FRA_TABLE] = { .type = NLA_U32 }, \ [FRA_SUPPRESS_PREFIXLEN] = { .type = NLA_U32 }, \
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit 9322cd7c4af2ccc7fe7c5f01adb53f4f77949e92 ]
Add missing attribute validation for several u8 types.
Fixes: 2c21d11518b6 ("net: add NL802154 interface for configuration of 802.15.4 devices") Signed-off-by: Jakub Kicinski kuba@kernel.org Acked-by: Stefan Schmidt stefan@datenfreihafen.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ieee802154/nl_policy.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/net/ieee802154/nl_policy.c +++ b/net/ieee802154/nl_policy.c @@ -21,6 +21,11 @@ const struct nla_policy ieee802154_polic [IEEE802154_ATTR_HW_ADDR] = { .type = NLA_HW_ADDR, }, [IEEE802154_ATTR_PAN_ID] = { .type = NLA_U16, }, [IEEE802154_ATTR_CHANNEL] = { .type = NLA_U8, }, + [IEEE802154_ATTR_BCN_ORD] = { .type = NLA_U8, }, + [IEEE802154_ATTR_SF_ORD] = { .type = NLA_U8, }, + [IEEE802154_ATTR_PAN_COORD] = { .type = NLA_U8, }, + [IEEE802154_ATTR_BAT_EXT] = { .type = NLA_U8, }, + [IEEE802154_ATTR_COORD_REALIGN] = { .type = NLA_U8, }, [IEEE802154_ATTR_PAGE] = { .type = NLA_U8, }, [IEEE802154_ATTR_COORD_SHORT_ADDR] = { .type = NLA_U16, }, [IEEE802154_ATTR_COORD_HW_ADDR] = { .type = NLA_HW_ADDR, },
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit b60673c4c418bef7550d02faf53c34fbfeb366bf ]
Add missing attribute type validation for IEEE802154_ATTR_DEV_TYPE to the netlink policy.
Fixes: 90c049b2c6ae ("ieee802154: interface type to be added") Signed-off-by: Jakub Kicinski kuba@kernel.org Acked-by: Stefan Schmidt stefan@datenfreihafen.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ieee802154/nl_policy.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/ieee802154/nl_policy.c +++ b/net/ieee802154/nl_policy.c @@ -27,6 +27,7 @@ const struct nla_policy ieee802154_polic [IEEE802154_ATTR_BAT_EXT] = { .type = NLA_U8, }, [IEEE802154_ATTR_COORD_REALIGN] = { .type = NLA_U8, }, [IEEE802154_ATTR_PAGE] = { .type = NLA_U8, }, + [IEEE802154_ATTR_DEV_TYPE] = { .type = NLA_U8, }, [IEEE802154_ATTR_COORD_SHORT_ADDR] = { .type = NLA_U16, }, [IEEE802154_ATTR_COORD_HW_ADDR] = { .type = NLA_HW_ADDR, }, [IEEE802154_ATTR_COORD_PAN_ID] = { .type = NLA_U16, },
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit ab02ad660586b94f5d08912a3952b939cf4c4430 ]
Add missing attribute validation for IFLA_CAN_TERMINATION to the netlink policy.
Fixes: 12a6075cabc0 ("can: dev: add CAN interface termination API") Signed-off-by: Jakub Kicinski kuba@kernel.org Acked-by: Oliver Hartkopp socketcan@hartkopp.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/dev.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/can/dev.c +++ b/drivers/net/can/dev.c @@ -884,6 +884,7 @@ static const struct nla_policy can_polic = { .len = sizeof(struct can_bittiming) }, [IFLA_CAN_DATA_BITTIMING_CONST] = { .len = sizeof(struct can_bittiming_const) }, + [IFLA_CAN_TERMINATION] = { .type = NLA_U16 }, };
static int can_validate(struct nlattr *tb[], struct nlattr *data[],
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit 31d9a1c524964bac77b7f9d0a1ac140dc6b57461 ]
Add missing attribute validation for IFLA_MACSEC_PORT to the netlink policy.
Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver") Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/macsec.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/macsec.c +++ b/drivers/net/macsec.c @@ -2983,6 +2983,7 @@ static const struct device_type macsec_t
static const struct nla_policy macsec_rtnl_policy[IFLA_MACSEC_MAX + 1] = { [IFLA_MACSEC_SCI] = { .type = NLA_U64 }, + [IFLA_MACSEC_PORT] = { .type = NLA_U16 }, [IFLA_MACSEC_ICV_LEN] = { .type = NLA_U8 }, [IFLA_MACSEC_CIPHER_SUITE] = { .type = NLA_U64 }, [IFLA_MACSEC_WINDOW] = { .type = NLA_U32 },
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit 7e6dc03eeb023e18427a373522f1d247b916a641 ]
Add missing attribute validation for TCA_FQ_ORPHAN_MASK to the netlink policy.
Fixes: 06eb395fa985 ("pkt_sched: fq: better control of DDOS traffic") Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_fq.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/sched/sch_fq.c +++ b/net/sched/sch_fq.c @@ -745,6 +745,7 @@ static const struct nla_policy fq_policy [TCA_FQ_FLOW_MAX_RATE] = { .type = NLA_U32 }, [TCA_FQ_BUCKETS_LOG] = { .type = NLA_U32 }, [TCA_FQ_FLOW_REFILL_DELAY] = { .type = NLA_U32 }, + [TCA_FQ_ORPHAN_MASK] = { .type = NLA_U32 }, [TCA_FQ_LOW_RATE_THRESHOLD] = { .type = NLA_U32 }, [TCA_FQ_CE_THRESHOLD] = { .type = NLA_U32 }, };
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit e13aaa0643da10006ec35715954e7f92a62899a5 ]
Add missing attribute validation for TCA_TAPRIO_ATTR_TXTIME_DELAY to the netlink policy.
Fixes: 4cfd5779bd6e ("taprio: Add support for txtime-assist mode") Signed-off-by: Jakub Kicinski kuba@kernel.org Reviewed-by: Vinicius Costa Gomes vinicius.gomes@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_taprio.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/sched/sch_taprio.c +++ b/net/sched/sch_taprio.c @@ -774,6 +774,7 @@ static const struct nla_policy taprio_po [TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME] = { .type = NLA_S64 }, [TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME_EXTENSION] = { .type = NLA_S64 }, [TCA_TAPRIO_ATTR_FLAGS] = { .type = NLA_U32 }, + [TCA_TAPRIO_ATTR_TXTIME_DELAY] = { .type = NLA_U32 }, };
static int fill_sched_entry(struct nlattr **tb, struct sched_entry *entry,
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit dd25cb272ccce4db67dc8509278229099e4f5e99 ]
Add missing attribute validation for TEAM_ATTR_OPTION_PORT_IFINDEX to the netlink policy.
Fixes: 80f7c6683fe0 ("team: add support for per-port options") Signed-off-by: Jakub Kicinski kuba@kernel.org Reviewed-by: Jiri Pirko jiri@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/team/team.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/team/team.c +++ b/drivers/net/team/team.c @@ -2240,6 +2240,7 @@ team_nl_option_policy[TEAM_ATTR_OPTION_M [TEAM_ATTR_OPTION_CHANGED] = { .type = NLA_FLAG }, [TEAM_ATTR_OPTION_TYPE] = { .type = NLA_U8 }, [TEAM_ATTR_OPTION_DATA] = { .type = NLA_BINARY }, + [TEAM_ATTR_OPTION_PORT_IFINDEX] = { .type = NLA_U32 }, };
static int team_nl_cmd_noop(struct sk_buff *skb, struct genl_info *info)
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit 669fcd7795900cd1880237cbbb57a7db66cb9ac8 ]
Add missing attribute validation for TEAM_ATTR_OPTION_ARRAY_INDEX to the netlink policy.
Fixes: b13033262d24 ("team: introduce array options") Signed-off-by: Jakub Kicinski kuba@kernel.org Reviewed-by: Jiri Pirko jiri@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/team/team.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/team/team.c +++ b/drivers/net/team/team.c @@ -2241,6 +2241,7 @@ team_nl_option_policy[TEAM_ATTR_OPTION_M [TEAM_ATTR_OPTION_TYPE] = { .type = NLA_U8 }, [TEAM_ATTR_OPTION_DATA] = { .type = NLA_BINARY }, [TEAM_ATTR_OPTION_PORT_IFINDEX] = { .type = NLA_U32 }, + [TEAM_ATTR_OPTION_ARRAY_INDEX] = { .type = NLA_U32 }, };
static int team_nl_cmd_noop(struct sk_buff *skb, struct genl_info *info)
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit 213320a67962ff6e7b83b704d55cbebc341426db ]
Add missing attribute validation for TIPC_NLA_PROP_MTU to the netlink policy.
Fixes: 901271e0403a ("tipc: implement configuration of UDP media MTU") Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/tipc/netlink.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/tipc/netlink.c +++ b/net/tipc/netlink.c @@ -111,6 +111,7 @@ const struct nla_policy tipc_nl_prop_pol [TIPC_NLA_PROP_PRIO] = { .type = NLA_U32 }, [TIPC_NLA_PROP_TOL] = { .type = NLA_U32 }, [TIPC_NLA_PROP_WIN] = { .type = NLA_U32 }, + [TIPC_NLA_PROP_MTU] = { .type = NLA_U32 }, [TIPC_NLA_PROP_BROADCAST] = { .type = NLA_U32 }, [TIPC_NLA_PROP_BROADCAST_RATIO] = { .type = NLA_U32 } };
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit 361d23e41ca6e504033f7e66a03b95788377caae ]
Add missing attribute validation for NFC_ATTR_SE_INDEX to the netlink policy.
Fixes: 5ce3f32b5264 ("NFC: netlink: SE API implementation") Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/nfc/netlink.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/nfc/netlink.c +++ b/net/nfc/netlink.c @@ -43,6 +43,7 @@ static const struct nla_policy nfc_genl_ [NFC_ATTR_LLC_SDP] = { .type = NLA_NESTED }, [NFC_ATTR_FIRMWARE_NAME] = { .type = NLA_STRING, .len = NFC_FIRMWARE_NAME_MAXSIZE }, + [NFC_ATTR_SE_INDEX] = { .type = NLA_U32 }, [NFC_ATTR_SE_APDU] = { .type = NLA_BINARY }, [NFC_ATTR_VENDOR_DATA] = { .type = NLA_BINARY },
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit 88e706d5168b07df4792dbc3d1bc37b83e4bd74d ]
Add missing attribute validation for NFC_ATTR_TARGET_INDEX to the netlink policy.
Fixes: 4d63adfe12dd ("NFC: Add NFC_CMD_DEACTIVATE_TARGET support") Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/nfc/netlink.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/nfc/netlink.c +++ b/net/nfc/netlink.c @@ -32,6 +32,7 @@ static const struct nla_policy nfc_genl_ [NFC_ATTR_DEVICE_NAME] = { .type = NLA_STRING, .len = NFC_DEVICE_NAME_MAXSIZE }, [NFC_ATTR_PROTOCOLS] = { .type = NLA_U32 }, + [NFC_ATTR_TARGET_INDEX] = { .type = NLA_U32 }, [NFC_ATTR_COMM_MODE] = { .type = NLA_U8 }, [NFC_ATTR_RF_MODE] = { .type = NLA_U8 }, [NFC_ATTR_DEVICE_POWERED] = { .type = NLA_U8 },
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit 6ba3da446551f2150fadbf8c7788edcb977683d3 ]
Add missing attribute validation for vendor subcommand attributes to the netlink policy.
Fixes: 9e58095f9660 ("NFC: netlink: Implement vendor command support") Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/nfc/netlink.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/net/nfc/netlink.c +++ b/net/nfc/netlink.c @@ -46,6 +46,8 @@ static const struct nla_policy nfc_genl_ .len = NFC_FIRMWARE_NAME_MAXSIZE }, [NFC_ATTR_SE_INDEX] = { .type = NLA_U32 }, [NFC_ATTR_SE_APDU] = { .type = NLA_BINARY }, + [NFC_ATTR_VENDOR_ID] = { .type = NLA_U32 }, + [NFC_ATTR_VENDOR_SUBCMD] = { .type = NLA_U32 }, [NFC_ATTR_VENDOR_DATA] = { .type = NLA_BINARY },
};
From: Heiner Kallweit hkallweit1@gmail.com
[ Upstream commit 249bc9744e165abe74ae326f43e9d70bad54c3b7 ]
On all PHY drivers that implement did_interrupt() reading the interrupt status bits clears them. This means we may loose an interrupt that is triggered between calling did_interrupt() and phy_clear_interrupt(). As part of the fix make it a requirement that did_interrupt() clears the interrupt.
The Fixes tag refers to the first commit where the patch applies cleanly.
Fixes: 49644e68f472 ("net: phy: add callback for custom interrupt handler to struct phy_driver") Reported-by: Michael Walle michael@walle.cc Signed-off-by: Heiner Kallweit hkallweit1@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/phy/phy.c | 3 ++- include/linux/phy.h | 1 + 2 files changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/net/phy/phy.c +++ b/drivers/net/phy/phy.c @@ -761,7 +761,8 @@ static irqreturn_t phy_interrupt(int irq phy_trigger_machine(phydev); }
- if (phy_clear_interrupt(phydev)) + /* did_interrupt() may have cleared the interrupt already */ + if (!phydev->drv->did_interrupt && phy_clear_interrupt(phydev)) goto phy_err; return IRQ_HANDLED;
--- a/include/linux/phy.h +++ b/include/linux/phy.h @@ -524,6 +524,7 @@ struct phy_driver { /* * Checks if the PHY generated an interrupt. * For multi-PHY devices with shared PHY interrupt pin + * Set interrupt bits have to be cleared. */ int (*did_interrupt)(struct phy_device *phydev);
From: Heiner Kallweit hkallweit1@gmail.com
[ Upstream commit 611d779af7cad2b87487ff58e4931a90c20b113c ]
So far we have the unfortunate situation that mdio_bus_phy_may_suspend() is called in suspend AND resume path, assuming that function result is the same. After the original change this is no longer the case, resulting in broken resume as reported by Geert.
To fix this call mdio_bus_phy_may_suspend() in the suspend path only, and let the phy_device store the info whether it was suspended by MDIO bus PM.
Fixes: 503ba7c69610 ("net: phy: Avoid multiple suspends") Reported-by: Geert Uytterhoeven geert@linux-m68k.org Tested-by: Geert Uytterhoeven geert@linux-m68k.org Signed-off-by: Heiner Kallweit hkallweit1@gmail.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/phy/phy_device.c | 6 +++++- include/linux/phy.h | 2 ++ 2 files changed, 7 insertions(+), 1 deletion(-)
--- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -284,6 +284,8 @@ static int mdio_bus_phy_suspend(struct d if (!mdio_bus_phy_may_suspend(phydev)) return 0;
+ phydev->suspended_by_mdio_bus = 1; + return phy_suspend(phydev); }
@@ -292,9 +294,11 @@ static int mdio_bus_phy_resume(struct de struct phy_device *phydev = to_phy_device(dev); int ret;
- if (!mdio_bus_phy_may_suspend(phydev)) + if (!phydev->suspended_by_mdio_bus) goto no_resume;
+ phydev->suspended_by_mdio_bus = 0; + ret = phy_resume(phydev); if (ret < 0) return ret; --- a/include/linux/phy.h +++ b/include/linux/phy.h @@ -336,6 +336,7 @@ struct phy_c45_device_ids { * is_gigabit_capable: Set to true if PHY supports 1000Mbps * has_fixups: Set to true if this phy has fixups/quirks. * suspended: Set to true if this phy has been suspended successfully. + * suspended_by_mdio_bus: Set to true if this phy was suspended by MDIO bus. * sysfs_links: Internal boolean tracking sysfs symbolic links setup/removal. * loopback_enabled: Set true if this phy has been loopbacked successfully. * state: state of the PHY for management purposes @@ -372,6 +373,7 @@ struct phy_device { unsigned is_gigabit_capable:1; unsigned has_fixups:1; unsigned suspended:1; + unsigned suspended_by_mdio_bus:1; unsigned sysfs_links:1; unsigned loopback_enabled:1;
From: Hangbin Liu liuhangbin@gmail.com
[ Upstream commit 617940123e0140521f3080d2befc2bf55bcda094 ]
When we modify the route metric, the peer address's route need also be updated. Before the fix:
+ ip addr add dev dummy1 2001:db8::1 peer 2001:db8::2 metric 60 + ip -6 route show dev dummy1 2001:db8::1 proto kernel metric 60 pref medium 2001:db8::2 proto kernel metric 60 pref medium + ip addr change dev dummy1 2001:db8::1 peer 2001:db8::2 metric 61 + ip -6 route show dev dummy1 2001:db8::1 proto kernel metric 61 pref medium 2001:db8::2 proto kernel metric 60 pref medium
After the fix: + ip addr change dev dummy1 2001:db8::1 peer 2001:db8::2 metric 61 + ip -6 route show dev dummy1 2001:db8::1 proto kernel metric 61 pref medium 2001:db8::2 proto kernel metric 61 pref medium
Fixes: 8308f3ff1753 ("net/ipv6: Add support for specifying metric of connected routes") Signed-off-by: Hangbin Liu liuhangbin@gmail.com Reviewed-by: David Ahern dsahern@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/addrconf.c | 20 ++++++++++++++++---- 1 file changed, 16 insertions(+), 4 deletions(-)
--- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -4590,12 +4590,14 @@ inet6_rtm_deladdr(struct sk_buff *skb, s }
static int modify_prefix_route(struct inet6_ifaddr *ifp, - unsigned long expires, u32 flags) + unsigned long expires, u32 flags, + bool modify_peer) { struct fib6_info *f6i; u32 prio;
- f6i = addrconf_get_prefix_route(&ifp->addr, ifp->prefix_len, + f6i = addrconf_get_prefix_route(modify_peer ? &ifp->peer_addr : &ifp->addr, + ifp->prefix_len, ifp->idev->dev, 0, RTF_DEFAULT, true); if (!f6i) return -ENOENT; @@ -4606,7 +4608,8 @@ static int modify_prefix_route(struct in ip6_del_rt(dev_net(ifp->idev->dev), f6i);
/* add new one */ - addrconf_prefix_route(&ifp->addr, ifp->prefix_len, + addrconf_prefix_route(modify_peer ? &ifp->peer_addr : &ifp->addr, + ifp->prefix_len, ifp->rt_priority, ifp->idev->dev, expires, flags, GFP_KERNEL); } else { @@ -4682,7 +4685,7 @@ static int inet6_addr_modify(struct inet int rc = -ENOENT;
if (had_prefixroute) - rc = modify_prefix_route(ifp, expires, flags); + rc = modify_prefix_route(ifp, expires, flags, false);
/* prefix route could have been deleted; if so restore it */ if (rc == -ENOENT) { @@ -4690,6 +4693,15 @@ static int inet6_addr_modify(struct inet ifp->rt_priority, ifp->idev->dev, expires, flags, GFP_KERNEL); } + + if (had_prefixroute && !ipv6_addr_any(&ifp->peer_addr)) + rc = modify_prefix_route(ifp, expires, flags, true); + + if (rc == -ENOENT && !ipv6_addr_any(&ifp->peer_addr)) { + addrconf_prefix_route(&ifp->peer_addr, ifp->prefix_len, + ifp->rt_priority, ifp->idev->dev, + expires, flags, GFP_KERNEL); + } } else if (had_prefixroute) { enum cleanup_prefix_rt_t action; unsigned long rt_expires;
From: Hangbin Liu liuhangbin@gmail.com
[ Upstream commit d0098e4c6b83e502cc1cd96d67ca86bc79a6c559 ]
When we modify the peer route and changed it to a new one, we should remove the old route first. Before the fix:
+ ip addr add dev dummy1 2001:db8::1 peer 2001:db8::2 + ip -6 route show dev dummy1 2001:db8::1 proto kernel metric 256 pref medium 2001:db8::2 proto kernel metric 256 pref medium + ip addr change dev dummy1 2001:db8::1 peer 2001:db8::3 + ip -6 route show dev dummy1 2001:db8::1 proto kernel metric 256 pref medium 2001:db8::2 proto kernel metric 256 pref medium
After the fix: + ip addr change dev dummy1 2001:db8::1 peer 2001:db8::3 + ip -6 route show dev dummy1 2001:db8::1 proto kernel metric 256 pref medium 2001:db8::3 proto kernel metric 256 pref medium
This patch depend on the previous patch "net/ipv6: need update peer route when modify metric" to update new peer route after delete old one.
Signed-off-by: Hangbin Liu liuhangbin@gmail.com Reviewed-by: David Ahern dsahern@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/addrconf.c | 21 +++++++++++++++++---- 1 file changed, 17 insertions(+), 4 deletions(-)
--- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -1226,11 +1226,13 @@ check_cleanup_prefix_route(struct inet6_ }
static void -cleanup_prefix_route(struct inet6_ifaddr *ifp, unsigned long expires, bool del_rt) +cleanup_prefix_route(struct inet6_ifaddr *ifp, unsigned long expires, + bool del_rt, bool del_peer) { struct fib6_info *f6i;
- f6i = addrconf_get_prefix_route(&ifp->addr, ifp->prefix_len, + f6i = addrconf_get_prefix_route(del_peer ? &ifp->peer_addr : &ifp->addr, + ifp->prefix_len, ifp->idev->dev, 0, RTF_DEFAULT, true); if (f6i) { if (del_rt) @@ -1293,7 +1295,7 @@ static void ipv6_del_addr(struct inet6_i
if (action != CLEANUP_PREFIX_RT_NOP) { cleanup_prefix_route(ifp, expires, - action == CLEANUP_PREFIX_RT_DEL); + action == CLEANUP_PREFIX_RT_DEL, false); }
/* clean up prefsrc entries */ @@ -4631,6 +4633,7 @@ static int inet6_addr_modify(struct inet unsigned long timeout; bool was_managetempaddr; bool had_prefixroute; + bool new_peer = false;
ASSERT_RTNL();
@@ -4662,6 +4665,13 @@ static int inet6_addr_modify(struct inet cfg->preferred_lft = timeout; }
+ if (cfg->peer_pfx && + memcmp(&ifp->peer_addr, cfg->peer_pfx, sizeof(struct in6_addr))) { + if (!ipv6_addr_any(&ifp->peer_addr)) + cleanup_prefix_route(ifp, expires, true, true); + new_peer = true; + } + spin_lock_bh(&ifp->lock); was_managetempaddr = ifp->flags & IFA_F_MANAGETEMPADDR; had_prefixroute = ifp->flags & IFA_F_PERMANENT && @@ -4677,6 +4687,9 @@ static int inet6_addr_modify(struct inet if (cfg->rt_priority && cfg->rt_priority != ifp->rt_priority) ifp->rt_priority = cfg->rt_priority;
+ if (new_peer) + ifp->peer_addr = *cfg->peer_pfx; + spin_unlock_bh(&ifp->lock); if (!(ifp->flags&IFA_F_TENTATIVE)) ipv6_ifa_notify(0, ifp); @@ -4712,7 +4725,7 @@ static int inet6_addr_modify(struct inet
if (action != CLEANUP_PREFIX_RT_NOP) { cleanup_prefix_route(ifp, rt_expires, - action == CLEANUP_PREFIX_RT_DEL); + action == CLEANUP_PREFIX_RT_DEL, false); } }
From: Hangbin Liu liuhangbin@gmail.com
[ Upstream commit 0d29169a708bf730ede287248e429d579f432d1d ]
This patch update {ipv4, ipv6}_addr_metric_test with 1. Set metric of address with peer route and see if the route added correctly. 2. Modify metric and peer address for peer route and see if the route changed correctly.
Signed-off-by: Hangbin Liu liuhangbin@gmail.com Reviewed-by: David Ahern dsahern@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/net/fib_tests.sh | 34 ++++++++++++++++++++++++++++--- 1 file changed, 31 insertions(+), 3 deletions(-)
--- a/tools/testing/selftests/net/fib_tests.sh +++ b/tools/testing/selftests/net/fib_tests.sh @@ -1041,6 +1041,27 @@ ipv6_addr_metric_test() fi log_test $rc 0 "Prefix route with metric on link up"
+ # verify peer metric added correctly + set -e + run_cmd "$IP -6 addr flush dev dummy2" + run_cmd "$IP -6 addr add dev dummy2 2001:db8:104::1 peer 2001:db8:104::2 metric 260" + set +e + + check_route6 "2001:db8:104::1 dev dummy2 proto kernel metric 260" + log_test $? 0 "Set metric with peer route on local side" + log_test $? 0 "User specified metric on local address" + check_route6 "2001:db8:104::2 dev dummy2 proto kernel metric 260" + log_test $? 0 "Set metric with peer route on peer side" + + set -e + run_cmd "$IP -6 addr change dev dummy2 2001:db8:104::1 peer 2001:db8:104::3 metric 261" + set +e + + check_route6 "2001:db8:104::1 dev dummy2 proto kernel metric 261" + log_test $? 0 "Modify metric and peer address on local side" + check_route6 "2001:db8:104::3 dev dummy2 proto kernel metric 261" + log_test $? 0 "Modify metric and peer address on peer side" + $IP li del dummy1 $IP li del dummy2 cleanup @@ -1457,13 +1478,20 @@ ipv4_addr_metric_test()
run_cmd "$IP addr flush dev dummy2" run_cmd "$IP addr add dev dummy2 172.16.104.1/32 peer 172.16.104.2 metric 260" - run_cmd "$IP addr change dev dummy2 172.16.104.1/32 peer 172.16.104.2 metric 261" rc=$? if [ $rc -eq 0 ]; then - check_route "172.16.104.2 dev dummy2 proto kernel scope link src 172.16.104.1 metric 261" + check_route "172.16.104.2 dev dummy2 proto kernel scope link src 172.16.104.1 metric 260" + rc=$? + fi + log_test $rc 0 "Set metric of address with peer route" + + run_cmd "$IP addr change dev dummy2 172.16.104.1/32 peer 172.16.104.3 metric 261" + rc=$? + if [ $rc -eq 0 ]; then + check_route "172.16.104.3 dev dummy2 proto kernel scope link src 172.16.104.1 metric 261" rc=$? fi - log_test $rc 0 "Modify metric of address with peer route" + log_test $rc 0 "Modify metric and peer address for peer route"
$IP li del dummy1 $IP li del dummy2
From: Andrew Lunn andrew@lunn.ch
[ Upstream commit a20f997010c4ec76eaa55b8cc047d76dcac69f70 ]
By default, DSA drivers should configure CPU and DSA ports to their maximum speed. In many configurations this is sufficient to make the link work.
In some cases it is necessary to configure the link to run slower, e.g. because of limitations of the SoC it is connected to. Or back to back PHYs are used and the PHY needs to be driven in order to establish link. In this case, phylink is used.
Only instantiate phylink if it is required. If there is no PHY, or no fixed link properties, phylink can upset a link which works in the default configuration.
Fixes: 0e27921816ad ("net: dsa: Use PHYLINK for the CPU/DSA ports") Signed-off-by: Andrew Lunn andrew@lunn.ch Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/dsa/port.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
--- a/net/dsa/port.c +++ b/net/dsa/port.c @@ -649,9 +649,14 @@ err_phy_connect: int dsa_port_link_register_of(struct dsa_port *dp) { struct dsa_switch *ds = dp->ds; + struct device_node *phy_np;
- if (!ds->ops->adjust_link) - return dsa_port_phylink_register(dp); + if (!ds->ops->adjust_link) { + phy_np = of_parse_phandle(dp->dn, "phy-handle", 0); + if (of_phy_is_fixed_link(dp->dn) || phy_np) + return dsa_port_phylink_register(dp); + return 0; + }
dev_warn(ds->dev, "Using legacy PHYLIB callbacks. Please migrate to PHYLINK!\n"); @@ -666,11 +671,12 @@ void dsa_port_link_unregister_of(struct { struct dsa_switch *ds = dp->ds;
- if (!ds->ops->adjust_link) { + if (!ds->ops->adjust_link && dp->pl) { rtnl_lock(); phylink_disconnect_phy(dp->pl); rtnl_unlock(); phylink_destroy(dp->pl); + dp->pl = NULL; return; }
From: Florian Fainelli f.fainelli@gmail.com
commit 503ba7c6961034ff0047707685644cad9287c226 upstream.
It is currently possible for a PHY device to be suspended as part of a network device driver's suspend call while it is still being attached to that net_device, either via phy_suspend() or implicitly via phy_stop().
Later on, when the MDIO bus controller get suspended, we would attempt to suspend again the PHY because it is still attached to a network device.
This is both a waste of time and creates an opportunity for improper clock/power management bugs to creep in.
Fixes: 803dd9c77ac3 ("net: phy: avoid suspending twice a PHY") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/phy/phy_device.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -246,7 +246,7 @@ static bool mdio_bus_phy_may_suspend(str * MDIO bus driver and clock gated at this point. */ if (!netdev) - return !phydev->suspended; + goto out;
if (netdev->wol_enabled) return false; @@ -266,7 +266,8 @@ static bool mdio_bus_phy_may_suspend(str if (device_may_wakeup(&netdev->dev)) return false;
- return true; +out: + return !phydev->suspended; }
static int mdio_bus_phy_suspend(struct device *dev)
From: Vasily Averin vvs@virtuozzo.com
commit 2d4ecb030dcc90fb725ecbfc82ce5d6c37906e0e upstream.
If seq_file .next fuction does not change position index, read after some lseek can generate unexpected output:
1) dd bs=1 skip output of each 2nd elements $ dd if=/sys/fs/cgroup/cgroup.procs bs=8 count=1 2 3 4 5 1+0 records in 1+0 records out 8 bytes copied, 0,000267297 s, 29,9 kB/s [test@localhost ~]$ dd if=/sys/fs/cgroup/cgroup.procs bs=1 count=8 2 4 <<< NB! 3 was skipped 6 <<< ... and 5 too 8 <<< ... and 7 8+0 records in 8+0 records out 8 bytes copied, 5,2123e-05 s, 153 kB/s
This happen because __cgroup_procs_start() makes an extra extra cgroup_procs_next() call
2) read after lseek beyond end of file generates whole last line. 3) read after lseek into middle of last line generates expected rest of last line and unexpected whole line once again.
Additionally patch removes an extra position index changes in __cgroup_procs_start()
Cc: stable@vger.kernel.org https://bugzilla.kernel.org/show_bug.cgi?id=206283 Signed-off-by: Vasily Averin vvs@virtuozzo.com Signed-off-by: Tejun Heo tj@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/cgroup/cgroup.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
--- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -4659,6 +4659,9 @@ static void *cgroup_procs_next(struct se struct kernfs_open_file *of = s->private; struct css_task_iter *it = of->priv;
+ if (pos) + (*pos)++; + return css_task_iter_next(it); }
@@ -4674,7 +4677,7 @@ static void *__cgroup_procs_start(struct * from position 0, so we can simply keep iterating on !0 *pos. */ if (!it) { - if (WARN_ON_ONCE((*pos)++)) + if (WARN_ON_ONCE((*pos))) return ERR_PTR(-EINVAL);
it = kzalloc(sizeof(*it), GFP_KERNEL); @@ -4682,10 +4685,11 @@ static void *__cgroup_procs_start(struct return ERR_PTR(-ENOMEM); of->priv = it; css_task_iter_start(&cgrp->self, iter_flags, it); - } else if (!(*pos)++) { + } else if (!(*pos)) { css_task_iter_end(it); css_task_iter_start(&cgrp->self, iter_flags, it); - } + } else + return it->cur_task;
return cgroup_procs_next(s, NULL, NULL); }
From: Michal Koutný mkoutny@suse.com
commit 9c974c77246460fa6a92c18554c3311c8c83c160 upstream.
PF_EXITING is set earlier than actual removal from css_set when a task is exitting. This can confuse cgroup.procs readers who see no PF_EXITING tasks, however, rmdir is checking against css_set membership so it can transitionally fail with EBUSY.
Fix this by listing tasks that weren't unlinked from css_set active lists. It may happen that other users of the task iterator (without CSS_TASK_ITER_PROCS) spot a PF_EXITING task before cgroup_exit(). This is equal to the state before commit c03cd7738a83 ("cgroup: Include dying leaders with live threads in PROCS iterations") but it may be reviewed later.
Reported-by: Suren Baghdasaryan surenb@google.com Fixes: c03cd7738a83 ("cgroup: Include dying leaders with live threads in PROCS iterations") Signed-off-by: Michal Koutný mkoutny@suse.com Signed-off-by: Tejun Heo tj@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/cgroup.h | 1 + kernel/cgroup/cgroup.c | 23 ++++++++++++++++------- 2 files changed, 17 insertions(+), 7 deletions(-)
--- a/include/linux/cgroup.h +++ b/include/linux/cgroup.h @@ -62,6 +62,7 @@ struct css_task_iter { struct list_head *mg_tasks_head; struct list_head *dying_tasks_head;
+ struct list_head *cur_tasks_head; struct css_set *cur_cset; struct css_set *cur_dcset; struct task_struct *cur_task; --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -4461,12 +4461,16 @@ static void css_task_iter_advance_css_se } } while (!css_set_populated(cset) && list_empty(&cset->dying_tasks));
- if (!list_empty(&cset->tasks)) + if (!list_empty(&cset->tasks)) { it->task_pos = cset->tasks.next; - else if (!list_empty(&cset->mg_tasks)) + it->cur_tasks_head = &cset->tasks; + } else if (!list_empty(&cset->mg_tasks)) { it->task_pos = cset->mg_tasks.next; - else + it->cur_tasks_head = &cset->mg_tasks; + } else { it->task_pos = cset->dying_tasks.next; + it->cur_tasks_head = &cset->dying_tasks; + }
it->tasks_head = &cset->tasks; it->mg_tasks_head = &cset->mg_tasks; @@ -4524,10 +4528,14 @@ repeat: else it->task_pos = it->task_pos->next;
- if (it->task_pos == it->tasks_head) + if (it->task_pos == it->tasks_head) { it->task_pos = it->mg_tasks_head->next; - if (it->task_pos == it->mg_tasks_head) + it->cur_tasks_head = it->mg_tasks_head; + } + if (it->task_pos == it->mg_tasks_head) { it->task_pos = it->dying_tasks_head->next; + it->cur_tasks_head = it->dying_tasks_head; + } if (it->task_pos == it->dying_tasks_head) css_task_iter_advance_css_set(it); } else { @@ -4546,11 +4554,12 @@ repeat: goto repeat;
/* and dying leaders w/o live member threads */ - if (!atomic_read(&task->signal->live)) + if (it->cur_tasks_head == it->dying_tasks_head && + !atomic_read(&task->signal->live)) goto repeat; } else { /* skip all dying ones */ - if (task->flags & PF_EXITING) + if (it->cur_tasks_head == it->dying_tasks_head) goto repeat; } }
From: Florian Westphal fw@strlen.de
commit 1d305ba40eb8081ff21eeb8ca6ba5c70fd920934 upstream.
nft will loop forever if the kernel doesn't support an expression:
1. nft_expr_type_get() appends the family specific name to the module list. 2. -EAGAIN is returned to nfnetlink, nfnetlink calls abort path. 3. abort path sets ->done to true and calls request_module for the expression. 4. nfnetlink replays the batch, we end up in nft_expr_type_get() again. 5. nft_expr_type_get attempts to append family-specific name. This one already exists on the list, so we continue 6. nft_expr_type_get adds the generic expression name to the module list. -EAGAIN is returned, nfnetlink calls abort path. 7. abort path encounters the family-specific expression which has 'done' set, so it gets removed. 8. abort path requests the generic expression name, sets done to true. 9. batch is replayed.
If the expression could not be loaded, then we will end up back at 1), because the family-specific name got removed and the cycle starts again.
Note that userspace can SIGKILL the nft process to stop the cycle, but the desired behaviour is to return an error after the generic expr name fails to load the expression.
Fixes: eb014de4fd418 ("netfilter: nf_tables: autoload modules from the abort path") Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/nf_tables_api.c | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-)
--- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -6970,13 +6970,8 @@ static void nf_tables_module_autoload(st list_splice_init(&net->nft.module_list, &module_list); mutex_unlock(&net->nft.commit_mutex); list_for_each_entry_safe(req, next, &module_list, list) { - if (req->done) { - list_del(&req->list); - kfree(req); - } else { - request_module("%s", req->module); - req->done = true; - } + request_module("%s", req->module); + req->done = true; } mutex_lock(&net->nft.commit_mutex); list_splice(&module_list, &net->nft.module_list); @@ -7759,6 +7754,7 @@ static void __net_exit nf_tables_exit_ne __nft_release_tables(net); mutex_unlock(&net->nft.commit_mutex); WARN_ON_ONCE(!list_empty(&net->nft.tables)); + WARN_ON_ONCE(!list_empty(&net->nft.module_list)); }
static struct pernet_operations nf_tables_net_ops = {
From: Dan Moulding dmoulding@me.com
commit a9149d243f259ad8f02b1e23dfe8ba06128f15e1 upstream.
The logic for checking required NVM sections was recently fixed in commit b3f20e098293 ("iwlwifi: mvm: fix NVM check for 3168 devices"). However, with that fixed the else is now taken for 3168 devices and within the else clause there is a mandatory check for the PHY_SKU section. This causes the parsing to fail for 3168 devices.
The PHY_SKU section is really only mandatory for the IWL_NVM_EXT layout (the phy_sku parameter of iwl_parse_nvm_data is only used when the NVM type is IWL_NVM_EXT). So this changes the PHY_SKU section check so that it's only mandatory for IWL_NVM_EXT.
Fixes: b3f20e098293 ("iwlwifi: mvm: fix NVM check for 3168 devices") Signed-off-by: Dan Moulding dmoulding@me.com Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/wireless/intel/iwlwifi/mvm/nvm.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/net/wireless/intel/iwlwifi/mvm/nvm.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/nvm.c @@ -309,7 +309,8 @@ iwl_parse_nvm_sections(struct iwl_mvm *m }
/* PHY_SKU section is mandatory in B0 */ - if (!mvm->nvm_sections[NVM_SECTION_TYPE_PHY_SKU].data) { + if (mvm->trans->cfg->nvm_type == IWL_NVM_EXT && + !mvm->nvm_sections[NVM_SECTION_TYPE_PHY_SKU].data) { IWL_ERR(mvm, "Can't parse phy_sku in B0, empty sections\n"); return NULL;
From: Halil Pasic pasic@linux.ibm.com
commit f5f6b95c72f7f8bb46eace8c5306c752d0133daa upstream.
Since nobody else is going to restart our hw_queue for us, the blk_mq_start_stopped_hw_queues() is in virtblk_done() is not sufficient necessarily sufficient to ensure that the queue will get started again. In case of global resource outage (-ENOMEM because mapping failure, because of swiotlb full) our virtqueue may be empty and we can get stuck with a stopped hw_queue.
Let us not stop the queue on arbitrary errors, but only on -EONSPC which indicates a full virtqueue, where the hw_queue is guaranteed to get started by virtblk_done() before when it makes sense to carry on submitting requests. Let us also remove a stale comment.
Signed-off-by: Halil Pasic pasic@linux.ibm.com Cc: Jens Axboe axboe@kernel.dk Fixes: f7728002c1c7 ("virtio_ring: fix return code on DMA mapping fails") Link: https://lore.kernel.org/r/20200213123728.61216-2-pasic@linux.ibm.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Reviewed-by: Stefan Hajnoczi stefanha@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/virtio_blk.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
--- a/drivers/block/virtio_blk.c +++ b/drivers/block/virtio_blk.c @@ -339,10 +339,12 @@ static blk_status_t virtio_queue_rq(stru err = virtblk_add_req(vblk->vqs[qid].vq, vbr, vbr->sg, num); if (err) { virtqueue_kick(vblk->vqs[qid].vq); - blk_mq_stop_hw_queue(hctx); + /* Don't stop the queue if -ENOMEM: we may have failed to + * bounce the buffer due to global resource outage. + */ + if (err == -ENOSPC) + blk_mq_stop_hw_queue(hctx); spin_unlock_irqrestore(&vblk->vqs[qid].lock, flags); - /* Out of mem doesn't actually happen, since we fall back - * to direct descriptors */ if (err == -ENOMEM || err == -ENOSPC) return BLK_STS_DEV_RESOURCE; return BLK_STS_IOERR;
From: Hans de Goede hdegoede@redhat.com
commit 81ee85d0462410de8eeeec1b9761941fd6ed8c7b upstream.
Quoting from the comment describing the WARN functions in include/asm-generic/bug.h:
* WARN(), WARN_ON(), WARN_ON_ONCE, and so on can be used to report * significant kernel issues that need prompt attention if they should ever * appear at runtime. * * Do not use these macros when checking for invalid external inputs
The (buggy) firmware tables which the dmar code was calling WARN_TAINT for really are invalid external inputs. They are not under the kernel's control and the issues in them cannot be fixed by a kernel update. So logging a backtrace, which invites bug reports to be filed about this, is not helpful.
Fixes: 556ab45f9a77 ("ioat2: catch and recover from broken vtd configurations v6") Signed-off-by: Hans de Goede hdegoede@redhat.com Acked-by: Lu Baolu baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20200309182510.373875-1-hdegoede@redhat.com BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=701847 Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iommu/intel-iommu.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -4129,10 +4129,11 @@ static void quirk_ioat_snb_local_iommu(s
/* we know that the this iommu should be at offset 0xa000 from vtbar */ drhd = dmar_find_matched_drhd_unit(pdev); - if (WARN_TAINT_ONCE(!drhd || drhd->reg_base_addr - vtbar != 0xa000, - TAINT_FIRMWARE_WORKAROUND, - "BIOS assigned incorrect VT-d unit for Intel(R) QuickData Technology device\n")) + if (!drhd || drhd->reg_base_addr - vtbar != 0xa000) { + pr_warn_once(FW_BUG "BIOS assigned incorrect VT-d unit for Intel(R) QuickData Technology device\n"); + add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK); pdev->dev.archdata.iommu = DUMMY_DEVICE_DOMAIN_INFO; + } } DECLARE_PCI_FIXUP_ENABLE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_IOAT_SNB, quirk_ioat_snb_local_iommu);
From: Vasily Averin vvs@virtuozzo.com
commit dc15af8e9dbd039ebb06336597d2c491ef46ab74 upstream.
If .next function does not change position index, following .show function will repeat output related to current position index.
Cc: stable@vger.kernel.org Fixes: 1f4aace60b0e ("fs/seq_file.c: simplify seq_file iteration code ...") Link: https://bugzilla.kernel.org/show_bug.cgi?id=206283 Signed-off-by: Vasily Averin vvs@virtuozzo.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/nf_conntrack_standalone.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/netfilter/nf_conntrack_standalone.c +++ b/net/netfilter/nf_conntrack_standalone.c @@ -411,7 +411,7 @@ static void *ct_cpu_seq_next(struct seq_ *pos = cpu + 1; return per_cpu_ptr(net->ct.stat, cpu); } - + (*pos)++; return NULL; }
From: Vasily Averin vvs@virtuozzo.com
commit bb71f846a0002239f7058c84f1496648ff4a5c20 upstream.
If .next function does not change position index, following .show function will repeat output related to current position index.
Cc: stable@vger.kernel.org Fixes: 1f4aace60b0e ("fs/seq_file.c: simplify seq_file iteration code ...") Link: https://bugzilla.kernel.org/show_bug.cgi?id=206283 Signed-off-by: Vasily Averin vvs@virtuozzo.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/nf_synproxy_core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/netfilter/nf_synproxy_core.c +++ b/net/netfilter/nf_synproxy_core.c @@ -267,7 +267,7 @@ static void *synproxy_cpu_seq_next(struc *pos = cpu + 1; return per_cpu_ptr(snet->stats, cpu); } - + (*pos)++; return NULL; }
From: Vasily Averin vvs@virtuozzo.com
commit db25517a550926f609c63054b12ea9ad515e1a10 upstream.
If .next function does not change position index, following .show function will repeat output related to current position index.
Without the patch: # dd if=/proc/net/xt_recent/SSH # original file outpt src=127.0.0.4 ttl: 0 last_seen: 6275444819 oldest_pkt: 1 6275444819 src=127.0.0.2 ttl: 0 last_seen: 6275438906 oldest_pkt: 1 6275438906 src=127.0.0.3 ttl: 0 last_seen: 6275441953 oldest_pkt: 1 6275441953 0+1 records in 0+1 records out 204 bytes copied, 6.1332e-05 s, 3.3 MB/s
Read after lseek into middle of last line (offset 140 in example below) generates expected end of last line and then unexpected whole last line once again
# dd if=/proc/net/xt_recent/SSH bs=140 skip=1 dd: /proc/net/xt_recent/SSH: cannot skip to specified offset 127.0.0.3 ttl: 0 last_seen: 6275441953 oldest_pkt: 1 6275441953 src=127.0.0.3 ttl: 0 last_seen: 6275441953 oldest_pkt: 1 6275441953 0+1 records in 0+1 records out 132 bytes copied, 6.2487e-05 s, 2.1 MB/s
Cc: stable@vger.kernel.org Fixes: 1f4aace60b0e ("fs/seq_file.c: simplify seq_file iteration code ...") Link: https://bugzilla.kernel.org/show_bug.cgi?id=206283 Signed-off-by: Vasily Averin vvs@virtuozzo.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/xt_recent.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/netfilter/xt_recent.c +++ b/net/netfilter/xt_recent.c @@ -492,12 +492,12 @@ static void *recent_seq_next(struct seq_ const struct recent_entry *e = v; const struct list_head *head = e->list.next;
+ (*pos)++; while (head == &t->iphash[st->bucket]) { if (++st->bucket >= ip_list_hash_size) return NULL; head = t->iphash[st->bucket].next; } - (*pos)++; return list_entry(head, struct recent_entry, list); }
From: Vasily Averin vvs@virtuozzo.com
commit ee84f19cbbe9cf7cba2958acb03163fed3ecbb0f upstream.
If .next function does not change position index, following .show function will repeat output related to current position index.
Without patch: # dd if=/proc/net/ip_tables_matches # original file output conntrack conntrack conntrack recent recent icmp udplite udp tcp 0+1 records in 0+1 records out 65 bytes copied, 5.4074e-05 s, 1.2 MB/s
# dd if=/proc/net/ip_tables_matches bs=62 skip=1 dd: /proc/net/ip_tables_matches: cannot skip to specified offset cp <<< end of last line tcp <<< and then unexpected whole last line once again 0+1 records in 0+1 records out 7 bytes copied, 0.000102447 s, 68.3 kB/s
Cc: stable@vger.kernel.org Fixes: 1f4aace60b0e ("fs/seq_file.c: simplify seq_file iteration code ...") Link: https://bugzilla.kernel.org/show_bug.cgi?id=206283 Signed-off-by: Vasily Averin vvs@virtuozzo.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/x_tables.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -1551,6 +1551,9 @@ static void *xt_mttg_seq_next(struct seq uint8_t nfproto = (unsigned long)PDE_DATA(file_inode(seq->file)); struct nf_mttg_trav *trav = seq->private;
+ if (ppos != NULL) + ++(*ppos); + switch (trav->class) { case MTTG_TRAV_INIT: trav->class = MTTG_TRAV_NFP_UNSPEC; @@ -1576,9 +1579,6 @@ static void *xt_mttg_seq_next(struct seq default: return NULL; } - - if (ppos != NULL) - ++*ppos; return trav; }
From: Hillf Danton hdanton@sina.com
commit aa202f1f56960c60e7befaa0f49c72b8fa11b0a8 upstream.
wq_select_unbound_cpu() is designed for unbound workqueues only, but it's wrongly called when using a bound workqueue too.
Fixing this ensures work queued to a bound workqueue with cpu=WORK_CPU_UNBOUND always runs on the local CPU.
Before, that would happen only if wq_unbound_cpumask happened to include it (likely almost always the case), or was empty, or we got lucky with forced round-robin placement. So restricting /sys/devices/virtual/workqueue/cpumask to a small subset of a machine's CPUs would cause some bound work items to run unexpectedly there.
Fixes: ef557180447f ("workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs") Cc: stable@vger.kernel.org # v4.5+ Signed-off-by: Hillf Danton hdanton@sina.com [dj: massage changelog] Signed-off-by: Daniel Jordan daniel.m.jordan@oracle.com Cc: Tejun Heo tj@kernel.org Cc: Lai Jiangshan jiangshanlai@gmail.com Cc: linux-kernel@vger.kernel.org Signed-off-by: Tejun Heo tj@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/workqueue.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
--- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -1417,14 +1417,16 @@ static void __queue_work(int cpu, struct return; rcu_read_lock(); retry: - if (req_cpu == WORK_CPU_UNBOUND) - cpu = wq_select_unbound_cpu(raw_smp_processor_id()); - /* pwq which will be used unless @work is executing elsewhere */ - if (!(wq->flags & WQ_UNBOUND)) - pwq = per_cpu_ptr(wq->cpu_pwqs, cpu); - else + if (wq->flags & WQ_UNBOUND) { + if (req_cpu == WORK_CPU_UNBOUND) + cpu = wq_select_unbound_cpu(raw_smp_processor_id()); pwq = unbound_pwq_by_node(wq, cpu_to_node(cpu)); + } else { + if (req_cpu == WORK_CPU_UNBOUND) + cpu = raw_smp_processor_id(); + pwq = per_cpu_ptr(wq->cpu_pwqs, cpu); + }
/* * If @work was previously on a different pool, it might still be
From: Colin Ian King colin.king@canonical.com
commit d785476c608c621b345dd9396e8b21e90375cb0e upstream.
Variable grph_obj_type is being assigned twice, one of these is redundant so remove it.
Addresses-Coverity: ("Evaluation order violation") Signed-off-by: Colin Ian King colin.king@canonical.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: nobuhiro1.iwamatsu@toshiba.co.jp Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c @@ -365,8 +365,7 @@ bool amdgpu_atombios_get_connector_info_ router.ddc_valid = false; router.cd_valid = false; for (j = 0; j < ((le16_to_cpu(path->usSize) - 8) / 2); j++) { - uint8_t grph_obj_type= - grph_obj_type = + uint8_t grph_obj_type = (le16_to_cpu(path->usGraphicObjIds[j]) & OBJECT_TYPE_MASK) >> OBJECT_TYPE_SHIFT;
From: Matthew Auld matthew.auld@intel.com
commit 1d61c5d711a2dc0b978ae905535edee9601f9449 upstream.
The alignment is u64, and yet is_power_of_2() assumes unsigned long, which might give different results between 32b and 64b kernel.
Signed-off-by: Matthew Auld matthew.auld@intel.com Cc: Chris Wilson chris@chris-wilson.co.uk Reviewed-by: Chris Wilson chris@chris-wilson.co.uk Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Link: https://patchwork.freedesktop.org/patch/msgid/20200305203534.210466-1-matthe... Cc: stable@vger.kernel.org (cherry picked from commit 2920516b2f719546f55079bc39a7fe409d9e80ab) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 3 ++- drivers/gpu/drm/i915/i915_utils.h | 5 +++++ 2 files changed, 7 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -439,7 +439,8 @@ eb_validate_vma(struct i915_execbuffer * if (unlikely(entry->flags & eb->invalid_flags)) return -EINVAL;
- if (unlikely(entry->alignment && !is_power_of_2(entry->alignment))) + if (unlikely(entry->alignment && + !is_power_of_2_u64(entry->alignment))) return -EINVAL;
/* --- a/drivers/gpu/drm/i915/i915_utils.h +++ b/drivers/gpu/drm/i915/i915_utils.h @@ -233,6 +233,11 @@ static inline u64 ptr_to_u64(const void __idx; \ })
+static inline bool is_power_of_2_u64(u64 n) +{ + return (n != 0 && ((n & (n - 1)) == 0)); +} + static inline void __list_del_many(struct list_head *head, struct list_head *first) {
From: Chris Wilson chris@chris-wilson.co.uk
commit 14a0d527a479eb2cb6067f9e5e163e1bf35db2a9 upstream.
Since the semaphore fence may be signaled from inside an interrupt handler from inside a request holding its request->lock, we cannot then enter into the engine->active.lock for processing the semaphore priority bump as we may traverse our call tree and end up on another held request.
CPU 0: [ 2243.218864] _raw_spin_lock_irqsave+0x9a/0xb0 [ 2243.218867] i915_schedule_bump_priority+0x49/0x80 [i915] [ 2243.218869] semaphore_notify+0x6d/0x98 [i915] [ 2243.218871] __i915_sw_fence_complete+0x61/0x420 [i915] [ 2243.218874] ? kmem_cache_free+0x211/0x290 [ 2243.218876] i915_sw_fence_complete+0x58/0x80 [i915] [ 2243.218879] dma_i915_sw_fence_wake+0x3e/0x80 [i915] [ 2243.218881] signal_irq_work+0x571/0x690 [i915] [ 2243.218883] irq_work_run_list+0xd7/0x120 [ 2243.218885] irq_work_run+0x1d/0x50 [ 2243.218887] smp_irq_work_interrupt+0x21/0x30 [ 2243.218889] irq_work_interrupt+0xf/0x20
CPU 1: [ 2242.173107] _raw_spin_lock+0x8f/0xa0 [ 2242.173110] __i915_request_submit+0x64/0x4a0 [i915] [ 2242.173112] __execlists_submission_tasklet+0x8ee/0x2120 [i915] [ 2242.173114] ? i915_sched_lookup_priolist+0x1e3/0x2b0 [i915] [ 2242.173117] execlists_submit_request+0x2e8/0x2f0 [i915] [ 2242.173119] submit_notify+0x8f/0xc0 [i915] [ 2242.173121] __i915_sw_fence_complete+0x61/0x420 [i915] [ 2242.173124] ? _raw_spin_unlock_irqrestore+0x39/0x40 [ 2242.173137] i915_sw_fence_complete+0x58/0x80 [i915] [ 2242.173140] i915_sw_fence_commit+0x16/0x20 [i915]
Closes: https://gitlab.freedesktop.org/drm/intel/issues/1318 Fixes: b7404c7ecb38 ("drm/i915: Bump ready tasks ahead of busywaits") Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: Tvrtko Ursulin tvrtko.ursulin@intel.com Cc: stable@vger.kernel.org # v5.2+ Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20200310101720.9944-1-chris@ch... (cherry picked from commit 209df10bb4536c81c2540df96c02cd079435357f) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/i915/i915_request.c | 22 +++++++++++++++++----- drivers/gpu/drm/i915/i915_request.h | 2 ++ 2 files changed, 19 insertions(+), 5 deletions(-)
--- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -560,19 +560,31 @@ submit_notify(struct i915_sw_fence *fenc return NOTIFY_DONE; }
+static void irq_semaphore_cb(struct irq_work *wrk) +{ + struct i915_request *rq = + container_of(wrk, typeof(*rq), semaphore_work); + + i915_schedule_bump_priority(rq, I915_PRIORITY_NOSEMAPHORE); + i915_request_put(rq); +} + static int __i915_sw_fence_call semaphore_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) { - struct i915_request *request = - container_of(fence, typeof(*request), semaphore); + struct i915_request *rq = container_of(fence, typeof(*rq), semaphore);
switch (state) { case FENCE_COMPLETE: - i915_schedule_bump_priority(request, I915_PRIORITY_NOSEMAPHORE); + if (!(READ_ONCE(rq->sched.attr.priority) & I915_PRIORITY_NOSEMAPHORE)) { + i915_request_get(rq); + init_irq_work(&rq->semaphore_work, irq_semaphore_cb); + irq_work_queue(&rq->semaphore_work); + } break;
case FENCE_FREE: - i915_request_put(request); + i915_request_put(rq); break; }
@@ -1215,9 +1227,9 @@ void __i915_request_queue(struct i915_re * decide whether to preempt the entire chain so that it is ready to * run at the earliest possible convenience. */ - i915_sw_fence_commit(&rq->semaphore); if (attr && rq->engine->schedule) rq->engine->schedule(rq, attr); + i915_sw_fence_commit(&rq->semaphore); i915_sw_fence_commit(&rq->submit); }
--- a/drivers/gpu/drm/i915/i915_request.h +++ b/drivers/gpu/drm/i915/i915_request.h @@ -26,6 +26,7 @@ #define I915_REQUEST_H
#include <linux/dma-fence.h> +#include <linux/irq_work.h> #include <linux/lockdep.h>
#include "gt/intel_context_types.h" @@ -147,6 +148,7 @@ struct i915_request { }; struct list_head execute_cb; struct i915_sw_fence semaphore; + struct irq_work semaphore_work;
/* * A list of everyone we wait upon, and everyone who waits upon us.
From: Ben Chuang ben.chuang@genesyslogic.com.tw
commit 31e43f31890ca6e909b27dcb539252b46aa465da upstream.
Enable MSI interrupt for GL9750/GL9755. Some platforms do not support PCI INTx and devices can not work without interrupt. Like messages below:
[ 4.487132] sdhci-pci 0000:01:00.0: SDHCI controller found [17a0:9755] (rev 0) [ 4.487198] ACPI BIOS Error (bug): Could not resolve symbol [_SB.PCI0.PBR2._PRT.APS2], AE_NOT_FOUND (20190816/psargs-330) [ 4.487397] ACPI Error: Aborting method _SB.PCI0.PBR2._PRT due to previous error (AE_NOT_FOUND) (20190816/psparse-529) [ 4.487707] pcieport 0000:00:01.3: can't derive routing for PCI INT A [ 4.487709] sdhci-pci 0000:01:00.0: PCI INT A: no GSI
Signed-off-by: Ben Chuang ben.chuang@genesyslogic.com.tw Tested-by: Raul E Rangel rrangel@chromium.org Fixes: e51df6ce668a ("mmc: host: sdhci-pci: Add Genesys Logic GL975x support") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200219092900.9151-1-benchuanggli@gmail.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/mmc/host/sdhci-pci-gli.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)
--- a/drivers/mmc/host/sdhci-pci-gli.c +++ b/drivers/mmc/host/sdhci-pci-gli.c @@ -262,10 +262,26 @@ static int gl9750_execute_tuning(struct return 0; }
+static void gli_pcie_enable_msi(struct sdhci_pci_slot *slot) +{ + int ret; + + ret = pci_alloc_irq_vectors(slot->chip->pdev, 1, 1, + PCI_IRQ_MSI | PCI_IRQ_MSIX); + if (ret < 0) { + pr_warn("%s: enable PCI MSI failed, error=%d\n", + mmc_hostname(slot->host->mmc), ret); + return; + } + + slot->host->irq = pci_irq_vector(slot->chip->pdev, 0); +} + static int gli_probe_slot_gl9750(struct sdhci_pci_slot *slot) { struct sdhci_host *host = slot->host;
+ gli_pcie_enable_msi(slot); slot->host->mmc->caps2 |= MMC_CAP2_NO_SDIO; sdhci_enable_v4_mode(host);
@@ -276,6 +292,7 @@ static int gli_probe_slot_gl9755(struct { struct sdhci_host *host = slot->host;
+ gli_pcie_enable_msi(slot); slot->host->mmc->caps2 |= MMC_CAP2_NO_SDIO; sdhci_enable_v4_mode(host);
From: Mathias Kresin dev@kresin.me
commit d62e7fbea4951c124a24176da0c7bf3003ec53d4 upstream.
Add the missing semicolon after of_node_put to get the file compiled.
Fixes: f17d2f54d36d ("pinctrl: falcon: Add of_node_put() before return") Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Mathias Kresin dev@kresin.me Link: https://lore.kernel.org/r/20200305182245.9636-1-dev@kresin.me Acked-by: Thomas Langer thomas.langer@intel.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pinctrl/pinctrl-falcon.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/pinctrl/pinctrl-falcon.c +++ b/drivers/pinctrl/pinctrl-falcon.c @@ -451,7 +451,7 @@ static int pinctrl_falcon_probe(struct p falcon_info.clk[*bank] = clk_get(&ppdev->dev, NULL); if (IS_ERR(falcon_info.clk[*bank])) { dev_err(&ppdev->dev, "failed to get clock\n"); - of_node_put(np) + of_node_put(np); return PTR_ERR(falcon_info.clk[*bank]); } falcon_info.membase[*bank] = devm_ioremap_resource(&pdev->dev,
From: Steven Rostedt (VMware) rostedt@goodmis.org
commit 4d00fc477a2ce8b6d2b09fb34ef9fe9918e7d434 upstream.
Before rebooting the box, a "ssh sync" is called to the test machine to see if it is alive or not. But if the test machine is in a partial state, that ssh may never actually finish, and the ktest test hangs.
Add a 10 second timeout to the sync test, which will fail after 10 seconds and then cause the test to reboot the test machine.
Cc: stable@vger.kernel.org Fixes: 6474ace999edd ("ktest.pl: Powercycle the box on reboot if no connection can be made") Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- tools/testing/ktest/ktest.pl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/testing/ktest/ktest.pl +++ b/tools/testing/ktest/ktest.pl @@ -1383,7 +1383,7 @@ sub reboot {
} else { # Make sure everything has been written to disk - run_ssh("sync"); + run_ssh("sync", 10);
if (defined($time)) { start_monitor;
From: Al Viro viro@zeniv.linux.org.uk
commit d9a9f4849fe0c9d560851ab22a85a666cddfdd24 upstream.
several iterations of ->atomic_open() calling conventions ago, we used to need fput() if ->atomic_open() failed at some point after successful finish_open(). Now (since 2016) it's not needed - struct file carries enough state to make fput() work regardless of the point in struct file lifecycle and discarding it on failure exits in open() got unified. Unfortunately, I'd missed the fact that we had an instance of ->atomic_open() (cifs one) that used to need that fput(), as well as the stale comment in finish_open() demanding such late failure handling. Trivially fixed...
Fixes: fe9ec8291fca "do_last(): take fput() on error after opening to out:" Cc: stable@kernel.org # v4.7+ Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Documentation/filesystems/porting.rst | 8 ++++++++ fs/cifs/dir.c | 1 - fs/open.c | 3 --- 3 files changed, 8 insertions(+), 4 deletions(-)
--- a/Documentation/filesystems/porting.rst +++ b/Documentation/filesystems/porting.rst @@ -850,3 +850,11 @@ business doing so. d_alloc_pseudo() is internal-only; uses outside of alloc_file_pseudo() are very suspect (and won't work in modules). Such uses are very likely to be misspelled d_alloc_anon(). + +--- + +**mandatory** + +[should've been added in 2016] stale comment in finish_open() nonwithstanding, +failure exits in ->atomic_open() instances should *NOT* fput() the file, +no matter what. Everything is handled by the caller. --- a/fs/cifs/dir.c +++ b/fs/cifs/dir.c @@ -560,7 +560,6 @@ cifs_atomic_open(struct inode *inode, st if (server->ops->close) server->ops->close(xid, tcon, &fid); cifs_del_pending_open(&open); - fput(file); rc = -ENOMEM; }
--- a/fs/open.c +++ b/fs/open.c @@ -860,9 +860,6 @@ cleanup_file: * the return value of d_splice_alias(), then the caller needs to perform dput() * on it after finish_open(). * - * On successful return @file is a fully instantiated open file. After this, if - * an error occurs in ->atomic_open(), it needs to clean up with fput(). - * * Returns zero on success or -errno if the open failed. */ int finish_open(struct file *file, struct dentry *dentry,
From: Al Viro viro@zeniv.linux.org.uk
commit 21039132650281de06a169cbe8a0f7e5c578fd8b upstream.
with the way fs/namei.c:do_last() had been done, ->atomic_open() instances needed to recognize the case when existing file got found with O_EXCL|O_CREAT, either by falling back to finish_no_open() or failing themselves. gfs2 one didn't.
Fixes: 6d4ade986f9c (GFS2: Add atomic_open support) Cc: stable@kernel.org # v3.11 Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/gfs2/inode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/gfs2/inode.c +++ b/fs/gfs2/inode.c @@ -1248,7 +1248,7 @@ static int gfs2_atomic_open(struct inode if (!(file->f_mode & FMODE_OPENED)) return finish_no_open(file, d); dput(d); - return 0; + return excl && (flags & O_CREAT) ? -EEXIST : 0; }
BUG_ON(d != NULL);
From: Vitaly Kuznetsov vkuznets@redhat.com
commit 342993f96ab24d5864ab1216f46c0b199c2baf8e upstream.
After commit 07721feee46b ("KVM: nVMX: Don't emulate instructions in guest mode") Hyper-V guests on KVM stopped booting with:
kvm_nested_vmexit: rip fffff802987d6169 reason EPT_VIOLATION info1 181 info2 0 int_info 0 int_info_err 0 kvm_page_fault: address febd0000 error_code 181 kvm_emulate_insn: 0:fffff802987d6169: f3 a5 kvm_emulate_insn: 0:fffff802987d6169: f3 a5 FAIL kvm_inj_exception: #UD (0x0)
"f3 a5" is a "rep movsw" instruction, which should not be intercepted at all. Commit c44b4c6ab80e ("KVM: emulate: clean up initializations in init_decode_cache") reduced the number of fields cleared by init_decode_cache() claiming that they are being cleared elsewhere, 'intercept', however, is left uncleared if the instruction does not have any of the "slow path" flags (NotImpl, Stack, Op3264, Sse, Mmx, CheckPerm, NearBranch, No16 and of course Intercept itself).
Fixes: c44b4c6ab80e ("KVM: emulate: clean up initializations in init_decode_cache") Fixes: 07721feee46b ("KVM: nVMX: Don't emulate instructions in guest mode") Cc: stable@vger.kernel.org Suggested-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Vitaly Kuznetsov vkuznets@redhat.com Reviewed-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/emulate.c | 1 + 1 file changed, 1 insertion(+)
--- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -5197,6 +5197,7 @@ int x86_decode_insn(struct x86_emulate_c ctxt->fetch.ptr = ctxt->fetch.data; ctxt->fetch.end = ctxt->fetch.data + insn_len; ctxt->opcode_len = 1; + ctxt->intercept = x86_intercept_none; if (insn_len > 0) memcpy(ctxt->fetch.data, insn, insn_len); else {
From: Vitaly Kuznetsov vkuznets@redhat.com
commit 95fa10103dabc38be5de8efdfced5e67576ed896 upstream.
When an EVMCS enabled L1 guest on KVM will tries doing enlightened VMEnter with EVMCS GPA = 0 the host crashes because the
evmcs_gpa != vmx->nested.hv_evmcs_vmptr
condition in nested_vmx_handle_enlightened_vmptrld() will evaluate to false (as nested.hv_evmcs_vmptr is zeroed after init). The crash will happen on vmx->nested.hv_evmcs pointer dereference.
Another problematic EVMCS ptr value is '-1' but it only causes host crash after nested_release_evmcs() invocation. The problem is exactly the same as with '0', we mistakenly think that the EVMCS pointer hasn't changed and thus nested.hv_evmcs_vmptr is valid.
Resolve the issue by adding an additional !vmx->nested.hv_evmcs check to nested_vmx_handle_enlightened_vmptrld(), this way we will always be trying kvm_vcpu_map() when nested.hv_evmcs is NULL and this is supposed to catch all invalid EVMCS GPAs.
Also, initialize hv_evmcs_vmptr to '0' in nested_release_evmcs() to be consistent with initialization where we don't currently set hv_evmcs_vmptr to '-1'.
Cc: stable@vger.kernel.org Signed-off-by: Vitaly Kuznetsov vkuznets@redhat.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/vmx/nested.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -223,7 +223,7 @@ static inline void nested_release_evmcs( return;
kvm_vcpu_unmap(vcpu, &vmx->nested.hv_evmcs_map, true); - vmx->nested.hv_evmcs_vmptr = -1ull; + vmx->nested.hv_evmcs_vmptr = 0; vmx->nested.hv_evmcs = NULL; }
@@ -1828,7 +1828,8 @@ static int nested_vmx_handle_enlightened if (!nested_enlightened_vmentry(vcpu, &evmcs_gpa)) return 1;
- if (unlikely(evmcs_gpa != vmx->nested.hv_evmcs_vmptr)) { + if (unlikely(!vmx->nested.hv_evmcs || + evmcs_gpa != vmx->nested.hv_evmcs_vmptr)) { if (!vmx->nested.hv_evmcs) vmx->nested.current_vmptr = -1ull;
From: Eugeniy Paltsev Eugeniy.Paltsev@synopsys.com
commit 8d92e992a785f35d23f845206cf8c6cafbc264e0 upstream.
The default defintions use fill pattern 0x90 for padding which for ARC generates unintended "ldh_s r12,[r0,0x20]" corresponding to opcode 0x9090
So use ".align 4" which insert a "nop_s" instruction instead.
Cc: stable@vger.kernel.org Acked-by: Vineet Gupta vgupta@synopsys.com Signed-off-by: Eugeniy Paltsev Eugeniy.Paltsev@synopsys.com Signed-off-by: Vineet Gupta vgupta@synopsys.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arc/include/asm/linkage.h | 2 ++ 1 file changed, 2 insertions(+)
--- a/arch/arc/include/asm/linkage.h +++ b/arch/arc/include/asm/linkage.h @@ -29,6 +29,8 @@ .endm
#define ASM_NL ` /* use '`' to mark new line in macro */ +#define __ALIGN .align 4 +#define __ALIGN_STR __stringify(__ALIGN)
/* annotation for data we want in DCCM - if enabled in .config */ .macro ARCFP_DATA nm
From: Miklos Szeredi mszeredi@redhat.com
commit 3e8cb8b2eaeb22f540f1cbc00cbb594047b7ba89 upstream.
Normal, synchronous requests will have their args allocated on the stack. After the FR_FINISHED bit is set by receiving the reply from the userspace fuse server, the originating task may return and reuse the stack frame, resulting in an Oops if the args structure is dereferenced.
Fix by setting a flag in the request itself upon initializing, indicating whether it has an asynchronous ->end() callback.
Reported-by: Kyle Sanderson kyle.leet@gmail.com Reported-by: Michael Stapelberg michael+lkml@stapelberg.ch Fixes: 2b319d1f6f92 ("fuse: don't dereference req->args on finished request") Cc: stable@vger.kernel.org # v5.4 Tested-by: Michael Stapelberg michael+lkml@stapelberg.ch Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/fuse/dev.c | 6 +++--- fs/fuse/fuse_i.h | 2 ++ 2 files changed, 5 insertions(+), 3 deletions(-)
--- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -276,12 +276,10 @@ static void flush_bg_queue(struct fuse_c void fuse_request_end(struct fuse_conn *fc, struct fuse_req *req) { struct fuse_iqueue *fiq = &fc->iq; - bool async;
if (test_and_set_bit(FR_FINISHED, &req->flags)) goto put_request;
- async = req->args->end; /* * test_and_set_bit() implies smp_mb() between bit * changing and below intr_entry check. Pairs with @@ -324,7 +322,7 @@ void fuse_request_end(struct fuse_conn * wake_up(&req->waitq); }
- if (async) + if (test_bit(FR_ASYNC, &req->flags)) req->args->end(fc, req->args, req->out.h.error); put_request: fuse_put_request(fc, req); @@ -471,6 +469,8 @@ static void fuse_args_to_req(struct fuse req->in.h.opcode = args->opcode; req->in.h.nodeid = args->nodeid; req->args = args; + if (args->end) + __set_bit(FR_ASYNC, &req->flags); }
ssize_t fuse_simple_request(struct fuse_conn *fc, struct fuse_args *args) --- a/fs/fuse/fuse_i.h +++ b/fs/fuse/fuse_i.h @@ -301,6 +301,7 @@ struct fuse_io_priv { * FR_SENT: request is in userspace, waiting for an answer * FR_FINISHED: request is finished * FR_PRIVATE: request is on private list + * FR_ASYNC: request is asynchronous */ enum fuse_req_flag { FR_ISREPLY, @@ -314,6 +315,7 @@ enum fuse_req_flag { FR_SENT, FR_FINISHED, FR_PRIVATE, + FR_ASYNC, };
/**
From: Stefan Haberland sth@linux.ibm.com
commit 5e6bdd37c5526ef01326df5dabb93011ee89237e upstream.
Devices are formatted in multiple of tracks. For an Extent Space Efficient (ESE) volume we get errors when accessing unformatted tracks. In this case the driver either formats the track on the flight for write requests or returns zero data for read requests.
In case a request spans multiple tracks, the indication of an unformatted track presented for the first track is incorrectly applied to all tracks covered by the request. As a result, tracks containing data will be handled as empty, resulting in zero data being returned on read, or overwriting existing data with zero on write.
Fix by determining the track that gets the NRF error. For write requests only format the track that is surely not formatted. For Read requests all tracks before have returned valid data and should not be touched. All tracks after the unformatted track might be formatted or not. Those are returned to the blocklayer to build a new request.
When using alias devices there is a chance that multiple write requests trigger a format of the same track which might lead to data loss. Ensure that a track is formatted only once by maintaining a list of currently processed tracks.
Fixes: 5e2b17e712cf ("s390/dasd: Add dynamic formatting support for ESE volumes") Cc: stable@vger.kernel.org # 5.3+ Signed-off-by: Stefan Haberland sth@linux.ibm.com Reviewed-by: Jan Hoeppner hoeppner@linux.ibm.com Reviewed-by: Peter Oberparleiter oberpar@linux.ibm.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/s390/block/dasd.c | 27 ++++++ drivers/s390/block/dasd_eckd.c | 163 +++++++++++++++++++++++++++++++++++++++-- drivers/s390/block/dasd_int.h | 15 +++ 3 files changed, 193 insertions(+), 12 deletions(-)
--- a/drivers/s390/block/dasd.c +++ b/drivers/s390/block/dasd.c @@ -178,6 +178,8 @@ struct dasd_block *dasd_alloc_block(void (unsigned long) block); INIT_LIST_HEAD(&block->ccw_queue); spin_lock_init(&block->queue_lock); + INIT_LIST_HEAD(&block->format_list); + spin_lock_init(&block->format_lock); timer_setup(&block->timer, dasd_block_timeout, 0); spin_lock_init(&block->profile.lock);
@@ -1779,20 +1781,26 @@ void dasd_int_handler(struct ccw_device
if (dasd_ese_needs_format(cqr->block, irb)) { if (rq_data_dir((struct request *)cqr->callback_data) == READ) { - device->discipline->ese_read(cqr); + device->discipline->ese_read(cqr, irb); cqr->status = DASD_CQR_SUCCESS; cqr->stopclk = now; dasd_device_clear_timer(device); dasd_schedule_device_bh(device); return; } - fcqr = device->discipline->ese_format(device, cqr); + fcqr = device->discipline->ese_format(device, cqr, irb); if (IS_ERR(fcqr)) { + if (PTR_ERR(fcqr) == -EINVAL) { + cqr->status = DASD_CQR_ERROR; + return; + } /* * If we can't format now, let the request go * one extra round. Maybe we can format later. */ cqr->status = DASD_CQR_QUEUED; + dasd_schedule_device_bh(device); + return; } else { fcqr->status = DASD_CQR_QUEUED; cqr->status = DASD_CQR_QUEUED; @@ -2748,11 +2756,13 @@ static void __dasd_cleanup_cqr(struct da { struct request *req; blk_status_t error = BLK_STS_OK; + unsigned int proc_bytes; int status;
req = (struct request *) cqr->callback_data; dasd_profile_end(cqr->block, cqr, req);
+ proc_bytes = cqr->proc_bytes; status = cqr->block->base->discipline->free_cp(cqr, req); if (status < 0) error = errno_to_blk_status(status); @@ -2783,7 +2793,18 @@ static void __dasd_cleanup_cqr(struct da blk_mq_end_request(req, error); blk_mq_run_hw_queues(req->q, true); } else { - blk_mq_complete_request(req); + /* + * Partial completed requests can happen with ESE devices. + * During read we might have gotten a NRF error and have to + * complete a request partially. + */ + if (proc_bytes) { + blk_update_request(req, BLK_STS_OK, + blk_rq_bytes(req) - proc_bytes); + blk_mq_requeue_request(req, true); + } else { + blk_mq_complete_request(req); + } } }
--- a/drivers/s390/block/dasd_eckd.c +++ b/drivers/s390/block/dasd_eckd.c @@ -207,6 +207,45 @@ static void set_ch_t(struct ch_t *geo, _ geo->head |= head; }
+/* + * calculate failing track from sense data depending if + * it is an EAV device or not + */ +static int dasd_eckd_track_from_irb(struct irb *irb, struct dasd_device *device, + sector_t *track) +{ + struct dasd_eckd_private *private = device->private; + u8 *sense = NULL; + u32 cyl; + u8 head; + + sense = dasd_get_sense(irb); + if (!sense) { + DBF_DEV_EVENT(DBF_WARNING, device, "%s", + "ESE error no sense data\n"); + return -EINVAL; + } + if (!(sense[27] & DASD_SENSE_BIT_2)) { + DBF_DEV_EVENT(DBF_WARNING, device, "%s", + "ESE error no valid track data\n"); + return -EINVAL; + } + + if (sense[27] & DASD_SENSE_BIT_3) { + /* enhanced addressing */ + cyl = sense[30] << 20; + cyl |= (sense[31] & 0xF0) << 12; + cyl |= sense[28] << 8; + cyl |= sense[29]; + } else { + cyl = sense[29] << 8; + cyl |= sense[30]; + } + head = sense[31] & 0x0F; + *track = cyl * private->rdc_data.trk_per_cyl + head; + return 0; +} + static int set_timestamp(struct ccw1 *ccw, struct DE_eckd_data *data, struct dasd_device *device) { @@ -2986,6 +3025,37 @@ static int dasd_eckd_format_device(struc 0, NULL); }
+static bool test_and_set_format_track(struct dasd_format_entry *to_format, + struct dasd_block *block) +{ + struct dasd_format_entry *format; + unsigned long flags; + bool rc = false; + + spin_lock_irqsave(&block->format_lock, flags); + list_for_each_entry(format, &block->format_list, list) { + if (format->track == to_format->track) { + rc = true; + goto out; + } + } + list_add_tail(&to_format->list, &block->format_list); + +out: + spin_unlock_irqrestore(&block->format_lock, flags); + return rc; +} + +static void clear_format_track(struct dasd_format_entry *format, + struct dasd_block *block) +{ + unsigned long flags; + + spin_lock_irqsave(&block->format_lock, flags); + list_del_init(&format->list); + spin_unlock_irqrestore(&block->format_lock, flags); +} + /* * Callback function to free ESE format requests. */ @@ -2993,15 +3063,19 @@ static void dasd_eckd_ese_format_cb(stru { struct dasd_device *device = cqr->startdev; struct dasd_eckd_private *private = device->private; + struct dasd_format_entry *format = data;
+ clear_format_track(format, cqr->basedev->block); private->count--; dasd_ffree_request(cqr, device); }
static struct dasd_ccw_req * -dasd_eckd_ese_format(struct dasd_device *startdev, struct dasd_ccw_req *cqr) +dasd_eckd_ese_format(struct dasd_device *startdev, struct dasd_ccw_req *cqr, + struct irb *irb) { struct dasd_eckd_private *private; + struct dasd_format_entry *format; struct format_data_t fdata; unsigned int recs_per_trk; struct dasd_ccw_req *fcqr; @@ -3011,23 +3085,39 @@ dasd_eckd_ese_format(struct dasd_device struct request *req; sector_t first_trk; sector_t last_trk; + sector_t curr_trk; int rc;
req = cqr->callback_data; - base = cqr->block->base; + block = cqr->block; + base = block->base; private = base->private; - block = base->block; blksize = block->bp_block; recs_per_trk = recs_per_track(&private->rdc_data, 0, blksize); + format = &startdev->format_entry;
first_trk = blk_rq_pos(req) >> block->s2b_shift; sector_div(first_trk, recs_per_trk); last_trk = (blk_rq_pos(req) + blk_rq_sectors(req) - 1) >> block->s2b_shift; sector_div(last_trk, recs_per_trk); + rc = dasd_eckd_track_from_irb(irb, base, &curr_trk); + if (rc) + return ERR_PTR(rc); + + if (curr_trk < first_trk || curr_trk > last_trk) { + DBF_DEV_EVENT(DBF_WARNING, startdev, + "ESE error track %llu not within range %llu - %llu\n", + curr_trk, first_trk, last_trk); + return ERR_PTR(-EINVAL); + } + format->track = curr_trk; + /* test if track is already in formatting by another thread */ + if (test_and_set_format_track(format, block)) + return ERR_PTR(-EEXIST);
- fdata.start_unit = first_trk; - fdata.stop_unit = last_trk; + fdata.start_unit = curr_trk; + fdata.stop_unit = curr_trk; fdata.blksize = blksize; fdata.intensity = private->uses_cdl ? DASD_FMT_INT_COMPAT : 0;
@@ -3044,6 +3134,7 @@ dasd_eckd_ese_format(struct dasd_device return fcqr;
fcqr->callback = dasd_eckd_ese_format_cb; + fcqr->callback_data = (void *) format;
return fcqr; } @@ -3051,29 +3142,87 @@ dasd_eckd_ese_format(struct dasd_device /* * When data is read from an unformatted area of an ESE volume, this function * returns zeroed data and thereby mimics a read of zero data. + * + * The first unformatted track is the one that got the NRF error, the address is + * encoded in the sense data. + * + * All tracks before have returned valid data and should not be touched. + * All tracks after the unformatted track might be formatted or not. This is + * currently not known, remember the processed data and return the remainder of + * the request to the blocklayer in __dasd_cleanup_cqr(). */ -static void dasd_eckd_ese_read(struct dasd_ccw_req *cqr) +static int dasd_eckd_ese_read(struct dasd_ccw_req *cqr, struct irb *irb) { + struct dasd_eckd_private *private; + sector_t first_trk, last_trk; + sector_t first_blk, last_blk; unsigned int blksize, off; + unsigned int recs_per_trk; struct dasd_device *base; struct req_iterator iter; + struct dasd_block *block; + unsigned int skip_block; + unsigned int blk_count; struct request *req; struct bio_vec bv; + sector_t curr_trk; + sector_t end_blk; char *dst; + int rc;
req = (struct request *) cqr->callback_data; base = cqr->block->base; blksize = base->block->bp_block; + block = cqr->block; + private = base->private; + skip_block = 0; + blk_count = 0; + + recs_per_trk = recs_per_track(&private->rdc_data, 0, blksize); + first_trk = first_blk = blk_rq_pos(req) >> block->s2b_shift; + sector_div(first_trk, recs_per_trk); + last_trk = last_blk = + (blk_rq_pos(req) + blk_rq_sectors(req) - 1) >> block->s2b_shift; + sector_div(last_trk, recs_per_trk); + rc = dasd_eckd_track_from_irb(irb, base, &curr_trk); + if (rc) + return rc; + + /* sanity check if the current track from sense data is valid */ + if (curr_trk < first_trk || curr_trk > last_trk) { + DBF_DEV_EVENT(DBF_WARNING, base, + "ESE error track %llu not within range %llu - %llu\n", + curr_trk, first_trk, last_trk); + return -EINVAL; + } + + /* + * if not the first track got the NRF error we have to skip over valid + * blocks + */ + if (curr_trk != first_trk) + skip_block = curr_trk * recs_per_trk - first_blk; + + /* we have no information beyond the current track */ + end_blk = (curr_trk + 1) * recs_per_trk;
rq_for_each_segment(bv, req, iter) { dst = page_address(bv.bv_page) + bv.bv_offset; for (off = 0; off < bv.bv_len; off += blksize) { - if (dst && rq_data_dir(req) == READ) { + if (first_blk + blk_count >= end_blk) { + cqr->proc_bytes = blk_count * blksize; + return 0; + } + if (dst && !skip_block) { dst += off; memset(dst, 0, blksize); + } else { + skip_block--; } + blk_count++; } } + return 0; }
/* --- a/drivers/s390/block/dasd_int.h +++ b/drivers/s390/block/dasd_int.h @@ -187,6 +187,7 @@ struct dasd_ccw_req {
void (*callback)(struct dasd_ccw_req *, void *data); void *callback_data; + unsigned int proc_bytes; /* bytes for partial completion */ };
/* @@ -387,8 +388,9 @@ struct dasd_discipline { int (*ext_pool_warn_thrshld)(struct dasd_device *); int (*ext_pool_oos)(struct dasd_device *); int (*ext_pool_exhaust)(struct dasd_device *, struct dasd_ccw_req *); - struct dasd_ccw_req *(*ese_format)(struct dasd_device *, struct dasd_ccw_req *); - void (*ese_read)(struct dasd_ccw_req *); + struct dasd_ccw_req *(*ese_format)(struct dasd_device *, + struct dasd_ccw_req *, struct irb *); + int (*ese_read)(struct dasd_ccw_req *, struct irb *); };
extern struct dasd_discipline *dasd_diag_discipline_pointer; @@ -474,6 +476,11 @@ struct dasd_profile { spinlock_t lock; };
+struct dasd_format_entry { + struct list_head list; + sector_t track; +}; + struct dasd_device { /* Block device stuff. */ struct dasd_block *block; @@ -539,6 +546,7 @@ struct dasd_device { struct dentry *debugfs_dentry; struct dentry *hosts_dentry; struct dasd_profile profile; + struct dasd_format_entry format_entry; };
struct dasd_block { @@ -564,6 +572,9 @@ struct dasd_block {
struct dentry *debugfs_dentry; struct dasd_profile profile; + + struct list_head format_list; + spinlock_t format_lock; };
struct dasd_attention_data {
From: Takashi Iwai tiwai@suse.de
commit 443d372d6a96cd94ad119e5c14bb4d63a536a7f6 upstream.
Although the IRQ assignment in ipmi_si driver is optional, platform_get_irq() spews error messages unnecessarily: ipmi_si dmi-ipmi-si.0: IRQ index 0 not found
Fix this by switching to platform_get_irq_optional().
Cc: stable@vger.kernel.org # 5.4.x Cc: John Donnelly john.p.donnelly@oracle.com Fixes: 7723f4c5ecdb ("driver core: platform: Add an error message to platform_get_irq*()") Reported-and-tested-by: Patrick Vo patrick.vo@hpe.com Signed-off-by: Takashi Iwai tiwai@suse.de Message-Id: 20200205093146.1352-1-tiwai@suse.de Signed-off-by: Corey Minyard cminyard@mvista.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/char/ipmi/ipmi_si_platform.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/char/ipmi/ipmi_si_platform.c +++ b/drivers/char/ipmi/ipmi_si_platform.c @@ -194,7 +194,7 @@ static int platform_ipmi_probe(struct pl else io.slave_addr = slave_addr;
- io.irq = platform_get_irq(pdev, 0); + io.irq = platform_get_irq_optional(pdev, 0); if (io.irq > 0) io.irq_setup = ipmi_std_irq_setup; else @@ -378,7 +378,7 @@ static int acpi_ipmi_probe(struct platfo io.irq = tmp; io.irq_setup = acpi_gpe_irq_setup; } else { - int irq = platform_get_irq(pdev, 0); + int irq = platform_get_irq_optional(pdev, 0);
if (irq > 0) { io.irq = irq;
From: Tejun Heo tj@kernel.org
commit dcd6589b11d3b1e71f516a87a7b9646ed356b4c0 upstream.
vtimes may wrap and time_before/after64() should be used to determine whether a given vtime is before or after another. iocg_is_idle() was incorrectly using plain "<" comparison do determine whether done_vtime is before vtime. Here, the only thing we're interested in is whether done_vtime matches vtime which indicates that there's nothing in flight. Let's test for inequality instead.
Signed-off-by: Tejun Heo tj@kernel.org Fixes: 7caa47151ab2 ("blkcg: implement blk-iocost") Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- block/blk-iocost.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/block/blk-iocost.c +++ b/block/blk-iocost.c @@ -1318,7 +1318,7 @@ static bool iocg_is_idle(struct ioc_gq * return false;
/* is something in flight? */ - if (atomic64_read(&iocg->done_vtime) < atomic64_read(&iocg->vtime)) + if (atomic64_read(&iocg->done_vtime) != atomic64_read(&iocg->vtime)) return false;
return true;
From: Eric Biggers ebiggers@google.com
commit 2b4eae95c7361e0a147b838715c8baa1380a428f upstream.
After FS_IOC_REMOVE_ENCRYPTION_KEY removes a key, it syncs the filesystem and tries to get and put all inodes that were unlocked by the key so that unused inodes get evicted via fscrypt_drop_inode(). Normally, the inodes are all clean due to the sync.
However, after the filesystem is sync'ed, userspace can modify and close one of the files. (Userspace is *supposed* to close the files before removing the key. But it doesn't always happen, and the kernel can't assume it.) This causes the inode to be dirtied and have i_count == 0. Then, fscrypt_drop_inode() failed to consider this case and indicated that the inode can be dropped, causing the write to be lost.
On f2fs, other problems such as a filesystem freeze could occur due to the inode being freed while still on f2fs's dirty inode list.
Fix this bug by making fscrypt_drop_inode() only drop clean inodes.
I've written an xfstest which detects this bug on ext4, f2fs, and ubifs.
Fixes: b1c0ec3599f4 ("fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY ioctl") Cc: stable@vger.kernel.org # v5.4+ Link: https://lore.kernel.org/r/20200305084138.653498-1-ebiggers@kernel.org Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/crypto/keysetup.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/fs/crypto/keysetup.c +++ b/fs/crypto/keysetup.c @@ -579,6 +579,15 @@ int fscrypt_drop_inode(struct inode *ino mk = ci->ci_master_key->payload.data[0];
/* + * With proper, non-racy use of FS_IOC_REMOVE_ENCRYPTION_KEY, all inodes + * protected by the key were cleaned by sync_filesystem(). But if + * userspace is still using the files, inodes can be dirtied between + * then and now. We mustn't lose any writes, so skip dirty inodes here. + */ + if (inode->i_state & I_DIRTY_ALL) + return 0; + + /* * Note: since we aren't holding ->mk_secret_sem, the result here can * immediately become outdated. But there's no correctness problem with * unnecessarily evicting. Nor is there a correctness problem with not
From: Wolfram Sang wsa@the-dreams.de
commit bcf3588d8ed3517e6ffaf083f034812aee9dc8e2 upstream.
Commit af503716ac14 made sure OF devices get an OF style modalias with I2C events. It assumed all in-tree users were converted, yet it missed some Macintosh drivers.
Add an OF module device table for all windfarm drivers to make them automatically load again.
Fixes: af503716ac14 ("i2c: core: report OF style module alias for devices registered via OF") Link: https://bugzilla.kernel.org/show_bug.cgi?id=199471 Reported-by: Erhard Furtner erhard_f@mailbox.org Tested-by: Erhard Furtner erhard_f@mailbox.org Acked-by: Michael Ellerman mpe@ellerman.id.au (powerpc) Signed-off-by: Wolfram Sang wsa@the-dreams.de Cc: stable@kernel.org # v4.17+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/macintosh/windfarm_ad7417_sensor.c | 7 +++++++ drivers/macintosh/windfarm_fcu_controls.c | 7 +++++++ drivers/macintosh/windfarm_lm75_sensor.c | 16 +++++++++++++++- drivers/macintosh/windfarm_lm87_sensor.c | 7 +++++++ drivers/macintosh/windfarm_max6690_sensor.c | 7 +++++++ drivers/macintosh/windfarm_smu_sat.c | 7 +++++++ 6 files changed, 50 insertions(+), 1 deletion(-)
--- a/drivers/macintosh/windfarm_ad7417_sensor.c +++ b/drivers/macintosh/windfarm_ad7417_sensor.c @@ -312,9 +312,16 @@ static const struct i2c_device_id wf_ad7 }; MODULE_DEVICE_TABLE(i2c, wf_ad7417_id);
+static const struct of_device_id wf_ad7417_of_id[] = { + { .compatible = "ad7417", }, + { } +}; +MODULE_DEVICE_TABLE(of, wf_ad7417_of_id); + static struct i2c_driver wf_ad7417_driver = { .driver = { .name = "wf_ad7417", + .of_match_table = wf_ad7417_of_id, }, .probe = wf_ad7417_probe, .remove = wf_ad7417_remove, --- a/drivers/macintosh/windfarm_fcu_controls.c +++ b/drivers/macintosh/windfarm_fcu_controls.c @@ -582,9 +582,16 @@ static const struct i2c_device_id wf_fcu }; MODULE_DEVICE_TABLE(i2c, wf_fcu_id);
+static const struct of_device_id wf_fcu_of_id[] = { + { .compatible = "fcu", }, + { } +}; +MODULE_DEVICE_TABLE(of, wf_fcu_of_id); + static struct i2c_driver wf_fcu_driver = { .driver = { .name = "wf_fcu", + .of_match_table = wf_fcu_of_id, }, .probe = wf_fcu_probe, .remove = wf_fcu_remove, --- a/drivers/macintosh/windfarm_lm75_sensor.c +++ b/drivers/macintosh/windfarm_lm75_sensor.c @@ -14,6 +14,7 @@ #include <linux/init.h> #include <linux/wait.h> #include <linux/i2c.h> +#include <linux/of_device.h> #include <asm/prom.h> #include <asm/machdep.h> #include <asm/io.h> @@ -91,9 +92,14 @@ static int wf_lm75_probe(struct i2c_clie const struct i2c_device_id *id) { struct wf_lm75_sensor *lm; - int rc, ds1775 = id->driver_data; + int rc, ds1775; const char *name, *loc;
+ if (id) + ds1775 = id->driver_data; + else + ds1775 = !!of_device_get_match_data(&client->dev); + DBG("wf_lm75: creating %s device at address 0x%02x\n", ds1775 ? "ds1775" : "lm75", client->addr);
@@ -164,9 +170,17 @@ static const struct i2c_device_id wf_lm7 }; MODULE_DEVICE_TABLE(i2c, wf_lm75_id);
+static const struct of_device_id wf_lm75_of_id[] = { + { .compatible = "lm75", .data = (void *)0}, + { .compatible = "ds1775", .data = (void *)1 }, + { } +}; +MODULE_DEVICE_TABLE(of, wf_lm75_of_id); + static struct i2c_driver wf_lm75_driver = { .driver = { .name = "wf_lm75", + .of_match_table = wf_lm75_of_id, }, .probe = wf_lm75_probe, .remove = wf_lm75_remove, --- a/drivers/macintosh/windfarm_lm87_sensor.c +++ b/drivers/macintosh/windfarm_lm87_sensor.c @@ -166,9 +166,16 @@ static const struct i2c_device_id wf_lm8 }; MODULE_DEVICE_TABLE(i2c, wf_lm87_id);
+static const struct of_device_id wf_lm87_of_id[] = { + { .compatible = "lm87cimt", }, + { } +}; +MODULE_DEVICE_TABLE(of, wf_lm87_of_id); + static struct i2c_driver wf_lm87_driver = { .driver = { .name = "wf_lm87", + .of_match_table = wf_lm87_of_id, }, .probe = wf_lm87_probe, .remove = wf_lm87_remove, --- a/drivers/macintosh/windfarm_max6690_sensor.c +++ b/drivers/macintosh/windfarm_max6690_sensor.c @@ -120,9 +120,16 @@ static const struct i2c_device_id wf_max }; MODULE_DEVICE_TABLE(i2c, wf_max6690_id);
+static const struct of_device_id wf_max6690_of_id[] = { + { .compatible = "max6690", }, + { } +}; +MODULE_DEVICE_TABLE(of, wf_max6690_of_id); + static struct i2c_driver wf_max6690_driver = { .driver = { .name = "wf_max6690", + .of_match_table = wf_max6690_of_id, }, .probe = wf_max6690_probe, .remove = wf_max6690_remove, --- a/drivers/macintosh/windfarm_smu_sat.c +++ b/drivers/macintosh/windfarm_smu_sat.c @@ -341,9 +341,16 @@ static const struct i2c_device_id wf_sat }; MODULE_DEVICE_TABLE(i2c, wf_sat_id);
+static const struct of_device_id wf_sat_of_id[] = { + { .compatible = "smu-sat", }, + { } +}; +MODULE_DEVICE_TABLE(of, wf_sat_of_id); + static struct i2c_driver wf_sat_driver = { .driver = { .name = "wf_smu_sat", + .of_match_table = wf_sat_of_id, }, .probe = wf_sat_probe, .remove = wf_sat_remove,
From: Tom Lendacky thomas.lendacky@amd.com
commit 985e537a4082b4635754a57f4f95430790afee6a upstream.
The dmidecode program fails to properly decode the SMBIOS data supplied by OVMF/UEFI when running in an SEV guest. The SMBIOS area, under SEV, is encrypted and resides in reserved memory that is marked as EFI runtime services data.
As a result, when memremap() is attempted for the SMBIOS data, it can't be mapped as regular RAM (through try_ram_remap()) and, since the address isn't part of the iomem resources list, it isn't mapped encrypted through the fallback ioremap().
Add a new __ioremap_check_other() to deal with memory types like EFI_RUNTIME_SERVICES_DATA which are not covered by the resource ranges.
This allows any runtime services data which has been created encrypted, to be mapped encrypted too.
[ bp: Move functionality to a separate function. ]
Signed-off-by: Tom Lendacky thomas.lendacky@amd.com Signed-off-by: Borislav Petkov bp@suse.de Reviewed-by: Joerg Roedel jroedel@suse.de Tested-by: Joerg Roedel jroedel@suse.de Cc: stable@vger.kernel.org # 5.3 Link: https://lkml.kernel.org/r/2d9e16eb5b53dc82665c95c6764b7407719df7a0.158264532... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/mm/ioremap.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+)
--- a/arch/x86/mm/ioremap.c +++ b/arch/x86/mm/ioremap.c @@ -106,6 +106,19 @@ static unsigned int __ioremap_check_encr return 0; }
+/* + * The EFI runtime services data area is not covered by walk_mem_res(), but must + * be mapped encrypted when SEV is active. + */ +static void __ioremap_check_other(resource_size_t addr, struct ioremap_desc *desc) +{ + if (!sev_active()) + return; + + if (efi_mem_type(addr) == EFI_RUNTIME_SERVICES_DATA) + desc->flags |= IORES_MAP_ENCRYPTED; +} + static int __ioremap_collect_map_flags(struct resource *res, void *arg) { struct ioremap_desc *desc = arg; @@ -124,6 +137,9 @@ static int __ioremap_collect_map_flags(s * To avoid multiple resource walks, this function walks resources marked as * IORESOURCE_MEM and IORESOURCE_BUSY and looking for system RAM and/or a * resource described not as IORES_DESC_NONE (e.g. IORES_DESC_ACPI_TABLES). + * + * After that, deal with misc other ranges in __ioremap_check_other() which do + * not fall into the above category. */ static void __ioremap_check_mem(resource_size_t addr, unsigned long size, struct ioremap_desc *desc) @@ -135,6 +151,8 @@ static void __ioremap_check_mem(resource memset(desc, 0, sizeof(struct ioremap_desc));
walk_mem_res(start, end, desc, __ioremap_collect_map_flags); + + __ioremap_check_other(addr, desc); }
/*
From: Vladis Dronov vdronov@redhat.com
commit 286d3250c9d6437340203fb64938bea344729a0e upstream.
There is a race and a buffer overflow corrupting a kernel memory while reading an EFI variable with a size more than 1024 bytes via the older sysfs method. This happens because accessing struct efi_variable in efivar_{attr,size,data}_read() and friends is not protected from a concurrent access leading to a kernel memory corruption and, at best, to a crash. The race scenario is the following:
CPU0: CPU1: efivar_attr_read() var->DataSize = 1024; efivar_entry_get(... &var->DataSize) down_interruptible(&efivars_lock) efivar_attr_read() // same EFI var var->DataSize = 1024; efivar_entry_get(... &var->DataSize) down_interruptible(&efivars_lock) virt_efi_get_variable() // returns EFI_BUFFER_TOO_SMALL but // var->DataSize is set to a real // var size more than 1024 bytes up(&efivars_lock) virt_efi_get_variable() // called with var->DataSize set // to a real var size, returns // successfully and overwrites // a 1024-bytes kernel buffer up(&efivars_lock)
This can be reproduced by concurrent reading of an EFI variable which size is more than 1024 bytes:
ts# for cpu in $(seq 0 $(nproc --ignore=1)); do ( taskset -c $cpu \ cat /sys/firmware/efi/vars/KEKDefault*/size & ) ; done
Fix this by using a local variable for a var's data buffer size so it does not get overwritten.
Fixes: e14ab23dde12b80d ("efivars: efivar_entry API") Reported-by: Bob Sanders bob.sanders@hpe.com and the LTP testsuite Signed-off-by: Vladis Dronov vdronov@redhat.com Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200305084041.24053-2-vdronov@redhat.com Link: https://lore.kernel.org/r/20200308080859.21568-24-ardb@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/firmware/efi/efivars.c | 29 ++++++++++++++++++++--------- 1 file changed, 20 insertions(+), 9 deletions(-)
--- a/drivers/firmware/efi/efivars.c +++ b/drivers/firmware/efi/efivars.c @@ -83,13 +83,16 @@ static ssize_t efivar_attr_read(struct efivar_entry *entry, char *buf) { struct efi_variable *var = &entry->var; + unsigned long size = sizeof(var->Data); char *str = buf; + int ret;
if (!entry || !buf) return -EINVAL;
- var->DataSize = 1024; - if (efivar_entry_get(entry, &var->Attributes, &var->DataSize, var->Data)) + ret = efivar_entry_get(entry, &var->Attributes, &size, var->Data); + var->DataSize = size; + if (ret) return -EIO;
if (var->Attributes & EFI_VARIABLE_NON_VOLATILE) @@ -116,13 +119,16 @@ static ssize_t efivar_size_read(struct efivar_entry *entry, char *buf) { struct efi_variable *var = &entry->var; + unsigned long size = sizeof(var->Data); char *str = buf; + int ret;
if (!entry || !buf) return -EINVAL;
- var->DataSize = 1024; - if (efivar_entry_get(entry, &var->Attributes, &var->DataSize, var->Data)) + ret = efivar_entry_get(entry, &var->Attributes, &size, var->Data); + var->DataSize = size; + if (ret) return -EIO;
str += sprintf(str, "0x%lx\n", var->DataSize); @@ -133,12 +139,15 @@ static ssize_t efivar_data_read(struct efivar_entry *entry, char *buf) { struct efi_variable *var = &entry->var; + unsigned long size = sizeof(var->Data); + int ret;
if (!entry || !buf) return -EINVAL;
- var->DataSize = 1024; - if (efivar_entry_get(entry, &var->Attributes, &var->DataSize, var->Data)) + ret = efivar_entry_get(entry, &var->Attributes, &size, var->Data); + var->DataSize = size; + if (ret) return -EIO;
memcpy(buf, var->Data, var->DataSize); @@ -250,14 +259,16 @@ efivar_show_raw(struct efivar_entry *ent { struct efi_variable *var = &entry->var; struct compat_efi_variable *compat; + unsigned long datasize = sizeof(var->Data); size_t size; + int ret;
if (!entry || !buf) return 0;
- var->DataSize = 1024; - if (efivar_entry_get(entry, &entry->var.Attributes, - &entry->var.DataSize, entry->var.Data)) + ret = efivar_entry_get(entry, &var->Attributes, &datasize, var->Data); + var->DataSize = datasize; + if (ret) return -EIO;
if (in_compat_syscall()) {
From: Vladis Dronov vdronov@redhat.com
commit d6c066fda90d578aacdf19771a027ed484a79825 upstream.
Add a sanity check to efivar_store_raw() the same way efivar_{attr,size,data}_read() and efivar_show_raw() have it.
Signed-off-by: Vladis Dronov vdronov@redhat.com Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200305084041.24053-3-vdronov@redhat.com Link: https://lore.kernel.org/r/20200308080859.21568-25-ardb@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/firmware/efi/efivars.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/firmware/efi/efivars.c +++ b/drivers/firmware/efi/efivars.c @@ -208,6 +208,9 @@ efivar_store_raw(struct efivar_entry *en u8 *data; int err;
+ if (!entry || !buf) + return -EINVAL; + if (in_compat_syscall()) { struct compat_efi_variable *compat;
From: Jarkko Nikula jarkko.nikula@linux.intel.com
commit 9be8bc4dd6177cf992b93b0bd014c4f611283896 upstream.
Function i2c_dw_pci_remove() -> pci_free_irq_vectors() -> pci_disable_msi() -> free_msi_irqs() will throw a BUG_ON() for MSI enabled device since the driver has not released the requested IRQ before calling the pci_free_irq_vectors().
Here driver requests an IRQ using devm_request_irq() but automatic release happens only after remove callback. Fix this by explicitly freeing the IRQ before calling pci_free_irq_vectors().
Fixes: 21aa3983d619 ("i2c: designware-pci: Switch over to MSI interrupts") Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Jarkko Nikula jarkko.nikula@linux.intel.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Wolfram Sang wsa@the-dreams.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/i2c/busses/i2c-designware-pcidrv.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/i2c/busses/i2c-designware-pcidrv.c +++ b/drivers/i2c/busses/i2c-designware-pcidrv.c @@ -313,6 +313,7 @@ static void i2c_dw_pci_remove(struct pci pm_runtime_get_noresume(&pdev->dev);
i2c_del_adapter(&dev->adapter); + devm_free_irq(&pdev->dev, dev->irq, dev); pci_free_irq_vectors(pdev); }
From: Felix Fietkau nbd@nbd.name
commit b102f0c522cf668c8382c56a4f771b37d011cda2 upstream.
If the hardware receives an oversized packet with too many rx fragments, skb_shinfo(skb)->frags can overflow and corrupt memory of adjacent pages. This becomes especially visible if it corrupts the freelist pointer of a slab page.
Cc: stable@vger.kernel.org Signed-off-by: Felix Fietkau nbd@nbd.name Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/wireless/mediatek/mt76/dma.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
--- a/drivers/net/wireless/mediatek/mt76/dma.c +++ b/drivers/net/wireless/mediatek/mt76/dma.c @@ -448,10 +448,13 @@ mt76_add_fragment(struct mt76_dev *dev, struct page *page = virt_to_head_page(data); int offset = data - page_address(page); struct sk_buff *skb = q->rx_head; + struct skb_shared_info *shinfo = skb_shinfo(skb);
- offset += q->buf_offset; - skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page, offset, len, - q->buf_size); + if (shinfo->nr_frags < ARRAY_SIZE(shinfo->frags)) { + offset += q->buf_offset; + skb_add_rx_frag(skb, shinfo->nr_frags, page, offset, len, + q->buf_size); + }
if (more) return;
From: Kim Phillips kim.phillips@amd.com
commit f967140dfb7442e2db0868b03b961f9c59418a1b upstream.
Enable the sampling check in kernel/events/core.c::perf_event_open(), which returns the more appropriate -EOPNOTSUPP.
BEFORE:
$ sudo perf record -a -e instructions,l3_request_g1.caching_l3_cache_accesses true Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (l3_request_g1.caching_l3_cache_accesses). /bin/dmesg | grep -i perf may provide additional information.
With nothing relevant in dmesg.
AFTER:
$ sudo perf record -a -e instructions,l3_request_g1.caching_l3_cache_accesses true Error: l3_request_g1.caching_l3_cache_accesses: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'
Fixes: c43ca5091a37 ("perf/x86/amd: Add support for AMD NB and L2I "uncore" counters") Signed-off-by: Kim Phillips kim.phillips@amd.com Signed-off-by: Borislav Petkov bp@suse.de Acked-by: Peter Zijlstra peterz@infradead.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200311191323.13124-1-kim.phillips@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/events/amd/uncore.c | 17 +++++++---------- 1 file changed, 7 insertions(+), 10 deletions(-)
--- a/arch/x86/events/amd/uncore.c +++ b/arch/x86/events/amd/uncore.c @@ -190,15 +190,12 @@ static int amd_uncore_event_init(struct
/* * NB and Last level cache counters (MSRs) are shared across all cores - * that share the same NB / Last level cache. Interrupts can be directed - * to a single target core, however, event counts generated by processes - * running on other cores cannot be masked out. So we do not support - * sampling and per-thread events. + * that share the same NB / Last level cache. On family 16h and below, + * Interrupts can be directed to a single target core, however, event + * counts generated by processes running on other cores cannot be masked + * out. So we do not support sampling and per-thread events via + * CAP_NO_INTERRUPT, and we do not enable counter overflow interrupts: */ - if (is_sampling_event(event) || event->attach_state & PERF_ATTACH_TASK) - return -EINVAL; - - /* and we do not enable counter overflow interrupts */ hwc->config = event->attr.config & AMD64_RAW_EVENT_MASK_NB; hwc->idx = -1;
@@ -306,7 +303,7 @@ static struct pmu amd_nb_pmu = { .start = amd_uncore_start, .stop = amd_uncore_stop, .read = amd_uncore_read, - .capabilities = PERF_PMU_CAP_NO_EXCLUDE, + .capabilities = PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT, };
static struct pmu amd_llc_pmu = { @@ -317,7 +314,7 @@ static struct pmu amd_llc_pmu = { .start = amd_uncore_start, .stop = amd_uncore_stop, .read = amd_uncore_read, - .capabilities = PERF_PMU_CAP_NO_EXCLUDE, + .capabilities = PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT, };
static struct amd_uncore *amd_uncore_alloc(unsigned int cpu)
From: Tony Luck tony.luck@intel.com
commit 59b5809655bdafb0767d3fd00a3e41711aab07e6 upstream.
There are two implemented bits in the PPIN_CTL MSR:
Bit 0: LockOut (R/WO) Set 1 to prevent further writes to MSR_PPIN_CTL.
Bit 1: Enable_PPIN (R/W) If 1, enables MSR_PPIN to be accessible using RDMSR. If 0, an attempt to read MSR_PPIN will cause #GP.
So there are four defined values: 0: PPIN is disabled, PPIN_CTL may be updated 1: PPIN is disabled. PPIN_CTL is locked against updates 2: PPIN is enabled. PPIN_CTL may be updated 3: PPIN is enabled. PPIN_CTL is locked against updates
Code would only enable the X86_FEATURE_INTEL_PPIN feature for case "2". When it should have done so for both case "2" and case "3".
Fix the final test to just check for the enable bit. Also fix some of the other comments in this function.
Fixes: 3f5a7896a509 ("x86/mce: Include the PPIN in MCE records when available") Signed-off-by: Tony Luck tony.luck@intel.com Signed-off-by: Borislav Petkov bp@suse.de Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200226011737.9958-1-tony.luck@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/cpu/mce/intel.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
--- a/arch/x86/kernel/cpu/mce/intel.c +++ b/arch/x86/kernel/cpu/mce/intel.c @@ -489,17 +489,18 @@ static void intel_ppin_init(struct cpuin return;
if ((val & 3UL) == 1UL) { - /* PPIN available but disabled: */ + /* PPIN locked in disabled mode */ return; }
- /* If PPIN is disabled, but not locked, try to enable: */ - if (!(val & 3UL)) { + /* If PPIN is disabled, try to enable */ + if (!(val & 2UL)) { wrmsrl_safe(MSR_PPIN_CTL, val | 2UL); rdmsrl_safe(MSR_PPIN_CTL, &val); }
- if ((val & 3UL) == 2UL) + /* Is the enable bit set? */ + if (val & 2UL) set_cpu_cap(c, X86_FEATURE_INTEL_PPIN); } }
From: Marc Zyngier maz@kernel.org
commit 65ac74f1de3334852fb7d9b1b430fa5a06524276 upstream.
The way cookie_init_hw_msi_region() allocates the iommu_dma_msi_page structures doesn't match the way iommu_put_dma_cookie() frees them.
The former performs a single allocation of all the required structures, while the latter tries to free them one at a time. It doesn't quite work for the main use case (the GICv3 ITS where the range is 64kB) when the base granule size is 4kB.
This leads to a nice slab corruption on teardown, which is easily observable by simply creating a VF on a SRIOV-capable device, and tearing it down immediately (no need to even make use of it). Fortunately, this only affects systems where the ITS isn't translated by the SMMU, which are both rare and non-standard.
Fix it by allocating iommu_dma_msi_page structures one at a time.
Fixes: 7c1b058c8b5a3 ("iommu/dma: Handle IOMMU API reserved regions") Signed-off-by: Marc Zyngier maz@kernel.org Reviewed-by: Eric Auger eric.auger@redhat.com Cc: Robin Murphy robin.murphy@arm.com Cc: Joerg Roedel jroedel@suse.de Cc: Will Deacon will@kernel.org Cc: stable@vger.kernel.org Reviewed-by: Robin Murphy robin.murphy@arm.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iommu/dma-iommu.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-)
--- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -176,15 +176,15 @@ static int cookie_init_hw_msi_region(str start -= iova_offset(iovad, start); num_pages = iova_align(iovad, end - start) >> iova_shift(iovad);
- msi_page = kcalloc(num_pages, sizeof(*msi_page), GFP_KERNEL); - if (!msi_page) - return -ENOMEM; - for (i = 0; i < num_pages; i++) { - msi_page[i].phys = start; - msi_page[i].iova = start; - INIT_LIST_HEAD(&msi_page[i].list); - list_add(&msi_page[i].list, &cookie->msi_page_list); + msi_page = kmalloc(sizeof(*msi_page), GFP_KERNEL); + if (!msi_page) + return -ENOMEM; + + msi_page->phys = start; + msi_page->iova = start; + INIT_LIST_HEAD(&msi_page->list); + list_add(&msi_page->list, &cookie->msi_page_list); start += iovad->granule; }
From: Hans de Goede hdegoede@redhat.com
commit 59833696442c674acbbd297772ba89e7ad8c753d upstream.
Quoting from the comment describing the WARN functions in include/asm-generic/bug.h:
* WARN(), WARN_ON(), WARN_ON_ONCE, and so on can be used to report * significant kernel issues that need prompt attention if they should ever * appear at runtime. * * Do not use these macros when checking for invalid external inputs
The (buggy) firmware tables which the dmar code was calling WARN_TAINT for really are invalid external inputs. They are not under the kernel's control and the issues in them cannot be fixed by a kernel update. So logging a backtrace, which invites bug reports to be filed about this, is not helpful.
Some distros, e.g. Fedora, have tools watching for the kernel backtraces logged by the WARN macros and offer the user an option to file a bug for this when these are encountered. The WARN_TAINT in warn_invalid_dmar() + another iommu WARN_TAINT, addressed in another patch, have lead to over a 100 bugs being filed this way.
This commit replaces the WARN_TAINT("...") calls, with pr_warn(FW_BUG "...") + add_taint(TAINT_FIRMWARE_WORKAROUND, ...) calls avoiding the backtrace and thus also avoiding bug-reports being filed about this against the kernel.
Fixes: fd0c8894893c ("intel-iommu: Set a more specific taint flag for invalid BIOS DMAR tables") Fixes: e625b4a95d50 ("iommu/vt-d: Parse ANDD records") Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Joerg Roedel jroedel@suse.de Acked-by: Lu Baolu baolu.lu@linux.intel.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200309140138.3753-2-hdegoede@redhat.com BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1564895 Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iommu/dmar.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
--- a/drivers/iommu/dmar.c +++ b/drivers/iommu/dmar.c @@ -440,12 +440,13 @@ static int __init dmar_parse_one_andd(st
/* Check for NUL termination within the designated length */ if (strnlen(andd->device_name, header->length - 8) == header->length - 8) { - WARN_TAINT(1, TAINT_FIRMWARE_WORKAROUND, + pr_warn(FW_BUG "Your BIOS is broken; ANDD object name is not NUL-terminated\n" "BIOS vendor: %s; Ver: %s; Product Version: %s\n", dmi_get_system_info(DMI_BIOS_VENDOR), dmi_get_system_info(DMI_BIOS_VERSION), dmi_get_system_info(DMI_PRODUCT_VERSION)); + add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK); return -EINVAL; } pr_info("ANDD device: %x name: %s\n", andd->device_number, @@ -471,14 +472,14 @@ static int dmar_parse_one_rhsa(struct ac return 0; } } - WARN_TAINT( - 1, TAINT_FIRMWARE_WORKAROUND, + pr_warn(FW_BUG "Your BIOS is broken; RHSA refers to non-existent DMAR unit at %llx\n" "BIOS vendor: %s; Ver: %s; Product Version: %s\n", drhd->reg_base_addr, dmi_get_system_info(DMI_BIOS_VENDOR), dmi_get_system_info(DMI_BIOS_VERSION), dmi_get_system_info(DMI_PRODUCT_VERSION)); + add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
return 0; } @@ -827,14 +828,14 @@ int __init dmar_table_init(void)
static void warn_invalid_dmar(u64 addr, const char *message) { - WARN_TAINT_ONCE( - 1, TAINT_FIRMWARE_WORKAROUND, + pr_warn_once(FW_BUG "Your BIOS is broken; DMAR reported at address %llx%s!\n" "BIOS vendor: %s; Ver: %s; Product Version: %s\n", addr, message, dmi_get_system_info(DMI_BIOS_VENDOR), dmi_get_system_info(DMI_BIOS_VERSION), dmi_get_system_info(DMI_PRODUCT_VERSION)); + add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK); }
static int __ref
From: Amol Grover frextrite@gmail.com
commit 02d715b4a8182f4887d82df82a7b83aced647760 upstream.
dmar_drhd_units is traversed using list_for_each_entry_rcu() outside of an RCU read side critical section but under the protection of dmar_global_lock. Hence add corresponding lockdep expression to silence the following false-positive warnings:
[ 1.603975] ============================= [ 1.603976] WARNING: suspicious RCU usage [ 1.603977] 5.5.4-stable #17 Not tainted [ 1.603978] ----------------------------- [ 1.603980] drivers/iommu/intel-iommu.c:4769 RCU-list traversed in non-reader section!!
[ 1.603869] ============================= [ 1.603870] WARNING: suspicious RCU usage [ 1.603872] 5.5.4-stable #17 Not tainted [ 1.603874] ----------------------------- [ 1.603875] drivers/iommu/dmar.c:293 RCU-list traversed in non-reader section!!
Tested-by: Madhuparna Bhowmik madhuparnabhowmik10@gmail.com Signed-off-by: Amol Grover frextrite@gmail.com Cc: stable@vger.kernel.org Acked-by: Lu Baolu baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/dmar.h | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
--- a/include/linux/dmar.h +++ b/include/linux/dmar.h @@ -69,8 +69,9 @@ struct dmar_pci_notify_info { extern struct rw_semaphore dmar_global_lock; extern struct list_head dmar_drhd_units;
-#define for_each_drhd_unit(drhd) \ - list_for_each_entry_rcu(drhd, &dmar_drhd_units, list) +#define for_each_drhd_unit(drhd) \ + list_for_each_entry_rcu(drhd, &dmar_drhd_units, list, \ + dmar_rcu_check())
#define for_each_active_drhd_unit(drhd) \ list_for_each_entry_rcu(drhd, &dmar_drhd_units, list) \ @@ -81,7 +82,8 @@ extern struct list_head dmar_drhd_units; if (i=drhd->iommu, drhd->ignored) {} else
#define for_each_iommu(i, drhd) \ - list_for_each_entry_rcu(drhd, &dmar_drhd_units, list) \ + list_for_each_entry_rcu(drhd, &dmar_drhd_units, list, \ + dmar_rcu_check()) \ if (i=drhd->iommu, 0) {} else
static inline bool dmar_rcu_check(void)
From: Yonghyun Hwang yonghyun@google.com
commit 77a1bce84bba01f3f143d77127b72e872b573795 upstream.
intel_iommu_iova_to_phys() has a bug when it translates an IOVA for a huge page onto its corresponding physical address. This commit fixes the bug by accomodating the level of page entry for the IOVA and adds IOVA's lower address to the physical address.
Cc: stable@vger.kernel.org Acked-by: Lu Baolu baolu.lu@linux.intel.com Reviewed-by: Moritz Fischer mdf@kernel.org Signed-off-by: Yonghyun Hwang yonghyun@google.com Fixes: 3871794642579 ("VT-d: Changes to support KVM") Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iommu/intel-iommu.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5524,8 +5524,10 @@ static phys_addr_t intel_iommu_iova_to_p u64 phys = 0;
pte = pfn_to_dma_pte(dmar_domain, iova >> VTD_PAGE_SHIFT, &level); - if (pte) - phys = dma_pte_addr(pte); + if (pte && dma_pte_present(pte)) + phys = dma_pte_addr(pte) + + (iova & (BIT_MASK(level_to_offset_bits(level) + + VTD_PAGE_SHIFT) - 1));
return phys; }
From: Sven Eckelmann sven@narfation.org
commit 8e8ce08198de193e3d21d42e96945216e3d9ac7f upstream.
A transmission scheduling for an interface which is currently dropped by batadv_iv_ogm_iface_disable could still be in progress. The B.A.T.M.A.N. V is simply cancelling the workqueue item in an synchronous way but this is not possible with B.A.T.M.A.N. IV because the OGM submissions are intertwined.
Instead it has to stop submitting the OGM when it detect that the buffer pointer is set to NULL.
Reported-by: syzbot+a98f2016f40b9cd3818a@syzkaller.appspotmail.com Reported-by: syzbot+ac36b6a33c28a491e929@syzkaller.appspotmail.com Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol") Signed-off-by: Sven Eckelmann sven@narfation.org Cc: Hillf Danton hdanton@sina.com Signed-off-by: Simon Wunderlich sw@simonwunderlich.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/batman-adv/bat_iv_ogm.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/net/batman-adv/bat_iv_ogm.c +++ b/net/batman-adv/bat_iv_ogm.c @@ -789,6 +789,10 @@ static void batadv_iv_ogm_schedule_buff(
lockdep_assert_held(&hard_iface->bat_iv.ogm_buff_mutex);
+ /* interface already disabled by batadv_iv_ogm_iface_disable */ + if (!*ogm_buff) + return; + /* the interface gets activated here to avoid race conditions between * the moment of activating the interface in * hardif_activate_interface() where the originator mac is set and
From: Anson Huang Anson.Huang@nxp.com
commit 5eb40257047fb11085d582b7b9ccd0bffe900726 upstream.
IMX8MN_CLK_I2C4 and IMX8MN_CLK_UART1's index definitions are incorrect, fix them.
Fixes: 1e80936a42e1 ("dt-bindings: imx: Add clock binding doc for i.MX8MN") Signed-off-by: Anson Huang Anson.Huang@nxp.com Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/dt-bindings/clock/imx8mn-clock.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/include/dt-bindings/clock/imx8mn-clock.h +++ b/include/dt-bindings/clock/imx8mn-clock.h @@ -122,8 +122,8 @@ #define IMX8MN_CLK_I2C1 105 #define IMX8MN_CLK_I2C2 106 #define IMX8MN_CLK_I2C3 107 -#define IMX8MN_CLK_I2C4 118 -#define IMX8MN_CLK_UART1 119 +#define IMX8MN_CLK_I2C4 108 +#define IMX8MN_CLK_UART1 109 #define IMX8MN_CLK_UART2 110 #define IMX8MN_CLK_UART3 111 #define IMX8MN_CLK_UART4 112
From: Nicolas Belin nbelin@baylibre.com
commit dc7a06b0dbbafac8623c2b7657e61362f2f479a7 upstream.
In the gxl driver, the sdio cmd and clk pins are inverted. It has not caused any issue so far because devices using these pins always take both pins so the resulting configuration is OK.
Fixes: 0f15f500ff2c ("pinctrl: meson: Add GXL pinctrl definitions") Reviewed-by: Jerome Brunet jbrunet@baylibre.com Signed-off-by: Nicolas Belin nbelin@baylibre.com Link: https://lore.kernel.org/r/1582204512-7582-1-git-send-email-nbelin@baylibre.c... Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pinctrl/meson/pinctrl-meson-gxl.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/pinctrl/meson/pinctrl-meson-gxl.c +++ b/drivers/pinctrl/meson/pinctrl-meson-gxl.c @@ -147,8 +147,8 @@ static const unsigned int sdio_d0_pins[] static const unsigned int sdio_d1_pins[] = { GPIOX_1 }; static const unsigned int sdio_d2_pins[] = { GPIOX_2 }; static const unsigned int sdio_d3_pins[] = { GPIOX_3 }; -static const unsigned int sdio_cmd_pins[] = { GPIOX_4 }; -static const unsigned int sdio_clk_pins[] = { GPIOX_5 }; +static const unsigned int sdio_clk_pins[] = { GPIOX_4 }; +static const unsigned int sdio_cmd_pins[] = { GPIOX_5 }; static const unsigned int sdio_irq_pins[] = { GPIOX_7 };
static const unsigned int nand_ce0_pins[] = { BOOT_8 };
From: Leonard Crestez leonard.crestez@nxp.com
commit 4c48e549f39f8ed10cf8a0b6cb96f5eddf0391ce upstream.
The imx SC api strongly assumes that messages are composed out of 4-bytes words but some of our message structs have odd sizeofs.
This produces many oopses with CONFIG_KASAN=y.
Fix by marking with __aligned(4).
Fixes: b96eea718bf6 ("pinctrl: fsl: add scu based pinctrl support") Signed-off-by: Leonard Crestez leonard.crestez@nxp.com Link: https://lore.kernel.org/r/bd7ad5fd755739a6d8d5f4f65e03b3ca4f457bd2.158221614... Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pinctrl/freescale/pinctrl-scu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/pinctrl/freescale/pinctrl-scu.c +++ b/drivers/pinctrl/freescale/pinctrl-scu.c @@ -23,12 +23,12 @@ struct imx_sc_msg_req_pad_set { struct imx_sc_rpc_msg hdr; u32 val; u16 pad; -} __packed; +} __packed __aligned(4);
struct imx_sc_msg_req_pad_get { struct imx_sc_rpc_msg hdr; u16 pad; -} __packed; +} __packed __aligned(4);
struct imx_sc_msg_resp_pad_get { struct imx_sc_rpc_msg hdr;
From: Suman Anna s-anna@ti.com
commit f13f09a12cbd0c7b776e083c5d008b6c6a9c4e0b upstream.
The functions vring_new_virtqueue() and __vring_new_virtqueue() are used with split rings, and any allocations within these functions are managed outside of the .we_own_ring flag. The commit cbeedb72b97a ("virtio_ring: allocate desc state for split ring separately") allocates the desc state within the __vring_new_virtqueue() but frees it only when the .we_own_ring flag is set. This leads to a memory leak when freeing such allocated virtqueues with the vring_del_virtqueue() function.
Fix this by moving the desc_state free code outside the flag and only for split rings. Issue was discovered during testing with remoteproc and virtio_rpmsg.
Fixes: cbeedb72b97a ("virtio_ring: allocate desc state for split ring separately") Signed-off-by: Suman Anna s-anna@ti.com Link: https://lore.kernel.org/r/20200224212643.30672-1-s-anna@ti.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Acked-by: Jason Wang jasowang@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/virtio/virtio_ring.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -2203,10 +2203,10 @@ void vring_del_virtqueue(struct virtqueu vq->split.queue_size_in_bytes, vq->split.vring.desc, vq->split.queue_dma_addr); - - kfree(vq->split.desc_state); } } + if (!vq->packed_ring) + kfree(vq->split.desc_state); list_del(&_vq->list); kfree(vq); }
From: Tina Zhang tina.zhang@intel.com
commit 259170cb4c84f4165a36c0b05811eb74c495412c upstream.
Commit c3b5a8430daad ("drm/i915/gvt: Enable gfx virtualiztion for CFL") added the support on CFL. The vgpu emulation hotplug support on CFL was supposed to be included in that patch. Without the vgpu emulation hotplug support, the dma-buf based display gives us a blur face.
So fix this issue by adding the vgpu emulation hotplug support on CFL.
Fixes: c3b5a8430daad ("drm/i915/gvt: Enable gfx virtualiztion for CFL") Signed-off-by: Tina Zhang tina.zhang@intel.com Acked-by: Zhenyu Wang zhenyuw@linux.intel.com Signed-off-by: Zhenyu Wang zhenyuw@linux.intel.com Link: http://patchwork.freedesktop.org/patch/msgid/20200227010041.32248-1-tina.zha... (cherry picked from commit 135dde8853c7e00f6002e710f7e4787ed8585c0e) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/i915/gvt/display.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/i915/gvt/display.c +++ b/drivers/gpu/drm/i915/gvt/display.c @@ -457,7 +457,8 @@ void intel_vgpu_emulate_hotplug(struct i struct drm_i915_private *dev_priv = vgpu->gvt->dev_priv;
/* TODO: add more platforms support */ - if (IS_SKYLAKE(dev_priv) || IS_KABYLAKE(dev_priv)) { + if (IS_SKYLAKE(dev_priv) || IS_KABYLAKE(dev_priv) || + IS_COFFEELAKE(dev_priv)) { if (connected) { vgpu_vreg_t(vgpu, SFUSE_STRAP) |= SFUSE_STRAP_DDID_DETECTED;
From: Charles Keepax ckeepax@opensource.cirrus.com
commit aafd56fc79041bf36f97712d4b35208cbe07db90 upstream.
kref_init starts with the reference count at 1, which will be balanced by the pinctrl_put in pinctrl_unregister. The additional kref_get in pinctrl_claim_hogs will increase this count to 2 and cause the hogs to not get freed when pinctrl_unregister is called.
Fixes: 6118714275f0 ("pinctrl: core: Fix pinctrl_register_and_init() with pinctrl_enable()") Signed-off-by: Charles Keepax ckeepax@opensource.cirrus.com Link: https://lore.kernel.org/r/20200228154142.13860-1-ckeepax@opensource.cirrus.c... Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pinctrl/core.c | 1 - 1 file changed, 1 deletion(-)
--- a/drivers/pinctrl/core.c +++ b/drivers/pinctrl/core.c @@ -2025,7 +2025,6 @@ static int pinctrl_claim_hogs(struct pin return PTR_ERR(pctldev->p); }
- kref_get(&pctldev->p->users); pctldev->hog_default = pinctrl_lookup_state(pctldev->p, PINCTRL_STATE_DEFAULT); if (IS_ERR(pctldev->hog_default)) {
From: Zhenyu Wang zhenyuw@linux.intel.com
commit 04d6067f1f19e70a418f92fa3170cf7fe53b7fdf upstream.
From commit f25a49ab8ab9 ("drm/i915/gvt: Use vgpu_lock to protect per
vgpu access") the vgpu idr destroy is moved later than vgpu resource destroy, then it would fail to stop timer for schedule policy clean which to check vgpu idr for any left vGPU. So this trys to destroy vgpu idr earlier.
Cc: Colin Xu colin.xu@intel.com Fixes: f25a49ab8ab9 ("drm/i915/gvt: Use vgpu_lock to protect per vgpu access") Acked-by: Colin Xu colin.xu@intel.com Signed-off-by: Zhenyu Wang zhenyuw@linux.intel.com Link: http://patchwork.freedesktop.org/patch/msgid/20200229055445.31481-1-zhenyuw@... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/i915/gvt/vgpu.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
--- a/drivers/gpu/drm/i915/gvt/vgpu.c +++ b/drivers/gpu/drm/i915/gvt/vgpu.c @@ -272,10 +272,17 @@ void intel_gvt_destroy_vgpu(struct intel { struct intel_gvt *gvt = vgpu->gvt;
- mutex_lock(&vgpu->vgpu_lock); - WARN(vgpu->active, "vGPU is still active!\n");
+ /* + * remove idr first so later clean can judge if need to stop + * service if no active vgpu. + */ + mutex_lock(&gvt->lock); + idr_remove(&gvt->vgpu_idr, vgpu->id); + mutex_unlock(&gvt->lock); + + mutex_lock(&vgpu->vgpu_lock); intel_gvt_debugfs_remove_vgpu(vgpu); intel_vgpu_clean_sched_policy(vgpu); intel_vgpu_clean_submission(vgpu); @@ -290,7 +297,6 @@ void intel_gvt_destroy_vgpu(struct intel mutex_unlock(&vgpu->vgpu_lock);
mutex_lock(&gvt->lock); - idr_remove(&gvt->vgpu_idr, vgpu->id); if (idr_is_empty(&gvt->vgpu_idr)) intel_gvt_clean_irq(gvt); intel_gvt_update_vgpu_types(gvt);
From: Christoph Hellwig hch@lst.de
commit e3a36eb6dfaeea8175c05d5915dcf0b939be6dab upstream.
This does three inter-related things to clarify the usage of the platform device dma_mask field. In the process, fix the bug introduced by cdfee5623290 ("driver core: initialize a default DMA mask for platform device") that caused Artem Tashkinov's laptop to not boot with newer Fedora kernels.
This does:
- First off, rename the field to "platform_dma_mask" to make it greppable.
We have way too many different random fields called "dma_mask" in various data structures, where some of them are actual masks, and some of them are just pointers to the mask. And the structures all have pointers to each other, or embed each other inside themselves, and "pdev" sometimes means "platform device" and sometimes it means "PCI device".
So to make it clear in the code when you actually use this new field, give it a unique name (it really should be something even more unique like "platform_device_dma_mask", since it's per platform device, not per platform, but that gets old really fast, and this is unique enough in context).
To further clarify when the field gets used, initialize it when we actually start using it with the default value.
- Then, use this field instead of the random one-off allocation in platform_device_register_full() that is now unnecessary since we now already have a perfectly fine allocation for it in the platform device structure.
- The above then allows us to fix the actual bug, where the error path of platform_device_register_full() would unconditionally free the platform device DMA allocation with 'kfree()'.
That kfree() was dont regardless of whether the allocation had been done earlier with the (now removed) kmalloc, or whether setup_pdev_dma_masks() had already been used and the dma_mask pointer pointed to the mask that was part of the platform device.
It seems most people never triggered the error path, or only triggered it from a call chain that set an explicit pdevinfo->dma_mask value (and thus caused the unnecessary allocation that was "cleaned up" in the error path) before calling platform_device_register_full().
Robin Murphy points out that in Artem's case the wdat_wdt driver failed in platform_device_add(), and that was the one that had called platform_device_register_full() with pdevinfo.dma_mask = 0, and would have caused that kfree() of pdev.dma_mask corrupting the heap.
A later unrelated kmalloc() then oopsed due to the heap corruption.
Fixes: cdfee5623290 ("driver core: initialize a default DMA mask for platform device") Reported-bisected-and-tested-by: Artem S. Tashkinov aros@gmx.com Reviewed-by: Robin Murphy robin.murphy@arm.com Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/base/platform.c | 25 ++++++------------------- include/linux/platform_device.h | 2 +- 2 files changed, 7 insertions(+), 20 deletions(-)
--- a/drivers/base/platform.c +++ b/drivers/base/platform.c @@ -335,10 +335,10 @@ static void setup_pdev_dma_masks(struct { if (!pdev->dev.coherent_dma_mask) pdev->dev.coherent_dma_mask = DMA_BIT_MASK(32); - if (!pdev->dma_mask) - pdev->dma_mask = DMA_BIT_MASK(32); - if (!pdev->dev.dma_mask) - pdev->dev.dma_mask = &pdev->dma_mask; + if (!pdev->dev.dma_mask) { + pdev->platform_dma_mask = DMA_BIT_MASK(32); + pdev->dev.dma_mask = &pdev->platform_dma_mask; + } };
/** @@ -634,20 +634,8 @@ struct platform_device *platform_device_ pdev->dev.of_node_reused = pdevinfo->of_node_reused;
if (pdevinfo->dma_mask) { - /* - * This memory isn't freed when the device is put, - * I don't have a nice idea for that though. Conceptually - * dma_mask in struct device should not be a pointer. - * See http://thread.gmane.org/gmane.linux.kernel.pci/9081 - */ - pdev->dev.dma_mask = - kmalloc(sizeof(*pdev->dev.dma_mask), GFP_KERNEL); - if (!pdev->dev.dma_mask) - goto err; - - kmemleak_ignore(pdev->dev.dma_mask); - - *pdev->dev.dma_mask = pdevinfo->dma_mask; + pdev->platform_dma_mask = pdevinfo->dma_mask; + pdev->dev.dma_mask = &pdev->platform_dma_mask; pdev->dev.coherent_dma_mask = pdevinfo->dma_mask; }
@@ -672,7 +660,6 @@ struct platform_device *platform_device_ if (ret) { err: ACPI_COMPANION_SET(&pdev->dev, NULL); - kfree(pdev->dev.dma_mask); platform_device_put(pdev); return ERR_PTR(ret); } --- a/include/linux/platform_device.h +++ b/include/linux/platform_device.h @@ -24,7 +24,7 @@ struct platform_device { int id; bool id_auto; struct device dev; - u64 dma_mask; + u64 platform_dma_mask; u32 num_resources; struct resource *resource;
From: Qian Cai cai@lca.pw
commit 2d48ea0efb8887ebba3e3720bb5b738aced4e574 upstream.
There are several places traverse RCU-list without holding any lock in intel_iommu_init(). Fix them by acquiring dmar_global_lock.
WARNING: suspicious RCU usage ----------------------------- drivers/iommu/intel-iommu.c:5216 RCU-list traversed in non-reader section!!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1 no locks held by swapper/0/1.
Call Trace: dump_stack+0xa0/0xea lockdep_rcu_suspicious+0x102/0x10b intel_iommu_init+0x947/0xb13 pci_iommu_init+0x26/0x62 do_one_initcall+0xfe/0x500 kernel_init_freeable+0x45a/0x4f8 kernel_init+0x11/0x139 ret_from_fork+0x3a/0x50 DMAR: Intel(R) Virtualization Technology for Directed I/O
Fixes: d8190dc63886 ("iommu/vt-d: Enable DMA remapping after rmrr mapped") Signed-off-by: Qian Cai cai@lca.pw Acked-by: Lu Baolu baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iommu/intel-iommu.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5024,6 +5024,7 @@ int __init intel_iommu_init(void)
init_iommu_pm_ops();
+ down_read(&dmar_global_lock); for_each_active_iommu(iommu, drhd) { iommu_device_sysfs_add(&iommu->iommu, NULL, intel_iommu_groups, @@ -5031,6 +5032,7 @@ int __init intel_iommu_init(void) iommu_device_set_ops(&iommu->iommu, &intel_iommu_ops); iommu_device_register(&iommu->iommu); } + up_read(&dmar_global_lock);
bus_set_iommu(&pci_bus_type, &intel_iommu_ops); if (si_domain && !hw_pass_through) @@ -5041,7 +5043,6 @@ int __init intel_iommu_init(void) down_read(&dmar_global_lock); if (probe_acpi_namespace_devices()) pr_warn("ACPI name space devices didn't probe correctly\n"); - up_read(&dmar_global_lock);
/* Finally, we enable the DMA remapping hardware. */ for_each_iommu(iommu, drhd) { @@ -5050,6 +5051,8 @@ int __init intel_iommu_init(void)
iommu_disable_protect_mem_regions(iommu); } + up_read(&dmar_global_lock); + pr_info("Intel(R) Virtualization Technology for Directed I/O\n");
intel_iommu_enabled = 1;
From: Hamish Martin hamish.martin@alliedtelesis.co.nz
commit 3747cd2efe7ecb9604972285ab3f60c96cb753a8 upstream.
If a GPIO we are trying to use is not available and we are deferring the probe, don't output an error message. This seems to have been the intent of commit 05c74778858d ("i2c: gpio: Add support for named gpios in DT") but the error was still output due to not checking the updated 'retdesc'.
Fixes: 05c74778858d ("i2c: gpio: Add support for named gpios in DT") Signed-off-by: Hamish Martin hamish.martin@alliedtelesis.co.nz Acked-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Wolfram Sang wsa@the-dreams.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/i2c/busses/i2c-gpio.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/i2c/busses/i2c-gpio.c +++ b/drivers/i2c/busses/i2c-gpio.c @@ -348,7 +348,7 @@ static struct gpio_desc *i2c_gpio_get_de if (ret == -ENOENT) retdesc = ERR_PTR(-EPROBE_DEFER);
- if (ret != -EPROBE_DEFER) + if (PTR_ERR(retdesc) != -EPROBE_DEFER) dev_err(dev, "error trying to get descriptor: %d\n", ret);
return retdesc;
From: Jakub Kicinski kuba@kernel.org
commit 0e1a1d853ecedc99da9d27f9f5c376935547a0e2 upstream.
Add missing attribute validation for critical protocol fields to the netlink policy.
Fixes: 5de17984898c ("cfg80211: introduce critical protocol indication from user-space") Signed-off-by: Jakub Kicinski kuba@kernel.org Link: https://lore.kernel.org/r/20200303051058.4089398-2-kuba@kernel.org Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/wireless/nl80211.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/net/wireless/nl80211.c +++ b/net/wireless/nl80211.c @@ -530,6 +530,8 @@ const struct nla_policy nl80211_policy[N [NL80211_ATTR_MDID] = { .type = NLA_U16 }, [NL80211_ATTR_IE_RIC] = { .type = NLA_BINARY, .len = IEEE80211_MAX_DATA_LEN }, + [NL80211_ATTR_CRIT_PROT_ID] = { .type = NLA_U16 }, + [NL80211_ATTR_MAX_CRIT_PROT_DURATION] = { .type = NLA_U16 }, [NL80211_ATTR_PEER_AID] = NLA_POLICY_RANGE(NLA_U16, 1, IEEE80211_MAX_AID), [NL80211_ATTR_CH_SWITCH_COUNT] = { .type = NLA_U32 },
From: Jakub Kicinski kuba@kernel.org
commit 056e9375e1f3c4bf2fd49b70258c7daf788ecd9d upstream.
Add missing attribute validation for beacon report scanning to the netlink policy.
Fixes: 1d76250bd34a ("nl80211: support beacon report scanning") Signed-off-by: Jakub Kicinski kuba@kernel.org Link: https://lore.kernel.org/r/20200303051058.4089398-3-kuba@kernel.org Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/wireless/nl80211.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/net/wireless/nl80211.c +++ b/net/wireless/nl80211.c @@ -469,6 +469,8 @@ const struct nla_policy nl80211_policy[N [NL80211_ATTR_WOWLAN_TRIGGERS] = { .type = NLA_NESTED }, [NL80211_ATTR_STA_PLINK_STATE] = NLA_POLICY_MAX(NLA_U8, NUM_NL80211_PLINK_STATES - 1), + [NL80211_ATTR_MEASUREMENT_DURATION] = { .type = NLA_U16 }, + [NL80211_ATTR_MEASUREMENT_DURATION_MANDATORY] = { .type = NLA_FLAG }, [NL80211_ATTR_MESH_PEER_AID] = NLA_POLICY_RANGE(NLA_U16, 1, IEEE80211_MAX_AID), [NL80211_ATTR_SCHED_SCAN_INTERVAL] = { .type = NLA_U32 },
From: Jakub Kicinski kuba@kernel.org
commit 5cde05c61cbe13cbb3fa66d52b9ae84f7975e5e6 upstream.
Add missing attribute validation for NL80211_ATTR_OPER_CLASS to the netlink policy.
Fixes: 1057d35ede5d ("cfg80211: introduce TDLS channel switch commands") Signed-off-by: Jakub Kicinski kuba@kernel.org Link: https://lore.kernel.org/r/20200303051058.4089398-4-kuba@kernel.org Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/wireless/nl80211.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/wireless/nl80211.c +++ b/net/wireless/nl80211.c @@ -564,6 +564,7 @@ const struct nla_policy nl80211_policy[N NLA_POLICY_MAX(NLA_U8, IEEE80211_NUM_UPS - 1), [NL80211_ATTR_ADMITTED_TIME] = { .type = NLA_U16 }, [NL80211_ATTR_SMPS_MODE] = { .type = NLA_U8 }, + [NL80211_ATTR_OPER_CLASS] = { .type = NLA_U8 }, [NL80211_ATTR_MAC_MASK] = { .type = NLA_EXACT_LEN_WARN, .len = ETH_ALEN
From: Tommi Rantala tommi.t.rantala@nokia.com
commit f649bd9dd5d5004543bbc3c50b829577b49f5d75 upstream.
Since commit 3b2323c2c1c4 ("perf bench futex: Use cpumaps") the default number of threads the benchmark uses got changed from number of online CPUs to zero:
$ perf bench futex wake # Running 'futex/wake' benchmark: Run summary [PID 15930]: blocking on 0 threads (at [private] futex 0x558b8ee4bfac), waking up 1 at a time. [Run 1]: Wokeup 0 of 0 threads in 0.0000 ms [...] [Run 10]: Wokeup 0 of 0 threads in 0.0000 ms Wokeup 0 of 0 threads in 0.0004 ms (+-40.82%)
Restore the old behavior by grabbing the number of online CPUs via cpu->nr:
$ perf bench futex wake # Running 'futex/wake' benchmark: Run summary [PID 18356]: blocking on 8 threads (at [private] futex 0xb3e62c), waking up 1 at a time. [Run 1]: Wokeup 8 of 8 threads in 0.0260 ms [...] [Run 10]: Wokeup 8 of 8 threads in 0.0270 ms Wokeup 8 of 8 threads in 0.0419 ms (+-24.35%)
Fixes: 3b2323c2c1c4 ("perf bench futex: Use cpumaps") Signed-off-by: Tommi Rantala tommi.t.rantala@nokia.com Tested-by: Arnaldo Carvalho de Melo acme@redhat.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Darren Hart dvhart@infradead.org Cc: Davidlohr Bueso dave@stgolabs.net Cc: Jiri Olsa jolsa@redhat.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lore.kernel.org/lkml/20200305083714.9381-3-tommi.t.rantala@nokia.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- tools/perf/bench/futex-wake.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/tools/perf/bench/futex-wake.c +++ b/tools/perf/bench/futex-wake.c @@ -43,7 +43,7 @@ static bool done = false, silent = false static pthread_mutex_t thread_lock; static pthread_cond_t thread_parent, thread_worker; static struct stats waketime_stats, wakeup_stats; -static unsigned int ncpus, threads_starting, nthreads = 0; +static unsigned int threads_starting, nthreads = 0; static int futex_flag = 0;
static const struct option options[] = { @@ -141,7 +141,7 @@ int bench_futex_wake(int argc, const cha sigaction(SIGINT, &act, NULL);
if (!nthreads) - nthreads = ncpus; + nthreads = cpu->nr;
worker = calloc(nthreads, sizeof(*worker)); if (!worker)
From: Jakub Kicinski kuba@kernel.org
commit c049b3450072b8e3998053490e025839fecfef31 upstream.
Add missing attribute validation for cthelper to the netlink policy.
Fixes: 12f7a505331e ("netfilter: add user-space connection tracking helper infrastructure") Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/nfnetlink_cthelper.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/net/netfilter/nfnetlink_cthelper.c +++ b/net/netfilter/nfnetlink_cthelper.c @@ -742,6 +742,8 @@ static const struct nla_policy nfnl_cthe [NFCTH_NAME] = { .type = NLA_NUL_STRING, .len = NF_CT_HELPER_NAME_LEN-1 }, [NFCTH_QUEUE_NUM] = { .type = NLA_U32, }, + [NFCTH_PRIV_DATA_LEN] = { .type = NLA_U32, }, + [NFCTH_STATUS] = { .type = NLA_U32, }, };
static const struct nfnl_callback nfnl_cthelper_cb[NFNL_MSG_CTHELPER_MAX] = {
From: Jakub Kicinski kuba@kernel.org
commit 9d6effb2f1523eb84516e44213c00f2fd9e6afff upstream.
Add missing attribute validation for NFTA_PAYLOAD_CSUM_FLAGS to the netlink policy.
Fixes: 1814096980bb ("netfilter: nft_payload: layer 4 checksum adjustment for pseudoheader fields") Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/nft_payload.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/netfilter/nft_payload.c +++ b/net/netfilter/nft_payload.c @@ -121,6 +121,7 @@ static const struct nla_policy nft_paylo [NFTA_PAYLOAD_LEN] = { .type = NLA_U32 }, [NFTA_PAYLOAD_CSUM_TYPE] = { .type = NLA_U32 }, [NFTA_PAYLOAD_CSUM_OFFSET] = { .type = NLA_U32 }, + [NFTA_PAYLOAD_CSUM_FLAGS] = { .type = NLA_U32 }, };
static int nft_payload_init(const struct nft_ctx *ctx,
From: Jakub Kicinski kuba@kernel.org
commit 88a637719a1570705c02cacb3297af164b1714e7 upstream.
Add missing attribute validation for tunnel source and destination ports to the netlink policy.
Fixes: af308b94a2a4 ("netfilter: nf_tables: add tunnel support") Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/nft_tunnel.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/net/netfilter/nft_tunnel.c +++ b/net/netfilter/nft_tunnel.c @@ -339,6 +339,8 @@ static const struct nla_policy nft_tunne [NFTA_TUNNEL_KEY_FLAGS] = { .type = NLA_U32, }, [NFTA_TUNNEL_KEY_TOS] = { .type = NLA_U8, }, [NFTA_TUNNEL_KEY_TTL] = { .type = NLA_U8, }, + [NFTA_TUNNEL_KEY_SPORT] = { .type = NLA_U16, }, + [NFTA_TUNNEL_KEY_DPORT] = { .type = NLA_U16, }, [NFTA_TUNNEL_KEY_OPTS] = { .type = NLA_NESTED, }, };
From: Pablo Neira Ayuso pablo@netfilter.org
commit d78008de6103c708171baff9650a7862645d23b0 upstream.
Missing NFTA_CHAIN_FLAGS netlink attribute when dumping basechain definitions.
Fixes: c9626a2cbdb2 ("netfilter: nf_tables: add hardware offload support") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/nf_tables_api.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -1309,6 +1309,11 @@ static int nf_tables_fill_chain_info(str lockdep_commit_lock_is_held(net)); if (nft_dump_stats(skb, stats)) goto nla_put_failure; + + if ((chain->flags & NFT_CHAIN_HW_OFFLOAD) && + nla_put_be32(skb, NFTA_CHAIN_FLAGS, + htonl(NFT_CHAIN_HW_OFFLOAD))) + goto nla_put_failure; }
if (nla_put_be32(skb, NFTA_CHAIN_USE, htonl(chain->use)))
From: Pablo Neira Ayuso pablo@netfilter.org
commit 6a42cefb25d8bdc1b391f4a53c78c32164eea2dd upstream.
Set owner to THIS_MODULE, otherwise the nft_chain_nat module might be removed while there are still inet/nat chains in place.
[ 117.942096] BUG: unable to handle page fault for address: ffffffffa0d5e040 [ 117.942101] #PF: supervisor read access in kernel mode [ 117.942103] #PF: error_code(0x0000) - not-present page [ 117.942106] PGD 200c067 P4D 200c067 PUD 200d063 PMD 3dc909067 PTE 0 [ 117.942113] Oops: 0000 [#1] PREEMPT SMP PTI [ 117.942118] CPU: 3 PID: 27 Comm: kworker/3:0 Not tainted 5.6.0-rc3+ #348 [ 117.942133] Workqueue: events nf_tables_trans_destroy_work [nf_tables] [ 117.942145] RIP: 0010:nf_tables_chain_destroy.isra.0+0x94/0x15a [nf_tables] [ 117.942149] Code: f6 45 54 01 0f 84 d1 00 00 00 80 3b 05 74 44 48 8b 75 e8 48 c7 c7 72 be de a0 e8 56 e6 2d e0 48 8b 45 e8 48 c7 c7 7f be de a0 <48> 8b 30 e8 43 e6 2d e0 48 8b 45 e8 48 8b 40 10 48 85 c0 74 5b 8b [ 117.942152] RSP: 0018:ffffc9000015be10 EFLAGS: 00010292 [ 117.942155] RAX: ffffffffa0d5e040 RBX: ffff88840be87fc2 RCX: 0000000000000007 [ 117.942158] RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffffffffa0debe7f [ 117.942160] RBP: ffff888403b54b50 R08: 0000000000001482 R09: 0000000000000004 [ 117.942162] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8883eda7e540 [ 117.942164] R13: dead000000000122 R14: dead000000000100 R15: ffff888403b3db80 [ 117.942167] FS: 0000000000000000(0000) GS:ffff88840e4c0000(0000) knlGS:0000000000000000 [ 117.942169] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 117.942172] CR2: ffffffffa0d5e040 CR3: 00000003e4c52002 CR4: 00000000001606e0 [ 117.942174] Call Trace: [ 117.942188] nf_tables_trans_destroy_work.cold+0xd/0x12 [nf_tables] [ 117.942196] process_one_work+0x1d6/0x3b0 [ 117.942200] worker_thread+0x45/0x3c0 [ 117.942203] ? process_one_work+0x3b0/0x3b0 [ 117.942210] kthread+0x112/0x130 [ 117.942214] ? kthread_create_worker_on_cpu+0x40/0x40 [ 117.942221] ret_from_fork+0x35/0x40
nf_tables_chain_destroy() crashes on module_put() because the module is gone.
Fixes: d164385ec572 ("netfilter: nat: add inet family nat support") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/nft_chain_nat.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/netfilter/nft_chain_nat.c +++ b/net/netfilter/nft_chain_nat.c @@ -89,6 +89,7 @@ static const struct nft_chain_type nft_c .name = "nat", .type = NFT_CHAIN_T_NAT, .family = NFPROTO_INET, + .owner = THIS_MODULE, .hook_mask = (1 << NF_INET_PRE_ROUTING) | (1 << NF_INET_LOCAL_IN) | (1 << NF_INET_LOCAL_OUT) |
From: Zhenzhong Duan zhenzhong.duan@gmail.com
commit b0bb0c22c4db623f2e7b1a471596fbf1c22c6dc5 upstream.
When base address in RHSA structure doesn't match base address in each DRHD structure, the base address in last DRHD is printed out.
This doesn't make sense when there are multiple DRHD units, fix it by printing the buggy RHSA's base address.
Signed-off-by: Lu Baolu baolu.lu@linux.intel.com Signed-off-by: Zhenzhong Duan zhenzhong.duan@gmail.com Fixes: fd0c8894893cb ("intel-iommu: Set a more specific taint flag for invalid BIOS DMAR tables") Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iommu/dmar.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/iommu/dmar.c +++ b/drivers/iommu/dmar.c @@ -475,7 +475,7 @@ static int dmar_parse_one_rhsa(struct ac pr_warn(FW_BUG "Your BIOS is broken; RHSA refers to non-existent DMAR unit at %llx\n" "BIOS vendor: %s; Ver: %s; Product Version: %s\n", - drhd->reg_base_addr, + rhsa->base_address, dmi_get_system_info(DMI_BIOS_VENDOR), dmi_get_system_info(DMI_BIOS_VERSION), dmi_get_system_info(DMI_PRODUCT_VERSION));
From: Daniel Drake drake@endlessm.com
commit da72a379b2ec0bad3eb265787f7008bead0b040c upstream.
VMD subdevices are created with a PCI domain ID of 0x10000 or higher.
These subdevices are also handled like all other PCI devices by dmar_pci_bus_notifier().
However, when dmar_alloc_pci_notify_info() take records of such devices, it will truncate the domain ID to a u16 value (in info->seg). The device at (e.g.) 10000:00:02.0 is then treated by the DMAR code as if it is 0000:00:02.0.
In the unlucky event that a real device also exists at 0000:00:02.0 and also has a device-specific entry in the DMAR table, dmar_insert_dev_scope() will crash on: BUG_ON(i >= devices_cnt);
That's basically a sanity check that only one PCI device matches a single DMAR entry; in this case we seem to have two matching devices.
Fix this by ignoring devices that have a domain number higher than what can be looked up in the DMAR table.
This problem was carefully diagnosed by Jian-Hong Pan.
Signed-off-by: Lu Baolu baolu.lu@linux.intel.com Signed-off-by: Daniel Drake drake@endlessm.com Fixes: 59ce0515cdaf3 ("iommu/vt-d: Update DRHD/RMRR/ATSR device scope caches when PCI hotplug happens") Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iommu/dmar.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/drivers/iommu/dmar.c +++ b/drivers/iommu/dmar.c @@ -28,6 +28,7 @@ #include <linux/slab.h> #include <linux/iommu.h> #include <linux/numa.h> +#include <linux/limits.h> #include <asm/irq_remapping.h> #include <asm/iommu_table.h>
@@ -128,6 +129,13 @@ dmar_alloc_pci_notify_info(struct pci_de
BUG_ON(dev->is_virtfn);
+ /* + * Ignore devices that have a domain number higher than what can + * be looked up in DMAR, e.g. VMD subdevices with domain 0x10000 + */ + if (pci_domain_nr(dev->bus) > U16_MAX) + return NULL; + /* Only generate path[] for device addition event */ if (event == BUS_NOTIFY_ADD_DEVICE) for (tmp = dev; tmp; tmp = tmp->bus->self)
From: Wolfram Sang wsa+renesas@sang-engineering.com
commit 8daee952b4389729358665fb91949460641659d4 upstream.
i2c_verify_client() can fail, so we need to put the device when that happens.
Fixes: 525e6fabeae2 ("i2c / ACPI: add support for ACPI reconfigure notifications") Reported-by: Geert Uytterhoeven geert+renesas@glider.be Signed-off-by: Wolfram Sang wsa+renesas@sang-engineering.com Reviewed-by: Geert Uytterhoeven geert+renesas@glider.be Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Acked-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Wolfram Sang wsa@the-dreams.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/i2c/i2c-core-acpi.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
--- a/drivers/i2c/i2c-core-acpi.c +++ b/drivers/i2c/i2c-core-acpi.c @@ -394,9 +394,17 @@ EXPORT_SYMBOL_GPL(i2c_acpi_find_adapter_ static struct i2c_client *i2c_acpi_find_client_by_adev(struct acpi_device *adev) { struct device *dev; + struct i2c_client *client;
dev = bus_find_device_by_acpi_dev(&i2c_bus_type, adev); - return dev ? i2c_verify_client(dev) : NULL; + if (!dev) + return NULL; + + client = i2c_verify_client(dev); + if (!client) + put_device(dev); + + return client; }
static int i2c_acpi_notify(struct notifier_block *nb, unsigned long value,
From: Suravee Suthikulpanit suravee.suthikulpanit@amd.com
commit 730ad0ede130015a773229573559e97ba0943065 upstream.
Commit b9c6ff94e43a ("iommu/amd: Re-factor guest virtual APIC (de-)activation code") accidentally left out the ir_data pointer when calling modity_irte_ga(), which causes the function amd_iommu_update_ga() to return prematurely due to struct amd_ir_data.ref is NULL and the "is_run" bit of IRTE does not get updated properly.
This results in bad I/O performance since IOMMU AVIC always generate GA Log entry and notify IOMMU driver and KVM when it receives interrupt from the PCI pass-through device instead of directly inject interrupt to the vCPU.
Fixes by passing ir_data when calling modify_irte_ga() as done previously.
Fixes: b9c6ff94e43a ("iommu/amd: Re-factor guest virtual APIC (de-)activation code") Signed-off-by: Suravee Suthikulpanit suravee.suthikulpanit@amd.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iommu/amd_iommu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/iommu/amd_iommu.c +++ b/drivers/iommu/amd_iommu.c @@ -4421,7 +4421,7 @@ int amd_iommu_activate_guest_mode(void * entry->lo.fields_vapic.ga_tag = ir_data->ga_tag;
return modify_irte_ga(ir_data->irq_2_irte.devid, - ir_data->irq_2_irte.index, entry, NULL); + ir_data->irq_2_irte.index, entry, ir_data); } EXPORT_SYMBOL(amd_iommu_activate_guest_mode);
@@ -4447,7 +4447,7 @@ int amd_iommu_deactivate_guest_mode(void APICID_TO_IRTE_DEST_HI(cfg->dest_apicid);
return modify_irte_ga(ir_data->irq_2_irte.devid, - ir_data->irq_2_irte.index, entry, NULL); + ir_data->irq_2_irte.index, entry, ir_data); } EXPORT_SYMBOL(amd_iommu_deactivate_guest_mode);
From: Eric Dumazet edumazet@google.com
commit b6f6118901d1e867ac9177bbff3b00b185bd4fdc upstream.
IPV6_ADDRFORM is able to transform IPv6 socket to IPv4 one. While this operation sounds illogical, we have to support it.
One of the things it does for TCP socket is to switch sk->sk_prot to tcp_prot.
We now have other layers playing with sk->sk_prot, so we should make sure to not interfere with them.
This patch makes sure sk_prot is the default pointer for TCP IPv6 socket.
syzbot reported : BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD a0113067 P4D a0113067 PUD a8771067 PMD 0 Oops: 0010 [#1] PREEMPT SMP KASAN CPU: 0 PID: 10686 Comm: syz-executor.0 Not tainted 5.6.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:0x0 Code: Bad RIP value. RSP: 0018:ffffc9000281fce0 EFLAGS: 00010246 RAX: 1ffffffff15f48ac RBX: ffffffff8afa4560 RCX: dffffc0000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a69a8f40 RBP: ffffc9000281fd10 R08: ffffffff86ed9b0c R09: ffffed1014d351f5 R10: ffffed1014d351f5 R11: 0000000000000000 R12: ffff8880920d3098 R13: 1ffff1101241a613 R14: ffff8880a69a8f40 R15: 0000000000000000 FS: 00007f2ae75db700(0000) GS:ffff8880aea00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 00000000a3b85000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: inet_release+0x165/0x1c0 net/ipv4/af_inet.c:427 __sock_release net/socket.c:605 [inline] sock_close+0xe1/0x260 net/socket.c:1283 __fput+0x2e4/0x740 fs/file_table.c:280 ____fput+0x15/0x20 fs/file_table.c:313 task_work_run+0x176/0x1b0 kernel/task_work.c:113 tracehook_notify_resume include/linux/tracehook.h:188 [inline] exit_to_usermode_loop arch/x86/entry/common.c:164 [inline] prepare_exit_to_usermode+0x480/0x5b0 arch/x86/entry/common.c:195 syscall_return_slowpath+0x113/0x4a0 arch/x86/entry/common.c:278 do_syscall_64+0x11f/0x1c0 arch/x86/entry/common.c:304 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x45c429 Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f2ae75dac78 EFLAGS: 00000246 ORIG_RAX: 0000000000000036 RAX: 0000000000000000 RBX: 00007f2ae75db6d4 RCX: 000000000045c429 RDX: 0000000000000001 RSI: 000000000000011a RDI: 0000000000000004 RBP: 000000000076bf20 R08: 0000000000000038 R09: 0000000000000000 R10: 0000000020000180 R11: 0000000000000246 R12: 00000000ffffffff R13: 0000000000000a9d R14: 00000000004ccfb4 R15: 000000000076bf2c Modules linked in: CR2: 0000000000000000 ---[ end trace 82567b5207e87bae ]--- RIP: 0010:0x0 Code: Bad RIP value. RSP: 0018:ffffc9000281fce0 EFLAGS: 00010246 RAX: 1ffffffff15f48ac RBX: ffffffff8afa4560 RCX: dffffc0000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a69a8f40 RBP: ffffc9000281fd10 R08: ffffffff86ed9b0c R09: ffffed1014d351f5 R10: ffffed1014d351f5 R11: 0000000000000000 R12: ffff8880920d3098 R13: 1ffff1101241a613 R14: ffff8880a69a8f40 R15: 0000000000000000 FS: 00007f2ae75db700(0000) GS:ffff8880aea00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 00000000a3b85000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot+1938db17e275e85dc328@syzkaller.appspotmail.com Cc: Daniel Borkmann daniel@iogearbox.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/ipv6/ipv6_sockglue.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/net/ipv6/ipv6_sockglue.c +++ b/net/ipv6/ipv6_sockglue.c @@ -183,9 +183,15 @@ static int do_ipv6_setsockopt(struct soc retv = -EBUSY; break; } - } else if (sk->sk_protocol != IPPROTO_TCP) + } else if (sk->sk_protocol == IPPROTO_TCP) { + if (sk->sk_prot != &tcpv6_prot) { + retv = -EBUSY; + break; + } + break; + } else { break; - + } if (sk->sk_state != TCP_ESTABLISHED) { retv = -ENOTCONN; break;
From: Karsten Graul kgraul@linux.ibm.com
commit a2f2ef4a54c0d97aa6a8386f4ff23f36ebb488cf upstream.
In smc_ib_remove_dev() check if the provided ib device was actually initialized for SMC before.
Reported-by: syzbot+84484ccebdd4e5451d91@syzkaller.appspotmail.com Fixes: a4cf0443c414 ("smc: introduce SMC as an IB-client") Signed-off-by: Karsten Graul kgraul@linux.ibm.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/smc/smc_ib.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/net/smc/smc_ib.c +++ b/net/smc/smc_ib.c @@ -560,6 +560,8 @@ static void smc_ib_remove_dev(struct ib_ struct smc_ib_device *smcibdev;
smcibdev = ib_get_client_data(ibdev, &smc_ib_client); + if (!smcibdev || smcibdev->ibdev != ibdev) + return; ib_set_client_data(ibdev, &smc_ib_client, NULL); spin_lock(&smc_ib_devices.lock); list_del_init(&smcibdev->list); /* remove from smc_ib_devices */
From: Karsten Graul kgraul@linux.ibm.com
commit ece0d7bd74615773268475b6b64d6f1ebbd4b4c6 upstream.
During IB device removal, cancel the event worker before the device structure is freed.
Fixes: a4cf0443c414 ("smc: introduce SMC as an IB-client") Reported-by: syzbot+b297c6825752e7a07272@syzkaller.appspotmail.com Signed-off-by: Karsten Graul kgraul@linux.ibm.com Reviewed-by: Ursula Braun ubraun@linux.ibm.com Reviewed-by: Leon Romanovsky leonro@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/smc/smc_ib.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/smc/smc_ib.c +++ b/net/smc/smc_ib.c @@ -568,6 +568,7 @@ static void smc_ib_remove_dev(struct ib_ spin_unlock(&smc_ib_devices.lock); smc_ib_cleanup_per_ibdev(smcibdev); ib_unregister_event_handler(&smcibdev->event_handler); + cancel_work_sync(&smcibdev->port_event_work); kfree(smcibdev); }
On 3/17/20 3:53 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.4.26 release. There are 123 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 19 Mar 2020 10:31:16 +0000. Anything received after that time might be too late.
Build results: total: 158 pass: 158 fail: 0 Qemu test results: total: 427 pass: 427 fail: 0
Guenter
On Tue, 17 Mar 2020 at 16:32, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.4.26 release. There are 123 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 19 Mar 2020 10:31:16 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.26-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Summary ------------------------------------------------------------------------
kernel: 5.4.26-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-5.4.y git commit: bd9158ff941e0efcea216f7311abc7fe13e8ae39 git describe: v5.4.25-124-gbd9158ff941e Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-5.4-oe/build/v5.4.25-124-...
No regressions (compared to build v5.4.25)
No fixes (compared to build v5.4.25)
Ran 16905 total tests in the following environments and test suites.
Environments -------------- - dragonboard-410c - hi6220-hikey - i386 - juno-r2 - nxp-ls2088 - qemu_arm - qemu_arm64 - qemu_i386 - qemu_x86_64 - x15 - x86
Test Suites ----------- * build * install-android-platform-tools-r2600 * install-android-platform-tools-r2800 * kselftest * libgpiod * linux-log-parser * ltp-cap_bounds-tests * ltp-commands-tests * ltp-cpuhotplug-tests * ltp-crypto-tests * ltp-dio-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * perf * v4l2-compliance * libhugetlbfs * ltp-fs-tests * network-basic-tests * kvm-unit-tests * ltp-containers-tests * ltp-cve-tests * spectre-meltdown-checker-test * ltp-open-posix-tests
On 3/17/20 4:53 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.4.26 release. There are 123 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 19 Mar 2020 10:31:16 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.26-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
thanks, -- Shuah
linux-stable-mirror@lists.linaro.org