This is the start of the stable review cycle for the 4.19.37 release. There are 96 patches in this series, all of which will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri 26 Apr 2019 05:07:35 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.37-rc1...
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.19.37-rc1
Linus Torvalds torvalds@linux-foundation.org i2c-hid: properly terminate i2c_hid_dmi_desc_override_table[] array
Katsuhiro Suzuki katsuhiro@katsuster.net ASoC: rockchip: add missing INTERLEAVED PCM attribute
Arnaldo Carvalho de Melo acme@redhat.com tools include: Adopt linux/bits.h
Matteo Croce mcroce@redhat.com percpu: stop printing kernel addresses
Takashi Iwai tiwai@suse.de ALSA: info: Fix racy addition/deletion of nodes
Konstantin Khlebnikov khlebnikov@yandex-team.ru mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=n
Jann Horn jannh@google.com device_cgroup: fix RCU imbalance in error case
Phil Auld pauld@redhat.com sched/fair: Limit sched_cfs_period_timer() loop to avoid hard lockup
Matthias Kaehlcke mka@chromium.org Revert "kbuild: use -Oz instead of -Os when using clang"
Yue Haibing yuehaibing@huawei.com tpm: Fix the type of the return value in calc_tpm2_event_size()
Jarkko Sakkinen jarkko.sakkinen@linux.intel.com tpm/tpm_i2c_atmel: Return -E2BIG when the transfer is incomplete
Masahiro Yamada yamada.masahiro@socionext.com modpost: file2alias: check prototype of handler
Masahiro Yamada yamada.masahiro@socionext.com modpost: file2alias: go back to simple devtable lookup
Adrian Hunter adrian.hunter@intel.com mmc: sdhci: Handle auto-command errors
Adrian Hunter adrian.hunter@intel.com mmc: sdhci: Rename SDHCI_ACMD12_ERR and SDHCI_INT_ACMD12ERR
Adrian Hunter adrian.hunter@intel.com mmc: sdhci: Fix data command CRC error handling
Dan Williams dan.j.williams@intel.com nfit/ars: Avoid stale ARS results
Dan Williams dan.j.williams@intel.com nfit/ars: Allow root to busy-poll the ARS state machine
Dan Williams dan.j.williams@intel.com nfit/ars: Introduce scrub_flags
Dan Williams dan.j.williams@intel.com nfit/ars: Remove ars_start_flags
Chang-An Chen chang-an.chen@mediatek.com timers/sched_clock: Prevent generic sched_clock wrap caused by tick_freeze()
Thomas Gleixner tglx@linutronix.de x86/speculation: Prevent deadlock on ssb_state::lock
Kan Liang kan.liang@linux.intel.com perf/x86: Fix incorrect PEBS_REGS
Andi Kleen ak@linux.intel.com x86/cpu/bugs: Use __initconst for 'const' init data
Kim Phillips kim.phillips@amd.com perf/x86/amd: Add event map for AMD Family 17h
Alex Deucher alexander.deucher@amd.com drm/amdgpu/gmc9: fix VM_L2_CNTL3 programming
Felix Fietkau nbd@nbd.name mac80211: do not call driver wake_tx_queue op during reconfig
Vijayakumar Durai vijayakumar.durai1@vivint.com rt2x00: do not increment sequence number while re-transmitting
Masami Hiramatsu mhiramat@kernel.org kprobes: Fix error check when reusing optimized probes
Masami Hiramatsu mhiramat@kernel.org kprobes: Mark ftrace mcount handler functions nokprobe
Masami Hiramatsu mhiramat@kernel.org x86/kprobes: Verify stack frame on kretprobe
Nathan Chancellor natechancellor@gmail.com arm64: futex: Restore oldval initialization to work around buggy compilers
Christian König christian.koenig@amd.com drm/ttm: fix out-of-bounds read in ttm_put_pages() v2
Eric Biggers ebiggers@google.com crypto: x86/poly1305 - fix overflow during partial reduction
Corey Minyard cminyard@mvista.com ipmi: fix sleep-in-atomic in free_user at cleanup SRCU user->release_barrier
Andrea Arcangeli aarcange@redhat.com coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping
Suthikulpanit, Suravee Suravee.Suthikulpanit@amd.com Revert "svm: Fix AVIC incomplete IPI emulation"
Saurav Kashyap skashyap@marvell.com Revert "scsi: fcoe: clear FC_RP_STARTED flags when receiving a LOGO"
Jaesoo Lee jalee@purestorage.com scsi: core: set result when the command cannot be dispatched
Mikulas Patocka mpatocka@redhat.com vt: fix cursor when clearing the screen
Geert Uytterhoeven geert+renesas@glider.be serial: sh-sci: Fix HSCIF RX sampling point calculation
Geert Uytterhoeven geert+renesas@glider.be serial: sh-sci: Fix HSCIF RX sampling point adjustment
KT Liao kt.liao@emc.com.tw Input: elan_i2c - add hardware ID for multiple Lenovo laptops
Takashi Iwai tiwai@suse.de ALSA: core: Fix card races between register and disconnect
Hui Wang hui.wang@canonical.com ALSA: hda/realtek - add two more pin configuration sets to quirk table
Ian Abbott abbotti@mev.co.uk staging: comedi: ni_usb6501: Fix possible double-free of ->usb_rx_buf
Ian Abbott abbotti@mev.co.uk staging: comedi: ni_usb6501: Fix use of uninitialized mutex
Ian Abbott abbotti@mev.co.uk staging: comedi: vmk80xx: Fix possible double-free of ->usb_rx_buf
Ian Abbott abbotti@mev.co.uk staging: comedi: vmk80xx: Fix use of uninitialized semaphore
Christian Gromm christian.gromm@microchip.com staging: most: core: use device description as name
he, bo bo.he@intel.com io: accel: kxcjk1013: restore the range after resume.
Fabrice Gasnier fabrice.gasnier@st.com iio: core: fix a possible circular locking dependency
Georg Ottinger g.ottinger@abatec.at iio: adc: at91: disable adc channel interrupt in timeout case
Lars-Peter Clausen lars@metafoo.de iio: Fix scan mask selection
Jean-Francois Dagenais jeff.dagenais@gmail.com iio: dac: mcp4725: add missing powerdown bits in store eeprom
Dragos Bogdan dragos.bogdan@analog.com iio: ad_sigma_delta: select channel when reading register
Gwendal Grignou gwendal@chromium.org iio: cros_ec: Fix the maths for gyro scale calculation
Mike Looijmans mike.looijmans@topic.nl iio:chemical:bme680: Fix SPI read interface
Mike Looijmans mike.looijmans@topic.nl iio:chemical:bme680: Fix, report temperature in millidegrees
Mike Looijmans mike.looijmans@topic.nl iio/gyro/bmg160: Use millidegrees for temperature scale
Sergey Larin cerg2010cerg2010@mail.ru iio: gyro: mpu3050: fix chip ID reading
Mircea Caprioru mircea.caprioru@analog.com staging: iio: ad7192: Fix ad7193 channel address
Leonard Pollak leonardp@tr-host.de Staging: iio: meter: fixed typo
Vitaly Kuznetsov vkuznets@redhat.com KVM: x86: svm: make sure NMI is injected after nmi_singlestep
Sean Christopherson sean.j.christopherson@intel.com KVM: x86: Don't clear EFER during SMM transitions for 32-bit vCPU
Ronnie Sahlberg lsahlber@redhat.com cifs: fix handle leak in smb2_query_symlink()
ZhangXiaoxu zhangxiaoxu5@huawei.com cifs: Fix use-after-free in SMB2_read
ZhangXiaoxu zhangxiaoxu5@huawei.com cifs: Fix use-after-free in SMB2_write
Aurelien Aptel aaptel@suse.com CIFS: keep FileInfo handle live during oplock break
Peter Oskolkov posk@google.com net: IP6 defrag: use rbtrees in nf_conntrack_reasm.c
Peter Oskolkov posk@google.com net: IP6 defrag: use rbtrees for IPv6 defrag
Peter Oskolkov posk@google.com net: IP defrag: encapsulate rbtree defrag code into callable functions
Toke Høiland-Jørgensen toke@redhat.com sch_cake: Simplify logic in cake_select_tin()
Pieter Jansen van Vuuren pieter.jansenvanvuuren@netronome.com nfp: flower: remove vlan CFI bit from push vlan action
Pieter Jansen van Vuuren pieter.jansenvanvuuren@netronome.com nfp: flower: replace CFI with vlan present
Toke Høiland-Jørgensen toke@redhat.com sch_cake: Make sure we can write the IP header before changing DSCP bits
Toke Høiland-Jørgensen toke@redhat.com sch_cake: Use tc_skb_protocol() helper for getting packet protocol
Jonathan Lemon jonathan.lemon@gmail.com route: Avoid crash from dereferencing NULL rt->from
Saeed Mahameed saeedm@mellanox.com net/mlx5: FPGA, tls, idr remove on flow delete
Jakub Kicinski jakub.kicinski@netronome.com net/tls: prevent bad memory access in tls_is_sk_tx_device_offloaded()
Saeed Mahameed saeedm@mellanox.com net/mlx5: FPGA, tls, hold rcu read lock a bit longer
Matteo Croce mcroce@redhat.com net: thunderx: don't allow jumbo frames with XDP
Matteo Croce mcroce@redhat.com net: thunderx: raise XDP MTU to 1508
Eric Dumazet edumazet@google.com ipv4: ensure rcu_read_lock() in ipv4_link_failure()
Stephen Suryaputra ssuryaextr@gmail.com ipv4: recompile ip options in ipv4_link_failure
Jason Wang jasowang@redhat.com vhost: reject zero size iova range
Hoang Le hoang.h.le@dektech.com.au tipc: missing entries in name table of publications
Hangbin Liu liuhangbin@gmail.com team: set slave to promisc if team is already in promisc mode
Eric Dumazet edumazet@google.com tcp: tcp_grow_window() needs to respect tcp_space()
Lorenzo Bianconi lorenzo.bianconi@redhat.com net: fou: do not use guehdr after iptunnel_pull_offloads in gue_udp_recv
Yuya Kusakabe yuya.kusakabe@gmail.com net: Fix missing meta data in skb with vlan packet
Nikolay Aleksandrov nikolay@cumulusnetworks.com net: bridge: multicast: use rcu to access port list from br_multicast_start_querier
Nikolay Aleksandrov nikolay@cumulusnetworks.com net: bridge: fix per-port af_packet sockets
Gustavo A. R. Silva gustavo@embeddedor.com net: atm: Fix potential Spectre v1 vulnerabilities
Si-Wei Liu si-wei.liu@oracle.com failover: allow name change on IFF_UP slave interfaces
Sabrina Dubroca sd@queasysnail.net bonding: fix event handling for stacked bonds
-------------
Diffstat:
Makefile | 7 +- arch/arm64/include/asm/futex.h | 2 +- arch/x86/crypto/poly1305-avx2-x86_64.S | 14 +- arch/x86/crypto/poly1305-sse2-x86_64.S | 22 +- arch/x86/events/amd/core.c | 35 ++- arch/x86/events/intel/core.c | 2 +- arch/x86/events/perf_event.h | 38 +-- arch/x86/kernel/cpu/bugs.c | 6 +- arch/x86/kernel/kprobes/core.c | 26 ++ arch/x86/kernel/process.c | 8 +- arch/x86/kvm/emulate.c | 21 +- arch/x86/kvm/svm.c | 22 +- crypto/testmgr.h | 44 ++- drivers/acpi/nfit/core.c | 63 +++-- drivers/acpi/nfit/nfit.h | 11 +- drivers/char/ipmi/ipmi_msghandler.c | 19 +- drivers/char/tpm/eventlog/tpm2.c | 4 +- drivers/char/tpm/tpm_i2c_atmel.c | 4 + drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c | 1 + drivers/gpu/drm/ttm/ttm_page_alloc.c | 5 +- drivers/hid/i2c-hid/i2c-hid-dmi-quirks.c | 3 +- drivers/iio/accel/kxcjk-1013.c | 2 + drivers/iio/adc/ad_sigma_delta.c | 1 + drivers/iio/adc/at91_adc.c | 28 +- drivers/iio/chemical/bme680.h | 6 +- drivers/iio/chemical/bme680_core.c | 54 +++- drivers/iio/chemical/bme680_i2c.c | 21 -- drivers/iio/chemical/bme680_spi.c | 115 +++++--- .../iio/common/cros_ec_sensors/cros_ec_sensors.c | 7 +- drivers/iio/dac/mcp4725.c | 1 + drivers/iio/gyro/bmg160_core.c | 6 +- drivers/iio/gyro/mpu3050-core.c | 8 +- drivers/iio/industrialio-buffer.c | 5 +- drivers/iio/industrialio-core.c | 4 +- drivers/input/mouse/elan_i2c_core.c | 25 ++ drivers/mmc/host/sdhci-esdhc-imx.c | 12 +- drivers/mmc/host/sdhci.c | 79 ++++-- drivers/mmc/host/sdhci.h | 11 +- drivers/net/bonding/bond_main.c | 6 +- drivers/net/ethernet/cavium/thunder/nicvf_main.c | 22 +- drivers/net/ethernet/mellanox/mlx5/core/fpga/tls.c | 61 ++--- drivers/net/ethernet/netronome/nfp/flower/action.c | 3 +- drivers/net/ethernet/netronome/nfp/flower/cmsg.h | 3 +- drivers/net/ethernet/netronome/nfp/flower/match.c | 14 +- drivers/net/team/team.c | 26 ++ drivers/net/wireless/ralink/rt2x00/rt2x00.h | 1 - drivers/net/wireless/ralink/rt2x00/rt2x00mac.c | 10 - drivers/net/wireless/ralink/rt2x00/rt2x00queue.c | 15 +- drivers/scsi/libfc/fc_rport.c | 1 - drivers/scsi/scsi_lib.c | 6 +- drivers/staging/comedi/drivers/ni_usb6501.c | 10 +- drivers/staging/comedi/drivers/vmk80xx.c | 8 +- drivers/staging/iio/adc/ad7192.c | 8 +- drivers/staging/iio/meter/ade7854.c | 2 +- drivers/staging/most/core.c | 2 +- drivers/tty/serial/sh-sci.c | 6 +- drivers/tty/vt/vt.c | 3 +- drivers/vhost/vhost.c | 6 +- fs/cifs/cifsglob.h | 2 + fs/cifs/file.c | 30 +- fs/cifs/misc.c | 25 +- fs/cifs/smb2misc.c | 6 +- fs/cifs/smb2ops.c | 2 + fs/cifs/smb2pdu.c | 6 +- fs/proc/task_mmu.c | 18 ++ fs/userfaultfd.c | 9 + include/linux/kprobes.h | 1 + include/linux/netdevice.h | 3 + include/linux/sched/mm.h | 21 ++ include/net/inet_frag.h | 16 +- include/net/ipv6_frag.h | 11 +- include/net/tls.h | 2 +- kernel/kprobes.c | 6 +- kernel/sched/fair.c | 25 ++ kernel/time/sched_clock.c | 4 +- kernel/time/tick-common.c | 2 + kernel/time/timekeeping.h | 7 + kernel/trace/ftrace.c | 6 +- mm/mmap.c | 7 +- mm/percpu.c | 8 +- mm/vmstat.c | 5 - net/atm/lec.c | 6 +- net/bridge/br_input.c | 23 +- net/bridge/br_multicast.c | 4 +- net/core/dev.c | 16 +- net/core/failover.c | 6 +- net/core/skbuff.c | 10 +- net/ipv4/fou.c | 4 +- net/ipv4/inet_fragment.c | 293 ++++++++++++++++++++ net/ipv4/ip_fragment.c | 302 +++------------------ net/ipv4/route.c | 16 +- net/ipv4/tcp_input.c | 10 +- net/ipv6/netfilter/nf_conntrack_reasm.c | 260 +++++------------- net/ipv6/reassembly.c | 240 +++++----------- net/ipv6/route.c | 4 + net/mac80211/driver-ops.h | 3 + net/sched/sch_cake.c | 57 ++-- net/tipc/name_table.c | 3 +- 
scripts/mod/file2alias.c | 149 ++++------ security/device_cgroup.c | 2 +- sound/core/info.c | 12 +- sound/core/init.c | 18 +- sound/pci/hda/patch_realtek.c | 6 + sound/soc/rockchip/rockchip_pcm.c | 3 +- tools/include/linux/bitops.h | 7 +- tools/include/linux/bits.h | 26 ++ tools/perf/check-headers.sh | 1 + 107 files changed, 1506 insertions(+), 1152 deletions(-)
From: Sabrina Dubroca sd@queasysnail.net
[ Upstream commit 92480b3977fd3884649d404cbbaf839b70035699 ]
When a bond is enslaved to another bond, bond_netdev_event() only handles the event as if the bond is a master, and skips treating the bond as a slave.
This leads to a refcount leak on the slave, since we don't remove the adjacency to its master and the master holds a reference on the slave.
Reproducer:

    ip link add bondL type bond
    ip link add bondU type bond
    ip link set bondL master bondU
    ip link del bondL
No "Fixes:" tag; this code is older than git history.
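To illustrate the dispatch change in user space (a minimal sketch; the handler names are made up, 0 stands in for NOTIFY_DONE, and the IFF_* values are the kernel's):

#include <stdio.h>

#define IFF_MASTER 0x400
#define IFF_SLAVE  0x800

static int handle_master(void) { puts("master handler ran"); return 0; /* NOTIFY_DONE */ }
static int handle_slave(void)  { puts("slave handler ran");  return 0; /* NOTIFY_DONE */ }

int main(void)
{
	/* A bond enslaved to another bond carries both flags. */
	unsigned int flags = IFF_MASTER | IFF_SLAVE;
	int ret = 0;

	if (flags & IFF_MASTER) {
		ret = handle_master();
		if (ret != 0)	/* the old code returned here unconditionally */
			return ret;
	}
	if (flags & IFF_SLAVE)	/* now reachable, so the slave-side handling runs too */
		ret = handle_slave();
	return ret;
}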
Signed-off-by: Sabrina Dubroca sd@queasysnail.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/bonding/bond_main.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -3198,8 +3198,12 @@ static int bond_netdev_event(struct noti return NOTIFY_DONE;
if (event_dev->flags & IFF_MASTER) { + int ret; + netdev_dbg(event_dev, "IFF_MASTER\n"); - return bond_master_netdev_event(event, event_dev); + ret = bond_master_netdev_event(event, event_dev); + if (ret != NOTIFY_DONE) + return ret; }
if (event_dev->flags & IFF_SLAVE) {
From: Si-Wei Liu si-wei.liu@oracle.com
[ Upstream commit 8065a779f17e94536a1c4dcee4f9d88011672f97 ]
When a netdev appears through hot plug and then gets enslaved by a failover master that is already up and running, the slave will be opened right away after getting enslaved. Today there's a race: userspace (udev) may fail to rename the slave if the kernel (net_failover) opens the slave earlier than when the userspace rename happens. Unlike bond or team, the primary slave of failover can't be renamed by userspace ahead of time, since the kernel-initiated auto-enslavement is unable to, or rather, is never meant to be synchronized with the rename request from userspace.
The failover slave interfaces are not designed to be operated directly by userspace apps: IP configuration, filter rules for passing network traffic, and so on should all be done on the master interface. In general, userspace apps only care about the name of the master interface, while slave names are less important as long as admin users can see reliable names that may carry other information describing the netdev. For example, they can infer that "ens3nsby" is a standby slave of "ens3", while for a name like "eth0" they can't tell which master it belongs to.
Historically the name of an IFF_UP interface can't be changed, because there might be admin scripts or management software already relying on such behavior and assuming that the slave name can't be changed once the device is UP. But failover is special: with the in-kernel auto-enslavement mechanism, the userspace expectation for device enumeration and bring-up order is already broken. Previously, initramfs and various userspace config tools were modified to bypass failover slaves because of auto-enslavement and the duplicate MAC address. Similarly, if users care about seeing reliable slave names, this new type of failover slave needs to be handled specifically in userspace anyway.
It's less risky to lift the rename restriction on a failover slave which is already UP. Although this change may potentially break a userspace component (most likely configuration scripts or management software) that assumes the slave name can't be changed while UP, that is a relatively limited and controllable set among all userspace components, and it can be fixed specifically to listen for rename events on failover slaves. Userspace components interacting with slaves are expected to be changed to operate on the failover master interface instead, as the failover slave is dynamic in nature and may come and go at any point. The goal is to make the role of failover slaves less relevant, and userspace components should only deal with the failover master in the long run.
Fixes: 30c8bd5aa8b2 ("net: Introduce generic failover module") Signed-off-by: Si-Wei Liu si-wei.liu@oracle.com Reviewed-by: Liran Alon liran.alon@oracle.com Acked-by: Sridhar Samudrala sridhar.samudrala@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/netdevice.h | 3 +++ net/core/dev.c | 16 +++++++++++++++- net/core/failover.c | 6 +++--- 3 files changed, 21 insertions(+), 4 deletions(-)
--- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1456,6 +1456,7 @@ struct net_device_ops { * @IFF_FAILOVER: device is a failover master device * @IFF_FAILOVER_SLAVE: device is lower dev of a failover master device * @IFF_L3MDEV_RX_HANDLER: only invoke the rx handler of L3 master device + * @IFF_LIVE_RENAME_OK: rename is allowed while device is up and running */ enum netdev_priv_flags { IFF_802_1Q_VLAN = 1<<0, @@ -1488,6 +1489,7 @@ enum netdev_priv_flags { IFF_FAILOVER = 1<<27, IFF_FAILOVER_SLAVE = 1<<28, IFF_L3MDEV_RX_HANDLER = 1<<29, + IFF_LIVE_RENAME_OK = 1<<30, };
#define IFF_802_1Q_VLAN IFF_802_1Q_VLAN @@ -1519,6 +1521,7 @@ enum netdev_priv_flags { #define IFF_FAILOVER IFF_FAILOVER #define IFF_FAILOVER_SLAVE IFF_FAILOVER_SLAVE #define IFF_L3MDEV_RX_HANDLER IFF_L3MDEV_RX_HANDLER +#define IFF_LIVE_RENAME_OK IFF_LIVE_RENAME_OK
/** * struct net_device - The DEVICE structure. --- a/net/core/dev.c +++ b/net/core/dev.c @@ -1180,7 +1180,21 @@ int dev_change_name(struct net_device *d BUG_ON(!dev_net(dev));
net = dev_net(dev); - if (dev->flags & IFF_UP) + + /* Some auto-enslaved devices e.g. failover slaves are + * special, as userspace might rename the device after + * the interface had been brought up and running since + * the point kernel initiated auto-enslavement. Allow + * live name change even when these slave devices are + * up and running. + * + * Typically, users of these auto-enslaving devices + * don't actually care about slave name change, as + * they are supposed to operate on master interface + * directly. + */ + if (dev->flags & IFF_UP && + likely(!(dev->priv_flags & IFF_LIVE_RENAME_OK))) return -EBUSY;
write_seqcount_begin(&devnet_rename_seq); --- a/net/core/failover.c +++ b/net/core/failover.c @@ -80,14 +80,14 @@ static int failover_slave_register(struc goto err_upper_link; }
- slave_dev->priv_flags |= IFF_FAILOVER_SLAVE; + slave_dev->priv_flags |= (IFF_FAILOVER_SLAVE | IFF_LIVE_RENAME_OK);
if (fops && fops->slave_register && !fops->slave_register(slave_dev, failover_dev)) return NOTIFY_OK;
netdev_upper_dev_unlink(slave_dev, failover_dev); - slave_dev->priv_flags &= ~IFF_FAILOVER_SLAVE; + slave_dev->priv_flags &= ~(IFF_FAILOVER_SLAVE | IFF_LIVE_RENAME_OK); err_upper_link: netdev_rx_handler_unregister(slave_dev); done: @@ -121,7 +121,7 @@ int failover_slave_unregister(struct net
netdev_rx_handler_unregister(slave_dev); netdev_upper_dev_unlink(slave_dev, failover_dev); - slave_dev->priv_flags &= ~IFF_FAILOVER_SLAVE; + slave_dev->priv_flags &= ~(IFF_FAILOVER_SLAVE | IFF_LIVE_RENAME_OK);
if (fops && fops->slave_unregister && !fops->slave_unregister(slave_dev, failover_dev))
From: "Gustavo A. R. Silva" gustavo@embeddedor.com
[ Upstream commit 899537b73557aafbdd11050b501cf54b4f5c45af ]
arg is controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
net/atm/lec.c:715 lec_mcast_attach() warn: potential spectre issue 'dev_lec' [r] (local cap)
Fix this by sanitizing arg before using it to index dev_lec.
Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1].
[1] https://lore.kernel.org/lkml/20180423164740.GY17484@dhcp22.suse.cz/
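As a sketch of the pattern applied here (the table name and bound below are illustrative stand-ins for dev_lec and MAX_LEC_ITF), the index is range-checked and then clamped under speculation before the dependent load:

#include <linux/nospec.h>

#define MAX_ITF 48			/* stand-in for MAX_LEC_ITF */

static void *itf_table[MAX_ITF];

static void *itf_lookup(int arg)
{
	if (arg < 0 || arg >= MAX_ITF)
		return NULL;
	/* array_index_nospec() forces arg toward 0 on a mis-speculated
	 * path, so the load below cannot read out of bounds speculatively */
	arg = array_index_nospec(arg, MAX_ITF);
	return itf_table[arg];
}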
Signed-off-by: Gustavo A. R. Silva gustavo@embeddedor.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/atm/lec.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/net/atm/lec.c +++ b/net/atm/lec.c @@ -710,7 +710,10 @@ static int lec_vcc_attach(struct atm_vcc
static int lec_mcast_attach(struct atm_vcc *vcc, int arg) { - if (arg < 0 || arg >= MAX_LEC_ITF || !dev_lec[arg]) + if (arg < 0 || arg >= MAX_LEC_ITF) + return -EINVAL; + arg = array_index_nospec(arg, MAX_LEC_ITF); + if (!dev_lec[arg]) return -EINVAL; vcc->proto_data = dev_lec[arg]; return lec_mcast_make(netdev_priv(dev_lec[arg]), vcc); @@ -728,6 +731,7 @@ static int lecd_attach(struct atm_vcc *v i = arg; if (arg >= MAX_LEC_ITF) return -EINVAL; + i = array_index_nospec(arg, MAX_LEC_ITF); if (!dev_lec[i]) { int size;
From: Nikolay Aleksandrov nikolay@cumulusnetworks.com
[ Upstream commit 3b2e2904deb314cc77a2192f506f2fd44e3d10d0 ]
When the commit below was introduced it changed two visible things:
 - the skb was no longer passed through the protocol handlers with the original device
 - the skb was passed up the stack with skb->dev = bridge

The first change broke af_packet sockets on bridge ports. For example we use them for hostapd which listens for ETH_P_PAE packets on the ports. We discussed two possible fixes:
 - create a clone and pass it through NF_HOOK(), act on the original skb based on the result
 - somehow signal to the caller from the okfn() that it was called, meaning the skb is ok to be passed, which this patch is trying to implement via returning 1 from the bridge link-local okfn()
Note that we rely on the fact that NF_QUEUE/STOLEN return 0 and drop/error return < 0; thus the okfn() is called only when the return value is 1, and we signal to the caller that it was called by preserving the return value from nf_hook().
Fixes: 8626c56c8279 ("bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict") Signed-off-by: Nikolay Aleksandrov nikolay@cumulusnetworks.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/bridge/br_input.c | 23 ++++++++++++++--------- 1 file changed, 14 insertions(+), 9 deletions(-)
--- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -195,13 +195,10 @@ static void __br_handle_local_finish(str /* note: already called with rcu_read_lock */ static int br_handle_local_finish(struct net *net, struct sock *sk, struct sk_buff *skb) { - struct net_bridge_port *p = br_port_get_rcu(skb->dev); - __br_handle_local_finish(skb);
- BR_INPUT_SKB_CB(skb)->brdev = p->br->dev; - br_pass_frame_up(skb); - return 0; + /* return 1 to signal the okfn() was called so it's ok to use the skb */ + return 1; }
/* @@ -278,10 +275,18 @@ rx_handler_result_t br_handle_frame(stru goto forward; }
- /* Deliver packet to local host only */ - NF_HOOK(NFPROTO_BRIDGE, NF_BR_LOCAL_IN, dev_net(skb->dev), - NULL, skb, skb->dev, NULL, br_handle_local_finish); - return RX_HANDLER_CONSUMED; + /* The else clause should be hit when nf_hook(): + * - returns < 0 (drop/error) + * - returns = 0 (stolen/nf_queue) + * Thus return 1 from the okfn() to signal the skb is ok to pass + */ + if (NF_HOOK(NFPROTO_BRIDGE, NF_BR_LOCAL_IN, + dev_net(skb->dev), NULL, skb, skb->dev, NULL, + br_handle_local_finish) == 1) { + return RX_HANDLER_PASS; + } else { + return RX_HANDLER_CONSUMED; + } }
forward:
From: Nikolay Aleksandrov nikolay@cumulusnetworks.com
[ Upstream commit c5b493ce192bd7a4e7bd073b5685aad121eeef82 ]
br_multicast_start_querier() walks over the port list but it can be called from a timer with only multicast_lock held which doesn't protect the port list, so use RCU to walk over it.
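The fix follows the standard RCU list-walk pattern, roughly as below (struct and function names are illustrative, not the bridge's): take the RCU read lock around the traversal and use the _rcu iterator, because the timer context does not hold the lock that protects list mutation.

#include <linux/rculist.h>
#include <linux/rcupdate.h>

struct demo_port {
	struct list_head list;
	int state;
};

static void demo_walk_ports(struct list_head *port_list)
{
	struct demo_port *p;

	rcu_read_lock();
	list_for_each_entry_rcu(p, port_list, list) {
		if (p->state == 0)	/* e.g. skip disabled/blocking ports */
			continue;
		/* per-port work; must not sleep inside the read-side section */
	}
	rcu_read_unlock();
}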
Fixes: c83b8fab06fc ("bridge: Restart queries when last querier expires") Signed-off-by: Nikolay Aleksandrov nikolay@cumulusnetworks.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/bridge/br_multicast.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/net/bridge/br_multicast.c +++ b/net/bridge/br_multicast.c @@ -2152,7 +2152,8 @@ static void br_multicast_start_querier(s
__br_multicast_open(br, query);
- list_for_each_entry(port, &br->port_list, list) { + rcu_read_lock(); + list_for_each_entry_rcu(port, &br->port_list, list) { if (port->state == BR_STATE_DISABLED || port->state == BR_STATE_BLOCKING) continue; @@ -2164,6 +2165,7 @@ static void br_multicast_start_querier(s br_multicast_enable(&port->ip6_own_query); #endif } + rcu_read_unlock(); }
int br_multicast_toggle(struct net_bridge *br, unsigned long val)
From: Yuya Kusakabe yuya.kusakabe@gmail.com
[ Upstream commit d85e8be2a5a02869f815dd0ac2d743deb4cd7957 ]
skb_reorder_vlan_header() should move the XDP metadata together with the Ethernet header if XDP metadata exists.
Fixes: de8f3a83b0a0 ("bpf: add meta pointer for direct access") Signed-off-by: Yuya Kusakabe yuya.kusakabe@gmail.com Signed-off-by: Takeru Hayasaka taketarou2@gmail.com Co-developed-by: Takeru Hayasaka taketarou2@gmail.com Reviewed-by: Toshiaki Makita makita.toshiaki@lab.ntt.co.jp Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/skbuff.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
--- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -5071,7 +5071,8 @@ EXPORT_SYMBOL_GPL(skb_gso_validate_mac_l
static struct sk_buff *skb_reorder_vlan_header(struct sk_buff *skb) { - int mac_len; + int mac_len, meta_len; + void *meta;
if (skb_cow(skb, skb_headroom(skb)) < 0) { kfree_skb(skb); @@ -5083,6 +5084,13 @@ static struct sk_buff *skb_reorder_vlan_ memmove(skb_mac_header(skb) + VLAN_HLEN, skb_mac_header(skb), mac_len - VLAN_HLEN - ETH_TLEN); } + + meta_len = skb_metadata_len(skb); + if (meta_len) { + meta = skb_metadata_end(skb) - meta_len; + memmove(meta + VLAN_HLEN, meta, meta_len); + } + skb->mac_header += VLAN_HLEN; return skb; }
From: Lorenzo Bianconi lorenzo.bianconi@redhat.com
[ Upstream commit 988dc4a9a3b66be75b30405a5494faf0dc7cffb6 ]
gue tunnels run iptunnel_pull_offloads on received skbs. This can lead to a use-after-free on the guehdr pointer, since the packet will be 'uncloned' by running pskb_expand_head if it is a cloned GSO skb (e.g. if the packet has been sent through a veth device).
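The bug class is "read a header field after a call that may reallocate the buffer". A minimal user-space analogue (realloc stands in for pskb_expand_head; the struct is made up):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct hdr { unsigned char proto_ctype; };

int main(void)
{
	char *buf = malloc(64);
	struct hdr *h;

	if (!buf)
		return 1;
	memset(buf, 0, 64);
	h = (struct hdr *)buf;
	h->proto_ctype = 4;

	/* Save the field *before* any call that may move the buffer. */
	unsigned char proto_ctype = h->proto_ctype;

	buf = realloc(buf, 4096);	/* may relocate; 'h' is now dangling */
	if (!buf)
		return 1;

	/* Reading h->proto_ctype here would be a use-after-free;
	 * the saved copy is safe. */
	printf("proto_ctype = %u\n", proto_ctype);
	free(buf);
	return 0;
}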
Fixes: a09a4c8dd1ec ("tunnels: Remove encapsulation offloads on decap") Signed-off-by: Lorenzo Bianconi lorenzo.bianconi@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/fou.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/net/ipv4/fou.c +++ b/net/ipv4/fou.c @@ -120,6 +120,7 @@ static int gue_udp_recv(struct sock *sk, struct guehdr *guehdr; void *data; u16 doffset = 0; + u8 proto_ctype;
if (!fou) return 1; @@ -211,13 +212,14 @@ static int gue_udp_recv(struct sock *sk, if (unlikely(guehdr->control)) return gue_control_message(skb, guehdr);
+ proto_ctype = guehdr->proto_ctype; __skb_pull(skb, sizeof(struct udphdr) + hdrlen); skb_reset_transport_header(skb);
if (iptunnel_pull_offloads(skb)) goto drop;
- return -guehdr->proto_ctype; + return -proto_ctype;
drop: kfree_skb(skb);
From: Eric Dumazet edumazet@google.com
[ Upstream commit 50ce163a72d817a99e8974222dcf2886d5deb1ae ]
For some reason, tcp_grow_window() correctly tests whether enough room is present before attempting to increase tp->rcv_ssthresh, but does not prevent it from growing past tcp_space().

This causes hard-to-debug issues, like failing the (__tcp_select_window(sk) >= tp->rcv_wnd) test in __tcp_ack_snd_check(), leading to ACK delays and possibly slow flows.
Depending on tcp_rmem[2], MTU, skb->len/skb->truesize ratio, we can see the problem happening on "netperf -t TCP_RR -- -r 2000,2000" after about 60 round trips, when the active side no longer sends immediate acks.
This bug predates git history.
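A small worked example of the new clamping (all numbers are made up; the kernel's min_t(int, ...) is replaced by a local helper):

#include <stdio.h>

static int min_int(int a, int b) { return a < b ? a : b; }

int main(void)
{
	int window_clamp = 65535;
	int tcp_space    = 40000;	/* free receive-buffer space */
	int rcv_ssthresh = 38000;
	int incr         = 4000;	/* candidate increase */

	int room = min_int(window_clamp, tcp_space) - rcv_ssthresh;	/* 2000 */

	if (room > 0)
		rcv_ssthresh += min_int(room, incr);

	/* rcv_ssthresh is now 40000, i.e. capped at tcp_space(); the old
	 * min(rcv_ssthresh + incr, window_clamp) would have given 42000. */
	printf("rcv_ssthresh = %d\n", rcv_ssthresh);
	return 0;
}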
Signed-off-by: Eric Dumazet edumazet@google.com Acked-by: Soheil Hassas Yeganeh soheil@google.com Acked-by: Neal Cardwell ncardwell@google.com Acked-by: Wei Wang weiwan@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/tcp_input.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
--- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -402,11 +402,12 @@ static int __tcp_grow_window(const struc static void tcp_grow_window(struct sock *sk, const struct sk_buff *skb) { struct tcp_sock *tp = tcp_sk(sk); + int room; + + room = min_t(int, tp->window_clamp, tcp_space(sk)) - tp->rcv_ssthresh;
/* Check #1 */ - if (tp->rcv_ssthresh < tp->window_clamp && - (int)tp->rcv_ssthresh < tcp_space(sk) && - !tcp_under_memory_pressure(sk)) { + if (room > 0 && !tcp_under_memory_pressure(sk)) { int incr;
/* Check #2. Increase window, if skb with such overhead @@ -419,8 +420,7 @@ static void tcp_grow_window(struct sock
if (incr) { incr = max_t(int, incr, 2 * skb->len); - tp->rcv_ssthresh = min(tp->rcv_ssthresh + incr, - tp->window_clamp); + tp->rcv_ssthresh += min(room, incr); inet_csk(sk)->icsk_ack.quick |= 1; } }
From: Hangbin Liu liuhangbin@gmail.com
[ Upstream commit 43c2adb9df7ddd6560fd3546d925b42cef92daa0 ]
After adding a team interface to a bridge, the team interface will enter promisc mode. Then if we add a new slave to team0, the slave will keep promisc off. Fix it by setting the slave to promisc on if the team master is already in promisc mode; do the same for allmulti.

v2: add promisc and allmulti checking when deleting ports
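Sketched out of context (function and variable names are illustrative), the enslave-time propagation with rollback looks roughly like this; dev_set_promiscuity()/dev_set_allmulti() take a +1/-1 delta on a reference-style counter:

#include <linux/netdevice.h>

static int mirror_rx_mode(struct net_device *master, struct net_device *slave)
{
	int err;

	if (master->flags & IFF_PROMISC) {
		err = dev_set_promiscuity(slave, 1);
		if (err)
			return err;
	}

	if (master->flags & IFF_ALLMULTI) {
		err = dev_set_allmulti(slave, 1);
		if (err) {
			/* undo the first step so the counters stay balanced */
			if (master->flags & IFF_PROMISC)
				dev_set_promiscuity(slave, -1);
			return err;
		}
	}
	return 0;
}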
Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device") Signed-off-by: Hangbin Liu liuhangbin@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/team/team.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+)
--- a/drivers/net/team/team.c +++ b/drivers/net/team/team.c @@ -1250,6 +1250,23 @@ static int team_port_add(struct team *te goto err_option_port_add; }
+ /* set promiscuity level to new slave */ + if (dev->flags & IFF_PROMISC) { + err = dev_set_promiscuity(port_dev, 1); + if (err) + goto err_set_slave_promisc; + } + + /* set allmulti level to new slave */ + if (dev->flags & IFF_ALLMULTI) { + err = dev_set_allmulti(port_dev, 1); + if (err) { + if (dev->flags & IFF_PROMISC) + dev_set_promiscuity(port_dev, -1); + goto err_set_slave_promisc; + } + } + netif_addr_lock_bh(dev); dev_uc_sync_multiple(port_dev, dev); dev_mc_sync_multiple(port_dev, dev); @@ -1266,6 +1283,9 @@ static int team_port_add(struct team *te
return 0;
+err_set_slave_promisc: + __team_option_inst_del_port(team, port); + err_option_port_add: team_upper_dev_unlink(team, port);
@@ -1311,6 +1331,12 @@ static int team_port_del(struct team *te
team_port_disable(team, port); list_del_rcu(&port->list); + + if (dev->flags & IFF_PROMISC) + dev_set_promiscuity(port_dev, -1); + if (dev->flags & IFF_ALLMULTI) + dev_set_allmulti(port_dev, -1); + team_upper_dev_unlink(team, port); netdev_rx_handler_unregister(port_dev); team_port_disable_netpoll(port);
From: Hoang Le hoang.h.le@dektech.com.au
[ Upstream commit d1841533e54876f152a30ac398a34f47ad6590b1 ]
When binding multiple services with specific types such as 1Ki, 2Ki, ..., some entries in the name table of publications go missing when listed via 'tipc name show'.

The problem is in how a zero last_type value provided via netlink is identified. It can be the initial 'type' when starting a name table dump, or it can be a genuine zero type carried over from the previous iteration (the node state service type). In the latter case the lookup function fails to find the node state service type in the next iteration.

To solve this, add a further condition to recognize a continued dump and look up the correct service type for the next iteration, instead of selecting the first service as if 'type' zero were the initial value.
Acked-by: Jon Maloy jon.maloy@ericsson.com Signed-off-by: Hoang Le hoang.h.le@dektech.com.au Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/tipc/name_table.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/tipc/name_table.c +++ b/net/tipc/name_table.c @@ -908,7 +908,8 @@ static int tipc_nl_service_list(struct n for (; i < TIPC_NAMETBL_SIZE; i++) { head = &tn->nametbl->services[i];
- if (*last_type) { + if (*last_type || + (!i && *last_key && (*last_lower == *last_key))) { service = tipc_service_find(net, *last_type); if (!service) return -EPIPE;
From: Jason Wang jasowang@redhat.com
[ Upstream commit 813dbeb656d6c90266f251d8bd2b02d445afa63f ]
We used to accept a zero-size iova range, which will lead to an infinite loop in translate_desc(). Fix this by failing the request in this case.
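An illustrative user-space model of why a zero-size region stalls a translation loop (the structure and loop are simplified stand-ins, not the vhost code; an iteration guard is added so the demo terminates):

#include <stdio.h>

struct region { unsigned long start; unsigned long size; };

int main(void)
{
	struct region r = { .start = 0x1000, .size = 0 };
	unsigned long addr = 0x1000, len = 16, done = 0;
	int iterations = 0;

	while (done < len && iterations < 5) {
		unsigned long chunk = r.size - (addr - r.start);

		if (chunk > len - done)
			chunk = len - done;
		done += chunk;		/* chunk == 0: no forward progress */
		addr += chunk;
		iterations++;
	}
	printf("done=%lu after %d iterations\n", done, iterations);
	return 0;
}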
Reported-by: syzbot+d21e6e297322a900c128@syzkaller.appspotmail.com Fixes: 6b1e6cc7 ("vhost: new device IOTLB API") Signed-off-by: Jason Wang jasowang@redhat.com Acked-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/vhost/vhost.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -911,8 +911,12 @@ static int vhost_new_umem_range(struct v u64 start, u64 size, u64 end, u64 userspace_addr, int perm) { - struct vhost_umem_node *tmp, *node = kmalloc(sizeof(*node), GFP_ATOMIC); + struct vhost_umem_node *tmp, *node;
+ if (!size) + return -EFAULT; + + node = kmalloc(sizeof(*node), GFP_ATOMIC); if (!node) return -ENOMEM;
From: Stephen Suryaputra ssuryaextr@gmail.com
[ Upstream commit ed0de45a1008991fdaa27a0152befcb74d126a8b ]
Recompile IP options since IPCB may not be valid anymore when ipv4_link_failure is called from arp_error_report.
Refer to the commit 3da1ed7ac398 ("net: avoid use IPCB in cipso_v4_error") and the commit before that (9ef6b42ad6fd) for a similar issue.
Signed-off-by: Stephen Suryaputra ssuryaextr@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/route.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
--- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -1188,8 +1188,16 @@ static struct dst_entry *ipv4_dst_check( static void ipv4_link_failure(struct sk_buff *skb) { struct rtable *rt; + struct ip_options opt;
- icmp_send(skb, ICMP_DEST_UNREACH, ICMP_HOST_UNREACH, 0); + /* Recompile ip options since IPCB may not be valid anymore. + */ + memset(&opt, 0, sizeof(opt)); + opt.optlen = ip_hdr(skb)->ihl*4 - sizeof(struct iphdr); + if (__ip_options_compile(dev_net(skb->dev), &opt, skb, NULL)) + return; + + __icmp_send(skb, ICMP_DEST_UNREACH, ICMP_HOST_UNREACH, 0, &opt);
rt = skb_rtable(skb); if (rt)
From: Eric Dumazet edumazet@google.com
[ Upstream commit c543cb4a5f07e09237ec0fc2c60c9f131b2c79ad ]
fib_compute_spec_dst() needs to be called under rcu protection.
syzbot reported :
WARNING: suspicious RCU usage
5.1.0-rc4+ #165 Not tainted
include/linux/inetdevice.h:220 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
1 lock held by swapper/0/0:
 #0: 0000000051b67925 ((&n->timer)){+.-.}, at: lockdep_copy_map include/linux/lockdep.h:170 [inline]
 #0: 0000000051b67925 ((&n->timer)){+.-.}, at: call_timer_fn+0xda/0x720 kernel/time/timer.c:1315
stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.1.0-rc4+ #165
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 lockdep_rcu_suspicious+0x153/0x15d kernel/locking/lockdep.c:5162
 __in_dev_get_rcu include/linux/inetdevice.h:220 [inline]
 fib_compute_spec_dst+0xbbd/0x1030 net/ipv4/fib_frontend.c:294
 spec_dst_fill net/ipv4/ip_options.c:245 [inline]
 __ip_options_compile+0x15a7/0x1a10 net/ipv4/ip_options.c:343
 ipv4_link_failure+0x172/0x400 net/ipv4/route.c:1195
 dst_link_failure include/net/dst.h:427 [inline]
 arp_error_report+0xd1/0x1c0 net/ipv4/arp.c:297
 neigh_invalidate+0x24b/0x570 net/core/neighbour.c:995
 neigh_timer_handler+0xc35/0xf30 net/core/neighbour.c:1081
 call_timer_fn+0x190/0x720 kernel/time/timer.c:1325
 expire_timers kernel/time/timer.c:1362 [inline]
 __run_timers kernel/time/timer.c:1681 [inline]
 __run_timers kernel/time/timer.c:1649 [inline]
 run_timer_softirq+0x652/0x1700 kernel/time/timer.c:1694
 __do_softirq+0x266/0x95a kernel/softirq.c:293
 invoke_softirq kernel/softirq.c:374 [inline]
 irq_exit+0x180/0x1d0 kernel/softirq.c:414
 exiting_irq arch/x86/include/asm/apic.h:536 [inline]
 smp_apic_timer_interrupt+0x14a/0x570 arch/x86/kernel/apic/apic.c:1062
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:807
Fixes: ed0de45a1008 ("ipv4: recompile ip options in ipv4_link_failure") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Cc: Stephen Suryaputra ssuryaextr@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/route.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -1187,14 +1187,20 @@ static struct dst_entry *ipv4_dst_check(
static void ipv4_link_failure(struct sk_buff *skb) { - struct rtable *rt; struct ip_options opt; + struct rtable *rt; + int res;
/* Recompile ip options since IPCB may not be valid anymore. */ memset(&opt, 0, sizeof(opt)); opt.optlen = ip_hdr(skb)->ihl*4 - sizeof(struct iphdr); - if (__ip_options_compile(dev_net(skb->dev), &opt, skb, NULL)) + + rcu_read_lock(); + res = __ip_options_compile(dev_net(skb->dev), &opt, skb, NULL); + rcu_read_unlock(); + + if (res) return;
__icmp_send(skb, ICMP_DEST_UNREACH, ICMP_HOST_UNREACH, 0, &opt);
From: Matteo Croce mcroce@redhat.com
[ Upstream commit 5ee15c101f29e0093ffb5448773ccbc786eb313b ]
The thunderx driver splits frames bigger than 1530 bytes across multiple pages, making it impossible to run an eBPF program on them. This leads to a maximum MTU of 1508 if QinQ is in use.

The thunderx driver refuses to load an eBPF program if the MTU is higher than 1500 bytes. Raise the limit to 1508 so it is possible to use L2 protocols which need some more headroom.
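The new limit is just the arithmetic spelled out (ETH_HLEN and VLAN_HLEN values as the kernel defines them):

#include <stdio.h>

#define ETH_HLEN	14	/* Ethernet header */
#define VLAN_HLEN	4	/* one 802.1Q tag */

int main(void)
{
	int max_frame = 1530;	/* largest frame handled as a single page */
	int max_xdp_mtu = max_frame - ETH_HLEN - VLAN_HLEN * 2;	/* two tags for QinQ */

	printf("MAX_XDP_MTU = %d\n", max_xdp_mtu);	/* 1508 */
	return 0;
}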
Fixes: 05c773f52b96e ("net: thunderx: Add basic XDP support") Signed-off-by: Matteo Croce mcroce@redhat.com Acked-by: Jesper Dangaard Brouer brouer@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/cavium/thunder/nicvf_main.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -32,6 +32,13 @@ #define DRV_NAME "nicvf" #define DRV_VERSION "1.0"
+/* NOTE: Packets bigger than 1530 are split across multiple pages and XDP needs + * the buffer to be contiguous. Allow XDP to be set up only if we don't exceed + * this value, keeping headroom for the 14 byte Ethernet header and two + * VLAN tags (for QinQ) + */ +#define MAX_XDP_MTU (1530 - ETH_HLEN - VLAN_HLEN * 2) + /* Supported devices */ static const struct pci_device_id nicvf_id_table[] = { { PCI_DEVICE_SUB(PCI_VENDOR_ID_CAVIUM, @@ -1795,8 +1802,10 @@ static int nicvf_xdp_setup(struct nicvf bool bpf_attached = false; int ret = 0;
- /* For now just support only the usual MTU sized frames */ - if (prog && (dev->mtu > 1500)) { + /* For now just support only the usual MTU sized frames, + * plus some headroom for VLAN, QinQ. + */ + if (prog && dev->mtu > MAX_XDP_MTU) { netdev_warn(dev, "Jumbo frames not yet supported with XDP, current MTU %d.\n", dev->mtu); return -EOPNOTSUPP;
From: Matteo Croce mcroce@redhat.com
[ Upstream commit 1f227d16083b2e280b7dde4ca78883d75593f2fd ]
The thunderx driver refuses to load an eBPF program if the MTU is too high, but this can be circumvented by loading the eBPF program first and then raising the MTU.
Fix this by limiting the MTU if an eBPF program is already loaded.
Fixes: 05c773f52b96e ("net: thunderx: Add basic XDP support") Signed-off-by: Matteo Croce mcroce@redhat.com Acked-by: Jesper Dangaard Brouer brouer@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/cavium/thunder/nicvf_main.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -1554,6 +1554,15 @@ static int nicvf_change_mtu(struct net_d struct nicvf *nic = netdev_priv(netdev); int orig_mtu = netdev->mtu;
+ /* For now just support only the usual MTU sized frames, + * plus some headroom for VLAN, QinQ. + */ + if (nic->xdp_prog && new_mtu > MAX_XDP_MTU) { + netdev_warn(netdev, "Jumbo frames not yet supported with XDP, current MTU %d.\n", + netdev->mtu); + return -EINVAL; + } + netdev->mtu = new_mtu;
if (!netif_running(netdev))
From: Saeed Mahameed saeedm@mellanox.com
[ Upstream commit 31634bf5dcc418b5b2cacd954394c0c4620db6a2 ]
To avoid use-after-free, hold the rcu read lock until we are done copying flow data into the command buffer.
Fixes: ab412e1dd7db ("net/mlx5: Accel, add TLS rx offload routines") Reported-by: Eric Dumazet eric.dumazet@gmail.com Signed-off-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx5/core/fpga/tls.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlx5/core/fpga/tls.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fpga/tls.c @@ -217,22 +217,22 @@ int mlx5_fpga_tls_resync_rx(struct mlx5_ void *cmd; int ret;
- rcu_read_lock(); - flow = idr_find(&mdev->fpga->tls->rx_idr, ntohl(handle)); - rcu_read_unlock(); - - if (!flow) { - WARN_ONCE(1, "Received NULL pointer for handle\n"); - return -EINVAL; - } - buf = kzalloc(size, GFP_ATOMIC); if (!buf) return -ENOMEM;
cmd = (buf + 1);
+ rcu_read_lock(); + flow = idr_find(&mdev->fpga->tls->rx_idr, ntohl(handle)); + if (unlikely(!flow)) { + rcu_read_unlock(); + WARN_ONCE(1, "Received NULL pointer for handle\n"); + kfree(buf); + return -EINVAL; + } mlx5_fpga_tls_flow_to_cmd(flow, cmd); + rcu_read_unlock();
MLX5_SET(tls_cmd, cmd, swid, ntohl(handle)); MLX5_SET64(tls_cmd, cmd, tls_rcd_sn, be64_to_cpu(rcd_sn));
From: Jakub Kicinski jakub.kicinski@netronome.com
[ Upstream commit b4f47f3848eb70986f75d06112af7b48b7f5f462 ]
Unlike the '&&' operator, '&' does not have short-circuit evaluation semantics; in other words, both sides of the operator always get evaluated. Fix the wrong operator in tls_is_sk_tx_device_offloaded(), which would lead to out-of-bounds access for non-full sockets.
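A tiny user-space demonstration of the difference (purely illustrative): with '&' the right-hand side runs even when the left-hand side is false, which in the kernel case means touching state that only exists on full sockets.

#include <stdio.h>

static int side_effect(void)
{
	puts("right-hand side evaluated");
	return 1;
}

int main(void)
{
	int ok = 0;

	if (ok && side_effect())	/* short-circuits: side_effect() never runs */
		puts("&& taken");
	if (ok & side_effect())		/* bitwise &: side_effect() always runs */
		puts("& taken");
	return 0;
}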
Fixes: 4799ac81e52a ("tls: Add rx inline crypto offload") Signed-off-by: Jakub Kicinski jakub.kicinski@netronome.com Reviewed-by: Dirk van der Merwe dirk.vandermerwe@netronome.com Reviewed-by: Simon Horman simon.horman@netronome.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/tls.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/net/tls.h +++ b/include/net/tls.h @@ -317,7 +317,7 @@ tls_validate_xmit_skb(struct sock *sk, s static inline bool tls_is_sk_tx_device_offloaded(struct sock *sk) { #ifdef CONFIG_SOCK_VALIDATE_XMIT - return sk_fullsock(sk) & + return sk_fullsock(sk) && (smp_load_acquire(&sk->sk_validate_xmit_skb) == &tls_validate_xmit_skb); #else
From: Saeed Mahameed saeedm@mellanox.com
[ Upstream commit df3a8344d404a810b4aadbf19b08c8232fbaa715 ]
The flow is kfreed in mlx5_fpga_tls_del_flow but kept in the idr data structure. This is risky and can cause a use-after-free, since the idr_remove is delayed until tls_send_teardown_cmd completion.

Instead of delaying idr_remove, in this patch we do it in mlx5_fpga_tls_del_flow, before the actual kfree(flow).

Also add synchronize_rcu before kfree(flow).
Fixes: ab412e1dd7db ("net/mlx5: Accel, add TLS rx offload routines") Signed-off-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx5/core/fpga/tls.c | 43 +++++++-------------- 1 file changed, 15 insertions(+), 28 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlx5/core/fpga/tls.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fpga/tls.c @@ -148,14 +148,16 @@ static int mlx5_fpga_tls_alloc_swid(stru return ret; }
-static void mlx5_fpga_tls_release_swid(struct idr *idr, - spinlock_t *idr_spinlock, u32 swid) +static void *mlx5_fpga_tls_release_swid(struct idr *idr, + spinlock_t *idr_spinlock, u32 swid) { unsigned long flags; + void *ptr;
spin_lock_irqsave(idr_spinlock, flags); - idr_remove(idr, swid); + ptr = idr_remove(idr, swid); spin_unlock_irqrestore(idr_spinlock, flags); + return ptr; }
static void mlx_tls_kfree_complete(struct mlx5_fpga_conn *conn, @@ -165,20 +167,12 @@ static void mlx_tls_kfree_complete(struc kfree(buf); }
-struct mlx5_teardown_stream_context { - struct mlx5_fpga_tls_command_context cmd; - u32 swid; -}; - static void mlx5_fpga_tls_teardown_completion(struct mlx5_fpga_conn *conn, struct mlx5_fpga_device *fdev, struct mlx5_fpga_tls_command_context *cmd, struct mlx5_fpga_dma_buf *resp) { - struct mlx5_teardown_stream_context *ctx = - container_of(cmd, struct mlx5_teardown_stream_context, cmd); - if (resp) { u32 syndrome = MLX5_GET(tls_resp, resp->sg[0].data, syndrome);
@@ -186,14 +180,6 @@ mlx5_fpga_tls_teardown_completion(struct mlx5_fpga_err(fdev, "Teardown stream failed with syndrome = %d", syndrome); - else if (MLX5_GET(tls_cmd, cmd->buf.sg[0].data, direction_sx)) - mlx5_fpga_tls_release_swid(&fdev->tls->tx_idr, - &fdev->tls->tx_idr_spinlock, - ctx->swid); - else - mlx5_fpga_tls_release_swid(&fdev->tls->rx_idr, - &fdev->tls->rx_idr_spinlock, - ctx->swid); } mlx5_fpga_tls_put_command_ctx(cmd); } @@ -253,7 +239,7 @@ int mlx5_fpga_tls_resync_rx(struct mlx5_ static void mlx5_fpga_tls_send_teardown_cmd(struct mlx5_core_dev *mdev, void *flow, u32 swid, gfp_t flags) { - struct mlx5_teardown_stream_context *ctx; + struct mlx5_fpga_tls_command_context *ctx; struct mlx5_fpga_dma_buf *buf; void *cmd;
@@ -261,7 +247,7 @@ static void mlx5_fpga_tls_send_teardown_ if (!ctx) return;
- buf = &ctx->cmd.buf; + buf = &ctx->buf; cmd = (ctx + 1); MLX5_SET(tls_cmd, cmd, command_type, CMD_TEARDOWN_STREAM); MLX5_SET(tls_cmd, cmd, swid, swid); @@ -272,8 +258,7 @@ static void mlx5_fpga_tls_send_teardown_ buf->sg[0].data = cmd; buf->sg[0].size = MLX5_TLS_COMMAND_SIZE;
- ctx->swid = swid; - mlx5_fpga_tls_cmd_send(mdev->fpga, &ctx->cmd, + mlx5_fpga_tls_cmd_send(mdev->fpga, ctx, mlx5_fpga_tls_teardown_completion); }
@@ -283,13 +268,14 @@ void mlx5_fpga_tls_del_flow(struct mlx5_ struct mlx5_fpga_tls *tls = mdev->fpga->tls; void *flow;
- rcu_read_lock(); if (direction_sx) - flow = idr_find(&tls->tx_idr, swid); + flow = mlx5_fpga_tls_release_swid(&tls->tx_idr, + &tls->tx_idr_spinlock, + swid); else - flow = idr_find(&tls->rx_idr, swid); - - rcu_read_unlock(); + flow = mlx5_fpga_tls_release_swid(&tls->rx_idr, + &tls->rx_idr_spinlock, + swid);
if (!flow) { mlx5_fpga_err(mdev->fpga, "No flow information for swid %u\n", @@ -297,6 +283,7 @@ void mlx5_fpga_tls_del_flow(struct mlx5_ return; }
+ synchronize_rcu(); /* before kfree(flow) */ mlx5_fpga_tls_send_teardown_cmd(mdev, flow, swid, flags); }
From: Jonathan Lemon jonathan.lemon@gmail.com
[ Upstream commit 9c69a13205151c0d801de9f9d83a818e6e8f60ec ]
When __ip6_rt_update_pmtu() is called, rt->from is RCU dereferenced, but is never checked for null - rt6_flush_exceptions() may have removed the entry.
[ 1913.989004] RIP: 0010:ip6_rt_cache_alloc+0x13/0x170
[ 1914.209410] Call Trace:
[ 1914.214798]  <IRQ>
[ 1914.219226]  __ip6_rt_update_pmtu+0xb0/0x190
[ 1914.228649]  ip6_tnl_xmit+0x2c2/0x970 [ip6_tunnel]
[ 1914.239223]  ? ip6_tnl_parse_tlv_enc_lim+0x32/0x1a0 [ip6_tunnel]
[ 1914.252489]  ? __gre6_xmit+0x148/0x530 [ip6_gre]
[ 1914.262678]  ip6gre_tunnel_xmit+0x17e/0x3c7 [ip6_gre]
[ 1914.273831]  dev_hard_start_xmit+0x8d/0x1f0
[ 1914.283061]  sch_direct_xmit+0xfa/0x230
[ 1914.291521]  __qdisc_run+0x154/0x4b0
[ 1914.299407]  net_tx_action+0x10e/0x1f0
[ 1914.307678]  __do_softirq+0xca/0x297
[ 1914.315567]  irq_exit+0x96/0xa0
[ 1914.322494]  smp_apic_timer_interrupt+0x68/0x130
[ 1914.332683]  apic_timer_interrupt+0xf/0x20
[ 1914.341721]  </IRQ>
Fixes: a68886a69180 ("net/ipv6: Make from in rt6_info rcu protected") Signed-off-by: Jonathan Lemon jonathan.lemon@gmail.com Reviewed-by: Eric Dumazet edumazet@google.com Reviewed-by: David Ahern dsahern@gmail.com Reviewed-by: Martin KaFai Lau kafai@fb.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/route.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -2367,6 +2367,10 @@ static void __ip6_rt_update_pmtu(struct
rcu_read_lock(); from = rcu_dereference(rt6->from); + if (!from) { + rcu_read_unlock(); + return; + } nrt6 = ip6_rt_cache_alloc(from, daddr, saddr); if (nrt6) { rt6_do_update_pmtu(nrt6, mtu);
From: Toke Høiland-Jørgensen toke@redhat.com
[ Upstream commit b2100cc56fca8c51d28aa42a9f1fbcb2cf351996 ]
We shouldn't be using skb->protocol directly as that will miss cases with hardware-accelerated VLAN tags. Use the helper instead to get the right protocol number.
Reported-by: Kevin Darbyshire-Bryant kevin@darbyshire-bryant.me.uk Signed-off-by: Toke Høiland-Jørgensen toke@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_cake.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/sched/sch_cake.c +++ b/net/sched/sch_cake.c @@ -1526,7 +1526,7 @@ static u8 cake_handle_diffserv(struct sk { u8 dscp;
- switch (skb->protocol) { + switch (tc_skb_protocol(skb)) { case htons(ETH_P_IP): dscp = ipv4_get_dsfield(ip_hdr(skb)) >> 2; if (wash && dscp)
From: Toke Høiland-Jørgensen toke@redhat.com
[ Upstream commit c87b4ecdbe8db27867a7b7f840291cd843406bd7 ]
There is not actually any guarantee that the IP headers are valid before we access the DSCP bits of the packets. Fix this using the same approach taken in sch_dsmark.
Reported-by: Kevin Darbyshire-Bryant kevin@darbyshire-bryant.me.uk Signed-off-by: Toke Høiland-Jørgensen toke@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_cake.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
--- a/net/sched/sch_cake.c +++ b/net/sched/sch_cake.c @@ -1524,16 +1524,27 @@ static void cake_wash_diffserv(struct sk
static u8 cake_handle_diffserv(struct sk_buff *skb, u16 wash) { + int wlen = skb_network_offset(skb); u8 dscp;
switch (tc_skb_protocol(skb)) { case htons(ETH_P_IP): + wlen += sizeof(struct iphdr); + if (!pskb_may_pull(skb, wlen) || + skb_try_make_writable(skb, wlen)) + return 0; + dscp = ipv4_get_dsfield(ip_hdr(skb)) >> 2; if (wash && dscp) ipv4_change_dsfield(ip_hdr(skb), INET_ECN_MASK, 0); return dscp;
case htons(ETH_P_IPV6): + wlen += sizeof(struct ipv6hdr); + if (!pskb_may_pull(skb, wlen) || + skb_try_make_writable(skb, wlen)) + return 0; + dscp = ipv6_get_dsfield(ipv6_hdr(skb)) >> 2; if (wash && dscp) ipv6_change_dsfield(ipv6_hdr(skb), INET_ECN_MASK, 0);
From: Pieter Jansen van Vuuren pieter.jansenvanvuuren@netronome.com
[ Upstream commit f7ee799a51ddbcc205ef615fe424fb5084e9e0aa ]
Replace the vlan CFI bit with a vlan present bit that indicates the presence of a vlan tag. Previously the driver incorrectly assumed that a vlan id of 0 is not matchable; therefore we now indicate vlan presence with a dedicated present bit.
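A small illustration of why an explicit present bit is needed (the macro names are made up, but the layout mirrors the 12-bit VID, 3-bit priority and bit 12 used here): a tag with VID 0 and priority 0 would otherwise be encoded exactly like "no VLAN".

#include <stdio.h>
#include <stdint.h>

#define VLAN_PRESENT	(1u << 12)	/* replaces the old CFI bit */
#define VLAN_PRIO(p)	(((p) & 0x7u) << 13)
#define VLAN_VID(v)	((v) & 0xfffu)

int main(void)
{
	uint16_t vid = 0, prio = 0;

	/* old scheme: only encode the tag when (vid || prio) is true */
	uint16_t old_tci = (vid || prio) ? (VLAN_PRIO(prio) | VLAN_VID(vid)) : 0;
	/* new scheme: presence is its own bit, so VID 0 stays matchable */
	uint16_t new_tci = VLAN_PRESENT | VLAN_PRIO(prio) | VLAN_VID(vid);

	printf("old tci = 0x%04x, new tci = 0x%04x\n", old_tci, new_tci);
	return 0;
}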
Fixes: 5571e8c9f241 ("nfp: extend flower matching capabilities") Signed-off-by: Pieter Jansen van Vuuren pieter.jansenvanvuuren@netronome.com Signed-off-by: Louis Peens louis.peens@netronome.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/netronome/nfp/flower/cmsg.h | 2 +- drivers/net/ethernet/netronome/nfp/flower/match.c | 14 ++++++-------- 2 files changed, 7 insertions(+), 9 deletions(-)
--- a/drivers/net/ethernet/netronome/nfp/flower/cmsg.h +++ b/drivers/net/ethernet/netronome/nfp/flower/cmsg.h @@ -55,7 +55,7 @@ #define NFP_FLOWER_LAYER2_GENEVE_OP BIT(6)
#define NFP_FLOWER_MASK_VLAN_PRIO GENMASK(15, 13) -#define NFP_FLOWER_MASK_VLAN_CFI BIT(12) +#define NFP_FLOWER_MASK_VLAN_PRESENT BIT(12) #define NFP_FLOWER_MASK_VLAN_VID GENMASK(11, 0)
#define NFP_FLOWER_MASK_MPLS_LB GENMASK(31, 12) --- a/drivers/net/ethernet/netronome/nfp/flower/match.c +++ b/drivers/net/ethernet/netronome/nfp/flower/match.c @@ -56,14 +56,12 @@ nfp_flower_compile_meta_tci(struct nfp_f FLOW_DISSECTOR_KEY_VLAN, target); /* Populate the tci field. */ - if (flow_vlan->vlan_id || flow_vlan->vlan_priority) { - tmp_tci = FIELD_PREP(NFP_FLOWER_MASK_VLAN_PRIO, - flow_vlan->vlan_priority) | - FIELD_PREP(NFP_FLOWER_MASK_VLAN_VID, - flow_vlan->vlan_id) | - NFP_FLOWER_MASK_VLAN_CFI; - frame->tci = cpu_to_be16(tmp_tci); - } + tmp_tci = NFP_FLOWER_MASK_VLAN_PRESENT; + tmp_tci |= FIELD_PREP(NFP_FLOWER_MASK_VLAN_PRIO, + flow_vlan->vlan_priority) | + FIELD_PREP(NFP_FLOWER_MASK_VLAN_VID, + flow_vlan->vlan_id); + frame->tci = cpu_to_be16(tmp_tci); } }
From: Pieter Jansen van Vuuren pieter.jansenvanvuuren@netronome.com
[ Upstream commit 42cd5484a22f1a1b947e21e2af65fa7dab09d017 ]
We no longer set CFI when pushing vlan tags; therefore, remove the CFI bit from push vlan.
Fixes: 1a1e586f54bf ("nfp: add basic action capabilities to flower offloads") Signed-off-by: Pieter Jansen van Vuuren pieter.jansenvanvuuren@netronome.com Signed-off-by: Louis Peens louis.peens@netronome.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/netronome/nfp/flower/action.c | 3 +-- drivers/net/ethernet/netronome/nfp/flower/cmsg.h | 1 - 2 files changed, 1 insertion(+), 3 deletions(-)
--- a/drivers/net/ethernet/netronome/nfp/flower/action.c +++ b/drivers/net/ethernet/netronome/nfp/flower/action.c @@ -80,8 +80,7 @@ nfp_fl_push_vlan(struct nfp_fl_push_vlan
tmp_push_vlan_tci = FIELD_PREP(NFP_FL_PUSH_VLAN_PRIO, tcf_vlan_push_prio(action)) | - FIELD_PREP(NFP_FL_PUSH_VLAN_VID, tcf_vlan_push_vid(action)) | - NFP_FL_PUSH_VLAN_CFI; + FIELD_PREP(NFP_FL_PUSH_VLAN_VID, tcf_vlan_push_vid(action)); push_vlan->vlan_tci = cpu_to_be16(tmp_push_vlan_tci); }
--- a/drivers/net/ethernet/netronome/nfp/flower/cmsg.h +++ b/drivers/net/ethernet/netronome/nfp/flower/cmsg.h @@ -109,7 +109,6 @@ #define NFP_FL_OUT_FLAGS_TYPE_IDX GENMASK(2, 0)
#define NFP_FL_PUSH_VLAN_PRIO GENMASK(15, 13) -#define NFP_FL_PUSH_VLAN_CFI BIT(12) #define NFP_FL_PUSH_VLAN_VID GENMASK(11, 0)
/* LAG ports */
From: Toke Høiland-Jørgensen toke@redhat.com
[ Upstream commit 4976e3c683f328bc6f2edef555a4ffee6524486f ]
The logic in cake_select_tin() was getting a bit hairy, and it turns out we can simplify it quite a bit. This also allows us to get rid of one of the two diffserv parsing functions, which has the added benefit that already-zeroed DSCP fields won't get re-written.
Suggested-by: Kevin Darbyshire-Bryant ldir@darbyshire-bryant.me.uk Signed-off-by: Toke Høiland-Jørgensen toke@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_cake.c | 44 ++++++++++++++++---------------------------- 1 file changed, 16 insertions(+), 28 deletions(-)
--- a/net/sched/sch_cake.c +++ b/net/sched/sch_cake.c @@ -1508,20 +1508,6 @@ static unsigned int cake_drop(struct Qdi return idx + (tin << 16); }
-static void cake_wash_diffserv(struct sk_buff *skb) -{ - switch (skb->protocol) { - case htons(ETH_P_IP): - ipv4_change_dsfield(ip_hdr(skb), INET_ECN_MASK, 0); - break; - case htons(ETH_P_IPV6): - ipv6_change_dsfield(ipv6_hdr(skb), INET_ECN_MASK, 0); - break; - default: - break; - } -} - static u8 cake_handle_diffserv(struct sk_buff *skb, u16 wash) { int wlen = skb_network_offset(skb); @@ -1564,25 +1550,27 @@ static struct cake_tin_data *cake_select { struct cake_sched_data *q = qdisc_priv(sch); u32 tin; + u8 dscp; + + /* Tin selection: Default to diffserv-based selection, allow overriding + * using firewall marks or skb->priority. + */ + dscp = cake_handle_diffserv(skb, + q->rate_flags & CAKE_FLAG_WASH); + + if (q->tin_mode == CAKE_DIFFSERV_BESTEFFORT) + tin = 0;
- if (TC_H_MAJ(skb->priority) == sch->handle && - TC_H_MIN(skb->priority) > 0 && - TC_H_MIN(skb->priority) <= q->tin_cnt) { + else if (TC_H_MAJ(skb->priority) == sch->handle && + TC_H_MIN(skb->priority) > 0 && + TC_H_MIN(skb->priority) <= q->tin_cnt) tin = q->tin_order[TC_H_MIN(skb->priority) - 1];
- if (q->rate_flags & CAKE_FLAG_WASH) - cake_wash_diffserv(skb); - } else if (q->tin_mode != CAKE_DIFFSERV_BESTEFFORT) { - /* extract the Diffserv Precedence field, if it exists */ - /* and clear DSCP bits if washing */ - tin = q->tin_index[cake_handle_diffserv(skb, - q->rate_flags & CAKE_FLAG_WASH)]; + else { + tin = q->tin_index[dscp]; + if (unlikely(tin >= q->tin_cnt)) tin = 0; - } else { - tin = 0; - if (q->rate_flags & CAKE_FLAG_WASH) - cake_wash_diffserv(skb); }
return &q->tins[tin];
[ Upstream commit c23f35d19db3b36ffb9e04b08f1d91565d15f84f ]
This is a refactoring patch: without changing runtime behavior, it moves rbtree-related code from IPv4-specific files/functions into .h/.c defrag files shared with IPv6 defragmentation code.
v2: make handling of overlapping packets match upstream.
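For orientation, a protocol-specific fragment handler is expected to drive the new shared helpers roughly as follows (a simplified sketch based on the ip_fragment.c conversion below; locking and most error handling elided):

        /* queueing path: insert the fragment into the shared rb-tree */
        prev_tail = q->fragments_tail;
        err = inet_frag_queue_insert(q, skb, offset, end);
        if (err)                /* IPFRAG_DUP or IPFRAG_OVERLAP */
                goto insert_error;

        /* reassembly path: once all fragments have arrived */
        reasm_data = inet_frag_reasm_prepare(q, skb, prev_tail);
        if (!reasm_data)
                goto out_nomem;
        /* ... protocol-specific header fixups ... */
        inet_frag_reasm_finish(q, skb, reasm_data);

inet_frag_pull_head() is the shared replacement for the open-coded "pull the head skb out of the rb-tree" logic used on the expire paths.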
Signed-off-by: Peter Oskolkov posk@google.com Cc: Eric Dumazet edumazet@google.com Cc: Florian Westphal fw@strlen.de Cc: Tom Herbert tom@herbertland.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/inet_frag.h | 16 ++- net/ipv4/inet_fragment.c | 293 +++++++++++++++++++++++++++++++++++++ net/ipv4/ip_fragment.c | 302 +++++---------------------------------- 3 files changed, 342 insertions(+), 269 deletions(-)
diff --git a/include/net/inet_frag.h b/include/net/inet_frag.h index 1662cbc0b46b..b02bf737d019 100644 --- a/include/net/inet_frag.h +++ b/include/net/inet_frag.h @@ -77,8 +77,8 @@ struct inet_frag_queue { struct timer_list timer; spinlock_t lock; refcount_t refcnt; - struct sk_buff *fragments; /* Used in IPv6. */ - struct rb_root rb_fragments; /* Used in IPv4. */ + struct sk_buff *fragments; /* used in 6lopwpan IPv6. */ + struct rb_root rb_fragments; /* Used in IPv4/IPv6. */ struct sk_buff *fragments_tail; struct sk_buff *last_run_head; ktime_t stamp; @@ -153,4 +153,16 @@ static inline void add_frag_mem_limit(struct netns_frags *nf, long val)
extern const u8 ip_frag_ecn_table[16];
+/* Return values of inet_frag_queue_insert() */ +#define IPFRAG_OK 0 +#define IPFRAG_DUP 1 +#define IPFRAG_OVERLAP 2 +int inet_frag_queue_insert(struct inet_frag_queue *q, struct sk_buff *skb, + int offset, int end); +void *inet_frag_reasm_prepare(struct inet_frag_queue *q, struct sk_buff *skb, + struct sk_buff *parent); +void inet_frag_reasm_finish(struct inet_frag_queue *q, struct sk_buff *head, + void *reasm_data); +struct sk_buff *inet_frag_pull_head(struct inet_frag_queue *q); + #endif diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c index 760a9e52e02b..9f69411251d0 100644 --- a/net/ipv4/inet_fragment.c +++ b/net/ipv4/inet_fragment.c @@ -25,6 +25,62 @@ #include <net/sock.h> #include <net/inet_frag.h> #include <net/inet_ecn.h> +#include <net/ip.h> +#include <net/ipv6.h> + +/* Use skb->cb to track consecutive/adjacent fragments coming at + * the end of the queue. Nodes in the rb-tree queue will + * contain "runs" of one or more adjacent fragments. + * + * Invariants: + * - next_frag is NULL at the tail of a "run"; + * - the head of a "run" has the sum of all fragment lengths in frag_run_len. + */ +struct ipfrag_skb_cb { + union { + struct inet_skb_parm h4; + struct inet6_skb_parm h6; + }; + struct sk_buff *next_frag; + int frag_run_len; +}; + +#define FRAG_CB(skb) ((struct ipfrag_skb_cb *)((skb)->cb)) + +static void fragcb_clear(struct sk_buff *skb) +{ + RB_CLEAR_NODE(&skb->rbnode); + FRAG_CB(skb)->next_frag = NULL; + FRAG_CB(skb)->frag_run_len = skb->len; +} + +/* Append skb to the last "run". */ +static void fragrun_append_to_last(struct inet_frag_queue *q, + struct sk_buff *skb) +{ + fragcb_clear(skb); + + FRAG_CB(q->last_run_head)->frag_run_len += skb->len; + FRAG_CB(q->fragments_tail)->next_frag = skb; + q->fragments_tail = skb; +} + +/* Create a new "run" with the skb. */ +static void fragrun_create(struct inet_frag_queue *q, struct sk_buff *skb) +{ + BUILD_BUG_ON(sizeof(struct ipfrag_skb_cb) > sizeof(skb->cb)); + fragcb_clear(skb); + + if (q->last_run_head) + rb_link_node(&skb->rbnode, &q->last_run_head->rbnode, + &q->last_run_head->rbnode.rb_right); + else + rb_link_node(&skb->rbnode, NULL, &q->rb_fragments.rb_node); + rb_insert_color(&skb->rbnode, &q->rb_fragments); + + q->fragments_tail = skb; + q->last_run_head = skb; +}
/* Given the OR values of all fragments, apply RFC 3168 5.3 requirements * Value : 0xff if frame should be dropped. @@ -123,6 +179,28 @@ static void inet_frag_destroy_rcu(struct rcu_head *head) kmem_cache_free(f->frags_cachep, q); }
+unsigned int inet_frag_rbtree_purge(struct rb_root *root) +{ + struct rb_node *p = rb_first(root); + unsigned int sum = 0; + + while (p) { + struct sk_buff *skb = rb_entry(p, struct sk_buff, rbnode); + + p = rb_next(p); + rb_erase(&skb->rbnode, root); + while (skb) { + struct sk_buff *next = FRAG_CB(skb)->next_frag; + + sum += skb->truesize; + kfree_skb(skb); + skb = next; + } + } + return sum; +} +EXPORT_SYMBOL(inet_frag_rbtree_purge); + void inet_frag_destroy(struct inet_frag_queue *q) { struct sk_buff *fp; @@ -224,3 +302,218 @@ struct inet_frag_queue *inet_frag_find(struct netns_frags *nf, void *key) return fq; } EXPORT_SYMBOL(inet_frag_find); + +int inet_frag_queue_insert(struct inet_frag_queue *q, struct sk_buff *skb, + int offset, int end) +{ + struct sk_buff *last = q->fragments_tail; + + /* RFC5722, Section 4, amended by Errata ID : 3089 + * When reassembling an IPv6 datagram, if + * one or more its constituent fragments is determined to be an + * overlapping fragment, the entire datagram (and any constituent + * fragments) MUST be silently discarded. + * + * Duplicates, however, should be ignored (i.e. skb dropped, but the + * queue/fragments kept for later reassembly). + */ + if (!last) + fragrun_create(q, skb); /* First fragment. */ + else if (last->ip_defrag_offset + last->len < end) { + /* This is the common case: skb goes to the end. */ + /* Detect and discard overlaps. */ + if (offset < last->ip_defrag_offset + last->len) + return IPFRAG_OVERLAP; + if (offset == last->ip_defrag_offset + last->len) + fragrun_append_to_last(q, skb); + else + fragrun_create(q, skb); + } else { + /* Binary search. Note that skb can become the first fragment, + * but not the last (covered above). + */ + struct rb_node **rbn, *parent; + + rbn = &q->rb_fragments.rb_node; + do { + struct sk_buff *curr; + int curr_run_end; + + parent = *rbn; + curr = rb_to_skb(parent); + curr_run_end = curr->ip_defrag_offset + + FRAG_CB(curr)->frag_run_len; + if (end <= curr->ip_defrag_offset) + rbn = &parent->rb_left; + else if (offset >= curr_run_end) + rbn = &parent->rb_right; + else if (offset >= curr->ip_defrag_offset && + end <= curr_run_end) + return IPFRAG_DUP; + else + return IPFRAG_OVERLAP; + } while (*rbn); + /* Here we have parent properly set, and rbn pointing to + * one of its NULL left/right children. Insert skb. + */ + fragcb_clear(skb); + rb_link_node(&skb->rbnode, parent, rbn); + rb_insert_color(&skb->rbnode, &q->rb_fragments); + } + + skb->ip_defrag_offset = offset; + + return IPFRAG_OK; +} +EXPORT_SYMBOL(inet_frag_queue_insert); + +void *inet_frag_reasm_prepare(struct inet_frag_queue *q, struct sk_buff *skb, + struct sk_buff *parent) +{ + struct sk_buff *fp, *head = skb_rb_first(&q->rb_fragments); + struct sk_buff **nextp; + int delta; + + if (head != skb) { + fp = skb_clone(skb, GFP_ATOMIC); + if (!fp) + return NULL; + FRAG_CB(fp)->next_frag = FRAG_CB(skb)->next_frag; + if (RB_EMPTY_NODE(&skb->rbnode)) + FRAG_CB(parent)->next_frag = fp; + else + rb_replace_node(&skb->rbnode, &fp->rbnode, + &q->rb_fragments); + if (q->fragments_tail == skb) + q->fragments_tail = fp; + skb_morph(skb, head); + FRAG_CB(skb)->next_frag = FRAG_CB(head)->next_frag; + rb_replace_node(&head->rbnode, &skb->rbnode, + &q->rb_fragments); + consume_skb(head); + head = skb; + } + WARN_ON(head->ip_defrag_offset != 0); + + delta = -head->truesize; + + /* Head of list must not be cloned. 
*/ + if (skb_unclone(head, GFP_ATOMIC)) + return NULL; + + delta += head->truesize; + if (delta) + add_frag_mem_limit(q->net, delta); + + /* If the first fragment is fragmented itself, we split + * it to two chunks: the first with data and paged part + * and the second, holding only fragments. + */ + if (skb_has_frag_list(head)) { + struct sk_buff *clone; + int i, plen = 0; + + clone = alloc_skb(0, GFP_ATOMIC); + if (!clone) + return NULL; + skb_shinfo(clone)->frag_list = skb_shinfo(head)->frag_list; + skb_frag_list_init(head); + for (i = 0; i < skb_shinfo(head)->nr_frags; i++) + plen += skb_frag_size(&skb_shinfo(head)->frags[i]); + clone->data_len = head->data_len - plen; + clone->len = clone->data_len; + head->truesize += clone->truesize; + clone->csum = 0; + clone->ip_summed = head->ip_summed; + add_frag_mem_limit(q->net, clone->truesize); + skb_shinfo(head)->frag_list = clone; + nextp = &clone->next; + } else { + nextp = &skb_shinfo(head)->frag_list; + } + + return nextp; +} +EXPORT_SYMBOL(inet_frag_reasm_prepare); + +void inet_frag_reasm_finish(struct inet_frag_queue *q, struct sk_buff *head, + void *reasm_data) +{ + struct sk_buff **nextp = (struct sk_buff **)reasm_data; + struct rb_node *rbn; + struct sk_buff *fp; + + skb_push(head, head->data - skb_network_header(head)); + + /* Traverse the tree in order, to build frag_list. */ + fp = FRAG_CB(head)->next_frag; + rbn = rb_next(&head->rbnode); + rb_erase(&head->rbnode, &q->rb_fragments); + while (rbn || fp) { + /* fp points to the next sk_buff in the current run; + * rbn points to the next run. + */ + /* Go through the current run. */ + while (fp) { + *nextp = fp; + nextp = &fp->next; + fp->prev = NULL; + memset(&fp->rbnode, 0, sizeof(fp->rbnode)); + fp->sk = NULL; + head->data_len += fp->len; + head->len += fp->len; + if (head->ip_summed != fp->ip_summed) + head->ip_summed = CHECKSUM_NONE; + else if (head->ip_summed == CHECKSUM_COMPLETE) + head->csum = csum_add(head->csum, fp->csum); + head->truesize += fp->truesize; + fp = FRAG_CB(fp)->next_frag; + } + /* Move to the next run. */ + if (rbn) { + struct rb_node *rbnext = rb_next(rbn); + + fp = rb_to_skb(rbn); + rb_erase(rbn, &q->rb_fragments); + rbn = rbnext; + } + } + sub_frag_mem_limit(q->net, head->truesize); + + *nextp = NULL; + skb_mark_not_on_list(head); + head->prev = NULL; + head->tstamp = q->stamp; +} +EXPORT_SYMBOL(inet_frag_reasm_finish); + +struct sk_buff *inet_frag_pull_head(struct inet_frag_queue *q) +{ + struct sk_buff *head; + + if (q->fragments) { + head = q->fragments; + q->fragments = head->next; + } else { + struct sk_buff *skb; + + head = skb_rb_first(&q->rb_fragments); + if (!head) + return NULL; + skb = FRAG_CB(head)->next_frag; + if (skb) + rb_replace_node(&head->rbnode, &skb->rbnode, + &q->rb_fragments); + else + rb_erase(&head->rbnode, &q->rb_fragments); + memset(&head->rbnode, 0, sizeof(head->rbnode)); + barrier(); + } + if (head == q->fragments_tail) + q->fragments_tail = NULL; + + sub_frag_mem_limit(q->net, head->truesize); + + return head; +} +EXPORT_SYMBOL(inet_frag_pull_head); diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c index d95b32af4a0e..5a1d39e32196 100644 --- a/net/ipv4/ip_fragment.c +++ b/net/ipv4/ip_fragment.c @@ -57,57 +57,6 @@ */ static const char ip_frag_cache_name[] = "ip4-frags";
-/* Use skb->cb to track consecutive/adjacent fragments coming at - * the end of the queue. Nodes in the rb-tree queue will - * contain "runs" of one or more adjacent fragments. - * - * Invariants: - * - next_frag is NULL at the tail of a "run"; - * - the head of a "run" has the sum of all fragment lengths in frag_run_len. - */ -struct ipfrag_skb_cb { - struct inet_skb_parm h; - struct sk_buff *next_frag; - int frag_run_len; -}; - -#define FRAG_CB(skb) ((struct ipfrag_skb_cb *)((skb)->cb)) - -static void ip4_frag_init_run(struct sk_buff *skb) -{ - BUILD_BUG_ON(sizeof(struct ipfrag_skb_cb) > sizeof(skb->cb)); - - FRAG_CB(skb)->next_frag = NULL; - FRAG_CB(skb)->frag_run_len = skb->len; -} - -/* Append skb to the last "run". */ -static void ip4_frag_append_to_last_run(struct inet_frag_queue *q, - struct sk_buff *skb) -{ - RB_CLEAR_NODE(&skb->rbnode); - FRAG_CB(skb)->next_frag = NULL; - - FRAG_CB(q->last_run_head)->frag_run_len += skb->len; - FRAG_CB(q->fragments_tail)->next_frag = skb; - q->fragments_tail = skb; -} - -/* Create a new "run" with the skb. */ -static void ip4_frag_create_run(struct inet_frag_queue *q, struct sk_buff *skb) -{ - if (q->last_run_head) - rb_link_node(&skb->rbnode, &q->last_run_head->rbnode, - &q->last_run_head->rbnode.rb_right); - else - rb_link_node(&skb->rbnode, NULL, &q->rb_fragments.rb_node); - rb_insert_color(&skb->rbnode, &q->rb_fragments); - - ip4_frag_init_run(skb); - q->fragments_tail = skb; - q->last_run_head = skb; -} - /* Describe an entry in the "incomplete datagrams" queue. */ struct ipq { struct inet_frag_queue q; @@ -212,27 +161,9 @@ static void ip_expire(struct timer_list *t) * pull the head out of the tree in order to be able to * deal with head->dev. */ - if (qp->q.fragments) { - head = qp->q.fragments; - qp->q.fragments = head->next; - } else { - head = skb_rb_first(&qp->q.rb_fragments); - if (!head) - goto out; - if (FRAG_CB(head)->next_frag) - rb_replace_node(&head->rbnode, - &FRAG_CB(head)->next_frag->rbnode, - &qp->q.rb_fragments); - else - rb_erase(&head->rbnode, &qp->q.rb_fragments); - memset(&head->rbnode, 0, sizeof(head->rbnode)); - barrier(); - } - if (head == qp->q.fragments_tail) - qp->q.fragments_tail = NULL; - - sub_frag_mem_limit(qp->q.net, head->truesize); - + head = inet_frag_pull_head(&qp->q); + if (!head) + goto out; head->dev = dev_get_by_index_rcu(net, qp->iif); if (!head->dev) goto out; @@ -345,12 +276,10 @@ static int ip_frag_reinit(struct ipq *qp) static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb) { struct net *net = container_of(qp->q.net, struct net, ipv4.frags); - struct rb_node **rbn, *parent; - struct sk_buff *skb1, *prev_tail; - int ihl, end, skb1_run_end; + int ihl, end, flags, offset; + struct sk_buff *prev_tail; struct net_device *dev; unsigned int fragsize; - int flags, offset; int err = -ENOENT; u8 ecn;
@@ -382,7 +311,7 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb) */ if (end < qp->q.len || ((qp->q.flags & INET_FRAG_LAST_IN) && end != qp->q.len)) - goto err; + goto discard_qp; qp->q.flags |= INET_FRAG_LAST_IN; qp->q.len = end; } else { @@ -394,82 +323,33 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb) if (end > qp->q.len) { /* Some bits beyond end -> corruption. */ if (qp->q.flags & INET_FRAG_LAST_IN) - goto err; + goto discard_qp; qp->q.len = end; } } if (end == offset) - goto err; + goto discard_qp;
err = -ENOMEM; if (!pskb_pull(skb, skb_network_offset(skb) + ihl)) - goto err; + goto discard_qp;
err = pskb_trim_rcsum(skb, end - offset); if (err) - goto err; + goto discard_qp;
/* Note : skb->rbnode and skb->dev share the same location. */ dev = skb->dev; /* Makes sure compiler wont do silly aliasing games */ barrier();
- /* RFC5722, Section 4, amended by Errata ID : 3089 - * When reassembling an IPv6 datagram, if - * one or more its constituent fragments is determined to be an - * overlapping fragment, the entire datagram (and any constituent - * fragments) MUST be silently discarded. - * - * We do the same here for IPv4 (and increment an snmp counter) but - * we do not want to drop the whole queue in response to a duplicate - * fragment. - */ - - err = -EINVAL; - /* Find out where to put this fragment. */ prev_tail = qp->q.fragments_tail; - if (!prev_tail) - ip4_frag_create_run(&qp->q, skb); /* First fragment. */ - else if (prev_tail->ip_defrag_offset + prev_tail->len < end) { - /* This is the common case: skb goes to the end. */ - /* Detect and discard overlaps. */ - if (offset < prev_tail->ip_defrag_offset + prev_tail->len) - goto discard_qp; - if (offset == prev_tail->ip_defrag_offset + prev_tail->len) - ip4_frag_append_to_last_run(&qp->q, skb); - else - ip4_frag_create_run(&qp->q, skb); - } else { - /* Binary search. Note that skb can become the first fragment, - * but not the last (covered above). - */ - rbn = &qp->q.rb_fragments.rb_node; - do { - parent = *rbn; - skb1 = rb_to_skb(parent); - skb1_run_end = skb1->ip_defrag_offset + - FRAG_CB(skb1)->frag_run_len; - if (end <= skb1->ip_defrag_offset) - rbn = &parent->rb_left; - else if (offset >= skb1_run_end) - rbn = &parent->rb_right; - else if (offset >= skb1->ip_defrag_offset && - end <= skb1_run_end) - goto err; /* No new data, potential duplicate */ - else - goto discard_qp; /* Found an overlap */ - } while (*rbn); - /* Here we have parent properly set, and rbn pointing to - * one of its NULL left/right children. Insert skb. - */ - ip4_frag_init_run(skb); - rb_link_node(&skb->rbnode, parent, rbn); - rb_insert_color(&skb->rbnode, &qp->q.rb_fragments); - } + err = inet_frag_queue_insert(&qp->q, skb, offset, end); + if (err) + goto insert_error;
if (dev) qp->iif = dev->ifindex; - skb->ip_defrag_offset = offset;
qp->q.stamp = skb->tstamp; qp->q.meat += skb->len; @@ -494,15 +374,24 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb) skb->_skb_refdst = 0UL; err = ip_frag_reasm(qp, skb, prev_tail, dev); skb->_skb_refdst = orefdst; + if (err) + inet_frag_kill(&qp->q); return err; }
skb_dst_drop(skb); return -EINPROGRESS;
+insert_error: + if (err == IPFRAG_DUP) { + kfree_skb(skb); + return -EINVAL; + } + err = -EINVAL; + __IP_INC_STATS(net, IPSTATS_MIB_REASM_OVERLAPS); discard_qp: inet_frag_kill(&qp->q); - __IP_INC_STATS(net, IPSTATS_MIB_REASM_OVERLAPS); + __IP_INC_STATS(net, IPSTATS_MIB_REASMFAILS); err: kfree_skb(skb); return err; @@ -514,13 +403,8 @@ static int ip_frag_reasm(struct ipq *qp, struct sk_buff *skb, { struct net *net = container_of(qp->q.net, struct net, ipv4.frags); struct iphdr *iph; - struct sk_buff *fp, *head = skb_rb_first(&qp->q.rb_fragments); - struct sk_buff **nextp; /* To build frag_list. */ - struct rb_node *rbn; - int len; - int ihlen; - int delta; - int err; + void *reasm_data; + int len, err; u8 ecn;
ipq_kill(qp); @@ -530,117 +414,23 @@ static int ip_frag_reasm(struct ipq *qp, struct sk_buff *skb, err = -EINVAL; goto out_fail; } - /* Make the one we just received the head. */ - if (head != skb) { - fp = skb_clone(skb, GFP_ATOMIC); - if (!fp) - goto out_nomem; - FRAG_CB(fp)->next_frag = FRAG_CB(skb)->next_frag; - if (RB_EMPTY_NODE(&skb->rbnode)) - FRAG_CB(prev_tail)->next_frag = fp; - else - rb_replace_node(&skb->rbnode, &fp->rbnode, - &qp->q.rb_fragments); - if (qp->q.fragments_tail == skb) - qp->q.fragments_tail = fp; - skb_morph(skb, head); - FRAG_CB(skb)->next_frag = FRAG_CB(head)->next_frag; - rb_replace_node(&head->rbnode, &skb->rbnode, - &qp->q.rb_fragments); - consume_skb(head); - head = skb; - }
- WARN_ON(head->ip_defrag_offset != 0); - - /* Allocate a new buffer for the datagram. */ - ihlen = ip_hdrlen(head); - len = ihlen + qp->q.len; + /* Make the one we just received the head. */ + reasm_data = inet_frag_reasm_prepare(&qp->q, skb, prev_tail); + if (!reasm_data) + goto out_nomem;
+ len = ip_hdrlen(skb) + qp->q.len; err = -E2BIG; if (len > 65535) goto out_oversize;
- delta = - head->truesize; - - /* Head of list must not be cloned. */ - if (skb_unclone(head, GFP_ATOMIC)) - goto out_nomem; - - delta += head->truesize; - if (delta) - add_frag_mem_limit(qp->q.net, delta); - - /* If the first fragment is fragmented itself, we split - * it to two chunks: the first with data and paged part - * and the second, holding only fragments. */ - if (skb_has_frag_list(head)) { - struct sk_buff *clone; - int i, plen = 0; - - clone = alloc_skb(0, GFP_ATOMIC); - if (!clone) - goto out_nomem; - skb_shinfo(clone)->frag_list = skb_shinfo(head)->frag_list; - skb_frag_list_init(head); - for (i = 0; i < skb_shinfo(head)->nr_frags; i++) - plen += skb_frag_size(&skb_shinfo(head)->frags[i]); - clone->len = clone->data_len = head->data_len - plen; - head->truesize += clone->truesize; - clone->csum = 0; - clone->ip_summed = head->ip_summed; - add_frag_mem_limit(qp->q.net, clone->truesize); - skb_shinfo(head)->frag_list = clone; - nextp = &clone->next; - } else { - nextp = &skb_shinfo(head)->frag_list; - } + inet_frag_reasm_finish(&qp->q, skb, reasm_data);
- skb_push(head, head->data - skb_network_header(head)); + skb->dev = dev; + IPCB(skb)->frag_max_size = max(qp->max_df_size, qp->q.max_size);
- /* Traverse the tree in order, to build frag_list. */ - fp = FRAG_CB(head)->next_frag; - rbn = rb_next(&head->rbnode); - rb_erase(&head->rbnode, &qp->q.rb_fragments); - while (rbn || fp) { - /* fp points to the next sk_buff in the current run; - * rbn points to the next run. - */ - /* Go through the current run. */ - while (fp) { - *nextp = fp; - nextp = &fp->next; - fp->prev = NULL; - memset(&fp->rbnode, 0, sizeof(fp->rbnode)); - fp->sk = NULL; - head->data_len += fp->len; - head->len += fp->len; - if (head->ip_summed != fp->ip_summed) - head->ip_summed = CHECKSUM_NONE; - else if (head->ip_summed == CHECKSUM_COMPLETE) - head->csum = csum_add(head->csum, fp->csum); - head->truesize += fp->truesize; - fp = FRAG_CB(fp)->next_frag; - } - /* Move to the next run. */ - if (rbn) { - struct rb_node *rbnext = rb_next(rbn); - - fp = rb_to_skb(rbn); - rb_erase(rbn, &qp->q.rb_fragments); - rbn = rbnext; - } - } - sub_frag_mem_limit(qp->q.net, head->truesize); - - *nextp = NULL; - head->next = NULL; - head->prev = NULL; - head->dev = dev; - head->tstamp = qp->q.stamp; - IPCB(head)->frag_max_size = max(qp->max_df_size, qp->q.max_size); - - iph = ip_hdr(head); + iph = ip_hdr(skb); iph->tot_len = htons(len); iph->tos |= ecn;
@@ -653,7 +443,7 @@ static int ip_frag_reasm(struct ipq *qp, struct sk_buff *skb, * from one very small df-fragment and one large non-df frag. */ if (qp->max_df_size == qp->q.max_size) { - IPCB(head)->flags |= IPSKB_FRAG_PMTU; + IPCB(skb)->flags |= IPSKB_FRAG_PMTU; iph->frag_off = htons(IP_DF); } else { iph->frag_off = 0; @@ -751,28 +541,6 @@ struct sk_buff *ip_check_defrag(struct net *net, struct sk_buff *skb, u32 user) } EXPORT_SYMBOL(ip_check_defrag);
-unsigned int inet_frag_rbtree_purge(struct rb_root *root) -{ - struct rb_node *p = rb_first(root); - unsigned int sum = 0; - - while (p) { - struct sk_buff *skb = rb_entry(p, struct sk_buff, rbnode); - - p = rb_next(p); - rb_erase(&skb->rbnode, root); - while (skb) { - struct sk_buff *next = FRAG_CB(skb)->next_frag; - - sum += skb->truesize; - kfree_skb(skb); - skb = next; - } - } - return sum; -} -EXPORT_SYMBOL(inet_frag_rbtree_purge); - #ifdef CONFIG_SYSCTL static int dist_min;
[ Upstream commit d4289fcc9b16b89619ee1c54f829e05e56de8b9a ]
Currently, IPv6 defragmentation code drops non-last fragments that are smaller than 1280 bytes: see commit 0ed4229b08c1 ("ipv6: defrag: drop non-last frags smaller than min mtu")
This behavior is not specified in the IPv6 RFCs and appears to break compatibility with some IPv6 implementations, as reported here: https://www.spinics.net/lists/netdev/msg543846.html
This patch re-uses common IP defragmentation queueing and reassembly code in IPv6, removing the 1280 byte restriction.
v2: change handling of overlaps to match that of upstream.
Signed-off-by: Peter Oskolkov posk@google.com Reported-by: Tom Herbert tom@herbertland.com Cc: Eric Dumazet edumazet@google.com Cc: Florian Westphal fw@strlen.de Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/ipv6_frag.h | 11 +- net/ipv6/reassembly.c | 240 +++++++++++----------------------------- 2 files changed, 75 insertions(+), 176 deletions(-)
diff --git a/include/net/ipv6_frag.h b/include/net/ipv6_frag.h index 6ced1e6899b6..28aa9b30aece 100644 --- a/include/net/ipv6_frag.h +++ b/include/net/ipv6_frag.h @@ -82,8 +82,15 @@ ip6frag_expire_frag_queue(struct net *net, struct frag_queue *fq) __IP6_INC_STATS(net, __in6_dev_get(dev), IPSTATS_MIB_REASMTIMEOUT);
/* Don't send error if the first segment did not arrive. */ - head = fq->q.fragments; - if (!(fq->q.flags & INET_FRAG_FIRST_IN) || !head) + if (!(fq->q.flags & INET_FRAG_FIRST_IN)) + goto out; + + /* sk_buff::dev and sk_buff::rbnode are unionized. So we + * pull the head out of the tree in order to be able to + * deal with head->dev. + */ + head = inet_frag_pull_head(&fq->q); + if (!head) goto out;
head->dev = dev; diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c index 7c943392c128..095825f964e2 100644 --- a/net/ipv6/reassembly.c +++ b/net/ipv6/reassembly.c @@ -69,8 +69,8 @@ static u8 ip6_frag_ecn(const struct ipv6hdr *ipv6h)
static struct inet_frags ip6_frags;
-static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *prev, - struct net_device *dev); +static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *skb, + struct sk_buff *prev_tail, struct net_device *dev);
static void ip6_frag_expire(struct timer_list *t) { @@ -111,21 +111,26 @@ static int ip6_frag_queue(struct frag_queue *fq, struct sk_buff *skb, struct frag_hdr *fhdr, int nhoff, u32 *prob_offset) { - struct sk_buff *prev, *next; - struct net_device *dev; - int offset, end, fragsize; struct net *net = dev_net(skb_dst(skb)->dev); + int offset, end, fragsize; + struct sk_buff *prev_tail; + struct net_device *dev; + int err = -ENOENT; u8 ecn;
if (fq->q.flags & INET_FRAG_COMPLETE) goto err;
+ err = -EINVAL; offset = ntohs(fhdr->frag_off) & ~0x7; end = offset + (ntohs(ipv6_hdr(skb)->payload_len) - ((u8 *)(fhdr + 1) - (u8 *)(ipv6_hdr(skb) + 1)));
if ((unsigned int)end > IPV6_MAXPLEN) { *prob_offset = (u8 *)&fhdr->frag_off - skb_network_header(skb); + /* note that if prob_offset is set, the skb is freed elsewhere, + * we do not free it here. + */ return -1; }
@@ -145,7 +150,7 @@ static int ip6_frag_queue(struct frag_queue *fq, struct sk_buff *skb, */ if (end < fq->q.len || ((fq->q.flags & INET_FRAG_LAST_IN) && end != fq->q.len)) - goto err; + goto discard_fq; fq->q.flags |= INET_FRAG_LAST_IN; fq->q.len = end; } else { @@ -162,70 +167,35 @@ static int ip6_frag_queue(struct frag_queue *fq, struct sk_buff *skb, if (end > fq->q.len) { /* Some bits beyond end -> corruption. */ if (fq->q.flags & INET_FRAG_LAST_IN) - goto err; + goto discard_fq; fq->q.len = end; } }
if (end == offset) - goto err; + goto discard_fq;
+ err = -ENOMEM; /* Point into the IP datagram 'data' part. */ if (!pskb_pull(skb, (u8 *) (fhdr + 1) - skb->data)) - goto err; - - if (pskb_trim_rcsum(skb, end - offset)) - goto err; - - /* Find out which fragments are in front and at the back of us - * in the chain of fragments so far. We must know where to put - * this fragment, right? - */ - prev = fq->q.fragments_tail; - if (!prev || prev->ip_defrag_offset < offset) { - next = NULL; - goto found; - } - prev = NULL; - for (next = fq->q.fragments; next != NULL; next = next->next) { - if (next->ip_defrag_offset >= offset) - break; /* bingo! */ - prev = next; - } - -found: - /* RFC5722, Section 4, amended by Errata ID : 3089 - * When reassembling an IPv6 datagram, if - * one or more its constituent fragments is determined to be an - * overlapping fragment, the entire datagram (and any constituent - * fragments) MUST be silently discarded. - */ - - /* Check for overlap with preceding fragment. */ - if (prev && - (prev->ip_defrag_offset + prev->len) > offset) goto discard_fq;
- /* Look for overlap with succeeding segment. */ - if (next && next->ip_defrag_offset < end) + err = pskb_trim_rcsum(skb, end - offset); + if (err) goto discard_fq;
- /* Note : skb->ip_defrag_offset and skb->dev share the same location */ + /* Note : skb->rbnode and skb->dev share the same location. */ dev = skb->dev; - if (dev) - fq->iif = dev->ifindex; /* Makes sure compiler wont do silly aliasing games */ barrier(); - skb->ip_defrag_offset = offset;
- /* Insert this fragment in the chain of fragments. */ - skb->next = next; - if (!next) - fq->q.fragments_tail = skb; - if (prev) - prev->next = skb; - else - fq->q.fragments = skb; + prev_tail = fq->q.fragments_tail; + err = inet_frag_queue_insert(&fq->q, skb, offset, end); + if (err) + goto insert_error; + + if (dev) + fq->iif = dev->ifindex;
fq->q.stamp = skb->tstamp; fq->q.meat += skb->len; @@ -246,44 +216,48 @@ static int ip6_frag_queue(struct frag_queue *fq, struct sk_buff *skb,
if (fq->q.flags == (INET_FRAG_FIRST_IN | INET_FRAG_LAST_IN) && fq->q.meat == fq->q.len) { - int res; unsigned long orefdst = skb->_skb_refdst;
skb->_skb_refdst = 0UL; - res = ip6_frag_reasm(fq, prev, dev); + err = ip6_frag_reasm(fq, skb, prev_tail, dev); skb->_skb_refdst = orefdst; - return res; + return err; }
skb_dst_drop(skb); - return -1; + return -EINPROGRESS;
+insert_error: + if (err == IPFRAG_DUP) { + kfree_skb(skb); + return -EINVAL; + } + err = -EINVAL; + __IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)), + IPSTATS_MIB_REASM_OVERLAPS); discard_fq: inet_frag_kill(&fq->q); -err: __IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)), IPSTATS_MIB_REASMFAILS); +err: kfree_skb(skb); - return -1; + return err; }
/* * Check if this packet is complete. - * Returns NULL on failure by any reason, and pointer - * to current nexthdr field in reassembled frame. * * It is called with locked fq, and caller must check that * queue is eligible for reassembly i.e. it is not COMPLETE, * the last and the first frames arrived and all the bits are here. */ -static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *prev, - struct net_device *dev) +static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *skb, + struct sk_buff *prev_tail, struct net_device *dev) { struct net *net = container_of(fq->q.net, struct net, ipv6.frags); - struct sk_buff *fp, *head = fq->q.fragments; - int payload_len, delta; unsigned int nhoff; - int sum_truesize; + void *reasm_data; + int payload_len; u8 ecn;
inet_frag_kill(&fq->q); @@ -292,121 +266,40 @@ static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *prev, if (unlikely(ecn == 0xff)) goto out_fail;
- /* Make the one we just received the head. */ - if (prev) { - head = prev->next; - fp = skb_clone(head, GFP_ATOMIC); - - if (!fp) - goto out_oom; - - fp->next = head->next; - if (!fp->next) - fq->q.fragments_tail = fp; - prev->next = fp; - - skb_morph(head, fq->q.fragments); - head->next = fq->q.fragments->next; - - consume_skb(fq->q.fragments); - fq->q.fragments = head; - } - - WARN_ON(head == NULL); - WARN_ON(head->ip_defrag_offset != 0); + reasm_data = inet_frag_reasm_prepare(&fq->q, skb, prev_tail); + if (!reasm_data) + goto out_oom;
- /* Unfragmented part is taken from the first segment. */ - payload_len = ((head->data - skb_network_header(head)) - + payload_len = ((skb->data - skb_network_header(skb)) - sizeof(struct ipv6hdr) + fq->q.len - sizeof(struct frag_hdr)); if (payload_len > IPV6_MAXPLEN) goto out_oversize;
- delta = - head->truesize; - - /* Head of list must not be cloned. */ - if (skb_unclone(head, GFP_ATOMIC)) - goto out_oom; - - delta += head->truesize; - if (delta) - add_frag_mem_limit(fq->q.net, delta); - - /* If the first fragment is fragmented itself, we split - * it to two chunks: the first with data and paged part - * and the second, holding only fragments. */ - if (skb_has_frag_list(head)) { - struct sk_buff *clone; - int i, plen = 0; - - clone = alloc_skb(0, GFP_ATOMIC); - if (!clone) - goto out_oom; - clone->next = head->next; - head->next = clone; - skb_shinfo(clone)->frag_list = skb_shinfo(head)->frag_list; - skb_frag_list_init(head); - for (i = 0; i < skb_shinfo(head)->nr_frags; i++) - plen += skb_frag_size(&skb_shinfo(head)->frags[i]); - clone->len = clone->data_len = head->data_len - plen; - head->data_len -= clone->len; - head->len -= clone->len; - clone->csum = 0; - clone->ip_summed = head->ip_summed; - add_frag_mem_limit(fq->q.net, clone->truesize); - } - /* We have to remove fragment header from datagram and to relocate * header in order to calculate ICV correctly. */ nhoff = fq->nhoffset; - skb_network_header(head)[nhoff] = skb_transport_header(head)[0]; - memmove(head->head + sizeof(struct frag_hdr), head->head, - (head->data - head->head) - sizeof(struct frag_hdr)); - if (skb_mac_header_was_set(head)) - head->mac_header += sizeof(struct frag_hdr); - head->network_header += sizeof(struct frag_hdr); - - skb_reset_transport_header(head); - skb_push(head, head->data - skb_network_header(head)); - - sum_truesize = head->truesize; - for (fp = head->next; fp;) { - bool headstolen; - int delta; - struct sk_buff *next = fp->next; - - sum_truesize += fp->truesize; - if (head->ip_summed != fp->ip_summed) - head->ip_summed = CHECKSUM_NONE; - else if (head->ip_summed == CHECKSUM_COMPLETE) - head->csum = csum_add(head->csum, fp->csum); - - if (skb_try_coalesce(head, fp, &headstolen, &delta)) { - kfree_skb_partial(fp, headstolen); - } else { - fp->sk = NULL; - if (!skb_shinfo(head)->frag_list) - skb_shinfo(head)->frag_list = fp; - head->data_len += fp->len; - head->len += fp->len; - head->truesize += fp->truesize; - } - fp = next; - } - sub_frag_mem_limit(fq->q.net, sum_truesize); + skb_network_header(skb)[nhoff] = skb_transport_header(skb)[0]; + memmove(skb->head + sizeof(struct frag_hdr), skb->head, + (skb->data - skb->head) - sizeof(struct frag_hdr)); + if (skb_mac_header_was_set(skb)) + skb->mac_header += sizeof(struct frag_hdr); + skb->network_header += sizeof(struct frag_hdr); + + skb_reset_transport_header(skb); + + inet_frag_reasm_finish(&fq->q, skb, reasm_data);
- head->next = NULL; - head->dev = dev; - head->tstamp = fq->q.stamp; - ipv6_hdr(head)->payload_len = htons(payload_len); - ipv6_change_dsfield(ipv6_hdr(head), 0xff, ecn); - IP6CB(head)->nhoff = nhoff; - IP6CB(head)->flags |= IP6SKB_FRAGMENTED; - IP6CB(head)->frag_max_size = fq->q.max_size; + skb->dev = dev; + ipv6_hdr(skb)->payload_len = htons(payload_len); + ipv6_change_dsfield(ipv6_hdr(skb), 0xff, ecn); + IP6CB(skb)->nhoff = nhoff; + IP6CB(skb)->flags |= IP6SKB_FRAGMENTED; + IP6CB(skb)->frag_max_size = fq->q.max_size;
/* Yes, and fold redundant checksum back. 8) */ - skb_postpush_rcsum(head, skb_network_header(head), - skb_network_header_len(head)); + skb_postpush_rcsum(skb, skb_network_header(skb), + skb_network_header_len(skb));
rcu_read_lock(); __IP6_INC_STATS(net, __in6_dev_get(dev), IPSTATS_MIB_REASMOKS); @@ -414,6 +307,7 @@ static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *prev, fq->q.fragments = NULL; fq->q.rb_fragments = RB_ROOT; fq->q.fragments_tail = NULL; + fq->q.last_run_head = NULL; return 1;
out_oversize: @@ -425,6 +319,7 @@ static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *prev, rcu_read_lock(); __IP6_INC_STATS(net, __in6_dev_get(dev), IPSTATS_MIB_REASMFAILS); rcu_read_unlock(); + inet_frag_kill(&fq->q); return -1; }
@@ -463,10 +358,6 @@ static int ipv6_frag_rcv(struct sk_buff *skb) return 1; }
- if (skb->len - skb_network_offset(skb) < IPV6_MIN_MTU && - fhdr->frag_off & htons(IP6_MF)) - goto fail_hdr; - iif = skb->dev ? skb->dev->ifindex : 0; fq = fq_find(net, fhdr->identification, hdr, iif); if (fq) { @@ -484,6 +375,7 @@ static int ipv6_frag_rcv(struct sk_buff *skb) if (prob_offset) { __IP6_INC_STATS(net, __in6_dev_get_safely(skb->dev), IPSTATS_MIB_INHDRERRORS); + /* icmpv6_param_prob() calls kfree_skb(skb) */ icmpv6_param_prob(skb, ICMPV6_HDR_FIELD, prob_offset); } return ret;
[ Upstream commit 997dd96471641e147cb2c33ad54284000d0f5e35 ]
Currently, IPv6 defragmentation code drops non-last fragments that are smaller than 1280 bytes: see commit 0ed4229b08c1 ("ipv6: defrag: drop non-last frags smaller than min mtu")
This behavior is not specified in the IPv6 RFCs and appears to break compatibility with some IPv6 implementations, as reported here: https://www.spinics.net/lists/netdev/msg543846.html
This patch re-uses common IP defragmentation queueing and reassembly code in IPv6 defragmentation in nf_conntrack, removing the 1280 byte restriction.
Signed-off-by: Peter Oskolkov posk@google.com Reported-by: Tom Herbert tom@herbertland.com Cc: Eric Dumazet edumazet@google.com Cc: Florian Westphal fw@strlen.de Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv6/netfilter/nf_conntrack_reasm.c | 260 +++++++----------------- 1 file changed, 71 insertions(+), 189 deletions(-)
diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c b/net/ipv6/netfilter/nf_conntrack_reasm.c index 043ed8eb0ab9..cb1b4772dac0 100644 --- a/net/ipv6/netfilter/nf_conntrack_reasm.c +++ b/net/ipv6/netfilter/nf_conntrack_reasm.c @@ -136,6 +136,9 @@ static void __net_exit nf_ct_frags6_sysctl_unregister(struct net *net) } #endif
+static int nf_ct_frag6_reasm(struct frag_queue *fq, struct sk_buff *skb, + struct sk_buff *prev_tail, struct net_device *dev); + static inline u8 ip6_frag_ecn(const struct ipv6hdr *ipv6h) { return 1 << (ipv6_get_dsfield(ipv6h) & INET_ECN_MASK); @@ -177,9 +180,10 @@ static struct frag_queue *fq_find(struct net *net, __be32 id, u32 user, static int nf_ct_frag6_queue(struct frag_queue *fq, struct sk_buff *skb, const struct frag_hdr *fhdr, int nhoff) { - struct sk_buff *prev, *next; unsigned int payload_len; - int offset, end; + struct net_device *dev; + struct sk_buff *prev; + int offset, end, err; u8 ecn;
if (fq->q.flags & INET_FRAG_COMPLETE) { @@ -254,55 +258,18 @@ static int nf_ct_frag6_queue(struct frag_queue *fq, struct sk_buff *skb, goto err; }
- /* Find out which fragments are in front and at the back of us - * in the chain of fragments so far. We must know where to put - * this fragment, right? - */ - prev = fq->q.fragments_tail; - if (!prev || prev->ip_defrag_offset < offset) { - next = NULL; - goto found; - } - prev = NULL; - for (next = fq->q.fragments; next != NULL; next = next->next) { - if (next->ip_defrag_offset >= offset) - break; /* bingo! */ - prev = next; - } - -found: - /* RFC5722, Section 4: - * When reassembling an IPv6 datagram, if - * one or more its constituent fragments is determined to be an - * overlapping fragment, the entire datagram (and any constituent - * fragments, including those not yet received) MUST be silently - * discarded. - */ - - /* Check for overlap with preceding fragment. */ - if (prev && - (prev->ip_defrag_offset + prev->len) > offset) - goto discard_fq; - - /* Look for overlap with succeeding segment. */ - if (next && next->ip_defrag_offset < end) - goto discard_fq; - - /* Note : skb->ip_defrag_offset and skb->dev share the same location */ - if (skb->dev) - fq->iif = skb->dev->ifindex; + /* Note : skb->rbnode and skb->dev share the same location. */ + dev = skb->dev; /* Makes sure compiler wont do silly aliasing games */ barrier(); - skb->ip_defrag_offset = offset;
- /* Insert this fragment in the chain of fragments. */ - skb->next = next; - if (!next) - fq->q.fragments_tail = skb; - if (prev) - prev->next = skb; - else - fq->q.fragments = skb; + prev = fq->q.fragments_tail; + err = inet_frag_queue_insert(&fq->q, skb, offset, end); + if (err) + goto insert_error; + + if (dev) + fq->iif = dev->ifindex;
fq->q.stamp = skb->tstamp; fq->q.meat += skb->len; @@ -319,11 +286,25 @@ static int nf_ct_frag6_queue(struct frag_queue *fq, struct sk_buff *skb, fq->q.flags |= INET_FRAG_FIRST_IN; }
- return 0; + if (fq->q.flags == (INET_FRAG_FIRST_IN | INET_FRAG_LAST_IN) && + fq->q.meat == fq->q.len) { + unsigned long orefdst = skb->_skb_refdst; + + skb->_skb_refdst = 0UL; + err = nf_ct_frag6_reasm(fq, skb, prev, dev); + skb->_skb_refdst = orefdst; + return err; + } + + skb_dst_drop(skb); + return -EINPROGRESS;
-discard_fq: +insert_error: + if (err == IPFRAG_DUP) + goto err; inet_frag_kill(&fq->q); err: + skb_dst_drop(skb); return -EINVAL; }
@@ -333,147 +314,67 @@ static int nf_ct_frag6_queue(struct frag_queue *fq, struct sk_buff *skb, * It is called with locked fq, and caller must check that * queue is eligible for reassembly i.e. it is not COMPLETE, * the last and the first frames arrived and all the bits are here. - * - * returns true if *prev skb has been transformed into the reassembled - * skb, false otherwise. */ -static bool -nf_ct_frag6_reasm(struct frag_queue *fq, struct sk_buff *prev, struct net_device *dev) +static int nf_ct_frag6_reasm(struct frag_queue *fq, struct sk_buff *skb, + struct sk_buff *prev_tail, struct net_device *dev) { - struct sk_buff *fp, *head = fq->q.fragments; - int payload_len, delta; + void *reasm_data; + int payload_len; u8 ecn;
inet_frag_kill(&fq->q);
- WARN_ON(head == NULL); - WARN_ON(head->ip_defrag_offset != 0); - ecn = ip_frag_ecn_table[fq->ecn]; if (unlikely(ecn == 0xff)) - return false; + goto err; + + reasm_data = inet_frag_reasm_prepare(&fq->q, skb, prev_tail); + if (!reasm_data) + goto err;
- /* Unfragmented part is taken from the first segment. */ - payload_len = ((head->data - skb_network_header(head)) - + payload_len = ((skb->data - skb_network_header(skb)) - sizeof(struct ipv6hdr) + fq->q.len - sizeof(struct frag_hdr)); if (payload_len > IPV6_MAXPLEN) { net_dbg_ratelimited("nf_ct_frag6_reasm: payload len = %d\n", payload_len); - return false; - } - - delta = - head->truesize; - - /* Head of list must not be cloned. */ - if (skb_unclone(head, GFP_ATOMIC)) - return false; - - delta += head->truesize; - if (delta) - add_frag_mem_limit(fq->q.net, delta); - - /* If the first fragment is fragmented itself, we split - * it to two chunks: the first with data and paged part - * and the second, holding only fragments. */ - if (skb_has_frag_list(head)) { - struct sk_buff *clone; - int i, plen = 0; - - clone = alloc_skb(0, GFP_ATOMIC); - if (clone == NULL) - return false; - - clone->next = head->next; - head->next = clone; - skb_shinfo(clone)->frag_list = skb_shinfo(head)->frag_list; - skb_frag_list_init(head); - for (i = 0; i < skb_shinfo(head)->nr_frags; i++) - plen += skb_frag_size(&skb_shinfo(head)->frags[i]); - clone->len = clone->data_len = head->data_len - plen; - head->data_len -= clone->len; - head->len -= clone->len; - clone->csum = 0; - clone->ip_summed = head->ip_summed; - - add_frag_mem_limit(fq->q.net, clone->truesize); - } - - /* morph head into last received skb: prev. - * - * This allows callers of ipv6 conntrack defrag to continue - * to use the last skb(frag) passed into the reasm engine. - * The last skb frag 'silently' turns into the full reassembled skb. - * - * Since prev is also part of q->fragments we have to clone it first. - */ - if (head != prev) { - struct sk_buff *iter; - - fp = skb_clone(prev, GFP_ATOMIC); - if (!fp) - return false; - - fp->next = prev->next; - - iter = head; - while (iter) { - if (iter->next == prev) { - iter->next = fp; - break; - } - iter = iter->next; - } - - skb_morph(prev, head); - prev->next = head->next; - consume_skb(head); - head = prev; + goto err; }
/* We have to remove fragment header from datagram and to relocate * header in order to calculate ICV correctly. */ - skb_network_header(head)[fq->nhoffset] = skb_transport_header(head)[0]; - memmove(head->head + sizeof(struct frag_hdr), head->head, - (head->data - head->head) - sizeof(struct frag_hdr)); - head->mac_header += sizeof(struct frag_hdr); - head->network_header += sizeof(struct frag_hdr); - - skb_shinfo(head)->frag_list = head->next; - skb_reset_transport_header(head); - skb_push(head, head->data - skb_network_header(head)); - - for (fp = head->next; fp; fp = fp->next) { - head->data_len += fp->len; - head->len += fp->len; - if (head->ip_summed != fp->ip_summed) - head->ip_summed = CHECKSUM_NONE; - else if (head->ip_summed == CHECKSUM_COMPLETE) - head->csum = csum_add(head->csum, fp->csum); - head->truesize += fp->truesize; - fp->sk = NULL; - } - sub_frag_mem_limit(fq->q.net, head->truesize); + skb_network_header(skb)[fq->nhoffset] = skb_transport_header(skb)[0]; + memmove(skb->head + sizeof(struct frag_hdr), skb->head, + (skb->data - skb->head) - sizeof(struct frag_hdr)); + skb->mac_header += sizeof(struct frag_hdr); + skb->network_header += sizeof(struct frag_hdr); + + skb_reset_transport_header(skb); + + inet_frag_reasm_finish(&fq->q, skb, reasm_data);
- head->ignore_df = 1; - head->next = NULL; - head->dev = dev; - head->tstamp = fq->q.stamp; - ipv6_hdr(head)->payload_len = htons(payload_len); - ipv6_change_dsfield(ipv6_hdr(head), 0xff, ecn); - IP6CB(head)->frag_max_size = sizeof(struct ipv6hdr) + fq->q.max_size; + skb->ignore_df = 1; + skb->dev = dev; + ipv6_hdr(skb)->payload_len = htons(payload_len); + ipv6_change_dsfield(ipv6_hdr(skb), 0xff, ecn); + IP6CB(skb)->frag_max_size = sizeof(struct ipv6hdr) + fq->q.max_size;
/* Yes, and fold redundant checksum back. 8) */ - if (head->ip_summed == CHECKSUM_COMPLETE) - head->csum = csum_partial(skb_network_header(head), - skb_network_header_len(head), - head->csum); + if (skb->ip_summed == CHECKSUM_COMPLETE) + skb->csum = csum_partial(skb_network_header(skb), + skb_network_header_len(skb), + skb->csum);
fq->q.fragments = NULL; fq->q.rb_fragments = RB_ROOT; fq->q.fragments_tail = NULL; + fq->q.last_run_head = NULL;
- return true; + return 0; + +err: + inet_frag_kill(&fq->q); + return -EINVAL; }
/* @@ -542,7 +443,6 @@ find_prev_fhdr(struct sk_buff *skb, u8 *prevhdrp, int *prevhoff, int *fhoff) int nf_ct_frag6_gather(struct net *net, struct sk_buff *skb, u32 user) { u16 savethdr = skb->transport_header; - struct net_device *dev = skb->dev; int fhoff, nhoff, ret; struct frag_hdr *fhdr; struct frag_queue *fq; @@ -565,10 +465,6 @@ int nf_ct_frag6_gather(struct net *net, struct sk_buff *skb, u32 user) hdr = ipv6_hdr(skb); fhdr = (struct frag_hdr *)skb_transport_header(skb);
- if (skb->len - skb_network_offset(skb) < IPV6_MIN_MTU && - fhdr->frag_off & htons(IP6_MF)) - return -EINVAL; - skb_orphan(skb); fq = fq_find(net, fhdr->identification, user, hdr, skb->dev ? skb->dev->ifindex : 0); @@ -580,31 +476,17 @@ int nf_ct_frag6_gather(struct net *net, struct sk_buff *skb, u32 user) spin_lock_bh(&fq->q.lock);
ret = nf_ct_frag6_queue(fq, skb, fhdr, nhoff); - if (ret < 0) { - if (ret == -EPROTO) { - skb->transport_header = savethdr; - ret = 0; - } - goto out_unlock; + if (ret == -EPROTO) { + skb->transport_header = savethdr; + ret = 0; }
/* after queue has assumed skb ownership, only 0 or -EINPROGRESS * must be returned. */ - ret = -EINPROGRESS; - if (fq->q.flags == (INET_FRAG_FIRST_IN | INET_FRAG_LAST_IN) && - fq->q.meat == fq->q.len) { - unsigned long orefdst = skb->_skb_refdst; - - skb->_skb_refdst = 0UL; - if (nf_ct_frag6_reasm(fq, skb, dev)) - ret = 0; - skb->_skb_refdst = orefdst; - } else { - skb_dst_drop(skb); - } + if (ret) + ret = -EINPROGRESS;
-out_unlock: spin_unlock_bh(&fq->q.lock); inet_frag_put(&fq->q); return ret;
From: Aurelien Aptel aaptel@suse.com
commit b98749cac4a695f084a5ff076f4510b23e353ecd upstream.
In the oplock break handler, writing out pending changes from pages puts the FileInfo handle. If the refcount reaches zero, this closes the handle and waits for any oplock break handler to return, thus causing a deadlock.
To prevent this situation:
* We add a wait flag to cifsFileInfo_put() to decide whether we should wait for running/pending oplock break handlers
* We keep an additional reference on the SMB FileInfo handle so that, for the rest of the handler, putting the handle won't close it.
  - The ref is bumped every time we queue the handler via the cifs_queue_oplock_break() helper.
  - The ref is decremented at the end of the handler.
This bug was triggered by xfstest 464.
This is also an important fix that addresses the various reports of oopses in smb2_push_mandatory_locks().
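The resulting reference lifecycle looks like this (condensed from the hunks below; locking elided):

        /* demultiplex thread: take an extra ref before queueing the handler */
        void cifs_queue_oplock_break(struct cifsFileInfo *cfile)
        {
                cifsFileInfo_get(cfile);
                queue_work(cifsoplockd_wq, &cfile->oplock_break);
        }

        /* end of cifs_oplock_break(): drop that ref without waiting for ourself */
        _cifsFileInfo_put(cfile, false /* do not wait for ourself */);

so the handle can only be fully closed once the handler has finished with it.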
Signed-off-by: Aurelien Aptel aaptel@suse.com Signed-off-by: Steve French stfrench@microsoft.com Reviewed-by: Pavel Shilovsky pshilov@microsoft.com CC: Stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/cifsglob.h | 2 ++ fs/cifs/file.c | 30 +++++++++++++++++++++++++----- fs/cifs/misc.c | 25 +++++++++++++++++++++++-- fs/cifs/smb2misc.c | 6 +++--- 4 files changed, 53 insertions(+), 10 deletions(-)
--- a/fs/cifs/cifsglob.h +++ b/fs/cifs/cifsglob.h @@ -1263,6 +1263,7 @@ cifsFileInfo_get_locked(struct cifsFileI }
struct cifsFileInfo *cifsFileInfo_get(struct cifsFileInfo *cifs_file); +void _cifsFileInfo_put(struct cifsFileInfo *cifs_file, bool wait_oplock_hdlr); void cifsFileInfo_put(struct cifsFileInfo *cifs_file);
#define CIFS_CACHE_READ_FLG 1 @@ -1763,6 +1764,7 @@ GLOBAL_EXTERN spinlock_t gidsidlock; #endif /* CONFIG_CIFS_ACL */
void cifs_oplock_break(struct work_struct *work); +void cifs_queue_oplock_break(struct cifsFileInfo *cfile);
extern const struct slow_work_ops cifs_oplock_break_ops; extern struct workqueue_struct *cifsiod_wq; --- a/fs/cifs/file.c +++ b/fs/cifs/file.c @@ -358,13 +358,31 @@ cifsFileInfo_get(struct cifsFileInfo *ci return cifs_file; }
-/* - * Release a reference on the file private data. This may involve closing - * the filehandle out on the server. Must be called without holding - * tcon->open_file_lock and cifs_file->file_info_lock. +/** + * cifsFileInfo_put - release a reference of file priv data + * + * Always potentially wait for oplock handler. See _cifsFileInfo_put(). */ void cifsFileInfo_put(struct cifsFileInfo *cifs_file) { + _cifsFileInfo_put(cifs_file, true); +} + +/** + * _cifsFileInfo_put - release a reference of file priv data + * + * This may involve closing the filehandle @cifs_file out on the + * server. Must be called without holding tcon->open_file_lock and + * cifs_file->file_info_lock. + * + * If @wait_for_oplock_handler is true and we are releasing the last + * reference, wait for any running oplock break handler of the file + * and cancel any pending one. If calling this function from the + * oplock break handler, you need to pass false. + * + */ +void _cifsFileInfo_put(struct cifsFileInfo *cifs_file, bool wait_oplock_handler) +{ struct inode *inode = d_inode(cifs_file->dentry); struct cifs_tcon *tcon = tlink_tcon(cifs_file->tlink); struct TCP_Server_Info *server = tcon->ses->server; @@ -411,7 +429,8 @@ void cifsFileInfo_put(struct cifsFileInf
spin_unlock(&tcon->open_file_lock);
- oplock_break_cancelled = cancel_work_sync(&cifs_file->oplock_break); + oplock_break_cancelled = wait_oplock_handler ? + cancel_work_sync(&cifs_file->oplock_break) : false;
if (!tcon->need_reconnect && !cifs_file->invalidHandle) { struct TCP_Server_Info *server = tcon->ses->server; @@ -4170,6 +4189,7 @@ void cifs_oplock_break(struct work_struc cinode); cifs_dbg(FYI, "Oplock release rc = %d\n", rc); } + _cifsFileInfo_put(cfile, false /* do not wait for ourself */); cifs_done_oplock_break(cinode); }
--- a/fs/cifs/misc.c +++ b/fs/cifs/misc.c @@ -490,8 +490,7 @@ is_valid_oplock_break(char *buffer, stru CIFS_INODE_DOWNGRADE_OPLOCK_TO_L2, &pCifsInode->flags);
- queue_work(cifsoplockd_wq, - &netfile->oplock_break); + cifs_queue_oplock_break(netfile); netfile->oplock_break_cancelled = false;
spin_unlock(&tcon->open_file_lock); @@ -588,6 +587,28 @@ void cifs_put_writer(struct cifsInodeInf spin_unlock(&cinode->writers_lock); }
+/** + * cifs_queue_oplock_break - queue the oplock break handler for cfile + * + * This function is called from the demultiplex thread when it + * receives an oplock break for @cfile. + * + * Assumes the tcon->open_file_lock is held. + * Assumes cfile->file_info_lock is NOT held. + */ +void cifs_queue_oplock_break(struct cifsFileInfo *cfile) +{ + /* + * Bump the handle refcount now while we hold the + * open_file_lock to enforce the validity of it for the oplock + * break handler. The matching put is done at the end of the + * handler. + */ + cifsFileInfo_get(cfile); + + queue_work(cifsoplockd_wq, &cfile->oplock_break); +} + void cifs_done_oplock_break(struct cifsInodeInfo *cinode) { clear_bit(CIFS_INODE_PENDING_OPLOCK_BREAK, &cinode->flags); --- a/fs/cifs/smb2misc.c +++ b/fs/cifs/smb2misc.c @@ -555,7 +555,7 @@ smb2_tcon_has_lease(struct cifs_tcon *tc clear_bit(CIFS_INODE_DOWNGRADE_OPLOCK_TO_L2, &cinode->flags);
- queue_work(cifsoplockd_wq, &cfile->oplock_break); + cifs_queue_oplock_break(cfile); kfree(lw); return true; } @@ -719,8 +719,8 @@ smb2_is_valid_oplock_break(char *buffer, CIFS_INODE_DOWNGRADE_OPLOCK_TO_L2, &cinode->flags); spin_unlock(&cfile->file_info_lock); - queue_work(cifsoplockd_wq, - &cfile->oplock_break); + + cifs_queue_oplock_break(cfile);
spin_unlock(&tcon->open_file_lock); spin_unlock(&cifs_tcp_ses_lock);
From: ZhangXiaoxu zhangxiaoxu5@huawei.com
commit 6a3eb3360667170988f8a6477f6686242061488a upstream.
There is a KASAN use-after-free: BUG: KASAN: use-after-free in SMB2_write+0x1342/0x1580 Read of size 8 at addr ffff8880b6a8e450 by task ln/4196
We should not release 'req' because it is still used in the trace.
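The fix simply moves the release past the last user of 'req' (condensed from the hunks below):

        rc = cifs_send_recv(xid, io_parms->tcon->ses, &rqst, &resp_buftype,
                            flags, &rsp_iov);
        rsp = (struct smb2_write_rsp *)rsp_iov.iov_base;

        /* ... error handling and tracepoints that still dereference req ... */

        cifs_small_buf_release(req);    /* released only after its last use */
        free_rsp_buf(resp_buftype, rsp);
        return rc;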
Fixes: eccb4422cf97 ("smb3: Add ftrace tracepoints for improved SMB3 debugging")
Signed-off-by: ZhangXiaoxu zhangxiaoxu5@huawei.com Signed-off-by: Steve French stfrench@microsoft.com CC: Stable stable@vger.kernel.org 4.18+ Reviewed-by: Pavel Shilovsky pshilov@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/smb2pdu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/cifs/smb2pdu.c +++ b/fs/cifs/smb2pdu.c @@ -3591,7 +3591,6 @@ SMB2_write(const unsigned int xid, struc
rc = cifs_send_recv(xid, io_parms->tcon->ses, &rqst, &resp_buftype, flags, &rsp_iov); - cifs_small_buf_release(req); rsp = (struct smb2_write_rsp *)rsp_iov.iov_base;
if (rc) { @@ -3609,6 +3608,7 @@ SMB2_write(const unsigned int xid, struc io_parms->offset, *nbytes); }
+ cifs_small_buf_release(req); free_rsp_buf(resp_buftype, rsp); return rc; }
From: ZhangXiaoxu zhangxiaoxu5@huawei.com
commit 088aaf17aa79300cab14dbee2569c58cfafd7d6e upstream.
There is a KASAN use-after-free: BUG: KASAN: use-after-free in SMB2_read+0x1136/0x1190 Read of size 8 at addr ffff8880b4e45e50 by task ln/1009
We should not release 'req' because it is still used in the trace.
Fixes: eccb4422cf97 ("smb3: Add ftrace tracepoints for improved SMB3 debugging")
Signed-off-by: ZhangXiaoxu zhangxiaoxu5@huawei.com Signed-off-by: Steve French stfrench@microsoft.com CC: Stable stable@vger.kernel.org 4.18+ Reviewed-by: Pavel Shilovsky pshilov@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/smb2pdu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/cifs/smb2pdu.c +++ b/fs/cifs/smb2pdu.c @@ -3273,8 +3273,6 @@ SMB2_read(const unsigned int xid, struct rqst.rq_nvec = 1;
rc = cifs_send_recv(xid, ses, &rqst, &resp_buftype, flags, &rsp_iov); - cifs_small_buf_release(req); - rsp = (struct smb2_read_rsp *)rsp_iov.iov_base;
if (rc) { @@ -3293,6 +3291,8 @@ SMB2_read(const unsigned int xid, struct io_parms->tcon->tid, ses->Suid, io_parms->offset, io_parms->length);
+ cifs_small_buf_release(req); + *nbytes = le32_to_cpu(rsp->DataLength); if ((*nbytes > CIFS_MAX_MSGSIZE) || (*nbytes > io_parms->length)) {
From: Ronnie Sahlberg lsahlber@redhat.com
commit e6d0fb7b34f264f72c33053558a360a6a734905e upstream.
If we enter smb2_query_symlink() for something that is not a symlink, and the SMB2_open() would succeed, we would never end up closing this handle and would thus leak a handle on the server.
Fix this by immediately calling SMB2_close() on a successful open.
Signed-off-by: Ronnie Sahlberg lsahlber@redhat.com CC: Stable stable@vger.kernel.org Signed-off-by: Steve French stfrench@microsoft.com Reviewed-by: Pavel Shilovsky pshilov@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/smb2ops.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/fs/cifs/smb2ops.c +++ b/fs/cifs/smb2ops.c @@ -1906,6 +1906,8 @@ smb2_query_symlink(const unsigned int xi
rc = SMB2_open(xid, &oparms, utf16_path, &oplock, NULL, &err_iov, &resp_buftype); + if (!rc) + SMB2_close(xid, tcon, fid.persistent_fid, fid.volatile_fid); if (!rc || !err_iov.iov_base) { rc = -ENOENT; goto free_path;
From: Sean Christopherson sean.j.christopherson@intel.com
commit 8f4dc2e77cdfaf7e644ef29693fa229db29ee1de upstream.
Neither AMD nor Intel CPUs have an EFER field in the legacy SMRAM save state area, i.e. don't save/restore EFER across SMM transitions. KVM somewhat models this, e.g. doesn't clear EFER on entry to SMM if the guest doesn't support long mode. But during RSM, KVM unconditionally clears EFER so that it can get back to pure 32-bit mode in order to start loading CRs with their actual non-SMM values.
Clear EFER only when it will be written when loading the non-SMM state so as to preserve bits that can theoretically be set on 32-bit vCPUs, e.g. KVM always emulates EFER_SCE.
And because CR4.PAE is cleared only to play nice with EFER, wrap that code in the long mode check as well. Note, this may result in a compiler warning about cr4 being consumed uninitialized. Re-read CR4 even though it's technically unnecessary, as doing so allows for more readable code and RSM emulation is not a performance critical path.
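The resulting structure of em_rsm() can be sketched as follows (condensed from the hunks below):

        if (emulator_has_longmode(ctxt)) {
                /* Zero CR4.PCIDE before CR0.PG. */
                cr4 = ctxt->ops->get_cr(ctxt, 4);
                if (cr4 & X86_CR4_PCIDE)
                        ctxt->ops->set_cr(ctxt, 4, cr4 & ~X86_CR4_PCIDE);
                /* ... set up a 32-bit code segment to clear EFER.LMA ... */
        }

        /* clear CR0.PG/CR0.PE (context, unchanged by this patch) */
        if (cr0 & X86_CR0_PE)
                ctxt->ops->set_cr(ctxt, 0, cr0 & ~(X86_CR0_PG | X86_CR0_PE));

        if (emulator_has_longmode(ctxt)) {
                /* Clear CR4.PAE before clearing EFER.LME. */
                cr4 = ctxt->ops->get_cr(ctxt, 4);
                if (cr4 & X86_CR4_PAE)
                        ctxt->ops->set_cr(ctxt, 4, cr4 & ~X86_CR4_PAE);

                /* And finally go back to 32-bit mode. */
                ctxt->ops->set_msr(ctxt, MSR_EFER, 0);
        }

i.e. CR4.PAE and EFER are now only touched on vCPUs that support long mode.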
Fixes: 660a5d517aaab ("KVM: x86: save/load state on SMM switch") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/emulate.c | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-)
--- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -2575,15 +2575,13 @@ static int em_rsm(struct x86_emulate_ctx * CR0/CR3/CR4/EFER. It's all a bit more complicated if the vCPU * supports long mode. */ - cr4 = ctxt->ops->get_cr(ctxt, 4); if (emulator_has_longmode(ctxt)) { struct desc_struct cs_desc;
/* Zero CR4.PCIDE before CR0.PG. */ - if (cr4 & X86_CR4_PCIDE) { + cr4 = ctxt->ops->get_cr(ctxt, 4); + if (cr4 & X86_CR4_PCIDE) ctxt->ops->set_cr(ctxt, 4, cr4 & ~X86_CR4_PCIDE); - cr4 &= ~X86_CR4_PCIDE; - }
/* A 32-bit code segment is required to clear EFER.LMA. */ memset(&cs_desc, 0, sizeof(cs_desc)); @@ -2597,13 +2595,16 @@ static int em_rsm(struct x86_emulate_ctx if (cr0 & X86_CR0_PE) ctxt->ops->set_cr(ctxt, 0, cr0 & ~(X86_CR0_PG | X86_CR0_PE));
- /* Now clear CR4.PAE (which must be done before clearing EFER.LME). */ - if (cr4 & X86_CR4_PAE) - ctxt->ops->set_cr(ctxt, 4, cr4 & ~X86_CR4_PAE); - - /* And finally go back to 32-bit mode. */ - efer = 0; - ctxt->ops->set_msr(ctxt, MSR_EFER, efer); + if (emulator_has_longmode(ctxt)) { + /* Clear CR4.PAE before clearing EFER.LME. */ + cr4 = ctxt->ops->get_cr(ctxt, 4); + if (cr4 & X86_CR4_PAE) + ctxt->ops->set_cr(ctxt, 4, cr4 & ~X86_CR4_PAE); + + /* And finally go back to 32-bit mode. */ + efer = 0; + ctxt->ops->set_msr(ctxt, MSR_EFER, efer); + }
smbase = ctxt->ops->get_smbase(ctxt);
From: Vitaly Kuznetsov vkuznets@redhat.com
commit 99c221796a810055974b54c02e8f53297e48d146 upstream.
I noticed that apic test from kvm-unit-tests always hangs on my EPYC 7401P, the hanging test nmi-after-sti is trying to deliver 30000 NMIs and tracing shows that we're sometimes able to deliver a few but never all.
When we're trying to inject an NMI, we may fail to do so immediately for various reasons; however, we still need to inject it, so enable_nmi_window() arms nmi_singlestep mode. #DB occurs as expected, but we're not checking for pending NMIs before entering the guest, and unless there's a different event to process, the NMI will never get delivered.
Make KVM_REQ_EVENT request on the vCPU from db_interception() to make sure pending NMIs are checked and possibly injected.
Signed-off-by: Vitaly Kuznetsov vkuznets@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/svm.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -2679,6 +2679,7 @@ static int npf_interception(struct vcpu_ static int db_interception(struct vcpu_svm *svm) { struct kvm_run *kvm_run = svm->vcpu.run; + struct kvm_vcpu *vcpu = &svm->vcpu;
if (!(svm->vcpu.guest_debug & (KVM_GUESTDBG_SINGLESTEP | KVM_GUESTDBG_USE_HW_BP)) && @@ -2689,6 +2690,8 @@ static int db_interception(struct vcpu_s
if (svm->nmi_singlestep) { disable_nmi_singlestep(svm); + /* Make sure we check for pending NMIs upon entry */ + kvm_make_request(KVM_REQ_EVENT, vcpu); }
if (svm->vcpu.guest_debug &
From: Leonard Pollak leonardp@tr-host.de
commit 0a8a29be499cbb67df79370aaf5109085509feb8 upstream.
This patch fixes an obvious typo, which caused the Peak Voltage to be erroneously returned instead of the Peak Current.
Signed-off-by: Leonard Pollak leonardp@tr-host.de Cc: Stable@vger.kernel.org Acked-by: Michael Hennerich michael.hennerich@analog.com Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/staging/iio/meter/ade7854.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/staging/iio/meter/ade7854.c +++ b/drivers/staging/iio/meter/ade7854.c @@ -269,7 +269,7 @@ static IIO_DEV_ATTR_VPEAK(0644, static IIO_DEV_ATTR_IPEAK(0644, ade7854_read_32bit, ade7854_write_32bit, - ADE7854_VPEAK); + ADE7854_IPEAK); static IIO_DEV_ATTR_APHCAL(0644, ade7854_read_16bit, ade7854_write_16bit,
From: Mircea Caprioru mircea.caprioru@analog.com
commit 7ce0f216221856a17fc4934b39284678a5fef2e9 upstream.
This patch fixes the differential channels addresses for the ad7193.
Signed-off-by: Mircea Caprioru mircea.caprioru@analog.com Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/staging/iio/adc/ad7192.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/staging/iio/adc/ad7192.c +++ b/drivers/staging/iio/adc/ad7192.c @@ -109,10 +109,10 @@ #define AD7192_CH_AIN3 BIT(6) /* AIN3 - AINCOM */ #define AD7192_CH_AIN4 BIT(7) /* AIN4 - AINCOM */
-#define AD7193_CH_AIN1P_AIN2M 0x000 /* AIN1(+) - AIN2(-) */ -#define AD7193_CH_AIN3P_AIN4M 0x001 /* AIN3(+) - AIN4(-) */ -#define AD7193_CH_AIN5P_AIN6M 0x002 /* AIN5(+) - AIN6(-) */ -#define AD7193_CH_AIN7P_AIN8M 0x004 /* AIN7(+) - AIN8(-) */ +#define AD7193_CH_AIN1P_AIN2M 0x001 /* AIN1(+) - AIN2(-) */ +#define AD7193_CH_AIN3P_AIN4M 0x002 /* AIN3(+) - AIN4(-) */ +#define AD7193_CH_AIN5P_AIN6M 0x004 /* AIN5(+) - AIN6(-) */ +#define AD7193_CH_AIN7P_AIN8M 0x008 /* AIN7(+) - AIN8(-) */ #define AD7193_CH_TEMP 0x100 /* Temp senseor */ #define AD7193_CH_AIN2P_AIN2M 0x200 /* AIN2(+) - AIN2(-) */ #define AD7193_CH_AIN1 0x401 /* AIN1 - AINCOM */
From: Sergey Larin cerg2010cerg2010@mail.ru
commit 409a51e0a4a5f908763191fae2c29008632eb712 upstream.
According to the datasheet, the last bit of CHIP_ID register controls I2C bus, and the first one is unused. Handle this correctly.
Note that there are chips out there that have a value such that the id check currently fails.
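As a quick illustration, a minimal userspace sketch of the new check, using the constants from the patch below; the example ID readings are hypothetical:

  #include <stdio.h>

  #define MPU3050_CHIP_ID       0x68
  #define MPU3050_CHIP_ID_MASK  0x7E  /* ignore bit 0 (I2C bus control) and bit 7 (unused) */

  int main(void)
  {
          unsigned int readings[] = { 0x68, 0x69 };  /* hypothetical CHIP_ID readings */
          unsigned int i;

          for (i = 0; i < sizeof(readings) / sizeof(readings[0]); i++)
                  printf("chip id 0x%02x -> %s\n", readings[i],
                         (readings[i] & MPU3050_CHIP_ID_MASK) == MPU3050_CHIP_ID ?
                         "accepted" : "rejected");
          return 0;
  }

With the mask applied, both 0x68 and 0x69 match, which is the point of the change.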
Signed-off-by: Sergey Larin cerg2010cerg2010@mail.ru Reviewed-by: Linus Walleij linus.walleij@linaro.org Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iio/gyro/mpu3050-core.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
--- a/drivers/iio/gyro/mpu3050-core.c +++ b/drivers/iio/gyro/mpu3050-core.c @@ -29,7 +29,8 @@
#include "mpu3050.h"
-#define MPU3050_CHIP_ID 0x69 +#define MPU3050_CHIP_ID 0x68 +#define MPU3050_CHIP_ID_MASK 0x7E
/* * Register map: anything suffixed *_H is a big-endian high byte and always @@ -1176,8 +1177,9 @@ int mpu3050_common_probe(struct device * goto err_power_down; }
- if (val != MPU3050_CHIP_ID) { - dev_err(dev, "unsupported chip id %02x\n", (u8)val); + if ((val & MPU3050_CHIP_ID_MASK) != MPU3050_CHIP_ID) { + dev_err(dev, "unsupported chip id %02x\n", + (u8)(val & MPU3050_CHIP_ID_MASK)); ret = -ENODEV; goto err_power_down; }
From: Mike Looijmans mike.looijmans@topic.nl
commit 40a7198a4a01037003c7ca714f0d048a61e729ac upstream.
The standard unit for temperature is millidegrees Celsius, whereas this driver was reporting in degrees. Fix the scale factor in the driver.
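To see the effect of the unit change, a small sketch with a made-up raw reading (the scale factors are the ones from the change below; the IIO offset is ignored for simplicity):

  #include <stdio.h>

  int main(void)
  {
          int raw = 10;                    /* hypothetical raw temperature sample */
          double old_scale = 0.500000;     /* val = 0, val2 = 500000, IIO_VAL_INT_PLUS_MICRO */
          double new_scale = 500.0;        /* val = 500, IIO_VAL_INT */

          /* userspace computes value = raw * scale */
          printf("old reporting: %.3f (read as degrees)\n", raw * old_scale);
          printf("new reporting: %.1f millidegrees Celsius\n", raw * new_scale);
          return 0;
  }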
Signed-off-by: Mike Looijmans mike.looijmans@topic.nl Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iio/gyro/bmg160_core.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/iio/gyro/bmg160_core.c +++ b/drivers/iio/gyro/bmg160_core.c @@ -582,11 +582,10 @@ static int bmg160_read_raw(struct iio_de case IIO_CHAN_INFO_LOW_PASS_FILTER_3DB_FREQUENCY: return bmg160_get_filter(data, val); case IIO_CHAN_INFO_SCALE: - *val = 0; switch (chan->type) { case IIO_TEMP: - *val2 = 500000; - return IIO_VAL_INT_PLUS_MICRO; + *val = 500; + return IIO_VAL_INT; case IIO_ANGL_VEL: { int i; @@ -594,6 +593,7 @@ static int bmg160_read_raw(struct iio_de for (i = 0; i < ARRAY_SIZE(bmg160_scale_table); ++i) { if (bmg160_scale_table[i].dps_range == data->dps_range) { + *val = 0; *val2 = bmg160_scale_table[i].scale; return IIO_VAL_INT_PLUS_MICRO; }
From: Mike Looijmans mike.looijmans@topic.nl
commit 9436f45dd53595e21566a8c6627411077dfdb776 upstream.
The standard unit for temperature is millidegrees Celsius. Adapt the driver to report in millidegrees instead of degrees.
Signed-off-by: Mike Looijmans mike.looijmans@topic.nl Fixes: 1b3bd8592780 ("iio: chemical: Add support for Bosch BME680 sensor"); Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iio/chemical/bme680_core.c | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-)
--- a/drivers/iio/chemical/bme680_core.c +++ b/drivers/iio/chemical/bme680_core.c @@ -591,8 +591,7 @@ static int bme680_gas_config(struct bme6 return ret; }
-static int bme680_read_temp(struct bme680_data *data, - int *val, int *val2) +static int bme680_read_temp(struct bme680_data *data, int *val) { struct device *dev = regmap_get_device(data->regmap); int ret; @@ -625,10 +624,9 @@ static int bme680_read_temp(struct bme68 * compensate_press/compensate_humid to get compensated * pressure/humidity readings. */ - if (val && val2) { - *val = comp_temp; - *val2 = 100; - return IIO_VAL_FRACTIONAL; + if (val) { + *val = comp_temp * 10; /* Centidegrees to millidegrees */ + return IIO_VAL_INT; }
return ret; @@ -643,7 +641,7 @@ static int bme680_read_press(struct bme6 s32 adc_press;
/* Read and compensate temperature to get a reading of t_fine */ - ret = bme680_read_temp(data, NULL, NULL); + ret = bme680_read_temp(data, NULL); if (ret < 0) return ret;
@@ -676,7 +674,7 @@ static int bme680_read_humid(struct bme6 u32 comp_humidity;
/* Read and compensate temperature to get a reading of t_fine */ - ret = bme680_read_temp(data, NULL, NULL); + ret = bme680_read_temp(data, NULL); if (ret < 0) return ret;
@@ -769,7 +767,7 @@ static int bme680_read_raw(struct iio_de case IIO_CHAN_INFO_PROCESSED: switch (chan->type) { case IIO_TEMP: - return bme680_read_temp(data, val, val2); + return bme680_read_temp(data, val); case IIO_PRESSURE: return bme680_read_press(data, val, val2); case IIO_HUMIDITYRELATIVE:
From: Mike Looijmans mike.looijmans@topic.nl
commit 73f3bc6da506711302bb67572440eb84b1ec4a2c upstream.
The SPI interface implementation was completely broken.
When using the SPI interface, there are only 7 address bits, the upper bit is controlled by a page select register. The core needs access to both ranges, so implement register read/write for both regions. The regmap paging functionality didn't agree with a register that needs to be read and modified, so I implemented a custom paging algorithm.
This fixes that the device wouldn't even probe in SPI mode.
With the paging handled, the SPI interface no longer differs from I2C, so the two code paths were merged into the core and the separate I2C/SPI register names are no longer needed.
Register value caching was also implemented to reduce the I2C/SPI data transfers considerably.
The calibration set reads as all zeroes until some undefined point in time, and I couldn't determine what makes it valid. The datasheet mentions these registers but does not provide any hints on when they become valid, and they aren't even enumerated in the memory map. So check the calibration and retry reading it from the device after each measurement until it provides something valid.
Despite the size this is suitable for a stable backport given that it seems the SPI support never worked.
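A small sketch of the address-to-page mapping used by the custom paging code in this patch; the register addresses are the ones defined here (chip ID at 0xD0, status at 0x73):

  #include <stdio.h>

  /* Same test as in bme680_regmap_spi_select_page() below */
  static unsigned int bme680_page_for(unsigned int reg)
  {
          return (reg & 0x80) ? 0 : 1;  /* page 1 covers the low 0x00-0x7F range */
  }

  int main(void)
  {
          printf("0xD0 (BME680_REG_CHIP_ID) -> page %u\n", bme680_page_for(0xD0));
          printf("0x73 (BME680_REG_STATUS)  -> page %u\n", bme680_page_for(0x73));
          return 0;
  }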
Signed-off-by: Mike Looijmans mike.looijmans@topic.nl Fixes: 1b3bd8592780 ("iio: chemical: Add support for Bosch BME680 sensor"); Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iio/chemical/bme680.h | 6 - drivers/iio/chemical/bme680_core.c | 38 ++++++++++++ drivers/iio/chemical/bme680_i2c.c | 21 ------ drivers/iio/chemical/bme680_spi.c | 115 +++++++++++++++++++++++++------------ 4 files changed, 118 insertions(+), 62 deletions(-)
--- a/drivers/iio/chemical/bme680.h +++ b/drivers/iio/chemical/bme680.h @@ -2,11 +2,9 @@ #ifndef BME680_H_ #define BME680_H_
-#define BME680_REG_CHIP_I2C_ID 0xD0 -#define BME680_REG_CHIP_SPI_ID 0x50 +#define BME680_REG_CHIP_ID 0xD0 #define BME680_CHIP_ID_VAL 0x61 -#define BME680_REG_SOFT_RESET_I2C 0xE0 -#define BME680_REG_SOFT_RESET_SPI 0x60 +#define BME680_REG_SOFT_RESET 0xE0 #define BME680_CMD_SOFTRESET 0xB6 #define BME680_REG_STATUS 0x73 #define BME680_SPI_MEM_PAGE_BIT BIT(4) --- a/drivers/iio/chemical/bme680_core.c +++ b/drivers/iio/chemical/bme680_core.c @@ -63,9 +63,23 @@ struct bme680_data { s32 t_fine; };
+static const struct regmap_range bme680_volatile_ranges[] = { + regmap_reg_range(BME680_REG_MEAS_STAT_0, BME680_REG_GAS_R_LSB), + regmap_reg_range(BME680_REG_STATUS, BME680_REG_STATUS), + regmap_reg_range(BME680_T2_LSB_REG, BME680_GH3_REG), +}; + +static const struct regmap_access_table bme680_volatile_table = { + .yes_ranges = bme680_volatile_ranges, + .n_yes_ranges = ARRAY_SIZE(bme680_volatile_ranges), +}; + const struct regmap_config bme680_regmap_config = { .reg_bits = 8, .val_bits = 8, + .max_register = 0xef, + .volatile_table = &bme680_volatile_table, + .cache_type = REGCACHE_RBTREE, }; EXPORT_SYMBOL(bme680_regmap_config);
@@ -330,6 +344,10 @@ static s16 bme680_compensate_temp(struct s64 var1, var2, var3; s16 calc_temp;
+ /* If the calibration is invalid, attempt to reload it */ + if (!calib->par_t2) + bme680_read_calib(data, calib); + var1 = (adc_temp >> 3) - (calib->par_t1 << 1); var2 = (var1 * calib->par_t2) >> 11; var3 = ((var1 >> 1) * (var1 >> 1)) >> 12; @@ -903,8 +921,28 @@ int bme680_core_probe(struct device *dev { struct iio_dev *indio_dev; struct bme680_data *data; + unsigned int val; int ret;
+ ret = regmap_write(regmap, BME680_REG_SOFT_RESET, + BME680_CMD_SOFTRESET); + if (ret < 0) { + dev_err(dev, "Failed to reset chip\n"); + return ret; + } + + ret = regmap_read(regmap, BME680_REG_CHIP_ID, &val); + if (ret < 0) { + dev_err(dev, "Error reading chip ID\n"); + return ret; + } + + if (val != BME680_CHIP_ID_VAL) { + dev_err(dev, "Wrong chip ID, got %x expected %x\n", + val, BME680_CHIP_ID_VAL); + return -ENODEV; + } + indio_dev = devm_iio_device_alloc(dev, sizeof(*data)); if (!indio_dev) return -ENOMEM; --- a/drivers/iio/chemical/bme680_i2c.c +++ b/drivers/iio/chemical/bme680_i2c.c @@ -23,8 +23,6 @@ static int bme680_i2c_probe(struct i2c_c { struct regmap *regmap; const char *name = NULL; - unsigned int val; - int ret;
regmap = devm_regmap_init_i2c(client, &bme680_regmap_config); if (IS_ERR(regmap)) { @@ -33,25 +31,6 @@ static int bme680_i2c_probe(struct i2c_c return PTR_ERR(regmap); }
- ret = regmap_write(regmap, BME680_REG_SOFT_RESET_I2C, - BME680_CMD_SOFTRESET); - if (ret < 0) { - dev_err(&client->dev, "Failed to reset chip\n"); - return ret; - } - - ret = regmap_read(regmap, BME680_REG_CHIP_I2C_ID, &val); - if (ret < 0) { - dev_err(&client->dev, "Error reading I2C chip ID\n"); - return ret; - } - - if (val != BME680_CHIP_ID_VAL) { - dev_err(&client->dev, "Wrong chip ID, got %x expected %x\n", - val, BME680_CHIP_ID_VAL); - return -ENODEV; - } - if (id) name = id->name;
--- a/drivers/iio/chemical/bme680_spi.c +++ b/drivers/iio/chemical/bme680_spi.c @@ -11,28 +11,93 @@
#include "bme680.h"
+struct bme680_spi_bus_context { + struct spi_device *spi; + u8 current_page; +}; + +/* + * In SPI mode there are only 7 address bits, a "page" register determines + * which part of the 8-bit range is active. This function looks at the address + * and writes the page selection bit if needed + */ +static int bme680_regmap_spi_select_page( + struct bme680_spi_bus_context *ctx, u8 reg) +{ + struct spi_device *spi = ctx->spi; + int ret; + u8 buf[2]; + u8 page = (reg & 0x80) ? 0 : 1; /* Page "1" is low range */ + + if (page == ctx->current_page) + return 0; + + /* + * Data sheet claims we're only allowed to change bit 4, so we must do + * a read-modify-write on each and every page select + */ + buf[0] = BME680_REG_STATUS; + ret = spi_write_then_read(spi, buf, 1, buf + 1, 1); + if (ret < 0) { + dev_err(&spi->dev, "failed to set page %u\n", page); + return ret; + } + + buf[0] = BME680_REG_STATUS; + if (page) + buf[1] |= BME680_SPI_MEM_PAGE_BIT; + else + buf[1] &= ~BME680_SPI_MEM_PAGE_BIT; + + ret = spi_write(spi, buf, 2); + if (ret < 0) { + dev_err(&spi->dev, "failed to set page %u\n", page); + return ret; + } + + ctx->current_page = page; + + return 0; +} + static int bme680_regmap_spi_write(void *context, const void *data, size_t count) { - struct spi_device *spi = context; + struct bme680_spi_bus_context *ctx = context; + struct spi_device *spi = ctx->spi; + int ret; u8 buf[2];
memcpy(buf, data, 2); + + ret = bme680_regmap_spi_select_page(ctx, buf[0]); + if (ret) + return ret; + /* * The SPI register address (= full register address without bit 7) * and the write command (bit7 = RW = '0') */ buf[0] &= ~0x80;
- return spi_write_then_read(spi, buf, 2, NULL, 0); + return spi_write(spi, buf, 2); }
static int bme680_regmap_spi_read(void *context, const void *reg, size_t reg_size, void *val, size_t val_size) { - struct spi_device *spi = context; + struct bme680_spi_bus_context *ctx = context; + struct spi_device *spi = ctx->spi; + int ret; + u8 addr = *(const u8 *)reg; + + ret = bme680_regmap_spi_select_page(ctx, addr); + if (ret) + return ret; + + addr |= 0x80; /* bit7 = RW = '1' */
- return spi_write_then_read(spi, reg, reg_size, val, val_size); + return spi_write_then_read(spi, &addr, 1, val, val_size); }
static struct regmap_bus bme680_regmap_bus = { @@ -45,8 +110,8 @@ static struct regmap_bus bme680_regmap_b static int bme680_spi_probe(struct spi_device *spi) { const struct spi_device_id *id = spi_get_device_id(spi); + struct bme680_spi_bus_context *bus_context; struct regmap *regmap; - unsigned int val; int ret;
spi->bits_per_word = 8; @@ -56,45 +121,21 @@ static int bme680_spi_probe(struct spi_d return ret; }
+ bus_context = devm_kzalloc(&spi->dev, sizeof(*bus_context), GFP_KERNEL); + if (!bus_context) + return -ENOMEM; + + bus_context->spi = spi; + bus_context->current_page = 0xff; /* Undefined on warm boot */ + regmap = devm_regmap_init(&spi->dev, &bme680_regmap_bus, - &spi->dev, &bme680_regmap_config); + bus_context, &bme680_regmap_config); if (IS_ERR(regmap)) { dev_err(&spi->dev, "Failed to register spi regmap %d\n", (int)PTR_ERR(regmap)); return PTR_ERR(regmap); }
- ret = regmap_write(regmap, BME680_REG_SOFT_RESET_SPI, - BME680_CMD_SOFTRESET); - if (ret < 0) { - dev_err(&spi->dev, "Failed to reset chip\n"); - return ret; - } - - /* after power-on reset, Page 0(0x80-0xFF) of spi_mem_page is active */ - ret = regmap_read(regmap, BME680_REG_CHIP_SPI_ID, &val); - if (ret < 0) { - dev_err(&spi->dev, "Error reading SPI chip ID\n"); - return ret; - } - - if (val != BME680_CHIP_ID_VAL) { - dev_err(&spi->dev, "Wrong chip ID, got %x expected %x\n", - val, BME680_CHIP_ID_VAL); - return -ENODEV; - } - /* - * select Page 1 of spi_mem_page to enable access to - * to registers from address 0x00 to 0x7F. - */ - ret = regmap_write_bits(regmap, BME680_REG_STATUS, - BME680_SPI_MEM_PAGE_BIT, - BME680_SPI_MEM_PAGE_1_VAL); - if (ret < 0) { - dev_err(&spi->dev, "failed to set page 1 of spi_mem_page\n"); - return ret; - } - return bme680_core_probe(&spi->dev, regmap, id->name); }
From: Gwendal Grignou gwendal@chromium.org
commit 3d02d7082e5823598090530c3988a35f69689943 upstream.
The calculation did not use IIO_DEGREE_TO_RAD and implemented a variant to avoid precision loss, since we aim for a nano value. The offset added to avoid rounding errors, though, does not give a result close to the expected value. E.g.
For 1000dps, the result should be:
(1000 * pi / 180) / 2^15 ~= 0.000532632218
But with current calculation we get
$ cat scale
0.000547890
Fix the calculation by just doing the maths involved for a nano value
val * pi * 10^9 / (180 * 2^15)
so we get a closer result.
$ cat scale
0.000532632
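The new value can be reproduced with a quick userspace check (a sketch that mirrors the arithmetic in the diff below, assuming CROS_EC_SENSOR_BITS is 16 so the divisor is 180 << 15):

  #include <stdio.h>
  #include <stdint.h>

  int main(void)
  {
          int64_t val64 = 1000;             /* 1000 dps range */
          int64_t pi_nano = 3141592653LL;   /* pi * 10^9 */
          int64_t val2 = (val64 * pi_nano) / (180LL << 15);

          /* IIO_VAL_INT_PLUS_NANO: value = val + val2 * 10^-9, with val = 0 here */
          printf("scale = 0.%09lld\n", (long long)val2);  /* prints 0.000532632 */
          return 0;
  }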
Fixes: c14dca07a31d ("iio: cros_ec_sensors: add ChromeOS EC Contiguous Sensors driver") Signed-off-by: Gwendal Grignou gwendal@chromium.org Signed-off-by: Enric Balletbo i Serra enric.balletbo@collabora.com Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iio/common/cros_ec_sensors/cros_ec_sensors.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/drivers/iio/common/cros_ec_sensors/cros_ec_sensors.c +++ b/drivers/iio/common/cros_ec_sensors/cros_ec_sensors.c @@ -103,9 +103,10 @@ static int cros_ec_sensors_read(struct i * Do not use IIO_DEGREE_TO_RAD to avoid precision * loss. Round to the nearest integer. */ - *val = div_s64(val64 * 314159 + 9000000ULL, 1000); - *val2 = 18000 << (CROS_EC_SENSOR_BITS - 1); - ret = IIO_VAL_FRACTIONAL; + *val = 0; + *val2 = div_s64(val64 * 3141592653ULL, + 180 << (CROS_EC_SENSOR_BITS - 1)); + ret = IIO_VAL_INT_PLUS_NANO; break; case MOTIONSENSE_TYPE_MAG: /*
From: Dragos Bogdan dragos.bogdan@analog.com
commit fccfb9ce70ed4ea7a145f77b86de62e38178517f upstream.
The desired channel has to be selected in order to correctly fill the buffer with the corresponding data. `ad_sd_write_reg()` already does this, but for `ad_sd_read_reg_raw()` it was omitted.
Fixes: af3008485ea03 ("iio:adc: Add common code for ADI Sigma Delta devices") Signed-off-by: Dragos Bogdan dragos.bogdan@analog.com Signed-off-by: Alexandru Ardelean alexandru.ardelean@analog.com Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iio/adc/ad_sigma_delta.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/iio/adc/ad_sigma_delta.c +++ b/drivers/iio/adc/ad_sigma_delta.c @@ -121,6 +121,7 @@ static int ad_sd_read_reg_raw(struct ad_ if (sigma_delta->info->has_registers) { data[0] = reg << sigma_delta->info->addr_shift; data[0] |= sigma_delta->info->read_mask; + data[0] |= sigma_delta->comm; spi_message_add_tail(&t[0], &m); } spi_message_add_tail(&t[1], &m);
From: Jean-Francois Dagenais jeff.dagenais@gmail.com
commit 06003531502d06bc89d32528f6ec96bf978790f9 upstream.
When issuing the write DAC register and write EEPROM command, the two powerdown bits (PD0 and PD1) are assumed by the chip to be present in the bytes sent. Leaving them at 0 implies "powerdown disabled", which is a different state than the current one. By adding the current powerdown state to the i2c write, the chip will power on exactly as it was at the moment of the store_eeprom call.
This is documented in MCP4725's datasheet, FIGURE 6-2: "Write Commands for DAC Input Register and EEPROM" and MCP4726's datasheet, FIGURE 6-3: "Write All Memory Command".
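For illustration, how the first command byte ends up assembled with the fix applied (userspace sketch; the ref_mode and powerdown values are hypothetical, and the bit positions follow the driver code in the diff below):

  #include <stdio.h>

  int main(void)
  {
          unsigned int ref_mode = 0;        /* hypothetical */
          unsigned int powerdown = 1;       /* device is currently powered down */
          unsigned int powerdown_mode = 2;  /* hypothetical stored mode */
          unsigned char cmd;

          cmd = 0x60;                       /* write DAC register and EEPROM */
          cmd |= ref_mode << 3;
          cmd |= powerdown ? ((powerdown_mode + 1) << 1) : 0;  /* PD1/PD0, 0 means "powerdown disabled" */

          printf("command byte = 0x%02x\n", cmd);  /* 0x66 for these values */
          return 0;
  }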
Signed-off-by: Jean-Francois Dagenais jeff.dagenais@gmail.com Acked-by: Peter Meerwald-Stadler pmeerw@pmeerw.net Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iio/dac/mcp4725.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/iio/dac/mcp4725.c +++ b/drivers/iio/dac/mcp4725.c @@ -98,6 +98,7 @@ static ssize_t mcp4725_store_eeprom(stru
inoutbuf[0] = 0x60; /* write EEPROM */ inoutbuf[0] |= data->ref_mode << 3; + inoutbuf[0] |= data->powerdown ? ((data->powerdown_mode + 1) << 1) : 0; inoutbuf[1] = data->dac_value >> 4; inoutbuf[2] = (data->dac_value & 0xf) << 4;
From: Lars-Peter Clausen lars@metafoo.de
commit 20ea39ef9f2f911bd01c69519e7d69cfec79fde3 upstream.
The trialmask is expected to have all bits set to 0 after allocation. Currently kmalloc_array() is used which does not zero the memory and so random bits are set. This results in random channels being enabled when they shouldn't. Replace kmalloc_array() with kcalloc() which has the same interface but zeros the memory.
Note the fix is actually required earlier than the below fixes tag, but will require a manual backport due to the move from kmalloc to kmalloc_array.
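The userspace analogue of the difference, as a sketch only: malloc()/kmalloc_array() return uninitialized memory, while calloc()/kcalloc() guarantee a zeroed allocation.

  #include <stdio.h>
  #include <stdlib.h>

  int main(void)
  {
          size_t n = 4;
          unsigned long *uninit = malloc(n * sizeof(*uninit));  /* like kmalloc_array() */
          unsigned long *zeroed = calloc(n, sizeof(*zeroed));   /* like kcalloc() */

          if (!uninit || !zeroed)
                  return 1;

          /* uninit[0] is indeterminate; zeroed[0] is guaranteed to be 0 */
          printf("zeroed[0] = %lu\n", zeroed[0]);

          free(uninit);
          free(zeroed);
          return 0;
  }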
Signed-off-by: Lars-Peter Clausen lars@metafoo.de Signed-off-by: Alexandru Ardelean alexandru.ardelean@analog.com Fixes commit 057ac1acdfc4 ("iio: Use kmalloc_array() in iio_scan_mask_set()"). Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iio/industrialio-buffer.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/drivers/iio/industrialio-buffer.c +++ b/drivers/iio/industrialio-buffer.c @@ -320,9 +320,8 @@ static int iio_scan_mask_set(struct iio_ const unsigned long *mask; unsigned long *trialmask;
- trialmask = kmalloc_array(BITS_TO_LONGS(indio_dev->masklength), - sizeof(*trialmask), - GFP_KERNEL); + trialmask = kcalloc(BITS_TO_LONGS(indio_dev->masklength), + sizeof(*trialmask), GFP_KERNEL); if (trialmask == NULL) return -ENOMEM; if (!indio_dev->masklength) {
From: Georg Ottinger g.ottinger@abatec.at
commit 09c6bdee51183a575bf7546890c8c137a75a2b44 upstream.
Having a brief look at at91_adc_read_raw(), it is obvious that in the case of a timeout the setting of the AT91_ADC_CHDR and AT91_ADC_IDR registers is omitted. If two different channels are queried, we can end up with a situation where two interrupts are enabled but only one interrupt is cleared in the interrupt handler, resulting in an interrupt loop and a system hang.
Signed-off-by: Georg Ottinger g.ottinger@abatec.at Acked-by: Ludovic Desroches ludovic.desroches@microchip.com Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iio/adc/at91_adc.c | 28 +++++++++++++++++----------- 1 file changed, 17 insertions(+), 11 deletions(-)
--- a/drivers/iio/adc/at91_adc.c +++ b/drivers/iio/adc/at91_adc.c @@ -704,23 +704,29 @@ static int at91_adc_read_raw(struct iio_ ret = wait_event_interruptible_timeout(st->wq_data_avail, st->done, msecs_to_jiffies(1000)); - if (ret == 0) - ret = -ETIMEDOUT; - if (ret < 0) { - mutex_unlock(&st->lock); - return ret; - } - - *val = st->last_value;
+ /* Disable interrupts, regardless if adc conversion was + * successful or not + */ at91_adc_writel(st, AT91_ADC_CHDR, AT91_ADC_CH(chan->channel)); at91_adc_writel(st, AT91_ADC_IDR, BIT(chan->channel));
- st->last_value = 0; - st->done = false; + if (ret > 0) { + /* a valid conversion took place */ + *val = st->last_value; + st->last_value = 0; + st->done = false; + ret = IIO_VAL_INT; + } else if (ret == 0) { + /* conversion timeout */ + dev_err(&idev->dev, "ADC Channel %d timeout.\n", + chan->channel); + ret = -ETIMEDOUT; + } + mutex_unlock(&st->lock); - return IIO_VAL_INT; + return ret;
case IIO_CHAN_INFO_SCALE: *val = st->vref_mv;
From: Fabrice Gasnier fabrice.gasnier@st.com
commit 7f75591fc5a123929a29636834d1bcb8b5c9fee3 upstream.
This fixes a possible circular locking dependency detected warning seen with: - CONFIG_PROVE_LOCKING=y - consumer/provider IIO devices (ex: "voltage-divider" consumer of "adc")
When using the IIO consumer interface, e.g. iio_channel_get(), the consumer device will likely call iio_read_channel_raw() or similar that rely on 'info_exist_lock' mutex.
typically:

  ...
  mutex_lock(&chan->indio_dev->info_exist_lock);
  if (chan->indio_dev->info == NULL) {
          ret = -ENODEV;
          goto err_unlock;
  }
  ret = do_some_ops();
err_unlock:
  mutex_unlock(&chan->indio_dev->info_exist_lock);
  return ret;
  ...
The same mutex is also held in iio_device_unregister().
The following deadlock warning happens when: - the consumer device has called an API like iio_read_channel_raw() at least once. - the consumer driver is unregistered, removed (unbind from sysfs)
====================================================== WARNING: possible circular locking dependency detected 4.19.24 #577 Not tainted ------------------------------------------------------ sh/372 is trying to acquire lock: (kn->count#30){++++}, at: kernfs_remove_by_name_ns+0x3c/0x84
but task is already holding lock: (&dev->info_exist_lock){+.+.}, at: iio_device_unregister+0x18/0x60
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&dev->info_exist_lock){+.+.}: __mutex_lock+0x70/0xa3c mutex_lock_nested+0x1c/0x24 iio_read_channel_raw+0x1c/0x60 iio_read_channel_info+0xa8/0xb0 dev_attr_show+0x1c/0x48 sysfs_kf_seq_show+0x84/0xec seq_read+0x154/0x528 __vfs_read+0x2c/0x15c vfs_read+0x8c/0x110 ksys_read+0x4c/0xac ret_fast_syscall+0x0/0x28 0xbedefb60
-> #0 (kn->count#30){++++}: lock_acquire+0xd8/0x268 __kernfs_remove+0x288/0x374 kernfs_remove_by_name_ns+0x3c/0x84 remove_files+0x34/0x78 sysfs_remove_group+0x40/0x9c sysfs_remove_groups+0x24/0x34 device_remove_attrs+0x38/0x64 device_del+0x11c/0x360 cdev_device_del+0x14/0x2c iio_device_unregister+0x24/0x60 release_nodes+0x1bc/0x200 device_release_driver_internal+0x1a0/0x230 unbind_store+0x80/0x130 kernfs_fop_write+0x100/0x1e4 __vfs_write+0x2c/0x160 vfs_write+0xa4/0x17c ksys_write+0x4c/0xac ret_fast_syscall+0x0/0x28 0xbe906840
other info that might help us debug this:
Possible unsafe locking scenario:
         CPU0                    CPU1
         ----                    ----
    lock(&dev->info_exist_lock);
                                 lock(kn->count#30);
                                 lock(&dev->info_exist_lock);
    lock(kn->count#30);
*** DEADLOCK *** ...
cdev_device_del() can be called without holding the lock. It should be safe, as info_exist_lock prevents kernel-space consumers from using the exported routines during/after provider removal; cdev_device_del() is for userspace.
Help to reproduce: See example: Documentation/devicetree/bindings/iio/afe/voltage-divider.txt

  sysv {
          compatible = "voltage-divider";
          io-channels = <&adc 0>;
          output-ohms = <22>;
          full-ohms = <222>;
  };
First, go to iio:deviceX for the "voltage-divider", do one read:
  $ cd /sys/bus/iio/devices/iio:deviceX
  $ cat in_voltage0_raw
Then, unbind the consumer driver. It triggers the above deadlock warning.
  $ cd /sys/bus/platform/drivers/iio-rescale/
  $ echo sysv > unbind
Note I don't actually expect stable will pick this up all the way back to when IIO was in staging, but it's probably valid that far back.
Signed-off-by: Fabrice Gasnier fabrice.gasnier@st.com Fixes: ac917a81117c ("staging:iio:core set the iio_dev.info pointer to null on unregister") Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iio/industrialio-core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/iio/industrialio-core.c +++ b/drivers/iio/industrialio-core.c @@ -1735,10 +1735,10 @@ EXPORT_SYMBOL(__iio_device_register); **/ void iio_device_unregister(struct iio_dev *indio_dev) { - mutex_lock(&indio_dev->info_exist_lock); - cdev_device_del(&indio_dev->chrdev, &indio_dev->dev);
+ mutex_lock(&indio_dev->info_exist_lock); + iio_device_unregister_debugfs(indio_dev);
iio_disable_all_buffers(indio_dev);
From: he, bo bo.he@intel.com
commit fe2d3df639a7940a125a33d6460529b9689c5406 upstream.
On some laptops, the kxcjk1013 is powered off when the system enters S3. We need to restore the range register during resume. Otherwise, the sensor doesn't work properly after S3.
Signed-off-by: he, bo bo.he@intel.com Signed-off-by: Chen, Hu hu1.chen@intel.com Reviewed-by: Hans de Goede hdegoede@redhat.com Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iio/accel/kxcjk-1013.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/iio/accel/kxcjk-1013.c +++ b/drivers/iio/accel/kxcjk-1013.c @@ -1437,6 +1437,8 @@ static int kxcjk1013_resume(struct devic
mutex_lock(&data->mutex); ret = kxcjk1013_set_mode(data, OPERATION); + if (ret == 0) + ret = kxcjk1013_set_range(data, data->range); mutex_unlock(&data->mutex);
return ret;
From: Christian Gromm christian.gromm@microchip.com
commit 131ac62253dba79daf4a6d83ab12293d2b9863d3 upstream.
This patch uses the device description to clearly identify a device attached to the bus. It is needed because the currently used mdevX notation is not sufficient when more than one network interface controller is being used at the same time.
Cc: stable@vger.kernel.org Signed-off-by: Christian Gromm christian.gromm@microchip.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/staging/most/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/staging/most/core.c +++ b/drivers/staging/most/core.c @@ -1412,7 +1412,7 @@ int most_register_interface(struct most_
INIT_LIST_HEAD(&iface->p->channel_list); iface->p->dev_id = id; - snprintf(iface->p->name, STRING_SIZE, "mdev%d", id); + strcpy(iface->p->name, iface->description); iface->dev.init_name = iface->p->name; iface->dev.bus = &mc.bus; iface->dev.parent = &mc.dev;
From: Ian Abbott abbotti@mev.co.uk
commit 08b7c2f9208f0e2a32159e4e7a4831b7adb10a3e upstream.
If `vmk80xx_auto_attach()` returns an error, the core comedi module code will call `vmk80xx_detach()` to clean up. If `vmk80xx_auto_attach()` successfully allocated the comedi device private data, `vmk80xx_detach()` assumes that a `struct semaphore limit_sem` contained in the private data has been initialized and uses it. Unfortunately, there are a couple of places where `vmk80xx_auto_attach()` can return an error after allocating the device private data but before initializing the semaphore, so this assumption is invalid. Fix it by initializing the semaphore just after allocating the private data in `vmk80xx_auto_attach()` before any other errors can be returned.
I believe this was the cause of the following syzbot crash report https://syzkaller.appspot.com/bug?extid=54c2f58f15fe6876b6ad:
usb 1-1: config 0 has no interface number 0 usb 1-1: New USB device found, idVendor=10cf, idProduct=8068, bcdDevice=e6.8d usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0 usb 1-1: config 0 descriptor?? vmk80xx 1-1:0.117: driver 'vmk80xx' failed to auto-configure device. INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.1.0-rc4-319354-g9a33b36 #3 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: usb_hub_wq hub_event Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xe8/0x16e lib/dump_stack.c:113 assign_lock_key kernel/locking/lockdep.c:786 [inline] register_lock_class+0x11b8/0x1250 kernel/locking/lockdep.c:1095 __lock_acquire+0xfb/0x37c0 kernel/locking/lockdep.c:3582 lock_acquire+0x10d/0x2f0 kernel/locking/lockdep.c:4211 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x44/0x60 kernel/locking/spinlock.c:152 down+0x12/0x80 kernel/locking/semaphore.c:58 vmk80xx_detach+0x59/0x100 drivers/staging/comedi/drivers/vmk80xx.c:829 comedi_device_detach+0xed/0x800 drivers/staging/comedi/drivers.c:204 comedi_device_cleanup.part.0+0x68/0x140 drivers/staging/comedi/comedi_fops.c:156 comedi_device_cleanup drivers/staging/comedi/comedi_fops.c:187 [inline] comedi_free_board_dev.part.0+0x16/0x90 drivers/staging/comedi/comedi_fops.c:190 comedi_free_board_dev drivers/staging/comedi/comedi_fops.c:189 [inline] comedi_release_hardware_device+0x111/0x140 drivers/staging/comedi/comedi_fops.c:2880 comedi_auto_config.cold+0x124/0x1b0 drivers/staging/comedi/drivers.c:1068 usb_probe_interface+0x31d/0x820 drivers/usb/core/driver.c:361 really_probe+0x2da/0xb10 drivers/base/dd.c:509 driver_probe_device+0x21d/0x350 drivers/base/dd.c:671 __device_attach_driver+0x1d8/0x290 drivers/base/dd.c:778 bus_for_each_drv+0x163/0x1e0 drivers/base/bus.c:454 __device_attach+0x223/0x3a0 drivers/base/dd.c:844 bus_probe_device+0x1f1/0x2a0 drivers/base/bus.c:514 device_add+0xad2/0x16e0 drivers/base/core.c:2106 usb_set_configuration+0xdf7/0x1740 drivers/usb/core/message.c:2021 generic_probe+0xa2/0xda drivers/usb/core/generic.c:210 usb_probe_device+0xc0/0x150 drivers/usb/core/driver.c:266 really_probe+0x2da/0xb10 drivers/base/dd.c:509 driver_probe_device+0x21d/0x350 drivers/base/dd.c:671 __device_attach_driver+0x1d8/0x290 drivers/base/dd.c:778 bus_for_each_drv+0x163/0x1e0 drivers/base/bus.c:454 __device_attach+0x223/0x3a0 drivers/base/dd.c:844 bus_probe_device+0x1f1/0x2a0 drivers/base/bus.c:514 device_add+0xad2/0x16e0 drivers/base/core.c:2106 usb_new_device.cold+0x537/0xccf drivers/usb/core/hub.c:2534 hub_port_connect drivers/usb/core/hub.c:5089 [inline] hub_port_connect_change drivers/usb/core/hub.c:5204 [inline] port_event drivers/usb/core/hub.c:5350 [inline] hub_event+0x138e/0x3b00 drivers/usb/core/hub.c:5432 process_one_work+0x90f/0x1580 kernel/workqueue.c:2269 worker_thread+0x9b/0xe20 kernel/workqueue.c:2415 kthread+0x313/0x420 kernel/kthread.c:253 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
Reported-by: syzbot+54c2f58f15fe6876b6ad@syzkaller.appspotmail.com Signed-off-by: Ian Abbott abbotti@mev.co.uk Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/staging/comedi/drivers/vmk80xx.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/staging/comedi/drivers/vmk80xx.c +++ b/drivers/staging/comedi/drivers/vmk80xx.c @@ -800,6 +800,8 @@ static int vmk80xx_auto_attach(struct co
devpriv->model = board->model;
+ sema_init(&devpriv->limit_sem, 8); + ret = vmk80xx_find_usb_endpoints(dev); if (ret) return ret; @@ -808,8 +810,6 @@ static int vmk80xx_auto_attach(struct co if (ret) return ret;
- sema_init(&devpriv->limit_sem, 8); - usb_set_intfdata(intf, devpriv);
if (devpriv->model == VMK8055_MODEL)
From: Ian Abbott abbotti@mev.co.uk
commit 663d294b4768bfd89e529e069bffa544a830b5bf upstream.
`vmk80xx_alloc_usb_buffers()` is called from `vmk80xx_auto_attach()` to allocate RX and TX buffers for USB transfers. It allocates `devpriv->usb_rx_buf` followed by `devpriv->usb_tx_buf`. If the allocation of `devpriv->usb_tx_buf` fails, it frees `devpriv->usb_rx_buf`, leaving the pointer set dangling, and returns an error. Later, `vmk80xx_detach()` will be called from the core comedi module code to clean up. `vmk80xx_detach()` also frees both `devpriv->usb_rx_buf` and `devpriv->usb_tx_buf`, but `devpriv->usb_rx_buf` may have already been freed, leading to a double-free error. Fix it by removing the call to `kfree(devpriv->usb_rx_buf)` from `vmk80xx_alloc_usb_buffers()`, relying on `vmk80xx_detach()` to free the memory.
Signed-off-by: Ian Abbott abbotti@mev.co.uk Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/staging/comedi/drivers/vmk80xx.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/drivers/staging/comedi/drivers/vmk80xx.c +++ b/drivers/staging/comedi/drivers/vmk80xx.c @@ -682,10 +682,8 @@ static int vmk80xx_alloc_usb_buffers(str
size = usb_endpoint_maxp(devpriv->ep_tx); devpriv->usb_tx_buf = kzalloc(size, GFP_KERNEL); - if (!devpriv->usb_tx_buf) { - kfree(devpriv->usb_rx_buf); + if (!devpriv->usb_tx_buf) return -ENOMEM; - }
return 0; }
From: Ian Abbott abbotti@mev.co.uk
commit 660cf4ce9d0f3497cc7456eaa6d74c8b71d6282c upstream.
If `ni6501_auto_attach()` returns an error, the core comedi module code will call `ni6501_detach()` to clean up. If `ni6501_auto_attach()` successfully allocated the comedi device private data, `ni6501_detach()` assumes that a `struct mutex mut` contained in the private data has been initialized and uses it. Unfortunately, there are a couple of places where `ni6501_auto_attach()` can return an error after allocating the device private data but before initializing the mutex, so this assumption is invalid. Fix it by initializing the mutex just after allocating the private data in `ni6501_auto_attach()` before any other errors can be returned. Also move the call to `usb_set_intfdata()` just to keep the code a bit neater (either position for the call is fine).
I believe this was the cause of the following syzbot crash report https://syzkaller.appspot.com/bug?extid=cf4f2b6c24aff0a3edf6:
usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0 usb 1-1: config 0 descriptor?? usb 1-1: string descriptor 0 read error: -71 comedi comedi0: Wrong number of endpoints ni6501 1-1:0.233: driver 'ni6501' failed to auto-configure device. INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 0 PID: 585 Comm: kworker/0:3 Not tainted 5.1.0-rc4-319354-g9a33b36 #3 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: usb_hub_wq hub_event Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xe8/0x16e lib/dump_stack.c:113 assign_lock_key kernel/locking/lockdep.c:786 [inline] register_lock_class+0x11b8/0x1250 kernel/locking/lockdep.c:1095 __lock_acquire+0xfb/0x37c0 kernel/locking/lockdep.c:3582 lock_acquire+0x10d/0x2f0 kernel/locking/lockdep.c:4211 __mutex_lock_common kernel/locking/mutex.c:925 [inline] __mutex_lock+0xfe/0x12b0 kernel/locking/mutex.c:1072 ni6501_detach+0x5b/0x110 drivers/staging/comedi/drivers/ni_usb6501.c:567 comedi_device_detach+0xed/0x800 drivers/staging/comedi/drivers.c:204 comedi_device_cleanup.part.0+0x68/0x140 drivers/staging/comedi/comedi_fops.c:156 comedi_device_cleanup drivers/staging/comedi/comedi_fops.c:187 [inline] comedi_free_board_dev.part.0+0x16/0x90 drivers/staging/comedi/comedi_fops.c:190 comedi_free_board_dev drivers/staging/comedi/comedi_fops.c:189 [inline] comedi_release_hardware_device+0x111/0x140 drivers/staging/comedi/comedi_fops.c:2880 comedi_auto_config.cold+0x124/0x1b0 drivers/staging/comedi/drivers.c:1068 usb_probe_interface+0x31d/0x820 drivers/usb/core/driver.c:361 really_probe+0x2da/0xb10 drivers/base/dd.c:509 driver_probe_device+0x21d/0x350 drivers/base/dd.c:671 __device_attach_driver+0x1d8/0x290 drivers/base/dd.c:778 bus_for_each_drv+0x163/0x1e0 drivers/base/bus.c:454 __device_attach+0x223/0x3a0 drivers/base/dd.c:844 bus_probe_device+0x1f1/0x2a0 drivers/base/bus.c:514 device_add+0xad2/0x16e0 drivers/base/core.c:2106 usb_set_configuration+0xdf7/0x1740 drivers/usb/core/message.c:2021 generic_probe+0xa2/0xda drivers/usb/core/generic.c:210 usb_probe_device+0xc0/0x150 drivers/usb/core/driver.c:266 really_probe+0x2da/0xb10 drivers/base/dd.c:509 driver_probe_device+0x21d/0x350 drivers/base/dd.c:671 __device_attach_driver+0x1d8/0x290 drivers/base/dd.c:778 bus_for_each_drv+0x163/0x1e0 drivers/base/bus.c:454 __device_attach+0x223/0x3a0 drivers/base/dd.c:844 bus_probe_device+0x1f1/0x2a0 drivers/base/bus.c:514 device_add+0xad2/0x16e0 drivers/base/core.c:2106 usb_new_device.cold+0x537/0xccf drivers/usb/core/hub.c:2534 hub_port_connect drivers/usb/core/hub.c:5089 [inline] hub_port_connect_change drivers/usb/core/hub.c:5204 [inline] port_event drivers/usb/core/hub.c:5350 [inline] hub_event+0x138e/0x3b00 drivers/usb/core/hub.c:5432 process_one_work+0x90f/0x1580 kernel/workqueue.c:2269 worker_thread+0x9b/0xe20 kernel/workqueue.c:2415 kthread+0x313/0x420 kernel/kthread.c:253 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
Reported-by: syzbot+cf4f2b6c24aff0a3edf6@syzkaller.appspotmail.com Signed-off-by: Ian Abbott abbotti@mev.co.uk Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/staging/comedi/drivers/ni_usb6501.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/staging/comedi/drivers/ni_usb6501.c +++ b/drivers/staging/comedi/drivers/ni_usb6501.c @@ -518,6 +518,9 @@ static int ni6501_auto_attach(struct com if (!devpriv) return -ENOMEM;
+ mutex_init(&devpriv->mut); + usb_set_intfdata(intf, devpriv); + ret = ni6501_find_endpoints(dev); if (ret) return ret; @@ -526,9 +529,6 @@ static int ni6501_auto_attach(struct com if (ret) return ret;
- mutex_init(&devpriv->mut); - usb_set_intfdata(intf, devpriv); - ret = comedi_alloc_subdevices(dev, 2); if (ret) return ret;
From: Ian Abbott abbotti@mev.co.uk
commit af4b54a2e5ba18259ff9aac445bf546dd60d037e upstream.
`ni6501_alloc_usb_buffers()` is called from `ni6501_auto_attach()` to allocate RX and TX buffers for USB transfers. It allocates `devpriv->usb_rx_buf` followed by `devpriv->usb_tx_buf`. If the allocation of `devpriv->usb_tx_buf` fails, it frees `devpriv->usb_rx_buf`, leaving the pointer set dangling, and returns an error. Later, `ni6501_detach()` will be called from the core comedi module code to clean up. `ni6501_detach()` also frees both `devpriv->usb_rx_buf` and `devpriv->usb_tx_buf`, but `devpriv->usb_rx_buf` may have already been freed, leading to a double-free error. Fix it by removing the call to `kfree(devpriv->usb_rx_buf)` from `ni6501_alloc_usb_buffers()`, relying on `ni6501_detach()` to free the memory.
Signed-off-by: Ian Abbott abbotti@mev.co.uk Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/staging/comedi/drivers/ni_usb6501.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/drivers/staging/comedi/drivers/ni_usb6501.c +++ b/drivers/staging/comedi/drivers/ni_usb6501.c @@ -463,10 +463,8 @@ static int ni6501_alloc_usb_buffers(stru
size = usb_endpoint_maxp(devpriv->ep_tx); devpriv->usb_tx_buf = kzalloc(size, GFP_KERNEL); - if (!devpriv->usb_tx_buf) { - kfree(devpriv->usb_rx_buf); + if (!devpriv->usb_tx_buf) return -ENOMEM; - }
return 0; }
From: Hui Wang hui.wang@canonical.com
commit b26e36b7ef36a8a3a147b1609b2505f8a4ecf511 upstream.
We have two Dell laptops with the codecs 10ec0236 and 10ec0256 respectively; the headset mic on them doesn't work and needs the quirk ALC255_FIXUP_DELL1_MIC_NO_PRESENCE. So add their pin configurations to the pin quirk table.
Cc: stable@vger.kernel.org Signed-off-by: Hui Wang hui.wang@canonical.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/pci/hda/patch_realtek.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -7170,6 +7170,8 @@ static const struct snd_hda_pin_quirk al {0x12, 0x90a60140}, {0x14, 0x90170150}, {0x21, 0x02211020}), + SND_HDA_PIN_QUIRK(0x10ec0236, 0x1028, "Dell", ALC255_FIXUP_DELL1_MIC_NO_PRESENCE, + {0x21, 0x02211020}), SND_HDA_PIN_QUIRK(0x10ec0255, 0x1028, "Dell", ALC255_FIXUP_DELL2_MIC_NO_PRESENCE, {0x14, 0x90170110}, {0x21, 0x02211020}), @@ -7280,6 +7282,10 @@ static const struct snd_hda_pin_quirk al {0x21, 0x0221101f}), SND_HDA_PIN_QUIRK(0x10ec0256, 0x1028, "Dell", ALC255_FIXUP_DELL1_MIC_NO_PRESENCE, ALC256_STANDARD_PINS), + SND_HDA_PIN_QUIRK(0x10ec0256, 0x1028, "Dell", ALC255_FIXUP_DELL1_MIC_NO_PRESENCE, + {0x14, 0x90170110}, + {0x1b, 0x01011020}, + {0x21, 0x0221101f}), SND_HDA_PIN_QUIRK(0x10ec0256, 0x1043, "ASUS", ALC256_FIXUP_ASUS_MIC, {0x14, 0x90170110}, {0x1b, 0x90a70130},
From: Takashi Iwai tiwai@suse.de
commit 2a3f7221acddfe1caa9ff09b3a8158c39b2fdeac upstream.
There is a small race window in the card disconnection code that allows the registration of another card with the very same card id. This leads to a warning in procfs creation as caught by syzkaller.
The problem is that we delete snd_cards and snd_cards_lock entries at the very beginning of the disconnection procedure. This makes the slot available to be assigned for another card object while the disconnection procedure is being processed. Then it becomes possible to issue a procfs registration with the existing file name although we check the conflict beforehand.
The fix is simply to move the snd_cards and snd_cards_lock clearances to the end of the disconnection procedure. The references to these entries come only from the global proc files like /proc/asound/cards or from card registration / disconnection, so it should be fine to shift them to the very end.
Reported-by: syzbot+48df349490c36f9f54ab@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/core/init.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
--- a/sound/core/init.c +++ b/sound/core/init.c @@ -407,14 +407,7 @@ int snd_card_disconnect(struct snd_card card->shutdown = 1; spin_unlock(&card->files_lock);
- /* phase 1: disable fops (user space) operations for ALSA API */ - mutex_lock(&snd_card_mutex); - snd_cards[card->number] = NULL; - clear_bit(card->number, snd_cards_lock); - mutex_unlock(&snd_card_mutex); - - /* phase 2: replace file->f_op with special dummy operations */ - + /* replace file->f_op with special dummy operations */ spin_lock(&card->files_lock); list_for_each_entry(mfile, &card->files_list, list) { /* it's critical part, use endless loop */ @@ -430,7 +423,7 @@ int snd_card_disconnect(struct snd_card } spin_unlock(&card->files_lock);
- /* phase 3: notify all connected devices about disconnection */ + /* notify all connected devices about disconnection */ /* at this point, they cannot respond to any calls except release() */
#if IS_ENABLED(CONFIG_SND_MIXER_OSS) @@ -446,6 +439,13 @@ int snd_card_disconnect(struct snd_card device_del(&card->card_dev); card->registered = false; } + + /* disable fops (user space) operations for ALSA API */ + mutex_lock(&snd_card_mutex); + snd_cards[card->number] = NULL; + clear_bit(card->number, snd_cards_lock); + mutex_unlock(&snd_card_mutex); + #ifdef CONFIG_PM wake_up(&card->power_sleep); #endif
From: KT Liao kt.liao@emc.com.tw
commit 738c06d0e4562e0acf9f2c7438a22b2d5afc67aa upstream.
There are many Lenovo laptops which need elan_i2c support; this patch adds the relevant IDs to the Elan driver so that the touchpads are recognized.
Signed-off-by: KT Liao kt.liao@emc.com.tw Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/input/mouse/elan_i2c_core.c | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+)
--- a/drivers/input/mouse/elan_i2c_core.c +++ b/drivers/input/mouse/elan_i2c_core.c @@ -1339,21 +1339,46 @@ static const struct acpi_device_id elan_ { "ELAN0600", 0 }, { "ELAN0601", 0 }, { "ELAN0602", 0 }, + { "ELAN0603", 0 }, + { "ELAN0604", 0 }, { "ELAN0605", 0 }, + { "ELAN0606", 0 }, + { "ELAN0607", 0 }, { "ELAN0608", 0 }, { "ELAN0609", 0 }, { "ELAN060B", 0 }, { "ELAN060C", 0 }, + { "ELAN060F", 0 }, + { "ELAN0610", 0 }, { "ELAN0611", 0 }, { "ELAN0612", 0 }, + { "ELAN0615", 0 }, + { "ELAN0616", 0 }, { "ELAN0617", 0 }, { "ELAN0618", 0 }, + { "ELAN0619", 0 }, + { "ELAN061A", 0 }, + { "ELAN061B", 0 }, { "ELAN061C", 0 }, { "ELAN061D", 0 }, { "ELAN061E", 0 }, + { "ELAN061F", 0 }, { "ELAN0620", 0 }, { "ELAN0621", 0 }, { "ELAN0622", 0 }, + { "ELAN0623", 0 }, + { "ELAN0624", 0 }, + { "ELAN0625", 0 }, + { "ELAN0626", 0 }, + { "ELAN0627", 0 }, + { "ELAN0628", 0 }, + { "ELAN0629", 0 }, + { "ELAN062A", 0 }, + { "ELAN062B", 0 }, + { "ELAN062C", 0 }, + { "ELAN062D", 0 }, + { "ELAN0631", 0 }, + { "ELAN0632", 0 }, { "ELAN1000", 0 }, { } };
From: Geert Uytterhoeven geert+renesas@glider.be
commit 6b87784b53592a90d21576be8eff688b56d93cce upstream.
The calculation of the sampling point has min() and max() exchanged. Fix this by using the clamp() helper instead.
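A minimal sketch of why the original expression can never yield anything but -8 (the helpers below are simplified stand-ins for the kernel's min()/max()/clamp() macros):

  #include <stdio.h>

  #define MIN(a, b)         ((a) < (b) ? (a) : (b))
  #define MAX(a, b)         ((a) > (b) ? (a) : (b))
  #define CLAMP(v, lo, hi)  MIN(MAX(v, lo), hi)

  int main(void)
  {
          int deviation;

          for (deviation = -20; deviation <= 20; deviation += 10) {
                  int old_shift = MIN(-8, MAX(7, deviation / 2));  /* always -8, since MAX(7, x) >= 7 */
                  int new_shift = CLAMP(deviation / 2, -8, 7);     /* properly bounded to [-8, 7] */

                  printf("deviation/2 = %3d  old = %3d  new = %3d\n",
                         deviation / 2, old_shift, new_shift);
          }
          return 0;
  }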
Fixes: 63ba1e00f178a448 ("serial: sh-sci: Support for HSCIF RX sampling point adjustment") Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Reviewed-by: Ulrich Hecht uli+renesas@fpond.eu Reviewed-by: Wolfram Sang wsa+renesas@sang-engineering.com Acked-by: Dirk Behme dirk.behme@de.bosch.com Cc: stable stable@vger.kernel.org Reviewed-by: Simon Horman horms+renesas@verge.net.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/tty/serial/sh-sci.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/tty/serial/sh-sci.c +++ b/drivers/tty/serial/sh-sci.c @@ -2504,7 +2504,7 @@ done: * last stop bit; we can increase the error * margin by shifting the sampling point. */ - int shift = min(-8, max(7, deviation / 2)); + int shift = clamp(deviation / 2, -8, 7);
hssrr |= (shift << HSCIF_SRHP_SHIFT) & HSCIF_SRHP_MASK;
From: Geert Uytterhoeven geert+renesas@glider.be
commit ace965696da2611af759f0284e26342b7b6cec89 upstream.
There are several issues with the formula used for calculating the deviation from the intended rate:
1. While min_err and last_stop are signed, srr and baud are unsigned. Hence the signed values are promoted to unsigned, which will lead to a bogus value of deviation if min_err is negative,
2. srr is the register field value, which is one less than the actual sampling rate factor,
3. The divisions do not use rounding.
Fix this by casting unsigned variables to int, adding one to srr, and using a single DIV_ROUND_CLOSEST().
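A userspace sketch of the signed/unsigned promotion problem with made-up numbers (the min_err, srr and baud values below are hypothetical; only the arithmetic matters):

  #include <stdio.h>

  int main(void)
  {
          int min_err = -3;          /* signed, can be negative */
          unsigned int srr = 14;     /* register value = sampling rate factor - 1 */
          unsigned int baud = 115200;
          int last_stop = 19;        /* bits * 2 - 1 */

          /* old: min_err is promoted to unsigned, so the result is bogus */
          int broken = min_err * srr * last_stop / 2 / baud;

          /* new: all-signed, uses srr + 1; rounding is left out of this sketch */
          int fixed = min_err * last_stop * (int)(srr + 1) / (2 * (int)baud);

          printf("broken = %d, fixed = %d\n", broken, fixed);  /* e.g. 18641 vs 0 */
          return 0;
  }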
Fixes: 63ba1e00f178a448 ("serial: sh-sci: Support for HSCIF RX sampling point adjustment") Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Reviewed-by: Mukesh Ojha mojha@codeaurora.org Cc: stable stable@vger.kernel.org Reviewed-by: Ulrich Hecht uli+renesas@fpond.eu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/tty/serial/sh-sci.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/tty/serial/sh-sci.c +++ b/drivers/tty/serial/sh-sci.c @@ -2497,7 +2497,9 @@ done: * center of the last stop bit in sampling clocks. */ int last_stop = bits * 2 - 1; - int deviation = min_err * srr * last_stop / 2 / baud; + int deviation = DIV_ROUND_CLOSEST(min_err * last_stop * + (int)(srr + 1), + 2 * (int)baud);
if (abs(deviation) >= 2) { /* At least two sampling clocks off at the
From: Mikulas Patocka mpatocka@redhat.com
commit b2ecf00631362a83744e5ec249947620db5e240c upstream.
The patch a6dbe4427559 ("vt: perform safe console erase in the right order") introduced a bug. The conditional do_update_region() was replaced by a call to update_region() that does contain the conditional already, but with unwanted extra side effects such as restoring the cursor drawing.
In order to reproduce the bug:
- use framebuffer console with the AMDGPU driver
- type "links" to start the console www browser
- press 'q' and space to exit links
Now the cursor will be permanently visible in the center of the screen. It will stay there until something overwrites it.
The bug goes away if we change update_region() back to the conditional do_update_region().
[ nico: reworded changelog ]
Signed-off-by: Mikulas Patocka mpatocka@redhat.com Reviewed-by: Nicolas Pitre nico@fluxnic.net Cc: stable@vger.kernel.org Fixes: a6dbe4427559 ("vt: perform safe console erase in the right order") Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/tty/vt/vt.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/tty/vt/vt.c +++ b/drivers/tty/vt/vt.c @@ -1521,7 +1521,8 @@ static void csi_J(struct vc_data *vc, in return; } scr_memsetw(start, vc->vc_video_erase_char, 2 * count); - update_region(vc, (unsigned long) start, count); + if (con_should_update(vc)) + do_update_region(vc, (unsigned long) start, count); vc->vc_need_wrap = 0; }
From: Jaesoo Lee jalee@purestorage.com
commit be549d49115422f846b6d96ee8fd7173a5f7ceb0 upstream.
When SCSI blk-mq is enabled, there is a bug in handling errors in scsi_queue_rq: the result field of scsi_request is not set correctly when dispatch of the command fails. Since the upper layer code, including the sg_io ioctl, expects to receive any error status from the result field of scsi_request, the error is silently ignored, and this could cause data corruption for some applications.
Fixes: d285203cf647 ("scsi: add support for a blk-mq based I/O path.") Cc: stable@vger.kernel.org Signed-off-by: Jaesoo Lee jalee@purestorage.com Reviewed-by: Hannes Reinecke hare@suse.com Reviewed-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/scsi_lib.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -2149,8 +2149,12 @@ out_put_budget: ret = BLK_STS_DEV_RESOURCE; break; default: + if (unlikely(!scsi_device_online(sdev))) + scsi_req(req)->result = DID_NO_CONNECT << 16; + else + scsi_req(req)->result = DID_ERROR << 16; /* - * Make sure to release all allocated ressources when + * Make sure to release all allocated resources when * we hit an error, as we will never see this command * again. */
From: Saurav Kashyap skashyap@marvell.com
commit 0228034d8e5915b98c33db35a98f5e909e848ae9 upstream.
The reverted patch cleared the FC_RP_STARTED flag during logoff, and because of this re-login (FLOGI) to the switch did not happen.
This reverts commit 1550ec458e0cf1a40a170ab1f4c46e3f52860f65.
Fixes: 1550ec458e0c ("scsi: fcoe: clear FC_RP_STARTED flags when receiving a LOGO") Cc: stable@vger.kernel.org # v4.18+ Signed-off-by: Saurav Kashyap skashyap@marvell.com Reviewed-by: Hannes Reinecke hare@suse.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/libfc/fc_rport.c | 1 - 1 file changed, 1 deletion(-)
--- a/drivers/scsi/libfc/fc_rport.c +++ b/drivers/scsi/libfc/fc_rport.c @@ -2153,7 +2153,6 @@ static void fc_rport_recv_logo_req(struc FC_RPORT_DBG(rdata, "Received LOGO request while in state %s\n", fc_rport_state(rdata));
- rdata->flags &= ~FC_RP_STARTED; fc_rport_enter_delete(rdata, RPORT_EV_STOP); mutex_unlock(&rdata->rp_mutex); kref_put(&rdata->kref, fc_rport_destroy);
From: Suthikulpanit, Suravee Suravee.Suthikulpanit@amd.com
commit 4a58038b9e420276157785afa0a0bbb4b9bc2265 upstream.
This reverts commit bb218fbcfaaa3b115d4cd7a43c0ca164f3a96e57.
As Oren Twaig pointed out in the old discussion:
https://patchwork.kernel.org/patch/8292231/
the change could potentially cause an extra IPI to be sent to the destination vcpu, because the AVIC hardware already set the IRR bit before the incomplete IPI #VMEXIT with id=1 (target vcpu is not running), and writing to ICR and ICR2 during the emulation sets the IRR again. If something triggers the destination vcpu to get scheduled before the emulation finishes, this could result in an additional IPI.
Also, the issue mentioned in the commit bb218fbcfaaa was misdiagnosed.
Cc: Radim Krčmář rkrcmar@redhat.com Cc: Paolo Bonzini pbonzini@redhat.com Reported-by: Oren Twaig oren@scalemp.com Signed-off-by: Suravee Suthikulpanit suravee.suthikulpanit@amd.com Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/svm.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-)
--- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -4496,14 +4496,25 @@ static int avic_incomplete_ipi_intercept kvm_lapic_reg_write(apic, APIC_ICR, icrl); break; case AVIC_IPI_FAILURE_TARGET_NOT_RUNNING: { + int i; + struct kvm_vcpu *vcpu; + struct kvm *kvm = svm->vcpu.kvm; struct kvm_lapic *apic = svm->vcpu.arch.apic;
/* - * Update ICR high and low, then emulate sending IPI, - * which is handled when writing APIC_ICR. + * At this point, we expect that the AVIC HW has already + * set the appropriate IRR bits on the valid target + * vcpus. So, we just need to kick the appropriate vcpu. */ - kvm_lapic_reg_write(apic, APIC_ICR2, icrh); - kvm_lapic_reg_write(apic, APIC_ICR, icrl); + kvm_for_each_vcpu(i, vcpu, kvm) { + bool m = kvm_apic_match_dest(vcpu, apic, + icrl & KVM_APIC_SHORT_MASK, + GET_APIC_DEST_FIELD(icrh), + icrl & KVM_APIC_DEST_MASK); + + if (m && !avic_vcpu_is_running(vcpu)) + kvm_vcpu_wake_up(vcpu); + } break; } case AVIC_IPI_FAILURE_INVALID_TARGET:
From: Andrea Arcangeli aarcange@redhat.com
commit 04f5866e41fb70690e28397487d8bd8eea7d712a upstream.
The core dumping code has always run without holding the mmap_sem for writing, despite that being the only way to ensure that the entire vma layout will not change from under it. Only using some signal serialization on the processes belonging to the mm is not nearly enough. This was pointed out earlier, for example in Hugh's post from Jul 2017:
https://lkml.kernel.org/r/alpine.LSU.2.11.1707191716030.2055@eggly.anvils
"Not strictly relevant here, but a related note: I was very surprised to discover, only quite recently, how handle_mm_fault() may be called without down_read(mmap_sem) - when core dumping. That seems a misguided optimization to me, which would also be nice to correct"
In particular, because growsdown and growsup can move vm_start/vm_end, the various loops the core dump does around the vmas will not be consistent if page faults can happen concurrently.
Pretty much all users calling mmget_not_zero()/get_task_mm() and then taking the mmap_sem had the potential to introduce unexpected side effects in the core dumping code.
Adding mmap_sem for writing around the ->core_dump invocation is a viable long term fix, but it requires removing all copy-user and page-fault accesses and replacing them with get_dump_page() for all binary formats, which is not suitable as a short term fix.
For the time being this solution manually covers the places that can confuse the core dump either by altering the vma layout or the vma flags while it runs. Once ->core_dump runs under mmap_sem for writing the function mmget_still_valid() can be dropped.
Allowing mmap_sem protected sections to run in parallel with the coredump provides some minor parallelism advantage to the swapoff code (which seems to be safe enough, as it never mangles any vma field and can keep doing swapins in parallel to the core dumping) and to some other corner cases.
In order to facilitate the backporting I added "Fixes: 86039bd3b4e6"; however, the side effect of this same race condition in /proc/pid/mem should be reproducible since before 2.6.12-rc2, so I couldn't add any other "Fixes:" because there's no hash beyond the git genesis commit.
Because find_extend_vma() is the only location outside of the process context that could modify the "mm" structures under mmap_sem for reading, by adding the mmget_still_valid() check to it, all other cases that take the mmap_sem for reading don't need the new check after mmget_not_zero()/get_task_mm(). The expand_stack() in page fault context also doesn't need the new check, because all tasks under core dumping are frozen.
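As a rough illustration of the rule above, a hypothetical driver that grabs a remote mm and then takes the mmap_sem for writing would follow this pattern (sketch only; everything outside the mm API is made up):

#include <linux/mm.h>
#include <linux/sched/mm.h>

/* Hypothetical helper following the rule described above. */
static void example_mangle_vmas(struct task_struct *task)
{
    struct mm_struct *mm = get_task_mm(task);

    if (!mm)
        return;

    down_write(&mm->mmap_sem);
    if (!mmget_still_valid(mm)) {
        /* A coredump is running; do not touch the vma layout. */
        up_write(&mm->mmap_sem);
        mmput(mm);
        return;
    }

    /* ... safe to modify vmas or vma->vm_flags here ... */

    up_write(&mm->mmap_sem);
    mmput(mm);
}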
Link: http://lkml.kernel.org/r/20190325224949.11068-1-aarcange@redhat.com Fixes: 86039bd3b4e6 ("userfaultfd: add new syscall to provide memory externalization") Signed-off-by: Andrea Arcangeli aarcange@redhat.com Reported-by: Jann Horn jannh@google.com Suggested-by: Oleg Nesterov oleg@redhat.com Acked-by: Peter Xu peterx@redhat.com Reviewed-by: Mike Rapoport rppt@linux.ibm.com Reviewed-by: Oleg Nesterov oleg@redhat.com Reviewed-by: Jann Horn jannh@google.com Acked-by: Jason Gunthorpe jgg@mellanox.com Acked-by: Michal Hocko mhocko@suse.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/proc/task_mmu.c | 18 ++++++++++++++++++ fs/userfaultfd.c | 9 +++++++++ include/linux/sched/mm.h | 21 +++++++++++++++++++++ mm/mmap.c | 7 ++++++- 4 files changed, 54 insertions(+), 1 deletion(-)
--- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -1138,6 +1138,24 @@ static ssize_t clear_refs_write(struct f count = -EINTR; goto out_mm; } + /* + * Avoid to modify vma->vm_flags + * without locked ops while the + * coredump reads the vm_flags. + */ + if (!mmget_still_valid(mm)) { + /* + * Silently return "count" + * like if get_task_mm() + * failed. FIXME: should this + * function have returned + * -ESRCH if get_task_mm() + * failed like if + * get_proc_task() fails? + */ + up_write(&mm->mmap_sem); + goto out_mm; + } for (vma = mm->mmap; vma; vma = vma->vm_next) { vma->vm_flags &= ~VM_SOFTDIRTY; vma_set_page_prot(vma); --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -630,6 +630,8 @@ static void userfaultfd_event_wait_compl
/* the various vma->vm_userfaultfd_ctx still points to it */ down_write(&mm->mmap_sem); + /* no task can run (and in turn coredump) yet */ + VM_WARN_ON(!mmget_still_valid(mm)); for (vma = mm->mmap; vma; vma = vma->vm_next) if (vma->vm_userfaultfd_ctx.ctx == release_new_ctx) { vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX; @@ -884,6 +886,8 @@ static int userfaultfd_release(struct in * taking the mmap_sem for writing. */ down_write(&mm->mmap_sem); + if (!mmget_still_valid(mm)) + goto skip_mm; prev = NULL; for (vma = mm->mmap; vma; vma = vma->vm_next) { cond_resched(); @@ -906,6 +910,7 @@ static int userfaultfd_release(struct in vma->vm_flags = new_flags; vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX; } +skip_mm: up_write(&mm->mmap_sem); mmput(mm); wakeup: @@ -1334,6 +1339,8 @@ static int userfaultfd_register(struct u goto out;
down_write(&mm->mmap_sem); + if (!mmget_still_valid(mm)) + goto out_unlock; vma = find_vma_prev(mm, start, &prev); if (!vma) goto out_unlock; @@ -1521,6 +1528,8 @@ static int userfaultfd_unregister(struct goto out;
down_write(&mm->mmap_sem); + if (!mmget_still_valid(mm)) + goto out_unlock; vma = find_vma_prev(mm, start, &prev); if (!vma) goto out_unlock; --- a/include/linux/sched/mm.h +++ b/include/linux/sched/mm.h @@ -49,6 +49,27 @@ static inline void mmdrop(struct mm_stru __mmdrop(mm); }
+/* + * This has to be called after a get_task_mm()/mmget_not_zero() + * followed by taking the mmap_sem for writing before modifying the + * vmas or anything the coredump pretends not to change from under it. + * + * NOTE: find_extend_vma() called from GUP context is the only place + * that can modify the "mm" (notably the vm_start/end) under mmap_sem + * for reading and outside the context of the process, so it is also + * the only case that holds the mmap_sem for reading that must call + * this function. Generally if the mmap_sem is hold for reading + * there's no need of this check after get_task_mm()/mmget_not_zero(). + * + * This function can be obsoleted and the check can be removed, after + * the coredump code will hold the mmap_sem for writing before + * invoking the ->core_dump methods. + */ +static inline bool mmget_still_valid(struct mm_struct *mm) +{ + return likely(!mm->core_state); +} + /** * mmget() - Pin the address space associated with a &struct mm_struct. * @mm: The address space to pin. --- a/mm/mmap.c +++ b/mm/mmap.c @@ -45,6 +45,7 @@ #include <linux/moduleparam.h> #include <linux/pkeys.h> #include <linux/oom.h> +#include <linux/sched/mm.h>
#include <linux/uaccess.h> #include <asm/cacheflush.h> @@ -2491,7 +2492,8 @@ find_extend_vma(struct mm_struct *mm, un vma = find_vma_prev(mm, addr, &prev); if (vma && (vma->vm_start <= addr)) return vma; - if (!prev || expand_stack(prev, addr)) + /* don't alter vm_end if the coredump is running */ + if (!prev || !mmget_still_valid(mm) || expand_stack(prev, addr)) return NULL; if (prev->vm_flags & VM_LOCKED) populate_vma_page_range(prev, addr, prev->vm_end, NULL); @@ -2517,6 +2519,9 @@ find_extend_vma(struct mm_struct *mm, un return vma; if (!(vma->vm_flags & VM_GROWSDOWN)) return NULL; + /* don't alter vm_start if the coredump is running */ + if (!mmget_still_valid(mm)) + return NULL; start = vma->vm_start; if (expand_stack(vma, addr)) return NULL;
From: Corey Minyard cminyard@mvista.com
commit 3b9a907223d7f6b9d1dadea29436842ae9bcd76d upstream.
free_user() could be called in atomic context.
This patch pushes the free operation off into a workqueue.
Example:
BUG: sleeping function called from invalid context at kernel/workqueue.c:2856
in_atomic(): 1, irqs_disabled(): 0, pid: 177, name: ksoftirqd/27
CPU: 27 PID: 177 Comm: ksoftirqd/27 Not tainted 4.19.25-3 #1
Hardware name: AIC 1S-HV26-08/MB-DPSB04-06, BIOS IVYBV060 10/21/2015
Call Trace:
 dump_stack+0x5c/0x7b
 ___might_sleep+0xec/0x110
 __flush_work+0x48/0x1f0
 ? try_to_del_timer_sync+0x4d/0x80
 _cleanup_srcu_struct+0x104/0x140
 free_user+0x18/0x30 [ipmi_msghandler]
 ipmi_free_recv_msg+0x3a/0x50 [ipmi_msghandler]
 deliver_response+0xbd/0xd0 [ipmi_msghandler]
 deliver_local_response+0xe/0x30 [ipmi_msghandler]
 handle_one_recv_msg+0x163/0xc80 [ipmi_msghandler]
 ? dequeue_entity+0xa0/0x960
 handle_new_recv_msgs+0x15c/0x1f0 [ipmi_msghandler]
 tasklet_action_common.isra.22+0x103/0x120
 __do_softirq+0xf8/0x2d7
 run_ksoftirqd+0x26/0x50
 smpboot_thread_fn+0x11d/0x1e0
 kthread+0x103/0x140
 ? sort_range+0x20/0x20
 ? kthread_destroy_worker+0x40/0x40
 ret_from_fork+0x1f/0x40
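The fix follows the common kernel idiom of deferring cleanup that may sleep from atomic context to a workqueue. A stripped-down sketch of that idiom (names are illustrative, not the actual ipmi_msghandler code) looks like this:

#include <linux/slab.h>
#include <linux/srcu.h>
#include <linux/workqueue.h>

struct example_user {
    struct srcu_struct barrier;
    struct work_struct remove_work;   /* INIT_WORK(..., example_free_work) at allocation time */
};

static void example_free_work(struct work_struct *work)
{
    struct example_user *user =
        container_of(work, struct example_user, remove_work);

    /* Runs in process context, so sleeping here is fine. */
    cleanup_srcu_struct(&user->barrier);
    kfree(user);
}

/* May be called from softirq/atomic context (e.g. a kref release). */
static void example_release(struct example_user *user)
{
    schedule_work(&user->remove_work);
}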
Fixes: 77f8269606bf ("ipmi: fix use-after-free of user->release_barrier.rda")
Reported-by: Konstantin Khlebnikov khlebnikov@yandex-team.ru Signed-off-by: Corey Minyard cminyard@mvista.com Cc: stable@vger.kernel.org # 5.0 Cc: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/char/ipmi/ipmi_msghandler.c | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-)
--- a/drivers/char/ipmi/ipmi_msghandler.c +++ b/drivers/char/ipmi/ipmi_msghandler.c @@ -213,6 +213,9 @@ struct ipmi_user {
/* Does this interface receive IPMI events? */ bool gets_events; + + /* Free must run in process context for RCU cleanup. */ + struct work_struct remove_work; };
static struct ipmi_user *acquire_ipmi_user(struct ipmi_user *user, int *index) @@ -1078,6 +1081,15 @@ static int intf_err_seq(struct ipmi_smi }
+static void free_user_work(struct work_struct *work) +{ + struct ipmi_user *user = container_of(work, struct ipmi_user, + remove_work); + + cleanup_srcu_struct(&user->release_barrier); + kfree(user); +} + int ipmi_create_user(unsigned int if_num, const struct ipmi_user_hndl *handler, void *handler_data, @@ -1121,6 +1133,8 @@ int ipmi_create_user(unsigned int goto out_kfree;
found: + INIT_WORK(&new_user->remove_work, free_user_work); + rv = init_srcu_struct(&new_user->release_barrier); if (rv) goto out_kfree; @@ -1183,8 +1197,9 @@ EXPORT_SYMBOL(ipmi_get_smi_info); static void free_user(struct kref *ref) { struct ipmi_user *user = container_of(ref, struct ipmi_user, refcount); - cleanup_srcu_struct(&user->release_barrier); - kfree(user); + + /* SRCU cleanup must happen in task context. */ + schedule_work(&user->remove_work); }
static void _ipmi_destroy_user(struct ipmi_user *user)
From: Eric Biggers ebiggers@google.com
commit 678cce4019d746da6c680c48ba9e6d417803e127 upstream.
The x86_64 implementation of Poly1305 produces the wrong result on some inputs because poly1305_4block_avx2() incorrectly assumes that when partially reducing the accumulator, the bits carried from limb 'd4' to limb 'h0' fit in a 32-bit integer. This is true for poly1305-generic which processes only one block at a time. However, it's not true for the AVX2 implementation, which processes 4 blocks at a time and therefore can produce intermediate limbs about 4x larger.
Fix it by making the relevant calculations use 64-bit arithmetic rather than 32-bit. Note that most of the carries already used 64-bit arithmetic, but the d4 -> h0 carry was different for some reason.
To be safe I also made the same change to the corresponding SSE2 code, though that only operates on 1 or 2 blocks at a time. I don't think it's really needed for poly1305_block_sse2(), but it doesn't hurt because it's already x86_64 code. It *might* be needed for poly1305_2block_sse2(), but overflows aren't easy to reproduce there.
This bug was originally detected by my patches that improve testmgr to fuzz algorithms against their generic implementation. But also add a test vector which reproduces it directly (in the AVX2 case).
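The bug class is plain 32-bit truncation of a carry. The following small C sketch illustrates it; the value of d4 is arbitrary, chosen only to make the truncation visible, and is not a bound derived from the Poly1305 code:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t d4 = 1ULL << 58;   /* arbitrary large limb value */

    /* Old style: the shifted carry is kept in a 32-bit register and truncated. */
    uint32_t carry32 = (uint32_t)(d4 >> 26) * 5;

    /* New style: the carry stays in a 64-bit register, no truncation. */
    uint64_t carry64 = (d4 >> 26) * 5;

    printf("32-bit carry: %u\n", (unsigned)carry32);
    printf("64-bit carry: %llu\n", (unsigned long long)carry64);
    return 0;
}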
Fixes: b1ccc8f4b631 ("crypto: poly1305 - Add a four block AVX2 variant for x86_64") Fixes: c70f4abef07a ("crypto: poly1305 - Add a SSE2 SIMD variant for x86_64") Cc: stable@vger.kernel.org # v4.3+ Cc: Martin Willi martin@strongswan.org Cc: Jason A. Donenfeld Jason@zx2c4.com Signed-off-by: Eric Biggers ebiggers@google.com Reviewed-by: Martin Willi martin@strongswan.org Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/crypto/poly1305-avx2-x86_64.S | 14 +++++++--- arch/x86/crypto/poly1305-sse2-x86_64.S | 22 ++++++++++------ crypto/testmgr.h | 44 ++++++++++++++++++++++++++++++++- 3 files changed, 67 insertions(+), 13 deletions(-)
--- a/arch/x86/crypto/poly1305-avx2-x86_64.S +++ b/arch/x86/crypto/poly1305-avx2-x86_64.S @@ -323,6 +323,12 @@ ENTRY(poly1305_4block_avx2) vpaddq t2,t1,t1 vmovq t1x,d4
+ # Now do a partial reduction mod (2^130)-5, carrying h0 -> h1 -> h2 -> + # h3 -> h4 -> h0 -> h1 to get h0,h2,h3,h4 < 2^26 and h1 < 2^26 + a small + # amount. Careful: we must not assume the carry bits 'd0 >> 26', + # 'd1 >> 26', 'd2 >> 26', 'd3 >> 26', and '(d4 >> 26) * 5' fit in 32-bit + # integers. It's true in a single-block implementation, but not here. + # d1 += d0 >> 26 mov d0,%rax shr $26,%rax @@ -361,16 +367,16 @@ ENTRY(poly1305_4block_avx2) # h0 += (d4 >> 26) * 5 mov d4,%rax shr $26,%rax - lea (%eax,%eax,4),%eax - add %eax,%ebx + lea (%rax,%rax,4),%rax + add %rax,%rbx # h4 = d4 & 0x3ffffff mov d4,%rax and $0x3ffffff,%eax mov %eax,h4
# h1 += h0 >> 26 - mov %ebx,%eax - shr $26,%eax + mov %rbx,%rax + shr $26,%rax add %eax,h1 # h0 = h0 & 0x3ffffff andl $0x3ffffff,%ebx --- a/arch/x86/crypto/poly1305-sse2-x86_64.S +++ b/arch/x86/crypto/poly1305-sse2-x86_64.S @@ -253,16 +253,16 @@ ENTRY(poly1305_block_sse2) # h0 += (d4 >> 26) * 5 mov d4,%rax shr $26,%rax - lea (%eax,%eax,4),%eax - add %eax,%ebx + lea (%rax,%rax,4),%rax + add %rax,%rbx # h4 = d4 & 0x3ffffff mov d4,%rax and $0x3ffffff,%eax mov %eax,h4
# h1 += h0 >> 26 - mov %ebx,%eax - shr $26,%eax + mov %rbx,%rax + shr $26,%rax add %eax,h1 # h0 = h0 & 0x3ffffff andl $0x3ffffff,%ebx @@ -520,6 +520,12 @@ ENTRY(poly1305_2block_sse2) paddq t2,t1 movq t1,d4
+ # Now do a partial reduction mod (2^130)-5, carrying h0 -> h1 -> h2 -> + # h3 -> h4 -> h0 -> h1 to get h0,h2,h3,h4 < 2^26 and h1 < 2^26 + a small + # amount. Careful: we must not assume the carry bits 'd0 >> 26', + # 'd1 >> 26', 'd2 >> 26', 'd3 >> 26', and '(d4 >> 26) * 5' fit in 32-bit + # integers. It's true in a single-block implementation, but not here. + # d1 += d0 >> 26 mov d0,%rax shr $26,%rax @@ -558,16 +564,16 @@ ENTRY(poly1305_2block_sse2) # h0 += (d4 >> 26) * 5 mov d4,%rax shr $26,%rax - lea (%eax,%eax,4),%eax - add %eax,%ebx + lea (%rax,%rax,4),%rax + add %rax,%rbx # h4 = d4 & 0x3ffffff mov d4,%rax and $0x3ffffff,%eax mov %eax,h4
# h1 += h0 >> 26 - mov %ebx,%eax - shr $26,%eax + mov %rbx,%rax + shr $26,%rax add %eax,h1 # h0 = h0 & 0x3ffffff andl $0x3ffffff,%ebx --- a/crypto/testmgr.h +++ b/crypto/testmgr.h @@ -5592,7 +5592,49 @@ static const struct hash_testvec poly130 .psize = 80, .digest = "\x13\x00\x00\x00\x00\x00\x00\x00" "\x00\x00\x00\x00\x00\x00\x00\x00", - }, + }, { /* Regression test for overflow in AVX2 implementation */ + .plaintext = "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff", + .psize = 300, + .digest = "\xfb\x5e\x96\xd8\x61\xd5\xc7\xc8" + "\x78\xe5\x87\xcc\x2d\x5a\x22\xe1", + } };
/*
From: Christian König christian.koenig@amd.com
commit a66477b0efe511d98dde3e4aaeb189790e6f0a39 upstream.
When ttm_put_pages() tries to figure out whether it's dealing with transparent hugepages, it just reads past the bounds of the pages array without a check.
v2: simplify the test if enough pages are left in the array (Christian).
Signed-off-by: Jann Horn jannh@google.com Signed-off-by: Christian König christian.koenig@amd.com Fixes: 5c42c64f7d54 ("drm/ttm: fix the fix for huge compound pages") Cc: stable@vger.kernel.org Reviewed-by: Michel Dänzer michel.daenzer@amd.com Reviewed-by: Junwei Zhang Jerry.Zhang@amd.com Reviewed-by: Huang Rui ray.huang@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/ttm/ttm_page_alloc.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c +++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c @@ -730,7 +730,8 @@ static void ttm_put_pages(struct page ** }
#ifdef CONFIG_TRANSPARENT_HUGEPAGE - if (!(flags & TTM_PAGE_FLAG_DMA32)) { + if (!(flags & TTM_PAGE_FLAG_DMA32) && + (npages - i) >= HPAGE_PMD_NR) { for (j = 0; j < HPAGE_PMD_NR; ++j) if (p++ != pages[i + j]) break; @@ -759,7 +760,7 @@ static void ttm_put_pages(struct page ** unsigned max_size, n2free;
spin_lock_irqsave(&huge->lock, irq_flags); - while (i < npages) { + while ((npages - i) >= HPAGE_PMD_NR) { struct page *p = pages[i]; unsigned j;
From: Nathan Chancellor natechancellor@gmail.com
commit ff8acf929014b7f87315588e0daf8597c8aa9d1c upstream.
Commit 045afc24124d ("arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result value") removed oldval's zero initialization in arch_futex_atomic_op_inuser because it is not necessary. Unfortunately, Android's arm64 GCC 4.9.4 [1] does not agree:
../kernel/futex.c: In function 'do_futex':
../kernel/futex.c:1658:17: warning: 'oldval' may be used uninitialized in this function [-Wmaybe-uninitialized]
   return oldval == cmparg;
                 ^
In file included from ../kernel/futex.c:73:0:
../arch/arm64/include/asm/futex.h:53:6: note: 'oldval' was declared here
  int oldval, ret, tmp;
      ^
GCC fails to see that when ret is non-zero, futex_atomic_op_inuser() returns right away, so the uninitialized use it warns about cannot actually happen. Restoring the zero initialization works around this issue.
[1]: https://android.googlesource.com/platform/prebuilts/gcc/linux-x86/aarch64/aa...
Cc: stable@vger.kernel.org Fixes: 045afc24124d ("arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result value") Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Nathan Chancellor natechancellor@gmail.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/include/asm/futex.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm64/include/asm/futex.h +++ b/arch/arm64/include/asm/futex.h @@ -50,7 +50,7 @@ do { \ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *_uaddr) { - int oldval, ret, tmp; + int oldval = 0, ret, tmp; u32 __user *uaddr = __uaccess_mask_ptr(_uaddr);
pagefault_disable();
From: Masami Hiramatsu mhiramat@kernel.org
commit 3ff9c075cc767b3060bdac12da72fc94dd7da1b8 upstream.
Verify the stack frame pointer in the kretprobe trampoline handler. If the stack frame pointer does not match, skip the wrong entry and try to find the correct one.
This can happen if the user puts a kretprobe on a function which can be used in the path of the ftrace user-function call. Such functions should not be probed, so this also adds a warning message that reports which function should be blacklisted.
Tested-by: Andrea Righi righi.andrea@gmail.com Signed-off-by: Masami Hiramatsu mhiramat@kernel.org Acked-by: Steven Rostedt rostedt@goodmis.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/155094059185.6137.15527904013362842072.stgit@devbox Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/kprobes/core.c | 26 ++++++++++++++++++++++++++ include/linux/kprobes.h | 1 + 2 files changed, 27 insertions(+)
--- a/arch/x86/kernel/kprobes/core.c +++ b/arch/x86/kernel/kprobes/core.c @@ -569,6 +569,7 @@ void arch_prepare_kretprobe(struct kretp unsigned long *sara = stack_addr(regs);
ri->ret_addr = (kprobe_opcode_t *) *sara; + ri->fp = sara;
/* Replace the return addr with trampoline addr */ *sara = (unsigned long) &kretprobe_trampoline; @@ -759,15 +760,21 @@ __visible __used void *trampoline_handle unsigned long flags, orig_ret_address = 0; unsigned long trampoline_address = (unsigned long)&kretprobe_trampoline; kprobe_opcode_t *correct_ret_addr = NULL; + void *frame_pointer; + bool skipped = false;
INIT_HLIST_HEAD(&empty_rp); kretprobe_hash_lock(current, &head, &flags); /* fixup registers */ #ifdef CONFIG_X86_64 regs->cs = __KERNEL_CS; + /* On x86-64, we use pt_regs->sp for return address holder. */ + frame_pointer = ®s->sp; #else regs->cs = __KERNEL_CS | get_kernel_rpl(); regs->gs = 0; + /* On x86-32, we use pt_regs->flags for return address holder. */ + frame_pointer = ®s->flags; #endif regs->ip = trampoline_address; regs->orig_ax = ~0UL; @@ -789,8 +796,25 @@ __visible __used void *trampoline_handle if (ri->task != current) /* another task is sharing our hash bucket */ continue; + /* + * Return probes must be pushed on this hash list correct + * order (same as return order) so that it can be poped + * correctly. However, if we find it is pushed it incorrect + * order, this means we find a function which should not be + * probed, because the wrong order entry is pushed on the + * path of processing other kretprobe itself. + */ + if (ri->fp != frame_pointer) { + if (!skipped) + pr_warn("kretprobe is stacked incorrectly. Trying to fixup.\n"); + skipped = true; + continue; + }
orig_ret_address = (unsigned long)ri->ret_addr; + if (skipped) + pr_warn("%ps must be blacklisted because of incorrect kretprobe order\n", + ri->rp->kp.addr);
if (orig_ret_address != trampoline_address) /* @@ -808,6 +832,8 @@ __visible __used void *trampoline_handle if (ri->task != current) /* another task is sharing our hash bucket */ continue; + if (ri->fp != frame_pointer) + continue;
orig_ret_address = (unsigned long)ri->ret_addr; if (ri->rp && ri->rp->handler) { --- a/include/linux/kprobes.h +++ b/include/linux/kprobes.h @@ -173,6 +173,7 @@ struct kretprobe_instance { struct kretprobe *rp; kprobe_opcode_t *ret_addr; struct task_struct *task; + void *fp; char data[0]; };
From: Masami Hiramatsu mhiramat@kernel.org
commit fabe38ab6b2bd9418350284c63825f13b8a6abba upstream.
Mark ftrace mcount handler functions nokprobe, since probing these functions with a kretprobe pushes the return address incorrectly onto the kretprobe shadow stack.
Reported-by: Francis Deslauriers francis.deslauriers@efficios.com Tested-by: Andrea Righi righi.andrea@gmail.com Signed-off-by: Masami Hiramatsu mhiramat@kernel.org Acked-by: Steven Rostedt rostedt@goodmis.org Acked-by: Steven Rostedt (VMware) rostedt@goodmis.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/155094062044.6137.6419622920568680640.stgit@devbox Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/trace/ftrace.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -34,6 +34,7 @@ #include <linux/list.h> #include <linux/hash.h> #include <linux/rcupdate.h> +#include <linux/kprobes.h>
#include <trace/events/sched.h>
@@ -6250,7 +6251,7 @@ void ftrace_reset_array_ops(struct trace tr->ops->func = ftrace_stub; }
-static inline void +static nokprobe_inline void __ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip, struct ftrace_ops *ignored, struct pt_regs *regs) { @@ -6310,11 +6311,13 @@ static void ftrace_ops_list_func(unsigne { __ftrace_ops_list_func(ip, parent_ip, NULL, regs); } +NOKPROBE_SYMBOL(ftrace_ops_list_func); #else static void ftrace_ops_no_ops(unsigned long ip, unsigned long parent_ip) { __ftrace_ops_list_func(ip, parent_ip, NULL, NULL); } +NOKPROBE_SYMBOL(ftrace_ops_no_ops); #endif
/* @@ -6341,6 +6344,7 @@ static void ftrace_ops_assist_func(unsig preempt_enable_notrace(); trace_clear_recursion(bit); } +NOKPROBE_SYMBOL(ftrace_ops_assist_func);
/** * ftrace_ops_get_func - get the function a trampoline should call
From: Masami Hiramatsu mhiramat@kernel.org
commit 5f843ed415581cfad4ef8fefe31c138a8346ca8a upstream.
The following commit introduced a bug in one of our error paths:
819319fc9346 ("kprobes: Return error if we fail to reuse kprobe instead of BUG_ON()")
it mistakenly handled the return value of kprobe_optready() as an error value. In reality, kprobe_optready() returns a bool, so a "true" result means the probe is ready and must not be returned to the caller as an error.
This causes some errors on kprobe boot-time selftests on ARM:
[ ] Beginning kprobe tests...
[ ] Probe ARM code
[ ] kprobe
[ ] kretprobe
[ ] ARM instruction simulation
[ ] Check decoding tables
[ ] Run test cases
[ ] FAIL: test_case_handler not run
[ ] FAIL: Test andge r10, r11, r14, asr r7
[ ] FAIL: Scenario 11
...
[ ] FAIL: Scenario 7
[ ] Total instruction simulation tests=1631, pass=1433 fail=198
[ ] kprobe tests failed
This can happen if an optimized probe is unregistered and the next kprobe is registered on the same address before the previous probe has been reclaimed.
If this happens, a hidden aggregated probe may be kept in memory, and no new kprobe can probe the same address. Also, in that case register_kprobe() will return "1" instead of a negative error value, which can mislead the caller logic.
Signed-off-by: Masami Hiramatsu mhiramat@kernel.org Cc: Anil S Keshavamurthy anil.s.keshavamurthy@intel.com Cc: David S . Miller davem@davemloft.net Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Naveen N . Rao naveen.n.rao@linux.vnet.ibm.com Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org # v5.0+ Fixes: 819319fc9346 ("kprobes: Return error if we fail to reuse kprobe instead of BUG_ON()") Link: http://lkml.kernel.org/r/155530808559.32517.539898325433642204.stgit@devnote... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/kprobes.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
--- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -703,7 +703,6 @@ static void unoptimize_kprobe(struct kpr static int reuse_unused_kprobe(struct kprobe *ap) { struct optimized_kprobe *op; - int ret;
BUG_ON(!kprobe_unused(ap)); /* @@ -715,9 +714,8 @@ static int reuse_unused_kprobe(struct kp /* Enable the probe again */ ap->flags &= ~KPROBE_FLAG_DISABLED; /* Optimize it again (remove from op->list) */ - ret = kprobe_optready(ap); - if (ret) - return ret; + if (!kprobe_optready(ap)) + return -EINVAL;
optimize_kprobe(ap); return 0;
From: Vijayakumar Durai vijayakumar.durai1@vivint.com
commit 746ba11f170603bf1eaade817553a6c2e9135bbe upstream.
Currently rt2x00 devices retransmit management frames with an incremented sequence number if the hardware is assigning the sequence.
This is a HW bug that was already worked around for non-QOS data frames, but it should also be fixed for management frames, except beacons.
Without the fix, retransmitted frames have a wrong SN:
AlphaNet_e8:fb:36 Vivotek_52:31:51 Authentication, SN=1648, FN=0, Flags=........C  Frame is not being retransmitted 1648 1
AlphaNet_e8:fb:36 Vivotek_52:31:51 Authentication, SN=1649, FN=0, Flags=....R...C  Frame is being retransmitted 1649 1
AlphaNet_e8:fb:36 Vivotek_52:31:51 Authentication, SN=1650, FN=0, Flags=....R...C  Frame is being retransmitted 1650 1
With the fix, the SN correctly stays the same:
88:6a:e3:e8:f9:a2 8c:f5:a3:88:76:87 Authentication, SN=1450, FN=0, Flags=........C
88:6a:e3:e8:f9:a2 8c:f5:a3:88:76:87 Authentication, SN=1450, FN=0, Flags=....R...C
88:6a:e3:e8:f9:a2 8c:f5:a3:88:76:87 Authentication, SN=1450, FN=0, Flags=....R...C
Cc: stable@vger.kernel.org Signed-off-by: Vijayakumar Durai vijayakumar.durai1@vivint.com [sgruszka: simplify code, change comments and changelog] Signed-off-by: Stanislaw Gruszka sgruszka@redhat.com Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/wireless/ralink/rt2x00/rt2x00.h | 1 - drivers/net/wireless/ralink/rt2x00/rt2x00mac.c | 10 ---------- drivers/net/wireless/ralink/rt2x00/rt2x00queue.c | 15 +++++++++------ 3 files changed, 9 insertions(+), 17 deletions(-)
--- a/drivers/net/wireless/ralink/rt2x00/rt2x00.h +++ b/drivers/net/wireless/ralink/rt2x00/rt2x00.h @@ -672,7 +672,6 @@ enum rt2x00_state_flags { CONFIG_CHANNEL_HT40, CONFIG_POWERSAVING, CONFIG_HT_DISABLED, - CONFIG_QOS_DISABLED, CONFIG_MONITORING,
/* --- a/drivers/net/wireless/ralink/rt2x00/rt2x00mac.c +++ b/drivers/net/wireless/ralink/rt2x00/rt2x00mac.c @@ -642,19 +642,9 @@ void rt2x00mac_bss_info_changed(struct i rt2x00dev->intf_associated--;
rt2x00leds_led_assoc(rt2x00dev, !!rt2x00dev->intf_associated); - - clear_bit(CONFIG_QOS_DISABLED, &rt2x00dev->flags); }
/* - * Check for access point which do not support 802.11e . We have to - * generate data frames sequence number in S/W for such AP, because - * of H/W bug. - */ - if (changes & BSS_CHANGED_QOS && !bss_conf->qos) - set_bit(CONFIG_QOS_DISABLED, &rt2x00dev->flags); - - /* * When the erp information has changed, we should perform * additional configuration steps. For all other changes we are done. */ --- a/drivers/net/wireless/ralink/rt2x00/rt2x00queue.c +++ b/drivers/net/wireless/ralink/rt2x00/rt2x00queue.c @@ -200,15 +200,18 @@ static void rt2x00queue_create_tx_descri if (!rt2x00_has_cap_flag(rt2x00dev, REQUIRE_SW_SEQNO)) { /* * rt2800 has a H/W (or F/W) bug, device incorrectly increase - * seqno on retransmited data (non-QOS) frames. To workaround - * the problem let's generate seqno in software if QOS is - * disabled. + * seqno on retransmitted data (non-QOS) and management frames. + * To workaround the problem let's generate seqno in software. + * Except for beacons which are transmitted periodically by H/W + * hence hardware has to assign seqno for them. */ - if (test_bit(CONFIG_QOS_DISABLED, &rt2x00dev->flags)) - __clear_bit(ENTRY_TXD_GENERATE_SEQ, &txdesc->flags); - else + if (ieee80211_is_beacon(hdr->frame_control)) { + __set_bit(ENTRY_TXD_GENERATE_SEQ, &txdesc->flags); /* H/W will generate sequence number */ return; + } + + __clear_bit(ENTRY_TXD_GENERATE_SEQ, &txdesc->flags); }
/*
From: Felix Fietkau nbd@nbd.name
commit 4856bfd230985e43e84c26473c91028ff0a533bd upstream.
There are several scenarios in which mac80211 can call drv_wake_tx_queue after ieee80211_restart_hw has been called and has not yet completed. Driver private structs are considered uninitialized until mac80211 has uploaded the vifs, stations and keys again, so using private tx queue data during that time is not safe.
The driver also cannot rely on drv_reconfig_complete to figure out when it is safe to accept drv_wake_tx_queue calls again, because it is only called after all tx queues are woken again.
To fix this, bail out early in drv_wake_tx_queue if local->in_reconfig is set.
Cc: stable@vger.kernel.org Signed-off-by: Felix Fietkau nbd@nbd.name Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/mac80211/driver-ops.h | 3 +++ 1 file changed, 3 insertions(+)
--- a/net/mac80211/driver-ops.h +++ b/net/mac80211/driver-ops.h @@ -1166,6 +1166,9 @@ static inline void drv_wake_tx_queue(str { struct ieee80211_sub_if_data *sdata = vif_to_sdata(txq->txq.vif);
+ if (local->in_reconfig) + return; + if (!check_sdata_in_driver(sdata)) return;
From: Alex Deucher alexander.deucher@amd.com
commit 1925e7d3d4677e681cc2e878c2bdbeaee988c8e2 upstream.
The register write got accidentally dropped when 2+1 level support was added.
Fixes: 6a42fd6fbf534096 ("drm/amdgpu: implement 2+1 PD support for Raven v3") Reviewed-by: Christian König christian.koenig@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c @@ -164,6 +164,7 @@ static void mmhub_v1_0_init_cache_regs(s tmp = REG_SET_FIELD(tmp, VM_L2_CNTL3, L2_CACHE_BIGK_FRAGMENT_SIZE, 6); } + WREG32_SOC15(MMHUB, 0, mmVM_L2_CNTL3, tmp);
tmp = mmVM_L2_CNTL4_DEFAULT; tmp = REG_SET_FIELD(tmp, VM_L2_CNTL4, VMC_TAP_PDE_REQUEST_PHYSICAL, 0);
From: Kim Phillips kim.phillips@amd.com
commit 3fe3331bb285700ab2253dbb07f8e478fcea2f1b upstream.
Family 17h differs from prior families by:
- Does not support an L2 cache miss event
- It has re-enumerated PMC counters for:
  - L2 cache references
  - front & back end stalled cycles
So we add a new amd_f17h_perfmon_event_map[] so that the generic perf event names will resolve to the correct h/w events on family 17h and above processors.
Reference sections 2.1.13.3.3 (stalls) and 2.1.13.3.6 (L2):
https://www.amd.com/system/files/TechDocs/54945_PPR_Family_17h_Models_00h-0F...
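Since the point of the patch is that generic event names now resolve to the correct raw events on family 17h, a minimal perf_event_open() sketch (error handling trimmed) counting one of the affected generic events could look like this; after the patch the kernel picks the family-17h encoding for PERF_COUNT_HW_CACHE_REFERENCES automatically:

#include <linux/perf_event.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
    struct perf_event_attr attr;
    long long count;
    int fd;

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = PERF_TYPE_HARDWARE;
    attr.config = PERF_COUNT_HW_CACHE_REFERENCES;
    attr.disabled = 1;
    attr.exclude_kernel = 1;

    /* Count for the calling thread on any CPU. */
    fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
    if (fd < 0)
        return 1;

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    /* ... workload under measurement would run here ... */
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    if (read(fd, &count, sizeof(count)) == sizeof(count))
        printf("cache-references: %lld\n", count);

    close(fd);
    return 0;
}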
Signed-off-by: Kim Phillips kim.phillips@amd.com Cc: stable@vger.kernel.org # v4.9+ Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Arnaldo Carvalho de Melo acme@kernel.org Cc: Borislav Petkov bp@alien8.de Cc: H. Peter Anvin hpa@zytor.com Cc: Janakarajan Natarajan Janakarajan.Natarajan@amd.com Cc: Jiri Olsa jolsa@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Martin Liška mliska@suse.cz Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Pu Wen puwen@hygon.cn Cc: Suravee Suthikulpanit Suravee.Suthikulpanit@amd.com Cc: Thomas Gleixner tglx@linutronix.de Cc: linux-kernel@vger.kernel.org Fixes: e40ed1542dd7 ("perf/x86: Add perf support for AMD family-17h processors") [ Improved the formatting a bit. ] Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/events/amd/core.c | 35 ++++++++++++++++++++++++++--------- 1 file changed, 26 insertions(+), 9 deletions(-)
--- a/arch/x86/events/amd/core.c +++ b/arch/x86/events/amd/core.c @@ -117,22 +117,39 @@ static __initconst const u64 amd_hw_cach };
/* - * AMD Performance Monitor K7 and later. + * AMD Performance Monitor K7 and later, up to and including Family 16h: */ static const u64 amd_perfmon_event_map[PERF_COUNT_HW_MAX] = { - [PERF_COUNT_HW_CPU_CYCLES] = 0x0076, - [PERF_COUNT_HW_INSTRUCTIONS] = 0x00c0, - [PERF_COUNT_HW_CACHE_REFERENCES] = 0x077d, - [PERF_COUNT_HW_CACHE_MISSES] = 0x077e, - [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x00c2, - [PERF_COUNT_HW_BRANCH_MISSES] = 0x00c3, - [PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = 0x00d0, /* "Decoder empty" event */ - [PERF_COUNT_HW_STALLED_CYCLES_BACKEND] = 0x00d1, /* "Dispatch stalls" event */ + [PERF_COUNT_HW_CPU_CYCLES] = 0x0076, + [PERF_COUNT_HW_INSTRUCTIONS] = 0x00c0, + [PERF_COUNT_HW_CACHE_REFERENCES] = 0x077d, + [PERF_COUNT_HW_CACHE_MISSES] = 0x077e, + [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x00c2, + [PERF_COUNT_HW_BRANCH_MISSES] = 0x00c3, + [PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = 0x00d0, /* "Decoder empty" event */ + [PERF_COUNT_HW_STALLED_CYCLES_BACKEND] = 0x00d1, /* "Dispatch stalls" event */ +}; + +/* + * AMD Performance Monitor Family 17h and later: + */ +static const u64 amd_f17h_perfmon_event_map[PERF_COUNT_HW_MAX] = +{ + [PERF_COUNT_HW_CPU_CYCLES] = 0x0076, + [PERF_COUNT_HW_INSTRUCTIONS] = 0x00c0, + [PERF_COUNT_HW_CACHE_REFERENCES] = 0xff60, + [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x00c2, + [PERF_COUNT_HW_BRANCH_MISSES] = 0x00c3, + [PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = 0x0287, + [PERF_COUNT_HW_STALLED_CYCLES_BACKEND] = 0x0187, };
static u64 amd_pmu_event_map(int hw_event) { + if (boot_cpu_data.x86 >= 0x17) + return amd_f17h_perfmon_event_map[hw_event]; + return amd_perfmon_event_map[hw_event]; }
From: Andi Kleen ak@linux.intel.com
commit 1de7edbb59c8f1b46071f66c5c97b8a59569eb51 upstream.
Some of the recently added const tables use __initdata which causes section attribute conflicts.
Use __initconst instead.
Fixes: fa1202ef2243 ("x86/speculation: Add command line control") Signed-off-by: Andi Kleen ak@linux.intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20190330004743.29541-9-andi@firstfloor.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/cpu/bugs.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -272,7 +272,7 @@ static const struct { const char *option; enum spectre_v2_user_cmd cmd; bool secure; -} v2_user_options[] __initdata = { +} v2_user_options[] __initconst = { { "auto", SPECTRE_V2_USER_CMD_AUTO, false }, { "off", SPECTRE_V2_USER_CMD_NONE, false }, { "on", SPECTRE_V2_USER_CMD_FORCE, true }, @@ -407,7 +407,7 @@ static const struct { const char *option; enum spectre_v2_mitigation_cmd cmd; bool secure; -} mitigation_options[] __initdata = { +} mitigation_options[] __initconst = { { "off", SPECTRE_V2_CMD_NONE, false }, { "on", SPECTRE_V2_CMD_FORCE, true }, { "retpoline", SPECTRE_V2_CMD_RETPOLINE, false }, @@ -643,7 +643,7 @@ static const char * const ssb_strings[] static const struct { const char *option; enum ssb_mitigation_cmd cmd; -} ssb_mitigation_options[] __initdata = { +} ssb_mitigation_options[] __initconst = { { "auto", SPEC_STORE_BYPASS_CMD_AUTO }, /* Platform decides */ { "on", SPEC_STORE_BYPASS_CMD_ON }, /* Disable Speculative Store Bypass */ { "off", SPEC_STORE_BYPASS_CMD_NONE }, /* Don't touch Speculative Store Bypass */
From: Kan Liang kan.liang@linux.intel.com
commit 9d5dcc93a6ddfc78124f006ccd3637ce070ef2fc upstream.
PEBS_REGS is used as a mask for the registers supported by large PEBS. However, the mask cannot filter sample_regs_user/sample_regs_intr correctly.
(1ULL << PERF_REG_X86_*) should be used to replace PERF_REG_X86_*, which is only the index.
Rename PEBS_REGS to PEBS_GP_REGS, because the mask is only for general purpose registers.
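The underlying mistake is generic: an enum value is an index, not a mask bit, so OR-ing indices together does not produce a usable bitmask. A tiny sketch with made-up register names shows the difference:

#include <stdio.h>

/* Made-up register indices, standing in for PERF_REG_X86_*. */
enum { REG_A = 0, REG_B = 1, REG_C = 2 };

int main(void)
{
    /* Wrong: OR of indices; REG_A contributes nothing and the result is 3. */
    unsigned long wrong = REG_A | REG_B | REG_C;

    /* Right: each index converted to its bit, giving mask 0x7. */
    unsigned long right = (1UL << REG_A) | (1UL << REG_B) | (1UL << REG_C);

    printf("wrong=%#lx right=%#lx\n", wrong, right);
    return 0;
}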
Signed-off-by: Kan Liang kan.liang@linux.intel.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: stable@vger.kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Arnaldo Carvalho de Melo acme@redhat.com Cc: Jiri Olsa jolsa@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Vince Weaver vincent.weaver@maine.edu Cc: acme@kernel.org Cc: jolsa@kernel.org Fixes: 2fe1bc1f501d ("perf/x86: Enable free running PEBS for REGS_USER/INTR") Link: https://lkml.kernel.org/r/20190402194509.2832-2-kan.liang@linux.intel.com [ Renamed it to PEBS_GP_REGS - as 'GPRS' is used elsewhere ;-) ] Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/events/intel/core.c | 2 +- arch/x86/events/perf_event.h | 38 +++++++++++++++++++------------------- 2 files changed, 20 insertions(+), 20 deletions(-)
--- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -3014,7 +3014,7 @@ static unsigned long intel_pmu_large_peb flags &= ~PERF_SAMPLE_TIME; if (!event->attr.exclude_kernel) flags &= ~PERF_SAMPLE_REGS_USER; - if (event->attr.sample_regs_user & ~PEBS_REGS) + if (event->attr.sample_regs_user & ~PEBS_GP_REGS) flags &= ~(PERF_SAMPLE_REGS_USER | PERF_SAMPLE_REGS_INTR); return flags; } --- a/arch/x86/events/perf_event.h +++ b/arch/x86/events/perf_event.h @@ -96,25 +96,25 @@ struct amd_nb { PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER | \ PERF_SAMPLE_PERIOD)
-#define PEBS_REGS \ - (PERF_REG_X86_AX | \ - PERF_REG_X86_BX | \ - PERF_REG_X86_CX | \ - PERF_REG_X86_DX | \ - PERF_REG_X86_DI | \ - PERF_REG_X86_SI | \ - PERF_REG_X86_SP | \ - PERF_REG_X86_BP | \ - PERF_REG_X86_IP | \ - PERF_REG_X86_FLAGS | \ - PERF_REG_X86_R8 | \ - PERF_REG_X86_R9 | \ - PERF_REG_X86_R10 | \ - PERF_REG_X86_R11 | \ - PERF_REG_X86_R12 | \ - PERF_REG_X86_R13 | \ - PERF_REG_X86_R14 | \ - PERF_REG_X86_R15) +#define PEBS_GP_REGS \ + ((1ULL << PERF_REG_X86_AX) | \ + (1ULL << PERF_REG_X86_BX) | \ + (1ULL << PERF_REG_X86_CX) | \ + (1ULL << PERF_REG_X86_DX) | \ + (1ULL << PERF_REG_X86_DI) | \ + (1ULL << PERF_REG_X86_SI) | \ + (1ULL << PERF_REG_X86_SP) | \ + (1ULL << PERF_REG_X86_BP) | \ + (1ULL << PERF_REG_X86_IP) | \ + (1ULL << PERF_REG_X86_FLAGS) | \ + (1ULL << PERF_REG_X86_R8) | \ + (1ULL << PERF_REG_X86_R9) | \ + (1ULL << PERF_REG_X86_R10) | \ + (1ULL << PERF_REG_X86_R11) | \ + (1ULL << PERF_REG_X86_R12) | \ + (1ULL << PERF_REG_X86_R13) | \ + (1ULL << PERF_REG_X86_R14) | \ + (1ULL << PERF_REG_X86_R15))
/* * Per register state.
From: Thomas Gleixner tglx@linutronix.de
commit 2f5fb19341883bb6e37da351bc3700489d8506a7 upstream.
Mikhail reported a lockdep splat related to the AMD specific ssb_state lock:
       CPU0                    CPU1
       ----                    ----
  lock(&st->lock);
                               local_irq_disable();
                               lock(&(&sighand->siglock)->rlock);
                               lock(&st->lock);
  <Interrupt>
    lock(&(&sighand->siglock)->rlock);
*** DEADLOCK ***
The connection between sighand->siglock and st->lock comes through seccomp, which takes st->lock while holding sighand->siglock.
Make sure interrupts are disabled when __speculation_ctrl_update() is invoked via prctl() -> speculation_ctrl_update(). Add a lockdep assert to catch future offenders.
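A minimal sketch of the resulting pattern (function names are illustrative, not the actual process.c code): the callee asserts that hard interrupts are off, and the caller uses local_irq_save()/local_irq_restore() instead of preempt_disable(), which would leave interrupts enabled:

#include <linux/irqflags.h>
#include <linux/lockdep.h>

/* Callee: must only run with hard interrupts disabled. */
static void example_update(void)
{
    lockdep_assert_irqs_disabled();
    /* ... take a lock that is also taken from interrupt context ... */
}

static void example_caller(void)
{
    unsigned long flags;

    /* preempt_disable() is not enough: it leaves interrupts enabled. */
    local_irq_save(flags);
    example_update();
    local_irq_restore(flags);
}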
Fixes: 1f50ddb4f418 ("x86/speculation: Handle HT correctly on AMD") Reported-by: Mikhail Gavrilov mikhail.v.gavrilov@gmail.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Mikhail Gavrilov mikhail.v.gavrilov@gmail.com Cc: Thomas Lendacky thomas.lendacky@amd.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1904141948200.4917@nanos.tec.linut... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/process.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -411,6 +411,8 @@ static __always_inline void __speculatio u64 msr = x86_spec_ctrl_base; bool updmsr = false;
+ lockdep_assert_irqs_disabled(); + /* * If TIF_SSBD is different, select the proper mitigation * method. Note that if SSBD mitigation is disabled or permanentely @@ -462,10 +464,12 @@ static unsigned long speculation_ctrl_up
void speculation_ctrl_update(unsigned long tif) { + unsigned long flags; + /* Forced update. Make sure all relevant TIF flags are different */ - preempt_disable(); + local_irq_save(flags); __speculation_ctrl_update(~tif, tif); - preempt_enable(); + local_irq_restore(flags); }
/* Called from seccomp/prctl update */
From: Chang-An Chen chang-an.chen@mediatek.com
commit 3f2552f7e9c5abef2775c53f7af66532f8bf65bc upstream.
tick_freeze() introduced by suspend-to-idle in commit 124cf9117c5f ("PM / sleep: Make it possible to quiesce timers during suspend-to-idle") uses timekeeping_suspend() instead of syscore_suspend() during suspend-to-idle. As a consequence, the generic sched_clock will keep going because sched_clock_suspend() and sched_clock_resume() are not invoked during suspend-to-idle, which can result in a generic sched_clock wrap.
On an ARM system with suspend-to-idle enabled, sched_clock is registered as "56 bits at 13MHz, resolution 76ns, wraps every 4398046511101ns", which means the real wrapping duration is 8796093022202ns.
[ 134.551779] suspend-to-idle suspend (timekeeping_suspend())
[ 1204.912239] suspend-to-idle resume (timekeeping_resume())
......
[ 1206.912239] suspend-to-idle suspend (timekeeping_suspend())
[ 5880.502807] suspend-to-idle resume (timekeeping_resume())
......
[ 6000.403724] suspend-to-idle suspend (timekeeping_suspend())
[ 8035.753167] suspend-to-idle resume (timekeeping_resume())
......
[ 8795.786684] (2)[321:charger_thread]......
[ 8795.788387] (2)[321:charger_thread]......
[ 0.057226] (0)[0:swapper/0]......
[ 0.061447] (2)[0:swapper/2]......
sched_clock was not stopped during suspend-to-idle, and the sched_clock_poll hrtimer did not expire because timekeeping_suspend() was invoked during suspend-to-idle. This makes sched_clock wrap at kernel time 8796s.
To prevent this, invoke sched_clock_suspend() in tick_freeze() and sched_clock_resume() in tick_unfreeze(), together with timekeeping_suspend() and timekeeping_resume().
Fixes: 124cf9117c5f (PM / sleep: Make it possible to quiesce timers during suspend-to-idle) Signed-off-by: Chang-An Chen chang-an.chen@mediatek.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: Frederic Weisbecker fweisbec@gmail.com Cc: Matthias Brugger matthias.bgg@gmail.com Cc: John Stultz john.stultz@linaro.org Cc: Kees Cook keescook@chromium.org Cc: Corey Minyard cminyard@mvista.com Cc: linux-mediatek@lists.infradead.org Cc: linux-arm-kernel@lists.infradead.org Cc: Stanley Chu stanley.chu@mediatek.com Cc: kuohong.wang@mediatek.com Cc: freddy.hsin@mediatek.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1553828349-8914-1-git-send-email-chang-an.chen@med... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/time/sched_clock.c | 4 ++-- kernel/time/tick-common.c | 2 ++ kernel/time/timekeeping.h | 7 +++++++ 3 files changed, 11 insertions(+), 2 deletions(-)
--- a/kernel/time/sched_clock.c +++ b/kernel/time/sched_clock.c @@ -275,7 +275,7 @@ static u64 notrace suspended_sched_clock return cd.read_data[seq & 1].epoch_cyc; }
-static int sched_clock_suspend(void) +int sched_clock_suspend(void) { struct clock_read_data *rd = &cd.read_data[0];
@@ -286,7 +286,7 @@ static int sched_clock_suspend(void) return 0; }
-static void sched_clock_resume(void) +void sched_clock_resume(void) { struct clock_read_data *rd = &cd.read_data[0];
--- a/kernel/time/tick-common.c +++ b/kernel/time/tick-common.c @@ -491,6 +491,7 @@ void tick_freeze(void) trace_suspend_resume(TPS("timekeeping_freeze"), smp_processor_id(), true); system_state = SYSTEM_SUSPEND; + sched_clock_suspend(); timekeeping_suspend(); } else { tick_suspend_local(); @@ -514,6 +515,7 @@ void tick_unfreeze(void)
if (tick_freeze_depth == num_online_cpus()) { timekeeping_resume(); + sched_clock_resume(); system_state = SYSTEM_RUNNING; trace_suspend_resume(TPS("timekeeping_freeze"), smp_processor_id(), false); --- a/kernel/time/timekeeping.h +++ b/kernel/time/timekeeping.h @@ -14,6 +14,13 @@ extern u64 timekeeping_max_deferment(voi extern void timekeeping_warp_clock(void); extern int timekeeping_suspend(void); extern void timekeeping_resume(void); +#ifdef CONFIG_GENERIC_SCHED_CLOCK +extern int sched_clock_suspend(void); +extern void sched_clock_resume(void); +#else +static inline int sched_clock_suspend(void) { return 0; } +static inline void sched_clock_resume(void) { } +#endif
extern void do_timer(unsigned long ticks); extern void update_wall_time(void);
commit 317a992ab9266b86b774b9f6b0f87eb4f59879a1 upstream.
The ars_start_flags property of 'struct acpi_nfit_desc' is no longer used since ARS_REQ_SHORT and ARS_REQ_LONG were added.
Reviewed-by: Toshi Kani toshi.kani@hpe.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/nfit/core.c | 10 +++++----- drivers/acpi/nfit/nfit.h | 1 - 2 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c index df2175b1169a..b5237a506464 100644 --- a/drivers/acpi/nfit/core.c +++ b/drivers/acpi/nfit/core.c @@ -2539,11 +2539,11 @@ static int ars_continue(struct acpi_nfit_desc *acpi_desc) struct nvdimm_bus_descriptor *nd_desc = &acpi_desc->nd_desc; struct nd_cmd_ars_status *ars_status = acpi_desc->ars_status;
- memset(&ars_start, 0, sizeof(ars_start)); - ars_start.address = ars_status->restart_address; - ars_start.length = ars_status->restart_length; - ars_start.type = ars_status->type; - ars_start.flags = acpi_desc->ars_start_flags; + ars_start = (struct nd_cmd_ars_start) { + .address = ars_status->restart_address, + .length = ars_status->restart_length, + .type = ars_status->type, + }; rc = nd_desc->ndctl(nd_desc, NULL, ND_CMD_ARS_START, &ars_start, sizeof(ars_start), &cmd_rc); if (rc < 0) diff --git a/drivers/acpi/nfit/nfit.h b/drivers/acpi/nfit/nfit.h index 02c10de50386..df31a2721573 100644 --- a/drivers/acpi/nfit/nfit.h +++ b/drivers/acpi/nfit/nfit.h @@ -194,7 +194,6 @@ struct acpi_nfit_desc { struct list_head idts; struct nvdimm_bus *nvdimm_bus; struct device *dev; - u8 ars_start_flags; struct nd_cmd_ars_status *ars_status; struct nfit_spa *scrub_spa; struct delayed_work dwork;
commit e34b8252a3d2893ca55c82dbfcdaa302fa03d400 upstream.
In preparation for introducing new flags to gate whether ARS results are stale, or poll the completion state, convert the existing flags to an unsigned long with enumerated values. This conversion allows the flags to be atomically updated outside of ->init_mutex.
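The conversion relies on the standard atomic bitops, which are safe to use without holding init_mutex. A minimal kernel-style sketch of the pattern (flag and structure names are illustrative) is:

#include <linux/bitops.h>
#include <linux/types.h>

enum example_flags {
    EXAMPLE_BUSY,
    EXAMPLE_CANCEL,
};

struct example_desc {
    unsigned long flags;    /* one bit per enum value */
};

static void example_start(struct example_desc *d)
{
    /* Atomic RMW: safe to call outside d's mutex. */
    set_bit(EXAMPLE_BUSY, &d->flags);
}

static void example_cancel(struct example_desc *d)
{
    set_bit(EXAMPLE_CANCEL, &d->flags);
}

static bool example_should_run(struct example_desc *d)
{
    return test_bit(EXAMPLE_BUSY, &d->flags) &&
           !test_bit(EXAMPLE_CANCEL, &d->flags);
}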
Reviewed-by: Toshi Kani toshi.kani@hpe.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/nfit/core.c | 30 +++++++++++++++++------------- drivers/acpi/nfit/nfit.h | 8 ++++++-- 2 files changed, 23 insertions(+), 15 deletions(-)
diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c index b5237a506464..6b5a3c3b4458 100644 --- a/drivers/acpi/nfit/core.c +++ b/drivers/acpi/nfit/core.c @@ -1298,19 +1298,23 @@ static ssize_t scrub_show(struct device *dev, struct device_attribute *attr, char *buf) { struct nvdimm_bus_descriptor *nd_desc; + struct acpi_nfit_desc *acpi_desc; ssize_t rc = -ENXIO; + bool busy;
device_lock(dev); nd_desc = dev_get_drvdata(dev); - if (nd_desc) { - struct acpi_nfit_desc *acpi_desc = to_acpi_desc(nd_desc); - - mutex_lock(&acpi_desc->init_mutex); - rc = sprintf(buf, "%d%s", acpi_desc->scrub_count, - acpi_desc->scrub_busy - && !acpi_desc->cancel ? "+\n" : "\n"); - mutex_unlock(&acpi_desc->init_mutex); + if (!nd_desc) { + device_unlock(dev); + return rc; } + acpi_desc = to_acpi_desc(nd_desc); + + mutex_lock(&acpi_desc->init_mutex); + busy = test_bit(ARS_BUSY, &acpi_desc->scrub_flags) + && !test_bit(ARS_CANCEL, &acpi_desc->scrub_flags); + rc = sprintf(buf, "%d%s", acpi_desc->scrub_count, busy ? "+\n" : "\n"); + mutex_unlock(&acpi_desc->init_mutex); device_unlock(dev); return rc; } @@ -2960,7 +2964,7 @@ static unsigned int __acpi_nfit_scrub(struct acpi_nfit_desc *acpi_desc,
lockdep_assert_held(&acpi_desc->init_mutex);
- if (acpi_desc->cancel) + if (test_bit(ARS_CANCEL, &acpi_desc->scrub_flags)) return 0;
if (query_rc == -EBUSY) { @@ -3034,7 +3038,7 @@ static void __sched_ars(struct acpi_nfit_desc *acpi_desc, unsigned int tmo) { lockdep_assert_held(&acpi_desc->init_mutex);
- acpi_desc->scrub_busy = 1; + set_bit(ARS_BUSY, &acpi_desc->scrub_flags); /* note this should only be set from within the workqueue */ if (tmo) acpi_desc->scrub_tmo = tmo; @@ -3050,7 +3054,7 @@ static void notify_ars_done(struct acpi_nfit_desc *acpi_desc) { lockdep_assert_held(&acpi_desc->init_mutex);
- acpi_desc->scrub_busy = 0; + clear_bit(ARS_BUSY, &acpi_desc->scrub_flags); acpi_desc->scrub_count++; if (acpi_desc->scrub_count_state) sysfs_notify_dirent(acpi_desc->scrub_count_state); @@ -3322,7 +3326,7 @@ int acpi_nfit_ars_rescan(struct acpi_nfit_desc *acpi_desc, struct nfit_spa *nfit_spa;
mutex_lock(&acpi_desc->init_mutex); - if (acpi_desc->cancel) { + if (test_bit(ARS_CANCEL, &acpi_desc->scrub_flags)) { mutex_unlock(&acpi_desc->init_mutex); return 0; } @@ -3401,7 +3405,7 @@ void acpi_nfit_shutdown(void *data) mutex_unlock(&acpi_desc_lock);
mutex_lock(&acpi_desc->init_mutex); - acpi_desc->cancel = 1; + set_bit(ARS_CANCEL, &acpi_desc->scrub_flags); cancel_delayed_work_sync(&acpi_desc->dwork); mutex_unlock(&acpi_desc->init_mutex);
diff --git a/drivers/acpi/nfit/nfit.h b/drivers/acpi/nfit/nfit.h index df31a2721573..94710e579598 100644 --- a/drivers/acpi/nfit/nfit.h +++ b/drivers/acpi/nfit/nfit.h @@ -181,6 +181,11 @@ struct nfit_mem { bool has_lsw; };
+enum scrub_flags { + ARS_BUSY, + ARS_CANCEL, +}; + struct acpi_nfit_desc { struct nvdimm_bus_descriptor nd_desc; struct acpi_table_header acpi_header; @@ -202,8 +207,7 @@ struct acpi_nfit_desc { unsigned int max_ars; unsigned int scrub_count; unsigned int scrub_mode; - unsigned int scrub_busy:1; - unsigned int cancel:1; + unsigned long scrub_flags; unsigned long dimm_cmd_force_en; unsigned long bus_cmd_force_en; unsigned long bus_nfit_cmd_force_en;
commit 5479b2757f26fe9908fc341d105b2097fe820b6f upstream.
The ARS implementation uses exponential back-off on the poll interval to prevent high-frequency access to the DIMM / platform interface. Depending on when the ARS completes, the poll interval may exceed the completion event by minutes. Allow root to reset the timeout each time it probes the status. A one-second timeout is still enforced, but root can otherwise control the poll interval.
Fixes: bc6ba8085842 ("nfit, address-range-scrub: rework and simplify ARS...") Cc: stable@vger.kernel.org Reported-by: Erwin Tsaur erwin.tsaur@oracle.com Reviewed-by: Toshi Kani toshi.kani@hpe.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/nfit/core.c | 8 ++++++++ drivers/acpi/nfit/nfit.h | 1 + 2 files changed, 9 insertions(+)
diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c index 6b5a3c3b4458..4b489d14a680 100644 --- a/drivers/acpi/nfit/core.c +++ b/drivers/acpi/nfit/core.c @@ -1314,6 +1314,13 @@ static ssize_t scrub_show(struct device *dev, busy = test_bit(ARS_BUSY, &acpi_desc->scrub_flags) && !test_bit(ARS_CANCEL, &acpi_desc->scrub_flags); rc = sprintf(buf, "%d%s", acpi_desc->scrub_count, busy ? "+\n" : "\n"); + /* Allow an admin to poll the busy state at a higher rate */ + if (busy && capable(CAP_SYS_RAWIO) && !test_and_set_bit(ARS_POLL, + &acpi_desc->scrub_flags)) { + acpi_desc->scrub_tmo = 1; + mod_delayed_work(nfit_wq, &acpi_desc->dwork, HZ); + } + mutex_unlock(&acpi_desc->init_mutex); device_unlock(dev); return rc; @@ -3075,6 +3082,7 @@ static void acpi_nfit_scrub(struct work_struct *work) else notify_ars_done(acpi_desc); memset(acpi_desc->ars_status, 0, acpi_desc->max_ars); + clear_bit(ARS_POLL, &acpi_desc->scrub_flags); mutex_unlock(&acpi_desc->init_mutex); }
diff --git a/drivers/acpi/nfit/nfit.h b/drivers/acpi/nfit/nfit.h index 94710e579598..b5fd3522abc7 100644 --- a/drivers/acpi/nfit/nfit.h +++ b/drivers/acpi/nfit/nfit.h @@ -184,6 +184,7 @@ struct nfit_mem { enum scrub_flags { ARS_BUSY, ARS_CANCEL, + ARS_POLL, };
struct acpi_nfit_desc {
commit 78153dd45e7e0596ba32b15d02bda08e1513111e upstream.
Gate ARS result consumption on whether the OS issued start-ARS since the previous consumption. The BIOS may only clear its result buffers after a successful start-ARS.
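A minimal standalone model of that gating: the producer sets a validity flag on start-ARS, and the consumer claims it with an atomic test-and-clear, mirroring the test_and_clear_bit() use in the diff below (names here are illustrative, not the driver's):

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

static atomic_bool ars_valid;

static void start_ars(void)
{
	atomic_store(&ars_valid, true);	/* results will be refreshed */
}

static void consume_results(void)
{
	/* atomic_exchange() plays the role of test_and_clear_bit() */
	if (!atomic_exchange(&ars_valid, false)) {
		printf("skip stale records\n");
		return;
	}
	printf("process records\n");
}

int main(void)
{
	start_ars();
	consume_results();	/* process records */
	consume_results();	/* skip stale records: no new start-ARS */
	return 0;
}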
Fixes: 0caeef63e6d2 ("libnvdimm: Add a poison list and export badblocks") Cc: stable@vger.kernel.org Reported-by: Krzysztof Rusocki krzysztof.rusocki@intel.com Reported-by: Vishal Verma vishal.l.verma@intel.com Reviewed-by: Toshi Kani toshi.kani@hpe.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/nfit/core.c | 17 ++++++++++++++++- drivers/acpi/nfit/nfit.h | 1 + 2 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c index 4b489d14a680..925dbc751322 100644 --- a/drivers/acpi/nfit/core.c +++ b/drivers/acpi/nfit/core.c @@ -2540,7 +2540,10 @@ static int ars_start(struct acpi_nfit_desc *acpi_desc,
if (rc < 0) return rc; - return cmd_rc; + if (cmd_rc < 0) + return cmd_rc; + set_bit(ARS_VALID, &acpi_desc->scrub_flags); + return 0; }
static int ars_continue(struct acpi_nfit_desc *acpi_desc) @@ -2633,6 +2636,17 @@ static int ars_status_process_records(struct acpi_nfit_desc *acpi_desc) */ if (ars_status->out_length < 44) return 0; + + /* + * Ignore potentially stale results that are only refreshed + * after a start-ARS event. + */ + if (!test_and_clear_bit(ARS_VALID, &acpi_desc->scrub_flags)) { + dev_dbg(acpi_desc->dev, "skip %d stale records\n", + ars_status->num_records); + return 0; + } + for (i = 0; i < ars_status->num_records; i++) { /* only process full records */ if (ars_status->out_length @@ -3117,6 +3131,7 @@ static int acpi_nfit_register_regions(struct acpi_nfit_desc *acpi_desc) struct nfit_spa *nfit_spa; int rc;
+ set_bit(ARS_VALID, &acpi_desc->scrub_flags); list_for_each_entry(nfit_spa, &acpi_desc->spas, list) { switch (nfit_spa_type(nfit_spa->spa)) { case NFIT_SPA_VOLATILE: diff --git a/drivers/acpi/nfit/nfit.h b/drivers/acpi/nfit/nfit.h index b5fd3522abc7..68848fc4b7c9 100644 --- a/drivers/acpi/nfit/nfit.h +++ b/drivers/acpi/nfit/nfit.h @@ -184,6 +184,7 @@ struct nfit_mem { enum scrub_flags { ARS_BUSY, ARS_CANCEL, + ARS_VALID, ARS_POLL, };
[ Upstream commit 4bf780996669280171c9cd58196512849b93434e ]
Existing data command CRC error handling is non-standard and does not work with some Intel host controllers. Specifically, the assumption that the host controller will continue operating normally after the error interrupt is not valid. Change the driver to handle the error in the same manner as a data CRC error, taking care to ensure that the data line reset is done for single- or multi-block transfers, and that it is done before unmapping DMA.
Signed-off-by: Adrian Hunter adrian.hunter@intel.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mmc/host/sdhci.c | 40 +++++++++++++++------------------------- 1 file changed, 15 insertions(+), 25 deletions(-)
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c index 654051e00117..4bfaca33a477 100644 --- a/drivers/mmc/host/sdhci.c +++ b/drivers/mmc/host/sdhci.c @@ -1078,8 +1078,7 @@ static bool sdhci_needs_reset(struct sdhci_host *host, struct mmc_request *mrq) return (!(host->flags & SDHCI_DEVICE_DEAD) && ((mrq->cmd && mrq->cmd->error) || (mrq->sbc && mrq->sbc->error) || - (mrq->data && ((mrq->data->error && !mrq->data->stop) || - (mrq->data->stop && mrq->data->stop->error))) || + (mrq->data && mrq->data->stop && mrq->data->stop->error) || (host->quirks & SDHCI_QUIRK_RESET_AFTER_REQUEST))); }
@@ -1131,6 +1130,16 @@ static void sdhci_finish_data(struct sdhci_host *host) host->data = NULL; host->data_cmd = NULL;
+ /* + * The controller needs a reset of internal state machines upon error + * conditions. + */ + if (data->error) { + if (!host->cmd || host->cmd == data_cmd) + sdhci_do_reset(host, SDHCI_RESET_CMD); + sdhci_do_reset(host, SDHCI_RESET_DATA); + } + if ((host->flags & (SDHCI_REQ_USE_DMA | SDHCI_USE_ADMA)) == (SDHCI_REQ_USE_DMA | SDHCI_USE_ADMA)) sdhci_adma_table_post(host, data); @@ -1155,17 +1164,6 @@ static void sdhci_finish_data(struct sdhci_host *host) if (data->stop && (data->error || !data->mrq->sbc)) { - - /* - * The controller needs a reset of internal state machines - * upon error conditions. - */ - if (data->error) { - if (!host->cmd || host->cmd == data_cmd) - sdhci_do_reset(host, SDHCI_RESET_CMD); - sdhci_do_reset(host, SDHCI_RESET_DATA); - } - /* * 'cap_cmd_during_tfr' request must not use the command line * after mmc_command_done() has been called. It is upper layer's @@ -2642,7 +2640,7 @@ static void sdhci_timeout_data_timer(struct timer_list *t) * * *****************************************************************************/
-static void sdhci_cmd_irq(struct sdhci_host *host, u32 intmask) +static void sdhci_cmd_irq(struct sdhci_host *host, u32 intmask, u32 *intmask_p) { if (!host->cmd) { /* @@ -2665,20 +2663,12 @@ static void sdhci_cmd_irq(struct sdhci_host *host, u32 intmask) else host->cmd->error = -EILSEQ;
- /* - * If this command initiates a data phase and a response - * CRC error is signalled, the card can start transferring - * data - the card may have received the command without - * error. We must not terminate the mmc_request early. - * - * If the card did not receive the command or returned an - * error which prevented it sending data, the data phase - * will time out. - */ + /* Treat data command CRC error the same as data CRC error */ if (host->cmd->data && (intmask & (SDHCI_INT_CRC | SDHCI_INT_TIMEOUT)) == SDHCI_INT_CRC) { host->cmd = NULL; + *intmask_p |= SDHCI_INT_DATA_CRC; return; }
@@ -2906,7 +2896,7 @@ static irqreturn_t sdhci_irq(int irq, void *dev_id) }
if (intmask & SDHCI_INT_CMD_MASK) - sdhci_cmd_irq(host, intmask & SDHCI_INT_CMD_MASK); + sdhci_cmd_irq(host, intmask & SDHCI_INT_CMD_MASK, &intmask);
if (intmask & SDHCI_INT_DATA_MASK) sdhci_data_irq(host, intmask & SDHCI_INT_DATA_MASK);
[ Upstream commit 869f8a69bb3a4aec4eb914a330d4ba53a9eed495 ]
The SDHCI_ACMD12_ERR register is used for auto-CMD23 and auto-CMD12 errors, as is the SDHCI_INT_ACMD12ERR interrupt bit. Rename them to SDHCI_AUTO_CMD_STATUS and SDHCI_INT_AUTO_CMD_ERR respectively.
Signed-off-by: Adrian Hunter adrian.hunter@intel.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mmc/host/sdhci-esdhc-imx.c | 12 ++++++------ drivers/mmc/host/sdhci.c | 4 ++-- drivers/mmc/host/sdhci.h | 4 ++-- 3 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/drivers/mmc/host/sdhci-esdhc-imx.c b/drivers/mmc/host/sdhci-esdhc-imx.c index 8dae12b841b3..629860f7327c 100644 --- a/drivers/mmc/host/sdhci-esdhc-imx.c +++ b/drivers/mmc/host/sdhci-esdhc-imx.c @@ -429,7 +429,7 @@ static u16 esdhc_readw_le(struct sdhci_host *host, int reg) val = readl(host->ioaddr + ESDHC_MIX_CTRL); else if (imx_data->socdata->flags & ESDHC_FLAG_STD_TUNING) /* the std tuning bits is in ACMD12_ERR for imx6sl */ - val = readl(host->ioaddr + SDHCI_ACMD12_ERR); + val = readl(host->ioaddr + SDHCI_AUTO_CMD_STATUS); }
if (val & ESDHC_MIX_CTRL_EXE_TUNE) @@ -494,7 +494,7 @@ static void esdhc_writew_le(struct sdhci_host *host, u16 val, int reg) } writel(new_val , host->ioaddr + ESDHC_MIX_CTRL); } else if (imx_data->socdata->flags & ESDHC_FLAG_STD_TUNING) { - u32 v = readl(host->ioaddr + SDHCI_ACMD12_ERR); + u32 v = readl(host->ioaddr + SDHCI_AUTO_CMD_STATUS); u32 m = readl(host->ioaddr + ESDHC_MIX_CTRL); if (val & SDHCI_CTRL_TUNED_CLK) { v |= ESDHC_MIX_CTRL_SMPCLK_SEL; @@ -512,7 +512,7 @@ static void esdhc_writew_le(struct sdhci_host *host, u16 val, int reg) v &= ~ESDHC_MIX_CTRL_EXE_TUNE; }
- writel(v, host->ioaddr + SDHCI_ACMD12_ERR); + writel(v, host->ioaddr + SDHCI_AUTO_CMD_STATUS); writel(m, host->ioaddr + ESDHC_MIX_CTRL); } return; @@ -957,9 +957,9 @@ static void esdhc_reset_tuning(struct sdhci_host *host) writel(ctrl, host->ioaddr + ESDHC_MIX_CTRL); writel(0, host->ioaddr + ESDHC_TUNE_CTRL_STATUS); } else if (imx_data->socdata->flags & ESDHC_FLAG_STD_TUNING) { - ctrl = readl(host->ioaddr + SDHCI_ACMD12_ERR); + ctrl = readl(host->ioaddr + SDHCI_AUTO_CMD_STATUS); ctrl &= ~ESDHC_MIX_CTRL_SMPCLK_SEL; - writel(ctrl, host->ioaddr + SDHCI_ACMD12_ERR); + writel(ctrl, host->ioaddr + SDHCI_AUTO_CMD_STATUS); } } } @@ -1319,7 +1319,7 @@ static int sdhci_esdhc_imx_probe(struct platform_device *pdev)
/* clear tuning bits in case ROM has set it already */ writel(0x0, host->ioaddr + ESDHC_MIX_CTRL); - writel(0x0, host->ioaddr + SDHCI_ACMD12_ERR); + writel(0x0, host->ioaddr + SDHCI_AUTO_CMD_STATUS); writel(0x0, host->ioaddr + ESDHC_TUNE_CTRL_STATUS); }
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c index 4bfaca33a477..0eb05a42a857 100644 --- a/drivers/mmc/host/sdhci.c +++ b/drivers/mmc/host/sdhci.c @@ -82,8 +82,8 @@ void sdhci_dumpregs(struct sdhci_host *host) SDHCI_DUMP("Int enab: 0x%08x | Sig enab: 0x%08x\n", sdhci_readl(host, SDHCI_INT_ENABLE), sdhci_readl(host, SDHCI_SIGNAL_ENABLE)); - SDHCI_DUMP("AC12 err: 0x%08x | Slot int: 0x%08x\n", - sdhci_readw(host, SDHCI_ACMD12_ERR), + SDHCI_DUMP("ACmd stat: 0x%08x | Slot int: 0x%08x\n", + sdhci_readw(host, SDHCI_AUTO_CMD_STATUS), sdhci_readw(host, SDHCI_SLOT_INT_STATUS)); SDHCI_DUMP("Caps: 0x%08x | Caps_1: 0x%08x\n", sdhci_readl(host, SDHCI_CAPABILITIES), diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h index f0bd36ce3817..33a7728c71fa 100644 --- a/drivers/mmc/host/sdhci.h +++ b/drivers/mmc/host/sdhci.h @@ -144,7 +144,7 @@ #define SDHCI_INT_DATA_CRC 0x00200000 #define SDHCI_INT_DATA_END_BIT 0x00400000 #define SDHCI_INT_BUS_POWER 0x00800000 -#define SDHCI_INT_ACMD12ERR 0x01000000 +#define SDHCI_INT_AUTO_CMD_ERR 0x01000000 #define SDHCI_INT_ADMA_ERROR 0x02000000
#define SDHCI_INT_NORMAL_MASK 0x00007FFF @@ -166,7 +166,7 @@
#define SDHCI_CQE_INT_MASK (SDHCI_CQE_INT_ERR_MASK | SDHCI_INT_CQE)
-#define SDHCI_ACMD12_ERR 0x3C +#define SDHCI_AUTO_CMD_STATUS 0x3C
#define SDHCI_HOST_CONTROL2 0x3E #define SDHCI_CTRL_UHS_MASK 0x0007
[ Upstream commit af849c86109d79222e549826068bbf4e7f9a2472 ]
If the host controller supports auto-commands, enable the auto-command error interrupt and handle it. In the case of auto-CMD23, the error is treated the same as a manual CMD23 error. In the case of auto-CMD12, commands-during-transfer are not permitted, so the error is treated the same as a data error.
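As a rough sketch of the decode used for the auto-CMD23 case in the diff below (the register value is faked here; only the timeout-versus-CRC mapping mirrors the patch):

#include <errno.h>
#include <stdio.h>

#define SDHCI_AUTO_CMD_TIMEOUT	0x00000002

/* timeout -> -ETIMEDOUT, everything else (CRC/end-bit/index) -> -EILSEQ */
static int auto_cmd_err_to_errno(unsigned int auto_cmd_status)
{
	return (auto_cmd_status & SDHCI_AUTO_CMD_TIMEOUT) ? -ETIMEDOUT : -EILSEQ;
}

int main(void)
{
	printf("timeout -> %d\n", auto_cmd_err_to_errno(0x02));
	printf("CRC err -> %d\n", auto_cmd_err_to_errno(0x04));
	return 0;
}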
Signed-off-by: Adrian Hunter adrian.hunter@intel.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mmc/host/sdhci.c | 35 +++++++++++++++++++++++++++++++++++ drivers/mmc/host/sdhci.h | 7 ++++++- 2 files changed, 41 insertions(+), 1 deletion(-)
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c index 0eb05a42a857..c749d3dc1d36 100644 --- a/drivers/mmc/host/sdhci.c +++ b/drivers/mmc/host/sdhci.c @@ -841,6 +841,11 @@ static void sdhci_set_transfer_irqs(struct sdhci_host *host) else host->ier = (host->ier & ~dma_irqs) | pio_irqs;
+ if (host->flags & (SDHCI_AUTO_CMD23 | SDHCI_AUTO_CMD12)) + host->ier |= SDHCI_INT_AUTO_CMD_ERR; + else + host->ier &= ~SDHCI_INT_AUTO_CMD_ERR; + sdhci_writel(host, host->ier, SDHCI_INT_ENABLE); sdhci_writel(host, host->ier, SDHCI_SIGNAL_ENABLE); } @@ -2642,6 +2647,21 @@ static void sdhci_timeout_data_timer(struct timer_list *t)
static void sdhci_cmd_irq(struct sdhci_host *host, u32 intmask, u32 *intmask_p) { + /* Handle auto-CMD12 error */ + if (intmask & SDHCI_INT_AUTO_CMD_ERR && host->data_cmd) { + struct mmc_request *mrq = host->data_cmd->mrq; + u16 auto_cmd_status = sdhci_readw(host, SDHCI_AUTO_CMD_STATUS); + int data_err_bit = (auto_cmd_status & SDHCI_AUTO_CMD_TIMEOUT) ? + SDHCI_INT_DATA_TIMEOUT : + SDHCI_INT_DATA_CRC; + + /* Treat auto-CMD12 error the same as data error */ + if (!mrq->sbc && (host->flags & SDHCI_AUTO_CMD12)) { + *intmask_p |= data_err_bit; + return; + } + } + if (!host->cmd) { /* * SDHCI recovers from errors by resetting the cmd and data @@ -2676,6 +2696,21 @@ static void sdhci_cmd_irq(struct sdhci_host *host, u32 intmask, u32 *intmask_p) return; }
+ /* Handle auto-CMD23 error */ + if (intmask & SDHCI_INT_AUTO_CMD_ERR) { + struct mmc_request *mrq = host->cmd->mrq; + u16 auto_cmd_status = sdhci_readw(host, SDHCI_AUTO_CMD_STATUS); + int err = (auto_cmd_status & SDHCI_AUTO_CMD_TIMEOUT) ? + -ETIMEDOUT : + -EILSEQ; + + if (mrq->sbc && (host->flags & SDHCI_AUTO_CMD23)) { + mrq->sbc->error = err; + sdhci_finish_mrq(host, mrq); + return; + } + } + if (intmask & SDHCI_INT_RESPONSE) sdhci_finish_command(host); } diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h index 33a7728c71fa..0f8c4f3ccafc 100644 --- a/drivers/mmc/host/sdhci.h +++ b/drivers/mmc/host/sdhci.h @@ -151,7 +151,8 @@ #define SDHCI_INT_ERROR_MASK 0xFFFF8000
#define SDHCI_INT_CMD_MASK (SDHCI_INT_RESPONSE | SDHCI_INT_TIMEOUT | \ - SDHCI_INT_CRC | SDHCI_INT_END_BIT | SDHCI_INT_INDEX) + SDHCI_INT_CRC | SDHCI_INT_END_BIT | SDHCI_INT_INDEX | \ + SDHCI_INT_AUTO_CMD_ERR) #define SDHCI_INT_DATA_MASK (SDHCI_INT_DATA_END | SDHCI_INT_DMA_END | \ SDHCI_INT_DATA_AVAIL | SDHCI_INT_SPACE_AVAIL | \ SDHCI_INT_DATA_TIMEOUT | SDHCI_INT_DATA_CRC | \ @@ -167,6 +168,10 @@ #define SDHCI_CQE_INT_MASK (SDHCI_CQE_INT_ERR_MASK | SDHCI_INT_CQE)
#define SDHCI_AUTO_CMD_STATUS 0x3C +#define SDHCI_AUTO_CMD_TIMEOUT 0x00000002 +#define SDHCI_AUTO_CMD_CRC 0x00000004 +#define SDHCI_AUTO_CMD_END_BIT 0x00000008 +#define SDHCI_AUTO_CMD_INDEX 0x00000010
#define SDHCI_HOST_CONTROL2 0x3E #define SDHCI_CTRL_UHS_MASK 0x0007
[ Upstream commit ec91e78d378cc5d4b43805a1227d8e04e5dfa17d ]
Commit e49ce14150c6 ("modpost: use linker section to generate table.") did not turn out as well as first expected; it ended up with ugly section hacks once commit dd2a3acaecd7 ("mod/file2alias: make modpost compile on darwin again") came in.
Given a certain degree of uncertainty about the link stage of host programs, I really want to see a simple, stupid table lookup so that this works in the same way regardless of the underlying executable format.
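A stripped-down, standalone illustration of that "simple, stupid table lookup" shape, with made-up entries and handlers standing in for the real do_*_entry() callbacks:

#include <stdio.h>
#include <string.h>

struct devtable {
	const char *device_id;
	int (*do_entry)(const char *name);
};

static int do_usb_entry(const char *name) { printf("usb: %s\n", name); return 0; }
static int do_pci_entry(const char *name) { printf("pci: %s\n", name); return 0; }

/* plain array, no linker-section or Mach-O tricks required */
static const struct devtable devtable[] = {
	{ "usb", do_usb_entry },
	{ "pci", do_pci_entry },
};

int main(void)
{
	const char *id = "pci";

	for (size_t i = 0; i < sizeof(devtable) / sizeof(devtable[0]); i++)
		if (!strcmp(devtable[i].device_id, id))
			return devtable[i].do_entry("0000:00:1f.6");
	return 1;
}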
Signed-off-by: Masahiro Yamada yamada.masahiro@socionext.com Acked-by: Mathieu Malaterre malat@debian.org Signed-off-by: Sasha Levin sashal@kernel.org --- scripts/mod/file2alias.c | 144 +++++++++++++-------------------------- 1 file changed, 49 insertions(+), 95 deletions(-)
diff --git a/scripts/mod/file2alias.c b/scripts/mod/file2alias.c index 7be43697ff84..9f3ebde7a0cd 100644 --- a/scripts/mod/file2alias.c +++ b/scripts/mod/file2alias.c @@ -50,46 +50,6 @@ struct devtable { void *function; };
-#define ___cat(a,b) a ## b -#define __cat(a,b) ___cat(a,b) - -/* we need some special handling for this host tool running eventually on - * Darwin. The Mach-O section handling is a bit different than ELF section - * handling. The differnces in detail are: - * a) we have segments which have sections - * b) we need a API call to get the respective section symbols */ -#if defined(__MACH__) -#include <mach-o/getsect.h> - -#define INIT_SECTION(name) do { \ - unsigned long name ## _len; \ - char *__cat(pstart_,name) = getsectdata("__TEXT", \ - #name, &__cat(name,_len)); \ - char *__cat(pstop_,name) = __cat(pstart_,name) + \ - __cat(name, _len); \ - __cat(__start_,name) = (void *)__cat(pstart_,name); \ - __cat(__stop_,name) = (void *)__cat(pstop_,name); \ - } while (0) -#define SECTION(name) __attribute__((section("__TEXT, " #name))) - -struct devtable **__start___devtable, **__stop___devtable; -#else -#define INIT_SECTION(name) /* no-op for ELF */ -#define SECTION(name) __attribute__((section(#name))) - -/* We construct a table of pointers in an ELF section (pointers generally - * go unpadded by gcc). ld creates boundary syms for us. */ -extern struct devtable *__start___devtable[], *__stop___devtable[]; -#endif /* __MACH__ */ - -#if !defined(__used) -# if __GNUC__ == 3 && __GNUC_MINOR__ < 3 -# define __used __attribute__((__unused__)) -# else -# define __used __attribute__((__used__)) -# endif -#endif - /* Define a variable f that holds the value of field f of struct devid * based at address m. */ @@ -102,16 +62,6 @@ extern struct devtable *__start___devtable[], *__stop___devtable[]; #define DEF_FIELD_ADDR(m, devid, f) \ typeof(((struct devid *)0)->f) *f = ((m) + OFF_##devid##_##f)
-/* Add a table entry. We test function type matches while we're here. */ -#define ADD_TO_DEVTABLE(device_id, type, function) \ - static struct devtable __cat(devtable,__LINE__) = { \ - device_id + 0*sizeof((function)((const char *)NULL, \ - (void *)NULL, \ - (char *)NULL)), \ - SIZE_##type, (function) }; \ - static struct devtable *SECTION(__devtable) __used \ - __cat(devtable_ptr,__LINE__) = &__cat(devtable,__LINE__) - #define ADD(str, sep, cond, field) \ do { \ strcat(str, sep); \ @@ -431,7 +381,6 @@ static int do_hid_entry(const char *filename,
return 1; } -ADD_TO_DEVTABLE("hid", hid_device_id, do_hid_entry);
/* Looks like: ieee1394:venNmoNspNverN */ static int do_ieee1394_entry(const char *filename, @@ -456,7 +405,6 @@ static int do_ieee1394_entry(const char *filename, add_wildcard(alias); return 1; } -ADD_TO_DEVTABLE("ieee1394", ieee1394_device_id, do_ieee1394_entry);
/* Looks like: pci:vNdNsvNsdNbcNscNiN. */ static int do_pci_entry(const char *filename, @@ -500,7 +448,6 @@ static int do_pci_entry(const char *filename, add_wildcard(alias); return 1; } -ADD_TO_DEVTABLE("pci", pci_device_id, do_pci_entry);
/* looks like: "ccw:tNmNdtNdmN" */ static int do_ccw_entry(const char *filename, @@ -524,7 +471,6 @@ static int do_ccw_entry(const char *filename, add_wildcard(alias); return 1; } -ADD_TO_DEVTABLE("ccw", ccw_device_id, do_ccw_entry);
/* looks like: "ap:tN" */ static int do_ap_entry(const char *filename, @@ -535,7 +481,6 @@ static int do_ap_entry(const char *filename, sprintf(alias, "ap:t%02X*", dev_type); return 1; } -ADD_TO_DEVTABLE("ap", ap_device_id, do_ap_entry);
/* looks like: "css:tN" */ static int do_css_entry(const char *filename, @@ -546,7 +491,6 @@ static int do_css_entry(const char *filename, sprintf(alias, "css:t%01X", type); return 1; } -ADD_TO_DEVTABLE("css", css_device_id, do_css_entry);
/* Looks like: "serio:tyNprNidNexN" */ static int do_serio_entry(const char *filename, @@ -566,7 +510,6 @@ static int do_serio_entry(const char *filename, add_wildcard(alias); return 1; } -ADD_TO_DEVTABLE("serio", serio_device_id, do_serio_entry);
/* looks like: "acpi:ACPI0003" or "acpi:PNP0C0B" or "acpi:LNXVIDEO" or * "acpi:bbsspp" (bb=base-class, ss=sub-class, pp=prog-if) @@ -604,7 +547,6 @@ static int do_acpi_entry(const char *filename, } return 1; } -ADD_TO_DEVTABLE("acpi", acpi_device_id, do_acpi_entry);
/* looks like: "pnp:dD" */ static void do_pnp_device_entry(void *symval, unsigned long size, @@ -725,7 +667,6 @@ static int do_pcmcia_entry(const char *filename, add_wildcard(alias); return 1; } -ADD_TO_DEVTABLE("pcmcia", pcmcia_device_id, do_pcmcia_entry);
static int do_vio_entry(const char *filename, void *symval, char *alias) @@ -745,7 +686,6 @@ static int do_vio_entry(const char *filename, void *symval, add_wildcard(alias); return 1; } -ADD_TO_DEVTABLE("vio", vio_device_id, do_vio_entry);
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
@@ -818,7 +758,6 @@ static int do_input_entry(const char *filename, void *symval, do_input(alias, *swbit, 0, INPUT_DEVICE_ID_SW_MAX); return 1; } -ADD_TO_DEVTABLE("input", input_device_id, do_input_entry);
static int do_eisa_entry(const char *filename, void *symval, char *alias) @@ -830,7 +769,6 @@ static int do_eisa_entry(const char *filename, void *symval, strcat(alias, "*"); return 1; } -ADD_TO_DEVTABLE("eisa", eisa_device_id, do_eisa_entry);
/* Looks like: parisc:tNhvNrevNsvN */ static int do_parisc_entry(const char *filename, void *symval, @@ -850,7 +788,6 @@ static int do_parisc_entry(const char *filename, void *symval, add_wildcard(alias); return 1; } -ADD_TO_DEVTABLE("parisc", parisc_device_id, do_parisc_entry);
/* Looks like: sdio:cNvNdN. */ static int do_sdio_entry(const char *filename, @@ -867,7 +804,6 @@ static int do_sdio_entry(const char *filename, add_wildcard(alias); return 1; } -ADD_TO_DEVTABLE("sdio", sdio_device_id, do_sdio_entry);
/* Looks like: ssb:vNidNrevN. */ static int do_ssb_entry(const char *filename, @@ -884,7 +820,6 @@ static int do_ssb_entry(const char *filename, add_wildcard(alias); return 1; } -ADD_TO_DEVTABLE("ssb", ssb_device_id, do_ssb_entry);
/* Looks like: bcma:mNidNrevNclN. */ static int do_bcma_entry(const char *filename, @@ -903,7 +838,6 @@ static int do_bcma_entry(const char *filename, add_wildcard(alias); return 1; } -ADD_TO_DEVTABLE("bcma", bcma_device_id, do_bcma_entry);
/* Looks like: virtio:dNvN */ static int do_virtio_entry(const char *filename, void *symval, @@ -919,7 +853,6 @@ static int do_virtio_entry(const char *filename, void *symval, add_wildcard(alias); return 1; } -ADD_TO_DEVTABLE("virtio", virtio_device_id, do_virtio_entry);
/* * Looks like: vmbus:guid @@ -942,7 +875,6 @@ static int do_vmbus_entry(const char *filename, void *symval,
return 1; } -ADD_TO_DEVTABLE("vmbus", hv_vmbus_device_id, do_vmbus_entry);
/* Looks like: rpmsg:S */ static int do_rpmsg_entry(const char *filename, void *symval, @@ -953,7 +885,6 @@ static int do_rpmsg_entry(const char *filename, void *symval,
return 1; } -ADD_TO_DEVTABLE("rpmsg", rpmsg_device_id, do_rpmsg_entry);
/* Looks like: i2c:S */ static int do_i2c_entry(const char *filename, void *symval, @@ -964,7 +895,6 @@ static int do_i2c_entry(const char *filename, void *symval,
return 1; } -ADD_TO_DEVTABLE("i2c", i2c_device_id, do_i2c_entry);
/* Looks like: spi:S */ static int do_spi_entry(const char *filename, void *symval, @@ -975,7 +905,6 @@ static int do_spi_entry(const char *filename, void *symval,
return 1; } -ADD_TO_DEVTABLE("spi", spi_device_id, do_spi_entry);
static const struct dmifield { const char *prefix; @@ -1030,7 +959,6 @@ static int do_dmi_entry(const char *filename, void *symval, strcat(alias, ":"); return 1; } -ADD_TO_DEVTABLE("dmi", dmi_system_id, do_dmi_entry);
static int do_platform_entry(const char *filename, void *symval, char *alias) @@ -1039,7 +967,6 @@ static int do_platform_entry(const char *filename, sprintf(alias, PLATFORM_MODULE_PREFIX "%s", *name); return 1; } -ADD_TO_DEVTABLE("platform", platform_device_id, do_platform_entry);
static int do_mdio_entry(const char *filename, void *symval, char *alias) @@ -1064,7 +991,6 @@ static int do_mdio_entry(const char *filename,
return 1; } -ADD_TO_DEVTABLE("mdio", mdio_device_id, do_mdio_entry);
/* Looks like: zorro:iN. */ static int do_zorro_entry(const char *filename, void *symval, @@ -1075,7 +1001,6 @@ static int do_zorro_entry(const char *filename, void *symval, ADD(alias, "i", id != ZORRO_WILDCARD, id); return 1; } -ADD_TO_DEVTABLE("zorro", zorro_device_id, do_zorro_entry);
/* looks like: "pnp:dD" */ static int do_isapnp_entry(const char *filename, @@ -1091,7 +1016,6 @@ static int do_isapnp_entry(const char *filename, (function >> 12) & 0x0f, (function >> 8) & 0x0f); return 1; } -ADD_TO_DEVTABLE("isapnp", isapnp_device_id, do_isapnp_entry);
/* Looks like: "ipack:fNvNdN". */ static int do_ipack_entry(const char *filename, @@ -1107,7 +1031,6 @@ static int do_ipack_entry(const char *filename, add_wildcard(alias); return 1; } -ADD_TO_DEVTABLE("ipack", ipack_device_id, do_ipack_entry);
/* * Append a match expression for a single masked hex digit. @@ -1178,7 +1101,6 @@ static int do_amba_entry(const char *filename,
return 1; } -ADD_TO_DEVTABLE("amba", amba_id, do_amba_entry);
/* * looks like: "mipscdmm:tN" @@ -1194,7 +1116,6 @@ static int do_mips_cdmm_entry(const char *filename, sprintf(alias, "mipscdmm:t%02X*", type); return 1; } -ADD_TO_DEVTABLE("mipscdmm", mips_cdmm_device_id, do_mips_cdmm_entry);
/* LOOKS like cpu:type:x86,venVVVVfamFFFFmodMMMM:feature:*,FEAT,* * All fields are numbers. It would be nicer to use strings for vendor @@ -1219,7 +1140,6 @@ static int do_x86cpu_entry(const char *filename, void *symval, sprintf(alias + strlen(alias), "%04X*", feature); return 1; } -ADD_TO_DEVTABLE("x86cpu", x86_cpu_id, do_x86cpu_entry);
/* LOOKS like cpu:type:*:feature:*FEAT* */ static int do_cpu_entry(const char *filename, void *symval, char *alias) @@ -1229,7 +1149,6 @@ static int do_cpu_entry(const char *filename, void *symval, char *alias) sprintf(alias, "cpu:type:*:feature:*%04X*", feature); return 1; } -ADD_TO_DEVTABLE("cpu", cpu_feature, do_cpu_entry);
/* Looks like: mei:S:uuid:N:* */ static int do_mei_entry(const char *filename, void *symval, @@ -1248,7 +1167,6 @@ static int do_mei_entry(const char *filename, void *symval,
return 1; } -ADD_TO_DEVTABLE("mei", mei_cl_device_id, do_mei_entry);
/* Looks like: rapidio:vNdNavNadN */ static int do_rio_entry(const char *filename, @@ -1268,7 +1186,6 @@ static int do_rio_entry(const char *filename, add_wildcard(alias); return 1; } -ADD_TO_DEVTABLE("rapidio", rio_device_id, do_rio_entry);
/* Looks like: ulpi:vNpN */ static int do_ulpi_entry(const char *filename, void *symval, @@ -1281,7 +1198,6 @@ static int do_ulpi_entry(const char *filename, void *symval,
return 1; } -ADD_TO_DEVTABLE("ulpi", ulpi_device_id, do_ulpi_entry);
/* Looks like: hdaudio:vNrNaN */ static int do_hda_entry(const char *filename, void *symval, char *alias) @@ -1298,7 +1214,6 @@ static int do_hda_entry(const char *filename, void *symval, char *alias) add_wildcard(alias); return 1; } -ADD_TO_DEVTABLE("hdaudio", hda_device_id, do_hda_entry);
/* Looks like: sdw:mNpN */ static int do_sdw_entry(const char *filename, void *symval, char *alias) @@ -1313,7 +1228,6 @@ static int do_sdw_entry(const char *filename, void *symval, char *alias) add_wildcard(alias); return 1; } -ADD_TO_DEVTABLE("sdw", sdw_device_id, do_sdw_entry);
/* Looks like: fsl-mc:vNdN */ static int do_fsl_mc_entry(const char *filename, void *symval, @@ -1325,7 +1239,6 @@ static int do_fsl_mc_entry(const char *filename, void *symval, sprintf(alias, "fsl-mc:v%08Xd%s", vendor, *obj_type); return 1; } -ADD_TO_DEVTABLE("fslmc", fsl_mc_device_id, do_fsl_mc_entry);
/* Looks like: tbsvc:kSpNvNrN */ static int do_tbsvc_entry(const char *filename, void *symval, char *alias) @@ -1350,7 +1263,6 @@ static int do_tbsvc_entry(const char *filename, void *symval, char *alias) add_wildcard(alias); return 1; } -ADD_TO_DEVTABLE("tbsvc", tb_service_id, do_tbsvc_entry);
/* Looks like: typec:idNmN */ static int do_typec_entry(const char *filename, void *symval, char *alias) @@ -1363,7 +1275,6 @@ static int do_typec_entry(const char *filename, void *symval, char *alias)
return 1; } -ADD_TO_DEVTABLE("typec", typec_device_id, do_typec_entry);
/* Does namelen bytes of name exactly match the symbol? */ static bool sym_is(const char *name, unsigned namelen, const char *symbol) @@ -1396,6 +1307,48 @@ static void do_table(void *symval, unsigned long size, } }
+static const struct devtable devtable[] = { + {"hid", SIZE_hid_device_id, do_hid_entry}, + {"ieee1394", SIZE_ieee1394_device_id, do_ieee1394_entry}, + {"pci", SIZE_pci_device_id, do_pci_entry}, + {"ccw", SIZE_ccw_device_id, do_ccw_entry}, + {"ap", SIZE_ap_device_id, do_ap_entry}, + {"css", SIZE_css_device_id, do_css_entry}, + {"serio", SIZE_serio_device_id, do_serio_entry}, + {"acpi", SIZE_acpi_device_id, do_acpi_entry}, + {"pcmcia", SIZE_pcmcia_device_id, do_pcmcia_entry}, + {"vio", SIZE_vio_device_id, do_vio_entry}, + {"input", SIZE_input_device_id, do_input_entry}, + {"eisa", SIZE_eisa_device_id, do_eisa_entry}, + {"parisc", SIZE_parisc_device_id, do_parisc_entry}, + {"sdio", SIZE_sdio_device_id, do_sdio_entry}, + {"ssb", SIZE_ssb_device_id, do_ssb_entry}, + {"bcma", SIZE_bcma_device_id, do_bcma_entry}, + {"virtio", SIZE_virtio_device_id, do_virtio_entry}, + {"vmbus", SIZE_hv_vmbus_device_id, do_vmbus_entry}, + {"rpmsg", SIZE_rpmsg_device_id, do_rpmsg_entry}, + {"i2c", SIZE_i2c_device_id, do_i2c_entry}, + {"spi", SIZE_spi_device_id, do_spi_entry}, + {"dmi", SIZE_dmi_system_id, do_dmi_entry}, + {"platform", SIZE_platform_device_id, do_platform_entry}, + {"mdio", SIZE_mdio_device_id, do_mdio_entry}, + {"zorro", SIZE_zorro_device_id, do_zorro_entry}, + {"isapnp", SIZE_isapnp_device_id, do_isapnp_entry}, + {"ipack", SIZE_ipack_device_id, do_ipack_entry}, + {"amba", SIZE_amba_id, do_amba_entry}, + {"mipscdmm", SIZE_mips_cdmm_device_id, do_mips_cdmm_entry}, + {"x86cpu", SIZE_x86_cpu_id, do_x86cpu_entry}, + {"cpu", SIZE_cpu_feature, do_cpu_entry}, + {"mei", SIZE_mei_cl_device_id, do_mei_entry}, + {"rapidio", SIZE_rio_device_id, do_rio_entry}, + {"ulpi", SIZE_ulpi_device_id, do_ulpi_entry}, + {"hdaudio", SIZE_hda_device_id, do_hda_entry}, + {"sdw", SIZE_sdw_device_id, do_sdw_entry}, + {"fslmc", SIZE_fsl_mc_device_id, do_fsl_mc_entry}, + {"tbsvc", SIZE_tb_service_id, do_tbsvc_entry}, + {"typec", SIZE_typec_device_id, do_typec_entry}, +}; + /* Create MODULE_ALIAS() statements. * At this time, we cannot write the actual output C source yet, * so we write into the mod->dev_table_buf buffer. */ @@ -1450,13 +1403,14 @@ void handle_moddevtable(struct module *mod, struct elf_info *info, else if (sym_is(name, namelen, "pnp_card")) do_pnp_card_entries(symval, sym->st_size, mod); else { - struct devtable **p; - INIT_SECTION(__devtable); + int i; + + for (i = 0; i < ARRAY_SIZE(devtable); i++) { + const struct devtable *p = &devtable[i];
- for (p = __start___devtable; p < __stop___devtable; p++) { - if (sym_is(name, namelen, (*p)->device_id)) { - do_table(symval, sym->st_size, (*p)->id_size, - (*p)->device_id, (*p)->function, mod); + if (sym_is(name, namelen, p->device_id)) { + do_table(symval, sym->st_size, p->id_size, + p->device_id, p->function, mod); break; } }
[ Upstream commit f880eea68fe593342fa6e09be9bb661f3c297aec ]
Use a specific prototype instead of an opaque pointer so that the compiler can catch function prototype mismatches.
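A small standalone example of the difference, with invented handlers: an opaque void * slot accepts any function address once the type is cast away, whereas the fully typed member lets the compiler reject a handler with the wrong prototype:

#include <stdio.h>

struct handler_opaque { void *fn; };
struct handler_typed  { int (*fn)(const char *filename, void *symval, char *alias); };

static int good_entry(const char *filename, void *symval, char *alias)
{
	(void)symval; (void)alias;
	return printf("handled %s\n", filename) > 0;
}

static int bad_entry(int wrong_args) { return wrong_args; }

int main(void)
{
	char alias[64] = "";
	struct handler_opaque o = { (void *)bad_entry }; /* compiles: type lost */
	struct handler_typed  t = { good_entry };        /* prototype checked   */
	/* struct handler_typed b = { bad_entry };          rejected by compiler */

	(void)o;
	return !t.fn("mod.o", NULL, alias);
}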
Signed-off-by: Masahiro Yamada yamada.masahiro@socionext.com Reviewed-by: Mathieu Malaterre malat@debian.org Signed-off-by: Sasha Levin sashal@kernel.org --- scripts/mod/file2alias.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/scripts/mod/file2alias.c b/scripts/mod/file2alias.c index 9f3ebde7a0cd..7f40b6aab689 100644 --- a/scripts/mod/file2alias.c +++ b/scripts/mod/file2alias.c @@ -47,7 +47,7 @@ typedef struct { struct devtable { const char *device_id; /* name of table, __mod_<name>__*_device_table. */ unsigned long id_size; - void *function; + int (*do_entry)(const char *filename, void *symval, char *alias); };
/* Define a variable f that holds the value of field f of struct devid @@ -1288,12 +1288,11 @@ static bool sym_is(const char *name, unsigned namelen, const char *symbol) static void do_table(void *symval, unsigned long size, unsigned long id_size, const char *device_id, - void *function, + int (*do_entry)(const char *filename, void *symval, char *alias), struct module *mod) { unsigned int i; char alias[500]; - int (*do_entry)(const char *, void *entry, char *alias) = function;
device_id_check(mod->name, device_id, size, id_size, symval); /* Leave last one: it's the terminator. */ @@ -1410,7 +1409,7 @@ void handle_moddevtable(struct module *mod, struct elf_info *info,
if (sym_is(name, namelen, p->device_id)) { do_table(symval, sym->st_size, p->id_size, - p->device_id, p->function, mod); + p->device_id, p->do_entry, mod); break; } }
[ Upstream commit 442601e87a4769a8daba4976ec3afa5222ca211d ]
Return -E2BIG when the transfer is incomplete. The upper layer does not retry, so silently accepting a short transfer is incorrect behaviour.
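A standalone sketch of the added check, with a stand-in fake_i2c_send() in place of the real bus transfer:

#include <errno.h>
#include <stdio.h>

/* pretend the bus only accepted 8 bytes of any longer message */
static int fake_i2c_send(const unsigned char *buf, int len)
{
	(void)buf;
	return len > 8 ? 8 : len;
}

static int send_command(const unsigned char *buf, int len)
{
	int status = fake_i2c_send(buf, len);

	if (status < 0)
		return status;	/* bus error: propagate errno */
	if (status != len)
		return -E2BIG;	/* short transfer: upper layer will not retry */
	return 0;
}

int main(void)
{
	unsigned char cmd[12] = { 0 };

	printf("full: %d, short: %d\n", send_command(cmd, 8), send_command(cmd, 12));
	return 0;
}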
Cc: stable@vger.kernel.org Fixes: a2871c62e186 ("tpm: Add support for Atmel I2C TPMs") Signed-off-by: Jarkko Sakkinen jarkko.sakkinen@linux.intel.com Reviewed-by: Stefan Berger stefanb@linux.ibm.com Reviewed-by: Jerry Snitselaar jsnitsel@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/char/tpm/tpm_i2c_atmel.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/char/tpm/tpm_i2c_atmel.c b/drivers/char/tpm/tpm_i2c_atmel.c
index 32a8e27c5382..cc4e642d3180 100644
--- a/drivers/char/tpm/tpm_i2c_atmel.c
+++ b/drivers/char/tpm/tpm_i2c_atmel.c
@@ -69,6 +69,10 @@ static int i2c_atmel_send(struct tpm_chip *chip, u8 *buf, size_t len)
 	if (status < 0)
 		return status;

+	/* The upper layer does not support incomplete sends. */
+	if (status != len)
+		return -E2BIG;
+
 	return 0;
 }
commit b9d0a85d6b2e76630cfd4c475ee3af4109bfd87a upstream
calc_tpm2_event_size() has an incorrect signature: the value it computes is a 'size_t', whereas it is declared to return 'int'.
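A quick standalone demonstration of why the narrowing matters (the value is arbitrary; on typical two's-complement targets the int result comes back negative):

#include <stddef.h>
#include <stdio.h>

static int size_as_int(size_t n)
{
	return n;	/* implicit narrowing: range and sign are lost */
}

static size_t size_as_size_t(size_t n)
{
	return n;
}

int main(void)
{
	size_t computed = (size_t)-8;	/* e.g. a wrapped or bogus event size */

	printf("as size_t: %zu\n", size_as_size_t(computed));
	printf("as int:    %d\n", size_as_int(computed));	/* typically -8 */
	return 0;
}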
Cc: stable@vger.kernel.org Fixes: 4d23cc323cdb ("tpm: add securityfs support for TPM 2.0 firmware event log") Suggested-by: Jarkko Sakkinen jarkko.sakkinen@linux.intel.com Signed-off-by: Yue Haibing yuehaibing@huawei.com Reviewed-by: Jarkko Sakkinen jarkko.sakkinen@linux.intel.com Signed-off-by: Jarkko Sakkinen jarkko.sakkinen@linux.intel.com Signed-off-by: James Morris james.morris@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/char/tpm/eventlog/tpm2.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/char/tpm/eventlog/tpm2.c b/drivers/char/tpm/eventlog/tpm2.c
index 1b8fa9de2cac..41b9f6c92da7 100644
--- a/drivers/char/tpm/eventlog/tpm2.c
+++ b/drivers/char/tpm/eventlog/tpm2.c
@@ -37,8 +37,8 @@
  *
  * Returns size of the event. If it is an invalid event, returns 0.
  */
-static int calc_tpm2_event_size(struct tcg_pcr_event2 *event,
-				struct tcg_pcr_event *event_header)
+static size_t calc_tpm2_event_size(struct tcg_pcr_event2 *event,
+				   struct tcg_pcr_event *event_header)
 {
 	struct tcg_efi_specid_event *efispecid;
 	struct tcg_event_field *event_field;
commit a75bb4eb9e565b9f5115e2e8c07377ce32cbe69a upstream.
The clang option -Oz enables *aggressive* optimization for size, which doesn't necessarily result in smaller images but can have a negative impact on performance. Switch back to the less aggressive -Os.
This reverts commit 6748cb3c299de1ffbe56733647b01dbcc398c419.
Suggested-by: Peter Zijlstra peterz@infradead.org Signed-off-by: Matthias Kaehlcke mka@chromium.org Reviewed-by: Nick Desaulniers ndesaulniers@google.com Signed-off-by: Masahiro Yamada yamada.masahiro@socionext.com Signed-off-by: Nathan Chancellor natechancellor@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- Makefile | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/Makefile b/Makefile
index 3fac08f6a11e..91e265ca0ca5 100644
--- a/Makefile
+++ b/Makefile
@@ -661,8 +661,7 @@ KBUILD_CFLAGS += $(call cc-disable-warning, format-overflow)
 KBUILD_CFLAGS += $(call cc-disable-warning, int-in-bool-context)

 ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
-KBUILD_CFLAGS += $(call cc-option,-Oz,-Os)
-KBUILD_CFLAGS += $(call cc-disable-warning,maybe-uninitialized,)
+KBUILD_CFLAGS += -Os $(call cc-disable-warning,maybe-uninitialized,)
 else
 ifdef CONFIG_PROFILE_ALL_BRANCHES
 KBUILD_CFLAGS += -O2 $(call cc-disable-warning,maybe-uninitialized,)
[ Upstream commit 2e8e19226398db8265a8e675fcc0118b9e80c9e8 ]
With an extremely short cfs_period_us setting on a parent task group with a large number of children, the for loop in sched_cfs_period_timer() can run until the watchdog fires. There is no guarantee that the call to hrtimer_forward_now() will ever return 0. The large number of children can make do_sched_cfs_period_timer() take longer than the period.
NMI watchdog: Watchdog detected hard LOCKUP on cpu 24
RIP: 0010:tg_nop+0x0/0x10
 <IRQ>
 walk_tg_tree_from+0x29/0xb0
 unthrottle_cfs_rq+0xe0/0x1a0
 distribute_cfs_runtime+0xd3/0xf0
 sched_cfs_period_timer+0xcb/0x160
 ? sched_cfs_slack_timer+0xd0/0xd0
 __hrtimer_run_queues+0xfb/0x270
 hrtimer_interrupt+0x122/0x270
 smp_apic_timer_interrupt+0x6a/0x140
 apic_timer_interrupt+0xf/0x20
 </IRQ>
To prevent this, add protection to the loop that detects when it has run too many times and then scales the period and quota up proportionally, so that the timer can complete before the next period expires. This preserves the relative runtime quota while preventing the hard lockup.
A warning is issued reporting this state and the new values.
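A worked example of the scaling arithmetic, using the same 147/128 (~115%) factor as the patch and made-up starting values; note the kernel additionally clamps the new period to max_cfs_quota_period:

#include <stdint.h>
#include <stdio.h>

#define NSEC_PER_USEC 1000ULL

int main(void)
{
	uint64_t period = 1000 * NSEC_PER_USEC;	/* extremely short 1 ms period */
	uint64_t quota  =  500 * NSEC_PER_USEC;	/* 50% of one CPU */

	for (int round = 0; round < 5; round++) {
		uint64_t new_period = period * 147 / 128;	/* ~115% */

		quota  = quota * new_period / period;	/* keep quota/period constant */
		period = new_period;
		printf("round %d: period %llu us, quota %llu us\n", round,
		       (unsigned long long)(period / NSEC_PER_USEC),
		       (unsigned long long)(quota / NSEC_PER_USEC));
	}
	return 0;
}

Because the quota/period ratio is preserved, a group limited to 50% of a CPU still gets 50% after the scaling; only the granularity of enforcement becomes coarser.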
Signed-off-by: Phil Auld pauld@redhat.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: stable@vger.kernel.org Cc: Anton Blanchard anton@ozlabs.org Cc: Ben Segall bsegall@google.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: https://lkml.kernel.org/r/20190319130005.25492-1-pauld@redhat.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/fair.c | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 640094391169..4aa8e7d90c25 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -4847,12 +4847,15 @@ static enum hrtimer_restart sched_cfs_slack_timer(struct hrtimer *timer) return HRTIMER_NORESTART; }
+extern const u64 max_cfs_quota_period; + static enum hrtimer_restart sched_cfs_period_timer(struct hrtimer *timer) { struct cfs_bandwidth *cfs_b = container_of(timer, struct cfs_bandwidth, period_timer); int overrun; int idle = 0; + int count = 0;
raw_spin_lock(&cfs_b->lock); for (;;) { @@ -4860,6 +4863,28 @@ static enum hrtimer_restart sched_cfs_period_timer(struct hrtimer *timer) if (!overrun) break;
+ if (++count > 3) { + u64 new, old = ktime_to_ns(cfs_b->period); + + new = (old * 147) / 128; /* ~115% */ + new = min(new, max_cfs_quota_period); + + cfs_b->period = ns_to_ktime(new); + + /* since max is 1s, this is limited to 1e9^2, which fits in u64 */ + cfs_b->quota *= new; + cfs_b->quota = div64_u64(cfs_b->quota, old); + + pr_warn_ratelimited( + "cfs_period_timer[cpu%d]: period too short, scaling up (new cfs_period_us %lld, cfs_quota_us = %lld)\n", + smp_processor_id(), + div_u64(new, NSEC_PER_USEC), + div_u64(cfs_b->quota, NSEC_PER_USEC)); + + /* reset count so we don't come right back in here */ + count = 0; + } + idle = do_sched_cfs_period_timer(cfs_b, overrun); } if (idle)
From: Jann Horn jannh@google.com
commit 0fcc4c8c044e117ac126ab6df4138ea9a67fa2a9 upstream.
When dev_exception_add() returns an error (due to a failed memory allocation), make sure that we move the RCU preemption count back to where it was before we were called. We dropped the RCU read lock inside the loop body, so we can't just "break".
sparse complains about this, too:
$ make -s C=2 security/device_cgroup.o
./include/linux/rcupdate.h:647:9: warning: context imbalance in 'propagate_exception' - unexpected unlock
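The imbalance is easier to see with a toy lock-depth model; read_lock()/read_unlock() below are plain counters, not the RCU API, and the structure only mimics the loop described above:

#include <assert.h>
#include <stdio.h>

static int depth;

static void read_lock(void)   { depth++; }
static void read_unlock(void) { assert(depth > 0); depth--; }

static int walk(int fail_at)
{
	read_lock();
	for (int i = 0; i < 3; i++) {
		read_unlock();		/* dropped across a sleeping call */
		if (i == fail_at)
			/* returning here is balanced: the lock is not held.
			 * A 'break' instead would fall through to the final
			 * read_unlock() below with the lock already dropped. */
			return -1;
		read_lock();		/* retaken for the next iteration */
	}
	read_unlock();
	return 0;
}

int main(void)
{
	int rc = walk(1);

	printf("rc=%d depth=%d\n", rc, depth);	/* depth stays balanced at 0 */
	return 0;
}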
Fixes: d591fb56618f ("device_cgroup: simplify cgroup tree walk in propagate_exception()") Cc: stable@vger.kernel.org Signed-off-by: Jann Horn jannh@google.com Acked-by: Michal Hocko mhocko@suse.com Signed-off-by: Tejun Heo tj@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- security/device_cgroup.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/security/device_cgroup.c
+++ b/security/device_cgroup.c
@@ -560,7 +560,7 @@ static int propagate_exception(struct de
 		    devcg->behavior == DEVCG_DEFAULT_ALLOW) {
 			rc = dev_exception_add(devcg, ex);
 			if (rc)
-				break;
+				return rc;
 		} else {
 			/*
 			 * in the other possible cases:
From: Konstantin Khlebnikov khlebnikov@yandex-team.ru
commit e8277b3b52240ec1caad8e6df278863e4bf42eac upstream.
Commit 58bc4c34d249 ("mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properly") depends on skipping vmstat entries with empty name introduced in 7aaf77272358 ("mm: don't show nr_indirectly_reclaimable in /proc/vmstat") but reverted in b29940c1abd7 ("mm: rename and change semantics of nr_indirectly_reclaimable_bytes").
So skipping no longer works and /proc/vmstat has misformatted lines " 0".
This patch simply shows debug counters "nr_tlb_remote_*" for UP.
Link: http://lkml.kernel.org/r/155481488468.467.4295519102880913454.stgit@buzz Fixes: 58bc4c34d249 ("mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properly") Signed-off-by: Konstantin Khlebnikov khlebnikov@yandex-team.ru Acked-by: Vlastimil Babka vbabka@suse.cz Cc: Roman Gushchin guro@fb.com Cc: Jann Horn jannh@google.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/vmstat.c | 5 ----- 1 file changed, 5 deletions(-)
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1272,13 +1272,8 @@ const char * const vmstat_text[] = {
 #endif
 #endif /* CONFIG_MEMORY_BALLOON */
 #ifdef CONFIG_DEBUG_TLBFLUSH
-#ifdef CONFIG_SMP
 	"nr_tlb_remote_flush",
 	"nr_tlb_remote_flush_received",
-#else
-	"", /* nr_tlb_remote_flush */
-	"", /* nr_tlb_remote_flush_received */
-#endif /* CONFIG_SMP */
 	"nr_tlb_local_flush_all",
 	"nr_tlb_local_flush_one",
 #endif /* CONFIG_DEBUG_TLBFLUSH */
From: Takashi Iwai tiwai@suse.de
commit 8c2f870890fd28e023b0fcf49dcee333f2c8bad7 upstream.
The ALSA proc helper manages the child nodes in a linked list, but addition and deletion are done without any lock. This leads to corruption if they are performed concurrently. Usually this isn't a problem because the proc entries are added sequentially in the driver probe procedure itself. But card registrations are often done asynchronously, and the crash could actually be reproduced with syzkaller.
This patch papers over it by protecting the link addition and deletion with the parent's mutex. There is an "access" mutex used for file access, and it can be reused for this purpose as well.
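A userspace sketch of that scheme, using a pthread mutex and a simplified node type rather than the ALSA structures: children are linked and unlinked only while the parent's "access" mutex is held:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct node {
	struct node *parent;
	struct node *next;	/* sibling link in the parent's child list */
	struct node *children;
	pthread_mutex_t access;
	const char *name;
};

static struct node *new_node(struct node *parent, const char *name)
{
	struct node *n = calloc(1, sizeof(*n));

	if (!n) {
		perror("calloc");
		exit(1);
	}
	n->parent = parent;
	n->name = name;
	pthread_mutex_init(&n->access, NULL);
	if (parent) {
		pthread_mutex_lock(&parent->access);
		n->next = parent->children;	/* link under the parent's lock */
		parent->children = n;
		pthread_mutex_unlock(&parent->access);
	}
	return n;
}

int main(void)
{
	struct node *root = new_node(NULL, "root");
	struct node *child = new_node(root, "card0");

	printf("first child of %s: %s\n", root->name, root->children->name);
	free(child);
	free(root);
	return 0;
}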
Reported-by: syzbot+48df349490c36f9f54ab@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/core/info.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
--- a/sound/core/info.c
+++ b/sound/core/info.c
@@ -722,8 +722,11 @@ snd_info_create_entry(const char *name,
 	INIT_LIST_HEAD(&entry->children);
 	INIT_LIST_HEAD(&entry->list);
 	entry->parent = parent;
-	if (parent)
+	if (parent) {
+		mutex_lock(&parent->access);
 		list_add_tail(&entry->list, &parent->children);
+		mutex_unlock(&parent->access);
+	}
 	return entry;
 }

@@ -805,7 +808,12 @@ void snd_info_free_entry(struct snd_info
 	list_for_each_entry_safe(p, n, &entry->children, list)
 		snd_info_free_entry(p);

-	list_del(&entry->list);
+	p = entry->parent;
+	if (p) {
+		mutex_lock(&p->access);
+		list_del(&entry->list);
+		mutex_unlock(&p->access);
+	}
 	kfree(entry->name);
 	if (entry->private_free)
 		entry->private_free(entry);
From: Matteo Croce mcroce@redhat.com
commit 00206a69ee32f03e6f40837684dcbe475ea02266 upstream.
Since commit ad67b74d2469d9b8 ("printk: hash addresses printed with %p"), at boot "____ptrval____" is printed instead of actual addresses:
percpu: Embedded 38 pages/cpu @(____ptrval____) s124376 r0 d31272 u524288
Instead of changing the print to "%px", and leaking kernel addresses, just remove the print completely, cfr. e.g. commit 071929dbdd865f77 ("arm64: Stop printing the virtual memory layout").
Signed-off-by: Matteo Croce mcroce@redhat.com Signed-off-by: Dennis Zhou dennis@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/percpu.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/mm/percpu.c +++ b/mm/percpu.c @@ -2529,8 +2529,8 @@ int __init pcpu_embed_first_chunk(size_t ai->groups[group].base_offset = areas[group] - base; }
- pr_info("Embedded %zu pages/cpu @%p s%zu r%zu d%zu u%zu\n", - PFN_DOWN(size_sum), base, ai->static_size, ai->reserved_size, + pr_info("Embedded %zu pages/cpu s%zu r%zu d%zu u%zu\n", + PFN_DOWN(size_sum), ai->static_size, ai->reserved_size, ai->dyn_size, ai->unit_size);
rc = pcpu_setup_first_chunk(ai, base); @@ -2651,8 +2651,8 @@ int __init pcpu_page_first_chunk(size_t }
/* we're ready, commit */ - pr_info("%d %s pages/cpu @%p s%zu r%zu d%zu\n", - unit_pages, psize_str, vm.addr, ai->static_size, + pr_info("%d %s pages/cpu s%zu r%zu d%zu\n", + unit_pages, psize_str, ai->static_size, ai->reserved_size, ai->dyn_size);
rc = pcpu_setup_first_chunk(ai, vm.addr);
From: Arnaldo Carvalho de Melo acme@redhat.com
commit ba4aa02b417f08a0bee5e7b8ed70cac788a7c854 upstream.
Adopt linux/bits.h so that we reduce the difference between tools/include/linux/bitops.h and the original kernel file, include/linux/bitops.h, while trying to remove the need to define BITS_PER_LONG and thus avoid clashes with asm/bitsperlong.h.
The things removed from tools/include/linux/bitops.h really live in linux/bits.h, so we can carry a copy, and tools/perf/check-headers.sh will then tell us when new things get added to linux/bits.h, so that we can check whether they are useful and whether any adjustment needs to be made to the tools/{include,arch}/ copies.
Cc: Adrian Hunter adrian.hunter@intel.com Cc: Alexander Sverdlin alexander.sverdlin@nokia.com Cc: David Ahern dsahern@gmail.com Cc: Jiri Olsa jolsa@kernel.org Cc: Namhyung Kim namhyung@kernel.org Cc: Wang Nan wangnan0@huawei.com Link: https://lkml.kernel.org/n/tip-y1sqyydvfzo0bjjoj4zsl562@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- tools/include/linux/bitops.h | 7 ++----- tools/include/linux/bits.h | 26 ++++++++++++++++++++++++++ tools/perf/check-headers.sh | 1 + 3 files changed, 29 insertions(+), 5 deletions(-)
--- a/tools/include/linux/bitops.h +++ b/tools/include/linux/bitops.h @@ -3,8 +3,6 @@ #define _TOOLS_LINUX_BITOPS_H_
#include <asm/types.h> -#include <linux/compiler.h> - #ifndef __WORDSIZE #define __WORDSIZE (__SIZEOF_LONG__ * 8) #endif @@ -12,10 +10,9 @@ #ifndef BITS_PER_LONG # define BITS_PER_LONG __WORDSIZE #endif +#include <linux/bits.h> +#include <linux/compiler.h>
-#define BIT_MASK(nr) (1UL << ((nr) % BITS_PER_LONG)) -#define BIT_WORD(nr) ((nr) / BITS_PER_LONG) -#define BITS_PER_BYTE 8 #define BITS_TO_LONGS(nr) DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(long)) #define BITS_TO_U64(nr) DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(u64)) #define BITS_TO_U32(nr) DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(u32)) --- /dev/null +++ b/tools/include/linux/bits.h @@ -0,0 +1,26 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __LINUX_BITS_H +#define __LINUX_BITS_H +#include <asm/bitsperlong.h> + +#define BIT(nr) (1UL << (nr)) +#define BIT_ULL(nr) (1ULL << (nr)) +#define BIT_MASK(nr) (1UL << ((nr) % BITS_PER_LONG)) +#define BIT_WORD(nr) ((nr) / BITS_PER_LONG) +#define BIT_ULL_MASK(nr) (1ULL << ((nr) % BITS_PER_LONG_LONG)) +#define BIT_ULL_WORD(nr) ((nr) / BITS_PER_LONG_LONG) +#define BITS_PER_BYTE 8 + +/* + * Create a contiguous bitmask starting at bit position @l and ending at + * position @h. For example + * GENMASK_ULL(39, 21) gives us the 64bit vector 0x000000ffffe00000. + */ +#define GENMASK(h, l) \ + (((~0UL) - (1UL << (l)) + 1) & (~0UL >> (BITS_PER_LONG - 1 - (h)))) + +#define GENMASK_ULL(h, l) \ + (((~0ULL) - (1ULL << (l)) + 1) & \ + (~0ULL >> (BITS_PER_LONG_LONG - 1 - (h)))) + +#endif /* __LINUX_BITS_H */ --- a/tools/perf/check-headers.sh +++ b/tools/perf/check-headers.sh @@ -14,6 +14,7 @@ include/uapi/linux/sched.h include/uapi/linux/stat.h include/uapi/linux/vhost.h include/uapi/sound/asound.h +include/linux/bits.h include/linux/hash.h include/uapi/linux/hw_breakpoint.h arch/x86/include/asm/disabled-features.h
From: Katsuhiro Suzuki katsuhiro@katsuster.net
commit 24d6638302b48328a58c13439276d4531af4ca7d upstream.
This patch adds SNDRV_PCM_INFO_INTERLEAVED into PCM hardware info.
Signed-off-by: Katsuhiro Suzuki katsuhiro@katsuster.net Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/soc/rockchip/rockchip_pcm.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/sound/soc/rockchip/rockchip_pcm.c
+++ b/sound/soc/rockchip/rockchip_pcm.c
@@ -21,7 +21,8 @@ static const struct snd_pcm_hardware snd
 	.info			= SNDRV_PCM_INFO_MMAP |
 				  SNDRV_PCM_INFO_MMAP_VALID |
 				  SNDRV_PCM_INFO_PAUSE |
-				  SNDRV_PCM_INFO_RESUME,
+				  SNDRV_PCM_INFO_RESUME |
+				  SNDRV_PCM_INFO_INTERLEAVED,
 	.period_bytes_min = 32,
 	.period_bytes_max = 8192,
 	.periods_min = 1,
From: Linus Torvalds torvalds@linux-foundation.org
commit b59dfdaef173677b0b7e10f375226c0a1114fd20 upstream.
Commit 9ee3e06610fd ("HID: i2c-hid: override HID descriptors for certain devices") added a new dmi_system_id quirk table to override certain HID report descriptors for some systems that lack them.
But the table wasn't properly terminated, causing the dmi matching to walk off into la-la-land, and starting to treat random data as dmi descriptor pointers, causing boot-time oopses if you were at all unlucky.
Terminate the array.
We really should have some way to just statically check that arrays that should be terminated by an empty entry actually are so. But the HID people really should have caught this themselves, rather than have me deal with an oops during the merge window. Tssk, tssk.
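For illustration, here is the general shape of such a sentinel-terminated table walk, with a simplified stand-in for the DMI structures; without the trailing { } entry the loop has no stopping condition and reads past the end of the array:

#include <stdio.h>
#include <string.h>

struct quirk {
	const char *product;	/* NULL marks the terminator */
	int value;
};

static const struct quirk quirks[] = {
	{ "FlexBook edge11 - M-FBE11", 1 },
	{ "Some other product",        2 },
	{ }	/* terminate list: without this, the walk below
		 * would run off the array into whatever follows */
};

static int lookup(const char *product)
{
	for (const struct quirk *q = quirks; q->product; q++)
		if (!strcmp(q->product, product))
			return q->value;
	return 0;
}

int main(void)
{
	printf("quirk = %d\n", lookup("FlexBook edge11 - M-FBE11"));
	return 0;
}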
Cc: Julian Sax jsbc@gmx.de Cc: Benjamin Tissoires benjamin.tissoires@redhat.com Cc: Jiri Kosina jkosina@suse.cz Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Cc: Ambrož Bizjak abizjak.pro@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hid/i2c-hid/i2c-hid-dmi-quirks.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/hid/i2c-hid/i2c-hid-dmi-quirks.c
+++ b/drivers/hid/i2c-hid/i2c-hid-dmi-quirks.c
@@ -337,7 +337,8 @@ static const struct dmi_system_id i2c_hi
 		DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "FlexBook edge11 - M-FBE11"),
 	},
 	.driver_data = (void *)&sipodev_desc
-	}
+	},
+	{ }	/* Terminate list */
 };
stable-rc/linux-4.19.y boot: 112 boots: 0 failed, 101 passed with 10 offline, 1 untried/unknown (v4.19.36-96-g16ede5319518)
Full Boot Summary: https://kernelci.org/boot/all/job/stable-rc/branch/linux-4.19.y/kernel/v4.19...
Full Build Summary: https://kernelci.org/build/stable-rc/branch/linux-4.19.y/kernel/v4.19.36-96-...
Tree: stable-rc
Branch: linux-4.19.y
Git Describe: v4.19.36-96-g16ede5319518
Git Commit: 16ede5319518ed5379e1b00dc6f928d0aaccfc4b
Git URL: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Tested: 66 unique boards, 24 SoC families, 14 builds out of 206
Offline Platforms:
arm:
    davinci_all_defconfig:
        gcc-7
            da850-lcdk: 1 offline lab

    imx_v6_v7_defconfig:
        gcc-7
            vf610-colibri-eval-v3: 1 offline lab

    multi_v7_defconfig:
        gcc-7
            meson8b-odroidc1: 1 offline lab
            vf610-colibri-eval-v3: 1 offline lab

arm64:

    defconfig:
        gcc-7
            meson-gxbb-odroidc2: 1 offline lab
            meson-gxl-s905d-p230: 1 offline lab
            meson-gxl-s905x-nexbox-a95x: 1 offline lab
            meson-gxl-s905x-p212: 1 offline lab
            meson-gxm-nexbox-a1: 1 offline lab
            rk3399-firefly: 1 offline lab
--- For more info write to info@kernelci.org
On Wed, 24 Apr 2019 at 23:01, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 4.19.37 release. There are 96 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri 26 Apr 2019 05:07:35 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.37-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Summary
------------------------------------------------------------------------

kernel: 4.19.37-rc1
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.19.y
git commit: cca3bd6b49acb02c8bf1a9231efe069edb667d03
git describe: v4.19.36-97-gcca3bd6b49ac
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.19-oe/build/v4.19.36-97...
No regressions (compared to build v4.19.36)
No fixes (compared to build v4.19.36)
Ran 25017 total tests in the following environments and test suites.
Environments
--------------
- dragonboard-410c - arm64
- hi6220-hikey - arm64
- i386
- juno-r2 - arm64
- qemu_arm
- qemu_arm64
- qemu_i386
- qemu_x86_64
- x15 - arm
- x86_64

Test Suites
-----------
* install-android-platform-tools-r2600
* kselftest
* libgpiod
* libhugetlbfs
* ltp-cap_bounds-tests
* ltp-commands-tests
* ltp-containers-tests
* ltp-cpuhotplug-tests
* ltp-cve-tests
* ltp-dio-tests
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* ltp-hugetlb-tests
* ltp-io-tests
* ltp-ipc-tests
* ltp-math-tests
* ltp-mm-tests
* ltp-nptl-tests
* ltp-pty-tests
* ltp-sched-tests
* ltp-securebits-tests
* ltp-syscalls-tests
* ltp-timers-tests
* perf
* spectre-meltdown-checker-test
* v4l2-compliance
* kvm-unit-tests
* ltp-open-posix-tests
* kselftest-vsyscall-mode-native
* kselftest-vsyscall-mode-none
On 24/04/2019 18:09, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.19.37 release. There are 96 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri 26 Apr 2019 05:07:35 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.37-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y and the diffstat can be found below.
thanks,
greg k-h
All tests are passing for Tegra ...
Test results for stable-v4.19:
    12 builds: 12 pass, 0 fail
    22 boots:  22 pass, 0 fail
    32 tests:  32 pass, 0 fail

Linux version: 4.19.37-rc2-gc60b2d4
Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra20-ventana, tegra210, tegra210-p2371-2180, tegra30-cardhu-a04
Cheers Jon
On 4/24/19 11:09 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.19.37 release. There are 96 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri 26 Apr 2019 05:07:35 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.37-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
thanks, -- Shuah
On Wed, Apr 24, 2019 at 07:09:05PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.19.37 release. There are 96 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri 26 Apr 2019 05:07:35 PM UTC. Anything received after that time might be too late.
For v4.19.36-99-gc60b2d434e04:
Build results:
    total: 156 pass: 156 fail: 0
Qemu test results:
    total: 349 pass: 349 fail: 0
Guenter