This is the start of the stable review cycle for the 4.4.156 release. There are 60 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat Sep 15 13:17:29 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.156-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.4.156-rc1
Ethan Lien ethanlien@synology.com btrfs: use correct compare function of dirty_metadata_bytes
Gustavo A. R. Silva gustavo@embeddedor.com ASoC: wm8994: Fix missing break in switch
Martin Schwidefsky schwidefsky@de.ibm.com s390/lib: use expoline for all bcr instructions
Tomas Winkler tomas.winkler@intel.com mei: me: allow runtime pm for platform with D0i3
Nikolay Aleksandrov nikolay@cumulusnetworks.com sch_tbf: fix two null pointer dereferences on init failure
Nikolay Aleksandrov nikolay@cumulusnetworks.com sch_netem: avoid null pointer deref on init failure
Nikolay Aleksandrov nikolay@cumulusnetworks.com sch_hhf: fix null pointer dereference on init failure
Nikolay Aleksandrov nikolay@cumulusnetworks.com sch_multiq: fix double free on init failure
Nikolay Aleksandrov nikolay@cumulusnetworks.com sch_htb: fix crash on init failure
Miklos Szeredi mszeredi@redhat.com ovl: proper cleanup of workdir
Antonio Murdaca amurdaca@redhat.com ovl: override creds with the ones from the superblock mounter
Miklos Szeredi mszeredi@redhat.com ovl: rename is_merge to is_lowest
Marc Zyngier marc.zyngier@arm.com irqchip/gic: Make interrupt ID 1020 invalid
Marc Zyngier marc.zyngier@arm.com irqchip/gic-v3: Add missing barrier to 32bit version of gic_read_iar()
Shanker Donthineni shankerd@codeaurora.org irqchip/gicv3-its: Avoid cache flush beyond ITS_BASERn memory size
Shanker Donthineni shankerd@codeaurora.org irqchip/gicv3-its: Fix memory leak in its_free_tables()
Marc Zyngier marc.zyngier@arm.com irqchip/gic-v3-its: Recompute the number of pages on page size change
Sudeep Holla sudeep.holla@arm.com genirq: Delay incrementing interrupt count if it's disabled/pending
Chas Williams chas3@att.com Fixes: Commit cdbf92675fad ("mm: numa: avoid waiting on freed migrated pages")
Govindarajulu Varadarajan gvaradar@cisco.com enic: do not call enic_change_mtu in enic_probe
Fabio Estevam fabio.estevam@nxp.com Revert "ARM: imx_v6_v7_defconfig: Select ULPI support"
Tyler Hicks tyhicks@canonical.com irda: Only insert new objects into the global database via setsockopt
Tyler Hicks tyhicks@canonical.com irda: Fix memory leak caused by repeated binds of irda socket
Randy Dunlap rdunlap@infradead.org kbuild: make missing $DEPMOD a Warning instead of an Error
Juergen Gross jgross@suse.com x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear
Joel Fernandes (Google) joel@joelfernandes.org debugobjects: Make stack check warning more informative
Qu Wenruo wqu@suse.com btrfs: Don't remove block group that still has pinned down bytes
Qu Wenruo wqu@suse.com btrfs: relocation: Only remove reloc rb_trees if reloc control has been initialized
Misono Tomohiro misono.tomohiro@jp.fujitsu.com btrfs: replace: Reset on-disk dev stats value after replace
Mahesh Salgaonkar mahesh@linux.vnet.ibm.com powerpc/pseries: Avoid using the size greater than RTAS_ERROR_LOG_MAX.
Steve French stfrench@microsoft.com SMB3: Number of requests sent should be displayed for SMB3 not just CIFS
Steve French stfrench@microsoft.com smb3: fix reset of bytes read and written stats
Breno Leitao leitao@debian.org selftests/powerpc: Kill child processes on SIGINT
Ian Abbott abbotti@mev.co.uk staging: comedi: ni_mio_common: fix subdevice flags for PFI subdevice
John Pittman jpittman@redhat.com dm kcopyd: avoid softlockup in run_complete_job
Thomas Petazzoni thomas.petazzoni@bootlin.com PCI: mvebu: Fix I/O space end address calculation
Dan Carpenter dan.carpenter@oracle.com scsi: aic94xx: fix an error code in aic94xx_init()
Stefan Haberland sth@linux.ibm.com s390/dasd: fix hanging offline processing due to canceled worker
Dan Carpenter dan.carpenter@oracle.com powerpc: Fix size calculation using resource_size()
Jean-Philippe Brucker jean-philippe.brucker@arm.com net/9p: fix error path of p9_virtio_probe
Jonas Gorski jonas.gorski@gmail.com irqchip/bcm7038-l1: Hide cpu offline callback when building for !SMP
Aleh Filipovich aleh@vaolix.com platform/x86: asus-nb-wmi: Add keymap entry for lid flip action on UX360
Guenter Roeck linux@roeck-us.net mfd: sm501: Set coherent_dma_mask when creating subdevices
Tan Hu tan.hu@zte.com.cn ipvs: fix race between ip_vs_conn_new() and ip_vs_del_dest()
Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp fs/dcache.c: fix kmemcheck splat at take_dentry_name_snapshot()
Andrey Ryabinin aryabinin@virtuozzo.com mm/fadvise.c: fix signed overflow UBSAN complaint
Randy Dunlap rdunlap@infradead.org scripts: modpost: check memory allocation results
OGAWA Hirofumi hirofumi@mail.parknet.co.jp fat: validate ->i_start before using
Ernesto A. Fernández ernesto.mnd.fernandez@gmail.com hfsplus: fix NULL dereference in hfsplus_lookup()
Arnd Bergmann arnd@arndb.de reiserfs: change j_timestamp type to time64_t
Jann Horn jannh@google.com fork: don't copy inconsistent signal handler state to child
Ernesto A. Fernández ernesto.mnd.fernandez@gmail.com hfs: prevent crash on exit from failed search
Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp hfsplus: don't return 0 when fill_super() failed
Ronnie Sahlberg lsahlber@redhat.com cifs: check if SMB2 PDU size has been padded and suppress the warning
Alexey Kodanev alexey.kodanev@oracle.com vti6: remove !skb->ignore_df check from vti6_xmit()
Florian Westphal fw@strlen.de tcp: do not restart timewait timer on rst reception
Manish Chopra manish.chopra@cavium.com qlge: Fix netdev features configuration.
Doug Berger opendmb@gmail.com net: bcmgenet: use MAC link status for fixed phy
Greg Hackmann ghackmann@android.com staging: android: ion: fix ION_IOC_{MAP,SHARE} use-after-free
Michal Hocko mhocko@suse.cz x86/speculation/l1tf: Fix up pte->pfn conversion for PAE
-------------
Diffstat:
Makefile | 4 +- arch/arm/configs/imx_v6_v7_defconfig | 2 - arch/arm/include/asm/arch_gicv3.h | 1 + arch/powerpc/platforms/pseries/ras.c | 2 +- arch/powerpc/sysdev/mpic_msgr.c | 2 +- arch/s390/lib/mem.S | 9 ++- arch/x86/include/asm/pgtable-3level.h | 7 +- arch/x86/include/asm/pgtable.h | 2 +- drivers/irqchip/irq-bcm7038-l1.c | 4 ++ drivers/irqchip/irq-gic-v3-its.c | 34 ++++++---- drivers/irqchip/irq-gic.c | 2 +- drivers/md/dm-kcopyd.c | 2 + drivers/mfd/sm501.c | 1 + drivers/misc/mei/pci-me.c | 5 +- drivers/net/ethernet/broadcom/genet/bcmgenet.h | 3 + drivers/net/ethernet/broadcom/genet/bcmmii.c | 10 ++- drivers/net/ethernet/cisco/enic/enic_main.c | 2 +- drivers/net/ethernet/qlogic/qlge/qlge_main.c | 23 +++---- drivers/pci/host/pci-mvebu.c | 2 +- drivers/platform/x86/asus-nb-wmi.c | 1 + drivers/s390/block/dasd_eckd.c | 7 +- drivers/scsi/aic94xx/aic94xx_init.c | 4 +- drivers/staging/android/ion/ion.c | 60 ++++++++++------- drivers/staging/comedi/drivers/ni_mio_common.c | 3 +- fs/btrfs/dev-replace.c | 6 ++ fs/btrfs/disk-io.c | 10 +-- fs/btrfs/extent-tree.c | 2 +- fs/btrfs/relocation.c | 23 ++++--- fs/cifs/cifs_debug.c | 8 +++ fs/cifs/smb2misc.c | 7 ++ fs/cifs/smb2pdu.c | 2 +- fs/dcache.c | 3 +- fs/fat/cache.c | 19 ++++-- fs/fat/fat.h | 5 ++ fs/fat/fatent.c | 6 +- fs/hfs/brec.c | 7 +- fs/hfsplus/dir.c | 4 +- fs/hfsplus/super.c | 4 +- fs/overlayfs/copy_up.c | 26 +------ fs/overlayfs/dir.c | 67 ++----------------- fs/overlayfs/overlayfs.h | 3 + fs/overlayfs/readdir.c | 93 ++++++++++++++++++++------ fs/overlayfs/super.c | 20 +++++- fs/reiserfs/reiserfs.h | 2 +- kernel/fork.c | 2 + kernel/irq/chip.c | 8 +-- lib/debugobjects.c | 7 +- mm/fadvise.c | 8 ++- mm/huge_memory.c | 2 +- net/9p/trans_virtio.c | 3 +- net/ipv4/tcp_minisocks.c | 3 +- net/ipv6/ip6_vti.c | 2 +- net/irda/af_irda.c | 13 +++- net/netfilter/ipvs/ip_vs_core.c | 15 +++-- net/sched/sch_hhf.c | 3 + net/sched/sch_htb.c | 5 +- net/sched/sch_multiq.c | 9 +-- net/sched/sch_netem.c | 4 +- net/sched/sch_tbf.c | 5 +- scripts/depmod.sh | 4 +- scripts/mod/modpost.c | 8 +-- sound/soc/codecs/wm8994.c | 1 + tools/testing/selftests/powerpc/harness.c | 18 +++-- 63 files changed, 369 insertions(+), 260 deletions(-)
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michal Hocko mhocko@suse.cz
commit e14d7dfb41f5807a0c1c26a13f2b8ef16af24935 upstream.
Jan has noticed that pte_pfn and co. resp. pfn_pte are incorrect for CONFIG_PAE because phys_addr_t is wider than unsigned long and so the pte_val reps. shift left would get truncated. Fix this up by using proper types.
[Just one chunk, again, needed here. Thanks to Ben and Guenter for finding and fixing this. - gregkh]
Fixes: 6b28baca9b1f ("x86/speculation/l1tf: Protect PROT_NONE PTEs against speculation") Reported-by: Jan Beulich JBeulich@suse.com Signed-off-by: Michal Hocko mhocko@suse.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Vlastimil Babka vbabka@suse.cz Cc: Guenter Roeck linux@roeck-us.net Cc: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/include/asm/pgtable.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -385,7 +385,7 @@ static inline pmd_t pfn_pmd(unsigned lon
static inline pud_t pfn_pud(unsigned long page_nr, pgprot_t pgprot) { - phys_addr_t pfn = page_nr << PAGE_SHIFT; + phys_addr_t pfn = (phys_addr_t)page_nr << PAGE_SHIFT; pfn ^= protnone_mask(pgprot_val(pgprot)); pfn &= PHYSICAL_PUD_PAGE_MASK; return __pud(pfn | massage_pgprot(pgprot));
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Greg Hackmann ghackmann@android.com
The ION_IOC_{MAP,SHARE} ioctls drop and reacquire client->lock several times while operating on one of the client's ion_handles. This creates windows where userspace can call ION_IOC_FREE on the same client with the same handle, and effectively make the kernel drop its own reference. For example:
- thread A: ION_IOC_ALLOC creates an ion_handle with refcount 1 - thread A: starts ION_IOC_MAP and increments the refcount to 2 - thread B: ION_IOC_FREE decrements the refcount to 1 - thread B: ION_IOC_FREE decrements the refcount to 0 and frees the handle - thread A: continues ION_IOC_MAP with a dangling ion_handle * to freed memory
Fix this by holding client->lock for the duration of ION_IOC_{MAP,SHARE}, preventing the concurrent ION_IOC_FREE. Also remove ion_handle_get_by_id(), since there's literally no way to use it safely.
This patch is applied on top of 4.4.y, and applies to older kernels too. 4.9.y was fixed separately. Kernels 4.12 and later are unaffected, since all the underlying ion_handle infrastructure has been ripped out.
Cc: stable@vger.kernel.org # v4.4- Signed-off-by: Greg Hackmann ghackmann@google.com Acked-by: Laura Abbott labbott@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- v2: remove Change-Id line from commit message
drivers/staging/android/ion/ion.c | 60 +++++++++++++++++++++++--------------- 1 file changed, 37 insertions(+), 23 deletions(-)
--- a/drivers/staging/android/ion/ion.c +++ b/drivers/staging/android/ion/ion.c @@ -449,18 +449,6 @@ static struct ion_handle *ion_handle_get return ERR_PTR(-EINVAL); }
-struct ion_handle *ion_handle_get_by_id(struct ion_client *client, - int id) -{ - struct ion_handle *handle; - - mutex_lock(&client->lock); - handle = ion_handle_get_by_id_nolock(client, id); - mutex_unlock(&client->lock); - - return handle; -} - static bool ion_handle_validate(struct ion_client *client, struct ion_handle *handle) { @@ -1138,24 +1126,28 @@ static struct dma_buf_ops dma_buf_ops = .kunmap = ion_dma_buf_kunmap, };
-struct dma_buf *ion_share_dma_buf(struct ion_client *client, - struct ion_handle *handle) +static struct dma_buf *__ion_share_dma_buf(struct ion_client *client, + struct ion_handle *handle, + bool lock_client) { DEFINE_DMA_BUF_EXPORT_INFO(exp_info); struct ion_buffer *buffer; struct dma_buf *dmabuf; bool valid_handle;
- mutex_lock(&client->lock); + if (lock_client) + mutex_lock(&client->lock); valid_handle = ion_handle_validate(client, handle); if (!valid_handle) { WARN(1, "%s: invalid handle passed to share.\n", __func__); - mutex_unlock(&client->lock); + if (lock_client) + mutex_unlock(&client->lock); return ERR_PTR(-EINVAL); } buffer = handle->buffer; ion_buffer_get(buffer); - mutex_unlock(&client->lock); + if (lock_client) + mutex_unlock(&client->lock);
exp_info.ops = &dma_buf_ops; exp_info.size = buffer->size; @@ -1170,14 +1162,21 @@ struct dma_buf *ion_share_dma_buf(struct
return dmabuf; } + +struct dma_buf *ion_share_dma_buf(struct ion_client *client, + struct ion_handle *handle) +{ + return __ion_share_dma_buf(client, handle, true); +} EXPORT_SYMBOL(ion_share_dma_buf);
-int ion_share_dma_buf_fd(struct ion_client *client, struct ion_handle *handle) +static int __ion_share_dma_buf_fd(struct ion_client *client, + struct ion_handle *handle, bool lock_client) { struct dma_buf *dmabuf; int fd;
- dmabuf = ion_share_dma_buf(client, handle); + dmabuf = __ion_share_dma_buf(client, handle, lock_client); if (IS_ERR(dmabuf)) return PTR_ERR(dmabuf);
@@ -1187,8 +1186,19 @@ int ion_share_dma_buf_fd(struct ion_clie
return fd; } + +int ion_share_dma_buf_fd(struct ion_client *client, struct ion_handle *handle) +{ + return __ion_share_dma_buf_fd(client, handle, true); +} EXPORT_SYMBOL(ion_share_dma_buf_fd);
+static int ion_share_dma_buf_fd_nolock(struct ion_client *client, + struct ion_handle *handle) +{ + return __ion_share_dma_buf_fd(client, handle, false); +} + struct ion_handle *ion_import_dma_buf(struct ion_client *client, int fd) { struct dma_buf *dmabuf; @@ -1335,11 +1345,15 @@ static long ion_ioctl(struct file *filp, { struct ion_handle *handle;
- handle = ion_handle_get_by_id(client, data.handle.handle); - if (IS_ERR(handle)) + mutex_lock(&client->lock); + handle = ion_handle_get_by_id_nolock(client, data.handle.handle); + if (IS_ERR(handle)) { + mutex_unlock(&client->lock); return PTR_ERR(handle); - data.fd.fd = ion_share_dma_buf_fd(client, handle); - ion_handle_put(handle); + } + data.fd.fd = ion_share_dma_buf_fd_nolock(client, handle); + ion_handle_put_nolock(handle); + mutex_unlock(&client->lock); if (data.fd.fd < 0) ret = data.fd.fd; break;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Doug Berger opendmb@gmail.com
[ Upstream commit c3c397c1f16c51601a3fac4fe0c63ad8aa85a904 ]
When using the fixed PHY with GENET (e.g. MOCA) the PHY link status can be determined from the internal link status captured by the MAC. This allows the PHY state machine to use the correct link state with the fixed PHY even if MAC link event interrupts are missed when the net device is opened.
Fixes: 8d88c6ebb34c ("net: bcmgenet: enable MoCA link state change detection") Signed-off-by: Doug Berger opendmb@gmail.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/genet/bcmgenet.h | 3 +++ drivers/net/ethernet/broadcom/genet/bcmmii.c | 10 ++++++++-- 2 files changed, 11 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.h +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.h @@ -185,6 +185,9 @@ struct bcmgenet_mib_counters { #define UMAC_MAC1 0x010 #define UMAC_MAX_FRAME_LEN 0x014
+#define UMAC_MODE 0x44 +#define MODE_LINK_STATUS (1 << 5) + #define UMAC_EEE_CTRL 0x064 #define EN_LPI_RX_PAUSE (1 << 0) #define EN_LPI_TX_PFC (1 << 1) --- a/drivers/net/ethernet/broadcom/genet/bcmmii.c +++ b/drivers/net/ethernet/broadcom/genet/bcmmii.c @@ -167,8 +167,14 @@ void bcmgenet_mii_setup(struct net_devic static int bcmgenet_fixed_phy_link_update(struct net_device *dev, struct fixed_phy_status *status) { - if (dev && dev->phydev && status) - status->link = dev->phydev->link; + struct bcmgenet_priv *priv; + u32 reg; + + if (dev && dev->phydev && status) { + priv = netdev_priv(dev); + reg = bcmgenet_umac_readl(priv, UMAC_MODE); + status->link = !!(reg & MODE_LINK_STATUS); + }
return 0; }
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Manish Chopra manish.chopra@cavium.com
[ Upstream commit 6750c87074c5b534d82fdaabb1deb45b8f1f57de ]
qlge_fix_features() is not supposed to modify hardware or driver state, rather it is supposed to only fix requested fetures bits. Currently qlge_fix_features() also goes for interface down and up unnecessarily if there is not even any change in features set.
This patch changes/fixes following -
1) Move reload of interface or device re-config from qlge_fix_features() to qlge_set_features(). 2) Reload of interface in qlge_set_features() only if relevant feature bit (NETIF_F_HW_VLAN_CTAG_RX) is changed. 3) Get rid of qlge_fix_features() since driver is not really required to fix any features bit.
Signed-off-by: Manish manish.chopra@cavium.com Reviewed-by: Benjamin Poirier bpoirier@suse.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/qlogic/qlge/qlge_main.c | 23 ++++++++--------------- 1 file changed, 8 insertions(+), 15 deletions(-)
--- a/drivers/net/ethernet/qlogic/qlge/qlge_main.c +++ b/drivers/net/ethernet/qlogic/qlge/qlge_main.c @@ -2388,26 +2388,20 @@ static int qlge_update_hw_vlan_features( return status; }
-static netdev_features_t qlge_fix_features(struct net_device *ndev, - netdev_features_t features) -{ - int err; - - /* Update the behavior of vlan accel in the adapter */ - err = qlge_update_hw_vlan_features(ndev, features); - if (err) - return err; - - return features; -} - static int qlge_set_features(struct net_device *ndev, netdev_features_t features) { netdev_features_t changed = ndev->features ^ features; + int err; + + if (changed & NETIF_F_HW_VLAN_CTAG_RX) { + /* Update the behavior of vlan accel in the adapter */ + err = qlge_update_hw_vlan_features(ndev, features); + if (err) + return err;
- if (changed & NETIF_F_HW_VLAN_CTAG_RX) qlge_vlan_mode(ndev, features); + }
return 0; } @@ -4720,7 +4714,6 @@ static const struct net_device_ops qlge_ .ndo_set_mac_address = qlge_set_mac_address, .ndo_validate_addr = eth_validate_addr, .ndo_tx_timeout = qlge_tx_timeout, - .ndo_fix_features = qlge_fix_features, .ndo_set_features = qlge_set_features, .ndo_vlan_rx_add_vid = qlge_vlan_rx_add_vid, .ndo_vlan_rx_kill_vid = qlge_vlan_rx_kill_vid,
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
[ Upstream commit 63cc357f7bba6729869565a12df08441a5995d9a ]
RFC 1337 says: ''Ignore RST segments in TIME-WAIT state. If the 2 minute MSL is enforced, this fix avoids all three hazards.''
So with net.ipv4.tcp_rfc1337=1, expected behaviour is to have TIME-WAIT sk expire rather than removing it instantly when a reset is received.
However, Linux will also re-start the TIME-WAIT timer.
This causes connect to fail when tying to re-use ports or very long delays (until syn retry interval exceeds MSL).
packetdrill test case: // Demonstrate bogus rearming of TIME-WAIT timer in rfc1337 mode. `sysctl net.ipv4.tcp_rfc1337=1`
0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 0.000 bind(3, ..., ...) = 0 0.000 listen(3, 1) = 0
0.100 < S 0:0(0) win 29200 <mss 1460,nop,nop,sackOK,nop,wscale 7> 0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7> 0.200 < . 1:1(0) ack 1 win 257 0.200 accept(3, ..., ...) = 4
// Receive first segment 0.310 < P. 1:1001(1000) ack 1 win 46
// Send one ACK 0.310 > . 1:1(0) ack 1001
// read 1000 byte 0.310 read(4, ..., 1000) = 1000
// Application writes 100 bytes 0.350 write(4, ..., 100) = 100 0.350 > P. 1:101(100) ack 1001
// ACK 0.500 < . 1001:1001(0) ack 101 win 257
// close the connection 0.600 close(4) = 0 0.600 > F. 101:101(0) ack 1001 win 244
// Our side is in FIN_WAIT_1 & waits for ack to fin 0.7 < . 1001:1001(0) ack 102 win 244
// Our side is in FIN_WAIT_2 with no outstanding data. 0.8 < F. 1001:1001(0) ack 102 win 244 0.8 > . 102:102(0) ack 1002 win 244
// Our side is now in TIME_WAIT state, send ack for fin. 0.9 < F. 1002:1002(0) ack 102 win 244 0.9 > . 102:102(0) ack 1002 win 244
// Peer reopens with in-window SYN: 1.000 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7>
// Therefore, reply with ACK. 1.000 > . 102:102(0) ack 1002 win 244
// Peer sends RST for this ACK. Normally this RST results // in tw socket removal, but rfc1337=1 setting prevents this. 1.100 < R 1002:1002(0) win 244
// second syn. Due to rfc1337=1 expect another pure ACK. 31.0 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7> 31.0 > . 102:102(0) ack 1002 win 244
// .. and another RST from peer. 31.1 < R 1002:1002(0) win 244 31.2 `echo no timer restart;ss -m -e -a -i -n -t -o state TIME-WAIT`
// third syn after one minute. Time-Wait socket should have expired by now. 63.0 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7>
// so we expect a syn-ack & 3whs to proceed from here on. 63.0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
Without this patch, 'ss' shows restarts of tw timer and last packet is thus just another pure ack, more than one minute later.
This restores the original code from commit 283fd6cf0be690a83 ("Merge in ANK networking jumbo patch") in netdev-vger-cvs.git .
For some reason the else branch was removed/lost in 1f28b683339f7 ("Merge in TCP/UDP optimizations and [..]") and timer restart became unconditional.
Reported-by: Michal Tesar mtesar@redhat.com Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/tcp_minisocks.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -200,8 +200,9 @@ kill: inet_twsk_deschedule_put(tw); return TCP_TW_SUCCESS; } + } else { + inet_twsk_reschedule(tw, TCP_TIMEWAIT_LEN); } - inet_twsk_reschedule(tw, TCP_TIMEWAIT_LEN);
if (tmp_opt.saw_tstamp) { tcptw->tw_ts_recent = tmp_opt.rcv_tsval;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexey Kodanev alexey.kodanev@oracle.com
[ Upstream commit 9f2895461439fda2801a7906fb4c5fb3dbb37a0a ]
Before the commit d6990976af7c ("vti6: fix PMTU caching and reporting on xmit") '!skb->ignore_df' check was always true because the function skb_scrub_packet() was called before it, resetting ignore_df to zero.
In the commit, skb_scrub_packet() was moved below, and now this check can be false for the packet, e.g. when sending it in the two fragments, this prevents successful PMTU updates in such case. The next attempts to send the packet lead to the same tx error. Moreover, vti6 initial MTU value relies on PMTU adjustments.
This issue can be reproduced with the following LTP test script: udp_ipsec_vti.sh -6 -p ah -m tunnel -s 2000
Fixes: ccd740cbc6e0 ("vti6: Add pmtu handling to vti6_xmit.") Signed-off-by: Alexey Kodanev alexey.kodanev@oracle.com Acked-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/ip6_vti.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/ipv6/ip6_vti.c +++ b/net/ipv6/ip6_vti.c @@ -470,7 +470,7 @@ vti6_xmit(struct sk_buff *skb, struct ne }
mtu = dst_mtu(dst); - if (!skb->ignore_df && skb->len > mtu) { + if (skb->len > mtu) { skb_dst(skb)->ops->update_pmtu(dst, NULL, skb, mtu);
if (skb->protocol == htons(ETH_P_IPV6)) {
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ronnie Sahlberg lsahlber@redhat.com
[ Upstream commit e6c47dd0da1e3a484e778046fc10da0b20606a86 ]
Some SMB2/3 servers, Win2016 but possibly others too, adds padding not only between PDUs in a compound but also to the final PDU. This padding extends the PDU to a multiple of 8 bytes.
Check if the unexpected length looks like this might be the case and avoid triggering the log messages for :
"SMB2 server sent bad RFC1001 len %d not %d\n"
Signed-off-by: Ronnie Sahlberg lsahlber@redhat.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/cifs/smb2misc.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/fs/cifs/smb2misc.c +++ b/fs/cifs/smb2misc.c @@ -185,6 +185,13 @@ smb2_check_message(char *buf, unsigned i return 0;
/* + * Some windows servers (win2016) will pad also the final + * PDU in a compound to 8 bytes. + */ + if (((clc_len + 7) & ~7) == len) + return 0; + + /* * MacOS server pads after SMB2.1 write response with 3 bytes * of junk. Other servers match RFC1001 len to actual * SMB2/SMB3 frame length (header + smb2 response specific data)
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp
[ Upstream commit 7464726cb5998846306ed0a7d6714afb2e37b25d ]
syzbot is reporting NULL pointer dereference at mount_fs() [1]. This is because hfsplus_fill_super() is by error returning 0 when hfsplus_fill_super() detected invalid filesystem image, and mount_bdev() is returning NULL because dget(s->s_root) == NULL if s->s_root == NULL, and mount_fs() is accessing root->d_sb because IS_ERR(root) == false if root == NULL. Fix this by returning -EINVAL when hfsplus_fill_super() detected invalid filesystem image.
[1] https://syzkaller.appspot.com/bug?id=21acb6850cecbc960c927229e597158cf35f33d...
Link: http://lkml.kernel.org/r/d83ce31a-874c-dd5b-f790-41405983a5be@I-love.SAKURA.... Signed-off-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Reported-by: syzbot syzbot+01ffaf5d9568dd1609f7@syzkaller.appspotmail.com Reviewed-by: Ernesto A. Fernández ernesto.mnd.fernandez@gmail.com Reviewed-by: Andrew Morton akpm@linux-foundation.org Cc: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/hfsplus/super.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/fs/hfsplus/super.c +++ b/fs/hfsplus/super.c @@ -521,8 +521,10 @@ static int hfsplus_fill_super(struct sup goto out_put_root; if (!hfs_brec_read(&fd, &entry, sizeof(entry))) { hfs_find_exit(&fd); - if (entry.type != cpu_to_be16(HFSPLUS_FOLDER)) + if (entry.type != cpu_to_be16(HFSPLUS_FOLDER)) { + err = -EINVAL; goto out_put_root; + } inode = hfsplus_iget(sb, be32_to_cpu(entry.folder.id)); if (IS_ERR(inode)) { err = PTR_ERR(inode);
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Ernesto A. Fern�ndez" ernesto.mnd.fernandez@gmail.com
[ Upstream commit dc2572791d3a41bab94400af2b6bca9d71ccd303 ]
hfs_find_exit() expects fd->bnode to be NULL after a search has failed. hfs_brec_insert() may instead set it to an error-valued pointer. Fix this to prevent a crash.
Link: http://lkml.kernel.org/r/53d9749a029c41b4016c495fc5838c9dba3afc52.1530294815... Signed-off-by: Ernesto A. Fernández ernesto.mnd.fernandez@gmail.com Cc: Anatoly Trosinenko anatoly.trosinenko@gmail.com Cc: Viacheslav Dubeyko slava@dubeyko.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/hfs/brec.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/fs/hfs/brec.c +++ b/fs/hfs/brec.c @@ -74,9 +74,10 @@ int hfs_brec_insert(struct hfs_find_data if (!fd->bnode) { if (!tree->root) hfs_btree_inc_height(tree); - fd->bnode = hfs_bnode_find(tree, tree->leaf_head); - if (IS_ERR(fd->bnode)) - return PTR_ERR(fd->bnode); + node = hfs_bnode_find(tree, tree->leaf_head); + if (IS_ERR(node)) + return PTR_ERR(node); + fd->bnode = node; fd->record = -1; } new_node = NULL;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jann Horn jannh@google.com
[ Upstream commit 06e62a46bbba20aa5286102016a04214bb446141 ]
Before this change, if a multithreaded process forks while one of its threads is changing a signal handler using sigaction(), the memcpy() in copy_sighand() can race with the struct assignment in do_sigaction(). It isn't clear whether this can cause corruption of the userspace signal handler pointer, but it definitely can cause inconsistency between different fields of struct sigaction.
Take the appropriate spinlock to avoid this.
I have tested that this patch prevents inconsistency between sa_sigaction and sa_flags, which is possible before this patch.
Link: http://lkml.kernel.org/r/20180702145108.73189-1-jannh@google.com Signed-off-by: Jann Horn jannh@google.com Acked-by: Michal Hocko mhocko@suse.com Reviewed-by: Andrew Morton akpm@linux-foundation.org Cc: Rik van Riel riel@redhat.com Cc: "Peter Zijlstra (Intel)" peterz@infradead.org Cc: Kees Cook keescook@chromium.org Cc: Oleg Nesterov oleg@redhat.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/fork.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/kernel/fork.c +++ b/kernel/fork.c @@ -1109,7 +1109,9 @@ static int copy_sighand(unsigned long cl return -ENOMEM;
atomic_set(&sig->count, 1); + spin_lock_irq(¤t->sighand->siglock); memcpy(sig->action, current->sighand->action, sizeof(sig->action)); + spin_unlock_irq(¤t->sighand->siglock); return 0; }
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 8b73ce6a4bae4fe12bcb2c361c0da4183c2e1b6f ]
This uses the deprecated time_t type but is write-only, and could be removed, but as Jeff explains, having a timestamp can be usefule for post-mortem analysis in crash dumps.
In order to remove one of the last instances of time_t, this changes the type to time64_t, same as j_trans_start_time.
Link: http://lkml.kernel.org/r/20180622133315.221210-1-arnd@arndb.de Signed-off-by: Arnd Bergmann arnd@arndb.de Cc: Jan Kara jack@suse.cz Cc: Jeff Mahoney jeffm@suse.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/reiserfs/reiserfs.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/reiserfs/reiserfs.h +++ b/fs/reiserfs/reiserfs.h @@ -270,7 +270,7 @@ struct reiserfs_journal_list {
struct mutex j_commit_mutex; unsigned int j_trans_id; - time_t j_timestamp; + time64_t j_timestamp; /* write-only but useful for crash dump analysis */ struct reiserfs_list_bitmap *j_list_bitmap; struct buffer_head *j_commit_bh; /* commit buffer head */ struct reiserfs_journal_cnode *j_realblock;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Ernesto A. Fern�ndez" ernesto.mnd.fernandez@gmail.com
[ Upstream commit a7ec7a4193a2eb3b5341243fc0b621c1ac9e4ec4 ]
An HFS+ filesystem can be mounted read-only without having a metadata directory, which is needed to support hardlinks. But if the catalog data is corrupted, a directory lookup may still find dentries claiming to be hardlinks.
hfsplus_lookup() does check that ->hidden_dir is not NULL in such a situation, but mistakenly does so after dereferencing it for the first time. Reorder this check to prevent a crash.
This happens when looking up corrupted catalog data (dentry) on a filesystem with no metadata directory (this could only ever happen on a read-only mount). Wen Xu sent the replication steps in detail to the fsdevel list: https://bugzilla.kernel.org/show_bug.cgi?id=200297
Link: http://lkml.kernel.org/r/20180712215344.q44dyrhymm4ajkao@eaf Signed-off-by: Ernesto A. Fernández ernesto.mnd.fernandez@gmail.com Reported-by: Wen Xu wen.xu@gatech.edu Cc: Viacheslav Dubeyko slava@dubeyko.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/hfsplus/dir.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/hfsplus/dir.c +++ b/fs/hfsplus/dir.c @@ -77,13 +77,13 @@ again: cpu_to_be32(HFSP_HARDLINK_TYPE) && entry.file.user_info.fdCreator == cpu_to_be32(HFSP_HFSPLUS_CREATOR) && + HFSPLUS_SB(sb)->hidden_dir && (entry.file.create_date == HFSPLUS_I(HFSPLUS_SB(sb)->hidden_dir)-> create_date || entry.file.create_date == HFSPLUS_I(d_inode(sb->s_root))-> - create_date) && - HFSPLUS_SB(sb)->hidden_dir) { + create_date)) { struct qstr str; char name[32];
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: OGAWA Hirofumi hirofumi@mail.parknet.co.jp
[ Upstream commit 0afa9626667c3659ef8bd82d42a11e39fedf235c ]
On corrupted FATfs may have invalid ->i_start. To handle it, this checks ->i_start before using, and return proper error code.
Link: http://lkml.kernel.org/r/87o9f8y1t5.fsf_-_@mail.parknet.co.jp Signed-off-by: OGAWA Hirofumi hirofumi@mail.parknet.co.jp Reported-by: Anatoly Trosinenko anatoly.trosinenko@gmail.com Tested-by: Anatoly Trosinenko anatoly.trosinenko@gmail.com Cc: Alan Cox gnomes@lxorguk.ukuu.org.uk Cc: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fat/cache.c | 19 ++++++++++++------- fs/fat/fat.h | 5 +++++ fs/fat/fatent.c | 6 +++--- 3 files changed, 20 insertions(+), 10 deletions(-)
--- a/fs/fat/cache.c +++ b/fs/fat/cache.c @@ -224,7 +224,8 @@ static inline void cache_init(struct fat int fat_get_cluster(struct inode *inode, int cluster, int *fclus, int *dclus) { struct super_block *sb = inode->i_sb; - const int limit = sb->s_maxbytes >> MSDOS_SB(sb)->cluster_bits; + struct msdos_sb_info *sbi = MSDOS_SB(sb); + const int limit = sb->s_maxbytes >> sbi->cluster_bits; struct fat_entry fatent; struct fat_cache_id cid; int nr; @@ -233,6 +234,12 @@ int fat_get_cluster(struct inode *inode,
*fclus = 0; *dclus = MSDOS_I(inode)->i_start; + if (!fat_valid_entry(sbi, *dclus)) { + fat_fs_error_ratelimit(sb, + "%s: invalid start cluster (i_pos %lld, start %08x)", + __func__, MSDOS_I(inode)->i_pos, *dclus); + return -EIO; + } if (cluster == 0) return 0;
@@ -249,9 +256,8 @@ int fat_get_cluster(struct inode *inode, /* prevent the infinite loop of cluster chain */ if (*fclus > limit) { fat_fs_error_ratelimit(sb, - "%s: detected the cluster chain loop" - " (i_pos %lld)", __func__, - MSDOS_I(inode)->i_pos); + "%s: detected the cluster chain loop (i_pos %lld)", + __func__, MSDOS_I(inode)->i_pos); nr = -EIO; goto out; } @@ -261,9 +267,8 @@ int fat_get_cluster(struct inode *inode, goto out; else if (nr == FAT_ENT_FREE) { fat_fs_error_ratelimit(sb, - "%s: invalid cluster chain (i_pos %lld)", - __func__, - MSDOS_I(inode)->i_pos); + "%s: invalid cluster chain (i_pos %lld)", + __func__, MSDOS_I(inode)->i_pos); nr = -EIO; goto out; } else if (nr == FAT_ENT_EOF) { --- a/fs/fat/fat.h +++ b/fs/fat/fat.h @@ -344,6 +344,11 @@ static inline void fatent_brelse(struct fatent->fat_inode = NULL; }
+static inline bool fat_valid_entry(struct msdos_sb_info *sbi, int entry) +{ + return FAT_START_ENT <= entry && entry < sbi->max_cluster; +} + extern void fat_ent_access_init(struct super_block *sb); extern int fat_ent_read(struct inode *inode, struct fat_entry *fatent, int entry); --- a/fs/fat/fatent.c +++ b/fs/fat/fatent.c @@ -23,7 +23,7 @@ static void fat12_ent_blocknr(struct sup { struct msdos_sb_info *sbi = MSDOS_SB(sb); int bytes = entry + (entry >> 1); - WARN_ON(entry < FAT_START_ENT || sbi->max_cluster <= entry); + WARN_ON(!fat_valid_entry(sbi, entry)); *offset = bytes & (sb->s_blocksize - 1); *blocknr = sbi->fat_start + (bytes >> sb->s_blocksize_bits); } @@ -33,7 +33,7 @@ static void fat_ent_blocknr(struct super { struct msdos_sb_info *sbi = MSDOS_SB(sb); int bytes = (entry << sbi->fatent_shift); - WARN_ON(entry < FAT_START_ENT || sbi->max_cluster <= entry); + WARN_ON(!fat_valid_entry(sbi, entry)); *offset = bytes & (sb->s_blocksize - 1); *blocknr = sbi->fat_start + (bytes >> sb->s_blocksize_bits); } @@ -353,7 +353,7 @@ int fat_ent_read(struct inode *inode, st int err, offset; sector_t blocknr;
- if (entry < FAT_START_ENT || sbi->max_cluster <= entry) { + if (!fat_valid_entry(sbi, entry)) { fatent_brelse(fatent); fat_fs_error(sb, "invalid access to FAT (entry 0x%08x)", entry); return -EIO;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Randy Dunlap rdunlap@infradead.org
[ Upstream commit 1f3aa9002dc6a0d59a4b599b4fc8f01cf43ef014 ]
Fix missing error check for memory allocation functions in scripts/mod/modpost.c.
Fixes kernel bugzilla #200319: https://bugzilla.kernel.org/show_bug.cgi?id=200319
Signed-off-by: Randy Dunlap rdunlap@infradead.org Cc: Yuexing Wang wangyxlandq@gmail.com Signed-off-by: Masahiro Yamada yamada.masahiro@socionext.com Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- scripts/mod/modpost.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/scripts/mod/modpost.c +++ b/scripts/mod/modpost.c @@ -649,7 +649,7 @@ static void handle_modversions(struct mo if (ELF_ST_TYPE(sym->st_info) == STT_SPARC_REGISTER) break; if (symname[0] == '.') { - char *munged = strdup(symname); + char *munged = NOFAIL(strdup(symname)); munged[0] = '_'; munged[1] = toupper(munged[1]); symname = munged; @@ -1311,7 +1311,7 @@ static Elf_Sym *find_elf_symbol2(struct static char *sec2annotation(const char *s) { if (match(s, init_exit_sections)) { - char *p = malloc(20); + char *p = NOFAIL(malloc(20)); char *r = p;
*p++ = '_'; @@ -1331,7 +1331,7 @@ static char *sec2annotation(const char * strcat(p, " "); return r; } else { - return strdup(""); + return NOFAIL(strdup("")); } }
@@ -2032,7 +2032,7 @@ void buf_write(struct buffer *buf, const { if (buf->size - buf->pos < len) { buf->size += len + SZ; - buf->p = realloc(buf->p, buf->size); + buf->p = NOFAIL(realloc(buf->p, buf->size)); } strncpy(buf->p + buf->pos, s, len); buf->pos += len;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrey Ryabinin aryabinin@virtuozzo.com
[ Upstream commit a718e28f538441a3b6612da9ff226973376cdf0f ]
Signed integer overflow is undefined according to the C standard. The overflow in ksys_fadvise64_64() is deliberate, but since it is signed overflow, UBSAN complains:
UBSAN: Undefined behaviour in mm/fadvise.c:76:10 signed integer overflow: 4 + 9223372036854775805 cannot be represented in type 'long long int'
Use unsigned types to do math. Unsigned overflow is defined so UBSAN will not complain about it. This patch doesn't change generated code.
[akpm@linux-foundation.org: add comment explaining the casts] Link: http://lkml.kernel.org/r/20180629184453.7614-1-aryabinin@virtuozzo.com Signed-off-by: Andrey Ryabinin aryabinin@virtuozzo.com Reported-by: icytxw@gmail.com Reviewed-by: Andrew Morton akpm@linux-foundation.org Cc: Alexander Potapenko glider@google.com Cc: Dmitry Vyukov dvyukov@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/fadvise.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/mm/fadvise.c +++ b/mm/fadvise.c @@ -68,8 +68,12 @@ SYSCALL_DEFINE4(fadvise64_64, int, fd, l goto out; }
- /* Careful about overflows. Len == 0 means "as much as possible" */ - endbyte = offset + len; + /* + * Careful about overflows. Len == 0 means "as much as possible". Use + * unsigned math because signed overflows are undefined and UBSan + * complains. + */ + endbyte = (u64)offset + (u64)len; if (!len || endbyte < len) endbyte = -1; else
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp
[ Upstream commit 6cd00a01f0c1ae6a852b09c59b8dd55cc6c35d1d ]
Since only dentry->d_name.len + 1 bytes out of DNAME_INLINE_LEN bytes are initialized at __d_alloc(), we can't copy the whole size unconditionally.
WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff8fa27465ac50) 636f6e66696766732e746d70000000000010000000000000020000000188ffff i i i i i i i i i i i i i u u u u u u u u u u i i i i i u u u u ^ RIP: 0010:take_dentry_name_snapshot+0x28/0x50 RSP: 0018:ffffa83000f5bdf8 EFLAGS: 00010246 RAX: 0000000000000020 RBX: ffff8fa274b20550 RCX: 0000000000000002 RDX: ffffa83000f5be40 RSI: ffff8fa27465ac50 RDI: ffffa83000f5be60 RBP: ffffa83000f5bdf8 R08: ffffa83000f5be48 R09: 0000000000000001 R10: ffff8fa27465ac00 R11: ffff8fa27465acc0 R12: ffff8fa27465ac00 R13: ffff8fa27465acc0 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f79737ac8c0(0000) GS:ffffffff8fc30000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff8fa274c0b000 CR3: 0000000134aa7002 CR4: 00000000000606f0 take_dentry_name_snapshot+0x28/0x50 vfs_rename+0x128/0x870 SyS_rename+0x3b2/0x3d0 entry_SYSCALL_64_fastpath+0x1a/0xa4 0xffffffffffffffff
Link: http://lkml.kernel.org/r/201709131912.GBG39012.QMJLOVFSFFOOtH@I-love.SAKURA.... Signed-off-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Cc: Vegard Nossum vegard.nossum@gmail.com Cc: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/dcache.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/dcache.c +++ b/fs/dcache.c @@ -278,7 +278,8 @@ void take_dentry_name_snapshot(struct na spin_unlock(&dentry->d_lock); name->name = p->name; } else { - memcpy(name->inline_name, dentry->d_iname, DNAME_INLINE_LEN); + memcpy(name->inline_name, dentry->d_iname, + dentry->d_name.len + 1); spin_unlock(&dentry->d_lock); name->name = name->inline_name; }
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tan Hu tan.hu@zte.com.cn
[ Upstream commit a53b42c11815d2357e31a9403ae3950517525894 ]
We came across infinite loop in ipvs when using ipvs in docker env.
When ipvs receives new packets and cannot find an ipvs connection, it will create a new connection, then if the dest is unavailable (i.e. IP_VS_DEST_F_AVAILABLE), the packet will be dropped sliently.
But if the dropped packet is the first packet of this connection, the connection control timer never has a chance to start and the ipvs connection cannot be released. This will lead to memory leak, or infinite loop in cleanup_net() when net namespace is released like this:
ip_vs_conn_net_cleanup at ffffffffa0a9f31a [ip_vs] __ip_vs_cleanup at ffffffffa0a9f60a [ip_vs] ops_exit_list at ffffffff81567a49 cleanup_net at ffffffff81568b40 process_one_work at ffffffff810a851b worker_thread at ffffffff810a9356 kthread at ffffffff810b0b6f ret_from_fork at ffffffff81697a18
race condition: CPU1 CPU2 ip_vs_in() ip_vs_conn_new() ip_vs_del_dest() __ip_vs_unlink_dest() ~IP_VS_DEST_F_AVAILABLE cp->dest && !IP_VS_DEST_F_AVAILABLE __ip_vs_conn_put ... cleanup_net ---> infinite looping
Fix this by checking whether the timer already started.
Signed-off-by: Tan Hu tan.hu@zte.com.cn Reviewed-by: Jiang Biao jiang.biao2@zte.com.cn Acked-by: Julian Anastasov ja@ssi.bg Acked-by: Simon Horman horms@verge.net.au Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/netfilter/ipvs/ip_vs_core.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-)
--- a/net/netfilter/ipvs/ip_vs_core.c +++ b/net/netfilter/ipvs/ip_vs_core.c @@ -1809,13 +1809,20 @@ ip_vs_in(struct netns_ipvs *ipvs, unsign if (cp->dest && !(cp->dest->flags & IP_VS_DEST_F_AVAILABLE)) { /* the destination server is not available */
- if (sysctl_expire_nodest_conn(ipvs)) { + __u32 flags = cp->flags; + + /* when timer already started, silently drop the packet.*/ + if (timer_pending(&cp->timer)) + __ip_vs_conn_put(cp); + else + ip_vs_conn_put(cp); + + if (sysctl_expire_nodest_conn(ipvs) && + !(flags & IP_VS_CONN_F_ONE_PACKET)) { /* try to expire the connection immediately */ ip_vs_conn_expire_now(cp); } - /* don't restart its timer, and silently - drop the packet. */ - __ip_vs_conn_put(cp); + return NF_DROP; }
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guenter Roeck linux@roeck-us.net
[ Upstream commit 2f606da78230f09cf1a71fde6ee91d0c710fa2b2 ]
Instantiating the sm501 OHCI subdevice results in a kernel warning.
sm501-usb sm501-usb: SM501 OHCI sm501-usb sm501-usb: new USB bus registered, assigned bus number 1 WARNING: CPU: 0 PID: 1 at ./include/linux/dma-mapping.h:516 ohci_init+0x194/0x2d8 Modules linked in:
CPU: 0 PID: 1 Comm: swapper Tainted: G W 4.18.0-rc7-00178-g0b5b1f9a78b5 #1 PC is at ohci_init+0x194/0x2d8 PR is at ohci_init+0x168/0x2d8 PC : 8c27844c SP : 8f81dd94 SR : 40008001 TEA : 29613060 R0 : 00000000 R1 : 00000000 R2 : 00000000 R3 : 00000202 R4 : 8fa98b88 R5 : 8c277e68 R6 : 00000000 R7 : 00000000 R8 : 8f965814 R9 : 8c388100 R10 : 8fa98800 R11 : 8fa98928 R12 : 8c48302c R13 : 8fa98920 R14 : 8c48302c MACH: 00000096 MACL: 0000017c GBR : 00000000 PR : 8c278420
Call trace: [<(ptrval)>] usb_add_hcd+0x1e8/0x6ec [<(ptrval)>] _dev_info+0x0/0x54 [<(ptrval)>] arch_local_save_flags+0x0/0x8 [<(ptrval)>] arch_local_irq_restore+0x0/0x24 [<(ptrval)>] ohci_hcd_sm501_drv_probe+0x114/0x2d8 ...
Initialize coherent_dma_mask when creating SM501 subdevices to fix the problem.
Fixes: b6d6454fdb66f ("mfd: SM501 core driver") Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Lee Jones lee.jones@linaro.org Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mfd/sm501.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/mfd/sm501.c +++ b/drivers/mfd/sm501.c @@ -714,6 +714,7 @@ sm501_create_subdev(struct sm501_devdata smdev->pdev.name = name; smdev->pdev.id = sm->pdev_id; smdev->pdev.dev.parent = sm->dev; + smdev->pdev.dev.coherent_dma_mask = 0xffffffff;
if (res_count) { smdev->pdev.resource = (struct resource *)(smdev+1);
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aleh Filipovich aleh@vaolix.com
[ Upstream commit 880b29ac107d15644bf4da228376ba3cd6af6d71 ]
Add entry to WMI keymap for lid flip event on Asus UX360.
On Asus Zenbook ux360 flipping lid from/to tablet mode triggers keyscan code 0xfa which cannot be handled and results in kernel log message "Unknown key fa pressed".
Signed-off-by: Aleh Filipovichaleh@appnexus.com Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/platform/x86/asus-nb-wmi.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/platform/x86/asus-nb-wmi.c +++ b/drivers/platform/x86/asus-nb-wmi.c @@ -392,6 +392,7 @@ static const struct key_entry asus_nb_wm { KE_KEY, 0xC4, { KEY_KBDILLUMUP } }, { KE_KEY, 0xC5, { KEY_KBDILLUMDOWN } }, { KE_IGNORE, 0xC6, }, /* Ambient Light Sensor notification */ + { KE_KEY, 0xFA, { KEY_PROG2 } }, /* Lid flip action */ { KE_END, 0}, };
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonas Gorski jonas.gorski@gmail.com
[ Upstream commit 0702bc4d2fe793018ad9aa0eb14bff7f526c4095 ]
When compiling bmips with SMP disabled, the build fails with:
drivers/irqchip/irq-bcm7038-l1.o: In function `bcm7038_l1_cpu_offline': drivers/irqchip/irq-bcm7038-l1.c:242: undefined reference to `irq_set_affinity_locked' make[5]: *** [vmlinux] Error 1
Fix this by adding and setting bcm7038_l1_cpu_offline only when actually compiling for SMP. It wouldn't have been used anyway, as it requires CPU_HOTPLUG, which in turn requires SMP.
Fixes: 34c535793bcb ("irqchip/bcm7038-l1: Implement irq_cpu_offline() callback") Signed-off-by: Jonas Gorski jonas.gorski@gmail.com Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/irqchip/irq-bcm7038-l1.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/irqchip/irq-bcm7038-l1.c +++ b/drivers/irqchip/irq-bcm7038-l1.c @@ -216,6 +216,7 @@ static int bcm7038_l1_set_affinity(struc return 0; }
+#ifdef CONFIG_SMP static void bcm7038_l1_cpu_offline(struct irq_data *d) { struct cpumask *mask = irq_data_get_affinity_mask(d); @@ -240,6 +241,7 @@ static void bcm7038_l1_cpu_offline(struc } irq_set_affinity_locked(d, &new_affinity, false); } +#endif
static int __init bcm7038_l1_init_one(struct device_node *dn, unsigned int idx, @@ -292,7 +294,9 @@ static struct irq_chip bcm7038_l1_irq_ch .irq_mask = bcm7038_l1_mask, .irq_unmask = bcm7038_l1_unmask, .irq_set_affinity = bcm7038_l1_set_affinity, +#ifdef CONFIG_SMP .irq_cpu_offline = bcm7038_l1_cpu_offline, +#endif };
static int bcm7038_l1_map(struct irq_domain *d, unsigned int virq,
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jean-Philippe Brucker jean-philippe.brucker@arm.com
[ Upstream commit 92aef4675d5b1b55404e1532379e343bed0e5cf2 ]
Currently when virtio_find_single_vq fails, we go through del_vqs which throws a warning (Trying to free already-free IRQ). Skip del_vqs if vq allocation failed.
Link: http://lkml.kernel.org/r/20180524101021.49880-1-jean-philippe.brucker@arm.co... Signed-off-by: Jean-Philippe Brucker jean-philippe.brucker@arm.com Reviewed-by: Greg Kurz groug@kaod.org Cc: Eric Van Hensbergen ericvh@gmail.com Cc: Ron Minnich rminnich@sandia.gov Cc: Latchesar Ionkov lucho@ionkov.net Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Dominique Martinet dominique.martinet@cea.fr Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/9p/trans_virtio.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/9p/trans_virtio.c +++ b/net/9p/trans_virtio.c @@ -574,7 +574,7 @@ static int p9_virtio_probe(struct virtio chan->vq = virtio_find_single_vq(vdev, req_done, "requests"); if (IS_ERR(chan->vq)) { err = PTR_ERR(chan->vq); - goto out_free_vq; + goto out_free_chan; } chan->vq->vdev->priv = chan; spin_lock_init(&chan->lock); @@ -627,6 +627,7 @@ out_free_tag: kfree(tag); out_free_vq: vdev->config->del_vqs(vdev); +out_free_chan: kfree(chan); fail: return err;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@oracle.com
[ Upstream commit c42d3be0c06f0c1c416054022aa535c08a1f9b39 ]
The problem is the the calculation should be "end - start + 1" but the plus one is missing in this calculation.
Fixes: 8626816e905e ("powerpc: add support for MPIC message register API") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Reviewed-by: Tyrel Datwyler tyreld@linux.vnet.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/sysdev/mpic_msgr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/powerpc/sysdev/mpic_msgr.c +++ b/arch/powerpc/sysdev/mpic_msgr.c @@ -196,7 +196,7 @@ static int mpic_msgr_probe(struct platfo
/* IO map the message register block. */ of_address_to_resource(np, 0, &rsrc); - msgr_block_addr = ioremap(rsrc.start, rsrc.end - rsrc.start); + msgr_block_addr = ioremap(rsrc.start, resource_size(&rsrc)); if (!msgr_block_addr) { dev_err(&dev->dev, "Failed to iomap MPIC message registers"); return -EFAULT;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stefan Haberland sth@linux.ibm.com
[ Upstream commit 669f3765b755fd8739ab46ce3a9c6292ce8b3d2a ]
During offline processing two worker threads are canceled without freeing the device reference which leads to a hanging offline process.
Reviewed-by: Jan Hoeppner hoeppner@linux.ibm.com Signed-off-by: Stefan Haberland sth@linux.ibm.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/s390/block/dasd_eckd.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/drivers/s390/block/dasd_eckd.c +++ b/drivers/s390/block/dasd_eckd.c @@ -2101,8 +2101,11 @@ static int dasd_eckd_basic_to_ready(stru
static int dasd_eckd_online_to_ready(struct dasd_device *device) { - cancel_work_sync(&device->reload_device); - cancel_work_sync(&device->kick_validate); + if (cancel_work_sync(&device->reload_device)) + dasd_put_device(device); + if (cancel_work_sync(&device->kick_validate)) + dasd_put_device(device); + return 0; };
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@oracle.com
[ Upstream commit 0756c57bce3d26da2592d834d8910b6887021701 ]
We accidentally return success instead of -ENOMEM on this error path.
Fixes: 2908d778ab3e ("[SCSI] aic94xx: new driver") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Reviewed-by: Johannes Thumshirn jthumshirn@suse.de Reviewed-by: John Garry john.garry@huawei.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/aic94xx/aic94xx_init.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/scsi/aic94xx/aic94xx_init.c +++ b/drivers/scsi/aic94xx/aic94xx_init.c @@ -1031,8 +1031,10 @@ static int __init aic94xx_init(void)
aic94xx_transport_template = sas_domain_attach_transport(&aic94xx_transport_functions); - if (!aic94xx_transport_template) + if (!aic94xx_transport_template) { + err = -ENOMEM; goto out_destroy_caches; + }
err = pci_register_driver(&aic94xx_pci_driver); if (err)
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Petazzoni thomas.petazzoni@bootlin.com
[ Upstream commit dfd0309fd7b30a5baffaf47b2fccb88b46d64d69 ]
pcie->realio.end should be the address of last byte of the area, therefore using resource_size() of another resource is not correct, we must substract 1 to get the address of the last byte.
Fixes: 11be65472a427 ("PCI: mvebu: Adapt to the new device tree layout") Signed-off-by: Thomas Petazzoni thomas.petazzoni@bootlin.com Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/host/pci-mvebu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/pci/host/pci-mvebu.c +++ b/drivers/pci/host/pci-mvebu.c @@ -1235,7 +1235,7 @@ static int mvebu_pcie_probe(struct platf pcie->realio.start = PCIBIOS_MIN_IO; pcie->realio.end = min_t(resource_size_t, IO_SPACE_LIMIT, - resource_size(&pcie->io)); + resource_size(&pcie->io) - 1); } else pcie->realio = pcie->io;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: John Pittman jpittman@redhat.com
[ Upstream commit 784c9a29e99eb40b842c29ecf1cc3a79e00fb629 ]
It was reported that softlockups occur when using dm-snapshot ontop of slow (rbd) storage. E.g.:
[ 4047.990647] watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [kworker/10:23:26177] ... [ 4048.034151] Workqueue: kcopyd do_work [dm_mod] [ 4048.034156] RIP: 0010:copy_callback+0x41/0x160 [dm_snapshot] ... [ 4048.034190] Call Trace: [ 4048.034196] ? __chunk_is_tracked+0x70/0x70 [dm_snapshot] [ 4048.034200] run_complete_job+0x5f/0xb0 [dm_mod] [ 4048.034205] process_jobs+0x91/0x220 [dm_mod] [ 4048.034210] ? kcopyd_put_pages+0x40/0x40 [dm_mod] [ 4048.034214] do_work+0x46/0xa0 [dm_mod] [ 4048.034219] process_one_work+0x171/0x370 [ 4048.034221] worker_thread+0x1fc/0x3f0 [ 4048.034224] kthread+0xf8/0x130 [ 4048.034226] ? max_active_store+0x80/0x80 [ 4048.034227] ? kthread_bind+0x10/0x10 [ 4048.034231] ret_from_fork+0x35/0x40 [ 4048.034233] Kernel panic - not syncing: softlockup: hung tasks
Fix this by calling cond_resched() after run_complete_job()'s callout to the dm_kcopyd_notify_fn (which is dm-snap.c:copy_callback in the above trace).
Signed-off-by: John Pittman jpittman@redhat.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/md/dm-kcopyd.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/md/dm-kcopyd.c +++ b/drivers/md/dm-kcopyd.c @@ -454,6 +454,8 @@ static int run_complete_job(struct kcopy if (atomic_dec_and_test(&kc->nr_jobs)) wake_up(&kc->destroyq);
+ cond_resched(); + return 0; }
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ian Abbott abbotti@mev.co.uk
[ Upstream commit e083926b3e269d4064825dcf2ad50c636fddf8cf ]
The PFI subdevice flags indicate that the subdevice is readable and writeable, but that is only true for the supported "M-series" boards, not the older "E-series" boards. Only set the SDF_READABLE and SDF_WRITABLE subdevice flags for the M-series boards. These two flags are mainly for informational purposes.
Signed-off-by: Ian Abbott abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/comedi/drivers/ni_mio_common.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/staging/comedi/drivers/ni_mio_common.c +++ b/drivers/staging/comedi/drivers/ni_mio_common.c @@ -5275,11 +5275,11 @@ static int ni_E_init(struct comedi_devic /* Digital I/O (PFI) subdevice */ s = &dev->subdevices[NI_PFI_DIO_SUBDEV]; s->type = COMEDI_SUBD_DIO; - s->subdev_flags = SDF_READABLE | SDF_WRITABLE | SDF_INTERNAL; s->maxdata = 1; if (devpriv->is_m_series) { s->n_chan = 16; s->insn_bits = ni_pfi_insn_bits; + s->subdev_flags = SDF_READABLE | SDF_WRITABLE | SDF_INTERNAL;
ni_writew(dev, s->state, NI_M_PFI_DO_REG); for (i = 0; i < NUM_PFI_OUTPUT_SELECT_REGS; ++i) { @@ -5288,6 +5288,7 @@ static int ni_E_init(struct comedi_devic } } else { s->n_chan = 10; + s->subdev_flags = SDF_INTERNAL; } s->insn_config = ni_pfi_insn_config;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Breno Leitao leitao@debian.org
[ Upstream commit 7c27a26e1ed5a7dd709aa19685d2c98f64e1cf0c ]
There are some powerpc selftests, as tm/tm-unavailable, that run for a long period (>120 seconds), and if it is interrupted, as pressing CRTL-C (SIGINT), the foreground process (harness) dies but the child process and threads continue to execute (with PPID = 1 now) in background.
In this case, you'd think the whole test exited, but there are remaining threads and processes being executed in background. Sometimes these zombies processes are doing annoying things, as consuming the whole CPU or dumping things to STDOUT.
This patch fixes this problem by attaching an empty signal handler to SIGINT in the harness process. This handler will interrupt (EINTR) the parent process waitpid() call, letting the code to follow through the normal flow, which will kill all the processes in the child process group.
This patch also fixes a typo.
Signed-off-by: Breno Leitao leitao@debian.org Signed-off-by: Gustavo Romero gromero@linux.vnet.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/powerpc/harness.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-)
--- a/tools/testing/selftests/powerpc/harness.c +++ b/tools/testing/selftests/powerpc/harness.c @@ -85,13 +85,13 @@ wait: return status; }
-static void alarm_handler(int signum) +static void sig_handler(int signum) { - /* Jut wake us up from waitpid */ + /* Just wake us up from waitpid */ }
-static struct sigaction alarm_action = { - .sa_handler = alarm_handler, +static struct sigaction sig_action = { + .sa_handler = sig_handler, };
int test_harness(int (test_function)(void), char *name) @@ -101,8 +101,14 @@ int test_harness(int (test_function)(voi test_start(name); test_set_git_version(GIT_VERSION);
- if (sigaction(SIGALRM, &alarm_action, NULL)) { - perror("sigaction"); + if (sigaction(SIGINT, &sig_action, NULL)) { + perror("sigaction (sigint)"); + test_error(name); + return 1; + } + + if (sigaction(SIGALRM, &sig_action, NULL)) { + perror("sigaction (sigalrm)"); test_error(name); return 1; }
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steve French stfrench@microsoft.com
[ Upstream commit c281bc0c7412308c7ec0888904f7c99353da4796 ]
echo 0 > /proc/fs/cifs/Stats is supposed to reset the stats but there were four (see example below) that were not reset (bytes read and witten, total vfs ops and max ops at one time).
... 0 session 0 share reconnects Total vfs operations: 100 maximum at one time: 2
1) \localhost\test SMBs: 0 Bytes read: 502092 Bytes written: 31457286 TreeConnects: 0 total 0 failed TreeDisconnects: 0 total 0 failed ...
This patch fixes cifs_stats_proc_write to properly reset those four.
Signed-off-by: Steve French stfrench@microsoft.com Reviewed-by: Aurelien Aptel aaptel@suse.com Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/cifs/cifs_debug.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/fs/cifs/cifs_debug.c +++ b/fs/cifs/cifs_debug.c @@ -285,6 +285,10 @@ static ssize_t cifs_stats_proc_write(str atomic_set(&totBufAllocCount, 0); atomic_set(&totSmBufAllocCount, 0); #endif /* CONFIG_CIFS_STATS2 */ + spin_lock(&GlobalMid_Lock); + GlobalMaxActiveXid = 0; + GlobalCurrentXid = 0; + spin_unlock(&GlobalMid_Lock); spin_lock(&cifs_tcp_ses_lock); list_for_each(tmp1, &cifs_tcp_ses_list) { server = list_entry(tmp1, struct TCP_Server_Info, @@ -297,6 +301,10 @@ static ssize_t cifs_stats_proc_write(str struct cifs_tcon, tcon_list); atomic_set(&tcon->num_smbs_sent, 0); + spin_lock(&tcon->stat_lock); + tcon->bytes_read = 0; + tcon->bytes_written = 0; + spin_unlock(&tcon->stat_lock); if (server->ops->clear_stats) server->ops->clear_stats(tcon); }
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steve French stfrench@microsoft.com
[ Upstream commit 289131e1f1e6ad8c661ec05e176b8f0915672059 ]
For SMB2/SMB3 the number of requests sent was not displayed in /proc/fs/cifs/Stats unless CONFIG_CIFS_STATS2 was enabled (only number of failed requests displayed). As with earlier dialects, we should be displaying these counters if CONFIG_CIFS_STATS is enabled. They are important for debugging.
e.g. when you cat /proc/fs/cifs/Stats (before the patch) Resources in use CIFS Session: 1 Share (unique mount targets): 2 SMB Request/Response Buffer: 1 Pool size: 5 SMB Small Req/Resp Buffer: 1 Pool size: 30 Operations (MIDs): 0
0 session 0 share reconnects Total vfs operations: 690 maximum at one time: 2
1) \localhost\test SMBs: 975 Negotiates: 0 sent 0 failed SessionSetups: 0 sent 0 failed Logoffs: 0 sent 0 failed TreeConnects: 0 sent 0 failed TreeDisconnects: 0 sent 0 failed Creates: 0 sent 2 failed Closes: 0 sent 0 failed Flushes: 0 sent 0 failed Reads: 0 sent 0 failed Writes: 0 sent 0 failed Locks: 0 sent 0 failed IOCTLs: 0 sent 1 failed Cancels: 0 sent 0 failed Echos: 0 sent 0 failed QueryDirectories: 0 sent 63 failed
Signed-off-by: Steve French stfrench@microsoft.com Reviewed-by: Aurelien Aptel aaptel@suse.com Reviewed-by: Pavel Shilovsky pshilov@microsoft.com Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/cifs/smb2pdu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/cifs/smb2pdu.c +++ b/fs/cifs/smb2pdu.c @@ -315,7 +315,7 @@ small_smb2_init(__le16 smb2_command, str smb2_hdr_assemble((struct smb2_hdr *) *request_buf, smb2_command, tcon);
if (tcon != NULL) { -#ifdef CONFIG_CIFS_STATS2 +#ifdef CONFIG_CIFS_STATS uint16_t com_code = le16_to_cpu(smb2_command); cifs_stats_inc(&tcon->stats.smb2_stats.smb2_com_sent[com_code]); #endif
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mahesh Salgaonkar mahesh@linux.vnet.ibm.com
[ Upstream commit 74e96bf44f430cf7a01de19ba6cf49b361cdfd6e ]
The global mce data buffer that used to copy rtas error log is of 2048 (RTAS_ERROR_LOG_MAX) bytes in size. Before the copy we read extended_log_length from rtas error log header, then use max of extended_log_length and RTAS_ERROR_LOG_MAX as a size of data to be copied. Ideally the platform (phyp) will never send extended error log with size > 2048. But if that happens, then we have a risk of buffer overrun and corruption. Fix this by using min_t instead.
Fixes: d368514c3097 ("powerpc: Fix corruption when grabbing FWNMI data") Reported-by: Michal Suchanek msuchanek@suse.com Signed-off-by: Mahesh Salgaonkar mahesh@linux.vnet.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/platforms/pseries/ras.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/powerpc/platforms/pseries/ras.c +++ b/arch/powerpc/platforms/pseries/ras.c @@ -311,7 +311,7 @@ static struct rtas_error_log *fwnmi_get_ int len, error_log_length;
error_log_length = 8 + rtas_error_extended_log_length(h); - len = max_t(int, error_log_length, RTAS_ERROR_LOG_MAX); + len = min_t(int, error_log_length, RTAS_ERROR_LOG_MAX); memset(global_mce_data_buf, 0, RTAS_ERROR_LOG_MAX); memcpy(global_mce_data_buf, h, len); errhdr = (struct rtas_error_log *)global_mce_data_buf;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Misono Tomohiro misono.tomohiro@jp.fujitsu.com
[ Upstream commit 1e7e1f9e3aba00c9b9c323bfeeddafe69ff21ff6 ]
on-disk devs stats value is updated in btrfs_run_dev_stats(), which is called during commit transaction, if device->dev_stats_ccnt is not zero.
Since current replace operation does not touch dev_stats_ccnt, on-disk dev stats value is not updated. Therefore "btrfs device stats" may return old device's value after umount/mount (Example: See "btrfs ins dump-t -t DEV $DEV" after btrfs/100 finish).
Fix this by just incrementing dev_stats_ccnt in btrfs_dev_replace_finishing() when replace is succeeded and this will update the values.
Signed-off-by: Misono Tomohiro misono.tomohiro@jp.fujitsu.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/dev-replace.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/fs/btrfs/dev-replace.c +++ b/fs/btrfs/dev-replace.c @@ -574,6 +574,12 @@ static int btrfs_dev_replace_finishing(s btrfs_rm_dev_replace_unblocked(fs_info);
/* + * Increment dev_stats_ccnt so that btrfs_run_dev_stats() will + * update on-disk dev stats value during commit transaction + */ + atomic_inc(&tgt_device->dev_stats_ccnt); + + /* * this is again a consistent state where no dev_replace procedure * is running, the target device is part of the filesystem, the * source device is not part of the filesystem anymore and its 1st
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo wqu@suse.com
[ Upstream commit 389305b2aa68723c754f88d9dbd268a400e10664 ]
Invalid reloc tree can cause kernel NULL pointer dereference when btrfs does some cleanup of the reloc roots.
It turns out that fs_info::reloc_ctl can be NULL in btrfs_recover_relocation() as we allocate relocation control after all reloc roots have been verified. So when we hit: note, we haven't called set_reloc_control() thus fs_info::reloc_ctl is still NULL.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199833 Reported-by: Xu Wen wen.xu@gatech.edu Signed-off-by: Qu Wenruo wqu@suse.com Tested-by: Gu Jinxiang gujx@cn.fujitsu.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/relocation.c | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-)
--- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -1318,18 +1318,19 @@ static void __del_reloc_root(struct btrf struct mapping_node *node = NULL; struct reloc_control *rc = root->fs_info->reloc_ctl;
- spin_lock(&rc->reloc_root_tree.lock); - rb_node = tree_search(&rc->reloc_root_tree.rb_root, - root->node->start); - if (rb_node) { - node = rb_entry(rb_node, struct mapping_node, rb_node); - rb_erase(&node->rb_node, &rc->reloc_root_tree.rb_root); + if (rc) { + spin_lock(&rc->reloc_root_tree.lock); + rb_node = tree_search(&rc->reloc_root_tree.rb_root, + root->node->start); + if (rb_node) { + node = rb_entry(rb_node, struct mapping_node, rb_node); + rb_erase(&node->rb_node, &rc->reloc_root_tree.rb_root); + } + spin_unlock(&rc->reloc_root_tree.lock); + if (!node) + return; + BUG_ON((struct btrfs_root *)node->data != root); } - spin_unlock(&rc->reloc_root_tree.lock); - - if (!node) - return; - BUG_ON((struct btrfs_root *)node->data != root);
spin_lock(&root->fs_info->trans_lock); list_del_init(&root->root_list);
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo wqu@suse.com
[ Upstream commit 43794446548730ac8461be30bbe47d5d027d1d16 ]
[BUG] Under certain KVM load and LTP tests, it is possible to hit the following calltrace if quota is enabled:
BTRFS critical (device vda2): unable to find logical 8820195328 length 4096 BTRFS critical (device vda2): unable to find logical 8820195328 length 4096
WARNING: CPU: 0 PID: 49 at ../block/blk-core.c:172 blk_status_to_errno+0x1a/0x30 CPU: 0 PID: 49 Comm: kworker/u2:1 Not tainted 4.12.14-15-default #1 SLE15 (unreleased) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs] task: ffff9f827b340bc0 task.stack: ffffb4f8c0304000 RIP: 0010:blk_status_to_errno+0x1a/0x30 Call Trace: submit_extent_page+0x191/0x270 [btrfs] ? btrfs_create_repair_bio+0x130/0x130 [btrfs] __do_readpage+0x2d2/0x810 [btrfs] ? btrfs_create_repair_bio+0x130/0x130 [btrfs] ? run_one_async_done+0xc0/0xc0 [btrfs] __extent_read_full_page+0xe7/0x100 [btrfs] ? run_one_async_done+0xc0/0xc0 [btrfs] read_extent_buffer_pages+0x1ab/0x2d0 [btrfs] ? run_one_async_done+0xc0/0xc0 [btrfs] btree_read_extent_buffer_pages+0x94/0xf0 [btrfs] read_tree_block+0x31/0x60 [btrfs] read_block_for_search.isra.35+0xf0/0x2e0 [btrfs] btrfs_search_slot+0x46b/0xa00 [btrfs] ? kmem_cache_alloc+0x1a8/0x510 ? btrfs_get_token_32+0x5b/0x120 [btrfs] find_parent_nodes+0x11d/0xeb0 [btrfs] ? leaf_space_used+0xb8/0xd0 [btrfs] ? btrfs_leaf_free_space+0x49/0x90 [btrfs] ? btrfs_find_all_roots_safe+0x93/0x100 [btrfs] btrfs_find_all_roots_safe+0x93/0x100 [btrfs] btrfs_find_all_roots+0x45/0x60 [btrfs] btrfs_qgroup_trace_extent_post+0x20/0x40 [btrfs] btrfs_add_delayed_data_ref+0x1a3/0x1d0 [btrfs] btrfs_alloc_reserved_file_extent+0x38/0x40 [btrfs] insert_reserved_file_extent.constprop.71+0x289/0x2e0 [btrfs] btrfs_finish_ordered_io+0x2f4/0x7f0 [btrfs] ? pick_next_task_fair+0x2cd/0x530 ? __switch_to+0x92/0x4b0 btrfs_worker_helper+0x81/0x300 [btrfs] process_one_work+0x1da/0x3f0 worker_thread+0x2b/0x3f0 ? process_one_work+0x3f0/0x3f0 kthread+0x11a/0x130 ? kthread_create_on_node+0x40/0x40 ret_from_fork+0x35/0x40
BTRFS critical (device vda2): unable to find logical 8820195328 length 16384 BTRFS: error (device vda2) in btrfs_finish_ordered_io:3023: errno=-5 IO failure BTRFS info (device vda2): forced readonly BTRFS error (device vda2): pending csums is 2887680
[CAUSE] It's caused by race with block group auto removal:
- There is a meta block group X, which has only one tree block The tree block belongs to fs tree 257. - In current transaction, some operation modified fs tree 257 The tree block gets COWed, so the block group X is empty, and marked as unused, queued to be deleted. - Some workload (like fsync) wakes up cleaner_kthread() Which will call btrfs_delete_unused_bgs() to remove unused block groups. So block group X along its chunk map get removed. - Some delalloc work finished for fs tree 257 Quota needs to get the original reference of the extent, which will read tree blocks of commit root of 257. Then since the chunk map gets removed, the above warning gets triggered.
[FIX] Just let btrfs_delete_unused_bgs() skip block group which still has pinned bytes.
However there is a minor side effect: currently we only queue empty blocks at update_block_group(), and such empty block group with pinned bytes won't go through update_block_group() again, such block group won't be removed, until it gets new extent allocated and removed.
Signed-off-by: Qu Wenruo wqu@suse.com Reviewed-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/extent-tree.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -10410,7 +10410,7 @@ void btrfs_delete_unused_bgs(struct btrf /* Don't want to race with allocators so take the groups_sem */ down_write(&space_info->groups_sem); spin_lock(&block_group->lock); - if (block_group->reserved || + if (block_group->reserved || block_group->pinned || btrfs_block_group_used(&block_group->item) || block_group->ro || list_is_singular(&block_group->list)) {
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joel Fernandes (Google) joel@joelfernandes.org
commit fc91a3c4c27acdca0bc13af6fbb68c35cfd519f2 upstream.
While debugging an issue debugobject tracking warned about an annotation issue of an object on stack. It turned out that the issue was due to the object in concern being on a different stack which was due to another issue.
Thomas suggested to print the pointers and the location of the stack for the currently running task. This helped to figure out that the object was on the wrong stack.
As this is general useful information for debugging similar issues, make the error message more informative by printing the pointers.
[ tglx: Massaged changelog ]
Signed-off-by: Joel Fernandes (Google) joel@joelfernandes.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Waiman Long longman@redhat.com Acked-by: Yang Shi yang.shi@linux.alibaba.com Cc: kernel-team@android.com Cc: Arnd Bergmann arnd@arndb.de Cc: astrachan@google.com Link: https://lkml.kernel.org/r/20180723212531.202328-1-joel@joelfernandes.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- lib/debugobjects.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/lib/debugobjects.c +++ b/lib/debugobjects.c @@ -295,9 +295,12 @@ static void debug_object_is_on_stack(voi
limit++; if (is_on_stack) - pr_warn("object is on stack, but not annotated\n"); + pr_warn("object %p is on stack %p, but NOT annotated.\n", addr, + task_stack_page(current)); else - pr_warn("object is not on stack, but annotated\n"); + pr_warn("object %p is NOT on stack %p, but annotated.\n", addr, + task_stack_page(current)); + WARN_ON(1); }
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Juergen Gross jgross@suse.com
commit b2d7a075a1ccef2fb321d595802190c8e9b39004 upstream.
Using only 32-bit writes for the pte will result in an intermediate L1TF vulnerable PTE. When running as a Xen PV guest this will at once switch the guest to shadow mode resulting in a loss of performance.
Use arch_atomic64_xchg() instead which will perform the requested operation atomically with all 64 bits.
Some performance considerations according to:
https://software.intel.com/sites/default/files/managed/ad/dc/Intel-Xeon-Scal...
The main number should be the latency, as there is no tight loop around native_ptep_get_and_clear().
"lock cmpxchg8b" has a latency of 20 cycles, while "lock xchg" (with a memory operand) isn't mentioned in that document. "lock xadd" (with xadd having 3 cycles less latency than xchg) has a latency of 11, so we can assume a latency of 14 for "lock xchg".
Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Jan Beulich jbeulich@suse.com Tested-by: Jason Andryuk jandryuk@gmail.com Signed-off-by: Boris Ostrovsky boris.ostrovsky@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/include/asm/pgtable-3level.h | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
--- a/arch/x86/include/asm/pgtable-3level.h +++ b/arch/x86/include/asm/pgtable-3level.h @@ -1,6 +1,8 @@ #ifndef _ASM_X86_PGTABLE_3LEVEL_H #define _ASM_X86_PGTABLE_3LEVEL_H
+#include <asm/atomic64_32.h> + /* * Intel Physical Address Extension (PAE) Mode - three-level page * tables on PPro+ CPUs. @@ -142,10 +144,7 @@ static inline pte_t native_ptep_get_and_ { pte_t res;
- /* xchg acts as a barrier before the setting of the high bits */ - res.pte_low = xchg(&ptep->pte_low, 0); - res.pte_high = ptep->pte_high; - ptep->pte_high = 0; + res.pte = (pteval_t)arch_atomic64_xchg((atomic64_t *)ptep, 0);
return res; }
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Randy Dunlap rdunlap@infradead.org
commit 914b087ff9e0e9a399a4927fa30793064afc0178 upstream.
When $DEPMOD is not found, only print a warning instead of exiting with an error message and error status:
Warning: 'make modules_install' requires /sbin/depmod. Please install it. This is probably in the kmod package.
Change the Error to a Warning because "not all build hosts for cross compiling Linux are Linux systems and are able to provide a working port of depmod, especially at the file patch /sbin/depmod."
I.e., "make modules_install" may be used to copy/install the loadable modules files to a target directory on a build system and then transferred to an embedded device where /sbin/depmod is run instead of it being run on the build system.
Fixes: 934193a654c1 ("kbuild: verify that $DEPMOD is installed") Signed-off-by: Randy Dunlap rdunlap@infradead.org Reported-by: H. Nikolaus Schaller hns@goldelico.com Cc: stable@vger.kernel.org Cc: Lucas De Marchi lucas.demarchi@profusion.mobi Cc: Lucas De Marchi lucas.de.marchi@gmail.com Cc: Michal Marek michal.lkml@markovi.net Cc: Jessica Yu jeyu@kernel.org Cc: Chih-Wei Huang cwhuang@linux.org.tw Signed-off-by: Masahiro Yamada yamada.masahiro@socionext.com Signed-off-by: Maxim Zhukov mussitantesmortem@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- scripts/depmod.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/scripts/depmod.sh +++ b/scripts/depmod.sh @@ -15,9 +15,9 @@ if ! test -r System.map ; then fi
if [ -z $(command -v $DEPMOD) ]; then - echo "'make modules_install' requires $DEPMOD. Please install it." >&2 + echo "Warning: 'make modules_install' requires $DEPMOD. Please install it." >&2 echo "This is probably in the kmod package." >&2 - exit 1 + exit 0 fi
# older versions of depmod don't support -P <symbol-prefix>
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tyler Hicks tyhicks@canonical.com
The irda_bind() function allocates memory for self->ias_obj without checking to see if the socket is already bound. A userspace process could repeatedly bind the socket, have each new object added into the LM-IAS database, and lose the reference to the old object assigned to the socket to exhaust memory resources. This patch errors out of the bind operation when self->ias_obj is already assigned.
CVE-2018-6554
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Tyler Hicks tyhicks@canonical.com Reviewed-by: Seth Arnold seth.arnold@canonical.com Reviewed-by: Stefan Bader stefan.bader@canonical.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/irda/af_irda.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/net/irda/af_irda.c +++ b/net/irda/af_irda.c @@ -774,6 +774,13 @@ static int irda_bind(struct socket *sock return -EINVAL;
lock_sock(sk); + + /* Ensure that the socket is not already bound */ + if (self->ias_obj) { + err = -EINVAL; + goto out; + } + #ifdef CONFIG_IRDA_ULTRA /* Special care for Ultra sockets */ if ((sk->sk_type == SOCK_DGRAM) &&
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tyler Hicks tyhicks@canonical.com
The irda_setsockopt() function conditionally allocates memory for a new self->ias_object or, in some cases, reuses the existing self->ias_object. Existing objects were incorrectly reinserted into the LM_IAS database which corrupted the doubly linked list used for the hashbin implementation of the LM_IAS database. When combined with a memory leak in irda_bind(), this issue could be leveraged to create a use-after-free vulnerability in the hashbin list. This patch fixes the issue by only inserting newly allocated objects into the database.
CVE-2018-6555
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Tyler Hicks tyhicks@canonical.com Reviewed-by: Seth Arnold seth.arnold@canonical.com Reviewed-by: Stefan Bader stefan.bader@canonical.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/irda/af_irda.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/net/irda/af_irda.c +++ b/net/irda/af_irda.c @@ -2027,7 +2027,11 @@ static int irda_setsockopt(struct socket err = -EINVAL; goto out; } - irias_insert_object(ias_obj); + + /* Only insert newly allocated objects */ + if (free_ias) + irias_insert_object(ias_obj); + kfree(ias_opt); break; case IRLMP_IAS_DEL:
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Fabio Estevam fabio.estevam@nxp.com
This reverts commit 0d0af17ae83d6feb29d676c72423461419df5110.
This commit causes reboot to fail on imx6 wandboard, so let's revert it.
Cc: stable@vger.kernel.org #4.4 Reported-by: Rasmus Villemoes rasmus.villemoes@prevas.dk Signed-off-by: Fabio Estevam fabio.estevam@nxp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/configs/imx_v6_v7_defconfig | 2 -- 1 file changed, 2 deletions(-)
--- a/arch/arm/configs/imx_v6_v7_defconfig +++ b/arch/arm/configs/imx_v6_v7_defconfig @@ -261,7 +261,6 @@ CONFIG_USB_STORAGE=y CONFIG_USB_CHIPIDEA=y CONFIG_USB_CHIPIDEA_UDC=y CONFIG_USB_CHIPIDEA_HOST=y -CONFIG_USB_CHIPIDEA_ULPI=y CONFIG_USB_SERIAL=m CONFIG_USB_SERIAL_GENERIC=y CONFIG_USB_SERIAL_FTDI_SIO=m @@ -288,7 +287,6 @@ CONFIG_USB_G_NCM=m CONFIG_USB_GADGETFS=m CONFIG_USB_MASS_STORAGE=m CONFIG_USB_G_SERIAL=m -CONFIG_USB_ULPI_BUS=y CONFIG_MMC=y CONFIG_MMC_SDHCI=y CONFIG_MMC_SDHCI_PLTFM=y
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Govindarajulu Varadarajan gvaradar@cisco.com
commit cb5c6568867325f9905e80c96531d963bec8e5ea upstream.
In commit ab123fe071c9 ("enic: handle mtu change for vf properly") ASSERT_RTNL() is added to _enic_change_mtu() to prevent it from being called without rtnl held. enic_probe() calls enic_change_mtu() without rtnl held. At this point netdev is not registered yet. Remove call to enic_change_mtu and assign the mtu to netdev->mtu.
Fixes: ab123fe071c9 ("enic: handle mtu change for vf properly") Signed-off-by: Govindarajulu Varadarajan gvaradar@cisco.com Signed-off-by: David S. Miller davem@davemloft.net Cc: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/ethernet/cisco/enic/enic_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/cisco/enic/enic_main.c +++ b/drivers/net/ethernet/cisco/enic/enic_main.c @@ -2683,7 +2683,6 @@ static int enic_probe(struct pci_dev *pd */
enic->port_mtu = enic->config.mtu; - (void)enic_change_mtu(netdev, enic->port_mtu);
err = enic_set_mac_addr(netdev, enic->mac_addr); if (err) { @@ -2732,6 +2731,7 @@ static int enic_probe(struct pci_dev *pd netdev->features |= NETIF_F_HIGHDMA;
netdev->priv_flags |= IFF_UNICAST_FLT; + netdev->mtu = enic->port_mtu;
err = register_netdev(netdev); if (err) {
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chas Williams chas3@att.com
Commit cdbf92675fad ("mm: numa: avoid waiting on freed migrated pages") was an incomplete backport of the upstream commit. It is necessary to always reset page_nid before attempting any early exit.
The original commit conflicted due to lack of commit 82b0f8c39a38 ("mm: join struct fault_env and vm_fault") in 4.9 so it wasn't a clean application, and the change must have just gotten lost in the noise.
Signed-off-by: Chas Williams chas3@att.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/huge_memory.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1393,12 +1393,12 @@ int do_huge_pmd_numa_page(struct mm_stru
/* Migration could have started since the pmd_trans_migrating check */ if (!page_locked) { + page_nid = -1; if (!get_page_unless_zero(page)) goto out_unlock; spin_unlock(ptl); wait_on_page_locked(page); put_page(page); - page_nid = -1; goto out; }
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sudeep Holla sudeep.holla@arm.com
commit a946e8c717f9355d1abd5408ed0adc0002d1aed1 upstream.
In case of a wakeup interrupt, irq_pm_check_wakeup disables the interrupt and marks it pending and suspended, disables it and notifies the pm core about the wake event. The interrupt gets handled later once the system is resumed.
However the irq stats is updated twice: once when it's disabled waiting for the system to resume and later when it's handled, resulting in wrong counting of the wakeup interrupt when waking up the system.
This patch updates the interrupt count so that it's updated only when the interrupt gets handled. It's already handled correctly in handle_edge_irq and handle_edge_eoi_irq.
Reported-by: Manoil Claudiu claudiu.manoil@freescale.com Signed-off-by: Sudeep Holla sudeep.holla@arm.com Cc: Marc Zyngier marc.zyngier@arm.com Link: http://lkml.kernel.org/r/1446661957-1019-1-git-send-email-sudeep.holla@arm.c... Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Hanjun Guo hanjun.guo@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/irq/chip.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/kernel/irq/chip.c +++ b/kernel/irq/chip.c @@ -338,7 +338,6 @@ void handle_nested_irq(unsigned int irq) raw_spin_lock_irq(&desc->lock);
desc->istate &= ~(IRQS_REPLAY | IRQS_WAITING); - kstat_incr_irqs_this_cpu(desc);
action = desc->action; if (unlikely(!action || irqd_irq_disabled(&desc->irq_data))) { @@ -346,6 +345,7 @@ void handle_nested_irq(unsigned int irq) goto out_unlock; }
+ kstat_incr_irqs_this_cpu(desc); irqd_set(&desc->irq_data, IRQD_IRQ_INPROGRESS); raw_spin_unlock_irq(&desc->lock);
@@ -412,13 +412,13 @@ void handle_simple_irq(struct irq_desc * goto out_unlock;
desc->istate &= ~(IRQS_REPLAY | IRQS_WAITING); - kstat_incr_irqs_this_cpu(desc);
if (unlikely(!desc->action || irqd_irq_disabled(&desc->irq_data))) { desc->istate |= IRQS_PENDING; goto out_unlock; }
+ kstat_incr_irqs_this_cpu(desc); handle_irq_event(desc);
out_unlock: @@ -462,7 +462,6 @@ void handle_level_irq(struct irq_desc *d goto out_unlock;
desc->istate &= ~(IRQS_REPLAY | IRQS_WAITING); - kstat_incr_irqs_this_cpu(desc);
/* * If its disabled or no action available @@ -473,6 +472,7 @@ void handle_level_irq(struct irq_desc *d goto out_unlock; }
+ kstat_incr_irqs_this_cpu(desc); handle_irq_event(desc);
cond_unmask_irq(desc); @@ -532,7 +532,6 @@ void handle_fasteoi_irq(struct irq_desc goto out;
desc->istate &= ~(IRQS_REPLAY | IRQS_WAITING); - kstat_incr_irqs_this_cpu(desc);
/* * If its disabled or no action available @@ -544,6 +543,7 @@ void handle_fasteoi_irq(struct irq_desc goto out; }
+ kstat_incr_irqs_this_cpu(desc); if (desc->istate & IRQS_ONESHOT) mask_irq(desc);
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Zyngier marc.zyngier@arm.com
commit 18aa60ce2751c95d3412ed06a58b8b6cfb6f88f2 upstream.
When the programming of a GITS_BASERn register fails because of an unsupported ITS page size, we retry it with a smaller page size. Unfortunately, we don't recompute the number of allocated ITS pages, indicating the wrong value computed in the original allocation.
A convenient fix is to free the pages we allocated, update the page size, and restart the allocation. This will ensure that we always allocate the right amount in the case of a device table, specially if we have to reduce the allocation order to stay within the boundaries of the ITS maximum allocation.
Reported-and-tested-by: Ma Jun majun258@huawei.com Signed-off-by: Marc Zyngier marc.zyngier@arm.com Cc: linux-arm-kernel@lists.infradead.org Cc: Jason Cooper jason@lakedaemon.net Link: http://lkml.kernel.org/r/1453818255-1289-1-git-send-email-marc.zyngier@arm.c... Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Hanjun Guo hanjun.guo@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/irqchip/irq-gic-v3-its.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -884,6 +884,7 @@ static int its_alloc_tables(const char * }
alloc_size = (1 << order) * PAGE_SIZE; +retry_alloc_baser: alloc_pages = (alloc_size / psz); if (alloc_pages > GITS_BASER_PAGES_MAX) { alloc_pages = GITS_BASER_PAGES_MAX; @@ -947,13 +948,16 @@ retry_baser: * size and retry. If we reach 4K, then * something is horribly wrong... */ + free_pages((unsigned long)base, order); + its->tables[i] = NULL; + switch (psz) { case SZ_16K: psz = SZ_4K; - goto retry_baser; + goto retry_alloc_baser; case SZ_64K: psz = SZ_16K; - goto retry_baser; + goto retry_alloc_baser; } }
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shanker Donthineni shankerd@codeaurora.org
commit 1a485f4d2e28efd77075b2952926683d6c245633 upstream.
The current ITS driver has a memory leak in its_free_tables(). It happens on tear down path of the driver when its_probe() call fails. its_free_tables() should free the exact number of pages that have been allocated, not just a single page as current code does.
This patch records the memory size for each ITS_BASERn at the time of page allocation and uses the same size information when freeing pages to fix the issue.
Signed-off-by: Shanker Donthineni shankerd@codeaurora.org Acked-by: Marc Zyngier marc.zyngier@arm.com Cc: Jason Cooper jason@lakedaemon.net Cc: Vikram Sethi vikrams@codeaurora.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1454379584-21772-1-git-send-email-shankerd@codeauro... Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Hanjun Guo hanjun.guo@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/irqchip/irq-gic-v3-its.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-)
--- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -67,7 +67,10 @@ struct its_node { unsigned long phys_base; struct its_cmd_block *cmd_base; struct its_cmd_block *cmd_write; - void *tables[GITS_BASER_NR_REGS]; + struct { + void *base; + u32 order; + } tables[GITS_BASER_NR_REGS]; struct its_collection *collections; struct list_head its_device_list; u64 flags; @@ -816,9 +819,10 @@ static void its_free_tables(struct its_n int i;
for (i = 0; i < GITS_BASER_NR_REGS; i++) { - if (its->tables[i]) { - free_page((unsigned long)its->tables[i]); - its->tables[i] = NULL; + if (its->tables[i].base) { + free_pages((unsigned long)its->tables[i].base, + its->tables[i].order); + its->tables[i].base = NULL; } } } @@ -899,7 +903,8 @@ retry_alloc_baser: goto out_free; }
- its->tables[i] = base; + its->tables[i].base = base; + its->tables[i].order = order;
retry_baser: val = (virt_to_phys(base) | @@ -949,7 +954,7 @@ retry_baser: * something is horribly wrong... */ free_pages((unsigned long)base, order); - its->tables[i] = NULL; + its->tables[i].base = NULL;
switch (psz) { case SZ_16K:
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shanker Donthineni shankerd@codeaurora.org
commit 2eca0d6ceea1f108b2d3ac81fb34698c4fd41006 upstream.
Function its_alloc_tables() maintains two local variables, "order" and and "alloc_size", to hold memory size that has been allocated to ITS_BASEn. We don't always refresh the variable alloc_size whenever value of the variable order changes, causing the following two problems.
- Cache flush operation with size more than required. - Information reported by pr_info is not correct.
Use a helper macro that converts page order to size in bytes instead of variable "alloc_size" to fix both the problems.
Signed-off-by: Shanker Donthineni shankerd@codeaurora.org Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Hanjun Guo hanjun.guo@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/irqchip/irq-gic-v3-its.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
--- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -80,6 +80,9 @@ struct its_node {
#define ITS_ITT_ALIGN SZ_256
+/* Convert page order to size in bytes */ +#define PAGE_ORDER_TO_SIZE(o) (PAGE_SIZE << (o)) + struct event_lpi_map { unsigned long *lpi_map; u16 *col_map; @@ -855,7 +858,6 @@ static int its_alloc_tables(const char * u64 type = GITS_BASER_TYPE(val); u64 entry_size = GITS_BASER_ENTRY_SIZE(val); int order = get_order(psz); - int alloc_size; int alloc_pages; u64 tmp; void *base; @@ -887,9 +889,8 @@ static int its_alloc_tables(const char * } }
- alloc_size = (1 << order) * PAGE_SIZE; retry_alloc_baser: - alloc_pages = (alloc_size / psz); + alloc_pages = (PAGE_ORDER_TO_SIZE(order) / psz); if (alloc_pages > GITS_BASER_PAGES_MAX) { alloc_pages = GITS_BASER_PAGES_MAX; order = get_order(GITS_BASER_PAGES_MAX * psz); @@ -942,7 +943,7 @@ retry_baser: shr = tmp & GITS_BASER_SHAREABILITY_MASK; if (!shr) { cache = GITS_BASER_nC; - __flush_dcache_area(base, alloc_size); + __flush_dcache_area(base, PAGE_ORDER_TO_SIZE(order)); } goto retry_baser; } @@ -975,7 +976,7 @@ retry_baser: }
pr_info("ITS: allocated %d %s @%lx (psz %dK, shr %d)\n", - (int)(alloc_size / entry_size), + (int)(PAGE_ORDER_TO_SIZE(order) / entry_size), its_base_type_string[type], (unsigned long)virt_to_phys(base), psz / SZ_1K, (int)shr >> GITS_BASER_SHAREABILITY_SHIFT);
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Zyngier marc.zyngier@arm.com
commit 8f318526a292c5e7cebb82f3f766b83c22343293 upstream.
Commit 1a1ebd5 ("irqchip/gic-v3: Make sure read from ICC_IAR1_EL1 is visible on redestributor") fixed the missing barrier on arm64, but forgot to update the 32bit counterpart, which has the same requirements. Let's fix it.
Fixes: 1a1ebd5 ("irqchip/gic-v3: Make sure read from ICC_IAR1_EL1 is visible on redestributor") Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Hanjun Guo hanjun.guo@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm/include/asm/arch_gicv3.h | 1 + 1 file changed, 1 insertion(+)
--- a/arch/arm/include/asm/arch_gicv3.h +++ b/arch/arm/include/asm/arch_gicv3.h @@ -117,6 +117,7 @@ static inline u32 gic_read_iar(void) u32 irqstat;
asm volatile("mrc " __stringify(ICC_IAR1) : "=r" (irqstat)); + dsb(sy); return irqstat; }
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Zyngier marc.zyngier@arm.com
commit 327ebe1f3a9b7e20e298b39d0cff627169a28012 upstream.
The GIC has no such thing as interrupt 1020: the last valid ID is 1019, and the range 1020-1023 is reserved - 1023 indicating that no interrupt is pending. So let's make sure we don't try to handle this ID.
This bug has been in since the initial GIC code was introduced in 8ad68bbf7a06 ("[ARM] Add support for ARM RealView board").
Reported-by: Eric Auger eric.auger@linaro.org Cc: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Hanjun Guo hanjun.guo@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/irqchip/irq-gic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/irqchip/irq-gic.c +++ b/drivers/irqchip/irq-gic.c @@ -336,7 +336,7 @@ static void __exception_irq_entry gic_ha irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK); irqnr = irqstat & GICC_IAR_INT_ID_MASK;
- if (likely(irqnr > 15 && irqnr < 1021)) { + if (likely(irqnr > 15 && irqnr < 1020)) { if (static_key_true(&supports_deactivate)) writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI); handle_domain_irq(gic->domain, irqnr, regs);
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miklos Szeredi mszeredi@redhat.com
commit 56656e960b555cb98bc414382566dcb59aae99a2 upstream.
The 'is_merge' is an historical naming from when only a single lower layer could exist. With the introduction of multiple lower layers the meaning of this flag was changed to mean only the "lowest layer" (while all lower layers were being merged).
So now 'is_merge' is inaccurate and hence renaming to 'is_lowest'
Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: SZ Lin (林上智) sz.lin@moxa.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/overlayfs/readdir.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-)
--- a/fs/overlayfs/readdir.c +++ b/fs/overlayfs/readdir.c @@ -36,7 +36,7 @@ struct ovl_dir_cache {
struct ovl_readdir_data { struct dir_context ctx; - bool is_merge; + bool is_lowest; struct rb_root root; struct list_head *list; struct list_head middle; @@ -140,9 +140,9 @@ static int ovl_cache_entry_add_rb(struct return 0; }
-static int ovl_fill_lower(struct ovl_readdir_data *rdd, - const char *name, int namelen, - loff_t offset, u64 ino, unsigned int d_type) +static int ovl_fill_lowest(struct ovl_readdir_data *rdd, + const char *name, int namelen, + loff_t offset, u64 ino, unsigned int d_type) { struct ovl_cache_entry *p;
@@ -194,10 +194,10 @@ static int ovl_fill_merge(struct dir_con container_of(ctx, struct ovl_readdir_data, ctx);
rdd->count++; - if (!rdd->is_merge) + if (!rdd->is_lowest) return ovl_cache_entry_add_rb(rdd, name, namelen, ino, d_type); else - return ovl_fill_lower(rdd, name, namelen, offset, ino, d_type); + return ovl_fill_lowest(rdd, name, namelen, offset, ino, d_type); }
static int ovl_check_whiteouts(struct dentry *dir, struct ovl_readdir_data *rdd) @@ -290,7 +290,7 @@ static int ovl_dir_read_merged(struct de .ctx.actor = ovl_fill_merge, .list = list, .root = RB_ROOT, - .is_merge = false, + .is_lowest = false, }; int idx, next;
@@ -307,7 +307,7 @@ static int ovl_dir_read_merged(struct de * allows offsets to be reasonably constant */ list_add(&rdd.middle, rdd.list); - rdd.is_merge = true; + rdd.is_lowest = true; err = ovl_dir_read(&realpath, &rdd); list_del(&rdd.middle); }
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Antonio Murdaca amurdaca@redhat.com
commit 3fe6e52f062643676eb4518d68cee3bc1272091b upstream.
In user namespace the whiteout creation fails with -EPERM because the current process isn't capable(CAP_SYS_ADMIN) when setting xattr.
A simple reproducer:
$ mkdir upper lower work merged lower/dir $ sudo mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merged $ unshare -m -p -f -U -r bash
Now as root in the user namespace:
# touch merged/dir/{1,2,3} # this will force a copy up of lower/dir # rm -fR merged/*
This ends up failing with -EPERM after the files in dir has been correctly deleted:
unlinkat(4, "2", 0) = 0 unlinkat(4, "1", 0) = 0 unlinkat(4, "3", 0) = 0 close(4) = 0 unlinkat(AT_FDCWD, "merged/dir", AT_REMOVEDIR) = -1 EPERM (Operation not permitted)
Interestingly, if you don't place files in merged/dir you can remove it, meaning if upper/dir does not exist, creating the char device file works properly in that same location.
This patch uses ovl_sb_creator_cred() to get the cred struct from the superblock mounter and override the old cred with these new ones so that the whiteout creation is possible because overlay is wrong in assuming that the creds it will get with prepare_creds will be in the initial user namespace. The old cap_raise game is removed in favor of just overriding the old cred struct.
This patch also drops from ovl_copy_up_one() the following two lines:
override_cred->fsuid = stat->uid; override_cred->fsgid = stat->gid;
This is because the correct uid and gid are taken directly with the stat struct and correctly set with ovl_set_attr().
Signed-off-by: Antonio Murdaca runcom@redhat.com Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: SZ Lin (林上智) sz.lin@moxa.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/overlayfs/copy_up.c | 26 ------------------ fs/overlayfs/dir.c | 67 +++-------------------------------------------- fs/overlayfs/overlayfs.h | 1 fs/overlayfs/readdir.c | 14 ++------- fs/overlayfs/super.c | 18 +++++++++++- 5 files changed, 27 insertions(+), 99 deletions(-)
--- a/fs/overlayfs/copy_up.c +++ b/fs/overlayfs/copy_up.c @@ -317,7 +317,6 @@ int ovl_copy_up_one(struct dentry *paren struct dentry *upperdir; struct dentry *upperdentry; const struct cred *old_cred; - struct cred *override_cred; char *link = NULL;
if (WARN_ON(!workdir)) @@ -336,28 +335,7 @@ int ovl_copy_up_one(struct dentry *paren return PTR_ERR(link); }
- err = -ENOMEM; - override_cred = prepare_creds(); - if (!override_cred) - goto out_free_link; - - override_cred->fsuid = stat->uid; - override_cred->fsgid = stat->gid; - /* - * CAP_SYS_ADMIN for copying up extended attributes - * CAP_DAC_OVERRIDE for create - * CAP_FOWNER for chmod, timestamp update - * CAP_FSETID for chmod - * CAP_CHOWN for chown - * CAP_MKNOD for mknod - */ - cap_raise(override_cred->cap_effective, CAP_SYS_ADMIN); - cap_raise(override_cred->cap_effective, CAP_DAC_OVERRIDE); - cap_raise(override_cred->cap_effective, CAP_FOWNER); - cap_raise(override_cred->cap_effective, CAP_FSETID); - cap_raise(override_cred->cap_effective, CAP_CHOWN); - cap_raise(override_cred->cap_effective, CAP_MKNOD); - old_cred = override_creds(override_cred); + old_cred = ovl_override_creds(dentry->d_sb);
err = -EIO; if (lock_rename(workdir, upperdir) != NULL) { @@ -380,9 +358,7 @@ int ovl_copy_up_one(struct dentry *paren out_unlock: unlock_rename(workdir, upperdir); revert_creds(old_cred); - put_cred(override_cred);
-out_free_link: if (link) free_page((unsigned long) link);
--- a/fs/overlayfs/dir.c +++ b/fs/overlayfs/dir.c @@ -408,28 +408,13 @@ static int ovl_create_or_link(struct den err = ovl_create_upper(dentry, inode, &stat, link, hardlink); } else { const struct cred *old_cred; - struct cred *override_cred;
- err = -ENOMEM; - override_cred = prepare_creds(); - if (!override_cred) - goto out_iput; - - /* - * CAP_SYS_ADMIN for setting opaque xattr - * CAP_DAC_OVERRIDE for create in workdir, rename - * CAP_FOWNER for removing whiteout from sticky dir - */ - cap_raise(override_cred->cap_effective, CAP_SYS_ADMIN); - cap_raise(override_cred->cap_effective, CAP_DAC_OVERRIDE); - cap_raise(override_cred->cap_effective, CAP_FOWNER); - old_cred = override_creds(override_cred); + old_cred = ovl_override_creds(dentry->d_sb);
err = ovl_create_over_whiteout(dentry, inode, &stat, link, hardlink);
revert_creds(old_cred); - put_cred(override_cred); }
if (!err) @@ -659,32 +644,11 @@ static int ovl_do_remove(struct dentry * if (OVL_TYPE_PURE_UPPER(type)) { err = ovl_remove_upper(dentry, is_dir); } else { - const struct cred *old_cred; - struct cred *override_cred; - - err = -ENOMEM; - override_cred = prepare_creds(); - if (!override_cred) - goto out_drop_write; - - /* - * CAP_SYS_ADMIN for setting xattr on whiteout, opaque dir - * CAP_DAC_OVERRIDE for create in workdir, rename - * CAP_FOWNER for removing whiteout from sticky dir - * CAP_FSETID for chmod of opaque dir - * CAP_CHOWN for chown of opaque dir - */ - cap_raise(override_cred->cap_effective, CAP_SYS_ADMIN); - cap_raise(override_cred->cap_effective, CAP_DAC_OVERRIDE); - cap_raise(override_cred->cap_effective, CAP_FOWNER); - cap_raise(override_cred->cap_effective, CAP_FSETID); - cap_raise(override_cred->cap_effective, CAP_CHOWN); - old_cred = override_creds(override_cred); + const struct cred *old_cred = ovl_override_creds(dentry->d_sb);
err = ovl_remove_and_whiteout(dentry, is_dir);
revert_creds(old_cred); - put_cred(override_cred); } out_drop_write: ovl_drop_write(dentry); @@ -723,7 +687,6 @@ static int ovl_rename2(struct inode *old bool new_is_dir = false; struct dentry *opaquedir = NULL; const struct cred *old_cred = NULL; - struct cred *override_cred = NULL;
err = -EINVAL; if (flags & ~(RENAME_EXCHANGE | RENAME_NOREPLACE)) @@ -792,26 +755,8 @@ static int ovl_rename2(struct inode *old old_opaque = !OVL_TYPE_PURE_UPPER(old_type); new_opaque = !OVL_TYPE_PURE_UPPER(new_type);
- if (old_opaque || new_opaque) { - err = -ENOMEM; - override_cred = prepare_creds(); - if (!override_cred) - goto out_drop_write; - - /* - * CAP_SYS_ADMIN for setting xattr on whiteout, opaque dir - * CAP_DAC_OVERRIDE for create in workdir - * CAP_FOWNER for removing whiteout from sticky dir - * CAP_FSETID for chmod of opaque dir - * CAP_CHOWN for chown of opaque dir - */ - cap_raise(override_cred->cap_effective, CAP_SYS_ADMIN); - cap_raise(override_cred->cap_effective, CAP_DAC_OVERRIDE); - cap_raise(override_cred->cap_effective, CAP_FOWNER); - cap_raise(override_cred->cap_effective, CAP_FSETID); - cap_raise(override_cred->cap_effective, CAP_CHOWN); - old_cred = override_creds(override_cred); - } + if (old_opaque || new_opaque) + old_cred = ovl_override_creds(old->d_sb);
if (overwrite && OVL_TYPE_MERGE_OR_LOWER(new_type) && new_is_dir) { opaquedir = ovl_check_empty_and_clear(new); @@ -942,10 +887,8 @@ out_dput_old: out_unlock: unlock_rename(new_upperdir, old_upperdir); out_revert_creds: - if (old_opaque || new_opaque) { + if (old_opaque || new_opaque) revert_creds(old_cred); - put_cred(override_cred); - } out_drop_write: ovl_drop_write(old); out: --- a/fs/overlayfs/overlayfs.h +++ b/fs/overlayfs/overlayfs.h @@ -150,6 +150,7 @@ void ovl_drop_write(struct dentry *dentr bool ovl_dentry_is_opaque(struct dentry *dentry); void ovl_dentry_set_opaque(struct dentry *dentry, bool opaque); bool ovl_is_whiteout(struct dentry *dentry); +const struct cred *ovl_override_creds(struct super_block *sb); void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry); struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags); --- a/fs/overlayfs/readdir.c +++ b/fs/overlayfs/readdir.c @@ -36,6 +36,7 @@ struct ovl_dir_cache {
struct ovl_readdir_data { struct dir_context ctx; + struct dentry *dentry; bool is_lowest; struct rb_root root; struct list_head *list; @@ -206,17 +207,8 @@ static int ovl_check_whiteouts(struct de struct ovl_cache_entry *p; struct dentry *dentry; const struct cred *old_cred; - struct cred *override_cred; - - override_cred = prepare_creds(); - if (!override_cred) - return -ENOMEM;
- /* - * CAP_DAC_OVERRIDE for lookup - */ - cap_raise(override_cred->cap_effective, CAP_DAC_OVERRIDE); - old_cred = override_creds(override_cred); + old_cred = ovl_override_creds(rdd->dentry->d_sb);
err = mutex_lock_killable(&dir->d_inode->i_mutex); if (!err) { @@ -232,7 +224,6 @@ static int ovl_check_whiteouts(struct de mutex_unlock(&dir->d_inode->i_mutex); } revert_creds(old_cred); - put_cred(override_cred);
return err; } @@ -288,6 +279,7 @@ static int ovl_dir_read_merged(struct de struct path realpath; struct ovl_readdir_data rdd = { .ctx.actor = ovl_fill_merge, + .dentry = dentry, .list = list, .root = RB_ROOT, .is_lowest = false, --- a/fs/overlayfs/super.c +++ b/fs/overlayfs/super.c @@ -42,6 +42,8 @@ struct ovl_fs { long lower_namelen; /* pathnames of lower and upper dirs, for show_options */ struct ovl_config config; + /* creds of process who forced instantiation of super block */ + const struct cred *creator_cred; };
struct ovl_dir_cache; @@ -246,6 +248,13 @@ bool ovl_is_whiteout(struct dentry *dent return inode && IS_WHITEOUT(inode); }
+const struct cred *ovl_override_creds(struct super_block *sb) +{ + struct ovl_fs *ofs = sb->s_fs_info; + + return override_creds(ofs->creator_cred); +} + static bool ovl_is_opaquedir(struct dentry *dentry) { int res; @@ -587,6 +596,7 @@ static void ovl_put_super(struct super_b kfree(ufs->config.lowerdir); kfree(ufs->config.upperdir); kfree(ufs->config.workdir); + put_cred(ufs->creator_cred); kfree(ufs); }
@@ -1107,10 +1117,14 @@ static int ovl_fill_super(struct super_b else sb->s_d_op = &ovl_dentry_operations;
+ ufs->creator_cred = prepare_creds(); + if (!ufs->creator_cred) + goto out_put_lower_mnt; + err = -ENOMEM; oe = ovl_alloc_entry(numlower); if (!oe) - goto out_put_lower_mnt; + goto out_put_cred;
root_dentry = d_make_root(ovl_new_inode(sb, S_IFDIR, oe)); if (!root_dentry) @@ -1143,6 +1157,8 @@ static int ovl_fill_super(struct super_b
out_free_oe: kfree(oe); +out_put_cred: + put_cred(ufs->creator_cred); out_put_lower_mnt: for (i = 0; i < ufs->numlower; i++) mntput(ufs->lower_mnt[i]);
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miklos Szeredi mszeredi@redhat.com
commit eea2fb4851e9dcbab6b991aaf47e2e024f1f55a0 upstream.
When mounting overlayfs it needs a clean "work" directory under the supplied workdir.
Previously the mount code removed this directory if it already existed and created a new one. If the removal failed (e.g. directory was not empty) then it fell back to a read-only mount not using the workdir.
While this has never been reported, it is possible to get a non-empty "work" dir from a previous mount of overlayfs in case of crash in the middle of an operation using the work directory.
In this case the left over state should be discarded and the overlay filesystem will be consistent, guaranteed by the atomicity of operations on moving to/from the workdir to the upper layer.
This patch implements cleaning out any files left in workdir. It is implemented using real recursion for simplicity, but the depth is limited to 2, because the worst case is that of a directory containing whiteouts under "work".
Signed-off-by: Miklos Szeredi mszeredi@redhat.com Cc: stable@vger.kernel.org Signed-off-by: SZ Lin (林上智) sz.lin@moxa.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/overlayfs/overlayfs.h | 2 + fs/overlayfs/readdir.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++- fs/overlayfs/super.c | 2 - 3 files changed, 65 insertions(+), 2 deletions(-)
--- a/fs/overlayfs/overlayfs.h +++ b/fs/overlayfs/overlayfs.h @@ -165,6 +165,8 @@ int ovl_check_empty_dir(struct dentry *d void ovl_cleanup_whiteouts(struct dentry *upper, struct list_head *list); void ovl_cache_free(struct list_head *list); int ovl_check_d_type_supported(struct path *realpath); +void ovl_workdir_cleanup(struct inode *dir, struct vfsmount *mnt, + struct dentry *dentry, int level);
/* inode.c */ int ovl_setattr(struct dentry *dentry, struct iattr *attr); --- a/fs/overlayfs/readdir.c +++ b/fs/overlayfs/readdir.c @@ -248,7 +248,7 @@ static inline int ovl_dir_read(struct pa err = rdd->err; } while (!err && rdd->count);
- if (!err && rdd->first_maybe_whiteout) + if (!err && rdd->first_maybe_whiteout && rdd->dentry) err = ovl_check_whiteouts(realpath->dentry, rdd);
fput(realfile); @@ -610,3 +610,64 @@ int ovl_check_d_type_supported(struct pa
return rdd.d_type_supported; } + +static void ovl_workdir_cleanup_recurse(struct path *path, int level) +{ + int err; + struct inode *dir = path->dentry->d_inode; + LIST_HEAD(list); + struct ovl_cache_entry *p; + struct ovl_readdir_data rdd = { + .ctx.actor = ovl_fill_merge, + .dentry = NULL, + .list = &list, + .root = RB_ROOT, + .is_lowest = false, + }; + + err = ovl_dir_read(path, &rdd); + if (err) + goto out; + + inode_lock_nested(dir, I_MUTEX_PARENT); + list_for_each_entry(p, &list, l_node) { + struct dentry *dentry; + + if (p->name[0] == '.') { + if (p->len == 1) + continue; + if (p->len == 2 && p->name[1] == '.') + continue; + } + dentry = lookup_one_len(p->name, path->dentry, p->len); + if (IS_ERR(dentry)) + continue; + if (dentry->d_inode) + ovl_workdir_cleanup(dir, path->mnt, dentry, level); + dput(dentry); + } + inode_unlock(dir); +out: + ovl_cache_free(&list); +} + +void ovl_workdir_cleanup(struct inode *dir, struct vfsmount *mnt, + struct dentry *dentry, int level) +{ + int err; + + if (!d_is_dir(dentry) || level > 1) { + ovl_cleanup(dir, dentry); + return; + } + + err = ovl_do_rmdir(dir, dentry); + if (err) { + struct path path = { .mnt = mnt, .dentry = dentry }; + + inode_unlock(dir); + ovl_workdir_cleanup_recurse(&path, level + 1); + inode_lock_nested(dir, I_MUTEX_PARENT); + ovl_cleanup(dir, dentry); + } +} --- a/fs/overlayfs/super.c +++ b/fs/overlayfs/super.c @@ -784,7 +784,7 @@ retry: goto out_dput;
retried = true; - ovl_cleanup(dir, work); + ovl_workdir_cleanup(dir, mnt, work, 0); dput(work); goto retry; }
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nikolay Aleksandrov nikolay@cumulusnetworks.com
commit 88c2ace69dbef696edba77712882af03879abc9c upstream.
The commit below added a call to the ->destroy() callback for all qdiscs which failed in their ->init(), but some were not prepared for such change and can't handle partially initialized qdisc. HTB is one of them and if any error occurs before the qdisc watchdog timer and qdisc work are initialized then we can hit either a null ptr deref (timer->base) when canceling in ->destroy or lockdep error info about trying to register a non-static key and a stack dump. So to fix these two move the watchdog timer and workqueue init before anything that can err out. To reproduce userspace needs to send broken htb qdisc create request, tested with a modified tc (q_htb.c).
Trace log: [ 2710.897602] BUG: unable to handle kernel NULL pointer dereference at (null) [ 2710.897977] IP: hrtimer_active+0x17/0x8a [ 2710.898174] PGD 58fab067 [ 2710.898175] P4D 58fab067 [ 2710.898353] PUD 586c0067 [ 2710.898531] PMD 0 [ 2710.898710] [ 2710.899045] Oops: 0000 [#1] SMP [ 2710.899232] Modules linked in: [ 2710.899419] CPU: 1 PID: 950 Comm: tc Not tainted 4.13.0-rc6+ #54 [ 2710.899646] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014 [ 2710.900035] task: ffff880059ed2700 task.stack: ffff88005ad4c000 [ 2710.900262] RIP: 0010:hrtimer_active+0x17/0x8a [ 2710.900467] RSP: 0018:ffff88005ad4f960 EFLAGS: 00010246 [ 2710.900684] RAX: 0000000000000000 RBX: ffff88003701e298 RCX: 0000000000000000 [ 2710.900933] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88003701e298 [ 2710.901177] RBP: ffff88005ad4f980 R08: 0000000000000001 R09: 0000000000000001 [ 2710.901419] R10: ffff88005ad4f800 R11: 0000000000000400 R12: 0000000000000000 [ 2710.901663] R13: ffff88003701e298 R14: ffffffff822a4540 R15: ffff88005ad4fac0 [ 2710.901907] FS: 00007f2f5e90f740(0000) GS:ffff88005d880000(0000) knlGS:0000000000000000 [ 2710.902277] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2710.902500] CR2: 0000000000000000 CR3: 0000000058ca3000 CR4: 00000000000406e0 [ 2710.902744] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 2710.902977] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 2710.903180] Call Trace: [ 2710.903332] hrtimer_try_to_cancel+0x1a/0x93 [ 2710.903504] hrtimer_cancel+0x15/0x20 [ 2710.903667] qdisc_watchdog_cancel+0x12/0x14 [ 2710.903866] htb_destroy+0x2e/0xf7 [ 2710.904097] qdisc_create+0x377/0x3fd [ 2710.904330] tc_modify_qdisc+0x4d2/0x4fd [ 2710.904511] rtnetlink_rcv_msg+0x188/0x197 [ 2710.904682] ? rcu_read_unlock+0x3e/0x5f [ 2710.904849] ? rtnl_newlink+0x729/0x729 [ 2710.905017] netlink_rcv_skb+0x6c/0xce [ 2710.905183] rtnetlink_rcv+0x23/0x2a [ 2710.905345] netlink_unicast+0x103/0x181 [ 2710.905511] netlink_sendmsg+0x326/0x337 [ 2710.905679] sock_sendmsg_nosec+0x14/0x3f [ 2710.905847] sock_sendmsg+0x29/0x2e [ 2710.906010] ___sys_sendmsg+0x209/0x28b [ 2710.906176] ? do_raw_spin_unlock+0xcd/0xf8 [ 2710.906346] ? _raw_spin_unlock+0x27/0x31 [ 2710.906514] ? __handle_mm_fault+0x651/0xdb1 [ 2710.906685] ? check_chain_key+0xb0/0xfd [ 2710.906855] __sys_sendmsg+0x45/0x63 [ 2710.907018] ? __sys_sendmsg+0x45/0x63 [ 2710.907185] SyS_sendmsg+0x19/0x1b [ 2710.907344] entry_SYSCALL_64_fastpath+0x23/0xc2
Note that probably this bug goes further back because the default qdisc handling always calls ->destroy on init failure too.
Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation") Fixes: 0fbbeb1ba43b ("[PKT_SCHED]: Fix missing qdisc_destroy() in qdisc_create_dflt()") Signed-off-by: Nikolay Aleksandrov nikolay@cumulusnetworks.com Signed-off-by: David S. Miller davem@davemloft.net [AmitP: Rebased for linux-4.4.y] Signed-off-by: Amit Pundir amit.pundir@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_htb.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/net/sched/sch_htb.c +++ b/net/sched/sch_htb.c @@ -1025,6 +1025,9 @@ static int htb_init(struct Qdisc *sch, s int err; int i;
+ qdisc_watchdog_init(&q->watchdog, sch); + INIT_WORK(&q->work, htb_work_func); + if (!opt) return -EINVAL;
@@ -1045,8 +1048,6 @@ static int htb_init(struct Qdisc *sch, s for (i = 0; i < TC_HTB_NUMPRIO; i++) INIT_LIST_HEAD(q->drops + i);
- qdisc_watchdog_init(&q->watchdog, sch); - INIT_WORK(&q->work, htb_work_func); __skb_queue_head_init(&q->direct_queue);
if (tb[TCA_HTB_DIRECT_QLEN])
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nikolay Aleksandrov nikolay@cumulusnetworks.com
commit e89d469e3be3ed3d7124a803211a463ff83d0964 upstream.
The below commit added a call to ->destroy() on init failure, but multiq still frees ->queues on error in init, but ->queues is also freed by ->destroy() thus we get double free and corrupted memory.
Very easy to reproduce (eth0 not multiqueue): $ tc qdisc add dev eth0 root multiq RTNETLINK answers: Operation not supported $ ip l add dumdum type dummy (crash)
Trace log: [ 3929.467747] general protection fault: 0000 [#1] SMP [ 3929.468083] Modules linked in: [ 3929.468302] CPU: 3 PID: 967 Comm: ip Not tainted 4.13.0-rc6+ #56 [ 3929.468625] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014 [ 3929.469124] task: ffff88003716a700 task.stack: ffff88005872c000 [ 3929.469449] RIP: 0010:__kmalloc_track_caller+0x117/0x1be [ 3929.469746] RSP: 0018:ffff88005872f6a0 EFLAGS: 00010246 [ 3929.470042] RAX: 00000000000002de RBX: 0000000058a59000 RCX: 00000000000002df [ 3929.470406] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff821f7020 [ 3929.470770] RBP: ffff88005872f6e8 R08: 000000000001f010 R09: 0000000000000000 [ 3929.471133] R10: ffff88005872f730 R11: 0000000000008cdd R12: ff006d75646d7564 [ 3929.471496] R13: 00000000014000c0 R14: ffff88005b403c00 R15: ffff88005b403c00 [ 3929.471869] FS: 00007f0b70480740(0000) GS:ffff88005d980000(0000) knlGS:0000000000000000 [ 3929.472286] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3929.472677] CR2: 00007ffcee4f3000 CR3: 0000000059d45000 CR4: 00000000000406e0 [ 3929.473209] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 3929.474109] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 3929.474873] Call Trace: [ 3929.475337] ? kstrdup_const+0x23/0x25 [ 3929.475863] kstrdup+0x2e/0x4b [ 3929.476338] kstrdup_const+0x23/0x25 [ 3929.478084] __kernfs_new_node+0x28/0xbc [ 3929.478478] kernfs_new_node+0x35/0x55 [ 3929.478929] kernfs_create_link+0x23/0x76 [ 3929.479478] sysfs_do_create_link_sd.isra.2+0x85/0xd7 [ 3929.480096] sysfs_create_link+0x33/0x35 [ 3929.480649] device_add+0x200/0x589 [ 3929.481184] netdev_register_kobject+0x7c/0x12f [ 3929.481711] register_netdevice+0x373/0x471 [ 3929.482174] rtnl_newlink+0x614/0x729 [ 3929.482610] ? rtnl_newlink+0x17f/0x729 [ 3929.483080] rtnetlink_rcv_msg+0x188/0x197 [ 3929.483533] ? rcu_read_unlock+0x3e/0x5f [ 3929.483984] ? rtnl_newlink+0x729/0x729 [ 3929.484420] netlink_rcv_skb+0x6c/0xce [ 3929.484858] rtnetlink_rcv+0x23/0x2a [ 3929.485291] netlink_unicast+0x103/0x181 [ 3929.485735] netlink_sendmsg+0x326/0x337 [ 3929.486181] sock_sendmsg_nosec+0x14/0x3f [ 3929.486614] sock_sendmsg+0x29/0x2e [ 3929.486973] ___sys_sendmsg+0x209/0x28b [ 3929.487340] ? do_raw_spin_unlock+0xcd/0xf8 [ 3929.487719] ? _raw_spin_unlock+0x27/0x31 [ 3929.488092] ? __handle_mm_fault+0x651/0xdb1 [ 3929.488471] ? check_chain_key+0xb0/0xfd [ 3929.488847] __sys_sendmsg+0x45/0x63 [ 3929.489206] ? __sys_sendmsg+0x45/0x63 [ 3929.489576] SyS_sendmsg+0x19/0x1b [ 3929.489901] entry_SYSCALL_64_fastpath+0x23/0xc2 [ 3929.490172] RIP: 0033:0x7f0b6fb93690 [ 3929.490423] RSP: 002b:00007ffcee4ed588 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 3929.490881] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007f0b6fb93690 [ 3929.491198] RDX: 0000000000000000 RSI: 00007ffcee4ed5d0 RDI: 0000000000000003 [ 3929.491521] RBP: ffff88005872ff98 R08: 0000000000000001 R09: 0000000000000000 [ 3929.491801] R10: 00007ffcee4ed350 R11: 0000000000000246 R12: 0000000000000002 [ 3929.492075] R13: 000000000066f1a0 R14: 00007ffcee4f5680 R15: 0000000000000000 [ 3929.492352] ? trace_hardirqs_off_caller+0xa7/0xcf [ 3929.492590] Code: 8b 45 c0 48 8b 45 b8 74 17 48 8b 4d c8 83 ca ff 44 89 ee 4c 89 f7 e8 83 ca ff ff 49 89 c4 eb 49 49 63 56 20 48 8d 48 01 4d 8b 06 <49> 8b 1c 14 48 89 c2 4c 89 e0 65 49 0f c7 08 0f 94 c0 83 f0 01 [ 3929.493335] RIP: __kmalloc_track_caller+0x117/0x1be RSP: ffff88005872f6a0
Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation") Fixes: f07d1501292b ("multiq: Further multiqueue cleanup") Signed-off-by: Nikolay Aleksandrov nikolay@cumulusnetworks.com Signed-off-by: David S. Miller davem@davemloft.net [AmitP: Removed unused variable 'err' in multiq_init()] Signed-off-by: Amit Pundir amit.pundir@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_multiq.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-)
--- a/net/sched/sch_multiq.c +++ b/net/sched/sch_multiq.c @@ -254,7 +254,7 @@ static int multiq_tune(struct Qdisc *sch static int multiq_init(struct Qdisc *sch, struct nlattr *opt) { struct multiq_sched_data *q = qdisc_priv(sch); - int i, err; + int i;
q->queues = NULL;
@@ -269,12 +269,7 @@ static int multiq_init(struct Qdisc *sch for (i = 0; i < q->max_bands; i++) q->queues[i] = &noop_qdisc;
- err = multiq_tune(sch, opt); - - if (err) - kfree(q->queues); - - return err; + return multiq_tune(sch, opt); }
static int multiq_dump(struct Qdisc *sch, struct sk_buff *skb)
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nikolay Aleksandrov nikolay@cumulusnetworks.com
commit 32db864d33c21fd70a217ba53cb7224889354ffb upstream.
If sch_hhf fails in its ->init() function (either due to wrong user-space arguments as below or memory alloc failure of hh_flows) it will do a null pointer deref of q->hh_flows in its ->destroy() function.
To reproduce the crash: $ tc qdisc add dev eth0 root hhf quantum 2000000 non_hh_weight 10000000
Crash log: [ 690.654882] BUG: unable to handle kernel NULL pointer dereference at (null) [ 690.655565] IP: hhf_destroy+0x48/0xbc [ 690.655944] PGD 37345067 [ 690.655948] P4D 37345067 [ 690.656252] PUD 58402067 [ 690.656554] PMD 0 [ 690.656857] [ 690.657362] Oops: 0000 [#1] SMP [ 690.657696] Modules linked in: [ 690.658032] CPU: 3 PID: 920 Comm: tc Not tainted 4.13.0-rc6+ #57 [ 690.658525] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014 [ 690.659255] task: ffff880058578000 task.stack: ffff88005acbc000 [ 690.659747] RIP: 0010:hhf_destroy+0x48/0xbc [ 690.660146] RSP: 0018:ffff88005acbf9e0 EFLAGS: 00010246 [ 690.660601] RAX: 0000000000000000 RBX: 0000000000000020 RCX: 0000000000000000 [ 690.661155] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff821f63f0 [ 690.661710] RBP: ffff88005acbfa08 R08: ffffffff81b10a90 R09: 0000000000000000 [ 690.662267] R10: 00000000f42b7019 R11: ffff880058578000 R12: 00000000ffffffea [ 690.662820] R13: ffff8800372f6400 R14: 0000000000000000 R15: 0000000000000000 [ 690.663769] FS: 00007f8ae5e8b740(0000) GS:ffff88005d980000(0000) knlGS:0000000000000000 [ 690.667069] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 690.667965] CR2: 0000000000000000 CR3: 0000000058523000 CR4: 00000000000406e0 [ 690.668918] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 690.669945] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 690.671003] Call Trace: [ 690.671743] qdisc_create+0x377/0x3fd [ 690.672534] tc_modify_qdisc+0x4d2/0x4fd [ 690.673324] rtnetlink_rcv_msg+0x188/0x197 [ 690.674204] ? rcu_read_unlock+0x3e/0x5f [ 690.675091] ? rtnl_newlink+0x729/0x729 [ 690.675877] netlink_rcv_skb+0x6c/0xce [ 690.676648] rtnetlink_rcv+0x23/0x2a [ 690.677405] netlink_unicast+0x103/0x181 [ 690.678179] netlink_sendmsg+0x326/0x337 [ 690.678958] sock_sendmsg_nosec+0x14/0x3f [ 690.679743] sock_sendmsg+0x29/0x2e [ 690.680506] ___sys_sendmsg+0x209/0x28b [ 690.681283] ? __handle_mm_fault+0xc7d/0xdb1 [ 690.681915] ? check_chain_key+0xb0/0xfd [ 690.682449] __sys_sendmsg+0x45/0x63 [ 690.682954] ? __sys_sendmsg+0x45/0x63 [ 690.683471] SyS_sendmsg+0x19/0x1b [ 690.683974] entry_SYSCALL_64_fastpath+0x23/0xc2 [ 690.684516] RIP: 0033:0x7f8ae529d690 [ 690.685016] RSP: 002b:00007fff26d2d6b8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 690.685931] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007f8ae529d690 [ 690.686573] RDX: 0000000000000000 RSI: 00007fff26d2d700 RDI: 0000000000000003 [ 690.687047] RBP: ffff88005acbff98 R08: 0000000000000001 R09: 0000000000000000 [ 690.687519] R10: 00007fff26d2d480 R11: 0000000000000246 R12: 0000000000000002 [ 690.687996] R13: 0000000001258070 R14: 0000000000000001 R15: 0000000000000000 [ 690.688475] ? trace_hardirqs_off_caller+0xa7/0xcf [ 690.688887] Code: 00 00 e8 2a 02 ae ff 49 8b bc 1d 60 02 00 00 48 83 c3 08 e8 19 02 ae ff 48 83 fb 20 75 dc 45 31 f6 4d 89 f7 4d 03 bd 20 02 00 00 <49> 8b 07 49 39 c7 75 24 49 83 c6 10 49 81 fe 00 40 00 00 75 e1 [ 690.690200] RIP: hhf_destroy+0x48/0xbc RSP: ffff88005acbf9e0 [ 690.690636] CR2: 0000000000000000
Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation") Fixes: 10239edf86f1 ("net-qdisc-hhf: Heavy-Hitter Filter (HHF) qdisc") Signed-off-by: Nikolay Aleksandrov nikolay@cumulusnetworks.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Amit Pundir amit.pundir@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_hhf.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/net/sched/sch_hhf.c +++ b/net/sched/sch_hhf.c @@ -501,6 +501,9 @@ static void hhf_destroy(struct Qdisc *sc hhf_free(q->hhf_valid_bits[i]); }
+ if (!q->hh_flows) + return; + for (i = 0; i < HH_FLOWS_CNT; i++) { struct hh_flow_state *flow, *next; struct list_head *head = &q->hh_flows[i];
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nikolay Aleksandrov nikolay@cumulusnetworks.com
commit 634576a1844dba15bc5e6fc61d72f37e13a21615 upstream.
netem can fail in ->init due to missing options (either not supplied by user-space or used as a default qdisc) causing a timer->base null pointer deref in its ->destroy() and ->reset() callbacks.
Reproduce: $ sysctl net.core.default_qdisc=netem $ ip l set ethX up
Crash log: [ 1814.846943] BUG: unable to handle kernel NULL pointer dereference at (null) [ 1814.847181] IP: hrtimer_active+0x17/0x8a [ 1814.847270] PGD 59c34067 [ 1814.847271] P4D 59c34067 [ 1814.847337] PUD 37374067 [ 1814.847403] PMD 0 [ 1814.847468] [ 1814.847582] Oops: 0000 [#1] SMP [ 1814.847655] Modules linked in: sch_netem(O) sch_fq_codel(O) [ 1814.847761] CPU: 3 PID: 1573 Comm: ip Tainted: G O 4.13.0-rc6+ #62 [ 1814.847884] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014 [ 1814.848043] task: ffff88003723a700 task.stack: ffff88005adc8000 [ 1814.848235] RIP: 0010:hrtimer_active+0x17/0x8a [ 1814.848407] RSP: 0018:ffff88005adcb590 EFLAGS: 00010246 [ 1814.848590] RAX: 0000000000000000 RBX: ffff880058e359d8 RCX: 0000000000000000 [ 1814.848793] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880058e359d8 [ 1814.848998] RBP: ffff88005adcb5b0 R08: 00000000014080c0 R09: 00000000ffffffff [ 1814.849204] R10: ffff88005adcb660 R11: 0000000000000020 R12: 0000000000000000 [ 1814.849410] R13: ffff880058e359d8 R14: 00000000ffffffff R15: 0000000000000001 [ 1814.849616] FS: 00007f733bbca740(0000) GS:ffff88005d980000(0000) knlGS:0000000000000000 [ 1814.849919] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1814.850107] CR2: 0000000000000000 CR3: 0000000059f0d000 CR4: 00000000000406e0 [ 1814.850313] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1814.850518] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1814.850723] Call Trace: [ 1814.850875] hrtimer_try_to_cancel+0x1a/0x93 [ 1814.851047] hrtimer_cancel+0x15/0x20 [ 1814.851211] qdisc_watchdog_cancel+0x12/0x14 [ 1814.851383] netem_reset+0xe6/0xed [sch_netem] [ 1814.851561] qdisc_destroy+0x8b/0xe5 [ 1814.851723] qdisc_create_dflt+0x86/0x94 [ 1814.851890] ? dev_activate+0x129/0x129 [ 1814.852057] attach_one_default_qdisc+0x36/0x63 [ 1814.852232] netdev_for_each_tx_queue+0x3d/0x48 [ 1814.852406] dev_activate+0x4b/0x129 [ 1814.852569] __dev_open+0xe7/0x104 [ 1814.852730] __dev_change_flags+0xc6/0x15c [ 1814.852899] dev_change_flags+0x25/0x59 [ 1814.853064] do_setlink+0x30c/0xb3f [ 1814.853228] ? check_chain_key+0xb0/0xfd [ 1814.853396] ? check_chain_key+0xb0/0xfd [ 1814.853565] rtnl_newlink+0x3a4/0x729 [ 1814.853728] ? rtnl_newlink+0x117/0x729 [ 1814.853905] ? ns_capable_common+0xd/0xb1 [ 1814.854072] ? ns_capable+0x13/0x15 [ 1814.854234] rtnetlink_rcv_msg+0x188/0x197 [ 1814.854404] ? rcu_read_unlock+0x3e/0x5f [ 1814.854572] ? rtnl_newlink+0x729/0x729 [ 1814.854737] netlink_rcv_skb+0x6c/0xce [ 1814.854902] rtnetlink_rcv+0x23/0x2a [ 1814.855064] netlink_unicast+0x103/0x181 [ 1814.855230] netlink_sendmsg+0x326/0x337 [ 1814.855398] sock_sendmsg_nosec+0x14/0x3f [ 1814.855584] sock_sendmsg+0x29/0x2e [ 1814.855747] ___sys_sendmsg+0x209/0x28b [ 1814.855912] ? do_raw_spin_unlock+0xcd/0xf8 [ 1814.856082] ? _raw_spin_unlock+0x27/0x31 [ 1814.856251] ? __handle_mm_fault+0x651/0xdb1 [ 1814.856421] ? check_chain_key+0xb0/0xfd [ 1814.856592] __sys_sendmsg+0x45/0x63 [ 1814.856755] ? __sys_sendmsg+0x45/0x63 [ 1814.856923] SyS_sendmsg+0x19/0x1b [ 1814.857083] entry_SYSCALL_64_fastpath+0x23/0xc2 [ 1814.857256] RIP: 0033:0x7f733b2dd690 [ 1814.857419] RSP: 002b:00007ffe1d3387d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 1814.858238] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007f733b2dd690 [ 1814.858445] RDX: 0000000000000000 RSI: 00007ffe1d338820 RDI: 0000000000000003 [ 1814.858651] RBP: ffff88005adcbf98 R08: 0000000000000001 R09: 0000000000000003 [ 1814.858856] R10: 00007ffe1d3385a0 R11: 0000000000000246 R12: 0000000000000002 [ 1814.859060] R13: 000000000066f1a0 R14: 00007ffe1d3408d0 R15: 0000000000000000 [ 1814.859267] ? trace_hardirqs_off_caller+0xa7/0xcf [ 1814.859446] Code: 10 55 48 89 c7 48 89 e5 e8 45 a1 fb ff 31 c0 5d c3 31 c0 c3 66 66 66 66 90 55 48 89 e5 41 56 41 55 41 54 53 49 89 fd 49 8b 45 30 <4c> 8b 20 41 8b 5c 24 38 31 c9 31 d2 48 c7 c7 50 8e 1d 82 41 89 [ 1814.860022] RIP: hrtimer_active+0x17/0x8a RSP: ffff88005adcb590 [ 1814.860214] CR2: 0000000000000000
Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation") Fixes: 0fbbeb1ba43b ("[PKT_SCHED]: Fix missing qdisc_destroy() in qdisc_create_dflt()") Signed-off-by: Nikolay Aleksandrov nikolay@cumulusnetworks.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Amit Pundir amit.pundir@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/sched/sch_netem.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/net/sched/sch_netem.c +++ b/net/sched/sch_netem.c @@ -943,11 +943,11 @@ static int netem_init(struct Qdisc *sch, struct netem_sched_data *q = qdisc_priv(sch); int ret;
+ qdisc_watchdog_init(&q->watchdog, sch); + if (!opt) return -EINVAL;
- qdisc_watchdog_init(&q->watchdog, sch); - q->loss_model = CLG_RANDOM; ret = netem_change(sch, opt); if (ret)
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nikolay Aleksandrov nikolay@cumulusnetworks.com
commit c2d6511e6a4f1f3673d711569c00c3849549e9b0 upstream.
sch_tbf calls qdisc_watchdog_cancel() in both its ->reset and ->destroy callbacks but it may fail before the timer is initialized due to missing options (either not supplied by user-space or set as a default qdisc), also q->qdisc is used by ->reset and ->destroy so we need it initialized.
Reproduce: $ sysctl net.core.default_qdisc=tbf $ ip l set ethX up
Crash log: [ 959.160172] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 [ 959.160323] IP: qdisc_reset+0xa/0x5c [ 959.160400] PGD 59cdb067 [ 959.160401] P4D 59cdb067 [ 959.160466] PUD 59ccb067 [ 959.160532] PMD 0 [ 959.160597] [ 959.160706] Oops: 0000 [#1] SMP [ 959.160778] Modules linked in: sch_tbf sch_sfb sch_prio sch_netem [ 959.160891] CPU: 2 PID: 1562 Comm: ip Not tainted 4.13.0-rc6+ #62 [ 959.160998] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014 [ 959.161157] task: ffff880059c9a700 task.stack: ffff8800376d0000 [ 959.161263] RIP: 0010:qdisc_reset+0xa/0x5c [ 959.161347] RSP: 0018:ffff8800376d3610 EFLAGS: 00010286 [ 959.161531] RAX: ffffffffa001b1dd RBX: ffff8800373a2800 RCX: 0000000000000000 [ 959.161733] RDX: ffffffff8215f160 RSI: ffffffff8215f160 RDI: 0000000000000000 [ 959.161939] RBP: ffff8800376d3618 R08: 00000000014080c0 R09: 00000000ffffffff [ 959.162141] R10: ffff8800376d3578 R11: 0000000000000020 R12: ffffffffa001d2c0 [ 959.162343] R13: ffff880037538000 R14: 00000000ffffffff R15: 0000000000000001 [ 959.162546] FS: 00007fcc5126b740(0000) GS:ffff88005d900000(0000) knlGS:0000000000000000 [ 959.162844] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 959.163030] CR2: 0000000000000018 CR3: 000000005abc4000 CR4: 00000000000406e0 [ 959.163233] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 959.163436] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 959.163638] Call Trace: [ 959.163788] tbf_reset+0x19/0x64 [sch_tbf] [ 959.163957] qdisc_destroy+0x8b/0xe5 [ 959.164119] qdisc_create_dflt+0x86/0x94 [ 959.164284] ? dev_activate+0x129/0x129 [ 959.164449] attach_one_default_qdisc+0x36/0x63 [ 959.164623] netdev_for_each_tx_queue+0x3d/0x48 [ 959.164795] dev_activate+0x4b/0x129 [ 959.164957] __dev_open+0xe7/0x104 [ 959.165118] __dev_change_flags+0xc6/0x15c [ 959.165287] dev_change_flags+0x25/0x59 [ 959.165451] do_setlink+0x30c/0xb3f [ 959.165613] ? check_chain_key+0xb0/0xfd [ 959.165782] rtnl_newlink+0x3a4/0x729 [ 959.165947] ? rtnl_newlink+0x117/0x729 [ 959.166121] ? ns_capable_common+0xd/0xb1 [ 959.166288] ? ns_capable+0x13/0x15 [ 959.166450] rtnetlink_rcv_msg+0x188/0x197 [ 959.166617] ? rcu_read_unlock+0x3e/0x5f [ 959.166783] ? rtnl_newlink+0x729/0x729 [ 959.166948] netlink_rcv_skb+0x6c/0xce [ 959.167113] rtnetlink_rcv+0x23/0x2a [ 959.167273] netlink_unicast+0x103/0x181 [ 959.167439] netlink_sendmsg+0x326/0x337 [ 959.167607] sock_sendmsg_nosec+0x14/0x3f [ 959.167772] sock_sendmsg+0x29/0x2e [ 959.167932] ___sys_sendmsg+0x209/0x28b [ 959.168098] ? do_raw_spin_unlock+0xcd/0xf8 [ 959.168267] ? _raw_spin_unlock+0x27/0x31 [ 959.168432] ? __handle_mm_fault+0x651/0xdb1 [ 959.168602] ? check_chain_key+0xb0/0xfd [ 959.168773] __sys_sendmsg+0x45/0x63 [ 959.168934] ? __sys_sendmsg+0x45/0x63 [ 959.169100] SyS_sendmsg+0x19/0x1b [ 959.169260] entry_SYSCALL_64_fastpath+0x23/0xc2 [ 959.169432] RIP: 0033:0x7fcc5097e690 [ 959.169592] RSP: 002b:00007ffd0d5c7b48 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 959.169887] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007fcc5097e690 [ 959.170089] RDX: 0000000000000000 RSI: 00007ffd0d5c7b90 RDI: 0000000000000003 [ 959.170292] RBP: ffff8800376d3f98 R08: 0000000000000001 R09: 0000000000000003 [ 959.170494] R10: 00007ffd0d5c7910 R11: 0000000000000246 R12: 0000000000000006 [ 959.170697] R13: 000000000066f1a0 R14: 00007ffd0d5cfc40 R15: 0000000000000000 [ 959.170900] ? trace_hardirqs_off_caller+0xa7/0xcf [ 959.171076] Code: 00 41 c7 84 24 14 01 00 00 00 00 00 00 41 c7 84 24 98 00 00 00 00 00 00 00 41 5c 41 5d 41 5e 5d c3 66 66 66 66 90 55 48 89 e5 53 <48> 8b 47 18 48 89 fb 48 8b 40 48 48 85 c0 74 02 ff d0 48 8b bb [ 959.171637] RIP: qdisc_reset+0xa/0x5c RSP: ffff8800376d3610 [ 959.171821] CR2: 0000000000000018
Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation") Fixes: 0fbbeb1ba43b ("[PKT_SCHED]: Fix missing qdisc_destroy() in qdisc_create_dflt()") Signed-off-by: Nikolay Aleksandrov nikolay@cumulusnetworks.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Amit Pundir amit.pundir@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_tbf.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/net/sched/sch_tbf.c +++ b/net/sched/sch_tbf.c @@ -432,12 +432,13 @@ static int tbf_init(struct Qdisc *sch, s { struct tbf_sched_data *q = qdisc_priv(sch);
+ qdisc_watchdog_init(&q->watchdog, sch); + q->qdisc = &noop_qdisc; + if (opt == NULL) return -EINVAL;
q->t_c = ktime_get_ns(); - qdisc_watchdog_init(&q->watchdog, sch); - q->qdisc = &noop_qdisc;
return tbf_change(sch, opt); }
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tomas Winkler tomas.winkler@intel.com
commit cc365dcf0e56271bedf3de95f88922abe248e951 upstream.
From the pci power documentation:
"The driver itself should not call pm_runtime_allow(), though. Instead, it should let user space or some platform-specific code do that (user space can do it via sysfs as stated above)..."
However, the S0ix residency cannot be reached without MEI device getting into low power state. Hence, for mei devices that support D0i3, it's better to make runtime power management mandatory and not rely on the system integration such as udev rules. This policy cannot be applied globally as some older platforms were found to have broken power management.
Cc: stable@vger.kernel.org v4.13+ Cc: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Tomas Winkler tomas.winkler@intel.com Reviewed-by: Alexander Usyskin alexander.usyskin@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/misc/mei/pci-me.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/misc/mei/pci-me.c +++ b/drivers/misc/mei/pci-me.c @@ -230,8 +230,11 @@ static int mei_me_probe(struct pci_dev * if (!pci_dev_run_wake(pdev)) mei_me_set_pm_domain(dev);
- if (mei_pg_is_enabled(dev)) + if (mei_pg_is_enabled(dev)) { pm_runtime_put_noidle(&pdev->dev); + if (hw->d0i3_supported) + pm_runtime_allow(&pdev->dev); + }
dev_dbg(&pdev->dev, "initialization successful.\n");
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Martin Schwidefsky schwidefsky@de.ibm.com
commit 5eda25b10297684c1f46a14199ec00210f3c346e upstream.
The memove, memset, memcpy, __memset16, __memset32 and __memset64 function have an additional indirect return branch in form of a "bzr" instruction. These need to use expolines as well.
Cc: stable@vger.kernel.org # v4.17+ Fixes: 97489e0663 ("s390/lib: use expoline for indirect branches") Reviewed-by: Heiko Carstens heiko.carstens@de.ibm.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/s390/lib/mem.S | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
--- a/arch/s390/lib/mem.S +++ b/arch/s390/lib/mem.S @@ -26,7 +26,7 @@ */ ENTRY(memset) ltgr %r4,%r4 - bzr %r14 + jz .Lmemset_exit ltgr %r3,%r3 jnz .Lmemset_fill aghi %r4,-1 @@ -41,12 +41,13 @@ ENTRY(memset) .Lmemset_clear_rest: larl %r3,.Lmemset_xc ex %r4,0(%r3) +.Lmemset_exit: BR_EX %r14 .Lmemset_fill: stc %r3,0(%r2) cghi %r4,1 lgr %r1,%r2 - ber %r14 + je .Lmemset_fill_exit aghi %r4,-2 srlg %r3,%r4,8 ltgr %r3,%r3 @@ -58,6 +59,7 @@ ENTRY(memset) .Lmemset_fill_rest: larl %r3,.Lmemset_mvc ex %r4,0(%r3) +.Lmemset_fill_exit: BR_EX %r14 .Lmemset_xc: xc 0(1,%r1),0(%r1) @@ -71,7 +73,7 @@ ENTRY(memset) */ ENTRY(memcpy) ltgr %r4,%r4 - bzr %r14 + jz .Lmemcpy_exit aghi %r4,-1 srlg %r5,%r4,8 ltgr %r5,%r5 @@ -80,6 +82,7 @@ ENTRY(memcpy) .Lmemcpy_rest: larl %r5,.Lmemcpy_mvc ex %r4,0(%r5) +.Lmemcpy_exit: BR_EX %r14 .Lmemcpy_loop: mvc 0(256,%r1),0(%r3)
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gustavo A. R. Silva gustavo@embeddedor.com
commit ad0eaee6195db1db1749dd46b9e6f4466793d178 upstream.
Add missing break statement in order to prevent the code from falling through to the default case.
Addresses-Coverity-ID: 115050 ("Missing break in switch") Reported-by: Valdis Kletnieks valdis.kletnieks@vt.edu Signed-off-by: Gustavo A. R. Silva gustavo@embeddedor.com Acked-by: Charles Keepax ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown broonie@kernel.org Cc: stable@vger.kernel.org [Gustavo: Backported to 3.16..4.18 - Remove code comment removal] Signed-off-by: Gustavo A. R. Silva gustavo@embeddedor.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/codecs/wm8994.c | 1 + 1 file changed, 1 insertion(+)
--- a/sound/soc/codecs/wm8994.c +++ b/sound/soc/codecs/wm8994.c @@ -2431,6 +2431,7 @@ static int wm8994_set_dai_sysclk(struct snd_soc_update_bits(codec, WM8994_POWER_MANAGEMENT_2, WM8994_OPCLK_ENA, 0); } + break;
default: return -EINVAL;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ethan Lien ethanlien@synology.com
commit d814a49198eafa6163698bdd93961302f3a877a4 upstream.
We use customized, nodesize batch value to update dirty_metadata_bytes. We should also use batch version of compare function or we will easily goto fast path and get false result from percpu_counter_compare().
Fixes: e2d845211eda ("Btrfs: use percpu counter for dirty metadata count") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Ethan Lien ethanlien@synology.com Reviewed-by: Nikolay Borisov nborisov@suse.com Signed-off-by: David Sterba dsterba@suse.com nb: Rebased on 4.4.y ] Signed-off-by: Nikolay Borisov nborisov@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/disk-io.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
--- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -1011,8 +1011,9 @@ static int btree_writepages(struct addre
fs_info = BTRFS_I(mapping->host)->root->fs_info; /* this is a bit racy, but that's ok */ - ret = percpu_counter_compare(&fs_info->dirty_metadata_bytes, - BTRFS_DIRTY_METADATA_THRESH); + ret = __percpu_counter_compare(&fs_info->dirty_metadata_bytes, + BTRFS_DIRTY_METADATA_THRESH, + fs_info->dirty_metadata_batch); if (ret < 0) return 0; } @@ -3987,8 +3988,9 @@ static void __btrfs_btree_balance_dirty( if (flush_delayed) btrfs_balance_delayed_items(root);
- ret = percpu_counter_compare(&root->fs_info->dirty_metadata_bytes, - BTRFS_DIRTY_METADATA_THRESH); + ret = __percpu_counter_compare(&root->fs_info->dirty_metadata_bytes, + BTRFS_DIRTY_METADATA_THRESH, + root->fs_info->dirty_metadata_batch); if (ret > 0) { balance_dirty_pages_ratelimited( root->fs_info->btree_inode->i_mapping);
On Thu, Sep 13, 2018 at 03:30:07PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.4.156 release. There are 60 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat Sep 15 13:17:29 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.156-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y and the diffstat can be found below.
thanks,
greg k-h
Merged, compiled with -Werror, and installed onto my Pixel 2 XL.
No initial issues noticed in dmesg or general usage.
Thanks! Nathan
On 13 September 2018 at 19:00, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 4.4.156 release. There are 60 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat Sep 15 13:17:29 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.156-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64 and i386.
Summary ------------------------------------------------------------------------
kernel: 4.4.156-rc2 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-4.4.y git commit: 2052f80448e54f0f36a360bb5489f2818ae13026 git describe: v4.4.155-61-g2052f80448e5 Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.4-oe/build/v4.4.155-61-...
No regressions (compared to build v4.4.155)
Ran 16889 total tests in the following environments and test suites.
Environments -------------- - i386 - juno-r2 - arm64 - qemu_arm - qemu_i386 - qemu_x86_64 - x15 - arm - x86_64
Test Suites ----------- * boot * kselftest * libhugetlbfs * ltp-cap_bounds-tests * ltp-containers-tests * ltp-cve-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-nptl-tests * ltp-open-posix-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-timers-tests * kselftest-vsyscall-mode-native * kselftest-vsyscall-mode-none
Summary ------------------------------------------------------------------------
kernel: 4.4.156-rc2 git repo: https://git.linaro.org/lkft/arm64-stable-rc.git git branch: 4.4.156-rc2-hikey-20180913-283 git commit: 8cbc538b759e57b63bc7bbb767bc7c3336634b3d git describe: 4.4.156-rc2-hikey-20180913-283 Test details: https://qa-reports.linaro.org/lkft/linaro-hikey-stable-rc-4.4-oe/build/4.4.1...
No regressions (compared to build 4.4.156-rc1-hikey-20180913-282)
Ran 2724 total tests in the following environments and test suites.
Environments -------------- - hi6220-hikey - arm64 - qemu_arm64
Test Suites ----------- * boot * kselftest * libhugetlbfs * ltp-cap_bounds-tests * ltp-containers-tests * ltp-cve-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-nptl-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-timers-tests
On Thu, Sep 13, 2018 at 03:30:07PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.4.156 release. There are 60 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat Sep 15 13:17:29 UTC 2018. Anything received after that time might be too late.
Build results: total: 151 pass: 151 fail: 0 Qemu test results: total: 285 pass: 285 fail: 0
Details are available at https://kerneltests.org/builders/.
Guenter
linux-stable-mirror@lists.linaro.org