This is the start of the stable review cycle for the 4.19.64 release. There are 32 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun 04 Aug 2019 09:19:34 AM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.64-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.19.64-rc1
Bart Van Assche bvanassche@acm.org scsi: core: Avoid that a kernel warning appears during system resume
Bart Van Assche bvanassche@acm.org block, scsi: Change the preempt-only flag into a counter
Yan, Zheng zyan@redhat.com ceph: hold i_ceph_lock when removing caps for freeing inode
Yoshinori Sato ysato@users.sourceforge.jp Fix allyesconfig output.
Miroslav Lichvar mlichvar@redhat.com drivers/pps/pps.c: clear offset flags in PPS_SETPARAMS ioctl
Linus Torvalds torvalds@linux-foundation.org /proc/<pid>/cmdline: add back the setproctitle() special case
Linus Torvalds torvalds@linux-foundation.org /proc/<pid>/cmdline: remove all the special cases
Jann Horn jannh@google.com sched/fair: Use RCU accessors consistently for ->numa_group
Jann Horn jannh@google.com sched/fair: Don't free p->numa_faults with concurrent readers
Jason Wang jasowang@redhat.com vhost: scsi: add weight support
Jason Wang jasowang@redhat.com vhost: vsock: add weight support
Jason Wang jasowang@redhat.com vhost_net: fix possible infinite loop
Jason Wang jasowang@redhat.com vhost: introduce vhost_exceeds_weight()
Vladis Dronov vdronov@redhat.com Bluetooth: hci_uart: check for missing tty operations
Joerg Roedel jroedel@suse.de iommu/iova: Fix compilation error with !CONFIG_IOMMU_IOVA
Dmitry Safonov dima@arista.com iommu/vt-d: Don't queue_iova() if there is no flush queue
Luke Nowakowski-Krijger lnowakow@eng.ucsd.edu media: radio-raremono: change devm_k*alloc to k*alloc
Benjamin Coddington bcodding@redhat.com NFS: Cleanup if nfs_match_client is interrupted
Andrey Konovalov andreyknvl@google.com media: pvrusb2: use a different format for warnings
Oliver Neukum oneukum@suse.com media: cpia2_usb: first wake up, then free in disconnect
Fabio Estevam festevam@gmail.com ath10k: Change the warning message string
Sean Young sean@mess.org media: au0828: fix null dereference in error path
Phong Tran tranmanphong@gmail.com ISDN: hfcsusb: checking idx of ep configuration
Todd Kjos tkjos@android.com binder: fix possible UAF when freeing buffer
Will Deacon will.deacon@arm.com arm64: compat: Provide definition for COMPAT_SIGMINSTKSZ
Minas Harutyunyan minas.harutyunyan@synopsys.com usb: dwc2: Fix disable all EP's on disconnect
Minas Harutyunyan Minas.Harutyunyan@synopsys.com usb: dwc2: Disable all EP's on disconnect
Trond Myklebust trond.myklebust@hammerspace.com NFSv4: Fix lookup revalidate of regular files
Trond Myklebust trond.myklebust@hammerspace.com NFS: Refactor nfs_lookup_revalidate()
Trond Myklebust trond.myklebust@hammerspace.com NFS: Fix dentry revalidation on NFSv4 lookup
Sunil Muthuswamy sunilmut@microsoft.com vsock: correct removal of socket from the list
Sunil Muthuswamy sunilmut@microsoft.com hv_sock: Add support for delayed close
-------------
Diffstat:
 Makefile | 4 +-
 arch/arm64/include/asm/compat.h | 1 +
 arch/sh/boards/Kconfig | 14 +-
 block/blk-core.c | 35 ++--
 block/blk-mq-debugfs.c | 10 +-
 drivers/android/binder.c | 16 +-
 drivers/bluetooth/hci_ath.c | 3 +
 drivers/bluetooth/hci_bcm.c | 3 +
 drivers/bluetooth/hci_intel.c | 3 +
 drivers/bluetooth/hci_ldisc.c | 13 ++
 drivers/bluetooth/hci_mrvl.c | 3 +
 drivers/bluetooth/hci_qca.c | 3 +
 drivers/bluetooth/hci_uart.h | 1 +
 drivers/iommu/intel-iommu.c | 2 +-
 drivers/iommu/iova.c | 18 +-
 drivers/isdn/hardware/mISDN/hfcsusb.c | 3 +
 drivers/media/radio/radio-raremono.c | 30 ++-
 drivers/media/usb/au0828/au0828-core.c | 12 +-
 drivers/media/usb/cpia2/cpia2_usb.c | 3 +-
 drivers/media/usb/pvrusb2/pvrusb2-hdw.c | 4 +-
 drivers/media/usb/pvrusb2/pvrusb2-i2c-core.c | 6 +-
 drivers/media/usb/pvrusb2/pvrusb2-std.c | 2 +-
 drivers/net/wireless/ath/ath10k/usb.c | 2 +-
 drivers/pps/pps.c | 8 +
 drivers/scsi/scsi_lib.c | 15 +-
 drivers/usb/dwc2/gadget.c | 41 +++-
 drivers/vhost/net.c | 41 ++--
 drivers/vhost/scsi.c | 15 +-
 drivers/vhost/vhost.c | 20 +-
 drivers/vhost/vhost.h | 5 +-
 drivers/vhost/vsock.c | 28 ++-
 fs/ceph/caps.c | 7 +-
 fs/exec.c | 2 +-
 fs/nfs/client.c | 4 +-
 fs/nfs/dir.c | 295 +++++++++++++++------------
 fs/nfs/nfs4proc.c | 15 +-
 fs/proc/base.c | 132 ++++++------
 include/linux/blkdev.h | 14 +-
 include/linux/iova.h | 6 +
 include/linux/sched.h | 10 +-
 include/linux/sched/numa_balancing.h | 4 +-
 kernel/fork.c | 2 +-
 kernel/sched/fair.c | 144 +++++++++----
 net/vmw_vsock/af_vsock.c | 38 +---
 net/vmw_vsock/hyperv_transport.c | 108 +++++++--
 45 files changed, 719 insertions(+), 426 deletions(-)
From: Sunil Muthuswamy sunilmut@microsoft.com
commit a9eeb998c28d5506616426bd3a216bd5735a18b8 upstream.
Currently, hvsock does not implement any delayed or background close logic. Whenever the hvsock socket is closed, a FIN is sent to the peer, and the last reference to the socket is dropped, which leads to a call to .destruct where the socket can hang indefinitely waiting for the peer to close its side. This can cause the user application to hang in the close() call.
This change implements a proper STREAM (TCP) closing handshake mechanism by sending the FIN to the peer and then waiting for the peer's FIN to arrive within a given timeout. On timeout, it tries to terminate the connection (i.e. send a RST). This is in line with other socket providers such as virtio.
This change does not address the hang in the vmbus_hvsock_device_unregister where it waits indefinitely for the host to rescind the channel. That should be taken up as a separate fix.
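As a rough userspace analogy of the handshake described above (a sketch only, not the driver code; the 8-second value simply mirrors the HVS_CLOSE_TIMEOUT added by the patch), the send-FIN / wait-for-FIN / reset-on-timeout sequence looks like this:

/* Sketch: graceful close of a connected stream socket "fd" with a timeout. */
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

#define CLOSE_TIMEOUT_MS 8000    /* analogous to HVS_CLOSE_TIMEOUT (8 * HZ) */

void close_with_timeout(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    char buf[128];
    int got_fin = 0;

    shutdown(fd, SHUT_WR);                    /* send our FIN */

    /* Wait (bounded) for the peer's FIN, i.e. read() returning 0. */
    while (poll(&pfd, 1, CLOSE_TIMEOUT_MS) > 0) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n <= 0) {
            got_fin = (n == 0);
            break;
        }
    }

    if (!got_fin) {
        /* Timed out: arrange for close() to send a reset instead. */
        struct linger lg = { .l_onoff = 1, .l_linger = 0 };
        setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
    }
    close(fd);
}

The patch does the equivalent without blocking close(): it schedules a delayed work item (hvs_close_timeout) and only removes the socket once the peer's FIN arrives or the timeout fires.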
Signed-off-by: Sunil Muthuswamy sunilmut@microsoft.com
Reviewed-by: Dexuan Cui decui@microsoft.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/vmw_vsock/hyperv_transport.c | 110 +++++++++++++++++++++++++++------------ 1 file changed, 78 insertions(+), 32 deletions(-)
--- a/net/vmw_vsock/hyperv_transport.c +++ b/net/vmw_vsock/hyperv_transport.c @@ -35,6 +35,9 @@ /* The MTU is 16KB per the host side's design */ #define HVS_MTU_SIZE (1024 * 16)
+/* How long to wait for graceful shutdown of a connection */ +#define HVS_CLOSE_TIMEOUT (8 * HZ) + struct vmpipe_proto_header { u32 pkt_type; u32 data_size; @@ -290,19 +293,32 @@ static void hvs_channel_cb(void *ctx) sk->sk_write_space(sk); }
-static void hvs_close_connection(struct vmbus_channel *chan) +static void hvs_do_close_lock_held(struct vsock_sock *vsk, + bool cancel_timeout) { - struct sock *sk = get_per_channel_state(chan); - struct vsock_sock *vsk = vsock_sk(sk); - - lock_sock(sk); + struct sock *sk = sk_vsock(vsk);
- sk->sk_state = TCP_CLOSE; sock_set_flag(sk, SOCK_DONE); - vsk->peer_shutdown |= SEND_SHUTDOWN | RCV_SHUTDOWN; - + vsk->peer_shutdown = SHUTDOWN_MASK; + if (vsock_stream_has_data(vsk) <= 0) + sk->sk_state = TCP_CLOSING; sk->sk_state_change(sk); + if (vsk->close_work_scheduled && + (!cancel_timeout || cancel_delayed_work(&vsk->close_work))) { + vsk->close_work_scheduled = false; + vsock_remove_sock(vsk); + + /* Release the reference taken while scheduling the timeout */ + sock_put(sk); + } +} + +static void hvs_close_connection(struct vmbus_channel *chan) +{ + struct sock *sk = get_per_channel_state(chan);
+ lock_sock(sk); + hvs_do_close_lock_held(vsock_sk(sk), true); release_sock(sk); }
@@ -445,50 +461,80 @@ static int hvs_connect(struct vsock_sock return vmbus_send_tl_connect_request(&h->vm_srv_id, &h->host_srv_id); }
+static void hvs_shutdown_lock_held(struct hvsock *hvs, int mode) +{ + struct vmpipe_proto_header hdr; + + if (hvs->fin_sent || !hvs->chan) + return; + + /* It can't fail: see hvs_channel_writable_bytes(). */ + (void)hvs_send_data(hvs->chan, (struct hvs_send_buf *)&hdr, 0); + hvs->fin_sent = true; +} + static int hvs_shutdown(struct vsock_sock *vsk, int mode) { struct sock *sk = sk_vsock(vsk); - struct vmpipe_proto_header hdr; - struct hvs_send_buf *send_buf; - struct hvsock *hvs;
if (!(mode & SEND_SHUTDOWN)) return 0;
lock_sock(sk); - - hvs = vsk->trans; - if (hvs->fin_sent) - goto out; - - send_buf = (struct hvs_send_buf *)&hdr; - - /* It can't fail: see hvs_channel_writable_bytes(). */ - (void)hvs_send_data(hvs->chan, send_buf, 0); - - hvs->fin_sent = true; -out: + hvs_shutdown_lock_held(vsk->trans, mode); release_sock(sk); return 0; }
-static void hvs_release(struct vsock_sock *vsk) +static void hvs_close_timeout(struct work_struct *work) { + struct vsock_sock *vsk = + container_of(work, struct vsock_sock, close_work.work); struct sock *sk = sk_vsock(vsk); - struct hvsock *hvs = vsk->trans; - struct vmbus_channel *chan;
+ sock_hold(sk); lock_sock(sk); + if (!sock_flag(sk, SOCK_DONE)) + hvs_do_close_lock_held(vsk, false);
- sk->sk_state = TCP_CLOSING; - vsock_remove_sock(vsk); - + vsk->close_work_scheduled = false; release_sock(sk); + sock_put(sk); +}
- chan = hvs->chan; - if (chan) - hvs_shutdown(vsk, RCV_SHUTDOWN | SEND_SHUTDOWN); +/* Returns true, if it is safe to remove socket; false otherwise */ +static bool hvs_close_lock_held(struct vsock_sock *vsk) +{ + struct sock *sk = sk_vsock(vsk); + + if (!(sk->sk_state == TCP_ESTABLISHED || + sk->sk_state == TCP_CLOSING)) + return true; + + if ((sk->sk_shutdown & SHUTDOWN_MASK) != SHUTDOWN_MASK) + hvs_shutdown_lock_held(vsk->trans, SHUTDOWN_MASK); + + if (sock_flag(sk, SOCK_DONE)) + return true; + + /* This reference will be dropped by the delayed close routine */ + sock_hold(sk); + INIT_DELAYED_WORK(&vsk->close_work, hvs_close_timeout); + vsk->close_work_scheduled = true; + schedule_delayed_work(&vsk->close_work, HVS_CLOSE_TIMEOUT); + return false; +}
+static void hvs_release(struct vsock_sock *vsk) +{ + struct sock *sk = sk_vsock(vsk); + bool remove_sock; + + lock_sock(sk); + remove_sock = hvs_close_lock_held(vsk); + release_sock(sk); + if (remove_sock) + vsock_remove_sock(vsk); }
static void hvs_destruct(struct vsock_sock *vsk)
From: Sunil Muthuswamy sunilmut@microsoft.com
commit d5afa82c977ea06f7119058fa0eb8519ea501031 upstream.
The current vsock code for removing a socket from the list is both racy and inefficient. It takes the lock, checks whether the socket is in the list, drops the lock, and then, if the socket was on the list, deletes it from the list. This is racy because, as soon as the lock is dropped after the presence check, that condition can no longer be relied upon for any decision. It is also inefficient because, if the socket is present in the list, the lock is taken twice.
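For reference, a minimal pthreads sketch (illustrative only; the node and list helpers are made up for the example) of why the old check-then-act shape is racy, next to the single-critical-section shape this patch switches to:

#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

struct node { struct node *prev, *next; bool linked; };

static void unlink_node(struct node *n)    /* caller holds table_lock */
{
    if (n->prev)
        n->prev->next = n->next;
    if (n->next)
        n->next->prev = n->prev;
    n->linked = false;
}

/* Racy: the table can change between the check and the removal. */
void remove_racy(struct node *n)
{
    bool on_list;

    pthread_mutex_lock(&table_lock);
    on_list = n->linked;
    pthread_mutex_unlock(&table_lock);    /* state may change right here */

    if (on_list) {
        pthread_mutex_lock(&table_lock);
        unlink_node(n);                   /* decision based on stale information */
        pthread_mutex_unlock(&table_lock);
    }
}

/* Fixed: check and removal happen in one critical section, lock taken once. */
void remove_safe(struct node *n)
{
    pthread_mutex_lock(&table_lock);
    if (n->linked)
        unlink_node(n);
    pthread_mutex_unlock(&table_lock);
}

After the patch, vsock_remove_bound() and vsock_remove_connected() perform the presence check inside the same spin_lock_bh() section, and vsock_remove_sock() simply calls them unconditionally.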
Signed-off-by: Sunil Muthuswamy sunilmut@microsoft.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/vmw_vsock/af_vsock.c | 38 +++++++------------------------------- 1 file changed, 7 insertions(+), 31 deletions(-)
--- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -281,7 +281,8 @@ EXPORT_SYMBOL_GPL(vsock_insert_connected void vsock_remove_bound(struct vsock_sock *vsk) { spin_lock_bh(&vsock_table_lock); - __vsock_remove_bound(vsk); + if (__vsock_in_bound_table(vsk)) + __vsock_remove_bound(vsk); spin_unlock_bh(&vsock_table_lock); } EXPORT_SYMBOL_GPL(vsock_remove_bound); @@ -289,7 +290,8 @@ EXPORT_SYMBOL_GPL(vsock_remove_bound); void vsock_remove_connected(struct vsock_sock *vsk) { spin_lock_bh(&vsock_table_lock); - __vsock_remove_connected(vsk); + if (__vsock_in_connected_table(vsk)) + __vsock_remove_connected(vsk); spin_unlock_bh(&vsock_table_lock); } EXPORT_SYMBOL_GPL(vsock_remove_connected); @@ -325,35 +327,10 @@ struct sock *vsock_find_connected_socket } EXPORT_SYMBOL_GPL(vsock_find_connected_socket);
-static bool vsock_in_bound_table(struct vsock_sock *vsk) -{ - bool ret; - - spin_lock_bh(&vsock_table_lock); - ret = __vsock_in_bound_table(vsk); - spin_unlock_bh(&vsock_table_lock); - - return ret; -} - -static bool vsock_in_connected_table(struct vsock_sock *vsk) -{ - bool ret; - - spin_lock_bh(&vsock_table_lock); - ret = __vsock_in_connected_table(vsk); - spin_unlock_bh(&vsock_table_lock); - - return ret; -} - void vsock_remove_sock(struct vsock_sock *vsk) { - if (vsock_in_bound_table(vsk)) - vsock_remove_bound(vsk); - - if (vsock_in_connected_table(vsk)) - vsock_remove_connected(vsk); + vsock_remove_bound(vsk); + vsock_remove_connected(vsk); } EXPORT_SYMBOL_GPL(vsock_remove_sock);
@@ -484,8 +461,7 @@ static void vsock_pending_work(struct wo * incoming packets can't find this socket, and to reduce the reference * count. */ - if (vsock_in_connected_table(vsk)) - vsock_remove_connected(vsk); + vsock_remove_connected(vsk);
sk->sk_state = TCP_CLOSE;
From: Trond Myklebust trond.myklebust@hammerspace.com
commit be189f7e7f03de35887e5a85ddcf39b91b5d7fc1 upstream.
We need to ensure that inode and dentry revalidation occurs correctly on reopen of a file that is already open. Currently, we can end up not revalidating either in the case of NFSv4.0, due to the 'cached open' path. Let's fix that by ensuring that we only do cached open for the special cases of open recovery and delegation return.
Reported-by: Stan Hu stanhu@gmail.com
Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com
Signed-off-by: Qian Lu luqia@amazon.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/nfs/nfs4proc.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-)
--- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -1355,12 +1355,20 @@ static bool nfs4_mode_match_open_stateid return false; }
-static int can_open_cached(struct nfs4_state *state, fmode_t mode, int open_mode) +static int can_open_cached(struct nfs4_state *state, fmode_t mode, + int open_mode, enum open_claim_type4 claim) { int ret = 0;
if (open_mode & (O_EXCL|O_TRUNC)) goto out; + switch (claim) { + case NFS4_OPEN_CLAIM_NULL: + case NFS4_OPEN_CLAIM_FH: + goto out; + default: + break; + } switch (mode & (FMODE_READ|FMODE_WRITE)) { case FMODE_READ: ret |= test_bit(NFS_O_RDONLY_STATE, &state->flags) != 0 @@ -1753,7 +1761,7 @@ static struct nfs4_state *nfs4_try_open_
for (;;) { spin_lock(&state->owner->so_lock); - if (can_open_cached(state, fmode, open_mode)) { + if (can_open_cached(state, fmode, open_mode, claim)) { update_open_stateflags(state, fmode); spin_unlock(&state->owner->so_lock); goto out_return_state; @@ -2282,7 +2290,8 @@ static void nfs4_open_prepare(struct rpc if (data->state != NULL) { struct nfs_delegation *delegation;
- if (can_open_cached(data->state, data->o_arg.fmode, data->o_arg.open_flags)) + if (can_open_cached(data->state, data->o_arg.fmode, + data->o_arg.open_flags, claim)) goto out_no_action; rcu_read_lock(); delegation = rcu_dereference(NFS_I(data->state->inode)->delegation);
From: Trond Myklebust trond.myklebust@hammerspace.com
commit 5ceb9d7fdaaf6d8ced6cd7861cf1deb9cd93fa47 upstream.
Refactor the code in nfs_lookup_revalidate() as a stepping stone towards optimising and fixing nfs4_lookup_revalidate().
Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com
Signed-off-by: Qian Lu luqia@amazon.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/nfs/dir.c | 222 +++++++++++++++++++++++++++++++++-------------------------- 1 file changed, 126 insertions(+), 96 deletions(-)
--- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -1072,6 +1072,100 @@ int nfs_neg_need_reval(struct inode *dir return !nfs_check_verifier(dir, dentry, flags & LOOKUP_RCU); }
+static int +nfs_lookup_revalidate_done(struct inode *dir, struct dentry *dentry, + struct inode *inode, int error) +{ + switch (error) { + case 1: + dfprintk(LOOKUPCACHE, "NFS: %s(%pd2) is valid\n", + __func__, dentry); + return 1; + case 0: + nfs_mark_for_revalidate(dir); + if (inode && S_ISDIR(inode->i_mode)) { + /* Purge readdir caches. */ + nfs_zap_caches(inode); + /* + * We can't d_drop the root of a disconnected tree: + * its d_hash is on the s_anon list and d_drop() would hide + * it from shrink_dcache_for_unmount(), leading to busy + * inodes on unmount and further oopses. + */ + if (IS_ROOT(dentry)) + return 1; + } + dfprintk(LOOKUPCACHE, "NFS: %s(%pd2) is invalid\n", + __func__, dentry); + return 0; + } + dfprintk(LOOKUPCACHE, "NFS: %s(%pd2) lookup returned error %d\n", + __func__, dentry, error); + return error; +} + +static int +nfs_lookup_revalidate_negative(struct inode *dir, struct dentry *dentry, + unsigned int flags) +{ + int ret = 1; + if (nfs_neg_need_reval(dir, dentry, flags)) { + if (flags & LOOKUP_RCU) + return -ECHILD; + ret = 0; + } + return nfs_lookup_revalidate_done(dir, dentry, NULL, ret); +} + +static int +nfs_lookup_revalidate_delegated(struct inode *dir, struct dentry *dentry, + struct inode *inode) +{ + nfs_set_verifier(dentry, nfs_save_change_attribute(dir)); + return nfs_lookup_revalidate_done(dir, dentry, inode, 1); +} + +static int +nfs_lookup_revalidate_dentry(struct inode *dir, struct dentry *dentry, + struct inode *inode) +{ + struct nfs_fh *fhandle; + struct nfs_fattr *fattr; + struct nfs4_label *label; + int ret; + + ret = -ENOMEM; + fhandle = nfs_alloc_fhandle(); + fattr = nfs_alloc_fattr(); + label = nfs4_label_alloc(NFS_SERVER(inode), GFP_KERNEL); + if (fhandle == NULL || fattr == NULL || IS_ERR(label)) + goto out; + + ret = NFS_PROTO(dir)->lookup(dir, &dentry->d_name, fhandle, fattr, label); + if (ret < 0) { + if (ret == -ESTALE || ret == -ENOENT) + ret = 0; + goto out; + } + ret = 0; + if (nfs_compare_fh(NFS_FH(inode), fhandle)) + goto out; + if (nfs_refresh_inode(inode, fattr) < 0) + goto out; + + nfs_setsecurity(inode, fattr, label); + nfs_set_verifier(dentry, nfs_save_change_attribute(dir)); + + /* set a readdirplus hint that we had a cache miss */ + nfs_force_use_readdirplus(dir); + ret = 1; +out: + nfs_free_fattr(fattr); + nfs_free_fhandle(fhandle); + nfs4_label_free(label); + return nfs_lookup_revalidate_done(dir, dentry, inode, ret); +} + /* * This is called every time the dcache has a lookup hit, * and we should check whether we can really trust that @@ -1083,58 +1177,36 @@ int nfs_neg_need_reval(struct inode *dir * If the parent directory is seen to have changed, we throw out the * cached dentry and do a new lookup. */ -static int nfs_lookup_revalidate(struct dentry *dentry, unsigned int flags) +static int +nfs_do_lookup_revalidate(struct inode *dir, struct dentry *dentry, + unsigned int flags) { - struct inode *dir; struct inode *inode; - struct dentry *parent; - struct nfs_fh *fhandle = NULL; - struct nfs_fattr *fattr = NULL; - struct nfs4_label *label = NULL; int error;
- if (flags & LOOKUP_RCU) { - parent = READ_ONCE(dentry->d_parent); - dir = d_inode_rcu(parent); - if (!dir) - return -ECHILD; - } else { - parent = dget_parent(dentry); - dir = d_inode(parent); - } nfs_inc_stats(dir, NFSIOS_DENTRYREVALIDATE); inode = d_inode(dentry);
- if (!inode) { - if (nfs_neg_need_reval(dir, dentry, flags)) { - if (flags & LOOKUP_RCU) - return -ECHILD; - goto out_bad; - } - goto out_valid; - } + if (!inode) + return nfs_lookup_revalidate_negative(dir, dentry, flags);
if (is_bad_inode(inode)) { - if (flags & LOOKUP_RCU) - return -ECHILD; dfprintk(LOOKUPCACHE, "%s: %pd2 has dud inode\n", __func__, dentry); goto out_bad; }
if (NFS_PROTO(dir)->have_delegation(inode, FMODE_READ)) - goto out_set_verifier; + return nfs_lookup_revalidate_delegated(dir, dentry, inode);
/* Force a full look up iff the parent directory has changed */ if (!(flags & (LOOKUP_EXCL | LOOKUP_REVAL)) && nfs_check_verifier(dir, dentry, flags & LOOKUP_RCU)) { error = nfs_lookup_verify_inode(inode, flags); if (error) { - if (flags & LOOKUP_RCU) - return -ECHILD; if (error == -ESTALE) - goto out_zap_parent; - goto out_error; + nfs_zap_caches(dir); + goto out_bad; } nfs_advise_use_readdirplus(dir); goto out_valid; @@ -1146,81 +1218,39 @@ static int nfs_lookup_revalidate(struct if (NFS_STALE(inode)) goto out_bad;
- error = -ENOMEM; - fhandle = nfs_alloc_fhandle(); - fattr = nfs_alloc_fattr(); - if (fhandle == NULL || fattr == NULL) - goto out_error; - - label = nfs4_label_alloc(NFS_SERVER(inode), GFP_NOWAIT); - if (IS_ERR(label)) - goto out_error; - trace_nfs_lookup_revalidate_enter(dir, dentry, flags); - error = NFS_PROTO(dir)->lookup(dir, &dentry->d_name, fhandle, fattr, label); + error = nfs_lookup_revalidate_dentry(dir, dentry, inode); trace_nfs_lookup_revalidate_exit(dir, dentry, flags, error); - if (error == -ESTALE || error == -ENOENT) - goto out_bad; - if (error) - goto out_error; - if (nfs_compare_fh(NFS_FH(inode), fhandle)) - goto out_bad; - if ((error = nfs_refresh_inode(inode, fattr)) != 0) - goto out_bad; - - nfs_setsecurity(inode, fattr, label); - - nfs_free_fattr(fattr); - nfs_free_fhandle(fhandle); - nfs4_label_free(label); + return error; +out_valid: + return nfs_lookup_revalidate_done(dir, dentry, inode, 1); +out_bad: + if (flags & LOOKUP_RCU) + return -ECHILD; + return nfs_lookup_revalidate_done(dir, dentry, inode, 0); +}
- /* set a readdirplus hint that we had a cache miss */ - nfs_force_use_readdirplus(dir); +static int +nfs_lookup_revalidate(struct dentry *dentry, unsigned int flags) +{ + struct dentry *parent; + struct inode *dir; + int ret;
-out_set_verifier: - nfs_set_verifier(dentry, nfs_save_change_attribute(dir)); - out_valid: if (flags & LOOKUP_RCU) { + parent = READ_ONCE(dentry->d_parent); + dir = d_inode_rcu(parent); + if (!dir) + return -ECHILD; + ret = nfs_do_lookup_revalidate(dir, dentry, flags); if (parent != READ_ONCE(dentry->d_parent)) return -ECHILD; - } else + } else { + parent = dget_parent(dentry); + ret = nfs_do_lookup_revalidate(d_inode(parent), dentry, flags); dput(parent); - dfprintk(LOOKUPCACHE, "NFS: %s(%pd2) is valid\n", - __func__, dentry); - return 1; -out_zap_parent: - nfs_zap_caches(dir); - out_bad: - WARN_ON(flags & LOOKUP_RCU); - nfs_free_fattr(fattr); - nfs_free_fhandle(fhandle); - nfs4_label_free(label); - nfs_mark_for_revalidate(dir); - if (inode && S_ISDIR(inode->i_mode)) { - /* Purge readdir caches. */ - nfs_zap_caches(inode); - /* - * We can't d_drop the root of a disconnected tree: - * its d_hash is on the s_anon list and d_drop() would hide - * it from shrink_dcache_for_unmount(), leading to busy - * inodes on unmount and further oopses. - */ - if (IS_ROOT(dentry)) - goto out_valid; } - dput(parent); - dfprintk(LOOKUPCACHE, "NFS: %s(%pd2) is invalid\n", - __func__, dentry); - return 0; -out_error: - WARN_ON(flags & LOOKUP_RCU); - nfs_free_fattr(fattr); - nfs_free_fhandle(fhandle); - nfs4_label_free(label); - dput(parent); - dfprintk(LOOKUPCACHE, "NFS: %s(%pd2) lookup returned error %d\n", - __func__, dentry, error); - return error; + return ret; }
/*
From: Trond Myklebust trond.myklebust@hammerspace.com
commit c7944ebb9ce9461079659e9e6ec5baaf73724b3b upstream.
If we're revalidating an existing dentry in order to open a file, we need to ensure that we check the directory has not changed before we optimise away the lookup.
Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com
Signed-off-by: Qian Lu luqia@amazon.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/nfs/dir.c | 79 +++++++++++++++++++++++++++++------------------------------ 1 file changed, 39 insertions(+), 40 deletions(-)
--- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -1231,7 +1231,8 @@ out_bad: }
static int -nfs_lookup_revalidate(struct dentry *dentry, unsigned int flags) +__nfs_lookup_revalidate(struct dentry *dentry, unsigned int flags, + int (*reval)(struct inode *, struct dentry *, unsigned int)) { struct dentry *parent; struct inode *dir; @@ -1242,17 +1243,22 @@ nfs_lookup_revalidate(struct dentry *den dir = d_inode_rcu(parent); if (!dir) return -ECHILD; - ret = nfs_do_lookup_revalidate(dir, dentry, flags); + ret = reval(dir, dentry, flags); if (parent != READ_ONCE(dentry->d_parent)) return -ECHILD; } else { parent = dget_parent(dentry); - ret = nfs_do_lookup_revalidate(d_inode(parent), dentry, flags); + ret = reval(d_inode(parent), dentry, flags); dput(parent); } return ret; }
+static int nfs_lookup_revalidate(struct dentry *dentry, unsigned int flags) +{ + return __nfs_lookup_revalidate(dentry, flags, nfs_do_lookup_revalidate); +} + /* * A weaker form of d_revalidate for revalidating just the d_inode(dentry) * when we don't really care about the dentry name. This is called when a @@ -1609,62 +1615,55 @@ no_open: } EXPORT_SYMBOL_GPL(nfs_atomic_open);
-static int nfs4_lookup_revalidate(struct dentry *dentry, unsigned int flags) +static int +nfs4_do_lookup_revalidate(struct inode *dir, struct dentry *dentry, + unsigned int flags) { struct inode *inode; - int ret = 0;
if (!(flags & LOOKUP_OPEN) || (flags & LOOKUP_DIRECTORY)) - goto no_open; + goto full_reval; if (d_mountpoint(dentry)) - goto no_open; - if (NFS_SB(dentry->d_sb)->caps & NFS_CAP_ATOMIC_OPEN_V1) - goto no_open; + goto full_reval;
inode = d_inode(dentry);
/* We can't create new files in nfs_open_revalidate(), so we * optimize away revalidation of negative dentries. */ - if (inode == NULL) { - struct dentry *parent; - struct inode *dir; - - if (flags & LOOKUP_RCU) { - parent = READ_ONCE(dentry->d_parent); - dir = d_inode_rcu(parent); - if (!dir) - return -ECHILD; - } else { - parent = dget_parent(dentry); - dir = d_inode(parent); - } - if (!nfs_neg_need_reval(dir, dentry, flags)) - ret = 1; - else if (flags & LOOKUP_RCU) - ret = -ECHILD; - if (!(flags & LOOKUP_RCU)) - dput(parent); - else if (parent != READ_ONCE(dentry->d_parent)) - return -ECHILD; - goto out; - } + if (inode == NULL) + goto full_reval; + + if (NFS_PROTO(dir)->have_delegation(inode, FMODE_READ)) + return nfs_lookup_revalidate_delegated(dir, dentry, inode);
/* NFS only supports OPEN on regular files */ if (!S_ISREG(inode->i_mode)) - goto no_open; + goto full_reval; + /* We cannot do exclusive creation on a positive dentry */ - if (flags & LOOKUP_EXCL) - goto no_open; + if (flags & (LOOKUP_EXCL | LOOKUP_REVAL)) + goto reval_dentry; + + /* Check if the directory changed */ + if (!nfs_check_verifier(dir, dentry, flags & LOOKUP_RCU)) + goto reval_dentry;
/* Let f_op->open() actually open (and revalidate) the file */ - ret = 1; + return 1; +reval_dentry: + if (flags & LOOKUP_RCU) + return -ECHILD; + return nfs_lookup_revalidate_dentry(dir, dentry, inode);;
-out: - return ret; +full_reval: + return nfs_do_lookup_revalidate(dir, dentry, flags); +}
-no_open: - return nfs_lookup_revalidate(dentry, flags); +static int nfs4_lookup_revalidate(struct dentry *dentry, unsigned int flags) +{ + return __nfs_lookup_revalidate(dentry, flags, + nfs4_do_lookup_revalidate); }
#endif /* CONFIG_NFSV4 */
From: Minas Harutyunyan Minas.Harutyunyan@synopsys.com
commit dccf1bad4be7eaa096c1f3697bd37883f9a08ecb upstream.
Disabling all EPs allows resetting them to their initial state. On disconnect, disable all EPs instead of just killing all requests. Because some platforms do not catch the disconnect event, the same handling is added to dwc2_hsotg_core_init_disconnected(), which runs when a USB reset is detected on the bus.
Changes from version 1: changed the lock acquisition flow in the dwc2_hsotg_ep_disable() function.
Signed-off-by: Minas Harutyunyan hminas@synopsys.com
Signed-off-by: Felipe Balbi felipe.balbi@linux.intel.com
Signed-off-by: Amit Pundir amit.pundir@linaro.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/dwc2/gadget.c | 30 +++++++++++++++++++++++------- 1 file changed, 23 insertions(+), 7 deletions(-)
--- a/drivers/usb/dwc2/gadget.c +++ b/drivers/usb/dwc2/gadget.c @@ -3107,6 +3107,8 @@ static void kill_all_requests(struct dwc dwc2_hsotg_txfifo_flush(hsotg, ep->fifo_index); }
+static int dwc2_hsotg_ep_disable(struct usb_ep *ep); + /** * dwc2_hsotg_disconnect - disconnect service * @hsotg: The device state. @@ -3125,13 +3127,12 @@ void dwc2_hsotg_disconnect(struct dwc2_h hsotg->connected = 0; hsotg->test_mode = 0;
+ /* all endpoints should be shutdown */ for (ep = 0; ep < hsotg->num_of_eps; ep++) { if (hsotg->eps_in[ep]) - kill_all_requests(hsotg, hsotg->eps_in[ep], - -ESHUTDOWN); + dwc2_hsotg_ep_disable(&hsotg->eps_in[ep]->ep); if (hsotg->eps_out[ep]) - kill_all_requests(hsotg, hsotg->eps_out[ep], - -ESHUTDOWN); + dwc2_hsotg_ep_disable(&hsotg->eps_out[ep]->ep); }
call_gadget(hsotg, disconnect); @@ -3189,13 +3190,23 @@ void dwc2_hsotg_core_init_disconnected(s u32 val; u32 usbcfg; u32 dcfg = 0; + int ep;
/* Kill any ep0 requests as controller will be reinitialized */ kill_all_requests(hsotg, hsotg->eps_out[0], -ECONNRESET);
- if (!is_usb_reset) + if (!is_usb_reset) { if (dwc2_core_reset(hsotg, true)) return; + } else { + /* all endpoints should be shutdown */ + for (ep = 1; ep < hsotg->num_of_eps; ep++) { + if (hsotg->eps_in[ep]) + dwc2_hsotg_ep_disable(&hsotg->eps_in[ep]->ep); + if (hsotg->eps_out[ep]) + dwc2_hsotg_ep_disable(&hsotg->eps_out[ep]->ep); + } + }
/* * we must now enable ep0 ready for host detection and then @@ -3996,6 +4007,7 @@ static int dwc2_hsotg_ep_disable(struct unsigned long flags; u32 epctrl_reg; u32 ctrl; + int locked;
dev_dbg(hsotg->dev, "%s(ep %p)\n", __func__, ep);
@@ -4011,7 +4023,9 @@ static int dwc2_hsotg_ep_disable(struct
epctrl_reg = dir_in ? DIEPCTL(index) : DOEPCTL(index);
- spin_lock_irqsave(&hsotg->lock, flags); + locked = spin_is_locked(&hsotg->lock); + if (!locked) + spin_lock_irqsave(&hsotg->lock, flags);
ctrl = dwc2_readl(hsotg, epctrl_reg);
@@ -4035,7 +4049,9 @@ static int dwc2_hsotg_ep_disable(struct hs_ep->fifo_index = 0; hs_ep->fifo_size = 0;
- spin_unlock_irqrestore(&hsotg->lock, flags); + if (!locked) + spin_unlock_irqrestore(&hsotg->lock, flags); + return 0; }
From: Minas Harutyunyan minas.harutyunyan@synopsys.com
commit 4fe4f9fecc36956fd53c8edf96dd0c691ef98ff9 upstream.
Disabling all EPs allows resetting them to their initial state. Introduce a new function, dwc2_hsotg_ep_disable_lock(), which acquires hsotg->lock, calls dwc2_hsotg_ep_disable(), and releases the lock on exit. The lock acquisition is removed from dwc2_hsotg_ep_disable() itself. In dwc2_hsotg_core_init_disconnected(), when the USB reset interrupt is asserted, all EPs are disabled by calling dwc2_hsotg_ep_disable(). This update eliminates the sparse imbalance warnings.

The changes in dwc2_hsotg_disconnect() are reverted. The dwc2_hsotg_ep_ops .disable callback now points to dwc2_hsotg_ep_disable_lock(). In dwc2_hsotg_udc_stop() and dwc2_hsotg_suspend(), calls to dwc2_hsotg_ep_disable() are replaced by dwc2_hsotg_ep_disable_lock().
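The resulting locking shape, reduced to a small pthreads sketch (illustrative only; ep_disable()/ep_disable_lock() stand in for the dwc2 functions):

#include <pthread.h>

static pthread_mutex_t dev_lock = PTHREAD_MUTEX_INITIALIZER;

static int ep_disable(void)        /* caller must already hold dev_lock */
{
    /* ... tear down endpoint state protected by dev_lock ... */
    return 0;
}

int ep_disable_lock(void)          /* entry point for callers without the lock */
{
    int ret;

    pthread_mutex_lock(&dev_lock);
    ret = ep_disable();
    pthread_mutex_unlock(&dev_lock);
    return ret;
}

Callers that already run under hsotg->lock (such as the USB-reset path) use the bare function; the gadget ops, udc_stop and suspend paths use the locking wrapper. This avoids the spin_is_locked() test from the previous patch, which cannot tell whether the current context or some other CPU holds the lock.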
Fixes: dccf1bad4be7 ("usb: dwc2: Disable all EP's on disconnect")
Signed-off-by: Minas Harutyunyan hminas@synopsys.com
Signed-off-by: Felipe Balbi felipe.balbi@linux.intel.com
Signed-off-by: Amit Pundir amit.pundir@linaro.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/dwc2/gadget.c | 41 +++++++++++++++++++++++------------------ 1 file changed, 23 insertions(+), 18 deletions(-)
--- a/drivers/usb/dwc2/gadget.c +++ b/drivers/usb/dwc2/gadget.c @@ -3107,8 +3107,6 @@ static void kill_all_requests(struct dwc dwc2_hsotg_txfifo_flush(hsotg, ep->fifo_index); }
-static int dwc2_hsotg_ep_disable(struct usb_ep *ep); - /** * dwc2_hsotg_disconnect - disconnect service * @hsotg: The device state. @@ -3130,9 +3128,11 @@ void dwc2_hsotg_disconnect(struct dwc2_h /* all endpoints should be shutdown */ for (ep = 0; ep < hsotg->num_of_eps; ep++) { if (hsotg->eps_in[ep]) - dwc2_hsotg_ep_disable(&hsotg->eps_in[ep]->ep); + kill_all_requests(hsotg, hsotg->eps_in[ep], + -ESHUTDOWN); if (hsotg->eps_out[ep]) - dwc2_hsotg_ep_disable(&hsotg->eps_out[ep]->ep); + kill_all_requests(hsotg, hsotg->eps_out[ep], + -ESHUTDOWN); }
call_gadget(hsotg, disconnect); @@ -3176,6 +3176,7 @@ static void dwc2_hsotg_irq_fifoempty(str GINTSTS_PTXFEMP | \ GINTSTS_RXFLVL)
+static int dwc2_hsotg_ep_disable(struct usb_ep *ep); /** * dwc2_hsotg_core_init - issue softreset to the core * @hsotg: The device state @@ -4004,10 +4005,8 @@ static int dwc2_hsotg_ep_disable(struct struct dwc2_hsotg *hsotg = hs_ep->parent; int dir_in = hs_ep->dir_in; int index = hs_ep->index; - unsigned long flags; u32 epctrl_reg; u32 ctrl; - int locked;
dev_dbg(hsotg->dev, "%s(ep %p)\n", __func__, ep);
@@ -4023,10 +4022,6 @@ static int dwc2_hsotg_ep_disable(struct
epctrl_reg = dir_in ? DIEPCTL(index) : DOEPCTL(index);
- locked = spin_is_locked(&hsotg->lock); - if (!locked) - spin_lock_irqsave(&hsotg->lock, flags); - ctrl = dwc2_readl(hsotg, epctrl_reg);
if (ctrl & DXEPCTL_EPENA) @@ -4049,12 +4044,22 @@ static int dwc2_hsotg_ep_disable(struct hs_ep->fifo_index = 0; hs_ep->fifo_size = 0;
- if (!locked) - spin_unlock_irqrestore(&hsotg->lock, flags); - return 0; }
+static int dwc2_hsotg_ep_disable_lock(struct usb_ep *ep) +{ + struct dwc2_hsotg_ep *hs_ep = our_ep(ep); + struct dwc2_hsotg *hsotg = hs_ep->parent; + unsigned long flags; + int ret; + + spin_lock_irqsave(&hsotg->lock, flags); + ret = dwc2_hsotg_ep_disable(ep); + spin_unlock_irqrestore(&hsotg->lock, flags); + return ret; +} + /** * on_list - check request is on the given endpoint * @ep: The endpoint to check. @@ -4202,7 +4207,7 @@ static int dwc2_hsotg_ep_sethalt_lock(st
static const struct usb_ep_ops dwc2_hsotg_ep_ops = { .enable = dwc2_hsotg_ep_enable, - .disable = dwc2_hsotg_ep_disable, + .disable = dwc2_hsotg_ep_disable_lock, .alloc_request = dwc2_hsotg_ep_alloc_request, .free_request = dwc2_hsotg_ep_free_request, .queue = dwc2_hsotg_ep_queue_lock, @@ -4342,9 +4347,9 @@ static int dwc2_hsotg_udc_stop(struct us /* all endpoints should be shutdown */ for (ep = 1; ep < hsotg->num_of_eps; ep++) { if (hsotg->eps_in[ep]) - dwc2_hsotg_ep_disable(&hsotg->eps_in[ep]->ep); + dwc2_hsotg_ep_disable_lock(&hsotg->eps_in[ep]->ep); if (hsotg->eps_out[ep]) - dwc2_hsotg_ep_disable(&hsotg->eps_out[ep]->ep); + dwc2_hsotg_ep_disable_lock(&hsotg->eps_out[ep]->ep); }
spin_lock_irqsave(&hsotg->lock, flags); @@ -4792,9 +4797,9 @@ int dwc2_hsotg_suspend(struct dwc2_hsotg
for (ep = 0; ep < hsotg->num_of_eps; ep++) { if (hsotg->eps_in[ep]) - dwc2_hsotg_ep_disable(&hsotg->eps_in[ep]->ep); + dwc2_hsotg_ep_disable_lock(&hsotg->eps_in[ep]->ep); if (hsotg->eps_out[ep]) - dwc2_hsotg_ep_disable(&hsotg->eps_out[ep]->ep); + dwc2_hsotg_ep_disable_lock(&hsotg->eps_out[ep]->ep); } }
From: Will Deacon will.deacon@arm.com
commit 24951465cbd279f60b1fdc2421b3694405bcff42 upstream.
arch/arm/ defines a SIGMINSTKSZ of 2k, so we should use the same value for compat tasks.
Cc: Arnd Bergmann arnd@arndb.de
Cc: Dominik Brodowski linux@dominikbrodowski.net
Cc: "Eric W. Biederman" ebiederm@xmission.com
Cc: Andrew Morton akpm@linux-foundation.org
Cc: Al Viro viro@zeniv.linux.org.uk
Cc: Oleg Nesterov oleg@redhat.com
Reviewed-by: Dave Martin Dave.Martin@arm.com
Reported-by: Steve McIntyre steve.mcintyre@arm.com
Tested-by: Steve McIntyre 93sam@debian.org
Signed-off-by: Will Deacon will.deacon@arm.com
Signed-off-by: Catalin Marinas catalin.marinas@arm.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/include/asm/compat.h | 1 + 1 file changed, 1 insertion(+)
--- a/arch/arm64/include/asm/compat.h +++ b/arch/arm64/include/asm/compat.h @@ -159,6 +159,7 @@ static inline compat_uptr_t ptr_to_compa }
#define compat_user_stack_pointer() (user_stack_pointer(task_pt_regs(current))) +#define COMPAT_MINSIGSTKSZ 2048
static inline void __user *arch_compat_alloc_user_space(long len) {
From: Todd Kjos tkjos@android.com
commit a370003cc301d4361bae20c9ef615f89bf8d1e8a upstream.
There is a race between the binder driver cleaning up a completed transaction via binder_free_transaction() and a user calling binder_ioctl(BC_FREE_BUFFER) to release a buffer. It doesn't matter which happens first, but they need to be protected against running concurrently, which can otherwise result in a UAF.
Signed-off-by: Todd Kjos tkjos@google.com
Cc: stable stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/android/binder.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-)
--- a/drivers/android/binder.c +++ b/drivers/android/binder.c @@ -1960,8 +1960,18 @@ static struct binder_thread *binder_get_
static void binder_free_transaction(struct binder_transaction *t) { - if (t->buffer) - t->buffer->transaction = NULL; + struct binder_proc *target_proc = t->to_proc; + + if (target_proc) { + binder_inner_proc_lock(target_proc); + if (t->buffer) + t->buffer->transaction = NULL; + binder_inner_proc_unlock(target_proc); + } + /* + * If the transaction has no target_proc, then + * t->buffer->transaction has already been cleared. + */ kfree(t); binder_stats_deleted(BINDER_STAT_TRANSACTION); } @@ -3484,10 +3494,12 @@ static int binder_thread_write(struct bi buffer->debug_id, buffer->transaction ? "active" : "finished");
+ binder_inner_proc_lock(proc); if (buffer->transaction) { buffer->transaction->buffer = NULL; buffer->transaction = NULL; } + binder_inner_proc_unlock(proc); if (buffer->async_transaction && buffer->target_node) { struct binder_node *buf_node; struct binder_work *w;
From: Phong Tran tranmanphong@gmail.com
commit f384e62a82ba5d85408405fdd6aeff89354deaa9 upstream.
The syzbot test uses a random endpoint address, which can make idx overflow the table of endpoint configurations.
This adds a check to fix the error reported by syzbot:
KASAN: stack-out-of-bounds Read in hfcsusb_probe [1]. The patch was tested by syzbot [2].
Reported-by: syzbot+8750abbc3a46ef47d509@syzkaller.appspotmail.com
[1]: https://syzkaller.appspot.com/bug?id=30a04378dac680c5d521304a00a86156bb91352... [2]: https://groups.google.com/d/msg/syzkaller-bugs/_6HBdge8F3E/OJn7wVNpBAAJ
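As a side note, the index math being guarded can be seen in a standalone sketch (the sample endpoint addresses below are made up; only the formula matches the driver):

#include <stdio.h>

int main(void)
{
    /* bEndpointAddress: bit 7 = IN direction, low bits = endpoint number */
    const unsigned char addrs[] = { 0x01, 0x81, 0x08, 0x88, 0x0f, 0x8f };

    for (unsigned int i = 0; i < sizeof(addrs); i++) {
        unsigned char ep_addr = addrs[i];
        int idx = ((ep_addr & 0x7f) - 1) * 2;   /* same formula as hfcsusb_probe() */

        if (ep_addr & 0x80)
            idx++;

        /* The table has 16 slots (endpoints 1..8, IN and OUT), so anything
         * else must be rejected before it is used as an array index. */
        if (idx < 0 || idx > 15)
            printf("ep 0x%02x -> idx %d: out of range, reject\n", ep_addr, idx);
        else
            printf("ep 0x%02x -> idx %d: ok\n", ep_addr, idx);
    }
    return 0;
}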
Signed-off-by: Phong Tran tranmanphong@gmail.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/isdn/hardware/mISDN/hfcsusb.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/isdn/hardware/mISDN/hfcsusb.c +++ b/drivers/isdn/hardware/mISDN/hfcsusb.c @@ -1967,6 +1967,9 @@ hfcsusb_probe(struct usb_interface *intf
/* get endpoint base */ idx = ((ep_addr & 0x7f) - 1) * 2; + if (idx > 15) + return -EIO; + if (ep_addr & 0x80) idx++; attr = ep->desc.bmAttributes;
From: Sean Young sean@mess.org
commit 6d0d1ff9ff21fbb06b867c13a1d41ce8ddcd8230 upstream.
au0828_usb_disconnect() gets the au0828_dev struct via usb_get_intfdata, so the pointer needs to be set up before the error paths can be reached.
Reported-by: syzbot+357d86bcb4cca1a2f572@syzkaller.appspotmail.com
Signed-off-by: Sean Young sean@mess.org
Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/usb/au0828/au0828-core.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/media/usb/au0828/au0828-core.c +++ b/drivers/media/usb/au0828/au0828-core.c @@ -623,6 +623,12 @@ static int au0828_usb_probe(struct usb_i /* Setup */ au0828_card_setup(dev);
+ /* + * Store the pointer to the au0828_dev so it can be accessed in + * au0828_usb_disconnect + */ + usb_set_intfdata(interface, dev); + /* Analog TV */ retval = au0828_analog_register(dev, interface); if (retval) { @@ -641,12 +647,6 @@ static int au0828_usb_probe(struct usb_i /* Remote controller */ au0828_rc_register(dev);
- /* - * Store the pointer to the au0828_dev so it can be accessed in - * au0828_usb_disconnect - */ - usb_set_intfdata(interface, dev); - pr_info("Registered device AU0828 [%s]\n", dev->board.name == NULL ? "Unset" : dev->board.name);
From: Fabio Estevam festevam@gmail.com
commit 265df32eae5845212ad9f55f5ae6b6dcb68b187b upstream.
The "WARNING" string confuses syzbot, which thinks it found a crash [1].
Change the string to avoid such problem.
[1] https://lkml.org/lkml/2019/5/9/243
Reported-by: syzbot+c1b25598aa60dcd47e78@syzkaller.appspotmail.com
Suggested-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Signed-off-by: Fabio Estevam festevam@gmail.com
Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Signed-off-by: Kalle Valo kvalo@codeaurora.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/wireless/ath/ath10k/usb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/wireless/ath/ath10k/usb.c +++ b/drivers/net/wireless/ath/ath10k/usb.c @@ -1025,7 +1025,7 @@ static int ath10k_usb_probe(struct usb_i }
/* TODO: remove this once USB support is fully implemented */ - ath10k_warn(ar, "WARNING: ath10k USB support is incomplete, don't expect anything to work!\n"); + ath10k_warn(ar, "Warning: ath10k USB support is incomplete, don't expect anything to work!\n");
return 0;
From: Oliver Neukum oneukum@suse.com
commit eff73de2b1600ad8230692f00bc0ab49b166512a upstream.
KASAN reported a use after free in cpia2_usb_disconnect(). It first freed everything and then woke up those waiting. The reverse order is correct.
Fixes: 6c493f8b28c67 ("[media] cpia2: major overhaul to get it in a working state again")
Signed-off-by: Oliver Neukum oneukum@suse.com
Reported-by: syzbot+0c90fc937c84f97d0aa6@syzkaller.appspotmail.com
Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl
Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/usb/cpia2/cpia2_usb.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/media/usb/cpia2/cpia2_usb.c +++ b/drivers/media/usb/cpia2/cpia2_usb.c @@ -902,7 +902,6 @@ static void cpia2_usb_disconnect(struct cpia2_unregister_camera(cam); v4l2_device_disconnect(&cam->v4l2_dev); mutex_unlock(&cam->v4l2_lock); - v4l2_device_put(&cam->v4l2_dev);
if(cam->buffers) { DBG("Wakeup waiting processes\n"); @@ -911,6 +910,8 @@ static void cpia2_usb_disconnect(struct wake_up_interruptible(&cam->wq_stream); }
+ v4l2_device_put(&cam->v4l2_dev); + LOG("CPiA2 camera disconnected.\n"); }
From: Andrey Konovalov andreyknvl@google.com
commit 1753c7c4367aa1201e1e5d0a601897ab33444af1 upstream.
When the pvrusb2 driver detects that there's something wrong with the device, it prints a warning message. Right now those messages are printed in two different formats:
1. ***WARNING*** message here
2. WARNING: message here
There's an issue with the second format. Syzkaller recognizes it as a message produced by a WARN_ON(), which is used to indicate a bug in the kernel. However, pvrusb2 prints those warnings to indicate an issue with the device, not a bug in the kernel.
This patch changes the pvrusb2 driver to consistently use the first warning message format. This will unblock syzkaller testing of this driver.
Reported-by: syzbot+af8f8d2ac0d39b0ed3a0@syzkaller.appspotmail.com
Reported-by: syzbot+170a86bf206dd2c6217e@syzkaller.appspotmail.com
Signed-off-by: Andrey Konovalov andreyknvl@google.com
Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl
Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/media/usb/pvrusb2/pvrusb2-hdw.c | 4 ++--
 drivers/media/usb/pvrusb2/pvrusb2-i2c-core.c | 6 +++---
 drivers/media/usb/pvrusb2/pvrusb2-std.c | 2 +-
 3 files changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/media/usb/pvrusb2/pvrusb2-hdw.c +++ b/drivers/media/usb/pvrusb2/pvrusb2-hdw.c @@ -1680,7 +1680,7 @@ static int pvr2_decoder_enable(struct pv } if (!hdw->flag_decoder_missed) { pvr2_trace(PVR2_TRACE_ERROR_LEGS, - "WARNING: No decoder present"); + "***WARNING*** No decoder present"); hdw->flag_decoder_missed = !0; trace_stbit("flag_decoder_missed", hdw->flag_decoder_missed); @@ -2366,7 +2366,7 @@ struct pvr2_hdw *pvr2_hdw_create(struct if (hdw_desc->flag_is_experimental) { pvr2_trace(PVR2_TRACE_INFO, "**********"); pvr2_trace(PVR2_TRACE_INFO, - "WARNING: Support for this device (%s) is experimental.", + "***WARNING*** Support for this device (%s) is experimental.", hdw_desc->description); pvr2_trace(PVR2_TRACE_INFO, "Important functionality might not be entirely working."); --- a/drivers/media/usb/pvrusb2/pvrusb2-i2c-core.c +++ b/drivers/media/usb/pvrusb2/pvrusb2-i2c-core.c @@ -343,11 +343,11 @@ static int i2c_hack_cx25840(struct pvr2_
if ((ret != 0) || (*rdata == 0x04) || (*rdata == 0x0a)) { pvr2_trace(PVR2_TRACE_ERROR_LEGS, - "WARNING: Detected a wedged cx25840 chip; the device will not work."); + "***WARNING*** Detected a wedged cx25840 chip; the device will not work."); pvr2_trace(PVR2_TRACE_ERROR_LEGS, - "WARNING: Try power cycling the pvrusb2 device."); + "***WARNING*** Try power cycling the pvrusb2 device."); pvr2_trace(PVR2_TRACE_ERROR_LEGS, - "WARNING: Disabling further access to the device to prevent other foul-ups."); + "***WARNING*** Disabling further access to the device to prevent other foul-ups."); // This blocks all further communication with the part. hdw->i2c_func[0x44] = NULL; pvr2_hdw_render_useless(hdw); --- a/drivers/media/usb/pvrusb2/pvrusb2-std.c +++ b/drivers/media/usb/pvrusb2/pvrusb2-std.c @@ -353,7 +353,7 @@ struct v4l2_standard *pvr2_std_create_en bcnt = pvr2_std_id_to_str(buf,sizeof(buf),fmsk); pvr2_trace( PVR2_TRACE_ERROR_LEGS, - "WARNING: Failed to classify the following standard(s): %.*s", + "***WARNING*** Failed to classify the following standard(s): %.*s", bcnt,buf); }
From: Benjamin Coddington bcodding@redhat.com
commit 9f7761cf0409465075dadb875d5d4b8ef2f890c8 upstream.
Don't bail out before cleaning up a new allocation if the wait for searching for a matching nfs client is interrupted. Otherwise, memory leaks.
Reported-by: syzbot+7fe11b49c1cc30e3fce2@syzkaller.appspotmail.com
Fixes: 950a578c6128 ("NFS: make nfs_match_client killable")
Signed-off-by: Benjamin Coddington bcodding@redhat.com
Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/nfs/client.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/nfs/client.c +++ b/fs/nfs/client.c @@ -416,10 +416,10 @@ struct nfs_client *nfs_get_client(const clp = nfs_match_client(cl_init); if (clp) { spin_unlock(&nn->nfs_client_lock); - if (IS_ERR(clp)) - return clp; if (new) new->rpc_ops->free_client(new); + if (IS_ERR(clp)) + return clp; return nfs_found_client(cl_init, clp); } if (new) {
From: Luke Nowakowski-Krijger lnowakow@eng.ucsd.edu
commit c666355e60ddb4748ead3bdd983e3f7f2224aaf0 upstream.
Change devm_k*alloc to k*alloc to manually allocate memory.
The manual allocation and freeing of memory is necessary because when the USB radio is disconnected, the memory associated with devm_k*alloc is freed. This means that if we still have unresolved references to the radio device, we get use-after-free errors.
This patch fixes this by manually allocating memory, and freeing it in the v4l2.release callback that gets called when the last radio device exits.
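The lifetime rule being applied can be reduced to a plain-C refcount sketch (illustrative only; these are not the real V4L2 types or helpers):

#include <stdlib.h>

struct radio {
    int refcount;
    char *buffer;
};

static void radio_release(struct radio *r)    /* runs when the last user is gone */
{
    free(r->buffer);
    free(r);
}

static void radio_put(struct radio *r)
{
    if (--r->refcount == 0)
        radio_release(r);
}

int main(void)
{
    struct radio *r = calloc(1, sizeof(*r));

    if (!r)
        return 1;
    r->buffer = malloc(32);
    r->refcount = 2;          /* e.g. the USB side plus an open file handle */

    radio_put(r);             /* disconnect: memory must survive, a user remains */
    radio_put(r);             /* last close: now freeing is safe */
    return 0;
}

With devm_k*alloc the first step already freed the memory, so the remaining user dereferenced freed storage; the patch moves the kfree() calls into the v4l2_device release callback instead.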
Reported-and-tested-by: syzbot+a4387f5b6b799f6becbf@syzkaller.appspotmail.com
Signed-off-by: Luke Nowakowski-Krijger lnowakow@eng.ucsd.edu
Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl
[hverkuil-cisco@xs4all.nl: cleaned up two small checkpatch.pl warnings]
[hverkuil-cisco@xs4all.nl: prefix subject with driver name]
Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/radio/radio-raremono.c | 30 +++++++++++++++++++++++------- 1 file changed, 23 insertions(+), 7 deletions(-)
--- a/drivers/media/radio/radio-raremono.c +++ b/drivers/media/radio/radio-raremono.c @@ -271,6 +271,14 @@ static int vidioc_g_frequency(struct fil return 0; }
+static void raremono_device_release(struct v4l2_device *v4l2_dev) +{ + struct raremono_device *radio = to_raremono_dev(v4l2_dev); + + kfree(radio->buffer); + kfree(radio); +} + /* File system interface */ static const struct v4l2_file_operations usb_raremono_fops = { .owner = THIS_MODULE, @@ -295,12 +303,14 @@ static int usb_raremono_probe(struct usb struct raremono_device *radio; int retval = 0;
- radio = devm_kzalloc(&intf->dev, sizeof(struct raremono_device), GFP_KERNEL); - if (radio) - radio->buffer = devm_kmalloc(&intf->dev, BUFFER_LENGTH, GFP_KERNEL); - - if (!radio || !radio->buffer) + radio = kzalloc(sizeof(*radio), GFP_KERNEL); + if (!radio) + return -ENOMEM; + radio->buffer = kmalloc(BUFFER_LENGTH, GFP_KERNEL); + if (!radio->buffer) { + kfree(radio); return -ENOMEM; + }
radio->usbdev = interface_to_usbdev(intf); radio->intf = intf; @@ -324,7 +334,8 @@ static int usb_raremono_probe(struct usb if (retval != 3 || (get_unaligned_be16(&radio->buffer[1]) & 0xfff) == 0x0242) { dev_info(&intf->dev, "this is not Thanko's Raremono.\n"); - return -ENODEV; + retval = -ENODEV; + goto free_mem; }
dev_info(&intf->dev, "Thanko's Raremono connected: (%04X:%04X)\n", @@ -333,7 +344,7 @@ static int usb_raremono_probe(struct usb retval = v4l2_device_register(&intf->dev, &radio->v4l2_dev); if (retval < 0) { dev_err(&intf->dev, "couldn't register v4l2_device\n"); - return retval; + goto free_mem; }
mutex_init(&radio->lock); @@ -345,6 +356,7 @@ static int usb_raremono_probe(struct usb radio->vdev.ioctl_ops = &usb_raremono_ioctl_ops; radio->vdev.lock = &radio->lock; radio->vdev.release = video_device_release_empty; + radio->v4l2_dev.release = raremono_device_release;
usb_set_intfdata(intf, &radio->v4l2_dev);
@@ -360,6 +372,10 @@ static int usb_raremono_probe(struct usb } dev_err(&intf->dev, "could not register video device\n"); v4l2_device_unregister(&radio->v4l2_dev); + +free_mem: + kfree(radio->buffer); + kfree(radio); return retval; }
From: Dmitry Safonov dima@arista.com
commit effa467870c7612012885df4e246bdb8ffd8e44c upstream.
The Intel VT-d driver was reworked to use the common deferred flushing implementation. Previously there was one global per-cpu flush queue; afterwards, one per domain.
Before deferring a flush, the queue should be allocated and initialized.
Currently only domains with IOMMU_DOMAIN_DMA type initialize their flush queue. It's probably worth initializing it for static or unmanaged domains too, but that may be arguable - I'm leaving it to the iommu folks.
Prevent queuing an iova flush if the domain doesn't have a queue. The defensive check seems worth keeping even if the queue were initialized for all kinds of domains, and it is easily backportable.
On the 4.19.43 stable kernel it has a user-visible effect: previously, for devices in the si domain, there were crashes on SATA devices:
BUG: spinlock bad magic on CPU#6, swapper/0/1
 lock: 0xffff88844f582008, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
CPU: 6 PID: 1 Comm: swapper/0 Not tainted 4.19.43 #1
Call Trace:
 <IRQ>
 dump_stack+0x61/0x7e
 spin_bug+0x9d/0xa3
 do_raw_spin_lock+0x22/0x8e
 _raw_spin_lock_irqsave+0x32/0x3a
 queue_iova+0x45/0x115
 intel_unmap+0x107/0x113
 intel_unmap_sg+0x6b/0x76
 __ata_qc_complete+0x7f/0x103
 ata_qc_complete+0x9b/0x26a
 ata_qc_complete_multiple+0xd0/0xe3
 ahci_handle_port_interrupt+0x3ee/0x48a
 ahci_handle_port_intr+0x73/0xa9
 ahci_single_level_irq_intr+0x40/0x60
 __handle_irq_event_percpu+0x7f/0x19a
 handle_irq_event_percpu+0x32/0x72
 handle_irq_event+0x38/0x56
 handle_edge_irq+0x102/0x121
 handle_irq+0x147/0x15c
 do_IRQ+0x66/0xf2
 common_interrupt+0xf/0xf
 RIP: 0010:__do_softirq+0x8c/0x2df
The same for usb devices that use ehci-pci:
BUG: spinlock bad magic on CPU#0, swapper/0/1
 lock: 0xffff88844f402008, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.43 #4
Call Trace:
 <IRQ>
 dump_stack+0x61/0x7e
 spin_bug+0x9d/0xa3
 do_raw_spin_lock+0x22/0x8e
 _raw_spin_lock_irqsave+0x32/0x3a
 queue_iova+0x77/0x145
 intel_unmap+0x107/0x113
 intel_unmap_page+0xe/0x10
 usb_hcd_unmap_urb_setup_for_dma+0x53/0x9d
 usb_hcd_unmap_urb_for_dma+0x17/0x100
 unmap_urb_for_dma+0x22/0x24
 __usb_hcd_giveback_urb+0x51/0xc3
 usb_giveback_urb_bh+0x97/0xde
 tasklet_action_common.isra.4+0x5f/0xa1
 tasklet_action+0x2d/0x30
 __do_softirq+0x138/0x2df
 irq_exit+0x7d/0x8b
 smp_apic_timer_interrupt+0x10f/0x151
 apic_timer_interrupt+0xf/0x20
 </IRQ>
 RIP: 0010:_raw_spin_unlock_irqrestore+0x17/0x39
Cc: David Woodhouse dwmw2@infradead.org
Cc: Joerg Roedel joro@8bytes.org
Cc: Lu Baolu baolu.lu@linux.intel.com
Cc: iommu@lists.linux-foundation.org
Cc: stable@vger.kernel.org # 4.14+
Fixes: 13cf01744608 ("iommu/vt-d: Make use of iova deferred flushing")
Signed-off-by: Dmitry Safonov dima@arista.com
Reviewed-by: Lu Baolu baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel jroedel@suse.de
[v4.14-port notes: o minor conflict with untrusted IOMMU devices check under if-condition]
Signed-off-by: Dmitry Safonov dima@arista.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/iommu/intel-iommu.c | 2 +-
 drivers/iommu/iova.c | 18 ++++++++++++++----
 include/linux/iova.h | 6 ++++++
 3 files changed, 21 insertions(+), 5 deletions(-)
--- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -3721,7 +3721,7 @@ static void intel_unmap(struct device *d
freelist = domain_unmap(domain, start_pfn, last_pfn);
- if (intel_iommu_strict) { + if (intel_iommu_strict || !has_iova_flush_queue(&domain->iovad)) { iommu_flush_iotlb_psi(iommu, domain, start_pfn, nrpages, !freelist, 0); /* free iova */ --- a/drivers/iommu/iova.c +++ b/drivers/iommu/iova.c @@ -65,9 +65,14 @@ init_iova_domain(struct iova_domain *iov } EXPORT_SYMBOL_GPL(init_iova_domain);
+bool has_iova_flush_queue(struct iova_domain *iovad) +{ + return !!iovad->fq; +} + static void free_iova_flush_queue(struct iova_domain *iovad) { - if (!iovad->fq) + if (!has_iova_flush_queue(iovad)) return;
if (timer_pending(&iovad->fq_timer)) @@ -85,13 +90,14 @@ static void free_iova_flush_queue(struct int init_iova_flush_queue(struct iova_domain *iovad, iova_flush_cb flush_cb, iova_entry_dtor entry_dtor) { + struct iova_fq __percpu *queue; int cpu;
atomic64_set(&iovad->fq_flush_start_cnt, 0); atomic64_set(&iovad->fq_flush_finish_cnt, 0);
- iovad->fq = alloc_percpu(struct iova_fq); - if (!iovad->fq) + queue = alloc_percpu(struct iova_fq); + if (!queue) return -ENOMEM;
iovad->flush_cb = flush_cb; @@ -100,13 +106,17 @@ int init_iova_flush_queue(struct iova_do for_each_possible_cpu(cpu) { struct iova_fq *fq;
- fq = per_cpu_ptr(iovad->fq, cpu); + fq = per_cpu_ptr(queue, cpu); fq->head = 0; fq->tail = 0;
spin_lock_init(&fq->lock); }
+ smp_wmb(); + + iovad->fq = queue; + timer_setup(&iovad->fq_timer, fq_flush_timeout, 0); atomic_set(&iovad->fq_timer_on, 0);
--- a/include/linux/iova.h +++ b/include/linux/iova.h @@ -156,6 +156,7 @@ struct iova *reserve_iova(struct iova_do void copy_reserved_iova(struct iova_domain *from, struct iova_domain *to); void init_iova_domain(struct iova_domain *iovad, unsigned long granule, unsigned long start_pfn); +bool has_iova_flush_queue(struct iova_domain *iovad); int init_iova_flush_queue(struct iova_domain *iovad, iova_flush_cb flush_cb, iova_entry_dtor entry_dtor); struct iova *find_iova(struct iova_domain *iovad, unsigned long pfn); @@ -236,6 +237,11 @@ static inline void init_iova_domain(stru { }
+bool has_iova_flush_queue(struct iova_domain *iovad) +{ + return false; +} + static inline int init_iova_flush_queue(struct iova_domain *iovad, iova_flush_cb flush_cb, iova_entry_dtor entry_dtor)
From: Joerg Roedel jroedel@suse.de
commit 201c1db90cd643282185a00770f12f95da330eca upstream.
The stub function for !CONFIG_IOMMU_IOVA needs to be 'static inline'.
Fixes: effa467870c76 ('iommu/vt-d: Don't queue_iova() if there is no flush queue')
Signed-off-by: Joerg Roedel jroedel@suse.de
Signed-off-by: Dmitry Safonov dima@arista.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/iova.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/linux/iova.h +++ b/include/linux/iova.h @@ -237,7 +237,7 @@ static inline void init_iova_domain(stru { }
-bool has_iova_flush_queue(struct iova_domain *iovad) +static inline bool has_iova_flush_queue(struct iova_domain *iovad) { return false; }
From: Vladis Dronov vdronov@redhat.com
commit b36a1552d7319bbfd5cf7f08726c23c5c66d4f73 upstream.
Certain tty operations (pty_unix98_ops) lack the tiocmget() and tiocmset() functions, which are called by certain HCI UART protocols (hci_ath, hci_bcm, hci_intel, hci_mrvl, hci_qca) via hci_uart_set_flow_control() or directly. This leads to execution at NULL and can be triggered by an unprivileged user. Fix this by adding a helper function and a check for the missing tty operations in the protocol code.
This fixes CVE-2019-10207. The Fixes: lines list commits where calls to tiocm[gs]et() or hci_uart_set_flow_control() were added to the HCI UART protocols.
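The shape of the check, as a standalone sketch (hypothetical struct and field names, not the real tty or hci_uart types):

#include <stdbool.h>
#include <stddef.h>

struct tty_ops {
    int (*tiocmget)(void *tty);
    int (*tiocmset)(void *tty, unsigned int set, unsigned int clear);
};

struct uart { void *tty; const struct tty_ops *ops; };

static bool uart_has_flow_control(const struct uart *hu)
{
    return hu->ops->tiocmget && hu->ops->tiocmset;
}

int proto_open(struct uart *hu)
{
    if (!uart_has_flow_control(hu))
        return -1;    /* the real drivers return -EOPNOTSUPP at open time */
    /* ... only now is it safe to call hu->ops->tiocmget()/tiocmset() ... */
    return 0;
}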
Link: https://syzkaller.appspot.com/bug?id=1b42faa2848963564a5b1b7f8c837ea7b55ffa5...
Reported-by: syzbot+79337b501d6aa974d0f6@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org # v2.6.36+
Fixes: b3190df62861 ("Bluetooth: Support for Atheros AR300x serial chip")
Fixes: 118612fb9165 ("Bluetooth: hci_bcm: Add suspend/resume PM functions")
Fixes: ff2895592f0f ("Bluetooth: hci_intel: Add Intel baudrate configuration support")
Fixes: 162f812f23ba ("Bluetooth: hci_uart: Add Marvell support")
Fixes: fa9ad876b8e0 ("Bluetooth: hci_qca: Add support for Qualcomm Bluetooth chip wcn3990")
Signed-off-by: Vladis Dronov vdronov@redhat.com
Signed-off-by: Marcel Holtmann marcel@holtmann.org
Reviewed-by: Yu-Chen, Cho acho@suse.com
Tested-by: Yu-Chen, Cho acho@suse.com
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/bluetooth/hci_ath.c | 3 +++ drivers/bluetooth/hci_bcm.c | 3 +++ drivers/bluetooth/hci_intel.c | 3 +++ drivers/bluetooth/hci_ldisc.c | 13 +++++++++++++ drivers/bluetooth/hci_mrvl.c | 3 +++ drivers/bluetooth/hci_qca.c | 3 +++ drivers/bluetooth/hci_uart.h | 1 + 7 files changed, 29 insertions(+)
--- a/drivers/bluetooth/hci_ath.c +++ b/drivers/bluetooth/hci_ath.c @@ -112,6 +112,9 @@ static int ath_open(struct hci_uart *hu)
BT_DBG("hu %p", hu);
+ if (!hci_uart_has_flow_control(hu)) + return -EOPNOTSUPP; + ath = kzalloc(sizeof(*ath), GFP_KERNEL); if (!ath) return -ENOMEM; --- a/drivers/bluetooth/hci_bcm.c +++ b/drivers/bluetooth/hci_bcm.c @@ -369,6 +369,9 @@ static int bcm_open(struct hci_uart *hu)
bt_dev_dbg(hu->hdev, "hu %p", hu);
+ if (!hci_uart_has_flow_control(hu)) + return -EOPNOTSUPP; + bcm = kzalloc(sizeof(*bcm), GFP_KERNEL); if (!bcm) return -ENOMEM; --- a/drivers/bluetooth/hci_intel.c +++ b/drivers/bluetooth/hci_intel.c @@ -406,6 +406,9 @@ static int intel_open(struct hci_uart *h
BT_DBG("hu %p", hu);
+ if (!hci_uart_has_flow_control(hu)) + return -EOPNOTSUPP; + intel = kzalloc(sizeof(*intel), GFP_KERNEL); if (!intel) return -ENOMEM; --- a/drivers/bluetooth/hci_ldisc.c +++ b/drivers/bluetooth/hci_ldisc.c @@ -299,6 +299,19 @@ static int hci_uart_send_frame(struct hc return 0; }
+/* Check the underlying device or tty has flow control support */ +bool hci_uart_has_flow_control(struct hci_uart *hu) +{ + /* serdev nodes check if the needed operations are present */ + if (hu->serdev) + return true; + + if (hu->tty->driver->ops->tiocmget && hu->tty->driver->ops->tiocmset) + return true; + + return false; +} + /* Flow control or un-flow control the device */ void hci_uart_set_flow_control(struct hci_uart *hu, bool enable) { --- a/drivers/bluetooth/hci_mrvl.c +++ b/drivers/bluetooth/hci_mrvl.c @@ -66,6 +66,9 @@ static int mrvl_open(struct hci_uart *hu
BT_DBG("hu %p", hu);
+ if (!hci_uart_has_flow_control(hu)) + return -EOPNOTSUPP; + mrvl = kzalloc(sizeof(*mrvl), GFP_KERNEL); if (!mrvl) return -ENOMEM; --- a/drivers/bluetooth/hci_qca.c +++ b/drivers/bluetooth/hci_qca.c @@ -450,6 +450,9 @@ static int qca_open(struct hci_uart *hu)
BT_DBG("hu %p qca_open", hu);
+ if (!hci_uart_has_flow_control(hu)) + return -EOPNOTSUPP; + qca = kzalloc(sizeof(struct qca_data), GFP_KERNEL); if (!qca) return -ENOMEM; --- a/drivers/bluetooth/hci_uart.h +++ b/drivers/bluetooth/hci_uart.h @@ -118,6 +118,7 @@ int hci_uart_tx_wakeup(struct hci_uart * int hci_uart_init_ready(struct hci_uart *hu); void hci_uart_init_work(struct work_struct *work); void hci_uart_set_baudrate(struct hci_uart *hu, unsigned int speed); +bool hci_uart_has_flow_control(struct hci_uart *hu); void hci_uart_set_flow_control(struct hci_uart *hu, bool enable); void hci_uart_set_speeds(struct hci_uart *hu, unsigned int init_speed, unsigned int oper_speed);
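For readers outside the Bluetooth stack, the shape of the fix is the usual "check the ops table before calling through it" guard. A minimal user-space sketch of that pattern follows; the uart_ops/uart_port/proto_open names are hypothetical stand-ins for the real hci_uart structures, not kernel code.

#include <stdbool.h>
#include <stdio.h>
#include <errno.h>

struct uart_ops {
    int (*tiocmget)(void *port);
    int (*tiocmset)(void *port, unsigned set, unsigned clear);
};

struct uart_port {
    const struct uart_ops *ops;
};

static bool port_has_flow_control(const struct uart_port *port)
{
    return port->ops->tiocmget && port->ops->tiocmset;
}

static int proto_open(struct uart_port *port)
{
    if (!port_has_flow_control(port))
        return -EOPNOTSUPP;    /* bail out early, never call a NULL op */
    /* ... allocate protocol state and continue ... */
    return 0;
}

int main(void)
{
    static const struct uart_ops pty_like_ops = { 0 };    /* no modem-control ops */
    struct uart_port port = { .ops = &pty_like_ops };

    printf("open: %d\n", proto_open(&port));    /* prints -EOPNOTSUPP */
    return 0;
}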
From: Jason Wang jasowang@redhat.com
commit e82b9b0727ff6d665fff2d326162b460dded554d upstream.
We used to have vhost_exceeds_weight() for vhost-net to:
- prevent vhost kthread from hogging the cpu
- balance the time spent between TX and RX
This function could be useful for vsock and scsi as well, so move it to vhost.c. A device must specify a weight, which counts the number of requests, and it may also specify a byte_weight, which counts the number of bytes that have been processed.
Signed-off-by: Jason Wang jasowang@redhat.com Reviewed-by: Stefan Hajnoczi stefanha@redhat.com Signed-off-by: Michael S. Tsirkin mst@redhat.com [jwang: backport to 4.19, fix conflict in net.c] Signed-off-by: Jack Wang jinpu.wang@cloud.ionos.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/vhost/net.c | 22 ++++++---------------- drivers/vhost/scsi.c | 9 ++++++++- drivers/vhost/vhost.c | 20 +++++++++++++++++++- drivers/vhost/vhost.h | 5 ++++- drivers/vhost/vsock.c | 12 +++++++++++- 5 files changed, 48 insertions(+), 20 deletions(-)
--- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -497,12 +497,6 @@ static size_t init_iov_iter(struct vhost return iov_iter_count(iter); }
-static bool vhost_exceeds_weight(int pkts, int total_len) -{ - return total_len >= VHOST_NET_WEIGHT || - pkts >= VHOST_NET_PKT_WEIGHT; -} - static int get_tx_bufs(struct vhost_net *net, struct vhost_net_virtqueue *nvq, struct msghdr *msg, @@ -598,10 +592,8 @@ static void handle_tx_copy(struct vhost_ err, len); if (++nvq->done_idx >= VHOST_NET_BATCH) vhost_net_signal_used(nvq); - if (vhost_exceeds_weight(++sent_pkts, total_len)) { - vhost_poll_queue(&vq->poll); + if (vhost_exceeds_weight(vq, ++sent_pkts, total_len)) break; - } }
vhost_net_signal_used(nvq); @@ -701,10 +693,9 @@ static void handle_tx_zerocopy(struct vh else vhost_zerocopy_signal_used(net, vq); vhost_net_tx_packet(net); - if (unlikely(vhost_exceeds_weight(++sent_pkts, total_len))) { - vhost_poll_queue(&vq->poll); + if (unlikely(vhost_exceeds_weight(vq, ++sent_pkts, + total_len))) break; - } } }
@@ -1027,10 +1018,8 @@ static void handle_rx(struct vhost_net * vhost_log_write(vq, vq_log, log, vhost_len, vq->iov, in); total_len += vhost_len; - if (unlikely(vhost_exceeds_weight(++recv_pkts, total_len))) { - vhost_poll_queue(&vq->poll); + if (unlikely(vhost_exceeds_weight(vq, ++recv_pkts, total_len))) goto out; - } } if (unlikely(busyloop_intr)) vhost_poll_queue(&vq->poll); @@ -1115,7 +1104,8 @@ static int vhost_net_open(struct inode * vhost_net_buf_init(&n->vqs[i].rxq); } vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX, - UIO_MAXIOV + VHOST_NET_BATCH); + UIO_MAXIOV + VHOST_NET_BATCH, + VHOST_NET_WEIGHT, VHOST_NET_PKT_WEIGHT);
vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, EPOLLOUT, dev); vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, EPOLLIN, dev); --- a/drivers/vhost/scsi.c +++ b/drivers/vhost/scsi.c @@ -57,6 +57,12 @@ #define VHOST_SCSI_PREALLOC_UPAGES 2048 #define VHOST_SCSI_PREALLOC_PROT_SGLS 2048
+/* Max number of requests before requeueing the job. + * Using this limit prevents one virtqueue from starving others with + * request. + */ +#define VHOST_SCSI_WEIGHT 256 + struct vhost_scsi_inflight { /* Wait for the flush operation to finish */ struct completion comp; @@ -1398,7 +1404,8 @@ static int vhost_scsi_open(struct inode vqs[i] = &vs->vqs[i].vq; vs->vqs[i].vq.handle_kick = vhost_scsi_handle_kick; } - vhost_dev_init(&vs->dev, vqs, VHOST_SCSI_MAX_VQ, UIO_MAXIOV); + vhost_dev_init(&vs->dev, vqs, VHOST_SCSI_MAX_VQ, UIO_MAXIOV, + VHOST_SCSI_WEIGHT, 0);
vhost_scsi_init_inflight(vs, NULL);
--- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -413,8 +413,24 @@ static void vhost_dev_free_iovecs(struct vhost_vq_free_iovecs(dev->vqs[i]); }
+bool vhost_exceeds_weight(struct vhost_virtqueue *vq, + int pkts, int total_len) +{ + struct vhost_dev *dev = vq->dev; + + if ((dev->byte_weight && total_len >= dev->byte_weight) || + pkts >= dev->weight) { + vhost_poll_queue(&vq->poll); + return true; + } + + return false; +} +EXPORT_SYMBOL_GPL(vhost_exceeds_weight); + void vhost_dev_init(struct vhost_dev *dev, - struct vhost_virtqueue **vqs, int nvqs, int iov_limit) + struct vhost_virtqueue **vqs, int nvqs, + int iov_limit, int weight, int byte_weight) { struct vhost_virtqueue *vq; int i; @@ -428,6 +444,8 @@ void vhost_dev_init(struct vhost_dev *de dev->mm = NULL; dev->worker = NULL; dev->iov_limit = iov_limit; + dev->weight = weight; + dev->byte_weight = byte_weight; init_llist_head(&dev->work_list); init_waitqueue_head(&dev->wait); INIT_LIST_HEAD(&dev->read_list); --- a/drivers/vhost/vhost.h +++ b/drivers/vhost/vhost.h @@ -171,10 +171,13 @@ struct vhost_dev { struct list_head pending_list; wait_queue_head_t wait; int iov_limit; + int weight; + int byte_weight; };
+bool vhost_exceeds_weight(struct vhost_virtqueue *vq, int pkts, int total_len); void vhost_dev_init(struct vhost_dev *, struct vhost_virtqueue **vqs, - int nvqs, int iov_limit); + int nvqs, int iov_limit, int weight, int byte_weight); long vhost_dev_set_owner(struct vhost_dev *dev); bool vhost_dev_has_owner(struct vhost_dev *dev); long vhost_dev_check_owner(struct vhost_dev *); --- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -21,6 +21,14 @@ #include "vhost.h"
#define VHOST_VSOCK_DEFAULT_HOST_CID 2 +/* Max number of bytes transferred before requeueing the job. + * Using this limit prevents one virtqueue from starving others. */ +#define VHOST_VSOCK_WEIGHT 0x80000 +/* Max number of packets transferred before requeueing the job. + * Using this limit prevents one virtqueue from starving others with + * small pkts. + */ +#define VHOST_VSOCK_PKT_WEIGHT 256
enum { VHOST_VSOCK_FEATURES = VHOST_FEATURES, @@ -531,7 +539,9 @@ static int vhost_vsock_dev_open(struct i vsock->vqs[VSOCK_VQ_TX].handle_kick = vhost_vsock_handle_tx_kick; vsock->vqs[VSOCK_VQ_RX].handle_kick = vhost_vsock_handle_rx_kick;
- vhost_dev_init(&vsock->dev, vqs, ARRAY_SIZE(vsock->vqs), UIO_MAXIOV); + vhost_dev_init(&vsock->dev, vqs, ARRAY_SIZE(vsock->vqs), + UIO_MAXIOV, VHOST_VSOCK_PKT_WEIGHT, + VHOST_VSOCK_WEIGHT);
file->private_data = vsock; spin_lock_init(&vsock->send_pkt_list_lock);
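The weight mechanism itself is small. Below is a rough user-space sketch of the idea, with a toy dev/exceeds_weight() pair standing in for the real vhost structures: the worker gives up its slice once either limit is reached and lets itself be requeued. A byte_weight of 0, as vhost-scsi uses in the patch above, means only the request count is enforced, which the "dev->byte_weight &&" test mirrors.

#include <stdbool.h>
#include <stdio.h>

struct dev {
    int weight;        /* max requests per run; always checked */
    int byte_weight;   /* max bytes per run; 0 means "no byte limit" */
};

static bool exceeds_weight(const struct dev *dev, int pkts, int total_len)
{
    return (dev->byte_weight && total_len >= dev->byte_weight) ||
           pkts >= dev->weight;
}

static void service_queue(const struct dev *dev)
{
    int pkts = 0, total_len = 0;

    do {
        int len = 100;        /* pretend we dequeued a 100-byte request */
        total_len += len;
        /* ... handle the request ... */
    } while (!exceeds_weight(dev, ++pkts, total_len));

    printf("handled %d pkts / %d bytes, requeueing\n", pkts, total_len);
}

int main(void)
{
    struct dev net_like = { .weight = 256, .byte_weight = 0x80000 };

    service_queue(&net_like);    /* stops after 256 packets here */
    return 0;
}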
From: Jason Wang jasowang@redhat.com
commit e2412c07f8f3040593dfb88207865a3cd58680c0 upstream.
When the rx buffer is too small for a packet, we will discard the vq descriptor and retry it for the next packet:
while ((sock_len = vhost_net_rx_peek_head_len(net, sock->sk,
                                              &busyloop_intr))) {
    ...
    /* On overrun, truncate and discard */
    if (unlikely(headcount > UIO_MAXIOV)) {
        iov_iter_init(&msg.msg_iter, READ, vq->iov, 1, 1);
        err = sock->ops->recvmsg(sock, &msg, 1,
                                 MSG_DONTWAIT | MSG_TRUNC);
        pr_debug("Discarded rx packet: len %zd\n", sock_len);
        continue;
    }
    ...
}
This makes it possible to trigger an infinite while..continue loop through the co-operation of two VMs, for example:
1) Malicious VM1 allocates a 1-byte rx buffer and tries to slow down the vhost process as much as possible, e.g. by using indirect descriptors or similar.
2) Malicious VM2 generates packets to VM1 as fast as possible.
Fix this by checking against the weight at the end of the RX and TX loops. This also eliminates other similar cases where:
- userspace is consuming the packets in the meantime
- a theoretical TOCTOU attack where the guest moves the avail index back and forth to hit the continue path after vhost finds the guest has just added new buffers
This addresses CVE-2019-3900.
Fixes: d8316f3991d20 ("vhost: fix total length when packets are too short") Fixes: 3a4d5c94e9593 ("vhost_net: a kernel-level virtio server") Signed-off-by: Jason Wang jasowang@redhat.com Reviewed-by: Stefan Hajnoczi stefanha@redhat.com Signed-off-by: Michael S. Tsirkin mst@redhat.com [jwang: backport to 4.19] Signed-off-by: Jack Wang jinpu.wang@cloud.ionos.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/vhost/net.c | 29 +++++++++++++---------------- 1 file changed, 13 insertions(+), 16 deletions(-)
--- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -551,7 +551,7 @@ static void handle_tx_copy(struct vhost_ int err; int sent_pkts = 0;
- for (;;) { + do { bool busyloop_intr = false;
head = get_tx_bufs(net, nvq, &msg, &out, &in, &len, @@ -592,9 +592,7 @@ static void handle_tx_copy(struct vhost_ err, len); if (++nvq->done_idx >= VHOST_NET_BATCH) vhost_net_signal_used(nvq); - if (vhost_exceeds_weight(vq, ++sent_pkts, total_len)) - break; - } + } while (likely(!vhost_exceeds_weight(vq, ++sent_pkts, total_len)));
vhost_net_signal_used(nvq); } @@ -618,7 +616,7 @@ static void handle_tx_zerocopy(struct vh bool zcopy_used; int sent_pkts = 0;
- for (;;) { + do { bool busyloop_intr;
/* Release DMAs done buffers first */ @@ -693,10 +691,7 @@ static void handle_tx_zerocopy(struct vh else vhost_zerocopy_signal_used(net, vq); vhost_net_tx_packet(net); - if (unlikely(vhost_exceeds_weight(vq, ++sent_pkts, - total_len))) - break; - } + } while (likely(!vhost_exceeds_weight(vq, ++sent_pkts, total_len))); }
/* Expects to be always run from workqueue - which acts as @@ -932,8 +927,11 @@ static void handle_rx(struct vhost_net * vq->log : NULL; mergeable = vhost_has_feature(vq, VIRTIO_NET_F_MRG_RXBUF);
- while ((sock_len = vhost_net_rx_peek_head_len(net, sock->sk, - &busyloop_intr))) { + do { + sock_len = vhost_net_rx_peek_head_len(net, sock->sk, + &busyloop_intr); + if (!sock_len) + break; sock_len += sock_hlen; vhost_len = sock_len + vhost_hlen; headcount = get_rx_bufs(vq, vq->heads + nvq->done_idx, @@ -1018,12 +1016,11 @@ static void handle_rx(struct vhost_net * vhost_log_write(vq, vq_log, log, vhost_len, vq->iov, in); total_len += vhost_len; - if (unlikely(vhost_exceeds_weight(vq, ++recv_pkts, total_len))) - goto out; - } + } while (likely(!vhost_exceeds_weight(vq, ++recv_pkts, total_len))); + if (unlikely(busyloop_intr)) vhost_poll_queue(&vq->poll); - else + else if (!sock_len) vhost_net_enable_vq(net, vq); out: vhost_net_signal_used(nvq); @@ -1105,7 +1102,7 @@ static int vhost_net_open(struct inode * } vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX, UIO_MAXIOV + VHOST_NET_BATCH, - VHOST_NET_WEIGHT, VHOST_NET_PKT_WEIGHT); + VHOST_NET_PKT_WEIGHT, VHOST_NET_WEIGHT);
vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, EPOLLOUT, dev); vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, EPOLLIN, dev);
Hi!
This makes it possible to trigger an infinite while..continue loop through the co-operation of two VMs, for example:
- Malicious VM1 allocates a 1-byte rx buffer and tries to slow down the vhost process as much as possible, e.g. by using indirect descriptors or similar.
- Malicious VM2 generates packets to VM1 as fast as possible
Fix this by checking against the weight at the end of the RX and TX loops. This also eliminates other similar cases where:
- userspace is consuming the packets in the meantime
- a theoretical TOCTOU attack where the guest moves the avail index back and forth to hit the continue path after vhost finds the guest has just added new buffers
This addresses CVE-2019-3900.
@@ -551,7 +551,7 @@ static void handle_tx_copy(struct vhost_ int err; int sent_pkts = 0;
- for (;;) {
- do { bool busyloop_intr = false;
head = get_tx_bufs(net, nvq, &msg, &out, &in, &len, @@ -592,9 +592,7 @@ static void handle_tx_copy(struct vhost_ err, len); if (++nvq->done_idx >= VHOST_NET_BATCH) vhost_net_signal_used(nvq);
if (vhost_exceeds_weight(vq, ++sent_pkts, total_len))
break;
- }
- } while (likely(!vhost_exceeds_weight(vq, ++sent_pkts, total_len)));
vhost_net_signal_used(nvq); }
So this part does not really change anything, right?
@@ -618,7 +616,7 @@ static void handle_tx_zerocopy(struct vh bool zcopy_used; int sent_pkts = 0;
- for (;;) {
- do { bool busyloop_intr;
/* Release DMAs done buffers first */ @@ -693,10 +691,7 @@ static void handle_tx_zerocopy(struct vh else vhost_zerocopy_signal_used(net, vq); vhost_net_tx_packet(net);
if (unlikely(vhost_exceeds_weight(vq, ++sent_pkts,
total_len)))
break;
- }
- } while (likely(!vhost_exceeds_weight(vq, ++sent_pkts, total_len)));
} /* Expects to be always run from workqueue - which acts as
Neither does this. Equivalent code. Changelog says it fixes something for the transmit so... is that intentional?
Pavel
On 2019/8/4 5:49 AM, Pavel Machek wrote:
Hi!
This makes it possible to trigger an infinite while..continue loop through the co-operation of two VMs, for example:
- Malicious VM1 allocates a 1-byte rx buffer and tries to slow down the vhost process as much as possible, e.g. by using indirect descriptors or similar.
- Malicious VM2 generates packets to VM1 as fast as possible
Fix this by checking against the weight at the end of the RX and TX loops. This also eliminates other similar cases where:
- userspace is consuming the packets in the meantime
- a theoretical TOCTOU attack where the guest moves the avail index back and forth to hit the continue path after vhost finds the guest has just added new buffers
This addresses CVE-2019-3900.
@@ -551,7 +551,7 @@ static void handle_tx_copy(struct vhost_ int err; int sent_pkts = 0;
- for (;;) {
- do { bool busyloop_intr = false;
head = get_tx_bufs(net, nvq, &msg, &out, &in, &len, @@ -592,9 +592,7 @@ static void handle_tx_copy(struct vhost_ err, len); if (++nvq->done_idx >= VHOST_NET_BATCH) vhost_net_signal_used(nvq);
if (vhost_exceeds_weight(vq, ++sent_pkts, total_len))
break;
- }
- } while (likely(!vhost_exceeds_weight(vq, ++sent_pkts, total_len)));
vhost_net_signal_used(nvq); }
So this part does not really change anything, right?
Nope. If you check the loop, you can see we used to use "continue" inside the loop, which may bypass the check:
head = get_tx_bufs(net, nvq, &msg, &out, &in, &len,
                   &busyloop_intr);
/* On error, stop handling until the next kick. */
if (unlikely(head < 0))
    break;
/* Nothing new? Wait for eventfd to tell us they refilled. */
if (head == vq->num) {
    if (unlikely(busyloop_intr)) {
        vhost_poll_queue(&vq->poll);
    } else if (unlikely(vhost_enable_notify(&net->dev, vq))) {
        vhost_disable_notify(&net->dev, vq);
        continue;
    }
    break;
}
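To make the difference concrete, here is a stand-alone sketch of that point, with made-up need_retry()/exceeds_weight() helpers rather than the vhost code: a continue inside for (;;) skips a check placed later in the body, while a continue inside do/while still falls through to the loop condition.

#include <stdbool.h>
#include <stdio.h>

static int handled;

/* pretend every descriptor hits the retry path (e.g. a too-small buffer) */
static bool need_retry(void) { return true; }

static bool exceeds_weight(int pkts) { return pkts >= 256; }

void buggy_loop(void)
{
    int pkts = 0;

    for (;;) {
        if (need_retry())
            continue;        /* jumps to the top: the check below never runs */
        handled++;
        if (exceeds_weight(++pkts))
            break;           /* unreachable on the retry path */
    }
}

void fixed_loop(void)
{
    int pkts = 0;

    do {
        if (need_retry())
            continue;        /* jumps to the condition: still counted */
        handled++;
    } while (!exceeds_weight(++pkts));

    printf("fixed loop gave up after %d iterations\n", pkts);
}

int main(void)
{
    fixed_loop();        /* terminates and prints 256 */
    /* buggy_loop(); */  /* would spin forever with this workload */
    return 0;
}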
@@ -618,7 +616,7 @@ static void handle_tx_zerocopy(struct vh bool zcopy_used; int sent_pkts = 0;
- for (;;) {
- do { bool busyloop_intr;
/* Release DMAs done buffers first */ @@ -693,10 +691,7 @@ static void handle_tx_zerocopy(struct vh else vhost_zerocopy_signal_used(net, vq); vhost_net_tx_packet(net);
if (unlikely(vhost_exceeds_weight(vq, ++sent_pkts,
total_len)))
break;
- }
- } while (likely(!vhost_exceeds_weight(vq, ++sent_pkts, total_len))); }
/* Expects to be always run from workqueue - which acts as
Neither does this. Equivalent code. Changelog says it fixes something for the transmit so... is that intentional?
Pavel
The same as above. So yes.
Thanks
From: Jason Wang jasowang@redhat.com
commit e79b431fb901ba1106670bcc80b9b617b25def7d upstream.
This patch checks the weight and exits the loop if we exceed it. This is useful for preventing the vsock kthread from hogging the cpu, which is guest triggerable. The weight also helps avoid starving requests from one direction while the other direction is being processed.
The value of weight is picked from vhost-net.
This addresses CVE-2019-3900.
Cc: Stefan Hajnoczi stefanha@redhat.com Fixes: 433fc58e6bf2 ("VSOCK: Introduce vhost_vsock.ko") Signed-off-by: Jason Wang jasowang@redhat.com Reviewed-by: Stefan Hajnoczi stefanha@redhat.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/vhost/vsock.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-)
--- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -86,6 +86,7 @@ vhost_transport_do_send_pkt(struct vhost struct vhost_virtqueue *vq) { struct vhost_virtqueue *tx_vq = &vsock->vqs[VSOCK_VQ_TX]; + int pkts = 0, total_len = 0; bool added = false; bool restart_tx = false;
@@ -97,7 +98,7 @@ vhost_transport_do_send_pkt(struct vhost /* Avoid further vmexits, we're already processing the virtqueue */ vhost_disable_notify(&vsock->dev, vq);
- for (;;) { + do { struct virtio_vsock_pkt *pkt; struct iov_iter iov_iter; unsigned out, in; @@ -182,8 +183,9 @@ vhost_transport_do_send_pkt(struct vhost */ virtio_transport_deliver_tap_pkt(pkt);
+ total_len += pkt->len; virtio_transport_free_pkt(pkt); - } + } while(likely(!vhost_exceeds_weight(vq, ++pkts, total_len))); if (added) vhost_signal(&vsock->dev, vq);
@@ -358,7 +360,7 @@ static void vhost_vsock_handle_tx_kick(s struct vhost_vsock *vsock = container_of(vq->dev, struct vhost_vsock, dev); struct virtio_vsock_pkt *pkt; - int head; + int head, pkts = 0, total_len = 0; unsigned int out, in; bool added = false;
@@ -368,7 +370,7 @@ static void vhost_vsock_handle_tx_kick(s goto out;
vhost_disable_notify(&vsock->dev, vq); - for (;;) { + do { u32 len;
if (!vhost_vsock_more_replies(vsock)) { @@ -409,9 +411,11 @@ static void vhost_vsock_handle_tx_kick(s else virtio_transport_free_pkt(pkt);
- vhost_add_used(vq, head, sizeof(pkt->hdr) + len); + len += sizeof(pkt->hdr); + vhost_add_used(vq, head, len); + total_len += len; added = true; - } + } while(likely(!vhost_exceeds_weight(vq, ++pkts, total_len)));
no_more_replies: if (added)
From: Jason Wang jasowang@redhat.com
commit c1ea02f15ab5efb3e93fc3144d895410bf79fcf2 upstream.
This patch checks the weight and exits the loop if we exceed it. This is useful for preventing the scsi kthread from hogging the cpu, which is guest triggerable.
This addresses CVE-2019-3900.
Cc: Paolo Bonzini pbonzini@redhat.com Cc: Stefan Hajnoczi stefanha@redhat.com Fixes: 057cbf49a1f0 ("tcm_vhost: Initial merge for vhost level target fabric driver") Signed-off-by: Jason Wang jasowang@redhat.com Reviewed-by: Stefan Hajnoczi stefanha@redhat.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Reviewed-by: Stefan Hajnoczi stefanha@redhat.com [jwang: backport to 4.19] Signed-off-by: Jack Wang jinpu.wang@cloud.ionos.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/vhost/scsi.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/vhost/scsi.c +++ b/drivers/vhost/scsi.c @@ -817,7 +817,7 @@ vhost_scsi_handle_vq(struct vhost_scsi * u64 tag; u32 exp_data_len, data_direction; unsigned int out = 0, in = 0; - int head, ret, prot_bytes; + int head, ret, prot_bytes, c = 0; size_t req_size, rsp_size = sizeof(struct virtio_scsi_cmd_resp); size_t out_size, in_size; u16 lun; @@ -836,7 +836,7 @@ vhost_scsi_handle_vq(struct vhost_scsi *
vhost_disable_notify(&vs->dev, vq);
- for (;;) { + do { head = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov), &out, &in, NULL, NULL); @@ -1051,7 +1051,7 @@ vhost_scsi_handle_vq(struct vhost_scsi * */ INIT_WORK(&cmd->work, vhost_scsi_submission_work); queue_work(vhost_scsi_workqueue, &cmd->work); - } + } while (likely(!vhost_exceeds_weight(vq, ++c, 0))); out: mutex_unlock(&vq->mutex); }
From: Jann Horn jannh@google.com
commit 16d51a590a8ce3befb1308e0e7ab77f3b661af33 upstream.
When going through execve(), zero out the NUMA fault statistics instead of freeing them.
During execve, the task is reachable through procfs and the scheduler. A concurrent /proc/*/sched reader can read data from a freed ->numa_faults allocation (confirmed by KASAN) and write it back to userspace. I believe that it would also be possible for a use-after-free read to occur through a race between a NUMA fault and execve(): task_numa_fault() can lead to task_numa_compare(), which invokes task_weight() on the currently running task of a different CPU.
Another way to fix this would be to make ->numa_faults RCU-managed or add extra locking, but it seems easier to wipe the NUMA fault statistics on execve.
Signed-off-by: Jann Horn jannh@google.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Petr Mladek pmladek@suse.com Cc: Sergey Senozhatsky sergey.senozhatsky@gmail.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Will Deacon will@kernel.org Fixes: 82727018b0d3 ("sched/numa: Call task_numa_free() from do_execve()") Link: https://lkml.kernel.org/r/20190716152047.14424-1-jannh@google.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/exec.c | 2 +- include/linux/sched/numa_balancing.h | 4 ++-- kernel/fork.c | 2 +- kernel/sched/fair.c | 24 ++++++++++++++++++++---- 4 files changed, 24 insertions(+), 8 deletions(-)
--- a/fs/exec.c +++ b/fs/exec.c @@ -1826,7 +1826,7 @@ static int __do_execve_file(int fd, stru membarrier_execve(current); rseq_execve(current); acct_update_integrals(current); - task_numa_free(current); + task_numa_free(current, false); free_bprm(bprm); kfree(pathbuf); if (filename) --- a/include/linux/sched/numa_balancing.h +++ b/include/linux/sched/numa_balancing.h @@ -19,7 +19,7 @@ extern void task_numa_fault(int last_node, int node, int pages, int flags); extern pid_t task_numa_group_id(struct task_struct *p); extern void set_numabalancing_state(bool enabled); -extern void task_numa_free(struct task_struct *p); +extern void task_numa_free(struct task_struct *p, bool final); extern bool should_numa_migrate_memory(struct task_struct *p, struct page *page, int src_nid, int dst_cpu); #else @@ -34,7 +34,7 @@ static inline pid_t task_numa_group_id(s static inline void set_numabalancing_state(bool enabled) { } -static inline void task_numa_free(struct task_struct *p) +static inline void task_numa_free(struct task_struct *p, bool final) { } static inline bool should_numa_migrate_memory(struct task_struct *p, --- a/kernel/fork.c +++ b/kernel/fork.c @@ -679,7 +679,7 @@ void __put_task_struct(struct task_struc WARN_ON(tsk == current);
cgroup_free(tsk); - task_numa_free(tsk); + task_numa_free(tsk, true); security_task_free(tsk); exit_creds(tsk); delayacct_tsk_free(tsk); --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -2345,13 +2345,23 @@ no_join: return; }
-void task_numa_free(struct task_struct *p) +/* + * Get rid of NUMA staticstics associated with a task (either current or dead). + * If @final is set, the task is dead and has reached refcount zero, so we can + * safely free all relevant data structures. Otherwise, there might be + * concurrent reads from places like load balancing and procfs, and we should + * reset the data back to default state without freeing ->numa_faults. + */ +void task_numa_free(struct task_struct *p, bool final) { struct numa_group *grp = p->numa_group; - void *numa_faults = p->numa_faults; + unsigned long *numa_faults = p->numa_faults; unsigned long flags; int i;
+ if (!numa_faults) + return; + if (grp) { spin_lock_irqsave(&grp->lock, flags); for (i = 0; i < NR_NUMA_HINT_FAULT_STATS * nr_node_ids; i++) @@ -2364,8 +2374,14 @@ void task_numa_free(struct task_struct * put_numa_group(grp); }
- p->numa_faults = NULL; - kfree(numa_faults); + if (final) { + p->numa_faults = NULL; + kfree(numa_faults); + } else { + p->total_numa_faults = 0; + for (i = 0; i < NR_NUMA_HINT_FAULT_STATS * nr_node_ids; i++) + numa_faults[i] = 0; + } }
/*
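The free-vs-reset distinction this changelog describes can be sketched in plain C; the task_stats structure below is a hypothetical stand-in for task_struct, not the scheduler code. Only the final teardown frees the array, every other caller just zeroes it, so concurrent readers keep seeing valid (if reset) memory.

#include <stdlib.h>
#include <string.h>
#include <stdio.h>

struct task_stats {
    unsigned long *faults;   /* per-node fault counters, shared with readers */
    size_t nr;
};

void stats_reset_or_free(struct task_stats *t, int final)
{
    unsigned long *faults = t->faults;

    if (!faults)
        return;

    if (final) {
        /* last reference is gone: safe to free */
        t->faults = NULL;
        free(faults);
    } else {
        /* readers may still be using the array: wipe, don't free */
        memset(faults, 0, t->nr * sizeof(*faults));
    }
}

int main(void)
{
    struct task_stats t = { .faults = calloc(4, sizeof(unsigned long)), .nr = 4 };

    if (!t.faults)
        return 1;
    t.faults[1] = 42;
    stats_reset_or_free(&t, 0);    /* execve-like path: counters zeroed */
    printf("faults[1] = %lu\n", t.faults[1]);
    stats_reset_or_free(&t, 1);    /* final put: array freed */
    return 0;
}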
From: Jann Horn jannh@google.com
commit cb361d8cdef69990f6b4504dc1fd9a594d983c97 upstream.
The old code used RCU annotations and accessors inconsistently for ->numa_group, which can lead to use-after-frees and NULL dereferences.
Let all accesses to ->numa_group use proper RCU helpers to prevent such issues.
Signed-off-by: Jann Horn jannh@google.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Petr Mladek pmladek@suse.com Cc: Sergey Senozhatsky sergey.senozhatsky@gmail.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Will Deacon will@kernel.org Fixes: 8c8a743c5087 ("sched/numa: Use {cpu, pid} to create task groups for shared faults") Link: https://lkml.kernel.org/r/20190716152047.14424-3-jannh@google.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/sched.h | 10 +++- kernel/sched/fair.c | 120 +++++++++++++++++++++++++++++++++----------------- 2 files changed, 90 insertions(+), 40 deletions(-)
--- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1023,7 +1023,15 @@ struct task_struct { u64 last_sum_exec_runtime; struct callback_head numa_work;
- struct numa_group *numa_group; + /* + * This pointer is only modified for current in syscall and + * pagefault context (and for tasks being destroyed), so it can be read + * from any of the following contexts: + * - RCU read-side critical section + * - current->numa_group from everywhere + * - task's runqueue locked, task not running + */ + struct numa_group __rcu *numa_group;
/* * numa_faults is an array split into four regions: --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -1053,6 +1053,21 @@ struct numa_group { unsigned long faults[0]; };
+/* + * For functions that can be called in multiple contexts that permit reading + * ->numa_group (see struct task_struct for locking rules). + */ +static struct numa_group *deref_task_numa_group(struct task_struct *p) +{ + return rcu_dereference_check(p->numa_group, p == current || + (lockdep_is_held(&task_rq(p)->lock) && !READ_ONCE(p->on_cpu))); +} + +static struct numa_group *deref_curr_numa_group(struct task_struct *p) +{ + return rcu_dereference_protected(p->numa_group, p == current); +} + static inline unsigned long group_faults_priv(struct numa_group *ng); static inline unsigned long group_faults_shared(struct numa_group *ng);
@@ -1096,10 +1111,12 @@ static unsigned int task_scan_start(stru { unsigned long smin = task_scan_min(p); unsigned long period = smin; + struct numa_group *ng;
/* Scale the maximum scan period with the amount of shared memory. */ - if (p->numa_group) { - struct numa_group *ng = p->numa_group; + rcu_read_lock(); + ng = rcu_dereference(p->numa_group); + if (ng) { unsigned long shared = group_faults_shared(ng); unsigned long private = group_faults_priv(ng);
@@ -1107,6 +1124,7 @@ static unsigned int task_scan_start(stru period *= shared + 1; period /= private + shared + 1; } + rcu_read_unlock();
return max(smin, period); } @@ -1115,13 +1133,14 @@ static unsigned int task_scan_max(struct { unsigned long smin = task_scan_min(p); unsigned long smax; + struct numa_group *ng;
/* Watch for min being lower than max due to floor calculations */ smax = sysctl_numa_balancing_scan_period_max / task_nr_scan_windows(p);
/* Scale the maximum scan period with the amount of shared memory. */ - if (p->numa_group) { - struct numa_group *ng = p->numa_group; + ng = deref_curr_numa_group(p); + if (ng) { unsigned long shared = group_faults_shared(ng); unsigned long private = group_faults_priv(ng); unsigned long period = smax; @@ -1153,7 +1172,7 @@ void init_numa_balancing(unsigned long c p->numa_scan_period = sysctl_numa_balancing_scan_delay; p->numa_work.next = &p->numa_work; p->numa_faults = NULL; - p->numa_group = NULL; + RCU_INIT_POINTER(p->numa_group, NULL); p->last_task_numa_placement = 0; p->last_sum_exec_runtime = 0;
@@ -1200,7 +1219,16 @@ static void account_numa_dequeue(struct
pid_t task_numa_group_id(struct task_struct *p) { - return p->numa_group ? p->numa_group->gid : 0; + struct numa_group *ng; + pid_t gid = 0; + + rcu_read_lock(); + ng = rcu_dereference(p->numa_group); + if (ng) + gid = ng->gid; + rcu_read_unlock(); + + return gid; }
/* @@ -1225,11 +1253,13 @@ static inline unsigned long task_faults(
static inline unsigned long group_faults(struct task_struct *p, int nid) { - if (!p->numa_group) + struct numa_group *ng = deref_task_numa_group(p); + + if (!ng) return 0;
- return p->numa_group->faults[task_faults_idx(NUMA_MEM, nid, 0)] + - p->numa_group->faults[task_faults_idx(NUMA_MEM, nid, 1)]; + return ng->faults[task_faults_idx(NUMA_MEM, nid, 0)] + + ng->faults[task_faults_idx(NUMA_MEM, nid, 1)]; }
static inline unsigned long group_faults_cpu(struct numa_group *group, int nid) @@ -1367,12 +1397,13 @@ static inline unsigned long task_weight( static inline unsigned long group_weight(struct task_struct *p, int nid, int dist) { + struct numa_group *ng = deref_task_numa_group(p); unsigned long faults, total_faults;
- if (!p->numa_group) + if (!ng) return 0;
- total_faults = p->numa_group->total_faults; + total_faults = ng->total_faults;
if (!total_faults) return 0; @@ -1386,7 +1417,7 @@ static inline unsigned long group_weight bool should_numa_migrate_memory(struct task_struct *p, struct page * page, int src_nid, int dst_cpu) { - struct numa_group *ng = p->numa_group; + struct numa_group *ng = deref_curr_numa_group(p); int dst_nid = cpu_to_node(dst_cpu); int last_cpupid, this_cpupid;
@@ -1592,13 +1623,14 @@ static bool load_too_imbalanced(long src static void task_numa_compare(struct task_numa_env *env, long taskimp, long groupimp, bool maymove) { + struct numa_group *cur_ng, *p_ng = deref_curr_numa_group(env->p); struct rq *dst_rq = cpu_rq(env->dst_cpu); + long imp = p_ng ? groupimp : taskimp; struct task_struct *cur; long src_load, dst_load; - long load; - long imp = env->p->numa_group ? groupimp : taskimp; - long moveimp = imp; int dist = env->dist; + long moveimp = imp; + long load;
if (READ_ONCE(dst_rq->numa_migrate_on)) return; @@ -1637,21 +1669,22 @@ static void task_numa_compare(struct tas * If dst and source tasks are in the same NUMA group, or not * in any group then look only at task weights. */ - if (cur->numa_group == env->p->numa_group) { + cur_ng = rcu_dereference(cur->numa_group); + if (cur_ng == p_ng) { imp = taskimp + task_weight(cur, env->src_nid, dist) - task_weight(cur, env->dst_nid, dist); /* * Add some hysteresis to prevent swapping the * tasks within a group over tiny differences. */ - if (cur->numa_group) + if (cur_ng) imp -= imp / 16; } else { /* * Compare the group weights. If a task is all by itself * (not part of a group), use the task weight instead. */ - if (cur->numa_group && env->p->numa_group) + if (cur_ng && p_ng) imp += group_weight(cur, env->src_nid, dist) - group_weight(cur, env->dst_nid, dist); else @@ -1749,11 +1782,12 @@ static int task_numa_migrate(struct task .best_imp = 0, .best_cpu = -1, }; + unsigned long taskweight, groupweight; struct sched_domain *sd; + long taskimp, groupimp; + struct numa_group *ng; struct rq *best_rq; - unsigned long taskweight, groupweight; int nid, ret, dist; - long taskimp, groupimp;
/* * Pick the lowest SD_NUMA domain, as that would have the smallest @@ -1799,7 +1833,8 @@ static int task_numa_migrate(struct task * multiple NUMA nodes; in order to better consolidate the group, * we need to check other locations. */ - if (env.best_cpu == -1 || (p->numa_group && p->numa_group->active_nodes > 1)) { + ng = deref_curr_numa_group(p); + if (env.best_cpu == -1 || (ng && ng->active_nodes > 1)) { for_each_online_node(nid) { if (nid == env.src_nid || nid == p->numa_preferred_nid) continue; @@ -1832,7 +1867,7 @@ static int task_numa_migrate(struct task * A task that migrated to a second choice node will be better off * trying for a better one later. Do not set the preferred node here. */ - if (p->numa_group) { + if (ng) { if (env.best_cpu == -1) nid = env.src_nid; else @@ -2127,6 +2162,7 @@ static void task_numa_placement(struct t unsigned long total_faults; u64 runtime, period; spinlock_t *group_lock = NULL; + struct numa_group *ng;
/* * The p->mm->numa_scan_seq field gets updated without @@ -2144,8 +2180,9 @@ static void task_numa_placement(struct t runtime = numa_get_avg_runtime(p, &period);
/* If the task is part of a group prevent parallel updates to group stats */ - if (p->numa_group) { - group_lock = &p->numa_group->lock; + ng = deref_curr_numa_group(p); + if (ng) { + group_lock = &ng->lock; spin_lock_irq(group_lock); }
@@ -2186,7 +2223,7 @@ static void task_numa_placement(struct t p->numa_faults[cpu_idx] += f_diff; faults += p->numa_faults[mem_idx]; p->total_numa_faults += diff; - if (p->numa_group) { + if (ng) { /* * safe because we can only change our own group * @@ -2194,14 +2231,14 @@ static void task_numa_placement(struct t * nid and priv in a specific region because it * is at the beginning of the numa_faults array. */ - p->numa_group->faults[mem_idx] += diff; - p->numa_group->faults_cpu[mem_idx] += f_diff; - p->numa_group->total_faults += diff; - group_faults += p->numa_group->faults[mem_idx]; + ng->faults[mem_idx] += diff; + ng->faults_cpu[mem_idx] += f_diff; + ng->total_faults += diff; + group_faults += ng->faults[mem_idx]; } }
- if (!p->numa_group) { + if (!ng) { if (faults > max_faults) { max_faults = faults; max_nid = nid; @@ -2212,8 +2249,8 @@ static void task_numa_placement(struct t } }
- if (p->numa_group) { - numa_group_count_active_nodes(p->numa_group); + if (ng) { + numa_group_count_active_nodes(ng); spin_unlock_irq(group_lock); max_nid = preferred_group_nid(p, max_nid); } @@ -2247,7 +2284,7 @@ static void task_numa_group(struct task_ int cpu = cpupid_to_cpu(cpupid); int i;
- if (unlikely(!p->numa_group)) { + if (unlikely(!deref_curr_numa_group(p))) { unsigned int size = sizeof(struct numa_group) + 4*nr_node_ids*sizeof(unsigned long);
@@ -2283,7 +2320,7 @@ static void task_numa_group(struct task_ if (!grp) goto no_join;
- my_grp = p->numa_group; + my_grp = deref_curr_numa_group(p); if (grp == my_grp) goto no_join;
@@ -2354,7 +2391,8 @@ no_join: */ void task_numa_free(struct task_struct *p, bool final) { - struct numa_group *grp = p->numa_group; + /* safe: p either is current or is being freed by current */ + struct numa_group *grp = rcu_dereference_raw(p->numa_group); unsigned long *numa_faults = p->numa_faults; unsigned long flags; int i; @@ -2434,7 +2472,7 @@ void task_numa_fault(int last_cpupid, in * actively using should be counted as local. This allows the * scan rate to slow down when a workload has settled down. */ - ng = p->numa_group; + ng = deref_curr_numa_group(p); if (!priv && !local && ng && ng->active_nodes > 1 && numa_is_active_node(cpu_node, ng) && numa_is_active_node(mem_node, ng)) @@ -10234,18 +10272,22 @@ void show_numa_stats(struct task_struct { int node; unsigned long tsf = 0, tpf = 0, gsf = 0, gpf = 0; + struct numa_group *ng;
+ rcu_read_lock(); + ng = rcu_dereference(p->numa_group); for_each_online_node(node) { if (p->numa_faults) { tsf = p->numa_faults[task_faults_idx(NUMA_MEM, node, 0)]; tpf = p->numa_faults[task_faults_idx(NUMA_MEM, node, 1)]; } - if (p->numa_group) { - gsf = p->numa_group->faults[task_faults_idx(NUMA_MEM, node, 0)], - gpf = p->numa_group->faults[task_faults_idx(NUMA_MEM, node, 1)]; + if (ng) { + gsf = ng->faults[task_faults_idx(NUMA_MEM, node, 0)], + gpf = ng->faults[task_faults_idx(NUMA_MEM, node, 1)]; } print_numa_stats(m, node, tsf, tpf, gsf, gpf); } + rcu_read_unlock(); } #endif /* CONFIG_NUMA_BALANCING */ #endif /* CONFIG_SCHED_DEBUG */
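As a rough user-space analogue of the accessor discipline this patch enforces (C11 acquire/release atomics standing in for the kernel's RCU accessors, and ignoring the grace-period/lifetime half of the problem), all readers go through one helper instead of mixing plain loads with annotated ones:

#include <stdatomic.h>
#include <stdio.h>

struct group {
    int gid;
};

static _Atomic(struct group *) shared_group;

/* single accessor used by all readers */
static struct group *deref_group(void)
{
    return atomic_load_explicit(&shared_group, memory_order_acquire);
}

static int read_gid(void)
{
    struct group *g = deref_group();    /* never a plain read */

    return g ? g->gid : 0;
}

static void publish_group(struct group *g)
{
    atomic_store_explicit(&shared_group, g, memory_order_release);
}

int main(void)
{
    static struct group g = { .gid = 7 };

    printf("gid before publish: %d\n", read_gid());    /* 0 */
    publish_group(&g);
    printf("gid after publish:  %d\n", read_gid());    /* 7 */
    return 0;
}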
From: Linus Torvalds torvalds@linux-foundation.org
commit 3d712546d8ba9f25cdf080d79f90482aa4231ed4 upstream.
Start off with a clean slate that only reads exactly from arg_start to arg_end, without any oddities. This simplifies the code and in the process removes the case that caused us to potentially leak an uninitialized byte from the temporary kernel buffer.
Note that in order to start from scratch with an understandable base, this simplifies things _too_ much, and removes all the legacy logic to handle setproctitle() having changed the argument strings.
We'll add back those special cases very differently in the next commit.
Link: https://lore.kernel.org/lkml/20190712160913.17727-1-izbyshev@ispras.ru/ Fixes: f5b65348fd77 ("proc: fix missing final NUL in get_mm_cmdline() rewrite") Cc: Alexey Izbyshev izbyshev@ispras.ru Cc: Alexey Dobriyan adobriyan@gmail.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/proc/base.c | 71 ++++++--------------------------------------------------- 1 file changed, 8 insertions(+), 63 deletions(-)
--- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -208,7 +208,7 @@ static int proc_root_link(struct dentry static ssize_t get_mm_cmdline(struct mm_struct *mm, char __user *buf, size_t count, loff_t *ppos) { - unsigned long arg_start, arg_end, env_start, env_end; + unsigned long arg_start, arg_end; unsigned long pos, len; char *page;
@@ -219,36 +219,18 @@ static ssize_t get_mm_cmdline(struct mm_ spin_lock(&mm->arg_lock); arg_start = mm->arg_start; arg_end = mm->arg_end; - env_start = mm->env_start; - env_end = mm->env_end; spin_unlock(&mm->arg_lock);
if (arg_start >= arg_end) return 0;
- /* - * We have traditionally allowed the user to re-write - * the argument strings and overflow the end result - * into the environment section. But only do that if - * the environment area is contiguous to the arguments. - */ - if (env_start != arg_end || env_start >= env_end) - env_start = env_end = arg_end; - - /* .. and limit it to a maximum of one page of slop */ - if (env_end >= arg_end + PAGE_SIZE) - env_end = arg_end + PAGE_SIZE - 1; - /* We're not going to care if "*ppos" has high bits set */ - pos = arg_start + *ppos; - /* .. but we do check the result is in the proper range */ - if (pos < arg_start || pos >= env_end) + pos = arg_start + *ppos; + if (pos < arg_start || pos >= arg_end) return 0; - - /* .. and we never go past env_end */ - if (env_end - pos < count) - count = env_end - pos; + if (count > arg_end - pos) + count = arg_end - pos;
page = (char *)__get_free_page(GFP_KERNEL); if (!page) @@ -258,48 +240,11 @@ static ssize_t get_mm_cmdline(struct mm_ while (count) { int got; size_t size = min_t(size_t, PAGE_SIZE, count); - long offset; - - /* - * Are we already starting past the official end? - * We always include the last byte that is *supposed* - * to be NUL - */ - offset = (pos >= arg_end) ? pos - arg_end + 1 : 0;
- got = access_remote_vm(mm, pos - offset, page, size + offset, FOLL_ANON); - if (got <= offset) + got = access_remote_vm(mm, pos, page, size, FOLL_ANON); + if (got <= 0) break; - got -= offset; - - /* Don't walk past a NUL character once you hit arg_end */ - if (pos + got >= arg_end) { - int n = 0; - - /* - * If we started before 'arg_end' but ended up - * at or after it, we start the NUL character - * check at arg_end-1 (where we expect the normal - * EOF to be). - * - * NOTE! This is smaller than 'got', because - * pos + got >= arg_end - */ - if (pos < arg_end) - n = arg_end - pos - 1; - - /* Cut off at first NUL after 'n' */ - got = n + strnlen(page+n, offset+got-n); - if (got < offset) - break; - got -= offset; - - /* Include the NUL if it existed */ - if (got < size) - got++; - } - - got -= copy_to_user(buf, page+offset, got); + got -= copy_to_user(buf, page, got); if (unlikely(!got)) { if (!len) len = -EFAULT;
From: Linus Torvalds torvalds@linux-foundation.org
commit d26d0cd97c88eb1a5704b42e41ab443406807810 upstream.
This makes the setproctitle() special case very explicit indeed, and handles it with a separate helper function entirely. In the process, it re-instates the original semantics of simply stopping at the first NUL character when the original last NUL character is no longer there.
[ The original semantics can still be seen in mm/util.c: get_cmdline() that is limited to a fixed-size buffer ]
This makes the logic about when we use the string lengths etc much more obvious, and makes it easier to see what we do and what the two very different cases are.
Note that even when we allow walking past the end of the argument array (because setproctitle() might have overwritten and overflowed the original argv[] strings), we only allow the overflow into the environment region, and only if that region is immediately adjacent.
[ Fixed for missing 'count' checks noted by Alexey Izbyshev ]
Link: https://lore.kernel.org/lkml/alpine.LNX.2.21.1904052326230.3249@kich.toxcorp... Fixes: 5ab827189965 ("fs/proc: simplify and clarify get_mm_cmdline() function") Cc: Jakub Jankowski shasta@toxcorp.com Cc: Alexey Dobriyan adobriyan@gmail.com Cc: Alexey Izbyshev izbyshev@ispras.ru Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/proc/base.c | 81 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 77 insertions(+), 4 deletions(-)
--- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -205,12 +205,53 @@ static int proc_root_link(struct dentry return result; }
+/* + * If the user used setproctitle(), we just get the string from + * user space at arg_start, and limit it to a maximum of one page. + */ +static ssize_t get_mm_proctitle(struct mm_struct *mm, char __user *buf, + size_t count, unsigned long pos, + unsigned long arg_start) +{ + char *page; + int ret, got; + + if (pos >= PAGE_SIZE) + return 0; + + page = (char *)__get_free_page(GFP_KERNEL); + if (!page) + return -ENOMEM; + + ret = 0; + got = access_remote_vm(mm, arg_start, page, PAGE_SIZE, FOLL_ANON); + if (got > 0) { + int len = strnlen(page, got); + + /* Include the NUL character if it was found */ + if (len < got) + len++; + + if (len > pos) { + len -= pos; + if (len > count) + len = count; + len -= copy_to_user(buf, page+pos, len); + if (!len) + len = -EFAULT; + ret = len; + } + } + free_page((unsigned long)page); + return ret; +} + static ssize_t get_mm_cmdline(struct mm_struct *mm, char __user *buf, size_t count, loff_t *ppos) { - unsigned long arg_start, arg_end; + unsigned long arg_start, arg_end, env_start, env_end; unsigned long pos, len; - char *page; + char *page, c;
/* Check if process spawned far enough to have cmdline. */ if (!mm->env_end) @@ -219,14 +260,46 @@ static ssize_t get_mm_cmdline(struct mm_ spin_lock(&mm->arg_lock); arg_start = mm->arg_start; arg_end = mm->arg_end; + env_start = mm->env_start; + env_end = mm->env_end; spin_unlock(&mm->arg_lock);
if (arg_start >= arg_end) return 0;
+ /* + * We allow setproctitle() to overwrite the argument + * strings, and overflow past the original end. But + * only when it overflows into the environment area. + */ + if (env_start != arg_end || env_end < env_start) + env_start = env_end = arg_end; + len = env_end - arg_start; + /* We're not going to care if "*ppos" has high bits set */ - /* .. but we do check the result is in the proper range */ - pos = arg_start + *ppos; + pos = *ppos; + if (pos >= len) + return 0; + if (count > len - pos) + count = len - pos; + if (!count) + return 0; + + /* + * Magical special case: if the argv[] end byte is not + * zero, the user has overwritten it with setproctitle(3). + * + * Possible future enhancement: do this only once when + * pos is 0, and set a flag in the 'struct file'. + */ + if (access_remote_vm(mm, arg_end-1, &c, 1, FOLL_ANON) == 1 && c) + return get_mm_proctitle(mm, buf, count, pos, arg_start); + + /* + * For the non-setproctitle() case we limit things strictly + * to the [arg_start, arg_end[ range. + */ + pos += arg_start; if (pos < arg_start || pos >= arg_end) return 0; if (count > arg_end - pos)
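A simplified user-space sketch of the two cases follows; it operates on an in-memory copy of the argument area rather than reading the remote process the way get_mm_cmdline() does, and the cmdline_len() helper is made up for illustration. If the original terminating NUL at arg_end-1 is still there, stay strictly inside the argument range; if it was overwritten (the setproctitle() case), read up to one page and stop at the first NUL found.

#include <stdio.h>
#include <string.h>

#define PAGE 4096

/* how many bytes of `arg` (the copied [arg_start, arg_end) area) to show;
 * `avail` is how much memory past arg_start could legitimately be read */
static size_t cmdline_len(const char *arg, size_t arg_len, size_t avail)
{
    size_t len;

    if (arg_len == 0)
        return 0;

    /* normal case: original argv[] terminator still in place */
    if (arg[arg_len - 1] == '\0')
        return arg_len;

    /* setproctitle() case: scan forward, capped to one page, stop at NUL */
    if (avail > PAGE)
        avail = PAGE;
    len = strnlen(arg, avail);
    return len < avail ? len + 1 : avail;    /* include the NUL if found */
}

int main(void)
{
    char normal[] = "prog\0-v\0";           /* untouched argv[] strings */
    char title[]  = "my long new title";    /* terminator overwritten */

    printf("normal: %zu bytes\n",
           cmdline_len(normal, sizeof(normal) - 1, sizeof(normal) - 1));
    printf("title:  %zu bytes\n",
           cmdline_len(title, 5, sizeof(title)));
    return 0;
}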
From: Miroslav Lichvar mlichvar@redhat.com
commit 5515e9a6273b8c02034466bcbd717ac9f53dab99 upstream.
The PPS assert/clear offset corrections are set by the PPS_SETPARAMS ioctl in the pps_ktime structs, which also contain flags. The flags are not initialized by applications (using the timepps.h header) and they are not used by the kernel for anything except returning them back in the PPS_GETPARAMS ioctl.
Set the flags to zero to make it clear they are unused and avoid leaking uninitialized data of the PPS_SETPARAMS caller to other applications that have a read access to the PPS device.
Link: http://lkml.kernel.org/r/20190702092251.24303-1-mlichvar@redhat.com Signed-off-by: Miroslav Lichvar mlichvar@redhat.com Reviewed-by: Thomas Gleixner tglx@linutronix.de Acked-by: Rodolfo Giometti giometti@enneenne.com Cc: Greg KH greg@kroah.com Cc: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pps/pps.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/drivers/pps/pps.c +++ b/drivers/pps/pps.c @@ -166,6 +166,14 @@ static long pps_cdev_ioctl(struct file * pps->params.mode |= PPS_CANWAIT; pps->params.api_version = PPS_API_VERS;
+ /* + * Clear unused fields of pps_kparams to avoid leaking + * uninitialized data of the PPS_SETPARAMS caller via + * PPS_GETPARAMS + */ + pps->params.assert_off_tu.flags = 0; + pps->params.clear_off_tu.flags = 0; + spin_unlock_irq(&pps->lock);
break;
From: Yoshinori Sato ysato@users.sourceforge.jp
commit 1b496469d0c020e09124e03e66a81421c21272a7 upstream.
JCore-SoC and SolutionEngine 7619 conflict with each other.
Signed-off-by: Yoshinori Sato ysato@users.sourceforge.jp Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/sh/boards/Kconfig | 14 +++----------- 1 file changed, 3 insertions(+), 11 deletions(-)
--- a/arch/sh/boards/Kconfig +++ b/arch/sh/boards/Kconfig @@ -8,27 +8,19 @@ config SH_ALPHA_BOARD bool
config SH_DEVICE_TREE - bool "Board Described by Device Tree" + bool select OF select OF_EARLY_FLATTREE select TIMER_OF select COMMON_CLK select GENERIC_CALIBRATE_DELAY - help - Select Board Described by Device Tree to build a kernel that - does not hard-code any board-specific knowledge but instead uses - a device tree blob provided by the boot-loader. You must enable - drivers for any hardware you want to use separately. At this - time, only boards based on the open-hardware J-Core processors - have sufficient driver coverage to use this option; do not - select it if you are using original SuperH hardware.
config SH_JCORE_SOC bool "J-Core SoC" - depends on SH_DEVICE_TREE && (CPU_SH2 || CPU_J2) + select SH_DEVICE_TREE select CLKSRC_JCORE_PIT select JCORE_AIC - default y if CPU_J2 + depends on CPU_J2 help Select this option to include drivers core components of the J-Core SoC, including interrupt controllers and timers.
From: Yan, Zheng zyan@redhat.com
commit d6e47819721ae2d9d090058ad5570a66f3c42e39 upstream.
ceph_d_revalidate(, LOOKUP_RCU) may call __ceph_caps_issued_mask() on a freeing inode.
Signed-off-by: "Yan, Zheng" zyan@redhat.com Reviewed-by: Jeff Layton jlayton@redhat.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/ceph/caps.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -1237,20 +1237,23 @@ static int send_cap_msg(struct cap_msg_a }
/* - * Queue cap releases when an inode is dropped from our cache. Since - * inode is about to be destroyed, there is no need for i_ceph_lock. + * Queue cap releases when an inode is dropped from our cache. */ void ceph_queue_caps_release(struct inode *inode) { struct ceph_inode_info *ci = ceph_inode(inode); struct rb_node *p;
+ /* lock i_ceph_lock, because ceph_d_revalidate(..., LOOKUP_RCU) + * may call __ceph_caps_issued_mask() on a freeing inode. */ + spin_lock(&ci->i_ceph_lock); p = rb_first(&ci->i_caps); while (p) { struct ceph_cap *cap = rb_entry(p, struct ceph_cap, ci_node); p = rb_next(p); __ceph_remove_cap(cap, true); } + spin_unlock(&ci->i_ceph_lock); }
/*
From: Bart Van Assche bvanassche@acm.org
commit cd84a62e0078dce09f4ed349bec84f86c9d54b30 upstream.
The RQF_PREEMPT flag is used for three purposes:

- In the SCSI core, for making sure that power management requests are executed even if a device is in the "quiesced" state.
- For domain validation by SCSI drivers that use the parallel port.
- In the IDE driver, for IDE preempt requests.

Rename "preempt-only" into "pm-only" because the primary purpose of this mode is power management. Since the power management core may but does not have to resume a runtime suspended device before performing system-wide suspend and since a later patch will set "pm-only" mode as long as a block device is runtime suspended, make it possible to set "pm-only" mode from more than one context. Since with this change scsi_device_quiesce() is no longer idempotent, make that function return early if it is called for a quiesced queue.
Signed-off-by: Bart Van Assche bvanassche@acm.org Acked-by: Martin K. Petersen martin.petersen@oracle.com Reviewed-by: Hannes Reinecke hare@suse.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Ming Lei ming.lei@redhat.com Cc: Jianchao Wang jianchao.w.wang@oracle.com Cc: Johannes Thumshirn jthumshirn@suse.de Cc: Alan Stern stern@rowland.harvard.edu Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- block/blk-core.c | 35 ++++++++++++++++++----------------- block/blk-mq-debugfs.c | 10 +++++++++- drivers/scsi/scsi_lib.c | 11 +++++++---- include/linux/blkdev.h | 14 +++++++++----- 4 files changed, 43 insertions(+), 27 deletions(-)
--- a/block/blk-core.c +++ b/block/blk-core.c @@ -421,24 +421,25 @@ void blk_sync_queue(struct request_queue EXPORT_SYMBOL(blk_sync_queue);
/** - * blk_set_preempt_only - set QUEUE_FLAG_PREEMPT_ONLY + * blk_set_pm_only - increment pm_only counter * @q: request queue pointer - * - * Returns the previous value of the PREEMPT_ONLY flag - 0 if the flag was not - * set and 1 if the flag was already set. */ -int blk_set_preempt_only(struct request_queue *q) +void blk_set_pm_only(struct request_queue *q) { - return blk_queue_flag_test_and_set(QUEUE_FLAG_PREEMPT_ONLY, q); + atomic_inc(&q->pm_only); } -EXPORT_SYMBOL_GPL(blk_set_preempt_only); +EXPORT_SYMBOL_GPL(blk_set_pm_only);
-void blk_clear_preempt_only(struct request_queue *q) +void blk_clear_pm_only(struct request_queue *q) { - blk_queue_flag_clear(QUEUE_FLAG_PREEMPT_ONLY, q); - wake_up_all(&q->mq_freeze_wq); + int pm_only; + + pm_only = atomic_dec_return(&q->pm_only); + WARN_ON_ONCE(pm_only < 0); + if (pm_only == 0) + wake_up_all(&q->mq_freeze_wq); } -EXPORT_SYMBOL_GPL(blk_clear_preempt_only); +EXPORT_SYMBOL_GPL(blk_clear_pm_only);
/** * __blk_run_queue_uncond - run a queue whether or not it has been stopped @@ -916,7 +917,7 @@ EXPORT_SYMBOL(blk_alloc_queue); */ int blk_queue_enter(struct request_queue *q, blk_mq_req_flags_t flags) { - const bool preempt = flags & BLK_MQ_REQ_PREEMPT; + const bool pm = flags & BLK_MQ_REQ_PREEMPT;
while (true) { bool success = false; @@ -924,11 +925,11 @@ int blk_queue_enter(struct request_queue rcu_read_lock(); if (percpu_ref_tryget_live(&q->q_usage_counter)) { /* - * The code that sets the PREEMPT_ONLY flag is - * responsible for ensuring that that flag is globally - * visible before the queue is unfrozen. + * The code that increments the pm_only counter is + * responsible for ensuring that that counter is + * globally visible before the queue is unfrozen. */ - if (preempt || !blk_queue_preempt_only(q)) { + if (pm || !blk_queue_pm_only(q)) { success = true; } else { percpu_ref_put(&q->q_usage_counter); @@ -953,7 +954,7 @@ int blk_queue_enter(struct request_queue
 		wait_event(q->mq_freeze_wq,
 			   (atomic_read(&q->mq_freeze_depth) == 0 &&
-			    (preempt || !blk_queue_preempt_only(q))) ||
+			    (pm || !blk_queue_pm_only(q))) ||
 			   blk_queue_dying(q));
 		if (blk_queue_dying(q))
 			return -ENODEV;
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -102,6 +102,14 @@ static int blk_flags_show(struct seq_fil
 	return 0;
 }
 
+static int queue_pm_only_show(void *data, struct seq_file *m)
+{
+	struct request_queue *q = data;
+
+	seq_printf(m, "%d\n", atomic_read(&q->pm_only));
+	return 0;
+}
+
 #define QUEUE_FLAG_NAME(name) [QUEUE_FLAG_##name] = #name
 static const char *const blk_queue_flag_name[] = {
 	QUEUE_FLAG_NAME(QUEUED),
@@ -132,7 +140,6 @@ static const char *const blk_queue_flag_
 	QUEUE_FLAG_NAME(REGISTERED),
 	QUEUE_FLAG_NAME(SCSI_PASSTHROUGH),
 	QUEUE_FLAG_NAME(QUIESCED),
-	QUEUE_FLAG_NAME(PREEMPT_ONLY),
 };
 #undef QUEUE_FLAG_NAME
 
@@ -209,6 +216,7 @@ static ssize_t queue_write_hint_store(vo
 static const struct blk_mq_debugfs_attr blk_mq_debugfs_queue_attrs[] = {
 	{ "poll_stat", 0400, queue_poll_stat_show },
 	{ "requeue_list", 0400, .seq_ops = &queue_requeue_list_seq_ops },
+	{ "pm_only", 0600, queue_pm_only_show, NULL },
 	{ "state", 0600, queue_state_show, queue_state_write },
 	{ "write_hints", 0600, queue_write_hint_show, queue_write_hint_store },
 	{ "zone_wlock", 0400, queue_zone_wlock_show, NULL },
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -3059,11 +3059,14 @@ scsi_device_quiesce(struct scsi_device *
 	 */
 	WARN_ON_ONCE(sdev->quiesced_by && sdev->quiesced_by != current);
 
-	blk_set_preempt_only(q);
+	if (sdev->quiesced_by == current)
+		return 0;
+
+	blk_set_pm_only(q);
 
 	blk_mq_freeze_queue(q);
 	/*
-	 * Ensure that the effect of blk_set_preempt_only() will be visible
+	 * Ensure that the effect of blk_set_pm_only() will be visible
 	 * for percpu_ref_tryget() callers that occur after the queue
 	 * unfreeze even if the queue was already frozen before this function
 	 * was called. See also https://lwn.net/Articles/573497/.
@@ -3076,7 +3079,7 @@ scsi_device_quiesce(struct scsi_device *
 	if (err == 0)
 		sdev->quiesced_by = current;
 	else
-		blk_clear_preempt_only(q);
+		blk_clear_pm_only(q);
 	mutex_unlock(&sdev->state_mutex);
 
 	return err;
@@ -3100,7 +3103,7 @@ void scsi_device_resume(struct scsi_devi
 	 */
 	mutex_lock(&sdev->state_mutex);
 	sdev->quiesced_by = NULL;
-	blk_clear_preempt_only(sdev->request_queue);
+	blk_clear_pm_only(sdev->request_queue);
 	if (sdev->sdev_state == SDEV_QUIESCE)
 		scsi_device_set_state(sdev, SDEV_RUNNING);
 	mutex_unlock(&sdev->state_mutex);
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -504,6 +504,12 @@ struct request_queue {
 	 * various queue flags, see QUEUE_* below
 	 */
 	unsigned long queue_flags;
+	/*
+	 * Number of contexts that have called blk_set_pm_only(). If this
+	 * counter is above zero then only RQF_PM and RQF_PREEMPT requests are
+	 * processed.
+	 */
+	atomic_t pm_only;
 
 	/*
 	 * ida allocated id for this queue. Used to index queues from
@@ -698,7 +704,6 @@ struct request_queue {
 #define QUEUE_FLAG_REGISTERED	26	/* queue has been registered to a disk */
 #define QUEUE_FLAG_SCSI_PASSTHROUGH 27	/* queue supports SCSI commands */
 #define QUEUE_FLAG_QUIESCED	28	/* queue has been quiesced */
-#define QUEUE_FLAG_PREEMPT_ONLY	29	/* only process REQ_PREEMPT requests */
 
 #define QUEUE_FLAG_DEFAULT	((1 << QUEUE_FLAG_IO_STAT) |		\
 				 (1 << QUEUE_FLAG_SAME_COMP) |		\
@@ -736,12 +741,11 @@ bool blk_queue_flag_test_and_clear(unsig
 	((rq)->cmd_flags & (REQ_FAILFAST_DEV|REQ_FAILFAST_TRANSPORT| \
 			     REQ_FAILFAST_DRIVER))
 #define blk_queue_quiesced(q)	test_bit(QUEUE_FLAG_QUIESCED, &(q)->queue_flags)
-#define blk_queue_preempt_only(q)				\
-	test_bit(QUEUE_FLAG_PREEMPT_ONLY, &(q)->queue_flags)
+#define blk_queue_pm_only(q)	atomic_read(&(q)->pm_only)
 #define blk_queue_fua(q)	test_bit(QUEUE_FLAG_FUA, &(q)->queue_flags)
 
-extern int blk_set_preempt_only(struct request_queue *q);
-extern void blk_clear_preempt_only(struct request_queue *q);
+extern void blk_set_pm_only(struct request_queue *q);
+extern void blk_clear_pm_only(struct request_queue *q);
 
 static inline int queue_in_flight(struct request_queue *q)
 {
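With the flag replaced by a counter, the set/clear helpers declared above pair up per context: each blk_set_pm_only() must be balanced by a blk_clear_pm_only(), and normal request processing resumes only when the count drops back to zero. For reference, the blk-core.c side of this change makes the helpers look roughly like this (a sketch, not one of the hunks quoted above):

void blk_set_pm_only(struct request_queue *q)
{
	atomic_inc(&q->pm_only);
}

void blk_clear_pm_only(struct request_queue *q)
{
	int pm_only;

	/* drop this context's reference; warn if set/clear are unbalanced */
	pm_only = atomic_dec_return(&q->pm_only);
	WARN_ON_ONCE(pm_only < 0);
	/* the last clearer lets waiters on q->mq_freeze_wq in blk_queue_enter() proceed */
	if (pm_only == 0)
		wake_up_all(&q->mq_freeze_wq);
}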
From: Bart Van Assche bvanassche@acm.org
commit 17605afaae825b0291f80c62a7f6565879edaa8a upstream.
Since scsi_device_quiesce() skips SCSI devices that are in a state other than RUNNING, OFFLINE or TRANSPORT_OFFLINE, scsi_device_resume() should not complain about SCSI devices that have been skipped. Hence this patch. This patch prevents the following warning from appearing during resume:
WARNING: CPU: 3 PID: 1039 at blk_clear_pm_only+0x2a/0x30
CPU: 3 PID: 1039 Comm: kworker/u8:49 Not tainted 5.0.0+ #1
Hardware name: LENOVO 4180F42/4180F42, BIOS 83ET75WW (1.45 ) 05/10/2013
Workqueue: events_unbound async_run_entry_fn
RIP: 0010:blk_clear_pm_only+0x2a/0x30
Call Trace:
 ? scsi_device_resume+0x28/0x50
 ? scsi_dev_type_resume+0x2b/0x80
 ? async_run_entry_fn+0x2c/0xd0
 ? process_one_work+0x1f0/0x3f0
 ? worker_thread+0x28/0x3c0
 ? process_one_work+0x3f0/0x3f0
 ? kthread+0x10c/0x130
 ? __kthread_create_on_node+0x150/0x150
 ? ret_from_fork+0x1f/0x30

Cc: Christoph Hellwig hch@lst.de
Cc: Hannes Reinecke hare@suse.com
Cc: Ming Lei ming.lei@redhat.com
Cc: Johannes Thumshirn jthumshirn@suse.de
Cc: Oleksandr Natalenko oleksandr@natalenko.name
Cc: Martin Steigerwald martin@lichtvoll.de
Cc: stable@vger.kernel.org
Reported-by: Jisheng Zhang Jisheng.Zhang@synaptics.com
Tested-by: Jisheng Zhang Jisheng.Zhang@synaptics.com
Fixes: 3a0a529971ec ("block, scsi: Make SCSI quiesce and resume work reliably") # v4.15
Signed-off-by: Bart Van Assche bvanassche@acm.org
Signed-off-by: Martin K. Petersen martin.petersen@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org

---
 drivers/scsi/scsi_lib.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -3102,8 +3102,10 @@ void scsi_device_resume(struct scsi_devi
 	 * device deleted during suspend)
 	 */
 	mutex_lock(&sdev->state_mutex);
-	sdev->quiesced_by = NULL;
-	blk_clear_pm_only(sdev->request_queue);
+	if (sdev->quiesced_by) {
+		sdev->quiesced_by = NULL;
+		blk_clear_pm_only(sdev->request_queue);
+	}
 	if (sdev->sdev_state == SDEV_QUIESCE)
 		scsi_device_set_state(sdev, SDEV_RUNNING);
 	mutex_unlock(&sdev->state_mutex);
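The warning above comes from the underflow check in blk_clear_pm_only(): resuming a device whose quiesce was skipped decrements a pm_only counter that was never incremented. A toy user-space model of that imbalance (illustration only, not kernel code; the helper names here are made up):

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

static atomic_int pm_only = 0;			/* models q->pm_only */

static void set_pm_only(void)
{
	atomic_fetch_add(&pm_only, 1);
}

static void clear_pm_only(void)
{
	int v = atomic_fetch_sub(&pm_only, 1) - 1;

	if (v < 0)				/* models WARN_ON_ONCE(pm_only < 0) */
		fprintf(stderr, "WARNING: pm_only underflow (%d)\n", v);
}

static void resume(bool was_quiesced, bool fixed)
{
	/* old code cleared unconditionally; the fix checks quiesced_by first */
	if (!fixed || was_quiesced)
		clear_pm_only();
}

int main(void)
{
	resume(false, false);	/* skipped quiesce + old resume: warns     */
	atomic_store(&pm_only, 0);

	resume(false, true);	/* skipped quiesce + fixed resume: silent  */

	set_pm_only();		/* normal quiesce...                       */
	resume(true, true);	/* ...followed by resume: balanced, silent */
	return 0;
}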
From: Thierry Reding treding@nvidia.com
On Fri, 02 Aug 2019 11:39:34 +0200, Greg Kroah-Hartman wrote:
All tests passing for Tegra ...
Test results for stable-v4.19:
    12 builds: 12 pass, 0 fail
    22 boots: 22 pass, 0 fail
    32 tests: 32 pass, 0 fail

Linux version: 4.19.64-rc1-g63a8dab46af2
Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000,
               tegra20-ventana, tegra210-p2371-2180, tegra30-cardhu-a04
Thierry
On 8/2/19 3:39 AM, Greg Kroah-Hartman wrote:
Compiled and booted on my test system. No dmesg regressions.
thanks,
-- Shuah
On Fri, 2 Aug 2019 at 15:26, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Summary
------------------------------------------------------------------------

kernel: 4.19.64-rc1
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.19.y
git commit: 63a8dab46af2b65ecdb5a83662d94a3a26be973e
git describe: v4.19.62-148-g63a8dab46af2
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.19-oe/build/v4.19.62-14...
No regressions (compared to build v4.19.62)
No fixes (compared to build v4.19.62)
Ran 25243 total tests in the following environments and test suites.
Environments
--------------
- dragonboard-410c - arm64
- hi6220-hikey - arm64
- i386
- juno-r2 - arm64
- qemu_arm
- qemu_arm64
- qemu_i386
- qemu_x86_64
- x15 - arm
- x86_64
Test Suites
-----------
* build
* install-android-platform-tools-r2600
* kselftest
* libgpiod
* libhugetlbfs
* ltp-cap_bounds-tests
* ltp-commands-tests
* ltp-containers-tests
* ltp-cpuhotplug-tests
* ltp-cve-tests
* ltp-dio-tests
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* ltp-hugetlb-tests
* ltp-io-tests
* ltp-ipc-tests
* ltp-math-tests
* ltp-mm-tests
* ltp-nptl-tests
* ltp-pty-tests
* ltp-sched-tests
* ltp-securebits-tests
* ltp-syscalls-tests
* ltp-timers-tests
* perf
* spectre-meltdown-checker-test
* v4l2-compliance
* ltp-fs-tests
* network-basic-tests
* ltp-open-posix-tests
* kvm-unit-tests
* ssuite
* kselftest-vsyscall-mode-native
* kselftest-vsyscall-mode-none
On Fri 2019-08-02 11:39:34, Greg Kroah-Hartman wrote:
The git tree does not seem to correspond to the patches posted. git has:
commit 63a8dab46af2b65ecdb5a83662d94a3a26be973e
Author: Greg Kroah-Hartman gregkh@linuxfoundation.org
Date:   Fri Aug 2 13:30:55 2019 +0200
Linux 4.19.64-rc1
commit 1b35ed42aeacc21a9d21646165333566dd8e181a
Author: Xin Long lucien.xin@gmail.com
Date:   Mon Jun 17 21:34:13 2019 +0800
ip_tunnel: allow not to count pkts on tstats by setting skb's dev to NULL
But the 1b35ed42aeacc ip_tunnel patch is not mentioned here, nor is it included in the series on the list, AFAICT. (I don't find anything wrong with 1b35ed42aeacc.)
Best regards, Pavel
On Sat, Aug 03, 2019 at 11:58:25AM +0200, Pavel Machek wrote:
The git tree does not seem to correspond to the patches posted. git has:
commit 63a8dab46af2b65ecdb5a83662d94a3a26be973e
Author: Greg Kroah-Hartman gregkh@linuxfoundation.org
Date:   Fri Aug 2 13:30:55 2019 +0200
Linux 4.19.64-rc1
commit 1b35ed42aeacc21a9d21646165333566dd8e181a
Author: Xin Long lucien.xin@gmail.com
Date:   Mon Jun 17 21:34:13 2019 +0800
ip_tunnel: allow not to count pkts on tstats by setting skb's dev to NULL
But the 1b35ed42aeacc ip_tunnel patch is not mentioned here, nor is it included in the series on the list, AFAICT. (I don't find anything wrong with 1b35ed42aeacc.)
It was added after I did this release; see the stable mailing list for the details. It went into the 4.14.y and 4.19.y trees at the same time.
thanks,
greg k-h
On 8/2/19 2:39 AM, Greg Kroah-Hartman wrote:
Build results:
	total: 156 pass: 156 fail: 0
Qemu test results:
	total: 364 pass: 364 fail: 0
Guenter