This is the start of the stable review cycle for the 4.16.5 release. There are 26 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri Apr 27 10:33:04 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.16.5-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.16.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.16.5-rc1
Sean Christopherson sean.j.christopherson@intel.com Revert "KVM: X86: Fix SMRAM accessing even if VM is shutdown"
Leon Romanovsky leonro@mellanox.com RDMA/mlx5: Fix NULL dereference while accessing XRC_TGT QPs
Jiri Olsa jolsa@kernel.org perf: Return proper values for user stack errors
Jiri Olsa jolsa@kernel.org perf: Fix sample_max_stack maximum check
Florian Westphal fw@strlen.de netfilter: x_tables: limit allocation requests for blob rule heads
Florian Westphal fw@strlen.de netfilter: compat: reject huge allocation requests
Florian Westphal fw@strlen.de netfilter: compat: prepare xt_compat_init_offsets to return errors
Florian Westphal fw@strlen.de netfilter: x_tables: add counters allocation wrapper
Florian Westphal fw@strlen.de netfilter: x_tables: cap allocations at 512 mbyte
Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp mm,vmscan: Allow preallocating memory for register_shrinker().
Thomas Gleixner tglx@linutronix.de alarmtimer: Init nanosleep alarm timer on stack
Imre Deak imre.deak@intel.com drm/i915: Fix LSPCON TMDS output buffer enabling from low-power state
Xidong Wang wangxidong_97@163.com drm/i915: Do no use kfree() to free a kmem_cache_alloc() return value
Gaurav K Singh gaurav.k.singh@intel.com drm/i915/audio: Fix audio detection issue on GLK
Jani Nikula jani.nikula@intel.com drm/i915/bios: filter out invalid DDC pins from VBT child devices
Tina Zhang tina.zhang@intel.com drm/i915/gvt: Add drm_format_mod update
Gerd Hoffmann kraxel@redhat.com drm/i915/gvt: throw error on unhandled vfio ioctls
Daniel J Blueman daniel@quora.org drm/vc4: Fix memory leak during BO teardown
Xiaoming Gao gxm.linux.kernel@gmail.com x86/tsc: Prevent 32bit truncation in calc_hpet_ref()
Laura Abbott labbott@redhat.com posix-cpu-timers: Ensure set_process_cpu_timer is always evaluated
Anson Huang Anson.Huang@nxp.com clocksource/imx-tpm: Correct -ETIME return condition check
Dou Liyang douly.fnst@cn.fujitsu.com x86/acpi: Prevent X2APIC id 0xffffffff from being accounted
Nikolay Borisov nborisov@suse.com btrfs: Fix race condition between delayed refs and blockgroup removal
David Sterba dsterba@suse.com btrfs: fix unaligned access in readdir
Steve French smfrench@gmail.com cifs: do not allow creating sockets except with SMB1 posix exensions
Long Li longli@microsoft.com cifs: smbd: Check for iov length on sending the last iov
-------------
Diffstat:
Makefile | 4 +-- arch/x86/kernel/acpi/boot.c | 4 +++ arch/x86/kernel/tsc.c | 2 +- arch/x86/kvm/mmu.c | 2 +- drivers/clocksource/timer-imx-tpm.c | 2 +- drivers/gpu/drm/drm_dp_dual_mode_helper.c | 39 +++++++++++++++++++---- drivers/gpu/drm/i915/gvt/dmabuf.c | 1 + drivers/gpu/drm/i915/gvt/kvmgt.c | 2 +- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 2 +- drivers/gpu/drm/i915/intel_audio.c | 2 +- drivers/gpu/drm/i915/intel_bios.c | 13 +++++--- drivers/gpu/drm/vc4/vc4_bo.c | 2 ++ drivers/gpu/drm/vc4/vc4_validate_shaders.c | 1 + drivers/infiniband/hw/mlx5/qp.c | 3 +- fs/btrfs/delayed-ref.c | 19 ++++++++--- fs/btrfs/delayed-ref.h | 1 + fs/btrfs/extent-tree.c | 16 +++++++--- fs/btrfs/inode.c | 20 +++++++----- fs/cifs/dir.c | 9 +++--- fs/cifs/smbdirect.c | 2 ++ fs/super.c | 9 +++--- include/linux/netfilter/x_tables.h | 3 +- include/linux/shrinker.h | 7 ++-- kernel/events/callchain.c | 21 ++++++------ kernel/events/core.c | 4 +-- kernel/time/alarmtimer.c | 34 +++++++++++++++----- kernel/time/posix-cpu-timers.c | 4 ++- mm/vmscan.c | 21 +++++++++++- net/bridge/netfilter/ebtables.c | 10 ++++-- net/ipv4/netfilter/arp_tables.c | 12 ++++--- net/ipv4/netfilter/ip_tables.c | 10 ++++-- net/ipv6/netfilter/ip6_tables.c | 12 ++++--- net/netfilter/x_tables.c | 51 ++++++++++++++++++++++++------ 33 files changed, 250 insertions(+), 94 deletions(-)
4.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Long Li longli@microsoft.com
commit ab60ee7bf9a84954f50a66a3d835860e80f99b7f upstream.
When sending the last iov that breaks into smaller buffers to fit the transfer size, it's necessary to check if this is the last iov.
If this is the latest iov, stop and proceed to send pages.
Signed-off-by: Long Li longli@microsoft.com Cc: stable@vger.kernel.org Signed-off-by: Steve French stfrench@microsoft.com Reviewed-by: Ronnie Sahlberg lsahlber@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/smbdirect.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/fs/cifs/smbdirect.c +++ b/fs/cifs/smbdirect.c @@ -2194,6 +2194,8 @@ int smbd_send(struct smbd_connection *in goto done; } i++; + if (i == rqst->rq_nvec) + break; } start = i; buflen = 0;
4.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steve French smfrench@gmail.com
commit 1d0cffa674cfa7d185a302c8c6850fc50b893bed upstream.
RHBZ: 1453123
Since at least the 3.10 kernel and likely a lot earlier we have not been able to create unix domain sockets in a cifs share when mounted using the SFU mount option (except when mounted with the cifs unix extensions to Samba e.g.) Trying to create a socket, for example using the af_unix command from xfstests will cause : BUG: unable to handle kernel NULL pointer dereference at 00000000 00000040
Since no one uses or depends on being able to create unix domains sockets on a cifs share the easiest fix to stop this vulnerability is to simply not allow creation of any other special files than char or block devices when sfu is used.
Added update to Ronnie's patch to handle a tcon link leak, and to address a buf leak noticed by Gustavo and Colin.
Acked-by: Gustavo A. R. Silva gustavo@embeddedor.com CC: Colin Ian King colin.king@canonical.com Reviewed-by: Pavel Shilovsky pshilov@microsoft.com Reported-by: Eryu Guan eguan@redhat.com Signed-off-by: Ronnie Sahlberg lsahlber@redhat.com Signed-off-by: Steve French smfrench@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/dir.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
--- a/fs/cifs/dir.c +++ b/fs/cifs/dir.c @@ -684,6 +684,9 @@ int cifs_mknod(struct inode *inode, stru goto mknod_out; }
+ if (!S_ISCHR(mode) && !S_ISBLK(mode)) + goto mknod_out; + if (!(cifs_sb->mnt_cifs_flags & CIFS_MOUNT_UNX_EMUL)) goto mknod_out;
@@ -692,10 +695,8 @@ int cifs_mknod(struct inode *inode, stru
buf = kmalloc(sizeof(FILE_ALL_INFO), GFP_KERNEL); if (buf == NULL) { - kfree(full_path); rc = -ENOMEM; - free_xid(xid); - return rc; + goto mknod_out; }
if (backup_cred(cifs_sb)) @@ -742,7 +743,7 @@ int cifs_mknod(struct inode *inode, stru pdev->minor = cpu_to_le64(MINOR(device_number)); rc = tcon->ses->server->ops->sync_write(xid, &fid, &io_parms, &bytes_written, iov, 1); - } /* else if (S_ISFIFO) */ + } tcon->ses->server->ops->close(xid, tcon, &fid); d_drop(direntry);
4.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nikolay Borisov nborisov@suse.com
commit 5e388e95815408c27f3612190d089afc0774b870 upstream.
When the delayed refs for a head are all run, eventually cleanup_ref_head is called which (in case of deletion) obtains a reference for the relevant btrfs_space_info struct by querying the bg for the range. This is problematic because when the last extent of a bg is deleted a race window emerges between removal of that bg and the subsequent invocation of cleanup_ref_head. This can result in cache being null and either a null pointer dereference or assertion failure.
task: ffff8d04d31ed080 task.stack: ffff9e5dc10cc000 RIP: 0010:assfail.constprop.78+0x18/0x1a [btrfs] RSP: 0018:ffff9e5dc10cfbe8 EFLAGS: 00010292 RAX: 0000000000000044 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff8d04ffc1f868 RSI: ffff8d04ffc178c8 RDI: ffff8d04ffc178c8 RBP: ffff8d04d29e5ea0 R08: 00000000000001f0 R09: 0000000000000001 R10: ffff9e5dc0507d58 R11: 0000000000000001 R12: ffff8d04d29e5ea0 R13: ffff8d04d29e5f08 R14: ffff8d04efe29b40 R15: ffff8d04efe203e0 FS: 00007fbf58ead500(0000) GS:ffff8d04ffc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fe6c6975648 CR3: 0000000013b2a000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __btrfs_run_delayed_refs+0x10e7/0x12c0 [btrfs] btrfs_run_delayed_refs+0x68/0x250 [btrfs] btrfs_should_end_transaction+0x42/0x60 [btrfs] btrfs_truncate_inode_items+0xaac/0xfc0 [btrfs] btrfs_evict_inode+0x4c6/0x5c0 [btrfs] evict+0xc6/0x190 do_unlinkat+0x19c/0x300 do_syscall_64+0x74/0x140 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 RIP: 0033:0x7fbf589c57a7
To fix this, introduce a new flag "is_system" to head_ref structs, which is populated at insertion time. This allows to decouple the querying for the spaceinfo from querying the possibly deleted bg.
Fixes: d7eae3403f46 ("Btrfs: rework delayed ref total_bytes_pinned accounting") CC: stable@vger.kernel.org # 4.14+ Suggested-by: Omar Sandoval osandov@osandov.com Signed-off-by: Nikolay Borisov nborisov@suse.com Reviewed-by: Omar Sandoval osandov@fb.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/delayed-ref.c | 19 ++++++++++++++----- fs/btrfs/delayed-ref.h | 1 + fs/btrfs/extent-tree.c | 16 +++++++++++----- 3 files changed, 26 insertions(+), 10 deletions(-)
--- a/fs/btrfs/delayed-ref.c +++ b/fs/btrfs/delayed-ref.c @@ -553,8 +553,10 @@ add_delayed_ref_head(struct btrfs_fs_inf struct btrfs_delayed_ref_head *head_ref, struct btrfs_qgroup_extent_record *qrecord, u64 bytenr, u64 num_bytes, u64 ref_root, u64 reserved, - int action, int is_data, int *qrecord_inserted_ret, + int action, int is_data, int is_system, + int *qrecord_inserted_ret, int *old_ref_mod, int *new_ref_mod) + { struct btrfs_delayed_ref_head *existing; struct btrfs_delayed_ref_root *delayed_refs; @@ -598,6 +600,7 @@ add_delayed_ref_head(struct btrfs_fs_inf head_ref->ref_mod = count_mod; head_ref->must_insert_reserved = must_insert_reserved; head_ref->is_data = is_data; + head_ref->is_system = is_system; head_ref->ref_tree = RB_ROOT; INIT_LIST_HEAD(&head_ref->ref_add_list); RB_CLEAR_NODE(&head_ref->href_node); @@ -785,6 +788,7 @@ int btrfs_add_delayed_tree_ref(struct bt struct btrfs_delayed_ref_root *delayed_refs; struct btrfs_qgroup_extent_record *record = NULL; int qrecord_inserted; + int is_system = (ref_root == BTRFS_CHUNK_TREE_OBJECTID);
BUG_ON(extent_op && extent_op->is_data); ref = kmem_cache_alloc(btrfs_delayed_tree_ref_cachep, GFP_NOFS); @@ -813,8 +817,8 @@ int btrfs_add_delayed_tree_ref(struct bt */ head_ref = add_delayed_ref_head(fs_info, trans, head_ref, record, bytenr, num_bytes, 0, 0, action, 0, - &qrecord_inserted, old_ref_mod, - new_ref_mod); + is_system, &qrecord_inserted, + old_ref_mod, new_ref_mod);
add_delayed_tree_ref(fs_info, trans, head_ref, &ref->node, bytenr, num_bytes, parent, ref_root, level, action); @@ -881,7 +885,7 @@ int btrfs_add_delayed_data_ref(struct bt */ head_ref = add_delayed_ref_head(fs_info, trans, head_ref, record, bytenr, num_bytes, ref_root, reserved, - action, 1, &qrecord_inserted, + action, 1, 0, &qrecord_inserted, old_ref_mod, new_ref_mod);
add_delayed_data_ref(fs_info, trans, head_ref, &ref->node, bytenr, @@ -911,9 +915,14 @@ int btrfs_add_delayed_extent_op(struct b delayed_refs = &trans->transaction->delayed_refs; spin_lock(&delayed_refs->lock);
+ /* + * extent_ops just modify the flags of an extent and they don't result + * in ref count changes, hence it's safe to pass false/0 for is_system + * argument + */ add_delayed_ref_head(fs_info, trans, head_ref, NULL, bytenr, num_bytes, 0, 0, BTRFS_UPDATE_DELAYED_HEAD, - extent_op->is_data, NULL, NULL, NULL); + extent_op->is_data, 0, NULL, NULL, NULL);
spin_unlock(&delayed_refs->lock); return 0; --- a/fs/btrfs/delayed-ref.h +++ b/fs/btrfs/delayed-ref.h @@ -139,6 +139,7 @@ struct btrfs_delayed_ref_head { */ unsigned int must_insert_reserved:1; unsigned int is_data:1; + unsigned int is_system:1; unsigned int processing:1; };
--- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -2615,13 +2615,19 @@ static int cleanup_ref_head(struct btrfs trace_run_delayed_ref_head(fs_info, head, 0);
if (head->total_ref_mod < 0) { - struct btrfs_block_group_cache *cache; + struct btrfs_space_info *space_info; + u64 flags;
- cache = btrfs_lookup_block_group(fs_info, head->bytenr); - ASSERT(cache); - percpu_counter_add(&cache->space_info->total_bytes_pinned, + if (head->is_data) + flags = BTRFS_BLOCK_GROUP_DATA; + else if (head->is_system) + flags = BTRFS_BLOCK_GROUP_SYSTEM; + else + flags = BTRFS_BLOCK_GROUP_METADATA; + space_info = __find_space_info(fs_info, flags); + ASSERT(space_info); + percpu_counter_add(&space_info->total_bytes_pinned, -head->num_bytes); - btrfs_put_block_group(cache);
if (head->is_data) { spin_lock(&delayed_refs->lock);
4.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Anson Huang Anson.Huang@nxp.com
commit 7407188489c62a7b5694bc75a6db2b82af94c9a5 upstream.
The additional brakects added to tpm_set_next_event's return value computation causes (int) forced type conversion NOT taking effect, and the incorrect value return will cause various system timer issue, like RCU stall etc..
Remove the additional brackets to make sure tpm_set_next_event always returns correct value.
Fixes: 059ab7b82eec ("clocksource/drivers/imx-tpm: Add imx tpm timer support") Signed-off-by: Anson Huang Anson.Huang@nxp.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Dong Aisheng Aisheng.dong@nxp.com Cc: stable@vger.kernel.org Cc: daniel.lezcano@linaro.org Cc: Linux-imx@nxp.com Link: https://lkml.kernel.org/r/1524117883-2484-1-git-send-email-Anson.Huang@nxp.c... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/clocksource/timer-imx-tpm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/clocksource/timer-imx-tpm.c +++ b/drivers/clocksource/timer-imx-tpm.c @@ -105,7 +105,7 @@ static int tpm_set_next_event(unsigned l * of writing CNT registers which may cause the min_delta event got * missed, so we need add a ETIME check here in case it happened. */ - return (int)((next - now) <= 0) ? -ETIME : 0; + return (int)(next - now) <= 0 ? -ETIME : 0; }
static int tpm_set_state_oneshot(struct clock_event_device *evt)
4.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Laura Abbott labbott@redhat.com
commit c3bca5d450b620dd3d36e14b5e1f43639fd47d6b upstream.
Commit a9445e47d897 ("posix-cpu-timers: Make set_process_cpu_timer() more robust") moved the check into the 'if' statement. Unfortunately, it did so on the right side of an && which means that it may get short circuited and never evaluated. This is easily reproduced with:
$ cat loop.c void main() { struct rlimit res; /* set the CPU time limit */ getrlimit(RLIMIT_CPU,&res); res.rlim_cur = 2; res.rlim_max = 2; setrlimit(RLIMIT_CPU,&res);
while (1); }
Which will hang forever instead of being killed. Fix this by pulling the evaluation out of the if statement but checking the return value instead.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1568337 Fixes: a9445e47d897 ("posix-cpu-timers: Make set_process_cpu_timer() more robust") Signed-off-by: Laura Abbott labbott@redhat.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Cc: "Max R . P . Grossmann" m@max.pm Cc: John Stultz john.stultz@linaro.org Link: https://lkml.kernel.org/r/20180417215742.2521-1-labbott@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/time/posix-cpu-timers.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/kernel/time/posix-cpu-timers.c +++ b/kernel/time/posix-cpu-timers.c @@ -1205,10 +1205,12 @@ void set_process_cpu_timer(struct task_s u64 *newval, u64 *oldval) { u64 now; + int ret;
WARN_ON_ONCE(clock_idx == CPUCLOCK_SCHED); + ret = cpu_timer_sample_group(clock_idx, tsk, &now);
- if (oldval && cpu_timer_sample_group(clock_idx, tsk, &now) != -EINVAL) { + if (oldval && ret != -EINVAL) { /* * We are setting itimer. The *oldval is absolute and we update * it to be relative, *newval argument is relative and we update
4.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xiaoming Gao gxm.linux.kernel@gmail.com
commit d3878e164dcd3925a237a20e879432400e369172 upstream.
The TSC calibration code uses HPET as reference. The conversion normalizes the delta of two HPET timestamps:
hpetref = ((tshpet1 - tshpet2) * HPET_PERIOD) / 1e6
and then divides the normalized delta of the corresponding TSC timestamps by the result to calulate the TSC frequency.
tscfreq = ((tstsc1 - tstsc2 ) * 1e6) / hpetref
This uses do_div() which takes an u32 as the divisor, which worked so far because the HPET frequency was low enough that 'hpetref' never exceeded 32bit.
On Skylake machines the HPET frequency increased so 'hpetref' can exceed 32bit. do_div() truncates the divisor, which causes the calibration to fail.
Use div64_u64() to avoid the problem.
[ tglx: Fixes whitespace mangled patch and rewrote changelog ]
Signed-off-by: Xiaoming Gao newtongao@tencent.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Cc: peterz@infradead.org Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/38894564-4fc9-b8ec-353f-de702839e44e@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/tsc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kernel/tsc.c +++ b/arch/x86/kernel/tsc.c @@ -317,7 +317,7 @@ static unsigned long calc_hpet_ref(u64 d hpet2 -= hpet1; tmp = ((u64)hpet2 * hpet_readl(HPET_PERIOD)); do_div(tmp, 1000000); - do_div(deltatsc, tmp); + deltatsc = div64_u64(deltatsc, tmp);
return (unsigned long) deltatsc; }
4.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel J Blueman daniel@quora.org
commit c0db1b677e1d584fab5d7ac76a32e1c0157542e0 upstream.
During BO teardown, an indirect list 'uniform_addr_offsets' wasn't being freed leading to leaking many 128B allocations. Fix the memory leak by releasing it at teardown time.
Cc: stable@vger.kernel.org Fixes: 6d45c81d229d ("drm/vc4: Add support for branching in shader validation.") Signed-off-by: Daniel J Blueman daniel@quora.org Signed-off-by: Eric Anholt eric@anholt.net Reviewed-by: Eric Anholt eric@anholt.net Link: https://patchwork.freedesktop.org/patch/msgid/20180402071035.25356-1-daniel@... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/vc4/vc4_bo.c | 2 ++ drivers/gpu/drm/vc4/vc4_validate_shaders.c | 1 + 2 files changed, 3 insertions(+)
--- a/drivers/gpu/drm/vc4/vc4_bo.c +++ b/drivers/gpu/drm/vc4/vc4_bo.c @@ -195,6 +195,7 @@ static void vc4_bo_destroy(struct vc4_bo vc4_bo_set_label(obj, -1);
if (bo->validated_shader) { + kfree(bo->validated_shader->uniform_addr_offsets); kfree(bo->validated_shader->texture_samples); kfree(bo->validated_shader); bo->validated_shader = NULL; @@ -591,6 +592,7 @@ void vc4_free_object(struct drm_gem_obje }
if (bo->validated_shader) { + kfree(bo->validated_shader->uniform_addr_offsets); kfree(bo->validated_shader->texture_samples); kfree(bo->validated_shader); bo->validated_shader = NULL; --- a/drivers/gpu/drm/vc4/vc4_validate_shaders.c +++ b/drivers/gpu/drm/vc4/vc4_validate_shaders.c @@ -942,6 +942,7 @@ vc4_validate_shader(struct drm_gem_cma_o fail: kfree(validation_state.branch_targets); if (validated_shader) { + kfree(validated_shader->uniform_addr_offsets); kfree(validated_shader->texture_samples); kfree(validated_shader); }
4.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gerd Hoffmann kraxel@redhat.com
commit 9f591ae60e1be026901398ef99eede91237aa3a1 upstream.
On unknown/unhandled ioctls the driver should return an error, so userspace knows it tried to use something unsupported.
Cc: stable@vger.kernel.org Signed-off-by: Gerd Hoffmann kraxel@redhat.com Reviewed-by: Alex Williamson alex.williamson@redhat.com Signed-off-by: Zhenyu Wang zhenyuw@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/i915/gvt/kvmgt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -1284,7 +1284,7 @@ static long intel_vgpu_ioctl(struct mdev
}
- return 0; + return -ENOTTY; }
static ssize_t
4.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tina Zhang tina.zhang@intel.com
commit 10996f802109c83421ca30556cfe36ffc3bebae3 upstream.
Add drm_format_mod update, which is omitted.
Fixes: e546e281("drm/i915/gvt: Dmabuf support for GVT-g") Cc: stable@vger.kernel.org Signed-off-by: Tina Zhang tina.zhang@intel.com Signed-off-by: Zhenyu Wang zhenyuw@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/i915/gvt/dmabuf.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/gpu/drm/i915/gvt/dmabuf.c +++ b/drivers/gpu/drm/i915/gvt/dmabuf.c @@ -323,6 +323,7 @@ static void update_fb_info(struct vfio_d struct intel_vgpu_fb_info *fb_info) { gvt_dmabuf->drm_format = fb_info->drm_format; + gvt_dmabuf->drm_format_mod = fb_info->drm_format_mod; gvt_dmabuf->width = fb_info->width; gvt_dmabuf->height = fb_info->height; gvt_dmabuf->stride = fb_info->stride;
4.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jani Nikula jani.nikula@intel.com
commit a3520b8992e57bc94ab6ec9f95f09c6c932555fd upstream.
The VBT contains the DDC pin to use for specific ports. Alas, sometimes the field appears to contain bogus data, and while we check for it later on in intel_gmbus_get_adapter() we fail to check the returned NULL on errors. Oops results.
The simplest approach seems to be to catch and ignore the bogus DDC pins already at the VBT parsing phase, reverting to fixed per port default pins. This doesn't guarantee display working, but at least it prevents the oops. And we continue to be fuzzed by VBT.
One affected machine is Dell Latitude 5590 where a BIOS upgrade added invalid DDC pins.
Typical backtrace:
[ 35.461411] WARN_ON(!intel_gmbus_is_valid_pin(dev_priv, pin)) [ 35.461432] WARNING: CPU: 6 PID: 411 at drivers/gpu/drm/i915/intel_i2c.c:844 intel_gmbus_get_adapter+0x32/0x37 [i915] [ 35.461437] Modules linked in: i915 ahci libahci dm_snapshot dm_bufio dm_raid raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx [ 35.461445] CPU: 6 PID: 411 Comm: kworker/u16:2 Not tainted 4.16.0-rc7.x64-g1cda370ffded #1 [ 35.461447] Hardware name: Dell Inc. Latitude 5590/0MM81M, BIOS 1.1.9 03/13/2018 [ 35.461450] Workqueue: events_unbound async_run_entry_fn [ 35.461465] RIP: 0010:intel_gmbus_get_adapter+0x32/0x37 [i915] [ 35.461467] RSP: 0018:ffff9b4e43d47c40 EFLAGS: 00010286 [ 35.461469] RAX: 0000000000000000 RBX: ffff98f90639f800 RCX: ffffffffae051960 [ 35.461471] RDX: 0000000000000001 RSI: 0000000000000092 RDI: 0000000000000246 [ 35.461472] RBP: ffff98f905410000 R08: 0000004d062a83f6 R09: 00000000000003bd [ 35.461474] R10: 0000000000000031 R11: ffffffffad4eda58 R12: ffff98f905410000 [ 35.461475] R13: ffff98f9064c1000 R14: ffff9b4e43d47cf0 R15: ffff98f905410000 [ 35.461477] FS: 0000000000000000(0000) GS:ffff98f92e580000(0000) knlGS:0000000000000000 [ 35.461479] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 35.461481] CR2: 00007f5682359008 CR3: 00000001b700c005 CR4: 00000000003606e0 [ 35.461483] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 35.461484] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 35.461486] Call Trace: [ 35.461501] intel_hdmi_set_edid+0x37/0x27f [i915] [ 35.461515] intel_hdmi_detect+0x7c/0x97 [i915] [ 35.461518] drm_helper_probe_single_connector_modes+0xe1/0x6c0 [ 35.461521] drm_setup_crtcs+0x129/0xa6a [ 35.461523] ? __switch_to_asm+0x34/0x70 [ 35.461525] ? __switch_to_asm+0x34/0x70 [ 35.461527] ? __switch_to_asm+0x40/0x70 [ 35.461528] ? __switch_to_asm+0x34/0x70 [ 35.461529] ? __switch_to_asm+0x40/0x70 [ 35.461531] ? __switch_to_asm+0x34/0x70 [ 35.461532] ? __switch_to_asm+0x40/0x70 [ 35.461534] ? __switch_to_asm+0x34/0x70 [ 35.461536] __drm_fb_helper_initial_config_and_unlock+0x34/0x46f [ 35.461538] ? __switch_to_asm+0x40/0x70 [ 35.461541] ? _cond_resched+0x10/0x33 [ 35.461557] intel_fbdev_initial_config+0xf/0x1c [i915] [ 35.461560] async_run_entry_fn+0x2e/0xf5 [ 35.461563] process_one_work+0x15b/0x364 [ 35.461565] worker_thread+0x2c/0x3a0 [ 35.461567] ? process_one_work+0x364/0x364 [ 35.461568] kthread+0x10c/0x122 [ 35.461570] ? _kthread_create_on_node+0x5d/0x5d [ 35.461572] ret_from_fork+0x35/0x40 [ 35.461574] Code: 74 16 89 f6 48 8d 04 b6 48 c1 e0 05 48 29 f0 48 8d 84 c7 e8 11 00 00 c3 48 c7 c6 b0 19 1e c0 48 c7 c7 64 8a 1c c0 e8 47 88 ed ec <0f> 0b 31 c0 c3 8b 87 a4 04 00 00 80 e4 fc 09 c6 89 b7 a4 04 00 [ 35.461604] WARNING: CPU: 6 PID: 411 at drivers/gpu/drm/i915/intel_i2c.c:844 intel_gmbus_get_adapter+0x32/0x37 [i915] [ 35.461606] ---[ end trace 4fe1e63e2dd93373 ]--- [ 35.461609] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 [ 35.461613] IP: i2c_transfer+0x4/0x86 [ 35.461614] PGD 0 P4D 0 [ 35.461616] Oops: 0000 [#1] SMP PTI [ 35.461618] Modules linked in: i915 ahci libahci dm_snapshot dm_bufio dm_raid raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx [ 35.461624] CPU: 6 PID: 411 Comm: kworker/u16:2 Tainted: G W 4.16.0-rc7.x64-g1cda370ffded #1 [ 35.461625] Hardware name: Dell Inc. Latitude 5590/0MM81M, BIOS 1.1.9 03/13/2018 [ 35.461628] Workqueue: events_unbound async_run_entry_fn [ 35.461630] RIP: 0010:i2c_transfer+0x4/0x86 [ 35.461631] RSP: 0018:ffff9b4e43d47b30 EFLAGS: 00010246 [ 35.461633] RAX: ffff9b4e43d47b6e RBX: 0000000000000005 RCX: 0000000000000001 [ 35.461635] RDX: 0000000000000002 RSI: ffff9b4e43d47b80 RDI: 0000000000000000 [ 35.461636] RBP: ffff9b4e43d47bd8 R08: 0000004d062a83f6 R09: 00000000000003bd [ 35.461638] R10: 0000000000000031 R11: ffffffffad4eda58 R12: 0000000000000002 [ 35.461639] R13: 0000000000000001 R14: ffff9b4e43d47b6f R15: ffff9b4e43d47c07 [ 35.461641] FS: 0000000000000000(0000) GS:ffff98f92e580000(0000) knlGS:0000000000000000 [ 35.461643] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 35.461645] CR2: 0000000000000010 CR3: 00000001b700c005 CR4: 00000000003606e0 [ 35.461646] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 35.461647] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 35.461649] Call Trace: [ 35.461652] drm_do_probe_ddc_edid+0xb3/0x128 [ 35.461654] drm_get_edid+0xe5/0x38d [ 35.461669] intel_hdmi_set_edid+0x45/0x27f [i915] [ 35.461684] intel_hdmi_detect+0x7c/0x97 [i915] [ 35.461687] drm_helper_probe_single_connector_modes+0xe1/0x6c0 [ 35.461689] drm_setup_crtcs+0x129/0xa6a [ 35.461691] ? __switch_to_asm+0x34/0x70 [ 35.461693] ? __switch_to_asm+0x34/0x70 [ 35.461694] ? __switch_to_asm+0x40/0x70 [ 35.461696] ? __switch_to_asm+0x34/0x70 [ 35.461697] ? __switch_to_asm+0x40/0x70 [ 35.461698] ? __switch_to_asm+0x34/0x70 [ 35.461700] ? __switch_to_asm+0x40/0x70 [ 35.461701] ? __switch_to_asm+0x34/0x70 [ 35.461703] __drm_fb_helper_initial_config_and_unlock+0x34/0x46f [ 35.461705] ? __switch_to_asm+0x40/0x70 [ 35.461707] ? _cond_resched+0x10/0x33 [ 35.461724] intel_fbdev_initial_config+0xf/0x1c [i915] [ 35.461727] async_run_entry_fn+0x2e/0xf5 [ 35.461729] process_one_work+0x15b/0x364 [ 35.461731] worker_thread+0x2c/0x3a0 [ 35.461733] ? process_one_work+0x364/0x364 [ 35.461734] kthread+0x10c/0x122 [ 35.461736] ? _kthread_create_on_node+0x5d/0x5d [ 35.461738] ret_from_fork+0x35/0x40 [ 35.461739] Code: 5c fa e1 ad 48 89 df e8 ea fb ff ff e9 2a ff ff ff 0f 1f 44 00 00 31 c0 e9 43 fd ff ff 31 c0 45 31 e4 e9 c5 fd ff ff 41 54 55 53 <48> 8b 47 10 48 83 78 10 00 74 70 41 89 d4 48 89 f5 48 89 fb 65 [ 35.461756] RIP: i2c_transfer+0x4/0x86 RSP: ffff9b4e43d47b30 [ 35.461757] CR2: 0000000000000010 [ 35.461759] ---[ end trace 4fe1e63e2dd93374 ]---
Based on a patch by Fei Li.
v2: s/reverting/sticking/ (Chris)
Cc: stable@vger.kernel.org Cc: Fei Li fei.li@intel.com Co-developed-by: Fei Li fei.li@intel.com Reported-by: Pavel Nakonechnyi zorg1331@gmail.com Reported-and-tested-by: Seweryn Kokot sewkokot@gmail.com Reported-and-tested-by: Laszlo Valko valko@linux.karinthy.hu Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105549 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105961 Reviewed-by: Chris Wilson chris@chris-wilson.co.uk Signed-off-by: Jani Nikula jani.nikula@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20180411131519.9091-1-jani.nik... (cherry picked from commit f212bf9abe5de9f938fecea7df07046e74052dde) Signed-off-by: Joonas Lahtinen joonas.lahtinen@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/i915/intel_bios.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
--- a/drivers/gpu/drm/i915/intel_bios.c +++ b/drivers/gpu/drm/i915/intel_bios.c @@ -1255,7 +1255,6 @@ static void parse_ddi_port(struct drm_i9 return;
aux_channel = child->aux_channel; - ddc_pin = child->ddc_pin;
is_dvi = child->device_type & DEVICE_TYPE_TMDS_DVI_SIGNALING; is_dp = child->device_type & DEVICE_TYPE_DISPLAYPORT_OUTPUT; @@ -1302,9 +1301,15 @@ static void parse_ddi_port(struct drm_i9 DRM_DEBUG_KMS("Port %c is internal DP\n", port_name(port));
if (is_dvi) { - info->alternate_ddc_pin = map_ddc_pin(dev_priv, ddc_pin); - - sanitize_ddc_pin(dev_priv, port); + ddc_pin = map_ddc_pin(dev_priv, child->ddc_pin); + if (intel_gmbus_is_valid_pin(dev_priv, ddc_pin)) { + info->alternate_ddc_pin = ddc_pin; + sanitize_ddc_pin(dev_priv, port); + } else { + DRM_DEBUG_KMS("Port %c has invalid DDC pin %d, " + "sticking to defaults\n", + port_name(port), ddc_pin); + } }
if (is_dp) {
4.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gaurav K Singh gaurav.k.singh@intel.com
commit b4615730530be85fc45ab4631c2ad6d8e2d0b97d upstream.
On Geminilake, sometimes audio card is not getting detected after reboot. This is a spurious issue happening on Geminilake. HW codec and HD audio controller link was going out of sync for which there was a fix in i915 driver but was not getting invoked for GLK. Extending this fix to GLK as well.
Tested by Du,Wenkai on GLK board.
Bspec: 21829
v2: Instead of checking GEN9_BC, BXT and GLK macros, use IS_GEN9 macro (Jani N)
Cc: stable@vger.kernel.org # b651bd2a3ae3 ("drm/i915/audio: Fix audio enumeration issue on BXT") Cc: stable@vger.kernel.org Signed-off-by: Gaurav K Singh gaurav.k.singh@intel.com Reviewed-by: Abhay Kumar abhay.Kumar@intel.com Signed-off-by: Jani Nikula jani.nikula@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/1523989338-29677-1-git-send-em... (cherry picked from commit 8221229046e862977ae93ec9d34aa583fbd10397) Signed-off-by: Joonas Lahtinen joonas.lahtinen@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/i915/intel_audio.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/i915/intel_audio.c +++ b/drivers/gpu/drm/i915/intel_audio.c @@ -729,7 +729,7 @@ static void i915_audio_component_codec_w struct drm_i915_private *dev_priv = kdev_to_i915(kdev); u32 tmp;
- if (!IS_GEN9_BC(dev_priv)) + if (!IS_GEN9(dev_priv)) return;
i915_audio_component_get_power(kdev);
4.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xidong Wang wangxidong_97@163.com
commit fcf1fadf4c65eea6c519c773d2d9901e8ad94f5f upstream.
Along the eb_lookup_vmas() error path, the return value from kmem_cache_alloc() was freed using kfree(). Fix it to use the proper kmem_cache_free() instead.
Fixes: d1b48c1e7184 ("drm/i915: Replace execbuf vma ht with an idr") Signed-off-by: Xidong Wang wangxidong_97@163.com Cc: Chris Wilson chris@chris-wilson.co.uk Cc: Tvrtko Ursulin tvrtko.ursulin@intel.com Cc: stable@vger.kernel.org # v4.14+ Reviewed-by: Chris Wilson chris@chris-wilson.co.uk Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Link: https://patchwork.freedesktop.org/patch/msgid/20180404093824.9313-1-chris@ch... (cherry picked from commit 6be1187dbffa0027ea379c53f7ca0c782515c610) Signed-off-by: Joonas Lahtinen joonas.lahtinen@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -728,7 +728,7 @@ static int eb_lookup_vmas(struct i915_ex
err = radix_tree_insert(handles_vma, handle, vma); if (unlikely(err)) { - kfree(lut); + kmem_cache_free(eb->i915->luts, lut); goto err_obj; }
4.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
commit bd03143007eb9b03a7f2316c677780561b68ba2a upstream.
syszbot reported the following debugobjects splat:
ODEBUG: object is on stack, but not annotated WARNING: CPU: 0 PID: 4185 at lib/debugobjects.c:328
RIP: 0010:debug_object_is_on_stack lib/debugobjects.c:327 [inline] debug_object_init+0x17/0x20 lib/debugobjects.c:391 debug_hrtimer_init kernel/time/hrtimer.c:410 [inline] debug_init kernel/time/hrtimer.c:458 [inline] hrtimer_init+0x8c/0x410 kernel/time/hrtimer.c:1259 alarm_init kernel/time/alarmtimer.c:339 [inline] alarm_timer_nsleep+0x164/0x4d0 kernel/time/alarmtimer.c:787 SYSC_clock_nanosleep kernel/time/posix-timers.c:1226 [inline] SyS_clock_nanosleep+0x235/0x330 kernel/time/posix-timers.c:1204 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7
This happens because the hrtimer for the alarm nanosleep is on stack, but the code does not use the proper debug objects initialization.
Split out the code for the allocated use cases and invoke hrtimer_init_on_stack() for the nanosleep related functions.
Reported-by: syzbot+a3e0726462b2e346a31d@syzkaller.appspotmail.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: John Stultz john.stultz@linaro.org Cc: syzkaller-bugs@googlegroups.com Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1803261528270.1585@nanos.tec.linut... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/time/alarmtimer.c | 34 ++++++++++++++++++++++++++-------- 1 file changed, 26 insertions(+), 8 deletions(-)
--- a/kernel/time/alarmtimer.c +++ b/kernel/time/alarmtimer.c @@ -326,6 +326,17 @@ static int alarmtimer_resume(struct devi } #endif
+static void +__alarm_init(struct alarm *alarm, enum alarmtimer_type type, + enum alarmtimer_restart (*function)(struct alarm *, ktime_t)) +{ + timerqueue_init(&alarm->node); + alarm->timer.function = alarmtimer_fired; + alarm->function = function; + alarm->type = type; + alarm->state = ALARMTIMER_STATE_INACTIVE; +} + /** * alarm_init - Initialize an alarm structure * @alarm: ptr to alarm to be initialized @@ -335,13 +346,9 @@ static int alarmtimer_resume(struct devi void alarm_init(struct alarm *alarm, enum alarmtimer_type type, enum alarmtimer_restart (*function)(struct alarm *, ktime_t)) { - timerqueue_init(&alarm->node); hrtimer_init(&alarm->timer, alarm_bases[type].base_clockid, - HRTIMER_MODE_ABS); - alarm->timer.function = alarmtimer_fired; - alarm->function = function; - alarm->type = type; - alarm->state = ALARMTIMER_STATE_INACTIVE; + HRTIMER_MODE_ABS); + __alarm_init(alarm, type, function); } EXPORT_SYMBOL_GPL(alarm_init);
@@ -719,6 +726,8 @@ static int alarmtimer_do_nsleep(struct a
__set_current_state(TASK_RUNNING);
+ destroy_hrtimer_on_stack(&alarm->timer); + if (!alarm->data) return 0;
@@ -740,6 +749,15 @@ static int alarmtimer_do_nsleep(struct a return -ERESTART_RESTARTBLOCK; }
+static void +alarm_init_on_stack(struct alarm *alarm, enum alarmtimer_type type, + enum alarmtimer_restart (*function)(struct alarm *, ktime_t)) +{ + hrtimer_init_on_stack(&alarm->timer, alarm_bases[type].base_clockid, + HRTIMER_MODE_ABS); + __alarm_init(alarm, type, function); +} + /** * alarm_timer_nsleep_restart - restartblock alarmtimer nsleep * @restart: ptr to restart block @@ -752,7 +770,7 @@ static long __sched alarm_timer_nsleep_r ktime_t exp = restart->nanosleep.expires; struct alarm alarm;
- alarm_init(&alarm, type, alarmtimer_nsleep_wakeup); + alarm_init_on_stack(&alarm, type, alarmtimer_nsleep_wakeup);
return alarmtimer_do_nsleep(&alarm, exp, type); } @@ -784,7 +802,7 @@ static int alarm_timer_nsleep(const cloc if (!capable(CAP_WAKE_ALARM)) return -EPERM;
- alarm_init(&alarm, type, alarmtimer_nsleep_wakeup); + alarm_init_on_stack(&alarm, type, alarmtimer_nsleep_wakeup);
exp = timespec64_to_ktime(*tsreq); /* Convert (if necessary) to absolute time */
4.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp
commit 8e04944f0ea8b838399049bdcda920ab36ae3b04 upstream.
syzbot is catching so many bugs triggered by commit 9ee332d99e4d5a97 ("sget(): handle failures of register_shrinker()"). That commit expected that calling kill_sb() from deactivate_locked_super() without successful fill_super() is safe, but the reality was different; some callers assign attributes which are needed for kill_sb() after sget() succeeds.
For example, [1] is a report where sb->s_mode (which seems to be either FMODE_READ | FMODE_EXCL | FMODE_WRITE or FMODE_READ | FMODE_EXCL) is not assigned unless sget() succeeds. But it does not worth complicate sget() so that register_shrinker() failure path can safely call kill_block_super() via kill_sb(). Making alloc_super() fail if memory allocation for register_shrinker() failed is much simpler. Let's avoid calling deactivate_locked_super() from sget_userns() by preallocating memory for the shrinker and making register_shrinker() in sget_userns() never fail.
[1] https://syzkaller.appspot.com/bug?id=588996a25a2587be2e3a54e8646728fb9cae44e...
Signed-off-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Reported-by: syzbot syzbot+5a170e19c963a2e0df79@syzkaller.appspotmail.com Cc: Al Viro viro@zeniv.linux.org.uk Cc: Michal Hocko mhocko@suse.com Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/super.c | 9 ++++----- include/linux/shrinker.h | 7 +++++-- mm/vmscan.c | 21 ++++++++++++++++++++- 3 files changed, 29 insertions(+), 8 deletions(-)
--- a/fs/super.c +++ b/fs/super.c @@ -166,6 +166,7 @@ static void destroy_unused_super(struct security_sb_free(s); put_user_ns(s->s_user_ns); kfree(s->s_subtype); + free_prealloced_shrinker(&s->s_shrink); /* no delays needed */ destroy_super_work(&s->destroy_work); } @@ -251,6 +252,8 @@ static struct super_block *alloc_super(s s->s_shrink.count_objects = super_cache_count; s->s_shrink.batch = 1024; s->s_shrink.flags = SHRINKER_NUMA_AWARE | SHRINKER_MEMCG_AWARE; + if (prealloc_shrinker(&s->s_shrink)) + goto fail; return s;
fail: @@ -517,11 +520,7 @@ retry: hlist_add_head(&s->s_instances, &type->fs_supers); spin_unlock(&sb_lock); get_filesystem(type); - err = register_shrinker(&s->s_shrink); - if (err) { - deactivate_locked_super(s); - s = ERR_PTR(err); - } + register_shrinker_prepared(&s->s_shrink); return s; }
--- a/include/linux/shrinker.h +++ b/include/linux/shrinker.h @@ -75,6 +75,9 @@ struct shrinker { #define SHRINKER_NUMA_AWARE (1 << 0) #define SHRINKER_MEMCG_AWARE (1 << 1)
-extern int register_shrinker(struct shrinker *); -extern void unregister_shrinker(struct shrinker *); +extern int prealloc_shrinker(struct shrinker *shrinker); +extern void register_shrinker_prepared(struct shrinker *shrinker); +extern int register_shrinker(struct shrinker *shrinker); +extern void unregister_shrinker(struct shrinker *shrinker); +extern void free_prealloced_shrinker(struct shrinker *shrinker); #endif --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -258,7 +258,7 @@ unsigned long lruvec_lru_size(struct lru /* * Add a shrinker callback to be called from the vm. */ -int register_shrinker(struct shrinker *shrinker) +int prealloc_shrinker(struct shrinker *shrinker) { size_t size = sizeof(*shrinker->nr_deferred);
@@ -268,10 +268,29 @@ int register_shrinker(struct shrinker *s shrinker->nr_deferred = kzalloc(size, GFP_KERNEL); if (!shrinker->nr_deferred) return -ENOMEM; + return 0; +} + +void free_prealloced_shrinker(struct shrinker *shrinker) +{ + kfree(shrinker->nr_deferred); + shrinker->nr_deferred = NULL; +}
+void register_shrinker_prepared(struct shrinker *shrinker) +{ down_write(&shrinker_rwsem); list_add_tail(&shrinker->list, &shrinker_list); up_write(&shrinker_rwsem); +} + +int register_shrinker(struct shrinker *shrinker) +{ + int err = prealloc_shrinker(shrinker); + + if (err) + return err; + register_shrinker_prepared(shrinker); return 0; } EXPORT_SYMBOL(register_shrinker);
4.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
commit 19926968ea86a286aa6fbea16ee3f2e7442f10f0 upstream.
Arbitrary limit, however, this still allows huge rulesets (> 1 million rules). This helps with automated fuzzer as it prevents oom-killer invocation.
Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/x_tables.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -40,6 +40,7 @@ MODULE_AUTHOR("Harald Welte <laforge@net MODULE_DESCRIPTION("{ip,ip6,arp,eb}_tables backend module");
#define XT_PCPU_BLOCK_SIZE 4096 +#define XT_MAX_TABLE_SIZE (512 * 1024 * 1024)
struct compat_delta { unsigned int offset; /* offset in kernel */ @@ -1029,7 +1030,7 @@ struct xt_table_info *xt_alloc_table_inf struct xt_table_info *info = NULL; size_t sz = sizeof(*info) + size;
- if (sz < sizeof(*info)) + if (sz < sizeof(*info) || sz >= XT_MAX_TABLE_SIZE) return NULL;
/* __GFP_NORETRY is not fully supported by kvmalloc but it should
4.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
commit c84ca954ac9fa67a6ce27f91f01e4451c74fd8f6 upstream.
allows to have size checks in a single spot. This is supposed to reduce oom situations when fuzz-testing xtables.
Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/netfilter/x_tables.h | 1 + net/ipv4/netfilter/arp_tables.c | 2 +- net/ipv4/netfilter/ip_tables.c | 2 +- net/ipv6/netfilter/ip6_tables.c | 2 +- net/netfilter/x_tables.c | 15 +++++++++++++++ 5 files changed, 19 insertions(+), 3 deletions(-)
--- a/include/linux/netfilter/x_tables.h +++ b/include/linux/netfilter/x_tables.h @@ -301,6 +301,7 @@ int xt_data_to_user(void __user *dst, co
void *xt_copy_counters_from_user(const void __user *user, unsigned int len, struct xt_counters_info *info, bool compat); +struct xt_counters *xt_counters_alloc(unsigned int counters);
struct xt_table *xt_register_table(struct net *net, const struct xt_table *table, --- a/net/ipv4/netfilter/arp_tables.c +++ b/net/ipv4/netfilter/arp_tables.c @@ -895,7 +895,7 @@ static int __do_replace(struct net *net, struct arpt_entry *iter;
ret = 0; - counters = vzalloc(num_counters * sizeof(struct xt_counters)); + counters = xt_counters_alloc(num_counters); if (!counters) { ret = -ENOMEM; goto out; --- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -1057,7 +1057,7 @@ __do_replace(struct net *net, const char struct ipt_entry *iter;
ret = 0; - counters = vzalloc(num_counters * sizeof(struct xt_counters)); + counters = xt_counters_alloc(num_counters); if (!counters) { ret = -ENOMEM; goto out; --- a/net/ipv6/netfilter/ip6_tables.c +++ b/net/ipv6/netfilter/ip6_tables.c @@ -1075,7 +1075,7 @@ __do_replace(struct net *net, const char struct ip6t_entry *iter;
ret = 0; - counters = vzalloc(num_counters * sizeof(struct xt_counters)); + counters = xt_counters_alloc(num_counters); if (!counters) { ret = -ENOMEM; goto out; --- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -1199,6 +1199,21 @@ static int xt_jumpstack_alloc(struct xt_ return 0; }
+struct xt_counters *xt_counters_alloc(unsigned int counters) +{ + struct xt_counters *mem; + + if (counters == 0 || counters > INT_MAX / sizeof(*mem)) + return NULL; + + counters *= sizeof(*mem); + if (counters > XT_MAX_TABLE_SIZE) + return NULL; + + return vzalloc(counters); +} +EXPORT_SYMBOL(xt_counters_alloc); + struct xt_table_info * xt_replace_table(struct xt_table *table, unsigned int num_counters,
4.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
commit 9782a11efc072faaf91d4aa60e9d23553f918029 upstream.
should have no impact, function still always returns 0. This patch is only to ease review.
Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/netfilter/x_tables.h | 2 +- net/bridge/netfilter/ebtables.c | 10 ++++++++-- net/ipv4/netfilter/arp_tables.c | 10 +++++++--- net/ipv4/netfilter/ip_tables.c | 8 ++++++-- net/ipv6/netfilter/ip6_tables.c | 10 +++++++--- net/netfilter/x_tables.c | 4 +++- 6 files changed, 32 insertions(+), 12 deletions(-)
--- a/include/linux/netfilter/x_tables.h +++ b/include/linux/netfilter/x_tables.h @@ -510,7 +510,7 @@ void xt_compat_unlock(u_int8_t af);
int xt_compat_add_offset(u_int8_t af, unsigned int offset, int delta); void xt_compat_flush_offsets(u_int8_t af); -void xt_compat_init_offsets(u_int8_t af, unsigned int number); +int xt_compat_init_offsets(u8 af, unsigned int number); int xt_compat_calc_jump(u_int8_t af, unsigned int offset);
int xt_compat_match_offset(const struct xt_match *match); --- a/net/bridge/netfilter/ebtables.c +++ b/net/bridge/netfilter/ebtables.c @@ -1821,10 +1821,14 @@ static int compat_table_info(const struc { unsigned int size = info->entries_size; const void *entries = info->entries; + int ret;
newinfo->entries_size = size;
- xt_compat_init_offsets(NFPROTO_BRIDGE, info->nentries); + ret = xt_compat_init_offsets(NFPROTO_BRIDGE, info->nentries); + if (ret) + return ret; + return EBT_ENTRY_ITERATE(entries, size, compat_calc_entry, info, entries, newinfo); } @@ -2268,7 +2272,9 @@ static int compat_do_replace(struct net
xt_compat_lock(NFPROTO_BRIDGE);
- xt_compat_init_offsets(NFPROTO_BRIDGE, tmp.nentries); + ret = xt_compat_init_offsets(NFPROTO_BRIDGE, tmp.nentries); + if (ret < 0) + goto out_unlock; ret = compat_copy_entries(entries_tmp, tmp.entries_size, &state); if (ret < 0) goto out_unlock; --- a/net/ipv4/netfilter/arp_tables.c +++ b/net/ipv4/netfilter/arp_tables.c @@ -781,7 +781,9 @@ static int compat_table_info(const struc memcpy(newinfo, info, offsetof(struct xt_table_info, entries)); newinfo->initial_entries = 0; loc_cpu_entry = info->entries; - xt_compat_init_offsets(NFPROTO_ARP, info->number); + ret = xt_compat_init_offsets(NFPROTO_ARP, info->number); + if (ret) + return ret; xt_entry_foreach(iter, loc_cpu_entry, info->size) { ret = compat_calc_entry(iter, info, loc_cpu_entry, newinfo); if (ret != 0) @@ -1167,7 +1169,7 @@ static int translate_compat_table(struct struct compat_arpt_entry *iter0; struct arpt_replace repl; unsigned int size; - int ret = 0; + int ret;
info = *pinfo; entry0 = *pentry0; @@ -1176,7 +1178,9 @@ static int translate_compat_table(struct
j = 0; xt_compat_lock(NFPROTO_ARP); - xt_compat_init_offsets(NFPROTO_ARP, compatr->num_entries); + ret = xt_compat_init_offsets(NFPROTO_ARP, compatr->num_entries); + if (ret) + goto out_unlock; /* Walk through entries, checking offsets. */ xt_entry_foreach(iter0, entry0, compatr->size) { ret = check_compat_entry_size_and_hooks(iter0, info, &size, --- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -945,7 +945,9 @@ static int compat_table_info(const struc memcpy(newinfo, info, offsetof(struct xt_table_info, entries)); newinfo->initial_entries = 0; loc_cpu_entry = info->entries; - xt_compat_init_offsets(AF_INET, info->number); + ret = xt_compat_init_offsets(AF_INET, info->number); + if (ret) + return ret; xt_entry_foreach(iter, loc_cpu_entry, info->size) { ret = compat_calc_entry(iter, info, loc_cpu_entry, newinfo); if (ret != 0) @@ -1418,7 +1420,9 @@ translate_compat_table(struct net *net,
j = 0; xt_compat_lock(AF_INET); - xt_compat_init_offsets(AF_INET, compatr->num_entries); + ret = xt_compat_init_offsets(AF_INET, compatr->num_entries); + if (ret) + goto out_unlock; /* Walk through entries, checking offsets. */ xt_entry_foreach(iter0, entry0, compatr->size) { ret = check_compat_entry_size_and_hooks(iter0, info, &size, --- a/net/ipv6/netfilter/ip6_tables.c +++ b/net/ipv6/netfilter/ip6_tables.c @@ -962,7 +962,9 @@ static int compat_table_info(const struc memcpy(newinfo, info, offsetof(struct xt_table_info, entries)); newinfo->initial_entries = 0; loc_cpu_entry = info->entries; - xt_compat_init_offsets(AF_INET6, info->number); + ret = xt_compat_init_offsets(AF_INET6, info->number); + if (ret) + return ret; xt_entry_foreach(iter, loc_cpu_entry, info->size) { ret = compat_calc_entry(iter, info, loc_cpu_entry, newinfo); if (ret != 0) @@ -1425,7 +1427,7 @@ translate_compat_table(struct net *net, struct compat_ip6t_entry *iter0; struct ip6t_replace repl; unsigned int size; - int ret = 0; + int ret;
info = *pinfo; entry0 = *pentry0; @@ -1434,7 +1436,9 @@ translate_compat_table(struct net *net,
j = 0; xt_compat_lock(AF_INET6); - xt_compat_init_offsets(AF_INET6, compatr->num_entries); + ret = xt_compat_init_offsets(AF_INET6, compatr->num_entries); + if (ret) + goto out_unlock; /* Walk through entries, checking offsets. */ xt_entry_foreach(iter0, entry0, compatr->size) { ret = check_compat_entry_size_and_hooks(iter0, info, &size, --- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -604,10 +604,12 @@ int xt_compat_calc_jump(u_int8_t af, uns } EXPORT_SYMBOL_GPL(xt_compat_calc_jump);
-void xt_compat_init_offsets(u_int8_t af, unsigned int number) +int xt_compat_init_offsets(u8 af, unsigned int number) { xt[af].number = number; xt[af].cur = 0; + + return 0; } EXPORT_SYMBOL(xt_compat_init_offsets);
4.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
commit 7d7d7e02111e9a4dc9d0658597f528f815d820fd upstream.
no need to bother even trying to allocating huge compat offset arrays, such ruleset is rejected later on anyway becaus we refuse to allocate overly large rule blobs.
However, compat translation happens before blob allocation, so we should add a check there too.
This is supposed to help with fuzzing by avoiding oom-killer.
Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/x_tables.c | 26 ++++++++++++++++++-------- 1 file changed, 18 insertions(+), 8 deletions(-)
--- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -554,14 +554,8 @@ int xt_compat_add_offset(u_int8_t af, un { struct xt_af *xp = &xt[af];
- if (!xp->compat_tab) { - if (!xp->number) - return -EINVAL; - xp->compat_tab = vmalloc(sizeof(struct compat_delta) * xp->number); - if (!xp->compat_tab) - return -ENOMEM; - xp->cur = 0; - } + if (WARN_ON(!xp->compat_tab)) + return -ENOMEM;
if (xp->cur >= xp->number) return -EINVAL; @@ -606,6 +600,22 @@ EXPORT_SYMBOL_GPL(xt_compat_calc_jump);
int xt_compat_init_offsets(u8 af, unsigned int number) { + size_t mem; + + if (!number || number > (INT_MAX / sizeof(struct compat_delta))) + return -EINVAL; + + if (WARN_ON(xt[af].compat_tab)) + return -EINVAL; + + mem = sizeof(struct compat_delta) * number; + if (mem > XT_MAX_TABLE_SIZE) + return -ENOMEM; + + xt[af].compat_tab = vmalloc(mem); + if (!xt[af].compat_tab) + return -ENOMEM; + xt[af].number = number; xt[af].cur = 0;
4.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
commit 9d5c12a7c08f67999772065afd50fb222072114e upstream.
This is a very conservative limit (134217728 rules), but good enough to not trigger frequent oom from syzkaller.
Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/x_tables.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -818,6 +818,9 @@ EXPORT_SYMBOL(xt_check_entry_offsets); */ unsigned int *xt_alloc_entry_offsets(unsigned int size) { + if (size > XT_MAX_TABLE_SIZE / sizeof(unsigned int)) + return NULL; + return kvmalloc_array(size, sizeof(unsigned int), GFP_KERNEL | __GFP_ZERO);
}
4.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiri Olsa jolsa@kernel.org
commit 5af44ca53d019de47efe6dbc4003dd518e5197ed upstream.
The syzbot hit KASAN bug in perf_callchain_store having the entry stored behind the allocated bounds [1].
We miss the sample_max_stack check for the initial event that allocates callchain buffers. This missing check allows to create an event with sample_max_stack value bigger than the global sysctl maximum:
# sysctl -a | grep perf_event_max_stack kernel.perf_event_max_stack = 127
# perf record -vv -C 1 -e cycles/max-stack=256/ kill ... perf_event_attr: size 112 ... sample_max_stack 256 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 4
Note the '-C 1', which forces perf record to create just single event. Otherwise it opens event for every cpu, then the sample_max_stack check fails on the second event and all's fine.
The fix is to run the sample_max_stack check also for the first event with callchains.
[1] https://marc.info/?l=linux-kernel&m=152352732920874&w=2
Reported-by: syzbot+7c449856228b63ac951e@syzkaller.appspotmail.com Signed-off-by: Jiri Olsa jolsa@kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Andi Kleen andi@firstfloor.org Cc: H. Peter Anvin hpa@zytor.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: syzkaller-bugs@googlegroups.com Cc: x86@kernel.org Fixes: 97c79a38cd45 ("perf core: Per event callchain limit") Link: http://lkml.kernel.org/r/20180415092352.12403-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/events/callchain.c | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-)
--- a/kernel/events/callchain.c +++ b/kernel/events/callchain.c @@ -119,19 +119,22 @@ int get_callchain_buffers(int event_max_ goto exit; }
+ /* + * If requesting per event more than the global cap, + * return a different error to help userspace figure + * this out. + * + * And also do it here so that we have &callchain_mutex held. + */ + if (event_max_stack > sysctl_perf_event_max_stack) { + err = -EOVERFLOW; + goto exit; + } + if (count > 1) { /* If the allocation failed, give up */ if (!callchain_cpus_entries) err = -ENOMEM; - /* - * If requesting per event more than the global cap, - * return a different error to help userspace figure - * this out. - * - * And also do it here so that we have &callchain_mutex held. - */ - if (event_max_stack > sysctl_perf_event_max_stack) - err = -EOVERFLOW; goto exit; }
4.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiri Olsa jolsa@kernel.org
commit 78b562fbfa2cf0a9fcb23c3154756b690f4905c1 upstream.
Return immediately when we find issue in the user stack checks. The error value could get overwritten by following check for PERF_SAMPLE_REGS_INTR.
Signed-off-by: Jiri Olsa jolsa@kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Andi Kleen andi@firstfloor.org Cc: H. Peter Anvin hpa@zytor.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Cc: Thomas Gleixner tglx@linutronix.de Cc: syzkaller-bugs@googlegroups.com Cc: x86@kernel.org Fixes: 60e2364e60e8 ("perf: Add ability to sample machine state on interrupt") Link: http://lkml.kernel.org/r/20180415092352.12403-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/events/core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -9730,9 +9730,9 @@ static int perf_copy_attr(struct perf_ev * __u16 sample size limit. */ if (attr->sample_stack_user >= USHRT_MAX) - ret = -EINVAL; + return -EINVAL; else if (!IS_ALIGNED(attr->sample_stack_user, sizeof(u64))) - ret = -EINVAL; + return -EINVAL; }
if (attr->sample_type & PERF_SAMPLE_REGS_INTR)
4.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Leon Romanovsky leonro@mellanox.com
commit 75a4598209cbe45540baa316c3b51d9db222e96e upstream.
mlx5 modify_qp() relies on FW that the error will be thrown if wrong state is supplied. The missing check in FW causes the following crash while using XRC_TGT QPs.
[ 14.769632] BUG: unable to handle kernel NULL pointer dereference at (null) [ 14.771085] IP: mlx5_ib_modify_qp+0xf60/0x13f0 [ 14.771894] PGD 800000001472e067 P4D 800000001472e067 PUD 14529067 PMD 0 [ 14.773126] Oops: 0002 [#1] SMP PTI [ 14.773763] CPU: 0 PID: 365 Comm: ubsan Not tainted 4.16.0-rc1-00038-g8151138c0793 #119 [ 14.775192] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 [ 14.777522] RIP: 0010:mlx5_ib_modify_qp+0xf60/0x13f0 [ 14.778417] RSP: 0018:ffffbf48001c7bd8 EFLAGS: 00010246 [ 14.779346] RAX: 0000000000000000 RBX: ffff9a8f9447d400 RCX: 0000000000000000 [ 14.780643] RDX: 0000000000000000 RSI: 000000000000000a RDI: 0000000000000000 [ 14.781930] RBP: 0000000000000000 R08: 00000000000217b0 R09: ffffffffbc9c1504 [ 14.783214] R10: fffff4a180519480 R11: ffff9a8f94523600 R12: ffff9a8f9493e240 [ 14.784507] R13: ffff9a8f9447d738 R14: 000000000000050a R15: 0000000000000000 [ 14.785800] FS: 00007f545b466700(0000) GS:ffff9a8f9fc00000(0000) knlGS:0000000000000000 [ 14.787073] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 14.787792] CR2: 0000000000000000 CR3: 00000000144be000 CR4: 00000000000006b0 [ 14.788689] Call Trace: [ 14.789007] _ib_modify_qp+0x71/0x120 [ 14.789475] modify_qp.isra.20+0x207/0x2f0 [ 14.790010] ib_uverbs_modify_qp+0x90/0xe0 [ 14.790532] ib_uverbs_write+0x1d2/0x3c0 [ 14.791049] ? __handle_mm_fault+0x93c/0xe40 [ 14.791644] __vfs_write+0x36/0x180 [ 14.792096] ? handle_mm_fault+0xc1/0x210 [ 14.792601] vfs_write+0xad/0x1e0 [ 14.793018] SyS_write+0x52/0xc0 [ 14.793422] do_syscall_64+0x75/0x180 [ 14.793888] entry_SYSCALL_64_after_hwframe+0x21/0x86 [ 14.794527] RIP: 0033:0x7f545ad76099 [ 14.794975] RSP: 002b:00007ffd78787468 EFLAGS: 00000287 ORIG_RAX: 0000000000000001 [ 14.795958] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f545ad76099 [ 14.797075] RDX: 0000000000000078 RSI: 0000000020009000 RDI: 0000000000000003 [ 14.798140] RBP: 00007ffd78787470 R08: 00007ffd78787480 R09: 00007ffd78787480 [ 14.799207] R10: 00007ffd78787480 R11: 0000000000000287 R12: 00005599ada98760 [ 14.800277] R13: 00007ffd78787560 R14: 0000000000000000 R15: 0000000000000000 [ 14.801341] Code: 4c 8b 1c 24 48 8b 83 70 02 00 00 48 c7 83 cc 02 00 00 00 00 00 00 48 c7 83 24 03 00 00 00 00 00 00 c7 83 2c 03 00 00 00 00 00 00 <c7> 00 00 00 00 00 48 8b 83 70 02 00 00 c7 40 04 00 00 00 00 4c [ 14.804012] RIP: mlx5_ib_modify_qp+0xf60/0x13f0 RSP: ffffbf48001c7bd8 [ 14.804838] CR2: 0000000000000000 [ 14.805288] ---[ end trace 3f1da0df5c8b7c37 ]---
Cc: syzkaller syzkaller@googlegroups.com Reported-by: Maor Gottlieb maorg@mellanox.com Signed-off-by: Leon Romanovsky leonro@mellanox.com Signed-off-by: Doug Ledford dledford@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/infiniband/hw/mlx5/qp.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/infiniband/hw/mlx5/qp.c +++ b/drivers/infiniband/hw/mlx5/qp.c @@ -3157,7 +3157,8 @@ static int __mlx5_ib_modify_qp(struct ib * If we moved a kernel QP to RESET, clean up all old CQ * entries and reinitialize the QP. */ - if (new_state == IB_QPS_RESET && !ibqp->uobject) { + if (new_state == IB_QPS_RESET && + !ibqp->uobject && ibqp->qp_type != IB_QPT_XRC_TGT) { mlx5_ib_cq_clean(recv_cq, base->mqp.qpn, ibqp->srq ? to_msrq(ibqp->srq) : NULL); if (send_cq != recv_cq)
4.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson sean.j.christopherson@intel.com
commit 2c151b25441ae5c2da66472abd165af785c9ecd2 upstream.
The bug that led to commit 95e057e25892eaa48cad1e2d637b80d0f1a4fac5 was a benign warning (no adverse affects other than the warning itself) that was detected by syzkaller. Further inspection shows that the WARN_ON in question, in handle_ept_misconfig(), is unnecessary and flawed (this was also briefly discussed in the original patch: https://patchwork.kernel.org/patch/10204649).
* The WARN_ON is unnecessary as kvm_mmu_page_fault() will WARN if reserved bits are set in the SPTEs, i.e. it covers the case where an EPT misconfig occurred because of a KVM bug.
* The WARN_ON is flawed because it will fire on any system error code that is hit while handling the fault, e.g. -ENOMEM can be returned by mmu_topup_memory_caches() while handling a legitmate MMIO EPT misconfig.
The original behavior of returning -EFAULT when userspace munmaps an HVA without first removing the memslot is correct and desirable, i.e. KVM is letting userspace know it has generated a bad address. Returning RET_PF_EMULATE masks the WARN_ON in the EPT misconfig path, but does not fix the underlying bug, i.e. the WARN_ON is bogus.
Furthermore, returning RET_PF_EMULATE has the unwanted side effect of causing KVM to attempt to emulate an instruction on any page fault with an invalid HVA translation, e.g. a not-present EPT violation on a VM_PFNMAP VMA whose fault handler failed to insert a PFN.
* There is no guarantee that the fault is directly related to the instruction, i.e. the fault could have been triggered by a side effect memory access in the guest, e.g. while vectoring a #DB or writing a tracing record. This could cause KVM to effectively mask the fault if KVM doesn't model the behavior leading to the fault, i.e. emulation could succeed and resume the guest.
* If emulation does fail, KVM will return EMULATION_FAILED instead of -EFAULT, which is a red herring as the user will either debug a bogus emulation attempt or scratch their head wondering why we were attempting emulation in the first place.
TL;DR: revert to returning -EFAULT and remove the bogus WARN_ON in handle_ept_misconfig in a future patch.
This reverts commit 95e057e25892eaa48cad1e2d637b80d0f1a4fac5.
Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/mmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -3031,7 +3031,7 @@ static int kvm_handle_bad_page(struct kv return RET_PF_RETRY; }
- return RET_PF_EMULATE; + return -EFAULT; }
static void transparent_hugepage_adjust(struct kvm_vcpu *vcpu,
On Wed, Apr 25, 2018 at 12:33:09PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.16.5 release. There are 26 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri Apr 27 10:33:04 UTC 2018. Anything received after that time might be too late.
For v4.16.4-28-gb04d14a:
Build results: total: 143 pass: 143 fail: 0 Qemu test results: total: 139 pass: 139 fail: 0
Details are available at http://kerneltests.org/builders/.
Guenter
On Wed, Apr 25, 2018 at 08:34:43AM -0700, Guenter Roeck wrote:
On Wed, Apr 25, 2018 at 12:33:09PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.16.5 release. There are 26 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri Apr 27 10:33:04 UTC 2018. Anything received after that time might be too late.
For v4.16.4-28-gb04d14a:
Build results: total: 143 pass: 143 fail: 0 Qemu test results: total: 139 pass: 139 fail: 0
Details are available at http://kerneltests.org/builders/.
Great! Thanks for testing and letting me know.
greg k-h
On 04/25/2018 04:33 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.16.5 release. There are 26 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri Apr 27 10:33:04 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.16.5-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.16.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
thanks, -- Shuah
On Wed, Apr 25, 2018 at 12:36:05PM -0600, Shuah Khan wrote:
On 04/25/2018 04:33 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.16.5 release. There are 26 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri Apr 27 10:33:04 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.16.5-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.16.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Thanks for testing both of these and letting me know.
greg k-h
On Wed, Apr 25, 2018 at 12:33:09PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.16.5 release. There are 26 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri Apr 27 10:33:04 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.16.5-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.16.y and the diffstat can be found below.
Results from Linaro’s test farm. No regressions on arm64, arm and x86_64.
We did find a regression in 4.16.4 that is dragonboard-410c specific that we missed the first time around. Discovered using kselftest/printf.sh:
/sbin/modprobe test_printf [ 22.725551] test_printf: hashing plain 'p' has unexpected format [ 22.726031] test_printf: failed 1 out of 236 tests modprobe: ERROR: could not insert 'test_printf': Invalid argument
We'll bisect it and find the issue but since it was already released, I assume you won't want to hold up 4.16.5 for it.
Summary ------------------------------------------------------------------------
kernel: 4.16.5-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-4.16.y git commit: 9a7eb4ea64f3850a0fabfdf2d95f4c8cbffcb324 git describe: v4.16.4-27-g9a7eb4ea64f3 Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.16-oe/build/v4.16.4-27-...
No regressions (compared to build v4.16.4)
Boards, architectures and test suites: -------------------------------------
dragonboard-410c - arm64 * boot - pass: 18, fail: 2, * kselftest - pass: 41, fail: 1, skip: 26 * libhugetlbfs - pass: 89, fail: 1, skip: 1 * ltp-containers-tests - pass: 64, skip: 17 * ltp-fcntl-locktests-tests - pass: 2, * ltp-filecaps-tests - pass: 2, * ltp-fs-tests - pass: 57, skip: 6 * ltp-fs_bind-tests - pass: 2, * ltp-fs_perms_simple-tests - pass: 19, * ltp-fsx-tests - pass: 2, * ltp-hugetlb-tests - pass: 21, skip: 1 * ltp-io-tests - pass: 3, * ltp-ipc-tests - pass: 9, * ltp-nptl-tests - pass: 2, * ltp-pty-tests - pass: 4, * ltp-sched-tests - pass: 14, * ltp-securebits-tests - pass: 4, * ltp-syscalls-tests - pass: 1017, skip: 133 * ltp-timers-tests - pass: 13,
hi6220-hikey - arm64 * boot - pass: 20, * kselftest - pass: 45, skip: 22 * libhugetlbfs - pass: 90, skip: 1 * ltp-cap_bounds-tests - pass: 2, * ltp-containers-tests - pass: 64, skip: 17 * ltp-fcntl-locktests-tests - pass: 2, * ltp-filecaps-tests - pass: 2, * ltp-fs-tests - pass: 57, skip: 6 * ltp-fs_bind-tests - pass: 2, * ltp-fs_perms_simple-tests - pass: 19, * ltp-fsx-tests - pass: 2, * ltp-hugetlb-tests - pass: 21, skip: 1 * ltp-io-tests - pass: 3, * ltp-ipc-tests - pass: 9, * ltp-math-tests - pass: 11, * ltp-nptl-tests - pass: 2, * ltp-pty-tests - pass: 4, * ltp-sched-tests - pass: 10, skip: 4 * ltp-securebits-tests - pass: 4, * ltp-syscalls-tests - pass: 1016, skip: 134 * ltp-timers-tests - pass: 13,
juno-r2 - arm64 * boot - pass: 20, * libhugetlbfs - pass: 90, skip: 1 * ltp-cap_bounds-tests - pass: 2, * ltp-containers-tests - pass: 64, skip: 17 * ltp-filecaps-tests - pass: 2, * ltp-fs-tests - pass: 57, skip: 6 * ltp-fs_perms_simple-tests - pass: 19, * ltp-fsx-tests - pass: 2, * ltp-hugetlb-tests - pass: 22, * ltp-io-tests - pass: 3, * ltp-ipc-tests - pass: 9, * ltp-math-tests - pass: 11, * ltp-nptl-tests - pass: 2, * ltp-pty-tests - pass: 4, * ltp-sched-tests - pass: 10, skip: 4 * ltp-securebits-tests - pass: 4, * ltp-syscalls-tests - pass: 1017, skip: 133 * ltp-timers-tests - pass: 13,
qemu_arm * boot - pass: 15, fail: 5, * ltp-cap_bounds-tests - pass: 2, * ltp-containers-tests - pass: 62, fail: 2, skip: 17 * ltp-fcntl-locktests-tests - pass: 2, * ltp-filecaps-tests - pass: 2, * ltp-fs-tests - pass: 58, skip: 5 * ltp-fs_bind-tests - pass: 2, * ltp-fsx-tests - pass: 2, * ltp-hugetlb-tests - pass: 21, skip: 1 * ltp-io-tests - pass: 3, * ltp-math-tests - pass: 11, * ltp-nptl-tests - pass: 2, * ltp-pty-tests - pass: 4, * ltp-timers-tests - pass: 13,
qemu_arm64 * boot - pass: 15, fail: 5, * ltp-cap_bounds-tests - pass: 2, * ltp-containers-tests - pass: 64, skip: 17 * ltp-fcntl-locktests-tests - pass: 2, * ltp-filecaps-tests - pass: 2, * ltp-fs_bind-tests - pass: 2, * ltp-fs_perms_simple-tests - pass: 19, * ltp-fsx-tests - pass: 2, * ltp-hugetlb-tests - pass: 22, * ltp-io-tests - pass: 3, * ltp-ipc-tests - pass: 9, * ltp-nptl-tests - pass: 2, * ltp-pty-tests - pass: 4, * ltp-timers-tests - pass: 13,
qemu_x86_64 * boot - pass: 22, * kselftest - pass: 50, skip: 30 * kselftest-vsyscall-mode-native - pass: 50, skip: 30 * kselftest-vsyscall-mode-none - pass: 50, skip: 30 * libhugetlbfs - pass: 90, skip: 1 * ltp-cap_bounds-tests - pass: 2, * ltp-containers-tests - pass: 64, skip: 17 * ltp-fcntl-locktests-tests - pass: 2, * ltp-filecaps-tests - pass: 2, * ltp-fs-tests - pass: 57, skip: 6 * ltp-fs_bind-tests - pass: 2, * ltp-fs_perms_simple-tests - pass: 19, * ltp-fsx-tests - pass: 2, * ltp-hugetlb-tests - pass: 22, * ltp-io-tests - pass: 3, * ltp-ipc-tests - pass: 9, * ltp-math-tests - pass: 11, * ltp-nptl-tests - pass: 2, * ltp-pty-tests - pass: 4, * ltp-sched-tests - pass: 13, skip: 1 * ltp-securebits-tests - pass: 4, * ltp-syscalls-tests - pass: 998, skip: 152 * ltp-timers-tests - pass: 13,
x15 - arm * boot - pass: 20, * kselftest - pass: 37, skip: 28 * libhugetlbfs - pass: 87, skip: 1 * ltp-cap_bounds-tests - pass: 2, * ltp-containers-tests - pass: 63, skip: 18 * ltp-fcntl-locktests-tests - pass: 2, * ltp-filecaps-tests - pass: 2, * ltp-fs-tests - pass: 58, skip: 5 * ltp-fs_bind-tests - pass: 2, * ltp-fs_perms_simple-tests - pass: 19, * ltp-fsx-tests - pass: 2, * ltp-hugetlb-tests - pass: 20, skip: 2 * ltp-io-tests - pass: 3, * ltp-ipc-tests - pass: 9, * ltp-math-tests - pass: 11, * ltp-nptl-tests - pass: 2, * ltp-pty-tests - pass: 4, * ltp-sched-tests - pass: 13, skip: 1 * ltp-securebits-tests - pass: 4, * ltp-syscalls-tests - pass: 1075, skip: 75 * ltp-timers-tests - pass: 13,
x86_64 * boot - pass: 22, * kselftest - pass: 55, skip: 20 * kselftest-vsyscall-mode-native - pass: 55, skip: 20 * kselftest-vsyscall-mode-none - pass: 55, skip: 20 * libhugetlbfs - pass: 90, skip: 1 * ltp-cap_bounds-tests - pass: 2, * ltp-containers-tests - pass: 64, skip: 17 * ltp-fcntl-locktests-tests - pass: 2, * ltp-filecaps-tests - pass: 2, * ltp-fs-tests - pass: 58, skip: 5 * ltp-fs_bind-tests - pass: 2, * ltp-fs_perms_simple-tests - pass: 19, * ltp-fsx-tests - pass: 2, * ltp-hugetlb-tests - pass: 22, * ltp-io-tests - pass: 3, * ltp-ipc-tests - pass: 9, * ltp-math-tests - pass: 11, * ltp-nptl-tests - pass: 2, * ltp-pty-tests - pass: 4, * ltp-sched-tests - pass: 9, skip: 5 * ltp-securebits-tests - pass: 4, * ltp-syscalls-tests - pass: 1034, skip: 116 * ltp-timers-tests - pass: 13,
On Wed, Apr 25, 2018 at 04:42:20PM -0500, Dan Rue wrote:
On Wed, Apr 25, 2018 at 12:33:09PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.16.5 release. There are 26 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri Apr 27 10:33:04 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.16.5-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.16.y and the diffstat can be found below.
Results from Linaro’s test farm. No regressions on arm64, arm and x86_64.
We did find a regression in 4.16.4 that is dragonboard-410c specific that we missed the first time around. Discovered using kselftest/printf.sh:
/sbin/modprobe test_printf [ 22.725551] test_printf: hashing plain 'p' has unexpected format [ 22.726031] test_printf: failed 1 out of 236 tests modprobe: ERROR: could not insert 'test_printf': Invalid argument
We'll bisect it and find the issue but since it was already released, I assume you won't want to hold up 4.16.5 for it.
As it's an old issue, no, I'll wait for you all to find it and not hold this release up :)
thanks,
greg k-h
linux-stable-mirror@lists.linaro.org