This is the start of the stable review cycle for the 3.18.93 release. There are 52 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed Jan 31 12:36:07 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.18.93-rc1.gz or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-3.18.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 3.18.93-rc1
Jim Westfall jwestfall@surrealistic.net ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY
Mike Maloney maloney@google.com ipv6: fix udpv6 sendmsg crash caused by too small MTU
Jim Westfall jwestfall@surrealistic.net net: Allow neigh contructor functions ability to modify the primary_key
Neil Horman nhorman@tuxdriver.com vmxnet3: repair memory leak
Xin Long lucien.xin@gmail.com sctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf
Xin Long lucien.xin@gmail.com sctp: do not allow the v4 socket to bind a v4mapped v6 address
Guillaume Nault g.nault@alphalink.fr pppoe: take ->needed_headroom of lower device into account on xmit
Eric Dumazet edumazet@google.com net: qdisc_pkt_len_init() should be more robust
Craig Gallek kraig@google.com tcp: __tcp_hdrlen() helper
Felix Fietkau nbd@nbd.name net: igmp: fix source address check for IGMPv3 reports
Alexey Kodanev alexey.kodanev@oracle.com dccp: don't restart ccid2_hc_tx_rto_expire() if sk in closed state
Dan Streetman ddstreet@ieee.org net: tcp: close sock if net namespace is exiting
Jia Zhang zhang.jia@linux.alibaba.com x86/microcode/intel: Extend BDW late-loading further with LLC size check
Richard Weinberger richard@nod.at um: Remove copy&paste code from init.h
Richard Weinberger richard@nod.at um: Stop abusing __KERNEL__
Greg KH gregkh@linuxfoundation.org eventpoll.h: add missing epoll event masks
Thomas Meyer thomas@m3y3r.de um: link vmlinux with -no-pie
Johannes Thumshirn jthumshirn@suse.de scsi: libiscsi: fix shifting of DID_REQUEUE host byte
Jiri Slaby jslaby@suse.cz fs/fcntl: f_setown, avoid undefined behaviour
Jeff Mahoney jeffm@suse.com reiserfs: don't preallocate blocks for extended attributes
Jeff Mahoney jeffm@suse.com reiserfs: fix race in prealloc discard
Kevin Cernekee cernekee@chromium.org netfilter: xt_osf: Add missing permission checks
Kevin Cernekee cernekee@chromium.org netfilter: nfnetlink_cthelper: Add missing permission checks
Ulrich Weber ulrich.weber@riverbed.com netfilter: nf_conntrack_sip: extend request line validation
Florian Westphal fw@strlen.de netfilter: restart search if moved to other chain
Liping Zhang liping.zhang@spreadtrum.com netfilter: nf_ct_expect: remove the redundant slash when policy name is empty
Jiri Slaby jslaby@suse.cz ipc: msg, make msgrcv work with LONG_MIN
Michal Hocko mhocko@suse.com hwpoison, memcg: forcibly uncharge LRU pages
Michal Hocko mhocko@suse.com mm/mmap.c: do not blow on PROT_NONE MAP_FIXED holes in the stack
Marc Kleine-Budde mkl@pengutronix.de can: af_can: canfd_rcv(): replace WARN_ONCE by pr_warn_once
Marc Kleine-Budde mkl@pengutronix.de can: af_can: can_rcv(): replace WARN_ONCE by pr_warn_once
Jonathan Dieter jdieter@lesbg.com usbip: Fix implicit fallthrough warning
Andy Lutomirski luto@kernel.org x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels
Jonas Gorski jonas.gorski@gmail.com MIPS: AR7: ensure the port type's FCR value is used
Marc Zyngier marc.zyngier@arm.com arm64: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls
Dennis Yang dennisyang@qnap.com dm thin metadata: THIN_MAX_CONCURRENT_LOCKS should be 6
Joe Thornber thornber@redhat.com dm btree: fix serious bug in btree_split_beneath()
Thomas Petazzoni thomas.petazzoni@free-electrons.com ARM: dts: kirkwood: fix pin-muxing of MPP7 on OpenBlocks A7
Arnd Bergmann arnd@arndb.de phy: work around 'phys' references to usb-nop-xceiv devices
Johan Hovold johan@kernel.org Input: twl4030-vibra - fix sibling-node lookup
Marek Belisko marek@goldelico.com Input: twl4030-vibra - fix ERROR: Bad of_node_put() warning
Johan Hovold johan@kernel.org Input: twl6040-vibra - fix child-node lookup
H. Nikolaus Schaller hns@goldelico.com Input: twl6040-vibra - fix DT node memory management
Johan Hovold johan@kernel.org Input: 88pm860x-ts - fix child-node lookup
Joe Lawrence joe.lawrence@redhat.com pipe: avoid round_pipe_size() nr_pages overflow on 32-bit
Eric Biggers ebiggers@google.com af_key: fix buffer overread in parse_exthdrs()
Eric Biggers ebiggers@google.com af_key: fix buffer overread in verify_address_len()
Takashi Iwai tiwai@suse.de ALSA: hda - Apply the existing quirk to iMac 14,1
Takashi Iwai tiwai@suse.de ALSA: pcm: Remove yet superfluous WARN_ON()
Li Jinyue lijinyue@huawei.com futex: Prevent overflow by strengthen input validation
Hannes Reinecke hare@suse.de scsi: sg: disable SET_FORCE_LOW_DMA
Arnd Bergmann arnd@arndb.de gcov: disable for COMPILE_TEST
-------------
Diffstat:
 Makefile | 4 ++--
 arch/arm/boot/dts/kirkwood-openblocks_a7.dts | 10 ++++++++--
 arch/arm64/kvm/handle_exit.c | 4 ++--
 arch/mips/ar7/platform.c | 2 +-
 arch/um/Makefile | 9 +++++----
 arch/um/drivers/mconsole.h | 2 +-
 arch/um/include/shared/init.h | 24 ++--------------------
 arch/um/include/shared/user.h | 2 +-
 arch/x86/include/asm/processor.h | 2 +-
 arch/x86/kernel/cpu/microcode/intel.c | 20 +++++++++++++++++--
 arch/x86/um/shared/sysdep/tls.h | 6 +++---
 drivers/input/misc/twl4030-vibra.c | 7 +++++--
 drivers/input/misc/twl6040-vibra.c | 2 +-
 drivers/input/touchscreen/88pm860x-ts.c | 16 +++++++++++----
 drivers/md/dm-thin-metadata.c | 6 +++++-
 drivers/md/persistent-data/dm-btree.c | 19 ++----------------
 drivers/net/ppp/pppoe.c | 11 +++++-----
 drivers/net/vmxnet3/vmxnet3_drv.c | 2 +-
 drivers/phy/phy-core.c | 4 ++++
 drivers/scsi/libiscsi.c | 2 +-
 drivers/scsi/sg.c | 30 +++++++++-------------------
 fs/fcntl.c | 4 ++++
 fs/pipe.c | 18 +++++++++++++++--
 fs/reiserfs/bitmap.c | 14 ++++++++++---
 include/linux/tcp.h | 7 ++++++-
 include/net/arp.h | 3 +++
 include/net/net_namespace.h | 10 ++++++++++
 include/scsi/sg.h | 1 -
 include/uapi/linux/eventpoll.h | 13 ++++++++++++
 ipc/msg.c | 5 ++++-
 kernel/futex.c | 3 +++
 kernel/gcov/Kconfig | 1 +
 mm/memcontrol.c | 2 +-
 mm/memory-failure.c | 7 +++++++
 mm/mmap.c | 6 ++++--
 net/can/af_can.c | 22 ++++++++++----------
 net/core/dev.c | 19 ++++++++++++++----
 net/core/neighbour.c | 4 ++--
 net/dccp/ccids/ccid2.c | 3 +++
 net/ipv4/arp.c | 7 ++++++-
 net/ipv4/igmp.c | 2 +-
 net/ipv4/tcp.c | 3 +++
 net/ipv4/tcp_timer.c | 15 ++++++++++++++
 net/ipv6/ip6_output.c | 6 ++++--
 net/key/af_key.c | 8 ++++++++
 net/netfilter/nf_conntrack_core.c | 7 +++++++
 net/netfilter/nf_conntrack_expect.c | 2 +-
 net/netfilter/nf_conntrack_sip.c | 5 ++++-
 net/netfilter/nfnetlink_cthelper.c | 10 ++++++++++
 net/netfilter/xt_osf.c | 7 +++++++
 net/sctp/socket.c | 30 +++++++++++-----------------
 sound/core/pcm_lib.c | 1 -
 sound/pci/hda/patch_cirrus.c | 1 +
 tools/usb/usbip/src/usbip.c | 2 ++
 54 files changed, 284 insertions(+), 148 deletions(-)
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
commit cc622420798c4bcf093785d872525087a7798db9 upstream.
Enabling gcov is counterproductive to compile testing: it significantly increases the kernel image size, compile time, and it produces lots of false positive "may be used uninitialized" warnings as the result of missed optimizations.
This is in line with how UBSAN_SANITIZE_ALL and PROFILE_ALL_BRANCHES work, both of which have similar problems.
With an ARM allmodconfig kernel, I see the build time drop from 283 minutes CPU time to 225 minutes, and the vmlinux size drops from 43MB to 26MB.
Signed-off-by: Arnd Bergmann arnd@arndb.de
Acked-by: Peter Oberparleiter oberpar@linux.vnet.ibm.com
Signed-off-by: Michal Marek mmarek@suse.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org

---
 kernel/gcov/Kconfig | 1 +
 1 file changed, 1 insertion(+)

--- a/kernel/gcov/Kconfig
+++ b/kernel/gcov/Kconfig
@@ -34,6 +34,7 @@ config GCOV_KERNEL

 config GCOV_PROFILE_ALL
 	bool "Profile entire Kernel"
+	depends on !COMPILE_TEST
 	depends on GCOV_KERNEL
 	depends on SUPERH || S390 || X86 || PPC || MICROBLAZE || ARM || ARM64
 	default n
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hannes Reinecke hare@suse.de
commit 745dfa0d8ec26b24f3304459ff6e9eacc5c8351b upstream.
The ioctl SET_FORCE_LOW_DMA has never worked since the initial git check-in, and the respective setting is nowadays handled correctly. So disable it entirely.
Signed-off-by: Hannes Reinecke hare@suse.com
Reviewed-by: Johannes Thumshirn jthumshirn@suse.de
Tested-by: Johannes Thumshirn jthumshirn@suse.de
Reviewed-by: Christoph Hellwig hch@lst.de
Signed-off-by: Martin K. Petersen martin.petersen@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org

---
 drivers/scsi/sg.c | 30 +++++++++---------------------
 include/scsi/sg.h | 1 -
 2 files changed, 9 insertions(+), 22 deletions(-)

--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -160,7 +160,6 @@ typedef struct sg_fd { /* holds the sta
 	struct list_head rq_list; /* head of request list */
 	struct fasync_struct *async_qp; /* used by asynchronous notification */
 	Sg_request req_arr[SG_MAX_QUEUE]; /* used as singly-linked list */
-	char low_dma; /* as in parent but possibly overridden to 1 */
 	char force_packid; /* 1 -> pack_id input to read(), 0 -> ignored */
 	char cmd_q; /* 1 -> allow command queuing, 0 -> don't */
 	unsigned char next_cmd_len; /* 0: automatic, >0: use on next write() */
@@ -947,24 +946,14 @@ sg_ioctl(struct file *filp, unsigned int
 		/* strange ..., for backward compatibility */
 		return sfp->timeout_user;
 	case SG_SET_FORCE_LOW_DMA:
-		result = get_user(val, ip);
-		if (result)
-			return result;
-		if (val) {
-			sfp->low_dma = 1;
-			if ((0 == sfp->low_dma) && !sfp->res_in_use) {
-				val = (int) sfp->reserve.bufflen;
-				sg_remove_scat(sfp, &sfp->reserve);
-				sg_build_reserve(sfp, val);
-			}
-		} else {
-			if (atomic_read(&sdp->detaching))
-				return -ENODEV;
-			sfp->low_dma = sdp->device->host->unchecked_isa_dma;
-		}
+		/*
+		 * N.B. This ioctl never worked properly, but failed to
+		 * return an error value. So returning '0' to keep compability
+		 * with legacy applications.
+		 */
 		return 0;
 	case SG_GET_LOW_DMA:
-		return put_user((int) sfp->low_dma, ip);
+		return put_user((int) sdp->device->host->unchecked_isa_dma, ip);
 	case SG_GET_SCSI_ID:
 		if (!access_ok(VERIFY_WRITE, p, sizeof (sg_scsi_id_t)))
 			return -EFAULT;
@@ -1916,6 +1905,7 @@ sg_build_indirect(Sg_scatter_hold * schp
 	int sg_tablesize = sfp->parentdp->sg_tablesize;
 	int blk_size = buff_size, order;
 	gfp_t gfp_mask = GFP_ATOMIC | __GFP_COMP | __GFP_NOWARN;
+	struct sg_device *sdp = sfp->parentdp;

 	if (blk_size < 0)
 		return -EFAULT;
@@ -1941,7 +1931,7 @@ sg_build_indirect(Sg_scatter_hold * schp
 		scatter_elem_sz_prev = num;
 	}

-	if (sfp->low_dma)
+	if (sdp->device->host->unchecked_isa_dma)
 		gfp_mask |= GFP_DMA;

 	if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
@@ -2204,8 +2194,6 @@ sg_add_sfp(Sg_device * sdp)
 	sfp->timeout = SG_DEFAULT_TIMEOUT;
 	sfp->timeout_user = SG_DEFAULT_TIMEOUT_USER;
 	sfp->force_packid = SG_DEF_FORCE_PACK_ID;
-	sfp->low_dma = (SG_DEF_FORCE_LOW_DMA == 0) ?
-	    sdp->device->host->unchecked_isa_dma : 1;
 	sfp->cmd_q = SG_DEF_COMMAND_Q;
 	sfp->keep_orphan = SG_DEF_KEEP_ORPHAN;
 	sfp->parentdp = sdp;
@@ -2664,7 +2652,7 @@ static void sg_proc_debug_helper(struct
 			   jiffies_to_msecs(fp->timeout),
 			   fp->reserve.bufflen,
 			   (int) fp->reserve.k_use_sg,
-			   (int) fp->low_dma);
+			   (int) sdp->device->host->unchecked_isa_dma);
 		seq_printf(s, " cmd_q=%d f_packid=%d k_orphan=%d closed=0\n",
 			   (int) fp->cmd_q, (int) fp->force_packid,
 			   (int) fp->keep_orphan);
--- a/include/scsi/sg.h
+++ b/include/scsi/sg.h
@@ -194,7 +194,6 @@ typedef struct sg_req_info { /* used by
 #define SG_DEFAULT_RETRIES 0

 /* Defaults, commented if they differ from original sg driver */
-#define SG_DEF_FORCE_LOW_DMA 0 /* was 1 -> memory below 16MB on i386 */
 #define SG_DEF_FORCE_PACK_ID 0
 #define SG_DEF_KEEP_ORPHAN 0
 #define SG_DEF_RESERVED_SIZE SG_SCATTER_SZ /* load time option */
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Li Jinyue lijinyue@huawei.com
commit fbe0e839d1e22d88810f3ee3e2f1479be4c0aa4a upstream.
UBSAN reports signed integer overflow in kernel/futex.c:
	UBSAN: Undefined behaviour in kernel/futex.c:2041:18
	signed integer overflow:
	0 - -2147483648 cannot be represented in type 'int'
Add a sanity check to catch negative values of nr_wake and nr_requeue.
Signed-off-by: Li Jinyue lijinyue@huawei.com
Signed-off-by: Thomas Gleixner tglx@linutronix.de
Cc: peterz@infradead.org
Cc: dvhart@infradead.org
Link: https://lkml.kernel.org/r/1513242294-31786-1-git-send-email-lijinyue@huawei....
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org

---
 kernel/futex.c | 3 +++
 1 file changed, 3 insertions(+)

--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1514,6 +1514,9 @@ static int futex_requeue(u32 __user *uad
 	struct futex_hash_bucket *hb1, *hb2;
 	struct futex_q *this, *next;

+	if (nr_wake < 0 || nr_requeue < 0)
+		return -EINVAL;
+
 	if (requeue_pi) {
 		/*
 		 * Requeue PI only works on two distinct uaddrs. This
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai tiwai@suse.de
commit 23b19b7b50fe1867da8d431eea9cd3e4b6328c2c upstream.
muldiv32() contains a snd_BUG_ON() (which becomes WARN_ON() when the debug option is enabled) for checking the case of 0 / 0. This would be helpful if this happened only as a logical error; however, since the hw refine is performed with any data set provided by the user, inconsistent values that can trigger such a condition might be passed easily. Actually, syzbot caught this by passing some zero'ed old hw_params ioctl.
So, having snd_BUG_ON() there is simply superfluous and rather harmful, as it causes unnecessary confusion. Let's get rid of it.
Reported-by: syzbot+7e6ee55011deeebce15d@syzkaller.appspotmail.com
Signed-off-by: Takashi Iwai tiwai@suse.de
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org

---
 sound/core/pcm_lib.c | 1 -
 1 file changed, 1 deletion(-)

--- a/sound/core/pcm_lib.c
+++ b/sound/core/pcm_lib.c
@@ -644,7 +644,6 @@ static inline unsigned int muldiv32(unsi
 {
 	u_int64_t n = (u_int64_t) a * b;
 	if (c == 0) {
-		snd_BUG_ON(!n);
 		*r = 0;
 		return UINT_MAX;
 	}
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai tiwai@suse.de
commit 031f335cda879450095873003abb03ae8ed3b74a upstream.
iMac 14,1 requires the same quirk as iMac 12,2, using GPIO 2 and 3 for headphone and speaker output amps. Add the codec SSID quirk entry (106b:0600) accordingly.
BugLink: http://lkml.kernel.org/r/CAEw6Zyteav09VGHRfD5QwsfuWv5a43r0tFBNbfcHXoNrxVz7ew...
Reported-by: Freaky freaky2000@gmail.com
Signed-off-by: Takashi Iwai tiwai@suse.de
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org

---
 sound/pci/hda/patch_cirrus.c | 1 +
 1 file changed, 1 insertion(+)

--- a/sound/pci/hda/patch_cirrus.c
+++ b/sound/pci/hda/patch_cirrus.c
@@ -394,6 +394,7 @@ static const struct snd_pci_quirk cs420x
 	/*SND_PCI_QUIRK(0x8086, 0x7270, "IMac 27 Inch", CS420X_IMAC27),*/

 	/* codec SSID */
+	SND_PCI_QUIRK(0x106b, 0x0600, "iMac 14,1", CS420X_IMAC27_122),
 	SND_PCI_QUIRK(0x106b, 0x1c00, "MacBookPro 8,1", CS420X_MBP81),
 	SND_PCI_QUIRK(0x106b, 0x2000, "iMac 12,2", CS420X_IMAC27_122),
 	SND_PCI_QUIRK(0x106b, 0x2800, "MacBookPro 10,1", CS420X_MBP101),
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit 06b335cb51af018d5feeff5dd4fd53847ddb675a upstream.
If a message sent to a PF_KEY socket ended with one of the extensions that takes a 'struct sadb_address' but there were not enough bytes remaining in the message for the ->sa_family member of the 'struct sockaddr' which is supposed to follow, then verify_address_len() read past the end of the message, into uninitialized memory. Fix it by returning -EINVAL in this case.
This bug was found using syzkaller with KMSAN.
Reproducer:
	#include <linux/pfkeyv2.h>
	#include <sys/socket.h>
	#include <unistd.h>

	int main()
	{
		int sock = socket(PF_KEY, SOCK_RAW, PF_KEY_V2);
		char buf[24] = { 0 };
		struct sadb_msg *msg = (void *)buf;
		struct sadb_address *addr = (void *)(msg + 1);

		msg->sadb_msg_version = PF_KEY_V2;
		msg->sadb_msg_type = SADB_DELETE;
		msg->sadb_msg_len = 3;
		addr->sadb_address_len = 1;
		addr->sadb_address_exttype = SADB_EXT_ADDRESS_SRC;

		write(sock, buf, 24);
	}

Reported-by: Alexander Potapenko glider@google.com
Signed-off-by: Eric Biggers ebiggers@google.com
Signed-off-by: Steffen Klassert steffen.klassert@secunet.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org

---
 net/key/af_key.c | 5 +++++
 1 file changed, 5 insertions(+)

--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -401,6 +401,11 @@ static int verify_address_len(const void
 #endif
 	int len;

+	if (sp->sadb_address_len <
+	    DIV_ROUND_UP(sizeof(*sp) + offsetofend(typeof(*addr), sa_family),
+			 sizeof(uint64_t)))
+		return -EINVAL;
+
 	switch (addr->sa_family) {
 	case AF_INET:
 		len = DIV_ROUND_UP(sizeof(*sp) + sizeof(*sin), sizeof(uint64_t));
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit 4e765b4972af7b07adcb1feb16e7a525ce1f6b28 upstream.
If a message sent to a PF_KEY socket ended with an incomplete extension header (fewer than 4 bytes remaining), then parse_exthdrs() read past the end of the message, into uninitialized memory. Fix it by returning -EINVAL in this case.
Reproducer:
	#include <linux/pfkeyv2.h>
	#include <sys/socket.h>
	#include <unistd.h>

	int main()
	{
		int sock = socket(PF_KEY, SOCK_RAW, PF_KEY_V2);
		char buf[17] = { 0 };
		struct sadb_msg *msg = (void *)buf;

		msg->sadb_msg_version = PF_KEY_V2;
		msg->sadb_msg_type = SADB_DELETE;
		msg->sadb_msg_len = 2;

		write(sock, buf, 17);
	}

Signed-off-by: Eric Biggers ebiggers@google.com
Signed-off-by: Steffen Klassert steffen.klassert@secunet.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org

---
 net/key/af_key.c | 3 +++
 1 file changed, 3 insertions(+)

--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -516,6 +516,9 @@ static int parse_exthdrs(struct sk_buff
 		uint16_t ext_type;
 		int ext_len;

+		if (len < sizeof(*ehdr))
+			return -EINVAL;
+
 		ext_len = ehdr->sadb_ext_len;
 		ext_len *= sizeof(uint64_t);
 		ext_type = ehdr->sadb_ext_type;
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joe Lawrence joe.lawrence@redhat.com
commit d3f14c485867cfb2e0c48aa88c41d0ef4bf5209c upstream.
round_pipe_size() contains a right-bit-shift expression which may overflow, which would cause undefined results in a subsequent roundup_pow_of_two() call.
	static inline unsigned int round_pipe_size(unsigned int size)
	{
		unsigned long nr_pages;

		nr_pages = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
		return roundup_pow_of_two(nr_pages) << PAGE_SHIFT;
	}

PAGE_SIZE is defined as (1UL << PAGE_SHIFT), so:
  - 4 bytes wide on 32-bit (0 to 0xffffffff)
  - 8 bytes wide on 64-bit (0 to 0xffffffffffffffff)

That means that in 32-bit round_pipe_size(), nr_pages may overflow to 0:

	size=0x00000000    nr_pages=0x0
	size=0x00000001    nr_pages=0x1
	size=0xfffff000    nr_pages=0xfffff
	size=0xfffff001    nr_pages=0x0    << !
	size=0xffffffff    nr_pages=0x0    << !

This is bad because roundup_pow_of_two(n) is undefined when n == 0!

64-bit is not a problem as the unsigned int size is 4 bytes wide (similar to 32-bit) and the larger, 8 byte wide unsigned long, is sufficient to handle the largest value of the bit shift expression:

	size=0xffffffff    nr_pages=0x100000

Modify round_pipe_size() to return 0 if nr_pages == 0 and update its callers to handle that accordingly.
Link: http://lkml.kernel.org/r/1507658689-11669-3-git-send-email-joe.lawrence@redh...
Signed-off-by: Joe Lawrence joe.lawrence@redhat.com
Reported-by: Mikulas Patocka mpatocka@redhat.com
Reviewed-by: Mikulas Patocka mpatocka@redhat.com
Cc: Al Viro viro@zeniv.linux.org.uk
Cc: Jens Axboe axboe@kernel.dk
Cc: Michael Kerrisk mtk.manpages@gmail.com
Cc: Randy Dunlap rdunlap@infradead.org
Cc: Josh Poimboeuf jpoimboe@redhat.com
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org
Signed-off-by: Dong Jinguang dongjinguang@huawei.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org

---
 fs/pipe.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -1002,6 +1002,9 @@ static long pipe_set_size(struct pipe_in
 {
 	struct pipe_buffer *bufs;

+	if (!nr_pages)
+		return -EINVAL;
+
 	/*
 	 * We can shrink the pipe, if arg >= pipe->nrbufs. Since we don't
 	 * expect a lot of shrink+grow operations, just free and allocate
@@ -1046,13 +1049,19 @@ static long pipe_set_size(struct pipe_in

 /*
  * Currently we rely on the pipe array holding a power-of-2 number
- * of pages.
+ * of pages. Returns 0 on error.
  */
 static inline unsigned int round_pipe_size(unsigned int size)
 {
 	unsigned long nr_pages;

+	if (size < pipe_min_size)
+		size = pipe_min_size;
+
 	nr_pages = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
+	if (nr_pages == 0)
+		return 0;
+
 	return roundup_pow_of_two(nr_pages) << PAGE_SHIFT;
 }

@@ -1063,13 +1072,18 @@ static inline unsigned int round_pipe_si
 int pipe_proc_fn(struct ctl_table *table, int write, void __user *buf,
 		 size_t *lenp, loff_t *ppos)
 {
+	unsigned int rounded_pipe_max_size;
 	int ret;

 	ret = proc_dointvec_minmax(table, write, buf, lenp, ppos);
 	if (ret < 0 || !write)
 		return ret;

-	pipe_max_size = round_pipe_size(pipe_max_size);
+	rounded_pipe_max_size = round_pipe_size(pipe_max_size);
+	if (rounded_pipe_max_size == 0)
+		return -EINVAL;
+
+	pipe_max_size = rounded_pipe_max_size;
 	return ret;
 }

3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit 906bf7daa0618d0ef39f4872ca42218c29a3631f upstream.
Fix child node-lookup during probe, which ended up searching the whole device tree depth-first starting at parent rather than just matching on its children.
To make things worse, the parent node was prematurely freed, while the child node was leaked.
Fixes: 2e57d56747e6 ("mfd: 88pm860x: Device tree support")
Signed-off-by: Johan Hovold johan@kernel.org
Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org

---
 drivers/input/touchscreen/88pm860x-ts.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

--- a/drivers/input/touchscreen/88pm860x-ts.c
+++ b/drivers/input/touchscreen/88pm860x-ts.c
@@ -126,7 +126,7 @@ static int pm860x_touch_dt_init(struct p
 	int data, n, ret;
 	if (!np)
 		return -ENODEV;
-	np = of_find_node_by_name(np, "touch");
+	np = of_get_child_by_name(np, "touch");
 	if (!np) {
 		dev_err(&pdev->dev, "Can't find touch node\n");
 		return -EINVAL;
@@ -144,13 +144,13 @@ static int pm860x_touch_dt_init(struct p
 	if (data) {
 		ret = pm860x_reg_write(i2c, PM8607_GPADC_MISC1, data);
 		if (ret < 0)
-			return -EINVAL;
+			goto err_put_node;
 	}
 	/* set tsi prebias time */
 	if (!of_property_read_u32(np, "marvell,88pm860x-tsi-prebias", &data)) {
 		ret = pm860x_reg_write(i2c, PM8607_TSI_PREBIAS, data);
 		if (ret < 0)
-			return -EINVAL;
+			goto err_put_node;
 	}
 	/* set prebias & prechg time of pen detect */
 	data = 0;
@@ -161,10 +161,18 @@ static int pm860x_touch_dt_init(struct p
 	if (data) {
 		ret = pm860x_reg_write(i2c, PM8607_PD_PREBIAS, data);
 		if (ret < 0)
-			return -EINVAL;
+			goto err_put_node;
 	}
 	of_property_read_u32(np, "marvell,88pm860x-resistor-X", res_x);
+
+	of_node_put(np);
+
 	return 0;
+
+err_put_node:
+	of_node_put(np);
+
+	return -EINVAL;
 }
 #else
 #define pm860x_touch_dt_init(x, y, z) (-1)
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: H. Nikolaus Schaller hns@goldelico.com
commit c52c545ead97fcc2f4f8ea38f1ae3c23211e09a8 upstream.
commit e7ec014a47e4 ("Input: twl6040-vibra - update for device tree support") made the separate vibra DT node a subnode of the twl6040.

It now calls of_find_node_by_name() to locate the "vibra" subnode. This function has the side effect of calling of_node_put() on the twl6040 parent node passed in as a parameter. This causes trouble later on.

Solution: we must call of_node_get() before of_find_node_by_name().

Signed-off-by: H. Nikolaus Schaller hns@goldelico.com
Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org

---
 drivers/input/misc/twl6040-vibra.c | 1 +
 1 file changed, 1 insertion(+)

--- a/drivers/input/misc/twl6040-vibra.c
+++ b/drivers/input/misc/twl6040-vibra.c
@@ -264,6 +264,7 @@ static int twl6040_vibra_probe(struct pl
 	int vddvibr_uV = 0;
 	int error;

+	of_node_get(twl6040_core_dev->of_node);
 	twl6040_core_node = of_find_node_by_name(twl6040_core_dev->of_node,
 						 "vibra");
 	if (!twl6040_core_node) {
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit dcaf12a8b0bbdbfcfa2be8dff2c4948d9844b4ad upstream.
Fix child-node lookup during probe, which ended up searching the whole device tree depth-first starting at parent rather than just matching on its children.
Later sanity checks on node properties (which would likely be missing) should prevent this from causing much trouble however, especially as the original premature free of the parent node has already been fixed separately (but that "fix" was apparently never backported to stable).
Fixes: e7ec014a47e4 ("Input: twl6040-vibra - update for device tree support")
Fixes: c52c545ead97 ("Input: twl6040-vibra - fix DT node memory management")
Signed-off-by: Johan Hovold johan@kernel.org
Acked-by: Peter Ujfalusi peter.ujfalusi@ti.com
Tested-by: H. Nikolaus Schaller hns@goldelico.com (on Pyra OMAP5 hardware)
Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org

---
 drivers/input/misc/twl6040-vibra.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/drivers/input/misc/twl6040-vibra.c
+++ b/drivers/input/misc/twl6040-vibra.c
@@ -264,8 +264,7 @@ static int twl6040_vibra_probe(struct pl
 	int vddvibr_uV = 0;
 	int error;

-	of_node_get(twl6040_core_dev->of_node);
-	twl6040_core_node = of_find_node_by_name(twl6040_core_dev->of_node,
+	twl6040_core_node = of_get_child_by_name(twl6040_core_dev->of_node,
 						 "vibra");
 	if (!twl6040_core_node) {
 		dev_err(&pdev->dev, "parent of node is missing?\n");
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marek Belisko marek@goldelico.com
commit e661d0a04462dd98667f8947141bd8defab5b34a upstream.
Fix the following:

[ 8.862274] ERROR: Bad of_node_put() on /ocp/i2c@48070000/twl@48/audio
[ 8.869293] CPU: 0 PID: 1003 Comm: modprobe Not tainted 4.2.0-rc2-letux+ #1175
[ 8.876922] Hardware name: Generic OMAP36xx (Flattened Device Tree)
[ 8.883514] [<c00159e0>] (unwind_backtrace) from [<c0012488>] (show_stack+0x10/0x14)
[ 8.891693] [<c0012488>] (show_stack) from [<c05cb810>] (dump_stack+0x78/0x94)
[ 8.899322] [<c05cb810>] (dump_stack) from [<c02cfd5c>] (kobject_release+0x68/0x7c)
[ 8.907409] [<c02cfd5c>] (kobject_release) from [<bf0040c4>] (twl4030_vibra_probe+0x74/0x188 [twl4030_vibra])
[ 8.917877] [<bf0040c4>] (twl4030_vibra_probe [twl4030_vibra]) from [<c03816ac>] (platform_drv_probe+0x48/0x90)
[ 8.928497] [<c03816ac>] (platform_drv_probe) from [<c037feb4>] (really_probe+0xd4/0x238)
[ 8.937103] [<c037feb4>] (really_probe) from [<c0380160>] (driver_probe_device+0x30/0x48)
[ 8.945678] [<c0380160>] (driver_probe_device) from [<c03801e0>] (__driver_attach+0x68/0x8c)
[ 8.954589] [<c03801e0>] (__driver_attach) from [<c037ea60>] (bus_for_each_dev+0x50/0x84)
[ 8.963226] [<c037ea60>] (bus_for_each_dev) from [<c037f828>] (bus_add_driver+0xcc/0x1e4)
[ 8.971832] [<c037f828>] (bus_add_driver) from [<c0380b60>] (driver_register+0x9c/0xe0)
[ 8.980255] [<c0380b60>] (driver_register) from [<c00097e0>] (do_one_initcall+0x100/0x1b8)
[ 8.988983] [<c00097e0>] (do_one_initcall) from [<c00b8008>] (do_init_module+0x58/0x1c0)
[ 8.997497] [<c00b8008>] (do_init_module) from [<c00b8cac>] (SyS_init_module+0x54/0x64)
[ 9.005950] [<c00b8cac>] (SyS_init_module) from [<c000ed20>] (ret_fast_syscall+0x0/0x54)
[ 9.015838] input: twl4030:vibrator as /devices/platform/68000000.ocp/48070000.i2c/i2c-0/0-0048/48070000.i2c:twl@48:audio/input/input2

The node passed to of_find_node_by_name() is put inside that function, and a new node is returned if one is found. Free the returned node, not the already freed node.

Signed-off-by: Marek Belisko marek@goldelico.com
Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org

---
 drivers/input/misc/twl4030-vibra.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/input/misc/twl4030-vibra.c
+++ b/drivers/input/misc/twl4030-vibra.c
@@ -185,7 +185,8 @@ static bool twl4030_vibra_check_coexist(
 	if (pdata && pdata->coexist)
 		return true;

-	if (of_find_node_by_name(node, "codec")) {
+	node = of_find_node_by_name(node, "codec");
+	if (node) {
 		of_node_put(node);
 		return true;
 	}
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit 5b189201993ab03001a398de731045bfea90c689 upstream.
A helper purported to look up a child node based on its name was using the wrong of-helper and ended up prematurely freeing the parent of-node while searching the whole device tree depth-first starting at the parent node.
Fixes: 64b9e4d803b1 ("input: twl4030-vibra: Support for DT booted kernel")
Fixes: e661d0a04462 ("Input: twl4030-vibra - fix ERROR: Bad of_node_put() warning")
Signed-off-by: Johan Hovold johan@kernel.org
Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org

---
 drivers/input/misc/twl4030-vibra.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/drivers/input/misc/twl4030-vibra.c
+++ b/drivers/input/misc/twl4030-vibra.c
@@ -180,12 +180,14 @@ static SIMPLE_DEV_PM_OPS(twl4030_vibra_p
 			 twl4030_vibra_suspend, twl4030_vibra_resume);

 static bool twl4030_vibra_check_coexist(struct twl4030_vibra_data *pdata,
-			      struct device_node *node)
+			      struct device_node *parent)
 {
+	struct device_node *node;
+
 	if (pdata && pdata->coexist)
 		return true;

-	node = of_find_node_by_name(node, "codec");
+	node = of_get_child_by_name(parent, "codec");
 	if (node) {
 		of_node_put(node);
 		return true;
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
commit b7563e2796f8b23c98afcfea7363194227fa089d upstream.
Stefan Wahren reports a problem with a warning fix that was merged for v4.15: we had lots of device nodes with a 'phys' property pointing to a device node that is not compliant with the binding documented in Documentation/devicetree/bindings/phy/phy-bindings.txt
This generally works because USB HCD drivers that support both the generic phy subsystem and the older usb-phy subsystem ignore most errors from phy_get() and related calls and then use the usb-phy driver instead.
However, it turns out that making the usb-nop-xceiv device compatible with the generic-phy binding changes the phy_get() return code from -EINVAL to -EPROBE_DEFER, and the dwc2 usb controller driver for bcm2835 now returns -EPROBE_DEFER from its probe function rather than ignoring the failure, breaking all USB support on raspberry-pi when CONFIG_GENERIC_PHY is enabled. The same code is used in the dwc3 driver and the usb_add_hcd() function, so a reasonable assumption would be that many other platforms are affected as well.
I have reviewed all the related patches and concluded that "usb-nop-xceiv" is the only USB phy that is affected by the change, and since it is by far the most commonly referenced phy, all the other USB phy drivers appear to be used in ways that are either safe in DT (they don't use the 'phys' property), or in the driver (they already ignore -EPROBE_DEFER from generic-phy when usb-phy is available).
To work around the problem, this adds a special case to _of_phy_get() so we ignore any PHY node that is compatible with "usb-nop-xceiv", as we know that this can never load no matter how much we defer. In the future, we might implement a generic-phy driver for "usb-nop-xceiv" and then remove this workaround.
Since we generally want older kernels to also work with the fixed devicetree files, it would be good to backport the patch into stable kernels as well (3.13+ are possibly affected), even though they don't contain any of the patches that may have caused regressions.
Fixes: 014d6da6cb25 ARM: dts: bcm283x: Fix DTC warnings about missing phy-cells Fixes: c5bbf358b790 arm: dts: nspire: Add missing #phy-cells to usb-nop-xceiv Fixes: 44e5dced2ef6 arm: dts: marvell: Add missing #phy-cells to usb-nop-xceiv Fixes: f568f6f554b8 ARM: dts: omap: Add missing #phy-cells to usb-nop-xceiv Fixes: d745d5f277bf ARM: dts: imx51-zii-rdu1: Add missing #phy-cells to usb-nop-xceiv Fixes: 915fbe59cbf2 ARM: dts: imx: Add missing #phy-cells to usb-nop-xceiv Link: https://marc.info/?l=linux-usb&m=151518314314753&w=2 Link: https://patchwork.kernel.org/patch/10158145/ Cc: Felipe Balbi balbi@kernel.org Cc: Eric Anholt eric@anholt.net Tested-by: Stefan Wahren stefan.wahren@i2se.com Acked-by: Rob Herring robh@kernel.org Tested-by: Hans Verkuil hans.verkuil@cisco.com Acked-by: Kishon Vijay Abraham I kishon@ti.com Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/phy/phy-core.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/phy/phy-core.c +++ b/drivers/phy/phy-core.c @@ -319,6 +319,10 @@ static struct phy *_of_phy_get(struct de if (ret) return ERR_PTR(-ENODEV);
+ /* This phy type handled by the usb-phy subsystem for now */ + if (of_device_is_compatible(args.np, "usb-nop-xceiv")) + return ERR_PTR(-ENODEV); + mutex_lock(&phy_provider_mutex); phy_provider = of_phy_provider_lookup(args.np); if (IS_ERR(phy_provider) || !try_module_get(phy_provider->owner)) {
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Petazzoni thomas.petazzoni@free-electrons.com
commit 56aeb07c914a616ab84357d34f8414a69b140cdf upstream.
MPP7 is currently muxed as "gpio", but this function doesn't exist for MPP7, only "gpo" is available. This causes the following error:
  kirkwood-pinctrl f1010000.pin-controller: unsupported function gpio on pin mpp7
  pinctrl core: failed to register map default (6): invalid type given
  kirkwood-pinctrl f1010000.pin-controller: error claiming hogs: -22
  kirkwood-pinctrl f1010000.pin-controller: could not claim hogs: -22
  kirkwood-pinctrl f1010000.pin-controller: unable to register pinctrl driver
  kirkwood-pinctrl: probe of f1010000.pin-controller failed with error -22
So the pinctrl driver is not probed, all device drivers (including the UART driver) do a -EPROBE_DEFER, and therefore the system doesn't really boot (well, it boots, but with no UART, and no devices that require pin-muxing).
Back when the Device Tree file for this board was introduced, the definition was already wrong. The pinctrl driver has also always described this function for MPP7 as "gpo". However, between Linux 4.10 and 4.11, a hog pin failing to be muxed was turned from a simple warning to a hard error that caused the entire pinctrl driver probe to bail out. This is probably the result of commit 6118714275f0a ("pinctrl: core: Fix pinctrl_register_and_init() with pinctrl_enable()").
This commit fixes the Device Tree to use the proper "gpo" function for MPP7, which fixes the boot of OpenBlocks A7, which was broken since Linux 4.11.
Fixes: f24b56cbcd9d ("ARM: kirkwood: add support for OpenBlocks A7 platform") Signed-off-by: Thomas Petazzoni thomas.petazzoni@free-electrons.com Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: Gregory CLEMENT gregory.clement@free-electrons.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm/boot/dts/kirkwood-openblocks_a7.dts | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/arch/arm/boot/dts/kirkwood-openblocks_a7.dts +++ b/arch/arm/boot/dts/kirkwood-openblocks_a7.dts @@ -53,7 +53,8 @@ };
pinctrl: pin-controller@10000 { - pinctrl-0 = <&pmx_dip_switches &pmx_gpio_header>; + pinctrl-0 = <&pmx_dip_switches &pmx_gpio_header + &pmx_gpio_header_gpo>; pinctrl-names = "default";
pmx_uart0: pmx-uart0 { @@ -85,11 +86,16 @@ * ground. */ pmx_gpio_header: pmx-gpio-header { - marvell,pins = "mpp17", "mpp7", "mpp29", "mpp28", + marvell,pins = "mpp17", "mpp29", "mpp28", "mpp35", "mpp34", "mpp40"; marvell,function = "gpio"; };
+ pmx_gpio_header_gpo: pxm-gpio-header-gpo { + marvell,pins = "mpp7"; + marvell,function = "gpo"; + }; + pmx_gpio_init: pmx-init { marvell,pins = "mpp38"; marvell,function = "gpio";
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joe Thornber thornber@redhat.com
commit bc68d0a43560e950850fc69b58f0f8254b28f6d6 upstream.
When inserting a new key/value pair into a btree we walk down the spine of btree nodes performing the following 2 operations:
  i) space for a new entry
  ii) adjusting the first key entry if the new key is lower than any in the node.
If the _root_ node is full, the function btree_split_beneath() allocates 2 new nodes, and redistributes the root node's entries between them. The root node is left with 2 entries corresponding to the 2 new nodes.
btree_split_beneath() then adjusts the spine to point to one of the two new children. This means the first key is never adjusted if the new key was lower, i.e. operation (ii) gets missed out. This can result in the new key being 'lost' for a period, until another low valued key is inserted that will uncover it.
This is a serious bug, and quite hard to trigger in normal use. A reproducing test case ("thin create devices-in-reverse-order") is available as part of the thin-provisioning-tools project: https://github.com/jthornber/thin-provisioning-tools/blob/master/functional-...
Fix the issue by changing btree_split_beneath() so it no longer adjusts the spine. Instead it unlocks both the new nodes, and lets the main loop in btree_insert_raw() relock the appropriate one and make any necessary adjustments.
Reported-by: Monty Pavel monty_pavel@sina.com Signed-off-by: Joe Thornber thornber@redhat.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/persistent-data/dm-btree.c | 19 ++----------------- 1 file changed, 2 insertions(+), 17 deletions(-)
--- a/drivers/md/persistent-data/dm-btree.c +++ b/drivers/md/persistent-data/dm-btree.c @@ -572,23 +572,8 @@ static int btree_split_beneath(struct sh pn->keys[1] = rn->keys[0]; memcpy_disk(value_ptr(pn, 1), &val, sizeof(__le64));
- /* - * rejig the spine. This is ugly, since it knows too - * much about the spine - */ - if (s->nodes[0] != new_parent) { - unlock_block(s->info, s->nodes[0]); - s->nodes[0] = new_parent; - } - if (key < le64_to_cpu(rn->keys[0])) { - unlock_block(s->info, right); - s->nodes[1] = left; - } else { - unlock_block(s->info, left); - s->nodes[1] = right; - } - s->count = 2; - + unlock_block(s->info, left); + unlock_block(s->info, right); return 0; }
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dennis Yang dennisyang@qnap.com
commit 490ae017f54e55bde382d45ea24bddfb6d1a0aaf upstream.
For btree removal, there is a corner case in which a single thread could take 6 locks, which is more than THIN_MAX_CONCURRENT_LOCKS(5) and leads to deadlock.
A btree removal might eventually call rebalance_children()->rebalance3() to rebalance entries of three neighbor child nodes when shadow_spine has already acquired two write locks. In rebalance3(), it tries to shadow and acquire the write locks of all three child nodes. However, shadowing a child node requires acquiring a read lock of the original child node and a write lock of the new block. Although the read lock will be released after block shadowing, shadowing the third child node in rebalance3() could still take the sixth lock. (2 write locks for shadow_spine + 2 write locks for the first two child nodes' shadows + 1 write lock for the last child node's shadow + 1 read lock for the last child node)
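The lock arithmetic above can be checked with a small userspace model. This is only a sketch: struct lock_tracker and the helper names are made up for illustration and are not part of dm-thin or the block manager; the model just counts how many locks are held at the worst moment.

```c
#include <assert.h>

/* Hypothetical bookkeeping: how many block locks one thread holds. */
struct lock_tracker {
	int held;	/* locks held right now */
	int peak;	/* highest value 'held' has reached */
};

static void take_lock(struct lock_tracker *t)
{
	t->held++;
	if (t->held > t->peak)
		t->peak = t->held;
}

static void drop_lock(struct lock_tracker *t)
{
	assert(t->held > 0);
	t->held--;
}

/*
 * Shadowing a child: read lock on the original block, write lock on
 * the new copy, then the read lock is released again.
 */
static void shadow_child(struct lock_tracker *t)
{
	take_lock(t);	/* read lock, original child */
	take_lock(t);	/* write lock, shadow */
	drop_lock(t);	/* read lock released after the copy */
}

/* Peak number of locks held while rebalancing three children. */
int peak_locks_for_rebalance3(void)
{
	struct lock_tracker t = { 0, 0 };
	int child;

	take_lock(&t);	/* shadow spine, first write lock */
	take_lock(&t);	/* shadow spine, second write lock */

	for (child = 0; child < 3; child++)
		shadow_child(&t);

	return t.peak;
}
```

In this model the peak is reached while the third child is being shadowed, which is the moment the commit message describes.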
Signed-off-by: Dennis Yang dennisyang@qnap.com Acked-by: Joe Thornber thornber@redhat.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/dm-thin-metadata.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/md/dm-thin-metadata.c +++ b/drivers/md/dm-thin-metadata.c @@ -81,10 +81,14 @@ #define SECTOR_TO_BLOCK_SHIFT 3
/* + * For btree insert: * 3 for btree insert + * 2 for btree lookup used within space map + * For btree remove: + * 2 for shadow spine + + * 4 for rebalance 3 child node */ -#define THIN_MAX_CONCURRENT_LOCKS 5 +#define THIN_MAX_CONCURRENT_LOCKS 6
/* This should be plenty */ #define SPACE_MAP_ROOT_SIZE 128
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Zyngier marc.zyngier@arm.com
commit acfb3b883f6d6a4b5d27ad7fdded11f6a09ae6dd upstream.
KVM doesn't follow the SMCCC when it comes to unimplemented calls, and injects an UNDEF instead of returning an error. Since firmware calls are now used for security mitigation, they are becoming more common, and the UNDEF is counterproductive.
Instead, let's follow the SMCCC which states that -1 must be returned to the caller when getting an unknown function number.
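As a rough illustration of why writing ~0UL into the first result register amounts to returning -1, here is a userspace sketch. The SKETCH_ names are hypothetical and only stand in for the guest-visible value; nothing here is KVM code.

```c
#include <assert.h>

/* Hypothetical name for the "unknown function" return value (-1). */
#define SKETCH_SMCCC_NOT_SUPPORTED (-1L)

/* What the patched handlers place in the guest's x0: all bits set. */
unsigned long sketch_unknown_call_result(void)
{
	return ~0UL;
}

/* A guest-side check: all-ones in x0 is -1 converted to unsigned long. */
int sketch_is_not_supported(unsigned long x0)
{
	return x0 == (unsigned long)SKETCH_SMCCC_NOT_SUPPORTED;
}

/* Small self-check showing the intended use. */
void sketch_smccc_self_check(void)
{
	assert(sketch_is_not_supported(sketch_unknown_call_result()));
	assert(!sketch_is_not_supported(0UL));
}
```

A guest that probes for a mitigation call can then treat the result as "not implemented" and carry on, instead of taking an undefined-instruction exception.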
Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Christoffer Dall christoffer.dall@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/kvm/handle_exit.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/arm64/kvm/handle_exit.c +++ b/arch/arm64/kvm/handle_exit.c @@ -34,7 +34,7 @@ static int handle_hvc(struct kvm_vcpu *v
ret = kvm_psci_call(vcpu); if (ret < 0) { - kvm_inject_undefined(vcpu); + *vcpu_reg(vcpu, 0) = ~0UL; return 1; }
@@ -43,7 +43,7 @@ static int handle_hvc(struct kvm_vcpu *v
static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run) { - kvm_inject_undefined(vcpu); + *vcpu_reg(vcpu, 0) = ~0UL; return 1; }
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonas Gorski jonas.gorski@gmail.com
commit 0a5191efe06b5103909206e4fbcff81d30283f8e upstream.
Since commit aef9a7bd9b67 ("serial/uart/8250: Add tunable RX interrupt trigger I/F of FIFO buffers"), the port's default FCR value isn't used in serial8250_do_set_termios anymore, but copied over once in serial8250_config_port and then modified as needed.
Unfortunately, serial8250_config_port will never be called if the port is shared between kernel and userspace, and the port's flag doesn't have UPF_BOOT_AUTOCONF, which would trigger a serial8250_config_port as well.
This causes garbled output from userspace:
  [ 5.220000] random: procd urandom read with 49 bits of entropy available
  ers [kee
Fix this by forcing it to be configured on boot, resulting in the expected output:
  [ 5.250000] random: procd urandom read with 50 bits of entropy available
  Press the [f] key and hit [enter] to enter failsafe mode
  Press the [1], [2], [3] or [4] key and hit [enter] to select the debug level
Fixes: aef9a7bd9b67 ("serial/uart/8250: Add tunable RX interrupt trigger I/F of FIFO buffers") Signed-off-by: Jonas Gorski jonas.gorski@gmail.com Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Yoshihiro YUNOMAE yoshihiro.yunomae.ez@hitachi.com Cc: Florian Fainelli f.fainelli@gmail.com Cc: Nicolas Schichan nschichan@freebox.fr Cc: linux-mips@linux-mips.org Cc: linux-serial@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/17544/ Signed-off-by: Ralf Baechle ralf@linux-mips.org Cc: James Hogan jhogan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/mips/ar7/platform.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/mips/ar7/platform.c +++ b/arch/mips/ar7/platform.c @@ -581,7 +581,7 @@ static int __init ar7_register_uarts(voi uart_port.type = PORT_AR7; uart_port.uartclk = clk_get_rate(bus_clk) / 2; uart_port.iotype = UPIO_MEM32; - uart_port.flags = UPF_FIXED_TYPE; + uart_port.flags = UPF_FIXED_TYPE | UPF_BOOT_AUTOCONF; uart_port.regshift = 2;
uart_port.line = 0;
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andy Lutomirski luto@kernel.org
commit 1c52d859cb2d417e7216d3e56bb7fea88444cec9 upstream.
We support various non-Intel CPUs that don't have the CPUID instruction, so the M486 test was wrong. For now, fix it with a big hammer: handle missing CPUID on all 32-bit CPUs.
Reported-by: One Thousand Gnomes gnomes@lxorguk.ukuu.org.uk Signed-off-by: Andy Lutomirski luto@kernel.org Cc: Juergen Gross jgross@suse.com Cc: Peter Zijlstra peterz@infradead.org Cc: Brian Gerst brgerst@gmail.com Cc: Matthew Whitehead tedheadster@gmail.com Cc: Borislav Petkov bp@alien8.de Cc: Henrique de Moraes Holschuh hmh@hmh.eng.br Cc: Andrew Cooper andrew.cooper3@citrix.com Cc: Boris Ostrovsky boris.ostrovsky@oracle.com Cc: xen-devel Xen-devel@lists.xen.org Link: http://lkml.kernel.org/r/685bd083a7c036f7769510b6846315b17d6ba71f.1481307769... Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/include/asm/processor.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -669,7 +669,7 @@ static inline void sync_core(void) { int tmp;
-#ifdef CONFIG_M486 +#ifdef CONFIG_X86_32 /* * Do a CPUID if available, otherwise do a jump. The jump * can conveniently enough be the jump around CPUID.
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonathan Dieter jdieter@lesbg.com
commit cfd6ed4537a9e938fa76facecd4b9cd65b6d1563 upstream.
GCC 7 now warns when switch statements fall through implicitly, and with -Werror enabled in configure.ac, that makes these tools unbuildable.
We fix this by notifying the compiler that this particular case statement is meant to fall through.
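A minimal sketch of the pattern, assuming GCC 7's default comment matching for -Wimplicit-fallthrough (the comment has to sit directly before the next case label). The function and the enum are invented for this example and are not taken from usbip.c.

```c
#include <assert.h>

enum {
	SKETCH_SHOWED_ERROR = 1,
	SKETCH_SHOWED_USAGE = 2
};

/* Returns a bitmask of what was "printed" for a given option character. */
int sketch_handle_option(int opt)
{
	int actions = 0;

	switch (opt) {
	case 'v':
		break;
	case '?':
		actions |= SKETCH_SHOWED_ERROR;
		/* An invalid option also gets the usage text. */
		/* FALLTHRU */
	default:
		actions |= SKETCH_SHOWED_USAGE;
		break;
	}

	return actions;
}

/* Small self-check showing the intended use. */
void sketch_fallthrough_self_check(void)
{
	assert(sketch_handle_option('?') ==
	       (SKETCH_SHOWED_ERROR | SKETCH_SHOWED_USAGE));
	assert(sketch_handle_option('x') == SKETCH_SHOWED_USAGE);
	assert(sketch_handle_option('v') == 0);
}
```

Without the marker comment, the '?' case would fall into default silently, which is what the new warning flags.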
Reviewed-by: Peter Senna Tschudin peter.senna@gmail.com Signed-off-by: Jonathan Dieter jdieter@lesbg.com Signed-off-by: Shuah Khan shuahkh@osg.samsung.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- tools/usb/usbip/src/usbip.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/tools/usb/usbip/src/usbip.c +++ b/tools/usb/usbip/src/usbip.c @@ -176,6 +176,8 @@ int main(int argc, char *argv[]) break; case '?': printf("usbip: invalid option\n"); + /* Terminate after printing error */ + /* FALLTHRU */ default: usbip_usage(); goto out;
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Kleine-Budde mkl@pengutronix.de
commit 8cb68751c115d176ec851ca56ecfbb411568c9e8 upstream.
If an invalid CAN frame is received, from a driver or from a tun interface, a kernel warning is generated.
This patch replaces the WARN_ONCE by a simple pr_warn_once, so that a kernel booted with panic_on_warn does not panic. A printk seems to be more appropriate here.
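The "validate, log once, drop" shape of the new code can be sketched in userspace as follows. This is an approximation only: struct warn_once_state and sketch_frame_ok() are hypothetical, the device type check is left out, and the two constants merely mirror the sizes of a classic CAN frame.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define SKETCH_CAN_MTU		16	/* size of a classic CAN frame */
#define SKETCH_CAN_MAX_DLEN	8	/* maximum payload bytes */

/* Hypothetical per-call-site state, standing in for pr_warn_once(). */
struct warn_once_state {
	bool warned;
	unsigned int printed;
};

/*
 * Returns true if the frame should be processed. A malformed frame is
 * dropped; only the first one produces a log line, and none of them
 * is treated as a kernel bug.
 */
bool sketch_frame_ok(struct warn_once_state *st, unsigned int skb_len,
		     unsigned int data_len)
{
	assert(st != NULL);

	if (skb_len != SKETCH_CAN_MTU || data_len > SKETCH_CAN_MAX_DLEN) {
		if (!st->warned) {
			st->warned = true;
			st->printed++;
			fprintf(stderr,
				"dropped non conform CAN frame: len %u, datalen %u\n",
				skb_len, data_len);
		}
		return false;
	}

	return true;
}
```

The point of the change is that bad input from the wire or from userspace is an expected event, so it gets a one-time message rather than a WARN with a stack trace.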
Reported-by: syzbot+4386709c0c1284dca827@syzkaller.appspotmail.com Suggested-by: Dmitry Vyukov dvyukov@google.com Acked-by: Oliver Hartkopp socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Oliver Hartkopp socketcan@hartkopp.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/can/af_can.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-)
--- a/net/can/af_can.c +++ b/net/can/af_can.c @@ -719,13 +719,12 @@ static int can_rcv(struct sk_buff *skb, if (unlikely(!net_eq(dev_net(dev), &init_net))) goto drop;
- if (WARN_ONCE(dev->type != ARPHRD_CAN || - skb->len != CAN_MTU || - cfd->len > CAN_MAX_DLEN, - "PF_CAN: dropped non conform CAN skbuf: " - "dev type %d, len %d, datalen %d\n", - dev->type, skb->len, cfd->len)) + if (unlikely(dev->type != ARPHRD_CAN || skb->len != CAN_MTU || + cfd->len > CAN_MAX_DLEN)) { + pr_warn_once("PF_CAN: dropped non conform CAN skbuf: dev type %d, len %d, datalen %d\n", + dev->type, skb->len, cfd->len); goto drop; + }
can_receive(skb, dev); return NET_RX_SUCCESS;
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Kleine-Budde mkl@pengutronix.de
commit d4689846881d160a4d12a514e991a740bcb5d65a upstream.
If an invalid CANFD frame is received, from a driver or from a tun interface, a kernel warning is generated.
This patch replaces the WARN_ONCE by a simple pr_warn_once, so that a kernel booted with panic_on_warn does not panic. A printk seems to be more appropriate here.
Reported-by: syzbot+e3b775f40babeff6e68b@syzkaller.appspotmail.com Suggested-by: Dmitry Vyukov dvyukov@google.com Acked-by: Oliver Hartkopp socketcan@hartkopp.net Cc: linux-stable stable@vger.kernel.org Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Oliver Hartkopp socketcan@hartkopp.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/can/af_can.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-)
--- a/net/can/af_can.c +++ b/net/can/af_can.c @@ -742,13 +742,12 @@ static int canfd_rcv(struct sk_buff *skb if (unlikely(!net_eq(dev_net(dev), &init_net))) goto drop;
- if (WARN_ONCE(dev->type != ARPHRD_CAN || - skb->len != CANFD_MTU || - cfd->len > CANFD_MAX_DLEN, - "PF_CAN: dropped non conform CAN FD skbuf: " - "dev type %d, len %d, datalen %d\n", - dev->type, skb->len, cfd->len)) + if (unlikely(dev->type != ARPHRD_CAN || skb->len != CANFD_MTU || + cfd->len > CANFD_MAX_DLEN)) { + pr_warn_once("PF_CAN: dropped non conform CAN FD skbuf: dev type %d, len %d, datalen %d\n", + dev->type, skb->len, cfd->len); goto drop; + }
can_receive(skb, dev); return NET_RX_SUCCESS;
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michal Hocko mhocko@suse.com
commit 561b5e0709e4a248c67d024d4d94b6e31e3edf2f upstream.
Commit 1be7107fbe18 ("mm: larger stack guard gap, between vmas") has introduced a regression in some Rust and Java environments which are trying to implement their own stack guard page. They are punching a new MAP_FIXED mapping inside the existing stack VMA.
This will confuse expand_{downwards,upwards} into thinking that the stack expansion would in fact get us too close to an existing non-stack vma which is a correct behavior wrt safety. It is a real regression on the other hand.
Let's work around the problem by considering a PROT_NONE mapping as a part of the stack. This is a gross hack, but overflowing to such a mapping would trap anyway, and we can only hope that userspace knows what it is doing and handles it properly.
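The resulting decision for the downward-growing case can be modelled roughly like this. It is a sketch under assumptions: struct sketch_vma and the SKETCH_VM_* flag values are placeholders invented here, and only the neighbour check is modelled, not the rest of expand_downwards().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder flag bits, for this sketch only. */
#define SKETCH_VM_READ		0x1UL
#define SKETCH_VM_WRITE		0x2UL
#define SKETCH_VM_EXEC		0x4UL
#define SKETCH_VM_GROWSDOWN	0x100UL

/* Hypothetical, stripped-down stand-in for a VMA. */
struct sketch_vma {
	unsigned long start;
	unsigned long end;
	unsigned long flags;
};

/*
 * Does the mapping below the stack forbid growing down to gap_addr?
 * A neighbour with no access bits (PROT_NONE) no longer counts.
 */
bool sketch_prev_blocks_growth(const struct sketch_vma *prev,
			       unsigned long gap_addr)
{
	const unsigned long access =
		SKETCH_VM_READ | SKETCH_VM_WRITE | SKETCH_VM_EXEC;

	if (prev == NULL || prev->end <= gap_addr)
		return false;	/* nothing inside the guard gap */
	if (!(prev->flags & access))
		return false;	/* PROT_NONE: treated as part of the stack */

	return !(prev->flags & SKETCH_VM_GROWSDOWN);
}

/* Small self-check showing the intended use. */
void sketch_guard_self_check(void)
{
	struct sketch_vma guard = { 0x1000, 0x2000, 0 };
	struct sketch_vma data = { 0x1000, 0x2000, SKETCH_VM_READ };

	assert(!sketch_prev_blocks_growth(&guard, 0x1800));
	assert(sketch_prev_blocks_growth(&data, 0x1800));
}
```

So a userspace-made guard page inside the gap no longer yields -ENOMEM, while any readable, writable or executable non-stack mapping still does.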
Fixes: 1be7107fbe18 ("mm: larger stack guard gap, between vmas") Link: http://lkml.kernel.org/r/20170705182849.GA18027@dhcp22.suse.cz Signed-off-by: Michal Hocko mhocko@suse.com Debugged-by: Vlastimil Babka vbabka@suse.cz Cc: Ben Hutchings ben@decadent.org.uk Cc: Willy Tarreau w@1wt.eu Cc: Oleg Nesterov oleg@redhat.com Cc: Rik van Riel riel@redhat.com Cc: Hugh Dickins hughd@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/mmap.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/mm/mmap.c +++ b/mm/mmap.c @@ -2191,7 +2191,8 @@ int expand_upwards(struct vm_area_struct gap_addr = TASK_SIZE;
next = vma->vm_next; - if (next && next->vm_start < gap_addr) { + if (next && next->vm_start < gap_addr && + (next->vm_flags & (VM_WRITE|VM_READ|VM_EXEC))) { if (!(next->vm_flags & VM_GROWSUP)) return -ENOMEM; /* Check that both stack segments have the same anon_vma? */ @@ -2271,7 +2272,8 @@ int expand_downwards(struct vm_area_stru if (gap_addr > address) return -ENOMEM; prev = vma->vm_prev; - if (prev && prev->vm_end > gap_addr) { + if (prev && prev->vm_end > gap_addr && + (prev->vm_flags & (VM_WRITE|VM_READ|VM_EXEC))) { if (!(prev->vm_flags & VM_GROWSDOWN)) return -ENOMEM; /* Check that both stack segments have the same anon_vma? */
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michal Hocko mhocko@suse.com
commit 18365225f0440d09708ad9daade2ec11275c3df9 upstream.
Laurent Dufour has noticed that hwpoisoned pages are kept charged. In his particular case he has hit a bad_page("page still charged to cgroup") when onlining a hwpoison page. While this looks like something that shouldn't happen in the first place, because onlining hwpoison pages and returning them to the page allocator makes little sense, it shows a real problem.
hwpoison pages do not get freed usually so we do not uncharge them (at least not since commit 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API")). Each charge pins memcg (since e8ea14cc6ead ("mm: memcontrol: take a css reference for each charged page")) as well and so the mem_cgroup and the associated state will never go away. Fix this leak by forcibly uncharging an LRU hwpoisoned page in delete_from_lru_cache(). We also have to tweak uncharge_list because it cannot rely on zero ref count for these pages.
[akpm@linux-foundation.org: coding-style fixes] Fixes: 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API") Link: http://lkml.kernel.org/r/20170502185507.GB19165@dhcp22.suse.cz Signed-off-by: Michal Hocko mhocko@suse.com Reported-by: Laurent Dufour ldufour@linux.vnet.ibm.com Tested-by: Laurent Dufour ldufour@linux.vnet.ibm.com Reviewed-by: Balbir Singh bsingharora@gmail.com Reviewed-by: Naoya Horiguchi n-horiguchi@ah.jp.nec.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/memcontrol.c | 2 +- mm/memory-failure.c | 7 +++++++ 2 files changed, 8 insertions(+), 1 deletion(-)
--- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -6500,7 +6500,7 @@ static void uncharge_list(struct list_he next = page->lru.next;
VM_BUG_ON_PAGE(PageLRU(page), page); - VM_BUG_ON_PAGE(page_count(page), page); + VM_BUG_ON_PAGE(!PageHWPoison(page) && page_count(page), page);
pc = lookup_page_cgroup(page); if (!PageCgroupUsed(pc)) --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -548,6 +548,13 @@ static int delete_from_lru_cache(struct */ ClearPageActive(p); ClearPageUnevictable(p); + + /* + * Poisoned page might never drop its ref count to 0 so we have + * to uncharge it manually from its memcg. + */ + mem_cgroup_uncharge(p); + /* * drop the page count elevated by isolate_lru_page() */
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiri Slaby jslaby@suse.cz
commit 999898355e08ae3b92dfd0a08db706e0c6703d30 upstream.
When LONG_MIN is passed to msgrcv, one would expect to receive any message. But convert_mode does *msgtyp = -*msgtyp and -LONG_MIN is undefined. In particular, with my gcc -LONG_MIN produces -LONG_MIN again.
So handle this case properly by assigning LONG_MAX to *msgtyp if LONG_MIN was specified as msgtyp to msgrcv.
This code:
  long msg[] = { 100, 200 };
  int m = msgget(IPC_PRIVATE, IPC_CREAT | 0644);
  msgsnd(m, &msg, sizeof(msg), 0);
  msgrcv(m, &msg, sizeof(msg), LONG_MIN, 0);
produces currently nothing:
  msgget(IPC_PRIVATE, IPC_CREAT|0644) = 65538
  msgsnd(65538, {100, "\310\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16, 0) = 0
  msgrcv(65538, ...
Except a UBSAN warning:
  UBSAN: Undefined behaviour in ipc/msg.c:745:13
  negation of -9223372036854775808 cannot be represented in type 'long int':
With the patch, I see what I expect:
  msgget(IPC_PRIVATE, IPC_CREAT|0644) = 0
  msgsnd(0, {100, "\310\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16, 0) = 0
  msgrcv(0, {100, "\310\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16, -9223372036854775808, 0) = 16
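The guarded negation can be tried out in userspace with a sketch like the one below. It follows the shape of the patched convert_mode() but is not the kernel function: the SKETCH_ enum is invented here and the MSG_EXCEPT handling is left out.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

enum sketch_search_mode {
	SKETCH_SEARCH_ANY,
	SKETCH_SEARCH_LESSEQUAL,
	SKETCH_SEARCH_EQUAL
};

/*
 * Turn the msgtyp argument into a search mode, rewriting a negative
 * type to its magnitude. LONG_MIN has no positive counterpart in
 * 'long', so it is clamped to LONG_MAX instead of being negated.
 */
enum sketch_search_mode sketch_convert_mode(long *msgtyp)
{
	assert(msgtyp != NULL);

	if (*msgtyp == 0)
		return SKETCH_SEARCH_ANY;

	if (*msgtyp < 0) {
		if (*msgtyp == LONG_MIN)	/* -LONG_MIN is undefined */
			*msgtyp = LONG_MAX;
		else
			*msgtyp = -*msgtyp;
		return SKETCH_SEARCH_LESSEQUAL;
	}

	return SKETCH_SEARCH_EQUAL;
}
```

With LONG_MAX as the bound, "any type less than or equal to" matches every positive message type, which is the behaviour a caller passing LONG_MIN would expect.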
Link: http://lkml.kernel.org/r/20161024082633.10148-1-jslaby@suse.cz Signed-off-by: Jiri Slaby jslaby@suse.cz Cc: Davidlohr Bueso dave@stgolabs.net Cc: Manfred Spraul manfred@colorfullife.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- ipc/msg.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/ipc/msg.c +++ b/ipc/msg.c @@ -740,7 +740,10 @@ static inline int convert_mode(long *msg if (*msgtyp == 0) return SEARCH_ANY; if (*msgtyp < 0) { - *msgtyp = -*msgtyp; + if (*msgtyp == LONG_MIN) /* -LONG_MIN is undefined */ + *msgtyp = LONG_MAX; + else + *msgtyp = -*msgtyp; return SEARCH_LESSEQUAL; } if (msgflg & MSG_EXCEPT)
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Liping Zhang liping.zhang@spreadtrum.com
commit b173a28f62cf929324a8a6adcc45adadce311d16 upstream.
The 'name' field in struct nf_conntrack_expect_policy{} is not a pointer, so checking whether it is NULL will always evaluate to true. Even if the name is empty, the slash will always be displayed, as follows:
  # cat /proc/net/nf_conntrack_expect
  297 l3proto = 2 proto=6 src=1.1.1.1 dst=2.2.2.2 sport=1 dport=1025 ftp/
                                                                        ^
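The difference between testing the array and testing its first character can be shown with a small userspace sketch. struct sketch_policy and sketch_format_helper() are hypothetical stand-ins, not the netfilter structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in: the name is an embedded array, not a pointer. */
struct sketch_policy {
	char name[16];
};

/*
 * Writes "helper" or "helper/policy" into buf and returns what
 * snprintf() returned. Testing 'policy->name' itself would always be
 * true because an array member never decays to NULL; testing the
 * first character detects the empty string.
 */
int sketch_format_helper(char *buf, size_t len, const char *helper,
			 const struct sketch_policy *policy)
{
	assert(buf != NULL && helper != NULL && policy != NULL);

	if (policy->name[0])
		return snprintf(buf, len, "%s/%s", helper, policy->name);

	return snprintf(buf, len, "%s", helper);
}
```

With an empty policy name the output is just "ftp", without the trailing slash shown above.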
Fixes: 3a8fc53a45c4 ("netfilter: nf_ct_helper: allocate 16 bytes for the helper and policy names") Signed-off-by: Liping Zhang liping.zhang@spreadtrum.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Acked-by: Michal Kubecek mkubecek@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/nf_conntrack_expect.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/netfilter/nf_conntrack_expect.c +++ b/net/netfilter/nf_conntrack_expect.c @@ -557,7 +557,7 @@ static int exp_seq_show(struct seq_file helper = rcu_dereference(nfct_help(expect->master)->helper); if (helper) { seq_printf(s, "%s%s", expect->flags ? " " : "", helper->name); - if (helper->expect_policy[expect->class].name) + if (helper->expect_policy[expect->class].name[0]) seq_printf(s, "/%s", helper->expect_policy[expect->class].name); }
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
commit 95a8d19f28e6b29377a880c6264391a62e07fccc upstream.
In case nf_conntrack_tuple_taken did not find a conflicting entry, check that all entries in this hash slot were tested, and restart in case an entry was moved to another chain.
Reported-by: Eric Dumazet edumazet@google.com Fixes: ea781f197d6a ("netfilter: nf_conntrack: use SLAB_DESTROY_BY_RCU and get rid of call_rcu()") Signed-off-by: Florian Westphal fw@strlen.de Acked-by: Eric Dumazet edumazet@google.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Acked-by: Michal Kubecek mkubecek@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/nf_conntrack_core.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/net/netfilter/nf_conntrack_core.c +++ b/net/netfilter/nf_conntrack_core.c @@ -695,6 +695,7 @@ nf_conntrack_tuple_taken(const struct nf * least once for the stats anyway. */ rcu_read_lock_bh(); + begin: hlist_nulls_for_each_entry_rcu(h, n, &net->ct.hash[hash], hnnode) { ct = nf_ct_tuplehash_to_ctrack(h); if (ct != ignored_conntrack && @@ -706,6 +707,12 @@ nf_conntrack_tuple_taken(const struct nf } NF_CT_STAT_INC(net, searched); } + + if (get_nulls_value(n) != hash) { + NF_CT_STAT_INC(net, search_restart); + goto begin; + } + rcu_read_unlock_bh();
return 0;
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ulrich Weber ulrich.weber@riverbed.com
commit 444f901742d054a4cd5ff045871eac5131646cfb upstream.
on SIP requests, so a fragmented TCP SIP packet from an allow header starting with INVITE,NOTIFY,OPTIONS,REFER,REGISTER,UPDATE,SUBSCRIBE Content-Length: 0
will not be interpreted as an INVITE request. Also Request-URI must start with an alphabetic character.
Confirm with RFC 3261 Request-Line = Method SP Request-URI SP SIP-Version CRLF
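A userspace sketch of the stricter match, assuming a POSIX system for strncasecmp(). sketch_is_request_for() is a made-up helper that mirrors the three conditions in the patch (enough data, method name, then a space and an alphabetic character); it is not the conntrack code.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/*
 * Does 'data' start a request line for 'method'? The method must be
 * followed by a single space and a Request-URI whose first character
 * is alphabetic, so "INVITE,NOTIFY,..." in a header does not match.
 */
bool sketch_is_request_for(const char *data, size_t datalen,
			   const char *method)
{
	size_t mlen;

	assert(data != NULL && method != NULL);
	mlen = strlen(method);

	if (datalen < mlen + 2)
		return false;
	if (strncasecmp(data, method, mlen) != 0)
		return false;
	if (data[mlen] != ' ' || !isalpha((unsigned char)data[mlen + 1]))
		return false;

	return true;
}
```

A segment that begins with the tail of an Allow header is then skipped, while "INVITE sip:..." still matches.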
Fixes: 30f33e6dee80 ("[NETFILTER]: nf_conntrack_sip: support method specific request/response handling") Signed-off-by: Ulrich Weber ulrich.weber@riverbed.com Acked-by: Marco Angaroni marcoangaroni@gmail.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Acked-by: Michal Kubecek mkubecek@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/nf_conntrack_sip.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/net/netfilter/nf_conntrack_sip.c +++ b/net/netfilter/nf_conntrack_sip.c @@ -1434,9 +1434,12 @@ static int process_sip_request(struct sk handler = &sip_handlers[i]; if (handler->request == NULL) continue; - if (*datalen < handler->len || + if (*datalen < handler->len + 2 || strncasecmp(*dptr, handler->method, handler->len)) continue; + if ((*dptr)[handler->len] != ' ' || + !isalpha((*dptr)[handler->len+1])) + continue;
if (ct_sip_get_header(ct, *dptr, 0, *datalen, SIP_HDR_CSEQ, &matchoff, &matchlen) <= 0) {
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kevin Cernekee cernekee@chromium.org
commit 4b380c42f7d00a395feede754f0bc2292eebe6e5 upstream.
The capability check in nfnetlink_rcv() verifies that the caller has CAP_NET_ADMIN in the namespace that "owns" the netlink socket. However, nfnl_cthelper_list is shared by all net namespaces on the system. An unprivileged user can create user and net namespaces in which he holds CAP_NET_ADMIN to bypass the netlink_net_capable() check:
  $ nfct helper list
  nfct v1.4.4: netlink error: Operation not permitted
  $ vpnns -- nfct helper list
  {
          .name = ftp,
          .queuenum = 0,
          .l3protonum = 2,
          .l4protonum = 6,
          .priv_data_len = 24,
          .status = enabled,
  };
Add capable() checks in nfnetlink_cthelper, as this is cleaner than trying to generalize the solution.
Signed-off-by: Kevin Cernekee cernekee@chromium.org Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Acked-by: Michal Kubecek mkubecek@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/nfnetlink_cthelper.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
--- a/net/netfilter/nfnetlink_cthelper.c +++ b/net/netfilter/nfnetlink_cthelper.c @@ -17,6 +17,7 @@ #include <linux/types.h> #include <linux/list.h> #include <linux/errno.h> +#include <linux/capability.h> #include <net/netlink.h> #include <net/sock.h>
@@ -392,6 +393,9 @@ nfnl_cthelper_new(struct sock *nfnl, str struct nfnl_cthelper *nlcth; int ret = 0;
+ if (!capable(CAP_NET_ADMIN)) + return -EPERM; + if (!tb[NFCTH_NAME] || !tb[NFCTH_TUPLE]) return -EINVAL;
@@ -595,6 +599,9 @@ nfnl_cthelper_get(struct sock *nfnl, str struct nfnl_cthelper *nlcth; bool tuple_set = false;
+ if (!capable(CAP_NET_ADMIN)) + return -EPERM; + if (nlh->nlmsg_flags & NLM_F_DUMP) { struct netlink_dump_control c = { .dump = nfnl_cthelper_dump_table, @@ -661,6 +668,9 @@ nfnl_cthelper_del(struct sock *nfnl, str struct nfnl_cthelper *nlcth, *n; int j = 0, ret;
+ if (!capable(CAP_NET_ADMIN)) + return -EPERM; + if (tb[NFCTH_NAME]) helper_name = nla_data(tb[NFCTH_NAME]);
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kevin Cernekee cernekee@chromium.org
commit 916a27901de01446bcf57ecca4783f6cff493309 upstream.
The capability check in nfnetlink_rcv() verifies that the caller has CAP_NET_ADMIN in the namespace that "owns" the netlink socket. However, xt_osf_fingers is shared by all net namespaces on the system. An unprivileged user can create user and net namespaces in which he holds CAP_NET_ADMIN to bypass the netlink_net_capable() check:
vpnns -- nfnl_osf -f /tmp/pf.os
vpnns -- nfnl_osf -f /tmp/pf.os -d
These non-root operations successfully modify the systemwide OS fingerprint list. Add new capable() checks so that they can't.
Signed-off-by: Kevin Cernekee cernekee@chromium.org Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Acked-by: Michal Kubecek mkubecek@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/xt_osf.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/net/netfilter/xt_osf.c +++ b/net/netfilter/xt_osf.c @@ -19,6 +19,7 @@ #include <linux/module.h> #include <linux/kernel.h>
+#include <linux/capability.h> #include <linux/if.h> #include <linux/inetdevice.h> #include <linux/ip.h> @@ -69,6 +70,9 @@ static int xt_osf_add_callback(struct so struct xt_osf_finger *kf = NULL, *sf; int err = 0;
+ if (!capable(CAP_NET_ADMIN)) + return -EPERM; + if (!osf_attrs[OSF_ATTR_FINGER]) return -EINVAL;
@@ -112,6 +116,9 @@ static int xt_osf_remove_callback(struct struct xt_osf_finger *sf; int err = -ENOENT;
+ if (!capable(CAP_NET_ADMIN)) + return -EPERM; + if (!osf_attrs[OSF_ATTR_FINGER]) return -EINVAL;
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jeff Mahoney jeffm@suse.com
commit 08db141b5313ac2f64b844fb5725b8d81744b417 upstream.
The main loop in __discard_prealloc is protected by the reiserfs write lock which is dropped across schedules like the BKL it replaced. The problem is that it checks the value, calls a routine that schedules, and then adjusts the state. As a result, two threads that are calling reiserfs_prealloc_discard at the same time can race when one calls reiserfs_free_prealloc_block, the lock is dropped, and the other calls reiserfs_free_prealloc_block with the same block number. In the right circumstances, it can cause the prealloc count to go negative.
Signed-off-by: Jeff Mahoney jeffm@suse.com Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/reiserfs/bitmap.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
--- a/fs/reiserfs/bitmap.c +++ b/fs/reiserfs/bitmap.c @@ -513,9 +513,17 @@ static void __discard_prealloc(struct re "inode has negative prealloc blocks count."); #endif while (ei->i_prealloc_count > 0) { - reiserfs_free_prealloc_block(th, inode, ei->i_prealloc_block); - ei->i_prealloc_block++; + b_blocknr_t block_to_free; + + /* + * reiserfs_free_prealloc_block can drop the write lock, + * which could allow another caller to free the same block. + * We can protect against it by modifying the prealloc + * state before calling it. + */ + block_to_free = ei->i_prealloc_block++; ei->i_prealloc_count--; + reiserfs_free_prealloc_block(th, inode, block_to_free); dirty = 1; } if (dirty)
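The ordering issue described in this patch can be modelled in userspace. The following is only a sketch under stated assumptions, not reiserfs code: struct prealloc, free_block(), discard_unsafe() and discard_safe() are hypothetical names, and "the lock is dropped and another caller runs" is simulated by letting the free routine re-enter the discard loop exactly once.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-inode preallocation state. */
struct prealloc {
    unsigned int block;   /* next preallocated block to free */
    int count;            /* number of preallocated blocks left */
    int frees[16];        /* how often each block number was freed */
    bool reentered;       /* allow one simulated concurrent caller */
};

static void discard_unsafe(struct prealloc *p);
static void discard_safe(struct prealloc *p);

/* Models a free routine that may drop the lock and let another caller run. */
static void free_block(struct prealloc *p, unsigned int block,
                       void (*other_caller)(struct prealloc *))
{
    p->frees[block]++;
    if (!p->reentered) {
        p->reentered = true;
        other_caller(p);
    }
}

/* Old order: free first, then update the state. */
static void discard_unsafe(struct prealloc *p)
{
    while (p->count > 0) {
        free_block(p, p->block, discard_unsafe);
        p->block++;
        p->count--;
    }
}

/* New order: claim the block by updating the state, then free it. */
static void discard_safe(struct prealloc *p)
{
    while (p->count > 0) {
        unsigned int block_to_free = p->block++;

        p->count--;
        free_block(p, block_to_free, discard_safe);
    }
}
```

Starting from two preallocated blocks, the old order frees block 0 twice and leaves the count at -1 in this model, while the new order frees each block once and ends with a count of 0.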
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jeff Mahoney jeffm@suse.com
commit 54930dfeb46e978b447af0fb8ab4e181c1bf9d7a upstream.
Most extended attributes will fit in a single block. More importantly, we drop the reference to the inode while holding the transaction open so the preallocated blocks aren't released. As a result, the inode may be evicted before it's removed from the transaction's prealloc list which can cause memory corruption.
Signed-off-by: Jeff Mahoney jeffm@suse.com Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/reiserfs/bitmap.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/reiserfs/bitmap.c +++ b/fs/reiserfs/bitmap.c @@ -1136,7 +1136,7 @@ static int determine_prealloc_size(reise hint->prealloc_size = 0;
if (!hint->formatted_node && hint->preallocate) { - if (S_ISREG(hint->inode->i_mode) + if (S_ISREG(hint->inode->i_mode) && !IS_PRIVATE(hint->inode) && hint->inode->i_size >= REISERFS_SB(hint->th->t_super)->s_alloc_options. preallocmin * hint->inode->i_sb->s_blocksize)
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiri Slaby jslaby@suse.cz
commit fc3dc67471461c0efcb1ed22fb7595121d65fad9 upstream.
fcntl(0, F_SETOWN, 0x80000000) triggers: UBSAN: Undefined behaviour in fs/fcntl.c:118:7 negation of -2147483648 cannot be represented in type 'int': CPU: 1 PID: 18261 Comm: syz-executor Not tainted 4.8.1-0-syzkaller #1 ... Call Trace: ... [<ffffffffad8f0868>] ? f_setown+0x1d8/0x200 [<ffffffffad8f19a9>] ? SyS_fcntl+0x999/0xf30 [<ffffffffaed1fb00>] ? entry_SYSCALL_64_fastpath+0x23/0xc1
Fix that by checking the arg parameter properly (against INT_MAX) before "who = -who", and return immediately with -EINVAL in case it is wrong. Note that according to POSIX we can return EINVAL: http://pubs.opengroup.org/onlinepubs/9699919799/functions/fcntl.html
[EINVAL] The cmd argument is F_SETOWN and the value of the argument is not valid as a process or process group identifier.
[v2] returns an error, v1 used to fail silently [v3] implement proper check for the bad value INT_MIN
Signed-off-by: Jiri Slaby jslaby@suse.cz Cc: Jeff Layton jlayton@poochiereds.net Cc: "J. Bruce Fields" bfields@fieldses.org Cc: Alexander Viro viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Jeff Layton jlayton@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/fcntl.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/fs/fcntl.c +++ b/fs/fcntl.c @@ -113,6 +113,10 @@ void f_setown(struct file *filp, unsigne int who = arg; type = PIDTYPE_PID; if (who < 0) { + /* avoid overflow below */ + if (who == INT_MIN) + return; + type = PIDTYPE_PGID; who = -who; }
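For readers who want to see the overflow guard in isolation, here is a minimal userspace sketch. The helper split_owner() and the enum owner_type are hypothetical and only mirror the shape of the check in the hunk above; unlike the backported f_setown(), which returns void, this sketch reports the rejection to the caller.

```c
#include <limits.h>
#include <stdbool.h>

enum owner_type { OWNER_PID, OWNER_PGID };

/*
 * Hypothetical helper: split an F_SETOWN-style value into an ID and a type.
 * A negative value names a process group. Returns false when the value
 * cannot be negated without overflowing an int.
 */
static bool split_owner(int who, int *id, enum owner_type *type)
{
    *type = OWNER_PID;
    if (who < 0) {
        /* -INT_MIN does not fit in an int, so reject it up front. */
        if (who == INT_MIN)
            return false;
        *type = OWNER_PGID;
        who = -who;
    }
    *id = who;
    return true;
}
```

With this shape, -5 becomes process group 5, 7 stays process 7, and INT_MIN is refused before the negation is ever evaluated.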
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johannes Thumshirn jthumshirn@suse.de
commit eef9ffdf9cd39b2986367bc8395e2772bc1284ba upstream.
The SCSI host byte should be shifted left by 16 in order to have scsi_decide_disposition() do the right thing (i.e. requeue the command).
Signed-off-by: Johannes Thumshirn jthumshirn@suse.de Fixes: 661134ad3765 ("[SCSI] libiscsi, bnx2i: make bound ep check common") Cc: Lee Duncan lduncan@suse.com Cc: Hannes Reinecke hare@suse.de Cc: Bart Van Assche Bart.VanAssche@sandisk.com Cc: Chris Leech cleech@redhat.com Acked-by: Lee Duncan lduncan@suse.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/libiscsi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/scsi/libiscsi.c +++ b/drivers/scsi/libiscsi.c @@ -1727,7 +1727,7 @@ int iscsi_queuecommand(struct Scsi_Host
if (test_bit(ISCSI_SUSPEND_BIT, &conn->suspend_tx)) { reason = FAILURE_SESSION_IN_RECOVERY; - sc->result = DID_REQUEUE; + sc->result = DID_REQUEUE << 16; goto fault; }
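To illustrate why the shift matters, here is a small sketch of how the host byte is read back out of a result word. The EX_ and ex_ names are local to this example; the value 0x0d for DID_REQUEUE and the placement of the host byte in bits 16..23 are stated from memory of the SCSI headers and should be treated as assumptions.

```c
#include <stdint.h>

/* Local copy for illustration; assumed to match the kernel's DID_REQUEUE. */
#define EX_DID_REQUEUE 0x0dU

/* Bits 16..23 of the result word carry the host byte. */
static inline unsigned int ex_host_byte(uint32_t result)
{
    return (result >> 16) & 0xff;
}

/* Bits 0..7 form the low byte, where the SCSI status lives. */
static inline unsigned int ex_low_byte(uint32_t result)
{
    return result & 0xff;
}
```

Storing the unshifted constant puts 0x0d in the low byte and leaves the host byte at 0, so code that inspects the host byte never sees a requeue request. Only the shifted value is read back as DID_REQUEUE.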
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Meyer thomas@m3y3r.de
commit 883354afbc109c57f925ccc19840055193da0cc0 upstream.
Debian's gcc defaults to pie. The global Makefile already defines the -fno-pie option. Link UML dynamic kernel image also with -no-pie to fix the build.
Signed-off-by: Thomas Meyer thomas@m3y3r.de Signed-off-by: Richard Weinberger richard@nod.at Cc: Bernie Innocenti codewiz@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/um/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/um/Makefile +++ b/arch/um/Makefile @@ -116,7 +116,7 @@ archheaders: archprepare: include/generated/user_constants.h
LINK-$(CONFIG_LD_SCRIPT_STATIC) += -static -LINK-$(CONFIG_LD_SCRIPT_DYN) += -Wl,-rpath,/lib +LINK-$(CONFIG_LD_SCRIPT_DYN) += -Wl,-rpath,/lib $(call cc-option, -no-pie)
CFLAGS_NO_HARDENING := $(call cc-option, -fno-PIC,) $(call cc-option, -fno-pic,) \ $(call cc-option, -fno-stack-protector,) \
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Greg KH gregkh@linuxfoundation.org
commit 7e040726850a106587485c21bdacc0bfc8a0cbed upstream.
[resend due to me forgetting to cc: linux-api the first time around I posted these back on Feb 23]
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
For some reason these values are not in the uapi header file, so every libc has to define them itself. To save them from needing to do this, just have the kernel provide the correct values.
Reported-by: Elliott Hughes enh@google.com Signed-off-by: Greg Hackmann ghackmann@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/uapi/linux/eventpoll.h | 13 +++++++++++++ 1 file changed, 13 insertions(+)
--- a/include/uapi/linux/eventpoll.h +++ b/include/uapi/linux/eventpoll.h @@ -26,6 +26,19 @@ #define EPOLL_CTL_DEL 2 #define EPOLL_CTL_MOD 3
+/* Epoll event masks */ +#define EPOLLIN 0x00000001 +#define EPOLLPRI 0x00000002 +#define EPOLLOUT 0x00000004 +#define EPOLLERR 0x00000008 +#define EPOLLHUP 0x00000010 +#define EPOLLRDNORM 0x00000040 +#define EPOLLRDBAND 0x00000080 +#define EPOLLWRNORM 0x00000100 +#define EPOLLWRBAND 0x00000200 +#define EPOLLMSG 0x00000400 +#define EPOLLRDHUP 0x00002000 + /* * Request the handling of system wakeup events so as to prevent system suspends * from happening while those events are being processed.
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Richard Weinberger richard@nod.at
commit 298e20ba8c197e8d429a6c8671550c41c7919033 upstream.
Currently UML is abusing __KERNEL__ to distinguish between kernel and host code (os-Linux). It is better to use a custom define such that existing users of __KERNEL__ don't get confused.
Signed-off-by: Richard Weinberger richard@nod.at Cc: Greg Hackmann ghackmann@google.com Cc: Bernie Innocenti codewiz@google.com Cc: Lorenzo Colitti lorenzo@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/um/Makefile | 7 ++++--- arch/um/drivers/mconsole.h | 2 +- arch/um/include/shared/init.h | 4 ++-- arch/um/include/shared/user.h | 2 +- arch/x86/um/shared/sysdep/tls.h | 6 +++--- 5 files changed, 11 insertions(+), 10 deletions(-)
--- a/arch/um/Makefile +++ b/arch/um/Makefile @@ -68,9 +68,10 @@ KBUILD_CFLAGS += $(CFLAGS) $(CFLAGS-y) -
KBUILD_AFLAGS += $(ARCH_INCLUDE)
-USER_CFLAGS = $(patsubst $(KERNEL_DEFINES),,$(patsubst -D__KERNEL__,,\ - $(patsubst -I%,,$(KBUILD_CFLAGS)))) $(ARCH_INCLUDE) $(MODE_INCLUDE) \ - $(filter -I%,$(CFLAGS)) -D_FILE_OFFSET_BITS=64 -idirafter include +USER_CFLAGS = $(patsubst $(KERNEL_DEFINES),,$(patsubst -I%,,$(KBUILD_CFLAGS))) \ + $(ARCH_INCLUDE) $(MODE_INCLUDE) $(filter -I%,$(CFLAGS)) \ + -D_FILE_OFFSET_BITS=64 -idirafter include \ + -D__KERNEL__ -D__UM_HOST__
#This will adjust *FLAGS accordingly to the platform. include $(srctree)/$(ARCH_DIR)/Makefile-os-$(OS) --- a/arch/um/drivers/mconsole.h +++ b/arch/um/drivers/mconsole.h @@ -7,7 +7,7 @@ #ifndef __MCONSOLE_H__ #define __MCONSOLE_H__
-#ifndef __KERNEL__ +#ifdef __UM_HOST__ #include <stdint.h> #define u32 uint32_t #endif --- a/arch/um/include/shared/init.h +++ b/arch/um/include/shared/init.h @@ -40,7 +40,7 @@ typedef int (*initcall_t)(void); typedef void (*exitcall_t)(void);
-#ifndef __KERNEL__ +#ifdef __UM_HOST__ #ifndef __section # define __section(S) __attribute__ ((__section__(#S))) #endif @@ -131,7 +131,7 @@ extern struct uml_param __uml_setup_star #define __uml_postsetup_call __used __section(.uml.postsetup.init) #define __uml_exit_call __used __section(.uml.exitcall.exit)
-#ifndef __KERNEL__ +#ifdef __UM_HOST__
#define __define_initcall(level,fn) \ static initcall_t __initcall_##fn __used \ --- a/arch/um/include/shared/user.h +++ b/arch/um/include/shared/user.h @@ -17,7 +17,7 @@ #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
/* This is to get size_t */ -#ifdef __KERNEL__ +#ifndef __UM_HOST__ #include <linux/types.h> #else #include <stddef.h> --- a/arch/x86/um/shared/sysdep/tls.h +++ b/arch/x86/um/shared/sysdep/tls.h @@ -1,7 +1,7 @@ #ifndef _SYSDEP_TLS_H #define _SYSDEP_TLS_H
-# ifndef __KERNEL__ +#ifdef __UM_HOST__
/* Change name to avoid conflicts with the original one from <asm/ldt.h>, which * may be named user_desc (but in 2.4 and in header matching its API was named @@ -22,11 +22,11 @@ typedef struct um_dup_user_desc { #endif } user_desc_t;
-# else /* __KERNEL__ */ +#else /* __UM_HOST__ */
typedef struct user_desc user_desc_t;
-# endif /* __KERNEL__ */ +#endif /* __UM_HOST__ */
extern int os_set_thread_area(user_desc_t *info, int pid); extern int os_get_thread_area(user_desc_t *info, int pid);
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Richard Weinberger richard@nod.at
commit 30b11ee9ae23d78de66b9ae315880af17a64ba83 upstream.
As we got rid of the __KERNEL__ abuse, we can directly include linux/compiler.h now. This also allows gcc 5 to build UML.
Reported-by: Hans-Werner Hilse hwhilse@gmail.com Signed-off-by: Richard Weinberger richard@nod.at Cc: Greg Hackmann ghackmann@google.com Cc: Bernie Innocenti codewiz@google.com Cc: Lorenzo Colitti lorenzo@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/um/include/shared/init.h | 22 +--------------------- 1 file changed, 1 insertion(+), 21 deletions(-)
--- a/arch/um/include/shared/init.h +++ b/arch/um/include/shared/init.h @@ -40,28 +40,8 @@ typedef int (*initcall_t)(void); typedef void (*exitcall_t)(void);
-#ifdef __UM_HOST__ -#ifndef __section -# define __section(S) __attribute__ ((__section__(#S))) -#endif - -#if __GNUC__ == 3 - -#if __GNUC_MINOR__ >= 3 -# define __used __attribute__((__used__)) -#else -# define __used __attribute__((__unused__)) -#endif - -#else -#if __GNUC__ == 4 -# define __used __attribute__((__used__)) -#endif -#endif - -#else #include <linux/compiler.h> -#endif + /* These are for everybody (although not all archs will actually discard it in modules) */ #define __init __section(.init.text)
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jia Zhang zhang.jia@linux.alibaba.com
commit 7e702d17ed138cf4ae7c00e8c00681ed464587c7 upstream.
Commit b94b73733171 ("x86/microcode/intel: Extend BDW late-loading with a revision check") reduced the impact of erratum BDF90 for Broadwell model 79.
The impact can be reduced further by checking the size of the last level cache portion per core.
Tony: "The erratum says the problem only occurs on the large-cache SKUs. So we only need to avoid the update if we are on a big cache SKU that is also running old microcode."
For more details, see erratum BDF90 in document #334165 (Intel Xeon Processor E7-8800/4800 v4 Product Family Specification Update) from September 2017.
Fixes: b94b73733171 ("x86/microcode/intel: Extend BDW late-loading with a revision check") Signed-off-by: Jia Zhang zhang.jia@linux.alibaba.com Signed-off-by: Borislav Petkov bp@suse.de Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Tony Luck tony.luck@intel.com Link: https://lkml.kernel.org/r/1516321542-31161-1-git-send-email-zhang.jia@linux.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/cpu/microcode/intel.c | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-)
--- a/arch/x86/kernel/cpu/microcode/intel.c +++ b/arch/x86/kernel/cpu/microcode/intel.c @@ -87,6 +87,9 @@ MODULE_DESCRIPTION("Microcode Update Dri MODULE_AUTHOR("Tigran Aivazian tigran@aivazian.fsnet.co.uk"); MODULE_LICENSE("GPL");
+/* last level cache size per core */ +static int llc_size_per_core; + static int collect_cpu_info(int cpu_num, struct cpu_signature *csig) { struct cpuinfo_x86 *c = &cpu_data(cpu_num); @@ -273,12 +276,14 @@ static bool is_blacklisted(unsigned int
/* * Late loading on model 79 with microcode revision less than 0x0b000021 - * may result in a system hang. This behavior is documented in item - * BDF90, #334165 (Intel Xeon Processor E7-8800/4800 v4 Product Family). + * and LLC size per core bigger than 2.5MB may result in a system hang. + * This behavior is documented in item BDF90, #334165 (Intel Xeon + * Processor E7-8800/4800 v4 Product Family). */ if (c->x86 == 6 && c->x86_model == 79 && c->x86_mask == 0x01 && + llc_size_per_core > 2621440 && c->microcode < 0x0b000021) { pr_err_once("Erratum BDF90: late loading with revision < 0x0b000021 (0x%x) disabled.\n", c->microcode); pr_err_once("Please consider either early loading through initrd/built-in or a potential BIOS update.\n"); @@ -345,6 +350,15 @@ static struct microcode_ops microcode_in .microcode_fini_cpu = microcode_fini_cpu, };
+static int __init calc_llc_size_per_core(struct cpuinfo_x86 *c) +{ + u64 llc_size = c->x86_cache_size * 1024; + + do_div(llc_size, c->x86_max_cores); + + return (int)llc_size; +} + struct microcode_ops * __init init_intel_microcode(void) { struct cpuinfo_x86 *c = &cpu_data(0); @@ -355,6 +369,8 @@ struct microcode_ops * __init init_intel return NULL; }
+ llc_size_per_core = calc_llc_size_per_core(c); + return &microcode_intel_ops; }
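The arithmetic behind the new condition can be checked with a short userspace sketch. The function names and the EX_ constants are hypothetical; the sketch assumes the cache size is given in KB (as c->x86_cache_size is multiplied by 1024 in the patch) and that the core count is non-zero, and it leaves out the family/model/stepping checks.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_LLC_LIMIT_BYTES 2621440u      /* 2.5 MB, the threshold in the patch */
#define EX_FIXED_REVISION  0x0b000021u   /* first revision without the problem */

/* Cache size arrives in KB; cores is assumed to be non-zero. */
static uint64_t ex_llc_bytes_per_core(unsigned int cache_size_kb,
                                      unsigned int cores)
{
    return (uint64_t)cache_size_kb * 1024 / cores;
}

/* True when late loading would be refused under the two new-style tests. */
static bool ex_late_load_blocked(unsigned int cache_size_kb,
                                 unsigned int cores, uint32_t revision)
{
    return ex_llc_bytes_per_core(cache_size_kb, cores) > EX_LLC_LIMIT_BYTES &&
           revision < EX_FIXED_REVISION;
}
```

For example, a made-up part with 40960 KB of LLC and 8 cores has 5242880 bytes per core and is blocked on old microcode, whereas 20480 KB over 8 cores lands exactly on 2621440 bytes and is not, because the comparison is strictly greater-than.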
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Streetman ddstreet@ieee.org
[ Upstream commit 4ee806d51176ba7b8ff1efd81f271d7252e03a1d ]
When a tcp socket is closed, if it detects that its net namespace is exiting, close immediately and do not wait for FIN sequence.
For normal sockets, a reference is taken to their net namespace, so it will never exit while the socket is open. However, kernel sockets do not take a reference to their net namespace, so it may begin exiting while the kernel socket is still open. In this case if the kernel socket is a tcp socket, it will stay open trying to complete its close sequence. The sock's dst(s) hold a reference to their interface, which are all transferred to the namespace's loopback interface when the real interfaces are taken down. When the namespace tries to take down its loopback interface, it hangs waiting for all references to the loopback interface to release, which results in messages like:
unregister_netdevice: waiting for lo to become free. Usage count = 1
These messages continue until the socket finally times out and closes. Since the net namespace cleanup holds the net_mutex while calling its registered pernet callbacks, any new net namespace initialization is blocked until the current net namespace finishes exiting.
After this change, the tcp socket notices the exiting net namespace, and closes immediately, releasing its dst(s) and their reference to the loopback interface, which lets the net namespace continue exiting.
Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=97811 Signed-off-by: Dan Streetman ddstreet@canonical.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/net_namespace.h | 10 ++++++++++ net/ipv4/tcp.c | 3 +++ net/ipv4/tcp_timer.c | 15 +++++++++++++++ 3 files changed, 28 insertions(+)
--- a/include/net/net_namespace.h +++ b/include/net/net_namespace.h @@ -200,6 +200,11 @@ int net_eq(const struct net *net1, const return net1 == net2; }
+static inline int check_net(const struct net *net) +{ + return atomic_read(&net->count) != 0; +} + void net_drop_ns(void *);
#else @@ -223,6 +228,11 @@ int net_eq(const struct net *net1, const { return 1; } + +static inline int check_net(const struct net *net) +{ + return 1; +}
#define net_drop_ns NULL #endif --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -2182,6 +2182,9 @@ adjudge_to_death: tcp_send_active_reset(sk, GFP_ATOMIC); NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_TCPABORTONMEMORY); + } else if (!check_net(sock_net(sk))) { + /* Not possible to send reset; just close */ + tcp_set_state(sk, TCP_CLOSE); } }
--- a/net/ipv4/tcp_timer.c +++ b/net/ipv4/tcp_timer.c @@ -46,11 +46,19 @@ static void tcp_write_err(struct sock *s * to prevent DoS attacks. It is called when a retransmission timeout * or zero probe timeout occurs on orphaned socket. * + * Also close if our net namespace is exiting; in that case there is no + * hope of ever communicating again since all netns interfaces are already + * down (or about to be down), and we need to release our dst references, + * which have been moved to the netns loopback interface, so the namespace + * can finish exiting. This condition is only possible if we are a kernel + * socket, as those do not hold references to the namespace. + * * Criteria is still not confirmed experimentally and may change. * We kill the socket, if: * 1. If number of orphaned sockets exceeds an administratively configured * limit. * 2. If we have strong memory pressure. + * 3. If our net namespace is exiting. */ static int tcp_out_of_resources(struct sock *sk, bool do_reset) { @@ -79,6 +87,13 @@ static int tcp_out_of_resources(struct s NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_TCPABORTONMEMORY); return 1; } + + if (!check_net(sock_net(sk))) { + /* Not possible to send reset; just close */ + tcp_done(sk); + return 1; + } + return 0; }
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexey Kodanev alexey.kodanev@oracle.com
[ Upstream commit dd5684ecae3bd8e44b644f50e2c12c7e57fdfef5 ]
ccid2_hc_tx_rto_expire() timer callback always restarts the timer again and can run indefinitely (unless it is stopped outside), and after commit 120e9dabaf55 ("dccp: defer ccid_hc_tx_delete() at dismantle time"), which moved ccid_hc_tx_delete() (also includes sk_stop_timer()) from dccp_destroy_sock() to sk_destruct(), this started to happen quite often. The timer prevents releasing the socket, as a result, sk_destruct() won't be called.
Found with LTP/dccp_ipsec tests running on the bonding device, which later couldn't be unloaded after the tests were completed:
unregister_netdevice: waiting for bond0 to become free. Usage count = 148
Fixes: 2a91aa396739 ("[DCCP] CCID2: Initial CCID2 (TCP-Like) implementation") Signed-off-by: Alexey Kodanev alexey.kodanev@oracle.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/dccp/ccids/ccid2.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/net/dccp/ccids/ccid2.c +++ b/net/dccp/ccids/ccid2.c @@ -140,6 +140,9 @@ static void ccid2_hc_tx_rto_expire(unsig
ccid2_pr_debug("RTO_EXPIRE\n");
+ if (sk->sk_state == DCCP_CLOSED) + goto out; + /* back-off timer */ hc->tx_rto <<= 1; if (hc->tx_rto > DCCP_RTO_MAX)
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Felix Fietkau nbd@nbd.name
[ Upstream commit ad23b750933ea7bf962678972a286c78a8fa36aa ]
Commit "net: igmp: Use correct source address on IGMPv3 reports" introduced a check to validate the source address of locally generated IGMPv3 packets. Instead of checking the local interface address directly, it uses inet_ifa_match(fl4->saddr, ifa), which checks if the address is on the local subnet (or equal to the point-to-point address if used).
This breaks for point-to-point interfaces, so check against ifa->ifa_local directly.
Cc: Kevin Cernekee cernekee@chromium.org Fixes: a46182b00290 ("net: igmp: Use correct source address on IGMPv3 reports") Reported-by: Sebastian Gottschall s.gottschall@dd-wrt.com Signed-off-by: Felix Fietkau nbd@nbd.name Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/igmp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/ipv4/igmp.c +++ b/net/ipv4/igmp.c @@ -329,7 +329,7 @@ static __be32 igmpv3_get_srcaddr(struct return htonl(INADDR_ANY);
for_ifa(in_dev) { - if (inet_ifa_match(fl4->saddr, ifa)) + if (fl4->saddr == ifa->ifa_local) return fl4->saddr; } endfor_ifa(in_dev);
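The difference between the two checks can be seen in a simplified model. This is a sketch only: struct ex_ifaddr and both helpers are hypothetical, addresses are plain host-order integers, and the subnet test is assumed to compare against the peer address on point-to-point links, as the commit message describes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified model of an interface address entry. */
struct ex_ifaddr {
    uint32_t local;    /* the interface's own address */
    uint32_t address;  /* same as local, or the peer on point-to-point links */
    uint32_t mask;     /* netmask */
};

/* Old check: is addr inside the subnet described by address/mask? */
static bool ex_subnet_match(uint32_t addr, const struct ex_ifaddr *ifa)
{
    return ((addr ^ ifa->address) & ifa->mask) == 0;
}

/* New check: is addr exactly the interface's own address? */
static bool ex_local_match(uint32_t addr, const struct ex_ifaddr *ifa)
{
    return addr == ifa->local;
}
```

On a point-to-point link with local 10.0.0.1, peer 10.0.0.2 and a /32 mask, the subnet test rejects the interface's own address while the exact comparison accepts it. On an ordinary /24 LAN the subnet test also accepts neighbouring addresses that are not configured locally.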
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Craig Gallek kraig@google.com
commit d9b3fca27385eafe61c3ca6feab6cb1e7dc77482 upstream.
tcp_hdrlen is wasteful if you already have a pointer to struct tcphdr. This splits the size calculation into a helper function that can be used if a struct tcphdr is already available.
Signed-off-by: Craig Gallek kraig@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/tcp.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -29,9 +29,14 @@ static inline struct tcphdr *tcp_hdr(con return (struct tcphdr *)skb_transport_header(skb); }
+static inline unsigned int __tcp_hdrlen(const struct tcphdr *th) +{ + return th->doff * 4; +} + static inline unsigned int tcp_hdrlen(const struct sk_buff *skb) { - return tcp_hdr(skb)->doff * 4; + return __tcp_hdrlen(tcp_hdr(skb)); }
static inline struct tcphdr *inner_tcp_hdr(const struct sk_buff *skb)
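As a reminder of what the helper computes: the TCP data offset field counts 32-bit words, so the header length in bytes is doff * 4. A minimal userspace sketch working on raw header bytes, with the hypothetical name ex_tcp_hdrlen_raw(), could look like this:

```c
#include <stddef.h>

/*
 * The data offset is the upper nibble of byte 12 of the TCP header and
 * is expressed in 32-bit words. The caller must supply at least 13 bytes.
 */
static size_t ex_tcp_hdrlen_raw(const unsigned char *tcp_header)
{
    return (size_t)(tcp_header[12] >> 4) * 4;
}
```

A header without options has a data offset of 5 and therefore a length of 20 bytes; the maximum offset of 15 gives 60 bytes.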
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 7c68d1a6b4db9012790af7ac0f0fdc0d2083422a ]
Without proper validation of DODGY packets, we might very well feed qdisc_pkt_len_init() with invalid GSO packets.
tcp_hdrlen() might access out-of-bound data, so let's use skb_header_pointer() and proper checks.
Whole story is described in commit d0c081b49137 ("flow_dissector: properly cap thoff field")
We have the goal of validating DODGY packets earlier in the stack, so we might very well revert this fix in the future.
Signed-off-by: Eric Dumazet edumazet@google.com Cc: Willem de Bruijn willemb@google.com Cc: Jason Wang jasowang@redhat.com Reported-by: syzbot+9da69ebac7dddd804552@syzkaller.appspotmail.com Acked-by: Jason Wang jasowang@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/dev.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-)
--- a/net/core/dev.c +++ b/net/core/dev.c @@ -2772,10 +2772,21 @@ static void qdisc_pkt_len_init(struct sk hdr_len = skb_transport_header(skb) - skb_mac_header(skb);
/* + transport layer */ - if (likely(shinfo->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6))) - hdr_len += tcp_hdrlen(skb); - else - hdr_len += sizeof(struct udphdr); + if (likely(shinfo->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6))) { + const struct tcphdr *th; + struct tcphdr _tcphdr; + + th = skb_header_pointer(skb, skb_transport_offset(skb), + sizeof(_tcphdr), &_tcphdr); + if (likely(th)) + hdr_len += __tcp_hdrlen(th); + } else { + struct udphdr _udphdr; + + if (skb_header_pointer(skb, skb_transport_offset(skb), + sizeof(_udphdr), &_udphdr)) + hdr_len += sizeof(struct udphdr); + }
if (shinfo->gso_type & SKB_GSO_DODGY) gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
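The pattern used in this hunk, asking for a header through a bounds-checked accessor and skipping the length adjustment when it is not available, can be sketched on a flat buffer. Both functions below are hypothetical analogues: the real skb_header_pointer() works on an sk_buff and may return a pointer into the packet instead of copying, which this sketch does not model.

```c
#include <stddef.h>
#include <string.h>

/*
 * Flat-buffer analogue of a checked header accessor: copy len bytes at
 * offset into the caller's buffer, or return NULL if they are not there.
 */
static const void *ex_header_pointer(const unsigned char *pkt, size_t pkt_len,
                                     size_t offset, size_t len, void *copy)
{
    if (offset > pkt_len || len > pkt_len - offset)
        return NULL;
    memcpy(copy, pkt + offset, len);
    return copy;
}

/* Returns the TCP header length, or 0 if the fixed header is truncated. */
static size_t ex_tcp_hdr_len(const unsigned char *pkt, size_t pkt_len,
                             size_t transport_off)
{
    unsigned char th[20];   /* fixed part of a TCP header */
    const unsigned char *p;

    p = ex_header_pointer(pkt, pkt_len, transport_off, sizeof(th), th);
    if (!p)
        return 0;
    return (size_t)(p[12] >> 4) * 4;
}
```

With a 14-byte link header in front, a 34-byte packet yields a 20-byte TCP header, while the same packet cut to 30 bytes yields 0 rather than a read past the end of the buffer.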
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guillaume Nault g.nault@alphalink.fr
[ Upstream commit 02612bb05e51df8489db5e94d0cf8d1c81f87b0c ]
In pppoe_sendmsg(), reserving dev->hard_header_len bytes of headroom was probably fine before the introduction of ->needed_headroom in commit f5184d267c1a ("net: Allow netdevices to specify needed head/tailroom").
But now, virtual devices typically advertise the size of their overhead in dev->needed_headroom, so we must also take it into account in skb_reserve(). Allocation size of skb is also updated to take dev->needed_tailroom into account and replace the arbitrary 32 bytes with the real size of a PPPoE header.
This issue was discovered by syzbot, who connected a pppoe socket to a gre device which had dev->header_ops->create == ipgre_header and dev->hard_header_len == 0. Therefore, PPPoE didn't reserve any headroom, and dev_hard_header() crashed when ipgre_header() tried to prepend its header to skb->data.
skbuff: skb_under_panic: text:000000001d390b3a len:31 put:24 head:00000000d8ed776f data:000000008150e823 tail:0x7 end:0xc0 dev:gre0 ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:104! invalid opcode: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 1 PID: 3670 Comm: syzkaller801466 Not tainted 4.15.0-rc7-next-20180115+ #97 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:skb_panic+0x162/0x1f0 net/core/skbuff.c:100 RSP: 0018:ffff8801d9bd7840 EFLAGS: 00010282 RAX: 0000000000000083 RBX: ffff8801d4f083c0 RCX: 0000000000000000 RDX: 0000000000000083 RSI: 1ffff1003b37ae92 RDI: ffffed003b37aefc RBP: ffff8801d9bd78a8 R08: 1ffff1003b37ae8a R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff86200de0 R13: ffffffff84a981ad R14: 0000000000000018 R15: ffff8801d2d34180 FS: 00000000019c4880(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000208bc000 CR3: 00000001d9111001 CR4: 00000000001606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: skb_under_panic net/core/skbuff.c:114 [inline] skb_push+0xce/0xf0 net/core/skbuff.c:1714 ipgre_header+0x6d/0x4e0 net/ipv4/ip_gre.c:879 dev_hard_header include/linux/netdevice.h:2723 [inline] pppoe_sendmsg+0x58e/0x8b0 drivers/net/ppp/pppoe.c:890 sock_sendmsg_nosec net/socket.c:630 [inline] sock_sendmsg+0xca/0x110 net/socket.c:640 sock_write_iter+0x31a/0x5d0 net/socket.c:909 call_write_iter include/linux/fs.h:1775 [inline] do_iter_readv_writev+0x525/0x7f0 fs/read_write.c:653 do_iter_write+0x154/0x540 fs/read_write.c:932 vfs_writev+0x18a/0x340 fs/read_write.c:977 do_writev+0xfc/0x2a0 fs/read_write.c:1012 SYSC_writev fs/read_write.c:1085 [inline] SyS_writev+0x27/0x30 fs/read_write.c:1082 entry_SYSCALL_64_fastpath+0x29/0xa0
Admittedly PPPoE shouldn't be allowed to run on non Ethernet-like interfaces, but reserving space for ->needed_headroom is a more fundamental issue that needs to be addressed first.
Same problem exists for __pppoe_xmit(), which also needs to take dev->needed_headroom into account in skb_cow_head().
Fixes: f5184d267c1a ("net: Allow netdevices to specify needed head/tailroom") Reported-by: syzbot+ed0838d0fa4c4f2b528e20286e6dc63effc7c14d@syzkaller.appspotmail.com Signed-off-by: Guillaume Nault g.nault@alphalink.fr Reviewed-by: Xin Long lucien.xin@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ppp/pppoe.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
--- a/drivers/net/ppp/pppoe.c +++ b/drivers/net/ppp/pppoe.c @@ -830,6 +830,7 @@ static int pppoe_sendmsg(struct kiocb *i struct pppoe_hdr *ph; struct net_device *dev; char *start; + int hlen;
lock_sock(sk); if (sock_flag(sk, SOCK_DEAD) || !(sk->sk_state & PPPOX_CONNECTED)) { @@ -848,16 +849,16 @@ static int pppoe_sendmsg(struct kiocb *i if (total_len > (dev->mtu + dev->hard_header_len)) goto end;
- - skb = sock_wmalloc(sk, total_len + dev->hard_header_len + 32, - 0, GFP_KERNEL); + hlen = LL_RESERVED_SPACE(dev); + skb = sock_wmalloc(sk, hlen + sizeof(*ph) + total_len + + dev->needed_tailroom, 0, GFP_KERNEL); if (!skb) { error = -ENOMEM; goto end; }
/* Reserve space for headers. */ - skb_reserve(skb, dev->hard_header_len); + skb_reserve(skb, hlen); skb_reset_network_header(skb);
skb->dev = dev; @@ -918,7 +919,7 @@ static int __pppoe_xmit(struct sock *sk, /* Copy the data if there is no space for the header or if it's * read-only. */ - if (skb_cow_head(skb, sizeof(*ph) + dev->hard_header_len)) + if (skb_cow_head(skb, LL_RESERVED_SPACE(dev) + sizeof(*ph))) goto abort;
__skb_push(skb, sizeof(*ph));
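To make the headroom numbers concrete, here is a sketch of the rounding that LL_RESERVED_SPACE() is understood to perform. The formula and the alignment constant of 16 are written from memory of include/linux/netdevice.h and should be treated as assumptions; the ex_ names are local to this example.

```c
/* Assumed alignment unit used when reserving link-layer headroom. */
#define EX_HH_DATA_MOD 16u

/*
 * Sum the device's hard header length and its extra headroom, round down
 * to the alignment unit, then add one full unit on top.
 */
static unsigned int ex_ll_reserved_space(unsigned int hard_header_len,
                                         unsigned int needed_headroom)
{
    return ((hard_header_len + needed_headroom) & ~(EX_HH_DATA_MOD - 1u)) +
           EX_HH_DATA_MOD;
}
```

Taking the gre0 case from the report (hard_header_len == 0) and using 24 bytes as an illustrative needed_headroom, matching the 24-byte push in the trace, this reserves 32 bytes where the old code reserved none. A plain Ethernet device with a 14-byte header and no extra headroom gets 16.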
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xin Long lucien.xin@gmail.com
[ Upstream commit c5006b8aa74599ce19104b31d322d2ea9ff887cc ]
The check in sctp_sockaddr_af is not robust enough to forbid binding a v4mapped v6 addr on a v4 socket.
Worse, the v4 socket's bind_verify would not convert this v4mapped v6 addr to a v4 addr. syzbot even reported a crash as the v4 socket bound a v6 addr.
This patch is to fix it by doing the common sa.sa_family check first, then AF_INET check for v4mapped v6 addrs.
Fixes: 7dab83de50c7 ("sctp: Support ipv6only AF_INET6 sockets.")
Reported-by: syzbot+7b7b518b1228d2743963@syzkaller.appspotmail.com
Acked-by: Neil Horman nhorman@tuxdriver.com
Signed-off-by: Xin Long lucien.xin@gmail.com
Acked-by: Marcelo Ricardo Leitner marcelo.leitner@gmail.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/sctp/socket.c |   14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -333,16 +333,14 @@ static struct sctp_af *sctp_sockaddr_af(
 	if (len < sizeof (struct sockaddr))
 		return NULL;
 
+	if (!opt->pf->af_supported(addr->sa.sa_family, opt))
+		return NULL;
+
 	/* V4 mapped address are really of AF_INET family */
 	if (addr->sa.sa_family == AF_INET6 &&
-	    ipv6_addr_v4mapped(&addr->v6.sin6_addr)) {
-		if (!opt->pf->af_supported(AF_INET, opt))
-			return NULL;
-	} else {
-		/* Does this PF support this AF? */
-		if (!opt->pf->af_supported(addr->sa.sa_family, opt))
-			return NULL;
-	}
+	    ipv6_addr_v4mapped(&addr->v6.sin6_addr) &&
+	    !opt->pf->af_supported(AF_INET, opt))
+		return NULL;
 
 	/* If we get this far, af is valid. */
 	af = sctp_get_af_specific(addr->sa.sa_family);
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xin Long lucien.xin@gmail.com
[ Upstream commit a0ff660058b88d12625a783ce9e5c1371c87951f ]
After commit cea0cc80a677 ("sctp: use the right sk after waking up from wait_buf sleep"), sctp_wait_for_sndbuf may switch to locking another sk if the asoc has been peeled off.
However, the asoc's new sk could already have been closed elsewhere, because this code runs in the sendmsg context of the old sk, which cannot prevent the new sk from closing. If the last refcnt on the new sk is held by this asoc, then once the asoc is put the new sk is freed while still under its own lock.
This patch reverts that commit, but fixes the old issue by returning an error under the old sk's lock.
Fixes: cea0cc80a677 ("sctp: use the right sk after waking up from wait_buf sleep")
Reported-by: syzbot+ac6ea7baa4432811eb50@syzkaller.appspotmail.com
Signed-off-by: Xin Long lucien.xin@gmail.com
Acked-by: Neil Horman nhorman@tuxdriver.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/sctp/socket.c |   16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -83,7 +83,7 @@
 static int sctp_writeable(struct sock *sk);
 static void sctp_wfree(struct sk_buff *skb);
 static int sctp_wait_for_sndbuf(struct sctp_association *asoc, long *timeo_p,
-				size_t msg_len, struct sock **orig_sk);
+				size_t msg_len);
 static int sctp_wait_for_packet(struct sock *sk, int *err, long *timeo_p);
 static int sctp_wait_for_connect(struct sctp_association *, long *timeo_p);
 static int sctp_wait_for_accept(struct sock *sk, long timeo);
@@ -1948,7 +1948,7 @@ static int sctp_sendmsg(struct kiocb *io
 	timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
 	if (!sctp_wspace(asoc)) {
 		/* sk can be changed by peel off when waiting for buf. */
-		err = sctp_wait_for_sndbuf(asoc, &timeo, msg_len, &sk);
+		err = sctp_wait_for_sndbuf(asoc, &timeo, msg_len);
 		if (err) {
 			if (err == -ESRCH) {
 				/* asoc is already dead. */
@@ -6981,12 +6981,12 @@ void sctp_sock_rfree(struct sk_buff *skb
 
 /* Helper function to wait for space in the sndbuf. */
 static int sctp_wait_for_sndbuf(struct sctp_association *asoc, long *timeo_p,
-				size_t msg_len, struct sock **orig_sk)
+				size_t msg_len)
 {
 	struct sock *sk = asoc->base.sk;
-	int err = 0;
 	long current_timeo = *timeo_p;
 	DEFINE_WAIT(wait);
+	int err = 0;
 
 	pr_debug("%s: asoc:%p, timeo:%ld, msg_len:%zu\n", __func__, asoc,
 		 *timeo_p, msg_len);
@@ -7015,17 +7015,13 @@ static int sctp_wait_for_sndbuf(struct s
 		release_sock(sk);
 		current_timeo = schedule_timeout(current_timeo);
 		lock_sock(sk);
-		if (sk != asoc->base.sk) {
-			release_sock(sk);
-			sk = asoc->base.sk;
-			lock_sock(sk);
-		}
+		if (sk != asoc->base.sk)
+			goto do_error;
 
 		*timeo_p = current_timeo;
 	}
 
 out:
-	*orig_sk = sk;
 	finish_wait(&asoc->wait, &wait);
 
 	/* Release the association's refcnt. */
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Neil Horman nhorman@tuxdriver.com
[ Upstream commit 848b159835ddef99cc4193083f7e786c3992f580 ]
With the introduction of commit b0eb57cb97e7837ebb746404c2c58c6f536f23fa, it appears that rq->buf_info is improperly handled. While it is heap allocated when an rx queue is set up, and freed when torn down, an old line of code in vmxnet3_rq_destroy was not properly removed, leading to rq->buf_info[0] being set to NULL prior to its being freed, causing a memory leak, which eventually exhausts the system on repeated create/destroy operations (for example, when the mtu of a vmxnet3 interface is changed frequently).
The fix is pretty straightforward: just move the NULL set to after the free.
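The leak pattern can be reproduced in a few lines of user-space C. This is a sketch under assumptions, not the driver code: model_rq, model_setup, destroy_buggy and destroy_fixed are hypothetical names, malloc()/free() stand in for the coherent DMA allocation, and buf_info[1] is assumed to point into the same allocation as buf_info[0].

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of the rx queue: both entries share one allocation. */
struct model_rq {
	void *buf_info[2];
};

/* One allocation; buf_info[1] points n0 bytes into it. */
static int model_setup(struct model_rq *rq, size_t n0, size_t n1)
{
	char *base = malloc(n0 + n1);

	if (!base)
		return -1;
	rq->buf_info[0] = base;
	rq->buf_info[1] = base + n0;
	return 0;
}

/* Buggy order: the pointers are cleared before the free is attempted. */
static int destroy_buggy(struct model_rq *rq)
{
	int freed = 0;

	for (int i = 0; i < 2; i++)
		rq->buf_info[i] = NULL;	/* stale line: base pointer is lost here */

	if (rq->buf_info[0]) {		/* never true any more */
		free(rq->buf_info[0]);
		freed = 1;
	}
	return freed;			/* 0 means the allocation leaked */
}

/* Fixed order: free first, clear both pointers afterwards. */
static int destroy_fixed(struct model_rq *rq)
{
	int freed = 0;

	if (rq->buf_info[0]) {
		free(rq->buf_info[0]);
		rq->buf_info[0] = rq->buf_info[1] = NULL;
		freed = 1;
	}
	return freed;
}
```

Each call to destroy_buggy() after model_setup() drops one allocation, which matches the slow exhaustion described for repeated create/destroy cycles.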
Tested by myself with successful results
Applies to net, and should likely be queued for stable, please
Signed-off-by: Neil Horman nhorman@tuxdriver.com
Reported-By: boyang@redhat.com
CC: boyang@redhat.com
CC: Shrikrishna Khare skhare@vmware.com
CC: "VMware, Inc." pv-drivers@vmware.com
CC: David S. Miller davem@davemloft.net
Acked-by: Shrikrishna Khare skhare@vmware.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/net/vmxnet3/vmxnet3_drv.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/vmxnet3/vmxnet3_drv.c
+++ b/drivers/net/vmxnet3/vmxnet3_drv.c
@@ -1420,7 +1420,6 @@ static void vmxnet3_rq_destroy(struct vm
 					  rq->rx_ring[i].basePA);
 			rq->rx_ring[i].base = NULL;
 		}
-		rq->buf_info[i] = NULL;
 	}
 
 	if (rq->comp_ring.base) {
@@ -1435,6 +1434,7 @@ static void vmxnet3_rq_destroy(struct vm
 			(rq->rx_ring[0].size + rq->rx_ring[1].size);
 		dma_free_coherent(&adapter->pdev->dev, sz, rq->buf_info[0],
 				  rq->buf_info_pa);
+		rq->buf_info[0] = rq->buf_info[1] = NULL;
 	}
 }
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jim Westfall jwestfall@surrealistic.net
[ Upstream commit 096b9854c04df86f03b38a97d40b6506e5730919 ]
Use n->primary_key instead of pkey to account for the possibility that a neigh constructor function may have modified the primary_key value.
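A toy model shows why the entry's own key has to be hashed once a constructor may rewrite it. Everything here is hypothetical and for illustration only: model_neigh, model_hash (a plain bit mask, not the kernel's hash), model_constructor (which always rewrites the key to 0) and the two bucket_* helpers.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_BUCKETS 8u

/* Hypothetical neighbour entry holding only its key. */
struct model_neigh {
	uint32_t primary_key;
};

/* Toy hash: low bits of the key select the bucket. */
static unsigned int model_hash(uint32_t key)
{
	return key & (MODEL_BUCKETS - 1u);
}

/* Stand-in for a constructor that rewrites the key. */
static void model_constructor(struct model_neigh *n)
{
	n->primary_key = 0;
}

/* Old behaviour: bucket chosen from the caller's key. */
static unsigned int bucket_from_caller_key(uint32_t pkey)
{
	struct model_neigh n = { .primary_key = pkey };

	model_constructor(&n);
	return model_hash(pkey);
}

/* New behaviour: bucket chosen from the key stored in the entry. */
static unsigned int bucket_from_entry_key(uint32_t pkey)
{
	struct model_neigh n = { .primary_key = pkey };

	model_constructor(&n);
	return model_hash(n.primary_key);
}
```

With the old behaviour the entry would sit in the bucket for the caller's key while carrying a different key, so a later lookup by the rewritten key would search the wrong bucket.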
Signed-off-by: Jim Westfall jwestfall@surrealistic.net
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/core/neighbour.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -508,7 +508,7 @@ struct neighbour *__neigh_create(struct
 	if (atomic_read(&tbl->entries) > (1 << nht->hash_shift))
 		nht = neigh_hash_grow(tbl, nht->hash_shift + 1);
 
-	hash_val = tbl->hash(pkey, dev, nht->hash_rnd) >> (32 - nht->hash_shift);
+	hash_val = tbl->hash(n->primary_key, dev, nht->hash_rnd) >> (32 - nht->hash_shift);
 
 	if (n->parms->dead) {
 		rc = ERR_PTR(-EINVAL);
@@ -520,7 +520,7 @@ struct neighbour *__neigh_create(struct
 	     n1 != NULL;
 	     n1 = rcu_dereference_protected(n1->next,
 			lockdep_is_held(&tbl->lock))) {
-		if (dev == n1->dev && !memcmp(n1->primary_key, pkey, key_len)) {
+		if (dev == n1->dev && !memcmp(n1->primary_key, n->primary_key, key_len)) {
 			if (want_ref)
 				neigh_hold(n1);
 			rc = n1;
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mike Maloney maloney@google.com
[ Upstream commit 749439bfac6e1a2932c582e2699f91d329658196 ]
The logic in __ip6_append_data() assumes that the MTU is at least large enough for the headers. A device's MTU may be adjusted after being added while sendmsg() is processing data, resulting in __ip6_append_data() seeing any MTU. For an mtu smaller than the size of the fragmentation header, the math results in a negative 'maxfraglen', which causes problems when refragmenting any previous skb in the skb_write_queue, leaving it possibly malformed.
Instead, sendmsg now returns EINVAL when the mtu is calculated to be less than IPV6_MIN_MTU.
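The arithmetic can be checked with a short user-space sketch. It assumes the fragment-length computation has the shape shown in model_maxfraglen(), with a 40-byte IPv6 header, an 8-byte fragment header and a minimum MTU of 1280; the MODEL_* constants and both function names are hypothetical, and plain int is used so the negative result is visible (the kernel's own variable types may differ).

```c
#include <assert.h>
#include <errno.h>

#define MODEL_IPV6_HDR_LEN 40	/* assumed size of the fixed IPv6 header */
#define MODEL_FRAG_HDR_LEN 8	/* assumed size of the fragment header */
#define MODEL_IPV6_MIN_MTU 1280	/* assumed value of IPV6_MIN_MTU */

/* Largest fragment length, rounded so the payload is a multiple of 8. */
static int model_maxfraglen(int mtu, int fragheaderlen)
{
	return ((mtu - fragheaderlen) & ~7) + fragheaderlen - MODEL_FRAG_HDR_LEN;
}

/* The added guard: refuse to proceed with an undersized MTU. */
static int model_check_mtu(int mtu)
{
	return mtu < MODEL_IPV6_MIN_MTU ? -EINVAL : 0;
}
```

For mtu = 7 and fragheaderlen = 40 the model gives ((7 - 40) & ~7) + 40 - 8 = -40 + 32 = -8, whereas mtu = 1280 gives 1240 + 32 = 1272. The guard rejects the first case before that value is ever computed.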
Found by syzkaller:

kernel BUG at ./include/linux/skbuff.h:2064!
invalid opcode: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 14216 Comm: syz-executor5 Not tainted 4.13.0-rc4+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
task: ffff8801d0b68580 task.stack: ffff8801ac6b8000
RIP: 0010:__skb_pull include/linux/skbuff.h:2064 [inline]
RIP: 0010:__ip6_make_skb+0x18cf/0x1f70 net/ipv6/ip6_output.c:1617
RSP: 0018:ffff8801ac6bf570 EFLAGS: 00010216
RAX: 0000000000010000 RBX: 0000000000000028 RCX: ffffc90003cce000
RDX: 00000000000001b8 RSI: ffffffff839df06f RDI: ffff8801d9478ca0
RBP: ffff8801ac6bf780 R08: ffff8801cc3f1dbc R09: 0000000000000000
R10: ffff8801ac6bf7a0 R11: 43cb4b7b1948a9e7 R12: ffff8801cc3f1dc8
R13: ffff8801cc3f1d40 R14: 0000000000001036 R15: dffffc0000000000
FS: 00007f43d740c700(0000) GS:ffff8801dc100000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7834984000 CR3: 00000001d79b9000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ip6_finish_skb include/net/ipv6.h:911 [inline]
 udp_v6_push_pending_frames+0x255/0x390 net/ipv6/udp.c:1093
 udpv6_sendmsg+0x280d/0x31a0 net/ipv6/udp.c:1363
 inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:762
 sock_sendmsg_nosec net/socket.c:633 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:643
 SYSC_sendto+0x352/0x5a0 net/socket.c:1750
 SyS_sendto+0x40/0x50 net/socket.c:1718
 entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x4512e9
RSP: 002b:00007f43d740bc08 EFLAGS: 00000216 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00000000007180a8 RCX: 00000000004512e9
RDX: 000000000000002e RSI: 0000000020d08000 RDI: 0000000000000005
RBP: 0000000000000086 R08: 00000000209c1000 R09: 000000000000001c
R10: 0000000000040800 R11: 0000000000000216 R12: 00000000004b9c69
R13: 00000000ffffffff R14: 0000000000000005 R15: 00000000202c2000
Code: 9e 01 fe e9 c5 e8 ff ff e8 7f 9e 01 fe e9 4a ea ff ff 48 89 f7 e8 52 9e 01 fe e9 aa eb ff ff e8 a8 b6 cf fd 0f 0b e8 a1 b6 cf fd <0f> 0b 49 8d 45 78 4d 8d 45 7c 48 89 85 78 fe ff ff 49 8d 85 ba
RIP: __skb_pull include/linux/skbuff.h:2064 [inline] RSP: ffff8801ac6bf570
RIP: __ip6_make_skb+0x18cf/0x1f70 net/ipv6/ip6_output.c:1617 RSP: ffff8801ac6bf570
Reported-by: syzbot syzkaller@googlegroups.com
Signed-off-by: Mike Maloney maloney@google.com
Reviewed-by: Eric Dumazet edumazet@google.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/ipv6/ip6_output.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1214,14 +1214,16 @@ int ip6_append_data(struct sock *sk, int
 		np->cork.tclass = tclass;
 		if (rt->dst.flags & DST_XFRM_TUNNEL)
 			mtu = np->pmtudisc >= IPV6_PMTUDISC_PROBE ?
-			      rt->dst.dev->mtu : dst_mtu(&rt->dst);
+			      READ_ONCE(rt->dst.dev->mtu) : dst_mtu(&rt->dst);
 		else
 			mtu = np->pmtudisc >= IPV6_PMTUDISC_PROBE ?
-			      rt->dst.dev->mtu : dst_mtu(rt->dst.path);
+			      READ_ONCE(rt->dst.dev->mtu) : dst_mtu(rt->dst.path);
 		if (np->frag_size < mtu) {
 			if (np->frag_size)
 				mtu = np->frag_size;
 		}
+		if (mtu < IPV6_MIN_MTU)
+			return -EINVAL;
 		cork->fragsize = mtu;
 		if (dst_allfrag(rt->dst.path))
 			cork->flags |= IPCORK_ALLFRAG;
3.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jim Westfall jwestfall@surrealistic.net
[ Upstream commit cd9ff4de0107c65d69d02253bb25d6db93c3dbc1 ]
Map all lookup neigh keys to INADDR_ANY for loopback/point-to-point devices to avoid making an entry for every remote ip the device needs to talk to.
This used to be the behavior but became broken in a263b3093641f (ipv4: Make neigh lookups directly in output packet path) and was later removed in 0bb4087cbec0 (ipv4: Fix neigh lookup keying over loopback/point-to-point devices) because it was broken.
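The effect of the key mapping can be sketched in user space as follows. This is an illustration only: model_neigh_key and the MODEL_* constants are hypothetical stand-ins for the lookup key selection, IFF_LOOPBACK, IFF_POINTOPOINT and INADDR_ANY.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the interface flags and INADDR_ANY. */
#define MODEL_IFF_LOOPBACK	0x8u
#define MODEL_IFF_POINTOPOINT	0x10u
#define MODEL_INADDR_ANY	0u

/*
 * Pick the neighbour lookup key for a next hop: loopback and
 * point-to-point devices collapse every next hop onto one key.
 */
static uint32_t model_neigh_key(unsigned int dev_flags, uint32_t next_hop)
{
	if (dev_flags & (MODEL_IFF_LOOPBACK | MODEL_IFF_POINTOPOINT))
		return MODEL_INADDR_ANY;
	return next_hop;
}
```

On a point-to-point device, 10.0.0.1 and 10.0.0.2 then share a single entry, while on a device without either flag they keep distinct keys.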
Signed-off-by: Jim Westfall jwestfall@surrealistic.net
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 include/net/arp.h |    3 +++
 net/ipv4/arp.c    |    7 ++++++-
 2 files changed, 9 insertions(+), 1 deletion(-)

--- a/include/net/arp.h
+++ b/include/net/arp.h
@@ -37,6 +37,9 @@ static inline struct neighbour *__ipv4_n
 {
 	struct neighbour *n;
 
+	if (dev->flags & (IFF_LOOPBACK | IFF_POINTOPOINT))
+		key = INADDR_ANY;
+
 	rcu_read_lock_bh();
 	n = __ipv4_neigh_lookup_noref(dev, key);
 	if (n && !atomic_inc_not_zero(&n->refcnt))
--- a/net/ipv4/arp.c
+++ b/net/ipv4/arp.c
@@ -221,11 +221,16 @@ static u32 arp_hash(const void *pkey,
 
 static int arp_constructor(struct neighbour *neigh)
 {
-	__be32 addr = *(__be32 *)neigh->primary_key;
+	__be32 addr;
 	struct net_device *dev = neigh->dev;
 	struct in_device *in_dev;
 	struct neigh_parms *parms;
+	u32 inaddr_any = INADDR_ANY;
 
+	if (dev->flags & (IFF_LOOPBACK | IFF_POINTOPOINT))
+		memcpy(neigh->primary_key, &inaddr_any, arp_tbl.key_len);
+
+	addr = *(__be32 *)neigh->primary_key;
 	rcu_read_lock();
 	in_dev = __in_dev_get_rcu(dev);
 	if (in_dev == NULL) {
On 01/29/2018 05:56 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.18.93 release. There are 52 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
> Responses should be made by Wed Jan 31 12:36:07 UTC 2018. Anything received after that time might be too late.
> The whole patch series can be found in one patch at: kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.18.93-rc1.gz or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-3.18.y and the diffstat can be found below.
> thanks,
> greg k-h
Compiled and booted on my test system. No dmesg regressions.
thanks,
-- Shuah
On Mon, Jan 29, 2018 at 04:58:05PM -0700, Shuah Khan wrote:
> On 01/29/2018 05:56 AM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 3.18.93 release. There are 52 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
> > Responses should be made by Wed Jan 31 12:36:07 UTC 2018. Anything received after that time might be too late.
> > The whole patch series can be found in one patch at: kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.18.93-rc1.gz or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-3.18.y and the diffstat can be found below.
> > thanks,
> > greg k-h
> Compiled and booted on my test system. No dmesg regressions.
Thanks for testing all of these and letting me know.
greg k-h
On 01/29/2018 04:56 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.18.93 release. There are 52 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
> Responses should be made by Wed Jan 31 12:36:07 UTC 2018. Anything received after that time might be too late.
Build results:
	total: 136 pass: 135 fail: 1
Failed builds:
	um:defconfig
Qemu test results:
	total: 112 pass: 112 fail: 0
The build failure is:
In file included from arch/um/kernel/config.c:8:0:
arch/um/include/shared/init.h:43:28: fatal error: linux/compiler.h: No such file or directory
[ several instances ]
Details are available at http://kerneltests.org/builders. Let me know if you need me to bisect.
Thanks,
Guenter
On Tue, Jan 30, 2018 at 06:19:15AM -0800, Guenter Roeck wrote:
> On 01/29/2018 04:56 AM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 3.18.93 release. There are 52 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
> > Responses should be made by Wed Jan 31 12:36:07 UTC 2018. Anything received after that time might be too late.
> Build results:
> 	total: 136 pass: 135 fail: 1
> Failed builds:
> 	um:defconfig
> Qemu test results:
> 	total: 112 pass: 112 fail: 0
> The build failure is:
> In file included from arch/um/kernel/config.c:8:0:
> arch/um/include/shared/init.h:43:28: fatal error: linux/compiler.h: No such file or directory
> [ several instances ]
Crap, I was trying to apply a number of the UM patches that Android relies on for their build systems that they patch their kernel for. I'll go look into those to try to figure out what I got wrong...
thanks,
greg k-h
On Tue, Jan 30, 2018 at 03:51:56PM +0100, Greg Kroah-Hartman wrote:
> On Tue, Jan 30, 2018 at 06:19:15AM -0800, Guenter Roeck wrote:
> > On 01/29/2018 04:56 AM, Greg Kroah-Hartman wrote:
> > > This is the start of the stable review cycle for the 3.18.93 release. There are 52 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
> > > Responses should be made by Wed Jan 31 12:36:07 UTC 2018. Anything received after that time might be too late.
> > Build results:
> > 	total: 136 pass: 135 fail: 1
> > Failed builds:
> > 	um:defconfig
> > Qemu test results:
> > 	total: 112 pass: 112 fail: 0
> > The build failure is:
> > In file included from arch/um/kernel/config.c:8:0:
> > arch/um/include/shared/init.h:43:28: fatal error: linux/compiler.h: No such file or directory
> > [ several instances ]
> Crap, I was trying to apply a number of the UM patches that Android relies on for their build systems that they patch their kernel for. I'll go look into those to try to figure out what I got wrong...
Ok, I can't even build a defconfig for ARCH=um at all, with no patches applied to 3.18. I had to go find a 4.9.4 kernel to even get close to building, gcc7 did really odd things.
Do you have the .config file you use to build this arch with? I looked on the builder site and couldn't seem to find it anywhere, am I just missing something obvious?
thanks,
greg k-h
On Tue, Jan 30, 2018 at 07:51:31PM +0100, Greg Kroah-Hartman wrote:
> On Tue, Jan 30, 2018 at 03:51:56PM +0100, Greg Kroah-Hartman wrote:
> > On Tue, Jan 30, 2018 at 06:19:15AM -0800, Guenter Roeck wrote:
> > > On 01/29/2018 04:56 AM, Greg Kroah-Hartman wrote:
> > > > This is the start of the stable review cycle for the 3.18.93 release. There are 52 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
> > > > Responses should be made by Wed Jan 31 12:36:07 UTC 2018. Anything received after that time might be too late.
> > > Build results:
> > > 	total: 136 pass: 135 fail: 1
> > > Failed builds:
> > > 	um:defconfig
> > > Qemu test results:
> > > 	total: 112 pass: 112 fail: 0
> > > The build failure is:
> > > In file included from arch/um/kernel/config.c:8:0:
> > > arch/um/include/shared/init.h:43:28: fatal error: linux/compiler.h: No such file or directory
> > > [ several instances ]
> > Crap, I was trying to apply a number of the UM patches that Android relies on for their build systems that they patch their kernel for. I'll go look into those to try to figure out what I got wrong...
> Ok, I can't even build a defconfig for ARCH=um at all, with no patches applied to 3.18. I had to go find a 4.9.4 kernel to even get close to building, gcc7 did really odd things.
> Do you have the .config file you use to build this arch with? I looked on the builder site and couldn't seem to find it anywhere, am I just missing something obvious?
mkdir /tmp/build
make O=/tmp/build ARCH=um SUBARCH=x86_64 defconfig
make O=/tmp/build ARCH=um SUBARCH=x86_64 -j30
The O= is essential; in-tree builds are fine.
Also, it turns out you are correct; 3.18.92 fails to build for me as well if I use a recent compiler. It does build with the compiler from Poky 1.3. Bisect points to commit a3a8321bf0f00 ("um: Remove copy&paste code from init.h"); bisect log is attached. Not sure if it is worth fixing it, though. Maybe I should just stop building it for 3.18 instead. Thoughts?
Guenter
---
# bad: [9ea3053b8236d87e0716496e6cd90242aadc2f63] Linux 3.18.93-rc1
# good: [a5d35deca214e095bf9d1745aa6c00dd7ced0517] Linux 3.18.92
git bisect start 'HEAD' 'v3.18.92'
# good: [b70017b84be2dbab6c7d47c898d8b0c298d1924f] netfilter: nf_ct_expect: remove the redundant slash when policy name is empty
git bisect good b70017b84be2dbab6c7d47c898d8b0c298d1924f
# bad: [dcdf22915ebd63044e32efe6f78511c7360fb105] x86/microcode/intel: Extend BDW late-loading further with LLC size check
git bisect bad dcdf22915ebd63044e32efe6f78511c7360fb105
# good: [eb71bc9be0cb6615f0743571b560e773ad3f58af] reiserfs: don't preallocate blocks for extended attributes
git bisect good eb71bc9be0cb6615f0743571b560e773ad3f58af
# good: [3db3a49f92e159173ca108f01559d45b95f66933] um: link vmlinux with -no-pie
git bisect good 3db3a49f92e159173ca108f01559d45b95f66933
# good: [05fbee5e81254471451ee0232edaead9f71329b2] um: Stop abusing __KERNEL__
git bisect good 05fbee5e81254471451ee0232edaead9f71329b2
# bad: [a3a8321bf0f001b21178712b5e1693f54afe95db] um: Remove copy&paste code from init.h
git bisect bad a3a8321bf0f001b21178712b5e1693f54afe95db
# first bad commit: [a3a8321bf0f001b21178712b5e1693f54afe95db] um: Remove copy&paste code from init.h
On Tue, Jan 30, 2018 at 11:48:58AM -0800, Guenter Roeck wrote:
> On Tue, Jan 30, 2018 at 07:51:31PM +0100, Greg Kroah-Hartman wrote:
> > On Tue, Jan 30, 2018 at 03:51:56PM +0100, Greg Kroah-Hartman wrote:
> > > On Tue, Jan 30, 2018 at 06:19:15AM -0800, Guenter Roeck wrote:
> > > > On 01/29/2018 04:56 AM, Greg Kroah-Hartman wrote:
> > > > > This is the start of the stable review cycle for the 3.18.93 release. There are 52 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
> > > > > Responses should be made by Wed Jan 31 12:36:07 UTC 2018. Anything received after that time might be too late.
> > > > Build results:
> > > > 	total: 136 pass: 135 fail: 1
> > > > Failed builds:
> > > > 	um:defconfig
> > > > Qemu test results:
> > > > 	total: 112 pass: 112 fail: 0
> > > > The build failure is:
> > > > In file included from arch/um/kernel/config.c:8:0:
> > > > arch/um/include/shared/init.h:43:28: fatal error: linux/compiler.h: No such file or directory
> > > > [ several instances ]
> > > Crap, I was trying to apply a number of the UM patches that Android relies on for their build systems that they patch their kernel for. I'll go look into those to try to figure out what I got wrong...
> > Ok, I can't even build a defconfig for ARCH=um at all, with no patches applied to 3.18. I had to go find a 4.9.4 kernel to even get close to building, gcc7 did really odd things.
> > Do you have the .config file you use to build this arch with? I looked on the builder site and couldn't seem to find it anywhere, am I just missing something obvious?
> mkdir /tmp/build
> make O=/tmp/build ARCH=um SUBARCH=x86_64 defconfig
> make O=/tmp/build ARCH=um SUBARCH=x86_64 -j30
> The O= is essential; in-tree builds are fine.
In-tree builds do not work for me either :(
> Also, it turns out you are correct; 3.18.92 fails to build for me as well if I use a recent compiler. It does build with the compiler from Poky 1.3. Bisect points to commit a3a8321bf0f00 ("um: Remove copy&paste code from init.h"); bisect log is attached. Not sure if it is worth fixing it, though. Maybe I should just stop building it for 3.18 instead. Thoughts?
Let me pull the um patches out of this release, queue them up for the next one after this, and try to figure out what is going on in a more relaxed way. I don't want the "real" bugfixes that are queued up right now to be stopped from being released due to this odd arch.
I'll look into this later this week, no need for you to pull this out of your build system just yet.
thanks,
greg k-h