Linux-stable-mirror February 2019

linux-stable-mirror@lists.linaro.org

332 participants
869 discussions

[PATCH for-4.19 00/12] erofs fixes for linux-4.19.y

by Gao Xiang

This series backports bugfixes already merged upstream. We hit these issues in our commercial products; they are serious and should be fixed immediately.

Note that it also includes some xarray modifications, since upcoming patches depend heavily on them; including them now should reduce conflicts later.

All patches have been tested again as a whole.

Thanks,
Gao Xiang

Chen Gong (1):
  staging: erofs: replace BUG_ON with DBG_BUGON in data.c

Gao Xiang (11):
  staging: erofs: fix a bug when appling cache strategy
  staging: erofs: complete error handing of z_erofs_do_read_page
  staging: erofs: drop multiref support temporarily
  staging: erofs: remove the redundant d_rehash() for the root dentry
  staging: erofs: fix race when the managed cache is enabled
  staging: erofs: atomic_cond_read_relaxed on ref-locked workgroup
  staging: erofs: fix `erofs_workgroup_{try_to_freeze, unfreeze}'
  staging: erofs: add a full barrier in erofs_workgroup_unfreeze
  staging: erofs: {dir,inode,super}.c: rectify BUG_ONs
  staging: erofs: unzip_{pagevec.h,vle.c}: rectify BUG_ONs
  staging: erofs: unzip_vle_lz4.c,utils.c: rectify BUG_ONs

 drivers/staging/erofs/data.c          |  31 ++++---
 drivers/staging/erofs/dir.c           |   7 +-
 drivers/staging/erofs/inode.c         |  10 ++-
 drivers/staging/erofs/internal.h      |  71 ++++++++++------
 drivers/staging/erofs/super.c         |  19 ++---
 drivers/staging/erofs/unzip_pagevec.h |   2 +-
 drivers/staging/erofs/unzip_vle.c     |  97 ++++++++--------------
 drivers/staging/erofs/unzip_vle.h     |  12 +--
 drivers/staging/erofs/unzip_vle_lz4.c |   2 +-
 drivers/staging/erofs/utils.c         | 150 +++++++++++++++++++++++-----------
 include/linux/xarray.h                |  48 +++++++++++
 11 files changed, 271 insertions(+), 178 deletions(-)

--
2.14.4

6 years, 10 months

[PATCH AUTOSEL 4.20 01/81] ARM: OMAP: dts: N950/N9: fix onenand timings

by Sasha Levin

From: Aaro Koskinen <aaro.koskinen(a)iki.fi>

[ Upstream commit 8443e4843e1c2594bf5664e1d993a1be71d1befb ]

Commit a758f50f10cf ("mtd: onenand: omap2: Configure driver from DT")
started using DT specified timings for GPMC, and as a result the
OneNAND stopped working on N950/N9 as we had wrong values in the DT.
Fix by updating the values to bootloader timings that have been tested
to be working on both Nokia N950 and N9.

Fixes: a758f50f10cf ("mtd: onenand: omap2: Configure driver from DT")
Signed-off-by: Aaro Koskinen <aaro.koskinen(a)iki.fi>
Signed-off-by: Tony Lindgren <tony(a)atomide.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
 arch/arm/boot/dts/omap3-n950-n9.dtsi | 42 ++++++++++++++++++----------
 1 file changed, 28 insertions(+), 14 deletions(-)

diff --git a/arch/arm/boot/dts/omap3-n950-n9.dtsi b/arch/arm/boot/dts/omap3-n950-n9.dtsi
index 0d9b85317529b..e142e6c70a59f 100644
--- a/arch/arm/boot/dts/omap3-n950-n9.dtsi
+++ b/arch/arm/boot/dts/omap3-n950-n9.dtsi
@@ -370,6 +370,19 @@
 		compatible = "ti,omap2-onenand";
 		reg = <0 0 0x20000>;	/* CS0, offset 0, IO size 128K */

+		/*
+		 * These timings are based on CONFIG_OMAP_GPMC_DEBUG=y reported
+		 * bootloader set values when booted with v4.19 using both N950
+		 * and N9 devices (OneNAND Manufacturer: Samsung):
+		 *
+		 *   gpmc cs0 before gpmc_cs_program_settings:
+		 *   cs0 GPMC_CS_CONFIG1: 0xfd001202
+		 *   cs0 GPMC_CS_CONFIG2: 0x00181800
+		 *   cs0 GPMC_CS_CONFIG3: 0x00030300
+		 *   cs0 GPMC_CS_CONFIG4: 0x18001804
+		 *   cs0 GPMC_CS_CONFIG5: 0x03171d1d
+		 *   cs0 GPMC_CS_CONFIG6: 0x97080000
+		 */
 		gpmc,sync-read;
 		gpmc,sync-write;
 		gpmc,burst-length = <16>;
@@ -379,26 +392,27 @@
 		gpmc,device-width = <2>;
 		gpmc,mux-add-data = <2>;
 		gpmc,cs-on-ns = <0>;
-		gpmc,cs-rd-off-ns = <87>;
-		gpmc,cs-wr-off-ns = <87>;
+		gpmc,cs-rd-off-ns = <122>;
+		gpmc,cs-wr-off-ns = <122>;
 		gpmc,adv-on-ns = <0>;
-		gpmc,adv-rd-off-ns = <10>;
-		gpmc,adv-wr-off-ns = <10>;
-		gpmc,oe-on-ns = <15>;
-		gpmc,oe-off-ns = <87>;
+		gpmc,adv-rd-off-ns = <15>;
+		gpmc,adv-wr-off-ns = <15>;
+		gpmc,oe-on-ns = <20>;
+		gpmc,oe-off-ns = <122>;
 		gpmc,we-on-ns = <0>;
-		gpmc,we-off-ns = <87>;
-		gpmc,rd-cycle-ns = <112>;
-		gpmc,wr-cycle-ns = <112>;
-		gpmc,access-ns = <81>;
+		gpmc,we-off-ns = <122>;
+		gpmc,rd-cycle-ns = <148>;
+		gpmc,wr-cycle-ns = <148>;
+		gpmc,access-ns = <117>;
 		gpmc,page-burst-access-ns = <15>;
 		gpmc,bus-turnaround-ns = <0>;
 		gpmc,cycle2cycle-delay-ns = <0>;
 		gpmc,wait-monitoring-ns = <0>;
-		gpmc,clk-activation-ns = <5>;
-		gpmc,wr-data-mux-bus-ns = <30>;
-		gpmc,wr-access-ns = <81>;
-		gpmc,sync-clk-ps = <15000>;
+		gpmc,clk-activation-ns = <10>;
+		gpmc,wr-data-mux-bus-ns = <40>;
+		gpmc,wr-access-ns = <117>;
+
+		gpmc,sync-clk-ps = <15000>; /* TBC; Where this value came? */

 		/*
 		 * MTD partition table corresponding to Nokia's MeeGo 1.2
--
2.19.1

6 years, 10 months

[PATCH AUTOSEL 4.19 01/64] ARM: OMAP: dts: N950/N9: fix onenand timings

by Sasha Levin

6 years, 10 months

Re: [tip:x86/cpu] x86/CPU/AMD: Set the CPB bit unconditionally on F17h

by Borislav Petkov

On Fri, Jan 18, 2019 at 07:48:59AM -0800, tip-bot for Jiaxun Yang wrote:
> Commit-ID:  0237199186e7a4aa5310741f0a6498a20c820fd7
> Gitweb:     https://git.kernel.org/tip/0237199186e7a4aa5310741f0a6498a20c820fd7
> Author:     Jiaxun Yang <jiaxun.yang(a)flygoat.com>
> AuthorDate: Tue, 20 Nov 2018 11:00:18 +0800
> Committer:  Borislav Petkov <bp(a)suse.de>
> CommitDate: Fri, 18 Jan 2019 16:44:03 +0100
>
> x86/CPU/AMD: Set the CPB bit unconditionally on F17h
>
> Some F17h models do not have CPB set in CPUID even though the CPU
> supports it. Set the feature bit unconditionally on all F17h.
>
>  [ bp: Rewrite commit message and patch. ]
>
> Signed-off-by: Jiaxun Yang <jiaxun.yang(a)flygoat.com>
> Signed-off-by: Borislav Petkov <bp(a)suse.de>
> Acked-by: Tom Lendacky <thomas.lendacky(a)amd.com>
> Cc: "H. Peter Anvin" <hpa(a)zytor.com>
> Cc: Ingo Molnar <mingo(a)redhat.com>
> Cc: Sherry Hurwitz <sherry.hurwitz(a)amd.com>
> Cc: Suravee Suthikulpanit <suravee.suthikulpanit(a)amd.com>
> Cc: Thomas Gleixner <tglx(a)linutronix.de>
> Cc: x86-ml <x86(a)kernel.org>
> Link: https://lkml.kernel.org/r/20181120030018.5185-1-jiaxun.yang@flygoat.com
> ---
>  arch/x86/kernel/cpu/amd.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index 69f6bbb41be0..01004bfb1a1b 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -819,11 +819,9 @@ static void init_amd_bd(struct cpuinfo_x86 *c)
>  static void init_amd_zn(struct cpuinfo_x86 *c)
>  {
>  	set_cpu_cap(c, X86_FEATURE_ZEN);
> -	/*
> -	 * Fix erratum 1076: CPB feature bit not being set in CPUID. It affects
> -	 * all up to and including B1.
> -	 */
> -	if (c->x86_model <= 1 && c->x86_stepping <= 1)
> +
> +	/* Fix erratum 1076: CPB feature bit not being set in CPUID. */
> +	if (!cpu_has(c, X86_FEATURE_CPB))
>  		set_cpu_cap(c, X86_FEATURE_CPB);

Stable folks,

please take this one above into those stable trees which have backported

  f7f3dc00f612 ("x86/cpu/AMD: Fix erratum 1076 (CPB bit)")

Thx.

--
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

6 years, 10 months

[PATCH AUTOSEL 4.20 01/72] vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel

by Sasha Levin

From: Su Yanjun <suyj.fnst(a)cn.fujitsu.com>

[ Upstream commit dd9ee3444014e8f28c0eefc9fffc9ac9c5248c12 ]

Recently we ran a network test over an ipcomp virtual tunnel. We found
that if an ipv4 packet needs fragmentation, the peer can't receive it.

We dug into the code and found that when a packet needs fragmentation,
the smaller fragment is encapsulated by ipip, not ipcomp. So when the
ipip packet goes into xfrm, its skb->dev is not properly set. The ipv4
reassembly code always sets skb->dev to the last fragment's dev. After
ipv4 defrag processing, when the kernel rp_filter parameter is set, the
skb is dropped with an -EXDEV error.

This patch adds compatible support for ipip processing in the ipcomp
virtual tunnel.

Signed-off-by: Su Yanjun <suyj.fnst(a)cn.fujitsu.com>
Signed-off-by: Steffen Klassert <steffen.klassert(a)secunet.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
 net/ipv4/ip_vti.c | 50 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
index d7b43e700023a..68a21bf75dd0b 100644
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@ -74,6 +74,33 @@ static int vti_input(struct sk_buff *skb, int nexthdr, __be32 spi,
 	return 0;
 }

+static int vti_input_ipip(struct sk_buff *skb, int nexthdr, __be32 spi,
+		     int encap_type)
+{
+	struct ip_tunnel *tunnel;
+	const struct iphdr *iph = ip_hdr(skb);
+	struct net *net = dev_net(skb->dev);
+	struct ip_tunnel_net *itn = net_generic(net, vti_net_id);
+
+	tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, TUNNEL_NO_KEY,
+				  iph->saddr, iph->daddr, 0);
+	if (tunnel) {
+		if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb))
+			goto drop;
+
+		XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip4 = tunnel;
+
+		skb->dev = tunnel->dev;
+
+		return xfrm_input(skb, nexthdr, spi, encap_type);
+	}
+
+	return -EINVAL;
+drop:
+	kfree_skb(skb);
+	return 0;
+}
+
 static int vti_rcv(struct sk_buff *skb)
 {
 	XFRM_SPI_SKB_CB(skb)->family = AF_INET;
@@ -82,6 +109,14 @@ static int vti_rcv(struct sk_buff *skb)
 	return vti_input(skb, ip_hdr(skb)->protocol, 0, 0);
 }

+static int vti_rcv_ipip(struct sk_buff *skb)
+{
+	XFRM_SPI_SKB_CB(skb)->family = AF_INET;
+	XFRM_SPI_SKB_CB(skb)->daddroff = offsetof(struct iphdr, daddr);
+
+	return vti_input_ipip(skb, ip_hdr(skb)->protocol, ip_hdr(skb)->saddr, 0);
+}
+
 static int vti_rcv_cb(struct sk_buff *skb, int err)
 {
 	unsigned short family;
@@ -435,6 +470,12 @@ static struct xfrm4_protocol vti_ipcomp4_protocol __read_mostly = {
 	.priority	=	100,
 };

+static struct xfrm_tunnel ipip_handler __read_mostly = {
+	.handler	=	vti_rcv_ipip,
+	.err_handler	=	vti4_err,
+	.priority	=	0,
+};
+
 static int __net_init vti_init_net(struct net *net)
 {
 	int err;
@@ -603,6 +644,13 @@ static int __init vti_init(void)
 	if (err < 0)
 		goto xfrm_proto_comp_failed;

+	msg = "ipip tunnel";
+	err = xfrm4_tunnel_register(&ipip_handler, AF_INET);
+	if (err < 0) {
+		pr_info("%s: cant't register tunnel\n",__func__);
+		goto xfrm_tunnel_failed;
+	}
+
 	msg = "netlink interface";
 	err = rtnl_link_register(&vti_link_ops);
 	if (err < 0)
@@ -612,6 +660,8 @@ static int __init vti_init(void)

 rtnl_link_failed:
 	xfrm4_protocol_deregister(&vti_ipcomp4_protocol, IPPROTO_COMP);
+xfrm_tunnel_failed:
+	xfrm4_tunnel_deregister(&ipip_handler, AF_INET);
 xfrm_proto_comp_failed:
 	xfrm4_protocol_deregister(&vti_ah4_protocol, IPPROTO_AH);
 xfrm_proto_ah_failed:
--
2.19.1

6 years, 10 months

[PATCH] drm/vkms: fix use-after-free when drm_gem_handle_create() fails

by Eric Biggers

From: Eric Biggers <ebiggers(a)google.com>

If drm_gem_handle_create() fails in vkms_gem_create(), then the
vkms_gem_object is freed twice: once when the reference is dropped by
drm_gem_object_put_unlocked(), and again by the extra calls to
drm_gem_object_release() and kfree().

Fix it by skipping the second release and free.

This bug was originally found in the vgem driver by syzkaller using
fault injection, but I noticed it's also present in the vkms driver.

Fixes: 559e50fd34d1 ("drm/vkms: Add dumb operations")
Cc: Rodrigo Siqueira <rodrigosiqueiramelo(a)gmail.com>
Cc: Haneen Mohammed <hamohammed.sa(a)gmail.com>
Cc: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: stable(a)vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
 drivers/gpu/drm/vkms/vkms_gem.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_gem.c b/drivers/gpu/drm/vkms/vkms_gem.c
index 138b0bb325cf9..69048e73377dc 100644
--- a/drivers/gpu/drm/vkms/vkms_gem.c
+++ b/drivers/gpu/drm/vkms/vkms_gem.c
@@ -111,11 +111,8 @@ struct drm_gem_object *vkms_gem_create(struct drm_device *dev,

 	ret = drm_gem_handle_create(file, &obj->gem, handle);
 	drm_gem_object_put_unlocked(&obj->gem);
-	if (ret) {
-		drm_gem_object_release(&obj->gem);
-		kfree(obj);
+	if (ret)
 		return ERR_PTR(ret);
-	}

 	return &obj->gem;
 }
--
2.21.0.rc2.261.ga7da99ff1b-goog
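The ownership rule behind this fix can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `toy_obj`, `toy_obj_put()`, `toy_handle_create()` and `toy_create()` are hypothetical names invented for illustration and are not the DRM GEM API. The model assumes that a successful handle creation takes its own reference and that the final put frees the object, which is the behaviour the commit message describes.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted object; not the DRM GEM API. */
struct toy_obj {
	int refcount;
	int *freed;	/* incremented each time the object is released */
};

static struct toy_obj *toy_obj_new(int *freed)
{
	struct toy_obj *obj = malloc(sizeof(*obj));

	if (!obj)
		return NULL;
	obj->refcount = 1;	/* the creator's reference */
	obj->freed = freed;
	return obj;
}

/* Drop one reference; the final put releases the object. */
static void toy_obj_put(struct toy_obj *obj)
{
	assert(obj->refcount > 0);
	if (--obj->refcount == 0) {
		(*obj->freed)++;
		free(obj);
	}
}

/* Hypothetical handle creation: on success the handle owns a reference. */
static int toy_handle_create(struct toy_obj *obj, int fail)
{
	if (fail)
		return -12;	/* models -ENOMEM */
	obj->refcount++;
	return 0;
}

/*
 * Models the fixed create path: after the creator's reference is dropped,
 * a failed handle creation means the object is already gone, so the error
 * path must not release or free it a second time.
 */
static struct toy_obj *toy_create(int fail_handle, int *freed)
{
	struct toy_obj *obj = toy_obj_new(freed);
	int ret;

	if (!obj)
		return NULL;

	ret = toy_handle_create(obj, fail_handle);
	toy_obj_put(obj);	/* drop the creator's reference */
	if (ret)
		return NULL;	/* obj was freed by the put above */

	return obj;		/* kept alive by the handle's reference */
}
```

In this model, calling `toy_create(1, &freed)` releases the object exactly once. Adding a second `free(obj)` to the error path, as the pre-fix code effectively did, would be a double free.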

6 years, 10 months

[tip:sched/core] sched/core: Fix a potential double-fetch bug in sched_copy_attr()

by tip-bot for Kangjie Lu

Commit-ID:  120e4e76857ddbc9268e1aa3f9de61a498e84618
Gitweb:     https://git.kernel.org/tip/120e4e76857ddbc9268e1aa3f9de61a498e84618
Author:     Kangjie Lu <kjlu(a)umn.edu>
AuthorDate: Wed, 9 Jan 2019 01:45:24 -0600
Committer:  Ingo Molnar <mingo(a)kernel.org>
CommitDate: Mon, 21 Jan 2019 11:26:17 +0100

sched/core: Fix a potential double-fetch bug in sched_copy_attr()

"uattr->size" is copied in from user space and checked. However, it is
copied in again after the security check. A malicious user may race to
change it. The fix sets uattr->size to be the checked size.

Signed-off-by: Kangjie Lu <kjlu(a)umn.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: pakki001(a)umn.edu
Cc: <stable(a)vger.kernel.org>
Link: https://lkml.kernel.org/r/20190109074524.10176-1-kjlu@umn.edu
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
---
 kernel/sched/core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a674c7db2f29..d4d3514c4fe9 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4499,6 +4499,9 @@ static int sched_copy_attr(struct sched_attr __user *uattr, struct sched_attr *a
 	if (ret)
 		return -EFAULT;

+	/* In case attr->size was changed by user-space: */
+	attr->size = size;
+
 	/*
 	 * XXX: Do we want to be lenient like existing syscalls; or do we want
 	 * to be strict and return an error on out-of-bounds values?
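The double-fetch pattern can be sketched in plain userspace C. This is a simplified model under stated assumptions, not the kernel code: `toy_attr`, `toy_copy_attr()` and the `between_fetches` callback are hypothetical, and the callback merely simulates a concurrent user-space writer changing the size field between the two reads. As in the diff above, the kernel-side copy of the size is pinned to the value that was validated.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical attribute layout; the first field is a self-declared size. */
struct toy_attr {
	uint32_t size;
	uint32_t policy;
};

/* Simulated concurrent writer, invoked between the two fetches. */
typedef void (*toy_race_fn)(struct toy_attr *user);

/*
 * Copy an attribute from "user" memory. The size is fetched and checked
 * first, then the whole structure (including size) is fetched again.
 * Returns 0 on success, -1 if the declared size is out of range.
 */
static int toy_copy_attr(struct toy_attr *user, struct toy_attr *attr,
			 toy_race_fn between_fetches)
{
	uint32_t size;

	assert(user && attr);

	size = user->size;			/* first fetch, validated below */
	if (size == 0 || size > sizeof(*attr))
		return -1;

	if (between_fetches)
		between_fetches(user);		/* user space may change size here */

	memset(attr, 0, sizeof(*attr));
	memcpy(attr, user, size);		/* second fetch re-reads size */

	attr->size = size;			/* keep only the checked value */
	return 0;
}

/* Example racer: bumps the size after it has been validated. */
static void toy_grow_size(struct toy_attr *user)
{
	user->size = 0xffffffffu;
}
```

Without the final assignment, `attr->size` would hold whatever value the racer wrote, even though only the first fetch was checked.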

6 years, 10 months

[RESEND, PATCH v2] fuse: Don't drop NOTIFY_REPLY if we promised it

by Kirill Smelkov

A successful call to NOTIFY_RETRIEVE by a filesystem carries a promise from the kernel to send back a NOTIFY_REPLY message. However, if the filesystem is not reading requests with fuse_conn->max_pages capacity, fuse_dev_do_read might see that the "request is too large" and decide to "reply with an error and restart the read". "Reply with an error" carries the underlying assumption that there is a "requester thread" waiting for request completion, which is true for most requests but not for NOTIFY_REPLY: the NOTIFY_RETRIEVE handler completes with OK status right after it successfully queues the NOTIFY_REPLY message, without waiting for NOTIFY_REPLY completion.

This leads to a situation where the filesystem requested to retrieve inode data with NOTIFY_RETRIEVE, got err=OK for that notification request, but NOTIFY_REPLY never comes back. Moreover, since there is no "requester thread" to handle the error, the situation shows itself as /sys/fs/fuse/connections/X/waiting=1 _and_ /dev/fuse read(s) queued. This is misleading, since the NOTIFY_REPLY request was removed from the pending queue and abandoned.

One way to fix this would be to change the NOTIFY_RETRIEVE handler to wait until the queued NOTIFY_REPLY is actually read back by the server, and only then return the NOTIFY_RETRIEVE status. However, this is a change in behaviour and would require filesystems to have at least 2 threads. In particular, a single-threaded filesystem that was previously using NOTIFY_RETRIEVE successfully would become stuck after the change. This way of fixing is thus not acceptable.

However, we can fix it another way: by always returning NOTIFY_REPLY regardless of its original size, with as much data as the provided read buffer can fit. This aligns with the way the NOTIFY_RETRIEVE handler works, which already unconditionally caps the requested retrieve size to fuse_conn->max_pages. So it should not hurt NOTIFY_RETRIEVE semantics if we return less data than was originally requested.

This fix requires another behaviour change, however. To be sure that the read buffer has enough capacity to always fit the fixed NOTIFY_REPLY part plus at least some (0 or more) data, we have to precheck the buffer before dequeuing and handling a request. If the buffer is very small, we return EINVAL to the read in the filesystem, with the semantic that the queued read was invalid from the viewpoint of the FUSE protocol. Even though this is also a behaviour change, it should not cause problems in practice: 1d3d752b47 (fuse: clean up request size limit checking), which originally removed such an EINVAL return and reworked fuse_dev_do_read to loop and retry, also added FUSE_MIN_READ_BUFFER=8K to the user-visible fuse.h with the comment that "The read buffer is required to be at least 8k ...". Even though FUSE_MIN_READ_BUFFER is not currently checked anywhere in the kernel, libfuse always initializes a session with bufsize=32·pages and, since its beginning (at least from 2005), issues a warning should the user modify fuse_session->bufsize directly, to be sure that queued buffers are at least as large as that sane minimum:

https://github.com/libfuse/libfuse/blob/fuse-3.3.0-22-g63d53ecc3a/lib/fuse_…
https://github.com/libfuse/libfuse/blob/fuse-3.3.0-22-g63d53ecc3a/lib/fuse_…
(semantic added in https://github.com/libfuse/libfuse/commit/044da2e9e0)

So it should be safe to add the check for the minimum read buffer size.

I've hit this bug for real with my filesystem that is using https://github.com/hanwen/go-fuse: there was no NOTIFY_REPLY after a successful NOTIFY_RETRIEVE and the filesystem was stuck waiting, because the FUSE protocol (whose definition is scattered through many places) states that NOTIFY_REPLY is guaranteed to come after a successful NOTIFY_RETRIEVE (see 2d45ba381a "fuse: add retrieve request"). After inspecting /sys/fs/fuse/connections/X/waiting and seeing it was 1, I initially suspected that it was user space that was not issuing /dev/fuse reads, and that NOTIFY_REPLY was there but stuck in the kernel pending queue.
However, tracing what is going on in the /dev/fuse exchange and in both kernel and userspace (see https://lab.nexedi.com/kirr/wendelin.core/blob/13d2d1f8/wcfs/fusetrace) showed that there are correctly queued /dev/fuse reads still pending after NOTIFY_RETRIEVE returns, and that it is the kernel that is not replying:

    ...
    P2 2.215710 /dev/fuse <- qread wcfs/11399_4_r:
            syscall.Syscall+48
            syscall.Read+73
            github.com/hanwen/go-fuse/fuse.(*Server).readRequest.func1+85
            github.com/hanwen/go-fuse/fuse.handleEINTR+39
            github.com/hanwen/go-fuse/fuse.(*Server).readRequest+355
            github.com/hanwen/go-fuse/fuse.(*Server).loop+107
            runtime.goexit+1

    P2 2.215810 /dev/fuse -> read wcfs/11399_4_r:
            .56 RELEASE i8 ... (ret=64)

    P2 2.215859 /dev/fuse <- write wcfs/11399_5_w:
            .56 (0) ...
            syscall.Syscall+48
            syscall.Write+73
            github.com/hanwen/go-fuse/fuse.(*Server).systemWrite.func1+76
            github.com/hanwen/go-fuse/fuse.handleEINTR+39
            github.com/hanwen/go-fuse/fuse.(*Server).systemWrite+931
            github.com/hanwen/go-fuse/fuse.(*Server).write+194
            github.com/hanwen/go-fuse/fuse.(*Server).handleRequest+179
            github.com/hanwen/go-fuse/fuse.(*Server).loop+399
            runtime.goexit+1

    P2 2.215871 /dev/fuse -> write_ack wcfs/11399_5_w (ret=16)

    P2 2.215876 /dev/fuse <- qread wcfs/11399_5_r:      <-- NOTE
            syscall.Syscall+48
            syscall.Read+73
            github.com/hanwen/go-fuse/fuse.(*Server).readRequest.func1+85
            github.com/hanwen/go-fuse/fuse.handleEINTR+39
            github.com/hanwen/go-fuse/fuse.(*Server).readRequest+355
            github.com/hanwen/go-fuse/fuse.(*Server).loop+107
            runtime.goexit+1

    P0 2.221527 /dev/fuse <- qread wcfs/11401_1_r:      <-- NOTE
            syscall.Syscall+48
            syscall.Read+73
            github.com/hanwen/go-fuse/fuse.(*Server).readRequest.func1+85
            github.com/hanwen/go-fuse/fuse.handleEINTR+39
            github.com/hanwen/go-fuse/fuse.(*Server).readRequest+355
            github.com/hanwen/go-fuse/fuse.(*Server).loop+107
            runtime.goexit+1

    P1 2.239384 /dev/fuse -> read wcfs/11398_6_r:       # woken read that was queued before "..."
            .57 READ i5 ... (ret=80)

    P0 2.239626 /dev/fuse <- write wcfs/11397_0_w:
            NOTIFY_RETRIEVE ...
            syscall.Syscall+48
            syscall.Write+73
            github.com/hanwen/go-fuse/fuse.(*Server).systemWrite.func1+76
            github.com/hanwen/go-fuse/fuse.handleEINTR+39
            github.com/hanwen/go-fuse/fuse.(*Server).systemWrite+931
            github.com/hanwen/go-fuse/fuse.(*Server).write+194
            github.com/hanwen/go-fuse/fuse.(*Server).InodeRetrieveCache+764
            github.com/hanwen/go-fuse/fuse/nodefs.(*FileSystemConnector).FileRetrieveCa…
            main.(*BigFile).invalidateBlk+232
            main.(*Root).zδhandle1.func1+72
            golang.org/x/sync/errgroup.(*Group).Go.func1+87
            runtime.goexit+1

    P0 2.239660 /dev/fuse -> write_ack wcfs/11397_0_w (ret=48)

    # stuck
    # (full trace: https://lab.nexedi.com/kirr/wendelin.core/commit/96416aaabd)

with queued / served read analysis confirming that two reads were indeed queued and not served:

    grep -w -e '<- qread\>' y.log |awk {'print $6'} |sort >qread.txt
    grep -w -e '-> read\>' y.log |awk {'print $6'} |sort >read.txt
    # xdiff qread.txt read.txt
    diff --git a/qread.txt b/read.txt
    index 4ab50d7..fdd2be1 100644
    --- a/qread.txt
    +++ b/read.txt
    @@ -53,7 +53,5 @@ wcfs/11399_1_r:
     wcfs/11399_2_r:
     wcfs/11399_3_r:
     wcfs/11399_4_r:
    -wcfs/11399_5_r:
     wcfs/11400_0_r:
     wcfs/11401_0_r:
    -wcfs/11401_1_r:

The bug was hit because go-fuse by default uses 64K for the read buffer size

https://github.com/hanwen/go-fuse/blob/33711add/fuse/server.go#L142

and the kernel presets fuse_conn->max_pages to be 128K (= 32·4K pages).

Go-fuse will likely be fixed both to use the kernel's bufsize and to correctly handle size > bufsize in InodeRetrieveCache. However, we should also fix the kernel to always deliver NOTIFY_REPLY once NOTIFY_RETRIEVE was successful, so that the FUSE protocol guarantee always holds regardless of whether userspace used the default or another valid buffer size setting, and so that filesystems can count on not getting stuck waiting for a kernel that promised a reply. Hence this patch.
Signed-off-by: Kirill Smelkov <kirr(a)nexedi.com>
Cc: Han-Wen Nienhuys <hanwen(a)google.com>
Cc: Jakob Unterwurzacher <jakobunt(a)gmail.com>
Cc: <stable(a)vger.kernel.org> # v2.6.36+
---

First patch version was sent 1 week ago, but got no response:
https://marc.info/?l=linux-fsdevel&m=155000277921155&w=2

Changes since v1: don't forget to also update req->misc.retrieve_in.size after truncation.

( This is my first patch to fs/fuse, so please forgive me if I missed anything. )

 fs/fuse/dev.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 65 insertions(+), 6 deletions(-)

diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 8a63e52785e9..93deb8e54d88 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -381,6 +381,40 @@ static void queue_request(struct fuse_iqueue *fiq, struct fuse_req *req)
 	kill_fasync(&fiq->fasync, SIGIO, POLL_IN);
 }

+/*
+ * fuse_req_truncate_data truncates data in request that has paged data
+ * (req.in.argpages=1), so that whole request, when serialized, is <= nbytes.
+ *
+ * nbytes must be >= size(request without data).
+ */
+static void fuse_req_truncate_data(struct fuse_req *req, unsigned nbytes) {
+	unsigned size, n;
+
+	BUG_ON(!req->in.argpages);
+	BUG_ON(req->in.numargs < 1);
+
+	/* request size without data */
+	size = sizeof(struct fuse_in_header) +
+		len_args(req->in.numargs - 1, (struct fuse_arg *) req->in.args);
+	BUG_ON(nbytes < size);
+
+	/* truncate paged data */
+	for (n = 0; n < req->num_pages; n++) {
+		struct fuse_page_desc *p = &req->page_descs[n];
+
+		if (size >= nbytes) {
+			p->length = 0;
+		} else {
+			p->length = min_t(unsigned, p->length, nbytes - size);
+		}
+
+		size += p->length;
+	}
+
+	/* update whole request length in the header */
+	req->in.h.len = size;
+}
+
 void fuse_queue_forget(struct fuse_conn *fc, struct fuse_forget_link *forget,
 		       u64 nodeid, u64 nlookup)
 {
@@ -1317,6 +1351,15 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
 	unsigned reqsize;
 	unsigned int hash;

+	/*
+	 * Require sane minimum read buffer - that has capacity for fixed part
+	 * of any request + some room for data. If the requirement is not
+	 * satisfied return EINVAL to the filesystem without dequeueing /
+	 * aborting any request.
+	 */
+	if (nbytes < FUSE_MIN_READ_BUFFER)
+		return -EINVAL;
+
  restart:
 	spin_lock(&fiq->waitq.lock);
 	err = -EAGAIN;
@@ -1358,12 +1401,28 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,

 	/* If request is too large, reply with an error and restart the read */
 	if (nbytes < reqsize) {
-		req->out.h.error = -EIO;
-		/* SETXATTR is special, since it may contain too large data */
-		if (in->h.opcode == FUSE_SETXATTR)
-			req->out.h.error = -E2BIG;
-		request_end(fc, req);
-		goto restart;
+		switch (in->h.opcode) {
+		default:
+			req->out.h.error = -EIO;
+			/* SETXATTR is special, since it may contain too large data */
+			if (in->h.opcode == FUSE_SETXATTR)
+				req->out.h.error = -E2BIG;
+			request_end(fc, req);
+			goto restart;
+
+		/*
+		 * NOTIFY_REPLY is special: if it was queued we already
+		 * promised to filesystem to deliver it when handling
+		 * NOTIFY_RETRIVE. We know that read buffer has capacity for at
+		 * least some data. Truncate retrieved data to read buffer size
+		 * and deliver it to stay to the promise.
+		 */
+		case FUSE_NOTIFY_REPLY:
+			fuse_req_truncate_data(req, nbytes);
+			req->misc.retrieve_in.size -= reqsize - in->h.len;
+			reqsize = in->h.len;
+		}
+
 	}
 	spin_lock(&fpq->lock);
 	list_add(&req->list, &fpq->io);
--
2.21.0.rc0.269.g1a574e7a28
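The truncation arithmetic in the patch can be tried out in isolation with a userspace model. This is only a sketch: `toy_page_desc` and `toy_truncate_data()` are hypothetical names, the fixed header size is passed in as a plain number instead of being computed from request arguments, and the 56-byte value used in the note below is an arbitrary example rather than a real FUSE header size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-page length descriptor. */
struct toy_page_desc {
	unsigned length;
};

/*
 * Shrink the per-page lengths so that fixed + sum(lengths) <= nbytes.
 * Pages that no longer fit are set to length 0. Returns the new total
 * request size. The caller must guarantee nbytes >= fixed.
 */
static unsigned toy_truncate_data(struct toy_page_desc *descs, size_t num_pages,
				  unsigned fixed, unsigned nbytes)
{
	unsigned size = fixed;
	size_t n;

	assert(nbytes >= fixed);

	for (n = 0; n < num_pages; n++) {
		/* size <= nbytes holds on every iteration, so no underflow */
		unsigned room = nbytes - size;

		if (descs[n].length > room)
			descs[n].length = room;
		size += descs[n].length;
	}

	return size;
}
```

For example, with three 4096-byte pages, an assumed fixed part of 56 bytes and an 8192-byte read buffer, the lengths become 4096, 4040 and 0, and the total is exactly 8192. The difference between the old and new totals (4152 bytes here) is the amount by which the patch reduces `retrieve_in.size`.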

6 years, 10 months

[PATCH for-next 0/3] Driver fixes

by Dennis Dalessandro

Hi Jason and Doug,

Here are some fixes that didn't quite make the boat for 5.0-rc. So we can go ahead and send them to -next. The third patch here is really a v2 of: https://patchwork.kernel.org/patch/10769005/

---

Michael J. Ruhl (2):
      IB/rdmavt: Fix concurrency panics in QP post_send and modify to error
      IB/hfi1: Close race condition on user context disable and close

Mike Marciniszyn (1):
      IB/rdmavt: Fix loopback send with invalidate ordering

 drivers/infiniband/hw/hfi1/hfi.h  |  2 +
 drivers/infiniband/hw/hfi1/init.c | 14 ++++++---
 drivers/infiniband/sw/rdmavt/qp.c | 59 ++++++++++++++++++++++++-------------
 3 files changed, 49 insertions(+), 26 deletions(-)

--
-Denny

6 years, 10 months

[RESEND PATCH v2] of: fix kmemleak crash caused by imbalance in early memory reservation

by Marc Gonzalez

From: Mike Rapoport <rppt(a)linux.ibm.com>

Marc Gonzalez reported the following kmemleak crash:

  Unable to handle kernel paging request at virtual address ffffffc021e00000
  Mem abort info:
    ESR = 0x96000006
    Exception class = DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
  Data abort info:
    ISV = 0, ISS = 0x00000006
    CM = 0, WnR = 0
  swapper pgtable: 4k pages, 39-bit VAs, pgdp = (____ptrval____)
  [ffffffc021e00000] pgd=000000017e3ba803, pud=000000017e3ba803, pmd=0000000000000000
  Internal error: Oops: 96000006 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 6 PID: 523 Comm: kmemleak Tainted: G S W 5.0.0-rc1 #13
  Hardware name: Qualcomm Technologies, Inc. MSM8998 v1 MTP (DT)
  pstate: 80000085 (Nzcv daIf -PAN -UAO)
  pc : scan_block+0x70/0x190
  lr : scan_block+0x6c/0x190
  sp : ffffff8012e8bd20
  x29: ffffff8012e8bd20 x28: ffffffc0fdbaf018
  x27: ffffffc022000000 x26: 0000000000000080
  x25: ffffff8011aadf70 x24: ffffffc0f8cc8000
  x23: ffffff8010dc8000 x22: ffffff8010dc8830
  x21: ffffffc021e00ff9 x20: ffffffc0f8cc8050
  x19: ffffffc021e00000 x18: 0000000000002409
  x17: 0000000000000200 x16: 0000000000000000
  x15: ffffff8010e14dd8 x14: 0000000000002406
  x13: 000000004c4dd0c6 x12: ffffffc0f77dad58
  x11: 0000000000000001 x10: ffffff8010d9e688
  x9 : ffffff8010d9f000 x8 : ffffff8010d9e688
  x7 : 0000000000000002 x6 : 0000000000000000
  x5 : ffffff8011511c20 x4 : 00000000000026d1
  x3 : ffffff8010e14d88 x2 : 5b36396f4e7d4000
  x1 : 0000000000208040 x0 : 0000000000000000
  Process kmemleak (pid: 523, stack limit = 0x(____ptrval____))
  Call trace:
   scan_block+0x70/0x190
   scan_gray_list+0x108/0x1c0
   kmemleak_scan+0x33c/0x7c0
   kmemleak_scan_thread+0x98/0xf0
   kthread+0x11c/0x120
   ret_from_fork+0x10/0x1c
  Code: f9000fb4 d503201f 97ffffd2 35000580 (f9400260)
  ---[ end trace 176d6ed9d86a0c33 ]---
  note: kmemleak[523] exited with preempt_count 2

The crash happens when a no-map area is allocated in
early_init_dt_alloc_reserved_memory_arch(). The allocated region is
registered with kmemleak, but it is then removed from memblock using
memblock_remove() that is not kmemleak-aware.

Replacing __memblock_alloc_base() with memblock_find_in_range() makes sure
that the allocated memory is not added to kmemleak and then
memblock_remove()'ing this memory is safe.

As a bonus, since memblock_find_in_range() ensures the allocation in the
specified range, the bounds check can be removed.

Cc: stable(a)vger.kernel.org # 3.15+
Fixes: 3f0c820664483 ("drivers: of: add initialization code for dynamic reserved memory")
Acked-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
Acked-by: Prateek Patel <prpatel(a)nvidia.com>
Tested-by: Marc Gonzalez <marc.w.gonzalez(a)free.fr>
Signed-off-by: Mike Rapoport <rppt(a)linux.ibm.com>
---
Resend with DT CCed to reach robh's patch queue
I added CC: stable, Fixes, and Prateek's ack
Trim recipients list to minimize inconvenience
---
 drivers/of/of_reserved_mem.c | 18 +++++-------------
 1 file changed, 5 insertions(+), 13 deletions(-)

diff --git a/drivers/of/of_reserved_mem.c b/drivers/of/of_reserved_mem.c
index 1977ee0adcb1..2ae81604ffef 100644
--- a/drivers/of/of_reserved_mem.c
+++ b/drivers/of/of_reserved_mem.c
@@ -31,27 +31,19 @@ int __init __weak early_init_dt_alloc_reserved_memory_arch(phys_addr_t size,
 	phys_addr_t *res_base)
 {
 	phys_addr_t base;
-	/*
-	 * We use __memblock_alloc_base() because memblock_alloc_base()
-	 * panic()s on allocation failure.
-	 */
+
 	end = !end ? MEMBLOCK_ALLOC_ANYWHERE : end;
 	align = !align ? SMP_CACHE_BYTES : align;
-	base = __memblock_alloc_base(size, align, end);
+	base = memblock_find_in_range(size, align, start, end);
 	if (!base)
 		return -ENOMEM;

-	/*
-	 * Check if the allocated region fits in to start..end window
-	 */
-	if (base < start) {
-		memblock_free(base, size);
-		return -ENOMEM;
-	}
-
 	*res_base = base;
 	if (nomap)
 		return memblock_remove(base, size);
+	else
+		return memblock_reserve(base, size);
+
 	return 0;
 }

--
2.7.4

6 years, 10 months
