'ac->ac_g_ex.fe_len' is a user-controlled value used in the derivation
of 'ac->ac_2order'. 'ac->ac_2order', in turn, is used to index arrays,
which makes it a potential Spectre v1 gadget. Fix this by sanitizing
the value assigned to 'ac->ac_2order'. This covers the following
accesses, found with the help of smatch:
* fs/ext4/mballoc.c:1896 ext4_mb_simple_scan_group() warn: potential
  spectre issue 'grp->bb_counters' [w] (local cap)
* fs/ext4/mballoc.c:445 mb_find_buddy() warn: potential spectre issue
  'EXT4_SB(e4b->bd_sb)->s_mb_offsets' [r] (local cap)
* fs/ext4/mballoc.c:446 mb_find_buddy() warn: potential spectre issue
  'EXT4_SB(e4b->bd_sb)->s_mb_maxs' [r] (local cap)
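For illustration only, a minimal sketch of the pattern (the table and
bound here are hypothetical stand-ins, not the ext4 code itself):
array_index_nospec(index, size) clamps 'index' to 0 when it is out of
range, without a conditional branch the CPU could speculate past.

    #include <linux/nospec.h>

    /* 'idx' is assumed to come from an untrusted (user) source. */
    static int read_entry(const int *table, size_t nr_entries, size_t idx)
    {
            if (idx >= nr_entries)
                    return -EINVAL;
            /*
             * Even if the bounds check above is speculated past, the
             * clamped index cannot load beyond table[nr_entries - 1].
             */
            idx = array_index_nospec(idx, nr_entries);
            return table[idx];
    }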
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: stable(a)vger.kernel.org
Suggested-by: Josh Poimboeuf <jpoimboe(a)redhat.com>
Signed-off-by: Jeremy Cline <jcline(a)redhat.com>
---
I broke this out of the "ext4: fix spectre v1 gadgets" patch set since
the other patches in that series could, as Josh noted, be replaced with
one fix in do_quotactl. I'll send that fix to the disk quota folks
separately.
Changes from v1:
- Sanitize ac_2order on assignment, rather than down the call chain in
ext4_mb_simple_scan_group.
fs/ext4/mballoc.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index f7ab34088162..8b24d3d42cb3 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -14,6 +14,7 @@
#include <linux/log2.h>
#include <linux/module.h>
#include <linux/slab.h>
+#include <linux/nospec.h>
#include <linux/backing-dev.h>
#include <trace/events/ext4.h>
@@ -2140,7 +2141,8 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac)
* This should tell if fe_len is exactly power of 2
*/
if ((ac->ac_g_ex.fe_len & (~(1 << (i - 1)))) == 0)
- ac->ac_2order = i - 1;
+ ac->ac_2order = array_index_nospec(i - 1,
+ sb->s_blocksize_bits + 2);
}
/* if stream allocation is enabled, use global goal */
--
2.17.1
From: Dexuan Cui <decui(a)microsoft.com>
Before setting channel->rescind in vmbus_rescind_cleanup(), we should
make sure the channel callback won't run any more; otherwise a
high-level driver like pci_hyperv, which may be waiting indefinitely
for the host VSP's response when it notices the channel has been
rescinded, can't safely give up: e.g., in
hv_pci_protocol_negotiation() -> wait_for_response(), it's unsafe to
exit wait_for_response() and proceed while the channel callback can
still write through a saved pointer to the on-stack variable
"comp_pkt" after the frame has been popped. The issue was originally
spotted by Michael Kelley <mikelley(a)microsoft.com>.
In vmbus_close_internal(), the patch also shrinks the region protected
by disabling/enabling channel->callback_event: the whole function
doesn't really need that protection.
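To illustrate why the ordering matters, here is a hedged sketch
modeled on pci_hyperv's wait_for_response() (simplified; the field
names are stand-ins rather than the exact driver code):

    /*
     * The waiter gives up once it sees channel->rescind.  If rescind
     * could be set while the channel callback was still runnable, the
     * callback might later write through a saved pointer into the
     * waiter's on-stack completion packet after this function returns.
     * Quiescing the callback first (vmbus_reset_channel_cb()) makes
     * the early exit safe.
     */
    while (!wait_for_completion_timeout(&comp_pkt.host_event,
                                        msecs_to_jiffies(1000))) {
            if (channel->rescind)
                    return -ENODEV;
    }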
Signed-off-by: Dexuan Cui <decui(a)microsoft.com>
Reviewed-by: Michael Kelley <mikelley(a)microsoft.com>
Cc: stable(a)vger.kernel.org
Cc: K. Y. Srinivasan <kys(a)microsoft.com>
Cc: Stephen Hemminger <sthemmin(a)microsoft.com>
Cc: Michael Kelley <mikelley(a)microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys(a)microsoft.com>
---
drivers/hv/channel.c | 40 +++++++++++++++++++++++----------------
drivers/hv/channel_mgmt.c | 6 ++++++
include/linux/hyperv.h | 2 ++
3 files changed, 32 insertions(+), 16 deletions(-)
diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index ba0a092ae085..c3949220b770 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -558,11 +558,8 @@ static void reset_channel_cb(void *arg)
channel->onchannel_callback = NULL;
}
-static int vmbus_close_internal(struct vmbus_channel *channel)
+void vmbus_reset_channel_cb(struct vmbus_channel *channel)
{
- struct vmbus_channel_close_channel *msg;
- int ret;
-
/*
* vmbus_on_event(), running in the per-channel tasklet, can race
* with vmbus_close_internal() in the case of SMP guest, e.g., when
@@ -572,6 +569,29 @@ static int vmbus_close_internal(struct vmbus_channel *channel)
*/
tasklet_disable(&channel->callback_event);
+ channel->sc_creation_callback = NULL;
+
+ /* Stop the callback asap */
+ if (channel->target_cpu != get_cpu()) {
+ put_cpu();
+ smp_call_function_single(channel->target_cpu, reset_channel_cb,
+ channel, true);
+ } else {
+ reset_channel_cb(channel);
+ put_cpu();
+ }
+
+ /* Re-enable tasklet for use on re-open */
+ tasklet_enable(&channel->callback_event);
+}
+
+static int vmbus_close_internal(struct vmbus_channel *channel)
+{
+ struct vmbus_channel_close_channel *msg;
+ int ret;
+
+ vmbus_reset_channel_cb(channel);
+
/*
* In case a device driver's probe() fails (e.g.,
* util_probe() -> vmbus_open() returns -ENOMEM) and the device is
@@ -585,16 +605,6 @@ static int vmbus_close_internal(struct vmbus_channel *channel)
}
channel->state = CHANNEL_OPEN_STATE;
- channel->sc_creation_callback = NULL;
- /* Stop callback and cancel the timer asap */
- if (channel->target_cpu != get_cpu()) {
- put_cpu();
- smp_call_function_single(channel->target_cpu, reset_channel_cb,
- channel, true);
- } else {
- reset_channel_cb(channel);
- put_cpu();
- }
/* Send a closing message */
@@ -639,8 +649,6 @@ static int vmbus_close_internal(struct vmbus_channel *channel)
get_order(channel->ringbuffer_pagecount * PAGE_SIZE));
out:
- /* re-enable tasklet for use on re-open */
- tasklet_enable(&channel->callback_event);
return ret;
}
diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index f3b551a50653..0f0e091c117c 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -892,6 +892,12 @@ static void vmbus_onoffer_rescind(struct vmbus_channel_message_header *hdr)
return;
}
+ /*
+ * Before setting channel->rescind in vmbus_rescind_cleanup(), we
+ * should make sure the channel callback is not running any more.
+ */
+ vmbus_reset_channel_cb(channel);
+
/*
* Now wait for offer handling to complete.
*/
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 2330f08062c7..efda23cf32c7 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1061,6 +1061,8 @@ extern int vmbus_establish_gpadl(struct vmbus_channel *channel,
extern int vmbus_teardown_gpadl(struct vmbus_channel *channel,
u32 gpadl_handle);
+void vmbus_reset_channel_cb(struct vmbus_channel *channel);
+
extern int vmbus_recvpacket(struct vmbus_channel *channel,
void *buffer,
u32 bufferlen,
--
2.17.1
On 7/30/2018 12:56 AM, gregkh(a)linuxfoundation.org wrote:
>
> The patch below does not apply to the 4.14-stable tree.
> If someone wants it applied there, or to any other stable or longterm
> tree, then please email the backport, including the original git commit
> id to <stable(a)vger.kernel.org>.
>
> thanks,
>
> greg k-h
>
> (snip)
Here's the patch revised for stable 4.14.y
-- james
------------------
From: James Smart <jsmart2021(a)gmail.com>
commit d082dc1562a2ff0947b214796f12faaa87e816a9 upstream.
The existing code to carve up the sg list expected an sg element per
page, which can be very incorrect with IOMMUs remapping multiple
memory pages to fewer bus addresses. To hit this error required a
large io payload (greater than 256k) and a system that maps on a
per-page basis. It's possible that large ios could get by fine if the
system condensed the sgl list into the first 64 elements.
This patch corrects the sg list handling by walking the sg list
element by element and dividing the transfer on sg element
boundaries. While doing so, it still tries to keep each sequence under
256k, but will exceed that cap if a single sg element is larger than
256k.
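Schematically, the new carving rule looks like this (a simplified
restatement of the loop in the patch below, with shortened names):

    u32 tlen = 0, cnt = 0;
    struct scatterlist *sg = fod->next_sg;

    /* Stop on an sg element boundary, keeping the sequence under the
     * 256k cap whenever the element sizes allow it. */
    while (tlen < remaining && cnt < max_sg_cnt &&
           tlen + sg_dma_len(sg) < NVMET_FC_MAX_SEQ_LENGTH) {
            tlen += sg_dma_len(sg);
            cnt++;
            sg = sg_next(sg);
    }
    /* A single element above the cap still goes out as one sequence. */
    if (tlen < remaining && cnt == 0) {
            tlen = min_t(u32, sg_dma_len(sg), remaining);
            cnt = 1;
            sg = sg_next(sg);
    }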
Fixes: 48fa362b6c3f ("nvmet-fc: simplify sg list handling")
Cc: <stable(a)vger.kernel.org> # 4.14
Signed-off-by: James Smart <james.smart(a)broadcom.com>
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
---
stable 4.14.y patch adjusted for deltas made by upstream commit
5e62d5c993e6889cd314d5b5de6b670152109a0e that are not in the stable tree.
---
drivers/nvme/target/fc.c | 44 +++++++++++++++++++++++++++++++++++---------
1 file changed, 35 insertions(+), 9 deletions(-)
diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
index 8e21211b904b..b7a5d1065378 100644
--- a/drivers/nvme/target/fc.c
+++ b/drivers/nvme/target/fc.c
@@ -58,8 +58,8 @@ struct nvmet_fc_ls_iod {
struct work_struct work;
} __aligned(sizeof(unsigned long long));
+/* desired maximum for a single sequence - if sg list allows it */
#define NVMET_FC_MAX_SEQ_LENGTH (256 * 1024)
-#define NVMET_FC_MAX_XFR_SGENTS (NVMET_FC_MAX_SEQ_LENGTH / PAGE_SIZE)
enum nvmet_fcp_datadir {
NVMET_FCP_NODATA,
@@ -74,6 +74,7 @@ struct nvmet_fc_fcp_iod {
struct nvme_fc_cmd_iu cmdiubuf;
struct nvme_fc_ersp_iu rspiubuf;
dma_addr_t rspdma;
+ struct scatterlist *next_sg;
struct scatterlist *data_sg;
int data_sg_cnt;
u32 total_length;
@@ -1000,8 +1001,7 @@ nvmet_fc_register_targetport(struct nvmet_fc_port_info *pinfo,
INIT_LIST_HEAD(&newrec->assoc_list);
kref_init(&newrec->ref);
ida_init(&newrec->assoc_cnt);
- newrec->max_sg_cnt = min_t(u32, NVMET_FC_MAX_XFR_SGENTS,
- template->max_sgl_segments);
+ newrec->max_sg_cnt = template->max_sgl_segments;
ret = nvmet_fc_alloc_ls_iodlist(newrec);
if (ret) {
@@ -1717,6 +1717,7 @@ nvmet_fc_alloc_tgt_pgs(struct nvmet_fc_fcp_iod *fod)
((fod->io_dir == NVMET_FCP_WRITE) ?
DMA_FROM_DEVICE : DMA_TO_DEVICE));
/* note: write from initiator perspective */
+ fod->next_sg = fod->data_sg;
return 0;
@@ -1874,24 +1875,49 @@ nvmet_fc_transfer_fcp_data(struct nvmet_fc_tgtport *tgtport,
struct nvmet_fc_fcp_iod *fod, u8 op)
{
struct nvmefc_tgt_fcp_req *fcpreq = fod->fcpreq;
+ struct scatterlist *sg = fod->next_sg;
unsigned long flags;
- u32 tlen;
+ u32 remaininglen = fod->total_length - fod->offset;
+ u32 tlen = 0;
int ret;
fcpreq->op = op;
fcpreq->offset = fod->offset;
fcpreq->timeout = NVME_FC_TGTOP_TIMEOUT_SEC;
- tlen = min_t(u32, tgtport->max_sg_cnt * PAGE_SIZE,
- (fod->total_length - fod->offset));
+ /*
+ * for next sequence:
+ * break at a sg element boundary
+ * attempt to keep sequence length capped at
+ * NVMET_FC_MAX_SEQ_LENGTH but allow sequence to
+ * be longer if a single sg element is larger
+ * than that amount. This is done to avoid creating
+ * a new sg list to use for the tgtport api.
+ */
+ fcpreq->sg = sg;
+ fcpreq->sg_cnt = 0;
+ while (tlen < remaininglen &&
+ fcpreq->sg_cnt < tgtport->max_sg_cnt &&
+ tlen + sg_dma_len(sg) < NVMET_FC_MAX_SEQ_LENGTH) {
+ fcpreq->sg_cnt++;
+ tlen += sg_dma_len(sg);
+ sg = sg_next(sg);
+ }
+ if (tlen < remaininglen && fcpreq->sg_cnt == 0) {
+ fcpreq->sg_cnt++;
+ tlen += min_t(u32, sg_dma_len(sg), remaininglen);
+ sg = sg_next(sg);
+ }
+ if (tlen < remaininglen)
+ fod->next_sg = sg;
+ else
+ fod->next_sg = NULL;
+
fcpreq->transfer_length = tlen;
fcpreq->transferred_length = 0;
fcpreq->fcp_error = 0;
fcpreq->rsplen = 0;
- fcpreq->sg = &fod->data_sg[fod->offset / PAGE_SIZE];
- fcpreq->sg_cnt = DIV_ROUND_UP(tlen, PAGE_SIZE);
-
/*
* If the last READDATA request: check if LLDD supports
* combined xfr with response.
--
2.13.1
We can't and don't need to try resuming the device from our hotplug
handlers, but hotplug events are generally something we'd like to keep
the device awake for whenever possible. So, grab a PM ref safely in our
hotplug handlers using pm_runtime_get_noresume() and mark the device as
busy once we're finished.
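The resulting pattern, schematically (a restatement of the hunks
below):

    /* Take a reference without trying to resume: safe in contexts that
     * must not block on runtime resume.  Runtime suspend waits for the
     * handler to finish, so an already-awake device stays awake. */
    pm_runtime_get_noresume(drm->dev->dev);

    /* ... handle the hotplug event ... */

    /* Restart the autosuspend timer from "now", then drop the ref. */
    pm_runtime_mark_last_busy(drm->dev->dev);
    pm_runtime_put_autosuspend(drm->dev->dev);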
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Cc: stable(a)vger.kernel.org
Cc: Lukas Wunner <lukas(a)wunner.de>
Cc: Karol Herbst <karolherbst(a)gmail.com>
---
drivers/gpu/drm/nouveau/nouveau_connector.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c b/drivers/gpu/drm/nouveau/nouveau_connector.c
index 8409c3f2c3a1..5a8e8c1ad647 100644
--- a/drivers/gpu/drm/nouveau/nouveau_connector.c
+++ b/drivers/gpu/drm/nouveau/nouveau_connector.c
@@ -1152,6 +1152,11 @@ nouveau_connector_hotplug(struct nvif_notify *notify)
const char *name = connector->name;
struct nouveau_encoder *nv_encoder;
+ /* Resuming the device here isn't possible; but the suspend PM ops
+ * will wait for us to finish our work before disabling us so this
+ * should be enough
+ */
+ pm_runtime_get_noresume(drm->dev->dev);
nv_connector->hpd_task = current;
if (rep->mask & NVIF_NOTIFY_CONN_V0_IRQ) {
@@ -1171,6 +1176,9 @@ nouveau_connector_hotplug(struct nvif_notify *notify)
}
nv_connector->hpd_task = NULL;
+
+ pm_runtime_mark_last_busy(drm->dev->dev);
+ pm_runtime_put_autosuspend(drm->dev->dev);
return NVIF_NOTIFY_KEEP;
}
--
2.17.1
It's true we can't resume the device from poll workers in
nouveau_connector_detect(). We can, however, prevent the autosuspend
timer from elapsing immediately (if it hasn't already) without risking
any sort of deadlock with the runtime suspend/resume operations. So do
that instead of entirely avoiding grabbing a power reference.
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Cc: stable(a)vger.kernel.org
Cc: Lukas Wunner <lukas(a)wunner.de>
Cc: Karol Herbst <karolherbst(a)gmail.com>
---
drivers/gpu/drm/nouveau/nouveau_connector.c | 20 +++++++++++---------
1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c b/drivers/gpu/drm/nouveau/nouveau_connector.c
index 2a45b4c2ceb0..010d6db14cba 100644
--- a/drivers/gpu/drm/nouveau/nouveau_connector.c
+++ b/drivers/gpu/drm/nouveau/nouveau_connector.c
@@ -572,12 +572,16 @@ nouveau_connector_detect(struct drm_connector *connector, bool force)
nv_connector->edid = NULL;
}
- /* Outputs are only polled while runtime active, so acquiring a
- * runtime PM ref here is unnecessary (and would deadlock upon
- * runtime suspend because it waits for polling to finish).
+ /* Outputs are only polled while runtime active, so resuming the
+ * device here is unnecessary (and would deadlock upon runtime suspend
+ * because it waits for polling to finish). We do however, want to
+ * prevent the autosuspend timer from elapsing during this operation
+ * if possible.
*/
- if (!drm_kms_helper_is_poll_worker()) {
- ret = pm_runtime_get_sync(connector->dev->dev);
+ if (drm_kms_helper_is_poll_worker()) {
+ pm_runtime_get_noresume(dev->dev);
+ } else {
+ ret = pm_runtime_get_sync(dev->dev);
if (ret < 0 && ret != -EACCES)
return conn_status;
}
@@ -655,10 +659,8 @@ nouveau_connector_detect(struct drm_connector *connector, bool force)
out:
- if (!drm_kms_helper_is_poll_worker()) {
- pm_runtime_mark_last_busy(connector->dev->dev);
- pm_runtime_put_autosuspend(connector->dev->dev);
- }
+ pm_runtime_mark_last_busy(dev->dev);
+ pm_runtime_put_autosuspend(dev->dev);
return conn_status;
}
--
2.17.1
This entirely removes the potential for deadlocking with fb_helper by
preventing it from handling hotplugs as early as possible in the
runtime suspend process. If that turns out not to be possible, because
some fb_helper action was queued up before we got a chance to disable
hotplugging, we simply return -EBUSY so that the runtime PM core
attempts autosuspending the device again once fb_helper isn't doing
anything.

This fixes one of the issues causing deadlocks on runtime
suspend/resume with nouveau on my P50.
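Schematically, the early bail-out (note that
drm_fb_helper_suspend_hotplug() and drm_fb_helper_resume_hotplug() are
helpers this series adds elsewhere; their definitions are not shown in
this excerpt):

    /* In nouveau_pmops_runtime_suspend(): returning -EBUSY from the
     * runtime-suspend callback leaves the device active, and the PM
     * core can retry autosuspend once the device goes idle again. */
    if (!drm_fb_helper_suspend_hotplug(drm_dev->fb_helper))
            return -EBUSY;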
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Cc: stable(a)vger.kernel.org
Cc: Lukas Wunner <lukas(a)wunner.de>
Cc: Karol Herbst <karolherbst(a)gmail.com>
---
drivers/gpu/drm/nouveau/nouveau_drm.c | 8 ++++++++
drivers/gpu/drm/nouveau/nouveau_fbcon.c | 1 +
2 files changed, 9 insertions(+)
diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index ee2546db09c9..d47cb5b2af98 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -836,6 +836,14 @@ nouveau_pmops_runtime_suspend(struct device *dev)
return -EBUSY;
}
+ /* There's no way for us to stop fb_helper work in reaction to
+ * hotplugs later in the RPM process. First off: we don't want to,
+ * fb_helper should be able to keep the GPU awake. Second off: it is
+ * capable of grabbing basically any lock in existence.
+ */
+ if (!drm_fb_helper_suspend_hotplug(drm_dev->fb_helper))
+ return -EBUSY;
+
nouveau_switcheroo_optimus_dsm();
ret = nouveau_do_suspend(drm_dev, true);
pci_save_state(pdev);
diff --git a/drivers/gpu/drm/nouveau/nouveau_fbcon.c b/drivers/gpu/drm/nouveau/nouveau_fbcon.c
index 85c1f10bc2b6..963ba630fd04 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fbcon.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fbcon.c
@@ -466,6 +466,7 @@ nouveau_fbcon_set_suspend_work(struct work_struct *work)
console_unlock();
if (state == FBINFO_STATE_RUNNING) {
+ drm_fb_helper_resume_hotplug(drm->dev->fb_helper);
pm_runtime_mark_last_busy(drm->dev->dev);
pm_runtime_put_sync(drm->dev->dev);
}
--
2.17.1