The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From f43194c1447c9536efb0859c2f3f46f6bf2b9154 Mon Sep 17 00:00:00 2001
From: Maxime Chevallier <maxime.chevallier(a)bootlin.com>
Date: Wed, 25 Apr 2018 20:19:47 +0200
Subject: [PATCH] ARM64: dts: marvell: armada-cp110: Add mg_core_clk for
ethernet node
The Marvell PPv2.2 controller present on CP-110 needs the extra "mg_core_clk"
clock to avoid system hangs when powering some network interfaces up.
This issue appeared after a recent clock rework on Armada 7K/8K platforms.
This commit adds the new clock and updates the documentation accordingly.
[gregory.clement: use the real first commit to fix and add the cc:stable
flag]
Fixes: e3af9f7c6ece ("ARM64: dts: marvell: armada-cp110: Fix clock resources for various node")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Maxime Chevallier <maxime.chevallier(a)bootlin.com>
Signed-off-by: Gregory CLEMENT <gregory.clement(a)bootlin.com>
diff --git a/Documentation/devicetree/bindings/net/marvell-pp2.txt b/Documentation/devicetree/bindings/net/marvell-pp2.txt
index 1814fa13f6ab..fc019df0d863 100644
--- a/Documentation/devicetree/bindings/net/marvell-pp2.txt
+++ b/Documentation/devicetree/bindings/net/marvell-pp2.txt
@@ -21,9 +21,10 @@ Required properties:
- main controller clock (for both armada-375-pp2 and armada-7k-pp2)
- GOP clock (for both armada-375-pp2 and armada-7k-pp2)
- MG clock (only for armada-7k-pp2)
+ - MG Core clock (only for armada-7k-pp2)
- AXI clock (only for armada-7k-pp2)
-- clock-names: names of used clocks, must be "pp_clk", "gop_clk", "mg_clk"
- and "axi_clk" (the 2 latter only for armada-7k-pp2).
+- clock-names: names of used clocks, must be "pp_clk", "gop_clk", "mg_clk",
+ "mg_core_clk" and "axi_clk" (the 3 latter only for armada-7k-pp2).
The ethernet ports are represented by subnodes. At least one port is
required.
@@ -80,8 +81,8 @@ cpm_ethernet: ethernet@0 {
compatible = "marvell,armada-7k-pp22";
reg = <0x0 0x100000>, <0x129000 0xb000>;
clocks = <&cpm_syscon0 1 3>, <&cpm_syscon0 1 9>,
- <&cpm_syscon0 1 5>, <&cpm_syscon0 1 18>;
- clock-names = "pp_clk", "gop_clk", "gp_clk", "axi_clk";
+ <&cpm_syscon0 1 5>, <&cpm_syscon0 1 6>, <&cpm_syscon0 1 18>;
+ clock-names = "pp_clk", "gop_clk", "mg_clk", "mg_core_clk", "axi_clk";
eth0: eth0 {
interrupts = <ICU_GRP_NSR 39 IRQ_TYPE_LEVEL_HIGH>,
diff --git a/arch/arm64/boot/dts/marvell/armada-cp110.dtsi b/arch/arm64/boot/dts/marvell/armada-cp110.dtsi
index ca22f9d100f5..ed2f1237ea1e 100644
--- a/arch/arm64/boot/dts/marvell/armada-cp110.dtsi
+++ b/arch/arm64/boot/dts/marvell/armada-cp110.dtsi
@@ -38,9 +38,10 @@
compatible = "marvell,armada-7k-pp22";
reg = <0x0 0x100000>, <0x129000 0xb000>;
clocks = <&CP110_LABEL(clk) 1 3>, <&CP110_LABEL(clk) 1 9>,
- <&CP110_LABEL(clk) 1 5>, <&CP110_LABEL(clk) 1 18>;
+ <&CP110_LABEL(clk) 1 5>, <&CP110_LABEL(clk) 1 6>,
+ <&CP110_LABEL(clk) 1 18>;
clock-names = "pp_clk", "gop_clk",
- "mg_clk", "axi_clk";
+ "mg_clk", "mg_core_clk", "axi_clk";
marvell,system-controller = <&CP110_LABEL(syscon0)>;
status = "disabled";
dma-coherent;
Hi Greg,
please apply commit dd83c161fbc ("kernel/exit.c: avoid undefined behaviour when calling wait4()")
to v4.9.y and older to fix CVE-2018-10087.
Thanks,
Guenter
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 85f4f12d51397f1648e1f4350f77e24039b82d61 Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org>
Date: Tue, 15 May 2018 22:24:52 -0400
Subject: [PATCH] vsprintf: Replace memory barrier with static_key for
random_ptr_key update
Reviewing Tobin's patches for getting pointers out early before
entropy has been established, I noticed that there's a lone smp_mb() in
the code. As with most lone memory barriers, this one appears to be
incorrectly used.
We currently basically have this:
get_random_bytes(&ptr_key, sizeof(ptr_key));
/*
* have_filled_random_ptr_key==true is dependent on get_random_bytes().
* ptr_to_id() needs to see have_filled_random_ptr_key==true
* after get_random_bytes() returns.
*/
smp_mb();
WRITE_ONCE(have_filled_random_ptr_key, true);
And later we have:
if (unlikely(!have_filled_random_ptr_key))
return string(buf, end, "(ptrval)", spec);
/* Missing memory barrier here. */
hashval = (unsigned long)siphash_1u64((u64)ptr, &ptr_key);
As the CPU can perform speculative loads, we could have a situation
with the following:
        CPU0                            CPU1
        ----                            ----
                                   load ptr_key = 0
   store ptr_key = random
   smp_mb()
   store have_filled_random_ptr_key
                                   load have_filled_random_ptr_key = true
                                    BAD BAD BAD! (you're so bad!)
Because nothing prevents CPU1 from loading ptr_key before loading
have_filled_random_ptr_key.
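As a rough userspace illustration of the pairing that is missing, here is
a minimal sketch using C11 atomics rather than the kernel's barrier
primitives. The names demo_key, demo_key_ready, demo_publish_key() and
demo_read_key() are made up for this example. The release store stands in
for the writer-side ordering, and the acquire load is the reader-side
ordering that the current code lacks.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for ptr_key and its "ready" flag. */
static uint64_t demo_key;
static atomic_bool demo_key_ready;

/* Writer: fill the key, then publish the flag with release ordering. */
static void demo_publish_key(uint64_t value)
{
	demo_key = value;
	atomic_store_explicit(&demo_key_ready, true, memory_order_release);
}

/*
 * Reader: the acquire load pairs with the release store above, so the
 * read of demo_key cannot be satisfied before the read of the flag.
 */
static bool demo_read_key(uint64_t *out)
{
	if (!atomic_load_explicit(&demo_key_ready, memory_order_acquire))
		return false;	/* the kernel would print "(ptrval)" here */
	*out = demo_key;
	return true;
}
```

Without ordering on the reader side, the barrier on the writer side gives
the reader nothing it can rely on, which is the problem described above.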
This race is very unlikely, but we can't keep an incorrect smp_mb() in
place. Instead, replace have_filled_random_ptr_key with a static_branch,
not_filled_random_ptr_key, that is initialized to true and changed to false
once we have enough entropy. If the update happens in early boot, the
static_key is updated immediately. Otherwise it has to wait until entropy
is filled, which happens in an interrupt handler that can't update a
static_key, as that requires a preemptible context. In that case a
work_queue is used to update it. As entropy already took too long to
establish in the first place, waiting a little longer shouldn't hurt anything.
The benefit of using the static key is that the unlikely branch in
vsprintf() now becomes a nop.
Link: http://lkml.kernel.org/r/20180515100558.21df515e@gandalf.local.home
Cc: stable(a)vger.kernel.org
Fixes: ad67b74d2469d ("printk: hash addresses printed with %p")
Acked-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 30c0cb8cc9bc..23920c5ff728 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -1669,19 +1669,22 @@ char *pointer_string(char *buf, char *end, const void *ptr,
return number(buf, end, (unsigned long int)ptr, spec);
}
-static bool have_filled_random_ptr_key __read_mostly;
+static DEFINE_STATIC_KEY_TRUE(not_filled_random_ptr_key);
static siphash_key_t ptr_key __read_mostly;
-static void fill_random_ptr_key(struct random_ready_callback *unused)
+static void enable_ptr_key_workfn(struct work_struct *work)
{
get_random_bytes(&ptr_key, sizeof(ptr_key));
- /*
- * have_filled_random_ptr_key==true is dependent on get_random_bytes().
- * ptr_to_id() needs to see have_filled_random_ptr_key==true
- * after get_random_bytes() returns.
- */
- smp_mb();
- WRITE_ONCE(have_filled_random_ptr_key, true);
+ /* Needs to run from preemptible context */
+ static_branch_disable(&not_filled_random_ptr_key);
+}
+
+static DECLARE_WORK(enable_ptr_key_work, enable_ptr_key_workfn);
+
+static void fill_random_ptr_key(struct random_ready_callback *unused)
+{
+ /* This may be in an interrupt handler. */
+ queue_work(system_unbound_wq, &enable_ptr_key_work);
}
static struct random_ready_callback random_ready = {
@@ -1695,7 +1698,8 @@ static int __init initialize_ptr_random(void)
if (!ret) {
return 0;
} else if (ret == -EALREADY) {
- fill_random_ptr_key(&random_ready);
+ /* This is in preemptible context */
+ enable_ptr_key_workfn(&enable_ptr_key_work);
return 0;
}
@@ -1709,7 +1713,7 @@ static char *ptr_to_id(char *buf, char *end, void *ptr, struct printf_spec spec)
unsigned long hashval;
const int default_width = 2 * sizeof(ptr);
- if (unlikely(!have_filled_random_ptr_key)) {
+ if (static_branch_unlikely(&not_filled_random_ptr_key)) {
spec.field_width = default_width;
/* string length must be less than default_width */
return string(buf, end, "(ptrval)", spec);
This is the start of the stable review cycle for the 4.16.10 release.
There are 55 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun May 20 08:14:42 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.16.10-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.16.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.16.10-rc1
Willy Tarreau <w(a)1wt.eu>
proc: do not access cmdline nor environ from file-backed areas
Dave Carroll <david.carroll(a)microsemi.com>
scsi: aacraid: Correct hba_send to include iu_type
Ursula Braun <ubraun(a)linux.ibm.com>
net/smc: keep clcsock reference in smc_tcp_listen_work()
Antoine Tenart <antoine.tenart(a)bootlin.com>
net: phy: sfp: fix the BR,min computation
Israel Rukshin <israelr(a)mellanox.com>
net/mlx5: Fix mlx5_get_vector_affinity function
Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
mlxsw: core: Fix an error handling path in 'mlxsw_core_bus_device_register()'
Hangbin Liu <liuhangbin(a)gmail.com>
ipv4: reset fnhe_mtu_locked after cache route flushed
Mohammed Gamal <mgamal(a)redhat.com>
hv_netvsc: Fix net device attach on older Windows hosts
Eric Dumazet <edumazet(a)google.com>
tipc: fix one byte leak in tipc_sk_set_orig_addr()
Eric Dumazet <edumazet(a)google.com>
tcp: restore autocorking
Xin Long <lucien.xin(a)gmail.com>
sctp: clear the new asoc's stream outcnt in sctp_stream_update
John Hurley <john.hurley(a)netronome.com>
nfp: flower: set tunnel ttl value to net default
Florian Fainelli <f.fainelli(a)gmail.com>
net: systemport: Correclty disambiguate driver instances
Huy Nguyen <huyn(a)mellanox.com>
net/mlx5e: DCBNL fix min inline header size for dscp
Ido Schimmel <idosch(a)mellanox.com>
mlxsw: spectrum_switchdev: Do not remove mrouter port from MDB's ports list
Paolo Abeni <pabeni(a)redhat.com>
udp: fix SO_BINDTODEVICE
Eric Dumazet <edumazet(a)google.com>
nsh: fix infinite loop
Jianbo Liu <jianbol(a)mellanox.com>
net/mlx5e: Allow offloading ipv4 header re-write for icmp
Eric Dumazet <edumazet(a)google.com>
ipv6: fix uninit-value in ip6_multipath_l3_keys()
Stephen Hemminger <stephen(a)networkplumber.org>
hv_netvsc: set master device
Talat Batheesh <talatb(a)mellanox.com>
net/mlx5: Avoid cleaning flow steering table twice during error flow
Tariq Toukan <tariqt(a)mellanox.com>
net/mlx5e: TX, Use correct counter in dma_map error flow
Jiri Pirko <jiri(a)mellanox.com>
net: sched: fix error path in tcf_proto_create() when modules are not configured
Debabrata Banerjee <dbanerje(a)akamai.com>
bonding: send learning packets for vlans on slave
Debabrata Banerjee <dbanerje(a)akamai.com>
bonding: do not allow rlb updates to invalid mac
Michael Chan <michael.chan(a)broadcom.com>
tg3: Fix vunmap() BUG_ON() triggered from tg3_free_consistent().
Yuchung Cheng <ycheng(a)google.com>
tcp: ignore Fast Open on repair mode
Neal Cardwell <ncardwell(a)google.com>
tcp_bbr: fix to zero idle_restart only upon S/ACKed data
Xin Long <lucien.xin(a)gmail.com>
sctp: use the old asoc when making the cookie-ack chunk in dupcook_d
Xin Long <lucien.xin(a)gmail.com>
sctp: remove sctp_chunk_put from fail_mark err path in sctp_ulpevent_make_rcvmsg
Xin Long <lucien.xin(a)gmail.com>
sctp: handle two v4 addrs comparison in sctp_inet6_cmp_addr
Xin Long <lucien.xin(a)gmail.com>
sctp: fix the issue that the cookie-ack with auth can't get processed
Xin Long <lucien.xin(a)gmail.com>
sctp: delay the authentication for the duplicated cookie-echo chunk
Eric Dumazet <edumazet(a)google.com>
rds: do not leak kernel memory to user land
Heiner Kallweit <hkallweit1(a)gmail.com>
r8169: fix powering up RTL8168h
Bjørn Mork <bjorn(a)mork.no>
qmi_wwan: do not steal interfaces from class drivers
Stefano Brivio <sbrivio(a)redhat.com>
openvswitch: Don't swap table in nlattr_set() after OVS_ATTR_NESTED is found
Andre Tomt <andre(a)tomt.net>
net/tls: Fix connection stall on partial tls record
Dave Watson <davejwatson(a)fb.com>
net/tls: Don't recursively call push_record during tls_write_space callbacks
Lance Richardson <lance.richardson.net(a)gmail.com>
net: support compat 64-bit time in {s,g}etsockopt
Ursula Braun <ubraun(a)linux.ibm.com>
net/smc: restrict non-blocking connect finish
Eric Dumazet <edumazet(a)google.com>
net_sched: fq: take care of throttled flows before reuse
Roman Mashak <mrv(a)mojatatu.com>
net sched actions: fix refcnt leak in skbmod
Adi Nissim <adin(a)mellanox.com>
net/mlx5: E-Switch, Include VF RDMA stats in vport statistics
Roi Dayan <roid(a)mellanox.com>
net/mlx5e: Err if asked to offload TC match on frag being first
Moshe Shemesh <moshe(a)mellanox.com>
net/mlx4_en: Verify coalescing parameters are in range
Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
net/mlx4_en: Fix an error handling path in 'mlx4_en_init_netdev()'
Grygorii Strashko <grygorii.strashko(a)ti.com>
net: ethernet: ti: cpsw: fix packet leaking in dual_mac mode
Rob Taglang <rob(a)taglang.io>
net: ethernet: sun: niu set correct packet size in skb
Eric Dumazet <edumazet(a)google.com>
llc: better deal with too small mtu
Andrey Ignatov <rdna(a)fb.com>
ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg
Julian Anastasov <ja(a)ssi.bg>
ipv4: fix fnhe usage by non-cached routes
Eric Dumazet <edumazet(a)google.com>
dccp: fix tasklet usage
Hangbin Liu <liuhangbin(a)gmail.com>
bridge: check iface upper dev when setting master via ioctl
Ingo Molnar <mingo(a)elte.hu>
8139too: Use disable_irq_nosync() in rtl8139_poll_controller()
-------------
Diffstat:
Makefile | 4 +-
drivers/infiniband/hw/mlx5/main.c | 2 +-
drivers/net/bonding/bond_alb.c | 15 +--
drivers/net/bonding/bond_main.c | 2 +
drivers/net/ethernet/broadcom/bcmsysport.c | 16 ++-
drivers/net/ethernet/broadcom/tg3.c | 9 +-
drivers/net/ethernet/mellanox/mlx4/en_ethtool.c | 16 +++
drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 8 +-
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 7 +-
drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c | 8 +-
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 7 +-
drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 20 ++--
drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 11 +-
drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 23 ++--
drivers/net/ethernet/mellanox/mlxsw/core.c | 4 +-
.../ethernet/mellanox/mlxsw/spectrum_switchdev.c | 12 +--
drivers/net/ethernet/netronome/nfp/flower/action.c | 10 +-
drivers/net/ethernet/netronome/nfp/flower/cmsg.h | 5 +-
drivers/net/ethernet/realtek/8139too.c | 2 +-
drivers/net/ethernet/realtek/r8169.c | 3 +
drivers/net/ethernet/sun/niu.c | 5 +-
drivers/net/ethernet/ti/cpsw.c | 2 +
drivers/net/hyperv/netvsc_drv.c | 3 +-
drivers/net/hyperv/rndis_filter.c | 2 +-
drivers/net/phy/sfp-bus.c | 2 +-
drivers/net/usb/qmi_wwan.c | 12 +++
drivers/scsi/aacraid/commsup.c | 8 +-
fs/proc/base.c | 8 +-
include/linux/mlx5/driver.h | 12 +--
include/linux/mm.h | 1 +
include/net/bonding.h | 1 +
include/net/tls.h | 1 +
mm/gup.c | 3 +
net/bridge/br_if.c | 4 +-
net/compat.c | 6 +-
net/dccp/ccids/ccid2.c | 14 ++-
net/dccp/timer.c | 2 +-
net/ipv4/ping.c | 7 +-
net/ipv4/route.c | 119 ++++++++++-----------
net/ipv4/tcp.c | 5 +-
net/ipv4/tcp_bbr.c | 4 +-
net/ipv4/udp.c | 11 +-
net/ipv6/route.c | 7 +-
net/ipv6/udp.c | 4 +-
net/llc/af_llc.c | 3 +
net/nsh/nsh.c | 4 +
net/openvswitch/flow_netlink.c | 9 +-
net/rds/recv.c | 1 +
net/sched/act_skbmod.c | 5 +-
net/sched/cls_api.c | 2 +-
net/sched/sch_fq.c | 37 ++++---
net/sctp/associola.c | 30 +++++-
net/sctp/inqueue.c | 2 +-
net/sctp/ipv6.c | 3 +
net/sctp/sm_statefuns.c | 88 ++++++++-------
net/sctp/stream.c | 2 +
net/sctp/ulpevent.c | 1 -
net/smc/af_smc.c | 18 ++--
net/tipc/socket.c | 3 +-
net/tls/tls_main.c | 8 ++
60 files changed, 398 insertions(+), 245 deletions(-)
When the allocation process is scheduled back and the mapped hw queue has
changed, do one extra wake up on the original queue to compensate for the
missed wake up, so other allocations on the original queue won't be starved.
This patch fixes one request allocation hang issue, which can be
triggered easily in case of very low nr_request.
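The compensation idea can be sketched with a toy model, under the
assumption that a queue is nothing more than a count of sleeping waiters.
struct demo_queue, demo_wake_one() and demo_after_wakeup() are
hypothetical names for illustration only and are not the sbitmap API.

```c
/* Hypothetical model: a queue is just a count of sleeping waiters. */
struct demo_queue {
	unsigned int sleepers;
};

/* Wake one sleeper, if any; returns 1 when a sleeper was woken. */
static int demo_wake_one(struct demo_queue *q)
{
	if (q->sleepers == 0)
		return 0;
	q->sleepers--;
	return 1;
}

/*
 * Called by a waiter after it has been woken on 'orig' and has looked up
 * the queue it maps to now. If it moved, the wake up it consumed on
 * 'orig' is not used there, so pass it on to another sleeper.
 */
static void demo_after_wakeup(struct demo_queue *orig, struct demo_queue *now)
{
	if (now != orig)
		demo_wake_one(orig);
}
```

In this model, a waiter that stays on its queue changes nothing, while a
waiter that migrated hands its wake up to the next sleeper on the queue it
left.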
Cc: <stable(a)vger.kernel.org>
Cc: Omar Sandoval <osandov(a)fb.com>
Signed-off-by: Ming Lei <ming.lei(a)redhat.com>
---
block/blk-mq-tag.c | 13 +++++++++++++
include/linux/sbitmap.h | 7 +++++++
lib/sbitmap.c | 6 ++++++
3 files changed, 26 insertions(+)
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 336dde07b230..a965db489f98 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -134,6 +134,8 @@ unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data)
ws = bt_wait_ptr(bt, data->hctx);
drop_ctx = data->ctx == NULL;
do {
+ struct sbitmap_queue *bt_orig;
+
/*
* We're out of tags on this hardware queue, kick any
* pending IO submits before going to sleep waiting for
@@ -159,6 +161,7 @@ unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data)
if (data->ctx)
blk_mq_put_ctx(data->ctx);
+ bt_orig = bt;
io_schedule();
data->ctx = blk_mq_get_ctx(data->q);
@@ -170,6 +173,16 @@ unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data)
bt = &tags->bitmap_tags;
finish_wait(&ws->wait, &wait);
+
+ /*
+ * If destination hw queue is changed, wake up original
+ * queue one extra time for compensating the wake up
+ * miss, so other allocations on original queue won't
+ * be starved.
+ */
+ if (bt != bt_orig)
+ sbitmap_queue_wake_up(bt_orig);
+
ws = bt_wait_ptr(bt, data->hctx);
} while (1);
diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
index 841585f6e5f2..b23f50355281 100644
--- a/include/linux/sbitmap.h
+++ b/include/linux/sbitmap.h
@@ -484,6 +484,13 @@ static inline struct sbq_wait_state *sbq_wait_ptr(struct sbitmap_queue *sbq,
void sbitmap_queue_wake_all(struct sbitmap_queue *sbq);
/**
+ * sbitmap_queue_wake_up() - Do a regular wake up compensation if the queue
+ * allocated from is changed after scheduling back.
+ * @sbq: Bitmap queue to wake up.
+ */
+void sbitmap_queue_wake_up(struct sbitmap_queue *sbq);
+
+/**
* sbitmap_queue_show() - Dump &struct sbitmap_queue information to a &struct
* seq_file.
* @sbq: Bitmap queue to show.
diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index e6a9c06ec70c..c6ae4206bcb1 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -466,6 +466,12 @@ static void sbq_wake_up(struct sbitmap_queue *sbq)
}
}
+void sbitmap_queue_wake_up(struct sbitmap_queue *sbq)
+{
+ sbq_wake_up(sbq);
+}
+EXPORT_SYMBOL_GPL(sbitmap_queue_wake_up);
+
void sbitmap_queue_clear(struct sbitmap_queue *sbq, unsigned int nr,
unsigned int cpu)
{
--
2.9.5
The x86 mmap() code selects the mmap base for an allocation depending on
the bitness of the syscall. For 64bit syscalls it selects mm->mmap_base and
for 32bit mm->mmap_compat_base.
exec() calls mmap() which in turn uses in_compat_syscall() to check whether
the mapping is for a 32bit or a 64bit task. The decision is made on the
following criteria:
  ia32    child->thread.status & TS_COMPAT
   x32    child->pt_regs.orig_ax & __X32_SYSCALL_BIT
  ia64    !ia32 && !x32
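The criteria above can be written out as a small standalone sketch. The
DEMO_* constants and the demo_classify() and demo_in_compat_syscall()
helpers are hypothetical names for illustration. The numeric values are
assumed to match TS_COMPAT and __X32_SYSCALL_BIT in the x86 headers and
should be checked against the tree in use.

```c
#include <stdbool.h>

/* Assumed values for illustration; verify against the x86 headers. */
#define DEMO_TS_COMPAT		0x0002UL
#define DEMO_X32_SYSCALL_BIT	0x40000000UL

enum demo_abi { DEMO_ABI_IA32, DEMO_ABI_X32, DEMO_ABI_64 };

/* Apply the three criteria in the order listed in the table. */
static enum demo_abi demo_classify(unsigned long status, unsigned long orig_ax)
{
	if (status & DEMO_TS_COMPAT)
		return DEMO_ABI_IA32;
	if (orig_ax & DEMO_X32_SYSCALL_BIT)
		return DEMO_ABI_X32;
	return DEMO_ABI_64;
}

/* Anything that is not plain 64bit counts as a compat syscall. */
static bool demo_in_compat_syscall(unsigned long status, unsigned long orig_ax)
{
	return demo_classify(status, orig_ax) != DEMO_ABI_64;
}
```

Because the status flag is tested first, a stale TS_COMPAT bit wins no
matter what orig_ax holds.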
__set_personality_x32() was dropping the TS_COMPAT flag, but
set_personality_64bit() kept the compat syscall flag, making
in_compat_syscall() return true during the first exec() syscall.
This has user-visible effects, as mentioned by Alexey:
1) It breaks ASAN
$ gcc -fsanitize=address wrap.c -o wrap-asan
$ ./wrap32 ./wrap-asan true
==1217==Shadow memory range interleaves with an existing memory mapping. ASan cannot proceed correctly. ABORTING.
==1217==ASan shadow was supposed to be located in the [0x00007fff7000-0x10007fff7fff] range.
==1217==Process memory map follows:
0x000000400000-0x000000401000 /home/izbyshev/test/gcc/asan-exec-from-32bit/wrap-asan
0x000000600000-0x000000601000 /home/izbyshev/test/gcc/asan-exec-from-32bit/wrap-asan
0x000000601000-0x000000602000 /home/izbyshev/test/gcc/asan-exec-from-32bit/wrap-asan
0x0000f7dbd000-0x0000f7de2000 /lib64/ld-2.27.so
0x0000f7fe2000-0x0000f7fe3000 /lib64/ld-2.27.so
0x0000f7fe3000-0x0000f7fe4000 /lib64/ld-2.27.so
0x0000f7fe4000-0x0000f7fe5000
0x7fed9abff000-0x7fed9af54000
0x7fed9af54000-0x7fed9af6b000 /lib64/libgcc_s.so.1
[snip]
2) It doesn't seem to be great for security if an attacker always knows
that ld.so is going to be mapped into the first 4GB in this case
(the same thing happens for PIEs as well).
The testcase:
$ cat wrap.c
#include <unistd.h>

int main(int argc, char *argv[]) {
        execvp(argv[1], &argv[1]);
        return 127;
}
$ gcc wrap.c -o wrap
$ LD_SHOW_AUXV=1 ./wrap ./wrap true |& grep AT_BASE
AT_BASE: 0x7f63b8309000
AT_BASE: 0x7faec143c000
AT_BASE: 0x7fbdb25fa000
$ gcc -m32 wrap.c -o wrap32
$ LD_SHOW_AUXV=1 ./wrap32 ./wrap true |& grep AT_BASE
AT_BASE: 0xf7eff000
AT_BASE: 0xf7cee000
AT_BASE: 0x7f8b9774e000
Fixes:
commit 1b028f784e8c ("x86/mm: Introduce mmap_compat_base() for 32-bit mmap()")
commit ada26481dfe6 ("x86/mm: Make in_compat_syscall() work during exec")
Cc: Borislav Petkov <bp(a)suse.de>
Cc: Cyrill Gorcunov <gorcunov(a)openvz.org>
Cc: Dmitry Safonov <0x7f454c46(a)gmail.com>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: <linux-mm(a)kvack.org>
Cc: <x86(a)kernel.org>
Cc: <stable(a)vger.kernel.org> # v4.12+
Reported-by: Alexey Izbyshev <izbyshev(a)ispras.ru>
Bisected-by: Alexander Monakov <amonakov(a)ispras.ru>
Investigated-by: Andy Lutomirski <luto(a)kernel.org>
Signed-off-by: Dmitry Safonov <dima(a)arista.com>
---
arch/x86/kernel/process_64.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 4b100fe0f508..12bb445fb98d 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -542,6 +542,7 @@ void set_personality_64bit(void)
clear_thread_flag(TIF_X32);
/* Pretend that this comes from a 64bit execve */
task_pt_regs(current)->orig_ax = __NR_execve;
+ current_thread_info()->status &= ~TS_COMPAT;
/* Ensure the corresponding mm is not marked. */
if (current->mm)
--
2.13.6
From: Pavel Tatashin <pasha.tatashin(a)oracle.com>
Subject: mm: don't allow deferred pages with NEED_PER_CPU_KM
It is unsafe to do virtual to physical translations before mm_init() is
called if struct page is needed in order to determine the memory section
number (see SECTION_IN_PAGE_FLAGS). This is because only in mm_init() we
initialize struct pages for all the allocated memory when deferred struct
pages are used.
My recent fix c9e97a1997 ("mm: initialize pages on demand during boot")
exposed this problem, because it greatly reduced the number of pages that are
initialized before mm_init(), but the problem existed even before my fix,
as Fengguang Wu found.
Below is a more detailed explanation of the problem.
We initialize struct pages in four places:
1. Early in boot a small set of struct pages is initialized to fill
the first section, and lower zones.
2. During mm_init() we initialize "struct pages" for all the memory
that is allocated, i.e. reserved in memblock.
3. Using on-demand logic when pages are allocated after the mm_init() call
(when memblock is finished).
4. After smp_init(), when the rest of the free deferred pages are initialized.
The problem occurs if we try to do a va to phys translation of memory
between steps 1 and 2. Because we have not yet initialized struct pages
for all the reserved pages, it is inherently unsafe to do va to phys if
the translation itself requires access to "struct page", as in the case of
this combination: CONFIG_SPARSE && !CONFIG_SPARSE_VMEMMAP
The following path exposes the problem:
  start_kernel()
   trap_init()
    setup_cpu_entry_areas()
     setup_cpu_entry_area(cpu)
      get_cpu_gdt_paddr(cpu)
       per_cpu_ptr_to_phys(addr)
        pcpu_addr_to_page(addr)
         virt_to_page(addr)
          pfn_to_page(__pa(addr) >> PAGE_SHIFT)
We disable this path by not allowing NEED_PER_CPU_KM with deferred struct
pages feature.
The problems are discussed in these threads:
http://lkml.kernel.org/r/20180418135300.inazvpxjxowogyge@wfg-t540p.sh.intel…
http://lkml.kernel.org/r/20180419013128.iurzouiqxvcnpbvz@wfg-t540p.sh.intel…
http://lkml.kernel.org/r/20180426202619.2768-1-pasha.tatashin@oracle.com
Link: http://lkml.kernel.org/r/20180515175124.1770-1-pasha.tatashin@oracle.com
Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
Signed-off-by: Pavel Tatashin <pasha.tatashin(a)oracle.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Steven Sistare <steven.sistare(a)oracle.com>
Cc: Daniel Jordan <daniel.m.jordan(a)oracle.com>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Fengguang Wu <fengguang.wu(a)intel.com>
Cc: Dennis Zhou <dennisszhou(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff -puN mm/Kconfig~mm-dont-allow-deferred-pages-with-need_per_cpu_km mm/Kconfig
--- a/mm/Kconfig~mm-dont-allow-deferred-pages-with-need_per_cpu_km
+++ a/mm/Kconfig
@@ -636,6 +636,7 @@ config DEFERRED_STRUCT_PAGE_INIT
default n
depends on NO_BOOTMEM
depends on !FLATMEM
+ depends on !NEED_PER_CPU_KM
help
Ordinarily all struct pages are initialised during early boot in a
single thread. On very large machines this can take a considerable
_
From: Ross Zwisler <ross.zwisler(a)linux.intel.com>
Subject: radix tree: fix multi-order iteration race
Fix a race in the multi-order iteration code which causes the kernel to
hit a GP fault. This was first seen with a production v4.15 based kernel
(4.15.6-300.fc27.x86_64) utilizing a DAX workload which used order 9 PMD
DAX entries.
The race has to do with how we tear down multi-order sibling entries when
we are removing an item from the tree. Remember for example that an order
2 entry looks like this:
struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]
where 'entry' is in some slot in the struct radix_tree_node, and the three
slots following 'entry' contain sibling pointers which point back to
'entry.'
When we delete 'entry' from the tree, we call:
  radix_tree_delete()
    radix_tree_delete_item()
      __radix_tree_delete()
        replace_slot()
replace_slot() first removes the siblings in order from the first to the
last, then replaces 'entry' with NULL. This means that for a
brief period of time we end up with one or more of the siblings removed,
so:
struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]
This causes an issue if you have a reader iterating over the slots in the
tree via radix_tree_for_each_slot() while only under
rcu_read_lock()/rcu_read_unlock() protection. This is a common case in
mm/filemap.c.
The issue is that when __radix_tree_next_slot() => skip_siblings() tries
to skip over the sibling entries in the slots, it currently does so with
an exact match on the slot directly preceding our current slot. Normally
this works:
                                    V preceding slot
struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]
                                            ^ current slot
This lets you find the first sibling, and you skip them all in order.
But in the case where one of the siblings is NULL, that slot is skipped
and then our sibling detection is interrupted:
                                          V preceding slot
struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]
                                                  ^ current slot
This means that the sibling pointers aren't recognized since they point
all the way back to 'entry', so we think that they are normal internal
radix tree pointers. This causes us to think we need to walk down to a
struct radix_tree_node starting at the address of 'entry'.
In a real running kernel this will crash the thread with a GP fault when
you try and dereference the slots in your broken node starting at 'entry'.
We fix this race by fixing the way that skip_siblings() detects sibling
nodes. Instead of testing against the preceding slot we instead look for
siblings via is_sibling_entry() which compares against the position of the
struct radix_tree_node.slots[] array. This ensures that sibling entries
are properly identified, even if they are no longer contiguous with the
'entry' they point to.
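The difference between the two detection strategies can be shown with a
simplified userspace model. struct demo_node, DEMO_MAP_SIZE and the two
demo_is_sibling_*() helpers are hypothetical. The model ignores the tag
bits that the real radix tree encodes in its pointers, so it is only a
sketch of the idea and not the kernel implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAP_SIZE 4

struct demo_node {
	void *slots[DEMO_MAP_SIZE];
};

/* Old test: a sibling must point exactly at the slot before 'slot'. */
static bool demo_is_sibling_old(void **slot, void *entry)
{
	return entry == (void *)(slot - 1);
}

/* New test: a sibling is any pointer that lands inside parent->slots[]. */
static bool demo_is_sibling_new(const struct demo_node *parent, void *entry)
{
	uintptr_t p = (uintptr_t)entry;
	uintptr_t lo = (uintptr_t)&parent->slots[0];
	uintptr_t hi = (uintptr_t)&parent->slots[DEMO_MAP_SIZE];

	return lo <= p && p < hi;
}
```

With slots laid out as [entry][NULL][sibling][sibling], the old test
applied at index 2 compares against the address of the NULL slot and fails,
while the range test still recognizes the sibling.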
Link: http://lkml.kernel.org/r/20180503192430.7582-6-ross.zwisler@linux.intel.com
Fixes: 148deab223b2 ("radix-tree: improve multiorder iterators")
Signed-off-by: Ross Zwisler <ross.zwisler(a)linux.intel.com>
Reported-by: CR, Sapthagirish <sapthagirish.cr(a)intel.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Dave Chinner <david(a)fromorbit.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/radix-tree.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff -puN lib/radix-tree.c~radix-tree-fix-multi-order-iteration-race lib/radix-tree.c
--- a/lib/radix-tree.c~radix-tree-fix-multi-order-iteration-race
+++ a/lib/radix-tree.c
@@ -1612,11 +1612,9 @@ static void set_iter_tags(struct radix_t
static void __rcu **skip_siblings(struct radix_tree_node **nodep,
void __rcu **slot, struct radix_tree_iter *iter)
{
- void *sib = node_to_entry(slot - 1);
-
while (iter->index < iter->next_index) {
*nodep = rcu_dereference_raw(*slot);
- if (*nodep && *nodep != sib)
+ if (*nodep && !is_sibling_entry(iter->node, *nodep))
return slot;
slot++;
iter->index = __radix_tree_iter_add(iter, 1);
@@ -1631,7 +1629,7 @@ void __rcu **__radix_tree_next_slot(void
struct radix_tree_iter *iter, unsigned flags)
{
unsigned tag = flags & RADIX_TREE_ITER_TAG_MASK;
- struct radix_tree_node *node = rcu_dereference_raw(*slot);
+ struct radix_tree_node *node;
slot = skip_siblings(&node, slot, iter);
_
From: Kieran Bingham <kieran.bingham+renesas(a)ideasonboard.com>
Commit 372b2b0399fc ("media: v4l: vsp1: Release buffers in
start_streaming error path") introduced a helper to clean up buffers on
error paths, but inadvertently changed the code such that only the
output WPF buffers were cleaned, rather than the video node being
operated on.
Since then vsp1_video_cleanup_pipeline() has grown to perform both video
node cleanup, as well as pipeline cleanup. Split the implementation into
two distinct functions that perform the required work, so that each
video node can release its buffers correctly on streamoff. The pipe
cleanup that was performed in the vsp1_video_stop_streaming() (releasing
the pipe->dl) is moved to the function for clarity.
Fixes: 372b2b0399fc ("media: v4l: vsp1: Release buffers in start_streaming error path")
Cc: stable(a)vger.kernel.org # v4.13+
Signed-off-by: Kieran Bingham <kieran.bingham+renesas(a)ideasonboard.com>
---
drivers/media/platform/vsp1/vsp1_video.c | 21 +++++++++++++--------
1 file changed, 13 insertions(+), 8 deletions(-)
diff --git a/drivers/media/platform/vsp1/vsp1_video.c b/drivers/media/platform/vsp1/vsp1_video.c
index c8c12223a267..ba89dd176a13 100644
--- a/drivers/media/platform/vsp1/vsp1_video.c
+++ b/drivers/media/platform/vsp1/vsp1_video.c
@@ -842,9 +842,8 @@ static int vsp1_video_setup_pipeline(struct vsp1_pipeline *pipe)
return 0;
}
-static void vsp1_video_cleanup_pipeline(struct vsp1_pipeline *pipe)
+static void vsp1_video_release_buffers(struct vsp1_video *video)
{
- struct vsp1_video *video = pipe->output->video;
struct vsp1_vb2_buffer *buffer;
unsigned long flags;
@@ -854,12 +853,18 @@ static void vsp1_video_cleanup_pipeline(struct vsp1_pipeline *pipe)
vb2_buffer_done(&buffer->buf.vb2_buf, VB2_BUF_STATE_ERROR);
INIT_LIST_HEAD(&video->irqqueue);
spin_unlock_irqrestore(&video->irqlock, flags);
+}
+
+static void vsp1_video_cleanup_pipeline(struct vsp1_pipeline *pipe)
+{
+ lockdep_assert_held(&pipe->lock);
/* Release our partition table allocation */
- mutex_lock(&pipe->lock);
kfree(pipe->part_table);
pipe->part_table = NULL;
- mutex_unlock(&pipe->lock);
+
+ vsp1_dl_list_put(pipe->dl);
+ pipe->dl = NULL;
}
static int vsp1_video_start_streaming(struct vb2_queue *vq, unsigned int count)
@@ -874,8 +879,9 @@ static int vsp1_video_start_streaming(struct vb2_queue *vq, unsigned int count)
if (pipe->stream_count == pipe->num_inputs) {
ret = vsp1_video_setup_pipeline(pipe);
if (ret < 0) {
- mutex_unlock(&pipe->lock);
+ vsp1_video_release_buffers(video);
vsp1_video_cleanup_pipeline(pipe);
+ mutex_unlock(&pipe->lock);
return ret;
}
@@ -925,13 +931,12 @@ static void vsp1_video_stop_streaming(struct vb2_queue *vq)
if (ret == -ETIMEDOUT)
dev_err(video->vsp1->dev, "pipeline stop timeout\n");
- vsp1_dl_list_put(pipe->dl);
- pipe->dl = NULL;
+ vsp1_video_cleanup_pipeline(pipe);
}
mutex_unlock(&pipe->lock);
media_pipeline_stop(&video->video.entity);
- vsp1_video_cleanup_pipeline(pipe);
+ vsp1_video_release_buffers(video);
vsp1_video_pipeline_put(pipe);
}
--
git-series 0.9.1
This is the start of the stable review cycle for the 4.9.101 release.
There are 33 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun May 20 08:15:20 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.101-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.9.101-rc1
Willy Tarreau <w(a)1wt.eu>
proc: do not access cmdline nor environ from file-backed areas
Jakub Kicinski <jakub.kicinski(a)netronome.com>
nfp: TX time stamp packets before HW doorbell is rung
James Chapman <jchapman(a)katalix.com>
l2tp: revert "l2tp: fix missing print session offset info"
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Revert "ARM: dts: imx6qdl-wandboard: Fix audio channel swap"
Vasily Averin <vvs(a)virtuozzo.com>
lockd: lost rollback of set_grace_period() in lockd_down_net()
Antony Antony <antony(a)phenome.org>
xfrm: fix xfrm_do_migrate() with AEAD e.g(AES-GCM)
Jiri Slaby <jslaby(a)suse.cz>
futex: Remove duplicated code and fix undefined behaviour
Alexey Khoroshilov <khoroshilov(a)ispras.ru>
serial: sccnxp: Fix error handling in sccnxp_probe()
Xin Long <lucien.xin(a)gmail.com>
sctp: delay the authentication for the duplicated cookie-echo chunk
Xin Long <lucien.xin(a)gmail.com>
sctp: fix the issue that the cookie-ack with auth can't get processed
Yuchung Cheng <ycheng(a)google.com>
tcp: ignore Fast Open on repair mode
Debabrata Banerjee <dbanerje(a)akamai.com>
bonding: send learning packets for vlans on slave
Talat Batheesh <talatb(a)mellanox.com>
net/mlx5: Avoid cleaning flow steering table twice during error flow
Debabrata Banerjee <dbanerje(a)akamai.com>
bonding: do not allow rlb updates to invalid mac
Michael Chan <michael.chan(a)broadcom.com>
tg3: Fix vunmap() BUG_ON() triggered from tg3_free_consistent().
Neal Cardwell <ncardwell(a)google.com>
tcp_bbr: fix to zero idle_restart only upon S/ACKed data
Xin Long <lucien.xin(a)gmail.com>
sctp: use the old asoc when making the cookie-ack chunk in dupcook_d
Xin Long <lucien.xin(a)gmail.com>
sctp: remove sctp_chunk_put from fail_mark err path in sctp_ulpevent_make_rcvmsg
Xin Long <lucien.xin(a)gmail.com>
sctp: handle two v4 addrs comparison in sctp_inet6_cmp_addr
Heiner Kallweit <hkallweit1(a)gmail.com>
r8169: fix powering up RTL8168h
Bjørn Mork <bjorn(a)mork.no>
qmi_wwan: do not steal interfaces from class drivers
Stefano Brivio <sbrivio(a)redhat.com>
openvswitch: Don't swap table in nlattr_set() after OVS_ATTR_NESTED is found
Lance Richardson <lance.richardson.net(a)gmail.com>
net: support compat 64-bit time in {s,g}etsockopt
Eric Dumazet <edumazet(a)google.com>
net_sched: fq: take care of throttled flows before reuse
Adi Nissim <adin(a)mellanox.com>
net/mlx5: E-Switch, Include VF RDMA stats in vport statistics
Moshe Shemesh <moshe(a)mellanox.com>
net/mlx4_en: Verify coalescing parameters are in range
Grygorii Strashko <grygorii.strashko(a)ti.com>
net: ethernet: ti: cpsw: fix packet leaking in dual_mac mode
Rob Taglang <rob(a)taglang.io>
net: ethernet: sun: niu set correct packet size in skb
Eric Dumazet <edumazet(a)google.com>
llc: better deal with too small mtu
Andrey Ignatov <rdna(a)fb.com>
ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg
Eric Dumazet <edumazet(a)google.com>
dccp: fix tasklet usage
Hangbin Liu <liuhangbin(a)gmail.com>
bridge: check iface upper dev when setting master via ioctl
Ingo Molnar <mingo(a)elte.hu>
8139too: Use disable_irq_nosync() in rtl8139_poll_controller()
-------------
Diffstat:
Makefile | 4 +-
arch/alpha/include/asm/futex.h | 26 ++-----
arch/arc/include/asm/futex.h | 40 ++--------
arch/arm/boot/dts/imx6qdl-wandboard.dtsi | 1 -
arch/arm/include/asm/futex.h | 26 +------
arch/arm64/include/asm/futex.h | 27 +------
arch/frv/include/asm/futex.h | 3 +-
arch/frv/kernel/futex.c | 27 +------
arch/hexagon/include/asm/futex.h | 38 +--------
arch/ia64/include/asm/futex.h | 25 +-----
arch/microblaze/include/asm/futex.h | 38 +--------
arch/mips/include/asm/futex.h | 25 +-----
arch/parisc/include/asm/futex.h | 26 +------
arch/powerpc/include/asm/futex.h | 26 ++-----
arch/s390/include/asm/futex.h | 23 ++----
arch/sh/include/asm/futex.h | 26 +------
arch/sparc/include/asm/futex_64.h | 26 ++-----
arch/tile/include/asm/futex.h | 40 ++--------
arch/x86/include/asm/futex.h | 40 ++--------
arch/xtensa/include/asm/futex.h | 27 ++-----
drivers/net/bonding/bond_alb.c | 15 ++--
drivers/net/bonding/bond_main.c | 2 +
drivers/net/ethernet/broadcom/tg3.c | 9 ++-
drivers/net/ethernet/mellanox/mlx4/en_ethtool.c | 16 ++++
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 7 +-
drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 11 ++-
drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 21 +++--
.../net/ethernet/netronome/nfp/nfp_net_common.c | 4 +-
drivers/net/ethernet/realtek/8139too.c | 2 +-
drivers/net/ethernet/realtek/r8169.c | 3 +
drivers/net/ethernet/sun/niu.c | 5 +-
drivers/net/ethernet/ti/cpsw.c | 2 +
drivers/net/usb/qmi_wwan.c | 12 +++
drivers/tty/serial/sccnxp.c | 13 +++-
fs/lockd/svc.c | 2 +
fs/proc/base.c | 10 +--
include/asm-generic/futex.h | 50 +++---------
include/linux/mm.h | 1 +
include/net/bonding.h | 1 +
kernel/futex.c | 39 ++++++++++
mm/gup.c | 3 +
net/bridge/br_if.c | 4 +-
net/compat.c | 6 +-
net/dccp/ccids/ccid2.c | 14 +++-
net/dccp/timer.c | 2 +-
net/ipv4/ping.c | 7 +-
net/ipv4/tcp.c | 2 +-
net/ipv4/tcp_bbr.c | 4 +-
net/ipv4/udp.c | 7 +-
net/l2tp/l2tp_netlink.c | 2 -
net/llc/af_llc.c | 3 +
net/openvswitch/flow_netlink.c | 9 +--
net/sched/sch_fq.c | 37 ++++++---
net/sctp/associola.c | 30 +++++++-
net/sctp/inqueue.c | 2 +-
net/sctp/ipv6.c | 3 +
net/sctp/sm_statefuns.c | 89 ++++++++++++----------
net/sctp/ulpevent.c | 1 -
net/xfrm/xfrm_state.c | 1 +
59 files changed, 380 insertions(+), 585 deletions(-)
This is the start of the stable review cycle for the 4.14.42 release.
There are 45 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun May 20 08:15:14 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.42-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.14.42-rc1
Willy Tarreau <w(a)1wt.eu>
proc: do not access cmdline nor environ from file-backed areas
James Chapman <jchapman(a)katalix.com>
l2tp: revert "l2tp: fix missing print session offset info"
Antony Antony <antony(a)phenome.org>
xfrm: fix xfrm_do_migrate() with AEAD e.g(AES-GCM)
ethanwu <ethanwu(a)synology.com>
btrfs: Take trans lock before access running trans in check_delayed_ref
Herbert Xu <herbert(a)gondor.apana.org.au>
xfrm: Use __skb_queue_tail in xfrm_trans_queue
Dave Carroll <david.carroll(a)microsemi.com>
scsi: aacraid: Correct hba_send to include iu_type
Paolo Abeni <pabeni(a)redhat.com>
udp: fix SO_BINDTODEVICE
Eric Dumazet <edumazet(a)google.com>
nsh: fix infinite loop
Jianbo Liu <jianbol(a)mellanox.com>
net/mlx5e: Allow offloading ipv4 header re-write for icmp
Eric Dumazet <edumazet(a)google.com>
ipv6: fix uninit-value in ip6_multipath_l3_keys()
Stephen Hemminger <stephen(a)networkplumber.org>
hv_netvsc: set master device
Talat Batheesh <talatb(a)mellanox.com>
net/mlx5: Avoid cleaning flow steering table twice during error flow
Tariq Toukan <tariqt(a)mellanox.com>
net/mlx5e: TX, Use correct counter in dma_map error flow
Jiri Pirko <jiri(a)mellanox.com>
net: sched: fix error path in tcf_proto_create() when modules are not configured
Debabrata Banerjee <dbanerje(a)akamai.com>
bonding: send learning packets for vlans on slave
Debabrata Banerjee <dbanerje(a)akamai.com>
bonding: do not allow rlb updates to invalid mac
Michael Chan <michael.chan(a)broadcom.com>
tg3: Fix vunmap() BUG_ON() triggered from tg3_free_consistent().
Yuchung Cheng <ycheng(a)google.com>
tcp: ignore Fast Open on repair mode
Neal Cardwell <ncardwell(a)google.com>
tcp_bbr: fix to zero idle_restart only upon S/ACKed data
Xin Long <lucien.xin(a)gmail.com>
sctp: use the old asoc when making the cookie-ack chunk in dupcook_d
Xin Long <lucien.xin(a)gmail.com>
sctp: remove sctp_chunk_put from fail_mark err path in sctp_ulpevent_make_rcvmsg
Xin Long <lucien.xin(a)gmail.com>
sctp: handle two v4 addrs comparison in sctp_inet6_cmp_addr
Xin Long <lucien.xin(a)gmail.com>
sctp: fix the issue that the cookie-ack with auth can't get processed
Xin Long <lucien.xin(a)gmail.com>
sctp: delay the authentication for the duplicated cookie-echo chunk
Eric Dumazet <edumazet(a)google.com>
rds: do not leak kernel memory to user land
Heiner Kallweit <hkallweit1(a)gmail.com>
r8169: fix powering up RTL8168h
Bjørn Mork <bjorn(a)mork.no>
qmi_wwan: do not steal interfaces from class drivers
Stefano Brivio <sbrivio(a)redhat.com>
openvswitch: Don't swap table in nlattr_set() after OVS_ATTR_NESTED is found
Andre Tomt <andre(a)tomt.net>
net/tls: Fix connection stall on partial tls record
Dave Watson <davejwatson(a)fb.com>
net/tls: Don't recursively call push_record during tls_write_space callbacks
Lance Richardson <lance.richardson.net(a)gmail.com>
net: support compat 64-bit time in {s,g}etsockopt
Eric Dumazet <edumazet(a)google.com>
net_sched: fq: take care of throttled flows before reuse
Roman Mashak <mrv(a)mojatatu.com>
net sched actions: fix refcnt leak in skbmod
Adi Nissim <adin(a)mellanox.com>
net/mlx5: E-Switch, Include VF RDMA stats in vport statistics
Roi Dayan <roid(a)mellanox.com>
net/mlx5e: Err if asked to offload TC match on frag being first
Moshe Shemesh <moshe(a)mellanox.com>
net/mlx4_en: Verify coalescing parameters are in range
Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
net/mlx4_en: Fix an error handling path in 'mlx4_en_init_netdev()'
Grygorii Strashko <grygorii.strashko(a)ti.com>
net: ethernet: ti: cpsw: fix packet leaking in dual_mac mode
Rob Taglang <rob(a)taglang.io>
net: ethernet: sun: niu set correct packet size in skb
Eric Dumazet <edumazet(a)google.com>
llc: better deal with too small mtu
Andrey Ignatov <rdna(a)fb.com>
ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg
Julian Anastasov <ja(a)ssi.bg>
ipv4: fix fnhe usage by non-cached routes
Eric Dumazet <edumazet(a)google.com>
dccp: fix tasklet usage
Hangbin Liu <liuhangbin(a)gmail.com>
bridge: check iface upper dev when setting master via ioctl
Ingo Molnar <mingo(a)elte.hu>
8139too: Use disable_irq_nosync() in rtl8139_poll_controller()
-------------
Diffstat:
Makefile | 4 +-
drivers/net/bonding/bond_alb.c | 15 +--
drivers/net/bonding/bond_main.c | 2 +
drivers/net/ethernet/broadcom/tg3.c | 9 +-
drivers/net/ethernet/mellanox/mlx4/en_ethtool.c | 16 +++
drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 8 +-
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 7 +-
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 7 +-
drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 20 ++--
drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 11 +-
drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 23 +++--
drivers/net/ethernet/realtek/8139too.c | 2 +-
drivers/net/ethernet/realtek/r8169.c | 3 +
drivers/net/ethernet/sun/niu.c | 5 +-
drivers/net/ethernet/ti/cpsw.c | 2 +
drivers/net/hyperv/netvsc_drv.c | 3 +-
drivers/net/usb/qmi_wwan.c | 12 +++
drivers/scsi/aacraid/commsup.c | 8 +-
fs/btrfs/extent-tree.c | 7 ++
fs/proc/base.c | 8 +-
include/linux/mm.h | 1 +
include/net/bonding.h | 1 +
include/net/tls.h | 1 +
mm/gup.c | 3 +
net/bridge/br_if.c | 4 +-
net/compat.c | 6 +-
net/dccp/ccids/ccid2.c | 14 ++-
net/dccp/timer.c | 2 +-
net/ipv4/ping.c | 7 +-
net/ipv4/route.c | 118 ++++++++++------------
net/ipv4/tcp.c | 3 +-
net/ipv4/tcp_bbr.c | 4 +-
net/ipv4/udp.c | 11 +-
net/ipv6/route.c | 7 +-
net/ipv6/udp.c | 4 +-
net/l2tp/l2tp_netlink.c | 2 -
net/llc/af_llc.c | 3 +
net/nsh/nsh.c | 2 +
net/openvswitch/flow_netlink.c | 9 +-
net/rds/recv.c | 1 +
net/sched/act_skbmod.c | 5 +-
net/sched/cls_api.c | 2 +-
net/sched/sch_fq.c | 37 ++++---
net/sctp/associola.c | 30 +++++-
net/sctp/inqueue.c | 2 +-
net/sctp/ipv6.c | 3 +
net/sctp/sm_statefuns.c | 88 ++++++++--------
net/sctp/ulpevent.c | 1 -
net/tls/tls_main.c | 8 ++
net/xfrm/xfrm_input.c | 2 +-
net/xfrm/xfrm_state.c | 1 +
51 files changed, 349 insertions(+), 205 deletions(-)
Entry corresponding to 220 us setup time was missing. I am not aware of
any specific bug this fixes, but this could potentially result in enabling
PSR on a panel with a higher setup time requirement than supported by the
hardware.
I verified the value is present in eDP spec versions 1.3, 1.4 and 1.4a.
Fixes: 6608804b3d7f ("drm/dp: Add drm_dp_psr_setup_time()")
Cc: stable(a)vger.kernel.org
Cc: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Cc: Jose Roberto de Souza <jose.souza(a)intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan(a)intel.com>
---
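Note: a minimal userspace sketch of why a missing entry matters. It assumes
the table is filled with designated initializers keyed by the encoded field
value (as the PSR_SETUP_TIME() macro suggests), so a skipped entry reads back
as 0 instead of shifting its neighbours. The table names, the index encoding
and setup_time_us() are illustrative, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding: index 0 means 330 us, each step subtracts 55 us. */
static const uint16_t setup_time_without_220[] = {
	[0] = 330, [1] = 275, /* [2] left out */ [3] = 165,
	[4] = 110, [5] = 55, [6] = 0,
};

static const uint16_t setup_time_with_220[] = {
	[0] = 330, [1] = 275, [2] = 220, [3] = 165,
	[4] = 110, [5] = 55, [6] = 0,
};

/* Return the setup time in microseconds, or -1 for an out-of-range index. */
static int setup_time_us(const uint16_t *table, size_t entries,
			 unsigned int idx)
{
	if (idx >= entries)
		return -1;
	return table[idx];
}
```

Under that assumption, a panel advertising the 220 us encoding would be
reported as needing 0 us with the incomplete table, which is how PSR could end
up enabled on a panel whose requirement the hardware cannot meet.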
drivers/gpu/drm/drm_dp_helper.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
index 36c7609a4bd5..a7ba602a43a8 100644
--- a/drivers/gpu/drm/drm_dp_helper.c
+++ b/drivers/gpu/drm/drm_dp_helper.c
@@ -1159,6 +1159,7 @@ int drm_dp_psr_setup_time(const u8 psr_cap[EDP_PSR_RECEIVER_CAP_SIZE])
static const u16 psr_setup_time_us[] = {
PSR_SETUP_TIME(330),
PSR_SETUP_TIME(275),
+ PSR_SETUP_TIME(220),
PSR_SETUP_TIME(165),
PSR_SETUP_TIME(110),
PSR_SETUP_TIME(55),
--
2.14.1
In 08810a4119aaebf6318f209ec5dd9828e969cba4 setting
dev->power.direct_complete was made conditional on
pm_runtime_suspended().
The justification was:
While at it, make the core check pm_runtime_suspended() when
setting power.direct_complete so that it doesn't need to be
checked by ->prepare callbacks.
However, this breaks resuming from suspend on those newer HP laptops
if the amdgpu driver is used (due to hybrid intel+radeon graphics). Given
the justification for the change, undoing it seems best as it
appears to have unintended side effects.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199693
References: https://bugs.freedesktop.org/show_bug.cgi?id=106447
Signed-off-by: Thomas Martitz <kugel(a)rockbox.org>
Cc: Pavel Machek <pavel(a)ucw.cz>
Cc: Len Brown <len.brown(a)intel.com>
Cc: <linux-pm(a)vger.kernel.org>
Cc: <stable(a)vger.kernel.org> [4.15+]
---
drivers/base/power/main.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index 02a497e7c785..b2fb0974f832 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -1960,8 +1960,7 @@ static int device_prepare(struct device *dev, pm_message_t state)
*/
spin_lock_irq(&dev->power.lock);
dev->power.direct_complete = state.event == PM_EVENT_SUSPEND &&
- pm_runtime_suspended(dev) && ret > 0 &&
- !dev_pm_test_driver_flags(dev, DPM_FLAG_NEVER_SKIP);
+ ret > 0 && !dev_pm_test_driver_flags(dev, DPM_FLAG_NEVER_SKIP);
spin_unlock_irq(&dev->power.lock);
return 0;
}
--
2.17.0
Hi,
Commit 820da5357572 ("l2tp: fix missing print session offset info") has
been backported to several -stable trees (AFAICS 3.18, 4.4, 4.9, 4.14
and 4.15). This patch has been reverted upstream as the L2TP offset
option was dropped. Therefore it doesn't make sense to start exporting
this data in stable releases.
Can you guys revert the corresponding commits from your trees, or queue
up de3b58bc359a ("l2tp: revert "l2tp: fix missing print session offset info"")?
If some of you have 820da5357572 queued up for other trees, then please
drop it.
Guillaume
Since the Linux v4.10 release (commit 1d9174fbc55e "PM / Runtime: Defer
resuming of the device in pm_runtime_force_resume()"), the
pm_runtime_force_resume() function doesn't runtime resume the device if it
was not runtime active before system suspend. Thus, the driver should not do
any register access after pm_runtime_force_resume() without checking the
runtime status of the device. To fix this issue, simply move the
s3c64xx_spi_hwinit() call to s3c64xx_spi_runtime_resume() to ensure that the
hardware is always properly initialized. This fixes a synchronous external
abort on the system suspend/resume cycle on newer Exynos SoCs.
Signed-off-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
CC: <stable(a)vger.kernel.org> # 4.10.x: 1c75862d8e5a spi: spi-s3c64xx: Remove unused s3c64xx_spi_hwinit()
CC: <stable(a)vger.kernel.org> # 4.10.x
Reviewed-by: Krzysztof Kozlowski <krzk(a)kernel.org>
Acked-by: Andi Shyti <andi(a)etezian.org>
---
Resend reason: added cc: stable, reviewed and acked tags
---
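Note: the ordering problem can be modelled in a few lines of userspace C. This
is a sketch under stated assumptions, not the driver: struct spi_dev, the
clocks_on flag and the bus_faults counter are hypothetical, and a register
access with the clocks off is simply counted instead of raising an abort.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the controller state. */
struct spi_dev {
	bool clocks_on;       /* registers are only accessible while true */
	bool hw_initialized;
	int bus_faults;       /* register accesses made with clocks off */
};

static void hwinit(struct spi_dev *d)
{
	if (!d->clocks_on) {
		d->bus_faults++;  /* models the synchronous external abort */
		return;
	}
	d->hw_initialized = true;
}

static void runtime_resume(struct spi_dev *d, bool hwinit_here)
{
	d->clocks_on = true;
	if (hwinit_here)
		hwinit(d);
}

/* Models the post-v4.10 behaviour: resume only if the device was active. */
static void force_resume(struct spi_dev *d, bool was_active, bool hwinit_here)
{
	if (was_active)
		runtime_resume(d, hwinit_here);
}

static void system_resume(struct spi_dev *d, bool was_active, bool patched)
{
	force_resume(d, was_active, patched);
	if (!patched)
		hwinit(d);  /* old placement: unconditional register access */
}
```

In this model a device that was runtime suspended before system suspend takes
a fault on resume with the old placement, and none with the new one; the
hardware is then initialized the next time runtime_resume() runs.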
drivers/spi/spi-s3c64xx.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/spi/spi-s3c64xx.c b/drivers/spi/spi-s3c64xx.c
index f55dc78957ad..7b7151ec14c8 100644
--- a/drivers/spi/spi-s3c64xx.c
+++ b/drivers/spi/spi-s3c64xx.c
@@ -1292,8 +1292,6 @@ static int s3c64xx_spi_resume(struct device *dev)
if (ret < 0)
return ret;
- s3c64xx_spi_hwinit(sdd);
-
return spi_master_resume(master);
}
#endif /* CONFIG_PM_SLEEP */
@@ -1331,6 +1329,8 @@ static int s3c64xx_spi_runtime_resume(struct device *dev)
if (ret != 0)
goto err_disable_src_clk;
+ s3c64xx_spi_hwinit(sdd);
+
return 0;
err_disable_src_clk:
--
2.17.0
The audit_filter_rules() function in auditsc.c used the in_[e]group_p()
functions to check GID/EGID match, but these functions use the current
task's credentials, while the comparison should use the credentials of
the task given to audit_filter_rules() as a parameter (tsk).
Note that we can use groups_search(cred->group_info, ...) as a
replacement for both in_group_p and in_egroup_p as these functions only
compare the parameter to cred->fsgid/egid and then call groups_search.
In fact, the usage of in_group_p was incorrect also because it compared
to cred->fsgid and not cred->gid.
GitHub issue:
https://github.com/linux-audit/audit-kernel/issues/82
Fixes: 37eebe39c973 ("audit: improve GID/EGID comparation logic")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ondrej Mosnacek <omosnace(a)redhat.com>
---
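Note: a small userspace model of the difference between the two lookups. It is
a sketch only: struct creds, gid_type, search_groups() and
in_group_of_caller() are hypothetical stand-ins for the kernel's credentials,
kgid_t, groups_search() and in_group_p(), and the supplementary group list is
assumed to be sorted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int gid_type;  /* hypothetical stand-in for kgid_t */

struct creds {
	gid_type fsgid;
	const gid_type *groups;  /* supplementary groups, sorted ascending */
	size_t ngroups;
};

/* Binary search of the supplementary groups of the given credentials. */
static bool search_groups(const struct creds *c, gid_type gid)
{
	size_t lo = 0, hi = c->ngroups;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (c->groups[mid] == gid)
			return true;
		if (c->groups[mid] < gid)
			lo = mid + 1;
		else
			hi = mid;
	}
	return false;
}

/* Models in_group_p(): always consults the *calling* task's credentials. */
static bool in_group_of_caller(const struct creds *caller, gid_type gid)
{
	return caller->fsgid == gid || search_groups(caller, gid);
}
```

When the caller and the audited task have different group lists, the two
helpers disagree, which is the mismatch this patch removes by searching the
credentials of the task passed in.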
kernel/auditsc.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index cbab0da86d15..ec38e4d97c23 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -490,20 +490,20 @@ static int audit_filter_rules(struct task_struct *tsk,
result = audit_gid_comparator(cred->gid, f->op, f->gid);
if (f->op == Audit_equal) {
if (!result)
- result = in_group_p(f->gid);
+ result = groups_search(cred->group_info, f->gid);
} else if (f->op == Audit_not_equal) {
if (result)
- result = !in_group_p(f->gid);
+ result = !groups_search(cred->group_info, f->gid);
}
break;
case AUDIT_EGID:
result = audit_gid_comparator(cred->egid, f->op, f->gid);
if (f->op == Audit_equal) {
if (!result)
- result = in_egroup_p(f->gid);
+ result = groups_search(cred->group_info, f->gid);
} else if (f->op == Audit_not_equal) {
if (result)
- result = !in_egroup_p(f->gid);
+ result = !groups_search(cred->group_info, f->gid);
}
break;
case AUDIT_SGID:
--
2.17.0
The ONFI spec clearly says that FAIL bit is only valid for PROGRAM,
ERASE and READ-with-on-die-ECC operations, and should be ignored
otherwise.
It seems that checking it after sending a SET_FEATURES is a bad idea
because a previous READ, PROGRAM or ERASE op may have failed, and
depending on the implementation, the FAIL bit is not cleared until a
new READ, PROGRAM or ERASE is started.
This leads to ->set_features() returning -EIO while it actually worked,
which can sometimes stop a batch of READ/PROGRAM ops.
Note that we only fix the ->exec_op() path here, because some drivers
are abusing the NAND_STATUS_FAIL flag in their ->waitfunc()
implementation to propagate other kinds of errors, like
wait-ready-timeout or controller-related errors. Let's not try to fix
those drivers since they worked fine so far.
Fixes: 8878b126df76 ("mtd: nand: add ->exec_op() implementation")
Cc: stable(a)vger.kernel.org
Signed-off-by: Boris Brezillon <boris.brezillon(a)bootlin.com>
---
This patch is fixing a problem we had with on-die ECC on Micron
NANDs [1].
On these chips, when you have an ECC failure, the FAIL bit is set and
it's not cleared until the next READ operation, which led the following
SET_FEATURES (used to re-enable on-die ECC) to fail with -EIO and
stopped the batch of page reads started by UBIFS, which in turn led to
unmountable FS.
[1] http://patchwork.ozlabs.org/patch/907874/
Changes in v2:
- Fix the subject prefix
---
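Note: the stale FAIL bit scenario can be reproduced with a toy model. This is
a sketch, not the NAND core: struct chip, read_page() and set_features() are
hypothetical, and the model assumes a chip that only clears FAIL when a new
READ starts.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define STATUS_FAIL 0x01  /* models NAND_STATUS_FAIL */

/* Hypothetical chip with a sticky status register. */
struct chip {
	uint8_t status;
};

/* A READ with on-die ECC clears FAIL when it starts and sets it on error. */
static void read_page(struct chip *c, bool ecc_error)
{
	c->status &= (uint8_t)~STATUS_FAIL;
	if (ecc_error)
		c->status |= STATUS_FAIL;
}

/* SET_FEATURES itself succeeds and leaves the status register untouched. */
static int set_features(struct chip *c, bool check_fail_bit)
{
	if (check_fail_bit && (c->status & STATUS_FAIL))
		return -EIO;  /* stale FAIL left over from the earlier READ */
	return 0;
}
```

After a READ that hit an ECC failure, set_features() with the check enabled
reports -EIO even though the operation worked; with the check dropped it
returns 0, matching what the ->exec_op() path does after this patch.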
drivers/mtd/nand/raw/nand_base.c | 27 +++++++++------------------
1 file changed, 9 insertions(+), 18 deletions(-)
diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index f28c3a555861..ee29f34562ab 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -2174,7 +2174,6 @@ static int nand_set_features_op(struct nand_chip *chip, u8 feature,
struct mtd_info *mtd = nand_to_mtd(chip);
const u8 *params = data;
int i, ret;
- u8 status;
if (chip->exec_op) {
const struct nand_sdr_timings *sdr =
@@ -2188,26 +2187,18 @@ static int nand_set_features_op(struct nand_chip *chip, u8 feature,
};
struct nand_operation op = NAND_OPERATION(instrs);
- ret = nand_exec_op(chip, &op);
- if (ret)
- return ret;
-
- ret = nand_status_op(chip, &status);
- if (ret)
- return ret;
- } else {
- chip->cmdfunc(mtd, NAND_CMD_SET_FEATURES, feature, -1);
- for (i = 0; i < ONFI_SUBFEATURE_PARAM_LEN; ++i)
- chip->write_byte(mtd, params[i]);
+ return nand_exec_op(chip, &op);
+ }
- ret = chip->waitfunc(mtd, chip);
- if (ret < 0)
- return ret;
+ chip->cmdfunc(mtd, NAND_CMD_SET_FEATURES, feature, -1);
+ for (i = 0; i < ONFI_SUBFEATURE_PARAM_LEN; ++i)
+ chip->write_byte(mtd, params[i]);
- status = ret;
- }
+ ret = chip->waitfunc(mtd, chip);
+ if (ret < 0)
+ return ret;
- if (status & NAND_STATUS_FAIL)
+ if (ret & NAND_STATUS_FAIL)
return -EIO;
return 0;
--
2.14.1
From: Vaibhav Jain <vaibhav(a)linux.ibm.com>
On Power-8 the AFU attr prefault_mode tried to improve storage fault
performance by prefaulting process segments. However Power-9 radix
mode doesn't have Storage-Segments and prefaulting Pages is too fine
grained.
So this patch updates prefault_mode_store() to not allow any other
value apart from CXL_PREFAULT_NONE when radix mode is enabled.
Cc: <stable(a)vger.kernel.org>
Fixes: f24be42aab37 ("cxl: Add psl9 specific code")
Signed-off-by: Vaibhav Jain <vaibhav(a)linux.ibm.com>
---
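Note: the resulting parsing rules, written as a standalone function. This is a
sketch of the decision logic only; enum prefault and parse_prefault_mode() are
hypothetical names, and the radix state is passed in as a parameter instead of
being queried from the MMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum prefault {
	PREFAULT_INVALID = -1,
	PREFAULT_NONE,
	PREFAULT_WED,
	PREFAULT_ALL,
};

/* "none" is always accepted; the other modes only without radix. */
static enum prefault parse_prefault_mode(const char *buf, bool radix)
{
	if (!strncmp(buf, "none", 4))
		return PREFAULT_NONE;
	if (radix)
		return PREFAULT_INVALID;
	if (!strncmp(buf, "work_element_descriptor", 23))
		return PREFAULT_WED;
	if (!strncmp(buf, "all", 3))
		return PREFAULT_ALL;
	return PREFAULT_INVALID;
}
```

Writing "all" is rejected when radix is true and accepted otherwise, while
"none" is accepted in both cases.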
Documentation/ABI/testing/sysfs-class-cxl | 4 +++-
drivers/misc/cxl/sysfs.c | 16 ++++++++++++----
2 files changed, 15 insertions(+), 5 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-class-cxl b/Documentation/ABI/testing/sysfs-class-cxl
index 640f65e79ef1..267920a1874b 100644
--- a/Documentation/ABI/testing/sysfs-class-cxl
+++ b/Documentation/ABI/testing/sysfs-class-cxl
@@ -69,7 +69,9 @@ Date: September 2014
Contact: linuxppc-dev(a)lists.ozlabs.org
Description: read/write
Set the mode for prefaulting in segments into the segment table
- when performing the START_WORK ioctl. Possible values:
+ when performing the START_WORK ioctl. Only applicable when
+ running under hashed page table mmu.
+ Possible values:
none: No prefaulting (default)
work_element_descriptor: Treat the work element
descriptor as an effective address and
diff --git a/drivers/misc/cxl/sysfs.c b/drivers/misc/cxl/sysfs.c
index 4b5a4c5d3c01..629e2e156412 100644
--- a/drivers/misc/cxl/sysfs.c
+++ b/drivers/misc/cxl/sysfs.c
@@ -353,12 +353,20 @@ static ssize_t prefault_mode_store(struct device *device,
struct cxl_afu *afu = to_cxl_afu(device);
enum prefault_modes mode = -1;
- if (!strncmp(buf, "work_element_descriptor", 23))
- mode = CXL_PREFAULT_WED;
- if (!strncmp(buf, "all", 3))
- mode = CXL_PREFAULT_ALL;
if (!strncmp(buf, "none", 4))
mode = CXL_PREFAULT_NONE;
+ else {
+ if (!radix_enabled()) {
+
+ /* only allowed when not in radix mode */
+ if (!strncmp(buf, "work_element_descriptor", 23))
+ mode = CXL_PREFAULT_WED;
+ if (!strncmp(buf, "all", 3))
+ mode = CXL_PREFAULT_ALL;
+ } else {
+ dev_err(device, "Cannot prefault with radix enabled\n");
+ }
+ }
if (mode == -1)
return -EINVAL;
--
2.17.0
Oops, sorry, I failed to write the subject. It should have been something like the subject of this e-mail.
> -----Original Message-----
> From: stable-owner(a)vger.kernel.org [mailto:stable-owner@vger.kernel.org] On
> Behalf Of Daniel Sangorrin
> Sent: Friday, May 18, 2018 9:59 AM
> To: stable(a)vger.kernel.org
> Cc: mtk.manpages(a)gmail.com; viro(a)zeniv.linux.org.uk
> Subject:
>
> Hello Greg,
>
> After running LTP with Fuego on the LTS kernel 4.4.y, there were
> a few test cases failing that I thought needed some investigation.
>
> I reviewed the first one (fcntl35 and fcntl35_64) so far. According to the
> comments on LTP's fcntl35.c file (by Xiao Yang <yangx.jy(a)cn.fujitsu.com>)
> the bug tested by this test case was fixed by:
> pipe: cap initial pipe capacity according to pipe-max-size
> commit 086e774a57fba4695f14383c0818994c0b31da7c
> Author: Michael Kerrisk (man-pages) <mtk.manpages(a)gmail.com>
> Date: Tue Oct 11 13:53:43 2016 -0700
>
> I backported that patch (see next e-mail), tested again and confirmed that
> the patch fixed the bug (or at least the error message in LTP's test).
>
> Before:
> fcntl35.c:98: FAIL: an unprivileged user init the capacity of a pipe to 65536
> unexpectedly, expected 4096
> After:
> fcntl35.c:101: PASS: an unprivileged user init the capacity of a pipe to 4096
> successfully
>
> Thanks,
> Daniel Sangorrin
>
Hello Greg,
After running LTP with Fuego on the LTS kernel 4.4.y, there were
a few test cases failing that I thought needed some investigation.
I reviewed the first one (fcntl35 and fcntl35_64) so far. According to the
comments on LTP's fcntl35.c file (by Xiao Yang <yangx.jy(a)cn.fujitsu.com>)
the bug tested by this test case was fixed by:
pipe: cap initial pipe capacity according to pipe-max-size
commit 086e774a57fba4695f14383c0818994c0b31da7c
Author: Michael Kerrisk (man-pages) <mtk.manpages(a)gmail.com>
Date: Tue Oct 11 13:53:43 2016 -0700
I backported that patch (see next e-mail), tested again and confirmed that
the patch fixed the bug (or at least the error message in LTP's test).
Before:
fcntl35.c:98: FAIL: an unprivileged user init the capacity of a pipe to 65536 unexpectedly, expected 4096
After:
fcntl35.c:101: PASS: an unprivileged user init the capacity of a pipe to 4096 successfully
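For anyone who wants to look at the value outside of LTP, the initial capacity
of a new pipe can be read with fcntl(F_GETPIPE_SZ). The helper below is only a
sketch (new_pipe_capacity() is a made-up name, and it does not reproduce the
unprivileged-user setup that fcntl35 uses); it just reports what the running
kernel hands out.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Fallback in case the headers were included without _GNU_SOURCE. */
#ifndef F_GETPIPE_SZ
#define F_GETPIPE_SZ 1032
#endif

/* Return the capacity in bytes of a freshly created pipe, or -1 on error. */
static int new_pipe_capacity(void)
{
	int fds[2];
	int size;

	if (pipe(fds) < 0)
		return -1;
	size = fcntl(fds[0], F_GETPIPE_SZ);
	close(fds[0]);
	close(fds[1]);
	return size;
}
```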
Thanks,
Daniel Sangorrin
When running raidconfig from Dom0, we found that the Xen DMA heap is reduced,
but the Dom heap is increased by the same size. Tracing raidconfig, we found
that the related ioctl() in megaraid_sas calls dma_alloc_coherent()
to allocate memory. If the memory allocated by Dom0 is not in the DMA area,
it is exchanged with Xen to meet the requirement. Later, the driver
calls dma_free_coherent() to free the memory. In xen_swiotlb_free_coherent(),
the check condition (dev_addr + size - 1 > dma_mask) is always false, which
prevents calling xen_destroy_contiguous_region() to return the memory
to the Xen DMA heap.
This issue was introduced by commit 6810df88dcfc2 "xen-swiotlb: When doing
coherent alloc/dealloc check before swizzling the MFNs.".
Signed-off-by: Joe Jin <joe.jin(a)oracle.com>
Tested-by: John Sobecki <john.sobecki(a)oracle.com>
Reviewed-by: Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Cc: stable(a)vger.kernel.org
---
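Note: the two conditions side by side, as a userspace sketch. The function
names are made up, the page-boundary test is reduced to a boolean parameter,
and the example values are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Condition before this patch: true only if the buffer exceeds the mask. */
static bool returns_region_old(uint64_t dev_addr, uint64_t size,
			       uint64_t dma_mask, bool straddles)
{
	return (dev_addr + size - 1 > dma_mask) || straddles;
}

/* Condition after this patch: true if the buffer fits within the mask. */
static bool returns_region_new(uint64_t dev_addr, uint64_t size,
			       uint64_t dma_mask, bool straddles)
{
	return (dev_addr + size - 1 <= dma_mask) || straddles;
}
```

A buffer that was exchanged at allocation time to satisfy a 32-bit mask has a
device address below that mask, so the old condition never selects it for
xen_destroy_contiguous_region() while the new one does.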
drivers/xen/swiotlb-xen.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index e1c60899fdbc..a6f9ba85dc4b 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -351,7 +351,7 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
* physical address */
phys = xen_bus_to_phys(dev_addr);
- if (((dev_addr + size - 1 > dma_mask)) ||
+ if (((dev_addr + size - 1 <= dma_mask)) ||
range_straddles_page_boundary(phys, size))
xen_destroy_contiguous_region(phys, order);
--
2.14.3 (Apple Git-98)